
One classic failure at the analysis stage which can pervert your data horribly is to analyse patients according to the treatment they actually took, rather than the treatment they were assigned at the randomisation stage of the trial. At first glance, this seems perfectly reasonable: if 30 per cent of your patients dropped out and didn’t take your new tablet, they didn’t experience the benefit, and shouldn’t be included in the ‘new tablet’ group at analysis.

But as soon as you start to think about why patients drop out of treatment in trials, the problems with this method start to become apparent. Maybe they stopped taking your tablets because they had horrible side effects. Maybe they stopped taking your tablets because they decided they didn’t work, and just tipped them in the bin. Maybe they stopped taking your tablets, and coming to follow-up appointments, because they were dead, after your drug killed them. Looking at patients only by the treatment they took is called a ‘per protocol’ analysis, and this has been shown to dramatically overstate the benefits of treatments, which is why it’s not supposed to be used.

If you keep all the patients prescribed your new treatment – including those who stopped taking it – in the ‘new treatment’ group when you do your final calculation, this is called an ‘intention to treat’ analysis. As well as being more conservative, this analysis makes much more sense philosophically. You’re going to use the results of a trial to inform your decision about whether to ‘give someone some tablets’, not ‘force some tablets down their throat compulsorily’. So you want the results to be from an analysis that looks at people according to what they were given by their doctor, rather than what they actually swallowed.
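To make the distinction concrete, here is a minimal simulation sketch – mine, not from any study discussed here, with invented numbers – of a drug that does nothing, in a trial where the sickest patients tend to stop taking their tablets.

```python
import random

random.seed(0)
n = 10_000
assigned, completed = [], []

for _ in range(n):
    severity = random.random()              # 0 = mild, 1 = severe
    recovered = random.random() > severity  # recovery depends only on severity
    assigned.append(recovered)              # intention to treat: everyone counts
    dropped_out = severity > 0.7 and random.random() < 0.8
    if not dropped_out:                     # per protocol: drop-outs disappear
        completed.append(recovered)

print(f"intention to treat: {sum(assigned) / len(assigned):.1%} recovered")
print(f"per protocol:       {sum(completed) / len(completed):.1%} recovered")
```

The per-protocol figure comes out noticeably higher than the intention-to-treat figure, even though the drug has no effect at all: excluding the drop-outs quietly excludes the sickest patients.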

I’ve had the joy of marking sixty exam papers – a Groundhog Day experience if ever there was one – in which a fifth of the marks were to be earned by explaining ‘intention to treat analysis’. This is at the absolute core of the evidence-based medicine curriculum, so it’s utterly bizarre that there are still endless ‘per protocol’ analyses being reported by the drugs industry. One systematic review looked at all the trial reports submitted by companies to the Swedish drug regulator, and then the published academic papers relating to the same trials (if they even existed).23 All but one of the submissions to the regulator featured both ‘intention to treat’ and ‘per protocol’ analyses, because regulators are, for all their faults and obsessive secrecy, at least a little sharper about methodological rigour than many academic journals. All but two of the academic papers, meanwhile, only reported one analysis, usually the ‘per protocol’ one that overstates the benefits. This is the version that doctors read. In the next section, we will see another example of how academic journals participate in the game of overstating results: often, for all their claims to be the gatekeepers for good-quality research, these journals do not do their job well.

Trials that change their main outcome after they’ve finished

    If you measure a dozen outcomes in your trial, but cite an improvement in any one of them as a positive result, then your results are meaningless. Our tests for deciding if a result is statistically significant assume that you are only measuring one outcome. By measuring a dozen, you have given yourself a dozen chances of getting a positive result, rather than one, without clearly declaring that. Your study is biased by design, and is likely to find more positive results than there really are.

Imagine we’re playing with dice, and we make a simple (albeit one-sided) arrangement: if I throw a double six, you have to give me £10. So I roll the dice, and they come up double three. But I still demand my £10, claiming that our original agreement was in fact that you give me £10 if I roll a double three; and you still pay me, with the cheerful encouragement of everyone around us. This exact scenario is played out in clinical academic research, as a matter of routine, every day, when we tolerate people doing something called ‘switching the primary outcome’.
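The arithmetic behind the dice swindle is worth spelling out. A quick sketch of the two bets:

```python
from fractions import Fraction

# Honest bet: pay out only on the pair named in advance (double six).
p_named_double = Fraction(1, 36)

# The swindle: name the 'winning' pair after the roll, so any double pays.
p_any_double = Fraction(6, 36)

print(p_named_double, "vs", p_any_double)  # 1/36 vs 1/6 – six times the odds
```

Switching the primary outcome after the results are in works the same way: you quietly trade the odds you declared in advance for much better ones.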

Before you begin a clinical trial, you write out the protocol. This is a document describing what you’re going to do: how many participants you’re going to recruit, where and how you’re going to recruit them, what treatment each group will receive, and what outcomes you’re going to measure. In a trial you’ll measure all kinds of things as possible outcomes: perhaps a few different rating scales for ‘pain’, or ‘depression’, or whatever you’re interested in; maybe ‘quality of life’, or ‘mobility’, that you’ll measure with some kind of questionnaire; possibly ‘death from all causes’, and death from each of a number of specific causes too; and lots of other things.

Among all of these many outcomes, you will specify one (or perhaps a couple more, if you account for this in your analysis) as the main, primary outcome. You do this before the trial starts, because you’re trying to avoid one simple problem: if you measure lots of things, some of them will come up as statistically significantly improved, simply from the natural random variation in all trial data. These are real people, remember, in the real world, and their pain, depression, mobility, quality of life and so on will all vary, for all kinds of reasons, many of which have nothing whatsoever to do with the intervention that you’re testing in your trial.

If you’re a pure-hearted researcher, you’re using statistical tests specifically to identify genuine benefits of the treatment you’re testing. You’re trying to distinguish these real changes from the normal random variation of background noise that you would expect to see in your patients’ results on various tests. More than anything, you want to avoid finding false positives.

The traditional cut-off for statistical significance is ‘one in twenty’. Roughly speaking, clearing this bar means that if you repeated the same study over and over again, with the same methods, in participants taken from the same population, you’d expect to get the same positive finding you’ve observed one time in every twenty, simply by chance, even if the drug really had no benefit. If you dip two cups into the same jar of white and red beads, every now and then, purely by chance, you will come out with an unusually small number of red beads in one cup, and an unusually large number of red beads in the other. The same is true for any measurement we take in patients: there will be some random variation, and it can sometimes make it look as if one treatment is better than another, on one scoring method, simply through chance. Statistical tests are designed to stop us being misled by that kind of random variation.
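The bead jar is easy to simulate. Here is a quick sketch, my own illustration with made-up numbers: dip two cups of fifty beads into the same half-red, half-white jar a few times, and watch the red counts wander apart through chance alone.

```python
import random

random.seed(1)
jar = ["red"] * 500 + ["white"] * 500  # the same jar for both cups

for dip in range(5):
    cup_a = random.sample(jar, 50)
    cup_b = random.sample(jar, 50)
    print(f"dip {dip + 1}: {cup_a.count('red')} red vs {cup_b.count('red')} red")
```

Neither cup is ‘better’ than the other; the differences are pure sampling noise, which is exactly what significance tests are built to guard against.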

So now, let’s imagine you’re running a trial where you measure ten different, independent outcomes. If we set the cut-off for statistical significance as ‘one in twenty’, then even if your drug does nothing useful at all, in your single trial you’ve still got close to a 50/50 chance – about 40 per cent, in fact – of finding a positive benefit on at least one of your outcomes, simply from random variation in your data. If you didn’t pre-specify which of the many outcomes is your primary outcome before you started, you could be cheeky, and report any positive finding you get, in any of your ten outcomes, as a positive result from your trial.
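That figure is simple arithmetic, and worth checking for yourself – a one-line calculation, assuming ten fully independent outcomes each tested at the one-in-twenty threshold:

```python
# Chance of at least one false positive across ten independent outcomes,
# each tested at the conventional p < 0.05 threshold.
p_per_test = 0.05
n_outcomes = 10

p_at_least_one = 1 - (1 - p_per_test) ** n_outcomes
print(f"{p_at_least_one:.1%}")  # ~40.1%: close to a coin toss
```

Real trial outcomes are rarely fully independent, so the true figure varies, but the direction of the problem is always the same: more outcomes means more chances of a spurious win.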

Could you get away with doing this openly, and simply saying: ‘Hey, we measured ten things, and one of them came up as improved, therefore our new drug is awesome’? Well, you probably could get away with it in some quarters, because the consumers of scientific papers aren’t universally switched on to this kind of bait and switch. But generally people would spot it: they would expect to see a ‘primary outcome’ nominated and reported, because they know that if you measure ten things, one of them is pretty likely to come up as improved simply through chance.

The problem is this: even though people know that you should nominate a primary outcome, these primary outcomes often change between the protocol and the paper, after the people conducting the research have seen the results. Even you – a random punter who’s picked up this book on a station platform, and not a professor of either statistics or medicine – can see the madness in this. If the primary outcome reported in the finished paper is different from the primary outcome nominated before the trial started, then that is absurd: the entire point of the primary outcome is that it’s the primary outcome nominated before the trial started. But people do switch their primary outcomes, and this is not just an occasional problem. In fact, it’s almost routine practice.

In 2009, a group of researchers got all the trials they could find on various uses of a drug called gabapentin.24 They then looked at those for which they could obtain internal documents, which meant they could identify the original, pre-specified primary outcome. Then they looked at the published academic papers that reported these trials. Of course, about half of the trials were never published at all (the scandal of this should not wear off with repetition). Twelve trials were published, and they checked to see if the things reported as primary outcomes in the academic papers really were pre-specified as primary outcomes in the internal documents, before the trial started.

What they found was a mess. Of the twenty-one primary outcomes pre-specified in the protocols, which should all have been reported, only eleven actually appeared. Six weren’t reported in any form, and four were reported, but reported as if they were secondary outcomes instead. You can also look at this from the other end of the telescope: twenty-eight primary outcomes were reported in the twelve published trials, but of those, about half were newly introduced, and were never really primary outcomes at all. This is nothing short of ridiculous: there is no excuse, not for the researchers doing the switching, and not for the academic journals failing to check. But that was only one drug. Was it a freak occurrence?

No. In 2004 some researchers published a paper looking at all areas of medicine: they took all the trials approved by the ethics committees of two cities over two years, then chased up the published papers.25 About half of all the outcomes were incorrectly reported. Of the published papers, almost two thirds had at least one pre-specified primary outcome that had been switched, and this was not being done at random: exactly as you’d expect, positive outcomes were more than twice as likely to be properly reported. Other studies on primary-outcome switching report similar results.

To be clear: if you switch your pre-specified primary outcome between the beginning and the end of your trial, without a very good explanation for why you’ve done so, then you’re simply not doing science properly. Your study is broken by design. It should be a universal requirement that all studies report their pre-specified primary outcome as the primary outcome. This should be enforced by all journals, and things should have been done this way since trials began. It’s really not difficult. Yet we have collectively failed to adhere to this simple, obvious core requirement on an epic scale.

For one final illustration of what this means in practice, I shall return to paroxetine, and the studies that were conducted in children. Remember, when an area of medicine is subject to some kind of litigation, documents often become available to researchers that would otherwise be hidden from view, allowing them to identify problems, discrepancies and patterns that would not normally be detectable. For the most part these are documents which should always be in the public domain, but are not. So paroxetine may not be worse than any other drug for this kind of mischief (in fact, as we have seen from the study just described, outcome switching happens across the board): it’s simply one of the cases about which we have the most detail.

In 2008 a group of researchers decided to go through the documents opened up by the litigation over paroxetine, and examine how the results of one clinical trial – ‘trial 329’ – had been published.26 As late as 2007 systematic reviews were still describing this trial as having a positive result, which is how it was reported in publications of its results. But in reality that was completely untrue: the original protocols specified two primary outcomes and six secondary ones. At the end of the trial there was no difference between paroxetine and placebo for any of these outcomes. At least nineteen more outcomes were also measured, making twenty-seven in total. Of those, only four gave a positive result for paroxetine. These positive findings were reported as if they were the main outcomes.

It would be tempting to regard the reporting of trial 329 as some kind of freak episode, an appalling exception in an otherwise sane medical world. Tragically, as the research above demonstrates, this behaviour is widespread.

So widespread, in fact, that there’s room for a small cottage industry, if there are any academics feeling brave enough to pursue the project. Someone somewhere needs to identify all the studies where the main outcomes have been switched, demand access to the raw data, and helpfully, at long last, conduct the correct analyses for the original researchers. If you choose to do this, your published papers will immediately become the definitive reference on these trials, because they will be the only ones to correctly present the pre-specified trial outcomes. The publications from the original researchers will be no more than a tangential and irrelevant distraction.

I’m sure they’ll be pleased to help.

Dodgy subgroup analyses

    If your drug didn’t win overall in your trial, you can chop up the data in lots of different ways, to try and see if it won in a subgroup: maybe it works brilliantly in Chinese men between fifty-six and seventy-one. This is as stupid as playing ‘Best of three…Best of five…’ And yet it is commonplace.

Time and again we have come back to the same principle in this chapter: if you give yourself multiple chances at finding a positive result, but use statistical tests that assume you only had one go, then you vastly increase your chances of getting the result you want – if you flip a coin for long enough, you will eventually get four heads in a row.
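The coin-flip intuition carries straight over to subgroups. Here is a sketch of the game, with invented data and arbitrary subgroup labels: a drug with no effect, a trial that is null overall, and a trawl through six subgroups to find one that happens to look good.

```python
import random

random.seed(4)

# A null trial: treatment assignment has no bearing on recovery.
patients = [
    {
        "treated": random.random() < 0.5,
        "recovered": random.random() < 0.5,  # outcome ignores treatment entirely
        "sex": random.choice(["M", "F"]),
        "age_band": random.choice(["<40", "40-60", ">60"]),
    }
    for _ in range(400)
]

def recovery_rate(group, treated):
    rows = [p for p in group if p["treated"] == treated]
    return sum(p["recovered"] for p in rows) / max(len(rows), 1)

for sex in ["M", "F"]:
    for band in ["<40", "40-60", ">60"]:
        group = [p for p in patients if p["sex"] == sex and p["age_band"] == band]
        diff = recovery_rate(group, True) - recovery_rate(group, False)
        print(f"{sex}, {band}: treated minus untreated = {diff:+.2f} (n={len(group)})")
```

Run it and some subgroup will show a healthy-looking difference in one direction or another, purely by chance. Report only that subgroup, and a dead drug looks like a winner for Chinese men between fifty-six and seventy-one.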
