Bad Pharma: How Drug Companies Mislead Doctors and Harm Patients
Authors: Ben Goldacre
Here we see the same problem as in medicine: positive findings are more likely to be published than negative ones. Every now and then, a freak positive result is published showing, for example, that people can see into the future. Who knows how many psychologists have tried, over the years, to find evidence of psychic powers, running elaborate, time-consuming experiments, on dozens of subjects – maybe hundreds – and then found no evidence that such powers exist? Any scientist trying to publish such a ‘So what?’ finding would struggle to get a journal to take it seriously, at the best of times. Even with the clear target of Bem’s paper on precognition, which was widely covered in serious newspapers across Europe and the USA, the academic journal with a proven recent interest in the question of precognition simply refused to publish a paper with a negative result. Yet replicating these findings was key – Bem himself said so in his paper – so keeping track of the negative replications is vital too.
People working in real labs will tell you that sometimes an experiment can fail to produce a positive result many times before the outcome you’re hoping for appears. What does that mean? Sometimes the failures will be the result of legitimate technical problems; but sometimes they will be vitally important statistical context, perhaps even calling the main finding of the research into question. Many research findings, remember, are not absolute black-and-white outcomes, but fragile statistical correlations. Under our current system, most of this contextual information about failure is just brushed under the carpet, and this has huge ramifications for the cost of replicating research, in ways that are not immediately obvious. For example, researchers failing to replicate an initial finding may not know if they’ve failed because the original result was an overstated fluke, or because they’ve made some kind of mistake in their methods. In fact, the cost of proving that a finding was wrong is vastly greater than the cost of making it in the first place, because you need to run the experiment many more times to prove the absence of a finding, simply because of the way that the statistics of detecting weak effects work; and you also need to be absolutely certain that you’ve excluded all technical problems, to avoid getting egg on your face if your replication turns out to have been inadequate. These barriers to refutation may partly explain why it’s so easy to get away with publishing findings that ultimately turn out to be wrong.[30]
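To get a feel for why weak effects are so expensive to pin down, here is a rough sketch using the standard normal-approximation formula for comparing two group means. The function name `n_per_group`, the effect sizes, the 5 per cent significance level and the 80 per cent power are all illustrative assumptions, not figures from any study discussed here.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.8):
    """Approximate subjects needed per group for a two-sided comparison
    of two means, where effect_size is the difference in means divided
    by the standard deviation (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # threshold for significance
    z_power = z.inv_cdf(power)          # margin needed to reach the power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# Hypothetical effect sizes: each halving of the effect roughly
# quadruples the number of subjects needed.
for d in (0.8, 0.4, 0.2, 0.1):
    print(f"effect size {d}: about {n_per_group(d)} subjects per group")
```

The point is the shape of the relationship rather than the exact numbers: if the original finding was a small effect, or if you want to rule out even a small effect, the replication has to be far larger than the study that made the claim.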
Publication bias is not just a problem in the more abstract corners of psychology research. In 2012 a group of researchers reported in the journal Nature how they tried to replicate fifty-three early laboratory studies of promising targets for cancer treatments: forty-seven of the fifty-three could not be replicated.[31] This study has serious implications for the development of new drugs in medicine, because such unreplicable findings are not simply an abstract academic issue: researchers build theories on the back of them, trust that they’re valid, and investigate the same idea using other methods. If they are simply being led down the garden path, chasing up fluke errors, then huge amounts of research money and effort are being wasted, and the discovery of new medical treatments is being seriously retarded.
The authors of the study were clear on both the cause of and the solution for this problem. Fluke findings, they explained, are often more likely to be submitted to journals – and more likely to be published – than boring, negative ones. We should give more incentives to academics for publishing negative results; but we should also give them more opportunity.
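To see how this kind of selection can, on its own, manufacture a literature full of positive findings, here is a small simulation sketch. Everything in it is an assumption made for illustration rather than anything taken from the study: the function name `simulate_literature`, a thousand imaginary labs, fifty subjects each, a true effect of exactly zero, and a rule that only statistically significant positive results are published.

```python
import random
from statistics import NormalDist, mean


def simulate_literature(n_studies=1000, n_subjects=50, true_effect=0.0, seed=1):
    """Run many imaginary studies of the same effect and 'publish' only
    those with a statistically significant positive result."""
    rng = random.Random(seed)
    critical_z = NormalDist().inv_cdf(0.975)  # two-sided 5 per cent threshold
    all_effects, published = [], []
    for _ in range(n_studies):
        # Each subject's outcome is noise around the true effect (sd = 1).
        sample = [rng.gauss(true_effect, 1.0) for _ in range(n_subjects)]
        estimate = mean(sample)
        z_score = estimate * n_subjects ** 0.5  # sd is known to be 1
        all_effects.append(estimate)
        if z_score > critical_z:
            published.append(estimate)
    return all_effects, published


all_effects, published = simulate_literature()
print(f"studies run: {len(all_effects)}, studies published: {len(published)}")
print(f"average effect across all studies: {mean(all_effects):.3f}")
print(f"average effect in published studies: {mean(published):.3f}")
```

In this toy world nothing is really going on, yet anyone reading only the published studies would see a consistent positive effect, because the flukes are the only results that made it into print.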
This means changing the behaviour of academic journals, and here we are faced with a problem. Although they are usually academics themselves, journal editors have their own interests and agendas, and have more in common with everyday journalists and newspaper editors than some of them might wish to admit, as the episode of the precognition experiment above illustrates very clearly. Whether journals like this are a sensible model for communicating research at all is a hotly debated subject in academia, but this is the current situation. Journals are the gatekeepers, they make decisions on what’s relevant and interesting for their audience, and they compete for readers.
This can lead them to behave in ways that don’t reflect the best interests of science, because an individual journal’s desire to provide colourful content might conflict with the collective need to provide a comprehensive picture of the evidence. In newspaper journalism, there is a well-known aphorism: ‘When a dog bites a man, that’s not news; but when a man bites a dog…’ These judgements on newsworthiness in mainstream media have even been demonstrated quantitatively. One study in 2003, for example, looked at the BBC’s health news coverage over several months, and calculated how many people had to die from a given cause for one story to appear. 8,571 people died from smoking for each story about smoking; but there were three stories for every death from new variant CJD, or ‘mad cow disease’.[32] Another, in 1992, looked at print-media coverage of drug deaths, and found that you needed 265 deaths from paracetamol poisoning for one story about such a death to appear in a paper; but every death from MDMA received, on average, one piece of news coverage.[33]
If similar judgements are influencing the content of academic journals, then we have a problem. But can it really be the case that academic journals are the bottleneck, preventing doctors and academics from having access to unflattering trial results about the safety and effectiveness of the drugs they use? This argument is commonly deployed by industry, and researchers too are often keen to blame journals for rejecting negative findings en masse. Luckily, this has been the subject of some research; and overall, while journals aren’t blameless, it’s hard to claim that they are the main source of this serious public-health problem. This is especially so since there are whole academic journals dedicated to publishing clinical trials, with a commitment to publishing negative results written into their constitutions.
But to be kind, for the sake of completeness, and because industry and researchers are so keen to pass the blame on to academic journals, we can see if what they claim is true.
One survey simply asked the authors of unpublished work if they had ever submitted it for publication. One hundred and twenty-four unpublished results were identified, by following up on every study approved by a group of US ethics committees, and when the researchers contacted the teams behind the unpublished results, it turned out that only six papers had ever actually been submitted and rejected.[34]
Perhaps, you might say, this was a freak finding. Another approach is to follow up all the papers submitted to one journal, and see if those with negative results are rejected more often. Here again, the journals seem blameless: 745 manuscripts submitted to the Journal of the American Medical Association (JAMA) were followed up, and there was no difference in acceptance rate for significant and non-significant findings.[35] The same thing has been tried with papers submitted to the BMJ, the Lancet, Annals of Internal Medicine and the Journal of Bone and Joint Surgery.[36] Again and again, no effect was found. Might that be because the journals played fair when they knew they were being watched? Turning around an entire publishing operation for one brief performance would be tough, but it’s possible.
These studies all involved observing what has happened in normal practice. One last option is to run an experiment, sending identical papers to various journals, but changing the direction of the results at random, to see if that makes any difference to the acceptance rates. This isn’t something you’d want to do very often, because it wastes a lot of people’s time; but since publication bias matters, it has been regarded as a justifiable intrusion on a few occasions.
In 1990 a researcher called Epstein created a series of fictitious papers, with identical methods and presentation, differing only in whether they reported positive or negative results. He sent them at random to 146 social-work journals: the positive papers were accepted 35 per cent of the time, and the negative ones 26 per cent of the time, a difference that wasn’t large enough to be statistically significant.[37]
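If you want to see why a gap of 35 per cent against 26 per cent can fail to reach significance with this many journals, here is a rough sketch of a standard two-proportion z-test. The even split of the 146 journals into two groups of 73 is my assumption for illustration (the split isn’t given above), and `two_proportion_z_test` is just a name chosen for this example, so the output is a ballpark figure rather than a reconstruction of Epstein’s own analysis.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(p1, n1, p2, n2):
    """Two-sided z-test for the difference between two proportions,
    using the pooled proportion for the standard error."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    std_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z_score = (p1 - p2) / std_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z_score)))
    return z_score, p_value


# Assumed group sizes: 146 journals split evenly, 73 per arm.
z_score, p_value = two_proportion_z_test(0.35, 73, 0.26, 73)
print(f"z = {z_score:.2f}, two-sided p = {p_value:.2f}")
```

Under that assumption the p-value comes out well above the conventional 0.05 threshold: a nine-point difference in acceptance rates is simply too small to distinguish from chance with groups of this size.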
Other studies have tried something similar on a smaller scale, not submitting a paper to a journal, but rather, with the assistance of the journal, sending spoof academic papers to individual peer reviewers: these people do not make the final decision on publication, but they do give advice to editors, so a window into their behaviour would be useful. These studies have had more mixed results. In one from 1977, sham papers with identical methods but different results were sent to seventy-five reviewers. Some bias was found from reviewers against findings that disagreed with their own views.[38] Another study, from 1994, looked at reviewers’ responses to a paper on TENS machines: these are fairly controversial devices sold for pain relief. Thirty-three reviewers with strong views one way or the other were identified, and again it was found that their judgements on the paper were broadly correlated with their pre-existing views, though the study was small.[39] Another paper did the same thing with papers on quack treatments; it found that the direction of findings had no effect on reviewers from mainstream medical journals deciding whether to accept them.[40]
One final randomised trial from 2010 tried on a grand scale to see if reviewers really do reject ideas based on their pre-existing beliefs (a good indicator of whether journals are biased by results, when they should be focused simply on whether a study is properly designed and conducted). Fabricated papers were sent to over two hundred reviewers, and they were all identical, except for the results they reported: half of the reviewers got results they would like, half got results they wouldn’t. Reviewers were more likely to recommend publication if they received the version of the manuscript with results they’d like (97 per cent vs 80 per cent), more likely to detect errors in a manuscript whose results they didn’t like, and rated the methods more highly in papers whose results they liked.[41]
Overall, though, even if there are clearly rough edges in some domains, these results don’t suggest that the journals are the main cause of the problem of the disappearance of negative trials. In the experiments isolating the peer reviewers, those individual referees were biased in some studies, but they don’t have the last word on publication, and in all the studies which look at what happens to negative papers submitted to journals in the real world, the evidence shows that they proceed into print without problems. Journals may not be entirely innocent, but it would be wrong to lay the blame at their door.
In the light of all this, the data on what researchers say about their own behaviour is very revealing. In various surveys they have said that they thought there was no point in submitting negative results, because they would just be rejected by journals: 20 per cent of medical researchers said so in 1998;[42] 61 per cent of psychology and education researchers said so in 1991;[43] and so on.[44] If asked why they’ve failed to send in research for publication, the most common reasons researchers give are negative results, a lack of interest, or a lack of time.
This is the more abstract end of academia – largely away from the immediate world of clinical trials – but it seems that academics are mistaken, at best, about the reasons why negative results go missing. Journals may pose some barriers to publishing negative results, but they are hardly absolute, and much of the problem lies in academics’ motivations and perceptions.
More than that, in recent years, the era of open-access academic journals has got going in earnest: there are now several, such as Trials, which are free to access, and have a core editorial policy that they will accept any trial report, regardless of result, and will actively solicit negative findings. With offers like this on the table, it is very hard to believe that anyone would really struggle to publish a trial with a negative result if they wanted to. And yet, despite this, negative results continue to go missing, with vast multinational companies simply withholding results on their drugs, even though academics and doctors are desperate to see them.
You might reasonably wonder whether there are people who are supposed to prevent this kind of data from being withheld. The universities where research takes place, for example; or the regulators; or the ‘ethics committees’, which are charged with protecting patients who participate in research. Unfortunately, our story is about to take a turn to the dark side. We will see that many of the very people and organisations we would have expected to protect patients from the harm inflicted by missing data have, instead, shirked their responsibilities; and worse than that, we will see that many of them have actively conspired in helping companies to withhold data from patients. We are about to hit some big problems, some bad people, and some simple solutions.
How ethics committees and universities have failed us
By now, you will, I hope, share my view that withholding results from clinical trials is unethical, for the simple reason that hidden data exposes patients to unnecessary and avoidable harm. But the ethical transgressions here go beyond the simple harm inflicted on future patients.