It is worth reiterating that, contrary to popular depictions, science does not rely on authority as an indicator of truth.
The video reminds me of an xkcd comic that shows how statistical significance misleads when the studies showing no effect go unreported.
In this analogy, the study linking green jelly beans to acne is statistically significant (p < 0.05): if there were no real link, a result at least this strong would turn up by coincidence no more than 5% of the time. That would be convincing evidence of a link, except that the 19 studies of other jelly bean colors, which found no link to acne, were unreported and discarded. If all the study results had been reported, they would suggest that the green-jelly-bean result is indeed a coincidence: at a 5% threshold, about 1 in 20 tests (1/20 = 5%) is expected to come up positive by chance alone.
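The arithmetic behind the analogy can be sketched in a few lines of Python. This is only an illustration under assumptions of my own: the 20 tests are treated as independent, each uses a 5% threshold, and no jelly bean color has any real effect. The function names are made up for this example.

```python
# Multiple-comparisons arithmetic for the jelly bean analogy.
# Assumptions (mine, for illustration): 20 independent tests, a 5% threshold
# for each, and no real link between any jelly bean color and acne.
ALPHA = 0.05
N_TESTS = 20


def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Average number of tests that come up 'significant' by chance alone."""
    return n_tests * alpha


def prob_at_least_one_false_positive(n_tests: int, alpha: float) -> float:
    """Chance that at least one of n independent null tests is 'significant'."""
    return 1.0 - (1.0 - alpha) ** n_tests


if __name__ == "__main__":
    print("Expected chance positives:", expected_false_positives(N_TESTS, ALPHA))
    print("P(at least one chance positive):",
          round(prob_at_least_one_false_positive(N_TESTS, ALPHA), 3))
```

Under these assumptions, one chance positive among 20 tests is the expected outcome, and the chance of seeing at least one is well over half. A single significant color is therefore weak evidence once the other 19 tests are counted.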
Scientific studies in real life can be even worse. Companies, and even university researchers, are not obligated to publish studies in which the results show no effect (studies with “null results”). This means that researchers can run the hypothetical green-jelly-bean study 20 times until they get the result that they want, by coincidence. What normally happens does not involve ill intent, but has the same effect. The hypothetical green-jelly-bean study is run independently by 20 different research teams (who can be separated by time), who are unaware of each other, because studies with negative results are not published. Only the group with the positive result publishes its results, but the result is actually a coincidence. See the concept of publication bias at Wikipedia.
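The scenario of 20 unaware teams can be simulated roughly as follows. This is a sketch under stated assumptions, not a model of any real literature: every study compares two groups drawn from the same distribution (so there is truly no effect), uses a simple z-test with known variance, and is "published" only if p < 0.05. The group size, number of rounds, and all function names are hypothetical choices made for the example.

```python
import math
import random


def run_null_study(rng: random.Random, n_per_group: int = 30) -> float:
    """Simulate one study with no true effect; return its two-sided p-value."""
    treated = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
    control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
    diff = sum(treated) / n_per_group - sum(control) / n_per_group
    # Both groups have known unit variance, so the standard error is sqrt(2/n).
    z = diff / math.sqrt(2.0 / n_per_group)
    # Two-sided p-value under the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2.0))


def simulate_publication_bias(n_rounds: int = 1000, teams_per_round: int = 20,
                              alpha: float = 0.05, seed: int = 0) -> dict:
    """Each round, 20 teams run the same null study; only p < alpha is published."""
    rng = random.Random(seed)
    total = 0
    significant = 0
    rounds_with_publication = 0
    for _ in range(n_rounds):
        published_this_round = 0
        for _ in range(teams_per_round):
            total += 1
            if run_null_study(rng) < alpha:
                significant += 1
                published_this_round += 1
        if published_this_round > 0:
            rounds_with_publication += 1
    return {
        "share_of_all_studies_significant": significant / total,
        "share_of_rounds_with_a_published_positive": rounds_with_publication / n_rounds,
    }


if __name__ == "__main__":
    for name, value in simulate_publication_bias().items():
        print(f"{name}: {value:.3f}")
```

In a run like this, only about one study in twenty is significant, yet most rounds still end with at least one published positive result. Since the null results stay in the drawer, a reader of the published record sees only positive findings for an effect that does not exist.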
Transcript of Ben Goldacre talk from TED.com:
So I’m a doctor, but I kind of slipped sideways into research, and now I’m an epidemiologist. And nobody really knows what epidemiology is. Epidemiology is the science of how we know in the real world if something is good for you or bad for you. And it’s best understood through example as the science of those crazy, wacky newspaper headlines. And these are just some of the examples.
These are from the Daily Mail. Every country in the world has a newspaper like this. It has this bizarre, ongoing philosophical project of dividing all the inanimate objects in the world into the ones that either cause or prevent cancer. So here are some of the things they said cause cancer recently: divorce, Wi-Fi, toiletries and coffee. Here are some of the things they say prevent cancer: crusts, red pepper, licorice and coffee. So already you can see there are contradictions. Coffee both causes and prevents cancer. And as you start to read on, you can see that maybe there’s some kind of political valence behind some of this. So for women, housework prevents breast cancer, but for men, shopping could make you impotent. So we know that we need to start unpicking the science behind this.
And what I hope to show is that unpicking dodgy claims, unpicking the evidence behind dodgy claims, isn’t a kind of nasty carping activity; it’s socially useful, but it’s also an extremely valuable explanatory tool. Because real science is all about critically appraising the evidence for somebody else’s position. That’s what happens in academic journals. That’s what happens at academic conferences. The Q&A session after a postdoc presents data is often a bloodbath. And nobody minds that. We actively welcome it. It’s like a consenting intellectual S&M activity. So what I’m going to show you is all of the main things, all of the main features of my discipline — evidence-based medicine. And I will talk you through all of these and demonstrate how they work, exclusively using examples of people getting stuff wrong.
So we’ll start with the absolute weakest form of evidence known to man, and that is authority. In science, we don’t care how many letters you have after your name. In science, we want to know what your reasons are for believing something. How do you know that something is good for us or bad for us? But we’re also unimpressed by authority, because it’s so easy to contrive. This is somebody called Dr. Gillian McKeith Ph.D., or, to give her full medical title, Gillian McKeith. (Laughter) Again, every country has somebody like this. She is our TV diet guru. She has had a massive five series of prime-time television, giving out very lavish and exotic health advice. She, it turns out, has a non-accredited correspondence course Ph.D. from somewhere in America. She also boasts that she’s a certified professional member of the American Association of Nutritional Consultants, which sounds very glamorous and exciting. You get a certificate and everything. This one belongs to my dead cat Hetti. She was a horrible cat. You just go to the website, fill out the form, give them $60, and it arrives in the post. Now that’s not the only reason that we think this person is an idiot. She also goes and says things like, you should eat lots of dark green leaves, because they contain lots of chlorophyll, and that will really oxygenate your blood. And anybody who’s done school biology remembers that chlorophyll and chloroplasts only make oxygen in sunlight, and it’s quite dark in your bowels after you’ve eaten spinach.
Next, we need proper science, proper evidence. So, “Red wine can help prevent breast cancer.” This is a headline from the Daily Telegraph in the U.K. “A glass of red wine a day could help prevent breast cancer.” So you go and find this paper, and what you find is it is a real piece of science. It is a description of the changes in one enzyme when you drip a chemical extracted from some red grape skin onto some cancer cells in a dish on a bench in a laboratory somewhere. And that’s a really useful thing to describe in a scientific paper, but on the question of your own personal risk of getting breast cancer if you drink red wine, it tells you absolutely bugger all. Actually, it turns out that your risk of breast cancer actually increases slightly with every amount of alcohol that you drink. So what we want is studies in real human people.
And here’s another example. This is from Britain’s leading diet and nutritionist in the Daily Mirror, which is our second biggest selling newspaper. “An Australian study in 2001 found that olive oil in combination with fruits, vegetables and pulses offers measurable protection against skin wrinklings.” And then they give you advice: “If you eat olive oil and vegetables, you’ll have fewer skin wrinkles.” And they very helpfully tell you how to go and find the paper. So you go and find the paper, and what you find is an observational study. Obviously nobody has been able to go back to 1930, get all the people born in one maternity unit, and half of them eat lots of fruit and veg and olive oil, and then half of them eat McDonald’s, and then we see how many wrinkles you’ve got later.
You have to take a snapshot of how people are now. And what you find is, of course, people who eat veg and olive oil have fewer skin wrinkles. But that’s because people who eat fruit and veg and olive oil, they’re freaks, they’re not normal, they’re like you; they come to events like this. They are posh, they’re wealthy, they’re less likely to have outdoor jobs, they’re less likely to do manual labor, they have better social support, they’re less likely to smoke — so for a whole host of fascinating, interlocking social, political and cultural reasons, they are less likely to have skin wrinkles. That doesn’t mean that it’s the vegetables or the olive oil.
So ideally what you want to do is a trial. And everybody thinks they’re very familiar with the idea of a trial. Trials are very old. The first trial was in the Bible — Daniel 1:12. It’s very straightforward — you take a bunch of people, you split them in half, you treat one group one way, you treat the other group the other way, and a little while later, you follow them up and see what happened to each of them. So I’m going to tell you about one trial, which is probably the most well-reported trial in the U.K. news media over the past decade. And this is the trial of fish oil pills. And the claim was fish oil pills improve school performance and behavior in mainstream children. And they said, “We’ve done a trial. All the previous trials were positive, and we know this one’s gonna be too.” That should always ring alarm bells. Because if you already know the answer to your trial, you shouldn’t be doing one. Either you’ve rigged it by design, or you’ve got enough data so there’s no need to randomize people anymore.
So this is what they were going to do in their trial. They were taking 3,000 children, they were going to give them all these huge fish oil pills, six of them a day, and then a year later, they were going to measure their school exam performance and compare their school exam performance against what they predicted their exam performance would have been if they hadn’t had the pills. Now can anybody spot a flaw in this design? And no professors of clinical trial methodology are allowed to answer this question. So there’s no control; there’s no control group. But that sounds really techie. That’s a technical term. The kids got the pills, and then their performance improved.
What else could it possibly be if it wasn’t the pills? They got older. We all develop over time. And of course, also there’s the placebo effect. The placebo effect is one of the most fascinating things in the whole of medicine. It’s not just about taking a pill, and your performance and your pain getting better. It’s about our beliefs and expectations. It’s about the cultural meaning of a treatment. And this has been demonstrated in a whole raft of fascinating studies comparing one kind of placebo against another. So we know, for example, that two sugar pills a day are a more effective treatment for getting rid of gastric ulcers than one sugar pill. Two sugar pills a day beats one sugar pill a day. And that’s an outrageous and ridiculous finding, but it’s true. We know from three different studies on three different types of pain that a saltwater injection is a more effective treatment for pain than taking a sugar pill, taking a dummy pill that has no medicine in it — not because the injection or the pills do anything physically to the body, but because an injection feels like a much more dramatic intervention. So we know that our beliefs and expectations can be manipulated, which is why we do trials where we control against a placebo — where one half of the people get the real treatment and the other half get placebo.
But that’s not enough. What I’ve just shown you are examples of the very simple and straightforward ways that journalists and food supplement pill peddlers and naturopaths can distort evidence for their own purposes. What I find really fascinating is that the pharmaceutical industry uses exactly the same kinds of tricks and devices, but slightly more sophisticated versions of them, in order to distort the evidence that they give to doctors and patients, and which we use to make vitally important decisions.
So firstly, trials against placebo: everybody thinks they know that a trial should be a comparison of your new drug against placebo. But actually in a lot of situations that’s wrong. Because often we already have a very good treatment that is currently available, so we don’t want to know that your alternative new treatment is better than nothing. We want to know that it’s better than the best currently available treatment that we have. And yet, repeatedly, you consistently see people doing trials still against placebo. And you can get license to bring your drug to market with only data showing that it’s better than nothing, which is useless for a doctor like me trying to make a decision.
But that’s not the only way you can rig your data. You can also rig your data by making the thing you compare your new drug against really rubbish. You can give the competing drug in too low a dose, so that people aren’t properly treated. You can give the competing drug in too high a dose, so that people get side effects. And this is exactly what happened with antipsychotic medication for schizophrenia. 20 years ago, a new generation of antipsychotic drugs were brought in and the promise was that they would have fewer side effects. So people set about doing trials of these new drugs against the old drugs, but they gave the old drugs in ridiculously high doses — 20 milligrams a day of haloperidol. And it’s a foregone conclusion, if you give a drug at that high a dose, that it will have more side effects and that your new drug will look better.
10 years ago, history repeated itself, interestingly, when risperidone, which was the first of the new-generation antipsychotic drugs, came off copyright, so anybody could make copies. Everybody wanted to show that their drug was better than risperidone, so you see a bunch of trials comparing new antipsychotic drugs against risperidone at eight milligrams a day. Again, not an insane dose, not an illegal dose, but very much at the high end of normal. And so you’re bound to make your new drug look better. And so it’s no surprise that overall, industry-funded trials are four times more likely to give a positive result than independently sponsored trials.
But — and it’s a big but — (Laughter) it turns out, when you look at the methods used by industry-funded trials, that they’re actually better than independently sponsored trials. And yet, they always manage to get the result that they want. So how does this work? How can we explain this strange phenomenon? Well it turns out that what happens is the negative data goes missing in action; it’s withheld from doctors and patients. And this is the most important aspect of the whole story. It’s at the top of the pyramid of evidence. We need to have all of the data on a particular treatment to know whether or not it really is effective. And there are two different ways that you can spot whether some data has gone missing in action. You can use statistics, or you can use stories. I personally prefer statistics, so that’s what I’m going to do first.
This is something called a funnel plot. And a funnel plot is a very clever way of spotting if small negative trials have disappeared, have gone missing in action. So this is a graph of all of the trials that have been done on a particular treatment. And as you go up towards the top of the graph, what you see is each dot is a trial. And as you go up, those are the bigger trials, so they’ve got less error in them. So they’re less likely to be randomly false positives, randomly false negatives. So they all cluster together. The big trials are closer to the true answer. Then as you go further down at the bottom, what you can see is, over on this side, the spurious false negatives, and over on this side, the spurious false positives. If there is publication bias, if small negative trials have gone missing in action, you can see it on one of these graphs. So you can see here that the small negative trials that should be on the bottom left have disappeared. This is a graph demonstrating the presence of publication bias in studies of publication bias. And I think that’s the funniest epidemiology joke that you will ever hear.
That’s how you can prove it statistically, but what about stories? Well they’re heinous, they really are. This is a drug called reboxetine. This is a drug that I myself have prescribed to patients. And I’m a very nerdy doctor. I hope I try to go out of my way to try and read and understand all the literature. I read the trials on this. They were all positive. They were all well-conducted. I found no flaw. Unfortunately, it turned out that many of these trials were withheld. In fact, 76 percent of all of the trials that were done on this drug were withheld from doctors and patients. Now if you think about it, if I tossed a coin a hundred times, and I’m allowed to withhold from you the answers half the time, then I can convince you that I have a coin with two heads. If we remove half of the data, we can never know what the true effect size of these medicines is.
And this is not an isolated story. Around half of all of the trial data on antidepressants has been withheld, but it goes way beyond that. The Nordic Cochrane Group were trying to get a hold of the data on that to bring it all together. The Cochrane Groups are an international nonprofit collaboration that produce systematic reviews of all of the data that has ever been shown. And they need to have access to all of the trial data. But the companies withheld that data from them, and so did the European Medicines Agency for three years.
This is a problem that is currently lacking a solution. And to show how big it goes, this is a drug called Tamiflu, which governments around the world have spent billions and billions of dollars on. And they spend that money on the promise that this is a drug which will reduce the rate of complications with flu. We already have the data showing that it reduces the duration of your flu by a few hours. But I don’t really care about that. Governments don’t care about that. I’m very sorry if you have the flu, I know it’s horrible, but we’re not going to spend billions of dollars trying to reduce the duration of your flu symptoms by half a day. We prescribe these drugs, we stockpile them for emergencies on the understanding that they will reduce the number of complications, which means pneumonia and which means death. The infectious diseases Cochrane Group, which is based in Italy, has been trying to get the full data in a usable form out of the drug companies so that they can make a full decision about whether this drug is effective or not, and they’ve not been able to get that information. This is undoubtedly the single biggest ethical problem facing medicine today. We cannot make decisions in the absence of all of the information.
So it’s a little bit difficult from there to spin in some kind of positive conclusion. But I would say this: I think that sunlight is the best disinfectant. All of these things are happening in plain sight, and they’re all protected by a force field of tediousness. And I think, with all of the problems in science, one of the best things that we can do is to lift up the lid, finger around in the mechanics and peer in.
Thank you very much.
TED video via Sociological Images.