By Gary Taubes, author of Nobel Dreams (1987), Bad Science (1993), Good Calories, Bad Calories (2007), and Why We Get Fat (2011). Taubes is a former staff member at DISCOVER. He has won the Science in Society Award of the National Association of Science Writers three times and was awarded an MIT Knight Science Journalism Fellowship for 1996-97. A modified version of this post appeared on Taubes’ blog.
The last couple of weeks have witnessed a slightly-greater-than-usual outbreak of extremely newsworthy nutrition stories that could be described as bad journalism feasting on bad science. The first was a report out of the Harvard School of Public Health that meat-eating apparently causes premature death and disease (here’s how the New York Times covered it), and the second out of UC San Diego suggesting that chocolate is a food we should all be eating to lose weight (the Times again).
Both of these studies were classic examples of what is known technically as observational epidemiology, a field of research I discussed at great length back in 2007 in a cover article for the New York Times Magazine. The article was called “Do We Really Know What Makes Us Healthy?” and I made the argument that this particular pursuit is closer to a pseudoscience than a real science.
As a case study, I used a collaboration of researchers from the Harvard School of Public Health, led by Walter Willett, who runs the Nurses’ Health Study. And I pointed out that every time that these Harvard researchers had claimed that an association observed in their observational trials was a causal relationship—that food or drug X caused disease or health benefit Y—and that this supposed causal relationship had then been tested in experiment, the experiment had failed to confirm the causal interpretation—i.e., the folks from Harvard got it wrong. Not most times, but every time.
Now it’s these very same Harvard researchers—Walter Willett and his colleagues—who have authored the article from two weeks ago claiming that red meat and processed meat consumption is deadly; that eating it regularly raises our risk of dying prematurely and contracting a host of chronic diseases. Zoe Harcombe has done a wonderful job dissecting the paper at her site. I want to talk about the bigger picture (in a less concise way).
This is an issue about science itself and the quality of research done in nutrition. Science is ultimately about establishing cause and effect. It’s not about guessing. You come up with a hypothesis—force x causes observation y—and then you do your best to prove that it’s wrong. If you can’t, you tentatively accept the possibility that your hypothesis might be right. In the words of Karl Popper, a leading philosopher of science, “The method of science is the method of bold conjectures and ingenious and severe attempts to refute them.” The bold conjectures, the hypotheses, making the observations that lead to your conjectures… that’s the easy part. The ingenious and severe attempts to refute your conjectures are the hard part. Anyone can make a bold conjecture. (Here’s one: space aliens cause heart disease.) Testing hypotheses ingeniously and severely is the single most important part of doing science.
The problem with observational studies like the ones from Harvard and UCSD that gave us the bad news about meat and the good news about chocolate is that the researchers do little of this. The hard part of science is left out, and they skip straight to the endpoint, insisting that their causal interpretation of the association is the correct one and we should probably all change our diets accordingly.
In these observational studies, the epidemiologists establish a cohort of subjects to follow (tens of thousands of nurses and physicians, in the Harvard case) and then ask them about what they eat. The fact that they use questionnaires that are notoriously fallible is almost irrelevant here because the rest of the science is so flawed. Then they follow the subjects for decades. Now they have a database of diseases, deaths and foods consumed, and they can draw associations between what these people were eating and the diseases and deaths.
The end result is an association. In the meat-will-kill-you report, eating a lot of red meat and processed meat was associated with premature death and increased risk of chronic disease. That’s what they observed in the cohorts—the observation. The fifth of the subjects who ate the most meat (the top quintile, as it’s known in the technical jargon) had a 20 percent greater risk of dying over the course of the study than the subjects who ate the least meat (the bottom quintile).
This association then generates a hypothesis, which is why these associations used to be known as “hypothesis-generating data” (before these epidemiologists decided they were tired of their hypotheses being shot down by experiments and they’d skip this step). Because of the association that we’ve observed, so this thinking goes, we now hypothesize that eating red meat and particularly processed meat is bad for our health and we will live longer and prosper more if we don’t do it. We hypothesize that the cause of the association we’ve observed is that red and processed meat is unhealthy stuff.
Terrific. We have our bold conjecture. What should we do next?
Well, because this is supposed to be a science, we ask the question whether we can imagine other less newsworthy explanations for the association we’ve observed. What else might cause it? An association by itself contains no causal information. There are an infinite number of associations that are not causally related for every association that is, so the fact of the association itself doesn’t tell us much.
Moreover, this meat-eating association with disease is a tiny association. It’s not the 20-fold increased risk of lung cancer that pack-a-day smokers have compared to non-smokers. It’s a 0.2-fold increased risk—1/100th the size. So with lung cancer we could buy as a society the observation that cigarettes cause lung cancer because it was and remains virtually impossible to imagine what other factor could explain an association so huge and dramatic. Experiments didn’t need to be done to test the hypothesis because, well, the signal was just so big that the epidemiologists of the time could safely believe it was real. And then experiments were, in effect, done anyway. People quit smoking and lung cancer rates came down.
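To make the size comparison concrete, here is a minimal Python sketch of the arithmetic. It uses only the two figures quoted above and, as an assumption, treats both as excess risk over the comparison group, which is how the text compares them. The variable names are illustrative.

```python
# Figures as quoted in the text; the variable names are illustrative only.
smoking_excess_risk = 20.0  # "20-fold increased risk" for pack-a-day smokers
meat_excess_risk = 0.2      # "0.2-fold increased risk" (20 percent greater)

# Size of the meat association relative to the smoking association.
size_ratio = meat_excess_risk / smoking_excess_risk

# The same meat figure written as a relative risk (top vs. bottom quintile).
meat_relative_risk = 1.0 + meat_excess_risk

print(f"meat effect is {size_ratio:.2f} of the smoking effect")
print(f"meat relative risk: {meat_relative_risk:.1f}")
```

The first line printed reports 0.01, the 1/100th figure. The second shows that a 20 percent greater risk is a relative risk of about 1.2.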
When I first wrote about the questionable nature of observational epidemiology in Science back in 1995, “Epidemiology Faces Its Limits”, I noted that very few epidemiologists would ever take seriously an association smaller than a 3- or 4-fold increase in risk. (Not that they believed it was a causal relationship; only that they thought it was worth studying.) These Harvard people are discussing and getting an extraordinary amount of media attention over a 0.2-fold increased risk.
So how can we explain this tiny association—the 1/100th-the-size-of-the-lung-cancer-cigarette effect—between disease risk and eating a lot of red and processed meat, compared to eating virtually none? Again, we have an association, and our job is to figure out exactly how these two variables, meat-eating and disease, relate to each other, if they do at all. Here’s how the great German pathologist Rudolf Virchow phrased this in 1849: How, he said, can we “with certainty decide which of two coexistent phenomena is the cause and which the effect, whether one of them is the cause at all instead of both being effects of a third cause, or even whether both are effects of two entirely unrelated causes”? Again, this is the hard part.
The answer ultimately is that we do experiments. But we’ll get back to this in a minute. First, we must rack our brains to figure out if there are other causal explanations for this association besides the meat-eating one. Another way to think of this is that we’re looking for all the myriad possible ways our methodology and equipment might have fooled us. The first principle of science, as the legendary physicist Richard Feynman liked to say, is that you must not fool yourself—and you’re the easiest person to fool. Once we’ve thought up every possible, reasonable alternative hypothesis (so space aliens are out), we can see which ones survive the tests: our preferred hypothesis (meat-eating causes disease, in this case) or one of the many others we’ve considered.
So let’s think of reasonable ways in which people who eat a lot of meat might be different from people who don’t, looking specifically for differences that might also explain some of the association we observed between meat-eating, disease, and premature death. Zoe Harcombe did this beautifully with the Harvard data. The obvious clue is that as we move from the bottom quintile of meat-eaters (those who are effectively vegetarians) to the top quintile of meat-eaters, we see an increase in virtually every accepted unhealthy behavior (smoking, drinking, sedentary behavior), and we also see an increase in markers for unhealthy behaviors (high BMI, high blood pressure, etc). So what could be happening here?
In my New York Times Magazine article on this research, I discussed a whole host of effects, known as confounders—they confound the interpretation of the association—that could explain associations between two variables but have nothing to do biologically with the variables themselves. One of these confounders is called the compliance or adherer effect. Here’s what I said about it in the article:
The Bias of Compliance
A still more subtle component of healthy-user bias has to be confronted. This is the compliance or adherer effect. Quite simply, people who comply with their doctors’ orders when given a prescription are different and healthier than people who don’t. This difference may be ultimately unquantifiable. The compliance effect is another plausible explanation for many of the beneficial associations that epidemiologists commonly report, which means this alone is a reason to wonder if much of what we hear about what constitutes a healthful diet and lifestyle is misconceived.
The lesson comes from an ambitious clinical trial called the Coronary Drug Project that set out in the 1970s to test whether any of five different drugs might prevent heart attacks. The subjects were some 8,500 middle-aged men with established heart problems. Two-thirds of them were randomly assigned to take one of the five drugs and the other third a placebo. Because one of the drugs, clofibrate, lowered cholesterol levels, the researchers had high hopes that it would ward off heart disease. But when the results were tabulated after five years, clofibrate showed no beneficial effect. The researchers then considered the possibility that clofibrate appeared to fail only because the subjects failed to faithfully take their prescriptions.
As it turned out, those men who said they took more than 80 percent of the pills prescribed fared substantially better than those who didn’t. Only 15 percent of these faithful “adherers” died, compared with almost 25 percent of what the project researchers called “poor adherers.” This might have been taken as reason to believe that clofibrate actually did cut heart-disease deaths almost by half, but then the researchers looked at those men who faithfully took their placebos. And those men, too, seemed to benefit from adhering closely to their prescription: only 15 percent of them died compared with 28 percent who were less conscientious. “So faithfully taking the placebo cuts the death rate by a factor of two,” says David Freedman, a professor of statistics at the University of California, Berkeley [who passed away, regrettably, in 2008]. “How can this be? Well, people who take their placebo regularly are just different than the others. The rest is a little speculative. Maybe they take better care of themselves in general. But this compliance effect is quite a big effect.”
The moral of the story, says Freedman, is that whenever epidemiologists compare people who faithfully engage in some activity with those who don’t—whether taking prescription pills or vitamins or exercising regularly or eating what they consider a healthful diet—the researchers need to account for this compliance effect or they will most likely infer the wrong answer. They’ll conclude that this behavior, whatever it is, prevents disease and saves lives, when all they’re really doing is comparing two different types of people who are, in effect, incomparable.
This phenomenon is a particularly compelling explanation for why the Nurses’ Health Study and other cohort studies saw a benefit of H.R.T. [hormone replacement therapy, one subject of the article] in current users of the drugs, but not necessarily in past users. By distinguishing among women who never used H.R.T., those who used it but then stopped and current users (who were the only ones for which a consistent benefit appeared), these observational studies may have inadvertently focused their attention specifically on, as Jerry Avorn says, the “Girl Scouts in the group, the compliant ongoing users, who are probably doing a lot of other preventive things as well.”
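The Coronary Drug Project figures quoted in the excerpt can be checked with a few lines of Python. This is only a sketch of the arithmetic. The helper name `death_rate_ratio` is made up for illustration, and “almost 25 percent” is rounded to 25.

```python
def death_rate_ratio(poor_adherer_pct, good_adherer_pct):
    """How many times higher the death rate is among poor adherers."""
    return poor_adherer_pct / good_adherer_pct

# Percentages as quoted above ("almost 25 percent" is treated as 25).
clofibrate_ratio = death_rate_ratio(25, 15)
placebo_ratio = death_rate_ratio(28, 15)

print(f"clofibrate: {clofibrate_ratio:.2f}x, placebo: {placebo_ratio:.2f}x")
```

On these numbers the gap between poor and faithful adherers is at least as large for the placebo as for the drug. That is why the gap is attributed to the kind of person who adheres and not to clofibrate.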
It’s this compliance effect that makes these observational studies the equivalent of conventional wisdom-confirmation machines. Our public health authorities were doling out pretty much the same dietary advice in the 1970s and 1980s, when these observational studies were starting up, as they are now. The conventional health-conscious wisdom of the era had it that we should eat less fat and saturated fat, and so less red meat, which would also give us colon cancer, and certainly less processed meat, and more fruits, vegetables, and whole grains. And so the people who are studied in the cohorts could be divided into two groups: those who complied with this advice—the Girl Scouts, as Avorn put it—and those who didn’t.
Now when we’re looking at the subjects who avoided red meat and processed meat and comparing them to the subjects who ate them in quantity, we can think of it as effectively comparing the Girl Scouts to the non-Girl Scouts, those who complied with the conventional wisdom to those who didn’t. And the compliance effect tells us right there that we should see an association—that the Girl Scouts should appear to be healthier. (Actually they should be even healthier than Willett et al. are now reporting, which suggests that there’s something else working against them—maybe not eating enough red meat?) In other words, the people who avoided red meat and processed meats were the ones who fundamentally cared about their health and had the energy (and maybe the health and economic security) to act on it. And the people who ate a lot of red meat and processed meat in the 1980s and 1990s were the ones who didn’t. One example of how this advice could affect people’s behavior: I lived in LA in the 1990s where health-conscious behavior was and is the norm, and I’d bet that I didn’t have more than half a dozen servings of bacon or more than two steaks a year through the 1990s. It was all skinless chicken breasts and fish and way too much pasta and cereal (oatmeal or some other non-fat grain) and thousands upon thousands of egg whites without the yolks. Because that’s what we thought was healthy.
So when we compare people who ate a lot of meat and processed meat in this period to those who were effectively vegetarians, we’re comparing people who are inherently incomparable. We’re comparing health-conscious compliers to non-compliers; people who cared about their health and had the income and energy to do something about it and people who didn’t. And the compliers should always appear to be healthier in these cohorts because of the compliance effect if nothing else. No amount of “correcting” for BMI and blood pressure, smoking status, etc. can correct for this compliance effect, which is the product of all these health-conscious behaviors that can’t be measured, or just haven’t been measured. And we know this because it’s even present in randomized controlled trials, where this effect was first discovered. When the Harvard people insist they can “correct” for this, or that it’s not a factor, they’re fooling themselves. And we know they’re fooling themselves because the experimental trials keep confirming that.
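A toy simulation shows how this can happen. The Python sketch below is not a model of the Harvard cohorts. Every probability in it is invented for illustration, and `simulate_cohort` is a hypothetical name. By construction, meat has no effect on death at all: an unmeasured trait, health consciousness, drives both who eats meat and who dies.

```python
import random

def simulate_cohort(n=200_000, seed=0):
    """Observational cohort in which meat has NO causal effect on death."""
    rng = random.Random(seed)
    tallies = {"high_meat": [0, 0], "low_meat": [0, 0]}  # [deaths, people]
    for _ in range(n):
        # Unmeasured trait (assumed): half the cohort is health-conscious.
        health_conscious = rng.random() < 0.5
        # Health-conscious people follow the advice and mostly avoid meat.
        p_high_meat = 0.2 if health_conscious else 0.8
        group = "high_meat" if rng.random() < p_high_meat else "low_meat"
        # Death risk depends only on the trait, never on the meat group.
        p_death = 0.10 if health_conscious else 0.15
        died = rng.random() < p_death
        tallies[group][0] += died
        tallies[group][1] += 1
    return {g: deaths / people for g, (deaths, people) in tallies.items()}

rates = simulate_cohort()
print(rates)
```

Under these made-up numbers the high-meat group dies at a rate of roughly 14 percent and the low-meat group at roughly 11 percent. The gap is about the size of the reported association, and it appears even though the simulation never lets meat touch the outcome.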
That was the message of my 2007 article. As one friend described it to me a few years ago, when these cohort studies compare their top quintile of meat-eaters to their bottom quintile, they might as well be comparing Berkeley vegetarians who eat at Alice Waters’ famous Chez Panisse restaurant once a week after their yoga practice to truck drivers from West Virginia whose idea of a night on the town is chicken-fried steak (and potatoes and beer and maybe some sweet potato pie with whipped cream) at the local truck stop. The researchers can imply, as Willett and his colleagues do, that the most likely reason these people have different levels of morbidity and mortality is the amount of meat they eat; but that’s only because that’s what these observational epidemiologists have to believe to justify the decades of work and tens, if not hundreds, of millions of dollars that have been spent on these trials. Not because it’s the most likely explanation. It’s far more likely that the difference is caused by all the behaviors that associate with meat-eating or effective vegetarianism—whether they are, in effect, Girl Scouts or not.
As for the chocolate study, it’s the same story. All you have to do is ask yourself who eats a lot of chocolate, or who admits to researchers in this kind of study that they eat a lot of chocolate? And what’s the chance that it’s the lean people, healthy people? In fact, I have one epidemiologist friend at UCLA who’s tall and lean and active and he knows my opinion about sugar—probably toxic—and so every time we get together for a meal he makes a point of having two or three desserts and no appetizer or main course. He can do it, because of his genetic disposition, and so he does. I’m convinced I can’t, and so I don’t. So there’s a very good chance that what these epidemiologists are learning from their studies is that lean people can eat chocolate without getting fat (at least yet) and so they do. And those of us who put on weight easily stay away from it because we have a pretty strong feeling that it’s bad for us and will make us fatter. Before I published an article based on these observational studies making any other claim than that—let alone the claim that eating more chocolate would make me thin—I’d want to spend a few years at least, if not a few decades, trying to figure out how I was probably fooling myself.
This is why the best epidemiologists—the ones I quote in the NYT Magazine article—think this nutritional epidemiology business is a pseudoscience. Observational studies like those run by the Harvard or UCSD researchers can come up with the right hypothesis of causality about as often as a stopped clock gives you the right time. It’s bound to happen on occasion, but there’s no way to tell when that is without doing experiments to test all your competing hypotheses.
It’s a sad state of affairs.
Now let’s get back to the idea of doing experiments—i.e., how we ultimately settle this difference of opinion. This is science. Do the experiments. We have at least two reasonable explanations for the tiny association between meat-eating and morbidity and mortality. One is that it’s the meat itself. The other is that it’s the behaviors that associate with meat-eating. So do an experiment to see which is right. Start with a cohort of subjects and assign them at random to eat either a diet rich in red meat and processed meat or a diet that’s not—a mostly vegetarian diet. By assigning subjects at random to one of these two interventions, we mostly get rid of the behavioral (and socio-economic, educational, etc.) factors that might associate with choosing of your own free will whether to be a vegetarian (or a mostly-vegetarian) or a meat-eater.
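The effect of random assignment can be sketched in Python as well. As before, this is a toy with invented probabilities and a hypothetical function name, `simulate_trial`. It assumes, purely for illustration, that diet has no effect on death and that an unmeasured trait, health consciousness, does.

```python
import random

def simulate_trial(n=200_000, seed=0):
    """Randomized trial: diet is assigned by coin flip, not chosen."""
    rng = random.Random(seed)
    tallies = {"meat_diet": [0, 0], "vegetarian_diet": [0, 0]}  # [deaths, people]
    for _ in range(n):
        # Unmeasured trait (assumed): half the subjects are health-conscious.
        health_conscious = rng.random() < 0.5
        # The coin flip ignores who the subject is, so the trait ends up
        # spread about evenly across both arms.
        arm = "meat_diet" if rng.random() < 0.5 else "vegetarian_diet"
        # Assumed for illustration: death risk depends only on the trait.
        p_death = 0.10 if health_conscious else 0.15
        died = rng.random() < p_death
        tallies[arm][0] += died
        tallies[arm][1] += 1
    return {arm: deaths / people for arm, (deaths, people) in tallies.items()}

rates = simulate_trial()
print(rates)
```

Because the trait is balanced across the arms, both death rates come out near 12.5 percent and the spurious gap disappears. If the assigned diet did change the risk of death, the difference between the arms would reflect that effect and not the behavior of the people who chose the diet.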
These experiments have effectively been done. They’re the trials that compare Atkins-like diets to other more conventional weight loss diets—AHA Step 1 diets, Mediterranean diets, Zone diets, Ornish diets, etc. These conventional weight loss diets tend to restrict meat consumption to different extents because they restrict fat and/or saturated fat consumption and meat has a lot of fat and saturated fat in it. Ornish’s diet is the extreme example. And when these experiments have been done, the meat-rich, bacon-rich Atkins diet almost invariably comes out ahead, not just in weight loss but also in heart disease and diabetes risk factors. I discuss this in detail in chapter 18 of Why We Get Fat, “The Nature of a Healthy Diet.” The Stanford A TO Z Study is a good example of these experiments. Over the course of the experiment—two years in this case—the subjects randomized to the Atkins-like meat- and bacon-heavy diet were healthier. That’s what we want to know.
Ultimately we’re left with a decision about what we’re going to believe: the observations, or the experiments designed to test those observations. Good scientists will always tell you to believe the experiments. That’s why they do them.