I’ve been on something of an unintended hiatus for the past few months, as I try to wrap up some projects and deal with a few other things in my life. However, I just read something that has given me a kick in the pants. And I don’t mean that in a good way. In late December there was an article by Jonah Lehrer in the New Yorker titled “The truth wears off”. Much more suggestive was the subtitle, “Is there something wrong with the scientific method?” The story discusses the “decline effect”: an article is published with startling results, and then subsequent work finds increasingly diminished evidence for the initial unexpected result. It’s as if there’s “cosmic habituation”, with the Universe conspiring to make a surprising result go away with time. The last paragraph sums things up:
The decline effect is troubling because it reminds us how difficult it is to prove anything. We like to pretend that our experiments define the truth for us. But that’s often not the case. Just because an idea is true doesn’t mean it can be proved. And just because an idea can be proved doesn’t mean it’s true. When the experiments are done, we still have to choose what to believe.
I don’t particularly disagree with any of this. But it’s completely beside the point, and to untutored ears can be immensely misleading. The article is a perfect example of precisely the effect it seeks to describe (there must be a catchy word for this? Intellectual onomatopoeia?). The article gives a few examples of people finding interesting results, only to have them disappear on sustained scrutiny. It makes it sound like there is an epidemic of declining confidence:
One of the first demonstrations of this mysterious phenomenon came in the early nineteen-thirties. Joseph Banks Rhine, a psychologist at Duke, had developed an interest in the possibility of extrasensory perception, or E.S.P. Rhine devised an experiment featuring Zener cards, a special deck of twenty-five cards printed with one of five different symbols: a card was drawn from the deck and the subject was asked to guess the symbol. Most of Rhine’s subjects guessed about twenty per cent of the cards correctly, as you’d expect, but an undergraduate named Adam Linzmayer averaged nearly fifty per cent during his initial sessions, and pulled off several uncanny streaks, such as guessing nine cards in a row. The odds of this happening by chance are about one in two million. Linzmayer did it three times.
Rhine documented these stunning results in his notebook and prepared several papers for publication. But then, just as he began to believe in the possibility of extrasensory perception, the student lost his spooky talent. Between 1931 and 1933, Linzmayer guessed at the identity of another several thousand cards, but his success rate was now barely above chance. Rhine was forced to conclude that the student’s “extra-sensory perception ability has gone through a marked decline.”
This all sounds quite impressive. I don’t know the details of how many cards he was going through, but it sounds like it’s easily thousands. I calculate the odds of at least one 9-card streak as roughly a tenth of a percent if you go through a couple of thousand cards. This is much more likely than 1 in 2 million (that figure is 1 in 5^9, which is relevant only if you look at just 9 cards, one time). No doubt getting 9 in a row three times over a period of a few weeks (or even years) would be a large statistical anomaly. But it’s a long way from something I would issue a press release about. Carl Sagan summed it up best: “Extraordinary claims require extraordinary evidence”. If you’re going to claim some “extra-sensory perception” that would require a new physical force, and fundamentally alter all of modern physics, you might need more than a one-time statistical fluke. How about a whole series of controlled, double-blind experiments? Lo and behold, when this is done, the effects vanish. But by then the original results are published, and the damage is done. We’re still talking about this one “experiment” 80 years later. But if we integrate over all the equivalent subsequent experiments, there’s no doubt that the effect regressed to the mean, and can be ignored. So how is this even remotely interesting?
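For anyone who wants to check the streak arithmetic, here is a minimal sketch. It assumes every guess is an independent 1-in-5 shot, which ignores the fact that a real Zener deck is 25 cards dealt without replacement, and the 2,000-card session length is my own guess, not a number from Rhine’s notebooks. The function name prob_of_streak is just an illustrative label.

```python
def prob_of_streak(n_cards, streak_len, p_hit):
    """Probability of at least one run of `streak_len` consecutive hits
    somewhere in `n_cards` independent guesses, each correct with
    probability `p_hit` (an assumed model, not Rhine's actual protocol)."""
    # state[r] = probability that no full streak has happened yet
    # and the current run of hits has length r.
    state = [0.0] * streak_len
    state[0] = 1.0
    hit = 0.0  # probability that a full streak has already occurred
    for _ in range(n_cards):
        new_state = [0.0] * streak_len
        for run, mass in enumerate(state):
            new_state[0] += mass * (1.0 - p_hit)  # a miss resets the run
            if run + 1 == streak_len:
                hit += mass * p_hit               # streak completed
            else:
                new_state[run + 1] += mass * p_hit
        state = new_state
    return hit


# Exactly 9 cards, one attempt: this is the "1 in 2 million" number.
single_shot = prob_of_streak(9, 9, 0.2)
# A couple of thousand cards (assumed session length).
long_session = prob_of_streak(2000, 9, 0.2)
print(f"9 cards:    1 in {1 / single_shot:,.0f}")
print(f"2000 cards: {100 * long_session:.3f} percent")
```

Under these assumptions the long-session number comes out at a bit under a tenth of a percent, more than a thousand times likelier than the single-shot figure. It is still rare, just nowhere near one in two million.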
It takes Lehrer six pages to finally get around to the topic of publication bias. Suppose you do an experiment and find a sensational, Earth-shattering result. Human nature being what it is, you’re likely to try to publish it (and journals like Nature are likely to publicize it). Fads happen all the time in science. It’s a human activity after all. And then you (and the rest of the community) do a lot more work, and if it’s a statistical fluke, or poorly analyzed data, or a poorly conceived or biased experiment, the result will fade into oblivion. The “decline effect” that this article is making a fuss about is precisely the process by which the scientific method works. The truth will out.
On the other hand, suppose you do an experiment and find the result you (and everyone else) would expect. For example, you drop a ball and, indeed, it falls to the floor, exactly in accordance with our theory of gravity. You’re unlikely to write up the results. You’re even less likely to be able to get them published. And you’re certainly not going to spawn a whole bunch of follow-up experiments trying to duplicate your “null” results. So there’s no “incline effect”. This is not a surprise. It’s not a sign that science is broken. It’s a sign that we try to be selective and efficient in our experiments.
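The selection effect described in the last two paragraphs is easy to reproduce in a toy simulation. This is only a sketch under made-up assumptions: the true effect is exactly zero, every lab measures 20 subjects with unit Gaussian noise, and a lab “publishes” only if its first result sits more than two standard errors above zero. None of these numbers come from Lehrer’s article, and the function names are hypothetical.

```python
import random
import statistics


def run_study(rng, true_effect, n_subjects):
    """Mean of n_subjects noisy measurements (unit-variance Gaussian noise)."""
    return statistics.fmean(
        [rng.gauss(true_effect, 1.0) for _ in range(n_subjects)]
    )


def decline_effect(true_effect=0.0, n_labs=5000, n_subjects=20, seed=1):
    """Toy model of publication bias followed by replication.

    Returns (mean published estimate, mean replication estimate,
    number of published studies).
    """
    rng = random.Random(seed)
    # Assumed publication rule: the first result must be at least
    # two standard errors above zero to be considered "surprising".
    threshold = 2.0 / n_subjects ** 0.5
    published, replications = [], []
    for _ in range(n_labs):
        first = run_study(rng, true_effect, n_subjects)
        if first > threshold:
            published.append(first)
            # Every published result gets one unbiased follow-up study.
            replications.append(run_study(rng, true_effect, n_subjects))
    return (
        statistics.fmean(published),
        statistics.fmean(replications),
        len(published),
    )


first_mean, repl_mean, n_published = decline_effect()
print(f"published studies:       {n_published}")
print(f"mean published effect:   {first_mean:.3f}")
print(f"mean replication effect: {repl_mean:.3f}")
```

The published effects are large by construction, because that is how they got published. The follow-ups were never filtered, so they scatter around the true value of zero. Nothing mysterious is wearing off; the filter is simply no longer being applied.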
None of this is to say that there aren’t legitimate concerns. It’s one thing for publication bias and poor data to lead to a (temporarily) incorrect measure of the Hubble constant, and hence the age of the Universe. It’s an entirely different matter when a statistical fluke (encouraged by huge sums of money) engenders useless (or worse) medical treatment for millions of people. The only way to address this is by ever more careful and thorough application of the scientific method. (Obama’s Comparative Effectiveness Council, one of the many positive aspects of his new healthcare bill, is a good example of this.)
Lehrer’s article is a dramatic example of the problem he decries. The title and subtitle, and the first few pages, make it sound like there’s something profoundly and mysteriously wrong with the scientific method. Only far into the article do the obvious and rational explanations appear. Really, the article should be titled “Science works”, with a subtitle “The scientific method conquers all (eventually).” But that would be a lot less sexy, and my guess is that the New Yorker wouldn’t have published it. So there’ll be a bunch of people out there who misread or cherry-pick the article (Deepak Chopra: “Watch out, the truth is slipping away”), and end up convinced that the scientific method is broken. And they won’t vaccinate their children, and they’ll make important life decisions based on their horoscopes, and they’ll continue to believe that the world is magic. The scientific method is alive and well. The problem is a society that, to a surprising degree, doesn’t pay much attention to it. And this article is a brilliant example of how things go wrong.