The scientific method is alive and well

By Daniel Holz | March 9, 2011 11:45 am

I’ve been on somewhat of an unintended hiatus for the past few months, as I try to wrap up some projects, and deal with a few other things in my life. However, I just read something that has given me a kick in the pants. And I don’t mean that in a good way. In late December there was an article by Jonah Lehrer in the New Yorker titled “The truth wears off”. Much more suggestive was the subtitle, “Is there something wrong with the scientific method?”. The story discusses the “decline effect”: an article is published with startling results, and then subsequent work finds increasingly diminished evidence for the initial unexpected result. It’s as if there’s “cosmic habituation”, with the Universe conspiring to make a surprising result go away with time. The last paragraph sums things up:

The decline effect is troubling because it reminds us how difficult it is to prove anything. We like to pretend that our experiments define the truth for us. But that’s often not the case. Just because an idea is true doesn’t mean it can be proved. And just because an idea can be proved doesn’t mean it’s true. When the experiments are done, we still have to choose what to believe.

I don’t particularly disagree with any of this. But it’s completely beside the point, and to untutored ears it can be immensely misleading. The article is a perfect example of precisely the effect it seeks to describe (there must be a catchy word for this; intellectual onomatopoeia?). The article gives a few examples of people finding interesting results, only to have them disappear under sustained scrutiny. It makes it sound like there is an epidemic of declining confidence:

One of the first demonstrations of this mysterious phenomenon came in the early nineteen-thirties. Joseph Banks Rhine, a psychologist at Duke, had developed an interest in the possibility of extrasensory perception, or E.S.P. Rhine devised an experiment featuring Zener cards, a special deck of twenty-five cards printed with one of five different symbols: a card was drawn from the deck and the subject was asked to guess the symbol. Most of Rhine’s subjects guessed about twenty per cent of the cards correctly, as you’d expect, but an undergraduate named Adam Linzmayer averaged nearly fifty per cent during his initial sessions, and pulled off several uncanny streaks, such as guessing nine cards in a row. The odds of this happening by chance are about one in two million. Linzmayer did it three times.

Rhine documented these stunning results in his notebook and prepared several papers for publication. But then, just as he began to believe in the possibility of extrasensory perception, the student lost his spooky talent. Between 1931 and 1933, Linzmayer guessed at the identity of another several thousand cards, but his success rate was now barely above chance. Rhine was forced to conclude that the student’s “extra-sensory perception ability has gone through a marked decline.”

This all sounds quite impressive. I don’t know the details of how many cards he was going through, but it sounds like it’s easily thousands. I calculate the odds of a 9-card streak as roughly a tenth of a percent if you go through a couple of thousand cards. This is much more likely than 1 in 2 million (which is relevant only if you look at exactly 9 cards, one time). No doubt getting 9 in a row three times over a period of a few weeks (or even years) would be a large statistical anomaly. But it’s a long way from something I would issue a press release about. Carl Sagan summed it up best: “Extraordinary claims require extraordinary evidence”. If you’re going to claim some “extra-sensory perception” that would require a new physical force, and fundamentally alter all of modern physics, you might need more than a one-time statistical fluke. How about a whole series of controlled, double-blind experiments? Lo and behold, when this is done, the effects vanish. But by then the original results are published, and the damage is done. We’re still talking about this one “experiment” 80 years later. But if we integrate over all the equivalent subsequent experiments, there’s no doubt that the effect regressed to the mean and can be ignored. So how is this even remotely interesting?
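
A quick sanity check of those odds: with five symbols, each guess is right with probability 1/5, so nine in a row at a single sitting is (1/5)⁹ ≈ 1 in 2 million. But the chance of such a streak appearing *somewhere* in a long session is far larger. Here is a minimal sketch of the calculation (the 2,000-guess figure is an assumption on my part, since the article doesn’t give the exact count), tracking the length of the current run of correct guesses:

```python
def prob_streak(n_trials, streak_len, p):
    """Exact probability of seeing at least one run of `streak_len`
    successes in `n_trials` independent trials with success prob `p`."""
    # state[k] = probability that the current run of successes has length k
    state = [0.0] * streak_len
    state[0] = 1.0
    hit = 0.0  # accumulated probability of having completed the streak
    for _ in range(n_trials):
        new = [0.0] * streak_len
        for k, prob in enumerate(state):
            if prob == 0.0:
                continue
            new[0] += prob * (1 - p)      # wrong guess resets the run
            if k + 1 == streak_len:
                hit += prob * p           # correct guess completes the streak
            else:
                new[k + 1] += prob * p    # correct guess extends the run
        state = new
    return hit

# Nine cards, one sitting: (1/5)^9, about one in two million.
print(prob_streak(9, 9, 0.2))
# Somewhere in ~2,000 guesses: on the order of a tenth of a percent.
print(prob_streak(2000, 9, 0.2))
```

The streak that sounds like a one-in-two-million miracle becomes roughly a thousand times more likely once you account for how many chances it had to occur.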

It takes Lehrer six pages to finally get around to the topic of publication bias. Suppose you do an experiment and find a sensational, Earth-shattering result. Human nature being what it is, you’re likely to try to publish it (and journals like Nature are likely to publicize it). Fads happen all the time in science. It’s a human activity after all. And then you (and the rest of the community) do a lot more work, and if it’s a statistical fluke, or poorly analyzed data, or a poorly conceived or biased experiment, the result will fade into oblivion. The “decline effect” that this article is making a fuss about is precisely the process by which the scientific method works. The truth will out.
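
Publication bias alone is enough to manufacture a “decline effect”. Here is a hypothetical toy simulation of my own (the numbers and function names are mine, not anything from the article): many labs measure an effect whose true size is zero, only the most sensational estimate gets “published”, and then an independent team replicates it:

```python
import random

def decline_effect_demo(n_labs=1000, n_samples=100, seed=42):
    """Toy model: every lab measures the same TRUE effect of zero, with noise."""
    rng = random.Random(seed)

    def run_experiment():
        # One lab's estimate: the mean of n_samples noisy measurements
        # of an effect whose true value is exactly zero.
        return sum(rng.gauss(0.0, 1.0) for _ in range(n_samples)) / n_samples

    estimates = [run_experiment() for _ in range(n_labs)]
    published = max(estimates)       # only the most striking result makes the journals
    replication = run_experiment()   # an independent follow-up study
    return published, replication

pub, rep = decline_effect_demo()
print(f"published effect: {pub:.3f}, replication: {rep:.3f}")
```

The “published” value is a several-sigma fluke by construction, and the follow-up regresses toward the true value of zero. Nothing in the universe declined; the selection did all the work.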

On the other hand, suppose you do an experiment and find the result you (and everyone else) would expect. For example, you drop a ball and, indeed, it falls to the floor, exactly in accordance with our theory of gravity. You’re unlikely to write up the results. You’re even less likely to be able to get them published. And you’re certainly not going to spawn a whole bunch of follow-up experiments trying to duplicate your “null” results. So there’s no “incline effect”. This is not a surprise. It’s not a sign that science is broken. It’s a sign that we try to be selective and efficient in our experiments.

None of this is to say that there aren’t legitimate concerns. It’s one thing for publication bias and poor data to lead to a (temporarily) incorrect measure of the Hubble constant, and hence the age of the Universe. It’s an entirely different matter when a statistical fluke (encouraged by huge sums of money) engenders useless (or worse) medical treatment for millions of people. The only way to address this is by ever more careful and thorough application of the scientific method. (Obama’s Comparative Effectiveness Council, one of the many positive aspects of his new healthcare bill, is a good example of this.)

Lehrer’s article is a dramatic example of the problem he decries. The title and subtitle, and the first few pages, make it sound like there’s something profoundly and mysteriously wrong with the scientific method. Far into the article the obvious and rational explanations appear. Really, the article should be titled “Science works”, with a subtitle “The scientific method conquers all (eventually).” But that would be a lot less sexy, and my guess is that the New Yorker wouldn’t have published it. So there’ll be a bunch of people out there who misread or cherry-pick the article (Deepak Chopra: “Watch out, the truth is slipping away”), and end up convinced that the scientific method is broken. And they won’t vaccinate their children, and they’ll make important life decisions based on their horoscopes, and they’ll continue to believe that the world is magic. The scientific method is healthy and well. The problem is a society that, to a surprising degree, doesn’t pay much attention to it. And this article is a brilliant example of how things go wrong.

  • Ellipsis

    Not to be a curmudgeon (although I am), but there is certainly a “decline effect” when it comes to the quality of science journalism in major newspapers and magazines over the past two decades. There needs to be some way to credit journalists for writing things that stand the test of time, as well as discrediting them (seriously, in a financial/employment sense) for writing reams about garbage (e.g. what has now become the classic example of LHC producing black holes that will eat the world) that merely gets headlines and has no connection with reality.

  • Joseph J Veverka

    If you’ve heard of the 20/80 rule, where 80 percent of profits come from 20 percent of the workforce, then there is a 2/98 rule in science. Only two percent are actually doing the scientific method, while 98 percent are looking for dark matter, dark energy, gravity waves, and/or supergravity, thinking there’s got to be a Nobel Prize out there somewhere. But the Nobel is constantly given to the 2 percent who work at things that really matter to the fundamental scientific method.

  • The Cosmist

    On the issue of statistical anomalies, I can tell you from my experience playing online poker for years that even very rational human minds have a hard time dealing with randomness. The apparent non-randomness of random events can be so bizarre, and our brains are so good at finding patterns, that it’s almost impossible not to start believing that something spooky is going on. This bug in our programming explains why superstition and pseudo-science are so widespread and persistent. As always, the limiting factor in the scientific enterprise is human beings themselves, which suggests to me that the way forward is to move beyond this bug-ridden substrate called Homo sapiens ASAP!

  • Bee

    Physicists publish null and negative results all the time. They just call them ‘constraints’ on some more or less plausible modification of established theories.

  • Simon

    There was some discussion of this over at Jerry Coyne’s blog in December, and at Andrew Gelman’s blog. I think these guys both had some interesting thoughts on this article.

    My own point is that the general readership of this article needs to realise there’s a huge difference between papers published at the ‘cutting edge’ of science and solidly established results that are found in repeated experiments, re-analysed by many teams, etc.

    There’s a lovely paper by John Ioannidis, “Why Most Published Research Findings Are False”, which I think more people should read. The conclusion (in the title) is virtually inevitable given noisy data and the number of hypotheses and tests being examined around the globe. But it doesn’t mean the method is no good. It just means new research papers should be only the start of the process – to be followed by critical analysis and replication by independent (or differently biased!) peers.

    The news media should therefore hold off publishing stories based on a single new result in a single paper, or risk publicising wrong results almost all the time. (Trouble is, the incentives within the news media world are such that this is not much of a concern.)

  • sievemaria lucianus

    I had an experience one time when working with two people. I wrongly called the girl by the name “Heather”. I was embarrassed and said, “I don’t know why I said that, I don’t even know anyone by that name,” and she said, “My sister’s name is Heather, and I was just thinking of her.” Then the fellow said, “Whoa – I am thinking of a number, what is it?” and according to him I got it right 3 times. But who is to say – if she really had a sister named Heather, and was thinking of her, and the numbers…. impossible. For certain I could not do it again.

  • sievemaria lucianus

    “… Efforts at control are spurious, desperate and doomed ….”

  • Scott B

    I think the problem is being underestimated here. It’s not proper to say the scientific method is broken, but I think it’s fair to say that the peer-review processes, publication requirements, and especially the reporting of scientific “discoveries” need to change. This may be the way science works, but with the internet people have access to far more news daily. If they keep hearing multiple findings such as “X is healthy/unhealthy for you” or “arctic ice will be gone by 2010” or whatever else, and then the results are overturned later, people’s confidence in real findings with a solid basis will be reduced. The general public is not going to read multiple papers and come to an understanding of why a previous finding ended up being proven wrong, or realize the person that wrote the article hyped up the findings beyond what the paper supported. They are just going to think scientists don’t know as much as many would like them to believe.

    I think scientists need to be far more critical of reports on their work and others’. Not all publicity is good publicity. Also, we somehow need to move away from publishing in journals that require money to access the papers. The general public may not go through the papers, but they’ll read the opinions of the many people who would be willing to. If everyone had access to the papers, and preferably the data sets that went into whatever conclusions the papers reached, blogs and other outlets could counteract some of the problems caused by inaccurate reporting.

  • Bob Kirshner

    Thanks for this thoughtful post. I had the same creepy feeling when I was reading this article. When you find a result, and further investigation makes it weaker, that’s telling you something you don’t want to hear, but, honestly, Nature does not care how you feel.

    On the other hand, when we first published results on cosmic acceleration based on supernovae in 1998, the data were slim, but, we thought, adequate to support the claim. After a decade, the samples have gotten bigger, the systematic errors have been tracked down and made smaller, and there’s even a prediction that’s checked out OK (the cosmic jerk!). This is surely the signature of something real!

  • Curious Wavefunction

    -So there’ll be a bunch of people out there who misread or cherry-pick the article (Deepak Chopra)

    If I had a penny for every time that Deepak Chopra pounced on an article and cherry-picked and distorted it to his own misguided ends, I would be as rich as him.

  • AnotherSean

    Nothing is wrong with the scientific method, insofar as such a method can be defined. The problem is with some of its users. Does science conquer all, eventually? That’s a tremendous extrapolation, one I don’t think we can answer with any degree of confidence.

  • daniel

    @Bob (#10): “Nature does not care how you feel.” That just about sums up the life of a scientist. The trick is not to take it personally.

  • réalta fuar

    How many comments on a science blog by Bob Kirshner equals one (probably wrong) astronomy publication in Nature? Like many things, it depends on who is counting. For me, it would be one, for a tenure committee probably a LOT more than that.

  • costanza

    This “decline effect” is well documented in the bio/medical literature, tho’ there it’s referred to as the “Proteus Phenomenon”.

  • Jonathan

    I think it’s “Lo and behold”, not “Low and behold”.

  • daniel

    @Jonathan (#18): I had corrected this earlier, but it was autosaved, and never went through. Hopefully this time it will take.

  • Karel Rei

    You do sound like a bit of an optimist. We need some kind of evidence that these distortions do in fact go out of the literature with time. Often they do not. They get into the textbooks, and all sorts of things are called “fact” that really are not. A second-level critique of statistics really IS necessary.

  • Dave

    @Joseph J Veverka (#2): Please don’t be ignorant. Dark matter, dark energy, and gravity waves are all well-motivated and scientifically reasonable hypotheses which have not yet been refuted. People are certainly not hunting Nobel Prizes – they are trying to understand the fundamental physics of our universe.

Cosmic Variance

Random samplings from a universe of ideas.
