Blinded Analysis For Better Science?

By Neuroskeptic | November 3, 2015 8:40 am


In an interesting Nature comment piece titled “Blind analysis: Hide results to seek the truth”, Robert MacCoun and Saul Perlmutter say that “more fields should, like particle physics, adopt blind analysis to thwart bias”.

As they put it,

Decades ago, physicists including Richard Feynman noticed something worrying. New estimates of basic physical constants were often closer to published values than would be expected given standard errors of measurement.

They realized that researchers were more likely to ‘confirm’ past results than refute them — results that did not conform to their expectation were more often systematically discarded or revised.

To minimize this problem, teams of particle physicists and cosmologists developed methods of blind analysis: temporarily and judiciously removing data labels and altering data values…

Blind analysis ensures that all analytical decisions have been completed, and all programmes and procedures debugged, before relevant results are revealed to the experimenter.

One investigator – or a computer program – methodically perturbs data values, data labels or both, often with several alternative versions of perturbation.

The rest of the team then conducts as much analysis as possible ‘in the dark’. Before unblinding, investigators should agree that they are sufficiently confident of their analysis to publish whatever the result turns out to be, without further rounds of debugging or rethinking.
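To make the idea concrete, here is a minimal sketch of what such a blinding program might look like. It is not MacCoun and Perlmutter's procedure, just one illustration of it: the function names, the single hidden offset, and the neutral `group_N` codes are all my own assumptions. One person runs `blind_dataset` and keeps the key; everyone else works only with the blinded output.

```python
import random


def blind_dataset(values, labels, seed):
    """Return blinded values and labels, plus the secret key to undo the blinding.

    Hypothetical scheme: one hidden constant offset is added to every value,
    and real group names are swapped for neutral codes in a secret order.
    """
    rng = random.Random(seed)  # the seed stays with the blinding investigator
    offset = rng.uniform(-1.0, 1.0)

    groups = sorted(set(labels))
    codes = [f"group_{i}" for i in range(len(groups))]
    rng.shuffle(codes)
    label_map = dict(zip(groups, codes))

    blinded_values = [v + offset for v in values]
    blinded_labels = [label_map[name] for name in labels]
    key = {"offset": offset, "label_map": label_map}
    return blinded_values, blinded_labels, key


def unblind(blinded_values, blinded_labels, key):
    """Reverse the blinding once the analysis plan has been frozen."""
    inverse = {code: name for name, code in key["label_map"].items()}
    values = [v - key["offset"] for v in blinded_values]
    labels = [inverse[code] for code in blinded_labels]
    return values, labels
```

In this sketch the analysts can still see that two groups differ, but not which one is the treatment group, and a shared offset hides the absolute values. A real scheme would have to be tailored to the measurement in question.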

As a procedure, blind analysis has much in common with preregistration. Both involve the creation of a “Chinese wall” that prevents knowledge of the results from affecting decisions about the analysis. Both require a hypothesis of interest to be framed in advance. Both are intended to prevent p-hacking and other conscious and unconscious biases.

Blind analysis shares many of the same limitations as preregistration, too. MacCoun and Perlmutter discuss concerns such as “won’t people just peek at the raw data?”. This is analogous to a criticism commonly raised against preregistration, “won’t people just preregister retrospectively?” Both methods ultimately rest on trust.

MacCoun and Perlmutter discuss preregistration, but they say that blind analysis is better as it offers the advantage of flexibility: “preregistration requires that data-crunching plans are determined before analysis… but many analytical decisions (and computer programming bugs) cannot be anticipated.”

However, I think that preregistration and blind analysis could work together. Each brings important benefits.

For instance, preregistration ensures that negative results don’t just disappear unpublished. Indeed, with pre-peer review, it can help them get published, if a journal agrees to publish the paper on the strength of the methods, before the results (negative or otherwise) are collected. Blinded analysis, alone, doesn’t achieve that.

But blinded analysis could help to make preregistration more useful. A prespecified analysis plan could incorporate a blinded phase. So, rather than having to decide at the outset how (say) outliers will be treated, researchers could leave this question open, and then decide based on a blinded look at the final data. It would be the best of both worlds.
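As a rough sketch of that blinded phase, the outlier rule could be chosen from the pooled data with group labels hidden, and only then applied to the labelled data. Everything here is an assumption for illustration: the function names, the median-plus-or-minus-k-MAD rule, and the default `k=3.0` are not taken from the Nature piece.

```python
import statistics


def choose_outlier_bounds(pooled_values, k=3.0):
    """Pick exclusion bounds from pooled data, with no group labels in sight.

    Assumed rule: keep values within k median absolute deviations (MAD)
    of the pooled median.
    """
    med = statistics.median(pooled_values)
    mad = statistics.median([abs(v - med) for v in pooled_values])
    return med - k * mad, med + k * mad


def apply_bounds(values, labels, bounds):
    """After unblinding, apply the frozen bounds to the labelled data."""
    low, high = bounds
    return [(v, name) for v, name in zip(values, labels) if low <= v <= high]
```

The point of the design is that `choose_outlier_bounds` never receives the labels, so the choice of cutoff cannot be steered, consciously or not, towards the hoped-for group difference.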

  • Uncle Al

    Discovery inertia given peer pressure is one side of the coin. When data do not fall into line, is it insufficient technique or a new phenomenon? Big G, Newton’s constant of gravitation, is by far the worst characterized physical constant. Like beta-decay showing elapsed time-rate periodicity, could there be unsuspected confounded variables?

  • sonia

    When statisticians see the raw data they should be able to spot the difference between placebo vs treatment group from just looking at the data and whether it’ll be significant or not. Altering data may invoke the statistician to employ an inappropriate standardization & analysis method. It’s like altering data on cats to dogs and proving it’s a dog when it’s not. I’m not understanding the logic here. I believe raw data should be kept pure and not altered for the sake of eradicating human biases. Perhaps there should be stricter requirements to deposit raw data in a repository.

    • OWilson

      Wow! Somebody who gets it!

      The history of science is rife with false “interpretations” of data only to find that later higher technology analysis reveals the truth.

      But that only works when the actual data is truthful and preserved.

      With this current bunch in charge of the data (pleading the fifth, wiping the hard drives, “losing” their emails, and blaming YouTube for anything and everything), there is no guarantee that, ….well!

      Halley, Einstein were successful, because they could really rely on the data recorded by Tycho Brahe, in the case of Halley, and Newton, in the case of Einstein.

      These days who would base serious science on the work of a Muslim Outreach NASA, or a IPCC sex fiend, Pachauri?

  • practiCalfMRI

    I’m confused. The implication behind these well meaning suggestions seems to be that the data under consideration are good data. That is, they pass whatever litmus tests exist in that arena to ensure that they have a fighting statistical chance to answer the question(s) asked of them. Many times I’ve seen results that don’t “make sense” based on prior expectations, and then gone on to identify a systematic mistake in the acquisition. So I suppose my confusion surrounds the issue of systematically flawed data. Do we really want all the crappy experiments that people manage to perform written up and added to the corpus? Who’s going to review these flawed studies? As much as I want to see negative results published, I also want to see “genuine negative results” published and not simply flawed experiments published. It’s trivially easy to get a negative result based on a crappy experiment. How do these crappy experiments help us learn the truth?

    • D Samuel Schwarzkopf

      Funny you should mention this, as we’ve had a number of discussions about this very topic recently. I wrote two blog posts looking at it from two different angles (I will only link the first one lest I fall afoul of NS’s spam filter – but the second one comes right after this one):

      I believe all good science should be published, especially the negative results. But you should make damn sure that the negative results aren’t just because you screwed up (The same applies to positive results of course). So you need to have some proper quality assurance criteria. The more of these you can define a priori the better – but it’s not uncommon that you discover that something went wrong only afterwards. The problem is that you mustn’t confound your quality assurance criteria with the desired outcome.

      I still hold that it is easier to get negative results from screwing something up than positive results. But this can’t be your criterion as that is circular logic. You surely must scrutinise the results in the same way whatever you find.

    • Neuroskeptic

      But surely the best way to know whether a study is crappy is to have it published. After all, many times the authors will think a study is solid, but then after it’s published, someone else will spot a flaw. The original authors (or data acquirers) are not the final authority on data quality.

      I would say that even flawed data should be published. If the authors know it is flawed, they should say so – they shouldn’t try to publish it as if it were solid. But even flawed data could prove useful – if only for educational reasons (or as a way of validating artifact detection tools.)

  • D Samuel Schwarzkopf

    I like the concept of blinded analysis and I’ve used it several times in the past when I had to manually delineate regions of interest and these ROI definitions were likely to influence the outcome. The most sophisticated version was a script that removed subject IDs and randomised the data to be delineated. In other projects we simply split tasks between group members who didn’t know the other person’s data at the time. Of course neither of these fully removes any unconscious biases you may have at the summary stage. Ideally you would do blind analysis until the results are finalised but that’s quite a hassle to do. My personal alternative so far has been to work on ways that minimise the manual input to the analyses.

    Anyway, I don’t see any reason why blind analysis and preregistration shouldn’t be compatible. Surely you can even preregister your blind analysis plans?

    • Neuroskeptic

      Exactly, I think they are more than compatible – they are made for one another.

    • Neuroskeptic

      I once wanted to do a blinded analysis using some (vintage) electrophysiology software that insisted on displaying the subject’s name and ID on screen at all times.

      So I put some sticky tape on those parts of the screen.

      • D Samuel Schwarzkopf

        Sometimes the low-tech solution is the simplest 😉




About Neuroskeptic

Neuroskeptic is a British neuroscientist who takes a skeptical look at his own field, and beyond. His blog offers a look at the latest developments in neuroscience, psychiatry and psychology through a critical lens.

