Team of Rivals: Does Science Need “Adversarial Collaboration”?

By Neuroskeptic | January 28, 2015 6:15 pm

When scientists disagree about something, the two sides of the argument often come to form separate communities, with scientists collaborating with others on their “team” while avoiding working with their “opponents”. But is there a better way?

A paper just published today presents the results of an experiment that was conducted as an ‘adversarial collaboration’. This is where some researchers sit down with some members of the “other side” and agree upon a plan for a study to test the hypothesis in question.


In this case the hypothesis was that horizontal eye movements would boost the ability to remember words. Many, but not all, previous studies have reported an effect of horizontal eye movement on memory. There’s also a body of theory to explain it, but some skeptics are not convinced.

This paper has six authors, all of them Dutch psychologists: three (Matzke, van Rijn and Wagenmakers) were ‘skeptics’, and two (Nieuwenhuis and Slagter) were ‘proponents’ of the effect. The fifth author, van der Molen, pitched in as an adviser and an impartial referee, but he says the whole thing went so smoothly that he didn’t need to arbitrate.

The team agreed on a protocol, preregistered it (here), and then ran the study. Volunteers (students) were shown a list of words and later had to write down as many words as they could remember, with a pen and paper.

Immediately before the recall phase, volunteers were randomly asked to move their eyes either side to side (horizontal), or up and down (vertical), or to do nothing (no movement). The latter two conditions were controls, expected to have no effect on recall.

However, it turned out that horizontal eye movements offered no memory benefits. If anything, they made memory worse:

[Figure: matzke_recall]

In the discussion section, the skeptics and the proponents both got to comment separately. The skeptics argue that these negative findings are trustworthy, and they suggest that previous positive results (i.e. of an effect of eye movement on memory) may result from p-hacking.

The proponents counter this argument, using a p-curve analysis to argue that p-hacking can’t account for all of the positive results. They conclude by saying:

Considering the empirical results and the p-curve analysis reported here, did the present adversarial collaboration resolve the disagreement between the skeptics and the proponents? No; the skeptics are probably no less skeptical, and we, the proponents, are not convinced by a single failure to replicate, especially given the results of the p-curve analysis.

However, we have become more cautious about the conclusions that can be drawn from the studies reported so far, and will follow the further development of this field of research with a critical eye.

Both sides, however, praise the adversarial collaboration process, and recommend the method to others. It’s not a new idea; there have been advocates of adversarial collaboration for some time, but this paper is one of the few examples of a completed adversarial study.

But what did it achieve? In his summing-up, referee van der Molen says that

The adversarial collaboration could not settle the empirical debate conclusively: despite the highly diagnostic outcome of the experiment, the proponents are still convinced that the effect is real. In hindsight, this result was to be expected.

A single experiment, even when pre-registered and conducted in the framework of an adversarial collaboration, may not provide sufficient evidence to overturn an opinion that was shaped over the course of many years.

It’s unrealistic to expect any single paper to end a debate such as this one. We certainly shouldn’t regard this collaboration as having failed just because it didn’t produce unanimity.

I wonder if future adversarial collaborations could encourage the participants to specify, publicly, at the outset, what kind of evidence would make them change their mind. The goal then would be to design a study that would produce enough evidence to satisfy these preregistered ‘we admit defeat’ conditions.

Of course, even if the results did pass the threshold that they had previously stated would end the debate, researchers might still demand even more evidence. However, in this case, it would be obvious that they had moved the goalposts, because the original goalposts would be a matter of public record.

Matzke D, Nieuwenhuis S, van Rijn H, Slagter HA, van der Molen MW, & Wagenmakers EJ (2015). The effect of horizontal eye movements on free recall: a preregistered adversarial collaboration. Journal of Experimental Psychology: General, 144 (1) PMID: 25621378

  • D Samuel Schwarzkopf

    Adversarial collaborations are great. I personally believe this is the only good way you can really seek to replicate when you are skeptical of somebody else’s finding. In fact, the Devil’s Neuroscientist tells me she wanted to discuss this in her last blog post but that had already gotten so long (again) that she chose to delay this topic… 😛

    I first encountered such a collaboration regarding this paper:
    I think this is the way scientific disagreements should be resolved. If you find a problem with some finding, directly contact the authors first of all and discuss it with them. If they ignore or block you out, you by all means write your own paper without them if you must but at least try to solve this problem amicably first. I’m sure a lot of egos can get in the way but I’d like to believe that most people would actually be open to have a serious collaboration if both parties are genuinely interested in the truth.
    Of course, one thing that I think we need to change in the way we do science (and science publishing) is that whether an effect “is real”, whether it replicates, or whether people made an error should not be used to diminish these researchers. Explicitly or implicitly, I think this is currently very much what is happening in many cases, and that is at the core of all the bad blood this often causes.

    One last thing about the outcome of this particular case (and several other adversolaborations I have seen): it frequently doesn’t seem to resolve the issue for the authors. If the experiment fails to replicate proponents keep believing and skeptics become more skeptical. But a publication is not for the authors but for the scientific community and the wider readership can use the conflicting arguments to make up their own minds.

    • Neuroskeptic

      That’s a good point about the ‘eagle eyed autism paper’! I blogged about it at the time but I never thought of it as an adversarial collaboration, but you’re quite right it was one.

      • D Samuel Schwarzkopf

        It wasn’t quite framed as such but I heard about it from one of the authors and it seems to fit the definition.

      • D Samuel Schwarzkopf

        Also, I think I first learned about this study from your blog 😉

  • Sebastiaan Mathôt

    I remember that Sander Nieuwenhuis gave a talk about this at a Dutch conference (NVP) a few years back. EJ Wagenmakers was in the audience and said “I don’t believe this! I bet I can’t reproduce this!”. Or something along those lines. Interesting to see that they actually got together and worked on it. I must say that I’m a bit surprised about the outcome. My impression was always that EMDR-like things, while a bit mysterious, are pretty robust phenomena. (But then again, that’s what p-hacking will do for you.)

  • Mario de Jonge

    I think this study is something really special and I hope many more will follow. I do wonder though, what would have happened if the effect of horizontal eye movements had turned up? Would the skeptics have changed their minds? Or would they, like the proponents, also hold their original position?

    And even though the debate has not been settled, I did get the feeling that there is now more doubt about the effect, even for the proponents. So, that is surely worth something in terms of getting towards a point of settling the debate. It might also just need some more time to sink in.

    For future adversarial studies, I think it might also be a good idea to have an “impartial” lab conduct the experiment, to minimize the possible influence of experimenter bias. Experimenter bias was suggested in the general discussion as a possible explanation for the failure to replicate. I think this is an unlikely explanation, and I expect that both parties also agreed beforehand on who was going to be responsible for collecting the data. But still, just to rule out such post-hoc explanations, it would be a good idea.

    • Neuroskeptic

      Maybe there could be three experimental centres, 1. skeptics, 2. neutral, and 3. proponents.

      That way if one centre was biasing the results it would be obvious by comparing them against the other two!

      • Mario

        Or you could get a nice crossover interaction. If bias does play a role it might work both ways. If we can find someone to back this hypothesis, I will be a proponent for the null (sounds more positive than being a mean old skeptic of the alternatives…). And I am willing to stake my reputation on it too (don’t really have a rep yet, but perhaps I will get one if I am correct). We could do a three-way adversarial collaboration. Who’s in? Place your bets now. After the experimental roulette wheel starts spinning, it is “rien ne va plus”!

        • Neuroskeptic

          I agree that “skeptic” and “proponent” are not ideal terms. Maybe we could have “Team H0” and “Team H1”.

          Or “Blue” and “Red”!

  • ohwilleke

    The Large Hadron Collider has a somewhat similar methodology. Two independent teams (ATLAS and CMS) that in principle have access to each other’s work only when it is published, use the same piece of equipment to run similar, although not identical, experiments in parallel with a cast of thousands each. Thus, most findings are replicated (or not) by both teams. They aren’t true adversaries, but they are independent.


  • petrossa

    Aside from the subject in question, which to me seems a good thing, going by Dutch scientists is walking the minefield of bad education. In the 70’s the Dutch had this fantasy of education of the masses. For the obvious reason that no one is equal to another intellectually, in order to have a more than average success rate they had to lower the threshold.
    This started a downward spiral of educational levels; each time, exam results were adjusted upward in hindsight due to falling success rates.
    Furthermore it was decided in the 90’s to have women partake more in science, so positive discrimination was introduced to cause more women to graduate. Whilst I personally see no difference in intellectual capacities it did open the door for less qualified women to graduate.
    Add to that 30 years of importing illiterate immigrants and nowadays a Dutch university graduation isn’t worth the paper it’s written on, particularly in the soft sciences where results are hard to qualify.

    • Neuroskeptic

      Comment removed – it’s off topic.

      • petrossa

        and way too close to the truth for your comfort zone


About Neuroskeptic

Neuroskeptic is a British neuroscientist who takes a skeptical look at his own field, and beyond. His blog offers a look at the latest developments in neuroscience, psychiatry and psychology through a critical lens.
