Is Reproducibility Really Central to Science?

By Neuroskeptic | January 2, 2018 12:13 pm

In a new paper in the Journal of Experimental & Theoretical Artificial Intelligence, Chris Drummond takes aim at the ‘reproducibility movement’ which has lately risen to prominence in science.

As one of the early advocates for this movement, I was interested to see what Drummond had to say. While I don’t find his argument wholly convincing, he does raise some good points.

Drummond begins by summarizing the case for reproducible research as he sees it. The claim is that reproducibility – the ability of other scientists to exactly reproduce and confirm a given result – is central to science. It is further claimed that we can promote reproducibility by requiring authors to submit their data, and their analysis scripts (code), with each publication and that this will, amongst other benefits, help to prevent scientific fraud.

Against this, Drummond says that

(1) Reproducibility, at least in the form proposed, is not now, nor has it ever been, an essential part of science.
(2) The idea of a single well-defined scientific method resulting in an incremental, and cumulative, scientific process is, at the very best, moot.
(3) Requiring the submission of data and code will encourage a level of distrust among researchers and promote the acceptance of papers based on narrow technical criteria.
(4) Misconduct has always been part of science with surprisingly little consequence. The public’s distrust is likely more to [do] with the apparent variability of scientific conclusions.

To my mind the most interesting part of this paper falls under Drummond’s discussion of point (1), in which he argues that reproducibility is not very important to science, contrary to popular belief. Drummond defines ‘reproducibility’ as the ability to repeat an experiment as exactly as possible and get the same result: “The aim is to minimise the difference from the first experiment including its flaws, to produce independent verification of the result as reported.”

But, Drummond points out, scientists are generally not interested in experimental results for their own sake; rather, we use experimental results to test hypotheses. The best way to test a hypothesis is to carry out several different experiments, using different methodologies, to provide convergent evidence. Drummond says that what scientists are really interested in is the ‘retestability’ of a given hypothesis – not the reproducibility of a given piece of data.

Now, I’ve previously said that reproducibility is fundamental to science:

In my view, replicability is the essence of scientific truth. To say that a certain scientific result is true or valid, is nothing other than to say that someone, who correctly carries out the same methods, would be able to confirm it for themselves. Without the assumption of replicability, scientific papers would become merely historical documents – ‘we did so and so, and we observed so and so, but your mileage may vary.’

However, I actually think that Drummond and I have common ground. Certainly, I agree with Drummond that convergent evidence from multiple different methods is the strongest kind of support for a hypothesis. This is because any given piece of evidence may be misleading, even if it is reproducible – it might be the result of a reproducible artifact.

That said, I still think that reproducibility is fundamental. If we have multiple pieces of evidence for a hypothesis, but none of those pieces of evidence are reproducible, the hypothesis would have no support. Reproducibility of the primary evidence must be there first, before we can marshal the evidence to support our models. A model supported by lots of unreproducible evidence is a house built on sand.

So I agree with Drummond that reproducibility, alone, is not sufficient to make strong science (I’m not sure if anyone thinks it is), but I stand by my view that it is necessary.

  • http://www.eiko-fried.com Eiko Fried

    Iso-Ahola (2017) took a somewhat similar (and probably more pronounced) stance than Drummond, arguing that:

    1) prior psych findings cannot be all dismissed as flukes “because they were published in the best journals of social psychology”
    2) experiments can never disprove a theory
    3) “psychological phenomena, by their nature, are not fully reproducible”
    4) “nobody has provided a theoretically and logically rigorous rationale and justification why ego depletion as a phenomenon should and would not exist.” The burden of proof is on the skeptic.

    We published a brief letter that is a rebuttal to the arguments above, and also tackles some of Drummond’s points. I just learned a few days ago that this letter made it into the top 100 articles on the Open Science Framework in 2017, and hope it is relevant to the debate here.

    https://www.frontiersin.org/articles/10.3389/fpsyg.2017.01004/full

    • http://blogs.discovermagazine.com/neuroskeptic/ Neuroskeptic

      Thanks!

      I myself have always been guilty of using ‘replicability’ and ‘reproducibility’ interchangeably, which is why I did so here…

    • John Thompson

      There is some overlap of the terms.
      Many things can be both replicated and reproduced.
      It’s such a minimal difference that there seems to be a lack of easy and specific examples (at least on the internet).
      Perhaps you could provide us with an example of something in the real world that fits one of the two words but not the other, and then vice-versa?

      • http://www.eiko-fried.com Eiko Fried

        Hey John, I think it’s very important to clearly separate reproducibility and replicability. And even if X and Y would often co-occur in nature, such as higher values of neuroticism and female gender (the bivariate correlation in general population samples is about 0.5 to 0.6), that doesn’t justify confusing the concepts of neuroticism and gender. Weight and height are also highly correlated, but again, that doesn’t mean it’s not important to clearly separate them.

        For your question about examples: any study that provides code and data is reproducible. Reproducibility is independent of any outcome. Replicability is about a phenomenon or effect, about the substantive outcome of a study. A reproducible study doesn’t need to be replicable, and a replicable study doesn’t need to be reproducible.

  • Leonid Schneider

    Reproducibility is nice, but not really necessary for science. A piece of research may still be a breakthrough worthy of the finest funding even if it is not reproducible. Don’t take my word for it, read this 2017 statement by the German central national funding agency, DFG. http://www.dfg.de/service/presse/pressemitteilungen/2017/pressemitteilung_nr_13/

    • http://blogs.discovermagazine.com/neuroskeptic/ Neuroskeptic

      Oh dear…

    • Marshall Gill

      Thank you! FUNDING is what these people practice, not “science”. They do not care one whit about the accuracy of their claims, they just want the taxpayer gravy to continue to roll in…

      • DavidT

        Sadly, not only Big Science (CERN, for example) but also moderate science, is expensive, and whether a scientist works in academia, in industry, or in government, funds are necessary to keep labs functioning, to conduct experiments, or simply to pay those theoretical physicists to sit around thinking 😉 My experience over many years of sitting on NIH and NSF review panels is that scientists are strategic in trying to locate their work within major agendas, but the actual review of proposals focuses on the quality of science being proposed. In other words, we all care about the bucks rolling in — what walk of life functions free? — but that does not preclude a competition for funds that is based in large measure on quality within those parameters.

        David T, MD, PhD

        • Marshall Gill

          Some of the scientists you mention are not like the others. You have a PhD, can you not tell what that is? You are super, super smart but don’t know the meaning of the word “coerced”?

  • OWilson

    Without reproducibility, what exactly is science?

    • DavidT

      One approach suggests that “science” is the systematic observation and classification of nature. What we term reproducibility is one way to test those observations and systems of classification, but reproducibility is not in and of itself science. After all, much of the science conducted in the past is no longer “reproducible” — i.e. turns out to have been wrong — yet we still regard it as science.

      • Tom Aaron

        How do we know it is wrong if reproducibility is not at the essence of established science?

        • DavidT

          Tom: thanks for your reply. As I suggested, there is ‘doing science’ and there is ‘testing science.’ As your question implies, reproducibility/replicability is [are?] important for establishing validity, but a lot of science is good science even if it is ultimately wrong or non-reproducible. My PhD is in physiology, and it’s common for us to point to early anatomy and Linnean taxonomy as good examples of early modern science, though it was not ‘tested’; as systems of classification these early schemes were useful or not, but the issue of reproducibility was not an issue. I’m not a historian, obviously, but my impression is that issues of reproducibility/replication are fairly recent standards, and we’ve been doing “science” for a long time.

          Cordially,

          David T, MD, PhD

          • John Thompson

            How can it be good science if the results of experiments are not reproducible?
            That sounds much more like flawed science.
            I would replace the word “reproducible” with “confirm-able” and it makes much more sense why it is so important.
            Essentially, being able to reproduce a result from an experiment is confirming the claimed results as valid.
            You cannot just take people’s word on anything.
            Everyone has their own personal interests and motives to at minimum shade the results (whether they do or not is up to them).
            It is not uncommon that the way an error or blatant fraud is caught is that someone else repeats the experiment but gets different results.

          • DavidT

            John: I think your suggestion gets closer to my own view — when we think of observational sciences such as early anatomy that I mentioned, or early geology, the hallmark of good science was careful observation. Was it Boyle who revolutionized testing in science by insisting on independent observations, much like your suggestion of confirmation? I’m also tempted to note that, since virtually all science done up until a year ago has been overturned (I’m exaggerating to make a point), none of the science done for a couple of centuries has been “confirmed” and therefore was not good science. But my point was that it was good science in as much as it pursued careful observation and classification of nature, which could then lead to the formulation of hypotheses that could be tested and theories that could withstand testing (or replacement, if you’re a Kuhnian or if you like Lakatos…) There’s good science and bad science, valid science and wrong science: what makes these ‘science’ is different from the bad, the good, the valid or wrong.

            Dave

        • FSE

          Because ultimately science is more than a catalog of measurements.

          For instance, consider a study of the volume and pressure of a gas. The individual measurements are just a stepping stone to the final result: the two are inversely related, aka Boyle’s Law

          You can use Boyle’s Law without access to the gas and equipment Boyle used. You can refute Boyle’s Law without knowing any of his measurements. In short, it hardly matters whether you can reproduce Boyle’s experiment. What matters is the insight it provides.

      • http://www.mazepath.com/uncleal/qz4.htm Uncle Al

        Try passing organic chemistry using rigorous MO theory instead of jury-rigged LCAO. Minimize a structure in a week in a rack with Gaussian, or in five minutes in your PC with MM+. “Better” is not the perpetual enemy of “good enough” when you have 90 minutes for the final (unless you need the Woodward-Hoffmann rules).

        Phlogiston and Vital Force, not so much. Bullvalene? Yee haw!

    • http://www.mazepath.com/uncleal/qz4.htm Uncle Al

      Without reproducibility, science is diversity, Social Marxism, feminist dialectic, Lysenkoism…macroeconomics.

    • bezotch

      Without reproducibility or replicability it is difficult to distinguish between results that provide true insights into how the universe works (information) and results that are mere statistical anomalies, spurious correlations, or flaws in the design or execution of the experiment (misinformation).
      Science is not only the process of increasing the information on how the universe works, it is also the process of decreasing the misinformation. A discipline that is unable or disinclined to identify and discard misinformation will not be self correcting and is therefore at best a philosophy, at worst a pseudo-science.

  • http://www.mazepath.com/uncleal/qz4.htm Uncle Al

    “Reproducibility, at least in the form proposed, is not now, nor has it ever been, an essential part of science.” World 2017 polyethylene resin production was 100 million metric tonnes. Suppose it is 1% non-reproducible in 2018 – and we pump that million tonnes of crud into your bedroom window…after the thalidomide.

    en(.)wikipedia(.)org/wiki/Therac-25
    pbs(.)twimg(.)com/media/CZ0YMeoWkAAcjXh.jpg
    www(.)lhup(.)edu/~dsimanek/museum/unwork.htm

  • Marshall Gill

    “The public’s distrust is likely more to [do] with the apparent variability of scientific conclusions.”

    No, the public’s distrust is because social justice warrior bureaucrats posing as “scientists” lie on a regular basis. Drummond is simply another one, wanting to make sure he gets to continue to steal from the taxpayers.

    Much of “science” today is nothing more than bureaucrats receiving wealth transfers to play with technology. “Climate scientists” who do not apply the scientific method come immediately to mind.

    Of course these self-named “scientists” do not care if their results are reproducible. Especially in those cases where the “science” is made up, which occurs regularly.

    • Warren

      Since thousands of climate scientists, working in every country of the industrialized world, affirm AGW through peer reviewed research, your post seems to have no objective foundation other than opinion.

      • Gary Hoffman

        If I were in their shoes, I also would vote to keep that gravy train coming (and load more gravy please).

        • Warren

          Scientists doing research are not paid well. Your post is nonsense. If Apple Corp thought your ideas made any sense, they would have ignored their scientists and never developed iPhones or computers. I expect if any Apple engineer had expressed your viewpoint, he would have been frog-marched out the door long ago.

          • John Thompson

            That you think they would expressly say it is the flaw in your thinking.
            “Even honest people lie about lying.”
            It’s exactly why being able to reproduce experimental results is so key – reproducibility is a check and balance on claims.
            It takes away the influence that things like money, job security or prestige have on all people.
            I’ll wade into the topic at hand on this thread.
            We can’t claim that all the scientists not finding man made global warming who are employed by businesses are biased, but then give a pass to other scientists employed by Universities or Environmental organizations to find man made climate change.
            Exactly how many climate scientists would exist if they claimed no climate changes?
            The point is that reproducible experimental results help a great deal when talking about issues that are so controversial and where there are many biases on all sides.

          • Warren

            Your ignorance of Science is astounding. Read any of the peer reviewed science? Start here: https://royalsociety.org/topics-policy/projects/climate-change-evidence-causes/

      • Marshall Gill

        I think that the talking point which you repeated here claims “tens of thousands of scientists”. You also failed to say 97%, but I will grant you a gentleman’s B-.

        • Warren

          Don’t grant me anything, just look at the facts. Every institution of science in the world, 100%, affirms AGW, without exception. 80 Science Academies and 190 Scientific Professional Societies. Not one disputes the evidence or the conclusion that Man’s activities are warming the planet, dangerously so. There’s no scientific basis and no evidence for any claim to the contrary.

          • Marshall Gill

            You deserve to have the talking points that you repeat graded. Pro Tip: Repeating other people’s talking points is not the same as arguing.

            LMFAO 100%? Not just 99% or 98% but 100%? There isn’t a single scientist who doesn’t believe in AGW? Not one? Your appeal to authority did not produce the expected result so you double down and REALLY REALLY appeal to authority? Truly pitiful.

            You should suggest that every single person on the planet says it’s true. No, you should suggest that every living thing in the Universe says it’s true. REALLY, REALLY, REALLY appeal to authority, that should do the trick! I am sure that you live like the Unabomber as a result? No global warming electricity for you?! Or climate changing automobiles either! Right? You live what you claim here? Yeah, didn’t think so.

          • Warren

            You’re not reading. Not 100% of scientists, 100% of the world’s scientific institutions. Science Academies and Professional Societies. No exceptions. Go to any of those institutions’ websites and search for their formal statements or reports. Every one affirms AGW. No exceptions.

          • Marshall Gill

            Sure they do. And satellite temperatures don’t count because something something something heat hiding in the deep ocean?

            The last three years have been the “hottest on record” heavily adjusted, but with scientific accuracy? They had to adjust them all up because they know it was hotter than the actual measurements even show?

    • Neurosiscientist

      Oh I know that guy! Marshall Gill believes in home schooling. He lives in a world that is so simple that he can teach to his children everything that they need to know. Like in the middle ages.
      And we are discussing modern science with him…

      • Marshall Gill

        LOL Sure, I create all of the curriculum for my children myself. Do I have to actually make the paper they use too? Manufacture the pencils to write “I, Pencil” when I had them read that? Are you really so ignorant about home schooling? Of course you are not, you are simply someone who fellates authority and are threatened by those of us who do not.

        You discussed nothing, only showed your own personal ignorance. Pathetic, really.

  • sherrick13

    LOL, they can’t reproduce global warming, so they are trying to redefine science so they can push even more socialism on us and call it science.

  • Alan_McIntire

    “It doesn’t matter how beautiful your theory is, it doesn’t matter how smart you are. If it doesn’t agree with experiment, it’s wrong. ”
    Richard P. Feynman

  • RiffleDragon

    It seems that Drummond wants to keep machine learning code as a trade secret rather than share it to allow reproducible research. Machine learning algorithms may provide a distinct competitive advantage for a business, so why would the dev team even care if they are scientific about it? I wouldn’t care to spend cash to read his paper to find out. The abstract reveals faulty argumentation… in the guise of a minority report.

    Repeated testing provides us with … a measure of reliability … of measured results, under those conditions. Determining the Reliability of the hypothetical case is the purpose of scientific experimentation, in general. Reliability testing or quality testing is essential in engineering, especially software engineering. Drummond is identifying machine learning as a scientific field, but I would characterize it as engineering science, rather than natural or social sciences which involve living organisms.

    I would think that since the objective of machine learning is to process specific inputs in an orderly manner to produce an acceptable output… then assuring the reliability of that processing would be paramount. If his organization has the resources to keep the testing and code a trade secret, perhaps that code will be licensed for use for the rest of us… if the open source people don’t get there first by sharing…

    I wholeheartedly agree with Feynman as shared by Alan McIntire regarding proper experimentation. And I am saddened by Leonid Schneider’s post regarding the way pursestrings and politicians have dominant influence over experimental design.

    Hoping the furor over lack of reproducibility and the resulting unreliable ideas can be overcome by a new generation of scientists and engineers who are not swayed by dogma or dollars…

  • Tom Aaron

    General discussions on science are fine. However, fortunately, each discipline governs itself. I really don’t give a hoot about the baloney research in psychology, climatology, sociology. Medicine is important but I know nothing of medicine. I DO KNOW my own niche area of geology…the few who do research in it (about 7 in the world) are quite vigilant and we expect high standards from each other. We build upon and add to the credibility of the peer reviewed published research that came before us.

    One would think that the baloney sciences would strive even harder to maintain reproducibility standards. Instead they are too often culturally or ideologically driven. Results are predictably politically correct flavours of the moment.


  • http://arturotozzi.webnode.it/ Arturo Tozzi cns

    This discussion is taken in full from the neopositivist distinction between Wittgenstein’s “verification” and Carnap’s “confirmability” of a meaningful sentence. Science needs to talk about something novel, not to chat about second-hand stuff. Therefore, please, scientists, say something novel, instead of changing a few words from old, venerable Authors and pretending to be original. There is always somebody not ignorant enough to not discard you.


Neuroskeptic

No brain. No gain.

About Neuroskeptic

Neuroskeptic is a British neuroscientist who takes a skeptical look at his own field, and beyond. His blog offers a look at the latest developments in neuroscience, psychiatry and psychology through a critical lens.
