Data Duplication in 25% of Cancer Biology Papers?

By Neuroskeptic | June 16, 2015 3:54 am

25% of papers published in cancer biology journals show signs of ‘data duplication’, which can indicate scientific error or even misconduct.

That’s according to a remarkable paper just published in Science and Engineering Ethics by a Norwegian cancer researcher, Morten P. Oksvold.

Oksvold writes that he randomly selected 40 recent original data papers from each of three cancer journals, for a total of 120 articles. The journals were chosen to represent one low, one middle, and one high impact factor (IF).

For each of the 120 selected papers, Oksvold laboriously searched for data duplication in the figures:

The articles were first screened systematically for duplication of data by manual examination of zoomed high-quality images. Figures were analyzed in Adobe Photoshop CS4 and all figures from individual articles were compared side-by-side to identify potential duplication of data.

Articles found to contain duplications were subjected to a more thorough analysis of supplementary figures (if any). Further, the five latest original publications listing the first author (of the article containing duplication of data) as author (if any) were identified and examined.
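As a rough illustration of what this kind of side-by-side screening involves, here is a minimal automated analogue using a perceptual "average hash" to flag near-identical image tiles. This is only a sketch, not Oksvold's actual workflow (his comparison was manual), and the function names and toy tiles below are hypothetical; a real pipeline would extract panels from figures with an imaging library first.

```python
# Sketch of duplicate-tile screening via average hashing (illustrative only).
# Each 8x8 grayscale tile is reduced to a 64-bit fingerprint; tiles whose
# fingerprints differ in only a few bits are flagged as likely duplicates.

def average_hash(tile):
    """64-bit average hash of an 8x8 grayscale tile (list of lists of ints)."""
    pixels = [p for row in tile for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        # Each bit records whether the pixel is brighter than the tile mean.
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits

def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")

def likely_duplicates(tile_a, tile_b, threshold=5):
    """Flag two tiles as near-identical if their hashes differ in few bits."""
    return hamming(average_hash(tile_a), average_hash(tile_b)) <= threshold

# Toy tiles: a brightness gradient, an exact copy, and a checkerboard.
gradient = [[r * 8 + c for c in range(8)] for r in range(8)]
copy = [row[:] for row in gradient]
checker = [[63 * ((r + c) % 2) for c in range(8)] for r in range(8)]

print(likely_duplicates(gradient, copy))     # True  (identical tiles)
print(likely_duplicates(gradient, checker))  # False (unrelated tiles)
```

Average hashing is deliberately crude; published screening tools (such as the automated system by Enrico Bucci mentioned below) use far more robust comparisons, but the principle of reducing image regions to comparable fingerprints is the same.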

Oksvold found evidence of duplication in 25% of the articles. The rates did not vary with the impact factor of the journal. What’s more, over half of these cases represented what he calls ‘category 2’ duplication, in which the same data is presented as representing two or more different conditions.

Here’s an example (from PubPeer, see below) of two separate apparent duplications. Within the same paper, the same microscope images are presented twice, with different captions. The red squares, which Oksvold added, highlight two identical areas. The yellow squares show another pair of duplicates.

[Figure: example of duplicated microscope images, with the duplicated areas boxed in red and yellow]

This is pretty worrying, although there might be innocent explanations for at least some of these duplications. Any given duplication could represent an innocent mistake of no scientific consequence, a more serious error, or deliberate fraud. Without further investigation, it’s impossible to know.

So clearly further investigation is needed. It’s deeply problematic, therefore, that when Oksvold sent his findings to the journals concerned, they completely ignored him.

He says that he sent “full documentation of the duplications” to the editorial offices of the three journals in October 2014 but “no editorial replies have been received so far (May 2015)”.

This is just indefensible. Oksvold’s allegations deserve to be taken seriously. Every journal has a responsibility to ensure, to the best of its ability, the scientific integrity of the work it publishes. These journals should have detected these cases of duplication at the review stage. One can forgive them for not spotting everything, but now that Oksvold has helped them out by tipping them off, they ought to take action (and thank him).

Oksvold also posted all of his findings on the scientific discussion forum PubPeer, in this thread (which now has 112 posts). PubPeer usually emails the authors of each paper as soon as someone posts a comment about it. Assuming PubPeer wrote to the authors of these papers, each of them would have been notified and invited to come on PubPeer and reply.

Yet, Oksvold says, out of the 29 cases of duplication he posted about, only two authors replied on PubPeer, both of whom claimed that the duplication was an honest error.

From PubPeer we also learn the identities of the three journals: they were the International Journal of Oncology, Oncogene, and Cancer Cell. The editors of these publications have some explaining to do, in my view – as do the authors concerned.

Oksvold, M. (2015). Incidence of Data Duplications in a Randomly Selected Pool of Life Science Publications. Science and Engineering Ethics. DOI: 10.1007/s11948-015-9668-7

  • Bill C

    Interesting. I read through the PubPeer thread. First time on there. I’m not in the field, so it was a bit hard to get a feel for which incidences were the more significant ones. Most of them seem to be duplications within papers rather than across papers, and most of the comments appear to be Oksvold presenting his data. Doing this as a peer-reviewed article after first putting it out on the PubPeer site is interesting and good, I think.
    Despite calls for the journals and original authors to respond, I wonder if the model has ever NOT been that ignoring criticism is the best strategy until forced? So calls for a reexamination of the original peer-reviewed literature are probably still left to future papers and those adventurous enough to write and publish them, while the best hope for the field is future improvements in study/publication quality control.
    Heh – maybe I can start my own journal, published by me, and write all my own articles about other people’s mistakes.

    • http://blogs.discovermagazine.com/neuroskeptic/ Neuroskeptic

      “Despite calls for the journals and original authors to respond, I wonder if the model has ever NOT been that ignoring criticism is the best strategy until forced?”

      I’m hoping that the pressure will build on the journals to respond now that Oksvold’s paper is published. If they don’t respond then it might be possible to make a complaint to COPE, as Oncogene and Cancer Cell are both listed as members on COPE’s website (I’m not sure about Int J Oncol).

  • gagz

    seems those with original ideas, rigorous methodology, open data, and generous documentation are the ones who aren’t welcome, and those who are willing to ‘take risks’ by recycling data & ideas are embraced.

    bizarro world

    • Amoral Atheist

      How do you mean?

  • Richard Van Noorden

    Re that 25% figure: it’s worth pointing out that EMBO Press, which employs an ‘image detective’ to rigorously check images in every accepted manuscript before publication, says it typically finds that 20% of papers have some kind of problem (most of which turn out to be innocent mistakes).

    • http://blogs.discovermagazine.com/neuroskeptic/ Neuroskeptic

      Good point. Oksvold also notes that

      “Recently, it was reported that an automatic screening method has been developed to detect irregularities in life science publications (Abbott 2013). The system, developed by Enrico Bucci, was used to document a number of duplications in life science publications by Italian researchers… Bucci estimated that, midway through the analysis, approximately 25% of the thousands of articles containing gels that he has analyzed so far potentially represent violations of the widely accepted guidelines on reproducing gel images.”


  • CL

    This seems to be a bit overinflated. The PubPeer thread indicates that a lot of the duplications are actually authors presenting the same western blot in two separate figures, sometimes noting that the data is presented twice, sometimes not. Oksvold claims that this is unacceptable practice, but seems to do the same in one of his own papers; see https://pubpeer.com/publications/17868870

    • http://blogs.discovermagazine.com/neuroskeptic/ Neuroskeptic

      Indeed, I saw that thread. But you’re talking about what Oksvold calls ‘category 1’ duplications, in which the same blot is duplicated but annotated the same way both times.

      However, half of the duplications Oksvold found were category 2, in which the same blot is presented as representing two different conditions.

      • CL

        ah, ok, so about 13% are what I would call severe errors. Pretty bad.



About Neuroskeptic

Neuroskeptic is a British neuroscientist who takes a skeptical look at his own field, and beyond. His blog offers a look at the latest developments in neuroscience, psychiatry and psychology through a critical lens.
