About a quarter of papers sampled from cancer biology journals contain signs of ‘data duplication’, which can indicate scientific error or even misconduct.
That’s according to a remarkable paper just published in Science and Engineering Ethics by a Norwegian cancer researcher, Morten P. Oksvold.
Oksvold writes that he randomly selected 40 recent original data papers from each of three cancer journals, for a total of 120 articles. The journals were chosen to represent one low, one middle, and one high impact factor (IF).
For each of the 120 selected papers, Oksvold laboriously searched for data duplication in the figures:
The articles were first screened systematically for duplication of data by manual examination of zoomed high-quality images. Figures were analyzed in Adobe Photoshop CS4 and all figures from individual articles were compared side-by-side to identify potential duplication of data.
Articles found to contain duplications were subjected to a more thorough analysis of supplementary figures (if any). Further, the five latest original publications listing the first author (of the article containing duplication of data) as author (if any) were identified and examined.
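Oksvold’s screening was manual, but the underlying idea (look for a region of one panel that reappears in another) can be illustrated in code. The following is only a minimal sketch, not the method used in the paper: it assumes grayscale panels already loaded as NumPy arrays, and the function names `normalized_correlation` and `find_patch`, the 0.98 threshold, and the synthetic test panels are all hypothetical choices made for illustration.

```python
import numpy as np

def normalized_correlation(a, b):
    """Pearson correlation between two equally sized grayscale patches."""
    a = a.astype(float).ravel()
    b = b.astype(float).ravel()
    a -= a.mean()
    b -= b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0:
        return 0.0  # flat patches carry no texture to compare
    return float(a @ b / denom)

def find_patch(patch, image, threshold=0.98):
    """Slide `patch` over `image` and return (row, col, score) for the
    best-matching position scoring at least `threshold`, or None."""
    ph, pw = patch.shape
    ih, iw = image.shape
    best = None
    for r in range(ih - ph + 1):
        for c in range(iw - pw + 1):
            score = normalized_correlation(patch, image[r:r + ph, c:c + pw])
            if score >= threshold and (best is None or score > best[2]):
                best = (r, c, score)
    return best

# Synthetic stand-ins for two figure panels (assumption: real panels would
# be loaded from image files and converted to grayscale first).
rng = np.random.default_rng(0)
panel_a = rng.integers(0, 256, size=(40, 40))
panel_b = rng.integers(0, 256, size=(40, 40))
panel_c = rng.integers(0, 256, size=(40, 40))

# Plant a copy of a 12x12 region of panel_a inside panel_b.
panel_b[10:22, 5:17] = panel_a[20:32, 15:27]

match = find_patch(panel_a[20:32, 15:27], panel_b)     # planted copy
no_match = find_patch(panel_a[20:32, 15:27], panel_c)  # unrelated panel
print(match, no_match)
```

A brute-force search like this only catches copies at the same scale and orientation. Rotated, flipped, or rescaled duplicates would need extra handling, which is one reason careful manual inspection remains valuable.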
Oksvold found evidence of duplication in 25% of the articles. The rates did not vary with the impact factor of the journal. What’s more, over half of these cases represented what he calls ‘category 2’ duplication, in which the same data is presented as representing two or more different conditions.
Here’s an example (from PubPeer, see below) of two separate apparent duplications. Within the same paper, the same microscope images are presented twice, with different captions. The red squares, which Oksvold added, highlight two identical areas. The yellow squares show another pair of duplicates.
This is pretty worrying, although there might be innocent explanations for at least some of these duplications. Any given duplication could represent an innocent mistake of no scientific consequence, a more serious error, or deliberate fraud. Without further investigation, it’s impossible to know.
So clearly further investigation is needed. It’s deeply problematic, therefore, that when Oksvold sent his findings to the journals concerned, they completely ignored him.
He says that he sent “full documentation of the duplications” to the editorial offices of the three journals in October 2014 but “no editorial replies have been received so far (May 2015)”.
This is just indefensible. Oksvold’s allegations deserve to be taken seriously. Every journal has a responsibility to ensure the scientific integrity of the work it publishes, to the best of its ability. These journals should have detected these cases of duplication at the review stage. One can forgive them for not spotting everything, but now that Oksvold has helped them out by tipping them off, they ought to take action (and thank him).
Oksvold also posted all of his findings on the scientific discussion forum PubPeer, in this thread (which now has 112 posts). PubPeer usually emails the authors of each paper as soon as someone posts a comment about it. Assuming it did so for these papers, each of the authors would have been notified and invited to reply on PubPeer.
Yet, Oksvold says, out of the 29 cases of duplication he posted about, only two authors replied on PubPeer, both of whom claimed that the duplication was honest error.
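To get a sense of how precisely a sample of 120 papers pins down the duplication rate, one can compute a confidence interval. The sketch below is an illustration rather than anything from the paper: it assumes that the 29 cases posted on PubPeer correspond to 29 of the 120 sampled articles, and it uses the standard Wilson score interval, with the helper name `wilson_interval` chosen here for illustration.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

# Assumption: 29 of the 120 sampled articles contained duplications.
rate = 29 / 120
low, high = wilson_interval(29, 120)
print(f"observed rate: {rate:.1%}, 95% interval: {low:.1%} to {high:.1%}")
```

Under that assumption the observed rate is about 24%, with an interval of roughly 17% to 33%, so the headline figure of 25% is an estimate from a modest sample rather than an exact value.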
From PubPeer we also learn the identities of the three journals: they were the International Journal of Oncology, Oncogene, and Cancer Cell. The editors of these publications have some explaining to do, in my view – as do the authors concerned.
Oksvold, M. (2015). Incidence of Data Duplications in a Randomly Selected Pool of Life Science Publications. Science and Engineering Ethics. DOI: 10.1007/s11948-015-9668-7