A paper on the genetics of the Roma (“Gypsies”), Reconstructing Roma History from Genome-Wide Data, has finally come out in a journal. It’s been on arXiv for a while, so there’s nothing too surprising in it. But reading through the paper, one aspect struck me clearly: the crispness and detail with which the authors outlined their methods and integrated them into the results section. Unfortunately, the pressure to publish produces an obvious tendency for people to use methods and tools (which usually consist of software written by others and used in a black-box fashion) in a slapdash manner, with an aim toward arriving at a publishable unit. Because of the specialization within science, it seems one can make it entirely through peer review using methods that signal one does not really know what one is talking about. To give a concrete example, a year ago I was told about a phylogenetic package in moderate usage which seems to be basically a “random number generator.” That this package is used at all is a testament to the fact that many researchers who are not phylogeneticists simply reach for the nearest method at hand and trust the results if they make some intuitive sense (presumably in this case they would simply report the results which were intelligible).
The future I’m hoping for is one of open data, open code, and open methods. When a shady or sketchy paper makes it through peer review, visible public anger now bubbles out of the scientific community, but the process of reproducing the results can still be tedious (see the arsenic-life affair). This is less true in cases where the means are more computational. The only things stopping the process of science from operating more efficiently are human barriers (e.g., cultural norms and institutional resistance to data release).