Method well articulated makes for good science

By Razib Khan | March 18, 2013 10:13 am

A paper on the genetics of the Roma (“Gypsies”), Reconstructing Roma History from Genome-Wide Data, has finally come out in a journal. It’s been on arXiv for a while, so nothing too surprising. But reading through the paper, one thing stood out to me: there is a crispness and detail to the way the authors outlined their methods and integrated them into the results section. Unfortunately, the pressure to publish creates an obvious tendency for people to use methods and tools (which usually consist of software written by others and used in a black-box fashion) in a slapdash manner, with the aim of arriving at a publishable unit. Because of the specialization within science, it seems one can make it all the way through peer review while using methods in a way which signals that one does not really know what one is talking about. To give a concrete example, a year ago I was told about a phylogenetic package in moderate usage which seems to be basically a “random number generator.” That this package is used at all is a testament to the fact that many researchers who are not phylogeneticists simply reach for the nearest method at hand, and trust the results if they make some intuitive sense (presumably in this case they would simply report the results which were intelligible).

The ultimate future, I’m hoping, is one of open data, open code, and open methods. When a shady or sketchy paper makes it through peer review there is now visible public anger which bubbles out of the scientific community, but the process of reproducing the results can still be tedious (see Arsenic life). This is less true in cases where the means are more computational. The only things stopping the process of science from operating more efficiently are human barriers (e.g., cultural norms, institutional barriers to data release).

  • jugni

    With regard to the earlier paper you had reviewed on Gypsy origins, the one by Thangaraj and company, I thought you might find this Open Magazine article interesting. In particular, this statement on the study by a linguist, which the writer seems to accept too easily at face value:

    According to Hancock, the PLOS One study that genetically links the Roma with SC/STs is just another hypothesis. ‘It doesn’t take social and historical factors into consideration,’ he says. ‘Several other genetic studies on my people have been done which give quite different results; Bhalla, for instance, found that the Roma descend from Jats; Kochanowski finds a Rajput origin.’

  • JonFrum

    Putting aside the general controversy, paleoclimatology has in recent years been a case study in what should not be done in science. Not only have authors stonewalled when asked for data and code, but journal editors at Nature and other major publications have refused to require archiving of data and code. Given that the relevant studies consist entirely of statistical analyses of data sets, these studies are simply not reproducible. A molecular biologist can read a methods section and re-do the work herself in her own lab. Without access to the exact data set and computer program used in paleoclimate studies, there is no possibility of independent replication, which obviously violates the first principle of science communication.

    Worse than the behavior of such scientists and editors is the silence within the field. Some journals have made archiving of data and code a requirement for publishing, and should be recognized for doing so. The others – and the field as a whole – have a lot to answer for. As a schoolboy, my math teachers told me that I had to show my work. That is all that is being asked. It has been called intimidation. It’s a sad state of affairs.


