Every few months someone asks me what I use to manage my papers. Stupidly, I don’t use anything. Or I haven’t. Over the past few weeks I’ve been playing around with PubChase and Mendeley. You probably know of the latter, and the fact that it’s been purchased by Elsevier. Elsevier is what it is. Mendeley, on the other hand, is a firm that I have a positive view of, in part because of their culture of openness and support for the free flow of information, but also because I’ve known their head of outreach for ten years. You trust people, not things. Mendeley’s not a charity, and I don’t begrudge them their new resources now that they are under the corporate wing of Elsevier. Whether you’re pessimistic or optimistic about their future, I think caution is warranted.
It’s no secret to people who read this blog that I hate the way scientific publishing works today. Most of my efforts in this domain have focused on removing barriers to the access and reuse of published papers. But there are other things that are broken with the way scientists communicate with each other, and chief amongst them is pre-publication peer review. I’ve written about this before, and won’t rehash the arguments here, save to say that I think we should publish first, and then review. But one could argue that I haven’t really practiced what I preach, as all of my lab’s papers have gone through peer review before they were published.
No more. From now on we are going to post all of our papers online when we feel they’re ready to share – before they go to a journal. We’ll then solicit comments from our colleagues and use them to improve the work prior to formal publication. Physicists and mathematicians have been doing this for decades, as have an increasing number of biologists. It’s time for this to become standard practice.
Some ground rules. I will not filter comments except to remove obvious spam. You are welcome to post comments under your name or under a pseudonym – I will not reveal anyone’s identity – but I urge you to use your real name as I think we should have fully open peer review in science.
Peter A. Combs and Michael B. Eisen (2013). Sequencing mRNA from cryo-sliced Drosophila embryos to determine genome-wide spatial patterns of gene expression.
Please leave comments on Eisen’s post.
Via Haldane’s Sieve.
Three articles which illustrate the difficulty of the sort of science which tackles what Jim Manzi would term phenomena characterized by high causal density. First, the simplest one is the report that extrapolating from some mouse models to human biological systems may be problematic. Anyone who has talked to human geneticists who use mouse models is aware that these inbred lineages can be somewhat particular and specific. Order the wrong mice, and all of your experimental designs might be for naught. So the result is not surprising, but it seems useful to have it documented in such a concrete fashion (though this has been reported in the media before).
Second, a long piece in The Chronicle of Higher Education on the problems in replicating groundbreaking research in the area of priming. This may be a case of a seemingly robust result which turns out to fade into irrelevance as time passes, and it illustrates the fundamental problem of attempting to do science on humans; we’re diverse and protean. I think the jury’s out on this, and we’ll wait and see. Fortunately this probably won’t be an issue we’ll be debating in 10 years: replications will start to occur, or they won’t.
Over at ScienceDaily there is a report on a new paper on affirmative action and academia, Understanding the Impact of Affirmative Action Bans in Different Graduate Fields of Study. The paper is gated, but the regression model used really doesn’t seem to do much more than confirm intuition. The descriptive details are more interesting and straightforward.
A week ago Keith Kloor had a post up, What Science, Environmentalism and the GOP Have in Common, where he bemoaned the lack of representation of non-whites in these categories. As a matter of fact I think Keith is wrong about science. Even constraining the data set to American citizens and permanent residents, people of Asian ancestry are well represented in many areas of science. But not all sciences are created equal. In 2011 there were 158 doctorates awarded within the category of ‘evolutionary biology’ to American citizens or permanent residents. Of these, 135 were non-Hispanic white and 5 were Asian. In ‘neuroscience’ the respective figures were 742, 535, and 96. In ‘zoology,’ 55, 49, and 0. In ‘bioinformatics,’ they were 80, 51, and 17. Finally, in ‘ecology’ the breakdown was 330, 300, and 11. If you are involved in academic biology I’m rather sure that these numbers won’t surprise you too much, even if you’d never thought about it. You could even infer them by walking through the posters at ASHG 2012 and seeing how the demographics of the crowds shift.
We can look at this issue another way. In 2010 US News & World Report listed the top 10 ecology & evolution graduate programs. I went to the faculty websites after typing the university and ‘ecology,’ and then ‘neuroscience.’ Looking at names, and sometimes head shots, I classified everyone as ‘Asian’ (as defined by the US Census) and ‘Not Asian.’ You can find the data here. Please note that the left columns are ecology faculty, and the right are neuroscience.
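To make the pattern concrete, here is a minimal sketch computing each field’s racial breakdown as shares, using only the 2011 NSF doctorate counts quoted above (the dictionary layout is mine, the numbers are from the post):

```python
# 2011 doctorates awarded to US citizens/permanent residents,
# per the figures quoted above: (total, non-Hispanic white, Asian).
fields = {
    "evolutionary biology": (158, 135, 5),
    "neuroscience":         (742, 535, 96),
    "zoology":              (55, 49, 0),
    "bioinformatics":       (80, 51, 17),
    "ecology":              (330, 300, 11),
}

for name, (total, white, asian) in fields.items():
    # Share of each group among that field's doctorates.
    print(f"{name:22s} white: {white/total:5.1%}  Asian: {asian/total:5.1%}")
```

The contrast is stark: the Asian share runs roughly 3% in evolutionary biology and ecology (and zero in zoology) against roughly 13% in neuroscience and 21% in bioinformatics, which is the within-science disparity in question.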
As many of you know, right before the election I made a $50 bet with Hank Campbell that Nate Silver would get at least 48 out of 50 states correct for the 2012 presidential election. I also got one of Hank’s readers to sign on to the same bet. Additionally, a few readers and Twitter followers got in on the wager; they were bullish on Romney’s prospects, and I was not (more honestly, I was moderately sure they were self-delusional, and willing to take their money to make them more cautious about their self-delusional biases in the future). But there’s a major precondition that needs to be stated here: I hedged.
Last February a friend told me he was 100% confident that Barack Hussein Obama would be reelected. This prompted me to ask for favorable terms on a bet. The logic was simple: if he was 100% confident, then the terms shouldn’t have mattered much to him, because he was going to collect anyhow. As it happens he gave me 5 to 1 odds, so that I would collect $5 for every $1 he might collect. I told him beforehand that I actually thought Obama had a 60-70% chance of winning, so I went into the wager assuming I’d be out a modest amount of money. But that was no concern. My goal was now to convince those who were irrationally supportive of Romney to take the other side of the bet. For whatever reason people have an inordinate bias toward their hoped-for candidate in terms of who they think will win, as opposed to who they wish to win. The future ought gets confused with the future is.* I got people to take the other side, which means that I was going to make money no matter who won.
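The arithmetic of the hedge can be sketched with hypothetical stakes (the dollar amounts below are illustrative, not the actual wagers):

```python
# Bet A: a friend offers 5-to-1 odds that Obama wins. Taking the other side
# with a $10 stake, I win $50 if Obama loses and pay $10 if he wins.
# Bet B: Romney optimists take the pro-Obama side off my hands at even odds;
# staking $30, I win $30 if Obama wins and pay $30 if he loses.

def net_payoff(obama_wins, stake_a=10, odds_a=5, stake_b=30, odds_b=1):
    """Combined result of the two offsetting bets, in dollars."""
    bet_a = -stake_a if obama_wins else stake_a * odds_a
    bet_b = stake_b * odds_b if obama_wins else -stake_b
    return bet_a + bet_b

for outcome in (True, False):
    print(f"Obama wins: {outcome}  net: ${net_payoff(outcome)}")
```

Sizing the even-odds stake between the $10 at risk and the $50 to be won leaves a positive net either way, which is the sense in which the wager was hedged rather than a bet on Obama per se.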
At this point one might wonder about my comment that I suspected that those who were bullish on Romney were delusional. It’s rather strong, and my reasoning was somewhat roundabout. Overall I accepted the polling averages. A few years back I was an economic determinist about election outcomes, but Nate Silver had convinced me that the sample size was too small to get a good sense of the real proportion of variation being predicted here. In short, the economy matters, but I stepped back from the supposition that it was determinative (as it happens, purely economic models that were excellent at predicting past elections face-planted this time). So that’s why I relied on the polls. Though I leaned on Nate Silver, I didn’t think he was particularly oracular, and I’d say that I’m mildly skeptical of the excessive faith some put in him personally. When I put up a link to Colby Cosh’s mild take-down of Silvermania I received a few moderately belligerent comments. This despite the fact that I was willing to put money on Silver’s prediction.
Science is about “updating” with new information. But people are attached to their propositions, and shifts in paradigms can take a very long time, often dependent more on human lifespans than on the weight of the data. But please see this post by Luke Jostins over at Genomes Unzipped. He has “updated” his own view of his recent Nature paper on inflammatory bowel disease. This is rather awesome, because yes, there was some talk about the balancing selection aspect of the paper at ASHG, and now Luke has gone and amended his own position.
The reality is that emotions are a big deal in science. But in theory we simply look at the evidence. Bridging that gap, and shifting the balance to the latter, is very important in keeping the enterprise honest, fruitful, and attractive to young scholars. I’m hoping that the more rapid dissemination of information via projects like Haldane’s Sieve will aid in the rate of iteration.
Richard Lewontin’s fame rests in part on his pioneering role in the development of the field of molecular evolution, and secondarily on his trenchant Left-wing politics. Several readers have already pointed me to his rather strange review of two new works in The New York Review of Books. The prose strikes me as viscous and meandering, but some of the assertions are rather peculiar. For example:
The other exception to random inheritance is not in the chromosomes, but in cellular particles called ribosomes that contain not DNA but a related molecule, RNA, which has heritable variation and is of basic importance to cell metabolism and the synthesis of proteins. Although the cells of both sexes have ribosomes, they are inherited exclusively through their incorporation in the mother’s egg cell rather than through the father’s sperm. Our ribosomes, then, provide us, both male and female, with a record of our maternal ancestry, uncontaminated by their male partners.
Harry Ostrer, who is a professor of genetics at Albert Einstein College of Medicine, and Raphael Falk, who is one of Israel’s most prominent geneticists, depend heavily on our ability to trace ancestry by looking at the DNA of Y chromosomes and ribosomes….
There is no mention of ribosomes in Legacy: A Genetic History of the Jewish People. I know, because I used Amazon’s ‘search inside’ feature. Rather, there’s a lot of reference to mitochondrial DNA and mtDNA, which is what Lewontin truly meant. Or at least I hope that’s what he meant. Because Lewontin is an eminent evolutionary biologist I assume they felt like they didn’t need a science editor, but perhaps they need to reconsider that.
I was a little sad when I heard my friend Steve Hsu had accepted a position at Michigan State some months back. My reasons were two-fold. First, I swing by Eugene now and then, and I wouldn’t have the opportunity to drop in on his office. Second, it seemed that Steve was becoming an Administrator. To some extent I feel like that’s going over to the dark side. But ultimately it’s his decision, and Steve has a lot of things going on at any given moment, and I’m hopeful he’ll continue to be involved in the production of scholarship in some form (he’s more of a scholar than most as it is).
Now apparently his move has resulted in submerged tensions coming to the fore. You can read the article in The Lansing Journal, New director’s experience a plus for MSU, but his controversial views concern some. Let’s qualify who these “some” are:
Fifteen years ago John Horgan wrote The End Of Science: Facing The Limits Of Knowledge In The Twilight Of The Scientific Age. I remain skeptical as to the specific details of this book, but Carl’s write-up in The New York Times of a new paper in PNAS on the relative commonness of scientific misconduct in cases of retraction makes me mull over the genuine possibility of the end of science as we know it. This sounds ridiculous on the face of it, but you have to understand my model of and framework for what science is. In short: science is people. I accept the reality that science existed in some form among strands of pre-Socratic thought, or among late antique and medieval Muslims and Christians (not to mention among some Chinese as well). Additionally, I can accept the cognitive model whereby science and scientific curiosity is rooted in our psychology in a very deep sense, so that even small children engage in theory-building.
I’m reading Jim Manzi’s Uncontrolled: The Surprising Payoff of Trial-and-Error for Business, Politics, and Society right now. No complaints, though that’s no surprise, as I’m familiar with the broad outlines of Manzi’s work, and have found much to agree with him on in the past (though there are issues where we differ, never fear). That being said, I did ponder one aspect of Manzi’s characterization of science: that it makes non-obvious predictions. This is not controversial, and I don’t want to quibble with it too much. But in the context of social science in particular I think one of the gains of ‘science’ is the clarification of obvious predictions.
Dr. Joe Pickrell has a follow up to his widely discussed post on updating scientific publication for the 21st century. One section jumped out at me, not because it was revolutionary, but because it made explicit a complaint that I had often heard:
The solution to this problem relies on a simple observation–in my field, I am completely indifferent to whether a paper has been “peer-reviewed” for the basic reason that I consider myself a “peer”. I do not think it extremely hubristic to say that I am reasonably capable of evaluating whether a paper in my field is worth reading, and then if so, of judging its merits. The opinions of other people in the field are of course important, but in no way does the fact that two or three nameless people thought a paper worth publishing influence my opinion of it. This immediately suggests a system in which papers are posted online as soon as the authors think they are ready (on so-called pre-print servers). This system is the default in many physics, math, and economics communities, among others, and as far as I can tell it’s been quite successful.
The reality is that often the “peers” are not peers. How else to explain the publication of the longevity study in Science, now retracted? Or the non-canonical RNA editing paper? (Presumably this is less common a problem in specialized journals.) And sometimes the feedback of peers can indicate that they don’t really know what they’re talking about. For example, I was once told that the authors of a phylogenetics paper which used Bayesian methods were asked to reanalyze their data within a maximum likelihood framework (jump to the last sentence of this section to see why this is peculiar).
The theory of classical peer review made sense in the pre-internet age. But now there are plenty of reasons why we might need to revisit it.*
* Not to mention that “peer review” is a somewhat subjective concept. Richard A. Muller has gotten into a back & forth on the issue of whether his latest work has undergone peer review. He claims it has; others claim not. I suspect most traditional biologists would be skeptical of Muller’s claim, but physicists would accept it.
Here’s a comment which is interesting, if hard to actually engage with because of the difficulty of the subject matter:
You’re obviously aware of the arguments employed by feminists in the critique of the philosophy of science; that cultural values, in their view patriarchy, could unintentionally contaminate science by affecting how evidence is interpreted and what hypotheses are formed from it. This argument is usually combined with the more fundamental problem of using inductive logic in science, especially biology, and how any cultural norms could be mistaken for biological facts.
My question is how do you separate out the biases from the facts?
What makes you think that the left’s reservations about the studies into sex and race are the result of their own bias and not a legitimate accusation of bias within science? It is obviously not a totally improbable claim considering the long history of racist science in the two previous centuries.
From my own layman’s knowledge of the subject I’ve got the impression the jury is still out on both innate sex differences and the genetic realities of race.
There is one other drawback to the arXiv that makes me, as a potential submitter, very nervous: being scooped.
A paper is “scooped” if someone else publishes the same (or very similar) concept before you get a chance to publish yours. But, wait, if it is on the arXiv, isn’t that documentation that I had the idea first? Well, yes, but… the arXiv isn’t commonly used in Biology yet, so it isn’t clear how important or how much priority will be given to authors who publish there before “traditional” peer review. This is especially concerning if the novelty of the paper is the idea (which is easy to reproduce with the same or different data) versus a method (which is more difficult to replicate). Maybe this isn’t a valid concern, because anonymous reviewers could, one might argue, just as easily “scoop” ideas from a manuscript they have reviewed. Furthermore, perhaps posting ideas/research early might facilitate more collaborations instead of competitions between research groups.
All said, I think that submitting to pre-print servers can be a very valuable tool for facilitating scientific discourse and advances. Will I start submitting there? We will have to wait and see.
It doesn’t matter to me at this point that people might have qualms. Once sufficient consciousness is raised and critical mass is achieved, then you’ll see a stampede. Some fields in biology may be late into the shift toward preprint distribution, but for the purposes of a lot of the stuff I cover on this weblog I doubt that will matter. When it comes to evolutionary biology that isn’t being funded by pharma or private foundations I don’t think there’s much holding people back aside from the worry about being scooped.
I don’t know much about academia and its intrigues personally, but I have heard of instances of reviewers squatting on a paper until someone else associated with the reviewer publishes (yes, in many cases people know, or suspect, who is reviewing). This is a form of scooping, but it occurs in the shadows, and there’s always deniability. Who knows how we can quantify this sort of behavior? But it’s something that we need to keep in mind when we’re worried about the pitfalls of open access and preprint distribution.
Over at Scientific American Blogs Maria Konnikova posts Humanities aren’t a science. Stop treating them like one. The whole write-up leaves me scratching my head, because I don’t really get what the whole point of all the prose is. This is a thesis that is as old as 19th century romantics, and not all too complicated. The author herself has an academic webpage which indicates she works within an analytic framework that’s anything but “soft.” There are huge confusions with terminology, and Jerry Coyne has a response which addresses many of my questions (e.g., what exactly is the alternative to doing statistical tests in psychology? Rely on the impressions and intuition of the researchers and just trust them?). But let me highlight one section:
… Societal conventions change. And is today’s real-world social network really comparable on any number of levels to one, say, a thousand, or even five or one hundred years ago?
Yes, today’s real-world social network probably is comparable to those of the past. There is some science on this issue. Not even rocket science with abstruse statistics. Science which is highly relevant today. Question science, and it may surprise you with what it has discovered!
Update: It’s online.
Well, maybe the title is hyperbolic. But it’s been frustrating for years that PNAS seems to have some of the most backward post-publication delay policies/patterns in the business. So, for example there’s a new paper in PNAS which is being covered in the media extensively with a DOI link released, but the paper still isn’t on the website. This allows David Reich free rein to do a little amusing slap-slap without any paper to check him:
….The PNAS paper questioning Neanderthal admixture addresses issues swirling around two years ago but not Reich and Slatkin’s latest work. “It’s been an issue for several years. They were right to work on this,” says Reich. But now “it’s kind of an obsolete paper,” he says.
And of course Reich’s group put their preprint up on arXiv yesterday (though the linked piece above says that it’s already been accepted into PLoS Genetics), so we can slice & dice it while we’re waiting on PNAS.
My primary reason for putting a whole post on this issue, which Ed Yong has mentioned many times, is that Twitter kept buzzing (at least my feed) about when the paper was going live earlier today. On the one hand this generates pent-up demand, but it also creates irritation and resentment. I understand that people present at conferences and give talks, and then you wait for the paper. But it’s really testing patience to release media coverage before you put the paper up. And if the past teaches us anything, it could be days before they push it live.
I have no idea if this is PNAS‘ policy. Nature and Science somehow manage without this ‘feature.’ And I doubt anything will change. But it should change.
Since Jonah Lehrer came up in the open thread last week, I’m going to mention it. First, I’ll preface this by saying that my interactions with Jonah, who I labeled the “boy king of cognitive neuroscience” jokingly at one point, were all positive. When Jonah quoted me in The Wall Street Journal, I received an email from a fact-checker asking me about it. He didn’t misquote me, and on the contrary, he was punctilious about correcting a misspelling of my name. I liked what I knew about Jonah as a person, and the whole episode has left me rather depressed.
Plagiarism, fabulism, whatever you call it, what he did was horrible. But what really sent me over the edge was the possibility that Jonah threw an editor under the bus, casting blame elsewhere to cover up his sloppiness. The main reason I post this is that I want to reproduce a comment which illustrates the sort of error Jonah regularly made:
Betsey Stevenson and Justin Wolfers hail the way increases in computing power are opening vast new horizons of empirical economics.
I have no doubt that this is, on the whole, change for the better. But I do worry sometimes that social sciences are becoming an arena in which number crunching sometimes trumps sound analysis. Given a nice big dataset and a good computer, you can come up with any number of correlations that hold up at a 95 percent confidence level, about 1 in 20 of which will be completely spurious. But those spurious ones might be the most interesting findings in the batch, so you end up publishing them!
Those in genomics won’t be surprised at this caution. I think in some ways social psychology and areas of medicine suffered a related problem, where a massive number of studies were “mined” for confirming results. And we see this more informally all the time. In domains where I’m rather familiar with the literature and distribution of ideas it is often easy to infer exactly which Google query the individual entered to fetch back the result they wanted. More worryingly I’ve noticed the same trend whenever people find the historian or economist who is willing to buttress their own perspective. Sometimes I know enough to see exactly how the scholars are shading their responses to satisfy their audience.
With great possibilities comes great peril. I think the era of big data is an improvement on abstruse debates about theory which can’t ultimately be resolved. But you can do a great deal of harm as well as good.
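The “1 in 20” caution above is easy to demonstrate with a simulation: run many significance tests on pure noise and count how many clear the conventional p < 0.05 bar. A minimal sketch (a two-sided z-test on null data; sample sizes and seed are arbitrary):

```python
import math
import random
import statistics

random.seed(42)

Z_CRIT = 1.96    # two-sided 5% threshold for a z-test
N, TESTS = 200, 1000

false_positives = 0
for _ in range(TESTS):
    # Pure noise: the true mean is exactly zero, so every rejection is spurious.
    sample = [random.gauss(0, 1) for _ in range(N)]
    z = statistics.mean(sample) / (statistics.stdev(sample) / math.sqrt(N))
    if abs(z) > Z_CRIT:
        false_positives += 1

print(f"{false_positives} of {TESTS} null tests came up 'significant' (~5% expected)")
```

Roughly fifty “discoveries” from data containing no signal at all, and in a publication system biased toward surprising results, those are exactly the ones most likely to be written up.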
Last year when Dr. Joseph Pickrell posted Why publish science in peer-reviewed journals? at Genomes Unzipped many of the responses naturally turned to criticism of such a system which overturned the conventions of publication in biology. The critiques were fair enough, but my own confusion and irritation stemmed from the fact that many seemed to ignore, or not know about, arXiv. It is perhaps true that biological sciences are different in some fundamental ways from physics, or even social sciences, which put preprints up at SSRN. But it seems that any objection to the revolution in scientific production and dissemination which Dr. Pickrell proposed should at least grapple with the fact that physics, mathematics, computer science, economics, etc. continue to remain viable academic fields despite the fact that preprints circulate widely among scholars, and even to the general public.* Publication of a paper in these fields is often an after-the-fact stamp of approval, depending on the reception from the community of peers.
Few principles are more depressingly familiar to the veteran scientist: the more surprising a result seems to be, the less likely it is to be true. We cannot know whether, or why, this principle was overlooked in any specific study. However, more generally, in a world in which unexpected results can lead to high-impact publication, acclaim and headlines in The New York Times, it is easy to understand how there might be an overwhelming temptation to move from discovery to manuscript submission without performing the necessary data checks.
This is not just an issue in genomics. I’ve discussed it before as being a major problem in psychology. Though the infamous centenarian study will do nothing for the careers of the scientists involved, I do wonder what the effect of publishing large numbers of false positive results is on an individual’s career when the work isn’t so inexpertly executed (i.e., in this particular case the technical errors were so glaring that the authors should never have submitted their findings). I wonder because apparently major newspapers are now running with stories which they know are highly likely to be exaggerations or misrepresentations to induce pageviews, and then subsequently ‘correcting’ them. More specifically, the number of corrections has been rising rapidly.
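The quoted principle, that the more surprising a result the less likely it is to be true, is base rates at work, and a back-of-the-envelope calculation makes it concrete. All the numbers below are illustrative assumptions (prior probability that the tested hypothesis is true, 80% power, testing at alpha = 0.05):

```python
def ppv(prior, power=0.8, alpha=0.05):
    """Positive predictive value: the chance a 'significant' result is real,
    given the prior probability that the hypothesis was true at all."""
    true_pos = power * prior          # true hypotheses that pass the test
    false_pos = alpha * (1 - prior)   # false hypotheses that pass anyway
    return true_pos / (true_pos + false_pos)

print(f"mundane claim   (prior 50%): PPV = {ppv(0.50):.0%}")
print(f"surprising claim (prior 1%): PPV = {ppv(0.01):.0%}")
```

Under these assumptions a statistically significant result for a mundane hypothesis is real about 94% of the time, while the same significance threshold applied to a genuinely surprising hypothesis yields a finding that is true only about 14% of the time. The headlines select for precisely the low-prior tail.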