A paper just out in the journal Psychological Science announces: Women Can Keep the Vote: No Evidence That Hormonal Changes During the Menstrual Cycle Impact Political and Religious Beliefs
This eye-catching title heads up an article that’s interesting in more ways than you’d think.
In the paper, authors Christine Harris and Laura Mickes report their attempt to reproduce the results of another paper, previously published in the same journal. The original study, by Kristina Durante and colleagues, claimed that some women’s political preferences changed over the course of their menstrual cycles: The Fluctuating Female Vote.
Harris and Mickes say that they found no menstrual effects in their replication. Durante et al’s rebuttal says, amongst other things, that Harris and Mickes’ data do in fact confirm some of their hypotheses, when the results are analyzed appropriately; they also present new data of their own. In other words, it’s a pretty typical dispute among psychologists.
I don’t know, or especially care, who’s right. But what makes this exchange very interesting to me is the way in which both sides of the debate are deploying methodological concerns about the scientific enterprise as a rhetorical weapon.
Harris and Mickes for instance write that:
This study adds to a growing number of failures to replicate several menstrual cycle effects on preferences and attraction (e.g., 1, 2), which invites concerns that this literature as a whole may have a false-positive rate well above the widely presumed 5%.
That inflation is expected if data analysis flexibility of the sort cautioned against by Simmons et al. is present… However, each purported effect should be assessed on its own merits.
The F Problem (flexible analyses, researcher degrees of freedom, and failures to replicate), which has been much discussed (including, famously, by Psychological Science), is here being used to partisan effect. This is hardly a new rhetorical tactic, but my impression is that it’s getting more common, and I predict it will continue to do so.
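The inflation that Simmons et al. warned about is easy to demonstrate by simulation: if a researcher measures several outcomes and reports whichever one happens to reach p &lt; .05, the false-positive rate climbs well above the nominal 5%. Here’s a minimal sketch in Python; the specific numbers (three measures, 100 participants per group, 2,000 simulated experiments) are my own illustrative assumptions, not figures from any of the papers:

```python
import math
import random

random.seed(1)

def two_sided_p(xs, ys):
    """Two-sample test p-value via a normal approximation
    (adequate for n = 100 per group)."""
    nx, ny = len(xs), len(ys)
    mx, my = sum(xs) / nx, sum(ys) / ny
    vx = sum((x - mx) ** 2 for x in xs) / (nx - 1)
    vy = sum((y - my) ** 2 for y in ys) / (ny - 1)
    z = (mx - my) / math.sqrt(vx / nx + vy / ny)
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

def experiment(n_measures):
    """One experiment under the null: no true group difference on any
    measure. Returns True if *any* measure comes out 'significant'."""
    ps = []
    for _ in range(n_measures):
        a = [random.gauss(0, 1) for _ in range(100)]
        b = [random.gauss(0, 1) for _ in range(100)]
        ps.append(two_sided_p(a, b))
    return min(ps) < 0.05

n_sims = 2000
# honest: one pre-specified outcome measure
honest = sum(experiment(1) for _ in range(n_sims)) / n_sims
# flexible: measure three outcomes, report the 'best' one
flexible = sum(experiment(3) for _ in range(n_sims)) / n_sims
print(f"one preregistered measure: {honest:.1%} false positives")
print(f"best of three measures:    {flexible:.1%} false positives")
```

With one measure the false-positive rate sits near the nominal 5%; letting the analyst pick the best of three pushes it towards 1 − 0.95³ ≈ 14%. And this is only one of the many degrees of freedom Simmons et al. describe.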
I find the whole tactic disingenuous. In the last line, Harris and Mickes correctly note that every case should be judged on the basis of the pertinent data alone; yet the preceding sentences seem designed to make the reader regard all cases in this literature (including this one) as suspect as a whole.
Yet, if we are going to start tarring lots of studies with the same brush, why stop there? False positive concerns and failed replications are a growing concern across the whole of psychology and beyond. To single out just one literature for skepticism is unfair, and breeds complacency.
Durante et al hit back – but their F Problem rhetoric is no less problematic. They write (my emphasis):
A recent meta-analysis of 134 ovulatory effects [on women's mate preferences] from 38 published and 12 unpublished studies revealed robust cycle shifts not due to researcher degrees of freedom (Gildersleeve et al., in press).
It didn’t. You cannot exclude the power of researcher degrees of freedom retrospectively. Hidden flexibility refers to parameters that can vary in the production of a given set of statistical results in ways that leave no trace in the final result. They are hidden.
No-one knows how many things you tried before you found the result you report. At best, a statistical analysis can suggest that bias seems more or less likely in a certain literature, given some assumptions, but this is never conclusive: and these analyses are themselves full of researcher degrees of freedom – which is perhaps why both Harris and Mickes and Durante et al were able to cite rival meta-analyses supporting their opposing positions…
Durante et al’s response also presents the results of their own replication of their menstrual-cycle-politics effect – and the results are positive.
But remember that, had the data come out negative, Durante et al might not have told us about it. And they might, in theory, have run several replications and picked only the best one… or used other forms of p-value hacking. Replications are not immune to bias.
However, how do we know that Harris and Mickes didn’t use the same tricks ‘in reverse’ to get a null result? It’s easy to forget that questionable research practices can work both ways. P-values can be hacked up as well as down, and although the former may be less common, it’s no less problematic.
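The ‘reverse’ trick can be simulated too. Suppose a real effect exists, but an analyst runs a few analysis variants (different exclusion rules, subsamples, and so on – modeled crudely here as three small studies) and reports only the weakest result. A toy sketch, with effect size, sample size, and the number of variants all being illustrative assumptions of mine:

```python
import math
import random

random.seed(2)

def p_value(xs, ys):
    """Two-sample test p-value via a normal approximation."""
    nx, ny = len(xs), len(ys)
    mx, my = sum(xs) / nx, sum(ys) / ny
    vx = sum((x - mx) ** 2 for x in xs) / (nx - 1)
    vy = sum((y - my) ** 2 for y in ys) / (ny - 1)
    z = (mx - my) / math.sqrt(vx / nx + vy / ny)
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

def study(effect=0.5, n=30):
    """One small study of a genuine effect (d = 0.5, n = 30 per group)."""
    a = [random.gauss(effect, 1) for _ in range(n)]
    b = [random.gauss(0, 1) for _ in range(n)]
    return p_value(a, b)

n_sims = 2000
# honest: run one analysis, report it, whatever it says
honest_null = sum(study() >= 0.05 for _ in range(n_sims)) / n_sims
# hacked 'up': try three variants, report the weakest (largest p)
hacked_null = sum(max(study() for _ in range(3)) >= 0.05
                  for _ in range(n_sims)) / n_sims
print(f"honest analyst reports 'no effect':     {honest_null:.0%}")
print(f"worst-of-three reporter says 'no effect': {hacked_null:.0%}")
```

With this underpowered design, an honest analyst will miss the real effect about half the time anyway – but the worst-of-three reporter can announce a ‘failure to replicate’ the vast majority of the time, despite the effect being real.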
I know of only one way to put a stop to all this uncertainty: preregistration of studies of all kinds. It won’t quell existing worries, but it will help to prevent new ones, and eventually the truth will out.
In particular, if you’re engaged in a bun-fight of the kind discussed here, you should consider voluntarily preregistering your contributions (e.g.) – if you’re really confident that you’re in the right, that is. Because if you can get your results even with your methodological hands tied behind your back, your position will be strengthened enormously.
Harris, C., & Mickes, L. (2014). Women Can Keep the Vote: No Evidence That Hormonal Changes During the Menstrual Cycle Impact Political and Religious Beliefs. Psychological Science. DOI: 10.1177/0956797613520236
Durante, K., Arsena, A., & Griskevicius, V. (2014). Fertility Can Have Different Effects on Single and Nonsingle Women: Reply to Harris and Mickes (2014). Psychological Science. DOI: 10.1177/0956797614524422