Hormones and Women Voters: A Very Modern Scientific Controversy

By Neuroskeptic | March 4, 2014 2:25 pm

A paper just out in the journal Psychological Science says that: Women Can Keep the Vote: No Evidence That Hormonal Changes During the Menstrual Cycle Impact Political and Religious Beliefs

This eye-catching title heads up an article that’s interesting in more ways than you’d think.

In the paper, Christine Harris and Laura Mickes report an attempt to reproduce the results of another paper, previously published in the same journal. The original study, by Kristina Durante and colleagues, claimed that some women’s political preferences changed over the course of their menstrual cycles: The Fluctuating Female Vote.

Harris and Mickes say that they found no menstrual effects in their replication. Durante et al’s rebuttal says, amongst other things, that Harris and Mickes’ data do in fact confirm some of their hypotheses, when the results are analyzed appropriately; they also present new data of their own. In other words, it’s a pretty typical dispute among psychologists.

I don’t know, or especially care, who’s right. But what makes this exchange very interesting to me is the way in which both sides of the debate are deploying methodological concerns about the scientific enterprise as a rhetorical weapon.

Harris and Mickes for instance write that:

This study adds to a growing number of failures to replicate several menstrual cycle effects on preferences and attraction (e.g., 1, 2), which invites concerns that this literature as a whole may have a false-positive rate well above the widely presumed 5%.

That inflation is expected if data analysis flexibility of the sort cautioned against by Simmons et al. is present… However, each purported effect should be assessed on its own merits.

The F Problem (flexible analyses, researcher degrees of freedom, and failures to replicate) which has been much discussed (including, famously, by Psychological Science) is here being used to partisan effect. This is hardly a new rhetorical tactic but my impression is that it’s getting more common and I predict it will continue to do so.

I find the whole tactic disingenuous. In the last line, Harris and Mickes correctly note that every case should be judged on the basis of the pertinent data alone; yet the preceding sentences seem designed to make the reader regard all cases in this literature (including this one) as suspect.

Yet, if we are going to start tarring lots of studies with the same brush, why stop there? Worries about false positives and failed replications are growing across the whole of psychology and beyond. To single out just one literature for skepticism is unfair, and breeds complacency.

Durante et al hit back – but their F Problem rhetoric is no less problematic. They write (my emphasis):

A recent meta-analysis of 134 ovulatory effects [on women’s mate preferences] from 38 published and 12 unpublished studies revealed robust cycle shifts not due to researcher degrees of freedom (Gildersleeve et al., in press).

It didn’t. You cannot exclude the influence of researcher degrees of freedom retrospectively. Hidden flexibility refers to parameters that can vary in the production of a given set of statistical results in ways that leave no trace in the final result. They are hidden.

No-one knows how many things you tried before you found the result you report. At best a statistical analysis can suggest that bias seems more or less likely in a certain literature, given some assumptions, but this is never conclusive: and these analyses are themselves full of researcher degrees of freedom – which is perhaps why both Harris and Mickes and Durante et al were able to cite rival meta-analyses supporting their opposing positions…

Durante et al’s response also presents the results of their own replication of their menstrual-cycle-politics effect – and the results are positive.

But remember that, had the data come out negative, Durante et al might not have told us about it. And they might, in theory, have run several replications and picked only the best one… or used other forms of p-value hacking. Replications are not immune to bias.
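How much would quietly picking the best of several replications actually matter? A toy simulation (my own illustration, not anything from either paper, assuming only the textbook fact that under a true null hypothesis p-values are uniformly distributed) makes the arithmetic concrete:

```python
import random

random.seed(0)

def reports_positive(n_attempts):
    # Under the null, each study's p-value is uniform on [0, 1].
    # A "team" that runs n_attempts studies and reports only the best
    # one declares a positive result if ANY attempt reaches p < .05.
    return min(random.random() for _ in range(n_attempts)) < 0.05

trials = 100_000
honest = sum(reports_positive(1) for _ in range(trials)) / trials
cherry = sum(reports_positive(3) for _ in range(trials)) / trials

print(f"report the only study:    {honest:.3f}")  # close to the nominal .05
print(f"report the best of three: {cherry:.3f}")  # close to 1 - .95**3 = .143
```

Three quiet attempts nearly triple the false-positive rate, and nothing in the one published result betrays the other two.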

However, how do we know that Harris and Mickes didn’t use the same tricks ‘in reverse’ to get a null result? It’s easy to forget that questionable research practices can work both ways. P-values can be hacked up as well as down, and although the former may be less common, it’s no less problematic.

I know of only one way to put a stop to all this uncertainty: preregistration of studies of all kinds. It won’t quell existing worries, but it will help to prevent new ones, and eventually the truth will out.

In particular, if you’re engaged in a bun-fight of the kind discussed here, you should consider voluntarily preregistering your contributions (e.g.) – if you’re really confident that you’re in the right, that is. Because if you can get your results even with your methodological hands tied behind your back, your position will be strengthened enormously.

Harris, C., & Mickes, L. (2014). Women Can Keep the Vote: No Evidence That Hormonal Changes During the Menstrual Cycle Impact Political and Religious Beliefs. Psychological Science. DOI: 10.1177/0956797613520236

Durante, K., Arsena, A., & Griskevicius, V. (2014). Fertility Can Have Different Effects on Single and Nonsingle Women: Reply to Harris and Mickes (2014). Psychological Science. DOI: 10.1177/0956797614524422

  • DS

    The pressure to publish is at the core of the veracity issue in the biological sciences. Although prereg may significantly improve veracity, I don’t think it will do so sufficiently.

  • SpyroShay

    I agree that in the process of refuting the factual evidence which the two sides presented, there are many factors to consider. As the author of this article mentioned, the whole of the research results is a huge factor. This goes hand-in-hand with the personal biases of the people conducting the experiments or gathering research. If, in fact, Durante et al had not told us about their concluding data had it come out negative, their personal bias on the claim would be reflected. All the same, if Harris and Mickes “didn’t use the same tricks ‘in reverse’ to get a null result”, they would be acting on some unseen, personal bias.

    This factor leads me to another detail worthy of examination – the reason for conducting the research in the first place. This is valuable to consider when refuting or accepting someone’s data. The claim, I gather, is that “hormonal changes during the menstrual cycle impact political and religious beliefs”. The reasons for which the claim was made in the first place (thus creating a need for collecting data) play a big part in whether or not I would accept the evidence as relevant grounds. So if Durante et al, let us say, wanted to prove the claim in order to support the opinion that women should not be allowed to vote, then that would be reason to dismiss their factual evidence. Not to say that their findings would not be an interesting and valuable discovery, but they should not account for an avenue to inequality on any scale, in my opinion.

    Those are all things that come to my mind regarding this

  • Wouter

    Lovely bit of mud throwing in the scientific arena.
    I agree that preregistration would decrease the chances of fraudulent scientific behavior, such as p-value hacking. But why not move away from p-values altogether? Obviously I’m referring to Bayesian statistics, which do not suffer from nasty p-value side effects. In addition, Bayesian methods require the scientist to formulate a hypothesis in detailed terms, which would go nicely with preregistration.
    Finally, you mention that p-values can be hacked up and down. However, in orthodox statistics the null hypothesis can never be accepted; the null can be rejected or not, but nothing more. So, if your aim is to prove the null, then orthodox statistics cannot be your weapon of choice anyway.

    • Scott UK

      Can you (or anyone else) point me to a good beginner’s guide to Bayesian statistics (that explains the concepts verbally, preferably with examples, without resort to incomprehensible logic symbols)? Thanks.

      • Wouter

        Several comprehensible introductions to Bayes’ rule can be found on the internet (e.g. this one: http://www.ualberta.ca/~chrisw/BayesForBeginners.pdf).
        However, after reading this it might still be a huge pain to apply Bayes in real life, i.e. to your data. The most applicable instances of Bayesian statistics (in my opinion) are the Bayes factor (Z. Dienes, 2011) or “AIC” and “BIC” values. Explanatory articles on these instances, however, do require to some extent that you understand Bayes’ rule. But to give a brief definition: they depend on the likelihood of the data under one hypothesis over another hypothesis. These likelihoods can be obtained fairly easily from all sorts of data, but often require you to make real quantitative hypotheses.
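The likelihood-ratio idea behind the Bayes factor can be sketched in a few lines. This is a hypothetical coin-flip example, not drawn from either paper; the data and the hypotheses are made up for illustration:

```python
import math

# Hypothetical data: 60 heads in 100 flips.
k, n = 60, 100

def binom_pmf(k, n, p):
    # Probability of exactly k successes in n trials with success rate p.
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

# Bayes factor: likelihood of the data under H1 (p = 0.7, a specific
# biased-coin hypothesis) over H0 (p = 0.5, a fair coin).
bf = binom_pmf(k, n, 0.7) / binom_pmf(k, n, 0.5)

print(f"BF(H1 vs H0) = {bf:.2f}")  # below 1: the data slightly favor H0
```

Unlike a p-value, the Bayes factor can count in favor of the null: a value below 1 says the data are more probable under the fair-coin hypothesis. Notice that H1 had to be specified exactly (p = 0.7), which is the point above about needing quantitative hypotheses.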

        • Scott UK

          Hooray — thanks.

          • poniesinjudah

            Scott, let me know via twitter if those reference things are usable by non-mathematicians. I agree about verbal examples vs incomprehensible logic symbols. Thanks.

          • Neuroskeptic

            A clear way through the Bayes maze.

  • Pingback: Magapsine (05/03/2014) | dronte.es

  • Etienne P. LeBel

    >>>>However, how do we know that Harris and Mickes didn’t use the same tricks ‘in reverse’ to get a null result? It’s easy to forget that questionable research practices can work both ways.

    That’s ridiculous! The reason we know that Harris and Mickes didn’t p-hack to get a failed replication null result is that you cannot p-hack if you’re doing an independent direct replication correctly because you *MUST* use the same procedures, same manipulation, same measures, and same analytic procedures as the original authors. Consequently, you do *NOT* have the flexibility in design and analysis to even *possibly* p-hack!

    • Neuroskeptic

      Not so. They had less flexibility, but still enough. There’s the sample size, possibly also the definition of outliers and the treatment of incomplete data (which are not mentioned).

      More fundamentally though, in the absence of preregistration, we don’t know whether what they present (a direct replication, plus the actual voting bit) was the only thing they tried.

      I’m not saying of course that the number of degrees of freedom is as high as Durante et al originally had. But without prereg, the degrees are there, even in studies like this.

      • Etienne P. LeBel

        The researcher-degrees-of-freedom are basically non-existent, except maybe for sample size choice (for outliers and missing data you simply must follow the criteria used in the original).

        I of course strongly support and recommend the pre-registration of independent replication studies (and this is what I do when I execute independent replications myself; see, e.g., here and here), however I strongly feel that you substantially mischaracterized the situation by implying that independent replicators can exploit just as many researcher-degrees-of-freedom as original authors, which couldn’t be further from the truth.

        • Neuroskeptic

          I did not mean to imply that a replication has as many degrees of freedom, but I was merely saying that, unless preregistered, it still has some.

          The ultimate free parameter being the decision whether to publish at all.

          I also worry that in many cases (although probably not in this case), the decision whether to frame a certain study as ‘a replication of X’ or as, say, a new study merely ‘building on X’, is made after the results are analyzed.

  • Christine Harris

    I am not sure why the Neuroskeptic finds our comment – that the evo-psych/cycle literature as a whole may have a false-positive rate over 5%, and that each purported effect should be assessed on its own merits – a “disingenuous tactic”. Anyone interested should take a look at Harris, Chabot, & Mickes (2013, Figure 1) to see some of the reasons we have a broad concern about the menstrual cycle literature as a whole. But of course, if the results of any given study can be checked out in a direct replication, then those concerns may be allayed in specific cases (hence the value of doing replications).

    As for our study, I can also assure anyone interested that we did not use p-hacking tricks ‘in reverse’ to get null results. We followed the procedures of Durante et al. as closely as possible throughout. In fact, when we first submitted this we had done the categorization of fertility in exactly the way that Durante and colleagues reported having done it in their paper. At the editor’s suggestion, we contacted Durante and found that they had actually calculated fertility in a different way than they had stated! We then spent a great deal of time redoing our analyses, following their actual rather than their described procedure (all this is described in our SOM).

    Why didn’t we preregister? Primarily because when we heard about the Durante et al. work for the first time, 1) the elections were fast approaching and we had to quickly spring into action, and 2) as a direct replication, almost all our choices were fixed in advance anyway. Nonetheless, we do believe that the options for study registration emerging these days will help improve all work, including replications.

    Neuroskeptic, I find it unfortunate that you cast vague unsupported aspersions as to our research methods. If you wanted to know if we had examined any additional measures beyond those reported, you could have simply emailed us and we would have been happy to answer. We did not have any additional unreported measures pertaining to the Durante et al. paper. (We did have women in one of our samples make face preference judgments for a clearly distinct project that will be reported in a separate manuscript.)

    Christine Harris

    • Neuroskeptic

      Thanks for the comment:

      I said the tactic seems disingenuous because I believe – as you rightly said – that every study and claim should be judged on its own merits. But whether the literature as a whole is full of p-hacked false positives is not a merit or demerit for any particular study, so I don’t think it’s relevant to the discussion.

      Note that I never questioned your assertion that the literature is shaky! Honestly, I believe you. I’m just saying that it wasn’t relevant in this instance and that your mention of it sits oddly with what you said about judging on individual merits. Of course, in other contexts, it would be very relevant, e.g. when writing a meta-analysis.

      I am very glad to hear that you didn’t p-hack! Yet I don’t think I was “casting vague unsupported aspersions” when I said that, given the absence of preregistration, a researcher in your position might have done so. Because my point was a general one, which is why I began by saying that Durante et al or any researchers in their position might have done so-and-so as well.

      I’ve long believed that the problem of researcher degrees of freedom is a problem with the system, not the personalities within it, and I hope that my post didn’t give the impression of a personal accusation.

  • Pingback: Preregistration: what’s in it for you? « Statistical Modeling, Causal Inference, and Social Science

  • Pingback: Women, Ovulation and Voting in 2016 | graph paper diaries


About Neuroskeptic

Neuroskeptic is a British neuroscientist who takes a skeptical look at his own field, and beyond. His blog offers a look at the latest developments in neuroscience, psychiatry and psychology through a critical lens.
