The Voodoo Curse of Circular fMRI Analysis

By Neuroskeptic | October 28, 2017 4:12 pm

Remember the ‘voodoo’ fMRI controversy? Well, I just came across a new voodoo-ish paper – just in time for Halloween.

The study, published in Neuropsychopharmacology, comes from Franziska Plessow and colleagues of Boston. The main claim is that a dose of oxytocin reduced the response of “reward-related food motivation brain regions” to pictures of high-calorie foods, suggesting that the hormone might be of use in the treatment of obesity.

However, I have some questions.

In their key analysis, Plessow et al. compared the fMRI response to high-calorie stimuli (over control stimuli) on oxytocin vs. placebo. The ‘blobs’ from this contrast within the brain’s reward-related VTA region can be seen below as (a). The bar chart (b) shows the signal extracted from the peak voxels of this contrast:

[Figure: plessow_fmri]

Here’s the description of this method:

Oxytocin-related changes were tested using contrasts that reflected the difference in beta weights between oxytocin high-calorie food images minus oxytocin non-food objects vs placebo high-calorie food images minus placebo non-food objects. For the primary confirmatory analysis of oxytocin effects on VTA activation, we used max voxel coordinates from the left and right hemispheres of the group level statistical maps at the centers of 2.5mm spheres and the beta values using Marsbar software.

The trouble is that this is a circular (aka ‘voodoo’) analysis strategy. If we select the peak coordinates of the oxytocin-placebo contrast, the values extracted from these peaks (i.e. (b) above) are likely to show an oxytocin-placebo difference, but this might just be a chance difference. So, while Plessow et al. report statistically significant oxytocin-placebo differences in the signal extracted from the peak voxels, the circularity makes these invalid.
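To make the circularity concrete, here is a minimal simulation sketch. It is not the authors' pipeline: the voxel count, the number of simulations, the function names and the use of a single peak voxel (rather than a 2.5mm sphere) are all assumptions made for illustration, and voxels are treated as independent, which real fMRI voxels are not. The data are pure noise, so any "significant" result is a false positive. Under these assumptions, testing the voxel that was picked because it had the largest difference should come out significant nearly every time (analytically about 1 - 0.95^100, or roughly 0.99, for 100 independent voxels), whereas testing a voxel chosen in advance should stay near the nominal 5%.

```python
import math
import random

# Two-tailed 5% critical value of Student's t with 9 degrees of freedom
# (10 subjects, paired design).
T_CRIT_DF9 = 2.262


def paired_t(diffs):
    """One-sample t statistic for a list of within-subject differences."""
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    return mean / math.sqrt(var / n)


def false_positive_rates(n_subjects=10, n_voxels=100, n_sims=300, seed=0):
    """Return (circular_rate, independent_rate) on data with no true effect.

    All parameter values are illustrative assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    circular_hits = 0
    independent_hits = 0
    for _ in range(n_sims):
        # diffs[v][s]: drug-minus-placebo difference at voxel v for subject s.
        # Drawn from N(0, 1), so the true effect is zero everywhere.
        diffs = [[rng.gauss(0.0, 1.0) for _ in range(n_subjects)]
                 for _ in range(n_voxels)]
        t_values = [paired_t(d) for d in diffs]
        # Circular: pick the peak voxel of the contrast, then test that voxel.
        if max(abs(t) for t in t_values) > T_CRIT_DF9:
            circular_hits += 1
        # Independent: test a voxel fixed before looking at the data.
        if abs(t_values[0]) > T_CRIT_DF9:
            independent_hits += 1
    return circular_hits / n_sims, independent_hits / n_sims


if __name__ == "__main__":
    circular, independent = false_positive_rates()
    print("false positive rate, peak voxel selected from the same data:", circular)
    print("false positive rate, voxel chosen in advance:", independent)
```

With spatially correlated voxels the inflation would be smaller than in this sketch, but the direction of the bias is the same: selection and testing on the same data guarantees an optimistic result.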

The way to ensure that a difference isn’t due to chance is to do an analysis over the whole area of interest, with no selection but corrected for multiple comparisons. As far as I can see, when Plessow et al. report blobs, these are not corrected for multiple comparisons in any way (the threshold is p<0.05 uncorrected). The study sample was also small, consisting of only 10 obese or overweight participants.
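As an illustration of what "no selection but corrected for multiple comparisons" can look like, here is a hedged sketch of one common approach: a sign-flipping permutation test on the maximum statistic across all voxels in the region. This is a swapped-in technique chosen for the example, not something the paper or any particular toolbox is claimed to do, and the function names and defaults are hypothetical. The idea is that, if there is no drug effect, each subject's drug-minus-placebo difference is equally likely to be positive or negative, so randomly flipping signs and recording the largest |t| over the region gives a null distribution for the peak.

```python
import math
import random


def t_stat(diffs):
    """One-sample t statistic for a list of within-subject differences."""
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    return mean / math.sqrt(var / n)


def max_t_threshold(diffs_by_voxel, n_perm=1000, alpha=0.05, seed=0):
    """Estimate a familywise-corrected |t| threshold by sign-flipping.

    diffs_by_voxel[v][s] is the drug-minus-placebo difference at voxel v
    for subject s. The same sign flip is applied to a subject at every
    voxel, which preserves the spatial correlation between voxels.
    """
    rng = random.Random(seed)
    n_subjects = len(diffs_by_voxel[0])
    max_stats = []
    for _ in range(n_perm):
        signs = [rng.choice((-1.0, 1.0)) for _ in range(n_subjects)]
        max_stats.append(max(
            abs(t_stat([s * d for s, d in zip(signs, voxel)]))
            for voxel in diffs_by_voxel
        ))
    max_stats.sort()
    # Approximate (1 - alpha) quantile of the permutation max-|t| distribution.
    return max_stats[min(n_perm - 1, int((1 - alpha) * n_perm))]


def surviving_voxels(diffs_by_voxel, threshold):
    """Indices of voxels whose observed |t| exceeds the corrected threshold."""
    return [i for i, voxel in enumerate(diffs_by_voxel)
            if abs(t_stat(voxel)) > threshold]
```

A voxel that survives this threshold can be reported without a separate, circular test on its extracted values. Note that with 10 subjects there are only 2^10 = 1024 distinct sign patterns, so the resolution of such a test is limited at small sample sizes.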

So overall, this paper is a bit of a Halloween horror story.

A harsh verdict perhaps, but bear in mind that senior author Elizabeth A. Lawson reports having a rather relevant conflict of interest:

Dr Lawson is a member of the scientific advisory board of OXT Therapeutics and has a financial interest in the company.

The goal of OXT Therapeutics lies in “Developing Novel Oxytocin Analogues: An Approach for Obesity and Metabolic Disorders”. So it’s fair to say that these results could be of commercial value for the company, and might end up guiding drug development, which to my mind makes it important to check their validity.

h/t: Nils Kroemer

  • Bernard Carroll

    I had to smile at the mention of Marsbar software. Did they also use Mars Bars for the high calorie food images?

    • Neuroskeptic

      Well spotted!

  • Christophe Phillips

    This looks like some results obtained with SPM (don’t have access to the full paper to check this out). I always find it suspicious when someone feels the need to 1/ extract values from the data, picked under some SPM contrast, and 2/ perform further testing in another statistical software…

    • TheRealTruthWillBlowYourGDMIND

      Marsbar is a toolbox for SPM, btw.

  • Uncle Al

    A problem with all physiological MRI is broad linewidths. I propose magic angle (arccos [sqrt(1/3)] versus field) spinning the subject to substantially narrow the lines.

  • Bon Obo

    The yellowish spots on the anatomical scan show statistically significant voxels (at p < 0.05 uncorrected). Agreed that this is not strong statistical evidence given the multiple-comparison problem mentioned, and the t-values on the colormap are pretty low anyway. But this seems to be what they have, they report it, and that is OK. If I understand correctly, the bar plots are just presented to show how this weak interaction arises. There could be many scenarios leading to an interaction, and we see that OX decreases BOLD at the double-dipped voxels, but the intention was anyway to show what is happening at those voxels. This information is not visible on the yellow map. With the statistical point already made in the yellowish map, however weak it is, I don't really see how circularity arises here for the bar plots. Would you mind saying more about it?

    • Neuroskeptic

      Thanks for the comment. You’re right that the bar chart would not constitute circularity if it were just an illustration, but the Results section makes it clear that a circular analysis was done:

      “VTA Activation Following Oxytocin Administration vs Placebo

      Following oxytocin administration, compared to placebo, participants showed hypoactivation in the left VTA (−6 −13 −14) in response to viewing high-calorie food stimuli compared with the objects, p=0.004, d=0.91. We observed a matching hypoactivation in the right VTA (6 −19 −17), p=0.015, d=0.73 (Figure 1)”

      • Bon Obo

        Thanks for the reply. As you might have guessed I didn’t read the paper. I agree with you that the claims about hypoactivation are not supported.

  • diana kornbrot

    What’s more, the results seem to be averaged over participants.
    How many participants actually showed the effect?
    See for false discovery

  • CuriousDude

    Perhaps I’m not fully qualified to understand, but I fail to see the real problem with this paper. It seems to me that the authors could have cherry-picked the region of the brain they are interested in for analysis, but that might have seemed like a biased approach. Instead they tried a more unbiased whole-brain approach and (if I understand your criticism correctly) this then requires multiple hypothesis correction because it is technically testing the possibility that any region of the brain could show different activity.

    But wouldn’t it be a remarkable coincidence if their exploratory analysis picked up the region they most suspected to be involved and there wasn’t some sort of causal link? Furthermore, wouldn’t it be perhaps more dishonest if the authors chose not to include these results?

    Obviously a lot of these problems would be easier to solve/discuss if they had a higher sample size, but one assumes they must be aware of those limitations.

    • adullard

      The problem is that the selection of voxels for analysis was biased to begin with: it wasn’t a whole-brain approach; they specifically chose the highest-responding voxels. If I select the person in a room with the highest IQ and then test to see if they have a higher IQ than everyone else… you see the problem. Of course the maximum-responding voxels respond more than the rest.



No brain. No gain.

About Neuroskeptic

Neuroskeptic is a British neuroscientist who takes a skeptical look at his own field, and beyond. His blog offers a look at the latest developments in neuroscience, psychiatry and psychology through a critical lens.

