Brain Scanning – Just the Tip of the Iceberg?

By Neuroskeptic | March 21, 2012 7:45 am

Neuroimaging studies may be giving us a misleading picture of the brain, according to two big papers just out.

By big, I don’t just mean important. Both studies made use of a much larger set of data than is usual in neuroimaging studies. Thyreau et al scanned 1,326 people. For comparison, a lot of fMRI studies have more like n=13. Gonzalez-Castillo et al, on the other hand, only had 3 people – but each one was scanned while performing the same task 500 times over.

Both studies found that pretty much the whole brain “lit up” when people were doing simple tasks. In one case it was seeing videos of people’s faces; in the other it was deciding whether stimuli on the screen were letters or numbers.

With all that data, the authors could detect effects too small to be noticed in most fMRI experiments, and it turned out that pretty much everywhere was activated. The signal was stronger in some areas than others, but it wasn’t limited to particular “blobs”.

So conventional fMRI experiments may just be showing us the tip of the iceberg of brain activity. In a small study, only the strongest activations pass the statistical threshold to show up as blobs, but that doesn’t mean the rest of the brain is inactive. It just means it’s less active. The idea that only small parts of the brain are ‘involved’ in any particular task may be a statistical artefact.
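The thresholding effect is easy to simulate. Here’s a toy sketch of my own (nothing to do with either paper’s actual pipeline): voxels with strong, weak, and very weak true effects, tested with an uncorrected one-sample t-test at n=13 and at n=1326.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def fraction_detected(effect_d, n, n_voxels=2000, alpha=0.001):
    """Fraction of simulated voxels with true effect size `effect_d`
    (in units of the noise SD) that pass an uncorrected one-sample
    t-test at threshold `alpha` with `n` subjects."""
    data = rng.normal(loc=effect_d, scale=1.0, size=(n_voxels, n))
    p = stats.ttest_1samp(data, 0.0, axis=1).pvalue
    return float(np.mean(p < alpha))

for d in (1.0, 0.3, 0.1):  # strong, weak, very weak true activations
    print(f"d={d}: {fraction_detected(d, 13):.0%} of voxels survive at n=13, "
          f"{fraction_detected(d, 1326):.0%} at n=1326")
```

With 13 subjects, only the strongest effects reliably cross the threshold; with 1,326, even the tiny ones do. The sub-threshold voxels were never “inactive” – they were just below the waterline.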

In fact, I wonder if the whole idea of treating statistically significant blobs as qualitatively different from nearly-significant areas is itself a form of the classic error of comparing significance levels instead of directly testing the difference between effects?

As if that wasn’t enough, Gonzalez-Castillo et al further show that there are lots of activations in the brain – even to very simple stimuli – that might go undetected in conventional studies, because they don’t follow the time-course predicted by the usual models.

Have a look –

This shows the average neural activation from various regions of the brain during a letter-number task. The two areas I’ve highlighted in red are the primary visual cortex, and they do follow the expected ‘boxcar’ pattern – the brain is active when the stimuli are on the screen, inactive when they’re not. But you can see that all kinds of other brain areas are also responding to the stimuli – just in different ways.
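For anyone unfamiliar with the jargon: the “boxcar” is the predicted time-course that standard analyses look for – the on/off task blocks, smoothed and delayed by a model of the haemodynamic response. A minimal sketch, using a generic textbook double-gamma HRF (not the exact model used in either paper):

```python
import numpy as np
from scipy.stats import gamma

TR = 2.0        # assumed repetition time, seconds
n_scans = 120
t = np.arange(n_scans) * TR

# 20 s task / 20 s rest blocks: the 'boxcar'
boxcar = ((t % 40) < 20).astype(float)

def hrf(tt, peak=6.0, undershoot=16.0, ratio=1/6):
    """Generic double-gamma haemodynamic response function."""
    return gamma.pdf(tt, peak) - ratio * gamma.pdf(tt, undershoot)

h = hrf(np.arange(0.0, 32.0, TR))
predicted = np.convolve(boxcar, h)[:n_scans]  # expected BOLD time-course
```

Voxels whose measured signal correlates with `predicted` show up as blobs; the point of the figure is that many regions respond reliably to the task, but not in this particular shape.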

For example, the left primary motor cortex was activated during the task. That area controls the right hand, and that makes sense, as people responded by pressing buttons with the right hand. But interestingly, the same area on the other side of the brain was deactivated at exactly the same time, even though people weren’t doing anything with their left hand.

These papers illustrate the fact that conventional fMRI is a blunt instrument that often only tells us about the most obvious events that happen in the brain. A bit like how we only hear the shouts and screams through our neighbors’ walls, not their normal conversations, which aren’t loud enough to reach our ears.

That’s the bad news, but every blob has a silver lining. fMRI is clearly more powerful than most neuroscientists have realized, and this holds out hope for cracking some of the trickiest questions. As Gonzalez-Castillo et al put it:

This result helps narrow the gap between thousands of fMRI manuscripts showing limited activation in response to tasks and cognition theories that defend that cognition—understood as the process of “configuring the way in which sensory information becomes linked to adaptive responses and meaningful experiences”—can only result from the distributed collaboration of primary sensory, upstream and downstream unimodal, heteromodal, paralimbic, and limbic regions… [we were able to] switch from a regime where activity detection relates primarily to sensory processing to a more sensitive regime, where activity detection includes also cognitive processes with subtler BOLD signatures.

Link: See also the interesting discussion here: Surely, God loves the .06 (blob) nearly as much as the .05.

Thyreau, B., Schwartz, Y., Thirion, B., Frouin, V., Loth, E., Vollstädt-Klein, S., Paus, T., Artiges, E., Conrod, P., Schumann, G., Whelan, R., and Poline, J. (2012). Very large fMRI study using the IMAGEN database: Sensitivity–specificity and population effect modeling in relation to the underlying anatomy. NeuroImage. DOI: 10.1016/j.neuroimage.2012.02.083

Gonzalez-Castillo, J., Saad, Z., Handwerker, D., Inati, S., Brenowitz, N., and Bandettini, P. (2012). Whole-brain, time-locked activation with simple tasks revealed using massive averaging and model-free analysis. Proceedings of the National Academy of Sciences. DOI: 10.1073/pnas.1121049109

  • Geraint Rees

    This (the Thyreau paper) empirically illustrates the difference between biological significance and statistical significance. In a linear framework (as is used for almost all neuroimaging analyses, and indeed most biological analyses), as the number of participants increases, the standard error shrinks in proportion to one over the square root of the number of participants. This is expected behaviour of the statistical tests, and means that at very high sample sizes what is statistically significant may be biologically irrelevant. As it's a property of the statistical tests, not functional MRI or the brain, it will be observed for any biological phenomenon subject to analysis in a similar way.
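    (To make the scaling concrete, a quick toy calculation – my own illustration, not anything from the papers: the t statistic for a fixed tiny effect grows roughly as the square root of the sample size.)

```python
import numpy as np

def t_stat(n, effect=0.05, seed=1):
    """One-sample t statistic for a tiny fixed effect (5% of the noise SD)."""
    rng = np.random.default_rng(seed)
    sample = rng.normal(effect, 1.0, size=n)
    return sample.mean() / (sample.std(ddof=1) / np.sqrt(n))

# The standard error shrinks as 1/sqrt(n), so t grows roughly as sqrt(n):
for n in (16, 256, 4096, 65536):
    print(f"n={n:>6}: t = {t_stat(n):5.2f}")
```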

    It most definitely should not be taken to mean that the whole brain 'lights up' in any biologically meaningful sense. Nor should it be taken to indicate that neuroimaging data analyses are in some way flawed – it's a feature of any large sample size in any area. In effect it means that statistical significance is not a relevant concept unless you take into account effect size and biological significance i.e. interpret the findings.

  • Jayarava

    Are we really surprised that the whole brain is active when we are active?

    It has always struck me as simplistic to say that a certain area is active with the implication that other areas are inactive. Surely how we perceive and respond to stimuli is a function of the whole brain all the time if only because of the massive amount of interconnection?

    Is the problem that we mistake reductive explanations for reductive phenomena? So if an area stands out against the background we simply treat the background as non-existent?

  • Federico Turkheimer

    As discussed elsewhere:

    1. Is brain mapping localizing anything?
    Current statistical methodology (mass univariate) does not allow any inference of localization, whether you have a very large sample or not. It only establishes an association between a factor and voxels. To claim localization one needs to use double dissociation (Jernigan et al., 2003), that is, to test that the association in X is significantly greater than anywhere else in the brain. No one is doing this, so no one can claim any localization from what he or she has been doing.

    2. Is the global null hypothesis true (e.g. should we correct for multiple comparisons)?
    That depends on how we believe the brain works. In Bayesian terms a multiple comparison correction equates to the a-priori belief that the activation will be very sparse. Do we have evidence in this sense? I dare say not. These papers help as they demonstrate biologically (not statistically) meaningful patterns. All those doing pattern analysis or working on covariances (ICA, network analysis etc.), which is roughly the other 50% of the field, obviously hold the opposite belief, as a system with long-distance covariance cannot exhibit localized activity only.

    3. Is the problem in the analytics or elsewhere?
    Tononi et al 15 years ago wrote:
    “Traditionally, localizationist and holist views of brain function have exclusively emphasized evidence for either functional segregation or for functional integration among components of the nervous system. Neither of these views alone adequately accounts for the multiple levels at which interactions occur during brain activity”
    [Tononi G, Sporns O, Edelman, PNAS, 1994]

  • Neuroskeptic

    Geraint Rees: Thanks for the comment. In a sense, yes, all this shows us is that as you increase the statistical power, you find smaller effects. Which is not surprising in itself.

    However, the question this raises – and I think it is a question specific to neuroimaging – is how large a sample you need to detect “biologically meaningful” activity in the brain.

    Supposing that fMRI was much more expensive than it really is, and all fMRI studies had sample sizes of 1 or 2 people, we might only have found the very largest BOLD effects (in primary sensory and motor cortex) by now.

    In that case we would clearly be missing many biologically meaningful activations. We can say that with confidence because we can detect smaller effects and we know they're interesting. But in expensive-fMRI-world it wouldn't be so easy to work that out.

    Likewise, we may look back on the current era of functional imaging as a time when we were only detecting the biggest effects, not the most biologically significant ones.

  • NeuroKüz

    I think these papers nicely reinforce the notion that a “negative finding” in fMRI is meaningless. Unfortunately some authors of studies with 10 to 20 subjects argue that certain brain regions are “not involved” in a particular process because these regions were not significantly activated in the study.

    I would expect that the number of subjects needed to detect biologically meaningful activity in the brain would be dependent on the level of activation evoked by the specific task/stimulus, so this would be impossible to calculate for a novel task.

    Another problem is sort of the opposite of the one raised here – the ceiling effect (i.e., the fMRI BOLD signal has a maximum). Maybe most brain regions are significantly activated if you have a large number of subjects, but the differences in degree of activity between regions are what matter. The ceiling effect can prevent us from determining whether this is the case.

    So taken together, I think this all provides more impetus for moving toward multimodal analyses to figure out how much of what BOLD fMRI is telling us is biologically interpretable. Large-scale studies using tools like ASL and MEG (and stimulation tools like TMS) could help.

  • Tom Johnstone

    To some extent the same problem holds for any somewhat under-powered research; only the largest effects will be statistically significant. However, most fMRI research is at least semi-exploratory, because we conduct many different tests and look for where differences are significant. This differentiates it from, say, a simple reaction time experiment, where we have a single test to make. As long as we correct for multiple comparisons, the conclusion that area A shows a difference is not statistically problematic. But the implication that other areas do not is a problem, as pointed out in these articles.

    This problem of lack of proper localisation is compounded by the fact that we don't really know what a biologically significant effect is when using fMRI. That's because fMRI yields correlational, not causal information and because the BOLD signal measured in fMRI experiments is arbitrarily scaled and doesn't have a sufficiently well-characterised relationship to underlying neural activity. So its magnitude cannot be converted to a meaningful biological quantity.

    There are potential solutions to such problems. One is to look for double-dissociations, statistically tested as task/condition by brain region interactions. This has been done in a limited number of brain imaging studies, most commonly to test for hemispheric laterality effects.
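    Statistically, such a double dissociation reduces to a test on the difference of within-subject differences. A sketch with hypothetical numbers (regions, effect sizes and sample size all made up for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n = 50  # subjects

# Hypothetical per-subject BOLD estimates (arbitrary units):
# region A 'prefers' task 1, region B 'prefers' task 2.
a_task1 = rng.normal(1.0, 1.0, n)
a_task2 = rng.normal(0.2, 1.0, n)
b_task1 = rng.normal(0.2, 1.0, n)
b_task2 = rng.normal(1.0, 1.0, n)

# Region-by-task interaction: difference of within-subject differences.
interaction = (a_task1 - a_task2) - (b_task1 - b_task2)
result = stats.ttest_1samp(interaction, 0.0)
print(result.statistic, result.pvalue)
```

A significant interaction licenses the claim that region A is differentially involved in task 1 relative to region B – a much stronger claim than “A was significant and B wasn’t”.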

    Another solution is to use brain stimulation (e.g. TMS or DCS) to test the causal role of activated regions.

    As pointed out by Tononi and others, searching for localised regions might be inherently flawed if the brain is not organised in that way, though assumptions of at least some degree of modularity seem to be reasonable on the basis of animal and human lesion and stimulation studies.

  • practiCal fMRI

    Technical comment: the ipsilateral motor cortex should be inhibited (deactivated), so that's a good finding! (It would worry me if that weren't seen.) This was the approach Stefanovic and Pike took in 2004 to show why deactivation was neurally meaningful, and not “blood steal” or some other purely vascular effect.

    Also, most studies are designed to contrast this condition against that control condition, and the choice of control is all-important. (BOLD being a relative measure, and requiring some sort of baseline, is one of the big hurdles in the method.) Thus, most “background” (de)activations are designed to subtract out, leaving the stats to work on a few nodes. So, yes, in that sense the common task-based fMRI experiment is designed to push the tide line up the iceberg! And for sure we must be leaving “information” behind, which is what it appears Gonzalez-Castillo et al are addressing.

  • practiCal fMRI

    Afterthought… I wonder how much of the submerged iceberg is due to poor subjects. We continue to recruit large N of fMRI-naive subjects for many studies. But as the Bandettini group and others have shown, perhaps there's more information (less “noise”) available when you focus on a handful of highly motivated, experienced subjects. False positives and false negatives should be less of a concern the better the subjects…

  • Geraint Rees


    “However, the question this raises – and I think it is a question specific to neuroimaging – is how large a sample you need to detect “biologically meaningful” activity in the brain.”

    It's not specific to neuroimaging. For example, it applies to single unit electrophysiology (how many neurons do you need to detect 'biologically meaningful' differences in spike rate). Or even specific to the brain (how many blood samples do you need to detect 'biologically meaningful' differences in serum potassium)

    I suspect a Bayesian would say you can do this with posterior probability maps provided you know the expected effect size for 'biologically meaningful' differences. And you can do this for all other areas of biology too.

  • Dan H

    Glad to read this discussion.

    @Geraint, This is more than a property of statistics. Yes, the standard error can decrease with larger samples. Still, if the underlying effect is null and the statistical test is unbiased, the effect magnitude should also approach zero. This is why, in Gonzalez-Castillo's paper, we repeated our analyses on a phantom (Figure 3). No matter how much data we collected from a sphere of liquid, we didn't get statistical significance. We present several analyses and results beyond significance maps to interpret our findings and increase the likelihood that we are observing whole-brain neural metabolism changes. There could be non-neural explanations, but these findings are a direct result of the biological system being measured and not a quirk of statistics. Looking at the Thyreau methods, I don't see anything that would mathematically bias them towards significance with large samples.

    @Jayarava, No one should be surprised that the whole brain responds to a task. The surprising part is that fMRI has the sensitivity to both see that response and resolve subtle response-shape variations across the entire brain. In my very biased opinion, that's pretty cool.

    @Federico, I agree that the global null hypothesis is false and we need to spend less time worrying about what crosses an arbitrary threshold (like some recent blog discussions) and more figuring out robust ways to interpret the meaning of response-shape changes. Double dissociations and most commonly used methods don't address some of our findings. For example, we go beyond significance maps and apply clustering to segment the brain. As Neuroskeptic notes, we clearly see some regions with traditional box car responses and others with every type of signal increase/decrease and onset/offset transients. Supplementary figure 8 shows how a traditional box car analysis wouldn't detect a difference between a sustained response and a response with only onset and offset transients. This is actual data, not a thought experiment, and the most common methods for testing double dissociation would completely miss this.
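    The general point can be illustrated with a toy simulation (an illustration only, not the paper's analysis): a response confined to block onsets and offsets projects almost entirely outside the standard boxcar regressor, so its GLM beta comes out near zero even though the response is perfectly reliable.

```python
import numpy as np

TR, n_scans = 2.0, 200
t = np.arange(n_scans) * TR
block = ((t % 80) < 40).astype(float)  # 40 s on / 40 s off blocks

def convolve_hrf(x):
    """Convolve with a crude single-gamma HRF (illustration only)."""
    ht = np.arange(0.0, 24.0, TR)
    h = ht ** 5 * np.exp(-ht)
    return np.convolve(x, h / h.sum())[:n_scans]

sustained = convolve_hrf(block)                                # classic boxcar response
transient = convolve_hrf(np.abs(np.diff(block, prepend=0.0)))  # onset+offset bumps only

def boxcar_beta(y):
    """GLM beta for the standard boxcar regressor (plus intercept)."""
    X = np.column_stack([convolve_hrf(block), np.ones(n_scans)])
    return np.linalg.lstsq(X, y, rcond=None)[0][0]

print(boxcar_beta(sustained))  # the boxcar model fits this perfectly
print(boxcar_beta(transient))  # close to zero: a real, reliable response the model misses
```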

  • Dan H

    (Continuing my comment to Federico) I don't expect radical changes in the short-term because the perfect analysis methods don't exist and this is an impractical amount of data for everyone to collect, but, with expected MRI scanner and data quality improvements over the next decade, similar results should be achievable in a more practical amount of time. What then?

    @Neuroskeptic, There are several ways to interpret “biologically meaningful.” For example, Murphy & Garavan (NeuroImage, 2004) show that studies with 20 subjects will get roughly the same activation blobs on each replication. That's meaningful. The areas that have the strongest responses are meaningful. Primary visual and primary auditory cortices aren't equally important for visual processing just because we show they both respond to flashing checkerboard stimuli. The stroke and lesion literature makes this quite clear. There are obviously more things happening beyond these strongest responses, but that points to a future of more interesting and subtle understandings of brain function thanks to fMRI. I think the regret looking back will be less our current focus on blobs and more the situations where false assumptions regarding the range of response shapes caused us to misinterpret our existing data.

    @NeuroKuz, I don't think these findings show a risk of a BOLD ceiling response. Based on CO2 breathing and breath holding studies with fMRI, we know there's quite a large range of BOLD magnitudes. Even clustered across large brain areas, we see significant response magnitude changes from less than a quarter percent to several percent (Figure 4). Given high enough temporal signal to noise, that's a lot of dynamic range to play with.

    I'm also not sure how much MEG will help with this. Most MEG source localization needs to assume some spatial spread of responses. If someone collected enough MEG data to make the entire brain measurably active, with contributions from everywhere and some regions having much larger responses, I wonder if existing localization methods would break down.

    @practiCal, You doubted we'd get the expected motor cortex responses?! :)
    While experienced volunteers help, the ultimate clinical applications of fMRI require robust methods that work in naive volunteers. Pushing the limits of our tools requires experienced volunteers, but the end result needs to be robust methods that work for everyone.

  • Geraint Rees

    @Dan H.

    I'm only commenting on the Thyreau paper – busy day and haven't had time to read the other one. You're missing the point, which is that an effect may be statistically significant (because you've got a huge sample size) but biologically irrelevant. Of course, just because an effect size is tiny doesn't *necessarily* mean it's biologically irrelevant. But you can't use statistics to tell that, nor does imaging a phantom and finding no tiny effects tell you that a tiny effect in a very large sample of brains must be biologically significant.

  • practiCal fMRI

    @Dan H: “You doubted we'd get the expected motor cortex responses?! :)”

    Au contraire! Just letting NS know that this is expected. From the post:

    “For example, the left primary motor cortex was activated during the task. That area controls the right hand, and that makes sense, as people responded by pressing buttons with the right hand. But interestingly, the same area on the other side of the brain was deactivated at exactly the same time, even though people weren't doing anything with their left hand.”

    An aside: anyone wanting the ultimate BOLD test experiment – leaving aside some sort of respiratory challenge – should try left vs right motor activation. The baseline of the ipsilateral cortex goes down, hence delta-S goes way up! :-) And it's motion-controlled because both hands are moving in the control and task condition. Nice!

    As for robustness… AGREED!!!!

  • DS

    Dan H

    In the supplemental methods section of the Gonzalez-Castillo paper there was this description of the receive coil:

    “… a custom 16-element receive-only surface coil brain-array (Nova Medical) was used for two subjects …”

    Would you please tell me a bit more about this head array. In particular:

    (1) Can you describe its geometry a bit more?

    (2) How does GE combine the images from the 16 coil elements into a single composite image? Standard sum-of-squares (SOS)?

    (3) If SOS is used did you also use any sort of normalization to eliminate the image contrast due to the inhomogeneous receive field generated by the SOS combine?


  • DS

    Nothing to do with this topic really, but what is the time lag between ipsilateral and contralateral motor strip fMRI activity? I am sure that Dan H or practiCal knows the answer to this question. I just don't … and I am willing to admit it.

  • Neuroskeptic

    Geraint Rees: “It's not specific to neuroimaging. For example, it applies to single unit electrophysiology (how many neurons do you need to detect 'biologically meaningful' differences in spike rate). Or even specific to the brain (how many blood samples do you need to detect 'biologically meaningful' differences in serum potassium)”

    You're right, it's not specific to neuroimaging as such. My bad wording. What I meant was, although the general problem of statistical significance vs. biological meaningfulness is not specific to neuroimaging, the issue is especially acute in the case of imaging because we don't know what a 'biologically meaningful' difference in neural activation is.

    For serum potassium, we know there is a normal range, within which differences are not associated with health outcomes. So we can say with some certainty that differences below X are unimportant.

    For BOLD, we don't know that, yet.

  • Dan H

    @Neuroskeptic & Geraint,
    The issue of statistical significance & biological meaning has little to do with sample size. A larger sample means it's more likely an effect is real vs resulting from random fluctuations, but biological interpretation is always a concern regardless of effect size. For this type of study, the big question is whether something besides blood oxygenation changes could cause these effects. The phantom data show the results aren't inherent to our hardware or our data processing methods. We recorded pulse and chest movement & used standard methods to remove signals that correlate with heart rate & breathing. Those methods aren't perfect, but they decrease the chance that our results are a function of global pulse & respiration changes. We looked carefully for evidence that head motion might explain some of the results.

    While we can't be certain, there's a good chance that the observed effects are regional changes in blood oxygenation. We obviously don't understand the full relationship between blood oxygenation changes and neural metabolism, but, whatever the relationship, changes in blood oxygenation have a biological meaning.

    Unless we go back to the pre-fMRI dark ages of neuroscience where our scientific ancestors thought, “If it doesn't cause axonal spiking it doesn't matter,” my default assumption is we shouldn't discount something that causes any changes in the brain without very good justification. :)

  • Dan H

    You can read more than you probably want to know about the 16-channel coil configuration at:
    de Zwart, J.A., et al, 2004. Signal-to-noise ratio and parallel imaging performance of a 16-channel receive-only brain coil array at 3.0 Tesla. Magn Reson Med 51, 22–26.
    We used standard GE reconstruction (I think SOS or root SOS). Each time series was intensity normalized by dividing by its mean, which would supersede any single volume corrections of contrast.

    If you want to know time lags in various areas of our data, figure it out yourself! As listed in the paper, we uploaded data to (username: gonzalezcastilloj). The voxel-wise finite impulse response model fits for each volunteer are there.

  • DS

    Hi Dan H

    Thanks for the references about the coil
    geometry. I will check it out.

    Was the normalization to the mean done before or after motion correction?


  • DS

    Hi Dan

    Please disregard my question about the normalization. I just took a look at the supplemental methods section and the answer was there. Normalization was performed following motion correction.

  • gregory


    there is a reason this description of neuroscience has life

  • Geraint Rees


    You said “For serum potassium, we know there is a normal range, within which differences are not associated with health outcomes. So we can say with some certainty that differences below X are unimportant.”

    This is just not true! Whether or not serum potassium differences in the normal range are associated with health outcomes is an active area of research (with thanks to @Joel_Winston for pointing this out today!).

    For example, here's a community cohort (the Framingham study) where the investigators are interested in working out whether different levels of serum potassium (and magnesium) are associated with prevalence of particular arrhythmias, because the literature shows conflicting results indicating that it is difficult to establish whether there is a particular effect size (difference in potassium levels) associated with biological outcome (difference in arrhythmias).

    This is exactly analogous to the neuroimaging situation you outline, where it is difficult to establish whether a particular effect size (difference in BOLD signals) is associated with biological outcome (difference in behavior).

    It is really striking how people seem determined to pin particular issues on neuroimaging, when in fact they are generic biological issues. That doesn't make them any the less interesting of course!

  • DS

    In the history of science there has always been more below the tip of the iceberg. The question has always been whether the iceberg is sufficiently exposed to allow us to build useful models about its behavior. If not, science has looked under the water line at increasing depths, building and testing models as it goes, until a useful one is obtained.

    The good news contained in this work, should it stand up to further scrutiny, is that there may be access to more of the iceberg.

  • Ivana Fulli MD

    Thanks to all of you for trying to do good science.

    Please remember also to give healthily modest titles to your publications.

    Last week at a conference I heard an aspie with a PhD in psychology saying that a lab test for autism with brain scanning will be available soon…

    Of course I kept my mouth shut, not being of the ultracrepidarian sect, when she was so full of knowledge – and also I had a more important objection to make to her lecture.

  • Nitpicker

    Sorry for the late contribution but I wanted to make sure to have read the papers in question prior to commenting. First of all, hats off to the heroic effort that went into both of those experiments (particularly to the subjects in the Gonzalez one who endured all those scans).

    As has already been pointed out, statistical significance does not imply biological significance. Biological significance can only be established using complementary approaches, say, by targeting regions with TMS. Based on what we currently know, it is unlikely that many of the regions reported as weakly activated in these two studies play any causal role in visual processing.

    To my mind, the most likely explanation for many of these activations “below the waterline” is that they are at least to some degree artifactual: corrections for head motion and physiological confounds are far from perfect. Just because you ran motion correction does not mean the data are rid of motion artifacts – which is the reason why motion parameters are often used as covariates in the GLM. Yet even regressing out these covariates does not entirely clean up the data. If there are even tiny residual motion artifacts correlated with the task, they would become detectable with this amount of data. Normally they would be well below the statistical threshold, but in this case they can become significant. I deem this a likely explanation for some of the time series shown in Gonzalez-Castillo et al. In some regions the average response is just a squiggly line barely different from baseline – this looks a lot more like averaged correlated noise than a biologically meaningful signal.

    Note that even if things like subject motion or respiration can be ruled out as confounds (and I don't think they can), we don't really know the hemodynamics at this level of sensitivity. It is quite possible that there are slow, long-range BOLD effects accompanying any focal BOLD activation. This would be of biological interest, as it would explain more about neurovascular coupling, but it probably doesn't tell us much about cognitive processing. One way to test for this might be to compare the activation pattern for different sensory modalities.

    There are other signals in there which seem more exciting. For example there are regions that show clear and strong responses at the onset and offset of the task blocks.

    So in summary, I agree these are interesting papers, as they do open up great possibilities for future study – but I don't think they are quite the damning verdict on conventional fMRI statistics that you made them out to be in your blog post.

  • Federico Turkheimer

    Late reply from a busy week
    @ Dan H
    “Double dissociations and most commonly used methods don't address some of our findings. For example, we go beyond significance maps and apply clustering to segment the brain. As neuroskeptic notes, we clearly see some regions with traditional box car responses and other with every type of signal increase/decrease, and onset/offset transients”

    Indeed this is the second fascinating aspect of this. The box car epitomises an expected finite resolution in time, as thresholding does in space. Neither seems to be true, nor is it believable that these are artefacts (respiration or otherwise) unless they are somehow associated with the task; otherwise they should cancel out, being present in the control condition as well. This evidence supports the idea that brain activity has a long correlation structure in space and time, which fits with the idea of fractality/criticality of brain function proposed recently by us and others.

  • Neuroskeptic

    Nitpicker – I'm inclined to agree with all of those points, but my question is, where do we draw the line?

    Clearly if you had an n of 1,000,000 brains, you would pick up absolutely everything, including the tiniest artefacts and purely physiological effects.

    With an n of 1 you wouldn't.

    But between those two extremes, where should we position ourselves? How do we know where interesting cognitive BOLD ends, and boring physiological noise begins?

    Can we be sure that we can reach that line, with an n of 20ish as most people today use? I don't think we can be sure. So there is an open question over how to interpret most fMRI studies, which may be underpowered. They might well turn out not to be – I hope not because I've published some of them! But they could be.
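    One way to put numbers on it: under a normal approximation, the sample you need grows with the inverse square of the smallest effect you hope to detect. A back-of-envelope power calculation (standard textbook arithmetic, my own sketch rather than anything from the papers):

```python
import math
from scipy.stats import norm

def n_required(d, alpha=0.001, power=0.8):
    """Approximate subjects needed for a one-sample test to detect an
    effect of size d (in SD units) at two-sided alpha with given power."""
    z_a = norm.ppf(1 - alpha / 2)
    z_b = norm.ppf(power)
    return math.ceil(((z_a + z_b) / d) ** 2)

for d in (1.0, 0.5, 0.2, 0.1):
    print(f"d={d}: n ~ {n_required(d)}")
```

    Halving the effect size you care about roughly quadruples the sample you need – so where the line falls between interesting cognitive BOLD and what an n=20 study can see depends entirely on where biologically meaningful effects actually sit on that scale.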

  • Nitpicker

    Neuroskeptic: “…but my question is, where do we draw the line?”

    I agree with your point but I think this is an empirical question. The findings of both these papers suggest a need for further experiments to answer it. This should involve TMS or other causal manipulations to target some regions to test whether activity there is causally necessary.

    Moreover, it will require more experiments in a similar vein to those reported here, but using different tasks, to ascertain how general these effects are. For example, are the transient responses at the beginning and end of a stimulus block a general signal indicating a change in overall behavioral engagement (or arousal), or are they more specifically related to the particular tasks used here?

    Perhaps the first experiment someone should run, though, is a repeat of the Gonzalez-Castillo experiment without stimulation – or, even better, with the periods of stimulation they used interspersed with longer epochs of several minutes without any stimulus. This would be a better way to quantify the artifactual component than using a phantom, which contains no biological artifacts. A few years ago there was a Das optical imaging paper claiming to find BOLD effects entrained by a periodic stimulation paradigm in the absence of any significant stimulus (sorry, I can't find this paper right now). I think it's conceivable that similar effects are seen here.

    All things considered, I agree these are interesting findings and they should be studied further – however, I don't think they are damning for the conventional fMRI design just yet. I would argue that even the now-standard n=20 is perhaps already a bit excessive. The most robust findings in neuroimaging can be seen even in single-subject case studies, and there is work left to be done to understand these effects. So I don't believe we must now all scan 100 repetitions per subject or use sample sizes in the thousands.

  • Ivana Fulli MD

    Great expectations (proving Freud right thanks to the Tip of the Iceberg and the self-confidence of a believer) under Pr Nuttt-well credit?

    I feel jealous: your neuropsychoanalysts are better connected than the French ones!

  • Dan H

    @Nitpicker, Great comments and food for thought.
    Particularly for the issues here, TMS is not the ideal complementary approach. TMS, like lesion studies, can only show regions that are essential to task performance (or at least measurably alter task performance). These fMRI results may be showing areas that respond to a task whether or not they are task-essential. No one thinks we need primary auditory cortex for visual processing, but the fact that fMRI responds to a visual task there (assuming we're measuring neuro-metabolism) is still scientifically interesting and can lead to new experimental designs to figure out what is happening in auditory cortex. For an extreme TMS-like example, very few tasks that involve pressing buttons while looking at images on a monitor would show differences if you put a patch over a person's eye. That doesn't mean the eye isn't involved in vision, or that changes in neural responses aren't relevant. The better complementary measures would be direct electrophysiology (ECoG, LFP, spiking, …), since they could probe what types of neural responses we're observing with fMRI, but they obviously have their own limits.

    Regarding this being motion or another artifact: in addition to trying to remove as many potential artifacts as possible, we generally know what a task-related motion artifact looks like. Task-related motion usually causes similar squiggles at tissue boundaries with similar orientations. The clustering analyses and examination of the voxel-wise data just don't show this type of effect. A task-locked pulse/breathing artifact should have some relationship to vessel density or location in the brain vasculature. These have some anatomical structure, but the variety of responses just don't look the way I'd expect these types of artifacts to look. This is a (biased) personal and qualitative opinion rather than one backed by hard data, but I think it's very unlikely that most of these findings are driven by motion artifacts and, while pulse/respiration artifacts might play a role, I think they could only explain a fraction of our findings.

    One way to test for this may perhaps be to compare the activation pattern for different sensory modalities. Thanks for announcing to the world some of our follow-up study plans. :)

    I agree with you that our results aren't damning to current methods, but I hope they contribute to a better understanding of the limits of those methods (all methods have limits) and provoke bigger discussions on new ways to analyze neuroscience data with fMRI and other modalities.

  • Dan H

    @Nitpicker, Also, regarding your comment about Das' observations on a disconnect between hemodynamics and metabolism: I have some published critiques of their interpretations. You can read them at:
    Another critique by Kleinschmidt & Muller is here:
    Here's another from Logothetis:
    I don't think anyone else has replicated their finding, and I haven't seen anything from their group that addresses the fundamental concerns regarding how they designed their study and interpreted their results. From speaking to many people, I gather there's a healthy dose of skepticism about their findings, but the original paper made for great press and it's an easy citation for people who want to criticize fMRI. As usual, the less flashy critiques of a surprising finding don't seem to have traveled as widely.

    Just to be clear, I think their work is interesting and stimulates some good discussions, but the place it's taken in critiques of fMRI is much greater than the actual data/results merit.

  • Dan H

    Neuroskeptic: “But between those two extremes, where should we position ourselves? How do we know where interesting cognitive BOLD ends, and boring physiological noise begins?”
    I love this framing, but I think this is a search for a non-existent border. There's always going to be overlap between physiological noise and changes in neuronal metabolism, and there's no N at which these problems magically disappear or appear. One can design an N=3 study with reasonable confidence that there are few physiological artifacts, or an N=1000 study where we have no clue. The recent papers on motion in resting connectivity (Power et al, Neuroimage 2012 & Van Dijk et al, Neuroimage 2012) show that you can have huge samples with serious artifacts if you don't look for those artifacts and try to correct them.

    This is true with task fMRI too. Birn et al, Neuroimage 2009 recorded task-locked breathing variation and showed that it noticeably altered results. In a failed population study I once did, I was simply showing flashing checkerboard blocks and telling people to press buttons. Some volunteers (more in one population) spontaneously held their breath for the 10 s task period. Those data were unusable, and would have been unusable regardless of sample size. The big difference between the Birn study, or my failed study, and many others is that we bothered to collect and look at respiration data.

    The magic of N=20 is that it seems to be the point where, if you collect data from the same population, you're fairly likely to get the same group result. There are other thresholds we care about but, if we care about replication, that's a pretty important one. Then again, the N=20 evidence is from a single study: Murphy & Garavan, Neuroimage 2004. It was a high-quality study, but who knows if that result is replicable!
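    That replication point can be illustrated with a toy Monte Carlo (my own sketch; the effect size d = 0.65 and the simple one-sample design are illustrative assumptions, not values from Murphy & Garavan):

    ```python
    import numpy as np
    from scipy.stats import ttest_1samp

    rng = np.random.default_rng(0)
    d_true, n, sims = 0.65, 20, 2000   # hypothetical group effect, N=20, 2000 simulated studies

    def study_is_significant(n):
        """Simulate one n-subject study and test the group mean against zero."""
        sample = rng.normal(d_true, 1.0, n)
        return ttest_1samp(sample, 0.0).pvalue < 0.05

    power_hat = np.mean([study_is_significant(n) for _ in range(sims)])
    both = power_hat ** 2   # chance that two independent N=20 studies both find the effect
    print(f"single-study power ~ {power_hat:.2f}; both studies replicate ~ {both:.2f}")
    ```

    Under these assumptions a single N=20 study detects the effect roughly four times out of five, so two independent N=20 studies agree only about 60% of the time – "fairly likely", but far from guaranteed.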

  • Nitpicker

    @Dan H: thanks for your detailed response and clarification. I also think I may have come across as too harsh. Even if some of the effects turn out to be artifactual (in the sense of an effect not directly related to neuronal processing), they are not uninteresting. In fact, we don't even know what subtle microscale motion artifacts look like. It would already be interesting to know that there are very small motion artifacts that correlate with a task. This may not be what your data are showing, but generally speaking I don't think it's impossible.

    As for the Das finding, thanks for pointing out those replies. I had seen one of them but not the others. Again, more generally speaking, I believe there may be hemodynamics we do not really understand. I think this is something your results point to.

  • Dan H

    @Nitpicker, No harshness perceived. This feels like a pretty standard scientific discussion. Skepticism is good.

    Regarding issues with hemodynamics, since I'm self-promoting already, I figure there's no reason to stop now. :) I have a fun history/review article on hemodynamics and fMRI in press:
    “The continuing challenge of understanding and modeling hemodynamic variation in fMRI”
    I focused mainly on what fMRI researchers should consider for our typical studies, and only glancingly mention the complexities of neurovascular coupling. To cover everything known would take several book chapters, if not an entire book. Hopefully that article is interesting and a bit provocative.

  • Anonymous

    For perspective, the 'tip of the iceberg' metaphor was applied to fMRI by Marc Raichle, in regard to a range of studies (electrical recordings, PET, MRS, calibrated fMRI) showing that the amount of neuronal activity at 'rest' is many times higher than the changes in activity measured during a task state. Below is a link to a publicly accessible review on the topic

    One of the key questions is whether the high level of activity throughout the brain acts in a coordinated manner or is just some kind of system noise. If it is system noise, then interpretations about processing from fMRI that neglect it are presumably still valid.

    From this perspective the findings discussed here support a view in which total brain activity is to some degree coordinated during tasks and therefore cannot be neglected.

  • Nitpicker

    I'm sure you already know about it, but Karl Friston published a rather sardonic paper. It deals in large part with the fallacy of classical inference. This should be relevant to this discussion:

  • Neuroskeptic

    Nitpicker – Yeah, I've seen that. I think it raises the question I've been asking in this thread, namely: where does “trivial” end and “biologically meaningful” begin, in terms of BOLD? We just don't know, as far as I can see. There's no good reason to think that all the interesting stuff can already be detected with current study designs and technology – but it could be.





About Neuroskeptic

Neuroskeptic is a British neuroscientist who takes a skeptical look at his own field, and beyond. His blog offers a look at the latest developments in neuroscience, psychiatry and psychology through a critical lens.

