A new paper brings worrying news for neuroscientists using fMRI to study memory:
Across-subject reliabilities were only poor to fair… for novelty encoding paradigms, the interpretation of fMRI results on a single subject level is hampered by its low reliability. More studies are needed to optimize the retest reliability of fMRI activation for memory tasks.
The researchers, David Brandt and colleagues from Marburg, Germany, scanned 15 healthy volunteers twice each. In order to measure the neural activation associated with memory formation, the subjects were shown word and picture stimuli that they’d never seen before.
In a group-level analysis, when all the volunteers’ data were averaged together, several brain areas were activated by the memory task (and these were pretty much the areas one would expect).
However, the degree of activation in different brain areas was not very stable within individuals. Comparing the two activity patterns from the scans a month apart, the intraclass correlation coefficient (ICC) for each voxel of the brain was poor: the median ICC was at best 0.35, which is low, and that was in the most favorable of the several conditions of the task.
In other words, if an individual shows a huge activation of the hippocampus in one session, it doesn’t mean that will happen whenever they get scanned. This implies that we shouldn’t ‘read too much into’ a given individual’s degree of activation, as it might be quite different next time around.
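For readers who want to see what a test-retest ICC actually measures, here is a rough sketch for a single voxel. It assumes the two-way "consistency" form, often written ICC(3,1); the paper may have used a different variant. The function name `icc_consistency` and the activation values are made up purely for illustration and are not data from Brandt et al.

```python
def icc_consistency(scores):
    """Two-way consistency ICC for single measurements, ICC(3,1).

    scores: one row per subject, one column per scanning session.
    Assumption: this is one common ICC variant, not necessarily
    the one used in the paper under discussion.
    """
    n = len(scores)        # number of subjects
    k = len(scores[0])     # number of sessions
    grand = sum(sum(row) for row in scores) / (n * k)
    subject_means = [sum(row) / k for row in scores]
    session_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Partition the total sum of squares as in a two-way ANOVA.
    ss_subjects = k * sum((m - grand) ** 2 for m in subject_means)
    ss_sessions = n * sum((m - grand) ** 2 for m in session_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_subjects - ss_sessions

    ms_subjects = ss_subjects / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_subjects - ms_error) / (ms_subjects + (k - 1) * ms_error)


# Hypothetical activation estimates for one voxel:
# five subjects, each scanned in two sessions (invented numbers).
voxel = [
    [0.8, 0.3],
    [0.2, 0.6],
    [1.1, 0.9],
    [0.5, 0.1],
    [0.4, 0.7],
]
print(round(icc_consistency(voxel), 2))
```

An ICC near 1 means that subjects keep their rank order and spacing from one session to the next; a value near 0 means that one session tells you little about the other. Repeating this for every voxel and taking the median gives the kind of summary figure quoted above.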
You might remember that four years ago, I blogged about a review finding that the test-retest reliability of fMRI was modest: Can We Rely On fMRI? That paper reported that the mean test-retest ICC for fMRI is 0.50, which means that the reliability of the memory activations in the new study is well below average.
An interesting discussion followed my tweeting about the new paper.
Simon W Davis pointed out that, four years ago, Miller et al (2009) reported comparable results. The two studies are very similar in some ways: both involved fMRI of volunteers scanned twice, several weeks apart, with memory tasks. The paradigms were different, though (novelty encoding in Brandt vs episodic retrieval, semantic retrieval, and working memory tasks in Miller).
However, as Kirstie Whitaker observed, there was a difference in interpretation:
Miller et al span it as “brains are different to each other”, Brandt et al as “neuroimaging is noisy”.
To which Davis commented that
This is what 3 years of neurobashing gets you: same study, more conservative interpretation.
Brandt DJ, Sommer J, Krach S, Bedenbender J, Kircher T, Paulus FM, & Jansen A (2013). Test-Retest Reliability of fMRI Brain Activity during Memory Encoding. Frontiers in Psychiatry, 4. PMID: 24367338