Scarred Brains or Shiny Statistics: The Perils of CCA

By Neuroskeptic | May 11, 2019 5:42 am

A paper in PNAS got some attention on Twitter recently. It’s called “Childhood trauma history is linked to abnormal brain connectivity in major depression”, and in it the authors, Yu et al., report finding (as per the Significance Statement):

A dramatic primary association of brain resting-state network (RSN) connectivity abnormalities with a history of childhood trauma in major depressive disorder (MDD).

The authors go on to note that even though “the brain imaging took place decades after trauma occurrence, the scar of prior trauma was evident in functional dysconnectivity.”

Now, I think that this talk of dramatic scarring is overblown, but in this case there’s also a wider issue with the use of a statistical method which easily lends itself to misleading interpretations – canonical correlation analysis (CCA).

*

First, we’ll look at what Yu et al. did. In a sample of 189 unmedicated patients with depression, Yu et al. measured the resting-state functional connectivity of the brain using fMRI. They then analyzed this to give a total of 55 connection strengths for each individual. Each of these 55 measures reflects the functional coupling between two brain networks.

For each patient, Yu et al. also administered questionnaires measuring personality, depression and anxiety symptoms, and history of trauma. These measures were then compressed into 4 clinical clusters: (i) anxious misery, (ii) positive traits, (iii) physical and emotional neglect or abuse, and (iv) sexual abuse.

This is where the CCA comes in. CCA is a method for extracting statistical associations between two sets of variables. Here one set was the 55 brain connectivity measures, and the other was the 4 clinical clusters. Yu et al.’s CCA revealed a single, strong association (or ‘mode of variation’) between the two variable sets:

[Figure from Yu et al.: the CCA mode]

A correlation coefficient of 0.68 is very large for a study of a brain-behaviour relationship. Normally, this kind of result would certainly justify the term “dramatic association”.

But the result isn’t as impressive as it seems, because it’s a CCA result. CCA is guaranteed to find the best possible correlation between two sets of variables, essentially by combining the variables in each set (via a weighted sum) in whatever way maximizes the correlation coefficient between the two sums. In other words, it is guaranteed to overfit and to overestimate the association in the sample it was fitted to.
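To get a feel for how much correlation CCA can conjure from nothing, here is a minimal sketch in Python (NumPy only). It is not Yu et al.’s pipeline: the data are pure random noise, the helper name `first_canonical_correlation` is hypothetical, and the dimensions (189 subjects, 55 and 4 variables) are simply borrowed from the study design described above. It relies on a standard identity: the canonical correlations are the singular values of Qx^T Qy, where Qx and Qy are orthonormal bases for the two centred variable sets.

```python
import numpy as np

def first_canonical_correlation(X, Y):
    """Largest canonical correlation between the columns of X and of Y."""
    # Centre each variable, then take an orthonormal basis of each column space.
    Qx, _ = np.linalg.qr(X - X.mean(axis=0))
    Qy, _ = np.linalg.qr(Y - Y.mean(axis=0))
    # The singular values of Qx^T Qy are the canonical correlations (largest first).
    return np.linalg.svd(Qx.T @ Qy, compute_uv=False)[0]

# Synthetic noise only: there is no true association between X and Y.
rng = np.random.default_rng(0)
n_subjects, n_brain, n_clinical = 189, 55, 4   # sizes assumed from the post
X = rng.standard_normal((n_subjects, n_brain))
Y = rng.standard_normal((n_subjects, n_clinical))

r_noise = first_canonical_correlation(X, Y)
print(f"first canonical correlation on pure noise: r = {r_noise:.2f}")
```

Because the two sets are independent by construction, whatever value of `r_noise` comes out is pure overfitting. The exact number depends on the seed and on the ratio of variables to subjects, so it should not be expected to match the figures reported in the paper.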

Yu et al. show this, as they found that using a permutation procedure (which eliminates any true associations) the CCA still produced a mean correlation coefficient of r=0.55. In 5% of cases, the CCA was lucky enough to hit r=0.62 or higher. Remember that the ‘true’ correlation is zero in this case! CCA is able to magic up a strong correlation of 0.55 or higher from out of thin air.

[Figure from Yu et al.: permutation null distribution of the CCA correlation]

The observed correlation of r=0.68 is statistically significant, because it’s higher than the 95th percentile of the permutation null distribution (r=0.62), but it’s not much higher. In other words, while there does seem to be some true relationship between the brain and behavior variables here, it is almost certainly much weaker than it appears.
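The permutation logic can be sketched in a few lines of Python. This is an illustration under assumptions, not a reconstruction of Yu et al.’s analysis: the data are synthetic noise, the function names and the choice of 200 permutations are hypothetical, and only the first CCA mode is examined. The idea is to shuffle the subjects in one variable set, rerun the CCA, and repeat, which gives the distribution of r that CCA produces when no real link exists.

```python
import numpy as np

def first_canonical_correlation(X, Y):
    """Largest canonical correlation, via QR and SVD of the centred data."""
    Qx, _ = np.linalg.qr(X - X.mean(axis=0))
    Qy, _ = np.linalg.qr(Y - Y.mean(axis=0))
    return np.linalg.svd(Qx.T @ Qy, compute_uv=False)[0]

def permutation_null(X, Y, n_perm=200, seed=0):
    """First canonical correlation after repeatedly shuffling the rows of Y."""
    rng = np.random.default_rng(seed)
    null = np.empty(n_perm)
    for i in range(n_perm):
        # Shuffling subjects in Y breaks any real subject-level link to X.
        null[i] = first_canonical_correlation(X, Y[rng.permutation(len(Y))])
    return null

# Synthetic stand-in data with the same shape as the study (an assumption).
rng = np.random.default_rng(1)
X = rng.standard_normal((189, 55))
Y = rng.standard_normal((189, 4))

observed = first_canonical_correlation(X, Y)
null = permutation_null(X, Y)
threshold = np.quantile(null, 0.95)            # 95th percentile of the null
p_value = (1 + np.sum(null >= observed)) / (1 + len(null))
excess = observed - null.mean()                # how far above "chance" r we are

print(f"observed r = {observed:.2f}, null mean = {null.mean():.2f}, "
      f"95% threshold = {threshold:.2f}, p = {p_value:.3f}")
```

The useful quantity for a reader is arguably `excess` rather than `observed`: a result can clear the 95% threshold while sitting only a little above the correlation that CCA produces by chance.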

(Yu et al. in their paper also carried out a comparison of depressed patients to healthy controls, which does not rely on CCA, and which I’m not discussing here.)

*

So what is the use of CCA, if it is guaranteed to overfit the data? Well, it can be useful so long as you have two (or more) independent datasets, allowing you to test the validity of the CCA model, derived from one dataset, in another. The CCA would be overfitted to the first dataset, but by testing it in the second dataset, we can know how much of the correlation is real.

Unfortunately, Yu et al. is not the only paper to adopt a single-sample CCA approach. A well-cited paper in Nature Neuroscience, Smith et al. (2015), which Yu et al. refer to several times, did the same thing. (I blogged about it at the time, rather un-skeptically.)

Smith et al. compared brain functional connectivity to behaviour and lifestyle variables, and found a mode of CCA variation with a spectacularly strong correlation of r=0.8723. But the 95% significance threshold under the permuted null hypothesis turned out to be an almost-as-spectacular r=0.84! So, just as with Yu et al., the observed result was significant, but only slightly better than what CCA produced by chance alone.

In fact, Smith et al. went on to test the validity of the CCA by running CCA for 80% of the dataset (‘training set’) and testing it in the remaining left-out 20%. This is a kind of rough-and-ready approximation of using a second dataset. Smith et al. found that the correlation in the left-out data was r=0.25 – a much more modest result, although still something.
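A train/test check of this kind can be sketched as follows. It is a minimal illustration, not Smith et al.’s code: the data are simulated with a deliberately weak shared signal, the function names are hypothetical, and a single random 80/20 split is used where a real analysis might repeat the split or use cross-validation. The CCA weights are fitted on the training subjects only and then applied, unchanged, to the left-out subjects.

```python
import numpy as np

def fit_first_mode(X, Y):
    """Fit the first CCA mode; return training means, weights and in-sample r."""
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    Qx, Rx = np.linalg.qr(X - x_mean)
    Qy, Ry = np.linalg.qr(Y - y_mean)
    U, s, Vt = np.linalg.svd(Qx.T @ Qy, full_matrices=False)
    # Map the leading singular vectors back to weights on the original variables.
    a = np.linalg.solve(Rx, U[:, 0])
    b = np.linalg.solve(Ry, Vt[0])
    return x_mean, y_mean, a, b, s[0]

def mode_correlation(model, X, Y):
    """Correlation between the two canonical variates on (possibly new) data."""
    x_mean, y_mean, a, b, _ = model
    return np.corrcoef((X - x_mean) @ a, (Y - y_mean) @ b)[0, 1]

# Simulated data: mostly noise, plus a weak latent factor shared by one
# variable in each set (all of this is assumed purely for illustration).
rng = np.random.default_rng(2)
n = 189
X = rng.standard_normal((n, 55))
Y = rng.standard_normal((n, 4))
z = rng.standard_normal(n)
X[:, 0] += 0.5 * z
Y[:, 0] += 0.5 * z

# Random 80/20 split of subjects into training and left-out sets.
order = rng.permutation(n)
n_train = int(0.8 * n)
train, test = order[:n_train], order[n_train:]

model = fit_first_mode(X[train], Y[train])
r_train = model[4]
r_test = mode_correlation(model, X[test], Y[test])
print(f"in-sample r = {r_train:.2f}, held-out r = {r_test:.2f}")
```

The held-out correlation is the honest estimate here, because the left-out subjects played no part in choosing the weights. With only about 38 test subjects it is also a noisy estimate, which is one reason a genuinely independent second dataset is preferable.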

I would say that this kind of train/test analysis should be a bare minimum in any neuroscience CCA paper. I suspect that if it were applied in Yu et al.’s case the correlation would be small.

  • OWilson

    Such a correlation might be a kick off point for further testing and study, but hardly merits a determinative conclusion.

    As you point out, the problem with statistical analysis, besides Selection Bias and Confirmation Bias, is the Probability Bias, which treats outlier results as significant anomalies. There is also the possibility of other, less obvious, factors influencing the findings.

    It manifests itself in many areas of inquiry, from pseudoscientific telepathy and predictive abilities (calling all ten coin tosses accurately) to apparent localized cancer clusters and even some mainstream scientific modeling.

    Usually resolved by further testing, and empirical observation :)

  • Steve Smith

    Nice clear article – I agree with these points entirely. You’re right that in our positive-negative CCA paper we did a bunch of supplemental tests to (a) verify that the p-values did make sense and (b) characterise the strength of the null CCA r-value. But I also agree that we (all) should do more when presenting the primary results to ameliorate the risk of giving the impression of the main CCA r being much higher than it meaningfully is.

    • http://blogs.discovermagazine.com/neuroskeptic/ Neuroskeptic

      Thanks!

  • Angela Green

    The PNAS paper actually split the patients into female/male and younger/older subgroups, and re-ran the CCA for the four subgroups. The main results still hold for the female patients (who more often have childhood trauma than male patients) and for the younger patients. Those results are shown in the Supporting Information.

    • http://blogs.discovermagazine.com/neuroskeptic/ Neuroskeptic

      Thanks for the comment!

      They did do subgroup analysis, but this is not the same as the split-sample validation that I mentioned, because (as far as I can see) they didn’t look for the correlation between the canonical variables from one subgroup in another subgroup. Rather, they did CCA for the subgroups separately.

      The CCA in the female group gave results similar to the whole sample, but the male group didn’t; similarly the young subgroup was similar to whole sample but CCA modes in the older patients were not significant.

      If we check the correlations between modes and the raw variables for the whole sample, female and young subgroups (S10B, 11B and 15B), it’s clear that although the overall performance was similar in the subgroups, the specific variables driving the effects differed a bit. They do look pretty similar, though, which is reassuring.

      • Angela Green

        I agree that the interpretation of CCA results must be made with caution. Importantly, current CCA findings in the literature should be validated using new and larger dataset(s). But it’s better to let the audience know the whole picture of any article, otherwise it’s misleading and unfair to the authors. I also read the discussions on Twitter. But to be honest, before reading through the entire article, I didn’t know that 1) the authors actually did perform subgroup analysis (although they didn’t do what you are recommending); 2) they never claimed that they identified a biomarker (they only said “has the potential to serve as a biomarker”); 3) the high correlation (r = 0.98) was not between brain imaging and behaviour measures, but a variate-to-variable correlation. However, scientifically speaking, it is always very important for us to read any articles critically.

About Neuroskeptic

Neuroskeptic is a British neuroscientist who takes a skeptical look at his own field, and beyond. His blog offers a look at the latest developments in neuroscience, psychiatry and psychology through a critical lens.
