Arrange your fingers like the image below, and then look at them closely.
Do you notice anything odd?
Psychologist Marco Bertamini of the University of Liverpool describes this test in a fun new paper. According to Bertamini, seven of the ten people he surveyed reported that their little fingers clearly appeared to be ‘too far away’, to the extent that they did not appear to be part of their hands.
Bertamini suggests that the illusion arises because the little finger is considerably smaller than the others, and our visual system tends to assume that smaller things are further away – the same size-distance assumption that underlies the famous Ames room illusion.
Bertamini named his discovery the “Bathtub Illusion”, after the location where he first observed the perceptual distortion. He even includes a pic of himself relaxing in the very tub where this occurred.
Personally, I can experience the illusion when I look at the photo shown above, but I wasn’t able to make it work by staring at my own hands. Six of Bertamini’s ten volunteers also reported that the illusion was stronger with the photo. I suspect this is because we can feel the position of our fingers, as well as seeing them.
This is not the first bathtub-based visual illusion, believe it or not. A (quite powerful) ‘stretching out in the tub’ illusion was revealed by Lydia Maniatis in 2010.
This week I came across a brain stimulation device called Humm that promises to improve your cognitive function and memory if you stick it to your forehead.
There are several broadly similar devices on the market, which make use of the principle of transcranial alternating current stimulation (tACS) – passing a current through the head (the front of the head, generally) in order to modulate brain activity.
Humm’s pitch goes like this: working memory calls on a specific type of brainwave – slow, thrumming rhythms called theta waves. As we age, our theta waves weaken and the brain’s rhythms fall out of sync, and working memory declines. Humm, we’re told, uses a “proven method” called tACS to resynchronize these rhythms and strengthen memory by gently stimulating the brain at a theta frequency of 6 Hz. tACS supposedly acts like the conductor of an orchestra, guiding populations of neurons to fire simultaneously so that separate areas of the brain communicate better.
What caught my eye about Humm is that they report doing a randomized, controlled study to show that their device really works and isn’t just a placebo. Here’s the write-up of the experiment.
I have to say that Humm’s study pleasantly surprised me. I was expecting it to be some kind of half-baked study that’s more marketing stunt than science. However, the study actually looks solid and I think it would pass peer-review, although it’s not published.
The Humm team randomly assigned n=36 volunteers to receive either active Humm stimulation or a sham (placebo) condition. The experimenters were also blinded to the condition. Before, during, and after the stimulation, participants completed a simple working memory task.
The active stimulation group performed better during and after stimulation than the control group (p=0.004 at both time-points).
There were no group differences in expectations of benefit from Humm at baseline, or in perceived benefit experienced at the end of the experiment, which counts against a placebo explanation of the improvement.
The write-up even includes the raw data, and graphs the individual data-points, which is something that many academic papers still fail to do, so in that regard, Humm could be said to have gone beyond the gold standard.
Overall, this seems like a well-designed experiment, except for one problem: it’s small. n=36 split between two groups is not much data; it’s not a tiny sample size, but it’s certainly not a large one, and I would want to see a much larger study before I would pay $99.00 to pre-order a pack of 12 Humms.
I consider it a priori unlikely that tACS stimulation could improve cognition. Humm hits the prefrontal cortex with 6 Hz current, which is supposed to enhance theta oscillations. But the frequency of theta oscillations varies across individuals; the theta range is usually said to be 4-7 Hz. Very few people would have a theta frequency of ‘exactly’ 6 Hz.
If my personal theta frequency is, let’s say, 5 Hz, then adding a 6 Hz stimulation would seem more likely to disrupt the normal theta function, rather than helping it.
Even supposing that my theta frequency was precisely 6 Hz, then Humm stimulation might be in phase with my theta waves, enhancing them, but it would be equally likely to be out of phase and suppress them. There is indeed evidence that individually-tailored theta tACS can disrupt working memory, although to be fair, plenty of other studies show a benefit. My point is that, a priori, there is no reason to assume a beneficial effect of this kind of stimulation.
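To see why phase matters here, note that adding two sine waves of the same frequency either doubles the rhythm or cancels it entirely, depending on their relative phase. A toy numerical sketch (illustrative arithmetic only, not a model of real neural dynamics or real tACS):

```python
import numpy as np

t = np.linspace(0, 1, 1000)           # one second of "time"
theta = np.sin(2 * np.pi * 6 * t)     # a 6 Hz "endogenous" theta rhythm

# Stimulation at the same 6 Hz, either aligned or opposed in phase
in_phase = theta + np.sin(2 * np.pi * 6 * t)            # crests add up
anti_phase = theta + np.sin(2 * np.pi * 6 * t + np.pi)  # crests meet troughs

print(np.abs(in_phase).max())    # ~2.0: the rhythm is amplified
print(np.abs(anti_phase).max())  # ~0.0: the rhythm is cancelled
```

And that is the best case: if the stimulation frequency is merely close to the endogenous one (say 6 Hz against 5 Hz), the two drift in and out of phase, alternately boosting and suppressing the rhythm.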
A highly acclaimed neuroscientist whose work offered hope for many patients with brain injury has fallen from grace.
Prof. Niels Birbaumer, of the Eberhard-Karls University of Tübingen in Germany, came under investigation earlier this year. The probe began after researcher Martin Spüler raised serious concerns over a 2017 paper in PLoS Biology by Ujwal Chaudhary et al. Birbaumer was the senior author.
This very blog forms a large part of a newly published study on research methods blogs in psychology. The paper has a spicy backstory.
Back in 2016, psychologist Susan Fiske caused much consternation with a draft article which branded certain (unnamed) bloggers as being “bullies” and “destructo-critics” who “destroy lives” through “methodological terrorism.”
Fiske’s post (which later appeared in a more moderate version) was seen as pushback against bloggers who criticized the robustness and validity of published psychology papers. According to Fiske, this criticism often spilled over into personal attacks on certain individuals. Much debate ensued.
Now, Fiske is the senior author of the new study, which was carried out to examine the content and impact of 41 blogs that have posted on psychology methods, and, in particular, to find out which individual researchers were being mentioned (presumably, criticized) by name.
The included blogs (listed in the supplementary material) were a fairly comprehensive list, as far as I can see. My blog has the second largest number of posts out of all the blogs included (1180), but this pales in comparison with Andrew Gelman’s 7211, although that is a multi-author blog. All posts were downloaded and subjected to text mining analysis. Data was collected in April 2017.
The results about the bloggers’ ‘targets’ were fairly unsurprising to me. It turned out that, out of a list of 38 researchers who were nominated as potential targets, the most often mentioned name was Daryl Bem (of precognition fame), followed by Diederik Stapel (fraud), and then Brian Wansink and Jens Förster (data ‘abnormalities’).
These results seem inconsistent with the idea that bloggers were especially targeting female researchers, which had been one of the bones of contention in the 2016 debate. As the paper says:
Equal numbers of men and women were nominated, but nominated men were mentioned in posts more often.
I would note though that many of the male names high on the list have been ‘officially’ found guilty, or resigned (Stapel, Wansink, Förster, Smeesters), while none of the women have to my knowledge (Fredrickson, Schnall, Cuddy). At best you could try to argue that bloggers unfairly target innocent women? I’m not sure that this kind of question can be answered with quantitative data, anyway.
I have to say that it’s to her credit that Fiske carried out this detailed analysis of blogs in the wake of the firestorm over her 2016 comments. She could easily have just decided to walk away from the whole topic but instead she decided to collect some real data. On the other hand, I agree with Hilda Bastian’s comments on the weaknesses of this paper in statistical terms:
In some ways, the study has more relevance to a debate about weaknesses in methods in psychological science than it does to science blogging. It’s a small, disparate English-language-biased sample of unknown representativeness, with loads of exploratory analyses run on it. (There were 41 blogs, with 11,539 posts, of which 73% came from 2 blogs.) Important questions about power are raised, but far too much is made of analyses by gender and career stage for such a small and biased sample. And they studied social media, but not Twitter.
A paper just out in eccentric medical journal Medical Hypotheses caught my eye yesterday:
Hmm, I thought, this looks interesting. I’d never heard of the idea that nanoparticles could cause neurological illness.
So I read the paper and quickly found myself falling down a (nano)rabbithole into a fascinating and little-known tale of strange science.
If you delve into the wildest depths of the scientific literature, you will find a trilogy of papers so weird, that they have become legendary.
In these articles, spanning a 12-year period, author Jarl Flensmark says that heeled shoes cause mental illness, while flat footwear promotes brain health:
A dubious paper just published in Molecular Neurobiology suggests that all military recruits should be offered genetic testing to assess their risk of PTSD. According to the authors, Kenneth Blum et al.,
We hypothesize that, even before combat, soldiers with a childhood background of violence (or with a familial susceptibility risk) would benefit from being genotyped for high-risk alleles (DNA variants). This process may assist us in identifying potential military candidates who would be less well suited for combat than those without high-risk alleles.
Fortunately, the authors claim, such a test already exists, and it’s called the Genetic Addiction Risk Score (GARS).
A paper in a peer-reviewed medical journal suggests that physicist Stephen Hawking’s disability, which famously confined him to a wheelchair and robbed him of his speech, was psychosomatic in nature.
Hmm. I think this says more about the author than it does about Hawking.
A paper in PNAS got some attention on Twitter recently. It’s called Childhood trauma history is linked to abnormal brain connectivity in major depression and in it, the authors Yu et al. report finding (as per the Significance Statement)
A dramatic primary association of brain resting-state network (RSN) connectivity abnormalities with a history of childhood trauma in major depressive disorder (MDD).
The authors go on to note that even though “the brain imaging took place decades after trauma occurrence, the scar of prior trauma was evident in functional dysconnectivity.”
Now, I think that this talk of dramatic scarring is overblown, but in this case there’s also a wider issue with the use of a statistical method which easily lends itself to misleading interpretations – canonical correlation analysis (CCA).
First, we’ll look at what Yu et al. did. In a sample of 189 unmedicated patients with depression, Yu et al. measured the resting-state functional connectivity of the brain using fMRI. They then analyzed this to give a total of 55 connection strengths for each individual. Each of these 55 measures reflects the functional coupling between two brain networks.
For each patient, Yu et al. also administered questionnaires measuring personality, depression and anxiety symptoms, and history of trauma. These measures were then compressed into 4 clinical clusters: (i) anxious misery, (ii) positive traits, (iii) physical and emotional neglect or abuse, and (iv) sexual abuse.
This is where the CCA comes in. CCA is a method for extracting statistical associations between two sets of variables. Here one set was the 55 brain connectivity measures, and the other was the 4 clinical clusters. Yu et al.’s CCA revealed a single, strong association (or ‘mode of variation’) between the two variable sets:
But the result isn’t as impressive as it seems, because it’s a CCA result. CCA is guaranteed to find the best possible correlation between two sets of variables, essentially by combining the variables (via a weighted sum) in whatever way maximizes the correlation coefficient. In other words, it is guaranteed to over-fit and over-estimate the association.
Yu et al. demonstrate this themselves: using a permutation procedure (which destroys any true associations), the CCA still produced a mean correlation coefficient of r=0.55, and in 5% of cases the CCA was lucky enough to hit r=0.62 or higher. Remember that the ‘true’ correlation is zero in this case! CCA is able to conjure a strong correlation of 0.55 or higher out of thin air.
The observed correlation of r=0.68 is statistically significant, because it’s higher than the 95% null of 0.62, but it’s not much higher. In other words, while there does seem to be some true relationship between the brain and behavior variables here, it is almost certainly much weaker than it appears.
(Yu et al. in their paper also carried out a comparison of depressed patients to healthy controls, which does not rely on CCA, and which I’m not discussing here.)
So what is the use of CCA, if it is guaranteed to overfit the data? Well, it can be useful so long as you have two (or more) independent datasets, allowing you to test the validity of the CCA model, derived from one dataset, in another. The CCA would be overfitted to the first dataset, but by testing it in the second dataset, we can know how much of the correlation is real.
Unfortunately, Yu et al. is not the only paper to adopt a single-sample CCA approach. A well-cited paper Smith et al. (2015) in Nature Neuroscience, which Yu et al. refer to several times, did the same thing. (I blogged about it at the time, rather un-skeptically).
Smith et al. compared brain functional connectivity to behaviour and lifestyle variables, and found a mode of CCA variation with a spectacularly strong correlation of r=0.8723. But the 95% significance threshold under the permuted null hypothesis turned out to be an almost-as-spectacular r=0.84! So, just as with Yu et al., the observed result was significant, but only slightly better than CCA produced by chance alone.
In fact, Smith et al. went on to test the validity of the CCA by running CCA for 80% of the dataset (‘training set’) and testing it in the remaining left-out 20%. This is a kind of rough-and-ready approximation of using a second dataset. Smith et al. found that the correlation in the left-out data was r=0.25 – a much more modest result, although still something.
I would say that this kind of train/test analysis should be a bare minimum in any neuroscience CCA paper. I suspect that if it were applied in Yu et al.’s case the correlation would be small.
A Swedish company called Emotra make a device to detect someone’s risk of suicide based on measuring the body’s autonomic responses to certain sounds. It’s called EDOR®.
I’ve been blogging about this machine for the past 18 months (1, 2, 3) because such a product, if it worked, would be very important. It could help save countless lives. Unfortunately, I don’t think EDOR® has been proven to be effective. As I’ve argued in my previous posts, the evidence just isn’t there yet.