Getting warm has a dramatic antidepressant effect, according to a new report published in the prestigious journal JAMA Psychiatry. But is it hot science or a hot mess?
The researchers, led by Clemens Janssen of the University of Wisconsin-Madison, studied 29 people with depression who were not receiving any other treatments. Half were randomized to receive whole-body hyperthermia (WBH), using a setup which raised their core body temperature to 38.5 degrees (37 degrees is normal).
The other half of the patients were the control group who received a ‘sham’ treatment. They spent time in the hot-box as well, but on a much lower setting: their core temperature only rose to 37.7 degrees. This mild treatment was intended to make the placebo patients think that they really had received an active treatment, to generate the same placebo effect in both groups.
What happened? Although both groups became less depressed following the treatment (which was just one session, lasting 2-3 hours), the active WBH group improved much more than the sham group on the Hamilton Depression Rating Scale (HDRS, also abbreviated HAMD) over subsequent weeks:
The difference between the two groups is dramatic. At week 1, the WBH group improved by 6 more HAMD points than the sham, and the Cohen’s d effect size is stated as 2.23. This is a spectacular Cohen’s d score – given that 0.5 is considered ‘medium’ and 0.8 is considered a ‘large’ effect! For comparison, the average antidepressant medication causes improvement of about 3-4 HAMD points vs. placebo, with a Cohen’s d of around 0.35.
Quite frankly I don’t believe that WBH has an effect size of 2.23 vs. sham. Has something gone wrong with the calculations?
Yes, it looks like it has. Look again at that figure – the error bars, indicating the data standard deviation, are tiny. I believe that in the figure, Janssen et al. have confused the standard error (SE) for the standard deviation (SD). The SE is always smaller than the SD. Indeed, in Table 2, we see the means and SD for these measures and it’s obvious that the SD is much higher than in the graph.
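To make the SE/SD relationship concrete, here is a minimal sketch of the conversion, assuming the usual definition of the standard error of the mean (SD divided by the square root of the sample size). The function names and the numbers are my own illustrations, not values taken from the paper.

```python
import math


def se_from_sd(sd: float, n: int) -> float:
    """Standard error of the mean, given a sample SD and sample size n."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return sd / math.sqrt(n)


def sd_from_se(se: float, n: int) -> float:
    """Recover the sample SD from a standard error of the mean."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return se * math.sqrt(n)


# Hypothetical values for illustration only (NOT from Janssen et al.).
example_sd = 6.0
example_n = 16
example_se = se_from_sd(example_sd, example_n)
print(f"SD = {example_sd}, n = {example_n}, SE = {example_se}")
```

With group sizes in the mid-teens, as in this trial, error bars drawn as SE would be roughly a quarter the length of error bars drawn as SD, which is why the two are easy to tell apart on a graph.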
Janssen et al. appear to have mixed up the SD and SE for the Cohen’s d calculation as well. Based on Table 2 I calculate the true Week 1 Cohen’s d as 1.35 in favor of WBH. This is still extraordinarily high, but not quite as spectacular as 2.23. I’m still not sure what to make of it.
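For readers who want to try this kind of check themselves, here is a rough sketch of Cohen's d computed from summary statistics, using the common pooled-SD formula. I am assuming that formula; other variants of d exist. The inputs below are made-up placeholders, not the Table 2 values, and the function names are my own.

```python
import math


def pooled_sd(sd1: float, n1: int, sd2: float, n2: int) -> float:
    """Pooled standard deviation of two independent groups."""
    return math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))


def cohens_d(mean1: float, sd1: float, n1: int,
             mean2: float, sd2: float, n2: int) -> float:
    """Cohen's d: difference in means divided by the pooled SD."""
    return (mean1 - mean2) / pooled_sd(sd1, n1, sd2, n2)


# Hypothetical summary statistics (NOT the values from Table 2).
n = 15
mean_active, mean_sham = 6.0, 0.0   # illustrative improvement scores
sd = 4.5                            # same illustrative SD in both groups

d_correct = cohens_d(mean_active, sd, n, mean_sham, sd, n)

# What happens if standard errors are plugged in where SDs belong.
se = sd / math.sqrt(n)
d_with_se = cohens_d(mean_active, se, n, mean_sham, se, n)

print(f"d using SD: {d_correct:.2f}")
print(f"d using SE by mistake: {d_with_se:.2f}")
print(f"inflation factor: {d_with_se / d_correct:.2f}")
```

In this equal-groups setup, plugging in SE instead of SD inflates d by the square root of the group size, which is nearly fourfold for groups of about 15. The reported 2.23 is only about 1.65 times my 1.35, so a straight SD/SE swap does not reproduce it under these assumptions.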
(Edit 16.5.2016: on further investigation I’m not sure that SD/SE confusion is the whole story. Figure 1 seemingly does show SE, rather than SD as the caption says – a typo? However, this can’t explain the d=2.23 result. Rather, it seems that the graph and the d=2.23 result might be the output of a statistical model – the hierarchical linear mixed-effects model Janssen et al. used – whereas Table 2 gives the raw data. Thanks to Jon Roiser for input on this point.)
Another issue here is the placebo effect. To their credit, Janssen et al. did go to some lengths to ensure that the sham treatment was a believable one (i.e. it did involve some body heating) in order to prevent unblinding. However, it’s not clear they were successful: “10 of 14 participants [71.4%] randomized to sham believed they had received active hyperthermia immediately on completion of the procedure (compared with 15 of 16 [93.8%] receiving active WBH).” Note the possible ceiling effect in the active group. So the active group may have been more likely to expect to improve.
I would have liked to see a more powerful sham condition. Why not tell participants that the study is of “changing body temperature to treat depression”, and then use body cooling as the control condition? This way, both groups would get a bona fide temperature modification.
As to how hyperthermia could have an antidepressant effect, Janssen et al. speculate that it’s all about the brain regions:
In humans, exposure to cutaneous heating (41°C) activates the mid orbitofrontal cortex, the pregenual anterior cingulate cortex, and the ventral striatum, with the degree of activation being associated with subjective pleasantness ratings made in response to the warm temperature. Importantly, these and other brain regions most implicated in registering – and reacting to – pleasant thermal signals show decreased activity in patients with MDD [major depressive disorder]… Based on these considerations, we conducted animal studies demonstrating that whole-body heating activated subdivisions of the dorsal raphe nucleus implicated in mood regulation and antidepressant-like responses, while not activating other dorsal raphe subregions, including those implicated in the facilitation of anxiety states…
h/t Bernard Carroll
Edit 19.5.2016: Chuck Raison, the senior author of the paper, has made the following response:
We did not mix up standard deviations and standard errors. As suspected in Dr. Carroll’s piece, the effect sizes we reported were those delivered by the linear mixed model used to evaluate differences between groups. The LMM delivered larger effect sizes than would be calculated from the raw data that we reported in Table 2 in the paper. For consistency, working with JAMA Psychiatry, we are amending the effect sizes as those that would be calculated from the raw data presented in Table 2. This reduces the effect sizes, but they remain large, as will be apparent in the modified paper.
The Figure in the online version inadvertently shows standard errors due to a mix-up in the editorial process. This will also be corrected. I think the important point to note here is that the effect sizes do not speak as much to the antidepressant power of active hyperthermia as to the fact that the change in depressive symptoms observed in the active group are not likely explained by a number of non-specific aspects of the intervention captured by the sham condition. The sham condition did not produce a very robust effect. The idea of a cold comparator is intriguing but has its own issues–the conditions would likely be quite different so it would be a bit of an apples to oranges comparison, and it may well be that hypothermia may also have some promise as an antidepressant.
Some of the other folks making comments on this study might be interested to know that in the first version of the paper we did in fact report stats on the secondary variables we reported in the supplemental file, but were asked to remove these on revision. In fact, we did see evidence that hyperthermia improved self-reported depressive symptoms at one week post-treatment and saw improvements in function and quality of life over the first couple of weeks that had dissipated by week 6. The important point about hyperthermia is not that chronic heat makes people feel better–it doesn’t generally–but that an acute exposure to intense heat may regulate thermoregulatory pathways linked to mood modulation.
Frankly we did not expect the effect of a single treatment to persist out to six weeks and we note that the sham group remained slightly improved at six weeks, so I am not convinced that this protracted effect is entirely explained by the active treatment. Rather examining the response curves it seems to me that there is an active effect apparent over the first two weeks post-treatment that then is maintained in the treatment group.
Janssen CW, Lowry CA, Mehl MR, Allen JJ, Kelly KL, Gartner DE, Medrano A, Begay TK, Rentscher K, White JJ, Fridman A, Roberts LJ, Robbins ML, Hanusch KU, Cole SP, & Raison CL (2016). Whole-Body Hyperthermia for the Treatment of Major Depressive Disorder: A Randomized Clinical Trial. JAMA Psychiatry PMID: 27172277