Mice, Math and Drugs: On Science without Understanding

By Neuroskeptic | January 13, 2009 10:45 pm

The latest issue of Neuropsychopharmacology is chock full of goodies – not only one of the first ever controlled trials of medical marijuana, but also a surprise gem from an American-Israeli collaboration, called A Data Mining Approach to In Vivo Classification of Psychopharmacological Drugs. Yet despite being an excellent paper, it raises some worrying questions about what is and isn’t science.

In a nutshell, the authors sought to discover a way of efficiently determining what a drug does. There are several broad classes of psychoactive drugs, such as stimulants, e.g. cocaine, and opioids, e.g. morphine. If you want to find out whether an unknown drug has opioid-like painkilling effects, for example, you have to test for them specifically – e.g. by measuring how the drug alters a mouse’s pain threshold in a test called the Hot Plate test (guess what that involves.) If you want to test whether the same compound has antidepressant effects, you would have to do a different test entirely, like the Porsolt test. And so on.

The authors tried – and claim to have succeeded – to find a way of detecting the effects of drugs in a single, simple test. The test involved putting a mouse onto an empty circular platform (an “open field”) and just allowing it to run around for an hour. A camera records the movements of the mouse, and a computer analyzes the video to give the mouse’s position every 1/30th of a second. The result is a series of numbers showing the path which the mouse took around the area.

The clever bit follows: from this path data, one can derive various other numbers – for example, the mouse’s velocity, acceleration, and direction of movement relative to the wall of the platform, at any given point in time. An hour of a mouse’s life can be broken down into a veritable mountain of data (especially since there are 30 frames per second x 60 seconds x 60 minutes = 108,000 time points).
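The post doesn’t reproduce the paper’s exact attribute definitions, but the basic move – turning 30 Hz position samples into kinematic attributes like speed, acceleration, and jerk – can be sketched with finite differences (the function name and details below are my own, not the paper’s):

```python
import numpy as np

def path_attributes(xy, fps=30):
    """Derive per-timepoint movement attributes from a tracked path.

    xy: (N, 2) array of positions (cm), sampled at `fps` Hz.
    Returns speed, acceleration and jerk, each of length N,
    computed by finite differences (np.gradient).
    """
    dt = 1.0 / fps
    vel = np.gradient(xy, dt, axis=0)        # (N, 2) velocity vectors
    speed = np.linalg.norm(vel, axis=1)      # cm/s
    accel = np.gradient(speed, dt)           # cm/s^2: rate of change of speed
    jerk = np.gradient(accel, dt)            # cm/s^3: rate of change of acceleration
    return speed, accel, jerk

# A toy mouse moving at a steady 10 cm/s along the x-axis for one second:
t = np.arange(0, 1, 1 / 30)
xy = np.column_stack([10 * t, np.zeros_like(t)])
speed, accel, jerk = path_attributes(xy)
# speed is ~10 throughout; acceleration and jerk are ~0
```

In the paper’s scheme, each of these continuous attributes is then discretised into bins before pattern counting.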

The authors then used a technique called data mining to discover patterns in this data which could be useful in discovering drugs. Data mining is nothing complicated – it essentially means taking a lot of data and searching all of it for anything interesting. In this case, they injected mice with various doses of various drugs from three different classes – stimulants, opioids, and “psychotomimetics” such as phencyclidine (angel dust) and ketamine. They recorded the mice’s movement over the course of an hour and analyzed it to get 10 numbers (“attributes”) at each of the 108,000 time points. They then considered combinations of up to 4 different attributes simultaneously in a procedure they call (and have no doubt patented as) “Pattern Array Analysis”.

The single-attribute pattern coded P{*,*,3,*,*,*,*,*,*,*} is defined only by the third bin (40-60 cm/s) of the third attribute (speed), ie the animal is moving moderately fast… as more attributes are added to the definition of a pattern it becomes more and more specific, eg the four-attribute pattern P{*,*,1,2,*,1,5,*,*,*} means moving very slowly while slightly decelerating in the direction of the arena wall but turning sharply away from it.
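Mechanically, a pattern like this is just a template with wildcards: it fixes the bin for some attributes and ignores the rest, and its “frequency” is how many timepoints match it. A minimal sketch (my own toy implementation, not the paper’s code) might look like:

```python
# Hypothetical sketch of pattern matching: each timepoint is a tuple of
# 10 bin indices; a pattern fixes some positions and wildcards the rest.
WILDCARD = None

def matches(pattern, binned_timepoint):
    """True if the timepoint's bins agree with every non-wildcard slot."""
    return all(p is WILDCARD or p == b
               for p, b in zip(pattern, binned_timepoint))

def pattern_frequency(pattern, binned_series):
    """How many timepoints in the session match this pattern."""
    return sum(matches(pattern, tp) for tp in binned_series)

# Two toy timepoints, each with 10 binned attributes:
series = [(1, 2, 3, 1, 4, 1, 5, 2, 2, 1),
          (1, 2, 2, 1, 4, 1, 5, 2, 2, 1)]

# P{*,*,3,*,*,*,*,*,*,*}: third attribute in bin 3, everything else free.
p = (None, None, 3, None, None, None, None, None, None, None)
count = pattern_frequency(p, series)   # only the first timepoint matches
```

Enumerating all combinations of up to 4 fixed attributes over the binned data is what produces the 73,042 candidate patterns.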

They then took every one of this huge number of possible “behaviour patterns” (there were 73,042), measured how many times each mouse did each one over the course of the hour, and worked out which patterns became more or less common after giving each of the different drugs. They ended up with this:
This is a plot with 73,042 dots on it. Each dot represents a pattern of mouse movement behaviour. Dots further to the right represent behaviours which are more common, while dots higher up represent behaviours whose frequency differs most significantly between mice given opioids and mice given other drugs (or no drugs). Most of the dots are low down the plot, showing that the opioids had little effect on them. But the dot with an arrow pointing at it represents a behaviour which is both common and much, much less common in mice injected with opioids; in fact the p value of the difference is below 0.00000000000001 (that’s 13 zeroes).
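The post doesn’t spell out the paper’s test statistic, but to see how a frequency difference can reach such an extreme p value with 108,000 timepoints per session, here is a toy two-proportion z-test (normal approximation; the counts and numbers are invented for illustration, not taken from the paper):

```python
import math

def two_prop_z_p(k1, n1, k2, n2):
    """Two-sided p value for a difference in proportions k1/n1 vs k2/n2,
    using the pooled normal approximation (fine for counts this large)."""
    p1, p2 = k1 / n1, k2 / n2
    pooled = (k1 + k2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return math.erfc(abs(z) / math.sqrt(2))

# Toy counts: the pattern occurs at 2% of opioid timepoints vs 10% of
# control timepoints, out of 108,000 timepoints each.
p = two_prop_z_p(2160, 108000, 10800, 108000)
# Even a Bonferroni cut of 0.05 / 73,042 tests is cleared comfortably.
significant = p < 0.05 / 73042
```

With samples this large, even modest proportion differences produce vanishingly small p values, which is why a Bonferroni correction over all 73,042 patterns is still survivable.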

What is this behaviour? It’s P{*,*,*,*,4,*,*,*,*,*} (‘moderately positive jerk’), meaning that the mouse’s acceleration was increasing at a certain point in time (for those who know calculus: the second derivative of speed was positive & quite high). So, give a mouse morphine, and you can be pretty sure that its acceleration won’t be increasing very often. Hmm. A similar procedure was performed for the other two classes of drugs.

Now, what on earth does that mean? Why do opioids suppress the ‘moderately positive jerk’? No-one knows – and the odd thing is that we don’t need to know. Once we’ve identified the pattern of behaviour to look for, we can use it to determine whether drugs have opioid-like activity, even if we haven’t got any idea why it works. And it does work – the authors report that by looking for the right behaviours, they could successfully classify a range of other drugs, including a couple of mystery drugs whose identity the person running the experiment didn’t know. This plot shows the success rate; the three classes of drugs are in different colours, and they clearly occupy three distinct regions of the “space”, the two dimensions of which are the frequencies of two different patterns of behaviour.

Overall, this is a very impressive paper, and the practical implications are potentially very great – soon, it might be possible to tell what effects a newly designed drug has, all in a single mouse test. This could greatly speed up, and reduce the cost of, drug discovery. For drug companies, it could be very useful indeed.
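The classification step – placing a drug in the 2-D “pattern space” and reading off its class – can be sketched as a toy nearest-centroid rule. The centroids and numbers below are invented for illustration; the paper’s actual classifier may well differ:

```python
import math

def nearest_class(point, centroids):
    """Assign a drug to the class whose centroid is nearest in pattern space."""
    return min(centroids, key=lambda c: math.dist(point, centroids[c]))

# Toy centroids: each class occupies a region of the 2-D space whose axes
# are the frequencies of two diagnostic behaviour patterns.
centroids = {
    "opioid":          (0.02, 0.30),
    "stimulant":       (0.25, 0.05),
    "psychotomimetic": (0.10, 0.15),
}

# A mystery drug whose two pattern frequencies land near the opioid cluster:
label = nearest_class((0.03, 0.28), centroids)
```

The point of the plot is precisely that the three clusters are well separated, so even a rule this crude would classify new drugs correctly.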

But is it “science”? This paper doesn’t really add to our understanding of the world – all it does is tell us that a seriously obscure aspect of mouse movement, ‘moderately positive jerk’, is altered by opioids. This is a potentially useful fact, especially if you’re a drug company, but it’s a completely uninterpretable one – it doesn’t help us to explain, or understand, anything about mice, or opioids, or anything. It’s not a theory or a hypothesis, and it will probably never give rise to one. It’s just an isolated, brute fact. This is the kind of “science” that the most hard-core logical positivist would be happy with.

And this kind of thing is becoming popular in neuroscience. Essentially similar techniques are becoming widely used in fMRI data analysis. Here’s a diagram from another paper from 2007 reporting on a method of using genetic algorithms to data-mine MEG data (a way of recording changes in the magnetic field surrounding the brain) to discover patterns which could be used to diagnose various neurological and psychiatric illnesses. It works:
It’s an elegant technique and it’s a nice result. But again, no-one has any idea what this diagram really “means”, and almost certainly no-one ever will. The fact that the schizophrenia patients and the Alzheimer’s disease patients occupy different areas of this imaginary 2D “space”, defined by two complex variables somehow derived from a huge mountain of numbers, is potentially useful if you want to diagnose a disease, but it tells you absolutely nothing about that disease. It’s like going to a witch-doctor and asking if someone is ill; she’s always right, but if you ask her how she knows, she just says “By magic”.

Data mining’s cool, but when it’s done like this, it’s not science…

Neri Kafkafi, Daniel Yekutieli, Greg I. Elmer (2008). A Data Mining Approach to In Vivo Classification of Psychopharmacological Drugs. Neuropsychopharmacology, 34(3), 607-623. DOI: 10.1038/npp.2008.103

Apostolos P. Georgopoulos, Elissaios Karageorgiou, Arthur C. Leuthold, Scott M. Lewis, Joshua K. Lynch, Aurelio A. Alonso, Zaheer Aslam, Adam F. Carpenter, Angeliki Georgopoulos, Laura S. Hemmy, Ioannis G. Koutlas, Frederick J. P. Langheim, J. Riley McCarten, Susan E. McPherson, José V. Pardo, Patricia J. Pardo, Gareth J. Parry, Susan J. Rottunda, Barbara M. Segal, Scott R. Sponheim, John J. Stanwyck, Massoud Stephane, Joseph J. Westermeyer (2007). Synchronous neural interactions assessed by magnetoencephalography: a functional biomarker for brain disorders. Journal of Neural Engineering, 4(4), 349-355. DOI: 10.1088/1741-2560/4/4/001

  • Anonymous

    Isn’t gathering a load of data and then analyzing it to see what pops out poor science? I was under the impression you need to work from a hypothesis. Underwood, and many other biostatisticians, have condemned the data mining approach when it is used in ecology (sort of a “tag first, ask questions later” approach). Would the same restrictions apply to data mining in this field, or are there different statistical parameters/requirements for the drugged mouse study which render Underwood et al.’s objections not relevant? – Daniel J. Andrews

  • BioinfoTools

    I’m an independent computational biology consultant (available for hire, etc.!) and ‘data mining’ is a term that gets bandied around in my field. I’m cautious about over-reaching when using it myself for a variety of reasons, one of which you have identified: it’s common not to be able to identify the underlying reasons a correlation exists (and hence answer the ‘why’ question in your examples), but only to be able to state that a correlation exists. This doesn’t itself say that data mining is useless, but rather that used on its own, its (direct) applications can be limited. It can be useful in diagnostics, as you say, because there the application may not be interested in how a correlation arises, but in the fact that it exists (reliably, hopefully!). Another use is to throw up useful starting points to provide an initial research focus. If you want to identify genes that cause (not: are associated with) a cancer and you start with, say, the RNA expression data for all human genes in the cancer cell line in question, you might be quite happy to locate, say, 30 candidates that are more likely to contain genes that play a role in the cancer in question. It won’t say why those genes are involved, or for that matter if they are only “bystanders” that are guilty by association as it were, but it will provide a starting point. (This particular application is helped by the fact that genes are generally part of pathways, but I’d better stop explaining this at some point…!) With that in mind, the usefulness of data mining depends on the application. I prefer a model-based approach myself, given a choice. But then my focus is on understanding biological mechanisms, especially with a structural biology angle to it (where you can work from first principles in a sense), and I like to do a lot of reading of the relevant biological literature associated with my work. Data mining is easier. Or at least looks that way from my side of the fence…

  • http://www.blogger.com/profile/06647064768789308157 Neuroskeptic

    Daniel – Well, as I said, it’s arguably not science at all. That said, what they did seems statistically OK – they applied a conservative Bonferroni correction for the 70,000+ multiple comparisons, and they then verified that their predictors worked by testing them on novel drugs. So I’m convinced by what they did, although to be really sure that it worked I’d want to see an independent lab replicate the same finding. Bioinfotools – Thanks for the comments. I know that many genetic association studies are essentially “data mining”, and certainly it can be used as a starting point for further research. But in this case I don’t think it will be, because who cares about how mice move? In fact it might be possible to find out why this effect occurs, although given the obscure nature of the effect it would be tricky – but I don’t see anyone asking that question…

  • http://www.blogger.com/profile/10454137692287849653 Daivd

    This comment has been removed by the author.

  • http://abb3w.livejournal.com/ abb3w

    My usual shtick: Science refers to the process of gathering evidence, forming conjectures about the evidence, developing a formal hypothesis which indicates how the current evidence may be described under the conjecture, competitive testing of all candidate hypotheses under a formal criterion for probable correctness, plus the body of hypotheses testing best thereby, which thereafter are referred to as “Theories”. This may be viewed as a means of hypothesis generation. So, in that sense: yes. How does the hypothesis stand up against others in describing this data? Well, probably pretty well… which is a start. How does it stand up at describing other data? That may indicate whether the hypothesis generated may one day grow up to be a Theory. One would want to test against further data before considering it particularly interesting.

  • http://scienceblogs.com/neurotopia scicurious

    Hi! Long-time lurker here, but being as I do a hell of a lot of mouse behavior work… I have an issue with the results of the test. To me, finding that locomotor data point is interesting, but I’m not sure that its widespread application would work very well for positive drug IDs. I say this from my personal experience both with locomotor activity and with the Porsolt test. The Porsolt test is usually used to identify new classes of antidepressant, even though swimming has little to do with psychiatric results. The Porsolt test is immediate, while antidepressants commonly don’t work until several weeks of treatment. All that the test can do is provide data that allows you to class drugs in a certain way, but it doesn’t really do a lot to determine whether or not those drugs will be effective in the clinic. For example, fluoxetine (Prozac) has positive Porsolt test results. So does cocaine. No one is going to say that these two drugs work in remotely the same way, or that both drugs can be used for the same purpose in the clinic (though I’ve been told that cocaine is a hell of an antidepressant for the 20 minutes you’re on it). Thus, that one point they found that is apparently common to opioids will help identify similar opioid compounds, but it may not be selective enough. It may identify opioids, but not ones that are useful in the clinic. A locomotor test is very different from a hot plate test, and is testing entirely different parameters. Some would say that, because locomotor tests involve the dopamine system in many ways, what you’re measuring is actually a specific effect of opioids on the dopamine system, rather than an effect of opioids themselves. This would mean that, though you could perhaps identify a drug class, it wouldn’t determine whether that drug would be at all useful in the clinic. OTOH, I really need to go read this article myself fully. Thanks for pointing it out! I clearly need to read my TOCs more closely.

  • http://www.blogger.com/profile/06647064768789308157 Neuroskeptic

    abb3w: It could be used for hypothesis generation, but the “moderately positive jerk” seems so remote from anything else we know about mouse behaviour that I’m not sure what the hypothesis would be – trying to understand how opioids affect this behaviour is going to be hard, and I suspect not many people will try. scicurious: hi! You’re right, of course – we don’t know whether this test would identify useful opioids or not. But in theory, if it didn’t, you might be able to refine the test until it did – just find the movement pattern that corresponds to “useful opioids” vs “other opioids”. That’s the (potentially) very useful thing about this technique.

  • http://www.blogger.com/profile/08773931739173488968 Ryan Morehead

    Seems like this study uses the “Google approach,” outlined by Chris Anderson here: http://www.wired.com/science/discoveries/magazine/16-07/pb_theory. Google uses crazy complex algorithms to know how to lay out webpages and how to connect content. Why the stuff works is irrelevant; that it works is critical. Another way of thinking about it is ‘cheating’ in AI. Instead of creating a neural network or some type of program that models the human mind, and then accomplishing some human task with that, they just use brute computing power to achieve what they want. Deep Blue, the computer that beat Kasparov by doing a million zillion calculations a second, is a good example of that. The point is, they’re working smarter, not harder, to achieve an intended goal. If it is going to take years and years to sift through certain classes of drugs, or genes, just to find candidates for further study, it’s a damn good idea to data mine. Is it science? No, but you’ve hit the nail on the head when you say it’s practical. Edge also had a lot of scientists respond to Chris Anderson’s article, if you’re interested: http://edge.org/discourse/the_end_of_theory.html. Suffice it to say, data mining and Google are not the end of theory.

  • Anonymous

    Hi, I’m a first-time reader and not a scientist in any way, just someone interested in reading about science. Please forgive me if I say anything inaccurate. Nobody would argue with the fact that data mining doesn’t provide any insight. But neither does the Hot Plate Test. And in this case, data mining is just that – a substitute for the Hot Plate Test and a host of other tests. Whether it’s science is really beside the point. After all, a thing’s value to society is how USEFUL it is, not how SCIENTIFIC it is. Best, Nick




About Neuroskeptic

Neuroskeptic is a British neuroscientist who takes a skeptical look at his own field, and beyond. His blog offers a look at the latest developments in neuroscience, psychiatry and psychology through a critical lens.
