More Bad News For Voice “Lie Detection”

By Neuroskeptic | March 9, 2013 11:06 am

“Layered Voice Analysis” (LVA) is a controversial technology promoted as a tool for helping detect stress and other emotions by analysis of the human voice. According to the company behind the method, Nemesysco:

LVA technology enables better understanding of your suspect’s mental state and emotional makeup at a given moment by detecting the emotional cues in his or her speech. The technology identifies various types of stress levels, cognitive processes, and emotional reactions that are reflected in different properties of the voice… it provides the professional user easy access to truth verification in real time or from recorded data, during face to face and over the phone, during a free or structured investigation session.

Long-term Neuroskeptic readers will remember LVA and Nemesysco from way back in 2009. That was when I blogged about the company’s legal moves against two Swedish academics who had published a paper critical of LVA. That contentious article is still available online.

Now, a newly published study evaluated whether LVA is an effective truth verifying tool: The Accuracy of Auditors’ and Layered Voice Analysis (LVA) Operators’ Judgments of Truth and Deception During Police Questioning.

The authors, led by Michigan Professor Frank Horvath, studied 74 suspects who were interviewed by the Michigan State Police. Audio recordings of the interviews were made. Which of the suspects were being deceptive? Two investigators used LVA (after receiving the manufacturer’s recommended 40 hours of training) to try to judge deception from the recordings. Three other investigators just listened to the recordings and formed an opinion based on their own intuition and experience.

What’s a bit iffy is that Horvath and colleagues used the results of a conventional lie detector – the polygraph – as the ‘gold standard’ of truth. The results showed that the experts’ judgements of the truthfulness of the suspects agreed with the polygraph results more often than chance. By contrast, the authors report, LVA didn’t. This means that either the LVA doesn’t work, or the polygraph doesn’t. Or both. The trouble is that the accuracy of the polygraph is itself controversial, so I’m not sure what to make of this.

Luckily, though, there’s more. Of the 74 suspects, 18 of them claimed in the interview to be innocent but later admitted their guilt. So (barring false confessions) those 18 people certainly were lying in the interview. However, the LVA couldn’t detect this: on average, the two LVA operators got just 42% of them right. The experts, who didn’t use LVA and just relied on their intuition, managed to score 70% correct.

Although 18 is a small sample size, this is still evidence that, under realistic ‘field’ law enforcement conditions, LVA doesn’t work.

Horvath and colleagues also note another real-world study from 2008 showing that LVA couldn’t detect lying among prisoners. Prisoners were asked whether they’d taken drugs recently, and then got urine tested to discover whether they really had. LVA failed to detect deceptive answers any better than would be expected by chance.

So if these studies are right, the implications are worrying, given the widespread use of LVA for security purposes around the world. Just a couple of months ago, there was a minor scandal in the UK when a local politician resigned after leaking details of a local government’s plans to introduce LVA to catch liars over the phone.

Horvath F, McCloughan J, Weatherman D, & Slowik S (2013). The Accuracy of Auditors’ and Layered Voice Analysis (LVA) Operators’ Judgments of Truth and Deception During Police Questioning. Journal of Forensic Sciences. PMID: 23406506

  • Joseph Tan

    The drug-test study is much more convincing than the first study presented: if they’re going to evaluate accuracy and truth, they had better have an outcome that can be measured objectively. I wonder whether any non-field studies have been performed to evaluate LVA, since you can manipulate truthfulness in a lab setting. I’d hope they’d have to establish it in a non-field setting before moving it into field use, but maybe that’s not true.

    • Neuroskeptic

      True, but you can never prove (to scientific standards of proof) that someone committed a crime. There will always be an element of doubt… drug testing is exceptional in that it does provide ‘hard’ evidence. In the legal world, a confession is pretty much the best you’re going to get.

      I believe there have been lab studies, but the problem there is, do they apply to the field?

      So I think as field research goes, this study looks pretty good (the n=18 confessed bit.)

    • unity_ministry

      I’ll try and keep this as brief as possible.

      The key issue with polygraphs and reliability is that while they are capable of detecting physiological stress responses in test subjects, the leap from stress to deception is impossible to evaluate reliably and wide open to bias.

      LVA is different because it doesn’t produce meaningful information at all, it just butchers the speech signal and generates a stream of statistical noise.

      As far as independently conducted lab-based testing of LVA is concerned, the two studies you want to look at are Elkins (2009) ‘Evaluating the credibility assessment capability of vocal analysis software’ and Hollien and Harnsberger (2006) ‘Voice Stress Analyzer Instrumentation Evaluation’.

      Key points from these studies:

      1) Elkins found the system performed no better than chance but was able to extract ‘meaningful’ information from its raw output using logistic regression – in short he found patterns in the noise that appeared to be meaningful when he tortured the data post hoc by retrofitting the output to the correct answers.

      2) Hollien and Harnsberger also found the system performed no better than chance, but the most interesting aspect of their paper is appendix C, which details an email exchange between the researchers and the system’s US vendor in which the vendor asks the researchers to alter their predetermined protocols based on what are obviously their own attempts to simulate the research.

      This explains why the only studies to report positive findings are those conducted by Nemesysco and their associates – some of whom have been in the habit of failing to disclose their pecuniary interests in the system – or where researchers have relied entirely on Nemesysco to ‘calibrate’ the system for them.

      By any reasonable standards, what the company does to obtain positive findings amounts to research fraud.

      @Neuroskeptic – if you want copies of both papers, they’re uploaded to the Ministry, just look for the post on Capita and Local Authorities from December 2012. I would post the links, but I’m not sure how the spam catcher operates here in regards to multiple links in posts.

      • Neuroskeptic

        Thanks very much for the detailed comments…! Neuroskeptic readers should be aware that Unity’s blog is awesome and has much of interest about Nemesysco, and other topics.

  • Y.

    It’s shocking how easily you can get courts and law enforcement agencies to adopt these things with little to no scientific support.

  • Sean

    May as well buy a blood pressure cuff and plug it into the fax machine for all the good polygraphs or LVA are. Both machines rely on human interpretation, which will always be flawed and prone to bias.


  • Arthur Wulf White

    This information (provided in the article) is not scientific and is simply lacking. You cannot state that the machine got 42% of the 18 liars without stating how many of the suspects were deemed liars out of the entire group. If the machine tosses a coin and tells the user the suspect is lying 50% of the time no matter what, that’s one thing. If the machine said nearly no one lied in the entire group (~10%) but flagged 42% of the group of known liars, that might be phenomenal (depending on how many lied).

    The same is true of the experts. If they generally said ~70-80% of the entire suspect group were lying and also said 70% of the known liars were lying, that’s useless. You could say everyone out of the 74 is a liar and get 100% of the 18 liars. For the results to be meaningful and worth reading, you need to include the expectancy in a mixed group vs. the expectancy in a group of known liars.

    Also, it is important to note that a confession does not always mean the person is a liar. It is known that a rather alarming percentage of people admit to crimes they did not commit.
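
The base-rate point in this comment can be made concrete with a rough sketch. The 42% and 70% hit rates among the 18 confessed liars come from the post; the overall "deceptive" call rates are not reported there, so the values below are purely hypothetical, and the function name `lift_over_flag_rate` is made up for illustration. The idea is only to compare how often a judge flags known liars with how often the same judge flags suspects in general.

```python
def lift_over_flag_rate(hit_rate: float, flag_rate: float) -> float:
    """Ratio of the hit rate among known liars to the overall 'deceptive' call rate.

    A value near 1.0 means known liars are flagged about as often as anyone else,
    so the hit rate on its own carries little information.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    if not 0.0 < flag_rate <= 1.0:
        raise ValueError("flag_rate must be greater than 0 and at most 1")
    return hit_rate / flag_rate


# Hit rates among the 18 confessed liars, as quoted in the post.
HIT_RATES = {"LVA operators": 0.42, "expert listeners": 0.70}

# HYPOTHETICAL overall call rates: the post does not say how many of the
# 74 suspects each group judged deceptive, so these are assumptions.
HYPOTHETICAL_FLAG_RATES = [0.25, 0.50, 0.75]

for judge, hit_rate in HIT_RATES.items():
    for flag_rate in HYPOTHETICAL_FLAG_RATES:
        lift = lift_over_flag_rate(hit_rate, flag_rate)
        print(
            f"{judge}: calls {flag_rate:.0%} of all suspects deceptive, "
            f"catches {hit_rate:.0%} of known liars -> lift {lift:.2f}"
        )

# The degenerate case from the comment: call everyone a liar, catch every liar.
print("flag everyone:", lift_over_flag_rate(1.0, 1.0))
```

Under these assumptions the same 70% hit rate looks informative if the judge calls a quarter of suspects deceptive and uninformative if the judge calls three quarters of them deceptive, which is why the overall call rate would need to be reported alongside the hit rate.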





About Neuroskeptic

Neuroskeptic is a British neuroscientist who takes a skeptical look at his own field, and beyond. His blog offers a look at the latest developments in neuroscience, psychiatry and psychology through a critical lens.

