Via Mind Hacks, we learn about the case of Francisco Lacerda, a University of Stockholm academic who’s been threatened with legal action by the sinister-sounding Nemesysco company. Nemesysco sell software which, they claim, can detect deception and emotions by analyzing the sound of people’s voices – lie detection, in other words. (In fact it turns out that it can also be used to detect love, or at least, so they say – see below…)
The legal dispute surrounds a 2007 paper authored by Lacerda and Anders Eriksson, entitled Charlatanry in Forensic Speech Science: A Problem to be Taken Seriously. It was originally published in The International Journal of Speech, Language and the Law, but was taken down from the journal’s website following Nemesysco’s threats. However, the full text is still available on Scribd.
To be fair to Nemesysco, you can see why they took offence. The paper is unusually lively for an academic article. Here are some of the best bits:
Contrary to the claims of sophistication…the LVA [Nemesysco’s “Layered Voice Analysis” system] is a very simple program written in Visual Basic. The entire program code, published in the patent documents, comprises no more than 500 lines of code… there is really nothing in the program that requires any mathematical insights beyond very basic secondary school mathematics… we initially intended to use the code published in the patent documents to make a running copy of the program, but the code is rather messy and not particularly well structured and we decided it would not be worth the time and effort to clean up the code in order to convert it into a running program.
In fact, in parts the thing reads more like a blog post or an op-ed than a scientific paper – no bad thing, of course. Even Lacerda admits that “The article had a journalistic tone and was rather provocatively written. We wanted to prove that the technology behind the lie detector is a scam.” It’s also not entirely clear why Nemesysco, who claim no specific scientific credentials, are a fit subject for an academic journal. (Other voice analysis companies who mis-read scientific papers in support of their claims seem a more obvious target.)
Still, Eriksson and Lacerda make an excellent case against Nemesysco. They point out that, according to the patent documents, Nemesysco’s “LVA” system does nothing more than apply a simplistic analysis to the amplitude waveform of the speech, which involves counting the number of “thorns” (sharp peaks or troughs) and “plateaus” (flat bits).
As they point out, the number of these things will depend upon, among other factors, the quality of the audio recording and the digitization process: a better sound recording with a higher sampling rate (that is, more sample points along the waveform) will inevitably have more thorns and plateaus:
The number of thorns and plateaus…depends crucially on the sampling rate, amplitude resolution, and the threshold values defined in the program
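To make that dependence concrete, here is a minimal Python sketch. It is not Nemesysco's code: the function names count_thorns and count_plateaus, the thresholds, and the toy waveform are all my own assumptions, made up for illustration. A "thorn" is taken to be a sample that sits above or below both of its neighbours by more than a threshold, and a "plateau" a run of at least three near-equal samples.

```python
def count_thorns(samples, threshold=0):
    """Count samples that stick out above or below BOTH neighbours
    by more than `threshold` (an assumed definition of a "thorn")."""
    count = 0
    for a, b, c in zip(samples, samples[1:], samples[2:]):
        is_peak = b - a > threshold and b - c > threshold
        is_trough = a - b > threshold and c - b > threshold
        if is_peak or is_trough:
            count += 1
    return count


def count_plateaus(samples, tolerance=0, min_len=3):
    """Count runs of at least `min_len` samples in which each sample is
    within `tolerance` of the previous one (an assumed "plateau")."""
    count = 0
    run = 1
    for prev, cur in zip(samples, samples[1:]):
        if abs(cur - prev) <= tolerance:
            run += 1
        else:
            if run >= min_len:
                count += 1
            run = 1
    if run >= min_len:  # a plateau may end at the last sample
        count += 1
    return count


# A made-up "recording": integer amplitudes at successive sample points.
signal = [0, 3, 1, 4, 4, 4, 2, 5, 0, 0, 0, 1]

# The same sound at half the sampling rate (every second sample kept).
half_rate = signal[::2]

# The same sound at a coarser amplitude resolution (4 units per step).
coarse = [v // 4 for v in signal]

for name, wave in [("original", signal), ("half rate", half_rate), ("coarse", coarse)]:
    print(name, count_thorns(wave), count_plateaus(wave))
# original 4 2
# half rate 1 0
# coarse 2 3

# Same recording, stricter thorn threshold: the count changes again.
print(count_thorns(signal, threshold=2))  # 1
```

Nothing about the "speaker" changes between those runs. Only the sampling rate, the amplitude resolution and the threshold do, and the counts move with each of them.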
Even setting aside these issues, the fundamental point is that there is absolutely no reason to think that the number of thorns and plateaus in the speech waveform has any relation to whether someone is lying, under emotional stress, or whatever. This makes the LVA system even less plausible than the older “Voice Stress Analysis” (VSA) method of vocal lie detection, which Eriksson and Lacerda also discuss. There is at least some theoretical basis in physiology for that system, although a very shaky one. LVA doesn’t even have that – or at least none has been provided – so when Nemesysco claim that
The SENSE technology can detect the following emotional and cognitive states:
Excitement Level: Each of us becomes excited (or depressed) from time to time. SENSE compares the presence of the Micro-High-frequencies of each sample to the basic profile to measure the excitement level in each vocal segment.
Confusion Level: Is your subject sure about what he or she is saying? SENSE technology measures and compares the tiny delays in your subject’s voice to assess how certain he or she is.
Stress Level: Stress is physiologically defined as the body’s reaction to a threat, either by fighting the threat, or by fleeing. However, during a spoken conversation neither option may be available. The conflict caused by this dissonance affects the micro-low-frequencies in the voice during speech.
Thinking Level: How much is your subject trying to find answers? Might he or she be “inventing” stories?
S.O.S: (Say Or Stop) – Is your subject hesitating to tell you something?
Concentration Level: Extreme concentration might indicate deception.
Anticipation Level: Is your subject anticipating your responses according to what he or she is telling you?
Embarrassment Level: Is your subject feeling comfortable, or does he feel some level of embarrassment regarding what he or she is saying?
Arousal Level: What triggers arousal in the subject? Is he or she interested in you? Aroused by certain visuals? This new detection can be used both for personal use for issues of romance, or professionally for therapy relating to sex-offenders.
Deep Emotions: What long-standing emotions does your subject experience? Is he or she “excited” or “uncertain” in general?
SENSE’s “Deep” Technology: Is your subject thinking about a single topic when speaking, or are there several layers (i.e., background issues, something that may be bothering him or her, planning, etc.) SENSE technology can detect brain activity operating at a pre-conscious level.
and yet nowhere on their website is there any hint of evidence for any of this, skepticism is justified. Among other things, even if each of us has a vocal pattern associated with, say, arousal (not implausible), it’s unlikely that the same pattern would be present in the voices of men, women, people of different ages, and so forth. People just aren’t that alike, as any psychologist or neuroscientist knows. Even direct measures of brain activity during very simple cognitive tasks vary greatly between individuals. The chance that any kind of analysis of the voice could reveal such complex information about an individual without their compliance is remote.
Almost certainly, Nemesysco’s analysis provides no useful information about the speaker as such, but as Eriksson and Lacerda suggest, it probably “works” through two psychological mechanisms. Firstly, if someone believes that their voice is being analyzed, they may tend to be more truthful because they think that lies will be detected. Secondly, the user of the voice analysis is able to interpret the output – e.g. “speaker stressed, concentrating hard” – in terms of what they already know about the speaker. Anyone might be stressed and concentrating hard during almost any conversation, so it always “fits”.
Still, if you don’t believe me, and you want to try out LVA for yourself, you can – and you don’t have to be a cop or a spy. Nemesysco are now marketing their technology directly to consumers in the form of the Love Detector. The Love Detector is available as a Skype plug-in for just $29, and it allows you to know whether the object of your affections feels the same way about you, all from the sound of their voice.
Love Detector was originally designed with young singles in mind, or anyone searching for “the ONE”. If you are currently looking for love, starting to date someone, or just have that unmistakable feeling, and you want to make sure it’s mutual, Love Detector is the tool for you. If you are in a long-term relationship or even married, this version of Love Detector offers a “Relationship Selector” option designed to meet your needs as well.
There is even, apparently, a free online version. If the mood strikes, maybe I’ll try it out. Watch this space. And lock up your daughters (or at least unplug their microphones…)
Anders Eriksson, Francisco Lacerda (2008). Charlatanry in forensic speech science: A problem to be taken seriously. International Journal of Speech, Language and the Law, 14 (2). DOI: 10.1558/ijsll.2007.14.2.169