In his 1950 paper “Computing Machinery and Intelligence,” Alan Turing proposed what is now known in artificial intelligence as the Turing test. The idea is that if you cannot tell a computer apart from a human when both are answering your questions via a keyboard and screen, then the computer should count as intelligent.
There are many problems with this idea, but it remains a compelling benchmark, and one that has yet to be reached. But think of the following variation: rather than have the computer and the human answer any old question, the questions have to be similar to what you would expect on the TV quiz show Jeopardy! – clues about trivia in the form of answers, for which you must come up with the question.
Even this greatly restricted version of the Turing test is very challenging, but IBM’s machine called “Watson” has recently made intriguing steps toward passing it. Watson takes any Jeopardy!-type question and gives a response. It was not developed as a new type of intelligence test, but instead as a grand challenge to beat a human at a language-based task, like a Deep Blue of language (IBM’s Deep Blue chess-playing computer beat the world chess champion in 1997). You can challenge it yourself here. It currently uses a fixed set of millions of documents and a sophisticated parallelized statistical algorithm running on a supercomputer. Because it is parallelized, the algorithm can try out a large number of possible interpretations of the question at once and pick the most likely one.
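To make the "try many interpretations at once, keep the most likely" idea concrete, here is a toy sketch in Python. It is emphatically not Watson's actual algorithm: the three-sentence document list, the word-overlap `score` function, and the `best_interpretation` helper are all hypothetical stand-ins invented for illustration, and a thread pool on a laptop is only a rough analogy for a parallelized algorithm on a supercomputer.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the fixed document collection (Watson's is in the millions).
DOCUMENTS = [
    "A neti pot is a small vessel used to rinse the nasal passages with saline.",
    "Deep Blue was a chess computer built by IBM that beat the world champion in 1997.",
    "Alan Turing wrote Computing Machinery and Intelligence in 1950.",
]


def score(interpretation):
    """Crude evidence score: the largest word overlap with any single document."""
    words = set(interpretation.lower().split())
    return max(
        len(words & set(doc.lower().replace(".", "").split()))
        for doc in DOCUMENTS
    )


def best_interpretation(interpretations):
    """Score every candidate reading concurrently and return the top one with its score."""
    with ThreadPoolExecutor() as pool:
        scores = list(pool.map(score, interpretations))
    best = max(range(len(interpretations)), key=scores.__getitem__)
    return interpretations[best], scores[best]


if __name__ == "__main__":
    candidates = [
        "chess computer built by IBM",
        "pot for cooking soup",
        "Turing machine tape",
    ]
    print(best_interpretation(candidates))
```

With these made-up candidates, the first reading wins because it shares the most words with a single document. A real system would presumably replace the word-overlap count with a far richer statistical model, but the overall shape (many candidate readings, scored independently, highest score wins) is the part the paragraph above describes.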
One of the problems with the original Turing test, which the Jeopardy test also suffers from, is that only forms of intelligence expressible through language are eligible for testing. It’s perfectly consistent, for example, for a robot to pass the Turing test, but then fall over when it tries to take its first step. Walking is a skill, and skills are learned by a different part of the brain than facts are, such as what a neti pot is. Memory for skills is called “procedural memory,” while “declarative memory” is the kind of memory for facts – and many studies within neuroscience have shown that these two forms of memory live in different places in the brain.
But just as we can identify what sorts of intelligence would fall through the cracks of the Turing test, it’s also interesting to think about the kinds of intelligence a Jeopardy test could test for, and where it might fail relative to the Turing test.
For instance, while a machine that passes the Turing test should convincingly answer questions about its first kiss, Watson would be pretty stumped on that one. Presumably, a machine that convincingly describes what its first kiss was like is being masterfully deceptive (at least for the first few models!) – but nonetheless, describing “first-person” states like the emotions that went with your first kiss is quite complex, and faking this convincingly is hard. In this way, we can see that the Jeopardy test is less demanding than the Turing test. But that’s good – the Jeopardy test is a big challenge as it stands, and it will be a remarkable breakthrough if Watson ultimately succeeds when it goes head to head against a human, as it is expected to sometime this fall. Having more attainable tests leading up to the Turing test is very helpful to researchers in artificial intelligence.
While we know not to expect Watson to get the answers to questions regarding feelings right, there are also some simple facts we can’t expect it to know. For instance, a computer that passes the Turing test must be able to give convincing answers to random biographical questions like “Where did you get your first report card?”, but giving Watson the Jeopardy! form of this question (“where I got my first report card”) should stop it in its tracks.
Can you think of other examples of things that would pass the Turing test but leave Watson with smoke coming out of its ears? Leave your thoughts in the comments.