In a neat example of Internet-enabled “crowdsourcing,” the method of distributing a large task to many contributors, researchers are using an anti-spam program to get people to decipher damaged or faded texts, one word at a time. Chances are that if you’ve solved one of those distorted-word tests to secure an account with Facebook, Craigslist, or Ticketmaster, you’ve helped The New York Times inch a little closer to digitizing its entire print newspaper archive from 1851 to 1980 [CNET].
The program, known as reCAPTCHA, is widely used to ensure that humans, rather than spam bots, are commenting on blogs (including some of DISCOVER’s) and signing up for free email accounts. “More web sites are adopting reCAPTCHAs each day, so the rate of transcription keeps growing,” said [lead researcher Luis] von Ahn. “More than 4 million words are being transcribed every day. It would take more than 1,500 people working 40 hours a week at a rate of 60 words a minute to match our weekly output” [Telegraph]. The service is available for free to any site.
Von Ahn’s lab uses two different optical character recognition (OCR) programs to scan an old book or newspaper article and convert it into a digital, searchable file. When the two programs disagree on the reading of a word, that word is added to the reCAPTCHA database and used as part of an anti-spam puzzle. According to a report published in the journal Science [subscription required], humans decipher such words with 99 percent accuracy.
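For readers who want to see the idea in concrete terms, here is a minimal sketch of the disagreement check described above. It is not the lab's actual code: the function name, the sample words, and the assumption that both OCR outputs arrive already aligned word for word are all made up for illustration.

```python
def find_disputed_words(ocr_a, ocr_b):
    """Compare two OCR readings of the same text, word by word.

    Hypothetical helper: assumes both lists are already aligned, so
    position i in ocr_a and ocr_b refers to the same printed word.
    Returns (position, reading_a, reading_b) for every disagreement.
    """
    if len(ocr_a) != len(ocr_b):
        raise ValueError("this sketch assumes word-aligned OCR outputs")
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(ocr_a, ocr_b))
        if a != b
    ]


# Invented sample: two OCR passes over the same faded line of print.
reading_a = "The quick brovvn fox".split()
reading_b = "The quick brown fox".split()

# Words both programs agree on are accepted as-is; the rest would be
# handed to human readers as puzzle words.
disputed = find_disputed_words(reading_a, reading_b)
puzzle_queue = [position for position, _, _ in disputed]
```

In this toy case only the third word is disputed, so it alone would go into the pool of puzzle words. Real scans rarely line up this neatly, so a production system would need an alignment step before any comparison.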
In 2000, von Ahn helped invent the first “CAPTCHA,” which stands for “Completely Automated Public Turing test to tell Computers and Humans Apart,” with a nod to the early computer scientist Alan Turing. The new reCAPTCHA cleverly slips a useful task into what has already become a mundane Internet activity. Says von Ahn: “We are demonstrating that we can take human effort — human processing power — that would otherwise be wasted and redirect it to accomplish tasks that computers cannot yet solve” [Wired News].
Last year DISCOVER saw how humans could act as “artificial artificial intelligence” on Amazon Mechanical Turk, another fine example of crowdsourcing.