Reading without understanding: baboons can tell real English words from fake ones

By Ed Yong | April 12, 2012 2:00 pm

‘Wasp’ is an English word, but ‘telk’ is not. You and I know this because we speak English. But in a French laboratory, six baboons have also learned to tell the difference between genuine English words, and nonsense ones. They can sort their wasps from their telks, even though they have no idea that the former means a stinging insect and the latter means nothing. They don’t understand the language, but can ‘read’ nonetheless.

At its most basic level, reading is about recognising patterns. We look at letters (or other symbols) and identify them based on their number, position and angles of lines. This is a trivial task, and one that doesn’t require any language. Letters are no different to any other object in our environment that we can recognise. A pigeon can be trained to discriminate between letters.

The next step is harder. We unite letters into words by looking at their positions relative to one another. This is called “orthographic processing”. It’s the stage where, according to general consensus, language kicks in. As we see clusters of letters, we think about the sounds they represent and we read the word aloud in our heads. But Jonathan Grainger from Aix-Marseille University has shown that orthographic processing can happen without any knowledge of language, or how words are meant to sound.

Grainger trained baboons to recognise English words, and tell them apart from very similar nonsense words. The monkeys learned quickly, and could even categorise words they had never seen before. They weren’t anglophiles by any stretch. Instead, their abilities suggest that the act of reading words is just a more advanced version of the pattern-recognition skill that lets us identify letters. It’s a skill that was there long before the first human had scrawled the first letter.

The baboons live in a unique facility designed by Joel Fagot, where they can volunteer for experiments. Their enclosures included touch-screens that would flash either a real four-letter English word, like ‘done’, ‘land’ or ‘vast’, or a non-word, like ‘dran’, ‘lons’ or ‘virt’. Each baboon had to categorise the words and non-words by touching one of two shapes. If it got the right answer, it earned a tasty reward. Unlike many similar experiments, the animals decided when they wanted to take part.

None of the six baboons had seen words or letters before. But over a month and a half, and thousands of trials, all of them learned to distinguish words from non-words with around 75 per cent accuracy (50 per cent would be pure guesswork). The most successful of them – Dan – built up a vocabulary of 308 words.

Their achievement is remarkable, not least because the non-words were very similar to the actual ones. Rather than obvious fakes like ‘qzxc’, they all contained pairs of letters that occur in real words, although they veered towards rarer combinations. And the monkeys weren’t just memorising the words. They were still more likely to pick a set of letters they had never seen before, if it was an actual English word.

Grainger thinks that the baboons learned to tell the real words from the fakes by using the frequencies of letter combinations within them. They learned which combinations were most likely to be found in real words, and made their choices accordingly. They had gleaned the stats of English, without any knowledge of the language itself.
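To make the idea concrete, here is a minimal sketch of how letter-pair statistics alone could separate likely words from unlikely ones. It is an illustration only, not the analysis used in the study: the tiny training list, the function names and the scoring rule (the average count of a string's letter pairs) are all assumptions made for this example.

```python
from collections import Counter


def bigrams(word):
    """Return the adjacent letter pairs in a word, e.g. 'wasp' -> ['wa', 'as', 'sp']."""
    return [word[i:i + 2] for i in range(len(word) - 1)]


def bigram_counts(words):
    """Count how often each letter pair appears across a list of words."""
    counts = Counter()
    for word in words:
        counts.update(bigrams(word))
    return counts


def score(word, counts):
    """Average training count of the word's letter pairs (unseen pairs count as 0)."""
    pairs = bigrams(word)
    return sum(counts[pair] for pair in pairs) / len(pairs)


# Hypothetical toy "vocabulary" standing in for the words a baboon has learned.
training_words = ["done", "land", "vast", "wasp", "band", "hand", "last", "fast"]
counts = bigram_counts(training_words)

# 'sand' is not in the training list, but it is built from familiar pairs;
# 'telk' is built from pairs the list never contains.
print(score("sand", counts))  # 2.0
print(score("telk", counts))  # 0.0
```

In this toy setup, a string never seen before still scores well if it is built from common pairs, which is the kind of generalisation the baboons showed. A realistic version would need a far larger word list, and the study's non-words were deliberately built from pairs that do occur in real words, so the separation there is much less clean than it is here.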

Stanislas Dehaene, one of the leading figures in the science of reading, thinks that the study is “extraordinarily exciting”. He says, “It fits very nicely with my own research, which suggests that reading relies, in part, on learning the purely visual statistics of letters and their combinations.”

But Noah Gray, a neuroscience editor at Nature, says that ‘Dan’ may have skewed the results by performing exceptionally well. “The ‘animal genius’ effect is big,” he said on Twitter. “Remove ‘Dan’ & effect isn’t nearly as impressive.”

Grainger’s study also suggests that the primate brain was already pre-adapted to process printed words. When we invented writing systems, we co-opted ancient neural circuits that help primates to recognise patterns. This shouldn’t be surprising. Written language is only around 5,000 years old, and millions of people today still cannot read. We can, however, develop that ability very quickly. In the 19th century, when the Cherokee of North America finally invented a writing system for their spoken language, they started learning and using it within a single generation.

Dehaene has shown that one part of the brain, of the many that activate when we read, selectively buzzes in response to written characters, rather than other sights or spoken words. This region is known as the left visual word form area (VWFA), and Dehaene now wants to see if Grainger’s baboons activate the equivalent area when they discriminate between real and fake English words.

By recording the activity of individual neurons in the brains of baboons while they look at words, Dehaene thinks it will be possible to “examine the neural code for written words”. He suspects that we will find “bigram neurons”, which are tuned to a specific combination of two letters, such as ‘EN’. “I can’t wait to see if this prediction holds up,” he says.

Reference: Grainger, Dufau, Montant, Ziegler & Fagot. 2012. Orthographic Processing in Baboons (Papio papio). Science

Updated: to include Noah Gray comments

Photo by Joel Fagot; images from Science/AAAS



Comments (19)

  1. Glen

    As a layman I find the information published so far on this matter totally inadequate for the conclusion made. Have we ruled out vowel combinations, for instance? Would “Dan” pick out YARN in a comparison with MAIRE? How about KIWI vs BOAMS? I bet MAIRE & BOAMS win over 75% of the time.

  2. “… If they got the right answer, they earned a tasty reward. Unlike many similar experiments, the animals decided when they wanted to take part.”

    I see a minor glitch in this experiment. Is it possible that somehow the baboons, after thousands of trials, remember the pattern (word) that benefits them?

  3. @Sulhan – “And the monkeys weren’t just memorising the words. They were still more likely to pick a set of letters they had never seen before, if it was an actual English word.”

  4. Georg

    “”The next step is harder. We unite letters into words by looking at their positions relative to one another. This is called “orthographic processing”. “”

    Many people read by recognizing words as a unique picture.

  5. This is fascinating and exciting, also humbling. Whenever I read language experiments with apes, it raises ethical questions. I know that this is about pattern recognition, but still…if combined with teaching language, could apes learn to read?

  6. I wondered whether this could be repeated with any language, such as Russian or Greek. I suppose those are fairly similar to English in terms of having letters ‘for’ A, B, G, D sounds etc. Then I wondered whether this would work in hieroglyphic / pictographic languages…

    Also what type / font did they use in this study and does that make a difference 😉

  7. I (of Horus)

    Exciting indeed.
    This throws up so many questions, can’t wait to hear what answers this research will come up with

  8. Old Geezer

    I’m pretty sure it wasn’t Comic Sans

  9. When you consider the complexity of recognizing edible plants in a diverse, ever-changing natural environment, which all primates (except ourselves) learn to do from a young age, recognizing linear patterns in 26 letters doesn’t seem like much of a feat. Plants are constantly changing shape and colour, and yet with training they are easy to recognize from seed to nearly decayed, and every stage in between. Letters, on the other hand, don’t change much. Technically, reading and writing are really quite simple. Perhaps that’s why we didn’t bother to invent them for so long.

  10. Is there hyperlexia in baboons? Give Dan a dictionary.

  11. Thanks for this, Ed, I heard a report yesterday on NPR which I found really confusing and not-quite-right. Your report is much clearer.

    The NPR report claimed that

    ….shows baboons are able to pick up the first step in reading — identifying recurring patterns and determining which four-letter combinations are words and which are just gobbledygook.

    But that’s not the first step in reading. Two of the first steps are: twigging to the idea that specific symbols represent specific sounds (the alphabetic principle); and realizing that words break down into individual sounds (phonemic awareness ). It is likely that these two steps are intertwined.

    Recognizing words as a unit ("sight words"), which is what the baboons did, comes much later in the process of learning to read, and is essential to both comprehension and fluency (which, again, are intertwined). However, comprehension also has to do with the reader's underlying grasp of oral language.

    A simple example will suffice: if you have never seen so much as a picture of a giraffe, you might be able to sound the word out, but have no idea what it means.

  12. Hi Jo! As far as I’ve been able to find in the research, fonts really don’t make that much of a difference, no matter what that group in Holland claims.

  13. I have a comment in moderation about some of the inaccurate claims made by other reports on this experiment.

    I fail to see what the baboons’ performance has to do with dyslexia or “how we teach children to read” as the news reports have claimed. The baboons took “50,000 trials for each animal” to get to “least 81 words at an accuracy rate of about 75 percent”.

    The animals were only discriminating words from non-words, not the meaning of words. As far as I can see, the experiment didn’t address whether the baboons could distinguish among (for example) cab, cap, can, nab, and nap. Yet a reader has to make just those distinctions in order to accurately read (derive correct meaning) from text.

    For a child reading after (say) 3rd grade, 75% accuracy is a disaster. A reader’s frustration level is when she can recognize 93% or fewer of the words presented.

  14. deepak

    I’m a lil confused. Are they trying to say that there is some rule to what words can exist in English? a word that might seem weird to us today might be perfectly fine if we use it long enough. virt could totally mean something and as such I don’t see whats so fundamentally different between virt and dirt. but the baboons magically know that humans don’t use virt just yet?

  15. dan

    i’m also flooded with ambivalent feelings – though the fact that i’m flooded already testifies for being truly impressed. perhaps i really have problems understanding:
    what really confuses me is the recognition of ENGLISH words and dismissal of ‘nonsensical’ ones; phonetically/orthographically some of these nonsensical words are similar to words of other languages, i.e. what would happen if the baboons were confronted with simple 4-letter words of other languages – many languages feature words composed very different from english, lining up consonants no english word ever would. for instance the czech word ‘tvrz’: what would the baboons make of it? would they judge it ‘nonsensical’?
    in other words, teaching the baboons to recognize english words in specific might mean that they indeed recognize an orthographic ‘fingerprint’, which would indeed be pretty impressive. on the other hand, as stated, if it’s basically pattern-recognition what Sulhan [pattern-recognize for treats] suggests might be the case, which would take some of the wind: pattern-recognition is not the same as word-recognition. as georg points out people [in fact all, unless being afflicted with a degree of dyslexia] recognize words the way they recognize faces, as a whole, NOT by detail, and it’s an interesting matter as how one learns to recognize anything.
    the fact remains that the baboons COULD distinguish, which is fabulous enough. it might imply something else than stated above, though.

  16. When a baboon writes Oliver Twist, Hamlet, or the Declaration of Independence, I’ll be impressed.

    (or the jaded view, “Oh no, I knew those Planet of the Apes movies weren’t merely fiction!”)

  17. @Jo – Mark Changizi’s done a study comparing different scripts. Alphabetic languages tend to have an average of three strokes per character. So the same principles should apply across. Pictorial languages are obviously different.

    @Liz Ditz – I didn’t mention the dyslexia element because I thought it was a bit far-flung. But the basic point is that understanding how people actually read at a neurological level, might help us understand what happens when they can’t.

    @Deepak – Yes, there are rules, but not hard and fast ones. So some letter pairs are more likely than others – “er” is very common, but “xz” is less so. The closer the “nonsense words” were to actual English ones, the worse the baboons were at categorising them.

    @Dan – They’re not saying that non-English words are nonsensical. This isn’t a value judgement about different languages. Different languages will have different statistical patterns in their letter combinations. So for example, baboons trained on Czech words would be able to discriminate Czech words from non-Czech words.

    And again, this isn’t just simple pattern recognition. It’s a step further – it’s making statistical inferences from pattern recognition. THE CRITICAL BIT is that they could categorise words they had never seen before.

  18. HP

    Follow up study: Teach baboons to discriminate between legitimate English words and legitimate French words, by offering them a reward of either mushy peas or fresh baguette.

  19. Just popping in to add a link to the Language Log response to this study and the media coverage of it:

