A computer analysis of symbols inscribed on stone tablets and artifacts more than 4,000 years ago has prompted a new debate on a fiercely contested question: Did the people of the Indus Valley civilization have a written language? According to the researchers who conducted the latest analysis, the answer is yes, and the next step is to search for the grammatical rules governing the language. But other researchers have harsh words for the methods used in the study. “As they say: garbage in, garbage out,” [New Scientist], one critic says.
The Indus civilisation flourished in isolation 4,500 years ago along the border of what is now eastern Pakistan, but almost no historical information exists about the people and their long-lost community. Archaeologists working in the region have unearthed a rich hoard of artifacts, including amulets, seals and ceramic tablets, many of which are embellished with the unusual symbols [The Guardian]. But some researchers contend that the symbols are simply religious or political imagery, and that they don’t add up to a language. They note that most of the inscriptions are extremely short (averaging only four or five symbols), and that few symbols are used repeatedly.
For the new study, which will be published in Science, computer scientist Rajesh Rao and his colleagues used pattern-analyzing software to first analyze a collection of languages, including Sanskrit, ancient Sumerian, and modern English. They then examined other information systems, including a computer programming language and the sequence of DNA. The analysis used what is called “conditional entropy”. When aimed at language, this statistical technique comes up with a measure for the “orderedness” of words, letters or characters – from totally ordered to utterly random [New Scientist]. Rao’s team found that the computer programming language was highly ordered (to avoid ambiguity in commands), the DNA sequence was very random, and that spoken languages fell in the middle.
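To make the idea of “orderedness” concrete, here is a minimal Python sketch of one common way to estimate conditional entropy: count adjacent symbol pairs and measure how uncertain the next symbol is once the current one is known. This is only an illustration, not the software Rao’s team used; the function name `conditional_entropy` and the toy sequences are assumptions made up for this example.

```python
import math
from collections import Counter


def conditional_entropy(sequences):
    """Estimate H(next symbol | current symbol) in bits from adjacent pairs.

    `sequences` is an iterable of strings or lists of symbols.
    Illustrative estimator only; not the method from the study.
    """
    pair_counts = Counter()   # how often symbol b directly follows symbol a
    first_counts = Counter()  # how often symbol a is followed by anything
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            pair_counts[(a, b)] += 1
            first_counts[a] += 1

    total = sum(pair_counts.values())
    if total == 0:
        return 0.0  # no adjacent pairs, so nothing to measure

    entropy = 0.0
    for (a, b), n in pair_counts.items():
        p_pair = n / total                # P(a, b)
        p_next_given_a = n / first_counts[a]  # P(b | a)
        entropy -= p_pair * math.log2(p_next_given_a)
    return entropy


# Toy data (made up): a rigid cycle versus a sequence in which
# every symbol is equally likely to be followed by either symbol.
rigid = ["ABCABCABC"]
loose = ["AABBA"]

print(conditional_entropy(rigid))  # fully predictable successor
print(conditional_entropy(loose))  # one bit of uncertainty per step
```

A value of zero means each symbol fully determines the next one; higher values mean the next symbol is harder to predict. With inscriptions as short as four or five signs, an estimate like this rests on very few pairs, which is one reason the choice of comparison data matters so much in this debate.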
When they next seeded the program with fragments of Indus script, it returned with grammatical rules based on patterns of symbol arrangement. These proved to be moderately ordered, just like spoken languages…. [A]ccording to Rao, this early analysis provides a foundation for a more comprehensive understanding of Indus script grammar, and ultimately its meaning. “The next step is to create a grammar from the data that we have” [Wired], he says.
But researchers on the other side of the argument say that comparing the inscriptions on the Indus tablets to a small handful of languages and other information systems doesn’t provide nearly enough information to reach an informed conclusion, and argue that Rao’s team has just impressed its audience with a fancy computer trick. “There’s zero chance the Indus valley is literate. Zero,” says Steve Farmer, … who authored a 2004 paper with two academics with the goading title “The Collapse of the Indus Script Thesis: The myth of a literate Harappan civilization” [New Scientist].
Related Content:
DISCOVER: Writing on the Half Shell asks if writing was invented in Asia instead of the Middle East
DISCOVER: Writing Right explores why written and spoken languages can differ so dramatically
Image: J.M. Kenoyer/Harappa.com

April 24th, 2009 at 12:13 am
I tend to believe that the ancients were a lot smarter than we give them credit for. Case in point, my oft-brought-up Kerkythea mechanism, or the chromed swords of the Terracotta Army.
April 24th, 2009 at 3:53 am
to be more credible, the paper should have included/compared proven non-scripts (sign based inscriptions) in the study. Without these it is not worth giving serious consideration.
April 24th, 2009 at 9:30 am
It is interesting to see a fresh trial for a solution to an old problem. If the method could be improved, well and good. It is good to have a few proven non-scripts in such a study. But endless refinement cannot be undertaken initially either. Moreover, we can do it still. But there is no point in criticizing any effort as unscientifically as saying that there is zero chance for Indus Valley people to have been literate. No scientist who has studied statistics will come up with such a stance unless she herself lived at the time in question, in which case, as per her hypothesis anyway, she would not be literate!! But Farmer can safely bet on his own opinion. Can’t others do the same with a greater truth?
Funny indeed are the ways of the world, even the literate one.
April 24th, 2009 at 10:45 am
*sigh* science and skeptics always claim to know 100% of the facts until they prove themselves wrong. I’m not saying they were literate, but c’mon, how do you just dismiss the possibility. as nick says, they were a lot smarter than we credit them, and this is proven time and again, especially of late.
not to mention, being open minded in fields like science is the only way to truly solve the mysteries of the world. if you just write something off without proving or disproving the facts surrounding it, no matter how far fetched they may seem, how are you ever going to advance?
April 24th, 2009 at 11:34 am
In an undergraduate paper I wrote on Mohenjo-Daro, oh so many years ago, I recall such Indus Valley scripts being associated with trade ledgers, not unlike in various Maya codices where ideograms were used to track trade goods. I didn’t see this mentioned anywhere above. The archeological record at Mohenjo-Daro suggests social hierarchy, craft specialization, water engineering, trade etc. More credit indeed – busy, sophisticated folks. It doesn’t seem reasonable to dismiss with such certainty that Indus Valley ideograms lack syntax – instead, it requires further investigation.
April 25th, 2009 at 12:43 pm
“Experts” had also said flying, going to space are impossible, or nobody would want to watch TV instead of listening to radio, and so on and on… It never ends! ^_^
April 25th, 2009 at 6:10 pm
Why disagreement?
Language was never invented! It developed from simple signs necessary for indirect communication! Its origin being trail reading!
After all, when one can read a trail, someone has already written it!
Purposely added clues were the next development.
The writing and reading of the clues is basic grammar as the order is all important!
The teaching of track reading is basic language teaching.
Back then, a grammatical error could very well have spelled “death”!
regards aquatic thinking.
April 25th, 2009 at 8:16 pm
It seems to me that the two arguments against the Indus script thesis used in the article, short inscriptions and few symbols used repeatedly, are really reaching for something to disprove the theory. I believe that the length of a script is irrelevant to whether or not it is a language. Also it stands to reason that in short scripts there will be few if any repeating symbols, since it is a short script. An open mind is required when dealing with the unknown, and jumping to conclusions in a scientific inquiry should be avoided. Such a definitive statement saying that there is “zero chance” of literacy in the ancient Indus valley makes me suspect the motivations behind the statement, and shows a closed mind. I would not be surprised if competition for research money was a motivating factor, or simple jealousy, in this debate.
April 26th, 2009 at 5:44 pm
Kerkythea mechanism?
You mean the antikythera mechanism?
Stop saying Kerkythea; that’s a graphics rendering system for modern computers.
April 28th, 2009 at 3:17 am
Hi Folks!
I’m afraid some of you have missed the point. Have you read the paper by Farmer, Sproat and Witzel? You can find it here. Moreover, I strongly recommend reading their ‘refutation of the refutation’ BEFORE you make any dismissive conclusions, because what they say is really important here. Also see the following link for the maths behind: .
In short, what Farmer, Sproat & Witzel say is that using FAKE data sets and statistical information that has never been shown to be of any relevance to deciding whether a given system is of linguistic or non-linguistic origin is simply pointless and doesn’t prove anything.
It is quite natural that the Hindutva nationalist movement dislikes the idea of an illiterate society. We should all realize, however, that there’s nothing bad in accepting this. On the contrary, it may open new horizons in comparative work, since non-linguistic symbol systems are as important for the study of the human past as any other archeological data. They carry important information which should be, is and will be studied carefully – no need to worry…
Believe me, everyone, including Farmer, Sproat and Witzel, would be happy if the Indus ‘script’ were a linguistic SCRIPT, indeed, because that would give us an extraordinary source of information about the culture. However, if the ‘script’ isn’t a SCRIPT, it is necessary to stop wasting time with ‘deciphering’ or ‘translating’ efforts and start to investigate the symbols in a totally different way in order to extract as much useful information from them as possible…
June 26th, 2009 at 6:30 pm
@Petusek:
Feeding known fake data into the program tests the program’s false positive rate, so a reliable measure of the PPV of the test can be gained. Then, when the program returns a yes for the test script, its reliability can be trusted, and discussed.
It’s not testing the script, it’s testing the program.
July 26th, 2010 at 2:09 am
As Iravatham Mahadevan points out, archaeological evidence makes it inconceivable that Indus Valley Civilization’s large, well-administered and sophisticated trading society would have functioned without effective long-distance communication. Unlike the clay tablets of Mesopotamia, no written records were discovered from Indus Valley sites except seals. The people of IVC might have written on cotton cloth, leaves, bark or hide which would have decayed by now, leaving no trace.
September 6th, 2010 at 6:54 pm
I’ve never bought the argument that the Indus Valley civilization must have been literate just because their neighbors were. Sometimes folks decide to be different from their neighbors. The Chinese had illiterate neighbors, once upon a time. The Egyptians had illiterate neighbors, once upon a time. Why couldn’t the Mesopotamians have had illiterate neighbors in South Asia? The civilizations of Central America weren’t all literate. The Mayans had writing but their neighbors — the Aztecs — were illiterate.
The problem with using conditional entropy to establish “grammar” is that it doesn’t establish grammar’s existence. It only shows that there’s order. But people tend to order things that aren’t linguistic. Navaho sand paintings are chock full of symbols that have order. But the symbols aren’t linguistic and the order isn’t based on syntax or grammar.
If Rao et al had used real data on Vinca symbols and symbols on kudurru, they might have discovered this fact for themselves. But they decided that Vinca symbols had no order and that kudurru symbols had absolutely rigid order. So they didn’t bother to look at any real symbols of either type. So those data sets are totally invented, based on their assumptions about what they would have seen if they had looked at the data — which they did not do.
They later explained that this didn’t mean anything because this invented data only served as a control, showing the limits of what’s possible. But a real control isn’t supposed to be an invention, and made-up data doesn’t show what’s possible, only what it’s possible to imagine.
As for those mysterious perishable written records that Harappans are supposed to have written, be they on cloth, leaves, bark, or hide, we can’t talk about them. They’ve perished. All we have are the imperishable, short messages. They’re all as short as the shortest tags from the earliest period of Egyptian hieroglyphs, the proto-cuneiform cylinder seals — not the economic tablets — and a few fragments of turtle plastrons (oracle bones) from China. It’s just not much to go on. I think Farmer et al have made a good case….
September 17th, 2010 at 5:01 am
@Petusek @Diwiyana
If it did not have a linguistic component, it qualifies as proto-literacy, not illiteracy. There is every reason to believe it had a linguistic component: it co-evolved with the scripts of the Middle East. If it did have a linguistic component, it qualifies for full literacy, since in that case it will be possible to write anything. Try it out! The Etruscan script is generally found in short texts. We can’t rule out the possibility of longer inscriptions existing; as a matter of fact, it is alphabetic. People do what makes sense most of the time. The Indus was multilingual, and a non-linguistic system (with a linguistic component), as Farmer, Witzel and Sproat agree, would have made sense. Most inscriptions are short because they were meant to be read by people speaking different languages, which means that the system comprised a very small linguistic component, if any. Now come to the Dholavira signboard. It probably represented the name of a place. This inscription had sign repetition, unlike others. This kind of dual usage would qualify it for full literacy and point to longer texts: at least administrative texts, if not literary ones.
September 24th, 2010 at 2:16 am
Please find the response by Steve Farmer. He is happy that India is no longer represented in a new book. Then why do they have to be Indologists? Let them resign. This is not an isolated instance. This happens with them all the time.
Then let’s follow Farmer’s advice and close down the Indology department. It doesn’t deserve to exist as long as the Farmer-Witzel duo are around.
Let Harvard shine in all fields be it in Sinology or Indology, but let it be with a different team!!
re: [Indo-Eurasia] BOOKS: Visible Language
This book is not actually out yet, but when it is, it will be available for sale
as well as for download free of charge at:
http://oi.uchicago.edu/research/pubs/catalog/oimp/
Some teasers from the exhibition installation are appearing on facebook at
-Chuck Jones-
—- Original message —-
Steve Farmer wrote:
> New book out from the Oriental Institute, passed on
> from the Agade List.
>
> Note how the so-called “Indus script” — which is
> certainly not a “script” as linguists view that term — is
> slowly but surely disappearing from the world of international
> scholarship. About time, and I’m happy with Michael and Richard
> to have started that process.
>
> Steve
September 30th, 2010 at 9:08 am
Please read the 200 page research on the Indus script by Michael Korvink!
It adopts a statistic positional approach and is very methodical and it is a wonderful piece of work.
Even if it did not have a linguistic component, it qualifies for proto-literacy, not literacy. Even Vinca symbols, which were much simpler, constitute proto-literacy. I am sure the Indus script had a linguistic component.
The Harappans had a unique interpretation of literacy.
No other civilization of its time mass produced writing (4500 years ago). No other civilization of its time had a signboard. One tenth of a town has produced thousands of seals.
Can I declare the Indus to be the most literate of ancient civilizations?
‘Deliberate’ wrong usage of terms is already causing havoc in schools and colleges. History is taught in schools all over the world. This is sheer irresponsibility.
We hate those who misrepresent history. He is using Harvard resources for this purpose.
A man in the street can be excused. A Harvard University professor cannot.
Witzel et al. haven’t said anything new really. It is just a lot of hot air and only goes to show the depths to which science has sunk and that a new generation of Western scholars needs to take Indology forward. There is no other way. There are many names I can suggest.
October 3rd, 2010 at 1:33 am
@Petusek @Diwiyana
There are better terminologies available in the business than ‘symbols’. Please research.
Please research the history of writing systems. The conclusions you come to will be somewhat different.
March 18th, 2011 at 1:50 am
My paper ‘The reconfirmation and reinforcement of the Indus script thesis’ was published in a scientific journal recently.
This shows why longer texts certainly existed in the Indus and why the Indus script was logo-syllabic. This is a complete refutation of Farmer’s thesis and refutes Sproat’s smoking gun completely. Sproat’s smoking gun is a complete non-starter. If Farmer disagrees, he has to reply to me point by point.
Sujay Rao Mandavilli
http://www.scribd.com/doc/46387240/Sujay-Indus-Script-Final-Version-Final-Final
June 12th, 2011 at 8:10 am
Few sensible scholars will be able to deny that the Indus script was a logo-syllabic script, although the seals may have been non-linguistic. Facts about the Dholavira signboard:
(a) It is one of the most famous of Harappan inscriptions.
(b) It was very large in size.
(c) It was located in Dholavira, one of the furthest sites from Mesopotamia.
(d) It hung over the citadel there.
(e) It must have represented the name of the place and must have been closely tied to speech: note the sign repetition.
(f) The sign which was used as a determinative was a very common Indus sign.
(g) The sign used as a determinative appears to have been also similar to determinatives in other writing systems.
(h) The Indus script was also related to Proto-Elamite, which means it probably had a linguistic component.
(i) The other signs with which the determinative was used were also common Indus signs.
(j) Few sensible scholars will now dispute the fact that the Indus script was a logo-syllabic script on the basis of this evidence.
(k) Few sensible scholars will deny the fact that speech encoding was one of the major functions of the Indus script and that this feature had reached a very precocious maturity.
(l) This inscription was apparently more closely tied to speech than most Proto-Elamite inscriptions.
(m) Dholavira was not even the most important of sites.
(n) The fact that it was hung over the citadel meant it was meant to be read by elites.
(o) It was put to the most frivolous use.
(p) Speech encoding would have been a prized possession: no one would have used it just for a decorative signboard at far-from-Mesopotamia Dholavira. Why would a man have inscribed this if nobody else could read it, and why would he have learnt to encode speech only to inscribe this signboard? This automatically implies the existence of longer texts. It also shows that the Indus elites used more complex forms of communication.
(q) Even if we assume that speech-encoding was added in Mature Harappan 3B, this logic would still hold good.
(r) This logic is already accepted by mainstream Indus archaeologists as a precursor to the existence of longer texts.
Please refer to the book by Jane McIntosh (McIntosh 2008, p. 374): “The Harappans did not create monumental art or architecture on which such inscriptions may have been written. The nearest that the Harappans came to this is the Dholavira signboard which is quite possibly the tip of the iceberg of a now vanished public inscriptions. Farmer’s arguments fail to account convincingly for the structural regularities that analyses have revealed in the use of Harappan signs. These strongly seem to support the hypothesis that the Indus script represents a writing system”
June 14th, 2011 at 12:10 pm
British archaeologist Jane McIntosh demolishes the non-script thesis
“Farmer also draws attention to the absence of long Harappan inscriptions on potsherds. If the Harappan signs were a script, he contends, this absence would make it unique among the scripts of literate cultures, who all used potsherds often like scrap paper. This need only imply, however, that the Harappans had other media that were easier to scribble on, such as cotton cloth or wooden boards, or that the writing medium was not well suited for use on sherds. Likewise the absence of long monumental inscriptions seems significant to Farmer, but the Harappans did not create monumental art or architecture on which such inscriptions might have been written; the nearest they came to this is the Dholavira signboard, which is quite possibly the tip of an iceberg of a now vanished public inscriptions.”
“He (Farmer) also considers that the proportion of singleton and rare signs is unusually high; other scholars such as Parpola (2005) demonstrate that this is not so, since in general logo-syllabic scripts contain a small corpus of frequently used signs and a large number of much less common ones. Moreover, new signs are continuously added, even when the writing system is a fully developed one, something Farmer also denies. Statistically the Harappan script does not differ significantly in its sign proportions from other logographic scripts. A further point regarding the singletons is that Wells (n.d.) has demonstrated that many are variants or ligatures of basic signs, rather than completely different signs; again, this is something to be expected in a genuine script”
“Perhaps more significantly, the brevity of the majority of the Harappan texts (four to five signs on average) makes it less likely that signs would repeat within them than it is in the longer texts with which Farmer compares them” (McIntosh 2008, p. 374).
“Farmer’s arguments fail to account convincingly for the structural regularities that analyses have revealed in the use of the Harappan signs; these seem strongly to support the hypothesis that the Harappan signs represent a writing system. The theory put forward by Farmer and his collaborators has not been widely accepted, but it has been valuable in compelling scholars to look afresh at their assumptions about the script and in provoking a stimulating debate from which a deeper understanding of the script should emerge” (McIntosh 2008, p. 374).