The Sandawe: after the demographic flood

By Razib Khan | April 9, 2011 10:21 pm

Over the past few days I’ve been trying to read a bit on the Sandawe. Most of the stuff I’ve been able to find is in the domain of linguistics, and is basically unintelligible to me in any substantive manner. The crux of the curiosity here is that the Sandawe, like their Hadza neighbors, have clicks in their language, and so have been classified with the Khoisan. Here’s some background:

The most promising candidate as a relative of Sandawe are the Khoe languages of Botswana and Namibia. Most of the putative cognates Greenberg (1976) gives as evidence for Sandawe being a Khoesan language in fact tie Sandawe to Khoe. Recently Gueldemann and Elderkin have strengthened that connection, with several dozen likely cognates, while casting doubts on other Khoisan connections. Although there are not enough similarities to reconstruct a Proto-Khoe-Sandawe language, there are enough to suggest that the connection is real.

I can’t speak to the validity of this at all, obviously. Some scholars do argue that the clicks in the Sandawe language were only acquired through interaction with peoples such as the Hadza, making an analogy to Xhosa, a Bantu language which has been strongly influenced by Khoi dialects. In any case, after having run ADMIXTURE a bunch of times on African population sets, and checked the genetic distances of the inferred ancestral populations, one thing that is clear is that the Sandawe don’t show a particularly close genetic relationship to the Bushmen, nor do they show a close relationship to the Hadza. In fact, the Hadza, Pygmies, and Bushmen show a closer relationship to each other, distant as it is, than to the Sandawe. The Sandawe themselves are distinct from their Bantu neighbors, but their connections seem clearer to the Masai and other peoples to the north.

Some of the anthropological stuff that I did find on the Sandawe not having to do with linguistics considered the issue of their status as hunter-gatherers, and their shift toward a form of agriculture within the past few centuries. Not surprisingly much of this literature consisted of ideologically shrill posturing, denouncing past scholarship for insensitivity and bigotry, while taking their own maximalist position. For example there has been the hypothesis that hunter-gatherer populations tend to be genetically and culturally isolated from agriculturalists, with several African groups used as exemplars. A group of anthropologists argue strenuously that this model may just be a construction of the biases of previous generations of scholars. But they offer little in the way of counterargument, more keen on uncovering the faults in the motives and methods of their predecessors than in building anything anew.

Genetics can help us a little here. Below are the results of ADMIXTURE and PCA I ran for a selection of populations. I pulled in some Behar et al. samples and merged them with the Henn et al. data set. The marker list was pruned down to ~160,000 SNPs. The limited selection of populations was deliberate, insofar as I was exploring specific questions about the relationship of East African populations to Eurasian ones. At K = 8 the populations in my data set separated rather well. Do not take this separation as evidence that this K reflects concrete ancestral populations. Here’s the bar plot:

Since I’ve been running this data set, with some modifications, for a week now, I can pick out some trends which I feel are robust at K = 8. For example, the Eurasian-like admixture you see across eastern Africa seems to be distinctively of a southern nature, centered on Arabia (probably Yemen). This makes total geographical sense. The Ethiopians and Somalis (I have some Somali samples which I threw in with the Ethiopians since the Cushitic Ethiopians seem more similar to the Somalis than to Semitic Ethiopians) lack Bantu genetic influence entirely. Rather, they have an affinity with the Nilo-Saharan peoples. Finally, the Sandawe tend to “break out” as a separate population only at higher K’s, generally clustering with the Nilo-Saharan element as long as possible.
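For readers who want to see what sits behind a bar plot like this: ADMIXTURE writes out a table of ancestry fractions (the Q file), one row per individual and one column per inferred component, with each row summing to 1. The sketch below is not my actual pipeline. It is a toy illustration with invented numbers and hypothetical population labels, showing how per-population averages and the "modal" component could be read off such a table.

```python
from collections import defaultdict


def mean_ancestry(q_rows, labels):
    """Average each ancestry column of a Q table within each population label."""
    sums = {}
    counts = defaultdict(int)
    for row, pop in zip(q_rows, labels):
        # Each individual's fractions across the K components should sum to 1.
        if abs(sum(row) - 1.0) > 1e-6:
            raise ValueError("each individual's fractions should sum to 1")
        if pop not in sums:
            sums[pop] = [0.0] * len(row)
        for k, frac in enumerate(row):
            sums[pop][k] += frac
        counts[pop] += 1
    return {pop: [s / counts[pop] for s in sums[pop]] for pop in sums}


def modal_component(fractions):
    """Index of the component with the largest average fraction."""
    return max(range(len(fractions)), key=lambda k: fractions[k])


# Toy K = 3 table; the numbers and labels are invented for illustration only.
q_rows = [
    [0.90, 0.05, 0.05],
    [0.80, 0.15, 0.05],
    [0.10, 0.70, 0.20],
    [0.20, 0.60, 0.20],
]
labels = ["PopA", "PopA", "PopB", "PopB"]

means = mean_ancestry(q_rows, labels)
for pop, fractions in means.items():
    print(pop, [round(f, 3) for f in fractions], "modal:", modal_component(fractions))
```

A population "breaking out" at a higher K simply means that, at that K, one column becomes the modal component for that population and for few others.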

Let’s also look at a PCA of the populations above on the first two principal components:

The PCA looks a little different from the ones you’re used to seeing because there are only West Eurasian and African groups in the sample. So the second component is not the familiar west-east axis in Eurasia, but the separation between the Mbuti and other Africans. On the far right of the plot you have Orcadians, then Druze, Saudis, and Yemenis. Then you have Horn of Africa populations, Ethiopians and Somalis along the vertical axis. Then Masai and Sandawe, and Luhya, a Kenyan Bantu group. The Masai are a confusing group. Even after removing problem individuals who might be related there tends to be a choppiness in the Masai results. The Sandawe on the other hand are more consistent by and large.
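If the mechanics of a genetic PCA are unfamiliar, here is a minimal sketch of the idea, assuming a tiny invented genotype matrix (rows are individuals, columns are SNPs coded as 0, 1, or 2 copies of an allele). It is not the software or data used for the plot above. It finds only the first component, using plain power iteration, and it skips the per-SNP variance scaling that dedicated tools commonly apply.

```python
import math


def center_columns(genotypes):
    """Subtract each SNP's mean allele count so every column has mean zero."""
    n, m = len(genotypes), len(genotypes[0])
    means = [sum(row[j] for row in genotypes) / n for j in range(m)]
    return [[row[j] - means[j] for j in range(m)] for row in genotypes]


def gram_matrix(centered):
    """Individual-by-individual similarity matrix, averaged over SNPs."""
    m = len(centered[0])
    return [[sum(a * b for a, b in zip(r1, r2)) / m for r2 in centered]
            for r1 in centered]


def first_component(matrix, iterations=200):
    """Power iteration: returns (eigenvalue, unit eigenvector) of the top axis."""
    n = len(matrix)
    vec = [float(i + 1) for i in range(n)]  # arbitrary non-constant start
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    eigenvalue = sum(
        vec[i] * sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)
    )
    return eigenvalue, vec


# Invented toy data: three individuals from group "A", three from group "B".
genotypes = [
    [2, 2, 2, 0, 0, 0],
    [2, 2, 1, 0, 0, 1],
    [2, 1, 2, 0, 1, 0],
    [0, 0, 0, 2, 2, 2],
    [0, 0, 1, 2, 2, 1],
    [0, 1, 0, 2, 1, 2],
]
labels = ["A", "A", "A", "B", "B", "B"]

gram = gram_matrix(center_columns(genotypes))
eigenvalue, pc1 = first_component(gram)
# Share of total variance captured by the first component.
explained = eigenvalue / sum(gram[i][i] for i in range(len(gram)))

for label, score in zip(labels, pc1):
    print(label, round(score, 3))
```

The sign of a component is arbitrary; what matters is that the two groups land on opposite sides of the axis. Which populations are in the sample determines what each axis ends up separating, which is why dropping East Asians changes the meaning of the second component.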

The genetic distances of the inferred ancestral groups aren’t too surprising. Here are MDS visualizations:

One of the consistent trends you see is that the Masai are closer to Eurasians than the Sandawe are, but the “Masai” modal ancestral component is no closer to Eurasians, and may even be further from them, than the “Sandawe” ancestral component. At higher K’s, once the “Sandawe” element partitions out, it is extremely dominant among the Sandawe, and found in lower fractions among other East African groups, especially non-Bantu ones such as the Masai. I wouldn’t put too much stock in the high proportion in the Ethiopians above, as the outcomes are rather scattered across the K’s and population combinations. The Masai are a population who always seem to have a low fraction of the Eurasian-like “Arabian” component, and this is what drags the population toward the Eurasians, as in the PCA above. The Sandawe seem to lack this admixture; rather, their affinity with Eurasians is deeper and may not be due to admixture at all (ADMIXTURE itself is not perfect, and may transform an admixed group into a “pure” component, as we can sometimes see among the Fulani, among South Asians, and, I suspect, among the Mozabites).
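A note on what "genetic distance between ancestral components" can mean in practice. One common choice is Fst computed from the allele frequencies estimated for each component. The sketch below assumes a simple two-population, equal-weight version of Fst (a ratio of summed heterozygosities) and uses invented frequencies for three hypothetical components. It is an illustration of the kind of distance matrix that an MDS plot summarizes, not the exact estimator behind the plots above.

```python
def fst(freqs_a, freqs_b):
    """Two-population Fst as (Ht - Hs) / Ht, summed over loci before dividing."""
    total_het = 0.0   # expected heterozygosity of the pooled population
    within_het = 0.0  # average expected heterozygosity within each population
    for p_a, p_b in zip(freqs_a, freqs_b):
        p_bar = (p_a + p_b) / 2
        total_het += 2 * p_bar * (1 - p_bar)
        within_het += (2 * p_a * (1 - p_a) + 2 * p_b * (1 - p_b)) / 2
    if total_het == 0:
        return 0.0  # no variation at any locus, so no measurable divergence
    return (total_het - within_het) / total_het


def distance_matrix(components):
    """Pairwise Fst between named components, in sorted name order."""
    names = sorted(components)
    return names, [[fst(components[a], components[b]) for b in names]
                   for a in names]


# Invented allele frequencies at four SNPs for three hypothetical components.
components = {
    "comp1": [0.10, 0.20, 0.90, 0.50],
    "comp2": [0.15, 0.25, 0.80, 0.50],
    "comp3": [0.90, 0.80, 0.10, 0.50],
}

names, dist = distance_matrix(components)
for name, row in zip(names, dist):
    print(name, [round(d, 4) for d in row])
```

The point to keep in mind is that this is a distance between inferred components, not between the sampled populations themselves, so a population can sit closer to Eurasians on a PCA while its modal component does not.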

Back to the Sandawe and their position in the history of East Africa. Unlike the Pygmies and Khoisan they are not basal in relation to other human lineages from what I can see here. That is, they don’t “split off” as early from the main cluster of branches in a phylogenetic tree of human populations. In fact, unlike the Pygmies and Khoisan, and like the Masai, they are closer to Eurasians than the West African or Bantu peoples. In other words, they’re less basal. In fact, the Sandawe may be closer to Eurasians than most of the Nilotic groups when recent admixture with Eurasians is removed from the picture.

I do not know if the Sandawe are indigenous to their region of Tanzania. If I had to bet money I’d say not, and that some scholarly suppositions for a northerly origin may be plausible based on the affinities with the Masai and even Cushitic and Semitic peoples of Ethiopia and Somalia. The distinctiveness of the Sandawe from their Bantu neighbors seems clear, and there is no special closeness to the Khoisan of Southern Africa. Many anthropologists and historians have pointed out that some groups can “revert” to hunting and gathering facultatively. But the total Bantu domination of much of East Africa suggests to me that this was not the case with the Sandawe. I think a plausible model is that the Sandawe were part of the substrate of East African hunter-gatherers who have mostly been eliminated and absorbed by the Bantu. In the north related peoples contributed to the emergent Nilo-Saharan and Ethiopian and Cushitic societies, which were able to avoid being swamped by the Bantu because of ecology and their own agricultural traditions. In this model the Sandawe affinities to Khoisan groups were more a matter of horizontal cultural borrowing and influence due to proximity than of a close genetic relationship.

CATEGORIZED UNDER: Genetics, Genomics, History
  • Ian

    I’m sure someone has proposed this before, but what about the idea that clicks actually are an ancestral feature that was lost in all other lineages, an ancient feature of human speech that was retained by just these two groups?

    I’m guessing this is not a novel idea. The Wikipedia article on “click consonants” answers my initial question about how diverse click sounds are. It would only take two losses of clicks (non-Africans and Bantu-speakers) to eliminate clicks from most human populations.

  • Ian

    I suppose scrolling down a little would have provided at least one possible answer to my question…

  • Eze

    According to C. Ehret there was a southbound migration of proto-Cushites from the Horn of Africa into the savanna areas south of Mount Kilimanjaro around 2,000 BC (2k years prior to the Bantu expansion in this region). This migration got groups like the Iraqw (South Cushitic speakers) into parts of Tanzania. The Sandawe could quite possibly have been a branch of this ancient migrant group that came into contact with Hadza-like people and reverted to a hunter-gatherer lifestyle. It’s a pity that the Henn et al. 2011 study didn’t sample the Iraqw people; they could have given us more insight into this.

  • John Emerson

    “Areal effects” are a contemporary topic in linguistics which was not mentioned when I studied linguistics 30+ years ago. Basically languages which neighbor on one another, with substantial bilingualism, start to borrow from one another. It’s a little like species swapping genes in bio; a language will acquire traits which were absent from its “ancestral” languages.

    The two areas I’ve seen this discussed are S China and SE Asia, and the area W of the Black Sea, including the Balkans. For example, Romanian is a Romance language, Hungarian is Ugric, Bulgarian is Slavic, and all of them have traded back and forth. Further south Greek and Albanian enter in, and further east Turkish is a factor.

    In SE Asia the language families are supposedly Chinese, Vietnamese etc., Mon-Khmer (Cambodian), Miao-Yao (Hmong-Mien), Tibeto-Burman, and Thai etc. Except for the Sino-Tibeto-Burman group, as I understand it, the ancient relationships between these languages are pretty much up in the air, in contrast to the Black Sea / Balkan area, and because of areal effects and a general lack of morphology these relationships are hard to determine. (Indo European morphology made the relationships *relatively* easy to figure out, another case of “low-hanging fruit”.)

    “Sprachbunde” is the key word:

  • Bob

    I thought I knew, at least basically, how to read those bar plots, but apparently I don’t. I’m confused by the Mbuti bar. It consists of *two* chunks that are not, at least to my eye, shared by any other groups in the plot. In my previous understanding, that should not have happened. Instead, they’d have a single color representing a linear combination of the two.

    Have I described my confusion enough that a helpful person might help me understand this better?

  • ohwilleke

    The idea that the Sandawe were a group that culturally borrowed language (and perhaps more) from the Hadza seems plausible.

  • Eurasian Sensation

    If the Sandawe are a remnant of a population of hunter-gatherers that once inhabited larger swathes of East Africa, I’d be interested to see if there are any significant genetic connections with Malagasy people. Since it is assumed that the Indonesians who settled in Madagascar began arriving around 2000 years ago, I would guess at that time the Bantu expansion might not have completely overwhelmed the indigenous East African populations. So were the Africans who mingled with Indonesians (possibly as slaves) to become the Malagasy all Bantu, or were they hunter-gatherers like the Sandawe (or Khoisan)?

  • ohwilleke

    @ Eurasian Sensation

    A genetic study of uniparental Y-DNA and mtDNA markers was published in Cell in 2005 (n=363 men) by Bryan Sykes and others. The abstract reports: “we demonstrate approximately equal African and Indonesian contributions to both paternal and maternal Malagasy lineages. The most likely origin of the Asia-derived paternal lineages found in the Malagasy is Borneo. This agrees strikingly with the linguistic evidence that the languages spoken around the Barito River in southern Borneo are the closest extant relatives of Malagasy languages.” But the abstract doesn’t discuss the affinities within Africa on the African side of the equation. Perhaps more detail could be gleaned from the study itself or popular articles such as Science Daily summing up the results.

    A 2008 study in the Journal of Human Genetics by M. Regueiro, et al, looked at autosomal affinities (n=15). It found that “while Madagascar derives 66.3% of its genetic makeup from Africa, a clear connection between the East African island and Southeast Asia can be discerned.” The underlying data are probably available online or not too hard to obtain for inclusion in DIY analysis.

    An open access 2010 study on the population genetics of the nearby Comoros Islands based on uniparental markers (N=577) concludes in its abstract that it finds “the Comoros population to be a genetic mosaic, the result of tripartite gene flow from Africa, the Middle East and Southeast Asia. A distinctive profile of African haplogroups, shared with Madagascar, may be characteristic of coastal sub-Saharan East Africa. Finally, the absence of any maternal contribution from Western Eurasia strongly implicates male-dominated trade and religion as the drivers of gene flow from the North. The Comoros provides a first view of the genetic makeup of coastal East Africa.”

    On the African component of the Y-DNA side it reported:

    “The most common Comorian haplogroups, E1b1-M2 (41%) and E2-M90 (14%), are those that are frequent in sub-Saharan Africa. They are present, respectively, at 56 and 6.4%, in Madagascar. Two haplogroups were identified under E1b1-M2, derived for markers M191 (22%) and U209 (9%). The haplogroup E1b1a-M191 has been found in east and west sub-Saharan Africa, 19% in Tanzania and 57% in Benin. The marker U209 was identified in Afro-Americans, and has not, until now, been tested for in African populations.

    The low incidence of E-M293 (0.8%) and A-M91 (0%) on the Comoros contrasts strongly with the frequency of these haplogroups in East African populations. E-M293 is found mainly in East Africa, Kenya and Tanzania (18%). Furthermore, on the African mainland, M293 chromosomes carry either 10, or 13 and more repeats at the DYS389I STR locus, whereas on the Comoros, they have 12 repeats. Haplogroup A has a frequency of 14% in Kenyan Bantu and 7% in Tanzania. Other haplogroups of likely sub-Saharan African origin on the Comoros are E-SRY4064(xM2,M35,M75) (1.3%) and B2a (1.6%). B2a has a low frequency in southern Iran and Qatar, but this is thought to be a consequence of the Arab slave trade. We therefore treat B2a as an African chromosome in this study.”

    On the mtDNA side they conclude in the body text: “As for the Y chromosome, the majority of mitochondrial haplogroups on the Comoros are of African origin. The haplogroups L0, L1, L2 and L3′4(xMN) compose 84.7% of the mitochondria in the Comoros sample, and their relative proportions are most similar to profiles found in East and South East Africa. The higher affinity with sub-Saharan East African populations is also evident in the MDS analysis. ”

    In the 2010 study’s comparisons to other work it notes:

    “Interestingly, there are a number of similarities between the genetic profile of the Comoros islanders and the Lemba of South Africa, a Bantu speaking people whose Semitic origins are evident at both the cultural and genetic level. The Lemba have high frequencies of the Middle Eastern Y-chromosome HgJ-12f2a (25%), a potentially SEA Y, Hg-K(xPQR) (32%) and a Bantu Y, E-PN1 (30%) (similar to E-M2), raising the possibility that the Lemba and Comorian populations are consequences of similar demographic processes. . . .

    The Comoros and Madagascar show similarities in the paternal and maternal contribution from SEA and Africa. The absence of a strong Middle Eastern signal on Madagascar could be due to sampling bias, as Arab or Persian traders are known to have established posts on the Northwest coast of Madagascar, whereas only populations from the centre and South of Madagascar have been studied to date. The low frequencies of E-M293 and A-M91, on both the Comoros and Madagascar, contrasts with the high frequency found in inland populations from Tanzania and Kenya, and could be characteristics of a genetic profile specific to sub-Saharan coastal East Africa.”



Gene Expression

This blog is about evolution, genetics, genomics and their interstices. Please beware that comments are aggressively moderated. Uncivil or churlish comments will likely get you banned immediately, so make any contribution count!

About Razib Khan

I have degrees in biology and biochemistry, a passion for genetics, history, and philosophy, and shrimp is my favorite food. In relation to nationality I'm an American Northwesterner, in politics I'm a reactionary, and as for religion I have none (I'm an atheist). If you want to know more, see the links at

