The genetic affinities of Ethiopians

By Razib Khan | January 10, 2011 2:04 pm

In the open thread someone asked: “Any recent stuff on the genetics of Ethiopians.” That prompted me to look around, because I’m curious too. Poking around Wikipedia I couldn’t find anything recent. A lot of the studies are older uniparental lineage based works (NRY and mtDNA). Ethiopia is interesting because unlike almost all other Sub-Saharan African nations it has a long written history. Culturally and linguistically it has both Sub-Saharan African, and non-Sub-Saharan African, affinities. The languages of highland Ethiopia are clearly Semitic. Those of lowland Ethiopia are Cushitic, a branch of the broader Afro-Asiatic language family concentrated around the Horn of Africa (Somali is a Cushitic language, though most Ethiopian nationals who speak a Cushitic dialect are of the Oromo group).

From a human evolutionary genetic perspective, Ethiopia also has specific interest. It is likely that the main recent pulse of humans Out of Africa traversed this region. Additionally, there is some evidence of deep time connections between the groups ancestral to Ethiopians and the Khoisan of southern Africa. It may be that Ethiopians and Khoisan are reservoirs of ancient genetic variation in Sub-Saharan Africa which has been overlain by Bantu in most other regions outside of West Africa. Finally, Ethiopians are known to have high altitude adaptations. This could be due to long term residence in the region, or to assimilation of favorable alleles from the long term residents by later populations.

Fortunately we can get a sense of the genetic affinities of Ethiopians thanks to a paper published last spring, The genome-wide structure of the Jewish people. The focus was clearly on Jews, but they surveyed Amhara & Tigray (Semitic speaking highlanders), Ethiopian Jews (similar ethnically to the Amhara & Tigray, but religiously non-Christian), and Oromo. In the PCA the Oromo and Semitic speaking populations are pretty obviously distinct clusters.


This just means that when you take worldwide genetic variation, and pull out the biggest independent dimensions, and then visualize individuals on the two largest dimensions in terms of how they explain variance, the Oromo and other Ethiopians don’t really intersect. Interestingly the Amhara and Tigray are almost indistinguishable, but the Ethiopian Jews are in their own cluster. There are, for the record, 7 Oromo, 7 Amhara, 5 Tigray, and 13 Ethiopian Jews in the sample.
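To make the PCA description concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the two populations "A" and "B", their allele frequencies, and the sample sizes are simulated, not taken from Behar et al., and the frequencies are far more differentiated than those of real human populations. It centers a genotype matrix, extracts the two largest dimensions by power iteration, and prints each individual's coordinates on them.

```python
import random

def simulate_genotypes(freqs, n_individuals, rng):
    """Draw diploid genotypes (0, 1 or 2 copies of an allele) at each SNP."""
    return [[(rng.random() < p) + (rng.random() < p) for p in freqs]
            for _ in range(n_individuals)]

def center_columns(matrix):
    """Subtract each SNP's mean genotype so only deviations remain."""
    n = len(matrix)
    means = [sum(col) / n for col in zip(*matrix)]
    return [[x - m for x, m in zip(row, means)] for row in matrix]

def gram_matrix(rows):
    """Individual-by-individual matrix of dot products (n x n)."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in rows] for r1 in rows]

def top_components(gram, k, iterations=300):
    """Approximate coordinates on the k largest dimensions, using
    power iteration with deflation on the symmetric Gram matrix."""
    n = len(gram)
    work = [row[:] for row in gram]
    coords = []
    for comp in range(k):
        # Arbitrary deterministic starting vector.
        v = [1.0 if i == comp else 0.5 for i in range(n)]
        for _ in range(iterations):
            w = [sum(work[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = sum(x * x for x in w) ** 0.5
            v = [x / norm for x in w]
        eigval = sum(v[i] * sum(work[i][j] * v[j] for j in range(n))
                     for i in range(n))
        # Scale the unit eigenvector so coordinates reflect variance explained.
        coords.append([x * eigval ** 0.5 for x in v])
        # Remove this dimension before searching for the next one.
        work = [[work[i][j] - eigval * v[i] * v[j] for j in range(n)]
                for i in range(n)]
    return coords

rng = random.Random(0)
n_snps = 300
# Hypothetical, independently drawn allele frequencies for two populations.
freq_a = [rng.uniform(0.1, 0.9) for _ in range(n_snps)]
freq_b = [rng.uniform(0.1, 0.9) for _ in range(n_snps)]
genotypes = (simulate_genotypes(freq_a, 10, rng)
             + simulate_genotypes(freq_b, 10, rng))
pc1, pc2 = top_components(gram_matrix(center_columns(genotypes)), k=2)
for label, x, y in zip(["A"] * 10 + ["B"] * 10, pc1, pc2):
    print(label, round(x, 2), round(y, 2))
```

In this toy setup the first dimension should split the A and B individuals onto opposite sides of zero, which is the sense in which two clusters "don't intersect" on a PCA plot. The sign of each axis is arbitrary.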

Now let’s look at the genetic variation in ADMIXTURE. Remember this assigns the genomes of individuals in proportions to K ancestral units. As an example, if you had African Americans, Yoruba, and White Americans, in a total pool, and did K = 2, you might have a tendency where Yoruba and White Americans are in two totally different ancestral populations of K, while African Americans are 80% in one ancestry and 20% in another. The interpretation of this is straightforward, but when it comes to populations whose backgrounds we don’t know as well, one should be careful. The selection of a particular value for K is going to be really important, and we shouldn’t confuse the method with the reality which the method is trying to plumb.
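The K = 2 example can be sketched with a toy calculation. This is not the ADMIXTURE program itself, which estimates the ancestral allele frequencies and every individual's proportions jointly and without labels. The sketch below assumes the allele frequencies of the two source populations are already known, simulates one individual with 80% ancestry from source 1, and recovers that proportion by maximizing a binomial likelihood over a grid. All frequencies and genotypes are simulated and hypothetical.

```python
import math
import random

def simulate_individual(q, freq_1, freq_2, rng):
    """Genotypes for a person whose allele copies come from source 1
    with probability q and from source 2 otherwise."""
    genotypes = []
    for p1, p2 in zip(freq_1, freq_2):
        f = q * p1 + (1 - q) * p2  # expected allele frequency for this person
        genotypes.append((rng.random() < f) + (rng.random() < f))
    return genotypes

def log_likelihood(q, genotypes, freq_1, freq_2):
    """Binomial log-likelihood of the genotypes given ancestry proportion q
    (constant binomial coefficients are dropped)."""
    total = 0.0
    for g, p1, p2 in zip(genotypes, freq_1, freq_2):
        f = q * p1 + (1 - q) * p2
        total += g * math.log(f) + (2 - g) * math.log(1 - f)
    return total

def estimate_ancestry(genotypes, freq_1, freq_2, steps=100):
    """Grid search for the q in [0, 1] that maximizes the likelihood."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda q: log_likelihood(q, genotypes, freq_1, freq_2))

rng = random.Random(1)
n_snps = 3000
# Hypothetical allele frequencies in the two source populations.
freq_1 = [rng.uniform(0.05, 0.95) for _ in range(n_snps)]
freq_2 = [rng.uniform(0.05, 0.95) for _ in range(n_snps)]

admixed = simulate_individual(0.8, freq_1, freq_2, rng)
unmixed = simulate_individual(1.0, freq_1, freq_2, rng)
q_admixed = estimate_ancestry(admixed, freq_1, freq_2)
q_unmixed = estimate_ancestry(unmixed, freq_1, freq_2)
print("admixed individual, estimated share from source 1:", q_admixed)
print("unmixed individual, estimated share from source 1:", q_unmixed)
```

The estimate for the admixed individual should land near 0.8 and the unmixed one near 1.0. The caveat in the text still applies: the answer is only as meaningful as the choice of K and of the reference populations, which here were fixed by assumption.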

First, K = 8 from Behar et al. I’ve reedited to highlight populations which might inform the variation of Ethiopians.

Now let’s look at a series of K’s. Note the changes.

Luckily for us, we don’t need to stop here. Dienekes included Behar’s Ethiopians (non-Jews) for Dodecad. Additionally, he included the Masai population from the HapMap. This turns out to be important because he found that Ethiopian Sub-Saharan ancestry is similar to that of the Masai, not the other African groups.

Dienekes also provided individual outputs. I’ve stitched together Ethiopians with Egyptians and Saudis. The color coding is the same as above.

You should be able to tell where the three groups start and stop pretty easily. I’m 99% sure that the six individuals with more East African and less Southwest Asian ancestry are all Oromo. Ethiopians, in particular highland Ethiopians, seem to me likely an ancient stabilized hybrid population between a population from Arabia, and a local Sub-Saharan population. This population seems unlikely to have been related to the peoples of West-Central Africa, who are associated with the Bantus across eastern and southern Africa. The Bantu agricultural toolkit runs into ecological constraints in various regions, and it is in those regions that non-Bantu populations have persisted. Ethiopia, with its unique climate and topography, naturally remains non-Bantu (as well as the Horn of Africa as a whole). The possible connections between Khoisan and Ethiopia may be a function of the fact that these areas harbor genetic variants which have disappeared in the intervening regions because of the Bantu expansion. I have a hard time accepting that the Bantu expansion was particularly eliminationist, but I am starting to suspect that outside of Ethiopia population densities were very, very, low.

The antiquity of this ancient hybridization event to me is attested by the fact that Ethiopians lack any of the other Middle Eastern components besides the one modal in Saudi Arabia. There is a great deal of intra-population variance in the Saudi data set. Why? Part of this must be the slave trade, as well as pilgrims who remained in places like Mecca. But, I think part of the untold story here is that there may have been a larger genetic impact on Arabia after the rise of Islam from the Levant than vice versa! Probably the gene flow precedes Islam, as Arabia was hooked into worldwide trade and population movements, which Ethiopia was relatively insulated from. The Saudi data set has several people who are “pure” Southwest Asian, but also several who have a great deal of West Asian + South European. These seem likely to be people who have some background in the Fertile Crescent.

CATEGORIZED UNDER: Genetics, Genomics
  • JL

    It’s Amhara.

  • Ian

    Ethiopians, in particular highland Ethiopians, seem to me likely an ancient stabilized hybrid population between a population from Arabia, and a local Sub-Saharan population

    Forgive me if I’m missing something obvious, but is there some reason to assume that the “Southwest Asian” component originates in Arabia rather than Ethiopia? Could this just as well represent a substratum that colonised southwest Asia prior to the influx of East Africans? Or, for that matter (though less probably) could the absence of ancient East African lineages in Saudi Arabia reflect some sort of a bottleneck in the settlement of southwest Asia? Or is it possible to add some sort of temporal estimate into this?

  • http://blogs.discovermagazine.com/gnxp Razib Khan

    ian, i think we need to look to NRY stuff for that. no time right now, but i’ll check it out if you don’t. there is a model that semitic languages come from northeast africa, so perhaps.

  • Ian

    But, I think part of the untold story here is that there may have been a larger genetic impact on Arabia after the rise of Islam from the Levant than vice versa!

    If you think about this in terms of the traditional understanding of things, it’s a revolutionary idea. But if you fit the rise of Islam into your model of how herders affect agriculturalists (much more linguistically than genetically) then the Arabisation of the Middle East is sort of like the Turkisation of Anatolia. Granted, it’s stating the obvious – most Syrians are “white” in a way that Saudis often aren’t – but it’s still interesting to see it fit that general model you discussed a few weeks ago.

  • http://blogs.discovermagazine.com/gnxp Razib Khan

    ok, might be informative:

    http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1181965/?tool=pmcentrez

    no time for a close read.

  • http://washparkprophet.blogspot.com ohwilleke

    One of the leading mtDNA papers on Ethiopia is here:

    http://www.ncbi.nlm.nih.gov/pubmed/15457403

    Most notably, on the mtDNA side, Ethiopia is the rough geographic center of the haplogroups of mtDNA haplogroup L3 that are closest to the parent mtDNA lineage of macro-haplogroups M and N found predominantly in Eurasia.

    The leading view among linguists is that the Ethio-Semitic languages “reflect a single introduction of early Ethiosemitic from southern Arabia approximately 2800 years ago”, and that this single introduction of Ethiosemitic underwent “Rapid Diversification” within Ethiopia and Eritrea.

    I have seen both Y-DNA and mtDNA papers dealing with Ethiopia (I can’t find the Y-DNA cite right at the moment) that clearly show a disproportionately male lineage distinction between Ethio-Semitic language speakers and non-Ethio-Semitic language speakers consistent with this hypothesis. The genome wide data clearly confirm the uniparental data in showing a distinction between admixed Ethio-Semitic language speakers and relatively unadmixed Oromo populations.

    Hence, the Oromo population is believed to be generally similar genetically to Ethiopia ca. 1000 BCE. Ethiopia has participated in maritime trade since ancient times providing one possible source of Southeast Asian admixture, although hardly the only possible scenario.

    The place of the non-Semitic Afro-Asiatic languages of Ethiopia, such as Oromo and other Cushitic languages, in the Afro-Asiatic language family tree is unsettled and almost every combination of links has been proposed. One school of thought puts Oromo at the base of the Afro-Asiatic language family; others see the language family as originating in the Near East and back migrating (as a small number of minority mtDNA and Y-DNA haplogroups appear to have backmigrated). The age depth of the Afro-Asiatic language family is also a matter of great dispute, particularly over the question of whether it post-dates or pre-dates food production. An older date tends to favor an African origin for the language family; a more recent date tends to favor a Middle Eastern origin.

    The extent to which Ethiopia’s original adoption of farming was independent of the agriculture of the West African Sahel that ultimately merged with it, or was an extension of it that added supplementary crops, is controversial, as is its timing. But, the Ethiopian and Sahel agricultural crops are generally viewed as distinct and independent in origin from the Fertile Crescent package. The point in time at which domesticated livestock arrived in the region is also a subject of considerable debate.

    Upper and Lower Egypt have population genetic differences in many respects, and it would be useful to know where the Egyptian samples come from in the ADMIXTURE plot to know what inferences we should draw from them.

  • Geo

    Razib your map gives a poor illustration of where Semitic and Cushitic languages are spoken.

    Many Cushitic languages are also spoken in Northern Ethiopia. In fact, Ethiopia is mostly Cushitic:

    http://upload.wikimedia.org/wikipedia/commons/thumb/8/87/Afroasiatic-en.svg/800px-Afroasiatic-en.svg.png

  • http://blogs.discovermagazine.com/gnxp Razib Khan

    ah, is a better map.

  • http://entitledtoanopinion.wordpress.com TGGP

    Could you elaborate on the extent to which the Bantu expansion was eliminationist and what evidence we have?

  • http://blogs.discovermagazine.com/gnxp Razib Khan

    Could you elaborate on the extent to which the Bantu expansion was eliminationist and what evidence we have?

    we don’t have records of it, so only supposition. the main surprise to me is that khoisan and mbuti pygmy cluster together against the bantu populations. if the bantu expanded more through absorption that shouldn’t happen, the intervening populations would span the khoisan and pygmy. as it is, it looks like the khoisan, pygmy, and hadza, are relics in a sea of bantu.

    the bantu take pygmy wives, and there’s been assimilation of khoisan into the xhosa people. so i think mostly it was just low population density of the indigenes.

  • Geo

    The Bantus near Lake Victoria (Kenya, Tanzania etc) do show signs of admixture with the native Cushitic populations. Although it is highly variable in each tribe. According to Tishkoff et al. some Bantu speaking groups like the Tutsi, Mbugwe and Kikuyu have Cushitic ancestry in excess of 25%. While others like the Luhya only have it minimally (7%). The Tanzanian Iraqw people are remnants of the original Cushitic population which used to inhabit these lands before the Bantus (from West Africa) and Nilo-Saharans (from South Sudan) arrived. The Iraqw only have less than 15% admixture from those groups and are largely Cushitic.

    The Maasai from Kenya (East African HapMap samples used by Dienekes) have ancestral components which aren’t shown well in his admixture runs. The Maasai are a mixed people; in Tishkoff’s study they picked up the East African Cushitic cluster at 50%, the Nilotic (Southern Sudanese) at 25%, and the remainder was mostly Bantu with minor traces from elsewhere in Africa. The Bantu/West African admixture in Maasai was about 15-20%. Dienekes’ admixture run did pick that up quite well with the Yoruba (Nigerian) samples. Interestingly, the Ethiopian samples completely lack West African admixture.

  • Anonymous

    First of all, you should know that Behar’s Oromos were sampled in areas neighboring the Kenyan border, while Ethiopian Oromos inhabit a large part of Ethiopia, all the way to the north.

    Allow me to give you a brief background on the demographic history of Ethiopia. In recent history, many Ethiopians, particularly those who now speak the Amharic language, underwent a language shift (in the case of Amharas, a switch from Cushitic to Semitic). You may ask yourself why there is such a clear distinction between Semitic speakers and Cushitic speakers in Dienekes’ ADMIXTURE runs.

    There are several explanations for this. One is, obviously, the geographic difference between the Semitic speakers and Cushitic speakers. Another explanation is that the “Cushitic” source population of most Semitic-speaking groups in Ethiopia was not Oromos, who speak an East Cushitic language, but “Agaws”, who speak a Central Cushitic language. Central Cushites speak a very distinct form of Cushitic, which split from the other Cushitic branches close to the proto-Cushitic root. Actually, the Ethiopian Jews went through a very recent, well-documented language shift from Central Cushitic to Semitic, although a small minority still speak their native language.

    Regarding Y-DNA and mtDNA differences between the ethnicities of Ethiopia, there is a lot of data available. Maternally, Oromos are just as close to Eurasians as Amharas are; they are basically identical. The only exception is the Tigrayan ethnic group, which carries an elevated frequency of Eurasian lineages.

    Paternally, Cushitic speakers are actually quite similar to Semitic speakers once all of the samples from different studies are put together. Regardless, the overwhelming majority of J in all of Ethiopia is old, much older than the entrance of Semitic. That is also why Ethiopian J1 lineages are among the most diverse in the whole world; the only regions with higher diversity are parts of the northern Middle East.

    Just thought this information would be relevant.

  • http://washparkprophet.blogspot.com ohwilleke

    Could you elaborate on the extent to which the Bantu expansion was eliminationist and what evidence we have?

    A 2010 journal article http://www.nature.com/ejhg/journal/v19/n1/full/ejhg2010141a.html
    looking at whole genome comparisons in Sub-Saharan Africa, particularly Mozambique, addresses the issues somewhat. Basically, it suggests that there was more assimilation in some places than in others. For example, “we find a strong differentiation of the southeastern Bantu population from Mozambique, which suggests an assimilation of a pre-Bantu substrate by Bantu speakers in the region.” It goes on to say in the body of the open access article that:

    “ii.The southeastern Bantu from Mozambique are remarkably differentiated from the western Niger-Congo speaking populations, such as the Mandenka and the Yoruba, and also differentiated from geographically closer Eastern Bantu samples, such as Luhya. These results suggest that the Bantu expansion of languages, which started ~5000 years ago at the present day border region of Nigeria and Cameroon, and was probably related to the spread of agriculture and the emergence of iron technology, was not a demographic homogeneous migration with population replacement in the southernmost part of the continent, but acquired more divergence, likely because of the integration of pre-Bantu people. The complexity of the expansion of Bantu languages to the south (with an eastern and a western route), might have produced differential degrees of assimilation of previous populations of hunter gatherers. This assimilation has been detected through uniparental markers because of the genetic comparison of nowadays hunter gatherers (Pygmies and Khoisan) with Bantu speaker agriculturalists. Nonetheless, the singularity of the southeastern population of Mozambique (poorly related to present Khoisan) could be attributed to a complete assimilation of ancient genetically differentiated populations (presently unknown) by Bantu speakers in southeastern Africa, without leaving any pre-Bantu population in the area to compare with.”

    In other words, in addition to Pygmies and Khoisans, the two extant remnant hunter-gatherer populations of Africa, there was a third equally distinct Mozambiquian hunter-gatherer group that was entirely assimilated by Bantus to the point where no remnant hunter-gatherer group survived. Their genetics remain a large part of Mozambique’s Bantu population’s genetics (i.e. most Mozambiquans are mostly descended from this “lost race” of people), but their language and way of life has vanished every bit as completely as Western and Southern Europe’s hunter-gatherer populations.

    * * *

    Another interesting finding of the paper is pertinent to the original post: recall that the Maasai, who show so much affinity with the African part of the Ethiopian Oromo population, are a prototypical Nilo-Saharan language population. This is surprising because the Oromo are an Afro-Asiatic language family population. Naively, given the Afro-Asiatic languages of Ethiopia, we would expect Ethiopians to be more genetically similar to other Afro-Asiatic language speakers than they actually are.

    The strong genetic connection to the Maasai is particularly surprising given that “There is a strong differentiation of Nilo-Saharans, much beyond what would be expected by geography.” Within African populations, looking at autosomal genetics, “The first PC (Figure 2a) and STRUCTURE with K=2 (Figure 3) separate the Nilo-Saharan-speaking Maasai from all other populations, with neighboring Luhya and African Americans in an intermediate position.” Thus, the deepest and most obvious break in the African component of African autosomal population genetics is between the cluster that includes the Maasai and the African component of Ethiopians on the one hand, and all other African specific populations on the other.

    * * *

    What does this mean? We don’t know. The kind of comparisons one does in autosomal genetics don’t clearly resolve themselves into tree like structures in the way that uniparental genetic methods do. Adding new populations or permitting more putative ancestral populations in an analysis can turn what looked like a “source” cluster, not admixed with anything else, into a mixture of other components. The whole gene comparisons also provide fuzzier estimates of the relative evolutionary age of different components (although the degree of uniformity of a mix in a population is suggestive of whether the mix has “fixed” in the population over many generations or is recent). The post from “Anonymous at January 11th, 2011 at 8:12 am” illustrates this point by noting that Tishkoff’s analysis sees the Maasai as an admixture of Cushitic and Nilo-Saharans, while Dienekes’ plot shows them as a nearly pure type population in the admixture analysis (as does the 2010 paper that I cite which expands on Tishkoff’s analysis), and also appropriately notes that the sample sizes in whole gene analysis are small and prone to be less representative, with Dienekes’ analysis of the Oromo possibly failing to capture effects due to geographic variation within the Oromo since his Oromo samples are drawn from near the Kenyan border. Small wonder that a population from near Kenya would be genetically similar to one from Kenya itself.

    What kind of scenario could be a fit to those facts? Here is a simple one. Given that Ethiopian Oromo have substantial Southwest Asian admixture not attributable to the Ethio-Semitic language transition about 2800 years ago, but similar in character, while the Maasai do not, one possibility is that the Afro-Asiatic Cushitic languages may be an artifact of a similar pre-historic language transition that took place more than 2800 years ago (i.e. 800 BCE) that caused most Ethiopians to transition from some pre-Cushitic language to Cushitic languages in connection with a population influx from Southwest Asia.

    In that scenario, prior to that Cushitic transition, the Ethiopians may have been more genetically similar to the Maasai.

    Barring any other plausible linguistic candidate, the genetic similarity of the Ethiopians’ African component to the Nilo-Saharan language speaking Maasai suggests that pre-Cushitic Ethiopians may have spoken Nilo-Saharan languages. The possibility that there were pre-Oromo languages in Ethiopia that were Nilo-Saharan is also supported by the fact that Nilo-Saharan languages are common in Ethiopia’s neighbors: Kenya (29% of the population), Sudan (29% of the population, and more in soon to be separated out Southern Sudan), Uganda (26%), Chad (41%), Niger (33%) (Source http://www.bookrags.com/tandf/nilo-saharan-languages-tf/); that Cushitic (http://en.wikipedia.org/wiki/Cushitic_languages) and Nilo-Saharan languages have a similar and intertwined geographic distribution; and that Cushitic languages are at the fringe of the Afro-Asiatic linguistic area. Cushitic languages are also found in some of those countries. About 0.7% of Ethiopians speak one of sixteen Nilo-Saharan languages, which could be a remnant of an older substrate of Ethiopian Nilo-Saharan languages.

    Another possible clue that there might have been a transition in relatively recent pre-history from Nilo-Saharan to Afro-Asiatic languages could be present in the group of Omotic languages spoken by a small number of people in Southwest Ethiopia, which share the strongly tonal character of the Nilo-Saharan languages. Omotic is generally considered the most divergent branch of the Afroasiatic languages. Some argue it isn’t Afro-Asiatic at all http://www.uio.no/studier/emner/hf/iln/LING2110/v07/THEIL%20Is%20Omotic%20Afroasiatic.pdf noting that “Omotic has a very innovative and mixed lexicon with many intrusions from [AA] languages, especially Cushitic, and also from Nilo-Saharan.” The lexical blending of Cushitic and Nilo-Saharan, the Nilo-Saharan style tonality, and the suggestively Afro-Asiatic grammatical flourishes of Omotic that have caused many to classify it as Afro-Asiatic are together suggestive of an incomplete Afro-Asiatic language transition, or of a Cushitic-Nilo-Saharan creole, at the fringe of the Cushitic (and Afro-Asiatic) language area where the Afro-Asiatic influence would be weakest.

    Is this proof of this scenario? No. But, it is a scenario that would be consistent with the facts and there is suggestive circumstantial evidence to support it.

  • pconroy

    Andrew,

    As we discussed on Dienekes’ Blog back in March 2010, I predict that once more discoveries are made Y-DNA E will be shown to be Eurasian, with only A and B left as African.

    I also wonder if the mystery component in Mozambique is Austronesian, likewise if Austronesians are the mystery component in the North Kannadi – who knows, but if I’m right, it would provide the stepping stones for Austronesians ending up in Madagascar, as I think a direct voyage is unlikely.

  • http://blogs.discovermagazine.com/gnxp Razib Khan

    The only exception is the Tigrayan ethnic group, which carries an elevated frequency of Eurasian lineages.

    did you note the clustering of the tigray with the ahmara in the PCA? were the ‘central cushitic’ then simply always closer to the semitic speakers than ‘eastern cushitic’?

  • Anonymous

    The post from “Anonymous at January 11th, 2011 at 8:12 am” illustrates this point by noting that Tishkoff’s analysis sees the Maasai as an admixture of Cushitic and Nilo-Saharans, while Dienekes’ plot shows them as a nearly pure type population in the admixture analysis (as does the 2010 paper that I cite which expands on Tishkoff’s analysis), and also appropriately notes that the sample sizes in whole gene analysis are small and prone to be less representative, with Dienekes’ analysis of the Oromo possibly failing to capture effects due to geographic variation within the Oromo since his Oromo samples are drawn from near the Kenyan border. Small wonder that a population from near Kenya would be genetically similar to one from Kenya itself.

    Dienekes’ analysis is not comparable to Tishkoff’s, and the latter is more appropriate for “genetic” linguistic affiliations, because it has a wider range of samples, most importantly Nilo-Saharans from Sudan and other parts of East Africa. It could be that what distinguishes Dienekes’ “East African” cluster from Tishkoff’s “Cushitic/Afroasiatic” cluster is the Nilo-Saharan “portion” of the East African cluster, i.e. that it cannot be distinguished from the ancestry of the Maasai that is similar to Afroasiatic Ethiopians due to the lack of other Nilo-Saharan samples or some other factor that made it difficult for a split to occur (K value, genetic relationships between other populations included in the analysis).

    Tishkoff’s study shows Oromos from similar parts of Ethiopia as the Behar samples (that Dienekes uses) with only around 6% “Nilo-Saharan” ancestry. Meanwhile, Tishkoff’s Maasai samples from three different Maasai groups were all around 50% “Cushitic/Afroasiatic” on average.

    Of course, there are most likely older genetic links between Afroasiatic Ethiopians and neighboring Nilo-Saharans, but that is a different matter.

    Omotic is generally considered the most divergent branch of the Afroasiatic languages. Some argue it isn’t Afro-Asiatic at all http://www.uio.no/studier/emner/hf/iln/LING2110/v07/THEIL%20Is%20Omotic%20Afroasiatic.pdf noting that “Omotic has a very innovative and mixed lexicon with many intrusions from [AA] languages, especially Cushitic, and also from Nilo-Saharan.” The lexical blending of Cushitic and Nilo-Saharan, the Nilo-Saharan style tonality, and the suggestively Afro-Asiatic grammatical flourishes of Omotic that have caused many to classify it as Afro-Asiatic are together suggestive of an incomplete Afro-Asiatic language transition, or of a Cushitic-Nilo-Saharan creole, at the fringe of the Cushitic (and Afro-Asiatic) language area where the Afro-Asiatic influence would be weakest.

    As far as I know, the divergence of Omotic is not believed to be caused by the minor Nilo-Saharan substratum, but rather by a more ancient split from the rest of Afroasiatic. But that is outside of my comfort zone.

    Sure Omotic is spoken at the fringe of the Cushitic language area, but I would not say that Afroasiatic influence should be weakest there, considering that it seems likely to be close to the area from which proto-Afroasiatic originally spread.

  • Anonymous

    did you note the clustering of the tigray with the ahmara in the PCA? were the ‘central cushitic’ then simply always closer to the semitic speakers than ‘eastern cushitic’?

    I did. By the way, it’s Amhara. :)

    It depends on which East Cushitic reference is being used. But yes, closer than to Behar’s Oromo samples from southernmost Ethiopia. That is the only possible explanation, since many Amharas and all Ethiopian Jews were Central Cushitic speakers a few centuries ago. It is also telling that almost all Amharas cluster firmly with the Tigray samples, since Tigray people have an older Semitic-speaking history on average, while the Amharic language has had a massive recent expansion (the Oromo language as well, by the way). Unless the Amhara samples are not representative at all, which I doubt.

    The main region where Amharas have mixed with Oromos is the southern parts of the Amharic-speaking region, Shewa and thereabouts. That also explains the extreme pull of one of the Amhara samples toward the Oromos in the plot. The Amharas were sampled in Addis Abeba, where Amharas from all around Ethiopia (and Amharas native to areas around Addis Abeba) live.

    Allow me to clarify one more thing: although Oromos (or Oromo speakers) now inhabit southern and northern Ethiopia, that was due to a recent expansion. Their original homeland is southern Ethiopia.

  • Geo

    What kind of scenario could be a fit to those facts? Here is a simple one. Given that Ethiopian Oromo have substantial Southwest Asian admixture not attributable to the Ethio-Semitic language transition about 2800 years ago, but similar in character, while the Maasai do not, one possibility is that the Afro-Asiatic Cushitic languages may be an artifact of a similar pre-historic language transition that took place more than 2800 years ago (i.e. 800 BCE) that caused most Ethiopians to transition from some pre-Cushitic language to Cushitic languages in connection with a population influx from Southwest Asia.

    The Oromo do carry a variety of Eurasian lineages. Maternally macrogroup M & N, paternally J & T, and they have significant levels of an E1b1b subclade which originated in Northern Africa (E1b1b1a) near Libya/Egypt and migrated backwards to the Horn of Africa.

    It is possible that the upper paleolithic Ethiopians were similar to other Nilotic populations on the genomic level and that ancient back-migrations from Arabia caused the modern Ethiopians like the Amhara and Oromo to stream more towards West Eurasians. Tishkoff noted that this scenario is a possibility.

  • marcel

    Razib:

    Could you post a glossary of acronyms and other jargon, with a link on your main page? I’m new to your blog, with no background in genetics. At this point, I am blipping over things like PCA, much like Linus van Pelt did (IIRC) when he was reading War and Peace or the Brothers K, and came to another long Russian name. It would be irritating to your longer time readers, not to mention yourself, if you explained these things each time, but a permanent glossary page, with a link to it on the right, would be helpful.

    Thanks

  • Anonymous

    The Oromo do carry a variety of Eurasian lineages. Maternally macrogroup M & N, paternally J & T, and they have significant levels of an E1b1b subclade which originated in Northern Africa (E1b1b1a) near Libya/Egypt and migrated backwards to the Horn of Africa.

    It is actually likely that E1b1b1a originated in Northeastern Africa (Upper Egypt/Northern Sudan). Not that much of a difference, but anyway.

    It is possible that the upper paleolithic Ethiopians were similar to other Nilotic populations on the genomic level and that ancient back-migrations from Arabia caused the modern Ethiopians like the Amhara and Oromo to stream more towards West Eurasians. Tishkoff noted that this scenario is a possibility.

    It is indeed possible. But we don’t have any appropriate samples, so it’s impossible to say anything about that right now. The Maasai have a significant amount of recent shared ancestry with Afroasiatic speakers, and Bantu admixture. Some proper Nilotic samples would be great!

  • http://washparkprophet.blogspot.com ohwilleke

    “I also wonder if the mystery component in Mozambique is Austronesian, likewise if Austronesians are the mystery component in the North Kannadi – who knows, but if I’m right, it would provide the stepping stones for Austronesians ending up in Madagascar, as I think a direct voyage is unlikely.”

    Interesting theory, but it is completely contradicted by the facts.

    Austronesian mtDNA and Y-DNA, which have a well characterized profile, would stick out like a sore thumb in Mozambique. But Mozambique is 78% Y-DNA haplogroup E1b1, 7% Y-DNA haplogroup E2, and 15% Y-DNA haplogroup B2a, none of which are found at all among Austronesians. See, e.g., http://i88.photobucket.com/albums/k178/argiedude/Africay-dna-allstudiescombined7000s.gif

    On the mtDNA side: “From the north came sequences that may have been involved in the Bantu expansion (from western, through eastern, to southern Africa), such as members of haplogroups L3b, L3e1a and a subset of L1a. The dating of the major component of Mozambican mtDNAs, the subset L2a of haplogroup L2, displayed an age range compatible with the Bantu expansion. The southern influence was traced by the presence of sequence types from haplogroup L1d, a probable relict of Khoisan-speaking populations that inhabited the region prior to their displacement by the Bantu-speaking incomers.” http://www.ncbi.nlm.nih.gov/pubmed/11806853 There are no Austronesians in Austronesia with mtDNA haplogroup L.

    FWIW, there are also no Austronesian linguistic traces in Mozambique. http://en.wikipedia.org/wiki/Languages_of_Mozambique All of the indigenous languages of the region are Bantu languages. Portuguese and English are the lingua francas of trade and of the educated classes. “Small communities of Arabs, Chinese, and Indians speak their own languages (Indians from Portuguese India speak any of the Portuguese Creoles of their origin).” None of those languages are Austronesian.

    While autosomal DNA doesn’t precisely track either Y-DNA or mtDNA, it would be essentially impossible for the predominant component of Mozambique’s autosomal DNA to be Austronesian (an arrival that is only about 1000 years old, give or take) without leaving any trace in Y-DNA, mtDNA, or language.

    By comparison, autosomally, Madagascar is about 66% East African and 33% Asian, http://www.nature.com/jhg/journal/v53/n2/full/jhg2008213a.html and the uniparental mtDNA and Y-DNA mix is close to 50-50 between Borneo and East Africa. http://news.mongabay.com/2005/0708-wildmadagascar.html There is clear Austronesian linguistic influence in Madagascar that is specific to Borneo.

    * * *

    “I predict that once more discoveries are made Y-DNA E will be shown to be Eurasian.”

    This is basically impossible given the data we already have strongly disfavoring this scenario.

    The latest study of Y-DNA E1b1 “strongly supports the hypothesis that haplogroup E1b1 originated in eastern Africa.” http://www.plosone.org/article/info:doi/10.1371/journal.pone.0016073 Similarly, E2 is found outside Africa only on a narrow strip of the Red Sea coast of Arabia. http://forwhattheywereweare.blogspot.com/2011/01/some-new-insights-in-phylogeny-of-y-dna.html and E1a is exclusively African. All of the examples of the basal Y-DNA paragroup DE* are found in West Africa or Tibet, with Y-DNA haplogroup D having a peculiar but almost exclusively Asian distribution and Y-DNA haplogroup E having an almost exclusively African distribution. Y-DNA haplogroup E is not found in any place in Asia where its sister clade, Y-DNA D, is found.

    Only a few fairly high level branches of the Y-DNA haplogroup taxonomy are found outside Africa.

    * Essentially all Y-DNA haplogroup E1b1 found in people of European descent belongs to haplogroup E-V68, E-V257, or E-M123, which sit on three of the nine branches of Y-DNA haplogroup E-M215; the rest of those branches are exclusively African. The other main branch of haplogroup E1b1, E-V38/V100, is exclusively African and is the predominant haplogroup of West Africans, Bantu language speaking Southern Africans, and of East Africans.

    ** E-V257 consists of E-M81 and E-V257*. E-M81 is characteristic of North African Berbers, and is also found at lower frequencies in Central and Southern Iberia (within Iberia with a mostly Western distribution, but with some presence in the East but very low in the SE). E-V257* is found (at low frequencies) in Iberians (Cantabrians and Andalusians), Corsicans, Sardinians, Marrakesh Berbers and Borana (Oromo) from Kenya. It is found nowhere else in Europe.

    ** E-M123 is found in much of Afro-Asiatic language speaking Africa, is most common in Arabia, and is found at low frequencies in Europe and West Asia in areas that roughly correspond to holdings of the Islamic empire at its peak. It is not found in Asia or in most of Europe and is even absent from Iberia.

    ** E-V68 consists of E-M78 and E-V68* (found only in Sardinians). E-M78, which is half of one of nine branches of the two main branches of the Y-DNA E1b1 haplogroup family tree, is the only kind of Y-DNA E found at any frequency among people outside Africa in areas not historically under Moorish rule in people who are not recent immigrants. In Arabia and Iran it becomes less frequent the further one gets from the Gate of Tears and the Sinai. In Europe, it becomes less frequent the further one gets from Greece. By the time you get to India and all points East, and by the time you cross the Urals, E-M78 vanishes entirely. It is also essentially absent from the Atlantic and Baltic Coasts of Europe. E-M78 is also found in the Afro-Asiatic language speaking parts of Africa and in the same parts of Iberia where E-M81 is found.

    ** The only place ancient Y-DNA haplogroup E specimens have been found outside of Africa is the Canary Islands in 2000 year old samples. http://dienekes.blogspot.com/2009/08/ancient-y-chromosomes-from-canary.html and then only in the E-M78 and E-M81 haplogroups found in nearby North Africa.

    * It is possible to imagine that some men with the E-M78 haplogroup left Africa for the Arabian Peninsula, and then back-migrated to Africa bringing a major cultural innovation, such as the Fertile Crescent package of crops and the Afro-Asiatic languages, that made their descendants common in the Afro-Asiatic language area, and that these men were also among those who were part of the Danubian Neolithic expansion. Similarly, it is conceivable that E-M123 arose in Arabia from people with ancestors in Africa.

    But, it is almost impossible to imagine a plausible scenario in which Y-DNA haplogroup E as a whole arose outside Africa; the evidence instead points strongly to an Eastern African origin. Its almost total lack of diversity outside Africa, geographical spread and frequency outside Africa all point decisively to an African origin.

    The vast majority of people with Y-DNA haplogroup E speak languages not spoken outside Africa.

  • http://blogs.discovermagazine.com/gnxp Razib Khan

    anon, please use a distinctive handle. adding a number to your handle would do. it is hard to track comments from a generic user.

  • http://blogs.discovermagazine.com/gnxp Razib Khan

    marcel, good idea.

  • http://washparkprophet.blogspot.com ohwilleke

    One more footnote on Y-DNA E origins in Africa.

    Y-DNA haplogroup E also splits from the Eurasian haplogroups descended from CF at the very base of the Y-DNA tree (just after A and B), and splits very neatly from D, which is Asian. So, if Y-DNA haplogroup E originated in Eurasia, it would have to have made the split right at the very dawn of the Out of Africa moment, and there is evidence from the 1000 Genomes project that the CF/DE split may be even more ancient than earlier Y-DNA analysis had suggested. So, any scenario in which Y-DNA haplogroup E arises any time in the last 50,000 years is falsified.

    Thus, any scenario with a non-African Y-DNA haplogroup E origin has to explain why it completely disappeared everywhere else in almost all of its varieties, without even a single individual in billions and billions whose ancestry is not attributable to recent immigration in all of Asia, despite the fact that one’s Y-DNA haplogroup was invisible after just a few generations until the last decade or two.

    Notably, while there are major branches of the Y-DNA and mtDNA trees that are found in Asia but not in Europe, there are no major branches of the Y-DNA or mtDNA trees that are found in Europe but not in Asia as is the case with Y-DNA haplogroup E.

  • http://washparkprophet.blogspot.com ohwilleke

    “Sure Omotic is spoken at the fringe of the Cushitic language area, but I would not say that Afroasiatic influence should be weakest there, considering that it seems likely to be close to the area from which proto-Afroasiatic originally spread.”

    Every imaginable arrangement of the phylogenetic tree for the origin of the major families of Afroasiatic languages has been proposed by credible leading linguists and none has won a consensus, although the validity of the macrofamily (apart from the question of whether Omotic is Afro-Asiatic or, if it is, whether it belongs in the same group as Cushitic) is fairly solid.

    Points of origin for the Afro-Asiatic language family have been proposed for everywhere from its far Northern extent in the North Levant to the expanse of East Africa, Sudan, Egypt, and perhaps even the Sahara (if it has links to a diaspora population from the Green Sahara period in the early Holocene). There is a fair degree of consensus on the main subfamilies, but not about their relationships or places of origin.

    This lack of consensus among the linguists makes population genetics, ancient climate studies, archaeological culture associations, and the analogies we can draw from cases where we understand how languages have spread elsewhere especially important in informing our understanding of the period in this region.

    In most historical cases, some combination of the spread of agricultural technology or other technological complexes, mass religious movements, mass migrations of peoples (away from or to something that is a very good reason), or conquest seem to explain how language families become widespread. Of course, in Africa, somebody has to speak the “original” language of modern humans (or for that matter the last common language of modern humans, which might not be the same thing) so one of the language families could be that language (then again, maybe that language, like Sumerian, the original written language is now dead and has no recognizable descendants).

    FWIW, my bet is that Afro-Asiatic languages have their roots in the expansion of Fertile Crescent, Sahel or Ethiopian agriculture, since agriculture seems to be the main driver of macrolanguage groups whose origins can’t be traced to the historic era. And, given (1) their area of spread, (2) the anomalous R1b Y-DNA haplogroup frequency of predominantly pastoral Chadic speakers, found at high frequencies neither in other Afro-Asiatic language speakers nor in other African linguistic groups (suggesting back migration to Africa from an area outside the Afro-Asiatic language area, probably accompanied by language shift and a lack of connection to Sahel crops), and (3) the narrow geographic scope of Ethiopian origin domesticates relative to the Afro-Asiatic territory: my money would be on Afro-Asiatic as part of the Fertile Crescent farming and herding package (i.e. with the main expansion starting ca. 6000 BCE or perhaps a couple of thousand years later depending on the latest archaeological finds).

    This would put the most likely place of origin for Afro-Asiatic languages in either the Nile Valley or the Levant (there is no evidence that other places that were part of the Near Eastern Neolithic ever spoke Afro-Asiatic languages in the earliest historically documented periods or before then).

    Coptic, as the first written Afro-Asiatic language, is a particularly strong contender as an epicenter of African Fertile Crescent agriculture, with historically attested boating technology, with easy access to Berber, Semitic, Coptic, Chadic, Cushitic and Omotic regions via sea coasts and rivers. A common source language for all of the major families would also help explain the lack of a tree-like structure amongst them that one would expect if Semitic languages were at the root of all of the Afro-Asiatic languages in a Levant origin scenario.

    It is certainly possible that Afro-Asiatic languages could have originated in Ethiopia and worked their way via the Red Sea and/or Nile and/or Green Sahara to North Africa and the Near East. But, what would the Ethiopians have done that would have brought them to the cultural dominance that would have led to linguistic dominance across this wide region? I guess coffee, but I don’t really believe that.

    Yes, genetically, Y-DNA E1b1, some parts of which trace Afro-Asiatic languages, seems to have origins in Eastern Africa and perhaps Ethiopia. But Eastern Africa is also the ancestral home, perhaps, of Southern Africa hunter-gatherers, Nilo-Saharans and Niger-Congo language speakers, so the notion that the linguistic roots of Afro-Asiatic languages lie there, even though the region spawned other linguistically very different groups, isn’t very convincing.

  • onur

    But, Eastern Africa is also the ancestral home, perhaps of Southern Africa hunter-gatherers, Nilo-Saharans and Niger-Congo language speakers

    I assume here you are referring to the origin of modern humans as a whole, am I right? If yes, I must state that enough time has passed for the divergence of those groups you mention and their language families.

  • pconroy

    Andrew,

    I think Y-DNA E originates in the Southern Fertile Crescent, probably with the Natufian Culture, and brought Afro-Asiatic languages to Africa, along with agriculture.

  • pconroy

    Also, here’s a pretty good blog post by Matilda on this very subject:
    http://mathildasanthropologyblog.wordpress.com/2008/03/04/caucasian-africans/

    Her blog focuses on the genetics of Northern Africa mostly.

  • pconroy

    Andrew said:

    Interesting theory, but it is completely contradicted by the facts.
    Austronesian mtDNA and Y-DNA, which has a well characterized profile, would stick out like a sore thumb in Mozambique.

    I wasn’t talking about uniparental markers, but autosomal DNA, which is where the unique component in Mozambicans was discovered.
    If you had male mediated Austronesian settlement in coastal Mozambique, which was later overrun by male Bantus, then Arabs from Oman, there would be little evidence in uniparental markers, now would there?!

  • http://washparkprophet.blogspot.com ohwilleke

    If you had male mediated Austronesian settlement in coastal Mozambique, which was later overrun by male Bantus, then Arabs from Oman, there would be little evidence in uniparental markers, now would there?!

    Not a lot, but given that you have discernable Khoisan uniparental traces, you wouldn’t expect there to be a huge autosomal component and less than 1% uniparental markers at all. Also, I would be quite surprised if there wasn’t at least some exploratory comparison of a Madagascar component to the Mozambique unique component that wasn’t reported in the final product because it wasn’t a match.

    Matilda makes a case with which no one disagrees: that there is a substantial Eurasian genetic component in Ethiopia, which may include some Y-DNA sub-haplogroups of E found in Eurasians. But she doesn’t even begin to argue that Y-DNA haplogroup E is non-African, and indeed clearly identifies some varieties of it with African Niger-Congo populations in the post you cite, which is the mainstream view.
