The three poles of South Asian genetic variation

By Razib Khan | April 28, 2011 5:48 pm

Zack Ajmal has posted his K = 11 Reference 3 results, including Harappa Ancestry Project (HAP) participants. Below are the results sorted by the East Asian, South Asian, and Onge components. I limited the table to those who had 5% or more East Asian. All caps = reference populations. The rest are individuals from HAP:

Group           Subgroup        Ethnicity       S Asian   Onge   E Asian
Austro-Asiatic  Khasic          KHASI           21%       21%    48%
Austro-Asiatic  Munda           JUANG           26%       43%    28%
Austro-Asiatic  Munda           BONDA           27%       44%    27%
Austro-Asiatic  Munda           GADABA          29%       42%    24%
Austro-Asiatic  Munda           KHARIA          33%       44%    21%
Austro-Asiatic  Munda           SAVARA          33%       44%    21%
Austro-Asiatic  Munda           HO              34%       44%    20%
Austro-Asiatic  Munda           MAWASI          38%       44%    16%
Austro-Asiatic  Munda           ASUR            42%       42%    14%
Austro-Asiatic  Munda           SANTHAL         40%       45%    13%
Indo-European   Indo-European   SAHARIYA        44%       39%    12%
-               -               Bengali         51%       28%    12%
-               -               Bengali         49%       28%    11%
Indo-European   Indo-European   SATNAMI         49%       36%    8%
Isolate         -               BURUSHO         47%       10%    6%
-               -               Bengali         54%       29%    6%
-               -               Bengali/Oriya   53%       29%    5%
Dravidian       Dravidian       MALAYAN         50%       42%    5%
-               -               UP              48%       21%    5%
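
For anyone who wants to play with these numbers, here is a minimal Python sketch of the filter-and-sort step described above. It is not Zack's code or mine from the original analysis; the function names `filter_by_east_asian` and `other_components` are made up for illustration. The rows are copied from the table, and since the percentages are rounded, the "other components" remainder (whatever is left for the remaining eight components at K = 11) is only approximate.

```python
# Rows copied from the table above: (ethnicity, S Asian %, Onge %, E Asian %).
# Percentages are rounded, so any sums computed from them are approximate.
ROWS = [
    ("KHASI", 21, 21, 48),
    ("JUANG", 26, 43, 28),
    ("BONDA", 27, 44, 27),
    ("GADABA", 29, 42, 24),
    ("KHARIA", 33, 44, 21),
    ("SAVARA", 33, 44, 21),
    ("HO", 34, 44, 20),
    ("MAWASI", 38, 44, 16),
    ("ASUR", 42, 42, 14),
    ("SANTHAL", 40, 45, 13),
    ("SAHARIYA", 44, 39, 12),
    ("Bengali", 51, 28, 12),
    ("Bengali", 49, 28, 11),
    ("SATNAMI", 49, 36, 8),
    ("BURUSHO", 47, 10, 6),
    ("Bengali", 54, 29, 6),
    ("Bengali/Oriya", 53, 29, 5),
    ("MALAYAN", 50, 42, 5),
    ("UP", 48, 21, 5),
]


def filter_by_east_asian(rows, min_pct=5):
    """Keep rows with at least min_pct East Asian, highest East Asian first.

    Hypothetical helper: the 5% default mirrors the cutoff used for the table.
    Python's sort is stable, so tied rows keep their original order.
    """
    kept = [row for row in rows if row[3] >= min_pct]
    return sorted(kept, key=lambda row: row[3], reverse=True)


def other_components(row):
    """Percent left over for the K = 11 components not shown in the table."""
    _, s_asian, onge, e_asian = row
    return 100 - (s_asian + onge + e_asian)


if __name__ == "__main__":
    for row in filter_by_east_asian(ROWS, min_pct=20):
        print(row[0], row[3], other_components(row))
```

Raising `min_pct` to 20 leaves only the Khasi and six of the Munda groups. The remainder column also makes it obvious which samples carry a lot of ancestry outside these three components: the Burusho, for example, have well over a third of their ancestry elsewhere.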

The two Bengalis in the table at 12 and 11 percent East Asian are my parents. Using the new reference population, Zack estimates that my “Ancestral South Indian” (ASI) is ~43%. That makes more sense to me than Dodecad’s estimate of ~34%. I think the Dodecad method was confused because I do have genuine East Asian admixture, and the estimate of “Ancestral North Indian” (ANI) vs. ASI is confounded by other components. I suspect that the estimates of Onge are probably less valid for groups like the Khasi because of bleeding over from the East Asian component (in other words, the regression which Zack used to predict ASI is fitted to South Asian populations without East Asian admixture, and isn’t fully transferable to those that have it).

But the geographical breakdown of the East Asian element is pretty striking, if expected. The Bengalis have more East Asian than other Indians, as you’d expect. Here are all the HAP individuals + reference populations as points on a two-dimensional plot:

Now let’s zoom in on the far right section, with labels:

Zack has stated that the Gujaratis in his sample are Patels, and they seem to cluster with the Gujarati subset in the HapMap which is very South Asian (what I termed “Gujarati_B” and Zack “Gujarati_A”; I’ll go with Zack’s terms in the future). So I think we’ve solved the mystery of the endogamous cluster in the Houston Gujarati HapMap sample. They’re Patels.

I wanted to create a 3D plot that works, specifically with the South Asian, Onge, and East Asian components. I generated one that I could manipulate in R, but I don’t have time to figure out how to turn it into an applet or gif. So I recorded myself manipulating the cube and converted the recording into an swf. The two Bengalis who project “outside of the plane” are my parents. And notice that the Khasi are also outliers.

You can download the file here. If you resize the file after you open it, it might pixelate, so be careful! Also, here’s a merged spreadsheet for K = 11 with reference populations + HAP members, with ethnicity information included for the latter.

CATEGORIZED UNDER: Genetics, Genomics

