Peeling the population genetic Indian onion

By Razib Khan | December 8, 2011 9:50 pm

There’s a new paper in The American Journal of Human Genetics, Shared and Unique Components of Human Population Structure and Genome-Wide Signals of Positive Selection in South Asia. It’s free, so go read it. I don’t have time to comment in detail, but I did read the paper, and I want to mention a few things:

1) If you follow Harappa Ancestry Project or Dodecad Ancestry Project the ADMIXTURE and PCA won’t be surprising. They’ll be familiar. Though the researchers got some nice additional populations in Uttar Pradesh it didn’t change the general outlines of what you can already ascertain with the public data sets.

2) The authors seem to de facto ignore the argument in Reconstructing Indian population history that ADMIXTURE components can themselves be decomposed into further real elements (they acknowledge it in the text, but it doesn’t go any further). This is obvious as you move up the K’s, and a given component collapses into two obvious elements. More precisely, it may be that the modal South Asian component in their runs is actually a composite, as may be the minor component. I suspect this is the case, because very low caste tribal elements in South India which show up as “pure” South Asian in some of these ADMIXTURE runs nevertheless may carry West Eurasian markers, such as the derived variant of SLC24A5.

3) The big finding of the paper is that the West Eurasian regions of the genome in South Asians may be more diverse in terms of haplotypes than in Europeans, and to some extent Near Easterners. The inference here then is that perhaps “Ancestral North Indians” are a source population for other West Eurasians! I think more likely this is not the case. The authors note that it could be due to large effective population size in South Asia. But I think it may have something to do with the fact that the regions of the genome they selected are more admixed and composite than the authors assume. In other words, the diversity is elevated by the bleed-in of “Ancestral South Indians” into this genetic background.
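
To make the bleed-in argument in point 3 concrete, here is a toy sketch. It is not anything from the paper: it assumes a single biallelic site, uses expected heterozygosity 2p(1-p) as a crude stand-in for haplotype diversity, and the allele frequencies and admixture fraction are invented purely for illustration.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity at a biallelic site with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def admixed_frequency(p_major, p_minor, minor_fraction):
    """Allele frequency after mixing two sources in the given proportions."""
    return (1.0 - minor_fraction) * p_major + minor_fraction * p_minor


# Hypothetical numbers, chosen only to illustrate the effect.
p_west_eurasian = 0.10  # allele frequency in an unadmixed West Eurasian-like source
p_asi_like = 0.90       # allele frequency in an ASI-like source
bleed_in = 0.20         # share of ASI-like ancestry hiding in the "West Eurasian" segments

p_mixed = admixed_frequency(p_west_eurasian, p_asi_like, bleed_in)
h_source = expected_heterozygosity(p_west_eurasian)
h_mixed = expected_heterozygosity(p_mixed)

print(f"source heterozygosity:  {h_source:.3f}")
print(f"admixed heterozygosity: {h_mixed:.3f}")
```

With these made-up inputs the admixed background comes out at roughly 0.38 against roughly 0.18 for the unadmixed source, so diversity goes up without the population being any older. The effect shrinks toward nothing at sites where the two sources have similar frequencies.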

There’s some interesting stuff about natural selection and diabetes in the paper, but I’ll leave that for later.

  • http://washparkprophet.blogspot.com ohwilleke

    “West Eurasian regions of the genome in South Asians may be more diverse in terms of haplotypes than in Europeans, and some extent Near Easterners. The inference here then is that perhaps “Ancestral North Indians” are a source population for other West Eurasians!”

    The scenario I imagine is that South Asia is the largest population component of Upper Paleolithic Western Eurasia. A subset of that population that is necessarily less diverse resettles the Near East and settles Europe ca. 40,000-50,000 years ago. Then, ca. 6,000-7,000 years ago, there is one backmigration to the Indus River Valley from the Near East that adds to South Asian genetic diversity by superimposing a new layer on existing populations there without changing the genetic diversity of Europe or the Near East. Then, ca. 3,500 years ago, pre-East Asian migration Central European Indo-Aryans (sprinkled with a small number of Europeans who have been integrated into Proto-IE society) add another layer still again adding to diversity in South Asia (in both the IVC and IE waves, in the ANI component of South Asian genetics), which in turn is filtered into the rest of South Asia, particularly in higher caste populations where the ANI infusion is more complete. Some of the very messy profiles in Pakistan relative to India probably arise from historic era waves of back and forth empire expansions.

    But, k3 (Near East) in Southern Pakistan could be a credible original Harappan layer that is an admixed subcomponent of k5 (ANI) and hence invisible in k5 areas for the most part.

    One can have pretty intense replacement or outnumbering with admixture of a substrate population by a new superstrate (perhaps 90% new-10% old for a decent sized effective population in the old substrate) without losing much genetic diversity at all – you reduce the frequency of a lot of variants but the variants mostly stay in the population at some frequency.

    The ASI component may not be actually “pure”, but it does seem to have a lot less complicated composition than the ANI component, which according to Maju who has looked at the numbers in the supplemental materials more closely, fractures into a Caucasian looking component and a more genuinely Indo-Iranian component at about K=13.

    The conclusions about Tibeto-Burmans being fairly genetically distinct (and probably recent and hence not admixed as much) are unsurprising and confirm prior findings of both genetics and the historic era record. The relatively distinct (although not quite as much as Tibeto-Burman) Munda genetic profiles and ties to SE Asia in the Munda not present in other populations also favor a relatively late Munda arrival in South Asia over the notion that tribal Munda populations are the most ancient in India. If they were really autochthonous, or even Upper Paleolithic in age depth they would have reached fixation by now, would have been indistinguishable from ASI, and ASI would have a shorter FST distance from SE Asia across the board. My guess would be that the Munda presence in South Asia is probably not much more than 5000-6000 years old. The insight that a couple of Indo-European speaking populations probably experienced language shift from Munda in the Indo-Aryan invasion era or later, however, is insightful.

    I was also not impressed at all by the conclusions by historical dates that they inferred from genetic diversity which made assumptions not very well supported by what we know from other sources, and were quite at odds with the highly equivocal tone of the paper that lays out possibilities rather than reaching conclusions elsewhere. Basically, they say that because genetic diversity is similar everywhere that the ANI component must be older than 12,500 years ago. An inference that all of those populations were seeded at the same time, all from similar and substantial effective population sized sources that were rapidly expanding and hence not losing diversity from serial founder effects seems more plausible. (A good case in point of a migration that didn’t lose much genetic diversity at all despite a long range, effective one way migration because it met these criteria is the Basque migration to Idaho in the 1880s). Their conclusion about ANI time depth also seems to have implicit assumptions about how long the populations of Europe and the Near East have had something approximating their current levels of genetic diversity and population genetic makeup that are probably counterfactual because they assume much greater time depth than is supported by ancient DNA evidence of genetic continuity in these regions.

  • Giggsy

    “The inference here then is that perhaps “Ancestral North Indians” are a source population for other West Eurasians! I think more likely this is not the case.”

    This is EXACTLY the case. The Abrahamic-laced intellectual world is slowly but surely coming to this realization, that ANI are the ancestors of West Asians. Not the other way, as was forced onto the world… aka Aryan theory, the invasion of the sons of Shem and Japheth to enslave the cursed sons of Ham. I cannot believe that Christians and Muslims are not aware of their own historical ruling elite. If anyone knows about Aryan theory, it’s VERY CLEAR why it was needed. To confirm the home of the Jews, Christians and Muslims, aka the Abrahamic tribe, to the Middle East and attach ‘home of civilization’ to it, they needed to show that India was indeed younger than Abrahamic origins, to justify their own claim to civilization fame. To look at it deeper, it’s very easy to see why: the first two sons of Noah were Shem and Japheth, the light-skinned special tribe of God, the chosen people, who set up Judaism, Christianity and Islam. However, Noah had a third son, who was Ham, and he was cursed and made dark to be the servant of the other two sons. This curse of Ham was the reason why the Aryan invasion was created, to justify to themselves that India was not the home of civilisation, and that in fact the reason why India is civilized is that an unknown group of white Aryans from Europe or Central Asia, or somewhere else, came and civilised India. Therefore the religious sanction is given. Ham was not civilised; he was civilised by the sons of Shem and Japheth.

    That is the Aryan theory. Muslims, Christians, and Jews fully support this lie.

    “Researchers found that the Indian populations had more genetic diversity than Europeans and East Asians, which gives a good indicator of the age of a population.” Genographic Project, IBM.

    Sahoo et al had actually written the following words: “The perennial concept of people, language, and agriculture arriving to India together through the northwest corridor does not hold up to close scrutiny. Recent claims for a linkage of haplogroups J2, L, R1a, and R2 with a contemporaneous origin for the majority of the Indian castes’ paternal lineages from outside the subcontinent are REJECTED, although our findings do support a local origin of haplogroups F* and H.” They also rule out arrivals from Southwest Asia because West Asian haplogroups (like Y-Hg G) are not found in India.

    Kivisild’s findings (2003) too had shown that humans could not have arrived from West Asia into India because of lack of West Asian Y-hgs E, G, I, J* and J2f. Kivisild et al wrote, “When compared with European and Middle Eastern populations (Semino et al. 2000), Indians (i) share with them clades J2 and M173 derived sister groups R1b and R1a, the latter of which is particularly frequent in India; and (ii) lack or show a marginal frequency of clades E, G, I, J*, and J2f.”

    There is a fundamental unity of mtDNA lineages in India, in spite of the extensive cultural and linguistic diversity, pointing to a relatively small founding group of females in India. Most of the mtDNA diversity observed in Indian populations is between individuals within populations; there is no significant structuring of haplotype diversity by socio-religious affiliation, geographical location of habitat or linguistic affiliation.
    - Scientists Susanta Roychoudhury and thirteen others studying 644 samples of mtDNA from ten Indian ethnic groups.

    “Dravidian” authorship of the Indus-Sarasvati civilization rejected indirectly, since it noted, “Our data are also more consistent with a peninsular origin of Dravidian speakers than a source with proximity to the Indus….” They found, in conclusion, “overwhelming support for an Indian origin of Dravidian speakers.” The frequencies of R2 seem to mirror the frequencies of R1a (i.e. both lineages are strong and weak in the same social and linguistic subgroups). This may indicate that both R1a and R2 moved into India at roughly the same time. R2 is very rare in Europe.
    - Sanghamitra Sengupta, L. Cavalli-Sforza, Partha P. Majumder, and P. A. Underhill. – 2006.

    A (2009) study headed by geneticist Swarkar Sharma, collated information for 2809 Indians (681 Brahmins, and 2128 tribals and schedule castes). The results showed “no consistent pattern of the exclusive presence and distribution of Y-haplogroups to distinguish the higher-most caste, Brahmins, from the lower-most ones, schedule castes and tribals”. Brahmins from West Bengal showed the highest frequency (72.22%) of Y-haplogroups R1a1* hinting that it may have been a founder lineage for this caste group. The authors found it significant that the Saharia tribe of Madhya Pradesh had not only 28.07% R1a1, but also 22.8% R1a*, out of 57 people, with such a high percentage of R1a* never having been found before. Based on STR variance the estimated age of R1a* in India was 18,478 years, and for R1a1 it was 13,768 years. In its conclusions the study proposed “the autochthonous origin and tribal links of Indian Brahmins” as well as “the origin of R1a1* … in the Indian subcontinent”.
    S. Sharma, argued for an Indian origin of R1a1 lineage among Brahmins, by pointing out the highest incidence of R1a*, ancestral clade to R1a1, among Kashmiri Pandits (Brahmins) and Saharias, an Indian tribe.
    - Sharma et al 2009

    Human Genetics at the University of Michigan, conducted genetic analysis of Indian-born individuals in the US. Their studies of 1,200. ‘We were struck both by the low level of diversity amongst people spanning such a large geographical region, and by the fact that people of the Indian sub-continent constituted a distinct group when compared to populations from other parts of the world,’ said Pragna I. Patel.

    The study analysed 500,000 genetic markers across the genomes of 132 individuals from 25 diverse groups from 13 states. All the individuals were from six language families and traditionally upper and lower castes and tribal groups. “The genetics proves that castes grew directly out of tribe-like organizations during the formation of the Indian society.”
    “Impossible to distinguish between castes and tribes since their genetics proved they were not systematically different.”
    - “Reconstructing Indian Population History”
    - David Reich, Kumarasamy Thangaraj, Nick Patterson, Alkes L. Price & Lalji Singh
    - 2009

    Moreover, there are other DNA lineages found in good numbers in West Asia like R1*, R1b3, J*, J2f, I, G and E which are in total more than 53% population of west Asia. These are virtually absent from India (Sahoo). Had people migrated from West Asia to India, these haplogroups would also have arrived into India. This evidence proves that J2 did not arrive from West Asia, because no lineage can ever migrate without other lineages also migrating along with it from the place of origin or expansion. On the other hand nearly all of the Indian male lineages like F*, L1, H (M-69), K2, C5, C*, R1a (M-17) etc. are found in West Asia, proving a definite Indian migration to West Asia. The HIV protective gene, which is found in West Asia, and Central Asia too, is absent from India (Majumder and Dey, 2001). Thus on no account, any migration from West Asia to India can be supported.

    Sengupta (2006) showed that J2 is well distributed in Indian population. Sengupta et al (2006) found that the haplogroup J2 had a quite high variance, and hence deep time-depth in Indian tribes and castes too. Moreover the frequency is higher in the Dravidian speaking south Indians (19%) than the Indo-European speaking north Indians (11%). This destroys the Aryan migration into India from West Asia hypothesis of Bellwood (2003 and 2005). The inference we can derive from the data of Sengupta and colleagues’ study is that J2 haplogroup originated in India during Last Glacial Maximum, and migrated out of India when climate permitted. J2 is 18.7% in south Pakistan, the central place of Indus civilization. Lineage J2 and its derivatives are 23% in Iran and 22.2% in Turkey (Regueiro et al. 2006). But their variances are less than in India. Semino (2004) gives 18,000 ybp as the time of origin of J2. The variance was also high indicating indigenous origin of the haplogroup in India. J2 as well as its sub-clade J2b2 show a decreasing variance from India to the Balkans.

    Underhill and colleagues (2009) presented a detailed study of R1a lineages. They found that R1a is oldest in India. This lineage started expanding from Gujarat about 16,000 years back. By 14,000 years back or earlier, it reached the Ganga Valley and Indus Valley. Then people carrying R1a genes migrated out of India, through Afghanistan and Tajikistan, reaching Central Asia. From Central Asia they entered East Europe. They inhabited the Pontic-Caspian area. Then they populated those areas which are inhabited today by Slavic and Baltic speaking people.

    Team working on the same topic included Sengupta, King, Cavalli-Sforza, Underhill and colleagues. They showed that R (especially R1a1 and R2) diversity in India is indigenous in origin and does not support hypothesis of immigration from Central Asia or anywhere outside. R1a prevalence is not only high in Indo-European speaking Punjab, south Pakistan and Ganga Valley, but also in Chenchu and Koya tribes of south India (Kivisild et al. 200

    Oppenheimer (2003) also had supported Indian origin of R1a which is also called M17 in genetic circles. He wrote, “And sure enough we find highest rates and greatest diversity of the M17 line in Pakistan, north India, and eastern Iran, and low rates in the Caucasus. M17 is not only more diverse in South Asia than in Central Asia but diversity characterizes its presence in isolated tribal groups in the south, thus undermining any theory of M17 as a marker of a ‘male Aryan Invasion of India.’ Study of the geographical distribution and the diversity of genetic branches and stems again suggests that Ruslan, along with his son M17, arose early in South Asia, somewhere near India”.

  • Giggsy

    http://www.biomedcentral.com/1471-2148/7/47/figure/F1?highres=y

    To sum up we conclude that, because of its very high frequency and diversity, haplogroup O-M95 had an in-situ origin among the Indian Austro-Asiatics, particularly among the Mundaris, not in Southeast Asia as envisaged earlier. Given the large estimate of TMRCA, our study suggests that the Mundari populations are one of the earliest settlers in the Indian Subcontinent.

  • http://blogs.discovermagazine.com/gnxp Razib Khan

    wow giggsy, lots of shit larded around a few flecks of gold. YOU’RE BANNED!!! :-)

  • http://blogs.discovermagazine.com/gnxp Razib Khan

    #1, your model is close to what i think is probably right. though my confidence is low.

  • Karl Zimmerman

    Apologies if this was widely known before, as I don’t keep as up on the literature as you obviously do. The Brahuis look genetically indistinguishable from the Balochis. I had thought that current theory was they had migrated from Central India within the last thousand years or so (mainly due to the lack of Persian loanwords), but unless it was a case of uber elite dominance it’s hard to see how you wouldn’t end up with at least a somewhat elevated “dark green” component. Thus the hypothesis that the Dravidian was indigenous to the region becomes more likely.

  • Justin Giancola

    hahahaha! – perfect. when you can make an ohwilleke comment seem like a walk in the park and compete with a razib mega post that’s a sign you should be linking to your own damn blog or maybe a google doc or something!

  • Mona 2

    i challenge anyone to a single demonstration that ANI is closer to west asia and not south asia!!

  • pconroy

    Check out Dienekes’ latest Dodecad Mod – incorporates Metspalu et al. (2011) data:

    http://dienekes.blogspot.com/2011/12/first-analysis-of-metspalu-et-al-2011.html

    My results are in percentages:
    Mediterranean 38.55
    Far_Asian 0.00
    Siberian 0.00
    North_European 48.92
    South_Asian 0.00
    West_African 0.00
    Caucasus 4.05
    Gedrosia 8.13
    East_African 0.26
    Southwest_Asian 0.03
    Southeast_Asian 0.00
    Northwest_African 0.00

    The newest factor here is the Gedrosian one, which previously was subsumed into the West Asian and South Asian ones. Gedrosia is:
    http://en.wikipedia.org/wiki/File:Gedrosia-Map-Route-of-Alexander-1823-Lucas.png

    Previously Doug McDonald estimated that I had a factor that was 3.1% Sindhi or 3% Pathan, so these results tie in with that.

  • pconroy

    @Mona,

    The ANI component is basically the same as Dienekes’s Gedrosia component, and here are the FST values:

    http://2.bp.blogspot.com/-MS8Xp-YlcPI/TuMo_f6zLyI/AAAAAAAAEWc/_X6DX4dIvD8/s1600/K12a_fst.png

    So based on that, Gedrosia/ANI is nearest to:
    0.036 Caucasus
    0.049 North European
    0.061 Mediterranean
    0.065 Southwest Asian
    0.073 Northwest African
    0.075 South Asian

    The Caucasus component was labelled West Asian in older runs – so I think that’s fairly conclusive.
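
The ranking in the comment above can be reproduced with a short sketch. The FST values are the ones quoted in that comment; treating Gedrosia as a proxy for ANI is the commenter's assumption, and the variable and function names here are made up for illustration.

```python
# FST between the Gedrosia component and the others, as quoted in the comment.
fst_from_gedrosia = {
    "South Asian": 0.075,
    "Mediterranean": 0.061,
    "Caucasus": 0.036,
    "Northwest African": 0.073,
    "North European": 0.049,
    "Southwest Asian": 0.065,
}


def rank_by_fst(fst):
    """Return (component, FST) pairs ordered from genetically nearest to farthest."""
    return sorted(fst.items(), key=lambda item: item[1])


ranking = rank_by_fst(fst_from_gedrosia)
for name, value in ranking:
    print(f"{value:.3f} {name}")
```

A smaller FST means less differentiation, so the first entry (Caucasus) is the nearest component and the last (South Asian) is the farthest of the six listed.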

  • http://India Indrajeet Kashyap

    “West Eurasian regions of the genome in South Asians may be more diverse in terms of haplotypes than in Europeans, and some extent Near Easterners. The inference here then is that perhaps “Ancestral North Indians” are a source population for other West Eurasians!”

    This “more diversity than” argument is used to identify the source population in many cases, but is the diversity greater only because of this process, or can many other things / influences also play a part in increasing it?

    Caste system is very big thing in South Asia and it is thousands of years old. How such a unique and complex social phenomenon shapes the population is hard to calculate even for experts, I think. I wonder how the researchers take this into account while concluding so many things from their results.

  • http://blogs.discovermagazine.com/gnxp Razib Khan

    Caste system is very big thing in South Asia and it is thousands of years old

    you don’t know it’s thousands of years old, so don’t say it is as if it is. in any case, some geneticists do see evidence that it’s thousands of years old.

Gene Expression

This blog is about evolution, genetics, genomics and their interstices. Please beware that comments are aggressively moderated. Uncivil or churlish comments will likely get you banned immediately, so make any contribution count!

About Razib Khan

I have degrees in biology and biochemistry, a passion for genetics, history, and philosophy, and shrimp is my favorite food. In relation to nationality I'm an American Northwesterner, in politics I'm a reactionary, and as for religion I have none (I'm an atheist). If you want to know more, see the links at http://www.razib.com
