The importance of representativeness

By Razib Khan | November 9, 2010 6:56 am

A few weeks ago, when I posted on results pointing to a high likelihood of a partially eastern origin for the Mundari people, I received a message via Facebook arguing that the article really wasn’t relevant to most South Asians, since only 1-2% speak a Mundari language (along with pointers to old, out-of-date articles). I immediately replied that it is likely that the Mundari were one of the base populations from which the Indo-Aryan speaking peoples of Bengal, Orissa and Assam arose. The Santals are present as a minority in all three of these states, and the likelihood is that Santal tribals were assimilated into Hindu (and later Muslim) society, not the other way around. My interlocutor was a little too fixated on issues having to do with colonialism to see clearly what I was trying to get at. That’s fine, we all have our own experiences.

But in any case the bigger point of that post was to emphasize the importance of representativeness. This is something that really stands out with South Asians. There are around 1.3 billion of us, but the HGDP sample has only Pakistani groups. Some of these, such as the Kalash and Burusho, are cultural isolates, whose sampling was justified on the grounds that these people were likely going to be assimilated in the near future. Of the HGDP South Asians only one group, the Sindhi, speaks an Indo-Aryan language, the family which covers about 80% of South Asians. More recent papers have moderately rectified that situation. Though as an Indian American Bengali friend of mine observed, “there are 200 million of us!” I believe, and hope, that in three years these sorts of worries and questions will seem like ancient history. Below the fold I’ve taken Dienekes’ ADMIXTURE estimates for HGDP and HapMap3 South Asian groups and appended myself to them.

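[Figure: Dienekes’ ADMIXTURE estimates for HGDP and HapMap3 South Asian groups, with my own results appended]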

I’m soon going to get my parents tested via 23andMe, and I’ll have a better sense of whether my elevated “East Asian” ancestry is due to recent admixture or is part of the normal range in eastern Bengal. If, as I suspect, most of the East Asian is from my father, I’ll increase the probability of the former. If it’s more balanced, I’ll increase the likelihood that I’m representative of many Bengalis. There are a few Bengalis on 23andMe and most of them have elevated “Asian” ancestry, though not as much as me.

CATEGORIZED UNDER: Genetics, Genomics
MORE ABOUT: Bengali, South Asia
  • VG

    I have a feeling your profile fits that of an average Bengali/Bangladeshi who is not from a lower caste background. The further east you go in Bengal, the greater the east asian component. I notice that there’s no mapping for southeast Asian, maybe that could throw some light.

  • VG

    What is particularly striking, though, is how little difference there is between you and the average Gujarati, except for the east Asian component, considering one group represents the extreme west of India and the other the extreme east, not counting the northeast tribes.

  • http://blogs.discovermagazine.com/gnxp Razib Khan

    low caste groups in bengal are going to be somewhat different, as many of them are of mundari tribal origin. IOW, they’ll be enriched for the “eastern” element.

  • http://ecophysio.fieldofscience.com/ EcoPhysioMichelle

    I am going to have to learn to use this program when I get my 23andme results (how long did that take you, btw? website says 6 months). Looks similar to Structure, which I’ve at least seen others use.

  • toto

    I’m sure there are mundane explanations for the large, nearly-homogeneous-within-populations “West Asian” component (except for the South Indians). I’m sure there are also very simple reasons why it’s shared with most Europeans, but not Basques.

    I just can’t see one right now.

    Well actually I can see two, but they’re weird. One is that the presumed “Ancient North Indian” component has the same origin as the W. Asian components in Europeans – an ancient demic wave of diffusion from the Near East, perhaps riding on the Neolithic revolution.

    The other is Indo-Europeans, but I can’t quite believe that. So much impact, so well-mixed, in less than 4k years?

    Admittedly the correlation in both European and South Asian graphs is a bit unsettling (“Everybody except the Tamils, the Basque and the Finns – see a pattern here?”). But this correlation, like others, may simply be the result of an accidental correlation between a Neolithic demic wave, and a later Indo-European cultural advance. I mean, if you’re going to expand from a region located roughly around the Black Sea, the resulting pattern is bound to be predictable.

    (/idle speculation)

  • http://blogs.discovermagazine.com/gnxp Razib Khan

    “I am going to have to learn to use this program when I get my 23andme results (how long did that take you, btw? website says 6 months). Looks similar to Structure, which I’ve at least seen others use.”

    got my results in less than a month. that was during the huge order spike with discount. like structure, but max lik instead of bayesian. faster. i think you need to filter for LD. i ran the hapmap data sets. still took too long on my notebook, so i left to dienekes.

    “But this correlation, like others, may simply be the result of an accidental correlation between a Neolithic demic wave, and a later Indo-European cultural advance. I mean, if you’re going to expand from a region located roughly around the Black Sea, the resulting pattern is bound to be predictable.”

    if you want me to be honest, i think the indo-europeans came to india later after the demic wave. but if reich et al. are correct the low bound for “ancient north indian” is ~40%, in some south indian tribals.* IOW, the “south asian” element popping out of ADMIXTURE at K = 10 is a linear combination of a european-like & ancient south asian group. on top of this there seems to be a distinctive “west asian” and “north european” element. so there is probably at least a second wave, but i bet there are probably three waves. notice that the “northern european” present in northwest south asia and afghanistan is pretty much absent among armenians and assyrians (excluding the outlier armenians).

    but if you think that’s crazy, look at this

    http://dodecad.blogspot.com/2010/11/multidimensional-scaling-in-italy.html

    see my suggestion in the comments.

    * if you assume that .4 of the purple element is ANI, along with all of the west asian + north european, you get roughly the total ANI estimates in the reich et al. supplements for pakistani groups
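
The back-of-the-envelope calculation in that footnote can be written out as a short Python sketch. The function name and the input proportions below are hypothetical and chosen only to show the arithmetic; they are not values from Dienekes' ADMIXTURE run or from the Reich et al. supplements. The one number taken from the comment is the assumption that 0.4 of the "South Asian" (purple) element is ANI.

```python
def approx_ani_fraction(south_asian, west_asian, north_european,
                        ani_share_of_south_asian=0.4):
    """Rough ANI estimate from three ADMIXTURE components.

    Assumes, as in the footnote, that a fixed share (default 0.4) of the
    "South Asian" component is ANI, and that the "West Asian" and
    "North European" components count as ANI in full.
    All inputs are fractions of total ancestry between 0 and 1.
    """
    components = (south_asian, west_asian, north_european)
    if any(c < 0 or c > 1 for c in components):
        raise ValueError("each component must be a fraction between 0 and 1")
    if sum(components) > 1 + 1e-9:
        raise ValueError("components cannot sum to more than 1")
    return ani_share_of_south_asian * south_asian + west_asian + north_european


# Hypothetical proportions, for illustration only (not real data).
estimate = approx_ani_fraction(south_asian=0.5, west_asian=0.3, north_european=0.1)
print(f"approximate ANI fraction: {estimate:.2f}")
```

Swapping in the component fractions read off an actual ADMIXTURE bar for a given population would give the figure to compare against the published ANI estimate for that group.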


Gene Expression

This blog is about evolution, genetics, genomics and their interstices. Please beware that comments are aggressively moderated. Uncivil or churlish comments will likely get you banned immediately, so make any contribution count!

About Razib Khan

I have degrees in biology and biochemistry, a passion for genetics, history, and philosophy, and shrimp is my favorite food. In relation to nationality I'm an American Northwesterner, in politics I'm a reactionary, and as for religion I have none (I'm an atheist). If you want to know more, see the links at http://www.razib.com
