More markers, or more populations?

By Razib Khan | July 2, 2010 11:33 am

Here’s a letter to The American Journal of Human Genetics worth reading, Genetic Landscape of Eurasia and “Admixture” in Uyghurs:

…In the papers…by Xu and Jin, the genetic structure of Uyghurs was described by 8150 ancestry-informative markers (AIMs). These markers estimated the admixture rate of the Uyghur population to be around 50% East Asian ancestry by comparing Uyghurs to East Asians and Europeans. However, we suspect that the estimate of Xu and Jin may be considerably biased by insufficient reference population coverage….

The difference between the estimate of Xu and Jin (52%) and our estimate (31%) may stem from either the different population coverage or the sample size. We analyzed a different and larger sample of Uyghur individuals (n = 48) than that analyzed by Xu and Jin….Their small sample size may have contributed to their overestimation of the European component to admixture (i.e., to cluster assignment). However, the insufficient population coverage may be more responsible for the difference than the sample size or the number of markers. Concerning the number of markers, it is known that a relatively small but specifically selected number of AIMs can accurately predict ethnicity proportion…As the two papers of Xu and Jin have demonstrated, the estimated admixture rates reported did not change much regardless of whether they were using chromosome 21 data only or the whole genome, and thus a large number of markers may not be necessary to estimate the “admixture” rate of Uyghurs. When we analyzed only the 12 markers with the highest FST values in our samples…the Uyghurs had a 30.2% assignment at K = 2 to the Europe and Western Asia cluster. This estimate was not significantly different from the above 31.2% when using all 68 markers. We consider it unlikely that a different set of appropriately chosen SNPs would give a markedly different answer based on unpublished data on some of these same populations….
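For readers who want the marker-selection step spelled out, here is a minimal sketch of ranking markers by FST and keeping the top few. The letter does not say which FST estimator was used, so the simple heterozygosity-based two-population formula below is an assumption. The marker names and allele frequencies are invented for illustration.

```python
def fst_two_pops(p1, p2):
    """Simple FST for one biallelic marker in two equally weighted populations.

    Computed as (H_total - H_within) / H_total from allele frequencies p1, p2.
    This is an illustrative estimator, not necessarily the one used in the letter.
    """
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)
    if h_total == 0:
        # Marker is fixed for the same allele in both populations.
        return 0.0
    h_within = p1 * (1 - p1) + p2 * (1 - p2)  # mean of the two heterozygosities
    return (h_total - h_within) / h_total


def top_markers(freqs, k):
    """Return the k marker names with the highest FST, best first."""
    ranked = sorted(freqs, key=lambda name: fst_two_pops(*freqs[name]), reverse=True)
    return ranked[:k]


# Hypothetical markers: name -> (frequency in population 1, frequency in population 2)
freqs = {
    "m1": (0.9, 0.1),
    "m2": (0.5, 0.5),
    "m3": (0.7, 0.3),
    "m4": (0.2, 0.9),
    "m5": (0.55, 0.45),
}

print(top_markers(freqs, 2))
```

Markers whose frequencies differ sharply between populations rise to the top, which is why a small, well-chosen panel can carry most of the ancestry signal.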

Basically, the authors argue that to capture geographic diversity you are better off sampling a more diverse range of populations (which captures more of the between-population genetic variance) than piling on more markers within the same individuals. The reference population matters. I know that 23andMe tells South Asians to expect results of 70-90% “European,” with the balance “East Asian.” People with only Native American ancestry will come back as 75% “East Asian” and 25% “European.” Results like these, which are artifacts of the reference populations, are pretty misleading in my opinion. If you model the variation of all the world’s populations as a combination of the variation in a few reference populations, you get a stylized fact that is confusing if you don’t know how to interpret it.
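To make the reference-population problem concrete, here is a minimal sketch under loud assumptions. It is not how 23andMe or the letter's authors compute ancestry. It fits a target population's allele frequencies as a least-squares mix of just two reference panels. All frequencies, marker counts, and function names are invented for illustration.

```python
# Toy allele frequencies at six hypothetical markers (invented for illustration).
eur = [0.9, 0.8, 0.1, 0.2, 0.5, 0.5]    # "European" reference panel
eas = [0.1, 0.2, 0.9, 0.8, 0.5, 0.5]    # "East Asian" reference panel
third = [0.3, 0.4, 0.7, 0.6, 0.9, 0.1]  # a source population missing from the panel


def two_way_fraction(target, ref_a, ref_b):
    """Least-squares share of ref_a in target, clipped to the range [0, 1].

    Assumes target = f * ref_a + (1 - f) * ref_b at every marker.
    """
    num = sum((t - b) * (a - b) for t, a, b in zip(target, ref_a, ref_b))
    den = sum((a - b) ** 2 for a, b in zip(ref_a, ref_b))
    if den == 0:
        raise ValueError("reference populations are identical at every marker")
    return min(1.0, max(0.0, num / den))


def misfit(target, ref_a, ref_b, frac_a):
    """Sum of squared differences between target and the fitted two-way mix."""
    return sum(
        (t - (frac_a * a + (1 - frac_a) * b)) ** 2
        for t, a, b in zip(target, ref_a, ref_b)
    )


# A population that really is a 30/70 mix of the two references.
true_mix = [0.3 * e + 0.7 * a for e, a in zip(eur, eas)]

for label, pop in [("true 30/70 mix", true_mix), ("unsampled source", third)]:
    frac = two_way_fraction(pop, eur, eas)
    print(f"{label}: {frac:.2f} 'European', misfit {misfit(pop, eur, eas, frac):.3f}")
```

With these toy numbers the real mix is recovered with essentially zero misfit, while the unsampled source is still handed a tidy split (about 28% "European") even though it descends from neither panel. Only the large misfit hints that a reference population is missing.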

Below is figure one, which shows the difference between K = 2 and K = 6 (that is, assuming two or six ancestral populations for the data set). The map illustrates the K = 6 result: the intensity of each color represents the present-day contribution of one of the six ancestral groups in that region.



  • Anon

    “I know that 23andMe tells South Asians to expect to get back that they’re 70-90% “European,” with the balance “East Asian.” People with only Native American ancestry are going to be 75% “East Asian” and 25% “European.””

    Seriously? Surely they could fix this relatively easily?

  • LongMa

    They also have a hard time telling the difference between Bantoid ancestry and Native American/Asian. Many pure black Africans who are more distant from their Yoruba reference sample get “Asian” admixture.

    I got 20% Asian, 62% African, and the rest Euro. I had someone else run my raw data (who uses more samples) and he said my “Asian” (which is most likely Native American) is about 4%, somewhere between 15-20% is Euro, and the rest West African, which would make me very stereotypically African American. LOL

    23&me has a lot of folks confused.

