Visualizing "typical" Eurasians

By Razib Khan | February 23, 2011 4:50 am

A few weeks ago I started looking at the 23andMe raw files of some of my friends and integrating them into the HGDP and HapMap population data sets. One of the first things I did was remove the African populations from my total data. The reason, as you can see to the left, is that Africans occupy the largest principal component of variation, which sets them apart from Eurasians. With that component given over to the African versus non-African split, the non-Africans are squeezed into a single remaining dimension, and groups like Oceanians and Amerindians show up in the strangest places. That’s because these groups are non-African, and do not differ as much along the primary west-east axis of genetic variance which shakes out of any such analysis. Africans aren’t the only issue though. As I’ve noted before, I’ve been running ADMIXTURE, and isolated groups such as the Kalash can “monopolize” one particular color. This may be due to the Kalash being some distilled essence of an ancestral population, but I suspect that it’s more that genetic drift due to isolation has made these sorts of groups distinctive. So I removed these outliers…though do note that other “outliers” often pop out of the data to take their place.
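For readers who want to try the same kind of pruning, here is a minimal sketch of dropping populations from a sample list before running a PCA. The population labels, sample IDs, and the function name `filter_samples` are all made up for illustration; the exact set of populations I removed is not listed in this post.

```python
# Illustrative labels only: NOT the actual exclusion list used for the plots.
EXCLUDED_POPULATIONS = {"Yoruba", "Mandenka", "Kalash"}


def filter_samples(samples, excluded=EXCLUDED_POPULATIONS):
    """Keep (sample_id, population) pairs whose population is not excluded."""
    return [(sid, pop) for sid, pop in samples if pop not in excluded]


# Hypothetical sample records: (sample_id, population_label).
samples = [
    ("HGDP_A", "Yoruba"),
    ("HGDP_B", "French"),
    ("HGDP_C", "Kalash"),
    ("HGDP_D", "Han"),
    ("ID001", "Personal"),
]

kept = filter_samples(samples)
for sample_id, population in kept:
    print(sample_id, population)
```

The filtered list would then be what gets handed to the PCA, so the removed groups no longer get a component to themselves.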

Below is a slide show of PCA plots, with the 1st component of variance plotted against the 2nd, 3rd, and 4th components. At the 5th component and beyond, the lower eigenvectors seem to achieve a level of stability in magnitude. Remember that the plots are not scaled: the 1st PC is about an order of magnitude bigger than the 2nd. I’ve also attached an ADMIXTURE plot with K = 12, both for the populations and for the individuals who have given me their 23andMe files. I’ve placed those individuals upon the PCA as well. And yes, ID001 and ID002 are my parents.
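To make the point about scale concrete, here is a small sketch of how one might compare the relative size of the components from a list of eigenvalues. The eigenvalues below are invented for illustration (chosen so the 1st is ten times the 2nd); they are not the values from my run, and `variance_fractions` is just a hypothetical helper name.

```python
def variance_fractions(eigenvalues):
    """Return each eigenvalue as a fraction of the sum of the listed eigenvalues."""
    total = sum(eigenvalues)
    return [ev / total for ev in eigenvalues]


# Made-up eigenvalues: a dominant 1st PC, then a slowly flattening tail.
eigenvalues = [40.0, 4.0, 2.5, 1.8, 1.2]

fractions = variance_fractions(eigenvalues)
ratio_pc1_to_pc2 = eigenvalues[0] / eigenvalues[1]

for index, fraction in enumerate(fractions, start=1):
    print(f"PC{index}: {fraction:.3f}")
print("PC1 / PC2 ratio:", ratio_pc1_to_pc2)
```

Note that the fractions here are relative to the listed eigenvalues only, so they overstate each component's share if the list is truncated. Either way, an unscaled plot draws both axes at the same visual length, so a visual gap along PC2 counts for much less than the same gap along PC1.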



As you can see, I’ve color coded the population groups. Europeans are red, Middle Easterners are blue, South Asians brown, and East Asians purple. Initially I assumed I’d made an error when I saw that the Russians and northwest Europeans were adjacent to a Middle Easterner cluster, but it goes to show you what happens when you remove African variance. Much of the Middle Eastern distribution in the conventional HGDP PCAs seems to be due to genetic relatedness with Sub-Saharan Africa. That’s not in this plot, so Middle Easterners form a relatively tight cluster. Not only that, but there’s sometimes a weird connection between northern European populations and groups closer to the heart of Eurasia. I think that’s why Orcadians and my friends tend to be shifted in that direction, while Sardinians, Basques, French, and even Tuscans are “more European.” This weird pattern is especially evident in ID004, who is by and large a vanilla white American of Germanic heritage, but always seems to exhibit a trace but non-trivial element which connects him to Central and South Eurasian groups.

Finally, note that my parents tend to cluster together in all the higher PCs, not just 1 & 2. This stuff isn’t totally arbitrary.

Note: I put the raw PCA results generated by EIGENSOFT here in CSV. I would caution that plotting with a conventional desktop spreadsheet might be a touch computationally intensive, but you’re welcome to try.
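If a spreadsheet chokes, a short script is an alternative. The sketch below groups PCA coordinates by population so that each group can be plotted in its own color. The column names (`sample`, `population`, `PC1`, `PC2`) and the rows are assumptions made up for this example; check the header of the actual file and adjust accordingly.

```python
import csv
import io
from collections import defaultdict

# Hypothetical stand-in for the real file; the actual columns may differ.
SAMPLE_CSV = """sample,population,PC1,PC2
ID001,Personal,0.012,-0.034
ID002,Personal,0.011,-0.031
S1,French,-0.045,0.020
S2,Han,0.080,0.015
"""


def group_by_population(handle, x_col="PC1", y_col="PC2"):
    """Read PCA rows and return {population: [(x, y), ...]}."""
    groups = defaultdict(list)
    for row in csv.DictReader(handle):
        point = (float(row[x_col]), float(row[y_col]))
        groups[row["population"]].append(point)
    return dict(groups)


groups = group_by_population(io.StringIO(SAMPLE_CSV))
for population, points in groups.items():
    print(population, points)
```

To use it on a real file, pass `open("results.csv", newline="")` (a placeholder filename) in place of the `io.StringIO` object, then hand each population's points to whatever plotting library you prefer.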

CATEGORIZED UNDER: Genetics, Genomics
  • Perahu

    How come there is one Japanese individual in ‘Central Asia’ close to the Uyghur/Hazara on the plot? Labeling issue?

  • Fogbraider

    I can’t remember whether I’ve mentioned this before, but there’s a significant Amerindian element in the Orcadian gene pool – Orcadians (whalers, trappers) were perhaps more inclined than other British men to bring their Amerindian children (and sometimes wives) back home. There is, or used to be, a display about this in the museum in Stromness.

  • http://blogs.discovermagazine.com/gnxp Razib Khan

    #1, i assume so. #2, i’m adding the pima indians back into my pool, so we should see it. i don’t recall having seen it before.

  • pconroy

    Slightly OT, but who are the data points by themselves, to the East/South East of your parents?
    Are they South East Asians of some sort?

  • http://blogs.discovermagazine.com/gnxp Razib Khan

    “to the East/South East of your parents? Are they South East Asians of some sort?”

    chinese ethnic minorities in yunnan actually are closest.

Gene Expression

This blog is about evolution, genetics, genomics and their interstices. Please beware that comments are aggressively moderated. Uncivil or churlish comments will likely get you banned immediately, so make any contribution count!

About Razib Khan

I have degrees in biology and biochemistry, a passion for genetics, history, and philosophy, and shrimp is my favorite food. In relation to nationality I'm an American Northwesterner, in politics I'm a reactionary, and as for religion I have none (I'm an atheist). If you want to know more, see the links at http://www.razib.com
