D.I.Y. PCA

By Razib Khan | February 11, 2011 1:50 am

Longtime readers know that I have a fixation on people not taking PCA too literally as something concrete. Tonight I finally merged the HGDP data set with some of the HapMap ones I’ve been playing with, and tacked my parents onto the sample. I took the ~50 HGDP populations, added the Tuscans, the two Kenyan groups, and the Gujaratis, and merged them. I thinned the marker set to 105,000 SNPs (I had to flip the HGDP strand too). Then I just let Eigensoft do its magic, and 2 hours on I produced my own plot. I’m still getting the hang of the labeling issues, but first let’s look at what 23andMe produces (I’m green):

Now let’s see what I produced:

I suspect that the gap between my parents and the main South Asian cluster is just an artifact of the lack of South and East Indians in the sample. Additionally, things would look different if I removed the Africans, since the first principal component would be freed up. More on that later. All in all, still pretty awesome that circa 2011 this sort of thing is just an evening’s concentration.
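To make the point about the first principal component concrete, here is a minimal sketch on simulated data, not the HGDP/HapMap merge and not Eigensoft itself. It assumes a genotype matrix coded 0/1/2, and scales each SNP by the square root of p(1-p), in the style of the normalization usually described for smartpca. The population names (`outgroup`, `pop_b`, `pop_c`), the drift values, and the helper functions are all made up for illustration.

```python
import numpy as np

def simulate(rng, n_snps=5000, n_per_pop=30):
    """Simulate genotypes for one divergent outgroup and two closely related populations."""
    ancestral = rng.uniform(0.1, 0.9, size=n_snps)
    drift = {"outgroup": 0.15, "pop_b": 0.02, "pop_c": 0.02}  # hypothetical drift (F) values
    blocks, labels = [], []
    for pop, f in drift.items():
        # Balding-Nichols model: each population's allele frequencies
        # scatter around the ancestral ones, more widely for larger F.
        p = rng.beta(ancestral * (1 - f) / f, (1 - ancestral) * (1 - f) / f)
        blocks.append(rng.binomial(2, p, size=(n_per_pop, n_snps)))
        labels += [pop] * n_per_pop
    return np.vstack(blocks).astype(float), np.array(labels)

def pca(genotypes, n_components=2):
    """Return per-sample scores on the top principal components."""
    p = genotypes.mean(axis=0) / 2.0           # pooled allele frequency per SNP
    keep = (p > 0) & (p < 1)                   # drop SNPs with no variation in this sample
    g, p = genotypes[:, keep], p[keep]
    z = (g - 2 * p) / np.sqrt(p * (1 - p))     # center and scale each SNP
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

def mean_gap(scores, labels, pop_x, pop_y, pc=0):
    """Absolute distance between two population means on one component."""
    return abs(scores[labels == pop_x, pc].mean() - scores[labels == pop_y, pc].mean())

rng = np.random.default_rng(0)
genotypes, labels = simulate(rng)

# PCA on everyone: the outgroup is expected to dominate the first component.
all_scores = pca(genotypes)

# PCA again without the outgroup: the first component is recomputed
# from the remaining samples only.
keep = labels != "outgroup"
sub_scores, sub_labels = pca(genotypes[keep]), labels[keep]

print("PC1 with outgroup, outgroup vs pop_b:", mean_gap(all_scores, labels, "outgroup", "pop_b"))
print("PC1 with outgroup, pop_b vs pop_c:   ", mean_gap(all_scores, labels, "pop_b", "pop_c"))
print("PC1 without outgroup, pop_b vs pop_c:", mean_gap(sub_scores, sub_labels, "pop_b", "pop_c"))
```

In this toy setup the two close populations sit nearly on top of each other on the first component while the outgroup is present, and only pull apart on it once the outgroup is dropped. The axes describe whatever variation is largest in the particular sample you feed in, which is one more reason not to read them too literally.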

CATEGORIZED UNDER: Genetics, Genomics
MORE ABOUT: Genetics, Genomics, PCA
  • John Emerson

    Was there ever an endogenous Mughal group in South Asia? If both your parents distantly came from that group, even though assimilated to the local populations for a few generations, the Uighur connection would be unremarkable.

    I know a Lao named Somboumkhan whose grandfather was a British mercenary of South Asian descent. The military and war create a lot of genetic movement.

  • Justin Giancola

    Are europeans steelers fans? what’s up with that?

  • http://blogs.discovermagazine.com/gnxp Razib Khan

    Was there ever an endogenous Mughal group in South Asia? If both your parents distantly came from that group, even though assimilated to the local populations for a few generations, the Uighur connection would be unremarkable.

    u mean endogamous? they weren’t endogamous. but yeah, akbar’s mongolian appearance was pretty evident. his 3/4 rajput grandson, not so much. his 1/2 persian great-grandson, not at all. i will look into the details of that issue later (uighurs are in the HGDP).

    i’m a steeler’s fan ;-)


Gene Expression

This blog is about evolution, genetics, genomics and their interstices. Please beware that comments are aggressively moderated. Uncivil or churlish comments will likely get you banned immediately, so make any contribution count!

About Razib Khan

I have degrees in biology and biochemistry, a passion for genetics, history, and philosophy, and shrimp is my favorite food. In relation to nationality I'm an American Northwesterner, in politics I'm a reactionary, and as for religion I have none (I'm an atheist). If you want to know more, see the links at http://www.razib.com
