One baby, alone on a PCA island

By Razib Khan | April 24, 2012 8:40 pm

A week ago I reported that according to 23andMe I’m 40% Asian, and she is 8% Asian (in the future if I say “she” without explanation, you know of whom I speak). Obviously something is off here, since the expectation from my 40% would be about 20%. The situation resolved itself when I tuned my parameters and increased my sampled populations in Interpretome. By now I’ve already done the estimates of recombination on the chromosomes which came together to produce her, and the realized value of 8 percent instead of 20 percent “Asian” simply cannot be due to a particular set of unlikely crossing over events. From what I can gather it seems like ancestry painting should be viewed as a qualitative rather than a quantitative assessment. This sounds really strange when you are given percentages, but the results are strange, and too often obviously wrong in terms of the specific values.

Here’s an admixture plot which shows more realistically informative values:

I’ve run several admixture plots already with my daughter, and one thing that seems clear to me is that she received more than her “fair share” of East Asian ancestry from me. By this, I mean that I usually come out as about 15 percent or so East Asian. My daughter seems to consistently be more than 7.5 percent East Asian. This could be some sort of bias in the method, but it seems just as likely that it’s the natural outcome of sample variance. I don’t have that much East Asian to go around, so it isn’t surprising if there’s a large error in my transmission.
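A toy simulation gives a rough feel for how much sampling variance is plausible here. This is only a sketch, not how 23andMe or ADMIXTURE computes anything: it assumes the parent's genome is cut into `n_blocks` independently inherited blocks (a hypothetical parameter standing in for the real recombination process), that a fixed share of the parent's block copies carry East Asian ancestry, and that the other parent contributes none.

```python
import random
import statistics


def simulate_child_fraction(parent_fraction, n_blocks, n_trials, seed=0):
    """Return simulated child ancestry fractions under a toy block model.

    Assumptions (illustrative only): the parent has two copies of each of
    n_blocks blocks, blocks are inherited independently, and the other
    parent contributes none of this ancestry.
    """
    rng = random.Random(seed)
    n_copies = 2 * n_blocks
    n_east_asian = round(parent_fraction * n_copies)
    fractions = []
    for _ in range(n_trials):
        # Scatter the parent's East Asian block copies at random.
        copies = [True] * n_east_asian + [False] * (n_copies - n_east_asian)
        rng.shuffle(copies)
        transmitted = 0
        for block in range(n_blocks):
            # Meiosis passes on exactly one of the two homologous copies.
            pair = (copies[2 * block], copies[2 * block + 1])
            transmitted += pair[rng.randrange(2)]
        # Only half of the child's genome comes from this parent.
        fractions.append(0.5 * transmitted / n_blocks)
    return fractions


if __name__ == "__main__":
    # 0.15 is the parental estimate quoted above; 60 blocks is a made-up value.
    sims = simulate_child_fraction(0.15, n_blocks=60, n_trials=5000)
    print("mean child fraction:", round(statistics.mean(sims), 4))
    print("standard deviation:", round(statistics.stdev(sims), 4))
    print("share of trials above 0.09:", sum(f > 0.09 for f in sims) / len(sims))
```

The mean sits at half the parental value by construction. The spread depends almost entirely on the assumed number of independent blocks, so the output should be read qualitatively.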

The rest checks out as you’d expect. There are a few ancestral components where her mother and I overlap in small portions (e.g., a “West Asian” one which spans Central Europe to North India), so it isn’t always so easy to see where to draw the “50:50” line. But one thing that I want to emphasize is that these plots don’t show you “real” ancestral components. There’s no such thing. Populations and ancestries are ultimately reducible down to genetic variation. These visualizations, and the components generated by their hypotheses, reduce a set of human non-readable information to human-readable format. If the argument outlined in Reconstructing Indian History is correct then the “Gujarati B” ancestral element in this plot is actually a stabilized admixture between a West Eurasian component and a very diverged South Eurasian one. It is therefore just as accurate, and historically more informative, to state that my daughter is ~20 percent South Eurasian, ~70 percent West Eurasian, and ~10 percent East Eurasian.

When I read ADMIXTURE bar plots I try hard (and do not always succeed) to remember that they convey relative relationships with excellent precision, but they are not telling me absolute truths. By modulating the populations sampled or changing random seeds one can obtain radically different results. From this we should not conclude that reality is a fiction. Rather, our methods are incomplete and imprecise mappings upon reality. All that said, it is generally difficult to distort the rough topology of relationships in these plots so that East Eurasians come out genetically closer to Africans than they are to West Eurasians. The details may be twisted and stretched, but the general outline of relations will remain.

23andMe has a PCA where they project you upon the HGDP data set variation. The north-south axis is Eurasia vs. Africa, and west-east is Europe vs. East Asia. My daughter is in green. She’s about halfway between her parents, somewhere in the Central/South Asian cluster. This plot seems to be much more robust to what you throw at it than ancestry painting. People are where they “should be.” I suspect that’s because the PCA methods require fewer markers.

But frankly I wish they would give you more options in terms of what you could see. For example, both South Asians and Oceanians are rendered as linear combinations of the variance components dominated by Africans, Europeans, and East Asians. This is not optimal. Going down the PC dimensions would almost certainly allow for the shake out of South Asian and Oceanian informative dimensions. But you can click on the regions, and get a PCA plot which places you within your geographic context.

This turns out to be useless for my daughter. She’s shoehorned into a cluster where she’s closer to South Asian populations than the Balochi samples? I presume the issue here is that she’s being projected upon South Asian variation, which works for half her genome. But her European ancestry is a lot less informative here, and a lot of the variance in the plot is taken up by the inbred and distinctive Kalash. I really hope that 23andMe improves this feature over the summer; it can’t be that hard. I don’t think you need to recompute the PCA, but if you did, PCA doesn’t take up nearly as much horsepower as hypothesis-based inference or chromosomal ancestry assignments.
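The point about not needing to recompute the PCA can be sketched in a few lines: once the axes are fitted on a reference panel, placing a new person is just a matrix product. The code below is an illustration with synthetic genotypes, not 23andMe's pipeline; the function names, the two made-up populations, and the allele frequencies are all assumptions.

```python
import numpy as np


def make_panel(rng, n_per_pop=50, n_snps=500):
    """Synthetic reference panel: two hypothetical populations, 0/1/2 genotypes."""
    freqs_a = rng.uniform(0.1, 0.9, n_snps)
    freqs_b = rng.uniform(0.1, 0.9, n_snps)
    pop_a = rng.binomial(2, freqs_a, size=(n_per_pop, n_snps))
    pop_b = rng.binomial(2, freqs_b, size=(n_per_pop, n_snps))
    return np.vstack([pop_a, pop_b]).astype(float)


def fit_pca(genotypes, n_components=2):
    """Fit principal axes once on the reference panel."""
    mean = genotypes.mean(axis=0)
    # Rows of vt are the principal axes in SNP space.
    _, _, vt = np.linalg.svd(genotypes - mean, full_matrices=False)
    return mean, vt[:n_components]


def project(sample, mean, axes):
    """Place a sample on the existing axes without refitting anything."""
    return (sample - mean) @ axes.T


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    panel = make_panel(rng)
    mean, axes = fit_pca(panel)
    parent_a, parent_b = panel[0], panel[-1]
    # A child's expected genotype is the average of the parents' genotypes.
    expected_child = (parent_a + parent_b) / 2
    print("parent A:", project(parent_a, mean, axes))
    print("parent B:", project(parent_b, mean, axes))
    print("expected child:", project(expected_child, mean, axes))
```

Because projection is linear, the expected child lands exactly midway between the two parents on any fixed set of axes. A real child will deviate somewhat, since she inherits one particular allele at each site rather than the average.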

Naturally I wanted to give it a shot. So I took the data set which I used above in ADMIXTURE and ran it through a PCA. With the Gujarati data set from the HapMap the South Asian component was more fully fleshed out. But perhaps more importantly I discarded African, Amerindian, and Oceanian populations. Basically my daughter is Eurasian, and I wanted to flesh out Eurasian variation.

Obviously the fact that my daughter is out there “alone” is a function of lack of sampling of much of West Asia. I suspect that her position on the PCA is similar to a Turkic Iranian population; mostly West Asian, but with some East Asian ancestry. Of course, position on the PCA here brings together two very distinct types of individuals. West Asian with an East Asian component, and someone who is a synthesis of South Asian and Northern European, with an East Asian component.
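A small arithmetic sketch shows why two such different ancestries can share one spot on the plot. It assumes that an admixed person's expected position is the weighted average of the source cluster positions. The centroid coordinates and mixture weights below are invented for illustration, not read off the actual plot.

```python
import math

# Hypothetical cluster centroids on two PCs; the numbers are invented for
# illustration and are not taken from any real data set.
CENTROIDS = {
    "Northern European": (0.0, 1.0),
    "South Asian": (0.4, -1.0),
    "West Asian": (0.2, 0.0),
    "East Asian": (2.0, 0.0),
}


def expected_position(weights):
    """Weighted average of centroids; the ancestry weights must sum to 1."""
    if not math.isclose(sum(weights.values()), 1.0):
        raise ValueError("ancestry weights must sum to 1")
    x = sum(w * CENTROIDS[pop][0] for pop, w in weights.items())
    y = sum(w * CENTROIDS[pop][1] for pop, w in weights.items())
    return (x, y)


if __name__ == "__main__":
    # West Asian with a minority East Asian component.
    two_way = expected_position({"West Asian": 0.9, "East Asian": 0.1})
    # South Asian plus Northern European, with the same East Asian share.
    three_way = expected_position(
        {"Northern European": 0.45, "South Asian": 0.45, "East Asian": 0.1}
    )
    print("two-way mix:", two_way)
    print("three-way mix:", three_way)
```

With these made-up centroids the West Asian cluster sits midway between the Northern European and South Asian ones, so both mixtures collapse onto the same point. Two dimensions cannot separate them; additional PCs or a model-based method would be needed.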

When you are young there often comes a day when you ask your parents “Where am I from? Who are my people?” At least in the genetic sense my daughter’s generation will be robbed of such mystery, or delivered from confusion, depending on how you look at it. For the past few decades it has been chic to have Native American ancestry, at least purported. Genotyping can now answer whether this ancestry rises to the level of detectability. By and large I think this is a good thing, but your mileage may vary. Unfortunately my daughter is only one type of “Indian.”

CATEGORIZED UNDER: Anthropology, Personal Genomics
MORE ABOUT: Personal genomics

Comments (24)

  1. Eurologist

    It’s not really that unusual, in the US, to have such a diverse background.

    My son is adopted and is (likely) of German, Scottish, Spanish, Native US, and Native Mexican origin. Had he been our biological child, he would also have been of German, Spanish, and Native American origin – but with substantial Italian and Romanian mixed in.

  2. #1, do you have aspergers?

  3. Sandgroper

    If I had my time over again, I would tell my daughter where she came from before she went to kindergarten – it seemed kind of difficult/irrelevant when she was 3 years old, but that’s what I should have done.

  4. pconroy

    My eldest daughter is 1/2 Irish (me) and the rest is French/Corsican/Breton/Swiss-Italian – her Mother is from Paris, France.

    I talked to her early and often about her background, but today at 8 yo, she identifies as Irish 90% of the time and occasionally as “partly Italian” – due to the distribution of her friends in the greater New York area.

    She spoke and understood French as a child, but today refuses to speak it. Last year I asked her why, and her response was, “French people are weird”…

    So, influence by peers in determining who you identify as is strong, stronger maybe than parental input it seems.

  5. Dm

    Do we have any data on where the European Romani / Gypsies are in terms of Pan-Eurasian PCA? I am being asked to help with identifying a rumored Gypsy admixture, and all I can think of at the moment is a broad S Asian similarity … and I realize that if there a stronger ancestral South Indian component in the Romanis, then it may give us a better sensitivity.

  6. Sandgroper

    @4 – Paul, yes, agreed, but it might have given her some sort of defence against the kindergarten teacher.

    Later at primary school she befriended 3 half Irish/half Chinese sisters, and began identifying strongly as Irish, to the point of becoming very interested and steeped in Irish culture and language. It was quite funny to hear the 4 of them discussing all things Irish in Cantonese. I had to take her aside one day and say “Erm – you do realise we’re not actually Irish, don’t you?” “Don’t care.”

    Of course, I have some Irish ancestry, but I’m a bloody mess, really.

  7. #5, zack ajmal has some roma in harappa. you might think of doing ibd with various indian populations.

  8. Nihaya Khateb

    Very interesting. Then your daughter has a unique chromosomal mixture. She is going to be a very unique person. I always encouraged outbred genomic mixture. I prefer such generations to racial and inbred generations.

  9. #8, there are a fair number of bengali/northern european people. e.g., norah jones or lisa ray.

  10. pconroy


    One of my son’s play dates is 1/2 German from the Mid-West (father, video editor), 1/2 South Asian Muslim from Mumbai (mother, doctor).

    So there are some kinda similar people out there.

  11. Dr. Stephen J.Krune III

    Have you ruled out cuckolding? Serious question.

  12. #11, unless i have an identical twin whose existence i don’t know of, yeah. r = 0.5 are you a retard? serious question.

  13. Dm

    Thanks Razib, checked and the 2 Romani participants in Zack’s analysis come across as substantially more W Asian than S Asian. Implying a relatively modest ancestral South Indian component, not that his standard admixture spreadsheet looked at it specifically. OK gotta wait a few more weeks while this potentially crypto-Romani sample winds its way through 23andme. Then, I guess if it doesn’t land right in the middle of their 2D PCA European cluster, then we’ll see where to go from there…

  14. Dr. Stephen J.Krune III

    Don’t be so sure, Razib! Men are fooled all the time, and women are known to prefer “manly” “alpha” males so I felt it needed to be asked. However if you have proof in the form of blood tests or a signed oath of fidelity that would remove all doubt in my mind. By the way I am not retarded but I do have Aspergers, which I felt was an advantage when doing my doctoral dissertation.

  15. #14, i have her genotype you fucking retard. you think it wouldn’t be obvious?

  16. Sandgroper

    “I am not retarded” – I think you should seriously reconsider that point.

  17. Sandgroper

    At the risk of stating the blindingly obvious:

    1. In the case of a bi-racial child, the possibility of a non-paternal event is lower.

    2. The rate of non-paternity events is a lot lower than a lot of people apparently like to think it is. It is over-estimated by paternity testing, for the simple reason that paternity tests are frequently carried out where there are already grounds for suspicion. The blogger actually posted on the subject of frequency of non-paternal events 3 whole days ago. The data do not support your assertion that “men are fooled all the time” – actually, men are fooled a lot less than you apparently like to think.

    3. The blogger already knows more about his child’s genes than most people know about what they had for dinner, which renders your “don’t be so sure” kind of redundant, don’t you think? Rather more than blood tests or an oath would do.

  18. J Taylor

    Identity and ambiguous origins

    Although it is not a statistically verified scientific observation, a person lacking in a sense of identity can go either of 2 ways ie more binomial than bell curve.
    In Milton’s book White Gold the Arab masters did not allow within-race marriage for the slaves they allowed to marry. They promoted multiracial marriage in order to minimise any sense of identity in the children that could lead to rebellion at a later stage. The products of multiracial marriages were easier to control and manipulate. They could also control the environment so that such people could strongly identify with Islam…which they often became fanatical supporters of, eg the Janissaries. The need for identity is very strong.

    The other side is that often people with conflicted identities become, shall I say, more catholic than the pope…I call it the Verwoerd effect, eg Verwoerd, the ultimate Afrikaner, was Dutch. Winnie Mandela, the ultimate black rebel, was part coloured; Hitler, the ultimate German, was Austrian (and definitely of part middle eastern ancestry eg Jewish?); Churchill, the ultimate Englishman, was part Native American or Jewish depending on different sources etc.

    A lot probably depends on the environmental/ cultural environment on how the need for identity works out but the need “to find yourself” remains strong in most people

    During economic downturns the need for identity waxes, and it wanes during times of economic security

  19. Eurologist


    Oxytocin overdose?

    You said: At least in the genetic sense my daughter’s generation will be robbed of such mystery, or delivered from confusion, depending on how you look at it.

    I essentially said that this is not unusual in the US, and gave a likewise (but much earlier) anecdotal example, where a combination of (partially) known ancestry and genetic testing is providing answers/conundrum.

    You and your offspring are not that special, in this sense – this blessing/conundrum has been happening one way or another in the US for quite a while, now – except you have state-of-the-art tools at your hands. That’s all. No need to involve Asperger’s pot nor kettle.

  20. Ian

    *waves, from my own neighbouring island*

  21. pconroy

    @14, @15, @17,

    I think “Dr Krune” is assuming that the mother of your child is South Asian, hence his confusion?!

  22. No need to involve Asperger’s pot nor kettle.

    your comments regularly exhibit social retardation. i guess if you are a social retard you can’t tell 🙂 not that you don’t say valid things. but you’re a shitty mind reader, so stop it. your interpretation of my comment was not valid or true to my intent. now, it could be that i’m the freak, and you are just a normal person. but it doesn’t matter, because this is the freak’s world you’re speaking in. i am a moderately busy person so i can’t really afford to waste time on self-exegesis.

    no more discussion of this.

  23. “to find yourself” remains strong in most people

    no shit. though please remind yourself many of my readers are unsheep with a bias toward social retardation.

  24. #21, interesting. is it a defensible prior to assume i’m a homogamist taking into account the stuff i’ve said before? i guess if you haven’t read me.



Gene Expression

This blog is about evolution, genetics, genomics and their interstices. Please beware that comments are aggressively moderated. Uncivil or churlish comments will likely get you banned immediately, so make any contribution count!

About Razib Khan

I have degrees in biology and biochemistry, a passion for genetics, history, and philosophy, and shrimp is my favorite food. In relation to nationality I'm an American Northwesterner, in politics I'm a reactionary, and as for religion I have none (I'm an atheist). If you want to know more, see the links at

