By Razib Khan | January 26, 2011 2:44 am

Yesterday the first batch of results from 23andMe’s v3 chip came online. Instead of 550,000 SNPs you get ~1 million. The difference is pretty clear when you look at the raw SNPs. Under Account → Browse Raw Data, I can enter LCT, and this is what I see:

I’m line #2. A sibling is line #1. Looking at this sort of stuff makes it really likely I’ll upgrade. My main rationale for not upgrading is that there are diminishing marginal returns for ancestry-related stuff. Speaking of ancestry, let’s compare my sibling’s ancestry painting to my own.

In a bit of a surprise, while I’m 43% “Asian,” my sibling is 40% “Asian.” This is somewhat reflected in global similarity.

Northern Europe      66.56   66.53   0.03
Southern Europe      66.47   66.43   0.04
Middle Eastern       66.38   66.35   0.03
Northern Africa      65.78   65.68   0.1
Central/South Asia   67.18   67.14   0.04
East Asia            67.66   67.56   0.1
North America        67.2    67.15   0.05
South America        67.33   67.18   0.15
West African         63.99   63.87   0.12
Central African      64      63.89   0.11
Eastern African      64.15   64.06   0.09
South African        64.04   63.93   0.11

I was curious if the differences were v3 vs. v2 chip, so I compared two individuals of Northern European ancestry:

Population           Northern Europe v2   Northern Europe v3   Difference
Northern Europe      67.82                67.88                -0.06
Southern Europe      67.74                67.73                0.01
Middle Eastern       67.14                67.12                0.02
Northern Africa      66.42                66.37                0.05
Central/South Asia   66.88                66.94                -0.06
East Asia            65.73                65.77                -0.04
North America        66.11                66.15                -0.04
South America        66.01                66.02                -0.01
West African         63.29                63.21                0.08
Central African      63.35                64.24                -0.89
Eastern African      63.48                63.43                0.05
South African        63.37                63.27                0.1

There’s less of a consistent difference here. So I really don’t know what to think.
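As a rough check on the two tables above, here is a short Python sketch that counts how often the first column exceeds the second. The numbers are copied from the tables. The variable and function names are illustrative, and since the first table's columns are unlabeled in the post they are simply called "first" and "second" here.

```python
# (population, first column, second column), copied from the two tables above.
# Names like SIBLING_TABLE are illustrative labels, not anything from 23andMe.
SIBLING_TABLE = [
    ("Northern Europe", 66.56, 66.53),
    ("Southern Europe", 66.47, 66.43),
    ("Middle Eastern", 66.38, 66.35),
    ("Northern Africa", 65.78, 65.68),
    ("Central/South Asia", 67.18, 67.14),
    ("East Asia", 67.66, 67.56),
    ("North America", 67.2, 67.15),
    ("South America", 67.33, 67.18),
    ("West African", 63.99, 63.87),
    ("Central African", 64.0, 63.89),
    ("Eastern African", 64.15, 64.06),
    ("South African", 64.04, 63.93),
]

V2_V3_TABLE = [
    ("Northern Europe", 67.82, 67.88),
    ("Southern Europe", 67.74, 67.73),
    ("Middle Eastern", 67.14, 67.12),
    ("Northern Africa", 66.42, 66.37),
    ("Central/South Asia", 66.88, 66.94),
    ("East Asia", 65.73, 65.77),
    ("North America", 66.11, 66.15),
    ("South America", 66.01, 66.02),
    ("West African", 63.29, 63.21),
    ("Central African", 63.35, 64.24),
    ("Eastern African", 63.48, 63.43),
    ("South African", 63.37, 63.27),
]


def summarize(rows):
    """Return (number of positive diffs, number of negative diffs, diffs)."""
    diffs = [round(first - second, 2) for _, first, second in rows]
    positive = sum(d > 0 for d in diffs)
    negative = sum(d < 0 for d in diffs)
    return positive, negative, diffs


for name, rows in (("sibling table", SIBLING_TABLE), ("v2 vs. v3 table", V2_V3_TABLE)):
    positive, negative, _ = summarize(rows)
    print(f"{name}: {positive} positive, {negative} negative differences")
```

In the first table all twelve differences have the same sign, while in the second the signs split six and six, which is one way of putting a number on "less of a consistent difference."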

On the other hand, there’s no difference when it comes to the two dimensional scatter plot which maps your position on the HGDP sample populations, global similarity advanced. Instead of looking at the total genome, remember that this is taking your genetic variation, and placing it within a position along a set of independent dimensions which emerge from the HGDP population variance. Here you’re looking at the two largest components of variance, which easily shake out into discernible clusters. The scale here is not fine enough to distinguish myself from my sibling; we’re basically on the same spot (the black positions are a bunch of South Asians I’ve shared genes with in the attempt to elucidate whether I am a genuine genetic outlier in South Asia).

Overall I’m obviously of two minds here: the differences between myself & my sibling are genuine, or, they’re artifacts of the fact that I’m on v2 and they’re on v3. A major reason I have to suspect that the difference of chip is important is that my sibling is genetically very close to two individuals of Northern European ancestry…who happen to be two other people who I’m sharing with who are v3. Additionally, these two people seem to be suspiciously genetically close as well. v3’s cluster with v3’s far more than random expectation.

But then I went to family inheritance. To the left you can see the regions where my sibling & I exhibit identity by descent. Since I do not seem to be inbred (confirmed by checking for runs-of-homozygosity in my raw genotype), my parents contribute different homologs, the distinct genes inherited from their own parents, to their offspring. Imagine that the paternal genotype is Pp and the maternal is Mm, where upper case is the copy inherited from the mother, and the lower case the copy inherited from the father (our common grandparents). The offspring could be: PM, Pm, pM, pm. Siblings have an expectation of 0.50 relatedness, but because of the variance there’s some wiggle room. If the Southeast Asian ancestry which we seem to have is recent enough, then recombination may not have broken up the ancestrally informative regions of the genome which are concentrated on particular homologs inherited from each parent (in particular, I have a suspicion that my father’s genome is a mosaic of conventional Bengali along with recent Southeast Asian admixture; we’ll know in a few weeks). Variance of inheritance of these regions of the genome may then explain the fact that I am “more Asian” than my sibling. All that being said, I do find it of interest that while I have “wet earwax” and the associated genotype (in heterozygote form), my sibling has the genotype for dry earwax, which is typically found in East Asia (though the minor allele with recessive expression is found at non-trivial proportions across South Asia). Such issues of personal hygiene are not ones which most have inquired of, so this was news to me, though not surprising given my heterozygosity on this locus.
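The PM, Pm, pM, pm example can be enumerated directly. The sketch below is a toy model, not anything 23andMe computes: it assumes a single locus, each parental homolog transmitted with probability 1/2, and independent draws for the two siblings. The function and variable names are made up for illustration.

```python
from fractions import Fraction
from itertools import product

# Toy single-locus model: father carries homologs P and p, mother carries M and m.
paternal = ["P", "p"]
maternal = ["M", "m"]

# Each child receives one paternal and one maternal homolog: PM, Pm, pM, pm.
offspring = [pat + mat for pat, mat in product(paternal, maternal)]


def shared_homologs(a, b):
    """Number of homologs (0, 1 or 2) two siblings share identical by descent."""
    return (a[0] == b[0]) + (a[1] == b[1])


# All 16 equally likely pairs of sibling genotypes at this locus.
pairs = list(product(offspring, repeat=2))

ibd_counts = {0: 0, 1: 0, 2: 0}
for a, b in pairs:
    ibd_counts[shared_homologs(a, b)] += 1

ibd_probs = {k: Fraction(v, len(pairs)) for k, v in ibd_counts.items()}

# Relatedness at the locus is the fraction of the two homologs that are shared.
expected_relatedness = sum(Fraction(k, 2) * p for k, p in ibd_probs.items())

print(offspring)             # ['PM', 'Pm', 'pM', 'pm']
print(ibd_probs)             # sharing 0, 1, 2 homologs: 1/4, 1/2, 1/4
print(expected_relatedness)  # 1/2
```

The expectation comes out at 1/2, but at any one locus a pair of siblings shares nothing, half, or everything. Because long stretches of each chromosome are inherited together, that locus-level spread does not fully average out across the genome, which is the wiggle room.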

Addendum: For Western Europeans who are on v3, are you finding trace amounts of Asian? Previous gene sharing suggests that ~1% Asian is not uncommon among Finns (and also Russians, but this can be attributable to recent Tatar admixture), but I haven’t seen this among other Europeans without recent non-European admixture. I ask because a friend on v3 who is adopted, but presumably of Western European ancestry (he was told the putative ethnicities of his biological parents), has 1% Asian.

Also, HAP is accepting v3 for South Asians, Iranians, Burmese, and Tibetans!

Comments (16)

  1. There does seem to be some v3 clustering going on. My wife’s best match is a European, the only other person in my sharing list who is on v3.

  2. Pohranicni Straze

    My wife and I just got our results in. I show up as 100% European, my wife as >99% Asian with a little dash of European. On the plot, I show up in the overlap zone between German, French, English, and Norwegian; my wife shows up squarely in the middle of the Cambodian data (probably due to being intermediate between the Dai and Chinese reference populations). The difference in relative finder is really stark: she has only 14 potential cousins showing up, with the closest a 0.19% shared DNA match, while I have 462 at last count, up to a 0.49% match. We need more Southeast Asians in the database!

    Are there any good resources on Asian mtDNA groups out there? Most of the basic references I’ve found are full of highly detailed information on the mainly European haplogroups, with very scant information on anything else. My wife’s mtDNA shows up as M12, about which I’ve found very little other than that it has been found at very low frequencies in Japan, Korea, and Tibet.

  3. Mary

    My results just came in. It shows >99% European and <1% African. At first I got excited, but soon realized that the African bit was "noise". My paper genealogy is pretty solidly northern European. My mtDNA is X2b.

  4. sv

    “A major reason I have to suspect that the difference of chip is important is that my sibling is genetically very close to two individuals of Northern European ancestry…who happen to be two other people who I’m sharing with who are v3. Additionally, these two people seem to be suspiciously genetically close as well. v3’s cluster with v3’s far more than random expectation.”

    The baseline for similarity seems to be a couple percentage points higher on v3. I only have a few V3 people that I am sharing with, but in the case of South Asians and a Middle Easterner, they are higher than any of my other matches by quite a bit – much higher than they would be on v2. I also have an East African sharing on the v3 chip, and he is higher than the v2 West Eurasians. The “Compare Genes” feature really should be divided into two parts: one for comparisons between v3 people, and one for comparisons with v2 people.

  5. Pohranicni Straz,

    at least your wife has relatives. my sibling is my first “relative” to show up 🙂 no brownz in the db.

  6. RK

    Sorry, my previous question was stupid. Let me rephrase: is there any difference when you switch to the world-wide plot? I imagine not, or you would’ve mentioned it.

  7. Razib,

    Fascinating, I’m really curious how that shakes out. I have yet to have anyone close in relationship to me to do this, though I’m trying to convince my father and a sibling or two.

    A related question though, how would one go about taking the raw data and analyzing it further for ancestry? I’m a biologist, well versed in evolutionary science (my Ph.D), but the tools of genetic ancestry are new to me. Specifically the scatter plot using HGDP data. I’d like to get a better picture of my adopted daughter’s placement. My own is 99% European and 1% African, but maps pretty squarely in the European cluster, but I’d still like to look at that a bit deeper.

  8. “Sorry, my previous question was stupid. Let me rephrase: is there any difference when you switch to the world-wide plot? I imagine not, or you would’ve mentioned it.”

    didn’t see your previous comment. stuck in spam? anyway, there’s a slight difference. it’s more notable with other people. in the world plot i’m closer to the uyghur/hazara cluster than in the south asian zoom one.

  9. Jason Malloy

    Identity-by-descent sharing is also another ingenious way for demonstrating the heritability of traits without the “equal environment assumption,” and (presumably) could be used to test for racial differences in behavioral traits using mixed race siblings.

  10. Paul Ó Duḃṫaiġ

    Myself and the wife got our results the other day. Unsurprisingly, I’m 100% European and cluster in the middle of the Irish on “Global similarity”.
    The wife came back as 87% Asian, 12% European and less than 1% African. She’s a Filipina and shows up in between Cambodians and Chinese. One of her great-grandfathers was Spanish so she’s quite happy to come up as 1/8th European. Going on Ancestry Finder I see she shares some very small segments with people in Taiwan (birthplace of Austronesian languages) and Mexico (The Philippines was ruled from Mexico City by the Spanish).

  11. Pohranicni Straze

    I have found an oddity in the genome sharing section of 23andme. I am sharing genomes with some “potential 4th cousins”. I match all of them in the range of 74.37 – 74.48%. My wife matches them all in the range of 71.23 – 71.32%. But my wife and I match at 73.13%, despite our ancestry painting showing us to both be practically 100% of our respective groups (Asian and European). Has anyone else seen this sort of unexpectedly close match between supposedly “pure” racial types?

  12. RK

    sv is right. My homozygosity for v2 data was 68.417% (including no-calls as homozygous), but for v3 it’s 70.598%. It seems likely that the minor allele is rarer for the new v3 stuff, so v3 people will cluster. They should normalize it.
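For anyone who wants to repeat the homozygosity check from the last comment on their own raw data, here is a minimal Python sketch. It assumes the usual 23andMe raw export layout (tab-separated rsid, chromosome, position, genotype, with `#` comment lines and `--` for no-calls). The sample rows and rsids are made up, and the function name and the choice to skip single-letter haploid calls are assumptions for illustration.

```python
# Made-up sample in the assumed raw-data layout; these rsids are hypothetical.
SAMPLE_RAW = """\
# rsid\tchromosome\tposition\tgenotype
rs0000001\t1\t1000\tAA
rs0000002\t1\t2000\tAG
rs0000003\t2\t3000\t--
rs0000004\t2\t4000\tCC
rs0000005\tMT\t5000\tT
"""


def homozygosity(raw_text, nocalls_as_homozygous=True):
    """Fraction of two-letter genotypes whose two alleles are identical.

    With nocalls_as_homozygous=True, '--' counts as homozygous, matching
    the convention described in the comment above. Single-letter (haploid)
    calls are skipped, which is an assumption of this sketch.
    """
    homozygous = 0
    total = 0
    for line in raw_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and header comments
        genotype = line.split("\t")[3]
        if len(genotype) != 2:
            continue  # haploid call, not counted
        if genotype == "--" and not nocalls_as_homozygous:
            continue  # drop no-calls entirely
        total += 1
        if genotype[0] == genotype[1]:
            homozygous += 1
    return homozygous / total if total else 0.0


print(homozygosity(SAMPLE_RAW))         # no-calls counted as homozygous
print(homozygosity(SAMPLE_RAW, False))  # no-calls excluded
```

To run it on a real export, read the file into a string and pass it in place of `SAMPLE_RAW`. Comparing the two settings shows how much of a v2 vs. v3 difference could come from no-calls alone.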


