85% of genetic variation is within groups…

By Razib Khan | February 23, 2008 2:56 pm

…yes, true, on a typical single locus (though on some loci, such as SLC24A5, most of the variation is between groups). But that doesn’t mean that you can’t use genetics to differentiate population clusters. Here are 938 individuals (the points) from 51 world populations (the color of the points), plotted on the two largest principal components of the variation.
[Figure: race.jpg]
From Worldwide Human Relationships Inferred from Genome-Wide Patterns of Variation. Also see Lewontin’s Fallacy.
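To make the single-locus versus many-loci point concrete, here is a toy simulation. It is only a sketch under loud assumptions, not the method of the paper: two made-up populations, haploid 0/1 markers, every locus with the same frequency gap (`DELTA`, chosen so that about 15% of the variance at each locus is between groups), and a naive "sum the alleles" classifier. All names are hypothetical.

```python
import math
import random

# Assumption: allele frequencies are 0.5 + DELTA in group 0 and 0.5 - DELTA in
# group 1, so the between-group share of variance at one locus is 4 * DELTA**2.
DELTA = math.sqrt(0.15) / 2


def simulate(n_per_group, n_loci, delta, rng):
    """Return (genotypes, labels); each genotype is a list of 0/1 alleles."""
    genotypes, labels = [], []
    for label, p in ((0, 0.5 + delta), (1, 0.5 - delta)):
        for _ in range(n_per_group):
            genotypes.append([int(rng.random() < p) for _ in range(n_loci)])
            labels.append(label)
    return genotypes, labels


def within_group_fraction(genotypes, labels, locus):
    """Share of the variance at one locus that lies within groups."""
    values = [g[locus] for g in genotypes]
    grand_mean = sum(values) / len(values)
    total = sum((v - grand_mean) ** 2 for v in values)
    within = 0.0
    for lab in set(labels):
        group = [v for v, l in zip(values, labels) if l == lab]
        mean = sum(group) / len(group)
        within += sum((v - mean) ** 2 for v in group)
    return within / total


def classification_accuracy(genotypes, labels, n_loci_used):
    """Guess group 0 when the allele count over the first loci exceeds half."""
    correct = 0
    for g, lab in zip(genotypes, labels):
        predicted = 0 if sum(g[:n_loci_used]) > n_loci_used / 2 else 1
        correct += int(predicted == lab)
    return correct / len(labels)


rng = random.Random(0)
genotypes, labels = simulate(200, 200, DELTA, rng)

mean_within = sum(
    within_group_fraction(genotypes, labels, j) for j in range(200)
) / 200
print("mean within-group share per locus:", round(mean_within, 3))
for n in (1, 10, 50, 200):
    print(n, "loci -> accuracy", classification_accuracy(genotypes, labels, n))
```

Each locus on its own is a poor guide to group membership, but the small per-locus differences all point the same way, so they add up across loci while the noise averages out. Real data are messier (correlated loci, more than two groups, unequal frequency gaps), so treat the numbers as illustrative only.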

CATEGORIZED UNDER: Genetics
  • Bob

    I grabbed the “supplemental materials” available online for that paper, but it didn’t have what I’d hoped.
    I’d be really interested to see the components of the PC1 and PC2 vectors, if I’m thinking about this the right way. I think they’re some eigenvectors in SNP space. What do the coefficients look like? For example, could you discard all but, say, the 1000 largest coefficients, and still get a similar plot? 100? 10?
    Actually, I should read the paper (or at least the abstract) to see what the dimension of the space is…
    Cheers,
    –Bob

  • http://www.scienceblogs.com/gnxp razib

    on the order of 100 for continental level differences.
    http://genomebiology.com/2002/3/7/comment/2007
    “we can estimate that about 120 unselected SNPs or 20 highly selected SNPs can distinguish group CA from NA, AA from AS and AA from NA. A few hundred random SNPs are required to separate CA from AA, CA from AS and AS from NA, or about 40 highly selected loci. STRP loci are more powerful and have higher effective δ values because they have multiple alleles. Table 3 reveals that fewer than 100 random STRPs, or about 30 highly selected loci, can distinguish the major racial groups. As expected, differentiating Caucasians and Hispanic Americans, who are admixed but mostly of Caucasian ancestry, is more difficult and requires a few hundred random STRPs or about 50 highly selected loci. These results also indicate that many hundreds of markers or more would be required to accurately differentiate more closely related groups, for example populations within the same racial category.”
    the paper is from 2002. i think we can go lower than 20 since we know some more ancestrally informative loci, such as SLC24A5, that they didn’t then….

  • Bob

    Thanks, Razib.
    Your mention of SLC24A5 is related to another thing I’d imagined myself fooling around with, given the raw data: remove all of the sites known to contribute to skin color, and diagonalize the matrix again. This might teach me more about the extent to which (in my best S. J. Gould impersonation) “The differences among the races are only skin deep.”
    Cheers,
    –Bob

  • http://www.scienceblogs.com/gnxp razib

    This might teach me more about the extent to which (in my best S. J. Gould impersonation) “The differences among the races are only skin deep.”
    http://www.gnxp.com/blog/2007/09/new-races-of-man.php

  • Joe

    I couldn’t read the full article. It appears that North Africans are included with all Africans rather than with Middle Easterners. Is this true? Any more specifics on the dots would be appreciated as well. Like, are the East Asians that are scattered amongst the CS Asians Southeast Asians from close to greater India?

  • http://www.scienceblogs.com/gnxp razib

    no to both questions (mozabites are north african, and mostly middle eastern). look at this figure, there are more populations:
    http://genetics.plosjournals.org/archive/1553-7404/2/12/figure/10.1371_journal.pgen.0020215.g002-L.jpg

  • Joe

    So the mozabites are the brown dot farthest from the cluster? You said, north africans aren’t africans and the link you just gave me agrees, but the wikipedia link at the top says mozabites are with the africans.

  • Bob

    The “supplemental materials” at the original link Razib gave has a spreadsheet showing the 51 groups and how they were categorized. Dunno which dot is “Mozabite,” but it’s categorized as “Middle Eastern,” according to that spreadsheet.
    Cheers,
    –Bob

  • http://rhinocrisy.org/ saurabh

    So what? This is just collapsing variation due to successive bottlenecks. That’s hardly interesting in terms of the actual “race” part of it – it just means genetics has a historical trajectory. This is true of boring, neutral variation by itself.

  • http://www.scienceblogs.com/gnxp razib

    saurabh, right, watch this space. another, more interesting, paper coming out soon….
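Bob's question in the thread above (can you throw away all but the largest PC1 coefficients and still get a similar plot?) can be tried on toy data. The sketch below is not the paper's analysis: it assumes simulated haploid 0/1 markers, two made-up groups, 40 "informative" loci out of 200, and a plain power iteration in place of a full eigendecomposition. Every name and parameter is hypothetical.

```python
import math
import random


def simulate(n_per_group, n_loci, n_informative, delta, rng):
    """0/1 alleles; only the first n_informative loci differ between groups."""
    rows = []
    for sign in (1, -1):
        for _ in range(n_per_group):
            row = []
            for j in range(n_loci):
                p = 0.5 + sign * delta if j < n_informative else 0.5
                row.append(1.0 if rng.random() < p else 0.0)
            rows.append(row)
    return rows


def center_columns(rows):
    """Subtract each locus's mean so that PCA sees deviations only."""
    means = [sum(col) / len(rows) for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def project(rows, v):
    """Score of each individual along direction v."""
    return [sum(x * w for x, w in zip(row, v)) for row in rows]


def first_pc(rows, n_iter, rng):
    """Leading eigenvector of X^T X (the PC1 loadings) by power iteration."""
    columns = list(zip(*rows))
    v = [rng.gauss(0, 1) for _ in columns]
    for _ in range(n_iter):
        scores = project(rows, v)  # X v
        v = [sum(s * x for s, x in zip(scores, col)) for col in columns]  # X^T (X v)
        norm = math.sqrt(sum(w * w for w in v))
        v = [w / norm for w in v]
    return v


def truncate(v, k):
    """Keep the k loadings with the largest absolute value, zero the rest."""
    order = sorted(range(len(v)), key=lambda j: abs(v[j]), reverse=True)
    keep = set(order[:k])
    return [w if j in keep else 0.0 for j, w in enumerate(v)]


def correlation(a, b):
    """Pearson correlation between two equal-length score lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


rng = random.Random(1)
X = center_columns(simulate(50, 200, 40, 0.3, rng))
pc1 = first_pc(X, 30, rng)
full_scores = project(X, pc1)
corr_by_k = {
    k: correlation(full_scores, project(X, truncate(pc1, k)))
    for k in (200, 40, 10)
}
for k, c in corr_by_k.items():
    print(k, "largest loadings kept -> correlation with full PC1:", round(c, 3))
```

In this toy setup the loadings pile up on the informative loci, so the truncated scores track the full ones closely. How far that carries over to real SNP data depends on how the loadings are actually distributed. The same machinery could be pointed at Bob's other idea: drop a chosen set of loci (columns) before calling `first_pc` and compare the resulting scores.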

