When sociology meets statistical genetics

By Razib Khan | July 26, 2011 7:40 pm

In Dr. Daniel MacArthur’s post on Roots into the Future Blaine Bettinger left an interesting comment:

It will be interesting to see how 23andMe deals with the pool of people that respond to the 10,000 free kits. Doesn’t seem like they can pre-screen applicants, since African American heritage is sometimes more sociological than genetic (based on previous genetic studies, anyway). In other words, who’s to say who is an African American and who isn’t?

And how will they deal with the unscrupulous people who apply with the full knowledge that they have no recent African ancestry? Certainly they won’t be able to screen those people out, even with surveys or other methods.

My concerns probably won’t apply to the genetic association studies, since they can look for test-takers that have, for example, a certain % of African American ancestry, or can look for African American ancestry in the region of the genome where the association is believed to reside (after it’s predicted to exist).

However, my concerns will certainly apply to any conclusions they might make about African American genetic ancestry. For example, a conclusion such as “XX% of African Americans have less than XX% of African American DNA,” or “XX% of African Americans have European Y-DNA signatures.” These calculations will unfortunately be biased by the “unscrupulous”, even if they ask for surveys or other methods to deter bias. The best they might be able to do is “XX% of African Americans with 5% or more of African American DNA have European Y-DNA,” and conclusions that take the “unscrupulous” bias into account.

Naive nerd that I am, I hadn’t even considered the possibility of fraud! In any case, after running the African Ancestry Project I have to admit that, weird as it is, I have started to “profile” genotypes automatically. I guess that’s a fraught term to use with black Americans, but the honest truth is that I don’t pay much attention to the ancestry that people report to me in their emails. I just assign them IDs, do the format conversions, and run the algorithms. I then push the results online and let people interpret them however they want. With all that said, the genetic profile of African Americans is pretty straightforward. My sample of ~130 individuals includes around 100 African Americans, and they’re distinctive in having a mix of European and African ancestry. The individuals who are from Africa stick out like sore thumbs, and I immediately know that IDX has to be African. After 200+ years African Americans almost always have some European ancestry, even if it’s at a low fractional quantum.

One aspect of Blaine’s comment is the idea that those who attempt to sneak into this project might distort the distribution of ancestral components reported within the African American population. I don’t think this is an issue. This is one group which has been studied a fair amount, and the consensus is rather clear that its ancestry is about 20% European. Let’s look at some of the papers which report results that give us a sense of what’s going on.

First, Admixture Mapping of 15,280 African Americans Identifies Obesity Susceptibility Loci on Chromosomes 5 and X. The title says it all. ~15,000 African Americans in their total pool. Here’s the table with the statistics by population set (the ranges are standard deviations):

Let’s go graphical. Effects of cis and trans Genetic Ancestry on Gene Expression in African Americans has a PCA which shows the two largest dimensions of variation in their combined data set, which includes East Asians, Europeans, and Yoruba from Nigeria, in addition to African Americans. After removing 11 individuals (outliers and related individuals), they found that of the remaining 89 the ancestral percentages were 21 percent European and 79 percent African. The range in European ancestry in these individuals was 1-62 percent with a standard deviation of 14 percent.

Finally, let’s look at a bar plot from Genome-wide patterns of population structure and admixture in West Africans and African Americans. Their sample included 365 African Americans. The median proportion of European ancestry was 18.5%, with the 25th–75th percentiles being 11.6–27.7%. To the left you can observe the range in African Americans in terms of admixture. A very few people are overwhelmingly European, but most individuals are much closer to 20% European.

My point in reviewing all this is straightforward: even without screening, Roots into the Future will be able to ascertain the likelihood of fraud and deception. The distribution of ancestry among African Americans as a whole is pretty well characterized. There’s some inter-regional variation, but if the project observes a secondary mode with ~100% non-African ancestry, I think they can assume that those individuals should be discarded from the project.
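To make the idea concrete, here is a minimal Python sketch of that kind of after-the-fact filter. Everything in it is an assumption for illustration: the ancestry fractions are made up (not taken from any of the papers above), the function name `split_by_african_ancestry` is hypothetical, ancestry is simplified to two components (European = 1 − African), and the 5% floor simply echoes the example figure in Blaine’s comment. It is not a description of how 23andMe actually handles its data.

```python
from statistics import mean

def split_by_african_ancestry(african_fractions, min_african=0.05):
    """Split samples into (kept, flagged) using a floor on African ancestry.

    Samples below the floor are the candidates for the "secondary mode"
    of ~100% non-African ancestry. The 0.05 default is an assumed cutoff.
    """
    kept = [f for f in african_fractions if f >= min_african]
    flagged = [f for f in african_fractions if f < min_african]
    return kept, flagged

# Made-up African-ancestry fractions for ten hypothetical participants
# whose average is 80% African (i.e. 20% European).
typical = [0.95, 0.90, 0.88, 0.85, 0.82, 0.80, 0.78, 0.75, 0.70, 0.57]

# Two hypothetical applicants with essentially no African ancestry.
non_african_applicants = [0.00, 0.01]

cohort = typical + non_african_applicants
kept, flagged = split_by_african_ancestry(cohort)

# Mean European ancestry before and after dropping the flagged samples.
european_all = mean(1 - f for f in cohort)
european_kept = mean(1 - f for f in kept)

print(f"flagged samples: {len(flagged)}")
print(f"mean European ancestry, unfiltered: {european_all:.4f}")
print(f"mean European ancestry, filtered:   {european_kept:.4f}")
```

In this toy cohort the two near-zero samples pull the unfiltered European average well above 20%, and dropping them brings it back. One caveat, going by the bar plot discussed above: a very few self-identified African Americans really are overwhelmingly European, so a hard floor like this may also discard some genuine participants.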

  • Darkseid

    so are we using “African-American” again or did we never stop? i thought i remembered a thread on here saying how it didn’t make sense. maybe it was only for “Caucasian.” what is the most precise now? white, black, amerindian and asian?

  • Miley Cyrax

    There appear to be a few self-identified African Americans in the last plot who are indistinguishable from Europeans. Are they like, inverse Clayton Bigsbys?

  • Darkseid

    lol:) actually, that’s just Paul Wall

  • dufu

    Do you think there might be people who try to use genetic testing to prove Native American ancestry? I believe there’s a good chance this might be so as doing so can have significant financial consequences.

    It’s no secret that many of the members of northeastern U.S. tribes have a lot of non-Native ancestors. Take a look at the tribal council of Mashantucket Pequots: http://www.mashantucket.com/tribalcouncil.aspx. Notice anything odd?

    Here’s the tribal council of the Mohegans: http://www.mohegan.nsn.us/Government/tribalCouncil.aspx. It may just be me, but a few of those dudes look severely white.

    It’s no secret that being declared a member of those tribes is massively profitable, since they each own one of the big Connecticut tribal casinos. Given their prime location I would imagine Foxwoods and Mohegan Sun are the two most profitable Indian casinos in the U.S. And each member of those tribes receives payments on the order of thousands of dollars per month.

  • Jumblepudding

    “African American heritage is sometimes more sociological than genetic” Please don’t tell me the red-headed freckled guy I see in my neighborhood who talks in an african american accent seriously identifies himself as black. This man is sighted so I am ruling out the Clayton Bigsby scenario.

  • pconroy


    Come now, haven’t you ever heard of the rap group, “Redhead Kingpin and the FBI”?

    Here is David ‘Redhead’ Guppy

  • Mary

    @dufu: It’s already been done, and people are being booted out–see Trey’s article here:

    http://www.personalgenomics.us/994/what-makes-you-native-american-or-african-or-hawaiian-or-european-or-dna-or-culture/ And read the link to Tribal Wars.

  • http://www.thegeneticgenealogist.com Blaine

    Razib – thank you for addressing this, you make some great points. I wasn’t aware of several of these studies. Based on these previous findings (and their decent databases), it appears that the real value of the Roots into the Future project will be the medical/trait aspect rather than the ancestral aspect.

  • dave chamberlin

    The obvious straight line connecting Europeans and Africans because of recent admixture is to be expected. What it makes me wonder is how many other ancient admixtures can be teased from the data at some point as this new and quite amazing branch of science continues to develop. I guess I’ll just have to keep reading this blog to find out.


