Genetic variation among African Americans

By Razib Khan | May 5, 2010 10:43 am

There’s a new paper in Genome Biology (tip: Dienekes), Characterizing the admixed African ancestry of African Americans, which doesn’t present much in the way of new results but has really, really good visualization of the data:

From cluster analysis, we found that all the African Americans are admixed in their African components of ancestry, with the majority contributions being from West and West-Central Africa, and only modest variation in these African-ancestry proportions among individuals. Furthermore, by principal components analysis, we found little evidence of genetic structure within the African component of ancestry in African Americans.

These results are consistent with historic mating patterns among African Americans that are largely uncorrelated to African ancestral origins, and they cast doubt on the general utility of mtDNA or Y-chromosome markers alone to delineate the full African ancestry of African Americans. Our results also indicate that the genetic architecture of African Americans is distinct from that of Africans, and that the greatest source of potential genetic stratification bias in case-control studies of African Americans derives from the proportion of European ancestry.

I want to emphasize the part about lack of utility of uniparental markers. These were the first markers which became widely used in scientific genealogy, and African Americans made a great deal of recourse to these so as to identify the tribe from which their ancestors came. There are obvious historical reasons why this would have more valence for this group than for others, as their ancestral identity was consciously erased during the period of slavery.

But even though generating trees of mtDNA or Y markers is more tractable using a coalescent model, and it gives you a clean answer, it’s only a tiny slice of your ancestry. And not necessarily a representative one. Perhaps better than nothing 10 years ago, but in the days of 450K SNP chips it is probably outdated. As I said above, I think the paper is interesting mostly because the graphical representation is really good. Most of the time I add labels, but this figure needs no additional explanatory editing!
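To put a rough number on the "tiny slice" point, here is a minimal sketch. It counts genealogical ancestors only and assumes no pedigree collapse (no ancestor appearing twice in the tree), so a person has 2^g ancestors g generations back, of whom the mtDNA and Y lineages trace exactly one each. The function name and the `lineages` parameter are illustrative, not from the paper.

```python
def uniparental_share(generations, lineages=2):
    """Share of the 2**g genealogical ancestors g generations back that lie on
    the direct maternal (mtDNA) or direct paternal (Y) line.

    Assumes no pedigree collapse. lineages=2 counts both mtDNA and Y (a man);
    lineages=1 counts mtDNA only (a woman carries no Y chromosome).
    """
    if generations < 1:
        raise ValueError("generations must be at least 1")
    return lineages / 2 ** generations


for g in (1, 5, 10, 15):
    ancestors = 2 ** g
    print(f"{g:>2} generations back: {ancestors:>6} ancestor slots, "
          f"uniparental share = {uniparental_share(g):.4%}")
```

Ten generations back the two uniparental lines cover 2 of 1,024 ancestor slots, or about 0.2%. Real pedigrees do collapse, and genetic ancestry is lumpier than genealogical ancestry, so treat this as an order-of-magnitude illustration only.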


The blue represents European ancestry in individual African Americans, and in the text they note that the frappe bar plot aligns nearly perfectly with the distribution on the PCA plot. Remember that the two axes on the PCA plot represent the two largest axes of variation, with the first (largest) component on the x-axis and the second (second largest) on the y-axis. The largest component naturally separates Europeans from the African groups, while the second largest separates the various African groups from one another. The difference between the two Pygmy groups is not surprising: the Biaka have been found to be much more admixed with their Bantu neighbors than the Mbuti. I wouldn’t put too much weight on the closeness of the San and Mbuti on the plot, because you’re seeing only a two-dimensional view of the total genomic variation: the two largest dimensions, as evaluated across the full set of individuals (European, African American and African) in the data set. The relationships may differ if you constrain the sample space of genetic variation to African genotypes only, and other dimensions may also indicate different relationships.
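The way the first component lines up with admixture proportion can be reproduced on toy data. The sketch below is not the paper's pipeline. It simulates two hypothetical source populations with independently drawn (and therefore unrealistically divergent) allele frequencies, plus admixed individuals whose genotypes are drawn from a weighted mix of those frequencies, and then extracts the first principal component by power iteration. All names, sample sizes and frequencies are assumptions made for illustration.

```python
import random


def simulate_genotypes(freq_rows, rng):
    """One diploid genotype row (0, 1 or 2 allele copies per SNP) per frequency row."""
    return [[(rng.random() < p) + (rng.random() < p) for p in freqs]
            for freqs in freq_rows]


def first_pc(rows, rng, iterations=100):
    """Unit vector proportional to each individual's PC1 score.

    Power iteration on the individual-by-individual Gram matrix of the
    column-centered genotypes, so no numerical library is needed.
    """
    n, m = len(rows), len(rows[0])
    col_means = [sum(r[j] for r in rows) / n for j in range(m)]
    centered = [[r[j] - col_means[j] for j in range(m)] for r in rows]
    gram = [[sum(a * b for a, b in zip(x, y)) for y in centered] for x in centered]
    # Random start: an all-ones vector would be mapped to (near) zero,
    # because every centered column sums to zero.
    v = [rng.gauss(0, 1) for _ in range(n)]
    for _ in range(iterations):
        w = [sum(g * x for g, x in zip(row, v)) for row in gram]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v


rng = random.Random(0)
n_snps, n_each = 200, 20
freq_a = [rng.uniform(0.05, 0.95) for _ in range(n_snps)]  # hypothetical source A
freq_b = [rng.uniform(0.05, 0.95) for _ in range(n_snps)]  # hypothetical source B
# Admixed individuals with source-B fractions spread from 0.10 to 0.48.
b_fractions = [0.10 + 0.02 * i for i in range(n_each)]
mixed_freqs = [[(1 - f) * a + f * b for a, b in zip(freq_a, freq_b)]
               for f in b_fractions]
freq_rows = [freq_a] * n_each + [freq_b] * n_each + mixed_freqs

genotypes = simulate_genotypes(freq_rows, rng)
scores = first_pc(genotypes, rng)

for label, start in (("source A", 0), ("source B", n_each), ("admixed", 2 * n_each)):
    group = scores[start:start + n_each]
    print(f"{label:>8}: mean PC1 = {sum(group) / n_each:+.3f}")
```

In this setup the admixed group's mean PC1 score falls between the two source groups. The sign of a principal component is arbitrary, so only the ordering along the axis is meaningful.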

Here are the estimates of ancestral quanta for African Americans by region against different potential ancestral groups. They had 136 African Americans, so I wouldn’t put too much weight on the interregional differences.


22% of the ancestry of African Americans in the sample is European, with a standard deviation of 12%. It seems that around 10% of the African American population is more than half European in ancestry. Interestingly, in Henry Louis Gates Jr.’s Faces of America, all three of the people with black ancestry, two of whom clearly identified as African American, were more than 50% European in ancestry.* When it comes to African ancestry, the affinity with the region west of the Bight of Benin seems clear if you view the data through a more granular lens.


The Mandenka are from the western fringe of West Africa, while the Bantu are a linguistic group which seems to have emerged just to the east of Nigeria, and swept east and south with the spread of a particular agricultural lifestyle until pushed up against the Nilotic and Khoisan groups of East and South Africa respectively. But this is on the population level. Could it be that individuals exhibit variance by African region, as they do on European ancestry? Not too much (at least beyond a level of noise, and perhaps a few outliers).

The two figures below are based on African genotypes within the African American population.

Note the contrast with the linear topology evident when European ancestry is added into the mix. Put verbally, what is clear is that while some African Americans have more European ancestry than others, on an individual level very few can reasonably be identified as Yoruba people, or Mandenka people. Rather, individual African Americans exhibit a mix of African lineages in proportion to the weights of the various sources in the slave trade.

Why might this be? I have observed before that the vast majority of the ancestry of African Americans is likely colonial. Though a few African American communities, such as the Gullah of coastal South Carolina, preserve distinctive regional African folkways, by and large black Americans as a culture are American, and derive many of their distinctive aspects from elaborations on Anglo norms or a novel synthesis of African ones (in particular, it seems clear that black Americans have been strongly influenced by the two Southern British settler folkways in their speech and religion). The deep history of African Americans within this country means that a great deal of time has elapsed whereby people of Yoruba, Mandenka or Kongo ancestry could have intermarried. Without positive assortative mating by tribe the various ancestral quanta would have become intermixed in subsequent generations. The Gullah exception supports this model, because they lived in relative isolation from whites. The rice agriculture which they practiced required less direct supervision than cotton or tobacco to extract economic productivity, and the South Carolina coastal country was notoriously unhealthful for whites. The relatively humane nature of rice agriculture as opposed to cotton (and especially sugar) also manifested in the more stable family life of the ancestors of the Gullah. So the relationship between white planters and Africans in this region was closer to that between lord and serf than owner and property, and the ancestors of the Gullah could develop their culture in America more organically than African Americans elsewhere.

In contrast, white ancestry does exhibit a great deal of individual variation. Why? Two obvious reasons jump out. First, much of the ancestry may be much more recent. Recent ancestry has less time to be “dispersed” across the population through intermarriage. Though certainly whites and blacks mixed genetically in the colonial era, the process continued uninterrupted down to emancipation, while the addition of new African ancestry ceased in near totality by 1810 (some slave trade reached the United States of America after this period, but not much), and had greatly diminished in the decades before 1810. The endogenous population growth of the black American community was sufficient to provide slaves for the new cotton lands of the early 19th century. After 1865 white-black relations were more surreptitious but continued nonetheless (e.g., Malcolm X’s mother’s father was white). Second, there is naturally the reality that there was assortative mating for European features (e.g., “good hair”, skin lighter than “a brown paper bag”) among the African American elite. Though ancestry and phenotype can become decoupled, this takes time, and as I suggest above much of the European ancestry is recent. The image above is of a black American Congressman, Adam Clayton Powell Jr. I assume most readers are aware that W. E. B. Du Bois’ “Talented Tenth” were disproportionately what in other societies would be recognized as people of mixed race, but who in the United States were classed within the general black population because of the white Southern paradigm of hypodescent.
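The "time to disperse" argument can be made concrete with a toy simulation. This is a minimal sketch under strong assumptions: one founding pulse of admixture, random mating, discrete generations, constant population size, and a child's ancestry fraction taken as the plain average of its parents' (ignoring the extra variance from recombination and segregation). The 22% starting share echoes the sample mean quoted earlier; everything else is hypothetical.

```python
import random
from statistics import mean, pvariance


def next_generation(ancestry, rng):
    """Random mating: each child averages the ancestry fractions of two parents
    drawn at random (with replacement) from the current generation."""
    return [(rng.choice(ancestry) + rng.choice(ancestry)) / 2
            for _ in range(len(ancestry))]


rng = random.Random(1)
# Founders: 22% entirely from source X (1.0), 78% entirely from other sources (0.0).
population = [1.0] * 220 + [0.0] * 780
variances = [pvariance(population)]

for _ in range(8):
    population = next_generation(population, rng)
    variances.append(pvariance(population))

for generation, variance in enumerate(variances):
    print(f"generation {generation}: sd of individual ancestry = {variance ** 0.5:.3f}")
print(f"mean ancestry after 8 generations = {mean(population):.3f}")
```

Under these assumptions the variance in individual ancestry roughly halves every generation, so a source that entered the population many generations ago leaves little individual-level variation, while recent or ongoing admixture, or assortative mating, keeps the spread wide.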

Overall, nothing too new in the paper, but really great charts!

Citation: Zakharia F, Basu A, Absher D, Assimes TL, Go AS, Hlatky MA, Iribarren C, Knowles JW, Li J, Narasimhan B, Sidney S, Southwick A, Myers RM, Quertermous T, Risch N, & Tang H (2009). Characterizing the admixed African ancestry of African Americans. Genome biology, 10 (12) PMID: 20025784

* Gates is more than 50% European, while Elizabeth Alexander is 65% European in ancestry. This aligns with intuition based on physical appearance. Malcolm Gladwell, who may not identify as African American (his father was a white Englishman, his mother a mixed-race Jamaican, and he is a Canadian immigrant), is likely to be ~75% European, though the number was not noted in the special.

Image Credit: Library of Congress

CATEGORIZED UNDER: Genetics, Genomics

Comments (4)

  1. diana

    “But even though generating trees of mtDNA or Y markers is more tractable using a coalescent model, and it gives you a clean answer, it’s only a tiny slice of your ancestry…”

    I keep saying this to a variety of people but to little avail.

    My graphic way of attempting to illustrate it is this: imagine that your ancestry is a big inverted triangle. mtDNA would be the leftmost (or rightmost) ray in that big inverted triangle. Just one line. All your other ancestors would be everything else, the vast majority of that triangle.

    Does that make sense?

  2. SJ

    The Mandenka are from the western fringe of West Africa

    True. The Mandenka/Mandinka are also traditionally from the Savannah part of West Africa. They started to really spread out from the area around modern-day Mali mostly around the 13th century but generally stayed within the same latitude, essentially areas that are/were more Savannah-like. Even in the present day, there are very few coastal areas below the 12th degree of latitude where Mandenka/Mandinkas are heavily represented.

    Yorubas (in modern-day western Nigeria and Benin) live very close to the coast and were therefore more likely to participate in the slave trade (as both raiders and slaves). So this is likely another factor that explains the relatively higher representation of this particular group in African American ancestry.

  3. fwiw, i think this sample of mandenka is from senegal.

  4. SJ

    fwiw, i think this sample of mandenka is from senegal.

    That would make sense (I haven’t read the paper, by the way). Except for a small strip of its southern section, Senegal is mostly Sahelian/Savannah. As can be easily seen on a map, Mali and Senegal lie at the same general latitude, if you remove the northern Malian region. In fact, the Malinke-speaking people (a subset of the Mande group) straddle the region of eastern Senegal and western Mali.




