My initial inclination in this post was to discuss a recent ordering snafu which resulted in many of my friends being quite peeved at 23andMe. But browsing through their new ‘ancestry composition’ feature I thought I had to discuss it first, because of some nerd-level intrigue. Though I agree with many of Dienekes’ concerns about this new feature, I have to admit that at least this method doesn’t give out positively misleading results. For example, I had complained earlier that ‘ancestry painting’ gave literally crazy results when they weren’t trivial. It said I was ~60 percent European, which makes some coherent sense given their non-optimal reference population set, but then stated that my daughter was >90 percent European. Since 23andMe did confirm she was 50% identical by descent with me, these results didn’t make sense; some readers suggested that there was a strong bias in their algorithms to assign ambiguous genomic segments to ‘European’ heritage (this was a problem for East Africans too).
Here’s my daughter’s new chromosome painting:
One aspect of 23andMe’s new ancestry composition feature is that it is very Eurocentric. But, most of the customers are white, and presumably the reference populations they used (which are from customers) are also white. Though there are plenty of public domain non-white data sets they could have used, I assume they’d prefer to eat their own data dog-food in this case. But that’s really a minor gripe in the grand scheme of things. This is a huge upgrade from what came before. Now, it’s not telling me, as a South Asian, very much. But, it’s not telling me ludicrous things anymore either!
But in regards to omission I am curious to know why this new feature rates my family as only ~3% East Asian, when other analyses put us in the 10-15% range. The problem with very high values is that South Asians often have some residual ‘eastern’ signal, which I suspect is not real admixture, but is an artifact. Nevertheless, northeast Indians, including Bengalis, often have genuine East Asia admixture. On PCA plots my family is shifted considerably toward East Asians. The signal they are picking up probably isn’t noise. Almost every apportionment of East Asian ancestry I’ve seen for my family yields a greater value for my mother, and that holds here. It’s just that the values are implausibly low.
In any case, that’s not the strangest thing I saw. I was clicking around people who I had “shared” genomes with, and I stumbled upon this:
As you can guess from the screenshot this is Daniel MacArthur’s profile. And according to this ~25% of chromosome 10 is South Asian! On first blush this seemed totally nonsensical to me, so I clicked around other profiles of people of similar Northern European background…and I didn’t see anything equivalent.
What to do? It’s going to take more evidence than this to shake my prior assumptions, so I downloaded Dr. MacArthur’s genotype. Then I merged it with three HapMap populations, the Utah whites (CEU), the Gujaratis (GIH), and the Chinese from Denver (CHD). The last was basically a control. I pulled out chromosome 10. I also added Dan’s wife Ilana to the data set, since I believe she got typed with the same Illumina chip, and is of similar ethnic background (i.e., very white). It is important to note that only 28,000 SNPs remained in the data set. But usually 10,000 is more than sufficient on SNP data for model-based clustering with inter-continental scale variation.
I did two things:
1) I ran ADMIXTURE at K = 3, unsupervised
2) I ran an MDS, which visualized the genetic variation in multiple dimensions
Before I go on, I will state what I found: these methods supported the inference from 23andMe, on chromosome 10 Dr. MacArthur seems to have an affinity with South Asians (i.e., this is his ‘curry chromosome’). Here are the average (median) values in tabular format, with MacArthur and his wife presented for comparison.
Table: ADMIXTURE results for chromosome 10 (columns: K 1, K 2, K 3)
You probably want a distribution. Out of the non-founder CEU sample none went above 20% South Asian. Though it did surprise me that a few were that high, making it more plausible to me that MacArthur’s results on chromosome 10 were a fluke:
And here’s the MDS with the two largest dimensions:
Again, it’s evident that this chromosome 10 is shifted toward South Asians. If I had more time right now what I’d do is probably get that specific chromosomal segment, phase it, and then compare it to various South Asian populations. But I don’t have time now, so I went and checked out the results from the Interpretome. I cranked up the settings to reduce the noise, and so that it would only spit out the most robust and significant results. As you can see, again chromosome 10 comes up as the one which isn’t quite like the others.
Is there a plausible explanation for this? Perhaps Dr. MacArthur can call up a helpful relative? From what I recall his parents are immigrants from the United Kingdom, and it isn’t unheard of for white Britons to have South Asian ancestry which dates back to the 19th century. Though to be totally honest I’m rather agnostic about all this right now. This genotype has been “out” for years now, so how is it that no one has noticed this peculiarity? Perhaps the issue is that everyone was looking at the genome-wide average, and it just doesn’t rise to the level of notice? What I really want to do is look at the distribution of all chromosomes and see how Daniel MacArthur’s chromosome 10 then stacks up. It might be a random act of nature yet.
Also, I guess I should add that at ~1.5% South Asian that would be consistent with one of MacArthur’s great-great-great-great grandparents being Indian. Assuming 25 year generation times that puts them in the mid-19th century. Of course, at such a low proportion the variance is going to be high, so it is quite possible that you need to push the real date of admixture one generation back, or one generation forward.
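The back-of-the-envelope arithmetic here can be sketched in a few lines. This is only an illustration: it assumes a single South Asian ancestor, an expected halving of that ancestor's contribution in each generation, and the 25 year generation time mentioned above. The function names are my own and hypothetical.

```python
import math

GENERATION_YEARS = 25  # assumption taken from the text above


def expected_fraction(generations_back: int) -> float:
    """Expected share of the genome inherited from one ancestor who is
    `generations_back` generations removed (a parent is 1, a grandparent 2)."""
    return 0.5 ** generations_back


def generations_for_fraction(fraction: float) -> int:
    """Whole number of generations whose expected share is closest, on a
    log scale, to the observed fraction."""
    return round(-math.log2(fraction))


g = generations_for_fraction(0.015)   # ~1.5% genome-wide
share = expected_fraction(g)          # expected share at that depth
years_before_birth = g * GENERATION_YEARS

print(g, share, years_before_birth)
```

Six generations back corresponds to a great-great-great-great grandparent, with an expected share of 1/64. Because actual inheritance varies around that expectation, the real depth could easily be one generation more or less.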
A new press release is circulating on the paper which I blogged a few months ago, Ancient Admixture in Human History. Unlike the paper, the title of the press release is misleading, and unfortunately I notice that people are circulating it, and probably misunderstanding what is going on. Here’s the title and first paragraph:
Native Americans and Northern Europeans More Closely Related Than Previously Thought
Released: 11/30/2012 2:00 PM EST
Source: Genetics Society of America
Newswise — BETHESDA, MD – November 30, 2012 — Using genetic analyses, scientists have discovered that Northern European populations—including British, Scandinavians, French, and some Eastern Europeans—descend from a mixture of two very different ancestral populations, and one of these populations is related to Native Americans. This discovery helps fill gaps in scientific understanding of both Native American and Northern European ancestry, while providing an explanation for some genetic similarities among what would otherwise seem to be very divergent groups. This research was published in the November 2012 issue of the Genetics Society of America’s journal GENETICS
The reality is that Native Americans and Northern Europeans are not more “closely related” genetically than they were before this paper. There has been no great change to standard genetic distance measures or phylogeographic understanding of human genetic variation. A measure of relatedness is to a great extent a summary of historical and genealogical processes, and as such it collapses a great deal of disparate elements together into one description. What the paper in Genetics outlined was the excavation of specific historically contingent processes which result in the summaries of relatedness which we are presented with, whether they be principal component analysis, Fst, or model-based clustering.
What I’m getting at can be easily illustrated by a concrete example. To the left is a 23andMe chromosome 1 “ancestry painting” of two individuals. On the left is me, and the right is a friend. The orange represents “Asian ancestry,” and the blue represents “European” ancestry. We are both ~50% of both ancestral components. This is a correct summary of our ancestry, as far as it goes. But you need some more information. My friend has a Chinese father and a European mother. In contrast, I am South Asian, and the end product of an ancient admixture event. You can’t tell that from a simple recitation of ancestral quanta. But it is clear when you look at the distribution of ancestry on the chromosomes. My components have been mixed and matched by recombination, because there have been many generations between the original admixture and myself. In contrast, my friend has not had any recombination events between his ancestral components, because he is the first generation of that combination.
So what the paper publicized in the press release does is present methods to reconstruct exactly how patterns of relatedness came to be, rather than reiterating well understood patterns of relatedness. With the rise of whole-genome sequencing and more powerful computational resources to reconstruct genealogies we’ll be seeing much more of this to come in the future, so it is important that people are not misled as to the details of the implications.
The Pith: You’re Asian. Yes, you!
A conclusion to an important paper, Nick Patterson, Priya Moorjani, Yontao Luo, Swapan Mallick, Nadin Rohland, Yiping Zhan, Teri Genschoreck, Teresa Webster, and David Reich:
In particular, we have presented evidence suggesting that the genetic history of Europe from around 5000 B.C. includes:
1. The arrival of Neolithic farmers probably from the Middle East.
2. Nearly complete replacement of the indigenous Mesolithic southern European populations by Neolithic migrants, and admixture between the Neolithic farmers and the indigenous Europeans in the north.
3. Substantial population movement into Spain occurring around the same time as the archaeologically attested Bell-Beaker phenomenon (HARRISON, 1980).
4. Subsequent mating between peoples of neighboring regions, resulting in isolation-by-distance (LAO et al., 2008; NOVEMBRE et al., 2008). This tended to smooth out population structure that existed 4,000 years ago.
Further, the populations of Sardinia and the Basque country today have been substantially less influenced by these events.
It’s in Genetics, Ancient Admixture in Human History. Reading through it I can see why it wasn’t published in Nature or Science: methods are of the essence. The authors review five population genetic statistics of phylogenetic and evolutionary genetic import, before moving onto the novel results. These statistics, which measure the possibility of admixture, the extent of admixture, and the date of admixture, are often presented, but nested into supplements, in previous papers by the same group. On the one hand this removes from view the engines which are driving the science. On the other hand I have always appreciated that a benefit of this injustice to the methods which make insight possible is that those without academic access can actually bite into the meat of the researcher’s mode of thought.
I did read through the methods. Twice. I’ve encountered all the statistics before, and I’ve read how they were generated, but I’ll be honest and admit that I haven’t internalized them. That has to end now, because the authors have finally released a software package which implements the statistics, ADMIXTOOLS. I plan to use it in the near future, and it is generally best if you understand the underlying mechanisms of a software package if you are at the bleeding end of analytics. I will review the technical points in more detail in future posts, more for my own edification than yours. But for the moment I’ll be a bit more cursory. Four of the tests use comparisons of allele frequencies along explicit phylogenetic trees. That’s so general as to be uninformative as a description, but I think it’s accurate to the best of my knowledge. In the basics the tests are seeing if a model fits the data (as opposed to TreeMix, which finds the best model out of a range to fit the data). The last method, rolloff, infers the timing of an admixture event based upon the decay of linkage disequilibrium. In short, admixture between two very distinct populations has the concrete result of producing striking genomic correlations. Over time these correlations dissipate due to recombination. The magnitude of dissipation can allow one to gauge the time in the past when the original admixture occurred.
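The logic of dating by decay can be illustrated with a toy calculation. To be clear, this is not the rolloff implementation in ADMIXTOOLS. It is a sketch that assumes a single pulse of admixture, so that the admixture-induced correlation between markers separated by genetic distance d (in Morgans) falls off as exp(-n·d) after n generations, and it recovers n with an ordinary log-linear least-squares fit. The function names and the noise-free data are hypothetical.

```python
import math


def ld_decay(d_morgans: float, n_generations: float, amplitude: float = 1.0) -> float:
    """Expected admixture correlation at genetic distance d after n generations,
    under the single-pulse assumption stated above."""
    return amplitude * math.exp(-n_generations * d_morgans)


def fit_generations(distances, correlations) -> float:
    """Estimate n as minus the least-squares slope of log(correlation) on distance."""
    logs = [math.log(c) for c in correlations]
    count = len(distances)
    mean_x = sum(distances) / count
    mean_y = sum(logs) / count
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(distances, logs))
    sxx = sum((x - mean_x) ** 2 for x in distances)
    return -sxy / sxx


# Simulated, noise-free decay curve for an admixture event 100 generations ago,
# sampled every 0.5 cM from 0.5 cM to 10 cM.
distances = [0.005 * i for i in range(1, 21)]
observed = [ld_decay(d, 100) for d in distances]

estimate = fit_generations(distances, observed)
print(round(estimate, 3))
```

With real data the curve is noisy and the fit is usually done on binned marker pairs, but the principle is the same: a steeper decay implies an older admixture event.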
Does the higher genetic diversity in sub-Saharan Africans explain why mixed children of blacks + other couples usually look more black than anything?
As in, the higher number of genetic characteristics overwhelms those of the other parent and allows them to be present in the child.
But this makes you ask: is the assumption that people with some African heritage tend to exhibit that heritage disproportionately even true? From an American perspective the answer is obviously yes. But from a non-American perspective not always. Why? Do the laws of genetics operate differently for Americans and non-Americans? I doubt it. Rather, hypodescent, and its undergirding principle of the “reversion to the primitive type,” are still background assumptions of American culture. In fact today black Americans are perhaps most aggressive and explicit in outlining the logic and implications of the “one drop rule,” though non-blacks tend to accept it as an operative principle as well.
In The New York Times, DNA Turning Human Story Into a Tell-All:
The tip of a girl’s 40,000-year-old pinky finger found in a cold Siberian cave, paired with faster and cheaper genetic sequencing technology, is helping scientists draw a surprisingly complex new picture of human origins.
The new view is fast supplanting the traditional idea that modern humans triumphantly marched out of Africa about 50,000 years ago, replacing all other types that had gone before.
Instead, the genetic analysis shows, modern humans encountered and bred with at least two groups of ancient humans in relatively recent times: the Neanderthals, who lived in Europe and Asia, dying out roughly 30,000 years ago, and a mysterious group known as the Denisovans, who lived in Asia and most likely vanished around the same time.
Their DNA lives on in us even though they are extinct. “In a sense, we are a hybrid species,” Chris Stringer, a paleoanthropologist who is the research leader in human origins at the Natural History Museum in London, said in an interview.
First, for reasons of novelty we are emphasizing the exotic tendrils of the human family tree. Even Chris Stringer, the modern paleontological father of “Out of Africa,” is claiming we’re hybrids! But let’s not forget that non-Africans are the product of a very rapid radiation out of the margins of the Afrotropic ecozone within the last ~50-100,000 years. I am not entirely sure that this is as true of Africans (recall how extremely basal Bushmen are to the rest of humanity; they seem to have diverged well before the “Out of Africa” pulse).
With all the talk about Basques I decided to do my own analysis with Admixture. Dienekes gave me a copy of his IBS file, which has all the 1000 Genomes Spanish samples, including Basques. I merged it with the HGDP sample, which has French Basques (just “Basques” in the plots below) and French non-Basques. I pruned most of the populations, but kept the Mozabites, which are a Berber group from Algeria. The number of markers was ~350,000, and I ran it up to K = 8, or 8 component populations. I stopped there because the components were starting to break up in a very choppy manner.
In general I do think that the idea that non-Basque Spaniards have Moorish genetic input seems supported. It isn’t definitive though. And you have to be careful: there are lower parameter values where Sardinians seem to have an affinity with Mozabites to a great extent, even more than Spaniards. But that disappears as you move up the number of K’s. But who is to say which K is the correct K? The consistent Sub-Saharan African component among non-Basque Spaniards (also evident in the Behar et al. data set) probably convinces me that there was a Moorish impact, since this ancestry is likely to have come with the Islamic conquest, and not with the Phoenicians.
All the files from the Admixture run (and csv files with tabular results) are here.
Hominin increase in cranial capacity, courtesy of Luke Jostins
A few years ago a statistical geneticist at Cambridge’s Sanger Institute, Luke Jostins, posted the chart above using data from fossils on cranial capacity of hominins (the human lineage). As you can see there was a gradual increase in cranial capacity until ~250,000 years before the present, and then a more rapid increase. I should also note that from what I know about the empirical data, mean human cranial capacity peaked around the Last Glacial Maximum. Our brains have been shrinking, even relative to our body sizes (we’re not as large as we were during the Ice Age). But that’s neither here nor there. In the comments Jostins observes:
The data above includes all known Homo skulls, but none of the results change if you exclude the 24 Neandertals. In fact, you see the same results if you exclude Sapiens but keep Neandertals; the trends are pan-Homo, and aren’t confined to a specific lineage….
I badger readers here to actually use all the analytic tools which researchers put out into public circulation, rather than just offering cheap opinions. Obviously it’s way more fun and informative to have discussions with someone who can check their own hunches by doing a few “runs” overnight. Secondly, if you have minimal technical skills all it requires is an investment of time. If you can’t be bothered to invest the time if you have a modicum of nerd-quotient then it says something about how passionate you are about these issues in my opinion (granted, life gets in the way, but as someone who routinely felt lucky to sleep 3 hours on many nights over the past 3 months, please spare me).
In the comments below a few days ago someone expressed concern at the diminishing of genetic diversity due to the disappearance of indigenous populations. My response was basically that it depends. The issue here is whether that disappearance is due to assimilation, or extinction. If a given population is genetically absorbed into another, obviously their genetic diversity is by and large maintained. What disappears are the specific genotypes, the combinations of gene pairs, which are distinctive to that given group. This is the same dynamic at the heart of the ‘disappearing blonde gene’ meme. Unless there is selection at the loci which encode or predispose one to blonde hair the ‘gene’ isn’t going anywhere. Rather, the implicit issue here is that blonde people are intermarrying with non-blonde people, and if the genetic variant has a recessive expression then the frequency of the trait will decrease. Populations with a high degree of homozygosity at the ‘blonde loci’ are distinctive in a very particular manner, but they’re no more or less ‘diverse’ than other populations which don’t manifest the same tendency.
A toy example will suffice. Take two populations, A and B, and one locus, 1, with two variants, X and x. Assume that the two populations are the same size. At locus 1 population A is 100% X, and population B is 100% x. In a diploid scenario then all the individuals in population A will be XX, and in B will be xx. When you add A + B you get a frequency of X of 0.5, and of x of 0.5 (since the two populations are balanced in size).
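The toy example can be worked through numerically. The sketch below pools the two populations and then, as an added assumption that the text does not spell out, lets the merged population mate at random so that genotype frequencies follow Hardy-Weinberg proportions. The function names and the population size of 1,000 are hypothetical.

```python
def pooled_allele_freq(freqs, sizes):
    """Frequency of allele X after pooling populations of the given sizes."""
    total = sum(sizes)
    return sum(f * s for f, s in zip(freqs, sizes)) / total


def hardy_weinberg(p):
    """Genotype frequencies after one generation of random mating,
    given frequency p of allele X."""
    q = 1 - p
    return {"XX": p * p, "Xx": 2 * p * q, "xx": q * q}


# Population A is fixed for X, population B is fixed for x, equal sizes.
p_pooled = pooled_allele_freq([1.0, 0.0], [1000, 1000])

# Genotypes at the moment of pooling: only the two homozygotes exist.
before = {"XX": 0.5, "Xx": 0.0, "xx": 0.5}

# Genotypes after random mating in the merged population.
after = hardy_weinberg(p_pooled)

print(p_pooled, before, after)
```

The allele frequencies are unchanged at 0.5 each, so no variant has been lost. What changes is the genotype distribution: only a quarter of the merged population is xx, which is why a recessive trait becomes rarer even though the allele itself does not.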
Even if the odds of successful interbreeding were just 5 percent, Neanderthal genes would make up the majority of the human genome today. As it is, a lack of viable sex explains why none of the Neanderthals’ mitochondrial DNA made its way into modern humans, and why so little of their main genome did.
Currat and Excoffier suggest that either modern humans and Neanderthals didn’t have sex very often, or their hybrids weren’t very fit. They favour the first idea. According to their model, it would only have taken between 197 and 430 liaisons between ancient humans and Neanderthals to fill 1-3 percent of modern Eurasian genomes with Neanderthal DNA. Considering that the two groups probably interacted for 10,000 years or so, it would have been enough for one human to sleep with one Neanderthal every 23 to 50 years.
From what I gather in the comments this is due to the fact that if there was a wave of advance very small levels of admixture per unit of advance can build up rather rapidly. I think this is easy to express in temporal rather than spatial terms.
For example, let’s imagine a population of modern humans expanding into a population of Neandertals. The original source population doesn’t receive any more contributions after the initial push, so you have a series of admixture events over time. Assuming 5% admixture per generation, this is the dilution of the “original ancestry” which would occur over 30 generations, or 750 years:
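The dilution is a simple geometric decay, and it can be reproduced with a short calculation. This sketch assumes the 5% figure applies as a constant fraction of new Neandertal ancestry entering in every generation, with 25 years per generation. The function name is hypothetical.

```python
def original_ancestry_remaining(admixture_rate: float, generations: int):
    """Fraction of the source population's ancestry left after each generation,
    when a constant share of ancestry is replaced by admixture every generation."""
    remaining = []
    fraction = 1.0
    for _ in range(generations):
        fraction *= 1.0 - admixture_rate
        remaining.append(fraction)
    return remaining


remaining = original_ancestry_remaining(0.05, 30)

# Print every fifth generation along with the elapsed years.
for gen in range(5, 31, 5):
    print(gen, gen * 25, round(remaining[gen - 1], 3))
```

After 30 generations only a little over a fifth of the original ancestry is left, which shows how quickly small per-generation contributions add up along a wave of advance.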
Last year when discussing the possible admixture of Neandertals with the ancestors of modern non-Africans I joked that Sub-Saharan Africans were “pure humans.” This was tongue-in-cheek in part because the results from the Neandertal genome shifted my assessment of the probability of archaic admixture within Africa as well. In other words, there may never have been a pure “human” type which expanded and assimilated archaic ancestry on the margins of its range. Species Platonism may be very misleading for our particular lineage. Rather, what it means to be human has always been in flux, a compromise between extremely different ancestral components.
The class human or H. sapiens refers to a set of individuals. On the grand scale it’s really not all that clear and distinct. When do “archaic” humans become “modern” humans? Taking into account human variation, what is a “human universal”? A set of organisms are given a name which denotes the reality that they may share common ancestry, and interact behaviorally, and are potential mates. But many of these phenomena are fuzzy on the margins. Many of the same issues which emerge in the “species concept” debates are rather general up and down the scales of natural complexity. A similar problem crops up when we conflate the history of genes with the history of populations. Such a conflation has value and utility to a first approximation. The story of mitochondrial Eve was actually the history of one particular locus, the mitochondrial genome. But it did tell us quite a bit about the history of the human species, even if in hindsight it looks as if some scientists overinterpreted those findings. One of the major issues I’ve noticed over the past year, with the heightened likelihood of archaic admixture in the modern human genome, is that people regularly get confused by the difference between total genome ancestry, and the evolutionary history of one particular gene.
It will be interesting to see how 23andMe deals with the pool of people that respond to the 10,000 free kits. Doesn’t seem like they can pre-screen applicants, since African American heritage is sometimes more sociological than genetic (based on previous genetic studies, anyway). In other words, who’s to say who is an African American and who isn’t?
And how will they deal with the unscrupulous people who apply with the full knowledge that they have no recent African ancestry? Certainly they won’t be able to screen those people out, even with surveys or other methods.
My concerns probably won’t apply to the genetic association studies, since they can look for test-takers that have, for example, a certain % of African American ancestry, or can look for African American ancestry in the region of the genome where the association is believed to reside (after it’s predicted to exist).
However, my concerns will certainly apply to any conclusions they might make about African American genetic ancestry. For example, a conclusion such as “XX% of African Americans have less than XX% of African American DNA,” or “XX% of African Americans have European Y-DNA signatures.” These calculations will unfortunately be biased by the “unscrupulous”, even if they ask for surveys or other methods to deter bias. The best they might be able to do is “XX% of African Americans with 5% or more of African American DNA have European Y-DNA,” and conclusions that take the “unscrupulous” bias into account.
If you have not read my post “To the antipode of Asia”, this might be a good time to do so if you are unfamiliar with the history, prehistory, and ethnography of mainland Southeast Asia. In this post I will focus on mainland Southeast Asia, and how it relates implicitly to India and China genetically, and what inferences we can make about demography and history. Though I will touch upon the Malay peninsula in the preliminary results, I have removed the Indonesian and Philippine samples from the data set in totality. This means that in this post I will not touch upon spread of the Austronesians.
I present before you two tentative questions:
- What was the relationship of the spread of Indic culture to Indic genes in mainland Southeast Asia before 1000 A.D.?
- What was the relationship of the spread of Tai culture to Tai genes in mainland Southeast Asia after 1000 A.D.?
The two maps above show the distribution of Austro-Asiatic and Tai languages in mainland Southeast Asia. Observe that when you join the two together in a union they cover much of the eastern 2/3 of mainland Southeast Asia. The fragmented nature of Austro-Asiatic languages in the northern region, edging into the People’s Republic of China, implies to us immediately that it is likely that in the past there was a continuous zone of Austro-Asiatic speech in this region. From the histories and mythologies of the Tai people we know that this group migrated from the southern fringes of China around 1000 A.D. This is obvious when we note that there are still Tai people in southern China, and the expansion of the Tai across what is today Thailand is to some extent historically attested. Between 1000 and 1500 there was a wholesale ethnic reorganization of the Chao Phraya river basin. Was that a matter of demographic replacement, or cultural assimilation, or some of both?
Second, what was the impact of Indians upon mainland Southeast Asia? One of the easiest ways to ascertain Indian influence is script. Burmese, Thai and Cambodian scripts all derive from Grantha, an archaic Tamil script (non-Islamic scripts in island Southeast Asia, such as Javanese and Balinese, also derive from South Indian precursors). The Indian religious influences also are more southern than northern, manifesting in the southern forms of Shaivite Hinduism and Sri Lankan Theravada Buddhism.
Negrito, Philippines. Credit: Ken Ilio
In the post below I mentioned that the Malaysian and Philippine Negritos seem to be two very distinct populations. This was something I wanted to explore in more detail, so I naturally decided to poke around the Pan-Asian SNP data set. The aims are made somewhat more difficult by the fact that there are only ~56,000 markers in the data set (as opposed to ~600,000 in the HGDP and more than 1 million in the HapMap). Additionally, the intersection with other data sets is small. For example, only ~20,000 SNPs overlap with the HGDP. With all that in mind I hazarded that something is better than nothing. Relatives and HapMap populations were removed from the data set (thanks Zack). Additionally, I beefed up the South Asian populations with the Gujaratis from the HapMap, which had an intersection of ~32,000 SNPs. After a few test runs I decided to remove the Mlabri. They always shook out very early as a separate population from many others nearby, and their genetic distances were very high. This tribe is only numbered in the hundreds, and I wouldn’t be surprised if they’ve been subjected to a lot of population bottlenecks, resulting in some very distinctive allele frequencies.
But before I move to the results, let’s back up for a moment. Who are the “Negritos”? As suggested by the term, Negrito refers to a range of populations which are characterized by small size and African-like features (very dark skin and frizzy hair). In general their distribution is limited to Southeast Asia (there are suggestions that a Negrito population may only recently have gone extinct in Australia’s rainforests, but that’s speculative. On a more antique scale there are records which may be interpreted to suggest the existence of Negritos in Taiwan as late as 1900, and in southern China within the past 1,000 years). So you can bracket their distribution from the Andaman Islands to the Philippines, with isolated groups in the Malay peninsula. Negritos are presumed to be the original inhabitants of Southeast Asia before the arrival of rice farmers from the north. Like the Pygmies of Africa, most of the Negritos speak languages which are known in other populations. Those of the Philippines speak Austronesian dialects. Interestingly those of Malaysia speak an Austro-Asiatic language, and so have affinities with many groups to their north linguistically, despite being surrounded by Austronesian speakers. Only the Andaman Islanders have a distinctive language, which makes sense seeing as how they have been relatively isolated from mainland Asian influences.
I ran ADMIXTURE from K = 4 to K = 12. K = 8 seemed the most informative to me (at higher K’s the major dynamic is that the Philippine Negritos start fragmenting into many distinct clusters). I’ve made a few cosmetic changes. With this East and Southeast Asia heavy data set there’s almost no difference between all the various Indian groups, so I amalgamated them together. I also did the same for related populations geographically adjacent which exhibited no genetic difference (e.g., Central and East Javanese).
Recently something popped up into my Google news feed in regards to “Neanderthal-human mating.” If you are a regular reader you know that I’m wild for this particular combination of the “wild thing.” But a quick perusal of the press release told me that this was a paper I had already reviewed when it was published online in January. I even used the results in the paper to confirm Neanderthal admixture in my own family (we’ve all been genotyped). One of my siblings is in fact a hemizygote for the Neanderthal alleles on the locus in question! I guess it shows the power of press releases upon the media. I would offer up the explanation that this just shows that the more respectable press doesn’t want to touch papers which aren’t in print, but that’s not a good explanation when they are willing to hype up stuff which is presented at conferences at even an earlier stage.
A second aspect I noted is that except for Ron Bailey at Reason all the articles which use a color headshot use a brunette reconstruction, like the one here which is from the Smithsonian. But the most recent research (dating to 2007) seems to suggest that the Neanderthals may have been highly depigmented. This shouldn’t be too surprising when one considers that they were resident in northern climes for hundreds of thousands of years.
But there are some new tidbits, from researchers in the field of study:
In the comments below Antonio pointed me to this working paper, What Do DNA Ancestry Tests Reveal About Americans’ Identity? Examining Public Opinion on Race and Genomics. I am perhaps being a bit dull but I can’t figure where its latest version is found online (I stumbled upon what looks like another working paper version on one of the authors’ websites). Here’s the abstract:
Genomics research will soon have a deep impact on many aspects of our lives, but its political implications and associations remain undeveloped. Our broad goal in this research project is to analyze what Americans are learning about genomic science, and how they are responding to this new and potentially fraught technology.
We pursue that goal here by focusing on one arena of the genomics revolution — its relationship to racial and ethnic identity. Genomic ancestry testing may either blur racial boundaries by showing them to be indistinct or mixed, or reify racial boundaries by revealing ancestral homogeneity or pointing toward a particular geographic area or group as likely forebears. Some tests, or some contexts, may permit both outcomes. In parallel fashion, genomic information about race can emphasize its malleability and social constructedness or its possible biological bases. We posit that what information individuals choose to obtain, and how they respond to genomic information about racial ancestry will depend in part on their own racial or ethnic identity.
We evaluate these hypotheses in three ways. The first is a public opinion survey including vignettes about hypothetical individuals who received contrasting DNA test results. Second is an automated content analysis of about 5,500 newspaper articles that focused on race-related genomics research. Finally, we perform a finer-grained, hand-coded, content analysis of about 700 articles profiling people who took DNA ancestry tests.
Three major findings parallel the three empirical analyses. First, most respondents find the results of DNA ancestry tests persuasive, but blacks and whites have very different emotional responses and effects on their racial identity. Asians and Hispanics range between those two poles, while multiracials show a distinct pattern of reaction. Second, newspaper articles do more to teach the American reading public that race has a genetic component than that race is a purely social construction. Third, African Americans are disproportionately likely to react with displeasure to tests that imply a blurring of racial classifications. The paper concludes with a discussion, outline of next steps, and observations about the significance of genomics for political science and politics.
Update: John Hawks’ lab is working in the same area, and he disagrees with the specific results presented here. It’s always a reminder to be careful about sexy results presented at conferences! (someone should do a study!)
So claimed Peter Parham at a Royal Society meeting last week, Human evolution, migration and history revealed by genetics, immunity and infection. You can actually listen to the talk by pulling down the mp3 file. To get the part about human evolution and introgression, jump to 24 minutes in.
Here is the general sketch: It looks like ~50 percent of the HLA Class I alleles in Europeans derive from Neandertals, ~70-80 percent of HLA Class I alleles in East Asians derive from Denisovans, and ~90-95 percent of HLA Class I alleles in Papuans derive from Denisovans. If you recall, ~2.5% of the total genome content of non-Africans seems to be Neandertal, while ~5% of the total genome content of Papuans seems to be Denisovan. The total genome content proportions are rough estimates, so there may be some wiggle room in there. But you can see that the HLA allele admixture estimates from these ancient Eurasian lineages are greater by an order of magnitude. Why?
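To put a number on "an order of magnitude," here is a quick back-of-the-envelope sketch in Python. It uses only the rough percentages quoted above; the function name and the pairing of the European HLA figure with the non-African genome-wide figure are my own choices for illustration, and the East Asian case is left out because no genome-wide Denisovan figure for East Asians is given here.

```python
# Rough figures quoted in the text: (HLA class I percent range, genome-wide percent).
# Pairing Europeans with the non-African genome-wide Neandertal estimate is an
# assumption made for illustration.
estimates = {
    "Europeans / Neandertal": ((50.0, 50.0), 2.5),
    "Papuans / Denisovan": ((90.0, 95.0), 5.0),
}


def fold_enrichment(hla_range, genome_pct):
    """Return (low, high) ratio of HLA archaic share to genome-wide archaic share."""
    low, high = hla_range
    return low / genome_pct, high / genome_pct


for label, (hla_range, genome_pct) in estimates.items():
    low, high = fold_enrichment(hla_range, genome_pct)
    print(f"{label}: HLA share is {low:.0f}x to {high:.0f}x the genome-wide share")
```

Given how rough the inputs are, the ratios should be read as "roughly twentyfold," not as precise estimates.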
A Cape Coloured family
I’ve mentioned the Cape Coloureds of South Africa on this weblog before. Culturally they’re Afrikaans in language and Dutch Reformed in religion (the possibly related Cape Malay group is Muslim, though also Afrikaans speaking traditionally). But racially they’re a very diverse lot. In this way they can be analogized to black Americans, who are ~75% West African and ~25% Northern European, with the variance in ancestral proportions being such that ~10% are ~50% or more European in ancestry. The Cape Coloureds though are much more complex. Some of their ancestry is almost certainly Bantu African. This element is related to the West African affinities of black Americans. And, they have a Northern European element, which likely came in via the Dutch, German, and Huguenot settlers (mostly males). But the Cape Coloureds also have other contributions to their genetic heritage. Firstly, they have Khoisan ancestry, whether from Bushmen or Khoi. This is well known in their oral memory. The hinterlands of the Cape of Good Hope are beyond the ecological range of the Bantu agricultural toolkit, so the region was still dominated by the Khoisan when the Europeans arrived. But there are also other suggestions of ancestry from Asia. The existence of the Cape Malays, whose adherence to Islam derives from the Muslim slaves brought by the Dutch, hints at likely relationships to the populations of maritime Southeast Asia. Finally, there are the Indians. This element is not too well recalled in cultural memory. But the Dutch brought many slaves from India as well as Southeast Asia. The first Dutch governor of the Cape Colony had a maternal grandmother who was an Indian slave, by various accounts Goan or Bengali (the town of Stellenbosch is named for him).
No doubt the usual lot of the descendants of Indian slaves during the Dutch era was far more likely to be absorption into the melange of the Coloured population than assimilation into what later became the Afrikaners.
Why is this aspect of Cape Coloured ancestry forgotten? I think part of the reason is that there is a large South African Indian community present today, but that community post-dates the Dutch period, and arrived with the British. When South Africans think of Indians they think of these people. Interestingly when the new genetic studies confirming Indian ancestry came on the scene I was “corrected” several times by Indians themselves when reporting this part of the Coloured heritage. They were under the impression I must be mistaken, as no one was familiar with the Cape Coloureds having Indian ancestry. Unfortunately pointing to PCA and STRUCTURE plots did not clear up the confusion.
In any case, thanks to the African Ancestry Project I now have three unrelated Coloured samples (I have more, but they are related). Since AAP is Afrocentric I thought it would be appropriate to run the Coloured samples separately first. So that’s what I did.
In the post yesterday I reported what was generally known about the Horn of Africa, that its populations seem to lie between those of Sub-Saharan Africa and Eurasia genetically. This is totally reasonable as a function of geography, but there are also suggestions that this is not simply a function of isolation by distance (i.e., populations at position 0.5 on the interval 0.0 to 1.0 would presumably exhibit equal affinities in both directions due to gene flow). For example, you observe the almost total lack of “Bantu” genetic influence on the Semitic and Cushitic populations of the Horn of Africa, and the lack of Eurasian influence in groups to the south and west of the Horn except to some extent the Masai.
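For readers who want the isolation-by-distance null expectation made concrete, here is a toy sketch in Python. It is a simple one-dimensional stepping-stone model of my own construction, not anything fitted to real data: the number of demes, the migration rate `m`, and the function name are all assumptions. The two end demes stand in for the source populations at positions 0.0 and 1.0, and each interior deme exchanges a fraction `m` of its ancestry with its two neighbors every generation.

```python
def stepping_stone(n_demes=11, m=0.2, generations=5000):
    """Toy 1-D stepping-stone model of ancestry along a line of demes.

    Values are the fraction of ancestry tracing to the population at
    position 1.0. The end demes are held fixed at 0.0 and 1.0.
    """
    ancestry = [0.0] * n_demes
    ancestry[-1] = 1.0
    for _ in range(generations):
        updated = ancestry[:]
        for i in range(1, n_demes - 1):
            # Mean ancestry of the two adjacent demes.
            neighbor_mean = (ancestry[i - 1] + ancestry[i + 1]) / 2
            # Keep (1 - m) of local ancestry, replace m with neighbors' ancestry.
            updated[i] = (1 - m) * ancestry[i] + m * neighbor_mean
        ancestry = updated
    return ancestry


cline = stepping_stone()
for i, value in enumerate(cline):
    position = i / (len(cline) - 1)
    print(f"position {position:.1f}: ancestry from the 1.0 end = {value:.3f}")
```

Under these assumptions the model settles into a smooth linear gradient, with the deme at position 0.5 drawing equally from both ends. Sharp discontinuities of the sort described for the Horn are what this simple null would not produce.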
Tacking horizontally in terms of discipline, over the past few generations there has been a veritable cottage industry making the case for the recent origin of many ethno-linguistic populations through a process of cultural self-creation. Clearly there are many cases of this, some of them studied in depth by anthropologists (e.g., the shift from Dinka to Nuer identity). But there has been an unfortunate tendency to over-generalize in this direction. In some ways this is peculiar insofar as these models presuppose the infinite plasticity of culture without observing the sharp and strong norms which those very same phenomena can enforce. The genetic isolation of non-Muslims in the Middle East after the rise of Islam seems rather well validated by the evidence from genomics. The norms of both Muslims and non-Muslims strongly biased them toward endogamy, and the nature of Islamic hegemony and domination was such that Muslims were the ones who were likely to have cosmopolitan affinities with the “Islamic international.” In contrast, non-Muslim minorities began a long process of involution after the Islamic Arab conquests, only disrupted in the past century by emigration and to a lesser extent emancipation.
So back to the Horn of Africa. The vast majority of the people of the Horn of Africa speak an Afro-Asiatic language. Arabic and Hebrew are the most famous members of this group, but it is a very broad classification, ranging from the dialects of the Berbers in the Maghreb all the way to ancient Akkadian. There are two large subfamilies of particular note and interest here: Semitic and Cushitic. The map above shows the distribution within the Horn of Africa. One can “quick & dirty” summarize the pattern here by observing that Semitic languages in Ethiopia tend to be concentrated in the north-central Christian highlands, while Cushitic is found everywhere else. Additionally, there is the confluence between religion and ethnicity, as there are Cushitic Muslims (Somalis, Afar, etc.) and Cushitic Christians (many Oromo, etc.). From what I can gather many Cushitic social and political elites have had a tendency toward assimilating into an Amhara Semitic identity (Haile Selassie’s mother was a Muslim Oromo). We could therefore generate a possible model where Semitic languages arrived late to Ethiopia and spread through elite emulation, so the difference between Semitic and Cushitic peoples should be marginal in the genomic dimension (such as the marginal differences between Hausa and Yoruba in Nigeria). Or, we could posit that the Semitic element is distinct from a pre-existent Cushitic substratum.
To make a long story short, by running more ADMIXTURE with a Horn of Africa centered data set I have discerned that one can actually differentiate Cushitic and Semitic elements in the Horn and tentatively identify them with different ancestral components. First, the technical details….