After my post on the ‘race question’ I thought it would be useful to point to Jerry Coyne’s ‘Are there human races?’. The utility is that Coyne’s book Speciation strongly shaped my own perceptions. I knew the empirical reality of clustering before I read that book, but the analogy with “species concept” debates only became striking after I grew more familiar with that literature. Coyne’s post was triggered by a review of Race?: Debunking a Scientific Myth and Race and the Genetic Revolution: Science, Myth, and Culture. He terms the review tendentious, and I generally agree.
In the early 20th century Western intellectuals of all political stripes understood what biology told us about human taxonomy. In short, human races were different, and the white European race was superior on the metrics which mattered (this was even true of Left-Socialist intellectuals such as H. G. Wells and Jack London). In the early 21st century Western intellectuals of all political stripes understand what biology teaches us about human taxonomy. Human races are basically the same, and for all practical purposes identical, and equal on measures which matter (again, to Western intellectuals). As Coyne alludes to in his post these are both ideologically driven positions. One of the main reasons that I shy away from modern liberalism is a strong commitment to interchangeability and identity across all individuals and populations as a matter of fact, rather than equality as a matter of legal commitment. In a minimal government scenario the details of human variation are not of particular relevance, but if you accept the feasibility of social engineering (a term I am not using in an insulting sense, but in a descriptive one) you have to start out with a model of human nature. So this is not just an abstract issue. For whatever reason many moderns, both liberals and economic conservatives, start out with one of near identity (e.g., H. economicus in economics).
I want to highlight a few sections of Coyne’s post:
In terms of autosomal DNA, the Iceman clearly clusters with modern Sardinians, and also appears slightly more removed than them compared to continental Europeans. Interestingly, at least as far as the PC analysis shows, Sardinians appear to be intermediate between the Iceman and SW Europeans, rather than Italians. Perhaps, this makes sense if the Paleo-Sardinian language is indeed related to languages of Iberia.
This trend aroused a little curiosity in me too. I’m sure Dienekes & company will be probing these issues a lot in the near future, but I couldn’t wait. I took the IBS data set, which includes a lot of individuals from various areas of Spain, the Sardinians, French and French Basque from the HGDP, and the Tuscans from the HapMap, and threw them together into a pot. I added HGDP Russians & Orcadians (the latter a British group) to make sure there was a North European “outgroup.” In terms of technical details the combined data set had ~220,000 SNPs, not too shabby. Additionally, I decided to run a PCA, where this number of SNPs is more than sufficient.
On a technical note, the Sardinians were swamped in raw numbers by Iberians and Tuscans (over 100 and around 80 respectively). This means that the peculiarities of the Sardinian genetic heritage didn’t show up, rather, what you see are the Sardinians as they arrange themselves in relation to the genetic variation of these more numerous groups. I used SmartPCA to generate the 10 largest independent dimensions of variation. To make a long story short there really wasn’t much variation added from the second dimension on in this relatively homogeneous sample. So below is PC 1 and 2 (E1 and E2).
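For readers who want to see what “generating the largest independent dimensions of variation” amounts to, here is a minimal sketch in plain Python. It is not SmartPCA’s code: it only centers each SNP, skips the normalization and outlier handling that SmartPCA applies, and uses simple power iteration. The six toy “individuals” and four “SNPs” are invented for illustration and have nothing to do with the real data set, and every function name is hypothetical.

```python
import math

def center_columns(genotypes):
    """Subtract each SNP's mean allele count from its column."""
    n = len(genotypes)
    n_snps = len(genotypes[0])
    means = [sum(row[j] for row in genotypes) / n for j in range(n_snps)]
    return [[row[j] - means[j] for j in range(n_snps)] for row in genotypes]

def gram_matrix(rows):
    """Individual-by-individual matrix of dot products."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in rows] for r1 in rows]

def top_eigenvector(matrix, iterations=500):
    """Power iteration; returns (eigenvalue, unit eigenvector)."""
    n = len(matrix)
    start = [float(i + 1) for i in range(n)]  # fixed start, so runs are repeatable
    start_norm = math.sqrt(sum(x * x for x in start))
    vec = [x / start_norm for x in start]
    value = 0.0
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in product))
        if norm == 0.0:
            break
        vec = [x / norm for x in product]
        value = norm
    return value, vec

def principal_components(genotypes, k=2):
    """Return the first k per-individual coordinate vectors (E1, E2, ...)."""
    gram = gram_matrix(center_columns(genotypes))
    n = len(gram)
    components = []
    for _ in range(k):
        value, vec = top_eigenvector(gram)
        components.append(vec)
        # Deflate: remove the component just found before looking for the next.
        gram = [[gram[i][j] - value * vec[i] * vec[j] for j in range(n)]
                for i in range(n)]
    return components

# Invented toy data: rows are individuals, columns are SNPs (0, 1 or 2 copies).
toy_genotypes = [
    [2, 2, 0, 0], [2, 1, 0, 0], [2, 2, 0, 1],  # hypothetical group A
    [0, 0, 2, 2], [0, 0, 2, 1], [0, 1, 2, 2],  # hypothetical group B
]
pc1, pc2 = principal_components(toy_genotypes, k=2)
for label, x, y in zip("AAABBB", pc1, pc2):
    print(label, round(x, 3), round(y, 3))
```

In this toy case the first component separates the two invented groups by sign, which is the same kind of pattern one reads off a PC 1 versus PC 2 plot. Because the components are driven by whatever variation dominates the sample, adding many more rows from one group would reshape them, which is the sample-size caveat noted above.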
Well, the paper is finally out, New insights into the Tyrolean Iceman’s origin and phenotype as inferred by whole-genome sequencing. In case you don’t know, Ötzi the Iceman died 5,300 years ago in the alpine region bordering Austria and Italy. He seems to have been killed. And due to various coincidences his body was also very well preserved. This means that enough tissue remained that researchers have been able to amplify his DNA. And now they’ve sequenced it enough to the point where they can make some inferences about his phenotypic characteristics and his phylogenetic relationships to modern populations.
The guts of this paper will not be particularly surprising to close readers of this weblog. The guesses of some readers based on what the researchers hinted were correct: Ötzi seems to resemble most closely the people of Sardinia. This is rather interesting. One reason is prosaic. The HGDP sample used in the paper has many Northern Italians (from Bergamo). Why is it that Ötzi does not resemble the people from the region that he was indigenous to? (we know that he was indigenous because of the ratio of isotopes in his body) A more abstruse issue is that it is interesting that Sardinians have remained moored to their genetic past, enough so that a 5,300 year old individual clearly can exhibit affinities with them. The distinctiveness of Sardinians jumps out at you when you analyze genetic data sets. They were clearly set apart in L. L. Cavalli-Sforza’s The History and Geography of Human Genes, 20 years ago. One reason that Sardinians may be distinctive is that Sardinia is an isolated island. Islands experience reduced gene flow because they’re surrounded by water. And sure enough, Sardinians are especially similar to each other in relation to other European populations.
There’s a new paper out, Partial genetic turnover in neandertals: continuity in the east and population replacement in the west. The primary results are above. Basically, using 13 mtDNA samples the authors conclude that it looks as if there was a founder effect for Neanderthals in Western Europe ~50 K years ago, generating a very homogenized genetic background for this particular population before the arrival of modern humans. Perhaps it’s just me, but press releases with headlines such as “European Neanderthals Were On the Verge of Extinction Even Before the Arrival of Modern Humans” strike me as hyperbolic. I’m also confused by quotes like the one below:
After posting on Basque mtDNA I wanted to make something more explicit that I alluded to below, that uniparental lineages are highly informative, but they may not be representative of total genome content. This is plainly true in the case of mestizos from Latin America, but we don’t need genetics to point us in the right direction on this score, we have plenty of textual evidence for asymmetry in sexes when it came to admixture events in the post-Columbian era. Rather, I want to note again the issue of South Asia. When it comes to mtDNA the good majority of South Asian lineages are closer to those of East Asia than Western Eurasia. By this, I do not mean to say that they’re particularly close to East Asian lineages, only that if you go back in the phylogeny the South Asian lineages (I’m thinking here of haplogroup M) tend to coalesce with East Asian lineages before they do so with West Eurasian lineages.
Here is a quote from one of the definitive papers on this topic:
Michelle tipped me off to 23andMe’s new initiative to get Parkinson’s disease sufferers genotyped. Basically, if you are a sufferer, you get the service for free. The goal is presumably to increase the sample size so as to pick up new possible associations. But a question: can you think of a downside for Parkinson’s disease sufferers? A lot of people have genetic privacy concerns, but if you manifest a disease like Parkinson’s I suspect that’s the least of your worries.
There’s a new paper in AJHG which caught my eye, The Basque Paradigm: Genetic Evidence of a Maternal Continuity in the Franco-Cantabrian Region since Pre-Neolithic Times (ungated). The first thing you need to know about this paper is that it focuses on only the direct maternal lineage of Basques via the mtDNA. In some ways this is weak tea, since it doesn’t give us a total genome estimate. But there are major upsides to mtDNA and Y. First, because of the lack of recombination it is relatively easy to generate a nice phylogenetic tree using a coalescent model. And second, for mtDNA the molecular clock is considered relatively reliable.
In this specific paper they also expanded the scope of their analysis to the whole mtDNA sequence, instead of just the hypervariable region. Not only did they look at whole sequences, but they also had an enormous sample size. They sequenced over 400 mtDNA genomes from the Basque country and neighboring regions. Haplogroup H peaks in frequency among Basques, and drops off among their neighbors (Gascons, Spaniards, etc.). Because the Basque speak a non-Indo-European language they are usually presumed to be indigenous in relation to their neighbors (or at least more indigenous). Until recently there was a strong presupposition that the Basque were ideal representatives of the pre-Neolithic populations of Western Europe. One common method of analysis would be to use the Basque as a pre-Neolithic “reference,” and simply estimate the impact of a Neolithic demographic wave of advance by using an eastern Mediterranean population as a second “reference” within an admixture framework. But more recent work has muddled the idea that the Basque are the descendants of Paleolithic Europeans. Finally, I suspect we’ll also have to acknowledge complexity in demographic histories. To say that the Basque exhibit continuity with Mesolithic Iberians may not contradict a substantial Neolithic contribution. South Asians for example are one numerous modern group which exhibits sharply divergent affinities if you use Y chromosomes (West Eurasian) or mtDNA (not West Eurasian). Why? The details are prehistorical.
Recently I was tipped off to the appearance of a new paper, Genome-Wide Association Study Identifies Chromosome 10q24.32 Variants Associated with Arsenic Metabolism and Toxicity Phenotypes in Bangladesh. This is the section which caught my eye: “Using data on urinary arsenic metabolite concentrations and approximately 300,000 genome-wide single nucleotide polymorphisms (SNPs) for 1,313 arsenic-exposed Bangladeshi individuals.” 300 K SNPs with 1,313 Bangladeshi individuals is a lot! I’m interested in this data set because of the 200+ participants in the Harappa Ancestry Project my parents remain the “unadmixed” South Asians with the highest fraction of East Asian ancestry (10-15 percent). Within South Asia aside from those groups with clear East Asian affinities only peoples of Munda background have the same levels. This data set could answer a lot of questions as to the typicality of my parents (literally within a few hours in terms of data exploration). But this is all you get in the supplements:
I’m going to be speaking at the Moving Secularism Forward conference in Orlando next week. They invited me because I’m a conservative atheist public intellectual, and the three other conservative atheist public intellectuals in the United States were presumably busy. In any case, going over what I’m going to talk about I was double-checking political breakdowns by atheist & agnostic proportions and ideology in the General Social Survey for after the year 2000.
I used the “GOD” variable, which asks people about their belief in God. Those who did not believe, or said there was no way to find out, I classed as “atheists & agnostics.” This means that the total percentages in the population are higher than self-reports; that’s because the word atheism in particular has a negative connotation (I recall that Julia Sweeney’s parents were tolerant of the fact that she did not believe in God, but were aghast that she was an atheist!). “POLVIEWS” was the variable which I crossed “GOD” with. It has seven responses, from very liberal to very conservative, and I just put all liberals and conservatives into one category.
The first table displays what proportion in the whole society atheist & agnostic liberals (or conservatives) are. Since the total proportion of atheists & agnostics is small, naturally these percentages are small. The two subsequent tables just display what percentage of atheists & agnostics are liberal, or what percentage of liberals are atheist & agnostic.
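To make the difference between the three tables concrete, here is a minimal sketch of the arithmetic. The counts are invented for illustration and are not General Social Survey figures. The numeric coding of POLVIEWS (1 to 7, from most liberal to most conservative, with 4 as moderate) is an assumption in this sketch, as are all the names used.

```python
from collections import defaultdict

def collapse_polviews(code):
    """Collapse an assumed 1-7 ideology code into three categories."""
    if code <= 3:
        return "liberal"
    if code == 4:
        return "moderate"
    return "conservative"

# Invented counts keyed by (belief category, assumed POLVIEWS code).
hypothetical_counts = {
    ("atheist_agnostic", 1): 10, ("atheist_agnostic", 2): 20,
    ("atheist_agnostic", 3): 10, ("atheist_agnostic", 4): 30,
    ("atheist_agnostic", 5): 5, ("atheist_agnostic", 6): 4,
    ("atheist_agnostic", 7): 1,
    ("believer", 1): 20, ("believer", 2): 90, ("believer", 3): 100,
    ("believer", 4): 370, ("believer", 5): 140, ("believer", 6): 150,
    ("believer", 7): 50,
}

# Step 1: put all liberals and all conservatives into one category each.
collapsed = defaultdict(int)
for (belief, code), n in hypothetical_counts.items():
    collapsed[(belief, collapse_polviews(code))] += n

total = sum(collapsed.values())
belief_totals = defaultdict(int)
ideology_totals = defaultdict(int)
for (belief, ideology), n in collapsed.items():
    belief_totals[belief] += n
    ideology_totals[ideology] += n

# Table 1: each cell as a percentage of the whole sample.
pct_of_population = {k: 100.0 * n / total for k, n in collapsed.items()}
# Table 2: ideology breakdown within each belief category.
pct_within_belief = {(b, i): 100.0 * n / belief_totals[b]
                     for (b, i), n in collapsed.items()}
# Table 3: belief breakdown within each ideology.
pct_within_ideology = {(b, i): 100.0 * n / ideology_totals[i]
                       for (b, i), n in collapsed.items()}

cell = ("atheist_agnostic", "liberal")
print(pct_of_population[cell], pct_within_belief[cell], pct_within_ideology[cell])
```

The same cell yields three different percentages depending on the denominator, which is why the first table’s numbers are small while the other two need not be.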
There has been a lot of talk in the media about a new paper which reports that the Y chromosome is not deteriorating, as had been previously inferred from the data. In 2004 Bryan Sykes wrote Adam’s Curse: A Future Without Men which used this model as a framing device (and naturally elicited great general interest). You can read some earlier critiques at Gene Expression Classic. I never paid attention to this debate in the details because it seemed ludicrous on the face of it. Bryan Sykes was predicting the extinction of males in ~100,000 years. Right, we just happen to be living right before the genomic Götterdämmerung. I don’t think so. Sometimes absurd results which fly in the face of plain history and robust theory are profoundly insightful. But most of the time they’re just false leads.
When the Isthmus of Panama rose from the sea, it may have changed the climate of Africa–and encouraged the evolution of humans.
The emergence of the Isthmus of Panama has been credited with many milestones in Earth’s history. When it rose from the sea some 3 million years ago, the isthmus provided a bridge for the migration of animals between North and South America, forever changing the fauna of both continents. It also blocked a current that once flowed west from Africa to Asia, diverting it northward to strengthen the Gulf Stream. Now Steven Stanley, a paleobiologist at Johns Hopkins, says that that change in currents may be behind yet another major event: the evolution of humans. When the isthmus rearranged the ocean, he says, it triggered a series of ice ages that in turn had a crucial impact on the evolution of hominids in Africa.
Question: do we have enough nukes to re-open the isthmus?
Recently Jason Antrosio began a dialogue with readers of this weblog on the “race question.” More specifically, he asked that we peruse a 2009 review of the race question in the American Journal of Physical Anthropology. Additionally, he also pointed me to another 2009 paper in Genome Research, Non-Darwinian estimation: My ancestors, my genes’ ancestors. Normally I don’t react well to interactions with anthropologists who are not Henry Harpending or John Hawks. But Dr. Antrosio engaged civilly, so I shall return the favor.
I did read all the papers in the American Journal of Physical Anthropology special issue, as well as the Genome Research paper. My real interests here are specific questions of science, not history or social science. But I will address the latter areas rather quickly. I am not someone who comes to this totally naked of the history or social science of the race question. I’ve read many books on the topic. And as a colored person who has moderate experience with racism I get rather bored and irritated with excessively patronizing explanations of how racism afflicts us coloreds from white academics (non-white academics who focus on this subject are usually careerists or activists who don’t have to make much pretense toward scholarly substance and can be duly ignored, at least in my experience). The main point which I think we can all agree upon is that colloquial understanding of race has only a partial correlation with any genetic understanding of race. I myself have ranted against the confusions which have ensued because of the conflation of the two classes, and it is certainly a legitimate area of study, but it is not my primary concern. And importantly, I have no great primary interest in battling racism.
Greg Cochran pointed out something that I’d been considering about the MacArthur et al. paper: if the average human (OK, non-African human) has ~100 loss-of-function variants, then the standard deviation should be ~10. That’s because the distribution is presumably Poisson, and variance = mean, and the square root of the variance (~100) is the standard deviation (~10). In plainer English there should be a substantial variation in the number of loss-of-function variants within a population, and across siblings. Though by definition these loss-of-function variants don’t kill you, in general there is the assumption that this class of mutants does exhibit some fitness drag (e.g., the fitness of a heterozygote for a variant which is lethal as a homozygote genotype may be ~0.90). A quick back of the envelope calculation implies to me that there is a 1 out of several hundreds of thousands probability that two siblings may exhibit a range of 60 loss-of-function variants. But a 40 unit gap is more like a 1 out of one thousand chance.
This variance in mutational load has been the hobby-horse of intellectuals for a while now. Armand Leroi suggested that it correlated with beauty. Geoffrey Miller with intelligence. In the near future presumably we’ll get to see if there’s anything real in this. And obviously we don’t need to leave it to scientists. We’ll all know the summary statistics about our own genomes, and probably be able to intuit rough patterns…if they exist.
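The Poisson step is easy to check, and one way of turning it into a gap probability can be sketched as below. The assumptions are loud ones: the two counts are treated as independent Poisson draws with the same mean, which real siblings are not (they share roughly half their variants), and the Poisson is replaced by a normal approximation. The function names are hypothetical, and under these assumptions the output need not match the back of the envelope figures quoted above.

```python
import math

MEAN_LOF = 100  # ~100 loss-of-function variants per person, as in the text

def poisson_sd(mean):
    """For a Poisson count the variance equals the mean, so sd = sqrt(mean)."""
    return math.sqrt(mean)

def gap_probability(gap, mean):
    """Two-sided chance that two independent counts differ by at least `gap`.

    Assumption: the difference of two independent Poisson(mean) counts is
    treated as normal with mean 0 and variance 2 * mean.
    """
    sd_of_difference = math.sqrt(2 * mean)
    z = gap / sd_of_difference
    return math.erfc(z / math.sqrt(2))  # equals 2 * (1 - Phi(z))

print("standard deviation:", poisson_sd(MEAN_LOF))
for gap in (40, 60):
    print("gap of", gap, "->", gap_probability(gap, MEAN_LOF))
```

The point of the sketch is the shape of the calculation: the probability falls off very quickly as the gap grows, because each extra 10 variants is a sizable step in standard deviations.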
At the end of last year we announced that we’ve got some funding from the German WikiMedia foundation to get more people – who are willing to share their results – genotyped. We have now settled on a process that should allow us to perform the project without too many problems. Starting today, you can apply for one of the free genotypings. The deadline for applications is Sunday, 03/25/12 23:59 o’clock, so you still have some time to think about an application. In the two weeks following the deadline, we will select as many participants as we can afford to get genotyped using the 5000 Euros we received from Wikimedia. We’ll get in contact with everybody who has sent an application to let all applicants know whether their application was successful or not.
The genotyping will be done through 23andMe. We will order you a gift kit which will be delivered to your address. These gift kits include prepaid access to the 23andMe website for 12 months, so you can check up on the latest findings about your genetic variation as well. After this 12 month period, those features will expire automatically, you don’t have to cancel any subscriptions.
Our application form contains some standard questions (Where do you live? Does 23andMe deliver to your country? etc.) but also some details about your motivation, why you want to make your dataset available to the public and why your data might be of great interest (For example: Do you have a rare disease where research is lacking?). Additionally, we will also try to get people genotyped who are currently under-represented in publicly available data sets. Most data up to now is from WEIRDs: Western, Educated, Industrialized, Rich and Democratic people (most are probably male, too).
Bastian already contacted me about getting Afrikaners typed this way. I haven’t had time to get back to him, but this might be a viable option if you live in a country where 23andMe ships.
If we look at the bigger picture we see that most of continental Europe is tied to each other more through mutations than others, making them harder to separate even at this level (6 chromosomes). We see that Lithuanians seem to have stronger affiliation to the large continental European cluster including Scandinavians but this affiliation is weaker for Vologda Russians. This connection is even weaker for Finns and almost non-existing for Saamis. This is in accordance with the MDS plot.
Here is the relevant plot (I have added some labels):
Over at Genomes Unzipped Dr. Daniel MacArthur has a review up of a paper in Science where he is first author (note for grad students and aspiring post-docs, Dr. MacArthur is starting a new lab, where he posted an ungated version of the paper). He hits all the salient points, so I will cover two issues, a general and a specific.
Prompted by my posts, Dienekes, A teaser on the Kalash:
I am in the middle of a ChromoPainter/fineSTRUCTURE analysis of a broad dataset designed to explore certain mysteries that have often come up in my previous experiments. Barring the unexpected, the analysis should be completed sometime next week.
Below you can see the normalized number of “chunks” donated by various populations to the Kalash….
Here is the bar plot which Dienekes generated (left to right indicates extent of “donation” to the Kalash):
I highlighted the most significant non-South Asian donor. Dienekes states:
A recent paper on Turkish genetics has a tree which illustrates a summary of how the Kalash shake out:
I say summary because this tree takes a lot of information and tries to generate the best fit representation. It does hide some information by the nature of its aggregation of patterns. For example, the position of the Burusho, or Turks, has to do with the fact that both of these have low, but noticeable, levels of East Asian admixture on top of a different base. If you removed this eastern element both groups would come much closer to similar groups. The extreme long branches leading to the Kalash and Mozabites are almost certainly a function of endogamy and inbreeding. Their allele frequencies diverged from nearby populations because of isolation.
But notice the nearby populations of the Kalash. They’re northwest South Asian. In many ways if you removed the drift and endogamy from the Kalash I suspect you’d be left with a group very similar to their Pathan neighbors.
Finally, as many of you know I put a substantial number of comments into ‘spam’ on this weblog. Here’s one related to the Kalash which you didn’t see:
The New York Times and Nature both have favorable reviews of Oxford Nanopore’s showy claims at Advances in Genome Biology and Technology. If you don’t know what I’m talking about, please see the twitter stream, or the post at Genomes Unzipped.
A “test” post showed up on this website earlier. I’ve been told it was probably an error by IT. I had no idea that it was even up because I was off the internet and not checking my phone for ~18 hours for various reasons. Just thought I’d pass that on….