With all the justified concern about “missing heritability,” the age of human genomics hasn’t been a total bust. As I have observed before, in 2005’s excellent book Mutants the evolutionary geneticist Armand M. Leroi asserted that we really didn’t have a good understanding of normal variation in human pigmentation. At the time I think it was a defensible claim, but within three years I’d say that most of the mystery had been cleared up. Though there are still some holes to be plugged, and details to be elucidated, the genetic architecture of pigmentation is now more or less understood. By the fall of 2006 Richard Sturm penned a review titled A golden age of human pigmentation genetics, an age which I think in some ways was probably closed by his 2009 review Molecular genetics of human pigmentation diversity. It’s not surprising that many of the traits that 23andMe tells you about have to do with your pigmentation. Of course there’s only limited utility in this; one assumes that most individuals don’t gain much benefit from the knowledge that they have an “85% chance of having brown eyes,” though it may be useful in terms of offspring prediction (I would say I have an 85% chance of having brown eyes, but since I’m not European the genetic background isn’t right to make that probability assertion).
But as the golden age of pigmentation genetics comes to a close and the low hanging fruit is stripped bare, where next? I wonder if it may be altitude adaptations. Like pigmentation, altitude genetics has been around for a while, but there seems to be a recent cresting of papers in the area, focusing in particular on the three canonical high altitude peoples: the Tibetans, the Andeans, and the Ethiopians. Last spring two major groups came out with papers on the genetics of Tibetan altitude adaptation, and its evolutionary history, using somewhat different techniques. A new paper in PLoS Genetics builds upon that work (verifying two of the loci implicated in the previous papers as targets of selection in Tibetans), and adds Andean populations to the mix to assess the possibility of convergent adaptations. Identifying Signatures of Natural Selection in Tibetan and Andean Populations Using Dense Genome Scan Data:
High-altitude hypoxia is caused by decreased barometric pressure at high altitude, and results in severe physiological stress to the human body. Three human populations have resided at high altitude for millennia including Andeans on the Andean Altiplano, Tibetans on the Himalayan plateau, and Ethiopian highlanders on the Semian Plateau. Each of these populations exhibits a unique suite of physiological changes to the decreased oxygen available at altitude. However, we are just beginning to understand the genetic changes responsible for the observed physiology. The aim of the current study was to identify gene regions that may be involved in adaptation to high altitude in both Andeans and Tibetans. Genomic regions showing evidence of recent positive selection were identified in these two high-altitude human groups separately. We found compelling evidence of positive selection in HIF pathway genes, in the globin cluster located on chromosome 11, and in several chromosomal regions for Andeans and Tibetans. Our results suggest that key HIF regulatory and targeted genes are responsible for adaptation to altitude and implicate several distinct chromosomal regions. The candidate genes and gene regions identified in Andeans and Tibetans are largely distinct from one another. However, one HIF pathway gene, EGLN1, shows evidence of directional selection in both high-altitude populations.
In this paper the authors looked at around 50 Andeans (Quechua and Aymara speakers) and 50 Tibetans, and compared them to various outgroups. In addition to the European and Asian HapMap populations they also looked at some Amerindian populations. The map below shows the geographical scope of their sampling (the right inset shows the Amerindian lowland groups):
The ancestral relationships of the two highland groups sampled in relation to the lowlanders were relatively straightforward. Panels A and B show PCA plots for the Andeans and Tibetans, while C and D show frappe bar plots. The only things notable for me are that the Quechua speakers seem to show residual European ancestry which the Aymara do not, and that the Colombian indigenous groups seem to have more affinity with Mesoamerican populations than with the other South American samples. I can give no insight as to the latter. As for the former, if it is not just a quirk of non-representativeness, one may be seeing the effect of a higher number of Spanish men marrying into the nobility of the Quechua speaking highlands than further south in the lands of the Aymara (though Potosi was in Bolivia, so this may not be plausible).
We already have some evolutionary expectations of how these groups came to have these adaptations to their high altitude environments. It seems that the physiological processes for the three groups are somewhat different, and this has been a source of curiosity for geneticists for a long time. It stands to reason that if the physiology is somewhat varied, the genetics should be too, and that seems to be a broadly correct assumption. In this paper they took two general approaches, looking at the total genome, and focusing on specific candidate regions. From what I can tell they did not find much that was novel using the first technique, but by looking at specific genes they did clarify the relationship between Tibetans and Andeans in terms of their genetic adaptations a bit. As noted in the author summary, it looks as if the two populations do have somewhat different genetic architectures. Many of the genes which seem to have been targets of selection do not overlap, and for those that do there seem to have been different localized selection events, so that the haplotypes being driven by positive selection differ.
They used a combination of techniques to detect possible regions of natural selection:
- locus specific branch length (LSBL)
- the log of ratio of heterozygosities (lnRH)
- a modified Tajima’s D statistic
- whole genome long range haplotype (WGLRH)
LSBL is an elaboration on Fst, so it picks up between-population differences in allele frequency. Recall that at any given locus you don’t expect much between-population difference, so if there has been a great deal of local ecological adaptation you may see a lot of variance as a function of geography. Heterozygosity is simply a measure of the fraction of loci where the two gene copies are in different states. It’s just one way to measure genetic variation (there are others), and a local drop in heterozygosity in one population relative to another is what the lnRH statistic picks up. The Tajima’s D statistic is a test for whether a locus deviates from neutral expectations. A deviation may mean that there has been a bottleneck, a selective sweep, or balancing selection. Finally, the last test looks for sets of correlated markers within the genome. If a haplotype, a sequence of markers, is at high frequency, then you may be witnessing a genomic region which is in the middle of, or just past, a selective sweep.
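To make the first two statistics a little more concrete, here is a minimal Python sketch. It is not the authors' code. I'm assuming a simplified Wright's Fst computed straight from allele frequencies at a biallelic SNP (the paper most likely used a sample-size-corrected estimator), the usual three-population definition of LSBL as half of the two distances involving the focal group minus the distance between the other two, and lnRH as the plain log ratio of mean expected heterozygosity across a window of SNPs, without any genome-wide standardization. The function names and the allele frequencies are made up for illustration.

```python
import math


def expected_heterozygosity(p):
    """Expected heterozygosity 2p(1 - p) at a biallelic locus with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def pairwise_fst(p1, p2):
    """Simplified Wright's Fst, (H_T - H_S) / H_T, for two equally weighted populations."""
    p_bar = (p1 + p2) / 2.0
    h_total = expected_heterozygosity(p_bar)
    if h_total == 0.0:
        return 0.0  # both populations fixed for the same allele: no differentiation
    h_within = (expected_heterozygosity(p1) + expected_heterozygosity(p2)) / 2.0
    return (h_total - h_within) / h_total


def lsbl(p_focal, p_b, p_c):
    """Locus specific branch length for the focal population in a three-population tree."""
    d_fb = pairwise_fst(p_focal, p_b)
    d_fc = pairwise_fst(p_focal, p_c)
    d_bc = pairwise_fst(p_b, p_c)
    return (d_fb + d_fc - d_bc) / 2.0


def ln_rh(freqs_pop1, freqs_pop2):
    """Log ratio of mean expected heterozygosity across the SNPs in one window."""
    h1 = sum(expected_heterozygosity(p) for p in freqs_pop1) / len(freqs_pop1)
    h2 = sum(expected_heterozygosity(p) for p in freqs_pop2) / len(freqs_pop2)
    if h1 <= 0.0 or h2 <= 0.0:
        raise ValueError("lnRH is undefined when a window has no variation")
    return math.log(h1 / h2)


# Made-up frequencies: a highland group that differs sharply from two lowland groups.
highland, lowland_1, lowland_2 = 0.9, 0.2, 0.3
print(lsbl(highland, lowland_1, lowland_2))  # long branch for the highlanders
print(lsbl(lowland_1, highland, lowland_2))  # short branch for a lowland group
print(ln_rh([0.05, 0.1, 0.02], [0.4, 0.5, 0.3]))  # negative: less diversity in the first group
```

The point of the three-population construction is that a large Fst alone doesn't tell you which population changed; LSBL assigns the change to a branch, so a long highland branch alongside short lowland branches is the pattern you'd want to see at a candidate locus.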
Why four different tests? Because one given test is not dispositive of natural selection. As noted with Tajima’s D, there are demographic processes of a stochastic nature which can produce false positives, so it is best not to live or die by one technique alone.
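Since Tajima’s D keeps coming up, here is a rough sketch of the classic statistic from Tajima (1989), which compares the average number of pairwise differences to the number of segregating sites. Note the assumptions: the paper used a modified version whose details I'm not reproducing, the input format (a list of equal-length haplotype strings) and the function name are my own, and the toy haplotypes are invented.

```python
import math
from itertools import combinations


def tajimas_d(haplotypes):
    """Classic Tajima's D for a list of equal-length haplotype strings."""
    n = len(haplotypes)
    if n < 4:
        # The variance term below is zero for n < 4, so D cannot be computed.
        raise ValueError("need at least four haplotypes")

    # S: number of segregating sites (columns with more than one allele).
    seg_sites = sum(1 for column in zip(*haplotypes) if len(set(column)) > 1)
    if seg_sites == 0:
        return math.nan  # D is undefined without any variation

    # pi: average number of differences over all pairs of haplotypes.
    pairs = list(combinations(haplotypes, 2))
    pi = sum(sum(a != b for a, b in zip(h1, h2)) for h1, h2 in pairs) / len(pairs)

    # Constants from Tajima (1989); they depend only on the sample size n.
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / math.sqrt(variance)


# Invented samples: an excess of rare variants versus intermediate-frequency variants.
rare_excess = ["0001", "0000", "0000", "0000"]
intermediate = ["11", "11", "00", "00"]
print(tajimas_d(rare_excess))    # negative, as after a sweep or an expansion
print(tajimas_d(intermediate))   # positive, as under balancing selection or a bottleneck
```

Both signs can be produced by demography as easily as by selection, which is exactly why a single statistic like this shouldn't be trusted on its own.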
Here is figure 4, which shows the differences in allele frequencies on the EGLN1 gene:
We’ve seen EGLN1 before. In the figure above the left panels show the Andean derived SNPs, and the right panels the Tibetan ones. Note the differences in frequency in A and B. The red denotes statistically significant values for a statistic in panels C & D. Both Andeans and Tibetans show indications of selection, but the details in the patterns vary when you zoom in on the gene. The very last panel has an arrow which points to the SNPs in each population where the between population variance is maximized. Interestingly the ancestral allele seems to have risen in frequency here in the high altitude populations, as black denotes ancestral and red derived in the first and last panels.
Let me jump to their conclusion:
In summary, we performed a genome scan on high- and low-altitude human populations to identify selection-nominated candidate genes and gene regions in two long-resident high-altitude populations, Andeans and Tibetans. Several chromosomal regions show evidence of positive directional selection. These regions are unique to either Andeans or Tibetans, suggesting a lack of evolutionary convergence between these two highland populations. However, evidence of convergent evolution between Andeans and Tibetans is suggested based on the signal detected for the HIF regulatory gene EGLN1. In addition to EGLN1, a second HIF regulatory gene, EPAS1, as well as two HIF targeted genes, PRKAA1 and NOS2A, have been identified as selection-nominated candidate genes in Tibetans (EPAS1) or Andeans (PRKAA1, NOS2A). PRKAA1 and NOS2A play major roles in physiological processes essential to human reproductive success…Thus, in addition to demonstrating the likely targets of natural selection and the operation of evolutionary processes, genome studies also have the clear potential for elucidating key pathways responsible for major causes of human morbidity and mortality. Based on the findings of this study, it will be important to confirm the results with genotype-phenotype association studies that link genotype to a specific high-altitude phenotype.
I wanted to show the alphabet soup of genes in case you’re a geneticist with an interest in any of these loci. I’ve seen these in previous papers, so I assume the key that got this published in PLoS Genetics is the deeper comparative dimension, as the researchers explored the lack or existence of evolutionary convergence between these two populations. Should the finding be surprising? I don’t think so. High altitudes are extreme environments, and the literature is filled with references to problems which emerge even in these populations because of the nature of their adaptations. There are likely deleterious side effects, especially if one of last spring’s papers on Tibetans is correct and they’re relatively recent settlers of the highlands. But you never know until you play the game, so it is good to confirm.
A further exploration of the genetic architecture and nature of these adaptations, especially when the research is extended to Ethiopians, may give us a further window into contingency in evolutionary history. These three occurrences are basically three independent experiments. In this paper they indicate that some of the variants subject to natural selection may already have been present in the ancestral population, that is, standing variation. Others are new mutations, unique and novel. Though there are different pathways to the final expression of the phenotype, which in the details of implementation (physiology) still differ across the groups, there are also genes which in this comparison seem to be implicated in both Tibetans and Andeans as having been subject to selection. How constrained is the sample space subject to possible selection and the implied G-matrix? How contingent are the evolutionary pathways that different populations take to attain the state of adaptive fitness in similar ecologies? These are the sorts of long term questions which I think may well be answered as the tentative silver age of altitude adaptation gives way to the golden age.
Citation: Bigham A, Bauchet M, Pinto D, Mao X, & Akey JM (2010). Identifying Signatures of Natural Selection in Tibetan and Andean Populations Using Dense Genome Scan Data. PLoS Genetics
Image Credit: Micah MacAllen
Note: I am aware that classically the silver age follows the golden age, instead of precedes it. But we live in Whiggish times indeed!