In the middle years of the last decade many papers came out reporting ‘hard’ selective sweeps reshaping the human genome. By this, I mean that a novel mutation arose against the genetic background, and positive selection rapidly increased the frequency of that mutation. Because of the power and rapidity of the sweep, many of the flanking regions of the genome would “hitchhike” along, generating long homogenized regions of linkage disequilibrium. If that’s a little dense for you, just understand that very strong selective events tend to result in homogeneity and distinctiveness in the local genomic region.
But the late aughts and the early years of the teens are shaping up to give us a more subtle picture. Instead of classic hard sweeps, researchers are suggesting that there may also be many ‘soft’ sweeps, where selection draws upon the well of standing genetic variation. Instead of a novel trait becoming prominent, one tail of the distribution would rise in frequency. The ‘problem’ with this model is that it’s not as tractable as the earlier one of hard sweeps, and selection on quantitative traits with many loci of small effect is more difficult to detect. Its effect on the genome is more subtle and understated, which means that statistical tests often lack the power to pick up the underlying dynamics. Naturally this means that statistical techniques are being extended to ever greater degrees of sophistication. A new paper in PLoS Genetics attempting to tease apart the various potential selective pressures in the human genome is reflective of that tendency. Signatures of Environmental Genetic Adaptation Pinpoint Pathogens as the Main Selective Pressure through Human Evolution:
Previous genome-wide scans of positive natural selection in humans have identified a number of non-neutrally evolving genes that play important roles in skin pigmentation, metabolism, or immune function. Recent studies have also shown that a genome-wide pattern of local adaptation can be detected by identifying correlations between patterns of allele frequencies and environmental variables. Despite these observations, the degree to which natural selection is primarily driven by adaptation to local environments, and the role of pathogens or other ecological factors as selective agents, is still under debate. To address this issue, we correlated the spatial allele frequency distribution of a large sample of SNPs from 55 distinct human populations to a set of environmental factors that describe local geographical features such as climate, diet regimes, and pathogen loads. In concordance with previous studies, we detected a significant enrichment of genic SNPs, and particularly non-synonymous SNPs associated with local adaptation. Furthermore, we show that the diversity of the local pathogenic environment is the predominant driver of local adaptation, and that climate, at least as measured here, only plays a relatively minor role. While background demography by far makes the strongest contribution in explaining the genetic variance among populations, we detected about 100 genes which show an unexpectedly strong correlation between allele frequencies and pathogenic environment, after correcting for demography. Conversely, for diet regimes and climatic conditions, no genes show a similar correlation between the environmental factor and allele frequencies. This result is validated using low-coverage sequencing data for multiple populations. 
Among the loci targeted by pathogen-driven selection, we found an enrichment of genes associated to autoimmune diseases, such as celiac disease, type 1 diabetes, and multiple sclerosis, which lends credence to the hypothesis that some susceptibility alleles for autoimmune diseases may be maintained in human population due to past selective processes.
The authors utilized “Projection to Latent Structure multiple regression with an Uninformative Variable Elimination algorithm (UVE-PLS).” I know what multiple regression is, and the general logic which underpins the family of such methods. But I don’t really know what UVE-PLS is in its specifics, so I can’t speak with any intelligence on this technical issue. I assume that, as with most multiple regression, the authors are attempting to tease apart the predictive power of various independent variables upon a dependent variable. In this case, the dependent variable happens to be the pattern of genetic variation, the single nucleotide polymorphisms (SNPs). It isn’t surprising that the biggest predictor of variation happens to be demographic relationship. That is, adjacent populations with recent common ancestors are going to share more genetic variants than those which are distant. The key is to control for this confound, and then see how genes vary according to other factors.
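To make the logic of controlling for a confound concrete, here is a minimal sketch in Python. It is not UVE-PLS and not the authors' method; it uses plain ordinary least squares on simulated data as a stand-in. Every variable name, effect size, and noise level below is a made-up assumption for illustration. Only the figure of 55 populations comes from the abstract.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares coefficients for y ~ intercept + X (intercept is coef[0])."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

rng = np.random.default_rng(0)
n_pops = 55  # number of populations mentioned in the abstract

# Hypothetical predictors: one demographic axis, and a pathogen-diversity
# score that partly tracks demography (this overlap is the confound).
demography = rng.normal(size=n_pops)
pathogens = 0.6 * demography + rng.normal(size=n_pops)

# Simulated allele frequency on an arbitrary scale: a large demographic
# effect (0.10) and a small pathogen effect (0.03), both invented here.
freq = (0.5 + 0.10 * demography + 0.03 * pathogens
        + rng.normal(scale=0.01, size=n_pops))

# Naive model: pathogens only, so demography leaks into the pathogen slope.
naive_slope = fit_ols(pathogens, freq)[1]

# Adjusted model: demography enters as a covariate alongside pathogens.
adjusted_slope = fit_ols(np.column_stack([demography, pathogens]), freq)[2]

print(f"naive pathogen slope:    {naive_slope:.3f}")
print(f"adjusted pathogen slope: {adjusted_slope:.3f}")
```

In this toy setup the naive slope is inflated because pathogen diversity is correlated with demography, while the adjusted slope lands near the simulated pathogen effect. The real analysis works across many SNPs and many correlated environmental variables, which is presumably why the authors reached for a latent-structure method rather than something this simple.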
In this analysis they found that diet and climate seem to be less important than genes relating to immune response to pathogens, in particular those implicated in response to parasitic worms. Why worms? The argument they give is that these organisms are not quite so protean as bacteria and viruses, and also tend to be somewhat localized. Their relative sluggishness in adaptation means that humans presumably have some fighting chance in developing defenses, and their spatial stability also implies that human adaptations can differentiate nicely as a function of geography, as may be the case for genes which are targets of local selection. I’m not quite sure about this idea that we’ve been able to adapt to parasitic worms better, though. Rather, I just wonder if human adaptations to viruses and bacteria are simply not easily detectable by these methods. Or, as implied in the piece, it may be that these are less locally conditioned, so you see a whole host of generalized adaptations which aren’t geographically constrained.
This is obviously not going to be the last word by any means. They focused on the data sets that were available and computationally manageable in 2011. Over the next 10 years researchers will be combing whole genomes of many individuals in many populations. They’ll come back with gold. It seems a foregone conclusion that loci implicated in response to pathogens are going to be rich candidates for bouts of natural selection. What is perhaps going to be more interesting is the question of what other traits are shaped by natural selection. The unequivocal list is rather short right now. Lactose tolerance, pigmentation, malaria resistance, etc. It’s bound to get longer. The question is how much longer….
Citation: Fumagalli M, Sironi M, Pozzoli U, Ferrer-Admetlla A, Pattini L, et al. (2011) Signatures of Environmental Genetic Adaptation Pinpoint Pathogens as the Main Selective Pressure through Human Evolution. PLoS Genet 7(11): e1002355. doi:10.1371/journal.pgen.1002355