How the worm turns the genic world

By Razib Khan | November 26, 2011 12:25 am

In the middle years of the last decade many papers reported ‘hard’ selective sweeps reshaping the human genome. By this, I mean that a novel mutation arises against the genetic background, and positive selection rapidly increases the frequency of that mutation. Because of the power and rapidity of the sweep many of the flanking regions of the genome “hitchhike” along, generating long homogenized regions of linkage disequilibrium. If that’s a little dense for you, just understand that very strong selective events tend to leave the local genomic region unusually uniform and distinctive.

But the late aughts and the early years of the teens are shaping up to give us a more subtle picture. Instead of classic hard sweeps, researchers are suggesting that there may also be many ‘soft’ sweeps, where selection draws upon the well of standing genic variation. Instead of a novel trait becoming prominent, one tail of the distribution would rise in frequency. The ‘problem’ with this model is that it’s not as tractable as the earlier one of hard sweeps, and selection on quantitative traits with many loci of small effect is more difficult to detect. Its effect on the genome is more subtle and understated, which means that statistical tests often lack the power to get a grip on the underlying dynamics. Naturally this means that statistical techniques are being extended to ever greater degrees of sophistication. A new paper in PLoS Genetics attempting to tease apart the various potential selective pressures in the human genome is reflective of that tendency. Signatures of Environmental Genetic Adaptation Pinpoint Pathogens as the Main Selective Pressure through Human Evolution:

Previous genome-wide scans of positive natural selection in humans have identified a number of non-neutrally evolving genes that play important roles in skin pigmentation, metabolism, or immune function. Recent studies have also shown that a genome-wide pattern of local adaptation can be detected by identifying correlations between patterns of allele frequencies and environmental variables. Despite these observations, the degree to which natural selection is primarily driven by adaptation to local environments, and the role of pathogens or other ecological factors as selective agents, is still under debate. To address this issue, we correlated the spatial allele frequency distribution of a large sample of SNPs from 55 distinct human populations to a set of environmental factors that describe local geographical features such as climate, diet regimes, and pathogen loads. In concordance with previous studies, we detected a significant enrichment of genic SNPs, and particularly non-synonymous SNPs associated with local adaptation. Furthermore, we show that the diversity of the local pathogenic environment is the predominant driver of local adaptation, and that climate, at least as measured here, only plays a relatively minor role. While background demography by far makes the strongest contribution in explaining the genetic variance among populations, we detected about 100 genes which show an unexpectedly strong correlation between allele frequencies and pathogenic environment, after correcting for demography. Conversely, for diet regimes and climatic conditions, no genes show a similar correlation between the environmental factor and allele frequencies. This result is validated using low-coverage sequencing data for multiple populations. 
Among the loci targeted by pathogen-driven selection, we found an enrichment of genes associated to autoimmune diseases, such as celiac disease, type 1 diabetes, and multiple sclerosis, which lends credence to the hypothesis that some susceptibility alleles for autoimmune diseases may be maintained in human population due to past selective processes.

The authors utilized “Projection to Latent Structure multiple regression with an Uninformative Variable Elimination algorithm (UVE-PLS).” I know what multiple regression is, and the general logic which underpins the family of such methods. But I don’t really know what UVE-PLS is in its specifics, so I can’t speak with any intelligence on this technical issue. I assume that, as in most multiple regression, the authors are attempting to tease apart the predictive power of various independent variables upon a dependent variable. In this case, the dependent variable happens to be the pattern of genetic variation, the single nucleotide polymorphisms (SNPs). It isn’t surprising that the biggest predictor of variation happens to be demographic relationship. That is, adjacent populations with recent common ancestors are going to share more genetic variants than those which are distant. The key is to control for this confound, and then see how genes vary according to other factors.
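To make the “control for the confound” logic concrete, here is a minimal toy sketch in Python. It is not UVE-PLS and not the authors’ pipeline; it swaps in a plain partial correlation, and every number in it (effect sizes, noise levels, the count of simulated populations) is an assumption chosen purely for illustration. A neutral SNP that only tracks demography looks correlated with pathogen diversity until demography is regressed out, while a SNP given a built-in pathogen effect keeps its signal.

```python
import random
from statistics import mean


def pearson(x, y):
    """Plain Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def residuals(y, x):
    """What is left of y after a simple least-squares fit of y on x."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return [b - (my + slope * (a - mx)) for a, b in zip(x, y)]


def partial_corr(y, x, confound):
    """Correlation of y and x once the confound is regressed out of both."""
    return pearson(residuals(y, confound), residuals(x, confound))


rng = random.Random(0)
n_pops = 2000  # far more than any real panel; large only so the toy is stable

# Hypothetical standardized variables, one value per simulated population.
demography = [rng.gauss(0, 1) for _ in range(n_pops)]          # shared ancestry axis
pathogens = [0.7 * d + rng.gauss(0, 1) for d in demography]    # tracks demography too
neutral_snp = [0.8 * d + rng.gauss(0, 0.5) for d in demography]
selected_snp = [0.8 * d + 0.6 * p + rng.gauss(0, 0.5)
                for d, p in zip(demography, pathogens)]

for name, snp in [("neutral", neutral_snp), ("selected", selected_snp)]:
    naive = pearson(snp, pathogens)
    adjusted = partial_corr(snp, pathogens, demography)
    print(f"{name:8s} naive r = {naive:.2f}   demography-adjusted r = {adjusted:.2f}")
```

The real analysis deals with many correlated environmental variables at once, which is presumably why the authors reached for something more elaborate than this, but the underlying move of removing the demographic signal before asking about the environment is the same.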

In this analysis they found that diet and climate seem to be less important than genes relating to immune response to pathogens, in particular those implicated in response to parasitic worms. Why worms? The argument they give is that these organisms are not quite so protean as bacteria and viruses, and also tend to be somewhat localized. Their relative sluggishness in adaptation means that humans presumably have some fighting chance in developing defenses, and their spatial stability also implies that human adaptations can differentiate nicely as a function of geography, as may be the case in genes which are targets of local selection. I’m not quite sure about this idea that we’ve been able to adapt to parasitic worms better though. Rather, I just wonder if human adaptations to viruses and bacteria are simply not easily detectable by these methods. Or, as implied in the piece, it may be that these are less locally conditioned, so you see a whole host of generalized adaptations which aren’t geographically constrained.

This is obviously not going to be the last word by any means. They focused on the data sets that were available and computationally manageable in 2011. Over the next 10 years researchers will be combing whole genomes of many individuals in many populations. They’ll come back with gold. It seems a foregone conclusion that loci implicated in response to pathogens are going to be rich candidates for bouts of natural selection. What is perhaps going to be more interesting is the question of what other traits are shaped by natural selection. The unequivocal list is rather short right now: lactose tolerance, pigmentation, malaria resistance, etc. It’s bound to get longer. The question is how much longer….

Citation: Fumagalli M, Sironi M, Pozzoli U, Ferrer-Admetlla A, Pattini L, et al. (2011) Signatures of Environmental Genetic Adaptation Pinpoint Pathogens as the Main Selective Pressure through Human Evolution. PLoS Genet 7(11): e1002355. doi:10.1371/journal.pgen.1002355

  • tom

    Hi Razib!

    Excellent post, though a bit heavy on the technical side (probably inevitable, due to the amount of data), so i’m not sure I fully understood the rigor in their statistics.

    Is it possible that the genes they infer were selected for in anti-parasite (worm) adaptations were instead related to a larger aspect of immune function, such as increasing the speed of immune modulation or the robustness of the system? I.e., a high density/diversity of parasite loads selects for a more flexible and efficient immune system that would also be more effective at knocking out other pathogens.

    The only reason I ask aside from curiosity is that I suspect a set of adaptations that “streamlined” the immune system would free up valuable calories to divert to other architecture, such as brain development. I’m not sure where their proposed set of adaptations falls chronologically with respect to brain volume expansion in the hominid line (or even if brain volume is an appropriately sophisticated measure of caloric budgeting to various organs), but the question intrigues me.

    however, i realize this speculation is all a hair trigger away from a “just-so” story, the likes of which evo-psych proponents often offer, so i’ll defer to your experience.

    Thank you for your time, and keep up the great work!

  • ohwilleke

    Table S3, which has the environmental factors to which the genes were matched, suggests that one of the reasons that worms, rather than other infectious agents, came out so strong in a regression-based analysis is that the data set shows wider variation in the number of worm species than in the number of other pathogen species, and not all of the variation in other pathogen species is independent of the worm species variation.

    Since we are nowhere near having complete taxonomies of bacterial, viral or even protozoan diversity, particularly in the underresearched third world (the study primarily includes less developed regions), and are relatively more complete in our parasitic worm taxonomy, the worm count may simply be a more accurate measure of the relative pathogen load than the other data used in the study. The case it makes for pathogen load being more important than climate or subsistence method is stronger than the case it makes regarding which pathogens are really most relevant.

    Also, even the distinction between pathogen load and climate is to some extent a function of using linear regression methodologies. Pathogen load is closely related to climate, but not in a linear way. Pathogens need a sweet spot of temperatures, humidity, seasonal variations, etc. to thrive, and those variables interact with each other in non-linear ways, which is something that linear regression models don’t capture very well. It is probably better to think about pathogens as being a means through which climate variables have an impact.

    Another notable thing about the study is that by limiting itself to the populations that it does, it is making a leading order approximation. Pathogens are the first, most important variable, one that swamps everything else wherever it matters. But, presumably, with a different set of countries and a larger set of environmental variables, one could tease out whatever the next leading order (NLO) variable is once the issue of pathogens is under control (e.g. because one lives someplace that is very cold or arid). Since most people who understand genetics also live in pathogen controlled environments, odds are that the NLO factors might better fit our intuitions.


Gene Expression

This blog is about evolution, genetics, genomics and their interstices. Please beware that comments are aggressively moderated. Uncivil or churlish comments will likely get you banned immediately, so make any contribution count!
