The Pith: You are expected to have 30 new mutations which differentiate you from your parents. But there is wiggle room around this number, and you may have more or fewer. This number may vary across siblings, and may explain differences across siblings. Additionally, previously used estimates of mutation rates may have been too high by a factor of 2. This may push the “last common ancestor” of many human and human-related lineages back by a factor of 2 in terms of time.
There’s a new letter in Nature Genetics on de novo mutations in humans which is sending the headline writers in the press into a natural frenzy trying to “hook” the results into the X-Men franchise. I implicitly assume most people understand that they all have new genetic mutations specific and identifiable to them. The important issue in relation to “mutants” as commonly understood is that they have salient identifiable phenotypes, not that they have subtle genetic variants which are invisible to us. Another implicit aspect is that phenotypes are an accurate signal or representation of high underlying mutational load. In other words, if you can see that someone is weird in their traits, presumably they are rather strange in their underlying genetics. This is the logic behind models which assume that mutational load has correlates with intelligence or beauty, and these naturally tie back into evolutionary rationales for human aesthetic preferences (e.g., “good genes” models of sexual selection).
J.B.S. Haldane proposed in 1947 that the male germline may be more mutagenic than the female germline…Diverse studies have supported Haldane’s contention of a higher average mutation rate in the male germline in a variety of mammals, including humans…Here we present, to our knowledge, the first direct comparative analysis of male and female germline mutation rates from the complete genome sequences of two parent-offspring trios. Through extensive validation, we identified 49 and 35 germline de novo mutations (DNMs) in two trio offspring, as well as 1,586 non-germline DNMs arising either somatically or in the cell lines from which the DNA was derived. Most strikingly, in one family, we observed that 92% of germline DNMs were from the paternal germline, whereas, in contrast, in the other family, 64% of DNMs were from the maternal germline. These observations suggest considerable variation in mutation rates within and between families.
In the very near future you may be forced to go through a “professional” to get access to your genetic information. Professionals who will be well paid to “interpret” a complex morass of statistical data which they barely comprehend. Let’s be real here: someone who regularly reads this blog (or Dr. Daniel MacArthur or Misha’s blog) knows much more about genomics than 99% of medical doctors. And yet someone reading this blog does not have the guild certification in the eyes of the government to “appropriately” understand their own genetic information. Someone reading this blog will have to pay, either out of pocket, or through insurance, someone else for access to their own information. Let me repeat: the government and professional guilds which exist to defend the financial interests of their members are proposing that they arbitrate what you can know about your genome. A friend with a background in genomics emailed me today: “If they succeed in ramming this through, then you will not be able to access your own damn genome without a doctor standing over your shoulder.” That is my fear. Is it your fear? Do you care?
In the medium term this is all irrelevant. Sequencing will be so cheap that it will be impossible for the government and well-connected self-interested parties to prevent you from gaining access to your own genetic information. Until then, they will slow progress and the potential utility of this business. Additionally, this sector will flee the United States and go offshore, where regulatory regimes are not so strict. BGI should give glowing letters of thanks to Jeffrey Shuren and the A.M.A.! This is a power play where big organizations, the government, corporations, and professional guilds, are attempting to squelch the freedom of the consumer to further their own interests, and also strangle a nascent economic sector of start-ups as a side effect.
You are so much more than your genes. So much more than those 3 billion base pairs. But they are a start, a beginning, and how dare the government question your right to know the basic genetic building blocks of who you are. This is the same government which attempted to construct a database of genetic information on foreign leaders. We know very well then who they think should have access to this data. The Very Serious People with a great deal of Power. People with “clearance,” and “expertise,” have a right to know more about your own DNA sequence than you do.
What can you do? What can we do? Can we effect change? I don’t know, I can’t predict the future. But this is what I’m going to do.
Image Credit: Anirudh Koul
One of the great things about the mass personal genomic revolution is that it allows people to have direct access to their own information. This is important for the more than 90% of the human population which has sketchy genealogical records. But even with genealogical records there are often omissions and biases in transmission of information. This is one reason that HAP, Dodecad, and Eurogenes BGA are so interesting: they combine what people already know with scientific genealogy. This intersection can often be very inferentially fruitful.
But what if you had a whole population with rich robust conventional genealogical records? Combined with the power of the new genomics you could really crank up the level of insight. Where to find these records? A reason that Jewish genetics is so useful and interesting is that there is often a relative dearth of records when it comes to the lineages of American Ashkenazi Jews. Many American Jews even today are often sketchy about the region of the “Old Country” from which their forebears arrived. Jews have been interesting from a genetic perspective because of the relative excess of ethnically distinctive Mendelian disorders within their population. There happens to be another group in North America with the same characteristic: the French Canadians. And importantly, in the French Canadian population you do have copious genealogical records. The origins of this group lie in the 17th and 18th centuries, and the Roman Catholic Church has often been a punctilious institution when it comes to preserving records of events under its purview such as baptisms and marriages. The genealogical archives are so robust that last fall a research group input centuries of ancestry for ~2,000 French Canadians, and used it to infer patterns of genetic relationships as a function of geography, as well as long term contribution by provenance. Admixed ancestry and stratification of Quebec regional populations:
Population stratification results from unequal, nonrandom genetic contribution of ancestors and should be reflected in the underlying genealogies. In Quebec, the distribution of Mendelian diseases points to local founder effects suggesting stratification of the contemporary French Canadian gene pool. Here we characterize the population structure through the analysis of the genetic contribution of 7,798 immigrant founders identified in the genealogies of 2,221 subjects partitioned in eight regions. In all but one region, about 90% of gene pools were contributed by early French founders. In the eastern region where this contribution was 76%, we observed higher contributions of Acadians, British and American Loyalists. To detect population stratification from genealogical data, we propose an approach based on principal component analysis (PCA) of immigrant founders’ genetic contributions. This analysis was compared with a multidimensional scaling of pairwise kinship coefficients. Both methods showed evidence of a distinct identity of the northeastern and eastern regions and stratification of the regional populations correlated with geographical location along the St-Lawrence River. In addition, we observed a West-East decreasing gradient of diversity. Analysis of PC-correlated founders illustrates the differential impact of early versus latter founders consistent with specific regional genetic patterns. These results highlight the importance of considering the geographic origin of samples in the design of genetic epidemiology studies conducted in Quebec. Moreover, our results demonstrate that the study of deep ascending genealogies can accurately reveal population structure.
Mitochondrial DNA from 147 people, drawn from five geographic populations have been analysed by restriction mapping. All these mitochondrial DNAs stem from one woman who is postulated to have lived about 200,000 years ago, probably in Africa. All the populations examined except the African population have multiple origins, implying that each area was colonised repeatedly.
And so was published in the year 1987 the paper which established in the public’s mind the idea of mitochondrial Eve, which gave rise to a famous cover photo in Newsweek. This also led to the Children of Eve episode on the PBS documentary NOVA. Here is the summary:
NOVA examines a controversial theory that traces our ancestry to a small group of women living in Africa 300,000 years ago.
As Milford Wolpoff has complained, it is probably accurate to characterize the documentary as not particularly “fair & balanced.” Mitochondrial Eve may have been controversial, and subsequently plagued by issues of molecular clock calibration as well as spurious interpretations of the cladograms, but the tide of history was on its side, and PBS was telling that story. And the story was not just the primary science; rather, one had to understand the controversy in light of the debates among paleontologists and between paleontologists and molecular biologists. A group of researchers, spearheaded by Chris Stringer, argued for the recent origin of modern humans from Africa on the basis of fossils alone. They were challenged by an established school of multiregionalists who argued for deeper roots of modern human populations, which derived from local hominins which diversified after the migration of H. erectus out of Africa. The argument of the multiregionalists was that selective sweeps across the full range of the human populations gave rise gradually to modern humanity as we know it, a compound of specific ancient local features and trans-population characters which unified us into a broader whole. Stringer and company presented a simpler model where anatomically modern human beings arose ~200,000 years ago in Africa, and subsequently expanded to other parts of the world, by and large replacing the local hominin populations. In the multiregionalist telling Neandertals became human beings, while Out of Africa would imply that Neandertals were replaced by human beings.
The number 1 gets a lot more press than -1, and the concept of heterozygosity gets more attention than homozygosity. Concretely the difference between the latter two is rather straightforward. In diploid organisms the genes come in duplicates. If the alleles are the same, then they’re homozygous. If they’re different, then they’re heterozygous. Sex chromosomes can be an exception to this because in the heterogametic sex you generally have only one copy of a gene, as one of the chromosomes is sharply truncated. This is why human males are subject to X-linked recessive traits at such a great frequency in comparison to females; recessive expression is irrelevant when you don’t have a compensatory X chromosome to mask the malfunction of one allele.
Of course recessive traits are not simply a function of sex-linked traits. Consider microcephaly, an autosomal recessive disease. To manifest the trait you need two malfunctioning copies of the gene, one from each parent. In other words, you exhibit a homozygous genotype with two mutant copies. I suspect that this particularly common context of homozygosity, recessive autosomal diseases, is one reason why it is less commonly discussed outside of specialist circles: there is a whole cluster of medical and social factors which lead to homozygosity which are already the focus of attention. The genetic architecture of the trait is of less note than the etiology of the disease and the possible reasons in the family’s background which might have increased the risk probability, especially inbreeding. In contrast heterozygosity is generally not so disastrous. Even if functionality is not 100%, it is close enough for “government work.” The deleterious consequences of a malfunctioning allele are masked by the “wild type” good copy. The exceptions are in areas such as breeding for hybrid vigor, when heterozygote advantage may be coming to the fore. The details of complementation of two alleles matter a great deal to the bottom line, and the concept of hybrid vigor has percolated out to the general public, with the more informed being cognizant of heterozygosity.
But homozygosity is of interest beyond the unfortunate instances when it is connected to a recessive disease. Like heterozygosity, homozygosity exists in spades across our genome. My 23andMe sample comes up as 67.6% homozygous on my SNPs (which are biased toward ~500,000 base pairs which tend to have population wide variation), while Dr. Daniel MacArthur’s results show him to be 68.1% homozygous across his SNPs. This is not atypical for outbred individuals. In contrast someone whose parents were first cousins can come up as ~72% homozygous. This is important: zygosity is not telling you simply about the state of two alleles, in this case base pairs, it may also be telling you about the descent of two alleles. Obviously this is not always clear on the base pair level; mutations happen frequently enough that even if you carry two minor alleles it is not necessarily evidence that they’re identical by descent (IBD), or autozygous (just a term which denotes ancestry of the alleles from the same original copy). What you need to look for are genome-wide patterns of homozygosity, in particular “runs of homozygosity” (ROH). These are long sequences biased toward homozygous genotypes.
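To make the two ideas above concrete, here is a minimal sketch in Python of how one might compute a percent-homozygous figure and scan for runs of homozygosity. Everything in it is an illustrative assumption, not how 23andMe or any published ROH tool works: the genotype list is made up, the function names are hypothetical, genotypes are two-letter strings with "--" as a no-call, and a "run" is simply a stretch of consecutive homozygous SNPs (real ROH methods typically work with physical distance and tolerate the occasional heterozygous or missing call).

```python
from typing import List, Tuple


def is_homozygous(genotype: str) -> bool:
    """True if the genotype is a called pair of identical alleles (e.g. 'AA')."""
    return len(genotype) == 2 and "-" not in genotype and genotype[0] == genotype[1]


def homozygosity_fraction(genotypes: List[str]) -> float:
    """Fraction of called genotypes (no-calls excluded) that are homozygous."""
    called = [g for g in genotypes if len(g) == 2 and "-" not in g]
    if not called:
        raise ValueError("no called genotypes")
    return sum(1 for g in called if g[0] == g[1]) / len(called)


def runs_of_homozygosity(genotypes: List[str], min_snps: int = 5) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) index pairs for stretches of at least
    min_snps consecutive homozygous SNPs. SNPs are assumed to be listed in
    chromosomal order; a heterozygous call or a no-call ends the run
    (a deliberate simplification)."""
    runs = []
    start = None
    for i, g in enumerate(genotypes):
        if is_homozygous(g):
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_snps:
                runs.append((start, i - 1))
            start = None
    # Close a run that reaches the end of the list.
    if start is not None and len(genotypes) - start >= min_snps:
        runs.append((start, len(genotypes) - 1))
    return runs


# Made-up genotypes along one chromosome, in map order.
example = ["AA", "AG", "CC", "TT", "GG", "AA", "CC", "CT", "GG", "--", "TT"]
print(homozygosity_fraction(example))
print(runs_of_homozygosity(example, min_snps=5))
```

The point of the second function is the distinction drawn above: the overall fraction tells you about the state of alleles, while long uninterrupted stretches hint at shared descent.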
The figure to the left is a composite merged from two different papers. One analyzes the patterns of genetic variation within African Americans, and the other the patterns within the East Turkic ethnic group, the Uyghurs. The bar plots show the ancestral element which is similar to two parent populations which resemble Europeans and Africans or East Asians. Looking at total aggregate ancestral quanta we infer that African Americans are on the order of 15-25% European in ancestry, and 75-85% African. Uyghurs seem to be a composite in even measure of a European-like group, and an East Asian-like group. This makes total sense phenotypically; most African Americans look more African, while Uyghurs seem to exhibit a phenotype on average which spans the middle-range between West and East Eurasians.
But we’re clearly missing something when we focus purely on a population level statistic. Each “slice” of the bar plot actually represents an individual. Note the contrast between African Americans and Uyghurs. There is relatively little inter-individual variation among Uyghurs, while there is a great deal of such variation among African Americans. Why? Population geneticists have looked at linkage disequilibrium in both African Americans and Uyghurs, and inferred that the former went through an admixture phase much more recently than the latter. Though you don’t really have to be a population geneticist to have known that about African Americans. The ethnogenesis of the group African Americans as a cultural entity occurred in the period between 1650 and 1850. Genetically they are a compound of African, European, and to some extent Native American, ancestry. For the Uyghurs we have thinner textual evidence, but the visual and genetic data point to a “western” Indo-European speaking population in the Tarim basin before the arrival of the Turks sometime in the second half of the first millennium A.D. The assumption is that after the initial admixture event and the absorption of the pre-Turkic substrate there was no population substructure. Over time the two components distributed themselves evenly across the population over a period of 1,000-1,500 years.
From this we can infer that patterns of individual variation within populations, as well as between closely related populations, can tell us a great deal. Today the Dodecad Ancestry Project posted a file with the population ancestries broken down by individuals. Looking at this sort of fine-grained data, patterns can jump out based on what you already know. Below is a slide show I created which highlights some patterns of interest.
A quintessentially sexy topic in biology is the origin of sex. Not only are biologists interested in it, but so is the public. Of Matt Ridley’s older books it is predictable that The Red Queen has the highest rank on Amazon. We humans have a fixation on sex, both in our public norms and our private actions. Why?
Because without a fixation on sex we would not be here. Celibates do not inherit the earth biologically. This answer emerges naturally from a Darwinian framework. And yet more deeply still: why sex for reproduction? Here I allude to the famous two-fold cost of sex. In dioecious species you have males and females, and males do not directly produce offspring. The increase of the population is constrained by the number of females in such lineages (male gametes are cheap). There is no such limitation in asexual lineages, where every individual can contribute to reproductive “primary production.” Additionally, the mating dance is another cost of sex. Individuals expend time and energy seeking out mates, and may have to compete and display for the attention of all. Why bother?
The answer on the broadest-scale seems to be variation. Variation in selective pressures, and variation in genes. Sex famously results in the shuffling of genetic permutations through recombination and segregation. In a world of protean change where one’s genes are critical to giving one the edge of fitness this constant flux of combinations results in more long term robusticity. What clones gain in proximate perfection, they lose when judged by the vicissitudes of the pressures of adaptation. In the present they flourish, but in the future they perish. Sex is the tortoise, clonal reproduction is the hare.
And yet science is more than just coarse generalities; biology especially so. The details of how sex emerges and persists still remain to be fleshed out. The second volume of W. D. Hamilton’s collected papers, Narrow Roads of Gene Land, is the largest. Mostly because it was not edited appropriately (he died before it could be). But also perhaps because it is the volume most fixated upon the origin and persistence of sex, which is a broad and expansive topic.
A new paper in Nature tackles sex through experimental evolution. In many ways the answer it offers to the question of sex is old-fashioned and straightforward. Higher rates of sex evolve in spatially heterogeneous environments:
Evolution means many things to many people. On the one hand some scholars focus on time scales of “billions and billions,” and can ruminate upon the radical variation in body plans across the tree of life. Others put the spotlight on the change in gene frequencies on the scale of years, of Ph.D. programs. While one group must glean insight from the fossil remains of trilobites and ammonites, others toil away in dimly lit laboratories breeding nematodes and fruit flies, generations upon generations. More recently a new domain of study has been focusing specifically on the arc of animal development as a window onto the process of evolution. And so forth. Evolution has long been dissected by an army of many specialized parts.
And yet the core truth which binds science is that nature is one. No matter the disciplinary lens which we put on at any given moment we’re plumbing the same depths on some fundamental level. But what are the abstract structures of those depths? Can we project a tentative map of the fundamentals before we go exploring through observation and experiment? That’s the role of theoreticians. Charles Darwin, R. A. Fisher, and Sewall Wright. Evolution is a phenomenon which is on a deep level an abstraction, though through objectification we speak of it as if it was as concrete as the frills of the Triceratops. As an abstraction it is open to mathematical formalization. Models of evolution may purport to tell us how change over time occurs in specific instances, but the ultimate aim is to capture the maximum level of generality possible.
I’m using some statistics out of William Boyd’s 1956 printing of Genetics and the Races of Man. It gives a good accounting of blood group data known more than fifty years ago, which I’m using to illustrate my intro lectures. Meanwhile, there are some interesting passages, from the standpoint of today’s knowledge of the human genome and its variation.
On skin pigmentation — this is the earliest statement I’ve run across of the argument that the New World pigmentation cline is shallower than the Old World cline because of the relative recency of occupation….
Looking at what was said about pigmentation generations ago is of interest because it’s a trait which in many ways we have pegged. See Molecular genetics of human pigmentation diversity. Why humans vary in pigmentation in a deep ultimate sense is still an issue of some contention, but how they do so, and when the differences came about, are questions which are now modestly well understood. We know most of the genetic variants which produce between population variation. We also know that East and West Eurasians seem to have been subject to independent depigmentation events. We also know that some of the depigmentation was relatively recent, probably after the Last Glacial Maximum, and possibly as late as the advent of agriculture.
On the New World cline, which is clearly shallower than that of the Old World. The chart below from Signatures of positive selection in genes associated with human skin pigmentation as revealed from analyses of single nucleotide polymorphisms is useful:
Over the past day I’ve seen reports in the media of a new paper which claims that long-term urbanization in a region is strongly correlated with genetic variants for disease resistance. I managed to find the paper on Evolution‘s website as an accepted manuscript, ANCIENT URBANISATION PREDICTS GENETIC RESISTANCE TO TUBERCULOSIS:
A link between urban living and disease is seen in recent and historical records, but the presence of this association in prehistory has been difficult to assess. If the transition to urbanisation does result in an increase in disease-based mortality, we might expect to see evidence of increased disease resistance in longer-term urbanised populations, as the result of natural selection. To test this, we determined the frequency of an allele (SLC11A1 1729 + 55del4) associated with natural resistance to intra-cellular pathogens such as tuberculosis and leprosy. We found a highly significantly correlation with duration of urban settlement – populations with a long history of living in towns are better adapted to resisting these infections. This correlation remains strong when we correct for auto-correlation in allele frequencies due to shared population history. Our results therefore support the interpretation that infectious disease loads became an increasingly important cause of human mortality after the advent of urbanisation, highlighting the importance of population density in determining human health and the genetic structure of human populations.
In some ways this seems plausible. There are a priori reasons why we’d expect to see a great deal of evolutionary change in regions of the genome correlated with variations in immune response. Diseases are one of the most likely reasons for why sex exists in complex multicellular species; sex allows a slow-reproducing population to bend with the rapid-fire punches of their pathogens by shuffling their defenses constantly. The results from recent work mapping patterns of variation in relation to natural selection generally indicate that immune related regions show plenty of signs of adaptation. No surprise, a “Red Queen” model whereby pathogens and their hosts constantly co-evolve would imply that immunologically relevant genes would never be at equilibrium frequencies for long, so we’d have a good shot at catching “selective sweeps” on some of the immune loci.
So how do cities play into this picture? I suspect that the picture is more complicated than the presentation in the paper, though I believe that the authors were constrained by considerations of space from evaluating all possibilities in full depth. There are two facts which I think are critical to understanding the pattern of variation here:
With the recent huge furor over the utility of kin selection I’ve been keeping a closer eye on the literature on inclusive fitness. The reason W. D. Hamilton’s original papers in The Journal of Theoretical Biology are highly cited is not some conspiracy; rather, they’re a powerful framework in which one can understand the evolution of social behavior. They are a logic whose basis is firmly rooted in the world of how inheritance and behavior play out concretely. But because of its formality and spareness inclusive fitness has also given rise to a large literature derived from simulations “in silico,” that is, evolutionary experiments in the digital domain.
One can elucidate inclusive fitness through Hamilton’s Rule, but it is also rather easy to exposit verbally via a “gene’s eye view.” Imagine for example a dominant mutation in a diploid organism which produces the behavior of altruism toward near kin. Initially the altruist will have offspring whose probability of carrying the dominant mutation is 50%, because there is also the probability that they will carry the ancestral non-altruistic variant. Imagine an altruistic behavior which incurs a small, but not trivial, cost to the individual performing the behavior, and a large gain to the individual who is on the receiving end of the altruism. The logic of favoring near kin is such that in the initial generation the parent which behaves altruistically toward near kin is increasing their own “inclusive fitness” because their offspring share 50% of their genes identical-by-descent (in the case of a diploid sexually reproducing organism). But from a gene’s eye perspective what is really occurring is that there is a 50% chance that the gene which fosters altruism is promoting the fitness of a copy of itself. So inclusive fitness operates by modulating the parameters of costs and gains to focal individuals as a function of their relatedness, but it is the genes, the “replicators,” which persist immortally across the generations. We “vehicles” are just the ocean through which genes sail.
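For readers who prefer the formal route, Hamilton’s Rule says that altruism is favored when r × b > c, where r is the relatedness between actor and recipient, b is the benefit to the recipient, and c is the cost to the actor. The short Python sketch below just evaluates that inequality; the function names and the cost and benefit numbers are made up for illustration, chosen to match the “small cost, large gain” scenario above.

```python
def inclusive_fitness_effect(r: float, b: float, c: float) -> float:
    """Net inclusive fitness effect of an altruistic act: r * b - c."""
    return r * b - c


def altruism_favored(r: float, b: float, c: float) -> bool:
    """Hamilton's Rule: the act is favored when r * b > c."""
    return r * b > c


# Illustrative (made-up) payoffs: a small cost to the actor, a large gain
# to the recipient, in arbitrary fitness units.
cost, benefit = 0.1, 0.5

# Parent to offspring: r = 1/2, the 50% chance described above.
print(altruism_favored(0.5, benefit, cost))

# First cousins: r = 1/8, so the same act no longer pays.
print(altruism_favored(0.125, benefit, cost))
```

The same act flips from favored to disfavored purely as a function of relatedness, which is the sense in which inclusive fitness modulates costs and gains.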
But like Darwin’s theory of evolution through natural selection the fruit of these logics is in the details. A new paper in The Proceedings of the Royal Society puts the focus on different means by which inclusive fitness may be maximized. In particular, the paper offers up a reason for why what Richard Dawkins termed the “green-beard effect” is not more common. Selective pressures for accurate altruism targeting: evidence from digital evolution for difficult-to-test aspects of inclusive fitness theory:
There’s a new paper out in The European Journal of Human Genetics which is of great interest because it surveys the genetic and linguistic affinities of two dozen ethno-linguistic groups from the three Central Asian nations of Uzbekistan, Kyrgyzstan, and Tajikistan. This is what the Greeks referred to as Transoxiana, and the Persians as Turan. Originally inhabited by peoples with close cultural affinities with those of Persia, indeed, likely the root of the peoples of Persia, by the historical period Turan developed a distinctive identity as a frontier or march. It was in Turan where the Turk met the Iranian (a class which included non-Persian groups, such as the Sogdians), from the pre-Islamic Sassanians down to the present day. It is a region of the world which has a very ancient urban culture, cities such as Merv, as well as peoples that were only recently nomads, forcibly made sedentary by the Soviet regime.
To add another twist to the picture many of the ethno-linguistic groups which we are familiar with today and which serve as the cores of the new Central Asian nations only came into being within the last few centuries, with a particular “push” from Russian Imperial and Soviet ethnologists who were tasked with fleshing out national identities with which the center could negotiate. A “Tajik” is after all simply part of the Persian-speaking residual population of Central Asia, spreading down into Afghanistan. The carving out of an independent Tajikistan out of the Central Asian landscape is as much a creation of the modern age as the state of Israel. The “Uzbek” identity was once simply that of the ruling caste of Transoxiana who came to power after the decline of the Timurids. Today it is an appellation which brackets the settled Turkic speaking peoples of Uzbekistan and beyond.
Into this near Gordian knot of history and ideology walk the naive and well-meaning geneticists. There is no great objection one can make to the genetics within the paper, but the historical framework and some of the assertions are peculiar and tendentious indeed. It’s a problem which starts within the abstract. In the heartland of Eurasia: the multilocus genetic landscape of Central Asian populations:
Natural selection happens. It was hypothesized in copious detail by Charles Darwin, and has been confirmed in the laboratory, through observation, and also by inference via the methods of modern genomics. But science is more than broad brushes. We need to drill-down to a more fine-grained level to understand the dynamics with precision and detail, and so generate novel inferences which may then be tested. For example, there are various flavors of natural selection: stabilizing selection, negative selection, and positive directional selection. In the first case natural selection buffets the phenotype about an ideal mean, in the second case deleterious phenotypes and their associated alleles are purged from the genome, and finally, natural selection can also drive a novel trait toward greater prominence, and concomitantly the allelic variants which are associated with the fitter phenotype.
The last case is of particular interest to many because it is often through positive natural selection that evolution as descent with modification occurs. Over time trait values and the nature of traits themselves shift such that a lineage changes its character beyond recognition. This phyletic gradualism and the scale independence of evolutionary process has been challenged, in particular from the domain of developmental biology (albeit, not all, or even most, developmental biologists). But ultimately no one doubts that a classical understanding of evolution as change in allele frequency, often driven by natural selection, is part of the larger puzzle of how the tree of life came to be.
One of the phenomena associated with positive directional evolution is the selective sweep. How a selective sweep occurs, and its consequences, are rather straightforward. A genome consists of a sequence of base pairs (e.g., we have 3 billion base pairs). If a new mutation emerges at a particular base pair, a novel single nucleotide polymorphism (SNP), and that allelic variant is ~10% fitter than the ancestral variant, natural selection could drive up its frequency (the conditionality is due to the fact that in all likelihood it would still go extinct because of the power of stochastic forces when a mutant is at low frequency). So the variant could in theory shift from ~0% (1 out of N, N being the number of individuals in a population, 2N if diploid, and so forth) to ~100%. This would be the fixation of the novel variant, driven by selective dynamics. So what’s the sweep aspect? The sweep in this case refers to the effect of the very rapid rise in frequency of the SNP in question on the adjacent genomic region. What is termed a genetic hitchhiking dynamic results if the sweep occurs rapidly, so that nearby regions of the genome also move to fixation along with the favored SNP. But in a diploid organism with sexual reproduction genetic recombination persistently breaks apart associations across the physical genome. Therefore the span of the sequence of genetic markers nearby a favored SNP which form a haplotype is dependent on the rate of recombination as well as the rate of the rise in frequency of the allele, which is contingent on the strength of selection. A powerful selective sweep has the effect of homogenizing wide regions of the genome flanking the favored mutant; in other words the sweep “cleans” the gene pool of variation as one very long haplotype replaces many shorter haplotypes.
As an example, in the genomes of Northern Europeans the locus LCT is characterized by a very long haplotype, which itself seems to correlate well with the trait of lactase persistence. The implication here is that the lactase persistence-conferring variant arose relatively recently, and was swept up to near fixation by positive directional natural selection.
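The long-haplotype signature comes from the tug of war between selection and recombination, and a toy calculation can show the shape of it. What follows is a rough sketch of a deterministic two-locus haploid hitchhiking model, with assumptions that are mine rather than drawn from any LCT analysis: site A/a is under selection (A has fitness 1 + s), site B/b is a neutral marker at recombination distance r, the new A mutation arises on a chromosome carrying B, and drift is ignored entirely. The function name and default parameter values are hypothetical.

```python
def marker_frequency_after_sweep(r, s=0.10, p0=1e-4, b0=0.10, generations=2000):
    """Frequency of neutral marker B after favored allele A has swept.

    r  : recombination fraction between the two sites (0 to 0.5)
    s  : selective advantage of A over a
    p0 : starting frequency of A (a single new copy, always on a B chromosome)
    b0 : starting frequency of B among chromosomes carrying a
    """
    # Haplotype frequencies for AB, Ab, aB, ab.
    x_AB, x_Ab = p0, 0.0
    x_aB, x_ab = (1 - p0) * b0, (1 - p0) * (1 - b0)
    for _ in range(generations):
        # Selection: haplotypes carrying A are weighted by 1 + s.
        mean_w = 1 + s * (x_AB + x_Ab)
        x_AB, x_Ab = x_AB * (1 + s) / mean_w, x_Ab * (1 + s) / mean_w
        x_aB, x_ab = x_aB / mean_w, x_ab / mean_w
        # Recombination erodes the association D between A and B.
        d = x_AB * x_ab - x_Ab * x_aB
        x_AB, x_ab = x_AB - r * d, x_ab - r * d
        x_Ab, x_aB = x_Ab + r * d, x_aB + r * d
    return x_AB + x_aB


if __name__ == "__main__":
    for r in (0.0, 0.001, 0.01, 0.5):
        freq = marker_frequency_after_sweep(r)
        print(f"r = {r:<6} marker frequency after sweep: {freq:.3f}")
```

With no recombination the marker rides along to fixation with the favored allele, while a freely recombining marker barely moves from where it started. Markers in between are dragged up by an amount that depends on the ratio of r to s, which is why a strong, recent sweep leaves a wide homogenized block and a weak or old one leaves a narrow one.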
Sexual selection is, for lack of a better term, a sexy concept. Charles Darwin elaborated on the specific phenomenon of sexual selection in The Descent of Man, and Selection in Relation to Sex. In The Third Chimpanzee Jared Diamond endorsed Darwin’s thesis that sexual selection could explain the origin of human races, as each isolated population extended their own particular aesthetic preferences. More recently the evolutionary psychologist Geoffrey Miller put forward an entertaining, if speculative, battery of arguments in The Mating Mind: How Sexual Choice Shaped the Evolution of Human Nature. It’s clearly the stuff of science that can sell.
Sexual selection itself comes in a variety of flavors. Perhaps the most counterintuitive one at first blush is the idea that many traits, such as antlers, are positively costly and exist only to signal robust health which can incur the cost without debility. The idea was proposed by Amotz Zahavi in the 1970s and later laid out at length in The Handicap Principle. Initially dismissed by Richard Dawkins in the original edition of The Selfish Gene, Zahavi’s ideas have come into modest mainstream acceptance, and the second edition of Dawkins’ seminal work reflects a revised appraisal. This is really a subset of a “good genes” model of sexual selection, whereby females select from a range of males which would exhibit variance in mutational load. A more capricious and erratic form of sexual selection is “runaway,” which like genetic drift needs no rhyme or reason. Rather, arbitrary initial preferences can become coupled with a heritable trait in a positive feedback loop which drives the mean phenotypic value of a population away from its previous value, until natural selection enforces a countervailing pressure once the trait starts to become excessively maladaptive (e.g., imagine selection for longer and longer tail feathers until the ability of a bird to fly is inhibited).
But notwithstanding the inevitable press which the theory gets, and its centrality to several popular science books, the main action in the area of sexual selection is in the academic literature (contrast this with the aquatic ape hypothesis). Many of the verbal outlines of sexual selection are highly stylized, as economists might say. We are treated to images of stags with massive antlers facing off, elephant seals strutting their stuff, and beautifully plumaged birds gathering for a lek. Set next to this is a body of mathematically oriented models, short on color, long on Greek symbols. But these formal models are valuable. Obviously there is a wide range of variation across species in terms of how sexual selection plays out (if it does so at all within a given species, sexual or asexual). The sexual dimorphism of elephant seals is not the norm against which all species are judged. To explore the variables which produce this pattern of difference one must analyze them in an algebraic fashion, where each can be manipulated in isolation so as to properly characterize its impact. So with that, a paper from The American Naturalist which purports to show how assortative mating could emerge in a sexual selection framework, Make love not war: when should less competitive males choose low-quality but defendable females?:
Across the ~3 billion base pairs in the human genome there’s a fair amount of variation. That variation can be partitioned into different classes, somewhat artificial constructions of human categorization systems, but nevertheless mapping on to real demographic or life history events of particular importance. Some of the variation is specific to populations, some of it is specific to a set of populations, and there is also variation which we find only within families. Presumably when whole genome sequencing and analysis becomes the norm such distinctions will still have utility, but we should be able to tunnel down to whatever level of analysis we wish. But until that day comes we’re going to have to rely on population sets which are deeply sequenced and can serve as a reasonable representation of a subset of human variation.
I mention some of these populations regularly on this weblog, the HGDP, HapMap and POPRES being three prominent data sets with a diverse range. These groups cover only a small subset of human populations, and of those populations only a small proportion of the genomes of individuals (albeit, the component which is likely to vary within the population). A new paper in Nature takes a close look at the expansion of the HapMap to a new set of populations. Since it’s out of the HapMap consortium the list of authors itself gives us a large set of individuals who might be of population genetic interest! (Though not a representative set of human population variation; where are the Papuan employees of the Broad Institute?) Some of the data coming out of the next stage of the HapMap has already appeared in several papers (often in the supplements), but this looks to be an overview and taste of what’s to come (the paper was submitted last fall). Integrating common and rare genetic variation in diverse human populations: