The two phylogenies above represent Mycobacterium tuberculosis, to the left, and human mitochondrial DNA (passed from mother to daughter) on the right. They were pulled from the paper, Out-of-Africa migration and Neolithic coexpansion of Mycobacterium tuberculosis with modern humans, which came out recently, and has naturally been making a splash. As the title implies, the paper concludes that humans and tuberculosis have been each other’s “partners,” after a fashion, for the whole existence of modern humanity. The main method here is somewhat brute force and straightforward: by sequencing 259 tuberculosis strains from all across the world they managed to make relatively robust phylogeographic inferences. Throwing data at a question usually resolves something. The correspondence between human and pathogen strains is qualitatively uncanny, and there is plenty enough statistical footwork to confirm it more rigorously within the body of the text.
For many the image of evolutionary processes brings to mind something on a macro scale. Perhaps that of the changing nature of protean life on earth writ large, depicted on a broad canvas such as in David Attenborough’s majestic documentaries over millions of years and across geological scales. But one can also reduce the phenomenon to a finer grain on a concrete level, as in specific DNA molecules. Or, transform it into a more abstract rendering manipulable by algebra, such as trajectories of allele frequencies over generations. Both of these reductions emphasize the genetic aspect of natural history.
Obviously evolutionary processes are not just fundamentally the flux of genetic elements, but genes are crucial to the phenomena in a biological sense. It therefore stands to reason that if we look at patterns of variation within the genome we will be able to infer in some deep fashion the manner in which life on earth has evolved, and conclude something more general about the nature of biological evolution. These are not trivial affairs; it is not surprising that philosophy-of-biology is often caricatured as philosophy-of-evolution. One might dispute the characterization, but it can not be denied that some would contend that evolutionary processes in some way allow us to understand the nature of Being, rather than just how we came into being (Creationists depict evolution as a religion-like cult, which imparts the general flavor of some of the meta-science and philosophy which serves as intellectual subtext).
Evolutionary genetics as a field emerged in the early 20th century. There were some upsides to this. R. A. Fisher was alive, so there were some incredibly brilliant theoretical minds who could focus upon the project of formalizing evolutionary process and fusing it with Mendelian genetics. And, frankly there are situations where data-free theorizing is best because that sort of theorizing at least is blind to what the solutions should be. But there were also many downsides to this early flowering of theoretical evolutionary biology. The reality that biologists were not clear as to the nature of the biomolecular substrate of inheritance, DNA, was not a hindrance for most of the high level abstraction. But to trace patterns of transmission of characters, and implicitly genotypes, within populations researchers relied upon classical phenotypic markers. This means that the theoretical speculation advanced rapidly into confusing and tendentious terrain, while the empirical data sets to test the questions at issue were simply not sufficient to resolve the debates. The emergence of molecular markers in the 1960s, and the maturation of genomics in the 2000s, has revolutionized the empirical domain of evolutionary genetics. To use a rough analogy the large data sets of the present offer up raw material for the machinery of theory to sift, process, and refine.
A new paper in Nature is a perfect illustration of this, Pervasive genetic hitchhiking and clonal interference in forty evolving yeast populations:
One of the elementary aspects of understanding genetics on a biophysical scale is to characterize the set of processes which span the chasm between the raw sequence information of base pairs (e.g. AGCGGTCGCAAG….) and the assorted macromolecules which are woven together to create the collection of tissues, and enable the physiological processes, which result in the organism. This suite of phenomena is encapsulated most succinctly in the often maligned Central Dogma of Molecular Biology. In short, the information of the DNA sequence is transcribed and translated into proteins. Though for greater accuracy and precision one must always add the caveats of phenomena such as splicing. The baroque character of the range of processes is such that molecular genetics has become a massive enterprise, to a great extent superseding classical Mendelian genetics.
One critical structural detail from an evolutionary perspective is that the amino acids which are the building blocks of proteins are generally encoded by multiple nucleotide triplets, or codons. For example the amino acid Glycine is “four-fold degenerate”: GGA, GGG, GGC, and GGU (in RNA Uracil, U, substitutes for the Thymine, T, of DNA) all encode it. Notice that the change is confined to the third position in the codon. Altering the first or second position would transform the amino acid end product, and possibly perturb the function of the final protein (or perhaps disrupt transcription altogether in some cases). These third-position changes are synonymous substitutions because they don’t change the functional import of the sequence, as opposed to the nonsynonymous positions (which may abolish or change function). In an evolutionary context one may presume that these synonymous substitutions are “silent.” Because natural selection operates upon heritable variation of a phenotype, and synonymous substitutions presumably do not change phenotype, it is often assumed that evolutionary change on these bases is selectively neutral. In contrast, nonsynonymous changes may be deleterious or beneficial (far more likely the former than the latter because breaking contingent complexity is easier than creating new contingent complexity). Therefore the ratio of genetic change on nonsynonymous and synonymous bases across lineages has been a common measure of possible selection on a gene.
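To make the degeneracy concrete, here is a minimal Python sketch that builds the standard genetic code in RNA letters and classifies single-base changes as synonymous or nonsynonymous. The function names (codons_for, classify_substitution, synonymous_fraction) are made up for illustration and do not come from any library, and only the standard nuclear code is assumed.

```python
from itertools import product

# Standard genetic code in RNA letters. Codons are enumerated with the
# bases ordered U, C, A, G at each position (third position varies fastest).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons_for(amino_acid):
    """Return every codon for a one-letter amino acid code, sorted."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def classify_substitution(codon, position, new_base):
    """Classify a single-base change; position is 0, 1, or 2."""
    mutant = codon[:position] + new_base + codon[position + 1:]
    same = CODON_TABLE[codon] == CODON_TABLE[mutant]
    return "synonymous" if same else "nonsynonymous"


def synonymous_fraction(codon, position):
    """Fraction of the three possible changes at one position that are silent."""
    others = [b for b in BASES if b != codon[position]]
    silent = sum(
        classify_substitution(codon, position, b) == "synonymous" for b in others
    )
    return silent / len(others)


glycine_codons = codons_for("G")  # the four GGN codons
third_position_silent = synonymous_fraction("GGA", 2)  # every change is silent
first_position_silent = synonymous_fraction("GGA", 0)  # every change alters the product
```

Glycine is a tidy case. Under the same table a few first-position changes are also silent (CUA and UUA both encode leucine), which is one reason real dN/dS estimators count synonymous and nonsynonymous sites codon by codon rather than assuming "third position equals silent."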
The Y chromosome is strange. It’s gene poor and loaded with repeats. That’s one reason mtDNA phylogenetic and phylogeographic analysis preceded the Y chromosome by about 10-15 years (the other major reason in the pre-PCR age is that mtDNA is very copious). While the hypervariable region of mtDNA is an excellent molecular clock because of its high mutation rate (though at a deep enough time depth this causes problems, as bases start to turn over), in the pre-next generation sequencing era hunting around the Y chromosomes for SNPs was tedious (a significant portion of Spencer Wells’ Journey of Man focused on the nitty gritty of extraction and preparation).
Despite all this one of the weirder stories over the past decade in relation to the Y chromosome is the peculiar theory promoted by Oxford geneticist Bryan Sykes, and outlined in his book Adam’s Curse: A Future without Men. As I observed above the Y chromosome has a tendency to be filled up with genetic garbage (since it does not recombine, deleterious mutations tend to accumulate). There are a few important functional regions (e.g., SRY), but there’s also a reason that sex-linked diseases occur: in most cases males have to rely on the X chromosome to pick up the slack for the Y. Extrapolating this genetic decay, Sykes posited that human males would disappear within ~10 million years due to this process working its inevitable logic. Needless to say most scientists were skeptical. Extrapolating without seeing if the projections pass the sniff test is a fool’s errand. And in any case there’s no Law of Nature that sex determination has to be via the Y chromosome. Birds and reptiles have males despite a somewhat different sex determination system.
Sexual selection is a big deal. A few years ago Geoffrey Miller wrote The Mating Mind: How Sexual Choice Shaped the Evolution of Human Nature, which seemed to herald a renaissance of the public awareness of this evolutionary phenomenon, triggered in part by debates over Amotz Zahavi’s Handicap Principle in the 1970s. Of course Charles Darwin discussed the process in the 19th century, and it has always been part of the arsenal of the evolutionary biologist (I first encountered it in Jared Diamond’s The Third Chimpanzee, where he lent some credence to Darwin’s supposition that human racial differences may be a consequence of sexual selection). But this bump in recognition for sexual selection seems to be accompanied by its co-option as a deus ex machina for all sorts of unexplained events. And yet as they say, that which explains everything explains nothing.
To get a better sense of the current scientific literature I consulted A Guide to Sexual Selection Theory in the Annual Review of Ecology, Evolution, and Systematics. The image above is from an actual box in this review! Normally technical boxes illuminate with an air of superior authority (e.g. “it therefore follows from eq. 1…”), but it seems to me that the admission that a parameter can be represented by the verbal assertion that it’s complicated tells us something about the state of sexual selection theory. In short: its formal basis is baroque because the dynamic itself is not amenable to easy decomposition.
In the post below I alluded to the views of R. A. Fisher. This was a moderately dangerous move on my part because many of Fisher’s views have been transmitted only through later researchers, who may have lacked a clear understanding of what Fisher himself was trying to say. Heap on top of that the reality that the debate between Fisher and Sewall Wright was often abstruse for the evolutionary biologists who nevertheless managed to take sides and transmit their understandings of the conflict, and it’s a recipe for misrepresentation. With that in mind let me enter into the record an email from a friend who has engaged in a deep reading of Fisher, and attempted to understand his reasoning (no, this is not A. W. F. Edwards!):
A few days ago I was browsing Haldane’s Sieve, when I stumbled upon an amusing discussion which arose on its “About” page. This “inside baseball” banter got me to thinking about my own intellectual evolution. Over the past few years I’ve been delving more deeply into phylogenetics and phylogeography, enabled by the rise of genomics, the proliferation of ‘big data,’ and accessible software packages. This entailed an opportunity cost. I did not spend much time focusing on classical population and evolutionary genetic questions. Strewn about my room are various textbooks and monographs I’ve collected over the years, and which have fed my intellectual growth. But I must admit that it is a rare day now that I browse Hartl and Clark or The Genetical Theory of Natural Selection without specific aim or mercenary intent.
Like a river inexorably coursing over a floodplain, with the turning of the new year it is now time to take a great bend, and double back to my roots, such as they are. This is one reason that I am now reading The Founders of Evolutionary Genetics. Fisher, Wright, and Haldane are like old friends, faded, but not forgotten, while Muller was always but a passing acquaintance. But ideas 100 years old still have power to drive us to explore deep questions which remain unresolved, but where new methods and techniques may shed greater light. A study of the past does not allow us to make wise choices which can determine the future with any certitude, but it may at least increase the luminosity of the tools which we have to illuminate the depths of the darkness. The shape of nature may become just a bit less opaque through our various endeavors.
The Pith: Natural selection comes in different flavors in its genetic constituents. Some of those constituents are more elusive than others. That makes “reading the label” a non-trivial activity.
As you may know when you look at patterns of variation in the genome of a given organism you can make various inferences from the nature of these patterns. But the power of those inferences is conditional on the details of the real demographic and evolutionary histories, as well as the assumptions made about the models which one is testing. When delving into the domain of population genomics some of the concepts and models may seem abstruse, but the reality is that such details are the stuff of which evolution is built. A new paper in PLoS Genetics may seem excessively esoteric and theoretical, but it speaks to very important processes which shape the evolutionary trajectory of a given population. The paper is titled Distinguishing between Selective Sweeps from Standing Variation and from a De Novo Mutation. Here’s the author summary:
Considerable effort has been devoted to detecting genes that are under natural selection, and hundreds of such genes have been identified in previous studies. Here, we present a method for extending these studies by inferring parameters, such as selection coefficients and the time when a selected variant arose. Of particular interest is the question whether the selective pressure was already present when the selected variant was first introduced into a population. In this case, the variant would be selected right after it originated in the population, a process we call selection from a de novo mutation. We contrast this with selection from standing variation, where the selected variant predates the selective pressure. We present a method to distinguish these two scenarios, test its accuracy, and apply it to seven human genes. We find three genes, ADH1B, EDAR, and LCT, that were presumably selected from a de novo mutation and two other genes, ASPM and PSCA, which we infer to be under selection from standing variation.
The dynamic which they refer to seems to be a reframing of the conundrum of detecting hard sweeps vs. soft sweeps. In the former case you have a new mutation, so its frequency is ~1/(2N). It is quickly subject to natural selection (though stochastic processes dominate at low frequencies, so probability of extinction is high), and adaptation drives the allele to fixation (or nearly to fixation). In the latter scenario you have a great deal of extant genetic variation, present in numerous different allelic variants. A novel selection pressure reshapes the frequency landscape, but you can not ascribe the genetic shift to only one allele. It is no surprise that the former is easier to model and detect than the latter. Much of the evolutionary genomics of the 2000s focused on hard sweeps from de novo mutations because they were low hanging fruit. The methods had reasonable power to detect them (as well as many false positives!). But of late many are suspecting that hard sweeps are not the full story, and that much of evolutionary genetic process may be characterized by a combination of hard sweeps, soft sweeps (from standing variation), various forms of negative selection, not to mention the plethora of possibilities which abound in the domain of balancing selection.
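The parenthetical about extinction being likely at low frequency is easy to check numerically. Below is a rough Wright-Fisher sketch in Python, not the method of the paper: it starts a single beneficial copy at frequency 1/(2N), resamples 2N gene copies each generation, and counts how often the allele fixes. The additive (genic) selection scheme, the parameter values, and all function names are my own assumptions for illustration. The comparison function is the standard diffusion approximation for a new additive mutation, which for small s comes out near 2s.

```python
import math
import random


def kimura_fixation_probability(n_diploid, s):
    """Diffusion approximation for a new mutant at frequency 1/(2N),
    assuming additive selection and an effective size equal to N."""
    if s == 0:
        return 1.0 / (2 * n_diploid)  # neutral case: the starting frequency
    return (1 - math.exp(-2 * s)) / (1 - math.exp(-4 * n_diploid * s))


def simulate_new_mutation(n_diploid, s, rng):
    """Run one Wright-Fisher replicate from a single mutant copy.
    Returns True if the mutant fixes, False if it is lost."""
    total = 2 * n_diploid
    copies = 1  # initial frequency is 1/(2N)
    while 0 < copies < total:
        p = copies / total
        # Selection shifts the expected frequency, then drift resamples it.
        p_selected = p * (1 + s) / (1 + p * s)
        copies = sum(rng.random() < p_selected for _ in range(total))
    return copies == total


def estimate_fixation_probability(n_diploid, s, replicates, seed=1):
    """Fraction of independent replicates in which the mutant fixes."""
    rng = random.Random(seed)
    fixed = sum(simulate_new_mutation(n_diploid, s, rng) for _ in range(replicates))
    return fixed / replicates


if __name__ == "__main__":
    n, s = 50, 0.1  # illustrative values only
    print("simulated:", estimate_fixation_probability(n, s, replicates=1000))
    print("diffusion approximation:", kimura_fixation_probability(n, s))
```

Even with a hefty 10% advantage most new copies are lost within a few generations, which is why a hard sweep traces back to one of the lucky survivors, while an allele already present in many copies as standing variation is far less exposed to that early lottery.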
Many of the details of the paper may seem overly technical and opaque (and to be fair, I will say here that the figures are somewhat difficult to decrypt, though the subject is not one that lends itself to general clarity), but the major finding is straightforward, and illustrated in figure 4 (I’ve added labels):
One of the weird things about genetics is that it encompasses both the abstract and the concrete. The formal and physical. You can talk to a geneticist who is mostly interested in details of molecular mechanisms, and is steeped in structural biology. For these people genes are specific and material things. In contrast there are other geneticists who focus more on genes as units of analysis. In this case genes are semantic labels for the mediators within an intersection of phenomena. Recall that genetics predates the knowledge of its concrete substrate by 50 years! By the 1920s Mendelian genetics had been fused with evolutionary biology to create a systematic framework in which we could understand the patterns of inheritance across the generations. In the 1950s the DNA revolution was upon us, but as W. D. Hamilton recalls this had only a minimal impact on the evolutionary genetic thinkers of the era. With the Lewontin and Hubby allozyme paper in the mid-1960s this sort of benign disciplinary evasion was no longer possible; the field of molecular evolution came into its own.*
Today with genomics these human-imposed artificialities are fading away. Consider the concept of genetic recombination. Originally an abstraction in a formal Mendelian system, today it is of great interest to molecular biologists who are curious as to its exact mechanism and purpose, and genomicists who are interested in the constraints upon the phenomenon due to its physical parameters (e.g., recombination hotspots). If we were to discover alien beings I assume that there would be some sort of genetics in an abstract sense. But would they package their genes in chromosomes? Would their complex organisms tend toward dioecy? I wouldn’t be surprised if the genetics of alien species have their own particular kinks subject to the contingent nature of the physical scaffolding of the process.
Implicit in the title The Origin Of Species is the question: why the plural? In other words, why isn’t there a singular apex species which dominates this planet? One can imagine an abstract system where natural selection slowly but gradually sifts through variation and designs a best-of-all-replicators. And yet on the contrary it seems that our planet has exhibited an overall tendency of going from lower to higher diversity. The age of stromatolites may be the last epoch when we had the best-of-all-replicators.
Sad news. John Hawks passes along that James F. Crow has died. Further mention from the National Center For Science Education. A little over 5 years ago I sent Crow an email with only minimal expectation of response, asking about an interview. He responded in less than 24 hours! I think it says a lot about the man that he would respond to sincere questions out of the blue from basically a nobody. Here is his Wikipedia entry. And remember that Genetics has commissioned a series of retrospective essays in Crow’s honor.
Genetics is powerful. The origins of the field predate Gregor Mendel, and go further back to plain human common sense. Crude theories of inheritance in the 19th century gave way in the early 20th to Mendelism, which happens to be a very powerful formal system for predicting the patterns of transmission of information from generation to generation. But I suspect that the popular accolades showered upon genetics would be more muted if it were not for the concrete discovery of the biophysical medium of that pattern of inheritance, DNA. By visualizing strands of DNA packaged into chromosomes one can gain a substantive understanding of Mendelian processes previously somewhat abstracted (e.g., recombination). In concert with the centrality of genetics at the heart of evolutionary science has been the ascendance of its methods in the older field of systematics. The phylogenetic tree is not only intuitive, but it has concrete reality in the sequences of base pairs or structural elements within the genome.
Whatever skepticism there might be about the dynamic phenomenon of evolution, the material aspect of modern genetics rooted in molecular biology is one of the primary wedges by which one can introduce an element of doubt into the mind of a skeptic. The correlation between phylogeny and sequence identity of organisms which were previously adduced to exhibit some sort of biological relationship on the tree of life can not be dismissed out of hand. But this mode of thinking has limits, albeit due to the quirks of human psychology.
A friend pointed me to the heated comment section of this article in Nature, Rebuilding the genome of a hidden ethnicity. The issue is that Nature originally stated that the Taino, the native people of Puerto Rico, were extinct. That resulted in an avalanche of angry comments, which one of the researchers, Carlos Bustamante, felt he had to address. Eventually Nature updated their text:
CORRECTED: This article originally stated that the Taíno were extinct, which is incorrect. Nature apologizes for the offence caused, and has corrected the text to better explain the research project described.
Here’s Wikipedia on the Taino today:
“Is Evolution Predictable?” asks a piece in Science. Here’s the first paragraph:
If one could rewind the history of life, would the same species appear with the same sets of traits? Many biologists have argued that evolution depends on too many chance events to be repeatable. But a new study investigating evolution in three groups of microscopic worms, including the strain that survived the 2003 Columbia space shuttle crash, indicates otherwise. When raised in a lab under crowded conditions, all three underwent the same shift in their development by losing basically the same gene. The work suggests that, to some degree, evolution is predictable.
The “some degree” part is the catch. I’m a big fan of general ideas, but the more I learn about evolution the more suspicious I become of broad truths. A given dynamic often has some degree of validity, but extending it too far leads to error or confusion in innumerable specific cases. Evolution may be the most robust and powerful theory for deductive inference in biology, but even here rationalism has its limits. For example, before the rise of molecular methods in exploring polymorphism the debates as to the nature of genetic variation in natural populations tended to focus on outcomes based on adaptive pressures. One school followed R. A. Fisher and argued that polymorphism was strongly constrained by negative selection, with periodic bouts of genetic diversity at a given locus as a positively selected allele was in transience between ~0% and ~100%. Sewall Wright on the other hand suggested that balancing selection (e.g., frequency dependence, heterozygote advantage, environmental heterogeneity) would maintain polymorphism within a population. The logic in both cases was clear, crisp, and plausible. But it turned out that in a deep way the argument was in the “not even wrong” category. Neutral theory and its heirs pointed out, correctly it seems, that at the molecular level most variation was driven by non-adaptive forces such as random genetic drift. Though some thinkers had conceptualized the model in its broad outlines prior to the empirical results, it was the latter which crystallized the need for a robust model and marginalized the older debate centered around adaptation and natural selection. But even here neutrality is not a model to explain it all. There are cases where adaptation and natural selection are relevant. In some instances you see classical dynamics with transients generated by positive selection sweeping through populations, and in other cases balancing dynamics may be operative.
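For readers who want to see the two adaptive pictures side by side, here is a small deterministic sketch in Python using the textbook one-locus, two-allele selection recursion. The fitness values are arbitrary numbers I picked for illustration, the function names are hypothetical, and drift is ignored entirely, which is exactly the ingredient that neutral theory put at center stage.

```python
def next_frequency(p, w_AA, w_Aa, w_aa):
    """One generation of selection on allele A under random mating."""
    q = 1 - p
    mean_fitness = p * p * w_AA + 2 * p * q * w_Aa + q * q * w_aa
    return p * (p * w_AA + q * w_Aa) / mean_fitness


def trajectory(p0, w_AA, w_Aa, w_aa, generations):
    """Allele frequency of A over time, starting from p0."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(next_frequency(freqs[-1], w_AA, w_Aa, w_aa))
    return freqs


def heterozygosity(p):
    """Expected heterozygosity at a two-allele locus."""
    return 2 * p * (1 - p)


# Positive selection: polymorphism is transient while A sweeps toward 100%.
sweep = trajectory(0.01, 1.10, 1.05, 1.00, 500)

# Heterozygote advantage with fitnesses 1 - s, 1, 1 - t (s = 0.1, t = 0.2):
# A settles at the stable interior equilibrium t / (s + t).
balanced = trajectory(0.01, 0.90, 1.00, 0.80, 500)

peak_diversity_during_sweep = max(heterozygosity(p) for p in sweep)
final_diversity_after_sweep = heterozygosity(sweep[-1])
final_diversity_balanced = heterozygosity(balanced[-1])
```

In the sweep diversity at the locus rises, peaks near a frequency of one half, and collapses again; under heterozygote advantage it is retained indefinitely. Neither recursion says anything about the large pool of variants with no fitness effect at all.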
The overall point is that we must always be careful about bald assertions of the form “the latest research overturns….” in this area. Evolution is such a sprawling and cosmopolitan intellectual empire. Nature is subtle and richly textured, and our conceptual frameworks map onto the shape of reality only coarsely.
As for the paper itself, it’s nice and elegant. Patrick Phillips, who knows a thing or two about evolution and elegans, is quoted in Science as saying that “It’s an amazing study….” The letter to Nature is Parallel evolution of domesticated Caenorhabditis species targets pheromone receptor genes. Here’s the abstract:
The Pith: The human X chromosome is subject to more pressure from natural selection, resulting in less genetic diversity. But, the differences in diversity of X chromosomes across human populations seem to be more a function of population history than differences in the power of natural selection across those populations.
In the past few years there has been a finding that the human X chromosome exhibits less genetic diversity than the non-sex regions of the genome, the autosome. Why? On the face of it this might seem inexplicable, but a few basic structural factors derived from the architecture of the human genome present themselves.
First, in males the X chromosome is hemizygous, rendering it more exposed to selection. This is rather straightforward once you move beyond the jargon. Human males have only one copy of genes which express on the X chromosome, because they have only one X chromosome. In contrast, females have two X chromosomes. This is the reason why sex linked traits in humans are disproportionately male. For genes on the X chromosome women can be carriers of many diseases because they have two copies of a gene, and one copy may be functional. In contrast, a male has only a functional or nonfunctional version of the gene, because he has one copy on the X chromosome. This is different from the case on the autosome, where both males and females have two copies of every gene.
This structural divergence matters for the selective dynamics operative upon the X chromosome vs. the autosome. On the autosome recessive traits pay far less of a cost in terms of fitness than they do on the X chromosome, because in the case of the latter they’re much more often exposed to natural selection via males. In the rest of the genome recessive traits only pay the cost of their shortcomings when they’re present as two copies in an individual, homozygotes. A simple quasi-formal example illustrates the process.
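The original worked example is not reproduced in this excerpt, so what follows is my own back-of-the-envelope version in Python. It assumes Hardy-Weinberg proportions, an equal sex ratio, a fully recessive allele at frequency q in both sexes, and function names invented for the occasion.

```python
def affected_fraction_autosomal(q):
    """Either sex: only aa homozygotes express a recessive autosomal trait."""
    return q * q


def affected_fraction_x_linked(q):
    """Return (males, females). Hemizygous males express the trait whenever
    their single X carries the allele; females need two copies."""
    return q, q * q


def exposed_copies_autosomal(q):
    """Share of recessive allele copies that sit in homozygotes and are
    therefore visible to selection."""
    return q


def exposed_copies_x_linked(q):
    """One third of all X chromosomes are in males, where the allele is
    always exposed; the remaining two thirds are in females, where a copy
    is exposed only when it sits in a homozygote (probability q)."""
    return 1 / 3 + (2 / 3) * q


q = 0.01  # illustrative allele frequency
males, females = affected_fraction_x_linked(q)
autosomal_exposure = exposed_copies_autosomal(q)
x_linked_exposure = exposed_copies_x_linked(q)
```

At q = 1% a recessive autosomal allele is exposed in about 1% of its copies, while the same allele on the X is exposed in roughly a third of them, so selection can purge it far more efficiently.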
Update: John Hawks’ lab is working in the same area, and he disagrees with the specific results presented here. It always reminds you to be careful about sexy results presented at conferences! (Someone should do a study!)
So claimed Peter Parham at a Royal Society meeting last week, Human evolution, migration and history revealed by genetics, immunity and infection. You can actually listen to the talk by pulling down the mp3 file. To get the part about human evolution and introgression, jump to 24 minutes in.
Here is the general sketch: It looks like ~50 percent of the HLA Class I alleles in Europeans derive from Neandertals, ~70-80 percent of HLA Class I alleles in East Asians derive from Denisovans, and ~90-95 percent of HLA Class I alleles in Papuans derive from Denisovans. If you recall, ~2.5% of the total genome content of non-Africans seems to be Neandertal, while ~5% of the total genome content of Papuans seems to be Denisovan. The total genome content proportions are rough estimates; there may be some wiggle room in there. But you can see that the HLA allele admixture estimates from these ancient Eurasian lineages are greater by an order of magnitude. Why?
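The "order of magnitude" claim is just a ratio of the rough percentages quoted above, and a few lines of Python make the arithmetic explicit. The helper name is mine, and the East Asian comparison is left out because no genome-wide Denisovan figure for East Asians is given here.

```python
def fold_enrichment(locus_percent, genome_percent):
    """How many times higher archaic ancestry is at a locus than genome-wide."""
    return locus_percent / genome_percent


# Europeans: ~50% of HLA Class I alleles vs ~2.5% of the genome from Neandertals.
europeans_neandertal = fold_enrichment(50, 2.5)

# Papuans: ~90-95% of HLA Class I alleles vs ~5% of the genome from Denisovans.
papuans_denisovan = (fold_enrichment(90, 5), fold_enrichment(95, 5))
```

Both comparisons land at roughly twentyfold, give or take the wiggle room in the genome-wide estimates.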
Physicists’ study of evolution in bacteria shows that adaptations can be undone, but rarely. Ever since Charles Darwin proposed his theory of evolution in 1859, scientists have wondered whether evolutionary adaptations can be reversed. Answering that question has proved difficult, partly due to conflicting evidence. In 2003, scientists showed that some species of insects have gained, lost and regained wings over millions of years. But a few years later, a different team found that a protein that helps control cells’ stress responses could not evolve back to its original form.
Here are the primary results: