A splice of evolution?

By Razib Khan | March 29, 2010 3:50 am

It is famously noted that when Charles Darwin published The Origin of Species he had no plausible theory of inheritance to drive his hypothesis. Specifically, one of the major problems with the “blending” model, whereby the phenotypes of the parents average out in the subsequent generation, is that such mixing eliminates the variation which is a necessary precondition for natural selection. At the same time that Darwin was revolutionizing our conceptualization of how the tree of life came to be, Gregor Mendel was performing the experiments which solidified his eponymous theory of inheritance. Though ignored in his own day, around 1900 Mendelism reemerged and offered a relatively parsimonious abstraction which could explain why variation was not eliminated through the fusion of sexual reproduction. The discrete genes themselves were simply rearranged every generation in a digital manner, with a genotype translated into a phenotype, in contrast to the more analog model of phenotypic mixing which underpins a blending theory.* The fusion of genetics and quantitative evolutionary biology resulted in population genetics (see The Origins of Theoretical Population Genetics), while the cross-fertilization with ecology, natural history and paleontology eventually crystallized into what we would term the ‘Neo-Darwinian Synthesis’ by the middle of the 20th century.
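The contrast between blending and particulate inheritance can be made concrete with a toy simulation. What follows is only a sketch under simple assumptions of my own choosing: random mating, a constant population size, and a single locus with two alleles coded 0 and 1 whose count gives the phenotype. The function names and parameters are invented for the illustration. Under blending, each offspring is the average of two independently drawn parents, so the phenotypic variance should roughly halve every generation; under Mendelian transmission the alleles are reshuffled intact and the variance should hold roughly steady, apart from drift.

```python
import random
import statistics


def blending_generation(phenotypes, rng):
    """Blending: each offspring's phenotype is the plain average of two random parents."""
    n = len(phenotypes)
    return [(rng.choice(phenotypes) + rng.choice(phenotypes)) / 2 for _ in range(n)]


def mendelian_generation(genotypes, rng):
    """Mendelian: each offspring gets one randomly chosen allele from each of two random parents."""
    n = len(genotypes)
    return [
        (rng.choice(rng.choice(genotypes)), rng.choice(rng.choice(genotypes)))
        for _ in range(n)
    ]


def simulate(generations=10, n=2000, seed=1):
    """Return per-generation phenotypic variance under blending and under Mendelian inheritance."""
    rng = random.Random(seed)
    # Hypothetical starting population: two alleles per individual, each 0 or 1.
    genotypes = [(rng.randint(0, 1), rng.randint(0, 1)) for _ in range(n)]
    # Both models start from the same phenotypes (allele counts 0, 1 or 2).
    phenotypes = [float(a + b) for a, b in genotypes]
    blend_var, mendel_var = [], []
    for _ in range(generations):
        blend_var.append(statistics.pvariance(phenotypes))
        mendel_var.append(statistics.pvariance([a + b for a, b in genotypes]))
        phenotypes = blending_generation(phenotypes, rng)
        genotypes = mendelian_generation(genotypes, rng)
    return blend_var, mendel_var


if __name__ == "__main__":
    blend_var, mendel_var = simulate()
    for gen, (b, m) in enumerate(zip(blend_var, mendel_var)):
        print(f"generation {gen}: blending variance {b:.4f}, Mendelian variance {m:.4f}")
```

In a finite population the Mendelian variance will still wander slowly through drift, which is a reminder that the textbook result assumes an idealized population.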

And it was then that Francis Crick and James Watson elucidated the specific biophysical substrate, DNA, through which Mendelian inheritance occurred. It was around then that Crick also outlined his famous and infamous ‘central dogma,’ whereby information was transmitted unidirectionally from DNA to protein via RNA. While molecular biology was flowering, the theorists who relied on the older abstractions were relatively unperturbed (see The Narrow Roads of Gene Land 1 by W. D. Hamilton). In Darwin’s Dangerous Idea the philosopher Daniel Dennett asserted that evolution was fundamentally substrate neutral; that is, how genetic information is transmitted biophysically is of less relevance than the abstract parameter of natural selection which operates upon the character of that information through the mediation of fitness and phenotype. In a broad philosophical sense this may be true. Assuming infinite population sizes and time, this is indubitably so. But there is much that transpires from the beginning to the end, and more recent work has suggested that the physical realities and constraints of molecular function cannot simply be abstracted away on a realistic time scale. It is, I think, somewhat peculiar to push the abstraction too far when speaking of biology in particular, because biological processes often operate under physical constraint or scarcity as a matter of course.

To understand evolution today in any non-trivial sense, that is, to understand evolution as a process which operates on scales shorter than the heat-death of the universe, it seems that one must consider the details of the substrate. In other words, the great wall between molecular biology and evolutionary science must be torn down once and for all. We have come far from the isolated alleles operating in a statistical sea of random variation which R. A. Fisher conceived of when he attempted to reformulate Darwin’s theories so that they were as precise and crisp as the laws of thermodynamics (see The Genetical Theory of Natural Selection). The recent debates between Sean Carroll and Michael Lynch (or Sean Carroll and Jerry Coyne) put into sharp relief the relevance of substrate, the importance of gene regulation and particularly cis-regulatory elements.**

Gene regulation entails the modulation of the expression of some genes by other genes, by any means possible. A new letter to Nature, Understanding mechanisms underlying human gene expression variation with RNA sequencing, gives us a possible taste of the future, using the familiar HapMap data set to explore variation in gene expression:

Understanding the genetic mechanisms underlying natural variation in gene expression is a central goal of both medical and evolutionary genetics, and studies of expression quantitative trait loci (eQTLs) have become an important tool for achieving this goal. Although all eQTL studies so far have assayed messenger RNA levels using expression microarrays, recent advances in RNA sequencing enable the analysis of transcript variation at unprecedented resolution. We sequenced RNA from 69 lymphoblastoid cell lines derived from unrelated Nigerian individuals that have been extensively genotyped by the International HapMap Project… By pooling data from all individuals, we generated a map of the transcriptional landscape of these cells, identifying extensive use of unannotated untranslated regions and more than 100 new putative protein-coding exons. Using the genotypes from the HapMap project, we identified more than a thousand genes at which genetic variation influences overall expression levels or splicing. We demonstrate that eQTLs near genes generally act by a mechanism involving allele-specific expression, and that variation that influences the inclusion of an exon is enriched within and near the consensus splice sites. Our results illustrate the power of high-throughput sequencing for the joint analysis of variation in transcription, splicing and allele-specific expression across individuals.
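For readers who have not run into eQTLs before, the basic idea can be sketched in a few lines: treat each individual's genotype at a SNP as a count of minor alleles (0, 1 or 2) and ask whether expression of a nearby gene tracks that count. The example below is a hedged illustration using entirely synthetic data and a plain least-squares fit. It is not the pipeline used in the paper, and the effect size, noise level, baseline and function names are all assumptions made up for the demonstration. Only the sample size of 69 is taken from the abstract.

```python
import random


def simulate_expression(genotypes, effect, noise_sd, rng):
    """Synthetic expression: an arbitrary baseline plus an additive effect per minor-allele copy."""
    return [10.0 + effect * g + rng.gauss(0.0, noise_sd) for g in genotypes]


def linear_fit(x, y):
    """Ordinary least-squares slope and intercept, plus Pearson r, for a single predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / (sxx * syy) ** 0.5
    return slope, intercept, r


rng = random.Random(0)
# 69 individuals, as in the abstract; the genotypes themselves are random, not HapMap data.
genotypes = [rng.choice([0, 1, 2]) for _ in range(69)]
# Assumed effect of 1.5 expression units per allele copy, with unit-variance noise.
expression = simulate_expression(genotypes, effect=1.5, noise_sd=1.0, rng=rng)

slope, intercept, r = linear_fit(genotypes, expression)
print(f"estimated effect per allele copy: {slope:.2f} (Pearson r = {r:.2f})")
```

A real analysis would repeat a test like this across many SNP and gene pairs and correct for the resulting multiple comparisons, which is where most of the practical difficulty lies.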

The mapping of a genotype to a phenotype through the production of proteins is complex. All the cells in your body have the same set of genes, but they obviously express them differently. If you have a background in biology you will probably recall examples of this in the case of the liver, whose finely tuned balance is essential to our health. But think of something more prosaic: some haplotypes around the HERC2-OCA2 locus seem to correlate with somewhat lighter skin color, and also result in blue eyes. Pigmentation genes seem to vary in how they express (or don’t express) in various tissues, primarily the eyes, skin and hair.

Add to this the tangle that is RNA splicing in eukaryotes, and it gets very complicated indeed. The appeal of Fisherian abstraction is very strong, but after nearly one century of abstracting away the concrete I suspect that to genuinely understand how the tree of life came to be we may have to understand its physical accidents in more depth. The paper finishes with an observation on the importance of SNPs around splice sites:

We proposed that, as in the example described earlier, the mechanism of many of these associations acts through disruption of the splicing machinery. To test this, we extended a Bayesian hierarchical model used previously to include exon-specific effects… This model allows us to estimate the odds ratio for different types of SNPs to affect splicing. First, we considered the binding sites for the U1 small nuclear ribonucleoprotein (snRNP) and U2AF splice factor (of which the canonical splice sites are a part); we found that SNPs throughout these binding sites are highly enriched among sQTLs relative to non-splice site intronic SNPs… We considered whether SNPs within the canonical 2 bp of the splice site alone are enriched for sQTLs; we find that they are… in contrast to previous studies using exon microarrays… Furthermore, SNPs within the spliced exon itself are also significantly enriched among sQTLs and, as expected, non-genic SNPs are markedly under-represented among sQTLs….
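The notion of enrichment in that passage rests on an odds ratio. The authors estimate theirs within a Bayesian hierarchical model, which is not reproduced here; the sketch below shows only the elementary two-by-two version of the same quantity, with counts that are purely hypothetical and chosen for illustration. The function and variable names are likewise my own.

```python
def odds_ratio(hits_in_class, misses_in_class, hits_in_background, misses_in_background):
    """Odds of being an sQTL within a SNP class, divided by the odds in the background class."""
    odds_class = hits_in_class / misses_in_class
    odds_background = hits_in_background / misses_in_background
    return odds_class / odds_background


# Hypothetical counts, NOT taken from the paper:
# (SNPs that are sQTLs, SNPs that are not) for each class.
splice_site_snps = (30, 970)
other_intronic_snps = (100, 99900)

ratio = odds_ratio(*splice_site_snps, *other_intronic_snps)
print(f"odds ratio for splice-site SNPs vs. other intronic SNPs: {ratio:.1f}")
```

An odds ratio above 1 means the class is over-represented among sQTLs relative to the background, and a value below 1 means it is under-represented, which is the sense in which non-genic SNPs are described as under-represented.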

It is not too surprising that the QTLs of note are near locations which we know to be important in a molecular genetic context. Obviously we’ll have to get much further in understanding variation at this level of complexity before we can talk much about evolution. But if we want to understand something like height with any greater depth than Francis Galton did, I suspect that the long climb is just beginning….

Citation: Pickrell, JK et al., Understanding mechanisms underlying human gene expression variation with RNA sequencing, doi:10.1038/nature08872

* I am aware that there were many theories of inheritance between Darwin and Mendelism.

** Not the Sean Carroll, but this Sean Carroll.


Comments (4)

  1. miko

    Just because no one has commented yet on this worthy post, I’ll say this is fantastically concise and right on. I was, however, confused about which Sean Carroll is THE Sean Carroll.

  2. well, since i am now on the same domain as sean carroll the physicist, that’s the sean carroll. i once took a class with a grad student in bio who was named sean carroll. pretty common combination of names i guess, irish origin for both.

    thanks for the props. generally on a lot of the hardcore nerd posts i only get comments if i make a technical error. not that there’s anything wrong with that, i think that’s the nature of the beast. call it “negative response bias” for posts which focus on the scientific literature and don’t have a sexy hook.

  3. J Pickrell

    thanks for the plug. you’re right, our goal is to connect functional variation with specific adaptations and/or overall evolutionary patterns. not quite there yet…




