Burning down the trees in historical population genetics

By Razib Khan | October 27, 2013 5:07 am

The phylogenetic tree is an essential tool in understanding the broad scope of natural history, placing particular lineages in specific evolutionary contexts of relatedness. These sorts of trees range from Ernst Haeckel’s classical attempt, depicting relationships which biologists derived from intuition within the framework of a grand evolutionary scheme, all the way down to modern methods implemented in software packages such as MrBayes, which many frankly utilize in a “turnkey” manner. These trees are abstractions, in that they reduce a wide range of phenomena to schematic representations which impart aspects of particular interest in a stylized form. This is important, because the actual nature of the phenomena may be more complex than the representation suggests. A simple illustration of what I’m getting at is clear when you look at the long history of phylogenetics and phylogeography utilizing mitochondrial DNA (mtDNA) lineages. Because mtDNA is copious in comparison to nuclear DNA, it is easy to obtain. And, as there is no recombination and it is inherited in a haploid fashion (mother to daughter), it makes the inference of gene trees much easier. The key problem is that the genealogy of this particular sequence is used to infer aspects of population history, when it may not represent the history of other regions of the genome very well. Different genes may have different histories.

These issues of conflating the history of genes with the history of populations move further into the foreground the less genetic distance separates the populations you are comparing. Phylogenetic analysis involving distinct species has its own problems, but they are dwarfed by what must confront those who attempt to parse out the relatedness of populations within species. Because of the ubiquity of gene flow across populations within species, any attempt to generate a tree of relationships among populations is bound to be a gross simplification. Instead of a sequence of bifurcations, the true relationship of putative populations is more accurately represented by a networked graph.

Jumping from the theoretical to the concrete, one of the major issues in constructing a sequence of events of the human past which can be used to inform the human present is that a graph relationship is very complex and difficult to tease apart when the tips of your tree are extant populations which are highly admixed. When you try to reconstruct the past from the present, a necessity in phylogenetic analysis which utilizes genetic data (obviously the issues are different if you are focusing on paleontological information), you necessarily gain a blinkered perspective.*

All this came to a head for me when I read the post The First of the Mohicans, which cited a preprint I’d skimmed over earlier in the year, Efficient moment-based inference of admixture parameters and sources of gene flow. It is by its nature a technical paper, but within it is lodged some genuine dynamite. Let me quote:

Our interpretation is that most if not all modern Europeans are descended from at least one large-scale ancient admixture event involving, in some combination, at least one population of Mesolithic European hunter-gatherers; Neolithic farmers, originally from the Near East; and/or other migrants from northern or Central Asia. Either the first or second of these could be related to the “ancient western Eurasian” branch in Figure 5, and either the first or third could be related to the “ancient northern Eurasian” branch. Present-day Europeans differ in the amount of drift they have experienced since the admixture and in the proportions of the ancestry components they have inherited, but their overall profiles are similar.

The result here is outlined graphically in the preprint:


What you see above are two varieties of abstractions which attempt to reconstruct phylogenetic relatedness, and implicitly historical change over evolutionary lineages. To the left is a classical tree, where all the terminal nodes (contemporary populations) are the outcomes of bifurcation events. To the right you have an attempt to produce a more informative representation of the relatedness by drawing out likely admixture events. Here’s the major result: modern Europeans seem to be the products of a major admixture event between a population with roots in northern Eurasia and another with roots in western Eurasia. At the current rate it seems likely that most major world populations are the result of mingling between very distinct populations (to varying extents). In fact, I’m rather certain these sorts of inferences underestimate the extent of admixture, rather than overestimate it. By their nature the methods elide complexity.

The ubiquity of this admixture leaves me a bit chagrined, because since the rise of genome-wide data in the mid-aughts I’ve been reading papers which produce neat trees and elegant admixture bar plots, all the while unable to confront the reality that the abstractions before me were not reflecting what truly transpired over the past ~10,000 years. A world where modern human expansion resulted in isolation of several major lineages from each other from the end of the last Ice Age down to the present never existed. A world where these major lineages were connected by continuous isolation-by-distance dynamics is very misleading. Here is what I think is more accurate: a world where the “tips” of the phylogenetic tree are pruned repeatedly, and populations which are the outcomes of admixture events expand rapidly to fill the emptying space. Neither the “ancient North Eurasians” nor the “ancient South Eurasians” seem to exist in unadmixed form, perhaps with the exception of Andaman Islanders and some populations in the far north of Siberia. This raises the question: do any populations exist in an “unadmixed form”? What does that even mean? The paper I mention above actually does answer the question in a somewhat precise manner. Populations such as the Japanese are useful in forming an unadmixed scaffold after populations identified as admixed are removed using f3 statistics (see Ancient Admixture in Human History). But this is not the last word on whether the Japanese are admixed or not, though it suffices for the purposes of the questions being asked in the paper.
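For readers unfamiliar with the f3 test mentioned above, here is a minimal sketch of the idea, not the pipeline used in either paper. The statistic f3(C; A, B) averages (c - a)(c - b) over loci, where a, b and c are allele frequencies in populations A, B and C; a clearly negative value suggests that C is a mixture of populations related to A and B. Everything in the simulation (the uniform allele frequencies, the 40% mixing proportion, the function names) is an assumption made for illustration, and real analyses add a finite-sample correction and standard errors that are omitted here.

```python
import random


def f3(target, source_a, source_b):
    """Naive f3(target; source_a, source_b): the mean over loci of
    (c - a) * (c - b), computed directly from population allele frequencies.
    The finite-sample bias correction used in practice is omitted."""
    products = [(c - a) * (c - b)
                for c, a, b in zip(target, source_a, source_b)]
    return sum(products) / len(products)


def simulate(n_loci=5000, alpha=0.4, seed=1):
    """Toy data: two source populations with independent, hypothetical
    allele frequencies, and a third population that draws a fraction
    alpha of its ancestry from A and the rest from B (no drift)."""
    rng = random.Random(seed)
    pop_a = [rng.random() for _ in range(n_loci)]
    pop_b = [rng.random() for _ in range(n_loci)]
    admixed = [alpha * a + (1 - alpha) * b for a, b in zip(pop_a, pop_b)]
    return pop_a, pop_b, admixed


pop_a, pop_b, admixed = simulate()

# Target is the mixture: each per-locus term is -alpha*(1-alpha)*(a-b)^2 <= 0.
f3_admixed = f3(admixed, pop_a, pop_b)

# Target is a source: each per-locus term is (1-alpha)*(a-b)^2 >= 0.
f3_source = f3(pop_a, admixed, pop_b)

print("f3(admixed; A, B) =", f3_admixed)
print("f3(A; admixed, B) =", f3_source)
```

In this toy setup the first value comes out negative and the second positive. Note the asymmetry: drift after the admixture event pushes f3 upward, so a non-negative f3 does not show that a population is unadmixed, which is one reason the scaffold is not the last word.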

Where does this leave us? Let’s go back to Europeans. The authors of Efficient moment-based inference of admixture parameters and sources of gene flow assert that pretty much all Europeans exhibit evidence of massive admixture between very distinct lineages. To me this is highly suggestive of events which have roots prior to the Neolithic Revolution. In other words, admixture between west and north Eurasian lineages may have occurred in Europe at the end of the last Ice Age, as the continent was being resettled by hunters from the east and south. Later, Neolithic farmers from the Middle East, related to the west Eurasian population present in Europe during the Pleistocene, added a subsequent layer of west Eurasian ancestry, and to a great extent replaced or absorbed the admixed hunter-gatherers. Finally, it now seems entirely possible that a further wave of migrants from Central Asia, who were also an admixed population, erupted into Europe and replaced or absorbed many of the descendants of the Neolithic farmers.

What we’re confronted with is intellectual rubble; bombs have been dropped all over the landscape. The world is turned upside down. We’ll rebuild, but it’s going to take time. The past was a strange land, far stranger than we’d thought. In science you go for the boring answers as a null, but in this case the boring answers are turning out to be wrong.

* Ancient DNA analysis is changing this somewhat.


Comments (7)

  1. Davidski

    Central Asia appears to be a sink, rather than a source of population movements. Scientists are now realizing this, but more studies like the one below are needed to drive the point home, including a comprehensive paper on R1a using complete Y-chromosome sequences.


    In all likelihood, the present-day European gene pool formed during the late Neolithic with the migrations of pan-European cultures like the Bell Beaker and Corded Ware from Iberia and Eastern Europe, respectively.

    That’s probably why classic Central Asian markers like R1a-L657 are totally missing from Europe, and Volga Russians and Finns show lower Fst to Iberians than to North Caucasians, Iranians and Kazakhs.

  2. Karl Zimmerman

    Doesn’t this sort of cut against the assertion made (more strongly by some commentators than you) that European hunter-gatherers were not racially distinct from the agriculturalists? I mean, considering the most “hunter-gatherer” of modern-day populations tested came out as 40% Ancient North Eurasian, presumably the actual hunter-gatherers were 50% or more Ancient North Eurasian.

    Unless of course the Ancient North Eurasian component was itself enriched after the agriculturalists came (but presumably before the Indo-Europeans) as you suggest. But to me this seems doubtful due to the tight relationship between where the admixture is elevated and the areas the agriculturalists only penetrated later (and left less of a genetic impact).

    I’m also not quite sure how the boy’s remains from Lake Baikal work into this. I mean, I understand the gist, but is the inference that he’s an actual Ancient North Eurasian, or (since he’s related to Europeans and Native Americans, but not Asians) that he himself was a mix between a European-like and a North-Eurasian like population?

    • razibkhan

      i think i objected to the idea that it was on the order of aboriginal vs. european. that’s a bit high. but yeah, it could be high.

      , but is the inference that he’s an actual Ancient North Eurasian,


      • Karl Zimmerman

        That’s interesting about the boy. So if he, as an Ancient North Eurasian, is from a population which contributed around 1/3rd of the modern Native American genome, wouldn’t that also suggest that using his DNA (rather than modern Native Americans) to look for the Ancient North European admixture would turn up even higher values?

        Lipson et al. in their paper do come up with far higher proportions of this admixture than the first attempt. But from what I can gather (as a math challenged individual) they do so mainly because they have the ability to not use Sardinians as a hypothetical “unadmixed” outgroup. Thus they can detect the baseline admixture in all Europeans, and the Northern/Eastern Europeans which before had elevated levels get really elevated levels.

        But they’re still modeling the Native-American population as being ancestral to Native Americans. Not the ancestor of 1/3rd of the Native American genome. Presumably if you were just looking at the Ancient North Eurasian component (if it can now be segregated out) the relation with modern-day North Europeans would be even tighter, with European hunter-gatherers and the Baikal boy converging on being the same population.

  3. andrew oh-willeke

    One has to be careful not to throw out the baby with the bathwater.

    Pre-modern populations may not have been pure genetic isolates from each other, but uniparental genetics, particularly when both NRY and mtDNA data are combined, can together provide quite meaningful estimates of the extent of admixture between populations that can be used to bound parameters for modeling autosomal DNA admixture.

    While the process may be complex and not fully treelike, one shouldn’t understate the extent to which there is discernable population structure, or the possibility that modern population genetics can’t discern anything.

    Also, what is a “blinkered perspective”? I’ve never heard the phrase before.

  4. Karl Zimmerman

    So I saw that Dienekes updated his post with this link to a Science article. It explains some more details, and is worded in such a way to presume the “Ancient Northern Eurasians” were actually Western Eurasian, not Eastern Eurasian or some other, intermediate group.

    More interestingly, one of his commenters claims someone has gotten access to the genome and has done an autosomal breakdown. The boy appears to map as 2/3rds Udmurt (quite strongly) and 1/3rd something from the northern part of South Asia (Brahui, Makrani, Balochi, Sindhi, Burusho, etc). Surprisingly, I don’t actually see Udmurts in Dodecad or anywhere else, so I don’t know how this was estimated. Indeed, it could all be bullshit. Still, it would seem, presuming the data is true, that the Baikal boy was mostly “pure” West Eurasian, albeit of a form not typical today. Thus Europeans aren’t all part “Native American” – instead Native Americans are all part “European.”



Gene Expression

This blog is about evolution, genetics, genomics and their interstices. Please beware that comments are aggressively moderated. Uncivil or churlish comments will likely get you banned immediately, so make any contribution count!

About Razib Khan

I have degrees in biology and biochemistry, a passion for genetics, history, and philosophy, and shrimp is my favorite food. In relation to nationality I'm an American Northwesterner, in politics I'm a reactionary, and as for religion I have none (I'm an atheist). If you want to know more, see the links at http://www.razib.com

