The Pith: The rarer the genetic variant, the more likely that variant is to be specific to a distinct population. Including information about the distribution of these genetic variants, which current techniques miss, can greatly increase the precision of statistical inferences.
A few days ago I mentioned in passing an article in The New York Times which reported on results from a paper illustrating how starkly differentiated populations might be on rare alleles. By rare alleles, I mean genetic variants which are present at very low frequencies. It turns out that many of these low frequency variants are private to particular populations, in contrast to higher frequency variants which span varied human populations. The explanation presented by one of the authors of the referenced paper was that higher frequency variants presumably date back to a time before human populations had become geographically diversified across the world. Shared variants at higher frequencies, then, are shadows of shared past history. In contrast, rare variants are a reflection of more recent events, narrowing the circle of those affected.
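To make the private-versus-shared distinction concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the population labels, the allele counts, the sample size, and the 5% cutoff for "rare" are invented toy values, not numbers or methods from the paper. It tallies how many variants in each frequency bin are carried by only one population.

```python
from collections import defaultdict

# Hypothetical toy data: minor-allele counts per population for six variants.
# These numbers are invented for illustration and are NOT from the paper.
SAMPLE_SIZE = 100  # chromosomes sampled per population (assumed)
allele_counts = {
    "var1": {"pop_A": 1, "pop_B": 0, "pop_C": 0},
    "var2": {"pop_A": 0, "pop_B": 2, "pop_C": 0},
    "var3": {"pop_A": 3, "pop_B": 0, "pop_C": 1},
    "var4": {"pop_A": 40, "pop_B": 35, "pop_C": 50},
    "var5": {"pop_A": 20, "pop_B": 25, "pop_C": 15},
    "var6": {"pop_A": 0, "pop_B": 0, "pop_C": 4},
}


def classify(counts):
    """Return 'private' if exactly one population carries the variant, else 'shared'."""
    carriers = [pop for pop, n in counts.items() if n > 0]
    return "private" if len(carriers) == 1 else "shared"


def overall_frequency(counts):
    """Pooled minor-allele frequency across all sampled populations."""
    return sum(counts.values()) / (SAMPLE_SIZE * len(counts))


def summarize(variants, rare_threshold=0.05):
    """Count private and shared variants within 'rare' and 'common' frequency bins."""
    summary = defaultdict(lambda: {"private": 0, "shared": 0})
    for counts in variants.values():
        if sum(counts.values()) == 0:
            continue  # skip sites where no population carries the variant
        freq_bin = "rare" if overall_frequency(counts) < rare_threshold else "common"
        summary[freq_bin][classify(counts)] += 1
    return {freq_bin: dict(tally) for freq_bin, tally in summary.items()}


if __name__ == "__main__":
    print(summarize(allele_counts))
```

In this made-up data set the rare bin is dominated by private variants while the common bin holds only shared ones, which is the qualitative pattern described above. Real data would of course require the statistical machinery the paper itself develops.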
I have now read the paper in question, Demographic history and rare allele sharing among human populations. From what I can gather The New York Times article was really an elaboration upon some of the issues which were highlighted in the discussion. The “meat” of the paper in terms of methods and results is actually rather technical and deeply embedded in the language of mathematical statistics. For example:
After further consideration, I have decided that I shall spare you my own clumsy exposition in plain English as to the details of site frequency spectrum calculations. There are after all enough points of interest in the paper at which I can throw my verbal talents more effectively. First, the abstract:
Recently a friend got his 23andMe genotype results, and was wondering if there was something he could do for the “greater good.” I told him that he should throw his genotype out to the public domain and attach his name to it. For various reasons he declined to go that far, but he did consent to my putting his genotype online without personal identifying information. I can tell you that he is a relatively young male of 100% (to his knowledge) Ashkenazi Jewish heritage.
You can get a zipped folder with the raw text file and a binary pedigree formatted file here. If you click the free download option after 30 seconds you’ll get the file within about 5 minutes on a broadband connection (that was my experience at least).
If you want to throw your genotype to the public domain too, with as much or as little information as you want, just email me at contactgxnp -at- gmail -dot- com. Here’s a spreadsheet with other people who have put their genotypes online. I want to put up a “roundup” post with a bunch of people who do just that in the near future.
… my father’s father is Latvian, and the N1 haplogroup is not rare in the Baltic regions. In fact, the subgroup, N1c1, is more common in parts of Eastern Europe than it is in Asia.
Initially, this seemed to play nicely into a part of our ancient family history. There is a folk history, relayed to me by my Dad and my uncle Johnny, that Jostins blood may contain traces of Mongolian. The justification for this is that in around 1260, just before the civil war caused the Mongol Empire to die back in Europe, the Empire extended all the way to the Baltic States. It was at this point, my fellow N1c1-bearers hypothesise, that Mongolian DNA entered the Jostins line.
Unfortunately, on closer inspection this tale is not really supported by the DNA evidence. The famous Mongol Expansion haplogroup is actually C3, which is the modal haplogroup of Mongolians. In contrast, N1c1 has existed in Europe for thousands of years, and is far too old and too widespread to represent a recent expansion.
To the left is a frequency map of the concentration of N1c1. Based on the current distribution, and the diversity being modal in the East Baltic, one has to be skeptical of a simple east-west model. Interestingly, the frequency difference of this haplogroup between Finland and Sweden is very high. Also, a branch of N1c1 seems to be found among the Rurikids of Russia. This was the ruling dynasty of the Rus, a people who originally seem to have been ethnic Scandinavians from Sweden. Eventually they ruled over a polyglot state of Finns, Slavs and Scandinavians, and submerged their own identity in that of the Slavic peasants. In this they followed the example of the Bulgars, who were ethnically distinct from their Slavic subjects, but were totally absorbed, except that their ethnonym persisted. There is some evidence that the Serbs are a similar case, an Iranian group which was eventually absorbed into the South Slav substrate.