There’s a new paper in AJHG which caught my eye, The Basque Paradigm: Genetic Evidence of a Maternal Continuity in the Franco-Cantabrian Region since Pre-Neolithic Times (ungated). The first thing you need to know about this paper is that it focuses on only the direct maternal lineage of Basques via the mtDNA. In some ways this is weak tea, since it doesn’t give us a total genome estimate. But there are major upsides to mtDNA and Y. First, because of the lack of recombination it is relatively easy to generate a nice phylogenetic tree using a coalescent model. And second, for mtDNA the molecular clock is considered relatively reliable.

In this specific paper they also expanded the scope of their analysis to the whole mtDNA sequence, instead of just the hypervariable region. Not only did they look at whole sequences, but they also had an enormous sample size. They sequenced over 400 mtDNA genomes from the Basque country and neighboring regions. Haplogroup H peaks in frequency among Basques, and drops off among their neighbors (Gascons, Spaniards, etc.). Because the Basque speak a non-Indo-European language they are usually presumed to be indigenous in relation to their neighbors (or at least more indigenous). Until recently there was a strong presupposition that the Basque were ideal representatives of the pre-Neolithic populations of Western Europe. One common method of analysis would be to use the Basque as a pre-Neolithic “reference,” and simply estimate the impact of a Neolithic demographic wave of advance by using an eastern Mediterranean population as a second “reference” within an admixture framework. But more recent work has muddled the idea that the Basque are the descendants of Paleolithic Europeans. Finally, I suspect we’ll also have to acknowledge complexity in demographic histories. To say that the Basque exhibit continuity with Mesolithic Iberians may not contradict a substantial Neolithic contribution. South Asians for example are one numerous modern group which exhibits sharply divergent affinities if you use Y chromosomes (West Eurasian) or mtDNA (not West Eurasian). Why? The details are prehistorical.

The major takeaway from this paper is that the Basque mtDNA exhibit a pattern of demographic expansion ~4,000 years BP, and ~8,000 years BP. But I think it is important to look at the range of outcomes over their confidence intervals, so I’ve reproduced their second table below:

Table 2. Time Estimates of the Six Autochthonous Haplogroups

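| Haplogroup | N | Percentage | Rho | Standard Error | Age (in Years) | 95% Confidence Interval |
|---|---|---|---|---|---|---|
| Coalescence Age | | | | | | |
| H1j1 | 52 | 12.4% | 1.86 | 0.49 | 4845 | 2324 to 7408 |
| H1t1 | 34 | 8.1% | 1.94 | 0.97 | 5057 | 99 to 10176 |
| H2a5a1 | 22 | 5.2% | 1.33 | 0.65 | 3422 | 118 to 6800 |
| H1av1 | 17 | 4.0% | 1.24 | 0.52 | 3213 | 567 to 5906 |
| H3c2a | 14 | 3.3% | 1.27 | 0.37 | 3291 | 1403 to 5204 |
| H1e1a1 | 12 | 2.9% | 1.23 | 0.72 | 3187 | −464 to 6927 |
| Splitting Age | | | | | | |
| H1j1 | 52 | 12.4% | 2.86 | 1.11 | 7514 | 1764 to 13470 |
| H1t1 | 34 | 8.1% | 2.94 | 1.39 | 7730 | 554 to 15227 |
| H2a5a1 | 22 | 5.2% | 2.33 | 1.19 | 6094 | −6 to 12434 |
| H1av1 | 17 | 4.0% | 2.24 | 1.13 | 5854 | 65 to 11860 |
| H3c2a | 14 | 3.3% | 2.27 | 1.07 | 5934 | 443 to 11619 |
| H1e1a1 | 12 | 2.9% | 5.23 | 2.13 | 14011 | 2729 to 26000 |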

For our purposes the splitting age is important, because it shows when the Basque-specific H lineages diverged from other European H lineages. Some of the intervals are huge (look at H1e1a1), so I don’t know what to make of them. I’ll leave further comments to those better versed in the mtDNA literature, but I would like to say that it is important to remember that we don’t know where the inferred demographic events occurred. It may not have been in the trans-Pyrenees region at all.
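To make the point about interval size concrete, here is a minimal sketch that takes the splitting-age rows from Table 2 and computes the width of each 95% interval, both in years and relative to the point estimate. The numbers are transcribed from the table; the function and field names are just illustrative choices, not anything from the paper.

```python
# Splitting-age rows transcribed from Table 2:
# (haplogroup, point estimate, CI lower bound, CI upper bound), all in years.
SPLITTING = [
    ("H1j1", 7514, 1764, 13470),
    ("H1t1", 7730, 554, 15227),
    ("H2a5a1", 6094, -6, 12434),
    ("H1av1", 5854, 65, 11860),
    ("H3c2a", 5934, 443, 11619),
    ("H1e1a1", 14011, 2729, 26000),
]


def summarize(rows):
    """Return one dict per haplogroup describing how wide its interval is."""
    summary = []
    for name, age, low, high in rows:
        width = high - low
        summary.append(
            {
                "haplogroup": name,
                "width": width,  # absolute width in years
                "relative_width": width / age,  # width as a multiple of the estimate
                "lower_bound_nonpositive": low <= 0,  # interval reaches the present
            }
        )
    return summary


if __name__ == "__main__":
    for row in summarize(SPLITTING):
        print(
            f"{row['haplogroup']:<8} width={row['width']:>6} yr  "
            f"relative={row['relative_width']:.2f}  "
            f"reaches_present={row['lower_bound_nonpositive']}"
        )
```

Run this way, H1e1a1 has the widest interval in absolute terms, but every splitting-age interval spans roughly 1.5 to 2 times its own point estimate, so none of these dates is tightly constrained.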

More later.

  1. Can you send me a copy, Razib? A reader told me he would but nothing so far. Thanks in advance.

    My greatest caveat is that all this is based on ‘molecular clock’ speculations that are quite pseudoscientific and/or have more holes in their founding models than a Gruyère cheese (for example the insistence on using Pan-Homo divergence times that may be as little as half of realistic estimates). So I’m not really interested in their MC estimates, which invariably range from very wrong to extremely disparaging, but in haplogroup lists and frequencies among Basques and neighbors (if neighbors are considered).


  2. As a reference point, there is archaeological evidence of farming and herding in Iberia dating back to at least 7400 years BP, while refugium populations would have origins in the region prior to 20,000 years BP. The distribution of split dates is not a good fit to a Paleolithic source in a refugium population but is a good fit to split dates associated with an early Neolithic migration. H1e1a1 has the youngest coalescent date and the oldest split date, suggesting that a very small number of individuals with this sole refugium-era origin at the time of population expansion are the source of this private haplogroup. And the numbers in the expansion-era population are small enough that an origin in an Italian or Caucasian refugium isn’t all that much less likely than a SW European refugium.


