The genomic heritage of French Canadians

By Razib Khan | January 24, 2011 1:02 am

Image Credit: Anirudh Koul

One of the great things about the mass personal genomic revolution is that it allows people to have direct access to their own information. This is important for the more than 90% of the human population which has sketchy genealogical records. But even with genealogical records there are often omissions and biases in transmission of information. This is one reason that HAP, Dodecad, and Eurogenes BGA are so interesting: they combine what people already know with scientific genealogy. This intersection can often be very inferentially fruitful.

But what if you had a whole population with rich, robust conventional genealogical records? Combined with the power of the new genomics you could really crank up the level of insight. Where to find these records? A reason that Jewish genetics is so useful and interesting is that there is often a relative dearth of records when it comes to the lineages of American Ashkenazi Jews. Many American Jews even today are often sketchy about the region of the “Old Country” from which their forebears arrived. Jews have been interesting from a genetic perspective because of the relative excess of ethnically distinctive Mendelian disorders within their population. There happens to be another group in North America with the same characteristic: the French Canadians. And importantly, in the French Canadian population you do have copious genealogical records. The origins of this group lie in the 17th and 18th centuries, and the Roman Catholic Church has often been a punctilious institution when it comes to preserving records of events under its purview such as baptisms and marriages. The genealogical archives are so robust that last fall a research group input centuries of ancestry for ~2,000 French Canadians, and used it to infer patterns of genetic relationships as a function of geography, as well as long term contribution by provenance. Admixed ancestry and stratification of Quebec regional populations:

Population stratification results from unequal, nonrandom genetic contribution of ancestors and should be reflected in the underlying genealogies. In Quebec, the distribution of Mendelian diseases points to local founder effects suggesting stratification of the contemporary French Canadian gene pool. Here we characterize the population structure through the analysis of the genetic contribution of 7,798 immigrant founders identified in the genealogies of 2,221 subjects partitioned in eight regions. In all but one region, about 90% of gene pools were contributed by early French founders. In the eastern region where this contribution was 76%, we observed higher contributions of Acadians, British and American Loyalists. To detect population stratification from genealogical data, we propose an approach based on principal component analysis (PCA) of immigrant founders’ genetic contributions. This analysis was compared with a multidimensional scaling of pairwise kinship coefficients. Both methods showed evidence of a distinct identity of the northeastern and eastern regions and stratification of the regional populations correlated with geographical location along the St-Lawrence River. In addition, we observed a West-East decreasing gradient of diversity. Analysis of PC-correlated founders illustrates the differential impact of early versus latter founders consistent with specific regional genetic patterns. These results highlight the importance of considering the geographic origin of samples in the design of genetic epidemiology studies conducted in Quebec. Moreover, our results demonstrate that the study of deep ascending genealogies can accurately reveal population structure.

That paper found that nearly 70% of the immigrant founding stock in this data set came directly from France. For the period before 1700 that fraction exceeds 95%. Of the remainder, about 15% of the founding stock were Acadians, who themselves were presumably mostly of French origin. Because of the earlier migration of the French founding stock, they left a stronger impact on future generations:

Much of the difference here is because earlier ancestors in a population which went through demographic expansion would have more of an impact on the nature of the population than later contributors (the earlier ancestors would show up in many more downstream genealogies). But notice that the Amerindians in the pool are a much larger proportion of ancestors than their final genetic contribution (50% of the French Canadians had at least one Amerindian ancestor). I suspect this may be due to differential fertility because of variation in social status by race (i.e., mixed-race French Canadians having lower fertility, perhaps by way of their exclusion from highly fecund elite families), and not just later absorption of Amerindians than French (on the contrary, I suspect that Amerindians were assimilated earlier, not later).
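To make the idea of a founder's "genetic contribution" concrete, here is a minimal Python sketch. It assumes the usual expectation that each parent passes on half of a child's genome, so a founder's expected contribution to a descendant is the sum of (1/2)^generations over every line of descent. The pedigree, the names, and the helper functions are all hypothetical illustrations, not the paper's data or software.

```python
# Hypothetical toy pedigree: child -> (parent, parent). Founders have no entry.
# E1 and E2 are "early" founders; L1..L4 are "late" founders who marry in.
PEDIGREE = {
    "K1": ("E1", "E2"), "K2": ("E1", "E2"),
    "K3": ("E1", "E2"), "K4": ("E1", "E2"),
    "S1": ("K1", "L1"), "S2": ("K2", "L2"),
    "S3": ("K3", "L3"), "S4": ("K4", "L4"),
}
SUBJECTS = ["S1", "S2", "S3", "S4"]  # present-day individuals


def genetic_contribution(founder, person, pedigree):
    """Expected fraction of `person`'s genome that traces back to `founder`."""
    if person == founder:
        return 1.0
    parents = pedigree.get(person)
    if parents is None:  # a different founder: no path back to `founder`
        return 0.0
    # Each parent transmits half; sum over both lines of descent.
    return sum(0.5 * genetic_contribution(founder, p, pedigree) for p in parents)


def mean_contribution(founder, subjects, pedigree):
    """Average contribution of `founder` across a set of present-day subjects."""
    total = sum(genetic_contribution(founder, s, pedigree) for s in subjects)
    return total / len(subjects)


early = mean_contribution("E1", SUBJECTS, PEDIGREE)  # 0.25
late = mean_contribution("L1", SUBJECTS, PEDIGREE)   # 0.125
```

In this toy case E1 is one of six founders by headcount, yet accounts for a quarter of the subjects' gene pool, because E1 sits in every subject's genealogy while L1 sits in only one. A founder whose descendants had fewer children would show the opposite pattern: present in the list of ancestors, but with a small contribution.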

But this research did not look directly at genetics. Rather, these inferences were generated from genealogical records which go back to the founding of Quebec and maintained coherency and integrity from generation to generation. Some of the members of the same research group now have a paper out which looks at the genomics of French Canadians, and directly compares their results to those of the earlier paper. Genomic and genealogical investigation of the French Canadian founder population structure:

Characterizing the genetic structure of worldwide populations is important for understanding human history and is essential to the design and analysis of genetic epidemiological studies. In this study, we examined genetic structure and distant relatedness and their effect on the extent of linkage disequilibrium (LD) and homozygosity in the founder population of Quebec (Canada). In the French Canadian founder population, such analysis can be performed using both genomic and genealogical data. We investigated genetic differences, extent of LD, and homozygosity in 140 individuals from seven sub-populations of Quebec characterized by different demographic histories reflecting complex founder events. Genetic findings from genome-wide single nucleotide polymorphism data were correlated with genealogical information on each of these sub-populations. Our genomic data showed significant population structure and relatedness present in the contemporary Quebec population, also reflected in LD and homozygosity levels. Our extended genealogical data corroborated these findings and indicated that this structure is consistent with the settlement patterns involving several founder events. This provides an independent and complementary validation of genomic-based studies of population structure. Combined genomic and genealogical data in the Quebec founder population provide insights into the effects of the interplay of two important sources of bias in genetic epidemiological studies, unrecognized genetic structure and cryptic relatedness.

In 1760 there were 70,000 residents in the areas of Canada which were under French rule. A substantial fraction of these derived from the much smaller 17th century founding population. Today the number of North Americans with some known French Canadian ancestry is around 10 million. I happen to know an individual whose great-great-grandmother was French Canadian. Using the internet, I found that I could trace this woman’s ancestry along one line back to the countryside outside of Poitiers in the mid 16th century! Being conservative, it seems that at least 5 million North Americans have overwhelming descent from the 1760 founding stock. These are the core French Canadians.

An immediate inference one might make from these background facts, the rapid expansion of the French Canadian ethnic group from a small core founding stock, is that they would have gone through a “population bottleneck.” The data here are mixed. On the one hand, there are particular Mendelian diseases associated with French Canadians. This is evidence of some level of inbreeding which would randomly increase the frequencies of deleterious recessively expressed alleles. And yet as noted in the paper French Canadians do not seem to have lower genetic diversity than the parental stock of French in the HGDP data set. Why? Because to go through a population bottleneck which is genetically significant you need a very small census size indeed. Tens of thousands is large enough to preserve most of the genetic variation in the founder population which is not private to families, which is the sort of genetic polymorphism that would have been typed on widely distributed SNP chips.
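A rough way to see why tens of thousands of founders is not much of a bottleneck is the textbook drift expectation for an idealized, randomly mating population: heterozygosity is expected to shrink by a factor of (1 - 1/(2N)) per generation, where N is the effective population size. The sketch below assumes that idealized model. The function name and the example sizes are illustrative, and real census sizes usually overstate effective size.

```python
def heterozygosity_retained(effective_size, generations):
    """Expected fraction of heterozygosity remaining after `generations`
    of drift at a constant effective population size (idealized model)."""
    if effective_size <= 0 or generations < 0:
        raise ValueError("effective_size must be positive, generations non-negative")
    return (1.0 - 1.0 / (2.0 * effective_size)) ** generations


# Illustrative comparison over ten generations (sizes are assumptions):
large_founding_stock = heterozygosity_retained(10_000, 10)  # loses well under 0.1%
tiny_isolate = heterozygosity_retained(10, 10)              # loses roughly 40%
```

Under these assumptions a founding stock in the thousands keeps essentially all of its common variation, while a group of ten would lose a large share of it, which matches the intuition that only very small groups go through a genetically significant bottleneck.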

But that’s not the end of the story. Though French Canadians don’t seem to exhibit the hallmarks of having gone through an extreme population bottleneck as an aggregate, it turns out that in the populations surveyed there was evidence of substructure. The map to the left shows you the regions where the samples were drawn. Unlike the earlier study the sample size is smaller; this is a nod to the difference between a purely genealogical study and a genomic one. There needs to be money and time invested in typing individuals. Relatively public genealogical records are a different matter. Apparently the Gaspesia sample population was from a relatively later settlement. The urban samples naturally include descendants of local French Canadians, as well as rural to urban transplants.

As one would expect the French Canadian sample clustered with the CEU (Utah whites from the HapMap) and French (from the HGDP) in the world wide PCA. And not surprisingly they exhibited smaller genetic distance to the French than to the Utah whites (who were of mostly British extraction). Using Fst, which measures the extent of genetic variance partitioning between populations, the value for the aggregate French Canadian sample against the CEU sample was 0.0014, and against the French HGDP sample 0.00078. The Montreal French Canadian group exhibited values of 0.0020 and 0.0012. But it is important to observe that there were statistically significant differences between the various French Canadian populations as well (excluding the Montreal-Quebec City pairing). This may explain the existence of particular Mendelian diseases in the French Canadian population despite their lack of reduced genetic variation: there are localized pockets of inbreeding which are not smoked out by looking at total variation statistics. Additionally, the authors conclude that not taking this substructure into account in medical genetics could lead to false positives. Inter-population differences in disease susceptibilities correlated with genome-wide differences in allele frequencies could produce spurious associations.
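For readers who want a feel for what an Fst on the order of 0.001 means, here is a small Python sketch of one common textbook form of the statistic for two populations: the fraction of total expected heterozygosity that is due to allele frequency differences between them, pooled across loci. This is an assumption for illustration only. The paper's exact estimator is not described here, the function name is hypothetical, and the sketch ignores sample-size corrections.

```python
def fst_two_populations(freqs_a, freqs_b):
    """Fst for two equally weighted populations from per-locus allele
    frequencies, computed as a ratio of sums across biallelic loci."""
    if len(freqs_a) != len(freqs_b):
        raise ValueError("both populations need a frequency for every locus")
    between = 0.0  # summed (H_T - H_S)
    total = 0.0    # summed H_T
    for p_a, p_b in zip(freqs_a, freqs_b):
        p_bar = (p_a + p_b) / 2.0
        h_total = 2.0 * p_bar * (1.0 - p_bar)                        # pooled heterozygosity
        h_within = (2.0 * p_a * (1 - p_a) + 2.0 * p_b * (1 - p_b)) / 2.0  # mean within-pop
        between += h_total - h_within
        total += h_total
    return between / total if total > 0 else 0.0


# Frequencies differing by a couple of percentage points give a tiny Fst.
similar = fst_two_populations([0.50, 0.30], [0.52, 0.31])
# Populations fixed for different alleles give the maximum value of 1.
fixed = fst_two_populations([1.0], [0.0])
```

The point of the toy numbers is scale: allele frequency differences of one or two percentage points per locus already produce values in the range reported above, so these are very closely related populations.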

The population substructure can also be elucidated by extraction of the independent components of variance on a plot, as you can see to the left. Panel A represents PCA of genomic data, while panel B is an MDS derived from genealogical data. The gist here is that you’re seeing the two biggest independent dimensions of variance in each data set (these dimensions explain only a few percent of the total variance). Each individual color represents a French Canadian subpopulation. It is clear that there is substructure. Individuals from each group tend to cluster with individuals from their own subpopulation. The authors take this to confirm the Fst values earlier. But to me another interesting aspect is the difference between the genomic and genealogical visualizations. The genealogical visualization looks far “cleaner” to me than the genomic visualization. Why? Genealogical records are imperfect. The rough congruence validates that the Roman Catholic Church in Quebec didn’t make records out of whole cloth, but there were likely fudges, guesses, and deceptions on the margins. One thing to remember is that even if some of the difference is due to issues with paternity, much of that sort of thing would still be within population. Of course I’m looking at this somewhat glass-half-empty. The rough congruency could be seen as a validation of the robustness of the record-keeping of French Canadian institutions over all these centuries. When there isn’t genetic data, one can use genealogical data as a substitute. At least to a rough approximation.
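The genealogical panel has to start from some measure of relatedness computed purely from records. The earlier paper's abstract mentions pairwise kinship coefficients, so the sketch below assumes that is the kind of input involved and shows the standard recursive definition: the kinship of two people is the probability that an allele drawn at random from each is identical by descent. The pedigree and function names are hypothetical, and the MDS step itself is not shown.

```python
from functools import lru_cache

# Hypothetical toy pedigree: child -> (father, mother). Founders have no entry.
PEDIGREE = {
    "K1": ("E1", "E2"), "K2": ("E1", "E2"),  # full siblings
    "C1": ("K1", "L1"), "C2": ("K2", "L2"),  # first cousins
}


def make_kinship(pedigree):
    """Return a memoized kinship(i, j) function for the given pedigree."""

    @lru_cache(maxsize=None)
    def depth(person):
        # Founders are depth 0; everyone else is one below their deepest parent.
        parents = pedigree.get(person)
        return 0 if parents is None else 1 + max(depth(p) for p in parents)

    @lru_cache(maxsize=None)
    def kinship(i, j):
        # Always expand the deeper individual, who cannot be an ancestor of the other.
        if depth(i) < depth(j):
            i, j = j, i
        parents = pedigree.get(i)
        if parents is None:  # both are founders, assumed unrelated and non-inbred
            return 0.5 if i == j else 0.0
        father, mother = parents
        if i == j:
            return 0.5 * (1.0 + kinship(father, mother))
        return 0.5 * (kinship(father, j) + kinship(mother, j))

    return kinship


kinship = make_kinship(PEDIGREE)
siblings = kinship("K1", "K2")  # 0.25
cousins = kinship("C1", "C2")   # 0.0625
```

Filling a matrix with these values for every pair of subjects gives the kind of genealogical distance structure that a scaling method can then project onto two axes. Errors in the records (a misattributed father, say) change individual entries, which is one way the genealogical and genomic pictures can drift apart.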

In the final section the paper notes that there are some peculiarities in the genetics of the French Canadians which do indicate some level of genetic homogeneity, at least by locality. To explore this issue they focus on two genomic phenomena which measure correlations of alleles, genetic variations, over spans of the genome within populations. The two phenomena are linkage disequilibrium, which measures association across loci of particular variants, and runs-of-homozygosity, which highlights genomic regions where homozygosity seems enriched beyond expectation (the former is inter-locus, while the latter is intra-locus). Both of these values could be indicators of some level of population bottleneck or substructure, where stochastic evolutionary forces shift a population away from equilibrium as measured by the balance of parameters such as drift, selection, and mutation.
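Linkage disequilibrium is easier to grasp with numbers. A widely used summary is r², built from D, the difference between the observed frequency of a two-locus haplotype and the frequency expected if the alleles were independent. The Python sketch below assumes phased haplotypes coded as 0/1 pairs. The function name and the toy data are illustrative, and which LD measure the paper plotted is not stated here.

```python
def ld_r_squared(haplotypes):
    """r^2 between two biallelic loci from phased haplotypes.

    `haplotypes` is a list of (allele_at_locus_1, allele_at_locus_2) pairs,
    each allele coded 0 or 1.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("need at least one haplotype")
    p_a = sum(a for a, _ in haplotypes) / n                             # freq of 1 at locus 1
    p_b = sum(b for _, b in haplotypes) / n                             # freq of 1 at locus 2
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n       # freq of the 1-1 haplotype
    d = p_ab - p_a * p_b                                                # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a locus is monomorphic")
    return d * d / denom


# Alleles always travel together: complete LD.
complete = ld_r_squared([(1, 1), (1, 1), (0, 0), (0, 0)])  # 1.0
# All four combinations equally common: linkage equilibrium.
none = ld_r_squared([(1, 1), (1, 0), (0, 1), (0, 0)])      # 0.0
```

A recent bottleneck or local inbreeding leaves fewer distinct haplotypes in circulation, which pushes r² up even for markers far enough apart that recombination would normally have erased the association.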

To the right is a mashup of figures 5 and 6. On the left you have a figure which shows the extent of linkage disequilibrium as a function of distance between SNPs. As you would expect the greater the distance between two SNPs, the more likely they’re to be in equilibrium as recombination has broken apart associations. The closer and closer two markers, the more likely they’re to be linked, physically and statistically. But there’s a difference between the two LD plots. There’s no difference between the CEU and French Canadian samples in the top panel, but there is in the bottom one. Why? The bottom panel shows LD between markers much further apart. Acadians in particular seem to exhibit more long distance LD than the other populations. This may be a sign of a population bottleneck and inbreeding. Also, please note that the Utah white CEU sample is probably relatively similar to the French Canadians in its demographic history as North American groups go. It is homogeneous and expanded rapidly from a small founder group. To the right you have in the top panel total length of ROH per individual, and in the bottom panel the length of ROH greater than 1 Mb. Again, the Acadians seem to be standouts in terms of their difference from the CEU reference. Interestingly, there’s no difference between CEU, French, and the two French Canadian urban samples. I suspect this is due to the fact that in Montreal and Quebec City the distinctive inbreeding found in the other samples has been eliminated through intermarriage. ROH disappear when you introduce heterozygosity through outbreeding.
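Runs of homozygosity can be illustrated with a very simple scan: walk along a chromosome and record stretches of consecutive homozygous SNPs that span at least some minimum physical length. The sketch below is a deliberately strict simplification. Real tools typically tolerate a few heterozygous or missing calls inside a run, and the thresholds, names, and toy data here are assumptions rather than the paper's settings.

```python
def runs_of_homozygosity(positions, genotypes, min_length_bp=1_000_000, min_snps=3):
    """Find runs of consecutive homozygous SNPs.

    positions: sorted base-pair coordinates, one per SNP.
    genotypes: alternate-allele counts per SNP (0 or 2 = homozygous, 1 = heterozygous).
    Returns a list of (start_bp, end_bp) tuples for runs meeting both thresholds.
    """
    if len(positions) != len(genotypes):
        raise ValueError("positions and genotypes must have the same length")
    runs = []
    start = None  # index of the first SNP in the current homozygous run
    # A trailing heterozygous sentinel closes any run that reaches the last SNP.
    for i, g in enumerate(list(genotypes) + [1]):
        if g != 1 and start is None:
            start = i
        elif g == 1 and start is not None:
            end = i - 1
            length = positions[end] - positions[start]
            if length >= min_length_bp and (end - start + 1) >= min_snps:
                runs.append((positions[start], positions[end]))
            start = None
    return runs


# Toy chromosome: a 1.2 Mb homozygous stretch, one heterozygous SNP, then a short run.
positions = [0, 400_000, 800_000, 1_200_000, 1_300_000, 1_400_000, 1_500_000]
genotypes = [0, 0, 2, 0, 1, 2, 2]
roh = runs_of_homozygosity(positions, genotypes)   # [(0, 1200000)]
total_roh_bp = sum(end - start for start, end in roh)
```

A single heterozygous call is enough to break a run in this version, which is the mechanism behind the last point above: outbreeding scatters heterozygous sites through the genome and long runs stop being detectable.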

What has all this told us? From a medical genetic perspective it implies that population structure matters when evaluating French Canadians: an Acadian is not interchangeable with a native of Montreal. In terms of ethnically clustered diseases of French Canadians (in the USA, the Cajuns), it may not be that there are patterns across the whole ethnic group, but trends within subgroups characterized by long-term endogamy. I wonder if the same might be true of Ashkenazi Jews. Is there a difference between Galicians and Litvaks? Such regional differences among European Jews are new, but the French Canadians themselves are the result of the past three centuries. These results also seem to reinforce the Frenchness of the French Canadians. Years ago I skimmed a book on the cultural history of the people of Quebec, and the author went to great lengths to emphasize the amalgamative power of the French Catholic identity in Canada, arguing that to some extent the roots of the community in the colonial era were something of an overblown myth. These results come close to rejecting that view, in particular the first paper, which shows the disproportionate impact that earlier settler waves have on the long term demographics of a population. A group which one could analyze in a similar vein would be the Boers, who are an amalgam of French Protestants, Dutch, and Germans, but seem to exhibit a dominance of the Dutch element culturally.

Finally, the French Canadians may give us a small window into the long term demographic patterns and genetic dynamics which might be operative on a nearby ethnic group: the Puritans of New England. Because of their fecundity it seems likely that tens of millions of Americans today descend from the 30,000 or so English settlers who arrived in New England in the two decades between 1620 and 1640. This is the subject of the Great Migration Project. With numbers in the few tens of thousands it seems unlikely that much of a thorough population bottleneck occurred with this group in a genetic sense in the aggregate. But the results from the French Canadians indicate that isolated groups can be subject to stochastic dynamics, and develop in their own peculiar directions.

Citation: Bherer C, Labuda D, Roy-Gagnon MH, Houde L, Tremblay M, & Vézina H (2010). Admixed ancestry and stratification of Quebec regional populations. American journal of physical anthropology PMID: 21069878

Citation: Roy-Gagnon MH, Moreau C, Bherer C, St-Onge P, Sinnett D, Laprise C, Vézina H, & Labuda D (2011). Genomic and genealogical investigation of the French Canadian founder population structure. Human genetics PMID: 21234765

CATEGORIZED UNDER: Genetics, Genomics, History, Human Genetics

Comments (17)

  1. Hey Razib, I’m interested in what you mean by this:

    “I suspect this may be due to differential fertility because of variation in social status by race (i.e., mixed-race French Canadians having lower fertility, perhaps by way of their exclusion from highly fecund elite families), and not just later absorption of Amerindians than French (on the contrary, I suspect that Amerindians were assimilated earlier, not later).”

    Are you saying that they have lower fertility rates, or less children, or less reproductive children? Or is there a difference between those? I read pretty much all of these posts but I’m not an expert in this field in the least, so sorry if this is a dumb question. Thanks.

  2. JL

    The Montreal French Canadian group exhibited values of 0.20 and 0.12.

    Those numbers are missing some zeros, I think.

  3. Are you saying that they have lower fertility rates, or less children, or less reproductive children? Or is there a difference between those? I read pretty much all of these posts but I’m not an expert in this field in the least, so sorry if this is a dumb question. Thanks.

    they have fewer descendants than they “should.” so my supposition is that there were fertility differences associated with class, and mixed-race french canadians were of a class which had lower fertility rates. imagine, for example, that farm laborers tended to marry amerindian women. their children might have lower reproductive fitness in a pre-modern epoch.

  4. This data set rivals the Icelandic data set by allowing us to link long genealogical lines to current gene pools to infer very old gene pool composition, and to get a sense of how genetic lines are preserved over long periods that are relatively tranquil genetically to get baseline empirical constants for situations where we don’t know the facts as well.

    One particularly fruitful possibility from the Montreal French Canadian gene pool plus lineage data is the possibility of estimating the amount (i.e. 0.2% of current and 1.2% of original) of Native American admixture, and the character of Native American introgression into it. Since any introgression from the Leif Erikson era or early colonial era of Native American genetics into Eurasia across the Atlantic probably would have been similar genetically to introgression into the Montreal French Canadian population, it may help resolve the extent to which we are seeing North American admixture that returns to Europe via the sea, or circumpolar admixture from the East where we see Native American looking genes in Northern Europe.

    Another interesting trend in the data is the strong selection against outgroups generally within the founder population over subsequent generations. The founding population was 81.8% French or Acadian. Today that is 92.9%. The share of outgroups fell from 18.2% to 7.1% over less than 300 years. Selection against minorities in founding groups may be a general trend in a lot of populations.

  5. Stephen

    New England is different in quite a few ways. Only it had such expansionist political hegemony. Yet today in New England, only something like 9 percent of people claim English ancestry–a far lower proportion than for Spanish in northern New Mexico, or French in French Canada. New England may have been “relatively tranquil genetically” until the mid-19th century, but only for certain classes. Much of the U.S. is from the Puritan Great Migration in certain lineages, but genomically, I think that contribution is fairly small today.

  6. stephen, i think 9 percent is an underestimate. there’s a social science trend of people picking their most exotic ancestors when giving ethnicity for whites. english is always the least exotic, so it always loses. also, in the south there is a tendency to give ‘american.’ that doesn’t seem to happen in new england as much.

  7. Stephen

    @6. Yep, I agree, Razib. Then again, the desire to be exotic versus the desire to be hyper-traditional (e.g. “Mayflower”) has gone through abrupt generational shifts. Teddy Roosevelt was deeply concerned that the Anglo-Saxons were being replaced (google “race suicide”) while much of what happened was admixture with new immigrants. Partly, TR may have been confounding his class with his race.

  8. AG

    On PBS, there was a show called "WHO DO YOU THINK YOU ARE". Some of them thought of themselves as German Americans. In the end, their earliest American ancestors were English. Yet that English ancestry was diluted by later German immigrants and forgotten.

  9. Scott

    I wonder if mixed-race pairings resulted in higher mortality rates for the Native American spouse. Off the top of my head, a couple of well known Native Americans who married out of their race, Pocahontas and Sacagawea, died relatively young from disease with only 1-2 children. I don’t think it was good for the health of the Native Americans to spend so much time in close proximity to Old Worlders.

  10. John Emerson

    There are a fair number of families with French names in the Midwest and while they are not generally a coherent group, I have even heard of prejudice in one place where they have some concentration. In the northern Mississippi valley they are probably mostly descended from the Metis, a French-Indian creole group centered in the Red River valley (of the north). Further South they are probably Louisiana French. I don’t know really very much but it would be an interesting study

  11. Emerson Rochette

    It might be interesting to add some historical notes to the presentation of this article. There was effectively a French Canadian population bottleneck around 1760. The British had just won the French-British war here in Quebec. The net result was a sharp decrease in the local population, caused by the war itself: people killed during the fighting, famine, disease, the deportation of most of the Acadians to Louisiana, and also importantly, the fact that the British of this new Canada would, understandably, refuse immigration from France for a long period. This induced an unnatural hermeticity of the gene pool. French-English and French-native couples were rare and, soon after the war, the French clergy, uniquely Roman Catholic, started pushing for enormous families to maintain what was left of their parishes. Effectively, Quebec was abandoned by France.
    With no new genes in an already limited pool (first settlements only dated back 150 years before that period, many of the founding families emerging from a select group of women (Les Filles du Roy)) AND the fact that Quebec was geographically enormous for such a small population (implying a localization of a particular genetic trait from a small population migration) the study of this population is unique, especially when correlated with the extensive documentation provided by the clergy.

  12. Adela

    I see once again Quebec is the darling center of French Canadianess and Acadians are the ugly step child even though the Acadian population is older than the Quebec one. Presumably of French origin, good grief.
    My Acadian Metis ancestors kinda buck the low fertility issue.

  13. Ben Shell

    The population bottleneck was most likely the number of immigrants before the English conquest in 1760. From 1620 until then, I have seen estimates that the whole population was descended from 5500 men and 2000 women who came almost entirely from France. See, for instance, Leslie Choquette, _Frenchmen Into Peasants_. Another factor in genomic distribution is the division of the population into groups largely pushing outward from Quebec, Trois-Rivieres and Montreal. An amateur genealogist is likely to find his ancestors from Quebec didn’t move very far from generation to generation until the 19th century.

    Without dragging out the details, the way Quebec was settled, even after 1760, was far different from the British colonies to the south.

  14. gcochran

    The low English and/or British ancestry numbers are nonsense. I remember looking at my home county: according to the Census, ~25% English ancestry, > 80% British Isles according to the phone book.

  15. Mark

    There’s an interesting story here! The results dovetail very neatly with what reactionary politicians in Quebec always say, that “Quebecois”, which they always define as those of French ancestry, are a very ‘pure’ race – it should be taken with a lot of suspicion when this happens! Quebec has engaged in cultural and real ethnic cleansing – place names were changed – Hull was replaced by Gatineau recently – it became illegal for most people to send their kids to English language schools, use English on signs, or work in English, and the French language universities often are highly politicized . As the last poster noted, in many areas of Quebec, there are far more people with family names from the British Isles in the phone book than these figures can account for, since in many places in Quebec, they were the first European colonists – in fact, only the southern part of the Province was French, and it’s really notable that the study does not deal with French Canadians that live in Ontario, which also was part of New France – there was immigration from France after 1760 – lots of refugees from the French revolution came, some Acadians came back during the American Revolution in fact, Henri-Gustave Joly de Lotbinière, the Premier of Quebec in the late 1870s, was born in France. And why is no one descended from Africans, even though there were slaves imported into Canada under the French, and slavery was practiced in Lower Canada, which became the southern part of Quebec, until the 1830s?
    To clear up another problem with terminology, Acadians and French Canadians are distinct groups – the study makes that clear.

  16. Aaron


    ” what reactionary politicians in Quebec always say, that “Quebecois”, which they always define as those of French ancestry, are a very ‘pure’ race – it should be taken with a lot of suspicion when this happens! ”


    It just never happened, Mark; you’re living in your hateful fantasy.

    “why is no one descended from Africans, even though there were slaves imported into Canada under the French, and slavery was practiced in Lower Canada, which became the southern part of Quebec, until the 1830s?”

    Because the separatist eat them with BBQ sauce …

    … good luck with your issues.




