Yes Virginia, trans-ethnic inferences from GWAS are kosher

By Razib Khan | June 24, 2013 12:00 am

Razib’s daughter’s ancestry composition

An F1, r = 0.5 to Razib

Genome-wide association studies (GWAS) are rather simple in their methodological philosophy. You take cases (affected) and controls (unaffected) of the same genetic background (i.e. ethnically homogeneous) and look for alleles whose frequencies diverge greatly between the two pooled populations. Visually, the associated alleles show up as peaks in Manhattan plots, and the strength of their effect is summarized as an odds ratio. But please note the clause: ethnically homogeneous study populations. In practice this means white Europeans, and to a lesser extent East Asians and African Americans (the last because the biomedical industrial complex in the United States performs many GWAS, and the USA is a diverse nation). Looking within ethnic groups eliminates many false positives one might obtain due to population stratification. Basically, alleles which differ in frequency between groups because of their history may produce spurious associations when the groups themselves differ in their propensity for the trait of interest (e.g. hypertension in blacks vs. whites).
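To make the stratification problem concrete, here is a minimal sketch in Python. The allele counts are invented purely for illustration (they come from no real study), and the function names are my own. It computes an allelic odds ratio and a Pearson chi-square statistic from a 2x2 table of allele counts, first within two hypothetical groups and then after naively pooling them.

```python
# Toy illustration only: all allele counts below are made up.

def odds_ratio(case_risk, case_other, ctrl_risk, ctrl_other):
    """Allelic odds ratio from a 2x2 table of allele counts."""
    return (case_risk * ctrl_other) / (case_other * ctrl_risk)


def chi_square(case_risk, case_other, ctrl_risk, ctrl_other):
    """Pearson chi-square statistic (1 degree of freedom) for the same 2x2 table."""
    n = case_risk + case_other + ctrl_risk + ctrl_other
    cross = case_risk * ctrl_other - case_other * ctrl_risk
    denom = ((case_risk + case_other) * (ctrl_risk + ctrl_other)
             * (case_risk + ctrl_risk) * (case_other + ctrl_other))
    return n * cross ** 2 / denom


# Order of counts: (case_risk, case_other, ctrl_risk, ctrl_other)
pop_a = (40, 160, 160, 640)   # allele at 20% in cases and in controls; few cases
pop_b = (480, 320, 120, 80)   # allele at 60% in cases and in controls; many cases
pooled = tuple(a + b for a, b in zip(pop_a, pop_b))

for label, table in [("A", pop_a), ("B", pop_b), ("pooled", pooled)]:
    print(label, round(odds_ratio(*table), 2), round(chi_square(*table), 1))
```

Within each hypothetical group the allele has the same frequency in cases and controls, so the odds ratio is 1 and there is no association. Pool the two groups and the odds ratio jumps to roughly 2.8, simply because the group with the higher allele frequency also contributes most of the cases.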

But this raises the question: how generalizable are GWAS results, and therefore how portable are they across ethnicities? This is not a trivial question for someone like me, as South Asians tend to be understudied for natural reasons (there aren’t that many of us in the West, and funding for this sort of thing is not viable in Third World nations where most South Asians live). Not only are South Asians understudied, but we tend to have large genetic distances within the putative population, so I’m not even sure that GWAS from the HapMap Gujarati samples would be applicable to me (the genetic distance between South Asian ethnic groups is actually greater than between Europeans and some West Asians). And then there is the question of people of mixed heritage. Is there really a possibility in the near future of GWAS of various F1 combinations, let alone backcrosses like Reiko Aylesworth?

Fortunately, from where I stand it seems that most GWAS being reported today are portable across ethnicities, so we don’t have to go reinventing every wheel. Some of the evidence is plain to see in a new PLoS Genetics paper, “High Trans-ethnic Replicability of GWAS Results Implies Common Causal Variants.” Here is the abstract:

Describing and identifying the genetic variants that increase risk for complex diseases remains a central focus of human genetics and is fundamental for the emergent field of personalized medicine. Over the last six years, GWAS have revolutionized the field, discovering hundreds of disease loci. However, with only a handful of exceptions, the causal variants that generate the associations unveiled by GWAS have not been identified, and their frequency and degree of sharing across populations remains unknown. Here, we present a comprehensive comparison of GWAS results designed to try to understand the nature of causal variants. By examining the results of GWAS for 28 diseases that have been performed with peoples of European, East Asian, and African ancestries, we conclude that a large fraction of associations are caused by common causal variants that should map relatively close to the associated markers. Our results indicate that many of the disease risk variants discovered by GWAS are shared across Eurasians.

I want to stipulate that my own views on this matter do not hinge on just this paper. Nor do I believe that there is no regional heterogeneity in the genomic architecture of disease risk alleles. Rather, as a prior I would now contend that when looking at the odds ratio reported in Europeans for a relatively large effect allele, one shouldn’t be excessively skeptical of transferring the inference to other populations, at least for Eurasians. In the paper the authors report that when accounting for differences in statistical power (European studies tend to have much larger sample sizes, and so can catch more variants) there is a decent replicability of GWAS. Additionally, there is the possibility that some non-replications are due to the fact that the GWAS are focusing on marker SNPs, rather than causal SNPs, and the marker associations are not portable across populations even if the causal ones are. Remember, current GWAS utilizing SNP-chips are often focusing on a genomic region, more than a particular SNP as such. This is why you may get strong GWAS signals in noncoding regions.
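The marker-versus-causal point can be sketched with a few lines of Python. This is a rough illustration under stated assumptions, not the method used in the paper: it relies on the standard approximation that the expected chi-square statistic at a marker SNP is about r² times the statistic at the causal SNP, where r² measures linkage disequilibrium between the two. The haplotype frequencies, the causal-SNP statistic, and the population labels are all hypothetical.

```python
# Toy illustration only: frequencies and the causal-SNP statistic are made up.

def ld_r2(p_ab, p_a, p_b):
    """Squared LD correlation between two biallelic SNPs.

    p_ab: frequency of the haplotype carrying both the causal allele
          and the marker allele
    p_a:  frequency of the causal allele
    p_b:  frequency of the marker allele
    """
    d = p_ab - p_a * p_b                      # LD coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


# Assumption: the causal SNP gives the same test statistic in both populations.
causal_chi2 = 40.0

# Same allele frequencies, different haplotype structure (hypothetical values).
populations = {
    "population 1": {"p_ab": 0.28, "p_a": 0.30, "p_b": 0.30},
    "population 2": {"p_ab": 0.15, "p_a": 0.30, "p_b": 0.30},
}

for name, freqs in populations.items():
    r2 = ld_r2(**freqs)
    # Approximation: expected marker statistic scales with r^2.
    marker_chi2 = r2 * causal_chi2
    print(name, "r2 =", round(r2, 2), "marker chi2 ~", round(marker_chi2, 1))
```

With these invented numbers the marker tags the causal allele well in the first population (r² of about 0.82) and poorly in the second (about 0.08), so an identical causal effect yields a marker statistic of roughly 33 in one and roughly 3 in the other. The causal variant is shared, but the marker association fails to replicate.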

Of course there are going to be rare variants which are less portable, and as genomics scales up in population sample size and deep whole genome analyses we’re going to be plumbing private alleles. But until then there’ll be a mountain of common variants of diverse effect sizes, and that information needn’t be discarded when one considers populations outside of the study’s purview. When viewing odds ratios in 23andMe there’s always the caveat that “results X for Europeans.” This is not unexpected for a business. And in terms of medical actions one still needs to be cautious. But what about the question of how seriously to take GWAS performed in Europeans if you are not European? If you are non-African, I’d say moderately seriously. If you are an African, I’d probably still say somewhat seriously.

Citation: Marigorta, Urko M., and Arcadi Navarro. “High Trans-ethnic Replicability of GWAS Results Implies Common Causal Variants.” PLoS Genetics 9.6 (2013): e1003566.

  • Generalista

    I still can’t get over that “French and German” category. Yes, it is applicable to much of the NE of the olive oil / butter line, or where Germanic toponyms prevail – but outside that region it makes much more sense to autosomally group Germans with Czechs, Slovakians, Slovenians, and Hungarians, for example – and *not* label those as Eastern European.

    So, let alone S and SE Asian complexities, the testing companies can’t even get Central Europe right.

  • Chad

    “Additionally, there is the possibility that some non-replications are due to the fact that the GWAS are focusing on marker SNPs, rather than causal SNPs, and the marker associations are not portable across populations even if the causal ones are.”

    As next-gen sequencing increasingly becomes used in GWAS, I would imagine that this complication would become less of a problem as the ability to focus on causal rather than marker SNPs becomes easier.

  • Boris Bartlog

    Do the relatively long unbroken stretches of chromosome that are attributed to various kinds of European ancestry imply the existence of a recent, relatively ‘full-blooded’ ancestor of that type, or is that more or less an artifact of the way the statistics are evaluated? Would smaller chunks imply a longer period of time since your wife’s ancestors came over from Europe and started intermarrying?

    • razibkhan

      yes. two of her great-grandparents were born in the same area of norway. two of her great-great grandparents were austrian german. the rest is much more mixed. so her parents are both 1/2 unmixed european, and 1/2 “american mutt.”




