Desmond Tutu, Spaniards, and genetic distance

By Razib Khan | August 21, 2010 12:35 pm

Since we’ve been talking about Fst a fair amount, I thought it might be nice to put it in some concrete graphical perspective. First, to review: Fst in the genetic context measures the proportion of genetic variation which can be attributed to between-population differences. To give a “toy” example, if you randomly divided the population of a large Swedish village into two groups and calculated their Fst, it should be ~0, because if you randomly select from an unstructured population there shouldn’t, by definition, be noticeable between-population differences. In contrast, if you compare a Swedish village to a Japanese village, a substantial fraction of the genetic variation is going to be distinct to each population. In fact, roughly 10% of the genetic variation will be between the two groups. Many of the genes will be extremely informative, so that if you know the allelic state of a given individual you can predict with a high degree of certitude which population that individual was from (e.g., SLC24A5 and EDAR). A small set of ancestrally informative alleles would produce a sequence of conditional probabilities of extremely high certitude (on the order of 10 genes should suffice for these two populations, perhaps three for “government work”).
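To make the definition concrete, here is a minimal sketch of the textbook heterozygosity-based form, Fst = (H_T - H_S) / H_T, for two equally weighted populations at biallelic loci. The function names and the allele frequencies in the usage note are my own illustrative assumptions, not values from the papers discussed here, and those papers may well use a different estimator.

```python
def heterozygosity(p):
    """Expected heterozygosity 2p(1 - p) at a biallelic locus with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst_single_locus(p1, p2):
    """Fst = (H_T - H_S) / H_T for two equally weighted populations at one locus."""
    # H_S: average heterozygosity within each population.
    h_s = (heterozygosity(p1) + heterozygosity(p2)) / 2.0
    # H_T: heterozygosity of the pooled population (mean allele frequency).
    h_t = heterozygosity((p1 + p2) / 2.0)
    # A locus that is fixed for the same allele in both populations carries no signal.
    return 0.0 if h_t == 0.0 else (h_t - h_s) / h_t


def fst_multi_locus(freq_pairs):
    """Combine loci as a ratio of sums rather than a mean of per-locus ratios."""
    numerator = 0.0
    denominator = 0.0
    for p1, p2 in freq_pairs:
        h_s = (heterozygosity(p1) + heterozygosity(p2)) / 2.0
        h_t = heterozygosity((p1 + p2) / 2.0)
        numerator += h_t - h_s
        denominator += h_t
    return 0.0 if denominator == 0.0 else numerator / denominator
```

Two groups with identical allele frequencies, like the randomly split village, give exactly 0 here, and a locus fixed for different alleles in the two populations gives 1. This simple form can never go below zero; estimators that correct for sample size can return slightly negative values when the true differentiation is close to nil.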

But to put this in perspective, and show how genetic variation differs from locale to locale, I thought I would compare continental-scale Fst values with those in a small region, southern Africa. The Fst values for the first I obtained from “Investigation of the fine structure of European populations with applications to disease association studies,” and for the second from “Complete Khoisan and Bantu genomes from southern Africa.” The Bantu in this case is Desmond Tutu, who is from the Xhosa tribe, and has substantial admixture from the non-Bantu populations which were resident in South Africa prior to the arrival of the Bantus.

First, in tabular format:

          Spain    Sweden   Russia   Japan
France    0.0008   0.0023   0.0037   0.1116
Spain     -        0.0047   0.0059   0.1118
Sweden    -        -        0.0025   0.1095
Russia    -        -        -        0.1057

       KB1      NB1      TK1      MD8      Desmond Tutu
KB1    -        0.021    0.024    0.022    0.08
NB1    -        -        -0.007   0.006    0.091
TK1    -        -        -        0.016    0.088
MD8    -        -        -        -        0.061

Second, two adjacent bar graphs. In the foreground I’ve simply taken the Spain vs. other Eurasian population comparisons, while in the background Desmond Tutu is the reference for the four Bushmen.
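For anyone who wants to redraw the comparison, the two series can be pulled straight out of the tables. This is only a rough text rendering, not the original chart; the variable names, the `text_bars` helper, and the scale factor are my own choices for illustration.

```python
# Fst values copied from the two tables above.
spain_vs = {"France": 0.0008, "Sweden": 0.0047, "Russia": 0.0059, "Japan": 0.1118}
tutu_vs = {"KB1": 0.08, "NB1": 0.091, "TK1": 0.088, "MD8": 0.061}


def text_bars(series, scale=500):
    """Return one line per entry, with a bar of '#' roughly proportional to Fst."""
    lines = []
    for name, fst in series.items():
        bar = "#" * round(fst * scale)
        lines.append(f"{name:<7}{fst:>7.4f} {bar}")
    return lines


for title, series in (("Spain vs.", spain_vs), ("Desmond Tutu vs.", tutu_vs)):
    print(title)
    print("\n".join(text_bars(series)))
```

Even the smallest Tutu-Bushman value (MD8) is more than ten times the Spain-Russia value, and all four sit within a factor of two of the Spain-Japan value.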


In some ways this comparison is an exaggeration of the variation in African genes. The Bushmen and Bantu populations are of very distinct origins, as the latter spread over eastern and southern Africa only in the last 2,000 years. The Bushmen-Bantu cultural gap is one of sharp discontinuity, and despite gene flow it is still to some extent a genetic one as well. But there are other factors dampening Fst in this case. First, Tutu is himself of partial Khoisan ancestry (of whom there are other groups besides the Bushmen), so his genetic distance is likely to be smaller than that of someone from the Zulu tribe, which has presumably had less admixture with the indigenous populations, being a bit farther from the edge of the demographic “wave of advance.” Second, the gene chips are geared toward Eurasian populations, and presumably missed African-specific, and particularly Bushmen-specific, variants because they didn’t go looking.

My own confusion on these issues the past week illustrates, I suppose, the difficulty in mapping these abstruse and yet materially concrete patterns onto human categories. But quite often wrestling with the difficulties is the surest path to illumination.

CATEGORIZED UNDER: Genetics, Genomics
  • steve hsu

    Thanks for digging deeper into this question. It does appear that there is a lot of diversity in Africa.

    In the figure you showed in the earlier post (Aug. 19) it doesn’t appear that any of the inter-African distances (inside the blue bubble) are as large as .08, given the scale of .01 indicated. Perhaps the detail inside each bubble isn’t according to scale? (Or maybe my eye is just off.)

  • gaffa

    How does Fst distance relate to molecular evolution sequence distance measures? Of course, Fst is a population-level measure whereas a sequence distance is a property of a pair of sequences, but in principle you could get some kind of average sequence distance between two populations. I was thinking about the neighbor-joining tree posted the other day, built from Fst distances and showing that different non-African populations do not have equal distance to Africa. From a phylogenetic sequence-level perspective that feels unintuitive, as a pair of sequences that have evolved from a common ancestor for the same amount of time under the same rate of change will have equal distances to an outgroup. I wonder what would happen if you instead built the neighbor-joining tree from pairwise sequence distances?


Gene Expression

This blog is about evolution, genetics, genomics and their interstices. Please beware that comments are aggressively moderated. Uncivil or churlish comments will likely get you banned immediately, so make any contribution count!

About Razib Khan

I have degrees in biology and biochemistry, a passion for genetics, history, and philosophy, and shrimp is my favorite food. In relation to nationality I'm an American Northwesterner, in politics I'm a reactionary, and as for religion I have none (I'm an atheist). If you want to know more, see the links at

