Selection happens; but where, when, and why?

By Razib Khan | November 8, 2013 3:49 am
Distribution of SLC24A5 variation at SNP rs1426654. Credit: HGDP Browser

Nina Davuluri, Miss America 2014, Credit: Andy Jones

One of the secondary issues which cropped up with Nina Davuluri winning Miss America is that it seems implausible that someone with her complexion would be able to win any Indian beauty contest. A quick skim of Google Images for “Miss India” will make clear the reality that I’m alluding to. The Indian beauty ideal, especially for females, is skewed to the lighter end of the complexion distribution of native South Asians. Nina Davuluri herself is not particularly dark skinned if you compare her to the average South Asian; in fact she is likely at the median. But it would be surprising to see a woman who looks like her held up as conventionally beautiful in the mainstream Indian media. When I’ve pointed this peculiar aspect out to Indians* some of them will submit that there are dark skinned female celebrities, but when I look up the actresses in question they are invariably not very dark skinned, though perhaps by comparison to what is the norm in that industry they may be. But whatever the cultural reality is, the fraught relationship of color variation to aesthetic variation prompts us to ask: why are South Asians so diverse in their complexions in the first place? A new paper in PLoS Genetics, The Light Skin Allele of SLC24A5 in South Asians and Europeans Shares Identity by Descent, explores this genetic question in depth.

Much of the low hanging fruit in this area was picked years ago. A few large effect genetic variants which are known to be polymorphic across many populations in Western Eurasia segregate within South Asian populations. What this means in plainer language is that a few genes which cause major changes in phenotype are floating around in alternative flavors even within families among people of Indian subcontinental origin. Ergo, you can see huge differences between full siblings in complexion (African Americans, as an admixed population, are analogous). While loss of pigmentation in eastern and western Eurasia seems to be a case of convergent evolution (different mutations in overlapping sets of genes), the H. sapiens sapiens ancestral condition of darker skin is well conserved from Melanesia to Africa.

So what’s the angle on this paper, you may ask? Two things. The first is that it has excellent coverage of South Asian populations. This matters because to understand variation in complexion you should probably look at populations which vary a great deal. Much of the previous work has focused on populations at the extremes of the human distribution, Africans and Europeans. There are obvious limitations to this approach. If you are looking at variant traits, then focusing on populations where the full range of variation is expressed can be useful. Second, this paper digs deeply into the subtle evolutionary and phylogenomic questions which are posed by the diversification of human pigmentation. It is often said that race is only skin deep, as if to dismiss the importance of human biological variation. But skin is a rather big deal. It’s our biggest organ, and the pigmentation loci do seem to be rather peculiar.

You probably know that on the order of ~20% of genetic variation is partitioned between continental populations (races). But this is not the case at all genes, and pigmentation genes tend to be particularly notable exceptions to the rule. In late 2005 a paper was published which arguably ushered in the era of modern pigmentation genomics, SLC24A5, a putative cation exchanger, affects pigmentation in zebrafish and humans. The authors found that one nonsynonymous mutation was responsible for on the order of 25 to 33% of the skin color difference between Africans and Europeans. And the allele frequency was nearly disjoint across the two populations, and between Europeans and East Asians. When comparing Europeans to Africans and East Asians almost all the variation was partitioned across the populations, with very little within them. The derived SNP, which differs from the ancestral state, is found at ~100% frequency in Europeans, and ~0% in Africans and East Asians. It is often stated (you can Google it!) that this variant is the second most ancestry informative allele in the human genome in relation to Europeans vs. Africans.
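To make "nearly disjoint" concrete, here is a minimal sketch of how one might quantify the share of variation at a single biallelic SNP that lies between two populations, using a Wright-style fixation index, F_ST = (H_T - H_S) / H_T. The function name and the equal weighting of the two populations are assumptions for illustration, not anything taken from the papers; the frequencies are only the rounded figures quoted above.

```python
def fst_two_pops(p1, p2):
    """Wright-style F_ST for one biallelic SNP in two equally weighted populations.

    p1, p2: frequency of the derived allele in each population (0..1).
    Equal weighting is an illustrative assumption, not the papers' method.
    """
    p_bar = (p1 + p2) / 2                  # pooled allele frequency
    h_total = 2 * p_bar * (1 - p_bar)      # expected heterozygosity, pooled
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population
    if h_total == 0:
        return 0.0                         # SNP is monomorphic everywhere
    return (h_total - h_within) / h_total


# Rounded frequencies from the text: ~100% vs ~0% derived allele.
print(fst_two_pops(1.0, 0.0))   # every bit of variation is between populations
# A hypothetical "ordinary" SNP with a modest frequency difference.
print(fst_two_pops(0.6, 0.4))   # only a small share is between populations
```

With frequencies of 100% and 0% the index is 1, while the hypothetical 60% vs. 40% SNP gives about 0.04, which is the contrast between this pigmentation locus and a more typical variant.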

SLC24A5 was just the beginning. SLC45A2, TYR, OCA2, and KITLG are just some of the alphabet soup of loci which have come to be understood to affect normal human variation in pigmentation. Despite the relatively large roll call of pigmentation genes, one can safely say that between any two reasonably distinct geographic populations ~90 percent of the between population variation in the trait is going to be due to ~10 genes. Often there is a power law distribution as well. The first few genes of large effect account for over 50% of the variance, while subsequent loci are progressively less important.

So how does this work push the overall results forward?

– With their population coverage the authors confirm that SLC24A5 seems to be polymorphic in all Indo-European and Dravidian speaking populations in the subcontinent. The frequency of the derived variant ranges from ~90% in the Northwest, and ~80% in Brahmin populations all over the subcontinent, to ~10-20% in some tribal groups.

– Though there is a north-south gradient, it is modest, with a correlation of ~0.25. There is a much stronger correlation with longitude, but I’m rather sure that this is an artifact of their low sampling of Indo-European populations in the eastern Gangetic plain. As hinted in the piece, the correlation with longitude has to do with the fact that Tibetan and Burman populations in these fringe regions tend to lack the West Eurasian allele.

– Using haplotype based tests of natural selection the authors infer that the frequency of this allele has been driven up by positive selection in north, but not south, India. It could be that the authors lack power to detect selection in the south because of the lower frequency of the derived allele. And I did wonder if selection in the north was simply an echo of what occurred in West Eurasia. But if you look at the frequency of the A allele in the north, most of the populations seem to have a higher frequency of the derived variant than they do of inferred “Ancestral North Indian” ancestry.
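The last comparison is simple arithmetic. If the derived allele had arrived purely through admixture, with no later selection within India, its expected frequency would be roughly the ancestry fraction from the ANI-like source times the allele's frequency in that source, plus the remaining fraction times its frequency in the other source. A minimal sketch of that check follows; the function name and every number in it are hypothetical placeholders, not values from the paper.

```python
def expected_freq_under_admixture(ani_fraction, freq_in_ani_source,
                                  freq_in_other_source=0.0):
    """Expected allele frequency if admixture alone set it (no selection, no drift)."""
    if not 0.0 <= ani_fraction <= 1.0:
        raise ValueError("ani_fraction must be between 0 and 1")
    return (ani_fraction * freq_in_ani_source
            + (1.0 - ani_fraction) * freq_in_other_source)


# Hypothetical population: 60% ANI ancestry, allele fixed in the ANI source
# and absent from the other source. All three numbers are made up.
expected = expected_freq_under_admixture(0.6, 1.0)
observed = 0.8                      # hypothetical observed frequency
excess = observed - expected        # positive excess is what hints at selection
print(expected, excess)
```

An observed frequency well above the admixture-only expectation is the pattern described in the bullet above, though drift and uncertainty in the ancestry estimates could also produce some excess.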

What’s perhaps more interesting is the bigger picture of human evolutionary dynamics and phylogenetics that these results illuminate. Resequencing the region around SLC24A5, these researchers confirmed it does look like the derived variant is identical by descent in all populations across Western Eurasia and into South Asia. What this means is that this mutation arose in someone at some point around the Last Glacial Maximum, after West Eurasians separated from East Eurasians. The authors give some numbers using some standard phylogenetic techniques, but admit that it is ancient DNA that will give true clarity on the deeper questions. When I see something written like that my hunch, and hope, is that more papers are coming soon.

When I first read The Light Skin Allele of SLC24A5 in South Asians and Europeans Shares Identity by Descent, I thought that it was essential to read Ancient DNA Links Native Americans With Europe and Efficient moment-based inference of admixture parameters and sources of gene flow. The reason goes back to the plot which I generated at the top of this post: notice that Native Americans do not carry the West Eurasian variant of SLC24A5. What the find of the ~24,000 year old Siberian boy, and his ancient DNA, suggests is that there was a population with affinities closer to West Eurasians than East Eurasians that contributed to the ancestry of Native Americans. The lack of the European variant of SLC24A5 in Native Americans suggests to me that the sweep had not begun, or that the European variant was disfavored. What the other paper reports is that on the order of 20-40% of the ancestry of Europeans may be derived from an ancient North Eurasian population, unrelated to West Eurasians (or at least not closely related). It is likely that this population has something to do with the Siberian boy. Since Europeans are fixed for the derived variant of SLC24A5, that implies to me that the sweep must have occurred after 24,000 years ago.

At this point I have to admit that I believe we need to be careful calling this a “European variant.” Just because it is nearly fixed in Europe does not imply that the variant arose in Europe. If you look at the frequency of the derived variant you see it is rather high in the northern Middle East. Looking at some of the populations in the Middle Eastern panel, the ancestral variant might be entirely explained by admixture in historical time from Africa. If the sweep began during the last Ice Age, then most of Europe would have been uninhabited. The modern distribution is informative, but it surely does not tell the whole story.

Where we are is that SLC24A5, and pigmentation as a whole, is coming to be genomically characterized fully. We don’t know the whole story of why light skin was selected so strongly. And we don’t quite know where the selection began, and when it began. But through gradually filling in pieces of the puzzle we may come to grips with this adaptively significant trait in the near future.

Citation: Basu Mallick C, Iliescu FM, Möls M, Hill S, Tamang R, et al. (2013) The Light Skin Allele of SLC24A5 in South Asians and Europeans Shares Identity by Descent. PLoS Genet 9(11): e1003912. doi:10.1371/journal.pgen.1003912

* From my personal experience American born Indians often do not share the same prejudices and biases, partly because subtle shades of brown which are relevant in the Indian context seem ludicrous in the United States.

CATEGORIZED UNDER: Anthropology, Genetics, Genomics
MORE ABOUT: Pigmentation

Comments (8)

  1. Dmitry Pruss

    The size of the locus they resequenced, and the number of subjects sequenced, were obviously insufficient to dissect their Haplogroup H. But with more wide and thorough sequencing, one should expect to begin seeing its internal phylogeny, which may give a better answer about origins and migration paths of the derived allele.

    The cultural aspect is intriguing. I heard from a S Asian friend that the negative connotations of Hindi कालिमा are explained by Zanzibar slave trading, which introduced some African slaves in coastal India, much the same as in Oman. Is there any credence to that?

    • razibkhan

      “The size of the locus they resequenced, and the number of subjects sequenced, were obviously insufficient to dissect their Haplogroup H. But with more wide and thorough sequencing, one should expect to begin seeing its internal phylogeny, which may give a better answer about origins and migration paths of the derived allele.”

      good point. the 1000 genomes has a lot of south asians in the pipeline. i wonder if their coverage is sufficient?

      the muslims obviously brought skin color prejudice to south asia explicitly. many of the shade terms obviously date from that period. but the islamic era has better documentation, period (foreign muslims considered themselves ‘white,’ as opposed to the ‘black’ native ‘hindus,’ which originally was a racial term, not religious one). OTOH, it seems that a casual reading of indo-aryan source texts also alludes to color prejudice. standard revisionism is that color prejudice is metaphorical, and that there was no ethnic/racial difference at work. i think the work coming out of the reich lab, etc., should make us skeptical of this being the total explanation.

      • omarali50

        Just curious: Is there an example of a large mixed society (mixed in color) where there seems to be (or seems to have been) no prejudice in favor of White skin?
        I am not making any claim about White being somehow innately preferred, just curious if such examples exist?

    • razibkhan

      btw, 1) i don’t know hindi, 2) hindi is in a different script than bengali, so i wouldn’t be able to read that anyway (i can’t read bengali). but google translate and context was sufficient 🙂

    • razibkhan

      also, a standard culturally universal social explanation is important to remember. in stratified agricultural societies complexion is correlated with class status. people who work in the fields are dark, all things equal. hypergamy might lead to class-color correlations developing naturally over time in a genetic sense too.

  2. Luis Aldamiz

    I believe that you’re probably right in suspecting a (Northern) West Asian origin for this allele. It correlates well with other population genetic markers, such as haploid lineages, which clearly indicate that origin for the bulk of European ancestry, as well as for the South Asian ANI-related markers, such as Y-DNA R1a or J2.

    What I do not really agree with are your conclusions related to proto-Amerindians, mostly because I think that the alleged North Asian ancestry (Lipson) is a methodological error (that the TreeMix algorithm is drawing the arrow inverted and should show West Eurasian genetic influence on Amerindians instead – I have seen other cases suspect of the same kind of error, so it’s probably a bug in the algorithm and nothing else).

    One key issue is that, while we can identify flows from West Asia to Europe in three periods (Aurignacoid, Gravettian and Neolithic), these do not include at all the dates you speculate with. So it’s either older (Aurignacoid c. 49 Ka BP, Gravettian c. 32 Ka BP) or more recent (Neolithic c. 10-9 Ka BP). The closest fit with your guess would be Gravettian but I would not discard the primary Aurignacoid flow either, although that would imply that somehow the Western allele was displaced in NE Asia (and America) in favor of the East Asian alleles for skin color (East Asian genetic influence in this population was clearly very dominant in any case, even if it mostly comes from the female side of their ancestors, so it would make some good sense).

    • “…that the TreeMix algorithm is drawing the arrow inverted and should show West Eurasian genetic influence on Amerindians instead – I have seen other cases suspect of the same kind of error, so it’s probably a bug in the algorithm and nothing else.”

      Luis, what are these instances, and how did you determine that they were errors?

      • Luis Aldamiz

        From memory, I have seen that in TreeMix studies on Africans, arrows of gene flow that seemed suspiciously inverted. I can’t be more precise without dedicating some time to research back my memory but I’m quite certain that the issue arose first of all, long before this draft paper existed, at a discussion with the author of the Ethio Helix blog (ethiohelix.blogspot.com), probably at his blog but maybe at mine. I know I should be more

        Anyhow, I took some time to re-read the paper after the comment, as I was commenting from memory, and there is no such “arrow” (there is one in a fig. taken from Pickrell 2011 but only points to Russians, not all Europeans). Instead it’s expressed in another form.

        As I say, I took some time to scratch my head on the matter again and now I spot two issues:

        1. No South Asian or similar (Onge) reference makes the interpretation a bit overly difficult. I would presume that the Ancient West Eurasian split point is shared with whatever is left of aboriginal South Asian ancestry, more or less, because the next split corresponds to Papuans vs. East Asians.

        2. More importantly, it still can be inferred that somehow the flow happened in the opposite direction, because proto-Amerindians MUST have first diverged from the early “proto-West Eurasian” population (Y-DNA Q is most diverse around Iran) and only later massively admixed with East Asians up to the point to be much more like them than like Westerners overall (mtDNA, nDNA) and that hardly deniable fact (supported by archaeology) is not evident in the tree, which makes Native Americans directly diverge from East Asians without suggesting any admixture.

        So I think that the real problem is that the algorithm is not producing perfect results but just an approximation, interpreting Native Americans as “purely derived” from East Asians and instead Europeans as “partly derived” from (proto-) Native Americans. This kind of result is still a very good approximation to reality (let’s not dismiss the merits of achieving such a high level of approximation with a mere computer program!), but human critical revision is still needed and the result can’t be taken at face value just like that, because the algorithm can’t take into account all the contextual and even multidisciplinary elements of judgment that I mentioned in the previous paragraph.

        First thing to know about computers: they are fundamentally dumb sophisticated calculators, so a lot depends on how humans manage and program them. That applies to all kinds of statistical simulations. Or in other words: the map is not the landscape, just a rough approximation.

        In many senses we are kind of traveling with the Catalan Atlas, so to say, a very good draft approximation for their time but not even remotely close to Google Earth, for example, much less to Earth itself. And there you have Columbus trying to reach East Asia sailing to the West and imagining America was it.

        We are in that kind of era in regards to population genetics and there are a lot of “Columbuses” around missing entirely the point. We all can be a “Columbus” of population genetics (or whatever else) at some point, I guess. But best try not, and reading too much in data obtained in simulations with algorithms of still not fully tested efficiency, may well be taking Columbus’ way. (Notice also that the paper is not peer-reviewed in any case – it may not matter and I do think it’s a good study anyhow, if properly interpreted, but sometimes peer review helps to get some things straighter).

        As I see it, the result is very perplexing and can only be interpreted as a reflection of the Western Asian origins of the primary proto-Amerindians. No matter how much I look at it, I see no other explanation.


Gene Expression

This blog is about evolution, genetics, genomics and their interstices. Please beware that comments are aggressively moderated. Uncivil or churlish comments will likely get you banned immediately, so make any contribution count!

About Razib Khan

I have degrees in biology and biochemistry, a passion for genetics, history, and philosophy, and shrimp is my favorite food. In relation to nationality I'm an American Northwesterner, in politics I'm a reactionary, and as for religion I have none (I'm an atheist). If you want to know more, see the links at

