The power of one (Nubian that is)

By Razib Khan | May 26, 2011 10:21 pm

Maju pointed me to a new paper on the genetics of Sudanese today. My interest was piqued, then not so much when I looked more closely. Genetic variation and population structure among Sudanese populations as indicated by the 15 Identifiler STR loci:

Background
There is substantial ethnic, cultural and linguistic diversity among the people living in east Africa, Sudan and the Nile Valley. The region around the Nile Valley has a long history of succession of different groups, coupled with demographic and migration events, potentially leading to genetic structure among humans in the region.

Results
We report the genotypes of the 15 Identifiler microsatellite markers for 498 individuals from 18 Sudanese populations representing different ethnic and linguistic groups. The combined power of exclusion (PE) was 0.9999981, and the combined match probability was 1 in 7.4 × 10^17. The genotype data from the Sudanese populations was combined with previously published genotype data from Egypt, Somalia and the Karamoja population from Uganda. The Somali population was found to be genetically distinct from the other northeast African populations. Individuals from northern Sudan clustered together with those from Egypt, and individuals from southern Sudan clustered with those from the Karamoja population. The similarity of the Nubian and Egyptian populations suggest that migration, potentially bidirectional, occurred along the Nile river Valley, which is consistent with the historical evidence for long-term interactions between Egypt and Nubia.

Conclusion
We show that despite the levels of population structure in Sudan, standard forensic summary statistics are robust tools for personal identification and parentage analysis in Sudan. Although some patterns of population structure can be revealed with 15 microsatellites, a much larger set of genetic markers is needed to detect fine-scale population structure in east Africa and the Nile Valley.
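
For readers unfamiliar with the two forensic summary statistics quoted in the abstract, here is a minimal sketch of how such combined figures are usually obtained from per-locus values, assuming the loci are independent. The per-locus numbers below are hypothetical placeholders, not values from the paper, and the function names are mine.

```python
import math


def combined_power_of_exclusion(per_locus_pe):
    """Chance that at least one locus excludes a falsely alleged parent."""
    return 1.0 - math.prod(1.0 - pe for pe in per_locus_pe)


def combined_match_probability(per_locus_pm):
    """Chance that two unrelated people match at every locus (independent loci)."""
    return math.prod(per_locus_pm)


# Hypothetical per-locus values for three loci, NOT taken from the paper.
example_pe = [0.55, 0.60, 0.65]
example_pm = [0.08, 0.05, 0.04]

cpe = combined_power_of_exclusion(example_pe)
cpm = combined_match_probability(example_pm)
print(f"combined PE = {cpe:.4f}")
print(f"combined match probability = 1 in {1 / cpm:,.0f}")
```

Because both statistics multiply across loci, 15 highly polymorphic STRs are enough to tell individuals apart with near certainty. That is a different question from how well the same 15 markers resolve ancestry between closely related populations.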

The upside: nearly 500 individuals from a huge range of ethnic groups in Sudan. This is the level of population coverage you’d want. Most of the ethnic groups cover the sample size range from 10 to 50. The downside: only 15 microsatellite markers. About the same number as in the study which I critiqued earlier this week. This is just not a huge number. The authors did try very hard to prune the marker set to be ancestrally informative on this scale, but I think it’s pretty obvious that there are major shortcomings in their analysis. 15 STRs is probably useful for inter-continental genetic variation, but not for intra-continental differences. The paper is open access so you can read the whole thing, but I want to highlight a speculation which they offer based on their results:

The number of unique alleles (Figure 2B) was greatest in the Somali population, and in the population structure analyses (Figure 5), the Somali population grouped separately from other populations. Because the Somali population is separated both geographically and linguistically from the other populations included in our study, it is not surprising that it is also genetically distinct. It is possible that the Bantu expansion from West Africa had a stronger effect on the region of the Horn of Africa, where Somalia is located, compared with the region where Sudan is located. For example, the languages in Somalia belong to two major linguistic families, the Afro-Asiatic and Niger-Congo, whereas Nilo-Saharan is absent and the Bantu Swahili language is one of the major languages in Somalia (Ethnologue [1]). Another explanation could be that the Somali population is of both Eurasian and sub-Saharan origin, as suggested by a recent study [33], potentially explaining the differentiation of this population from some east African groups, although many of the Sudanese populations, such as Arabs and the Beja, may also have mixed Eurasian and sub-Saharan origin.

I think what is more likely is that, as hard as they tried, these markers don’t give an insightful picture at the fine scale. By insightful, I mean that there aren’t too many results I’d trust beyond what you’d already intuitively accept. The genome bloggers have already shown that there’s hardly any Bantu admixture in the Horn of Africa.

But the main reason I’m talking about this paper is this: I have one Nubian sample in the African Ancestry Project. Just one. As opposed to 34 in this paper. But my N = 1 makes me really wary of the results from this paper based on 15 STRs. How can my one sample make me wary of the results from 34? Because I have nearly 1 million SNPs from 23andMe’s v3 raw data! So there you have it. The number 1 million isn’t really that big of a deal, though. I’d be just as wary of the paper if I had only 50,000 SNPs (I came up with that number by running ADMIXTURE many times on African populations).
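
To make the intuition about marker counts concrete, here is a toy simulation. It is a sketch under strong simplifying assumptions and not what ADMIXTURE does: two source populations whose allele frequencies are known exactly, biallelic markers only (STRs are multi-allelic, so 15 STRs carry somewhat more information than 15 of these), and frequency differences between the sources capped at 0.2 to mimic closely related groups. All names and parameter values are hypothetical.

```python
import math
import random


def simulate_genotypes(q, p1, p2, rng):
    """Draw one individual's allele count (0, 1 or 2) at each marker."""
    genotypes = []
    for a, b in zip(p1, p2):
        f = q * a + (1 - q) * b  # expected allele frequency for this individual
        genotypes.append((rng.random() < f) + (rng.random() < f))
    return genotypes


def estimate_admixture(genotypes, p1, p2, grid_steps=50):
    """Grid-search maximum-likelihood estimate of the ancestry fraction q."""
    best_q, best_ll = 0.0, -math.inf
    for step in range(grid_steps + 1):
        q = step / grid_steps
        ll = 0.0
        for g, a, b in zip(genotypes, p1, p2):
            f = q * a + (1 - q) * b
            ll += g * math.log(f) + (2 - g) * math.log(1 - f)
        if ll > best_ll:
            best_q, best_ll = q, ll
    return best_q


def mean_abs_error(n_markers, true_q=0.3, trials=10, seed=1):
    """Average |estimate - truth| for a single individual over several trials."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # Source frequencies stay inside (0.1, 0.9), so the logs are always defined.
        p1 = [rng.uniform(0.3, 0.7) for _ in range(n_markers)]
        p2 = [a + rng.uniform(-0.2, 0.2) for a in p1]
        g = simulate_genotypes(true_q, p1, p2, rng)
        total += abs(estimate_admixture(g, p1, p2) - true_q)
    return total / trials


err_few = mean_abs_error(15)
err_many = mean_abs_error(2000)
print(f"15 markers:   mean abs error {err_few:.3f}")
print(f"2000 markers: mean abs error {err_many:.3f}")
```

In this setup the estimate for a single individual tightens as markers are added, which is the sense in which one densely genotyped sample can say something that 34 sparsely genotyped ones cannot. What it does not address is whether that one individual is typical of the population.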

So this is what I did. I took my data set from the African Ancestry Project, pruned a lot of the populations, added Egyptians, and limited AAP members to Ethiopians, Somalis, a Yemeni, and my Nubian. I ran them from K = 2 to K = 12 with ~40,000 SNPs. You can find all of the results for this run at the African Ancestry Project website. But here I want to focus on K = 8. Below is the plot for the reference populations, and then the individual plot for all the Egyptians, and AAP project members who are Ethiopian, Somali, then the Nubian (AF070), and finally the Yemeni (AF091). The Nubian individual is highlighted with a red line, while I’ve placed a blue line underneath the Egyptians. In case you are curious, AF004, AF005, AF006, and AF034 are Somali. AF023 and AF064 are Oromo Ethiopian. AF036 is 7/8 Eritrean and 1/8 Italian, while AF001 is 100% Eritrean.
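
For anyone who wants to try a similar sweep, here is a rough sketch of looping ADMIXTURE over K = 2 to K = 12 from Python. The file name `aap_pruned.bed` is hypothetical, and the `--cv` cross-validation flag is my assumption about a sensible option, not a record of the settings used for this run. The script only prints the commands unless both the `admixture` binary and the input file are present.

```python
import os
import shutil
import subprocess


def admixture_commands(bed_path, k_min=2, k_max=12):
    """Build one ADMIXTURE command line per value of K."""
    return [
        ["admixture", "--cv", bed_path, str(k)]
        for k in range(k_min, k_max + 1)
    ]


bed_file = "aap_pruned.bed"  # hypothetical PLINK .bed file of pruned SNPs
commands = admixture_commands(bed_file)

if shutil.which("admixture") and os.path.exists(bed_file):
    for cmd in commands:
        subprocess.run(cmd, check=True)  # writes <prefix>.<K>.Q and <prefix>.<K>.P
else:
    for cmd in commands:
        print(" ".join(cmd))  # dry run: just show what would be executed
```

Each run's `.Q` file holds the per-individual ancestry fractions that bar plots like the K = 8 one are drawn from.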

If the Nubian sample I have is representative, it seems plausible that Nubians do have a minor component of Egyptian ancestry, but that Nubians are by and large a more conventional East African population. And contrary to the speculation in the paper, Somalis have surprisingly little ancestry from the Bantu expansion. I’m a lot more confident of this assertion than about the nature of Nubians from my one sample: you see this pattern of Bantu exclusion from Afro-Asiatic groups in Ethiopia and Somalia no matter how you modulate the parameters. I’d bet $250 that Sudanese as a whole don’t have less Bantu admixture than the Somalis (both groups have a little affinity, perhaps more through common ancestry with Bantu groups than real Bantu expansion ancestry, despite the existence of a Bantu ex-slave class in Somalia). I’d bet $100 that my Nubian is representative and that Nubians don’t have as much gene flow from Egypt as this paper infers.

The best thing about people releasing genome data is that you can actually go beyond the armchair when it comes to critiquing a paper.

CATEGORIZED UNDER: Anthropology, Genetics, Genomics
  • Tom

    The southern regions of modern-day Egypt are still referred to as Nubia, adjacent as it is to Sudanese Nubia (which is the country’s northern region). The Nubian Museum is located in Aswan, Egypt. Travelling south in Egypt today, one will immediately recognise an increase in the distinctive appearance of the ethnic Nubian population.

  • Eze

    Many Egyptians show minor influence from the Central Sahara region due to the presence of Western components (green), the Nubian also shows this influence, while this is lacking in the far East Africans like the Ethiopians/Somalis. The Mediterranean component in the Nubian (absent in most Horn Africans) probably reflects Egyptian ancestry. By the way, it’s pretty cool you were able to test out results from a published paper so quickly.

  • Lank

    Something worth keeping in mind regarding the Bantu that pops up in the Horn of Africa is that it disappears in some runs. Why does this happen? Well, there is some “Bantu” in the HOA in the runs where the Bantu cluster absorbs the East African ancestry of Kenyan Bantus. In the runs where there is both a Bantu cluster and some East African in the Bantu Kenyans, the Bantu in the HOA is nowhere to be found, except for the far southern areas. That’s why I’ll say that it is most likely not real.

    Could you post some individual results for the sample groups? K=11 and K=12 look particularly interesting, there is a cluster that is considerably higher in the Ethiopian average than in any of the East African project members.

  • Eze

    Lank, K11/12 are available on the project page, they are a bit noisy though as the K is higher than the number of populations in that particular run. K6/7/8/9 seem to be the most informative.

  • Lank

    @Eze, I was referring to the results of individuals from the reference samples, which is only available for the Egypt Henn et al. samples.

  • http://forwhattheywereweare.blogspot.com/ Maju

    That’s why I often just speed-read what the authors write and go direct to the data: I was even able to ignore claims of “Bantu admixture” (how can they claim that if not a single sample of their study is Bantu?! – beats me).

    On the other side, I agree that 15 markers is ridiculously low, even if they are tested AIMs. But still (generally speaking) a wide sample is more important than a large set of markers, so just as you are probably right to be wary of the small number of markers used in this paper, you should be wary of a sample of one as well (no matter how many markers you’re using, he/she could be the odd Nubian who is not like the rest).

    Nevertheless the paper looks rather poor to me. With a little more effort (more markers, deeper K levels, a bit wider outgroup comparisons: Ethiopians, Chadians, Pygmies, some real Bantus), they could have provided us with a lot more information and themselves with a lot more prestige.

  • http://blogs.discovermagazine.com/gnxp Razib Khan

    Could you post some individual results for the sample groups? K=11 and K=12 look particularly interesting, there is a cluster that is considerably higher in the Ethiopian average than in any of the East African project members.

    i’ll do that later tonight on the AAP website. add it to your feed list, or check back manually now and then.

  • http://blogs.discovermagazine.com/gnxp Razib Khan

    . But still (generally speaking) a wide sample is more important than a large set of markers,

    the generally is important. there are diminishing marginal returns to the power of increased numbers of markers. but 15 is just so low. as i said, i’d be confident even with 50,000 SNPs. 1 million is overkill (which is why i didn’t run hundreds of thousands). that being said, 15 STRs which are AIMs are probably as good as 1 million SNPs at differentiating between europeans and chinese or chinese and africans. but at this fine scale i think 15 STRs are just not viable.

  • http://washparkprophet.blogspot.com ohwilleke

    Is 15 STRs equivalent to 15 bits of data, or are there more than two possible values for each STR (I presume that there are at least two possible values for each, because otherwise they wouldn’t be ancestry informative)?

  • Lank

    What I would like to add regarding the results of the Nubian is that they are more distinct from other East Africans (and the Egyptian contribution is perhaps more apparent) in runs that differentiate between different kinds of East African ancestry; in this case, K9 and upwards, but this has been visible in your previous AAP ADMIXTURE runs as well.

    When the Bulala, a Nilo-Saharan population like the Nubians, are separated from the Ethiopians and/or Maasai, the difference between this Nubian sample and other East African project members becomes clear. The Nubian is quite clearly more closely affiliated with the “Nilo-Saharan” cluster than the other East African project members, a reflection of their Nilo-Saharan heritage, and also has some European affinities that are absent elsewhere in East Africa, a sign of the historical interaction between Nubians and Egyptians.

Gene Expression

This blog is about evolution, genetics, genomics and their interstices. Please beware that comments are aggressively moderated. Uncivil or churlish comments will likely get you banned immediately, so make any contribution count!

About Razib Khan

I have degrees in biology and biochemistry, a passion for genetics, history, and philosophy, and shrimp is my favorite food. In relation to nationality I'm an American Northwesterner, in politics I'm a reactionary, and as for religion I have none (I'm an atheist). If you want to know more, see the links at http://www.razib.com
