An algorithm is just an algorithm

By Razib Khan | April 23, 2012 9:48 pm

In the comments below:

You should include a Moroccan or otherwise native North African sample. Without a North African sample, West Africans act as a proxy for some of the North African ancestry that does exist in Iberia, especially in the western third (Portugal, Galicia, Extremadura, León, etc.). With that sample your analysis would become more precise and you could make better-informed claims.

I read through the whole entry and there was no mention of the rather surprising and notable West African component in Iberians other than Basques. To my somewhat trained eye it is clear that this is a proxy for North African ancestry and not directly West African ancestry. This is demonstrably also the case in the Canary Islands, at least to a large extent, and, by extension, in Cuba (which is nearly identical to your average Canarian), at least for Cuba-1. Cuba-2 actually seems admixed at low levels, and both seem to have some Amerindian ancestry not present in Spain.

This is a fair point. I switched computers recently, and the Behar et al. data set I had seems to have become corrupted. So I snatched the Mozabites from the HGDP, and removed the Gujaratis from the previous run. I also added Russians, Druze, and some extra Amerindian groups. At K = 7 this pattern jumped out:


The Mozabites have “swallowed” most of the “African” component in the Iberian populations. But not in the two Cubans. The main objection I would have is that the Northern European groups now show some African, which is likely to be an artifact. It does seem unlikely though that the Cuban African element is due to these individuals being Northern European rather than Southwest European. These ADMIXTURE results also align Cuba 1 better with the PCA, where the individual was definitely an outlier.

Another illustration that knowledge comes not through blind adherence to methods, but human reflection.

CATEGORIZED UNDER: Anthropology
MORE ABOUT: Race
  • Jean Lohizun

    Razib could you kindly provide the numbers you used for each population. I mean, I know that the Iberian populations from the 1000 Genome Project have the following samples which have geographic origin:

    Castilla y Leon 1KG (12)
    Spanish (I suppose this comes from Behar et al.) (12)
    Castilla La Mancha 1KG (6)
    Extremadura 1KG (8)
    Cantabria 1KG (6)
    Pais Vasco 1KG (7)
    Catalonia 1KG (8)
    Valencia 1KG (10)
    Islas Baleares 1KG (6)
    Andalucia 1KG (4)
    Aragon 1KG (6)
    Galicia 1KG (8)
    Islas Canarias 1KG (2)

    So the total number of Iberians used amounts to 95. I do wonder why Murcia 1KG (8) wasn’t included in this run?
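The tally in the list above can be checked with a short sketch. The counts are copied from the comment; the dictionary and variable names are only illustrative.

```python
# Sample counts per Iberian population, copied from the list in the comment.
iberian_samples = {
    "Castilla y Leon 1KG": 12,
    "Spanish": 12,
    "Castilla La Mancha 1KG": 6,
    "Extremadura 1KG": 8,
    "Cantabria 1KG": 6,
    "Pais Vasco 1KG": 7,
    "Catalonia 1KG": 8,
    "Valencia 1KG": 10,
    "Islas Baleares 1KG": 6,
    "Andalucia 1KG": 4,
    "Aragon 1KG": 6,
    "Galicia 1KG": 8,
    "Islas Canarias 1KG": 2,
}

# Total number of Iberian individuals in the run.
total = sum(iberian_samples.values())

# Murcia 1KG (8) is mentioned as missing; this is what the total would be with it.
total_with_murcia = total + 8

print(total, total_with_murcia)
```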

    Anyways, I presume the Basque individuals are actually the French Basques from the HGDP, of which there are 24. Now what puzzles me the most is that both Spanish and French Basques are coming out at 7-10% North African (blue component), which is unlike anything I have ever seen before. I also wonder how big the CEU (Utah Whites) and Orcadian samples were combined, given that the red (European) component is actually centered on Orcadians and, to a lesser extent, CEU. Also puzzling is the presence of a West African (orange) component in Russians, Orcadians and CEU at levels twice those found in Extremadura and Galicia, and still larger than the one found in the Canary Islands. The rest of the Iberian populations display no West African (orange) component, but show larger values of North African than previously observed, especially the Spanish Basques. I wonder if you could consider introducing a Sardinian sample, to see if the North African showing up in Spanish Basques is indeed North African, or if it is the previously found Mediterranean component which peaks in Sardinians.

  • http://blogs.discovermagazine.com/gnxp Razib Khan

    #1, i put the master list and fam file online

    https://docs.google.com/open?id=0B8Tdg8RZEWOOalNMT2Flb0tDZTg

  • xfinity

    Hopefully this is going through, seemed like there was a glitch when I tried earlier.
    Thanks for the further workups.

    This is interesting. It looks to me that in the K=6 ADMIXTURE run I show more red than the average white American from Utah, and that the only thing my husband shows that is certainly large by Iberian peninsula standards is the Amerindian (light purple) component, yet it is no more than twice what the average CEU shows. As for the African, it is just slightly more than what those two Canary Islanders show, and for all we know, if we had more Canary Islander samples, he could align easily with them. There are some interesting things to mention about the K=7 run in regard to the Basques, which other posters have commented on.

    The results for Cuban 1, or me, are certainly more in line with what I would expect most white Cubans to show, particularly those from the western side of the island. I think if other Cubans submit their data you will see data more like mine, if not more European.

  • http://blogs.discovermagazine.com/gnxp Razib Khan

    #3, two points

    1) i will probably run a very narrow ADMIXTURE run with a lot of markers at some point to see if your admixture is “real” or not. as you suggest, your levels are low enough that there’s ambiguity, and we just can’t say with any confidence. do you have any citations on the number of canarians proportionally which came to cuba? an N = 2 is really not large enough to separate them from other iberians, so that is as far as i can go there.

    2) i may be able to phase your genotype and try and figure out WHEN/if admixture occurred. if it is canarian or iberian it should be older, but i might not be able to get enough precision….

  • http://forwhattheywereweare.blogspot.com/ Maju

    Thanks for taking my opinion into account and double-checking the matter. I would agree that it is likely that both Cubans have some Tropical African admixture, although very small in both cases and clearly much smaller than in the previous analysis (slightly above half the previous measure). Using a ruler to measure the percentages from the graph:

    Cuba 1: previous 4.3% TA, now: 2.5% TA
    Cuba 2: previous 6.5% TA, now: 3.6% TA

    *TA = Tropical African (or whatever the Yoruba component means).

    For comparison, Utah Whites show 2.1% TA, which makes the Cuban case a bit confusing after all (there’s also French ancestry in Cuba, for example, or it could even be “Moorish”, Morisco, Jewish…). Of course I do think that there is small Tropical African admixture, but I am no longer sure exactly how much, not with that “artifact” over there (Utahns are after all also Creoles, much like Cubans).

    I do agree in any case that “an algorithm is just an algorithm” and that the results must be taken with a pinch of salt, especially for very small proportions such as these.

    Edit: Upon second look, I feel that the Yoruba component now acts as a partial proxy for the Indian one, after the Gujarati samples were removed. The Indian component was somewhat larger among North Europeans than Iberians, and it is the former who now show an anomalous Yoruba component. This may also influence Cuba 2 (but not Cuba 1).
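The comparison above can be made explicit with a small sketch. It uses only the ruler-measured percentages quoted in this comment; the variable names are illustrative, and readings taken off a bar plot are rough, so the output should be treated as approximate.

```python
# Ruler-measured Tropical African (TA) percentages from the comment:
# (previous run, current run with Mozabites added).
estimates = {
    "Cuba 1": (4.3, 2.5),
    "Cuba 2": (6.5, 3.6),
}
utah_ta = 2.1  # TA percentage read off for Utah Whites in the current run

# Ratio of the new estimate to the previous one ("slightly above half").
ratios = {name: now / prev for name, (prev, now) in estimates.items()}

# How far each Cuban sits above the Utah baseline, in percentage points.
excess_over_utah = {
    name: round(now - utah_ta, 1) for name, (_, now) in estimates.items()
}

for name in estimates:
    print(f"{name}: ratio {ratios[name]:.2f}, "
          f"excess over Utah {excess_over_utah[name]} points")
```

With a baseline of about 2% in a population where the signal is probably an artifact, differences of this size are hard to interpret with any confidence.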

    I wonder if Tropical/Southern African samples can act that way, i.e. as de-facto representative of “other human” when the specific ancestry is not well represented. It was not so obvious when it acted as semi-proxy for North Africans, who do have some apparent TA admixture, but that’s not the case in India and yet the Yoruba component seems to make up partly for the lack of the Gujarati component in Northern Europeans.

    Curious to say the least.

  • DavidB

    I don’t know about Spain, but in Portugal there is some input of Sub-Saharan genes from the Portuguese empire – Angola, Mozambique, and Brazil (i.e. mainly West African slaves). It may be a small contribution, but I think it is noticeable, at least in Lisbon.

  • iberian

    The “input” comes from recent immigrants, not from the old slaves. Light-skinned people with signs of admixture are Brazilians or come from the former colonies. There is also a difference between an ethnic Portuguese and a Portuguese citizen. Anyway, the same “input” exists in Paris, London, Berlin…


Gene Expression

This blog is about evolution, genetics, genomics and their interstices. Please beware that comments are aggressively moderated. Uncivil or churlish comments will likely get you banned immediately, so make any contribution count!

About Razib Khan

I have degrees in biology and biochemistry, a passion for genetics, history, and philosophy, and shrimp is my favorite food. In relation to nationality I'm an American Northwesterner, in politics I'm a reactionary, and as for religion I have none (I'm an atheist). If you want to know more, see the links at http://www.razib.com
