Update: Please do not take the labels below (e.g., “Baloch”) as literal ancestral elements. The most informative way to read them is as indicators of populations where this element is common, and the relationship between the proportions can tell us something. The literal proportion does not usually tell us much.
I was browsing the Harappa results, and two new things jumped out at me. Zack now has enough St. Thomas Christian samples from Kerala that I think we need to accept as the likely model that this community does not derive from the Brahmins of Kerala, as some of them claim. Their genetic profile is rather like that of many non-Brahmin South Indians, except the Nair, who have a peculiar attested history with the Brahmins of their region.
But that’s not the really interesting finding. Below is a table I constructed from Zack’s data.
The term “BRICs” gets thrown around a lot these days. At least it gets thrown around by people who perceive themselves to be savvy and worldly. In case you aren’t savvy and worldly, BRICs just means Brazil, Russia, India and China: the huge rising economies of the past generation, and of the next. Here’s a summary from Wikipedia:
The BRIC thesis recognizes that Brazil, Russia, India and China…have changed their political systems to embrace global capitalism. Goldman Sachs predicts that China and India, respectively, will become the dominant global suppliers of manufactured goods and services, while Brazil and Russia will become similarly dominant as suppliers of raw materials. It should be noted that of the four countries, Brazil remains the only nation that has the capacity to continue all elements, meaning manufacturing, services, and resource supplying simultaneously. Cooperation is thus hypothesized to be a logical next step among the BRICs because Brazil and Russia together form the logical commodity suppliers to India and China. Thus, the BRICs have the potential to form a powerful economic bloc to the exclusion of the modern-day states currently of “Group of Eight” status. Brazil is dominant in soy and iron ore while Russia has enormous supplies of oil and natural gas. Goldman Sachs’ thesis thus documents how commodities, work, technology, and companies have diffused outward from the United States across the world.
But there are big quantitative differences between these nations as well. Below the fold are some charts which I think illustrate those differences.
I have put up a few posts warning readers to be careful of confusing PCA plots with real genetic variation. PCA plots are just ways to capture variation in large data sets and extract out the independent dimensions. PCA is great at detecting population substructure because the largest components of variation often track between-population differences, which consist of sets of correlated allele frequencies. Remember that PCA plots usually are constructed from the two largest dimensions of variation, so they will be drawn from just these correlated allele frequency differences between populations which emerge from historical separation and evolutionary events. Observe that African Americans are distributed along an axis between Europeans and West Africans. Since we know that these are the two parental populations this makes total sense; the between-population differences (e.g., SLC24A5 and Duffy) are the raw material from which independent dimensions can pop out. But on a finer scale one has to be cautious because the distribution of elements on the plot as a function of principal components is sensitive to the variation you input to generate the dimensions in the first place.
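For readers who want to see the "admixed population falls between its parents" pattern for themselves, here is a minimal toy sketch in Python. Everything in it is an assumption made up for illustration: the allele frequencies are simulated rather than drawn from HGDP or any real data, the names (`simulate_genotypes`, `pca_scores`, `admix_fraction`) are hypothetical, and the admixed group is modeled crudely by sampling genotypes from a weighted average of the two parental frequency vectors, which ignores ancestry tracts and individual variation in admixture.

```python
import numpy as np

def simulate_genotypes(freqs, n_individuals, rng):
    """Draw diploid genotypes (0, 1 or 2 copies of an allele) per SNP."""
    return rng.binomial(2, freqs, size=(n_individuals, freqs.size))

def pca_scores(genotypes, n_components=2):
    """Project individuals onto the top principal components via SVD."""
    centered = genotypes - genotypes.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

rng = np.random.default_rng(0)
n_snps, n_per_pop = 2000, 50

# Two made-up parental populations whose allele frequencies have drifted apart.
freq_a = rng.uniform(0.05, 0.95, n_snps)
freq_b = np.clip(freq_a + rng.normal(0.0, 0.2, n_snps), 0.01, 0.99)

# Assumed admixture: 80% ancestry from population A, 20% from population B.
admix_fraction = 0.8
freq_mix = admix_fraction * freq_a + (1.0 - admix_fraction) * freq_b

pop_a = simulate_genotypes(freq_a, n_per_pop, rng)
pop_b = simulate_genotypes(freq_b, n_per_pop, rng)
pop_mix = simulate_genotypes(freq_mix, n_per_pop, rng)

# Run PCA on everyone together, then compare average positions on PC1.
scores = pca_scores(np.vstack([pop_a, pop_b, pop_mix]))
pc1 = scores[:, 0]
mean_a = pc1[:n_per_pop].mean()
mean_b = pc1[n_per_pop:2 * n_per_pop].mean()
mean_mix = pc1[2 * n_per_pop:].mean()

print(f"PC1 mean, population A: {mean_a:.2f}")
print(f"PC1 mean, population B: {mean_b:.2f}")
print(f"PC1 mean, admixed:      {mean_mix:.2f}")
```

The admixed group should land between the two parental clusters on the first component, closer to A. The caution above still applies: which individuals you feed into `np.vstack` determines what the first component is, so changing the input set changes where everyone sits on the plot.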
I can give you a concrete example: me. I showed you my 23andMe ancestry painting yesterday. I didn’t show you my position on the HGDP data set because I’ve shared genes with others and I don’t want to take the step of displaying other people’s genetic data, even if at a remove. But I have re-edited some “demo” screenshots and marked where I am on the plot to illustrate what I’m talking about above. The first shot is my position on the two-dimensional plot of the first and second principal components of genetic variation from the HGDP data set.