Since the beginning of this weblog (I’ve been writing for eight years) heritability has been a major source of confusion. Even longtime readers misunderstand what I’m trying to get at when I talk about heritability. That’s why posts such as Mr. Luke Jostins’ are so helpful. I had seen references to a piece online, The Causes of Common Diseases are Not Genetic Concludes a New Analysis, but I hadn’t given it much thought until I read Ms. Mary Carmichael’s post DNA, Denial, and the Rise of “Environmental Determinism”. She begins:
Michael Pollan, the well-known writer on food and agriculture, is a smart guy. His arguments tend to be nuanced and grounded in common sense. I like his basic maxim on nutrition – “Eat food. Not too much. Mostly plants” – so much that I recently promoted it in a Newsweek cover story. He’s the last person I’d suspect of reactionary thinking, which is why I wish I didn’t have to say this: Michael Pollan has made a deeply unfortunate mistake.
A few days ago, speaking to his 43,000 followers on Twitter, Pollan linked to an essay written by an environmental advocacy group that spends much of its time fighting the depredations of Big Agriculture. Curiously, the essay wasn’t about ecological destruction or even about agriculture. It was about human genetics. It argued that since genetics currently can’t explain everything about inheritance, genes must not influence the development of disease, and thus the causes of illness must be overwhelmingly environmental (meaning “uninherited” as opposed to “caused by pollution,” though the latter category of factors is part of the former one). This was a little like arguing that your engine doesn’t power your car because sometimes it breaks down in a way that confuses your mechanic, and concluding that gasoline alone is sufficient to make a car with no engine run. But Pollan took the argument at face value. He said it showed “how the gene-disease paradigm appears to be collapsing.” He was troubled that its contentions apparently had gone unnoticed: “Why aren’t we hearing about this?!”
Of course I had seen Dr. Daniel MacArthur’s post Bioscience Resource Project critique of modern genomics: a missed opportunity in my RSS, but when I started reading the rebuttal I immediately thought “Dr. Dan’s interlocutors sound kind of dumb,” and I stopped reading. After reading the post I don’t think they’re dumb, I think they’re being lawyerly. Much of the piece is a rhetorical tour de force in leveraging the prejudices and biases of the intended readership. This is the Intelligent Design version of Left-wing “Blank Slate” Creationism.* They smoothly manipulate real findings in a deceptive shell game intended to convince the public, and shape public policy. Their success is evident in Pollan’s response. “X paradigm appears to be collapsing.” “Why aren’t we hearing about this?” Does this sound familiar? Like Dr. MacArthur I think some of the criticisms within the piece are valid. Despite not being hostile to the maxim “better living through chemistry,” I do think that there has been an excessive trend toward pharmaceutical or surgical “cures” in relation to diseases of lifestyle (anti-depressants, gastric bypass, etc.). But we go down a very dangerous path when we make recourse to shoddy means toward ostensibly admirable ends. This sort of discourse is not sustainable! (just used a buzzword intended to appeal right there!)
I honestly can’t be bothered to say much more when so many others already have. This is a boat I missed. But if some of what I say above isn’t clear, I recommend you read the original essay. Then read Dr. MacArthur and Ms. Carmichael. If you’re hungry for more, Ms. Carmichael has a helpful list of links.
* Left Creationism had its most negative manifestation as Lysenkoism, but it suffuses the outlook of many who fear the emergence of a new Nazi abomination. Leon Kamin in the 1970s even claimed that IQ was not heritable at all! Though he backed off such an extreme position, it shows how confident he was that he could claim such a thing.
Update: Please see follow up post.
An international team of scientists has identified a previously shadowy human group known as the Denisovans as cousins to Neanderthals who lived in Asia from roughly 400,000 to 50,000 years ago and interbred with the ancestors of today’s inhabitants of New Guinea.
John Hawks, The Denisova genome FAQ:
The most significant finding in the paper is the demonstration that some living humans trace a significant fraction of their ancestry to the population represented by the Denisova genome. As in the case of Neanderthals, different human populations show significantly different levels of similarity to the Denisova sequence. For Neanderthals, the similarities indicated between one and four percent Neanderthal ancestry for living people outside of Africa. In the case of the Denisova sequence, the greatest similarities are with living people in Melanesia – in this paper, represented by genome samples from Papua New Guinea and Bougainville. The similarities are consistent with approximately 4% contribution of a Denisova-like population to the ancestry of these living Melanesians.
The paper, Genetic history of an archaic hominin group from Denisova Cave in Siberia. It isn’t open access, but I assume the supplements are.
It was famously reported last winter that Bushmen seem to differ genetically amongst themselves more than Europeans and Asians do. These two latter groups have been separate for at least 40,000 years.
At least? Razib, you are way off on the separation time of Europeans and East Asians. I think it’s much closer to 30,000 years at most. There is growing evidence that ancestral Europeans and ancestral East Asians were one and the same people until 22,500 years ago.
Present-day Europeans and East Asians descend largely from a small nomadic population that once roamed Eurasia’s northern tier—a belt of steppe-tundra that stretched from southwestern France to Beringia during the last ice age. This population then split in two around the time of the glacial maximum (Rogers, 1986; Crawford et al, 1997). Chronologically, this barrier to east-west gene flow matches the dating by Laval et al. (2010) of the split between ancestral Europeans and ancestral East Asians.
The italics are my words, emphasized by the commenter. The bolding is the commenter’s as well (I had to fix some HTML in that comment, but I think I corrected in line with the commenter’s intent). I read (and blogged) the paper cited, so I’m well aware of the low bound value implying more recent common ancestry of East and West Eurasians posited here. I’m moderately skeptical. Part of the issue is that these sorts of computational models are tricky, and most of us aren’t versed in the various moving parts which go into constructing the model. Consider the following from the cited paper:
Countdown to Christmas! Hope everyone has pleasant holidays.
Apple v Google. Very long article highlighting the different strategies of the two companies. I do though think Google is starting to get a touch annoying trumpeting their “open ways.” They’re not a struggling start-up, they’re a massive corporation.
Hmong’s new lives in Caribbean. Since arriving in the 1970s they have come to make up 1% of French Guiana’s population, but control 70% of the agriculture.
School girls in Hunza, Pakistan
A few days ago I observed that pseudonymous blogger Dienekes Pontikos seemed intent on throwing as much data and interpretation into the public domain via his Dodecad Ancestry Project as possible. What are the long term implications of this? I know that Dienekes has been cited in the academic literature, but it seems more plausible that this sort of project will simply distort the nature of academic investigation. Distort has negative connotations, but it need not be deleterious at all. Academic institutions have legal constraints on what data they can use and how they can use it (see why Genomes Unzipped started). Not so with Dienekes’ project. He began soliciting for data ~2 months ago, and Dodecad has already yielded a rich set of results (granted, it would not be possible without academically funded public domain software, such as ADMIXTURE). Even if researchers don’t cite his results (and no doubt some will), he’s reshaping the broader framework. In other words, he’s implicitly updating everyone’s priors. Sometimes it isn’t even a matter of new information, as much as putting a spotlight on information which was already there. Below is a slice of a bar plot from Worldwide Human Relationships Inferred from Genome-Wide Patterns of Variation. It uses STRUCTURE with K = 7. To the right of the STRUCTURE slice are two plots of individual data on French and French Basque from the same HGDP data set using ADMIXTURE at K = 10 from Dodecad.
I mentioned a few days ago that a friend was trying to get together some data to analyze the genetic variation of South Asians. By a strange coincidence Dienekes just published a more detailed analysis of South Asians…and uncovered something very interesting, though not that surprising. Some technical preliminaries:
A note of caution: The reduced marker set (~30k) means that a lot of noise is added in the admixture estimates. In particular, many individuals are likely to get low-level admixture from population sources that can be attributed to noise. But, as we will see, the small marker set does not really affect either the power of the GALORE approach, or of ADMIXTURE to infer meaningful clusters.
In addition to the various online sources of public data Dienekes got about a dozen South Asians. I was one of those South Asians, DOD075. In many ways I’m a rather standard issue South Asian, similar to Gujaratis, except that I have a substantial ‘East Asian’ component. More concretely, between 1/6 and 1/7 of my ancestry seems to be of eastern origin, far higher than the norm among South Asians. The rest of my ancestry was mostly South Asian specific, with a minor, but significant ‘West Asian’ component common across northern India.
Rerunning with more data and different samples, Dienekes came out with a different set of ancestral components. Of particular interest to me, he broke down the East Asian component between East Asian proper and Southeast Asian. Below is a selection of populations with ancestral components + me. I’ve also renamed a few components: North Kannadi = Dravidian, Irula = Indian tribal, and Indian = Generic Indian. Looking at the Fst it seems that Indian endogamy and population bottlenecks have had an effect…look at the North Kannadi distance from everyone else.
Most readers at this point are aware that I am very curious as to the origin of Europeans at the interface of hunter-gatherer populations and Neolithic farmers. What we thought we knew around the year 2000 does not seem to align very well with the conflicting results coming out of recent analyses. There is no ascendant consensus at this point. All possibilities are still in the field of play.
Part of the issue of course is that the spread of farming in Europe was a prehistoric affair, very far back in time. There have been several cultural, and possibly demographic, revolutions in Europe since the arrival of the first farmers. Pulling apart the manifold layers of the palimpsest is a task of great difficulty, and the confident results utilizing the tools of the past should make us cautious of the inferences of the present. This is why I believe focusing on case-studies such as Japan is essential. From a range of specific cases we may infer general patterns, which will then allow us to have firmer ground when making conjectures about times and places where the empirical results cannot resolve conflicts of opinion. Japan is an island which made the transition to agriculture relatively late, so the fog is a bit less daunting than is the case with prehistoric Europe. Africa is perhaps an even better case: the Bantu expansion is very recent, having reached its stable frontier only within the last ~1000 years. If it is true that Phoenicians circumnavigated Africa around 600 BC, then they would have encountered many non-Bantu groups south of Kenya at that time. Linguists have long noted the similarities of the Bantu languages from north to south, but the genetic similarities also exist. Here’s a figure from the Bushmen paper in Nature:
First, read Ed Yong’s post. There’s real reporting in it. Such as:
There’s also an issue with attitudes among people in the field. “Biologists were already convinced that genes and genomic variation were key to understanding problems in their field,” he adds. “Social scientists and humanists do not now work with large digital text collections, and relatively few of them now believe that they should do so.”
Genomics did not mean that genetics disappeared. Broad surveys complement, they don’t substitute. You can read the paper for free at Science if you register, Quantitative Analysis of Culture Using Millions of Digitized Books:
We constructed a corpus of digitized texts containing about 4% of all books ever printed. Analysis of this corpus enables us to investigate cultural trends quantitatively. We survey the vast terrain of “culturomics”, focusing on linguistic and cultural phenomena that were reflected in the English language between 1800 and 2000. We show how this approach can provide insights about fields as diverse as lexicography, the evolution of grammar, collective memory, the adoption of technology, the pursuit of fame, censorship, and historical epidemiology. “Culturomics” extends the boundaries of rigorous quantitative inquiry to a wide array of new phenomena spanning the social sciences and the humanities.
Ainu in 19th century Hokkaido, and rice paddies
Unlike some islands Japan has a long history of human habitation. More interestingly, under the Jomon culture the Japanese archipelago was home to one of the earliest, if not the earliest, societies which used pottery. The Jomon do not seem to have been intensive agriculturalists. Rather, with a widespread marine littoral they likely maintained extremely high population densities, and at least semi-sedentary habitation patterns, simply through a hunting & gathering mode of production. Pacific Northwest Amerindians are likely a good analogy. They also relied on a dense stock of marine life to maintain population densities of a high level and a sedentary lifestyle.
About 2,000 years ago the Yayoi people arrived in Japan. The first Yayoi settlements are in northern Kyushu. These people brought intensive agriculture, in particular rice agriculture, to the Japanese archipelago. The general assumption is that the Yayoi are the precursors of the Japanese who entered into the international system of East Asia during the Tang dynasty in the second half of the first millennium. The Ainu of Hokkaido are presumed to be the descendants of the remaining Jomon people, maintaining a hold in the northern island because of its ecological unsuitability to Japanese agriculture.
The question is: what proportion of the ancestry of modern Japanese is Jomon/Ainu, and what proportion is Yayoi? The dynamics here are nicely constrained by the fact that Japan is a relatively isolated island system. The Yayoi seem to have arrived at one discrete moment in history, and rapidly expanded in ~1,000 years to all the main islands of Japan, though the full settlement of Hokkaido commenced in the 19th century. Interestingly, parts of northern Honshu seem to have had a distinct post-Jomon culture down to ~1000 AD.
Conveniently the HapMap has both Japanese and Chinese samples, but often there hasn’t been too much focus on the differences between these two groups because they’re very close in a global context when compared to the Yoruba or Europeans. In more recent analyses of East Asian groups the coverage seems to be better with various Chinese ethnic groups, but relatively few samples from Siberian populations. The latter are critical because the supposition is that these are the groups which would have the most affinities with the Jomon, due to the culture and contacts of the Ainu which were evident during the modern period.
Dienekes’ most recent post on K = 15 ancestral components in ADMIXTURE clarifies some issues in this regard. There are multiple Han Chinese and Japanese samples, as well as a wide range of East Asian and Siberian groups. I’ve reedited and formatted K = 15 a bit, with the aim of focusing on the relationships of the Japanese in particular.
Nature profiles Dodecad, the Pickrell Affair, and the emergence of amateur genomicists in a new piece. Interestingly David of BGA is going to try and get something through peer review. In particular, the relationship of Assyrians and Jews.
So we have Genomes Unzipped, Dodecad, and BGA. What next? Who next? I hope Dienekes doesn’t mind if I divulge the fact that the computational resources needed to utilize ADMIXTURE as he has are within the theoretical capability of everyone reading this post. Rather, the key is getting familiar with PLINK and writing some code to merge data sets. After you do that, to really add value you’d probably want to get raw data from more than what you can find in the HGDP, HapMap and other public resources.
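To give a flavor of the "code to merge data sets" step, here is a minimal sketch in Python. It assumes the standard six-column PLINK .bim layout (chromosome, variant ID, genetic distance, base-pair position, allele 1, allele 2) and simply finds the SNPs two data sets share. The function names and the toy lines are hypothetical, this is not how Dienekes or David do it, and it ignores strand flips, which a real merge has to deal with. In practice PLINK's own merge commands do the heavy lifting.

```python
def read_bim(lines):
    """Parse PLINK .bim lines into {variant_id: (chrom, bp, alleles)}."""
    variants = {}
    for line in lines:
        fields = line.split()
        if len(fields) != 6:
            continue  # skip blank or malformed lines
        chrom, variant_id, _cm, bp, a1, a2 = fields
        # A frozenset ignores which allele is listed first.
        variants[variant_id] = (chrom, int(bp), frozenset((a1, a2)))
    return variants


def shared_variants(bim_a, bim_b):
    """Split the shared variant IDs into consistent and mismatched lists."""
    keep, mismatched = [], []
    for variant_id in bim_a.keys() & bim_b.keys():
        if bim_a[variant_id] == bim_b[variant_id]:
            keep.append(variant_id)
        else:
            mismatched.append(variant_id)
    return sorted(keep), sorted(mismatched)


# Toy .bim lines (hypothetical IDs and positions, for illustration only).
hgdp_bim = ["1 rs1 0 1000 A G", "1 rs2 0 2000 C T", "2 rs3 0 500 A C"]
hapmap_bim = ["1 rs1 0 1000 G A", "1 rs2 0 2000 C G", "2 rs4 0 900 T C"]

keep, mismatched = shared_variants(read_bim(hgdp_bim), read_bim(hapmap_bim))
print("keep:", keep)
print("check by hand:", mismatched)
```

The list of shared IDs is what you would hand to PLINK to extract the common marker set before running ADMIXTURE; the mismatched ones are the SNPs worth inspecting rather than silently dropping.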
But here I make an open offer: if you start a blog or a project which replicates the methods of Dodecad and BGA I’ll link to you and promote you. When Dienekes began Dodecad I actually started to play around with the data sets in ADMIXTURE, but I’ve personally held off until seeing what he and David find. What their pitfalls and successes might be. Here’s to 2011 being more interesting than we can imagine!
Update: Already had a friend with a computational background contact me about doing something on South Asian genomics. So again: if you get a site/blog set up, and start pumping out plots, I will promote you. In particular, if you need 23andMe raw data files of geographical region X it might be useful to try and get the word out via blogs and what not.
In my post below I quoted my interview with L. L. Cavalli-Sforza because I think it gets to the heart of some confusions which have emerged since the finding that most variation at any given locus is found within populations, rather than between them. The standard figure is that 85% of genetic variance is within continental races, and 15% is between them. You can see some Fst values on Wikipedia to get an intuition. Concretely, at a given locus X in population 1 the frequency of allele A may be 40%, while in population 2 it may be 45%. Obviously the populations differ, but the small difference is not going to be very informative of population substructure when most of the variation is within populations.
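To put a number on the 40% versus 45% example, here is a rough sketch in Python. It assumes the textbook single-locus definition of Fst for two equally weighted populations, (H_T - H_S) / H_T, where H_S is the average expected heterozygosity within populations and H_T is the expected heterozygosity of the pooled population. The function name is mine, and real estimators correct for sample size, which this ignores.

```python
def fst_two_populations(p1: float, p2: float) -> float:
    """Single-locus Fst for two equally weighted populations.

    p1 and p2 are the frequencies of the same allele in each population.
    """
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)  # heterozygosity of the pooled population
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population
    if h_total == 0:
        return 0.0  # locus is fixed everywhere, nothing to partition
    return (h_total - h_within) / h_total


fst_example = fst_two_populations(0.40, 0.45)
print(f"Fst = {fst_example:.4f}")  # prints "Fst = 0.0026"
```

So at this one locus well under 1% of the variance is between the two populations, which is why a single marker like this tells you almost nothing about where an individual comes from.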
But there are loci which are much more informative. Interestingly, one controls variation on a trait which you are familiar with, skin color (unless you happen to lack vision). A large fraction (on the order of 25-40%) of the between population variance in the complexion of Africans and Europeans can be predicted by substitution on one SNP in the gene SLC24A5. The substitution has a major phenotypic effect and exhibits a great deal of between population variation. One variant is nearly fixed in Europeans, and another is nearly fixed in Africans. In other words the component of genetic variance on this trait that is between population is nearly 100%, not 15%. This illustrates that the 15% value was an average across the genome, and in fact there are significant differences on the genetic level which can be ancestrally informative. You can take this to the next level: increase the number of ancestrally informative markers to obtain a fine-grained picture of population structure. In the illustration above the top panel shows the frequencies at the SNP mentioned earlier on SLC24A5. The second panel shows variation at another SNP controlling skin color, SLC45A2. This second SNP is useful in separating South and Central Asians from Europeans and Middle Easterners, if not perfectly so. In other words, the more markers you have, the better your resolution of inter-population difference. This is why I found the following comment very interesting:
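The "more markers, better resolution" point can be illustrated with a toy simulation in Python. Everything here is an assumption made for illustration: two hypothetical populations, independent markers, and the same weak 40% versus 45% allele frequency difference at every marker. Each simulated individual is assigned to whichever population makes its genotype more likely.

```python
import math
import random


def simulate_genotype(freqs, rng):
    """Diploid genotype: count of the allele (0, 1 or 2) at each marker."""
    return [(rng.random() < p) + (rng.random() < p) for p in freqs]


def log_likelihood(genotype, freqs):
    """Binomial log-likelihood of the allele counts (constant terms dropped)."""
    return sum(
        k * math.log(p) + (2 - k) * math.log(1 - p)
        for k, p in zip(genotype, freqs)
    )


def assignment_accuracy(n_markers, n_individuals=400, seed=0):
    """Fraction of simulated individuals assigned to their true population."""
    rng = random.Random(seed)
    freqs_a = [0.40] * n_markers  # hypothetical population A
    freqs_b = [0.45] * n_markers  # hypothetical population B
    correct = 0
    for i in range(n_individuals):
        from_a = i % 2 == 0  # alternate the true population of origin
        genotype = simulate_genotype(freqs_a if from_a else freqs_b, rng)
        guess_a = log_likelihood(genotype, freqs_a) > log_likelihood(genotype, freqs_b)
        correct += guess_a == from_a
    return correct / n_individuals


for n_markers in (1, 10, 100, 400):
    print(n_markers, "markers:", assignment_accuracy(n_markers))
```

With one marker the assignment is barely better than a coin flip, but the accuracy climbs steadily as markers are added, even though no single marker is any more informative than before. Real markers are not independent and real populations are not this tidy, so treat the exact numbers as illustrative only.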
I decided to take the Dodecad ADMIXTURE results at K = 10, and redo some of the bar plots, as well as some scatter plots relating the different ancestral components by population. Don’t try to pick out fine-grained details, see what jumps out in a gestalt fashion. I removed most of the non-European populations to focus on Western Europeans, with a few outgroups for reference.
Here’s a table of the correlations (I bolded the ones I thought were interesting):
| W Asian | NW African | S Europe | NE Asian | SW Asian | E Asian | N European | W African | E African | S Asian |
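For anyone who wants to build a correlation table like this from their own ADMIXTURE output, here is a minimal sketch in Python. The population averages below are made up for illustration and are NOT Dodecad's results; only three of the ten components are shown, and the function name is mine. Each list holds one component's average fraction across the same five hypothetical populations.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical population-average ancestry fractions (illustration only).
components = {
    "N European": [0.70, 0.55, 0.40, 0.25, 0.10],
    "S Europe": [0.25, 0.35, 0.45, 0.55, 0.60],
    "W Asian": [0.05, 0.10, 0.15, 0.20, 0.30],
}

correlations = {
    (a, b): pearson(components[a], components[b])
    for a, b in combinations(components, 2)
}
for (a, b), r in correlations.items():
    print(f"{a} vs {b}: r = {r:+.2f}")
```

One caveat worth keeping in mind: the fractions for each population sum to one, so some negative correlation between components is built in by construction.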
In response to my post from this weekend positing that the Sardinians are a particularly pristine distillation of the genetic heritage of Europe’s first Neolithic farmers, a friend suggests that I compare & contrast Sardinian actress Caterina Murino and the depictions of women which one sees on the walls of Minoan palaces. The Minoans being the presumably pre-Indo-European people who were responsible for an ancient civilization on Crete, before their conquest by Bronze Age Greeks sometime in the 15th century BCE.
The title is a reference to the daughter of king Minos of Knossos, as well as what Heinrich Schliemann asserted after he discovered the ‘mask of Agamemnon’. In the late Bronze Age the people of Crete and Sardinia were associated with the ‘Sea Peoples’, raiders whose assaults on the kingdoms of the ancient Near East resulted in the fall of the Mycenaeans and Hittites, and the near collapse of Egypt.
Also, since I’ve been referencing Dodecad’s 10 putative ancestral components, I thought it would be useful to point to the phylogenetic relationships between them. Remember, these components are not necessarily real concrete ancient populations which were amalgamated. For example, the positions of the “Northwest African” and “South Asian” ancestral components may be a function of ancient admixture events which have distributed across the population and produced a distinctive stabilized allelic profile. I also wouldn’t be surprised if both the “Southern” and “Northern” European components resemble each other because of an underlying Paleolithic European substrate which is represented by haplogroups U5 and I.
Addendum: I would have juxtaposed the Minoan image with this photo, but copyright issues.
Image Credit: Kalumba2009.
Over at Reason Ron Bailey has an excellent piece up, I’ll Show You My Genome. Will You Show Me Yours? He reviews his results from two genotyping chips, and has placed his results online. I doubt readers of this weblog will learn anything that new, though the article might prove illuminating to friends & family. But, some cautions: