Tag: Human Genetics

Human mutation unveiled

By Razib Khan | May 21, 2013 11:38 am

Credit: Campbell, Catarina D., and Evan E. Eichler. “Properties and rates of germline mutations in humans.” Trends in Genetics (2013).

What a great age we live in. Until recently critical parameters in population genetics such as mutation rates had to be inferred and assumed, even though they served as bases for much more complex inferences. Now with humans (and humans are only the beginning!) much of what was inferred is being assessed in a more direct fashion. Catarina Campbell and Evan Eichler have a review in Trends in Genetics which surveys the field as it stands now, Properties and rates of germline mutations in humans. Notice that there’s a rough convergence using pedigree analysis on a mutation rate in the low 10⁻⁸ range. Additionally, it does seem that a disproportionate number of novel mutations come through the paternal lineage via sperm. This should increase our moderate worry about older fathers (something reiterated in the piece, with caveats). Finally, the authors suggest these results are a floor for the mutational rate, in part due to the long term conflict with the inferred ‘evolutionary rates,’ which are higher. This matters because to infer the last common ancestors between lineages the value of the mutation rate is obviously critical.
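To make the stakes concrete, here is a minimal sketch of how a split-time estimate scales with the assumed mutation rate under a simple molecular clock. The divergence value, the two candidate rates, and the generation time are illustrative assumptions, not figures taken from the review, and the calculation ignores ancestral polymorphism.

```python
def split_time_years(divergence_per_site, mu_per_site_per_gen, gen_time_years):
    """Years since two lineages split under a simple molecular clock.

    Differences accumulate along both branches, so d = 2 * mu * t.
    """
    generations = divergence_per_site / (2.0 * mu_per_site_per_gen)
    return generations * gen_time_years


# Hypothetical inputs, for illustration only.
d = 0.012        # fraction of sites that differ between the two lineages
gen_time = 25.0  # years per generation

for label, mu in [("lower, pedigree-style rate", 1.2e-8),
                  ("higher, 'evolutionary' rate", 2.5e-8)]:
    t = split_time_years(d, mu, gen_time)
    print(f"{label}: mu={mu:.1e} -> {t / 1e6:.1f} million years")
```

Because the rate sits in the denominator, roughly halving it roughly doubles every date that depends on it.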

CATEGORIZED UNDER: Evolution, Evolutionary Genetics

Europeans share common ancestors to differing extents

By Razib Khan | May 9, 2013 5:23 am

Don’t forget the deep structure in Italy!
Credit: Rita Molnar

Standard apologies that I have not had the marginal time to blog much, but I thought it was important that I at least note that Dr. Peter Ralph and Dr. Graham Coop’s paper on identity-by-descent segments and European populations and history is out in its final form in PLoS Biology, The Geography of Recent Genetic Ancestry across Europe. I’ve been familiar with the outlines of these results for about a year now, and to be frank I am still digesting them. The media hype will come and go, with true but to some extent trivial headlines that “all Europeans are related,” but the consequences of these sorts of genetic inquiries into the relatedness of populations are going to be long lasting. At least they should be.

But before I go on about that, if you find the paper itself a bit daunting (though the main body of the text strikes me as eminently readable for a piece of statistical genetics), see Carl Zimmer’s condensation. With this sort of result there is liable to be confusion, so note that Graham Coop has been posting comments on Carl’s blog (and elsewhere, and you can always send him a note on Twitter). Additionally he has a very readable FAQ out. Dr. Coop told me on Twitter that there would even be updates tomorrow as well! In particular, one aspect of the paper which I noticed is that most relatively short but detectable segments (~10 cM) shared between any two individuals in many nationalities are not going to be evidence of recent genealogical affinities, but of deeper historical processes.

Models are great, because rejection is easy

By Razib Khan | April 23, 2013 1:20 am

There’s a new paper in PLoS ONE, Female and Male Perspectives on the Neolithic Transition in Europe: Clues from Ancient and Modern Genetic Data, which uses a combination of contemporary and ancient (that is, from subfossils) Y and mitochondrial DNA to understand the demographic past of Europe. Recall that the Y traces the direct male lineage, and the mtDNA the direct female lineage. Because they don’t recombine and generate clean convergence back to a last common ancestor (there is no reticulation because there is no sex on these loci; they’re inherited from one of the two parents), they’re amenable to a lot of nifty demographic inference. In this paper they test specific models, and produce probability distributions of those models. Since it is open access I invite you to read the paper. The problem with these sorts of papers is I have a hard time trusting them until I replicate the results or have a sense of how cranky the software/code is!

CATEGORIZED UNDER: Human Genetics
MORE ABOUT: Human Genetics

Ancient Ainu mariners!

By Razib Khan | April 11, 2013 7:44 pm

Ainu man from 1870 (colorized)

Well, not really. But a new paper in PLOS GENETICS has a really weird speculation nested into the discussion of what seems a relatively banal paper on the phylogeography of South Americans. It’s a Y chromosomal survey of the populations of the New World, so it’s tracing the male lineage only. Because Amerindian populations likely went through at least one (more if you accept multiple migrations) bottleneck the variation on the Y chromosome is low. Ideally you’d be looking at tens of thousands of markers on the autosome, the non-sex inherited genome. But this group had a very good population coverage. Over 1,000 men from 50 tribal populations, with a focus on South America. Additionally, non-recombining markers are more manageable in terms of reconstructing demographic histories.

CATEGORIZED UNDER: Human Genetics

The inevitability of eugenics…as preventative health

By Razib Khan | March 27, 2013 12:46 pm

Inbred lineage. The Role of Inbreeding in the Extinction of a European Royal Dynasty, Alvarez et al.

Every now and then Richard Dawkins stirs controversy by bringing up the topic of eugenics. This is not surprising in terms of Dawkins’ intellectual pedigree. The most influential British evolutionary biologist in the generation before Dawkins, R. A. Fisher, was a eugenicist. Arguably the most eminent evolutionist of Dawkins’ own generation, W. D. Hamilton, clearly had eugenical sympathies, though he was keenly aware how unfashionable that had become.* University College London’s Galton Laboratory still had the word eugenics in its title until 1965. More recently Dawkins has brought up the issue of consanguinity amongst the British Pakistani community, a practice which one might argue is non-eugenical due to the high rate of recessive diseases.

Genes are not esoteric knowledge

By Razib Khan | February 19, 2013 10:53 pm

Over at Slate the advice columnist received an email from a man who found out that his wife is really his half-sister. If you don’t want to follow the link, the back story is straightforward: the couple’s parents were lesbians, and used sperm donors. Recently the man sought out the identity of his biological father at the urging of his wife, because they have three children and she thought it would be important to have that information for them. That is how he found out that they shared the same biological father. Here is the part that has me concerned about realism on the part of the advice columnist:

I don’t see how you can keep this information to yourself. She’s bound to sense something off in your behavior and you simply can’t say, “I’m struggling with father issues.” I think you have to sit her down and show you what you’ve discovered. Then you two should likely seek out a counselor who deals with reproductive technology to help you sort through your emotions. I don’t see why your healthy children should ever be informed of this. That Dad didn’t want to find out who his sperm donor was is a sufficient answer when they get old enough to ask about this.

CATEGORIZED UNDER: Human Genetics
MORE ABOUT: Human Genetics

Visualizing European genetic variation: looking at dimensions which aren’t so boring

By Razib Khan | February 14, 2013 2:40 am

Yesterday I re-ran Plink with a narrower European-biased data set, and generated some MDS plots. I only had a few Asian and African populations, mostly so that I could replicate the standard dimensions 1 and 2, producing the classic “v-shape” which you’ve seen before. But what’s more interesting are lower coordinates. They may not capture as much of the variation in the distance matrix, but illustrate important dynamics. I haven’t used the directlabels package yet, so right now the labels are still imperfect. I’m giving black text as well as colored text. Also, here’s the original data (as in MDS results, not the raw data).
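For readers who want to see what "capturing variation in the distance matrix" means mechanically, here is a minimal sketch of textbook classical (Torgerson) MDS in plain Python. It is not PLINK's implementation, the power-iteration eigen-solver is a simplification chosen to avoid dependencies, and the 4x4 distance matrix is a made-up toy.

```python
import math


def double_center(dist):
    """Turn a symmetric distance matrix into the Gram matrix -1/2 * J * D^2 * J."""
    n = len(dist)
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    return [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
             for j in range(n)] for i in range(n)]


def top_eigenpair(matrix, iterations=500):
    """Largest eigenvalue and its eigenvector, found by power iteration."""
    n = len(matrix)
    vec = [float(i + 1) for i in range(n)]  # arbitrary deterministic start
    for _ in range(iterations):
        new = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in new))
        if norm == 0.0:
            break
        vec = [x / norm for x in new]
    value = sum(vec[i] * sum(matrix[i][j] * vec[j] for j in range(n))
                for i in range(n))
    return value, vec


def classical_mds(dist, n_dims=2):
    """Return (coordinates, eigenvalues) for the first n_dims dimensions."""
    gram = double_center(dist)
    n = len(gram)
    coords = [[0.0] * n_dims for _ in range(n)]
    eigenvalues = []
    for d in range(n_dims):
        value, vec = top_eigenpair(gram)
        eigenvalues.append(value)
        scale = math.sqrt(max(value, 0.0))
        for i in range(n):
            coords[i][d] = scale * vec[i]
        # Deflate: subtract the dimension just extracted before finding the next.
        gram = [[gram[i][j] - value * vec[i] * vec[j] for j in range(n)]
                for i in range(n)]
    return coords, eigenvalues


# Toy distances among four hypothetical populations.
toy = [[0, 3, 4, 5],
       [3, 0, 5, 4],
       [4, 5, 0, 3],
       [5, 4, 3, 0]]
coords, eigenvalues = classical_mds(toy, n_dims=2)
total = sum(eigenvalues)
for k, ev in enumerate(eigenvalues, start=1):
    print(f"dimension {k}: eigenvalue {ev:.2f}, share of these two {ev / total:.2f}")
```

Each successive dimension has a smaller eigenvalue, which is the sense in which later coordinates explain less of the distance matrix while still separating groups the first two lump together.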

MORE ABOUT: Human Genetics

Visualizing Europe genetically

By Razib Khan | February 13, 2013 5:54 am

This is a follow up to my post from yesterday. In case you care about the technical details (after I clean this stuff up I will put it on GitHub) I’m using R’s adehabitat package to create a 95% distribution curve after smoothing with kernel density. The goal is to give you a better intuition about where the populations are dispersed across two dimensional visualizations of genetic variation.

Thinking about how to plot text, I came up with a quick hack, which just used the initial data and found the median x and y position. That explains why some of the labels are shifted so; in populations with a huge range the label position is going to be sensitive to not being smoothed (if you know how to pull the centroid out of the kver, tell!). I’ve given them colors and also used black. The latter actually seems to be clearer!
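One possible alternative to the median hack, sketched in Python rather than R: if the vertices of a 95% contour can be extracted as ordered (x, y) pairs, the area-weighted centroid follows from the shoelace formula. How to get the vertices out of the adehabitat object is not shown here, and the function name is hypothetical.

```python
def polygon_centroid(vertices):
    """Area-weighted centroid of a simple polygon.

    vertices: list of (x, y) pairs in order around the boundary
    (clockwise or counter-clockwise; the sign cancels out).
    """
    area2 = 0.0  # twice the signed area
    cx = cy = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if area2 == 0.0:
        raise ValueError("degenerate polygon with zero area")
    return cx / (3.0 * area2), cy / (3.0 * area2)


# A unit square as a stand-in for a contour.
print(polygon_centroid([(0, 0), (1, 0), (1, 1), (0, 1)]))
```

A centroid can still land outside a strongly concave contour, so for crescent-shaped populations the median of the raw points may remain the safer label position.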

Note: This is not just for fun, as I plan to start rolling out results and methods from some of the data sets I have more regularly in the near future.

CATEGORIZED UNDER: Genetics, Genomics

Social barriers to Indian public health & non-Aryan “invasions”

By Razib Khan | February 5, 2013 1:16 pm

A reader points me to a talk given by David Reich at the Center for Human Genetic Research 2013 Retreat. One of the issues Reich brought up is old, but perhaps worth reemphasizing: due to endogamy many South Asians carry a higher load of recessive ailments. This is not due to recent inbreeding (which is barred by custom in many South Asian groups, which enforce kin-level exogamy), but long term genetic isolation. Over time even a moderate sized population can be affected by drift. This was one of the major points in the 2009 paper Reconstructing Indian History, but not one particularly emphasized in the press follow up. A major implication is that a relatively simple public health measure for South Asians would be to marry outside of their jati. The social or genetic distance need not be great. But one generation of outbreeding should “mask” many of the deleterious alleles. If this model is correct one should be able to track decreases in morbidity within the American South Asian population, where there are many inter-caste and inter-regional marriages (yes, this is between people of putative high status, but this doesn’t matter).
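The "masking" logic can be sketched with basic Hardy-Weinberg arithmetic. The allele frequencies below are hypothetical, chosen only to show the shape of the effect: a recessive allele that has drifted to a high frequency in one group, but is rare in another, produces far fewer affected children in between-group marriages.

```python
def affected_fraction(q_group_one, q_group_two):
    """Expected fraction of children homozygous for a recessive allele.

    Assumes random mating between the two groups, with one allele copy
    drawn independently from each parent's group.
    """
    return q_group_one * q_group_two


# Hypothetical frequencies: a founder allele drifted up to 5% in one jati
# while sitting at 0.1% in a neighbouring one.
q_a, q_b = 0.05, 0.001

within = affected_fraction(q_a, q_a)
between = affected_fraction(q_a, q_b)
print(f"within-group marriages:  {within:.6f}")
print(f"between-group marriages: {between:.6f}")
print(f"fold reduction: {within / between:.0f}x")
```

The reduction holds for any pair of groups whose drifted alleles differ, which is why the social or genetic distance between them need not be large.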

Neurodiversity and genetic diversity

By Razib Khan | January 24, 2013 4:17 am

In the links below I alluded to a controversy over the “Neurodiversity movement”. The basic issue is that people with Asperger syndrome and high functioning autism are being accused of putting their concerns above and beyond those of the large number of mentally disabled autistic individuals (some of whom are non-verbal, and exhibit severe cognitive deficits) in the grab for “rights.” Rights here understood as the rights which black Americans, women, and gays have claimed, to be recognized as equal before the law and endowed with the same value in the eyes of society. As a deep philosophical matter I’m skeptical of Rights in a fundamental sense. As a conservative I’m skeptical of the push for a huge array of rights by a plethora of identity groups. Socially recognized rights are valuable, and are cheapened and debased by dispensing them too liberally.

The dam of ancient DNA starts to break

By Razib Khan | January 21, 2013 5:37 pm

Over the past decade or so much of the reconstruction of the human genetic past has occurred through inferences generated from variation of extant human beings. In more plain English the patterns of genetic variation of modern populations have been used to map out the patterns of the past. There are serious difficulties with these sorts of inferences. For example you generate a huge number of potential phylogenetic trees and zero in on the “most probable tree” (or, the distribution of trees). But at the end of the day these inferences are only as good as your assumptions.

CATEGORIZED UNDER: Anthropology, Genetics, Human Genetics
MORE ABOUT: Human Genetics

Sinister old blue eyes

By Razib Khan | January 10, 2013 2:01 am

Over at Scientific American Christie Wilcox has a post up with the provocative title, People With Brown Eyes Appear More Trustworthy, But That’s Not The Whole Story, which reports on a new PLoS ONE paper, Trustworthy-Looking Face Meets Brown Eyes. Like Christie I would enjoy illustrating this post with my own trustworthy and youthful brown eyed visage, but I worry that my mien is a bit on the sly side! In any case, what of the paper? Wilcox reviews the salient points of the results. In short, the issue here is that brown eyed men seem to have more ‘trustworthy faces’ than blue eyed men. When the eyes were digitally manipulated it turned out that color had no influence on perception. Rather, it was the correlation between eye color and facial proportion which was driving the initial association. Christie finishes:

Given the importance of trust in human interactions, from friendships to business partnerships or even romance, these findings pose some interesting evolutionary questions. Why would certain face shapes seem more dangerous? Why would blue-eyed face shapes persist, even when they are not deemed as trustworthy? Are our behaviors linked to our bodies in ways we have yet to understand? There are no easy answers. Face shape and other morphological traits are partially based in genetics, but also partially to environmental factors like hormone levels in the womb during development. In seeking to understand how we perceive trust, we can learn more about the interplay between physiology and behavior as well as our own evolutionary history.

CATEGORIZED UNDER: Genetics, Genomics, Select, Uncategorized

Why the future won’t be genetically homogeneous

By Razib Khan | January 5, 2013 10:52 pm

While reading The Founders of Evolutionary Genetics I encountered a chapter where the late James F. Crow admitted that he had a new insight every time he reread R. A. Fisher’s The Genetical Theory of Natural Selection. This prompted me to put down The Founders of Evolutionary Genetics after finishing Crow’s chapter and pick up my copy of The Genetical Theory of Natural Selection. I’ve read it before, but this is as good a time as any to give it another crack.

Almost immediately Fisher aims at one of the major conundrums of 19th century theory of Darwinian evolution: how was variation maintained? The logic and conclusions strike you like a hammer. Charles Darwin and most of his contemporaries held to a blending model of inheritance, where offspring reflect a synthesis of their parental values. As it happens this aligns well with human intuition. Across their traits offspring are a synthesis of their parents. But blending presents a major problem for Darwin’s theory of adaptation via natural selection, because it erodes the variation which is the raw material upon which selection must act. It is a famously peculiar fact that the abstraction of the gene was formulated over 50 years before the concrete physical embodiment of the gene, DNA, was ascertained with any confidence. In the first chapter of The Genetical Theory R. A. Fisher suggests that the logical reality of persistent copious heritable variation all around us should have forced scholars to the inference that inheritance proceeded via particulate and discrete means, as these processes do not diminish variation indefinitely in the manner which is entailed by blending.
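Fisher's point about blending can be put as a small calculation. This is a minimal sketch under textbook assumptions (random mating, an effectively infinite population, no mutation or selection), not a reproduction of anything in the book's own notation.

```python
def variance_under_blending(initial_variance, generations):
    """Trait variance after random mating with pure blending inheritance.

    Each child is the average of two unrelated parents, so
    Var(child) = (V + V) / 4 = V / 2 in every generation.
    """
    return initial_variance / (2 ** generations)


def variance_under_particulate(initial_variance, generations):
    """Genetic variance with discrete alleles under the same assumptions.

    Allele frequencies stay constant (Hardy-Weinberg), so the variance
    they generate does not decay with time.
    """
    return initial_variance


for t in (0, 1, 5, 10):
    blended = variance_under_blending(1.0, t)
    particulate = variance_under_particulate(1.0, t)
    print(f"generation {t:2d}: blending {blended:.4f}, particulate {particulate:.4f}")
```

After ten generations blending leaves under a thousandth of the starting variance, so selection would need an implausible stream of fresh variation to have anything left to act on.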

Buddy, can you spare some ascertainment?

By Razib Khan | December 18, 2012 12:06 pm

The above map shows the population coverage for the Geno 2.0 SNP-chip, put out by the Genographic Project. Their paper outlining the utility and rationale of the chip is now out on arXiv. I saw this map last summer, when Spencer Wells hosted a webinar on the launch of Geno 2.0, and it was the aspect which really jumped out at me. The number of markers that they have on this chip is modest, only >100,000 on the autosome, with a few tens of thousands more on the X, Y, and mtDNA. In contrast, the Axiom® Genome-Wide Human Origins 1 Array Plate being used by Patterson et al. has ~600,000 SNPs. But as is clear from the map above Geno 2.0 is ascertained in many more populations than the other comparable chips (Human Origins 1 Array uses 12 populations). It’s obvious that if you are only catching variation in a few populations, all the extra million markers may not give you much bang for the buck (not to mention the biases that that may introduce in your population genetic and phylogenetic inferences).

We are Nature

By Razib Khan | December 13, 2012 10:03 am

There’s an interesting piece in Slate, The Great Schism in the Environmental Movement, which seems to be a distillation of trends which have been bubbling within the modern environmentalist movement for a generation now (I’ve read earlier manifestos in a similar vein). I can’t assess the magnitude of the shift, but here’s the top-line:

But that is a false construct that scientists and scholars have been demolishing the past few decades. Besides, there’s a growing scientific consensus that the contemporary human footprint—our cities, suburban sprawl, dams, agriculture, greenhouse gases, etc.—has so massively transformed the planet as to usher in a new geological epoch. It’s called the Anthropocene.

Modernist greens don’t dispute the ecological tumult associated with the Anthropocene. But this is the world as it is, they say, so we might as well reconcile the needs of people with the needs of nature. To this end, Kareiva advises conservationists to craft “a new vision of a planet in which nature—forests, wetlands, diverse species, and other ancient ecosystems—exists amid a wide variety of modern, human landscapes.”

Don’t wait to have children!

By Razib Khan | December 12, 2012 9:34 am

The New Republic has a piece up, How Older Parenthood Will Upend American Society, which won’t have surprising data for readers of this weblog. But it’s nice to see this sort of thing go “mainstream.” My daughter was born when her parents were in their mid-30s, so I know all the statistics. They aren’t good bed-time reading (she’s healthy and robust so far!). If I had to do it over again I definitely wouldn’t have waited this long. After becoming a father it brought home to me that waiting was one of the worst decisions of my life. Why postpone something this incredible for the far more prosaic pleasures of an extended adolescence? Granted, I’m not sure that I would have been the best father at 25, but I don’t think there’s much I can say in reply to the argument that I should have become a father by 30.

More concretely, we would have had sperm and egg “banked” if we had been smart about delaying parenthood. The article notes that storage of sperm costs $850 up front, and $300 to $500 per year after that, and that many balk at the cost. And how much do you spend on your cell phone every year? The issue here seems to be time preference.

We don’t know why Ethiopians breathe easy

By Razib Khan | December 11, 2012 11:40 pm

Most people are aware that altitude imposes constraints on individual performance and function. Much of this is flexible; athletes who train at high altitudes may gain a performance edge. But over the long term there are costs, just as there are with computers which are ‘overclocked.’ This is the point where you make the transition from physiology to evolution. Residence at high altitude entails strong selective pressures on populations. Over the past few years there has been a great deal of exploration of the genetics of long resident high altitude groups, the Tibetans, Peruvians, and Ethiopians.

The origins of the Romani determined definitively

By Razib Khan | December 9, 2012 1:52 pm

In many cases there are questions of a historical and ethnographic nature which are subject to controversy and debate. Scholarly arguments are laid out, and further dispute ensues. For decades progress seems fleeting, as one hypothesis is accepted, only to be subject to later revision. This sort of pattern gives succor to the most cynical and jaded of the ‘Post Modern’ set, especially when the ‘discourse’ in question is in the domain of science.

But thankfully these debates can come to an end in some cases. So it is with the origins of the European Romani, better known as ‘Gypsies’ (though the Roma are the most well known of the Romani, other groups within Europe have different ethnonyms). Obviously many of the basic elements have long been there, but I think the most recent genetic work now establishes a level of closure. Taking a step back, what do we know?

1) The Romani language seems to be Indo-Aryan, with a likely affinity with the northwest group of Indo-Aryan languages

2) The Romani presence in Europe only dates to the past ~1,000 years, with an entry point in the Byzantine Empire

3) They are an admixture between an ancestral Indian element, and local populations

4) Their history of endogamy has resulted in a strong genetic drift effect

The two papers which seem to nail the coffin shut on these questions use somewhat different methodologies. One relies on Y chromosomal STRs (hypervariable repeat regions) to generate a paternal phylogeny. Focusing just on the paternal phylogeny allows for one to make very robust genealogical inferences. Additionally, the authors had a very large data set across India. Their goal was to ascertain the exact region of origin of the Romani before they left India. As noted in bullet #1 there is already some evidence from their language that this must be in northwest India. The second paper uses a SNP-chip; hundreds of thousands of autosomal markers. This has been done to death for other populations, so the method isn’t new. Rather, it is that it is now being applied to the Romani.

First, the Y chromosomal paper. The Phylogeography of Y-Chromosome Haplogroup H1a1a-M82 Reveals the Likely Indian Origin of the European Romani Populations:

Linguistic and genetic studies on Roma populations inhabited in Europe have unequivocally traced these populations to the Indian subcontinent. However, the exact parental population group and time of the out-of-India dispersal have remained disputed. In the absence of archaeological records and with only scanty historical documentation of the Roma, comparative linguistic studies were the first to identify their Indian origin. Recently, molecular studies on the basis of disease-causing mutations and haploid DNA markers (i.e. mtDNA and Y-chromosome) supported the linguistic view. The presence of Indian-specific Y-chromosome haplogroup H1a1a-M82 and mtDNA haplogroups M5a1, M18 and M35b among Roma has corroborated that their South Asian origins and later admixture with Near Eastern and European populations. However, previous studies have left unanswered questions about the exact parental population groups in South Asia. Here we present a detailed phylogeographical study of Y-chromosomal haplogroup H1a1a-M82 in a data set of more than 10,000 global samples to discern a more precise ancestral source of European Romani populations. The phylogeographical patterns and diversity estimates indicate an early origin of this haplogroup in the Indian subcontinent and its further expansion to other regions. Tellingly, the short tandem repeat (STR) based network of H1a1a-M82 lineages displayed the closest connection of Romani haplotypes with the traditional scheduled caste and scheduled tribe population groups of northwestern India.

 

Two trees illustrate the results succinctly:

The bottom line:

- This particular Y chromosomal lineage which is highly diagnostic of South Asian origin in the Romani shows that the Romani seem to derive from the populations of northwest India

- Additionally, within these populations the Romani Y chromosomal lineages derive from the lower caste elements, the scheduled castes and scheduled tribes

But the above results don’t get directly at genome-wide admixture. The second paper does, using hundreds of thousands of markers to explore the Romani affinity to other populations. Reconstructing the Population History of European Romani from Genome-wide Data:

The Romani, the largest European minority group with approximately 11 million people…constitute a mosaic of languages, religions, and lifestyles while sharing a distinct social heritage. Linguistic…and genetic…studies have located the Romani origins in the Indian subcontinent. However, a genome-wide perspective on Romani origins and population substructure, as well as a detailed reconstruction of their demographic history, has yet to be provided. Our analyses based on genome-wide data from 13 Romani groups collected across Europe suggest that the Romani diaspora constitutes a single initial founder population that originated in north/northwestern India ∼1.5 thousand years ago (kya). Our results further indicate that after a rapid migration with moderate gene flow from the Near or Middle East, the European spread of the Romani people was via the Balkans starting ∼0.9 kya. The strong population substructure and high levels of homozygosity we found in the European Romani are in line with genetic isolation as well as differential gene flow in time and space with non-Romani Europeans. Overall, our genome-wide study sheds new light on the origins and demographic history of European Romani.

The plot to the left illustrates the relationship of the Romani to world-wide populations using multi-dimensional scaling, where genetic variation is decomposed into dimensions, and individuals are plotted on those dimensions. In short, the Romani exhibit a classic admixture cline pattern. That is, they are the products of a two-way admixture between populations which occupy distinct positions along a cline, and Romani individuals and populations are distributed along the cline in proportion to their admixture. One notable aspect is that the Romani are actually two clusters; one which manifests a strong ‘east’-‘west’ distribution, and another which seems located purely within the European cluster. The latter seems to be the Welsh Romani, who in the neighbor-joining tree (see the supplements) fall on the same branch as European populations, as opposed to the other Romani, who form their own clade.
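The idea that individuals sit along the cline "in proportion to their admixture" can be illustrated with a crude projection. This is a sketch only: the coordinates are invented, the function name is hypothetical, and real MDS positions are distorted by drift, so this is no substitute for a model-based estimate.

```python
def admixture_fraction(individual, source_a, source_b):
    """Rough share of ancestry from source_a.

    Projects the individual's coordinates onto the line joining the two
    source-population centroids and reports how far along it they fall
    (0 = at source_b, 1 = at source_a), clipped to that range.
    """
    axis = [a - b for a, b in zip(source_a, source_b)]
    offset = [x - b for x, b in zip(individual, source_b)]
    axis_len2 = sum(v * v for v in axis)
    if axis_len2 == 0.0:
        raise ValueError("source centroids coincide")
    t = sum(o * v for o, v in zip(offset, axis)) / axis_len2
    return min(1.0, max(0.0, t))


# Invented centroids on two MDS dimensions.
european = (0.02, 0.01)
south_asian = (-0.04, 0.03)

# An invented individual lying 40% of the way from European to South Asian.
person = (-0.004, 0.018)
print(f"{admixture_fraction(person, south_asian, european):.2f}")
```

An individual with recent additional European ancestry would simply fall closer to the European centroid on the same line, which is what a cline looks like on the plot.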

To drill down further you need to ascertain admixture with a model-based clustering algorithm. Ergo, ADMIXTURE. I’ve reedited the figure to illustrate the salient points. In particular, it is clear that the Roma populations except the Welsh have significant South Asian ancestry. The question is how much? To answer this question you need to know the source population in South Asia. A peculiar aspect of this plot is that the Romani have very little of the green ancestral component, which happens to be modal in the Middle East (not shown). This element happens to be highly enriched in many Pakistani populations, but not necessarily northwest Indian ones. Nevertheless, the issue that leaves me suspicious of this particular finding is that many of the European populations, in particular those groups (e.g., Balkans) which may have admixed with the Romani, have this element to an extent not evident in one of their presumed ‘daughter’ populations. I wonder if perhaps the peculiarities of Romani inbreeding have skewed the allele frequency distribution so much that you get strangeness like this. I am not showing higher K’s because those break out with a Romani-cluster. Just like the Kalash-cluster this is to a great extent a feature of the long term endogamy of these communities. With high levels of drift the allele frequency of these groups moves into a very peculiar space in relation to their parental populations, but one must not become confused and assume that the Romani or Kalash are themselves appropriate independent clusters in the same way that Europeans or East Asians are.

Using various forms of admixture analysis the authors seem to conclude that the Balkan Romani are 30-50% South Asian. This seems in line with intuition. But that still leaves open the question of who those South Asians were. As I noted above the most thorough Y chromosomal data point to the lower caste elements of northwest India. What do the autosomes say?

I don’t want to get into the technical details of how they tested the models, but it seems that one of the likely parental populations to the Romani had a close relationship to the Meghwal, a scheduled caste from northwest India. In other words, the autosome results align very well with the Y chromosomal inferences. Additionally, the models tested imply that the Romani likely left South Asia ~1,000 years before the present, which aligns well with what is known from the historical record (though this is a case where I put much more stock in the historical record than inferences from population genetic models; look at the intervals).

Finally, there is the question of inbreeding. One aspect of the Romani genome that jumps out at you is that they have many long “runs-of-homozygosity” (ROH). This is totally expected, as decades of uniparental analyses suggested a great many population bottleneck events as the Romani spread throughout Europe. But the ROH patterns also unearth an interesting fact: some of the Balkan Romani clearly have recent European admixture, while the non-Balkan Romani had an initial period of admixture followed by endogamy. The latter scenario seems to resemble Ashkenazi Jews, while the former would suggest that the boundary between Romani and non-Romani in the Balkans is more fluid than is sometimes portrayed.
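For anyone unfamiliar with what an ROH scan does, here is a deliberately naive sketch. The genotype coding, thresholds, and function name are assumptions made for illustration; production tools such as PLINK's homozygosity scan also tolerate missing calls and the occasional heterozygous site, which this does not.

```python
def runs_of_homozygosity(genotypes, positions, min_snps=50, min_length_bp=1_000_000):
    """Return (start_bp, end_bp, n_snps) for each long homozygous run.

    genotypes: per-SNP codes on one chromosome, 0 or 2 for homozygous
               and 1 for heterozygous.
    positions: base-pair coordinates of the same SNPs, ascending.
    """
    runs = []
    start = None
    # A trailing heterozygous sentinel closes any run still open at the end.
    for i, g in enumerate(list(genotypes) + [1]):
        if g != 1 and start is None:
            start = i
        elif g == 1 and start is not None:
            end = i - 1
            n_snps = end - start + 1
            length = positions[end] - positions[start]
            if n_snps >= min_snps and length >= min_length_bp:
                runs.append((positions[start], positions[end], n_snps))
            start = None
    return runs


# Tiny toy chromosome with thresholds scaled down to match.
toy_genotypes = [0, 2, 2, 1, 0, 0, 0, 0, 1, 2]
toy_positions = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000]
print(runs_of_homozygosity(toy_genotypes, toy_positions, min_snps=3, min_length_bp=200))
```

The lengths of the runs carry the historical signal: recent inbreeding leaves a few very long runs, while old bottlenecks leave many shorter ones that recombination has had time to break up.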

So there we have it. The Romani derive from lower caste populations from the northwest Indian subcontinent who seem to have left ~1,000 years ago. Over time they admixed with local populations, and are now 50-70% non-South Asian, with some groups being ~90% European (e.g., Welsh Romani). And, they have a long history as an endogamous group, judging by their inbreeding.

Layering genetic histories

By Razib Khan | December 2, 2012 2:14 pm

As a follow up to my post from yesterday, I decided to run TreeMix on a data set I happened to have had on hand (see Inference of Population Splits and Mixtures from Genome-Wide Allele Frequency Data for more on TreeMix). Basically I wanted to display a tree with, and without, gene flow.

The technical details are straightforward. I LD pruned ~550,000 SNPs down to ~150,000. I ran TreeMix without and with migration parameters, with the Bantu Kenya population as the root. Finally, when I did turn on the migration parameter I set it to 5. You can see the results below.

Most of the flows are pretty much as expected. The West Eurasian flow from the Turks to the Uygurs makes sense, because there is a large West Asian component to what the Uygurs have (from East Iranians?). The Chuvash are a Turkic group with a minor, but significant, Turkic component. The HGDP Russian sample does have some East Eurasian ancestry. And the Moroccans also have African ancestry. But your guess is as good as mine with the Bantu flow in. These are, I think, from Kenya, so it might be trying to interpret Nilotic admixture as generalized Eurasian.

A minor note: installing TreeMix and generating the appropriate files from pedigree format is not too difficult. But you might have confusion in how to generate the pedigree input file. You do it like so in PLINK:

./plink --noweb --bfile YourFile --freq --within YourGroupNamesFile --out YourOutPutFile

It’s the last you want to put into TreeMix’s python conversion script. The YourGroupNamesFile is basically the .fam file with an extra column, the population names for each individual.
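If you need to build the group file from scratch, here is a minimal sketch. It assumes the three-column layout that PLINK's documentation describes for --within (family ID, individual ID, cluster name); the function name, the file names, and the `population_of` lookup are hypothetical, and an individual missing from the lookup will raise a KeyError rather than be silently skipped.

```python
def write_within_file(fam_path, population_of, out_path):
    """Write one 'FID IID POPULATION' line per individual in a .fam file.

    fam_path:      path to the PLINK .fam file (FID and IID are columns 1 and 2).
    population_of: dict mapping individual ID to a population label.
    out_path:      where to write the cluster file for --within.
    """
    with open(fam_path) as fam, open(out_path, "w") as out:
        for line in fam:
            fields = line.split()
            if not fields:
                continue  # skip blank lines
            fid, iid = fields[0], fields[1]
            out.write(f"{fid} {iid} {population_of[iid]}\n")


# Example call, with hypothetical file names and labels:
# write_within_file("YourFile.fam", {"HGDP00001": "Brahui"}, "YourGroupNamesFile")
```

With --freq and --within, PLINK should write the per-population frequencies to a file ending in .frq.strat under the --out prefix; check the TreeMix manual for whether the conversion script expects that file gzipped.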

Africa’s hidden people hold the keys to the past

By Razib Khan | December 2, 2012 12:42 am

I mentioned this in passing in my post on ASHG 2012, but it seems useful to make explicit. For the past few years there has been word of research pointing to connections between the Khoisan and the Cushitic people of Ethiopia. To a great extent in the paper which is forthcoming there is the likely answer to the question of who lived in East Africa before the Bantu, and before the most recent back-migration of West Eurasians. On one level I’m confused as to why this has to be something of a mystery, because the most recent genetic evidence suggests an admixture on the order of 2,000-3,000 years before the present.* If the admixture was so recent we should find many of the “first people,” no? As it is, we don’t. I think these groups, and perhaps the Sandawe, are the closest we’ll get.

Publication is imminent at this point (of this, I was assured), so I’m going to just state the likely candidate population (or at least one of them): the Sanye, who speak a Cushitic language with possible Khoisan influences. There really isn’t that much information on these people, which is why when I first heard about the preliminary results a few years back and looked around for Khoisan-like populations in Kenya I wasn’t sure I’d hit upon the right group. But at ASHG I saw some STRUCTURE plots with the correct populations, and the Sanye were one of them. I would have liked to see something like TreeMix, but the STRUCTURE results were of a quality that I could accept that these populations were not being well modeled by the variation which dominated their data set. Though Cushitic in language, the Sanye had far less of the West Eurasian element present among other Cushitic-speaking populations of the Horn of Africa. Neither were their African ancestral components quite like that of the Nilotic or Bantu populations. The clustering algorithm was having a “hard time” making sense of them (it seemed to want to model them as linear combinations of more familiar groups, but was doing a bad job of it).

Here is an interesting article on these groups: Little known tribe that census forgot. Like the Sandawe this is a population which seems to have been hunter-gatherers very recently, and to some extent still engages in this lifestyle. In this way I think they are fundamentally different from Indian tribal populations, who are often held up to be the “first people” of the subcontinent. More and more it seems that the tribes of India are less the descendants of the original inhabitants of the subcontinent, at least when compared to the typical Indian peasant, and more simply those segments of the Indian population which were marginalized and pushed into less productive territory. Over time they naturally diverged culturally because of their isolation, but the difference was not primal. In contrast, groups like the Sanye and Sandawe may have mixed to a great extent with their neighbors (and lost their language like the Pygmies), but evidence of full-featured hunting and gathering lifestyles implies a sort of direct cultural continuity with the landscape of eastern Africa before the arrival of farmers and pastoralists from the west and north.

* I understand some readers refuse to accept the likelihood of these results because of other lines of information. I am just relaying the results of the geneticists. I am not interested in re-litigating prior discussions on this. We’ll probably have a resolution soon enough.
