Tag: Human Genomics

Layering genetic histories

By Razib Khan | December 2, 2012 2:14 pm

As a follow up to my post from yesterday, I decided to run TreeMix on a data set I happened to have had on hand (see Inference of Population Splits and Mixtures from Genome-Wide Allele Frequency Data for more on TreeMix). Basically I wanted to display a tree with, and without, gene flow.

The technical details are straightforward. I LD pruned ~550,000 SNPs down to ~150,000. I ran TreeMix without and with migration parameters with the Bantu Kenya population being the root. Finally, when I did turn on the migration parameter I set it for 5. You can see the results below.

Most of the flows are pretty expected. The West Eurasian flow from the Turks to the Uygurs makes sense, because there is a large West Asian component to what the Uygurs have (from East Iranians?). The Chuvash are a Turkic group with a minor, but significant, Turkic component. The HGDP Russian sample does have some East Eurasian ancestry. And the Moroccans also have African ancestry. But your guess is as good as mine with the flow into the Bantu. These are, I think, Kenyan samples, so it might be trying to interpret Nilotic admixture as generalized Eurasian.

A minor note: installing TreeMix and generating the appropriate files from pedigree format is not too difficult. But you might be confused about how to generate the pedigree input file. You do it like so in PLINK:

./plink --noweb --bfile YourFile --freq --within YourGroupNamesFile --out YourOutPutFile

It’s the last of these, the output file, that you want to feed into TreeMix’s Python conversion script. The YourGroupNamesFile is basically the .fam file with an extra column, the population names for each individual.
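
If you would rather generate the group file than edit it by hand, here is a minimal Python sketch. It assumes you already have the rows of your .fam file and a separate lookup from individual ID to population label; the function name, the "UNKNOWN" fallback, and the sample IDs are all made up for illustration. It writes the three-column layout (family ID, individual ID, cluster name) that PLINK documents for --within, so check it against your own PLINK version before relying on it.

```python
def build_within_lines(fam_lines, pop_by_iid):
    """Return cluster-file rows of the form 'FID IID POP' for each .fam row."""
    rows = []
    for line in fam_lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed rows
        fid, iid = fields[0], fields[1]
        # Hypothetical fallback label for individuals missing from the lookup.
        pop = pop_by_iid.get(iid, "UNKNOWN")
        rows.append(f"{fid} {iid} {pop}")
    return rows


if __name__ == "__main__":
    # Made-up IDs and labels, not real samples.
    fam = ["FAM1 IND1 0 0 1 -9", "FAM2 IND2 0 0 2 -9"]
    pops = {"IND1": "PopA", "IND2": "PopB"}
    for row in build_within_lines(fam, pops):
        print(row)
```

Write the returned rows to a text file and pass that file's name as YourGroupNamesFile.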

Africa’s hidden people hold the keys to the past

By Razib Khan | December 2, 2012 12:42 am

I mentioned this in passing on my post on ASHG 2012, but it seems useful to make explicit. For the past few years there has been word of research pointing to connections between the Khoisan and the Cushitic people of Ethiopia. The forthcoming paper likely goes a long way toward answering the question of who lived in East Africa before the Bantu, and before the most recent back-migration of West Eurasians. On one level I’m confused as to why this has to be something of a mystery, because the most recent genetic evidence suggests an admixture on the order of 2-3,000 years before the present.* If the admixture was so recent we should find many of the “first people,” no? As it is, we don’t. I think these groups, and perhaps the Sandawe, are the closest we’ll get.

Publication is imminent at this point (of this, I was assured), so I’m going to just state the likely candidate population (or at least one of them): the Sanye, who speak a Cushitic language with possible Khoisan influences. There really isn’t that much information on these people, which is why when I first heard about the preliminary results a few years back and looked around for Khoisan-like populations in Kenya I wasn’t sure I’d hit upon the right group. But at ASHG I saw some STRUCTURE plots with the correct populations, and the Sanye were one of them. I would have liked to see something like TreeMix, but the STRUCTURE results were of a quality that I could accept that these populations were not being well modeled by the variation which dominated their data set. Though Cushitic in language the Sanye had far less of the West Eurasian element present among other Cushitic-speaking populations of the Horn of Africa. Neither were their African ancestral components quite like those of the Nilotic or Bantu populations. The clustering algorithm was having a “hard time” making sense of them (it seemed to want to model them as linear combinations of more familiar groups, but was doing a bad job of it).

Here is an interesting article on these groups: Little known tribe that census forgot. Like the Sandawe this is a population which seems to have been hunter-gatherers very recently, and to some extent still engage in this lifestyle. In this way I think they are fundamentally different from Indian tribal populations, who are often held up to be the “first people” of the subcontinent. More and more it seems that the tribes of India are less the descendants of the original inhabitants of the subcontinent, at least when compared to the typical Indian peasant, and more simply those segments of the Indian population which were marginalized and pushed into less productive territory. Over time they naturally diverged culturally because of their isolation, but the difference was not primal. In contrast, groups like the Sanye and Sandawe may have mixed to a great extent with their neighbors (and lost their language like the Pygmies), but evidence of full featured hunting & gathering lifestyles implies a sort of direct cultural continuity with the landscape of eastern Africa before the arrival of farmers and pastoralists from the west and north.

* I understand some readers refuse to accept the likelihood of these results because of other lines of information. I am just relaying the results of the geneticists. I am not interested in re-litigating prior discussions on this. We’ll probably have a resolution soon enough.

Northern Europeans and Native Americans are not more closely related than previously thought

By Razib Khan | December 1, 2012 3:38 pm

A new press release is circulating on the paper which I blogged about a few months ago, Ancient Admixture in Human History. Unlike the paper, the title of the press release is misleading, and unfortunately I notice that people are circulating it, and probably misunderstanding what is going on. Here’s the title and first paragraph:

Native Americans and Northern Europeans More Closely Related Than Previously Thought

Released: 11/30/2012 2:00 PM EST
Source: Genetics Society of America

Newswise — BETHESDA, MD – November 30, 2012 — Using genetic analyses, scientists have discovered that Northern European populations—including British, Scandinavians, French, and some Eastern Europeans—descend from a mixture of two very different ancestral populations, and one of these populations is related to Native Americans. This discovery helps fill gaps in scientific understanding of both Native American and Northern European ancestry, while providing an explanation for some genetic similarities among what would otherwise seem to be very divergent groups. This research was published in the November 2012 issue of the Genetics Society of America’s journal GENETICS


The reality is that Native Americans and Northern Europeans are not more “closely related” genetically than they were before this paper. There has been no great change to standard genetic distance measures or phylogeographic understanding of human genetic variation. A measure of relatedness is to a great extent a summary of historical and genealogical processes, and as such it collapses a great deal of disparate elements together into one description. What the paper in Genetics outlined was the excavation of specific historically contingent processes which result in the summaries of relatedness which we are presented with, whether they be principal component analysis, Fst, or model-based clustering.

What I’m getting at can be easily illustrated by a concrete example. To the left is a 23andMe chromosome 1 “ancestry painting” of two individuals. On the left is me, and the right is a friend. The orange represents “Asian ancestry,” and the blue represents “European” ancestry. We are both ~50% of both ancestral components. This is a correct summary of our ancestry, as far as it goes. But you need some more information. My friend has a Chinese father and a European mother. In contrast, I am South Asian, and the end product of an ancient admixture event. You can’t tell that from a simple recitation of ancestral quanta. But it is clear when you look at the distribution of ancestry on the chromosomes. My components have been mixed and matched by recombination, because there have been many generations between the original admixture and myself. In contrast, my friend has not had any recombination events between his ancestral components, because he is the first generation of that combination.
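
To put numbers on that contrast, here is a minimal Python sketch using toy, made-up segment data (not real 23andMe output, and the function name is my own). Each person is two copies of a chromosome, and each copy is a list of (ancestry label, segment length) pairs in arbitrary units. The summary proportions come out identical for both people, while the count of ancestry switches along the chromosome differs.

```python
def summarize_painting(homologs):
    """Return (ancestry fractions, number of ancestry switches) for one person.

    homologs: list of chromosome copies, each a list of (label, length) pairs.
    """
    totals = {}
    switches = 0
    for segments in homologs:
        for label, length in segments:
            totals[label] = totals.get(label, 0) + length
        # A switch is any boundary where adjacent segments differ in label.
        switches += sum(
            1 for (a, _), (b, _) in zip(segments, segments[1:]) if a != b
        )
    grand_total = sum(totals.values())
    fractions = {label: amount / grand_total for label, amount in totals.items()}
    return fractions, switches


# Toy data: a first-generation individual inherits one unbroken copy per parent.
first_generation = [[("Asian", 100)], [("European", 100)]]

# Toy data: an old admixture, chopped up by many generations of recombination.
ancient_admixture = [
    [("Asian", 10), ("European", 15), ("Asian", 20),
     ("European", 5), ("Asian", 20), ("European", 30)],
    [("European", 25), ("Asian", 30), ("European", 10),
     ("Asian", 20), ("European", 15)],
]

if __name__ == "__main__":
    print(summarize_painting(first_generation))
    print(summarize_painting(ancient_admixture))
```

The proportions alone cannot separate the two cases; the switch count (or equivalently the segment lengths) is what carries the history.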

So what the paper publicized in the press release does is present methods to reconstruct exactly how patterns of relatedness came to be, rather than reiterating well understood patterns of relatedness. With the rise of whole-genome sequencing and more powerful computational resources to reconstruct genealogies we’ll be seeing much more of this to come in the future, so it is important that people are not misled as to the details of the implications.

Most mtDNA lineages expanded before the Neolithic?

By Razib Khan | October 18, 2012 9:00 pm

A new short communication in Scientific Reports suggests that most demographic expansion as ascertained using mtDNA occurred before the Neolithic. MtDNA analysis of global populations support that major population expansions began before Neolithic Time:

Agriculture resulted in extensive population growths and human activities. However, whether major human expansions started after Neolithic Time still remained controversial. With the benefit of 1000 Genome Project, we were able to analyze a total of 910 samples from 11 populations in Africa, Europe and Americas. From these random samples, we identified the expansion lineages and reconstructed the historical demographic variations. In all the three continents, we found that most major lineage expansions (11 out of 15 star lineages in Africa, all autochthonous lineages in Europe and America) coalesced before the first appearance of agriculture. Furthermore, major population expansions were estimated after Last Glacial Maximum but before Neolithic Time, also corresponding to the result of major lineage expansions. Considering results in current and previous study, global mtDNA evidence showed that rising temperature after Last Glacial Maximum offered amiable environments and might be the most important factor for prehistorical human expansions.

The Bushmen tell us a lot about human evolution because they are humans who have evolved

By Razib Khan | September 21, 2012 12:32 am

When it comes to the human genetics of the Khoe-San there’s a little that’s stale and unoriginal for me in terms of presentation. The elements are always composed the same. The Bushmen are the “most ancient” humans, who can tell us something about “our past,” about “our evolution.” Tried & tested banalities just bubble forth unbidden. I have no idea why. There’s a new paper in Science on the genetics of the Khoe-San, which includes Bushmen, which brought to mind this issue for me because of the outrageous nature of the press releases.

The title of the paper itself is a testament to vanilla, Genomic Variation in Seven Khoe-San Groups Reveals Adaptation and Complex African History. This is absolutely not surprising. Are you shocked that the Khoe-San have adaptations? Or that African history is complex? The wonder of it all! This paper actually revisits much of the same ground as Pickrell et al.’s originally titled The genetic prehistory of southern Africa. Before Dr. Pickrell executes throw-down on me on Twitter let me concede that I have no creative ideas to offer in terms of an alternative title. Rather, I have an idea: perhaps in the future scientists could explore the evolutionary genetic basis for steatopygia? The trait is not limited just to Khoe-San, my distant cousins the Andaman Islanders also exhibit it. Perhaps this is the ancestral state of the human lineage? This is a situation where the titles just write themselves!

Across the sea of grass: how Northern Europeans got to be ~10% Northeast Asian

By Razib Khan | September 7, 2012 12:11 pm

The Pith: You’re Asian. Yes, you!

The conclusion of an important paper by Nick Patterson, Priya Moorjani, Yontao Luo, Swapan Mallick, Nadin Rohland, Yiping Zhan, Teri Genschoreck, Teresa Webster, and David Reich:

In particular, we have presented evidence suggesting that the genetic history of Europe from around 5000 B.C. includes:

1. The arrival of Neolithic farmers probably from the Middle East.

2. Nearly complete replacement of the indigenous Mesolithic southern European populations by Neolithic migrants, and admixture between the Neolithic farmers and the indigenous Europeans in the north.

3. Substantial population movement into Spain occurring around the same time as the archaeologically attested Bell-Beaker phenomenon (HARRISON, 1980).

4. Subsequent mating between peoples of neighboring regions, resulting in isolation-by-distance (LAO et al., 2008; NOVEMBRE et al., 2008). This tended to smooth out population structure that existed 4,000 years ago.

Further, the populations of Sardinia and the Basque country today have been substantially less influenced by these events.


It’s in Genetics, Ancient Admixture in Human History. Reading through it I can see why it wasn’t published in Nature or Science: methods are of the essence. The authors review five population genetic statistics of phylogenetic and evolutionary genetic import, before moving on to the novel results. These statistics, which measure the possibility of admixture, the extent of admixture, and the date of admixture, are often presented, but nested into supplements, in previous papers by the same group. On the one hand this removes from view the engines which are driving the science. On the other hand I have always appreciated that a benefit of this injustice to the methods which make insight possible is that those without academic access can actually bite into the meat of the researcher’s mode of thought.

I did read through the methods. Twice. I’ve encountered all the statistics before, and I’ve read how they were generated, but I’ll be honest and admit that I haven’t internalized them. That has to end now, because the authors have finally released a software package which implements the statistics, ADMIXTOOLS. I plan to use it in the near future, and it is generally best if you understand the underlying mechanisms of a software package if you are at the bleeding edge of analytics. I will review the technical points in more detail in future posts, more for my own edification than yours. But for the moment I’ll be a bit more cursory. Four of the tests use comparisons of allele frequencies along explicit phylogenetic trees. That’s so general as to be uninformative as a description, but I think it’s accurate to the best of my knowledge. At their most basic the tests are seeing if a model fits the data (as opposed to TreeMix, which finds the best model out of a range to fit the data). The last method, rolloff, infers the timing of an admixture event based upon the decay of linkage disequilibrium. In short, admixture between two very distinct populations has the concrete result of producing striking genomic correlations. Over time these correlations dissipate due to recombination. The rate at which the correlations fall off with genetic distance allows one to gauge how long ago the original admixture occurred.
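
The dating logic can be sketched in a few lines. This is a toy illustration and not the ADMIXTOOLS implementation: it assumes the admixture correlation between markers separated by genetic distance d (in Morgans) decays as exp(-n d) after n generations, simulates a noise-free curve, and recovers n from the slope of a log-linear fit. The function names and the amplitude value are my own inventions for the example.

```python
import math


def simulate_decay(generations, distances, amplitude=0.2):
    """Noise-free admixture LD at each genetic distance (Morgans)."""
    return [amplitude * math.exp(-generations * d) for d in distances]


def fit_generations(distances, ld_values):
    """Estimate generations since admixture as minus the slope of log(LD) on distance."""
    logs = [math.log(v) for v in ld_values]
    count = len(distances)
    mean_x = sum(distances) / count
    mean_y = sum(logs) / count
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(distances, logs))
    sxx = sum((x - mean_x) ** 2 for x in distances)
    return -sxy / sxx


if __name__ == "__main__":
    distances = [0.005 * k for k in range(1, 21)]  # 0.5 cM to 10 cM
    curve = simulate_decay(100, distances)
    print(fit_generations(distances, curve))
```

Real data are noisy and the correlations have to be weighted by how differentiated the source populations are at each marker, which is where the actual method earns its keep.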

Not all genes are created the same

By Razib Khan | August 28, 2012 11:52 pm

The map to the right shows the frequencies of HGDP populations on SLC45A2, which is a locus that has been implicated in skin color variation in humans. It’s for the SNP rs16891982, and I yanked the figure from IrisPlex: A sensitive DNA tool for accurate prediction of blue and brown eye colour in the absence of ancestry information. Brown represents the genotype CC, green CG, and blue, GG. Europeans who have olive skin often carry the minor allele, C. While SLC24A5 is really bad at distinguishing West Eurasians from each other, SLC45A2 is better. Though both are fixed in Northern Europe, the former stays operationally fixed in frequency outside of Europe, in the Near East. As I stated earlier the proportions of the ancestral SNP in the Middle Eastern populations in the HGDP seem to be easily explained by the Sub-Saharan admixture you can find in these groups.

In contrast major SNPs in SLC45A2 are closer to disjoint between Europeans and South Asians. For example I’m a homozygote for the C allele. And yet even here we need to be careful. I want in particular to draw your attention to the frequencies in the Middle Eastern populations, the Sardinians, and the Kalash of Pakistan.

The Kalash, and their Nuristani cousins, have often been observed to have “European” physical features. These populations even trade in legends of descent from the Macedonians of Alexander. And the genetics here shows why. Though the Kalash are far more closely related to other Northwest South Asians than to Europeans, on the subset of genes which are implicated in pigmentation many of them could actually “pass” for Europeans. In fact, it is interesting to me that by these measures the Sardinians are no more European than groups like the Kalash and the Druze (in contrast to the total genome, where Sardinians may be the best reference for Western Europeans). They have a lower frequency of the SNP strongly associated with blue eyes than either of these groups, for example.

In the above paper they also produced a chart which illustrated the relationships of HGDP populations as a measure only of the six SNPs they used in their prediction method. These are markers which distinguish blue and brown eye color in Europeans efficiently.

Not all homozygosity is created the same (way)

By Razib Khan | August 15, 2012 11:39 pm

Browsing the most recent issue of The American Journal of Human Genetics I stumbled upon a paper with some neat figures, Genomic Patterns of Homozygosity in Worldwide Human Populations. More specifically they focused on patterns of “runs of homozygosity” (ROH), that is, sequences of the genome which exhibited a strong bias toward homozygous SNPs. The figure above illustrates a pooled set of populations with individual variation in total length of ROH, aggregated into three classes: short, medium, and long ROH. The short and medium length ROH exhibit the pattern of increasing total ROH as a function of distance from Africa. But not so with the long ROH. Why?

Humanity 2.0

By Razib Khan | July 28, 2012 9:37 pm

Dienekes points to a David Reich video where he shows his hand as to future possible results to come out of his lab. The short of it is that it seems likely that most agricultural populations exhibit the same dynamic outlined in Reconstructing Indian History. At the least you have an intrusive group admixing with indigenes. At the extreme you have total replacement. The pattern is confirmed for India, Ethiopia, and Southeast Asia. It seems highly likely in Europe. There are other rumored results in East Asia which might shake things up.

On a minor note, I do want to add that I think many archaeologists aren’t going to be totally surprised that modern Europeans don’t derive by and large from Aurignacians. But, the relatively recent nature of the map of genetic variation which we take for granted probably will shock, and result in a high degree of skepticism. Yet if I had to bet I would bet on the model being sketched out by David Reich. These admixtures and replacements are likely to resolve some confusions of our understanding of the settlement of the world using simple tree models with branching points tens of thousands of years in the past (e.g., you already know that Oceanians will have a longer branch because of archaic admixture).

The first, second, and third nations

By Razib Khan | July 11, 2012 10:47 pm

By now you’ve probably read about the paper which reports that there seem to have been three waves of humans migrating into the New World prior to the arrival of Europeans. A major aspect of this result is that it does not emerge out of a vacuum, but rather comes close to settling an old question in linguistics. The late Joseph Greenberg generated a series of audacious phylogenies of languages of the world. Greenberg’s attempts received mixed reviews. It seems that there is little controversy about some of his classifications of African languages, but linguists of American native dialects rejected his division of the languages of the New World into three broad families, Eskimo-Aleut, Na-Dene, and Amerind. Eskimo-Aleut is rather self-evident. Na-Dene encompasses a group of languages in northwest North America, along with some significant outliers such as Navajo. Amerind seems to roughly be a grab-bag of everything else. The linguistic trichotomy also lent itself to a narrative of three migrations. L. L. Cavalli-Sforza gave his support to Greenberg’s framework in The History and Geography of Human Genes, and it seems most non-linguists are particularly congenial toward his tendency of ‘lumping.’ In contrast, linguists remain more skeptical ‘splitters,’ at least those who have a more ethnographic disciplinary bent. Geneticists have not always supported Greenberg’s suppositions. For example, many of the members of the same group which authored this paper implicitly put the kibosh on the attempt to construct a unified linguistic family which spanned the Andaman Islanders and the Papuans.

The method of the paper was relatively straightforward, assuming you are already somewhat familiar with the statistical genetic esoterica which was unveiled a few years ago by this group and others. Basically you take genetic data in the form of hundreds of thousands of SNPs, and you test the patterns of variation in that data across populations against explicit models of demographic history, represented visually by phylogenetic trees. You can see here that the sampling was relatively thick, except for the United States. Chalk this up to politics. I’ve been hearing about this particular problem in relation to this paper for over a year now. Not having asked any of the members of the group directly I obviously am going off hearsay, but the lack of American samples is most definitely not a feature. It’s a bug. In the supplement they also note that they couldn’t get Na-Dene data from another research group. Almost certainly that’s because of bioethical issues and legal contractual constraints.

Despite all this drama, the science isn’t too hard to understand. Aside from the nifty statistics one problem is that many of these native groups have European and African admixture, but there are workarounds to that (e.g., just pull out genomic segments which are indigenous, and use those). The outcome is neatly visualized in the figure below:

SMBE 2012

By Razib Khan | June 24, 2012 7:34 pm

Dienekes has summaries up of human-related abstracts of Society for Molecular Biology & Evolution 2012.

1) Remember these are not papers, and some of the abstracts may never become papers, at least in recognizable form

2) Speaking of which, Estimating a date of mixture of ancestral South Asian populations:

Genes in space

By Razib Khan | May 25, 2012 11:38 pm

From some of the same people who brought you the genetic map of Europe, a very important paper, A model-based approach for analysis of spatial structure in genetic data. Here’s the abstract:

Characterizing genetic diversity within and between populations has broad applications in studies of human disease and evolution. We propose a new approach, spatial ancestry analysis, for the modeling of genotypes in two- or three-dimensional space. In spatial ancestry analysis (SPA), we explicitly model the spatial distribution of each SNP by assigning an allele frequency as a continuous function in geographic space. We show that the explicit modeling of the allele frequency allows individuals to be localized on the map on the basis of their genetic information alone. We apply our SPA method to a European and a worldwide population genetic variation data set and identify SNPs showing large gradients in allele frequency, and we suggest these as candidate regions under selection. These regions include SNPs in the well-characterized LCT region, as well as at loci including FOXP2, OCA2 and LRP1B.

Within the guts of this paper they make an important observation: constructing a set of populations and then generating pairwise statistics of differentiation across those populations has an element of arbitrariness. Rather than going in that direction the authors here are evaluating variation of genes as a function of continuous space, rather than binning them into discrete populations. In this way they can use patterns of genes to back infer the likely geographic origin of an individual, and more intriguingly pinpoint genetic loci which exhibit sharp gradients across space, and so may be targets of natural selection. The adaptive story for LCT is straightforward. But what of OCA2, which is mostly well known as a pigmentation locus which has been implicated in blue vs. brown eye variation in Europeans? As I like to say, interesting times….

And of course, they have released the software.

Neanderthals came in all colors

By Razib Khan | March 19, 2012 8:48 pm

There’s a report in Science about a new short paper about Neandertal pigmentation genetics. The context is this. First, in 2007 an ingenious paper was published which inferred that it may be that Neandertals had red hair, at least based on an N = 2 from two divergent locations. The new study looks at three Croatian samples, and reports genotypes which are correlated with a swarthier phenotype in modern populations. But the results are neither here nor there: everyone interviewed in the paper assumes that like modern Europeans Neandertals were a polymorphic set of populations when it comes to pigmentation. There are lots of reasons for this agreement, despite issues one might take with this paper.

The report on the paper in Science has two sections which I want to zoom in on. First, “Nearly 60% of the formula’s predictions matched the subjects’ actual physical appearance, the authors say. The team considers that accuracy rate satisfactory, given the complexity of the genetics behind skin color and other physical traits.” Do you consider 60 percent satisfactory? What curve are you grading on? I’m willing to bet that the reporter didn’t consider 60 percent satisfactory, and neither do I. If you look in the paper you’ll see that their method predicts that a Yoruba in the HGDP sample has blue eyes and red hair. Several of the Papuans are predicted to have blue eyes.

Socialized personal genomics?

By Razib Khan | February 6, 2012 12:07 am

Norway to bring cancer-gene tests to the clinic:

Norway is set to become the first country to incorporate genome sequencing into its national health-care system. The Scandinavian nation, which has a population of 4.8 million, will use ‘next-generation’ DNA sequencers to trawl for mutations in tumours that might reveal which cancer treatments would be most effective.

The consensus seems to be that ~2000 the main proponents of human genomics oversold the short-term biomedical yield on this line of inquiry. But one rule of thumb is that the consequences of novel technologies are often misunderstood; overestimated in straightforward ways in the short-term, but underestimated in unexpected ways over the long-term. To get a sense, you can reread some of the science fiction of the 1950s inspired by UNIVAC. These mass pushes for nation-wide human genomics projects have a comprehensible headline intent. But I wonder if the real results are going to be something we can’t anticipate.

(Via John Hawks)

Population structure using haplotype data

By Razib Khan | January 28, 2012 2:44 pm

The Pith: New software which gives you a more fine-grained understanding of relationships between populations and individuals.

According to the reader survey >50 percent of you don’t know how to interpret PCA or model-based (e.g., ADMIXTURE) genetic plots, so I am a little hesitant to point to this new paper in PLoS Genetics, Inference of Population Structure using Dense Haplotype Data, as it extends the results of those earlier methods. But it’s an important paper, and at some point I’ll start using their software. The “big picture” is that earlier methods left “some information on the table.” That’s partly due to the fact that they were developed (or in the case of PCA leveraged, as it’s a very general technique) in an era where very dense marker data sets were not available (today we’re shifting to full genome sequences in many cases!). The information left on the table would be haplotype structure. Genetic variation in a concrete form manifests as sequences along a line, many of them physically connected. These correlations of nearby variant markers represent haplotypes of great interest, because they are excellent clues to admixture or divergence events across populations. In contrast, the older methods were looking at variation from marker to marker, each in turn independently, which collapses some of the important genomic structure that we can now inspect (in fact, linkage disequilibrium due to these correlations can distort some of the results in the older methods, so you want to “thin” your marker set).
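
For those wondering what “thinning” amounts to, here is a minimal Python sketch of one greedy approach. It is a simplification I am using for illustration, not the exact algorithm in PLINK or any other package: walk along the markers in genomic order and keep one only if its squared correlation with every recently kept marker falls below a threshold. The function names, the default threshold, and the window (counted in markers) are all assumptions.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length genotype vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a monomorphic marker carries no correlation signal
    return (sxy * sxy) / (sxx * syy)


def thin_markers(genotypes, threshold=0.5, window=50):
    """Return indices of markers kept after greedy LD thinning.

    genotypes: list of markers in genomic order; each marker is a list of
    0/1/2 allele counts, one per individual.
    """
    kept = []
    for i, marker in enumerate(genotypes):
        nearby = [j for j in kept if i - j <= window]
        if all(r_squared(marker, genotypes[j]) < threshold for j in nearby):
            kept.append(i)
    return kept


if __name__ == "__main__":
    # Toy genotypes for six individuals at four markers (made up).
    toy = [
        [0, 1, 2, 1, 0, 2],
        [0, 1, 2, 1, 0, 2],  # perfect copy of the first marker
        [2, 0, 1, 1, 2, 0],
        [0, 2, 1, 0, 1, 2],
    ]
    print(thin_markers(toy, threshold=0.5))
```

The haplotype-based methods go the other way: rather than discarding the correlated markers, they treat those correlations as the signal.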

Let me make this concrete for you. On 23andMe you can see where your friends shake out on a PCA plot using the HGDP data set as a reference. What this means is that the HGDP data set is used to generate independent dimensions of genetic variation. As is the usual case in these analyses the largest dimension separates Africans from everyone else, and the second largest dimension separates Asians from Europeans and Africans. 23andMe customers are then projected upon this variation, so you can get a sense where you are positioned in the clusters. To the left is a zoom in on the section for Central/South Asians. You can see that one of my friends, highlighted with a green color, falls almost perfectly in the Uygur cluster. According to ancestry estimates my friend is 50 percent Asian and 50 percent European. The “representative” Uygur in the 23andMe chromosome painting gives about the same results. But these are total genome estimates. The historical nature of my friend’s admixture and that of the Uygur woman is very different, as one can see in the figure below.

How the Amhara breathe differently

By Razib Khan | January 22, 2012 12:23 pm

I have blogged about the genetics of altitude adaptation before. There seem to be three populations in the world which have been subject to very strong natural selection, resulting in physiological differences, in response to the human tendency toward hypoxia. Two of them are relatively well known, the Tibetans and the indigenous people of the Andes. The highlanders of Ethiopia have been less well studied, and have received less attention, even though the capital of Ethiopia, Addis Ababa, is nearly 8,000 feet above sea level!

Another interesting aspect to this phenomenon is that it looks like the three populations respond to adaptive pressures differently. Their physiological response varies. And the more recent work in genomics implies that though there are similarities between the Asian and American populations, there are also differences. This illustrates the evolutionary principle of convergence, where different populations approach the same phenotypic optimum, though by somewhat different means. To my knowledge there has not been as much investigation of the African example. Until now. A new provisional paper in Genome Biology is out, Genetic adaptation to high altitude in the Ethiopian highlands:

The quest for an Afrikaner genotype

By Razib Khan | January 21, 2012 8:49 pm

Update: If interested, please email me at contactgnxp -at- gmail -dot- com. Also, I am getting some feedback via 23andMe that people with white South African matches noticed African segments in many of the ancestry paintings. This has definitely increased my probability that the admixture proportion is ~5 percent. There will probably be a few genotypes coming in shortly, but I am going to see if I can get more people typed (fundraising appeal pending!).

It’s been a while since I’ve gone looking for genotypes of particular ethnic groups. The results were rather good for the Tutsi and Malagasy. So I thought I’d venture out again, despite being a bit busy. Here’s what I want: the genotype of an Afrikaner (or several). A few years ago South African geneticist J. M. Greeff did an analysis of his own pedigree, and estimated that he had ~6 percent non-European ancestry (he did validate this with some genetic markers; e.g., his father’s mtDNA is of the M haplogroup, which is almost always Indian). This is in line with other genealogists, who have estimated about 5 percent non-European heritage. How much should we trust these non-biological studies? The genomic estimates of African American ancestry being ~20 percent European were anticipated by analyses of family histories from text records, so we certainly shouldn’t dismiss them (in fact, it seems possible that these analyses will underestimate non-European ancestry because of cryptic individuals in the pedigrees).

And we have plenty of records of people of non-European ancestry contributing to the Afrikaner population in any case. Greeff found the records for his own pedigree, but the first Governor of the Dutch Cape Colony was himself of mixed race (his mother was Eurasian). The question is a matter of degree. Are Afrikaners like American whites, with hardly any non-European ancestry (~1 percent or less), or like Latin American whites, with significant non-European ancestry (~5 to 20 percent)? My own bet is that they’ll be in the middle. The proportion of non-European ancestry is low enough that individuals such as Sandra Laing are very rare indeed. But if the 5 percent estimate is valid, and almost all of these ancestors were women, then a larger proportion of the mtDNA is going to be non-European.
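
To put a rough number on that last point, here is a back-of-the-envelope sketch. It assumes a single pulse of admixture, equal numbers of male and female founders, and no later drift or selection on the uniparental markers; the function name and the female_share parameter are hypothetical, chosen only for illustration.

```python
def uniparental_expectations(autosomal_fraction, female_share=1.0):
    """Expected non-European share of mtDNA and Y lineages.

    autosomal_fraction: genome-wide non-European ancestry (e.g. 0.05).
    female_share: portion of that ancestry contributed by women
                  (1.0 means all non-European ancestors were women).

    Autosomes average the two sexes: a = (f + m) / 2, where f and m are
    the non-European fractions among female and male founders. mtDNA
    tracks f alone and the Y chromosome tracks m alone.
    """
    f = 2 * autosomal_fraction * female_share
    m = 2 * autosomal_fraction * (1 - female_share)
    return {"mtDNA": f, "Y": m}


# 5 percent autosomal ancestry, entirely through women (an assumption).
print(uniparental_expectations(0.05))
```

Under these simplifying assumptions, 5 percent autosomal ancestry contributed entirely by women would correspond to about 10 percent non-European mtDNA and no non-European Y chromosomes.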


Read More

Reconstructing a generation unsampled

By Razib Khan | January 14, 2012 3:59 pm

In the near future I will be analyzing the genotype of an individual where all four grandparents have been typed. But this got me thinking about my own situation: is there a way I could “reconstruct” my own grandparents? None of them are living. The easiest way to type them would be to obtain tissue samples from hospitals. This is not totally implausible, though in this case these would be Bangladeshi hospitals, so they might not have saved samples or even have a good record of them. Another way would be to extract DNA from the burial site. This is not necessarily palatable. But assuming you did this, if you have access to a forensic lab it might be pretty easy (though I think most forensic labs use VNTRs, rather than SNP chips, so I don’t know if they’d touch every chromosome). I’m not sure that the quality would be optimal for more vanilla typing operations, especially for older samples which are likely to be contaminated with a lot of bacteria.

For me the simplest option is to look at relatives. Each of my grandparents happens to have had siblings, so there are many sets of relatives who are related to just one of those individuals of interest. I also have many cousins, so by pooling all the genotypes together and using the information in a pedigree one could ascertain which chromosomal segments are likely to derive from a particular grandparent. To give a concrete example, my mother has a maternal cousin to whom she is quite close. By typing my mother and her cousin one could infer that the segments shared across the two individuals derive from the common maternal grandparents. Of course there’s a problem that first cousins have a coefficient of relatedness of only 1/8, so there is going to be a lot of information missing. But if you had lots of cousins you could presumably reconstruct the genotypes far better.
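
As a rough illustration of why more relatives help, here is a small sketch. It assumes an idealized pedigree in which each typed grandchild descends from the grandparent through a different child, so that each one independently carries any given grandparental allele copy with probability 1/2 × 1/2 = 1/4. The function names are hypothetical, and real reconstruction would also have to deal with phasing and with telling the grandparents apart.

```python
def first_cousin_relatedness():
    """Coefficient of relatedness for first cousins by path counting.

    Two shared grandparents, each linked to the cousins by a path of
    four meioses: r = 2 * (1/2)**4.
    """
    return 2 * 0.5 ** 4


def expected_fraction_recovered(n_grandchildren):
    """Expected share of a grandparent's two genome copies that shows up
    in at least one of n grandchildren.

    Assumes every grandchild descends through a different child of the
    grandparent, so each carries a given allele copy with probability 1/4.
    """
    return 1 - 0.75 ** n_grandchildren


print("first cousins, r =", first_cousin_relatedness())
for n in (1, 2, 4, 8):
    print(n, "grandchildren:", round(expected_fraction_recovered(n), 3))
```

Under this simplified model a single grandchild captures a quarter of the grandparent's genetic material, and the returns diminish as more cousins are added, since later cousins increasingly carry segments already seen in earlier ones.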


Read More

"Descendant of Genghis Khan" sequenced

By Razib Khan | December 18, 2011 10:23 am

Chinese Scientists Announce the First Complete Sequencing of Mongolian Genome:

In this study, the DNA sample was from a male adult who belongs to the Mongolian “Royal Family” and is the 34th generation descendant of Genghis Khan. “The sample is very valuable for the study with a full record of family pedigree and no background of intermarriage between other ethnic groups.” said Professor Huanmin Zhou, Project Investigator and Director of Science and Technology at IMAU.


Read More

CATEGORIZED UNDER: Anthropology, Human Genetics

No studies necessary: do your own replication!

By Razib Khan | December 16, 2011 9:22 pm

In response to the post below I received the above response on Twitter. This is an interesting case. The link goes to a paper from 2000, Alu insertion polymorphisms in NW Africa and the Iberian Peninsula: evidence for a strong genetic boundary through the Gibraltar Straits:

An analysis of 11 Alu insertion polymorphisms (ACE, TPA25, PV92, APO, FXIIIB, D1, A25, B65, HS2.43, HS3.23, and HS4.65) has been performed in several NW African (Northern, Western, and Southeastern Moroccans; Saharawi; Algerians; Tunisians) and Iberian (Basques, Catalans, and Andalusians) populations. Genetic distances and principal component analyses show a clear differentiation of NW African and Iberian groups of samples, suggesting a strong genetic barrier matching the geographical Mediterranean Sea barrier. The restriction to gene flow may be attributed to the navigational hazards across the Straits, but cultural factors must also have played a role. Some degree of gene flow from sub-Saharan Africa can be detected in the southern part of North Africa and in Saharawi and Southeastern Moroccans, as a result of a continuous gene flow across the Sahara desert that has created a south-north cline of sub-Saharan Africa influence in North Africa. Iberian samples show a substantial degree of homogeneity and fall within the cluster of European-based genetic diversity.

Read More

MORE ABOUT: Human Genomics

Gene Expression

This blog is about evolution, genetics, genomics and their interstices. Please beware that comments are aggressively moderated. Uncivil or churlish comments will likely get you banned immediately, so make any contribution count!
