DISCOVER Magazine. Science, Technology and The Future
Gene Expression

Beyond visualization of data in genetics

Hopefully by now the image to the left is familiar to you. It’s from a paper in Human Genetics, Self-reported ethnicity, genetic structure and the impact of population stratification in a multiethnic study. The paper is interesting in and of itself, as it combines a wide set of populations and focuses on the extent of disjunction between self-identified ethnic identity and the population clusters which fall out of patterns of genetic variation. In particular, the authors note that the “Native Hawaiian” identification in Hawaii is characterized by a great deal of admixture, and within their sample only ~50% of the ancestral contribution within this population was Polynesian (the balance split between European and Asian). The figure suggests that subjective self-assessment of ancestral quanta is generally accurate, though there is a non-trivial number of outliers. Dienekes points out that the same dynamic holds (less dramatically) for the European and Japanese populations within their data set.

All well and good. And I like these sorts of charts because they’re pithy summations of a lot of relationships in a comprehensible geometrical fashion. But they’re not reality; they’re a stylized representation of a slice of reality, abstractions which distill the shape and processes of reality. More precisely, the x-axis is an independent dimension of correlations of variation across genes which can account for ~7% of the total population variance. This is the dimension with the largest magnitude. The y-axis is the second largest dimension, accounting for ~4%. The magnitudes decline precipitously as you descend the rank order of the principal components. The 5th component accounts for ~0.2% of the variance.
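
For readers who want to see where percentages like these come from, here is a minimal sketch in Python with NumPy. It is not the paper's pipeline: the genotype matrix is simulated, and the population sizes, marker count and allele frequencies are all made-up assumptions. The share of variance for each component is the squared singular value of the centered data divided by the sum of all squared singular values.

```python
import numpy as np

def explained_variance_ratio(genotypes):
    """Fraction of total variance captured by each principal component.

    genotypes: (individuals x markers) array of allele counts (0, 1 or 2).
    """
    # Center each marker (column) so PCA describes variation around the mean.
    centered = genotypes - genotypes.mean(axis=0)
    # Singular values come back in descending order; their squares are
    # proportional to the variance along each principal component.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    return variances / variances.sum()

# Toy data (assumption): two populations, 50 individuals each, 500 markers,
# with allele frequencies that differ modestly between the two groups.
rng = np.random.default_rng(0)
freq_a = rng.uniform(0.1, 0.9, size=500)
freq_b = np.clip(freq_a + rng.normal(0.0, 0.15, size=500), 0.05, 0.95)
pop_a = rng.binomial(2, freq_a, size=(50, 500))
pop_b = rng.binomial(2, freq_b, size=(50, 500))
genotypes = np.vstack([pop_a, pop_b]).astype(float)

ratios = explained_variance_ratio(genotypes)
print("share of variance, first five components:", np.round(ratios[:5], 3))
```

Even in a toy like this the first component takes only a modest slice of the total, because most of the variance sits within populations rather than between them.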

The first two components in these sorts of studies usually conform to our intuitions, and add a degree of precision to various population scale relations. Consider this supplement chart from a 2008 paper (I’ve rotated and reedited for clarity):


[Figure: PCA plot from the 2008 paper’s supplement]

The first component separates Africans from non-Africans, the latter being a derived population from a subset of the former. The second component distinguishes West Eurasians from East Eurasians & Amerindians. These two dimensions and the distribution of individuals from the Human Genome Diversity Project reiterate what we know about the evolutionary history of our species.

And yet I wonder if we should be careful about the power of these two-dimensional representations to constrain us excessively when we think about genetic variation and dynamics. Naturally, the character of the dimensions is sensitive to the nature of the underlying data set upon which they rely. But consider this thought experiment:

Father = Japanese
Mother = Norwegian
Child = Half Japanese & Half Norwegian

If you projected these three individuals upon the two-dimensional representation above of the worldwide populations, the father would cluster with East Asians, the mother with Europeans, and the child with the groups who span the divide, Uyghurs and Hazaras. So on the plot the child would be far closer to these Central Asian populations than to the groups from which its parents derive. And here’s a limitation of focusing too much on two-dimensional plots derived from population level data: is the child interchangeable with a Uyghur or Hazara genetically in relation to their parents? Of course not! If the child were female, and the father impregnated her, the consequence (or probability of a negative consequence) would be very different than if he impregnated a Uyghur or Hazara woman.

The reason for this difference is obvious (if not, ask in the comments, many readers of this weblog know the ins & outs at an expert level). Abstractions which summarize and condense reality are essential, but they have their uses and limitations. Unlike physics biology can not rely too long on elegance, beauty, and formal clarity. Rather, it always has to dance back and forth between rough & ready heuristics informed by the empirics and theoretical systems which emerge from axioms. Usually a picture has its own sense. But the key is to be precise in understanding what sense it makes to you.
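
The thought experiment can be played out on simulated data. The sketch below is only an illustration under loud assumptions: the allele frequencies are invented, markers are treated as independent, the "admixed" group simply has frequencies halfway between the two source populations, and every name in the code is hypothetical. It projects the child onto the first two components of a reference panel, then counts markers where two people are opposite homozygotes (0 copies versus 2, sometimes called IBS0). A parent and child share an allele at every marker, so in this error-free simulation that count is zero for them, whatever the plot says.

```python
import numpy as np

rng = np.random.default_rng(1)
n_markers = 2000

# Hypothetical allele frequencies for two source populations, plus an
# admixed population whose frequencies are the average of the two.
freq_west = rng.uniform(0.1, 0.9, size=n_markers)
freq_east = np.clip(freq_west + rng.normal(0.0, 0.2, size=n_markers), 0.05, 0.95)
freq_mixed = (freq_west + freq_east) / 2

def sample_people(freqs, count):
    """Draw genotypes (0, 1 or 2 copies of an allele) for `count` people."""
    return rng.binomial(2, freqs, size=(count, freqs.size))

def make_child(parent_one, parent_two):
    """Each parent passes on one of their two alleles at every marker."""
    return rng.binomial(1, parent_one / 2) + rng.binomial(1, parent_two / 2)

def ibs0(person_a, person_b):
    """Count markers where two people are opposite homozygotes (0 vs 2)."""
    return int(np.sum(np.abs(person_a - person_b) == 2))

# Reference panel used to define the principal components.
west, east, mixed = (sample_people(f, 40) for f in (freq_west, freq_east, freq_mixed))
panel = np.vstack([west, east, mixed]).astype(float)
panel_mean = panel.mean(axis=0)
_, _, components = np.linalg.svd(panel - panel_mean, full_matrices=False)

def project(people):
    """Coordinates on the first two principal components of the panel."""
    return (np.atleast_2d(people) - panel_mean) @ components[:2].T

mother = sample_people(freq_west, 1)[0]
father = sample_people(freq_east, 1)[0]
child = make_child(mother, father)

# Position on the two-dimensional plot.
child_xy = project(child)[0]
mixed_centre = project(mixed).mean(axis=0)
dist_to_mixed = float(np.linalg.norm(child_xy - mixed_centre))
dist_to_father = float(np.linalg.norm(child_xy - project(father)[0]))
dist_to_mother = float(np.linalg.norm(child_xy - project(mother)[0]))

# Pairwise relatedness signal that the plot does not show.
ibs0_father = ibs0(child, father)
ibs0_mother = ibs0(child, mother)
ibs0_mixed = [ibs0(child, person) for person in mixed]

print("plot distance, child to admixed centre:", round(dist_to_mixed, 2))
print("plot distance, child to father / mother:",
      round(dist_to_father, 2), "/", round(dist_to_mother, 2))
print("opposite-homozygote markers vs father / mother:", ibs0_father, "/", ibs0_mother)
print("fewest opposite-homozygote markers vs any admixed person:", min(ibs0_mixed))
```

On the plot the child sits with the admixed group, yet only the parents show zero opposite-homozygote markers. The two-dimensional summary captures population-level position, not the pairwise relationship between specific individuals.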


May 31st, 2010 Tags: Genetics, Genomics, Population Genetics
by Razib Khan in Genetics, Genomics

5 Responses to “Beyond visualization of data in genetics”

  1. Tweets that mention Beyond visualization of data in genetics | Gene Expression | Discover Magazine -- Topsy.com Says:
    May 31st, 2010 at 8:33 am

    [...] This post was mentioned on Twitter by razib khan, Ron Simon and Jeff C. Coleman, J.S.. J.S. said: Beyond visualization of data in genetics http://ow.ly/17yZ0c [...]

  2. Peter Marsh Says:
    May 31st, 2010 at 3:58 pm

    The purpose of this article appears to be an attempt to discount the value of these charts which contain some very important information. Information that confirms previous studies regarding the origins of the Polynesians.

    Here is an example of some of these studies linking America and Japan with Polynesia

    In Peter Bellwood’s book Man’s Conquest of the Pacific he cites a study showing that
    Polynesians and NW Coastal Indians have very similar blood. They both have no B, high A, high M, high R2 & moderate Fya. The study showed Polynesians have no blood similarities to S.E. Asians or Melanesians.

    S.W. Serjeantson “The Colonization of the Pacific – A Genetic Trail” 1989 pp 135, 162-163, 166-7: “The following genes set them apart: Polynesians lack HLA-B27, whereas it is common amongst Melanesians.
    HLA-Bw48 is commonly found in Polynesian populations, but occurs only sporadically in Melanesia. The only other known population with an appreciable frequency of HLA-Bw48 is that of the North American Indians or more specifically the Tlingit. In Polynesia Bw48 co-occurs with A11, – suggesting a variation since Polynesians departed from the Canadian coast.

    Theodore G Schurr and colleagues (1990) ‘Both the North American Pima and the Central American Maya have high frequencies of the Mitochondrial DNA sequence variation containing the rare Asian RFLP HincII morph 6 in conjunction with an Asian-specific 9 base pair deletion.’ It appears that both the Pima and the Maya are genetically very close to the Polynesians. The arrival of these genes in America is believed to have been between 6-8,000 years ago, ruling out the possibility of Polynesian origins as Polynesians have only been in the Pacific for 2,200 years. A migration of the Polynesians from America is far more logical.

    Katsushi Tokunaga and colleagues. ‘Genetic link between Asians and Native Americans: Evidence from HLA genes and haplotypes’ in Human Immunology 62 1001-1008 (2001).
    ‘HLA24-Cw8-B48, A24-Cw10-B60 and A24-Cw9-B61 were all commonly observed in Taiwan indigenous populations, Tibetans, Thais, Japanese, Orochon in North East China, Buryat, Man, Yakut, Inuit, Tlingit, Pima, Maya and Maori.’

    Harihara and colleagues (1992) noted: When observing the ‘Frequency of a 9bp deletion in the mitochondrial DNA among Asian populations’. It appears that the Maori & Cook Islanders had ancestors from the Shizuoka prefecture of Japan.”

    Fideas E, Leon S, and colleagues. ‘HLA Trans Pacific contacts’ (1995) notes that ‘a tribe living near the Pacific Colombian coast named the Noanama/Wanana, are clustered genetically closer to Japanese people than to other American natives. Novick and colleagues concur with this.’

    Yes Polynesians are related to Japanese and native Taiwanese. They came via the Kuroshio current to America and then sailed down to Hawaii – the Homeland of Polynesians. Yes they did mix with Caucasians – The Easter Islanders are paleolithic Caucasians from America – as are the Basques – hence their close genetic similarity.

    In 1972 Professor Jean Dausset conducted a study of the Caucasian blue/green eyed, red heads of Easter Island, who are in fact a significant part of the Polynesian story. He found them to have an ancient strain of Caucasian blood, which can also be found in the Basques of Spain, characterised by A29 and B12. The analyses revealed that 39% of unrelated Basques and 37% of the Easter Islanders were carriers of the HLA gene B12. These were the highest and second highest proportions tested throughout the world. The figures for A29 were similar. The Easter Islanders, with 37%, had the highest proportion in the world, while the Basques were second with 24%. The most remarkable thing was that the two genes were found as a haplotype (combined genetic markers) in 11% of Easter Islanders and 7.9% of the Basques. No other people in the world had remotely comparable figures.”
    In fact, from the above tests, the Easter Islanders appear to be of a more pure ancient Caucasian racial stock than the Basques!

    So both these very visual graphs tell us exactly like it is. Yes, there has been some recent genetic admixture, but geneticists can see that by looking at the gene tree where they can see the times of recombination.

    Yes America WAS the stepping stone of Polynesians into the Pacific. The second graph in the above article shows this very clearly. This is the trail of Haplogroup B on the West coast of America which arrived 6-8,000 years ago, but in Polynesia its arrival was only 2,200 years ago. Chronology alone suggests the direction of colonisation.

    For more details regarding this alternative much more robust theory regarding the origins of the Polynesians see my website Polynesian Pathways at above url.

  3. bioIgnoramus Says:
    May 31st, 2010 at 5:32 pm

    “Unlike physics biology can not rely too long on elegance, beauty, and formal clarity”: aye, and physics tends to rely on them when there’s a dearth of experimental data.

  4. Razib Khan Says:
    June 1st, 2010 at 1:55 am

    “The purpose of this article appears to be an attempt to discount the value of these charts which contain some very important information.”

    no.

  5. PCA plots and trees | Gene Expression | Discover Magazine Says:
    June 1st, 2010 at 3:52 pm

    [...] Blogs / Gene Expression « Beyond visualization of data in genetics [...]





    • About Gene Expression

      Razib Khan’s degrees are in biochemistry and biology. He has blogged about genetics since 2002, previously worked in software development, is an Unz Foundation Junior Fellow and lives in the western US. He loves habaneros.


  • Kalmbach Publishing Co.

    Copyright © 2012, Kalmbach Publishing Co.
