Every variant with an author!

By Razib Khan | September 29, 2010 1:36 pm

I recall projections in the early 2000s that 25% of the American population would be employed as systems administrators circa 2020 if rates of employment growth at that time were extrapolated. Obviously the projections weren’t taken too seriously, and the pieces were generally making fun of the idea that IT would reduce labor inputs and increase productivity. I thought back to those earlier articles when I saw a new letter in Nature in my RSS feed this morning, Hundreds of variants clustered in genomic loci and biological pathways affect human height:

Most common human traits and diseases have a polygenic pattern of inheritance: DNA sequence variants at many genetic loci influence the phenotype. Genome-wide association (GWA) studies have identified more than 600 variants associated with human traits1, but these typically explain small fractions of phenotypic variation, raising questions about the use of further studies. Here, using 183,727 individuals, we show that hundreds of genetic variants, in at least 180 loci, influence adult height, a highly heritable and classic polygenic trait2, 3. The large number of loci reveals patterns with important implications for genetic studies of common human diseases and traits. First, the 180 loci are not random, but instead are enriched for genes that are connected in biological pathways (P = 0.016) and that underlie skeletal growth defects (P < 0.001). Second, the likely causal gene is often located near the most strongly associated variant: in 13 of 21 loci containing a known skeletal growth gene, that gene was closest to the associated variant. Third, at least 19 loci have multiple independently associated variants, suggesting that allelic heterogeneity is a frequent feature of polygenic traits, that comprehensive explorations of already-discovered loci should discover additional variants and that an appreciable fraction of associated loci may have been identified. Fourth, associated variants are enriched for likely functional effects on genes, being over-represented among variants that alter amino-acid structure of proteins and expression levels of nearby genes. Our data explain approximately 10% of the phenotypic variation in height, and we estimate that unidentified common variants of similar effect sizes would increase this figure to approximately 16% of phenotypic variation (approximately 20% of heritable variation). 
Although additional approaches are needed to dissect the genetic architecture of polygenic human traits fully, our findings indicate that GWA studies can identify large numbers of loci that implicate biologically relevant genes and pathways.

The supplements run to nearly 100 pages, and the author list is enormous. But at least the supplements are free to all, so you should check them out. There are a few sections of the paper proper that are worth passing on though if you can’t get beyond the paywall.


[Figure 1b]

In this study they pooled together several studies into a meta-analysis. One thing not mentioned in the abstract: they checked their GWAS SNPs against a family-based study. This was important because in the latter population stratification isn’t an issue. Family members naturally overlap a great deal in their genetic background. Also, if I read it correctly they’re focusing on populations of European origin, so this might not capture larger effect alleles which impact between-population variance in height but don’t vary within a given population (note that if you explored pigmentation genetics just through Europeans you would miss the most important variable on the worldwide scale, SLC24A5, because it’s fixed in Europeans). In any case, as you can see, what they did was extrapolate the number of loci their methods could capture, with sample size as the predictor. At 500,000 individuals they’re at ~700 loci, and around 20% of the heritable variation. My initial thought is that I’m not seeing diminishing returns here, but since I haven’t read the supplements I’ll let that pass since I don’t know the guts of this anyhow. They do assert that they are likely underestimating the power of these methods because there may be smaller effect common variants which can top off the fraction.
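The figures quoted so far can be put side by side with a few lines of arithmetic. This is only a back-of-the-envelope sketch: it assumes the roughly 80% heritability of height that the paper cites, and it takes the pairing of ~700 loci with ~16% of phenotypic variation at 500,000 individuals from the reading given in this post, not from the supplements. The function and variable names are made up for illustration.

```python
# Figures as quoted in this post; the projected row is the post's reading
# of the paper's extrapolation and has not been checked against the supplements.
HERITABILITY = 0.80  # assumed share of height variation due to genetic factors


def share_of_heritable(phenotypic_share, heritability=HERITABILITY):
    """Convert a share of phenotypic variance into a share of heritable variance."""
    return phenotypic_share / heritability


# (sample size, number of loci, share of phenotypic variance explained)
observed = (183_727, 180, 0.10)
projected = (500_000, 700, 0.16)

for label, (n, loci, r2) in (("observed", observed), ("projected", projected)):
    print(
        f"{label}: {share_of_heritable(r2):.1%} of heritable variation, "
        f"{1000 * loci / n:.2f} loci per 1,000 people, "
        f"{r2 / loci:.4%} of variance per locus"
    )
```

On these numbers the count of loci per person sampled goes up, while the variance explained per locus goes down. So whether there are diminishing returns depends on which of the two quantities you care about.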

But even they admit that they can go only so far. Here are some sections from the conclusion that lay it out pretty clearly:

By increasing our sample size to more than 100,000 individuals, we identified common variants that account for approximately 10% of phenotypic variation. Although larger than predicted by some models26, this figure suggests that GWA studies, as currently implemented, will not explain most of the estimated 80% contribution of genetic factors to variation in height. This conclusion supports the idea that biological insights, rather than predictive power, will be the main outcome of this initial wave of GWA studies, and that new approaches, which could include sequencing studies or GWA studies targeting variants of lower frequency, will be needed to account for more of the ‘missing’ heritability. Our finding that many loci exhibit allelic heterogeneity suggests that many as yet unidentified causal variants, including common variants, will map to the loci already identified in GWA studies, and that the fraction of causal loci that have been identified could be substantially greater than the fraction of causal variants that have been identified.

In our study, many associated variants are tightly correlated with common nsSNPs, which would not be expected if these associated common variants were proxies for collections of rare causal variants, as has been proposed27. Although a substantial contribution to heritability by less common and/or quite rare variants may be more plausible, our data are not inconsistent with the recent suggestion28 that many common variants of very small effect mostly explain the regulation of height.

In summary, our findings indicate that additional approaches, including those aimed at less common variants, will likely be needed to dissect more completely the genetic component of complex human traits. Our results also strongly demonstrate that GWA studies can identify many loci that together implicate biologically relevant pathways and mechanisms. We envisage that thorough exploration of the genes at associated loci through additional genetic, functional and computational studies will lead to novel insights into human height and other polygenic traits and diseases.

The second to last paragraph takes a shot at David Goldstein’s idea of synthetic associations.

We’re still where we were a few years back, though: old-fashioned Galtonian quantitative genetics, a branch of statistics, is the best bet to predict the heights of your offspring. As with intelligence, “height genes” are not improvements upon common sense. But if you’re going into the 10-20% range of variation explained it’s certainly not trivial, and the biological details are going to be of interest.
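To make the Galtonian comparison concrete, here is a minimal sketch of the textbook midparent regression. It assumes a purely additive model with random mating and no shared environment, heights already adjusted for sex, and the 80% heritability figure quoted from the paper. The population mean of 170 cm and the function names are hypothetical, chosen only for illustration.

```python
import math

H2 = 0.80          # heritability of height, the figure quoted from the paper
POP_MEAN_CM = 170  # hypothetical population mean, for illustration only


def predict_offspring_height(mother_cm, father_cm, h2=H2, mean_cm=POP_MEAN_CM):
    """Shrink the midparent deviation toward the population mean by h2."""
    midparent = (mother_cm + father_cm) / 2
    return mean_cm + h2 * (midparent - mean_cm)


def midparent_variance_explained(h2=H2):
    """Share of offspring variance explained by midparent height.

    Under the simple model assumed here, the offspring-midparent correlation
    is h2 / sqrt(2), so the variance explained is h2 ** 2 / 2.
    """
    return h2 ** 2 / 2


def correlation_from_variance_explained(r2):
    """Correlation between predictor and trait implied by a share of variance."""
    return math.sqrt(r2)


print(f"predicted offspring height: {predict_offspring_height(180, 190):.1f} cm")
print(f"midparent variance explained: {midparent_variance_explained():.0%}")
print(f"GWAS correlation at 10%: {correlation_from_variance_explained(0.10):.2f}")
```

Under those assumptions, measuring the two parents explains roughly a third of the variation in offspring height, against about a tenth for the SNPs reported in the paper. Real data will depart from the idealized model, so treat the comparison as indicative only.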

CATEGORIZED UNDER: Genetics

  • DK

    “This conclusion supports the idea that biological insights, rather than predictive power, will be the main outcome of this initial wave of GWA studies”

    Care to clarify what the biological insight without predictive power means? How is it then different from a wild (or educated) guess? How else do we know that GWAS hits are “real”? Or is the idea that GWAS generates a list of candidates and then they are evaluated by all other biological means to see if they really do control the trait the way GWAS claims? If so then I don’t see how it’s much different from simply writing down a plausible list of the genes involved and then again proceeding to study them one by one.

  • http://blogs.discovermagazine.com/gnxp Razib Khan

    Care to clarify what the biological insight without predictive power means?

    i think they’re trying to indicate that obviously explaining 15% of the phenotypic variance means that their genes won’t be predictive, but i guess the bundle of SNPs can be targets for bench guys.

    If so then I don’t see how it’s much different from simply writing down a plausible list of the genes involved and then again proceeding to study them one by one.

    i assume they’d argue that they’ve narrowed the search space.

    though i wonder if one of the hundreds of authors could answer your query. there’s a probability that at least a few of them will see this post.

  • http://www.sanger.ac.uk/research/faculty/jbarrett/ Jeff Barrett

    I’m maybe the only statistical geneticist in the world not an author on that paper, but I can take a stab at DK’s question. The key here is that these genes are definitely associated with height — far more convincingly than any educated guess based on biological function etc. The field really has bent over backwards to ensure high quality data in these studies and to weed out any false positive associations.

    The “biological insight” vs “prediction” question turns on a different issue. It’s true that generally GWAS haven’t been very successful in yielding useful genetic predictors, but the point that the height authors are trying to make is that there’s another, separate, goal of GWAS beyond prediction: understanding the biology of the trait. The small effect sizes documented in GWAS mean that the particular associations we’ve found don’t explain variation in the trait, but crucially the effect size in the GWAS isn’t necessarily predictive of the importance of that gene or locus overall in the trait.

    Genes which have both common GWAS hits and rare ‘Mendelian’ hits (in height, things like severe dwarfism) illustrate that the gene, depending on how it’s genetically perturbed, can have a wide variety of effects on the phenotype. That’s important because we might be able to better understand the progression with disease or identify druggable targets.

    So while the prediction question is still open (although it’s becoming clear that GWAS alone aren’t the whole answer), I’d say that the broader goal of learning about the biology of human health and disease has received a huge boost from studies like this.

  • maxpie

    I am just a beginner in the field of research, but if you look at the p-value, which is very very highly significant, meaning hereby that it’s a chance association but the association is really really strong. With my very limited knowledge, all scientific research is an educated guess; you cannot be absolute about anything. We just have to find out if that educated guess is smart enough to answer the given question.

    the solution of complexity always lies in simplicity :)

  • http://blogs.discovermagazine.com/gnxp/2010/09/every-variant-with-an-author/ genie

    At the moment we are merely trying to scramble in the dark with very limited knowledge of the underlying complexity. I doubt these studies are teaching us anything new at all. One has got to start thinking outside the box, rather than merely doing larger and larger GWAS (it obviously helps to get more funding but not understand more science). The solution does not have to be simple… It is wrong to assume that “the solution of complexity always lies in simplicity”… In a nutshell I don’t think these papers actually contribute much to our understanding at all. It is a bit of a waste of money I feel…

  • DK

    Jeff Barrett says: “these genes are definitely associated with height — far more convincingly than any educated guess based on biological function etc. The field really has bent over backwards to ensure high quality data in these studies and to weed out any false positive associations”.

    Oh, I wasn’t aware that we are now at a stage where such strong statements can be made. I’ll admit to being somewhat clueless about details yet skeptical. Can you explain what makes you so sure? Are you sure it’s not just wishful thinking? My experience with what I call “paper biochemistry” is invariably this: every grandiose claim about how soon we’ll not need anything but computers eventually turns out to be very wrong.

Gene Expression

This blog is about evolution, genetics, genomics and their interstices. Please beware that comments are aggressively moderated. Uncivil or churlish comments will likely get you banned immediately, so make any contribution count!

About Razib Khan

I have degrees in biology and biochemistry, a passion for genetics, history, and philosophy, and shrimp is my favorite food. In relation to nationality I'm an American Northwesterner, in politics I'm a reactionary, and as for religion I have none (I'm an atheist). If you want to know more, see the links at http://www.razib.com
