Genetics existed before -omics

By Razib Khan | June 9, 2011 2:20 am

In the post below, Moderate marginal value to genomics, I left some things implicit. It turns out that this was an ill-considered decision. In reality my comments were simply more cryptic and opaque than implicit. This is pretty obvious because even those readers who are biologists didn’t seem to catch what I had assumed would be obvious in the thrust of my argument.

The point in the broadest sense is that DNA and genomics are not magical. Genetics existed before either of them. Understanding the physical basis of genetics has certainly been incredibly fruitful, and genomics has altered the playing field in many ways. But there was a broad understanding of genetics before DNA and genomics, both in a Mendelian sense and in the area of biometrics and quantitative genetics. In the earlier post I indicated that the tools for predictions of adult traits due to the effect of genes have been around for a long time: our family history. By this, I mean that a lot of traits of interest are substantially heritable. A great deal of the variation within the population can be explained by variation of genes in the population, as inferred by patterns of correlation between individuals in their traits as a function of genetic relatedness. This is genetics as a branch of applied statistics. It has great “quick & dirty” power, especially in agricultural science.

Let’s look at something simple, height. It’s a continuous trait which is rather concrete. No one argues that “height” is a social construct. In Western societies height is ~80-90% heritable. That means that most of the variation within the population of the trait can be explained by variation in one’s family background. Tall people have tall children, short people have short children, and so forth. Here’s a “toy” scatterplot which shows the relation between mid-parent heights and adult offspring heights (I made up the numbers):


The correlation isn’t perfect. But it’s pretty good. The more heritable a trait is, the more a scatterplot of this form (offspring regressed on parents) approaches tight linearity with a slope of ~1. These plots are measuring narrow sense heritability, which is the additive genetic variance over the phenotypic variance. Additive genetic variance is just the part of the variance due to variants which have additive or subtractive effects on the trait value (or which can be transformed as such).
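As a rough illustration of the point about slope, here is a minimal simulation sketch. It is not how the toy plot was made. It assumes a purely additive trait, random mating, no sex difference in height, and made-up values for the population mean (170 cm) and standard deviation (7 cm); the function names are hypothetical. Under those assumptions the slope of offspring on mid-parent value should come out close to the narrow sense heritability.

```python
import numpy as np


def simulate_midparent_offspring(h2=0.8, n_families=20000,
                                 mean=170.0, sd=7.0, seed=0):
    """Simulate mid-parent and adult offspring values for an additive trait.

    Assumptions (illustrative only): random mating, no dominance or
    epistasis, no shared family environment, no sex difference.
    """
    rng = np.random.default_rng(seed)
    var_p = sd ** 2           # phenotypic variance
    var_a = h2 * var_p        # additive genetic variance
    var_e = var_p - var_a     # environmental "noise" variance

    # Parental breeding values (additive genetic deviations from the mean).
    a_father = rng.normal(0.0, np.sqrt(var_a), n_families)
    a_mother = rng.normal(0.0, np.sqrt(var_a), n_families)

    # Parental phenotypes = mean + breeding value + environmental deviation.
    father = mean + a_father + rng.normal(0.0, np.sqrt(var_e), n_families)
    mother = mean + a_mother + rng.normal(0.0, np.sqrt(var_e), n_families)
    midparent = (father + mother) / 2.0

    # Offspring breeding value: average of the parents' breeding values plus
    # a Mendelian segregation term with variance var_a / 2.
    a_child = (a_father + a_mother) / 2.0 + rng.normal(
        0.0, np.sqrt(var_a / 2.0), n_families)
    child = mean + a_child + rng.normal(0.0, np.sqrt(var_e), n_families)
    return midparent, child


def regression_slope(x, y):
    """Least-squares slope of y regressed on x."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)


if __name__ == "__main__":
    mid, child = simulate_midparent_offspring(h2=0.8)
    print("estimated slope:", regression_slope(mid, child))
```

With `h2` set to 0.8 the fitted slope should land near 0.8, and pushing `h2` toward 1 pulls the slope toward 1, which is the sense in which a highly heritable trait approaches a slope of ~1.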

To make this plot in a fashion which is more than illustrative, you need a lot of data on a large number of individuals and their parents. This would have been tedious and required a substantial labor investment in earlier periods, but today with powerful data mining techniques I think it would be much, much easier. In a world where the child is the father of the man these methods would have great power.

But they’re not perfect. Siblings vary in height, even though the trait seems mostly controlled by variation in genes on the population level. What’s going on? Genetically, Mendelian segregation and genetic recombination are going to reshuffle the many alleles which control variation in height from parent to offspring in terms of what the gamete contributes. Additionally, the nature of the environmental “noise” may vary from sibling to sibling. Using population-wide data you can infer the expected value of the offspring based on heritability and mid-parent value, but there’s going to be variance about the mean of the theoretical distribution. For example, the standard deviation of I.Q. within the population is 15 points, and across full-siblings it is also 15 points.
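The expected value mentioned here can be sketched with the standard regression-to-the-mean prediction from quantitative genetics: the offspring are expected to deviate from the population mean by the mid-parent deviation scaled by narrow sense heritability. The function name and the numbers in the usage example are made up for illustration, and the sketch assumes a purely additive trait with parents and offspring drawn from the same population.

```python
def expected_offspring_value(midparent, population_mean, h2):
    """Expected adult offspring value given the mid-parent value.

    Assumes a purely additive trait and that parents and offspring come
    from the same population (illustrative assumptions, not from the post).
    """
    # The mid-parent deviation from the mean is shrunk by heritability.
    return population_mean + h2 * (midparent - population_mean)


if __name__ == "__main__":
    # Hypothetical numbers: parents averaging 180 cm in a population whose
    # mean is 170 cm, with heritability of 0.8.
    print(expected_offspring_value(180.0, 170.0, 0.8))
```

This gives only the center of the theoretical distribution. Individual siblings will still scatter around it because of segregation, recombination and environmental noise.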

This is where genomics comes in. It does make a difference, on the margin. I suspect it would do so by removing some of the uncertainty of segregation and genetic recombination. Going back to the height example, imagine that you know of the ~1,000 genes which vary within the population to control variation in height. You sequence two parents, and so know which regions of their genomes are enriched for “tall” or “short” alleles. Some of the variance in the offspring is going to be due to the fact that the offspring don’t receive a perfect proportional representation of their parents’ alleles in terms of aggregate effect size. You could then remove some of the uncertainty in outcome because you can check the child’s genome against the parents’ and assess whether they received more or less of the “tall” or “short” alleles.

But there would still be environmental “noise” which you probably couldn’t account for. You can see an illustration of what I have in mind in the two normal distributions I plotted above. Both of them represent the theoretical distribution of possibilities of a child on a quantitative trait which only becomes realized in adulthood. The blue line shows what you can infer from the plain information of parental phenotypes. But what happens when you give them a genomic test? You remove some of the uncertainty from your calculus, and the variance drops. You see that in the red line.
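To put rough numbers on how much the variance could drop, here is a back-of-the-envelope sketch under an idealized additive model. It assumes random mating, no dominance or epistasis, no shared family environment, and that a genomic test recovers additive genetic values exactly, which real tests do not, so the genomic figures are a best case. The function name, dictionary keys and the 7 cm standard deviation are hypothetical.

```python
import math


def residual_sd(h2, phenotypic_sd):
    """Spread of a child's adult trait value around its predicted value,
    under three levels of information (idealized additive model)."""
    var_p = phenotypic_sd ** 2
    var_a = h2 * var_p       # additive genetic variance
    var_e = var_p - var_a    # environmental "noise"

    return {
        # Only the parents' phenotypes are known: regressing offspring on
        # mid-parent value leaves var_p * (1 - h2**2 / 2) unexplained.
        "parental_phenotypes": math.sqrt(var_p * (1.0 - h2 ** 2 / 2.0)),
        # The parents' additive genetic values are known exactly: what is
        # left is Mendelian segregation (var_a / 2) plus environment.
        "parental_genotypes": math.sqrt(var_a / 2.0 + var_e),
        # The child's own additive genetic value is known exactly: only
        # the environmental noise remains.
        "child_genotype": math.sqrt(var_e),
    }


if __name__ == "__main__":
    for source, sd in residual_sd(h2=0.8, phenotypic_sd=7.0).items():
        print(f"{source}: {sd:.2f}")
```

Under these assumptions the spread narrows at each step but never reaches zero, because the environmental term stays in place.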

This is what I mean when I say that genomics matters on the margin. It does have an effect. But all the tools to profile and predict are around us now. Even amateurs can find out quite a bit about someone’s family if they’re determined. This is no different in deep principle from the sort of techniques which large corporations are utilizing to create a “profile” of your possible future purchases by what you purchased in the past. The parents are past purchases. The adult offspring are future purchases. Knowing a lot of the genes implicated in behavior genetics might help the profile, but at the end of the day it’s not a deal-breaker or a game-changer.

An analogy to current market research and prediction algorithms is particularly apropos, I think. They creep people out. So I naturally expect people to be creeped out if the state or an insurance company has detailed, fleshed-out actuarial tables based on genetics and genomics. But genetics or genomics don’t make it any more or less scary on a deep level. Nor do they make the techniques qualitatively more effective. And the policy questions and responses are going to be the same no matter what.

  • Sandgroper

    Barry Marshall is enthused: http://au.news.yahoo.com/thewest/a/-/wa/9607909/nobel-winner-eyes-genetics/

    I guess he means the same thing – insurance companies are not going to get a lot more than they can already get from family medical history.

    Except for stuff that might be cryptic in family history but which will show up in genomic data. There can be things like that. And vice versa – e.g. a woman whose mother and sisters all developed breast cancer is not necessarily doomed.

  • bob sykes

    Before DNA, a gene was a “character,” but now it is the code for a protein. Changes in characters are presumably the result of changes in several or many proteins and in animals the hox boxes. So, Mendel was observing coordinated changes in suites of proteins not single proteins.

    Shouldn’t this change in the meaning of “gene” be a point of discussion?

  • Antoine

    Hi Razib, thank you for this interesting entry.

    Just to make clear, can you please correct me if i’m wrong ?
    - As heritability goes, nothing changes because heritability is computed using the middle of the bell curves, which are the same. So the definition doesn’t have to change.
    - For individual beings though, the knowledge of genetic data provides more certainty about the achieved height.

    Don’t we have 3 bell curves then ? from larger to narrower:
    - from knowledge of parent height
    - from knowledge of parent genes (environmental noise + randomness of allele distribution)
    - from knowledge of one’s genes (only environmental noise)

  • Markk

    Is this a faint mirror of the odd split in biology that happened between the traditionally grounded and the sequence grounded? From the outside (i.e. me) that split is shown by the idea still put forth recently, that most people thought there were 100,000 human genes or so based on an offhand remark by Gilbert, and the many geneticists who pegged the number accurately, mostly still using traditional techniques and information rules well before big time sequencing, are kind of left out in these tales. Larry Moran, I think, had an article about that a long time ago. Many “sequencers” perpetuate the myth that everyone was shocked that the number of human genes was so low, but it wasn’t true, they were wrong to guess that high and people were telling them so, but with the big hammer of sequencing everything else was kind of ignored, and the mythologizing has already begun.

    I see you here as stating the obvious – that there are inferential techniques about genes and inheritance that are not sequencing – that give a lot of base information. Even without the exact genes, with the information correlation technology we have today we could build powerful models of heritability. This sounds like a good chance combined with cheap sequencing to have some great play back and forth between the two kinds of models.

    #2 – wasn’t that change of meaning done 50 years ago or more though? Is there something that would matter with these correlational studies that would change with the current definition?

  • DK

    Sorry, kinda OT: “standard deviation of I.Q. within the population is 15 points, and across full-siblings it is also 15 points”

    Any references for this? I find this result very, very surprising. Using your analogy to height, this would be an equivalent of height distribution in families having the same variation as in general population. Anecdotally this sounds wrong.

  • http://wwwpersonalgenomics.us Trey

    aha :) . Well, it was a bit opaque, but I was reading it from a different perspective too. I now see what you are saying. I agree for the most part, there was genetics (and family history) before the -omics, and those tools existed if we wanted to use them and actually have stronger predictive power.

    But wouldn’t you say that those tools, though not necessarily different in principle, could be different in practice? Determining a trait now does take a determined search, determining them with genomic data is a fine instrument compared to determining someone’s family history.

    I kind of look at it like the analogy between all the card catalogs in a city’s libraries vs. the internet. In principle, I can find the same piece of data. The former would require a lot of determination, time and search, the latter takes moments and minimal effort.

    Of course, as you said, with today’s technology (from facebook to a google search), determining a family history in order to discriminate is a lot faster than it would have been 30 years ago.

  • http://blogs.discovermagazine.com/gnxp Razib Khan
  • http://rxnm.wordpress.com miko

    bob sykes said: “Shouldn’t this change in the meaning of “gene” be a point of discussion?”

    I think technically “genes” were the unknown causal agents that caused differences in characters (phenotypes). So yes, restricting this to physical segments of DNA was a major shift. “Gene” is still up for debate… I’ve never heard a non-operational definition or one that can’t be picked apart until it doesn’t seem that useful. It’s amazing how useful a concept it is, despite its inability to bear too much scrutiny.

    Evelyn Keller has written extensively on this, and many others.

  • DK

    Razib, this quote reads like a back of the envelope calculation for which a typical SD for IQ is good enough to take. I’d be wary of interpreting this nearly offhand remark (“typical of such”) as a real life indication that IQ varies between siblings as much as it varies within general population.

  • http://blogs.discovermagazine.com/gnxp Razib Khan

    DK, just the first thing i saw on google books. i’ve seen it plenty elsewhere.

  • http://blogs.discovermagazine.com/gnxp Razib Khan

    here’s a standard descriptive statistic:
    http://books.google.com/books?id=YtHoP5UYTDMC&lpg=PA357&ots=mp_4u7HlJO&dq=sibling%20I.Q.%20difference&lr&pg=PA361#v=onepage&q&f=false

    might be 2/3 to 1 standard deviation of population norm. usually i see it closer to 1.

    (~1 standard deviation is plausible to me going by anecdote btw)

  • DK

    Razib, thanks! ~2/3 I can live with. ~ 1 I have a hard time believing. Wonder what twins’ SD is then? I’d say probably ~1/2, given the correlations and that anything much lower than 0.5 is unlikely. Are you aware of any actual data?

  • http://blogs.discovermagazine.com/gnxp Razib Khan

    #12, i think i saw 1/2 mz to dz comparison, as usual. depends on the heritability estimate you get based on the type of sample i think.

  • bob sykes

    Dear Miko: Thanks for the reference.
