Kobe Bryant is an exceptional professional basketball player. His father was a “journeyman”. Similarly, Barry Bonds and Ken Griffey Jr. both surpassed their fathers as baseball players. Both of Archie Manning’s sons are superior quarterbacks in relation to their father. This is not entirely surprising. Though there is a correlation between parent and offspring in their traits, that correlation is imperfect.
Note though that I put journeyman in quotes above because any success at the professional level in major league athletics indicates an extremely high level of talent and focus. Kobe Bryant’s father was among the top 500 best basketball players of his age. His son is among the top 10. This is a large realized difference in professional athletics, but across the whole distribution of people playing basketball at any given time it is not so great of a difference.
What is more curious is how this relates to the reality of regression toward the mean. This is a very general statistical concept, but for our purposes we’re curious about its application in quantitative genetics. People often misunderstand the idea from what I can tell, and treat it as if there is an orthogenetic-like tendency of generations to regress back toward some idealized value.
Going back to the basketball example: Michael Jordan, the greatest basketball player in the history of the professional game, has two sons who are modest talents at best. The probability that either will make it to a professional league seems low, a reality acknowledged by one of them. In fact, from what I recall both received special attention and consideration because they were Michael Jordan’s sons. It is still noteworthy of course that both had the talent to make it onto a roster of a Division I NCAA team. This is not typical for any young man walking off the street. But the range in realized talent here is notable. Similarly, Joe Montana’s son has been bouncing around college football teams to find a roster spot. Again, it suggests a very high level of talent to be able to plausibly join a roster of a Division I football team. But for every Kobe Bryant there are many, many, Nate Montanas. There have been enough generations of professional athletes in the United States to illustrate regression toward the mean.
In the post below, Moderate marginal value to genomics, I left some things implicit. It turns out that this was an ill-considered decision. In reality my comments were simply more cryptic and opaque than implicit. This is pretty obvious because even those readers who are biologists didn’t seem to catch what I had assumed would be obvious in the thrust of my argument.
The point in the broadest sense is that DNA and genomics are not magical. Genetics existed before either of them. Understanding the physical basis of genetics has certainly been incredibly fruitful, and genomics has altered the playing field in many ways. But there was a broad understanding of genetics before DNA and genomics, both in a Mendelian sense and in the area of biometrics and quantitative genetics. In the earlier post I indicated that the tools for predictions of adult traits due to the effect of genes have been around for a long time: our family history. By this, I mean that a lot of traits of interest are substantially heritable. A great deal of the variation within the population can be explained by variation of genes in the population, as inferred by patterns of correlation between individuals in their traits as a function of genetic relatedness. This is genetics as a branch of applied statistics. It has great “quick & dirty” power, especially in agricultural science.
Let’s look at something simple: height. It’s a continuous trait which is rather concrete. No one argues that “height” is a social construct. In Western societies height is ~80-90% heritable. That means that most of the variation in the trait within the population can be explained by variation in genes, which in practice shows up as variation in family background. Tall people have tall children, short people have short children, and so forth. Here’s a “toy” scatterplot which shows the relation between mid-parent heights and adult offspring heights (I made up the numbers):
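For anyone who wants to generate toy numbers of this kind, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the function names, the population mean of 170 cm, the standard deviation of 7 cm, a heritability of 0.8, random mating, and a single pooled height scale with no sex difference. The only substantive idea is the standard one that the regression slope of offspring on the mid-parent value is roughly the heritability.

```python
import random
import statistics

def simulate_families(n, h2=0.8, mean=170.0, sd=7.0, seed=1):
    """Return n made-up (mid-parent, offspring) height pairs in cm."""
    rng = random.Random(seed)
    # The average of two unrelated parents varies less than a single person.
    mid_sd = sd / 2 ** 0.5
    # Residual spread chosen so offspring have the same SD as the population.
    resid_sd = sd * (1 - h2 ** 2 / 2) ** 0.5
    pairs = []
    for _ in range(n):
        mid_dev = rng.gauss(0.0, mid_sd)
        # Expected offspring deviation is h2 times the mid-parent deviation.
        child_dev = h2 * mid_dev + rng.gauss(0.0, resid_sd)
        pairs.append((mean + mid_dev, mean + child_dev))
    return pairs

def regression_slope(pairs):
    """Ordinary least-squares slope of offspring height on mid-parent height."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    var = sum((x - mx) ** 2 for x in xs)
    return cov / var

pairs = simulate_families(20000)
slope = regression_slope(pairs)
print(f"estimated slope: {slope:.2f}")  # should land near the assumed h2
```

Plotting `pairs` with any charting tool gives a tilted cloud of points. The slope being less than 1 is regression toward the mean in visual form: very tall mid-parents have children who are tall, but on average less extreme than they are.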
It is known that Northern Europeans tend to be somewhat taller than Southern Europeans. This seems intuitively obvious if you spend a bit of time around representative populations. Growing up in the Pacific Northwest I’ve always been on the short side at 5 feet 8 inches, but when I was in Italy for 3 weeks one year back (between Milan and Rome, with disproportionate time spent in the Piedmont) I didn’t feel as small (I recall feeling similarly when I was in Cajun country in the early 2000s). Steve Hsu alerts me to the fact that Luke Jostins is back blogging at Genetic Inference, reporting from the Biology of Genomes meeting. Apparently Michael Turchin has found that:
1) Alleles known to be associated with greater height are found at higher frequencies in Northern Europeans
2) Alleles known to be associated with greater height also exhibit signatures of natural selection
The Pith: There has been a long running argument whether Pygmies in Africa are short due to “nurture” or “nature.” It turns out that non-Pygmies with more Pygmy ancestry are shorter and Pygmies with more non-Pygmy ancestry are taller. That points to nature.
In terms of how one conceptualizes the relationship of variation in genes to variation in a trait one can frame it as a spectrum with two extremes. On the one hand you have monogenic traits where the variation is controlled by differences on just one locus. Many recessively expressed diseases fit this pattern (e.g., cystic fibrosis). Because you have one gene with only a few variants of note it is easy to capture in one’s mind’s eye the pattern of Mendelian inheritance for these traits in a gestalt fashion. Monogenic traits are highly amenable to a priori logic because their atomic units are so simple and tractable. At the other extreme you have quantitative polygenic traits, where the variation of the trait is controlled by variation on many, many genes. This may seem a simple formulation, but to try and understand how thousands of genes may act in concert to modulate variation on a trait is often a more difficult task to grok (yes, you can appeal to the central limit theorem, but that means little to most people intuitively). This is probably why heritability is such a knotty issue in terms of public understanding of science, as it concerns the component of variation in quantitative continuous traits which is dispersed across the genome. These are the traits where there is no “gene for X.” Additionally, quantitative traits are likely to have a substantial environmental component of variation, confounding a simple genotype to phenotype mapping.
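The appeal to the central limit theorem can be made a little more tangible with a toy simulation. This is only a sketch under strong simplifying assumptions that are mine, not anything measured: every locus has two alleles at equal frequency, each “plus” allele adds the same amount to the trait, loci are independent, and there is no environmental component at all. The function names are made up for the example.

```python
import random
import statistics

def polygenic_scores(n_people, n_loci, freq=0.5, seed=2):
    """Count 'plus' alleles per person across n_loci diploid loci."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_people):
        # Two allele copies per locus, each a 'plus' allele with probability freq.
        score = sum(1 for _ in range(2 * n_loci) if rng.random() < freq)
        scores.append(score)
    return scores

def share_within_one_sd(scores):
    """Fraction of people whose score lies within one SD of the mean."""
    mu = statistics.fmean(scores)
    sd = statistics.pstdev(scores)
    return sum(1 for s in scores if abs(s - mu) <= sd) / len(scores)

one_locus = polygenic_scores(2000, 1)
many_loci = polygenic_scores(2000, 200)

# A single locus gives a handful of discrete genotype classes.
print("distinct values, 1 locus:", len(set(one_locus)))
# Many loci give a wide spread of values that looks continuous.
print("distinct values, 200 loci:", len(set(many_loci)))
# A normal distribution would put about 0.68 of people within one SD.
print("share within one SD:", round(share_within_one_sd(many_loci), 2))
```

With one locus you can hold the whole distribution in your head as three classes. With a couple of hundred loci the same additive rule produces something that looks like a bell curve, which is the polygenic end of the spectrum described above.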
Arguably the classic quantitative trait is height. It is clear and distinct (there aren’t arguments about the validity of measurement as occurs in psychometrics), and it is substantially heritable. In Western societies with a surfeit of nutrition height is ~80-90% heritable. What this means is that ~80-90% of the variance of the trait value within the population is due to variance of the genes within the population. Concretely, there will be a very strong correspondence between the heights of offspring and the average height of the two parents (controlled for sex, so you’re thinking standard deviation units, not absolute units). And yet height is at the heart of the question of the “missing heritability” in genetics. By this, I mean the fact that so few genes have been associated with variation in height, despite the reality that who your parents are is the predominant determinant of height in developed societies.
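To make “standard deviation units, not absolute units” concrete, here is a small worked sketch in Python. The sex-specific means and standard deviations in `REFERENCE` are round numbers I made up for illustration, not survey data, and the function names are hypothetical. The prediction rule assumes a heritability of 0.8 and that the expected offspring deviation is simply the heritability times the mid-parent deviation, which ignores complications such as assortative mating and shared environment.

```python
# Illustrative (mean, SD) of adult height in cm by sex; made-up round numbers.
REFERENCE = {"male": (178.0, 7.0), "female": (165.0, 6.5)}

def to_sd_units(height_cm, sex):
    """Express a height as a deviation from the same-sex mean, in SD units."""
    mean, sd = REFERENCE[sex]
    return (height_cm - mean) / sd

def expected_child_height(father_cm, mother_cm, child_sex, h2=0.8):
    """Expected adult height of a child, given both parents' heights."""
    # Averaging in SD units controls for the sex difference between parents.
    midparent_z = (to_sd_units(father_cm, "male")
                   + to_sd_units(mother_cm, "female")) / 2
    # The expected deviation shrinks toward the mean by the factor h2.
    child_z = h2 * midparent_z
    mean, sd = REFERENCE[child_sex]
    return mean + child_z * sd

# Both parents are 2 SD above their same-sex means in this made-up reference.
tall_son = expected_child_height(192.0, 178.0, "male")
# Both parents are exactly average.
average_daughter = expected_child_height(178.0, 165.0, "female")
print(round(tall_son, 1), round(average_daughter, 1))
```

In this toy setup the son of two parents who are each 2 standard deviations above average is expected to be 1.6 standard deviations above average: still clearly tall, but closer to the population mean than his parents. That is an expectation over many such families, and individual children will scatter around it.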