The environmental factor is never directly observable, but luckily an imperfect proxy is the grandparents — if your parents are very tall but your grandparents aren’t, likely a portion of your parents’ height is due to environmental factors, so you shouldn’t expect to be as tall as someone with tall parents and grandparents. That may be what #9 is getting at.

]]>your model is too simple. in the first generation you’re sampling in a biased fashion from the additive genetic variance (one normalish distribution). you are not sampling in a biased fashion from the residual/environment (the other normalish distribution, which is confounded with the former in the total population). that is why regression back toward the mean is proportional to the amount of environmental variance in the parental population, as it counteracts the efficacy of selection on genes. in the second generation i’m not positing any selection on genes at all. both of the distributions in e3 are going to be centered around 0 in reference to the parental population, whereas in e2 the underlying genetic component is deviated from 0 (via selection) and the environmental component remains about 0. the regression is the compound of the two.

*I’ll try to convince you using one other thought*

dude, **animal breeders use this method all the time.** it’s outlined as i described. if i can’t describe it for you clearly, that’s fine, but please don’t presume this is an abstruse theoretical issue! no, subsequent generations don’t regress like your model. that’s not something i made up, that’s something validated in breeding programs for decades. you’re not a geneticist, so you don’t know that empirically. that’s fine.

First, the sentence “~80% of the variation of the trait in the population can be explained by variation of genes in the population” suggests (to me, admittedly not a geneticist) an 80% correlation, not an 80% slope, but more importantly note that correlation = slope, assuming that the 2 generations (variables) have the same standard deviation, which is likely. Slope = correlation * ratio of standard deviations. But fine, we can set this issue aside.
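To make the slope vs. correlation point concrete, here is a minimal sketch. The offspring standard deviation of 1.5 is a made-up value chosen only so that the two quantities visibly differ; with equal standard deviations they coincide.

```python
import random

random.seed(0)
n = 50_000
r = 0.8          # assumed parent-offspring correlation
sd_child = 1.5   # hypothetical: offspring sd differs from the parental sd of 1

parent = [random.gauss(0, 1) for _ in range(n)]
# Offspring built to have correlation r with the parent and sd equal to sd_child.
child = [sd_child * (r * p + (1 - r**2) ** 0.5 * random.gauss(0, 1)) for p in parent]

def mean(xs):
    return sum(xs) / len(xs)

def cov(xs, ys):
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

sd_p = cov(parent, parent) ** 0.5
sd_c = cov(child, child) ** 0.5
corr = cov(parent, child) / (sd_p * sd_c)
slope = cov(parent, child) / sd_p**2   # least-squares slope of child on parent

print('correlation:', round(corr, 3))
print('slope:', round(slope, 3))
print('correlation * sd ratio:', round(corr * sd_c / sd_p, 3))
```

The last two printed numbers agree by construction, and the slope only equals the correlation when sd_child is set to 1.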

Also, I did notice the difference (on my 2nd reading, between my 1st and 2nd comments). That’s why, in my 2nd comment, I first calculate E[x2 | x1 = 0.2], and next calculate E[x3 | x1 = 0.2]. These 2 are not analogous – the 1st one is expected value of a person’s height given that both his parents have a height of 0.2. The 2nd one is the expected value of a person’s height given that all of his grandparents have a height of 0.2. That seems to be consistent with your step 4 above.

The bottom line is that if you are arguing that the expected height of a person whose 4 grandparents have height 0.2 sd above mean (and that’s all you know about that person’s genetics) is the same as the expected height of a person whose 2 parents have a height of 0.2 sd above the mean, then I disagree with you (assuming we are using a multivariate normal distribution which is what I’ve implemented in python). The former should be lower. My python code shows this (it should actually be modified slightly to account for the fact that x3 is based on average of 2 draws of x2, rather than a single draw, but this doesn’t change the qualitative conclusion).

If it is possible for you to implement a simulation of your representation in a coding language, or to indicate what in my representation is wrong, that might clear it up.

I’ll try to convince you using one other thought: suppose that you did do what I thought (incorrectly) you were doing at first, i.e. that the members of the 3) population in your comment didn’t mate randomly, but that only those with height 0.16 mated. Is it intuitive to you that in that case, the avg. height in population 4) would be lower than in the case you actually describe?

]]>let me state it concretely, because i think you didn’t notice when i switched the type of populations i’m talking about as well.

1) first, you have a normally distributed population

2) second, you select parents with midparent values of exactly 0.2 deviations from the median of the first population (subset of 1)

3) you have x number of offspring cohorts. each offspring cohort will be 0.16 deviations from the original population in 1)

4) then i posited that the offspring themselves in 3) would mate randomly, which is different from what i posit in 2). the new population distribution, analogous to 1), would have a mean of 0.16.
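The four steps above can be sketched as a simulation. This is only an illustration under assumptions not spelled out in the thread: a purely additive model where phenotype = genetic value + environment, heritability 0.8, a child's genetic value equal to the midparent genetic value plus segregation noise with half the additive variance, and a small tolerance window standing in for "exactly 0.2". The names `H2`, `TOL`, `phenotype` and `child_of` are hypothetical.

```python
import random

random.seed(1)

H2 = 0.8        # heritability from the thread: share of variance that is additive genetic
TARGET = 0.2    # midparent phenotype selected in step 2, in sd units
TOL = 0.02      # hypothetical tolerance standing in for "exactly 0.2"

def phenotype(g):
    """Genetic value plus a fresh environmental draw centered on 0."""
    return g + random.gauss(0, (1 - H2) ** 0.5)

def child_of(g_a, g_b):
    """Assumed additive model: midparent genetic value plus segregation noise."""
    return (g_a + g_b) / 2 + random.gauss(0, (H2 / 2) ** 0.5)

# Steps 1 and 2: draw random couples, keep those whose midparent phenotype is ~0.2.
selected = []
for _ in range(200_000):
    g_a, g_b = random.gauss(0, H2 ** 0.5), random.gauss(0, H2 ** 0.5)
    midparent = (phenotype(g_a) + phenotype(g_b)) / 2
    if abs(midparent - TARGET) < TOL:
        selected.append((g_a, g_b))

# Step 3: offspring of the selected couples (10 per couple).
gen2_g = [child_of(g_a, g_b) for g_a, g_b in selected for _ in range(10)]
gen2_mean = sum(phenotype(g) for g in gen2_g) / len(gen2_g)

# Step 4: the offspring mate at random, with no further selection.
gen3_g = [child_of(random.choice(gen2_g), random.choice(gen2_g)) for _ in gen2_g]
gen3_mean = sum(phenotype(g) for g in gen3_g) / len(gen3_g)

print('couples selected:', len(selected))
print('offspring mean:', round(gen2_mean, 3))
print('grand-offspring mean:', round(gen3_mean, 3))
```

Under these assumptions the offspring mean should land near 0.8 * 0.2 = 0.16 and the grand-offspring mean should stay near the same value, since the environmental draw is re-centered on 0 each generation while the genetic mean is passed on unchanged.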

]]>Then, in the first part of your post, you explain that E[x2 | x1 = 0.2] = 0.16 (read “the expected value of x2, given that x1 is 0.2 sd units, is 0.16 sd units”). This is correct.

Later (“Rather, the expected value of the offspring would be 0.16 units.”) you seem to argue that E[x3 | x1 = 0.2] = 0.16. But this is just wrong. In fact, E[x3 | x1 = 0.2] = 0.128 (0.8^2 * 0.2).

Or is your representation somehow different from a multivariate normal? You seem to suggest it isn’t.

Here is python code to show that E[x3 | x1 = 0.2] = 0.128:

-----------------------------------------------

import random

corr = 0.8

x1chose = 0.2  # this represents the selection of gen 1 with height 0.2

# unconditional gen 1, for reference: average should be near 0
x1 = [random.gauss(0, 1) for a in range(100000)]

print('x1 avg.: ', sum(x1) / len(x1))

# gen 2, given that gen 1 has height x1chose
x2 = [corr * x1chose + (1 - corr**2)**.5 * random.gauss(0, 1) for a in range(100000)]

print('x2 avg., given x1 = 0.2: ', sum(x2) / len(x2))

# gen 3, each drawn from a single gen 2 value
x3 = [corr * a + (1 - corr**2)**.5 * random.gauss(0, 1) for a in x2]

print('x3 avg., given x1 = 0.2: ', sum(x3) / len(x3))

]]>the empirical distribution of offspring is 2/3 to 3/4 of the population from which parents were selected (this is in a follow up post). the fact that the offspring have a relatively high deviation despite selection on parents is due to genetic segregation and recombination.

*. Further on this issue, what about the parents with a phenotype midpoint of 0.2 made their children’s expected value 0.16, while the children of those with a phenotype midpoint of 0.16 also had an expected value of 0.16?*

you selected the parents to have a different mean from the distribution. you don’t select the children. your question is really hard to understand. you either don’t know this stuff well, or you know it so well that it’s beyond me frankly.

]]>The formula is that the standard deviation of the red curve should be equal to sqrt(1-0.8)*(standard deviation of black curve).

Also, a related issue: you say “They presume that a subsequent generation of mating would result in further regression back to the mean. No! Rather, the expected value of the offspring would be 0.16 units.” If this is true, then either 1) that subsequent generation has a much higher standard deviation than its parents or 2) correlation is 100% between that subsequent generation and its parents. Neither seems likely. Further on this issue, what about the parents with a phenotype midpoint of 0.2 made their children’s expected value 0.16, while the children of those with a phenotype midpoint of 0.16 also had an expected value of 0.16? How in practice can you distinguish between tall parents whose children will have an expected value equal to theirs vs. lower?

]]>I’d say in regards to music and sports at least, it depends on the particulars. Classical music and baseball? Yeah. Blues and Track and Field, not so much.

Also, being the child of someone who is very gifted in what they do *sucks*, especially if they’re a crap teacher. My Dad is a musician for whom it comes naturally. He’s not particularly trained, but can pick up near anything that’s not too technically complex by ear. I remember cutting the strings off a ukulele he gave me when I was seven out of frustration from never playing well enough to match him. For that reason I never again took up music until fairly recently, and found out as an adult that while I’m not gifted, I’m pretty good at it, and when I’m not comparing myself to my father, I really, really enjoy playing music and composing songs. So, I have to say I really really feel what Jordan’s kids have to go through. There’s nothing like having some talent and the desire to do something, but to be forever eclipsed by your parent.

]]>I’m unversed in statistics, but I’d like to understand this.

]]>Someone with connections will be able to make absolutely the most of their talent. This is least important in highly competitive areas where there are objective standards, but even there it’s a factor. For example, in college football and the NFL you hear fairly often of top players who were walk-ons with no scholarship or undrafted. I strongly suspect that many of these players played for low-prestige coaches on less-known teams in small-time leagues.

Education: things like sports, music, and math are very specialized and require more than talent and general knowledge. A 15 year old batter and outfielder whose father was a top batter and outfielder will know all kinds of tricks of the trade that the average, equally talented kid doesn’t know. This is conditional; some fathers pass down only genes, and some teach the family trade.

Picasso’s dad was a professional artist. Mozart’s dad was a professional musician who specialized in pedagogy. Bach’s father and two uncles were professional musicians, and so were three of his sons. Mozart’s son was a mediocre professional musician — maybe Mozart’s father was a better teacher than Mozart was.

]]>