People often make “year end predictions.” I haven’t done that because I just haven’t bothered. But, it’s probably a nice way to see how full of crap you are. You can look back at how many mistakes you made, suggesting to you that you’re really a lot more ignorant of the shape of reality than you fancy yourself. So I’m going to put some predictions down right now. The title is self-centered, but I want it to be Googleable. There are two classes of predictions. The first class are those which I think have more than 50 percent chance of coming to fruition. I don’t want to pick “sure things,” because what’s the point of that? The second category is different, in that I think the chance of the outcome may be less than 50 percent, and the conventional wisdom is going to be opposite of the prediction, but I suspect the odds are better than people think. I’ll give myself “bonus points” if those come true.
That’s the question a commenter poses, albeit with skepticism. First, the background here. New England was a peculiar society for various demographic reasons. In the early 17th century there was a mass migration of Puritan Protestants from England to the colonies which later became New England because of their religious dissent from the manner in which the Stuart kings were changing the nature of the British Protestant church.* Famously, these colonies were themselves not aiming to allow for the flourishing of religious pluralism, with the exception of Rhode Island. New England maintained established state churches longer than other regions of the nation, down into the early decades of the 19th century.
Between 1630 and 1640 about 20,000 English arrived on the northeastern fringe of British settlement in North America. With the rise of co-religionists to power in the mid-17th century a minority of these emigres engaged in reverse-migration. After the mid-17th century migration by and large ceased. Unlike the Southern colonies these settlements did not have the same opportunities for frontiersmen across a broad and ecologically diverse hinterland, and their cultural mores were decidedly more constrained than those of the cosmopolitan Middle Atlantic. The growth in population in New England from the low tens of thousands to close to 1 million in the late 18th century was one of endogenous natural increase from the founding stock.
About 25 percent of the traffic to this website comes from search engines. Mostly Google. Below are two sets of top 25 search results. The first is pretty straightforward. But the second removes all the key words which by and large probably come from people just looking for weblogs. The links are to search results on Google.
A few qualifications. First, I removed all Google referral sites except for G+. Second, I removed Discover Magazine urls. Some of these sites should perhaps have been omitted from the list as well because of my past or current association with them (gnxp.com, Secular Right, Sepia Mutiny and Brown Pundits). ScienceBlogs is mostly, though not exclusively, from my old website there. I’m a little amused that razib.com is rather high on the list, but that site is the first hit usually for querying my name on Google (and therefore Bing, which seems to just copy Google’s results).
In this list I’ve limited it to posts which were published in 2011. For much of the blog’s history I didn’t autoclose comments after 2 weeks, so the comparisons aren’t appropriate. And comments tend to be less timeless in any case. Comments are a double-edged sword on a weblog, because they often invite the stupid to come out and play in people. But there is a non-trivial subset of commenters from whom I’ve learned a fair amount. That learning doesn’t always have to be a case where you even change your mind. Discussion in good faith can usually sharpen comprehension of your own perspective.
My friend Holden Karnofsky always pings me at this time of the year. Holden is co-founder of GiveWell. If you’re curious, you can look up more on the outfit yourself, I’ve talked about it enough over the years for you to get why I’m interested and a supporter. Holden is a numbers and data driven guy, and it turns out that 25% of the money given through their website last year was on December 31st. Here are their top charities.
In purely selfish news (yes, I’m a heavy user) Wikipedia is also in need of cash. And yes, I give! (though that doesn’t stop the constant stream of begging headshots)
A questioner below was curious if vocabulary test differences by ethnicity and region persist across income. There’s a problem with this. First, the INCOME variable isn’t very fine-grained (there is a catchall $30,000 or greater category). Second, it doesn’t seem to control for inflation. But, there is a variable, DEGREE, which asks the highest level of education attained. I used this to create a “college” and “non-college” category (i.e., do you have a bachelor’s degree or not). Because of sample size considerations I removed some of the ethnic groups, but replicated the earlier analysis.
Below are two tables. One shows the mean vocab score for region and ethnicity (for whites) for those without college educations, and another shows those with college educations. I decided to generate a correlation over the two rows, even though it sure isn’t useful as a quantitative statistical measure because of the small number of data points. Rather, I just wanted a summary of the qualitative result. The short answer is that the average vocabulary difference seems to persist across educational levels (the exception here is the “German” ethnicity).
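The comparison described above can be sketched in a few lines of Python. This is a toy illustration, not the actual GSS analysis: the records are invented placeholders, the DEGREE coding (3 and up meaning a bachelor’s degree or more) is an assumption about the GSS codebook, and the helper names are hypothetical.

```python
from collections import defaultdict
from math import sqrt

# Placeholder records: (region, ethnicity, DEGREE code, WORDSUM score).
# These numbers are invented for illustration; they are not GSS results.
RECORDS = [
    ("Northeast", "British", 3, 8), ("Northeast", "British", 1, 6),
    ("Northeast", "German", 4, 7), ("Northeast", "German", 0, 5),
    ("South", "British", 3, 7), ("South", "British", 1, 5),
    ("South", "German", 3, 7), ("South", "German", 1, 6),
    ("Midwest", "British", 4, 8), ("Midwest", "British", 2, 6),
]


def has_college(degree_code):
    # Assumption: DEGREE codes 3 (bachelor's) and 4 (graduate) count as "college".
    return degree_code >= 3


def mean_by_cell(records, college):
    """Mean WORDSUM per (region, ethnicity) cell for one education class."""
    scores = defaultdict(list)
    for region, ethnicity, degree, wordsum in records:
        if has_college(degree) == college:
            scores[(region, ethnicity)].append(wordsum)
    return {cell: sum(v) / len(v) for cell, v in scores.items()}


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


non_college = mean_by_cell(RECORDS, college=False)
college = mean_by_cell(RECORDS, college=True)

# Correlate the two tables over the cells they share.
shared = sorted(set(non_college) & set(college))
r = pearson([non_college[c] for c in shared], [college[c] for c in shared])
print(f"{len(shared)} shared cells, r = {r:.2f}")
```

With only a handful of cells the coefficient is a rough summary of whether the two tables rank groups the same way, nothing more, which matches the caveat in the text.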
I wonder if in future years we’re going to look at “species debates” in the context of human evolution like we look at counting angels on the head of a pin. Over at BBC News Clive Finlayson has a rambling opinion piece up, Has ‘one species’ idea been put to bed? Finlayson, the author of The Humans Who Went Extinct: Why Neanderthals Died Out and We Survived, doesn’t seem to have a tightly focused point at the end of it all (I think warranted, considering how unsettled this area is). But he does conclude:
And a major conference is planned for September next year when experts from all over the world will meet in Gibraltar to revise our ideas about “the human niche”. After decades of bad press we are finally getting round to humanizing the enigmatic Neanderthals.
1) Extraction of eggs is a major surgical affair. Extraction of sperm is not.
2) Males generally have many more sperm to contribute than females have eggs.
The latter issue made me go look for data on human females, by age. The paper A systematic review of tests predicting ovarian reserve and IVF outcome had what I was looking for. First, let’s review the cumulative distribution of fertility curves for women:
The Pith: Even traits where most of the variation you see around you is controlled by genes still exhibit a lot of variation within families. That’s why there are siblings of very different heights or intellectual aptitudes.
In a post below I played fast and loose with the term correlation and caused some confusion. Correlation is obviously a set of precise statistical terms, but it also has a colloquial connotation. Additionally, I regularly talk about heritability. Heritability is in short the proportion of phenotypic variance which can be explained by genetic variance. In other words, if heritability is ~1 almost all the variation in the trait is due to variation in genes, while if heritability is ~0 almost none of it is. Correlation and heritability of traits across generations are obviously related, but they’re not the same.
This post is to clarify a few of these confusions, and sharpen some intuitions. Or perhaps more accurately, banish them.
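A rough numerical sketch may help with the intuition that high heritability still leaves plenty of variation within families. It rests on assumptions that are not in the post: a purely additive genetic model, random mating, and no shared family environment. Under those assumptions siblings share half of the additive genetic variance, so the other half still segregates within the family. The function name is hypothetical.

```python
from math import sqrt


def sibling_expectations(h2, phenotypic_sd=1.0):
    """Expected sibling resemblance under a purely additive model.

    Assumptions (not from the post): random mating, no shared family
    environment, no dominance or epistasis.
    """
    if not 0.0 <= h2 <= 1.0:
        raise ValueError("heritability must be between 0 and 1")
    total_var = phenotypic_sd ** 2
    additive_var = h2 * total_var            # variance explained by genes
    environment_var = total_var - additive_var
    # Half of the additive variance is between families, half is within.
    within_family_var = 0.5 * additive_var + environment_var
    return {
        "sibling_correlation": 0.5 * h2,
        "within_family_variance": within_family_var,
        "sd_of_sibling_difference": sqrt(2.0 * within_family_var),
    }


for h2 in (0.0, 0.5, 0.8, 1.0):
    stats = sibling_expectations(h2)
    print(
        f"h2={h2:.1f}  sibling r={stats['sibling_correlation']:.2f}  "
        f"SD of sibling difference={stats['sd_of_sibling_difference']:.2f}"
    )
```

Even at a heritability of 1 this model leaves half of the population variance inside families, which is why siblings can differ a lot on strongly heritable traits.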
Mike the Mad Biologist has a post up, A Modest Proposal: Alabama Whites Are Genetically Inferior to Massachusetts Whites (FOR REALZ!). The post is obviously tongue-in-cheek, but it’s actually an interesting question: what’s the difference between whites in various regions of the United States? I’ve looked at this before, but I thought I’d revisit it for new readers.
First, I use the General Social Survey. Second, I use the WORDSUM variable, a 10 question vocabulary test which has a correlation of 0.70 with general intelligence. My curiosity is about differences across white ethnic groups by region. To do this I use the ETHNIC variable, which asks respondents where their ancestors came from by nation. I omitted some nations because of small sample size, and amalgamated others.
Here are my amalgamations:
German = Austria, Germany, Switzerland
French = French Canada, France
Eastern Europe = Lithuania, Poland, Hungary, Yugoslavia, Russia, Czechoslovakia (many were asked before 1992), Romania
Scandinavian = Denmark, Norway, Sweden, Finland (yes, I know that Finland is not part of Scandinavia, Jaakkeli!)
British = England, Wales, Scotland
Northeast = New England, Middle Atlantic
Midwest = E North Central, W North Central
South = W S Central, E S Central, South Atlantic
West = Pacific, Mountain
The key method I used is to look for mean vocabulary test scores by ethnicity and region. I also later broke down some of these ethnic groups by religion. Finally, all bar plots have 95 percent confidence intervals. This should give you a sense of the sample sizes for each combination.
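For readers who want to reproduce the recoding, here is a minimal Python sketch of the amalgamations listed above, plus a mean with a 95 percent confidence interval. The lookup tables mirror the list; the confidence interval uses a normal approximation, which is an assumption since the post does not say how the intervals were computed, and the scores at the bottom are invented placeholders rather than GSS data.

```python
from math import sqrt
from statistics import mean, stdev

# Amalgamated categories, as listed in the post.
ETHNIC_GROUPS = {
    "German": ["Austria", "Germany", "Switzerland"],
    "French": ["French Canada", "France"],
    "Eastern Europe": ["Lithuania", "Poland", "Hungary", "Yugoslavia",
                       "Russia", "Czechoslovakia", "Romania"],
    "Scandinavian": ["Denmark", "Norway", "Sweden", "Finland"],
    "British": ["England", "Wales", "Scotland"],
}
REGION_GROUPS = {
    "Northeast": ["New England", "Middle Atlantic"],
    "Midwest": ["E North Central", "W North Central"],
    "South": ["W S Central", "E S Central", "South Atlantic"],
    "West": ["Pacific", "Mountain"],
}


def invert(groups):
    """Turn {label: [members]} into {member: label} for quick recoding."""
    return {member: label for label, members in groups.items() for member in members}


ETHNIC_LOOKUP = invert(ETHNIC_GROUPS)
REGION_LOOKUP = invert(REGION_GROUPS)


def mean_with_ci(scores, z=1.96):
    """Mean and normal-approximation 95 percent confidence interval."""
    m = mean(scores)
    half_width = z * stdev(scores) / sqrt(len(scores))
    return m, m - half_width, m + half_width


# Invented WORDSUM scores for one ethnicity-by-region cell.
toy_scores = [5, 6, 7, 6, 8, 4, 6, 7]
m, low, high = mean_with_ci(toy_scores)
print(ETHNIC_LOOKUP["Austria"], REGION_LOOKUP["Mountain"])
print(f"mean {m:.2f}, 95% CI {low:.2f} to {high:.2f}")
```

Because the interval width shrinks with the square root of the cell size, wide error bars on a plot are a direct signal of a small sample for that combination.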
First let’s break it down by race/ethnicity and compare it by region to get a reference:
In earlier discussions I’ve been skeptical of the idea of “designer babies” for many traits which we may find of interest in terms of selection. For example, intelligence and height. Why? Because variation on these traits seems highly polygenic and widely distributed across the genome. Unlike cystic fibrosis (Mendelian recessive) or blue eye color (quasi-Mendelian recessive) you can’t just focus on one genomic region and then make a prediction about phenotype with a high degree of certainty. Rather, you need to know thousands and thousands of genetic variants, and we just don’t know them.
But I just realized one way that genomics might make it a little easier even without this specific information.
I’ve mentioned this before, but I thought I’d pass on the latest report on MaterniT21, the prenatal noninvasive Down Syndrome test. Currently it has a $235 copay for women with insurance. As of now only a few percent of the ~5 million pregnancies in the USA are subject to amnio or c.v.s. This procedure may result in the screened proportion going from ~1 percent to ~50 or more percent (though the firm that is providing this can only process ~100,000 tests per year as of now). I stumbled upon this after doing a follow up on my post, Would you have your fetus genetically tested? Interestingly the proportions who would get tested don’t differ that much between demographics.
And the outcomes can sometimes surprise. A piece in the Columbus Dispatch tells the story of a couple who kept their daughter, who tested positive for Down Syndrome. They had originally decided that if the tests came back positive they would terminate. In contrast, the nurses relate that one couple who were strongly anti-abortion at the beginning of the process seems to have terminated. Right now 1 in 700 pregnancies result in Down Syndrome.
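The figures quoted in this post imply a large gap between current test capacity and the screening levels being floated. A back-of-the-envelope sketch in Python, using only the round numbers given above (the variable names are mine):

```python
# Round figures quoted in the post.
PREGNANCIES_PER_YEAR = 5_000_000    # "~5 million pregnancies in the USA"
TEST_CAPACITY_PER_YEAR = 100_000    # "~100,000 tests per year"
DOWN_SYNDROME_RATE = 1 / 700        # "1 in 700 pregnancies"

# Share of pregnancies the firm could cover at current capacity.
capacity_share = TEST_CAPACITY_PER_YEAR / PREGNANCIES_PER_YEAR

# Tests needed per year to screen half of all pregnancies.
tests_needed_for_half = 0.5 * PREGNANCIES_PER_YEAR
scale_up_factor = tests_needed_for_half / TEST_CAPACITY_PER_YEAR

# Expected Down Syndrome pregnancies per year at the quoted rate.
expected_cases = PREGNANCIES_PER_YEAR * DOWN_SYNDROME_RATE

print(f"current capacity covers {capacity_share:.0%} of pregnancies")
print(f"screening 50% needs {tests_needed_for_half:,.0f} tests, "
      f"{scale_up_factor:.0f}x current capacity")
print(f"expected cases per year: about {expected_cases:,.0f}")
```

These are order-of-magnitude numbers only, since every input is itself an approximation.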
With all the talk about Basques I decided to do my own analysis with Admixture. Dienekes gave me a copy of his IBS file, which has all the 1000 Genomes Spanish samples, including Basques. I merged it with the HGDP sample, which has French Basques (just “Basques” in the plots below) and French non-Basques. I pruned most of the populations, but kept the Mozabites, which are a Berber group from Algeria. The number of markers was ~350,000, and I ran it up to K = 8, or 8 component populations. I stopped there because the components were starting to break up in a very choppy manner.
In general I do think that the idea that non-Basque Spaniards have Moorish genetic input seems supported. It isn’t definitive though. And you have to be careful, there are lower parameter values where Sardinians seem to have an affinity with Mozabites to a great extent, even more than Spaniards. But that disappears as you move up the number of K’s. But who is to say which K is the correct K? The consistent Sub-Saharan African component among non-Basque Spaniards (also evident in the Behar et al. data set) probably convinces me that there was a Moorish impact, since this ancestry is likely to have come with the Islamic conquest, and not with the Phoenicians.
All the files from the Admixture run (and csv files with tabular results) are here.
As I’ve noted in this space before many of my “web friends” and readers are confused why I call myself “conservative.” This is actually an issue in “real life” as well, though I’m not going to get into that because I’m a believer in semi-separation of the worlds. I’ll be giving a full account of my political beliefs at the Moving Secularism Forward conference. A quick answer is that I’m very open to voting for Republicans, and have done so in the recent past. And, my lean toward Mitt Romney* in the current cycle is probably obvious to “close readers.” But I’m not a very “political person” in the final accounting when it comes to any given election. I didn’t have a very strong reaction to the “wave” elections of 2006, 2008, and 2010, except that I was hopeful but skeptical that Democrats would actually follow through on their anti-war rhetoric (I’m an isolationist on foreign policy).
Rather, my conservatism, or perhaps more accurately anti-Left-liberal stance, plays out on a broader philosophical and historical canvas. I reject the very terms of much of Left-liberal discourse in the United States. I use the term “discourse” because for some reason the academic term has replaced the more informal “discussion” in non-scholarly forums. And that’s part of the problem. I am thinking of this because of a post by Nandalal Rasiah at Brown Pundits commenting on a piece over at Slate, Responding to Egregious Attack on Female Protester, Egyptian Women Fight Back. Whether conventional or counter-intuitive Slate is a good gauge of “smart” Left-liberal non-academic public thought. Nandalal highlights this section:
This is probably old news to you, and I’ve read about Digg’s problems in the tech media, but I just realized how much reddit has eclipsed Digg in referral traffic. I’ve always gotten way more attention from reddit (some science bloggers have told me that reddit readers are a “smarter set”), but when I did get Digg bumps they were often of greater magnitude. No more. Not only are referrals from Digg much more rare than they used to be, but they aren’t as significant as those from reddit.
So of course I checked out Google Trends:
I flog R. A. Fisher’s The Genetical Theory of Natural Selection a fair amount on this site. You don’t need to understand everything in the book, nor do you have to agree with everything in it, but it is a great point of departure toward understanding evolutionary genetics. I’ve noted that you can get it free in PDF format. But if you want to browse it online in an easier format, here you go:
The Pith: The purported sons of great men often are really the sons of great men. Another case of “Conan was right”.
Dienekes points me to a neat new paper, Present Y chromosomes reveal the ancestry of Emperor CAO Cao of 1800 years ago, which attempts to validate the claims to descent from a particular ancestor by a set of Chinese clans. The Chinese clan system is based on direct paternal descent by and large (and there has been a history of aversion to adoption from outside the kin group), and so aligns perfectly with the phylogeny of Y chromosomes, which are passed from father to son to son to son. That’s the ideal. What’s the reality? People are adopted. Or, some sons of a purported father are actually not the biological sons of that father. And finally, there are cases where individuals may fabricate ancestry and interject themselves into a lineage group through deception.
The individual in this case flourished 72 generations ago. Additionally, there is some controversy as to the relationship of this individual to others of their lineage. Here is the important section from the paper:
Here, we typed 100 Y chromosome single-nucleotide polymorphisms (Supplementary Table 1) as listed in the latest Y-chromosome phylogenetic tree…on 280 individuals of 79 Cao clans or clan clusters from different locations throughout the China…and 446 individuals of different clans with other surnames. A clan cluster may consist of several simplex clans if they carry different Y-chromosome haplotypes. Thus, we studied overall 111 simplex clans of CAO (Supplementary Table 2). According to their stemma records, 15 of the CAO clans claimed to be descendants of Emperor CAO. These 15 clans distributed in different provinces and never knew the existence of each other. Their Y chromosomes comprise six haplotypes…Only one of these six haplotypes can be Emperor CAO’s type. The other haplotypes found in the claimed clans might be introduced by other sources such as adoption, acceding to mother’s surname, nonpaternities, and so on…Here we need to recognize the most probable Emperor CAO’s haplotype by examining the haplotype distribution among the clan groups.
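To get a feel for why several haplotypes among the claimed descendants is not surprising, consider how quickly small per-generation risks compound over 72 generations. The sketch below assumes a constant, independent chance per generation that the paternal line is broken (adoption, non-paternity, and so on). The rates tried are hypothetical values chosen for illustration, not estimates from the paper.

```python
def unbroken_line_probability(break_rate, generations):
    """Chance that a father-to-son line has no break (adoption,
    non-paternity, etc.), assuming an independent, constant
    break rate in every generation."""
    if not 0.0 <= break_rate <= 1.0:
        raise ValueError("break_rate must be between 0 and 1")
    return (1.0 - break_rate) ** generations


GENERATIONS = 72   # figure given in the post
YEARS = 1800       # from the paper's title
years_per_generation = YEARS / GENERATIONS

print(f"implied generation length: {years_per_generation:.0f} years")
for rate in (0.005, 0.01, 0.02):  # hypothetical per-generation break rates
    p = unbroken_line_probability(rate, GENERATIONS)
    print(f"break rate {rate:.1%}: {p:.0%} chance the line is intact")
```

Under this simple model even a 1 percent break rate per generation leaves only about an even chance that a given lineage still carries the founder’s Y chromosome.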
There’s a variable in the GSS, GENESELF, which asks:
Today, tests are being developed that make it possible to detect serious genetic defects before a baby is born. But so far, it is impossible either to treat or to correct most of them. If (you/your partner) were pregnant, would you want (her) to have a test to find out if the baby has any serious genetic defects?
This is relevant today especially. First, the technology is getting better and better. Second, couples are waiting longer to start families. Unfortunately this question was only asked in 1990, 1996, and 2004. But on the positive side the sample sizes were large.
I decided to combine 1990 and 1996 into one class. Also, I combined those who were very liberal with liberals, and did the same for conservatives. For political party ideology I lumped strong to weak identifiers. For intelligence I used WORDSUM. 0-4 were “dull,” 5-7 “average,” and 8-10 “smart.” For some variables there weren’t results for the 1990s.
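The recodes described above are simple enough to write down explicitly. This is a sketch in Python with hypothetical function names. The WORDSUM cutoffs and year classes come from the text; the ideology labels are placeholders of mine, since the GSS stores these as numeric codes that I have not reproduced here.

```python
def wordsum_class(score):
    """Bin a 0-10 WORDSUM score into the three classes used in the post."""
    if not 0 <= score <= 10:
        raise ValueError("WORDSUM scores run from 0 to 10")
    if score <= 4:
        return "dull"
    if score <= 7:
        return "average"
    return "smart"


def year_class(year):
    """Combine the 1990 and 1996 surveys into one class; keep 2004 separate."""
    if year in (1990, 1996):
        return "1990s"
    if year == 2004:
        return "2004"
    raise ValueError("GENESELF was only asked in 1990, 1996, and 2004")


# Placeholder labels; "moderate" is my assumption for the middle category.
IDEOLOGY_COLLAPSE = {
    "very liberal": "liberal",
    "liberal": "liberal",
    "moderate": "moderate",
    "conservative": "conservative",
    "very conservative": "conservative",
}

for score in (3, 6, 9):
    print(score, wordsum_class(score))
print(year_class(1996), IDEOLOGY_COLLAPSE["very liberal"])
```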
The biggest surprise for me is that there wasn’t much difference between the 1990s and 2004. The second biggest surprise was that the differences between demographics were somewhat smaller than I’d expected, and often nonexistent. Below is a barplot and table with the results.