Sample size, schample size

By Razib Khan | November 15, 2011 4:15 pm

Ed Yong has a post up on a behavior genetic publication where the sample size is 23. The researchers report a correlation between a SNP on the OXTR locus and “prosociality.” To make a long story short, the sample size suggested to Dr. Daniel MacArthur and Dr. Joseph Pickrell that this was a spurious correlation. The bigger issue here is that there are functional reasons to assume that some genes are correlated with normal human variation in psychology and behavior, and there is a robust body of literature showing that these traits are heritable (trait value is highly predictive across relatives), but the results associating a particular genetic marker with a given trait are much less robust.

But I immediately realized something interesting: a sample size of 23 may be small, but there is potentially a sample size of thousands! I know my genotype at this SNP from 23andMe. How about some 23andMe customers get together and produce some results, and then get published in PNAS? A sample size of 230 would be easy, I think, and you could probably push it much closer to 1,000.
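To put rough numbers on why 23 versus 230 versus 1,000 matters, here is a minimal sketch using the standard Fisher z approximation for a correlation test. The function names, the 0.05 significance level, and the "true" correlation of 0.1 are all my own illustrative assumptions; none of them come from the paper, and this is not the analysis the researchers ran.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def min_significant_r(n, alpha=0.05):
    """Smallest |sample correlation| that reaches two-sided significance.

    Uses the Fisher z approximation: atanh(r) * sqrt(n - 3) is roughly
    standard normal when the true correlation is zero.
    """
    if n <= 3:
        raise ValueError("need n > 3 for the Fisher z approximation")
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    return math.tanh(z_crit / math.sqrt(n - 3))


def correlation_power(true_r, n, alpha=0.05):
    """Approximate chance of a significant result if the true correlation is true_r."""
    if n <= 3:
        raise ValueError("need n > 3 for the Fisher z approximation")
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = math.atanh(true_r) * math.sqrt(n - 3)
    upper_tail = 1 - _STD_NORMAL.cdf(z_crit - shift)
    lower_tail = _STD_NORMAL.cdf(-z_crit - shift)
    return upper_tail + lower_tail


if __name__ == "__main__":
    assumed_true_r = 0.1  # hypothetical modest effect, chosen only for illustration
    for n in (23, 230, 1000):
        print(
            f"n={n:5d}  min significant |r|={min_significant_r(n):.3f}  "
            f"power at r={assumed_true_r}: {correlation_power(assumed_true_r, n):.2f}"
        )
```

Under these assumptions, a sample of 23 only reaches significance when the observed correlation is around 0.4, so a modest real effect will usually be missed, and any effect that does get reported at that size will tend to look much stronger than it is. At 1,000 the same test has a good chance of picking up a true correlation of 0.1.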

  • http://blogs.scientificamerican.com/thoughtful-animal Jason G. Goldman

    I think you’re onto something here… all we’d need to do is set up a website with the video stimuli along with inputs for participants to respond to the questions.

  • http://johnhawks.net/weblog John Hawks

    Well, if we’re going to have thousands of people we may as well add in all the neurotransmitter receptors, too…

  • T. Kosmatka

    AG at the OXTR locus.

  • http://blogs.discovermagazine.com/gnxp Razib Khan

    #1, kind of what i was thinking.

  • S.J. Esposito

    This is what personal genomics is — or perhaps should be — all about. I’m really interested to see how this turns out…

  • I_Affe

    I’d participate. I already sent you my 23andme sequence info.

  • http://chimerasthebooks.blogspot.com/ EEGiorgi

    Frankly, I’d be more curious to look at the crystal structure of the receptor and see how the A versus G changes it. Suppose it turns out that the mutant allele affects the way the receptor binds to the hormone — now that’s an interesting finding! Has anybody looked into it? (couldn’t find anything with a quick search)

  • http://blogs.discovermagazine.com/notrocketscience/ Ed Yong

    A nice idea in general, but it wouldn’t work with this study. The genotypes refer to the people *in the videos* not the viewers, so you’d have to shoot a new video for each volunteer, who would be aware of the study’s objectives.

  • D. Rieder

    I would be careful about systematic bias based on who was willing to volunteer. Revealing part of your personal genotype to contribute to a crowd sourced project seems likely to have at least some correlation to the “prosociality” they are using, and that could really mess up the statistics. I’m not sure how to go about correcting for something like that – any ideas?

  • http://blogs.discovermagazine.com/gnxp Razib Khan

    could ed yong be less prosocial? :-)

  • DK

    @EEGiorgi: that SNP is in an intron, so it does not affect the protein sequence.

  • Åse

    I think it sounds incredibly interesting, though. Sure, there will be “willing to volunteer” bias, but most likely no more than in most psych studies (WEIRDs, anyone?).

    It would take a bit more work, but with phone cameras, and YouTube, and all those things…

    I don’t know what I am on that line (yet – because I haven’t looked it up). But could 23andMe try to recruit members by asking people to volunteer short films? In some cases several family members may be sequenced (or so I have heard – I am getting kits for mine as soon as I have money). Then set up an online center for looking things up, like what they do for the IAT.

    I think it could be very cool! (And, hey, it kinda is within my research area, as I do emotion display and such).

    And, aware of study objectives. Yes. But after a while you stop being aware in that way. I went through Pennebaker’s “The Secret Life of Pronouns” (quick and fun read), where whatever cues you spill in your language are a readout that is very hard to alter (and alteration does not feed back).

    Non-verbal behavior seems to be more bidirectional that way. I think it is not impossible. Would take quite a bit of work, but would be very interesting.

  • http://chimerasthebooks.blogspot.com/ EEGiorgi

    I believe it’s the same SNP (SNP rs53576) that came up in this study (http://www.pnas.org/content/108/37/15118). People have been looking for all sorts of associations since it’s been found, but sample size aside, the real problem is that behaviors are so hard to quantify. That’s why I think it would be a better idea to tackle it from a molecular biology point of view. If the SNP is silent there must be something else going on (maybe an epigenetic change? maybe this snp is in linkage disequilibrium with the snp that’s truly driving these associations?) OR all these studies (and not just this one) are really looking for a ghost and all there really is underlying here is the fact that whenever you take a genetic sample you find associations by chance just because phylogenetically speaking we are “young” and all related even when we don’t know we are related.

    Genetic studies are far from perfect.

  • http://washparkprophet.blogspot.com ohwilleke

    How do you deal with the sample bias issue? A well drawn sample of 23 is better than a non-random sample of 230.

    More ambitious, but well within the range of 23andMe management, would be to offer 23andMe participants the chance to include themselves in a large-scale, multi-factor study that would feature a long-form survey on many, many issues, perhaps with quarterly or annual update questionnaires, from which samples would be drawn for individual trait studies. That way, user interest in a particular topic wouldn’t skew the sample as much. A string of scholarly journal publications with 23andMe in the credits would also be great for corporate PR and credibility.

    If you could get a generalized research study participant pool of say 1500, from which random samples of 300 to 500 were drawn for particular studies, you’d get sample size without the skew of a single appeal, and if it were company (or company affiliated non-profit) based, you’d also escape the heavy internet users v. light internet user skews.

  • pconroy

    Results:
    GG Daughter
    GG Mother
    AG Father
    GG Me – pconroy
    GG Wife
    GG Wife’s Father
    AG Wife’s Maternal Grandmother
    GG Wife’s Mother

    I’d describe myself as an anti-social extrovert – or should that be a pro-social introvert?!

  • DK

    @EEGiorgi: If the SNP is silent there must be something else going on

    Of course it would be very interesting to know what it does physically. Unfortunately, there can be many things going on. OXTR expression level can be affected by influencing the rate of transcription (local structure can have an effect), or it can be at the level of mRNA splicing, or immature mRNA stability. Or it can be part of a regulatory element acting in trans on something other than OXTR. Or it can simply be linked to something else that went unnoticed before. All these things can in theory be tested, but before anyone is willing to do that, a study with N much more than 23 has to show that this SNP is more interesting than millions of others.

  • Dr. Daniel MacArthur

    More ambitious, but well within the range of 23andMe management, would be to offer 23andMe participants the chance to include themselves in a large-scale, multi-factor study that would feature a long-form survey on many, many issues, perhaps with quarterly or annual update questionnaires, from which samples would be drawn for individual trait studies. That way, user interest in a particular topic wouldn’t skew the sample as much. A string of scholarly journal publications with 23andMe in the credits would also be great for corporate PR and credibility.

    They’ve already done this. Over 60,000 of their customers have answered research surveys. They have two solid PLoS Genetics papers out already, including one reporting two brand new genetic regions associated with Parkinson’s, and several more on the way.

  • http://chimerasthebooks.blogspot.com/ EEGiorgi

    DK, thanks, I agree. But that’s also why I don’t agree with everybody else here who’s booing the study just because of the sample size. I confess I haven’t read this paper, only the one for which I provided the link above, but in general I try to be open-minded about these things: maybe there’s nothing (that’s the fun part of statistics, you can find things just by pure chance; this mutant allele could be there just because of genetic drift and the association could be completely spurious) and maybe they just hit the tip of the iceberg. Either way, it’s important to know and studies like this often pave the way to bigger studies.

    I’m a statistician, nothing makes me happier than a large sample size. But I also understand all the constraints and loopholes lab people go through to get genetic data.

  • Little bit

    AG mom-in-law
    AA her son/my husband
    AA my son (dx’d ASD)
    AA my daughter (not ASD)
    AA me
    AA my mom
    AG her father/my grandpa

    A whole lot of A’s, for us….I’ve looked into it, believe it or not rs53576 A is not statistically higher in ASDer’s or their parents (even though it is in my group). It’s very high among Asians, and appears to be modulated by gender: whatever association it has, females seem to be protected.

    I really wish 23andMe would allow a SNP search function, without revealing “who.” They are sitting on a wealth of data that puts all other databases to shame!

Gene Expression

This blog is about evolution, genetics, genomics and their interstices. Please beware that comments are aggressively moderated. Uncivil or churlish comments will likely get you banned immediately, so make any contribution count!

About Razib Khan

I have degrees in biology and biochemistry, a passion for genetics, history, and philosophy, and shrimp is my favorite food. In relation to nationality I'm an American Northwesterner, in politics I'm a reactionary, and as for religion I have none (I'm an atheist). If you want to know more, see the links at http://www.razib.com
