A friend recently emailed me about human genetic data sets. Some, like POPRES, are restricted to researchers, but a lot of data is available to the public. Zack Ajmal has posted on most of these data sets at some point. Feel free to post links to others in the comments.
A few months ago I purchased a decent desktop just to run ADMIXTURE and other packages for analyzing genomic data. More recently I set up a ~100 GB Dropbox account, and have started to “push” all of my output files from ADMIXTURE, PLINK, etc., as well as various scripts (Perl, shell, R, etc.), into the public folder. More precisely, as I type this a script is running ADMIXTURE and moving the files into the appropriate Dropbox folders, and Dropbox syncs them with the online folders. I’m doing this for two reasons.
First, I want to make the data-generation pipeline easier for myself. Instead of running ADMIXTURE and then laboriously processing the files with R to generate plots, I’ve now created a system where a few automated scripts begin ADMIXTURE runs, and another script creates the input files for distruct, runs distruct, trims the output images, and converts them into PNGs. This should allow me to resurrect my side projects, even while I’m rather busy with the “main events” of my life.
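For readers who want a sense of what the first stage of such a pipeline could look like, here is a minimal sketch in Python (not the Perl, shell, or R scripts mentioned above). It assumes ADMIXTURE is on the PATH, is invoked as `admixture -jN input.bed K`, and writes `<prefix>.<K>.Q` and `<prefix>.<K>.P` into the working directory. The function names, the per-K folder layout, and the thread count are hypothetical choices for illustration. The distruct and image-conversion steps are left out because they depend on parameter files not shown here.

```python
from pathlib import Path
import shutil
import subprocess


def admixture_command(bed_file, k, threads=4):
    """Build the ADMIXTURE command line for one value of K."""
    return ["admixture", f"-j{threads}", str(bed_file), str(k)]


def expected_outputs(bed_file, k):
    """Names of the files ADMIXTURE is assumed to write into the working directory."""
    prefix = Path(bed_file).stem  # e.g. "data/hapmap3.bed" -> "hapmap3"
    return [Path(f"{prefix}.{k}.Q"), Path(f"{prefix}.{k}.P")]


def publish(files, dest_dir):
    """Move finished output files into a (hypothetical) synced folder."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    moved = []
    for f in files:
        target = dest / Path(f).name
        shutil.move(str(f), str(target))
        moved.append(target)
    return moved


def run_pipeline(bed_file, k_values, dest_dir, threads=4):
    """Run ADMIXTURE once per K and file each run's output under dest_dir/K<k>."""
    for k in k_values:
        # check=True stops the loop if ADMIXTURE exits with an error
        subprocess.run(admixture_command(bed_file, k, threads), check=True)
        publish(expected_outputs(bed_file, k), Path(dest_dir) / f"K{k}")
```

A call such as `run_pipeline("hapmap3.bed", range(2, 6), "Dropbox/Public/admixture")` would then leave one folder per K in the synced directory; the file name and destination path here are placeholders.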
Second, I am beginning to feel that the promise of the “genome blogging revolution” has somewhat faded. Granted, there’s only so much you can do with the same data sets, so I’m going to try to put together large pedigree files in my Dropbox account. But it seems like people need more of a push. Toward that end, I hope that distributing scripts which make the process more “turnkey” will spur people on.
Addendum: I know that some of the first paragraph is going to be gibberish to some readers. But I hope you’ll appreciate the outcomes of that gibberish!
Call to Participate in a New Study on Social Networking and Personal Genomics:
Do you share your information with others? How has your personal genetic information influenced your lifestyle and the way you approach your health and medical decisions? Can genetic information create new communities and connections?
…
The Social Networking and Personal Genomics Study at the Center for Biomedical Ethics invites participants between the ages of 18 and 75 to spend approximately 2 hours with us in a focus group setting. Participants must have purchased direct-to-consumer personal genetic information from 23andMe, Inc., shared their information with others, and be willing to discuss their perspectives and experiences. Focus group members will receive a $50 gift card for their participation, and childcare will be available on an as-needed basis at no cost. For additional information or to enroll, please contact Simone Vernez, Project Manager, by email at svernez@stanford.edu or by telephone at (650) 723-9364. For more information on the study itself, including specific research aims and funding, please visit http://bioethics.stanford.edu/research/SocialNetworkingandPersonalGenomics.html. For general information about participant rights, contact 1-866-680-2906.