How many human genomes have been sequenced?

By Razib Khan | November 7, 2011 8:49 pm

That query doesn’t seem to have an easy answer on Google, so I’m putting one here. A prominent genomicist gave a ballpark figure of ~30,000 human genomes sequenced as of 2011, most of them during 2011 itself. As for the “$1,000 genome” question, it seems that some labs can get human genomes for $4,000 each if they buy in bulk (50 at a time). The price point can apparently go even lower, though no number was divulged. This obviously isn’t going to be the “retail” price point, but we’re probably at most an order of magnitude away from the $1,000 genome as of now. We were also told that the $100 genome, in today’s dollars, is likely in 2020.

  • Kevin

    Zero complete genomes have been completely sequenced. Open a textbook or two in molecular biology.

  • omar

    Kevin, do you mean that some parts of the genome have not yet been sequenced in anyone, or that some parts get missed in every individual effort? Do you have a philosophical objection to calling such “near-complete” genomes “completely sequenced”? Or do you have some other, more substantive disagreement with the claim that thousands of genomes have been sequenced to near-completeness?

  • josh

    All of the 30K genomes have been nearly completely sequenced. Textbooks won’t tell you about the GC-bias of Illumina or 454 methodologies. Aside from those missing pieces, the reality is that the sequences ARE generated, but they simply haven’t been accurately assembled. There’s always a pile of reads left over after reference mapping.

  • Cindy

    On the number of whole genomes that have been sequenced, and the number projected to be sequenced if the current trajectory holds, there’s a neat graph my company uses that can be found online at this website:

    The figures in this graph are consistent with the numbers discussed above in this blog post.

  • Randy

    I think it’s likely that “complete” genomes have been sequenced with high-coverage shotgun next-gen sequencing technologies. Unfortunately, not all of the data has been meaningfully assembled and annotated.

  • Razi Khaja

    @Kevin: First, textbooks are out of date in regard to the number of genomes sequenced.

    @Omar: Some parts of the human genome, particularly the heterochromatic regions (centromeres and telomeres), have not been fully sequenced or assembled in anyone, and they also get missed in every individual, because the DNA in these regions is highly repetitive and tightly packed. Another thing to consider is whether the assembled genome is diploid or a haploid mosaic. The human genome sequenced by the International Human Genome Sequencing Consortium is a haploid mosaic, and additionally a mosaic of over 200 individuals. In contrast to the IHGSC genome, the Venter genome is not a mosaic of several individuals. Additionally, the Venter genome claims to be diploid, but really it is a haploid mosaic with many large regions represented as alternative haplotypes. One cannot be certain that the Venter genome as assembled actually occurs in the cell. It is merely a working model to evaluate phenotype. Philosophically, genomes are nearly completely sequenced, but they are far from assembled correctly.

  • Razib Khan

    kevin’s comment: defensible, but douchey :-)

  • Larry Kedes

    No human genomes have been completely sequenced yet. More importantly, there is no metric for the quality of what has been sequenced. The Archon Genomics X PRIZE has developed a methodology to measure and judge the quality and completeness of a set of sequenced genomes. This validation protocol will soon be put to the test, including two genomes and open-source evaluation software that will be made available by early spring 2012.

  • Helen Pearson

    Nature did some reporting to assess, very roughly, how many human genomes had been sequenced, and we estimated in a story last year that more than 30,000 would have been sequenced by the end of 2011. I suspect this has been exceeded by now.

  • AMac

    The informatics startup Personalis has a graphic with their best guess, here. The cumulative estimate for the number of genomes sequenced through Dec. 2011 — a projection, obviously — is about 25,500.

    From my understanding, well over half of these sequences were performed by BGI, using Illumina’s HiSeq 2000. Complete Genomics, likely the #2 sequencer, will have reached a cumulative tally of between 4,000 and 4,400 by year’s end. This was implied by their 3rd quarter remarks on 7 November.

  • Neuro-conservative

    Interesting update on this from Dan MacArthur’s Twitter feed (no link, as I have no Twitter account). He asked BGI about their new public claim of 38,000 human genomes. It turns out that they are including exomes and low-pass genomes in that number. I suspect the number of high-depth, full genomes is a relatively small percentage of that total. I wonder if it would even be as big as the ~4,000 genomes from Complete.


