Genomic liftoff

By Razib Khan | July 13, 2010 2:27 am

The firm GenomeQuest has a blog, and on that blog they have a post, Implications of exponential growth of global whole genome sequencing capacity. In that post there are some bullet points with numbers. Here they are:

* 2001-2009: A Human Genome

* 2010: 1,000 Genomes – Learning the Ropes

* 2011: 50,000 Genomes – Clinical Flirtation

* 2012: 250,000 Genomes – Clinical Early Adoption

* 2013: 1 Million Genomes – Consumer Awareness

* 2014: 5 Million Genomes – Consumer Reality

* 2015-2020: 25 Million Genomes And Beyond – A Brave New World

Let’s transform these projections into charts.
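Before plotting, it helps to put the bullet points in tabular form. The following is only a rough sketch: the names `PROJECTIONS` and `fold_changes` are illustrative, and it assumes each year range in the list can be represented by its final year (2001-2009 as 2009, 2015-2020 as 2020), which GenomeQuest's post does not state.

```python
import math

# Projected genomes sequenced, keyed by year.
# Assumption: each range in the list is represented by its final year.
PROJECTIONS = {
    2009: 1,
    2010: 1_000,
    2011: 50_000,
    2012: 250_000,
    2013: 1_000_000,
    2014: 5_000_000,
    2020: 25_000_000,
}


def fold_changes(projections):
    """Return (year, genomes, fold increase over the previous entry) tuples."""
    rows = []
    previous = None
    for year in sorted(projections):
        genomes = projections[year]
        fold = None if previous is None else genomes / previous
        rows.append((year, genomes, fold))
        previous = genomes
    return rows


# Print a small table; the log10 column is what a log-scale chart would show.
for year, genomes, fold in fold_changes(PROJECTIONS):
    fold_text = "-" if fold is None else f"{fold:g}x"
    print(f"{year}  {genomes:>12,}  log10={math.log10(genomes):.1f}  {fold_text}")
```

Read this way, the projected step from one entry to the next shrinks from 1,000-fold to about 5-fold, which is why the curve looks far less dramatic on a log scale than on a linear one.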



Of course GenomeQuest sells analytics tools for the tsunami of genomic data which they see cresting. Though if only 25,000,000 people have had their whole genome sequenced by the year 2020, I’m not sure we’ll feel that it’s a “tsunami” of data at that point. I’m sure there would be plenty of stories about the “sequencing gap” between different communities, by class and race and what not. But what do you think about GenomeQuest’s projections?

  • Jon F

    This is basically a question of Moore’s Law and how long it will perpetuate. I don’t know if you’ve read it already, but last year Kevin Kelly wrote an extensive article on Moore’s Law and where it has historically failed already and where it hasn’t:

    So far as the technology goes, it seems to me as a non-economist that the growth should continue if there continues to be market demand for genome sequencing and it keeps getting cheaper. It seems to me that they’re predicting market saturation at around 25 million. As for whether they will surpass that by 2020 I couldn’t say, but whether they’ll meet it depends on sequencing getting cheaper and cheaper, which in turn depends on the underlying technology getting cheaper, which depends on 1) Moore’s Law continuing for the next 10 years for both processor bandwidth and hard disk storage space, assuming third-generation sequencing tech comes to fruition, and 2) the market for computer components not finally bottoming out in the next 10 years.

    Now, I think point 1 is more likely to hold true than we think, considering that predictions of Moore’s Law failing within 10 years have been made as far back as the 70s and it always seems to keep going. Point 2, though, is more of a variable, I think, since profit margins on PC components are razor-thin as it is in 2010. Charlie Stross wrote an article back in late April in which he predicted that market will completely bottom out by 2015, thanks in no small part to the walled-garden environment being constructed by Apple:

    He’s talking more about the effective death of open source, but I think it could also be applied here, in the sense that cheap sequencing requires inexpensive sequencing technology and so far, in 2010, the majority of lab equipment (that I’ve seen, at least) runs on Windows PCs, while Apple technology remains notably more expensive. I don’t think I need to cite any evidence for the latter.

    Would such a bottoming-out of PC component costs with respect to the log curve kill Moore’s Law? Probably. Would it in turn have a downstream effect resulting in a reduction of the effective market for whole genomes sequenced? A bit, I suspect, but I still have no clue how feasible that 25 million number is to begin with. I suspect it’s a rosy estimate and might largely hinge upon leveraging future discoveries based on deep genome analysis. In other words, if we can find real, reliable trends that can effectively help cut off future diseases at the pass, I think that a cost of several thousand dollars might still be feasible. If not, then for people like me whose family died from the same, boring, common illnesses (cancer and heart disease), it remains an expensive curiosity unlikely to be covered by insurance companies.

  • Richard Resnick

    As the author of the original blog post, I must point out that sequencing capacity is growing at a rate far faster than Moore’s Law (a doubling every 18-24 months).

    DNA sequencing throughput was roughly on Moore’s curve until about 2005. It is now increasing at approximately 10-fold per year. My original estimates in the GenomeQuest blog post purposefully assumed a slower growth rate to be pessimistic, although if it continues on its current curve things will happen even more quickly.
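To put the two rates side by side, here is a minimal sketch that converts a doubling time into a yearly growth factor and compares it with the 10-fold-per-year figure quoted above. The function and constant names are illustrative only, and the calculation assumes both rates stay constant, which is a simplification.

```python
def annual_factor(doubling_months):
    """Growth per year implied by one doubling every `doubling_months` months."""
    return 2 ** (12 / doubling_months)


def growth_over(years, factor_per_year):
    """Total growth after `years` at a constant yearly factor."""
    return factor_per_year ** years


MOORE_FAST = annual_factor(18)  # doubling every 18 months
MOORE_SLOW = annual_factor(24)  # doubling every 24 months
SEQUENCING = 10.0               # 10-fold per year, as quoted in the comment

for years in (1, 3, 5):
    print(
        f"{years} yr: Moore {growth_over(years, MOORE_SLOW):.1f}x"
        f" to {growth_over(years, MOORE_FAST):.1f}x,"
        f" sequencing {growth_over(years, SEQUENCING):,.0f}x"
    )
```

Under these assumptions, five years of Moore's Law at the faster 18-month doubling gives roughly the same gain as a single year of sequencing throughput growth.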

    There are now 21 companies either actively selling or developing technology to sequence DNA. Indeed, one exciting company is promising a technology in the next 18-24 months that can sequence an entire human genome at 40x coverage in 10 minutes for roughly $100. Will they do it and do it within this time frame? Maybe, or maybe not. But will it happen in the next decade? Definitely. With this many competitors there will be enormous innovation.

    At that price – $100 per genome – Moore’s Law doesn’t matter. Genome sequencing is a simple blood panel. And what if Moore’s Law (or a bigger exponent) continues on anyway?

    Now on the compute side you make a seriously important point. Can Moore’s Law keep up with this volume of data? Keep in mind that to sequence a genome using today’s technology you take all 3 billion bases of it, make about 30 or 40 copies of it (call it 100 billion base pairs now), cut it into roughly 1 billion little pieces of about 100 base pairs each, and then put the puzzle back together in a computer. It’s not just string matching either, because you have to be sensitive to mismatches, insertions, deletions, extra copies of genes, and larger-scale structural rearrangements. So it’s dynamic programming algorithms, run 1 billion times per genome. (Plus a lot of other stuff too – finding places where this genome doesn’t match a reference genome, and annotating those regions automatically with clinically actionable tags.)
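The arithmetic and the kind of dynamic programming described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not GenomeQuest's pipeline: `read_count` and `best_alignment_cost` are hypothetical names, the scoring is plain unit-cost edit distance, and real aligners index the reference first rather than scanning all of it for every read.

```python
def read_count(genome_bases, coverage, read_length):
    """Number of reads needed to cover the genome `coverage` times over."""
    return genome_bases * coverage // read_length


def best_alignment_cost(read, reference):
    """Fewest edits (mismatch, insertion, deletion) needed to place `read`
    somewhere inside `reference`, computed by dynamic programming."""
    # previous[j] is the cost of aligning the read prefix processed so far
    # so that it ends at reference position j. The first row is all zeros
    # because the read may start at any reference position for free.
    previous = [0] * (len(reference) + 1)
    for i, read_base in enumerate(read, start=1):
        # Aligning i read bases against nothing costs i edits.
        current = [i]
        for j, ref_base in enumerate(reference, start=1):
            substitution = previous[j - 1] + (read_base != ref_base)
            read_extra = previous[j] + 1     # base in read, absent from reference
            ref_extra = current[j - 1] + 1   # base in reference, absent from read
            current.append(min(substitution, read_extra, ref_extra))
        previous = current
    # The read may also end at any reference position.
    return min(previous)


print(read_count(3_000_000_000, 35, 100))           # reads at 35x coverage
print(best_alignment_cost("ACGT", "TTACGTTT"))      # exact match somewhere
print(best_alignment_cost("ACGT", "TTACCTTT"))      # one mismatch
```

The inner loop touches every (read base, reference base) pair, which is why running something like this a billion times per genome is a serious computation and why indexing and heuristics matter in practice.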

    This is not a small computation – indeed it’s GenomeQuest’s business, we know it intimately! But there is huge innovation in this space as well: Moore’s Law continues to advance the hardware; compute clusters can scale to arbitrary sizes; algorithmic software innovation makes the same problems vastly easier. (We just increased our genome analysis speed by 6x and have plans to increase another 6x in 2011, with software improvements alone, all in the setting of Moore’s Law and a growing data center.) And don’t forget, while today’s sequencing technology has us putting together 1 billion little puzzle pieces per genome, tomorrow’s technology will output much longer continuous stretches of DNA. In 2006, Zhang et al. demonstrated the capability to sequence 100 MB of continuous sequence, which is longer than about half of our chromosomes. So the computational requirements will be reduced – no more puzzle pieces.

    All forces seem to be conspiring for continued exponential growth of whole human genome sequencing capacity.

    As for whether whole genome sequencing can improve health care: it can indeed. The amount of knowledge we are generating by looking at wide-scale human variation is unlocking new understanding in the research sector of life sciences. Commercially, pharmaceutical companies are already being required to submit companion diagnostics to the FDA alongside new compounds. Clinically, oncologists are already looking at specific genetic variation to determine which drugs make the most sense in treatment. Take a look at Herceptin, Iressa, Camptosar, Gleevec.

    Insurers are willing to pay $1,000-2,000 for a genetic test for certain diseases today. Such a test covers only the screening of a few genes related to your particular form of cancer. So the above drugs each have a test, sometimes genetic, sometimes not, paid for by your local insurance company and approved by the FDA. But with your whole genome sequenced, you’ve done every genetic test needed, and future genetic tests are simply a query of a database. So whereas today the price of a whole genome is somewhere between $5,000 and $20,000 (depending on whom you ask), when the price falls to $1,000 in the very near future – and I expect within three years, maximum – insurers will save huge amounts of money on tests, essentially buying the test once and reusing it over the course of your lifetime.
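The break-even point behind that argument is simple arithmetic. The sketch below uses only the dollar figures quoted above; `break_even_tests` is an illustrative name, and the calculation assumes that querying a stored genome costs nothing extra, ignoring storage, analysis, and interpretation costs.

```python
import math


def break_even_tests(genome_price, price_per_test):
    """Smallest number of separate genetic tests whose combined cost
    meets or exceeds the one-time price of a whole genome."""
    return math.ceil(genome_price / price_per_test)


# Figures from the comment: tests cost $1,000-2,000 each; a whole genome
# costs $5,000-20,000 today and is expected to fall to $1,000.
for genome_price in (20_000, 5_000, 1_000):
    for test_price in (1_000, 2_000):
        n = break_even_tests(genome_price, test_price)
        print(f"genome ${genome_price:,} vs ${test_price:,}/test: breaks even at {n} test(s)")
```

At the quoted current prices a patient would need somewhere between 3 and 20 separate tests before a whole genome pays for itself, whereas at $1,000 the first test already covers it.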


  • linda seebach

    Has there been work done on compressing genome sequences?


