For years, medical researchers have been talking about the day when babies will have their whole genomes sequenced at birth, the day when genomic analysis will allow every patient to be treated based not just on her condition but on which treatment is the best match for her genetic quirks. There will be a day, they say, when we will all carry our genomes around on a thumb drive. But the hurdles, fiscal and otherwise, have proven difficult to overcome.
The DNA of one set of human chromosomes contains 3 billion base pairs, and most cells are diploid, with two sets of chromosomes, one from each parent. Sequencing these 6 billion base pairs, one pair at a time, is unquestionably faster and cheaper than it once was: Since human genome sequencing's less-than-humble beginnings almost 15 years ago, its cost has dropped from $100 million to around $1000. Instead of years, a genome can now be sequenced in a day or two.
Yet while that’s incredible progress, it’s not quite enough. Not only is sequencing still too pricey for everyday use, but once a genome has been sequenced it also has to be mapped and analyzed: the sequenced base pairs are assigned to the correct chromosome and assessed for mutations, something that can take a couple of days or more. What to do with the resulting data is another problem: The genome and its resulting analysis typically occupy about 400GB. (For reference, the 2013 laptop I’m using to write this post has a storage capacity of 250GB, so my genome wouldn’t come close to fitting on it.) Securely storing data from 500 or 5000 patients, at about $1 per gigabyte, typically costs hundreds of thousands to millions of dollars per year.
A Better Algorithm
Now, Dutch startup Genalice has created software that it says will decrease both analysis time and the size of the resulting data file by orders of magnitude. Last month, the company held a 24-hour live event to draw attention to its product. Genalice used the software to analyze genomes from 42 humans and, with time still to spare, went on to analyze 42 tomato plants. Average time per human genome: 25 minutes. Average file size: 4GB.
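To put those file sizes in perspective, here is a back-of-the-envelope sketch in Python using only the figures quoted in this post: about 400GB for a conventionally analyzed genome, about 4GB per genome at the Genalice event, and about $1 per gigabyte for secure storage. Treating that price as a yearly rate, and treating the two file sizes as directly comparable, are assumptions made for illustration; the function and constant names are made up.

```python
# Figures quoted in the article. The per-year pricing model and the
# like-for-like comparison of the two file sizes are assumptions.
COST_PER_GB_PER_YEAR = 1.00  # dollars ("about $1 per gigabyte")
CONVENTIONAL_GB = 400        # genome plus analysis, typical size
REPORTED_GB = 4              # average file size reported at the event


def yearly_storage_cost(patients: int, gb_per_genome: float) -> float:
    """Return the yearly storage cost in dollars for a group of patients."""
    return patients * gb_per_genome * COST_PER_GB_PER_YEAR


if __name__ == "__main__":
    for patients in (500, 5000):
        before = yearly_storage_cost(patients, CONVENTIONAL_GB)
        after = yearly_storage_cost(patients, REPORTED_GB)
        print(f"{patients} patients: ${before:,.0f} vs ${after:,.0f} per year")
    print(f"size reduction: {CONVENTIONAL_GB / REPORTED_GB:.0f}x")
```

Under those assumptions, 500 patients come to about $200,000 a year at 400GB each and about $2,000 a year at 4GB each, a hundredfold difference.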
“The idea to just have a card in your wallet that contains your whole genetic data is not possible with today’s technologies,” says Hardik Shah, a bioinformatics and data researcher at Mount Sinai Medical Center in New York who has seen Genalice’s software at work. Now, with the advances being made by Genalice and other players in the field, he says, “maybe it’s not so impossible.”
Most of the programs that analyze genomic sequences are based on code developed many years ago. And while that code has been updated to make it faster, it’s still cumbersome. Genalice engineers saw this, and saw how far computer hardware had come in the years since the genome was first analyzed. “We thought, ‘we’re not going to solve this big data problem by just twisting and tweaking those old algorithms. We have to start from scratch,’” says Jos Lunenberg, Genalice’s Chief Business Officer. With the experience of the company’s CEO, Hans Karten, who cut his teeth on big data sets during his 14 years at Oracle, Genalice did just that. The result, Lunenberg says, is not just a small step but a major leap forward in analysis speed.
Filter Out the Noise
Lunenberg isn’t revealing exactly how the company is able to do this, although he did tell me that some of the speed comes from the fact that humans share 99.9 percent of their genome. “So if you concentrate on the .1 percent,” he says, “you’re in good shape to get good reduction already. We leave out what’s not relevant.”
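Genalice has not disclosed its method, so the following is not the company's algorithm. It is only a toy Python sketch of the general idea Lunenberg describes: if almost all of a genome matches a shared reference, you can keep just the positions that differ. The function names and the short sequences are made up for illustration, and the sketch handles single-letter substitutions only, not insertions or deletions.

```python
from typing import Dict


def encode_against_reference(reference: str, sample: str) -> Dict[int, str]:
    """Record only the positions where the sample differs from the reference."""
    if len(reference) != len(sample):
        raise ValueError("this toy sketch handles substitutions only")
    return {i: s for i, (r, s) in enumerate(zip(reference, sample)) if r != s}


def decode_with_reference(reference: str, diffs: Dict[int, str]) -> str:
    """Rebuild the full sample from the reference plus the stored differences."""
    bases = list(reference)
    for position, base in diffs.items():
        bases[position] = base
    return "".join(bases)


if __name__ == "__main__":
    reference = "ACGTACGTACGTACGTACGT"  # made-up reference sequence
    sample = "ACGTACGAACGTACGTACGT"     # differs from the reference at index 7
    diffs = encode_against_reference(reference, sample)
    print(diffs)  # {7: 'A'}
    # The sample can be recovered exactly from the reference and the diffs.
    assert decode_with_reference(reference, diffs) == sample
```

Here a 20-letter sequence is represented by a single stored difference, which is why concentrating on the small fraction that varies can shrink the data so much.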
Genalice’s software not only reduces analysis time and storage, it also requires far less computing power, which means fewer computers to buy, maintain, and keep cool. “These people are completely out of the box and are trying to do it differently,” Shah says. “There are several companies out there right now trying to make a faster car. These guys want to make a completely different type of car, and they want to do it at much cheaper cost to client and themselves.”
Results in a Day
Not only does Lunenberg say Genalice’s software is faster, he also says it’s just as accurate as its predecessors, if not more so. Of course, the company must still demonstrate this, and that’s something that will take time and rigorous testing. Researchers are now putting Genalice’s product through its paces: Dutch agricultural company KeyGene has been using it on plant genomes, while groups at Oxford University and the Erasmus Medical Center in Rotterdam are applying it to human data, including cancer genomics. Shah’s group at Mount Sinai is next in line, and he says he’s looking forward to really digging in and seeing what the software can do.
If he can sequence a patient’s genome in 24 hours and use the Genalice software to analyze it, he can get disease susceptibility results, cancerous mutations, even potential best-treatment options back to a researcher, physician, or patient within just a day or so—a substantial improvement. “Without Genalice, it’s usually several days to several weeks,” he says. “We could really speed up the personalized medicine that everyone’s talking about.” And in fast-growing cancers, for instance, even just a few weeks can change the course of disease.
Shah then takes it a step further. Three to four years from now, he envisions faster sequencing allowing patients to have their results in minutes. “It could be common for you to walk into your physician for a general lookup and them just doing your whole genome along with a lipid profile,” he says.
Granted, five years ago, researchers were predicting we’d have that today. Genalice and its competitors may finally be pushing us toward that reality.