The Sixty-Million-Year Virus

By Carl Zimmer | March 13, 2006 6:16 pm

How do we know that we are kin to chimpanzees and howler monkeys and the other primates? For one thing, it’s by far the best explanation for the fossil record. For another, our DNA shows signs of kinship to other primates, much like the genetic markers that are shared by people from a particular ethnic group. There’s a third line of evidence that I find particularly fascinating: the viruses carried by humans and other apes.

Every day, viruses traffic in and out of human bodies. They invade people’s cells, make new copies of themselves, and then, if they’re lucky, infect a new host. Some viruses do this by stapling themselves into our DNA, so that the viral genes are read by our cells much as the cells read their own genes. In many cases, infected cells die as they manufacture hundreds of new viruses that burst out of them. But in some cases the viruses get stuck. They sit in the cell’s genome, and the cell goes on living. When the cell duplicates, it duplicates the virus DNA as well. The fact that the virus spares the cell is not necessarily a good thing. The virus may still be able to pop out of dormancy and wreak havoc. It may also trigger its host cell to duplicate like mad–giving rise to cancer. One in five cancers is associated with these viruses.

Now imagine what might happen if one of these viruses happened to infect an egg. The egg might well die. Or not. And if it started to divide (as a fertilized embryo), the virus would be passed down to all the daughter cells. In other words, a baby would be born with the virus throughout its body.

Too freaky, I know. Put it in a sci-fi script, and a movie producer will say, “Forget it, kid. That’s as crazy as the zombie cockroach story some lunatic came in here with this morning.” But it has happened, and many times over. Scientists can identify viruses lurking in our genome (known as endogenous retroviruses) by their distinctive DNA. A fully-functioning retrovirus sequence contains three genes–one for copying DNA, one for a shell, and one for escaping and invading cells. These genes are flanked by stretches of repeating DNA, which allow viruses to be inserted into or snipped out of their host’s genome. The human genome carries full-fledged retroviruses, as well as viruses in various states of decay. Scientists have identified 98,000 of these viruses, along with about 150,000 fragments of defunct viruses. All told, they make up 8 percent of the human genome. In many cases, the virus genes have disappeared altogether, leaving behind flanking repeats, which have been duplicated to millions of copies that take up about 40 percent of the genome. As a point of comparison, our “own” genes–in other words, those that encode proteins that make up our bodies and allow our bodies to live–make up only about one percent of the genome.

Some of these endogenous retroviruses are only found in some people and not others. They must have invaded someone’s genome and then spread to his or her descendants, but have not yet spread through our entire species. Others appear to be ubiquitous–meaning that they are ancient passengers that had already spread throughout an ancestral population.

Other vertebrates carry their own collections of endogenous retroviruses. Mice have a particularly lively collection that continues to spread through their genome with each generation. And you can trace their history through evolutionary time. Domesticated cats, for example, share many endogenous retroviruses with their wild cousins. But they also carry other endogenous retroviruses of their own. The same goes for pigs, and their wild boar cousins. As pigs and boars stopped interbreeding, they could no longer spread newly acquired retroviruses to future generations.

Now, if you really don’t enjoy reading about evidence that you are related to a chimpanzee, now’s the time to close your browser window. Because now I must write about the endogenous retroviruses in chimpanzees, macaques, and other primates. It turns out that most of the viruses we carry can also be found in these other species. Our retroviruses can be grouped into families. Other primates carry the same families. Our retroviruses usually appear in the same position in the genome, no matter whose genome you look at. Many of theirs are in the same place. These are all the sorts of evidence you’d expect if retroviruses had been carried down from distant primate ancestors. A particular retrovirus is not identical from one host primate to the next, but you wouldn’t expect that. Once each host lineage branched off, the viruses could acquire mutations. But the different versions of these retroviruses are still similar enough that scientists can reconstruct the DNA of the original virus that infected some long-gone primate.

Retroviruses appear to have invaded the primate genome in a series of waves, starting over 55 million years ago and continuing until just a couple million years ago. As a result, some of the retroviruses in our genome are found only in some primates and not others. It’s not completely random as to which primates share these retroviruses. In general, they are the same species that other studies have shown to be our closest relatives.

Once viruses get established in a genome, they can take any of a number of evolutionary paths. They may still be able to break out of their resident genome, become full-blown viruses, and invade another cell in the body. If they’ve lost the ability to become true viruses, their DNA can still get accidentally copied and inserted back into the genome. These copies may accidentally get swapped, producing drastic changes in their host’s genome. And most remarkable, sometimes genes from viruses become useful to their hosts. It appears that virus genes have become vital for the development of primate placentas, and for carrying out other essential tasks. While these genes retain distinctive sequences seen only in retroviruses, they show signs of having been preserved by natural selection, even as the viral genes that surround them have mutated into uselessness.

There’s one more use these viruses have to offer: they have preserved a precious record of our evolutionary history.

(For more information, see for a discussion of retroviruses and primate evolution from a few years ago.)

Comments (22)

  1. djlactin

    click the reconstruct link: astonishing work!

    let’s see the YECs explain this away! (or will they simply ignore it as they generally do?)
    this is the absolute death-knell for YEC, and certainly a mortal blow to creationism in general. the only POSSIBLE creationist response is to allow for evolution after creation, but where does this lead them?

    and “intelligent design”? don’t get me started!

  2. david maas

    Fascinating! It questions the idea of the sovereign organism and where to draw the line between self and other. I also find cooperative organisms like lichens fascinating for this reason.


  3. You wrote:
    “Retroviruses appear to have invaded the primate genome in a series of waves, starting over 55 million years ago and continuing until just a couple million years ago.”

    I wonder about this timeframe, is there an explanation for these numbers? Why these waves and why should this activity have ceased? What if HIV gets stuck?

  4. Not just “Language is a Virus”

    Carl Zimmer at “The Loom” has an interesting story: how viruses ...

  5. windy

    What’s the latest on the baboon ERV that was once used to support Asian ancestry in humans? (Because chimps and gorillas, but not humans, have the sequence.)

    Are there nowadays more known cases where the insert was lost by chance or doesn’t occur in one species inside a clade?

  6. luca

    In fact, the plot for such a movie already exists… Greg Bear already wrote a novel, Darwin’s Radio, where Endogenous RetroVirus turns out to play a role in some sort of ‘supervised’ evolution of humans… pity for the weak ending… may be Carl can come up with something better…

  7. Theo Godwyn

    The YEC answer to this is so simple. Rather than human eggs being infected by a virus 98,000 times, perhaps we are using reverse logic. The human DNA did not come from a virus. The viral genetic code (either RNA or DNA) originated in a human. Wouldn’t this be a much more reasonable conclusion? After all, a virus mutates very rapidly and frequently adopts genetic information from its host.

    If this is the case, genetic similarities between monkeys and humans are due to structural and biochemical similarities. The fact that a virus has those same similarities is only due to the fact that the virus has once infected a human or primate host.

    They say that the human genome contains 8% viral DNA. Human DNA has 3.5 billion base pairs. They are saying that 280 million of these base pairs come from viruses.

    Every time a virus mutates, it mutates because it has adopted host DNA. The mechanism for adopting host DNA is a lot more explainable than a mechanism wherein an egg infected by a virus would survive to pass on this new code. A cell will typically lyse when infected by a virus. Otherwise the virus has no means of reproduction.

  8. I agree that the ending for Darwin’s Radio was weak, but the second book, Darwin’s Children, kind of makes up for it.

    But, it’s not just retroviruses… it’s bacteria, fungi, you name it. I read somewhere last summer that humans should think of ourselves as a super-organism since we actually consist of more foreign cellular matter than human. There are something like 100 trillion bacteria on and in us whereas we consist of only a few trillion human cells. Add that to the virus load within our cells, and the question truly becomes, how human are we?

  9. luca

    Theo: well, being passed unaltered from that single cell and being duplicated into trillion of copies is kind of reproduction for the bacteria… no? and all this without having to lift a finger – pardon, a base-pair…

    Kirsten: I didn’t get Darwin’s Children ’cause I was disappointed by the first ending, the whole split-mouth/patchy-faces thing is really clumsy… but I’ll give it a shot if you say it’s worth it…

    As for our ‘super-organismicity’, could you point me out to the exact reference? it really interests me…

  10. Theo, if that is the case then why are so many of the viruses in our genomes useless to us? If they originated in our genomes because a designer put them there, then why aren’t they doing anything?

    For that matter, why isn’t so much of our genome doing anything?

  11. Carl said, “our DNA shows signs of kinship to other primates, much like the genetic markers that are shared by people from a particular ethnic group. There’s a third line of evidence that I find particularly fascinating: the viruses carried by humans and other apes”.

    Would these be, fundamentally, the same thing? We are looking at genetic markers, whether they be viral or not, correct?

  12. Adam

    Hi All

    Theo has no answer because he popped out a non sequitur like “biochemical similarity” as an explanation for viral similarity. That’s garbage. Carl’s talking about viruses in the same locations – thousands of them – in the genomes of putatively related species. YECs would have to prove that viruses don’t insert themselves randomly but actively seek out the same locations in the merely similar genomes of different species. Or else swallow a BIG pill of “random chance” somehow creating these orderly patterns that look like inheritance in multiple genomes. Or a vengeful, spiteful God who creates such as a test to snare and damn rational free-thinkers.

    Yeah right.

  13. Luis

    Carl: “much like the genetic markers that are shared by people from a particular ethnic group”

    Interesting… I keep hearing everywhere that there is no genetic/biological basis for race (which I think is baloney), so could anyone provide a source for that?

  14. luca

    Luis, there’s no genetic/biological base for race as intended as ‘superiority’ of one above the other(s) in terms of intelligence or else, is one thing. but clearly there are genetic differences, apart from those visible. Different ethnic groups can for example react more or less strongly to a drug, depending on their metabolism peculiarities – e.g. metabolism of propranolol differs in caucasians and blacks (just let me dig up a reference – Introduction to Drug Metabolism III edition – by G Gordon Gibson – pp 122-onward)

  15. Bruce Wright

    I have a question.

    So this is a way that the genome of a species can acquire a mutation. This provides new raw genetic information that natural selection can work on.

    The great thing about this is that this new genetic information isn’t merely single bit coding errors, or a flipped bit due to a stray cosmic particle or other mutagenic exposure. Instead it’s entire strings of new information, already in some cases useful strings of proteins, multiple codons long.

    Has this been explored as a second mechanism behind punctuated equilibrium? We know that small population sizes will allow greater variation of genes to more quickly propagate among a population.

    But ALSO a small or concentrated population would provide a great breeding ground for viruses. It is a population under stress that is likely to be fertile for viral epidemics. Populations encountering invading species would be exposed to new viruses. Populations colonizing new territory would as well. Populations hunting a different food source than they are used to will also be exposed to that animal’s viruses.

    These viral epidemics provide a population under stress with perhaps, if they’re lucky, the raw genetic data to survive.

    So these retroviruses provide the mechanism for accelerated mutation at the same time that small population sizes provide the mechanism for accelerated propagation of the genes of successful individuals. Is this a second part of the Punctuated Equilibrium puzzle?

    That’s my hypothesis. Anyone working on that? Sorry if my understanding of the science is totally off. I assume there are people here to correct me if it is!

    What a TOTALLY amazing piece of the puzzle of evolution if I’m not completely making this up out of my hat. The idea that colonizing species have a mechanism to get the new genetic material for natural selection to work on.

    This is a fascinating field!

  16. Raza Usman

    Sure beats the Adam and Eve and the Garden of Eden explanation:)

  17. luca

    I’m not particularly convinced about Punctuated Equilibrium… but Bruce Wright’s comment set something working in my head… I think you may have missed an interesting connection between the two things, the virus and the small population size, in your list: how about a population that encounters the virus and gets almost wiped out except for individuals with resistance to it; the virus genome is then inserted but unable to express itself in this small, stressed population, and undergoes rapid drift… this combines your mechanisms into only one… maybe there’s something in here, any professional geneticist or biologist who cares to elaborate on this?

  18. Bruce Wright

    Thanks for the thoughts, Luca.

    Yeah, I don’t know anything about the evidence for punctuated equilibrium. It’s just that I found Gould’s hypothesis for a mechanism compelling.

    I do wish that a professional would happen upon this comment thread and help us out with these questions. I don’t even know who I’d ask!

  19. NAJEEB

    Is it possible that monkeys and pigs (the only ones) are mutated genes from humans? Why has nobody tried to prove this theory scientifically? Is it possible that humans came before the apes?

  20. Theo godwyn


    My point was that how do we know that the genetic code came from a virus rather than a virus coming from the genetic code. This whole theory of human genetic code evolving through viral infiltration of eggs depends on the genetic code originating in a virus.

    Since a virus does not create genetic code but rather steals it from other organisms, it must be assumed that no genetic code can be considered native viral code.

    This article is claiming that humans are built (at least 8%) by viral code. My question would be, “why aren’t viruses considered to be composed of human genetic code instead of humans composed of viral genetic code?”


The Loom

A blog about life, past and future. Written by DISCOVER contributing editor and columnist Carl Zimmer.

About Carl Zimmer

Carl Zimmer writes about science regularly for The New York Times and magazines such as DISCOVER, which also hosts his blog, The Loom. He is the author of 12 books, the most recent of which is Science Ink: Tattoos of the Science Obsessed.

