Hacking the genome with a MAGE and a CAGE

By Ed Yong | July 14, 2011 4:00 pm

It couldn’t be easier to make sweeping edits on a computer document. If I were so inclined, I could find every instance of the word “genome” in this article and replace it with the word “cake”. Now, a team of scientists from Harvard Medical School and MIT have found a way to do a similar trick with DNA. Geneticists have long been able to edit individual genes, but this group has developed a way of rewriting DNA en masse, turning the entire genome of a bacterium into an “editable and evolvable template”.

Their success was possible because the same genetic code underlies all life. The code is written in the four letters (nucleotides) that chain together to form DNA: A, C, G and T. Every set of three letters (or ‘codon’) corresponds to a different amino acid, the building blocks of proteins. For example, GCA codes for alanine; TGT means cysteine. The chain of letters is translated into a chain of amino acids until you get to a ‘stop codon’. These special triplets act as full stops that indicate when a protein is finished.
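As a rough illustration of the reading process described above, here is a minimal Python sketch. The table is deliberately partial: it holds only the two codons named in this article, plus the start codon ATG and the three standard stop codons. The names `CODON_TABLE` and `translate` are made up for this example.

```python
# Partial codon table: only a handful of the 64 triplets, for illustration.
CODON_TABLE = {
    "ATG": "Met",
    "GCA": "Ala",
    "TGT": "Cys",
    "TAA": "STOP", "TAG": "STOP", "TGA": "STOP",
}

def translate(dna):
    """Read dna three letters at a time until a stop codon is reached."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        amino_acid = CODON_TABLE[codon]  # KeyError for codons missing from this partial table
        if amino_acid == "STOP":
            break  # the "full stop": nothing after this point is read
        protein.append(amino_acid)
    return protein

print(translate("ATGGCATGTTAGGCA"))  # ['Met', 'Ala', 'Cys']
```

The trailing GCA is never read, because the TAG before it ends the protein.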

This code is virtually the same in every gene on the planet. In every human, tree and bacterium, the same codons correspond to the same amino acids, with only minor variations. The code also includes a lot of redundancy. Four DNA letters can be arranged into 64 possible triplets, which are assigned to only 20 amino acids and a stop signal. So for example, GCT, GCA, GCC and GCG all code for alanine. And these surplus codons provide enough wiggle room for geneticists to play around with.
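The arithmetic behind that redundancy is easy to check. This short Python sketch enumerates every triplet of the four letters and picks out the four alanine codons, which differ only in their third letter. The variable names are just for illustration.

```python
from itertools import product

# Every ordered triplet of the four DNA letters: 4 * 4 * 4 = 64.
codons = ["".join(triplet) for triplet in product("ACGT", repeat=3)]
print(len(codons))  # 64

# The four alanine codons all start with GC; the third letter is free.
alanine = [codon for codon in codons if codon.startswith("GC")]
print(alanine)  # ['GCA', 'GCC', 'GCG', 'GCT']
```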

Farren Isaacs, Peter Carr and Harris Wang have started to replace every instance of TAG with TAA in the genome of the common gut bacterium Escherichia coli. Both are stop codons, so there’s no noticeable difference to the bacterium – it’s like replacing every word in a document with a synonym. But to the team, the genome-wide swap will eventually free up one of the 64 triplets in the genetic code. And that opens up many possible applications.

“We are actively pursuing three of them,” says Isaacs. First, they could assign the empty triplet to unnatural amino acids that sit outside the standard twenty. “This [could] expand the diversity of possible enzymes and create new classes of drugs, industrial enzymes and biomaterials.”

Second, the team could use the tweaked genetic codes to make living things resistant to viruses. Viruses make copies of themselves by hijacking the protein-making factories of their hosts. They depend on the fact that their proteins are encoded by the same triplets as those of their hosts. If their hosts stray from this universal genetic code, their factories will mangle the virus’s instructions, creating distorted and useless proteins. That would be useful for industry as well as medicine. The biotechnology company Genzyme had to shut down a manufacturing plant for several months after it was hit by a contaminating virus. Millions of dollars were lost.

Third, and for similar reasons, the altered codes could be used to contain genetically modified organisms, preventing them from breeding with wild populations. It’s the geneticist’s version of the Tower of Babel story – modified creatures would be imprisoned by their own genetic tweaks, unable to productively exchange genes with natural counterparts.

All three applications are some distance away in the future, but Isaacs, Carr and Wang have taken an important step towards them. Their genome-wide edits relied on two complementary technologies, invented by their team – MAGE, which substitutes TAA for TAG in separate pieces of bacterial DNA, and CAGE, which knits the pieces together into a whole genome.

MAGE, the older of the two techniques, made its debut two years ago. It stands for “multiplex automated genome engineering”, a fancy way of saying that it can easily change a genome many times over. It was originally used to create millions of small variants of bacterial genomes, producing a multitude of strains that can be tested for new abilities. As Jo Marchant puts it in her excellent feature, it’s an “evolution machine”. In its debut, within a matter of days, it had evolved a strain of E.coli that would produce large amounts of lycopene, a pigment that makes tomatoes red.

MAGE is a versatile editor. Not only can it create many diverse changes in a group of cells, it can also create many specific changes in a single cell. That’s what Isaacs, Carr and Wang have now done. TAG appears in 314 places throughout the E.coli genome as a stop codon. For each one, the team created a small stretch of DNA that had TAA instead of TAG, surrounded by exactly the same letters. They fed these edited fragments into bacteria, which used them to build new copies of their own DNA. The result: daughter bacteria with edited genomes.
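The edit is targeted: only TAGs that act as stop codons are changed, not every run of those three letters. The toy Python sketch below makes that distinction concrete. It is an assumption-laden illustration, not a model of how MAGE works in the cell: genes are given as hypothetical (start, end) positions on a single strand, and the function name `swap_tag_stops` is invented here.

```python
def swap_tag_stops(genome, genes):
    """Replace a terminal TAG stop codon with TAA for each listed gene.

    genes: list of (start, end) index pairs, end exclusive, forward strand
    only (a simplifying assumption for this sketch).
    """
    letters = list(genome)
    for start, end in genes:
        if genome[end - 3:end] == "TAG":
            letters[end - 1] = "A"  # TAG -> TAA is a single-letter change
    return "".join(letters)

# Toy sequence: one gene (ATG GCA TAG) followed by a TAG outside any gene.
genome = "ATGGCATAGCCTAGC"
edited = swap_tag_stops(genome, [(0, 9)])
print(edited)                        # ATGGCATAACCTAGC
print(genome.replace("TAG", "TAA"))  # ATGGCATAACCTAAC (naive find-and-replace)
```

The naive find-and-replace also changes the second TAG, which is not a stop codon in this toy example; the targeted version leaves it alone.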

In this way, Isaacs, Carr and Wang created 32 strains of E.coli that, between them, had every possible substitution of TAG to TAA. This might seem overly complicated, but replacing every TAG with TAA in a single step would be inefficient, slow, and error-prone. A single mistake could be lethal for the microbes. By taking things slowly, and spreading the substitutions among 32 strains, the team could better troubleshoot any tricky snags.

To combine the 32 strains into one, Isaacs, Carr and Wang developed CAGE (or “conjugative assembly genome engineering”). The technique relies on the bacterial equivalent of sex – a process called conjugation where two cells sidle up, form a physical link between one another, and swap DNA.

The team matched their 32 strains up in pairs, in a league that looked like a knock-out sports tournament. One strain of each pair would deliver its edited genes into its partner, and the incoming genes were designed to merge with those of the recipient in specific ways. Thirty-two strains with 10 edits each became sixteen strains with 20 edits each. Sixteen turned into eight and eight into four.
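The knock-out arithmetic can be sketched in a few lines of Python. The figure of 10 edits per starting strain is the rounded number quoted above (314 sites spread across 32 strains), so the totals are approximate, and the function name `cage_rounds` is invented for this illustration.

```python
def cage_rounds(n_strains, edits_per_strain):
    """Return (strains, edits per strain) before and after each pairing round."""
    rounds = [(n_strains, edits_per_strain)]
    while n_strains > 1:
        n_strains //= 2        # each pair merges into one strain
        edits_per_strain *= 2  # which carries both partners' edits
        rounds.append((n_strains, edits_per_strain))
    return rounds

for strains, edits in cage_rounds(32, 10):
    print(strains, "strains with about", edits, "edits each")
```

Starting from 32 strains, five rounds of pairing would be needed to reach a single strain carrying all of the edits.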

At the time of publication, the team had reached this “semi-final” stage. They had four strains of E.coli, each with a quarter of its genome stripped of TAG codons. The strains seem to be growing normally, proving that, individually at least, the TAG codons aren’t necessary for the bacterium’s survival. Whether E.coli can survive without any TAG codons at all is still unclear, but the team suspects this will be the case. If so, they’ll set about reprogramming the unused TAG codon to represent an unusual amino acid beyond the normal set of 20.

Why publish a paper at the semi-finals? “It is indeed an odd stopping point,” admits Carr. “[We’ve] been working on this project for 7 years and we decided to publish at this point largely because we have so much to talk about: the successful innovation of the CAGE technology and its integration with MAGE for genome engineering at large and small DNA scales. If you dig into the supplemental data of this paper, there’s another 1-2 more papers worth of stuff in there.”

Isaacs points out that only one other research group is “working on genome engineering at this scale”: the J. Craig Venter Institute (JCVI). Last year, they made headlines by creating a bacterial genome, 1.1 million DNA letters (base pairs) long, and implanting it into the shell of a different bacterium.

Isaacs says, “[They] took 10 articles to get to a slightly-modified one million base pairs. We hope to get to a highly-modified, industrially useful 4.7 million base pair genome in three papers.” That includes the one that introduced MAGE to the world in 2009, and the current one that couples it with CAGE. The third one, due in the next year or so, will complete the trilogy – it will feature the final strain. “All the pieces are in place,” says Carr. “We have a high degree of confidence we will reach our goal.”

What does the JCVI make of this? In a statement released to the press, Dan Gibson and Craig Venter point out that the MAGE/CAGE method still requires an existing genome to work from. Replacing an entire codon is a remarkable achievement, but it’s still a tweaking game. The end result will still be a genome that’s at least 90% similar to the original one. Gibson and Venter say, “Ultimately, we at JCVI would like to design cells from scratch.” The only way to do this is to synthesise an entirely fresh genome, rather than modify an existing one.

They add, “We continue to believe there will be and must be many different techniques developed to engineer and construct genomes so that the field can mature, allowing new and important products to be made. We believe the Isaacs et al paper is a positive addition to the field.”

Reference: Isaacs, Carr, Wang, Lajoie, Sterling, Kraal, Tolonen, Gianoulis, Goodman, Reppas, Emig, Bang, Hwang, Jewett, Jacobson & Church. 2011. Precise Manipulation of Chromosomes in Vivo Enables Genome-Wide Codon Replacement. http://dx.doi.org/10.1126/science.1205822

Comments (9)

  1. “TAG appears in 314 places throughout the E. coli genome” AS A STOP CODON … it of course appears many more times as a string of three bases in non-coding sequences, in other reading frames, or on the opposite strand. The trick is not replacing every instance of “genome” with “cake” but only specific instances. 😉

  2. @Guy – D’oh! Yes, obviously. Have amended the text accordingly. Thanks.

  3. Dale Sheldon-Hess

    As I understand it, sometimes organisms have entirely different genes encoded by reading over the same sequence, but offset by 1 or 2 bases, or by reading it backwards. Is that not as much of a problem in the E. coli genome, or do the 314 changes not run across any such complications, or is the possibility of this the reason the researchers fear that the fully-modified organism will not survive?

  4. Phil Ashton


    An excellent explanation of a complex paper. The part which is really mind boggling is…

    ‘the incoming genes were designed to merge with those of the recipient in specific ways’

    I havent read the paper yet but im guessing this is where it gets down and dirty – did you have any more commentary on this part which you left out for brevity or am i going to have to do some thinking for myself? 😉

  5. @Phil – HA! Good spot! Yes, it’s really bloody complicated. It would have really dragged down an already quite long piece. Check out the paper 😉

  6. Great write-up Ed. I am a bit confused by Isaacs’ standpoint on publishing though.. What does the number of papers in which you publish your method have to do with anything? If there really are two more papers in the methods section of this one, I think it would serve science if they had published them separately!

  7. erplus

    not as original as it sounds since every mol.biology student has to learn that organisms use stop codons already in some instances to encode the 21st and the 22nd amino-acids: selenocysteine and pyrrolysine.

    so all the useful ideas here are already in the natural examples…

    plus ca change plus these harvard frauds try to appear non-plus-ultrish…

  8. Phil Ashton

    A post on this great microbiology blog goes into a bit more detail if anyone is interested in the genetics of this approach.


  9. Causio

    [OT] “and it’s integration with ” please remove the apostrophe… ugly error!

