Chapter 10: Crick, Barnett, Brenner and Watts-Tobin Infer That The Genetic Code is Triplet and Non-Overlapping
“A group of three bases (or, less likely, a multiple of three bases) codes one amino-acid”
If embedded in the sequence of nucleotide pairs in DNA is a linear code for specifying amino acids, then what are the rules that govern this code? Are the coding units composed of two, three, four, five or six nucleotides? Do coding units overlap with each other? Does the code begin at a fixed starting point? These fundamental questions were addressed in one of the most breathtaking publications in the history of molecular biology in which Francis Crick, Leslie Barnett, Sydney Brenner and R.J. Watts-Tobin obtained fundamental insights simply by examining plaque formation using Benzer’s rII system of phage T4.
It is at this point that another giant of molecular biology enters the story. He is the towering intellectual Sydney Brenner (1927-2019), who was born and raised in South Africa. Brenner won a fellowship in 1951 to do his Ph.D. at Oxford University with Nobel Prize-winning chemist Hinshelwood (recall that he was the skeptic who didn’t believe bacteria have genetics at the time of Luria and Delbrück; chapter 2). After doing a postdoc at Berkeley, he spent 20 years at the Laboratory of Molecular Biology in Cambridge, England. Brenner contributed to uncovering the nature of the genetic code and to the discovery of mRNA. He eventually switched fields, leading the effort to make the roundworm, C. elegans, a model system for understanding neuronal development, for which he won a Nobel Prize in 2002. Later, he moved to the United States and Singapore, remaining highly influential. His sharp wit and striking personality are captured in the video.
Co-author and British microbiologist Leslie Barnett (1920-2002) worked at the Medical Research Council Unit in Cambridge as an assistant to Crick and later Brenner through the remainder of her career. In a follow-up to the iconic Crick, Barnett, Brenner and Watts-Tobin paper, she published a detailed record of all the rII mutants and results in a paper with the Royal Society, a treasure trove of information about the mutants and the experiments.
Returning to the 1961 Nature article, Crick et al. reached the following principal conclusions as to the nature of the code:
“(a) a group of three bases (or, less likely, a multiple of three bases) codes one amino acid.
(b) The code is not of the overlapping type…
(c) The sequence of the bases is read from a fixed starting point….
(d) The code is probably ‘degenerate’; that is, in general one particular amino-acid can be coded by one of several triplets of bases.”
Here we focus on the triplet nature of the code. The experiments were carried out using cistron B of the rII region of T4. The starting point for the investigation was a mutation originally named P13 that had previously been generated by the acridine dye proflavin. Crick renamed the mutation FC0 (Francis Crick 0), and the publication represents one of the few times in his career that Crick, the ultimate theoretician of molecular biology, did experiments with his own hands. Acridine-induced mutations are rarely leaky, and Crick et al. inferred that they must add or subtract base pairs. Also, they rarely revert by reversing the original mutation. Instead, reversion occurs by the acquisition of a second, suppressor mutation. So, if the original mutation is an insertion of a base pair (+1) then the suppressor would be a single base pair deletion (-1).
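The logic of a frameshift and its suppressor can be illustrated with a toy message. The following is a minimal sketch in Python, assuming a made-up repeating sequence (CAT) that is read in groups of three from a fixed starting point. The sequence, the positions of the changes and the helper names are all hypothetical and are not taken from Crick et al.

```python
def read_codons(seq, n=3):
    """Split seq into consecutive groups of n letters, starting at position 0."""
    return [seq[i:i + n] for i in range(0, len(seq) - n + 1, n)]

def insert_base(seq, pos, base):
    """Return seq with one extra base placed at index pos (a hypothetical + mutation)."""
    return seq[:pos] + base + seq[pos:]

def delete_base(seq, pos):
    """Return seq with the base at index pos removed (a hypothetical - mutation)."""
    return seq[:pos] + seq[pos + 1:]

wild = "CAT" * 6                      # toy wildtype message
plus = insert_base(wild, 4, "G")      # single insertion: frame shifts downstream
double = delete_base(plus, 10)        # nearby deletion acts as a suppressor

print(read_codons(wild))    # every group reads CAT
print(read_codons(plus))    # groups after the insertion are misread
print(read_codons(double))  # only groups between the two changes are misread
```

In this toy case the message is garbled only in the short stretch between the insertion and the deletion, and the original reading frame resumes downstream of the second change.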
Crick isolated multiple suppressors of FC0, and they too were non-leaky. These suppressors were in turn used to isolate suppressors of these suppressors. For example, one of the suppressors of FC0 named FC1 was in turn used to isolate suppressors of the FC1 suppressor. Crick et al. arbitrarily assumed that FC0 was a + and hence that FC1 was a -. They cautiously left off the “1” because they did not want the + and - symbols to literally mean a single base pair. They also recognized that the reverse could have been the case, and it would not have affected the conclusions of the publication. Shown is the genetic map of Figure 2 in which orange arrows have been added to highlight FC0 and the suppressors of FC0 that were used to isolate suppressors of the suppressors (1, 6, 10, etc). Thus, double mutants created when a + was combined with a - or vice versa exhibited an rII+ phenotype; the mutant phenotype of one mutation was suppressed by the second mutation.
The most famous and impactful results of their classic publication are, however, in Table 3, which reports the phenotypes of triple mutants created by combining a + with a + with a + (e.g., 0 and 40 and 38) or a - with a - with a - (1 and 21 and 23). Thus, whereas double mutants of the same type (a + with a +) exhibited a mutant phenotype, triple mutants of the same type exhibited a wildtype (or pseudo-wild) phenotype. [To avoid confusion, note that the “+” symbols in Table 3 refer to combinations of the three indicated mutations, not to the sign of the insertions and deletions.]
The authors were cautious in stating that + and - should not be literally taken to mean the insertion or deletion of single base pairs. Thus, they introduced the formalism that: + represents +m, modulo n, where m is a positive integer and n the coding ratio or number of base pairs specifying an amino acid. Likewise, - represents -m, modulo n, where -m is a negative integer. Hence, a + could have been an insertion of two base pairs and a - could have been a deletion of two base pairs.
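The modulo formalism can be explored by brute force. The sketch below is an illustration only, under the simplifying assumption that every mutation of like type shifts the frame by the same m. It asks which coding ratios n allow a single mutation and a like-type double mutant to remain out of frame while a like-type triple mutant returns to frame. The function name and the upper bound of 12 are arbitrary choices made for this example.

```python
def consistent_coding_ratios(max_n=12):
    """Return coding ratios n for which some shift m satisfies:
    one mutation is out of frame, two like mutations are out of frame,
    and three like mutations are back in frame (all taken modulo n)."""
    ratios = []
    for n in range(2, max_n + 1):
        for m in range(1, n):          # m = 0 (mod n) would not be a frameshift
            single_shifts = (1 * m) % n != 0
            double_shifts = (2 * m) % n != 0
            triple_restores = (3 * m) % n == 0
            if single_shifts and double_shifts and triple_restores:
                ratios.append(n)
                break                  # one workable m is enough for this n
    return ratios

print(consistent_coding_ratios())
```

Under these assumptions only multiples of three survive, for example n = 6 with m = 2, which matches the authors' wording of "3 or a multiple of 3".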
Still, the results presented in Table 3 would have been consistent with a code of three base pairs per amino acid or a multiple thereof (e.g. six). They asserted this strongly in stating that “...they have convincing evidence [namely, Table 3] that the coding ratio is 3 or a multiple of 3.” Central to their argument are the results of Table 2. Consider, for example, the case of the triple mutant, FC 0, FC 40, FC 38 (top line in Table 3 above). “These three mutants are, by themselves, all of like type (+). We can say this ….because each of them when combined with our mutant FC9 (-) gives the wild or pseudo-wild phenotype [see Table 2 above].”
In sum, simply by comparing the plaque-forming capability of various combinations of mutations in a single gene (rII cistron B), Crick et al. were able to infer something deeply fundamental about the nature of the genetic code. Here Brenner in the video recalls when and where the idea that acridines caused insertions and deletions arose and how this led them to devise the strategy for inferring the number of base pairs in a codon. (Not mentioned in the video, but centrally important, was the contribution of Leonard Lerman, who was on sabbatical at the MRC Laboratory of Molecular Biology and who is credited with the idea that acridines insert between nucleotide base pairs.)
An unexpected sequel
Still, the story is not as straightforward as it appeared in 1961 or as it has been interpreted over the years. DNA sequencing had not yet been invented in 1961. Hence the true nature of the FC mutations was not known. Fast forward to 1987, when T4 aficionado Larry Gold and his coworkers at the University of Colorado sequenced several of the FC mutants. Among these were the following mutants, which were central to the Crick et al. paper: FC1, which was shown to be +2 (+AC); FC10, which was shown to be -1 (-A); FC11, which was shown to be -1 (-A); FC21, which was shown to be -1 (-A); FC23, which was shown to be -1 (-T); FC30, which was shown to be +1 (+T); and FC55, which was shown to be +1 (+C).
The most striking is FC1, which rather than being a -1 is a +2. Does this change anything? In Table 3 (above), FC1 (+AC) is combined with FC21 (-A) and FC23 (-T). Thus, +2 -1 -1 = 0. Recall that Crick et al. had correctly inferred that the code is “non-overlapping” and that it is read from “a fixed starting point”, hence in a fixed reading frame. But there would have been no change in the reading frame in the triple mutant because the combination of FC1 with FC21 and FC23 would not have resulted in a net increase or decrease in the number of base pairs downstream of the three mutations. Ergo, the result of Table 3 by itself would have been consistent with a code of 2, 3, 4, or 5 bases. Said another way, whereas a frameshift would have resulted in out-of-frame reading of all downstream coding units (codons), causing a mutant phenotype, the combination of +2 -1 -1 would have exhibited a wildtype phenotype whatever the size of codons.
Nonetheless, Table 2 (above) showed that double mutants of FC1 with five other FC mutations (0, 38, 40, 41, 58) that had been assigned as being +1 yielded wildtype phenotypes. Unfortunately, the sequence of none of these five was reported. But if any or all were truly +1, then a wildtype phenotype would have arisen if, and only if, the code is triplet, namely, +2 +1 = 3. That is, the addition of three base pairs would have left the reading frame unaltered. Ergo, Crick et al. were correct in inferring the code is triplet but not on the basis of the FC1 21 23 triple mutant of Table 3 but rather because of the FC1 double mutants in Table 2.
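The contrast between the triple mutant and the FC1 double mutants comes down to simple arithmetic, sketched here in Python. The sizes of FC1, FC21 and FC23 are those reported from sequencing. The +1 partner in the double mutant is an assumption, since none of those five mutants was sequenced, and the helper name frame_restored is invented for this illustration.

```python
def frame_restored(changes, codon_size):
    """True if the net number of added or removed base pairs
    is a multiple of the codon size (reading frame unchanged)."""
    return sum(changes) % codon_size == 0

codon_sizes = range(2, 6)            # candidate codon sizes 2, 3, 4, 5

triple = [+2, -1, -1]                # FC1, FC21, FC23 as sequenced
double = [+2, +1]                    # FC1 with an ASSUMED +1 partner

triple_ok = [n for n in codon_sizes if frame_restored(triple, n)]
double_ok = [n for n in codon_sizes if frame_restored(double, n)]

print("triple mutant in frame for codon sizes:", triple_ok)
print("double mutant in frame for codon sizes:", double_ok)
```

The triple mutant is in frame for every candidate size, so it cannot discriminate among them, whereas the double mutant is in frame only for a codon size of three.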
Sadly, what was not reported is the sequence of FC0, the starting point for the Crick et al. publication and central to five of the six constructs in Table 3. (Efforts by the author and others to track down FC0 so that it can be sequenced have been unsuccessful.) What if, for example, FC0, which was generated using the acridine proflavin, is a -2 rather than a +1? (Indeed, we now know that acridines cause multi-base pair insertions and deletions.) Then, once again in Table 3, combining FC0 with two +1 mutations would have resulted in no change in reading frame and hence the results would have been consistent with codons of length 2, 3, 4 or 5. Still, we can infer the code is triplet from Table 2, where FC0 is combined with FC21 (-A). If FC0 were -2, then the FC0 FC21 double mutant would have had a deletion of three base pairs and could have been expected to have had a wildtype (W in the Table) phenotype if, and only if, the code were triplet.
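Since FC0 itself was never sequenced, one can ask which sizes it could have had. The sketch below is purely hypothetical. It assumes a triplet code, takes FC21 as -1 (as sequenced), and searches small insertions and deletions for values of FC0 that are frameshifts on their own yet return to frame when combined with FC21. The search range of three base pairs in either direction is an arbitrary choice.

```python
CODON_SIZE = 3          # assume the code is triplet
FC21 = -1               # sequenced as a single base pair deletion (-A)

def is_frameshift(change):
    """A change shifts the frame unless it is a multiple of the codon size."""
    return change % CODON_SIZE != 0

# Candidate sizes for FC0: deletions or insertions of 1 to 3 base pairs.
candidates = [c for c in range(-3, 4) if c != 0]

compatible = [
    c for c in candidates
    if is_frameshift(c) and (c + FC21) % CODON_SIZE == 0
]

print("FC0 sizes compatible with a wildtype FC0 FC21 double mutant:", compatible)
```

Within this range both +1 and -2 fit the double-mutant result, which is why the phenotype alone cannot distinguish the two possibilities.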
We note in closing this revealing quote from Brenner:
“The other interesting thing about this was that it was a real ‘house of cards’ theory. You had to buy everything. You couldn’t take one fact and let it stand by itself and say the rest could go. Everything was so interlocked. You had to buy the plus and minuses, you had to buy the barriers, you had to buy the triplet phase, and all these went together. It was the whole that explained it, and if you attacked any one part of it the whole thing fell apart. So it was an all or nothing theory. And it was very hard to communicate to people”
The advent of DNA sequencing has revealed the Crick et al. story to be less straightforward than often interpreted. The assertion that Table 3 provided “convincing evidence” that the code is three base pairs per amino acid is by itself less convincing than originally thought. Nonetheless, they got it right, and their deep analysis based on the simplest of experiments places this classic publication among the most breathtaking in the history of molecular biology.