Introduction

Experimental evolution of microbes has been used to study a variety of evolutionary phenomena because microbes are easy to manipulate and evolve rapidly. At the same time, the extent of genetic change accumulated in whole genomes during experiments has been small, hence the only processes studied so far are those limited to a few changes. Thus many evolutionary phenomena remain inaccessible to experimental evolution for lack of greater amounts of change.

We report here a study of bacteriophage T7 in which several hundred nucleotide substitutions accumulated in its genome, affecting approximately 1% of its bases. This evolution was carried out in two phases. In the first, the phage was grown under mutagenic conditions with frequent bottlenecks. The goal was to fix mutations at a high rate so that an experimental phylogeny could be created for direct testing of phylogenetic methods (Hillis et al. 1992). The fixation of mutations was largely indiscriminate, hence many of the mutations were deleterious, and the fitness of the phage declined substantially. In the second phase, the heavily mutated phage was grown in large population sizes to evolve higher fitness. The goal at this point was to understand whether the phage genomes could recover their lost fitness and what types of mutations were involved in the fitness recovery.

This study reports the DNA sequences of the genomes from two experimental lines at the end of both phases of the experiment. The substitutions and other nucleotide changes that accompanied these evolutions are identified and analyzed for statistical properties of molecular evolution. The methodology used here to evolve T7 can be easily applied to other viruses and be extended indefinitely, leading to potentially thousands of substitutions, a magnitude of change that would allow the experimental study of many natural long-term phenomena.

Materials and Methods

A wild-type isolate of the lytic bacteriophage T7 was used as the starting genome. This DNA phage has a linear, double-stranded DNA genome of 39,937 nucleotides. During the first phase of this study, referred to as the “bottleneck” phase, mutations were accumulated in separate lines of the T7 genome by passaging the virus in aerated, 2-ml LB broth cultures with host Escherichia coli W3110. The cultures contained the mutagen nitrosoguanidine and were maintained at 42°C (Hillis et al. 1992). Each liquid culture was allowed to proceed to lysis, whereupon 2 µl of the lysate were transferred to another 2-ml culture of W3110 plus mutagen. After five serial transfers, the phage population was plated to recover a single plaque, and a suspension of that plaque was used to initiate the next set of five liquid cultures. The present study used isolates K and P from Hillis et al. (1992), each of which had been carried through 120 liquid lysates, interspersed with plaque isolations every five cycles. Thus, there were 24 bottlenecks where an isolated plaque was picked to initiate further phage growth. As a plaque is initiated by infection of a single bacterium by a single phage particle, each bottleneck reduces population size to unity. This phase was completed for the Hillis et al. (1992) study; phage genomes had been saved as DNA stocks and so were not subject to selection for phage viability. Fitness of the phage population was severely reduced after the bottleneck phase, as judged by both plaque size and time to lysis (JJB, unpublished). The drop in fitness was likely due to the fixation of deleterious mutations resulting from the high mutation rate and small population size (e.g., Chao 1990). Fixation of mutations had been the goal of Hillis et al. (1992), to enhance divergence among experimental T7 lines.

The second phase of the present study, referred to as the “recovery” phase, was conducted recently, again in LB cultures of W3110, but at 37°C and without mutagen. The recovery phase consisted of a series of passages in which a suspension of phage was added to 10-ml LB cultures of cells in 125-ml flasks and grown with aeration. This culture was then treated with chloroform, and an aliquot containing at least 10^5 phage was added to the next culture. This procedure was continued for 82 cycles with line P and 115 cycles with K. The goal was to transfer sufficient phage to maintain a moderately large minimum population size (10^5) yet transfer before high levels of multiple infection were achieved (e.g., the culture was not allowed to lyse). Initial passages were conducted for 60 min before transfer, but as fitness improved, it became necessary to reduce the duration of each passage to 40 min, because phage fitness was high enough that an inoculum of 10^5 phage would overwhelm the cells within 60 min. However, because of low fitness, the first five passages with the recovery of K were grown on lawns of bacteria on plates instead of in liquid culture. Phages from the beginning and endpoints of the recoveries will be denoted with a subscript for the passage number: P0 (K0) is the isolate from Hillis et al. (1992), and P82 (K115) is the respective phage from the last passage of the recovery.

To standardize growth conditions for passages and fitness assays, W3110 cells were grown to log phase in a large volume, concentrated 100× and frozen in aliquots (−80°C in LB with 20% glycerol). To initiate a passage or assay, an aliquot was thawed, and a predetermined volume of cells was added to 10-ml LB and grown at 37°C to achieve 1 × 10^8 cells/ml (within a factor of 2) at 1 h. Phage was added at this time, and the culture was grown for the requisite period of time before chloroform treatment.

Fitness, represented as the number of doublings per hour of the phage concentration, was assayed in the same conditions that were used to recover phage fitness: phage was added to 10-ml liquid cultures of LB with W3110 and grown for 40–60 min at 37°C (as per Rokyta et al. 2002). Fitness was calculated as the log2 of the ratio of the number of phage at the end to the number of phage added at the beginning, divided by the duration of the assay in hours. Fitness measures can be slightly affected by cell density (e.g., see Additional File 1 in Bull et al. 2002), so fitness comparisons among phages were confined to assays performed within one batch of cells.
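The fitness calculation described above can be written out as a short Python sketch. The function name and the example titers are hypothetical and serve only to illustrate the arithmetic; they are not values from the assays.

```python
import math

def doublings_per_hour(phage_start, phage_end, minutes):
    """Fitness as log2(end / start) divided by the assay duration in hours."""
    if phage_start <= 0 or phage_end <= 0 or minutes <= 0:
        raise ValueError("titers and duration must be positive")
    return math.log2(phage_end / phage_start) / (minutes / 60.0)

# Hypothetical titers: 1e5 phage added, 1e9 recovered after a 40-min assay.
example_fitness = doublings_per_hour(1e5, 1e9, 40)
```

Under these illustrative numbers the phage concentration rises 10,000-fold in two thirds of an hour, which corresponds to roughly 20 doublings per hour.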

Sequences of our wild-type phage and of P0, K0, P82, and K115 (GenBank numbers AY264774–AY264778) were obtained using PCR products made directly from phage lysates, using chain termination reactions with fluorescent dyes. Sequences of the terminal 20–25 bases at each end of the phage were not determined. With the exception of K115, lysates for sequencing were obtained from single plaques. Sequencing gel profiles from an ABI377 sequencer were analyzed in DNASTAR. Typically, one strand was sequenced and compared to both the wild-type and parent strain genome data. Changes in P0 and K0 were verified in the P82 and K115 lines as well. Four of the more than 700 mutations in P0 and K0 were ambiguous in the sequence profile, 3 of which included the wild-type base as one of two alternatives. Two of these three were resolved as the wild-type base based on restriction enzyme patterns of the PCR product, the third was scored as the non-wild-type base, and the remaining ambiguity was scored as the wild-type base; these two tentative assignments do not affect the conclusions made. The sequence of K115 showed evidence of polymorphism for the wild-type base and one other base at six positions. Polymorphism is not surprising in the K115 sequence, since the final lysate rather than an isolate was used to generate the PCR product for sequencing. The non-wild-type base was assigned to each of these positions in K115. Again, conclusions are not affected by these tentative assignments.

Genomes of K115 and P82 were recombined by ligating appropriate restriction fragments from different genomes (Rokyta et al. 2002). Breakpoints for the different recombinations were approximately at fourths of the genome, using digestion with the unique cut-site enzymes BglII (11,514), BanI (19,601), or NcoI (29,717) to generate specific fragments. Fitnesses of recombinant phages and the complete K115 and P82 genomes were assayed to identify interactions between the mutations that had occurred during the bottlenecks and recoveries.

Various statistical tests were used in analyzing properties of the substitutions and fitnesses. Standard test statistics are indicated in the text. Some analyses were done empirically, by computing the distribution of changes expected for the numbers and types of changes observed. In these methods, a C program was written to randomly assign the observed mutations across the T7 genome, and distributions generated from large numbers of random assignments were compared to the observed substitutions to assess significance levels. These random assignments accommodated the relevant biological constraints; e.g., an observed G → A change could be randomly assigned only to a position with a G in the wild-type sequence and could not be assigned to positions that were deleted in the evolved line.
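The randomization approach described above can be sketched in Python. This is a minimal illustration and not the C program used in the study: the function names, the choice to sample sites without replacement, and the one-sided p-value with a +1 correction are all assumptions made for the sketch.

```python
import random

def random_assignment(genome, changes, deleted=frozenset(), rng=random):
    """Place each observed change at a random eligible site.

    genome  : wild-type sequence as a string
    changes : list of (from_base, to_base) tuples, one per observed mutation
    deleted : 0-based positions deleted in the evolved line (ineligible)
    Returns a list of 0-based positions, one per change, with no site reused.
    """
    # Eligible sites for each wild-type base, excluding deleted positions.
    sites = {}
    for pos, base in enumerate(genome):
        if pos not in deleted:
            sites.setdefault(base, []).append(pos)
    # Count the changes starting from each base, then sample that many sites.
    need = {}
    for from_base, _ in changes:
        need[from_base] = need.get(from_base, 0) + 1
    pools = {b: rng.sample(sites.get(b, []), n) for b, n in need.items()}
    return [pools[from_base].pop() for from_base, _ in changes]

def empirical_p(observed, statistic, genome, changes, deleted=frozenset(),
                n_reps=1000, seed=0):
    """One-sided p: share of random assignments with statistic >= observed."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_reps):
        positions = random_assignment(genome, changes, deleted, rng)
        if statistic(positions) >= observed:
            hits += 1
    return (hits + 1) / (n_reps + 1)
```

Here `statistic` is any function of the assigned positions, for example a count of how many fall in a given gene; the constraint that a G → A change can land only on a wild-type G outside deleted regions is enforced when the eligible sites are built.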

Tests of fitness interactions among recombinant genomes used t-tests that compared two pairs of differences (X2–X1)–(X4–X3), as described in Bull et al. (2000). Fitnesses for these tests were obtained from eight genotypes, and since the test of interactions requires fitness estimates among four genotypes, the eight genotypes allow two statistically independent tests of interactions. However, there are three different ways to partition the eight genotypes into two tests. These three pairs of tests are not independent of each other because they each involve the same eight sets of fitnesses, but the interactions evaluated in one pair of tests are also not identical to those tested in the other pairs. Thus our search for interactions included all three possible pairs of tests allowed by eight sets of fitnesses, with appropriate corrections for multiple comparisons.
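The contrast (X2–X1)–(X4–X3) can be illustrated with a small Python sketch. It assumes replicate fitness estimates for each of four genotypes and a common error variance pooled across them; the exact procedure of Bull et al. (2000) is not reproduced in the text, so the pooling and the function name are assumptions.

```python
import math
from statistics import mean, variance

def interaction_t(x1, x2, x3, x4):
    """t statistic for the contrast (mean x2 - mean x1) - (mean x4 - mean x3).

    Each argument is a list of replicate fitness estimates for one genotype
    (at least two replicates each). Assumes a common error variance, which
    is estimated by pooling across the four genotypes.
    Returns (t, degrees_of_freedom).
    """
    groups = [x1, x2, x3, x4]
    df = sum(len(g) - 1 for g in groups)
    pooled_var = sum((len(g) - 1) * variance(g) for g in groups) / df
    contrast = (mean(x2) - mean(x1)) - (mean(x4) - mean(x3))
    se = math.sqrt(pooled_var * sum(1.0 / len(g) for g in groups))
    return contrast / se, df
```

A contrast near zero means the same fragment swap has the same fitness effect in both backgrounds (additivity); a large absolute t indicates that the effect depends on the background.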

Results

Populations of phage T7 were passaged under experimental conditions designed to fix large numbers of mutations: periodic population bottlenecks of a single individual, alternating with growth of large numbers of phages in the presence of a mutagen. Genome-wide restriction maps and partial sequences indicated that a large number of point mutations had indeed been fixed by these procedures (Bull et al. 1993; Hillis et al. 1992, 1994), but the identities of most mutations were unknown. Two archived lines were used in this study: K and P (designated here as K0 and P0), each obtained from a single plaque since the last mutagenesis. Isolates K115 and P82 are the lines obtained from K0 and P0 (respectively) after evolution to recover fitness. All four genomes were sequenced completely to identify the characteristics of the mutations in both phases.

Fitnesses

Fitness estimates of K0 and P0 were well below that of wild-type T7 (Fig. 1). As measured here, fitness is expressed on a log scale (base 2) and is easily interpreted as the number of doublings of the phage population per hour (e.g., a single phage with a fitness of x will have 2^x descendants in 1 h). This measure combines both burst size and latent period. These fitness measures can also be converted into differences in burst size, for example. If latent periods of all phages are the same (assumed to be 15 min), the burst size of K0 is only 3.1% that of T7+, and that of P0 is 4.8%. On this scale, the increase in fitness of K0 and P0 during the recovery phase needed to reach the wild-type level would be 32-fold and 21-fold, respectively. The actual increases during recovery were 3.8-fold and 9-fold. By either measure, fitness increased during recovery but remained below that of wild-type T7. However, the trend in Fig. 1 gives the impression that further increases in fitness are possible with additional passaging.
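One reading of the conversion from doublings per hour to relative burst size is sketched below in Python. It assumes that all growth occurs in discrete bursts separated by a shared latent period, so that a 15-min latent period gives four generations per hour. The function name and the fitness value of 6.9 are hypothetical; 26.9 is the wild-type fitness given with Fig. 1.

```python
def burst_ratio(fitness_a, fitness_b, latent_minutes=15.0):
    """Burst size of phage a relative to phage b, assuming a shared latent period.

    With g generations per hour, fitness = g * log2(burst size), so a fitness
    difference d corresponds to a burst-size ratio of 2 ** (d / g).
    """
    generations_per_hour = 60.0 / latent_minutes
    return 2.0 ** ((fitness_a - fitness_b) / generations_per_hour)

# Hypothetical phage at 6.9 doublings/h compared with wild type at 26.9.
ratio = burst_ratio(6.9, 26.9)
fold_needed = 1.0 / ratio
```

Under this assumption a deficit of 20 doublings per hour corresponds to a burst size of about 3.1% of wild type, that is, a 32-fold increase would be needed to reach the wild-type level.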

Figure 1

Fitness recoveries of the K and P lines are shown from selected isolates spanning their respective recovery phases; standard error bars are included for each point but are often obscured by the point. The top boundary of the graph (value 26.9) represents the fitness of wild-type phage, hence is the minimum fitness expected if recovery is complete (passaging of the wild-type virus might well have led to an even higher fitness). Thus, neither the K nor the P line recovered full fitness, though it is possible that additional passaging might have led to further fitness gains. Assay conditions were as described in the text, taken over 60-min intervals. The fitnesses of K115 and P82 are slightly higher here than in Fig. 2, a magnitude of difference that may stem from variation between batches of frozen cells used for the two sets of assays.

Hundreds of Mutations Accumulated Throughout K0 and P0

The original restriction digest data of the K0 and P0 genomes indicated that many mutations had accumulated (Bull et al. 1993; Hillis et al. 1992). Sequencing revealed 404 point mutations, 2 deletions, and 1 insertion in K0 and 299 point mutations and 1 deletion in P0 (Table 1). Ninety-six percent of these point mutations were G → A or C → T transitions, as expected for nitrosoguanidine mutagenesis (Horsfall et al. 1990). The difference in the number of substitutions between K0 and P0 (404 − 299 = 105) is highly significant (χ² [1] = 15.7, p < 0.0001), but this test assumes that the number of substitutions obeys the same sampling process in both lines, an assumption presumably violated by the fact that the identities and effects of early substitutions influence the evolution of later substitutions.
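The reported test statistic can be reproduced with a short Python sketch, assuming the test compares the two counts against a common expected value (their mean). The function name is hypothetical, and the p-value uses the identity that the upper tail of a chi-square distribution with 1 degree of freedom equals erfc(sqrt(x/2)).

```python
import math

def equal_counts_chi2(n_a, n_b):
    """Chi-square (1 df) testing whether two counts share one expected value."""
    expected = (n_a + n_b) / 2.0
    chi2 = ((n_a - expected) ** 2 + (n_b - expected) ** 2) / expected
    p = math.erfc(math.sqrt(chi2 / 2.0))  # upper tail, chi-square with 1 df
    return chi2, p

chi2, p = equal_counts_chi2(404, 299)
```

With 404 and 299 the expected count is 351.5 in each line, and the statistic comes to about 15.7 with p below 0.0001, in agreement with the values in the text.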

Table 1 Classes of nucleotide changes in K0 and P0

The mutations were distributed nearly uniformly across genes. Using a Poisson model of expected numbers of mutations per base and correcting for multiple comparisons, only gene 8 (essential) showed a significant paucity of missense mutations, and genes 1.5 and 4.7 (nonessential; no known function) showed significant excesses of missense mutations. No significant deviations from expectation were found for silent mutations. Mutations from K0 and P0 were combined for these tests to increase power.

The large number of point mutations that evolved in these lines provides statistical power to quantify and compare a variety of properties of molecular evolution. Approximately half of the 59 genes, or 30% of the coding nucleotides, in T7 are considered nonessential, or at least conditionally essential. These genes are designated with fractional numbers, whereas most essential genes have whole numbers; the exceptions are that gene 7 is nonessential and genes 2.5, 6.7, and 7.3 are essential (Dunn and Studier 1983). Under the experimental conditions employed, 0.3 and 3.5 are also considered essential genes. Although some nonessential genes have known beneficial functions, they are not necessary for growth in the laboratory. Nonessential genes (and to a lesser extent, conditionally essential genes) may thus have fewer constraints on their evolution than the essential genes, an expectation supported by the characteristics of the deletions, nonsense mutations, and missense mutations. Deletions removed large portions of 0.7 in K0 and P0 and of 0.6A,B in K0; in rich media, 0.7 is not only nonessential but deleterious (Studier 1979). Across both phages, seven nonsense mutations were detected, in genes 0.4, 4.2, 4.7, 5.7, 5.9, 10B, and 18.5. These are all nonessential genes. The number of nonsense mutations is not significantly different from expected for the number of GC → AT mutations observed (based on an empirical randomization test). Finally, there was a significantly greater rate of missense mutations in nonessential genes than in essential genes in both phages (χ² [1] = 22.1, p < 3 × 10^−6 for K0; χ² [1] = 9.7, p < 0.002 for P0). (Expectations were calculated according to the relative number of bases in essential versus nonessential genes times the total number of missense mutations. This test omitted genes with large deletions or nonsense codons, since there is no functional significance of silent versus missense mutations in nontranslated regions.)
No significant differences were evident for silent mutations in essential genes compared to nonessential genes (p > 0.6 for K0 and for P0). The proportion of missense to silent changes was also significantly higher for nonessential than for essential genes (p < 0.003, two-tailed Fisher’s exact test).

Eighteen nucleotide positions of the T7 genome evolved the same point mutations in both K0 and P0. Parallelisms thus constitute approximately 5% of the total changes. Three of these parallel changes were not GC → AT, a proportion not significantly different from the 4% of all point mutations that were not GC → AT (p > 0.08, two-tailed Fisher’s exact test). However, only ~6 parallelisms of GC → AT are expected from the number of point mutations observed, so the 15 observed constitute a highly significant excess (χ² [1] = 13.5, p < 10^−3).
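One way to arrive at an expectation near 6 is sketched below in Python; it is not necessarily the calculation used in the study. It assumes that GC → AT mutations fall uniformly and independently on G or C positions in each line. The mutation counts (388 and 287, taken as 96% of 404 and 299) and the site count (about 19,200 G or C positions, roughly 48% of the 39,937-base genome) are illustrative assumptions, as are the function names.

```python
def expected_parallelisms(n_line1, n_line2, n_sites):
    """Expected number of sites hit in both lines under uniform, independent hits."""
    return n_line1 * n_line2 / n_sites

def chi2_single(observed, expected):
    """Chi-square contribution of a single observed count against its expectation."""
    return (observed - expected) ** 2 / expected

# Assumed inputs: 388 and 287 GC -> AT mutations, about 19,200 G or C sites.
expected = expected_parallelisms(388, 287, 19200)
excess_chi2 = chi2_single(15, 6)
```

With these inputs the expectation is a little under 6 shared sites, and comparing the 15 observed against an expectation of 6 gives (15 − 6)²/6 = 13.5, the value reported in the text.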

Recovery: K115 and P82

After considerable passaging at large population sizes, fitnesses still had not returned to the original level of wild-type T7. The number of mutations that accumulated during this recovery phase was less than 6% of the number accumulated during the bottleneck phase (Table 2). The recovery did not use artificial mutagenesis, and only 6 of the 35 new mutations were GC → AT. In addition, only 1 of the 33 mutations in coding regions was silent, in striking contrast to the 46% of silent mutations during the bottleneck phase (p < 10^−6, Fisher’s exact test). Approximately one fifth of the missense mutations were in nonessential genes, not significantly different from random. Only two reversions were observed, both in P82.

Table 2 Classes of new nucleotide changes in K115 and P82

Fitness Interactions

The net divergence between P82 and K115 corresponds to a 2% difference over the complete genome, and many of these mutations had deleterious effects. Have these genomes diverged enough to begin exhibiting incompatibilities if recombined? Such incompatibilities comprise the basis of speciation in higher organisms. To assess this possibility, recombinant genomes between K115 and P82 were created by shuffling and ligating fragments created with the restriction enzymes BglII, BanI, and NcoI, and fitnesses were measured for each recombinant (Fig. 2). The fragments that are generated by these three restriction enzymes divide the T7 genome into roughly equal quarters. We designate these four genomic quarters as W, X, Y, and Z (from the left to the right end of the genome), and subscript each fragment with its origin; thus WPXPYPZP denotes P82, and WKXKYKZK denotes K115.

Figure 2

Fitness of recovered lines and recombinants. The height of each bar is the fitness of the genome indicated on the horizontal axis (with 1 standard error indicated by the vertical line). The notation for the genomes follows the text, with W, X, Y, and Z representing approximate quarters of the phage genome and subscripts “K” and “P” representing an origin from K115 and P82, respectively. For visual convenience, the gray portion of each bar between fitness 0 and fitness 10 gives the segment of the genome derived from K115; the white represents that derived from P82. The genotypes are arranged with an increasing (then decreasing) contribution from P82; genome K115 is represented at both ends of the figure.

Fitness interactions (as evidence of genetic incompatibilities) may be assessed by comparing the effect of the same fragment in each of two different genomic backgrounds. For example, the difference between WPXPYPZP and WKXPYPZP is the replacement of the leftmost quarter of genome P82 by the leftmost quarter of K115; likewise the difference between WKXKYKZK and WPXKYKZK is the replacement of the leftmost quarter of the genome of K115 by the leftmost quarter of P82. A difference in the fitness effect of this substitution between the two pairs indicates that the fitness contribution of WP versus WK depends on the genetic composition of the remainder of the genome. If the fitness effect of the substitution depends on the genetic background, the fitness effects are considered to be nonadditive, hence are affected by interactions.

Significant interactions exist for Z/WXY and X/WYZ (t [16] = −4.43, p < 0.00042, and t [16] = −2.769, p < 0.003, respectively) as well as for WX/YZ and XY/WZ (t [16] = −5.2, p < 4 × 10^−5, and t [16] = −2.71, p < 0.015). These interactions are highly significant, even when correcting for multiple comparisons, and they point to an interaction between X and Z, the second and fourth quarters of K115 and P82. In genetic terms, something between gene 4 and gene 7.3 must interact with something between gene 15 and the right end of the phage.

Discussion

This study characterized the nucleotide changes accumulated in phage genomes subjected to a series of population bottleneck events and also characterized the nucleotide changes accumulated in those evolved genomes during subsequent fitness recovery in large populations. The numbers of substitutions recorded in the two phage genomes (299 and 404) are the largest numbers characterized at the nucleotide level in experimentally evolved microbes, although the rate of nucleotide substitution achieved here (~1%) has been achieved in a virus with a smaller genome, φX174 (Bull et al. 1997; Wichman et al. 2000), and has been vastly exceeded in single molecules subjected to directed evolution. It should be clear from these results that a far greater number of changes could be fixed by extending the mutagenesis and bottlenecking, interspersed with adequate recovery, and indeed, this type of protocol was used to achieve nearly 50% substitution rates in a single molecule (DHFR) subjected to alternate mutagenesis and selection for function (Vartanian et al. 2001). Substantially more mutations were fixed during the bottleneck phase than the recovery phase, but further recovery would be expected to yield additional changes and fitness gains. It seems clear, however, that in long-term lineages evolved under this protocol, periodic recovery will be needed to maintain fitness at a level that allows indefinite propagation. A wide range of silent/missense substitution ratios could be achieved by varying the protocol (to mimic different processes of molecular evolution), as the recovery phase fixed almost entirely missense mutations whereas the bottleneck phase fixed a mix of silent and missense changes (as also observed by Escarmis et al. 1999).

A decay of fitness when populations are bottlenecked has been observed in RNA viruses that were not exposed to mutagen (Burch and Chao 1999; Chao 1990; Escarmis et al. 1996). Small populations are known to fix mutations at a high rate nearly independently of their fitness effects (lethal mutations cannot be fixed, of course), and the decline in fitness with this protocol merely reflects the fact that the net effect of many nonlethal but otherwise random mutations is deleterious.

Despite the overall fitness decline during the bottleneck phase, some mutations fixed during this phase may have been advantageous. Convergent substitutions were more common than expected between the two bottlenecked lines, consistent with a role of positive selection favoring individual mutations during the bottleneck phase. It is entirely plausible that positive selection operated during the passages, as both lines were subjected to the same selection pressures of 42°C and tolerance of mutagen, which may have favored the same mutations in both lines (this adaptation may explain some of the fitness decline at 37°C). Furthermore, each bottleneck was interspersed with five mutagenic, large-population cycles in which positive selection could have been highly effective.

Site-directed recombination between the two recovered phages indicated that fitness interactions (nonadditivity) were present in the differentially mutated genomes. Interactions could reflect a speciation-like process, in which the two genomes experience fitness loss when recombining, but the test for interactions does not discriminate this type of interaction from other types, such as the more mundane case in which a mutation is beneficial in two backgrounds but has a smaller benefit in one background than in the other (as in Bull et al. 2000). Inspection of Fig. 2 suggests that the former type of interaction may be present: the swap of fragment ZK into WKXKYKZP (to create K115) had a slight beneficial effect, but the swap of the same fragment into WPXPYPZP had a large negative effect. It is not known how many mutations contribute to the interaction (minimally two) or what genes are involved.

In view of the uniqueness of most mutations introduced into K0 and P0 and the failure of both lines to recover full fitness during subsequent passages, a wide range of recombinants between K115 and P82 might be expected to improve fitness. In principle, a deleterious mutation in one genome could be replaced with the wild-type nucleotide from the other genome. If two genomes carrying different deleterious mutations recombined to restore wild-type sequence, the recombinant would have higher fitness than either parental genome (a type of fitness interaction that was not observed). Our attempts to favor such recombinants between P82 and K115 by growing the phages in mixed culture yielded modest fitness improvements at best (data not presented), but those studies were not pursued because of the great effort that would have been required to characterize a recombinant. A limited advantage of recombination between these heavily mutated lineages is understandable because recombinations of large regions would tend to exchange one set of deleterious genes for another, instead of restoring the wild-type sequence (the site-directed recombinants were no fitter than the parental types, for example). Alternatively, recombinations of short regions would at best restore wild-type sequence for only a small part of the genome and, thus, yield only minor fitness gains.

Mutator strains evolve in bacteria when bacteria are subjected to new or changing environments (de Visser 2002; Miller 1998). In view of the intense selection for compensatory mutations that would improve fitness during the recovery phase, an elevated mutation rate might be expected. Gene 5 encodes the phage DNA polymerase, and mutations in this gene are known to affect the rate of replication and the copy error rate (Tabor and Richardson 1989). Yet no mutations in gene 5 were found in P82 (compared to P0); K115 exhibited one missense mutation in codon 18. So at least the P recovery line appears not to have evolved a sustained increased mutation rate, although it should be considered that other T7 genes may also affect the mutation rate. However, some bacterial lines subjected to long-term selection have also failed to evolve mutator phenotypes, so the pattern is not universal even in bacteria (Lenski et al. 2003).

This study is the first to compare characteristics of many substitutions in nonessential as well as essential genes. During the bottleneck phase of the study, nonessential genes accumulated missense mutations at a higher rate than essential genes. (Some nonessential genes also evolved inactivating deletions or nonsense mutations.) This difference is expected if the missense mutations fixed under this protocol tended to be deleterious, since there should be weaker selection against deleterious mutations in nonessential genes than in essential genes. It is interesting to note that the standard explanation for a high missense/silent ratio is positive selection—that missense changes are mostly adaptive (Li 1997)—whereas here the correct explanation seems to be a relaxation of negative selection. Of course, the protocol used here is not necessarily representative of nature. Further evidence against enhanced positive selection in nonessential genes is that, during the recoveries, missense mutations were not more numerous than expected in nonessential genes compared to essential genes.

The substitutions during the recovery phase were almost entirely new, compensatory changes rather than reversions. Similar observations were made in a study of an RNA virus (Escarmis et al. 1999) and in bacteria (Moore et al. 2000). In contrast, reversions were prominent in some experimental evolutions with the bacteriophage φX174 (Brauer 2000; Crill et al. 2000). One potential explanation for the difference between the φX174 results and the T7 results is that the larger genome of T7 provides more avenues for compensatory mutations, but this hypothesis is not consistent with the paucity of reversions during recovery of FMDV, which has a small RNA genome (Escarmis et al. 1999). Alternatively, the large numbers of mutations per se may have altered the interactions and functional activity levels of T7 proteins so much that the range of potential compensatory mutations was greatly expanded over what would obtain with few mutations. It is surprising, however, that these compensatory mutations—which by their very nature are epistatic—did not lead to extreme epistatic interactions among segments of the evolved genomes (as revealed in Fig. 2). Previous work has found that a mix of synergistic and antagonistic epistatic effects can somewhat cancel each other, a possibility to consider here (Elena and Lenski 1997).

Although this study has been considered in the context of molecular evolution, it can also be considered a study of viral attenuation. Viruses that have been attenuated for reduced virulence are commonly used as vaccines. Attenuation has typically been achieved by adaptation of viruses to novel conditions, with the goal of fitness loss in the original host (Badgett et al. 2002; Fenner and Cairns 1959; Flint 2000). These live viruses are then used as a vaccine, because the novel-adapted virus is no longer able to cause disease in the original host. Some such attenuated viruses are prone to revert to virulence when introduced back into the original host, however, when the number of attenuating mutations is small. The present study offers an alternative means of attenuating a virus: fixing deleterious mutations through repeated bottlenecks. The slow fitness recovery observed here shows that such viruses could be rendered unable to revert to virulence in any meaningful period of time. Three problems are apparent with this method, however: (i) The attenuation is not repeatable at the level of individual mutations, because independent lines will fix different deleterious mutations. (ii) The debilitated viruses are not easily grown to high titers for vaccine production, in contrast to a virus attenuated by adaptation to the culture conditions in which it is harvested for vaccine production. (iii) The accumulation of many mutations may alter the antigenic profile of the virus and thus reduce its efficacy as a vaccine against the wild strain (a possibility suggested by a reviewer). Despite these drawbacks, an understanding of the different processes that could be used to create attenuated viruses is likely to help improve methods in the future.