Introduction

In the quest to understand more deeply the evolution of life, one fruitful approach has been to assemble in vitro all of the components necessary to sustain an evolving system. By doing so, researchers hope to delineate the essential components of such a system, and then to utilize the system to address core concepts in evolutionary biology. Much of the frustration of evolutionary biology comes from its retrospective nature, which can add great challenge to the formulation and testing of hypotheses. In a broad sense, this is the power of experimental evolution—to provide a means for testing evolutionary hypotheses that previously were unapproachable by conventional methods. Of the many experimental evolution systems described in this special issue, only one is considered here, that of continuous in vitro evolution of nucleic acid enzymes. The system is termed “continuous” because it does not require user intervention at successive steps of selection, amplification, and mutation. This approach is of special interest because it currently is the most biologically realistic system that has been assembled entirely in vitro.

The purpose here is not to provide a comprehensive review of continuous in vitro evolution (for such a review see Joyce 2004). Rather, the intent is to summarize efforts that led to the construction of a continuous in vitro evolution system and to provide a candid discussion of the strengths and limitations of the system with an eye toward its future development and applications. The hope is to advocate the design of experiments that capitalize on the strengths of the system, while mitigating its limitations.

The first success in establishing an in vitro evolution system was the work of Spiegelman and colleagues nearly 40 years ago, employing the replicase protein and genomic RNA from Qβ bacteriophage (Mills et al. 1967). The investigators utilized a serial transfer technique to bring about evolution of the RNA. The simplicity of the reagents and experimental manipulations belies the significance of these seminal experiments. They had all of the salient features of in vitro evolution: Qβ RNA provided heritable information, Qβ replicase provided the mechanism for replication and the possibility of introducing occasional mutations, and the serial transfer protocol provided the selective pressure. In retrospect, the results of such a simple system were entirely consistent with expectations. With such a strong and straightforward selection pressure, the evolving RNA molecules quickly shed all unnecessary residues, leading to highly contracted sequences that still could code for recognition and replication by Qβ replicase while avoiding strong secondary structures that would impede the rate of replication. The simplicity of the system, however, also limits its broader applicability. The winning combination under a broad range of conditions is simply the minimum sequence necessary for replication. There is little exploration of sequence space under this selection regime, and the corresponding fitness landscape is likely a deep optimum in the neighborhood of the minimum sequence. Nonetheless, the Qβ evolution system has been used to examine the evolution of resistance to either ethidium bromide or a nucleoside triphosphate analog, which were imposed as added selection constraints beyond the requirement for rapid replication (Levisohn and Spiegelman 1969; Saffhill et al. 1970).

About 20 years after the work of Spiegelman and colleagues, several other methods for nucleic acid amplification were developed that largely are indifferent to the sequence being amplified. The most widely utilized of these methods is the polymerase chain reaction (PCR), which employs two sequence-specific DNA primers, the four deoxynucleoside triphosphates, and a thermostable DNA polymerase (Saiki et al. 1985; Mullis and Faloona 1987). Amplification of RNA can easily be achieved by reverse transcribing the RNA to complementary DNA (cDNA), performing the PCR, and then transcribing the cDNA back to RNA.

Another amplification method, which, unlike the PCR, takes place at a constant temperature, is “self-sustained sequence replication” (Guatelli et al. 1990), also referred to as “nucleic acid sequence-based amplification” (Compton 1991) or “transcription mediated amplification” (Hill 1996). In this procedure a combination of reverse transcriptase and RNA polymerase is used, with optional inclusion of RNase H (Fig. 1). There are two primers, one that is complementary to the 3′ end of the RNA and another that is complementary to the 3′ end of the corresponding cDNA and includes an RNA polymerase promoter sequence at its 5′ end. The first primer is extended by reverse transcriptase to form a cDNA copy. RNase H, if present, degrades the template RNA. Alternatively, the RNase H activity associated with reverse transcriptase brings about this degradation. The second primer then binds to the cDNA, and reverse transcriptase extends both the primer and the cDNA to give rise to double-stranded DNA that contains a functional RNA polymerase promoter. The RNA polymerase (typically T7 RNA polymerase) then generates new copies of RNA, and the entire cycle is repeated. Amplification is achieved because each cDNA can serve as the template for the production of tens to hundreds of copies of RNA.

Figure 1

Scheme for isothermal RNA amplification. A primer complementary to the 3′ end of the RNA (solid line) is extended by reverse transcriptase to yield cDNA (open line). The RNase H activity of reverse transcriptase (or added RNase H) degrades the RNA. A second primer, containing an RNA polymerase promoter sequence, binds to the 3′ end of the cDNA. Reverse transcriptase extends both the primer and the cDNA to generate double-stranded DNA containing a functional RNA polymerase promoter upstream of the target sequence. RNA polymerase then produces tens to hundreds of copies of the RNA.

Each of these three RNA amplification methods has its strengths and limitations with regard to its application to in vitro evolution. The Qβ system is the most highly sequence specific, and imposes the strongest selection pressure for fast replication rates, thereby favoring shorter and less structured RNAs. PCR amplification is the most general with regard to sequence, and because of temperature cycling and sufficiently long extension times, allows even the most slowly copied sequences to be copied in their entirety. Although short amplicons can be enriched by the PCR, they do not enjoy the exponential advantage that occurs with continuous amplification systems such as Qβ amplification and isothermal RNA amplification (Bull and Pease 1995). Like the PCR, the isothermal RNA amplification system is generalizable with regard to sequence, but like Qβ amplification it is a continuous amplification system that leads to rapid enrichment of the most advantageous molecules and is vulnerable to being taken over by shorter sequences that may arise during the course of evolution.
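The contrast between cycled and continuous amplification can be made concrete with a toy calculation. The sketch below is an idealization, not a model taken from the cited studies: it assumes that copying time scales linearly with template length, that a continuous system lets each template start a new copy as soon as the previous one finishes, and that thermal cycling gives exactly one doubling per cycle to any template that fits within the extension step. All lengths, times, and function names are hypothetical.

```python
def doublings_continuous(length_nt, total_time, time_per_nt):
    """Doublings completed in a continuous system, where one copy
    takes length_nt * time_per_nt and copying restarts immediately."""
    return total_time / (length_nt * time_per_nt)


def doublings_cycled(length_nt, cycles, extension_time, time_per_nt):
    """Idealized thermal cycling: one doubling per cycle if the template
    can be copied in full within the extension step, otherwise none."""
    fits = length_nt * time_per_nt <= extension_time
    return cycles if fits else 0


def enrichment(doublings_a, doublings_b):
    """Fold enrichment of template A over template B."""
    return 2.0 ** (doublings_a - doublings_b)


# Hypothetical templates: a 50-nt parasite and a 150-nt ribozyme.
SHORT_NT, LONG_NT = 50, 150

continuous = enrichment(
    doublings_continuous(SHORT_NT, total_time=300, time_per_nt=1.0),
    doublings_continuous(LONG_NT, total_time=300, time_per_nt=1.0),
)
cycled = enrichment(
    doublings_cycled(SHORT_NT, cycles=6, extension_time=200, time_per_nt=1.0),
    doublings_cycled(LONG_NT, cycles=6, extension_time=200, time_per_nt=1.0),
)
print(f"continuous: {continuous:.0f}-fold, cycled: {cycled:.0f}-fold")
```

Under these assumptions the shorter template gains an advantage that compounds with time in the continuous case, while the cycled case shows none. The idealization deliberately ignores the modest enrichment of short amplicons that real PCR can produce.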

Initial attempts to devise a continuous in vitro evolution system based on isothermal RNA amplification were foiled by the emergence of short molecular parasites that did not conform to the intended selection constraints. For example, Breaker and Joyce (1994) devised a scheme for the selective isothermal amplification of RNA molecules that could ligate an oligonucleotide substrate to their own 5′ end. The starting population of molecules was derived from the bI1 self-splicing group II intron of yeast mitochondria, which has a low level of activity in ligating a 3′-hydroxyl-terminated RNA substrate to its own 5′ triphosphate, forming a phosphodiester linkage and releasing inorganic pyrophosphate (Mörl et al. 1992). By employing a substrate that had the sequence of the promoter for T7 RNA polymerase, it was hoped that only catalytically active molecules would be eligible for amplification. In theory, molecules that performed the chemistry more quickly would be amplified sooner and, thus, have a geometrically compounding selective advantage. Despite the apparent requirement for catalytic activity, however, new RNA sequences soon emerged that could satisfy the selection criteria without ever having performed RNA ligation. These evolved RNA molecules had acquired an internal T7 promoter sequence just downstream of a hairpin structure that was derived from the original group II ribozyme. With this configuration, reverse transcription of the RNA formed a cDNA that had a 3′-terminal hairpin, which could prime its own second-strand synthesis, giving rise to a functional double-stranded promoter. This double-stranded DNA in turn was transcribed to produce additional copies of the RNA, with each RNA inheriting its own promoter (Breaker and Joyce 1994).

While this result fell short of the goal of achieving continuous in vitro evolution of catalytic function, it nonetheless revealed an intriguing evolutionary outcome that reflected the imposed selective pressure in a way that was not anticipated by the experimenters. Researchers with a specific evolutionary goal need to be exceptionally careful to devise a selection scheme that will deliver the intended outcome and not be thwarted by charlatan molecules. However, for the researcher who wishes to study the evolutionary process itself, there often are interesting and sometimes unexpected answers to the imposed selective query.

Continuous In Vitro Evolution of Ribozymes

Subsequent attempts to engineer a system that enables the continuous in vitro evolution of catalytic RNAs eventually proved successful (Wright and Joyce 1997) (Fig. 2). These later studies employed nearly the same scheme as had been devised previously (Breaker and Joyce 1994), but began with a different RNA catalyst that had a much higher level of activity in the desired reaction. The catalyst was a form of the class I ligase ribozyme, which brings about the joining of an RNA substrate to the 5′ end of the ribozyme with high catalytic efficiency (Bartel and Szostak 1993; Ekland et al. 1995). This ribozyme originally was selected from a pool of 10¹⁵ RNA molecules containing 220 random-sequence nucleotides. It was further improved by additional randomization and selection, culminating in the identification of a particular molecule, designated b1-207, that has especially desirable catalytic properties (Ekland et al. 1995). Under optimal reaction conditions, and when modified to operate in a multiple-turnover format, the b1-207 ribozyme exhibits a kcat of 16 min⁻¹ and a Km of 0.23 μM, measured at pH 8.0 and 22°C in the presence of 60 mM MgCl2 (Bergman et al. 2000).

Figure 2

Scheme for the continuous evolution of ligase ribozymes. The ribozyme catalyzes attachment of a chimeric DNA–RNA substrate to its own 5′ end, releasing inorganic pyrophosphate. The substrate has the sequence of an RNA polymerase promoter, ending in one or more ribonucleotides (open and solid lines represent DNA and RNA, respectively). A primer binds to the 3′ end of the ribozyme and is extended by reverse transcriptase to yield cDNA. Only ribozymes that have ligated the substrate will contain a functional double-stranded promoter, allowing selective transcription by RNA polymerase to produce new ribozyme molecules, each with a 5′ triphosphate. (Adapted from Wright and Joyce 1997.)

The b1-207 ligase ribozyme was used as the starting point for renewed attempts to develop a continuous in vitro evolution system. However, the ribozyme first had to be adapted to the reaction conditions and catalytic requirements of the continuous evolution system. The sequence of the substrate-binding portion of the ribozyme (P1 stem) had to be made complementary to the T7 promoter sequence. Unfortunately, this resulted in a drastic reduction in catalytic rate, which was insufficient to sustain continuous evolution. Many rounds of noncontinuous (stepwise) in vitro evolution were carried out, resulting in ribozymes with a catalytic rate that was nearly sufficient to support continuous evolution. However, those molecules still were adapted to reaction conditions that would not be compatible with reverse transcriptase and RNA polymerase.

Next a “rapid” evolution scheme was implemented that progressively adapted the ribozymes to the conditions required for continuous evolution. The ribozymes first were allowed to react with the substrate for 5 min, then were transferred to a mixture containing all of the components necessary for selective isothermal amplification. A small portion of the completed amplification mixture was used to seed the next ligation reaction. After 100 rounds of this procedure, a population of ligase ribozymes was obtained that was suitable for continuous evolution (Wright and Joyce 1997). One example of these optimized molecules is shown in Figure 3a. It has a catalytic rate of >1 min⁻¹ at pH 8.5 and 37°C in the presence of 15 mM MgCl2.

Figure 3

Important milestones in the continuous evolution of the class I ligase ribozyme. Secondary structure is based on Ekland et al. (1995). Boxes represent the 5′ portion of the substrate and the 3′ end of the ribozyme, which are immutable during evolution. The substrate is shown bound to the ribozyme through base-pairing interactions that comprise the P1 stem. Numbered residues are discussed in the text. a Prototype ribozyme isolated from the population prior to the start of continuous evolution. b E100-3 ligase ribozyme, obtained after 100 transfers of continuous evolution. c B16-19 ligase ribozyme obtained after 16 additional transfers of continuous evolution, carried out in the presence of progressively lower concentrations of MgCl2. Highlighted residues indicate the constellation of nine mutations relative to the E100-3 ribozyme. d Ligase B ribozyme obtained after 40 additional transfers of continuous evolution, carried out in the presence of progressively higher concentrations of a ribozyme-cleaving DNA enzyme. Highlighted residues indicate mutations relative to the E100-3 ribozyme.

The ribozyme sequence shown in Figure 3a is an important landmark on the fitness landscape of the continuous evolution process. However, it represents only one member of a potentially diverse population of approximately 10¹¹ molecules that were used to initiate continuous evolution, and together these molecules represent only a minute sampling of the >10⁷⁵ possible sequences of this length that define the entire fitness landscape. Thus it is not known what ancestral sequence ultimately proved to be the one that seeded the successful lineage. Considering the coalescent events that must take place during continuous evolution and the tremendous number of generations of selective amplification that occur, it can be said with near certainty that all of the molecules that eventually came to dominate the population derived from a single molecule that was present in the starting pool (barring early recombination events). While it would be interesting to trace the evolutionary path of the original sequence that ultimately succeeded over all others, in practice one is faced with a situation that is similar to reconstructing organismal evolutionary relationships from extant species and perhaps a smattering of relevant fossils. A fundamental difference is that, at least in principle, in vitro evolution allows access to all of the sequences from all of the past generations. This assumes, of course, that one has the technical capability to sequence and analyze all of these molecules.
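The scale of this undersampling can be checked with a short calculation. The sketch below assumes a four-letter alphabet and works in log10 units to avoid very large integers. The ribozyme length is not stated in the passage above, so the code only asks what minimum length is consistent with the quoted bound of more than 10⁷⁵ sequences; the function names are illustrative.

```python
import math


def log10_sequence_space(length_nt, alphabet=4):
    """log10 of the number of distinct sequences of a given length."""
    return length_nt * math.log10(alphabet)


def min_length_for(log10_count, alphabet=4):
    """Smallest length whose sequence space reaches 10**log10_count."""
    return math.ceil(log10_count / math.log10(alphabet))


# Values quoted in the text: ~10^11 starting molecules, >10^75 sequences.
LOG10_POOL = 11
LOG10_SPACE = 75

shortest = min_length_for(LOG10_SPACE)
log10_fraction_sampled = LOG10_POOL - LOG10_SPACE

print(f"shortest length with at least 10^{LOG10_SPACE} sequences: {shortest} nt")
print(f"fraction of that space sampled: at most 10^{log10_fraction_sampled}")
```

On these numbers the starting pool covers no more than about one part in 10⁶⁴ of the landscape, even if every molecule in the pool were unique.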

From this complex and somewhat uncertain beginning on the fitness landscape of continuous evolution, the inexorable march to the nearest peak began. The conditions that define this landscape are largely the requirements for efficient ligase activity in the presence of the two polymerases and other components of the reaction mixture (buffer, salts, nucleotides) under a particular choice of pH and temperature. Of particular importance to the function of the ribozyme are (1) the presence of 25 mM Mg2+, which affects the ribozyme’s secondary and tertiary structure, as well as its catalytic mechanism; (2) the presence of 5 μM substrate, which is well in excess of the Km; and (3) the pH of 8.5, which enhances reactivity of the 3′ hydroxyl of the substrate compared to neutral-pH conditions, but also has a destabilizing effect on the secondary and tertiary structure of the ribozyme. Efficient substrate binding and rapid chemical ligation are essential for survival in the context of continuous evolution. In addition to providing a competitive advantage, rapid ligation is necessary to keep ahead of reverse transcriptase, which would otherwise inactivate the ribozyme by converting it to an RNA–DNA heteroduplex.

Under these initial reaction conditions, 100 serial transfers were performed, with approximately 1000-fold dilution prior to each transfer. The overall dilution was 10²⁹⁸, carried out over 52 h. After the 100th transfer, the evolved ribozymes were cloned and sequenced in order to assess the genetic composition of the population. A typical clone, designated E100-3 (Fig. 3b), was found to be a greatly improved catalyst compared to the starting ribozymes, with a kcat of 21 min⁻¹ and a Km of 1.7 μM. It contains 15 mutations relative to the prototype ribozyme shown in Figure 3a (Wright and Joyce 1997) and 29 mutations relative to the starting form of the b1-207 ligase (Ekland et al. 1995).
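The serial-transfer figures can be turned into per-transfer and per-doubling averages. The sketch below uses only the three numbers quoted in the text (100 transfers, an overall dilution of 10²⁹⁸, and 52 h) and makes one labeled assumption: that amplification exactly offsets dilution, so the number of population doublings equals log2 of the overall dilution. The function name and dictionary keys are illustrative.

```python
import math


def summarize_serial_transfer(transfers, log10_total_dilution, total_hours):
    """Average dilution and growth figures for a serial-transfer series.

    Assumes amplification exactly compensates for dilution, so that
    total doublings = log2(overall dilution).
    """
    total_minutes = total_hours * 60
    doublings = log10_total_dilution * math.log2(10)
    return {
        "mean_fold_dilution": 10 ** (log10_total_dilution / transfers),
        "total_doublings": doublings,
        "minutes_per_transfer": total_minutes / transfers,
        "minutes_per_doubling": total_minutes / doublings,
    }


summary = summarize_serial_transfer(
    transfers=100, log10_total_dilution=298, total_hours=52
)
for name, value in summary.items():
    print(f"{name}: {value:.1f}")
```

With the reported values, the mean dilution works out to roughly 955-fold per transfer, consistent with "approximately 1000-fold", and the population undergoes about 990 doublings, or one doubling every 3 min or so on average.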

Three of the acquired mutations in the E100-3 ribozyme are of particular significance. A U→G change at position 19 within the P1 stem extended this stem from six to eight base pairs, resulting in improved substrate binding (Fig. 3b). A G→A change at the first nucleotide of the ribozyme, together with a compensatory C→U change at the paired nucleotide position, converted the 5′-terminal guanosine triphosphate to an adenosine triphosphate. This is surprising because T7 RNA polymerase prefers to initiate transcription with a guanosine. This mutation alone would be expected to reduce the efficiency of transcription by about two-thirds (Imburgio et al. 2000). Given that there is a substantial decrease in promoter strength, it seems plausible that this mutation (and its compensatory one) benefits RNA-catalyzed ligation to an even greater extent. This emphasizes that selection is operating on the system as a whole, rather than any particular step of the cycle, and demonstrates empirically that selective advantage may involve trade-offs in fitness. Such detailed analysis often is impossible in more complex systems because it may be difficult to exclude confounding variables or plausible neutral explanations. This highlights one of the strengths of the continuous in vitro evolution system.

Despite the detailed analyses that are possible, one should keep in mind that the E100-3 ribozyme is an arbitrary point in sequence space (a milestone after 100 serial transfers) and not necessarily a fitness optimum. The “end point” of this exploration of the fitness landscape depends on many factors, such as the ruggedness of the landscape, population size, population structure, number of generations, mutation rate, and strength of selection. Under continuous evolution conditions, the dominant ribozyme sequence will move quickly to the sequence that is capable of generating the most copies of itself per unit time among all of the sequences that actually have been explored. The efficiency of exploration of sequence space depends critically on the variation that was present in the starting population, as well as the particular mutations that happen to arise. While these may seem to be the only sources of variation, other “illegitimate” sources of variation may occur and corrupt the system by introducing parasitic or contaminating molecules. The continuous evolution system based on the class I ligase ribozyme typically is not susceptible to the sorts of RNA parasites that foiled earlier experiments with the group II ribozyme, chiefly because of the rapid rate of amplification of legitimate ribozymes and the rapid rate of dilution of the evolution mixture. However, rapid amplification and dilution make the system highly vulnerable to cross-contamination, as discussed below.

The B16-19 Constellation and Its Adaptive Peak

Taking the E100-3 sequence as a landmark in sequence space, this molecule has been used as the starting point for several other experiments that sought to examine the effects of various selective pressures and to study the possibility of recurrence among multiple lineages that are subject to the same pressures (Fig. 4). One study sought to evolve ribozymes that could function in the presence of progressively lower concentrations of Mg2+ (Schmitt and Lehman 1999). The concentration of Mg2+ was reduced from 25 to 12.5 mM over the course of 20 transfers of continuous evolution. A magnesium ion binds tightly to a ribo- or deoxyribonucleoside triphosphate, compounds that were present at a total concentration of 8.8 mM. Thus the concentration of free Mg2+ was reduced from 16.2 to 3.7 mM over the course of the experiment. During this process, a ribozyme sequence emerged that became nearly fixed in the population by the 16th transfer, designated “B16-19” (Fig. 3c). This ribozyme contains nine mutations relative to the E100-3 ribozyme, and this set of nine mutations has subsequently been referred to as the “B16-19 constellation.” Interestingly, despite its dominance in the population, the B16-19 ribozyme does not have a faster catalytic rate compared to the E100-3 ribozyme. However, a substantially larger proportion of the B16-19 molecules folds into an active conformation. The B16-19 sequence represents a deep fitness optimum and demonstrates the importance of enabling properties, such as the ability to adopt a properly folded state, to the evolution of catalytic function.
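The free Mg2+ values quoted above follow from a simple subtraction, sketched below. The one-to-one sequestration of Mg2+ by each nucleoside triphosphate is the simplification implied by the text, not a full binding equilibrium, and the function name is illustrative.

```python
def free_mg_mm(total_mg_mm, total_ntp_mm):
    """Free Mg2+ (mM), assuming each NTP or dNTP sequesters one Mg2+ ion."""
    return max(total_mg_mm - total_ntp_mm, 0.0)


NTP_TOTAL_MM = 8.8  # total ribo- plus deoxyribonucleoside triphosphates

start = free_mg_mm(25.0, NTP_TOTAL_MM)
end = free_mg_mm(12.5, NTP_TOTAL_MM)

print(f"free Mg2+ at start: {start:.1f} mM")
print(f"free Mg2+ at end:   {end:.1f} mM")
print(f"fold reduction in free Mg2+: {start / end:.1f}")
```

The point of the calculation is that halving the total Mg2+ concentration reduces the free concentration by more than fourfold, so the selective pressure was considerably steeper than the nominal concentrations suggest.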

Figure 4

Genealogy of the class I ligase ribozyme, beginning with a population of random-sequence RNAs. Arrows are labeled with either the number of rounds of stepwise evolution or the number of transfers of continuous evolution. (Reprinted with permission from the Annual Review of Biochemistry, Volume 73, © 2004 by Annual Reviews.)

Simultaneously and independently of the discovery of the B16-19 constellation, another study involving continuous evolution of the E100-3 ribozyme gave rise to a very similar constellation of mutations (Ordoukhanian and Joyce 1999). That study sought to examine the evolution of resistance of the ribozyme to attack by an RNA-cleaving DNA enzyme. The DNA enzyme was designed to recognize and cleave the P1 stem portion of the ribozyme, destroying its catalytic activity. After 40 transfers of continuous evolution, in the presence of progressively increasing concentrations of the DNA enzyme, a ribozyme emerged (designated “ligase B”; Fig. 3d) that had evolved resistance to the DNA enzyme. Compared to the E100-3 ribozyme, it is cleaved by the DNA enzyme with a 10-fold lower kcat and a 200-fold higher Km. Ligase B is still an efficient ribozyme, with a kcat of 14 min⁻¹ and a Km of 1.3 μM. This corresponds to a catalytic efficiency (kcat/Km) of about 10⁷ M⁻¹ min⁻¹, which is the same as that of the E100-3 ribozyme. Another ribozyme isolated from the evolved population (designated “ligase C”) had a lower kcat of 8 min⁻¹, but also a lower Km of 0.5 μM. Remarkably, ligase B contained seven of the nine mutations of the B16-19 constellation, as well as an A→G change at position 131 (rather than the A→U change seen in B16-19). Ligase C contained all eight of these mutations plus the remaining mutation of the B16-19 constellation.
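The catalytic efficiencies mentioned above can be recomputed directly from the kinetic parameters quoted in the text for E100-3, ligase B, and ligase C. The sketch below only performs the unit conversion and the division; the dictionary layout and function name are illustrative.

```python
def catalytic_efficiency(kcat_per_min, km_micromolar):
    """Return kcat/Km in M^-1 min^-1, given kcat in min^-1 and Km in uM."""
    return kcat_per_min / (km_micromolar * 1e-6)


# (kcat in min^-1, Km in uM), as quoted in the text.
RIBOZYMES = {
    "E100-3": (21.0, 1.7),
    "ligase B": (14.0, 1.3),
    "ligase C": (8.0, 0.5),
}

efficiencies = {
    name: catalytic_efficiency(kcat, km) for name, (kcat, km) in RIBOZYMES.items()
}
for name, value in efficiencies.items():
    print(f"{name}: {value:.2e} M^-1 min^-1")
```

All three values fall between 1 × 10⁷ and 2 × 10⁷ M⁻¹ min⁻¹, which illustrates how a lower kcat can be offset by a lower Km.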

The mutations that are present in ligases B and C can be rationalized in two ways: they may be reflective of the B16-19 constellation, which confers increased fitness in its own right; or they may be the result of evolution of resistance to the DNA enzyme. Supporting the latter possibility, the C→G change at position 5 and the U→A change at position 7 both disrupt binding of the DNA enzyme to the ribozyme (Fig. 3d). The G→C change at position 121 and A→U change at position 119 can be viewed as compensatory mutations that restore Watson–Crick pairing in light of the mutations at positions 5 and 7, respectively. All four of these mutations also happen to be present in the B16-19 constellation. Ligases B and C also contain an inserted U at position 21, not present in the B16-19 constellation, that further disrupts binding of the DNA enzyme to the ribozyme. The other mutations of the B16-19 constellation that are present in ligases B and C might contribute to DNA enzyme resistance by enhancing the ability of the ribozyme to adopt a folded state, which would render it relatively protected from cleavage compared to improperly folded molecules.

Does the evolution of resistance to the DNA enzyme represent an exaptation of the B16-19 phenotype? Perhaps if continuous evolution were carried out merely to select for improved substrate binding, the B16-19 constellation would appear without the population ever having been exposed to the DNA enzyme. The selective pressure of repeated serial transfers alone may be sufficient to explain the rise of the B16-19 constellation in ligases B and C. Consequently, these adaptive mutations for increased fitness in the general context of continuous evolution might have a co-opted fitness benefit for DNA enzyme resistance—precisely the sense of the term exaptation, as described by Gould and Vrba (1982). Alternatively, one might view the development of DNA enzyme resistance as adaptive and regard enhanced folding as exaptive. However, the number of independent experiments with different selective pressures, all converging on the B16-19 constellation, argues against this interpretation. Ligase B and, especially, ligase C contain other mutations in addition to the B16-19 constellation, and these might be regarded as adaptive, or in some cases neutral, mutations. The inserted U at position 21, for example, has not been seen in any other continuous evolution experiment and likely represents a true adaptation to the presence of the DNA enzyme.

Other continuous evolution experiments starting with the E100-3 ribozyme and employing various selective constraints all came upon at least seven of the nine mutations in the B16-19 constellation. One case involved selection for three successive nucleotidyl addition reactions, two NTP additions followed by RNA ligation (McGinness et al. 2002). This resulted in an evolved ribozyme that contains all nine of the B16-19 mutations, although with an A→GUUU rather than an A→U change at position 131. In another study, the E100-3 ribozyme, which normally operates at pH 8.5, was evolved to operate at either pH 5.8 or pH 9.8 (Kühne and Joyce 2003). The low-pH condition resulted in an evolved ribozyme that has eight of the nine mutations of the B16-19 constellation, while the high-pH condition led to those same eight mutations together with an A→G change (rather than the usual A→U change) at position 131.

The observation of the prevalence of the B16-19 constellation in so many experiments led Lehman (2004) to investigate more formally the evolutionary recurrence of this genotype. A population of randomized variants of the E100-3 ribozyme was constructed, then split into 13 lineages. Five of these lineages were selected for activity in the presence of reduced concentrations of Mg2+, four for activity in the presence of reduced Mg2+ and reduced substrate, and four for activity under standard reaction conditions. During the evolution process strenuous efforts were made to prevent cross-contamination so as to yield truly independent results. All 13 lineages acquired the B16-19 constellation by the 15th transfer, and many were dominated by this sequence by the sixth transfer. With such a unanimous result, Lehman appropriately reexamined the starting population for representation of the B16-19 genotype, and no such sequence was found.

In another study concerning the possibility of recurrence in experimental evolution, at least two distinct adaptive solutions to the imposed selection pressure were obtained (Hanczyc and Dorit 2000). However, that system involved stepwise rather than continuous evolution and was carried out for only 15 rounds. Thus the amount of selective amplification was much lower than for a typical continuous evolution experiment. Nonetheless, four of the five lineages converged on a similar genotype, with near-fixation of two mutations that had been found previously and had been implicated in improved substrate binding (Beaudry and Joyce 1992). It is not known whether, given enough time and selective pressure, the fifth lineage also would have converged on this solution. The fifth lineage appears to have become trapped in a local fitness optimum that is less advantageous than the one reached by the other four lineages, although the barrier that may separate these two optima has not been characterized.

The uniform response to varied selective pressures in the continuous evolution system suggests that the increased fitness associated with improved folding is of paramount importance as an adaptive response to a stressful environment. The B16-19 constellation clearly represents a deep fitness optimum for the adaptive landscape of continuous evolution relative to the E100-3 starting sequence. Although this is the main conclusion to be derived from all of the experiments that have been conducted thus far, one wonders how a single genotype and its corresponding phenotype could so strongly dominate so many different evolutionary lineages.

Strengths and Limitations of Continuous Evolution

In principle, the continuous evolution system could employ any ribozyme that is capable of ligating the 3′ end of a promoter-containing substrate to its own 5′ end, provided the ribozyme can perform the reaction under conditions that are compatible with reverse transcriptase and RNA polymerase. Thus far, however, only descendants of the class I ligase ribozyme have been used in continuous evolution. It also should be possible to carry out continuous evolution with ribozymes that catalyze joining reactions involving a chemistry other than phosphodiester formation, provided the resulting linkage can be traversed by reverse transcriptase and the appropriate reactive group can be restored to the 5′ end of newly transcribed ribozymes. There are several reactions that plausibly could meet these criteria, but none has been demonstrated in the context of continuous evolution.

For all of the selective amplification power of the continuous evolution system, why has there not been broader interest in the research community in using this system to glean new insights into evolutionary processes? Perhaps one reason is that with such a powerful system one has to be exceptionally vigilant in guarding against possible contamination. Even the rarest molecule, whether it arises by mutation or cross-contamination, can quickly grow to dominate the population if it has a significant selective advantage. One can employ a clean-room environment and very careful technique, but if a single foreign molecule is able to gain access to the reaction mixture, it can alter the experimental result. This is where the distinction between theoretical sequences and sequences that have been realized previously becomes significant. Once a high-fitness sequence has been discovered and exists physically in the laboratory, it may be far easier to “rediscover” the same fitness peak in subsequent lineages. This could potentially affect the interpretation of recurrence in evolution and could hinder the exploration of sequence space for more global optima.
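How quickly a single advantaged molecule can take over is easy to estimate under simple assumptions. The sketch below assumes pure exponential growth with a fixed incubation time per transfer, during which the resident population amplifies by a fixed fold (1000-fold here, matching the dilution described earlier). A competitor whose growth rate is higher by a fractional `advantage` then gains a factor of fold ** advantage in its odds at every transfer. The starting fraction and the advantage values are hypothetical, and saturation effects are ignored.

```python
def transfers_until_majority(initial_fraction, advantage, fold_per_transfer=1000.0):
    """Count transfers until a faster replicator exceeds half the population.

    initial_fraction: starting frequency of the faster replicator (0 < f < 1).
    advantage: fractional growth-rate advantage (0.5 means 50% faster).
    fold_per_transfer: amplification of the resident population per transfer.
    """
    if not 0.0 < initial_fraction < 1.0:
        raise ValueError("initial_fraction must lie strictly between 0 and 1")
    if advantage <= 0.0:
        raise ValueError("advantage must be positive for takeover to occur")

    odds = initial_fraction / (1.0 - initial_fraction)
    gain_per_transfer = fold_per_transfer ** advantage
    transfers = 0
    while odds < 1.0:
        odds *= gain_per_transfer
        transfers += 1
    return transfers


# Hypothetical case: one contaminating molecule among 10^11 residents.
for adv in (0.1, 0.5):
    n = transfers_until_majority(1e-11, adv)
    print(f"advantage {adv:.0%}: majority after {n} transfers")
```

Under these assumptions a single molecule with a 50% growth-rate advantage reaches a majority within 8 transfers, and even a 10% advantage suffices within 37, both well inside the length of a typical continuous evolution experiment.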

With such a daunting limitation for the operation of continuous in vitro evolution, one wonders why dominance by a single genotype has not been observed more frequently in organismal evolution. Naturally evolving populations escape the tyranny of a single genotype because adaptive landscapes typically are not static. This is instructive for considering how best to operate in vitro evolution. Changing the environmental conditions, and hence the fitness landscape, may change the location of the highest fitness peak. Some of the obvious ways to change the milieu in continuous evolution include altering the pH, temperature, salt concentrations, substrate concentration, and concentration of various added compounds. Within the constraints of the continuous evolution system, especially the range of conditions that are tolerated by the polymerase proteins and other components, such environmental changes may not be sufficient to attenuate the B16-19 fitness optimum. Thus one may need to consider other mechanisms by which natural populations can escape deep fitness optima. Some of these include mutation, genetic drift, inbreeding, population subdivision, recombination, and assortative mating (Wright 1932). Inbreeding and assortative mating do not apply to the continuous evolution system as currently implemented, but all of the others can be incorporated in some manner.

Although mutation and selection are intrinsic to the continuous evolution system, a substantially increased mutation rate is needed to balance the strong selective pressure in order to maintain a diverse population. Higher mutation rates would increase the probability of generating molecules that contain multiple mutations. Assuming that sequences that are close in sequence space have similar phenotypes, these higher-error-class mutants would have a greater probability of escaping deep fitness optima. Moreover, the increased mutation rate would result in an increased genetic load on the population, which would reduce the mean fitness of the population. Of course, the mutation rate must not be so high that it outweighs the ability of selection to recover the fittest individuals (Eigen 1971).
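
The relationship between mutation rate and the frequency of multiply mutated copies can be illustrated with a simple binomial calculation. What follows is a minimal sketch, assuming that errors occur independently at each position with the same probability. The function names, the sequence length of 150 nucleotides, and the error rates are hypothetical values chosen for illustration and are not taken from the studies discussed here.

```python
from math import comb


def error_class_fraction(length, error_rate, k):
    """Fraction of copies carrying exactly k mutations.

    Assumes independent, equally probable errors at each of `length`
    positions (a binomial model; an illustrative simplification).
    """
    return comb(length, k) * error_rate**k * (1 - error_rate) ** (length - k)


def multi_mutant_fraction(length, error_rate):
    """Fraction of copies carrying two or more mutations."""
    p0 = error_class_fraction(length, error_rate, 0)
    p1 = error_class_fraction(length, error_rate, 1)
    return 1.0 - p0 - p1


if __name__ == "__main__":
    # Hypothetical 150-nucleotide molecule at three per-position error rates.
    for rate in (0.0001, 0.001, 0.01):
        print(rate, multi_mutant_fraction(150, rate))
```

Under these assumptions, raising the per-position error rate from 0.001 to 0.01 increases the fraction of copies with two or more mutations from roughly 1% to roughly 44%, which conveys how sharply the higher error classes respond to the mutation rate.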

In large populations, artifacts due to stochastic sampling are rare, and hence genetic drift is likely to be weak, especially in the face of strong selective pressures. With the very large numbers of molecules that typically are employed in continuous evolution, genetic drift is not expected to play a significant role in influencing evolutionary outcomes. However, one could envision a strategy of diluting each completed reaction mixture to extremely low concentrations before carrying out the transfer step, perhaps coupled with high-throughput technologies to increase the number of replicate reactions. Because the effective population size over a series of transfers approximates the harmonic mean of the number of molecules at each step, and thus is dominated by the smallest values, this strategy could become viable for increasing the role of drift. Populations of exceedingly small effective size are strongly influenced by stochastic sampling events, even in the presence of strong selective pressures, and thus drift could be made to play a larger role in evolving ribozymes away from deep fitness optima. However, there would need to be a very large number of replicate reactions to maintain an effective overall search of sequence space. It remains to be seen whether such an approach would increase the efficiency of searching, especially considering the technical limitations that would need to be overcome.
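
The effect of a severe dilution step on the effective population size can be sketched using the standard harmonic-mean approximation from population genetics. The function name and the molecule counts below are hypothetical and serve only as an illustration; they do not describe an actual protocol.

```python
def effective_size(sizes):
    """Harmonic mean of the population sizes at successive steps.

    `sizes` lists the number of molecules present at each step, for
    example alternating between the post-dilution bottleneck and the
    fully amplified population.
    """
    if not sizes or any(n <= 0 for n in sizes):
        raise ValueError("sizes must be a non-empty sequence of positive numbers")
    return len(sizes) / sum(1.0 / n for n in sizes)


if __name__ == "__main__":
    # Hypothetical cycle: 10^12 molecules after amplification,
    # 100 molecules carried through each transfer.
    print(effective_size([1e12, 100]))
```

With these illustrative numbers the effective size is close to 200, even though the amplified population contains 10^12 molecules, because the harmonic mean is dominated by the bottleneck.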

Alternatively, substantial genetic drift could be allowed to occur in the continuous evolution system by engineering a mechanism for subdividing the population. One approach might be to utilize a water-in-oil emulsion (Tawfik and Griffiths 1998) that physically isolates a small number of RNA molecules within individual water droplets that also contain the polymerases and other components of the system. This would introduce substantial barriers to dispersal and hence greatly reduce the effective population size, although collectively a large number of ribozyme sequences could still be explored. Partial mixing after each transfer step, just prior to dilution and recompartmentation, would allow some migration between compartments. This would be analogous to Wright’s (1931, 1982) shifting balance theory of evolution, with the three phases represented by (1) exploration of sequence space in numerous independent water droplets, (2) mass selection of the fittest ribozymes within each droplet, and (3) interdeme selection whereby droplets with a higher average fitness contribute more ribozyme molecules to the general population compared to droplets with lower average fitness. As the name implies, this theory requires a careful balance of population parameters, such as mutation, selection, drift, and migration, for the three phases to work in concert. Yet this is a strength of the continuous evolution system, which enables one to control each of these parameters independently and, thus, determine empirically which specific combinations will yield a stable progression of the three phases as required by the theory.
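
The third phase, interdeme selection, can be illustrated with a small calculation. This is a minimal sketch, assuming that each droplet contributes to the pooled population in proportion to the number of molecules it produced. The function name and the numbers are hypothetical and are not drawn from any of the cited experiments.

```python
def pooled_frequency(droplets):
    """Frequency of a variant after pooling all droplets.

    `droplets` is a list of (yield_molecules, variant_frequency) pairs.
    Droplets with higher yield (a stand-in for higher average fitness)
    contribute proportionally more molecules to the pool.
    """
    total = sum(yield_molecules for yield_molecules, _ in droplets)
    if total <= 0:
        raise ValueError("total yield must be positive")
    weighted = sum(y * freq for y, freq in droplets)
    return weighted / total


if __name__ == "__main__":
    # Hypothetical example: one droplet fixed for the variant amplifies
    # tenfold better than a droplet that lacks the variant entirely.
    print(pooled_frequency([(10_000, 1.0), (1_000, 0.0)]))
```

In this example the variant starts in half of the droplets but makes up 10/11 of the pooled molecules, which is the sense in which fitter droplets contribute more to the general population before the next round of dilution and recompartmentation.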

Aside from these steps to escape deep local fitness optima, the most important future direction will be to make continuous in vitro evolution a more realistic model of biological evolution. Currently the only source of mutation within the system is from copying errors by the two polymerases, especially reverse transcriptase. In several of the previous studies involving continuous evolution, researchers have had to step out of the system by performing mutagenic PCR to introduce genetic variation to compensate for the loss of diversity due to strong selection (Ordoukhanian and Joyce 1999; McGinness et al. 2002; Kuhne and Joyce 2003). Alternatively, one might be able to achieve a higher frequency of mutations within the continuous evolution system, for example, by employing more error-prone polymerases or by altering the conditions for amplification. This would increase the opportunity for discovering more global fitness optima, both by lowering the mean fitness of the population and by distributing the population more broadly across sequence space. In addition, the frequency of higher-order mutants would increase, some of which may prove more fit, and lead to the exploration of additional sequences related to these new mutants.

It would be instructive to drive the evolving population of molecules toward the expression of new and more complex behaviors. One way such evolutionary adaptations arise in nature is through the interaction and coevolution of two (or more) species, which may act synergistically, as in mutualism, or antagonistically, as in competition, predation, or parasitism. Previous studies have examined molecular models for both predation and cooperation (Wlotzka and McCaskill 1997; Ellinger et al. 1998), as well as for the evolution of resistance (Ordoukhanian and Joyce 1999). Thus far, however, there has been no demonstration of an evolutionarily stable system involving truly coevolving molecular species. The aim might be to monitor the evolution of two or more groups of ribozymes that are competing for limited resources in a fluctuating environment or to monitor the evolution of both a ribozyme “prey” and its DNA enzyme “predator.” Such studies likely would yield new insights into the mechanisms of evolutionary innovation.

On balance, despite the pitfalls and limitations of the continuous evolution system, it is the most realistic in vitro model for biological evolution. It entails an autonomous cycle of selection, amplification, and mutation that requires little intervention by the experimenter. It offers tremendous power to control almost every aspect of the evolutionary process—from the starting genotypes, to the nature and strength of selection, to the frequency of mutation, to the size and structure of the population—all in a rapid and replicable experimental design with the ability to perform hundreds of “generations” of evolution per day. It is important to recognize, however, that the continuous evolution system is best applied to questions that draw upon the strengths and mitigate the limitations of the system. Clearly, it is better suited to addressing questions for which the inadvertent sampling of points in sequence space that have been realized previously is not detrimental to the problem being studied. With this in mind, the continuous evolution system is a potent tool that can complement other experimental evolution approaches and aid in the elucidation of mechanism and method in evolution.