Some Aspects of E. coli Promoter Evolution Observed in a Molecular Evolution Experiment

Liu, Shumo; Libchaber, Albert

doi:10.1007/s00239-005-0128-x

Published: 11 April 2006

Volume 62, pages 536–550, (2006)

Journal of Molecular Evolution




Abstract

We devised a molecular evolution procedure to evolve E. coli promoter sequences and applied it to observe an arbitrary, nonfunctional sequence evolving into functional promoters. In the experiments, DNA sequence variations were generated with error-prone PCR and were inserted in the promoter region of the cat (chloramphenicol acetyl transferase) gene on a plasmid. Upon transforming the cells, functional promoters on the plasmid were selected according to the chloramphenicol resistance. Within a few cycles of mutation-selection, promoters emerged, and the sequences converged into a small number of groups. In the process, the extended minus 10 type of promoters emerged quickly, and small deletions were often involved in adjusting the length between the −35 and the −10 elements. Our results also suggest a possible selection for promoter stability against mutation.


Introduction

A promoter is a DNA sequence at which RNA polymerase initiates transcription. Promoters are singularly important in the expression of individual genes, as well as in the global regulation of cell physiology in prokaryotes. For example, during the rapid growth phase in E. coli, σ⁷⁰ RNA polymerase holoenzyme is the dominant RNA polymerase species; it utilizes the promoters of a set of genes for rapid growth. When cells enter the stationary phase, σ^S RNA polymerase, which recognizes the promoters of a set of genes for cell preservation, becomes predominant.

The presence of promoters raises two general questions: (1) what sequence composition constitutes a promoter and (2) how that sequence has evolved. A survey of native promoters from an organism shows that the relationship between promoter activity and sequence composition may be subtle. Historically, a promoter consensus was derived for bacteria by compiling the sequences upstream of known genes (Harley and Reynolds 1987; Hawley and McClure 1983; Pribnow 1975; Raibaud and Schwartz 1984; Rosenberg and Court 1979; Seeburg et al. 1977). From a few hundred known E. coli σ⁷⁰ promoters, the consensus was determined to be TTGACA and TATAAT for the −35 and −10 elements, which are separated by about 17 base pairs. As information has accumulated, sequences deviating from the consensus type have been identified. Notably, in the extended −10 type of promoter, a TGn or TGTGn situated immediately upstream of a −10 element can substitute for a −35 element (Barne et al. 1997; Burr et al. 2000; Kumar et al. 1993). Furthermore, a weak −35 element can be compensated for by a UP element, which is an AT-rich sequence located between position −38 and position −60 (Estrem et al. 1998). Needless to say, if the core promoter is embedded within a complex promoter that requires transcription activators, the promoter elements may further deviate from the consensus so that the basal activity can be lowered. For example, the well-known constitutive mutant lac promoter, lacUV5, is a step closer to the consensus than is the wild-type E. coli lac promoter (which requires an activator to be fully active).

The information gained by surveying native E. coli promoters, however, has some limitations. The native promoters in an organism evidently represent only a subset of possible promoters. For example, the consensus promoter is conspicuously missing in E. coli even though it is fully functional in vivo. Promoter sequences seemingly different from the natural consensus have been found by selecting from a random sequence library (Horwitz and Loeb 1986; Horwitz and Loeb 1988; Oliphant and Struhl 1987, 1988). In a recent exploration, by selecting from a random library within the spacer region, a 10-bp-long AT-rich sequence was found to enhance the activity of −35-type promoters (Liu et al. 2004). Perhaps a skewed subset of functional σ⁷⁰ promoters has been selected due to constraints of gene expression regulation, potential conflicts with other cellular functions such as overlapping specificity with σ^S RNA polymerase (Gaal et al. 2001), and, conceivably, some particular events in evolutionary history. More significantly, surveying the native promoters alone reveals little about how the promoter sequences have evolved. What are the essential factors in evolution, and how do these factors influence evolution? What trajectory is a succession of sequences likely to follow? To these questions, experimental evolution may provide more direct answers. Experimental evolution does not retrace the exact course of evolution in nature; instead, it simulates the natural history. By making its own history under defined conditions and in repetitions, experiments may reveal the essential features of evolution.

In the experiments reported here, we study E. coli promoters by evolving them from a nonfunctional sequence. At the onset of each experiment, a nonfunctional sequence is arbitrarily chosen to initiate evolution. To accelerate the evolution, we use mutagenic PCR (also called error-prone PCR) (Cadwell and Joyce 1992) to generate variants. The mutagenized DNA is inserted in a designated promoter region upstream of the cat (chloramphenicol acetyl transferase) gene on a plasmid that transforms E. coli cells. The plasmid confers a degree of chloramphenicol resistance according to the promoter activity of the insert. By growing transformants on agar medium containing chloramphenicol, functional promoter sequences are enriched. The plasmid DNA is extracted and ready for a new round of mutagenesis. The mutation frequencies are tuned to between 0.4% and 18% by adjusting mutagenic PCR amplification rate. The selection stringency is easily adjusted by varying chloramphenicol concentration in culture media. We subject the promoter region to several cycles of mutagenesis and selection until the population becomes mostly active promoters. These evolved promoters are sequenced, and their activities assayed.

We primarily adjust mutation frequency and observe how that frequency influences evolution speed and the diversity of the evolved population. Given the nature of the selection condition, it is not surprising that nearly all of the sequences are recognizable σ⁷⁰ promoters. This result, in turn, indicates that the experimental procedure is functional overall. Furthermore, analogous to the high frequency of the extended −10 promoters found in nature, a large portion of the experimentally evolved promoters belong to the extended −10 type. Interestingly, some dynamic properties shown in the experiments are unexpected. The promoter solutions emerge and populate the sequence pool very quickly, within just a few evolution cycles. The population converges into a small number of groups even at an extremely high mutation frequency. Short deletions, which occur rarely in PCR mutagenesis, are frequently found to bring the −35 and the −10 elements closer to the 17-bp optimal length between them. These and other experimental observations may help us to understand the process of evolution in nature.

Materials and Methods

Cell Culture and Plasmid Preparation

A homologous recombination deficient E. coli strain, TOP10 (Invitrogen, Carlsbad, CA), is used throughout the experiments. The cells grow in 2xYT broth or on 2xYT agar (Q-Biogene, Carlsbad, CA) medium supplemented with antibiotics. Nonselection (regarding the promoter function) medium contains 50 μg/ml kanamycin. Selection medium contains 10 μg/ml kanamycin and various concentrations of chloramphenicol. For the selection process, 25 ml agar medium is poured per petri dish. For the promoter activity assay, a rectangular plate (Omniplate, Nunc brand) is used, and each assay plate contains 30 ml of agar medium. All the cell cultures grow at 37°C. The liquid cultures grow with vigorous shaking. The agar plate culture is incubated at 95% relative humidity.

Plasmid DNA is prepared from 3 ml of saturated cell culture with a QIAprep column kit (Qiagen, Valencia, CA). DNA samples are routinely stored in TlowE buffer (10 mM Tris–Cl, 0.1 mM EDTA, pH 8.0 to 8.5, at room temperature).

PCR and PCR Mutagenesis

A regular Taq polymerase (Roche, Indianapolis, IN) is used in PCR to amplify the promoter region, and Z-Taq (Takara, Madison, WI) is used to amplify and to linearize selection plasmid DNA. Except in Experiment 1, the PCR primer sequences for the promoter region are 5′AGTGCAAGUGCAGCUAGAGACAGCAGACCG3′ (Up) and 5′ATGGUGGCAGGUACCTATAUCTCCTACGAGAA3′ (Down). In Experiment 1, one of the above primers (Up) is 5′AGTGCAAGUGCAGCUAGAGACAGCAGA3′. The primer sequences for linear plasmid are 5′AGCTGCACUTGCACUGGGGACA3′ (vector<lic) and 5′ATAUAGGTACCUGCCACCAUGGAGAAA3′ (lic>vector). In these primers some of the T’s are replaced with U’s for cloning purposes. Regular PCR solution contains 200 mM Tris–Cl, pH 8.4, 2.5 mM MgCl₂, a 0.2 mM concentration of each of the four dNTP’s, a 0.25 μM concentration of each of the primers, and 1.5 units of Taq polymerase in 50 μl. The thermal cycle consists of three steps: (1) denaturation at 94°C for 6 s, (2) annealing at 6°C below the lower melting temperature of the two primers for 6 s, and (3) elongation at 74°C for 6 s for the promoter region or 40 s for the linear plasmid. The total number of cycles varies.

Mutagenic PCR is slightly modified from the method using MnCl₂ (Cadwell and Joyce 1992). Taq polymerase (Roche) is used in mutagenic PCR, and the reaction solution contains 200 mM Tris–Cl, pH 8.4, 7 mM MgCl₂, 0.2 mM dATP, 0.2 mM dCTP, 0.2 mM TTP, 0.7 mM dGTP, a 0.25 μM concentration of each of the primers, 1.5 units of polymerase in 50 μl, and some amount of MnCl₂. MnCl₂ is first made as a stock solution of 30 mM MnCl₂ in 100 mM HCl and stored at −20°C. The usage of MnCl₂ is empirically determined; too much MnCl₂ inhibits PCR. To determine the maximum permissible MnCl₂ concentration, several mutagenic PCR test reactions, containing MnCl₂ between 0.16 and 0.8 mM, are performed. One half of the maximum permissible value, usually around 0.3 mM, of MnCl₂ is used in the mutagenic PCR.

Mutagenic PCR produces mostly heteroduplex DNA. Before cloning, a one-step extension is applied to convert heteroduplex to homoduplex. In this extension step, 25 μl of mutagenic PCR product is mixed with 75 μl of regular PCR solution with a 1 μM concentration of each of the two primers. After heating the reaction in a 0.2-ml thin-wall tube to 92°C for several seconds to denature the DNA, the tube is inserted in a metal block to quickly cool down to 50°C. The reaction stays at 50°C for 15 s and then at 74°C for 60 s. After cooling to room temperature (22°C), 1.5 units of Klenow fragment of E. coli DNA polymerase I (New England Biolabs, Beverly, MA) is added to treat the DNA for 10 min. PCR products are routinely examined by 4% agarose gel electrophoresis. If an extra band appears, the DNA band of proper length is excised from the gel and purified with a kit (Zymo Research, Orange, CA).

Ligation Independent Cloning

The PCR-made insert and vector fragments overlap at their ends (Fig. 1). The PCR primers contain two to three uracil (U) bases in place of thymine (T) in the overlap regions. A combined enzymatic digestion with UDG and Endo IV is used to sever the DNA backbone near the U’s and, thus, expose the complementary strand as a long single-strand DNA “sticky end” (Berninger 1993; Rashtchian and Berninger 1992). The enzymatic digestion buffer contains 50 mM Tris–Cl, pH 7.9, and 50 mM KCl. One unit of UDG, 2 units of Endo IV (both from Epicentre, Madison, WI), 0.1 pmol of insert, and 0.1 pmol of vector DNA are added in 20 μl of reaction buffer. Digestion is held at 37°C for 1 h and then shifted to room temperature for 10 min; the mixture is then added to the competent cells for transformation.

Promoter Sequencing

The promoter region is amplified by PCR using 0.2 μl of the liquid sample culture as the template DNA source in a 20-μl reaction volume. The PCR product is purified by ultrafiltration (Millipore, Billerica, MA), eluted in 20 μl of H₂O, and labeled with DYEnamic ET terminator sequencing mix (Amersham, Piscataway, NJ). After purification by ethanol precipitation, the DNA sample is dissolved in 30 μl of H₂O and analyzed by ABI 310 (Applied Biosystems, Foster City, CA). Each sequenced sample is given an identification code of three or four positions separated with a dot, for example, 2.1.8.3×. The first position of the code is the number of the experiment to which the sample belongs. The second is the number of the evolution cycle. The third is an arbitrary serial number. The fourth, assigned to only some samples, is a suffix to mark the promoter activity, i.e., the chloramphenicol resistance level of the transformants. For example, sequence identification “2.1.8.3×” means that it is the 8th sample collected from the 1st cycle of Experiment 2, and the promoter activity is 3×.
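The identification code described above can be read mechanically. The following is a minimal sketch; the `SampleId` type and `parse_sample_id` function are hypothetical names introduced here, and the handling of fractional suffixes such as 0.3× (which themselves contain a dot) is an assumption about how such codes would be written.

```python
from typing import NamedTuple, Optional


class SampleId(NamedTuple):
    experiment: int          # first position: experiment number
    cycle: int               # second position: evolution cycle
    serial: int              # third position: arbitrary serial number
    activity: Optional[str]  # optional suffix, e.g. "3×"; None if absent


def parse_sample_id(code: str) -> SampleId:
    """Split a code such as '2.1.8.3×' into its positions.

    Only the first three dots are treated as separators, so that a
    fractional activity suffix like '0.3×' (an assumed spelling) is
    kept in one piece.
    """
    parts = code.strip().split(".", 3)
    if len(parts) < 3:
        raise ValueError(f"expected at least three positions: {code!r}")
    experiment, cycle, serial = (int(p) for p in parts[:3])
    activity = parts[3] if len(parts) == 4 else None
    return SampleId(experiment, cycle, serial, activity)


print(parse_sample_id("2.1.8.3×"))
```

Codes without a suffix, such as 3.6.73, yield `activity=None`.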

Results

Molecular evolution has three components: mutation, selection, and replication. In the promoter evolution experiments, mutation is introduced in vitro by mutagenic PCR, selection is in vivo through the chloramphenicol resistance associated with the promoter activity, and replication occurs both in mutagenic PCR and in selection.

There are several significant parameters in evolution experiments: mutation frequency, selection stringency, clonal size (i.e., population size, number of independent transformants), and number of evolution cycles. Five experiments are reported here. The primary parameter we alter throughout the experiments is mutation frequency, ranging from 0.4% to 18%. The clonal size, obtained by counting colonies of dilution plating on a nonselection agar plate, is of the order of 10⁵ for the first cycle. For the subsequent cycles, this number may be lowered to about 10⁴. Mutation and selection are described in detail below.

In preliminary studies, we noticed that a very high level of expression of cat inhibits cell growth. Consequently, we set the highest chloramphenicol concentration at 330 μg/ml. In most cases we terminate the experiment after three evolution cycles, at which point the population consists of mostly strong promoters.

Sample clones are collected both before and after selection of each evolution cycle. We evaluate the progress of evolution by assaying promoter activity of the sample clones and by DNA sequencing of the promoter regions. For each sampling point along the evolution course, 96 colony samples are collected for promoter activity assay and a subset of the samples is sequenced.

Mutagenesis

PCR mutagenesis can provide a very high mutation frequency within the designated promoter region. To control the overall amplification and, in turn, the mutation frequency, we dilute the template to a desired concentration and let PCR proceed to saturation. For example, for 100-fold amplification, the initial template concentration is 1/100 of the saturation concentration of PCR. The exact amplification required to achieve a certain level of mutation frequency is estimated empirically for each experiment, and the actual mutation frequency achieved is monitored by DNA sequencing.
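The dilution rule in this paragraph amounts to simple arithmetic, sketched below. The function names are hypothetical, and the conversion of fold amplification into a number of ideal doublings assumes perfect doubling per cycle, which real PCR only approximates; the relationship between amplification and mutation frequency is determined empirically in the experiments and is not modeled here.

```python
import math


def template_fraction(fold_amplification):
    """Initial template as a fraction of the PCR saturation concentration."""
    if fold_amplification < 1:
        raise ValueError("fold amplification must be at least 1")
    return 1.0 / fold_amplification


def ideal_doublings(fold_amplification):
    """Number of perfect doublings giving this amplification (an idealization)."""
    return math.log2(fold_amplification)


# The 100-fold example from the text: template at 1/100 of saturation.
print(template_fraction(100))   # 0.01
print(ideal_doublings(100))     # about 6.6 doublings under the ideal assumption
```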

Among several methods of mutagenic PCR, we use the one with Mn²⁺ (Cadwell and Joyce 1992). This method has a known biased mutation spectrum. Using the initial sequence of Experiment 3 (see below) as the template, 132 point mutations are identified from sequencing results of five separate mutagenic PCR. Among these point mutations, the occurrences of each type are: AT→GC, 93; AT→TA, 10; AT→CG, 0; CG→TA, 13; CG→GC, 1; CG→AT, 2; deletion, 3; and insertion, 0. The biased spectrum can be only partially attributed to the higher AT content of the initial template, 23 AT pairs and 18 GC pairs.
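The statement that template composition only partially explains the bias can be checked with a short tally, sketched here under stated assumptions. The counts are copied from the list above; note that, as listed, they sum to 122 rather than the stated 132, and the sketch uses the listed values without attempting to reconcile the two. Normalizing by the number of AT and GC pairs in the template is an illustrative choice, not an analysis reported in the text.

```python
# Counts of each mutation type as listed in the text.
counts = {
    "AT→GC": 93, "AT→TA": 10, "AT→CG": 0,
    "CG→TA": 13, "CG→GC": 1, "CG→AT": 2,
    "deletion": 3, "insertion": 0,
}
total_listed = sum(counts.values())

# Template composition given in the text.
at_pairs, gc_pairs = 23, 18

# Substitutions grouped by the base pair that was mutated.
at_subs = sum(n for kind, n in counts.items() if kind.startswith("AT"))
gc_subs = sum(n for kind, n in counts.items() if kind.startswith("CG"))

# Substitutions per available site of each kind, and their ratio.
per_at_site = at_subs / at_pairs
per_gc_site = gc_subs / gc_pairs
bias = per_at_site / per_gc_site

print(f"listed total: {total_listed}")
print(f"AT-pair substitutions: {at_subs}, GC-pair substitutions: {gc_subs}")
print(f"composition ratio AT/GC: {at_pairs / gc_pairs:.2f}")
print(f"per-site substitution ratio AT/GC: {bias:.2f}")
```

Under these assumptions an AT pair is mutated roughly five times as often as a GC pair, whereas the template contains only about 1.3 times as many AT pairs as GC pairs.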

Selection Plasmid Construction

A selection plasmid, pCatKp, is constructed to have the following features (Fig. 1): It has a LIC (ligation independent cloning) site upstream of the cat gene coding sequence for promoter insertion. Between cat and the promoter insertion site, there is a Shine-Dalgarno sequence for proper translation. Downstream of cat there is a transcription terminator. The plasmid has a p15A replication origin and a kanamycin resistance marker, kanR.

The plasmid is derived from pCAT3basic (Promega, Madison, WI). The ColE1-derived replication origin on pCAT3basic is replaced with p15A ori to reduce the number of plasmid copies per cell. The original drug resistance marker, bla, is replaced with a more stringent kanamycin resistance marker, kanR, to deter the growth of nontransformants. The f1 ori is also deleted. The multicloning site is replaced with the LIC site sequence made of synthetic oligonucleotides. The plasmid is constructed step by step by ligation of PCR-amplified fragments. We confirmed that the plasmid, either with or without the insertion of a starting sequence for evolution, does not confer noticeable chloramphenicol resistance, i.e., after incubation for 2 days, no colonies are seen on an agar plate containing 1 μg/ml chloramphenicol.

Selection

We use agar surface selection primarily to avoid the possibility that very few clones quickly take over the whole population; presumably such takeover is likely to occur in liquid culture. In addition, it is very convenient to appraise the evolution in progress by observing the frequency and the growth rate of colonies on chloramphenicol plates during selection.

The PCR-mutagenized promoter region is inserted upstream of the cat gene on the selection plasmid, pCatKp, and the plasmid transforms E. coli cells. The transformants first grow on a nonselection plate for 12 h until tiny colonies are visible (about 0.1 mm diameter). The colonies on this plate represent the population before selection. These colonies are transferred onto five selection plates by replica plating (Lederberg and Lederberg 1952). Unless indicated otherwise, the chloramphenicol concentrations of selection plates are 0, 3.3, 10, 33, 100, and 330 μg/ml, abbreviated as 0×, 0.1×, 0.3×, 1×, 3×, and 10×, respectively. The colonies grow on selection plates for 36 h in the first cycle or 18 h in subsequent cycles. An example of colonies on selection plates is shown in Fig. 2. After selection growth, the cells from each plate are scraped off the surface and stored in individual tubes. In order to permit some weaker promoters to evolve further, cell collection follows this schedule: All of the cells are collected from the highest chloramphenicol concentration on which colonies appear. Only 1/10 of the cells are collected from the plate with the second highest chloramphenicol concentration, and 1/100 from the next. Plasmid DNA is extracted from the cells for further evolution and for analysis. By adjusting the duration of colony growth and the chloramphenicol concentration, we always find the proper selection stringency for each evolution cycle.
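The collection schedule can be written out as a small function, shown here as a sketch rather than the authors' procedure. The function name and input format are hypothetical. Two points are assumptions: the 0× plate is excluded from the ranking, and plates below the third-ranked concentration contribute no cells.

```python
def collection_fractions(colonies_appear):
    """Map chloramphenicol concentration (μg/ml) to the fraction of cells collected.

    colonies_appear: dict of concentration -> True if colonies grew on that plate.
    The highest concentration with colonies is collected in full, the second
    highest at 1/10, and the third at 1/100. The 0 μg/ml plate is ignored
    (an assumption), as are any lower-ranked plates.
    """
    ranked = sorted(
        (conc for conc, grew in colonies_appear.items() if grew and conc > 0),
        reverse=True,
    )
    schedule = [1.0, 0.1, 0.01]
    return dict(zip(ranked, schedule))


# Example: colonies appear up to 33 μg/ml (1×) but not at 3× or 10×.
growth = {0: True, 3.3: True, 10: True, 33: True, 100: False, 330: False}
print(collection_fractions(growth))
```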

Chloramphenicol resistance of a culture has a noticeable hysteresis. Namely, cells transferred from a large and saturated colony tend to show more chloramphenicol resistance than those from a small and growing one. This hysteresis may be largely due to the accumulation of cat gene product prior to chloramphenicol exposure, which effectively reduces the intracellular concentration of the drug. Such hysteresis is undesirable because it introduces phenotype variability. Furthermore, σ^S promoters may gain a selective advantage if the cells are allowed to grow to saturation. For these reasons, the duration of growth on a nonselection plate is precisely controlled both in the selection process and in the promoter activity assay.

Promoter Activity Assay

In this section we describe the procedure to assess the activity of the evolved promoters after all the cycles of an experiment have been completed. The procedure is analogous to the selection. One difference is that the clones assayed here are samples taken out of the evolving population, whereas the selection acts on the evolving population itself. Plasmid DNA samples are collected before and after selection for each evolution cycle. The sample plasmid preparations individually transform fresh cells, and the transformants grow on a nonselection agar plate overnight until colonies reach about 0.2 mm in diameter. Isolated colonies are randomly taken and placed individually in the wells of a 96-well culture plate. Each well is filled with 100 μl of nonselection liquid medium. After 4 h of incubation at 37°C, a 96-pin replicating tool (Model 140500; Boekel, Feasterville, PA) is used to transfer about 5 μl of culture from each well onto each of six assay agar plates. (The liquid cultures, after spotting, continue to grow for several hours to saturation and are stored at −20°C with 10% glycerol for DNA sequencing.) Chloramphenicol concentrations in the six assay plates are 0, 3.3, 10, 33, 100, and 330 μg/ml; for convenience, these concentrations are abbreviated 0×, 0.1×, 0.3×, 1×, 3×, and 10×, respectively. The 0× plate serves as a nonselection control. After incubation for 18 h, cell spots on the plates are evaluated. Examples of cell spots on assay plates are shown in Fig. 3.

A chloramphenicol resistance value is assigned to each sample clone according to the growth on the assay plates. Specifically, the value is the highest chloramphenicol concentration of agar on which the cell spot density approaches saturation. For example, if the spot density is near saturation on 0×, 0.1×, 0.3×, and 1× chloramphenicol but decreases sharply on 3× or 10×, the promoter activity is assigned as 1×. In most cases, the chloramphenicol resistance can be clearly scored. However, for about 15% of the clones, the spot density drops more gradually at higher chloramphenicol concentrations. In that case, half of these clones are arbitrarily assigned to the highest chloramphenicol concentrations of full spot density, and the rest are assigned to the next highest chloramphenicol concentrations on which the spots are only about half-grown.
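The scoring rule for the clearly scored clones can be sketched as follows. The function name and the boolean input format are hypothetical, and the sketch does not cover the roughly 15% of clones with a gradual decline, which the text handles by an arbitrary split.

```python
# Assay plate labels in order of increasing chloramphenicol concentration.
LEVELS = ["0×", "0.1×", "0.3×", "1×", "3×", "10×"]


def score_activity(near_saturation):
    """Return the label of the highest plate on which the spot nears saturation.

    near_saturation: six booleans, one per plate in the order of LEVELS.
    Returns None if the spot does not reach saturation on any plate,
    including the 0× control.
    """
    if len(near_saturation) != len(LEVELS):
        raise ValueError("expected one observation per assay plate")
    highest = None
    for label, saturated in zip(LEVELS, near_saturation):
        if saturated:
            highest = label
    return highest


# The example from the text: saturated up to 1×, sharp drop at 3× and 10×.
print(score_activity([True, True, True, True, False, False]))
```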

This assay is not as quantitative as growth rate measurement of liquid culture but is much simpler. The chloramphenicol resistance of a clone can be easily scored. We use the measured chloramphenicol resistance as the value of promoter activity (see Materials and Methods). As reported previously, a stronger promoter defined by the drug resistance also shows higher activity in vitro and in vivo (Horwitz and Loeb 1986, 1988). We have verified that stronger promoter activity confers higher chloramphenicol resistance in liquid culture and that a 10× activity is approximately the level of a fully induced lac promoter (data not shown).

Results of Individual Evolution Experiments

Experiment 1, Selecting Promoters from a Random Sequence Library

In this experiment, promoters are selected from a library of random sequences. This experiment is essentially the same as those reported previously (Horwitz and Loeb 1986; Horwitz and Loeb 1988; Oliphant and Struhl 1987, 1988). We use this experiment to verify the selection procedure. The random library is made of synthetic oligonucleotides consisting of a 40-nucleotide random region flanked by two primer sites (Fig. 4). The single-stranded DNA is converted to double-stranded DNA by one round of replication with primers of the complementary strand (one-step extension; see Materials and Methods). After insertion and transformation, about 1.4 × 10⁴ transformants are subjected to selection. Several clones from each of the 0.3×, 1×, and 3× chloramphenicol plates are sequenced (Fig. 4). Based on sequence similarity, the −10 and −35 elements can be identified for most of the clones. In contrast, no promoters are identified among several samples prior to the selection (data not shown). The results indicate that the selection is effective.

Experiment 2, from a Single Initial Sequence to Functional Promoters, Evolution with Changing Mutation Frequencies

This is our pilot experiment to evolve an E. coli promoter from a single starting sequence. The length of the mutable region is 37 bp (Fig. 5). Prior to this experiment we estimated that a clonal size of the order of 10⁷ to 10⁸ and many cycles would be necessary to evolve a weak promoter. Therefore, we applied a relatively high mutation frequency in the first cycle. In fact, promoter solutions emerge much faster than expected. With a mutation frequency of 7%, promoters already emerge from 5.8 × 10⁴ transformant clones in the first cycle. In the two subsequent cycles, the mutation frequency is reduced to 2%, and clonal sizes are 4.8 × 10⁴ and 3.5 × 10⁶. The range of chloramphenicol concentrations in selection agar plates is between 0.03× (1 μg/ml) and 3×. In this experiment, the population becomes mostly promoters in three cycles (Fig. 6).

A majority of the clones from the third cycle are resistant to 3× chloramphenicol. There are two major types of sequences (Fig. 5). Twenty-eight of 37 sequence samples are similar to 2.3.12 or 2.3.9. Among these clones, the −35 element identified is located in the upstream primer site, and the −10 element in the mutable region. Between −35 and −10, there is an 18-bp AT-rich region. This group of solutions already appears after the first cycle, but as only 1 sequence among 12. After the second cycle, it becomes the most frequent one. This solution is puzzling because it is very different from the starting sequence and is thus unlikely to have been generated by stepwise single-base substitutions.

Seven of the 37 sequences are related to another group represented by 2.3.26. This group has an E. coli consensus −35 element in the mutable region. Downstream of the −35, with a 17-bp spacer, there is an easily identifiable −10 element. It requires only three base pair alterations to obtain this promoter from the starting sequence. A precursor of this solution is already seen after the second cycle. This precursor has two of the three specific mutations. Besides the two major groups, two sequences have deletion mutations.

In this and the next three experiments, the initial sequence is confirmed to be nonfunctional (results are not shown but they can be inferred from the activity distribution curve prior to the first selection). This experiment indicates that a functional promoter is not far from any initial sequence. At a mutation frequency of a few percent, it is unnecessary to find the solutions among a large population.

Experiment 3, Evolution at a Low Mutation Frequency

In this experiment, the initial DNA is a single sequence of 41 bp (Fig. 7). The mutation frequency is 0.4% (13 mutations among 3198 bp examined), achieved by a 10-fold mutagenic PCR amplification. We executed nine evolution cycles in this experiment. The clonal sizes of cycle 1 through cycle 9 are 3.5 × 10⁴, 2.5 × 10⁴, 3.4 × 10⁴, 9.0 × 10⁴, 9.0 × 10³, 1.9 × 10⁵, 2.4 × 10⁵, 4.7 × 10³, and 3.8 × 10⁴. Chloramphenicol resistance increases gradually throughout the cycles and reaches a steady level after six cycles (Fig. 8). In cycles 6, 7, 8, and 9, the mean chloramphenicol resistance of preselection samples becomes high. In other words, the population retains more chloramphenicol resistance after mutagenesis.
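The quoted mutation frequency follows directly from the sequencing counts, as the short sketch below shows. The function name is hypothetical. The observation that 3198 bp is an exact multiple of the 41-bp region is arithmetic only; whether the examined bases actually comprised 78 reads of the mutable region is an assumption not stated in the text.

```python
def mutation_frequency(mutations, bases_examined):
    """Observed mutations per base pair examined."""
    return mutations / bases_examined


REGION_LENGTH = 41  # bp, length of the initial sequence in Experiment 3

freq = mutation_frequency(13, 3198)
print(f"mutation frequency: {freq:.2%}")

# 3198 is divisible by 41; if only the mutable region was counted
# (an assumption), this would correspond to 78 sequenced clones.
clones, remainder = divmod(3198, REGION_LENGTH)
print(f"equivalent 41-bp reads: {clones} (remainder {remainder})")

# Average number of mutations per 41-bp region at this frequency.
print(f"mutations per region: {freq * REGION_LENGTH:.2f}")
```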

The sample sequences after three, six, and nine cycles of evolution are shown in Fig. 7. After cycle 3, two major types of solutions appear (3.3.6 and 3.3.9; Fig. 7). There is also a single-base pair deletion type (3.3.17). This deletion appears more frequently through the next three cycles.

After cycle 6, a 2-bp deletion (3.6.73) appears. This 2-bp deletion becomes dominant after cycle 9. In cycle 6, one clone seems to be a recombinant of two earlier sequences. This recombinant type becomes more frequent after nine cycles. It is remarkable that nearly all the final solutions have the TGTG motif of the extended −10 promoter.

One conclusion from this experiment is that deletion is a very important factor to adjust the relative positions of the two promoter elements and that recombination is also exploited to create promoter sequences.

Experiment 4, Evolution at a High Mutation Frequency

In a way, evolving sequences are analogous to physical particles in an energy landscape: high mutation frequencies could prevent DNA sequences from being trapped in a potential well. It is plausible that, at high mutation frequencies, solutions very different from the ones in Experiment 3 would appear and that the solutions would be more diverse. On the other hand, the quantitative aspects of this notion are uncertain. For example, how high a mutation frequency is sufficient to see the effect? In this experiment, the arbitrary starting sequence is identical to that in Experiment 3. The mutation frequency, however, is increased to 3.6%. The clonal size of the initial cycle is estimated to be between 1 × 10⁴ and 1 × 10⁵. In the subsequent cycles, the sizes are 1.3 × 10⁴ and 1.9 × 10⁴. After three cycles, the population is highly chloramphenicol resistant (Fig. 10). The majority of the clones are resistant to 3× and 10× chloramphenicol.

Two major types of sequences are found after three cycles (Fig. 9). Among 26 samples, 13 clones are similar to 4.3.17; they have a 5-bp deletion. Eleven other clones are related to 4.3.7 and have a 1-bp deletion. These two types are not seen among the 20 samples after the first cycle. There are several other substitution mutations among these sequences, but we do not know how they contribute to the promoter function.

In addition to the above two types, there is a minor type represented by clone 4.3.4. It is very similar to 3.3.9, but we are uncertain whether they have evolved independently in this experiment, or they are from cross contamination with Experiment 3. Furthermore, the sequences related to 4.3.0 are puzzling. They are the same as the starting sequence. Why are they present in the third cycle? Maybe these clones survive on the selection plate due to a titration effect from the neighboring strongly chloramphenicol-resistant colonies on a crowded, low-chloramphenicol concentration plate.

We do find diverse promoter solutions in this experiment. However, a very different solution is likely created by a deletion, rather than by a high level of base pair substitution. In other words, the escape from a trap is likely through “tunneling.” Further refinements are needed to weed out deletion mutants and possible contaminants in order to address the initial questions in this experiment.

Experiment 5, Evolution at an Extremely High Mutation Frequency Over a Longer Sequence Region

In this experiment, mutation frequency is elevated to a practical limit, beyond which PCR becomes difficult. The length of the mutable region is increased to 60 bp (Fig. 11). The logic is similar to that in Experiment 4: At a very high mutation frequency on a longer region, we expect many types of solutions to coexist.

The mutation frequency is about 18%, achieved by three mutagenic PCRs in tandem. Three evolution cycles are carried out in this experiment. The clonal sizes of the three cycles are 1.5 × 10⁴, 2.4 × 10⁵, and 2.4 × 10⁴. The population immediately becomes highly chloramphenicol resistant after one evolution cycle; a majority of the clones are resistant to 3× chloramphenicol. The phenotype distributions of the three evolution cycles are similar (Fig. 12). After each mutagenesis, the clones become mostly nonfunctional. After selection, the clones are mostly resistant to chloramphenicol again.
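To put the mutation frequencies of the experiments side by side, the sketch below converts each into an expected number of mutations per sequence and the fraction of sequences left unmutated. The region lengths and frequencies are taken from the text; treating mutations as independent and uniform across positions is a simplifying assumption (the Mn²⁺ method is known to be biased), and the function names are hypothetical.

```python
# (mutable region length in bp, per-base mutation frequency) from the text.
EXPERIMENTS = {
    "Experiment 2, cycle 1": (37, 0.07),
    "Experiment 3": (41, 0.004),
    "Experiment 4": (41, 0.036),
    "Experiment 5": (60, 0.18),
}


def expected_mutations(length, freq):
    """Mean number of mutated positions per sequence."""
    return length * freq


def fraction_unmutated(length, freq):
    """Probability that no position is mutated, assuming independence."""
    return (1.0 - freq) ** length


for name, (length, freq) in EXPERIMENTS.items():
    print(
        f"{name}: {expected_mutations(length, freq):.2f} mutations per sequence, "
        f"{fraction_unmutated(length, freq):.2e} unmutated"
    )
```

Under these assumptions a sequence in Experiment 5 carries about 11 mutations per cycle and essentially none escape mutation, which is consistent with the observation that clones become mostly nonfunctional after each mutagenesis.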

After the third cycle, among 23 sequence samples, 18 appear to be of the extended −10 type (Fig. 11). Four other clones lost the extended −10 character but appear to have gained a −35 and a −10. Finally, one clone does not have an obvious −10 element coupled to a −35. Among the 18 extended −10 type clones, 15 have the stronger TGTG (Burr et al. 2000) in place of GGTG of the starting sequence. Four of the 18 clones have a single-base pair deletion, which brings the space between the −10 and the possible −35 elements to 17 bp. The results after the first evolution cycle are similar but with a smaller sample size (Fig. 11).

From this experiment we conclude that whatever we try, the set of solutions still converges to very few groups. This experiment also suggests that at a very high mutation frequency, single-base substitutions alone are sufficient to create effective promoters, and hence, deletion plays a less important role.

Discussion

There are two general questions in molecular evolution: (1) what is the end product of an evolution and (2) how that product emerges. We explored these questions by experimentally evolving E. coli promoters from an arbitrarily chosen, nonfunctional sequence. The procedure used in the experiments was effective. With a few iterations of mutation-selection, each lasting less than 2 days, a nonfunctional sequence evolved into a population of promoters.

This selection method is essentially the same as that previously reported by others (Horwitz and Loeb 1986, 1988; Oliphant and Struhl 1987, 1988). The method is based on actual in vivo promoter function, and it is conceptually different from the in vitro, RNA polymerase affinity-based method (Gaal et al. 2001; Xu et al. 2001) used in SELEX. We are confident that the evolved sequences are, by definition, functional promoters. Given that the selection relies on the E. coli σ⁷⁰ RNA polymerase supplied by growing cells, it was no surprise that nearly all of the experimentally evolved promoters bear sufficient similarity to the natural σ⁷⁰ promoter; the exceptions are very few. Apparently, in sequence space, the promoter solutions are largely distributed around the consensus sequence.

With the procedure described here, some parameters, such as mutation frequency, selection stringency, and population size, could be adjusted, within a range. We primarily tested the effect of one parameter: the mutation frequency. We saw that at a higher mutation frequency, the promoter emerged faster. Again, this result was also anticipated. The knowledge gained from the experiments here was mostly concerned with the dynamic properties of promoter evolution. Prior to the experiments, we expected a very slow emergence of very weak promoters initially, with these weak promoters gradually improving through base pair substitutions. We roughly assumed that at least six base pair positions are required to be the same as in the consensus sequence in order for a sequence to have any detectable promoter function. We predicted that with a moderate mutation frequency, the evolved sequences would be “trapped,” meaning that in each cycle, the sequences are of the same type as in the previous cycle. The experiments lent us an opportunity to examine some of the above notions. Many dynamic properties observed in the experiments were actually rather unpredictable, at least not anticipated by us prior to the experiments.

For instance, promoters emerge faster than we originally imagined. At a mutation frequency of about 0.4%, with a population size of less than 10⁶ clones, and in three cycles of evolution, the population shifted from a single non-functional sequence to functional promoters. At higher mutation frequencies, promoters emerged even faster. Certainly, about half of the promoters utilized an unintended potential −35 element within a PCR primer site, and therefore, were only partially evolved. However, after discounting these promoters, in every experiment we still found that promoters completely evolved within the designated region between the mutagenic PCR primer sites.

Several factors may have accelerated the emergence of promoters: among them is the selection scheme. By plating cells in parallel at several drug concentrations, the selection stringency was effectively adaptive to each cycle. Very weak promoters, which would be deterred by stringent selection, were likely enriched in early cycles.

The occurrence of the extended −10 type among native E. coli promoters (Burr et al. 2000) is higher than expected by chance. Because this high frequency could have been due to some unknown transcription regulation functions of extended −10 promoters, our early analysis overlooked the role of the extended −10 type as an alternative path of evolution. The TG or TGTG motif of the extended −10 type appeared in most of the evolved promoters. The frequent appearance of extended −10 sequences led us to think that, because the TG or TGTG motif is very short, an extended −10 is highly likely to arise by mutation. An extended −10 can serve as an intermediate from which a “full-fledged” consensus type promoter can evolve. Apart from any possible transcription regulation advantages, the high frequency of the extended −10 solution among native E. coli promoters (Burr et al. 2000) may be a remnant of evolutionary intermediates.

In addition to the above factors contributing to the quick emergence of promoters, simple statistics indicate that about 3 of the 12 base positions of the consensus promoter would be expected to match in a randomly chosen sequence of 23 bp (covering −35 and −10 with a 17-bp spacer). Therefore, on average, one only needs to alter 3 bp to generate a weak promoter sequence (assuming that six correct positions are sufficient to form a promoter). Our results were in agreement with this analysis. In retrospect, the fast appearance of promoters is not surprising.
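The estimate above can be reproduced with a short calculation. The sketch below assumes that each of the 12 consensus positions (six in the −35 element and six in the −10 element) matches a random base independently with probability 1/4, and it adopts the six-match threshold assumed in the text. The function names are ours, and the binomial model is an illustration for a single fixed alignment, not the analysis used in the experiments.

```python
from math import comb

CONSENSUS_POSITIONS = 12  # 6 bases in the -35 element plus 6 in the -10 element
MATCH_PROB = 0.25         # chance that a random base equals the consensus base
THRESHOLD = 6             # matches assumed sufficient for a weak promoter


def expected_matches(n: int = CONSENSUS_POSITIONS, p: float = MATCH_PROB) -> float:
    """Mean number of consensus matches in a random sequence."""
    return n * p


def prob_at_least(k: int, n: int = CONSENSUS_POSITIONS, p: float = MATCH_PROB) -> float:
    """Binomial probability of k or more matches among n positions."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    mean = expected_matches()
    print(f"expected matches: {mean:.0f}")
    print(f"substitutions still needed on average: {THRESHOLD - mean:.0f}")
    print(f"P(at least {THRESHOLD} matches by chance): {prob_at_least(THRESHOLD):.3f}")
```

Under these assumptions the mean is 3 matches, so 3 further substitutions reach the threshold, and roughly 5% of random sequences would already meet it at a given alignment.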

Throughout the experiments the solutions always converged to very few types. In hopes of capturing many functionally equivalent solutions from each experiment, some measures were taken to preserve diversity. Colonies were selected on an agar surface in order to prevent very few clones from immediately taking over the population. The selection output of each cycle was “doped” with a small fraction of cells collected from less stringent selection plates. Using a multicopy plasmid also reduced the effective selection stringency and, thus, allowed moderate promoters to survive. Despite these arrangements, the evolved solutions still converged into very few groups, even at an extremely high mutation frequency and with a longer mutable region (Experiment 5). Seemingly, the converging “force” of selection was still greater than the dispersion “force” by base pair substitution mutations.

In a case where the evolution continued for nine cycles (Experiment 3), we noticed that, while the resistance of the postselection population remained essentially unchanged, the population after mutagenesis (i.e., before selection) became progressively more chloramphenicol resistant in late cycles. We speculate that this phenotype change was partially due to an implicit selection for stability against the effect of mutation in the experiment. First, it is possible that the sequence itself was less mutable at critical positions. For example, in the particular mutagenic PCR used in the experiments, a GC pair was less likely to mutate, and therefore, the descendants of a sequence with GC pairs in critical positions should be less likely to vary. Second, a promoter in a promoter-dense region in sequence space was likely to evolve into a group of functional variants. Conversely, a sequence in a promoter sparse region tended to lose its descendants. In other words, a promoter that had more positions to “spare” would survive better against mutagenesis. As a special case, a promoter region containing multiple promoters was more likely to retain promoter function than a single-promoter region after mutagenesis. More refined experiments may tease apart the above contributing factors.
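The second argument, that a promoter with positions to “spare” survives mutagenesis better, can be illustrated with a toy model. The sketch below assumes that function requires at least six of the 12 consensus positions to match (the threshold assumed earlier in this Discussion), that each position mutates independently with frequency mu, and that a mutated mismatch becomes a match with probability 1/3. These assumptions, the value mu = 0.05, and the function names are ours; the model ignores spacer length, deletions, and the mutational bias noted above.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Binomial probability of exactly k events in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def retention_probability(matches: int, mu: float,
                          positions: int = 12, threshold: int = 6) -> float:
    """Chance that a sequence with `matches` consensus matches still has at
    least `threshold` matches after one round of mutagenesis."""
    mismatches = positions - matches
    total = 0.0
    for lost in range(matches + 1):           # matches destroyed by mutation
        for gained in range(mismatches + 1):  # mismatches converted to matches
            if matches - lost + gained >= threshold:
                total += (binom_pmf(lost, matches, mu)
                          * binom_pmf(gained, mismatches, mu / 3))
    return total


if __name__ == "__main__":
    for m in (6, 8, 10):
        # mu = 0.05 is an illustrative value, not a measured frequency
        print(m, round(retention_probability(m, mu=0.05), 3))
```

In this model a sequence sitting exactly at the threshold loses function whenever it loses more matches than it gains, whereas a sequence with extra matches tolerates several hits, which is the qualitative behavior suggested in the text.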

Finally, we found it interesting that a large portion of evolved sequences contained small deletions of several base pairs. (Such deletions are likely to have originated in mutagenic PCR.) Given that deletion is rare (only a few percent among all mutations from mutagenic PCR), its role must be significant in promoter evolution. Often, it appeared that deletion reduced the distance between potential −35 and −10 elements to the optimal 17 or 18 bp, for instance, in sequences 2.3.3, 2.3.4, 3.3.17, 3.6.73, 4.3.7, and 4.3.17. In less frequent instances, deletions improved −10 elements, as in sequences 2.3.4 and 4.1.2. Similarly, recombination between heterologous sequences was observed only once throughout the experiments. This event brought −35 and −10 elements from two separate sequences into one. It is plausible that creation and improvement of −35 and −10 elements occur largely through base pair substitutions, and that deletion and recombination bring existing elements into the proper spacing and arrangement.
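The effect of a small deletion on spacing can be shown with a minimal sketch. It scores a sequence by counting matches to the commonly cited σ⁷⁰ consensus hexamers (TTGACA for −35, TATAAT for −10) at a fixed spacer length. The example sequence is hypothetical and was built for illustration; it is not one of the clones listed above, and the helper names are ours.

```python
MINUS_35 = "TTGACA"  # commonly cited sigma-70 consensus -35 hexamer
MINUS_10 = "TATAAT"  # commonly cited sigma-70 consensus -10 hexamer


def count_matches(site: str, consensus: str) -> int:
    """Number of positions at which a hexamer agrees with the consensus."""
    return sum(a == b for a, b in zip(site, consensus))


def best_score(seq: str, spacer: int) -> int:
    """Highest number of consensus matches (out of 12) for any -35/-10 pair
    separated by exactly `spacer` bases."""
    window = 6 + spacer + 6
    best = 0
    for start in range(len(seq) - window + 1):
        m35 = seq[start:start + 6]
        m10 = seq[start + 6 + spacer:start + window]
        best = max(best, count_matches(m35, MINUS_35) + count_matches(m10, MINUS_10))
    return best


def delete_span(seq: str, start: int, length: int) -> str:
    """Return seq with `length` bases removed beginning at index `start`."""
    return seq[:start] + seq[start + length:]


if __name__ == "__main__":
    # Hypothetical sequence: both hexamers present but 20 bp apart.
    seq = "GG" + MINUS_35 + "GC" * 10 + MINUS_10 + "GG"
    shortened = delete_span(seq, 8, 3)  # remove 3 bp from the spacer
    print("score at 17-bp spacing before deletion:", best_score(seq, 17))
    print("score at 17-bp spacing after deletion: ", best_score(shortened, 17))
```

Here no substitution is needed: both elements already exist, and a 3-bp deletion alone brings them to the 17-bp spacing, which mirrors the role proposed for deletions in the text.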

The specific experimental design worked effectively in our experiments and generated some interesting and unexpected results. However, there is plenty of room for further improvement of the design. For instance, the specific mutagenesis method has a skewed mutation spectrum. We observed that overexpression of the selection gene cat could be detrimental to the cell (results not shown). Therefore, a promoter of moderate activity could reach a saturation level of the drug resistance. The selection was for constitutive promoters. The above limitations are not, however, fundamental. The mutation spectrum could be altered almost at will by using different mutagenesis methods (Zaccolo and Gherardi 1999; Zaccolo et al. 1996). The dynamic range of selection can be further extended by using a single-copy plasmid, a weaker ribosome binding site, or another selection marker. To evolve more complex, regulated promoters, an elaborate selection scheme involving counter selection is needed. For that, a single-copy plasmid would be highly desirable in order to avoid titration of transcription regulator proteins.

Because alteration of the RNA polymerases or introduction of another bacterial RNA polymerase may severely perturb the growing cells, this in vivo selection is limited to the bacterial σ⁷⁰ promoters. We do not know how to circumvent this limitation in order to study promoter-RNA polymerase coevolution; certainly such coevolution is an important aspect of speciation. An exception to this limitation may be the evolution of certain phage promoters, such as a T7 promoter. A phage RNA polymerase is specific to the phage’s own promoters. By transforming cells with variants of the phage RNA polymerase gene in addition to the probable selection plasmids, the selection can be applied to the RNA polymerase-promoter pair, and the experimental evolution may re-create a family of the existing RNA polymerase-promoter pairs and create new ones.

Experimental evolution does not reproduce the natural evolution history. In general, evolution in nature involves a large population over a long period of time and is constrained by certain historical conditions, many of which may forever be obscure. In contrast, the experimental evolution course is very short, the population size is small, and the mutation frequency is usually elevated in order to accelerate the process. Nevertheless, the experiments capture the essence of evolution; they are an iteration of mutation, selection, and replication. Experimental evolution permits us to adjust individually the parameters and repeat the process under controlled conditions, and thus, it lets us observe some essential features in the process and test certain hypotheses. In this way, molecular evolution is a bridge between the purely mathematical modeling and the “real world” in nature. Certainly the molecular evolution procedure reported here could also serve as a tool to create promoters for application purposes.

References

Barne KA, Bown JA, Busby SJ, Minchin SD (1997) Region 2.5 of the Escherichia coli RNA polymerase sigma70 subunit is responsible for the recognition of the ‘extended−10’ motif at promoters. EMBO J 16:4034–4040
Berninger MS (1993) Use of exo-sample nucleotides in gene cloning. United States Patent and Trademark Office, Life Technologies, Inc
Burr T, Mitchell J, Kolb A, Minchin S, Busby S (2000) DNA sequence elements located immediately upstream of the −10 hexamer in Escherichia coli promoters: a systematic study. Nucleic Acids Res 28:1864–1870
Cadwell RC, Joyce GF (1992) Randomization of genes by PCR mutagenesis. PCR Methods Appl 2:28–33
Estrem ST, Gaal T, Ross W, Gourse RL (1998) Identification of an UP element consensus sequence for bacterial promoters. Proc Natl Acad Sci USA 95:9761–9766
Gaal T, Ross W, Estrem ST, Nguyen LH, Burgess RR, Gourse RL (2001) Promoter recognition and discrimination by EsigmaS RNA polymerase. Mol Microbiol 42:939–954
Harley CB, Reynolds RP (1987) Analysis of E. coli promoter sequences. Nucleic Acids Res 15:2343–2361
Hawley DK, McClure WR (1983) Compilation and analysis of Escherichia coli promoter DNA sequences. Nucleic Acids Res 11:2237–2255
Horwitz MS, Loeb LA (1986) Promoters selected from random DNA sequences. Proc Natl Acad Sci USA 83:7405–7409
Horwitz MS, Loeb LA (1988) DNA sequences of random origin as probes of Escherichia coli promoter architecture. J Biol Chem 263:14724–14731
Kumar A, Malloch RA, Fujita N, Smillie DA, Ishihama A, Hayward RS (1993) The minus 35-recognition region of Escherichia coli sigma 70 is inessential for initiation of transcription at an “extended minus 10” promoter. J Mol Biol 232:406–418
Lederberg J, Lederberg EM (1952) Replica plating and indirect selection of bacterial mutants. J Bacteriol 63:399–406
Liu M, Tolstorukov M, Zhurkin V, Garges S, Adhya S (2004) A mutant spacer sequence between −35 and −10 elements makes the Plac promoter hyperactive and cAMP receptor protein-independent. Proc Natl Acad Sci USA 101:6911–6916
Oliphant AR, Struhl K (1987) The use of random-sequence oligonucleotides for determining consensus sequences. Methods Enzymol 155:568–582
Oliphant AR, Struhl K (1988) Defining the consensus sequences of E.coli promoter elements by random selection. Nucleic Acids Res 16:7673–7683
Pribnow D (1975) Nucleotide sequence of an RNA polymerase binding site at an early T7 promoter. Proc Natl Acad Sci USA 72:784–788
Raibaud O, Schwartz M (1984) Positive control of transcription initiation in bacteria. Annu Rev Genet 18:173–206
Rashtchian A, Berninger MS (1992) Use of exo-sample nucleotides in gene cloning. United States Patent and Trademark Office, Life Technologies, Inc
Rosenberg M, Court D (1979) Regulatory sequences involved in the promotion and termination of RNA transcription. Annu Rev Genet 13:319–353
Seeburg PH, Nusslein C, Schaller H (1977) Interaction of RNA polymerase with promoters from bacteriophage fd. Eur J Biochem 74:107–113
Xu J, McCabe BC, Koudelka GB (2001) Function-based selection and characterization of base pair polymorphisms in a promoter of Escherichia coli RNA polymerase-sigma(70). J Bacteriol 183:2866–2873
Zaccolo M, Gherardi E (1999) The effect of high-frequency random mutagenesis on in vitro protein evolution: a study on TEM-1 beta-lactamase. J Mol Biol 285:775–783
Zaccolo M, Williams DM, Brown DM, Gherardi E (1996) An approach to random mutagenesis of DNA using mixtures of triphosphate derivatives of nucleoside analogues. J Mol Biol 255:589–603

Acknowledgments

We thank Christina Wakamoto (University of California, San Diego) for invaluable help with the manuscript.

Author information

Authors and Affiliations

NEC Laboratories America, 4 Independence Way, Princeton, NJ, 08540, USA
Shumo Liu & Albert Libchaber
Rockefeller University, 1230 York Avenue, New York, NY, 10021, USA
Albert Libchaber
Department of Physics, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA, 92093, USA
Shumo Liu

Corresponding author

Correspondence to Shumo Liu.

Additional information

[Reviewing Editor: Dr. Laura Landweber]

About this article

Cite this article

Liu, S., Libchaber, A. Some Aspects of E. coli Promoter Evolution Observed in a Molecular Evolution Experiment. J Mol Evol 62, 536–550 (2006). https://doi.org/10.1007/s00239-005-0128-x

Received: 25 May 2005
Accepted: 31 December 2005
Published: 11 April 2006
Issue Date: May 2006
DOI: https://doi.org/10.1007/s00239-005-0128-x
