Introduction

Polymerases can be classified into four categories based on the type of nucleic acids synthesized and templates used: (1) DNA-directed DNA polymerases (DNAPs; DdDp), (2) DNA-directed RNA polymerases (RNAPs; DdRp), (3) RNA-directed DNA polymerases (reverse transcriptases or RTs), and (4) RNA-directed RNA polymerases (RNA replicase; RdRp) (Ahlquist 2002; Castro et al. 2009; Chen and Romesberg 2014). These polymerases can be further divided into single subunit RNA polymerases and multi-subunit RNA polymerases (Werner and Grohmann 2011). The single subunit RNAPs have motifs in common and are thought to derive from a common ancestor but the evolutionary divergence of these polymerases is obscure (Cermakian et al. 1997).

The evolution of genetic information depends in part on the co-evolution of polymerases that can synthesize the informational molecule and at the same time transfer the genetic information in the template through the formation of Watson–Crick base pairing. The earliest polymerases in an RNA World are thought to be ribozymes with RNA-directed RNA synthesis. During the transition from the RNA world to an RNA–protein world, the ribozymes were proposed to be replaced by single subunit protein polymerases with RNA-directed RNA synthesis (RNA replicases). The transition from the RNA–Protein world through the RNA–DNA Retro world to the modern DNA world required a co-evolution of polymerase properties to give RNA-directed DNA synthesis (reverse transcriptases), and DNA-directed DNA polymerases (DNA polymerases, DNA Replicases), respectively. One argument that genetic information has evolved through these different “Worlds” is the modern existence of these coevolved single subunit polymerases. All of these single subunit polymerases presumably initiated synthesis through a primer extension mechanism, since almost all modern single subunit polymerases retain this property.

A major evolutionary step was the evolution of promoters which defined genes and allowed their individual and coordinated expression, along with the co-evolution of domains in RNA polymerases which would recognize promoter sequences. This innovation results in de novo initiation of transcription and provides a way to specifically regulate gene expression. RNAPs are distinguished by their ability to recognize promoter sequences and initiate transcription de novo, rather than extend from a primer (Cheetham and Steitz 2000; Kennedy et al. 2007; Steitz et al. 1994). The initiation phase of transcription is transient and generally occurs while the RNAP is bound to the promoter (Gong and Martin 2006; Liu and Martin 2002). The transition to the elongation phase requires a change in RNAP structure to allow promoter release and processive movement on the template (promoter clearance) (Gong et al. 2004; Martin et al. 1988). This initiation phase is unique to RNAPs and this ability to recognize promoters and initiate de novo is a key step in the evolution of organisms’ ability to transfer and regulate specific genetic information from DNA.

The bacteriophage T7 RNA Polymerase is the prototype of the single subunit RNA polymerases. It is an ideal model system for studying polymerase and promoter evolution. It is related to other bacteriophage RNA polymerases and the mitochondrial RNA polymerases. This group of RNAPs has very conserved structure and sequence. However, the mitochondrial RNA polymerases generally have different promoter sequences from the bacteriophage polymerases (Fig. 1). Bacteriophage polymerases generally have a 23 nucleotide promoter that overlaps the site of transcription initiation by six nucleotides (− 17 to + 6). Mitochondrial RNAPs recognize a diverse variety of promoter sequences which are typically about nine nucleotides in length and run from − 8 to + 1. In Fig. 1, the well-characterized mitochondrial promoter from the yeast mtRNAP is used as a reference for comparison with the bacteriophage RNAPs. The yeast promoter sequence has similarity with the − 8 to + 1 portion of the T7 promoter.

Fig. 1

Comparison of three bacteriophage RNAP promoters and yeast mtRNAP (YmtRNAP) promoter. All sequences are shown in the 5′–3′ orientation. The − 8 to + 1 portion of the bacteriophage promoter is underlined. The nine-nucleotide yeast mtRNAP promoter sequence is aligned with the bacteriophage − 8 to + 1 sequence to highlight their analogous positions relative to transcription initiation. Arrows indicate the site of transcription initiation

The 23 nucleotide promoter of the T7 RNA polymerase can be divided into several functional domains (Fig. 2). The initiation region covers the ten nucleotides from − 4 to + 6. Nucleotides + 1 to + 6 are called the transcription start site. The + 1 site is conserved as a G in all the bacteriophage promoters. Positions + 1 to + 6 are conserved as purines in the bacteriophage promoters. Polymerase contacts in this region are primarily with pyrimidines in the template strand (Weston et al. 1997). Substitution of the nucleotides at + 3 to + 6 decreases promoter strength and initiation efficiency only slightly, while substitution at + 1 and + 2 significantly affects initiation efficiency and specificity.

Fig. 2

The T7 RNAP promoter showing functional domains

The region − 4 to − 1 is called the unwinding region. Positions − 1 [A], − 3 [A], − 4 [T] are conserved among the bacteriophage promoters. This region is invariably AT-rich presumably to aid in melting of the DNA strands at the initiation site, although the conservation of positions − 1, − 3, and − 4 may indicate that they are polymerase contact sites.

The promoter recognition region (13 nucleotides) extends from − 5 to − 17 and consists of polymerase interaction sites important for polymerase binding and positioning. These promoter contact sites interact with two regions of the polymerase. Promoter positions − 5 to − 12 interact with a T7 RNAP structural domain called the specificity loop, located near the carboxyl terminus (Temiakov et al. 2000), while positions − 13 to − 17 interact with a T7 RNAP structural domain called the AT-rich recognition loop located near the amino terminus of the T7 RNA polymerase (Imburgio et al. 2000).
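The coordinates used above skip position 0 (− 1 is followed directly by + 1), which makes region lengths easy to miscount. As a minimal sketch, the Python below maps the regions named in the text onto a 23-nucleotide sequence. The sequence string is the commonly cited T7 class III promoter consensus (non-template strand); it is an assumption supplied here for illustration. The names `T7_CONSENSUS`, `positions`, `region`, and `DOMAINS` are hypothetical helpers, not part of any published tool.

```python
# Assumption: commonly cited T7 class III promoter consensus,
# non-template strand, written 5' to 3', spanning -17 to +6.
T7_CONSENSUS = "TAATACGACTCACTATAGGGAGA"
FIRST_POSITION = -17


def positions(first=FIRST_POSITION, length=len(T7_CONSENSUS)):
    """Return promoter coordinates; there is no position 0."""
    coords = []
    p = first
    while len(coords) < length:
        if p != 0:
            coords.append(p)
        p += 1
    return coords


def region(start, end, seq=T7_CONSENSUS):
    """Return the subsequence from coordinate `start` to `end`, inclusive."""
    coords = positions()
    i, j = coords.index(start), coords.index(end)
    return seq[i:j + 1]


# Region boundaries as described in the text.
DOMAINS = {
    "AT-rich recognition": (-17, -13),
    "specificity loop recognition": (-12, -5),
    "unwinding": (-4, -1),
    "transcription start": (1, 6),
}

if __name__ == "__main__":
    for name, (start, end) in DOMAINS.items():
        sub = region(start, end)
        print(f"{name}: {start} to {end}, {len(sub)} nt, {sub}")
```

Under this assumed sequence, the initiation region (− 4 to + 6) comes out at ten nucleotides and the recognition region (− 17 to − 5) at thirteen, matching the counts given in the text.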

The phylogenetic tree depicting the evolutionary relationships among the single subunit RNAPs has not been rooted and so the evolutionary sequence of the two classes of promoter is unknown (Cermakian et al. 1997). The T7 RNAP can recognize a range of promoter sequences that are closely related to one another and to the consensus sequence (Dunn and Studier 1983; Tang et al. 2005). Details concerning the T7 RNAP promoter structure and function were deduced from studies on mutant promoter sequences (Ikeda et al. 1992; Klement et al. 1990). Single base changes in either the recognition region or the initiation domain affect the efficiency of promoter recognition or the efficiency of transcription initiation, respectively; however, a mutation in the recognition region does not affect initiation, and a mutation in the initiation domain does not affect promoter recognition (Ikeda et al. 1992).

The promoter region from − 17 to − 13 is dispensable, and can be deleted with only small effects on initiation activity; however, optimal promoter recognition requires contacts within this region (Chapman and Wells 1982; Martin and Coleman 1987; Osterman and Coleman 1981). Comparison of the promoter sequences from the three phages T3, T7, and SP6 revealed that all three promoters share a similar core sequence from − 7 to + 1, pointing to a common role of this region in promoter function (Brown et al. 1986; Jorgensen et al. 1991; Fig. 1). There is considerable sequence divergence from − 8 to − 12, corresponding to the specificity loop-recognition region in the T7 RNAP promoter, suggesting that this region plays a key role in sequence-specific contacts. Although there is an 82% homology in the amino acid sequence between T3 and T7 RNAPs, neither enzyme can efficiently initiate transcription at promoters used by the other (Klement et al. 1990; Rong et al. 1998). Promoter specificity studies in which the base pairs at positions − 10, − 11, and − 12 of the T7 RNAP promoter were replaced with the corresponding T3 RNAP promoter residues revealed that base pairs − 10 to − 12 have an important role in promoter binding, with the − 11 base pair being particularly significant (Klement et al. 1990). Base substitutions at positions − 7 (A or G for C), − 8 (A for T), − 9 (A or T for C), and − 11 (T for G) completely inactivated the T7 promoter (Chapman et al. 1988). Methylation interference studies show that methylation of the G-residues at − 7 and − 9 in the template strand and − 11 in the non-template strand interfered with binding of T7 RNAP to the promoter, suggesting that T7 RNAP makes important contacts in the major groove between − 7 and − 11 (Jorgensen et al. 1991).

The yeast Saccharomyces cerevisiae mitochondrial RNA polymerase (YmtRNAP) is homologous to the single subunit bacteriophage T3 and T7 RNAPs (Cermakian et al. 1996; Masters et al. 1987; Matsunaga and Jaehning 2004). YmtRNAP recognizes the simple nine-nucleotide-long promoter consensus sequence ATATAAGTA for transcription initiation, which differs in sequence and length from the promoters of the phage RNAPs (Nayak et al. 2009). However, this sequence can be aligned with the − 8 to + 1 core region of the phage promoters (Fig. 1). The YmtRNAP has a region analogous to the specificity loop of T7 RNAP which by analogy would interact with − 8 to − 5 [ATAT]. However, there is no region in yeast mtRNAP analogous to the AT-rich binding region in T7 RNAP, and so it is not surprising that the yeast mitochondrial promoter consensus does not extend beyond the − 8 nucleotide.

To study T7 RNAP’s ability to use truncated promoters similar to mitochondrial promoters, we have developed an in vitro transcription system based on the one developed by Milligan et al. (1987). Classically, two complementary oligonucleotides of equal length containing the T7 consensus sequence (Fig. 3a, b) could be used to produce run-off transcripts in vitro. Milligan et al. (1987) have shown that the 18 nucleotides, − 17 to + 1, when double stranded and attached to a 5′ extended template are sufficient to act as a promoter in vitro (Fig. 3c). We have modified their procedure by extending the double-stranded promoter region to 20 nucleotides (− 17 to + 3) in order to increase initiation frequency. We further modified their procedure by creating an oligonucleotide which formed an intramolecular double-stranded region of the type shown in Fig. 3d. Under the same conditions, oligonucleotides of this type produced an RNA of the same length and composition as that produced by the oligonucleotides used by Milligan et al. (1987).

Fig. 3

Oligonucleotides used for run-off transcription with T7 RNAP in vitro. Panel a shows two complementary oligonucleotides of equal length annealed to create a 20 bp double-stranded promoter producing a run-off transcript with three G nucleotides at the 5′ end. Panel b shows two complementary oligonucleotides of equal length annealed to create an 18 bp double-stranded promoter producing a run-off transcript with one G nucleotide at the 5′ end, but with slightly decreased initiation frequency. Panel c shows two complementary oligonucleotides of different length annealed to produce a double-stranded promoter with a recessed 3′ end of the type used by Milligan et al. (1987). Panel d shows an oligonucleotide of the type used in the experiments reported in this paper, i.e., a single oligonucleotide internally annealed to produce a hairpin loop with a 20 nucleotide double-stranded promoter and a recessed 3′ end

Similar hairpin oligonucleotides were used by Sarcar and Miller (2018) to show that T7 RNAP could add templated nucleotides to the recessed 3′ ends (primer extension) of hairpin oligonucleotides in the absence of a promoter sequence. This primer extension, end-labeling activity is greater for hairpin oligonucleotides with duplexes between about 3 and 9 base pairs depending on the composition of the duplex and the length of the hairpin loop (Sarcar and Miller 2018).

We have used this oligonucleotide system to determine whether T7 RNAP in the presence of such an oligonucleotide and a single ribonucleotide triphosphate can correctly and efficiently initiate transcription on truncated promoter sequences resembling mitochondrial promoters. Using this technique, we show that T7 RNAP can initiate on truncated promoter sequences similar to mitochondrial promoters. However, transcription does not initiate at the canonical initiation site, but instead at the first unpaired base in the template. In addition, we show that with complete promoters, or mostly complete promoters, and a recessed 3′ end, non-templated, de novo initiated transcripts are produced by abortive initiation as previously described by Ling et al. (1989). In the absence of a promoter and with transiently stable hairpin duplexes, the oligonucleotide itself can be labeled at the recessed 3′ end as previously described by Sarcar and Miller (2018). These results are analyzed in the light of T7 promoter flexibility, promoter evolution, and the use of this technique to produce defined RNAs without 5′-sequence constraints.

Materials and Methods

Oligonucleotides, Radiolabeled Ribonucleotide Triphosphate, and T7 RNAP

Oligonucleotides designed for this study were procured from Eurofins MWG Operon, USA. The desiccated oligonucleotides were resuspended in water to a concentration of 100 µM. The radiolabeled α-32P-rATP was procured from Molecular Bioproducts, USA. T7 RNA polymerase (50,000 U/mL) was procured from New England Biolabs, USA.

Reaction Conditions

50 µL reaction mixtures containing 5 µL of the 100 µM oligonucleotides, 2 µL of radiolabeled α-32P-rATP (3000 Ci/mmol), 5 µL of the 10× RNA polymerase reaction buffer supplied with the T7 RNA polymerase (1× concentrations of 40 mM Tris–HCl, 6 mM MgCl2, 2 mM spermidine, 1 mM dithiothreitol), 1 µL of the RNase inhibitor RNasin (Promega, USA), and 1 µL (50 U) of T7 RNA polymerase were incubated at 37 °C for 60 min. Reactions were stopped and run on 15% polyacrylamide gels at 75 V.
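The final amounts in the reaction follow from simple dilution arithmetic. The short Python sketch below works through the figures stated above (stock concentrations, volumes added, and the 50 µL total); the function and variable names are illustrative only.

```python
def final_conc(stock_conc, volume_added_ul, total_volume_ul=50.0):
    """Dilution: C_final = C_stock * V_added / V_total (units follow the stock)."""
    return stock_conc * volume_added_ul / total_volume_ul


# Oligonucleotide: 5 µL of a 100 µM stock in 50 µL.
oligo_final_uM = final_conc(100.0, 5.0)   # final concentration in µM
oligo_pmol = 100.0 * 5.0                  # 100 µM equals 100 pmol/µL

# Reaction buffer: 5 µL of a 10x stock in 50 µL.
buffer_final_x = final_conc(10.0, 5.0)    # final buffer strength (x)

# T7 RNA polymerase: 1 µL of a 50,000 U/mL stock.
t7_units = 50_000 / 1000 * 1.0            # U/mL -> U/µL, times 1 µL
t7_units_per_ul = t7_units / 50.0         # units per µL of reaction

if __name__ == "__main__":
    print(f"oligonucleotide: {oligo_final_uM} µM ({oligo_pmol} pmol)")
    print(f"buffer: {buffer_final_x}x")
    print(f"T7 RNAP: {t7_units} U total, {t7_units_per_ul} U/µL")
```

The calculation gives 10 µM (500 pmol) of oligonucleotide, 1× buffer, and 50 U of polymerase per reaction, consistent with the 50 U stated in the text.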

The gels were then stained with ethidium bromide to visualize the nucleic acids, and subsequently exposed to a phosphor screen (Amersham GE Healthcare) for 5 min and scanned using Storm 840 (Amersham GE Healthcare, USA).

Hairpin Oligonucleotide Nomenclature

Hairpin oligonucleotides are designated by a three number designation. The first number is the total length of the oligonucleotide in nucleotides, the second number is the length of the duplex portion of the hairpin in base pairs, and the third number is the length of the retained promoter consensus sequence. For some oligonucleotides, an additional designation indicating the number and the type of complementary sequences in the 5′ extended template is added.
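As a small illustration of this naming scheme, the Python sketch below splits a designation such as 56-21-20 into its three numbers plus any trailing suffix. The parser, its field names, and the regular expression are hypothetical conveniences written for this example. It handles only the basic three-number form with an optional suffix (for example 56-20-14AT); more elaborate designations used in some figures are not covered.

```python
import re
from typing import NamedTuple


class HairpinName(NamedTuple):
    total_nt: int      # total oligonucleotide length, nucleotides
    duplex_bp: int     # duplex portion of the hairpin, base pairs
    promoter_nt: int   # retained promoter consensus length
    suffix: str        # optional extra designation, e.g. "AT" or "GC"


# Three dash-separated integers, then an optional suffix starting with a non-digit.
_NAME_RE = re.compile(r"^(\d+)-(\d+)-(\d+)(\D.*)?$")


def parse_hairpin_name(name: str) -> HairpinName:
    """Parse a designation like '56-21-20' or '56-20-14AT'."""
    match = _NAME_RE.match(name.strip())
    if match is None:
        raise ValueError(f"not a three-number hairpin designation: {name!r}")
    total, duplex, promoter, suffix = match.groups()
    return HairpinName(int(total), int(duplex), int(promoter), suffix or "")


if __name__ == "__main__":
    for example in ("56-21-20", "50-10-8", "56-20-14AT"):
        print(example, parse_hairpin_name(example))
```

For 56-21-20 this yields a 56-nucleotide oligonucleotide with a 21 bp duplex and a 20-nucleotide promoter.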

Results

The 5′ end of the 20 nucleotide (− 17 to + 3) double-stranded promoter region on hairpin oligonucleotides was systematically deleted to determine whether the resulting oligonucleotides would (1) initiate promoter-dependent, template-dependent de novo transcription; (2) initiate promoter-dependent, template-independent de novo RNA synthesis (abortive initiation); (3) label at the 3′ end of the oligonucleotide (primer extension); or (4) fail to incorporate label when incubated with T7 RNAP and a single radiolabeled ribonucleotide triphosphate. The starting oligonucleotide for the deletion mapping was the oligonucleotide shown in Fig. 3d. It is able to form intra- and intermolecular base pairing to produce a recessed 3′ end on an extended 5′ template. Next to the 3′ end on the unpaired template are four T nucleotides that will base pair with the radiolabeled α-32P-rATP used as the only nucleotide triphosphate in the experiment. As the double-stranded promoter region is removed, the loop length is, in some cases, increased to maintain the overall length. Table 1 shows the results of these experiments and Fig. 4 shows examples of the four different results.

Table 1 Results of the sequential deletion of the 5′ end of the double-stranded promoter region of hairpin loop oligonucleotides

Fig. 4

Panel a Autoradiograph of 15% polyacrylamide gel used to separate the labeled products produced from the incubation of various oligonucleotides with T7 RNA polymerase and radiolabeled rATP. Examples of the labeled products are shown: (1) end-labeled oligos (lanes 1–2), (2) single spot RNA (lane 3), (3) double spot RNA (lanes 4 and 5), (4) no labeling (lane 6). Size markers in nucleotides are shown to the right of the gel. Single spot RNA migrates at 4 nucleotides; double spot RNA migrates at 4 nucleotides and 2 nucleotides. Panel b The oligonucleotides used in this experiment, listed from left to right (lanes 1 to 6), are 44-7-6, 38-4-4, 50-10-8, 56-16-16, 56-21-20, 66-31-20. The DNA sequences are written 5′ to 3′. Underlines indicate promoter or partial promoter sequences, red nucleotides indicate the potential for duplex beyond the promoter region, bold letters indicate potential templates for rATP radionucleotides

The oligonucleotide shown in Fig. 3d (56-21-20) has a full 20 base pair promoter sequence (− 17 to + 3). With all four ribonucleotide triphosphates present, T7 RNAP would be expected to initiate at + 1 and make a 13 nucleotide RNA run-off transcript starting with GGG. With only rATP present, T7 RNAP reproducibly makes RNAs that separate upon gel electrophoresis as two high-mobility, labeled spots (Fig. 4, lane 5) designated DS for “double spot” in Table 1, line 1. This product is RNase sensitive (not shown) and will be called the “double spot” RNA product below. The two RNAs migrate with 4 nucleotide and 2 nucleotide size markers and are presumably 5′-pppApApApA-3′ and 5′-pppApA-3′, respectively.

Oligonucleotide 66-31-20 (Fig. 4, lane 6) is identical to 56-21-20 except that the recessed 3′ end is extended on the template strand to produce a 31 bp double-stranded region with a full promoter present and a blunt end terminus. This oligonucleotide fails to produce a “double spot” RNA product under the same conditions as with 56-21-20 indicating that under the conditions used in our assay, a recessed 3′ end is required to produce the “double spot” RNA.

Oligonucleotide 56-16-16 is identical to 56-21-20 except that four nucleotides are removed from the 5′ end of the double-stranded promoter region to remove the AT-rich region. Eight nucleotides are added to the single-stranded loop region to maintain the size of the oligonucleotide at 56 nucleotides. The − 13 to + 3 region of the promoter is retained (16 nucleotide promoter). Oligonucleotide 56-16-16 (Table 1, line 2) also produces a “double spot” RNA (Fig. 4, lane 4).

Oligonucleotides with the double-stranded/promoter region reduced to 15 or 14 (56-15-15 and 56-14-14, Table 1, lines 3 and 4; Fig. 10) do not produce double spot. However, when the promoter sequence is retained at 14 or 15, but the double-stranded region is either increased with AT base pairs (56-20-14AT, 56-20-15AT; Fig. 7) or GC base pairs (56-20-15GC; Fig. 7) to 20 base pairs, double spot is produced. This indicates that an oligonucleotide with a minimum of 14 promoter nucleotides can produce double spot, but only when a double-stranded region of greater than 15 base pairs is present.

Oligonucleotides with 5 to 14 base pairs of double-stranded promoter region deleted to give 15 to 6 bp truncated promoters (from − 12 to + 3 down to − 5 to + 3) reproducibly produce a high-mobility, labeled spot designated SS (single spot) in Table 1 and fail to produce the “double spot” RNA product, indicating that the failure to produce the “double spot” RNA may allow the production of the “single spot” RNA product. These oligonucleotides lack all or part of the specificity loop region, and therefore, most of the recognition region of the promoter. An example of this type of oligonucleotide is 50-10-8 (− 7 to + 3), which produces a single spot RNA product (Fig. 4, lane 3). The “single spot” product is also sensitive to RNase (not shown) and will be designated the “single spot” RNA product below. The single spot RNA migrates with a 4 nucleotide size marker and is presumably 5′-pppApApApA-3′.

Oligonucleotides with 13 to 17 base pairs of 5′ double-stranded promoter region deleted, to give 3 to 7 bp truncated promoters (from − 4 to + 3 down to + 1 to + 3), do not produce single spot or double spot RNAs, but can add a nucleotide to the 3′ end of the oligonucleotide, designated OEL (oligonucleotide end labeling) in Table 1, as long as there is a potential to form a double-stranded region 5 to 9 nucleotides in length with a recessed 3′ end. Oligonucleotides 44-7-6 (− 3 to + 3) and 38-4-4 (+ 1 to + 3) label the 3′ end of the oligonucleotide (Fig. 4, lanes 1 and 2). These oligonucleotides have a double-stranded region with a recessed 3′ end equal to the truncated promoter length.

Oligonucleotides with 18, 19, and 20 base pairs of 5′ double-stranded promoter region deleted, which have 0 to 2 bp truncated promoters and double-stranded regions of 2 to 4 bp, neither produce an RNA product nor label the 3′ end of the oligonucleotide (data not shown).

Characterization of the “Double Spot” RNA Product

An RNA product remarkably similar to the “double spot” was observed by Ling et al. (1989) using T7 RNAP with a full double-stranded promoter with four ribonucleotide triphosphates present, but with limiting amounts of the pyrimidine nucleotides, CTP or UTP. They ascribed this RNA product to abortive initiation. To determine if the “double spot” is the product of abortive initiation, we examined the dependence of “double spot” production on both template and promoter.

To test the effect of the template portion of the oligonucleotide on “double spot” RNA production, the number of T nucleotides in the template portion of 56-21-20 was varied from 0 to 8 or substituted with A nucleotides. All of these modified oligonucleotides were able to produce the double spot RNA product (Fig. 5), indicating that double spot RNA synthesis is template independent.

Fig. 5

Panel a Autoradiograph of 15% polyacrylamide gel showing double spot RNA production after incubation with T7 RNA polymerase and radiolabeled rATP from the 56-21-20 and 56-16-16 oligonucleotides modified in their template region. Double spot RNA is produced independent of the template sequence indicating that double spot RNA production is template independent. Panel b The DNA sequences written 5′ to 3′ for the oligonucleotides used in this experiment. Underlines indicate double-stranded promoter regions or partial double-stranded promoter sequences, bold letters indicate potential templates for rATP radionucleotides

To determine if promoter sequences are necessary for the production of “double spot” RNA product, scrambled sequences were substituted for the promoter sequence. While the classic “double spot” oligonucleotides 56-21-20 and 56-16-16 produced “double spot” RNA product, the oligonucleotides with scrambled promoter sequences did not (Fig. 6), indicating that “double spot” production is dependent on promoter sequence. These results, indicating that “double spot” RNA production is template independent and promoter dependent, are consistent with the report that RNA products similar to “double spot” RNA arise from abortive initiation (Ling et al. 1989).

Fig. 6

Panel a Autoradiograph of 15% polyacrylamide gel showing double spot RNA production after incubation with T7 RNA polymerase and radiolabeled rATP with the 56-21-20 and 56-16-16 oligonucleotides modified to have scrambled promoter sequences. The oligonucleotides with scrambled promoter sequences do not produce double spot RNA indicating that the double spot RNA is promoter dependent. Panel b The DNA sequences written 5′ to 3′ for the oligonucleotides used in this experiment. Underlines indicate double-stranded promoter regions or partial double-stranded promoter sequences, red nucleotides indicate the potential for duplex beyond the promoter region, bold letters indicate potential templates for rATP radionucleotides

Oligonucleotides with 5′ end deletions of 4, 5, and 6 nucleotides (56-20-16, 56-20-15, and 56-20-14) can produce “double spot” RNA (Fig. 7). Oligonucleotide 56-20-15AT has the AT-rich recognition sequence 5′-TAATA-3′ substituted with 5′-TATAT-3′, a similar sequence. These substitutions have little effect on “double spot” RNA production (Fig. 7, lane 3). Also, oligonucleotide 56-20-14AT has the sequence 5′-TAATAC-3′ substituted with 5′-TATATA-3′. To determine if a GC-rich sequence can be tolerated as a substitute for the AT-rich recognition sequence, the sequence CGCGG was substituted in 56-20-15GC. This oligonucleotide produces only a very small amount of “double spot” RNA (Fig. 7, lane 4), indicating that while the primary sequence of the AT-rich region is not critical, the base composition of the region should be AT-rich to give optimum double spot RNA production.

Fig. 7

Panel a Autoradiograph of 15% polyacrylamide gel showing double spot RNA production after incubation with T7 RNA polymerase and radiolabeled rATP with the 56-21-20, 56-20-16, 56-20-15 oligonucleotides. The 56-20-16 and 56-20-15 oligonucleotides have substituted bases in the AT-rich binding region. All four oligonucleotides with truncated promoters produced double spot RNA (lanes 2–5) albeit the truncated promoter with added GC-rich sequences produced a reduced amount of double spot RNA. Panel b The DNA sequences written 5′ to 3′ for the oligonucleotides used in this experiment. Underlines indicate double-stranded promoter regions or partial double-stranded promoter sequences, red nucleotides indicate the potential for duplex beyond the promoter region, bold letters indicate potential templates for rATP radionucleotides

Our interpretation of these results is that “double spot” RNA production is equivalent to abortive de novo initiation at the recessed 3′ end. A complete 20 nucleotide promoter (56-21-20; − 17 to + 3) or nearly complete 5′ truncated promoters (56-16-16, 56-20-16, 56-20-15, and 56-20-14; − 13, − 12, or − 11 to + 3) bind the RNA polymerase strongly enough that it cannot clear the promoter during the transition from initiation to elongation. The presence of only a single ribonucleotide triphosphate also contributes to abortive initiation. The transcripts made prior to promoter clearance are the abortive initiation products that we designate “double spot” RNA.

Oligonucleotide 3′ End Labeling

The labeling of oligonucleotides with recessed 3′ ends by T7 RNA polymerase has been reported and characterized by Sarcar and Miller (2018) as a type of DNA editing. They report a promoter-independent, template-dependent, recessed 3′ end-dependent addition of rNTPs to the 3′ end of oligonucleotide hairpins. A ribonucleotide is added to the 3′ end when the 3′ end is base paired next to a single-stranded region where the first unpaired nucleotide is complementary to the labeling ribonucleotide triphosphate. While this process absolutely requires a double-stranded region with a recessed 3′ end, it is limited by the length of the base-paired region. Oligonucleotides with duplex regions shorter than 3 base pairs or longer than about 8 base pairs, depending on their base composition, do not label. In our initial detection of oligonucleotide 3′ end labeling, we observed labeling of oligonucleotides with duplex potentials of 3–9 base pairs (Table 1).

In this initial screen, the partial promoter length was equal to, or approximately equal to, the duplex length of the hairpin oligonucleotide (Table 1). To uncouple these two parameters, we made oligonucleotides with a constant 20 bp duplex length, but with variable promoter lengths (see for example, 56-20-16, 56-20-15, 56-20-14, Fig. 7). These showed “double spot” RNA production but did not show end labeling since the 20 bp duplex length exceeded the 3–9 bp length necessary for end labeling. Conversely, hairpin oligonucleotides without promoter sequences, but with duplex lengths between 3 and 9 bp and recessed 3′ ends with complementary sequences, end-labeled efficiently.

Figure 8 shows oligonucleotides with constant loop length (15 nucleotides) and template length (10 nucleotides) and increasing duplex length from 2 to 10 bp. Oligonucleotides with duplexes 4 to 8 bp label efficiently; oligonucleotides with duplexes of 3 and 9 bp label slightly. Oligonucleotides with smaller or larger duplex length (2 bp and 10 bp) did not 3′ end label (Fig. 8).
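The duplex-length window described for Fig. 8 can be written as a simple lookup. The Python sketch below only encodes the qualitative outcomes reported above for this particular series of oligonucleotides; the function name and the category labels are hypothetical, and the thresholds should not be read as general rules, since the text notes that the limits depend on base composition and loop length.

```python
def end_labeling_outcome(duplex_bp: int) -> str:
    """Qualitative 3' end-labeling outcome reported for the Fig. 8 series.

    Encodes only the observations stated in the text:
    4 to 8 bp label efficiently, 3 and 9 bp label slightly,
    shorter or longer duplexes (2 bp and 10 bp tested) do not label.
    """
    if duplex_bp < 1:
        raise ValueError("duplex length must be a positive number of base pairs")
    if 4 <= duplex_bp <= 8:
        return "efficient"
    if duplex_bp in (3, 9):
        return "slight"
    return "none"


if __name__ == "__main__":
    for bp in range(2, 11):
        print(f"{bp} bp duplex: {end_labeling_outcome(bp)}")
```

A 20 bp duplex, as in the 56-20-16 series, falls outside the window and returns "none", in line with the absence of end labeling reported for those oligonucleotides.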

Fig. 8

Panel a Autoradiograph of 15% polyacrylamide gel showing oligonucleotide 3′ end labeling after incubation with T7 RNA polymerase and radiolabeled rATP. Each hairpin oligonucleotide has a duplex length one bp longer than the previous oligonucleotide. Oligonucleotide end labeling occurred on oligonucleotides with potential duplex lengths of 4 to 8 base pairs (lanes 3–7) with light amounts of end labeling in duplexes of 3 base pairs (lane 2) and 9 base pairs (lane 8). Panel b The DNA sequences written 5′ to 3′ for the oligonucleotides used in this experiment. Underlines indicate double-stranded promoter regions or partial double-stranded promoter sequences, red nucleotides indicate the potential for duplex beyond the promoter region, bold letters indicate potential templates for rATP radionucleotides

To determine template dependence in oligonucleotide 3′ end labeling, the 5′ extension/template of the 44-7-6 oligonucleotide was systematically altered. Substitution of A nucleotides for T nucleotides in the labeling site resulted in no labeling (Fig. 9, lanes 1 and 7), indicating that the labeling site next to the 3′ end must base pair with the radionucleotide triphosphate (rATP) used in the assay. Lanes 2, 3, 4, 5, and 6 of Fig. 9 have oligonucleotides with a variable number of T nucleotides in the labeling site giving the potential for addition of multiple A nucleotides to the 3′ end of the oligonucleotide. Although at least one A nucleotide is added in each case since labeling occurs, it is unlikely that multiple A nucleotides are added. This conclusion is based on the fact that the intensity of labeling is essentially equal, except for lane 3 where labeling intensity is decreased not increased for an eight T nucleotide template, and the migration of the labeled oligonucleotide is not changed. Interestingly, there seems to be concurrent end labeling and de novo initiation in the autoradiograph of Fig. 9.

Fig. 9

Panel a Autoradiograph of 15% polyacrylamide gel showing oligonucleotide 3′ end labeling after incubation with T7 RNA polymerase and radiolabeled rATP. Each hairpin oligonucleotide has a constant length (44 nucleotides), constant duplex length (7 bp), and constant loop length (15 nucleotides). The template 5′ extension is a constant length (10 nucleotides) but has a variable number of T nucleotides at the labeling site or A nucleotide substitutions at the site. All of the oligonucleotides are end-labeled with at least one rA nucleotide, except oligonucleotide 44-7G-6T0A4 (lane 7), which is not labeled since it contains no template, and 44-7G-6A2T4 (lane 1), which is not labeled since the 4 T template sequence is not adjacent to the 3′ end, confirming that the primer extension end labeling is template dependent and initiates at the first nucleotide adjacent to the 3′ end. Panel b The DNA sequences written 5′ to 3′ for the oligonucleotides used in this experiment. Underlines indicate double-stranded promoter regions or partial double-stranded promoter sequences, red nucleotides indicate the potential for duplex beyond the promoter region, bold letters indicate potential templates for rATP radionucleotides

Characterization of the “Single Spot” RNA Product

Under the conditions used in these experiments, the oligonucleotides were not expected to be capable of de novo initiation, since promoter-dependent de novo initiation requires three G nucleotides for initiation and rGTP is not provided in these experiments. Nonetheless, a small RNA, presumably 5′-pppApApApA-3′, was produced when four T nucleotides were positioned next to the recessed 3′ end in hairpin oligonucleotides with 5′ truncated promoter sequences between 6 and 15 nucleotides long.

In Fig. 10, six oligonucleotides with promoter lengths between 10 and 15 are able to produce “single spot” RNAs, but scrambling of promoter sequences completely eliminates “single spot” RNA production (Fig. 11), indicating that this activity is dependent on a partial promoter sequence. While the duplex region of the hairpin oligonucleotides requires a partial promoter sequence, variable loop lengths do not affect the single spot production (Fig. 12).

Fig. 10

Panel a Autoradiograph of 15% polyacrylamide gel showing “single spot” RNA production from oligonucleotides with variable length partial promoter sequences (10–15 bp) after incubation with T7 RNA polymerase and radiolabeled rATP. Oligonucleotides with partial promoters between 10 and 15 nucleotides produce single spot RNA using the 4 T template (lanes 1–6); oligonucleotides with partial promoter sequences longer than 15 nucleotides (56-16-16, lane 7) produce double spot RNA. Size markers in nucleotides are shown to the right of the gel. Panel b The DNA sequences written 5′ to 3′ for the oligonucleotides used in this experiment. Underlines indicate double-stranded promoter regions or partial double-stranded promoter sequences, red nucleotides indicate the potential for duplex beyond the promoter region, bold letters indicate potential templates for rATP radionucleotides

Fig. 11

Panel a Autoradiograph of 15% polyacrylamide gel showing single spot RNA production after incubation with T7 RNA polymerase and radiolabeled rATP. Oligonucleotide 56-12-10 (lane 1) has a 10 bp partial promoter sequence and produces “single spot” RNA. Oligonucleotides 56-14 Scr and 56-16 Scr contain scrambled promoter sequences and do not produce single spot RNA, indicating that single spot RNA production depends on the partial promoter sequence. Size markers in nucleotides are shown to the right of the gel. Panel b The DNA sequences written 5′ to 3′ for the oligonucleotides used in this experiment. Underlines indicate double-stranded promoter regions or partial double-stranded promoter sequences, red nucleotides indicate the potential for duplex beyond the promoter region, bold letters indicate potential templates for rATP radionucleotides

Fig. 12

Panel a Autoradiograph of 15% polyacrylamide gel showing “single spot” RNA after incubation with T7 RNA polymerase and radiolabeled rATP. Each hairpin oligonucleotide has a duplex length of 10 bp, a promoter truncated from the 5′ end to 8 bp, and a 10 nucleotide 5′ extension. Each hairpin oligonucleotide has a different loop length. All four oligonucleotides with an 8 nucleotide partial promoter sequence produce single spot RNA, indicating that loop length has little or no effect on single spot RNA production. Panel b The DNA sequences written 5′ to 3′ for the oligonucleotides used in this experiment. Underlines indicate double-stranded promoter regions or partial double-stranded promoter sequences, red nucleotides indicate the potential for duplex beyond the promoter region, bold letters indicate potential templates for rATP radionucleotides

However, single spot production is dependent on the 5′ extension/template region of the oligonucleotide (Fig. 13). Oligonucleotide 56-12-10 A4 (Fig. 13, lane 1) has A substitutions for the template T nucleotides and does not produce RNA. Oligonucleotide 56-12-10 A2T4 (Fig. 13, lane 2) has two A nucleotides separating the 4 T template from the 3′ end and does not produce RNA. This indicates that “single spot” RNA production is template dependent and requires a complementary base in the site next to the 3′ end. The remaining oligonucleotides in Fig. 13 have 2, 4, 6, and 8 T nucleotides in the template region next to the 3′ end. The RNAs made from these oligonucleotides correspond in length to the template, indicating that T7 RNA polymerase can initiate template-dependent de novo RNA synthesis from a recessed 3′ end when a partial promoter is present and then elongate the RNA on the template. This de novo initiation site is located at +4, in contrast to the classical promoter transcription initiation site at +1.
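The template rule described above can be written as a short sketch. It assumes a simplified model that is not taken from the experiments themselves: the polymerase reads the 5′ extension from the base adjacent to the recessed 3′ end (the 3′-most base of the extension as written 5′ to 3′) and stops at the first base that cannot pair with the single rNTP supplied. The function name and the example extension sequences are hypothetical; the actual sequences are those shown in Panel b of Fig. 13.

```python
def predicted_product(extension_5to3, ntp="A"):
    """Predict the RNA made when a single rNTP is supplied (simplified model).

    extension_5to3: 5' extension of the hairpin, written 5' to 3' (DNA).
    ntp: the one ribonucleotide available ("A", "C", "G", or "U").
    """
    # DNA template base that pairs with each ribonucleotide
    template_for = {"A": "T", "C": "G", "G": "C", "U": "A"}
    wanted = template_for[ntp]
    product = []
    # The base adjacent to the recessed 3' end is the LAST base of the
    # extension as written, so the template is read in reverse.
    for base in reversed(extension_5to3.upper()):
        if base != wanted:
            break  # no complementary rNTP available: synthesis stops
        product.append(ntp)
    return "".join(product)


# Hypothetical 10-nucleotide extensions (illustrative only)
examples = {
    "A4":   "GCGCGCAAAA",  # T template replaced by A
    "A2T4": "GCGCTTTTAA",  # 4 T template not adjacent to the 3' end
    "T2":   "GCGCGCGCTT",
    "T4":   "GCGCGCTTTT",
    "T8":   "GCTTTTTTTT",
}
for name, ext in examples.items():
    print(name, repr(predicted_product(ext)))
```

Under this model the A4 and A2T4 extensions give no product, while the T2, T4, and T8 extensions give poly(A) products whose lengths equal the number of adjacent T nucleotides, which is the pattern reported for Fig. 13. The model describes de novo synthesis only; it does not capture the 3′ end labeling of Fig. 9, where a longer T run did not appear to give multiple additions.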

Fig. 13

Panel a Autoradiograph of 15% polyacrylamide gel showing de novo initiation with template-directed elongation from oligonucleotides with recessed 3′ ends and partial promoter sequences after incubation with T7 RNA polymerase and radiolabeled rATP. Oligonucleotides without a 4 T template (lane 1) or with a 4 T template that is not positioned at the 3′ end of the primer duplex (lane 2) do not produce transcripts. Oligonucleotides with templates of 2, 4, 6, or 8 nucleotides produce RNAs equal in length to the template, indicating that initiation occurs next to the 3′ end of the primer and that a template is used for elongation. Size markers in nucleotides are shown to the right of the gel. Panel b The DNA sequences written 5′ to 3′ for the oligonucleotides used in this experiment. Underlines indicate double-stranded promoter regions or partial double-stranded promoter sequences, red nucleotides indicate the potential for duplex beyond the promoter region, bold letters indicate potential templates for rATP radionucleotides

Discussion

When presented with an oligonucleotide that can form a hairpin loop with a recessed 3′ end and a single ribonucleotide triphosphate, T7 RNAP displays several novel activities depending on the amount of the promoter sequence retained in the duplex region of the hairpin. When a full promoter (20 nucleotides, −17 to +3) or up to a six base pair 5′ deletion of the full double-stranded promoter (14 nucleotides, −11 to +3) is used in these circumstances, a double spot, abortive transcript can be produced. Although this de novo initiation can only occur if a recessed 3′ end is present, it is template independent and produces the same double spot pattern regardless of the sequence of the 5′ extension. This is presumably the result of strong promoter binding due to the retention of the −12 to −5 region of the promoter, which is bound by the specificity loop.

Five to fourteen base pair deletions of the 5′ end of the double-stranded promoter result in hairpin oligonucleotides with 15 bp (−12 to +3) to 6 bp (−3 to +3) partial promoters that can initiate de novo template-dependent RNA synthesis. These RNAs are templated and initiate at the first unpaired base after the recessed 3′ end, which corresponds to +4 relative to classical full promoter-driven transcription. These partial promoters resemble mitochondrial promoters in length, 9 nucleotides (−8 to +1); however, mitochondrial promoters initiate at +1. The minimal T7 RNAP promoter sequence able to support de novo initiation at a recessed 3′ end in vitro is 5′-ATAGGG-3′. Truncation of the promoter to fewer than 6 base pairs results in the loss of de novo initiation. Oligonucleotides with truncated promoter sequences that are too short to support de novo initiation on a template can still incorporate labeled nucleotides.
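The relationship between the size of the 5′ deletion and the coordinates of the remaining partial promoter is simple arithmetic, sketched below. The 20-nucleotide sequence used here is the commonly cited T7 consensus promoter from −17 to +3; it is an assumption of this sketch and is not given in full in the text, although the partial promoters ATAGGG and CACTATAGGG named in the text are its 3′-terminal 6 and 10 nucleotides. The function name is hypothetical.

```python
# Assumed T7 consensus promoter, non-template strand, positions -17 to +3.
# Promoter numbering has no position 0: ... -2, -1, +1, +2, +3.
FULL_PROMOTER = "TAATACGACTCACTATAGGG"
UPSTREAM_LENGTH = 17  # nucleotides from -17 to -1


def truncate_promoter(deleted_bp):
    """Return (sequence, length, first_position) after a 5' deletion."""
    if not 0 <= deleted_bp < UPSTREAM_LENGTH:
        raise ValueError("deletion must leave at least one upstream base pair")
    sequence = FULL_PROMOTER[deleted_bp:]
    first_position = -(UPSTREAM_LENGTH - deleted_bp)
    return sequence, len(sequence), first_position


for deleted in (0, 5, 10, 14):
    seq, length, start = truncate_promoter(deleted)
    print(f"delete {deleted:2d} bp -> {length:2d} bp ({start} to +3) {seq}")
```

With these assumptions, a 5 bp deletion leaves a 15 bp partial promoter starting at −12, a 10 bp deletion leaves CACTATAGGG starting at −7, and a 14 bp deletion leaves the 6 bp minimal sequence ATAGGG starting at −3, matching the coordinates quoted above.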

Hairpin oligonucleotides with recessed 3′ ends and a duplex shorter than about 9 base pairs of AT-rich sequence or longer than about 3 base pairs of GC-rich sequence can add a Watson–Crick-specified ribonucleotide to the 3′ end of DNA oligonucleotides producing a DNA-RNA phosphodiester bond (Sarcar and Miller 2018). The addition of a specific ribonucleotide to a DNA oligonucleotide constitutes RNA editing of DNA (Sarcar and Miller 2018). RNAs that can form hairpins with recessed 3′ ends can add nucleotides using the RNA as template. This activity may be similar to the co-transcriptional, non-DNA templated, insertional RNA editing activity observed for the mtRNAP in mitochondria of myxomycetes (Miller et al. 2017; Miller and Miller 2008; Visomirski-Robic and Gott 1997).

These results indicate that T7 RNA polymerase has the following activities. It can (1) bind a primer with a recessed 3′ end on a template and extend the primer using the template, (2) bind a primer with a recessed 3′ end and a partial promoter and initiate de novo transcription at the first unpaired base at the end of the primer, and (3) bind a promoter and initiate de novo, template-mediated transcription within the consensus sequence of the promoter.

The evolutionary relationship of the single subunit polymerases is obscure even though they are clearly related through highly conserved motifs. DNA-dependent DNA polymerases and RNA-dependent DNA polymerases can utilize RNA or DNA primers to initiate DNA synthesis at specific sites determined by the complementarity of the primer. Although RNA–RNA phosphodiester bond primer extension is also used during elongation by RNA polymerases, they have uniquely evolved the ability to initiate de novo transcription through promoter recognition.

We propose that the likely evolutionary sequence of these activities of ssRNAPs is a transition from primer-mediated initiation, through promoter/primer-specified initiation, to promoter-mediated initiation. Because most single subunit polymerases, other than RNA polymerases, use primer extension to initiate nucleic acid synthesis, it is most parsimonious to propose that this trait was inherited from a common ancestor and was originally present in all single subunit polymerases, including RNA polymerases. As the need for precision in the location of initiation increased due to selection for gene specificity and gene regulation, some RNA polymerases may have developed motifs that recognized the sequences of specific primers. Eventually, these sequences may have been able to position the RNA polymerase to initiate at the first available unpaired base on the template at the 3′ end of the primer. This would have allowed specificity of initiation without the attachment of the nascent RNA to the primer. As the specificity of initiation relative to this sequence increased, the primer may have become unnecessary, and this sequence (the promoter) may then have been able to position the RNA polymerase to initiate at a specific site without a primer. These primitive promoters would probably have resembled the smaller promoter sequences of mitochondrial RNA polymerases. Further selection for additional specificity may have led to the transcription factors used in some mitochondrial transcription systems or to longer promoter sequences recognized by multiple recognition domains in some of the bacteriophage RNA polymerases. This model of single subunit polymerase evolution is supported by the fact that these activities exist under the proper conditions in contemporary single subunit RNA polymerases.

One of the most common ways of producing RNAs of specific sequence is to transcribe them in vitro using T7 RNA polymerase and the T7 promoter followed by a specific template for the RNA. One problem with this technique is that extra G nucleotides are added to the 5′ end of the sequence, since transcription initiates within the promoter and the promoter G nucleotides (+1 to +3) are transcribed. Another problem is the occasional addition of non-DNA templated nucleotides at the 3′ end due to hairpin formation and RNA templating. The novel activities reported above suggest ways to produce specific RNAs in vitro without unwanted 5′ end G nucleotides and without non-DNA templated 3′ end nucleotides. Primers with partial promoter sequences, such as CACTATAGGG (56-12-10; Fig. 13), initiate at the first template base after the end of the primer instead of within the promoter sequence, producing an RNA without extra 5′ G nucleotides. DNA template sequences without hairpin-forming potential at the 3′ end will prevent RNA-templated addition of nucleotides to the 3′ end. These precautions allow the in vitro synthesis of most RNA sequences without extra 5′ and 3′ nucleotides.
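As a rough illustration of the second precaution, the sketch below screens a planned transcript for one simple kind of 3′ end hairpin-forming potential. It is a crude heuristic under stated assumptions, not a folding prediction: it only asks whether the reverse complement of the last few nucleotides occurs further upstream with room for a loop, it considers Watson–Crick pairs only, and it does not check whether an unpaired template base lies 5′ of the pairing partner. The function names, the stem length of 4, the minimum loop of 3, and the example sequences are all hypothetical choices.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Reverse complement of an RNA sequence (Watson-Crick pairs only)."""
    return rna.translate(COMPLEMENT)[::-1]


def has_3prime_foldback(rna, stem=4, min_loop=3):
    """Return True if the 3'-terminal `stem` nucleotides could pair with an
    upstream segment, leaving at least `min_loop` unpaired nucleotides."""
    rna = rna.upper()
    if len(rna) < 2 * stem + min_loop:
        return False  # too short to fold back under these settings
    tail = rna[-stem:]
    partner = reverse_complement(tail)
    # Region where the partner may lie: upstream of the tail and the loop
    searchable = rna[: len(rna) - stem - min_loop]
    return partner in searchable


print(has_3prime_foldback("ACGGAAAAACCGU"))   # 3' CCGU can pair with 5' ACGG
print(has_3prime_foldback("AAAAAAAAAAAAA"))   # no complementary partner
```

A transcript flagged by such a screen would be a candidate for redesign of the 3′ end of the DNA template; a sequence that passes is not guaranteed to be free of secondary structure.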