
1 Introduction

While adenine (A), guanine (G), cytosine (C), and thymine/uracil (T/U) are well known as the standard bases of DNA/RNA, modified bases can also be found in messenger, transfer, and ribosomal RNAs (Fig. 1a). To date, more than 100 kinds of modified bases, including 7-methylguanine, pseudouracil, dihydrouracil, hypoxanthine, and large bases with amino acid adducts, have been reported (Limbach et al. 1994). Sugar modifications, such as 2′-methoxy (OMe) and 1″-,3-(5″-phosphoryl)-ribosyl groups, have also been discovered. These modifications are considered to stabilize the three-dimensional structures of RNAs, allowing them to evade facile degradation by nucleases. Modified nucleotides are generally synthesized after transcription by modifying enzymes, such as methyltransferases and pseudouridine synthases. For example, a methyltransferase initiates a nucleophilic attack by the thiolate (–S−) of the cysteine residue at position 375 (Cys 375) on the C6 position of cytosine, forming a Michael addition product that activates the C5 position of cytosine. Subsequent transfer of a methyl group from the cofactor S-adenosyl-L-methionine yields 5-methylcytidine (Boschi-Muller and Motorin 2013).

Fig. 1

(a) Modified bases found in messenger, transfer, and ribosomal RNAs; (b) polymerase inhibitors for antiviral drugs and antineoplastic agents

Artificially synthesized nucleoside analogs, which were actively researched and developed as polymerase inhibitors from the 1960s to the 1970s, are also well known. These analogs, which include antiviral drugs (e.g., aciclovir and ganciclovir) and antineoplastic agents (e.g., cytarabine and trifluridine) (Fig. 1b), are phosphorylated through cellular metabolism and thereafter specifically inhibit polymerase activities to suppress virus proliferation and tumor growth (De Clercq 2001; Kufe et al. 1980). Other typical examples are fluorescence-labeled nucleoside triphosphate analogs, which are used in sequencing and microarray technologies for genetic research. Similar to fluorescent labeling, the introduction of foreign substituents to confer additional functionality, such as cell membrane permeability, nuclease resistance, and electroconductivity, has been studied for diverse applications.

2 Kinds of Polymerases

In general, DNA polymerases can extend a poly(oligo)nucleotide strand with a sequence complementary to that of its template strand using nucleoside 5′-triphosphates as substrates. Some DNA polymerases have a proofreading 3′ to 5′ exonuclease activity. Since A. Kornberg discovered E. coli DNA polymerase I, various types of DNA polymerases have been isolated from bacteria, archaea, eukaryotes, and retroviruses. They are mainly classified into the following eight evolutionary families: A, B, C, D, E, X, Y, and reverse transcriptase (RT). For example, E. coli DNA polymerase I and mitochondrial DNA polymerase γ belong to family A; eukaryotic DNA polymerases α, δ, and ε and E. coli DNA polymerase II (pol II) are classified into family B; and E. coli DNA polymerase III (pol III) is a member of family C. Avian myeloblastosis virus (AMV) reverse transcriptase, which is an RNA-dependent DNA polymerase, is an example of an RT.
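The template-directed extension described at the start of this section can be sketched in a few lines of Python. This is only a toy model of Watson–Crick copying, not a description of any enzyme's mechanism; the function name `extend_primer` and the example sequences are hypothetical and chosen purely for illustration.

```python
# Watson-Crick pairing rules for the four standard DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def extend_primer(template_3to5: str, primer_5to3: str) -> str:
    """Return the fully extended strand, written 5'->3'.

    The template is written 3'->5' so that position i of the template
    pairs with position i of the growing strand. The primer must be
    complementary to the first len(primer) template bases.
    """
    n = len(primer_5to3)
    paired = "".join(COMPLEMENT[b] for b in template_3to5[:n])
    if paired != primer_5to3:
        raise ValueError("primer is not complementary to the template")
    # Each remaining template base directs incorporation of its complement.
    extension = "".join(COMPLEMENT[b] for b in template_3to5[n:])
    return primer_5to3 + extension


if __name__ == "__main__":
    # Hypothetical 8-mer template (3'->5') and 3-mer primer (5'->3').
    print(extend_primer("TACGGATC", "ATG"))
```

Proofreading, mismatch incorporation, and modified substrates are deliberately left out of this sketch.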

Highly thermostable DNA polymerases show optimal activity at temperatures around 75 °C. They are widely employed for polymerase chain reaction (PCR) to amplify/replicate DNA target sequences. To date, various engineered polymerases have been developed and supplied by manufacturers. One of the most popular thermostable DNA polymerases is Taq DNA polymerase, which belongs to family A and possesses 5′ to 3′ exonuclease activity, but not 3′ to 5′ exonuclease activity. It was isolated by A. Chien in 1976 (before the first report of the PCR technique in 1987) from an extremely thermophilic bacterium, Thermus aquaticus YT1, that was first discovered in the Lower Geyser Basin of Yellowstone National Park, USA, in 1969 (Chien et al. 1976). Other examples of thermostable family A DNA polymerases are AmpliTaq and Tth DNA polymerases. The former, which was reported by F. C. Lawyer in 1989, is a modified form of Taq DNA polymerase obtained by expression of the gene in an E. coli host (Lawyer et al. 1989). The latter, which was isolated from the extremely thermophilic bacterium Thermus thermophilus HB8, is used for reverse transcription (RT)-PCR owing to its very efficient RT activity in the presence of Mg2+ ion (Rüttimann et al. 1985). Meanwhile, rTth DNA polymerase, a recombinant DNA polymerase, is known to exhibit efficient RT activity in the presence of Mn2+ ion.

Hyperthermostable DNA polymerases that belong to family B generally exhibit excellent proofreading ability during DNA chain extension, owing to their 3′ to 5′ exonuclease activity. These polymerases, such as Vent polymerase from Thermococcus litoralis, KOD from Thermococcus kodakaraensis KOD1, 9°Nm from Thermococcus species 9°N-7, Tgo from Thermococcus gorgonarius (Miroshnichenko et al. 1998), Deep Vent from Pyrococcus species GB-D (Jannasch et al. 1992), Pfu from Pyrococcus furiosus (Fiala and Stetter 1986), and Pwo from Pyrococcus woesei (Rüdiger et al. 1995), were obtained from hyperthermophilic archaea found in deep ocean vents, volcanic marine mud, and solfataras on the seashore. Owing to their higher fidelity (i.e., lower error rate) and heat stability, compared with those of Taq, they and their variants have been well studied, and some of them are commercially available. Vent DNA polymerase, which was reported in 1991, was the first thermostable DNA polymerase having a 3′ to 5′ proofreading exonuclease activity (Mattila et al. 1991). Its D141A and E143A variant, which was engineered to eliminate the exonuclease activity and is known as Vent(exo-) DNA polymerase, was developed to improve the yield of PCR products and to be applied to dideoxy sequencing reactions (Kong et al. 1993). KOD DNA polymerase not only possesses excellent proofreading ability, with about 50-fold higher fidelity than Taq DNA polymerase, but can also elongate an oligonucleotide strand with about a fivefold higher reaction rate than other family B DNA polymerases, such as Pfu and Deep Vent DNA polymerases (Takagi et al. 1997). The N210D variant of KOD, known as KOD(exo-) DNA polymerase, which possesses one thousandth of the 3′ to 5′ exonuclease activity of KOD DNA polymerase, has also been developed (Nishioka et al. 2001).
The blend of KOD and KOD(exo-) DNA polymerases, which is commercially available under the product names KOD Dash or KOD XL, enables the production of long double-stranded DNAs (~15 kbp) by PCR. KOD FX and AccuPrime Pfx DNA polymerases were developed from KOD DNA polymerase for use in hot start PCR, an improved PCR technique that avoids nonspecific amplification of DNA at lower temperatures by inactivating the polymerase with its antibody. The 9°Nm DNA polymerase is the E143D variant of wild-type 9°N-7 DNA polymerase (Southworth et al. 1996). This 9°Nm variant exhibits reduced 3′ to 5′ exonuclease activity (0.4–5 % of wild-type exonuclease activity). Therminator DNA polymerase, the D141A, E143A, and A485L variant of 9°N-7 DNA polymerase, has no 3′ to 5′ exonuclease activity. This DNA polymerase exhibits enhanced incorporation of modified nucleotides, that is, efficient single-base incorporation of dideoxy and acyclonucleotides. Thus, various family B DNA polymerases have been discovered and genetically engineered for use in a number of applications.

In addition to the highly thermostable and hyperthermostable DNA polymerases, E. coli DNA polymerase I, Klenow fragment (KF), Bst DNA polymerase, φ29 DNA polymerase, and T7 DNA polymerase are well known. They display optimal activity at 30–37 °C, except for Bst DNA polymerase, which has an optimal temperature of 60–65 °C. KF is obtained as a large protein fragment by enzymatic cleavage of E. coli DNA polymerase I using the protease subtilisin (Klenow and Henningsen 1970). Although KF has lost its 5′ to 3′ exonuclease activity, it retains its 3′ to 5′ exonuclease activity; furthermore, its D355A and E357A variant, known as KF (3′–5′ exo-), lacks 3′ to 5′ exonuclease activity (Bebenek et al. 1990). Bst DNA polymerase is the large fragment of DNA polymerase I from the thermophilic bacterium Bacillus stearothermophilus, which is found in soil, hot springs, and ocean sediment and is generally unable to grow at temperatures below 35 °C (Kiefer et al. 1998). Bst DNA polymerase lacks 3′ to 5′ exonuclease activity, but possesses strong strand displacement activity, which allows isothermal DNA amplifications such as loop-mediated isothermal amplification (LAMP) and rolling circle amplification (RCA). Meanwhile, φ29 DNA polymerase, which is derived from the Bacillus subtilis phage phi29 (Φ29), is known to exhibit 3′ to 5′ proofreading exonuclease activity and extreme processivity, in addition to strong strand displacement activity (Blanco et al. 1989). Bacteriophage T7 DNA polymerase, which belongs to family A, also possesses 3′ to 5′ proofreading exonuclease activity (Grippo and Richardson 1971; Campbell et al. 1978; Adler and Modrich 1979). This DNA polymerase is known to be a complex comprising phage-encoded gene 5 protein and E. coli host thioredoxin, which enhances the processivity of the polymerase.

In eukaryotes, there are three main types of RNA polymerase: RNA polymerase I, which transcribes ribosomal RNA but not 5S rRNA; RNA polymerase II, which transcribes precursors of mRNA, snRNA, and microRNA; and RNA polymerase III, which transcribes ribosomal 5S rRNA, tRNA, and other small RNAs. Bacteriophage T7 (Davanloo et al. 1984), T3 (Majumder et al. 1979), and SP6 (Kotani et al. 1987) RNA polymerases are examples of commercially available DNA-dependent RNA polymerases. These RNA polymerases bind to their cognate promoters with very high sequence specificities and thereafter transcribe the DNA template downstream of the promoter to generate the complementary single-stranded RNA. Some variants of T7 RNA polymerase have been developed to enhance the incorporation of modified nucleotides.

3 Enzymatic Synthesis of Modified Nucleotides

3.1 Base Modification

A variety of base-modified nucleotides, particularly C5-modified uridine analogs, have been reported to date (Fig. 2). Research results have demonstrated that nucleoside triphosphates bearing a C5-modified pyrimidine ring or a C7-modified 7-deazapurine ring tend to be better polymerase substrates than nucleoside triphosphates with substituents introduced at other positions. Using these modified analogs in the presence of the four natural 2′-deoxynucleoside triphosphates (dNTPs) allows foreign functionalities, such as fluorophores or biotin, to be sparsely incorporated into nucleic acid strands that can serve as probes or capture specific nucleotide targets. For example, in 1992, T. Ried et al. reported fluorescence in situ hybridization (FISH) probes, which were synthesized using AmpliTaq DNA polymerase-catalyzed PCR that incorporated dUTPs labeled with fluorescein, biotin, digoxigenin, and 2,4-dinitrophenol; these modified DNA probes were used to specifically visualize the centromeres of human chromosomes (Ried et al. 1992).

Fig. 2

Base-modified nucleoside triphosphates

In 1998, K. Sakthivel et al. first reported the enzymatic synthesis of modified DNAs in which foreign functionalities were densely incorporated during PCR. They used ten different C5-modified 2′-deoxyuridine-5′-triphosphates (dUTPs) in the presence of the natural dNTPs, except for thymidine-5′-triphosphate (TTP) (Sakthivel and Barbas III 1998). Four different DNA polymerases (Taq, Vent, Pfu, and rTth) were examined. The C5-modified dUTP bearing an (E)-3-(1H-imidazol-4-yl)acryl group was accepted as a good PCR substrate by all of these DNA polymerases and provided the corresponding modified DNA very efficiently. The experimental results particularly emphasized that the rigid and extended α,β-unsaturated arm adjacent to the imidazolyl group has a great influence on the incorporation efficiency of the modified dUs. Furthermore, modified UTPs bearing 5-(3-aminopropyl) or 5-(2-mercaptoethyl) groups were also synthesized and densely incorporated into RNAs with T7 RNA polymerase (Vaish et al. 2000). Using these analogs instead of UTP, it should be possible to perform modified RNA aptamer/ribozyme selections.

In 2001, D. M. Williams and coworkers focused on the linker length and rigidity of substituents and systematically analyzed the substrate properties of ten different C5-modified dUTPs in the absence of TTP during PCR catalyzed by Taq DNA polymerase (Lee et al. 2001). The data showed that C5-modified dUTPs with linker arms containing rigid alkynyl and trans-alkenyl groups in the vicinity of the uracil base were significantly superior to those with linker arms containing cis-alkenyl or alkyl groups. Furthermore, they synthesized C7-modified 7-deaza-dATP analogs with alkynyl, cis-alkenyl, and alkyl linker arms and found that the analog with the alkynyl linker arm acts as the best substrate for PCR catalyzed by Taq DNA polymerase (Gourlain et al. 2001). Intriguingly, using the preferred dUTP and 7-deaza-dATP analogs, modified dUs and 7-deaza-dAs could be simultaneously incorporated during PCR. In the same year, H. Sawai et al. first demonstrated, using PCR experiments incorporating dUTPs modified with a methylene linker at C5, that KOD Dash DNA polymerase is one of the most promising candidates as a catalyst for enzymatic syntheses of modified DNAs (Sawai et al. 2001). Indeed, KOD DNA polymerase-related products (e.g., KOD XL, KOD FX, AccuPrime Pfx, and KOD Dash) are widely employed in applications involving enzymatic syntheses of modified DNAs.

In 2002, H. A. Held et al. demonstrated that Pfu and Pwo DNA polymerases, which are members of family B, as well as KOD DNA polymerase, can efficiently produce modified DNAs by primer extension (PEX) and PCR, using dUTPs modified at C5 with protected thiol groups (Held and Benner 2002). In successive incorporations of modified dUs, family B polymerases (Pfu, Pwo, Vent, and Deep Vent) were found to be preferable to family A polymerases (Taq, Tfl, Hot Tub, and Tth). Thereafter, M. Kuwahara et al. systematically analyzed the PCR-based synthesis of modified DNA strands using C5-modified dUTPs and C5-modified dCTPs as substrates and family A (Taq, Tth) and B [Vent(exo-), KOD Dash, and KOD(exo-)] polymerases as enzymes and arrived at the same conclusion (Kuwahara et al. 2006). Furthermore, their kinetic studies using modified primers/templates/substrates revealed that modified group(s) adjacent to the extending terminus of the primer can greatly reduce catalytic efficiency, resulting in low product yields with successive modified nucleotide incorporations.

D. M. Williams and coworkers applied the dual modification technique to modified DNAzyme selection, using modified dU and 7-deaza-dA analogs (Sidorov et al. 2004). Meanwhile, in 1999, D. M. Perrin et al. reported dual modification using C5-modified dUTP and C8-modified dATP in a PEX technique catalyzed using Sequenase Version 2.0 DNA polymerase, which is a genetically engineered form of T7 DNA polymerase with virtually no 3′ to 5′ exonuclease activity (Perrin et al. 1999; 2001). Using this technique, followed by in vitro selection, they were the first to produce a modified DNAzyme that acted as a metal-independent RNase A mimic with two different functional groups (kcat = 0.044 min−1) (Perrin et al. 2001). They also produced a modified DNAzyme with three different functional groups (i.e., amine, imidazole, and guanidine), which had an improved kcat value (0.134 min−1) (Hollenstein et al. 2009). In 2003, T. Tasara et al. synthesized C5-modified dUTPs, C5-modified dCTPs, C7-modified 7-deaza-dATPs, and C7-modified 7-deaza-dGTPs with foreign functional groups, including biotin, Rhodamine Green, Cyanine 5, Evoblue 30, and Gnothis Blue 3 (Tasara et al. 2003). They examined successive incorporations of modified nucleotides in the absence of all four natural dNTPs using PEX. A reaction using Vent(exo-) polymerase and biotinyl dNTPs with four different bases provided a modified DNA with a 40-mer elongated strand, in which biotinyl dNs were successively incorporated. Similarly, in 2005, S. Jäger et al. synthesized modified dNTPs with four different bases possessing basic, acidic, and lipophilic substituents and examined their successive incorporations (Jäger et al. 2005). Full-length modified DNAs comprising 40-mer elongated strands were produced by successive modified dN incorporations using PEX catalyzed by Pwo DNA polymerase.
Furthermore, these oligonucleotides, which were modified at high density, were found to be reverse-transcribed to natural cDNAs in a PEX procedure using Pwo DNA polymerase, four natural dNTPs, and a GC-rich reaction buffer containing 1.5 mM dimethyl sulfoxide. Eventually, the technical challenges in successive incorporations have encouraged researchers to explore various applications of enzymatically synthesized modified DNAs.
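The two turnover numbers quoted above for the modified DNAzymes (0.044 and 0.134 min−1) can be put side by side with a short calculation. The sketch below computes the fold improvement and, as an illustrative assumption only, converts each value into a half-life by treating the reaction as a simple first-order process under saturating conditions; the cited studies may have used different kinetic models, and the function names are hypothetical.

```python
import math

# Values quoted in the text (units: min^-1).
K_CAT_TWO_GROUPS = 0.044    # DNAzyme with two functional groups
K_CAT_THREE_GROUPS = 0.134  # DNAzyme with three functional groups


def fold_improvement(k_new: float, k_old: float) -> float:
    """Ratio of two rate constants."""
    return k_new / k_old


def half_life_min(k: float) -> float:
    """Half-life in minutes for a first-order process (assumed model)."""
    return math.log(2) / k


if __name__ == "__main__":
    fold = fold_improvement(K_CAT_THREE_GROUPS, K_CAT_TWO_GROUPS)
    print(f"fold improvement: {fold:.2f}")
    for label, k in [("two groups", K_CAT_TWO_GROUPS),
                     ("three groups", K_CAT_THREE_GROUPS)]:
        print(f"{label}: t1/2 = {half_life_min(k):.1f} min")
```

Under these assumptions the three-group DNAzyme is roughly threefold faster, which corresponds to a half-life of about 5 min rather than about 16 min.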

Postsynthetic derivatization is a convenient alternative method for the preparation of long DNA strands with high-density and/or bulky modifications (Fig. 3). In 2003, H. Sawai and coworkers reported the enzymatic synthesis of a 108-mer modified DNA containing 5-methoxycarbonylmethyl-dUs in place of natural T using PCR and its subsequent derivatizations via amide bond formation by ester–amide exchange reactions using amines such as tris(2-aminoethyl)amine, histamine, and hexamethylenediamine (Mehedi Masud et al. 2003). Conversion rates of the derivatizations were unfortunately not very high, i.e., 56 %, 72 %, and 76 %, respectively, owing to the hydrolysis of the methyl ester that occurred during the addition reactions. Thereafter, various postsynthetic derivatizations using different chemistries have been developed. Examples include pyranosyl sugar-modified DNA derivatized from a 300-mer precursory modified DNA containing 5-ethynyl-dUs by Cu-catalyzed alkyne–azide cycloaddition, i.e., a “click” reaction (Gierlich et al. 2007); biotin-modified DNA derivatized from a 35-mer precursory modified DNA containing a 7-[5-{(4-azidobutyl)amino}-5-oxopent-1-yn-1-yl]-7-deaza-dA by Staudinger ligation using biotinylated phosphine (Weisbrod and Marx 2007); hydrazone-modified DNA derivatized from a 98-mer precursory modified DNA containing 5-(5-formylthiophene-2-yl)-dCs by condensation reactions of aldehydes with arylhydrazines (Raindlová et al. 2010); and biotin-modified DNA derivatized from a 414-mer precursory modified DNA containing 7-vinyl-7-deaza-dAs by inverse electron demand Diels–Alder reaction (Bußkamp et al. 2014). The precursory modified DNAs were prepared in good yields by PCR/PEX, preferably using KOD XL, Pwo, and KlenTaq DNA polymerases; some postsynthetic derivatizations exhibited significantly improved conversion rates, i.e., >90 %.

Fig. 3

Base modifications employed for postsynthetic derivatization

3.2 Special Base Modification (Artificial Base Pairs)

Artificial base pairs have great potential for expanding the genetic information system of life on earth. Such ambitious studies are expected to shed light on the basic principles of life phenomena and the origin of life, as well as to enable technological applications.

In the late 1980s, S. A. Benner et al. first proposed expanded DNA alphabets (Switzer et al. 1989). In 1990, they demonstrated that artificial base pair formation, i.e., isoG–isoC (Fig. 4a), which differs from standard Watson–Crick base pairs, i.e., A–T(U) and G–C, in the pattern of hydrogen bond formation, is possible during enzymatic DNA and RNA syntheses catalyzed by KF and T7 RNA polymerases, respectively (Piccirilli et al. 1990). Unfortunately, the incorporation of T opposite the enolic tautomer of isoG was also observed. To exclude the formation of the undesirable isoG–T pairing, they used 2-thioTTP instead of TTP, in addition to the other five triphosphate substrates, i.e., dATP, dGTP, dCTP, isoGTP, and isoCTP. As expected, for the incorporation of isoC opposite a single isoG on a 51-mer template during PCR using KlenTaq DNA polymerase, this substitution increased the fidelity-per-round from approximately 93 % to 98 % (Sismour and Benner 2005). Furthermore, in 2006, they designed a novel third base pair comprising 2-amino-8-(2′-deoxy-β-D-erythro-pentofuranosyl)-imidazo[1,2-a]-1,3,5-triazin-4(8H)-one (dP) and 6-amino-5-nitro-3-(1′-β-D-2′-deoxyribofuranosyl)-2(1H)-pyridone (dZ) (Yang et al. 2006). PCR experiments using a 51-mer template containing a single dP residue showed high incorporation accuracies without using 2-thioTTP when dPTP and dZTP were used with the four natural dNTPs. The estimated values for fidelity-per-round were 94.4 %, 97.5 %, and 97.5 %, respectively, when Taq, Vent(exo-), and Deep Vent(exo-) DNA polymerases were used (Yang et al. 2007). In the meantime, E. T. Kool, F. E. Romesberg, and I. Hirao were independently designing and developing their own candidates for the third base pair.
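To see why a few percentage points of fidelity-per-round matter, the sketch below compounds the per-round values quoted above over several PCR rounds. It assumes, as a simplification, that a fraction f of the unnatural pairs is retained in each round, so that f raised to the power n remains after n rounds; the choice of 20 rounds and the function names are illustrative assumptions and are not taken from the cited studies.

```python
import math


def retention(fidelity_per_round: float, rounds: int) -> float:
    """Fraction of products still carrying the unnatural pair after
    `rounds` rounds, assuming independent loss in every round."""
    return fidelity_per_round ** rounds


def rounds_to_half(fidelity_per_round: float) -> float:
    """Number of rounds after which retention drops to 50 %."""
    return math.log(0.5) / math.log(fidelity_per_round)


if __name__ == "__main__":
    # Fidelity-per-round values quoted in the text.
    quoted = {
        "isoG-isoC with TTP": 0.93,
        "isoG-isoC with 2-thioTTP": 0.98,
        "P-Z with Taq": 0.944,
        "P-Z with Vent(exo-)": 0.975,
    }
    for label, f in quoted.items():
        print(f"{label}: {retention(f, 20):.1%} retained after 20 rounds, "
              f"half lost after {rounds_to_half(f):.0f} rounds")
```

Under this simple model, raising the fidelity-per-round from 93 % to 98 % changes the retention after 20 rounds from roughly a quarter to roughly two thirds of the products.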

Fig. 4

Artificial base pairs reported by (a) Benner’s, (b) Kool’s, (c) Romesberg’s, and (d) Hirao’s groups

In 1998, E. T. Kool and coworkers reported an F (2,4-difluorotoluene)–Z (4-methylbenzimidazole) base pair (Fig. 4b), which does not form hydrogen bonds between the bases (Morales and Kool 1998). They demonstrated that efficiencies for a single incorporation of the Z residue opposite the F residue on a 28-mer DNA template were 130–1900-fold greater than those of the four natural dNs when KF (3′–5′ exo-) was used. Their results indicate that hydrogen bonding is not necessary for base pair formation, and, if anything, the size and shape of paired bases are more important to the adoption of artificial base pairs by an oligonucleotide duplex. Indeed, thereafter, it was demonstrated that pyrene deoxynucleoside (dPyr) as a nonpolar base analog can selectively be incorporated opposite an abasic nucleoside (X) or a tetrahydrofuran abasic analog (ϕ) to form a dPyr–X/ϕ pair during PEX (Matray and Kool 1999). Moreover, they developed “yDNA” (an abbreviation of “wide DNA”), which involves benzopyrimidine deoxynucleosides (dyT and dyC) bearing size-expanded pyrimidines, i.e., yT and yC, designed to form yT–A and yC–G pairs (Lee and Kool 2005). They then examined whether or not dyT and dyC can store and transfer genetic information in vitro and in bacterial cells (Chelliserrykattil et al. 2008). The results showed that the correct nucleotides could be inserted opposite yDNA residues in PEX using KF (3′–5′ exo-) and Vent(exo-) DNA polymerases. Furthermore, the first example of encoding a protein (GFP; green fluorescent protein) in a living organism, i.e., E. coli, with an unnatural DNA base pair architecture was demonstrated in 2008.

In 1999, F. E. Romesberg and coworkers reported that a stable 7-propynyl isocarbostyril nucleoside (dPICS) self-pair can be formed in duplex DNA (Fig. 4c), and dPICS triphosphate can be incorporated opposite dPICS on the template by KF (3′–5′ exo-) with reasonable efficiency (McMinn et al. 1999). However, after the dPICS incorporation, synthesis proceeded inefficiently. Thereafter, they determined the best pair from the 3600 (60 × 60) combinations of unnatural DNA base analogs, i.e., dSICS–dMMO2 (Leconte et al. 2008), and subsequently achieved d5SICS–dMMO2 and d5SICS–dNaM pairs, which exhibited high values of 85.7–99.8 % for fidelity-per-round in PCR using Taq, Deep Vent, and Phusion high-fidelity DNA polymerases (Seo et al. 2009; Malyshev et al. 2009). Recently, they finally managed to create a semisynthetic organism with an expanded genetic alphabet involving d5SICS–dNaM as the third base pair; the genetically engineered organism was an E. coli strain expressing an algal nucleoside triphosphate transporter that efficiently imports the triphosphates of d5SICS and dNaM, which allowed it to accurately replicate a plasmid containing d5SICS–dNaM (Malyshev et al. 2014).

In 2000, I. Hirao and coworkers designed and synthesized 2-amino-6-(N,N-dimethylamino)purine (denoted by x) and pyridin-2-one (denoted by y) deoxynucleoside analogs (Fig. 4d) (Ishikawa et al. 2000). They anticipated that the steric hindrance between the dimethyl groups at the N6 position of x and the 4-keto group of T would interfere with the formation of an x–T mismatch pair and, furthermore, that the unique pattern of hydrogen bonding between N1 and N2 on x and N1 and O2 on y would form a stable and specific x–y base pair. In PEX, using KF and KF (3′–5′ exo-), y was selectively incorporated opposite x on the template but unfortunately was also erroneously incorporated opposite A and G. A ribonucleoside-5′-triphosphate analog of y was also synthesized, and the single incorporation of y opposite x in transcription was assessed using T7 RNA polymerase (Ohtsuki et al. 2001). As a result, y was incorporated opposite x with 95 % accuracy, while the erroneous incorporation of U opposite x was only occasionally observed (<5 % of instances). Thereafter, to improve incorporation efficiency and selectivity, they developed 2-amino-6-(2-thienyl)purine (denoted by s) and imidazolin-2-one (denoted by z) analogs to form s–y (Hirao et al. 2002) and s–z base pairs, respectively, which involve the formation of two hydrogen bonds (Hirao et al. 2004). Furthermore, the improvement of s provided the 7-(2-thienyl)-imidazo[4,5-b]pyridine (Ds) analog, and that of z yielded pyrrole-2-carboxaldehyde (Pa) (Hirao et al. 2006) and 2-nitro-4-propynylpyrrole (Px) analogs, which can form Ds–Pa and Ds–Px base pairs, respectively (Kimoto et al. 2009). The Ds–Px base pair in particular exhibited a high value of 99.9 % for fidelity-per-round in PCR using Deep Vent DNA polymerase. Intriguingly, like Kool’s F–Z pair and Romesberg’s d5SICS–dNaM pair, these base pairs do not involve hydrogen bonds.
Using a Px analog bearing a foreign functionality attached via the 4-propynyl group, they have recently performed selections of modified DNA aptamers containing nucleoside analogs as the fifth base (discussed later).

3.3 Sugar Modification

In general, sugar-modified nucleic acids are called xeno nucleic acids (XNAs). XNAs containing modified sugars with 2′-substituents such as methoxy (–OMe), fluoro (–F), amino (–NH2), and azido (–N3) groups are typical examples. Moreover, XNAs based on C2′-stereoisomers such as arabinonucleic acids (ANAs) and 2′-fluoroarabinonucleic acids (FANAs) have been extensively studied. Furthermore, XNAs containing unconventional sugars such as hexitol nucleic acids (HNAs), α-L-threofuranosyl nucleic acids (TNAs), cyclohexenyl nucleic acids (CeNAs), and 2′-O,4′-C-methylene-bridged/locked nucleic acids (2′,4′-BNAs/LNAs; hereinafter referred to as “LNA”) have attracted the attention of researchers as informational biopolymer alternatives to DNA and RNA (Fig. 5).

Fig. 5

Sugar-modified nucleoside triphosphates

In the late 1990s, to incorporate 2′-modified nucleotide analogs, screening of DNA polymerases that can incorporate 2′–F analogs was attempted, and Vent(exo-) and Deep Vent(exo-) DNA polymerases were found to be efficient catalysts (Ono et al. 1997). Moreover, the engineering of RNA polymerases was attempted. For example, T7 RNA polymerase variants, Y639F and Y639F/H784A, which enabled an efficient enzymatic incorporation of 2′–F or 2′–NH2 analogs and those of the 2′–OMe or 2′–N3 analogs, respectively, were developed (Padilla and Sousa 1999, 2002).

In 2000, K. Vastmans et al. demonstrated that Vent(exo-) DNA polymerase can elongate a 6-mer HNA strand on a DNA template using an HNA triphosphate (hATP) bearing an adenine base (Vastmans et al. 2000). In 2003, J. C. Chaput et al. examined DNA synthesis on a TNA template and TNA synthesis on a DNA template using various DNA polymerases (Chaput and Szostak 2003). Thereafter, they found that the Therminator DNA polymerase can polymerize TNA oligomers that are at least 80 nt in length using four TNA triphosphates, i.e., tTTP, tGTP, tCTP, and tDTP, which bear thymine, guanine, cytosine, and 2,6-diaminopurine, respectively, as a nucleobase (Ichida et al. 2005). In 2005, V. Kempeneers et al. reported that seven efficient successive CeNA incorporations were possible in the DNA-dependent CeNA polymerization using a CeNA triphosphate (CeATP) bearing an adenine base and Vent(exo-) DNA polymerase under conditions supplemented with 1 mM Mn2+ (Kempeneers et al. 2005). In 2007, R. N. Veedu et al. first examined the enzymatic incorporation of LNA nucleotides and observed that Phusion High-Fidelity DNA polymerase could accept LNA triphosphates bearing thymine and adenine bases and catalyze primer extension reactions to yield DNA-based LNA strands (Veedu et al. 2007). In 2008, M. Kuwahara et al. first demonstrated that KOD Dash DNA polymerase was superior to Phusion High-Fidelity DNA polymerase because of its reduced 3′ to 5′ exonuclease activity and could be applied to the synthesis of not only LNA but also other types of bridged nucleic acids, i.e., 2′,4′-BNACOC and 2′,4′-BNANC (Kuwahara et al. 2008).

In 2012, P. Holliger et al. successfully created several XNA polymerase variants of Tgo DNA polymerase using compartmentalized self-tagging (CST) selection, which was performed on libraries of TgoT DNA polymerase that contained four amino acid mutations (V93Q, D141A, E143A, and A485L) (Pinheiro et al. 2012). For example, HNA polymerase was additionally mutated at V589A, E609K, I610M, K659Q, E664Q, Q665P, R668K, D669Q, K671H, K674R, T676R, A681S, L704P, and E730G. CeNA/LNA polymerase was additionally mutated at E654Q, E658Q, K659Q, V661A, E664Q, Q665P, D669A, K671Q, T676K, and R709K. ANA/FANA polymerase was additionally mutated at L403P, P657T, E658Q, K659H, Y663H, E664K, D669A, K671N, and T676I. These TgoT variants can produce the corresponding 72-mer XNA strands on a DNA template using four XNA triphosphates with different bases (A, G, C, and T).
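The mutation lists above are easier to compare when they are handled as data. The following sketch parses the substitution strings given in the text and reports which residue positions are altered in all three XNA polymerases; the dictionary layout and the helper names `parse_mutation` and `mutated_positions` are hypothetical, and the comparison carries no structural or mechanistic interpretation.

```python
import re

# Additional mutations (on top of TgoT) as listed in the text.
VARIANTS = {
    "HNA": ["V589A", "E609K", "I610M", "K659Q", "E664Q", "Q665P", "R668K",
            "D669Q", "K671H", "K674R", "T676R", "A681S", "L704P", "E730G"],
    "CeNA/LNA": ["E654Q", "E658Q", "K659Q", "V661A", "E664Q", "Q665P",
                 "D669A", "K671Q", "T676K", "R709K"],
    "ANA/FANA": ["L403P", "P657T", "E658Q", "K659H", "Y663H", "E664K",
                 "D669A", "K671N", "T676I"],
}


def parse_mutation(code: str) -> tuple[str, int, str]:
    """Split e.g. 'K659Q' into (wild-type residue, position, new residue)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", code)
    if match is None:
        raise ValueError(f"not a substitution code: {code!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def mutated_positions(codes: list[str]) -> set[int]:
    """Residue positions touched by a list of substitutions."""
    return {parse_mutation(code)[1] for code in codes}


# Positions mutated in every variant, regardless of the new residue.
shared_positions = set.intersection(
    *(mutated_positions(codes) for codes in VARIANTS.values())
)
# Substitutions that are identical in every variant.
shared_identical = set.intersection(*(set(c) for c in VARIANTS.values()))

if __name__ == "__main__":
    print("positions mutated in all three:", sorted(shared_positions))
    print("identical substitutions in all three:", sorted(shared_identical))
```

Run on the lists as given, this shows five positions (659, 664, 669, 671, and 676) that are altered in all three polymerases, although no single substitution is identical across all of them.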

Other types of XNAs include 4′-modified analogs such as 4′-thioribonucleoside-5′-triphosphates, 2′-deoxy-4′-thionucleoside-5′-triphosphates, and 2′-deoxy-4′-selenonucleoside-5′-triphosphates, which have been developed by N. Minakawa and coworkers (Kato et al. 2005; Kojima et al. 2013; Tarashima et al. 2015). They found that the first can be accepted by T7 RNA polymerase, whereas the second and third can act as good substrates for KOD Dash DNA polymerase.

Thus, continuous efforts in the screening and engineering of polymerases have enabled an efficient enzymatic production of various XNAs.

3.4 Phosphate Modification

Phosphorothioate nucleotides, in which one of the non-bridging α-phosphate oxygens is replaced by sulfur, are well-studied phosphate-modified analogs because the introduction of phosphorothioate linkages into oligonucleotides can enhance nuclease resistance and antisense activity (Fig. 6). In 1968, F. Eckstein et al. first demonstrated enzymatic incorporation of a phosphorothioate using UTPαS (mixture of Sp- and Rp-isomers) and Escherichia coli DNA-dependent RNA polymerase (Matzura and Eckstein 1968). Thereafter, using optically pure Sp- or Rp-ATPαS, they discovered that the Sp-isomer could be more efficiently incorporated than the Rp-isomer during RNA strand elongation catalyzed by E. coli DNA-dependent RNA polymerase (Eckstein et al. 1976). In 1988, H. P. Vosberg et al. showed that Taq DNA polymerase-catalyzed PCR with Sp-dATPαS and dNTPs, except dATP, could efficiently amplify 310-mer double-stranded DNA fragments that contained multiple phosphorothioates (Nakamaye et al. 1988). The same results were also obtained when an Sp-dNTPαS with a different base (G, C, or T) was used. As other examples of non-bridging α-phosphate oxygen substituents, phosphoroboranoate and phosphoroselenoate analogs of 2′-deoxynucleoside-5′-triphosphates have been reported. Unlike phosphorothioate analogs, the Rp-isomer of phosphoroboranoates was preferably incorporated using Sequenase DNA polymerase. One should note, however, that the Sp-isomer of phosphorothioates corresponds to the Rp-isomer of phosphoroboranoates in the configuration of the four substituents, i.e., –P2O5 3−, =O, nucleoside, and –SH/–BH3−, that bond to the asymmetric phosphorus atom (Li et al. 1995; He et al. 1999). Meanwhile, both isomers of phosphoroselenoates were equally accepted as substrates by KF to produce corresponding DNA strands (Carrasco and Huang 2004).

Fig. 6

Phosphate-modified nucleoside triphosphates

Replacement of the bridging oxygen between the α-phosphorus and the sugar 5′ carbon in a dNTP by other substituents is another means to enzymatically introduce chemical modifications into oligonucleotide backbones. J. L. Wolfe et al. reported that, in the presence of natural dNTPs, KF (3′–5′ exo-) could accept 5′-amino-2′,5′-dideoxynucleoside-5′-N-triphosphates to produce the corresponding oligodeoxynucleotide strand in which multiple oxygen atoms at the 5′-position were replaced with imino (–NH–) groups (Wolfe et al. 2002). Similarly, P. Herdewijn et al. demonstrated the insertion of the methyleneoxy (–CH2O–) group at this position using 5′-O-diphosphorylphosphonomethyl-2′-dA with Therminator DNA polymerase (Renders et al. 2007).

To analyze how chemical modifications of the pyrophosphate moiety, which acts as the leaving group, affect polymerase activities, dNTPs bearing substituents on the γ-phosphate, i.e., γ-substituted dNTPs, were examined using AMT RT and DNA polymerase α (Alexandrova et al. 1998). As a result, γ-substituted dNTPs with azidoethyl, aminoethyl, phenyl, or 2,4-dinitrophenyl groups were moderately accepted only by AMT RT, although the latter two were poorer substrates than the former two owing to the bulkiness of the substituents. Furthermore, phosphate-modified analogs in which the bridging oxygen between β- and γ-phosphorus is replaced by the methylene (–CH2–) or dibromomethylene (–CBr2–) group were also examined. The results indicated that these modifications are more sensitive to the action of AMT RT than γ-substitutions.

4 Applications

4.1 DNA Sequencing

DNA sequencing methods can roughly be classified into three categories, i.e., first-, second-, and third-generation technologies. First-generation sequencing refers to fluorescent DNA sequencing using capillary electrophoresis, which greatly contributed to the Human Genome Project (1990–2003). Two fluorescent DNA sequencing methods were devised in the 1980s (Fig. 7): the dye-primer method, which uses fluorophore-labeled primers, and the dye-terminator method, which uses fluorophore-labeled terminators, i.e., four 2′,3′-dideoxynucleoside-5′-triphosphates (ddNTPs) tagged with different fluorescent dyes. Initially, the dye-primer method, which could provide more satisfactory long-read sequencing data, was mainly used, and KF (3′–5′ exo-) and Sequenase DNA polymerase were employed as enzymes for sequencing reactions. However, after the development of AmpliTaq DNA polymerase FS, which exhibits very weak 5′ to 3′ exonuclease activity and readily incorporates ddNTPs, the dye-primer method was replaced by the dye-terminator method.
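The read-out principle of the dye-terminator method can be sketched as a toy model: every chain ends in one dye-labeled ddNTP, so sorting the chains by length and reading the dye colors in order gives the sequence of the newly synthesized strand. The dye colors, function names, and example template below are illustrative assumptions and are not taken from any instrument or from the cited works.

```python
# Toy model of dye-terminator read-out. Dye colors and the example
# template are hypothetical and serve only as an illustration.
DYE_FOR_BASE = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}
BASE_FOR_DYE = {dye: base for base, dye in DYE_FOR_BASE.items()}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def terminated_fragments(template, primer_len):
    """Return (length, dye) for each chain terminated by a labeled ddNTP.

    `template` is written 3'->5', so the new strand grows left to right
    and each incorporated base is the complement of the template base.
    """
    fragments = []
    for offset, template_base in enumerate(template[primer_len:], start=1):
        incorporated = COMPLEMENT[template_base]
        fragments.append((primer_len + offset, DYE_FOR_BASE[incorporated]))
    return fragments


def call_bases(fragments):
    """Order fragments by length, as electrophoresis does, and read the dyes."""
    return "".join(BASE_FOR_DYE[dye] for _, dye in sorted(fragments))


if __name__ == "__main__":
    frags = terminated_fragments("TACGGATC", primer_len=2)
    print(call_bases(frags))
```

The model ignores signal intensity, incomplete termination, and dye-dependent mobility shifts, all of which matter in real capillary electrophoresis data.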

Fig. 7

Modified nucleoside triphosphates employed for DNA sequencing technologies

The second-generation sequencing technologies enable massively parallel sequencing. For example, Roche 454 sequencing technologies based on the pyrosequencing method employ dATPαS instead of dATP because luciferase accepts natural dATP as a substrate, giving rise to false-positive signals. As with dATPαS, some dATP analogs with base modifications can reduce false-positive luciferase signals (Kajiyama et al. 2011). In pyrosequencing, KF (3′–5′ exo-) (Ronaghi et al. 1996), Sequenase Version 2.0 DNA polymerase (Gharizadeh et al. 2004), and Bst DNA polymerase (large fragment) (Margulies et al. 2005) are used. Lasergen developed another sequencing technology based on the cyclic reversible termination method using 3′-OH unblocked nucleotides called Lightning Terminators. These triphosphates have a terminating 2-nitrobenzyl moiety attached to hydroxymethylated nucleobases and are efficiently incorporated by Therminator DNA polymerase (Gardner et al. 2012) (Fig. 7).

The third-generation sequencing technologies involve single-molecule real-time (SMRT) sequencing, which enables direct analysis of DNA/RNA extracted from biological samples such as cells, tissues, and organs. For example, Pacific Biosciences developed an SMRT sequencing system that uses four modified dNTPs, each carrying a different fluorophore on the γ-phosphate, and a complex of φ29 DNA polymerase and an analyte DNA template. A single complex is immobilized at the bottom of a zero-mode waveguide, and attenuated light from the excitation beam penetrates only the lower 20–30 nm of each waveguide, which gives a light microscope with a detection volume of only 20 × 10−21 L (20 zL) (Eid et al. 2009) (Fig. 7).
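To see why such a small detection volume matters, the mean number of freely diffusing labeled nucleotides inside it can be estimated as N = C × V × N_A. The following is a minimal sketch of that arithmetic; the nucleotide concentrations are illustrative assumptions and are not values reported in the cited work.

```python
AVOGADRO = 6.02214076e23      # molecules per mole (exact SI value)
DETECTION_VOLUME_L = 20e-21   # 20 zL, the detection volume quoted above


def mean_molecules(concentration_molar, volume_liters=DETECTION_VOLUME_L):
    """Mean number of molecules present in the volume: N = C * V * N_A."""
    return concentration_molar * volume_liters * AVOGADRO


def concentration_for_one_molecule(volume_liters=DETECTION_VOLUME_L):
    """Concentration (mol/L) at which the volume holds one molecule on average."""
    return 1.0 / (volume_liters * AVOGADRO)


if __name__ == "__main__":
    # Hypothetical labeled-dNTP concentrations, chosen only for illustration.
    for conc in (1e-9, 1e-6, 1e-4):
        print(f"{conc:.0e} M -> {mean_molecules(conc):.3g} molecules on average")
    print(f"one molecule on average at {concentration_for_one_molecule():.3g} M")
```

Under these assumptions, a 1 µM solution places only about 0.012 labeled molecules in the detection volume on average, and the average reaches one molecule only at roughly 83 µM. Background from free nucleotides therefore stays low, and a nucleotide held in the polymerase active site stands out.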

Thus, development of modified triphosphates and engineered polymerases continues to greatly contribute to the advancement of sequencing technologies.

4.2 Aptamer Development

To enhance nuclease resistance and improve target-binding affinities and specificities, various modified nucleoside triphosphates have been employed in the aptamer selection process known as systematic evolution of ligands by exponential enrichment (SELEX) (Tuerk and Gold 1990; Ellington and Szostak 1990). Macugen (pegaptanib sodium injection) for the treatment of wet age-related macular degeneration is a typical example of an RNA-based aptamer. This modified RNA aptamer targeting vascular endothelial growth factor (VEGF) was selected from a modified RNA library. The library was enzymatically synthesized using T7 RNA polymerase and 2′-F pyrimidine nucleoside triphosphates (U and C) as well as natural ATP and GTP, followed by derivatizations, including post-SELEX modification, by which all natural purine nucleotides, except for two adenosine residues, could be replaced with 2′-OMe analogs (Ruckman et al. 1998). Furthermore, the use of a T7 RNA polymerase variant (Y639F, H784A, and K378R) enabled direct selection of modified RNA aptamers fully substituted with 2′-OMe nucleotides, which was demonstrated with VEGF as a target (Burmeister et al. 2005). Other examples of modified RNA aptamers created by SELEX using the corresponding triphosphate analogs and T7 RNA polymerase are those containing 2′-NH2 pyrimidine nucleosides (U and C) for human neutrophil elastase (HNE) (Lin et al. 1994), those containing 4′-thioribonucleosides (A, G, C, and U) for human thrombin (Minakawa et al. 2008), those containing 5-iodouridines (photoaptamer) for the HIV-1 Rev protein (Jensen et al. 1995), those containing 5-(3-aminopropyl)uridines for ATP (Vaish et al. 2003), and those containing phosphorothioates (A, G, C, and U) for basic fibroblast growth factor (bFGF) (Jhaveri et al. 1998).

For DNA-based aptamer selection, Vent and KOD-related DNA polymerases have mainly been used. For example, modified DNA aptamers containing 5-(1-pentynyl)-2′-deoxyuridines for human thrombin (Latham et al. 1994) and those containing 5-(3-aminopropyl)-2′-deoxyuridines for ATP were obtained by SELEX using Vent DNA polymerase (Battersby et al. 1999). Meanwhile, modified DNA aptamers containing 5-N-(6-aminohexyl)carbamoylmethyl-2′-deoxyuridines for (R)-thalidomide (Fig. 2) (Shoji et al. 2007); those containing (E)-5-(2-(N-(2-(N6-adeninyl)ethyl))carbamylvinyl)-2′-deoxyuridines for camptothecin derivatives (Fig. 2) (Imaizumi et al. 2013); those containing 5-tryptaminocarbonyl-2′-deoxyuridines (SOMAmers) for various protein targets such as fractalkine, osteoprotegerin, and cytotoxic T-lymphocyte-associated protein 4 (CTLA-4) (Fig. 2) (Gold et al. 2010); those containing the 2′-deoxynucleosides bearing Ds, i.e., one of the artificial bases from the Ds–Px base pair for VEGF and interferon-γ (Fig. 4d) (Kimoto et al. 2013); and those containing locked nucleic acid (LNA) nucleosides bearing thymine for human thrombin (Kasahara et al. 2013) were obtained by SELEX using KOD-related DNA polymerases such as KOD Dash, KOD XL, and AccuPrime Pfx DNA polymerases (Fig. 5). In addition, SELEX using TaKaRa Taq Hot Start (HS) DNA polymerase was reported to have provided DNA-based aptamers comprised of six letters, i.e., A, G, C, T, Z (2(1H)-pyridone), and P (imidazo[1,2-a]-1,3,5-triazin-4(8H)-one), for HepG2 liver cancer cells (Zhang et al. 2015).

Furthermore, selection of the following XNA aptamers has been achieved using DNA polymerases: threose nucleic acid (TNA) aptamers for human thrombin (Therminator DNA polymerase) (Yu et al. 2012), hexitol nucleic acid (HNA) aptamers for hen egg lysozyme and HIV trans-activating response RNA (TgoT variant) (Pinheiro et al. 2012), and 2′-fluoroarabinonucleic acid (FANA) aptamers for HIV-1 reverse transcriptase (TgoT variant) (Alves Ferreira-Bravo et al. 2015).
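The enrichment logic shared by the SELEX experiments in this section can be illustrated with a toy calculation. In each round, a partition step retains binders more efficiently than non-binders, and enzymatic amplification with the modified triphosphates is assumed to restore the pool size without changing its composition. The retention probabilities and the starting binder fraction below are hypothetical values chosen for illustration. They are not measurements from any of the cited selections.

```python
def selex_round(binder_fraction, binder_retention, background_retention):
    """Binder fraction after one partition step.

    Amplification is assumed to be unbiased, so it does not alter the fraction.
    """
    kept_binders = binder_fraction * binder_retention
    kept_background = (1.0 - binder_fraction) * background_retention
    return kept_binders / (kept_binders + kept_background)


def run_selex(initial_fraction, rounds,
              binder_retention=0.5, background_retention=0.005):
    """Return the binder fraction before selection and after each round."""
    history = [initial_fraction]
    for _ in range(rounds):
        history.append(
            selex_round(history[-1], binder_retention, background_retention)
        )
    return history


if __name__ == "__main__":
    # Hypothetical library in which one sequence per million binds the target.
    for round_number, fraction in enumerate(run_selex(1e-6, rounds=5)):
        print(f"round {round_number}: binder fraction = {fraction:.6g}")
```

In this model, the odds of drawing a binder are multiplied each round by the ratio of the two retention probabilities, which is why a rare binder can dominate the pool after only a few rounds. Real selections deviate from this picture, for example when the polymerase amplifies some modified sequences less efficiently than others.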

5 Conclusions and Future Outlook

Since the 1980s, various nucleoside triphosphate analogs and polymerase variants have been developed. As research on the screening and engineering of RNA/DNA polymerases progressed, it became possible to produce oligonucleotides with single, double, triple, and quadruple substitutions with modified nucleotide(s) in the absence of the corresponding NTP(s)/dNTP(s). Furthermore, oligodeoxynucleotides with expanded genetic alphabets and artificial biopolymers, i.e., XNAs with sugar structures quite different from β-d-ribofuranose in RNA or 2-deoxy-β-d-ribofuranose in DNA, can be enzymatically synthesized. In addition to the abovementioned examples, various enzymatically synthesized polynucleotides that exhibit unique properties owing to the introduced modifications, e.g., solvatochromic (Riedl et al. 2012), viscosity-sensitive (Dziuba et al. 2015), electroconductive (Patolsky et al. 2002), and amphiphilic functionalities (Fujita et al. 2015), have been reported. Such artificial biopolymers are expected to serve as sensor materials for environmental and biological analyses and as programmable nanocapsules for drug carriers and gene delivery in the near future.