Introduction

Microbes are abundant on the planet. However, evidence has shown that most microbial species found in the environment cannot be cultivated (Tsai and Olson 1991; Somerville et al. 1989). Accordingly, methodologies for directly characterizing microbes in environmental samples without cultivation have become important. PCR has become an invaluable tool because specific DNA segments can be amplified from a background of complex genomes (Arnheim and Erlich 1992; Erlich et al. 1991). In studies of molecular evolution (Wagner et al. 1994) and microbial ecology (Steffan and Atlas 1991), this property has facilitated the characterization of both single genes and gene families in single or multiple species. This is generally achieved by designing specific primers that target conserved regions of homologous genes, thereby accelerating the detection, amplification, and ultimately sequence analysis of the genes under study. In many cases, the PCR template is a mixture of genomic DNA extracted from natural microbial communities. In such applications, it is essential to detect, sensitively and specifically, a particular sequence present at low frequency (Holben et al. 1988), and to identify different populations in complex mixtures of genomic DNA.

Our previous searches for the last universal common ancestor (LUCA) uncovered a Methanopyrus kandleri (Mka)-proximal LUCA (Xue et al. 2003, 2005). Further searches for organisms even closer to LUCA than Mka might therefore begin with members of the genus Methanopyrus, or with close relatives of Methanopyrus. For this purpose, the divergence of the valyl-tRNA synthetase (ValRS) and isoleucyl-tRNA synthetase (IleRS) genes will be employed as a measure of primitivity. Over the last 20 years, several pure Methanopyrus isolates from the Pacific and Atlantic Oceans, as well as mixed DNA samples from the Indian Ocean that contain Methanopyrus genomes, have been collected (Penders 2002; Takai et al. 2004a, b). Unlike in pure Methanopyrus isolates, the frequency of Methanopyrus-specific sequences in the mixed DNA samples from the Indian Ocean is usually low. It is therefore necessary to optimize PCR conditions using genomic DNA from pure isolates, and then to adapt these conditions to capture Methanopyrus-specific sequences sensitively and specifically from the Indian Ocean mixed sample.

This study investigated the experimental conditions required to amplify Methanopyrus-specific IleRS as a reference gene from pure Methanopyrus isolates using PCR. The choice of DNA polymerase, target sequence size, PCR primer size and use of degenerate primers were optimized.

Materials and methods

DNA samples and oligonucleotide primers

Methanopyrus isolates GC37 from the Pacific Ocean and TAG11 from the Atlantic Ocean were provided by the Culture Collection of the Archaea Center of the University of Regensburg (Penders 2002; Yu et al. 2009), and DNA was prepared from the cells as described for whole-genome sequencing of Mka (Slesarev et al. 2002). Primers for DNA amplification were designed with Primer Premier 5.0, based on regions of IleRS conserved between the pure Methanopyrus isolates from the Pacific and Atlantic Oceans, and were obtained from Invitrogen (Carlsbad, CA). They are listed in Table 1.

Table 1 Primer sequences for isoleucyl-tRNA synthetase (IleRS) amplification

Preparation of PCR amplicons

Unless otherwise stated, the PCR amplification reaction was carried out in a final volume of 20 μl containing 1.5 ng GC37 genomic DNA, 100 nM each of two primers, 62.5 μM each of dNTPs, 50 mM KCl, 10 mM Tris-HCl, 1.5 mM MgCl2 and 1 U Taq polymerase (Amersham Biosciences, Piscataway, NJ). PCR amplification consisted of denaturation at 95°C for 3 min, followed by 35 cycles of 30 s at 95°C, 30 s at 55°C, 2 min at 72°C, and a final extension step at 72°C for 10 min. At the end of the reaction, the reaction mixture was cooled to 4°C to await further use.
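The reaction composition and cycling program above can be checked with a little arithmetic. The following Python sketch uses only the concentrations, volume and step times stated in the text; the function names are illustrative, and the run time ignores instrument ramping between temperatures, so it should be read as a lower bound.

```python
# Sketch: per-reaction amounts and nominal run time for the protocol above.
# All numbers come from the text; function names are illustrative only.

REACTION_VOLUME_UL = 20.0


def amount_pmol(conc_nM: float, volume_ul: float) -> float:
    """Amount in pmol for a concentration in nM and a volume in microlitres."""
    # 1 nM = 1 fmol per microlitre, and 1000 fmol = 1 pmol.
    return conc_nM * volume_ul / 1000.0


def program_seconds(initial_s: int, cycles: int, step_seconds, final_s: int) -> int:
    """Nominal duration of a PCR program, excluding temperature ramping."""
    return initial_s + cycles * sum(step_seconds) + final_s


primer_pmol = amount_pmol(100.0, REACTION_VOLUME_UL)    # each primer: 2 pmol
dntp_pmol = amount_pmol(62_500.0, REACTION_VOLUME_UL)   # each dNTP: 1250 pmol

# 95 C for 3 min; 35 cycles of 30 s, 30 s and 2 min; final 72 C for 10 min.
total_s = program_seconds(180, 35, (30, 30, 120), 600)  # 7080 s

print(f"primer: {primer_pmol} pmol, dNTP: {dntp_pmol} pmol each")
print(f"nominal run time: {total_s / 60:.0f} min")
```

Each reaction therefore contains 2 pmol of each primer and 1.25 nmol of each dNTP, and the program takes about 118 min before ramping is counted.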

Electrophoresis

After PCR, the products were mixed with 6× loading dye and electrophoresed on a 1.0% agarose gel in TBE buffer. After electrophoresis, product sizes were checked on the ethidium bromide-stained gel.

Results and discussion

Selection of DNA polymerase

DNA polymerases play a central role in PCR. Their application depends on the ability of polymerases to faithfully replicate DNA, yielding a detectable product that accurately represents the initial template (Steitz 1999; Joyce and Steitz 1994). It has been found that different DNA polymerases have different properties (Gardner and Jack 2002). KOD (Takagi et al. 1997; Mizuguchi et al. 1999), Taq (Innis et al. 1988; Lawyer et al. 1989) and Pfu (Lundberg et al. 1991) DNA polymerases have been widely used in PCR amplification. To select a highly efficient DNA polymerase, KOD (Novagen, Toyobo, Japan), Taq (Amersham Biosciences, Piscataway, NJ) and Pfu (Sangon, Shanghai, China) were used to amplify a DNA fragment of about 1,500 bp. The results (Fig. 1) showed that KOD and Taq possess similar efficiency. Both can amplify the target product from 0.01 ng genomic DNA; KOD gave marginally higher efficiency than Taq, since the concentrations of DNA produced from KOD (Fig. 1b) were higher than those from Taq (Fig. 1c). Pfu (Fig. 1a) exhibited the lowest efficiency, and could amplify the target DNA only after the quantity of template was increased to 0.05 ng.
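To put the template amounts in this comparison into perspective, the mass of genomic DNA can be converted into an approximate number of genome copies. The Python sketch below is illustrative only and rests on two assumptions that are not stated in the text: a genome size of about 1.7 Mb for the Methanopyrus isolate, and the usual approximation of 650 g/mol per base pair of double-stranded DNA.

```python
# Sketch: approximate genome copies in a given mass of template DNA.
AVOGADRO = 6.022e23          # molecules per mole
BP_MASS_G_PER_MOL = 650.0    # ASSUMPTION: average mass of one base pair of dsDNA
GENOME_SIZE_BP = 1.7e6       # ASSUMPTION: approximate Methanopyrus genome size


def genome_copies(template_ng: float, genome_bp: float = GENOME_SIZE_BP) -> float:
    """Approximate number of genome copies in template_ng nanograms of DNA."""
    grams = template_ng * 1e-9
    return grams * AVOGADRO / (genome_bp * BP_MASS_G_PER_MOL)


# Template amounts taken from the dilution series described in the text.
for ng in (1.0, 0.05, 0.01):
    print(f"{ng:g} ng -> ~{genome_copies(ng):,.0f} genome copies")
```

Under these assumptions, the 0.01 ng detection limit of KOD and Taq corresponds to roughly five thousand genome copies per reaction, and the 0.05 ng needed by Pfu to five times that number.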

Fig. 1 Ethidium bromide-stained 1% agarose gel of PCR-amplified fragments of 1,500 bp using Pfu (a), KOD (b) or Taq (c) DNA polymerase. Lanes 1–7: 1.0, 0.5, 0.2, 0.1, 0.05, 0.02 and 0.01 ng of GC37 DNA, respectively; 8 without genomic DNA; M standard 1 kb DNA marker

In many cases, PCR amplification of environmental samples is followed, before sequencing, by construction of clone libraries through TA ligation, in which a thymidine (T) overhang on the vector pairs with an adenine (A) overhang on the amplified insert DNA (Skirnisdottir et al. 2000; Lanoil et al. 2001). Owing to their 3′-5′ exonuclease activity, KOD and Pfu have proofreading activity and high fidelity, but they cannot form a single 3′-A overhang on the amplified DNA product. Their PCR products therefore require treatment with another polymerase lacking proofreading activity to add single A-overhangs at the 3′ ends. This extra step would compromise the efficiency of 3′-A addition. In contrast, Taq polymerase has a non-template-dependent activity that preferentially adds a single adenosine to the 3′-ends of a double-stranded DNA molecule. Thus most of the DNA molecules amplified by Taq polymerase possess single 3′-A overhangs, and the amplified DNA product can be cloned simply and directly into a T-vector through TA ligation (Zhou and Gomez-Sanchez 2000). In view of the similar amplification efficiency of KOD and Taq (cf. Fig. 1b and c), Taq DNA polymerase was used in all subsequent PCR amplifications for the simplicity of library construction.

Dependence on target product size

Amplification of a long target fragment from a template is limited by the processivity and elongation rate of the DNA polymerase, so product size affects PCR yield. To investigate the effect of target fragment size on PCR amplification using Taq DNA polymerase, we paired different PCR primers (Table 2) to amplify fragments ranging from about 500 to 3,000 bp. For each size, a series of reactions was conducted with decreasing amounts of genomic DNA template. Figure 2 shows that the efficiency of PCR amplification increased as target size decreased from 3,000 to 1,000 bp. A fragment of 1,000 bp could be amplified successfully from only 0.002 ng genomic DNA template. However, a further decrease of target size to 500 bp reduced amplification efficiency. Therefore, target fragments of 1,000 to 1,500 bp are optimal for sensitive and efficient amplification with Taq polymerase.

Table 2 Primer combinations for amplification of target products of different sizes
Fig. 2a–f Ethidium bromide-stained 1% agarose gel of PCR-amplified fragments of various sizes. a ∼3,000 bp; b ∼2,400 bp; c ∼2,000 bp; d ∼1,500 bp; e ∼1,000 bp; f ∼500 bp. Lanes 1–9: 1.0, 0.5, 0.2, 0.1, 0.05, 0.02, 0.01, 0.005 and 0.002 ng of GC37 DNA, respectively; 10 without genomic DNA; M standard 1 kb DNA marker

Effect of primer size on PCR amplification

Primer size is an important factor in PCR because it is closely associated with specificity and sensitivity (Baker and Cowan 2004). To assess the effect of primer size on PCR amplification, a series of primers ranging from 16- to 28-mer in length were paired to amplify a 1,500 bp fragment. As shown in Table 3, the shortest primers were 16mers, and longer primers were designed by adding bases to this fixed 16mer backbone. Figure 3 shows that primers from 18mer to 28mer in length gave similar sensitivity, which was higher than that of the 16mer primers. This finding indicates that primers should be no shorter than 18mer. Universal bases that can pair with any of the four natural bases can be added to increase the effective size of short primers without increasing multiplicity (Ball et al. 1998). However, the effectiveness of the oligomer in priming DNA synthesis was found to be poor when multiple substitutions were made, presumably because universal bases preferentially form secondary structures such as hairpin loops (Loakes et al. 1995a, 1995b, 1997).
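One way to see why primer length bears on specificity in complex samples is a simple random-sequence model, sketched below in Python. It is not the analysis performed in this study, and it does not by itself explain the sensitivity difference observed with a pure isolate. It assumes equal base frequencies, exact matching only, and a hypothetical background of 10^10 bp of mixed environmental DNA; all three are illustrative assumptions.

```python
# Sketch: expected chance matches of a primer in random background DNA.

def expected_chance_sites(primer_len: int, background_bp: float) -> float:
    """Expected exact matches of one specific n-mer in random double-stranded
    DNA of the given size, counting both strands, with equal base frequencies."""
    return 2.0 * background_bp / 4 ** primer_len


BACKGROUND_BP = 1e10  # ASSUMPTION: hypothetical size of a pooled metagenome

# Primer lengths tested in the text.
for n in (16, 18, 20, 24, 28):
    sites = expected_chance_sites(n, BACKGROUND_BP)
    print(f"{n}mer: {sites:.3g} expected chance sites")
```

In this model every two additional bases reduce the expected number of chance sites 16-fold, so a 16mer is expected to find several spurious exact matches in such a background whereas an 18mer is expected to find fewer than one.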

Table 3 Sequences of paired primers with different sizes
Fig. 3a–e Ethidium bromide-stained 1% agarose gel of PCR-amplified 1,500 bp fragments using paired primers of different sizes. a 16mer, b 18mer, c 20mer, d 24mer, e 28mer. Lanes 1–9: 1.0, 0.5, 0.2, 0.1, 0.05, 0.02, 0.01, 0.005 and 0.002 ng of GC37 DNA, respectively; 10 without genomic DNA; M standard 1 kb DNA marker

Bias caused by primer degeneracy

As shown above, sufficiently long primers improve PCR efficiency and specificity. However, it is difficult to find regions that are both long and completely conserved among the highly diverse organisms in environmental samples for universal primer design. To accommodate sequence differences between organisms, degenerate primers are designed to provide universality, but this may cause PCR bias. The binding sites of both forward primer ILERS98F and reverse primer ILERS1672R are 100% conserved in the genomes of Methanopyrus GC37 from the Pacific Ocean and Methanopyrus TAG11 from the Atlantic Ocean. PCR amplification using this pair of primers yielded similar product concentrations with the different genomic DNAs as templates, demonstrating no obvious PCR bias (Fig. 4a). In contrast, clear PCR bias appeared (Fig. 4b) when the universal primer ILERS1672R was replaced by the degenerate primer ILERS3020R, which contains one degenerate site of T/C variation: the T-allele is complementary to an A-base in the TAG11 genome, and the C-allele is complementary to a G-base in the GC37 genome. The concentration of PCR product was highest with pure GC37 genome as template and lowest with pure TAG11 genome as template, while a half-and-half mixture gave an intermediate concentration. Most probably, because a GC pair raises the duplex melting temperature more than an AT pair does, ILERS3020R anneals more readily to GC37 than to TAG11.
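The melting-temperature argument can be illustrated by expanding a degenerate primer into its variants and estimating a Tm for each. The Python sketch below uses a made-up 20mer, not the actual ILERS3020R sequence (which is given only in Table 1), and the rough Wallace rule, Tm ≈ 2(A+T) + 4(G+C), which is a crude approximation for oligonucleotides of this length. Both choices are assumptions made purely for illustration.

```python
# Sketch: variants of a degenerate primer and their approximate Tm values.
from itertools import product

# Only the IUPAC codes needed here: Y = C or T, R = A or G.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG", "Y": "CT"}


def expand(primer: str) -> list[str]:
    """All non-degenerate sequences represented by a degenerate primer."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in primer))]


def wallace_tm(seq: str) -> int:
    """Rough Tm estimate (degrees C) from the Wallace rule."""
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# HYPOTHETICAL primer with one T/C (Y) degenerate site; not a real sequence.
HYPOTHETICAL_PRIMER = "ACGTTGCAYGGATCCATGCA"

variants = expand(HYPOTHETICAL_PRIMER)
for v in variants:
    print(v, wallace_tm(v))
```

The C-variant comes out 2°C higher than the T-variant under this rule, which is the direction of the bias described above: at a fixed annealing temperature, the variant forming the GC pair anneals more readily.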

Fig. 4a–c Ethidium bromide-stained 1% agarose gel of PCR-amplified fragments. a ∼1,600 bp fragment amplified with two universal primers, ILERS98F and ILERS1672R; b ∼3,000 bp fragment amplified with universal primer ILERS98F and degenerate primer ILERS3020R; c ∼1,500 bp fragment amplified with two degenerate primers, ILERS1558F and ILERS3020R. Lanes: 1 0.1 ng GC37 + 0.1 ng TAG11, 2 0.2 ng GC37, 3 0.2 ng TAG11, M standard 1 kb DNA marker
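The fragment sizes quoted in this caption are consistent with reading the number in each primer name as a nucleotide position within the IleRS gene. That reading is an assumption, since the text does not define the naming scheme, and the name alone does not say whether the coordinate marks the 5′ or 3′ end of the primer. Under that assumption, a short Python sketch recovers the approximate amplicon sizes:

```python
# Sketch: approximate amplicon sizes inferred from primer names.
# ASSUMPTION: the digits in names such as ILERS1672R are gene coordinates.
import re


def position(primer_name: str) -> int:
    """Extract the numeric part of a primer name such as 'ILERS1672R'."""
    match = re.fullmatch(r"ILERS(\d+)[FR]", primer_name)
    if match is None:
        raise ValueError(f"unexpected primer name: {primer_name}")
    return int(match.group(1))


def approx_amplicon_bp(forward: str, reverse: str) -> int:
    """Approximate product length spanned by a forward/reverse primer pair."""
    return position(reverse) - position(forward) + 1


pairs = [
    ("ILERS98F", "ILERS1672R"),    # panel a
    ("ILERS98F", "ILERS3020R"),    # panel b
    ("ILERS1558F", "ILERS3020R"),  # panel c
]
for fwd, rev in pairs:
    print(fwd, rev, approx_amplicon_bp(fwd, rev))
```

The three pairs give 1,575, 2,923 and 1,463 bp, which agree with the ∼1,600, ∼3,000 and ∼1,500 bp stated for panels a, b and c.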

To balance the difference in melting temperature caused by degenerate bases in the primer, the universal primer ILERS98F was replaced by ILERS1558F, which contains one degenerate site of A/G variation. In contrast to the reverse degenerate primer ILERS3020R, which prefers GC37 to TAG11, the forward degenerate primer ILERS1558F prefers TAG11 to GC37, because the A-allele at its degenerate site is complementary to a T-base in the GC37 genome, whereas the G-allele is complementary to a C-base in the TAG11 genome. The two degenerate primers thus balance the melting temperature between GC37 and TAG11, and the results in Fig. 4c showed only a faint bias. All these findings indicate that primer degeneracy can cause PCR bias. When a degenerate primer cannot be avoided to accommodate sequence differences between organisms, a balanced pair of degenerate primers is a good choice. The use of universal primers containing universal bases at ambiguous positions, instead of degenerate primers, to amplify specific fragments from mixed samples has also been reported (Patil and Dekker 1990; Kilpatrick et al. 1996). However, various investigators have found primers to be ineffective when a universal base lay within the first seven or eight bases from, or including, the 3′-end (Loakes et al. 1995a, 1995b; Smith et al. 1995; Rohrwild et al. 1995). In addition, the base-pairing stability (thermal dissociation, Tm) of universal bases with the four natural bases differs in a defined order, such as d(I-C) > d(I-A) > d(I-T) = d(I-G), where dI is deoxyinosine (Martin and Castro 1985; Case-Green and Southern 1994). Therefore, universal base substitution at ambiguous positions can presumably also cause PCR bias owing to different Tm values.

Conclusion

Optimal PCR conditions are a compromise among many factors, such as choice of DNA polymerase, target size, primer design, template quality and quantity, and the number of amplification cycles. The following recommendations for improving PCR efficiency and limiting bias in PCR amplification emerged from the data presented above. First, degeneracy should be avoided when universal primers are available. Second, to increase PCR sensitivity, a target product of 1,000–1,500 bp in length is optimal. Third, the primers should be no shorter than 18mer. Last but not least, Taq is a good DNA polymerase for efficient amplification of a DNA fragment and subsequent construction of a clone library. Employing a combination of Taq DNA polymerase, a target DNA sequence approximately 1 kb in size, and PCR primers 20mer in length, target detection could be achieved with as little as 2 pg of genomic DNA extracted from pure isolates. Moreover, using the above optimized conditions, two Methanopyrus groups of IleRS were sensitively PCR-amplified from 2 ng of a pooled genomic DNA sample extracted from Indian Ocean sediment and identified after direct construction of a clone library and subsequent sequencing (Yu et al. 2009). Prior to optimization of PCR conditions, it was easy to amplify the Methanopyrus-specific IleRS from pure cultures but difficult to do so from the complex Indian Ocean environmental sample: even with 10 ng of Indian Ocean mixed DNA, PCR products were weak and only one Methanopyrus phylotype was detected (data not shown). Although environmental samples are complex, and many questions remain to be addressed in such studies, our results strongly indicate improved PCR sensitivity for detecting Methanopyrus IleRS present at low frequency.
With guidelines derived from this optimization, PCR conditions were also successfully designed to detect and characterize the Methanopyrus ValRS gene from both pure cultures and environmental mixtures, and two distinguishable Methanopyrus phylotypes were consistently identified in the Indian Ocean environmental sample (Yu et al. 2009). We believe that our findings can provide useful and general guidelines for amplifying and characterizing other genomes in environmental samples.