Introduction

Bacillus thuringiensis (Bt) is an entomopathogenic, Gram-positive bacterium that is well known to produce, during the sporulation process, insecticidal crystalline inclusions composed of protoxins currently known as Cry proteins (Schnepf et al. 1998). These must be processed by insect midgut proteases to become activated toxins (Schnepf et al. 1998). Cry proteins are classified into 51 groups according to their amino acid sequence similarity (Crickmore et al. 1998; http://www.lifesci.sussex.ac.uk/home/Neil_Crickmore/Bt/). Cry1A toxins are very important because of their high toxicity to lepidopteran pests and widespread distribution among Bt strains (Bravo et al. 1998; Li et al. 1995; Uribe et al. 2003). In addition, variation in toxicity and specificity due to minor amino acid substitutions exists among different Cry1A toxins (Tounsi et al. 1999). To date, sequences of 62 cry1A genes, classified into types cry1Aa to cry1Ai, have been published (Crickmore et al. 1998; http://www.lifesci.sussex.ac.uk/home/Neil_Crickmore/Bt/). The cry1Aa, cry1Ab and cry1Ac genes are the most frequently found in Bt strains and native isolates (Ben-Dov et al. 1997; Bravo et al. 1998; Juárez-Pérez et al. 1997; Uribe et al. 2003). Some Bt strains bear a single cry1A gene [e.g., strain HD-73, with only one cry1Ac gene (González et al. 1981)], whereas others have been found to present complex gene profiles including several cry1A subclasses (Ben-Dov et al. 1997; Bravo et al. 1998; Juárez-Pérez et al. 1997; Uribe et al. 2003).

Bt strains belonging to the serovar kurstaki were developed and used as efficient control agents for lepidopteran pests. The commercial product Dipel® (Abbott) is a formulation based on Bt serovar kurstaki HD-1. This strain, as well as others of the same serovar, has also been used as a source of cry1A genes for the construction of genetically modified plants or bacteria to protect against lepidopteran attack (Bora et al. 1994; Lampel et al. 1994; Vaeck et al. 1987). Polymerase chain reaction (PCR) and gene sequencing are integral parts of agronomical research and biomolecular product development. Both are essential for finding novel cry genes in Bt strains that could be used for the construction of genetically modified organisms (GMOs). To our knowledge, many studies have focused primarily on the use of cry genes of Bt strains with the final objective of protecting against lepidopteran pests, but none details a complete molecular strategy with a full description of all primers used for sequencing (Alberghini et al. 2005; Huang et al. 2004; Masson et al. 1998; Tounsi et al. 1999). Therefore, the amplification, subsequent cloning and sequencing of full-length cry genes from Bt are important because they might provide genes encoding toxins active against insect pests that can be introduced into GMOs (Alberghini et al. 2005; Huang et al. 2004; Masson et al. 1998; Tounsi et al. 1999).

The main objective of this study was to establish a complete strategy for the amplification, cloning and sequence determination of known and potentially novel cry1A genes harboured in a Bt strain, and to prove its robustness. Using this method, we analyzed two novel cry1A genes from the Argentinean Bt isolate INTA Mo1-12.

Materials and methods

Bt strains and native isolates

Eight Bt reference strains known to harbour cry1A genes (Bt serovar thuringiensis HD-2, Bt serovar kurstaki HD-1, Bt serovar kurstaki HD-73, Bt serovar kenyae HD-5, Bt serovar aizawai HD-137, Bt serovar aizawai HD-133, Bt serovar tolworthi HD-125 and Bt serovar wuhanensis HD-525), and two known not to harbour cry1A genes (Bt serovar kyushuensis HD-541 and Bt serovar israelensis HD-567) (Cerón et al. 1994; Kuo and Chak 1996), were kindly provided by the United States Department of Agriculture (USDA) Agricultural Research Service (Peoria, USA). Five native Bt isolates collected from stored product dust from different agroecological regions of Argentina were obtained from the Instituto de Microbiología y Zoología Agrícola-Instituto Nacional de Tecnología Agropecuaria (IMYZA-INTA) Bacterial Collection. All these isolates harbour at least one cry1A gene (Table 1).

Table 1 Native Bt isolates from the IMYZA-INTA Bacterial Collection used in this study

PCR primers

Two novel specific primers for the amplification of cry1A genes were designed based on the analysis of conserved regions by multiple alignments of all cry1 DNA sequences in the Bt toxin nomenclature website using ClustalW (Thompson et al. 1994; http://www.ebi.ac.uk/clustalw/) and Oligoanalyzer 3.0 (http://scitools.idtdna.com/scitools/Applications/OligoAnalyzer/). Primers used for the amplification of the whole open reading frame of cry1Aa, cry1Ab, cry1Ac, cry1Ae, cry1Af, cry1Ag and cry1Ai were as follows: 1AF (forward; 5′-ATGGATAACAATCCGAACATC-3′) and 1UR (reverse; 5′-YTATTCYTCCATRAGRASTAR-3′). The degenerate bases were designated as follows: Y, C/T; R, A/G; S, G/C. The cry1Ad and cry1Ah genes were excluded because of several sequence dissimilarities at the extreme 5′ end. The forward primer is class specific and was designed to begin at the ATG initiation codon. The reverse primer is degenerate and was designed from sequences at the 3′ end of all cry1 genes. It might also be used to amplify any cry1 gene in combination with another class-specific primer. Oligonucleotides were synthesized in a DNA synthesizer as specified by the manufacturer (alpha DNA).
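The degeneracy of the reverse primer can be made concrete with a short sketch. The Python below is an illustration only and was not part of the original workflow; the function and variable names are hypothetical. It expands 1UR into every sequence it encodes, using only the degenerate codes defined above (Y, R, S), and looks at the last three bases of each reverse complement.

```python
from itertools import product

# Base codes limited to the plain bases plus the degenerate codes defined in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "Y": "CT", "R": "AG", "S": "GC"}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

PRIMER_1AF = "ATGGATAACAATCCGAACATC"   # forward primer, 5'->3'
PRIMER_1UR = "YTATTCYTCCATRAGRASTAR"   # degenerate reverse primer, 5'->3'


def expand_degenerate(primer):
    """Return every concrete sequence encoded by a (possibly degenerate) primer."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in primer))]


def reverse_complement(seq):
    """Reverse complement of a non-degenerate DNA sequence."""
    return "".join(COMPLEMENT[b] for b in reversed(seq))


variants_1ur = expand_degenerate(PRIMER_1UR)
print(len(variants_1ur))  # six two-fold degenerate positions -> 64 sequences

# Last three bases of each variant when read on the coding strand.
coding_strand_ends = sorted({reverse_complement(v)[-3:] for v in variants_1ur})
print(coding_strand_ends)  # ['TAA', 'TAG']
```

Read on the coding strand, every variant ends in TAA or TAG, which is what one would expect of a primer placed at the 3′ end of the open reading frame. The non-degenerate forward primer expands to itself.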

Amplification of cry1A genes

All Bt reference strains and native isolates were grown on nutrient agar plates for 16 h. A loopful of cells was transferred to 100 μl of H2O and boiled for 10 min to make Bt DNA accessible for PCR amplification. The lysate was centrifuged briefly (5 s at 15,000 g; Eppendorf model 5415R centrifuge), and 10 μl of supernatant was used as the DNA template in the reactions. Manual “hot start” PCR was performed in a final volume of 50 μl containing 50 mM KCl, 20 mM Tris–HCl (pH 8.4), 200 μM each deoxynucleoside triphosphate (dATP, dTTP, dGTP, and dCTP), 1 μM each primer and 2 mM MgCl2. The PCR amplification consisted of an initial denaturation step of 2 min at 94°C, followed by 29 cycles of 5 s at 95°C, 20 s at 47°C and 4 min at 68°C, and a final elongation step of 5 min at 68°C in a thermocycler (Eppendorf Mastercycler gradient). Five U of Taq DNA polymerase (Invitrogen) were added after the first denaturation step. Finally, the PCR product was analyzed by electrophoresis on a 1.0% agarose gel stained with ethidium bromide.
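As a planning aid, the cycling program above can be written down as data and its total hold time summed. This is a minimal sketch with hypothetical names; it counts only the programmed hold times and ignores ramping between temperatures and the pause needed to add the polymerase, so the actual run will take longer.

```python
# Cycling program as stated in the text, all times in seconds.
INITIAL_DENATURATION_S = 2 * 60          # 2 min at 94 °C
CYCLES = 29
CYCLE_STEPS_S = {
    "denaturation_95C": 5,
    "annealing_47C": 20,
    "extension_68C": 4 * 60,
}
FINAL_EXTENSION_S = 5 * 60               # 5 min at 68 °C


def total_hold_time_s():
    """Sum of programmed hold times; ramp times are not included."""
    per_cycle = sum(CYCLE_STEPS_S.values())
    return INITIAL_DENATURATION_S + CYCLES * per_cycle + FINAL_EXTENSION_S


total = total_hold_time_s()
print(f"{total} s = {total / 60:.1f} min")  # 8105 s = 135.1 min
```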

Cloning of cry1A genes from Bt INTA Mo1-12

The PCR product of Bt INTA Mo1-12 was purified from the agarose gel matrix using the Wizard SV Gel and PCR Clean-Up System (Promega), cloned in the pGEM-T Easy vector (Promega) and then transformed into competent Escherichia coli XL-1 cells following the manufacturers' protocols. Thirty white colonies were selected on selective LB agar plates containing X-gal and IPTG. The presence of inserts in the clones was verified by PCR of recombinant plasmid DNA using the vector primers SP6 and T7. Afterwards, the cry1A subclass of each clone was identified by multiplex PCR with the primers described previously (Cerón et al. 1994).

Sequencing primers and nucleotide sequencing of cry1A genes from Bt INTA Mo1-12

Five clones harbouring cry1Aa and five harbouring cry1Ac genes were sequenced in both directions by primer walking using the vector primers (SP6 and T7) and the specific primers detailed in Table 2, with DYEnamic ET Terminator chemistry (Amersham) on an ABI 373 DNA Automated Sequencer.

Table 2 Primers for sequencing of cry1A genes

Sequence comparisons

The cry1A nucleotide sequences from Bt INTA Mo1-12 and their translations into amino acid sequences (with the ExPASy translate tool; http://www.expasy.org/tools/dna.html) were aligned separately with ClustalW (Thompson et al. 1994). The deposited sequences in GenBank of all cry1A subclasses were accessed through the Bt toxin nomenclature website (http://www.lifesci.sussex.ac.uk/home/Neil_Crickmore/Bt/) and used as references.
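For readers who want to reproduce identity figures from an alignment, the following sketch computes percent identity between two sequences that have already been aligned (equal length, gaps written as "-"). It is an assumption-laden illustration: the function name and the toy sequences are hypothetical, and the convention used here (a gap against a base counts as a mismatch, columns with gaps in both sequences are skipped) is only one of several in use and may differ from the scores reported by ClustalW.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over the columns of a pairwise alignment.

    Columns where both sequences carry a gap are ignored; a gap opposite a
    residue is counted as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Toy sequences for illustration only, not taken from any cry1A gene.
print(round(percent_identity("ATGGATAACAAT", "ATGGACAACAAT"), 2))  # 91.67
print(round(percent_identity("ATG-AT", "ATGGAT"), 2))              # 83.33
```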

Nucleotide sequence accession number

The nucleotide sequences of cry1Aa and cry1Ac from Bt INTA Mo1-12 have been deposited in the GenBank database (http://www.ncbi.nlm.nih.gov) under the accession numbers DQ062690 and DQ062689, respectively.

Results and discussion

The oligonucleotide primers, 1AF and 1UR, were successfully tested for the amplification of PCR products of the expected size (about 3500 bp) using Bt reference strains and native isolates. Bt serovar thuringiensis HD-2, Bt serovar kurstaki HD-1, Bt serovar kurstaki HD-73, Bt serovar kenyae HD-5, Bt serovar aizawai HD-137, Bt serovar aizawai HD-133, Bt serovar tolworthi HD-125, Bt serovar wuhanensis HD-525, and the Bt native isolates INTA Mo29-1, INTA Mo14-3, INTA Mo1-12, INTA Mo27-1 and INTA Mo23-2, used as positive controls, yielded PCR products of the expected size. Bt serovar kyushuensis HD-541 and Bt serovar israelensis HD-567, used as negative controls, failed to produce any amplification. Fig. 1 shows the result of electrophoresis analysis of these amplifications.

Fig. 1

Agarose gel electrophoresis analysis on 1.0% agarose gel of DNA sequences amplified by manual “hot start” polymerase chain reaction (PCR) assay using primers 1AF and 1UR targeting cry1A genes. Lanes: 1, Bt INTA Mo1–12; 2, Bt INTA Mo27–1; 3, Bt INTA Mo23–2; 4, Bt INTA Mo14–3; 5, Bt INTA Mo29–1; 6, Bt serovar thuringiensis HD-2; 7, Bt serovar kurstaki HD-1; 8, Bt serovar kurstaki HD-73; 9, Bt serovar kenyae HD-5; 10, Bt serovar aizawai HD-137; 11, Bt serovar aizawai HD-133; 12, Bt serovar tolworthi HD-125; 13, Bt serovar wuhanensis HD-525; 14, Bt serovar israelensis HD-567; 15, Bt serovar kyushuensis HD-541. MW, molecular weight marker with sizes indicated on left and right (bp)

To test the designed strategy, the native Bt isolate INTA Mo1-12 was randomly chosen for cloning and sequencing of cry1A genes using the vector and specific primers detailed in Table 2. To our knowledge, there are no previously published studies concerning cry genes with a full description of all the primers needed for sequencing. For these analyses, the less-than-full-length sequences resulting from cloned PCR fragments were trimmed to remove the common, conserved primer sequences, because the sequence hybridizing with the primer might not represent the amplified gene, and amplification with certain combinations of degenerate bases in the reverse primer might have resulted in non-native sequence. The trimmed cry1Aa and cry1Ac sequences are approximately 99% of the full length.

The sequence of cry1Aa from Bt INTA Mo1-12 was 3489 nucleotides long, encoding 1163 amino acid residues showing 98.0–99.9% identity with those of the known cry1Aa genes. The cry1Aa gene from INTA Mo1-12 could be grouped together with the cry1Aa1, cry1Aa3, cry1Aa5, cry1Aa7, cry1Aa8, cry1Aa10 and cry1Aa12 genes, which also encode 1163 amino acids, whilst another group, containing the cry1Aa9, cry1Aa13 and cry1Aa14 genes, encodes 1167 amino acids. Incomplete sequences of genes cry1Aa2 and cry1Aa6 are available but were not used. The search for sequence similarity with the previously known cry1Aa genes using ClustalW revealed that it is a new cry1Aa gene (Crickmore et al. 1998; http://www.lifesci.sussex.ac.uk/home/Neil_Crickmore/Bt/). There were substitutions at position 863 of A for G (transition), resulting in the substitution of Glu(288) for Gly, and at position 1701 of T for C (silent mutation) (Fig. 2).
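The relationship between the nucleotide positions and the affected codons can be checked with simple arithmetic. The sketch below uses hypothetical helper names and assumes 1-based numbering from the first base of the ATG initiation codon. It maps a nucleotide position to its codon and classifies a base exchange by the usual definition: purine for purine or pyrimidine for pyrimidine is a transition, anything else a transversion.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def codon_of(nt_position):
    """Map a 1-based nucleotide position to (codon number, position in codon)."""
    if nt_position < 1:
        raise ValueError("positions are 1-based")
    codon_index, offset = divmod(nt_position - 1, 3)
    return codon_index + 1, offset + 1


def substitution_class(base_a, base_b):
    """Classify an exchange between two different bases."""
    pair = {base_a.upper(), base_b.upper()}
    if len(pair) != 2 or not pair <= (PURINES | PYRIMIDINES):
        raise ValueError("expected two different bases out of A, C, G, T")
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


print(3489 // 3)                     # 1163 codons
print(codon_of(863))                 # (288, 2): second base of codon 288
print(codon_of(1701))                # (567, 3): third base of codon 567
print(substitution_class("A", "G"))  # transition
print(substitution_class("T", "C"))  # transition
```

Position 863 falls on the second base of codon 288, where Glu (GAA/GAG) and Gly codons (GGN) differ, while position 1701 is a third codon position, which is consistent with a silent change.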

Fig. 2

Alignment of the cry1Aa15 gene from INTA Mo1–12 (A) and its deduced amino acid sequence (B) in the regions where differences were found. The vertical downward arrows and the black boxes indicate the nucleotides and amino acid showing variation. The substitution at position 288 observed in Cry1Aa from Bt INTA Mo1–12 is embedded in domain II. Nucleotides and amino acids are numbered from those of the corresponding position in all known Cry1Aa sequences

The crystal structures of several Cry toxins have been solved (Derbyshire et al. 2001; Grochulski et al. 1995; Li et al. 1991). The seven α-helices that form the N-terminal domain I have been implicated in pore formation. Domain II consists of three antiparallel β-sheets with exposed loop regions, whereas domain III is a β-sandwich. Domains II and III are important in receptor recognition. Based on the structure assignment of Cry1Aa (Grochulski et al. 1995), domain I of Cry1Aa from INTA Mo1-12 consists of 221 residues (Tyr(33)–Arg(253)), domain II of 197 residues (Arg(265)–Phe(461)) and domain III of 147 residues (Asn(463)–Thr(609)). The substitution of Glu(288) for Gly observed in Cry1Aa from INTA Mo1-12 was embedded in β-1 (third sheet). This change may affect the salt bridge formed at the interface between domains I and II (Arg(233)–Glu(288)) in the other Cry1Aa proteins, and therefore the stability of the protein might be affected at this point.
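The domain sizes quoted above follow directly from the boundary residues, and the same boundaries can be used to locate any residue of interest. The following is a small sketch with hypothetical names, using only the Cry1Aa boundaries given in the text.

```python
# Domain boundaries (first residue, last residue) as given in the text for Cry1Aa.
DOMAINS = {"I": (33, 253), "II": (265, 461), "III": (463, 609)}


def domain_length(name):
    """Number of residues in a domain, counting both boundary residues."""
    start, end = DOMAINS[name]
    return end - start + 1


def domain_of(residue):
    """Return the domain containing a residue, or None if it lies outside all three."""
    for name, (start, end) in DOMAINS.items():
        if start <= residue <= end:
            return name
    return None


for name in DOMAINS:
    print(name, domain_length(name))   # I 221, II 197, III 147

print(domain_of(288))  # II: the substituted residue
print(domain_of(233))  # I: its salt-bridge partner Arg(233)
```

Residue 288 lies in domain II and Arg(233) in domain I, in line with the salt bridge being described as sitting at the interface between the two domains.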

On the other hand, the sequence of cry1Ac from Bt INTA Mo1-12 was 3492 nucleotides long, encoding 1164 amino acid residues showing 99.5–99.9% identity with those of the known cry1Ac genes. For this analysis, the incomplete sequence of cry1Ac3 was excluded, although the 1833 nucleotides available through the Bt toxin nomenclature website showed 100% identity. Additionally, incomplete published sequences of the genes cry1Ac6 and cry1Ac13, and two truncated cry1Ac genes (cry1Ac12 and cry1Ac17), exist but were not used. A deletion of three nucleotides (AAT) at positions 1303–1305 was found in cry1Ac of Bt INTA Mo1-12. This characteristic was also observed in cry1Ac2, cry1Ac3, cry1Ac6, cry1Ac12, cry1Ac15 and cry1Ac17, but not in cry1Ac1, cry1Ac4, cry1Ac5, cry1Ac7, cry1Ac11 and cry1Ac16. When cry1Ac of INTA Mo1-12 was compared with the full available 3492 nucleotides of three cry1Ac genes (cry1Ac2, cry1Ac14 and cry1Ac15), the analysis revealed that it is a new cry1Ac gene. Variations in the cry1Ac2, cry1Ac14 and cry1Ac15 genes and in their deduced amino acid sequences make cry1Ac from INTA Mo1-12 different and unique (Fig. 3). Sequence analysis of the regions within these cry1Ac deduced amino acid sequences suggested that the new gene might have arisen by rearrangement among them (Fig. 3).

Fig. 3

Alignment of the deduced amino acid sequence encoded by the cry1Ac21 gene from Bt INTA Mo1–12 with those encoded by other cry1Ac genes. Only the regions containing differences are presented. The vertical downward arrows and the black boxes indicate the amino acids showing variation. Variations in Cry1Ac15, Cry1Ac14 and Cry1Ac2 make Cry1Ac from INTA Mo1–12 different and unique. The box indicates that this region is embedded in domain I. Nucleotides and amino acids are numbered from those of the corresponding position in all known Cry1Ac sequences

The Cry1 protoxins are activated through the proteolytic removal of an N-terminal peptide of 25–30 amino acids and approximately half of the remaining protein from the C terminus (Bravo et al. 2002; Tojo and Aizawa 1983). The changes observed beyond domain III (Asn(463)–Thr(609)) (Fig. 3) comprise a region that is cleaved in the activated toxin and therefore a role in target insect specificity and level of toxicity is not expected. Excluding this region and considering the substitutions at positions Phe(148) and His(206) in Cry1Ac15 and Cry1Ac2 respectively that are embedded in domain I (Tyr(33)–Arg(253)) (Fig. 3), the activated Cry1Ac from INTA Mo1-12 was identical to Cry1Ac14.

Due to multiple differences in their nucleotide and deduced amino acid sequences, the cry1Aa and cry1Ac genes from Bt INTA Mo1-12 reported in this paper are natural variants of these gene subclasses and they were named by the Bt Pesticidal Crystal Protein Nomenclature Committee as cry1Aa15 and cry1Ac21 respectively (http://www.lifesci.sussex.ac.uk/home/Neil_Crickmore/Bt/).

In summary, we have established a feasible strategy, with a fully detailed list of primers, for amplifying, cloning and sequencing known and potentially novel cry1A genes harboured in a Bt strain. Following this strategy, novel cry1Aa and cry1Ac genes from Bt INTA Mo1-12 were identified.