Introduction

Nielsen et al. (1995) proposed nine new alkaliphilic Bacillus species: B. agaradhaerens, B. clarkii, B. clausii, B. gibsonii, B. halmapalus, B. halodurans, B. horikoshii, B. pseudalcaliphilus and B. pseudofirmus. These Bacillus species frequently produce subtilisins. Subtilisin family A is one of the six families of subtilisin-like serine proteases, the subtilases (Siezen and Leunissen 1997). Subtilisins in family A have been classified into three subfamilies: the “true subtilisin”, “high-alkaline protease” and “intracellular protease”. A high-alkaline protease, H-221 from Bacillus sp. (B. clausii) 221, was the first enzyme from an alkaliphilic Bacillus to be identified (Horikoshi 1971). Some variants of this enzyme, such as Savinase (Betzel et al. 1992) and Maxacal (subtilisin PB92) (van der Laan et al. 1992), were found subsequently and their 3D structures were determined. We found two other groups belonging to family A, the novel oxidatively stable alkaline subtilisins (Saeki et al. 2000, 2002) and the high-molecular-mass subtilisins (Ogawa et al. 2003; Okuda et al. 2004). True and high-alkaline proteases are industrially important as laundry detergent additives.

We have isolated an alkaliphilic Bacillus sp. strain KSM-K16 that produces a high-alkaline protease named M-protease (Kobayashi et al. 1995). We have successfully produced the enzyme on an industrial scale for use in compact heavy-duty laundry detergents using a hyperproducing mutant (Kobayashi et al. 1995; Ito et al. 1998). The enzymatic properties (Kobayashi et al. 1995, 1996), gene sequence (Hakamada et al. 1994) and 3D structure (Shirai et al. 1997) of M-protease were determined. Recently, the contribution of a salt bridge triad to the thermostability of the enzyme has also been reported (Kobayashi et al. 2005).

Although structural and functional features of high-alkaline proteases and genome sequences of their producers have attracted a great deal of attention, the phylogenetic relationships among the producers remain unclear. The producers of H-221 and Savinase were assigned to B. clausii (Nielsen et al. 1995). While the producers of PB92 and M-protease were reported to be Bacillus alcalophilus (van der Laan et al. 1992) and Bacillus subtilis (Kobayashi et al. 1995), respectively, the sequences of the 16S rRNA genes of these strains have not yet been reported.

Here, we characterized strain KSM-K16 phenotypically, biochemically and genetically to clarify its taxonomic position in comparison with producers of other high-alkaline proteases. We also report that the genomes of strain KSM-K16 and reference strains of B. clausii contain distinct types of 16S rRNA genes, and that these strains can be divided into at least two subgroups based on the composition of the different types of 16S rRNA genes within the genome.

Materials and methods

Bacterial strains, plasmids and culture conditions

Strain KSM-K16 (FERM BP-3376) was isolated from a soil sample (Kobayashi et al. 1995). B. clausii DSM8716T, B. alcalophilus (B. clausii) PB92 (ATCC 31408), B. clausii 221 (DSM 2512), O-4 (DSM 2514), H167 (JCM 9160), IC (JCM 9158) and Y-76 (DSM 2515) were used as reference strains. The Bacillus strains were propagated routinely at 30°C in Tryptic Soy broth (TSB; Becton Dickinson and Company) buffered to pH 8.5 by addition of Na2CO3 or on TSB agar plates (pH 8.5) solidified with 1.5% (w/v) agar.

Escherichia coli TOP10 (Invitrogen) was used as the host for cloning and was grown in Luria–Bertani broth (Difco) supplemented with ampicillin (100 mg l−1) at 37°C. Plasmid pCR2.1 (Invitrogen) was used for cloning and sequencing.

For the growth tests, basal medium containing 5 g l−1 Polypeptone (Difco), 5 g l−1 yeast extract (Difco), 1 g l−1 K2HPO4, 5 g l−1 glucose and 0.2 g l−1 MgSO4·7H2O adjusted to pH 9.0 with Na2CO3 was used, unless otherwise stated.

To characterize carbohydrate utilization, Cystine Tryptic Agar (Difco) buffered with Na2CO3 to pH 7.5 and supplemented with 5 g l−1 of a carbohydrate was used. Cultivation was performed at 30°C for 14 days in triplicate. Acid formation from each carbohydrate was checked by monitoring the colour change of the agar. Growth and acid formation were observed at 3, 5, 7, 10 and 14 days after inoculation.

Fatty acid analysis

Cells were grown in TSB (pH 8.5) at 30°C for one day, washed twice with a 30 g l−1 NaCl solution at 4°C by centrifugation at 8,000×g and freeze-dried. The dried cells were transferred into Teflon-lined, screw-capped tubes containing 2 ml of anhydrous methanolic HCl, which were then heated at 100°C for 3 h. After cooling, 1 ml of water was added and the fatty acid methyl esters were extracted with n-hexane. The samples were analyzed by gas-liquid chromatography-mass spectrometry (Komagata and Suzuki 1987).

Isoprenoid quinones

Isoprenoid quinones were extracted from dried cells with chloroform:methanol (2:1 v/v). After purification by thin-layer chromatography, isoprenoid quinones were analyzed by reverse-phase high-performance liquid chromatography (Komagata and Suzuki 1987).

DNA extraction

Genomic DNA was prepared by the method of Saito and Miura (1963). DNA–DNA hybridization was carried out at 44°C for 3 h and measured fluorometrically (Ezaki et al. 1989).

Analysis and alignment of 16S rRNA genes

For sequencing of genomic 16S rRNA genes, each gene was amplified by PCR with a DNA thermal cycler (model 9600; Perkin Elmer), using the eubacterial primers 27f (5′-AGAGTTTGATCCTGGCTCAG-3′) and 1492r (5′-GGTTACCTTGTTACGACTT-3′). For sequencing, the 16S rRNA genes were cloned into pCR2.1 (Invitrogen) using a TA Cloning kit (Invitrogen) and the cloned inserts of randomly selected colonies were amplified by PCR with primers LF (5′-CAAGGCGATTAAGTTGGGTAACG-3′) and LR (5′-CTTCCGGCTCGTATGTTGTGTG-3′). The PCR products of both genomic and cloned genes were treated with exonuclease I and shrimp alkaline phosphatase (Amersham Biosciences) and used as templates for sequencing. Sequencing was performed with DYEnamic ET Dye Terminator (Amersham Pharmacia Biotech) and a MegaBACE 1000 DNA sequencer (Amersham Pharmacia Biotech).

Sequence alignment was carried out using Clustal X (Thompson et al. 1997). Gaps were omitted from further analysis. Phylogenetic trees were inferred by the neighbour-joining procedure (Saitou and Nei 1987). RNA secondary structure prediction and homology searches were carried out using GENETYX-MAC ver. 12.2.0 (SDC Software Development).
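The gap handling described above can be illustrated with a minimal sketch. It assumes that percent identity is computed over alignment columns in which neither sequence carries a gap; the text does not state the exact formula used, so this is one plausible reading rather than the procedure actually applied. The function name `percent_identity` and the toy fragments are hypothetical and are not taken from the deposited sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # gap columns are omitted from the comparison
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, for illustration only).
toy_ref = "AGAGTTTGAT-CTGGCTCAG"
toy_query = "AGAGTTTGATCCTGGCTTAG"
toy_identity = percent_identity(toy_ref, toy_query)
```

In this toy pair, one column is dropped because of the gap and one of the remaining 19 columns differs, so the identity is 18/19.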

Nucleotide sequence accession numbers

The 16S rRNA gene sequence data of Bacillus sp. strain KSM-K16, B. clausii 221 and B. alcalophilus (B. clausii) PB92 have been submitted to the DDBJ/EMBL/GenBank DNA databases under the accession nos. AB251922, AB251923 and AB251924, respectively.

Pulsed-field gel electrophoresis (PFGE)

Preparation of agarose plugs with embedded chromosome DNA, digestion of the DNA with I-CeuI and analyses by PFGE were performed according to the method of Takami et al. (1999). For Southern blotting, a 0.3-kb probe corresponding to a partial sequence of the 16S rRNA gene was amplified by PCR with a PCR DIG Labelling Mix (Roche Diagnostics GmbH) and primers BC16S_F2 (5′-CGGTAATACGTAGGTGGCAAGCG-3′) and BC16S_R3 (5′-ACACCTAGCACTCATCGTTTACGG-3′).

Results and discussion

Taxonomic reidentification of strain KSM-K16

The phenotypic characterization and carbohydrate utilization tests showed similar results between strain KSM-K16 and B. clausii DSM8716T. Cells of strain KSM-K16 were Gram-positive, spore-forming motile rods, and the colonies were cream-white and formed a filamentous margin. Growth was observed in the temperature range of 15–50°C and over the pH range of pH 7–10.5. Optimal growth was observed at 40°C and pH 9.0. This strain was positive for catalase, oxidase, gelatinase, amylase and nitrate reduction, and negative for the production of indole. It produced acid from l-arabinose, cellobiose, d-fructose, d-glucose, glycerol, d-lactose, maltose, d-mannitol, d-mannose, d-raffinose, l-rhamnose, d-sorbitol, sucrose, d-trehalose and d-xylose. The isoprenoid quinones of strain KSM-K16 were menaquinone-7 and menaquinone-6. These observations were similar to those in B. clausii strains (Horikoshi 1991; Nogi et al. 2005). The major cell fatty acids were iso-C15:0, anteiso-C15:0, iso-C17:0 and anteiso-C17:0 for both B. clausii DSM8716T (Denizci et al. 2004) and strain KSM-K16.

To further determine the phylogenetic position of strain KSM-K16 relative to other high-alkaline protease producers, the 16S rRNA genes of strain KSM-K16, B. clausii 221 and B. alcalophilus PB92 were sequenced. Their percent identities with the 16S rRNA gene sequence of B. clausii DSM8716T were 98.8, 99.3 and 99.2, respectively. These three high-alkaline enzyme producers fell into B. clausii in the phylogenetic tree (data not shown). Overall, strain KSM-K16, like strains 221 and PB92, is essentially the same as B. clausii DSM8716T with respect to the 16S rRNA genes.

Intragenomic diversity of 16S rRNA genes in B. clausii strains

In the course of direct sequencing of the 16S rRNA genes of B. clausii KSM-K16, 221 and PB92, we found overlapping peaks in the sequencing data around the variable region V1 (Gray et al. 1984). This suggested that each of these strains had variations in sequence and length around this region. To determine whether the intragenomic variations in 16S rRNA genes could be associated with a possible increase in the number of rRNA operons in the genome, the number of rRNA operons in strain KSM-K16 was examined by genomic mapping with I-CeuI (Liu et al. 1993). PFGE of the chromosomal DNA digested with I-CeuI revealed seven rRNA operons in the KSM-K16 genome, which was estimated to be approximately 4.4 Mb (Fig. 1). Since the smallest band, G, was too faint to be visualized clearly (Fig. 1), its existence was confirmed by Southern hybridization of the I-CeuI digest with a 16S rRNA gene probe (data not shown).

Fig. 1

PFGE patterns of Bacillus sp. strain KSM-K16 chromosome after digestion with I-CeuI. a Separation of fragments with lengths in the range of 1.0–3.1 Mb. b Separation of fragments with lengths in the range of 200–1,100 kb. c Separation of fragments with lengths in the range of 50–500 kb. d Separation of fragments with lengths in the range of 5–100 kb. Letters (A–G) on the right show the positions of the bands detected
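The reasoning behind the I-CeuI mapping can be sketched as follows. The sketch assumes that I-CeuI cuts once in each rRNA operon and that the chromosome is circular, so the number of fragments equals the number of operons and the fragment sizes sum to the genome size. The band sizes below are hypothetical placeholders chosen only so that they total about 4.4 Mb; they are not the measured sizes of bands A to G, and `summarize_iceui_digest` is an illustrative name.

```python
def summarize_iceui_digest(fragment_sizes_kb, circular=True):
    """Infer rRNA operon count and genome size from I-CeuI fragment sizes (kb)."""
    sizes = list(fragment_sizes_kb)
    if not sizes or any(size <= 0 for size in sizes):
        raise ValueError("fragment sizes must be positive")
    # On a circular chromosome, n cut sites yield n fragments;
    # on a linear one, n cut sites yield n + 1 fragments.
    n_sites = len(sizes) if circular else len(sizes) - 1
    return {"rrn_operons": n_sites, "genome_size_mb": sum(sizes) / 1000.0}


# Hypothetical band sizes in kb (placeholders, NOT measured values).
hypothetical_bands_kb = {
    "A": 2300, "B": 1000, "C": 650, "D": 300, "E": 100, "F": 45, "G": 5,
}
summary = summarize_iceui_digest(hypothetical_bands_kb.values())
```

With seven placeholder bands, the sketch returns seven operons and a genome size of 4.4 Mb, which mirrors the logic of the estimate rather than reproducing the gel data.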

Finally, the whole genome sequence of B. clausii KSM-K16 (AP006627) confirmed the existence of seven rRNA operons. The number of rRNA operons in the complete Bacillus genome sequences reported to date ranges from 7 to 13 (Kunst et al. 1997; Takami et al. 2000; Read et al. 2003; Ivanova et al. 2003; Rasko et al. 2004; Rey et al. 2004; Veith et al. 2004). Thus, strain KSM-K16 has the smallest number of rRNA operons among Bacillus strains characterized to date, suggesting that the intragenomic diversity of 16S rRNA genes in strain KSM-K16 is not associated with an increase in the number of rRNA operons.

Intraspecific diversity of the composition of 16S rRNA genes with different types of V1 regions

The patterns of overlapping peaks in the sequencing data around the V1 region of 16S rRNA genes were significantly different between strain KSM-K16 and the two other strains, 221 and PB92, implying that the compositions of 16S rRNA genes are different among B. clausii strains. Intraspecific variations of rRNA operon compositions may be useful for discriminating strain clusters in B. clausii. No tRNA intergenic length polymorphism (tDNA-PCR) was detected in B. clausii strains (Senesi et al. 2001), while tDNA-PCR can be used to discriminate strain clusters in Bacillus stearothermophilus and Bacillus licheniformis (Borin et al. 1997; Daffonchio et al. 1998).

To elucidate intraspecific polymorphism in B. clausii, we amplified and cloned the genomic 16S rRNA genes of B. clausii DSM8716T, 221, O-4, Y-76, H167, IC, KSM-K16 and PB92. The results indicated three distinct types of V1 region, designated types A, B and C, in the cloned 16S rRNA gene sequences (Fig. 2a). Based on the frequencies of the different types of cloned V1 sequence (Table 1), the B. clausii strains were clearly divided into at least two subgroups: one with types A and B V1 sequences (strains KSM-K16, O-4 and H167), and another with types A, B and C (DSM8716T, 221, Y-76, IC and PB92). The intragenomic divergences of the V1 sequences of the 16S rRNA genes were confirmed by displaying the patterns of overlapping peaks in the direct sequencing data (Ueda et al. 1999). Furthermore, the frequencies of the V1 sequences of the cloned 16S rRNA genes in strain KSM-K16 (Table 1) were in close agreement with the frequencies expected from the genome sequence, which contains two type A and five type B V1 sequences (AP006627). These results indicate the unique phylogenetic position of strain KSM-K16 compared with the other high-alkaline subtilisin producers, B. clausii DSM8716T, 221 and PB92.
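The comparison between cloned V1 frequencies and the genome composition can be illustrated with a short sketch. It assumes that every rRNA operon is amplified and cloned with equal efficiency, so that two type A and five type B copies predict clone fractions of 2/7 and 5/7. The clone counts below are hypothetical and are not the Table 1 values; the chi-square check is added here only as an illustration, since the text reports agreement without naming a statistical test.

```python
def expected_clone_counts(operon_counts, n_clones):
    """Expected clones per V1 type if every operon is cloned equally often."""
    total = sum(operon_counts.values())
    return {v1: n_clones * n / total for v1, n in operon_counts.items()}


def chi_square(observed, expected):
    """Pearson chi-square statistic for observed versus expected counts."""
    return sum((observed.get(v1, 0) - e) ** 2 / e for v1, e in expected.items())


# Genome composition of strain KSM-K16 as stated in the text (AP006627).
genome_v1_types = {"A": 2, "B": 5}

# Hypothetical clone counts (placeholders, NOT the Table 1 data).
hypothetical_clones = {"A": 6, "B": 14}

expected = expected_clone_counts(genome_v1_types, sum(hypothetical_clones.values()))
statistic = chi_square(hypothetical_clones, expected)

# 3.841 is the 5% critical value of chi-square with 1 degree of freedom.
consistent = statistic < 3.841
```

For these placeholder counts the statistic is far below the critical value, which is the sense in which observed clone frequencies would be described as agreeing with the genome composition.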

Fig. 2

Intragenomic variation of the V1 regions of 16S rRNA genes of B. clausii strains. a Alignment of the three types of sequences of the 16S rRNA gene V1 regions with the 16S rRNA gene sequence of B. clausii DSM8716T (X76440) as a reference. Sequence identities are indicated by asterisks. b Predicted secondary structures of the three types of 16S rRNA V1 region. Bars and dots show Watson-Crick and non-Watson-Crick pairing, respectively

Table 1 Frequencies of clones with each type of 16S rRNA gene V1 region

Intragenomic diversity of 16S rRNA genes is rather limited, as verified by analyses of sequenced bacterial genomes (Coenye et al. 2003; Acinas et al. 2004). However, intragenomic variations of V1 and V2 have been reported as a common property of type strains and environmental isolates of the genus Vibrio (Moreno et al. 2002). In general, the Bacillus genomes reported to date do not contain such distinct types of V1 sequences as were found for B. clausii. In fact, the V1 sequences have been used as species-specific signature sequences in this genus (Nielsen et al. 1994; Jones et al. 1998). As an unusual case, similar intragenomic variations have been reported in Bacillus sporothermodurans M215T (Pettersson et al. 1996). It remains to be determined how common the distinct types of 16S rRNA genes are in the genomes of Bacillus species.

As shown in Fig. 2b, the V1 regions form a stem structure, indicating that different types of 16S rRNA genes encode functional 16S rRNAs. In the 16S rRNA of E. coli, the hairpin of the V1 region is essential for specific interaction with ribosomal protein S4 (Sapag et al. 1990) and includes important phosphate oxygens for 30S ribosomal subunit assembly or association with 50S ribosomal subunit (Ghosh and Joseph 2005). Possession of specific sequences in 16S rRNAs is correlated with temperature-dependent growth rates in the Bacillus cereus group (Pruss et al. 1999).

DNA–DNA hybridization

To further verify our phylogenetic findings, DNA–DNA hybridization was performed among strain KSM-K16 and the related strains B. clausii DSM8716T and B. clausii PB92 (Table 2). The DNA–DNA relatedness values between strain KSM-K16 and the two other strains were 83–85 and 83–87%, respectively, while that between DSM8716T and PB92 was 97%. These results indicate that the three strains belong to the same species, and that DSM8716T and PB92 are more closely related to each other than to strain KSM-K16. This result is consistent with the subgrouping based on the V1 sequences. Furthermore, the subgrouping of B. clausii strains also coincided with the relationships among their high-alkaline protease gene sequences in the phylogenetic tree (Fig. 3).
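The interpretation of these values can be written out as a small sketch. It uses the relatedness ranges quoted above and assumes the conventional 70% DNA–DNA relatedness cut-off for species delineation, which the text does not state explicitly. The function names are illustrative.

```python
# Conventional species cut-off (assumption; not stated in the text).
SPECIES_THRESHOLD = 70.0

# Relatedness ranges (%) as quoted in the text: (low, high).
relatedness = {
    ("KSM-K16", "DSM8716T"): (83.0, 85.0),
    ("KSM-K16", "PB92"): (83.0, 87.0),
    ("DSM8716T", "PB92"): (97.0, 97.0),
}


def same_species(pairs, threshold=SPECIES_THRESHOLD):
    """True if even the lowest value of every pair meets the threshold."""
    return all(low >= threshold for low, _high in pairs.values())


def closest_pair(pairs):
    """Pair of strains with the highest mean relatedness."""
    return max(pairs, key=lambda pair: sum(pairs[pair]) / 2.0)


all_one_species = same_species(relatedness)
most_related = closest_pair(relatedness)
```

Under this assumption all three strains fall within one species, and DSM8716T and PB92 form the most closely related pair.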

Table 2 DNA–DNA reassociation values among strain KSM-K16, B. clausii DSM8716T and B. alcalophilus (B. clausii) PB92

Fig. 3

Phylogenetic tree of high-alkaline protease genes in B. clausii. The tree was constructed by bootstrapping using the neighbour-joining method (Saitou and Nei 1987) based on alignment of partial gene sequences of 538 nucleotides from the start codon (Clustal X; Thompson et al. 1997). The sequence sources of the genes were as follows: AP006627 (M-protease), S48754 (H-221), A13738 (PB92) and AX437862, which is a 538-nucleotide sequence of a shotgun clone from the producer strain of Savinase (B. clausii NCIB10309T = DSM8716T), encoding an amino acid sequence identical to the N-terminal sequence (68 residues) of Savinase (P29600) and a putative prepro peptide. The bar shows the length corresponding to 0.005 base substitutions per position

Thus, strain KSM-K16 was identified as B. clausii. Further, B. clausii strains can be divided into at least two subgroups based on the compositions of their 16S rRNA genes, and strain KSM-K16 belongs to a subgroup phylogenetically different from that including other producers of the high-alkaline proteases Savinase, H-221 and PB92.

The divergence of 16S rRNA gene composition within species may be useful for the study of gene conversion, concerted evolution or lateral gene transfer of rRNA operons. Recently, we determined the whole genome sequence of B. clausii KSM-K16 (AP006627), analysis of which will be reported shortly.