Introduction

The genus Paenibacillus was proposed by Ash et al. (1993) for the members of Bacillus group 3; later, its description was emended by Shida et al. (1997). Presently, Paenibacillus spp. are classified into over 200 validly named species (http://www.bacterio.net/paenibacillus.html) with Paenibacillus polymyxa as the type species (Ash et al. 1993). Most Paenibacillus species consist of strains that are motile, Gram-stain positive (some are Gram-stain variable, such as Paenibacillus taiwanensis and Paenibacillus assamensis), spore forming, facultatively anaerobic or strictly aerobic. Most species are catalase positive and show optimal growth at 28–40 °C and pH 7.0, with some species being alkaliphilic. The major fatty acid of Paenibacillus spp. is anteiso-C15:0. The DNA G+C content of Paenibacillus spp. ranges from 40 to 59 mol% (Priest 2015).

Paenibacillus species have been isolated from diverse habitats, such as paperboard, deep-sea surface sediment, honeybee larvae, warm springs, farmland soil, tree bark, blood samples, sputum, human urine, rhizosphere soil and necrotic wounds (Nakamura 1996; Roux and Raoult 2004; Saha et al. 2005; Lee et al. 2007; Roux et al. 2008; Kim et al. 2010; Romanenko et al. 2013; Han et al. 2015; Sitdhipol et al. 2016). Strain 11T was isolated during an assessment of the cultivable microbial diversity in water from an artificial lake on the outskirts of Celje (Slovenia) that accumulates liquid industrial waste. In a phylogenetic tree based on the 16S rRNA gene sequence, strain 11T formed a separate branch within the genus Paenibacillus, suggesting it could represent a novel species. To clarify its taxonomic status, further genotypic and phenotypic analyses were performed. The genome of strain 11T was sequenced under the umbrella of the Genomic Encyclopedia of Type Strains, Phase III: the genomes of soil and plant-associated and newly described type strains (Whitman et al. 2015).

Materials and methods

Strains and growth conditions

A lake situated on the outskirts of Celje (Slovenia) was sampled in September 2013. The water sample had a pH of 7.3. Strain 11T was isolated by plating the sampled water on nutrient agar (NA, Sigma-Aldrich) and incubating the plates at 25 °C in an aerobic atmosphere for 4 days.

The reference type strains Paenibacillus alvei DSM 29T and Paenibacillus apiarius DSM 5581T were obtained from the German Collection of Microorganisms and Cell Cultures (DSMZ, Germany) and the type strain Paenibacillus profundus NRIC 0885T from the Nodai Culture Collection (NRIC, Japan). All reference strains were cultured on NA medium for comparative analyses.

Morphological, physiological and biochemical analysis

Colonial morphology and size were described and measured visually after cultivation on NA medium. Gram-stain reaction was determined by using a Gram-staining kit (Sigma-Aldrich, St. Louis, MO, USA), following the instructions of the manufacturer. Cytochrome c oxidase activity was determined by oxidation of 1% (w/v) N,N,N′,N′-tetramethyl-1,4-phenylenediamine dihydrochloride and catalase activity by observing bubble production using 3% (v/v) H2O2 as described by Gerhardt et al. (1994). Oxygen demand for growth was determined by incubating inoculated NA medium for 7 days in a jar with anaerobic atmosphere generation bags (Sigma-Aldrich). Tolerance of different concentrations of NaCl and the pH range for growth were analysed in nutrient broth, with growth detected by measuring an increase in turbidity. The temperature range for growth was analysed on NA medium by incubating the plates for 10 days at different temperatures. Motility was determined as described by Alexander and Strete (2001) in a tube with semisolid NA medium containing 0.5% agar by incubating the tube for 4 days at 30 °C. Gliding motility was analysed on the same type of medium in a Petri dish by inoculating the surface of the medium with a spot of strain 11T culture. The presence of endospores was determined according to the method of Schaeffer-Fulton as described by Gerhardt et al. (1994).

Other biochemical characteristics were determined using the API 50CH, API ZYM and API 20E microtest systems (bioMérieux, Marcy l’Etoile, France) at 30 °C, following the manufacturer’s instructions.

Phylogenetic analysis

The 16S rRNA gene sequence of strain 11T (1478 nt) was determined as described previously (Trček et al. 2000) with the following modifications: DNA was isolated with a GeneJET™ Genomic DNA Purification Kit (Thermo Fisher Scientific, Waltham, MA, USA) following the protocol for Gram-positive bacteria. The 16S rRNA gene sequence was amplified with primers fD1 (5′-AGAGTTTGATCMTGGCTCAG-3′) and rH1542 (5′-AAGGAGGTGATCCAGCCGCA-3′). The nucleotide sequences of both strands were determined by the primer walking method using the primers described by Boesch et al. (1998). The 16S rRNA gene sequence was compared to all type strains representing the genus Paenibacillus. The sequences were aligned with the Clustal X 1.8 program (Thompson et al. 1997) using the default settings of the multiple alignment tool and then subjected to the phylogenetic analyses by applying the neighbour-joining method in the DNADIST program from the PHYLIP package (Felsenstein 2002). The tree was drawn with the Treeview program (Page 2000). The robustness of the phylogenetic tree topology was evaluated by bootstrap analysis with 1000 replications using the SEQBOOT and CONSENCE programs from the PHYLIP package. Bootstrap values >60% were considered to be significant.
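
The bootstrap step described above rests on resampling alignment columns with replacement and counting how often a branch recurs among the replicate trees. The following is a minimal sketch of that idea only, not the SEQBOOT or CONSENSE code; the function names `bootstrap_alignment` and `bootstrap_support` and the toy alignment are illustrative assumptions.

```python
import random

def bootstrap_alignment(alignment, rng):
    """Build one bootstrap replicate by resampling alignment columns with replacement.

    alignment -- dict mapping sequence name to aligned sequence (all the same length)
    rng       -- random.Random instance, passed in so that runs are reproducible
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_cols = lengths.pop()
    # The same column indices are used for every sequence, so columns stay intact.
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[i] for i in cols) for name, seq in alignment.items()}

def bootstrap_support(replicates_with_branch, n_replicates=1000):
    """Percentage of replicate trees that contain a given branch."""
    return 100.0 * replicates_with_branch / n_replicates

# Toy example (hypothetical sequences, far shorter than a real 16S rRNA alignment)
toy = {"strain_A": "ACGTAC", "strain_B": "ACGAAC", "strain_C": "TCGAAT"}
replicate = bootstrap_alignment(toy, random.Random(1))
```

In a full analysis, a tree would be inferred from each of the 1000 replicates, and a branch recovered in, say, 600 of them would receive a support value of 60%.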

Sequencing of the genome

Genomic DNA was extracted with a NucleoSpin Tissue kit (Macherey–Nagel) using the manufacturer’s protocol. A draft genome sequence of strain 11T was generated at the DOE Joint Genome Institute (JGI). An Illumina 300 bp insert standard shotgun library was constructed and sequenced using the Illumina HiSeq 2000-1 TB platform. The reads were filtered using BBDuk (Bushnell 2014), which removes known Illumina artifacts and spike-in PhiX control. Reads with more than one “N” or with quality scores (before trimming) averaging less than 8 or reads shorter than 51 bp (after trimming) were discarded. Assembly was performed with SPAdes (version 3.6.2) (Bankevich et al. 2012). Assembly contigs were discarded if the length was <1 kbp.
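
The three discard rules above can be expressed as a single predicate. This is a minimal sketch and not the BBDuk implementation: the function name `keep_read` and its arguments are hypothetical, quality values are assumed to be Phred scores already decoded to integers, and the "N" count is assumed to be taken on the trimmed read, which the text does not specify.

```python
from statistics import mean

def keep_read(raw_quals, trimmed_seq, max_n=1, min_avg_qual=8, min_len=51):
    """Return True if a read passes the three discard rules described in the text.

    raw_quals   -- Phred quality scores of the read before trimming (list of ints)
    trimmed_seq -- base calls of the read after trimming (str)
    """
    if not raw_quals:
        return False                                  # nothing to evaluate
    if trimmed_seq.upper().count("N") > max_n:
        return False                                  # more than one "N"
    if mean(raw_quals) < min_avg_qual:
        return False                                  # average quality below 8 before trimming
    if len(trimmed_seq) < min_len:
        return False                                  # shorter than 51 bp after trimming
    return True
```

For example, a read with uniformly high quality that is trimmed to 50 bp would be discarded by the length rule alone.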

Genome annotation

The JGI Microbial Genome Annotation Pipeline (MGAP v.4) (Huntemann et al. 2015; Chen et al. 2016) was used for annotating the draft genome, followed by a round of manual curation using the JGI GenePRIMP pipeline (Pati et al. 2010). The annotated genome is available from the Integrated Microbial Genomes (IMG) system (Chen et al. 2017).

Chemotaxonomic characterisation

The whole cell fatty acids of strain 11T, P. alvei DSM 29T, P. apiarius DSM 5581T and P. profundus NRIC 0885T, all grown for 24 h at 28 °C under aerobic conditions on trypticase soy agar (TSA, Difco) medium, were determined by GC using an Agilent Technologies 6890 N gas chromatograph (Santa Clara, CA, USA). The peak naming table MIDI TSBA 5.0 was used.

The isomer type of the diamino acid in the cell wall of strain 11T was analysed by thin-layer chromatography as described previously (Schumann 2011).

Extraction of quinones was carried out according to the two stage method described by Tindall (1990a, b). Quinones were separated by thin layer chromatography on silica gel (Macherey–Nagel Art. No. 805 023), using hexane:tert-butylmethylether (9:1 v/v) as solvent. UV absorbing bands corresponding to menaquinones were further analysed using an LDC Analytical (Thermo Separation Products) HPLC fitted with a reverse phase column (Macherey–Nagel, 2 mm × 125 mm, 3 µm, RP18) using methanol:heptane 9:1 (v/v) as the eluant. Menaquinones were detected at 269 nm.

Results and discussion

Phenotypic characteristics

The cells of strain 11T were observed to be Gram-stain positive, spore-forming and facultatively anaerobic rods of 0.4–0.6 μm in width and 1.5–2.5 μm in length. The colonies of strain 11T grown on NA medium for 48 h at 30 °C were observed to be white and of irregular shape. Optimal growth of strain 11T was observed at 30 °C and pH 7. Growth was observed from 10 to 46 °C, pH 5.5 to 13 and 0 to 1.5% NaCl. Strain 11T was found to be oxidase and catalase positive. In API ZYM tests, strain 11T was found to be positive for alkaline phosphatase, esterase (C4), esterase lipase (C8), leucine arylamidase, naphthol-AS-BI-phosphohydrolase, β-galactosidase and α-glucosidase. Lipase (C14), valine arylamidase, acid phosphatase, β-glucosidase, cystine arylamidase, trypsin, α-chymotrypsin, α-galactosidase, β-glucuronidase, N-acetyl-β-glucosaminidase, α-mannosidase and α-fucosidase tests were negative. In the API 50CH system, strain 11T fermented the following carbon sources: glycerol, d-ribose, d-xylose, d-galactose, d-glucose, d-fructose, d-mannose, methyl-α-d-glucopyranoside, N-acetylglucosamine, amygdalin, arbutin, esculin, salicin, d-cellobiose, d-maltose, d-lactose, d-melibiose, d-saccharose, d-trehalose, d-raffinose, starch (amidon), gentiobiose, and d-turanose. Methyl-α-d-mannopyranoside, l-fucose and glycogen were weakly fermented. Erythritol, d-arabinose, l-arabinose, l-xylose, d-adonitol, methyl-β-d-xylopyranoside, l-sorbose, l-rhamnose, dulcitol, inositol, d-mannitol, d-sorbitol, inulin, d-melezitose, xylitol, d-lyxose, d-tagatose, d-fucose, d-arabitol, l-arabitol, potassium gluconate, potassium 2-ketogluconate and potassium 5-ketogluconate were not used as carbon sources. In the API 20E system, nitrate was not reduced to nitrite, gelatin was not liquefied, and H2S and indole were not produced. β-galactosidase, arginine dihydrolase, lysine dihydrolase, lysine decarboxylase and tryptophan deaminase activities were not detected.
The main phenotypic characteristics differentiating strain 11T from its close relatives P. alvei DSM 29T, P. apiarius DSM 5581T and P. profundus SI 79T are fermentation of d-fructose, d-mannose and d-xylose, no liquefaction of gelatin, positive α-glucosidase activity, and negative α-chymotrypsin and acid phosphatase activities (Table 1). These characteristics are similar to those of strain “Paenibacillus tyraminigenes” H3029 (Table 1), which has been proposed to represent a taxon whose name has not been validated (Mah et al. 2008).

Table 1 Phenotypic characteristics of Paenibacillus aquistagni sp. nov. in comparison to type strains of closely related species (1 P. aquistagni sp. nov.; 2 P. alvei DSM 29T; 3 P. apiarius DSM 5581T; 4 P. profundus NRIC 0885T)

Chemotaxonomic characteristics

The major fatty acids of strain 11T were identified as anteiso-C15:0 (53.6%), iso-C15:0 (11.7%), iso-C16:0 (8.4%), iso-C17:0 (5.8%), anteiso-C17:0 (5.4%) and C16:0 (5.5%). The overall fatty acid composition of the strain 11T was similar to those of closely related Paenibacillus species (Table 2). The quinone analysis showed that strain 11T contains menaquinone MK-7 as the predominant quinone (92%), which is in accordance with findings reported for other Paenibacillus species (Ludwig et al. 2009). The diamino acid of the cell wall peptidoglycan was determined to be meso-diaminopimelic acid.

Table 2 Fatty acid composition (%) of Paenibacillus aquistagni sp. nov. and the type strains of closely related species (1 P. aquistagni sp. nov.; 2 P. alvei DSM 29T; 3 P. apiarius DSM 5581T; 4 P. profundus NRIC 0885T)

Molecular analyses

Phylogenetic analysis using a nearly complete 16S rRNA gene sequence of strain 11T (1478 nt) revealed that strain 11T belongs to the genus Paenibacillus and is closely related to strain “P. tyraminigenes” H3029, P. alvei DSM 29T, P. apiarius DSM 5581T, P. profundus SI 79T and P. taiwanensis BCRC 17411T with 99.8, 96.2, 94.9, 94.8 and 94.7% sequence similarity, respectively. Nucleotide identity between the 16S rRNA gene sequence of strain 11T and other species within the genus was found to be below 94.4%. The closely related strain “P. tyraminigenes” H3029 has been deposited in the Korean Collection for Type Cultures as a patent strain (KCTC 10694BP) and described in Mah et al. (2008) as the type strain of a novel Paenibacillus species. However, since the species name “P. tyraminigenes” has not been validated and, moreover, the strain is neither freely available nor deposited in two culture collections, the name has no standing in the nomenclature.

In a phylogenetic NJ tree (Fig. 1), strain 11T forms a distinct branch well separated from its close relative P. alvei DSM 29T. Since the 16S rRNA gene sequence identity of strain 11T to the type strains of the closely related species was found to be significantly below the threshold of 98.7–99% (Stackebrandt and Ebers 2006) at which DNA–DNA hybridization (DDH) experiments are mandatory, wet-lab hybridization analyses were not performed. The DNA G+C content of strain 11T is 47.5% (estimated from a draft genome sequence determined in the frame of this study, see below). This value falls within the expected range of 40–59 mol% reported for the members of the genus Paenibacillus (Priest 2015).
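
The G+C content of a draft assembly is the share of G and C among all unambiguous bases across its contigs. A minimal sketch of that calculation follows; the function name `gc_content` and the toy contigs are illustrative assumptions and do not reproduce the pipeline used in this study.

```python
def gc_content(contigs):
    """Return the G+C content (in %) over all unambiguous bases in the contigs."""
    gc = at = 0
    for seq in contigs:
        s = seq.upper()
        gc += s.count("G") + s.count("C")
        at += s.count("A") + s.count("T")
    if gc + at == 0:
        raise ValueError("no unambiguous bases found")
    # Ambiguous symbols such as N are excluded from both numerator and denominator.
    return 100.0 * gc / (gc + at)

# Toy example with two hypothetical contigs
toy_value = gc_content(["GGCCAT", "ATATGC"])
```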

Fig. 1 Phylogenetic tree reflecting the relationships between Paenibacillus aquistagni sp. nov. and the type strains of closely related species of the genus Paenibacillus based on 16S rRNA gene sequences. The tree was constructed by the neighbour-joining method. Bacillus subtilis is included as an outgroup. Bootstrap values based on 1000 replicates are indicated at branch points; only values greater than 60% are shown. The bar indicates 1% sequence difference. The accession numbers are given in parentheses

Sequence analysis of the genome

Sequencing of the genome of strain 11T produced a final draft assembly which contained 18 contigs in 18 scaffolds, totalling 5.386 Mbp in size based on 1,183.2 Mbp of Illumina data with a mapped coverage of 220.1X. Annotation with the JGI Microbial Genome Annotation Pipeline (MGAP v.4) (Huntemann et al. 2015) identified 4863 (97.7%) protein coding sequences (CDS), of which a function has been predicted for 3804 (76.4%), along with 17 rRNA genes (seven 5S rRNA, seven 16S rRNA and three 23S rRNA) and 77 tRNA genes.
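
As a rough cross-check of the figures above, dividing the total Illumina yield by the assembly size gives a raw sequencing depth close to the reported mapped coverage. The sketch below uses only the two numbers quoted in the text; the variable names are illustrative, and the small difference from 220.1X presumably reflects how mapped coverage was computed, which is not detailed here.

```python
# Values quoted in the text
illumina_yield_mbp = 1183.2   # total Illumina data
assembly_size_mbp = 5.386     # size of the final draft assembly

# Raw depth: sequenced bases per assembled base
raw_depth = illumina_yield_mbp / assembly_size_mbp

print(f"raw sequencing depth: {raw_depth:.1f}X")
```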

Comparison of the draft genome sequence of strain 11T to that of strains of the closely related species P. alvei yielded ANI values of around 73% (gANI) (Varghese et al. 2015), 70% (ANIb) and 86% (ANIm) (Table 3). These values are clearly below the species-delineating threshold of 96% (Richter and Rosselló-Móra 2009; Varghese et al. 2015), meaning that strain 11T does not belong to the species P. alvei.

Table 3 Results of ANI calculations (%) using MiSI (gANI) and JSpecies (ANIb and ANIm) software

The genome to genome distance calculator (GGDC) (Auch et al. 2010; Meier-Kolthoff et al. 2013) was used to calculate the genomic distance between strain 11T and strains of closely related species. The GGDC values for P. alvei strains DSM 29T, A6-6i-x, E194 and TS-15, when compared to strain 11T, were 21.7 ± 2.3, 23.2 ± 2.3, 22.8 ± 2.3 and 23.0 ± 2.3%, respectively. The GGDC comparison for some other closely related species, P. assamensis DSM 18201T, P. taiwanensis DSM 18679T, Paenibacillus popilliae ATCC 14706T and Paenibacillus lentimorbus NRRL B-30488, yielded the following values: 21.7 ± 2.3, 19.5 ± 2.3, 19.4 ± 2.3 and 19.0 ± 2.4%, respectively. All these values are well below the threshold value of 70%, which confirms that strain 11T does not belong to P. alvei or any of these species.
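
The comparison against the 70% threshold can be made explicit by checking that even the upper bound of each reported estimate stays below it. The sketch below uses the values quoted above and treats the ± figure as an interval half-width, which is an assumption; the dictionary keys and the function name `below_threshold` are illustrative.

```python
DDH_SPECIES_THRESHOLD = 70.0  # species-delineation threshold quoted in the text (%)

# (estimate, +/- margin) in %, as reported for comparisons with strain 11T
ggdc_estimates = {
    "P. alvei DSM 29T": (21.7, 2.3),
    "P. alvei A6-6i-x": (23.2, 2.3),
    "P. alvei E194": (22.8, 2.3),
    "P. alvei TS-15": (23.0, 2.3),
    "P. assamensis": (21.7, 2.3),
    "P. taiwanensis": (19.5, 2.3),
    "P. popilliae": (19.4, 2.3),
    "P. lentimorbus": (19.0, 2.4),
}

def below_threshold(estimate, margin, threshold=DDH_SPECIES_THRESHOLD):
    """True if the upper bound of the estimate is still below the threshold."""
    return estimate + margin < threshold

all_below = all(below_threshold(est, margin) for est, margin in ggdc_estimates.values())
highest_upper_bound = max(est + margin for est, margin in ggdc_estimates.values())
```

With these inputs, the highest upper bound is about 25.5%, leaving a wide gap to the 70% threshold.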

Analysis of the draft genome of strain 11T showed that it contains a cluster of genes related to those encoding a type IV pilus system. The genes in the cluster are organised similarly to the pil operons in Clostridium and Bacillus species (Imam et al. 2011). Type IV pili (TFP) were initially considered to be exclusive to Gram-negative bacteria but were later found in Streptococcus sanguinis and Clostridium difficile (Fives-Taylor and Thompson 1985; Borriello et al. 1990). Genomic information available since then has shown that a vast majority of Gram-negative and almost all clostridial species contain TFP-associated genes. On the other hand, TFP genes are present only sporadically in other Gram-positive species (Melville and Craig 2013). Strain 11T contains all the components for the synthesis of pili, including genes for type IV pilus assembly protein PilB (SMG56614), twitching motility protein PilT (WP_085497616), type IV pilus assembly protein PilC (WP_085497614), type IV pilin PilA (SMG56607), leader peptidase (prepilin peptidase)/N-methyltransferase (PilD) (WP_085497612), type IV pilus assembly protein PilM (WP_085497609), type IV pilus assembly protein PilN (WP_085497608) and a hypothetical protein (WP_085497605) with a 31% homology to a type IV pilus assembly protein PilO found in Bacillus soli (WP_066063795). The open access PilFind program (Imam et al. 2011) also identified PilA (SMG56607) as a type IV pilin. Out of more than 200 Paenibacillus strains sequenced to date, approximately 40 (including strain 11T) code for the pilus-forming protein PilA. However, only seven of the more than 200 sequenced Paenibacillus genomes (including that of strain 11T) code for the inner membrane accessory proteins PilM, PilN and PilO, which are known to form a complex indispensable for pilus assembly in Pseudomonas aeruginosa (Ayers et al. 2009). This indicates that TFP might be a rare trait in Paenibacillus species.
Indeed, none of the genomes of the close relatives of strain 11T contains this cluster. TFP are large and varied protein filaments involved in a diverse array of cellular functions, including gliding motility, conjugation, adherence, DNA uptake, and biofilm formation, all of which can contribute to the pathogenic properties of bacteria. While they are frequent in Gram-negative bacteria, their presence and role in Gram-positive bacteria have only recently been discovered (Craig et al. 2004; Melville and Craig 2013). The genome sequence of strain 11T will therefore be valuable in expanding our knowledge of TFP systems in Gram-positive bacteria.

The results of this polyphasic study demonstrate that strain 11T belongs to the genus Paenibacillus. Differences in phenotypic, as well as phylogenetic and genomic characteristics distinguish strain 11T from previously described validly named Paenibacillus species, and thus it is considered that strain 11T represents a novel Paenibacillus species for which the name Paenibacillus aquistagni sp. nov. is proposed. The Digital Protologue database (Rosselló-Móra et al. 2017) TaxoNumber for strain 11T is TA00102.

Description of Paenibacillus aquistagni sp. nov.

Paenibacillus aquistagni (a.qui’stag.ni. L. n. aqua water; L. gen. n. stagni of the lake; N.L. gen. n. aquistagni of the water of the lake).

Cells are Gram-stain positive, spore forming, facultatively anaerobic, oxidase positive, catalase positive, motile rods (0.4–0.6 × 1.5–2.5 μm). Forms irregularly shaped white colonies of 1.2–2.0 mm and 2.5–3.0 mm in diameter on NA medium after 2 and 5 days of cultivation, respectively. Growth occurs at 10–46 °C and pH 5.5–13, with optimal growth at 30 °C and pH 7. Tolerates a maximum of 1.5% NaCl. Nitrate is not reduced to nitrite. Gelatin is not liquefied. H2S and indole are not produced. Urease activity is not detected. The predominant fatty acids are anteiso-C15:0, iso-C15:0, iso-C16:0, iso-C17:0, anteiso-C17:0 and C16:0. The major menaquinone is MK-7. Contains meso-diaminopimelic acid as the diagnostic diamino acid of the peptidoglycan. The DNA G+C content of the type strain is 47.5% (estimated from a draft genome sequence).

The type strain, 11T (=ZIM B1027T =LMG 29561T =CCM 8679T), was isolated from water of an artificial lake. The GenBank/EMBL/DDBJ accession number for the 16S rRNA gene sequence of strain 11T is LT576422. The draft genome sequence of strain 11T has been deposited at ENA under the accession number FXAZ01000000.