Introduction

The genus Paenibacillus was defined in 1993 after an extensive comparative analysis of 16S rRNA gene sequences of 51 species of the genus Bacillus (Ash et al. 1991, 1993). At that time, the genus comprised 11 species, with P. polymyxa as the type species. Currently, the genus comprises 270 validated species and five subspecies (http://lpsn.dsmz.de) and harbors strains of industrial and agricultural importance relevant to humans, animals, plants, and the environment (Seldin 2011; Grady et al. 2016). Members of Paenibacillus can be found from polar regions to the tropics and from aquatic environments to the driest deserts (for review, see Grady et al. 2016). Within the genus Paenibacillus, different strains are considered plant growth-promoting bacteria (PGPB; Jeong et al. 2019) and/or are under evaluation for their use in biological control (Ruiu 2020). More than 20 Paenibacillus species encompass nitrogen-fixing strains found in different kinds of soils and in the phyllospheres, roots and/or rhizospheres of a variety of plants, such as wheat (Ripa et al. 2019), maize (Seldin et al. 1998; von der Weid et al. 2002), sugarcane (Seldin et al. 1984), cucumber (Hao and Chen 2017), rice (Li et al. 2021), Sabina squamata (Ma et al. 2007a), Zanthoxylum simulans (Ma et al. 2007b), Sonchus oleraceus (Hong et al. 2009), Arabidopsis thaliana (Qi et al. 2021), and many other plants.

In contrast to these soil and plant sources, a novel nitrogen-fixing Paenibacillus strain, designated P121, was isolated from the gut of the armored catfish Parotocinclus maculicauda during a previous bioprospecting study by our group that searched for new sources of compounds with economic potential (Castro et al. 2011). Fish belonging to this species are popularly known as Red Fin Dwarf Pleco and are usually found in southern Brazil (Garavello 1977). The search for novel microorganisms in poorly studied environments, such as those associated with Parotocinclus maculicauda, can bring new insights into novel genes and enzymes. The cellulolytic Paenibacillus strain previously described in Castro et al. (2011) and the novel nitrogen-fixing bacterial strain described here are good examples of this kind of study. The presence of these Paenibacillus strains in the fish gut could be explained by the diet of Parotocinclus maculicauda, which consists mainly of plant material (Hansen and Olafsen 1999).

In this study, we report a polyphasic taxonomic description of this nitrogen-fixing bacterial strain isolated from the gut of P. maculicauda. The phenotypic, chemotaxonomic and genotypic properties indicate that strain P121T represents a novel species within the genus Paenibacillus, for which the name Paenibacillus piscarius sp. nov. is proposed.

Materials and methods

Isolation of the bacterial strain and culture conditions

The bacterial strain studied here was isolated from the gut of the armored catfish Parotocinclus maculicauda. Twenty fish belonging to this species were obtained from the Mato Grosso River (Saquarema, Rio de Janeiro, Brazil) and taken to the laboratory for dissection. The food content from the digestive tract of the different fish samples was transferred to sterilized tubes, and portions of 200 mg were mixed with 1.8 ml of saline (NaCl 0.85%) and plated onto trypticase soy broth (TSB; Difco) solidified with 1.2% agar. After 48 h of incubation at 32 °C, different colonies were selected based on their morphotypes. Phylogenetic characterization of the isolates based on 16S rRNA gene sequencing unexpectedly revealed that one strain (denoted P121T) clustered within the nitrogen-fixing Paenibacillus group. Therefore, it was selected for further phenotypic, phylogenetic and chemotaxonomic characterization.

Bacterial DNA extraction

DNA from strain P121T was isolated according to the method described in Seldin et al. (1998). Further purification steps were those described in Seldin and Dubnau (1985). The DNA was quantified using a Qubit™ fluorometer (Thermo Fisher Scientific, MA, USA).

Sequencing of the 16S rRNA-encoding gene from strain P121T and phylogenetic analysis

The gene encoding 16S rRNA from P121T was amplified by PCR using the universal primer pair pA and pH under the conditions described in Massol-Deya et al. (1995), and the products were sequenced at the Macrogen (South Korea) facilities. The 16S rRNA gene sequence obtained (1447 bp) was compared with the sequences previously deposited in the GenBank database using the BLAST-N facility (www.ncbi.nlm.nih.gov/blast). For phylogenetic tree analysis, the sequences of closely related bacterial strains were recovered from the GenBank database and aligned to the sequence obtained in this study using the online multiple alignment program MAFFT version 7 (https://mafft.cbrc.jp/alignment/software/). Phylogenetic analyses were performed using the RAxML-HPC2 tool in the CIPRES Science Gateway (Miller et al. 2010), with phylogenetic tree inference using a maximum-likelihood/rapid bootstrapping run. The sequence generated in this study was deposited in NCBI GenBank under accession number JF892726.1.
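
As an illustration of how a percent similarity between two aligned 16S rRNA gene sequences can be derived, the following minimal Python sketch counts identical positions over the alignment columns in which neither sequence has a gap. This definition of identity is an assumption made for illustration; BLAST and other tools may treat gaps and terminal overhangs differently, and the function name and toy sequences are hypothetical rather than taken from this study.

```python
def pairwise_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where neither sequence has a gap.

    Both inputs must come from the same alignment (equal length,
    gaps written as '-'). This gap treatment is an assumption.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy (hypothetical) aligned fragments, not real 16S rRNA data.
toy_identity = pairwise_identity("ACGT-A", "ACGAAA")
```

In practice the two aligned strings would be rows taken from the MAFFT output for the query and a reference strain.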

Whole genome sequencing (WGS) and genome features

The whole genome of strain P121T was sequenced on the Illumina HiSeq 2500 platform as recommended by the manufacturer. Approximately 5 µg/µl of gDNA was used for the construction of paired-end sequencing libraries (2 × 150 bp) with an insert size of 450 bp. Quality analysis of the final libraries was performed using a 2100 Bioanalyzer (Agilent Technologies, CA, USA). Read quality before assembly was assessed using FastQC (Andrews 2010) and AdapterRemoval (Lindgreen 2012). The estimated best k-mers were selected by KmerStream (Melsted and Halldórsson 2014), followed by assembly using Edena (Hernandez et al. 2008) and SPAdes (Bankevich et al. 2012). The PSI-CD-HIT package (Fu et al. 2012) was used to obtain the final contig file.
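
Once a final contig file is available, the basic summary figures reported for a draft genome (number of contigs, total length and G + C content) can be computed directly from the contig sequences. The sketch below is a minimal Python illustration under stated assumptions: contigs are passed as plain strings, ambiguous bases such as N are excluded from the G + C denominator, and the function name is hypothetical. It is not the procedure used by the assembly tools cited above.

```python
def assembly_summary(contigs):
    """Return contig count, total length and G + C percentage.

    contigs: iterable of nucleotide strings (one per contig).
    Ambiguous bases (e.g. N) count toward total length but are
    excluded from the G + C calculation (an assumption).
    """
    contigs = [c.upper() for c in contigs]
    total_bp = sum(len(c) for c in contigs)
    gc = sum(c.count("G") + c.count("C") for c in contigs)
    acgt = sum(c.count(base) for c in contigs for base in "ACGT")
    gc_percent = 100.0 * gc / acgt if acgt else 0.0
    return {
        "contigs": len(contigs),
        "total_bp": total_bp,
        "gc_percent": gc_percent,
    }


# Toy (hypothetical) contigs, not data from strain P121T.
toy_summary = assembly_summary(["GCGCGCGCAT", "ATAT", "GC"])
```

For a real assembly, the contig strings would be read from the FASTA file produced by the assembler.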

Annotation of the genome was performed using the Rapid Annotation using Subsystem Technology (RAST) server v. 2.0 (https://rast.nmpdr.org) (Aziz et al. 2008) and Prokka v.1.11 (Seemann 2014). The antiSMASH server was used to identify the secondary metabolite biosynthesis gene clusters (Blin et al. 2019). Comparative genome analysis for Paenibacillus piscarius P121T, Paenibacillus borealis DSM 13188T, Paenibacillus silagei DSM 101953T and Paenibacillus rhizoplanae DSM 103993T was performed using the OrthoVenn2 webserver (https://orthovenn2.bioinfotoolkits.net/) (Xu et al. 2019). The comparative genome map was represented as a BLASTN-based ring generated by BLAST Ring Image Generator (BRIG) version 0.95 (Alikhan et al. 2011), with the P. piscarius P121T genome used as the reference for comparison. An alignment using tBLASTx was performed to compare the structural nitrogenase-encoding genes (nifK, nifD and nifH) and other nif genes necessary for nitrogen fixation between P. piscarius P121T and the most closely related species.

Phylogenetic analyses using 16S rRNA, rpoB, gyrB and nifH genes

Multilocus sequence analysis (MLSA) was performed using the concatenated sequences of the 16S rRNA, rpoB, gyrB and nifH genes. Sequences of these genes (gyrB, 1911 bp; rpoB, 3540 bp; nifH, 245 bp) from different Paenibacillus species were obtained from their genome sequences deposited in the NCBI (https://www.ncbi.nlm.nih.gov/) and JGI (https://img.jgi.doe.gov/cgi-bin/m/main.cgi) databases and from the draft genome sequence of strain P121T determined in this study. Sequences were compared using the BLAST program. All multiple alignments were performed using the online multiple alignment program MAFFT version 7 (https://mafft.cbrc.jp/alignment/software/). Phylogenetic analyses were performed using the RAxML-HPC2 tool in the CIPRES Science Gateway (Miller et al. 2010), with phylogenetic tree inference using a maximum-likelihood/rapid bootstrapping run. Phylogenetic trees based on the concatenated sequences were also reconstructed using the neighbor-joining (Saitou and Nei 1987) and maximum-parsimony (Fitch 1971) algorithms in Molecular Evolutionary Genetics Analysis (MEGA) software version X (Kumar et al. 2018). Bootstrap analysis (1,000 replications) was used to assess cluster stability.
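
The concatenation step of an MLSA can be sketched as follows. This minimal Python example joins per-gene alignments into one sequence per taxon in a fixed gene order; padding a missing gene with gaps is an assumption made for illustration, and the function name, taxon labels and toy sequences are hypothetical rather than the procedure or data used in this study.

```python
def concatenate_alignments(alignments, gene_order):
    """Concatenate per-gene alignments into one sequence per taxon.

    alignments: {gene: {taxon: aligned_sequence}}
    gene_order: list of gene names giving the concatenation order.
    A taxon lacking a gene is padded with gaps (an assumption).
    """
    taxa = set()
    widths = {}
    for gene in gene_order:
        gene_alignment = alignments[gene]
        lengths = {len(seq) for seq in gene_alignment.values()}
        if len(lengths) != 1:
            raise ValueError(f"{gene}: sequences are not aligned to equal length")
        widths[gene] = lengths.pop()
        taxa.update(gene_alignment)

    concatenated = {}
    for taxon in sorted(taxa):
        parts = [
            alignments[gene].get(taxon, "-" * widths[gene])
            for gene in gene_order
        ]
        concatenated[taxon] = "".join(parts)
    return concatenated


# Toy (hypothetical) alignments, not real gene sequences.
toy = {
    "16S": {"P121": "ACGT", "outgroup": "ACGA"},
    "nifH": {"P121": "GG-C"},
}
toy_concatenated = concatenate_alignments(toy, ["16S", "nifH"])
```

Keeping the gene order fixed across taxa ensures that every column of the concatenated matrix still compares homologous positions.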

Digital DNA–DNA hybridization (dDDH) and average nucleotide identity (ANI)

The ANI between strain P121T and the most closely related Paenibacillus species identified in the phylogenetic trees was calculated using JSpeciesWS software (http://jspecies.ribohost.com/jspeciesws) (Richter et al. 2016). Digital DNA–DNA hybridization (dDDH) was performed using the Genome-to-Genome Distance Calculator (GGDC 2.1; Meier-Kolthoff et al. 2013) provided on the Leibniz Institute DSMZ website (http://ggdc.dsmz.de/distcalc2.php) with the recommended parameters and/or default settings.

Phenotypic and biochemical characterization

Most cultural and biochemical tests were performed using the methodology described in Gordon et al. (1973). Either TSB or TBN (Seldin et al. 1984) liquid medium was used to propagate cultures (P121T and other Paenibacillus strains used in different tests for comparative purposes) for 24–48 h without shaking at 32 °C. Media were supplemented with 1.2% agar to obtain solid media. Cells were observed by Nomarski differential interference contrast on a Zeiss Axioplan microscope (Carl Zeiss, Oberkochen, Germany) to determine the size of vegetative cells. The length and width (n = 700) of P121T cells were measured using iTEM software (iTEM Software Inc., Whiteley, UK). Gram staining was carried out using the standard Gram reaction. Cellular motility was observed in fresh wet mounts of young (24-h) bacterial cultures in TSB. For flagella observation, cells were deposited on formvar-coated copper grids (Electron Microscopy Sciences, PA, USA), negatively stained with 2% uranyl acetate, and observed on an FEI Morgagni transmission electron microscope (FEI Company, Hillsboro, OR, USA) operating at 80 kV. For ultrathin section analysis, cells were fixed in 2.5% glutaraldehyde in 0.1 M sodium cacodylate buffer, washed three times in the same buffer, postfixed in 1% osmium tetroxide, dehydrated in an acetone series and embedded in Polybed 812. Ultrathin sectioning was performed on an EM UC6 microtome (Leica Microsystems). Samples were recovered on 300-mesh copper grids (Electron Microscopy Sciences), stained with uranyl acetate and lead citrate, and observed on an FEI Morgagni transmission electron microscope (FEI Company, Hillsboro, OR, USA) at 80 kV.

For all tests that required complex media (temperature, pH and salinity range of growth, resistance to lysozyme, hydrolysis of starch, and liquefaction of gelatin), appropriately adjusted TSB was employed. Cell growth was monitored by the increase in optical density at 600 nm. Anaerobic growth was assessed by incubating a TSB-agar plate inoculated with strain P121T in an anaerobic chamber (filled with 80% N2, 10% CO2 and 10% H2) for 5 days. Cytochrome oxidase was determined by the standard paper strip Kovacs oxidase test. Catalase activity was detected by bubble formation upon addition of 3% H2O2 solution. The Voges-Proskauer test, formation of crystalline dextrins, utilization of citrate, reduction of nitrate to nitrite, production of indole, and decomposition of casein were performed in the media and under the conditions described in Gordon et al. (1973). Strain P121T was also characterized using the API 50CH kit (BioMérieux, France) as described in Seldin and Penido (1986). Data from API tests composed of 49 different carbohydrates were recorded as described previously (Rosado et al. 1998). An API 20NE strip (BioMérieux) containing 20 miniature biochemical tests was inoculated, and the biochemical results were converted into a numerical profile or code used to identify the bacteria as indicated by the manufacturer.
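
The conversion of strip results into a numerical profile can be sketched as below. This minimal Python example assumes the commonly used scoring convention in which tests are read in strip order in groups of three, positive reactions in each group score 1, 2 and 4, and the scores of each group are summed into one digit. The grouping, the test order and any additional test needed to complete the last group are assumptions here; the manufacturer's instructions remain the authoritative scheme. The function name and the input values are hypothetical and are not the results obtained for strain P121T.

```python
def numerical_profile(results):
    """Encode positive/negative test results as a numerical profile.

    results: list of booleans in strip order; the length must be a
    multiple of three (assumed grouping into triplets scored 1, 2, 4).
    """
    if len(results) % 3 != 0:
        raise ValueError("number of tests must be a multiple of three")
    weights = (1, 2, 4)
    digits = []
    for start in range(0, len(results), 3):
        triplet = results[start:start + 3]
        score = sum(w for w, positive in zip(weights, triplet) if positive)
        digits.append(str(score))
    return "".join(digits)


# Hypothetical results for nine tests (three triplets).
toy_profile = numerical_profile(
    [True, False, False, True, True, False, False, False, True]
)
```

Each digit therefore ranges from 0 (all three tests negative) to 7 (all three positive), so a profile can be decoded back into individual reactions.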

Respiratory quinones and fatty acids

The Identification Service and Dr. Brian Tindall, DSMZ, Braunschweig, Germany, carried out the analyses of respiratory quinones and fatty acids.

Acetylene reduction

To confirm the nitrogen-fixing capacity of strain P121T, acetylene reduction to ethylene (nitrogenase activity) was tested by gas chromatography as described previously (Seldin et al. 1983, 1984), using P. riograndensis SBR5T and P. graminis RSA19T as positive controls.

Results and discussion

Phylogenetic analysis of 16S rRNA gene

Different isolates from the gut of the armored catfish Parotocinclus maculicauda were phylogenetically characterized based on 16S rRNA gene sequence analysis. One strain (denoted P121T) was identified as belonging to the genus Paenibacillus, and its closest relatives were P. borealis DSM 13188T, P. silagei DSM 101953T and P. rhizoplanae DSM 103993T, all considered nitrogen-fixing Paenibacillus species (https://lpsn.dsmz.de/genus/paenibacillus; Elo et al. 2001; Tohno et al. 2016; Kämpfer et al. 2017). The 16S rRNA gene sequence of the novel strain was most similar to those of P. rhizoplanae DSM 103993T (98.9% similarity), P. silagei DSM 101953T (98.3% similarity) and P. borealis DSM 13188T (97.6% similarity). The 16S rRNA gene-based phylogenetic reconstruction using the maximum-likelihood algorithm, including the sequences of the most closely related species obtained from the GenBank database, showed that P121T clustered in a separate clade together with P. borealis DSM 13188T, P. rhizoplanae DSM 103993T and P. silagei DSM 101953T, but on an independent branch (Fig. 1).

Fig. 1

Maximum likelihood tree with GTRGAMMA distribution of the multiple alignment of the 16S rRNA encoding gene of Paenibacillus piscarius P121T and related species. The GenBank accession number of each sequence is shown in parentheses. Bootstrap values are expressed as percentages of 1000 replications and are shown at branch points. Bacillus licheniformis ATCC 14580 was used as outgroup. Bar = substitutions per nucleotide position

Genome sequence analysis

The draft genome sequence of strain P121T was determined in this study, and the Whole Genome Shotgun project has been deposited at DDBJ/ENA/GenBank under accession number JAIEUI000000000. The version described in this paper is JAIEUI010000000. The draft assembly of strain P121T comprised 7,513,698 bp in 444 contigs, with a G + C content of 53.9 mol%. According to the annotation, the genome contains 6,955 coding sequences and 82 RNAs. The RAST analysis revealed 317 subsystems (Fig. S1). Among the subsystem categories, carbohydrates had the highest feature count (299), followed by amino acids and derivatives (279). Consistent with a nitrogen-fixing bacterium, genome analysis of P121T revealed the presence of the nifD and nifK genes, which encode the α and β subunits of dinitrogenase (MoFe protein), respectively, and the nifH gene, which encodes the nitrogenase iron protein. Other nif genes were also found in the P121T genome, and their identities were compared with those of the most closely related species (Table S2). The highest identities were observed between P121T and P. rhizoplanae DSM 103993T (> 89.2%) and P. silagei DSM 101953T (> 89.6%).

AntiSMASH analysis resulted in the identification of eight predicted secondary metabolite biosynthetic gene clusters (BGCs). One of the BGCs matched paeninodine, a bacteriocin from the lassopeptide class, with 100% similarity. Lassopeptides are a class of ribosomally synthesized and posttranslationally modified natural products with diverse bioactivities (Maksimov et al. 2012). Another BGC showed 25% similarity to clusters encoding the polyketide aurantinin b/c/d. Polyketides are a large family of structurally diverse natural products with varied biological and pharmacological activities, including antibacterial, antitumor, and immunosuppressant activities (Nivina et al. 2019). These secondary metabolites have already been described in different Paenibacillus species, such as P. polymyxa KF-1 (Li et al. 2016), P. dendritiformis C454 (Zhu et al. 2016) and P. alvei MP1 (Pajor et al. 2020).

The average nucleotide identity (ANI) and digital DNA–DNA hybridization (dDDH) values were determined between strain P121T and the genomes of the three most closely related members of the genus Paenibacillus (P. borealis DSM 13188T, P. rhizoplanae DSM 103993T and P. silagei DSM 101953T) and are shown in Table 1. The ANIb values between strain P121T and P. borealis DSM 13188T, P. rhizoplanae DSM 103993T and P. silagei DSM 101953T were 80.47, 83.52 and 84.28%, respectively. The accepted threshold for species delimitation using ANIb is 95–96% (Richter and Rosselló-Móra 2009). The dDDH values were in all cases lower than 45%, well below the 70% cutoff value for species delineation (Goris et al. 2007). Both the ANI and dDDH results indicate that strain P121T represents a new species of the genus Paenibacillus.
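
The threshold comparison described above can be written out as a short check. The following minimal Python sketch uses the ANIb values reported in the text and the lower bound of the 95–96% range as the cutoff; the choice of the lower bound, the constant names and the helper functions are assumptions for illustration. Exact dDDH values are given only in Table 1, so they are not hard-coded here, although the same helper could be applied to them with the 70% cutoff.

```python
# Cutoffs cited in the text; using the lower bound of 95-96% is an assumption.
ANI_SPECIES_THRESHOLD = 95.0
DDH_SPECIES_THRESHOLD = 70.0

# ANIb values (%) between strain P121T and each reference, as reported in the text.
anib_values = {
    "P. borealis DSM 13188T": 80.47,
    "P. rhizoplanae DSM 103993T": 83.52,
    "P. silagei DSM 101953T": 84.28,
}


def below_species_threshold(values, threshold):
    """Map each reference strain to True if its value is below the cutoff."""
    return {name: value < threshold for name, value in values.items()}


def is_distinct_from_all(values, threshold):
    """True if every pairwise value falls below the species cutoff."""
    return all(below_species_threshold(values, threshold).values())


ani_supports_new_species = is_distinct_from_all(anib_values, ANI_SPECIES_THRESHOLD)
```

A value at or above the cutoff for any single reference genome would be enough to question assignment to a separate species, which is why the check requires all comparisons to fall below it.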

Table 1 dDDH and ANI values between the genome of Paenibacillus piscarius P121T as the query genome and that of closely related species

Finally, the comparative genome analysis for P. piscarius P121T, P. borealis DSM 13188T, P. silagei DSM 101953T and P. rhizoplanae DSM 103993T revealed that the strains formed 6874 clusters: 5015 orthologous clusters (containing at least two species) and 1859 single-copy gene clusters (Fig. 2). Paenibacillus piscarius P121T possesses 394 singletons, that is, proteins not found in any cluster. Figure 3 shows a circular diagram illustrating the nucleotide similarity between P. piscarius P121T and other Paenibacillus genomes, represented by concentric rings. The nif operon region in P121T is highlighted in Fig. 3.

Fig. 2

Comparative genome analysis for Paenibacillus piscarius P121T, Paenibacillus borealis DSM 13188T, Paenibacillus silagei DSM 101953T and Paenibacillus rhizoplanae DSM 103993T performed using the OrthoVenn2 webserver. The numbers in the Venn diagram represent the number of clusters shared between strains

Fig. 3

Circular diagram illustrating the nucleotide similarity between P. piscarius P121T (in red - inner ring) and other Paenibacillus genomes represented by concentric rings. The nif genes region is delimited in black. (Color figure online)

Multilocus sequence analysis (MLSA) using the housekeeping genes 16S rRNA, rpoB, gyrB and nifH

Multilocus sequence analysis (MLSA) was performed using the concatenated sequences of the 16S rRNA, rpoB, gyrB and nifH genes. Concatenation of the 16S rRNA gene and the three other genes (rpoB, gyrB and nifH) of the different nitrogen-fixing Paenibacillus species and the outgroup resulted in a phylogenetic tree (Fig. S2 a) with the same topology as that from the 16S rRNA gene reconstruction. Again, strain P121T fell within a monophyletic group together with P. borealis DSM 13188T, P. rhizoplanae DSM 103993T and P. silagei DSM 101953T but formed an independent branch in the tree. Similar results were also found in phylogenetic analyses performed for each gene separately (Fig. S2 b, c, d). Furthermore, phylogenetic reconstructions using the neighbor-joining and maximum-parsimony methods, including the sequences of the species most closely related to P121T, produced trees highly similar to those obtained using the maximum-likelihood algorithm (Fig. S3).

Phenotypic characteristics

Strain P121T was Gram-positive or Gram-variable. Cells were rod-shaped, measuring 0.63 ± 0.11 μm by 3.34 ± 0.45 μm, and motile by means of flagella (Fig. 4a). The spores were ellipsoidal, distended the sporangia and were located in a central to subterminal position in the cell (Fig. 4b). The colonies were yellowish, circular, convex and mucoid, 10–15 mm in diameter on TSB agar.

Fig. 4

Transmission electron micrographs of isolate P121T. a Vegetative cells with flagella; b ellipsoidal spore, swelling the sporangia

Different phenotypic tests were used to characterize strain P121T based on the recommendations of Gordon et al. (1973) and Logan et al. (2009), and the results are given below in the species description. Strain P121T was also characterized using API tests (API 50CH and API 20NE). In API 50CH, it produced acid from 24 carbohydrates (listed in Description of Paenibacillus piscarius sp. nov.). A weak reaction was observed with methyl-D-mannoside, methyl-D-glucoside and xylitol. The novel strain did not produce acid from the other 22 carbohydrates tested. In API 20NE, strain P121T reduced nitrate to nitrite, produced urease, β-galactosidase and arginine dehydrolase, and assimilated glucose, arabinose, mannitol and maltose. Phenotypic characteristics that differentiate the novel isolate from the three closely related species Paenibacillus borealis DSM 13188T (Elo et al. 2001), Paenibacillus rhizoplanae DSM 103993T (Kämpfer et al. 2017) and Paenibacillus silagei DSM 101953T (Tohno et al. 2016), also considered nitrogen-fixing Paenibacillus species, are presented in Table 2. Comparison of these characteristics showed that P121T could not be considered a typical member of any of these previously established species (Table 2).

Table 2 Characteristics that differentiate strain P121T from the closest type strains of selected Paenibacillus species

Chemotaxonomic characteristics

As in other species of the genus Paenibacillus, meso-diaminopimelic acid was detected. The quinone system was composed predominantly of menaquinone MK-7, which is also in line with other species of the genus.

The fatty acids comprised mainly iso- and anteiso-branched components, and the fatty acid profile was very similar to those of the most closely related Paenibacillus species. The major cellular fatty acids are anteiso-C15:0, iso-C16:0, iso-C15:0 and C14:0. The detailed cellular fatty acid profiles (%) of P. piscarius sp. nov. P121T and closely related Paenibacillus species are shown in Table S1.

Nitrogen fixation

The new isolate, together with the P. graminis and P. riograndensis type strains (RSA19T and SBR5T, respectively), effectively reduced acetylene, showing values varying from 2.08 to 4.73 nmol ethylene/mg protein/h (Table 3).
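
The unit used above (nmol ethylene/mg protein/h) implies a simple normalization of the ethylene measured by gas chromatography. The minimal Python sketch below shows that arithmetic; the function name and the input numbers are hypothetical and do not correspond to the measurements in Table 3, and calibration of the chromatograph and protein determination are assumed to have been done separately.

```python
def specific_nitrogenase_activity(ethylene_nmol, protein_mg, hours):
    """Acetylene reduction rate in nmol ethylene per mg protein per hour.

    ethylene_nmol: ethylene produced during the assay (nmol)
    protein_mg: total cell protein in the assay (mg)
    hours: incubation time (h)
    """
    if protein_mg <= 0 or hours <= 0:
        raise ValueError("protein_mg and hours must be positive")
    return ethylene_nmol / (protein_mg * hours)


# Hypothetical example: 20 nmol ethylene, 2 mg protein, 2 h incubation.
toy_activity = specific_nitrogenase_activity(20.0, 2.0, 2.0)
```

Normalizing by both protein and time is what allows rates from cultures of different density and incubation length to be compared on the same scale.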

Table 3 Nitrogenase activity of strain P121T compared with two other nitrogen-fixing Paenibacillus species

Description of Paenibacillus piscarius sp. nov.

Paenibacillus piscarius (pis.ca’ri.us. L. masc. adj. piscarius, of or pertaining to fish).

Cells are straight, motile rods (0.63 ± 0.11 µm in width, 3.34 ± 0.45 µm in length) with flagella. Spores are oval to ellipsoidal, predominantly central to subterminal, and distend the sporangium. Young trypticase soy broth (TSB) cultures are Gram-positive or Gram-variable. On TSB agar, colonies are 10 to 15 mm in diameter, yellowish, circular, convex and mucoid. Growth does not occur at 10 °C or 40 °C; the optimum temperature is near 25 °C. Growth is not observed at pH 5.7 (optimum pH 7.5–8) or in the presence of 2% NaCl. Resistant to 0.001% lysozyme. Facultatively anaerobic. Catalase and urease are produced. Voges-Proskauer negative. Nitrate is reduced to nitrite. Gelatin is not liquefied. Starch is not hydrolyzed, and esculin is hydrolyzed. No crystalline dextrins are formed in rolled oat medium. Casein is weakly decomposed. Indole is not produced. Acid is produced from L-arabinose, D-xylose, β-methyl-xyloside, glucose, D-fructose, D-mannose, mannitol, N-acetyl-glucosamine, amygdalin, arbutin, esculin, salicin, cellobiose, maltose, lactose, melibiose, sucrose, trehalose, inulin, D-raffinose, starch, glycogen, gentiobiose and turanose. Acid is not produced from glycerol, erythritol, D-arabinose, ribose, L-xylose, adonitol, galactose, L-sorbose, rhamnose, dulcitol, inositol, sorbitol, melezitose, D-lyxose, D-tagatose, D-fucose, L-fucose, D-arabitol, L-arabitol, gluconate, 2-keto-gluconate or 5-keto-gluconate. A weak reaction is observed with xylitol, α-methyl-D-glucoside and α-methyl-D-mannoside. Maltose and mannitol are assimilated, but mannose and N-acetyl-glucosamine are not. Utilization of gluconate, caprate, adipate, malate, citrate and phenyl-acetate is not observed. Nitrogen fixation (acetylene reduction) is detected. The G + C content of the type strain is 53.9 mol%. The major cellular fatty acids are anteiso-C15:0, iso-C16:0, iso-C15:0 and C14:0. Isolated from the gut of the armored catfish Parotocinclus maculicauda.
The type strain is LFB-Fiocruz 1636, DSM 25072 (= P121T).