Introduction

Soil salinity is a major concern in agriculture, causing annual crop production losses of more than US$ 27 billion. Salinity causes a nutritional imbalance that adversely affects plant growth, development, and yield (Shrivastava and Kumar 2015). Bacterial strains that efficiently colonize the rhizosphere and promote plant growth directly or indirectly are known as plant growth-promoting bacteria (PGPB) (Ahemad and Kibret 2014). These strains carry plant growth-promoting traits such as phosphate solubilization and the production of siderophores, nitrogenase, indole-3-acetic acid (IAA), 1-aminocyclopropane-1-carboxylate (ACC) deaminase, and hydrogen sulfate (Ahemad and Khan 2012; Glick 2012; Jahanian et al. 2012). Some bacterial strains hold additional habitat-specific plant growth-promoting traits such as salinity tolerance and biocontrol of plant pathogens and insects (Hynes et al. 2008). Applying such strains to salinized agricultural soil is a promising way to support plant growth and help plants tolerate higher salt concentrations. PGPB are gaining interest in agricultural applications because of their diverse biological activities. The use of particular strains to induce plant growth and suppress phytopathogens provides a substitute for chemical fungicides and fertilizers (Basu et al. 2021).

Bacillus spp. have been extensively investigated for their ability to induce plant resistance against pathogens and to promote plant growth and development. In recent years, Bacillus strains have been employed as active ingredients in biofertilizer formulations. They colonize the rhizosphere, protect the roots and increase root biomass, and enhance plant potency (Gouda et al. 2018). B. paralicheniformis synthesizes several commercially important products such as antibiotics and enzymes. B. paralicheniformis strain MDJK30 was isolated from the rhizosphere and shown to suppress peony root rot disease (Wang et al. 2017). Next-generation sequencing technology has enabled whole-genome sequencing of bacteria and was recently employed to investigate the genomes of several plant growth-promoting strains, such as Pseudomonas and Bacillus spp. (Joshi and Chitanand 2020; Olanrewaju et al. 2021). Comprehensive analysis of whole-genome data and identification of genes that contribute to ecology and plant growth promotion will advance our understanding of the underlying molecular mechanisms and support the development of PGPB-aided agricultural technology (Meena et al. 2017).

In the present study, the newly sequenced strain ES-1, which shows in vitro plant growth-promoting traits, and 14 B. paralicheniformis genomes available in public databases (August 2021) were mined for biosynthetic gene clusters (BGCs) encoding antimicrobial metabolites and for plant growth-promoting potential. Furthermore, comparative genome analysis was performed, and the contributions of the core and accessory genomes to secondary metabolism and plant growth promotion were analyzed. Our results revealed that the core genome of strain ES-1 harbors several secondary metabolite gene clusters coding for antimicrobial metabolites, whereas plant growth-promoting genes are found mainly in the accessory genome.

Materials and methods

Strain isolation and ecological features

Saline sodic soil and water samples (n = 14) were collected from various regions, including the Karak and Jhelum salt mines. The samples were collected aseptically in sterile bottles to avoid exogenous bacterial contamination. The samples were sealed, transported to the laboratory in an ice box on the same day, and stored at −20 ℃. The pH, temperature, salinity, electrical conductivity, total dissolved solids (TDS), and global positioning system (GPS) coordinates were recorded at each sampling site. Strain ES-1 was isolated from salt mine sodic soil in Karak, Khyber Pakhtunkhwa, Pakistan. The Karak salt mines represent one of the largest salt reservoirs in Pakistan, with estimated reserves of more than 10.5 billion tonnes (Sharif et al. 2007). The salt of the Karak mines is light gray to dark gray and 98% pure. The Karak salt mine area consists of small to high rounded arid hills and is very hot in summer and cold in winter. Based on geological horizons, these salt rocks are dated to the Eocene epoch, 56–33.9 million years ago (Sharif et al. 2007).

A dilution series was prepared starting with 1 g of sample in 10 mL of sterile 1 M NaCl solution. From each dilution, 100 µL was spread on trypticase soy agar (Oxoid, UK) supplemented with 5% NaCl. The plates were sealed and incubated at 37 ℃ for 5 days. The colonies were counted under a plate magnifier, and the colony-forming units (CFU) per gram were calculated. Pure colonies were isolated by sub-culturing and screened for plant growth-promoting potential.
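
The CFU per gram calculation can be sketched as follows. This is a minimal illustration of the standard plate-count arithmetic, not the authors' own script; the function name, its parameters, and the example colony count are assumptions introduced here. The defaults mirror the volumes stated above (1 g of sample in 10 mL, 100 µL plated).

```python
def cfu_per_gram(colony_count, plated_volume_ml=0.1, sample_mass_g=1.0,
                 diluent_volume_ml=10.0, dilution_fold=1.0):
    """Estimate CFU per gram of sample from a single plate count.

    dilution_fold is the further dilution of the initial suspension
    (1 = undiluted suspension, 10 = a 1:10 dilution of it, and so on).
    All names and defaults are illustrative assumptions.
    """
    if plated_volume_ml <= 0 or sample_mass_g <= 0 or diluent_volume_ml <= 0:
        raise ValueError("volumes and mass must be positive")
    # CFU per mL of the initial suspension
    cfu_per_ml = colony_count * dilution_fold / plated_volume_ml
    # Scale by the suspension volume per gram of sample
    return cfu_per_ml * diluent_volume_ml / sample_mass_g


if __name__ == "__main__":
    # Hypothetical count: 2 colonies on a plate spread with the undiluted suspension
    print(cfu_per_gram(2))
```

In practice, counts from several plates and dilutions would be averaged before reporting a mean and deviation.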

Antimicrobial activity assay

All the isolated bacterial strains (n = 72) were preliminarily screened for antimicrobial activity using the cross-streak method against a set of ATCC bacterial strains (Escherichia coli, Staphylococcus aureus, Listeria monocytogenes, and Agrobacterium radiobacter) and two locally isolated fungal pathogens, Aspergillus niger and Botrytis cinerea. Subsequently, the antimicrobial activity of the strain ES-1 extract was confirmed via agar well diffusion assay (Iqbal et al. 2021a). Briefly, the strain was cultured in 500 mL tryptic soy broth (TSB) supplemented with 5% NaCl for 7 days at 37 ℃. The culture was extracted with ethyl acetate (1:1) and concentrated under reduced pressure at 60 ℃ in a rotary evaporator. The obtained extract was dissolved in 500 µL phosphate buffer solution, and 100 µL was dispensed into each agar well for the antibacterial assay. The agar plates were pre-inoculated with fresh bacterial indicator strains at a density comparable to 0.5 McFarland.

Antifungal activity was determined against the phytopathogens A. niger MB-4 and B. cinerea KST-32 as performed earlier (Iqbal et al. 2021c). Petri plates containing mycelial discs of fungal strains MB-4 and KST-32 without treatment (ES-1 extract) were used as controls. The antifungal activity was calculated using the following formula.

$${\text{Growth}}\,{\text{inhibition }}\left( \% \right) = \frac{{{\text{Diameter}}\,{\text{of}}\,{\text{Control}} - {\text{Diameter}}\,{\text{of}}\,{\text{Treated}}}}{{{\text{Diameter}}\,{\text{of}}\,{\text{Control}}}} \times 100.$$
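
As a worked illustration of the formula above, the following sketch computes the percentage growth inhibition from two colony diameters. The function name and the example diameters are hypothetical and are not measurements from this study.

```python
def growth_inhibition_percent(control_diameter_mm, treated_diameter_mm):
    """Growth inhibition (%) = (control - treated) / control * 100."""
    if control_diameter_mm <= 0:
        raise ValueError("control diameter must be positive")
    return (control_diameter_mm - treated_diameter_mm) / control_diameter_mm * 100.0


if __name__ == "__main__":
    # Hypothetical diameters: 80 mm untreated control, 54 mm with extract
    print(round(growth_inhibition_percent(80.0, 54.0), 2))
```

A treated colony as large as the control gives 0% inhibition, and complete suppression gives 100%.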

In vitro plant growth-promoting ability and salinity tolerance

Initially, strain ES-1 was tested for salinity tolerance on tryptic soy agar (TSA) medium supplemented with various concentrations of NaCl. Afterward, the strain was assessed for plant growth-promoting traits, including extracellular protease, cellulase, and amylase production, siderophore production, biofilm formation, phosphate solubilization, and 1-aminocyclopropane-1-carboxylic acid (ACC) deaminase activity. Biofilm formation was assessed using a colorimetric method as performed earlier by Djordjevic et al. (2002). For quantitative analysis, the glass walls were destained with 95% ethanol and the optical density (OD) was measured at 595 nm. Fresh cultures (OD at 600 nm = 0.8) of the isolates were used as inoculum, and uninoculated medium was used as a negative control. Siderophore synthesis was evaluated using the colorimetric chrome azurol sulphonate (CAS) assay (Arora and Verma 2017), and OD was measured at 630 nm. ACC deaminase activity was assessed using DF salt minimal medium modified with 3 mM ACC (Penrose and Glick 2003), and OD was measured at 540 nm. Strain ES-1 was screened for extracellular protease, cellulase, and amylase production as described earlier (Ji et al. 2014). Pre-sterilized skim milk, carboxymethyl cellulose, and starch were used as substrates for protease, cellulase, and amylase production, respectively. The phosphate solubilization assay was performed as described earlier (Elhaissoufi et al. 2020) with minor modifications. Strain ES-1 was spot-inoculated on TSA medium containing insoluble tricalcium phosphate as the sole phosphate source. A clear halo zone of hydrolysis around the colony indicates a positive result. For the amylase, protease, cellulase, and phosphate-solubilizing assays, the colony and halo zone diameters were measured in mm and the activity was calculated as below.

$${\text{Activity }} = \frac{{{\text{Diameter of colony }} - {\text{halozone diameter}}}}{{\text{Diameter of colony}}}.$$

All the plant growth-promoting assays were also performed under salinity stress, with media supplemented with NaCl at concentrations ranging from 0 to 4.27 M. Based on its promising antagonistic and plant growth-promoting activities, strain ES-1 was subjected to whole-genome sequencing and subsequent genome analysis.
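
Because salt concentrations appear in this work both as percentages and as molarities, a small conversion sketch may help when comparing values. It assumes that percentages are weight per volume (g per 100 mL) and uses a NaCl molar mass of 58.44 g/mol; both the assumption and the function names are introduced here for illustration.

```python
NACL_MOLAR_MASS_G_PER_MOL = 58.44  # molar mass of NaCl


def percent_wv_to_molar(percent_wv):
    """Convert % (w/v) NaCl to mol/L, assuming % means g per 100 mL."""
    grams_per_litre = percent_wv * 10.0
    return grams_per_litre / NACL_MOLAR_MASS_G_PER_MOL


def molar_to_percent_wv(molar):
    """Convert mol/L NaCl back to % (w/v)."""
    return molar * NACL_MOLAR_MASS_G_PER_MOL / 10.0


if __name__ == "__main__":
    for pct in (5, 10, 25):
        print(pct, "% w/v ->", round(percent_wv_to_molar(pct), 2), "M")
```

Under this assumption, 1.71 M and 4.27 M correspond to roughly 10% and 25% NaCl, respectively.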

Whole-genome sequencing, assembly, and annotation

Genomic DNA (gDNA) was extracted from a fresh culture of strain ES-1 using the PureLink™ DNA kit (Invitrogen, USA) according to the manufacturer's instructions. The purity and concentration of the extracted gDNA were confirmed using NanoDrop and the Qubit high-sensitivity assay, respectively. The gDNA library was prepared using the Nextera XT library preparation kit (Illumina Inc., San Diego, CA, USA) and sequenced on the Illumina HiSeq 2500 platform with paired-end reads. The reads were trimmed using Trimmomatic v0.36 (Bolger et al. 2014), and de novo assembly was performed using SPAdes v3.12 (Bankevich et al. 2012). Genome annotation was performed using Prokka (Prokka 2014) and PGAP v4.10.

Genome mining

The biosynthetic gene clusters (BGCs) in the ES-1 genome were identified using the antibiotics and Secondary Metabolite Analysis Shell (antiSMASH) online server (https://antismash.secondarymetabolites.org/). Moreover, both the “known” and “unknown cluster” BLAST modules were selected to find similar clusters by genome comparison. Sequence similarities to known clusters and domain functions were predicted and annotated using BLASTp and Pfam analysis.

Identification of putative horizontal gene transfer (HGT) and prophages

Genomic islands (GIs) were predicted using the annotated whole-genome sequence of strain ES-1 as input to the IslandViewer 4 online server (https://www.pathogenomics.sfu.ca/islandviewer/). IslandViewer 4 predicts GIs in bacterial and archaeal genomes using three prediction methods: IslandPath-DIMOB, IslandPick, and SIGI-HMM. Additionally, the web-based PHASTER server (https://phaster.ca/) was used to identify prophage regions in the B. paralicheniformis ES-1 genome.

Comparative genome analysis

As of August 2021, 87 B. paralicheniformis genome assemblies were available in public databases; however, most are incomplete or partial draft genomes. Therefore, only complete genomes or chromosomes (n = 15) were selected and used for comparative genome analysis. To assess genome-level diversity and similarities among B. paralicheniformis strains, high-throughput average nucleotide identity (ANI) was calculated using OrthoANI. The resulting ANI matrices were visualized using the web-based Heatmapper (http://www.heatmapper.ca/). In silico DNA–DNA hybridization (DDH) was performed for species boundary delineation using the Genome-to-Genome Distance Calculator (GGDC) version 2.1 (http://ggdc.dsmz.de/). The Bacterial Pan-Genome Analysis (BPGA) pipeline was used to evaluate genomic diversity and the unique gene pool in B. paralicheniformis strains (Chaudhari et al. 2016). The core, accessory, and unique genes were extracted using the pan-genome extraction module. Phylogenetic analyses were conducted based on the core protein alignment and the binary presence/absence pan-gene matrix, and a maximum likelihood tree was constructed from the concatenated core proteins using MEGA X (Kumar et al. 2018). Clusters of orthologous groups (COGs) analyses were performed using OrthoVenn2 with an inflation value of 1.5 and an e-value cutoff of 1e−2 (https://orthovenn2.bioinfotoolkits.net/). Genome-wide analysis of orthologous clusters is important for understanding genome structure and gene/protein function. The information obtained from COG comparison may serve as a foundation for taxonomic classification, thereby shedding light on the underlying mechanisms of molecular evolution in genes and genomes. Furthermore, the ES-1 genome was mined for secondary metabolite gene clusters and plant growth-promoting traits.

Results and discussion

Ecology, morphological characteristics, and antimicrobial activities

The sample temperature, pH, salinity, and TDS were recorded as 25 ℃, 7.85, 26%, and 37%, respectively (Supplementary file 2; Table S1). The sample had an average bacterial density of 2 × 10² ± 20 CFU/g. Strain ES-1 is a Gram-positive, rod-shaped bacterium that forms irregular dry colonies on tryptic soy agar. The antibacterial assay indicates that strain ES-1 produces broad-spectrum antibacterial metabolites, exhibiting promising antagonistic activity against E. coli, A. radiobacter, S. aureus, and L. monocytogenes (Fig. 1). The B. paralicheniformis strain ES-1 extract also exhibited antifungal activity against A. niger MB-4 and B. cinerea KST-32, with inhibition of 32.5 and 39.25%, respectively (Fig. 1). In an earlier study, 35 halotolerant bacterial strains were isolated and tested against Fusarium culmorum; three of them exhibited antifungal activity by inhibiting mycelial growth in a dual culture plate assay (Albdaiwi et al. 2020).

Fig. 1 Antimicrobial activities of B. paralicheniformis ES-1 extract against the indicator strains E. coli, S. aureus, L. monocytogenes, A. radiobacter, A. niger, and B. cinerea

In vitro plant growth-promoting traits and salinity tolerance

Plant growth-promoting bacteria interact with plant roots directly, by promoting the availability of essential nutrients such as phosphate and iron, and indirectly, by protecting plants against phytopathogens through competition or the production of hydrolytic enzymes (Oleńska et al. 2020). The results confirm that strain ES-1 is capable of solubilizing phosphate, synthesizing siderophores, forming biofilm, and producing extracellular enzymes (Fig. 2). Salinity is an important abiotic stress that changes the physicochemical properties of the soil and significantly affects bacterial and plant growth (Shrivastava and Kumar 2015). We found that strain ES-1 is halotolerant and grows optimally at salt concentrations up to 1.71 M (Fig. 2). Recently, Reang et al. isolated several halotolerant bacterial strains from coastal regions that showed in vitro plant growth-promoting traits and grew optimally at 10–15% NaCl (Reang et al. 2022).

Fig. 2 a Extracellular enzyme production and salinity tolerance of B. paralicheniformis strain ES-1. The enzyme activities were evaluated on a solid medium containing various concentrations of NaCl (indicated in colors corresponding to the legend), and halo zones were measured in mm. b Quantitative analysis of plant growth-promoting traits. Biofilm, siderophore, and ACC deaminase production were assessed using colorimetric methods, and optical densities (OD) were measured at 595, 630, and 540 nm, respectively. Results represent the mean values of triplicate experiments with standard deviations

Genome features

A total of 4,294,515 reads were obtained, with a median insert size of 359 bp and a mean coverage of 462.29×. Reads with a Phred score above Q30 were subjected to de novo assembly, which resulted in a high-quality draft genome of 4.47 Mb in 47 contigs, with an N50 of 2,367,028 bp and an L50 of 1. The average GC content of the ES-1 genome is 45.73%. Annotation of the draft genome yielded 4708 CDS and 94 RNAs (Fig. 3).
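
For readers less familiar with the assembly statistics quoted above, the following sketch shows how N50 and L50 are conventionally derived from a list of contig lengths. The function name and the toy contig lengths are illustrative assumptions; the actual ES-1 contig lengths are not listed here.

```python
def n50_l50(contig_lengths):
    """Return (N50, L50) for a list of contig lengths.

    N50 is the length of the contig at which the running total of
    length-sorted contigs first reaches half of the assembly size;
    L50 is the number of contigs needed to reach that point.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("no contigs given")
    half_total = sum(lengths) / 2
    running = 0
    for count, length in enumerate(lengths, start=1):
        running += length
        if running >= half_total:
            return length, count


if __name__ == "__main__":
    # Toy assembly (hypothetical lengths in bp)
    print(n50_l50([40, 30, 20, 10]))
```

An L50 of 1 means that the single largest contig already spans at least half of the assembly, which is consistent with an N50 of 2,367,028 bp in a 4.47 Mb genome.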

Fig. 3 Circular genome map of strain ES-1 in comparison with the reference B. paralicheniformis Bac84 genome (two outermost circles). The 3rd and 4th circles represent the GC skew deviation from the average, and the 5th and 6th circles indicate the genome locations of GIs and prophages, respectively

Putative GIs and prophages

GIs are regions in a genome that show evidence of horizontal gene transfer (HGT). GIs might be involved in important functions such as symbiosis or pathogenesis. HGT is also considered a vital factor that drives adaptation and evolution. A previous study demonstrated that Bradyrhizobium japonicum acquired genes via HGT, which led to improved symbiotic N2 fixation capability compared with related strains (Itakura et al. 2009). Similarly, Richards and colleagues analyzed the dynamics of genomic evolution in Streptococcus species and determined that eight GIs were acquired during early evolution, consequently assisting a superior adaptation to a specific habitat (Richards et al. 2014). Herein, we identified five HGT events in B. paralicheniformis strain ES-1 (Fig. 3). The acquired genes were found to be associated with the synthesis of important enzymes and stress-related proteins, such as cross-over junction endoribonuclease (ruvC), response regulator aspartate phosphatase A (rapA), tyrosine recombinase (xerC), NADPH-dependent 7-cyano-7-deazaguanine reductase (queF), putative hydrolase (ydeN), protease synthase and sporulation protein (paiB), cold shock protein (cspC), spermidine/spermine N(1)-acetyltransferase (paiA), phthiocerol/phenolphthiocerol synthesis polyketide synthase type I (ppsC), adaptive-response sensory kinase (sasA), metallopeptidase (immA), nisin biosynthesis protein (nisB/C), and ICEBs1 excisionase (xis) (Supplementary file 1; Table S1). The high proportion of important enzyme and stress-related genes in GIs at divergence events suggests a role in the adaptation and fitness of strain ES-1 in the extreme environment. Several genes in the GIs code for hypothetical proteins with unknown functions (Supplementary file 1; Table S1).

Phage-mediated recombination is also an important route by which environmental bacteria exchange genetic material, contributing to evolution, adaptation, and the acquisition of antibiotic resistance or antibiotic-producing genes. In the present study, a total of seven prophage regions were identified in the ES-1 genome and classified as two intact, four incomplete, and one questionable (Table 1). The intact prophages (scores of 100 and 110) are 38.2 and 35.3 kb in size with 43.24 and 47.16% GC content, respectively. These regions comprise 57 and 46 coding sequences, of which 13 and 7 are phage-hit proteins, respectively. The four incomplete prophages are 18.1, 13.3, 12.2, and 19.6 kb in size with 42.37, 39.96, 46.53, and 36.44% GC content, respectively. On the other hand, the questionable prophage is 26.7 kb in size with 40.70% GC content. The questionable prophage region comprises 28 coding sequences, of which 18 are phage-hit proteins and nine are hypothetical proteins (Supplementary file 1; Table S2).

Table 1 Details of prophage regions identified in Bacillus paralicheniformis strain ES-1

Taxonomic classification

Pairwise genome similarity and distance were calculated among all available complete genome sequences (n = 15) of B. paralicheniformis strains. All ANI values were above the species threshold (> 95%) (Richter and Rosselló-Móra 2009), and the minimum ANI (98.6%) was noted between strains CBMAI 1303 and Bac48 (Fig. 4a). The heatmap generated from the ANI values resolves four clades in the dendrogram. Clades A and B are composed of two and five strains, respectively. Clades C and D each consist of four strains with the highest ANI values (> 99%). The newly sequenced strain ES-1 lies in clade C, exhibiting the highest similarity with strain SUBG0010 (99.28%), followed by strains Bac84 and Bac48 (Fig. 4a). Strain Bac84 is the representative genome for the B. paralicheniformis species in the RefSeq database (https://www.ncbi.nlm.nih.gov/refseq/). In addition, the in silico DDH values are above the threshold, confirming that all the strains belong to the same B. paralicheniformis species (Supplementary file 1; Table S3). A DDH threshold of > 70% was previously recommended for species delineation (On et al. 2017). Herein, the DDH values support the ANI results, since the recommended DDH threshold corresponds well to the ANI cutoff.
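
The species delineation logic described above can be sketched as a simple two-criterion check. The 95% ANI and 70% DDH cutoffs are those cited in the text; the function name, the inclusive comparison, and the DDH values in the example are assumptions made for illustration (the measured DDH values are in the supplementary table and are not reproduced here).

```python
ANI_SPECIES_CUTOFF = 95.0  # percent, cutoff cited in the text
DDH_SPECIES_CUTOFF = 70.0  # percent, cutoff cited in the text


def same_species(ani_percent, ddh_percent):
    """Return True when both ANI and in silico DDH support conspecificity.

    Treating the cutoffs as inclusive is an assumption of this sketch.
    """
    return ani_percent >= ANI_SPECIES_CUTOFF and ddh_percent >= DDH_SPECIES_CUTOFF


if __name__ == "__main__":
    # 98.6% is the minimum ANI reported above; 85.0% DDH is a hypothetical value
    print(same_species(98.6, 85.0))
    # A hypothetical pair from different species
    print(same_species(91.2, 44.0))
```

Requiring both criteria is a conservative choice; a pair that passes only one would warrant closer inspection.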

Fig. 4 a Whole-genome comparison of B. paralicheniformis strains. The cells in the heatmap correspond to the level of similarity on a scale of 98–100%, where 98% similarity is depicted in green and 100% in blue. The dendrogram was constructed from the ANI percentages and corresponds to the ANI values between the strains under study. b Phylogenetic tree based on 3344 concatenated core proteins using the maximum likelihood method. The values on each branch represent the estimated time of divergence. The tree was constructed using MEGA X and edited in iTOL (https://itol.embl.de/)

Core pan-genome analysis of B. paralicheniformis

Core-pan genome analyses were conducted to identify unique genomic features and genomic diversity among B. paralicheniformis strains. To avoid incorrect inference from incomplete genomes, only complete genomes were considered (Supplementary file 2; Table S2). The pan-genome pool was generated by considering all genes found in all genomes (n = 15) and dividing them into core, accessory, and unique genes. Genes were clustered in the same group when the BLASTp coverage was more than 75%, with an e-value cutoff below 1e−8. A total of 14,857 gene clusters (pan-genome) were generated from the 15 strains. Among these, 3344 genes are common to all strains and make up the core genome. Based on the 3344 concatenated core genome proteins, the maximum likelihood tree grouped the 15 strains into four distinct clades with high confidence (Fig. 4b). Strain ES-1 clusters with SUBG0010, which was isolated from the rhizosphere, and with Bac48 and Bac84, which were isolated from Red Sea sediments.
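
The division into core, accessory, and unique genes can be illustrated with a presence/absence mapping. This is a simplified sketch of the partitioning idea and not the BPGA implementation; the function name, the cluster identifiers, and the toy gene content assigned to the strains are hypothetical.

```python
def partition_pan_genome(presence):
    """Split gene clusters into core, accessory, and unique lists.

    presence maps a gene-cluster ID to the set of strains carrying it.
    Core: present in every strain; unique: present in exactly one strain;
    accessory: everything in between.
    """
    all_strains = set().union(*presence.values())
    core, accessory, unique = [], [], []
    for cluster_id, carriers in presence.items():
        if carriers == all_strains:
            core.append(cluster_id)
        elif len(carriers) == 1:
            unique.append(cluster_id)
        else:
            accessory.append(cluster_id)
    return core, accessory, unique


if __name__ == "__main__":
    # Toy presence/absence data (hypothetical gene content)
    toy = {
        "cluster_1": {"ES-1", "Bac48", "Bac84"},
        "cluster_2": {"ES-1", "Bac48"},
        "cluster_3": {"Bac84"},
        "cluster_4": {"ES-1", "Bac48", "Bac84"},
    }
    core, accessory, unique = partition_pan_genome(toy)
    print(len(core), len(accessory), len(unique))
```

The strain set is inferred from the mapping itself, so every strain must carry at least one listed cluster for the core definition to be meaningful.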

The fitted power-law curve (pan-genome) and exponential curve (core genome) have parameters b = 0.117371 and d = −0.0126432, respectively; the positive exponent b suggests an open pan-genome (Tettelin et al. 2008). The core versus pan-genome plot illustrates that the number of accessory and unique genes increases while the number of core genes slightly decreases with the addition of each genome (Fig. 5a). These results indicate that the pan-genome window is still open for expansion. The large numbers of accessory (10,654) and unique (915) genes (Fig. 5b) might be due to the specialization of each B. paralicheniformis strain and may correspond to their geographical locations. The protein sequences of the core, accessory, and unique genes were extracted and assigned KEGG identities. Interestingly, the majority of the secondary metabolite gene clusters identified in the ES-1 genome are associated with the core genome. The accessory genome, which includes genes found in only a few genomes, is mainly associated with amino acid and carbohydrate metabolism, emphasizing habitat-specific changes in the strains. Sun et al. reported that diversity in amino acid and carbohydrate metabolism enhances the genetic fitness of Bifidobacterium to adapt to a particular habitat (Sun et al. 2015), which is desirable in sustainable agriculture applications. In addition, the genes that belong to the unique pan-genome are mostly involved in replication and repair, energy metabolism, and membrane transport (Fig. 5c). The details of the core, accessory, and unique genes present in each of the B. paralicheniformis genomes are given in Supplementary file 1; Table S4.
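
To make the power-law reasoning concrete, the sketch below fits a curve of the form P(N) = a·N^b to pan-genome sizes by ordinary least squares on log-transformed values. This is a simplified stand-in and not the fitting routine used by BPGA; the function names are hypothetical, and the input sizes are synthetic values generated from a = 4000 and b = 0.12 purely to show that the exponent is recovered.

```python
import math


def fit_power_law(genome_counts, pan_sizes):
    """Fit pan_size = a * N**b via linear regression in log-log space."""
    xs = [math.log(n) for n in genome_counts]
    ys = [math.log(p) for p in pan_sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


def exponent_suggests_open(b):
    """Criterion as stated in the text: a positive exponent suggests an open pan-genome."""
    return b > 0


if __name__ == "__main__":
    counts = list(range(1, 16))                  # 1 to 15 genomes
    sizes = [4000 * n ** 0.12 for n in counts]   # synthetic, illustrative data
    a, b = fit_power_law(counts, sizes)
    print(round(a, 1), round(b, 3), exponent_suggests_open(b))
```

With real data the points scatter around the curve, so the fitted exponent should be reported together with a measure of fit quality.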

Fig. 5 a Mathematical modeling of B. paralicheniformis genomes estimating the size of the core and pan-genome. b Bar graph representing the number of unique genes. c KEGG distribution of representative protein sequences in the core, accessory, and unique pan-genome across 15 B. paralicheniformis strains

Orthologous clusters are clusters of genes that originated by vertical descent from a single gene in the last common ancestor. Comparative orthologous cluster analysis across genomes provides insight into gene structure, function, and the molecular mechanisms of evolution in genes and genomes (Kristensen et al. 2011). Here, we performed a comparative orthologous cluster analysis, which identified a total of 4756 clusters, 3548 orthologous clusters (found in at least two strains), and 1205 single-copy gene clusters in the 15 B. paralicheniformis strains. The highest number of singletons was observed in strain RSC-3 (480), followed by the newly sequenced strain ES-1 (241) (Supplementary file 2; Table S3). Orthologous cluster analyses were previously conducted for Paenibacillus polymyxa, in which a total of 6052 families were identified, 3650 of which were shared by five strains (Xu et al. 2017). Similarly, proteome comparison of B. firmus strain I-1582 with five other related strains identified a total of 5371 clusters, 5211 orthologous clusters, and 3116 singletons (Susič et al. 2020). The lower number of clusters in the current study is probably due to the smaller genome size of B. paralicheniformis compared with B. firmus. The orthologous clusters identified in ES-1 were compared with those of four B. paralicheniformis strains isolated from soil (Fig. 6a) as well as four strains that originated from non-soil sources (Fig. 6b). The soil-derived strains shared 3767 clusters (core proteome), representing 79.20% of the protein repertoire, whereas the strains isolated from other sources shared 3737 clusters (78.57%) (Fig. 6a, b). These results indicate that the core proteome is similar regardless of the source of isolation. The newly sequenced strain ES-1 shared 37 and 33 clusters with strains Bac48 and SUBG0010, respectively.
Strains Bac48 and SUBG0010 were earlier characterized as promising antimicrobial-producing strains and were suggested to be good candidates for biocontrol (Al-Amoudi et al. 2016; Bhatt et al. 2018). Recently, an in silico metabolic network was constructed for strain Bac48 and revealed twice as many secreted proteins as in the model strain B. subtilis 168 (Othoum et al. 2019). These results suggest that these three strains share particular metabolite modules that have evolved as a result of unique niche adaptation.
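
The notion of a shared (core) proteome and its share of the total cluster repertoire can be sketched with set operations. This is an illustrative sketch rather than the OrthoVenn2 procedure; the function names and the toy cluster identifiers are hypothetical, while the counts in the last two lines are those reported above.

```python
def shared_clusters(clusters_by_strain):
    """Return the clusters present in every strain (set intersection)."""
    cluster_sets = list(clusters_by_strain.values())
    if not cluster_sets:
        return set()
    return set.intersection(*cluster_sets)


def share_percent(shared_count, total_count):
    """Percentage of the total cluster repertoire that is shared."""
    return 100.0 * shared_count / total_count


if __name__ == "__main__":
    # Toy example with hypothetical cluster IDs
    toy = {"strain_A": {1, 2, 3}, "strain_B": {2, 3, 4}, "strain_C": {2, 3, 9}}
    print(sorted(shared_clusters(toy)))
    # Counts reported in the text: 3767 and 3737 shared clusters out of 4756
    print(share_percent(3767, 4756), share_percent(3737, 4756))
```

The two percentages differ by well under one point, which is the basis for the statement that the core proteome is similar regardless of isolation source.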

Fig. 6 Genomic diversity in Bacillus paralicheniformis strains. a Strain ES-1 compared with strains that originated from soil. b Strain ES-1 compared with non-soil strains. Each strain is represented by an oval with a unique color, and the number of orthologous proteins shared by all strains is shown in the center

Genome mining

The antiSMASH analysis identified 12 secondary metabolite biosynthetic gene clusters in the ES-1 genome (Fig. 7). These include four nonribosomal peptide synthetase (NRPS) clusters encoding fengycin, bacitracin, bacillibactin, and lichenysin. Fengycin is a cyclic lipopeptide exhibiting promising antifungal activity. Bacitracin is a polypeptide that disrupts peptidoglycan synthesis in Gram-positive bacteria. Bacillibactin is an iron siderophore that enhances the capability of ES-1 to scavenge iron from its surroundings. Furthermore, strain ES-1 harbors four BGCs coding for ribosomally synthesized and post-translationally modified peptides (RiPPs) and one each for siderophore, terpene, type III polyketide synthase (T3PKS), and cyclodipeptide synthase (CDPS). The identified BGCs were compared with known clusters, and six BGCs showed homology with known gene clusters from other Bacillus species: the thiopeptide cluster exhibited 7% similarity with butirosin A (BGC0000693) from B. circulans, fengycin showed 93% similarity with BGC0001095 from B. velezensis, and bacitracin showed 100% similarity with BGC0000310 from B. licheniformis. Similarly, bacillibactin, lichenysin, and lanthipeptide revealed 53%, 100%, and 18% homology with BGC0000309 (B. subtilis str. 168), BGC0000381 (B. licheniformis DSM 13), and BGC0000506 (B. subtilis subsp. spizizenii), respectively. On the other hand, six clusters showed no similarity with known gene clusters and are therefore termed orphan clusters. Comparative BGC analyses with other B. paralicheniformis genomes revealed that strain SUBG0010 carries the highest number of BGCs (n = 13) and Bac48 the lowest (n = 10) (Fig. 7). Overall, the B. paralicheniformis genomes carry a similar pattern of BGCs. However, a strain-specific hybrid BGC was identified in the Bac48 genome, as previously reported (Othoum et al. 2018). The genome mining data suggest that strain ES-1 potentially produces numerous antibacterial metabolites, which is consistent with the in vitro antibacterial results.

Fig. 7 Comparative analysis of secondary metabolite biosynthetic gene clusters in B. paralicheniformis strains. The bar graph shows the consistent distribution of BGCs across the species. The number of BGCs in each strain is color-coded as per the legend

Orphan BGCs in ES-1

Siderophore

Siderophores have received great attention because of their applications in agriculture. A previous study demonstrated that siderophore-producing Pseudomonas fluorescens plays a vital role in controlling phytopathogens and also promotes plant growth (Cornelis and Matthijs 2007). Recent studies also support the siderophore theory of biological control by plant growth-promoting bacteria. For example, Brevibacillus brevis isolated from rhizosphere soil was shown to produce a siderophore with potent antimicrobial activity (Sheng et al. 2020). One siderophore-encoding gene cluster identified in strain ES-1 did not show similarity with any known cluster. The siderophore BGC is located in contig 1 of the ES-1 genome and contains 10 genes. The BLASTp results indicate two core biosynthetic genes (IucA and IucC), six additional biosynthetic genes (aminotransferase, aminotransferase class III, pyridoxal-dependent decarboxylase, putative siderophore biosynthesis protein, lysine/ornithine N-monooxygenase, and short-chain dehydrogenase/reductase SDR), and two other genes. Siderophore-coding BGCs were previously identified in plant growth-promoting B. subtilis and B. velezensis isolated from the rhizosphere (Olanrewaju et al. 2021).

Terpene

Terpenes are the largest and an extremely diverse class of natural products, widely used as herbicides, pharmaceuticals, biofuels, and flavorings. Previously, they were mostly isolated from plants and fungi. However, advances in genome technology and the availability of bacterial genome sequences indicate that bacteria also harbor terpene synthase genes, although most of these genes seem to be silent in the parent strains. Reddy et al. identified 22 terpene synthase genes in bacterial genomes, 15 of which were successfully expressed in E. coli (Reddy et al. 2020). Here, a terpene cyclase gene was identified as the core biosynthetic gene, and three additional biosynthetic genes (beta-lactamase, aldehyde dehydrogenase, and GCN5-related N-acetyltransferase) were identified in the ES-1 genome. Moreover, 13 “other” genes and one transport-related gene were identified.

Type III polyketide synthases (T3PKSs)

Type III polyketide synthases (T3PKSs) were earlier considered to occur only in plants and fungi. However, bacterial genome projects revealed that T3PKSs are widely distributed in bacterial genomes. Bacillus spp. are known to produce various biologically active metabolites containing a polyketide group. In an earlier study, a new variant of an antimicrobial aryl-crowned polyketide was isolated and characterized from the seaweed-associated B. subtilis strain MTCC 10403 (Chakraborty et al. 2018). The isolated polyketide exhibited promising activity against food-borne pathogens, including E. coli. Furthermore, several Paenibacillus and Bacillus spp. are known to produce polyketides that antagonize the growth of human pathogens and phytopathogens (Olishevska et al. 2019). The current study identified a T3PKS BGC in all B. paralicheniformis genomes, including ES-1. The T3PKS cluster in the ES-1 genome is composed of one core gene encoding chalcone and stilbene synthase and three additional biosynthetic genes: isoprenylcysteine carboxyl methyltransferase (locus tag ctg1_1412, score: 100.9, e-value: 3.9e−29), acetyltransferase (ctg_1426, score: 49.6, e-value: 4.6e−13), and monogalactosyldiacylglycerol synthase (ctg_1428, score: 157.0, e-value: 4.8e−46). These findings suggest that the T3PKS in the strain ES-1 genome may encode important bioactive metabolites that could be involved in the antimicrobial activities of B. paralicheniformis.

RiPP-like

RiPP-like clusters encode unspecified ribosomally synthesized and post-translationally modified peptide (RiPP) products. In the present study, an unknown RiPP-like cluster was predicted on contig 1 of the ES-1 genome. The cluster is composed of a core biosynthetic gene (345 nt) at locus tag ctg1_1835, whose Pfam hit showed similarity to Bacteriocin class IId cyclical uberolysin-like (score: 40.5, e-value: 2.4e−10). Furthermore, five transport-related genes were identified, including three ABC transporters (ctg_1833, ctg1_1838, and ctg1_1839) and two binding protein-dependent transport system inner membrane components (ctg1_1840, ctg1_1841). Members of the bacteriocin class IId cyclical uberolysin-like family were previously reported to be membrane-interacting peptides with broad-spectrum antimicrobial activities (Wirawan et al. 2007).

Cyclodipeptide synthases (CDPSs)

CDPSs are a newly described family of enzymes responsible for the biosynthesis of various cyclodipeptides (CDPs), which act as precursors for numerous natural products with important bioactivities (Gondry et al. 2009; Canu et al. 2020). CDPSs do not have a distinct structure; however, they share a common architecture that includes a Rossmann-fold domain. CDP biosynthesis was initially attributed solely to NRPSs, until an NRPS-independent, tRNA-dependent route was described. The strain ES-1 genome harbors a unique CDPS BGC whose core gene is located at locus tag ctg2_293; its Pfam hit (score: 270.4, e-value: 1.1e−80) revealed similarity to tRNA-dependent cyclodipeptide synthase. Moreover, five additional biosynthetic genes and two transport-related genes were identified. Two of the additional biosynthetic genes are located upstream of the core gene. Locus tag ctg_287 (1420 nt) showed Pfam hits with FGGY family of carbohydrate kinases, N-terminal domain (score: 104.9, e-value: 5e−30) and FGGY family of carbohydrate kinases, C-terminal domain (score: 100.2, e-value: 1.2e−28), and ctg_290 (2070 nt) showed Pfam hits with Class II Aldolase and Adducin N-terminal domain (score: 91.2, e-value: 7.6e−26) and Enoyl/Acyl carrier protein reductase (score: 171.8, e-value: 1.7e−50). The other three additional biosynthetic genes are located downstream of the core gene at locus tags ctg2_294, ctg_297, and ctg_298, and showed Pfam hits with Cytochrome P450 (score: 146.0, e-value: 1.4e−42), Amidase (score: 171.6, e-value: 2.8e−50), and Oxidoreductase family, NAD-binding Rossmann fold (score: 92.2, e-value: 3.9e−26), respectively. Previously, cyclic dipeptides isolated from a Bacillus sp. were shown to antagonize the growth of several microorganisms, including fungal plant pathogens (Kumar et al. 2013). Similarly, B. amyloliquefaciens was reported to secrete a cyclic dipeptide that inhibits biofilm formation and virulence factor production in methicillin-resistant Staphylococcus aureus (Gowrishankar et al. 2015).
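As a minimal sketch for readers who wish to tabulate the Pfam hits reported for the CDPS cluster, the following Python snippet ranks the domains by e-value. The locus tags, scores, and e-values are copied from the text as printed; the names CDPS_HITS, parse_number, and rank_hits are illustrative assumptions and do not belong to any annotation tool. Because the e-values are typeset with a Unicode minus sign, the sketch normalizes that character before converting to a number.

```python
# Pfam hits for the ES-1 CDPS cluster, copied from the text.
# Locus tags are reproduced exactly as printed. Names here are illustrative.
CDPS_HITS = [
    ("ctg2_293", "tRNA-dependent cyclodipeptide synthase", "270.4", "1.1e−80"),
    ("ctg_287", "FGGY carbohydrate kinases, N-terminal domain", "104.9", "5e−30"),
    ("ctg_287", "FGGY carbohydrate kinases, C-terminal domain", "100.2", "1.2e−28"),
    ("ctg_290", "Class II Aldolase and Adducin N-terminal domain", "91.2", "7.6e−26"),
    ("ctg_290", "Enoyl/Acyl carrier protein reductase", "171.8", "1.7e−50"),
    ("ctg2_294", "Cytochrome P450", "146.0", "1.4e−42"),
    ("ctg_297", "Amidase", "171.6", "2.8e−50"),
    ("ctg_298", "Oxidoreductase family, NAD-binding Rossmann fold", "92.2", "3.9e−26"),
]


def parse_number(text):
    """Convert a typeset number to float, replacing the Unicode minus (U+2212)."""
    return float(text.replace("\u2212", "-"))


def rank_hits(hits):
    """Return hits as (locus, domain, score, e-value), smallest e-value first."""
    parsed = [
        (locus, domain, parse_number(score), parse_number(evalue))
        for locus, domain, score, evalue in hits
    ]
    return sorted(parsed, key=lambda hit: hit[3])


if __name__ == "__main__":
    for locus, domain, score, evalue in rank_hits(CDPS_HITS):
        print(f"{locus}\t{score:6.1f}\t{evalue:.1e}\t{domain}")
```

Ranking by e-value places the core tRNA-dependent cyclodipeptide synthase hit first. The same helper could be applied to the hits listed for the other clusters.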

Lasso peptide

Lasso peptides are important bacterial natural products that belong to a family of RiPPs with unique lasso-like structures. They have a broad spectrum of biological activities, including antimicrobial, antiviral, and antimetastatic activities, and some act as enzyme inhibitors (Katahira et al. 1996). Lasso peptides share a conserved macrolactam ring threaded by the carboxyl-terminal tail, and the biosynthetic machinery that post-translationally modifies the precursor peptide is highly conserved, making them a good target for genome mining (Hetrick and Donk 2017). Genome mining of ES-1 indicates a new putative macrolactam class II lasso peptide gene cluster with no similarity to known gene clusters. The core biosynthetic genes, located at locus tags ctg2_713 and ctg2_714, encode a putative class II lasso peptide system and show Pfam hits with Asparagine synthase (score: 77.2, e-value: 1.8e−21) and transglutaminase-like superfamily (score: 55.2, e-value: 5.8e−15), respectively. One stand-alone RiPP recognition element (RRE) was also identified as a core biosynthetic gene at locus tag ctg2_715. Recently, genome mining of Streptomyces humidus revealed a unique lasso peptide BGC, and heterologous expression identified a new peptide, humidimycin, with strong antimicrobial activities (Sánchez-Hidalgo et al. 2020).

Plant growth-promoting capabilities

The B. paralicheniformis strain ES-1 genome was mined for plant growth-promoting genes that could be associated with its ability to suppress phytopathogens, improve nutrient availability, and resist abiotic stress. The genome harbors genes associated with phosphate solubilization, protease, amylase, and cellulase synthesis, biofilm formation, and ACC deaminase production, which is in agreement with the in vitro plant growth-promoting assays (Supplementary file 2; Table S4). The ACC deaminase coding gene rimM was earlier reported in plant growth-promoting bacteria to reduce the level of plant ethylene, which inhibits root nodulation (Shah et al. 1998; Gupta et al. 2014). L-tryptophan (L-TRP) is an important amino acid required for normal plant growth and development and also acts as a precursor for plant growth regulators (Mustafa et al. 2018). It is generally supplied through seed priming or foliar spray. We identified a complete L-TRP biosynthetic pathway in the strain ES-1 genome. The pathway involves five enzymes encoded by seven genes (trpABCDEF, with trpD present in two copies). The last two steps in this pathway are catalyzed by a single enzyme, tryptophan synthase, which is composed of alpha and beta subunits encoded by trpA and trpB. Step 2, the conversion of anthranilate to phosphoribosyl anthranilate, is catalyzed by anthranilate phosphoribosyltransferase, which is encoded by two genes, trpD_1 and trpD_2 (Supplementary file 2; Table S4). Biosynthesis genes (ilvHB, alsD, and budC) for two volatile organic compounds, acetoin and 2,3-butanediol, were also found in the ES-1 genome. These compounds were reported to promote plant growth by increasing disease resistance (Han et al. 2006), stimulating root formation (Ryu et al. 2003), and improving drought tolerance (Cho et al. 2008). Furthermore, we identified genes (cysCJI) involved in the biosynthesis of hydrogen sulfide, which has been shown to promote plant growth and seed germination (Dooley et al. 2013).
Two operons, narHGX and nasABCDEF, were identified and annotated as nitrate reductase and nitrate transporter, respectively. These clusters were previously found in the plant growth-promoting B. subtilis strain MBI 600 and were predicted to be involved in nitrate reduction and nitrate transport (Samaras et al. 2021).

A chitinase-coding gene was found; chitinase hydrolyzes the cell walls of pathogenic fungi and pests (Gupta et al. 2014; Loper et al. 2012). In addition, the genes gabD and gabR were identified, which are involved in the synthesis of the disease- and pest-inhibiting compound gamma-aminobutyric acid (GABA) (Loper et al. 2012). The strain ES-1 genome also harbors the speAGE and potAB operons, which are associated with spermidine biosynthesis and transport, respectively. Spermidine is crucial for plant cell viability and has been associated with lateral root expansion, defense against phytopathogens, and mitigation of osmotic, oxidative, and acidic stress. Thus, spermidine synthesis by B. paralicheniformis ES-1 may constitute additional biosynthetic machinery associated with increased plant growth. However, further wet-lab experiments are required to validate the direct involvement of spermidine produced by B. paralicheniformis ES-1 in plant growth promotion.

Root colonization and chemotaxis

Flagellar proteins play a vital role in the colonization of plant roots by plant growth-promoting bacteria. In strain ES-1, we found 20 genes associated with flagellar biosynthesis and assembly. These genes are mostly localized in three clusters, fliDEGJMSTPW, flhABHEF, and flgBCGL, while the gene hag, encoding flagellin, and csrA, encoding a translational regulator, were found separately on locus tag 1 E1_02900 and E1_02901, respectively. Furthermore, genes involved in chemotaxis, including genes encoding methyl-accepting chemotaxis proteins and two-component system proteins such as cheABCDRVWY, pomA, yoaH, and mcpBC, were also found in the ES-1 genome. The two-component system helps in the recognition of root exudate signals and in adaptation to the environment. The presence of these genes indicates that strain ES-1 is capable of responding to a stimulus and consequently moving toward plant roots. These flagellar biosynthesis, assembly, and chemotaxis genes were previously identified in the genome of the plant growth-promoting Cronobacter muytjensii strain JZ38 (Eida et al. 2020). Plant growth-promoting genes were found in the accessory genome of ES-1, which implies an evolutionary process of adaptation to specific habitats, as suggested earlier (Iqbal et al. 2021b; Zhang et al. 2016).
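The gene count above follows from the usual bacterial naming shorthand, in which a three-letter stem followed by several capital letters denotes one gene per capital letter (for example, flgBCGL stands for flgB, flgC, flgG, and flgL). The short Python sketch below expands the three flagellar clusters named in the text and adds the two separately located genes. The function and variable names are illustrative assumptions rather than part of any published pipeline.

```python
def expand_operon(shorthand, stem_length=3):
    """Expand shorthand such as 'flgBCGL' into ['flgB', 'flgC', 'flgG', 'flgL'].

    Assumes a lowercase stem of `stem_length` letters followed by one
    capital letter per gene, as in the cluster names used in the text.
    """
    stem, suffixes = shorthand[:stem_length], shorthand[stem_length:]
    return [stem + letter for letter in suffixes]


# Cluster names and stand-alone genes as given in the text.
FLAGELLAR_CLUSTERS = ["fliDEGJMSTPW", "flhABHEF", "flgBCGL"]
STANDALONE_GENES = ["hag", "csrA"]


def flagellar_genes():
    """Return every flagellar gene named in the text as a flat list."""
    genes = []
    for cluster in FLAGELLAR_CLUSTERS:
        genes.extend(expand_operon(cluster))
    genes.extend(STANDALONE_GENES)
    return genes


if __name__ == "__main__":
    genes = flagellar_genes()
    print(len(genes), "genes:", ", ".join(genes))
```

The three clusters contribute 9, 5, and 4 genes, which together with hag and csrA gives the 20 genes reported.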

Salt tolerance capabilities in ES-1

Strain ES-1 was isolated from a salt mine and grows well on TSA supplemented with 0–1.71 M salt (Fig. 2). Genome mining showed that the strain harbors several genes associated with salt tolerance. For instance, the genes betA and betB, which are involved in glycine-betaine synthesis, were identified in the ES-1 genome (Supplementary file 2; Table S5). These two genes are reported to be the most important genes associated with salt tolerance (Liu et al. 2016). Further, trehalose functions as an osmoprotectant under extreme conditions such as drought, high salt concentration, and osmotic stress (Duan et al. 2013). Garg and colleagues previously reported that trehalose accumulation in plants enhances resistance to abiotic stresses (Garg et al. 2002). In bacteria, five trehalose biosynthesis pathways have been reported: otsAB, treS, treT, treP, and treYZ (Paul et al. 2008). In the strain ES-1 genome, the treP biosynthesis pathway was identified. In addition, trehalose regulator (treR) and transport (sugA) genes were identified. Trehalose biosynthesis via the treP pathway is a single-step process in which glucose is converted to trehalose by trehalose phosphorylase (treP). Subsequently, trehalose may be hydrolyzed by treA to produce two glucose molecules. This pathway is common in microorganisms and is associated with adaptation and survival in extreme environments.
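To put the upper growth limit of 1.71 M in more familiar terms, the following sketch converts a molar concentration to percent weight per volume. It assumes that the salt is sodium chloride (molar mass 58.44 g/mol), which the text does not state explicitly; the function name is illustrative.

```python
# Assumption: the "salt" added to TSA is NaCl; the text does not name the salt.
NACL_MOLAR_MASS_G_PER_MOL = 58.44


def molar_to_percent_wv(molarity, molar_mass=NACL_MOLAR_MASS_G_PER_MOL):
    """Convert mol/L to % (w/v), i.e. grams of solute per 100 mL of solution."""
    grams_per_litre = molarity * molar_mass
    return grams_per_litre / 10.0


if __name__ == "__main__":
    print(f"1.71 M is about {molar_to_percent_wv(1.71):.1f}% (w/v) under the NaCl assumption")
```

Under this assumption, 1.71 M corresponds to roughly 100 g of salt per litre, or about 10% (w/v).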

Furthermore, several osmoregulatory receptor genes were identified in the ES-1 genome (Supplementary file 2; Table S5). For instance, kdpD is responsible for detecting hyperosmotic stress and triggering the expression of genes involved in solute accumulation and cell wall synthesis (Möker et al. 2004). In response to high salt concentration, it may also trigger expression of the kdp operon, which encodes a high-affinity K+ uptake system (Heermann and Jung 2010). Additionally, genes encoding transport systems that support survival in hyperosmotic conditions were found in the ES-1 genome, such as the Na+/H+ antiporter system, which exports Na+ and imports H+, and the K+ transport system for K+ accumulation (Epstein 2003) (Supplementary file 2; Table S5). The strain ES-1 genome also carries the protein quality control genes htpG, htpX, groL, groS, grpE, dnaK, and dnaJ, which encode molecular chaperones and play a vital role in the stress response (Susin et al. 2006).

Conclusion

The current study sheds light on the genomic and biochemical characteristics of a potential plant growth-promoting B. paralicheniformis strain, ES-1, isolated from an unexplored salt mine in Karak, Pakistan. Within B. paralicheniformis, strain MDJK30 was previously reported as a potential PGPB with 11 secondary metabolite BGCs (Wang et al. 2017), compared with 12 BGCs found in ES-1. Furthermore, the current study revealed various features of B. paralicheniformis that are associated with its commensal lifestyle on plant roots, such as an open pan-genome, a high diversity of transport systems, and the metabolism of amino acids and carbohydrates. The essential genomic and plant growth-promoting characteristics of strain ES-1 highlight the potential of the combined application of various microbes and point future research toward a holistic strategy for plant growth promotion. The complementarity of its antimicrobial activities and plant growth promotion will pave the way for its application as an alternative, sustainable approach to enhancing agricultural production under salt-stress conditions.