Introduction

Microbial polysaccharides are an abundant group of polymers secreted by bacteria, fungi, and algae during growth and metabolism [1]. They can be divided into capsular polysaccharides (CPS), lipopolysaccharides (LPS), and exopolysaccharides (EPS) [2]. EPS are water-soluble polymeric sugars produced by bacteria that are either attached to the external cell membrane or exported outside of the cell [3]. Based on their monosaccharide composition, EPS are classified into homopolysaccharides and heteropolysaccharides. Homopolysaccharides, such as cellulose, dextran, and branched-chain starch, are composed of only one type of monosaccharide unit, whereas heteropolysaccharides (e.g., xanthan gum and hyaluronic acid) consist of two or more types of monosaccharide units [4]. Compared with polysaccharides produced by plants and animals, microbial EPS are advantageous owing to the shorter microbial growth cycle and independence from seasonal and weather conditions [4]. EPS are receiving much attention because of their functional properties: they can be used as stabilizers, emulsifiers, viscosity enhancers, gelling agents, and water binders to improve the rheological properties, texture, taste, and flavor of food [5, 6]. In addition, EPS possess diverse biological activities, including antioxidant, anti-cancer, immunomodulatory, antiviral, anticoagulant, and anti-cholesterol activities [7,8,9].

Although the ability to secrete EPS is widespread among microorganisms, only a few bacterial and fungal EPS have reached full commercialization, including xanthan from Xanthomonas sp., dextran from Leuconostoc sp., pullulan from Aureobasidium sp., and levan from Bacillus sp. [10]. B. amyloliquefaciens, a bacterium that is generally recognized as safe (GRAS) for use in food and pharmaceuticals, has been used to produce enzymes, antimicrobials, EPS, and poly-γ-glutamic acid [11, 12]. An EPS yield of 6.7 g/L was recorded for B. amyloliquefaciens 23,350 grown on mineral base medium with added yeast extract, but the conversion yield was only 0.05 g/g sucrose [13]. However, because the biosynthetic mechanism of EPS is unclear, further increasing EPS production by B. amyloliquefaciens is challenging. With the development of sequencing technology, whole genome sequencing has become an important method for surveying the metabolic potential of a strain. For example, the whole genome of Lactobacillus paracasei TD062 was sequenced to predict the utilization of sugar sources and the metabolic pathways associated with the synthesis of nucleotide sugars. Based on the results, the carbon source was optimized to enhance the activity of key enzymes in the synthesis pathway, which increased EPS production and altered the EPS composition [14]. Currently, mutagenesis, culture medium optimization, and other methods are commonly used to increase EPS production, whereas research on increasing EPS production by genetic engineering is limited by the lack of a comprehensive understanding of the EPS biosynthetic pathway. Therefore, clarification of the EPS synthesis pathway by whole genome sequencing could lay a foundation for subsequent genetic engineering of strains to improve yields.

In this study, B. amyloliquefaciens D189, a high EPS-producing bacterium, was isolated from Guangxi Liupao tea. To clarify the EPS biosynthesis pathway, the whole genome of D189 was sequenced and annotated using various databases, and its general features and EPS biosynthesis mechanism were revealed at the gene level, including the sugar transport system, carbohydrate-active enzymes, and nucleotide sugar biosynthetic pathway genes. In addition, the safety of D189 was evaluated based on genomic and phenotypic analysis. Whole genome sequencing preliminarily elucidated the physiological mechanism of this high-level microbial EPS producer and suggested several routes for engineering D189 to further enhance EPS yield.

Materials and Methods

Bacterial Strain and Growth Conditions

B. amyloliquefaciens D189 was isolated from Liupao tea in Guangxi and preserved in our laboratory. The lysogeny broth (LB) medium used for seed culture of D189 consisted of (g/L): tryptone 10.0, yeast extract 5.0, and NaCl 10.0. The fermentation medium consisted of (g/L): carbon source 20.0, yeast extract 5.0, and NaCl 10.0. For EPS fermentation, the seed culture was prepared by inoculating a single colony into LB and culturing for 24 h. Subsequently, 5% (v/v) of the seed culture was transferred into fermentation medium and fermented at 30 °C and 200 rpm for 48 h.

Genome Sequencing and Assembly

Third-generation sequencing technology was employed for genome sequencing. First, high-quality genomic DNA of D189 was extracted, and its purity, concentration, and integrity were assessed using a NanoDrop spectrophotometer, a Quantus Fluorometer, and agarose gel electrophoresis, respectively. Genome sequencing of D189 was carried out by Shanghai Majorbio Bio-pharm Technology Company (Shanghai, China) on Illumina HiSeq and PacBio Sequel platforms. Low-quality and short sequences were removed from the raw data to generate clean data. Subsequently, statistical analysis of the clean data was performed to obtain relevant information such as the total amount of data, the length of reads, and the distribution of quality values. Sequence assembly of the clean data was performed using Unicycler, and Pilon was used to correct the assembled genome and obtain the complete chromosome and plasmid sequences.

Gene Prediction and Functional Gene Annotation

Coding genes were predicted in the sequenced genome using Glimmer, GeneMarkS, and Prodigal. tRNA, rRNA, and sRNA genes were predicted using tRNAscan-SE v2.0, Barrnap, and the Rfam database, respectively.

Predicted gene sequences were translated and searched against the National Center for Biotechnology Information (NCBI) non-redundant (NR) database, the Swiss-Prot database, the protein families (Pfam) database, the Clusters of Orthologous Groups (COG) database, and the Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases for annotation.

Safety Assessment of B. amyloliquefaciens D189

Identification of Safety-Related Genes from the Whole Genome of D189

Antibiotic resistance genes of D189 were identified by comparison with the Comprehensive Antibiotic Resistance Database (CARD), with the thresholds set as E value < 1E-5, identity > 45%, and coverage > 70%.

Virulence factor genes were identified by annotating the whole genome of D189 against the Virulence Factor Database (VFDB), with the thresholds set as identity > 60%, coverage > 70%, and E value < 1E-5.
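
The two sets of thresholds above can be expressed as a single filter applied to alignment hits. The following is a minimal sketch, assuming the hits have already been parsed into records; the `Hit` record, its field names, the `passes` function, and the example hits are hypothetical and are not taken from the annotation pipeline used in this study.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One alignment hit (hypothetical record layout)."""
    gene: str
    evalue: float
    identity: float  # percent identity
    coverage: float  # percent coverage


# Identity cut-offs stated in the text; coverage and E value are shared.
CARD_MIN_IDENTITY = 45.0
VFDB_MIN_IDENTITY = 60.0


def passes(hit: Hit, min_identity: float,
           min_coverage: float = 70.0, max_evalue: float = 1e-5) -> bool:
    """Return True if the hit satisfies all three thresholds."""
    return (hit.evalue < max_evalue
            and hit.identity > min_identity
            and hit.coverage > min_coverage)


# Invented example hits, for illustration only.
hits = [
    Hit("geneA", 1e-30, 52.0, 91.0),  # passes CARD, fails VFDB identity
    Hit("geneB", 1e-3, 80.0, 95.0),   # fails the E value threshold
    Hit("geneC", 1e-12, 48.0, 65.0),  # fails the coverage threshold
]

card_hits = [h.gene for h in hits if passes(h, CARD_MIN_IDENTITY)]
vfdb_hits = [h.gene for h in hits if passes(h, VFDB_MIN_IDENTITY)]
print(card_hits, vfdb_hits)
```

Keeping the identity cut-off as a parameter makes explicit that the two databases differ only in that one threshold.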

Antibiotic Sensitivity Analysis of D189

Antibiotic sensitivity of D189 was determined using the Kirby–Bauer disk diffusion method according to the Clinical and Laboratory Standards Institute (CLSI) guidelines [15]. Eleven antibiotics were tested: gentamicin (10 μg), streptomycin (10 μg), erythromycin (15 μg), vancomycin (30 μg), ceftriaxone (30 μg), ciprofloxacin (5 μg), amoxicillin (10 μg), ampicillin (10 μg), oxacillin (1 μg), penicillin (10 μg), and polymyxin B (300 μg). A 200 μL aliquot of bacterial suspension was spread evenly on an LB agar plate. Antibiotic disks were then placed on the inoculated plates under sterile conditions. After culture at 30 °C for 24 h, the diameter (mm) of the growth inhibition zone was measured. According to the CLSI guidelines, resistance and susceptibility were classified as follows: S = sensitive; I = intermediate; R = resistant.

Evaluation of Hemolytic Activity of D189

The strain was cultured on Columbia blood agar plates and incubated at 30 °C for 24 h. Hemolytic activity was detected by observing the hemolysis zone around colonies. Hemolysis was classified as complete hemolysis (β-hemolysis, clear halo surrounding colonies), partial hemolysis (α-hemolysis, greenish halo surrounding colonies), or no hemolysis (γ-hemolysis, absence of a clearing zone surrounding colonies).

EPS Biosynthesis Pathway

Analysis of EPS Synthesis Gene Pathway

To further resolve the EPS synthesis pathway of D189, carbohydrate-active enzyme genes were functionally annotated against the CAZy database using hmmscan, and sugar transporter-related genes were functionally annotated based on the KEGG results using DIAMOND.

Determination of the EPS Production by D189

After 48 h of fermentation, cultures were centrifuged (8000 rpm, 4 °C, 15 min) and the supernatant was collected. The supernatant (5 mL) was mixed with 95% ethanol (4:1, v/v) and kept at 4 °C for 24 h. The crude EPS precipitate was collected by centrifugation at 8000 rpm for 15 min at 4 °C, dried to evaporate the remaining ethanol, and then dissolved in ultrapure water. Finally, the polysaccharide content of the EPS sample was determined using the phenol–sulfuric acid method [16].

Statistical Analysis

Each measurement was performed in triplicate, and results are presented as mean ± standard deviation (n = 3).
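
As an illustration of how a triplicate measurement is summarized, the sketch below computes the mean and standard deviation with Python's standard library. It assumes the sample standard deviation (n − 1 denominator), which the text does not specify, and the replicate values are hypothetical rather than measured data.

```python
import statistics


def mean_sd(values):
    """Return (mean, sample standard deviation) for replicate measurements."""
    if len(values) < 2:
        raise ValueError("at least two replicates are needed for a standard deviation")
    return statistics.mean(values), statistics.stdev(values)


# Hypothetical triplicate EPS readings in g/L (not data from this study).
replicates = [1.20, 1.22, 1.21]
mean_value, sd_value = mean_sd(replicates)
print(f"{mean_value:.3f} ± {sd_value:.3f} g/L")
```

If the population standard deviation were intended instead, `statistics.pstdev` would replace `statistics.stdev`.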

Results

General Genome Features

Whole-genome sequencing showed that D189 contains a single circular chromosome of 3,963,356 bp with an average GC content of 45.74% (Fig. 1) and no plasmids. Using Glimmer, 3996 protein-coding genes were predicted, with a coding rate of 87.84%. In addition, 198 RNA-coding genes were identified, including 10 5S rRNA genes, 9 16S rRNA genes, 9 23S rRNA genes, 84 sRNA genes, and 86 tRNA genes (Table S1). The genome sequence of D189 was submitted to NCBI (SRA) under accession number PRJNA1019658.
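
The summary figures above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes that the 16S and 23S rRNA genes number 9 each (the reading under which the categories sum to 198), that the coding rate is the fraction of genome bases covered by protein-coding genes, and it uses a hypothetical helper `gc_content` to show how a GC percentage is computed from a sequence.

```python
def gc_content(seq: str) -> float:
    """Return the GC percentage of a DNA sequence (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


# RNA gene counts from the text; 9 each for 16S and 23S is an assumption.
RNA_GENES = {"5S rRNA": 10, "16S rRNA": 9, "23S rRNA": 9, "sRNA": 84, "tRNA": 86}
total_rna_genes = sum(RNA_GENES.values())  # 198

GENOME_BP = 3_963_356
CODING_RATE = 0.8784          # assumed to mean coding bases / genome bases
PROTEIN_CODING_GENES = 3996

# Derived quantities (not reported in the text).
coding_bp = GENOME_BP * CODING_RATE
mean_gene_length_bp = coding_bp / PROTEIN_CODING_GENES
genes_per_kb = PROTEIN_CODING_GENES / (GENOME_BP / 1000)

print(total_rna_genes)
print(f"mean gene length: {mean_gene_length_bp:.0f} bp; {genes_per_kb:.2f} genes per kb")
```

The derived mean gene length and gene density are rough consistency checks under the stated assumptions, not values reported by the study.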

Fig. 1

Circular map of the B. amyloliquefaciens D189 genome. From the inner to the outer circle, the first circle represents genome size; the second circle represents GC skew; the third circle represents GC content; the fourth and seventh circles represent CDS on the positive and negative strands (different colors represent the COG functional classification of the CDS); the fifth and sixth circles represent the positions of CDS, tRNA, and rRNA in the genome

Functional Analysis of the Complete Genome

The GO annotations fall into three categories: biological process, cellular component, and molecular function (Fig. 2a). A total of 2784 genes were annotated in the GO classification. In the biological process category, the largest proportion of genes was involved in transcriptional regulation; in the cellular component category, the largest proportion was annotated as integral component of membrane; and in the molecular function category, ATP binding was the most frequently annotated term.

Fig. 2

Functional analysis of the complete genome. a GO database annotation. b COG database annotation. c KEGG database annotation

According to the COG database annotation (Fig. 2b), 3133 genes were assigned to four categories and 23 functional groups. The most abundant group was amino acid transport and metabolism (E, 319), followed by transcription (K, 293) and carbohydrate transport and metabolism (G, 268). Notably, many genes related to the biosynthesis of nucleotide sugars were found among the annotated carbohydrate transport and metabolism genes, such as UTP-glucose-1-phosphate uridylyltransferase, UDP-N-acetylglucosamine pyrophosphorylase, UDP-glucose dehydrogenase, UDP-N-acetylglucosamine 2-epimerase, phospho-sugar mutase, and UDP-glucose 4-epimerase. These results suggest that D189 has the ability to synthesize EPS.

Furthermore, KEGG pathway annotation classified 2368 coding genes into six categories (Fig. 2c). The largest proportion of genes was annotated in the metabolism category (1797), of which 205 and 240 genes were related to amino acid and carbohydrate metabolism, respectively. In the environmental information processing category, genes related to membrane transport were the most abundant. Some of the functional genes in the EPS gene cluster are involved in membrane transport [17]. These results indicate that D189 has a high capacity for carbohydrate metabolism, amino acid metabolism, and membrane transport.

Safety Assessment of D189

Identification of Antibiotic Resistance Genes

Antibiotic resistance allows bacteria to withstand antibiotics and poses a serious threat to human and animal health. To assess antibiotic resistance in D189, the genome was compared against the CARD database, and 26 resistance-related genes were identified (Table S2), consistent with a previous report [18]. In addition, bacteria can acquire drug resistance by horizontal transfer of mobile elements such as transposons or plasmids [19]. However, D189 has no plasmids and therefore cannot acquire resistance genes through plasmid-mediated horizontal gene transfer. To corroborate the genomic analysis, phenotypic tests were conducted. Notably, D189 was sensitive to all 11 antibiotics tested (Table 1). The phenotypic results support the safety of D189 with respect to antibiotic resistance.

Table 1 Results of antibiotic susceptibility determination of D189

Evaluation of Virulence Factor Genes and Toxin-Encoding Genes

Virulence factors are substances produced by microorganisms that contribute to infection and cause disease in specific hosts [20]. As shown in Table S3, 14 genes in D189 were associated with virulence factors based on comparison with the VFDB database. Some of the annotated virulence factors were related to the synthesis of capsular polysaccharides, such as VF0141 and VF0144, and some were related to the synthesis of poly-γ-glutamic acid, such as VF0141. Genes encoding hemolysin, non-hemolytic enterotoxin, and other enterotoxins were absent in D189. These results indicate that the annotated toxin-related genes are not true virulence genes but rather regulatory genes that play an important role in regulating biological processes [21, 22]. Furthermore, as shown in Fig. 3, no clear halo was observed around colonies on the blood agar plates, indicating that D189 did not cause hemolysis. Together, the whole-genome analysis and the hemolysis experiment indicate that D189 is a potentially safe strain.

Fig. 3

Evaluation of hemolytic activity on blood agar plates

EPS Biosynthesis of D189

To further reveal the biosynthetic pathway of EPS, the sugar transport system, carbohydrate-active enzymes, and nucleotide sugar formation pathway of D189 were analyzed, and the ability of D189 to synthesize EPS was demonstrated via validation experiments.

Sugar Transport System

Carbohydrates are key nutrients for EPS synthesis, and their type affects the structure of EPS by influencing nucleotide sugar synthesis and regulating the expression of key genes in the EPS biosynthetic pathway [23]. EPS synthesized from different carbon sources also differ significantly in molecular structure, activity, and physicochemical properties [24]. In general, carbohydrates can be transported in three ways: phosphotransferase systems (PTS), ABC-type sugar transport systems, and sugar permeases. Based on the whole genome, 108 genes for ABC transport were predicted in D189. However, the ABC pathways for transporting carbon sources were incomplete. As the principal transport mode, PTS is composed of the energy-coupling enzyme I (EI), the soluble phosphocarrier protein (HPr), and the membrane-bound, sugar-specific permease enzyme II (EII), which together phosphorylate carbohydrates during uptake. HPr (gene1379, gene3497) and EI (gene1380) were both annotated in D189. In addition, 22 EII genes were identified in the genome (Table 2). Therefore, D189 could transport lactose, mannitol, glucose, fructose, maltose, sucrose, trehalose, cellobiose, and N-acetyl-D-glucosamine into the cell, where they form the substrates for the synthesis of nucleotide sugars, the active precursors for the synthesis of EPS.

Table 2 Putative genes related to PTS in D189

CAZy Database Annotation of D189

To further investigate the mechanism of EPS biosynthesis, the carbohydrate-active enzymes of D189 were analyzed at the genomic level. Comparison with the CAZy database identified 135 genes related to carbohydrate metabolism, with glycoside hydrolase (GH) and glycosyltransferase (GT) genes the most abundant (Fig. 4a). In the GH category, GH1, GH3, and GH43 were the most abundant families (Fig. 4b). According to the CAZy annotation, GH1 mainly encodes β-glucosidase involved in lactose and galactose metabolism [25], whereas GH13, the largest sequence family among the GHs, mainly encodes α-glucosidase involved in starch metabolism [26]. Hydrolysis of α/β-glucosidic bonds releases monosaccharides, which provide precursors for the biosynthesis of EPS. Thus, the abundance of GHs indicates that D189 can efficiently utilize different carbon sources for the formation of nucleotide sugars.

Fig. 4

a CAZy functional annotation of D189 and b the distribution of related genes

In addition, genes from 12 GT families were annotated in D189. GTs catalyze the transfer of sugar groups from activated donor molecules to specific acceptor molecules and thereby produce a wide variety of EPS with different properties by altering the types, lengths, and conformations of glycosidic bonds in the polymers [27]. GT2 and GT41 were the most abundant GT families in D189. These results suggest that D189 has the ability to synthesize EPS.

Biosynthesis Pathways for Nucleotide Sugars

Nucleotide sugars are precursors for EPS biosynthesis. The nucleotide sugar biosynthesis pathway of D189 was predicted based on KEGG metabolic pathways (Fig. 5). PTS is the major sugar transport system in D189. As shown in Fig. 5, mannitol, fructose, and N-acetyl-D-glucosamine are transported into the cell via PTS and phosphorylated to produce mannitol-6P, fructose-6P, and GlcNAc-6P, respectively. They are then converted to fructose-6P and GlcN-6P by the action of related enzymes (mtlA, scrK, nagE) and ultimately participate in the synthesis of UDP-GlcNAc, UDP-ManNAc, and UDP-MurNAc. Similarly, sucrose, maltose, trehalose, and cellobiose are transported into the cell via PTS, phosphorylated to produce sucrose-6P, maltose-6P, trehalose-6P, and cellobiose-6P, respectively, and converted to glucose-6P by the action of related enzymes (pgi, sacA, glvA, treC). Glucose-6P and glucose-1P are intermediates in the synthesis of the nucleotide sugars involved in EPS biosynthesis (UDP-Glucose, dTDP-Glucose, and dTDP-Rhamnose), and their interconversion is catalyzed by phosphoglucomutase. UDP-Glucose is interconverted with UDP-Galactose by UDP-glucose 4-epimerase (galE). Based on the whole-genome analysis, strain D189 can produce UDP-Glucose, UDP-Galactose, UDP-GlcNAc, UDP-ManNAc, UDP-GlcA, UDP-MurNAc, dTDP-Glucose, and dTDP-Rhamnose, all of which can serve as repeating units of EPS when associated with gene clusters.
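
The glucose branch of the pathway described above can be sketched as a small directed graph and searched for a route from an imported sugar to a nucleotide sugar. This is a simplified illustration, not a reproduction of Fig. 5: the `PATHWAY` table and `route` function are hypothetical, only the conversions named in the text are included, and the step from dTDP-Glucose to dTDP-Rhamnose is an assumption based on the usual order of that pathway.

```python
from collections import deque

# Directed edges: metabolite -> metabolites it can be converted into.
PATHWAY = {
    "sucrose": ["sucrose-6P"],
    "maltose": ["maltose-6P"],
    "trehalose": ["trehalose-6P"],
    "cellobiose": ["cellobiose-6P"],
    "sucrose-6P": ["glucose-6P"],
    "maltose-6P": ["glucose-6P"],
    "trehalose-6P": ["glucose-6P"],
    "cellobiose-6P": ["glucose-6P"],
    "glucose-6P": ["glucose-1P"],                       # phosphoglucomutase
    "glucose-1P": ["glucose-6P", "UDP-Glucose", "dTDP-Glucose"],
    "UDP-Glucose": ["UDP-Galactose"],                   # galE
    "UDP-Galactose": ["UDP-Glucose"],                   # galE (reversible)
    "dTDP-Glucose": ["dTDP-Rhamnose"],                  # assumed ordering
}


def route(start, target):
    """Breadth-first search; returns the shortest conversion route or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in PATHWAY.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


print(" -> ".join(route("sucrose", "UDP-Galactose")))
```

A representation of this kind makes it easy to ask which nucleotide sugars are reachable from a given carbon source, which is the question the carbon-source validation experiments address.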

Fig. 5

Synthesis of nucleotide sugars in D189

Validation of the EPS-Producing Capacity of D189

According to the KEGG pathway analysis and CAZy database annotation, D189 contains nine sugar-specific PTS transporters that can import sugars into the cell. Key metabolic enzymes for eight of these sugars (glucose, sucrose, trehalose, fructose, cellobiose, maltose, mannitol, and N-acetyl-D-glucosamine) were predicted in D189. To verify the nucleotide sugar biosynthetic pathways predicted at the gene level, the sugar-metabolizing ability of D189 and the corresponding EPS yields were tested. As shown in Fig. 6, when glucose, sucrose, trehalose, fructose, cellobiose, maltose, mannitol, and N-acetyl-D-glucosamine were each used as the sole carbon source, the EPS yields were 0.345 g/L, 1.212 g/L, 0.014 g/L, 0.451 g/L, 0.022 g/L, 0.448 g/L, and 0.025 g/L, respectively. The highest EPS yield (1.21 g/L) was obtained with sucrose as the sole carbon source. The high EPS production of strain D189 may result from its large number of predicted carbohydrate metabolism-related genes [28, 29]. These results suggest that strain D189 has a strong ability to metabolize sugars and produce large amounts of EPS.
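
To put the sucrose result on the same footing as the conversion yield quoted in the Introduction (0.05 g/g sucrose for B. amyloliquefaciens 23,350), the EPS titer can be divided by the sugar supplied. The sketch below is an illustrative calculation only: it assumes the yield is expressed per gram of sugar supplied (20 g/L in the fermentation medium) rather than per gram consumed, since residual sugar is not reported, and the function name is hypothetical.

```python
CARBON_SOURCE_G_PER_L = 20.0   # carbon source in the fermentation medium
EPS_ON_SUCROSE_G_PER_L = 1.212 # EPS titer with sucrose as sole carbon source
REPORTED_YIELD_23350 = 0.05    # g EPS per g sucrose, from the Introduction


def conversion_yield(eps_g_per_l: float, sugar_g_per_l: float) -> float:
    """Return g of EPS per g of sugar supplied (assumed basis)."""
    if sugar_g_per_l <= 0:
        raise ValueError("sugar concentration must be positive")
    return eps_g_per_l / sugar_g_per_l


d189_yield = conversion_yield(EPS_ON_SUCROSE_G_PER_L, CARBON_SOURCE_G_PER_L)
print(f"D189 on sucrose: {d189_yield:.4f} g EPS per g sugar supplied")
print(f"Ratio to the reported 0.05 g/g: {d189_yield / REPORTED_YIELD_23350:.2f}")
```

Because the two studies may have computed yield on different bases, the ratio should be read as a rough comparison rather than a like-for-like result.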

Fig. 6

Effects of different carbon sources on yield of EPS

Discussion

Whole-genome sequencing is a high-throughput technology that has gradually become an important tool for basic microbial research by facilitating the prediction of the general characteristics and synthesis mechanisms of microbial metabolites at the genomic level [17, 22]. In this study, the safety and EPS biosynthetic pathway of B. amyloliquefaciens D189 were investigated by both genomic and phenotypic analysis. The results showed that strain D189 is a potentially safe strain with the ability to biosynthesize EPS. Safety is a critical factor in the application of bacterial strains [27]. Previous studies have shown that some antibiotic resistance genes in microorganisms may be transmitted through horizontal gene transfer, which poses a great threat to human health [30, 31]. In D189, 26 drug resistance genes were predicted. However, the absence of a plasmid prevents the strain from acquiring resistance through the horizontal transfer pathway [32,33,34]. Genes encoding hemolysins, non-hemolytic enterotoxins, and enterotoxins were not identified in the genome sequencing results. In addition, both the drug sensitivity and hemolytic activity tests indicated that D189 is safe.

Biosynthesis of EPS is a complex process involving sugar transport and phosphorylation, biosynthesis of nucleotide sugars, polymerization of repeating units, and export of polysaccharides [35]. Carbon sources are critical for microbial growth and physiological metabolism, and they affect the structure of EPS by influencing the synthesis of nucleotide sugars and by regulating the expression of key genes in the EPS biosynthetic pathway [36, 37]. Genomic analysis showed that D189 has nine sugar-specific PTS systems that transport lactose, mannitol, glucose, fructose, maltose, sucrose, trehalose, cellobiose, and N-acetyl-D-glucosamine into the cell. However, the absence of key metabolic enzymes results in the inability of D189 to metabolize lactose. A variety of active enzymes are involved in the synthesis of EPS. The CAZy database annotation results showed that GH and GT genes were the most abundant in D189. GHs have great potential for hydrolyzing complex carbohydrates [38] and are considered key enzymes in carbohydrate metabolism, enabling the efficient utilization of sugars. GTs transfer sugar moieties from activated donor molecules to repeating units [39]. The abundance of GTs in D189 indicates the potential for EPS synthesis. Endogenous levels of nucleotide sugars, the precursors of EPS, may be important for controlling the level of EPS production [17]. Genomic analysis revealed that strain D189 could produce UDP-Glucose, UDP-Galactose, UDP-GlcNAc, UDP-ManNAc, UDP-GlcA, UDP-MurNAc, dTDP-Glucose, and dTDP-Rhamnose, all of which can serve as repeating units of EPS, and gene clusters mediate their synthesis and hence the formation of EPS.

Sugar metabolism validation experiments showed that D189 was able to metabolize the eight sugars predicted from the genome sequence for bacterial growth and EPS production. Sucrose was the best carbon source for EPS production by D189, probably because bacteria use it to generate other precursor molecules (glucose and fructose) for EPS synthesis and energy generation [40]. In general, bacterial polysaccharides are synthesized via four mechanisms: the Wzx/Wzy-dependent pathway; the ATP-binding cassette (ABC) transporter-dependent pathway; the synthase-dependent pathway; and extracellular synthesis using a single sucrase protein [41, 42]. Notably, genes encoding proteins that mediate EPS transport out of the cell were not annotated in the D189 genome, and we hypothesize that a homolog of a sugar transporter gene may perform this function. To gain a more comprehensive understanding of the synthesis and secretion pathways of EPS, uncharacterized proteins in D189 should be further analyzed in future research.

Conclusion

In summary, the genome of D189 was annotated to explore its safety and EPS biosynthetic pathway. Elucidation of the genome makes D189 a model strain for EPS production. In addition, the putative EPS biosynthetic and regulatory pathway established in this study may provide an opportunity to develop metabolic engineering strategies to enhance EPS production by D189, and could also inform the industrial-scale production of valuable metabolites such as EPS by other strains.