Introduction

The Lactobacillus plantarum species comprises highly flexible and versatile lactic acid bacteria (LAB) that have been isolated from many different environmental niches, such as animals, plants, and the gastrointestinal and vaginal tracts, as well as from various food materials, such as vegetables, dairy products, meat products, and fermented foods [1, 2]. L. plantarum is used in a variety of fermented foods, and some strains are used as probiotics that may confer beneficial health effects on humans or animals [3].

Probiotics are living microorganisms that provide beneficial effects to the host and are used to prevent a variety of diseases associated with diarrhea, hyperlipidemia, inflammatory bowel disease, and immune function [4, 5]. In the genus Lactobacillus, some strains of species such as L. acidophilus, L. gasseri, L. rhamnosus, L. plantarum, and L. fermentum act as important probiotics [6]. To function as a probiotic, a bacterial strain must resist bile and the acidity of the gastrointestinal tract in order to reach the small intestine. Other functional properties used to characterize probiotics are the ability to produce antimicrobial compounds and to reduce serum cholesterol levels [6, 7]. Cholesterol-lowering effects are closely related to bile salt hydrolase (bsh). Bile acids conjugated with taurine or glycine help absorb cholesterol in the small intestine. However, when conjugated bile acids are deconjugated by bacterial bsh, the bile acids are excreted and cholesterol is consumed as a precursor for the synthesis of new bile acids, thereby lowering serum cholesterol [8]. Bsh activity has been reported in genera such as Bifidobacterium, Lactobacillus, and Streptococcus, and contributes to their probiotic properties in the gastrointestinal tract of humans and animals [9]. Bacteriocins, one class of antimicrobial compounds, are ribosomally synthesized antimicrobial peptides that act against closely related species [10]. Plantaricins are bacteriocins produced by L. plantarum; most of them, such as plantaricin A and the two-peptide bacteriocins plantaricin EF and plantaricin JK, belong to class IIc. Some plantaricins have antimicrobial activity against both gram-negative and gram-positive bacteria, indicating the potential of L. plantarum as an antimicrobial agent [11]. General mechanisms of probiotic effects may be shared at the genus or species level, but specific mechanisms tend to be strain-specific [12]. Thus, genome sequencing is the best way to identify the metabolic pathways, phylogenetic relationships, and health and safety properties of specific strains, and to understand the biological specificity of new strains at the genetic level [13, 14].

In a previous study, L. plantarum EM was isolated from kimchi, a traditional Korean food, and was shown to reduce serum cholesterol levels [15]. The strain also met the functional criteria required for probiotics, such as bile and acid tolerance, antimicrobial activity against pathogenic bacteria and fungi, and antibiotic susceptibility [15]. Here, we performed genome sequencing and comparative genomic analysis to uncover the mechanisms underlying the probiotic effects of L. plantarum EM.

Materials and Methods

Strain Isolation and DNA Extraction

Before use, L. plantarum EM was activated in MRS broth (Difco, Becton & Dickinson, Sparks, MD, USA) at 30 °C for 48 h under anaerobic conditions. The genomic DNA of L. plantarum EM was extracted with a DNeasy Blood and Tissue kit (Qiagen, Hilden, Germany) according to the manufacturer’s instructions. The purity and concentration of the total genomic DNA were determined by absorbance using an Ultrospec 2100 pro spectrophotometer (Amersham Biosciences, Cambridge, UK) [16].
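The absorbance-based quality check described above can be illustrated with a minimal sketch. It assumes the standard convention that one A260 unit corresponds to about 50 ng/µL of double-stranded DNA at a 1 cm path length; the function name, the dilution factor, and the example readings are hypothetical and are not values from this study.

```python
def dna_quality(a260: float, a280: float, dilution: float = 1.0) -> tuple:
    """Return (concentration in ng/uL, A260/A280 ratio) for a dsDNA sample."""
    if a280 <= 0:
        raise ValueError("A280 must be positive")
    # Standard convention: 1.0 A260 unit ~ 50 ng/uL dsDNA at 1 cm path length.
    concentration = a260 * 50.0 * dilution
    # A ratio near 1.8 is commonly taken to indicate DNA with little protein carry-over.
    ratio = a260 / a280
    return concentration, ratio


# Hypothetical readings for a 10-fold diluted sample (not data from this study).
conc, ratio = dna_quality(a260=0.45, a280=0.25, dilution=10)
print(f"{conc:.1f} ng/uL, A260/A280 = {ratio:.2f}")
```

A ratio well below 1.8 would usually prompt re-extraction before library preparation, although the acceptance threshold actually applied here is not stated in the text.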

Genome Sequencing, Assembly, Annotation, and Analysis

The genome of L. plantarum EM was sequenced using the PacBio RS II platform (Pacific Biosciences, Menlo Park, CA, USA). A 20 kb library was generated using a SMRTbell Template Preparation Kit 1.0 and sequenced with P4-P2 chemistry on two cells. The raw data comprised 91,147 reads with an average read length of 10,774 bp. The filtered subreads were de novo assembled using HGAP version 2.0. Functional annotation of the genome sequence was performed using RAST version 2.0 [17], and probiotic-related genes were identified based on the annotation results. A circular genomic map was constructed using the CGView server [18]. ResFinder version 3.1 was used to identify antibiotic resistance genes in plasmids pEM1 to pEM8 [19].
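As a rough plausibility check on the sequencing depth, the read statistics above can be combined with the 3,618,689 bp genome size reported in the Results. The sketch below is only a back-of-the-envelope estimate: it uses the raw read count and the rounded mean read length, so the depth of the filtered subreads that were actually assembled would be lower. The variable names are illustrative.

```python
# Values quoted in the text.
N_READS = 91_147            # raw reads
MEAN_READ_LEN_BP = 10_774   # average read length (rounded in the text)
GENOME_SIZE_BP = 3_618_689  # chromosome plus eight plasmids

# Approximate raw yield and fold coverage.
total_bases = N_READS * MEAN_READ_LEN_BP
raw_coverage = total_bases / GENOME_SIZE_BP

print(f"raw yield ~ {total_bases / 1e6:.0f} Mb")
print(f"raw coverage ~ {raw_coverage:.0f}x")
```

Under these assumptions the raw yield is on the order of 1 Gb, which corresponds to a raw depth of a few hundred fold.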

Comparative Genomic Analysis of Probiotic Lactobacillus Species

To confirm the genetic characteristics of L. plantarum EM, a comparative genomic analysis was performed with 41 probiotic Lactobacillus strains. These strains, whose probiotic functions were identified in previous studies, belonged to L. acidophilus, L. brevis, L. rhamnosus, L. paracasei, L. casei, L. fermentum, L. helveticus, L. plantarum, and L. reuteri. Their genomes were retrieved from the National Center for Biotechnology Information (NCBI; ftp://ftp.ncbi.nlm.nih.gov/genomes/) genome database (Table 1). To estimate the phylogenetic tree, the 16S rRNA gene sequences extracted from the genome sequences of the 42 probiotic Lactobacillus strains (the 41 reference strains plus EM) were aligned using ClustalW with default parameters, and phylogenetic analysis was performed using the maximum-likelihood method with 1,000 bootstrap replicates in MEGA (version 6.06). The genes related to bacteriocin synthesis in each Lactobacillus species were identified using BAGEL4 [20]. The genes related to probiotic properties and cholesterol lowering were identified using BLASTp. The bsh genes, which are related to cholesterol-lowering effects, were extracted from the genome sequences of 21 L. plantarum strains, and the alignment and phylogenetic tree were constructed using the ETE3 module with its default parameters for protein sequences [21]. Pan-genome analysis and visualization of the probiotic Lactobacillus strains were performed using the Anvi’o version 6.0 pan-genomic workflow [22, 23]. The numbers of genes in the pan-, core-, accessory-, and unique-genomes were determined using the Bacterial Pan Genome Analysis (BPGA) pipeline version 1.3 with the default parameters [24]. To classify the genomes of each strain by functional categories, clusters of orthologous groups (COGs) were assigned to the amino acid sequences using USEARCH version 8.0 against the COG database [25].
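The pan-, core-, accessory-, and unique-genome categories used above can be made concrete with a small sketch. This is not the BPGA or Anvi’o implementation, which first cluster protein sequences into gene families; it only illustrates the definitions, assuming that a presence/absence table of gene families per genome is already available. The function name and the toy genomes are hypothetical.

```python
from collections import Counter


def classify_gene_families(presence: dict) -> dict:
    """Split gene families into pan, core, accessory, and unique sets.

    `presence` maps a genome name to the set of gene-family IDs it carries.
    Intended for two or more genomes.
    """
    n_genomes = len(presence)
    # Count in how many genomes each gene family occurs.
    counts = Counter(fam for fams in presence.values() for fam in fams)
    core = {f for f, c in counts.items() if c == n_genomes}  # present in all genomes
    unique = {f for f, c in counts.items() if c == 1}        # present in exactly one
    accessory = set(counts) - core - unique                  # everything in between
    return {"pan": set(counts), "core": core,
            "accessory": accessory, "unique": unique}


# Toy example with three hypothetical genomes.
toy = {
    "strain_A": {"g1", "g2", "g3"},
    "strain_B": {"g1", "g2", "g4"},
    "strain_C": {"g1", "g5"},
}
result = classify_gene_families(toy)
print({name: len(families) for name, families in result.items()})
```

With this definition the core, accessory, and unique sets partition the pan-genome, so their sizes always sum to the pan-genome size.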

Table 1 General genome features of 42 Lactobacillus strains

Nucleotide Sequence Accession Numbers

The genome sequence of Lactobacillus plantarum EM was deposited in the DDBJ/EMBL/GenBank with accession numbers CP037429.1 (chromosome), CP037430.1 (pEM1), CP037431.1 (pEM2), CP037432.1 (pEM3), CP037433.1 (pEM4), CP037434.1 (pEM5), CP037435.1 (pEM6), CP037436.1 (pEM7), and CP037437.1 (pEM8).

Results and Discussion

General Genome Features

The complete genome of L. plantarum EM was composed of a circular chromosome and eight plasmids (Fig. 1), totaling 3,618,689 bp with a G+C content of 44.2% (Table 1). The chromosome was 3,184,808 bp long with a G+C content of 44.7%. The plasmids, designated pEM1 to pEM8, ranged in length from 21,426 to 76,369 bp. The size and G+C content of the L. plantarum EM chromosome were similar to those of L. paracasei N1115 (3,064,279 bp, 46.46%), L. rhamnosus DSM 14870 (3,013,149 bp, 46.7%), and L. casei LC5 (3,132,867 bp, 47.9%), but differed from those of L. fermentum F-6 (2,064,620 bp, 51.7%) and L. acidophilus NCFM (1,993,560 bp, 34.7%). Among the species used for analysis, L. plantarum had the largest genome and the greatest number of plasmids. This is related to the ecological flexibility of L. plantarum and the diversity of ecological niches in which it is encountered [26]. In general, Lactobacillus species have reduced their genome size during evolution by discarding functions not needed in their environment, whereas L. plantarum has a larger genome acquired through horizontal gene transfer via mobile elements, such as plasmids, transposons, prophages, and integrons [27].
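The genome statistics in this paragraph rest on two simple calculations, sketched below: the G+C percentage of a sequence and the combined plasmid length implied by the reported total and chromosome sizes. The helper name and the short test sequences are illustrative only; the two genome sizes are the values quoted above.

```python
def gc_content(seq: str) -> float:
    """Return the G+C percentage of a nucleotide sequence, ignoring non-ACGT symbols."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no A, C, G, or T")
    return 100.0 * (seq.count("G") + seq.count("C")) / acgt


# Sizes quoted in the text.
TOTAL_BP = 3_618_689
CHROMOSOME_BP = 3_184_808

# Combined length of pEM1 to pEM8 implied by those two figures.
plasmid_bp = TOTAL_BP - CHROMOSOME_BP
plasmid_share = 100.0 * plasmid_bp / TOTAL_BP

print(f"plasmids: {plasmid_bp:,} bp ({plasmid_share:.1f}% of the genome)")
print(f"toy G+C: {gc_content('ATGCGC'):.1f}%")
```

The eight plasmids together therefore account for roughly one eighth of the genome, consistent with the point that mobile elements contribute substantially to the size of the L. plantarum genome.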

Fig. 1 Circular genome maps of L. plantarum EM: a chromosome, b plasmids

Nucleotide BLAST results revealed that the eight plasmids of L. plantarum EM showed similarity to plasmids or chromosomes of L. coryniformis, L. plantarum, L. pentosus, and L. curvatus strains. Each plasmid also carried a gene encoding a plasmid replication protein. The annotation results showed that the genome had 3,107 coding sequences and 88 RNA genes. Moreover, the protein-coding sequences were functionally divided into 238 SEED subsystem categories. The plasmids of L. plantarum EM contained from 29 to 79 different protein-coding genes. The ResFinder database was used to identify antibiotic resistance genes, and none were detected in any of the plasmids. Therefore, antibiotic resistance genes are not expected to be transmitted from L. plantarum EM to pathogenic microorganisms in the gastrointestinal tract.

Probiotic-Related Genes of L. plantarum EM

The probiotic properties of L. plantarum EM were confirmed in a previous study [15]. This was supported by the genomic analysis in our study, which detected genes encoding F0F1 ATP synthase subunits (chr_orf2044 to chr_orf2050), which are related to acid tolerance, and choloylglycine hydrolases (chr_orf56, chr_orf57, chr_orf2236, chr_orf2913, chr_orf3049), which are related to bile salt resistance (Table 2). Probiotics can experience heat stress during food processing (e.g., pasteurization and spray-drying) or during storage. Exposure to high temperatures induces the expression of evolutionarily conserved heat shock proteins (HSPs), including chaperones, such as GrpE, DnaK, DnaJ, and GroES/GroEL [28]. The heat shock protein GrpE (chr_orf1738) and the chaperone proteins DnaK (chr_orf1737), DnaJ (chr_orf1736), and GroES/GroEL (chr_orf638 to chr_orf639), which participate in the heat shock response and hyperosmotic response, were detected in the chromosome of L. plantarum EM. Cold shock-induced proteins have been identified in a variety of microorganisms, and their genes are related to the adaptation process required for bacterial survival at low temperatures [29]. In L. plantarum EM, genes encoding cold shock proteins of the CSP family were found on the chromosome (chr_orf31, chr_orf886, chr_orf1025). Additionally, genes for catalase KatE (chr_orf3077), thiol peroxidase (chr_orf2002), and glutathione peroxidase (chr_orf194), which protect against oxidative stress, were detected.

Table 2 Important genes encoding probiotic-related proteins in L. plantarum EM

Pan-Genomic Analysis of 42 Probiotic Lactobacillus Strains

The taxonomic relationship between L. plantarum EM and other probiotic Lactobacillus species was confirmed by 16S rRNA gene sequence analysis. Phylogenetic tree analysis revealed that L. plantarum EM was grouped with the L. plantarum strains (Fig. 2). Among the L. plantarum strains, the 16S rRNA gene sequence of L. plantarum EM was most closely related to those of ST-III, 10CH, and WCFS1 (100% identity). Hence, based on the phylogenetic relationship analysis, the EM strain was identified as L. plantarum.

Fig. 2 Phylogenetic analysis based on 16S rRNA gene sequences of 42 probiotic Lactobacillus strains

To understand the genomes of probiotic Lactobacillus species and to identify the unique genes of L. plantarum EM, we performed a pan-genome analysis. The pan-genome analysis of the 42 Lactobacillus strains showed that, except for the L. casei and L. paracasei strains, the strains were grouped by species (Fig. 3). Based on a comparative genomic analysis of the 42 genome sequences of probiotic Lactobacillus species, the pan-, accessory-, and core-genome encompassed 15,020, 10,877, and 114 genes, respectively. To investigate the diversity and functionality encoded by the pan-genome, the genes were classified by functional categories using COG analysis. The core-genome was assigned a high percentage of genes for translation, ribosomal structure and biogenesis, and the accessory-genome had the highest percentage of genes for general function prediction and transcription. It contains probiotic-related genes, such as choloylglycine hydrolase, that function in bile resistance. These results suggest that all the strains used for the analysis had probiotic-related genes because they have already been confirmed as probiotic bacteria. Of the 4,029 unique genes identified in the 42 Lactobacillus strains, 83 were present only in the L. plantarum EM genome. These unique genes were mainly involved in replication, recombination and repair (20.51%), transcription (15.38%), and carbohydrate transport and metabolism (12.82%).
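The gene counts in this paragraph can be cross-checked with simple arithmetic. The sketch below verifies that the core, accessory, and unique genes sum to the pan-genome size, and then asks which category totals are compatible with the three rounded COG percentages. The second part is an inference from rounded values, not something stated in the text, and the function name is hypothetical.

```python
# Counts quoted in the text.
CORE, ACCESSORY, UNIQUE, PAN = 114, 10_877, 4_029, 15_020
partition_ok = (CORE + ACCESSORY + UNIQUE == PAN)


def compatible_totals(percentages, max_total, tol=0.005):
    """Return totals n <= max_total for which every percentage equals k/n
    for some integer k, to within the two-decimal rounding tolerance."""
    totals = []
    for n in range(1, max_total + 1):
        if all(abs(100 * round(p * n / 100) / n - p) < tol for p in percentages):
            totals.append(n)
    return totals


# COG shares reported for the 83 genes unique to L. plantarum EM.
candidates = compatible_totals([20.51, 15.38, 12.82], max_total=83)

print("core + accessory + unique == pan:", partition_ok)
print("totals compatible with the reported percentages:", candidates)
```

The only compatible totals up to 83 are 39 and 78, which would be consistent with the percentages having been calculated over the unique genes that received a COG assignment rather than over all 83; the text does not confirm this.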

Fig. 3 Pan-genome distribution across 42 Lactobacillus strains. The center figure shows the hierarchical clustering of the pan-genome based on gene presence/absence

Genetic Analysis Related to the Cholesterol-Lowering Effect

High cholesterol-removing ability was observed in the L. plantarum EM strain [15]. This ability was supported by our genomic analysis, in which a total of five bsh genes were detected. L. plantarum ST-III is a highly cholesterol-resistant strain with four bsh genes in its genome, and the function of these genes was demonstrated in a previous study [30]. Alignment of the bsh genes of L. plantarum ST-III and EM showed that the bsh1, bsh3, and bsh4 genes of ST-III had 98–100% identity to chr_orf2191, chr_orf2855, and chr_orf2990 of EM, respectively. Compared to the bsh2 gene of L. plantarum ST-III, eleven nucleotide substitutions were found in chr_orf56 and chr_orf57 of EM. At the 475 bp position of chr_orf56 and chr_orf57 in L. plantarum EM, the TGG codon for tryptophan was replaced with TAG, creating a premature stop codon. As a result, this bsh gene was split into two fragments, giving a total of five annotated bsh genes. Comparison of the bsh genes of 23 L. plantarum strains showed that most strains had a single bsh gene, similar to bsh1 of L. plantarum ST-III. The bsh2, bsh3, and bsh4 genes were present only in strains with three or more bsh genes (Fig. 4). A previous study showed that all bsh genes of L. plantarum ST-III were responsible for hydrolysis activity against many substrates, and that the bsh1 gene product was highly active against glycodeoxycholic acid [30].
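The effect of the TGG to TAG substitution described above can be illustrated with a toy example. The sketch below uses a short hypothetical open reading frame, not the real bsh2 sequence, and assumes 1-based nucleotide coordinates counted from the first base of the start codon; the function names are illustrative.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codon_number(nt_position: int) -> int:
    """Return the 1-based codon containing a 1-based nucleotide position."""
    return (nt_position - 1) // 3 + 1


def first_stop_codon(cds: str):
    """Return the 1-based number of the first in-frame stop codon, or None."""
    cds = cds.upper()
    for i in range(0, len(cds) - len(cds) % 3, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i // 3 + 1
    return None


def substitute(cds: str, nt_position: int, base: str) -> str:
    """Return a copy of cds with the base at a 1-based position replaced."""
    return cds[:nt_position - 1] + base + cds[nt_position:]


# Hypothetical toy ORF: ATG GCT TGG AAA TAA (Met Ala Trp Lys stop).
wild_type = "ATGGCTTGGAAATAA"
# G -> A at nucleotide 8 turns codon 3 from TGG (Trp) into TAG (stop).
mutant = substitute(wild_type, 8, "A")

print("wild-type stop at codon", first_stop_codon(wild_type))
print("mutant stop at codon", first_stop_codon(mutant))
print("codon starting at nucleotide 475:", codon_number(475))
```

Under the stated coordinate assumption, a codon beginning at nucleotide 475 is codon 159, so translation of the affected gene would stop after 158 residues, and any downstream start codon could give rise to a second, separately annotated open reading frame.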

Fig. 4 Phylogenetic analysis based on the bile salt hydrolase genes of 21 L. plantarum strains

Identification of Bacteriocin Gene Clusters

The bacteriocin synthesis gene clusters of the Lactobacillus species were compared and analyzed. One or more bacteriocin gene clusters were found in all species except the L. brevis and L. fermentum strains (Table 3). L. rhamnosus, L. helveticus, and L. plantarum mostly contained carnocin, helveticin, and plantaricin, respectively. L. casei and L. paracasei mainly contained LSEI bacteriocin derived from L. casei ATCC 334, and L. acidophilus mainly contained acidocin and helveticin. Bacteriocin gene clusters appear to be among the genes that are transferred horizontally, showing similar patterns between closely related genomes [31]. The genome of L. plantarum EM encodes genes involved in the synthesis of bovicin (gene start position, 1 bp on the plasmid) and plantaricin (gene start position, 353,842 bp on the chromosome), i.e., plantaricin JK, N, A, and EF (Fig. 5). Plantaricin F was previously identified in probiotic L. plantarum with antimicrobial activity against Micrococcus, Listeria, Staphylococcus, and Salmonella [32]. In an in vitro assay, L. plantarum EM showed antimicrobial activity against Bacillus cereus, Micrococcus luteus, Staphylococcus aureus, Escherichia coli, Salmonella Typhi, Vibrio parahaemolyticus, and Pseudomonas aeruginosa [15]. This activity is presumed to be related to the bacteriocin gene clusters present in the genome of L. plantarum EM.

Table 3 Putative bacteriocin gene cluster identified in Lactobacillus species

Fig. 5 Genetic organization of putative bacteriocin synthesis genes: a Bovicin gene cluster of L. plantarum EM on plasmid, b Plantaricin gene cluster of L. plantarum EM on chromosome, c Plantaricin gene cluster of L. plantarum WCFS1, d Plantaricin gene cluster of L. plantarum 16, e Plantaricin gene cluster of L. plantarum KLDS1.0391

In this study, we performed genome sequencing and analysis of L. plantarum EM, a strain whose probiotic properties have already been confirmed. The genome sequence of L. plantarum EM provided genetic information on probiotic-related functions, such as cholesterol lowering, antimicrobial activity, and tolerance to bile and acid. The bsh genes, bacteriocin gene clusters, and F0F1 ATP synthase genes were identified through genomic analysis of L. plantarum EM. This strain may be used in the food industry and other industries as a probiotic for human health.