Introduction

Some bacteria in the colon have been correlated with healthy phenotypes and others with diseased phenotypes (Rojo et al. 2016). Moving beyond correlations, recent studies have demonstrated the beneficial effect of specific microbes or consortia of microbes on human health (Atarashi et al. 2013; Subramanian et al. 2014; Plovier et al. 2016). Modulating the gut microbiota to improve and/or maintain a healthy status is a promising avenue and can be approached in various ways, such as diet and specific probiotics or consortia of probiotics (Rojo et al. 2016; O’Toole et al. 2017).

For a long time, bacteria from genera such as Lactobacillus, Lactococcus, Bifidobacterium, Leuconostoc, Bacillus, Streptococcus, Clostridium and Escherichia have been used as general probiotics. However, recent advances in high-throughput sequencing have revealed population-specific features of the human gut microbiome (Rastall et al. 2005; Consortium 2013; Falcony et al. 2016). Hence, identifying population-specific or indigenous beneficial bacteria will be an important aspect of designing microbiome-based therapeutics. Initial studies on the Indian microbiome have indicated a notable difference from other populations around the world (Bhute et al. 2016). This is hypothesised to be a consequence of the population's diverse genetic composition, dietary habits, cultural and religious affiliations and vast geographic spread (Shetty et al. 2013). The probiotics currently marketed in India are mostly non-indigenous strains and their efficacy is debated (Raghuwanshi et al. 2015).

In the present study, we screened bacterial strains isolated from faecal samples of healthy Indian subjects for their probiotic potential. More than 125 different strains from 18 different genera were isolated using 35 culture media and different growth conditions. Among these, strain 17OM39 of Enterococcus faecium showed properties of a potential probiotic. This work uses an integrated approach of in vitro experiments and genomic analysis to characterise the metabolic characteristics of strain 17OM39.

Materials and methods

Isolation and preservation

Three self-declared healthy volunteers were selected for this study. Approval from the Institutional Ethics Committee (IEC) and consent from the subjects were obtained prior to sample collection. Faecal samples were collected in a pre-sterilised container, immediately transported to the lab at 4 °C and processed within 6 h. For isolation of faecal bacteria, 1 g of faeces was transferred to 9 ml of sterile saline (0.85% sodium chloride, Sigma) and mixed well. Serial dilutions were subsequently prepared in sterile saline. Appropriate dilutions of the samples were plated on MRS media (HiMedia, Mumbai, India). Plates were incubated at 37 °C for 24 to 48 h under aerobic conditions (Mishra et al. 2015). A pure culture of the isolate was preserved in 20% (v/v) glycerol as frozen stocks at −80 °C.

Identification

Genomic DNA was extracted using the QIAGEN Blood and Tissue Kit (QIAGEN, USA) as per the manufacturer’s protocol. DNA was quantified with a Nanodrop ND1000 (Thermo Scientific, USA). The 16S rRNA gene was amplified using the eubacteria-specific primers 27F (5′-AGA GTT TGA TCM TGG CTC AG-3′) and 1492R (5′-ACG GCT ACC TTG TTA CGA CTT-3′) as described in an earlier study (Patil et al. 2015). Amplified PCR products were purified using polyethylene glycol (PEG)–NaCl precipitation (Evans 1990). Both strands were sequenced on an ABI 3730xl DNA analyser using the Big Dye terminator kit (Applied Biosystems, Inc., Foster City, CA). The sequence obtained was assembled using DNASTARPro, version 10 (Patil et al. 2015) and its taxonomic identity was checked using the EZ-Biocloud server (Kim et al. 2012). Scanning electron microscopy (SEM) was performed as described elsewhere (Golding et al. 2016).

Characterisation of probiotic properties

Antibiotic susceptibility testing was carried out by the disc diffusion method using the Dodeca Universal I & II kit (HiMedia, India). The isolate was tested for nine exoenzymes by standard microbiological methods, viz. phosphatase by Pikovskaya’s agar base (Shen et al. 2016), urease by urease agar base (Vuye and Pijck 1973), lipase by tributyrin agar base (Yusof et al. 1989), cellulase by CMC agar (Kasana et al. 2008), protease by skimmed milk agar (Savijoki et al. 2006), gelatinase by gelatin medium (Whaley et al. 1982), catalase by effervescence of 6% H2O2 (Whittenbury 1964), nitrate reductase by the colour change method (Tiso and Schechter 2015) and amylase activity by the starch iodine test (Carrasco et al. 2016). The carbohydrate utilisation test was performed according to the manufacturer’s instructions using the HiCarbohydrate™ Kit (Sigma, India), consisting of 34 carbohydrates. The presence of any plasmid was checked using the QIAGEN Plasmid Mini Kit (QIAGEN, USA). The following assays were carried out as described previously: bile and acid tolerance (Chou and Weimer 1999; Muller et al. 2011; Hassanzadazar et al. 2012; Tokatl et al. 2015), cell surface hydrophobicity (Saran et al. 2012), adhesion to the human HT-29 cell line (Gyles and Boerlin 2014; Mishra et al. 2015), bile salt hydrolytic (BSH) activity (Zanotti et al. 2015), hypocholesterolemic activity (Sridevi et al. 2009), resistance to hydrogen peroxide (Halliwell et al. 2000; Wu et al. 2014; Forman et al. 2016), exopolysaccharide production (Abdhul et al. 2014), haemolytic activity (Ike et al. 1987; Semedo et al. 2003a, b) and serum resistance (King et al. 2009). For the autoaggregation assay, briefly, 10⁷ cells/ml of probiotic cells were harvested, washed with PBS (pH 7.2) and re-suspended in the same buffer. The bacterial suspensions were then incubated at 37 ± 2 °C and monitored at different time intervals (0, 1, 2, 3, 4, and 5 h). The percentage of autoaggregation was expressed as A% = (A0 − At)/A0 × 100, where A0 represents the absorbance (A600 nm) at 0 h and At represents the absorbance at the different time intervals (Schmidt and Hensel 2004). Percentage cell survival was determined as the ratio of the number of cells surviving at the end of the experiment (colony count) to the number of cells seeded initially (colony count).
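The two quantities defined above can be computed as in the following minimal sketch. The function names and the absorbance readings are hypothetical and serve only as an illustration; they are not data from this study. The survival ratio is multiplied by 100 on the assumption that it is reported as a percentage.

```python
def autoaggregation_percent(a0: float, at: float) -> float:
    """A% = (A0 - At) / A0 * 100, with absorbances read at 600 nm."""
    if a0 <= 0:
        raise ValueError("A0 must be positive")
    return (a0 - at) / a0 * 100.0


def percent_survival(final_count: float, initial_count: float) -> float:
    """Surviving colony count as a percentage of the initially seeded count."""
    if initial_count <= 0:
        raise ValueError("initial count must be positive")
    return final_count / initial_count * 100.0


# Hypothetical A600 readings at 0-5 h, for illustration only.
readings = {0: 0.50, 1: 0.47, 2: 0.44, 3: 0.41, 4: 0.38, 5: 0.35}
a0 = readings[0]
for hour, absorbance in readings.items():
    print(f"{hour} h: {autoaggregation_percent(a0, absorbance):.1f}% autoaggregation")
```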

Antimicrobial activity of the strain 17OM39

The inhibitory activity assay was performed against Listeria monocytogenes (ATCC 13932), Pseudomonas aeruginosa (ATCC 15442), Streptococcus pneumoniae (ATCC 49619) and Escherichia coli (ATCC 25922) as described in a previous study (Pieniz et al. 2014).

Statistical analysis

All experiments were done in triplicate. Mean values and standard deviations were calculated from the data obtained from the triplicate assays. These data were then compared using Duncan’s Multiple Range Test (SPSS Ver. 10.0).
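As an illustration of the summary statistics, the mean and standard deviation of one triplicate assay can be computed as sketched below. The values are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not state which estimator was used. Duncan’s Multiple Range Test is not part of the Python standard library and is not reproduced here.

```python
from statistics import mean, stdev


def summarise_triplicate(values):
    """Return (mean, sample standard deviation) for one triplicate assay."""
    if len(values) != 3:
        raise ValueError("expected exactly three replicate values")
    return mean(values), stdev(values)  # stdev uses the n - 1 denominator


# Hypothetical percentage survival values from three replicates.
m, sd = summarise_triplicate([78.0, 80.0, 82.0])
print(f"{m:.2f} ± {sd:.2f}")
```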

Genome sequencing and assembly

Genomic DNA was extracted following the manufacturer’s protocol (QIAamp genomic DNA kit, Germany). The high-quality DNA was sequenced on the Illumina MiSeq platform using 2 × 300 paired-end libraries. Reference-based assembly of quality-filtered reads was done using MIRA assembler version 4.9.3 (Chevreux and Wetter 1999). The genome of E. faecium T110 (GenBank ID: CP006030.1) was used as a reference.

Bioinformatics analyses

The draft genome sequence was annotated with Rapid Annotation using Subsystem Technology (RAST) version 4.0 (Overbeek et al. 2014) and the NCBI Prokaryotic Genome Annotation Pipeline (PGAP) (Tatusova et al. 2016). Protein-coding genes, tRNA genes and rRNA genes were predicted using Glimmer version 3.02 (Delcher et al. 1999), tRNAscan-SE (Lowe and Eddy 1997), and RNAmmer (Lagesen et al. 2007), respectively. Protein-coding genes were also analysed against the COG database (Tatusov et al. 2000) and Pfam domains were predicted using the NCBI Batch CD-Search Tool (Marchler-Bauer et al. 2011). The presence of CRISPR repeats was predicted using the CRISPRFinder tool (Grissa et al. 2008). The origin of replication was predicted using ORIFinder (Luo et al. 2014). Open reading frames were obtained using the ORF finder tool (https://www.ncbi.nlm.nih.gov/orffinder/). Prophage sequences were predicted and annotated using PHASTER (Arndt et al. 2016). Bacterial insertion sequences (ISs) were identified with ISfinder (Siguier et al. 2006). Horizontal gene transfer was detected with the genomic island tool IslandViewer (Juhas et al. 2009). Gene clusters for bioactive compounds were identified with antiSMASH (antibiotics and Secondary Metabolite Analysis SHell) (Medema et al. 2011). PlasmidFinder was used to search for plasmids within the genome (Carattoli et al. 2014). Codon usage analysis was done using General Codon Usage Analysis (GCUA) (McInerney 1998). The Kyoto Encyclopedia of Genes and Genomes (KEGG) was used to predict the metabolic pathways from the genome (Ogata et al. 1999). In silico DDH was done using the Genome-to-Genome Distance Calculator (version 2.1) server at http://ggdc.dsmz.de/ of the Leibniz Institute DSMZ-German Collection of Microorganisms and Cell Cultures. The average nucleotide identity comparisons were done using the ANI Calculator available at the EZ-Biocloud server (Lee et al. 2016). In silico RFLP analysis of complete bacterial genomes was done at http://insilico.ehu.es/restriction/ (Bikandi et al. 2004).
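The idea behind an in silico restriction digest can be sketched as follows: locate every occurrence of a recognition sequence in a circular genome and report the lengths of the resulting fragments, each of which would correspond to one band. This is a simplified illustration and not the implementation used by the web server cited above. The function name, the toy sequence and the recognition sequence GTATAC are assumptions made for the example, and the sketch ignores the exact cut position within the site (which does not change fragment lengths on a circular molecule).

```python
def circular_fragment_lengths(genome: str, site: str) -> list[int]:
    """Fragment lengths after cutting a circular sequence at every site occurrence."""
    genome = genome.upper()
    site = site.upper()
    n = len(genome)
    # Extend the sequence so that sites spanning the origin are also found.
    extended = genome + genome[: len(site) - 1]
    positions = [i for i in range(n) if extended.startswith(site, i)]
    if not positions:
        return [n]  # uncut circle: a single fragment of full length
    lengths = []
    for idx, pos in enumerate(positions):
        nxt = positions[(idx + 1) % len(positions)]
        # Distance to the next site, wrapping around the origin; a single
        # site linearises the circle into one full-length fragment.
        lengths.append((nxt - pos) % n or n)
    return sorted(lengths, reverse=True)


# Toy circular sequence with two assumed recognition sites.
toy_genome = "GTATACAAAAGTATACAA"
fragments = circular_fragment_lengths(toy_genome, "GTATAC")
print(len(fragments), "bands with lengths", fragments)
```

Comparing the number and sizes of fragments between two genomes is what allows a banding pattern to distinguish closely related strains.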

Accession number(s)

This whole genome shotgun project has been deposited at GenBank under the accession LWHF00000000. The version described in this paper is version LWHF00000000. The GenBank accession number for the partial 16S rRNA nucleotide sequence is KY682304.

Results

Microscopy and identification

The bacterial cell morphology was examined using a scanning electron microscope. The coccoid cells occur in pairs or short chains (Fig. S1) and have a diameter of ~ 1.35 μm. The 16S rRNA gene of strain 17OM39 shows 99.92% similarity to that of E. faecium type strain CGMCC (Fig. 1).

Fig. 1

Phylogenetic relationship of strain 17OM39 with closely related taxa based on 16S rRNA gene sequences (16S rRNA gene sequence of Lactococcus lactis was used as out-group). The phylogenetic trees were constructed using the neighbour-joining method. The evolutionary distances were computed using the Kimura 2-parameter method and are in the units of the number of base substitutions per site. The rate variation among sites was modelled with a gamma distribution (shape parameter = 1)

Characterisation of probiotic properties

Exoenzymes, carbohydrate utilisation, antibiogram and plasmid determination

Strain 17OM39 was positive for protease, cellulase, catalase, amylase, lipase and nitrate reductase activities (Table 1). The strain was also able to metabolise most of the carbohydrate sources tested in this study, including lactose, xylose, maltose, fructose, dextrose, galactose, raffinose, trehalose, melibiose, sucrose, mannose and inulin (Table 2). The strain was sensitive to 19 of the 24 antibiotics tested and showed only intermediate resistance to the remaining five: levofloxacin, ciprofloxacin, cefotaxime, gentamicin and furazolidone (Table 3). The absence of plasmids was confirmed by the plasmid extraction method.

Table 1 Exoenzymes produced by Enterococcus faecium 17OM39
Table 2 Various carbohydrates utilised by test strain Enterococcus faecium 17OM39 and control Lactobacillus rhamnosus GG
Table 3 Antibiogram results for the test strain Enterococcus faecium 17OM39 and control Lactobacillus rhamnosus GG

Bile, acid and hydrogen peroxide tolerance

Strain 17OM39 was able to grow in 0.3% bile salt with 80% survivability after 24 h (Fig. 2a) and survived at bile salt concentrations of up to 1% (w/v) (data not shown). The acid tolerance assay was carried out at pH 2.5 for 3 h, mimicking the conditions and residence time of food in the stomach; the strain showed 74.02% survivability under these conditions (Fig. 2b). The strain was able to tolerate exposure to hydrogen peroxide for 6 h, and tolerance declined gradually thereafter (Fig. 2e).

Fig. 2

Results of the experiments conducted for characterisation of probiotic properties, where 17OM39 is Enterococcus faecium and the control is Lactobacillus rhamnosus GG. a Bile tolerance test showing the survival of 17OM39 and control in 0.3% bile salt. b Acid tolerance test showing the survival of 17OM39 and control at pH 2.5 for 3 h. c Percentage autoaggregation capacity of 17OM39 and control. d Serum resistance test conducted for 17OM39 and control. e Peroxide resistance test conducted for 17OM39 and control in hydrogen peroxide. Error bars indicate standard deviation (SD). Percentage cell survival was determined as the ratio of the number of cells surviving at the end of the experiment (colony count) to the number of cells seeded initially (colony count)

Adhesion ability and exopolysaccharide production

Autoaggregation of strain 17OM39 was observed to be 32% (Fig. 2c, Table 4), while cell surface hydrophobicity was observed with hydrocarbons such as hexane (32%), toluene (42%) and hexadecane (45%). Adhesion to the human HT-29 cell line was found to be 57% and 60% for strain 17OM39 and the control, respectively. In addition, microscopic observations revealed strong adhesion (Fig. 3). Strain 17OM39 is also capable of producing exopolysaccharide (Table 4).

Table 4 Autoaggregation, surface hydrophobicity, adhesion, EPS and haemolytic activity results for test strain Enterococcus faecium 17OM39
Fig. 3

Results for cell adhesion assay performed on Human HT-29 Cell line. The upper panel shows images taken under fluorescence microscope 60× oil immersion, while the lower panel represents the images taken under normal microscope. a, a1 Negative control without any bacterial cell added. b, b1 Strain 17OM39. c, c1 Positive control Lactobacillus rhamnosus GG

BSH activity and cholesterol removal

A 17-mm-diameter zone of clearance was observed demonstrating the ability of strain 17OM39 to hydrolyse bile. The ability to hydrolyse bile was confirmed by the ninhydrin method (Table 5). The BSH activity of strain 17OM39 was equal to that of the control strain Lactobacillus rhamnosus GG. It was observed that the active cells were able to remove cholesterol better than resting and dead cells (Table 5).

Table 5 BSH activity and cholesterol removal results for test strain Enterococcus faecium 17OM39 along with the (±) standard deviation

Pathogenicity testing

Pathogenicity testing included the investigation of haemolytic activity and serum resistance. The strain 17OM39 exhibited alpha haemolytic activity and was found susceptible to human serum with 0.01% survival after 3 h (Fig. 2d).

General genome features

More than 5 million good-quality paired-end reads were obtained, which accounted for approximately 100× sequencing coverage. The genome of E. faecium 17OM39 consisted of 2,696,332 bp (2.69 Mb) with an average GC content of 38.5%. The genome contained 2865 genes, and 2639 ORFs were identified. The general genome information is given in Table S1 and the genome atlas in Fig. 4. Within the genome, we identified 78 RNA genes, including 5S RNA (four copies), 16S RNA (six copies) and 23S RNA (two copies). The relative synonymous codon usage found in the genome is presented in Table S2. Protein-coding genes were analysed against the COG database on WebMGA, which revealed that 76.60% code for major cellular processes and signalling, information storage and processing, and metabolism, while the remaining 23.4% of genes were poorly characterised (Table S3). Strain 17OM39 showed 81% DNA–DNA relatedness and 97.83% ANI to the marketed probiotic E. faecium strain T110. The in silico genome RFLP with SnaI gave 28 and 31 bands for strain 17OM39 and T110, respectively.
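For readers unfamiliar with the GC content statistic reported above, it is the percentage of G and C bases among all unambiguous bases of the sequence. The following minimal sketch shows the calculation on a toy sequence; the function name and the sequence are hypothetical, and ambiguous bases such as N are simply excluded, which is an assumption rather than a description of the tools used in this study.

```python
def gc_content(sequence: str) -> float:
    """Percentage of G and C among the unambiguous bases (A, C, G, T)."""
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for base in counted if base in "GC")
    return gc / len(counted) * 100.0


# Toy sequence for illustration only.
print(f"{gc_content('ATGCATGCAA'):.1f}% GC")
```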

Fig. 4

Genome atlas of strain 17OM39. The atlas represents a circular view of the complete genome sequence of Enterococcus faecium 17OM39. The circle was created by using CG viewer. Innermost circle 1 shows GC-skew (black). Circles 2 and 3 show CDSs (blue). Circle 4 represents genes for stress and allied metabolism (pink). Circle 5 shows genes for carbohydrate and amino acid metabolism (green). Genes imparting probiotic properties are shown in circle 6 (sky blue)

Mobilome

In silico analysis of the genome of E. faecium 17OM39 revealed two prophages, one complete and one incomplete. The complete phage region has genes encoding lysin, tail, capsid, head, portal, terminase and integrase, while the incomplete phage has transposase, tail and lysin (Table 6 and Fig. 5a). Seven putative transposase genes, representing three different families, namely IS4, IS256 and IS982, were identified in E. faecium 17OM39. These families appeared in duplicate copies and are highly conserved (Table 7).

Table 6 Prophage regions identified within Enterococcus faecium 17OM39
Fig. 5

In silico analysis performed. a Predicted prophages in the genome of Enterococcus faecium 17OM39, with positions, using the PHASTER tool. b Predicted genomic islands using the IslandViewer tool

Table 7 IS elements found in the genome of Enterococcus faecium 17OM39

Clustered regularly interspaced short palindromic repeats (CRISPR) elements were not found in the genome of strain 17OM39. A total of 10 genomic island regions were identified and none of these regions was found to contain any antibiotic resistance gene (Fig. 5b). Our analysis suggested that these regions mainly harboured genes encoding viral proteins, along with a few ribosomal and hypothetical proteins. However, some genomic islands contained genes encoding death on curing protein, glutathione synthesis, methionine ABC transporter and gluconate dehydratase (EC 4.2.1.39). The regions in probable genomic islands are described in Table S4. Moreover, the genome was devoid of origin of transfer (oriT or bom) and transfer (tra) genes essential for plasmid maintenance.

Bacteriocin gene cluster

The genome of E. faecium 17OM39 harboured seven secondary metabolite gene clusters, as observed using the antibiotics and Secondary Metabolite Analysis Shell tool (antiSMASH). These clusters belong to the microcin and bacteriocin families, but only one cluster showed a significant hit, with 66% similarity to a bacteriocin, located at nucleotide position 2312434-2313194 in the genome (Fig. S2). The bacteriocin has three domains, viz. enterocin induction factor (1-147), enterocin A immunity protein (251-589) and enterocin A (564-761). Based on previous evidence that this type of bacteriocin exhibits activity specifically against L. monocytogenes, the antimicrobial activity assay was performed against L. monocytogenes (ATCC 13932) along with P. aeruginosa (ATCC 15442), Streptococcus pneumoniae (ATCC 49619) and E. coli (ATCC 25922). The experiment showed antimicrobial activity against L. monocytogenes (zone of clearance of 9.2 ± 0.2 mm). The in vitro experiment thus supports the activity of this extracellularly secreted bacteriocin, as it inhibited the growth of L. monocytogenes. The bacteriocin is being further characterised in our lab.

Genome-based metabolic capabilities

We screened the genome sequence of the strain 17OM39 for metabolic capabilities to understand the cellular processes accounting for potential probiotic nature.

Amino acid synthesis

Genome analysis indicates the potential of strain 17OM39 to synthesise amino acids such as cysteine, serine, and aspartate. From these three amino acids, seven other amino acids and derivatives could be generated (Kleerebezem et al. 2003). The presence of particular genes in a pathway was assessed and the metabolic pathway for lysine was constructed as shown in Fig. S3. The genome harboured genes encoding Beta-lyase (metC) and L-serine dehydratase (EC 4.3.1.17) to synthesise cysteine and serine from pyruvate. Detailed information about amino acid synthesis can be found in the Supporting Text.

Allied metabolism

Sulphate permease (which permits SO₄²⁻ into the cell) and serine acetyltransferase (EC 2.3.1.30), the enzymes required for sulphate incorporation from H2S for the synthesis of L-serine, were traced in the genome. An alternative pathway of ammonia assimilation involving glutamine synthetase (GS; EC 6.3.1.2) was found in the genome. Based on the genes present, thiamine can be synthesised by a series of enzymes as shown in Fig. S4. The metabolic pathway for folate synthesis was traced in the genome from 7,8-dihydropteroate to folate through seven enzymes (Fig. S5). Detailed information about allied metabolism can be found in the Supporting Text.

Proteolytic system

Three types of proteolytic systems were found in strain 17OM39, i.e. oligopeptidases, aminopeptidases and carboxypeptidases specific for D-alanine-D-alanine. The type M (metallo) peptidase group consisted of two members: PepO and PepF. Group type S (serine) consisted of only five members: glutamyl aminopeptidase (EC 3.4.11.7); methionine aminopeptidase (EC 3.4.11.18); aminopeptidase S (Leu, Val, Phe, Tyr preference - EC 3.4.11.24); aminopeptidase YpdF (MP-, MA-, MS-, AP-, NP- specific); aminopeptidase C (EC 3.4.22.40) and tripeptide aminopeptidase (EC 3.4.11.4). In addition to these, the CAAX protease family was detected. These proteases share several conserved motifs and most members are likely membrane-bound endopeptidases with no common specificity. Detailed information about the proteolytic system can be found in the Supporting Text.

Carbohydrate metabolism

The 17OM39 genome encodes a large variety of genes related to the metabolism of carbohydrates such as mono-, di- and oligosaccharides, an amino sugar, organic acids and sugar alcohols. Annotation from RAST and PGAP showed the genes of the Entner-Doudoroff pathway, the pentose phosphate pathway, glycolysis and gluconeogenesis, and pyruvate metabolism I and II. Strain 17OM39 was able to utilise xylose, maltose, dextrose, raffinose, melibiose, arabinose, inulin, sodium gluconate, glycerol, inositol, sorbitol, mannitol, adonitol, rhamnose, esculin, D-arabinose, citrate, malonate and sorbose in the experiment conducted, and genomic analysis confirmed the presence of the corresponding genes within the genome. We could also trace the genes for utilisation of lactose, fructose, galactose, trehalose, sucrose, mannose, salicin, arabitol and cellobiose, along with their transporters, consistent with what was seen in the API strip assay. Figure 6 shows the carbohydrate utilisation found in the genome. Detailed information about carbohydrate metabolism can be found in the Supporting Text.

Fig. 6

Carbohydrate utilisation found in the genome. The diagram shows transporters and enzymes, as predicted by the putative genome annotation. For transporters, blue and red indicate a putative PTS transporter; orange a putative ATP-binding cassette (ABC) transporter; and pink a galactoside permease

Stress response

The genome of E. faecium 17OM39 encodes a number of stress-related proteins, including numerous proteases involved in the stress response, and has highly conserved SOS regulon genes. Heat shock proteins were also identified in the genome of 17OM39; these include the highly conserved class I heat shock genes responsible for maintaining the integrity of cellular proteins under stress conditions (GroES and GroEL operons) and the conserved class III Clp proteases ClpB, C, E and X. The F1F0-ATPase system and superoxide dismutase (SOD) were identified in this genome. The genome also harbours a universal stress protein family, a non-specific DNA-binding protein (Dps), an iron-binding ferritin-like antioxidant protein and ferroxidase (EC 1.16.3.1), a redox-sensitive transcriptional regulator, a peroxide stress regulator and an organic hydroperoxide resistance protein. The glutathione redox cycle, along with glutathione reductase (EC 1.8.1.7), NADH peroxidase (npx; EC 1.11.1.1) and recA, which is involved in repairing DNA damage, was also encoded by the genome. Cold shock proteins (cspA (four copies), cspB and cspD) were observed in the genome. Lastly, a relA gene was identified that encodes an enzyme putatively involved in osmotolerance via synthesis and hydrolysis of (p)ppGpp.

Regulation systems

One major sigma factor (rpoD) and one putative minor alternative sigma factor (sigV), essential for the bacterial cell to adjust its metabolism to changing environmental conditions, were found in the genome sequence of strain 17OM39. The housekeeping Rrf2 family transcriptional regulator, group III, and DNA-directed RNA polymerase with all its subunits were observed in the genome. Different operon transcriptional repressors were also found in the genome, such as trehalose-, maltose- (MalR), fructose- and phosphosugar-binding (RpiR family) repressors and the heat-inducible repressor (HrcA), as well as repressors for the sucrose (ScrR), ribose, galactose and lactose operons. The genome also has the arginine pathway regulatory protein (ArgR), purine nucleotide synthesis repressor, arabinoside (GntR) and biotin operon repressors, osmoregulation-related genes, and zinc (ZUR) and ferric (FUR) uptake regulation proteins. A few DNA-binding regulators were also detected: a DNA-binding response regulator (OmpR family), a DNA-binding heavy metal response regulator and a DNA-binding response regulator of the AraC family.

Probiotic genes associated with the genome

The cholylglycine hydrolase (EC 3.5.1.24) gene responsible for bile salt hydrolysis was identified in two copies within the genome. Fibrinogen- and collagen-binding proteins were found in the genome; these may allow the strain to bind to the GI tract, suggesting an important role in adhesion to and colonisation of intestinal mucosal surfaces. Aggregation-promoting factor (apf), a protein that has been associated with self-aggregation and maintenance of cell shape, was identified in the strain 17OM39 genome. An EPS cluster consisting of 14 genes, including the highly conserved proteins EpsB–EpsF, was located in the genome. Genes imparting resistance to hydrogen peroxide, alkyl hydroperoxide reductase (ahp) and NADH peroxidase (npr), were also found in the genome, and this activity was evident in the experiments. In addition, based on previous studies, we screened for a set of genes involved in imparting important probiotic functions, as described in Table 8.

Table 8 Genes involved in imparting important probiotic functions

Absence of virulence determinants

We screened for the presence of virulence genes that are commonly known to be associated with other enterococci. Cytolysin (cyl), aggregation substance (as), gelatinase (extracellular metalloendopeptidase EC 3.4.24.30), hyaluronidase (efm), the two cell wall adhesins (efaAfs and efaAfm), two sex pheromones (cob and ccf), haemolysin (hlyA), serum resistance-associated gene (sra) and other virulence factors could not be traced in the genome.

Discussion

As a part of our research to catalogue and characterise the gut bacterial populations of the Indian population, we isolated an indigenous strain of E. faecium and comprehensively tested its probiotic potential. For successful colonisation, a bacterium needs to be able to tolerate low pH, the presence of bile and oxidative stress. The human intestinal tract has both facultative and obligate anaerobic bacteria. Addition of facultative bacteria to synthetic microbial communities has been shown to improve colonisation resistance in mice (Brugiroux et al. 2016). Hence, it is important to investigate the commensal facultative anaerobes from the human intestinal tract. In the present study, we used a combination of in vitro and in silico approaches to test the indigenous bacterial strain 17OM39 for its probiotic attributes.

First, we tested acid tolerance and found that strain 17OM39 was able to tolerate a low pH of 2.5 for 3 h. This is an important characteristic for a bacterium to be able to survive and become part of the natural microbial community. In many cases, bacteria that are unable to tolerate low pH, bile and transit through the intestinal tract require microencapsulation to aid successful delivery (Brugiroux et al. 2016). Along with the experimental evidence, we identified genes known to encode proteins involved in acid tolerance (lytr and ehr) within the genome (Lebeer et al. 2008). We also identified genes encoding bile resistance proteins (cdpA, clpE and dps) (Lebeer et al. 2008) and genes responsible for bile hydrolysis (bsh) (Begley et al. 2006). In vitro assays demonstrate that our strain is able to tolerate low pH and has 80% survivability in 0.3% bile. This suggests that our strain may not need protective encapsulation for successful transit through the host intestinal tract.

Next, we tested the aggregation and adhesion properties of strain 17OM39, as these are important features for long-term colonisation of the host intestinal tract. Our strain showed only marginally lower adhesion and autoaggregation than the control L. rhamnosus GG, a known probiotic (P value < 0.51, paired nonparametric “t test”). Genome analysis revealed the presence of a gene encoding aggregation-promoting factor. These proteins are associated with self-aggregation and maintenance of cell shape, and enhance the gastrointestinal persistence of the organism in vivo (Jankovic et al. 2003; Goh and Klaenhammer 2010). The genome of strain 17OM39 contained genes encoding fibrinogen- and collagen-binding proteins, suggesting an ability to adhere to host cells. We confirmed this by an adhesion assay using human HT-29 cells, and analysis of the cell surface showed a hydrophobic nature, which is helpful for adhesion. Adhesion of the strain to the intestinal mucosa helps to increase the persistence of probiotic cells in the gut, thereby enabling colonisation and competitive exclusion of pathogens (Greene and Klaenhammer 1994; Ouwehand and Salminen 2003). Further, the presence of the exopolysaccharide (EPS) gene cluster and the in vitro results for EPS production by strain 17OM39 confirm its ability to form EPS, which could aid in higher tolerance to environmental stress in the gastrointestinal tract (Mozzi et al. 2009; Ciszek-Lenda et al. 2011). Genes imparting resistance to hydrogen peroxide, alkyl hydroperoxide reductase (ahp) and NADH peroxidase (npr), were also found in the genome, and this activity was evident in the experiments. In addition, based on previous studies, we screened for a set of genes involved in imparting important probiotic functions, as described in Table 8.

Genome analysis of strain 17OM39 for insights into its carbohydrate metabolism revealed a diverse array of genes for utilising numerous carbon sources for energy and growth. Carbohydrate (CHO) metabolism was investigated using a two-step integrated approach. First, the genomic data were examined for the presence of genes encoding important enzymes involved in CHO metabolism, along with the genes encoding the respective transport systems. Second, the genomic data were supported with the results obtained from the API carbohydrate utilisation test. Together, the genomic and physiological features of the strain provide a comprehensive outline of the mechanism of carbohydrate utilisation, as depicted in Fig. 6. Our in silico analysis also revealed the presence of a specialised proteolytic system, the ‘Opp proteins’, which belong to a superfamily of highly conserved ATP-binding cassette transporters that mediate the uptake of casein-derived peptides (Kunji et al. 1995; Peltoniemi et al. 2002; Vermeulen et al. 2005). This specialised machinery, including the extracellular enzymes, drives the conversion of complex carbohydrates and proteins from food into forms absorbable by the human body and thus contributes substantially to energy extraction.

The metabolic capacity to assimilate nitrogen and sulphur was found to be associated with the genome, along with the potential of strain 17OM39 to convert nitrate to nitrite and then to reduced nitrogen compounds such as nitrous oxide, ammonia and urea. These compounds are known to play a role in microbiome development (Tiso and Schechter 2015). Strain 17OM39 has the potential to synthesise vitamins (folate, thiamine) and essential amino acids (methionine, lysine) along with ten other amino acids. The amino acids produced serve as precursors for the synthesis of short-chain fatty acids (Neis et al. 2015). Strain 17OM39 was found to be an effective reducer of cholesterol in vitro. Depletion of cholesterol in the host has been associated with protection against cardiovascular diseases (Pereira and Gibson 2002; Ha et al. 2006).

One concern regarding the use of strains of E. faecium is that this species includes both haemolytic and non-haemolytic strains (Semedo et al. 2003b). Several are considered to have pathogenic properties. However, E. faecium is a well-known commensal and commonly occurs in healthy individuals (Lebreton et al. 2014). Therefore, it is highly important to test E. faecium strains for pathogenicity for safety purposes. Strain 17OM39 showed alpha haemolytic activity and was sensitive to human serum. This suggests that strain 17OM39 is non-pathogenic and potentially a safe bacterium, as shown by in vitro testing. Antibiotic resistance is also a major concern in strains of E. faecium (Lebreton et al. 2014); strain 17OM39 was sensitive to most of the antibiotics tested and showed only intermediate resistance to a few others, such as ciprofloxacin.

In addition, strain 17OM39 was found to secrete an antimicrobial compound belonging to the bacteriocins (NRPS group), as evident from the antimicrobial assay and genomic analysis. Moreover, the antimicrobial compound was found to be specific against L. monocytogenes. We also investigated the genome for any signatures of mobile elements, which are known to play a major role in pathogenesis (Mikalsen et al. 2015). Strain 17OM39 did not harbour any virulence genes, plasmids, pathogenic IS elements, CRISPR elements or genomic islands implicated in virulence and antibiotic resistance in the bacterium (Schmidt and Hensel 2004; Gyles and Boerlin 2014; Louwen et al. 2014). The absence of genes for plasmid maintenance and transfer (oriT and tra) is an additional attribute of strain 17OM39. Based on the genome homology results, we further investigated its relatedness to the marketed probiotic strain T110. It is evident from the DDH, ANI calculation and RFLP analysis (all in silico) that strain 17OM39 is not a clonal strain of the marketed probiotic strain T110.

In summary, the integrated approach of genome analysis and physiological characterisation, with specific testing for probiotic potential, has given crucial insights into the potential of E. faecium 17OM39 as a probiotic. Strain 17OM39 fulfils the qualities of a probiotic tested here, and the genome insights have given more information on the molecular machinery potentially involved in the probiotic effects. Our work demonstrates the importance and usefulness of combining genome analysis with laboratory assays, including in vitro testing, for extensive characterisation of population-specific probiotics for the Indian population.