Introduction

Lactobacillus spp. have achieved popularity in the manufacturing of probiotic products because of their convincing beneficial effects on human health. However, before providing benefit to human health, probiotic bacteria must survive the industrial production processes and transit through the gastrointestinal tract [1, 2]. Bacterial two-component signal transduction systems (TCSs) play important roles in many bacteria by enabling them to detect and respond to diverse changes and stresses in the environment [3]. The genes of a TCS are typically located within the same operon and encode two signalling proteins: a transmembrane sensor histidine kinase (HK) and a cytoplasmic response regulator (RR). The two may sometimes be carried by a single polypeptide, forming a hybrid HK [3].

Individual HKs contain a conserved kinase core and respond to environmental signals by autophosphorylation of a histidine residue, creating a high-energy phosphoryl group, which is then transferred to an aspartate residue in the RRs. The RRs, which are usually transcriptional regulators, contain a conserved regulatory domain. Phosphorylated RRs then activate downstream specific responses [3]. In most HKs, the transmitter domain shows high sequence conservation, especially within a set of six recognizable motifs or boxes designated H, N, F, G1, G2, and G3. In particular, the H box contains an invariant H residue that is autophosphorylated in an ATP-dependent manner [4]. RRs generally contain at least two functional domains: a conserved N-terminal receiver domain (REC domain) that is phosphorylated by the HK at a strictly conserved D residue, and one or more variable C-terminal output domains [5]. Modulation of the phosphorylated state of the RR controls either expression of the target genes or cellular behaviour.

Lactobacillus casei is a facultative heterofermentative lactic acid bacterium. It has traditionally been recognized as a probiotic and used in commercial products for its health-promoting and nutritional properties [6–8]. L. casei requires a complex array of TCS proteins to cope with diverse human hosts, host responses, and environmental conditions. The TCS MaeKR, which belongs to the citrate family, is essential for expression of the malic enzyme in L. casei strains BL23 and ATCC 334, and MaeKR expression is induced by L-malic acid [9]. The genome sequences of L. casei strains BL23 and ATCC 334 harbor 17 putative TCSs, three of which have been shown to be involved in bile response, cell envelope stress response, oxidative stress tolerance, and acid tolerance [10]. However, the role of TCSs in L. casei is still not well understood.

With the advance of large-scale sequencing technologies and bioinformatics tools, it has become possible to computationally predict TCS proteins and their putative functions from the whole genome of an organism. The availability of complete genome sequences of six L. casei strains enables a more comprehensive study of the role of TCSs in the stress response of this organism. In this study, we conducted a thorough comparative analysis of the identified TCS proteins, which provides valuable insights into the conservation and divergence of TCS proteins in the L. casei strains studied here.

Materials and Methods

Data Collection

Complete genome sequences of the L. casei strains ATCC 334, LC2W, BD-II, BL23, W56 and str. Zhang were collected from the National Center for Biotechnology Information (NCBI) (ftp.ncbi.nih.gov/genomes/Bacteria/). The genomes used in this study are listed in Table 1.

Table 1 General information on the six sequenced L. casei genomes

Identification of HKs and RRs

The approach used to identify putative HKs and RRs from the complete genome sequences of L. casei strains ATCC 334, LC2W, BD-II, BL23, W56 and str. Zhang was similar to that described previously [11]. Briefly, the profile HMM that targets the HisKA family of HKs (accession number PF00512) was obtained from the Pfam protein families database [12] and used to recognize the HKs in the L. casei genomes. A second Pfam profile HMM, which targets the RR REC domain (accession number PF00072), was used to recognize the RRs in each L. casei genome. Recovered HK sequences were further scrutinized according to the following criteria: (i) the HATPase domain had to be located in the C-terminus (last 2/3) of the encoded protein and (ii) a putative H-box had to precede the HATPase domain. Detection of HK–RR gene pairs and ‘orphan’ HK and RR genes was also carried out as described previously [11].
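The two screening criteria above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the `DomainHit` record and the function name are hypothetical, a HisKA hit is assumed to stand in for the putative H-box, and "last 2/3" is assumed to mean that the HATPase domain starts beyond the first third of the protein.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DomainHit:
    """One domain hit on a protein (hypothetical record, 1-based coordinates)."""
    name: str   # Pfam-style domain name, e.g. "HisKA" or "HATPase_c"
    start: int
    end: int


def passes_hk_criteria(protein_length: int, hits: List[DomainHit]) -> bool:
    """Apply the two positional criteria to the domain hits of one protein."""
    hiska_hits = [h for h in hits if h.name == "HisKA"]
    hatpase_hits = [h for h in hits if h.name == "HATPase_c"]
    for atp in hatpase_hits:
        # (i) HATPase domain lies in the last two thirds of the protein
        # (assumption: judged by the domain start position).
        in_c_terminal_part = atp.start > protein_length / 3
        # (ii) a putative H-box (assumption: a HisKA hit) precedes it.
        h_box_precedes = any(h.start < atp.start for h in hiska_hits)
        if in_c_terminal_part and h_box_precedes:
            return True
    return False


# Toy example with invented coordinates.
candidate = [DomainHit("HisKA", 230, 295), DomainHit("HATPase_c", 340, 450)]
print(passes_hk_criteria(460, candidate))
```

In practice the hits would come from a profile HMM search against each proteome; the coordinates here are invented purely to exercise the filter.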

Identification of Common and Unique TCS Proteins

TCS proteins that are common or unique among L. casei strains were identified through ortholog analysis. The ortholog groups were constructed using the OrthoMCL-DB tool (http://www.orthomcl.org) [13]. Briefly, HK protein sequences of the L. casei strains were submitted to OrthoMCL-DB for ortholog group identification, and HK proteins belonging to the same orthomcl_group were recognized as a common TCS protein.
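Once each protein has been assigned to an ortholog group, sorting the groups into common and strain-specific ones is a simple set operation. The sketch below shows one way to do this in Python. The group identifiers, protein identifiers, and the three labels are hypothetical and are not taken from OrthoMCL-DB output.

```python
from collections import defaultdict

STRAINS = ("ATCC 334", "LC2W", "BD-II", "BL23", "W56", "Zhang")


def classify_groups(assignments, strains=STRAINS):
    """Label each ortholog group by how many strains carry a member.

    assignments: iterable of (strain, protein_id, group_id) tuples.
    Returns {group_id: "common" | "unique" | "variable"}.
    """
    strains_per_group = defaultdict(set)
    for strain, _protein_id, group_id in assignments:
        strains_per_group[group_id].add(strain)

    labels = {}
    for group_id, found in strains_per_group.items():
        if found == set(strains):
            labels[group_id] = "common"      # present in every strain
        elif len(found) == 1:
            labels[group_id] = "unique"      # present in exactly one strain
        else:
            labels[group_id] = "variable"    # missing from some strains
    return labels


# Invented assignments: group "G1" is found everywhere, "G2" only in W56.
toy = [(s, f"hk_{i}", "G1") for i, s in enumerate(STRAINS)]
toy.append(("W56", "hk_extra", "G2"))
print(classify_groups(toy))
```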

Bioinformatic Analysis

Protein domain organizations of the HKs and RRs were identified using SMART (smart.embl-heidelberg.de) [14]. Domain limits for proteins were also derived from the graphical output of the SMART web interface. Transmembrane helices of HKs were predicted by the TMHMM2 program (http://www.cbs.dtu.dk/services/TMHMM/) [15]. Phylogenetic trees of the HKs and RRs were built by the software MEGA version 4 [16].

PCR Verification

To verify the presence of the 15 TCSs in L. casei, PCR amplification was performed with DNA from two sequenced L. casei strains (ATCC334 and LC2W) and five isolated strains. The primers were designed using Primer-BLAST on the NCBI website. The PCR conditions were: 94 °C for 2 min; 30 cycles of 94 °C for 30 s, annealing at 58 °C for 30 s, and 72 °C for 30 s; and a final extension at 72 °C for 5 min. The amplified PCR products were resolved in a 1.5 % agarose gel.
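As a quick arithmetic check of the cycling program described above, the hold times can be summed in a few lines of Python. The function name is hypothetical, and the result counts programmed hold times only, ignoring the ramp time of the thermocycler between steps.

```python
def pcr_program_seconds(initial, cycles, cycle_steps, final):
    """Total programmed hold time in seconds (ramp times not included)."""
    return initial + cycles * sum(cycle_steps) + final


# 94 °C for 2 min; 30 cycles of 30 s + 30 s + 30 s; 72 °C for 5 min.
total_s = pcr_program_seconds(initial=120, cycles=30,
                              cycle_steps=(30, 30, 30), final=300)
print(total_s, "s =", total_s / 60, "min")
```

Under these assumptions the program amounts to 3120 s, or 52 min, of hold time.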

Results and Discussion

Identification of TCS Proteins of L. casei Strains

The putative HKs and RRs in the six L. casei strains were identified by searching the complete genome sequences for proteins containing HK and RR domains using Pfam HMM profiles. The repertoires of potential TCS proteins (HKs and RRs) obtained are shown in Table 2. Analysis of the putative operon organization of the genes encoding the identified TCS proteins showed that 98.9 % of the putative HKs and 93.1 % of the putative RRs constitute HK–RR pairs. No hybrid HKs were detected in any of the genomes of the six L. casei strains compared in this study.

Table 2 Identification of putative two component systems in the six sequenced L. casei strains

Ortholog Analysis of TCS Proteins

Ortholog analysis of the paired and non-paired TCS proteins among the six L. casei strains revealed a total of 15 different TCS clusters, one orphan HK and three orphan RRs (Table 3). Co-evolution of TCS proteins could be clearly observed: the HK and RR that belong to a particular TCS cluster are usually either both present or both absent in a given strain. Twelve of the 15 TCS clusters (designated TCS-2, TCS-4, TCS-5, TCS-7, TCS-8, TCS-9, TCS-10, TCS-11, TCS-12, TCS-13, TCS-14, and TCS-15) were common to all the strains. Three clusters (TCS-1, TCS-3, and TCS-6) were absent in one or several strains. One orphan HK was uniquely present in L. casei W56 (BN194_07810, named orphan HK1). In contrast, an orphan RR (orphan RR1, LSEI_2389 in L. casei ATCC 334) was common to all strains except L. casei BD-II. L. casei W56 harbored an additional unique orphan RR (BN194_02120, orphan RR3). In addition, a clear clustering of HK and RR orthologs can be seen in the phylogenetic tree shown in Fig. 1, which also illustrates the relationships between the different TCS clusters.
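The distribution described above can be restated as a presence/absence matrix. The Python sketch below encodes only the absences given in the text (TCS-1 and TCS-3 in W56, TCS-6 in BD-II), treats each cluster as a unit in line with the co-presence of its HK and RR, and assumes every other cluster is present in all six strains. It is an illustration of the tally, not a substitute for Table 3.

```python
STRAINS = ("ATCC 334", "LC2W", "BD-II", "BL23", "W56", "Zhang")

# Absences as stated in the text; all other clusters are assumed
# to be present in every strain.
ABSENT_IN = {1: {"W56"}, 3: {"W56"}, 6: {"BD-II"}}


def presence_matrix(n_clusters=15):
    """Map each cluster name to {strain: True/False} for presence."""
    return {
        f"TCS-{i}": {s: s not in ABSENT_IN.get(i, set()) for s in STRAINS}
        for i in range(1, n_clusters + 1)
    }


def core_clusters(matrix):
    """Clusters present in every strain, in cluster-number order."""
    return [name for name, row in matrix.items() if all(row.values())]


matrix = presence_matrix()
core = core_clusters(matrix)
print(len(core), "core clusters:", ", ".join(core))
```

Run as written, the tally reproduces the twelve clusters listed above as common to all strains.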

Table 3 Ortholog analysis and classifications of the putative TCS proteins in the six sequenced L. casei strains

Fig. 1 Phylogenetic trees of the paired HKs and RRs in the six sequenced L. casei strains. The trees were constructed using MEGA version 4 by applying the neighbor-joining method. The scale bar is shown above the trees and the scale is in units of “substitution/site”

Classification of HKs Based on Domain Architecture Analysis

Using the classification method as previously described [17], the putative HKs were grouped into three different groups: extracytoplasmic-sensing HKs, cytoplasmic-sensing HKs, and membrane-sensing HKs (HKs with sensing mechanisms associated with membrane-spanning helices), as shown in Fig. 2.

Fig. 2 Domain architectures of histidine kinases representative of each TCS cluster. The pictorial depiction is based on predictions carried out using the SMART web interface (http://smart.embl-heidelberg.de/). The transmembrane helices (TMs) were predicted using the tool TMHMM. C, E and M stand for cytoplasmic, extracytoplasmic and membrane sensing, respectively

Among all the HKs identified, the HKs of TCS-3 and TCS-4 were recognized as extracytoplasmic-sensing HKs because their N-terminal region displays a putative extracytoplasmic signal perception domain flanked by (at least) two transmembrane helices (TMs). The cytoplasmic part of these HK proteins, which harbors the transmitter domain, comprised either a HisKA-HATPase_c domain (HK of TCS-3) or a PAS-HisKA-HATPase_c domain (HK of TCS-4). Per-Arnt-Sim (PAS) domains play important roles as sensory modules for sensing oxygen tension, cellular redox state, or light intensity [18]. Most PAS domain-containing proteins are located intracellularly and monitor both the external and internal environments by perceiving alterations in the electron transport system caused by intracellular or extracellular changes in redox potential [19]. It should be noted that a region of low compositional complexity was found between two TM regions of the TCS-3 HK. The region starts at position 121 and ends at position 135.

The HKs of seven TCS clusters (the soluble HK of TCS-11 and the membrane-anchored HKs of TCS-2, TCS-6, TCS-7, TCS-10, TCS-14, and TCS-15), together with orphan HK1, were identified as HKs with putative cytoplasmic sensing functions.

The HKs of TCS-2, TCS-6, TCS-11, TCS-14, and TCS-15 possess a HAMP domain (named after its occurrence in histidine kinases, adenylyl cyclases, methyl-accepting chemotaxis proteins and phosphatases). HAMP functions as a linker that bridges the transmembrane helix and the transmitter domain [20]. The HK of TCS-15 possesses both PAS and PAC domains. It has been reported that PAS domains are often associated with PAS-associated C-terminal (PAC) domains, and that the two are directly linked and together form the conserved 3D PAS fold [21], as exemplified by the HK of TCS-15 in this study.

The HKs of TCS-1, TCS-5, TCS-8, TCS-9, TCS-12 and TCS-13 were all found to belong to the membrane-sensing HK group, indicating that a relatively high percentage of the HKs of the L. casei strains are involved in sensing signals directly associated with the membrane.
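The three-way grouping used in this section can be caricatured with a toy rule set in Python. This is a rough heuristic for illustration only, not the classification procedure of [17] or the one applied here: the function name, the 50-residue loop threshold, and the `cytoplasmic_sensor` flag are all assumptions, and the predicted topology (which loops actually face outward) is ignored.

```python
def classify_hk(tm_helices, cytoplasmic_sensor=False, min_sensor_loop=50):
    """Return 'E', 'M' or 'C' for one histidine kinase (toy heuristic).

    tm_helices: list of (start, end) residue positions of predicted TMs.
    cytoplasmic_sensor: True if a cytoplasmic sensory domain (for example
        PAS) is annotated; supplied by the caller.
    min_sensor_loop: hypothetical minimum loop length between two TMs
        taken to indicate an extracytoplasmic sensing domain.
    """
    helices = sorted(tm_helices)
    # Lengths of the loops between consecutive transmembrane helices.
    loops = [nxt_start - prev_end - 1
             for (_, prev_end), (nxt_start, _) in zip(helices, helices[1:])]
    if loops and max(loops) >= min_sensor_loop:
        return "E"   # long loop flanked by two TMs
    if len(helices) >= 2 and not cytoplasmic_sensor:
        return "M"   # several TMs joined by short loops
    return "C"       # soluble, or anchored with a cytoplasmic sensor


print(classify_hk([(10, 32), (180, 202)]))          # long loop between TMs
print(classify_hk([(10, 32), (40, 62), (70, 92)]))  # short loops only
print(classify_hk([]))                              # no TMs predicted
```

A rule set this simple will misplace some proteins (for instance, membrane-anchored kinases whose only sensory element is cytoplasmic but unannotated), which is why the groupings in Fig. 2 rest on inspection of the full domain architecture.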

Classification of RRs

The putative RRs identified in this study were classified into three families: CitB, LytT and OmpR, with RRs of the OmpR family constituting the largest group. The assignment of the RRs of the 15 TCS clusters and of the orphan RRs to the corresponding RR families is given in Table 3.

The RRs of TCS-8 and TCS-10 each belong to the CitB family. Members of the CitB family have been documented to control expression of the genes for citrate fermentation in response to external citrate under anaerobic conditions [22, 23], and to have an effect on the inheritance of iteron-containing plasmids and on the SOS response to β-lactam antibiotics [24, 25].

The RR of TCS-13 and orphan RR1 each belong to the LytT family. RR proteins of the LytT family are characterized by a non-HTH DNA-binding domain and modulate the expression of many genes coding for virulence factors, fimbriae, cell wall components, bacteriocins, extracellular polysaccharides, etc. [26, 27].

The RRs of the other 12 TCS clusters and the two remaining orphan RRs each belong to the OmpR family. Proteins of the OmpR family have been reported to mediate a wide range of biological functions related to, for example, osmolarity, phosphate assimilation, antibiotic resistance, virulence and toxicity [28].

TCS Proteins Common to All the Six L. casei Strains

TCSs are conserved in closely related microorganisms [29–31]. Ancient TCSs, on one hand, may have maintained basic functions in different bacteria, and on the other hand, may also have evolved new functionalities in niche-specific bacteria.

Proteins of the TCS clusters 2, 4, 5, 7, 8, 9, 10, 11, 12, 13, 14, and 15 are common to all six L. casei strains compared here, which probably reflects the functional importance of these TCS clusters for the adaptation and survival of these L. casei strains, most of which were isolated from dairy products (Tables 1, 3, 4). For instance, TCS-12 is highly conserved across the six L. casei strains. TCS-12 is homologous to the three paralogous TCSs of Bacillus subtilis, BceRS, YvcPQ, and YxdJK, which are involved in the cell envelope stress response to nisin [10, 32]. TCS-12 of L. casei BL23 has been found to play a vital role in growth in a low-pH environment [10]. Therefore, it is conceivable that conservation of TCS-12 across the L. casei strains is essential for their acid tolerance. These functional similarities and differences suggest that, although the core TCSs are conserved in L. casei, they may have developed new niche-specific functions during evolution.

Table 4 A brief summary of known/putative functions of the TCSs identified in L. casei BL23

L. casei strains have achieved popularity in the manufacture of probiotic products because of their convincing beneficial effects on human health. However, before providing benefit to human health, L. casei strains have to survive the difficult journey through the human gastrointestinal tract in sufficient densities in the presence of bile salts [33]. The response of L. casei to bile salt stress is a complex physiological one and involves, in particular, cell protection (DnaK and GroEL), modifications of the cell membrane (NagA, GalU, and PyrD), and key components of central metabolism (PFK, PGM, CysK, LuxS, PepC, and EF-Tu) [34].

Among the TCSs identified in this study, the TCS-1 and TCS-6 clusters are involved in the response to bile salt stress, and TCS-12 is involved in acid tolerance. In addition, the TCS-6 cluster is involved in tolerance to oxidative stress, including H2O2 stress. These TCSs are therefore likely to help L. casei survive the difficult journey through the human gastrointestinal tract.

TCS Proteins Uniquely Present/Absent in One or Several Strains

The TCS-1 cluster, which is involved in cell envelope stress tolerance and in the response to bile, NaCl and antimicrobials in L. casei BL23 (Table 4) [10], was predicted to be absent in L. casei W56. TCS-3 was also absent in L. casei W56. Taken together, these findings suggest marked differences in how L. casei W56 regulates its response to bile, NaCl, antimicrobials and cell envelope stress in comparison with the other L. casei strains.

The TCS-6 cluster, which is involved in the cell envelope stress response, nisin resistance, the bile and NaCl response, oxidative stress tolerance and H2O2 stress tolerance in L. casei BL23 [10, 35–37], could not be found in L. casei BD-II. Orphan RR1 was also absent in L. casei BD-II.

It has been suggested that specific TCS systems may play critical roles in microbe–host relationship, such as the HrpXY system in plant enterobacteria, which regulates type III secretion [38].

PCR Verification of Predicted TCSs

To verify the presence of the predicted TCSs in L. casei, 15 primer pairs were designed, one for each of the 15 TCS genes (Supporting Information, Table S1). PCR amplifications using these primers were performed with two sequenced L. casei strains (L. casei ATCC334 and LC2W), four isolated L. casei strains (BD00054, BD00090, BD01649, and BD01803) and one isolated L. paracasei strain (BD03416). All primer sets exhibited 100 % inclusivity for the six L. casei strains (Supporting Information, file 1). However, no clear products were obtained from the isolated strain L. paracasei BD03416, which needs to be clarified further. Typical data are shown in Fig. 3. These results support the bioinformatic identification of these TCSs in L. casei strains.

Fig. 3 PCR verification of the presence of TCS genes in Lactobacillus casei strains. Agarose gel electrophoresis of PCR products amplified using TCS-2 primers. Lane 1, Lactobacillus casei ATCC334; Lane 2, Lactobacillus casei LC2W; Lane 3, Lactobacillus casei BD00054; Lane 4, Lactobacillus casei BD00090; Lane 5, Lactobacillus casei BD01649; Lane 6, Lactobacillus casei BD01803; Lane 7, Lactobacillus paracasei BD03416; Lane 8, negative control (ddH2O); M, 100 bp DNA marker

Conclusion

In the present study, we conducted a genome-wide identification, classification, and ortholog analysis of the TCS proteins in six sequenced L. casei strains. In total, 15 TCS clusters comprising HK–RR pairs were identified, of which 12 are shared by all six strains compared and three are each absent in one strain. In addition, one orphan HK and three orphan RRs were identified. We believe that the results of this genome-level study will be helpful for the design of physiological studies, which in turn will lead to a better understanding of the response mechanisms that allow L. casei strains to survive in the gastrointestinal tract.