Introduction

All living cells contend with differences of water activity across their membranes and osmoregulation is found to be universal. Bacterial cells maintain an outwardly directed turgor pressure through the accumulation of solutes against their chemical gradient, leading to water flow into the cytoplasm (Epstein 1986). When subjected to hypo-osmotic stress (downshock), bacteria release solutes from the cytoplasm, thus preventing excessive water inflow (Berrier et al. 1992; Levina et al. 1999). The major proteins involved in fast solute release are the Mechanosensitive (MS) channels (Martinac et al. 1987, 1990; Sukharev et al. 1994). In Escherichia coli, rapid release of osmolytes is required for survival of extreme turgor by activation of MscL and MscS channels (Levina et al. 1999; Booth et al. 2007). These channels are the main signalling molecules of mechanosensory transduction, because they convert mechanical forces in the lipid bilayer of cellular membranes into electrical and/or biochemical signals (Perozo 2006).

The mechanosensitive channel of large conductance (MscL) exhibits the largest conductance (greater than 2.5 nS), opens at high membrane tension (close to the lytic limit of the bilayer) and has fast kinetics (Perozo 2006). Each MscL subunit is a 136 amino acid polypeptide predicted to have two transmembrane helices, TM1 and TM2 (Sukharev et al. 1994; 1999). The crystal structure of the Mycobacterium tuberculosis ortholog TbMscL (151 amino acids), solved at 3.5 Å, revealed a pentameric channel in an apparently closed state (Chang et al. 1998). The core of the protein is formed by tightly packed TM1 (residues 15–45) and TM2 (residues 76–100) transmembrane helices from all five subunits. The TM1 and TM2 helices are linked by a periplasmic loop. The COOH-terminal part of the protein (residues 101–110) adopts an α-helical bundle conformation and resides in the cytoplasm (Sukharev et al. 2001a, b; Gullingsrud and Schulten 2003; Maurer et al. 2008), while the segment S1 at the N-terminus is in close proximity to the cytoplasmic interface (Iscla et al. 2008; Steinbacher et al. 2007). In contrast, the MS channel of small conductance (MscS) is a 0.8–1 nS channel opened by moderate pressure (Perozo 2006). The structure of the EcoMscS channel from E. coli (286 amino acids) was originally solved at 3.9 Å (Bass et al. 2002) and more recently at 3.45 Å in an open conformation (Wang et al. 2008). However, unlike the more uniform MscL family, the MscS homologs form a much larger family with more than 18 subfamilies of different peptide lengths, extending from Archaea to plants (Pivetti et al. 2003). EcoMscS is a homoheptamer, each subunit consisting of two major domains: the amino-terminal membrane domain and the carboxy-terminal cytoplasmic domain. Each subunit has three TM helices (TM1–TM2–TM3). TM3 (residues 96–127) lines the channel pore and contains predominantly hydrophobic residues (Bass et al. 2002; Miller et al. 2003a).
The large (~17 kDa) cytoplasmic carboxy-terminal domain exhibits three major sub-domains. A β sub-domain (residues 132–177) is linked to a mixed αβ sub-domain (residues 188–265) and the protein terminates in a β barrel (residues 271–280). The cytoplasmic domain creates a large vestibule that is perforated by seven 10 × 8 Å lateral portals created by the junctions of the β and αβ sub-domains and an axial portal formed by the β barrel (Bass et al. 2002; Wang et al. 2008).

Herein, we have used a profile-to-profile alignment strategy to align the TM1, TM2 and cytoplasmic helix of 231 MscL homologs and separately the TM3 and αβ sub-domain of 309 MscS homologs. Using this approach, we have precisely aligned these repeated regions, and identified new conserved residues and consensus motifs. Sequences from Archaea and Eukarya were included, since the mechanism of mechanotransduction is essential for osmoregulation, and therefore seems ubiquitous in life (Kloda and Martinac 2001, 2002).

Methods

Data collection

MS sequences were collected manually from the NCBI database (http://www.ncbi.nlm.nih.gov) by keyword search, using the sequence of the whole protein for the search. We selected protein sequences based on their original annotation as MscL or MscS homologs and confirmed them by BLAST. To obtain a more comprehensive representation of the conserved motifs of these channels, a representative sequence from every genus available in the NCBI database was selected and aligned. In a few cases, sequences from several species within the same genus were included in the analysis. For the MscL family, we selected sequences from 143 gram-negative Bacteria, 66 gram-positive Bacteria, 8 Archaea and 14 Eukarya (see Supplementary Table S1). For the MscS family, we examined 204 sequences from gram-negative Bacteria, 56 from gram-positive Bacteria, 34 from Archaea and 15 from Eukarya (see Supplementary Table S2).
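As a quick consistency check, the group counts above can be summed to recover the totals quoted elsewhere in the text (231 MscL and 309 MscS homologs, of which 209 and 260 are bacterial). The following is only a bookkeeping sketch; the dictionary names are illustrative and not part of the original analysis.

```python
# Sequence counts per taxonomic group, as reported in the Data collection text.
mscl_counts = {
    "gram-negative Bacteria": 143,
    "gram-positive Bacteria": 66,
    "Archaea": 8,
    "Eukarya": 14,
}
mscs_counts = {
    "gram-negative Bacteria": 204,
    "gram-positive Bacteria": 56,
    "Archaea": 34,
    "Eukarya": 15,
}


def summarize(counts):
    """Return (total sequences, bacterial sequences) for one channel family."""
    total = sum(counts.values())
    bacteria = sum(n for group, n in counts.items() if group.endswith("Bacteria"))
    return total, bacteria


mscl_total, mscl_bacteria = summarize(mscl_counts)
mscs_total, mscs_bacteria = summarize(mscs_counts)
print("MscL:", mscl_total, "total,", mscl_bacteria, "bacterial")
print("MscS:", mscs_total, "total,", mscs_bacteria, "bacterial")
```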

Building alignments and identification of conserved motifs

Data analysis was carried out separately for each taxonomic group, and we generated consensus sequences for each domain using multiple alignments with Clustal-W (Thompson et al. 1994). The regions analyzed for MscL (TM1–TM2–Cyt-H) and MscS (TM3 and Cyt-terminus) were delimited based on the crystal structures of both channels (Chang et al. 1998; Bass et al. 2002; Wang et al. 2008), taking into account the most conserved regions of each alignment. Sequence logos were generated using the WebLogo program (http://weblogo.berkeley.edu/logo.cgi) (Crooks et al. 2004). Consensus motifs were determined from the multiple alignments using a conservation threshold of ≥50% in each sequence logo, and are presented in the PROSITE format (http://www.expasy.org/prosite) (Sigrist et al. 2002). For the logos from Bacteria, we followed the numbering of E. coli (MscL and MscS); for the logos from Archaea, we used the numbering of Methanoregula boonei (MscL) and Methanocaldococcus jannaschii (MscS). For the logos from Eukarya, we followed the numbering of Aspergillus clavatus for MscL and Arabidopsis thaliana (AtMSL3) for MscS. In all cases, our analysis was compared to the E. coli numbering. Results from structure–function and mutational studies were collected from PubMed literature searches. It is important to note that, because of the limited number of sequences from Archaea and Eukarya available in the protein database, our analysis may underestimate the conservation shown in the corresponding logos. However, the method incorporates a sample correction, which can diminish this problem (Crooks et al. 2004).
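The thresholding step described above can be illustrated with a minimal sketch. It assumes an already aligned set of equal-length sequences and reports a position only when a single residue reaches the threshold; the bracketed alternatives (for example [VIML]) that appear in the motifs reported here are not reproduced by this simplification. The function names and the toy alignment are hypothetical and are not the procedure actually used with Clustal-W and WebLogo.

```python
from collections import Counter


def column_frequencies(alignment):
    """Per-column residue fractions for equal-length aligned sequences (gaps '-' ignored)."""
    profiles = []
    for column in zip(*alignment):
        residues = [r for r in column if r != "-"]
        counts = Counter(residues)
        profiles.append({res: n / len(residues) for res, n in counts.items()})
    return profiles


def prosite_consensus(alignment, threshold=0.5):
    """Build a PROSITE-style pattern: a residue where conservation >= threshold, else x."""
    elements, wildcard_run = [], 0
    for freqs in column_frequencies(alignment):
        residue, fraction = max(freqs.items(), key=lambda kv: kv[1], default=("x", 0.0))
        if fraction >= threshold:
            if wildcard_run:
                # Collapse consecutive non-conserved columns into x or x(n).
                elements.append("x" if wildcard_run == 1 else f"x({wildcard_run})")
                wildcard_run = 0
            elements.append(residue)
        else:
            wildcard_run += 1
    if wildcard_run:
        elements.append("x" if wildcard_run == 1 else f"x({wildcard_run})")
    return "-".join(elements)


# Hypothetical toy alignment (not real MscL or MscS data).
toy_alignment = ["GAKFSV", "GLRFTV", "GKDYSV", "GMEFAI"]
print(prosite_consensus(toy_alignment))  # prints G-x(2)-F-S-V
```

Raising the threshold makes the pattern sparser: with threshold=0.8 the same toy alignment yields G-x(5).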

Results and discussion

In this study, a sequence logo generator was used to obtain sequence logos from the homologs of MscL and MscS channels. The logo generator program supplements the consensus sequence with statistical information about the relative frequency and conservation of each residue and the chemical composition of each motif. Thus, each logo gives a richer and more accurate description of sequence similarity. At present, multiple alignments have been reported for MscL (Chang et al. 1998; Pivetti et al. 2003; Moe et al. 1998; Oakley et al. 1999; Spencer et al. 1999) and MscS channels (Pivetti et al. 2003; Miller et al. 2003a; Koprowski and Kubalski 2003). Here, the conserved residues from MscL and MscS channels have been systematically identified and compared. To explore the homology of the sequences before generating separate sequence logos, we compared representative sequences of each channel from Bacteria, Archaea and Eukarya at the same time. This combined analysis shows that some residues are well conserved throughout the three domains of life (Figure S1). The most salient feature of this comparison is that the motifs analyzed are similar within each major group and show good conservation between the three domains of life. Our comparative approach illustrates the high conservation and relative importance of TM1–TM2 for MscL and TM3–β-subdomain for MscS. This information is important to obtain a more precise representation of conserved motifs that might help, for instance, in mutational or structural studies. From these results, several characteristics common to all MscL and MscS homologs can be identified in future studies.
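To make the notion of conservation in a logo concrete, the sketch below computes the information content (in bits) of a single alignment column, using the commonly cited approximate small-sample correction e(n) = (s − 1) / (2 ln 2 · n) with s = 20 for proteins. This is an illustration under those assumptions; it is not claimed to reproduce the exact internals of the WebLogo server, and the function name is hypothetical.

```python
import math
from collections import Counter


def information_content(column, alphabet_size=20):
    """Information content (bits) of one alignment column, with small-sample correction."""
    residues = [r for r in column if r != "-"]
    n = len(residues)
    if n == 0:
        return 0.0
    counts = Counter(residues)
    # Shannon entropy of the observed residue distribution.
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    # Approximate correction for the bias introduced by a small number of sequences.
    correction = (alphabet_size - 1) / (2 * math.log(2) * n)
    return max(0.0, math.log2(alphabet_size) - entropy - correction)


# A fully conserved glycine column seen in 209 sequences versus only 8 sequences.
print(round(information_content("G" * 209), 2))
print(round(information_content("G" * 8), 2))
```

With the same perfect conservation, the column built from 8 sequences scores noticeably lower than the one built from 209, which illustrates why logos for the smaller archaeal and eukaryotic sets may understate conservation.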

MscL protein: conserved motifs in TM1

In the 209 bacterial sequences compared (E. coli numbering), the highly conserved residues (≥90%) for this segment are aspartate/glutamate (D/E18), alanine (A) at positions 20 and 28, glycine G26, phenylalanine F29, valine V33/V37 and proline P43. We found the following consensus motif for TM1 of MscL from Bacteria: [VIML]-D-[LM]-A-[VI]-[GA]-[VI]-[IV]-I-G-[AG]-A-F-[GTS]-x-I-V-x-[SA]-[LFV]-[VT]-x-D-[IVL]-[ILVF]-[MNT]-P-x-[IVL]. Similarly, for the eight sequences from Archaea, we identified the following conserved residues (≥50%) (Methanoregula boonei numbering): G50, S59, V61, M66 and P67. In this segment, we found the consensus motif G-x(8)-S-x-V-x(4)-M-P, where G50, S59, V61, M66 and P67 are equivalent to G26, S35, V37, M42 and P43 in Bacteria, respectively. In Eukarya (14 sequences compared), we found six well-conserved residues (≥70%): A41, G43, F50, S56, V58 and P64 (Aspergillus clavatus numbering). The consensus sequence found here is: [AE]-x-G-L-I-x-A-x(2)-F-T-x(2)-V-x-S-x-V-x-[DN]-[IV]-x-[LM]-P. Again, we found a good correspondence with the sequences from Bacteria, where A41, G43, F50, S56, V58 and P64 correspond to A20, G22, F29, S35, V37 and P43 in E. coli, respectively (Fig. 1; Table 1). We found that the most conserved residues in TM1 present in all three domains are equivalent to A20, I24, V33, S35, V37, I40 and P43 from E. coli (see Figure S2).
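Motifs written in the PROSITE format can be used directly to scan other sequences once they are translated into regular expressions. The sketch below handles only the subset of the syntax used in this report (single residues, bracketed alternatives, x and x(n)); the helper name and the toy sequence are hypothetical, and the toy sequence is not a real MscL protein.

```python
import re

# One PROSITE element: x, a single residue, or a bracketed set, optionally followed by (n).
ELEMENT = re.compile(r"(x|[A-Z]|\[[A-Z]+\])(?:\((\d+)\))?")


def prosite_to_regex(pattern):
    """Translate a simple PROSITE-style pattern into a Python regular expression."""
    parts = []
    for element in pattern.split("-"):
        match = ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core, repeat = match.groups()
        core = "." if core == "x" else core
        parts.append(core + (f"{{{repeat}}}" if repeat else ""))
    return "".join(parts)


archaeal_tm1 = "G-x(8)-S-x-V-x(4)-M-P"  # motif reported above for Archaea
regex = prosite_to_regex(archaeal_tm1)
print(regex)  # prints G.{8}S.V.{4}MP

# Hypothetical toy sequence constructed to contain the motif.
toy_sequence = "MK" + "G" + "A" * 8 + "S" + "L" + "V" + "I" * 4 + "MP" + "RR"
hit = re.search(regex, toy_sequence)
print(hit.start(), len(hit.group()))  # prints 2 18
```

The matched span covers 18 residues, in agreement with the distance from G50 to P67 in the M. boonei numbering.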

Fig. 1

Conserved motifs in the TM1 modules of MscL from Bacteria (numbering of E. coli), Archaea (numbering of M. boonei) and Eukarya (numbering of A. clavatus). Residues are distinguished by colors: AVLIPWFM (hydrophobic, red), KRH (positively charged, green), DE (negatively charged, blue) and CGNQSTY (polar but uncharged, black). Bold double-arrows show high conservation (≥50%) of the identical residue in the three domains. Gray double-arrows show high conservation of the same residue in two domains and low conservation (<50%) for the same residue in the third domain. Single arrows show high conservation (≥50%) of the same residue only in two domains. The data for these logos consists of 209 sequences from Bacteria, 8 from Archaea and 14 from Eukarya

Table 1 Conserved residues and motifs in MS channels from Bacteria, Archaeobacteria and Eukarya

MscL protein: conserved motifs in TM2

We found that TM2 sequences from the three domains of life are rich in hydrophobic residues, principally F, I, A and V. In bacterial sequences, the highly conserved residues (≥80%) are G76, F78, F85, I87, A89 and F93. We identified the following consensus motif for TM2 from Bacteria: Y-G-x-F-[IL]-x(3)-[IVLF]-[ND]-F-[LVI]-[IL]-[IVL]-A-F-x-[IVL]-[FY]. For Archaea, we found two highly conserved residues (≥70%), G93 and F95, which correspond to G76 and F78, respectively (M. boonei to E. coli numbering). In addition, we found four moderately conserved positions (≥50%), N/E101, F/R102, A/K106 and F/T110, which correspond to N/D84, F85, A89 and F93 in the bacterial sequences. Thus, the motif found for Archaea is: G-x-F-x(5)-N-F-x(3)-A-x(3)-F. In Eukarya, we found that G112 (A. clavatus numbering) is the most conserved residue (≥79%) and corresponds to G76 in Bacteria and G93 in Archaea. In addition, Y111 and F114 show good conservation (≥60%), and the equivalent residues are Y75 and F78 in Bacteria. The consensus found here is: A-x(2)-D-G-A-x-V-x-A-Y-G-x-F (Fig. 2; Table 1). We found that the most conserved residues in TM2 present in all three domains are equivalent to G76, F78 and F85 from E. coli (see Figure S2).
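The spacing encoded in a motif can be checked against the residue numbers quoted in the text by walking along the pattern and assigning a position to each fixed residue. The following sketch does this for the archaeal TM2 motif, starting at G93 (M. boonei numbering); the function name is illustrative only.

```python
import re

# One PROSITE element: x, a single residue, or a bracketed set, optionally followed by (n).
ELEMENT = re.compile(r"(x|[A-Z]|\[[A-Z]+\])(?:\((\d+)\))?")


def motif_positions(pattern, start):
    """Map sequence positions to the fixed (non-x) elements of a PROSITE-style pattern."""
    positions = {}
    position = start
    for element in pattern.split("-"):
        match = ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core = match.group(1)
        repeat = int(match.group(2) or 1)
        for _ in range(repeat):
            if core != "x":
                positions[position] = core
            position += 1
    return positions


archaeal_tm2 = "G-x-F-x(5)-N-F-x(3)-A-x(3)-F"
tm2_positions = motif_positions(archaeal_tm2, start=93)
print(tm2_positions)
```

The result places G, F, N, F, A and F at positions 93, 95, 101, 102, 106 and 110, matching the residues G93, F95, N/E101, F/R102, A/K106 and F/T110 listed above.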

Fig. 2

Conserved motifs in the TM2 modules of MscL from Bacteria, Archaea and Eukarya. See Fig. 1 for details

MscL protein: conserved motifs in the cytoplasmic helix

This part of the protein is not well conserved (conservation below 52% in each sequence logo). However, we found that ~45% of the bacterial sequences studied have the motif R-K-K-E-E-x(2)-P-x-P-P, where the second glutamate residue is frequently conserved and corresponds to E108 in EcoMscL. For Archaea, we found a putatively truncated segment in seven of the eight sequences studied, and only the homolog from Thermoplasma acidophilum appears complete. In this case, we found the following motif at a level of ~40%: V-x(2)-E-E-K-x(2)-K. It is notable that deletion of residues 110–136 (C-terminus) in EcoMscL is tolerated, producing only a slight effect on gating (Blount et al. 1996; Hase et al. 1997). This mutant channel significantly increases osmotic ATP release (Anishkin et al. 2003). This is interesting in the context of the MscL channels from Archaea, since the evidence suggests that this part of the protein is probably deleted naturally. On the other hand, the Cyt-H has a ‘prefilter’ function (Anishkin et al. 2003), and a possible role in the pH regulation of EcoMscL activity has been proposed for the charged cluster RKKEE (residues 104–108) (Kloda et al. 2006). It will be interesting to probe for a similar mechanism in the archaeal homologs. In eukaryotes (A. clavatus numbering), we observed that only T148 and I158 are conserved (≥60%), with no clear motif (Fig. 3; Table 1).

Fig. 3

Conserved motifs in the cytoplasmic helix of MscL from Bacteria, Archaea and Eukarya. Dotted arrows show reduced conservation (<50%) of the same residue only in two domains

MscS protein: conserved motifs in TM3

The MscS family shows good conservation throughout the TM3 helix, in contrast to TM1 and TM2, which are relatively poorly conserved (Pivetti et al. 2003; Miller et al. 2003a), and we confirm these observations. A previous study by Koprowski and Kubalski (2003) compared 36 sequences of MscS homologs and found that TM3 shows the highest level of identity as a whole (44%, compared with 30 and 23% estimated for TM1 and TM2, respectively). This observation clearly indicates that TM3 is important for MscS channel function and suggests that TM3 may be the functional equivalent of TM1 in the MscL protein. It has been proposed that both TMs serve as channel-lining helices with a similar generalized structure and possibly a common evolutionary origin (Pivetti et al. 2003; Kloda and Martinac 2001; 2002), although the overall structures of the two proteins are quite different, indicating that these channels do not share a common evolutionary ancestor (Steinbacher et al. 2007). This diversity also reflects a much wider distribution of MscS than of MscL (see Supplementary Tables S1 and S2), with homologs found in Bacteria, Archaea, yeast, Entamoeba, Chlamydomonas and higher plants.

Our analysis corroborates that TM3 is G-rich, showing the characteristic periodicity of this residue (Edwards et al. 2005), and additionally we found that TM3 is rich in A, L, F and V residues, with different levels of conservation between the three domains of life. In Bacteria, the highly conserved residues (≥60%) that show periodicity in this domain are G101, G104, A106, G108, L/F109, A/G110, Q112, N117 and G121. In the crystal structure, the TM3 helix bends at G113 and this residue is very important in the gating of the channel (Edwards et al. 2005; Akitake et al. 2007). Curiously, in our analysis, G113 is not well conserved in Bacteria and Asp mainly occupies this position. This observation suggests that the particular amino acid at position 113 is significant but not restricted to a G residue. Nevertheless, we found high conservation for G121, which is frequently used for the same function of flexibility; alternative kink positions could potentially correspond to different functional states of the channel (e.g. closed, desensitized and inactivated) (Akitake et al. 2007; Edwards et al. 2008). For the TM3 domain from Bacteria, we found the consensus motif: [LFIV]-x-[ATS]-x(2)-G-x(2)-[GSTA]-[LVIA]-[AVG]-[IVL]-[GAM]-[LF]-[AG]-x-[QK]-x(2)-[LVI]-x-[ND]-x(2)-[ASG]-[GS]-x(2)-[IL]. For Archaea, we observed eight well-conserved residues (≥63%): G166, G169, G173, F/L174, A175, Q177, N182 and G186. These residues are equivalent to G101, G104, G108, L/F109, A/G110, Q112, N117 and G121, respectively (M. jannaschii to E. coli numbering). It is important to highlight that D/T/S178 in Archaea is equivalent to residue G113 and that the well-conserved G186 corresponds to G121 from Bacteria. For the TM3 domain from Archaea, we found the following consensus motif: [GAT]-x(2)-[GAT]-[ILA]-x-[LIV]-[GAS]-[FL]-[AG]-x-[QKR]-x(4)-[ND]-x(2)-[ASG]-G-x(2)-[ILM]-x(3)-[RKQ]-[PFTS]-x(3)-[GN].
Concerning TM3 from Eukarya, we observed poor conservation but some periodicity for the G residues in comparison with prokaryotes. We found nine residues weakly conserved (≥39%): L267, G271, G/L274, L/F279, A/I280, N287, S/E290, G/S291 and L292 (numbering of ‘MscS-like protein’ AtMSL3 from A. thaliana). It is important to note here that G271, G/L274 and G/S291 are equivalent to G101, G104 and G121 from Bacteria and G166, G169 and G186 from Archaea, respectively (Fig. 4; Table 1). We found that the most conserved residues in TM3 present in all the three domains are the equivalent to G101, G104, G108, A110, N117 and G121 from E. coli (see Figure S3).
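The residue equivalences quoted for TM3 imply a constant numbering offset between each reference sequence and E. coli within this ungapped segment. The sketch below derives that offset from the residue pairs given in the text and uses it to convert positions; it is an illustration only, since a constant offset cannot be assumed across regions that contain insertions or deletions. The variable and function names are hypothetical.

```python
# (reference position, E. coli position) pairs quoted in the text for TM3 of MscS.
jannaschii_to_ecoli = [
    (166, 101), (169, 104), (173, 108), (174, 109),
    (175, 110), (177, 112), (182, 117), (186, 121),
]
atmsl3_to_ecoli = [(271, 101), (274, 104), (291, 121)]


def constant_offset(pairs):
    """Return the shared numbering offset, or None if the pairs disagree."""
    offsets = {reference - ecoli for reference, ecoli in pairs}
    return offsets.pop() if len(offsets) == 1 else None


def to_ecoli(position, offset):
    """Convert a reference-sequence position to E. coli numbering."""
    return position - offset


archaeal_offset = constant_offset(jannaschii_to_ecoli)
eukaryal_offset = constant_offset(atmsl3_to_ecoli)
print(archaeal_offset, eukaryal_offset)  # prints 65 170
print(to_ecoli(178, archaeal_offset))    # prints 113
```

The last line reproduces the statement that D/T/S178 in Archaea is equivalent to G113 in E. coli.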

Fig. 4

Conserved motifs in the TM3 modules of MscS from Bacteria (numbering of E. coli), Archaea (numbering of M. jannaschii) and Eukarya (numbering of AtMSL3 from A. thaliana). The data for these logos consists of 260 sequences from Bacteria, 34 from Archaea and 15 from Eukarya. See Fig. 1 for details

MscS protein: conserved motifs in the C-terminal domain

In E. coli, W240 and W251 lie at the β and αβ interfaces that create the seven lateral portals; both residues are located close to the subunit interfaces in the C-terminal domain and are therefore close to the portals that penetrate the vestibule (Bass et al. 2002; Steinbacher et al. 2007). W240 has an essential role in MscS oligomer stability and assembly, whereas W251 has a less important role in these functions. It is located at the side portal and is in contact with the neighboring subunit (Rasmussen et al. 2007). According to a previous study, aromatic residues are required at both positions. W240 is conserved in almost 100% of the top 100 proteins showing strong overall similarity to EcoMscS by BLAST, whereas position 251 is more frequently occupied by F (57%) and only 29% have W at this position (Altschul et al. 1997). Nevertheless, in our analysis of 260 MscS homologs from Bacteria, we found that W240 is weakly conserved (40.6%); Tyr is commonly found at this position (16.5%) and occasionally Phe (7.1%). On the other hand, at position 251 we found W (7.1%) or L/F (9.1%). However, it is important to note that when W is present at position 251, W is always present at position 240. Indeed, in Bacteria we found poor conservation for this region and only five positions are relatively conserved (≥48%): V/L/I237, I/L/V261, I/L/V270, P/A273 and P/F275 (E. coli numbering). These residues are located between the αβ subdomain and the β barrel. For Archaea, there are five conserved positions (≥46%): V/L/I299, L/I/F/V301, W306, V/A307 and I/V326 (M. jannaschii numbering). After comparing 34 MscS sequences, we found that W306 from Archaea is equivalent to W240 in Bacteria. Finally, when 15 MscS-like sequences from eukaryotes were compared, no significant conservation was found in this region, which shows high sequence variability (Fig. 5; Table 1).
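The two kinds of statement made above, the percentage of each residue at one position and the co-occurrence of residues at two positions, can be computed from an alignment as sketched below. The residue pairs are hypothetical toy data chosen for illustration, not the 260 bacterial sequences analyzed here, and the function names are assumptions.

```python
from collections import Counter


def residue_percentages(residues):
    """Percentage of each residue observed at a single alignment position."""
    counts = Counter(residues)
    return {res: 100.0 * n / len(residues) for res, n in counts.items()}


def always_cooccurs(pairs, res_at_251="W", res_at_240="W"):
    """True if every sequence with res_at_251 at position 251 has res_at_240 at position 240."""
    return all(a == res_at_240 for a, b in pairs if b == res_at_251)


# Hypothetical (position 240, position 251) residue pairs, one per sequence.
toy_pairs = [
    ("W", "W"), ("W", "F"), ("W", "L"), ("W", "A"), ("Y", "F"),
    ("Y", "L"), ("F", "S"), ("A", "T"), ("W", "W"), ("L", "K"),
]

percent_240 = residue_percentages([a for a, _ in toy_pairs])
print(percent_240["W"], percent_240["Y"])  # prints 50.0 20.0
print(always_cooccurs(toy_pairs))          # prints True
```

A single counterexample, such as a sequence with Y at position 240 and W at position 251, is enough to make the co-occurrence check return False.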

Fig. 5

Conserved motifs in the cytoplasmic vestibule of MscS channels from Bacteria, Archaea and Eukarya. Polar and charged residues that surround the lateral portals in the MscS crystal structure (Bass et al. 2002) and in homologs from Archaea are marked below the lines with filled orange circles. Dotted arrows show poor conservation (<50%) of the same residue only in Bacteria and Archaea

In summary, bacterial MscL and MscS have been extensively studied, and several residues important for channel activity have been previously described (see Tables 2, 3 and references therein). Given their high frequency in the sequences analyzed here, we confirm the importance of such residues. Even so, according to our analysis, some residues have received less attention. Tables 2 and 3 show significant residues to be considered. To our knowledge, some of these residues have not been associated with important mutant phenotypes, or their mutation produces only mild phenotypes. On the other hand, the highly conserved residues found in archaeal and eukaryotic homologs have not been evaluated until now.

Table 2 Alternative residues in conserved motifs for MscL channels from Bacteria, associated with mutational data
Table 3 Alternative residues in conserved motifs for MscS channels from Bacteria, associated with mutational data

An evolutionary consideration

MS channels are present in organisms from all three domains of the universal tree. These channels form a superfamily of MS membrane proteins that have probably descended from a common ancestor (Pivetti et al. 2003; Kloda and Martinac 2001, 2002). This suggests that MS channels may have appeared early in the evolution of cellular life and that mechanosensation probably originated very early to fulfill crucial functions for cell survival (Martinac and Kloda 2003; Anishkin and Kung 2005; Kung 2005; Martinac et al. 2008). It has been proposed that gating of MS channels by mechanical force may have first evolved in cell-walled microbes and has since been relatively conserved in eukaryotic cells (Hamill and Martinac 2001). In general terms, this is in accordance with the good conservation between MS channel orthologs in Bacteria and Archaea, and the lower conservation in Eukarya, that we found. Indeed, it would be interesting to know which of the sequences used in our study form functional MS channels.

In particular, MscS-like proteins have been identified in some higher plants (Arabidopsis and Oryza), the single-celled alga Chlamydomonas and fission yeast (Schizosaccharomyces pombe), but MscS homologs have not been found in animals (Pivetti et al. 2003), and our sequence analysis confirms this. Hence, it is notable that both MscS and MscL homologs found in Eukarya are mostly from cell-walled organisms (Fungi and Plantae): for MscS, we found 13 homologs from plants, 1 from S. pombe and 1 from Entamoeba, whereas for MscL, we found 13 homologs from Fungi and, curiously, 1 from Apis. The cell wall itself provides important mechanical protection to the membrane, and this finding suggests that MS channels may have evolved to regulate turgor pressure and perform functions similar to those of prokaryotic MS channels (Kloda and Martinac 2002). In Arabidopsis, for example, MSL2 and MSL3 could be MS channels that control plastid size, shape and perhaps division during normal plant development by releasing solutes in response to osmotic downshock (Haswell and Meyerowitz 2006). The presence of such MscS homologs in Arabidopsis and Chlamydomonas (Nakayama et al. 2007) is very interesting with respect to the symbiotic origin of the chloroplast as well as the function of this channel in the control of organelle morphology (Haswell and Meyerowitz 2006). It is possible that MscS family members have evolved new roles in plants since the endosymbiotic event. On the other hand, the presence of MscL homologs predominantly in filamentous fungi is of great significance because, as saprophytes and parasitic symbionts, they experience both slow and fast changes in extracellular water activity, and adaptation to changing environmental conditions is an essential trait for survival in competitive habitats (Hohmann 2002).
The MscL channel is essential for the survival of extreme turgor, and the presence of MscL homologs in fungi could be indicative of a role in releasing the pressure built up by hypoosmotic shocks. A basic question is whether the mechanisms of mechanical gating that originated in prokaryotic channels are conserved in the eukaryotic MS channels.

Concluding remarks

In this paper, we identify several new conserved motifs from homologs of MscL and MscS from all three branches of the universal tree of life. A similar study for voltage-gated ion channels has been published elsewhere (Guda et al. 2007). Our study presents the first attempt to summarize very recent knowledge about the MS channels of bacterial as well as non-bacterial homologs, and offers the opportunity to compare and contrast the structural and functional properties of these channels. Our analysis could provide a useful starting point for future structure–function studies of diverse MS channels and leads us to some important conclusions:

  • Conserved residues and motifs identified in this report can be used for new mutational studies to better understand the function of the MS channels in homologs from Bacteria, Archaea and Eukarya.

  • Specifically, the conserved residues found in archaeal and eukaryotic sequences are important targets to be considered in future research. Their sequence similarity with bacterial homologs may reflect functional relatedness. The first objective is to establish the electrophysiological activity of those putative channels.

  • The high homology found for MS channels from Bacteria and Archaea suggests that archaeal MS channels could participate in similar mechanisms of osmoregulation.

  • Although we found a reduced homology in each motif for eukaryotes, their presence could suggest that all three domains of life may have evolved from a common ancestor, and therefore these channels may share similar functions.