Introduction

Trillions of microbes, such as bacteria, fungi, and viruses, inhabit the human gut and can be considered a virtual organ (Baothman et al. 2016; Ley et al. 2006; O’Hara and Shanahan 2006). In particular, diverse bacterial species represent the largest community in the gut microbiota. Only recently has it been widely appreciated that gut microbes live symbiotically with us. These diverse and complex microbial communities play an essential role in human digestive function. For example, otherwise indigestible nutrients are digested or degraded by bacteria in the gut. In addition, the composition of the gut microbiota changes constantly depending on various factors (such as diet, mood, and age), which might affect human health (Kasparovska et al. 2016; Mangiola et al. 2016; Odamaki et al. 2016). Nutrition is a vital factor affecting the gut microbiota and human health (Graf et al. 2015; Sonnenburg and Backhed 2016). Malnutrition could cause disease by altering the gut microbial composition and delaying its normal development (Kane et al. 2015). Plant-based diets have been recommended to reduce the risk of colon cancer, type 2 diabetes, and heart disease (Martin et al. 2013; Westergaard et al. 2014; Satija et al. 2016). The proportion of Bacteroidetes decreased and that of Firmicutes increased in heavy meat eaters compared with vegetarians (Zhang and Yang 2016), and an animal-based diet could result in significantly lower levels of short-chain fatty acids (SCFAs), which have anti-inflammatory effects (David et al. 2014; Iraporda et al. 2015). Recent findings indicate that dysbiosis of the intestinal flora might be associated with, or cause, inflammation (Magrone and Jirillo 2013) and metabolic disease (Karlsson et al. 2013). For example, several beneficial members of the gut microbiota, such as Faecalibacterium prausnitzii, were significantly reduced in patients with Crohn’s disease compared with healthy controls (Erickson et al. 2012).
To further characterize the structure and function of the gut microbiota, the Human Microbiome Project (HMP) and Metagenomics of the Human Intestinal Tract (MetaHIT) projects were initiated (Human Microbiome Jumpstart Reference Strains et al. 2010; Human Microbiome Project 2012). Certain meta-omics approaches have been used to analyze large-scale gene or protein expressions and metabolite compositions (Gao et al. 2016; Levi Mortera et al. 2016; Xie et al. 2016; Zhao et al. 2016).

Metagenomic analysis has been used widely in gut microbial studies, employing different experimental methods. Generally, 16S ribosomal RNA subunit genes are highly conserved in bacterial species, providing useful information for taxonomic characterization (Barker et al. 2013; Costello et al. 2009; David et al. 2014). Although genome shotgun sequencing increases the available information for taxonomic characterization of the gut microbiota, it provides little information concerning the detailed function of the gut microbiota. Moreover, these genome-based approaches only predict potential functions and do not reveal the extent and locations of protein expression. In recent years, metatranscriptomic analysis based on RNA sequencing has provided information on the functional characteristics and dynamic range of microbial communities (Wang et al. 2009). Typically, the metatranscriptome is studied via isolation of total RNA, construction of cDNA libraries, and identification of the sequences. However, messenger RNA (mRNA) in prokaryotes is unstable and easily degraded (Redon et al. 2005). Furthermore, mRNA levels do not necessarily reflect biological function. The hypothesis that signaling molecules produced by the gut microbiota could lead to a variety of diseases is yet to be confirmed (Zhu et al. 2016). For example, short-chain fatty acids, such as acetate and propionate, produced by gut bacteria could be recognized as signal transduction molecules that regulate host energy metabolism (Donohoe et al. 2011; Zhu et al. 2016). Metabolomic analysis aims to characterize metabolite variations or the metabotype under a variety of conditions using nuclear magnetic resonance (NMR) spectroscopy or mass spectrometry (MS) approaches. Metabolomics provides a snapshot of the entire metabolite composition of the microbial communities.
Nevertheless, the details of the molecular mechanisms of the gut microbiota remain ambiguous because, ultimately, proteins mediate gut microbial function. Genes are transcribed into RNA, which is then translated into protein. Proteins, in turn, can catalyze the synthesis of certain metabolites that regulate the physiological processes of an organism, or they can mediate its biological function directly. Therefore, metagenomics, metatranscriptomics, metaproteomics, and metabolomics are closely linked, and metaproteomics might play a critical part (Fig. 1). This paper reviews recent research on metaproteomics of the gut microbiota and highlights its methodological strategies and medical or biological applications. We also stress the importance of metaproteomics in understanding the potential biological function of the gut microbiota and discuss its limitations. We conclude that metaproteomics is a promising tool for studying the gut microbiota and that combining it with other omics approaches may yield much greater gains in gut microbial research.

Fig. 1 Overview of the relationships among various meta-omics

Advantages of metaproteomics in gut microbiota research

Metaproteomics has been defined as the examination of the complete protein composition of environmental microbiota at a specific time (Wilmes and Bond 2004). Metaproteomic analysis provides valuable information about environmental microorganisms, such as microbial activity, signal transduction, and metabolic pathways (Verberkmoes et al. 2009). It can also be used to investigate environment-microbe interactions (Mayne et al. 2016). Microbial metaproteomic analysis has become an efficient tool to investigate various processes in different environments, such as soil (Wang et al. 2011), marine environments (Lopez et al. 2002), and feces (Tanca et al. 2015). Adult feces contain about 75% water and 25% solid material (Rose et al. 2015). The major organic component of the solid matter is gut microbial cells (Stephen and Cummings 1980). In this respect, Tang et al. (2014) used a metaproteomic approach to explore the composition of microorganisms in chicken feces and revealed the adaptation process of the chicken gut microbiota.

Metaproteomic analyses of the gut microbiota provide a unique perspective for understanding microbial life processes. They can reveal entire biological processes, from qualitative analysis of microbial activity to quantitative analysis of protein expression and its dynamic changes. Some proteins could serve as biomarkers for disease diagnosis, prognosis, and therapy. Protein expression in the gut microbiota might be altered in patients; therefore, metaproteomic analysis of the gut microbiota might reveal the molecular mechanisms of diseases (Xu et al. 2016). It could also provide key information on complex protein networks and signal transduction pathways in certain diseases (Carrasco-Navarro et al. 2016). Therefore, to discover gut microbiota-related biomarkers and to address the relationship between functional redundancy and diversity of the gut microbiota, it is necessary to use metaproteomics to obtain quantitative information on all proteins in the gut microbiota.

Metaproteomics methodology for gut microbiota research

Mass spectrometry is more suitable for analyzing complex samples and revealing their biological function (Schneider et al. 2012; Verberkmoes et al. 2009; Aires and Butel 2011). The availability of MS technology (especially LC-MS/MS) offers an opportunity to gain unprecedented insight into environmental microbes and makes the identification of proteins in gut microbial communities possible (Leary et al. 2012; Del Chierico et al. 2014). An optimized workflow could maximize protein identification, allowing us to analyze the metaproteome of gut microbiota successfully. Shotgun metaproteomic analysis has been applied widely to measure the total proteins in the gut microbiota (Tanca et al. 2015; Verberkmoes et al. 2009). As shown in Fig. 2, the critical steps for an efficient metaproteomic analysis include sample collection, protein extraction, protein isolation, MS analysis, and database searching.

Fig. 2 Metaproteomic workflow used in the characterization of gut microbiota proteins

Sample collection

First, researchers need to collect fecal samples or samples from different compartments of the gut (e.g., the ileum, cecum, and the small and large intestine), where the microbial composition and diversity can differ. Reliable results require abundant, high-quality samples. Collecting samples directly from the gut poses challenges and limitations for analyzing the metaproteome of the gut microbiota. For convenient sampling and to save costs, most investigators choose to isolate the gut microbiota from human feces, which contain approximately 10^11 bacteria per gram (Tlaskalová-Hogenová et al. 2011).

Centrifugation and filtering are two strategies to separate gut bacteria from other material in fecal samples. Apajalahti et al. (1998) and Kolmeder et al. (2012) used differential centrifugation to purify microorganisms. Briefly, feces are suspended by vortexing in sodium phosphate buffer containing Tween-80 or glass beads. The supernatants are collected, and insoluble particles are removed by low-speed centrifugation. This process is repeated several times to gather the cells. Microorganisms in the supernatants are then collected by high-speed centrifugation. The pellet is washed to remove soluble compounds, such as viscous polysaccharides and mucins. Tanca et al. (2015) analyzed data for the human gut microbiota proteome from feces processed with or without differential centrifugation and found that the former method is superior to the latter for harvesting bacteria and proteins.

Recently, indirect double filtering, a non-centrifugation method, was developed to harvest gut bacteria (Xiong et al. 2015). Typically, bacterial cells and human cells or undigested food particles can be separated based on their different sizes. Bacteria are significantly smaller than human cells; therefore, large particles and human cells can be removed using a 20-μm filter, while microbial cells can be collected through a 0.22-μm filter. Using this approach, a higher number of microorganisms, especially low-abundance microbes, are enriched from complex fecal samples.

Protein extraction

Different chemical and physical methods may be used to prepare gut microbial proteins. SDS or Triton X-100 and guanidine are used as general detergents and denaturants to extract proteins, and the steps of protein extraction are well established. Commonly, a cell pellet is resuspended in 6 M urea, 2% SDS, or 1% Triton X-100 solution for several minutes to lyse the cells, and the proteins are recovered by high-speed centrifugation (Chen et al. 2016; Dhabaria et al. 2015; Peach et al. 2015). The structural similarity between urea and peptide bonds suggests that urea could reduce the efficiency of trypsin. In addition, detergents could interfere with protein quantification using BCA or Bradford assays. Moreover, detergents have a detrimental effect on MS. Thus, chemical reagents that negatively influence protein digestion or quantification should be diluted or removed. Commonly, a commercial C18 spin column is used to remove undesirable substances (e.g., salt and detergents) from the sample before analysis by mass spectrometry. To avoid the effects of chemical reagents, sonication is used frequently (Prauchner et al. 2013). Normally, microbial cells are sonicated three times, and the protein contents are collected. In practice, the efficiency of extraction varies by species, and some proteins bound to DNA could be removed by centrifugation.

Protein isolation

Protein isolation is a critical step in metaproteomic studies. To improve the coverage of the proteome, proteins should be fractionated before MS analysis. Several protein separation strategies may be used to reduce the complexity of samples. Methods of protein fractionation have been developed to study the metaproteome of the gut microbiota, such as separation according to their physical properties, including molecular weight, charge, or hydrophobicity.

Gel-based methods, including one-dimensional (1D) or two-dimensional (2D) electrophoresis, can separate proteins based on their size or isoelectric point (Brunelle and Green 2014; O’Farrell 1975). In a 1D electrophoretic step, sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) is commonly used to generate protein fingerprints of the gut microbiota. For example, Kolmeder et al. (2012) used a 1D-MS method to study human intestinal microbiota. In that study, proteins were separated by SDS-PAGE into three fractions according to their molecular weights (larger than 80 kDa, between 35 and 80 kDa, and smaller than 35 kDa). For downstream MS analysis, protein bands in the gel are visualized via silver or Coomassie blue staining, excised, and digested into peptides. Using this method, false-positive MS results may be avoided. Meanwhile, highly abundant proteins could be removed and low-abundance proteins could be enriched, according to the intensity of the staining. However, 1D electrophoresis fails to resolve individual proteins because of its low separation resolution.
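The three-fraction split described above can be sketched in a few lines. This is only an illustration of the molecular-weight cut-offs (35 and 80 kDa) reported for that study; the function name, the protein names, and their masses are hypothetical, and the assignment of proteins lying exactly on a boundary to the middle fraction is an assumption.

```python
def assign_fraction(mw_kda, low=35.0, high=80.0):
    """Return the SDS-PAGE fraction for a protein of the given mass (kDa).

    Proteins exactly at a cut-off are placed in the middle fraction
    (an assumption; the source does not specify boundary handling).
    """
    if mw_kda > high:
        return "high (>80 kDa)"
    if mw_kda < low:
        return "low (<35 kDa)"
    return "middle (35-80 kDa)"


# Hypothetical proteins with illustrative masses (not data from the study).
proteins = {"protein_A": 92.4, "protein_B": 57.3, "protein_C": 18.9}
fractions = {name: assign_fraction(mw) for name, mw in proteins.items()}
for name, fraction in fractions.items():
    print(f"{name}: {fraction}")
```

In practice the fractions are cut from the gel by eye against a molecular-weight ladder, so the boundaries are approximate rather than sharp thresholds.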

Alternatively, 2D electrophoresis is used to separate proteins based on their molecular weight and charge (O’Farrell 1975). Briefly, proteins are separated based on their isoelectric point in the first dimension, and further separated by SDS-PAGE. Individual proteins or their isoforms can be visualized and characterized using computer-assisted image analysis of 2D-PAGE gels. Klaassens et al. (2007) identified 55 microbial proteins in infant feces using 2D electrophoresis-MS. However, there are several caveats to this method, such as laborious manual procedures with high costs and limited application. In addition, low abundance proteins might be lost during the procedure and are thus not identified.

Gel-free approaches rely on the development of chromatography techniques to separate proteins. High-pressure liquid chromatography (HPLC) is faster and more convenient than electrophoresis for separating peptide mixtures. In a typical metaproteomic experiment, a complex protein sample is digested into peptides followed by separation and identification by LC-MS. Reversed-phase liquid chromatography (RPLC) is the most frequent choice for proteomic study and a C18 column is generally used to separate peptides. RPLC separation is performed with a gradient elution program comprising a low proportion of methanol or acetonitrile and a high proportion of water as the mobile phase, which can be easily volatilized.

Technological and methodological improvements have advanced metaproteomic studies of the gut microbiota. A promising LC method, multidimensional liquid chromatography (MDLC), has been developed to allow the study of complex samples using several separation dimensions. MDLC has higher capacity and efficiency than traditional 1D liquid chromatography. Typically, size-exclusion chromatography (Cheruthazhekatt et al. 2013), strong cation exchange (SCX) chromatography (Mawuenyega et al. 2003; Zhou et al. 2015), or strong anion exchange (SAX) chromatography (Ficarro et al. 2011) is used in the first dimension, while traditional reversed-phase chromatography is applied in the last dimension. These techniques have different selectivities.

To date, the combination of SCX chromatography and RP chromatography has been used widely to fractionate complex peptide mixtures. In an SCX-RP system, the acidified peptide mixture is loaded onto an SCX column, and fractions eluted from the first column are adsorbed onto an RP column (Betancourt et al. 2013; Quan et al. 2015; Slebos et al. 2008). Because peptides are separated efficiently and easily, MDLC-MS has been used widely to identify the metaproteome of the gut microbiota. Brooks et al. (2015) applied a split-phase fused silica column, containing both SCX and RP materials, for 2D-LC. The peptide mixture was fractionated using an ammonium acetate solution in a gradient elution, followed by reversed-phase chromatography. Between 1149 and 2636 microbial proteins per sample were identified by nano-2D-LC-MS/MS. Erickson et al. (2012) used a similar method to study the human host-microbiota signatures of Crohn’s disease.

Generally, in a gel-based method, proteins are digested into peptides after 1D or 2D-PAGE separation. In non-gel-based methods, proteins are digested before separation. A universal digestion procedure, filter-aided sample preparation (FASP), which filters out small-molecule substances, has yet to be fully established (Wisniewski et al. 2009). In a FASP protocol, reduction, alkylation, and digestion are performed on a 10-kDa filter. Compared with existing methods, FASP is relatively time-consuming but more efficient (Lipecka et al. 2016).
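To make the digestion step concrete, the following is a minimal in silico sketch of tryptic digestion. It assumes the commonly used rule that trypsin cleaves after lysine (K) or arginine (R) unless the next residue is proline (P); the function name, the `missed_cleavages` parameter, and the example sequence are illustrative assumptions and are not taken from any protocol cited here.

```python
def tryptic_digest(sequence, missed_cleavages=0):
    """Return peptides from an idealized tryptic digest of a protein sequence.

    Cleavage rule (assumed): cut after K or R, except when followed by P.
    Peptides containing up to `missed_cleavages` uncut sites are included.
    """
    # First cut the sequence at every allowed cleavage site.
    pieces, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            pieces.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        pieces.append(sequence[start:])  # C-terminal remainder

    # Then join neighbouring pieces to model missed cleavages.
    peptides = []
    for i in range(len(pieces)):
        for extra in range(missed_cleavages + 1):
            if i + extra < len(pieces):
                peptides.append("".join(pieces[i:i + extra + 1]))
    return peptides


# Made-up sequence used only to exercise the rule.
print(tryptic_digest("MKRPAKGG"))
print(tryptic_digest("MKRPAKGG", missed_cleavages=1))
```

Real digests are less tidy than this model: cleavage efficiency depends on neighbouring residues and on sample conditions, which is one reason search engines allow for missed cleavages.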

MS analysis

MS is an essential technology in metaproteomic research. Currently, electrospray ionization (ESI) and matrix-assisted laser desorption ionization (MALDI) ion sources are ubiquitous in proteomics (Jansen et al. 2005; Soltwisch et al. 2009; Whitehouse et al. 1985). ESI-MS/MS and MALDI-tandem time-of-flight (TOF/TOF) are powerful tools for proteomics research that can improve the accuracy of protein identification by significantly enhancing separation efficiency, detection sensitivity, and throughput (Dumpala et al. 2009; Trufelli et al. 2011; Noordin and Othman 2013). ESI, which couples ionization directly with LC, is a good choice for the analysis of complex samples and the identification of low-abundance proteins (Kawashima et al. 2013). ESI generates ions of higher charge states and reduced mass-to-charge ratios, which can improve the compatibility and efficiency of the MS analyzer (Ho et al. 2003). Unlike ESI, MALDI generates +1 charged ions by short-wavelength laser-induced desorption from the matrix, resulting in very large mass-to-charge ratios for peptides or proteins (Caprioli et al. 1997).
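The effect of charge state on the mass-to-charge ratio can be illustrated with a short calculation. This sketch assumes positive-mode ionization by protonation, so that an ion of neutral mass M carrying z protons appears at m/z = (M + z × 1.007276) / z; the function name and the 2000 Da example mass are illustrative assumptions.

```python
PROTON_MASS = 1.007276  # mass of a proton in Da


def mass_to_charge(neutral_mass, charge):
    """Return m/z for a peptide of given neutral mass (Da) carrying `charge` protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


# A hypothetical 2000 Da peptide: a singly charged (MALDI-like) ion
# versus multiply charged (ESI-like) ions.
for z in (1, 2, 3):
    print(f"z = {z}: m/z = {mass_to_charge(2000.0, z):.4f}")
```

The same peptide therefore falls at roughly 2001, 1001, and 668 on the m/z axis for z = 1, 2, and 3, which is why multiply charged ESI ions fit more easily within the working range of many analyzers.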

Mass analyzers, such as TOF, quadrupoles, and ion traps, are employed routinely in proteomics research. Scan speed, resolution, mass accuracy, and acquisition range are the primary parameters that determine the performance of mass analyzers. TOF analyzers determine the mass-to-charge ratios of the protein by detecting the ion flight time. Generally, a TOF analyzer runs in tandem with a quadrupole or additional TOF mass filters. Ion traps can capture ions according to the mass-to-charge ratios using magnetic and electric fields (Douglas et al. 2005). Although ion traps have low sensitivity for quantification, they are gaining ground because of their smaller size and lower cost (Ho et al. 2003).
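How a TOF analyzer converts flight time into a mass-to-charge ratio can be sketched with the idealized linear TOF relation. This assumes every ion is accelerated from rest through the same potential U and then drifts over a field-free length L, giving t = L · sqrt((m/z) / (2eU)); the instrument values below (1 m flight tube, 20 kV) are hypothetical, and real instruments add corrections such as reflectrons and delayed extraction.

```python
import math

DALTON_KG = 1.66053906660e-27          # kg per Da
ELEMENTARY_CHARGE = 1.602176634e-19    # C


def flight_time(mz, flight_length_m, accel_voltage_v):
    """Idealized linear TOF flight time (s) for an ion of given m/z (Da per charge)."""
    mass_per_charge = mz * DALTON_KG / ELEMENTARY_CHARGE  # kg per coulomb
    return flight_length_m * math.sqrt(mass_per_charge / (2.0 * accel_voltage_v))


def mz_from_flight_time(time_s, flight_length_m, accel_voltage_v):
    """Invert the relation above to recover m/z from a measured flight time."""
    mass_per_charge = 2.0 * accel_voltage_v * (time_s / flight_length_m) ** 2
    return mass_per_charge * ELEMENTARY_CHARGE / DALTON_KG


# Hypothetical instrument: 1 m flight tube, 20 kV acceleration.
t = flight_time(1000.0, 1.0, 20000.0)
print(f"m/z 1000 arrives after {t * 1e6:.1f} microseconds")
```

Because the flight time scales with the square root of m/z, an ion with four times the m/z takes twice as long to reach the detector.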

In recent years, a growing number of new mass spectrometry technologies, such as LTQ-Orbitrap and Q Exactive systems, have been applied in proteomics research. These technologies provide very high resolution and analytical power, which will significantly promote the development of proteomics.

Database searching

Gut microbiota research urgently needs comprehensive reference metaproteome databases containing non-redundant sequences of gut microorganisms. The NCBInr database, with its high sequence coverage, is used frequently for protein characterization. However, removing redundant proteins is challenging, and retrieving protein information is very time-consuming because of the sheer size of the NCBInr database. To solve this problem, a combined database comprising different known gut microbial metagenome sequence databases is usually applied (Erickson et al. 2012; Kolmeder et al. 2012). However, such databases do not represent a comprehensive microbial map of the gut microbiota, because the genomes of some uncultivable gut microbes have not been sequenced. Thus, an ideal database containing all potentially expressed proteins in a gut microbiota sample is urgently required.

To solve this problem, Zhang et al. (2016c) proposed a high-performance and universal workflow for gut metaproteome identification and quantification, named MetaPro-IQ, in which over 120,000 peptides corresponding to 30,000 proteins were identified in a single experiment. Briefly, a three-step database search strategy was performed to identify proteins of the gut microbiota from human and mouse. The first search was performed against the whole gene catalog database to generate a “pseudo-metaproteome” database. Subsequently, the pseudo-metaproteome database was employed for a second search, which was typically a target-decoy database search with strict filtering (e.g., with a false discovery rate < 0.01). Finally, the resulting proteins were mapped to a non-redundant database to generate confidence data. To gather more information about the composition and function of the gut microbiota, the proteins identified were subjected to taxonomic classification and Clusters of Orthologous Groups (COG) categorization, Gene Ontology (GO), and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses.
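The strict filtering in the second search step relies on target-decoy false discovery rate (FDR) estimation, which can be sketched as follows. This is a generic illustration of the technique, not the MetaPro-IQ implementation: the function name, the tuple layout of the peptide-spectrum matches (PSMs), and the scores are all hypothetical, and the FDR is estimated with the simple decoys/targets ratio.

```python
def filter_by_fdr(psms, max_fdr=0.01):
    """Return target peptides passing a target-decoy FDR threshold.

    psms: list of (peptide, score, is_decoy) tuples; higher score = better match.
    FDR at each rank is estimated as (#decoys) / (#targets) among all PSMs
    scoring at least as well, then converted to a monotonic q-value.
    """
    ranked = sorted(psms, key=lambda psm: psm[1], reverse=True)

    targets = decoys = 0
    fdrs = []
    for _, _, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))

    # q-value: the smallest FDR at this rank or at any lower-scoring rank.
    qvalues = fdrs[:]
    for i in range(len(qvalues) - 2, -1, -1):
        qvalues[i] = min(qvalues[i], qvalues[i + 1])

    return [
        peptide
        for (peptide, _, is_decoy), q in zip(ranked, qvalues)
        if not is_decoy and q <= max_fdr
    ]


# Hypothetical PSMs (peptide label, score, matched a decoy sequence?).
psms = [("A", 90, False), ("B", 80, False), ("C", 70, True), ("D", 60, False)]
print(filter_by_fdr(psms, max_fdr=0.01))
```

With these toy scores, only the two matches ranked above the first decoy hit survive a 1% threshold; with real data sets of many thousands of PSMs the estimate becomes far more stable.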

Applications of metaproteomics in gut microbiota research

Currently, metaproteomics has limited application in gut microbiota research compared with RNA-seq and metagenomics. As metaproteomics has moved into the research spotlight, several metaproteomic projects have been initiated to characterize proteins in the gut microbiota. To illustrate the applicability of metaproteomics in gut microbiota research (Fig. 3), a number of interesting examples are described below. These examples were selected based on the novelty of the metaproteomic studies of gut microbiota reported to date.

Fig. 3 Applications of metaproteomics in the study of the gut microbiota

Characterization of the microbial composition

Increasing evidence suggests that the numerous microorganisms in human gut have important significance for health and disease (Baothman et al. 2016; Kasparovska et al. 2016; Ley et al. 2006; Malaguarnera et al. 2014). Microbial classification has been studied by different methodologies, such as culture, microscopy, and especially, metagenomics. Culture and microscopy techniques can be costly, time-consuming, and biased; therefore, metagenomics and next-generation sequencing were applied broadly to characterize the composition of the gut microbiota. Currently, metaproteomics might provide new insights into the diversity of the gut microbiota, and researchers have worked in the field of characterization of the composition of microbiota using metaproteomics.

Levi Mortera et al. (2016) employed metaproteomics to characterize gut microbiota in mouse models. They compared two different metaproteomic methods, namely MALDI-TOF MS-based culturomic procedures and traditional bottom-up liquid chromatography with subsequent MS/MS shotgun metaproteomic procedures, using a newborn model to dissect the effect of nutrients (e.g., IgA) in maternal milk on the gut microbiota. They found that IgA-deficient milk could cause an increase in the population of opportunistic bacterial pathogens, such as Pasteurella pneumotropica and Staphylococcus xylosus. Notably, the two metaproteomic techniques were used to characterize the gut microbiota from various angles and depth, and both were found to be reliable techniques to describe the gut microbiota.

Identification of therapeutic targets in the gut microbiota

MS-based traditional proteomics is an effective tool that is employed in both in vitro and in vivo models to identify therapeutic targets, discover biomarkers, and reveal the molecular mechanisms associated with many diseases, including cancer, cardiovascular disease, and cognitive disease (Wang et al. 2016; Zhang et al. 2016a, b). Recent reports have highlighted the important role of the gut microbiota, which might contribute to the occurrence and development of disease (Tai et al. 2015; Tlaskalová-Hogenová et al. 2011). Studies that focus on changes in the metaproteome to understand their influence on disease, to identify new biomarkers, and to find new therapeutic targets in the gut microbiota would bring both health and economic benefits. Hence, metaproteomic analysis has gradually been applied to identify the gut microbiota metaproteome in healthy and diseased states, with the aim of identifying specific proteins as targets for treatment.

Liver cirrhosis is the final condition of liver fibrosis, in which the liver fails to work normally because of long-term liver injury. Cirrhosis is characterized by distortion of the liver parenchyma associated with fibrous septae and nodule formation, as well as alterations in blood flow (Bugianesi 2005; Pinter et al. 2016). In recent years, the gut microbiota has been noted to share a close relationship with liver cirrhosis. Wei et al. (2016) used 1D gel electrophoresis and in-gel protein digestion coupled with a high-throughput LC-MS/MS measurement to detect metaproteomic changes in the intestinal microbiota of liver cirrhosis patients. In that study, about 4400 proteins from bacteria were identified. The abundances of 14 proteins, such as chaperone protein DnaK, glutamate dehydrogenase, and elongation factor G, were increased; and seven proteins, such as ketol-acid reductoisomerase, phosphoglycerate kinase, and probable thiol peroxidase, were expressed uniquely in the patients with liver cirrhosis compared with their healthy counterparts. Functionally, these proteins were mainly related to carbohydrate and amino acid transport and metabolism, suggesting that the gut microbiota from patients with liver cirrhosis have higher metabolic activity. In addition, it was found that patients with liver cirrhosis had different biosynthesis of branched chain amino acids, pantothenate, and CoA compared with controls. Overall, this study revealed more comprehensive and specific proteins of gut microbiota in patients with cirrhosis, and therefore provided potential biomarkers and therapeutic targets for the progress and treatment of cirrhosis.

Erickson et al. (2012) used multi-omics (metagenomics/metaproteomics) to reveal the host-microbiota map of Crohn’s disease. They identified the metaproteome of the gut microbiota from different sites, including the ileum and colon. The results showed that the proteins of the gut microbiota differed between parts of the intestine. Compared with the healthy subject, proteins of the gut microbiota from the ileum of the subject with Crohn’s disease showed significant differences in COG categories and metabolic levels. Proteins of the gut microbiota with functions in carbohydrate transport and metabolism, energy production and conversion, and amino acid transport and metabolism were significantly less abundant in patients. Proteins of the gut microbiota related to replication, recombination, and repair were significantly more abundant in the ileum of the subject with Crohn’s disease. Carbohydrate-active enzymes, such as glycoside hydrolases and polysaccharide lyase, showed lower abundance compared with the healthy subject. Additionally, the abundance of proteins of the gut microbiota involved in butyrate production was lower than in the control. These findings showed a great diversity of microbial function in healthy versus Crohn’s disease subjects, which might eventually lead to the development of new therapeutic targets.

Prediction of drug-induced adverse effects

An adverse drug effect refers to a toxic reaction resulting from the use of a medicinal product that could cause severe illness or death in patients; such effects challenge many stakeholders, including pharmaceutical companies, regulatory agencies, and healthcare professionals (Lazarou et al. 1998). Approximately 17% of patients have an adverse drug reaction (Bohm and Cascorbi 2016). Hence, testing for and avoiding adverse drug reactions is mandatory and ethically essential. Generally, drug-induced toxicities of the liver, heart, and kidney are tested using in vitro models and experimental animals. Despite the improved detection rate of adverse drug effects at the organ level, information on some potentially dangerous effects on the gut microbiota is still unavailable. As such, it is vital to predict drug-induced adverse effects according to the dynamic changes of the gut microbiota. The metaproteome of the gut microbiota will contribute to understanding the occurrence and development of drug-induced adverse effects, elucidating the pharmacological mechanism, and preventing the emergence of drug-induced adverse effects. However, there are few studies on the application of metaproteomics to drug-induced adverse effects.

Perez-Cobas et al. (2013) applied ultra-high-performance liquid chromatography coupled with an Orbitrap instrument to investigate the effect of β-lactam therapy by detecting changes in the gut microbiota metaproteome at multiple time points (3, 6, 11, 14, and 40 days). In that study, a total of 3011 proteins were identified. β-Lactam treatment reduced the number of highly abundant proteins of the gut microbiota over time compared with the control sample but increased the number of low-abundance proteins. Some proteins related to the immune response decreased during β-lactam treatment, which renders the bacteria more susceptible to the drug. Some proteins related to glycolysis, pyruvate decarboxylation, the tricarboxylic acid cycle, glutamate metabolism, iron uptake, GTP hydrolysis, and translation termination were increased at the initial stages of β-lactam treatment. These results suggested that β-lactam treatment might negatively affect the metabolic status of the gut microbiota.

Interpretation of host’s gut microbiota adaptation to environmental exposure

The “hygiene hypothesis” is a popular concept that is commonly accepted among scientists and the public. It mainly states that the fewer infections in early childhood, the greater the chance of developing allergic diseases in the future (Strachan 1989). By extension, it has gradually come to be regarded as a fact that the fewer microbes in our body, the higher the chance of developing an allergic disease in the future (von Mutius 2010). For example, it was poorly understood why beggars seldom get sick and adapt easily to survive in hostile conditions. A study relating to the hygiene hypothesis observed the protective effect of growing up on a farm against asthma and allergies (von Mutius 2010). It showed that children growing up on a farm have less chance of developing asthma and allergies because they have more contact with microbiota. A large number of studies have used metagenomic approaches to reveal the protection process in individuals (Wong et al. 2016); however, adaptation to long-term microbiota exposure in different environments has rarely been studied using metaproteomic approaches. Despite this, metaproteomics has greatly broadened our horizons in the study of the symbiotic relationship between humans and their gut microbiota.

Tang et al. (2014) used high-throughput metagenomics integrated with a mass spectrometry-based metaproteomics approach to identify proteins in chicken fecal samples. A total of 3673 proteins were identified. Among them, 155 proteins were from Clostridium spp., 380 were from Lactobacillus spp., and 66 were from Streptococcus spp. The most frequently identified proteins in the study were chaperone proteins, including GroEL and DnaK. Cold shock proteins, cytochromes, thioredoxins, and peroxidases, which might be related to the adaptation process, were also identified. However, these stress-associated proteins were identified much less frequently in humans and pigs. The core body temperature of chickens differs from that of humans and pigs. The normal temperature for chickens varies between 39.8 and 43.6 °C at different times of the day, whereas the normal temperature for humans is 37 °C and for pigs is 38.8 °C. In addition, birds excrete uric acid, which is another challenging factor for the adaptation of the gut microbiota. Thus, the main stress factors might be stronger in chickens than in other animals. This result suggested that these proteins might be required for bacteria to adapt to different environments.

Limitations and challenges of metaproteomics analysis

Clarifying the composition and functions of the microbial communities in their corresponding environments can help us understand the mechanisms of diseases (Kasparovska et al. 2016; Mangiola et al. 2016; Odamaki et al. 2016). Large metaproteomic projects that analyze proteins from samples can provide comprehensive data on the diversity of gut microorganisms and their potential roles in specific environments (Tang et al. 2014). However, several challenges remain for metaproteomic analysis. (1) It is difficult to purify and identify microbial proteins from fecal samples that contain the complex metaproteome of gut microbes (Bojanova and Bordenstein 2016). Some host proteins (e.g., secretory IgA) in the sample could interfere with the identification of the microbial metaproteome (Tang et al. 2014). (2) High diversity in gut microbial composition and variations between individuals might lead to inconsistent outcomes (Bai et al. 2016; Vernocchi et al. 2016), which presents many difficulties for the study of the metaproteome. In addition, the microbial or protein composition could be influenced by other factors, such as food (Kasparovska et al. 2016), disease (Maeda et al. 2016), mood (Mangiola et al. 2016), age (Odamaki et al. 2016), and gender (Strati et al. 2016). (3) Current technologies limit the detection of low-abundance proteins. Although high-performance MS has significantly increased the dynamic range and coverage of proteins, certain low-abundance proteins still cannot be detected by current mass spectrometers (Muth et al. 2013). Thus, more sensitive, efficient, and sophisticated analytical technologies should be developed to study the metaproteome of the gut microbiota. (4) Mapping peptides to distinct proteins, the so-called protein inference problem, is ambiguous, and this ambiguity is especially noticeable in metaproteomics (Kolmeder and de Vos 2014; Nesvizhskii and Aebersold 2005). Generally, MS/MS spectra of peptides, rather than of proteins, are searched against a database. However, most of the matched peptides are not unique peptides, which are necessary for unambiguous protein identification. For example, a single peptide sequence may match more than one protein. This is a problem for the specific taxonomic assignment of detected peptides and for the generation of protein groups. (5) The lack of a comprehensive reference database is another major impediment to the application of metaproteomics in gut microbiota research. Several comprehensive projects based on large-scale sequencing, such as the Human Microbiome Project initiated by the NIH and MetaHIT, have been developed over the last couple of years with the aim of characterizing the composition and function of gut microorganisms. These will help to generate new results from high-performance and universal analyses of complex high-throughput metaproteomic datasets.
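The protein inference problem described in point (4) can be illustrated with a minimal Python sketch. It is not the procedure of any cited study or search engine. The protein names and peptide strings are hypothetical placeholders, and the grouping rule (proteins supported by exactly the same set of observed peptides are placed in one group) is only one simple convention assumed here.

```python
from collections import defaultdict

# Hypothetical database: protein name -> peptides expected after digestion.
# All names and sequences are illustrative placeholders, not real search results.
PROTEINS = {
    "GroEL_Lactobacillus": {"AAVEEGIVPGGG", "VGAATEVEMK", "LAGGVAVIK"},
    "GroEL_Clostridium": {"AAVEEGIVPGGG", "VGAATEVEMK", "TTLVVNR"},
    "DnaK_Streptococcus": {"IINEPTAAALAYGLDK"},
}

# Hypothetical peptides matched from MS/MS spectra.
OBSERVED = ["AAVEEGIVPGGG", "VGAATEVEMK", "IINEPTAAALAYGLDK"]


def peptide_index(proteins):
    """Map each peptide to the set of proteins that contain it."""
    index = defaultdict(set)
    for protein, peptides in proteins.items():
        for peptide in peptides:
            index[peptide].add(protein)
    return index


def classify_peptides(index, observed):
    """Split observed peptides into unique (one protein) and shared (several)."""
    unique, shared = {}, {}
    for peptide in observed:
        owners = index.get(peptide, set())
        if len(owners) == 1:
            unique[peptide] = next(iter(owners))
        elif len(owners) > 1:
            shared[peptide] = sorted(owners)
    return unique, shared


def protein_groups(proteins, observed):
    """Group proteins that have identical observed-peptide evidence."""
    observed_set = set(observed)
    groups = defaultdict(list)
    for protein, peptides in proteins.items():
        evidence = frozenset(peptides) & observed_set
        if evidence:  # ignore proteins with no observed peptide
            groups[evidence].append(protein)
    return sorted(sorted(members) for members in groups.values())


unique, shared = classify_peptides(peptide_index(PROTEINS), OBSERVED)
groups = protein_groups(PROTEINS, OBSERVED)
print("unique:", unique)
print("shared:", shared)
print("groups:", groups)
```

In this toy case the two GroEL entries are supported only by shared peptides, so they cannot be told apart and fall into a single protein group spanning two genera, whereas the DnaK entry is identified by a unique peptide.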

Conclusions and prospects

Metaproteomics shows great potential as a universal analytical method that would broaden our knowledge of the biology of organisms within an ecosystem. Ultimately, it should be able to reveal all the proteins expressed by environmental microbial communities at a specific time. Furthermore, it may provide a real-time representation of microbial activity, function, and signal transduction in the gut microbial community. Moreover, metaproteomics might help to identify the effects of dynamic changes in the gut microbiota and could reveal its regulatory mechanisms. Above all, metaproteomics plays an essential role in studying the complete biological processes of the gut microbiota.

Although the metaproteomics approach has been used to characterize complex microbial communities, several challenges restrict its development. Some gut microbial proteins, especially low-abundance ones, have not been identified. It is critical that the throughput of metaproteomic approaches be improved significantly in the coming years, and experience from human proteomics strategies for identifying low-abundance proteins should be shared. Indeed, during the last decade, researchers have applied metaproteomics to identify the composition of proteins in gut microbial communities and to compare the differential protein groups in healthy and diseased states. However, how the gut microbiota communicates with the human host, and which signaling pathways and mechanisms are involved, are still open questions in the field of intestinal microbiomics. Thus, analyzing the gut microbiota from a systems perspective by applying metaproteomics might be the most important research direction for human health care. In addition, using metaproteomics to characterize protein post-translational modifications (PTMs) in the gut microbiota might be another important direction. Notably, the effectiveness of metaproteomics will depend on the development of sequencing technologies and computational approaches; therefore, keeping pace with the progress of metagenomics will contribute to the development of metaproteomics. Meanwhile, metaproteomics could be combined with metagenomics, metatranscriptomics, metabolomics, and other “omics” methods. Such integrative omics approaches may generate comprehensive information from genes to RNA to proteins and metabolites. Overall, metaproteomics is a powerful tool with largely untapped potential to reveal disease mechanisms related to the gut microbiota.