Introduction

The term “proteome” was created in 1994 by Marc Wilkins to indicate all time- and condition-specific proteins that are simultaneously produced by a cell or a tissue (Anderson and Anderson 1998; Wilkins 2009a). Studying this proteome poses an analytical challenge. The large diversity in protein size and properties as well as in posttranslational modifications makes the proteome much more complex than the genome or the transcriptome. Moreover, (micro-)organisms adapt rapidly to changes in the environment resulting in a highly dynamic protein composition. Proteomics aims to use state-of-the-art protein analysis tools to reveal particular features in the cellular system, including the identification of (subcellular) proteins, the changes in abundance of proteins as well as in their maturation, posttranslational modifications, and degradation of those proteins in response to a certain challenge. Protein networks and their dynamics are resolved in addition to the structure of proteins to allow their functional annotation. Proteomics can thus be considered as a field where researchers provide insights into cellular processes and function by regrouping different pre-fractionation methods, quantification methods, mass spectrometry (MS), and bioinformatics. The recent development of high-throughput proteomic techniques can help in the microbiologist’s quest to identify and characterize microorganisms and to study their evolution and origin as well as their interaction with the environment.

The main focus in today’s bacterial proteomics is the use of quantitative tools to analyze changes in protein abundance in laboratory experiments that measure the effect of changing culture conditions (temperature, nutrients, chemical (antibiotic) treatment). Traditionally, these studies combine two-dimensional polyacrylamide gel electrophoresis (2D-PAGE) and matrix-assisted laser desorption/ionization-time of flight mass spectrometry (MALDI-TOF MS) for protein identification. However, in this review, we will focus on novel non-gel-based, mass spectrometric methods for quantitative proteomics. Thanks to developments in genome sequencing, the scope of microbial proteomics has broadened. Next-generation nucleotide sequencing and automatic annotation pipelines have had a tremendous impact on the number of microbial genomes that are publicly available today. This genomic information is crucial for the interpretation of the MS spectra and the identification of the proteins; conversely, proteomics can aid in a better annotation of the genome by providing proof of existence of predicted proteins and generating better functional annotation (proteogenomics) (Armengaud 2012). Additionally, the collective proteome of microbial communities can be studied as a meta-organism (metaproteomics), either under controlled conditions in the laboratory or in their natural environment (Hettich et al. 2012).

The application areas of microbial proteomics range from fundamental understanding of bacterial physiology (system-wide or specific environmental stress responses, adaptation) to more practical problems including wastewater treatment or the effects of metabolic engineering for fermentations (Lacerda and Reardon 2009). Microbial communities are also extensively studied in complex human environments, such as the gastrointestinal system (VerBerkmoes et al. 2009). Proteomics is used in clinical microbiology to study pathogenicity factors by comparing the proteins synthesized by virulent and avirulent strains grown under similar conditions. It is also used to support the development of monoclonal antibodies, serological tools for diagnosis, and vaccine design by identifying immune-reactive proteins (Bensi et al. 2012). Additionally, the development of new antibiotics is increasingly profiting from proteomics for the identification of new targets and the understanding of the mechanisms of action of existing drugs and of antibiotic resistance (Fournier and Raoult 2011). Examples of the contribution of proteomics to antibiotic drug discovery include building a compendium of protein profiles that covers the mechanisms of action of known antibiotics in order to classify the compounds, supporting antibiotic structure improvement programs, identifying toxic effects of candidate antibiotics, and supporting target-based antibiotic discovery (Wecke and Mascher 2011; Wenzel and Bandow 2011).

Proteomics involves multiple techniques and is still an evolving discipline. Here, we will focus on the challenges and the more recent developments in sample preparation and mass spectrometry for the quantitative analysis of microbial proteomes.

Proteomics—basics, opportunities, and limitations

A typical proteomics experiment consists of different stages (Fig. 1). First, the sample preparation stage aims to isolate the proteins from cell lysates or subcellular compartments. Then, this complex mixture of intact proteins is separated using chromatographic or electrophoretic techniques. Individual fractions can then be directly analyzed by mass spectrometry (top-down approach). While significant improvements of this approach have been established, the number of true applications in microbial physiology studies is limited and we therefore refer the reader to the specialized literature. More widespread is the use of electrophoretic techniques, either sodium dodecyl sulfate (SDS)-PAGE or 2D-PAGE, to separate proteins that are then digested in-gel with a specific protease like trypsin, followed by mass spectrometric analysis of the peptides (bottom-up approach). Alternatively, the proteins can be digested first and the peptide mixture is subsequently subjected to a chromatographic separation in order to reduce its complexity before it reaches the mass spectrometer (shotgun approach) (Fig. 1).

Fig. 1

Workflow of a typical proteomic experiment. First, the proteins are extracted from a sample and then subjected to fractionation before being enzymatically digested into a peptide mixture and identified by mass spectrometry. This reduction in complexity can either be done by using gel-based separation methods such as two-dimensional gel electrophoresis (2-DE) and geLC or by using multidimensional HPLC separations of the proteins before mass spectrometric identification

Sample preparation and pre-fractionation in bacterial proteomics

A prerequisite for optimal mass spectrometric analysis is the availability of a sample of well-dissolved proteins or peptides, free of interfering compounds such as peptidoglycan or extracellular polymeric substances. Typically in microbial proteomics, the cell wall is disrupted first by mechanical means such as sonication or bead milling, by enzymatic digestion with lysozyme, or by the use of detergents. Usually, a combination of these methodologies is employed to obtain proteins in solution (Bhaduri and Demchick 1983; Herbert et al. 2006; Cañas et al. 2007; Abram et al. 2009).

Protein complexes have to be disintegrated and the interactions with other proteins or other molecules have to be broken to obtain soluble proteins that are ready for direct 2D-PAGE, LC-MS analysis, or further proteolytic digestion into peptides. This can be achieved by using chaotropes such as urea and guanidinium hydrochloride, combined with detergents. Meanwhile, proteins have to be protected from proteolysis and modification by the addition of protease and phosphatase inhibitors, so that the sample reflects the proteome as it was at the time of cell collection. Nucleic acids that were released during the extraction can interfere and are preferably removed by the addition of RNase and DNase to the lysis buffer. Some of these chaotropes, proteins, enzymes, buffer salts, and detergents can be detrimental to enzymatic digestion or further fractionation and have to be removed. Different (commercial) clean-up methods exist and the choice of the method should be carefully considered, taking into account possible losses of proteins, costs, and foremost the purity grade of the products used, as impurities can seriously interfere with LC-MS analysis.

An alternative way to reduce interference of other biomolecules is to perform a subcellular fractionation prior to protein extraction. Different protocols exist to specifically isolate proteins from a subcellular compartment, such as the secreted proteins, outer or inner membrane proteins, and periplasmic and cytoplasmic proteins (Cordwell et al. 2001; Thein et al. 2010). Cellular fractionation has another major advantage: it strongly reduces the complexity of the protein mixture and therefore fewer subsequent separation steps are required. Most commonly, cell fractionation in bacteria is meant to isolate the membrane compartment (Fischer et al. 2006; Hahne et al. 2008; Poetsch and Wolters 2008). Samples obtained from detergent-based solubilization of cell pellets may still contain large amounts of cytosolic proteins. Specific collection of membrane proteins therefore requires more substantial separation steps using two-phase partitioning and density centrifugation (Norling et al. 1998). In classical biochemical protocols, commonly used detergents include SDS, Triton X-100, and amidosulfobetaine 14, but these can be detrimental to subsequent LC-MS approaches. Recently, some manufacturers have addressed this problem by providing novel acid-labile detergents (Chen et al. 2007). These detergents help in solubilizing proteins throughout the sample preparation protocol, but are cleaved by acid treatment, releasing a non-interfering polar group and an insoluble hydrophobic part that can be removed by centrifugation prior to LC-MS analysis.

Advances in analytical chromatography

Most commonly, protein samples are enzymatically digested by trypsin and the resulting peptide mixtures are extensively fractionated by chromatographic separations before being introduced to a mass spectrometer for MS/MS analysis (Fig. 1). Nanoscale reversed-phase high pressure liquid chromatography (RP-HPLC) can easily be hyphenated to a mass spectrometer because of its compatible flow rates and solvents, its high resolving power, and its reproducibility. However, the resolving power of a single chromatographic separation is often not enough for the very complex peptide mixtures encountered in shotgun proteomics (Nilsson and Davidsson 2000; Shi et al. 2004). Indeed, despite the advances in mass spectrometric instrumentation, both at the level of sensitivity and resolution (see below), undersampling in mass spectrometry is often observed. This can be attributed to, among other factors, limitations in peak capacity at the level of the chromatography, matrix suppression, saturation effects, or MS instrument dwell time. Optimized column dimensions, lengths, and gradient conditions as well as separation temperature are all steps towards higher separation efficiencies (Eeltink et al. 2010; Horie et al. 2012). Exploring the use of sub-2 μm particles for packed RP-LC columns in ultra-high pressure liquid chromatography (UPLC), of silica core–shell particles, or the use of monolithic columns has led to high-throughput separations with improved peak capacities and consequently an increased proteome coverage (Patel et al. 2004; de Villiers et al. 2006; Luo et al. 2007; Sandra et al. 2008; Iwasaki et al. 2010; Rozenbrand et al. 2011). Despite these improvements in single dimension LC-MS, further fractionation of the peptide mixture is typically needed to reduce the complexity and consequently minimize undersampling during the mass spectrometric analysis (Motoyama and Yates 2008).

The field was revolutionized by the introduction of “multidimensional protein identification technology (MUDPIT),” where the strong cation exchange (SCX), RP-HPLC, and MS analysis were performed in an online hyphenation (Washburn et al. 2001). The separation of peptides is based on two orthogonal methods, i.e., the samples are separated according to unrelated molecular properties to be able to increase the peak capacity and the resolving power of the separation as much as possible (Shi et al. 2004; Gilar et al. 2005a; Motoyama and Yates 2008; Horvatovich et al. 2010). The combination of SCX gradient elution, fraction collection, and subsequent RP-HPLC separation (offline) followed by electrospray ionization (ESI)-MS measurements has long been the standard approach (Shi et al. 2004; Vollmer et al. 2004). Alternatively, the peptides can be separated in an orthogonal fashion using the combination of two RP-HPLC separations at different pH. First, the peptides are separated at pH 10 followed by a classical separation at pH 3 coupled to ESI-MS (Gilar et al. 2005b; Nakamura et al. 2008). This approach has similar orthogonal properties as the SCX-RP-HPLC, due to the different behavior of the peptides on the RP stationary phase at basic and acidic pH (Gilar et al. 2005a). RP-HPLC at different pH was shown to outperform the SCX-RP-HPLC approach when comparing protein identification numbers (Dowell et al. 2008). In our laboratory, an offline RP/RP-HPLC shotgun approach together with MALDI-TOF/TOF MS was successfully used for the identification of membrane proteins that showed a change in abundance upon antibiotic challenge in the opportunistic pathogen Stenotrophomonas maltophilia (Van Oudenhove et al. 2012). Online 2D-RP-UPLC at different pH was applied for the characterization of the proteome of Methylocella silvestris grown with methane, succinate, or propane as their carbon source (Patel et al. 2012). 
The authors showed that performing a two-dimensional separation results in almost a doubling of the identified proteins compared to single LC-MS, in addition to a significant enhancement of the sequence coverage. Another example is the in-depth analysis of the cytosolic proteins in Corynebacterium glutamicum (Lasaosa et al. 2009).
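
The benefit of combining two orthogonal separations can be made explicit with the product rule for peak capacity. The expression below is an idealized sketch that assumes fully orthogonal dimensions and no loss of resolution when fractions are transferred; the symbols n_1, n_2, and n_2D are introduced here for illustration only and do not come from the cited studies.

```latex
n_{2D} \approx n_{1} \times n_{2}
```

For instance, a first dimension that resolves 10 fractions followed by a second dimension with a peak capacity of 200 would ideally provide a total peak capacity of about 2000. Real separations fall short of this value because the two dimensions are only partially orthogonal.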

Advances in mass spectrometry

There are different types of mass analyzers used in the proteomic field, each with its own advantages and limitations regarding sensitivity, accuracy, dynamic range, resolution, and speed of analysis (Domon and Aebersold 2006; Thelen and Miernyk 2012). The basic types of mass analyzers are the quadrupole (Q), the time-of-flight (TOF) analyzer, the ion trap, the Fourier transform ion cyclotron resonance (FT-ICR) analyzer, and the Orbitrap (Fig. 2). They can stand alone or be placed in tandem to take advantage of their individual strengths (reviewed in Aebersold and Mann 2003; Graham et al. 2007). The quadrupole is mostly used as an ion guide to focus ions in an ion trap (Q-TRAP) or reflector TOF (Q-TOF) mass spectrometer and in MS/MS analysis for high resolution selection of peptide ions to be fragmented by collision-induced dissociation (CID). A major advantage of the Q-TOF configuration is the high speed of analysis, allowing state-of-the-art equipment to take MS/MS spectra at a rate of 20 Hz, dramatically increasing the number of proteins identified in single LC-MS runs (Andrews et al. 2011). The TOF mass analyzer remains a widely applied, versatile, and sensitive component in many mass spectrometers. It is widely used in MALDI-TOF MS analysis for clinical and microbiological diagnosis, where protein profiling or peptide mapping is used as a distinctive tool. When peptide sequence information is required, a TOF/TOF instrument can be used. Here, the first TOF analyzer is used as a timed ion gate to select the precursor ions of interest for MS/MS analysis, while the second one separates the fragment ions prior to detection. This instrument became the standard for analysis of 2D-PAGE spots (Vanrobaeys et al. 2003). Recently, the (Q-)TOF was combined with ion mobility devices, where ionized molecules are separated based on their different behavior in a carrier buffer gas. 
Though ion mobility MS is mostly used in structural biology, it also has been applied recently as an extra dimension of separation in an LC-MS setup, further increasing the proteome coverage (Valentine et al. 2011). It leads to an enhanced and more accurate quantification because of the more accurate interpretation of chimeric MS/MS spectra. This is due to a diminished interference of fragment ions from precursor ions that were present in the selection window, but not intended to be fragmented.

Fig. 2

Schematic overview of the principal components of a mass spectrometric-centered proteomic setup. The abbreviations used in the overview are for electrospray ionization (ESI), matrix-assisted laser desorption ionization (MALDI), laser ablation electrospray ionization (LAESI) for imaging MS, Fourier transform mass spectrometry (FTMS), data-dependent mode of acquisition (DDA), data-independent mode of acquisition (DIA), collision-induced dissociation (CID), electron capture dissociation (ECD) and electron transfer dissociation (ETD), peptide mass fingerprinting (PMF), peptide fragment fingerprinting (PFF), selected reaction monitoring (SRM), and multiple reaction monitoring (MRM). The quantification of peptides and proteins can be done using different labeling, label-free, or absolute quantification strategies

A widely used MS analyzer is the ion trap, which can perform multiple fragmentation cycles (MSn), in which the ions are trapped, fragmented, and analyzed several times in succession. This is an interesting feature for the detection of phosphorylated peptides, for example, where the neutral loss of the phosphate group can trigger an additional round of MS/MS (MS3) to improve the peptide sequence information and subsequently the identification of the phosphorylated peptide. The more recent linear ion traps that replaced the traditional three-dimensional ion traps offer several advantages, such as faster scan rates, enhanced sensitivity, and better trapping efficiency and capacity. The system can easily be coupled to hybrid devices such as Fourier transform-based mass spectrometers (FTMS) to obtain the best possible performance in resolution and sensitivity. FTMS is nowadays dominated by Orbitrap mass analyzers, since these instruments show a high mass resolution and accuracy as well as a dynamic range greater than 10³ at a much lower cost than the classical FT-ion cyclotron resonance instruments (Hu et al. 2005). An improved MS sensitivity was demonstrated for shotgun proteomics using a hybrid linear ion trap Orbitrap instrument. Sub-parts-per-million precursor as well as product ion mass accuracy is achieved after internal calibration (Olsen et al. 2009; Wenger et al. 2010). The technology has been refined, and the current benchtop quadrupole-Orbitrap instrument (Q-Exactive) outperforms other configurations in terms of the numbers of peptide and protein identifications (Michalski et al. 2011b). An example of the use of this instrument in microbial proteomics is the large-scale proteomic analysis of Mycobacterium tuberculosis to improve gene annotations from the Sanger and The Institute for Genomic Research databases (de Souza et al. 2008).
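
The mass accuracy referred to above is conventionally reported as a relative error in parts per million (ppm). As a reminder of what a sub-ppm figure means, the relation can be written as follows; the subscripted symbol names are chosen here for illustration.

```latex
\Delta m_{\mathrm{ppm}} = \frac{m_{\mathrm{measured}} - m_{\mathrm{theoretical}}}{m_{\mathrm{theoretical}}} \times 10^{6}
```

As a worked example with made-up numbers, a precursor with a theoretical m/z of 1000.0000 that is measured at 1000.0005 has an error of 0.5 ppm.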

Another upcoming trend is the use of alternative dissociation methods to provide additional sequence information, complementary to that obtained by CID. Electron capture dissociation (ECD) and electron transfer dissociation (ETD) are combined with high mass accuracy MS instruments, such as the LTQ-Orbitrap or SYNAPT G2 QTOF, for high-throughput posttranslational modification analysis (Zubarev et al. 1998; Syka et al. 2004).

Finally, an emerging trend is to replace the traditional data-dependent mode of acquisition (DDA) by a data-independent mode of acquisition (DIA) (Fig. 2). In the serial DDA approach, the mass spectrometer typically cycles through an MS survey scan, after which automated acquisition software decides which peptide precursor ions detected in the survey scan will be selected for MS/MS fragmentation. Ion intensity is one of the key parameters in this decision process: precursor ions are usually selected in a serial manner from the highest to the lowest intensity, after which they are excluded for a limited period of time to allow other, less intense precursor ions to be selected (dynamic exclusion). This leads to a bias towards the selection of the most abundant peptide ions in real complex biological samples and to the occurrence of product ion spectra that are composed of fragment ions from different isobaric and nearly co-eluting peptide ions (chimeric spectra), resulting in identification difficulties during the database searching (Michalski et al. 2011a). These limitations can largely be overcome with DIA, where parallel measurement and fragmentation of all peptide precursor ions that are present at that time point is performed. Purvine et al. demonstrated that parallel precursor and fragment acquisition with in-source CID on a TOF MS and subsequent manual alignment followed by SEQUEST peptide identification was feasible (Purvine et al. 2003). DIA by sequential narrowband selection and fragmentation of precursor windows of 10 m/z within an ion trap has been utilized by the Yates group for the qualitative and quantitative analysis of metabolically labeled yeast (Venable et al. 2004). In LC-MSE, an HPLC or UPLC separation is combined with a Q-TOF mass spectrometer in which the quadrupole functions as a guide to transfer all ions into the collision cell. 
The collision energy is continuously switched from low (MS) to high (MS/MS) at a high frequency throughout the analysis (MSE) (Bateman et al. 2002; Silva et al. 2005, 2006b; Chakraborty et al. 2007). Sophisticated post-acquisition software can align the chromatographic profiles of the precursor and product ions based on retention time and accurate mass measurements to enable subsequent database searching and peptide (protein) identification (Geromanos et al. 2009; Li et al. 2009). This strategy provided an overall protein coverage ranging from 10 to 80 % for an unfractionated Escherichia coli proteome (Silva et al. 2006a). Several groups confirmed the dramatic improvement in proteome coverage and protein identification, especially for the lowest abundant proteins, using an LC-MSE DIA experiment compared to a DDA approach (Geromanos et al. 2009; Patel et al. 2009; Blackburn et al. 2010; Levin et al. 2011). Similarly, the DIA method referred to as precursor acquisition independent from ion count, in which narrow isolation windows (m/z channels) are sequentially scanned in an ion trap mass spectrometer, regardless of whether a precursor ion is observed or not, resulted in the identification of 70 % or more of the expressed proteins from Pseudomonas aeruginosa without any prior protein fractionation or enrichment method other than the RP-UPLC separation coupled to the LTQ-Orbitrap (Panchaud et al. 2009). Mann et al. used a stand-alone Orbitrap mass spectrometer instead, to allow alternation between MS acquisition and “all-ion fragmentation” MS/MS acquisition in a high-energy collisional dissociation (HCD) cell (Geiger et al. 2010a). Owing to the high resolution and mass accuracy of this instrument, the fragmentation peaks are assigned to their precursor ions on the basis of co-elution profiles. 
Furthermore, the Aebersold group presented the SWATH MS acquisition method, where a high resolution Q-Q-TOF MS repeatedly cycles through 32 consecutive precursor isolation windows of 25 Da (swaths) for the time-resolved acquisition of fragment ions. SWATH combines this DIA approach with a data analysis method for targeted data extraction resulting in the confident identification of yeast peptides over 4 orders of magnitude (Gillet et al. 2012).
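
The intensity-ranked precursor selection with dynamic exclusion that characterizes DDA, as described above, can be sketched in a few lines of Python. This is a simplified illustration and not the acquisition logic of any vendor software: the function name `select_precursors` and the parameters `top_n`, `exclusion_scans`, and `tol` are hypothetical, and exclusion is counted in survey scans rather than in seconds.

```python
def select_precursors(survey_scans, top_n=3, exclusion_scans=2, tol=0.01):
    """Pick up to top_n precursors per survey scan, most intense first.

    survey_scans: list of scans, each a list of (mz, intensity) tuples.
    A selected m/z is excluded for the next `exclusion_scans` survey scans.
    """
    excluded = []    # (mz, index of the last scan in which mz is excluded)
    selections = []  # one list of selected m/z values per survey scan
    for i, scan in enumerate(survey_scans):
        # Forget exclusion entries that have expired.
        excluded = [(mz, until) for mz, until in excluded if until >= i]
        picked = []
        # Walk through the precursors from the highest to the lowest intensity.
        for mz, _intensity in sorted(scan, key=lambda p: p[1], reverse=True):
            if len(picked) == top_n:
                break
            if any(abs(mz - ex_mz) <= tol for ex_mz, _ in excluded):
                continue  # recently fragmented, skip (dynamic exclusion)
            picked.append(mz)
        excluded.extend((mz, i + exclusion_scans) for mz in picked)
        selections.append(picked)
    return selections


# Two consecutive survey scans containing the same three precursors.
scans = [
    [(500.25, 1000.0), (600.30, 500.0), (700.35, 100.0)],
    [(500.25, 1200.0), (600.30, 450.0), (700.35, 120.0)],
]
print(select_precursors(scans, top_n=2, exclusion_scans=1))
print(select_precursors(scans, top_n=2, exclusion_scans=0))
```

With exclusion switched on, the weak precursor at m/z 700.35 is fragmented in the second cycle. Without it, the two most intense ions are selected again and the weak precursor is never sampled, which is the abundance bias discussed above.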

Relative quantitation in proteomics

Proteomics was defined by Anderson and Anderson (1998) as “the use of quantitative protein-level measurements of gene expression to characterize biological processes (e.g., drug effects) and decipher the mechanisms of gene expression control.” The study of the changes in abundance of proteins upon certain perturbations such as gene mutations and chemical and environmental variables will help us in understanding what their functions are, in addition to elucidating the mechanisms of either action or reaction of these perturbations. Hence, quantitative proteomics is an essential component of “systems biology,” which is the attempt to systematically study all concurrent physiological processes in a cell by global measurement of differentially perturbed states (Aebersold and Mann 2003).

Quantitative proteomics was dominated for a long time by 2D-PAGE, particularly after the introduction of immobilized pH gradients and of the fluorescent labeling method known as difference gel electrophoresis (DIGE). Although the method still has a number of advantages, including the ability to discriminate posttranslationally modified forms of proteins, we will focus here on more recently introduced MS-driven quantitative approaches (Ong and Mann 2005; Bantscheff et al. 2007; Domon and Aebersold 2010; Otto et al. 2012).

Relative quantitation of proteins using metabolic labeling

In metabolic labeling for quantitative proteomics, the cells are grown under a particular condition in media supplemented with either a light or heavy stable isotope of a nutrient, typically a nitrogen source or an amino acid. The proteins are extracted and combined prior to enzymatic digestion, which decreases the experimental error introduced during sample handling and is the major advantage of this quantitative technique (Fig. 3). The quantitation is performed using the MS signal, where two peaks are detected for the same peptide with an m/z interval corresponding to the difference between the light and heavy isotope forms. Oda et al. (1999) initiated the idea of growing mutant yeast in medium supplemented with 15N, added as ammonium salt, and comparing it to wild-type yeast grown in normal medium. The method is of particular interest for the study of protein dynamics, as the incorporation of the isotope will be faster in proteins with a high turnover (Bunai et al. 2005; Rao et al. 2008). Labeling with elemental nitrogen, however, is challenging since all nitrogen atoms, including those in backbone amide groups, are labeled, and therefore, the resulting mass difference depends on the peptide sequence and size. This complicates the subsequent data analysis. The use of stable isotope-labeled amino acids is more popular. This so-called stable isotope labeling by amino acids in cell culture (SILAC) strategy was developed by the group of Mann in 2002 (Ong et al. 2002). Here, “essential” amino acids, labeled with a heavy or naturally occurring isotope, are supplemented in the amino acid-deficient growth medium and are incorporated in all proteins as they are synthesized (Fig. 3). Typically, 13C/15N-labeled lysine and/or arginine is used, resulting in a fixed mass difference when trypsin or Lys-C endopeptidase is used. 
A serious disadvantage is that this metabolic labeling is limited to those organisms that are (made) auxotrophic for these particular amino acids, typically requiring genetic engineering of the lysine or arginine biosynthetic pathway. Therefore, only a few applications of this method in microbial proteomics have appeared. Soufi et al. (2010) used SILAC on an auxotrophic Bacillus subtilis strain to compare the gluconeogenic growth on succinate with growth under phosphate starvation. They also reported successful identification and quantitation of Ser/Thr/Tyr phosphorylation using this approach. Other applications are described in E. coli (Sommer et al. 2010), Salmonella typhimurium (Yu and Guo 2011), and recently in Neisseria gonorrhoeae biofilm studies (Phillips et al. 2012). Meanwhile, the SILAC approach was further developed to overcome some of its early limitations. These developments were all applied to higher eukaryotes, but they might be applicable to prokaryotes as well. The group of Gevaert combined SILAC with differential sample mixing to overcome the singleton detection problem, where only the light or heavy form of the peptide is detected, thus hampering correct quantification (Impens et al. 2010). Geiger et al. showed recently how SILAC can be expanded to multiplex comparisons with Super-SILAC as well as the use of SILAC as a spiked standard in quantitative proteomics (Geiger et al. 2010b, 2011). Pulsed SILAC (pSILAC) takes the method even further to compare protein dynamics, namely protein translation rates (Schwanhäusser et al. 2009).
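
The contrast between the sequence-dependent mass difference of 15N labeling and the fixed mass difference of SILAC can be illustrated with a short calculation. The sketch below assumes complete labeling, assumes that the heavy amino acids are 13C6,15N2-lysine and 13C6,15N4-arginine (one common choice, not specified in the text above), and uses two made-up peptide sequences; all function and variable names are hypothetical.

```python
# Mass gained per atom when the light isotope is replaced by the heavy one (Da).
DELTA_15N = 15.0001089 - 14.0030740
DELTA_13C = 13.0033548 - 12.0000000

# Nitrogen atoms per amino acid residue: one backbone N plus side-chain N.
N_ATOMS = {aa: 1 for aa in "ACDEFGILMPSTVY"}
N_ATOMS.update({"K": 2, "N": 2, "Q": 2, "W": 2, "H": 3, "R": 4})

# Assumed SILAC labels: 13C6,15N2-lysine and 13C6,15N4-arginine.
SILAC_SHIFT = {
    "K": 6 * DELTA_13C + 2 * DELTA_15N,
    "R": 6 * DELTA_13C + 4 * DELTA_15N,
}


def shift_15n(peptide):
    """Mass shift (Da) of a fully 15N-labeled peptide."""
    return sum(N_ATOMS[aa] for aa in peptide) * DELTA_15N


def shift_silac(peptide):
    """Mass shift (Da) of a peptide containing heavy Lys and/or Arg."""
    return sum(SILAC_SHIFT.get(aa, 0.0) for aa in peptide)


def mz_spacing(mass_shift, charge):
    """Distance between the light and heavy peaks on the m/z axis."""
    return mass_shift / charge


# Two hypothetical tryptic peptides, each ending in a single lysine.
for peptide in ("AGLK", "AGLQNHK"):
    print(peptide,
          round(shift_15n(peptide), 4),
          round(shift_silac(peptide), 4),
          round(mz_spacing(shift_silac(peptide), 2), 4))
```

Both peptides carry the same SILAC shift because each contains exactly one lysine, whereas their 15N shifts differ because they contain 5 and 12 nitrogen atoms, respectively.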

Fig. 3

Schematic overview of some popular metabolic and chemical labeling approaches for quantitative proteomics. A bacterial cell culture grown under two different conditions is compared here (blue and red colors). Different labeling strategies are shown on the level of the bacterial growth, extracted proteins, or peptide mixtures. Depending on the method chosen, the quantification is either done using the MS spectrum or the MS/MS spectrum

Relative quantitation of proteins using chemical labeling

As an alternative to metabolic labeling, differential analysis can be achieved using chemical labeling. In 1999, isotope-coded affinity tags (ICAT) were developed to covalently label cysteines in extracted proteins with either a light or heavy (deuterium containing) form of the ICAT reagent (Fig. 3, Gygi et al. 1999). The proteins of the two samples are mixed and digested together before further affinity purification of the ICAT-labeled peptides, reducing the sample complexity towards mass spectrometric analysis. The MS analysis then reveals the peptide intensity ratios corresponding to the quantity of these peptide pairs (mass shift of 8 Da for 2+ charged peptides) in the two samples. This method was also further developed to overcome some of its initial limitations, as was explained by Goshe and Smith (2003), but the limitation to cysteine containing peptides has substantial disadvantages in terms of missing proteins and the fact that quantitation is often based on a low number of peptides per protein. The use of ICAT seems nowadays to focus on measuring redox states of cells (Sethuraman et al. 2004).

Chemical isotope labeling strategies are nowadays dominated by the use of multiple isobaric tags that are predominantly applied on whole proteome tryptic digests. Several commercial products are available, e.g., isobaric tags for relative and absolute quantification (iTRAQ) or tandem mass tags (TMT) (Thompson et al. 2003; Ross et al. 2004). In both methods, tryptic peptides from different samples are labeled at their N-terminus and lysine side chains, using isobaric tags. A major advantage is that several samples can be multiplexed together for the detection of differences in protein expression (Fig. 3) (Thompson et al. 2003). Moreover, in contrast to ICAT, the reliability of protein identification and quantification as well as the proteome coverage are improved by tagging almost all peptides, by reducing the MS complexity, and by quantifying at the MS/MS level using only the reporter ions generated at specific m/z values (Fig. 3). The Lottspeich group developed a similar strategy, denoted as isotope-coded protein label (ICPL; Schmidt et al. 2005). As the name suggests, it is promoted to be used at the protein level, labeling both N-termini and lysine side chains. This has the advantage that labeled samples can be mixed at an earlier stage, but tryptic digestion is then restricted to arginine residues, resulting in fewer and larger peptides per protein. Technically speaking, ICPL can also be used at the peptide level (Leroy et al. 2010) and iTRAQ and TMT at the protein level since the basic chemistry for derivatization is the same (succinimide-based amine labeling).
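
The consequence of protein-level lysine labeling for the subsequent digestion can be illustrated with a small in silico digest. The sketch below is a simplification under stated assumptions: the helper `digest` and the protein sequence are made up, and the rule that trypsin does not cleave before proline is a commonly used approximation that is not discussed in the text above.

```python
def digest(protein, cleave_after="KR"):
    """Cleave C-terminal to the given residues, but not before proline."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        next_aa = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in cleave_after and next_aa != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # C-terminal peptide without K/R
    return peptides


protein = "MKTAYRAGLKPEVRSDK"  # hypothetical sequence

# Unlabeled protein: trypsin cleaves after both lysine and arginine.
print(digest(protein))
# Lysines blocked by protein-level labeling: cleavage after arginine only.
print(digest(protein, cleave_after="R"))
```

In this toy example the number of peptides drops from four to three when only arginine is available for cleavage, and the N-terminal peptide becomes longer.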

Out of the many applications of iTRAQ and TMT, we list a few examples in the area of bacterial resistance to antibiotics. Yun et al. (2011) detected common and antibiotic-specific protein responses to tetracycline and imipenem in a clinical Acinetobacter baumannii strain. The membrane protein profile of E. coli stimulated with an antimicrobial peptide or of S. maltophilia upon imipenem challenge was interrogated by combining iTRAQ with a 2D LC-MS/MS approach (Zhou and Chen 2011; Van Oudenhove et al. 2012).

Isobaric chemical labeling techniques suffer mainly from the occurrence of chimeric MS/MS spectra, in which co-isolated peptides contribute to the same reporter ion signals and thereby diminish the accuracy of quantification (Altelaar et al. 2012; Evans et al. 2012). Recently, it was demonstrated that an MS3-based analysis can eliminate this interference problem, which should allow isobaric chemical labeling to keep meeting today’s requirements for quantitative proteomics (Ting et al. 2011). Finally, labeling the N-terminus and lysine side chains of peptides by reductive amination or “dimethyl labeling” is also gaining popularity for the quantification of protein abundances (Fig. 3) (Hsu et al. 2003; Boersema et al. 2009).
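
The effect of chimeric spectra on isobaric quantification can be illustrated with simple arithmetic. The sketch below assumes that a co-isolated, unregulated background peptide adds the same reporter signal to both channels; all intensities are invented for illustration.

```python
def observed_ratio(target_a, target_b, interference_a, interference_b):
    """Reporter ion ratio measured when interfering signal adds to both channels."""
    return (target_a + interference_a) / (target_b + interference_b)


# Hypothetical target peptide with a true 10:1 abundance ratio
true_ratio = observed_ratio(1000.0, 100.0, 0.0, 0.0)

# The same peptide co-isolated with an unregulated background peptide
# that contributes 200 counts to each reporter channel
compressed_ratio = observed_ratio(1000.0, 100.0, 200.0, 200.0)
```

In this example the measured ratio drops from 10 to 4, which is why interference systematically pulls ratios towards 1.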

Label-free quantitation of proteins

Isotope labeling is associated with a number of technical difficulties ranging from the specific requirements for metabolic labeling to reproducibility problems in chemical labeling approaches. Therefore, and owing to recent improvements in the throughput and automation of LC-MS instruments and especially the development of novel algorithms dealing with LC-MS data, quantitative proteomics using label-free approaches has attracted a lot of interest in the proteomics community. The label-free techniques for bottom-up proteomics can be divided into two groups depending on the measurement that is correlated with protein abundance in the sample: (1) spectral counting, which counts the number of MS/MS spectra assigned to the peptides of a protein and (2) chromatographic peak area under the curve (AUC) or signal intensity measurement of the precursor ion MS spectra (reviewed by Neilson et al. 2011). Spectral counting relies on the observation that peptides from more abundant proteins will be selected more often for fragmentation in a DDA experiment and will thus produce more MS/MS spectra (Liu et al. 2004). The challenges, limitations, and further developments of spectral counting were recently reviewed elsewhere (Lundgren et al. 2010). The performance of spectral counting was compared with that of metabolic labeling and iTRAQ/TMT labeling on an LTQ-Orbitrap Velos using a Pseudomonas putida strain (Li et al. 2012). Spectral counting showed improved proteome coverage, but its quantification did not outperform that of the label-dependent strategies owing to reproducibility problems.
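
A minimal sketch of spectral counting is given below. It counts the MS/MS spectra assigned to each protein in two runs and compares the counts after dividing by the total number of assigned spectra per run. This total-count normalization is an assumption chosen for simplicity, and the protein names and counts are hypothetical.

```python
from collections import Counter


def normalized_spectral_counts(psm_protein_ids):
    """Count spectra per protein and divide by the total number of spectra in the run."""
    counts = Counter(psm_protein_ids)
    total = sum(counts.values())
    return {protein: count / total for protein, count in counts.items()}


# Hypothetical protein assignments of the identified MS/MS spectra in two runs
control = ["protA"] * 6 + ["protB"] * 3 + ["protC"] * 1    # 10 spectra
treated = ["protA"] * 6 + ["protB"] * 12 + ["protC"] * 2   # 20 spectra

control_norm = normalized_spectral_counts(control)
treated_norm = normalized_spectral_counts(treated)

# Fold change treated/control for proteins observed in both runs
fold_change = {
    protein: treated_norm[protein] / control_norm[protein]
    for protein in control_norm
    if protein in treated_norm
}
```

Note that protA has the same raw count in both runs but a lower normalized count in the treated run, which shows how strongly the result depends on the normalization that is chosen.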

Alternatively, the protein abundance index (PAI), i.e., the number of observed unique peptides divided by the theoretical number of tryptic peptides of each protein that can be observed within an m/z range, can be used to estimate the abundance relationship between proteins in a sample (Rappsilber et al. 2002). This technique was further improved with the exponentially modified PAI (emPAI), which is directly proportional to the protein amount in a sample (Ishihama et al. 2005). The emPAI method, however, suffers from saturation when highly abundant proteins are present in the sample, as well as from a decreased correlation with real protein abundances when low-resolution mass spectrometers are used. The method of absolute protein expression (APEX) takes the probability of detection of the peptides by MS into account, as was demonstrated in E. coli and yeast when estimating the contributions of transcriptional and translational gene regulation (Lu et al. 2007). Because this method uses a machine learning classification algorithm based on peptide length and composition, the selection of an appropriate training set can be challenging when the sample contains many unknown proteins. Finally, Asara et al. (2008) showed that the spectral total ion chromatogram (TIC), which is the average of the total ion count for a protein, can be used as a quantitative value that eliminates the bias towards larger proteins, which generate more tryptic peptides. It also expands the dynamic range of quantification compared to basic spectral counting methods.
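
The PAI and emPAI calculations can be sketched as follows, using the definitions emPAI = 10^PAI - 1 and protein content (mol %) = emPAI / sum(emPAI) x 100 from Ishihama et al. (2005). The peptide counts in the example are hypothetical.

```python
def pai(n_observed, n_observable):
    """Protein abundance index: observed peptides / theoretically observable peptides."""
    return n_observed / n_observable


def empai(n_observed, n_observable):
    """Exponentially modified PAI: 10**PAI - 1."""
    return 10 ** pai(n_observed, n_observable) - 1


def molar_percentages(empai_by_protein):
    """Protein content in mol %: each emPAI divided by the sum of all emPAI values."""
    total = sum(empai_by_protein.values())
    return {protein: 100 * value / total for protein, value in empai_by_protein.items()}


# Hypothetical counts: (observed peptides, observable peptides)
peptide_counts = {"protA": (10, 10), "protB": (5, 10), "protC": (2, 20)}
empai_values = {p: empai(obs, total) for p, (obs, total) in peptide_counts.items()}
mol_percent = molar_percentages(empai_values)
```

Because PAI cannot exceed 1 once all observable peptides have been detected, emPAI levels off at 9 in this formulation, which illustrates the saturation for highly abundant proteins mentioned above.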

In addition to the spectral counting-based label-free quantification methods, the chromatographic peak area or ion intensity (ion count) for a given peptide at a specific retention time in an LC-MS run can be used to quantify proteins in a sample because this measure is linearly correlated to the concentration of that peptide in the sample (Bondarenko et al. 2002; Chelius and Bondarenko 2002). Indeed, ESI generates multiply charged ions with a signal strength proportional to the concentration of the ion (Graham et al. 2007). Under well-standardized LC-MS conditions, peak intensities of a peptide can thus be compared between multiple LC-MS runs. However, this AUC quantitation relies heavily on good LC resolution and reproducibility and should be performed on MS instruments with a high mass accuracy and resolution for the correct assignment of m/z values (TOF, Orbitrap). Moreover, this method only holds when appropriate software is used for the alignment of the retention times of the different LC-MS runs covering all samples, the peak picking, the normalization of peak abundances, and the statistics to detect real biological differences in protein abundance (America and Cordewener 2008). Thanks to recent developments in the analysis of label-free data, such as dealing with peptides shared amongst proteins, minimizing false discovery rates, data normalization, and appropriate statistical analysis, label-free proteomics is more reliable than some years ago (Podwojski et al. 2010; Neilson et al. 2011).
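
As an illustration of AUC quantitation, the sketch below integrates the extracted ion chromatogram of one peptide in two runs with the trapezoidal rule and compares the resulting areas. It assumes that the runs have already been retention time aligned and that the same peak has been picked in both; the time points and intensities are hypothetical.

```python
def peak_area(times, intensities):
    """Area under a chromatographic peak by trapezoidal integration."""
    if len(times) != len(intensities):
        raise ValueError("times and intensities must have the same length")
    area = 0.0
    for i in range(1, len(times)):
        mean_height = 0.5 * (intensities[i] + intensities[i - 1])
        area += mean_height * (times[i] - times[i - 1])
    return area


# Hypothetical extracted ion chromatogram of one peptide (retention time in seconds)
times = [600, 606, 612, 618, 624]
run_1 = [0.0, 100.0, 200.0, 100.0, 0.0]
run_2 = [0.0, 200.0, 400.0, 200.0, 0.0]

area_1 = peak_area(times, run_1)
area_2 = peak_area(times, run_2)
relative_abundance = area_2 / area_1
```

In practice, the areas would be normalized between runs before any ratio is interpreted, as discussed in the paragraph above.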

As previously discussed in this review, the method called LC-MSE solves the alignment issue by using a high-resolution mass spectrometer (Q-TOF) that cycles between MS and MS/MS (DIA). This enables changes in peptide signal response to be registered for each accurate mass measurement and retention time (AMRT) value, reflecting the concentration of that peptide in one sample compared to another (Fig. 4) (Silva et al. 2005; Richardson et al. 2012). This label-free method can even allow for absolute quantitation, based on the correlation between the electrospray ionization ion counts of the three most intense peptides of a protein and the protein concentration. The inclusion of a protein digest with a known concentration as an internal standard allows a response factor to be calculated from this correlation, which is then applied to proteins with at least three observed peptides (Silva et al. 2006a). The advantages of this label-free LC-MSE method in terms of sample requirement, LC-MS instrument time, and protein coverage compared to the gel-based or iTRAQ-based quantitation methods were clearly demonstrated with M. silvestris proteomics (Patel et al. 2009). These advantages make the method promising for the accurate quantitative analysis of, e.g., environmental and clinical samples that are available only in low amounts and for experiments where minimal sample preparation is crucial for the detection and quantification of transient modifications. This approach was also recently validated for accurate quantitation in simple as well as complex samples by Levin et al. (2011).
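
The absolute quantitation step can be sketched as follows. The example averages the ion counts of the three most intense peptides of a spiked standard protein to obtain a response factor (counts per fmol) and applies it to a target protein. The function names, intensities, and the 100 fmol standard amount are hypothetical.

```python
def top3_mean(peptide_intensities):
    """Average ion count of the three most intense peptides of a protein."""
    if len(peptide_intensities) < 3:
        raise ValueError("at least three observed peptides are required")
    top3 = sorted(peptide_intensities, reverse=True)[:3]
    return sum(top3) / 3


def absolute_amount_fmol(target_intensities, standard_intensities, standard_fmol):
    """Estimate the amount of a protein from the response factor of the internal standard."""
    response_factor = top3_mean(standard_intensities) / standard_fmol  # counts per fmol
    return top3_mean(target_intensities) / response_factor


# Hypothetical data: 100 fmol of a standard protein digest spiked into the sample
standard = [9000.0, 6000.0, 3000.0, 500.0]
target = [4500.0, 3000.0, 1500.0, 200.0]
amount = absolute_amount_fmol(target, standard, standard_fmol=100.0)
```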

Fig. 4

Example of an LC-MSE label-free relative quantitation analysis. Proteins from S. maltophilia at different time points after antibiotic stimulation were analyzed. The SYNAPT HDMS (Q-TOF) low-energy (MS) and high-energy (MS/MS) base peak ion chromatograms for two samples are shown. The resulting spectra are time aligned and the masses of the ions are corrected using internal mass calibration for peptide identification (a). The precursor peak intensities from similar mass-to-charge ratios (m/z) measured at a specific retention time are aligned through all the LC-MS runs (b). The peak intensities are compared throughout the different samples, and after statistical analysis, changes in protein abundance at different time points before and after antibiotic challenge are observed (c).

MS-based validation methods for proteomic results

Unlike the relative quantitative strategies that are mostly used in hypothesis-generating or discovery-driven proteomic strategies, absolute protein quantities can be obtained in targeted proteomic (hypothesis-driven) strategies. Gerber et al. (2003) proposed an absolute quantification (AQUA) strategy for proteins as well as their posttranslationally modified forms, in which synthetic proteotypic peptides with incorporated stable isotopes are used as ideal internal standards. These internal standard peptides are then used to measure the absolute quantity of a protein of interest after digestion using selected reaction monitoring (SRM) MS measurements. This commercially available approach is the most commonly employed one for absolute quantitation with SRM. An alternative to the AQUA peptides, which are expensive and sometimes difficult to synthesize, is the use of artificial genes encoding a concatenation of isotopically labeled tryptic peptides for the absolute quantification of multiple proteins (QCAT, QconCAT). This QconCAT strategy is better suited for highly multiplexed absolute quantification experiments because a QconCAT can encode between 10 and 30 target proteins with two proteotypic peptides per protein (Beynon et al. 2005; Simpson and Beynon 2012). As with AQUA, the efficiency of tryptic digestion has to be assessed for accurate quantification. Alternatively, in the protein standard absolute quantification method (PSAQ), isotope-labeled full-length target proteins are spiked into the samples at the very beginning of the sample preparation workflow, circumventing these problems (Brun et al. 2007; Dupuis et al. 2008). Evidently, this approach is limited by the need to express, purify, and quantify the native proteins. More information on the recent developments in and differences between these isotope dilution strategies for absolute quantification can be found in the reviews of Brun et al. (2009) as well as of Picotti and Aebersold (2012).
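
The isotope dilution principle shared by these strategies can be sketched in a few lines: the amount of the endogenous (light) peptide follows from its peak area relative to that of the spiked heavy standard. The example below assumes complete digestion and identical MS response of the light and heavy forms; the peak areas and spiked amount are hypothetical.

```python
def isotope_dilution_amount(light_area, heavy_area, heavy_spiked_fmol):
    """Amount of endogenous peptide from the light/heavy peak area ratio."""
    if heavy_area <= 0:
        raise ValueError("the heavy standard must give a positive signal")
    return light_area / heavy_area * heavy_spiked_fmol


# Hypothetical SRM peak areas for one proteotypic peptide, 50 fmol heavy peptide spiked
endogenous_fmol = isotope_dilution_amount(
    light_area=30000.0, heavy_area=60000.0, heavy_spiked_fmol=50.0
)
```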

Malmström et al. developed a groundbreaking strategy combining the absolute quantification of proteins using those isotope-labeled reference peptides (AQUA), the MS intensity-based label-free quantitation, and high-throughput sequencing with LC-MS to obtain the average number of protein copies per cell for a significant portion of the Leptospira interrogans proteome (Malmström et al. 2009). Schmidt et al. (2011) used the same idea with a directed MS strategy and AQUA to detect and absolutely quantify tryptic peptides from L. interrogans in 25 different conditions, resulting in one of the most complete proteome abundance profile comparisons so far.
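
Converting a measured absolute amount into an average copy number per cell only requires the Avogadro constant and the number of cells that were analyzed. The sketch below shows this generic conversion; it is not the specific procedure of Malmström et al. (2009), and the amount and cell number are hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def copies_per_cell(amount_fmol, n_cells):
    """Average number of protein molecules per cell from an amount in femtomoles."""
    if n_cells <= 0:
        raise ValueError("the number of cells must be positive")
    molecules = amount_fmol * 1e-15 * AVOGADRO
    return molecules / n_cells


# Hypothetical example: 1 fmol of a protein measured in a lysate of 1e8 cells
copies = copies_per_cell(amount_fmol=1.0, n_cells=1e8)
```

With these numbers the protein would be present at about six copies per cell, so the accuracy of the cell count directly limits the accuracy of the copy number.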

SRM is frequently used as a validation technique to detect and quantify targeted proteins with high precision across a large number of samples. It is performed on an LC-MS system, mostly using a triple quadrupole or Q-TRAP mass spectrometer, which specifically monitors an analyte ion and one or several predetermined fragment ions generated by CID (SRM transitions). Several SRM transitions can be measured sequentially, so that multiple analytes can be quantified in the same LC-MS run, which is termed multiple reaction monitoring (MRM) (Yocum and Chinnaiyan 2009). Selection of target peptides and transitions is based on previous knowledge and computational tools (Cham Mead et al. 2010). The application of SRM in proteomics together with relative or absolute quantification strategies, as well as its advances, pitfalls, and future directions, was recently reviewed by Picotti and Aebersold (2012). SRM is also used to study protein modifications such as phosphorylation (Cox et al. 2005). Currently, researchers are exploring possibilities to increase the multiplexing capabilities of SRM. Intelligent SRM, for example, monitors intense transitions for the precise quantification of a peptide; when a preset threshold is exceeded, additional transition signals are acquired in a data-dependent manner to confirm the peptide identity (Kiyonami et al. 2011). Another example is the acquisition of fragment ion spectra from all precursors using SWATH MS followed by targeted SRM to uniquely identify peptides in the DIA fragment ion maps (Gillet et al. 2012). Increased SRM sensitivity and specificity could be obtained by coupling SRM to ion mobility separation. Moreover, improved bioinformatics tools for the prediction of SRM transitions as well as for data evaluation will increase the specificity of SRM assays in the future.
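
To make the notion of an SRM transition concrete, the sketch below computes the precursor m/z (first quadrupole) and the singly charged y-ion m/z values (third quadrupole) for an unmodified peptide from monoisotopic residue masses. The sequence PEPTIDE is only a placeholder, the function names are hypothetical, and real assays select transitions based on observed fragment intensities and specificity rather than listing all y-ions.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids
MONO = {
    "G": 57.021464, "A": 71.037114, "S": 87.032028, "P": 97.052764,
    "V": 99.068414, "T": 101.047679, "C": 103.009185, "L": 113.084064,
    "I": 113.084064, "N": 114.042927, "D": 115.026943, "Q": 128.058578,
    "K": 128.094963, "E": 129.042593, "M": 131.040485, "H": 137.058912,
    "F": 147.068414, "R": 156.101111, "Y": 163.063329, "W": 186.079313,
}
WATER = 18.010565
PROTON = 1.007276


def precursor_mz(peptide, charge):
    """m/z of the intact, unmodified peptide at the given charge state."""
    mass = sum(MONO[aa] for aa in peptide) + WATER
    return (mass + charge * PROTON) / charge


def y_ion_mz(peptide, n):
    """m/z of the singly charged y_n ion (the last n residues plus water plus a proton)."""
    return sum(MONO[aa] for aa in peptide[-n:]) + WATER + PROTON


def candidate_transitions(peptide, charge=2, min_y=3):
    """List (precursor m/z, fragment name, fragment m/z) for y_min_y up to y_(length - 1)."""
    q1 = precursor_mz(peptide, charge)
    return [(q1, f"y{n}", y_ion_mz(peptide, n)) for n in range(min_y, len(peptide))]


transitions = candidate_transitions("PEPTIDE")
```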

Future applications for microbial quantitative proteomics

Improvements in sensitivity, mass accuracy, and MS/MS capabilities of mass spectrometers have had a tremendous impact on the field of proteomics since its inception some 20 years ago. For quantitative proteomics, we have observed that gel-free shotgun proteomics methods are replacing 2-DE and that label-free quantitative approaches are increasingly becoming more popular than the labeling techniques. However, the measurement of protein abundances by 2-DE or shotgun proteomics provides an overview of the expressed protein abundances in a cell at a single time point, but mostly no insight into the dynamic changes (Wilkins 2009b; Doherty and Whitfield 2011). Measuring protein turnover on a proteome-wide scale in response to changes in the environment will be necessary to obtain a more complete view of a biological system. Similarly, increasing the sequence coverage of proteins by digestion procedures complementary to the traditional trypsin digestion will have to be further exploited for a more comprehensive view of the “complete” proteome (Thelen and Miernyk 2012). In the future, microbial proteomics will also have to evolve from profiling and expression studies towards the comprehensive analysis of the role of different posttranslational modifications in certain biological processes, the function of protein complexes and interactions instead of individual proteins, as well as the spatial localization of proteins in bacterial cells or even multicellular structures, such as biofilms, by imaging MS (Blaze et al. 2012).

Further developments in tools for data mining, functional annotation of proteins, and finding meaningful answers to the posed biological questions are needed. All too often, proteomics experiments fail to provide conclusive results because of an inadequate experimental design. Therefore, a close collaboration with bioinformaticians and statisticians will be needed. Additionally, the use of publicly available data repositories for acquired proteomic data as well as a better cooperation between different institutes and databases to create a clear set of gene names, symbols, functional annotations, and predicted localizations as well as possible modifications will help the proteomic community to continue to thrive.

Finally, a more transparent reporting of proteomic results will help in the correct interpretation of the available proteomic data sets and improve the comparison of the existing quantitative proteomic approaches as well as the development and testing of new ones (Taylor et al. 2007; Mead et al. 2009). MS-based quantitative proteomics is still evolving rapidly and will continue to be a tremendously important tool for deciphering complex biological systems such as microbial communities and their creative adaptation mechanisms towards environmental changes.