1 Introduction

MS-based protein quantitation is increasingly utilized to determine differences between samples from healthy and diseased patients for biomarker (i.e., biological indicators of disease or disorder) and systems biology studies. Although quantitation can be performed using a relative technique, such as iTRAQ (isobaric tags for relative and absolute quantitation [1]) or TMT (tandem mass tag [2]), techniques that provide exact endogenous concentrations (often reported in ng/mL units), as opposed to fold changes of abundance levels, are more informative and better suited for applications where the analysis of pre-clinical and clinical samples is the ultimate goal. Such quantitative techniques are commonly referred to as “absolute”, and require the use of isotopically labeled standards (typically expressed in bacterial media, in the case of proteins [3], or chemically synthesized, for peptides) and a targeted form of MS detection (usually MRM-MS with electrospray ionization, ESI, for gas phase ionization of the chromatographic eluent) to be employed within a bottom-up analytical workflow [4–6]. In this generalized approach, proteotypic peptides serve as molecular surrogates for the target proteins. The isotopically labeled standards are typically labeled with 13C and/or 15N, as opposed to 18O or 2H, and these labels are incorporated into amino acids within a protein or the C-terminal residue of a tryptic peptide. Collectively, the standards are used for normalization of the peptide signal and LC-MS conditions. In MRM-MS, specific precursor-product ion pairs (referred to as transitions) are used for peptide detection. Generating peptide-specific transitions requires a priori knowledge of the analyte and its dissociation upon collisional activation (also referred to as collision induced dissociation or CID).
While the use of MRM is common and is classically performed on a triple quadrupole mass spectrometer, directed quantitation has also recently been accomplished by parallel reaction monitoring (PRM) on a hybrid quadrupole-Orbitrap (i.e., Q Exactive) mass spectrometer [7–9] and by MS/MSALL with SWATH acquisition on a quadrupole time-of-flight (QTOF) [10] or a hybrid quadrupole linear ion trap (QTRAP) mass spectrometer [11]. Mechanistically in PRM, for instance, all product ions that lie within a specified mass range and emanate from a specifically fragmented precursor are detected in the high resolution, high mass accuracy Orbitrap analyzer. An attractive feature of this technique, as well as MS/MSALL, is that it allows the post-analysis mining of previously collected (or archived) MS/MS data, and therefore allows the selection of alternate quantitative transitions if interference with the target(s) is observed.

The most desirable sample sources for biomarker research and clinical measurement are ideally non-invasive, such as urine or saliva. Although blood plasma and serum are semi-invasive, they are still commonly used for monitoring and stratifying diseases. Plasma and serum are used because they are relatively inexpensive to collect and analyze, and carry a wide dynamic range of proteins (approximating or exceeding 10 orders of magnitude [12]) that are secreted, released, or leaked from neighboring cells, tissues, or organs into the systemic circulation. The fluid therefore paints a physiological picture of the health status of an individual, which is imperative for disease diagnosis and prognosis. It is important to note here that there is a distinction between plasma and serum since the two are often incorrectly used interchangeably by the proteomics field. Plasma and serum are both derived from whole blood, with serum collected from plasma after coagulation. It is through the coagulation process that an assembly of mid-abundance proteins (e.g., fibrinogen, prothrombin, thrombin, and a host of coagulation factors – notably II, V, and VIII) are at least partially removed. Serum is, however, generally disfavored by the Human Proteome Organization (HUPO [13, 14]) since coagulation can cause additional proteins to be unintentionally removed through non-specific interactions and is also a highly variable process, with the results being dependent upon the coagulation conditions and the nature of the collection tube [15]. It is for these reasons that our blood-based assay developments and analyses are commonly conducted with plasma, with the exception being our dried fluid spot quantitative analyses where the spots originate from whole blood [16].

As implied above, plasma is an inherently complex biofluid, carrying thousands of potentially measurable proteins spanning the low mg/mL (or millimolar; encompassing serum albumin and the immunoglobulins, among others) to low pg/mL (or attomolar; which includes the interleukins and cytokines) concentration range. An active area of biomedical research centers on developing sensitive methods to accurately and reproducibly quantify proteins at the lower end of the concentration range since these candidates are considered to have the greatest diagnostic potential. Targeted quantitative methods for detection of proteins with concentrations below the MRM detection limit often use anti-protein [17] or anti-peptide [12, 18, 19] antibodies for immunoaffinity enrichment or alternatively the implementation of multidimensional separations (increasingly with alkaline and acidic RPLC (reversed-phase liquid chromatography) [20, 21], less commonly with strong cation-exchange and RPLC configurations [22]) for peptide fractionation. Additional techniques developed for deeper protein quantification involve the upfront use of immunodepletion for high-abundance protein removal via antibody-based affinity interactions [22–26]. Depletion, however, is disfavored from a cost and throughput perspective, as well as for the potential of target protein loss through non-specific or non-covalent interactions with the depletion cartridge or depleted proteins. An added detraction of this technique is the potential underestimation of protein concentration, as was demonstrated recently by Percy et al. in the side-by-side comparison of a depletion-based and depletion-free, multiplexed quantitative proteomic assay of cerebrospinal fluid [27].
Nonetheless, despite the increasing emphasis on low-abundance proteins, antibody- and fractionation-free quantitative proteomic methods should also be developed for the screening of higher-abundance protein markers since these are also informative and correlate with multiple diseases such as cancer and cardiovascular disease (CVD) [28, 29]. This is why we have developed sets of highly multiplexed (defined as enabling multi-analyte detection in a single analytical run) MRM assays for the precise quantitation of high-to-moderate abundance, candidate protein biomarkers in undepleted and non-enriched human plasma [20, 30–32].

The protein biomarker pipeline is essentially comprised of four stages – discovery, verification, pre-clinical validation, and clinical validation. Although quantitative MRM or PRM methods can be used to assess marker utility at all levels, their greatest value lies in the discovery and verification phases. Once the lengthy list of potential candidate markers has been screened and condensed according to statistical significance, resources can then be invested in the development of antibodies, which is a costly and developmentally intensive process [33]. At the validation stages of biomarker assessment, shorter lists of verified candidates (typically <10) are interrogated against a larger number of samples (on order of 1000s at the validation stage vs. 10s–100s in the preceding stages [33]). While ELISAs (enzyme linked immunosorbent assays) are often considered to be the “gold-standard” for clinical applications [34], emerging techniques, such as iMALDI (immuno matrix-assisted laser desorption/ionization; where peptide detection of captured peptides occurs by MALDI-TOF-MS without prior chromatographic separation [35]) and SISCAPA (stable isotope standards and capture with anti-peptide antibodies via LC-MS [18, 36] or MALDI-MS [37, 38] detection) could alternatively be employed.

To expedite biomarker verification, the targeted quantitative methods must be standardized. This should facilitate improved method reproducibility and transferability and lead to a more rapid and accurate evaluation of the candidate protein biomarkers in a given biological fluid [39, 40]. To this end, a variety of kits have been developed for the quantitative proteomics community. Stemming from work done in our laboratory, QC kits are developed to evaluate the performance of a LC-MS system and/or one type of sample preparation in a targeted quantitative proteomic workflow [41, 42]. Recently, we have also developed several biomarker assessment kits (BAKs) for screening various protein panels against patient plasma samples for biomarker discovery or verification studies. The methods collectively utilize an antibody-/fractionation-free approach, a rigorously optimized and evaluated bottom-up LC/MRM proteomic workflow, and our well characterized SIS peptides. The targeted proteins are either putative biomarkers for CVD and cancer or have unknown disease associations. Each BAK contains a collection of key starting materials (i.e., reference plasma, trypsin, and SIS peptide mixture), a detailed protocol, a LC-MS acquisition method, data analysis software, and a troubleshooting guide. This chapter will detail the protocol and provide the rationale behind the development and application of two recent biomarker assessment kits – BAK-192 for discovery and a custom BAK-21 for verification – for MRM-based quantitative proteomic studies. Also provided is a description and implementation of our recently developed Qualis-SIS software [43] for quantitative proteomic applications.

2 Targeted Quantitation Method – Strategy, Description, and Rationale

The principal checkpoints we use in developing sensitive and specific MRM-based quantitative proteomic assays, such as the BAK-192 and BAK-21, involve protein/peptide target selection, SIS peptide production, solution/sample preparation, interference screening, and protein quantitation (see Fig. 24.1 for our generalized workflow). Additional important steps include balancing the concentrations of the mixture of SIS peptides to their corresponding natural (or NAT) peptide signals (balancing helps reduce analytical variation between analyses [44]), as well as optimizing the MRM transitions (including their collision energies) and LC gradient. This section expands upon that basic framework developed to quantify multiplexed panels of plasma proteins for assessment as potential biomarkers via a bottom-up LC/MRM approach using SIS peptides. By outlining our strategy and rationale behind each development step, the user will obtain the necessary tools for extending the quantitative method to alternative panels and types of samples. The applications for which these BAKs are designed are discussed in the section that follows.

Fig. 24.1

General workflow for MRM assay development. Protein/peptide selection is a bioinformatics exercise aided by previously collected data or curated databases, as well as by software tools, such as PeptidePicker. The internal standards employed are SIS peptides, which are synthesized, purified, and characterized for more accurate protein quantitation. MRM transition optimization and screening for chemical interference in the sample matrix is performed empirically, while protein quantitation is performed on the interference-free peptides via standard curves

2.1 Protein and Peptide Selection

The first step in our quantitative proteomic method development is generating a list of potential biomarkers in human plasma. These putative biomarkers are selected from prior discovery experiments or from literature reports, and typically exist in a wide range of concentrations. Tryptic peptides (ideally a minimum of 2) are then chosen to act as molecular surrogates for each biomarker. Selection is based on adherence to a set of qualification criteria [45], with the most notable ones indicated below:

  • Peptides must be unique to the target biomarker (human in this case; determined from a BLASTp search).

  • Peptides must have been previously observed in tandem MS proteomic studies (revealed in the Global Proteome Machine and PeptideAtlas databases).

  • Peptides must not contain a missed tryptic cleavage site (Keil rules obeyed [46]).

  • Peptides must be between 5 and 25 residues in length to ensure acceptable ionization and gas-phase fragmentation.

To reduce error and subjectivity, the rules have recently been assembled into a software tool we named PeptidePicker, which automates candidate identification and ranks the selected peptide(s) for a given protein within a specified proteome (human or mouse) [47]. This program, we note, is an advancement over the PeptideSieve tool (developed by the Seattle Proteome Centre), which predicts proteotypic propensity based solely on the physicochemical properties of the peptides expected to result from a digest of a given protein [48]. Due to the accuracy and enhanced speed of peptide selection in PeptidePicker (ca. 50 proteins per hour compared to 8 per day in PeptideSieve [47]), the time devoted to bioinformatics is significantly reduced, allowing more time to be spent on the rest of assay development. Furthermore, PeptidePicker reduces human error and provides users with a standardized method for target peptide selection for any panel of biomarkers.
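Two of the qualification criteria listed above (peptide length and absence of a missed cleavage) lend themselves to a simple programmatic filter. The sketch below is only an illustration and is not the PeptidePicker implementation: the function names are hypothetical, the missed-cleavage check is reduced to the single rule that trypsin cleaves after K or R unless the next residue is P, and the last two entries in the candidate list are invented test sequences.

```python
def has_missed_cleavage(peptide: str) -> bool:
    """Return True if an internal K or R is followed by a residue other than P.

    Simplified rule (assumption): trypsin cleaves C-terminal to K/R,
    except when the following residue is proline.
    """
    for i in range(len(peptide) - 1):
        if peptide[i] in "KR" and peptide[i + 1] != "P":
            return True
    return False


def passes_basic_rules(peptide: str, min_len: int = 5, max_len: int = 25) -> bool:
    """Apply the length window (5 to 25 residues) and the missed-cleavage rule."""
    return min_len <= len(peptide) <= max_len and not has_missed_cleavage(peptide)


# First two sequences appear in this chapter; the last two are invented examples.
candidates = ["VGYVSGWGR", "YWGVASFLQK", "AKLR", "VVLSQGSKAAR"]
selected = [p for p in candidates if passes_basic_rules(p)]
print(selected)
```

Uniqueness (BLASTp) and prior observation in the Global Proteome Machine or PeptideAtlas require database lookups and are deliberately left out of this sketch.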

2.2 SIS Peptide Production

Once the proteotypic peptides have been selected, their heavy isotope labeled analogues are synthesized, purified, and characterized. These are essential steps for obtaining absolute and precise, but not necessarily accurate, endogenous protein concentrations. In our laboratory, synthesis is performed in-house on an Overture peptide synthesizer (Protein Technologies) using Fmoc chemistry. To enable chromatographic alignment of heavy isotope coded peptides with the regular NAT peptides (which greatly assists in the subsequent interference testing step), [13C]/[15N] isotopes (Cambridge Isotope Laboratories) are incorporated at the C-terminal residue of tryptic peptides, typically leading to +8 Da (from [13C6, 15N2]-lysine) or +10 Da (from [13C6, 15N4]-arginine) mass shifts. Purification is also performed in-house by RPLC, with the fractions of interest confirmed by MALDI-TOF-MS on an Ultraflex III TOF/TOF mass spectrometer (Bruker Daltonik). After lyophilization of the pooled target fractions, amino acid analysis (AAA) and capillary zone electrophoresis (CZE) are then performed for absolute concentration and purity determination, respectively. Of relevance here, the average purity of the 487 target peptides used in the discovery BAK-192 is 92 %.
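The nominal +8 Da and +10 Da shifts translate into smaller m/z offsets for multiply charged precursors. The following sketch computes the expected offset between a SIS peptide and its NAT counterpart from the isotope mass differences; the function and constant names are illustrative assumptions, and the charge state used in the example is arbitrary.

```python
# Monoisotopic mass differences in Da.
DELTA_13C = 1.0033548  # 13C minus 12C
DELTA_15N = 0.9970349  # 15N minus 14N

# Mass added by the labeled C-terminal residue.
LABEL_SHIFT_DA = {
    "K": 6 * DELTA_13C + 2 * DELTA_15N,  # [13C6, 15N2]-lysine, about +8 Da
    "R": 6 * DELTA_13C + 4 * DELTA_15N,  # [13C6, 15N4]-arginine, about +10 Da
}


def sis_mz_shift(peptide: str, charge: int) -> float:
    """Return the SIS-minus-NAT precursor m/z offset for a tryptic peptide."""
    c_term = peptide[-1]
    if c_term not in LABEL_SHIFT_DA:
        raise ValueError(f"no labeled residue defined for C-terminal {c_term!r}")
    return LABEL_SHIFT_DA[c_term] / charge


for pep in ("VGYVSGWGR", "YWGVASFLQK"):
    print(pep, round(sis_mz_shift(pep, charge=2), 4))
```

The same offset applies to y-type product ions, which retain the labeled C-terminus, whereas b-type ions are unshifted.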

2.3 Sample Preparation and LC-MS Processing

It is our general practice to prepare small sample sets (i.e., <20) manually in polypropylene Maxymum recovery microtubes (Axygen), but automate the preparation of larger sets of samples with a robot (Freedom EVO 150 platform; Tecan) in 96-well microtiter plates. A generalized flow chart of our sample preparation and processing workflow is illustrated in Fig. 24.2. It should be noted that our robot is configured to automate only the liquid handling steps, with centrifugation and incubation occurring externally.

Fig. 24.2

Overview of our sample preparation and processing workflow. The plasma proteins are unfolded and the disulfide bridges are cleaved and capped by a series of denaturation, reduction, alkylation, and quenching steps prior to tryptic proteolysis. Labeled peptide standards are spiked post-digestion to prevent chemical modification, which can occur during proteolysis. After the sample is concentrated by solid phase extraction, the peptide mixture is separated by RPLC and detected by dynamic MRM on a QqQ mass spectrometer. Plasma protein quantitation is achieved through SPM or regression analysis of the standard control curve

Toward the preparation of plasma proteolytic digests, a ten-fold diluted plasma sample (20 μL for the control and 6 μL of raw fluid per patient) is denatured, reduced, alkylated, and quenched with 1 % sodium deoxycholate (10 % initially), 5 mM tris(2-carboxyethyl) phosphine (50 mM initially), 10 mM iodoacetamide (100 mM initially), and 10 mM dithiothreitol (100 mM initially), respectively, all prepared in 25 mM ammonium bicarbonate. The protein denaturation and Cys-Cys reduction steps occur simultaneously for 30 min at 60 °C, while Cys alkylation and iodoacetamide quenching are performed subsequently for 30 min at 37 °C. Thereafter, proteolysis is achieved by the addition of 23.3 μL TPCK-treated trypsin (Worthington) (1.8 mg in 2 mL of 25 mM ammonium bicarbonate; prepared immediately before addition) at a 10:1 substrate:enzyme ratio. After overnight incubation at 37 °C, proteolysis is arrested by the step-wise addition of a chilled SIS peptide mixture (concentration balanced; 50 μL at 250 to 0.5 fmol/μL for the control or 50 μL at 25 fmol/μL for the patient plasma) and a chilled formic acid (FA) solution (277 μL at 1 %) to a digest aliquot (277 μL; pooled from 4 digests in the control prep). The SIS mixes used in the control will be used to prepare the calibration curves. These mixtures each contain a fixed amount of endogenous peptide and an increasing concentration of synthetic peptide (over a 500-fold concentration range). The resulting dilution series prepared from each reference standard is as follows: 250 fmol/μL stock (standard F), 125 fmol/μL (standard E), 25 fmol/μL (standard D), 12.5 fmol/μL (standard C), 2.5 fmol/μL (standard B), and 0.5 fmol/μL (standard A; all prepared in 0.1 % FA). A merit of the deoxycholate surfactant is that it is acid insoluble and therefore can be readily removed by simple centrifugation (10 min at 12,000 rpm).
This is in contrast to sodium dodecyl sulfate which damages the LC column and causes signal suppression if not properly removed. Following centrifugation, the peptide supernatant is concentrated by solid phase extraction (SPE) using a polymeric RP sorbent (10 mg Oasis HLB; Waters). The extraction steps are as follows:

  1. wash with 1 mL methanol,

  2. condition with 1 mL water,

  3. load with 556 μL of 0.1 % FA followed by 444 μL of digest supernatant,

  4. wash with 1 mL water, and

  5. elute with 300 μL of 50 % acetonitrile (ACN) in 0.1 % FA.

The eluate is then lyophilized and rehydrated in 100 μL of 0.1 % FA for LC-MRM/MS.
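The reagent quantities in the digestion protocol above can be cross-checked with a few lines of arithmetic. The sketch below only restates numbers given in the text; the variable names are illustrative, and the substrate mass is not stated in the protocol but is inferred here from the quoted 10:1 substrate:enzyme ratio.

```python
# Trypsin: 1.8 mg dissolved in 2 mL, of which 23.3 uL is added per digest.
trypsin_stock_ug_per_ul = 1.8 / 2.0            # mg/mL is numerically equal to ug/uL
trypsin_added_ug = 23.3 * trypsin_stock_ug_per_ul

# Implied protein load at a 10:1 substrate:enzyme ratio (inferred, not stated).
implied_substrate_ug = 10 * trypsin_added_ug

# SIS standard series (fmol/uL); 50 uL of each level is spiked into the control.
standards_fmol_per_ul = {"A": 0.5, "B": 2.5, "C": 12.5, "D": 25.0, "E": 125.0, "F": 250.0}
spike_volume_ul = 50
spiked_fmol = {name: conc * spike_volume_ul for name, conc in standards_fmol_per_ul.items()}

fold_range = max(standards_fmol_per_ul.values()) / min(standards_fmol_per_ul.values())

print(f"trypsin added: {trypsin_added_ug:.2f} ug")
print(f"implied substrate: {implied_substrate_ug:.1f} ug")
print(f"SIS spiked per level (fmol): {spiked_fmol}")
print(f"calibration fold range: {fold_range:.0f}")
```

Standard D (25 fmol/μL) is the same concentration used for the patient samples, so each patient digest receives 1250 fmol of each SIS peptide.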

The LC-MS system we routinely use for the BAKs consists of a 1290 Infinity system that is interfaced to a 6490 triple quadrupole (QqQ) mass spectrometer (all from Agilent Technologies) via a standard-flow, ESI source (operated in the positive ionization mode). The LC column is a Zorbax Eclipse Plus RP-UHPLC column (2.1 × 150 mm, 1.8 μm particles). The separation occurs over a 43 min gradient (1.5–81 % mobile phase B; mobile phase compositions: 0.1 % FA in water for A and 0.1 % FA in ACN for B) at flow rates of 0.4 mL/min and a temperature of 50 °C. A 4 min post-acquisition step using mobile phase A is allotted for column equilibration. The specific gradient we employ is as follows (time in min, %B): 0, 1.5; 1.5, 6.3; 16, 13.5; 18, 13.77; 33, 22.5; 38, 40.5; 39, 81; 42.9, 81; and 43, 1.5. Note that standard flow rates are used instead of conventional nano-flow rates due to the superior analytical merits (in terms of reproducibility and sensitivity) found for the standard flow system when 10× material is loaded onto a wider-bore column [49]. The mass spectrometer is operated in the dynamic MRM mode (i.e., scheduled retention times for enhanced analyte specificity and reduced duty cycle) with 1 min detection windows and cycle times approximating 850 ms (see [32] and its supplemental tables for the general and specific acquisition parameters).
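For method transfer it can be convenient to hold the gradient as a table and query the mobile phase composition at any time point. The sketch below encodes the (time, %B) pairs given above and interpolates linearly between them, which is an assumption about how the pump ramps between set points; it also estimates the number of data points across a peak from the approximate 850 ms cycle time. Function names are illustrative.

```python
# (time in min, % mobile phase B), as listed in the text.
GRADIENT = [
    (0.0, 1.5), (1.5, 6.3), (16.0, 13.5), (18.0, 13.77), (33.0, 22.5),
    (38.0, 40.5), (39.0, 81.0), (42.9, 81.0), (43.0, 1.5),
]


def percent_b(t_min: float) -> float:
    """Return %B at time t_min, assuming linear ramps between set points."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time outside the 0 to 43 min gradient")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise RuntimeError("unreachable for a well-formed gradient table")


def points_across_peak(peak_width_s: float, cycle_time_s: float = 0.85) -> float:
    """Approximate number of MRM cycles sampled across a chromatographic peak."""
    return peak_width_s / cycle_time_s


print(round(percent_b(8.75), 2))          # halfway along the 1.5 to 16 min segment
print(round(points_across_peak(10.0), 1)) # a 10 s wide peak at an 850 ms cycle
```

With an 850 ms cycle, peaks roughly 8.5 to 13 s wide at the base would yield the 10 to 15 points per peak referred to later in this section.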

2.4 Interference Reduction and Screening

Interference is commonly observed in the quantitative analysis of human plasma. These interferences exist despite the m/z and retention time filtering in scheduled MRM acquisitions, and are attributed largely to the inherent complexity of blood plasma, as well as to the low resolution of the QqQ mass spectrometer employed. Tryptic proteolysis further increases the complexity as it converts thousands of plasma proteins into millions of peptides. This increased complexity raises the possibility of non-target ion transmission in the quadrupole mass analyzers (Q1 and Q3), which necessitates the use of interference reduction and screening techniques in quantitative proteomic studies.

Interferences can be reduced by minimizing concurrent MRM transitions, so our method development first involves optimizing the LC gradient, to produce an even distribution of peptides across the chromatographic space. To ensure the accuracy of quantitative results, the control and sample are first screened for interference. This is conducted empirically in our laboratory, as opposed to theoretically using a program such as SRM Collider [50]. In the analysis of the control (also referred to as the reference) sample digest, interferences are determined by monitoring the SIS and NAT responses (i.e., peak areas) under matrix-free and matrix-containing conditions (both at n = 2). The variability in these calculated response ratios indicates the presence or absence of interferences in the MRM ion channels. For a given peptide to be interference-free, the average relative ratios between a SIS transition in buffer or plasma, and NAT transition in plasma, must have CVs below 20 %. Further, the NAT and SIS signals must be the same in both peak shape and retention time. Figure 24.3a shows a typical example of an interference-free and interference-containing peptide. In this example, the interference observed in the NAT transition of YWGVASFLQK and the high variability of two of its three average relative ratios precludes its use for protein quantitation.
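The CV criterion described above can be sketched in a few lines. The interpretation used here is an assumption: the "relative ratios" are taken to be the pairwise ratios between the three transitions of a peptide, each ratio is evaluated under three conditions (SIS in buffer, SIS in plasma, NAT in plasma), and the CV is computed across those conditions. The function names and peak areas are invented for illustration, and the peak shape and retention time checks are not covered.

```python
from itertools import combinations
from statistics import mean, stdev


def relative_ratios(areas):
    """Pairwise ratios between transition peak areas (3 ratios for 3 transitions)."""
    return [areas[i] / areas[j] for i, j in combinations(range(len(areas)), 2)]


def ratio_cvs(sis_buffer, sis_plasma, nat_plasma):
    """Percent CV of each relative ratio across the three measurement conditions."""
    per_condition = [relative_ratios(a) for a in (sis_buffer, sis_plasma, nat_plasma)]
    return [100 * stdev(values) / mean(values) for values in zip(*per_condition)]


def is_interference_free(sis_buffer, sis_plasma, nat_plasma, max_cv=20.0):
    """A peptide passes only if every relative ratio has a CV below max_cv."""
    return all(cv < max_cv for cv in ratio_cvs(sis_buffer, sis_plasma, nat_plasma))


# Hypothetical average peak areas (n = 2) for three transitions of one peptide.
clean = dict(sis_buffer=[100, 50, 25], sis_plasma=[200, 100, 50], nat_plasma=[1000, 500, 250])
# Same peptide with extra signal in the third NAT transition.
interfered = dict(sis_buffer=[100, 50, 25], sis_plasma=[200, 100, 50], nat_plasma=[1000, 500, 750])

print(is_interference_free(**clean))
print(is_interference_free(**interfered))
print([round(cv, 1) for cv in ratio_cvs(**interfered)])
```

In the second data set, the two ratios that involve the affected transition exceed the threshold while the third stays at zero, mirroring the "two of three" pattern described for YWGVASFLQK.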

Fig. 24.3

Interference screening strategies for MRM transitions monitored in control and patient plasma digests. (a) Representative XICs of 3 SIS and NAT transitions measured in buffer and control plasma for the interference-free peptide VGYVSGWGR (from haptoglobin, P00738) and the interference-containing peptide YWGVASFLQK (from retinol-binding protein 4; P02753). (b) Relative response (RR) correlation plots and peptide XICs for the interference-free peptides SFNPNSPGK and IQNILTEEPK from serum paraoxonase/arylesterase 1 (P27169) and the interference-containing peptide VVLSQGSK from sex hormone-binding globulin (P04278) in the CVD patient sample marked with the arrow. These figures were reprinted from [41] and [32], respectively, with permission

The aforementioned approach is suitable for the inspection of control samples, but an alternative strategy must be adopted for interference screening in patient samples. Our recommended strategy requires a minimum of two peptides to be targeted for a given protein in order to construct peptide correlation plots (ratios of quantifier NAT/SIS relative responses), as first introduced by Agger et al. [51]. The linearity of each plot is then examined for outliers, with those that deviate requiring further inspection of their SIS and NAT peptide extracted ion chromatograms (XICs) to evaluate the level of interference. We recently demonstrated the implementation of this strategy in the quantitative analysis of 40 CVD-linked proteins (inferred from an average of three peptides per protein) across a small CVD patient cohort (n = 18; blood plasma supplied by Bioreclamation). As illustrated in Fig. 24.3b, the peptides SFNPNSPGK and IQNILTEEPK can effectively serve as surrogates for serum paraoxonase/arylesterase 1 (P27169) in all of the measured samples since they are interference-free, while peptide VVLSQGSK cannot be used to quantify sex hormone-binding globulin (P04278) in the CVD patient sample marked with the arrow due to interference. The advantage of this approach is that it requires the peptide responses of only the quantifier transitions, which enables BAK-192 to be processed with a single acquisition method. The use of multiple transitions (customarily with 1 quantifier and 2 qualifiers) for enhanced interference evaluation for BAK-192 discovery requires 3 LC-MS acquisition methods (2922 total transitions for the 487 peptides, with 1461 transitions targeted for each of the two peptide forms). In this case, multiple methods are required to reduce the duty cycle and obtain sufficient points across a chromatographic peak (defined as 10–15) for improved ion statistics.
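The idea behind the peptide correlation plots can be illustrated with a simple numerical screen. This sketch is a stand-in for visual inspection of the plots and is not the authors' procedure: it assumes that two interference-free peptides from the same protein give NAT/SIS relative responses in a constant proportion across samples, and it flags samples whose proportion departs from the cohort median by more than a hypothetical tolerance. All values are invented.

```python
from statistics import median


def flag_discordant_samples(rr_a, rr_b, tolerance=0.30):
    """Return indices of samples where peptide A and B relative responses disagree.

    rr_a, rr_b: quantifier NAT/SIS relative responses for two peptides of the
    same protein, one value per sample. tolerance is a fractional deviation
    from the median A/B ratio (hypothetical default).
    """
    ratios = [a / b for a, b in zip(rr_a, rr_b)]
    centre = median(ratios)
    return [i for i, r in enumerate(ratios) if abs(r - centre) / centre > tolerance]


# Invented relative responses for six patient samples.
peptide_a = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
peptide_b = [0.25, 0.5, 0.75, 1.0, 1.25, 3.0]  # last sample is inflated

print(flag_discordant_samples(peptide_a, peptide_b))

# Transition bookkeeping for the three-transition BAK-192 variant.
n_peptides, n_transitions, n_forms = 487, 3, 2
print(n_peptides * n_transitions * n_forms)
```

A flagged sample only indicates that the two peptides disagree; the SIS and NAT XICs still need to be inspected to decide which peptide carries the interference.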

2.5 Plasma Protein Quantitation

The MRM data is first examined with MassHunter Quantitative Analysis software (Agilent; Skyline can alternatively be used) for verification, peak selection, and integration. Thereafter, the processed data is input into our in-house developed software tool – Qualis-SIS – for analysis. This tool requires two input files for each of the reference and sample data sets. These files carry peptide- and protein-related information, with SIS and NAT responses required for the former (retention time, peak width, symmetry, and other metrics can additionally be included) and protein molecular weights and SIS peptide concentrations required for the latter. After defining a small number of criteria (e.g., regression weighting, precision and accuracy requirements) for each concentration level of the standard curve, the tool automatically performs the following three functions: (1) generates and extracts assay information from standard control curves, (2) determines the endogenous protein concentrations in the patient samples, and (3) assesses the quality of the quantitative sample measurement with respect to the assay’s linear dynamic range. The following information is provided by each control curve: endogenous protein concentration, dynamic range, lower and upper limits of quantitation (LLOQ and ULOQ), and regression equation (slope and y-intercept) with coefficient of determination (R²). In the analysis of the samples, each measured concentration (derived from relative response measurements, also referred to as single point measurement or SPM, and from linear regression analysis) is plotted on each peptide’s standard curve. The quality assessment page indicates whether or not the results should be trusted through a color-coded matrix. In the matrix, green denotes an acceptable quantitative value (due to its presence within the assay’s range of linearity), yellow indicates that caution should be exercised, while red suggests that the value should be discarded (see Fig. 24.4 for an example of each classification type from the CVD-directed quantitative study indicated above). The assessment is based on the relationship of the concentration to the linear dynamic range as well as its deviation from the LOQ and the user-defined confidence threshold. The comprehensive and summarized results can then be exported for subsequent reporting and statistical treatment.
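The calculations described in this section can be outlined in code. What follows is a minimal sketch under stated assumptions and is not the Qualis-SIS implementation: the 1/x² weighting is one common choice of regression weighting, the green/yellow/red thresholds (including the 20 % margin) are hypothetical, all function names are invented, and the unit conversion ignores any dilution introduced during sample preparation.

```python
def weighted_linear_fit(x, y):
    """Fit y = slope * x + intercept by least squares with 1/x^2 weights."""
    w = [1.0 / xi ** 2 for xi in x]
    sw = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, x))
    sy = sum(wi * yi for wi, yi in zip(w, y))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    slope = (sw * sxy - sx * sy) / (sw * sxx - sx ** 2)
    intercept = (sy - slope * sx) / sw
    return slope, intercept


def spm_concentration(nat_area, sis_area, sis_fmol_per_ul):
    """Single point measurement: NAT/SIS relative response times SIS concentration."""
    return nat_area / sis_area * sis_fmol_per_ul


def to_ng_per_ml(fmol_per_ul, mw_g_per_mol):
    """Convert fmol/uL (numerically nmol/L) to ng/mL using the protein molecular weight."""
    return fmol_per_ul * mw_g_per_mol / 1000.0


def classify(conc, lloq, uloq, margin=0.2):
    """Hypothetical traffic-light rule relative to the linear range."""
    if lloq <= conc <= uloq:
        return "green"
    if lloq * (1 - margin) <= conc <= uloq * (1 + margin):
        return "yellow"
    return "red"


# Standards A to F (fmol/uL) with an invented, perfectly linear response.
levels = [0.5, 2.5, 12.5, 25.0, 125.0, 250.0]
responses = [2.0 * c + 0.1 for c in levels]
slope, intercept = weighted_linear_fit(levels, responses)
print(round(slope, 6), round(intercept, 6))

conc = spm_concentration(nat_area=8000, sis_area=4000, sis_fmol_per_ul=25.0)
print(conc, to_ng_per_ml(conc, mw_g_per_mol=66500))
print(classify(10, lloq=5, uloq=260), classify(4.5, lloq=5, uloq=260), classify(1, lloq=5, uloq=260))
```

The 1/x² weighting keeps the low-concentration standards from being dominated by the high ones, which matters when the curve spans a 500-fold range.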

Fig. 24.4

Examples of patient sample results from the Qualis-SIS data analysis software tool. The examples show cases where the quantitative results are (a) acceptable (TAAQNLYEK from apolipoprotein C-II; P02655), (b) intermediate (IIPHHNYNAAINK from coagulation factor IX; P00740), or (c) unacceptable (TLEAQLTPR from heparin cofactor 2; P05546). The results were obtained from the same patient plasma sample used in the CVD study

3 Method Implementation and Practical Biomarker Applications

Through rigorous evaluation and refinement, a well characterized set of MRM assays has been developed for quantifying a multiplexed panel of 192 candidate disease markers in unfractionated human plasma. The method centers on a bottom-up UHPLC/MRM workflow and uses concentration-balanced SIS peptides as internal standards. The quantified proteins are of high-to-moderate abundance, with concentrations spanning 6 orders of magnitude, from 31 mg/mL (for serum albumin, P02768) to 18 ng/mL (for peroxiredoxin-2, P32119) – see Fig. 24.5a for the quantitation range. These endogenous concentrations were derived from standard control curves (based on 144 proteins [52]) and/or individual XIC measurements (based on an additional 48 proteins [20]) using peptides as surrogates (487 interference-free in total). The curves were constructed according to a strict set of qualification criteria, which our developed software – Qualis-SIS – applies in an automated and rapid manner. The result of this analysis was a set of assays with average linear dynamic ranges of 10²–10³, protein LLOQs between 5 ng/mL and 260 ng/mL (based on quantifier peptides), and average R² values of 0.980. The assay reproducibility is high, with average CVs of <6 % for relative responses and <0.1 % for retention times routinely obtained over replicate analyses [52]. These quantitative panels can now be applied in discovery- and verification-directed proteomic studies to help bridge the gap between biomarker discovery and validation.
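The quoted concentration span can be verified directly from the two end points. The short sketch below converts both values to ng/mL and takes the base-10 logarithm of their ratio; the function name is illustrative.

```python
import math


def orders_of_magnitude(high: float, low: float) -> float:
    """Number of decades separating two concentrations in the same units."""
    return math.log10(high / low)


albumin_ng_per_ml = 31 * 1_000_000   # 31 mg/mL expressed in ng/mL
peroxiredoxin2_ng_per_ml = 18        # 18 ng/mL

span = orders_of_magnitude(albumin_ng_per_ml, peroxiredoxin2_ng_per_ml)
print(round(span, 2))
```

The result is a little over six decades, in line with the 6 orders of magnitude stated for the panel.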

Fig. 24.5

Quantitation results from the multiplexed MRM analysis of control plasma. The range of protein concentrations shown in (a) was determined from the BAK-192 discovery analysis, while the concentration distribution in (b) is from the BAK-21 verification analysis

In the classical sense, protein biomarker discovery is accomplished through bottom-up (or shotgun) LC-MS/MS using a multidimensional protein identification technology (MudPIT) in conjunction with data dependent acquisition (DDA). In DDA, a subset of peptide precursor ions, detected in the survey scan, is selected for CID based on abundance, yielding a collection of complete product ion spectra. Typical acquisition instruments for this include the quadrupole time-of-flight (QTOF) and hybrid ion trap-Orbitrap mass spectrometers. While technological advancements have enabled broad classes of putative protein biomarkers to be identified through DDA, their detection sensitivity and sample-to-sample reproducibility are limited due to the intensity-driven, stochastic nature of the precursor ion selection process [53, 54]. To overcome these inherent issues, data independent acquisition (DIA) strategies, such as MS/MSALL [55], have been proposed. This approach is based on the acquisition of complete product ion spectra generated from the dissociation of all precursors measured in given SWATH windows (typically 25 amu spanning from 400 to 1200 m/z) over the chromatographic run. While this may provide enhanced reproducibility and throughput over a DDA-based method, an MRM-based methodology, such as that described above, can instead be employed at the discovery stage for improved sensitivity, throughput, and reproducibility.
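To make the SWATH window scheme concrete, the sketch below generates the precursor isolation windows implied by the typical settings quoted above (25 amu wide, 400 to 1200 m/z) and looks up the window that would contain a given precursor. It assumes contiguous, non-overlapping windows, which is a simplification; the function names and the example m/z value are illustrative.

```python
def swath_windows(start=400.0, stop=1200.0, width=25.0):
    """Return contiguous (lower, upper) precursor isolation windows in m/z."""
    n_windows = round((stop - start) / width)
    return [(start + i * width, start + (i + 1) * width) for i in range(n_windows)]


def window_for(precursor_mz, windows):
    """Return the window containing precursor_mz, or None if it is out of range."""
    for lower, upper in windows:
        if lower <= precursor_mz < upper:
            return (lower, upper)
    return None


windows = swath_windows()
print(len(windows), windows[0], windows[-1])
print(window_for(512.3, windows))
```

Every precursor falling in a window is fragmented together, so the 32 windows are cycled repeatedly across the chromatographic run rather than selected by precursor intensity as in DDA.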

The discovery BAK-192 platform allows the interrogation of 192 proteins using 487 peptides as molecular surrogates. In this targeted application, the candidates will be assessed by quantitatively comparing the patient sample results with those from healthy controls. Ideally, a minimum of three process replicates (also referred to as “analytical replicates” that encompass the entire preparatory workflow) should be obtained. However, only replicates that are quantitatively reproducible and interference-free should be used for comparison. To be statistically significant, a fold change ratio exceeding 1.5 and a p value <0.05 are desired [56]. While this biomarker panel is rather small, it covers a broad concentration range of proteins that can be consistently quantified without laborious pre-fractionation, which can in itself introduce variability. For more comprehensive biomarker discovery efforts, however, pre-fractionation is undoubtedly required. Using a scaled-up sample preparation method, we have recently developed a multidimensional LC-MRM workflow for quantifying a broader and deeper (by a 2 order of magnitude concentration range) panel of putative protein markers in human plasma [20]. In that method, the LCs are operated under alkaline and acidic mobile phase conditions for altered peptide selectivity, using an ACN gradient with constant 10 mM ammonium hydroxide (pH 10) in the former dimension and an ACN gradient with constant 0.1 % FA (pH 3) in the latter. Both dimensions additionally utilize RP stationary phases and standard-flow rates. Using SPM, and recently standard curves for a smaller protein panel (e.g., the low abundance targets osteopontin and matrix metalloproteinase 9 at 7 and 16 ng/mL, respectively), 253 proteins (inferred from 625 peptides) were quantified across an 8 order-of-magnitude concentration range. This panel can also represent a potentially useful starting point for assessing potential biomarker candidates at lower concentrations.
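The two significance criteria (fold change above 1.5 and p < 0.05) can be combined in a small screening function. The statistical test is not specified in this chapter, so the sketch below substitutes an exact two-sided permutation test on the difference in group means, implemented with the standard library only; treating a fold change below 1/1.5 as equally notable is also an assumption. All names and concentrations are invented. Note that with only three values per group an exact permutation test cannot reach p < 0.05, so the example uses five per group.

```python
from itertools import combinations
from statistics import mean


def fold_change(patient, control):
    """Ratio of mean patient concentration to mean control concentration."""
    return mean(patient) / mean(control)


def permutation_p_value(a, b):
    """Exact two-sided permutation p value for the difference in group means."""
    pooled = list(a) + list(b)
    n_a, n_b = len(a), len(b)
    total = sum(pooled)
    observed = abs(mean(a) - mean(b))
    hits = count = 0
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        count += 1
        if diff >= observed - 1e-12:  # tolerance for floating-point ties
            hits += 1
    return hits / count


def is_candidate(patient, control, min_fold=1.5, alpha=0.05):
    """Flag a protein that changes by more than min_fold in either direction."""
    fc = fold_change(patient, control)
    changed = fc > min_fold or fc < 1.0 / min_fold
    return changed and permutation_p_value(patient, control) < alpha


# Invented concentrations (ng/mL) for one protein.
control = [10, 11, 9, 10, 10]
patient = [20, 21, 19, 22, 20]
print(round(fold_change(patient, control), 2))
print(round(permutation_p_value(patient, control), 4))
print(is_candidate(patient, control))
```

In a panel of 192 proteins, a correction for multiple testing would normally be layered on top of the per-protein p values; that step is omitted here.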

In a separate study focused on biomarker verification, a 21-plex protein assay was selected by a group of investigators based on their previous proteomic discovery results and our 1D LC-MRM/MS quantitative capabilities. The overall aim of their study was to determine whether these proteins play a role in the resolution and remission of type 2 diabetes after bariatric surgery. Bariatric surgery is of considerable research interest as it has rapid and dramatic effects on glycemic control. Recent studies by Mingrone et al. [57] and Schauer et al. [58] found bariatric surgery to be more effective than conventional medical therapy in controlling hyperglycemia in severely obese patients with type 2 diabetes, leading to long-term benefits on macro- and microvascular disease [59]. Since some bariatric procedures, such as biliopancreatic diversion, improve glycemic control in people with diabetes, understanding this additional effect could provide insight into the pathogenesis of type 2 diabetes and assist in the development of new drug modalities. To address this unanswered question, we are currently engaged in a project involving a cohort of 20 morbidly obese, insulin-resistant patients whose plasma was collected over a 13-point time-course (from before surgery to 28 days post-surgery).

Sample preparation and analysis of the BAK-21 are as described above. This requires standard curves to be prepared for each of the 5 plates of 50 samples. Preliminary results for the concentration distribution from this study are shown in Fig. 24.5b. To aid in standardization, key starting materials (i.e., reference plasma, trypsin, and SIS peptide mixture) and acquisition/analysis methods have been assembled. The final MRM acquisition method consists of a maximum of two proteotypic peptides per protein (39 total) and three transitions per peptide, which will be used for interference screening and protein quantitation of the patient samples, as outlined above. To ensure consistent performance of the LC-MS platform, daily/monthly QC kits will also be run before and after each plate. These kits require only simple rehydration of the lyophilized, SIS-spiked plasma digest(s) prior to LC-MRM/MS analysis, with performance evaluated by tracking the measured values and correlating them to the reference values in the kits. These QC kits have already proven useful in diagnosing instrument errors and performance deficits in intra-/inter-lab studies [41, 42], and should help again here to validate the experimental workflow and analytical system.

4 Summary

We have developed a set of highly specific and robust MRM-based assays for quantifying a large panel of 192 high-to-moderate abundance candidate protein markers in antibody- and fractionation-free human plasma. The 192-protein panel (inferred from 487 peptides) is designed to be implemented in targeted, discovery-based biomarker studies, while a subset panel of 21 targets has been designed for biomarker verification in a diabetes-centric study. To help standardize the process, essential materials required to complete the entire protocol (from sample preparation and processing to quantitative analysis) have been assembled into kits, as described here for the BAK-192 and BAK-21. Additionally, our recently developed Qualis-SIS software offers an automated means of quantifying proteins in reference and patient samples through regression analysis of standard curves or through SPM. To aid in quality assessment, the results are displayed in a color-coded matrix for rapid visualization and evaluation. Continued developments are focused on extending these panels for more comprehensive discovery and verification of putative, or unknown, protein biomarkers. Nonetheless, the strategies, kits, and tools discussed here serve as a useful starting point for biomarker evaluation of a panel of proteins of interest in patient samples.