
1 Overview

The importance of biomarkers has long been recognized by the public, the scientific community, and industry. Yet despite extensive efforts and funding investments in biomarker discovery, only 109 protein biomarkers in plasma or serum had been approved by the US Food and Drug Administration through 2008 [1], and even fewer protein biomarkers are currently used routinely in the clinic. In recent years, the introduction of new protein biomarkers approved by the US Food and Drug Administration has fallen to an average of 1.5 per year (a median of only one per year) [1]. The low efficiency of biomarker development has several causes, including the poor quality of clinical samples, the gap between the subjective clinical definition of a disease and objective protein measurements, and the high false discovery rate among differentially expressed proteins identified in the initial discovery phase [2]. It has become clear that the vast majority of differentially expressed proteins identified in the discovery phase will ultimately fail as useful clinical biomarkers, and only a few true positive candidates can move through the biomarker development pipeline. Isolating true biomarkers from the large pool of differentially expressed proteins identified in the discovery phase is therefore the greatest challenge and the bottleneck in most biomarker pipelines. To succeed, after the initial discovery study (see Chap. 20), the authenticity of biomarker candidates needs to be tested in a pilot study with high throughput, high accuracy, and reasonable cost. This essential process is addressed by the qualification and verification phase of the biomarker development pipeline.

The aims of the qualification and verification phase in the biomarker development pipeline are:

  1. To confirm the differential expression of candidates observed in the discovery phase

  2. To verify the correlation of the biomarker candidates with the disease over a relatively large population of patients

  3. To confirm the performance of the statistical model combining the biomarker panel.

The qualification and verification phase, therefore, is a critical phase in the transition from discovery to clinical applications. Three major factors influence the feasibility of a biomarker qualification and verification study:

  1. The availability of biospecimens from a well-curated cohort

  2. The availability of highly specific and quantitative assays for the biomarker candidates of interest

  3. The expense for assay development and applying the assays to measure a large number of targeted analytes across many samples.

1.1 Biospecimens from Clinical Cohort

Success in the qualification and verification phase relies on a rigorous clinical study design and attention to detail in sample acquisition, archival, and tracking. Biomarker studies typically seek to identify combinations of proteins whose measurement will serve as a molecular indicator of the severity of a disease or its early response to treatment. This use of biomarkers enables precision medicine, an approach that tailors specific interventions to those individuals who would benefit most. As described in Chap. 20, the “discovery phase” entails the application of high-throughput proteomics measurements to broadly sample proteins that distinguish between two disease states. The discovery phase is typically applied to a small number of representative cases and controls in a cohort. The qualification phase measures the candidates in the samples used in the discovery phase. The verification phase involves measuring the candidates in an independent, larger sample of similar cases and controls, frequently from multiple collaborating clinical sites. For the verification phase to be meaningful, reproducible, observer-independent criteria for case definition need to be applied.

Moreover, close attention to uniform sample acquisition and storage is paramount. There is increasing recognition that “center effects” (variations in sample acquisition, processing, and storage between sites) may have a profound impact on the discovery, qualification, and verification phases of the biomarker development pipeline. To overcome this issue, multi-site clinical studies should develop standard operating protocols (SOPs) for sample acquisition/archival at the onset of the study and rigorously adhere to them. Although the techniques for quality assessment/quality control of proteomics samples are currently limited, sample quality should be monitored where possible prior to the application of qualification and verification assays.

The number of samples used in a verification study needs to provide sufficient power to assess the sensitivity and specificity of a candidate biomarker panel. The sample size for the verification stage depends on multiple factors, including the analytical variation of the assays, the biological variation between patients, the concentration of biomarker candidates in clinical samples, and the effect size (the difference in the biomarker’s abundance between cases and controls). The statistical design for the number of biospecimens in verification studies should take these factors into account [3].
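As a rough illustration of how these factors interact, the sketch below applies the textbook normal-approximation formula for comparing two group means, n = 2(z_{1-α/2} + z_{power})² σ² / δ², where δ is the effect size and σ² is the sum of the biological and analytical variances. This is an assumption made for illustration and not the design procedure of ref. [3]; the function name, parameter names, and example values are hypothetical, and a real design would also account for multiplicity and panel-level performance.

```python
import math
from statistics import NormalDist


def samples_per_group(effect_size, sd_biological, sd_analytical,
                      alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-sided comparison of means.

    All inputs are on the same scale (for example, log2 abundance).
    Hypothetical helper for illustration; not the procedure of ref. [3].
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = NormalDist().inv_cdf(power)           # desired statistical power
    # Assumption: biological and analytical variation are independent,
    # so their variances add.
    total_variance = sd_biological ** 2 + sd_analytical ** 2
    n = 2 * (z_alpha + z_beta) ** 2 * total_variance / effect_size ** 2
    return math.ceil(n)


# Illustrative values only: a noisier assay or a smaller effect needs more samples.
print(samples_per_group(1.0, 1.0, 0.0))
print(samples_per_group(1.0, 1.0, 0.5))
print(samples_per_group(0.5, 1.0, 0.0))
```

The sketch makes the qualitative point of the paragraph concrete: required sample size grows with total variance and falls with the square of the effect size.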

1.2 Requirements for Qualification and Verification Assays

The transition from the discovery study to qualification and verification usually requires a shift from the unbiased, quantitative or semi-quantitative approaches used in discovery to a targeted and much more precise, reproducible, quantitative approach. If such assays for biomarker candidates are not readily available, they need to be established de novo within a short lead-time. The analytical performance of the biomarker qualification and verification assays, including accuracy, precision, repeatability, reproducibility, sensitivity, specificity, and linear dynamic range, should be validated to meet the predicted needs of the study. The assays need to have high selectivity and sufficient sensitivity to detect and quantify the targeted analytes in a highly complex matrix (such as human plasma). Because the goals of biomarker qualification and verification are to confirm and verify the relative changes observed in the discovery study and to evaluate the performance of the candidates in combination, rather than to measure the actual amount of analytes in biological samples, true accuracy is usually not required in qualification/verification studies. However, the assays need to have high repeatability and reproducibility so that they can be used to precisely and consistently measure relative changes in a large number of targeted analytes across many samples. Ideally, the assay can be standardized across laboratories.

Because all biomarker candidates identified in a discovery phase need to be tested in hundreds of samples over a short period of time and at reasonable cost, confirmatory technologies should have high-throughput capability for analyzing hundreds of samples with good precision and accuracy, be capable of multiplexing to evaluate a significant number of biomarker candidates at a time, require minimal sample consumption (because the sample amount may be limited), and have a low assay cost.

2 Platforms for Qualification/Verification, Advantage and Disadvantage

  1. Enzyme-linked immunosorbent assay (ELISA)

ELISA has been extensively used in the verification of biomarkers. It has extraordinary sensitivity (low pg/mL) [4, 5]. This technique has high sample throughput and is capable of analyzing hundreds of samples with good precision. For example, ELISA can reliably measure interleukin (IL)-6 at concentrations as low as 0.15 pg/mL with a coefficient of variation (CV) of 5 % [2]. However, only a small number of potential biomarker candidates have immunoassay-grade antibody pairs available. Developing a new, clinical-grade ELISA is costly ($100,000–$4 million per biomarker candidate), time-consuming (1–1.5 years), and associated with a high failure rate [6]. It is even more difficult to develop multiplex ELISAs for a large number of protein targets because of possible cross-reactivity between antibodies [7, 8]. Taken together, ELISA technology is not well suited for quantifying a large number of protein candidates in a qualification and verification study.

  2. Selected reaction monitoring (SRM)

A number of targeted mass spectrometry approaches have emerged recently, such as accurate inclusion mass screening (AIMS), parallel reaction monitoring (PRM), SRM, and data-independent acquisition (DIA-MS/MS) coupled with targeted data extraction. These approaches have tremendous promise for specific, reproducible, and quantitative measurements of changes of proteins of interest in clinical research. Among them, SRM is currently the most widely used approach for biomarker qualification and verification.

SRM-MS has emerged as a favorable alternative to immunoassays for qualification and verification of candidate biomarkers. In an SRM-MS assay, one or two signature proteotypic peptides are selected to stoichiometrically represent the protein candidate of interest. The SRM analysis of these signature peptides is performed on a triple quadrupole mass spectrometer (QQQ-MS). In SRM assays, the precursor ion of interest is preselected in the first mass filter (Q1) and induced to fragment by collision-induced dissociation in the second quadrupole (Q2). Several preselected fragment ions are then analyzed in the third mass filter (Q3), and their signals are monitored over the chromatographic elution time. SRM-MS offers several attractive features as a qualification/verification assay. First, because only preselected precursor-product ion transitions are monitored in SRM mode, the noise level is significantly reduced, and SRM assays thereby lower the detection limit for peptides by up to 100-fold in comparison to full-scan MS/MS analysis. Second, if the precursor-product ion transition of a proteotypic peptide is unique to the protein of origin, it is not only distinguishable from other MS signals in an LC run but is also a characteristic signature of the protein of interest. The two filtering stages in SRM mode therefore result in near-absolute structural specificity for the target protein, a significant advantage over immunoassays. Third, because no affinity reagent is typically needed, SRM assays can be developed rapidly and cost-efficiently in comparison to immunoassays. Finally, SRM assays have multiplexing capability: hundreds of precursor-product ion transitions can be monitored in SRM mode over one LC run, allowing the simultaneous quantification of tens to one hundred protein biomarker candidates.

SRM-MS in combination with stable-isotope dilution (SID-SRM-MS) is a target-driven approach for direct quantification of target proteins in a complex mixture [9]. In stable isotope dilution experiments, 13C- or 15N-labeled absolute quantification (AQUA) peptide standards [9], concatemers of standard peptides (QCAT) [10, 11], or isotope-labeled full-length target proteins [12, 13] are added to the sample as internal standards. The sample is digested with trypsin, and the resultant mix of unlabeled and labeled peptides is analyzed by SRM-MS. Absolute quantification of a target protein can be achieved by comparing the abundance of the known internal standard peptide with that of its native peptide when well-qualified isotope-labeled full-length protein standards are available.
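The arithmetic behind stable-isotope dilution can be sketched in a few lines. The example below assumes a single-point calibration in which the native (light) and labeled (heavy) peptides respond equally, so the light-to-heavy peak area ratio multiplied by the spiked amount of standard gives the native amount. The function name, transition labels, and numbers are hypothetical, and the sketch ignores digestion efficiency and calibration-curve fitting.

```python
def endogenous_amount_fmol(light_areas, heavy_areas, heavy_spike_fmol):
    """Estimate the native peptide amount from light/heavy peak areas.

    light_areas, heavy_areas: peak areas of the same transitions, in the
    same order, for the native and the isotope-labeled peptide.
    heavy_spike_fmol: known amount of labeled standard added to the sample.
    Assumption: equal response of light and heavy forms (single-point
    calibration); this is an illustration, not a validated workflow.
    """
    if len(light_areas) != len(heavy_areas) or not light_areas:
        raise ValueError("need matching, non-empty transition lists")
    heavy_total = sum(heavy_areas)
    if heavy_total <= 0:
        raise ValueError("heavy standard signal must be positive")
    ratio = sum(light_areas) / heavy_total  # light-to-heavy area ratio
    return ratio * heavy_spike_fmol


# Hypothetical peak areas for two transitions and a 50 fmol spike.
print(endogenous_amount_fmol([100.0, 200.0], [200.0, 400.0], 50.0))
```

Summing areas over transitions before taking the ratio is one common simplification; per-transition ratios are an equally reasonable choice and make interference easier to spot.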

The use of stable isotope-labeled peptides as internal standards has significantly increased the detection confidence and measurement precision in SRM experiments. In SRM, only 3–5 fragment ions from the preselected precursor ions are typically monitored. When SRM is used to analyze target analytes in a highly complex system such as plasma, the assay may be prone to matrix-related interference. Co-eluting matrix components can produce the same SRM transitions as the analytes of interest, resulting in false-positive identification and inaccurate quantification. Matrix components can also cause ion suppression by competing for available protons in the spray droplets. When matrix components co-elute with analytes of interest, they cause the ion current response to vary between samples, severely affecting the precision, accuracy, and sensitivity of quantification. Stable isotope-labeled peptides have structures identical to those of their endogenous counterparts and, as a result, co-elute with them in LC fractionation. When ion suppression occurs, it affects the endogenous and stable isotope-labeled peptides to the same degree, so the ratio of an analyte to its internal standard is not affected. The LC retention time of a stable isotope-labeled peptide can also be used as a landmark to pinpoint the LC peak of the endogenous peptide. Furthermore, stable isotope-labeled peptides generate the same sets of fragment ions as their endogenous counterparts. The relative abundances of the fragment ions of a stable isotope-labeled peptide can serve as a reference to distinguish the true signal of the targeted native peptide from other co-eluting isobaric peptides. It is important to demonstrate that the LC retention time and the relative abundances of the fragment ions of the native peptide are nearly identical to those of the stable isotope-labeled internal standard. Manually inspecting the SRM data to ensure the accuracy of quantification usually requires a significant amount of time and effort [14–16]. Several bioinformatics tools, such as mProphet [17] and AudIT [18], have been developed to overcome these problems. mProphet uses criteria such as relative intensities from reference spectra, correlation with the reference spectra, retention time deviation, and co-elution to generate a single score and compute the error rate of the measurement. AudIT identifies contaminated transitions; it relies on reference peptides and technical replicates.
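The two checks described above (matching retention time and matching relative fragment ion abundances) can be sketched as a simple screen. The example below is a minimal illustration and is not the algorithm used by mProphet or AudIT: it assumes that an interference-free transition shows the same light-to-heavy ratio as the other transitions of that peptide, and it flags transitions whose ratio deviates from the median ratio by more than a chosen relative tolerance. All names, tolerances, and values are hypothetical.

```python
from statistics import median


def coelutes(rt_light_min, rt_heavy_min, tolerance_min=0.1):
    """True if native and labeled peptide apexes agree within the tolerance."""
    return abs(rt_light_min - rt_heavy_min) <= tolerance_min


def flag_inconsistent_transitions(light_areas, heavy_areas, rel_tolerance=0.2):
    """Return transitions whose light/heavy ratio departs from the median ratio.

    light_areas, heavy_areas: dicts mapping a transition label to a peak area.
    Assumption: interference inflates (or suppression deflates) individual
    transitions of the native peptide, so their ratios stand out.
    """
    ratios = {t: light_areas[t] / heavy_areas[t] for t in light_areas}
    center = median(ratios.values())
    return sorted(t for t, r in ratios.items()
                  if abs(r - center) > rel_tolerance * center)


# Hypothetical data: the y7 transition of the native peptide is contaminated.
light = {"y5": 100.0, "y6": 200.0, "y7": 700.0}
heavy = {"y5": 1000.0, "y6": 2000.0, "y7": 1000.0}
print(coelutes(25.30, 25.32))
print(flag_inconsistent_transitions(light, heavy))
```

Using the median as the reference keeps a single contaminated transition from shifting the baseline; with only 3–5 transitions per peptide, the screen cannot tell which transitions are clean when most of them are affected.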

SID-SRM is well-suited for highly reproducible quantification across many samples and, in fact, also across different mass spectrometers and laboratories. Recently, the Clinical Proteomic Tumor Analysis Consortium led a landmark multisite assessment study with a focus on the reproducibility of SID-SRM-MS assay between-run, between-laboratory, and between-mass spectrometer manufacturers [19]. In this study, the precision and reproducibility of SRM-based measurements of proteins spiked in a background of human plasma were assessed over nine different laboratories with mass spectrometers from two manufacturers. The results are very promising, with a 10–23 % inter-laboratory CV, a variance that includes variations in sample preparation and MS platforms.
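For readers less familiar with the metric, the coefficient of variation reported in such studies is the sample standard deviation expressed as a percentage of the mean. The sketch below computes it for a set of hypothetical measurements of the same sample from different laboratories; the function name and the numbers are illustrative only and are not data from the cited study.

```python
from statistics import mean, stdev


def percent_cv(values):
    """Coefficient of variation (%) = 100 * sample standard deviation / mean."""
    if len(values) < 2:
        raise ValueError("need at least two measurements")
    average = mean(values)
    if average == 0:
        raise ValueError("mean must be non-zero")
    return 100.0 * stdev(values) / average


# Hypothetical concentrations (fmol/µL) reported by three laboratories.
print(percent_cv([90.0, 100.0, 110.0]))
```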

Compared to ELISA, an SRM-MS assay can be developed with a short lead-time (1–3 months). A critical step in SRM-MS assay development is the selection of suitable transitions for a target peptide [14]. Preference is given to fragment ions that provide the highest signal intensity and the lowest level of interfering signals. We previously reported a pathway for SRM assay development and optimization, an approach that requires both empirical and bioinformatics tools [14]. Several interfaces (for example, MRMaid [20], MRMer [21], and MaRiMba [22]) use fragment spectra from shotgun experiments to help design favorable transitions for target peptides. For SRM assay design in analyses of complex samples, it is also important to infer retention times, and software has been developed to realign and predict elution times [3, 23]. Transitions selected for an SRM assay need to be confirmed by assessing the likelihood that the chosen transitions and their intensity distributions are associated with the target peptide. Several freely available software products (for example, TIQAM, ATAQS [24], and Skyline [25]) integrate many of the above-mentioned tasks and automate assay development for peptides (peptide and transition selection), data evaluation, and analysis of SRM traces.

A publicly available SRM assay database, SRMAtlas (www.srmatlas.org), features SRM assays for about 99 % of human proteins. This database was generated from high-quality measurements of natural and synthetic peptides conducted on a QQQ mass spectrometer and is intended as a resource for SRM-based proteomic workflows. Furthermore, to consider the detectability of the SRM assays, PASSEL [26] was created as a combined catalog of the best-available transitions selected from PeptideAtlas shotgun data and SRMAtlas, providing the validation information of all assays in the context of a specific sample. Huttenhain et al. [27] developed SRM assays for 1172 cancer-associated proteins. Using these SRM assays in the clinical samples, 182 proteins were detected in depleted plasma and 408 proteins were detected in urine. These databases of SRM assays are, therefore, valuable resources for designing and accelerating biomarker qualification/verification studies.

Several advances in instrument design have helped to improve the sensitivity and specificity of SRM assays. For example, in most SRM analyses, the first quadrupole (Q1) operates at unit resolution (an m/z window of 0.7 full width at half maximum (FWHM)). This large m/z window allows other co-eluting sample constituents with similar m/z to pass through Q1 and interfere with detection of the desired target. The frequency of these interferences increases as the complexity of the sample increases. Narrowing the Q1 mass window increases selectivity for precursor ions, but conventionally at the cost of a steep decline in signal as the window is narrowed to <0.5 FWHM. The Thermo Scientific TSQ Quantum line of triple quadrupole mass spectrometers offers a technique called highly selected reaction monitoring (H-SRM), in which the m/z window in Q1 can be narrowed to 0.1–0.2 FWHM to increase specificity without sacrificing sensitivity. The practical advantage of H-SRM is that it dramatically reduces isobaric chemical noise, thereby increasing the signal-to-noise ratio (S/N) [15, 16], which translates to improved lower limits of quantification (LLOQs) and higher confidence in the quantification results. Improvements in the design of the nano-electrospray source and interface, and the application of ion-funnel technology to triple quadrupole mass spectrometers, have been shown to increase ionization efficiency and ion transmission, thus improving the LLOQ of SRM-MS [28, 29]. Adding a further stage of ion filtering, as in MRM3, also increases the sensitivity and specificity of SRM. This technique uses a hybrid quadrupole/linear ion trap instrument and monitors reconstructed ion chromatograms of secondary product ions derived from a trapped primary product ion [30, 31]. MRM3 can improve the limit of quantification two- to fourfold and enables protein biomarker quantification in the low ng/mL range in non-depleted human serum without immunoaffinity enrichment. The drawback of this method is that it requires much longer acquisition times (350 ms) for each transition than regular SRM (6–20 ms), which reduces the number of data points that can be sampled over a given chromatographic peak and the number of peptides that can be monitored in one acquisition cycle.
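The trade-off between per-transition acquisition time and chromatographic sampling can be made concrete with a small calculation. The sketch below assumes an unscheduled method in which every transition is measured once per cycle, so the cycle time is the number of transitions multiplied by the per-transition time, and the number of points across a peak is the peak width divided by the cycle time. The 30 s peak width, the 30 transitions, and the function name are hypothetical; the 20 ms and 350 ms times are the values quoted above.

```python
def points_per_peak(n_transitions, time_per_transition_ms, peak_width_s,
                    overhead_ms=0.0):
    """Approximate number of data points sampled across one LC peak.

    Assumption: all transitions are monitored in every cycle (no
    retention-time scheduling); overhead_ms is any inter-scan delay.
    """
    if n_transitions <= 0 or time_per_transition_ms <= 0:
        raise ValueError("transition count and time must be positive")
    cycle_s = n_transitions * (time_per_transition_ms + overhead_ms) / 1000.0
    return peak_width_s / cycle_s


# Hypothetical method: 30 transitions, 30 s wide chromatographic peak.
print(points_per_peak(30, 20, 30.0))   # regular SRM at 20 ms per transition
print(points_per_peak(30, 350, 30.0))  # MRM3 at 350 ms per transition
```

Under these assumptions the longer acquisition time leaves only a few points per peak, which is why such methods monitor far fewer peptides per cycle or rely on scheduling.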

  3. Parallel reaction monitoring (PRM)

SRM is primarily performed on a triple quadrupole MS. With the newly introduced high resolution and accurate mass (HR/AM) instruments (e.g., Q Exactive quadrupole-Orbitrap or quadrupole-TOF mass spectrometers), a new targeted proteomics approach referred to as PRM has been developed [32, 33]. PRM has been used to measure amyloid-β, a biomarker for Alzheimer disease, in cerebrospinal fluid. The assay shows performance similar to that of SRM, with a recovery of 100 % (15 %) and intra-assay and inter-assay imprecision of 5 and 6.4 %, respectively [34]. The operation of PRM is similar to that of SRM. The precursor ions of the target peptides are isolated in the quadrupole mass filter and transferred to the higher energy C-trap dissociation (HCD) cell for fragmentation. The fragment ions are measured by the HR/AM Orbitrap mass analyzer instead of the third quadrupole used in SRM. The use of an Orbitrap mass analyzer presents specific advantages. First, instead of monitoring only 3–5 transitions with the Q3 mass analyzer as in SRM, PRM acquires a full MS/MS spectrum that contains all of the potential fragment ions of a targeted peptide, which can significantly improve confidence in identifying the LC peaks of target peptides. Second, the Orbitrap provides additional data on assay selectivity. In complex samples, interfering matrix ions co-isolated with the precursors of target peptides can sometimes generate fragment ions with m/z values similar to those of the monitored transitions. These two signals sometimes cannot be separated by a quadrupole mass analyzer with an isolation width of 0.7–1.0 m/z, which may cause false-positive identification and inaccurate quantification. The Orbitrap mass analyzer can separate fragment ions whose m/z values differ by more than 10 ppm; this mass accuracy and resolution are much greater than those of the quadrupole. This feature enables PRM to separate fragment ions of interest from interfering ions more effectively and improves the selectivity of quantification. The enhanced selectivity and specificity of the PRM method can result in better sensitivity of quantification [32, 35]. Performance comparisons between PRM and SRM show that the linearity and dynamic range of PRM can also rival those of the traditional SRM approach. However, SRM has superior quantitative precision [33]. The imprecision of PRM arises largely because PRM relies on the Orbitrap mass analyzer, which is fundamentally less sensitive and has a slower data acquisition rate than a quadrupole mass analyzer. Quadrupole mass analyzers operate at a duty cycle nearing 100 % and can sample more points over a given chromatographic peak, thus providing more accurate quantification of the LC peak and, in turn, greater precision and run-to-run repeatability. The Orbitrap requires a much longer scan time and a 40–120 ms injection time, which significantly decreases the duty cycle of acquisition. This reduces the number of data points sampled over a given chromatographic peak, resulting in lower precision and repeatability of quantification, and limits the number of peptides that can be monitored in one PRM acquisition cycle. To increase multiplexing capability, PRM requires time-scheduled acquisition, which relies on the availability of a high-quality local spectral library with well-calibrated peptide chromatographic elution times. Unlike SRM, PRM does not require significant effort for assay development, but it requires a high-quality local spectral library to confirm the identity of the analytes and assess measurement quality, especially when stable isotope-labeled standards are not used.
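To give a sense of scale for the selectivity argument, the sketch below expresses the spacing between two fragment ions in parts per million and compares it with the two criteria quoted above: a 10 ppm separation for the Orbitrap and a 0.7 m/z window for a unit-resolution quadrupole. The function names and the example m/z values are hypothetical, and treating the two criteria as hard thresholds is a simplification of real instrument behavior.

```python
def ppm_difference(mz, reference_mz):
    """Absolute m/z difference expressed in parts per million of the reference."""
    return abs(mz - reference_mz) / reference_mz * 1e6


def separable_by_ppm(mz_a, mz_b, min_ppm=10.0):
    """Simplified criterion: ions further apart than min_ppm are distinguished."""
    return ppm_difference(mz_a, mz_b) > min_ppm


def within_quadrupole_window(mz_a, mz_b, window_mz=0.7):
    """Simplified criterion: ions closer than the window width pass together."""
    return abs(mz_a - mz_b) < window_mz


# Hypothetical target fragment and interfering fragment, 0.010 m/z apart.
target, interferer = 500.000, 500.010
print(ppm_difference(interferer, target))
print(separable_by_ppm(interferer, target))
print(within_quadrupole_window(interferer, target))
```

In this hypothetical case the two ions are about 20 ppm apart, so they meet the 10 ppm criterion while still falling inside the same 0.7 m/z quadrupole window.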

  4. Accurate inclusion mass screening (AIMS)

AIMS is another emerging targeted mass spectrometry-based proteomic technique [36]. In AIMS acquisition, a list of pre-selected precursor ions is used to generate an “inclusion list” for MS acquisition [37, 38]. Only precursors on the “inclusion list” are selected for fragmentation, and only if they are detectable in a survey scan. Compared to the untargeted data-dependent LC-MS/MS acquisition (DDA-MS/MS) approach used in the discovery study, AIMS significantly improves reproducibility, sensitivity, and dynamic range by restricting detection and fragmentation to only those peptides derived from proteins of interest. It is at least fourfold more efficient at detecting peptides of interest than DDA-MS/MS [36]. The analytical performance of AIMS is less satisfactory than that of SRM in terms of accuracy, sensitivity, specificity, and dynamic range. However, because AIMS allows time-scheduled monitoring of over 1000 peptides in a single LC-MS run, it can be used as a targeted approach for data-dependent triage and prioritization of hundreds of candidate biomarkers in a time- and cost-effective manner [39]. In a newly developed targeted MS-based pipeline for biomarker verification, AIMS was implemented between discovery and the SRM-based verification study to confirm the detectability of the candidates in plasma [39]. Only the candidates detected in plasma by AIMS are advanced to SRM-based assay development for more sophisticated quantitative comparison of candidate levels in cases vs controls. This strategy allows one to test a much larger number of candidates than would be possible with traditional SID-SRM-MS based verification alone.

  5. Data-independent MS/MS acquisition (DIA-MS/MS)

DIA-MS/MS is a new MS/MS acquisition technology [40, 41]. Its implementations go by several names, including Precursor Acquisition Independent From Ion Count (PAcIFIC) [42], All-ion Fragmentation (AIF) [43], and Sequential window acquisition of all theoretical mass spectra (SWATH) [44, 45]. DIA-MS/MS is an approach in which tandem mass spectra are acquired at every m/z value regardless of whether a precursor ion is observed. In DIA-MS/MS, the direct relationship between fragments and the precursor from which they originate is lost, and assigning fragments to precursors can depend on targeted data extraction and the availability of extensive spectral libraries such as PeptideAtlas [46, 47]. DIA-MS/MS demonstrates better sensitivity, reproducibility, and dynamic range than DDA-MS/MS and allows consistent quantification of proteins spanning a wide range of concentrations, e.g., 125 to 10^6 copies/cell [44], a range well within the needs for quantifying host cellular response profiles. Data-independent acquisition itself is not a targeted approach, but in combination with targeted data analysis, it can be used as an alternative to SRM assays in clinical research. In this approach, a quantitative, digitized proteomic recording (a SWATH map) is generated for each clinical sample as a personalized digital representation of each patient [48]. The profile of proteins of interest can subsequently be extracted in a targeted fashion using assay information derived from mass spectrometric reference maps. In a recent study, N-linked glycoproteins in human plasma were enriched by solid phase extraction and then analyzed by both SWATH maps and SRM [45]. SWATH maps coupled with targeted data extraction showed lower sensitivity than SRM but achieved higher analyte throughput, with comparable dynamic range, reproducibility, and accuracy when stable isotope-labeled peptides of the analytes were used as internal standards. This finding indicates that SWATH maps can be used as a targeted, reproducible quantitative approach for biomarker qualification/verification in less complicated samples [45]. Furthermore, SWATH maps are permanent digital maps and can easily be re-examined for qualification/verification of new sets of biomarker candidates without reanalyzing the physical samples [48]. Although SWATH maps require little assay development, they are useful only when high-quality MS/MS spectral reference maps with well-calibrated elution times are available and can be replicated on the instrument used for SWATH MS analysis. SWATH generates highly complex and overlapping MS/MS spectra, and significant bioinformatic effort is required to analyze SWATH data. Specialized bioinformatic tools, such as openSWATH [49] and Spectronaut (www.biognosys.ch), have been developed to facilitate targeted data extraction and quantification from SWATH maps.

We therefore summarize the benefits and tradeoffs inherent to each platform for biomarker verification with respect to the main factors characterizing measurements: accuracy, sensitivity, specificity, reproducibility, precision, dynamic range, sample throughput, analyte throughput, ease of assay development, and ease of data analysis (Fig. 23.1). Each method entails a compromise that maximizes performance in some respects while reducing it in others. For example, PRM has higher specificity than SRM but lower reproducibility and precision of quantification; SWATH can significantly improve analyte throughput but at the cost of specificity and accuracy. Because SRM has the best overall analytical performance, it is considered the gold standard approach for biomarker qualification and verification.

Fig. 23.1 Performance profiles comparing technical advantages and disadvantages of target MS platforms used in biomarker verification study

Because the odds of discovering a clinically useful biomarker or biomarker panel are extremely low, a large number of biomarker candidates must be tested in the qualification phase. Developing SID-SRM assays for every candidate identified by the discovery study would be very costly and time-consuming, so a small number of candidates must be selected from the many hundreds available. The qualification phase can therefore be further divided into two steps: triage and quantification. In the triage step, the biomarker candidates are measured by targeted but less costly assays [39]. Among the platforms available for biomarker qualification/verification, PRM, AIMS, and SWATH have the capability to test and triage a large number of candidates at lower expense and with less lead-time for assay development. Such assays can be developed easily if local, high-quality MS/MS spectral reference maps with well-calibrated elution times are available and can be replicated on the instrument used for analyzing the clinical samples. Only the candidates that pass the triage step are advanced to the more expensive SID-SRM quantification. This staged qualification/verification strategy enables one to test as many candidates as possible at reasonable cost and in reasonable time, improving the chance of discovering clinically useful biomarker panels.

3 Pre-fractionation and Enrichment Technologies

Ideally, SRM assays can be applied to verify biomarker candidates directly from plasma or serum without upfront sample fractionation. It is efficient, reproducible, high throughput, and less prone to errors and analytical variations. In recent studies, high and medium abundance human plasma proteins have been quantified by using multiplexed SRM approach without further sample preparation. Kuzyk et al. reported the simultaneous quantification of 45 major plasma proteins with a CV below 20 % for 94 % of the measured peptides [50]. Anderson et al. reported that 47 major plasma proteins were quantified with in-run CVs of 2–22 % [51]. The least abundant protein quantified, L-selectin, had a measured concentration of 0.67 μg/mL, a concentration 4–5 orders of magnitude lower than the concentration of albumin in plasma. Addnota et al. tested the LLOQ of SRM assays of target proteins in human plasma [18]. Eight of ten tested peptides had median LLOQ values between 0.66 and 2.0 fmol/μL when peptides were added into 1:60 diluted plasma (equivalent to a range of 0.70–3.34 μg/ml protein in plasma). These studies demonstrate SRM assay can reliably quantify the classic plasma protein biomarkers with concentration higher than 1 μg/mL directly in plasma. But this LLOQ of SRM assays is not sufficient for unambiguous detection and quantification of other types of protein biomarkers with lower concentration, such as tissue leakage products, interleukins, and cytokines, directly from plasma (Fig. 23.2). The lack of sensitivity by applying SRM assays directly to plasma is mainly caused by matrix-related interference and ion suppression. Plasma is an extremely complex mixture of proteins over a concentration range of 11 orders of magnitude in the presence of other endogenous salt, lipid, and metabolites. These matrix components have deleterious effect on the sensitivity of SRM assays. 
Competition for ionization between the analytes of interest and other endogenous species (such as salts, lipids, and metabolites) or exogenous species (such as polymers extracted from plastic tubes) causes ion suppression. When these interfering species elute at the same time as the analyte of interest, the analyte signal is suppressed [52]. Some matrix components can also produce the same product ions as those monitored for the analytes of interest, giving rise to chemical and biological noise that reduces the S/N ratio available for detection and quantification. To overcome these sensitivity barriers, a variety of sample preparation strategies have been developed for target protein quantification, aimed at reducing sample complexity while maintaining the requirements for high accuracy, reproducibility, and throughput.
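The relationship between a peptide LLOQ measured in diluted plasma and the equivalent protein concentration in neat plasma is simple unit arithmetic, sketched below. The function name and the 25 kDa molecular weight in the usage note are illustrative assumptions, not values from the cited studies; the sketch also assumes one mole of signature peptide per mole of protein and complete digestion.

```python
def peptide_lloq_to_protein_ug_per_ml(lloq_fmol_per_ul, dilution_factor, protein_mw_da):
    """Convert a peptide LLOQ in diluted plasma to a protein concentration in neat plasma.

    Assumes (hypothetically) 1 mol of signature peptide per mol of protein
    and complete tryptic digestion.
    """
    # fmol/uL is numerically identical to nmol/L
    nmol_per_l_neat = lloq_fmol_per_ul * dilution_factor
    # nmol/L * g/mol = ng/L
    ng_per_l = nmol_per_l_neat * protein_mw_da
    # 1 ug/mL = 1e6 ng/L
    return ng_per_l / 1e6


# Hypothetical example: 0.66 fmol/uL in 1:60 diluted plasma, 25 kDa protein
example = peptide_lloq_to_protein_ug_per_ml(0.66, 60, 25_000)
```

For the hypothetical 25 kDa protein, an LLOQ of 0.66 fmol/μL in 1:60 diluted plasma corresponds to roughly 1 μg/mL in neat plasma, which is the same order of magnitude as the range quoted above.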

Fig. 23.2

Comparison of the LLOQ of different strategies for the quantification of protein biomarkers in plasma. A schematic diagram of the source and target concentration ranges of candidate plasma biomarkers. At right are the LLOQs of currently reported verification assays (Taken from Zhao, Current Proteomics, permission required)

3.1 Depletion of High-Abundance Proteins

Depletion of the highest abundance plasma proteins using affinity columns is the simplest way to reduce sample complexity. Keshishian et al. reported that depletion of the 12 highest abundance plasma proteins improved the SRM assay LLOQ to 25 ng/mL [2]. Combining depletion with strong cation exchange chromatography (SCX) further improved the LLOQ of the SRM assay to 1–10 ng/mL with a CV below 15 % [53]. But this approach is impractical for biomarker qualification/verification because extensive prefractionation of samples into numerous subfractions substantially reduces the throughput of the entire assay.

3.2 Enrichment of Target Proteins or Peptides Using Affinity Chromatography

Specifically isolating the target proteins or peptides from human plasma by affinity purification is the most efficient way to reduce sample complexity. This approach is based on the highly specific interaction between the targeted proteins and affinity ligands such as antibodies, aptamers, or lectins. Pre-fractionation is especially useful for quantification of low-abundance proteins in plasma. In our recent qualification and verification study of a dengue fever biomarker panel, we found that the circulating level of one of the biomarker candidates, Complement Factor D (CFD), was below the LLOQ of the SID-SRM-MS assay and could not be detected in unfractionated plasma. To address this issue, we developed an assay in which CFD was first immunoprecipitated (IPed) from plasma with an anti-CFD antibody and then quantified with SID-SRM-MS [54]. The CFD protein in each sample was IPed with a biotin-conjugated anti-CFD antibody, and the complex of CFD and its antibody was captured on streptavidin magnetic beads. A stable isotope-labeled CFD signature peptide was spiked into each sample, the proteins were digested with trypsin, and CFD abundance was quantified with SID-SRM-MS. With this approach, we significantly improved the sensitivity of the assay.
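The quantification principle behind SID-SRM-MS can be sketched as follows: the endogenous ("light") peptide amount is estimated from the light-to-heavy peak-area ratio multiplied by the known amount of spiked stable isotope-labeled ("heavy") peptide. The function names are hypothetical, and the sketch assumes single-point calibration with identical response for light and heavy peptides, taking the median ratio across transitions to limit the influence of an interfered transition; it is not the specific data processing used in the study described above.

```python
from statistics import median


def light_to_heavy_ratio(light_areas, heavy_areas):
    """Median light/heavy peak-area ratio across the monitored transitions."""
    ratios = [l / h for l, h in zip(light_areas, heavy_areas) if h > 0]
    if not ratios:
        raise ValueError("no transition has a usable heavy-peptide signal")
    return median(ratios)


def endogenous_amount_fmol(light_areas, heavy_areas, heavy_spiked_fmol):
    """Estimate the endogenous peptide amount from the spiked heavy standard.

    Assumes (simplification) equal MS response for light and heavy forms
    and a linear response at the spiked level.
    """
    return light_to_heavy_ratio(light_areas, heavy_areas) * heavy_spiked_fmol


# Hypothetical peak areas for three transitions, 50 fmol heavy peptide spiked
amount = endogenous_amount_fmol([100, 220, 300], [200, 400, 600], 50)
```

In this made-up example the per-transition ratios are 0.5, 0.55, and 0.5, so the median ratio of 0.5 gives an estimate of 25 fmol of endogenous peptide.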

IP-SRM can be multiplexed by using a mixture of magnetic beads carrying different antibodies, which increases the throughput of the assay. Nicol et al. used this approach to quantify multiple proteins from human sera simultaneously [55]. These assays extend the LLOQ of the SRM assay to the low ng/ml range with good accuracy.

A newly emerging immuno-affinity-SRM approach termed stable isotope-labeled standards with capture by anti-peptide antibodies (SISCAPA) was developed by Anderson et al. [56]. It uses immobilized anti-peptide antibodies to enrich the target peptides together with the previously spiked synthetic stable isotope-labeled peptides. Using this method, more than 1000-fold enrichment of target peptides in a plasma digest can be achieved. In several studies, individual SISCAPA-SRM assays have been successfully configured for quantifying biomarkers in the ng/μL range in plasma with CV < 20 % [56–58]. Protein concentrations determined by this method correlate highly with results obtained using a commercial immunoassay [57, 59], demonstrating that the method can quantify low-abundance proteins with high accuracy. SISCAPA-SRM-MS can potentially be multiplexed to measure more peptides in one assay by using a mixture of magnetic beads carrying different anti-peptide antibodies. Whiteaker et al. demonstrated that up to nine peptides can be enriched simultaneously, with an LLOQ in the low ng/ml range (from 10 μl of plasma) and a median coefficient of variation of 12.6 % [58]. They also demonstrated that the LLOQ can be extended to the low pg/ml range of protein concentration when larger volumes of plasma (1 ml) are used. This method holds great promise for verifying biomarker candidates. An interlaboratory evaluation of SISCAPA indicated that limits of detection were at or below 1 ng/ml for the assayed proteins in 30 μl of plasma. Assay reproducibility was acceptable for verification studies, with median intra- and inter-laboratory CVs above the limit of quantification of 11 % and <14 %, respectively, for the entire immuno-MRM-MS assay process, including enzymatic digestion of plasma [60]. 
SISCAPA has several advantages over immunoaffinity capture of target proteins: (1) it avoids potential interference from endogenous antibodies in the sample, because they are digested to peptides by trypsin, and (2) anti-peptide antibodies are easier to generate than anti-protein antibodies. The limitation of this type of enrichment strategy is that a specific antibody must be generated for each tryptic peptide used for a target protein. An alternative approach is the use of aptamers, oligonucleotide sequences with molecular recognition properties selected from combinatorial oligonucleotide libraries [61]. Aptamers bind protein ligands with high affinity and specificity [62]. They can be easily generated because they are chemically synthesized, which enables standardization of assays across multiple lots, a feature not possible with polyclonal antibodies, for example.

3.3 Sample Fractionations for Protein Adduct or Fragments

Potential biomarkers may be proteins with posttranslational modifications or peptide fragments derived from endogenous proteins. To quantify these candidates unambiguously, they first have to be separated from their canonical forms. In our recent biomarker discovery study of dengue fever, we found that a high molecular weight (>250 kDa) form of albumin is associated with dengue fever virus infection [63]. The nature of this protein is incompletely characterized, but it is probably a covalently linked polymer [63]. To verify the high molecular weight albumin isoform, we developed a capillary electrophoresis (CE) based fractionation approach in our NIAID-funded Clinical Proteomics Center. For CE fractionation, plasma samples were separated after spike-in with Beckman protein size standards. The 250 kDa fraction was collected into a receiving vial. The SDS in each collected CE fraction was removed with an SDS sample cleaning kit (Bio-Rad). The protein pellets were redissolved in 8 M urea, and the proteins were digested with trypsin and quantified with the SID-SRM-MS assay. Similarly, for peptide fragments derived from endogenous proteins, size-based separation approaches such as size-exclusion chromatography (SEC) can be used. For example, in our recent biomarker discovery study of Aspergillosis (Discovery of Candidate Biomarkers, Chap. 20), we identified 26 small peptides in plasma. These peptides are fragments of endogenous proteins such as albumin, apolipoprotein A-I, and haptoglobin. To quantify these peptides, we first used size-exclusion chromatography to separate the denatured plasma into protein and peptide pools (MW <17 kDa). The concentration of these 26 peptide fragments in the peptide pool was then quantified with SID-SRM-MS.

The qualification and verification strategies that were used for Dengue fever virus-3, infectious Aspergillosis, and Chagasic Cardiomyopathy are summarized in Tables 23.1, 23.2, 23.3, and 23.4.

Table 23.1 Qualification and verification strategies for candidate plasma proteins for Dengue fever virus-3
Table 23.2 Qualification and verification strategies for candidate plasma proteins for infectious Aspergillosis
Table 23.3 Qualification and verification strategies for candidate plasma peptides for infectious Aspergillosis
Table 23.4 Qualification and verification strategies for candidate protein markers for Chagasic Cardiomyopathy

4 Feature Reduction/Candidate Selection

The qualification/verification phase seeks to reduce the number of candidate biomarkers to those most informative for general application in a clinical setting. Another goal of qualification/verification is to test the statistical model that combines several of the informative features. Feature reduction aims to decrease the number of input variables to the model. A lower number of input variables enhances the quality of the data, increases the predictive power of the biomarker panel, and makes the results more understandable and more robust for application to broader populations. This is a statistical approach that uses quantitative information derived from any of the qualification/verification assays described above. Approaches for feature reduction include pairwise statistical comparison; significance analysis of microarrays (SAM), a technique that estimates the false discovery rate (FDR) in high-dimensional datasets; regression modeling; and machine learning techniques such as classification and regression trees (CART), multivariate adaptive regression splines (MARS), or ensemble methods. The application of these approaches is described more fully in Chap. 20.
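To illustrate feature reduction by pairwise comparison with FDR control, the sketch below applies the Benjamini-Hochberg procedure to a set of per-candidate p-values and keeps the candidates whose adjusted p-value falls under a chosen FDR. This is a generic FDR method swapped in for illustration; it is not the SAM algorithm, and the function names, candidate labels, and p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the
    # adjusted values stay monotonic.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_candidates(p_by_candidate, fdr=0.05):
    """Keep candidates whose BH-adjusted p-value is at or below the FDR."""
    names = list(p_by_candidate)
    adjusted = benjamini_hochberg([p_by_candidate[n] for n in names])
    return [n for n, q in zip(names, adjusted) if q <= fdr]


# Hypothetical p-values from pairwise case-vs-control comparisons
kept = select_candidates(
    {"candidate_A": 0.01, "candidate_B": 0.04, "candidate_C": 0.03, "candidate_D": 0.2}
)
```

With these made-up p-values only candidate_A survives at a 5 % FDR, even though three candidates have unadjusted p-values below 0.05, which shows why multiple-testing correction matters when many candidates are compared.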

5 Consideration in Designing Quantification/Verification Study

  1. Selection of sample cohorts for verification study

As described in the Introduction to Proteomic-derived Biomarkers (Chap. 20), the samples in the qualification phase are the same samples used in the discovery phase. The verification phase involves measuring the candidates independently in a larger number of samples, collected from patients with similar diagnoses and from control patients, that are distinct from those assayed in the discovery phase of the biomarker pipeline. For the qualification/verification phase to be meaningful, reproducible, observer-independent criteria for case definition need to be applied. Samples should represent a meaningful sampling of the patient cohort. Specifically, the biospecimens should be derived from components of the cohort that meet the same objective criteria for cases and controls as those used for the discovery analysis.

  2. Statistical design for verification study

The statistical design for the verification phase should be developed based on considerations of the effect size, the outcomes (classes) in the experimental cohort, and the experimental goal: for example, is the focus to test the performance of a biomarker in differentiating cases from controls, or to evaluate the statistical model? The reader should refer to Statistical Approaches (Chap. 22) for more details.

  3. Selection of assays – Fit-for-purpose concept

We propose adopting a staged, fit-for-purpose strategy for designing a biomarker qualification/verification study [64, 65]. Depending on the number of biomarker candidates, their concentration in clinical samples, the feasibility of de novo assay development, the analytical performance of the assays, and the cost of developing and applying assays to measure a large number of targeted analytes across many samples, a qualification/verification study can consist of three steps: triage, quantification, and verification (Fig. 23.3). Triage and quantification are performed in the qualification phase with the same samples used in the discovery phase. One important lesson learned from the past 10 years of biomarker discovery studies is that the odds of identifying a clinically useful biomarker panel are extraordinarily low. To increase the chance of identifying a successful biomarker panel, researchers usually assemble a candidate pool for the qualification study from several sources, including local proteomic and transcriptional profiling experiments as well as data from the published literature. The candidate pool can become very large, and these candidates may not be directly associated with the disease of interest. When hundreds of candidates have to be tested in the qualification study, the study should start with a triage process that tests these candidates while containing cost. The goal of triage is to reduce the initial list of candidates to a small subset that will be quantified with SID-SRM in the quantification stage. The technology used in this step should have the capacity to triage a large number of candidates at lower expense and with shorter lead time for assay development. The assay should have enough specificity and precision to semi-quantitatively measure relative changes in the levels of a large number of analytes across a large number of samples. 
Validation of the triage assays will be minimal, covering specificity, precision, and run-to-run variation. Accuracy of quantification is not required. Although stable isotope-labeled standards for each analyte are not required for the triage process, it is recommended that a constant set of stable isotope-labeled peptides corresponding to certain housekeeping proteins be spiked into the samples in the same amount. These standards can serve as benchmarks for normalizing run-to-run variation and as landmarks for calibrating LC retention time.
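One simple way to use a constant set of spiked standards for run-to-run normalization is sketched below. For each run, a scaling factor is computed as the median ratio between each standard's reference area (its median across runs) and its area in that run; analyte areas are then multiplied by that factor. The function names, data layout, and the choice of median-ratio scaling are assumptions made for illustration, not a prescribed procedure.

```python
from statistics import median


def normalization_factors(standard_areas_by_run):
    """Compute one scaling factor per run from the spiked standard peptides.

    standard_areas_by_run: {run_id: {standard_peptide: peak_area}}
    All peak areas are assumed to be positive.
    """
    runs = standard_areas_by_run
    # Only use standards detected in every run
    shared = sorted(set.intersection(*(set(areas) for areas in runs.values())))
    # Reference level of each standard = median area across runs
    reference = {p: median(areas[p] for areas in runs.values()) for p in shared}
    return {
        run: median(reference[p] / areas[p] for p in shared)
        for run, areas in runs.items()
    }


def normalize(analyte_areas_by_run, factors):
    """Scale every analyte area in a run by that run's factor."""
    return {
        run: {analyte: area * factors[run] for analyte, area in areas.items()}
        for run, areas in analyte_areas_by_run.items()
    }


# Hypothetical data: run2 has twice the overall signal of run1
standards = {"run1": {"s1": 100, "s2": 200}, "run2": {"s1": 200, "s2": 400}}
factors = normalization_factors(standards)
normalized = normalize({"run1": {"x": 10}, "run2": {"x": 20}}, factors)
```

In this toy case the apparent two-fold difference in analyte "x" between runs disappears after normalization, because it is fully explained by the difference in the spiked standards.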

Fig. 23.3

Multistage, targeted proteomic workflow for biomarker qualification and verification

Targeted MS assays such as PRM, AIMS, and SWATH with targeted data extraction are well suited for this purpose. They can monitor the entire set of fragment ions for each analyte with high resolution and high mass accuracy. In the absence of stable isotope-labeled peptides as internal standards for each target analyte, these approaches rely heavily on a reference database of standard spectra for each analyte to construct time-scheduled data acquisition and to confirm the identification of the analytes. The acquired MS/MS spectra are compared with authentic standard spectra to examine the agreement in the relative abundance of fragment ions and in LC retention time. The identification confidence is determined by the number of fragment ions observed and the correlation of the observed LC retention time of the analyte with its predicted retention time. It should be noted that SRM without a stable isotope-labeled peptide for each analyte is not a reliable tool for the triage process, because SRM usually monitors only 3–5 transitions with moderate mass accuracy and unit resolution. It cannot provide sufficient confidence in detecting candidate biomarkers in the absence of stable isotope-labeled peptide standards. If SRM is the only platform available for the study, low-cost, unpurified stable isotope-labeled peptides for each targeted analyte should be used to provide the confidence needed for LC peak identification.
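The comparison of an acquired spectrum with a reference spectrum can be sketched with a normalized dot product (cosine similarity) of fragment ion intensities, combined with a count of matched fragment ions and a retention-time tolerance. The function names and the thresholds (0.9 similarity, 5 ions, 1 min) are illustrative assumptions; real software packages use their own scoring schemes.

```python
import math


def spectral_similarity(observed, reference):
    """Normalized dot product of two {fragment_ion: intensity} spectra."""
    ions = sorted(set(observed) | set(reference))
    o = [observed.get(ion, 0.0) for ion in ions]
    r = [reference.get(ion, 0.0) for ion in ions]
    norm = math.sqrt(sum(x * x for x in o)) * math.sqrt(sum(x * x for x in r))
    return sum(a * b for a, b in zip(o, r)) / norm if norm else 0.0


def confirm_identification(observed, reference, observed_rt, predicted_rt,
                           min_similarity=0.9, min_ions=5, rt_tolerance_min=1.0):
    """Accept a detection only if fragments, intensities, and RT all agree.

    All thresholds are hypothetical defaults chosen for illustration.
    """
    matched = sum(1 for ion in reference if observed.get(ion, 0.0) > 0)
    return (
        matched >= min_ions
        and spectral_similarity(observed, reference) >= min_similarity
        and abs(observed_rt - predicted_rt) <= rt_tolerance_min
    )


# Hypothetical reference spectrum and an observed spectrum at twice the intensity
reference = {"y3": 10, "y4": 20, "y5": 30, "y6": 40, "y7": 50}
observed = {ion: 2 * value for ion, value in reference.items()}
accepted = confirm_identification(observed, reference, 20.3, 20.0)
```

Because the similarity is computed on relative intensities, the observed spectrum matches the reference despite its higher absolute signal; the same spectrum eluting several minutes away from the predicted retention time would be rejected.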

Measurements in the triage step are semi-quantitative, allowing only rough estimates of relative abundance changes of targeted proteins. The small set of candidates that emerges from the triage step requires additional quantification with SID-SRM-MS to confirm the observed changes. In addition to prioritizing the candidates for more accurate quantification, the triage step also determines which protein candidates can be quantified directly from clinical samples and which need additional sample fractionation or enrichment to improve the limits of detection and quantification.

In the quantification step, the list of candidates can first be divided into several groups based on their concentration in clinical samples: extremely low-abundance proteins such as cytokines and interleukins, medium- to low-abundance proteins such as tissue leakage products, and classic plasma proteins. For cytokine and interleukin candidates, ELISA is the first-choice assay because well-validated ELISA assays are commercially available for most of them, and the analytical performance of ELISA is acceptable for these studies. Developing SID-SRM assays for low-abundance proteins such as cytokines and interleukins is very challenging and requires a significant amount of time and effort to find suitable antibodies for the candidates. Even with antibody enrichment, the sensitivity of SRM will not reach the pg/ml LLOQ required to quantify cytokines and interleukins. As a result, a much larger biospecimen volume is required for their quantification by SRM. For tissue leakage products and classic plasma proteins, SID-SRM is the primary choice for quantification. SRM can be applied to verify classic plasma proteins directly from clinical samples without upfront sample fractionation. For tissue leakage proteins, sample fractionation or enrichment is usually required to quantify the candidates with acceptable sensitivity and specificity (Fig. 23.2). If antibodies are not readily available, IP-SRM and SISCAPA-SRM are not recommended for less credentialed candidates because of the tremendous effort required to develop suitable antibodies.
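The assay-selection logic in the preceding paragraph can be written down as a small decision function. The class labels, the function name, and the returned strings are hypothetical; the sketch only encodes the recommendations stated above and is not a substitute for case-by-case judgment.

```python
def choose_quantification_assay(abundance_class, antibody_available=False):
    """Suggest an assay for a candidate based on its abundance class.

    abundance_class: one of "cytokine_or_interleukin", "tissue_leakage",
    or "classic_plasma_protein" (labels invented for this sketch).
    """
    if abundance_class == "cytokine_or_interleukin":
        # Well-validated commercial ELISAs exist for most of these proteins
        return "ELISA"
    if abundance_class == "classic_plasma_protein":
        # Abundant enough to be measured without upfront fractionation
        return "SID-SRM directly from clinical sample"
    if abundance_class == "tissue_leakage":
        if antibody_available:
            return "IP-SRM or SISCAPA-SRM"
        # Antibody development is not recommended for less credentialed candidates
        return "SID-SRM after sample fractionation"
    raise ValueError(f"unknown abundance class: {abundance_class!r}")


# Hypothetical candidate list with abundance class and antibody availability
plan = {
    name: choose_quantification_assay(cls, has_antibody)
    for name, cls, has_antibody in [
        ("candidate_1", "cytokine_or_interleukin", False),
        ("candidate_2", "tissue_leakage", True),
        ("candidate_3", "classic_plasma_protein", False),
    ]
}
```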

The use of stable isotope-labeled internal standards in SRM assays is required to provide the highest level of detection confidence and measurement precision. Stable isotope-labeled tryptic peptide standards are the most commonly used internal standards. They can provide sufficient precision and reproducibility to confirm the differential expression of candidates in the disease and to eliminate the false-positive candidates identified in the discovery phase. But in this approach the accuracy of quantification is only moderate, because stable isotope-labeled peptide standards do not account for differences in trypsin digestion efficiency. Assays using stable isotope-labeled peptide standards therefore need to be validated for moderate precision, reproducibility, and specificity. The outcome of the quantification process is a list of candidates that correlate highly with the disease of interest. These candidates then advance to more rigorous verification.

The goals of the verification process are threefold: first, to confirm that the small subset of candidates that survived the triage step truly reflects disease presence, severity, or outcome; second, to establish the specificity and sensitivity of the biomarker panel for its intended use; and third, to implement a suitable sample fractionation/enrichment approach for the targets, if applicable. Trypsin digestion and its requisite sample handling usually contribute the most to assay variability. Using a stable isotope-labeled protein as the internal standard instead of stable isotope-labeled peptides, to account for losses in the digestion process, has been shown to nearly double assay accuracy [60]. Therefore, to increase the accuracy of quantification in the verification phase, labeled full-length proteins, winged peptides with 2–6 amino acids of native flanking sequence at the N- and C-termini of the tryptic peptide analyte, or concatemers of standard peptides should be added at the start of trypsin digestion to serve as more robust internal standards. The purity and quantity of the internal standards must be established. For winged peptides, quantification is usually done by HPLC and amino acid analysis. If the concentrations of targeted proteins are below the LLOQ of SID-SRM-MS assays and cannot be quantified directly from clinical samples, suitable strategies to enrich the targeted proteins should be established. IP-SRM or SISCAPA is the first choice for this purpose because both have proven to be very efficient ways to enrich the targeted proteins with high precision and repeatability compared with other approaches.
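As a small illustration of how a winged peptide standard is defined, the sketch below extends a tryptic peptide with its native flanking residues taken from the parent protein sequence. The function name and the example sequence are hypothetical; the 2–6 residue wing length follows the range given above.

```python
def winged_peptide(protein_sequence, tryptic_peptide, wing_length=4):
    """Return the tryptic peptide extended by native flanking residues.

    Wings are truncated automatically if the peptide lies near a terminus
    of the protein. Only the first occurrence of the peptide is used.
    """
    if not 2 <= wing_length <= 6:
        raise ValueError("wing_length should be between 2 and 6 residues")
    start = protein_sequence.find(tryptic_peptide)
    if start < 0:
        raise ValueError("peptide not found in protein sequence")
    end = start + len(tryptic_peptide)
    n_wing = protein_sequence[max(0, start - wing_length):start]
    c_wing = protein_sequence[end:end + wing_length]
    return n_wing + tryptic_peptide + c_wing


# Hypothetical protein sequence and signature peptide, for illustration only
protein = "MKTAYIAKQRQISFVKSHFSRQLEER"
standard = winged_peptide(protein, "QISFVK", wing_length=3)
```

Here the peptide QISFVK is extended to KQRQISFVKSHF, so the labeled standard must be cleaved by trypsin at both ends, just like the endogenous protein, before the analyte peptide is released.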

Similarly, the confidence in the accuracy of the qualification/verification assay should increase as the credential of the biomarker candidate increases. Although achieving total accuracy in mass spectrometry-based protein quantification is not possible, the assays used for high-credential candidates should have high specificity, reproducibility, precision (less than 35 % CV), and sensitivity for target quantification [65]. Assays undergoing analytical validation are evaluated on their precision, linear dynamic range, and sensitivity (LOD and LLOQ). If a prefractionation/enrichment step is implemented prior to MS analysis, that step also needs to be validated as part of the overall assay validation for factors such as run-to-run variation, recovery, and carryover. Ideally, the assays for high-credential candidates should be amenable to standardization across laboratories and translation into clinical assays.
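Precision and a simple LLOQ estimate can be computed from replicate measurements as sketched below. The percent CV is the sample standard deviation divided by the mean; the LLOQ is taken here, as a simplifying assumption, to be the lowest spiked concentration at and above which every level meets the CV limit. The 35 % default mirrors the precision criterion quoted above; formal LLOQ definitions usually also include an accuracy (bias) criterion, which this sketch omits. Function names and data are hypothetical.

```python
from statistics import mean, stdev


def percent_cv(values):
    """Coefficient of variation (%) of replicate measurements."""
    return 100.0 * stdev(values) / mean(values)


def estimate_lloq(replicates_by_concentration, max_cv=35.0):
    """Lowest concentration for which it and all higher levels pass the CV limit.

    replicates_by_concentration: {spiked_concentration: [replicate responses]}
    Returns None if even the highest level fails.
    """
    lloq = None
    for conc in sorted(replicates_by_concentration, reverse=True):
        if percent_cv(replicates_by_concentration[conc]) > max_cv:
            break
        lloq = conc
    return lloq


# Hypothetical response curve: triplicate responses at three spiked levels
curve = {1: [50, 100, 150], 5: [90, 100, 110], 25: [480, 500, 520]}
lloq = estimate_lloq(curve)
```

In this made-up curve the lowest level has a CV of 50 % and fails, while the two higher levels have CVs of 10 % and 4 %, so the estimated LLOQ is 5 (in whatever concentration unit the curve uses).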

6 Summary

By far the most challenging step in the biomarker development pipeline is isolating the true biomarkers from the large pool of differentially expressed proteins identified in the discovery phase. The large size of the initial candidate pool is due to several factors, including a high false-positive discovery rate, the poor quality of clinical samples, the high complexity of clinical samples, and the lack of highly specific and quantitative assays for quantifying all proteins in biofluids. Recent advances in targeted MS-based technologies such as AIMS, PRM, SWATH, and SID-SRM-MS show the potential to alleviate the bottleneck in the biomarker pipeline. Among them, SID-SRM-MS assays have proven to be the most reliable approach for biomarker qualification/verification. With the progress made in recent years, it is becoming a more realistic possibility that the SID-SRM-MS approach can also be developed into an FDA-approvable assay for clinical testing. MS-based clinical assays can complement traditional immunoassays well, especially for protein biomarkers that high-quality ELISA assays cannot detect, or in cases where protein isoforms or posttranslational modifications constitute the biomarker. In this chapter, we proposed a fit-for-purpose, staged biomarker qualification/verification workflow to verify the hundreds of candidates generated in the discovery phase in a cost-effective and rapid manner. This workflow starts with a data-dependent biomarker candidate triage step using semi-quantitative AIMS, PRM, or SWATH approaches, followed by SID-SRM-MS-based qualification and verification of the candidates that survive the triage. The accuracy and precision of qualification/verification assays for final candidates need to be confirmed at every step. The rigor of biomarker assay validation should increase as the credential of the biomarker candidate increases. 
This continuous and evolving fit-for-purpose strategy will conserve resources and efforts in the qualification/verification stages of biomarker development and increase the chance to identify a successful biomarker panel.