INTRODUCTION AND SCOPE

Liquid chromatography-mass spectrometric (LC-MS) methodologies, including conventional LC-tandem mass spectrometry (MS/MS) and LC-high-resolution accurate mass spectrometry (HR/AMS), are emerging as important techniques for quantifying protein therapeutics in biological matrices. While ligand-binding assays (LBAs) have historically been the only platform available for protein bioanalysis, MS-based technology is now providing alternative approaches with inherent characteristics that complement LBAs: a truly orthogonal detection principle, based on the physicochemical properties of the target protein analyte or proteotypic peptide sequences present in its primary structure and a potentially greater tolerance for interferences from other proteins (e.g., anti-drug antibodies (ADAs)) which may be present. The general strategy for LC-MS/MS protein bioanalysis is shown in Fig. 1. For simplicity, the scope of this paper was limited to the most common approach currently applied to most protein biotherapeutics, in which they are indirectly quantified using LC-MS/MS instrumentation to measure one or more surrogate peptide(s) derived from proteolytic digestion of the analyte (Fig. 1b). Although inherently simpler, the alternative approach of intact analysis is currently practical only for peptide and relatively small protein biotherapeutics when using LC-MS/MS (Fig. 1a). Recent advances in LC-HR/AMS instrumentation are making direct quantification possible for larger proteins. LC-MS-based protein analysis is rapidly evolving, and intact approaches are likely to become increasingly utilized in regulated studies. While most of the recommendations discussed in this paper are expected to be applicable, intact analysis—particularly utilizing high-resolution accurate mass techniques—introduces additional technical and validation considerations that may be addressed in a future publication.

Fig. 1

Protein LC-MS/MS bioanalysis general strategy. a Direct measurement approach for analysis of intact peptide and small protein analytes up to ∼10 kDa in size. b Indirect measurement approach for analysis of larger proteins, via surrogate peptides produced by proteolytic digestion. MW molecular weight, PPT protein precipitation, SPE solid-phase extraction, AC affinity capture

LC-MS/MS assays can typically be developed and implemented more quickly than LBAs, and, since they are less reliant on specific critical assay reagents, they often can be readily extended to analyze multiple drug candidates. LC-MS/MS assays have been reported for the determination of total drug (1), multiple analytes in combination therapies (2–4), products of post-translational modifications (5), as well as protein catabolites (6). As biotherapeutics progress from discovery into development, their bioanalytical assays must be validated to meet global regulatory requirements. A common question in the bioanalytical community is which parts of current regulatory guidelines are applicable and should be followed: the chromatographic (small molecule) sections or the LBA (large molecule) sections? The concepts incorporated in this white paper arose from extensive discussions among practicing scientists in the field from both pharmaceutical and contract research laboratories. This group of experts formed a pre-competitive consortium whose purpose was to consider the development of protein LC-MS/MS methods and better define the process to validate them within the constraints of existing guidance documents for bioanalytical method validation of both small molecules and protein biotherapeutics (7–10) and current global regulatory trends (11). The primary goal of this collaborative effort with the Protein LC-MS Bioanalysis Subteam of the AAPS Bioanalytical Focus Group (BFG) was to propose a set of assay parameters that should be evaluated during the validation of protein LC-MS/MS bioanalytical methods and, where appropriate, to establish scientifically sound and applicable validation acceptance criteria. Consideration was given to the traditional validation approaches for LBA and small molecule LC-MS/MS methods, which are summarized along with our recommendations for protein LC-MS/MS assays in Table I.
There are many classes of biotherapeutics, each with specific characteristics that cannot be fully addressed in a single article. As noted, this paper focuses on the validation of biotherapeutic protein bioanalytical methods in which a proteotypic peptide, produced by enzymatic digestion, is quantified as a surrogate for the target protein. This scope was intentionally made narrow as a first effort to provide some basic principles for the validation of protein LC-MS/MS-based bioanalytical methods. The bioanalytical considerations discussed are intended to be appropriate for supporting regulated non-clinical toxicokinetic (TK) and clinical pharmacokinetic (PK) studies.

Table I Comparison of Conventional Method Validation Parameters for Protein LBA and Small Molecule LC-MS/MS, with those Proposed for Protein LC-MS/MS

The selection of the overall LC-MS/MS method approach depends on many factors, including the matrix type, the analyte structure, the choice and availability of reagents (e.g., stable isotope-labeled (SIL) internal standard (IS), affinity capture agents, proteases), and the required sensitivity and specificity. The assay format that will be most suited to quantify the protein biotherapeutic must be selected. Several example procedures are illustrated in Fig. 2. As a general rule, it is recommended to use the simplest approach that will achieve the required selectivity/specificity, sensitivity, accuracy, and precision in the intended matrix and species. Approaches can range from traditional sample preparation (e.g., protein precipitation or solid-phase extraction) to affinity capture enrichment strategies and from generic to specific LC-MS/MS assays (12–17). Depending upon assay requirements, a protein therapeutic may be analyzed following simple direct proteolytic digestion of the biomatrix sample without enrichment or by more complex methods involving highly selective affinity capture enrichment prior to and/or after digestion. Special validation considerations with unique caveats are made for evaluating selectivity, matrix effect, and method recovery for protein LC-MS/MS quantification methods.

Fig. 2

Example protein LC-MS/MS quantification procedures. a Protein-level affinity capture, b peptide-level affinity capture, and c affinity reagent-free procedure. *Digest may involve a sequence of processing steps, including denaturation, reduction, and alkylation, prior to proteolysis. PPT protein precipitation, SPE solid-phase extraction

This white paper may also provide useful principles for validating LC-MS/MS-based assays for other types of biotherapeutics, including undigested peptides and proteins, antibody-drug conjugates (ADCs), and other hybrid protein-based biotherapeutics. Similar principles could also be taken into consideration for LC-MS/MS-based assays to quantify endogenous protein biomarkers.

SURROGATE AND MONITORING PEPTIDES

Surrogate Peptide

An appropriate surrogate peptide must be chosen for LC-MS/MS bioanalysis of a digested protein. This peptide should be unique to the target protein, and its chromatographic signal generated by a particular Selected Reaction Monitoring (SRM) transition must be free from interferences due to other peptides, processing reagents, or other endogenous material from the sample matrix. The surrogate peptide must also exhibit sufficient sensitivity to reach the desired lower limit of quantification (LLOQ) and must be sufficiently stable to survive both the digestion process and overall bioanalytical procedure. Peptides containing amino acids that may be susceptible to modification in vivo or during processing and analysis (e.g., methionine) should be avoided if possible.

Surrogate peptide candidates are initially sought by in silico analysis using computer programs that evaluate the protein’s amino acid sequence vs. proteolytic enzyme specificity to predict fragment peptides. Candidate peptides are usually selected from a specific region of interest in the protein molecule. As an example, with antibody therapeutics, peptides from the variable complementarity-determining regions (CDR) are most appropriate for clinical applications; whereas, human-specific peptides from the constant framework regions are often appropriate for non-clinical assays (13,14). The final surrogate peptide for quantification is best selected during method development from potential candidates using experiments with actual processed matrix samples to evaluate and confirm sensitivity, selectivity, chromatographic properties, and reproducibility (18).
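The in silico step described above can be sketched in a few lines of Python. This is a minimal illustration, not a substitute for dedicated software. The function names, the length limits, and the toy sequences are hypothetical. The cleavage rule (after lysine or arginine, but not when the next residue is proline) is a commonly used simplification of trypsin specificity and is an assumption here. Candidates that also occur in a digest of background proteins, or that contain methionine, are filtered out.

```python
import re

def tryptic_peptides(sequence, min_len=6, max_len=25):
    """Predict tryptic peptides from a one-letter amino acid sequence.

    Assumed rule: cleave after K or R unless the next residue is P.
    The length limits are illustrative, not a recommendation.
    """
    # Zero-width split (Python 3.7+): position after K/R and not before P.
    fragments = re.split(r"(?<=[KR])(?!P)", sequence)
    return [f for f in fragments if min_len <= len(f) <= max_len]

def candidate_surrogates(target, background, avoid="M"):
    """Keep target peptides that are absent from the background digest
    and that contain none of the residues listed in `avoid`."""
    background_peptides = set()
    for seq in background:
        background_peptides.update(tryptic_peptides(seq))
    return [
        p for p in tryptic_peptides(target)
        if p not in background_peptides and not any(res in p for res in avoid)
    ]

# Toy sequences, invented for illustration only.
target = "GLEWVAKMTDSYLRASGFTPEKVTNRPLLSK"
background = ["SSLKGLEWVAKTTPPGGR"]
print(candidate_surrogates(target, background))
```

Any peptide that passes such a filter remains only a candidate. As stated above, the final choice is best confirmed experimentally in processed matrix samples.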

Monitoring Peptides

Because LC-MS/MS technology readily supports multi-analyte testing, it is possible to obtain qualitative structure-related information simultaneously with quantification of the target protein by measuring multiple peptides in the assay. This information is potentially valuable for gaining insights into possible biotransformation of the protein in vivo, and it may also be useful for assay troubleshooting. The capability to obtain specific peptide sequence-based molecular characterization information is a novel aspect of LC-MS methods. The following describes how these optional data can be obtained. In addition to the surrogate peptide for quantification, one or more secondary peptides may be chosen from the list of potential peptide candidates for use as “monitoring” peptide(s) (18). These peptides are selected based on: (1) their location in the protein amino acid sequence to provide structural information and (2) their chromatographic, mass spectral, and stability properties, as described above for the surrogate peptide. The number of monitoring peptides used may vary depending on the amount of secondary information desired for the analyte and the availability of peptide fragments with suitable analytical properties. For a small protein biotherapeutic that does not consist of multiple subunits or polypeptide chains, as well as large proteins with stable structures that are not expected to undergo complex biotransformation (e.g., most monoclonal antibodies (mAbs)), a single monitoring peptide may provide sufficient structural information. If the protein has multiple subunits or polypeptides connected by linker(s), additional monitoring peptides from the appropriate regions may be of interest.

Under some circumstances, a monitoring peptide may be specifically associated with an additional molecular species that is intended to be separately quantified (e.g., a different isoform, expected catabolite, or degradation product). In such cases, the monitoring peptide would be more appropriately classified as a surrogate peptide representing another analyte, which may require an additional reference standard if a multi-analyte method validation is to be conducted.

The ratio of the chromatographic signal for the monitoring peptide(s) to the surrogate peptide is generally determined semi-quantitatively using their absolute peak areas or IS-normalized peak area ratios and evaluated for consistency or potential change within a set of samples. Peptides may sometimes exhibit different response linearity across the assay range, leading to a shift in their relative response ratio vs. analyte concentration. Applying regression analysis to the peptides’ responses across the analytical range of the assay can compensate for these differences. Evaluation of monitoring peptides begins with method development and may continue through method validation and sample analysis to assess assay consistency and potential trends. Any notable change in the ratio(s) that is observed to trend over a PK or TK profile may be indicative of biotransformation (e.g., protein subunits have been cleaved) or that the integrity of the molecule has been otherwise altered in vivo. As these data are for characterization purposes and are not intended for quantification, it may not be necessary to establish specific data acceptance criteria for the ratio(s).
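The ratio evaluation described above can be sketched as follows. This is a hypothetical illustration: the function names and the example peak areas are invented. The reference ratio is assumed to be the mean ratio observed in the standards or QCs. No acceptance limit is applied, consistent with the characterization-only purpose of these data.

```python
from statistics import mean

def ratio_profile(monitoring_areas, surrogate_areas):
    """Monitoring/surrogate peak area ratio for each sample, in order."""
    return [m / s for m, s in zip(monitoring_areas, surrogate_areas)]

def percent_change_from_reference(ratios, reference_ratios):
    """Percent deviation of each sample ratio from the mean reference ratio
    (assumed here to come from standards or QCs)."""
    ref = mean(reference_ratios)
    return [100.0 * (r - ref) / ref for r in ratios]

# Invented peak areas for QCs and for successive PK time points.
qc_ratios = ratio_profile([500, 5000], [1000, 10000])
pk_ratios = ratio_profile([480, 400, 300], [1000, 1000, 1000])
print(percent_change_from_reference(pk_ratios, qc_ratios))
```

A ratio that drifts steadily across the time points of a profile, as in the invented data here, is the kind of trend that would prompt a closer look at possible biotransformation.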

While the evaluation of monitoring peptides is not considered an essential requirement for protein quantification, these data can easily be simultaneously acquired with quantification data and may provide potentially significant characterization or troubleshooting information. Despite the potential theoretical value of monitoring peptides, our experience with them in regulated studies has been limited, and actual applications in the literature are just beginning to emerge. With any protein LC-MS/MS bioanalytical assay, it is important that detailed knowledge of the protein therapeutic, its potential catabolism in vivo, likely sites of cleavage, and routes of clearance for both the protein and putative catabolites be closely coupled with the use and interpretation of data provided by monitoring peptides.

Use of Multiple SRMs per Peptide

Within the constraints of the chromatographic peak width and the MS duty cycle, it may be useful to acquire data from more than one SRM transition or channel per peptide (either from different precursor ions (e.g., +2 vs. +3) or from the same precursor with different product ions). In most assays, a single SRM transition is used to quantify a target peptide and another to monitor its IS, while secondary transitions might be used for qualitative confirmation. However, where the data system software supports it, it may sometimes be beneficial to sum multiple SRM signals to create a composite signal to be used for quantification. This practice, while not applied routinely, may be advantageous to improve assay sensitivity when the following conditions are met:

  • Each of the individual SRM signals to be summed can be traced to the same peptide within a single chromatographic peak (i.e., their multiple mass chromatographic profiles are superimposed with the same retention time)

  • When monitored separately, the SRMs show the same signal to concentration relationship

  • The composite SRM signal exhibits a higher signal-to-noise ratio than the individual SRM signals.
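The first and third conditions can be checked numerically, as in the sketch below. It is hypothetical: the function names and the traces are invented. Signal-to-noise is assumed to be peak height above the mean baseline divided by peak-to-peak noise in a blank window, which is only one of several conventions in use. The second condition (same signal-to-concentration relationship) needs calibration data and is not covered.

```python
def signal_to_noise(trace, peak_window, noise_window):
    """Peak height above mean baseline divided by peak-to-peak noise.

    `trace` is a list of intensities; both windows are slice objects.
    This S/N definition is an assumption made for the sketch.
    """
    noise_region = trace[noise_window]
    baseline = sum(noise_region) / len(noise_region)
    noise = max(noise_region) - min(noise_region)
    if noise == 0:
        raise ValueError("noise window is flat; choose a different window")
    return (max(trace[peak_window]) - baseline) / noise

def composite_trace(traces):
    """Point-by-point sum of SRM traces acquired on the same time axis."""
    return [sum(points) for points in zip(*traces)]

def coeluting(traces):
    """True when every trace has its apex at the same time point."""
    return len({t.index(max(t)) for t in traces}) == 1

def evaluate_summing(traces, peak_window, noise_window):
    individual = [signal_to_noise(t, peak_window, noise_window) for t in traces]
    composite = signal_to_noise(composite_trace(traces), peak_window, noise_window)
    return {
        "coeluting": coeluting(traces),
        "individual_sn": individual,
        "composite_sn": composite,
        "summing_helps": composite > max(individual),
    }

# Invented traces for two SRM transitions of the same peptide.
srm_1 = [1, 2, 1, 2, 10, 20, 10, 1, 2, 1]
srm_2 = [2, 1, 2, 2, 9, 18, 9, 2, 1, 2]
print(evaluate_summing([srm_1, srm_2], slice(4, 7), slice(0, 4)))
```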

When using additional SRM transitions for qualitative purposes, the ratio of signal for the secondary channel(s) to that of the primary quantification channel is determined empirically during method development and validation. As these data are for qualitative confirmatory purposes and not for quantification, it is not necessary to establish specific data acceptance criteria for the ratio(s). During sample analysis, any notable change in the SRM ratio(s) observed for study samples, relative to those of the standards and quality controls (QCs), may be indicative of an interference issue.

SELECTION OF STANDARDS AND CRITICAL ASSAY REAGENTS

Selection of Reference Standard

Reference standards of protein therapeutics should be well characterized and representative of the material to be used in non-clinical and clinical trials (19). For regulated studies, either the drug substance (purified protein in a buffered solution or lyophilized powder) or the drug product (formulated protein in a buffered solution with additives) is frequently used. A Certificate of Analysis (COA) or appropriate Analytical Report that documents the material’s biochemical, biophysical, and biological properties should contain, at a minimum, protein content or concentration, results of identity tests (e.g., peptide sequencing by MS, amino acid composition) and characterization (e.g., total protein assay, size exclusion), and information related to storage conditions and stability or expiry. For best analytical results using LBAs, it is recommended to prepare standards and QC samples with the same batch or lot of drug substance or product dosed in the non-clinical and clinical studies (8). For LC-MS/MS protein assays, most enrichment procedures and the generation and detection of the surrogate and monitoring peptides are unaffected by small differences between drug product lots. However, in cases where lot-to-lot variation in the protein may directly impact the efficiency of a critical procedure step, such as specific affinity capture, the assay result could be affected, and it may be advisable to use the dosed drug material as the reference standard. With some LC-MS/MS methods, the use of the drug substance may be preferable over the drug product to avoid potential matrix ionization effects from residual additives. The impact of the source and composition of the reference standard on assay performance should be evaluated during method development.

The presence of different isoforms in the reference material can be of concern for LC-MS/MS, especially protein fragments that contain the surrogate peptide or proteins with modifications (e.g., deamidation) within the surrogate peptide sequence. However, it is rare that information on the site(s) of modification is included in the COA (16). The presence of aggregates may or may not affect results, depending on the completeness of digestion prior to analysis. It is unclear at this time how heterogeneity or potential in vivo-induced changes in glycosylation may affect certain procedure steps, such as digestion efficiency, and this may require investigation in the future. In spite of these potential complexities, for most assays, the stated uncorrected total protein concentration (i.e., drug content) should be used in the calculations of standard and QC concentrations.

Selection of Internal Standard

Sample processing for quantitative analysis of proteins in biological samples is generally complex compared with that for small molecule drugs, and use of an appropriate IS is highly recommended to compensate for variations in sample preparation and LC-MS/MS analysis. It has been well established that the ideal ISs for quantitative assays utilizing MS-based detection are SIL forms of the target analyte. Labeling with 15N, 13C, and sometimes 18O atoms is generally preferred over deuterium, which may undergo proton exchange reactions under certain conditions. Multi-deuterated analytes also exhibit isotope effects that can cause significant shifts in their chromatographic retention times from those of their unlabeled counterparts. Even a partial separation between the SIL-IS and the analyte peaks can reduce its ability to compensate for matrix-related ionization effects.

A key consideration in selecting an appropriate SIL-IS is the degree of labeling (i.e., number of heavy atoms and total mass difference between the SIL-peptide IS and the unlabeled analyte peptide). As with small molecule analytes, the mass difference should be sufficient to avoid overlapping of the MS signals between the IS ions and the natural isotopic ions from the analyte. An additional complication with peptides is their tendency to produce multiply charged ions, which effectively reduces the apparent mass difference detected by the MS instrument. This must be taken into account to ensure selectivity between the SIL-IS and the analyte. This consideration was exemplified in a recent article where appropriate surrogate peptides were selected from the constant regions of the light and heavy chains of human IgG1 and IgG2 antibody subclasses for analysis (13).
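The effect of charge state on the apparent mass difference follows from dividing the label mass shift by the charge, as in the short sketch below. The function name and the nominal 6 Da shift are hypothetical examples and are not taken from the cited work.

```python
def apparent_mz_difference(mass_difference_da, charge):
    """m/z separation between SIL-IS and analyte ions of the same charge."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return mass_difference_da / charge

# A nominal 6 Da label (illustrative value) at different charge states.
for z in (1, 2, 3):
    print(z, apparent_mz_difference(6.0, z))
```

For a nominal 6 Da label, the separation falls from 6 m/z units for a singly charged ion to 2 m/z units at a charge of +3, which is why the degree of labeling should be considered together with the precursor charge state that will be monitored.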

The types of ISs utilized in protein LC-MS/MS assays are summarized in order of potential preference, along with their advantages and disadvantages, in Table II.

Table II Types of Internal Standards Utilized in Protein LC-MS/MS Assays

Although not commonly available, a SIL-protein having the same physicochemical properties (12) as the target protein is generally considered to be the ideal IS. These are produced by incorporating one or more “heavy” SIL-amino acid residues into the protein structure, including the surrogate and monitoring peptide portions of the sequence, during recombinant synthesis (20). A SIL-protein IS is added to the samples at the beginning of sample preparation, thereby compensating for variations occurring in all steps of the assay (13).

More frequently employed are either extended or final-length SIL-peptide IS forms of the surrogate and monitoring peptides. Due to their relatively small size, SIL-peptide ISs can be either chemically synthesized from SIL-amino acids or labeled through differential derivatization using SIL-reagents. Examples of the latter have been reported, such as differential dimethyl labeling (21) and 18O/16O iodoacetic acid labeling of cysteine residues (22). The use of an extended SIL-peptide IS, which includes extra flanking amino acids (typically three to six non-labeled) added to one or both ends of the SIL-surrogate peptide, may provide compensation for variation in the digestion procedure. The added “wings” will be cleaved during digestion to produce the SIL-peptide IS. While this approach is often useful, care should be taken because the digestion efficiency of the much smaller extended SIL-peptide may differ from that of the analyte protein. As SIL-peptide ISs do not participate in the protein extraction or enrichment steps, and only an extended version can compensate for variation in the digestion, the reproducibility of any steps uncontrolled by the IS should be carefully optimized and confirmed during method development and validation (23).

Least commonly used are non-labeled analog proteins or peptides. Successful examples have been reported using molecules with appropriate properties (14,17). For example, selection of similar size surrogate peptides of multiple mAb analytes at the same framework location led to similar digestion and chromatographic behaviors for bioanalysis of the mAbs from a cassette dosing study (14). The selected peptide for the IS should also have similar size and chromatographic retention to that from the analyte protein.

As noted, there are many IS choices that can be successfully used in a quantitative protein assay. Trade-offs in potential benefits, cost, and availability must be weighed and the ultimate effectiveness of the IS verified during method validation. Figure 3 illustrates workflows for the incorporation of the different IS types into three different affinity capture-based LC-MS/MS assay formats.

Fig. 3

Affinity capture LC-MS/MS workflows with SIL-IS options. a Protein-level affinity capture, b peptide-level affinity capture, and c double-affinity capture. A SIL-protein IS is added at the beginning of the sample processing procedure, while an extended SIL-peptide IS is added prior to digestion. A SIL-peptide IS may be added before digestion to compensate for potential proteolysis-associated degradation or after

Selection of Critical Assay Reagents

Although the reference standard, IS, and other solutions may be considered as critical reagents in the assay, we limit the definition to the affinity capture reagents (e.g., binding proteins, aptamers, or antibodies) and protease digestion enzymes that have a direct impact on the assay results (8). It is important to note that these materials are usually added in excess in protein LC-MS/MS methods. In addition, due to the high selectivity of LC-MS/MS detection, the quality (and potential specificity) of the affinity capture reagent is generally less critical than is required for LBAs (24).

There are two types of affinity capture techniques generally used for isolation or enrichment at the protein or (surrogate) peptide level: specific, typically involving immunoaffinity interactions with an immobilized antibody or target ligand/receptor, and non-specific, generally involving affinity interactions with a generic binding protein, such as protein A/G or anti-Fc. The latter approach using commercially available reagents can be especially useful when specific custom reagents are not available. The LC-MS/MS provides the additional selectivity required for the generic approach. Figure 2 depicts example affinity capture enrichment procedures at the protein level (Fig. 2a) and at the peptide level (Fig. 2b). Other variations are also used.

Bead-based enrichment techniques (at protein or peptide level) typically do not reuse the affinity capture reagent, thus eliminating potential carryover issues. Column-based enrichment methods, typically used in online peptide enrichments, must be evaluated for carryover and recovery because the affinity capture reagent is typically regenerated with each run. In both modes, the amount of capture reagents should be in stoichiometric excess relative to the expected highest concentrations of the analyte and other endogenous background proteins that may also be captured to ensure sufficient binding capacity.

For protein analyte digestions, trypsin, which cleaves at arginine and lysine residues, is by far the most frequently used enzyme. Sequencing-grade trypsin, which has been specially treated to prevent autolysis, is preferred by many to improve digestion efficiency and avoid additional non-specific cleavages. However, other less-expensive grades have been successfully used. When different cleavage specificity is needed to obtain more suitable peptides (e.g., shorter, more proteotypic, or representative of a specific molecular region), a variety of other proteases are available, including Lys-C, Glu-C, Arg-C, and Asp-N, in addition to chemical cleavage (e.g., cyanogen bromide and formic acid).

VALIDATION EXPERIMENTS AND CRITERIA

Selectivity/Specificity

The terms selectivity/specificity are collectively defined as: “the ability of the bioanalytical method to measure and differentiate the analytes in the presence of components that may be expected to be present” (7,11). Although often used interchangeably, in the context of protein analysis, using either LC-MS/MS or LBA technology, selectivity is better defined as the ability to measure the analyte of interest in the presence of unrelated compounds in the matrix; whereas, specificity is defined as the ability to distinguish and measure the analyte in the presence of structurally “related compounds” or drugs expected to be concomitantly administered (8). For example, the application of affinity capture in a protein LC-MS/MS method may significantly enhance the selectivity of the assay due to the distinctive ability of the capture reagent to bind the analyte over the matrix components. However, the specificity of the assay could be impacted by potential cross-reaction or interference from related substances. Consideration should be given to evaluating both non-specific and specific sources of matrix interference on assay selectivity/specificity as described below.

Non-specific Matrix-Related Interferences

Sources of potential non-specific matrix interference are widely varied, ranging from salts and endogenous lipids (causing MS ionization effects) to unrelated proteins. As with LBAs, the presence of human anti-mouse antibodies (HAMA), rheumatoid factor, and heterophilic antibodies may impact affinity capture-based methods. In clinical studies, the occurrence of lipemic or hemolyzed samples, and potential matrix compositional differences between normal and relevant disease populations, may also be of concern. Therefore, selectivity testing should be performed on multiple matrix lots, including lots of normal and disease-state matrix for clinical applications.

Specific Interferences

Sources of potential specific interference include physicochemically similar molecules, such as endogenous protein analogs, analyte isoforms, and analyte-derived degradants and catabolites. Other “related compounds,” with respect to their specific structural complementarity that promotes binding to the analyte, include ADAs and soluble target ligands. In addition, with LC-MS/MS-based methods, concomitant medications of both small and large molecule types may need to be evaluated for potential interference effects when applicable. Small molecule drugs could affect chromatography or MS ionization, whereas large molecule protein drugs may impact an affinity capture-based procedure or produce interfering peptides.

It can be challenging to assess interference from ADAs, soluble target, and catabolites without true reference materials for testing. Although not universally applicable, the potential effects of ADAs can sometimes be simulated by using a positive control antibody (preferably polyclonal) from immunogenicity testing. For other protein components (e.g., soluble target) recombinant substitutes may suffice to investigate possible interference with affinity capture. In practice, appropriate “related molecules” are often unavailable, and specificity evaluations may need to be conducted after the original validation is completed. Whenever possible, it is recommended to evaluate incurred samples from similar studies, following treatment with the protein biotherapeutic and/or dosed with concomitant medications, to assess potential interference differences relative to normal control matrix (i.e., drug-naive sample matrix obtained from animals or subjects who have not been exposed to the biotherapeutic or concomitant drug).

When a multi-analyte LC-MS/MS method is developed, cross-analyte interference should be evaluated among the surrogate peptides from the different protein analytes, their corresponding SIL-IS peptides, and between the analytes and ISs.

Whenever possible, it is advisable to evaluate selectivity/specificity during method development and identify any assay weaknesses that may need to be addressed or noted as a limitation. Confirmatory testing is then conducted during validation as described below.

Validation Considerations

Validation experiments to evaluate non-specific matrix-related interferences should be conducted according to the applicable method validation guidance (7–10). As with small molecule chromatographic methods, selectivity is evaluated by analyzing blank matrix aliquots that are (1) unfortified, (2) fortified with only IS, and (3) fortified with the protein analyte at the LLOQ and the IS. A minimum of six matrix lots are typically required; however, additional lots may be more appropriate for clinical applications, particularly when normal vs. disease state samples are to be compared. Consistent with most current chromatographic guidelines, the recommended acceptance criteria are that the response of any background component in the unfortified blank matrix samples should be less than 20% of the LLOQ response for the analyte and less than 5% of the IS response at the working concentration. Acceptable lot-to-lot selectivity to achieve quantification at the LLOQ is demonstrated when the accuracy for at least 80% of the analyte-fortified lots is within ±25% of the theoretical analyte concentration. In addition to accuracy, which is usually based on calculations using analyte/IS ratios, the absolute analyte responses should be sufficient (see below under “Lower Limit of Quantification”) and reproducible across the lots. Significant variation in responses between lots may indicate that an underlying issue (e.g., extraction recovery, digestion efficiency, matrix effect) should be further evaluated to avoid potential problems with study sample analysis.
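The selectivity acceptance criteria stated above can be expressed as two simple checks, sketched here in Python. The function names and the example values are hypothetical. The thresholds (20% of LLOQ response, 5% of IS response, ±25% accuracy in at least 80% of lots) are those recommended in the text.

```python
def blank_interference_ok(blank_analyte_resp, lloq_resp, blank_is_resp, is_resp):
    """Background in an unfortified blank must be < 20% of the LLOQ analyte
    response and < 5% of the IS response at the working concentration."""
    return (blank_analyte_resp < 0.20 * lloq_resp
            and blank_is_resp < 0.05 * is_resp)

def lloq_selectivity_ok(measured, nominal, tolerance=0.25, min_fraction=0.80):
    """At least `min_fraction` of LLOQ-fortified lots must read within
    +/- `tolerance` (as a fraction) of the nominal concentration."""
    within = [abs(m - nominal) / nominal <= tolerance for m in measured]
    return sum(within) / len(within) >= min_fraction

# Invented results for six matrix lots fortified at an LLOQ of 1.0 (arbitrary units).
lots = [1.0, 1.1, 0.9, 1.2, 0.8, 1.4]
print(lloq_selectivity_ok(lots, nominal=1.0))
print(blank_interference_ok(15, 100, 4, 100))
```

In the invented data, five of six lots fall within ±25%, so the 80% requirement is met even though one lot fails.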

Validation experiments to evaluate specific interferences should be conducted, if appropriate and technically feasible (i.e., appropriate material is available), by assaying samples spiked with various levels (including the highest anticipated concentration) of each of the specificity test materials into blank matrix and matrix containing the therapeutic protein at the LLOQ concentration. Evaluation of specificity for some potential interferents may need to be conducted after the original validation is completed, when appropriate materials or incurred samples become available. The accuracy of the fortified specificity samples should be within ±25% of the nominal concentration.

For a multi-analyte assay, each analyte spiked at the ULOQ and each SIL-IS spiked at the working concentration should be evaluated for any possible cross-analyte interference. The recommended target acceptance criteria are that the response detected in the SRM channel of any other analyte is less than 20% of its mean LLOQ response and that the response detected in the SRM channel of any IS is less than 5% of the mean IS response at the working concentration.

Matrix Effect on Ionization

Matrix effect has been defined as the direct or indirect alteration or interference in response due to the presence of unintended analytes (for analysis) or other interfering substances in the sample (8). In the context of mass spectrometric methods, the term matrix effect primarily refers to potential suppression or enhancement of the MS signal, particularly with the most commonly used electrospray ionization (ESI) mode. This phenomenon, which occurs to varying degrees in all matrix samples, is well known in small molecule drug analysis and similarly may impact large molecule protein analysis when detecting surrogate analyte and IS peptides. MS matrix effects are conventionally assessed by determining the magnitude and consistency of matrix factors (MF) measured in multiple lots of matrix; however, performing this experiment can be challenging for a protein assay because the analyte is not measured directly. As an alternative, the MF can be evaluated using synthetic peptides as the actual surrogate analyte (and IS) being detected and measured in the sample extract, similar to the approach for small molecule method validation (7–10). It should be kept in mind that the most important aspect of these tests is to demonstrate that the matrix ionization effects are consistent among individual matrix lots from different donors.

Validation Considerations

MF is most easily assessed by spiking synthetic surrogate peptide and SIL-peptide IS into extracts of matrix blank and reagent blank samples (free of matrix components) that were processed according to the method-specified sample preparation steps, including enrichment, reduction, alkylation, and/or digestion. The absolute and IS-normalized MF values can then be calculated by comparing the analyte and IS SRM peak responses from the appropriate samples. It should be noted that non-specific binding (NSB) to pipettes and sample containers may cause significant difficulties in preparing and handling neat low concentration peptide spiking solutions and fortified reagent blank samples, leading to anomalously low peptide signals and erroneous absolute MF values. It is recommended to evaluate potential NSB issues during method development. In practice, the IS-normalized MF values determined in spiked matrix samples provide the most reliable indication of lot-to-lot assay performance consistency with respect to matrix ionization effects. It is recommended that at least six individual lots of matrix be evaluated for MF at both low and high analyte concentrations, with an acceptance criterion that the CV of the IS-normalized MF calculated across matrix lots should not be greater than 20%. In some clinical applications, where a significant incidence of hemolyzed or hyperlipidemic samples may be expected or observed, or when required by applicable regulatory guidelines, the potential impact of these factors on MF and the assay performance should also be considered as needed.
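
The MF calculation can be sketched as follows. This is a minimal example that assumes the conventional definition of MF as the peak response in post-extraction spiked matrix divided by the response in the equivalently spiked reagent blank; the text above does not spell out the formula, and the function names are hypothetical.

```python
from statistics import mean, stdev

def is_normalized_mf(analyte_matrix, is_matrix, analyte_neat, is_neat):
    """IS-normalized MF for one matrix lot (assumed conventional definition)."""
    mf_analyte = analyte_matrix / analyte_neat  # absolute MF, surrogate peptide
    mf_is = is_matrix / is_neat                 # absolute MF, SIL-peptide IS
    return mf_analyte / mf_is

def percent_cv(values):
    """Coefficient of variation (%) using the sample standard deviation."""
    return 100.0 * stdev(values) / mean(values)

def mf_consistent(lot_mfs, limit_percent=20.0):
    """True if the CV of IS-normalized MF across lots is within the limit."""
    return percent_cv(lot_mfs) <= limit_percent
```

In this sketch, an absolute MF below 1 indicates suppression and above 1 indicates enhancement; the acceptance decision rests only on the CV of the IS-normalized values across at least six lots, evaluated separately at the low and high analyte concentrations.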

In cases where traditional determination of MF is considered impractical (e.g., excessive NSB issues would impact non-matrix sample reliability), or impossible because the surrogate peptide and/or SIL-peptide IS are not available (e.g., only the analyte protein and/or a SIL-protein IS were used without the corresponding surrogate peptide(s) being synthesized), matrix ionization effects may be assessed by analyzing multi-lot QC samples, each prepared using matrix from at least six different donors. Low and high levels of analyte may be evaluated, as in the MF approach, or alternatively, the data from the multi-lot matrix selectivity/specificity sample set fortified at the LLOQ level may be evaluated for this additional purpose. Consistent “IS-normalized” quantification (i.e., absence of variable matrix ionization effects that could impact assay accuracy) will be demonstrated if the precision of determined concentrations across the lots is within 20% (10).

ACCURACY AND PRECISION

The accuracy and precision criteria that can be achieved in practice are dependent on the methodology employed. As noted above, the suitability and type of IS used in the method will also have a large influence on the accuracy and precision achievable for a protein LC-MS/MS method (13). In general, the accuracy and precision criteria commonly applied to LBAs (8,25) are deemed to be appropriate for most protein LC-MS/MS assays due to the potential increased variability associated with the methodologies used (Table I). Accuracy and precision acceptance criteria should be pre-defined as part of a method validation plan and/or by SOP, be scientifically justifiable, and determined using a fit-for-purpose approach to support the intended use of the assay. When assays perform with a higher degree of accuracy and precision during validation, it may be appropriate to adjust the assay acceptance criteria for study sample analysis. It may be advisable to gain some method application experience with incurred samples prior to altering criteria for subsequent studies.

Calibration Range

The calibration range and response functions for both chromatographic assays and LBAs have been extensively discussed (7,8,25). Here, we focus on special considerations for the analysis of proteins by LC-MS/MS. In contrast to LBAs, which often display a non-linear analyte concentration-response relationship, LC-MS/MS assays generally exhibit a linear response function over a wide dynamic range. Non-linear behavior may sometimes be encountered and affect the practical dynamic range with an LC-MS/MS method, particularly when affinity capture steps are employed. Care should be exercised to understand the origin of non-linearity and resolve any issues that might adversely impact assay performance. In some cases, choosing a non-linear regression model may be appropriate with justification and demonstrated calibration curve performance.

Lower Limit of Quantification

The lower limit of quantification (LLOQ) is defined as the lowest concentration of analyte in a sample that can be quantified with acceptable accuracy and precision (Table I). The LLOQ is considered to be the lowest calibration standard (see “ACCURACY AND PRECISION”). As with small molecule assays, it is recommended that the analyte surrogate peptide signal measured in the LLOQ sample be at least five times the signal of a blank sample (7,8). In any case, the LLOQ of an assay should be adapted to the expected concentrations and to the aim of the study to which the assay will be applied (8).

Validation Considerations

Validation QCs should be prepared at a minimum of four concentrations (the LLOQ, low, mid, and high levels) and analyzed to evaluate intra- and inter-assay accuracy and precision according to current guidelines. For methods that exhibit a non-linear curve characteristic, it may be advisable to include an additional QC prepared at the Upper Limit of Quantification (ULOQ) level. We recommend a minimum of three validation runs to establish accuracy and precision.
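
A minimal sketch of how intra- and inter-assay accuracy and precision might be summarized for one QC level is given below. It assumes a simple pooled calculation for the inter-assay figures (an ANOVA-based estimate is a common alternative), and the function names are hypothetical; the resulting values would be compared with the criteria pre-defined in the validation plan.

```python
from statistics import mean, stdev

def bias_and_cv(measured, nominal):
    """Return (% bias from nominal, % CV) for a set of replicate results."""
    avg = mean(measured)
    bias = 100.0 * (avg - nominal) / nominal
    cv = 100.0 * stdev(measured) / avg
    return bias, cv

def summarize_qc(runs, nominal):
    """Intra-assay statistics per run and pooled inter-assay statistics.

    runs: list of runs, each a list of replicate concentrations at one QC level.
    """
    intra = [bias_and_cv(run, nominal) for run in runs]
    pooled = [x for run in runs for x in run]  # simplification: pool all runs
    inter = bias_and_cv(pooled, nominal)
    return intra, inter
```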

Critical assay reagents should be identified in the validation plan and/or method SOP and more than one lot tested, if available, during method validation. In cases where a new lot of such reagents must be qualified, a fit-for-purpose approach may be applied such that the reagent can be accepted based upon its functionality in the assay. It may be useful in some cases to examine the specific assay parameters impacted by the reagent, for example, to verify the digestion efficiency of the assay when a new lot of the proteolytic enzyme is first being used. The demonstration of one acceptable precision and accuracy run or an acceptable analytical run may be sufficient to infer reagent suitability; however, in cases where the reagent quality is uncertain or is known to require rigorous testing, a more thorough test of accuracy and precision or a partial validation may be required. It may also be advisable to compare the performance of the new lot with the previous one in the same test batch. In any case, the level of criticality of reagents, their production procedure or procurement options, and any other considerations important to the assay should be documented and described in the analytical method. Since many assays are utilized over several years, the long-term availability and stability of critical reagents should be carefully considered (26).

Dilutional Integrity and Linearity

Dilution experiments should be conducted according to current guidelines to cover the potential need to dilute study samples with analyte concentrations above the ULOQ. For simpler assay formats, a high-level sample (dilution QC) should be prepared at an above-the-curve concentration (typically covering the highest analyte concentration expected in study samples) and analyzed with an appropriate dilution factor. It is recommended that the accuracy be within ±20% of the nominal concentration and that the precision not exceed 20%.

With more complex affinity capture LC-MS/MS assays, saturation effects can occur at high analyte levels, impacting the linearity of the assay. The dilution QC should be diluted serially with pooled matrix to prepare a set of samples with concentrations within the calibration range, and analyzed. It is recommended that the back-calculated concentration for each dilution should be within 20% of the nominal concentration after correction for dilution and the precision of the concentrations across the dilution series should not exceed 20%.
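
The dilutional linearity evaluation can be illustrated with the sketch below, which corrects each measured concentration for its dilution factor and applies the two 20% limits given above. The function name and the example numbers in the usage note are hypothetical.

```python
from statistics import mean, stdev

def dilution_linearity(measured, dilution_factors, nominal, limit_percent=20.0):
    """Check a serial dilution of the dilution QC.

    measured: concentrations read from the curve for each diluted sample.
    dilution_factors: fold-dilution applied to each sample.
    nominal: nominal concentration of the undiluted dilution QC.
    Returns (dilution-corrected concentrations, pass/fail).
    """
    corrected = [m * d for m, d in zip(measured, dilution_factors)]
    # Each corrected value must be within the limit of nominal.
    accuracy_ok = all(
        100.0 * abs(c - nominal) / nominal <= limit_percent for c in corrected
    )
    # Precision across the whole dilution series must not exceed the limit.
    cv = 100.0 * stdev(corrected) / mean(corrected)
    return corrected, accuracy_ok and cv <= limit_percent
```

For example, a 1000-unit dilution QC measured as 95.0, 10.5, and 1.02 after 10-, 100-, and 1000-fold dilution passes, whereas a reading of 60.0 at the 10-fold dilution (consistent with saturation at high analyte levels) fails the accuracy criterion.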

Parallelism (Incurred Sample Dilution)

Depending on the isolation/enrichment procedure employed and the specificity of the assay, it is sometimes desirable to determine parallelism between the calibration standard curve and serially diluted incurred samples. While this evaluation is not routinely conducted as part of method validation for either LBAs or LC-MS/MS assays, it can be a useful troubleshooting experiment to assess whether catabolites, binding proteins (e.g., ADAs, soluble targets) or other interfering compounds are affecting the validity of the assay results. As with dilutional linearity, this is likely to be more of a concern with affinity capture-based procedures. Typically, investigation of parallelism includes high concentration study samples, which are serially diluted with the blank matrix over an appropriate range and analyzed to evaluate accuracy and linearity. It is recommended that the precision of the back-calculated concentrations across the dilution series should be within 30% (8).

OVERALL AND INDIVIDUAL PROCESS RECOVERIES

The consistency of protein/peptide recovery efficiency for isolation/enrichment, enzymatic digestion, and any subsequent purification steps is critical for achieving an accurate and rugged LC-MS/MS assay (27). Furthermore, recovery reproducibility is likely of greater concern for LC-MS/MS analysis of protein biotherapeutics than for small molecule analytes, since SIL-protein ISs, which would ideally compensate for sample-to-sample and run-to-run recovery variation, are not readily available. Analog protein, extended SIL-peptide or SIL-peptide ISs may not be able to compensate for all sources of variability during affinity capture, enrichment, and/or digestion steps before LC-MS/MS analysis.

Overall (total) recovery of an LC-MS/MS assay is a combination of recoveries of all processes, including those of the protein during pre-digestion treatment (pre-digestion recovery), and of the surrogate peptides from enzymatic digestion (digestion efficiency) and from post-digestion treatment (post-digestion recovery) (2). The recovery of each processing step can be calculated from the responses of the neat protein or the surrogate peptide spiked into the sample before the process over that spiked after the process. Percent overall recovery of the analyte is obtained by comparing the responses of samples spiked with protein into sample matrix before pre-digestion treatment, against the sample spiked with the surrogate peptide into the final sample extract (i.e., after post-digestion treatment steps).
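
The recovery calculations described above reduce to simple response ratios, sketched below. Two assumptions are made that the text does not state explicitly: the protein and surrogate peptide are spiked at equimolar amounts so that their responses are directly comparable, and the overall recovery is treated as the product of the individual step recoveries. The function names are hypothetical.

```python
def percent_recovery(pre_spiked_response, post_spiked_response):
    """Recovery (%) of one step, or of the whole procedure.

    pre_spiked_response: response when analyte is spiked before the process.
    post_spiked_response: response when it is spiked after the process
    (assumed equimolar to the pre-spiked amount).
    """
    return 100.0 * pre_spiked_response / post_spiked_response

def overall_from_steps(step_recoveries_percent):
    """Overall recovery (%) assuming step recoveries combine multiplicatively."""
    overall = 1.0
    for r in step_recoveries_percent:
        overall *= r / 100.0
    return 100.0 * overall
```

Under the multiplicative assumption, pre-digestion recovery of 80%, digestion efficiency of 75%, and post-digestion recovery of 90% would correspond to an overall recovery of about 54%.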

If affinity capture is used for sample cleanup and protein or peptide enrichment in pre- or post-digestion steps, several parameters need to be optimized, including capture reagent/analyte concentration ratio, capture reagent capacity, and analyte NSB tendency. Understanding the affinity capture reagent’s specificity and affinities will ensure consistent capture efficiency. As noted elsewhere, evaluation of the potential impact of endogenous materials such as ADAs and soluble targets on affinity capture methods should be considered and evaluated when there is a concern and the appropriate materials are available.

Validation Considerations

Rather than evaluating recovery at each individual method step, which is of interest during method development, the minimal validation requirement is an assessment of overall recovery. The overall recoveries are determined at three analyte concentrations (low, mid, and high) within the range of the standard curve for analyte and at the working concentration for the IS and the results compared. Although achieving high absolute recovery is preferable, demonstrating reproducible recovery across the concentration range is most important. Recovery acceptance criteria are not specified by regulatory agencies; it is up to the individual laboratory to define acceptable recovery in their own SOPs or for a given assay situation. Recovery of individual process steps may sometimes need to be tested for troubleshooting when inconsistent accuracy and precision results are observed or when better sensitivity is needed.

STABILITY

Therapeutic proteins are subject to various factors that can impact their in vitro or ex vivo stability as neat material (e.g., lyophilized form), in non-matrix solutions, and in biological matrices. Proteins are prone to degradation or modifications upon chemical and environmental stress. As most LC-MS/MS protein bioanalytical methods involve quantification of selected peptide surrogates, changes to the protein may not be detected if they do not affect the surrogate and monitoring peptides or the analytical processes employed (e.g., affinity capture efficiency). The protein analyte of interest is considered “stable” under the test conditions being evaluated, as long as the measured responses of the surrogate peptide measured in stability samples are within acceptance criteria. Monitoring peptides can sometimes be used to detect stability-related changes in other parts of the protein.

In addition to actual molecular changes, apparent (measured) stability may be affected by handling issues, such as incomplete solubilization, NSB to surfaces, and aggregation, resulting in concentration bias falsely appearing as analyte instability.

Stock and Working Solution Stability

To properly assess stabilities of a protein analyte, or ISs and surrogate peptides (when needed), it is important to understand their inherent solubility properties and NSB tendency under the expected conditions of use. The choice of solvent(s), preparation techniques, and container types used to prepare and store stocks and working solutions should be evaluated and optimized. Reference materials are often provided as solutions, with recommended storage conditions and expiry information provided by the source. For lyophilized reference materials, a stock solution may be prepared by weighing a portion and/or directly dissolving the pre-weighed material in an accurate volume of an appropriate solvent. In such cases, vigorous mixing should be avoided to prevent protein aggregation; standing for several hours or overnight may be required to ensure complete dissolution. Exposure of concentrated protein solutions to multiple freeze-and-thaw cycles should generally be avoided to prevent degradation and/or aggregation. It is often beneficial to subaliquot and store stock solutions in small single-use vials.

Validation Considerations

Solution stability evaluations should follow applicable regulatory guidelines (7,8). For protein reference materials supplied in solution form, the appropriate storage conditions and expiry of these (as received) stock solutions are usually provided by the supplier. When stock solutions are prepared in the analytical laboratory from powder or in a different solvent system or when storage conditions change or duration needs to be extended beyond the expiry date, stability should be demonstrated. The preferred method is to compare a stored stock or diluted working solution to the corresponding freshly prepared solution, which is derived from a fresh weighing or an unopened vial of material (lyophilized or solution) within stability, as provided. Some prefer to avoid the use of non-matrix working solutions by spiking stock solutions directly into matrix and diluting appropriately to prepare standards and QCs. If working solutions are to be used, there is no need to evaluate stability for all concentration levels; testing at the lowest and highest concentrations is sufficient to bracket a range.

To evaluate the relative content of non-matrix protein and peptide solutions, direct analysis by HPLC-UV can be a convenient approach; however, some may prefer that stability evaluations be made in the context of the actual bioanalytical assay to be employed (e.g., an indirect LC-MS/MS assay designed to measure a protein via a surrogate peptide). Using LC-MS/MS, whether following direct dilution of a protein analyte solution or after freshly spiking the solution into blank matrix, sample processing and/or digestion will be necessary to compare the signature peptide responses of two comparator solutions. Therefore, it is recommended that the relative difference between the mean responses (n ≥ 5) determined for the stored solutions and freshly prepared controls be within 10% to confirm apparent stability.
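
The stored-versus-fresh comparison can be sketched as follows; the function name is hypothetical, and the 10% limit and minimum of five replicates are those recommended above.

```python
from statistics import mean

def solution_stability(stored, fresh, limit_percent=10.0, min_replicates=5):
    """Compare mean responses of stored and freshly prepared solutions.

    Returns (relative difference in %, pass/fail). A negative difference
    means the stored solution gave a lower mean response than the fresh one.
    """
    if len(stored) < min_replicates or len(fresh) < min_replicates:
        raise ValueError("at least five replicates per solution are expected")
    diff = 100.0 * (mean(stored) - mean(fresh)) / mean(fresh)
    return diff, abs(diff) <= limit_percent
```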

SIL-protein/peptide IS materials are expensive to produce and often available in quantities too small for stability evaluation. The stability of SIL-IS solutions need not be assessed as long as there are no significant interferences with the analyte response and no trends of instability. The apparent stability of IS solutions can be monitored by evaluating the IS responses for any trends over the course of their use.

Matrix Stability

Stability of the protein analyte in biological matrix should be evaluated and optimized with respect to expected sample collection, initial handling, storage, and processing conditions during method development. In some cases, the age of the biological matrix used for in vitro stability experiments might positively or negatively influence stability results due to changes in endogenous enzyme activity and pH with age, storage, and handling conditions. Some labile protein analytes may require stabilization at the time of sample collection or early in sample processing, such as keeping samples on ice and adding protease inhibitors or denaturants. Additive treatments must be evaluated to ensure that stability is achieved, while avoiding any impact on the extraction, digestion, or analysis steps of the method.

As is sometimes encountered with LBAs, the observed analyte recovery from freshly spiked matrix samples may differ from that of frozen ones, potentially due to slow dissolution and/or binding equilibration with matrix components. This discrepancy may artificially bias the stability experiment results when using freshly prepared comparators. If during method development it is demonstrated that initial freezing is needed to provide consistent recovery (independent of duration and cycles), then it may be appropriate to substitute the fresh comparators with ones that have been frozen for a short period (e.g., overnight) and thawed only once.

Validation Considerations

Matrix stability evaluation should be in line with the accuracy and precision criteria applicable to the validation.

Processed Sample (Autosampler Tray) Extract Stability

Similar to stock and working solution stability, the properties of the surrogate and IS peptides in the reconstitution solvent or final extract should be tested with regard to their solubility and NSB in the intended storage container as potential confounding factors.

Validation Considerations

Stability of processed sample extracts should be evaluated according to requirements applied for small molecule drugs (7,8). The generally preferred method is to reinject a set of aged QC sample extracts along with a freshly prepared calibration curve. While this approach is often successful, some methods may experience batch-to-batch variation in absolute analyte recovery and analyte/IS response ratios due to the analytical procedure(s) employed, particularly when the IS does not provide control for every step of the method (e.g., affinity capture or digestion using a SIL-peptide IS). In these cases, the aged QCs may not quantify accurately against a calibration curve prepared in a different batch. An alternative evaluation may sometimes be applied to eliminate inter-batch bias and demonstrate acceptable processed sample stability with appropriate justification. In this approach, the values for the aged QC samples are calculated by comparing their re-injection data (analyte/IS ratios) to the regression of the originally injected calibration curve with which they were prepared and initially analyzed.
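
The alternative evaluation can be illustrated with a short sketch in which the re-injected (aged) QC ratios are back-calculated against the regression of the original calibration curve. The example assumes an unweighted linear model, ratio = slope × concentration + intercept, purely for simplicity; the actual regression model and weighting are those defined for the method, and the function names are hypothetical.

```python
def back_calculate(ratio, slope, intercept):
    """Concentration from an analyte/IS ratio, assuming a linear curve."""
    return (ratio - intercept) / slope

def aged_qc_bias(aged_ratios, nominal, slope, intercept):
    """Mean % bias of aged QC re-injections against the ORIGINAL curve.

    slope and intercept come from the calibration curve that was prepared
    and first injected in the same batch as the QCs.
    """
    concs = [back_calculate(r, slope, intercept) for r in aged_ratios]
    mean_conc = sum(concs) / len(concs)
    return 100.0 * (mean_conc - nominal) / nominal
```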

As a practical matter, to justify restarting a run following an instrument malfunction, it is also useful to demonstrate batch “re-injection reproducibility” by injecting a run containing calibration standards and QC samples when first prepared, storing the batch under appropriate conditions (e.g., in the autosampler), and re-injecting and quantitating the aged QCs against the aged calibration standards in the run. Many find it convenient to simultaneously analyze a set of freshly prepared calibration standards along with the reinjected batch and to quantify the aged QCs from both the aged and fresh calibration curves to obtain both types of data.

Critical Assay Reagents Stability

As previously noted, some types of reagents (e.g., affinity capture media and protease enzymes) may be considered critical to assay performance. Similar to the recommended handling of protein stocks, it is often beneficial to limit environmental stability stress by subaliquotting and storing critical protein reagent solutions in small single-use vials. In general, the apparent stability of different preparations of reagents can be inferred from consistent and acceptable assay performance in validation or analytical runs. It is advisable to monitor for trending that may indicate loss of reagent effectiveness over the duration of its use. Since many assays are utilized over a long period, stability of critical reagents should be carefully managed (26).

OTHER EXPERIMENTS

Carryover

As with small molecule assays, carryover <20% of the LLOQ response is generally preferred. Because peptides tend to be adsorptive, higher carryover may be encountered. In such cases, appropriate mitigation strategies should be implemented and potential impact on study sample results evaluated.
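
As a simple illustration, carryover can be expressed as a percentage of the LLOQ response. The sketch assumes the conventional design in which a blank is injected immediately after a high-concentration sample; that design and the function names are assumptions, not requirements stated above.

```python
def carryover_percent(blank_after_high_response, lloq_response):
    """Analyte response in the blank following a high sample, as % of LLOQ."""
    return 100.0 * blank_after_high_response / lloq_response

def carryover_acceptable(blank_after_high_response, lloq_response,
                         limit_percent=20.0):
    """True if carryover is below the preferred limit (< 20% of LLOQ)."""
    return carryover_percent(blank_after_high_response, lloq_response) < limit_percent
```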

CONSIDERATIONS FOR SAMPLE ANALYSIS

Analytical Runs

Those parameters and criteria established for an LC-MS/MS method for a protein biotherapeutic during the method validation will generally apply during sample analysis. Considerations for sample analysis runs are summarized in Table III.

Table III In-Run Considerations for Sample Analysis of Protein Drugs by LC-MS/MS

CONCLUSIONS

LC-MS/MS has become an important new technique for the quantification of proteins in biological matrices, now routinely applied in regulated bioanalysis as a complement or alternative to LBAs. Our experience thus far is that the development and validation of protein LC-MS/MS assays require different approaches than those used for small molecule drugs. The assay development, pre-validation and validation plans should be tailored to the particular assay being established, taking into account the intended use of the assay, the characteristics of the protein, similarity to endogenous proteins, and the test species or study population. Because biotherapeutics are more complex molecules than small molecules, it is important to understand what is being measured in the assay and to select an assay format that is appropriate for the desired application. In many cases, we recommend a hybrid approach to validation that includes experiments traditionally associated with both LC-MS/MS assays and LBAs (Table I). Of particular importance is establishing the selectivity/specificity of the assay. This may require more extensive evaluation than is typical in a small molecule assay and may be expanded to include testing a larger number and variety of matrix lots, ADAs, and other plasma/serum factors in addition to the traditional LC-MS/MS selectivity and MF determinations. Regarding accuracy and precision, in general, we recommend applying the LBA criteria as a starting point because of the inherent nature of these assays and the limited industry experience with the performance of the methods in regulated applications. This position has also been supported in a recent editorial from the European Bioanalysis Forum (28). Tighter criteria may be established, if desirable, for relatively simple assays with an appropriate SIL-IS and traditional sample preparation, as these assays tend to perform similarly to small molecule LC-MS/MS methods, or for sample analysis if desired and supported by the method validation data.

During the writing of this paper, a new draft US FDA Guidance for Industry on Bioanalytical Method Validation was issued, which currently does not address LC-MS/MS technology applied to protein or large molecule assays (11). The proposed new guidance adheres to the conventional technology delineation: LC-MS/MS for chromatographic analysis of small molecule drugs and LBA for immunoassay of biologics. In its present form, the draft guidance does not fully consider the increasingly diverse range of large molecules (e.g., ADCs) entering drug development, which may require new and more complex bioanalytical assays (29,30). We recognize from discussions at the AAPS Crystal City V Workshop (31) that regulators feel that it is too early to provide specific guidelines for LC-MS/MS assays of proteins and that until more protein drug applications supported by LC-MS/MS assay data have been filed and reviewed, the FDA and other regulatory agencies will not be ready to define specific requirements for these assays. This white paper intends to provide preliminary recommendations on the development and validation of protein LC-MS/MS assays and may influence future bioanalytical regulations in this area. The opinions are based on the consensus and cumulative experience of the authors who are actively developing and applying LC-MS/MS assays for protein quantification. As analytical techniques and protein drugs are rapidly evolving, our recommendations represent the current best practices and are not intended to be used as the definitive document on the subject.