Introduction

Triple quadrupole tandem mass spectrometry (MS/MS), particularly when coupled with gas or ultrahigh-performance liquid chromatography (GC or UHPLC), is one of the most powerful instrumental analysis tools in analytical chemistry [1,2,3,4,5,6,7,8]. The great strength of MS/MS is that highly selective analyses of hundreds of analytes at very low limits of quantification (LOQ) can be achieved even in complex matrices using optimized detection conditions with very short dwell times (e.g., 1–10 ms) for each analyte. However, the inherently associated drawback of MS/MS is that only targeted analytes are monitored, which typically places greater limits on the number of compounds detected per given period of time than full mass range acquisition techniques. Therefore, any desired analyte that is present but not targeted in the sample almost surely becomes a false negative in MS/MS. However, there is no guarantee that nontargeted analytes would be identified using full mass range high-resolution (HR)MS either, for several reasons as documented and discussed previously [2,3,4,5,6,7,8,9,10,11,12,13,14].

More analytes per given time unit can be monitored by using shorter dwell times in MS/MS, but physical and electronic limits are reached at some point, and accuracy and LOQs are adversely impacted when dwell times become too short (or data acquisition cycle times are too long to properly define peaks in chromatography). Acceptable sensitivity and precision for analysis of most analytes using multiple reaction monitoring (MRM) with modern MS/MS instruments typically require minimum dwell times of 1 ms per ion transition plus 1 ms inter-dwell delay. Positive/negative switching in electrospray ionization (ESI) also takes an additional ≈20 ms for settings to re-equilibrate between ESI(+/–) MRM cycles.

Qualitative identification is as important as quantification in MS/MS [3,4,5,6,7,8]. Quantidentification is a proposed portmanteau that signifies the merging of quantification and identification of analytes. Under this concept, acceptability criteria for both identification and quantification have to be met to suit fit-for-purpose analytical needs [15]. Quantification in method validation generally requires 70–120% recoveries with ≤25% RSD for diverse replicate samples spiked at multiple levels, and many different rules and guidelines set analyte identification criteria for MS/MS detection in chromatographic analyses [3, 4, 15,16,17,18,19].

Although the acceptable values for identification may be different, all of the criteria entail that retention time (tR) and relative ion abundance ratio(s) between at least 2 MRM transitions must fall within a certain range from the reference tR and ion ratio(s) for the analyte. Furthermore, a signal/noise (S/N) or concentration threshold must be exceeded, which could be the LOQ, limit of identification (LOI), maximum residue limit (MRL), or reporting level. The reference values must be determined contemporaneously from a reference standard (std) analyzed using the same method, ideally in the same analytical sequence as the samples.

Common sense dictates that acquisition of 3 MRM transitions, which yields 3 ion ratios, should lead to greater chances for accurate identification of analytes [20, 21] than the single ion ratio obtained when only 2 ion transitions are acquired per analyte. The author learned this lesson when attempting and failing to validate a quantidentification method for aminoglycosides initially using only 2 MRMs/analyte [22]. The variability of each ion ratio was too high to consistently meet the required ±10% absolute (not relative) difference from the reference value, particularly at lower concentrations. When the validation was repeated using 3 MRMs per drug analyte, the method easily met the FDA/USDA identification criteria [16] of ±20% (absolute) for at least 2 ion ratios. Even when acquiring 3 MRMs/analyte in this analysis, the 2002/657/EC identification criteria [17] could not be met as often. The greater complexity and strictness of the 2002/657/EC criteria is a known issue [3, 8, 19, 23], and after a re-assessment of previous criteria [19], the SANTE/12682/2019 guidelines recommend a ±30% (relative) acceptable ion ratio window for analyte identification purposes [18].

Still, the acquisition of more MRMs for hundreds of analytes takes time, which limits the number (no.) of analytes that can be monitored. As shown in a previous application [24], the number of analytes that can be monitored per unit time can be plotted as a function of chromatographic peak width, the number of ion transitions, and dwell times. UHPLC typically provides peak widths of ≈4 s, and in chromatography, 5–10 data acquisition points per analyte peak are needed for accurate results. Figure 1 presents the calculated number of analytes using those parameters depending on whether 2 or 3 MRM ion transitions are collected per analyte. A 9 min chromatogram consists of 54 segments of 10 s each, theoretically allowing up to 3618 equally spaced analytes of 3 MRMs each with 10 points/peak. In this respect, the bigger problem in practice with modern MS/MS using MRMs pertains to the great effort to optimize conditions and manage results for hundreds of targeted analytes in a method, and not the limited no. of analytes that can be inherently monitored.
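The capacity estimate above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, assuming 2 ms per MRM (1 ms dwell plus 1 ms inter-dwell delay, as stated earlier) and ignoring polarity switching; the function name is hypothetical. For 3 MRMs/analyte and 10 points/peak it gives about 67 analytes per acquisition window, which, multiplied by 54 segments, is consistent with the 3618 analytes quoted.

```python
def max_concurrent_analytes(peak_width_s, points_per_peak,
                            mrms_per_analyte, ms_per_mrm=2.0):
    """Analytes that can share one acquisition cycle (idealized estimate)."""
    # Cycle time needed to place the requested number of points on a peak.
    cycle_time_ms = 1000.0 * peak_width_s / points_per_peak
    # Transitions that fit into one cycle at the given time per MRM.
    transitions_per_cycle = cycle_time_ms / ms_per_mrm
    return transitions_per_cycle / mrms_per_analyte


if __name__ == "__main__":
    for points in (5, 10):
        for mrms in (2, 3):
            n = max_concurrent_analytes(4.0, points, mrms)
            print(f"{points} points/peak, {mrms} MRMs/analyte: {n:.1f} analytes")
```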

Fig. 1 Number of possible simultaneously monitored analytes in MS/MS using multiple reaction monitoring for 2 or 3 ion transitions (MRMs) per analyte of 2 ms cycle time per MRM in chromatography with 4 s peak widths, depending on the number of data collection points per peak

Several years ago when MS/MS instruments could only achieve minimum dwell times of 10 ms and similar inter-dwell delay times, analysts who targeted hundreds of analytes in methods were often forced to monitor only 2 MRMs/analyte, even when LC peaks were 20 s wide. However, modern instruments are fast, precise, and sensitive enough to acquire 3 MRMs/analyte even with UHPLC that provides ≈5-fold narrower peaks.

Yet, most analysts still only monitor 2 MRMs/analyte, often because they wish to avoid re-optimization for so many analytes to add the 3rd ion transition. Also, most MS/MS instrument software programs still do not resolve the added complication in data management to deal with 3 ion ratios rather than just 1 for identifications. For decades, the author has imported raw data into Excel for this task because instrument software from several vendors has lacked functional multi-ion identification features. An Excel template for automatic quantidentification of analytes can be generated using if-then algorithms; thus, if results using 3 ion ratios are demonstrated to better meet application needs than use of 1 ratio per analyte, then analysts should not use instrument software limitations as a reason to collect only 2 MRMs/analyte. Furthermore, manufacturers should better incorporate common identification criteria into their software.

In this study, liquid and powdered egg products were chosen as matrices for method validation at the request of the USDA Food Safety and Inspection Service (FSIS). Whereas the Food and Drug Administration (FDA) has responsibility for food safety of shelled eggs in the USA, FSIS has jurisdiction over the processing of cracked eggs, which are commonly added as ingredients to other processed foods. This evaluation tested the USDA extract-and-inject (E&I) method [25,26,27], which also covers the LC-amenable analytes in the expanded QuEChERSER mega-method [28, 29]. In addition, the validation study included the separate extraction and cleanup steps for aminoglycosides to be combined with the E&I egg extracts for analysis using an ion-pairing reagent also added to final extracts [30, 31]. The scientific literature describes many methods using LC-MS/MS for analysis of antibiotics and other veterinary drugs in eggs [32,33,34,35,36,37,38,39,40,41], but none are as streamlined as the USDA methods to achieve such high throughput [27, 31].

One goal of this study was to validate a multiclass, multiresidue analytical method using UHPLC-MS/MS for 169 veterinary drugs, including aminoglycoside antibiotics, at levels of regulatory concern in liquid and powdered eggs. Method performance was to be assessed with regard to both quantitative determination and qualitative identification (quantidentification) of the targeted analytes. Another major goal was to compare the rates of false positives and negatives when using either 2 or 3 MRM ion transitions per analyte and regulatory identification criteria from the FDA/USDA [16], SANTE/12682/2019 [18], and 2002/657/EC [17].

Materials and methods

Reagents and solutions

Reference stds of high-purity drug analytes were obtained from LGC Standards (Manchester, NH; USA), US Pharmacopeia (Rockville, MD; USA), Sigma-Aldrich (Saint Louis, MO; USA), Honeywell (Muskegon, MI; USA), Dr. Ehrenstorfer (Augsburg; Germany), Toronto Research Chemicals (Toronto, ON; Canada), and C/D/N Isotopes (Pointe-Claire, QC; Canada). Stock solutions of ≈2000 ng/μL were prepared usually in acetonitrile (MeCN), and analyte mixtures were similarly prepared at the appropriate concentrations to yield the spiking levels for each analyte listed in Table 1.

Table 1 %Recoveries (and %RSDs) for the identified veterinary drug analytes spiked (n = 10 each level, 30 overall) in liquid and powdered eggs (reconstituted concentrations for powdered samples). All results except ISTDs normalized to 13C6-sulfamethazine (or tobramycin for aminoglycosides). Bold text indicates recovery <70% or >120%, or RSD >25%. * = isomers from the same analyte

MeCN, methanol (MeOH), and hydrochloric acid (HCl) were purchased from Fisher Scientific (Pittsburgh, PA, USA). Deionized water (18.2 MΩ-cm) was collected from an E-Pure Model D4641 purifier from Barnstead/Thermolyne (Dubuque, IA, USA). Trichloroacetic acid (TCA), formic acid (HCO2H), disodium ethylenediaminetetraacetic acid (EDTA) dihydrate, sodium chloride (NaCl), and sodium 1-heptanesulfonate monohydrate originated from Sigma. Ammonium acetate (NH4OAc) and sodium hydroxide (NaOH) were purchased from Mallinckrodt (Paris, KY; USA). Weak cation exchange (WCX) sorbents contained in 5 mL disposable (or dispersive) pipet extraction (DPX) tips came from DPX Technologies (Columbia, SC; USA).

The aminoglycosides extraction solution was 10 mM NH4OAc, 0.4 mM EDTA, 0.5% NaCl, and 2% TCA in water, and for multiple classes of veterinary drugs, the 4/1 (v/v) MeCN/water extraction solvent was prepared weekly in a dispenser bottle.

Samples

Dozens of powdered and liquid egg products were provided by FSIS labs collected from around the USA. Many of the received samples were unlabeled, and those with labels included “dried whole egg,” “pasteurized dried whole egg F.F.”, “pasteurized dried whole egg with sodium silica-alumina,” “pasteurized free flow whole egg,” “pasteurized dried egg,” “pasteurized dried whole egg,” “100% liquid egg whites,” “egg whites,” “dried yellow egg product,” “dried egg whites H-40,” “pasteurized free flow yolk,” “spray dried egg whites,” “SLS whites 30,” “pasteurized dry white,” “pasteurized dried egg whites type G-3,” “past salt yolk (NEF),” “FEW,” and “sugar yolks.” Whole and liquid eggs were also purchased from local grocery stores. All samples were stored at −20 °C until they were brought to room temperature for extraction.

Samples were divided into powdered and liquid eggs for separate validation, which was conducted in accordance with FSIS protocols. Each batch of samples consisted of 10 matrix blanks and 10 spikes each at the 0.5X, 1X, and 2X levels with X provided in Table 1 (reconstituted concentration for powdered eggs). For each batch, 9 samples were selected from those provided by FSIS for the blanks and spikes, and the 10th replicate was an equal portion mixture of those samples, which were also used to prepare 6 matrix-matched (MM) calibration stds at the 0X, 0.25X, 0.5X, 1X, 2X, and 3X levels from their extracts. Reagent-only (RO) calibration stds were also prepared at the same levels using 7/3 (v/v) MeCN/water solutions to mimic the egg extracts.

Sample preparation protocols

The multiclass E&I veterinary drug sample preparation method entailed the following steps [25,26,27,28,29,30,31]: (1) weigh 2 g liquid egg or 0.5 g powdered egg product into a 50 mL polypropylene (PP) centrifuge tube, and add 1.5 mL water to reconstitute the powdered egg (1.7 mL water was used for reagent blanks); (2) spike the test portions with veterinary drug mixtures and/or internal stds (ISTDs) (do not add ISTDs to samples to be used for MM calibration stds), vortex briefly, then let stand for 10 min; (3) dispense 10 mL of 4/1 (v/v) MeCN/water into each tube; (4) shake for 10 min on a platform vortexing shaker (Glas-Col; Terre Haute, IN; USA) at maximum pulsation and 80% vortex setting; (5) centrifuge for 5 min at room temperature and 3711 rcf (maximum setting of the centrifuge); (6) transfer 407 μL (71 mg equivalent sample) of the extract to a PP autosampler vial (pipet the extract only from beneath any foam or other floating materials that may be present); and (7) for the spiked and blank test portions, add 140 μL water + 253 μL of 146.5 mM aqueous 1-heptanesulfonic acid solution (ion-pairing reagent). For the preparation of the MM and RO calibration stds, veterinary drug mixtures (including the ISTDs) in water were substituted for the 140 μL water additions.

The aminoglycosides sample preparation method entailed the following steps [22, 30, 31]: (1–2) repeat steps 1 and 2 above using aminoglycoside spiking solutions; (3) dispense 20 mL of the aqueous solution of 10 mM NH4OAc, 0.4 mM EDTA, 0.5% NaCl, and 2% TCA into each tube; (4–5) same as above except the centrifuge is kept at 2 °C; (6) transfer 10.75 mL (1 g equivalent sample) of the extract into a 50 mL PP tube (avoid foam and floating materials); (7) using a pH meter and stir bar, adjust pH to 6.5 ± 0.1 by adding a few drops of 30% NaOH followed by 1 M NaOH and/or HCl solutions; (8) centrifuge for 3 min at 3711 rcf and room temperature; (9) decant the supernatants into 15 mL PP tubes, and using the DPX apparatus, condition the DPX tips with 3 mL MeOH followed by 3 mL water; (10) in 4 aliquots of ≈2.75 mL each, pull up the extracts into the DPX tips to mix thoroughly with the WCX sorbent, transferring the extracts to waste; (11) rinse the DPX sorbent tips with 3 mL water and then remove excess water by pumping with air; (12) elute the sorbents by pumping the DPX tips 3 times with 1 mL of 10% formic acid solutions into 15 mL PP tubes; (13) pipet 71 μL extract (71 mg equivalent sample) to the PP autosampler vials containing 407 μL of egg extract from the multiclass method + 253 μL of 146.5 mM ion-pairing reagent solution + 69 μL water for samples or aqueous calibration preparation solutions for stds.
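The "equivalent sample" amounts quoted in both protocols follow from simple dilution arithmetic. The sketch below is illustrative only and rests on two assumptions that the text does not state explicitly: that each 2 g test portion contributes about 1.5 mL of water to the extract volume (as in the reconstituted powdered egg), and that the DPX eluate has a final volume of 1 mL. The function name is hypothetical.

```python
def equivalent_sample_mg(sample_g, total_liquid_ml, aliquot_ml):
    """Sample mass (mg) represented by an aliquot of a homogeneous extract."""
    return 1000.0 * sample_g / total_liquid_ml * aliquot_ml


# Multiclass E&I extract: 2 g sample, 10 mL solvent + ~1.5 mL water (assumed),
# 407 uL transferred to the autosampler vial.
multiclass_mg = equivalent_sample_mg(2.0, 10.0 + 1.5, 0.407)

# Aminoglycoside extract: 2 g sample, 20 mL solution + ~1.5 mL water (assumed),
# 10.75 mL loaded onto the WCX sorbent.
loaded_g = equivalent_sample_mg(2.0, 20.0 + 1.5, 10.75) / 1000.0

# Eluate assumed to be 1 mL in total; 71 uL goes into the combined vial.
aminoglycoside_mg = equivalent_sample_mg(loaded_g, 1.0, 0.071)

if __name__ == "__main__":
    print(f"multiclass aliquot: {multiclass_mg:.1f} mg")
    print(f"loaded on sorbent:  {loaded_g:.2f} g")
    print(f"aminoglycoside aliquot: {aminoglycoside_mg:.1f} mg")
```

Under these assumptions both aliquots represent about 71 mg of sample, so the two extracts are matched in the combined 800 μL final vial.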

UHPLC-MS/MS

Analysis of the final extracts was conducted using a Sciex (Foster City, CA; USA) 6500 QTrap mass spectrometer paired with a Shimadzu (Columbia, MD; USA) Nexera X2 UHPLC. Injection volumes were 4 μL (36 μg equivalent sample). UHPLC was performed using connected 5 mm Waters (Milford, MA; USA) Acquity Vanguard and 100 mm analytical columns of 2.1 mm i.d., both with HSS T3 stationary phase of 1.8 μm particles. Flow rate and column temperature were 0.5 mL/min and 40 °C, respectively. The mobile phases contained 0.1% HCO2H in (A) water and (B) 1/1 (v/v) MeOH/MeCN. The gradient was held at 5% B for 0.5 min, ramped to 100% B at 8 min, and held there until 10.5 min. Initial conditions were reset within 0.5 min followed by a 4 min re-equilibration. The autosampler tray was kept at 10 °C.

ESI with (+/−) ion switching and scheduled (s)MRM was used with 0.8 s cycle times, 0.4 s target scan time, and 45 s tR window. The solid lines in Fig. 2 show the no. of MRM ion transitions and dwell times during the chromatogram for the 169 drug analytes plus 7 ISTDs using the conditions listed in Table S1 (see Supplementary Information, ESM). Gas settings 1 and 2 were 50 and 25, respectively. The temperature of the ion source was 350 °C and ion spray voltage was 5000 V or −4500 V in ESI positive and negative modes, respectively.

Fig. 2 Plot of dwell time and number (no.) of MRM ion transitions along the chromatogram for 2 or 3 ions per drug analyte in the analysis of the 176 drugs and ISTDs in the UHPLC-MS/MS method using scheduled (s)MRM at conditions listed in Materials and Methods and ESM Table S1 (3 MRMs/drug were acquired in the experiments)

Sciex MultiQuant 3.0 software was used for automatic peak integrations in the summation integration mode [42], and the raw data were imported into an Excel spreadsheet template for quantidentification and reporting. The final universal integrator settings were 25% baseline threshold with smoothing factor of 1.0. The %difference in the slopes of the MM and RO calibration lines (relative to the ISTD) was used to calculate matrix effects (MEs), with %ME = (MM slope/RO slope − 1) × 100%. Except for aminoglycosides (which used tobramycin), 13C6-sulfamethazine was used as the ISTD for all other quantifications, which also required identification criteria to be met. LOQs were calculated from the average normalized peak areas (background) of the 11 blank matrix samples (including 0X MM calibration std) plus 10 times the std deviation (SD) from the same integrated peak areas, in which LOQ = (background + 10×SD)/(MM slope) [42].
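The two calculations described above can be expressed compactly. This is a minimal sketch and not the author's spreadsheet: it reads the ME expression as the percent difference between the MM and RO slopes, and it assumes the sample (n − 1) standard deviation for the blanks, which the text does not specify. Function names are hypothetical.

```python
from statistics import mean, stdev


def matrix_effect_pct(mm_slope, ro_slope):
    """Percent difference of the matrix-matched slope from the reagent-only slope."""
    return (mm_slope / ro_slope - 1.0) * 100.0


def loq(blank_normalized_areas, mm_slope):
    """LOQ = (background + 10 x SD) / MM slope, from blank matrix samples."""
    background = mean(blank_normalized_areas)
    sd = stdev(blank_normalized_areas)  # assumption: sample SD (n - 1)
    return (background + 10.0 * sd) / mm_slope


if __name__ == "__main__":
    # Made-up numbers for illustration only.
    print(matrix_effect_pct(mm_slope=0.8, ro_slope=1.0))  # negative = suppression
    print(loq([0.010, 0.012, 0.009, 0.011], mm_slope=0.05))
```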

Per FSIS regulatory requirements, FDA guidance document #118 [16] dictated the primary identification criteria in the study. The acceptable tR window was ±0.1 min (6 s) from the reference tR for each analyte, which also required all ions to fall within ±0.1 min from each other when used to generate an ion ratio. The acceptable ion ratio difference from the analyte’s reference ratio was ±10% absolute for any one ion ratio or ±20% for any two ion ratios (all ion ratios had to be >0%). Results from the 4 most concentrated RO calibration stds (0.5–3X) were averaged to define the contemporaneous reference tR and ion ratios for each analyte in each batch of samples. The reference ion ratios were always <110%, which sometimes required rearrangement in the order of ions #1, #2, and/or #3 for an analyte to ensure that the numerator had smaller average relative abundance than the denominator. Lastly, the determined concentration of the analyte had to exceed the 0.1X level (see Table 1) for it to be identified. The same experimental data were used to also calculate rates of false positives and negatives using the SANTE/12682/2019 [18] and 2002/657/EC [17] identification criteria with relative ion ratio limits as described in “Results and discussion” (the tR and concentration thresholds remained the same as above).
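The if-then logic of the primary identification criteria can be sketched as follows. This is an illustrative reading of the rules as described in this section, not the author's Excel template; the function and parameter names are hypothetical, and the check that individual ions elute within ±0.1 min of each other is omitted for brevity.

```python
def fda_usda_identified(rt_min, ref_rt_min, ratios_pct, ref_ratios_pct,
                        conc, x_level):
    """Return True if tR, concentration, and ion ratio criteria are all met."""
    # Retention time must fall within +/-0.1 min of the reference tR.
    if abs(rt_min - ref_rt_min) > 0.1:
        return False
    # Determined concentration must exceed the 0.1X threshold.
    if conc <= 0.1 * x_level:
        return False
    # Absolute differences (in % units) for ion ratios that are > 0%.
    diffs = [abs(r - ref) for r, ref in zip(ratios_pct, ref_ratios_pct) if r > 0]
    within_10 = sum(d <= 10.0 for d in diffs)
    within_20 = sum(d <= 20.0 for d in diffs)
    # One ratio within +/-10% absolute, or two ratios within +/-20% absolute.
    return within_10 >= 1 or within_20 >= 2


if __name__ == "__main__":
    # Made-up example: both ratios are off by more than 10 but less than 20.
    print(fda_usda_identified(5.02, 5.00, [65.0, 17.0], [50.0, 35.0],
                              conc=12.0, x_level=100.0))
```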

Results and discussion

Effect of MRM dwell times

As shown in Figs. 1 and 2, the no. of analytes, ion transitions per analyte, their tR, and acquisition time windows in UHPLC-MS/MS using sMRM affect the resulting dwell times and/or points per chromatographic peak in the analysis with constant inter-dwell time of 1 ms. ESI(+/−) switching of 20 ms also affects resulting dwell times, which is taken into account in Fig. 2 and ESM Fig. S1 that also compare the acquisition of 2 or 3 MRMs/analyte. UHPLC peak widths were ≥ 4 s for all analytes, and since at least 5 points are needed to define and integrate peaks with acceptable accuracy [8, 24], this led to the data acquisition cycle time setting of 0.8 s in the method. A conservatively wide acquisition time window of 45 s was used in this study, as shown in Fig. 2, but in retrospect, a narrower window setting of 24 s could have been used as before [31] (see ESM Fig. S1). If the tR and peak widths are consistent from day-to-day, then the narrower acquisition time windows can be used to nearly double MRM dwell times with the same conditions otherwise.
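The interplay between cycle time, points per peak, and dwell time can be illustrated with a rough estimate. The sketch below assumes that a scheduled MRM method divides the target scan time evenly among the transitions that overlap at a given moment and subtracts the inter-dwell delay from each share; this is a simplified model for illustration, not the vendor's documented algorithm, and it does not include the 20 ms polarity switching. Function names are hypothetical.

```python
def points_per_peak(peak_width_s, cycle_time_s):
    """Number of data points acquired across a chromatographic peak."""
    return peak_width_s / cycle_time_s


def estimated_dwell_ms(target_scan_time_ms, n_concurrent_mrms, inter_dwell_ms=1.0):
    """Dwell time per transition when the scan time is shared evenly (assumed model)."""
    return target_scan_time_ms / n_concurrent_mrms - inter_dwell_ms


if __name__ == "__main__":
    print(points_per_peak(4.0, 0.8))            # 4 s peaks at 0.8 s cycle time
    for n in (50, 100):
        print(n, estimated_dwell_ms(400.0, n))  # 0.4 s target scan time
```

In this model, narrowing the acquisition window reduces the number of overlapping transitions, which is how it lengthens the dwell time available to each one.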

Theoretically, longer dwell times lead to more sensitivity and precision, but in practice, measurement uncertainty in the steps leading up to the analysis (among other factors) tends to be vastly greater than any gains that can be made by using a longer dwell time than ≈1 ms (for modern instruments). In this analysis of 508 MRM ion transitions in 9 min, the shortest dwell times were 2 ms when using 3 MRMs/analyte. Longer dwell times within the method did not correlate with lower LOQs in this study, as demonstrated in Fig. 3. Similarly, ESM Fig. S2 indicates a weak correlation in the SD of the reference ion ratios vs. dwell time > 2 ms in experiments. As Mol et al. [19] also reported, the signal intensities associated with the concentrations and chemical natures of the analytes (e.g., stability, ionization efficiencies, sensitivities, MEs) have a much greater impact on the consistency of ion ratios than dwell times.

Fig. 3 Limits of quantification (LOQs) of the 508 MRM ion transitions for 176 drug analytes and ISTDs vs. dwell times in liquid eggs. LOQ did not depend on dwell time (>2 ms)

Summation function chromatographic peak integration settings

Ion ratios are commonly calculated from integrated chromatographic peaks for each ion; thus, accurate chromatographic peak integrations are essential to generate accurate ion ratios. Reliance on default integrator functions and settings in instrument software often leads to mis-integrated peaks or excessive human review and manual re-integrations. In practice, ideal integrator settings for an analyte at one concentration or matrix often do not work well for different levels and/or backgrounds. However, when analyses involve targeted analytes with known and consistent tR, the summation integration function has been shown to have advantages over more complex integrators [42]. The summation integration program simply draws a line across the baseline from user-defined times before and after the expected tR for the analyte. The summed area or height of the signal above and/or below the line serves as the response. No human review is needed other than to check that analyte peaks fall within the summation windows, which is done by viewing calibration stds injected before and after samples in the batch. Furthermore, manual re-integrations are rarely better than the mathematical summation function applied by the software.

Although different software programs may have additional features, summation integration fundamentally entails only 3 settings: start time, end time, and baseline threshold (note: the latter is eliminated in some programs by just using the lowest point between the start and end times [42]). For the MultiQuant software used in this study, the tR was defined as usual for any software, and the integration start and end times could be set for each analyte or universally by the analyst as the desired times before and after the known tR. In the absence of chemical interferences, which is often the case in MS/MS, much leeway can be granted in the start and end times, provided that the full peak is captured (or just one baseline and the apex when using peak height as the response).
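The core of summation integration can be sketched in a few lines. The example below is a simplified illustration, assuming the baseline is a straight line drawn between the signal values at the start and end times and that the response is the trapezoidal area relative to that line; it does not model the baseline threshold or smoothing settings of any particular software. The function name is hypothetical.

```python
def summation_area(times, signal, start, end):
    """Area between the signal and a straight baseline drawn from start to end."""
    pts = [(t, s) for t, s in zip(times, signal) if start <= t <= end]
    if len(pts) < 2:
        return 0.0
    (t0, s0), (t1, s1) = pts[0], pts[-1]
    slope = (s1 - s0) / (t1 - t0)  # baseline connecting first and last points

    area = 0.0
    for (ta, sa), (tb, sb) in zip(pts, pts[1:]):
        # Signal height above (or below) the baseline at each point.
        ya = sa - (s0 + slope * (ta - t0))
        yb = sb - (s0 + slope * (tb - t0))
        area += 0.5 * (ya + yb) * (tb - ta)  # trapezoid rule
    return area


if __name__ == "__main__":
    t = [0.0, 1.0, 2.0, 3.0, 4.0]
    y = [1.0, 1.0, 3.0, 1.0, 1.0]
    print(summation_area(t, y, start=0.0, end=4.0))
```

Because only the start and end times are set by the user, the same window is applied identically to every injection, which is what removes the need for peak-by-peak review.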

However, when chemical interferences are present near analyte peaks, greater care is needed to make the settings. For example, Fig. 4 shows the reagent blank and 25 ng/g RO calibration std for zeranol in the analysis. The integration end time had to be set to capture as much of the analyte peak as possible without integrating much of the nearby interfering peak. This was successfully done in this study, leading to reference ion ratios with ≤1% SDs, no false positives or negatives (see ESM Table S2), and ≈100% recoveries with ≤11% RSDs (see Table 1).

Fig. 4 Effect of different summation function chromatographic integration settings (smoothing and % baseline noise) on the 3 integrated ion transition peaks of 0 (blank) and 25 ng/g equivalent zeranol reagent-only calibration stds. The 25% baseline noise with smoothing of 1.0 was universally chosen for use in the validation study

With respect to baseline threshold setting, the software allowed input from 0 to 100%, which were evaluated in 12.5% increments using the same collected data from the liquid egg validation experiment. Also, the effect of smoothing was assessed by comparing no smoothing vs. a factor of 1.0. The effects of baseline and smoothing settings on different outcomes in the validation are displayed in Figs. 4 and 5, and ESM Figs. S3–S5.

Fig. 5 Effect of different summation function chromatographic integration settings (0–75% baseline noise and smoothing = 0 or 1.0) on the average SD (n = 4) of the primary reference ion transition ratio for the targeted drugs in the study. Error bars indicate the SD among the analytes (n = 157, aminoglycosides not added)

The 0% baseline threshold integrated the defined region of each MRM chromatogram from the detector’s electronic zero level, which commonly extended well beneath the instrumental background noise for most ion transitions (see Fig. 4). Despite the intuitive expectation that this setting would lead to excessive noise, the baseline noise level was so consistent within the experiment that the 0% threshold yielded among the best results, as shown in Fig. 5 and ESM Figs. S3–S5. Baseline threshold up to 50% made little difference in the outcomes, but higher settings than 50% should be avoided, as indicated in Fig. 5. Also, the figure shows how the 1.0 smoothing factor yielded slightly more consistent ion ratios than no smoothing. This conclusion was less apparent in terms of LOQs (Fig. 3) and recoveries (ESM Fig. S4), but smoothing did not reduce method performance in any factor evaluated. Moreover, the use of peak areas as the signal resulted in better performance than peak heights except when 87.5% baseline threshold was used, as shown in ESM Fig. S5.

Ultimately, 25% baseline threshold and 1.0 smoothing using peak areas were chosen as the universal summation integration conditions for all analytes in the validation study. Perhaps analyte-by-analyte assessments using different settings for each analyte would have improved results slightly, but random fluctuations from experiment to experiment would likely lead to frequent re-evaluations. As shown in Fig. 4, the 25% setting with 1.0 smoothing also subjectively looked better than the other settings evaluated.

Qualitative identification results

In the validation experiments, each drug analyte had to meet identification criteria before it was quantified (i.e., “quantidentified”). Figure 5 shows how the SDs of the reference ion ratios at the chosen integration conditions averaged 2% ± 2% in the validation experiment for liquid eggs. Of course, the ion ratios tend to vary more greatly at lower drug concentrations, particularly in the presence of matrix, and they also depend on the properties of the drug analyte and its MRM ion transitions. Figure S6 (see ESM) provides plots of ion ratio consistencies for the 3 MRMs acquired per analyte relative to (A) tR and (B) drug concentrations. In the former, the most crowded region of the chromatogram from 3 to 5 min gave the highest variability, but this was probably related to drug concentrations (responses), as indicated in the latter plot (and by Mol et al. [19]).

In any case, only the trueness of the reference ion ratios matters, which is why the ratios should be averaged from replicate injections of high concentration stds in RO solutions. Biases in the ion ratios are introduced when direct and/or indirect MEs affect the responses of different ions to different extents [8]. The use of MM calibration stds to set reference ion ratios may (partially) compensate for the biases, which is commonly done for quantification, but this is judged to be improper for real-world regulatory qualitative purposes, even if better rates of false positives and negatives could be achieved during method validation. Similarly, the reference tR should not be set using the MM stds, but quantification of the 0.1X concentration threshold can be done using MM calibration stds. In this study, peak areas for each ion normalized to the ISTD for the MM stds were used for all quantidentification purposes.

As mentioned in “Materials and methods,” 3 different identification criteria were applied using the same data. Figure 6 displays the acceptable ion ratio tolerance ranges vs. the reference ion ratio using the FDA/USDA [16], SANTE/12682/2019 [18], and 2002/657/EC [17] criteria for LC-MS/MS analysis. Each of the criteria requires the denominator to be larger than the numerator (on average) when determining the reference ion ratios, but sometimes in practice, reference ion ratios may reach ≈110% due to random variations. The FDA/USDA ion ratio tolerance window is ±10% (absolute) for one ion ratio (which must be >0%) or ±20% (absolute) for any two ion ratios (>0%). The SANTE identification criteria simply require the ion ratio for any no. of MRM ion transitions to fall within ±30% (relative) independent of the reference ion ratio value. Compared to the FDA/USDA criteria, the SANTE ion ratio tolerances are relatively stringent when the reference ion ratios are <60% and increasingly less strict from 60 to 110%. The 2002/657/EC rules for LC-MS/MS, however, call for broader relative ion ratio tolerances as the value of the ion ratio decreases: (A) ±20% for ion ratios >50–100%; (B) ±25% for >20–50%; (C) ±30% for >10–20%; and (D) ±50% for ion ratios ≤10%. In effect, these tolerances generally fall between the FDA/USDA and SANTE criteria throughout the ion ratio range, as displayed in Fig. 6.
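The tolerance boundaries of the three systems can be computed directly from the rules listed above. The following is a minimal sketch with hypothetical function names; ion ratios are expressed in percent, and the FDA/USDA lower bound is clipped at 0 because ion ratios must be >0%.

```python
def fda_usda_bounds(ref_pct, two_ratios=False):
    """Absolute window: +/-10 for one ion ratio, +/-20 when two ratios are used."""
    tol = 20.0 if two_ratios else 10.0
    return max(ref_pct - tol, 0.0), ref_pct + tol


def sante_bounds(ref_pct):
    """Relative window of +/-30% regardless of the reference ion ratio."""
    return ref_pct * 0.70, ref_pct * 1.30


def ec_2002_657_bounds(ref_pct):
    """Relative window that widens as the reference ion ratio decreases."""
    if ref_pct > 50.0:
        tol = 0.20
    elif ref_pct > 20.0:
        tol = 0.25
    elif ref_pct > 10.0:
        tol = 0.30
    else:
        tol = 0.50
    return ref_pct * (1.0 - tol), ref_pct * (1.0 + tol)


if __name__ == "__main__":
    for ref in (5.0, 15.0, 40.0, 80.0):
        print(ref, fda_usda_bounds(ref), sante_bounds(ref), ec_2002_657_bounds(ref))
```

Printing the bounds over a range of reference ratios gives the kind of comparison that a plot of upper and lower tolerance limits vs. reference ion ratio would show.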

Fig. 6 Upper and lower ion ratio tolerance boundaries for analyte identifications vs. the reference ion ratio when applying different identification criteria

The 2002/657/EC rules additionally assign “identification points” depending on the type of MS analysis being applied [17], but this system reflects the subjective assessment of those who designed it [8]. For standard (low resolution) UHPLC-MS/MS, 1 identification point is granted for each precursor ion used and 1.5 points per product ion, and either 3 or 4 points are needed to identify an analyte depending on whether it is a banned veterinary drug or not [17]. The 2002/657/EC rules are arbitrarily complicated and complicating in both concept and practice, and the identification point system was not employed in this comparison. As with the SANTE criteria, if any single ion ratio (out of potentially 3) fell within the acceptable tolerance windows (and the tR and concentration criteria were also met), then the analyte was said to be (quant)identified.

Figure 7 presents the overall rates of false positives and negatives in the liquid eggs validation for each set of criteria depending on the conc. identification threshold (0 or 0.1X), spiking levels (0.5–2X), no. of ion ratios (3 or 1), and which ones (#2/#1, #3/#1, and #3/#2) used for identification purposes. Each of the identification criteria led to rather similar rates of false positives and negatives in the study. However, as shown in ESM Fig. S7, the 2002/657/EC ion ratio tolerances led to slightly worse results than the other criteria, especially when employing 3 MRMs/analyte, when method acceptability criteria require analytes to achieve <10% rates of false(+/−).

Fig. 7 Rates of false positives (+), n = 5511, or negatives (−), n = 5010 at each level, for the 501 MRM ion transitions in the liquid eggs validation experiment depending on the no. of (and which) ion ratios used, concentration threshold at 0 ng/g or 0.1X level (see Table 1), spiking levels, and FDA/USDA, SANTE/12682/2019, or 2002/657/EC analyte identification criteria

As shown in both Fig. 7 and ESM Fig. S7, application of the 0.1X concentration threshold for 3 MRMs/analyte using any of the identification criteria dramatically lowered the rates of false(+). The overall rate fell from 25–31 to 5–6% false(+) when applying the 0.1X threshold (see Fig. 7), which led the percentage of the 169 drugs yielding <10% false(+) to increase from 27–37 to 95–97% (see ESM Fig. S7). This demonstrates the importance of setting a reasonable threshold. Use of a lower identification reporting level would have led to more false positives without significantly affecting rates of false negatives for the 0.5–2X spikes. In fact, increasing the identification threshold from 0.1X to 0.2X (i.e., 20 to 40% recovery at the 0.5X level) reduced false(+) results for all 501 individual MRM product ions from ≈10 to ≈1% without appreciably affecting false(−) results in the liquid eggs validation experiment. Unfortunately, the concentration threshold could not be universally adjusted in the spreadsheets to fully evaluate this factor vs. rates of false(+/−) in this study.
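The rates discussed above are simple proportions of identification calls. A minimal sketch follows, with a hypothetical function name and made-up inputs: a false positive is an identification in a blank, and a false negative is a missed identification in a spiked sample. The counts in the Fig. 7 caption are consistent with 501 transitions × 11 blanks (10 matrix blanks plus the 0X MM calibration std) = 5511 and 501 × 10 spikes per level = 5010.

```python
def false_rates(blank_calls, spiked_calls):
    """Return (false positive rate, false negative rate) from identification calls.

    Each list holds booleans: True means the analyte was identified.
    """
    false_pos = sum(1 for c in blank_calls if c) / len(blank_calls)
    false_neg = sum(1 for c in spiked_calls if not c) / len(spiked_calls)
    return false_pos, false_neg


if __name__ == "__main__":
    n_transitions = 501
    print(n_transitions * 11, n_transitions * 10)  # denominators per criterion

    # Made-up calls for illustration only.
    fp, fn = false_rates([True, False, False, False], [True, True, False])
    print(f"false(+) = {fp:.0%}, false(-) = {fn:.0%}")
```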

With respect to ion ratios, the first step was to determine which type of ion ratio tolerance was more suitable fundamentally: absolute (as in the FDA/USDA system) or relative (as in the 2002/657/EC and SANTE systems). Figure S8 (see ESM) was prepared to help answer that question by plotting SD and RSD of the ion ratios vs. the reference (average) ion ratios. Mainly due to idiosyncratic analyte properties as already noted, rather poor correlations were obtained in the results shown in ESM Fig. S8. However, the observed trends in the plots do indicate that relative tolerances are probably more appropriate throughout the range of ion ratios than absolute tolerance windows.

Independent of the identification criteria used, the no. of MRMs/analyte also affects the qualitative performance of the methods. To evaluate this aspect, ions #1–3 were assigned in order of intensity, with #1 being most intense. Counter-intuitively, the use of ion ratio #3/#1 led to fewer false negatives than ion ratio #2/#1 when applying the FDA/USDA identification criteria (based on absolute ion ratio windows), in contrast to the SANTE and 2002/657/EC criteria (based on relative ion ratio windows). That is because the smaller ion ratios (e.g., <60%) generated from the least intense vs. the most intense ions lead to a wider permissible ion ratio range when using the FDA/USDA criteria, as shown in Fig. 6. In any case, Fig. 7 and ESM Fig. S7 show that 3 MRMs per analyte should be used for improved qualitative identifications (except for certain analytes that do not fragment into a third tolerably detectable ion).
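The difference between absolute and relative ion ratio windows can be illustrated with a minimal sketch. The ±30% relative tolerance is the SANTE value discussed in this report; the ±20 percentage-point absolute tolerance is only an illustrative placeholder and should not be read as the actual FDA/USDA setting, and the function names are hypothetical.

```python
def absolute_window(ref_ratio_pct, tol_points):
    """Permitted range of ref ± tol_points (percentage points), floored at 0%."""
    return max(0.0, ref_ratio_pct - tol_points), ref_ratio_pct + tol_points


def relative_window(ref_ratio_pct, tol_fraction):
    """Permitted range of ref ± tol_fraction * ref."""
    return ref_ratio_pct * (1 - tol_fraction), ref_ratio_pct * (1 + tol_fraction)


for ref in (80.0, 40.0, 10.0):
    lo_a, hi_a = absolute_window(ref, tol_points=20.0)    # placeholder value
    lo_r, hi_r = relative_window(ref, tol_fraction=0.30)  # SANTE ±30% relative
    print(f"ref {ref:4.0f}%: absolute {lo_a:5.1f}-{hi_a:5.1f}%, "
          f"relative {lo_r:5.1f}-{hi_r:5.1f}%")
```

For a small reference ratio, a fixed absolute window spans a much larger fraction of the reference value than a relative window does, which is consistent with the explanation given above for ratio #3/#1.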

Figure 7 and ESM Fig. S7 offer evidence that both the SANTE and FDA/USDA criteria perform equally well, especially when using 3 MRMs/analyte. The analytical chemistry version of Occam’s razor principle is that the simplest approach should be chosen among others that yield the same outcome. The algorithms to make analyte identifications in the Excel spreadsheet developed for the SANTE criteria were much simpler than the ones applying the FDA/USDA and 2002/657/EC criteria. Thus, taking into account the ease of use along with the results shown in Fig. 7 and ESM Figs. S7–S8, the author recommends the SANTE approach over the others.

Unlike the spreadsheets using FDA/USDA and 2002/657/EC criteria, the compiled validation spreadsheets for liquid eggs applying the SANTE criteria could be easily modified to determine the effect of the single relative ion ratio tolerance on the results. Figure 8 and ESM Fig. S9 are plots of false(+) at the 0X and false(−) at the 0.5X spiking level vs. the relative ion ratio tolerance setting. Figure 8 shows the rates of false(+/−) directly, and ESM Fig. S9 factors in the 10% method acceptability threshold rate when individually treating all 501 ion transitions for the 169 analytes. As expected, the rates of false(+) increased as the ion ratio window widened, which in the case shown in ESM Fig. S9 adhered to a logarithmic relationship up to ±300% relative tolerance. Conversely, the percentage of MRMs with rates of false(−) >10% fell from 95% when applying a ±1% ion ratio tolerance to a minimum of 2% when using ≥ ±100% tolerance (still requiring all ion ratios to be >0%). The maximum rate of false(+) was ≈16% and the minimum rate of false(−) was ≈1% in the liquid eggs experiment (see Fig. 8), both of which remained the same even when extending to ±1000% relative ion ratio tolerance.
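The kind of tolerance sweep described above can be sketched as follows. This is not the author's spreadsheet logic: the function names, the reference ratio, and the measured ion ratios are all hypothetical toy values, and the tR and concentration criteria are deliberately left out so that only the ion ratio test is exercised.

```python
def passes_ratio(measured, reference, rel_tol):
    """True if the ion ratio is >0 and within ±rel_tol (as a fraction) of the reference."""
    if measured <= 0 or reference <= 0:
        return False
    return abs(measured - reference) <= rel_tol * reference


def false_rates(blank_ratios, spike_ratios, reference, rel_tol):
    """Return (false(+) rate in blanks, false(-) rate in spikes) for one tolerance."""
    false_pos = sum(passes_ratio(r, reference, rel_tol) for r in blank_ratios)
    false_neg = sum(not passes_ratio(r, reference, rel_tol) for r in spike_ratios)
    return false_pos / len(blank_ratios), false_neg / len(spike_ratios)


# Hypothetical data for one ion transition: 11 blanks and 10 spikes.
REFERENCE_RATIO = 0.50
BLANK_RATIOS = [0.0, 0.0, 1.4, 0.0, 0.62, 0.0, 2.1, 0.0, 0.0, 0.9, 0.0]
SPIKE_RATIOS = [0.49, 0.52, 0.46, 0.58, 0.51, 0.40, 0.55, 0.50, 0.70, 0.47]

for tol in (0.01, 0.10, 0.30, 1.00, 3.00):
    fp, fn = false_rates(BLANK_RATIOS, SPIKE_RATIOS, REFERENCE_RATIO, tol)
    print(f"±{tol:.0%} tolerance: false(+) {fp:.0%}, false(-) {fn:.0%}")
```

Because a ratio of 0 never passes, blanks with no signal for one of the ions are never counted as false(+), regardless of how wide the window becomes; this is the toy counterpart of the plateau noted in the text.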

Fig. 8

Rates of false(+) and false(−) vs. ion ratio tolerance (relative) for the 501 individual MRM ion transitions in the liquid egg validation experiment consisting of 11 blanks (n = 5511) and 10 spikes at the 0.5X level (n = 5010)

Interestingly, the crossover point at which the rates of false(+) and false(−) were both ≈5% in Fig. 8 (or at which 10–11% of the ion transitions exceeded the 10% method acceptability threshold in ESM Fig. S9) fell almost exactly at the ±30% ion ratio tolerance chosen by the SANTE committee [18, 19].

By the way, the 10% method acceptability threshold in rates of false(+/−) for each analyte could be altered depending on the fit-for-purpose choices [15], but FSIS validation criteria were employed in this report. The author has always maintained that it is better to identify, say, 50% of the positives near the regulatory level of concern than 0% by excluding the analyte from the method due to supposed validation failures. As long as the rate of false(+) results is reasonably low, rates of false(−) ≤90% should be good enough in the absence of alternatives.

However, rates of false(−) could be greatly reduced by expanding the ion ratio tolerance. As shown in Fig. 8 and ESM Fig. S9, widening the tolerance to ±300% resulted in merely 16% false(+) overall, with 23% of the MRM ion transitions exceeding the 10% method acceptability false(+) threshold. This meant that the ion ratio tolerance could be removed altogether for >75% of the ions (or analytes), provided that the tR and concentration criteria were kept in place. In fact, only the latter criterion would be needed if the chromatographic peak integration window matched the ±6 s tolerance from the reference tR for each analyte. Perhaps the choice of ±30% relative tolerances by the SANTE committee should be reconsidered in light of the results provided in Fig. 8 and ESM Fig. S9.

In regulatory applications, the false(−) samples are not re-analyzed and thus will be missed altogether, but false(+) findings above the regulatory violation level should be eliminated upon confirmatory re-analysis using orthogonally selective techniques [8]. For food and environmental safety reasons (among others), identification criteria should be devised to reasonably minimize false(−) results more so than false(+), provided that subsequent measures are taken to confirm actual positives.

Validation results

Although the author currently suggests use of the SANTE identification criteria, the FDA/USDA criteria were used in this report to meet FSIS needs in the validation of the E&I method for liquid and powdered eggs. ESM Table S2 lists the rates of false positives and negatives using 3 MRMs/analyte for each drug, spiking level, and matrix in the study. The last column in the table also provides the combined %true identifications for each analyte in both matrices (n = 82). The identification criteria applied made little or no difference in the quantification results, and Table 1 presents the recoveries and RSDs for the quantidentified drugs spiked into the liquid and powdered egg samples.

In both tables, the drugs are grouped by class, and the E&I method was found to work similarly as in previous validation studies for meats and fish [25,26,27,28,29,30,31], except that, for an unknown reason, the recoveries of tetracyclines and (fluoro)quinolones were lower in the egg matrices than in previous validations. Perhaps the pH, ionic strength, viscosity, and/or amounts and types of proteins and lipids of the eggs made a difference. Metal content and chelation of these drugs may also be playing a role in the results. Samples containing yolk (including only yolk) tended to yield worse performance in general than egg white samples.

Overall, 139–141 drugs (82–83%) of the 169 analytes met the method acceptability criteria of 70–120% recovery with ≤25% RSD for powdered and liquid eggs (n ≈ 30 each), respectively. The reconstituted powdered eggs were somewhat more concentrated (less watery) than the liquid eggs, which could explain the slightly better results in the latter case.
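A minimal sketch of the quantitative acceptability check is given below. It assumes the 70–120% recovery and 25% RSD limits apply to the average recovery and the sample standard deviation of the replicates; the function name and the recovery values are hypothetical.

```python
from statistics import mean, stdev


def meets_quant_criteria(recoveries_pct, low=70.0, high=120.0, max_rsd=25.0):
    """Return (acceptable, average recovery, %RSD) for a set of replicate recoveries."""
    avg = mean(recoveries_pct)
    rsd = 100.0 * stdev(recoveries_pct) / avg  # sample SD relative to the mean
    acceptable = (low <= avg <= high) and (rsd <= max_rsd)
    return acceptable, avg, rsd


# Hypothetical replicate recoveries (%) for three analytes.
examples = {
    "analyte A": [95, 102, 88, 110, 91],   # good recovery, low variability
    "analyte B": [40, 55, 48, 60, 52],     # low recovery
    "analyte C": [60, 140, 100, 30, 170],  # acceptable average, high variability
}

for name, values in examples.items():
    ok, avg, rsd = meets_quant_criteria(values)
    print(f"{name}: recovery {avg:.0f}%, RSD {rsd:.0f}%, acceptable = {ok}")
```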

Figure S10 (see ESM) shows an assessment of different quantitative parameters (MEs, recoveries, RSDs, MM calibration R², and LOQs) for each of the 3 product ions acquired per analyte in the liquid and powdered egg validation experiments. As expected, the performance features generally trended from best to worst depending on relative ion intensity in nearly all cases. Curiously, the only exception was that the weaker ions #2 and #3 outperformed the more intense ion #1 in terms of R² and LOQs in powdered eggs. This is believed to be due to carry-over or slight cross-contamination of the analytes in the MM 0X calibration std (matrix blank) sample for powdered eggs. This is also apparent in ESM Table S2, which shows a preponderance of a single false(+) for many drugs. This happens when the possible contamination crosses the 0.1X concentration threshold. The most intense ion would exceed the background noise level most frequently to cause the observed effect, and the R² would also be adversely affected, along with a bias in the calibration curve. If the sample was contaminated with ≈1 ng/g of the drugs, for example, the drugs with a 1X spiking level of 10 ng/g would indicate the most “false” false(+) results. No outliers were removed from the study, and the results were still fine despite the suspected low-level cross-contamination in this instance.

Conclusions

The author was requested by FSIS to validate the E&I method for veterinary drug residues, including aminoglycosides, in liquid and powdered eggs. A diverse range of 169 drugs was tested at regulatory levels of interest in dozens of samples of different types that FSIS had collected from across the USA. The results of the straightforward validation study, conducted according to FSIS protocols, were reported directly to FSIS and are also reported here. In all, 139–141 (82–83%) of the drugs spiked into the powdered and liquid eggs, respectively, met the acceptability criteria of 70–120% recoveries with ≤25% RSDs. Upon implementation, FSIS chemists noted foaming of some samples, but an aliquot of the liquid extract taken from beneath the foam layer worked fine in this study.

In this report, the data generated in the validation experiments further served as a case study in the comparison of 3 different regulatory analyte identification criteria for UHPLC-MS/MS analysis. The compared criteria consisted of the FDA/USDA [16], SANTE/12682/2019 [18], and 2002/657/EC [17] ion ratio tolerances vs. the reference ion ratios. In addition, analyte identification required a tR difference of no more than ±0.1 min from the reference and a threshold concentration of 0.1X of the regulatory concentration of interest for each analyte. Also, the use of 2 or 3 MRM ion transitions per drug analyte was compared in each case, and qualitative performance improved ≈10-fold when using 3 vs. 2 MRMs/analyte.

Applying the ≤10% threshold rate for method acceptability, 95–97% of the 169 drugs had a sufficiently low rate of false(+) results (≈5% overall), and 88–90% of the drugs spiked at the 0.5X level met the ≤10% false(−) criterion (≈5% overall). All 3 identification criteria yielded similar rates of false(+/−) results, but the 2002/657/EC criteria were more complicated in theory and practice to employ. The SANTE/12682/2019 analyte identification guidance to allow ±30% (relative) ion ratio difference from the contemporaneous reference ion ratio for any pair of product ions worked well and was simplest in both concept and implementation.
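The three identification requirements summarized above (tR within ±0.1 min, concentration at or above 0.1X, and every ion ratio within ±30% relative of its reference) can be combined as in the following sketch. It is a simplified illustration under those stated settings, not the spreadsheet implementation used in the study; the function name, argument layout, and example values are hypothetical.

```python
def identify(t_r, conc_x, ratios, ref_t_r, ref_ratios,
             t_r_tol=0.1, conc_min_x=0.1, rel_tol=0.30):
    """Return True only if the tR, concentration, and ion ratio criteria are all met."""
    if abs(t_r - ref_t_r) > t_r_tol:   # retention time, in min
        return False
    if conc_x < conc_min_x:            # concentration, in multiples of X
        return False
    for name, ref in ref_ratios.items():
        measured = ratios.get(name, 0.0)
        # Each ion ratio must be >0 and within ±rel_tol (relative) of its reference.
        if measured <= 0 or abs(measured - ref) > rel_tol * ref:
            return False
    return True


# Hypothetical reference values for one analyte with 3 MRMs (2 ratios vs. ion #1).
ref_ratios = {"2/1": 0.60, "3/1": 0.25}

print(identify(4.32, 0.80, {"2/1": 0.55, "3/1": 0.30}, 4.30, ref_ratios))  # all criteria met
print(identify(4.32, 0.05, {"2/1": 0.55, "3/1": 0.30}, 4.30, ref_ratios))  # below 0.1X
print(identify(4.32, 0.80, {"2/1": 0.55, "3/1": 0.40}, 4.30, ref_ratios))  # ratio 3/1 too high
```

Passing only one entry in `ref_ratios` corresponds to the 2 MRMs/analyte case, so the same function can be used to compare identification with 1 vs. 2 ion ratios.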

Bear in mind that this study was devised for regulatory enforcement applications, which involve identification and reporting levels relative to regulatory concentrations of concern. For risk assessment and similar purposes that need to make identifications at the lowest level possible, the limit of identification would be chosen as the reporting threshold, which would be set at the tolerable rate of false positives for the purpose of the analysis.