Introduction

Cobalamin (vitamin B12) is an essential cofactor for two enzymes involved in one-carbon metabolism: methylmalonyl CoA mutase and methionine synthetase [1]. Reduced function of the former enzyme results in increased serum methylmalonic acid (MMA) concentrations. Untreated vitamin B12 deficiency leads to megaloblastic anemia and/or irreversible central nervous system degeneration [2]. Serum or plasma vitamin B12 concentrations, particularly in the low-normal range, may not adequately reflect tissue vitamin B12 status [3]. Increased MMA concentrations are specific for and believed to be the first indication of vitamin B12 deficiency [4]. MMA in serum or plasma was therefore found to be a useful indicator of vitamin B12 deficiency, especially in patients with few or no hematological abnormalities, normal results for the Schilling test, or normal or only slightly depressed serum vitamin B12 concentrations [5].

Various analytical methods for the measurement of MMA in biological fluids have been reported, mainly since the 1980s. Different approaches were taken for sample preparation (e.g., ultrafiltration, solid-phase or liquid-liquid extraction) and analytical measurement (e.g., GC/MS, electrophoresis, and, more recently, LC-MS/MS) [6–19]. The analysis of serum MMA presents several challenges: this dicarboxylic acid has a low molecular weight, is hydrophilic and nonvolatile, and is present in serum at relatively low concentrations, especially in the generally healthy population with a low prevalence of vitamin B12 deficiency. This makes direct analysis by GC/MS challenging, and various types of derivatization (silylation, cyclohexanol, chloroformate, and butanol) have been reported to improve detection [6–9]. LC-MS/MS methods [10–18] yield shorter run times and better sensitivity compared with GC/MS procedures. These two factors are critical when a method is used for population monitoring because they translate to higher throughput and lower patient sample volume.

The National Health and Nutrition Examination Survey (NHANES) measured plasma MMA concentrations in the US population from 1999 to 2004 using a GC/MS procedure that required 275 μL of sample and had a low throughput (36 samples/run) [19]. No MMA data are available for 2005–2010. A 2010 expert roundtable discussing “NHANES Monitoring of Biomarkers of Folate and Vitamin B-12 Status” recommended that serum MMA measurements be reinstated in future NHANES in conjunction with serum vitamin B12 measurements [20]. Our primary objective was to introduce a more efficient yet highly accurate LC-MS/MS method for NHANES 2011–2014. Our secondary objective was to compare this new method with the previously used GC/MS procedure in order to provide continuity in interpreting NHANES data. Lastly, we aimed to assess how these two procedures compare with the GC/MS measurement procedure used by the National Institute of Standards and Technology (NIST).

Materials and methods

Chemicals, reagents, and samples

MMA, succinic acid, tert-butyl methyl ether (MTBE), and 3 M hydrochloric acid in n-butanol were purchased from Sigma-Aldrich (St. Louis, MO, USA). Deuterium-labeled MMA (d3-MMA) was purchased from CDN Isotopes (Quebec, Canada). ACS grade methanol, phosphoric acid (H3PO4), glacial acetic acid, and acetonitrile were purchased from Thermo Fisher Scientific (Waltham, MA, USA). Purified water (18 MΩ) was obtained from an Aqua Solutions water purification system (Aqua Solutions, Inc., Falmouth, ME, USA) and was used to prepare all samples, calibrators, and reagents. Quality control (QC) pools (low, medium, and high) were prepared in-house from pooled human serum purchased from a US blood bank. All units were screened for MMA and spiked with MMA standard solution as needed to achieve desired concentrations. All specimens were stored at –70 °C when not in use.

Sample preparation and analysis by LC-MS/MS

Samples were prepared by a modified procedure reported by Pedersen et al. [16]. Master stock solutions of MMA and d3-MMA were prepared in water (50 μmol/L). Six individual MMA working calibrators were prepared from the stock solution in water (25–2500 nmol/L) and stored (as 400 μL aliquots) at –70 °C until analysis. We prepared a fresh working internal standard (ISTD) solution (1 μmol/L) from the d3-MMA stock solution on the day of analysis in 50 % methanol/water. The sample preparation steps involved liquid–liquid extraction and generation of MMA dibutyl ester (Fig. 1). MMA was separated from succinic acid and from other compounds on an Accela UHPLC system (Thermo Fisher Scientific, Waltham, MA, USA) by isocratic elution (0.1 % acetic acid:methanol 40:60; 0.4 mL/min) within 4.5 min on a Hypersil Gold C18 analytical column (2.1 mm × 50 mm, 1.9 μm particle size) (Thermo Fisher Scientific) held at 35 °C. The total run time was 6 min. Samples were maintained at 7 °C within the instrument during the run. Injection volume was 3 μL in a partial loop mode. Mass detection was carried out on a TSQ Vantage (Thermo Fisher Scientific) under multiple-reaction monitoring (MRM) conditions in positive electrospray ionization (ESI) mode. The mass spectrometer parameters were as follows: spray voltage 4500 V, vaporizer temperature 350 °C, nitrogen sheath gas pressure 30 arbitrary units, auxiliary gas flow 5 arbitrary units, ion transfer capillary temperature 290 °C, argon collision gas pressure 1.2 mTorr, scan time 200 ms, scan width 0.1 Da, mass resolution in Q1 and Q3 0.7 full width at half maximum. We monitored two transitions each for MMA (m/z 231 → 119 and 175.1) and for d3-MMA (m/z 234.1 → 122.1 and 178.1) and used the sum of the two signals for quantitation (MRM mode). We also used the two individual transitions in selected reaction monitoring (SRM) mode to calculate confirmation ion ratios. In this calculation, the more abundant transition (m/z 231 → 119) was the quantitation ion (QISRM), whereas the less abundant transition (m/z 231 → 175.1) was the confirmation ion (CISRM). We established confirmation ion ratio (CISRM/QISRM) limits (mean ± 2 SD) for MMA from repeated analysis of the calibrators (n = 20). Quantitation was based on peak area ratios (analyte/ISTD) interpolated against a six-point aqueous calibration curve (1/x weighting), using XCalibur software (Thermo Fisher Scientific). We reinjected the calibrators at the end of each run to assess drift. Each run consisted of a reagent blank (no ISTD), a blank (with ISTD), a set of calibrators, three serum QC samples prepared in duplicate (bracketing the unknown samples), and one blind QC sample for every 20 unknown samples (see Electronic Supplementary Material Figure S1).
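The quantitation and confirmation steps described above can be illustrated with a minimal Python sketch. It is not the XCalibur implementation: the intermediate calibrator levels, the intercept, and the function names are hypothetical, and only the 25–2500 nmol/L range, the 1/x weighting, and the mean ± 2 SD limits come from the text.

```python
import numpy as np

def fit_weighted_line(conc, ratio):
    """Weighted least-squares line ratio = slope * conc + intercept, weights 1/x."""
    x = np.asarray(conc, dtype=float)
    y = np.asarray(ratio, dtype=float)
    w = 1.0 / x                      # 1/x weighting favors the low calibrators
    xbar = (w * x).sum() / w.sum()   # weighted means
    ybar = (w * y).sum() / w.sum()
    slope = (w * (x - xbar) * (y - ybar)).sum() / (w * (x - xbar) ** 2).sum()
    intercept = ybar - slope * xbar
    return slope, intercept

def interpolate(ratio, slope, intercept):
    """Convert an analyte/ISTD peak area ratio into a concentration (nmol/L)."""
    return (ratio - intercept) / slope

def ion_ratio_limits(ratios):
    """Confirmation ion ratio limits as mean +/- 2 SD of repeated measurements."""
    r = np.asarray(ratios, dtype=float)
    mean, sd = r.mean(), r.std(ddof=1)
    return mean - 2 * sd, mean + 2 * sd

# Hypothetical six-point aqueous calibration curve (nmol/L) with ideal responses
cal_conc = [25, 50, 100, 500, 1000, 2500]
cal_ratio = [0.0025 * c + 0.001 for c in cal_conc]
slope, intercept = fit_weighted_line(cal_conc, cal_ratio)

# Hypothetical unknown sample with an area ratio of 0.501
unknown = interpolate(0.501, slope, intercept)

# Hypothetical CISRM/QISRM ratios from repeated calibrator injections
low, high = ion_ratio_limits([0.70, 0.72, 0.68, 0.71, 0.69])
```

A sample whose confirmation ion ratio falls outside `low`–`high` would be flagged for a possible interference; the weighting keeps the high calibrators from dominating the fit at concentrations typical of population samples.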

Fig. 1 Sample preparation steps for MMA LC-MS/MS method

Method validation

We used the FDA “Bioanalytical Method Validation” and IUPAC validation guidelines [21, 22] as well as our internal division document “Policies and Procedures Manual for Bioanalytical Measurements” to design our method validation experiments. We evaluated calibrator accuracy for 10 runs by calculating the proportion of measured to nominal calibrator value. We assessed the effect of the presence of matrix on the calibration curve from a 10-point calibration curve (0–2500 nmol/L, three independent calibration curves) prepared both in water and in human serum. Slopes that agreed within ±5 % were considered to be equivalent. We determined method imprecision using three levels of serum QC pools: within-run, 10 replicates/level in one run; total and between-run, two replicates/level over 10 runs. We assessed method accuracy through spike recovery: a serum pool was spiked with MMA at four levels (two runs, two replicates/level: 50, 250, 700, and 1000 nmol/L). We also measured the unspiked serum pool for endogenous MMA concentration (two runs, two replicates each). To assess spike recovery, we added the ISTD to the samples and carried them through the extraction process. To assess extraction efficiency, we added the ISTD after extraction was completed. The spike recovery and the extraction efficiency were calculated as the measured concentration difference between the spiked and the unspiked sample divided by the nominal concentration of the spike. We estimated the limit of detection (LOD) by serially diluting the low QC pool with charcoal stripped serum (Aalto Scientific, Carlsbad, CA, USA) and calculating the SD at concentrations of zero (σ0) from the extrapolation of repeat measurements (three runs, three replicates/dilution) [23]. We defined the LOD as 3σ0.
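The spike recovery and LOD calculations can be sketched as follows. This is an illustration under stated assumptions: the numerical values are hypothetical, and the straight-line extrapolation of SD versus concentration is one common way to estimate σ0, assumed here because the text does not specify the extrapolation model.

```python
import numpy as np

def spike_recovery_percent(spiked, unspiked, nominal_spike):
    """Recovery = (measured spiked - measured unspiked) / nominal spike, in percent."""
    return 100.0 * (spiked - unspiked) / nominal_spike

def lod_from_dilutions(mean_conc, sd):
    """Extrapolate SD to zero concentration (sigma0) and return (LOD, sigma0).

    Assumes SD varies linearly with concentration near the detection limit.
    """
    slope, sigma0 = np.polyfit(mean_conc, sd, 1)  # coefficients: highest degree first
    return 3.0 * sigma0, sigma0

# Hypothetical serum pool: 150 nmol/L endogenous MMA, spiked with 250 nmol/L
recovery = spike_recovery_percent(spiked=385.0, unspiked=150.0, nominal_spike=250.0)

# Hypothetical dilution series: mean concentration and SD of replicates (nmol/L)
dilution_means = [20.0, 40.0, 80.0, 160.0]
dilution_sds = [5.4, 5.8, 6.6, 8.2]
lod, sigma0 = lod_from_dilutions(dilution_means, dilution_sds)
```

The same recovery formula applies to extraction efficiency; the only difference is that the ISTD is added after extraction rather than before, so analyte losses during extraction are no longer compensated.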

MMA stability

We exposed three serum QC pools (97.1, 247, and 548 nmol/L) to room temperature for 1, 2, 3, and 7 d to assess short-term MMA stability. The vials were returned to –70 °C storage and later all samples were processed and analyzed together with pristine serum QC pools kept at –70 °C as a reference. We also subjected the same QC pools to repeated freeze-thaw (up to three cycles, 2 h/cycle at room temperature in closed vials). We studied the stability of processed serum extracts for the above set of serum QC pools kept in the autosampler compartment at 7 °C for 3 d. In a separate experiment, we kept the 96-well plate containing processed serum QC pools and 25 unknown samples at –20 °C for 1 wk and compared the results with the initial results.

We assessed mid-term (6 mo and 1 y) MMA stability using three plasma QC pools (179, 515, and 2007 nmol/L, three runs, one replicate/condition) stored at suboptimal temperatures (5 °C and –20 °C). The vials were returned to –70 °C storage and later all samples were processed and analyzed together with pristine plasma QC pools kept at –70 °C as a reference. Long-term (14 y) MMA stability at  ≤ –70 °C was assessed from six plasma QC pools (120, 137, 1079, 1825, 10230, and 10295 nmol/L, two runs, five replicates/level).

Effect of specimen type and anticoagulant

To study the effect of different anticoagulants, we used matched serum and plasma samples from 14 anonymous blood donors [serum, serum separator, K2EDTA plasma, Na citrate plasma, and Na heparin plasma (5-mL blood collection tubes with 0.5 mL anticoagulant), Tennessee Blood Bank Services (Memphis, TN, USA)]. Plasma was obtained within 2–4 h of blood collection (blood held at room temperature), serum after overnight clotting (leading to higher serum yield and less residual fibrinogen clots) at room temperature through centrifugation at 4 °C, following standard operating procedures. All specimens were refrigerated, shipped on cold packs, and frozen at –70 °C within 48 h of blood collection. Serum was used as a reference. Because of the diluent in the plasma citrate tubes, we multiplied results for this specimen type by 1.1.

Method comparison studies

We performed three comparison studies to assess how the new LC-MS/MS method compared with other methods. First, we subjected randomly selected residual NHANES plasma samples (n = 336; MMA: 60–2320 nmol/L) previously analyzed with our GC/MS method to analysis by our new LC-MS/MS method. We excluded seven extreme outlier results (difference between the two procedures >40 %) and three samples with MMA concentrations above 1500 nmol/L to avoid an undue influence of very high concentrations. The final set (n = 326; MMA: 60–1180 nmol/L) was evaluated for Pearson correlation, Deming regression, and Bland-Altman analysis using Analyse-it, a statistical plug-in for Microsoft Excel (Analyse-it ver. 2.20; Analyse-it Software, Ltd., Leeds, UK). Second, we subjected eight serum and 11 plasma samples (stored at –70 °C for 14 y) from a previous international round-robin study [24] to analysis by our new LC-MS/MS method and compared the results with the consensus mean of all 13 participating laboratories and to our previous GC/MS results. Third, we characterized the plasma-based NIST Standard Reference Material (SRM) 1950 “Metabolites in Human Plasma” by our LC-MS/MS and GC/MS procedures (replicate analyses of six vials over 3 d). This material was also characterized by a NIST GC/MS measurement procedure that has been used in value assignment of NIST reference materials. For all three methods, we calculated the mean±SD and the coefficient of variation (CV). This evaluation was another accuracy assessment of the new LC-MS/MS method.
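For readers who want to reproduce this type of comparison outside Analyse-it, the Deming regression and Bland-Altman calculations can be sketched in Python. The sketch assumes an error variance ratio of 1 between the two methods (the setting used in the study is not stated), and the paired results are hypothetical.

```python
import numpy as np

def deming(x, y, delta=1.0):
    """Deming regression of y on x; delta = ratio of y to x error variances."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xm, ym = x.mean(), y.mean()
    sxx = ((x - xm) ** 2).sum()
    syy = ((y - ym) ** 2).sum()
    sxy = ((x - xm) * (y - ym)).sum()  # must be nonzero
    slope = (syy - delta * sxx
             + np.sqrt((syy - delta * sxx) ** 2 + 4 * delta * sxy ** 2)) / (2 * sxy)
    intercept = ym - slope * xm
    return slope, intercept

def bland_altman(x, y):
    """Mean difference (bias) and 95 % limits of agreement for y - x."""
    d = np.asarray(y, dtype=float) - np.asarray(x, dtype=float)
    bias, sd = d.mean(), d.std(ddof=1)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical paired results (nmol/L): comparison method vs. new method
gcms = np.array([100.0, 200.0, 300.0, 400.0])
lcms = 1.02 * gcms + 3.0

slope, intercept = deming(gcms, lcms)
bias, lower, upper = bland_altman(gcms, lcms)
```

Unlike ordinary least squares, Deming regression allows for measurement error in both methods, which is why it is preferred when neither procedure is a true reference.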

Results and discussion

The sample preparation procedure (adapted from Pedersen et al. with some minor modifications) involved liquid–liquid extraction of MMA from serum, followed by derivatization of MMA to dibutyl ester and analysis by positive ion ESI-LC-MS/MS.

We optimized and tested critical sample preparation steps. Because the addition of acid to the organic solvent improves the transfer of organic acid from the aqueous to the organic phase, we tested extraction efficiency with different concentrations of o-phosphoric acid (0.1, 0.25, and 0.5 M) on a set of calibrators and QC samples. We obtained poor MMA extraction with 0.1 M o-phosphoric acid in MTBE and selected the 0.5 M acid concentration because it provided optimum MMA extraction from the aqueous phase (data not shown). Placing the tubes into a dry ice/ethanol bath was a fast and efficient way to separate the organic from the aqueous layer by simply decanting the top (MTBE) layer after the bottom (aqueous) layer froze. This process happened very quickly, and small variations in the amount of time the samples were kept in the ice bath (1, 2, or 3 min) did not influence the MMA areas in a set of calibrators and QC samples. Because the derivatization of extracted MMA to the dibutyl ester (at 60 °C) is a critical step, we studied the timing on the heating block to monitor reaction completion (20, 30, and 40 min). The use of excess of derivatization reagent and a 30 min incubation time assured that even samples with high MMA concentrations (above 10000 nmol/L) completed the reaction. The derivatized samples have to be subjected to evaporation to remove the excess of the high boiling derivatization reagent (3 M HCl in n-butanol). In our experience, longer evaporation times, involving vacuum and heat, can reduce the recovery of the volatile MMA dibutyl ester (optimum evaporation time is 30 min at 45 °C under vacuum, data not shown).

One important challenge in the analysis of serum MMA is the potential interference from other low-mass carboxylic acids, especially the structural isomer succinic acid, which is present in serum in high concentrations. Because the mass spectra of the two acids are nearly identical, MMA and succinic acid have to be separated chromatographically. To achieve the best resolution, we tested several reversed phase analytical columns (pentafluorophenyl, cyano, and C18). We selected the C18 UHPLC column because it allowed excellent resolution between MMA and succinic acid with relatively fast isocratic chromatography, narrow analyte peaks, and excellent column life (typically ~2000 injections) (Fig. 2). Our LC-MS/MS method offers an additional degree of specificity because it features data collection from two specific MMA transitions (QISRM and CISRM). We obtained fairly low (±8 %) variation of the confirmation ion ratio when we monitored these two ions in all six calibrators over 20 runs, particularly compared with common “industry standards” (FDA: ±10 % [25]; Society of Forensic Toxicologists: ±20 % to ±30 % [26]; European Commission: ±20 % to ±50 % depending on the abundance of CISRM vs. QISRM [27]). When we applied the confirmation ion ratio limits to MMA results >150 nmol/L obtained from 600 serum samples, 99 % of samples met the predetermined limits. This indicates that our confirmation ion ratio limits, although narrow, can be used in routine applications to allow identification of a potentially interfering compound that may be present in a sample.

Fig. 2 Typical chromatogram of the low quality control serum

Our previous GC/MS procedure [19], although performing reliably for the measurement of thousands of NHANES samples, was not an efficient method. The sample volume requirement was 275 μL/test, the GC/MS run time was 15 min, the throughput was limited to 36 samples/run, and the sample preparation time for one run was ~12 h (Table 1). With the new LC-MS/MS method, we achieved our main goal to improve the method efficiency. The LC-MS/MS method used a much reduced sample volume (75 μL), allowed a notably faster sample preparation time (~4 h) and LC-MS/MS run time (6 min), and had a much higher sample throughput (up to 160 unknown samples/run).

Table 1 Sample preparation and analysis steps, comparing the CDC LC-MS/MS and GC/MS methods for MMA

We subjected our LC-MS/MS method to complete method validation. Ion suppression was evaluated by post-column infusion (using a T-connection) of 50 nmol/L MMA-dibutyl solution while injecting a reagent blank, a blank, standards, and unknown and QC samples. No baseline drifts were observed around the elution time of MMA upon injection of matrix and non-matrix samples, indicating lack of ion suppression. We verified and confirmed that aqueous calibration produced an equivalent slope (0.0025, r² = 0.999) compared with calibration in serum (0.0024, r² = 0.994). The average slopes differed by 4.1 %. Because of this matrix equivalency, we prepared many aliquots of aqueous calibrators, stored them at –70 °C, and used a new set with each run, which greatly simplified method implementation and daily routine preparations for the analyst. The aqueous calibration curve showed linearity over two orders of magnitude (25–2500 nmol/L) with a high correlation coefficient (r² = 0.999). Some investigators reported storage at –20 °C as preferred over storage at –80 °C (for aqueous calibrators [16] or for serum-based calibrators [12]) because of possible MMA adsorption to the storage container surface. We have not observed any change in method performance with aqueous calibrators stored in 2-mL Nalgene vials for up to 2 y at –80 °C (data not shown).

Our LC-MS/MS method achieved good precision with a total CV of 4.9 %–7.9 % for three serum QC pools analyzed over 22 runs (Table 2), similar to the precision of the GC/MS method (CV of 5.4 % for three plasma QC pools analyzed over 56 runs, [19]) and to other LC-MS/MS methods [10–18]. We can rate our method precision more objectively by using generally applicable quality goals based on biologic variation [28]. As such, the analytical variation CVa should be a fraction of the within-person biologic variation CVw: optimum performance, CVa = 0.25 × CVw; desirable performance, CVa = 0.5 × CVw; minimum performance, CVa = 0.75 × CVw. Lacher et al. reported a within-person biologic variation of 18.7 % for MMA using data from NHANES 1999–2002 [29]. Therefore, the optimum, desirable, and minimum precision performance criteria for MMA are <4.7 %, <9.4 %, and <14 %, respectively. Our LC-MS/MS method achieved desirable precision.
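The precision rating above follows directly from the three multipliers and the reported within-person variation. A short Python sketch makes the arithmetic explicit; the function names and the "unacceptable" label for CVs beyond the minimum goal are illustrative choices, not terms from the cited guidelines.

```python
def precision_goals(cv_within):
    """Analytical CV goals (%) as fractions of within-person biologic variation."""
    return {
        "optimum": 0.25 * cv_within,
        "desirable": 0.50 * cv_within,
        "minimum": 0.75 * cv_within,
    }

def rate_precision(cv_analytical, cv_within):
    """Return the strictest goal that the analytical CV satisfies."""
    goals = precision_goals(cv_within)
    for level in ("optimum", "desirable", "minimum"):
        if cv_analytical < goals[level]:
            return level
    return "unacceptable"  # illustrative label for CVs above the minimum goal

CV_WITHIN_MMA = 18.7  # %, within-person biologic variation reported by Lacher et al.

goals = precision_goals(CV_WITHIN_MMA)
# Lowest and highest total CV reported for the three serum QC pools
ratings = [rate_precision(cv, CV_WITHIN_MMA) for cv in (4.9, 7.9)]
```

Both ends of the observed CV range fall between the optimum (about 4.7 %) and desirable (about 9.4 %) thresholds, which is why the method is rated as achieving desirable precision.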

Table 2 Analytical performance of the LC-MS/MS method

We evaluated method accuracy through spike recovery. The use of an isotopically-labeled ISTD is expected to correct for any loss of analyte during sample handling. The high accuracy assessed through spike recovery for the LC-MS/MS method (94 % ± 5.5 %) was comparable to that reported for the GC/MS method (96 % ± 1.9 %, [19]) and to other LC-MS/MS methods [10–18]. Our LC-MS/MS method also achieved a good extraction efficiency of 81 %, as assessed from post-extraction addition of ISTD.

The LC-MS/MS method sensitivity, as expressed by the LOD (22.1 nmol/L), was improved compared with our previous GC/MS method (50 nmol/L) and commensurate or better compared with other LC-MS/MS methods [11, 13–15]. The use of MRM in our LC-MS/MS method allowed us to monitor a cumulative signal for quantitation (sum of quantitation and confirmation ion areas). This produced a notable increase in signal, while keeping the background approximately the same. The confirmation ion area contributed ~60 %–85 % of the quantitation ion area. Thus, MRM helped to improve our method detection limit. The improved sensitivity of our new LC-MS/MS method and the small sample volume requirement (75 μL) allowed the accurate measurement of MMA at normal concentrations encountered in population surveys. Low-volume samples (<100 μL) that require confirmation can be diluted with deionized water and reanalyzed.

Plasma can form micro-fibrinogen clots during long-term storage and, therefore, serum is generally preferred. After we adjusted for the dilution in the plasma citrate tube, we found interchangeable results among the matched serum (plain and serum separators tubes) and plasma (EDTA, Na heparin, and citrate) samples (n = 14) with a mean difference of ±3 % (Fig. 3). This was generally consistent with other reports [11, 18]; however, Kushnir et al. reported an interference with the citrate tube, causing a significant increase of the confirmation ion ratio [11].

Fig. 3 Effect of specimen type and anticoagulant on MMA. Fourteen matched serum and plasma samples were analyzed with the LC-MS/MS method. Error bars represent standard error. Because Na citrate was a liquid (0.5 mL/5 mL vacutainer tube) anticoagulant, MMA results were multiplied by 1.1 to correct for the dilution.

We assessed the short-term stability of MMA in serum and in extracted samples, the mid- and long-term stability of MMA in refrigerated and/or frozen serum, and the stability of MMA during repeated freeze/thaw of serum. We confirmed previous reports [16, 18] that MMA was stable for up to three freeze/thaw cycles and for up to 1 wk of serum storage at room temperature (data not shown). Given these results, it was not surprising that the short-term stability of MMA in extracted samples (3 d at 7 °C and 1 wk at –20 °C) was also very good (±3 % and ±1 %, respectively, from the original value). Furthermore, storage of serum at –20 °C for 1 y also showed excellent stability with no significant change in MMA concentrations from baseline [mean percent difference (95 % CI) among the three serum QC pools: 0.98 % (–1.46 % to 3.47 %), P = 0.42]. Storage of serum at 5 °C for 1 y showed a small but significant increase in MMA [6.59 % (2.19 % to 11.2 %), P = 0.0048], possibly due to slight sample evaporation. However, the 6 mo time point for storage at 5 °C showed no significant changes. This excellent MMA stability makes it possible to ship serum at ambient temperature to laboratories around the world participating in external quality assessment schemes. Lastly, the long-term (up to 14 y) stability of frozen (≤ –70 °C) plasma QC pools that were originally value-assigned by our GC/MS method was also excellent with MMA concentrations being within ±1 % of the original target values.

The second objective of this report was to compare our new LC-MS/MS method with the previous GC/MS method so as to provide continuity in interpreting NHANES data. The mean ± SD MMA concentration in the final set of 326 residual NHANES samples was 236 ± 191 nmol/L by LC-MS/MS compared with 235 ± 186 nmol/L by GC/MS. The two methods showed excellent correlation (Pearson r = 0.99). Deming regression analysis showed a significant but small deviation from a slope of 1 and an intercept of 0, and Bland-Altman analysis showed a nonsignificant bias (Table 3 and Fig. 4). Therefore, no adjustments will be needed to compare the NHANES 1999–2004 GC/MS data with the new NHANES 2011–2014 LC-MS/MS data. Using a much smaller sample set from a previously conducted international round-robin (n = 19, concentration range: 70–10500 nmol/L, [24]), we also found excellent correlation (Pearson r = 1.0) and no bias between our LC-MS/MS method and the earlier generated GC/MS or consensus results from 13 laboratories (Table 3).

Table 3 Comparison of MMA results in serum and plasma samples obtained by different methods

Fig. 4 Comparison of LC-MS/MS to GC/MS results obtained for MMA. Method comparison consisted of 326 residual plasma samples. Deming regression (a). Bland-Altman plot of differences (b)

At present, there is no serum-based reference material available with certified MMA concentrations. As an additional accuracy evaluation of the new LC-MS/MS method and as part of our third objective, we compared our results by both the LC-MS/MS and the GC/MS method with those obtained by NIST on a future NIST SRM material. SRM 1950 “Metabolites in Human Plasma” is the first reference material that has been characterized for MMA by the NIST GC/MS and CDC GC/MS and LC-MS/MS measurement procedures. The mean±SD (ng/mL) MMA concentrations obtained by the three procedures were virtually identical: CDC LC-MS/MS, 18.47 ± 0.71 (n = 17); CDC GC/MS, 18.18 ± 0.67 (n = 11); NIST GC/MS, 18.41 ± 0.70 (n = 15). The variability (CV) of the three procedures was also very comparable: 3.8 %, 3.7 %, and 3.8 %, respectively.
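The reported CVs can be verified from the means and SDs, and the ng/mL values can be placed on the nmol/L scale used elsewhere in this report. The Python sketch below does both; the molar mass of MMA (118.09 g/mol, C4H6O4) is not stated in the text and is supplied here as an outside value.

```python
MMA_MOLAR_MASS = 118.09  # g/mol for C4H6O4; not given in the text

def cv_percent(mean, sd):
    """Coefficient of variation in percent."""
    return 100.0 * sd / mean

def ng_per_ml_to_nmol_per_l(conc_ng_per_ml):
    """1 ng/mL = 1 ug/L; dividing by g/mol gives umol/L, times 1000 gives nmol/L."""
    return conc_ng_per_ml * 1000.0 / MMA_MOLAR_MASS

# Mean and SD (ng/mL) reported for SRM 1950 by each procedure
results = {
    "CDC LC-MS/MS": (18.47, 0.71),
    "CDC GC/MS": (18.18, 0.67),
    "NIST GC/MS": (18.41, 0.70),
}

summary = {
    name: {
        "cv_percent": round(cv_percent(mean, sd), 1),
        "mean_nmol_per_l": round(ng_per_ml_to_nmol_per_l(mean), 1),
    }
    for name, (mean, sd) in results.items()
}
```

On this scale all three means fall at roughly 154–156 nmol/L, within the concentration range covered by the calibration curve and the QC pools.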

In summary, with this new LC-MS/MS method, we have implemented important features for methods used in large population surveys: low sample volume and high throughput, high accuracy, low imprecision, and good sensitivity. The long-term stability of the method's performance remains to be established. So far, however, intermittent use has not posed any problems: the method was brought back online without difficulty after pauses of several months. The excellent comparability and bias-free agreement between this method and the NIST GC/MS measurement procedure as well as our previous GC/MS method will facilitate future time trend analyses using NHANES data.