Introduction

Soy [Glycine max (L.) Merr.], a member of the pea family (Fabaceae), has been a common component in Asian diets for thousands of years [1]. Soy products are also found in modern American diets as both foods and food additives. Soybeans, the high-protein seeds of soy, can be cooked and eaten, or used to make other products such as tofu, tempeh, soy milk, and soy sauce. Soy is also used as an additive in various processed foods to enhance the texture, flavor, or nutritional content, and is commonly used as a vegetarian or non-dairy alternative (e.g., soy infant formula, soy yogurt, veggie burgers) to conventional products. Soybeans are a source of complete protein, as well as dietary fiber, calcium, iron, manganese, phosphorus, magnesium, zinc, potassium, thiamine, riboflavin, folate, vitamin C, and vitamin K [2].

In addition to high protein and nutrient content, soy also contains isoflavones, compounds similar to the female hormone estrogen. Isoflavones may be present in soy foods and supplements as aglycones (daidzein, genistein, and glycitein), as glycosides (daidzin, genistin, and glycitin), or as the malonyl- and acetyl-glycosides. After consumption of soy, all isoflavone forms are hydrolyzed in the gut and the aglycone forms are further metabolized or rapidly absorbed [3, 4]. Soy products have been used in traditional medicine to relieve menopausal symptoms, memory problems, high blood pressure, and high cholesterol levels, as well as for the prevention of osteoporosis, breast cancer, and prostate cancer [1]. These health benefits are often attributed to the activity of the isoflavones, but in clinical research, the findings are inconclusive [5–11]. Research suggests that daily intake of high levels of soy protein (over half of the daily protein intake) may slightly lower levels of low-density lipoprotein (LDL cholesterol), but the findings favored soy protein compared to soy isoflavones as the source of the health benefit [5, 6]. The overall cardiovascular health benefit of soy supplementation was deemed minimal at best, with no evidence of improvement in levels of high-density lipoprotein (HDL cholesterol), triglycerides, lipoprotein(a), or in lowering blood pressure [5]. Research also suggests that women consuming moderate amounts of soy throughout their lives have slightly lower breast cancer risk as well as lower risk of cancer recurrence compared to women who do not consume soy [7]. The inability to link soy isoflavones to health outcomes, however, may be an artifact of the inability to accurately determine the doses of soy isoflavones being administered to patients as part of a normal diet and/or as part of a focused clinical trial.

Despite the inconclusive clinical evidence, soy has been marketed as a dietary supplement in recent years, in forms such as tablets and capsules and containing soy protein, isoflavones, or both. In addition, the National Institutes of Health (NIH) Office of Dietary Supplements (ODS) and National Center for Complementary and Integrative Health (NCCIH) have promoted and supported research related to soy supplementation and its effect on human health [1]. The ability to link soy isoflavones to health outcomes relies heavily on the accurate determination of the doses of soy isoflavones being administered to patients. Two Official Methods of Analysis have been developed by AOAC INTERNATIONAL for isoflavones in soy; AOAC Official Method 2001.10 is based on extraction, base hydrolysis, and liquid chromatography (LC) with absorbance detection for determination of isoflavone glycosides and aglycones [12], while AOAC Official Method 2008.03 utilizes only extraction and LC-absorbance to quantify isoflavone glycosides and aglycones, as well as malonyl- and acetyl-glycosides [13]. While these methods have been performance tested and peer evaluated, each has drawbacks. AOAC 2001.10 relies on LC-absorbance without an internal standard for quantification [12]. AOAC 2008.03 describes quantification of malonyl- and acetyl-glycosides using response factors calculated using reference standards for the glycosides, as pure reference standards for the malonyl- and acetyl-glycosides have limited availability [13]. Neither method includes rigorous purity evaluation of the reference standards, which may lead to significant bias of quantitative results. In addition, neither of the AOAC Official Methods applies to the analysis of isoflavones in other dietary supplements, such as red clover (Trifolium pratense L.) or kudzu (Pueraria lobata Willd.), which could be useful in direct comparison of isoflavone-containing products from a variety of sources.

Since 2000, a number of reviews have been published describing and categorizing analytical methods for determination of isoflavones in soy [3, 4, 14–16]. Since the latest review, a number of additional papers have described quantitation of isoflavones in foods and dietary supplements, including methods based on capillary zone electrophoresis (CZE) with mass spectrometry (MS) detection [17], supercritical fluid chromatography (SFC) with absorbance detection [18], and gas chromatography (GC) with MS detection [19]. Because CZE, SFC, and GC separations are based on different fundamental properties of the isoflavones, each of these methods provides unique selectivity in their determination. Most methods for determining isoflavones in soy and related products, however, are based on LC, using absorbance detection [20–23], MS detection [23–25], or tandem MS (MS/MS) detection [26]. The LC absorbance methods in the recent literature for soy isoflavones are based on an external standard quantitation approach [20–22], which may lead to biases during the sample preparation, separation, and detection. One method describes an internal standard approach using formononetin [23], which may be naturally present in soy and cause erroneous quantitative results in some products. Of the MS-based publications reporting quantitation of soy isoflavones, only two report the use of an internal standard [25, 26], and only one approach describes the use of isotopically labeled analogues of isoflavones [26]. Burdette et al. [25] describe the use of 7-hydroxy-4-chromone as an internal standard for isoflavone determination by LC with a particle beam interface to allow electron ionization prior to MS (LC-PB/EIMS), to facilitate ease of interpretation in fragmentation pattern recognition and spectral library matching. A more widely applicable approach was described by Clarke et al. [26] via the use of isotopically labeled analogues of daidzein, glycitein, genistein, formononetin, and biochanin A.
The use of isotopically labeled analogues as internal standards has been described previously as the most desirable approach for obtaining unbiased quantitative results [27], using the example of vitamin analysis. Despite the depth and breadth of literature covering the analysis of soy products for isoflavones, additional work is needed toward validation of quantitative LC-based methods.

To facilitate accurate labeling of soy dietary supplement products, the National Institute of Standards and Technology (NIST) has partnered with NIH-ODS to develop two independent analytical methods for soy isoflavones as well as a suite of five Standard Reference Materials (SRMs) for soy foods and dietary supplements. The two methods are based on solvent extraction of the isoflavones from the soy matrices, followed by basic hydrolysis of malonyl- and acetyl-glycosides. The extracts are then analyzed by LC with absorbance detection at 254 nm or by isotope dilution (ID) LC-MS in selected ion monitoring mode. The results obtained by these two methods for isoflavones in SRM 3234 Soy Flour, SRM 3236 Soy Protein Isolate, SRM 3237 Soy Protein Concentrate, and SRM 3238 Soy-Containing Solid Oral Dosage Form were directly comparable, and these materials can be used for quality control in the food and dietary supplement industry. For SRM 3235 Soy Milk, inhomogeneity was observed and the material was determined to be unsuitable as a control material for isoflavones.

Materials and methods

Certain commercial equipment, instruments, or material are identified in this report to specify adequately the experimental procedure. Such identification does not imply recommendation or endorsement by the National Institute of Standards and Technology, nor does it imply that the materials or equipment identified are necessarily the best available for the purpose.

Chemicals

Daidzin, genistin, daidzein, and genistein standards used for LC-absorbance were obtained from Phytolab (Vestenbergsgreuth, Bavaria, Germany) via Cerilliant Corporation (Round Rock, TX, USA). The glycitin standard used for LC-absorbance was obtained from Sigma (St. Louis, MO, USA) and from Blaze Science Industries (BSI, Lawndale, CA, USA). Glycitein and sissotrin standards used for LC-absorbance were obtained from Indofine Chemical Company (Hillsborough, NJ, USA). Daidzin, glycitin, genistin, daidzein, glycitein, and genistein standards used for LC-MS were obtained from BSI. [13C6]-daidzin, [13C6]-glycitin, [13C6]-genistin, [13C6]-daidzein, [13C6]-glycitein, and [13C6]-genistein used for ID-LC-MS were obtained from IsoSciences (King of Prussia, PA, USA). Sodium hydroxide and acetic acid used in the hydrolysis were reagent grade and were obtained from Sigma. Dimethyl sulfoxide (DMSO) used in calibrant preparation and ammonium acetate used in mobile phase preparation were also obtained from Sigma. Water and methanol used in the extraction and used to prepare LC mobile phases were high performance LC (HPLC) grade from J&H Berge (South Plainfield, NJ, USA).

Purity of reference standards by q1H-NMRIS

The mass purity (g/g) of each reference standard was assessed using quantitative proton nuclear magnetic resonance with an internal standard approach (q1H-NMRIS). With this precise ratio method, the primary chemical component of each material was measured directly to determine its amount of substance, and information supporting chemical identification was obtained. Three samples of each reference standard were dissolved with a neat internal standard material of known chemical purity (maleic acid or dimethylmalonic acid). 1H-NMR spectra were acquired using an Avance 600 MHz spectrometer (Bruker BioSpin, Billerica, MA, USA) equipped with a 5-mm broadband inverse (BBI) detection probe. All quantitative one-dimensional 1H-NMR analyses were performed at 300 K using 90° excitation pulses and GARP composite pulse decoupling [28] to mitigate 13C-spin splitting effects. For each sample, 64 scans were performed with a recycle delay of 60 s to ensure reliable quantitative peak integrations. Phase adjustments, baseline corrections, and signal integrations were performed manually during processing of all Fourier-transformed 1H-NMR spectra.

Samples

Soy samples were SRMs produced and distributed by NIST (Gaithersburg, MD, USA). Soy materials included SRM 3234 Soy Flour (defatted soy flour prepared by a food ingredient manufacturer), SRM 3235 Soy Milk (commercial product prepared by the manufacturer), SRM 3236 Soy Protein Isolate and SRM 3237 Soy Protein Concentrate (commercial products prepared by a manufacturer of food and agricultural products), and SRM 3238 Soy-Containing Solid Oral Dosage Form (combination of several common commercial products). Powders were stored at room temperature; SRM 3235 Soy Milk was stored under refrigeration.

Sample preparation for LC-absorbance analysis

The sample preparation procedure described in AOAC 2001.10 [12] was evaluated and adapted as necessary for each soy material, and the general protocol is described. Ten to twelve units (bottles, ampoules, or packets) of each soy material were selected for analysis, and duplicate subsamples were taken from each unit. The contents of each unit were shaken or mixed to distribute the contents. A sample of material (100 mg to 200 mg) was weighed into a 15 mL polypropylene centrifuge tube and an appropriate volume of sissotrin internal standard solution [prepared in 80:20 methanol:water (volume fractions) containing ≈ 10 drops of DMSO] was added to match the expected level of analytes in each sample. Sissotrin was chosen as an internal standard for LC-absorbance analysis as an isoflavone similar in structure to the isoflavones of interest, but its presence has not been reported in soy products. The extraction solvent (80:20 methanol:water, volume fraction) was added to bring the total solution volume to 5 mL and the contents were mixed well. Specific details outlining sample preparation for each material are provided in the Electronic Supplementary Material (ESM) (Table S1). The tubes were placed in an ultrasonic bath for 15 min without heating, then centrifuged at 3000 rpm (314 rad/s) for 10 min. Approximately 1 mL of the supernatant was transferred to a clean 15 mL polypropylene centrifuge tube and 75 μL of a 2 mol/L aqueous sodium hydroxide solution was added. The tubes were placed in an ultrasonic bath for 15 min without heating to hydrolyze malonyl and acetyl glycosides of daidzin, glycitin, and genistin. Excess base was neutralized by addition of 25 μL of glacial acetic acid and the sample extract was transferred to an autosampler vial for analysis by LC-absorbance. 
Extracts of SRM 3234 Soy Flour, SRM 3236 Soy Protein Isolate, and SRM 3238 Soy-Containing Solid Oral Dosage Form were diluted 15-fold, 3-fold, and 30-fold, respectively, with 80:20 methanol:water (volume fraction) prior to LC-absorbance analysis to ensure that the isoflavone peak areas were within the calibration range.

Sample preparation for ID-LC-MS analysis

Six to twelve units (bottles, ampoules, or packets), different from those used for LC-absorbance determinations, of each soy material were selected for analysis, and duplicate subsamples were taken from each unit. The contents of each unit were shaken or mixed to distribute the contents. A sample of material (50 mg to 700 mg) was weighed into a 15 mL polypropylene centrifuge tube and an appropriate volume of isotopically labeled internal standard solution [prepared in 80:20 methanol:water (volume fractions)] was added to match the expected level of analytes in each sample. The extraction solvent (80:20 methanol:water, volume fraction) was added to bring the total solution volume to 2.5 mL and the contents were mixed well. Specific details outlining sample preparation for each material are provided in the ESM (Table S1). The tubes were placed in an ultrasonic bath for 60 min without heating. Samples were then vortex mixed for 10 s and 190 μL of a 2 mol/L aqueous sodium hydroxide solution was added. The samples were vortex mixed for 10 s and placed in an ultrasonic bath for 10 min without heating to hydrolyze malonyl and acetyl glycosides of daidzin, glycitin, and genistin. Excess base was neutralized by addition of 65 μL of glacial acetic acid. After 10 s of vortex mixing, an additional 4.2 mL of 80:20 methanol:water (volume fraction) was added for a total volume of 7 mL. The tubes were mixed by vortex for 10 s then centrifuged at 6000 rpm (628 rad/s) for 5 min. An aliquot of the supernatant was transferred to an autosampler vial for analysis by ID-LC-MS. Extracts of SRM 3234 Soy Flour and SRM 3238 Soy-Containing Solid Oral Dosage Form were diluted 2-fold and 10-fold, respectively, with 80:20 methanol:water (volume fraction) prior to ID-LC-MS analysis to ensure that the isoflavone peak areas were within the calibration range.
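The unit conversions and volume bookkeeping in the two preparation protocols can be checked with a few lines of Python. This is only an illustrative sketch: the numbers are taken from the text, the function and variable names are ours, and the volumes are assumed to be additive (methanol:water mixtures contract slightly on mixing, so the true total is approximate).

```python
import math


def rpm_to_rad_per_s(rpm: float) -> float:
    """Convert a rotational speed from revolutions per minute to rad/s."""
    return rpm * 2.0 * math.pi / 60.0


# Volumes (mL) added during the ID-LC-MS preparation, as stated in the text.
# Assumption: volumes are simply additive.
id_lc_ms_volumes_ml = {
    "extraction volume (sample, internal standard, solvent)": 2.5,
    "2 mol/L sodium hydroxide": 0.190,
    "glacial acetic acid": 0.065,
    "additional 80:20 methanol:water": 4.2,
}
total_volume_ml = sum(id_lc_ms_volumes_ml.values())

print(f"3000 rpm = {rpm_to_rad_per_s(3000):.0f} rad/s")
print(f"6000 rpm = {rpm_to_rad_per_s(6000):.0f} rad/s")
print(f"total ID-LC-MS extract volume = {total_volume_ml:.3f} mL")
```

The conversions reproduce the stated 314 rad/s and 628 rad/s, and the summed volume (6.955 mL) is consistent with the stated total of 7 mL.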

LC-absorbance analysis

Samples and standards were analyzed by using an UltiMate 3000 LC (Thermo Fisher Scientific, Waltham, MA, USA) with a photodiode array detector. An Ascentis Express RP-Amide column (150 mm × 4.6 mm i.d., 2.7 μm particles) from Supelco (Bellefonte, PA, USA) was used for the analyses with the corresponding guard cartridge. Mobile phase A consisted of 5 mmol/L ammonium acetate in water adjusted to pH 4.7 with acetic acid, and mobile phase B was acetonitrile. Gradient elution was used from 10 %B to 60 %B over 20 min, followed by a 10 min wash at 90 %B and a 5 min reequilibration at the initial conditions. For analysis of SRM 3237 Soy Protein Concentrate, the gradient was modified slightly to resolve matrix components from isoflavones; gradient elution began after a 5 min hold at 10 %B, from 10 %B to 60 %B over 20 min, followed by a 10 min wash at 90 %B and a 5 min reequilibration at the initial conditions. The flow rate for all separations was 1.2 mL/min and the column temperature was maintained at 35 °C. The autosampler temperature was maintained at 10 °C and a 5.0 μL injection volume was used for all standards and samples. Quantitation was performed using the absorbance response (as peak area) at 254 nm for isoflavones in all standards and samples with respect to the internal standard sissotrin.
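The two gradient programs can be written down as breakpoint tables and interpolated to give the mobile phase composition at any time. The sketch below is illustrative only: the text does not say how quickly the system steps to and from the 90 %B wash, so the steps are assumed to be instantaneous, and all names are ours.

```python
from bisect import bisect_right

# Breakpoints as (time in min, %B). A step change is modeled as two points
# at the same time (assumption: the change to and from 90 %B is instantaneous).
STANDARD_GRADIENT = [(0, 10), (20, 60), (20, 90), (30, 90), (30, 10), (35, 10)]
SRM_3237_GRADIENT = [(0, 10), (5, 10), (25, 60), (25, 90), (35, 90), (35, 10), (40, 10)]


def percent_b(program, t_min):
    """Return %B at time t_min by linear interpolation between breakpoints."""
    if t_min < program[0][0] or t_min > program[-1][0]:
        raise ValueError("time is outside the gradient program")
    times = [point[0] for point in program]
    i = bisect_right(times, t_min)
    if i == len(program):  # exactly at the end of the run
        return program[-1][1]
    (t0, b0), (t1, b1) = program[i - 1], program[i]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


print(percent_b(STANDARD_GRADIENT, 10))   # midpoint of the 10 %B to 60 %B ramp
print(percent_b(SRM_3237_GRADIENT, 15))   # same point, shifted by the 5 min hold
print(STANDARD_GRADIENT[-1][0], SRM_3237_GRADIENT[-1][0])  # total run times, min
```

Under these assumptions the standard program runs for 35 min and the SRM 3237 program for 40 min per injection.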

Isoflavone concentrations in each of the soy samples were bracketed with five calibration solutions of nominally the same concentration, independently prepared in 80:20 methanol:water (volume fractions) containing ≈ 10 drops of DMSO. Details of calibration solution preparation are outlined in the ESM (Table S2). The stock solutions of each isoflavone and the internal standard were gravimetrically mixed in appropriate ratios to reflect the concentration of each compound in each sample after extraction for determination of response factors (described in ESM, Table S3). All solutions were stored in the freezer (−20 °C) when not in use. Each calibration solution was injected at least four times, and the response factor was calculated for each injection. An average response factor was used for the calculation of the concentration of each isoflavone in each sample.
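The response-factor calculation can be sketched as follows. The exact definition used in this work is given in the ESM, so the form below is an assumption: it is one common internal standard convention (area ratio divided by mass ratio). All function names and numerical values are hypothetical and serve only to show the arithmetic, including a correction of the calibrant mass for purity.

```python
from statistics import mean


def response_factor(area_analyte, area_is, mass_analyte_mg, mass_is_mg, purity=1.0):
    """Relative response factor from one calibrant injection.

    The weighed analyte mass is corrected for purity (mass fraction, g/g).
    """
    return (area_analyte / area_is) / (mass_analyte_mg * purity / mass_is_mg)


def mass_fraction_mg_per_g(area_analyte, area_is, mass_is_mg, rf, sample_mass_g):
    """Analyte mass fraction in a sample spiked with a known mass of internal standard."""
    analyte_mg = (area_analyte / area_is) * mass_is_mg / rf
    return analyte_mg / sample_mass_g


# Hypothetical peak areas (analyte, internal standard) for four injections
# of a single calibration solution.
calibrant_injections = [(200.0, 100.0), (202.0, 100.0), (198.0, 100.0), (200.0, 100.0)]
rfs = [
    response_factor(a, s, mass_analyte_mg=0.100, mass_is_mg=0.050, purity=0.98)
    for a, s in calibrant_injections
]
mean_rf = mean(rfs)

# Hypothetical sample: 0.150 g of material spiked with 0.050 mg of internal standard.
w = mass_fraction_mg_per_g(150.0, 100.0, mass_is_mg=0.050, rf=mean_rf, sample_mass_g=0.150)
print(f"mean response factor = {mean_rf:.4f}")
print(f"mass fraction = {w:.3f} mg/g")
```

Because the analyte and internal standard areas enter only as a ratio, injection-volume variation and losses that affect both compounds equally cancel, which is the motivation for the internal standard approach.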

ID-LC-MS analysis

Samples and standards were analyzed by using an Agilent 1100 Series LC (Agilent Technologies, Palo Alto, CA, USA) equipped with an SL Series MS with electrospray ionization (ESI) in the positive ion mode. To provide orthogonality with the LC-absorbance approach already described, a Zorbax SB-CN column (250 mm × 4.6 mm i.d., 5 μm particles) from Agilent Technologies was used for the analyses without a guard cartridge. Mobile phase A consisted of 0.1 % formic acid in water, and mobile phase B was 0.1 % formic acid in acetonitrile (volume fractions). Gradient elution was from 30 %B to 55 %B over 35 min, followed by a 5 min hold at 55 %B and a 7 min reequilibration at the initial conditions. The flow rate for all separations was maintained at 1.0 mL/min and the column temperature was not controlled. The autosampler temperature was maintained at 4 °C and a 10 μL injection volume was used for all standards and samples. Quantitation was performed by IDMS in selected ion monitoring (SIM) mode using the following mass-to-charge ratios (m/z): daidzin (m/z 417), [13C6]-daidzin (m/z 423), genistin (m/z 433), [13C6]-genistin (m/z 439), glycitin (m/z 447), [13C6]-glycitin (m/z 453), daidzein (m/z 255), [13C6]-daidzein (m/z 261), genistein (m/z 271), [13C6]-genistein (m/z 277), glycitein (m/z 285), and [13C6]-glycitein (m/z 291). The mass spectrometer was operated at a nebulizer pressure of 380 kPa (55 psi), a drying gas temperature of 350 °C, a drying gas flow rate of 12 L/min, a capillary voltage of 3500 V, and a fragmentor voltage of 130 V.
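The monitored m/z values can be cross-checked against the molecular formulas of the six isoflavones. The sketch below assumes that the SIM ions are the protonated molecules [M+H]+ (the text states only that positive ion ESI was used) and that each labeled analogue carries six 13C atoms. The molecular formulas and atomic masses are standard values, not taken from the text.

```python
# Monoisotopic atomic masses (u), the proton mass, and the 13C - 12C mass difference.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}
PROTON = 1.00727647
C13_SHIFT = 1.00335484

# Molecular formulas of the six soy isoflavones.
FORMULAS = {
    "daidzin": {"C": 21, "H": 20, "O": 9},
    "genistin": {"C": 21, "H": 20, "O": 10},
    "glycitin": {"C": 22, "H": 22, "O": 10},
    "daidzein": {"C": 15, "H": 10, "O": 4},
    "genistein": {"C": 15, "H": 10, "O": 5},
    "glycitein": {"C": 16, "H": 12, "O": 5},
}


def mz_protonated(formula, n_13c=0):
    """m/z of the singly charged [M+H]+ ion, optionally with n_13c 13C labels."""
    neutral = sum(MONOISOTOPIC[element] * count for element, count in formula.items())
    return neutral + n_13c * C13_SHIFT + PROTON


for name, formula in FORMULAS.items():
    native = mz_protonated(formula)
    labeled = mz_protonated(formula, n_13c=6)
    print(f"{name:10s} [M+H]+ = {native:8.3f}   [13C6] analogue = {labeled:8.3f}")
```

Rounded to nominal mass, these values match the listed SIM ions, with each labeled internal standard 6 m/z units above its native analyte.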

Isoflavone concentrations in each of the soy samples were bracketed with three calibration solutions of nominally the same concentration, independently prepared in 80:20 methanol:water (volume fractions). Details of calibration solution preparation are outlined in the ESM (Table S2). The stock solutions of each isoflavone and the internal standards were gravimetrically mixed in appropriate ratios to reflect the concentration of each compound in each sample after extraction for determination of response factors (described in ESM, Tables S3 and S4). All solutions were stored in the freezer (−20 °C) when not in use. Each calibration solution was injected at least five times, and the response factor was calculated for each injection. An average response factor was used for calculation of the concentration of each isoflavone in each sample.

Value assignment

For the final value assignment of the isoflavones in the four SRMs, the mean from the combination of the mean results from each approach was used. The stated uncertainty of each value is an expanded uncertainty interval (U) about the mean that covers the measurand with approximately 95 % confidence. The expanded uncertainty is calculated as U = k u_c, where the combined standard uncertainty (u_c), consistent with the ISO Guide and its Supplement 1, is derived from the observed difference between corresponding results from the independent methods and the respective pooled uncertainties, as well as an uncertainty component related to moisture correction, and k is a coverage factor corresponding to approximately 95 % confidence [29–31].
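A rough sketch of how two method means might be combined into a value with an expanded uncertainty is given below. It is not the procedure used for the SRM certificates, whose details are in the cited references. Here the between-method component is assumed to follow a rectangular distribution spanning the two method means, the within-method components are combined as for an unweighted average, and k = 2 is assumed. The method means are the genistin results for SRM 3238 quoted in this paper; the standard uncertainties are hypothetical.

```python
import math


def combine_two_methods(mean_a, u_a, mean_b, u_b, u_moisture=0.0, k=2.0):
    """Return (value, expanded uncertainty U = k * u_c) from two method means.

    Assumptions (illustrative only):
      - the assigned value is the unweighted mean of the two method means;
      - the between-method component is a rectangular distribution whose
        full width equals the difference between the method means;
      - all components are independent and combined in quadrature.
    """
    value = (mean_a + mean_b) / 2.0
    u_between = abs(mean_a - mean_b) / (2.0 * math.sqrt(3.0))
    u_within = math.sqrt(u_a**2 + u_b**2) / 2.0
    u_c = math.sqrt(u_between**2 + u_within**2 + u_moisture**2)
    return value, k * u_c


# Method means (mg/g) for genistin in SRM 3238 as quoted in the text;
# the standard uncertainties 0.2 and 0.1 are hypothetical.
value, expanded_u = combine_two_methods(12.5, 0.2, 12.9, 0.1)
print(f"assigned value = {value:.2f} mg/g, U = {expanded_u:.2f} mg/g (k = 2)")
```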

Results and discussion

The determination of isoflavones in soy foods and dietary supplements has been described in the literature [3, 4, 12–26]. Many of the quantitative methods that report values of isoflavones in foods and supplements have significant drawbacks, including potential sources of bias arising from incomplete characterization of calibration standards, improper choice or use of internal standards (or lack thereof), insufficient extraction from the soy matrix, lack of consideration of the various isoflavone glycoforms (and extent of hydrolysis), or inadequate separation of the isoflavones from one another or interfering matrix components. The methods described herein have thoroughly evaluated these potential biases and have been used to characterize four soy SRMs that can be used by laboratories for future method evaluation and validation.

Purity of reference standards by q1H-NMRIS

qNMR with an internal standard is a primary ratio, direct measurement technique that is used for traceable chemical purity assessments [32–40]. Resonant proton responses of the primary chemical species of interest within a neat material are compared to those of an internal standard of known mass purity. This ratio, together with the carefully determined mass ratio of neat chemical and internal standard materials within the NMR sample, is used to evaluate the molar amount of the primary species with respect to the mass of the aggregate material. From this, the mass fraction purity may be derived using the molecular weight of the primary species. The NMR experiment must be performed with sufficient recycle delay times for total T1 relaxation of all 1H structural moieties, in order to ensure quantitatively consistent 1H responses for all resonances of the sample.
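The relationship described above is commonly written as the following measurement equation. The symbols are introduced here for illustration and are not taken from the original text: I is an integrated signal area, N the number of protons contributing to that signal, M the molar mass, m the weighed mass, and P the mass fraction purity, with subscripts a and IS denoting the analyte and the internal standard.

```latex
P_{\mathrm{a}} =
  \frac{I_{\mathrm{a}}}{I_{\mathrm{IS}}}
  \cdot \frac{N_{\mathrm{IS}}}{N_{\mathrm{a}}}
  \cdot \frac{M_{\mathrm{a}}}{M_{\mathrm{IS}}}
  \cdot \frac{m_{\mathrm{IS}}}{m_{\mathrm{a}}}
  \cdot P_{\mathrm{IS}}
```

The first two factors give the molar ratio of analyte to internal standard, the molar mass ratio converts it to a mass ratio, and the last two factors relate that to the weighed masses and the known purity of the internal standard.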

1H-NMR signals within the spectral region ≈ 6.3 ppm to ≈ 9.5 ppm, most of which correspond to aromatic proton resonances, were integrated for quantification of the respective primary chemical component in each isoflavone reference standard. Internal standard peaks at 6.2 ppm or 1.3 ppm were integrated for maleic acid or dimethylmalonic acid, respectively. The purity results for each material (n = 3), derived using Monte Carlo calculation modeled to the qNMR measurement equation, are summarized in Table 1 along with respective manufacturer-stated purities and water contents. Most purities were determined to be within 2 % of the corresponding manufacturer-stated purity, and most were lower than the stated value. All calibrant concentrations were adjusted using the q1H-NMRIS purity results. The uncertainty of the purity determination was not propagated for value assignment, as the variability between the two LC methods was large enough to render these small uncertainties (0.5 % or less) insignificant.
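A Monte Carlo evaluation of this kind can be sketched in a few lines. The example below is not the calculation used in this work. It propagates hypothetical input uncertainties through the commonly used qNMR relationship (purity equals the integral ratio, times the inverse proton-count ratio, times the molar mass ratio, times the inverse mass ratio, times the internal standard purity), using a single genistein proton against the two olefinic protons of maleic acid. All input values and uncertainties are invented for illustration; only the molar masses are standard values.

```python
import random
from statistics import mean, stdev


def qnmr_purity(i_a, i_is, n_a, n_is, mw_a, mw_is, m_a, m_is, p_is):
    """Mass fraction purity of the analyte from the qNMR measurement equation."""
    return (i_a / i_is) * (n_is / n_a) * (mw_a / mw_is) * (m_is / m_a) * p_is


def monte_carlo_purity(n_draws=20000, seed=1):
    """Draw inputs from normal distributions and return (mean, standard deviation)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    draws = []
    for _ in range(n_draws):
        draws.append(
            qnmr_purity(
                i_a=rng.gauss(0.8427, 0.0020),  # analyte integral (hypothetical)
                i_is=2.0,                       # internal standard integral, normalized
                n_a=1,                          # one analyte proton integrated
                n_is=2,                         # two olefinic protons of maleic acid
                mw_a=270.24,                    # genistein, g/mol
                mw_is=116.07,                   # maleic acid, g/mol
                m_a=rng.gauss(10.00, 0.01),     # analyte mass, mg (hypothetical)
                m_is=rng.gauss(5.00, 0.01),     # internal standard mass, mg (hypothetical)
                p_is=rng.gauss(0.999, 0.0005),  # internal standard purity (hypothetical)
            )
        )
    return mean(draws), stdev(draws)


purity_mean, purity_sd = monte_carlo_purity()
print(f"purity = {purity_mean:.4f} g/g, standard uncertainty = {purity_sd:.4f} g/g")
```

The standard deviation of the simulated purities serves as the standard uncertainty of the result without requiring an analytical propagation of each input term.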

Table 1 Summary of results for the quantitative purity (mass %) assessment of isoflavones by q1H-NMRIS

Sample preparation for LC-absorbance analysis

The sample preparation procedure described in AOAC Official Method 2001.10 [12] was modified as necessary to provide exhaustive extraction of the isoflavones in each soy material. Conditions of extraction temperature, extraction time, the number of sequential extraction processes, and hydrolysis time were investigated and evaluated based on overall extraction yield compared with measurement precision. An example of the optimization of the extraction of genistin from SRM 3236 Soy Protein Isolate is demonstrated in Fig. 1, and will be discussed in detail to illustrate the process used for all soy materials. Details of extraction optimization for the other isoflavones in SRM 3236 Soy Protein Isolate and also in the other soy SRMs can be found in Figures S1-S19 in the ESM.

Fig. 1

Results of extraction and hydrolysis optimization for genistin in SRM 3236 Soy Protein Isolate. Number of extractions: one (blue), two (red), and three (green) with sonication for 15 min (no heat) and 15 min hydrolysis. Extraction temperature: sonication for 30 min at 35 °C (no heat, blue) and 65 °C (red), each with 15 min hydrolysis. Extraction time: 0 min (vortexing only, blue) and sonication (no heat) for 5 min (red), 15 min (green), 30 min (purple), 60 min (aqua), and 120 min (orange), each with 15 min hydrolysis. Hydrolysis time: Native (without addition of base) after 0 h (blue), 24 h (red), and 48 h (green) and using sonication (no heat) with 2 mol/L NaOH for 15 min (purple), 30 min (aqua), and 60 min (orange). Error bars represent the standard deviation of two measurements

First, increasing recovery by increasing the number of extraction cycles was investigated by combining sequential extracts obtained by sonication for 15 min with no added heat. After sonication, the samples were centrifuged, hydrolyzed by sonication with sodium hydroxide for 15 min, and a small aliquot (100 μL to 150 μL) of the supernatant was removed for LC-absorbance analysis. The remaining supernatant was decanted into a clean vessel, fresh extraction solvent was added, and the 15 min sonication cycle repeated. Following centrifugation, the supernatant was combined with the supernatant from the first extraction cycle, the solution was mixed well, hydrolyzed by sonication with sodium hydroxide for 15 min, and an aliquot was removed for LC-absorbance analysis. This procedure was repeated for a third extraction cycle, and the mass fractions determined by LC-absorbance for each cycle were compared. As shown in Fig. 1 for genistin in SRM 3236, the second and third extraction cycles did not increase the recovery and in some cases decreased precision of the measurement. A similar trend was observed for the other isoflavones in all soy materials, and an extraction protocol involving a single sonication cycle of 15 min with no added heat was adopted.

The effect of sonication temperature was investigated by conducting the extraction of soy materials without added heat (approximately 35 °C), as well as by controlling the bath temperature at 65 °C. The samples extracted at each temperature were analyzed by LC-absorbance and the resulting mass fractions compared. The elevated temperature caused no significant increase in recovery for any of the isoflavones in any of the soy materials (Fig. 1), and sonication extraction with no added heat was adopted for future sample preparation.

Increasing the extraction time was evaluated as an approach to increase recovery of isoflavones from soy materials. Samples were extracted for 0 min (vortex mixing only) and with sonication (no added heat) for 5 min, 15 min, 30 min, 60 min, and 120 min, and the mass fractions resulting from LC-absorbance analysis were compared. For all of the isoflavones in all matrices, recovery increased up to 15 min sonication, with no further increase at longer times (Fig. 1); therefore, a 15 min sonication time was selected for further experiments.

In soy products, isoflavones are present as the six main compounds discussed previously as well as in the form of malonyl- and acetyl-glycosides of daidzin, genistin, and glycitin. Bioactivity [3, 4], as well as a lack of suitable reference standards, has directed the dietary supplement community to focus on only the three isoflavone aglycones (daidzein, glycitein, and genistein) and their glycosides (daidzin, genistin, and glycitin). To ensure that all of the malonyl- and acetyl-glycosides are hydrolyzed during the sample preparation process, the effectiveness of the hydrolysis procedure was evaluated by a comparison of native degradation and forced basic hydrolysis. An initial measurement of an extracted sample was recorded at 0 h and compared to the same sample allowed to naturally degrade for 24 h and 48 h. These samples were also compared to samples that were hydrolyzed by addition of sodium hydroxide and sonicated for 15 min, 30 min, and 60 min. As shown in Fig. 1, native hydrolysis was not sufficient for recovery of the isoflavone glycosides (daidzin, genistin, and glycitin; data only shown for genistin), as a large fraction of these compounds are bound as malonyl and acetyl esters. The addition of base was required to release the isoflavone glycosides, and sonication for 15 min in the presence of sodium hydroxide yielded the best recoveries. For further experiments, a forced base hydrolysis with a 15 min sonication time was selected.

Sample preparation for ID-LC-MS analysis

The sample preparation procedure used prior to ID-LC-MS analysis was very similar to AOAC Official Method 2001.10 [12] and the method described for LC-absorbance analysis. All samples were extracted by sonication for 60 min without heating, which provides exhaustive extraction of isoflavones from soy matrices based on data from the extraction studies discussed previously. Samples for ID-LC-MS analysis were hydrolyzed directly, without removing an aliquot of extracted supernatant to a clean vessel. Compared to the sample preparation approach used for LC-absorbance, a direct hydrolysis may be advantageous if acetyl- and malonyl-glycosides are not fully extracted into the solvent and instead remain in the soy matrix. The hydrolysis time was slightly shorter for ID-LC-MS analysis (10 min) than for LC-absorbance analysis (15 min). Because the hydrolysis time described in AOAC Official Method 2001.10 is 10 min, the times for both methods should be sufficient [12].

LC-absorbance analysis

Both AOAC Official Methods for determination of isoflavones, AOAC 2001.10 and 2008.03, are based on LC with absorbance detection at 260 nm. The method described here is based on the same principles, but was modified for applicability to determination of isoflavones in P. lobata (kudzu) and T. pratense (red clover) (data not presented here). Kudzu is known to contain daidzin and daidzein, as well as puerarin; formononetin, biochanin A, and coumestrol have been reported in red clover extracts [26]. To provide separation of the six soy isoflavones as well as these four additional isoflavones, a reversed phase column with an embedded amide group was utilized. Example separations for the six isoflavones in the suite of soy materials with detection by absorbance at 254 nm are provided in Fig. 2. To resolve some isoflavones from interfering matrix components, the gradient was expanded slightly in the certification of SRM 3237 Soy Protein Concentrate to include a 5 min isocratic hold at the initial chromatographic conditions. These methods provided baseline resolution of all six isoflavones of interest.

Fig. 2

LC-absorbance separation and detection of isoflavones in (A) SRM 3234 Soy Flour; (B) SRM 3235 Soy Milk; (C) SRM 3236 Soy Protein Isolate; (D) SRM 3238 Soy-Containing Solid Oral Dosage Form; (E) SRM 3237 Soy Protein Concentrate. Separation was achieved using an Ascentis Express RP-Amide column (150 mm × 4.6 mm i.d., 2.7 μm particles), with conditions as described in the text

The mass fractions determined in the four soy samples using these methods are listed in Table 2. In combination with the optimized sample extraction conditions detailed above, 20 values for isoflavones were measured by LC-absorbance in 4 soy SRMs. Isoflavone levels ranged from 0.804 mg/kg (glycitin in SRM 3237 Soy Protein Concentrate) to 12.5 mg/g (genistin in SRM 3238 Soy-Containing Solid Oral Dosage Form), and for nearly all isoflavones in all matrices, the repeatability of the method was good, with relative standard deviations (RSDs) of less than 6 %. Some exceptions included daidzein (9.51 % RSD) and genistein (10.20 % RSD) in SRM 3234 Soy Flour, and glycitin (8.12 % RSD) in SRM 3237 Soy Protein Concentrate. The large variabilities on these mass fractions can be attributed to the lower levels of these components in the soy samples relative to the concentrations of other isoflavones, as the absorbance signal was very low (Fig. 2). While a larger sample size could have been used for extraction to increase the concentration in solution, the signal for other components may move outside of the range of linearity for the calibration range or detector response, or other matrix components may have more significantly interfered in the separation and detection. In the interest of quantification of all isoflavones in a single chromatographic run from a single preparation of each sample, the precision for the low level analytes was sacrificed slightly.
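The repeatability figures quoted above are relative standard deviations, which can be computed as in the short sketch below. The replicate values are hypothetical, and the 6 % figure is used only as the comparison point mentioned in the text, not as a formal acceptance criterion.

```python
from statistics import mean, stdev


def rsd_percent(values):
    """Relative standard deviation (%) using the sample standard deviation."""
    return 100.0 * stdev(values) / mean(values)


# Hypothetical replicate mass fractions (mg/g) for one isoflavone in one material.
replicates = [10.0, 10.5, 9.5, 10.0]
rsd = rsd_percent(replicates)
print(f"RSD = {rsd:.2f} %")
print("below 6 %" if rsd < 6.0 else "6 % or above")
```

Because the RSD is normalized to the mean, the same absolute scatter gives a larger RSD for the low-level analytes, which is consistent with the exceptions noted above.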

Table 2 Mass fraction values for isoflavones in soy Standard Reference Materials

ID-LC-MS analysis

As an alternative method to LC-absorbance, an LC-MS method using isotopically labeled internal standards was also developed. Isotope dilution based approaches for quantitation reduce the impact of sample handling and instrument variability, as well as the effect of ion suppression or enhancement from interfering matrix compounds [27]. A cyano column was used for ID-LC-MS analysis, which provided alternate chromatographic selectivity by inverting the retention order of daidzein and glycitein relative to that in the LC-absorbance analysis using the amide column. In addition, this method also demonstrated the ability to separate the four additional isoflavones of interest in kudzu and red clover samples. Example separations for the six isoflavones in the suite of soy materials with detection by ESI-MS in SIM mode are provided in Fig. 3. (Additional ID-LC-MS chromatograms for SRM 3234 Soy Flour, SRM 3236 Soy Protein Isolate, and SRM 3237 Soy Protein Concentrate can be found in Figures S20-S22 in the ESM.) This method provided baseline resolution of all six isoflavones of interest.
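The reason isotope dilution compensates for sample losses and signal suppression can be sketched numerically: quantitation relies on the ratio of the analyte peak area to that of its labeled analog, so any effect that scales both signals equally cancels. The Python sketch below assumes a single-point response factor measured from a calibrant containing known masses of analyte and labeled standard; the calibration scheme actually used for the SRMs is not described in this section, and all function names and numbers are hypothetical.

```python
def response_factor(area_analyte, area_label, mass_analyte, mass_label):
    """Relative response of analyte vs. labeled standard from a calibrant."""
    return (area_analyte / area_label) / (mass_analyte / mass_label)


def mass_fraction(area_analyte, area_label, mass_label_added, sample_mass, rf):
    """Analyte mass per unit sample mass from the measured area ratio."""
    mass_analyte = (area_analyte / area_label) * mass_label_added / rf
    return mass_analyte / sample_mass


# Hypothetical calibrant: 10 ug analyte and 5 ug labeled standard.
rf = response_factor(1000.0, 500.0, 10.0, 5.0)

# Hypothetical sample: 4 ug labeled standard spiked into 0.5 g of material.
full = mass_fraction(300.0, 600.0, 4.0, 0.5, rf)

# Assumed scenario: half of the extract is lost (or both ions are
# suppressed by 50 %), so both peak areas are halved. The area ratio,
# and therefore the result, is unchanged.
halved = mass_fraction(150.0, 300.0, 4.0, 0.5, rf)

print(full, halved)  # 4.0 4.0 (ug/g)
```

The cancellation holds only to the extent that the analyte and its labeled analog behave identically through extraction, chromatography, and ionization, which is the working assumption of the isotope dilution approach.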

Fig. 3

ID-LC-MS separation and detection of isoflavones in SRM 3238 Soy-Containing Solid Oral Dosage Form. Separation was achieved using a Zorbax SB-CN column (250 mm × 4.6 mm i.d., 5 μm particles), with conditions as described in the text

The mass fractions determined in the four soy samples using these methods are listed in Table 2. As with the LC-absorbance method, 20 values for isoflavones were measured by ID-LC-MS in 4 soy SRMs. Isoflavone levels ranged from 0.810 mg/kg (glycitin in SRM 3237 Soy Protein Concentrate) to 12.9 mg/g (genistin in SRM 3238 Soy-Containing Solid Oral Dosage Form), consistent with the LC-absorbance data. The repeatability of the ID-LC-MS approach was slightly better than that of the LC-absorbance method, with relative standard deviations (RSDs) of less than 2.5 % for nearly all isoflavones in all matrices. Exceptions included genistein (5.14 % RSD) in SRM 3236 Soy Protein Isolate, glycitin (6.71 % RSD) in SRM 3237 Soy Protein Concentrate, and genistein (4.80 % RSD) in SRM 3238 Soy-Containing Solid Oral Dosage Form. While these variabilities are high for this data set, RSD values less than 10 % are generally within the acceptable range for determination of organic compounds in food and dietary supplements.

Method comparison

A direct comparison of the LC-absorbance and ID-LC-MS methods is possible using data collected for isoflavones in the various soy SRMs. The entire raw data sets for daidzin, daidzein, genistin, genistein, glycitin, and glycitein in SRM 3236 Soy Protein Isolate are depicted in Tukey boxplots in Fig. 4a–f, respectively. (Additional Tukey boxplots for SRM 3234 Soy Flour, SRM 3237 Soy Protein Concentrate, and SRM 3238 Soy-Containing Solid Oral Dosage Form are provided in Figures S23–S25, respectively, in the ESM.) In these plots, the distribution of the data within each method data set is illustrated, where the bottom and top of the box represent the first and third quartiles, respectively, and the inner horizontal band represents the median value for the data set. The top and bottom whiskers represent the limits of 1.5 × IQR (interquartile range) beyond the upper and lower quartiles, respectively. For daidzein (Fig. 4b), glycitin (Fig. 4e), and glycitein (Fig. 4f), the overlap in the data sets is visually obvious, indicating that the two methods are providing equivalent data. For daidzin (Fig. 4a), genistin (Fig. 4c), and genistein (Fig. 4d), the data sets appear less consistent. Similar inconsistencies were observed for the isoflavone glycosides in SRM 3234 Soy Flour [daidzin (Figure S23A), genistin (Figure S23C), and glycitin (Figure S23E)]; genistin in SRM 3237 Soy Protein Concentrate (Figure S24B); and daidzin in SRM 3238 Soy-Containing Solid Oral Dosage Form (Figure S25A). With the exception of genistein in SRM 3236 (Fig. 4d), the inconsistent results were observed for determination of isoflavone glycosides, not aglycones. The differences are not likely related to incomplete hydrolysis of the malonyl- and acetyl-glycosides, since a thorough hydrolysis study was conducted prior to value assignment and, if present, these compounds would be easily identifiable in the chromatogram.
The hydrolysis approach used for ID-LC-MS analysis was more rigorous than the optimum conditions determined for and used in the LC-absorbance approach. It is also worth noting that, where the two analytical methods were inconsistent, the values determined by LC-absorbance were lower than those determined by ID-LC-MS, with the exception of genistin in SRM 3237 (Figure S24B). An incomplete extraction of the isoflavones in the LC-absorbance method could explain many of these differences, yet the extraction optimization studies indicated that the isoflavones were being exhaustively extracted from each matrix. In addition, the direct nature of the hydrolysis used for ID-LC-MS samples could account for the higher recoveries, although higher recoveries would then be expected for all glycosides by ID-LC-MS, which was not observed. The cause of the differences in results for these isoflavones has not been determined, but inclusion of both data sets provides greater confidence in the trueness of the certified value, despite the wider uncertainty that results.

Fig. 4

Tukey boxplots depicting the data for isoflavones in SRM 3236 Soy Protein Isolate as determined by LC-absorbance (red, N = 24) and ID-LC-MS (black, N = 12). The bottom and top of the box represent the first and third quartiles, respectively. The band inside the box represents the median value for the data set. The top and bottom whiskers represent the limits of 1.5 × IQR (interquartile range) beyond the upper and lower quartiles, respectively. (A) Daidzin; (B) Daidzein; (C) Genistin; (D) Genistein; (E) Glycitin; (F) Glycitein
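For readers who wish to reproduce the boxplot statistics used in Fig. 4 from their own data, the quantities can be computed as in the Python sketch below. It assumes the common Tukey convention in which each whisker ends at the most extreme data point lying within 1.5 × IQR of the box and points beyond that are flagged individually, and it uses the "inclusive" quartile definition from the Python standard library; the plotting software used for Fig. 4 may define quartiles slightly differently. The function name and the data are hypothetical.

```python
from statistics import quantiles


def tukey_summary(values):
    """Quartiles, whisker ends, and outliers for a Tukey boxplot."""
    data = sorted(values)
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    # Whiskers stop at the most extreme points that are inside the fences.
    inside = [x for x in data if low_fence <= x <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": [x for x in data if x < low_fence or x > high_fence],
    }


# Hypothetical results, not data from the SRM measurements.
print(tukey_summary([1, 2, 3, 4, 5, 6, 7, 8, 20]))
```

Applying the same function to the results from each method and comparing the two boxes is a quick numerical counterpart to the visual check of overlap described in the text.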

Value assignment

For each isoflavone in the four soy matrices, the certified and reference values were calculated as the mean of the mean values from each method, and the values are summarized in Table 2. The uncertainty of the combined mean is estimated using a bootstrap procedure based on a Gaussian random effects model for between-method effects [29–31, 41]. All uncertainties in powder SRMs also incorporate an uncertainty component from correction of all values to a dry-mass basis using the experimentally determined moisture value. To address possible inhomogeneity of each material, analyses of variance at the 5 % significance level were conducted on both the LC-absorbance and ID-LC-MS data. No indication of heterogeneity was identified in SRM 3236 Soy Protein Isolate. Possible heterogeneity was identified in SRM 3237 Soy Protein Concentrate and SRM 3238 Soy-Containing Solid Oral Dosage Form, and a component of uncertainty for inhomogeneity based on the standard deviation is incorporated in the uncertainty of the combined estimator for each material. Additionally, while possible heterogeneity was noted for SRM 3234 Soy Flour, the trends were attributed to measurement variability present only in the ID-LC-MS data. This distinction was possible because the two methods utilized samples from the same units, and the variability was only observed in the ID-LC-MS analysis. This potential heterogeneity was insignificant compared to the measurement differences between the two methods, and therefore was not included in the final assessment of uncertainty.
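The value-assignment step can be illustrated with a simplified Python sketch. The combined value is the mean of the method means, as stated above. The uncertainty estimate shown is one possible parametric bootstrap under a Gaussian random effects model and is an assumption made for illustration: the between-method variance is estimated by a simple method-of-moments difference, and the dry-mass and inhomogeneity components are omitted. It is not the procedure of the cited references, and the function name, parameters, and data are hypothetical.

```python
import random
from statistics import mean, stdev, variance


def combine_methods(method_data, n_boot=5000, seed=0):
    """Mean of method means with a bootstrap standard uncertainty."""
    means = [mean(d) for d in method_data]
    # Squared standard error of each method mean (within-method part).
    se2 = [variance(d) / len(d) for d in method_data]
    value = mean(means)
    # Assumed moment estimate of the between-method variance, floored at 0.
    tau2 = max(0.0, variance(means) - mean(se2))
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    boot = []
    for _ in range(n_boot):
        # Simulate each method mean from the random effects model.
        draws = [rng.gauss(value, (tau2 + s) ** 0.5) for s in se2]
        boot.append(mean(draws))
    return value, stdev(boot)


# Hypothetical replicate results (same units) from two methods.
lc_absorbance = [10.0, 10.2, 9.8, 10.1]
id_lc_ms = [10.6, 10.4, 10.5]
value, u = combine_methods([lc_absorbance, id_lc_ms])
print(value, u)
```

In this example the two method means differ by more than their within-method scatter, so the between-method term dominates the uncertainty; this mirrors the observation that including both data sets widens the uncertainty of the assigned value.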

The data for SRM 3235 Soy Milk contained a number of outliers in each method, and no trend could be identified to pinpoint the source of variability. Outliers were observed in both methods, both high and low relative to the average value calculated after exclusion of outliers, and from different units between the methods (more specific information about outlier distribution is provided in the ESM, Table S5). SRM 3235 is a liquid suspension, and therefore homogeneity issues are not surprising; however, no trends in outlying data could be attributed to precipitation of sample solids or other characteristics of the sample appearance. This random heterogeneity observed for isoflavones prohibited values from being assigned in this SRM. Interestingly, data have been collected for other analytes (tocopherols, elements, water-soluble vitamins) with no significant heterogeneity observed, and these data may be used to assign nutrient values in SRM 3235.

Conclusions

Two analytical methods, based on LC-absorbance and ID-LC-MS, have been developed and utilized for the determination of six isoflavones (daidzin, genistin, glycitin, daidzein, genistein, and glycitein) in soy foods and dietary supplements. Both methods utilized a basic hydrolysis to cleave malonyl- and acetyl-glycosides, as well as an internal standard approach to quantitation. The results from the two methods were in agreement and were used to assign values to matrix-based SRMs. The matrices range from soy flour, a common food commodity, to soy protein isolate and concentrate, common food additives, to a soy supplement. Unfortunately, the soy milk material was found to be inhomogeneous and could not be characterized for isoflavone content. These SRMs can be used by researchers, manufacturers, and testing laboratories to evaluate the performance of methods for soy isoflavone analysis, and also in method development and validation.