Introduction

The year 2021 marks the 50th anniversary of a significant event in the development of Standard Reference Materials®, namely, the issuance of SRM 1571 Orchard Leaves as the first biological-matrix SRM developed for the determination of trace elements by the National Bureau of Standards (NBS), now the National Institute of Standards and Technology (NIST). In 1905, four years after its establishment, NBS, in collaboration with the American Foundrymen’s Association, initiated a program to provide “standardized irons,” which became Standard Samples. Standard Sample 1 “Argillaceous Limestone” was issued in 1910 with values assigned for various metal oxides and it has been available continuously since then. In 1965, the Standard Samples became SRMs consisting mainly of steel, iron, and cement samples certified for content of major constituents of industrial significance.

SRM 1571 Orchard Leaves, which was intended primarily for environmental analysis, was issued with certified values for 19 major, minor, and trace elements based on the concept of using results of multiple independent analytical techniques to assign certified values. During the next decade, NBS issued additional biological and environmental matrix SRMs for trace elements including spinach and tomato leaves, pine needles, bovine liver, wheat and rice flour, river sediment, air particulate matter, and oyster tissue. However, it was not until a decade later that NBS issued the first SRMs for the determination of trace organic constituents in a natural matrix, i.e., SRM 1580 Organics in Shale Oil with values assigned for polycyclic aromatic hydrocarbons (PAHs) and SRM 909 Human Serum with values assigned for clinical diagnostic markers (e.g., cholesterol and glucose).

The initial natural matrix SRMs for organic environmental analysis focused on the measurement of PAHs in several matrices including air and diesel particulate matter, coal tar, marine sediment, and mussel tissue [1]. Other legacy and emerging contaminants were soon certified in environmental matrix SRMs including polychlorinated biphenyls (PCBs), chlorinated pesticides, and polybrominated diphenyl ethers (PBDEs). During the past four decades, NBS/NIST developed over 180 SRMs for organic analysis of environmental, clinical, food, and dietary supplement matrices.

As I began my career as an analytical chemist at NBS in 1976, I had the opportunity to participate in the development of the first natural matrix SRM for the determination of trace organic constituents, and over the next four decades, I was intimately involved in planning, collection, preparation, and analysis for many of these SRMs. The development of these SRMs included significant challenges in the preparation of large quantities of unique materials and the development and implementation of new, improved analytical methods to assign the certified values. This feature article provides highlights of these challenges and accomplishments through my personal Top Ten List of SRMs developed at NBS/NIST during the past four decades.

What are natural matrix SRMs and why are they important?

SRMs are Certified Reference Materials (CRMs), the international designation, issued by NBS/NIST. A review by Ulberth [2] summarized the international terminology for reference materials and the types and uses of CRMs. CRMs for chemical composition are used (1) to assess the accuracy or trueness of measurement results, (2) to assist in validation of new analytical methods, (3) to serve as control materials for quality assurance of routine analyses, and (4) to provide metrological traceability of measurement results. The use of a matrix CRM that is similar to the real-world sample analyzed provides the required assessment of the complete analytical process (sample pretreatment including dissolution; extraction; cleanup, enrichment, and/or isolation of the analytes of interest) prior to the actual instrumental measurement. Because CRMs are homogeneous, stable materials that are widely available, they can also be used in novel research applications and the results compared with other laboratory measurements.

Implementation of the multiple independent analytical methods concept for organic analysis

The basis of the multiple independent analytical methods concept to assign certified values to reference materials was established when William F. Hillebrand, the chief chemist of NBS from 1901 to 1925, stated that one criterion for a standard sample is “Its composition should have been determined by independent and reliable methods affording agreeing results” [3]. For the first 50 years at NBS, standard samples, and later SRMs, were issued with elemental content values assigned based on Hillebrand’s concept of independent, reliable methods. With the advent of SRM 1571 Orchard Leaves, the approaches for assigning a certified value had evolved to “reference method, two independent methods, or interlaboratory comparison” [4] and “definitive reference methods”, and “two or more independent and reliable methods” [5] as summarized in a paper by Epstein in 1991 [6]. In 2000, NIST formalized the approaches or modes for assigning values and established a hierarchy of values (denoted as certified, reference, and information) with decreasing confidence in their accuracy based on the various approaches used [7]. This document was recently updated with numerous examples illustrating the implementation of various modes of certification [8].

The development of the multiple independent methods concept for assigning certified values for trace elements in matrix SRMs was discussed by Epstein [6], who describes the use of independent methods (i.e., independence in the physical principle upon which the measurement is based, sample preparation, standards, and calibration) and presents the certification of SRM 2704 Buffalo River Sediment to illustrate this approach. For the assignment of certified values for 25 elements in SRM 2704, a total of 14 analytical techniques were used including 10 different sample preparation techniques [6]. For the determination of elements in matrix SRMs, the concept of using multiple independent methods was relatively straightforward because a variety of analytical techniques based on different measurement principles and different sample preparation approaches (e.g., direct analysis of a solid sample or dissolution of the matrix and analysis of the resulting solution) were available.

In the mid-1970s, a small Trace Organic Analysis Group was formed at NBS [9] with the goal of developing SRMs for trace organic analysis to complement the strong existing inorganic analysis capabilities. The challenge faced by the Trace Organic Analysis Group was how to implement the multiple independent analytical methods approach when the matrix could not be dissolved or destroyed and the ultimate analysis techniques were generally limited to gas chromatography (GC) and the emerging technique of liquid chromatography (LC). In 1980, the first natural matrix SRM for trace organic constituents was issued, SRM 1580 Shale Oil, as part of a collaboration with the US Department of Energy to improve the quality of measurements associated with the development of alternative energy sources. SRM 1580 was not widely used within the analytical chemistry community, and many would consider it to have been a failure. However, the development of SRM 1580 laid the foundation for transferring the NBS multiple independent methods concept from trace element analysis to trace organic analysis. Hertz et al. [10] described the analytical approaches developed and used to assign certified values for three PAHs and two phenols in SRM 1580. The independence of the analytical methodologies was based on different sample preparation approaches using traditional acid/base extraction, LC extraction (isolation), and/or no extraction (direct injection) prior to analysis and quantification using GC with flame ionization detection (FID) or GC with mass spectrometry (GC-MS) detection [10]. The multiple independent methods approach for PAHs was significantly expanded with the development of additional environmental matrices during the next decade [1, 11].
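The idea of combining results from independent methods can be illustrated with a small numerical sketch. This is only an illustration under stated assumptions, not the NBS/NIST value-assignment procedure (the formal modes are described in [7, 8]): the function name, the unweighted-mean scheme, and the mass fractions below are all hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def combine_methods(method_means):
    """Combine mean results from independent methods (hypothetical scheme).

    Returns (value, standard_uncertainty): the unweighted mean of the
    method means, and the standard deviation of those means divided by
    the square root of the number of methods.
    """
    if len(method_means) < 2:
        raise ValueError("need results from at least two independent methods")
    values = list(method_means.values())
    value = mean(values)
    uncertainty = stdev(values) / sqrt(len(values))
    return value, uncertainty


# Hypothetical mass fractions (mg/kg) for one analyte; not certificate values.
results = {"GC-FID": 6.2, "GC-MS": 6.5, "LC-fluorescence": 6.4}
value, uncertainty = combine_methods(results)
print(f"{value:.2f} +/- {uncertainty:.2f} mg/kg")
```

Agreement among the method means is what gives confidence in the combined value; a large spread between methods would instead signal a bias in at least one of them.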

Selection criteria for the Top Ten SRM list

During the next 40 years, there were over 180 natural matrix SRMs developed with assigned values for organic constituents intended for environmental, clinical, food, and dietary supplement analysis, and the growth and availability of these SRMs is illustrated in Fig. 1. In many respects, these four decades could be considered the “golden age” for the development of natural matrix SRMs for organic analyses because every new matrix and analyte group were new analytical challenges and a significant contribution to the field of reference materials and often to the field of analytical chemistry. To identify the significant SRMs for the Top Ten list, the characteristics summarized in Table 1 were used as the selection criteria. A top criterion was the analytical challenge involved in performing the measurements to assign certified values for the organic constituents of interest including the need to develop and implement new and/or improved analytical methods. Another important criterion was whether there was a significant challenge involved in obtaining and preparing a sufficiently homogeneous quantity of material to produce the required SRM inventory. During this period, an arbitrary 5-year inventory was established at NIST as a target for purposes of cost recovery through sales of the SRM. Ideally, an SRM sales unit should contain a sufficient quantity to provide the user with multiple analytical subsamples. However, even though a 5-year inventory was the target, for many materials, it was desirable to collect/obtain and process a much larger quantity of the matrix because we did not want to repeat the often laborious, resource-intensive collection and processing every 5 years. Several of the SRMs that are described in this article have been available for several decades which has provided significant benefits; thus, longevity of an SRM became an important selection criterion.

Fig. 1

Graph illustrating the number of natural matrix SRMs for organic analysis available per year from 1980 through 2020 categorized as intended for environmental, clinical, food, and dietary supplement analysis. This graph does not include pure organic materials and calibration solution SRMs. The number on the y-axis represents the number of SRMs available in a given year (x-axis), i.e., the cumulative number of SRMs developed minus the SRMs that were discontinued
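The quantity plotted in Fig. 1 (cumulative SRMs developed minus SRMs discontinued) can be sketched as a running total. The counts below are hypothetical placeholders, not the data behind the figure, and the function name is an assumption for illustration.

```python
from itertools import accumulate


def available_per_year(issued, discontinued, years):
    """Return {year: SRMs available}, i.e., cumulative issued minus
    cumulative discontinued up to and including each year."""
    net_change = [issued.get(y, 0) - discontinued.get(y, 0) for y in years]
    return dict(zip(years, accumulate(net_change)))


# Hypothetical counts of SRMs issued and discontinued per year.
issued = {1980: 1, 1982: 2, 1985: 1}
discontinued = {1984: 1}
counts = available_per_year(issued, discontinued, range(1980, 1986))
print(counts)
```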

Table 1 Selection criteria for the Top Ten SRMs

Important criteria for appearing in the Top Ten list include the unique and/or novel nature of the matrix and whether it was the first time that such a matrix was issued as an SRM. In addition to first-of-a-kind matrix, the first time that an analyte or class of analytes was certified in a particular matrix was considered significant. If the development of a particular SRM was the foundation for subsequently producing similar matrix SRMs, this was considered an important consideration for inclusion on the list. Ideally, prior to the development of an SRM, it would be useful to be able to predict reliably whether the SRM would be used widely and have a large customer base. One indicator of potential user demand was if a government agency with regulation oversight for a specific area requested the development of the SRM and/or provided financial support and/or other resources to assist in the development of the SRM. The final selection criterion is whether the SRM represents a novel concept in design and/or intended use.

The Top Ten SRM list

The Top Ten SRMs are provided in Table 2. The list is generally chronological (with the exception of number 8) and follows the timeline for development of SRMs for organic analysis at NBS/NIST. Each of the 10 SRMs will be discussed briefly relative to the selection criteria, and the characteristics for each SRM and how it meets the criteria are summarized in Table 3.

Table 2 Top Ten SRMs for environmental, clinical, food, and dietary supplement analysis
Table 3 Characteristics of the Top Ten SRMs relative to the selection criteria

Number 1: SRM 1649 Urban Dust/Organics

During the mid-1970s, NBS, with support from the US Environmental Protection Agency (EPA), undertook the challenge to collect large quantities of atmospheric particulate matter (PM) for the development of SRMs for trace element analysis. Using an industrial baghouse collector (see Fig. S1, Supplementary Information, ESM), PM was collected over a 12-month period in St. Louis, MO, USA, and then later in 1976 and 1977 in Washington, DC, USA. The air particulate material collected in St. Louis became SRM 1648 Urban Particulate Matter, which was issued in 1978 with certified values for 15 elements and was intended to be representative of PM from an industrial urban area. The approximately 50 kg of PM collected in Washington, DC, was initially planned as a second SRM for trace elements from a non-industrial urban environment to complement SRM 1648. However, with the increasing interest and growing capabilities in organic analysis at NBS, the Washington, DC, PM was repurposed as the first particulate matter SRM for organic analysis, i.e., SRM 1649 Urban Dust/Organics, issued in 1982 [12].

As particulate material, SRM 1649 addressed a major challenge in organic analysis, namely, the efficient and complete removal of organic constituents from a particulate matrix. During the development of SRM 1649 (and even to the present time [13]), the question of whether we were removing all the PAHs from the PM was debated. NBS experience with particulate material SRMs was based on the concept of assigning a value for total elemental composition after dissolution of the particulate matrix and elemental analysis of the resulting solution. However, for the determination of PAHs (and other organic contaminants) in particulate material, dissolving the matrix was not a viable option. Solvent extraction was the only suitable approach. To satisfy the requirement of multiple independent methods of extraction, we investigated different extraction methods available at the time (Soxhlet extraction and ultrasonic agitation) with various organic solvents to convince ourselves that we had removed all of the PAHs from the particulate matter. Ultimately, we decided that we were using the best available extraction techniques and that we would state on the SRM Certificate of Analysis what approach was used to extract the material. If and when future improvements in extraction techniques resulted in higher recovery of PAHs from the SRM matrix, the certified values would be revised to reflect these advances. This same approach has been continued with other materials requiring solvent extraction.

The initial certification of PAHs in SRM 1649 did not significantly advance the analytical approach for using multiple analytical techniques [12]; however, SRM 1649 is probably the best example of the evolving nature of assigned values and the increasing number of values assigned to an SRM over an extended lifetime [13, 14]. SRM 1649 has been re-issued three times (and the assigned values updated four times) as shown in Table 4, with sales of over 4000 units over the past four decades. The SRM 1649 unit also changed from a bottle containing 5 g to one containing 2 g, both because the sample amount required has decreased as analytical methods have advanced and as a strategy to prolong the lifetime of this unique material. The current version of the urban dust, SRM 1649b, updated in 2015, has values assigned for 239 constituents including PAHs, nitro-PAHs, PCBs, pesticides, and trace elements.

Table 4 Evolution of SRMs for organic analysis with increasing number of values assigned

Shortly after the development of SRM 1649, a second significant particulate material, SRM 1650 Diesel Particulate Matter, was developed. SRM 1649 and SRM 1650 became the ideal matrices to evaluate new extraction techniques, particularly for PAHs. Three notable papers using SRM 1649 and SRM 1650 to validate advances in extraction technologies were published by Schantz et al. [13, 15] and Benner [16] comparing traditional Soxhlet extraction, pressurized fluid extraction (PFE), and supercritical fluid extraction (SFE). The extraction recoveries for the removal of five PAHs from SRM 1649b and SRM 1650b using SFE with CO2 at 200 °C and PFE with toluene at 100 °C and 200 °C are compared in Fig. 2. For the urban dust (Fig. 2A), recoveries for Soxhlet, PFE, and SFE are similar except for the two heavier PAHs. However, for the diesel particulate matter (Fig. 2B), as the number of aromatic rings in the PAHs increases, the recovery using SFE decreases markedly to < 10% for the six-ring PAHs, while the recovery using PFE at both temperatures increases and even exceeds the Soxhlet benchmark. In the 1990s, SFE with CO2 was advocated as an environmentally friendly alternative to solvent extraction; however, the study of Benner [16] using the air and diesel particulate SRMs clearly demonstrated the inadequacy of SFE to extract PAHs from particulate matter samples. The first PFE study of Schantz et al. [15] resulted in the use of PFE as the only extraction technique to assign certified values for PAHs in SRM 1650a. The second study by Schantz et al. [13], which was prompted by a study reporting higher extraction recovery for the SRMs using a higher temperature [17], resulted in the revision of the assigned values for both SRM 1649b and SRM 1650b to reflect various extraction conditions. It also revived the ongoing discussion of whether the PAHs were completely removed from the particulate matter and whether it really matters.

Fig. 2

Comparison of extraction recovery for selected PAHs from A SRM 1649b and B SRM 1650 using SFE with CO2 and PFE with toluene at 100 °C and 200 °C. Fluor = fluoranthene; BaA = benz[a]anthracene; BeP = benzo[e]pyrene; BghiPer = benzo[ghi]perylene; and InPyr = indeno[1,2,3-cd]pyrene. The dashed black line represents the result using Soxhlet extraction with dichloromethane; all bars represent percent recovery relative to the Soxhlet extraction result and error bars are the standard deviations of the measurements. Graphs based on results from Schantz et al. [13, 15] and Benner [16]
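The normalization used in Fig. 2, in which each result is expressed as a percent of the Soxhlet result for the same PAH, can be written as a short calculation. This is a minimal sketch; the function name and the mass fractions are hypothetical and are not taken from the SRM certificates or from [13, 15, 16].

```python
def relative_recovery(measured, soxhlet):
    """Percent recovery of each analyte relative to the Soxhlet benchmark."""
    return {pah: 100.0 * measured[pah] / soxhlet[pah] for pah in measured}


# Hypothetical mass fractions (mg/kg) for two of the PAHs named in the caption.
soxhlet = {"Fluor": 6.0, "BghiPer": 4.0}
pfe_200 = {"Fluor": 6.3, "BghiPer": 5.0}

recovery = relative_recovery(pfe_200, soxhlet)
for pah, percent in recovery.items():
    print(f"{pah}: {percent:.0f}% of the Soxhlet result")
```

A value above 100% means the alternative technique extracted more of the analyte than Soxhlet extraction did, which is the situation described for PFE with the heavier PAHs in the diesel particulate matter.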

Two additional atmospheric particulate matrix SRMs were developed in the early 2000s, in collaboration with EPA, to meet the need for SRMs for atmospheric fine PM < 10 μm (i.e., PM10). After failed attempts to collect a sufficient quantity of PM2.5 for producing an SRM using an ultra-high-volume sampler (UHVS), an alternative approach was undertaken. Total suspended particulate matter from an air intake filtration system of a major exhibition center in Prague, Czech Republic, was resuspended and size-separated using the UHVS with the face velocity of the cyclone adjusted to control the aerodynamic particle size to collect two fractions, < 10 μm and < 4 μm [18]. These two PM size fractions were issued in 2011 as SRM 2786 Fine Particulate Matter (< 4 μm) and SRM 2787 Fine Particulate Matter (< 10 μm), and they represent the most characterized fine PM CRMs available with values assigned for PAHs, nitro-PAHs, PBDEs, and PCDDs/PCDFs [18]. The European Commission’s Joint Research Centre (EC-JRC) at Geel also produced a PM10-like CRM, ERM-CZ100, starting with a coarse tunnel dust, sieving (0.5 mm followed by 0.250 mm), and finally jet-milling [19]. Recently, another PM-matrix CRM was produced by EC-JRC using a novel approach involving suspension of PM, freezing, and freeze-drying to produce PM2.5-like material (ERM-CZ110) as reported by Emteborg et al. [20]. The production of these fine PM-matrix CRMs by NIST and EC-JRC emphasizes the challenges and difficulties in obtaining sufficient quantities of PM to produce a CRM and re-emphasizes the monumental achievement decades earlier in the collection of the extraordinary quantities of PM used for SRM 1648 and SRM 1649.

Number 2: SRM 1941 Organics in Marine Sediment

In the mid-1970s, NBS established a collaboration with the US National Oceanic and Atmospheric Administration (NOAA) to develop methods for the determination of petroleum hydrocarbons in water, sediment, and tissues as part of an effort to establish baseline levels of petroleum hydrocarbons in the Alaskan environment prior to completion of the Alaskan pipeline and transport of crude oil from the North Slope to Valdez. A decade later, this relationship with NOAA expanded to include the development of several noteworthy SRMs for the determination of organic contaminants in marine environmental matrices. In 1987, NIST with financial and logistical support from NOAA collected 1000 kg of wet marine sediment from Baltimore Harbor (MD, USA) (see Fig. S2, ESM). The sediment was spread on shallow aluminum pans and air-dried resulting in hard clods, which were then pulverized and sieved (< 150 μm) resulting in 40 kg of dry sediment, which was then homogenized and distributed in bottles at 75 g each. As this was our first experience in working with large-scale production of an environmental matrix, we, the NIST analytical chemists, performed the dirty work ourselves (see Fig. S2, ESM).

Because SRM 1941 was analyzed nearly a decade after the first SRMs for the determination of PAHs, we had more fully developed our approach for the use of multiple independent methods for assigning values for PAHs including the use of both reversed-phase LC with fluorescence detection (of both total PAH fractions and isomer group fractions isolated by normal-phase LC [21]) and GC-MS using columns with different selectivity [1, 11]. As a result, SRM 1941 was issued in 1989 with values assigned for 25 PAHs [22]. SRM 1941 found widespread use in the marine environmental analysis community, and the supply of the sediment SRM was depleted after 5 years. SRM 1941 was replaced by SRM 1941a in 1995 [23] and again in 2002 by SRM 1941b [24]; both renewal batches were collected from the same Baltimore Harbor location as the original SRM 1941 but in greater quantities. The analytical approach for assigning certified values for PAHs improved with each renewal of SRM 1941 as indicated in Table 4 by the increasing number of values assigned. In 1999, SRM 1944 New York/New Jersey Waterway Sediment was issued complementing SRM 1941a with a factor of 10 higher concentrations of PAHs as well as certified concentrations for trace elements [24]. SRM 1944 represented one of the first (and still one of a limited number of) environmental matrix SRMs with certified values assigned for both organic contaminants and trace elements. Unfortunately, it is difficult to know whether the SRM 1944 customers use it for organic or inorganic contaminants or for both. The SRM 1941 series and SRM 1944 have sold over 6500 units in the past 32 years.

Number 3: SRM 1974 Organics in Frozen Mussel Tissue (Mytilus edulis)

From a storage and distribution standpoint, the ideal environmental matrix SRM would be a dry, homogeneous powder that could be distributed and stored on the shelf at ambient temperature. However, in many instances, the actual environmental samples analyzed in the laboratory are not dry powders, but contain significant amounts of endogenous water. For example, in the NOAA Mussel Watch Program [25], laboratories analyze mussel tissue samples that would be collected, frozen in the field, shucked, and the tissue stored frozen in the laboratory until analyzed. To address the need for a wet marine tissue matrix, NIST developed SRM 1974 Organics in Frozen Mussel Tissue (Mytilus edulis). In 1987, as a continuation of the collaboration with NOAA to develop SRMs to support marine monitoring programs, NIST chemists collected 2300 mussels in Boston Harbor (Dorchester Bay) for the preparation of SRM 1974 (see Fig. S3, ESM). The collection site was selected based on monitoring data from the Mussel Watch program indicating a relatively high concentration of PAHs and PCBs. The challenge was how to produce a large quantity of homogeneous, frozen powder for the mussel tissue SRM. After shucking the mussels, the wet tissue was frozen and stored at − 80 °C or lower. Conventional metal disc mill grinding technology was converted to a Teflon disc mill for cryogrinding as described by Zeisler et al. [26] (see Fig. S4, ESM). Batches of 150 g of frozen tissue were milled resulting in 28 kg of frozen powder (about 180 milling batches!), which was then homogenized in a custom-made aluminum cylinder designed to fit inside a liquid nitrogen vapor freezer and to rotate and mix the frozen powder (see Fig. S4E, ESM) [27]. The SRM preparation was a challenging, labor-intensive process.

SRM 1974 was issued in 1990 as the first frozen tissue SRM with values for PAHs, PCBs, chlorinated pesticides, and trace elements [27]. This SRM was widely used to support marine monitoring programs, particularly the NOAA Mussel Watch Program. Similar to the SRM 1941 sediment series, SRM 1974 has been re-issued three times, in 1995, 2003, and 2012, with an increasing number of values assigned with each renewal (see Table 4) [28, 29]. SRM 1974a and SRM 1974b both involved the same cryogrinding and homogenization process described above; however, the batch size was increased to 700 g using larger Teflon grinding mills to reduce the time/labor required and to accommodate the larger quantities of tissue collected, i.e., 81 kg and 59 kg of tissue for SRM 1974a and 1974b, respectively. Not until the preparation of SRM 1974c did NIST use a large-scale commercial cryogrinding unit (Palla VM-KT, KHD Humboldt Wedag, Cologne, Germany) to grind 120 kg of tissue [30]. Recently, as NIST was preparing to collect mussels for the fourth re-issue of SRM 1974, they learned that mussels were no longer available at the collection site, presumably due to the increased urbanization of the location, where the levels of individual PAHs and PBDEs had increased by 50 to 100% during the 17-year span of mussel collections even as the legacy pollutants (PCBs and chlorinated pesticides) had decreased by more than 50% [31].
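The labor saved by the larger grinding mills follows from simple arithmetic on the quantities quoted above. The sketch below assumes, purely for illustration, that every batch is filled to its nominal size and that the quoted masses are the masses milled; the function name is hypothetical.

```python
from math import ceil


def milling_batches(total_kg, batch_g):
    """Number of milling batches needed for total_kg of tissue
    when each batch holds batch_g grams (assumes full batches)."""
    return ceil(total_kg * 1000 / batch_g)


# Quantities quoted in the text; the full-batch assumption is ours.
srm_1974 = milling_batches(28, 150)   # original SRM 1974, 150-g batches
srm_1974a = milling_batches(81, 700)  # SRM 1974a, 700-g batches
srm_1974b = milling_batches(59, 700)  # SRM 1974b, 700-g batches
print(srm_1974, srm_1974a, srm_1974b)
```

Under these assumptions the original material needs 187 batches, consistent with the "about 180" quoted, while nearly three times as much tissue for SRM 1974a needs only 116 batches with the 700-g mills.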

In 1997, NIST assigned the first speciation values to a matrix SRM with the addition of values for methylmercury to SRM 1974a [32] and to subsequent mussel tissue SRMs [33, 34]. Several freeze-dried mussel tissue SRMs were also produced during this 30-year period including SRM 2974, SRM 2974a (produced from the same batch of mussels as SRM 1974c), SRM 2977, and SRM 2978 [28]. Following the Deepwater Horizon oil spill in 2010, sales of SRM 1974b tripled during the 18 months immediately after the spill because it was the only seafood-matrix SRM with values assigned for petroleum hydrocarbons to support measurements to assess the impact of the spill on the seafood industry in the Gulf of Mexico. Over the last three decades, eight different mussel tissue SRMs have sold over 5000 units. Three other frozen tissue matrix SRMs were developed using the same cryogrinding approach, i.e., SRM 1945 Whale Blubber [35], SRM 1946 Lake Superior Fish Tissue [36], and SRM 1947 Lake Michigan Fish Tissue, all with significant numbers of values assigned for PCBs and chlorinated pesticides. A summary of SRMs that have a significant number of values assigned is provided in Table 5, and the majority of these SRMs are for determination of environmental contaminants.

Table 5 Examples of SRMs with large numbers of certified and reference values assigned

Number 4: SRM 1589a PCBs, Pesticides, and Dioxin/Furans in Human Serum

Human serum was the next new matrix for environmental contaminants with the release in 2000 of SRM 1589a PCBs, Pesticides, and Dioxin/Furans in Human Serum. SRM 1589a was preceded by SRM 1589 Aroclor in Human Serum, which had been spiked with Aroclor 1260 and issued in 1985 with a value assigned for total Aroclor 1260. As analytical methods and regulations moved from total Aroclor measurements to the determination of multiple single PCB congeners, the desire for a new human serum SRM with endogenous levels of contaminants increased. In our minds, the challenge was to obtain a serum pool with measurable levels of PCB congeners and chlorinated pesticides. In 1996, NIST procured serum from donors living in Chicago, IL (USA), who indicated that they fished in the Great Lakes and ate their catch, and from individuals who, in their own judgment, ate large quantities of fish. Serum samples from these donors were screened at NIST, those donors with significant levels of PCBs were selected, and their serum was pooled. Subsamples of 10 mL were aliquoted into 30-mL bottles and then freeze-dried. Values were assigned for 53 PCB congeners, 9 chlorinated pesticides, 7 PBDEs, and 14 dioxins and furans [37] using GC-MS with different extraction and cleanup approaches. The Centers for Disease Control and Prevention (CDC) collaborated in the assignment of values by contributing results using a GC-high-resolution MS method that also provided results for dioxins and furans, a group of environmental contaminants that NIST never developed capabilities to measure. At the time, SRM 1589a represented the pinnacle for value assignment for PCBs, pesticides, and dioxins/furans in an SRM, and the approach was based on the foundation laid by the development of SRM 1588a Cod Liver Oil released 2 years earlier [38, 39].

Fortunately, SRM 1589a had significant levels of PCB congeners and chlorinated pesticides, and CDC became a primary user of this SRM. However, after nearly a decade of use and the desire for a serum material with more contemporary levels of not only legacy contaminants but also emerging contaminants, NIST and CDC collaborated to produce a new human serum SRM with values assigned for contaminants. With financial and measurement support from CDC, NIST expanded the human biological fluid matrices for organic contaminants to include not only serum but also milk and urine. For the human serum and milk materials, 200 L of serum was collected from blood banks in 11 US cities and 100 L of milk was obtained from milk banks in 6 states. For both the serum and milk materials, the pools were equally divided to produce a pool with endogenous levels of contaminants and a pool that was fortified (spiked) with over 170 organic contaminants at levels approximately 5 to 10 times the endogenous levels. The target list of contaminants added to the pools included PCBs, hydroxylated PCBs, chlorinated pesticides, chlorinated and brominated dioxins/furans, PBDEs, polychlorinated naphthalenes (PCNs), perfluorinated compounds (PFCs), and toxaphene congeners. Many of these target compounds were not routinely measured at this time at CDC, but the intent was to have a material available containing these potentially emerging contaminants for future studies. In 2009, both the human serum (SRM 1957 and SRM 1958) and milk (SRM 1953 and SRM 1954) materials were issued with values assigned for PCB congeners, chlorinated pesticides, PBDEs, and dioxins/furans [40].

With SRM 1957, NIST had succeeded in producing a human serum material with contemporary levels of legacy contaminants and emerging contaminants as shown in Fig. 3. Because the serum pools used for SRM 1957 were not screened to select donors with high levels, the new SRM had considerably lower levels, and as a result, a significantly smaller number of values were assigned for PCB congeners and chlorinated pesticides. Recently, Rodowa and Reiner [41] published an assessment of the use of SRM 1957 during the past decade with over 50 publications specifically using it for the determination of per- and polyfluoroalkyl substances (PFAS). They documented traditional uses of the SRM but also highlighted that several users had reported and quantified 12 new PFAS compounds not previously determined in SRM 1957 [41].

Fig. 3

Legacy versus emerging contaminants in two human serum SRMs. Serum pools for SRM 1589a (PCBs, Pesticides, PBDEs, and Dioxins/Furans in Human Serum) and SRM 1957 (Organic Contaminants in Non-Fortified Human Serum (freeze-dried)) were collected in 1996 and 2004, respectively. Mass fractions of legacy contaminants dichlorodiphenyldichloroethylene (4,4′-DDE); 2,2′,3,4,4′,5′-hexachlorobiphenyl (PCB 138); 2,2′,4,4′,5,5′-hexachlorobiphenyl (PCB 153); 1,2,3,4,6,7,8-heptachlorodibenzo-p-dioxin (1234678-HCDD); and 1,2,3,6,7,8-hexachlorodibenzo-p-dioxin (123678-HCDD) and emerging contaminants 2,2′,4,4′-tetrabromodiphenyl ether (PBDE 47); 2,2′,4,4′,5-pentabromodiphenyl ether (PBDE 99); and 2,2′,4,4′,6-pentabromodiphenyl ether (PBDE 100) are compared for SRM 1589a and SRM 1957. Note that the y-axis scale for 4,4′-DDE; PCB 138; and PCB 153 (dashed lines) is the y-axis log scale on the right; all other compounds use the y-axis scale on the left. Note that mass fractions of 1234678-HCDD and 123678-HCDD are in picograms per kilogram while those of all other contaminants are in nanograms per kilogram

For production of the urine SRMs for contaminants, pools of non-smoker (50 L) and smoker (25 L) urine were obtained and 10-mL subsamples were aliquoted into amber bottles and stored at −80 °C. As with the serum and milk materials, it was intended that there would be an endogenous non-smokers’ urine pool and one spiked with hydroxylated PAHs. Unfortunately, the exogenous hydroxylated PAHs were not stable after the addition to the urine matrix, and the certification of this material was abandoned. SRM 3672 Organic Contaminants in Smokers’ Urine (Frozen) and SRM 3673 Non-Smokers’ Urine were issued in 2014 with values assigned for hydroxylated PAHs, phthalates, phenols, and volatile organic compound metabolites [42]. These three human biological-matrix SRM pairs have annual sales of 20 units/year, 70 units/year, and 100 units/year for milk, serum, and urine, respectively. The success of the urine SRMs may be due, in part, to customer desire for characterized smoker and non-smoker urine pools for their investigations beyond their use as control materials.

Expanding the number of values assigned for environmental SRMs

As shown in Table 4, there has been a continuous goal of increasing the number of contaminants with assigned values in the environmental matrix SRMs with each renewal. In the first decade of assigning certified values for PAHs and PCBs, the number of compounds determined was typically limited by the analytical methods and the requirement to have multiple methods. However, as the analytical methods advanced, the limiting factor often became the availability of authentic reference standards that were accurately assessed for purity. With the development of calibration solution SRMs for PAHs and PCB congeners containing 36 and 20 compounds, respectively, it became easier to routinely assign large numbers of values. However, a valid question is: how many certified values are necessary for an environmental matrix SRM to be useful? Many environmental monitoring programs have established target lists for compounds of interest (e.g., the 16 EPA priority pollutant PAH list), and assignment of values to match these target lists may be an appropriate answer. As new contaminants are identified, NIST has attempted to add values for these emerging contaminants (e.g., PBDEs, PFAS) to appropriate existing SRMs. As the contaminant list expands, however, the resources needed to maintain and renew such materials also increase. The development of a successful SRM is a “good news, bad news” scenario. The good news is that the customers are using the SRM, and the bad news is that the customers will demand a continuous supply of similar high-quality material, which can be a strain on the producer’s resources.

Number 5: SRM 1849 Infant/Adult Nutritional Formula

In the mid-1990s NIST initiated a program to develop food-matrix SRMs for the determination of vitamins and organic nutrients [43]. With the issuance of SRM 1846 Infant Formula in 1996, NIST started the most successful series of food-matrix SRMs. Because infant formula is the most regulated food worldwide, primarily for safety concerns, the need for such an SRM was obvious, and the infant formula matrix was ideal (dry and already in powdered form) for SRM development. The focus for the infant formula SRM certification, however, was not on constituents related to safety but on the content of vitamins and nutrients. SRM 1846 was issued with certified values for only four vitamins and iodine; however, reference values were available (and many were added in later years) for 38 additional vitamins and nutrients (both organic and elements) [44]. The limited number of certified values for vitamins was primarily due to the lack of reliable multiple analytical methods at NIST to provide measurements of the required quality to assign certified values. As shown in Fig. 4, distribution of SRM 1846 increased steadily from 50 to 250 units/year over its 13-year lifetime, in part due to the convenience of the homogeneous, powder matrix provided in 10 single-use 30-g packets in a unit.

Fig. 4

Bar graph illustrating the sales of infant formula and infant/adult nutritional formula SRMs from 1996 through 2020. SRMs include SRM 1846 Infant Formula, SRM 1849 Infant/Adult Nutritional Formula, SRM 1849a Infant/Adult Nutritional Formula I (Milk-based), and SRM 1869 Infant/Adult Nutritional Formula II (Milk/Whey/Soy-based). SRM 1849a inventory depleted in 2020

When SRM 1846 was replaced in 2009 with SRM 1849 Infant/Adult Nutritional Formula, LC-MS methods using isotopically labeled internal standards had been implemented at NIST to provide the necessary multiple analytical methods to significantly increase the number of assigned values to 44 certified and 41 reference values, a substantial improvement over the previous infant formula material [45]. Customer demand for this new infant formula SRM jumped to nearly 500 units/year, and after just over 2 years, the supply was depleted. Fortunately, NIST rapidly produced a similar replacement material, SRM 1849a Infant/Adult Nutritional Formula, in 2012 with similar numbers of values assigned for the vitamins and nutrients. As shown in Fig. 4, sales of SRM 1849a increased steadily to over 700 units/year. The increasing use of SRM 1849a was due in part to extensive involvement of the AOAC International Stakeholder Panel for Infant Formula and Adult Nutritionals (SPIFAN) to develop AOAC Official Analytical Methods for infant formula and to promote the use of SRMs as part of this process [46,47,48]. In 2019, SRM 1869 Infant/Adult Nutritional Formula II (milk/whey/soy-based) was released to complement the milk-based SRM 1849a.

After the introduction of SRM 1846 Infant Formula, several notable food-matrix SRMs were issued during the next decade including SRM 2383 Baby Food (1996), SRM 1546 Meat Homogenate (1999), and SRM 2387 Peanut Butter (2003). SRM 2383 was unique in that it was a custom-designed mixture of foods to provide suitable content of carotenoids and other vitamins and was prepared by a baby food manufacturer including pressure cooking and sealing in baby food jars. SRM 1546 was a regular production batch of a classic commercial meat product packaged in a mini-sized can (85 g) as would be purchased in the grocery market. Similarly, SRM 2387 is a commercial batch of a peanut butter product in a mini-jar containing 175 g.

Number 6: Ginkgo biloba (Leaves)

In its 2001 budget authorization, the Office of Dietary Supplements at the National Institutes of Health (NIH-ODS) was instructed to “allocate sufficient funds to speed up an ongoing collaborative effort to develop and disseminate validated analytical methods and reference materials for the most commonly used botanical and other dietary supplements.” To address this request, NIH-ODS approached NIST in 2002 to collaborate in the development of reference materials for botanical dietary supplements. The initial efforts of the NIH-ODS/NIST collaboration were to develop authentic botanical ingredient reference materials with values assigned for the content of active and/or marker compounds for verification of supplement label claims and for quality control during manufacturing, particularly to address safety concerns related to contaminants such as toxic elements (As, Cd, Hg, and Pb). NIH-ODS, NIST, and other stakeholders, i.e., US Food and Drug Administration (FDA), US Department of Agriculture (USDA), and AOAC International, worked together to identify priorities based on safety concerns, market share, and ongoing and proposed clinical trials of botanical supplement ingredients. The initial dietary supplement ingredients identified for SRM development included ephedra, Ginkgo biloba, saw palmetto, St. John’s wort, green tea, berries of various Vaccinium species, and various botanical oils. Within the dietary supplement industry, chemical measurements are typically performed on both raw materials (plants and extracts of plants) and the finished products. Thus, the SRMs were designed to be representative of these various matrices resulting in a “suite” of materials consisting of authentic plant material, an extract of the plant material, and finished product (e.g., tablets), as appropriate.
The intent was to provide different matrices that would provide different analytical challenges, e.g., different concentrations of constituents of interest, differences in extractability of constituents from the matrix, and potential different interferences.

Ephedra was identified as the highest priority based on safety concerns and a suite of SRMs for ephedra was developed including plant aerial parts, extracts, solid oral dosage form (SODF), and ephedra-containing protein powder. Unfortunately, in 2004 before the ephedra SRM suite was issued, the FDA banned the sale of ephedra-containing supplements. The ephedra SRM suite was issued in 2006 and distributed until 2011 when it was discontinued. Even though the ephedra SRMs were not widely distributed or available for a significant time, the effort was not a failure because the development of the ephedra SRM suite provided the experience and a model approach for the botanical matrix SRMs that followed [49].

Whereas ephedra was selected as high priority for safety concerns, Ginkgo biloba was a high priority because of high sales of ginkgo-containing supplements, which were as high as $244 million annually when priorities were established in 2005 [50] and are currently near $100 million annually [51]. In traditional medicine, Ginkgo biloba leaves find predominant usage for memory improvement and Alzheimer’s treatment and prevention, and the perceived health benefits are attributed to terpene lactones and flavonoid aglycones. In 2007 NIST issued a suite of three Ginkgo biloba materials: SRM 3246 Ginkgo biloba (Leaves), SRM 3247 Ginkgo biloba (Extract), and SRM 3248 Ginkgo-containing Tablets [52]. Using the multiple analytical methods approach, values were assigned for flavonoids, ginkgolides, and toxic elements [52]. As a result of the NIH-ODS collaboration, NIST has developed over 40 SRMs/RMs to support the botanical dietary supplement measurement community including ephedra, Ginkgo biloba, St. John’s wort, Vaccinium spp. berries, soy, botanical oils, saw palmetto, yerba mate, kelp powder, turmeric, and ginger.

Number 7: Multivitamin/Multielement Tablets

In 2009, the NIH-ODS collaboration expanded the SRM portfolio to include non-botanical dietary supplement SRMs with the release of SRM 3280 Multivitamin/Multielement Tablets. The motivation for the development of SRM 3280 was twofold: (1) the widespread use of multivitamin/mineral (MVM) supplements among the US adult population, and (2) the need to assure the quality of results included in an important dietary supplement composition database. The most widely used dietary supplements in the USA are MVM supplements with 53% of adults reporting usage in 2017 [51]. Of the $48.7 billion in US dietary supplement sales in 2019, $6.46 billion were for MVM supplements [53]. Since 2003 NIH-ODS has collaborated with the US Department of Agriculture (USDA) in the development of a Dietary Supplement Ingredient Database (DSID) with reported composition verified by chemical analysis [54, 55]. The initial products selected for analytical verification in the DSID were adult MVM supplements, and it was recognized that an SRM would be a valuable tool to assess the accuracy of the chemical analyses performed by contract laboratories for the MVM products included in the DSID.

The multivitamin/multielement tablets used for SRM 3280 were prepared by a manufacturer of MVM products as a non-commercial batch of tablets [56] (see Fig. S5, ESM). SRM 3280 is provided in the form of whole tablets (30 tablets per bottle) rather than ground powdered material, which would provide better homogeneity. However, because some of the vitamins are coated or encapsulated to provide stability, grinding the tablets would potentially compromise the coating and reduce stability. The Certificate of Analysis for SRM 3280 recommends that a minimum of 15 tablets be ground to obtain a homogeneous powdered sample prior to removal of the test portion for analysis.

The goal in the development of SRM 3280 was to assign certified values for all the vitamins and elements listed on the Supplement Facts label for an MVM product (typically 30 to 35 constituents). The development of SRM 3280 (and during the same time period SRM 1849) provided the opportunity and motivation at NIST to develop ID LC-MS and ID LC-MS/MS methods for both fat-soluble and water-soluble vitamins and carotenoids to satisfy the requirement for assigning certified values using multiple independent methods. At the time, most methods for the determination of vitamins and carotenoids were based on LC with either UV absorbance or fluorescence detection; however, these UV absorbance and fluorescence detection-based methods were limited for some vitamins and often lacked the specificity required for complex food matrices. Only limited LC-MS or LC-MS/MS methods were in use at the time and no LC-MS-based methods employed an ID approach for quantification of vitamins. As part of the certification of SRM 3280, NIST developed ID LC-MS and ID LC-MS/MS methods for 9 of the 13 vitamins and carotenoids using isotopically labeled analogues. Many of these methods were the first reports of ID LC-MS and ID LC-MS/MS methods for vitamins, and their application to SRM 3280 is described by Phinney et al. [57]. A significant challenge in the development of these ID methods was obtaining isotopically labeled vitamins for use as internal standards. Isotopically labeled vitamins were available commercially for only a limited number of the vitamins. To address this challenge, NIST, with support from NIH-ODS, worked with commercial sources to synthesize and make available isotopically labeled analogues of vitamins determined in the MVM SRM.

The value assignment of vitamins and carotenoids in SRM 3280 was supported by measurements from collaborating laboratories including USDA and laboratories participating in interlaboratory exercises conducted by the European Committee for Standardization (CEN) and the Food Industry Analytical Chemists Committee (FIACC) of the Grocery Manufacturers Association (GMA). The details of the methods used for value assignment of the vitamins and carotenoids in SRM 3280 are described in Sander et al. [56]. The certified and reference values for 17 vitamins and carotenoids in SRM 3280 are summarized in Table S1 (ESM), including the multiple independent method and collaborating laboratory results used to assign each value [56]. SRM 3280 was the first non-botanical dietary supplement SRM developed as part of the collaboration with NIH-ODS; however, six additional non-botanical dietary supplement SRMs were developed later including fish oils, krill oil, tocopherols in oil, calcium and chromium supplements, and iodized table salt.

Number 8: SRM 909 Human Serum

SRM 909, issued in 1980, was the first in a long line of serum-based SRMs certified for clinical diagnostic markers. SRM 909 was issued as a freeze-dried serum matrix with certified values assigned for cholesterol, creatinine, glucose, urea, uric acid, and inorganic electrolytes (Ca, Li, Mg, K, Na, and Cl). Certified values were dependent on using a specified procedure for weighing and reconstituting the freeze-dried serum. In contrast to the multiple independent methods approach described for the previous Top Ten SRMs, the certified values for the organic clinical markers in SRM 909 were based on only one method, a “definitive” method. The definitive methods for cholesterol, glucose, creatinine, uric acid, and urea were based on ID GC-MS [58,59,60,61]. SRM 909 was re-issued in 1993 (SRM 909a) and 2003 (SRM 909b) as freeze-dried serum and finally in 2010 as a frozen material (SRM 909c). The SRM 909 series has evolved over four decades changing from one level to two levels and back to only one level for the current version. SRM 909b was issued with high purity diluent water included with the unit for use in reconstituting the freeze-dried material. As early as 1988, frozen serum-matrix SRMs for individual clinical diagnostic markers with values assigned in SRM 909 appeared with SRM 1951 Lipids in Frozen Human Serum for cholesterol and SRM 956 Electrolytes in Human Serum in 1990. SRM 909b was issued without a value for glucose because the glucose concentration was found to decrease slowly and predictably over time in the freeze-dried serum matrix, and a frozen serum SRM for glucose only (SRM 965 Glucose in Frozen Human Serum) has been available since 1996. SRM 967 Creatinine in Frozen Human Serum was issued in 2007. Thus, the currently available SRM 909c is unique only in having values for urea and uric acid but continues to have values for creatinine, cholesterol, and electrolytes even though other multi-level (2 to 4 levels) SRMs are available for these analytes.
SRM 909c currently has sales of about 130 units/year.

SRM 909 was significant because it established the model for the development of serum-based SRMs for clinical diagnostic markers (e.g., homocysteine, steroid hormones, and thyroid hormones) using only one analytical method and providing multiple concentration levels. While the original ID GC-MS definitive method for creatinine has been replaced with a more suitable ID LC-MS method [62], cholesterol, uric acid, urea, and glucose are still certified using the ID GC-MS methods developed nearly four decades ago [58,59,60,61], which is a tribute to their accuracy considering the significant advances in analytical methodology, particularly with the emergence of LC-MS and LC-MS/MS as the preferred techniques for clinical markers in serum.

Number 9: SRM 972 Vitamin D in Human Serum

The development of SRM 972 Vitamin D in Human Serum, issued in 2009, was a “perfect storm” scenario for the production of a successful SRM. The major metabolites of vitamin D2 (ergocalciferol) and vitamin D3 (cholecalciferol) are 25-hydroxyvitamin D2 [25(OH)D2] and 25-hydroxyvitamin D3 [25(OH)D3], respectively, with 25(OH)D3 as the predominant metabolite unless supplementation with ergocalciferol has occurred. Epimers of each vitamin D metabolite exist in the serum but only the 3-epi-25(OH)D3 is significant at about 5 to 7% of the 25(OH)D3 content. The primary clinical marker of vitamin D status is total serum 25-hydroxyvitamin D [25(OH)D], which is defined as the sum of 25(OH)D2 and 25(OH)D3 excluding the 3-epi-25(OH)D3. Immunoassays utilize antibodies that interact with similar regions of the 25(OH)D2 and 25(OH)D3 to provide (ideally) equal response and recovery for both metabolites. A variety of immunoassays and LC-MS/MS assays exist for the determination of 25(OH)D; however, the results are known to vary significantly depending on the particular assay used. In principle, LC-MS/MS assays are considered to be more accurate because they determine 25(OH)D2 and 25(OH)D3 unambiguously. However, if the LC-MS/MS assay does not chromatographically separate the 3-epi-25(OH)D3 and the 25(OH)D3, it may be biased high.
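The size of the high bias from an unresolved epimer can be illustrated with a short calculation. The following Python sketch is illustrative only: the function names and the serum concentrations are hypothetical, and it assumes that co-eluting 3-epi-25(OH)D3 is counted in full as 25(OH)D3, which may not hold for a particular assay.

```python
def total_25ohd(d2, d3):
    """Total 25(OH)D as defined in the text: 25(OH)D2 + 25(OH)D3, epimer excluded."""
    return d2 + d3


def coelution_bias_percent(d2, d3, epimer_fraction):
    """Percent high bias in total 25(OH)D when 3-epi-25(OH)D3 is not separated
    from 25(OH)D3 and is therefore counted with it.

    epimer_fraction is the epimer content relative to 25(OH)D3 (e.g., 0.05).
    Assumption: the unresolved epimer contributes its full amount to the signal.
    """
    epimer = epimer_fraction * d3
    true_total = total_25ohd(d2, d3)
    apparent_total = true_total + epimer
    return 100.0 * (apparent_total - true_total) / true_total


# Hypothetical serum concentrations (same units for both), not SRM values.
d2, d3 = 1.0, 25.0
for fraction in (0.05, 0.07):
    bias = coelution_bias_percent(d2, d3, fraction)
    print(f"epimer at {fraction:.0%} of 25(OH)D3: total 25(OH)D biased high by {bias:.1f}%")
```

Under these assumptions, the bias in total 25(OH)D approaches the epimer fraction itself whenever 25(OH)D3 dominates the total, which is the usual case in the absence of ergocalciferol supplementation.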

In 2006 NIH-ODS provided significant funding to NIST to support activities, including the development of SRMs, to improve the comparability and quality of measurements of 25(OH)D to assess vitamin D status. NIST first developed reference measurement procedures based on ID LC-MS/MS for the determination of 25(OH)D2 and 25(OH)D3, which were published in 2010 [63]. In clinical chemistry, a reference measurement procedure is “accepted as providing measurement results fit for their intended use in assessing measurement trueness of measured values obtained from other measurement procedures for quantities of the same kind, in calibration, or in characterizing reference materials” [64]. In practice, a reference measurement procedure is a higher order method based on specific criteria [65] (see [66] for general requirements) and recognized by the Joint Committee on Traceability in Laboratory Medicine (JCTLM) [67]. Parallel with the development of the reference measurement procedures, NIST designed an SRM based on four pools of human serum with differing levels of 25(OH)D and the individual metabolites, i.e., normal level 25(OH)D3, low level of 25(OH)D3, high level 25(OH)D2, and high level 3-epi-25(OH)D3. The resulting levels were level 1 = endogenous normal level, level 2 = level 1 pool diluted 2× with horse serum, level 3 = normal serum pool fortified with 25(OH)D2, and level 4 = normal serum fortified with 3-epi-25(OH)D3. Because the reference measurement procedures were not yet recognized by the JCTLM, certified values (see Table S2, ESM) were assigned based on results from the candidate ID LC-MS/MS reference measurement procedures, an ID LC-MS method, and a CDC ID LC-MS/MS procedure [68]. SRM 972 was issued in 2009 as a frozen serum matrix, and it was rapidly embraced by the vitamin D measurement community with sales of over 700 units/year. 
Even with the high demand for SRM 972, there were concerns raised regarding the design of the SRM, i.e., some users claimed that the use of horse serum to dilute level 2 and the fortification of level 3 with 25(OH)D2 affected their assay’s performance. Unfortunately, with the high sales rate, the inventory of SRM 972 was depleted in late 2011. Fortunately, NIST had the opportunity to address the design flaws for SRM 972 and to significantly improve the replacement material.

In 2013, SRM 972a Vitamin D Metabolites in Frozen Human Serum was re-issued, again with four concentration levels. The material design was significantly improved with three endogenous levels including the normal pool, low level pool, and high 25(OH)D2 pool achieved through donors supplementing with ergocalciferol. Only level 4 contained an exogenous high level of 3-epi-25(OH)D3. The value assignment approach still combined results from ID LC-MS, ID LC-MS/MS, and CDC ID LC-MS/MS; however, the CDC ID LC-MS/MS method had been upgraded to separate the 3-epi-25(OH)D3 and 25(OH)D3 and was now a candidate reference measurement procedure [69] (see Table S2, ESM). In 2017, SRM 2973 Vitamin D Metabolites in Frozen Human Serum (High Level) was released with a concentration of 25(OH)D3 34% higher than the highest level in SRM 972a, thereby doubling the working range for the five levels among the two SRMs [70]. At the same time, NIST developed a reference measurement procedure for 24R,25-dihydroxyvitamin D3 [24,25(OH)2D3] [71], another important marker for vitamin D status, and assigned certified values to both SRM 972a and SRM 2973.

In 2019, a unique SRM was produced with human serum pools from female donors of reproductive age who were either not pregnant or in one of the three trimesters of pregnancy. SRM 1949 Frozen Prenatal Human Serum was intended primarily for the determination of thyroid hormones, total thyroxine (T4) and total triiodothyronine (T3); however, reference values were assigned for 25(OH)D2, 25(OH)D3, 3-epi-25(OH)D3, and vitamin D binding protein (VDBP). SRM 1949 is the first SRM to have values assigned for VDBP [72] and the four serum pools illustrate the increasing concentrations of VDBP during pregnancy (see Table S2, ESM). Other human serum-based SRMs developed specifically for nutritional markers (all with support from NIH) include fat-soluble vitamins and carotenoids (SRM 968 series), fatty acids (SRM 2378), vitamin B6 (SRM 3950), and folate vitamers (SRM 3949).

Improving the uncertainty of certified values with renewals

In addition to a desire to increase the number of constituents with values assigned for an SRM as discussed above for environmental matrix SRMs, another goal for next-generation renewals in an SRM series was to improve (i.e., reduce) the uncertainty associated with the assigned certified values. This concept for environmental SRMs was discussed in a review paper [1] with the SRM 1941 marine sediment series as an example with uncertainties for certified mass fractions of 11 PAHs improving from 7 to 24% in the original SRM 1941 to 5 to 17% in SRM 1941b after 13 years. A similar trend is observed for the SRM 1974 series of four mussel tissue SRMs as shown in Table S3 (ESM) where relative uncertainties for certified mass fractions for a group of 14 PAHs decreased from a range of 11 to 28% in the original SRM 1974 to 1 to 7% in SRM 1974c over a period of 22 years.
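The relative uncertainties quoted here are ratios of the uncertainty to the assigned value, expressed in percent, and the ranges are taken over a group of compounds. A minimal Python sketch of that bookkeeping follows; the function names and the (value, uncertainty) pairs are hypothetical and are not taken from any Certificate of Analysis.

```python
def relative_uncertainty_percent(value, uncertainty):
    """Uncertainty expressed as a percentage of the assigned value."""
    if value <= 0:
        raise ValueError("assigned value must be positive")
    return 100.0 * uncertainty / value


def relative_range(assigned):
    """Smallest and largest relative uncertainty (%) over (value, uncertainty) pairs."""
    rel = [relative_uncertainty_percent(v, u) for v, u in assigned]
    return min(rel), max(rel)


# Hypothetical (mass fraction, uncertainty) pairs in the same units; not SRM data.
original = [(10.0, 2.4), (50.0, 3.5), (200.0, 30.0)]
renewal = [(12.0, 2.0), (48.0, 2.4), (190.0, 10.0)]

for label, assigned in (("original", original), ("renewal", renewal)):
    low, high = relative_range(assigned)
    print(f"{label}: relative uncertainties from {low:.1f}% to {high:.1f}%")
```

Comparing ranges in this way is only meaningful when the same group of compounds and the same type of uncertainty (e.g., expanded uncertainty with the same coverage) are used for each material in the series.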

SRM 1649 Urban Dust provides another example based on the recertification of the same material batch, as opposed to different collections of sediment or mussels as discussed above. The improvement in the uncertainty associated with the assigned values for six PAHs in SRM 1649 through SRM 1649b is illustrated in Fig. 5. The assigned values and relative uncertainties for 15 PAHs in the urban dust SRMs are summarized in Table S4 (in ESM). For this group of PAHs, the uncertainties associated with the certified values were reduced from a range of 5 to 24% in the initial certification in 1982 to a range of <1.0 to 5.5% in the current SRM 1649b. The significant reductions in the uncertainties through four major updates for the same material over more than three decades of measurements illustrate the improvements in analytical techniques and certification approach. In the development of first- and second-generation environmental matrix SRMs certified for PAHs, PCB congeners, pesticides, and PBDEs, NIST often used four to six methods including results from experienced collaborating laboratories (typically in the form of interlaboratory comparison studies). However, over time with increased confidence in the NIST measurement approaches, the number of methods was reduced and the use of collaborating laboratories minimized or eliminated (e.g., SRM 1974c). An example of the reduced uncertainties associated with certified values assigned for individual PCB congeners in SRM 1588 Cod Liver Oil series was reported previously [1].

Fig. 5

Evolution of mass fraction of selected PAHs and associated uncertainties through renewals and updates of SRM 1649 Urban Dust series. A Benzo[a]pyrene. B Indeno[1,2,3-cd]pyrene. C Benzo[ghi]perylene. D Chrysene. E Fluoranthene. F Benzo[b]fluoranthene. Percentages at beginning and end of the dashed arrow indicate the relative uncertainty for SRM 1649 and SRM 1649b (2015)

For the evolving uncertainties associated with the measurement of vitamins in food-matrix SRMs, the infant formula SRM series offers an excellent example. The changing uncertainties of certified values for three vitamins are illustrated in Fig. S6 for vitamin B3 (niacin) and in Fig. S7 for vitamin B1 (thiamine) and vitamin B6 (pyridoxine) (see ESM). The case for niacin has been discussed previously [43] with a reduction from 12% for SRM 1846 to 2.2% for SRM 1869, with an anomalous increase for SRM 1849a when precise ID LC-MS measurements were combined with non-concordant manufacturer’s data resulting in a conservatively large estimate of the uncertainty. The results in ESM Fig. S7 show a continuous, consistent improvement in uncertainties from 12 to 2.4% and from 8 to 2.4% for vitamin B6 and vitamin B1, respectively, for the infant formula SRM series.

A final example of uncertainty improvement with renewals from the clinical area is illustrated in Table S2 (see ESM) for vitamin D metabolites in human serum. Traditionally, clinical markers have been certified based on single definitive or reference methods as described above for SRM 909. However, for the certification of 25(OH)D3 in SRM 972, results were combined from two methods at NIST (i.e., an ID LC-MS/MS candidate reference measurement procedure and an ID LC-MS method) and an ID LC-MS/MS method from CDC that did not separate the 3-epi-25(OH)D3 from 25(OH)D3. The combination of results from these three methods resulted in uncertainties for 25(OH)D3 ranging from 2.5 to 6.1% for the four levels of SRM 972. For the certification of the renewal SRM 972a, the CDC ID LC-MS/MS method was a different method that separated the 3-epi-25(OH)D3 and 25(OH)D3. The resulting uncertainties for the 25(OH)D3 improved with a range of 2.1 to 3.9% over the four levels of SRM 972a. Four years later, 25(OH)D3 was certified in SRM 2973 using only results from the NIST reference measurement procedure resulting in an uncertainty of 2.1% [70]. Similarly, 24,25(OH)2D3 was certified in both SRM 972a and SRM 2973 using only results from the reference measurement procedure [71] with a consistent uncertainty of 3.5 to 3.8% for the five different serum pools. Using a reference measurement procedure as the only method for assigning certified values generally provides lower uncertainties than achieved when combining results from multiple methods.
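Why combining methods can enlarge the uncertainty is easiest to see with numbers. The Python sketch below uses a deliberately simplified and conservative model, an unweighted mean of method means whose uncertainty includes a between-method term; it is an assumption made for illustration and not the statistical procedure used for any SRM. All names and values are hypothetical.

```python
import math
import statistics


def combine_methods(means, std_uncertainties):
    """Unweighted mean of method means with a combined standard uncertainty.

    Simplified model (an assumption for illustration):
      - within-method term: uncertainty of the mean of n independent results
      - between-method term: standard error of the method means
    """
    n = len(means)
    consensus = statistics.fmean(means)
    u_within = math.sqrt(sum(u ** 2 for u in std_uncertainties)) / n
    u_between = statistics.stdev(means) / math.sqrt(n) if n > 1 else 0.0
    return consensus, math.sqrt(u_within ** 2 + u_between ** 2)


def relative_expanded_percent(value, u, k=2.0):
    """Relative expanded uncertainty in percent for coverage factor k."""
    return 100.0 * k * u / value


# Hypothetical results: one precise method alone versus the same method
# combined with a second, non-concordant method of equal precision.
single = combine_methods([25.0], [0.2])
combined = combine_methods([25.0, 27.0], [0.2, 0.2])

for label, (value, u) in (("single method", single), ("two methods", combined)):
    print(f"{label}: value {value:.2f}, relative expanded uncertainty "
          f"{relative_expanded_percent(value, u):.1f}%")
```

In this model, disagreement between methods dominates the combined uncertainty even when each method is individually precise, which mirrors the anomalous increase described above for niacin in SRM 1849a. When the methods agree closely, the same model yields an uncertainty comparable to or smaller than that of a single method.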

Number 10: SRM 1950 Metabolites in Human Plasma

The last SRM to make the Top Ten, SRM 1950 Metabolites in Human Plasma, is included because of its unique design and potential for novel uses. At a 2005 workshop, NIH-funded metabolomics researchers identified a need for a reference material to support the development of technologies for metabolomics, and NIH collaborated with NIST to develop SRM 1950. With input from a panel of experts, SRM 1950 was designed to represent “normal” human plasma which was obtained from 100 donors (50 male and 50 female) who had undergone an overnight fast prior to the blood draw and who met a number of requirements including (1) free from overt diseases (i.e., healthy), (2) 40 to 50 years old, (3) a racial distribution based on the US population (2000 census, i.e., 77% White, 12% African American, 4% Asian, 2% Native American or Alaskan Native, 5% other with 15% of individuals of Hispanic origin), and (4) no medications 72 h prior to the draw. Excluded were donors who (1) were extreme exercisers (e.g., marathon runners), (2) adhered to extreme diets, or (3) had body mass indices outside the 95th percentile. Why were the requirements so specific to generate this plasma pool? The intent was to provide a plasma pool that could be replicated in the future using similar criteria and to provide an inventory that would last a decade or more.

Even though SRM 1950 was not designed for any specific class of metabolites or for use with any specific analytical methods, the value assignment approach was based on results from a reference measurement procedure or multiple independent analytical methods (GC-MS, LC-MS, LC-MS/MS, ICP-MS, most with isotope dilution quantification) at NIST and at CDC. SRM 1950 was issued in 2011 with certified values assigned for 45 metabolites including cholesterol/triglycerides, fatty acids, vitamins, amino acids, clinical markers, and elements and reference values for an additional 45 metabolites, as described by Phinney et al. [73]. In the decade since SRM 1950 was issued, sales have increased steadily (see Fig. 6A) to the current rate of nearly 300 units/year. Is SRM 1950 being used just as a traditional quantitative control material or in metabolomic studies as intended? As shown in Fig. 6B, publications reporting the use of SRM 1950 have grown significantly in recent years and particularly those reporting use in metabolomic [74,75,76] and lipidomic studies [76, 77]. In a review position paper, Burla et al. [78] endorsed the use of SRM 1950 with this statement: “Using the NIST SRM 1950 as a reference plasma will not only be useful in harmonizing datasets but will also provide valuable information on the analytical variability across approaches, platforms, and software, recognizing problematic lipid species and classes whose quantification is ‘consistently inconsistent’ between sites, identifying platform-dependent quantification biases, and, hence, enabling the continuous improvement and standardization of quantitative plasma lipidomics.” With less than a decade of inventory remaining, what will replace SRM 1950? Will resources be available to assign values for all the metabolites listed for the current material?
Perhaps the model for the next SRM 1950, and perhaps for other SRMs intended for “omics” measurements, should be to produce a common plasma/serum pool with a limited number of assigned values and to encourage users to report their qualitative and quantitative characterization, including information on analytical methods used, to a common database to allow comparison among the measurement community.

Fig. 6 Distribution and use of SRM 1950: (A) bar graph of sales of SRM 1950 from 2011 through 2020 and (B) bar graph of the number of publications per year (2010 to 2020) reporting the use of SRM 1950, with publications categorized as using SRM 1950 for metabolomic studies (orange), lipidomic studies (yellow), and other (purple)

Lessons learned from the Top Ten SRMs

What have we learned from 40 years of developing SRMs for environmental, clinical, food, and dietary supplement analyses?

Matrix presentation

A dry, homogeneous powder may be the ideal, most convenient matrix for an SRM, but the natural state of the sample analyzed is ultimately the preferred presentation, particularly if the dry material does not behave the same as the wet presentation during the analytical process (commutability [79]), e.g., frozen serum/tissue versus freeze-dried serum/tissue, or wet food versus dry food. For most clinical and many food-matrix SRMs, frozen storage at < −20 °C is now required, and undoubtedly most CRMs in these categories, as well as many for environmental analysis, will eventually be stored frozen.

Endogenous versus exogenous constituents

Avoid fortification (spiking) of constituents of interest into the matrix, if possible. Endogenous constituents are preferred and may be more stable, and in the case of human biological fluids, supplementation and screening of donors can often provide low or elevated levels as needed.

Renewals of SRMs

When an SRM is re-issued, we should always ask the question, can it be improved? Generally, the first time an SRM is re-issued, it is likely that it can be improved in either the number or quality of the values assigned or matrix characteristics.

Longevity

Several of the SRMs discussed have been available as the same batch of material for 10 to 40 years, as shown in Table 4. If it is possible to prepare a large batch of material with long-term stability (e.g., for environmental contaminants), do it. If collection and preparation of an SRM batch requires significant resources, you do not want to have to do it again in a relatively short period. In most cases, a 10-year inventory should be the minimum quantity prepared when stability is not an issue. For SRMs that have a long lifetime, considerable valuable information generally accumulates in the literature reporting additional characterization of the material (e.g., SRM 1649b).

Certified values may change

A certified value for an SRM may change over time as the analytical methods and/or certification approaches improve. This does not mean that the original value was “wrong” at the time; instead, the analytical methods and our understanding of the measurement system have advanced to provide an “improved” certified value. NIST and other CRM producers use state-of-the-art analytical techniques and approaches to assign what they consider to be the true value. If other researchers find a different result using an advanced analytical method or approach, they should publish their findings. CRM producers should not be embarrassed if other researchers obtain a different result but should embrace the opportunity to investigate and advance their measurements.

Dual usage SRMs for both organic and elemental analysis

Most CRMs are intended for either trace element or trace organic analysis, but not for both. Although it may seem like a good idea to produce one SRM intended for both organic and elemental analysis, it may not be cost-effective for NIST to produce or for the customer to use such materials. Several of the SRMs on the list were produced for both inorganic and organic analysis (e.g., SRM 909, SRM 3280, SRM 1849, SRM 3246), whereas for other SRMs, the addition of trace element values to a material used primarily for organic analysis was a secondary goal to further characterize the matrix (e.g., SRM 1649, SRM 1941, SRM 1974).

Honorable Mention SRMs

Of the matrix SRMs developed in the past 40 years, these Top Ten are outstanding examples of the scope of analytical challenges associated with producing such materials. For further consideration, I have included a second list of 10 noteworthy SRMs that I have designated as “Honorable Mention” based on the same criteria (see Table S3, ESM).

Conclusions

As illustrated by the Top Ten SRM list, there have been significant advances and evolution during the past four decades in production and analytical capabilities for the development and characterization of SRMs for organic analysis. The matrices produced and analytes measured have expanded significantly, and many challenges have been suitably addressed to produce useful SRMs for the environmental, clinical, food, and dietary supplement measurement communities. There are significant new and challenging opportunities as needs for CRMs expand to address quantitative measurements for the biosciences. It will be interesting to observe how CRM development and use evolve during the next 40 years.