The Metabolomics Standards Initiative manuscripts published in 2007 represented a community-based agreement on reporting standards for a wide range of topics including chemical analysis, data analysis, ontologies and other processes associated with metabolomics (Fiehn et al. 2007). However, there has been no community-based agreement on the requirements and rules associated with quality assurance (QA) and quality control (QC) in metabolomics studies, although research into QC has been under way for about 10 years (Sangster et al. 2006; Dunn et al. 2011). As a result, each metabolomics laboratory has had to determine its own QA and QC procedures.

In 2014, a Data Quality Task Group (DQTG) was created in the Metabolomics Society to promote QA and QC in the metabolomics community through increased awareness, education and the endorsement of QA and QC best practices (Bearden et al. 2014). The DQTG released a questionnaire about QA and QC practices in metabolomics labs on August 16, 2015, and approximately 100 scientists responded (Dunn et al. 2017). The results of the DQTG questionnaire generated four major recommendations: (1) provide guidance on QA processes and develop consensus processes through meetings and reports; (2) provide education to the metabolomics community on usage of QC processes; (3) communicate with the metabolomics community to define the types and volumes of Standard Reference Materials required; and (4) recognize the need to provide further incentive for laboratories to improve overall QA/QC practices.

In the 2 months since the “Guidelines and considerations for the use of system suitability and QC samples in mass spectrometry assays applied in untargeted clinical metabolomics” manuscript by Broadhurst et al. (2018) was published, it has been downloaded over 3300 times. This achievement indicates great interest in QC as a means to advance mass spectrometry-based clinical metabolomics. A major insight from this paper is the use of “system suitability” QC samples to determine whether the performance of the analytical system is “fit for purpose” before the actual samples of interest are analyzed. The authors discuss metrics for evaluating the performance of system suitability and other QC samples, as well as the appropriate uses of each sample type.

The Broadhurst et al. (2018) paper identifies two types of system suitability QC samples: (1) a blank sample, which can reveal impurities or contaminants in the system before analysis starts, and (2) a synthetic sample composed of a small set of authentic standards (5–10) with known mass-to-charge (m/z) ratios and chromatographic characteristics, which is used to determine whether the system is meeting instrument specifications and predefined intra-laboratory or intra-study precision and reproducibility criteria. Examples of acceptance criteria for the synthetic system suitability sample are an m/z error of < 5 ppm relative to the theoretical value (typically the accuracy of the mass spectrometer), a retention time drift of < 2%, a peak area within 10% of a predefined acceptable peak area, and a symmetric peak shape with no evidence of peak splitting. The authors also discuss the collection and analysis of pooled samples; process internal standards that are added to test samples to monitor m/z, retention time, peak shape, and peak area during the sample run; and standard reference materials and long-term reference samples for inter-study and inter-laboratory assessment.
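The example acceptance criteria for the synthetic system suitability sample can be expressed as a simple check. The following Python sketch is only illustrative: the `StandardCheck` record, the function names, and the use of an absolute percent deviation for retention time and peak area are assumptions made here, not code or definitions from Broadhurst et al. (2018), and peak shape is not assessed.

```python
from dataclasses import dataclass


@dataclass
class StandardCheck:
    """One authentic standard in a system suitability injection (hypothetical record)."""
    name: str
    theoretical_mz: float
    observed_mz: float
    expected_rt: float      # predefined retention time, e.g. in minutes
    observed_rt: float
    expected_area: float    # predefined acceptable peak area
    observed_area: float


def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def percent_deviation(observed: float, expected: float) -> float:
    """Absolute deviation from the expected value, as a percentage."""
    return abs(observed - expected) / expected * 100.0


def passes_system_suitability(
    s: StandardCheck,
    max_ppm: float = 5.0,        # m/z error < 5 ppm
    max_rt_pct: float = 2.0,     # retention time drift < 2%
    max_area_pct: float = 10.0,  # peak area within 10% of expected
) -> bool:
    """Apply the three numeric criteria; peak symmetry is not checked here."""
    return (
        abs(ppm_error(s.observed_mz, s.theoretical_mz)) < max_ppm
        and percent_deviation(s.observed_rt, s.expected_rt) < max_rt_pct
        and percent_deviation(s.observed_area, s.expected_area) < max_area_pct
    )


# Usage with made-up values for a single standard.
example = StandardCheck(
    name="standard_A",
    theoretical_mz=200.0, observed_mz=200.0004,
    expected_rt=5.00, observed_rt=5.05,
    expected_area=1.0e6, observed_area=1.05e6,
)
print(example.name, passes_system_suitability(example))
```

In practice a run would be accepted only if every standard in the synthetic sample passes, and the thresholds would be replaced by whatever a laboratory has predefined for its own platform.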

The three QC metrics Broadhurst and colleagues describe are: (1) relative standard deviation (RSD), which measures precision in the QC samples; (2) the dispersion ratio (D-ratio), which compares the dispersion of each metabolite in the pooled QC samples with its dispersion in the test samples; and (3) the peak detection rate, which can be reported for every metabolite in the QC samples. The acceptance criterion for detection rate is often set at 70%, and that for RSD at < 20% or < 30% depending on the sample type (Sangster et al. 2006; Dunn et al. 2011; Lewis et al. 2016); the D-ratio is set at a maximum of 50% and preferably much lower for optimal biomarker discovery. Metabolites or peaks that fail any of these tests can be removed before further analysis of the test samples. Filtering out unreliable peaks in this way should improve the subsequent biomarker discovery.
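As a rough illustration of how these three metrics might be applied as a filter, the Python sketch below computes them for a single metabolite. It rests on several assumptions that are not taken from the text: the D-ratio is computed as the standard deviation in the pooled QC samples divided by the standard deviation in the test samples, expressed as a percentage; missing peaks are encoded as `None`; peak areas are positive; and the function names and default thresholds (70%, 20%, 50%) are chosen here only to mirror the values quoted above.

```python
import statistics
from typing import Optional, Sequence


def detection_rate_percent(qc_values: Sequence[Optional[float]]) -> float:
    """Percentage of QC injections in which the peak was detected (not None)."""
    detected = [v for v in qc_values if v is not None]
    return 100.0 * len(detected) / len(qc_values)


def rsd_percent(values: Sequence[float]) -> float:
    """Relative standard deviation: sample standard deviation over mean, in percent."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


def d_ratio_percent(qc_values: Sequence[float], sample_values: Sequence[float]) -> float:
    """Assumed D-ratio: dispersion in pooled QCs relative to dispersion in test samples."""
    sample_sd = statistics.stdev(sample_values)
    if sample_sd == 0:
        return float("inf")  # no spread in the test samples, so the peak is uninformative
    return 100.0 * statistics.stdev(qc_values) / sample_sd


def passes_qc(
    qc_values: Sequence[Optional[float]],
    sample_values: Sequence[Optional[float]],
    min_detection: float = 70.0,
    max_rsd: float = 20.0,
    max_d_ratio: float = 50.0,
) -> bool:
    """Return True if the metabolite passes all three criteria."""
    if detection_rate_percent(qc_values) < min_detection:
        return False
    qc = [v for v in qc_values if v is not None]
    samples = [v for v in sample_values if v is not None]
    if len(qc) < 2 or len(samples) < 2:
        return False  # too few values to estimate a standard deviation
    return rsd_percent(qc) < max_rsd and d_ratio_percent(qc, samples) <= max_d_ratio


# Usage with made-up peak areas for one metabolite.
pooled_qc = [100.0, 102.0, 98.0, 101.0, 99.0]
test_samples = [50.0, 150.0, 80.0, 120.0, 100.0]
print(passes_qc(pooled_qc, test_samples))
```

A real pipeline would apply such a function to every peak in the feature table and keep only those that pass; robust alternatives to the standard deviation could be substituted without changing the structure of the filter.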

Metabolomics can be used to discover multiple types of clinical biomarkers including susceptibility/risk, diagnostic, prognostic, predictive, pharmacodynamic/response, and safety biomarkers (https://www.ncbi.nlm.nih.gov/books/NBK338448/). In addition to biomarkers, there are several reports of the usefulness of clinical metabolomics data for pharmacometabolomics (Clayton et al. 2009) and precision medicine (Kaddurah-Daouk and Weinshilboum 2014; Beger et al. 2016). The utility of clinical metabolomics is highlighted by the fact that, as of October 2018, there are 131 human studies within the MetaboLights (https://www.ebi.ac.uk/metabolights/) database, 310 human studies deposited at the Metabolomics Workbench (http://www.metabolomicsworkbench.org/; which recently secured NIH funding to continue, http://jacobsschool.ucsd.edu/news/news_releases/release.sfe?id=2616), and 51 clinical cohorts of metabolomics data on blood samples in the COnsortium of METabolomics Studies (COMETS) (https://epi.grants.cancer.gov/comets/). Currently, however, there is no requirement for QC data to be submitted to databases or included in publications, which often makes it hard to determine the quality of the data in the metabolomics databases. Most of the clinical studies in the metabolomics databases include some level of QC but, as noted above, QC practices differ between laboratories, and the absence of accepted quality standards makes study-to-study comparison of QC data difficult. It is becoming increasingly clear that industry-accepted QC standards need to be adopted by the greater metabolomics community, along with a requirement that QC data be submitted together with the metabolomics data when manuscripts are reviewed and datasets are submitted to the repositories.

Sufficient and standardized QC metrics will aid and impact all aspects of clinical metabolomics studies including the initial study design, sample handling, data collection, metabolite identification through database matching, biomarker discovery, and biological and pathway interpretation of the data. Several of the authors of the guidelines manuscript (Broadhurst et al. 2018) were among the ~ 40 scientists who participated in the “Think Tank on Quality Assurance and Quality Control for Untargeted Metabolomic Studies”, held on October 19–20, 2017 at the National Cancer Institute’s Shady Grove Campus in Rockville, MD, USA (Beger et al. 2018). The objectives of the Think Tank were to identify and prioritize the types of test materials that are needed in the field of metabolomics for QA/QC in untargeted studies, to identify the most useful metrics for assessing study and data quality for untargeted metabolomic studies, and to identify and prioritize processes to ensure appropriate reporting of QA/QC data.

Following the Think Tank QA and QC meeting, the group of 40 scientists established the Metabolomics Quality Assurance and Quality Control Consortium (mQACC) (https://epi.grants.cancer.gov/Consortia/mQACC). The objectives of the mQACC include:

  1. Identify, catalog, harmonize and disseminate QA/QC best practices for untargeted metabolomics.

  2. Establish mechanisms to enable the metabolomics community to adopt QA/QC best practices.

  3. Promote and support systematic training in QA/QC best practices for the metabolomics community.

  4. Encourage the prioritization and development of reference materials applicable to metabolomics research.

The mQACC comprises two working groups: (1) a “Reference and Test Material Working Group” that will evaluate three reference materials that are currently needed for inter-laboratory QC and (2) a “Dissemination of Current QA/QC Practices Working Group” that will collect and publish the QC standard operating procedures (SOPs).

The mQACC is sharing information amongst its members and has invited participation of the “MEtabolomics standaRds Initiative in Toxicology” (MERIT), an international multi-stakeholder project that is defining best practices and minimum reporting requirements for metabolomics studies in regulatory toxicology (Viant et al., unpublished). The joint efforts of these groups have led to synchronized naming and definitions of the types of QC samples for metabolomics studies. This is important because the QC definitions that mQACC and MERIT have agreed to differ slightly from some of the QC sample definitions in the Broadhurst et al. (2018) paper. Synchronization of QC definitions and buy-in from the entire metabolomics community will be very important going forward. Information on how to become an affiliate or non-affiliate member of the mQACC will soon be made public.

Looking to the immediate future, the mQACC and related QA and QC efforts will require participation from academia, metabolomics service providers, instrument vendors, biotechs and government to produce community-accepted QC standards that are required for publication and submission to databases. Best practices will need to be determined and tested in large inter-laboratory studies. The hope is that community-accepted QC standards will do for metabolomics what the MicroArray Quality Control (MAQC) consortium (Shi et al. 2006, 2008) did for transcriptomics, which was to increase the quality and community acceptance of transcriptomics data. The Broadhurst et al. (2018) paper has generated a lot of interest and shows that now is the time to harness that interest in a community-wide effort to develop community-accepted and tested QC standards that will be useful in all types of metabolomics studies.