Abstract
Systems biology is the study of complex living organisms, and as such, analysis on a systems-wide scale involves the collection of information-dense data sets that are representative of an entire phenotype. To uncover dynamic biological mechanisms, bioinformatics tools have become essential to facilitating data interpretation in large-scale analyses. Global metabolomics is one such method for performing systems biology, as metabolites represent the downstream functional products of ongoing biological processes. We have developed XCMS Online, a platform that enables online metabolomics data processing and interpretation. A systems biology workflow recently implemented within XCMS Online enables rapid metabolic pathway mapping using raw metabolomics data for investigating dysregulated metabolic processes. In addition, this platform supports integration of multi-omic (such as genomic and proteomic) data to garner further systems-wide mechanistic insight. Here, we provide an in-depth procedure showing how to effectively navigate and use the systems biology workflow within XCMS Online without a priori knowledge of the platform, including uploading liquid chromatography (LC)–mass spectrometry (MS) data from metabolite-extracted biological samples, defining the job parameters to identify features, correcting for retention time deviations, conducting statistical analysis of features between sample classes and performing predictive metabolic pathway analysis. Additional multi-omics data can be uploaded and overlaid with previously identified pathways to enhance systems-wide analysis of the observed dysregulations. We also describe unique visualization tools to assist in elucidation of statistically significant dysregulated metabolic pathways. Parameter input takes 5–10 min, depending on user experience; data processing typically takes 1–3 h, and data analysis takes ∼30 min.
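As a rough illustration of the step described above as "statistical analysis of features between sample classes", the sketch below compares feature intensities between two classes in plain Python. It is not XCMS Online's implementation. The choice of a Welch t statistic, the feature labels, the sample names and the intensities are all assumptions made for this example, since the abstract does not specify the test used.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom) for two samples."""
    na, nb = len(a), len(b)
    va, vb = variance(a) / na, variance(b) / nb  # squared standard errors of each mean
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


def compare_classes(features, class_a, class_b):
    """Rank features by |t| between two sample classes.

    `features` maps a hypothetical feature label to a dict of
    per-sample intensities; `class_a` and `class_b` list sample names.
    """
    rows = []
    for label, intensities in features.items():
        a = [intensities[s] for s in class_a]
        b = [intensities[s] for s in class_b]
        t, df = welch_t(a, b)
        rows.append({
            "feature": label,
            "t": t,
            "df": df,
            "fold_change": mean(b) / mean(a),  # class_b relative to class_a
        })
    # Largest absolute t statistic first
    return sorted(rows, key=lambda r: abs(r["t"]), reverse=True)


# Made-up intensities, for illustration only.
features = {
    "M180T95": {"c1": 100.0, "c2": 110.0, "c3": 90.0,
                "d1": 200.0, "d2": 210.0, "d3": 190.0},
    "M132T40": {"c1": 50.0, "c2": 55.0, "c3": 45.0,
                "d1": 52.0, "d2": 48.0, "d3": 50.0},
}
ranked = compare_classes(features, ["c1", "c2", "c3"], ["d1", "d2", "d3"])
for row in ranked:
    print(row)
```

In practice a p-value would be derived from `t` and `df` and then corrected for multiple testing across all detected features. Those steps are omitted here to keep the sketch free of external dependencies.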
Acknowledgements
The authors thank the following for funding assistance: Ecosystems and Networks Integrated with Genes and Molecular Assemblies (ENIGMA), a Scientific Focus Area Program at Lawrence Berkeley National Laboratory for the US Department of Energy, Office of Science, Office of Biological and Environmental Research under contract number DE-AC02-05CH11231 (G.S.); and the National Institutes of Health (grants R01 GM114368 (G.S.) and PO1 A1043376-02S1 (G.S.)).
Author information
Contributions
E.M.F. and T.H. contributed equally to writing the manuscript. E.M.F., T.H., D.R., H.P.B., B.H. and G.S. contributed to platform development, and H.P.B., B.W. and G.S. contributed to manuscript writing.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Supplementary information
Supplementary Text and Figures
Supplementary Methods and Supplementary Table 1. (PDF 852 kb)
Supplementary Data 1
Demonstration transcriptomics data set. (ZIP 5 kb)
Supplementary Data 2
Significant protein data set. (ZIP 0 kb)
About this article
Cite this article
Forsberg, E., Huan, T., Rinehart, D. et al. Data processing, multi-omic pathway mapping, and metabolite activity analysis using XCMS Online. Nat Protoc 13, 633–651 (2018). https://doi.org/10.1038/nprot.2017.151
DOI: https://doi.org/10.1038/nprot.2017.151
- Springer Nature Limited