De Novo Molecular Formula Annotation and Structure Elucidation Using SIRIUS 4

Ludwig, Marcus; Fleischauer, Markus; Dührkop, Kai; Hoffmann, Martin A.; Böcker, Sebastian

doi:10.1007/978-1-0716-0239-3_11

Part of the book series: Methods in Molecular Biology (MIMB, volume 2104)

Abstract

SIRIUS 4 is the best-in-class computational tool for metabolite identification from high-resolution tandem mass spectrometry data. It offers de novo molecular formula annotation with outstanding accuracy. When searching fragmentation spectra in a structure database, it reaches over 70% correct identifications. A predicted fingerprint, which indicates the presence or absence of thousands of molecular properties, helps to deduce information about the compound of interest even if it is not contained in any structure database. Here, we present best practices and describe how to leverage the full potential of SIRIUS 4, how to incorporate it into your own workflow, and how it adds value to the analysis of mass spectrometry data beyond spectral library search.

Author information

Authors and Affiliations

Chair for Bioinformatics, Friedrich-Schiller University, Jena, Germany
Marcus Ludwig, Markus Fleischauer, Kai Dührkop, Martin A. Hoffmann & Sebastian Böcker
International Max Planck Research School “Exploration of Ecological Interactions with Molecular and Chemical Techniques”, Max Planck Institute for Chemical Ecology, Jena, Germany
Martin A. Hoffmann

Corresponding author

Correspondence to Sebastian Böcker.

Editor information

Editors and Affiliations

Department of Medicine, Emory University School of Medicine, Atlanta, GA, USA
Shuzhao Li

About this protocol

Cite this protocol

Ludwig, M., Fleischauer, M., Dührkop, K., Hoffmann, M.A., Böcker, S. (2020). De Novo Molecular Formula Annotation and Structure Elucidation Using SIRIUS 4. In: Li, S. (eds) Computational Methods and Data Analysis for Metabolomics. Methods in Molecular Biology, vol 2104. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-0239-3_11

DOI: https://doi.org/10.1007/978-1-0716-0239-3_11
Published: 18 January 2020
Publisher Name: Humana, New York, NY
Print ISBN: 978-1-0716-0238-6
Online ISBN: 978-1-0716-0239-3
eBook Packages: Springer Protocols
