Abstract
Computational cell metabolism models seek to provide metabolic explanations of cell behavior under different conditions or following genetic alterations, help in the optimization of in vitro cell growth environments, or predict cellular behavior in vivo and in vitro. At the extremes, mechanistic models can include highly detailed descriptions of a small number of metabolic reactions or an approximate representation of an entire metabolic network. To date, all mechanistic models have required details of individual metabolic reactions, either kinetic parameters or metabolic fluxes, as well as information about extracellular and intracellular metabolite concentrations. Despite extensive efforts and the increasing availability of high-quality data, the required in vivo data are not available for the majority of known metabolic reactions; thus, mechanistic models are based primarily on ex vivo kinetic measurements and limited flux information. Machine learning approaches provide an alternative for the derivation of functional dependencies from existing data. The increasing availability of metabolomic and lipidomic data, with growing feature coverage and sample set size, is expected to provide the new data options needed for the derivation of machine learning models of cell metabolic processes. Moreover, machine learning analysis of longitudinal data can lead to predictive models of cell behavior over time. Conversely, machine learning models trained on steady-state data can provide descriptive models for the comparison of metabolic states in different environments or disease conditions. Additionally, inclusion of metabolic network knowledge in these analyses can further help in the development of models with limited data.
This chapter explores the application of machine learning to the modeling of cell metabolism. We first provide a theoretical explanation of several machine learning and hybrid mechanistic-machine learning methods currently being explored to model metabolism. Next, we introduce several avenues for improving these models with machine learning. Finally, we provide protocols for specific examples of the use of machine learning in the development of predictive cell metabolism models from metabolomic data. We describe data preprocessing, approaches for training both descriptive and predictive machine learning models, and the use of these models in synthetic and systems biology. The detailed protocols include a list of software tools and libraries used for these applications, step-by-step modeling procedures, troubleshooting advice, and an overview of the existing limitations of these approaches.
Acknowledgments
Work was supported in part by operating grants AI-4D-102-3 to SALB and MCC from the National Research Council AI for Design Challenge Program, RGPIN-2019-06796 to SALB from the Natural Sciences and Engineering Research Council of Canada (NSERC), as well as an NSERC CREATE Matrix Metabolomics Training grant to SALB. TTN received an NSERC CREATE Matrix Metabolomics Graduate Scholarship.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
Cuperlovic-Culf, M., Nguyen-Tran, T., Bennett, S.A.L. (2023). Machine Learning and Hybrid Methods for Metabolic Pathway Modeling. In: Selvarajoo, K. (eds) Computational Biology and Machine Learning for Metabolic Engineering and Synthetic Biology. Methods in Molecular Biology, vol 2553. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-2617-7_18
DOI: https://doi.org/10.1007/978-1-0716-2617-7_18
Publisher Name: Humana, New York, NY
Print ISBN: 978-1-0716-2616-0
Online ISBN: 978-1-0716-2617-7
eBook Packages: Springer Protocols