Abstract
In this paper we survey work being conducted at Imperial College on the use of machine learning to build Systems Biology models of the effects of toxins on biochemical pathways. Several distinct and complementary modelling techniques are being explored. Firstly, work is being conducted on applying Support-Vector Inductive Logic Programming (SVILP) as an accurate means of screening high-toxicity molecules. Secondly, Bayes’ networks have been machine-learned to provide causal maps of the effects of toxins on the network of metabolic reactions within cells. The data were derived from a study on the effects of hydrazine toxicity in rats. Although the resultant network can be partly explained in terms of existing KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway descriptions, several of the strong dependencies in the Bayes’ network involve metabolite pairs with high separation in KEGG. Thirdly, in a complementary study, KEGG pathways are being used as background knowledge for explaining the same data with a model constructed using Abductive ILP, a logic-based machine learning technique. With a binary prediction model (up/down regulation), cross-validation results show that, even with a restricted number of observed metabolites, high predictive accuracy (80-90%) is achieved on unseen metabolite concentrations. Further increases in accuracy are achieved by allowing discovery of general rules from additional literature data on hydrazine inhibition. Ongoing work is aimed at formulating probabilistic logic models which combine the learned Bayes’ network and ILP models.
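The evaluation protocol mentioned above (cross-validated accuracy of a binary up/down prediction on held-out metabolites) can be illustrated with a minimal sketch. This is not the paper's Abductive ILP model: the learner below is a trivial majority-class baseline, and the metabolite names, labels, fold count, and all function names are hypothetical, chosen only to show how such an accuracy figure is computed.

```python
from collections import Counter


def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def majority_learner(train_labels):
    """Toy stand-in for a real model: always predict the most common training label."""
    most_common = Counter(train_labels).most_common(1)[0][0]
    return lambda _example: most_common


def cross_validated_accuracy(examples, labels, k, learner):
    """Train on k-1 folds, test on the held-out fold, and pool the results."""
    correct = 0
    for test_idx in k_fold_indices(len(examples), k):
        held_out = set(test_idx)
        train_labels = [labels[i] for i in range(len(labels)) if i not in held_out]
        predict = learner(train_labels)
        correct += sum(predict(examples[i]) == labels[i] for i in test_idx)
    return correct / len(examples)


# Hypothetical toy data: ten metabolites with made-up up/down labels.
metabolites = [f"m{i}" for i in range(1, 11)]
directions = ["up", "up", "down", "up", "down", "up", "down", "up", "up", "up"]

accuracy = cross_validated_accuracy(metabolites, directions, 5, majority_learner)
print(f"cross-validated accuracy: {accuracy:.2f}")
```

The number printed here reflects only the toy labels and says nothing about the results reported in the paper. Swapping `majority_learner` for any function that takes training labels and returns a predictor would reuse the same evaluation loop.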
Keywords
- Nuclear Magnetic Resonance
- Background Knowledge
- Nuclear Magnetic Resonance Data
- Nuclear Magnetic Resonance Analysis
- Inductive Logic Programming
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Muggleton, S.H. (2005). Machine Learning for Systems Biology. In: Kramer, S., Pfahringer, B. (eds) Inductive Logic Programming. ILP 2005. Lecture Notes in Computer Science, vol 3625. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11536314_27
DOI: https://doi.org/10.1007/11536314_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28177-1
Online ISBN: 978-3-540-31851-4
eBook Packages: Computer Science (R0)