Abstract
Predicting protein functions with computational techniques is a major research area in bioinformatics. Cast as a classification problem, the task is challenging because the classes (functions) to be predicted are hierarchically related and a protein can have more than one function. One approach is to produce a set of local classifiers, each responsible for discriminating between a subset of the classes at a certain level of the hierarchy. In this paper we tackle the hierarchical classification problem in a local fashion, by learning an ensemble of Bayesian network classifiers for each class in the hierarchy and combining their outputs with four alternative methods: a) selecting the best classifier, b) majority voting, c) weighted voting, and d) constructing a meta-classifier. The ensemble is built using ABC-Miner, our recently introduced Ant-based Bayesian Classification algorithm. We use different types of protein representations to learn different classification models. We empirically evaluate our proposed methods on an ageing-related protein dataset created for this research.
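The four combination methods named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes each base classifier in a class's ensemble emits a boolean vote ("protein has this function" or not), that a validation accuracy per classifier is available to rank and weight the votes, and that the meta-classifier is a simple lookup table over vote patterns. All function and class names here are hypothetical.

```python
from collections import Counter, defaultdict


def select_best(votes, val_acc):
    """(a) Use only the vote of the classifier with the highest validation accuracy."""
    best = max(range(len(votes)), key=lambda i: val_acc[i])
    return votes[best]


def majority_vote(votes):
    """(b) Return the most common vote (ties go to the label seen first)."""
    return Counter(votes).most_common(1)[0][0]


def weighted_vote(votes, weights):
    """(c) Add each classifier's weight to the label it voted for; heaviest label wins."""
    totals = defaultdict(float)
    for vote, weight in zip(votes, weights):
        totals[vote] += weight
    return max(totals, key=totals.get)


class TableMetaClassifier:
    """(d) Toy meta-classifier (an assumption, not the paper's model).

    It memorises, for every vote pattern seen in held-out data, the most
    frequent true label, and falls back to majority voting otherwise.
    """

    def fit(self, vote_rows, labels):
        counts = defaultdict(Counter)
        for row, label in zip(vote_rows, labels):
            counts[tuple(row)][label] += 1
        self.table = {k: c.most_common(1)[0][0] for k, c in counts.items()}
        return self

    def predict(self, votes):
        return self.table.get(tuple(votes), majority_vote(votes))


# Three base classifiers (e.g. one per protein representation) vote on one class.
votes = [True, False, False]
val_acc = [0.9, 0.4, 0.4]  # hypothetical validation accuracies

best_pred = select_best(votes, val_acc)
majority_pred = majority_vote(votes)
weighted_pred = weighted_vote(votes, val_acc)

meta = TableMetaClassifier().fit(
    [(True, False, False), (True, False, False), (True, True, False)],
    [True, True, False],
)
meta_pred = meta.predict(votes)
unseen_pred = meta.predict([False, False, False])
```

In this made-up example the methods disagree: majority voting follows the two weaker classifiers, while best-classifier selection and accuracy-weighted voting follow the single stronger one. In a local hierarchical setting, a combiner of this kind would be applied once per class in the hierarchy.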
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Salama, K.M., Freitas, A.A. (2013). ACO-Based Bayesian Network Ensembles for the Hierarchical Classification of Ageing-Related Proteins. In: Vanneschi, L., Bush, W.S., Giacobini, M. (eds) Evolutionary Computation, Machine Learning and Data Mining in Bioinformatics. EvoBIO 2013. Lecture Notes in Computer Science, vol 7833. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37189-9_8
DOI: https://doi.org/10.1007/978-3-642-37189-9_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37188-2
Online ISBN: 978-3-642-37189-9
eBook Packages: Computer Science, Computer Science (R0)