Abstract
Reconstructing protein-protein interaction networks is an important challenge in systems biology. Computational approaches can address this problem by complementing high-throughput technologies and by guiding biologists in designing new laboratory experiments. Proteins and the interactions between them form a network, which has been shown to possess several topological properties. In addition to information about proteins and their interactions, knowledge of these topological properties can be used to learn accurate models for predicting unknown protein-protein interactions. This paper presents a principled way, based on Bayesian inference, of combining network topology information with information about proteins and the interactions between them, with the goal of building accurate models for predicting protein-protein interactions. We define a random graph model for generating networks with a topology similar to that observed in protein-protein interaction networks. We define a probability model for protein features given the presence or absence of an interaction and combine it with the random graph model using Bayes’ rule, arriving at a model that incorporates both topological and feature information.
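The combination via Bayes' rule described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the random graph model or the feature model, so everything concrete here is an assumption made for illustration: the topological prior is taken to be an expected-degree (Chung-Lu style) edge probability, the feature model is a single Gaussian feature per protein pair, and all function names and parameter values are hypothetical.

```python
import math


def prior_edge_prob(deg_i, deg_j, total_degree):
    """Assumed topological prior: expected-degree (Chung-Lu style) edge
    probability, P(edge) = min(1, d_i * d_j / sum of all degrees)."""
    return min(1.0, deg_i * deg_j / total_degree)


def gaussian_pdf(x, mean, std):
    """Density of a univariate Gaussian, used as a toy feature likelihood."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def posterior_edge_prob(x, deg_i, deg_j, total_degree,
                        pos=(1.0, 1.0), neg=(0.0, 1.0)):
    """Bayes' rule: P(edge | x) is proportional to p(x | edge) * P(edge).

    `pos` and `neg` are hypothetical (mean, std) pairs for the feature
    distribution of interacting and non-interacting pairs.
    """
    prior = prior_edge_prob(deg_i, deg_j, total_degree)
    like_pos = gaussian_pdf(x, *pos)
    like_neg = gaussian_pdf(x, *neg)
    numerator = like_pos * prior
    denominator = numerator + like_neg * (1.0 - prior)
    if denominator == 0.0:
        # Both likelihoods underflowed; fall back to the topological prior.
        return prior
    return numerator / denominator


if __name__ == "__main__":
    # Same feature value, different degrees: the hub pair gets a higher score.
    print(posterior_edge_prob(0.8, deg_i=3, deg_j=4, total_degree=100))
    print(posterior_edge_prob(0.8, deg_i=10, deg_j=8, total_degree=100))
```

In this sketch, a feature value that is equally likely under both classes leaves the posterior equal to the topological prior, while informative features shift it up or down; this is only one way the two sources of information could be combined.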
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Birlutiu, A., Heskes, T. (2014). Using Topology Information for Protein-Protein Interaction Prediction. In: Comin, M., Käll, L., Marchiori, E., Ngom, A., Rajapakse, J. (eds) Pattern Recognition in Bioinformatics. PRIB 2014. Lecture Notes in Computer Science, vol 8626. Springer, Cham. https://doi.org/10.1007/978-3-319-09192-1_2
DOI: https://doi.org/10.1007/978-3-319-09192-1_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-09191-4
Online ISBN: 978-3-319-09192-1
eBook Packages: Computer Science, Computer Science (R0)