Abstract
Protein-protein interactions (PPIs) are crucial for almost all cellular processes, including metabolic cycles, DNA transcription and replication, and signaling cascades. However, experimental methods for identifying PPIs are both time-consuming and expensive, so it is important to develop computational approaches for predicting PPIs. In this article, a sequence-based method is developed by combining a novel feature representation based on binary coding with a Support Vector Machine (SVM). The binary-coding-based descriptors account for interactions between residues that lie a certain distance apart in the protein sequence. The method therefore takes the neighboring effect into account and mines interaction information from continuous and discontinuous amino acid segments at the same time. When applied to the PPI data of Saccharomyces cerevisiae, the proposed method achieved 86.93% prediction accuracy with 86.99% sensitivity at a precision of 86.90%. Extensive experiments are performed to compare our method with the existing sequence-based method. The results show that the proposed approach is very promising for predicting PPIs, so it can be a useful supplementary tool for future proteomics studies.
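The three figures reported above (accuracy, sensitivity, precision) are standard binary-classification metrics. As a minimal sketch, assuming interacting pairs are labeled 1 and non-interacting pairs 0, they can be computed from the confusion counts as follows. The names `ConfusionCounts` and `confusion_counts` and the toy labels are illustrative assumptions, not taken from the paper.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for a binary classifier (hypothetical helper, not from the paper)."""
    tp: int  # interacting pairs predicted as interacting
    tn: int  # non-interacting pairs predicted as non-interacting
    fp: int  # non-interacting pairs predicted as interacting
    fn: int  # interacting pairs predicted as non-interacting


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> ConfusionCounts:
    """Tally the confusion counts; labels are assumed to be 1 (interacting) or 0."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return ConfusionCounts(tp, tn, fp, fn)


def accuracy(c: ConfusionCounts) -> float:
    """Fraction of all pairs classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.tn + c.fp + c.fn)


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of truly interacting pairs that were recovered (recall)."""
    return c.tp / (c.tp + c.fn)


def precision(c: ConfusionCounts) -> float:
    """Fraction of predicted interactions that are truly interacting."""
    return c.tp / (c.tp + c.fp)


# Toy labels for illustration only; these are not data from the paper.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
counts = confusion_counts(y_true, y_pred)
```

Calling `accuracy(counts)`, `sensitivity(counts)`, and `precision(counts)` on the toy labels yields the three metrics for that example. On real data, `y_pred` would come from the trained SVM.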
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
You, Z., Ming, Z., Niu, B., Deng, S., Zhu, Z. (2013). A SVM-Based System for Predicting Protein-Protein Interactions Using a Novel Representation of Protein Sequences. In: Huang, DS., Bevilacqua, V., Figueroa, J.C., Premaratne, P. (eds) Intelligent Computing Theories. ICIC 2013. Lecture Notes in Computer Science, vol 7995. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39479-9_73
DOI: https://doi.org/10.1007/978-3-642-39479-9_73
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-39478-2
Online ISBN: 978-3-642-39479-9
eBook Packages: Computer Science, Computer Science (R0)