Abstract
Detecting protein complexes is crucial to understand principles of cellular organization. Plenty evidences have indicated that sub-graphs with high density in protein-protein interaction (PPI) network, especially dynamic PPI network (DPIN), usually correspond to protein complexes. As a well-known density-based clustering algorithm, Density-Based Spatial Clustering of Applications with Noise (DBSCAN) has been used in many areas due to its simplicity and the ability to detect clusters of different sizes and shapes. However, one of its limitations is that the performance of DBSCAN depends on two specified parameters ε and MinPts, where ε represents the maximum radius of a neighborhood from an observing point while MinPts means the minimum number of data points contained in such a neighborhood. In this article, we develop a new method named as P-DBSCAN to detect protein complexes in DPIN by using Pigeon-Inspired Optimization (PIO) Algorithm to optimize the parameters ε and MinPts in DBSCAN. The experiments on DIP and MIPS datasets show that P-DBSCAN outperforms the state-of-the-art methods for protein complex detection in terms of several criteria such as precision, recall and f-measure.
Article PDF
Similar content being viewed by others
Avoid common mistakes on your manuscript.
References
Uetz P, Giot L, Cagney G, et al. A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae. Nature, 2000, 403: 623–627
Zhu H, Bilgin M, Bangham R, et al. Global analysis of protein activities using proteome chips. Science, 2001, 293: 2101–2105
Xenarios I, Salwnski L, Duan X J, et al. DIP, the Database of Interacting Proteins: a research tool for studying cellular networks of protein interactions. Nucl Acids Res, 2002, 30: 303–305
Güldener U, Münsterkötter M, Kastenmller G, et al. CYGD: the comprehensive yeast genome database. Nucl Acids Res, 2005, 33: 364–368
Cherry J M. SGD: Saccharomyces Genome Database. Nucl Acids Res, 1998, 26: 73–79
Montanez G, Cho Y R. Predicting false positives of protein-protein interaction data by semantic similarity measures. Curr Bioinform, 2013, 8: 339–346
Li M, Zheng R, Zhang H, et al. Effective identification of essential proteins based on priori knowledge, network topology and gene expressions. Methods, 2014, 67: 325–333
Watts D J, Strogatz S H. Collective dynamics of ‘small-world’ networks. Nature, 1998, 393: 440–442
Antonio S, Paul O M. Small-world network approach to identify key residues in protein-protein interaction. Proteins, 2005, 58: 672–682
Rives A W, Galitski T. Modular organization of cellular networks. Proc Nat Acad Sci USA, 2003, 100: 1128–1133
Palla G, Dernyi I, Farkas I J, et al. Uncoverring the overlapping community structure of complex networks in nature and society. Nature, 2005, 435: 814–818
Adamcsek B, Palla G, Farkas I, et al. CFinder: locating cliques and overlapping modules in biological networks. Bioinformatics, 2006, 22: 1021–1023
Altaf-Ul-Amin M, Shinbo Y, Mihara K, et al. Development and implementation of an algorithm for detection of protein complexes in large interaction networks. BMC Bioinform, 2006, 7: 207–228
Li M, Chen J, Wang J, et al. Modifying the DPClus algorithm for identifying protein complexes based on new topological structures. BMC Bioinform, 2008, 9: 398–413
Peng J, Mona S. SPICi: a fast clustering algorithm for large biological networks. Bioinformatics, 2010, 26: 1105–1111
Liu G, Wong L, Chua H N. Complex discovery from weighted PPI networks. Bioinformatics, 2009, 25: 1891–1897
Leung H C M, Xiang Q, Yiu S M, et al. Predicting protein complexes from PPI data: a core-attachment approach. J Comput Biol, 2009, 16: 133–144
Wang J X, Liu B B, Li M, et al. Identifying protein complexes from interaction networks based on clique percolation and distance restriction. BMC Genom, 2010, 11: S10–S24
Ester M, Kriegel H P, Sander J, et al. A density-based algorithm for discovering clusters in large spatial databases with noise. In: Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining, Portland, 1996. 226–231
Duan H B, Qiao P X. Pigeon-inspired optimization: a new swarm intelligence optimizer for air robot path planning. Int J Intell Comput Cybern, 2014, 7: 24–37
Lei X J, Wu S, Ge L, et al. Clustering and overlapping modules detection in PPI network based on IBFO. Proteomics, 2013, 13: 278–290
Lei X J, Tian J F, Ge L, et al. The clustering model and algorithm of PPI network based on propagating mechanism of artificial bee colony. Inform Sci, 2013, 247: 21–39
Lv Q, Wu H J, Wu J Z, et al. A parallel ant colonies approach to de novo prediction of protein backbone in CASP8/9. Sci China Inf Sci, 2013, 56: 108103
Lei X J, Wang F, Wu F X, et al. Protein complex identification through Markov clustering with firefly algorithm on dynamic proteinCprotein interaction networks. Inf Sci, 2016, 329: 303–316
Lei X J, Ying C, Wu F X, et al. Clustering PPI data by combining FA and SHC method. BMC Genom, 2015, 16: S3–S12
Zhao J, Zhou R. Pigeon-inspired optimization applied to constrained gliding trajectories. Nonlinear Dyn, 2015, 82: 1781–1795
Li C, Duan H B. Target detection approach for UAVs via improved Pigeon-inspired Optimization and Edge Potential Function. Aerosp Sci Technol, 2014, 39: 352–360
Sun H, Duan H B. PID controller design based on Prey-Predator Pigeon-Inspired Optimization algorithm. In: Proceedings of the International Conference on Mechatronics and Automation, Tianjin, 2014. 1416–1421
Wang J X, Li M, Chen J, et al. A fast hierarchical clustering algorithm for functional modules discovery in protein interaction networks. IEEE/ACM Trans Comput Biol Bioinform. 2011, 8: 607–620
van Dongen S. Graph clustering by flow simulation. Dissertation for Doctoral Degree. Center for Math and Computer Science (CWI), University of Utrecht. 2000
King A D, Przulj N, Jurisica I. Protein complex prediction via cost-based clustering. Bioinformatics, 2004, 20: 3013–3020
Zhang A D. Protein interaction networks. New York: Cambridge University Press, 2009
Radicchi F, Castellano C, Cecconi F, et al. Defining and identifying communities in networks. Proc Nat Acad Sci USA, 2004, 101: 2658–2663
Washburn M P, Wolters D, Yates J R. Large-scale analysis of the yeast proteome by multidimensional protein identification technology. Nat Biotechnol, 2001, 19: 242–247
Cho Y R, Hwang H, Ramanathan M, et al. Semantic integration to identify overlapping functional modules in protein interaction networks. BMC Bioinform, 2007, 8: 265–277
Wang J X, Peng X Q, Li M, et al. Construction and application of dynamic protein interaction network based on time course gene expression data. Proteomics, 2013, 13: 301–312
Tu B P, Kudlicki A, Rowicka M, et al. Logic of the yeast metabolic cycle: temporal compartmentalization of cellular processes. Science, 2005, 310: 1152–1158
Pu S, Wong J, Turner B, et al. Up-to-date catalogues of yeast protein complexes. Nucl Acids Res 2009, 37: 825–831
Mewes H W, Amid C, Arnold R, et al. MIPS: analysis and annotation of proteins from whole genomes. Nucl Acids Res, 2004, 32: 41–44
Tang Y, Li M, Wang J X. CytoNCA: a cytoscape plugin for centrality analysis and evaluation of protein interaction networks. Biosystems, 2015, 127: 67–72
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Lei, X., Ding, Y. & Wu, FX. Detecting protein complexes from DPINs by density based clustering with Pigeon-Inspired Optimization Algorithm. Sci. China Inf. Sci. 59, 070103 (2016). https://doi.org/10.1007/s11432-016-5578-9
Received:
Accepted:
Published:
DOI: https://doi.org/10.1007/s11432-016-5578-9