SySAP: a system-level predictor of deleterious single amino acid polymorphisms

Huang, Tao; Wang, Chuan; Zhang, Guoqing; Xie, Lu; Li, Yixue

doi:10.1007/s13238-011-1130-2

Communication
Published: 19 December 2011

Volume 3, pages 38–43 (2012)
Protein & Cell

Tao Huang^1,2, Chuan Wang^1, Guoqing Zhang^1, Lu Xie^2 & Yixue Li^1,2

Abstract

Single amino acid polymorphisms (SAPs), also known as non-synonymous single nucleotide polymorphisms (nsSNPs), are responsible for most human genetic diseases. Discriminating deleterious SAPs from neutral ones can help identify disease genes and clarify disease mechanisms. In this work, we established a system-level method for predicting deleterious SAPs. Unlike most existing methods, it considers not only sequence and structure information but also network information. Integrating network information improves the performance of deleterious SAP prediction. To make the method publicly available, we developed SySAP (a System-level predictor of deleterious Single Amino acid Polymorphisms), an easy-to-use and highly accurate web server. SySAP is freely available at http://www.biosino.org/SySAP/ and http://lifecenter.sgst.cn/SySAP/.
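The idea of combining sequence, structure, and network information can be illustrated with a minimal Python sketch. The abstract does not state which features SySAP actually uses, so everything here is an assumption made for illustration: the toy interaction edges, the choice of normalized degree centrality as the network feature, the placeholder sequence and structure values, and the helper names `degree_centrality` and `build_feature_vector`.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Normalized degree of each node: number of neighbours / (n - 1).

    Hypothetical network feature; not necessarily what SySAP computes.
    """
    neighbours = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-loops
            neighbours[a].add(b)
            neighbours[b].add(a)
    n = len(neighbours)
    if n < 2:
        return {node: 0.0 for node in neighbours}
    return {node: len(nbrs) / (n - 1) for node, nbrs in neighbours.items()}


def build_feature_vector(sequence_feats, structure_feats, network_feats):
    """Concatenate the three feature groups into one vector for a classifier."""
    return list(sequence_feats) + list(structure_feats) + list(network_feats)


# Toy protein-protein interaction network (made-up identifiers).
edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3")]
centrality = degree_centrality(edges)

# Placeholder values standing in for real sequence and structure descriptors.
vector = build_feature_vector(
    sequence_feats=[0.8, -1.2],
    structure_feats=[0.35],
    network_feats=[centrality["P1"]],
)
print(vector)
```

A vector built this way could be passed to any standard classifier; the point of the sketch is only that the network-derived value sits alongside the sequence and structure descriptors of the protein carrying the SAP.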

Author information

Authors and Affiliations

Key Laboratory of Systems Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, 200031, China
Tao Huang, Chuan Wang, Guoqing Zhang & Yixue Li
Shanghai Center for Bioinformation Technology, Shanghai, 200235, China
Tao Huang, Lu Xie & Yixue Li

Corresponding authors

Correspondence to Lu Xie or Yixue Li.

About this article

Cite this article

Huang, T., Wang, C., Zhang, G. et al. SySAP: a system-level predictor of deleterious single amino acid polymorphisms. Protein Cell 3, 38–43 (2012). https://doi.org/10.1007/s13238-011-1130-2

Received: 04 November 2011
Accepted: 14 November 2011
Published: 19 December 2011
Issue Date: January 2012
DOI: https://doi.org/10.1007/s13238-011-1130-2
