Prediction of Protein Phosphorylation Sites by Integrating Secondary Structure Information and Other One-Dimensional Structural Properties

Dou, Yongchao; Yao, Bo; Zhang, Chi

doi:10.1007/978-1-4939-6406-2_18

Part of the book series: Methods in Molecular Biology (MIMB, volume 1484)

Abstract

Phosphorylation is important but challenging to study, both in wet-bench experiments and computationally, and accurate non-kinase-specific prediction tools are highly desirable for whole-genome annotation across a wide variety of species. Here, we describe PhosphoSVM, a phosphorylation site prediction webserver that uses a support vector machine (SVM) to combine protein secondary structure information with seven other one-dimensional structural properties: Shannon entropy, relative entropy, predicted protein disorder, predicted solvent accessible area, amino acid overlapping properties, averaged cumulative hydrophobicity, and subsequence k-nearest neighbor profiles. In tenfold cross-validation on animal proteins, the method achieved AUC values of 0.8405, 0.8183, and 0.7383 for serine (S), threonine (T), and tyrosine (Y) phosphorylation sites, respectively. The model trained on the animal phosphorylation sites was also applied to a plant phosphorylation site dataset as an independent test, yielding AUC values of 0.7761, 0.6652, and 0.5958 for S, T, and Y sites, respectively. The algorithm, with the optimally trained model, was implemented as a webserver. The webserver, trained model, and all datasets used in the current study are available at http://sysbio.unl.edu/PhosphoSVM.
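
As a rough illustration of two of the conservation features named above (Shannon entropy and relative entropy) and of the AUC metric used to report performance, the following minimal Python sketch uses the textbook definitions only. It is not the PhosphoSVM implementation: the function names, the additive pseudocount, the uniform background distribution, and the toy alignment columns are all assumptions made for this example, and the abstract does not state how the chapter computes these quantities.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def column_frequencies(column, pseudocount=1.0):
    """Smoothed amino acid frequencies for one alignment column.

    The additive pseudocount is an assumption of this sketch; it keeps
    every frequency strictly positive so the logarithms are defined.
    """
    counts = Counter(aa for aa in column if aa in AMINO_ACIDS)
    total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
    return {aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS}


def shannon_entropy(freqs):
    """H(p) = -sum_a p(a) log2 p(a); lower values mean a more conserved column."""
    return -sum(p * math.log2(p) for p in freqs.values() if p > 0)


def relative_entropy(freqs, background):
    """Kullback-Leibler divergence D(p || q) in bits against a background q."""
    return sum(p * math.log2(p / background[aa])
               for aa, p in freqs.items() if p > 0)


def auc(scores, labels):
    """Area under the ROC curve via pairwise ranking (ties count as 0.5).

    Equals the probability that a randomly chosen positive site scores
    higher than a randomly chosen negative site.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical background: uniform over the 20 residues.
    uniform = {aa: 1 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}
    conserved = column_frequencies("SSSSSSSSST")  # toy column, mostly serine
    variable = column_frequencies("ACDEFGHIKL")   # toy column, no conservation
    print(shannon_entropy(conserved), shannon_entropy(variable))
    print(relative_entropy(conserved, uniform))
    print(auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```

The pairwise AUC is quadratic in the number of sites, which is fine for a toy example; on datasets of realistic size a sort-based computation would be the usual choice.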

Acknowledgement

This project was supported by CZ’s startup funds from the University of Nebraska, Lincoln, NE. This work was completed using the Holland Computing Center of the University of Nebraska.

Author information

Authors and Affiliations

School of Biological Sciences, University of Nebraska–Lincoln, Lincoln, NE, 68588-0118, USA
Yongchao Dou, Bo Yao & Chi Zhang

Corresponding author

Correspondence to Chi Zhang.

Editor information

Editors and Affiliations

Institute for Glycomics and School of Information and Communication Technology, Griffith University, Southport, Queensland, Australia
Yaoqi Zhou
Battelle Center for Mathematical Medicine, Nationwide Children’s Hospital, Columbus, Ohio, USA
Andrzej Kloczkowski
Department of Biochemistry and Molecular Biology, Indiana University School of Medicine, Indianapolis, Indiana, USA
Eshel Faraggi
Institute for Glycomics and School of Information and Communication Technology, Griffith University, Southport, Queensland, Australia
Yuedong Yang

Cite this protocol

Dou, Y., Yao, B., Zhang, C. (2017). Prediction of Protein Phosphorylation Sites by Integrating Secondary Structure Information and Other One-Dimensional Structural Properties. In: Zhou, Y., Kloczkowski, A., Faraggi, E., Yang, Y. (eds) Prediction of Protein Secondary Structure. Methods in Molecular Biology, vol 1484. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-6406-2_18

DOI: https://doi.org/10.1007/978-1-4939-6406-2_18
Published: 28 October 2016
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-6404-8
Online ISBN: 978-1-4939-6406-2
eBook Packages: Springer Protocols
