Abstract
Predicting non-classical secreted proteins is an important problem in drug discovery and in the development of disease diagnostics. Non-classical secreted proteins are leaderless: they lack an N-terminal signal peptide. This makes them more difficult to predict than classical secreted proteins. We identify a set of informative physicochemical properties from amino acid indices and combine them with a support vector machine (SVM) to discriminate between secreted and non-secreted proteins and to predict non-classical secreted proteins. With the sequence identity of the dataset reduced to 25%, the prediction accuracy on the training dataset was 85%, much higher than that of the traditional sequence similarity-based tools BLAST and PSI-BLAST. The accuracy on an independent test set was 82%. The most effective features reveal fundamental differences in physicochemical properties between secreted and non-secreted proteins. This interpretable information could benefit drug discovery and the development of new blood biochemical tests.
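As a rough illustration of the kind of feature encoding the abstract describes, the sketch below turns a protein sequence into a fixed-length vector by averaging each physicochemical property over its residues. Per-sequence averaging is an assumption, since the abstract does not say how the amino acid indices are applied. The Kyte-Doolittle hydropathy scale and a simplified charge scale stand in for the selected indices, and the names `mean_property` and `encode` are hypothetical. Vectors of this form could then be passed to an SVM, which is not shown here.

```python
# Illustrative sketch only: averaging each property over the sequence is an
# assumption, and these two scales stand in for the selected amino acid indices.

# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Simplified net charge at neutral pH (histidine treated as neutral).
CHARGE = {aa: 0.0 for aa in HYDROPATHY}
CHARGE.update({"K": 1.0, "R": 1.0, "D": -1.0, "E": -1.0})


def mean_property(sequence, index):
    """Average one property over the residues that the index covers."""
    values = [index[aa] for aa in sequence.upper() if aa in index]
    if not values:
        raise ValueError("sequence contains no residues covered by the index")
    return sum(values) / len(values)


def encode(sequence, indices=(HYDROPATHY, CHARGE)):
    """Return one averaged value per property, in the order given."""
    return [mean_property(sequence, index) for index in indices]


if __name__ == "__main__":
    # Hypothetical peptide, used only to show the call.
    print(encode("MKTLLVAAGE"))
```

Because every sequence maps to a vector of the same length regardless of how many residues it has, proteins of different sizes can be compared directly by a classifier, and each dimension remains interpretable as a single property.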
Cite this article
Hung, CH., Huang, HL., Hsu, KT. et al. Prediction of non-classical secreted proteins using informative physicochemical properties. Interdiscip Sci Comput Life Sci 2, 263–270 (2010). https://doi.org/10.1007/s12539-010-0023-z
DOI: https://doi.org/10.1007/s12539-010-0023-z