Abstract
Sequence-based prediction of secondary and supersecondary structures attracts strong interest and finds applications in numerous areas related to the characterization and prediction of protein structure and function. Substantial efforts in these areas over the last three decades have resulted in accurate predictors that take advantage of modern machine learning models and the availability of evolutionary information extracted from multiple sequence alignments. In this chapter, we first motivate both prediction areas and introduce basic concepts related to the annotation and prediction of secondary and supersecondary structures, focusing on the β hairpin, coiled coil, and α-turn-α motifs. Next, we review state-of-the-art prediction methods and provide details for 12 modern secondary structure predictors and 4 representative supersecondary structure predictors. Finally, we provide several practical notes for users of these prediction tools.
Acknowledgment
This work was supported by the Alberta Ingenuity and Alberta Innovates Graduate Student Scholarship to KC and the NSERC Discovery grant to LK.
Copyright information
© 2012 Springer Science+Business Media New York
About this protocol
Cite this protocol
Chen, K., Kurgan, L. (2012). Computational Prediction of Secondary and Supersecondary Structures. In: Kister, A. (eds) Protein Supersecondary Structures. Methods in Molecular Biology, vol 932. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-62703-065-6_5
DOI: https://doi.org/10.1007/978-1-62703-065-6_5
Publisher Name: Humana Press, Totowa, NJ
Print ISBN: 978-1-62703-064-9
Online ISBN: 978-1-62703-065-6
eBook Packages: Springer Protocols