Abstract
The protein energy landscape, which lifts the protein structure space by associating energies with structures, has been useful in improving our understanding of the relationship between structure, dynamics, and function. Currently, however, it is challenging to automatically extract and utilize the underlying organization of an energy landscape to the link structural states it houses to biological activity. In this chapter, we first report on two computational approaches that extract such an organization, one that ignores energies and operates directly in the structure space and another that operates on the energy landscape associated with the structure space. We then describe two complementary approaches, one based on unsupervised learning and another based on supervised learning. Both approaches utilize the extracted organization to address the problem of decoy selection in template-free protein structure prediction. The presented results make the case that learning organizations of protein energy landscapes advances our ability to link structures to biological activity.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
References
Boehr DD, Wright PE (2008) How do proteins interact? Science 320(5882):1429–1430
Maximova T, Moffatt R, Ma B, Nussinov R, Shehu A (2016) Principles and overview of sampling methods for modeling macromolecular structure and dynamics. PLoS Comp Biol 12(4):e1004619
Leaver-Fay A et al (2011) ROSETTA3: an object-oriented software suite for the simulation and design of macromolecules. Methods Enzymol 487:545–574
Xu D, Zhang Y (2012) Ab initio protein structure assembly using continuous structure fragments and optimized knowledge-based force field. Proteins: Struct Funct Bioinf 80(7):1715–1735. https://doi.org/10.1002/prot.24065
Olson B, Shehu A (2013) Multi-objective stochastic search for sampling local minima in the protein energy surface. In: ACM conference on bioinformatics, computational biology (BCB), Washington, DC, pp 430–439
Clausen R, Shehu A (2014) A multiscale hybrid evolutionary algorithm to obtain sample-based representations of multi-basin protein energy landscapes. In: ACM conference on bioinformatics, computational biology (BCB), Newport Beach, CA, pp 269–278
Shehu A, Plaku E (2016) A survey of computational treatments of biomolecules by robotics-inspired methods modeling equilibrium structure and dynamics. J Artif Intell Res 597:509–572
Shehu A, Clementi C, Kavraki LE (2007) Sampling conformation space to model equilibrium fluctuations in proteins. Algorithmica 48(4):303–327
Okazaki K, Koga N, Takada S, Onuchic JN, Wolynes PG (2006) Multiple-basin energy landscapes for large-amplitude conformational motions of proteins: structure-based molecular dynamics simulations. Proc Natl Acad Sci U S A 103(32):11844–11849
Boehr DD, Nussinov R, Wright PE (2009) The role of dynamic conformational ensembles in biomolecular recognition. Nat Chem Biol 5(11):789–796
Nussinov R, Wolynes PG (2014) A second molecular biology revolution? The energy landscapes of biomolecular function. Phys Chem Chem Phys 16(14):6321–6322
Frauenfelder H, Sligar SG, Wolynes PG (1991) The energy landscapes and motion on proteins. Science 254(5038):1598–1603
Bryngelson JD, Onuchic JN, Socci ND, Wolynes PG (1995) Funnels, pathways, and the energy landscape of protein folding: a synthesis. Proteins Struct Funct Genet 21(3):167–195
Shehu A (2015) A review of evolutionary algorithms for computing functional conformations of protein molecules. In: Zhang W (ed) Computer-aided drug discovery, Springer methods in pharmacology and toxicology series
Samoilenko S (2008) Fitness landscapes of complex systems: insights and implications on managing a conflict environment of organizations. Complex Organ 10(4):38–45
Kryshtafovych A, Fidelis K, Tramontano A (2011) Evaluation of model quality predictions in CASP9. Proteins 79(Suppl 10):91–106
Kryshtafovych A, Barbato A, Fidelis K, Monastyrskyy B, Schwede T, Tramon- tano A (2014) Assessment of the assessment: evaluation of the model quality estimates in CASP10. Proteins 82(Suppl 2):112–126
Moult J, Fidelis K, Kryshtafovych A, Schwede T, Tramontano A (2014) Critical assessment of methods of protein structure prediction (CASP)—round X. Proteins: Struct Funct Bioinf 82:109–115
Moult J, Fidelis K, Kryshtafovych A, Schwede T, Tramontano A (2018) Critical assessment of methods of protein structure prediction (CASP)—round XII. Proteins 86(Suppl 1):7–15. https://doi.org/10.1002/prot.25415
Uziela K, Wallner B (2016) Proq2: estimation of model accuracy implemented in rosetta. Bioinformatics 32(9):1411–1413
Liu T, Wang Y, Eickholt J, Wang Z (2016) Benchmarking deep networks for predicting residue-specific quality of individual protein models in casp11. Sci Rep 6(19):301
Ginalski K, Elofsson A, Fischer D, Rychlewski L (2003) 3D-Jury: a simple approach to improve protein structure predictions. Bioinformatics 19(8):1015–1018
Wallner B, Elofsson A (2006) Identification of correct regions in protein models using structural, alignment, and consensus information. Protein Sci 15(4):900–913
Lorenzen S, Zhang Y (2007) Identification of near-native structures by clustering protein docking conformations. Proteins 68(1):187–194
Zhang Y, Skolnick J (2004) Spicker: a clustering approach to identify near-native protein folds. J Comput Chem 25(6):865–871
Molloy K, Saleh S, Shehu A (2013) Probabilistic search and energy guidance for biased decoy sampling in ab-initio protein structure prediction. IEEE/ACM Trans Bioinf Comput Biol 10(5):1162–1175
Shehu A (2013) Probabilistic search and optimization for protein energy land- scapes. In: Aluru S, Singh A (eds) Handbook of computational molecular biology, Chapman & Hall/CRC Computer & Information Science SeriesBoca Raton
Guan W, Ozakin A, Gray A, et al (2011) Learning protein folding energy functions. In: International conference data mining. IEEE, pp 1062–1067
Jing X, Wang K, Lu R, Dong Q (2016) Sorting protein decoys by machine-learning-to-rank. Sci Rep 6(31):571
He Z, Alazmi M, Zhang J, Xu D (2013) Protein structural model selection by combining consensus and single scoring methods. PLoS One 8(9):e74006
Pawlowski M, Kozlowski L, Kloczkowski A (2016) Mqapsingle: a quasi single-model approach for estimation of the quality of individual protein structure models. Proteins 84(8):1021–1028
Cao R, Wang Z, Wang Y, Cheng J (2014) Smoq: a tool for predicting the absolute residue-specific quality of a single protein model with support vector machines. BMC Bioinform 15(1):120
Nguyen SP, Shang Y, Xu D (2014) Dl-pro: a novel deep learning method for protein model quality assessment. In: International conference on neural networks (IJCNN). IEEE, pp 2071–2078
Manavalan B, Lee J, Lee J (2014) Random forest-based protein model quality assessment (rfmqa) using structural features and potential energy terms. PLoS One 9(9):e106542
Chatterjee S, Ghosh S, Vishveshwara S (2013) Network properties of decoys and casp predicted models: a comparison with native protein structures. Mol BioSyst 9(7):1774–1788
Mirzaei S, Sidi T, Keasar C, Crivelli S (2016) Purely structural protein scoring functions using support vector machine and ensemble learning. In: IEEE/ACM transactions on computational biology and bioinformatics
Kabsch W (1976) A solution for the best rotation to relate two sets of vectors. Acta Cryst A32:922–923
Yang Z, Algesheimer R, Tessone CJ (2016) A comparative analysis of community detection algorithms on artificial networks. Sci Rep 6(30):750
Cazals F, Dreyfus T (2017) The structural bioinformatics library: modeling in biomolecular science and beyond. Bioinformatics 33(7):997–1004
Zhou ZH (2012) Ensemble methods: foundations and algorithms. CRC Press, Boca Raton
Friedman JH (2001) Greedy function approximation: a gradient boosting machine. Ann Stat 29(5):1189–1232
Chen T, Guestrin C (2016) Xgboost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 785–794
Tomek I (1976) Two modifications of CNN. IEEE Trans Syst Man Cybernet 6:769–772
Akhter N, Shehu A (2017) From extraction of local structures of protein energy landscapes to improved decoy selection in template-free protein structure prediction. Molecules 23(1):216
Berman HM, Henrick K, Nakamura H (2003) Announcing the worldwide Protein Data Bank. Nat Struct Biol 10(12):980–980
Yang J, Leskovec J (2012) Defining and evaluating network communities based on ground-truth. In: International conference on data mining (ICDM), pp 745–754
Bastian M, Heymann S, Jacomy M (2009) Gephi: an open source software for exploring and manipulating networks. In: International AAAI conference on weblogs and social media. AAS, pp 361–362
Jacomy M, Venturini T, Heymann S, Bastian M (2014) ForceAtlas2, a continuous graph layout algorithm for handy network visualization designed for the Gephi software. PLoS One 9(6):e98679
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2019 Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
Akhter, N., Hassan, L., Rajabi, Z., Barbará, D., Shehu, A. (2019). Learning Organizations of Protein Energy Landscapes: An Application on Decoy Selection in Template-Free Protein Structure Prediction. In: Kister, A. (eds) Protein Supersecondary Structures. Methods in Molecular Biology, vol 1958. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-9161-7_8
Download citation
DOI: https://doi.org/10.1007/978-1-4939-9161-7_8
Published:
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-9160-0
Online ISBN: 978-1-4939-9161-7
eBook Packages: Springer Protocols