Abstract
In this paper we present a novel parallel-coordinate-based clustering method that uses Gaussian mixture distribution models to characterize the conformational space of proteins. We detect highly populated regions, which may correspond to intermediate states that are difficult to detect experimentally. The data are represented as feature vectors of N dimensions, which are lower-dimensional projections of the protein conformations. Parallel coordinates are a visualization technique that lays out coordinate axes in parallel rather than orthogonal to each other, thereby allowing patterns between pairs of axes, as well as outliers, to be identified visually in multi-dimensional data. We believe that the size of the resulting clusters may provide information about how likely the corresponding conformations are to exist as important intermediates. We tested our method on the conformational space of the enzyme Adenylate Kinase (AdK), which undergoes large-scale conformational changes, and detected clusters that may correspond to experimentally known intermediates. Finally, we compare our clusters with those generated by the K-Means clustering algorithm and discuss the advantages of our method for the problem of characterizing protein conformational space.
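The idea of fitting a Gaussian mixture along a coordinate axis and reading off cluster sizes can be illustrated with a minimal sketch. This is not the authors' implementation: it fits a univariate two-component mixture with plain expectation-maximization on a single axis, and the data, the helper names (`gaussian_pdf`, `fit_gmm_1d`, `assign`), the quantile-based initialization, and the number of components are all assumptions made for illustration.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(values, k, iterations=100, min_var=1e-6):
    """Fit a k-component 1-D Gaussian mixture with plain EM.

    Returns (weights, means, variances). Means start at evenly spaced
    quantiles of the data, which is a simplifying assumption.
    """
    data = sorted(values)
    n = len(data)
    means = [data[int((i + 0.5) * n / k)] for i in range(k)]
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n
    variances = [max(overall_var, min_var)] * k
    weights = [1.0 / k] * k

    for _ in range(iterations):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            dens = [w * gaussian_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300  # guard against underflow to zero
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-300
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            spread = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(spread, min_var)
    return weights, means, variances


def assign(x, weights, means, variances):
    """Index of the component with the highest posterior for x."""
    scores = [w * gaussian_pdf(x, m, v)
              for w, m, v in zip(weights, means, variances)]
    return scores.index(max(scores))


# Synthetic stand-in for one axis of the projected conformations
# (hypothetical data: a large population near 0 and a smaller one near 5).
rng = random.Random(0)
axis_values = ([rng.gauss(0.0, 0.5) for _ in range(300)]
               + [rng.gauss(5.0, 0.5) for _ in range(100)])

weights, means, variances = fit_gmm_1d(axis_values, k=2)
labels = [assign(x, weights, means, variances) for x in axis_values]
cluster_sizes = [labels.count(j) for j in range(2)]
print("means:", [round(m, 2) for m in means])
print("cluster sizes:", cluster_sizes)
```

In this toy setting the cluster sizes reflect how populated each region of the axis is, which is the quantity the abstract suggests may relate to the likelihood of an intermediate. A real analysis would work on all N axes of the feature vectors and would need a principled way to choose the number of components.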
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Vetro, R., Haspel, N., Simovici, D. (2013). Characterizing Intermediate Conformations in Protein Conformational Space. In: Peterson, L.E., Masulli, F., Russo, G. (eds) Computational Intelligence Methods for Bioinformatics and Biostatistics. CIBB 2012. Lecture Notes in Computer Science, vol 7845. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38342-7_7
DOI: https://doi.org/10.1007/978-3-642-38342-7_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38341-0
Online ISBN: 978-3-642-38342-7
eBook Packages: Computer Science (R0)