Abstract
The prediction of RNA secondary structure including pseudoknots remains a challenge due to the intractable computation of the sequence conformation from nucleotide interactions under free energy models. Optimal algorithms often assume a restricted class for the predicted RNA structures and yet still require a high-degree polynomial time complexity, which is too expensive to use. Heuristic methods may yield time-efficient algorithms but they do not guarantee optimality of the predicted structure. This paper introduces a new and efficient algorithm for the prediction of RNA structure with pseudoknots for which the structure is not restricted. Novel prediction techniques are developed based on graph tree decomposition. In particular, based on a simplified energy model, stem overlapping relationships are defined with a graph, in which a specialized maximum independent set corresponds to the desired optimal structure. Such a graph is tree decomposable; dynamic programming over a tree decomposition of the graph leads to an efficient optimal algorithm. The final structure predictions are then based on re-ranking a list of suboptimal structures under a more comprehensive free energy model. The new algorithm is evaluated on a large number of RNA sequence sets taken from diverse resources. It demonstrates overall sensitivity and specificity that outperforms or is comparable with those of previous optimal and heuristic algorithms yet it requires significantly less time than the compared optimal algorithms.
Article PDF
Similar content being viewed by others
Avoid common mistakes on your manuscript.
References
Abrahams J., van den Berg M., van Batenburg E. and Pleij C. (1990). Prediction of RNA secondary structure, including pseudoknotting, by computer simulation. Nucleic Acids Res. 18: 3035–3044
Adams P.L., Stahley M.R., Kosek A.B., Wang J. and Strobel S.A. (2004). Crystal structure of a self-splicing group i intron with both exons. Nature 430: 45–50
Akutsu T. (2000). Dynamic programming algorithms for RNA secondary structure prediction with pseudoknots. Discrete Appl. Math. 104: 45–62
Bodlaender, H.L.: Classes of graphs with bounded tree-width. Tech. Rep. RUU-CS-86-22, Dept. of Computer Science, Utrecht University, the Netherlands (1986)
Bodlaender, H.L.: Dynamic programming algorithms on graphs with bounded tree-width. In: Proceedings of the 15th International Colloquium on Automata, Languages and Programming, pp. 105–119. Springer Verlag, Lecture Notes in Computer Science, vol. 317, (1987)
Brown J. (1999). The ribonuclease p database. Nucleic Acids Res. 27: 314
Chen J.-H., Le S.-Y. and Maize J.V. (2000). Prediction of common secondary structures of RNAs: a genetic algorithm approach. Nucleic Acids Res. 28(4): 991–999
Dirks R. and Pierce N. (2003). A partition function algorithm for nucleic acid secondary structure including pseudoknots. J. Comput. Chem. 24: 1664–1677
Durbin R., Eddy S.R., Krogh A. and Mitchison G.J. (1998). Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge University Press, Cambridge
Eddy S.R. and Durbin R. (1994). RNA sequence analysis using covariance models. Nucleic Acids Res. 22: 2079–2088
Giedroc D., Theimer C. and Nixon P. (2000). Structure, stability and function of RNA pseudoknots involved in stimulating ribosomal frame shifting. J. Mol. Biol. 298: 167–185
Hicks, I.V., Koster, A.M.C.A., Kolotoglu, E.: Branch and tree decomposition techniques for discrete optimization. In: Tutorials in Operations Research: INFORMS, New Orleans 2005 (2005)
Ji Y., Xu X. and Stormo G.D. (2004). A graph theoretical approach for predicting common RNA secondary structure motifs including pseudoknots in unaligned sequences. Bioinformatics 20(10): 1591–1602
Ke A., Zhou K., Ding F., Cate J.H. and Doudna J.A. (2004). A conformational switch controls hepatitis delta virus ribozyme catalysis. Nature 429: 201–205
Knudsen B. and Hein J. (2003). Pfold: RNA secondary structure prediction using stochastic context-free grammars. Nucleic Acids Res. 31(13): 3423–3428
Lyngso R.B. and Pedersen C.N.S. (2000). RNA pseudoknot prediction in energy-based models. J. Comput. Biol. 7(3–4): 409–427
Mathews D.H., Sabina J., Zuker M. and Pederson C.N.S. (1999). Expanded sequence dependence of the thermodynamic parameters improves prediction of RNA secondary structure. J. Mol. Biol. 288: 911–940
Nussinov R., Pieczenik G., Griggs J. and Kleitman D. (1978). Algorithms for loop matchings. SIAM J. Appl. Math. 35: 68–82
Ren J., Rastegart B., Condon A. and Hoos H.H. (2005). HotKnots: heuristic prediction of RNA secondary structures including pseudoknots. RNA 11: 1194–1504
Rivas E. and Eddy S.R. (1999). A dynamic programming algorithm for RNA structure prediction including pseudoknots. J. Mol. Biol. 285: 2053–2068
Robertson N. and Seymour P.D. (1986). Graph minors ii. Algorithmic aspects of tree width. J. Algorithms 7: 309–322
Ruan J., Stormo G.D. and Zhang W. (2004). An iterated loop matching approach to the prediction of RNA secondary structures with pseudoknots. Bioinformatics 20(1): 58–66
Sakakibara Y., Brown M., Hughey R., Mian I.S., Sjölander K., Underwood R.C. and Haussler D. (1994). Stochastic context-free grammars for tRNA modeling. Nucleic Acids Res. 22: 5112–5120
Schimmel P. (1989). RNA pseudoknots that interact with components of the translation apparatus. Cell 58(1): 9–12
Serra M.J., Turner D.H. and Freier S.M. (1995). Predicting thermodynamic properties of RNA. Meth. Enzymol. 259: 243–261
Song, Y., Liu, C., Malmberg, R.L., Pan, F., Cai, L.: Tree decomposition based fast search of RNA structures including pseudoknots in genomes. In: Proceedings of 2005 Computational System Bioinformatics Conference, pp. 223–234. IEEE Computer Society (2005)
Sprinzl M., Horn C., Brown M., Ioudovitch A. and Steinberg S. (1998). Compilation of tRNA sequences and sequences of tRNA genes. Nucleic Acids Res. 26: 148–153
Steffen P., Voss B., Rehmsmeier M., Reeder J. and Giegerich R. (2006). Rnashapes: an integrated RNA analysis package based on abstract shapes. Bioinformatics 22(4): 500–503
Tabaska J., Cary R., Gabow H. and Stormo G. (1998). An RNA folding method capable of identifying pseudoknots and base triples. Bioinformatics 14(8): 691–699
Uemura Y., Hasegawa A., Kobayashi S. and Yokomori T. (1999). Tree adjoining grammars for RNA structure prediction. Theor. Comput. Sci. 210: 277–303
van Batenburg F., Gultyaev A., Pleij C., Ng J. and Oliehoek J. (2000). Pseudobase: a database with RNA pseudoknots. Nucleic Acids Res. 28: 201–204
Zuker M. and Stiegler P. (1981). Optimal computer folding of large RNA sequences using thermodynamics and auxiliary information. Nucleic Acids Res. 9(1): 133–148
Author information
Authors and Affiliations
Corresponding authors
Additional information
The preliminary version of this paper appeared in the proceedings of the 6th Workshop on Algorithms for Bioinformatics (WABI 2006).
Rights and permissions
About this article
Cite this article
Zhao, J., Malmberg, R.L. & Cai, L. Rapid ab initio prediction of RNA pseudoknots via graph tree decomposition. J. Math. Biol. 56, 145–159 (2008). https://doi.org/10.1007/s00285-007-0124-4
Received:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00285-007-0124-4