Abstract
A new algorithm for optimal sequence alignment allowing for long insertions and deletions is developed. The algorithm requires O((L+C)MN) computational steps, O(LN) primary memory and O(MN) secondary memory storage, whereM andN(M≥N) are sequence lengths,L (typicallyL≤3) is the number of segment specifying the gap weighting function, andC is a constant. We have also modified our earlier traceback algorithm so that it finds all and only the optimal alignments in a compact form of a directed graph. The current versions accept a set of aligned sequences as input, which facilitates multiple sequence alignment by some iterative procedures.
Article PDF
Similar content being viewed by others
Avoid common mistakes on your manuscript.
Literature
Aho, A. L., J. E. Hopcroft and J. D. Ullman. 1983.Data Structure and Algorithms. Reading, MA: Addison-Wesley.
Anderson, S., A. T. Bankier, B. G. Barrell, M. H. L. de Bruijn, A. R. Coulson, J. Drouin, I. C. Eperson, D. P. Nierlich, B. A. Roe, F. Sanger, P. H. Schreier, A. J. H. Smith, R. Staden and I. G. Young. 1981. Sequence and organization of the human mitochondrial genome.Nature 290, 457–465.
Altschul, S. F. and B. W. Erickson. 1986. Optimal sequence alignment using affine gap costs.Bull. math. Biol. 48, 603–616.
Feng, D.-F. and R. F. Doolittle. 1987. Progressive sequence alignment as a prerequisite to correct phylogenetic trees.J. molec. Evol. 25, 351–360.
Fickett, J. W. 1984. Fast optimal alignment.Nucleic Acids Res. 12, 175–179.
Fitch, W. M. and T. F. Smith. 1983. Optimal sequence alignments.Proc. natl. Acad. Sci. U.S.A. 80, 1382–1386.
Fredman, M. L. 1984. Algorithms for computing evolutionary similarity measures with length independent gap penalties.Bull. math. Biol. 46, 553–566.
Gotoh, O. 1982. An improved algorithm for matching biological sequences.J. molec. Biol. 162, 705–708.
Gotoh, O. 1986. Alignment of three biological sequences with an efficient traceback procedure.J. theor. Biol. 121, 327–337.
Gotoh, O. 1987. Pattern matching of biological sequences with limited storage.Comput. Applic. Biosci. 3, 17–20.
Gotoh, O. and Y. Fujii-Kuriyama. 1989. Evolution, structure, and gene regulation of cytochrome P450. InFrontier in Biotransformation, K. Ruckpaul and H. Rein (eds), Vol. 1, pp. 165–243. Berlin: Akademie Verlag.
Miller, W. and E. W. Myers. 1988. Sequence comparison with concave weighting functions.Bull. math. Biol. 50, 97–120.
Myers, E. W. and W. Miller. 1988. Optimal alignments in linear space.Comput. Applic Biosci. 4, 11–17.
Needleman, S. B. and C. D. Wunsch. 1970. A general method applicable to the search for similarities in the amino acid sequence of two proteins.J. molec. Biol. 48, 443–453.
Roe, B. A., D.-P. Ma, R. K. Wilson and J. F.-H. Wong. 1985. The complete nucleotide sequence of theXenopus laevis mitochondrial genome.J. biol. Chem. 260, 9759–9774.
Sankoff, D. 1972. Matching sequences under deletion/insesertion constraints.Proc. natl Acad. Sci. U.S.A. 69, 4–6.
Sellers, P. H. 1974. On the theory and computation of evolutionary distances.SIAM J. appl. Math. 26, 787–793.
Smith, T. F. and M. S. Waterman. 1981. Identification of common molecular subsequences.J. molec. Biol. 147, 195–197.
Smith, T. F., M. S. Waterman and W. M. Fitch. 1981. Comparative biosequence metrics.J. molec. Evol. 18, 38–46.
Taylor, P. 1984. A fast homology program for aligning biological sequences.Nucleic Acids Res. 12, 447–455.
Ukkonen, E. 1983. On approximate string matching.Proc. Int. Conf. Found. Comp. Theor. Lectures Notes Comp. Sci. 158, 487–496.
Waterman, M. S. 1984. Efficient sequence alignment algorithms.J. theor. Biol. 108, 333–337.
Waterman, M. S. and M. D. Perlwitz. 1984. Line geometries for sequence comparisons.Bull. math. Biol. 46, 567–577.
Waterman, M. S., T. F. Smith and W. A. Beyer. 1976. Some biological sequence metrics.Adv. Math. 20, 367–387.
Wilbur, W. J. and D. J. Lipman. 1983. Rapid similarity searches of nucleic acid and protein data banks.Proc. natl Acad. Sci. U.S.A. 80, 726–730.
Zwieb, C., C. Glotz and R. Brimacombe. 1981. Secondary structure comparisons between small subunit ribosomal RNA molecules from six different species.Nucleic Acids Res. 9, 3621–3640.
Author information
Authors and Affiliations
Additional information
Dedicated to Professor Akiyoshi Wada on the occasion of his 60th birthday.
Rights and permissions
About this article
Cite this article
Gotoh, O. Optimal sequence alignment allowing for long gaps. Bltn Mathcal Biology 52, 359–373 (1990). https://doi.org/10.1007/BF02458577
Received:
Revised:
Issue Date:
DOI: https://doi.org/10.1007/BF02458577