Abstract
This paper re-examines, in a unified framework, two classic approaches to the problem of finding a longest common subsequence (LCS) of two strings, and proposes faster implementations for both. Letl be the length of an LCS between two strings of lengthm andn ≥m, respectively, and let s be the alphabet size. The first revised strategy follows the paradigm of a previousO(ln) time algorithm by Hirschberg. The new version can be implemented in timeO(lm · min logs, logm, log(2n/m)), which is profitable when the input strings differ considerably in size (a looser bound for both versions isO(mn)). The second strategy improves on the Hunt-Szymanski algorithm. This latter takes timeO((r +n) logn), wherer≤mn is the total number of matches between the two input strings. Such a performance is quite good (O(n logn)) whenr∼n, but it degrades to Θ(mn logn) in the worst case. On the other hand the variation presented here is never worse than linear-time in the productmn. The exact time bound derived for this second algorithm isO(m logn +d log(2mn/d)), whered ≤r is the number ofdominant matches (elsewhere referred to asminimal candidates) between the two strings. Both algorithms require anO(n logs) preprocessing that is nearly standard for the LCS problem, and they make use of simple and handy auxiliary data structures.
Article PDF
Similar content being viewed by others
Avoid common mistakes on your manuscript.
References
A. V. Aho, D. S. Hirschberg, and J. D. Ullman, Bounds on the complexity of the maximal common subsequence problem,J. Assoc. Comput. Mach.,23 (1976), 1–12.
A. V. Aho, J. E. Hopcroft, and J. D. Ullman,The Design and Analysis of Computer Algorithms, Addison-Wesley, Reading, MA, 1976
A. Apostolico, Improving the worst-case performance of the Hunt-Szymanski strategy for the longest common subsequence of two strings,Inform. Process. Lett.,23 (1986), 63–69.
A. Apostolico and C. Guerra, A fast linear space algorithm for computing longest common subsequences,Proceedings of the 23rd Allerton Conference, Monticello, IL, 1985, pp. 76–84.
J. L. Bentley and A. C.-C. Yao, An almost optimal algorithm for unbounded searching,Inform. Process. Lett.,5 (1976), 82–87.
M. R. Brown, and R. E. Tarjan, A representation of linear lists with movable fingers,Proceedings of the Wth Annual ACM Symposium on Theory of Computing, San Diego, CA, 1978, pp. 19–29.
M. R. Brown and R. E. Tarjan, A fast merging algorithm,J. Assoc. Comput. Mach. 26 (1979), 211–226.
D. S. Hirschberg, A linear space algorithm for computing maximal common subsequences,Comm. ACM,18 (1975), 341–343.
D. S. Hirschberg, Algorithms for the longest common subsequence problemJ Assoc. Comput. Mach.,24 (1977), 664–675.
W. J. Hsu and M. W. Du, New algorithms for the LCS problem,J. Comput. System Sci.,29 (1984), 133–152.
J. W. Hunt and T. G. Szymanski, A fast algorithm for computing longest common subsequences,Comm. ACM,20 (1977), 350–353.
H. M. Martinez (ed.), Mathematical and computational problems in the analysis of molecular sequences,Bull. Math. Biol,46, 4 (1984).
W. J. Masek and M. S. Paterson, A faster algorithm for computing string editing distances,J. Comput. System Sci.,20 (1980), 18–31.
K. Mehlhorn,Data Structures and Algorithms 1: Sorting and Searching, EATCS Monographs on TCS, Springer-Verlag, Berlin, 1984.
D. Sankoff and J. B. Kruskal (eds.),Time Warps, String Edits and Macromolecules: The Theory and Practice of Sequence Comparisons, Addison-Wesley, Reading, MA, 1983.
P. van Emde Boas, Preserving order in a forest in less than logarithmic time and linear space,Inform. Process. Lett.,6 (1977), 80–82.
R. A. Wagner and M. J. Fischer, The string to string correction problem,J. Assoc. Comput. Mach.,21 (1974), 168–173.
Author information
Authors and Affiliations
Additional information
Communicated by David P. Dobkin.
Rights and permissions
About this article
Cite this article
Apostolico, A., Guerra, C. The longest common subsequence problem revisited. Algorithmica 2, 315–336 (1987). https://doi.org/10.1007/BF01840365
Received:
Revised:
Issue Date:
DOI: https://doi.org/10.1007/BF01840365