Abstract
In this paper we revisit the dynamic dictionary matching problem, which asks for an index for a set of patterns P 1, P 2, ..., P k that can support the following query and update operations efficiently. Given a query text T, we want to find all the occurrences of of these patterns; furthermore, as the set of patterns may change over time, we also want to insert or delete a pattern. The major contribution of this paper is the first succinct index for dynamic dictionary matching. Prior to our work, the most compact index is given by Chan et al. (2007), which is based on the compressed suffix arrays (Grossi and Vitter (2005) and Sadakane (2003)) and the FM-index (Ferragina and Manzini (2005)), and it requires O(n σ) bits where n is the total length of patterns and σ is the alphabet size. We develop a dynamic succinct index using a different (and simpler) paradigm based on suffix sampling. The new index not only improves the space complexity to (1 + o(1))n logσ + O(klogn) bits, but also the time complexity of the query and update operations. Specifically, the query and update operations respectively take O(|T|logn + occ) and O(|P|logσ + logn) times, where occ is the number of occurrences.
This work is supported in part by Taiwan NSC Grant 96-2221-E-007-082-MY3 (W. Hon), Hong Kong RGC Grant HKU 7140/06E (T. Lam), and US NSF Grant CCF–0621457 (R. Shah and J. S. Vitter).
Access provided by Autonomous University of Puebla. Download to read the full chapter text
Chapter PDF
Similar content being viewed by others
Keywords
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
References
Aho, A., Corasick, M.: Efficient String Matching: An Aid to Bibliographic Search. Communications of the ACM 18(6), 333–340 (1975)
Alstrup, S., Husfeldt, T., Rauhe, T.: Marked Ancestor Problems. In: Proceedings of Symposium on Foundations of Computer Science, pp. 534–544 (1998)
Amir, A., Farach, M., Idury, R., La Poutre, A., Schaffer, A.: Improved Dynamic Dictionary Matching. Information and Computation 119(2), 258–282 (1995)
Arge, L., Vitter, J.S.: Optimal External Memory Interval Management. SIAM Journal on Computing, 1488–1508 (2003)
Bender, M.A., Cole, R., Demaine, E.D., Farach-Colton, M., Zito, J.: Two Simplified Algorithms for Maintaining Order in a List. In: Proceedings of European Symposium on Algorithms, pp. 152–164 (2002)
Burrows, M., Wheeler, D.J.: A Block-sorting Lossless Data Compression Algorithm. Tech Report 124, Digital Equipment Corporation, CA, USA (1994)
Chan, H.L., Hon, W.K., Lam, T.W., Sadakane, K.: Compressed Indexes for Dynamic Text Collections. ACM Transactions on Algorithms 3(2) (2007)
Dietz, P.F., Sleator, D.D.: Two Algorithms for Maintaining Order in a List. In: Proceedings of Symposium on Theory of Computing, pp. 365–372 (1987)
Ferragina, P., Grossi, R.: The String B-tree: A New Data Structure for String Searching in External Memory and Its Application. Journal of the ACM 46(2), 236–280 (1999)
Ferragina, P., Manzini, G.: Indexing Compressed Text. Journal of the ACM 52(4), 552–581 (2005)
Grossi, R., Vitter, J.S.: Compressed Suffix Arrays and Suffix Trees with Applications to Text Indexing and String Matching. SIAM Journal on Computing 35(2), 378–407 (2005)
Hon, W.-K., Lam, T.-W., Shah, R., Tam, S.-L., Vitter, J.S.: Compressed Index for Dictionary Matching. In: DCC 2008, pp. 23–32 (2008)
Kärkkäinen, J., Ukkonen, E.: Sparse Suffix Trees. In: Proceedings of International Conference on Computing and Combinatorics, pp. 219–230 (1996)
Knuth, D.E., Morris, J.H., Pratt, V.B.: Fast Pattern Matching in Strings. SIAM Journal on Computing 6(2), 323–350 (1977)
Manber, U., Myers, G.: Suffix Arrays: A New Method for On-Line String Searches. SIAM Journal on Computing 22(5), 935–948 (1993)
McCreight, E.M.: A Space-economical Suffix Tree Construction Algorithm. Journal of the ACM 23(2), 262–272 (1976)
Overmars, M.H.: The Design of Dynamic Data Structures. LNCS, vol. 156. Springer, Heidelberg (1983)
Sadakane, K.: New text indexing functionalities of the compressed suffix arrays. Journal of Algorithms 48(2), 294–313 (2003)
Sadakane, K.: Compressed Suffix Trees with Full Functionality. Theory of Computing Systems, 589–607 (2007)
Weiner, P.: Linear Pattern Matching Algorithms. In: Proceedings of Symposium on Switching and Automata Theory, pp. 1–11 (1973)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hon, WK., Lam, TW., Shah, R., Tam, SL., Vitter, J.S. (2009). Succinct Index for Dynamic Dictionary Matching. In: Dong, Y., Du, DZ., Ibarra, O. (eds) Algorithms and Computation. ISAAC 2009. Lecture Notes in Computer Science, vol 5878. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10631-6_104
Download citation
DOI: https://doi.org/10.1007/978-3-642-10631-6_104
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-10630-9
Online ISBN: 978-3-642-10631-6
eBook Packages: Computer ScienceComputer Science (R0)