Abstract
We first give a representation of a suffix tree that uses \(n \lg n + O(n)\) bits of space and supports searching for a pattern in a given text (over a fixed-size alphabet) in O(m) time, where n is the length of the text and m is the length of the pattern. The structure is quite simple and answers a question raised by Muthukrishnan [17]. Previous compact representations of suffix trees either had a larger lower-order term in their space bound and relied on expected-case assumptions [3], or required more time for searching [5]. We then show that, surprisingly, we can do even better: we develop a structure that uses a suffix array (and so \(n \lceil \lg n \rceil \) bits) plus an additional o(n) bits, and that also supports string searching in O(m) time. Besides string searching, we can also report the number of occurrences of the pattern within the same time bound, using no additional space. For this counting problem, our structures occupy much less space than many of the previously known structures. When the alphabet size k is not a constant, our structures extend easily, using standard techniques, either to structures that use the same space but take \(O(m \lg k)\) time for string searching, or to structures that use an additional \(O(n \lg k)\) bits but retain the O(m) search time.
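For orientation, the following is a minimal sketch of the baseline that the abstract improves on: a plain suffix array (n entries of \(\lceil \lg n \rceil \) bits each) searched by binary search, which counts occurrences in \(O(m \lg n)\) time rather than O(m). It is not the structure developed in the paper, which adds o(n) bits of auxiliary information to reach O(m). The function names `build_suffix_array` and `count_occurrences` are illustrative assumptions, and the construction shown is deliberately naive.

```python
def build_suffix_array(text):
    """Return the start positions of all suffixes of `text`, sorted
    lexicographically. Naive construction, for illustration only."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def count_occurrences(text, sa, pattern):
    """Count occurrences of `pattern` in `text` using the suffix array `sa`.

    Plain binary search over the sorted suffixes: O(m lg n) time.
    This is the classical baseline, not the O(m) structure of the paper.
    """
    m = len(pattern)

    # Lower bound: first suffix whose length-m prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    first = lo

    # Upper bound: first suffix whose length-m prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    # Every suffix in sa[first:lo] starts with the pattern.
    return lo - first


if __name__ == "__main__":
    text = "banana"
    sa = build_suffix_array(text)
    print(sa)
    print(count_occurrences(text, sa, "ana"))
```

All suffixes that begin with the pattern form one contiguous range of the suffix array, so the count is the width of that range and no extra space beyond the array is needed to report it.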
References
[1] Apostolico, A., Preparata, F.P.: Structural properties of the string statistics problem. Journal of Computer and System Sciences 31, 394–411 (1985)
[2] Cardenas, A.F.: Analysis and performance of inverted data base structures. Communications of the ACM 18(5), 253–263 (1975)
[3] Clark, D.R., Munro, J.I.: Efficient Suffix Trees on Secondary Storage. In: Proceedings of the 7th ACM-SIAM Symposium on Discrete Algorithms, pp. 383–391 (1996)
[4] Clift, B., Haussler, D., McConnell, R., Schneider, T.D., Stormo, G.D.: Sequence landscapes. Nucleic Acids Research 14(1), 141–158 (1986)
[5] Colussi, L., De Col, A.: A time and space efficient data structure for string searching on large texts. Information Processing Letters 58, 217–222 (1996)
[6] Fraser, C., Wendt, A., Myers, E.W.: Analysing and compressing assembly code. In: Proceedings of the SIGPLAN Symposium on Compiler Construction (1984)
[7] Gonnet, G.H., Baeza-Yates, R.A., Snider, T.: New indices for text: PAT trees and PAT arrays. In: Frakes, W.B., Baeza-Yates, R. (eds.) Information Retrieval: Data Structures and Algorithms, pp. 66–82. Prentice-Hall, Englewood Cliffs (1992)
[8] Jacobson, G.: Space-efficient Static Trees and Graphs. In: Proceedings of the IEEE Symposium on Foundations of Computer Science, pp. 549–554 (1989)
[9] Kärkkäinen, J., Ukkonen, E.: Sparse suffix trees. In: Cai, J.-Y., Wong, C.K. (eds.) COCOON 1996. LNCS, vol. 1090, pp. 219–230. Springer, Heidelberg (1996)
[10] Landau, G.M., Vishkin, U.: Introducing efficient parallelism into approximate string matching. In: Proceedings of the 18th ACM Symposium on Theory of Computing, pp. 220–230 (1986)
[11] Manber, U., Myers, G.: Suffix Arrays: A New Method for On-line String Searches. SIAM Journal on Computing 22(5), 935–948 (1993)
[12] McCreight, E.M.: A space-economical suffix tree construction algorithm. Journal of the ACM 23, 262–272 (1976)
[13] Morrison, D.R.: PATRICIA: Practical Algorithm To Retrieve Information Coded In Alphanumeric. Journal of the ACM 15, 514–534 (1968)
[14] Munro, J.I., Benoit, D.: Succinct Representation of k-ary trees. Manuscript
[15] Munro, J.I.: Tables. In: Chandru, V., Vinay, V. (eds.) FSTTCS 1996. LNCS, vol. 1180, pp. 37–42. Springer, Heidelberg (1996)
[16] Munro, J.I., Raman, V.: Succinct representation of balanced parentheses, static trees and planar graphs. In: Proceedings of the IEEE Symposium on Foundations of Computer Science, pp. 118–126 (1997)
[17] Muthukrishnan, S.: Randomization in Stringology. In: Proceedings of the Preconference Workshop on Randomization, Kharagpur, India (December 1997)
[18] Rodeh, M., Pratt, V.R., Even, S.: Linear algorithm for data compression via string matching. Journal of the ACM 28(1), 16–24 (1981)
[19] Shang, H.: Trie methods for text and spatial data structures on secondary storage. PhD Thesis, McGill University (1995)
[20] Weiner, P.: Linear pattern matching algorithm. In: Proceedings of the 14th IEEE Symposium on Switching and Automata Theory, pp. 1–11 (1973)
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Munro, I., Raman, V., Rao, S.S. (1998). Space Efficient Suffix Trees. In: Arvind, V., Ramanujam, S. (eds) Foundations of Software Technology and Theoretical Computer Science. FSTTCS 1998. Lecture Notes in Computer Science, vol 1530. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-49382-2_17
DOI: https://doi.org/10.1007/978-3-540-49382-2_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65384-4
Online ISBN: 978-3-540-49382-2
eBook Packages: Springer Book Archive