Compressed Text Indexes with Fast Locate

González, Rodrigo; Navarro, Gonzalo

doi:10.1007/978-3-540-73437-6_23

Rodrigo González¹ &
Gonzalo Navarro¹

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 4580))

Included in the following conference series:

Annual Symposium on Combinatorial Pattern Matching

746 Accesses
38 Citations

Abstract

Compressed text (self-)indexes have matured up to a point where they can replace a text by a data structure that requires less space and, in addition to giving access to arbitrary text passages, support indexed text searches. At this point those indexes are competitive with traditional text indexes (which are very large) for counting the number of occurrences of a pattern in the text. Yet, they are still hundreds to thousands of times slower when it comes to locating those occurrences in the text. In this paper we introduce a new compression scheme for suffix arrays which permits locating the occurrences extremely fast, while still being much smaller than classical indexes. In addition, our index permits a very efficient secondary memory implementation, where compression permits reducing the amount of I/O needed to answer queries.

Access provided by Autonomous University of Puebla. Download to read the full chapter text

Chapter PDF

Fast Compressed Self-indexes with Deterministic Linear-Time Construction

Article 22 October 2019

Adding Compression and Blended Search to a Compact Two-Level Suffix Array

Most Recent Match Queries in On-Line Suffix Trees

Keywords

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

References

Ferragina, P., Manzini, G.: Indexing compressed texts. J. of the ACM 52(4), 552–581 (2005)
Article MathSciNet Google Scholar
Ferragina, P., Manzini, G., Mäkinen, V., Navarro, G.: Compressed representation of sequences and full-text indexes. ACM Transactions on Algorithms, 2006. TR 2004-05, Technische Fakultät, Univ. Bielefeld, Germany (to appear)
Google Scholar
González, R., Navarro, G.: Statistical encoding of succinct data structures. In: Lewenstein, M., Valiente, G. (eds.) CPM 2006. LNCS, vol. 4009, pp. 295–306. Springer, Heidelberg (2006)
Chapter Google Scholar
Grossi, R., Gupta, A., Vitter, J.: High-order entropy-compressed text indexes. In: Proc. 14th SODA, pp. 841–850 (2003)
Google Scholar
Jacobson, G.: Space-efficient static trees and graphs. In: Proc. 30th FOCS, pp. 549–554 (1989)
Google Scholar
Larsson, J., Moffat, A.: Off-line dictionary-based compression. Proc. IEEE 88(11), 1722–1732 (2000)
Article Google Scholar
Mäkinen, V.: Compact suffix array — a space-efficient full-text index. Fundamenta Informaticae 56(1–2), 191–210 (2003)
MATH MathSciNet Google Scholar
Mäkinen, V., Navarro, G.: Succinct suffix arrays based on run-length encoding. Nordic J. of Computing 12(1), 40–66 (2005)
Google Scholar
Mäkinen, V., Navarro, G., Sadakane, K.: Advantages of backward searching — efficient secondary memory and distributed implementation of compressed suffix arrays. In: Fleischer, R., Trippen, G. (eds.) ISAAC 2004. LNCS, vol. 3341, pp. 681–692. Springer, Heidelberg (2004)
Google Scholar
Manber, U., Myers, G.: Suffix arrays: a new method for on-line string searches. SIAM J. Computing 22(5), 935–948 (1993)
Article MATH MathSciNet Google Scholar
Manzini, G.: An analysis of the Burrows-Wheeler transform. J. of the ACM 48(3), 407–430 (2001)
Article MathSciNet Google Scholar
Navarro, G.: Indexing text using the Ziv-Lempel trie. J. of Discrete Algorithms 2(1), 87–114 (2004)
Article MATH MathSciNet Google Scholar
Navarro, G., Mäkinen, V.: Compressed full-text indexes. ACM Computing Surveys (to appear)
Google Scholar
Sadakane, K.: New text indexing functionalities of the compressed suffix arrays. J. of Algorithms 48(2), 294–313 (2003)
Article MATH MathSciNet Google Scholar
Weiner, P.: Linear pattern matching algorithm. In: Proc. 14th IEEE Symp. on Switching and Automata Theory, pp. 1–11 (1973)
Google Scholar

Download references

Author information

Authors and Affiliations

Dept. of Computer Science, University of, Chile
Rodrigo González & Gonzalo Navarro

Authors

Rodrigo González
View author publications
You can also search for this author in PubMed Google Scholar
Gonzalo Navarro
View author publications
You can also search for this author in PubMed Google Scholar

Editor information

Bin Ma Kaizhong Zhang

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

González, R., Navarro, G. (2007). Compressed Text Indexes with Fast Locate. In: Ma, B., Zhang, K. (eds) Combinatorial Pattern Matching. CPM 2007. Lecture Notes in Computer Science, vol 4580. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73437-6_23

Download citation

DOI: https://doi.org/10.1007/978-3-540-73437-6_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73436-9
Online ISBN: 978-3-540-73437-6
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

Compressed Text Indexes with Fast Locate

Abstract

Chapter PDF

Similar content being viewed by others

Fast Compressed Self-indexes with Deterministic Linear-Time Construction

Adding Compression and Blended Search to a Compact Two-Level Suffix Array

Most Recent Match Queries in On-Line Suffix Trees

Keywords

References

Author information

Authors and Affiliations

Editor information

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Publish with us

Navigation

Compressed Text Indexes with Fast Locate

Abstract

Chapter PDF

Similar content being viewed by others

Fast Compressed Self-indexes with Deterministic Linear-Time Construction

Adding Compression and Blended Search to a Compact Two-Level Suffix Array

Most Recent Match Queries in On-Line Suffix Trees

Keywords

References

Author information

Authors and Affiliations

Editor information

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Share this paper

Publish with us

Search

Navigation