Abstract
We present a new data structure called the Compressed Random Access Memory (CRAM) that can store a dynamic string T of characters, e.g., representing the memory of a computer, in compressed form while achieving asymptotically almost-optimal bounds (in terms of empirical entropy) on the compression ratio. It allows short substrings of T to be decompressed and retrieved efficiently and, significantly, characters at arbitrary positions of T to be modified quickly during execution without decompressing the entire string. This can be regarded as a new type of data compression that can update a compressed file directly. Moreover, at the cost of slightly increasing the time spent per operation, the CRAM can be extended to also support insertions and deletions. Our key observation that the empirical entropy of a string does not change much after a small change to the string, as well as our simple yet efficient method for maintaining an array of variable-length blocks under length modifications, may be useful for many other applications as well.
Access provided by Autonomous University of Puebla. Download to read the full chapter text
Chapter PDF
Similar content being viewed by others
Keywords
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
References
Cover, T.M., Thomas, J.A.: Elements of Information Theory. Wiley Interscience (1991)
Ferragina, P., Manzini, G.: Indexing compressed text. Journal of the ACM 52(4), 552–581 (2005)
Ferragina, P., Manzini, G., Mäkinen, V., Navarro, G.: Compressed representations of sequences and full-text indexes. ACM Transactions on Algorithms 3(2), article No. 20 (2007)
Ferragina, P., Venturini, R.: A simple storage scheme for strings achieving entropy bounds. Theoretical Computer Science 372(1), 115–121 (2007)
Fredman, M.L., Saks, M.E.: The cell probe complexity of dynamic data structures. In: Proceedings of ACM STOC, pp. 345–354 (1989)
González, R., Navarro, G.: Statistical Encoding of Succinct Data Structures. In: Lewenstein, M., Valiente, G. (eds.) CPM 2006. LNCS, vol. 4009, pp. 294–305. Springer, Heidelberg (2006)
González, R., Navarro, G.: Rank/select on dynamic compressed sequences and applications. Theoretical Computer Science 410(43), 4414–4422 (2009)
Grossi, R., Gupta, A., Vitter, J.S.: High-order entropy-compressed text indexes. In: Proceedings of ACM-SIAM SODA, pp. 841–850 (2003)
Hagerup, T.: Sorting and searching on the word RAM. In: Proceedings of Symposium on Theory Aspects of Computer Science (STACS 1998), pp. 366–398 (1998)
Hagerup, T., Raman, R.: An Efficient Quasidictionary. In: Penttonen, M., Schmidt, E.M. (eds.) SWAT 2002. LNCS, vol. 2368, pp. 1–18. Springer, Heidelberg (2002)
He, M., Munro, J.I.: Succinct Representations of Dynamic Strings. In: Chavez, E., Lonardi, S. (eds.) SPIRE 2010. LNCS, vol. 6393, pp. 334–346. Springer, Heidelberg (2010)
Kosaraju, S.R., Manzini, G.: Compression of low entropy strings with Lempel-Ziv algorithms. SIAM Journal on Computing 29(3), 893–911 (1999)
Mäkinen, V., Navarro, G.: Dynamic entropy-compressed sequences and full-text indexes. ACM Transactions on Algorithms 4(3), article No. 32 (2008)
Manzini, G.: An analysis of the Burrows-Wheeler transform. Journal of the ACM 48(3), 407–430 (2001)
Navarro, G., Sadakane, K.: Fully-functional static and dynamic succinct trees. Submitted for Journal Publication (2010), http://arxiv.org/abs/0905.0768 ; A preliminary version appeared in Proc. ACM-SIAM SODA, pp. 134–149 (2010)
Raman, R., Raman, V., Rao, S.S.: Succinct Dynamic Data Structures. In: Dehne, F., Sack, J.-R., Tamassia, R. (eds.) WADS 2001. LNCS, vol. 2125, pp. 426–437. Springer, Heidelberg (2001)
Sadakane, K., Grossi, R.: Squeezing succinct data structures into entropy bounds. In: Proceedings of ACM-SIAM SODA, pp. 1230–1239 (2006)
Simpson, J.T., Wong, K., Jackman, S.D., Schein, J.E., Jones, S.J.M., Birol, İ.: ABySS: A parallel assembler for short read sequence data. Genome Research 19(6), 1117–1123 (2009), http://dx.doi.org/10.1101/gr.089532.108
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jansson, J., Sadakane, K., Sung, WK. (2012). CRAM: Compressed Random Access Memory. In: Czumaj, A., Mehlhorn, K., Pitts, A., Wattenhofer, R. (eds) Automata, Languages, and Programming. ICALP 2012. Lecture Notes in Computer Science, vol 7391. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31594-7_43
Download citation
DOI: https://doi.org/10.1007/978-3-642-31594-7_43
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-31593-0
Online ISBN: 978-3-642-31594-7
eBook Packages: Computer ScienceComputer Science (R0)