This chapter will look in detail at how the Burrows-Wheeler Transform is implemented in practice. The examples given in Chapter 1 overlooked some important practical details — to transform a text of n characters the encoder was sorting an array of n strings, each n characters long, and the decoder performed n sorts to reverse the transform. This complexity is not necessary for the BWT, and in this chapter we will see how to perform the encoding and decoding in O(n) space, and O(n log n) time. In fact, using a few tricks, the time can be reduced to O(n).
We will also look at various auxiliary data structures that are used for decoding the Burrows-Wheeler Transform, as some of them, while not essential for decoding, are useful if the transformed text is to be searched. These extra structures can still be constructed in O(n) time so in principle they add little to the decoding cost.
This chapter considers only the transform; in the next chapter we will look at how a compression system can take advantage of the transformed text to reduce its size; we refer to this second phase as the “Local to Global Transform”, or LGT.
We will present the Burrows-Wheeler Transform for coding a string T of n characters, T[1 … n], over an alphabet Σ of |Σ| characters. Note that there is a summary of all the main notation in Appendix A on page 309.
Access provided by Autonomous University of Puebla. Download to read the full chapter text
Chapter PDF
Keywords
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
Rights and permissions
Copyright information
© 2008 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
(2008). How the Burrows-Wheeler Transform works. In: The Burrows-Wheeler Transform: Data Compression, Suffix Arrays, and Pattern Matching. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-78909-5_2
Download citation
DOI: https://doi.org/10.1007/978-0-387-78909-5_2
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-78908-8
Online ISBN: 978-0-387-78909-5
eBook Packages: Computer ScienceComputer Science (R0)