An analysis of the Burrows-Wheeler transform
Source: Journal of the ACM (JACM)
Volume 48, Issue 3 (May 2001)
Pages: 407-430
Year of Publication: 2001
ISSN:0004-5411
Author
Giovanni Manzini  Univ. del Piemonte Orientale, Alessandria, Italy
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/382780.382782

ABSTRACT

The Burrows-Wheeler Transform (also known as Block-Sorting) is at the base of compression algorithms that are the state of the art in lossless data compression. In this paper, we analyze two algorithms that use this technique. The first is the original algorithm described by Burrows and Wheeler, which, despite its simplicity, outperforms the Gzip compressor. The second uses an additional run-length encoding step to improve compression. We prove that the compression ratio of both algorithms can be bounded in terms of the kth-order empirical entropy of the input string for any k ≥ 0. We make no assumptions on the input, and we obtain bounds that hold in the worst case, that is, for every possible input string. All previous results for Block-Sorting algorithms were concerned with the average compression ratio and were established assuming that the input comes from a finite-order Markov source.
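
To make the quantities named in the abstract concrete, here is a minimal Python sketch of a Burrows-Wheeler transform, a move-to-front step, and the kth-order empirical entropy. It is an illustration under stated assumptions, not the paper's construction: the transform is the textbook sorted-rotations version with an assumed end-of-string sentinel, the move-to-front step is a common companion of block sorting that the abstract itself does not name, and the entropy follows the usual definition (the 0th-order entropy of the symbols following each length-k context, weighted by count and divided by the input length). The function names `bwt`, `move_to_front`, and `empirical_entropy` are hypothetical.

```python
from collections import Counter, defaultdict
from math import log2


def bwt(s, eos="\0"):
    """Naive Burrows-Wheeler transform: last column of the sorted rotations.

    Assumption: `eos` is a sentinel that does not occur in `s` and sorts
    before every other symbol. This takes O(n^2 log n) time and is meant
    for illustration only.
    """
    if eos in s:
        raise ValueError("sentinel must not occur in the input")
    t = s + eos
    rotations = sorted(t[i:] + t[:i] for i in range(len(t)))
    return "".join(rot[-1] for rot in rotations)


def move_to_front(s):
    """Replace each symbol by its index in a list that is reordered on the
    fly so that recently seen symbols sit near the front."""
    table = sorted(set(s))
    out = []
    for ch in s:
        i = table.index(ch)
        out.append(i)
        table.insert(0, table.pop(i))
    return out


def _h0(symbols):
    """0th-order empirical entropy of a sequence, in bits per symbol."""
    n = len(symbols)
    if n == 0:
        return 0.0
    return sum(c / n * log2(n / c) for c in Counter(symbols).values())


def empirical_entropy(s, k=0):
    """kth-order empirical entropy of s, in bits per symbol.

    For every length-k context w, collect the symbols that follow w in s,
    then sum len(followers) * H_0(followers) and divide by len(s).
    """
    if not s:
        return 0.0
    followers = defaultdict(list)
    for i in range(k, len(s)):
        followers[s[i - k:i]].append(s[i])
    return sum(len(f) * _h0(f) for f in followers.values()) / len(s)


if __name__ == "__main__":
    text = "banana"
    transformed = bwt(text)
    print(repr(transformed))
    print(move_to_front(transformed))
    print(empirical_entropy(text, 0), empirical_entropy(text, 1))
```

The transform tends to group symbols that share a context, so the move-to-front output contains many small values; that is the intuition behind bounding the compressed size in terms of the kth-order empirical entropy.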


