|
ABSTRACT
Over the last decade, arithmetic coding has emerged as an important compression tool. It is now the method of choice for adaptive coding on myltisymbol alphabets because of its speed, low storage requirements, and effectiveness of compression. This article describes a new implementation of arithmetic coding that incorporates several improvements over a widely used earlier version by Witten, Neal, and Cleary, which has become a de facto standard. These improvements include fewer multiplicative operations, greatly extended range of alphabet sizes and symbol probabilities, and the use of low-precision arithmetic, permitting implementation by fast shift/add operations. We also describe a modular structure that separates the coding, modeling, and probability estimation components of a compression system. To motivate the improved coder, we consider the needs of a word-based text compression program. We report a range of experimental results using this and other models. Complete source code is available.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
|
 |
2
|
|
| |
3
|
BOOKSTEIN, A. AND KLEIN, S. T. 1993. Is Huffman coding dead?. Computing 50, 4, 279-296.
|
| |
4
|
BURROWS, W. AND WHEELER, D.J. 1994. A block-sorting lossless data compression algorithm. Tech. Rep. 124. Digital Equipment Corp., Maynard, MA.
|
| |
5
|
CAPOCELLI, R. M. AND DE SANTIS, A. 1991. New bounds on the redundancy of Huffman codes. IEEE Trans. Inf. Theor. IT-37, 1095-1104.
|
| |
6
|
CHEVION, D., KARNIN, E. D., AND WALACH, E. 1991. High efficiency, multiplication free approximation of arithmetic coding. In Proceedings of the 1991 IEEE Data Compression Conference, J. A. Storer and J. Reif, Eds. IEEE Computer Society Press, Los Alamitos, CA, 43-52.
|
| |
7
|
CLEARY, J. G. AND WITTEN, I. g. 1984. Data compression using adaptive coding and partial string matching. IEEE Trans. Commun. COM-32, 396-402.
|
| |
8
|
|
| |
9
|
|
| |
10
|
|
| |
11
|
|
| |
12
|
|
| |
13
|
GALLAGER, R. G. 1978. Variations on a theme by Huffman. IEEE Trans. Inf. Theor. IT-24, 6 (Nov.), 668-674.
|
| |
14
|
GALLAGER, R. a. AND VAN VOORHIS, D. C. 1975. Optimal source codes for geometrically distributed integer alphabets. IEEE Trans. Inf. Theor. IT-21, 2 (Mar.), 228-230.
|
| |
15
|
GOLOMB, S.W. 1966. Run-length encodings. IEEE Trans. Inf. Theor. IT-12, 3 (July), 399-401.
|
 |
16
|
|
| |
17
|
|
| |
18
|
HORSPOOL, R. N. AND CORMACK, G.V. 1992. Constructing word-based text compression algorithms. In Proceedings of the 1992 IEEE Data Compression Conference, J. Storer and M. Cohn, Eds. IEEE Computer Society Press, Los Alamitos, CA, 62-71.
|
| |
19
|
|
| |
20
|
HOWARD, P. G. AND VITTER, J. S. 1994. Arithmetic coding for data compression. Proc. IEEE 82, 6, 857-865.
|
| |
21
|
HUFFMAN, D.A. 1952. A method for the construction of minimum-redundancy codes. Proc. Inst. Radio Eng. 40, 9 (Sept.), 1098-1101.
|
| |
22
|
JIANG, J. 1995. Novel design of arithmetic coding for data compression. IEE Proc. Comput. Dig. Tech. 142, 6 (Nov.), 419-424.
|
 |
23
|
|
| |
24
|
|
| |
25
|
KUHN, M. 1996. Implementation of JBIG (including a QM-coder). ftp://ftp.informatik. uni-erlangen, de/pub/doc/ISO/JBIG/jbigkit-0.9.tar.gz
|
| |
26
|
LANGDON, G. G. 1984. An introduction to arithmetic coding. IBM J. Res. Dev. 28, 2 (Mar.), 135-149.
|
| |
27
|
LANGDON, G. G. AND RISSANEN, J. 1981. Compresson of black-and-white images with arithmetic coding. IEEE Trans. Commun. COM-29, 858-867.
|
| |
28
|
MANSTETTEN, D. 1992. Tight upper bounds on the redundancy of Huffman codes. IEEE Trans. Inf. Theor. 38, 1 (Jan.), 144-151.
|
| |
29
|
|
| |
30
|
MOFFAT, A. 1990a. Implementing the PPM data compression scheme. IEEE Trans. Commun. 38, 11 (Nov.), 1917-1921.
|
| |
31
|
MOFFAT, A. 1990b. Linear time adaptive arithmetic coding. IEEE Trans. Inf. Theor. 36, 2 (Mar.), 401-406.
|
| |
32
|
MOFFAT, A. 1997. Critique of "Novel design of arithmetic coding for data compression". IEE Proc. Comput. Digit. Tech.
|
| |
33
|
MOFFAT, A. AND TURPIN, A. 1997. On the implementaiton of minimum-redundancy prefix codes. IEEE Trans. Commun. 45, 10 (Oct.), 1200-1207.
|
| |
34
|
|
| |
35
|
|
| |
36
|
RISSANEN, J. 1976. Generalised Kraft inequality and arithmetic coding. IBM J. Res. Dev. 20, 198-203.
|
| |
37
|
RISSANEN, J. AND LANGDON, G. G. 1979. Arithmetic coding. IBM J. Res. Dev. 23, 2 (Mar.), 149-162.
|
| |
38
|
RISSANEN, J. AND LANGDON, G. G. 1981. Universal modeling and coding. IEEE Trans. Inf. Theor. IT-27, 1 (Jan.), 12-23.
|
| |
39
|
RISSANEN, J. AND MOHIUDDIN, K. M. 1989. A multiplication-free multialphabet arithmetic code. IEEE Trans. Comm. 37, 2 (Feb.), 93-98.
|
| |
40
|
RUBIN, F. 1979. Arithmetic stream coding using fixed precision registers. IEEE Trans. Inf. Theor. IT-25, 6 (Nov.), 672-675.
|
| |
41
|
|
| |
42
|
SHANNON, C. E. 1948. A mathematical theory of communication. Bell Syst. Tech. J. 27, 79-423.
|
 |
43
|
|
| |
44
|
WITTEN, I. g. AND BELL, T. C. 1991. The zero frequency problem: Estimating the probabilities of novel events in adaptive text compression. IEEE Trans. Inf. Theor. 37, 4 (July), 1085-1094.
|
| |
45
|
|
 |
46
|
|
| |
47
|
WITTEN, I. g., NEAL, R. M., AND CLEARY, J. a. 1988. Authors' response to "Compress and Compact discussed further". Commun. ACM 31, 9 (Sept.), 1140-1145.
|
| |
48
|
ZIv, J. AND LEMPEL, A. 1977. A universal algorithm for sequential data compression. IEEE Trans. Inf. Theor. IT-23, 3, 337-343.
|
CITED BY 30
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Martin Bertram , Daniel E. Laney , Mark A. Duchaineau , Charles D. Hansen , Bernd Hamann , Kenneth I. Joy, Wavelet representation of contour sets, Proceedings of the conference on Visualization '01, October 21-26, 2001, San Diego, California
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
M. Martina , G. Masera , A. Molino , F. Vacca , L. Sterpone , M. Violante, A new approach to compress the configuration information of programmable devices, Proceedings of the conference on Design, automation and test in Europe: Designers' forum, March 06-10, 2006, Munich, Germany
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Christopher J. Augeri , Barry E. Mullins , Dursun A. Bulutoglu , Rusty O. Baldwin , Leemon C. Baird, III, An analysis of XML binary formats and compression, Experimental computer science on Experimental computer science, p.6-6, June 13-14, 2007, San Diego
|
|
|
|
|
|
|
|
|
Martin Bertram , Mark A. Duchaineau , Bernd Hamann , Kenneth I. Joy, Bicubic subdivision-surface wavelets for large-scale isosurface representation and visualization, Proceedings of the conference on Visualization '00, p.389-396, October 2000, Salt Lake City, Utah, United States
|
|
|
Christopher J. Augeri , Dursun A. Bulutoglu , Barry E. Mullins , Rusty O. Baldwin , Leemon C. Baird, III, An analysis of XML compression efficiency, Proceedings of the 2007 workshop on Experimental computer science, p.7-es, June 13-14, 2007, San Diego, California
|
|
|
|
|
|
|
|
|
|
|
|
|
|