|
ABSTRACT
In recent research we proposed GP-zip, a system which uses evolution to find optimal ways to combine standard compression algorithms for the purpose of maximally losslessly compressing files and archives. The system divides files into blocks of predefined length. It then uses a linear, fixed-length representation where each primitive indicates what compression algorithm to use for a specific data block. GP-zip worked well with heterogonous data sets, providing significant improvements in compression ratio compared to some of the best standard compression algorithms. In this paper we propose a substantial improvement, called GP-zip*, which uses a new representation and intelligent crossover and mutation operators such that blocks of different sizes can be evolved. Like GP-zip, GP-zip* finds what the best compression technique to use for each block is. The compression algorithms available in the primitive set of GP-zip* are: Arithmetic coding (AC), Lempel-Ziv-Welch (LZW), Unbounded Prediction by Partial Matching (PPMD), Run Length Encoding (RLE), and Boolean Minimization. In addition, two transformation techniques are available: the Burrows-Wheeler Transformation (BWT) and Move to Front (MTF). Results show that GP-zip* provides improvements in compression ratio ranging from a fraction to several tens of percent over its predecessor.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
|
 |
2
|
|
| |
3
|
|
| |
4
|
|
| |
5
|
Peter Nordin and Wolfgang Banzhaf. Programmatic compression of images and sound. In John R. Koza, David E. Goldberg, David B. Fogel, and Rick L. Riolo, editors, Genetic Programming 1996: Proceedings of the First Annual Conference, pages 345--350, Stanford University, CA, USA, 28-31 July 1996. MIT Press.
|
| |
6
|
Evelyne Lutton, Jacques Levy-Vehel, Guillaume Cretin, Philippe Glevarec, and Cidric Roll. Mixed IFS: Resolution of the inverse problem using genetic programming. Complex Systems, 9:375--398, 1995.
|
| |
7
|
Evelyne Lutton, Jacques Levy-Vehel, Guillaume Cretin, Philippe Glevarec, and Cidric Roll. Mixed IFS: Resolution of the inverse problem using genetic programming. Research Report No 2631, Inria, 1995.
|
| |
8
|
|
| |
9
|
Andreas Klappenecker and Frank U. May. Evolving better wavelet compression schemes. In Andrew F. Laine, Michael A. Unser, and Mladen V. Wickerhauser, editors, Wavelet Applications in Signal and Image Processing III, volume 2569, San Diego, CA, USA, 9-14 July 1995. SPIE.
|
| |
10
|
Alex Fukunaga and Andre Stechert. Evolving nonlinear predictive models for lossless image compression with genetic programming. In John R. Koza, Wolfgang Banzhaf, Kumar Chellapilla, Kalyanmoy Deb, Marco Dorigo, David B. Fogel, Max H. Garzon, David E. Goldberg, Hitoshi Iba, and Rick Riolo, editors, Genetic Programming 1998: Proceedings of the Third Annual Conference, pages 95--102, University of Wisconsin, Madison, Wisconsin, USA, 22-25 July 1998. Morgan Kaufmann.
|
| |
11
|
|
| |
12
|
|
| |
13
|
Thomas Krantz, Oscar Lindberg, Gunnar Thorburn, and Peter Nordin. Programmatic compression of natural video. In Erick Cantú-Paz, editor, Late Breaking Papers at the Genetic and Evolutionary Computation Conference (GECCO-2002), pages 301--307, New York, NY, July 2002. AAAI.
|
| |
14
|
|
 |
15
|
|
| |
16
|
J. Ziv and A. Lempel, Compression of Individual Sequences via Variable--Rate Coding, IEEE Transactions on Information Theory, September 1978.
|
| |
17
|
|
| |
18
|
S. W. Golomb, Run-length encodings, IEEE Trans. Inform. Theory, Vol. IT-12, pp. 399--401, 1966.
|
| |
19
|
A. Kattan, Universal Lossless Data Compression with built in Encryption. Master Thesis, University of Essex 2006.
|
| |
20
|
M. Burrows and D. J. Wheeler, A block-sorting lossless data compression algorithm, SRC, Number 124, 1994.
|
| |
21
|
|
| |
22
|
ACT Archive Compression Test {cited 2 December 2007} Available from: http://compression.ca/act/act-win.html
|
INDEX TERMS
Primary Classification:
I.
Computing Methodologies
I.2
ARTIFICIAL INTELLIGENCE
I.2.2
Automatic Programming
Subjects:
Program synthesis
General Terms:
Algorithms,
Performance,
Reliability
Keywords:
AC,
GP-zip,
GP-zip*,
LZW,
MTF,
PPMD,
RLE,
boolean minimization. BWT,
lossless data compression
|