XMill: an efficient compressor for XML data
Source: International Conference on Management of Data
Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data
Dallas, Texas, United States
Pages: 153-164
Year of Publication: 2000
ISBN: 1-58113-217-4
Authors
Hartmut Liefke  Univ. of Pennsylvania
Dan Suciu  AT&T Labs
Sponsor
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/342009.335405

ABSTRACT

We describe a tool for compressing XML data, with applications in data exchange and archiving, which usually achieves about twice the compression ratio of gzip at roughly the same speed. The compressor, called XMill, incorporates and combines existing compressors in order to apply them to heterogeneous XML data: it uses zlib (the library function for gzip), a collection of datatype-specific compressors for simple data types, and, possibly, user-defined compressors for application-specific data types.
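
The idea of combining zlib with datatype-specific compressors can be illustrated with a minimal Python sketch. This is not XMill's implementation. It assumes, purely for illustration, that text values are grouped into one container per element tag, that a hypothetical `year` tag holds integers, and that all other values are treated as plain text. The names `ENCODERS`, `encode_int`, `encode_text`, and `compress_by_container` are invented here. The sketch also discards the document structure, so it shows only the routing of values to compressors, not a reversible format.

```python
import zlib
import xml.etree.ElementTree as ET
from collections import defaultdict


def encode_int(text: str) -> bytes:
    """Hypothetical datatype-specific encoder: decimal text -> 4-byte signed integer."""
    return int(text).to_bytes(4, "big", signed=True)


def encode_text(text: str) -> bytes:
    """Fallback encoder: UTF-8 bytes followed by a NUL separator."""
    return text.encode("utf-8") + b"\0"


# Assumed mapping from element tag to a datatype-specific encoder.
# Tags not listed here fall back to encode_text.
ENCODERS = {"year": encode_int}


def compress_by_container(xml_text: str) -> dict:
    """Group element text by tag, encode each value, then zlib-compress each group."""
    root = ET.fromstring(xml_text)
    containers = defaultdict(bytearray)
    for elem in root.iter():
        value = (elem.text or "").strip()
        if value:
            encoder = ENCODERS.get(elem.tag, encode_text)
            containers[elem.tag] += encoder(value)
    # zlib is the general-purpose back end applied to every container.
    return {tag: zlib.compress(bytes(data)) for tag, data in containers.items()}


if __name__ == "__main__":
    sample = (
        "<bib>"
        "<paper><title>XMill</title><year>2000</year></paper>"
        "<paper><title>LZ77</title><year>1977</year></paper>"
        "</bib>"
    )
    for tag, blob in compress_by_container(sample).items():
        print(tag, len(blob), "bytes")
```

Keeping similar values together gives the general-purpose compressor more homogeneous input, and a type-aware encoder can shrink values before zlib ever sees them. Whether this helps on a given dataset would have to be measured.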


