XPRESS: a queriable compression for XML data
Source: International Conference on Management of Data
Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data
San Diego, California
Session: XML indexing and compression
Pages: 122 - 133  
Year of Publication: 2003
ISBN:1-58113-634-X
Authors
Jun-Ki Min  Korea Advanced Institute of Science and Technology (KAIST), Taejon, Korea
Myung-Jae Park  Korea Advanced Institute of Science and Technology (KAIST), Taejon, Korea
Chin-Wan Chung  Korea Advanced Institute of Science and Technology (KAIST), Taejon, Korea
Sponsor
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/872757.872775

ABSTRACT

Like HTML documents, many XML documents reside on native file systems. Because XML data is irregular and verbose, it wastes disk space and network bandwidth. To overcome this verbosity, compressors for XML data have been studied. However, some XML compressors do not support querying compressed data, while others that do support it blindly encode tags and data values using predefined encoding methods. As a result, query performance on compressed XML data is degraded.

In this paper, we propose XPRESS, an XML compressor that supports direct and efficient evaluation of queries on compressed XML data. XPRESS adopts a novel encoding method, called reverse arithmetic encoding, which is designed for encoding the label paths of XML data, and applies different encoding methods depending on the types of data values. Experimental results with real-life data sets show that XPRESS achieves significant improvements in query performance on compressed XML data while maintaining reasonable compression ratios. On average, the query performance of XPRESS is 2.83 times better than that of an existing XML compressor, and the compression ratio of XPRESS is 73%.
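
The abstract names reverse arithmetic encoding but does not describe its mechanics. The Python sketch below illustrates one interval-based scheme in that spirit: each tag owns a sub-interval of [0, 1), and a label path is mapped to an interval nested inside the interval of its own suffix, so a suffix query such as //author/name can be tested by interval containment. The function names, the sample tag frequencies, and the exact interval arithmetic are assumptions made for illustration and are not taken from the abstract.

```python
from fractions import Fraction


def tag_intervals(freqs):
    """Partition [0, 1) into one sub-interval per tag, sized by relative frequency.

    `freqs` is a hypothetical tag -> occurrence-count table.
    """
    total = sum(freqs.values())
    intervals, low = {}, Fraction(0)
    for tag in sorted(freqs):
        high = low + Fraction(freqs[tag], total)
        intervals[tag] = (low, high)
        low = high
    return intervals


def encode_path(path, intervals):
    """Map a label path (root first) to an interval [low, high).

    Each step rescales the interval built so far into the current tag's
    interval, so the result always lies inside the last tag's interval.
    """
    low, high = Fraction(0), Fraction(1)
    for tag in path:
        t_low, t_high = intervals[tag]
        width = t_high - t_low
        low, high = t_low + width * low, t_low + width * high
    return low, high


def contained_in(path_interval, query_interval):
    """True if the path's interval lies inside the query's interval."""
    p_low, p_high = path_interval
    q_low, q_high = query_interval
    return q_low <= p_low and p_high <= q_high


# Assumed sample frequencies, not measurements from the paper.
intervals = tag_intervals({"book": 2, "author": 3, "name": 5})
query = encode_path(["author", "name"], intervals)      # stands for //author/name
hit = encode_path(["book", "author", "name"], intervals)
miss = encode_path(["book", "name"], intervals)
```

Under these assumptions, `contained_in(hit, query)` holds because /book/author/name ends in author/name, while `contained_in(miss, query)` does not. Exact fractions are used to keep the sketch free of rounding issues; a practical encoder would need a fixed-precision representation.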


