Supporting efficient query processing on compressed XML files
Source: Symposium on Applied Computing
Proceedings of the 2005 ACM Symposium on Applied Computing
Santa Fe, New Mexico
Session: Database theory, technology and applications (DTTA)
Pages: 660-665
Year of Publication: 2005
ISBN: 1-58113-964-0
Authors
Yongjing Lin, University of Texas at Dallas, Richardson, TX
Youtao Zhang, University of Texas at Dallas, Richardson, TX
Quanzhong Li, IBM Almaden Research Center, San Jose, CA
Jun Yang, University of California at Riverside, Riverside, CA
Sponsor
SIGAPP: ACM Special Interest Group on Applied Computing
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1066677.1066827
ABSTRACT

XML has been widely accepted as the de facto format for data representation and exchange. However, it is also known for the excessive redundancy of its representation. While various compression schemes have been proposed, and some of them support query processing over compressed files, they usually cannot avoid partial (or full) data decompression, which is expensive and in some cases may dominate the query processing time.

In this paper, we propose a new XML compression scheme based on the Sequitur compression algorithm. By organizing the compression result as a set of context-free grammar rules, the scheme supports efficient processing of XPath queries without decompression. The experimental results show that this scheme achieves a compression ratio comparable to that of gzip, while its query processing time is among the best of existing algorithms.
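To make the idea of querying a grammar-compressed document more concrete, here is a minimal sketch. It is not the scheme proposed in the paper and it is not Sequitur itself: it assumes a simpler offline variant that repeatedly replaces the most frequent adjacent pair of symbols with a new rule, and it answers only a toy query (how often a given tag occurs) rather than XPath. The names `build_grammar`, `count_terminal`, and `expand` are hypothetical and chosen for illustration only.

```python
from collections import Counter
from functools import lru_cache


def build_grammar(tokens):
    """Compress a token list into (start sequence, rules).

    Terminals are strings; nonterminals are ints, so the two never collide.
    Each rule maps a nonterminal to a pair of symbols (simplified pair
    replacement, assumed here in place of Sequitur).
    """
    seq = list(tokens)
    rules = {}
    next_id = 0
    while True:
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, freq = pairs.most_common(1)[0]
        if freq < 2:
            break  # no repeated pair left, nothing more to factor out
        rules[next_id] = pair
        # Rewrite the sequence left to right, replacing non-overlapping matches.
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(next_id)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
        next_id += 1
    return seq, rules


def count_terminal(start, rules, terminal):
    """Count occurrences of a terminal without expanding the grammar.

    Each rule is evaluated once and its count is reused wherever it appears.
    """
    @lru_cache(maxsize=None)
    def count(sym):
        if isinstance(sym, int):
            left, right = rules[sym]
            return count(left) + count(right)
        return 1 if sym == terminal else 0

    return sum(count(sym) for sym in start)


def expand(start, rules):
    """Fully decompress the grammar; used here only to check correctness."""
    out, stack = [], list(reversed(start))
    while stack:
        sym = stack.pop()
        if isinstance(sym, int):
            left, right = rules[sym]
            stack.extend((right, left))
        else:
            out.append(sym)
    return out


if __name__ == "__main__":
    book = ["<book>", "<title>", "T", "</title>", "</book>"]
    doc = ["<lib>"] + book * 3 + ["</lib>"]
    start, rules = build_grammar(doc)
    print(len(doc), "tokens ->", len(start), "start symbols,", len(rules), "rules")
    print("<title> occurrences:", count_terminal(start, rules, "<title>"))
```

The point of the sketch is that `count_terminal` touches each grammar rule once, so repeated substructure is processed once no matter how many times it occurs in the document, whereas `expand` has to materialize the whole token stream.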


