ACM Home Page
Please provide us with feedback. Feedback
ROX: run-time optimization of XQueries
Full text PdfPdf (660 KB)
Source
International Conference on Management of Data archive
Proceedings of the 35th SIGMOD international conference on Management of data table of contents
Providence, Rhode Island, USA
SESSION: Research session 16: query processing on semi-structured data table of contents
Pages 615-626  
Year of Publication: 2009
ISBN:978-1-60558-551-2
Authors
Riham Abdel Kader  University of Twente, Enschede, Netherlands
Peter Boncz  CWI, Amsterdam, Netherlands
Stefan Manegold  CWI, Amsterdam, Netherlands
Maurice van Keulen  University of Twente, Enschede, Netherlands
Sponsors
ACM: Association for Computing Machinery
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 32,   Downloads (12 Months): 142,   Citation Count: 0
Additional Information:

abstract   references   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1559845.1559910
What is a DOI?

ABSTRACT

Optimization of complex XQueries combining many XPath steps and joins is currently hindered by the absence of good cardinality estimation and cost models for XQuery. Additionally, the state-of-the-art of even relational query optimization still struggles to cope with cost model estimation errors that increase with plan size, as well as with the effect of correlated joins and selections.

In this research, we propose to radically depart from the traditional path of separating the query compilation and query execution phases, by having the optimizer execute, materialize partial results, and use sampling based estimation techniques to observe the characteristics of intermediates. The proposed technique takes as input a Join Graph where the edges are either equi-joins or XPath steps, and the execution environment provides value- and structural-join algorithms, as well as structural and value-based indices.

While run-time optimization with sampling removes many of the vulnerabilities of classical optimizers, it brings its own challenges with respect to keeping resource usage under control, both with respect to the materialization of intermediates, as well as the cost of plan exploration using sampling. Our approach deals with these issues by limiting the run-time search space to so-called "zero-investment algorithms for which sampling can be guaranteed to be strictly linear in sample size. All operators and XML value indices used by ROX for sampling have the zero-investment property.

We perform extensive experimental evaluation on large XML datasets that shows that our run-time query optimizer finds good query plans in a robust fashion and has limited run-time overhead.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
2
3
4
5
6
7
 
8
9
10
 
11
 
12
 
13
D. Fisher and S. Maneth. Structural Selectivity Estimation for XML Documents. In ICDE, 2007.
14
 
15
16
 
17
T. Grust. Purely Relational FLWORs. In XIME-P, 2005
 
18
T. Grust, M. Mayr, and J. Rittinger. XQuery Join Graph Isolation. In ICDE, 2009. arXiv:0810.4809.
19
 
20
 
21
 
22
J. Hidders, P. Michiels, J. Siméon, and R. Vercammen. How To Recognize Different Kinds of Tree Patterns from Quite a Long Way Away. In PLAN-X, 2007.
23
24
25
 
26
F. Olken and D. Rotem. Random Sampling from Databases -A Survey. Statistics and Computing, 5:25--42, 1995.
27
28
 
29
 
30
 
31

Collaborative Colleagues:
Riham Abdel Kader: colleagues
Peter Boncz: colleagues
Stefan Manegold: colleagues
Maurice van Keulen: colleagues