A sampling approach for XML query selectivity estimation
Source
Extending Database Technology; Vol. 360
Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology
Saint Petersburg, Russia
SESSION: Research sessions: XML, XPath, XQuery
Pages 335-344
Year of Publication: 2009
ISBN: 978-1-60558-422-5
Authors
Cheng Luo, Coppin State University, Baltimore, MD
Zhewei Jiang, Southern Illinois University, Carbondale, IL
Wen-Chi Hou, Southern Illinois University, Carbondale, IL
Feng Yu, Southern Illinois University, Carbondale, IL
Qiang Zhu, University of Michigan-Dearborn, Dearborn, MI
ABSTRACT
As the Extensible Markup Language (XML) rapidly establishes itself as the de facto standard for presenting, storing, and exchanging data on the Internet, large volumes of XML data and their supporting facilities are starting to surface. A fast and accurate selectivity estimation mechanism is of practical importance because selectivity estimation plays a fundamental role in XML query optimization. Recently proposed techniques are all based on some form of structure synopsis, which can be time-consuming to build and ineffective at summarizing complex structural relationships. In this research, we propose an innovative sampling method that captures the tree structures and intricate relationships among nodes in a simple and effective way. The derived sample tree is stored as a synopsis for selectivity estimation. Extensive experimental results show that, in comparison with the state-of-the-art structure synopses, specifically the TreeSketch and Xseed synopses, our sample tree synopsis applies to a broader range of query types, requires several orders of magnitude less construction time, and generates estimates with considerably better precision for complex datasets.