|
ABSTRACT
In this paper, we ask if the traditional relational query acceleration techniques of summary tables and covering indexes have analogs for branching path expression queries over tree- or graph-structured XML data. Our answer is yes --- the forward-and-backward index already proposed in the literature can be viewed as a structure analogous to a summary table or covering index. We also show that it is the smallest such index that covers all branching path expression queries. While this index is very general, our experiments show that it can be so large in practice as to offer little performance improvement over evaluating queries directly on the data. Likening the forward-and-backward index to a covering index on all the attributes of several tables, we devise an index definition scheme to restrict the class of branching path expressions being indexed. The resulting index structures are dramatically smaller and perform better than the full forward-and-backward index for these classes of branching path expressions. This is roughly analogous to the situation in multidimensional or OLAP workloads, in which more highly aggregated summary tables can service a smaller subset of queries but can do so at increased performance. We evaluate the performance of our indexes on both relational decompositions of XML and a native storage technique. As expected, the performance benefit of an index is maximized when the query matches the index definition.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
Xmark: The xml benchmark project. http://monetdb.cwi.nl/xml/index.html.
|
| |
2
|
|
| |
3
|
J. Clark and S. DeRose. XML path language (XPath) 1.0. W3C recommendation. World Wide Web Consortium, http://www.w3.org/TR/xpath, Nov. 1999.
|
| |
4
|
|
| |
5
|
S. DeRose, E. Maler, and D. Orchard. The XLink standard. World Wide Web Consortium, http ://www.w3.org/TR/xquery, Nov. 1999.
|
| |
6
|
|
| |
7
|
R. Goldman and J. Widom. Approximate DataGuides. In Proc. of the Workshop on Query Processing for Semistructured Data and Non-Standard Data Formats, pages 436-445, January 1999.
|
 |
8
|
|
| |
9
|
|
| |
10
|
R. Milner. A Calculus for Communicating Processes, volume 92 of Lecture Notes in Computer Science. Springer Verlag, 1980.
|
| |
11
|
|
| |
12
|
|
| |
13
|
|
| |
14
|
|
| |
15
|
Open Directory Project. DMOZ open directory project. http://www.dmoz.org.
|
| |
16
|
F. Rizzolo and A. Mendelzon. Indexing XML data with ToXin. In Proc. of WebDB 2001, 2001.
|
| |
17
|
R. Kaushik, P. Shenoy, P. Bohannon, and E. Gudes. Exploiting local similarity for efficient indexing of paths in graph structured data. In Proceedings of ICDE, 2002.
|
| |
18
|
Jayavel Shanmugasundaram , Kristin Tufte , Chun Zhang , Gang He , David J. DeWitt , Jeffrey F. Naughton, Relational Databases for Querying XML Documents: Limitations and Opportunities, Proceedings of the 25th International Conference on Very Large Data Bases, p.302-314, September 07-10, 1999
|
 |
19
|
Elizabeth Shriver , Arif Merchant , John Wilkes, An analytic behavior model for disk drives with readahead caches and request reordering, Proceedings of the 1998 ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems, p.182-191, June 22-26, 1998, Madison, Wisconsin, United States
|
 |
20
|
|
CITED BY 64
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Zhiyuan Chen , Chen Li , Jian Pei , Yufei Tao , Haixun Wang , Wei Wang , Jiong Yang , Jun Yang , Donghui Zhang, Recent progress on selected topics in database research: a report by nine young Chinese researchers working in the United States, Journal of Computer Science and Technology, v.18 n.5, p.538-552, September 2003
|
|
|
|
|
|
|
|
|
|
|
|
R. J. Bayardo , D. Gruhl , V. Josifovski , J. Myllymaki, An evaluation of binary xml encoding optimizations for fast stream based xml processing, Proceedings of the 13th international conference on World Wide Web, May 17-20, 2004, New York, NY, USA
|
|
|
|
|
|
Beverly Yang , Marcus Fontoura , Eugene Shekita , Sridhar Rajagopalan , Kevin Beyer, Virtual cursors for XML joins, Proceedings of the thirteenth ACM international conference on Information and knowledge management, November 08-13, 2004, Washington, D.C., USA
|
|
|
|
|
|
|
|
|
Wei Wang , Haifeng Jiang , Hongzhi Wang , Xuemin Lin , Hongjun Lu , Jianzhong Li, Efficient processing of XML path queries using the disk-based F&B Index, Proceedings of the 31st international conference on Very large data bases, August 30-September 02, 2005, Trondheim, Norway
|
|
|
|
|
|
Kevin Beyer , Roberta J. Cochrane , Vanja Josifovski , Jim Kleewein , George Lapis , Guy Lohman , Bob Lyle , Fatma Özcan , Hamid Pirahesh , Normen Seemann , Tuong Truong , Bert Van der Linden , Brian Vickery , Chun Zhang, System RX: one part relational, one part XML, Proceedings of the 2005 ACM SIGMOD international conference on Management of data, June 14-16, 2005, Baltimore, Maryland
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
K. Beyer , R. Cochrane , M. Hvizdos , V. Josifovski , J. Kleewein , G. Lapis , G. Lohman , R. Lyle , M. Nicola , F. Özcan , H. Pirahesh , N. Seemann , A. Singh , T. Truong , R. C. Van der Linden , B. Vickery , C. Zhang , G. Zhang, DB2 goes hybrid: integratng native XML and XQuery with relational data and SQL, IBM Systems Journal, v.45 n.2, p.271-298, January 2006
|
|
|
|
|
|
|
|
|
Zhiyuan Chen , Johannes Gehrke , Flip Korn , Nick Koudas , Jayavel Shanmugasundaram , Divesh Srivastava, Index structures for matching XML twigs using relational query processors, Data & Knowledge Engineering, v.60 n.2, p.283-302, February, 2007
|
|
|
|
|
|
|
|
|
Dimitri Theodoratos , Stefanos Souldatos , Theodore Dalamagas , Pawel Placek , Timos Sellis, Heuristic containment check of partial tree-pattern queries in the presence of index graphs, Proceedings of the 15th ACM international conference on Information and knowledge management, November 06-11, 2006, Arlington, Virginia, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Andrey Balmin , Fatma Özcan , Kevin S. Beyer , Roberta J. Cochrane , Hamid Pirahesh, A framework for using materialized XPath views in XML query processing, Proceedings of the Thirtieth international conference on Very large data bases, p.60-71, August 31-September 03, 2004, Toronto, Canada
|
|
|
|
|
|
|
|
|
Raghav Kaushik , Philip Bohannon , Jeffrey F. Naughton , Pradeep Shenoy, Updates for structure indexes, Proceedings of the 28th international conference on Very Large Data Bases, p.239-250, August 20-23, 2002, Hong Kong, China
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
K. Selçuk Candan , Mehmet E. Dönderler , Yan Qi , Jaikannan Ramamoorthy , Jong W. Kim, FMware: middleware for efficient filtering and matching of XML messages with local data, Proceedings of the ACM/IFIP/USENIX 2006 International Conference on Middleware, November 01-01, 2006, Melbourne, Australia
|
|
|
|
|
|
|
|
|
|
|
|
George H. L. Fletcher , Dirk Van Gucht , Yuqing Wu , Marc Gyssens , Sofía Brenes , Jan Paredaens, A methodology for coupling fragments of XPath with structural indexes for XML documents, Information Systems, v.34 n.7, p.657-670, November, 2009
|
|
|
|
|