|
ABSTRACT
XML is widely praised for its flexibility in allowing repeated and missing sub-elements. However, this flexibility makes it challenging to develop a bulk algebra, which typically manipulates sets of objects with identical structure. A set of XML elements, say of type book, may have members that vary greatly in structure, e.g. in the number of author sub-elements. This kind of heterogeneity may permeate the entire document in a recursive fashion: e.g., different authors of the same or different book may in turn greatly vary in structure. Even when the document conforms to a schema, the flexible nature of schemas for XML still allows such significant variations in structure among elements in a collection. Bulk processing of such heterogeneous sets is problematic.In this paper, we introduce the notion of logical classes (LC) of pattern tree nodes, and generalize the notion of pattern tree matching to handle node logical classes. This abstraction pays off significantly in allowing us to reason with an inherently heterogeneous collection of elements in a uniform, homogeneous way. Based on this, we define a Tree Logical Class (TLC) algebra that is capable of handling the heterogeneity arising in XML query processing, while avoiding redundant work. We present an algorithm to obtain a TLC algebra expression from an XQuery statement (for a large fragment of XQuery). We show how to implement the TLC algebra efficiently, introducing the nest-join as an important physical operator for XML query processing. We show that evaluation plans generated using the TLC algebra not only are simpler but also perform better than those generated by competing approaches. TLC is the algebra used in the Timber [8] system developed at the University of Michigan.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
|
| |
2
|
S. Boag, D. Chamberlin, M. F. Fernandez, D. Florescu, J. Robie, and J. Simeon. XQuery 1.0: An XML query languge. Working Draft. http://www.w3.org/TR/xquery.
|
 |
3
|
|
| |
4
|
Z. Chen, H. V. Jagadish, L. V. S. Lakshmanan, and S. Paparizos. From tree patterns to generalized tree patterns: On efficient evaluation of XQuery. In Proc. VLDB Conf., Sep. 2003.
|
 |
5
|
Chun Zhang , Jeffrey Naughton , David DeWitt , Qiong Luo , Guy Lohman, On supporting containment queries in relational database management systems, Proceedings of the 2001 ACM SIGMOD international conference on Management of data, p.425-436, May 21-24, 2001, Santa Barbara, California, United States
|
 |
6
|
|
| |
7
|
D. Florescu and D. Kossman. Storing and querying XML data using an RDMBS. IEEE Data Eng. Bull., 22(3), 1999.
|
| |
8
|
H. V. Jagadish , S. Al-Khalifa , A. Chapman , L. V. S. Lakshmanan , A. Nierman , S. Paparizos , J. M. Patel , D. Srivastava , N. Wiwatwattana , Y. Wu , C. Yu, TIMBER: A native XML database, The VLDB Journal — The International Journal on Very Large Data Bases, v.11 n.4, p.274-291, December 2002
[doi> 10.1007/s00778-002-0081-x]
|
| |
9
|
|
| |
10
|
|
| |
11
|
U. of Michigan. The Timber project. http://www.eecs.umich.edu/db/timber.
|
| |
12
|
U. of Wisconsin. The Niagara internet query system. http://www.cs.wisc.edu/niagara/.
|
| |
13
|
A. R. Schmidt, F. Waas, M. L. Kersten, M. J. Carey, I. Manolescu, and R. Busse. XMark: A benchmark for XML data management. In Proc. VLDB Conf., 2002.
|
| |
14
|
|
| |
15
|
Jayavel Shanmugasundaram , Kristin Tufte , Chun Zhang , Gang He , David J. DeWitt , Jeffrey F. Naughton, Relational Databases for Querying XML Documents: Limitations and Opportunities, Proceedings of the 25th International Conference on Very Large Data Bases, p.302-314, September 07-10, 1999
|
| |
16
|
J. Simeon and M. F. Fernandez. Galax, an open implementation of XQuery. http://db.bell-labs.com/galax/.
|
 |
17
|
Igor Tatarinov , Stratis D. Viglas , Kevin Beyer , Jayavel Shanmugasundaram , Eugene Shekita , Chun Zhang, Storing and querying ordered XML using a relational database system, Proceedings of the 2002 ACM SIGMOD international conference on Management of data, June 03-06, 2002, Madison, Wisconsin
[doi> 10.1145/564691.564715]
|
| |
18
|
S. D. Viglas, L. Galanis, D. J. DeWitt, D. Maier, and J. F. Naughtonn. Putting XML query algebras into context. http://www.cs.wisc.edu/niagara/.
|
| |
19
|
Y. Wu, J. M. Patel, and H. V. Jagadish. Structural join order selection for XML query optimization. In Proc. ICDE Conf., Mar. 2003.
|
| |
20
|
X-Hive Corp. X-Hive/DB native XML storage. http://www.x-hive.com/.
|
| |
21
|
XMark, an XML benchmark project. http://www.xml-benchmark.org/.
|
 |
22
|
Xin Zhang , Bradford Pielech , Elke A. Rundesnteiner, Honey, I shrunk the XQuery!: an XML algebra optimization approach, Proceedings of the 4th international workshop on Web information and data management, November 08-08, 2002, McLean, Virginia, USA
[doi> 10.1145/584931.584936]
|
CITED BY 15
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Markus Kirchberg , Faizal Riaz-ud-Din , Klaus-Dieter Schewe , Alexei Tretiakov, Using reflection for querying XML documents, Proceedings of the 17th Australasian Database Conference, p.119-128, January 16-19, 2006, Hobart, Australia
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Feng Shao , Lin Guo , Chavdar Botev , Anand Bhaskar , Muthiah Chettiar , Fan Yang , Jayavel Shanmugasundaram, Efficient keyword search over virtual XML views, The VLDB Journal — The International Journal on Very Large Data Bases, v.18 n.2, p.543-570, April 2009
|
|
|
|
|