ACM Home Page
Please provide us with feedback. Feedback
Efficient algorithms for evaluating xpath over streams
Full text PdfPdf (879 KB)
Source
International Conference on Management of Data archive
Proceedings of the 2007 ACM SIGMOD international conference on Management of data table of contents
Beijing, China
SESSION: Data stream management table of contents
Pages: 269 - 280  
Year of Publication: 2007
ISBN:978-1-59593-686-8
Authors
Gang Gou  North Carolina State University, Raleigh, NC
Rada Chirkova  North Carolina State University, Raleigh, NC
Sponsors
ACM: Association for Computing Machinery
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 19,   Downloads (12 Months): 157,   Citation Count: 4
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1247480.1247512
What is a DOI?

ABSTRACT

In this paper we address the problem of evaluating XPath queries over streaming XML data. We consider a practical XPath fragment called Univariate XPath, which includes the commonly used '/' and '//' axes and allows *-node tests and arbitrarily nested predicates. It is well known that this XPath fragment can be efficiently evaluated in O(|D||Q|) time in the non-streaming environment, where |D| is the document size and |Q| is the query size. However, this is not necessarily true in the streaming environment, since streaming algorithms have to satisfy stricter requirement than non-streaming algorithms, in that all data must be read sequentially in one pass. Therefore, it is not surprising that state-of-the-art stream-querying algorithms have higher time complexity than O(|D||Q|).

In this paper we revisit the XPath stream-querying problem, and show that Univariate XPath can be efficiently evaluated in O|D||Q|) time in the streaming environment. Specifically, we propose two O(|D||Q|)-time stream-querying algorithms, LQ and EQ, which are based on the lazy strategy and on the eager strategy, respectively. To the best of our knowledge, LQ and EQ are the first XPath stream-querying algorithms that achieve O(|D||Q|) time performance. Further, our algorithms achieve O(|D||Q|) time performance without trading off space performance. Instead, they have better buffering-space performance than state-of-the-art stream-querying algorithms. In particular, EQ achieves optimal buffering-space performance. Our experimental results show that our algorithms have not only good theoretical complexity but also considerable practical performance advantages over existing algorithms.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
S. Al--Khalifa, H. V. Jagadish, N. Koudas, J. M. Patel, D. Srivastava, and Y. Wu. Structural joins: a primitive for efficient XML query pattern matching. ICDE Conference, 2002.
 
2
Apache. Apache Xerces2-Java XML Parser. http://xerces.apache.org/xerces2-j/.
 
3
Z. Bar-Yossef, M. Fontoura, and V. Josifovski. On the memory requirements of XPath evaluation over XML streams. Proceedings of PODS, 2004.
4
 
5
C. Barton, P. Charles, D. Goyal, M. Raghavachari, M. Fontoura, and V. Josifovski. Streaming XPath processing with forward and backward axes. ICDE Conference, 2003.
 
6
S. Boag, D. Chamberlin, M. F. Ferandez, D. Florescu, J. Robie, and J. Simeon. XQuery 1.0: An XML Query Language. W3C, http://www.w3.org/TR/xquery/, 2003.
 
7
D. Brownell and D. Megginson. SAX: Simple API for XML. SAX Project Organization, http://www.saxproject.org/.
 
8
N. Bruno, N. Koudas, and D. Srivastava. Holistic twig joins: optimal XML pattern matching. SIGMOD, 2002.
 
9
C. Y. Chan, P. Felber, M. N. Garofalakis, and R. Rastogi. Efficient filtering of XML documents with XPat expressions. ICDE Conference, 2002.
 
10
Y. Chen, S. B. Davidson, and Y. Zheng. An efficient XPath query processor for XML streams. ICDE, 2006.
 
11
B. Choi. What are real DTDs like? WebDB Workshop, 2002.
 
12
J. Clark. XML Transformations (XSLT) Version 1.0. W3C, http://www.w3.org/TR/xslt/, 1999.
 
13
J. Clark and S. DeRose. XML Path Language (XPath) Version 1.0. W3C, http://www.w3.org/TR/xpath/, 1999.
14
 
15
A. L. Diaz and D. Lovell. IBM's XML generator. http://www.alphaworks.ibm.com/tech/xmlgenerator.
 
16
M. Fernandez and et al. Galax: an implementation of XQuery. http://www.galaxquery.org/.
 
17
D. Florescu, C. Hillery, D. Kossmann, P. Lucas, F. Riccardi, T. Westmann, M. J. Carey, A. Sundararajan,and G. Agrawal. The BEA/XQRL streaming XQuery processor. VLDB Conference, 2003.
 
18
G. Gottlob, C. Koch, and R. Pichler. Efficient algorithms for processing XPath queries. VLDB Conference, 2002.
19
20
21
 
22
 
23
M. Kay. Saxon: the XSLT and XQuery processor. http://saxon.sourceforge.net/.
 
24
C. Koch, S. Scherzinger, N. Schweikardt, and B. Stegmaier. Schema-based scheduling of event processors and buffer minimization for queries on structured data streams. VLDB Conference, 2004.
 
25
B. Ludascher, P. Mukhopadhyay, and Y. Papakonstantinou. A transducer-based XML query processor. VLDB, 2002.
 
26
F. Peng and S. S. Chawathe. XSQ: A streaming XPath engine. http://www.cs.umd.edu/projects/xsq/.
27
 
28
A. Schmidt and et al. XMark: an XML benchmark project. http://monetdb.cwi.nl/xml/.
29
 
30
D. Suciu. XML data repository. http://www.cs.washington.edu/research/xmldatasets/.
 
31
W3C. Section 1.2.2, XML query use cases. http://www.w3.org/TR/xquery-use-cases/, 2006.