ABSTRACT
We consider the problem of evaluating a large number of XPath filters, each with many predicates, on a stream of XML documents. The solution we propose is to lazily construct a single deterministic pushdown automaton, called the XPush Machine, from the given XPath filters. We describe a number of optimization techniques that make the lazy XPush Machine more efficient in both space and time. The combination of these optimizations results in high, sustained throughput. For example, if the total number of atomic predicates in the filters is up to 200000, then the throughput is at least 0.5 MB/sec; it increases to 4.5 MB/sec when each filter contains a single predicate.