|
ABSTRACT
There has been much recent interest in XML publish/subscribe systems. Some systems scale to thousands of concurrent queries, but support a limited query language (usually a fragment of XPath 1.0). Other systems support more expressive languages, but do not scale well with the number of concurrent queries. In this paper, we propose a set of novel query processing techniques, referred to as Massively Multi-Query Join Processing techniques, for processing a large number of XML stream queries involving value joins over multiple XML streams and documents. These techniques enable the sharing of representations of inputs to multiple joins, and the sharing of join computation. Our techniques are also applicable to relational event processing systems and publish/subscribe systems that support join queries. We present experimental results to demonstrate the effectiveness of our techniques. We are able to process thousands of XML messages with hundreds of thousands of join queries on real RSS feed streams. Our techniques gain more than two orders of magnitude speedup compared to the naive approach of evaluating such join queries.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
Xpath leashed. http://www-db-out.bell-labs.com/user/benedikt/papers/leashed.ps.gz.
|
| |
2
|
D. J. Abadi, Y. Ahmad, M. Balazinska, U. Çetintemel, M. Cherniack, J. H. Hwang, W. Lindner, A. Maskey, A. Rasin, E. Ryvkina, N. Tatbul, Y. Xing, and S. B. Zdonik. The design of the borealis stream processing engine. In Proc. CIDR, pages 277--289, 2005.
|
| |
3
|
S. Abiteboul, R. Hull, and V. Vianu. Foundations of Databases. 1995.
|
 |
4
|
Marcos K. Aguilera , Robert E. Strom , Daniel C. Sturman , Mark Astley , Tushar D. Chandra, Matching events in a content-based subscription system, Proceedings of the eighteenth annual ACM symposium on Principles of distributed computing, p.53-61, May 04-06, 1999, Atlanta, Georgia, United States
[doi> 10.1145/301308.301326]
|
| |
5
|
|
| |
6
|
C. Barton, P. Charles, M. Fontoura, V. Josifovski, D. Goyal, and M. Raghavachari. Streaming xpath processing with forward and backward axes. In Proc. ICDE, 2003.
|
| |
7
|
D. Carney, U. Çetintemel, M. Cherniack, C. Convey, S. Lee, G. Seidman, M. Stonebraker, N. Tatbul, and S. Zdonik. Monitoring streams - a new class of data management applications. In Proc. VLDB, 2002.
|
| |
8
|
|
| |
9
|
S. Chandrasekaran, O. Cooper, A. Deshpande, M. J. Franklin, J. M. Hellerstein, W. Hong, S. Krishnamurthy, S. R. Madden, V. Raman, F. Reiss, and M. A. Shah. TelegraphCQ: Continuous dataflow processing for an uncertain world. In Proc. CIDR, 2003.
|
| |
10
|
Y. Chen, S. Davidson, and Y. Zheng. An efficient xpath query processor for xml streams. In Proc. ICDE, 2006.
|
| |
11
|
Byron Choi. What are real dtds like. 2002.
|
| |
12
|
A. Demers, J. Gehrke, M. Hong, M. Riedewald, and W. White. Towards expressive publish/subscribe systems. In Proc. EDBT, 2006.
|
 |
13
|
|
 |
14
|
Françoise Fabret , H. Arno Jacobsen , François Llirbat , Joăo Pereira , Kenneth A. Ross , Dennis Shasha, Filtering algorithms and implementation for very fast publish/subscribe systems, Proceedings of the 2001 ACM SIGMOD international conference on Management of data, p.115-126, May 21-24, 2001, Santa Barbara, California, United States
|
| |
15
|
D. Florescu, C. Hillery, D. Kossmann, P. Lucas, F. Riccardi, T. Westmann, M. Carey, A. Sundararajan, and G. Agrawal. The bea/xqrl streaming xquery processor. In Proc. VLDB, 2003.
|
| |
16
|
X. Gong, W. Qian, Y. Yan, and A. Zhou. Bloom filter-based xml packets filtering for millions of path queries. In Proc. ICDE, 2005.
|
 |
17
|
|
| |
18
|
M. Hong, A. Demers, J. Gehrke, C. Koch, M. Riedewald, and W. White. Massively multi-query join processing in publish/subscribe systems. Technical report, Cornell University, 2007. http://techreports.library.cornell.edu.
|
| |
19
|
J. Kang, J. F. Naughton, and S. D. Viglas. Evaluating window joins over unbounded streams. In Proc. ICDE, 2003.
|
| |
20
|
C. Koch, S. Scherzinger, N. Schweikardt, and B. Stegmaier. Schema-based scheduling of event processors and buffer minimization for queries on structured data streams. In Proc. VLDB, pages 228--239, 2004.
|
| |
21
|
|
| |
22
|
B. Ludascher, P. Mukhopadhayn, and Y. Papakonstantinou. A transducer-based xml query processor. In Proc. VLDB, 2002.
|
| |
23
|
R. Motwani, J. Widom, A. Arasu, B. Babcock, S. Babu, M. Datar, G. S. Manku, C. Olston, J. Rosenstein, and R. Varma. Query processing, approximation, and resource management in a data stream management system. In Proc. CIDR, 2003.
|
| |
24
|
Ed Jr. Pegg. Graph minor.http://mathworld.wolfram.com/GraphMinor.html.
|
 |
25
|
|
 |
26
|
|
 |
27
|
|
| |
28
|
A. Yalamanchi, J. Srinivasan, and D. Gawlick. Managing expressions as data in relational database systems. In Proc. CIDR, 2003.
|
|