| Near-optimal algorithms for shared filter evaluation in data stream systems |
| Full text |
Pdf
(518 KB)
|
Source
|
International Conference on Management of Data
archive
Proceedings of the 2008 ACM SIGMOD international conference on Management of data
table of contents
Vancouver, Canada
SESSION: Research Session 4: Streaming Filters
table of contents
Pages 133-146
Year of Publication: 2008
ISBN:978-1-60558-102-6
|
|
Authors
|
|
Zhen Liu
|
IBM T. J. Watson Research Center, Hawthorne, NY, USA
|
|
Srinivasan Parthasarathy
|
IBM T. J. Watson Research Center, Hawthorne, NY, USA
|
|
Anand Ranganathan
|
IBM T. J. Watson Research Center, Hawthorne, NY, USA
|
|
Hao Yang
|
IBM T. J. Watson Research Center, Hawthorne, NY, USA
|
|
| Sponsors |
|
| Publisher |
|
| Bibliometrics |
Downloads (6 Weeks): 28, Downloads (12 Months): 267, Citation Count: 1
|
|
|
ABSTRACT
We consider the problem of evaluating multiple overlapping queries defined on data streams, where each query is a conjunction of multiple filters and each filter may be shared across multiple queries. Efficient support for overlapping queries is a critical issue in the emerging data stream systems, and this is particularly the case when filters are expensive in terms of their computational complexity and processing time. This problem generalizes other well-known problems such as pipelined filter ordering and set cover, and is not only NP-Hard but also hard to approximate within a factor of o(log n) from the optimum, where n is the number of queries. In this paper, we present two near-optimal approximation lgorithms with provably-good performance guarantees for the evaluation of overlapping queries. We present an edge-coverage based Greedy algorithm which achieves an approximation ratio of (1 + log(n) + log(α)), where n is the number of queries and α is the average number of filters in a query. We also present a randomized, fast and easily parallelizable Harmonic algorithm which achieves an approximation ratio of 2β, where β is the maximum number of filters in a query. We have implemented these algorithms in a prototype system, and evaluated their performance using extensive experiments in the context of multimedia stream analysis. The results show that our Greedy algorithm consistently outperforms other known algorithms under various settings and scales well as the numbers of queries and filters increase.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
 |
1
|
Marcos K. Aguilera , Robert E. Strom , Daniel C. Sturman , Mark Astley , Tushar D. Chandra, Matching events in a content-based subscription system, Proceedings of the eighteenth annual ACM symposium on Principles of distributed computing, p.53-61, May 04-06, 1999, Atlanta, Georgia, United States
[doi> 10.1145/301308.301326]
|
| |
2
|
|
 |
3
|
Shivnath Babu , Rajeev Motwani , Kamesh Munagala , Itaru Nishizawa , Jennifer Widom, Adaptive ordering of pipelined stream filters, Proceedings of the 2004 ACM SIGMOD international conference on Management of data, June 13-18, 2004, Paris, France
[doi> 10.1145/1007568.1007615]
|
 |
4
|
|
| |
5
|
|
 |
6
|
|
| |
7
|
|
 |
8
|
Jianjun Chen , David J. DeWitt , Feng Tian , Yuan Wang, NiagaraCQ: a scalable continuous query system for Internet databases, Proceedings of the 2000 ACM SIGMOD international conference on Management of data, p.379-390, May 15-18, 2000, Dallas, Texas, United States
|
| |
9
|
P. A. Chirita, S. Idreos, M. Koubarakis, and W. Nejdl. Publish/subscribe for RDF-based P2P networks. In Proceedings of the 1st European Semantic Web Symposium, pages 182--197, 2004.
|
| |
10
|
M. Cilia, C. Bornhoevd, and A. P. Buchmann. CREAM: An infrastructure for distributed, heterogeneous event-based applications. In Proceedings of International Conference on Cooperative Information Systems, pages 482--502, 2003.
|
 |
11
|
Anne Condon , Amol Deshpande , Lisa Hellerstein , Ning Wu, Flow algorithms for two pipelined filter ordering problems, Proceedings of the twenty-fifth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, June 26-28, 2006, Chicago, IL, USA
[doi> 10.1145/1142351.1142379]
|
| |
12
|
|
 |
13
|
Nilesh N. Dalvi , Sumit K. Sanghai , Prasan Roy , S. Sudarshan, Pipelining in multi-query optimization, Proceedings of the twentieth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, p.59-70, May 2001, Santa Barbara, California, United States
[doi> 10.1145/375551.375561]
|
| |
14
|
M. Goemans and J. Vondrak. Stochastic covering and adaptivity. In LATIN'06: Proceedings of the 7th Latin American Symposium on Theoretical Informatics, 2006.
|
 |
15
|
|
 |
16
|
|
 |
17
|
Hoshi Mistry , Prasan Roy , S. Sudarshan , Krithi Ramamritham, Materialized view selection and maintenance using multi-query optimization, Proceedings of the 2001 ACM SIGMOD international conference on Management of data, p.307-318, May 21-24, 2001, Santa Barbara, California, United States
|
 |
18
|
|
 |
19
|
Apostol! Natsev , Jelena Tešić , Lexing Xie , Rong Yan , John R. Smith, IBM multimedia search and retrieval system, Proceedings of the 6th ACM international conference on Image and video retrieval, p.645-645, July 09-11, 2007, Amsterdam, The Netherlands
[doi> 10.1145/1282280.1282373]
|
| |
20
|
|
| |
21
|
|
 |
22
|
|
| |
23
|
|
| |
24
|
|
|