Near-optimal algorithms for shared filter evaluation in data stream systems
Source
International Conference on Management of Data
Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data
Vancouver, Canada
SESSION: Research Session 4: Streaming Filters
Pages 133-146  
Year of Publication: 2008
ISBN:978-1-60558-102-6
Authors
Zhen Liu  IBM T. J. Watson Research Center, Hawthorne, NY, USA
Srinivasan Parthasarathy  IBM T. J. Watson Research Center, Hawthorne, NY, USA
Anand Ranganathan  IBM T. J. Watson Research Center, Hawthorne, NY, USA
Hao Yang  IBM T. J. Watson Research Center, Hawthorne, NY, USA
Sponsors
ACM: Association for Computing Machinery
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1376616.1376633

ABSTRACT

We consider the problem of evaluating multiple overlapping queries defined on data streams, where each query is a conjunction of multiple filters and each filter may be shared across multiple queries. Efficient support for overlapping queries is a critical issue in emerging data stream systems, particularly when filters are expensive in terms of computational complexity and processing time. This problem generalizes other well-known problems such as pipelined filter ordering and set cover, and is not only NP-hard but also hard to approximate within a factor of o(log n) of the optimum, where n is the number of queries. In this paper, we present two near-optimal approximation algorithms with provably good performance guarantees for the evaluation of overlapping queries. We present an edge-coverage-based Greedy algorithm which achieves an approximation ratio of (1 + log(n) + log(α)), where n is the number of queries and α is the average number of filters in a query. We also present a randomized, fast, and easily parallelizable Harmonic algorithm which achieves an approximation ratio of 2β, where β is the maximum number of filters in a query. We have implemented these algorithms in a prototype system and evaluated their performance using extensive experiments in the context of multimedia stream analysis. The results show that our Greedy algorithm consistently outperforms other known algorithms under various settings and scales well as the numbers of queries and filters increase.
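To make the problem setting concrete, the following is a minimal sketch of shared filter evaluation on a single stream item. It is not the paper's Greedy or Harmonic algorithm: the function name `evaluate_shared`, the filter names, the costs, and the selection rule (lowest cost per unresolved query that uses the filter) are all assumptions chosen for illustration. It only shows the two properties the abstract relies on: a filter shared by several queries is evaluated once, and a conjunctive query is resolved as soon as one of its filters fails.

```python
from typing import Callable, Dict, List, Tuple

Predicate = Callable[[dict], bool]


def evaluate_shared(
    queries: Dict[str, List[str]],
    filters: Dict[str, Tuple[float, Predicate]],
    item: dict,
) -> Tuple[Dict[str, bool], float]:
    """Resolve every conjunctive query on one item, running each filter at most once.

    Hypothetical helper for illustration only; the selection rule below is a
    simple heuristic, not the algorithm from the paper.
    """
    cache: Dict[str, bool] = {}      # filter name -> outcome on this item
    results: Dict[str, bool] = {}    # query name -> final answer
    unresolved = set(queries)
    total_cost = 0.0

    def resolve() -> None:
        # A query is False once any of its filters failed,
        # and True once all of its filters passed.
        for q in list(unresolved):
            names = queries[q]
            if any(cache.get(f) is False for f in names):
                results[q] = False
                unresolved.discard(q)
            elif all(f in cache for f in names):
                results[q] = True
                unresolved.discard(q)

    resolve()  # handles queries with no filters
    while unresolved:
        # Filters not yet evaluated that still matter to some unresolved query.
        candidates = {f for q in unresolved for f in queries[q] if f not in cache}

        def cost_per_query(f: str) -> float:
            touched = sum(1 for q in unresolved if f in queries[q])
            return filters[f][0] / touched

        # Assumed heuristic: cheapest filter per unresolved query it appears in.
        best = min(sorted(candidates), key=cost_per_query)
        cost, predicate = filters[best]
        cache[best] = predicate(item)
        total_cost += cost
        resolve()

    return results, total_cost


# Illustrative filters (name -> (cost, predicate)) and overlapping queries.
example_filters = {
    "is_video": (1.0, lambda x: x["type"] == "video"),
    "has_face": (10.0, lambda x: x["faces"] > 0),
    "is_loud": (2.0, lambda x: x["volume"] > 0.8),
}
example_queries = {
    "q1": ["is_video", "has_face"],
    "q2": ["is_video", "is_loud"],
    "q3": ["has_face", "is_loud"],
}
example_item = {"type": "audio", "faces": 0, "volume": 0.1}
example_results, example_cost = evaluate_shared(
    example_queries, example_filters, example_item
)
```

On this item the cheap shared filter `is_video` fails first and resolves `q1` and `q2` together, then `is_loud` fails and resolves `q3`, so the expensive `has_face` filter never runs and the total cost is 3.0. Choosing the evaluation order so that such savings are maximized in expectation is the optimization problem the abstract describes.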



