ACM Home Page
Please provide us with feedback. Feedback
A geometric approach to monitoring threshold functions over distributed data streams
Full text PdfPdf (481 KB)
Source
ACM Transactions on Database Systems (TODS) archive
Volume 32 ,  Issue 4  (November 2007) table of contents
Article No. 23  
Year of Publication: 2007
ISSN:0362-5915
Authors
Izchak Sharfman  Technion, Haifa, Israel
Assaf Schuster  Technion, Haifa, Israel
Daniel Keren  Haifa University, Haifa, Israel
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 15,   Downloads (12 Months): 121,   Citation Count: 0
Additional Information:

abstract   references   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1292609.1292613
What is a DOI?

ABSTRACT

Monitoring data streams in a distributed system is the focus of much research in recent years. Most of the proposed schemes, however, deal with monitoring simple aggregated values, such as the frequency of appearance of items in the streams. More involved challenges, such as the important task of feature selection (e.g., by monitoring the information gain of various features), still require very high communication overhead using naive, centralized algorithms.

We present a novel geometric approach which reduces monitoring the value of a function (vis-à-vis a threshold) to a set of constraints applied locally on each of the streams. The constraints are used to locally filter out data increments that do not affect the monitoring outcome, thus avoiding unnecessary communication. As a result, our approach enables monitoring of arbitrary threshold functions over distributed data streams in an efficient manner.

We present experimental results on real-world data which demonstrate that our algorithms are highly scalable, and considerably reduce communication load in comparison to centralized algorithms.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

1
2
3
4
5
 
6
Berkovitz, L. 2002. Convexity and Optimization in Rn. Wiley, New York, NY.
 
7
 
8
 
9
 
10
Cherniack, M., Balakrishnan, H., Balazinska, M., Carney, D., Cetintemel, U., Xing, Y., and Zdonik, S. 2003. Scalable distributed stream processing. In CIDR 2003: Proceedings of the First Biennial Conference on Innovative Data Systems Research (Asilomar, CA).
11
 
12
Dilman, M. and Raz, D. 2001. Efficient reactive monitoring. In INFOCOM '01: Proceedings of the Twentieth Annual Joint Conference of the IEEE Computer and Communications Societies. 1012--1019.
13
14
15
16
 
17
Jain, A., Hellerstein, J. M., Ratnasamy, S., and Wetherall, D. 2004. A wakeup call for Internet monitoring systems: The case for distributed triggers. In Proceedings of the 3rd ACM SIGCOMM Workshop on Hot Topics in Networks (HotNets, San Diego, CA).
18
 
19
 
20
 
21
 
22
23
 
24
 
25
26
27
 
28
Parrilo, P. 2003. Semidefinite programming relaxations for semialgebraic problems. Math. Programm. 96, 2, 293--320.
 
29
Rose, T., Stevenson, M., and Whitehead, M. 2002. The Reuters Corpus Volume 1---from yesterday's news to tomorrow's language resources. In Proceedings of the Third International Conference on Language Resources and Evaluation (Las Palmas de Gran Canaria).
30
 
31
 
32
33
34
 
35
 
36

Collaborative Colleagues:
Izchak Sharfman: colleagues
Assaf Schuster: colleagues
Daniel Keren: colleagues