ACM Home Page
Please provide us with feedback. Feedback
Boosting topic-based publish-subscribe systems with dynamic clustering
Full text PdfPdf (415 KB)
Source
International Conference on Management of Data archive
Proceedings of the 2007 ACM SIGMOD international conference on Management of data table of contents
Beijing, China
SESSION: Publish-subscribe systems table of contents
Pages: 749 - 760  
Year of Publication: 2007
ISBN:978-1-59593-686-8
Authors
Tova Milo  Tel Aviv University, Tel Aviv, Israel
Tal Zur  Tel Aviv University, Tel Aviv, Israel
Elad Verbin  Tel Aviv University, Tel Aviv, Israel
Sponsors
ACM: Association for Computing Machinery
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 18,   Downloads (12 Months): 162,   Citation Count: 5
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1247480.1247563
What is a DOI?

ABSTRACT

We consider in this paper a class of Publish-Subscribe (pub-sub) systems called topic-based systems, where users subscribe to topics and are notified on events that belong to those subscribed topics. With the recent flourishing of RSS news syndication, these systems are regaining popularity and are raising new challenging problems.

In most of the modern topics-based systems, the events in each topic are delivered to the subscribers via a supporting, distributed, data structure (typically a multicast tree). Since peers in the network may come and go frequently, this supporting structure must be continuously maintained so that "holes" do not disrupt the events delivery. The dissemination of events in each topic thus incurs two main costs: (1) the actual transmission cost for the topic events,and (2) the maintenance cost for its supporting structure. This maintenance overhead becomes particularly dominating when a pub-sub system supports a large number of topics with moderate event frequency; a typical scenario in nowadays news syndication scene.

The goal of this paper is to devise a method for reducing this maintenance overhead to the minimum. Our aim is not to invent yet another topic-based pub-sub system, but rather to develop a generic technique for better utilization of existing platforms. Our solution is based on a novel distributed clustering algorithm that utilizes correlations between user subscriptions to dynamically group topics together, into virtual topics (called topic-clusters), andt hereby unifies their supporting structures and reduces costs. Our technique continuously adapts the topic-clusters and the user subscriptions to the system state, and incurs only very minimal overhead. We have implemented our solution in the Tamara pub-sub system. Our experimental study shows this approach to be extremely effective, improving the performance by an order of magnitude.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
J. E. Bartlett, J. W. Kotrlik, and C. C. Higgins. Organizational research: Determining appropriate sample size in survey research. Information Technology, Learning, and Performance, 19(1), 2001.
 
2
F. Cao and J. P. Singh. Efficient event routing in content-based publish-subscribe service networks. IEEE INFOCOM, 2004.
3
 
4
M. Castro, P. Druschel, A. Kermarrec, and A. Rowstron. SCRIBE: A large-scale and decentralized application-level multicast infrastructure. IEEE Journal on Selected Areas in communications (JSAC), 20(8):1489--1499, 2002.
5
 
6
 
7
Y. Diao, P. Fischer, M. J. Franklin, and R. To. Yfilter: Efficient and scalable filtering of xml documents. In Proc. Intl. Conf. on Data Engineering (ICDE), pages 341--342, 2002.
 
8
Y. Diao, S. Rizvi, and M. J. Franklin. Towards an internet-scale xml dissemination service. In Proc. VLDB, pages 612--623, 2004.
 
9
Freshmeat. The largest index of unix and cross-platform softwares. http://freshmeat.net/projects/gimp/.
 
10
Y. Gourhant, S. Louboutin, V. Cahill, A. Condon, G. Starovic, and B. Tangney. Dynamic clustering in an object-oriented distributed system. In Proc. Objects in Large Distributed Applications OLDA-II, Ottawa, Canada, 1992.
 
11
E. Januzaj, H. Kriegel, and M. Pfeifle. Towards effective and efficient distributed clustering. In Workshop on Clustering Large Data Sets (ICDM2003), Melbourne, FL, 2003.
 
12
 
13
 
14
mandriva.com. Mandrive linux distribution. http://www.mandriva.com/.
 
15
S. Moritz and B. Ernst W. DDC: A dynamic and distributed clustering algorithm for networked virtual environments based on p2p networks. In Proceedings of CoNEXT'05, France, 2005.
 
16
nielsen-netratings.com. Nielsen/NetRating. http://www.nielsen-netratings.com/pr/pr-050920.
17
 
18
V. Ramasubramanian, R. Peterson, and Emin Gun Sirer. Corona: A high performance publish-subscribe system for the world wide web. In Proc. of Networked System Design and Implementation, San Jose, California, 2006.
 
19
 
20
 
21
A. Rowstron and P. Druschel. Pastry: Scalable, decentralized object location, and routing for large-scale peer-to-peer systems. In Proceedings of IFIP/ACM Middleware, Germany, 2001.
 
22
D. Sandler, A. Mislove, A. Post, and P. Druschel. FeedTree: Sharing web micronews with peer-to-peer event notification. In Proc. Int. Workshop on Peer-to-Peer Systems (IPTPS'05), New York, 2005.
 
23
T. Sivaharan, G. Blair, and G. Coulson. GREEN: A configurable and re-configurable publish-subscribe middleware for pervasive computing. In Proc, Int. Symposium on Distributed Objects and Applications (DOA05), Agia Napa, Cyprus, 2005.
24
 
25
GIMP.GNU image manipulation program. http://www.gimp.org/.
 
26
E. Patrick Th., G. Rachid, and S. Joe. Type-Based Publish/Subscribe. Technical report, 2000.
 
27
S. Voulgaris, E. Riviere, A. Kermarrec, and M. van Steen. Sub-2-sub: Self-organizing content-based publish and subscribe for dynamic and large scale collaborative networks. In Proc. Int'l Workshop on Peer-to-Peer Systems (IPTPS), California, USA, February 2006.
 
28
 
29
B. Y. Zhao, J. D. Kubiatowicz, and A. D. Joseph. Tapestry: An infrastructure for fault-tolerant wide-area location and routing. Technical Report UCB/CSD-01-1141, UC Berkeley, April 2001.
30


Collaborative Colleagues:
Tova Milo: colleagues
Tal Zur: colleagues
Elad Verbin: colleagues