ACM Home Page
Please provide us with feedback. Feedback
Continuous multi-way joins over distributed hash tables
Full text PdfPdf (555 KB)
Source ACM International Conference Proceeding Series; Vol. 261 archive
Proceedings of the 11th international conference on Extending database technology: Advances in database technology table of contents
Nantes, France
SESSION: Research sessions: Join processing table of contents
Pages 594-605  
Year of Publication: 2008
ISBN:978-1-59593-926-5
Authors
Stratos Idreos  CWI, Amsterdam, The Netherlands
Erietta Liarou  CWI, Amsterdam, The Netherlands
Manolis Koubarakis  University of Athens, Greece
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 10,   Downloads (12 Months): 72,   Citation Count: 0
Additional Information:

abstract   references   index terms   collaborative colleagues  

Tools and Actions: Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1353343.1353415
What is a DOI?

ABSTRACT

This paper studies the problem of evaluating continuous multi-way joins on top of Distributed Hash Tables (DHTs). We present a novel algorithm, called recursive join (RJoin), that takes into account various parameters crucial in a distributed setting i.e., network traffic, query processing load distribution, storage load distribution etc. The key idea of RJoin is incremental evaluation: as relevant tuples arrive continuously, a given multi-way join is rewritten continuously into a join with fewer join operators, and is assigned continuously to different nodes of the network. In this way, RJoin distributes the responsibility of evaluating a continuous multi-way join to many network nodes by assigning parts of the evaluation of each binary join to a different node depending on the values of the join attributes. The actual nodes to be involved are decided by RJoin dynamically after taking into account the rate of incoming tuples with values equal to the values of the joined attributes. RJoin also supports sliding window joins which is a crucial feature, especially for long join paths, since it provides a mechanism to reduce the query processing state and thus keep the cost of handling incoming tuples stable. In addition, RJoin is able to handle message delays due to heavy network traffic. We present a detailed mathematical and experimental analysis of RJoin and study the performance tradeoffs that occur.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
 
2
D. Abadi et al. The Design of the Borealis Stream Processing Engine. CIDR '05.
 
3
 
4
 
5
Y. Ahmad et al. Locality-Aware Networked Join Evaluation. NETDB '05.
6
 
7
 
8
9
 
10
M. Cherniack et al. Scalable Distributed Stream Processing. CIDR '03.
 
11
 
12
 
13
J. M. Hellerstein et al. A Wakeup Call for Internet Monitoring Systems: The Case for Distributed Triggers. HotNets-III '04.
 
14
 
15
 
16
R. Huebsch et al. The Architecture of PIER: an Internet-Scale Query Processor. CIDR '05.
 
17
S. Idreos. Distributed Evaluation of Continuous Equi-join Queries over Large Structured Overlay Networks. Master thesis. Technical University of Crete. September, 2005.
 
18
19
20
 
21
22
 
23
S. Chandrasekharan et al. TelegraphCQ: Continuous dataflow processing for an uncertain world. CIDR '03.
 
24
J. Shneidman et al. A Cost-Space Approach to Distributed Query Optimization in Stream Based Overlays. NETDB '05.
25
26
 
27
B.-Y. Zhao et al. Tapestry: A Resilient Global-scale Overlay for Service Deployment. IEEE Journal on Selected Areas in Communications, 22(1), 2004.

Collaborative Colleagues:
Stratos Idreos: colleagues
Erietta Liarou: colleagues
Manolis Koubarakis: colleagues