ACM Home Page
Please provide us with feedback. Feedback
Skew handling techniques in sort-merge join
Full text PdfPdf (1.18 MB)
Source International Conference on Management of Data archive
Proceedings of the 2002 ACM SIGMOD international conference on Management of data table of contents
Madison, Wisconsin
SESSION: Research sessions: implementation techniques table of contents
Pages: 169 - 180  
Year of Publication: 2002
ISBN:1-58113-497-5
Authors
Wei Li  Oracle Corporation
Dengfeng Gao  University of Arizona
Richard Thomas Snodgrass  University of Arizona
Sponsor
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 10,   Downloads (12 Months): 50,   Citation Count: 3
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/564691.564711
What is a DOI?

ABSTRACT

Joins are among the most frequently executed operations. Several fast join algorithms have been developed and extensively studied; these can be categorized as sort-merge, hash-based, and index-based algorithms. While all three types of algorithms exhibit excellent performance over most data, ameliorating the performance degradation in the presence of skew has been investigated only for hash-based algorithms. However, for sort-merge join, even a small amount of skew present in realistic data can result in a significant performance hit on a commercial DBMS. This paper examines the negative ramifications of skew in sort-merge join and proposes several refinements that deal effectively with data skew. Experiments show that some of these algorithms also impose virtually no penalty in the absence of data skew and are thus suitable for replacing existing sort-merge implementations. We also show how sort-merge band join performance is significantly enhanced with these refinements.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
 
2
 
3
 
4
5
 
6
 
7
 
8
 
9
 
10
 
11
Roger N. Kline and Michael D. Soo, "The TIMEIT Temporal Database Testbed," 1998. www.cs.auc.dk/TimeCenter/software.htm.
 
12
 
13
T. Y. Cliff Leung and Richard R. Muntz, "Generalized Data Stream Indexing and Temporal Query Processing," in International Workshop on Research Issues in Data Engineering: Transaction and Query Processing, Philip S. Yu (ed.), pp. 124-131, Tempe, AZ, February, 1992.
 
14
 
15
Wei Li, Dengfeng Gao and Richard T. Snodgrass, "Skew Handling Techniques in Sort-Merge Join," 2001. http://www.cs.auc.dk/research/DP/tdb/TimeCenter/TimeCenterPublications/TR-62.pdf.
16
 
17
18
 
19
 
20
21
 
22
Thomas Zurek, Parallel Temporal Nested-Loop Joins, Ph.D. Dissertation, Dept. of Computer Science, Edinburgh University, 1996.


Collaborative Colleagues:
Wei Li: colleagues
Dengfeng Gao: colleagues
Richard Thomas Snodgrass: colleagues