ACM Home Page
Please provide us with feedback. Feedback
Improving hash join performance through prefetching
Full text PdfPdf (904 KB)
Source
ACM Transactions on Database Systems (TODS) archive
Volume 32 ,  Issue 3  (August 2007) table of contents
Article No. 17  
Year of Publication: 2007
ISSN:0362-5915
Authors
Shimin Chen  Intel Research Pittsburgh, Pittsburgh, PA
Anastassia Ailamaki  Carnegie Mellon University, Pittsburgh, PA
Phillip B. Gibbons  Intel Research Pittsburgh, Pittsburgh, PA
Todd C. Mowry  Carnegie Mellon University and Intel Research Pittsburgh, Pittsburgh, PA
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 32,   Downloads (12 Months): 171,   Citation Count: 2
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1272743.1272747
What is a DOI?

ABSTRACT

Hash join algorithms suffer from extensive CPU cache stalls. This article shows that the standard hash join algorithm for disk-oriented databases (i.e. GRACE) spends over 80% of its user time stalled on CPU cache misses, and explores the use of CPU cache prefetching to improve its cache performance. Applying prefetching to hash joins is complicated by the data dependencies, multiple code paths, and inherent randomness of hashing. We present two techniques, group prefetching and software-pipelined prefetching, that overcome these complications. These schemes achieve 1.29--4.04X speedups for the join phase and 1.37--3.49X speedups for the partition phase over GRACE and simple prefetching approaches. Moreover, compared with previous cache-aware approaches (i.e. cache partitioning), the schemes are at least 36% faster on large relations and do not require exclusive use of the CPU cache to be effective. Finally, comparing the elapsed real times when disk I/Os are in the picture, our cache prefetching schemes achieve 1.12--1.84X speedups for the join phase and 1.06--1.60X speedups for the partition phase over the GRACE hash join algorithm.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

1
 
2
 
3
 
4
 
5
6
7
8
9
 
10
Hepkin, D. 2006. Guide to multiple page size support on AIX 5L version 5.3. http://www-03.ibm.com/servers/aix/whitepapers/multiple_page.pdf.
 
11
IBM Corporation. 2004. IBM DB2 Universal Database Administration Guide Version 8.2.
 
12
Intel Corporation. 2004. Intel Itanium 2 Processor Reference Manual For Software Development and Optimization. Intel Corporation. Order Number: 251110-003.
 
13
Intel Corporation. 2006. IA-32 Intel Architecture Software Developer's Manual (Volumn 3A and 3B): System Programming Guide. Intel Corporation.
 
14
 
15
Kalla, R. N., Sinharoy, B., and Tendler, J. M. 2004. IBM Power5 chip: A dual-core multithreaded processor. IEEE Micro 24, 2 (Mar./Apr.), 40--47.
 
16
 
17
Kitsuregawa, M., Tanaka, H., and Moto-Oka, T. 1983. Application of hash to data base machine and its architecture. New Generation Comput. 1, 1, 63--74.
 
18
 
19
Lindsay, B. 2002. Hash joins in DB2 UDB: The inside story. Carnegie Mellon DB Seminar.
20
 
21
 
22
 
23
McDougall, R. 2004. Supporting multiple page sizes in the solaris operating system. http://www.sun.com/blueprints/0304/817-5917.pdf.
 
24
McFarling, S. 1993. Combining branch predictors. Tech. Rep. WRL Technical Note TN-36, Digital Equipment Corporation (June).
 
25
26
 
27
 
28
Perfmon Project. http://www.hpl.hp.com/research/linux/perfmon/index.php4.
29
30
 
31
 
32
Sun Microsystems. UltraSPARC IV Processor Architecture Overview. Sun Microsystems. Technical Whitepaper, Version 1.0, Feb. 2004.
 
33
TPC Benchmarks. http://www.tpc.org/.
 
34
 
35


Collaborative Colleagues:
Shimin Chen: colleagues
Anastassia Ailamaki: colleagues
Phillip B. Gibbons: colleagues
Todd C. Mowry: colleagues