ABSTRACT
As the performance gap between main memory and modern processors widens, database algorithms must be adapted to be "architecture-aware" for optimal performance. We use hash join, one of the most important operations in database query processing, to study the impact of simultaneous multithreading (SMT) and main-memory latency (cache misses) on performance.
Prior work [8] studied cache misses in a simulation based on the Compaq ES40. Our results are obtained by measuring performance on actual hardware (Intel Pentium and Xeon, and AMD Opteron), first for the single-threaded version of the hash-join algorithm used in the prior work and then for a new version designed for multiple threads.
We found that hardware prefetching of main-memory data into the CPU cache, as implemented in the architectures we tested, significantly reduces the real-world benefit of software prefetching (contrary to prior work on simulated systems). We also found that SMT achieved significant speedup for our thread-aware hash-join algorithm compared with single-threaded execution on the same single processor. Software prefetching also proved beneficial in this environment.
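To make the term "software prefetching" concrete, the following is a minimal sketch of a hash-join probe that issues prefetches for a small group of keys before visiting their hash-table slots. It is an illustration under stated assumptions, not the implementation measured in this work: the open-addressing table layout, the names `hash_table`, `ht_build`, and `ht_probe_prefetch`, the `EMPTY` sentinel, and the group size `GROUP` are all hypothetical, and `__builtin_prefetch` is a GCC/Clang builtin rather than standard C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define GROUP 8            /* hypothetical group size; would be tuned per machine */
#define EMPTY UINT32_MAX   /* sentinel for a free slot (assumption: never a real key) */

typedef struct {
    uint32_t *slots;  /* open-addressing table holding build-side keys */
    size_t    mask;   /* capacity - 1, where capacity is a power of two */
} hash_table;

/* Multiplicative hash reduced to a slot index. */
static size_t hash_key(uint32_t key, size_t mask) {
    return (size_t)(key * 2654435761u) & mask;
}

/* Build phase: insert every build-side key using linear probing.
 * Returns 0 on success, -1 if allocation fails. */
static int ht_build(hash_table *ht, const uint32_t *keys, size_t n) {
    size_t cap = 16;
    while (cap < 2 * n) cap <<= 1;          /* keep load factor at or below 0.5 */
    ht->slots = malloc(cap * sizeof *ht->slots);
    if (!ht->slots) return -1;
    ht->mask = cap - 1;
    for (size_t i = 0; i < cap; i++) ht->slots[i] = EMPTY;
    for (size_t i = 0; i < n; i++) {
        assert(keys[i] != EMPTY);           /* sentinel must not appear as a key */
        size_t pos = hash_key(keys[i], ht->mask);
        while (ht->slots[pos] != EMPTY) pos = (pos + 1) & ht->mask;
        ht->slots[pos] = keys[i];
    }
    return 0;
}

/* Probe phase: returns the number of (build, probe) key pairs that match.
 * Keys are processed in groups so that the cache misses of one group
 * can overlap instead of being paid one after another. */
static size_t ht_probe_prefetch(const hash_table *ht,
                                const uint32_t *keys, size_t n) {
    size_t matches = 0;
    size_t pos[GROUP];
    for (size_t base = 0; base < n; base += GROUP) {
        size_t g = (n - base < GROUP) ? n - base : GROUP;
        /* Stage 1: hash each key and request the cache line of its home slot. */
        for (size_t j = 0; j < g; j++) {
            pos[j] = hash_key(keys[base + j], ht->mask);
            __builtin_prefetch(&ht->slots[pos[j]], 0, 1);
        }
        /* Stage 2: walk each probe sequence; the lines were requested earlier. */
        for (size_t j = 0; j < g; j++) {
            size_t p = pos[j];
            while (ht->slots[p] != EMPTY) {
                if (ht->slots[p] == keys[base + j]) matches++;
                p = (p + 1) & ht->mask;
            }
        }
    }
    return matches;
}
```

A caller would build the table from the smaller relation with `ht_build`, pass the keys of the other relation to `ht_probe_prefetch`, and release `slots` with `free` afterwards. On processors with aggressive hardware prefetchers, the abstract's finding suggests that the gain from the explicit prefetch in stage 1 may be smaller than simulation-based studies predict, so any such code would need to be measured on the target machine.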
REFERENCES
[1] Intel multi-core processor architecture development backgrounder. Intel White Paper.
[2] Multi-core processors -- the next evolution in computing. AMD White Paper, 2005.
[3]
[4] D. Boggs, A. Baktha, J. Hawkins, D. T. Marr, J. A. Miller, P. Roussel, R. Singhal, B. Toll, and K. Venkatraman. The microarchitecture of the Intel Pentium 4 processor on 90nm technology. Intel Technology Journal, (Q1):4--15, 2002.
[5]
[6]
[7] D. Carmean. Data management challenges on new computer architectures. In First Int'l Workshop on Data Management on New Hardware (DaMoN), June 2005. Oral Presentation.
[8]
[9] Shimin Chen, Phillip B. Gibbons, and Todd C. Mowry. Improving index performance through prefetching. In Proceedings of the 2001 ACM SIGMOD International Conference on Management of Data, pages 235--246, Santa Barbara, California, United States, May 21-24, 2001.
[10]
[11] Susan J. Eggers, Joel S. Emer, Henry M. Levy, Jack L. Lo, Rebecca L. Stamm, and Dean M. Tullsen. Simultaneous multithreading: A platform for next-generation processors. IEEE Micro, 17(5):12--19, September 1997. doi:10.1109/40.621209
[12]
[13] G. Hinton, D. Sager, M. Upton, D. Boggs, D. Carmean, A. Kyker, and P. Roussel. The microarchitecture of the Pentium 4 processor. Intel Technology Journal, (Q1), 2001.
[14] Intel. Intel Pentium 4 Processor Optimization, 2001.
[15] R. Kalla, B. Sinharoy, and J. M. Tendler. IBM Power5 chip: A dual-core multithreaded processor. 2004.
[16] M. Kitsuregawa, H. Tanaka, and T. Moto-Oka. Application of hash to data base machine and its architecture. In New Generation Computing, volume 1, pages 63--74, 1983.
[17] J. L. Lo, L. A. Barroso, S. Eggers, K. Gharachorloo, H. Levy, and S. Parekh. An analysis of database workload performance on simultaneous multithreaded processors. Technical report, Compaq, July 1998.
[18] D. T. Marr, F. Binns, D. L. Hill, G. Hinton, D. A. Koufaty, J. A. Miller, and M. Upton. Hyper-threading technology architecture and microarchitecture. Intel Technology Journal, (Q1):4--15, 2002.
[19]
[20] Sun Microsystems. Throughput computing: Changing the economics and ecology of the data center with innovative SPARC® technology. White Paper.
[21] V. K. Reddy, A. M. Sule, and A. V. Anantaraman. Hyper-threading on the Pentium 4, December 2002.
[22] Scott Rixner, William J. Dally, Ujval J. Kapasi, Peter Mattson, and John D. Owens. Memory access scheduling. In Proceedings of the 27th Annual International Symposium on Computer Architecture, pages 128--138, Vancouver, British Columbia, Canada, June 2000.
[23] S. Manegold, P. Boncz, and M. L. Kersten. Generic database cost models for hierarchical memory systems. In Proceedings of the 28th VLDB Conference, 2002.
[24]
[25]
[26]
[27] Dean M. Tullsen, Susan J. Eggers, Joel S. Emer, Henry M. Levy, Jack L. Lo, and Rebecca L. Stamm. Exploiting choice: Instruction fetch and issue on an implementable simultaneous multithreading processor. In Proceedings of the 23rd Annual International Symposium on Computer Architecture, pages 191--202, Philadelphia, Pennsylvania, United States, May 22-24, 1996.
[28]
[29]
CITED BY 4
Layali Rashid, Wessam M. Hassanein, and Moustafa A. Hammad. Exploiting multithreaded architectures to improve the hash join operation. In Proceedings of the 9th Workshop on MEmory performance: DEaling with Applications, systems and architecture, pages 46--53, Toronto, Canada, October 26, 2008.