Adaptive execution techniques for SMT multiprocessor architectures
Source
Principles and Practice of Parallel Programming
Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming
Chicago, IL, USA
SESSION: Architecture and systems
Pages: 236 - 246
Year of Publication: 2005
ISBN: 1-59593-080-9
Authors
Changhee Jung, Electronics and Telecommunications Research Institute, Daejeon, Korea
Daeseob Lim, University of California, San Diego, La Jolla, CA
Jaejin Lee, Seoul National University, Seoul, Korea
SangYong Han, Seoul National University, Seoul, Korea
ABSTRACT
In simultaneous multithreading (SMT) multiprocessors, using all the available threads (logical processors) to run a parallel loop is not always beneficial due to the interference between threads and parallel execution overhead. To maximize performance in an SMT multiprocessor, finding the optimal number of threads is important. This paper presents adaptive execution techniques to find the optimal execution mode for SMT multiprocessor architectures. A compiler preprocessor generates code that, based on dynamic feedback, automatically determines at run time the optimal number of threads for each parallel loop in the application. Using 10 standard numerical applications and running them with our techniques on an Intel 4-processor Hyper-Threading Xeon SMP with 8 logical processors, our code is, on average, about 2 and 18 times faster than the original code executed on 4 and 8 logical processors, respectively.
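The feedback loop the abstract describes can be illustrated with a minimal sketch. It assumes a simple policy that the abstract does not spell out: each candidate thread count is timed once on an early invocation of the loop, and later invocations reuse the fastest one. The class `AdaptiveLoop`, its methods, and the `body` callable are hypothetical names introduced here for illustration, not the paper's generated code; the clock is injectable only so the demonstration is deterministic.

```python
import time
from typing import Callable, Dict, Sequence


class AdaptiveLoop:
    """Chooses a thread count for one parallel loop from measured run times.

    Hypothetical policy (an assumption, not the paper's algorithm): sample
    each candidate once, then commit to the fastest candidate observed.
    """

    def __init__(self, candidates: Sequence[int],
                 clock: Callable[[], float] = time.perf_counter) -> None:
        if not candidates:
            raise ValueError("need at least one candidate thread count")
        self.candidates = list(candidates)
        self.clock = clock
        self.timings: Dict[int, float] = {}  # thread count -> elapsed time

    def next_thread_count(self) -> int:
        # Sampling phase: return the first candidate not yet measured.
        for n in self.candidates:
            if n not in self.timings:
                return n
        # Decision phase: every candidate was measured; pick the fastest.
        return min(self.timings, key=self.timings.get)

    def run(self, body: Callable[[int], None]) -> int:
        """Run one invocation of the loop; return the thread count used."""
        n = self.next_thread_count()
        start = self.clock()
        body(n)  # body(n) is assumed to execute the loop with n threads
        elapsed = self.clock() - start
        if n not in self.timings:
            self.timings[n] = elapsed
        return n


# Demonstration with a simulated clock and made-up per-invocation costs.
sim_time = [0.0]
sim_cost = {4: 2.0, 8: 9.0}  # invented numbers, not measurements


def simulated_body(n: int) -> None:
    sim_time[0] += sim_cost[n]


loop = AdaptiveLoop([4, 8], clock=lambda: sim_time[0])
used = [loop.run(simulated_body) for _ in range(4)]
print(used)
```

In a real setting, one `AdaptiveLoop` instance would exist per parallel loop, matching the abstract's per-loop decision, and `body` would launch the loop with the requested number of threads. A production version would likely re-sample periodically, since the best thread count can change as the working set changes.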
CITED BY 4
Matthew Curtis-Maury, James Dzierwa, Christos D. Antonopoulos, Dimitrios S. Nikolopoulos, Online power-performance adaptation of multithreaded programs using hardware event-based prediction, Proceedings of the 20th annual international conference on Supercomputing, June 28-July 01, 2006, Cairns, Queensland, Australia
Angela C. Sodan, Garima Gupta, Lin Han, Lun Liu, Benjamin Lafreniere, Time and space adaptation for computational grids with the ATOP-Grid middleware, Future Generation Computer Systems, v.24 n.6, p.561-581, June 2008