| The trade-off between implicit and explicit data distribution in shared-memory programming paradigms |
| Source
|
International Conference on Supercomputing
Proceedings of the 15th international conference on Supercomputing
Sorrento, Italy
Pages: 23 - 37
Year of Publication: 2001
ISBN:1-58113-410-X
|
|
Authors
|
|
Dimitrios S. Nikolopoulos
|
Coordinated Science Laboratory, University of Illinois at Urbana-Champaign, 1308 West Main Street, Urbana, IL
|
|
Eduard Ayguadé
|
Departament d'Arquitectura de Computadors, Universitat Politècnica de Catalunya, c/Jordi Girona 1-3, 08034 Barcelona, Spain
|
|
Theodore S. Papatheodorou
|
Department of Computer Engineering and Informatics, University of Patras, Rion, 26500 Patras, Greece
|
|
Constantine D. Polychronopoulos
|
Coordinated Science Laboratory, University of Illinois at Urbana-Champaign, 1308 West Main Street, Urbana, IL
|
|
Jesús Labarta
|
Departament d'Arquitectura de Computadors, Universitat Politècnica de Catalunya, c/Jordi Girona 1-3, 08034 Barcelona, Spain
|
|
ABSTRACT
This paper explores previously established and novel methods for scaling the performance of OpenMP on NUMA architectures. The spectrum of methods under investigation includes OS-level automatic page placement algorithms, dynamic page migration, and manual data distribution. These methods trade performance against programming effort. Automatic page placement algorithms are transparent to the programmer, but may compromise memory access locality. Dynamic page migration is also transparent, but requires careful engineering of online algorithms to be effective. Manual data distribution, on the other hand, requires substantial programming effort and architecture-specific extensions to OpenMP, but may localize memory accesses in a nearly optimal manner.
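The locality risk of automatic page placement can be illustrated with a toy model. The sketch below is not taken from the paper: it assumes a first-touch placement policy (a common OS-level policy, used here as a stand-in), one thread per NUMA node, and a static block schedule. The names `PAGE_SIZE`, `NUM_NODES`, `N`, `block_owner`, `first_touch`, and `remote_accesses` are all hypothetical.

```python
# Toy model of first-touch page placement on a NUMA machine.
# All sizes and names are illustrative assumptions, not values from the paper.
PAGE_SIZE = 4   # array elements per page (assumed)
NUM_NODES = 4   # one thread per NUMA node (assumed)
N = 64          # array length (assumed)


def block_owner(i, n=N, nodes=NUM_NODES):
    """Node whose thread runs iteration i under a static block schedule."""
    return i * nodes // n


def first_touch(init_owner):
    """Home each page on the node of the thread that touches it first."""
    placement = {}
    for i in range(N):
        placement.setdefault(i // PAGE_SIZE, init_owner(i))
    return placement


def remote_accesses(placement, compute_owner):
    """Count element accesses that land on a page homed on another node."""
    return sum(
        1 for i in range(N) if placement[i // PAGE_SIZE] != compute_owner(i)
    )


# Serial initialization: the master thread (node 0) touches every page first.
serial_init = first_touch(lambda i: 0)
# Parallel initialization: each thread first touches the block it later computes on.
parallel_init = first_touch(block_owner)

print(remote_accesses(serial_init, block_owner))    # 48
print(remote_accesses(parallel_init, block_owner))  # 0
```

In this model the placement policy is identical in both runs; only the initialization pattern differs. That is one way a transparent policy can end up with poor locality without any change to the compute loop.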
The main contributions of the paper are: a classification of application characteristics that clearly identifies the conditions under which transparent methods are both capable and sufficient for optimizing memory locality in an OpenMP program; and two novel runtime techniques, runtime data distribution based on memory access traces and affinity scheduling with iteration schedule reuse, used as competitive substitutes for manual data distribution in several important classes of applications.
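To make the idea of trace-based runtime data distribution concrete, here is a minimal sketch under stated assumptions. It is not the paper's algorithm. It assumes per-page access counts are recorded during one probe pass of a loop, and that each page is then homed on the node that accessed it most. The schedule `cyclic_owner` and all other names are hypothetical.

```python
from collections import Counter, defaultdict

# Illustrative sizes only; none of these values come from the paper.
PAGE_SIZE = 4   # array elements per page (assumed)
NUM_NODES = 4   # one thread per NUMA node (assumed)
N = 64          # array length (assumed)


def cyclic_owner(i):
    """Hypothetical schedule: page-sized chunks dealt round-robin to nodes."""
    return (i // PAGE_SIZE) % NUM_NODES


def record_trace(schedule):
    """Probe pass: count, for each page, how often each node accesses it."""
    trace = defaultdict(Counter)
    for i in range(N):
        trace[i // PAGE_SIZE][schedule(i)] += 1
    return trace


def distribute(trace):
    """Home every page on the node that accessed it most often in the trace."""
    return {page: counts.most_common(1)[0][0] for page, counts in trace.items()}


def remote_accesses(placement, schedule):
    """Count element accesses that land on a page homed on another node."""
    return sum(1 for i in range(N) if placement[i // PAGE_SIZE] != schedule(i))


# Starting point: every page homed on node 0 (e.g. after serial initialization).
initial = {page: 0 for page in range(N // PAGE_SIZE)}
tuned = distribute(record_trace(cyclic_owner))

print(remote_accesses(initial, cyclic_owner))  # 48
print(remote_accesses(tuned, cyclic_owner))    # 0
```

The tuned placement stays local only while later loops execute with the same iteration-to-thread mapping that produced the trace. In this toy model, that is the role schedule reuse plays alongside data distribution.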