| A case for user-level dynamic page migration |
| Source
|
International Conference on Supercomputing
Proceedings of the 14th international conference on Supercomputing
Santa Fe, New Mexico, United States
Pages: 119 - 130
Year of Publication: 2000
ISBN:1-58113-270-0
|
|
Authors
|
|
Dimitrios S. Nikolopoulos
|
Department of Computer Engineering and Informatics, University of Patras, Rion, 26 500, Patras, Greece
|
|
Theodore S. Papatheodorou
|
Department of Computer Engineering and Informatics, University of Patras, Rion, 26 500, Patras, Greece
|
|
Constantine D. Polychronopoulos
|
Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Urbana, IL
|
|
Jesús Labarta
|
Department of Computer Architecture, Polytechnic University of Catalonia, c/Jordi Girona 1-3, Modul D6, 08034, Barcelona, Spain
|
|
Eduard Ayguadé
|
Department of Computer Architecture, Polytechnic University of Catalonia, c/Jordi Girona 1-3, Modul D6, 08034, Barcelona, Spain
|
|
ABSTRACT
This paper presents user-level dynamic page migration, a runtime technique that transparently enables parallel programs to tune their memory performance on distributed shared memory multiprocessors, using feedback obtained from dynamic monitoring of memory activity. Our technique exploits the iterative nature of parallel programs, together with information available to the program at both compile time and runtime, to improve the accuracy and timeliness of page migrations and to amortize their overhead better than page migration engines implemented in the operating system. We present an adaptive page migration algorithm based on a competitive and a predictive criterion. The competitive criterion corrects poor page placement decisions of the operating system, while the predictive criterion makes the algorithm responsive to scheduling events that necessitate immediate page migrations, such as thread preemptions and migrations. We also present a new technique for preventing page ping-pong and a mechanism for monitoring the performance of page migration algorithms at runtime and tuning their sensitive parameters accordingly. Our experimental evidence on an SGI Origin2000 shows that unmodified OpenMP codes linked with our runtime system for dynamic page migration are effectively immune to the page placement strategy of the operating system and the associated data locality problems. Furthermore, our runtime system achieves solid performance improvements over the IRIX 6.5.5 page migration engine, for both individual parallel OpenMP codes and multiprogrammed workloads.
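To make the competitive criterion concrete, the sketch below shows one common form of such a test: a page becomes a migration candidate when some remote node has accessed it more often than its home node by more than a threshold. This is only an illustration under stated assumptions, not the algorithm from the paper. The function name `competitive_target`, the per-node counter dictionary, and the `threshold` parameter are all hypothetical, and the predictive criterion and the ping-pong prevention technique are not modeled.

```python
from typing import Dict, Optional


def competitive_target(counts: Dict[int, int], home: int, threshold: int) -> Optional[int]:
    """Pick a migration target for one page, or None to leave it in place.

    counts    -- hypothetical per-node access counters for the page
    home      -- node that currently holds the page
    threshold -- assumed tunable margin a remote node must exceed
    """
    local = counts.get(home, 0)

    # Find the remote node with the most accesses to this page.
    best_node: Optional[int] = None
    best_count = 0
    for node, count in counts.items():
        if node != home and count > best_count:
            best_node, best_count = node, count

    # Migrate only if the remote node leads the home node by more than the threshold.
    if best_node is not None and best_count - local > threshold:
        return best_node
    return None


if __name__ == "__main__":
    # Node 1 dominates accesses to a page homed on node 0.
    print(competitive_target({0: 10, 1: 200, 2: 50}, home=0, threshold=64))
    # The remote lead is too small to justify a migration.
    print(competitive_target({0: 100, 1: 120}, home=0, threshold=64))
```

A larger threshold makes the test more conservative, trading slower reaction to poor placement for fewer migrations; this is the kind of sensitive parameter the abstract says is tuned at runtime.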
CITED BY 7
|
Dimitrios S. Nikolopoulos , Eduard Ayguadé , Theodore S. Papatheodorou , Constantine D. Polychronopoulos , Jesús Labarta, The trade-off between implicit and explicit data distribution in shared-memory programming paradigms, Proceedings of the 15th international conference on Supercomputing, p.23-37, June 2001, Sorrento, Italy
|
|
|
Dimitrios S. Nikolopoulos , Theodore S. Papatheodorou , Constantine D. Polychronopoulos , Jesús Labarta , Eduard Ayguadé, Is data distribution necessary in OpenMP?, Proceedings of the 2000 ACM/IEEE conference on Supercomputing (CDROM), p.47-es, November 04-10, 2000, Dallas, Texas, United States
|