ABSTRACT
Recent research advocates asymmetric multi-core architectures, where cores in the same processor can have different performance. These architectures support single-threaded performance and multithreaded throughput at lower costs (e.g., die size and power). However, they also pose unique challenges to operating systems, which traditionally assume homogeneous hardware. This paper presents AMPS, an operating system scheduler that efficiently supports both SMP- and NUMA-style performance-asymmetric architectures. AMPS contains three components: asymmetry-aware load balancing, faster-core-first scheduling, and NUMA-aware migration. We have implemented AMPS in Linux kernel 2.6.16 and used CPU clock modulation to emulate performance asymmetry on an SMP system and a NUMA system. For various workloads, we show that AMPS achieves a median speedup of 1.16 with a maximum of 1.44 over stock Linux on the SMP system, and a median of 1.07 with a maximum of 2.61 on the NUMA system. Our results also show that AMPS improves fairness and repeatability of application performance measurements.