Performance implications of single thread migration on a chip multi-core
Source: ACM SIGARCH Computer Architecture News
Volume 33, Issue 4 (November 2005)
Special issue: dasCMP'05
Pages: 80-91
Year of Publication: 2005
ISSN: 0163-5964
Authors
Theofanis Constantinou  University of Cyprus, Nicosia, Cyprus
Yiannakis Sazeides  University of Cyprus, Nicosia, Cyprus
Pierre Michaud  Irisa/Inria, Rennes Cedex, France
Damien Fetis  Irisa/Inria, Rennes Cedex, France
Andre Seznec  Irisa/Inria, Rennes Cedex, France
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1105734.1105745

ABSTRACT

High-performance multi-core processors are becoming an industry reality. Although multi-cores are suited to multithreaded and multi-programmed workloads, many applications are still single-threaded, so multi-core performance with a single-thread workload is an important issue. Furthermore, recent studies suggest that performance, power, and temperature considerations of future multi-cores may necessitate activity migration between cores.

Motivated by the above, this paper investigates the performance implications of single thread migration on a multi-core. Specifically, the study considers how the following migration and multi-core parameters influence the performance of a single thread: frequency of migration, core warm-up modes, subset of resources that are warmed up, number of cores, and cache hierarchy organization. The results of this study can provide insight to architects on how to design performance-efficient power and thermal strategies for a multi-core chip.

The experimental results, for the benchmarks and microarchitectures used in this study, show that the performance loss due to activity migration on a multi-core with private L1s and a shared L2 can be minimized if: (a) a migrating thread continues its execution on a core that was previously visited by the thread, and (b) cores remember their predictor state since their previous activation (all other core resources can be cold). The analogous conclusions for a multi-core with private L1s and L2s and a shared L3 are: remembering the predictor state, keeping the tags of the various L2 caches coherent, and allowing L2-to-L2 data transfers from inactive cores to the active core.

The data also show that when the migration period is at least 160K cycles, the transfer of register state between two cores and the flushing of dirty private L1 data have a negligible performance overhead.


