| Core architecture optimization for heterogeneous chip multiprocessors |
| Full text |
Pdf
(279 KB)
|
| Source
|
PACT
archive
Proceedings of the 15th international conference on Parallel architectures and compilation techniques
table of contents
Seattle, Washington, USA
SESSION: Multi-core design I
table of contents
Pages: 23 - 32
Year of Publication: 2006
ISBN:1-59593-264-X
|
|
Authors
|
|
| Sponsor |
|
| Publisher |
|
| Bibliometrics |
Downloads (6 Weeks): 47, Downloads (12 Months): 240, Citation Count: 15
|
|
|
ABSTRACT
Previous studies have demonstrated the advantages of single-ISA heterogeneous multi-core architectures for power and performance. However, none of those studies examined how to design such a processor; instead, they started with an assumed combination of pre-existing cores.This work assumes the flexibility to design a multi-core architecture from the ground up and seeks to address the following question: what should be the characteristics of the cores for a heterogeneous multi processor for the highest area or power efficiency? The study is done for varying degrees of thread-level parallelism and for different area and power budgets.The most efficient chip multiprocessors are shown to be heterogeneous, with each core customized to a different subset of application characteristics - no single core is necessarily well suited to all applications. The performance ordering of cores on such processors is different for different applications; there is only a partial ordering among cores in terms of resources and complexity. This methodology produces performance gains as high as 40%. The performance improvements come with the added cost of customization.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
International Technology Roadmap for Semiconductors 2003, http://public.itrs.net.
|
 |
2
|
|
 |
3
|
|
| |
4
|
S. Ghiasi and D. Grunwald. Aide de camp: Asymmetric dual core design for power and energy reduction. In University of Colorado Technical Report CU-CS-964-03, 2003.
|
 |
5
|
|
| |
6
|
|
| |
7
|
|
| |
8
|
|
| |
9
|
R. Kotla, A. Devgan, S. Ghiasi, T. Keller, and F. Rawson. Characterizing the impact of different memory-intensity levels. In Proceedings of IEEE 7th Annual Workshop on Workload Characterization, 2004.
|
| |
10
|
|
 |
11
|
Rakesh Kumar , Dean M. Tullsen , Parthasarathy Ranganathan , Norman P. Jouppi , Keith I. Farkas, Single-ISA Heterogeneous Multi-Core Architectures for Multithreaded Workload Performance, Proceedings of the 31st annual international symposium on Computer architecture, p.64, June 19-23, 2004, München, Germany
|
 |
12
|
|
| |
13
|
J. Li and J. Martinez. Power-performance implications of thread-level parallelism in chip multiprocessors. In Proceedings of International Symposium on Performance Analysis of Systems and Software, 2005.
|
| |
14
|
S. McFarling. Combining branch predictors. Technical Report TN-36, DEC-WRL, June 1993.
|
| |
15
|
T. Morad, U. Weiser, and A. Kolodny. ACCMP - assymetric cluster chip-multiprocessing. In CCIT Technical Report 488, 2004.
|
| |
16
|
Tomer Y. Morad , Uri C. Weiser , Avinoam Kolodny , Mateo Valero , Eduard Ayguade, Performance, Power Efficiency and Scalability of Asymmetric Cluster Chip Multiprocessors, IEEE Computer Architecture Letters, v.5 n.1, p.4, January 2006
[doi> 10.1109/L-CA.2006.6]
|
| |
17
|
J. M. Mulder, N. T. Quach, and M. J. Flynn. An area model for on-chip memories and its applications. In IEEE Journal of Solid State Circuits, Vol 26, No. 2, Feb. 1991.
|
| |
18
|
|
 |
19
|
|
| |
20
|
P. Shivakumar and N. Jouppi. CACTI 3.0: An integrated cache timing, power and area model. In Technical Report 2001/2, Compaq Computer Corporation, Aug. 2001.
|
 |
21
|
|
| |
22
|
D. Tullsen. Simulation and modeling of a simultaneous multithreading processor. In 22nd Annual Computer Measurement Group Conference, Dec. 1996.
|
CITED BY 15
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Yunlian Jiang , Xipeng Shen , Jie Chen , Rahul Tripathi, Analysis and approximation of optimal co-scheduling on chip multiprocessors, Proceedings of the 17th international conference on Parallel architectures and compilation techniques, October 25-29, 2008, Toronto, Ontario, Canada
|
|
|
Henry Wong , Anne Bracy , Ethan Schuchman , Tor M. Aamodt , Jamison D. Collins , Perry H. Wang , Gautham Chinya , Ankur Khandelwal Groen , Hong Jiang , Hong Wang, Pangaea: a tightly-coupled IA32 heterogeneous chip multiprocessor, Proceedings of the 17th international conference on Parallel architectures and compilation techniques, October 25-29, 2008, Toronto, Ontario, Canada
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Ayse K. Coskun , Richard Strong , Dean M. Tullsen , Tajana Simunic Rosing, Evaluating the impact of job scheduling and power management on processor lifetime for chip multiprocessors, Proceedings of the eleventh international joint conference on Measurement and modeling of computer systems, June 15-19, 2009, Seattle, WA, USA
|
|