Integrated code and data placement in two-dimensional mesh based chip multiprocessors
Source
International Conference on Computer Aided Design
Proceedings of the 2008 IEEE/ACM International Conference on Computer-Aided Design
San Jose, California
SESSION: Advances in embedded systems
Pages 583-588
Year of Publication: 2008
ISSN: 1092-3152, ISBN: 978-1-4244-2820-5
Publisher
IEEE Press
Piscataway, NJ, USA
ABSTRACT
As transistor sizes continue to shrink and the number of transistors per chip keeps increasing, chip multiprocessors (CMPs) are becoming a promising way to stay on the current performance trajectory for both high-end and embedded systems. Because future technologies promise to integrate billions of transistors on a chip, placing hundreds to thousands of processors on a single chip, along with an underlying memory hierarchy and an interconnection system, is entirely feasible. This paper proposes a compiler-directed integrated code and data placement scheme for two-dimensional mesh-based CMP architectures. The proposed approach uses a Code-Data Affinity Graph (CDAG) to represent the relationship between loop iterations and array data, and then assigns sets of loop iterations to processing cores and sets of data blocks to on-chip memories. The mapping process takes into account the on-chip memory capacity, the load imbalance across cores, and the topology of the network-on-chip (NoC). We present two variants of our approach, depth-first placement (DFP) and breadth-first placement (BFP), and compare them to three alternative code/data mapping schemes. The experimental evaluation shows that our CDAG-based placement schemes achieve average performance improvements of 19.9% (DFP) and 16.8% (BFP), and average energy improvements of 29.7% (DFP) and 27.8% (BFP).
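The abstract does not spell out how the CDAG is built or traversed, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes the graph is bipartite (iteration sets on one side, data blocks on the other) with edge weights equal to access counts, that DFP and BFP differ only in the order in which graph nodes are visited, and that each node is placed greedily on the mesh tile with the lowest weighted Manhattan distance to its already-placed neighbours. All function names, the capacity parameters `core_slots` and `mem_slots`, and the sample access counts are hypothetical.

```python
from collections import defaultdict, deque

def build_cdag(accesses):
    """Build an undirected affinity graph from (iteration_set, data_block, count) triples."""
    graph = defaultdict(dict)
    for it, blk, count in accesses:
        u, v = ("iter", it), ("data", blk)
        graph[u][v] = graph[u].get(v, 0) + count
        graph[v][u] = graph[v].get(u, 0) + count
    return graph

def traversal_order(graph, start, breadth_first):
    """Visit nodes reachable from start in BFS or DFS order, heaviest edges first."""
    order, seen = [], set()
    frontier = deque([start])
    while frontier:
        node = frontier.popleft() if breadth_first else frontier.pop()
        if node in seen:
            continue
        seen.add(node)
        order.append(node)
        nbrs = sorted(graph[node], key=lambda n: (-graph[node][n], n))
        # For DFS, push in reverse so the heaviest neighbour is popped next.
        frontier.extend(nbrs if breadth_first else reversed(nbrs))
    return order

def manhattan(a, b):
    """Hop count between two tiles of a 2D mesh (assumes shortest-path routing)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def place(graph, order, rows, cols, core_slots, mem_slots):
    """Greedily map each node to the cheapest tile that still has capacity."""
    tiles = [(r, c) for r in range(rows) for c in range(cols)]
    used = {t: {"iter": 0, "data": 0} for t in tiles}
    limit = {"iter": core_slots, "data": mem_slots}  # assumed per-tile capacities
    where = {}
    for node in order:
        kind = node[0]
        free = [t for t in tiles if used[t][kind] < limit[kind]]
        if not free:
            raise ValueError("mesh capacity exhausted")

        def cost(tile):
            # Weighted distance to neighbours that are already placed.
            return sum(w * manhattan(tile, where[n])
                       for n, w in graph[node].items() if n in where)

        # Ties go to the least-loaded tile, a crude stand-in for load balancing.
        best = min(free, key=lambda t: (cost(t), sum(used[t].values()), t))
        where[node] = best
        used[best][kind] += 1
    return where

def total_cost(graph, where):
    """Sum of access count times hop distance over all edges (each stored twice)."""
    return sum(w * manhattan(where[u], where[v])
               for u in graph for v, w in graph[u].items()) // 2

# Hypothetical access counts: (iteration set, data block, number of accesses).
accesses = [(0, 0, 10), (0, 1, 2), (1, 1, 8), (1, 2, 1)]
cdag = build_cdag(accesses)
for name, bfs in (("DFP-like", False), ("BFP-like", True)):
    order = traversal_order(cdag, ("iter", 0), bfs)
    mapping = place(cdag, order, rows=2, cols=2, core_slots=1, mem_slots=1)
    print(name, total_cost(cdag, mapping))
```

Splitting traversal from placement keeps the two variants comparable: only the visiting order changes, while the capacity check and the distance-based cost stay the same. A real compiler pass would also need the CDAG to be derived from loop and array analysis, which this sketch takes as given.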