ABSTRACT
The performance of the various cache coherence protocols proposed in the literature has been extensively analyzed in the context of high-performance multi-processor systems. A similar analysis for Multi-Processor Systems-on-Chips (MP-SoCs), where energy is at least as important as performance and where strict constraints on hardware and software resources apply, has not yet been done. This work takes a step in that direction, showing energy/performance tradeoffs for different snoop-based protocols on a realistic MP-SoC architecture. The analysis leverages a multi-processor simulation platform, augmented with accurate power models, that allows cycle-accurate simulations. Our analysis shows that (i) the cache write policy is more important than the cache coherence protocol itself, and (ii) matching the programming model and style to the architecture may have dramatic effects on the energy and performance of the system.