Distributed and low-power synchronization architecture for embedded multiprocessors
Source
International Conference on Hardware Software Codesign
Proceedings of the 6th IEEE/ACM/IFIP international conference on Hardware/Software codesign and system synthesis
Atlanta, GA, USA
SESSION: Multiprocessor and MPSoC architectures
Pages 73-78
Year of Publication: 2008
ISBN: 978-1-60558-470-6
Authors
Chenjie Yu, University of Maryland, College Park, MD, USA
Peter Petrov, University of Maryland, College Park, MD, USA
ABSTRACT
In this paper we present a framework for a distributed and very low-cost implementation of synchronization controllers and protocols for embedded multiprocessors. The proposed architecture effectively implements the queued-lock semantics in a completely distributed way. The proposed approach to synchronization implementation not only completely eliminates the overwhelming bus contention traffic when multiple cores compete for a synchronization variable, but also achieves very high energy efficiency as the local synchronization controller can efficiently determine, without any bus transactions or local cache spinning, the exact timing of when the lock is made available to the local processor. Application-specific information regarding synchronization variables in the local task is exploited in implementing the distributed synchronization protocol. The local synchronization controllers enable the system software or the thread library to implement various low-power policies, such as disabling the cache accesses or even completely powering down the local processor while waiting for a synchronization variable.
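The queued-lock semantics mentioned in the abstract can be illustrated with a small software model. The sketch below is only an assumption-laden toy, not the authors' hardware design: the names `QueuedLock`, `sleeping`, and `grant_order` are hypothetical, and the model simply shows a lock that is granted in arrival order and wakes exactly one waiter on release, so that waiting cores never need to poll.

```python
from collections import deque


class QueuedLock:
    """Toy model of queued-lock (FIFO) semantics; not the paper's hardware."""

    def __init__(self):
        self.owner = None        # core currently holding the lock
        self.waiters = deque()   # waiting cores, in arrival order
        self.sleeping = set()    # cores assumed idle or powered down while waiting

    def acquire(self, core):
        """Return True if the lock is granted immediately, else queue the core."""
        if self.owner is None:
            self.owner = core
            return True
        self.waiters.append(core)
        self.sleeping.add(core)  # the waiter does no polling in this model
        return False

    def release(self, core):
        """Release the lock and hand it to the oldest waiter, if any."""
        if self.owner != core:
            raise ValueError("only the owner may release the lock")
        self.owner = self.waiters.popleft() if self.waiters else None
        if self.owner is not None:
            self.sleeping.discard(self.owner)  # wake exactly one core
        return self.owner


def grant_order(requests):
    """Cores request in the given order; each releases once it holds the lock."""
    lock = QueuedLock()
    for core in requests:
        lock.acquire(core)
    order = []
    while lock.owner is not None:
        order.append(lock.owner)
        lock.release(lock.owner)
    return order
```

Calling `grant_order([2, 0, 3, 1])` returns the cores in the same order in which they requested the lock. In this model a waiting core stays in `sleeping` until the release that hands it the lock, which loosely mirrors the low-power policies the abstract describes; how the actual controllers track queue position is not stated in the abstract and is not modeled here.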