A tuneable software cache coherence protocol for heterogeneous MPSoCs
Source
International Conference on Hardware Software Codesign
Proceedings of the 7th IEEE/ACM International Conference on Hardware/Software Codesign and System Synthesis
Grenoble, France
SESSION: Exploring the hardware software boundaries for MPSoC design
Pages 383-392
Year of Publication: 2009
ISBN: 978-1-60558-628-1
Authors
Frank Ophelders  Eindhoven University of Technology, Eindhoven, Netherlands
Marco J.G. Bekooij  NXP Semiconductors, Eindhoven, Netherlands
Henk Corporaal  Eindhoven University of Technology, Eindhoven, Netherlands
Sponsors
ACM: Association for Computing Machinery
SIGBED: ACM Special Interest Group on Embedded Systems
SIGMICRO: ACM Special Interest Group on Microarchitectural Research and Processing
SIGDA: ACM Special Interest Group on Design Automation
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1629435.1629488

ABSTRACT

In a multiprocessor system-on-chip (MPSoC), private caches introduce the cache coherence problem. Here, we target heterogeneous MPSoCs with a network-on-chip (NoC). Existing hardware cache coherence protocols are less suitable for MPSoCs because many off-the-shelf processors used in MPSoCs do not support these protocols. Furthermore, these protocols typically rely on global visibility and serialization of writes, which does not match well with the parallel point-to-point communication provided by a NoC. Therefore, we propose a software cache coherence protocol that can be applied in a heterogeneous MPSoC with a NoC. The protocol relies on explicit synchronization in the software. More specifically, caches are guaranteed to be coherent according to the Release Consistency model, on top of which we have implemented the standard Pthreads communication library. Heterogeneous MPSoCs with off-the-shelf processors can easily be supported because processors are only required to provide cache control operations, e.g., clean and invalidate. All cache coherence operations are interruptible and do not impact the execution of tasks on other processors; therefore, this protocol is suitable for predictable MPSoCs. Our software cache coherence protocol is implemented on an ARM926EJ-S MPSoC that is mapped onto an FPGA. From experiments we conclude that the protocol overhead is low for the applications taken from the SPLASH-2 benchmark set. For these applications we observed a speedup between 1.89 and 2.01 on the two-processor MPSoC.
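
The idea of tying cache control operations to explicit synchronization can be illustrated with a small, single-threaded simulation. This is only a sketch under stated assumptions, not the authors' protocol: it assumes that a release cleans (writes back) the releasing processor's dirty lines and that an acquire invalidates the acquiring processor's cache, which is one common way to obtain Release Consistency in software. The names `cache_t`, `cache_clean`, `cache_invalidate`, `sw_acquire`, `sw_release`, and `run_demo` are hypothetical and do not come from the paper or from any ARM or Pthreads API.

```c
#include <assert.h>
#include <stdbool.h>

#define MEM_WORDS 8

/* Shared memory visible to every simulated processor. */
static int shared_mem[MEM_WORDS];

/* Hypothetical private write-back cache with one word per line. */
typedef struct {
    int  data[MEM_WORDS];
    bool valid[MEM_WORDS];
    bool dirty[MEM_WORDS];
} cache_t;

/* Read through the cache; a miss fetches the word from shared memory. */
static int cache_read(cache_t *c, int addr) {
    if (!c->valid[addr]) {
        c->data[addr]  = shared_mem[addr];
        c->valid[addr] = true;
        c->dirty[addr] = false;
    }
    return c->data[addr];
}

/* Write into the cache only; shared memory is updated later by a clean. */
static void cache_write(cache_t *c, int addr, int value) {
    c->data[addr]  = value;
    c->valid[addr] = true;
    c->dirty[addr] = true;
}

/* Clean: write every dirty line back to shared memory. */
static void cache_clean(cache_t *c) {
    for (int a = 0; a < MEM_WORDS; a++) {
        if (c->valid[a] && c->dirty[a]) {
            shared_mem[a] = c->data[a];
            c->dirty[a]   = false;
        }
    }
}

/* Invalidate: drop every line so the next read refetches it.
 * Assumes dirty data was already cleaned by an earlier release. */
static void cache_invalidate(cache_t *c) {
    for (int a = 0; a < MEM_WORDS; a++) {
        c->valid[a] = false;
        c->dirty[a] = false;
    }
}

/* A plain flag stands in for a real lock in this sequential simulation. */
static bool lock_held = false;

/* Assumed acquire: take the lock, then discard possibly stale lines. */
static void sw_acquire(cache_t *c) {
    assert(!lock_held);
    lock_held = true;
    cache_invalidate(c);
}

/* Assumed release: publish local writes, then give up the lock. */
static void sw_release(cache_t *c) {
    assert(lock_held);
    cache_clean(c);
    lock_held = false;
}

/* Two simulated processors share word 0 of shared_mem. */
void run_demo(int *stale, int *fresh) {
    cache_t p0 = {0}, p1 = {0};

    shared_mem[0] = 1;
    (void)cache_read(&p1, 0);      /* p1 caches the old value */

    sw_acquire(&p0);
    cache_write(&p0, 0, 42);       /* p0 updates inside a critical section */
    sw_release(&p0);               /* the clean makes 42 visible in memory */

    *stale = cache_read(&p1, 0);   /* no acquire yet: p1 hits its old line */

    sw_acquire(&p1);               /* the invalidate drops the old line */
    *fresh = cache_read(&p1, 0);   /* miss, refetched from shared memory */
    sw_release(&p1);
}
```

Calling `run_demo(&stale, &fresh)` from a `main` function shows the contract: a read that is not ordered by an acquire may return old data, while a read after an acquire observes everything written before the matching release. In this sketch each operation touches only the calling processor's own cache, which is consistent with the abstract's remark that coherence operations do not affect tasks on other processors.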

