StageNetSlice: a reconfigurable microarchitecture building block for resilient CMP systems
Source
International Conference on Compilers, Architecture and Synthesis for Embedded Systems
Proceedings of the 2008 International Conference on Compilers, Architectures and Synthesis for Embedded Systems
Atlanta, GA, USA
SESSION: Resiliency
Pages 1-10  
Year of Publication: 2008
ISBN:978-1-60558-469-0
Authors
Shantanu Gupta  University of Michigan, Ann Arbor, MI, USA
Shuguang Feng  University of Michigan, Ann Arbor, MI, USA
Amin Ansari  University of Michigan, Ann Arbor, MI, USA
Jason Blome  University of Michigan, Ann Arbor, MI, USA
Scott Mahlke  University of Michigan, Ann Arbor, MI, USA
Sponsors
SIGDA: ACM Special Interest Group on Design Automation
ACM: Association for Computing Machinery
SIGBED: ACM Special Interest Group on Embedded Systems
SIGMICRO: ACM Special Interest Group on Microarchitectural Research and Processing
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 8,   Downloads (12 Months): 114,   Citation Count: 2
DOI: http://doi.acm.org/10.1145/1450095.1450099

ABSTRACT

Although CMOS feature size scaling has been the source of dramatic performance gains, it has led to mounting reliability concerns due to increasing power densities and on-chip temperatures. Given that most wearout mechanisms that plague semiconductor devices are highly dependent on these parameters, significantly higher failure rates are projected for future technology generations. Traditional techniques for dealing with device failures have relied on coarse-grained redundancy to maintain service in the face of failed components. In this work, we challenge this practice by identifying its inability to scale to high failure rate scenarios and investigate the advantages of finer-grained configurations. We use this study to motivate the design of StageNet, an embedded CMP architecture designed from its inception with reliability as a first-class design constraint. StageNet relies on a reconfigurable network of replicated processor pipeline stages to maximize the useful lifetime of the chip, gracefully degrading performance toward end of life. This paper addresses the microarchitecture of the basic building block of StageNet, named StageNetSlice, which is a processor core composed of networked pipeline stages. A naive slice design results in approximately 4X slowdown versus a traditional processor due to longer communication delays in the pipeline. However, several small design changes that eliminate inter-stage communication paths and minimize communication bandwidth reduce this overhead to 11% on average while providing high levels of fine-grain adaptability.
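
The abstract's argument that coarse-grained redundancy scales poorly at high failure rates can be illustrated with a toy probability model. The sketch below is not taken from the paper: it assumes every pipeline stage fails independently with the same probability, that the interconnect never fails, and that any working stage of a given type can be paired with any working stage of the other types. The function names and parameters are hypothetical. Under core-level sparing a core is lost as soon as one of its stages fails, while under stage-level sharing the number of usable pipelines is limited only by the scarcest stage type.

```python
from math import comb


def expected_cores_coarse(n_cores: int, n_stages: int, p_fail: float) -> float:
    """Expected working cores when one failed stage disables its whole core."""
    return n_cores * (1.0 - p_fail) ** n_stages


def expected_pipelines_fine(n_cores: int, n_stages: int, p_fail: float) -> float:
    """Expected pipelines when working stages can be regrouped across cores.

    The usable pipeline count is the minimum, over stage types, of the number
    of surviving stages of that type. Each count is Binomial(n_cores, 1 - p),
    so E[min] = sum over k >= 1 of P(count >= k) ** n_stages.
    """
    q = 1.0 - p_fail

    def tail(k: int) -> float:
        # P(at least k of the n_cores copies of one stage type survive)
        return sum(
            comb(n_cores, j) * q**j * p_fail ** (n_cores - j)
            for j in range(k, n_cores + 1)
        )

    return sum(tail(k) ** n_stages for k in range(1, n_cores + 1))


if __name__ == "__main__":
    # Hypothetical configuration: 4 cores with 5 pipeline stages each.
    for p in (0.01, 0.1, 0.3):
        coarse = expected_cores_coarse(4, 5, p)
        fine = expected_pipelines_fine(4, 5, p)
        print(f"p_fail={p:.2f}  coarse={coarse:.3f}  fine={fine:.3f}")
```

In this model the two schemes coincide for a single core and diverge as the per-stage failure probability grows. The model ignores the communication overhead that the abstract quantifies, so it speaks only to availability, not performance.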

