Reliability modeling and management in dynamic microprocessor-based systems
Source: Annual ACM IEEE Design Automation Conference
Proceedings of the 43rd annual Design Automation Conference
San Francisco, CA, USA
SESSION: Session 58: advanced methods for interconnect extraction, clocks and reliability
Pages: 1057-1060
Year of Publication: 2006
ISBN: 1-59593-381-6
Authors
Eric Karl, University of Michigan, Ann Arbor, MI
David Blaauw, University of Michigan, Ann Arbor, MI
Dennis Sylvester, University of Michigan, Ann Arbor, MI
Trevor Mudge, University of Michigan, Ann Arbor, MI
Sponsors
SIGDA: ACM Special Interest Group on Design Automation
ACM: Association for Computing Machinery
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1146909.1147174

ABSTRACT

Reliability failure mechanisms such as time-dependent dielectric breakdown, electromigration, and thermal cycling have become a key concern in processor design. The traditional approach to reliability qualification assumes that the processor will operate at maximum performance continuously under worst-case voltage and temperature conditions. However, a typical processor spends only a very small fraction of its operational time at maximum voltage and temperature. In this paper, we show that this gap results in a reliability "slack" that can be leveraged to provide increased performance during periods of peak processor demand. We develop a novel real-time reliability model based on workload-driven conditions. We then propose a new dynamic reliability management (DRM) scheme that yields a 20-35% performance improvement during periods of peak computational demand while ensuring the required reliability lifetime.
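
The reliability "slack" described in the abstract can be illustrated with a minimal sketch. It assumes a linear cumulative-damage model in which each operating condition wears the part at a fixed rate relative to the worst-case qualification condition (rate 1.0). This is an illustrative assumption, not the real-time model the paper develops, and every name and wear rate below is hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatingPoint:
    """A hypothetical voltage/temperature condition.

    wear_rate is damage accrued per hour, normalized so that the
    worst-case qualification condition has wear_rate == 1.0.
    """
    name: str
    wear_rate: float


def consumed_damage(history):
    """Total damage, in worst-case-equivalent hours, over (point, hours) pairs."""
    return sum(point.wear_rate * hours for point, hours in history)


def reliability_slack(history):
    """Damage budget accrued so far minus damage actually consumed.

    Qualification assumes worst-case operation at all times, so the
    budget grows by one worst-case-equivalent hour per elapsed hour.
    """
    elapsed = sum(hours for _, hours in history)
    return elapsed - consumed_damage(history)


def boost_hours_available(history, boost):
    """Hours the part could run at `boost` before the slack is used up."""
    slack = reliability_slack(history)
    if slack <= 0:
        return 0.0
    extra_rate = boost.wear_rate - 1.0  # wear beyond the budgeted rate
    if extra_rate <= 0:
        return math.inf  # at or below worst case, no slack is consumed
    return slack / extra_rate


# Assumed example values, chosen only for illustration.
IDLE = OperatingPoint("idle", 0.25)
NOMINAL = OperatingPoint("nominal", 0.5)
BOOST = OperatingPoint("boost", 3.0)

history = [(IDLE, 600.0), (NOMINAL, 400.0)]

print(consumed_damage(history))               # 350.0
print(reliability_slack(history))             # 650.0
print(boost_hours_available(history, BOOST))  # 325.0
```

In this sketch, 1000 hours of light use consume only 350 worst-case-equivalent hours, leaving 650 hours of slack that could fund 325 hours at a condition three times as damaging as the qualification point. A practical scheme would need mechanism-specific wear models and sensor inputs, which this example does not attempt.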



