Transient fault prediction based on anomalies in processor events
Source: Design, Automation, and Test in Europe
Proceedings of the conference on Design, automation and test in Europe
Nice, France
SESSION: Reliable microarchitectures
Pages: 1140 - 1145
Year of Publication: 2007
ISBN: 978-3-9810801-2-4
Authors
Satish Narayanasamy  University of California, San Diego
Ayse K. Coskun  University of California, San Diego
Brad Calder  University of California, San Diego and Microsoft
Sponsors
IEEE Council on Electronic Design Automation (CEDA)
SIGDA: ACM Special Interest Group on Design Automation
The EDA Consortium
EDAA: European Design and Automation Association
RAS
The IEEE Computer Society TTTC
ECSI
Publisher
EDA Consortium  San Jose, CA, USA

ABSTRACT

Future microprocessors will be highly susceptible to transient errors as transistor sizes shrink with CMOS scaling. Prior techniques advocated full-scale structural or temporal redundancy to achieve fault tolerance. Although these techniques can provide complete fault coverage, they incur significant hardware and/or performance cost. It is desirable to have mechanisms that provide partial but sufficiently high fault coverage at negligible cost.

To meet this goal, we propose leveraging speculative structures that already exist in modern processors. The proposed mechanism is based on the insight that when a fault occurs, the incorrect execution is likely to produce an abnormally high or low number of mispredictions (branch mispredictions, L2 misses, store set mispredictions) compared with a correct execution. We design a simple transient fault predictor that detects anomalous behavior in the outcomes of the speculative structures to predict transient faults.
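The abstract does not specify how the predictor is built, so the following is only an illustrative software sketch of the general idea, not the authors' hardware design. It assumes event counts are sampled once per fixed execution interval, keeps a running mean and standard deviation per event type, and flags an interval whose count deviates from that baseline by more than a chosen number of standard deviations in either direction. The names `EventStats` and `FaultPredictor`, the `threshold` and `warmup` parameters, and the event keys are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class EventStats:
    """Running mean and variance of one event count (Welford's algorithm)."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0

    def update(self, x: float) -> None:
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def std(self) -> float:
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0


class FaultPredictor:
    """Hypothetical anomaly detector over per-interval event counts."""

    def __init__(self, threshold: float = 4.0, warmup: int = 8):
        self.threshold = threshold  # assumed: allowed deviation in std units
        self.warmup = warmup        # assumed: intervals used to build a baseline
        self.stats: dict[str, EventStats] = {}

    def observe(self, counts: dict[str, int]) -> bool:
        """Return True if any event count looks anomalous for this interval."""
        suspicious = False
        for event, count in counts.items():
            s = self.stats.setdefault(event, EventStats())
            if s.n >= self.warmup:
                # Floor the spread so a near-constant history does not
                # turn a one-count wobble into an alarm.
                spread = max(s.std(), 1.0)
                if abs(count - s.mean) > self.threshold * spread:
                    suspicious = True
                    continue  # keep the anomalous sample out of the baseline
            s.update(count)
        return suspicious
```

In use, `observe` would be called once per interval with counts such as `{"branch_mispredictions": 103, "l2_misses": 51, "store_set_mispredictions": 5}`; a `True` result would correspond to a predicted transient fault, after which a real system would need some recovery action such as re-execution. The two-sided test mirrors the abstract's point that a fault may cause either more or fewer mispredictions than normal.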

