Hierarchical Verification for Increasing Performance in Reliable Processors

Source
Journal of Electronic Testing: Theory and Applications
Volume 24, Issue 1-3 (June 2008)
Pages: 117 - 128
Year of Publication: 2008
ISSN: 0923-8174

Authors
Joonhyuk Yoo
Department of Electrical and Computer Engineering, University of Maryland, College Park, USA 20742

Manoj Franklin
Department of Electrical and Computer Engineering, University of Maryland, College Park, USA 20742

Publisher
Kluwer Academic Publishers
Norwell, MA, USA
ABSTRACT
Dynamic verification using the checker processor introduces severe degradation in performance unless the checker is as fast as the main processor core. Without widening the checker's bandwidth, we propose an active verification management (AVM) approach that utilizes a checker hierarchy. Before an instruction is verified at the checker processor, a filter checker marks a correctness non-criticality indicator (CNI) bit to indicate how likely its result is to be unimportant for reliability. AVM uses the CNI information to realize a congestion avoidance policy. Both reactive and proactive congestion avoidance policies are proposed to mitigate the performance degradation caused by the checker's congestion. Based on a simplified queueing model, we evaluate the proposed AVM analytically. Our experimental results show that AVM has the potential to solve the verification congestion problem when perfect fault coverage is not needed. With no AVM, congestion at the checker badly affects performance, to the tune of 57%, when compared to that of a non-fault-tolerant processor. With good marking by AVM, the performance of a reliable processor approaches 95% of that of a processor with no verification. Although instructions can be skipped on a random basis, such an approach reduces the fault coverage. A filter checker with a marking policy correlated with the correctness non-criticality metric, on the other hand, significantly reduces the soft error rate. Finally, we also present results showing the trade-off between performance and reliability.
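The reactive congestion avoidance idea described in the abstract can be illustrated with a toy simulation. The sketch below is not the authors' queueing model or their actual policy; it only assumes that a filter checker has already set a CNI bit on each instruction, and that instructions with the bit set may bypass the checker once the checker queue is congested. All names and parameters (`run_reactive_avm`, `main_width`, `checker_width`, `queue_capacity`, `skip_threshold`) are hypothetical and chosen for illustration.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Instruction:
    seq: int
    cni: bool  # assumed: set by a filter checker when the result is likely non-critical


def run_reactive_avm(instructions, main_width=4, checker_width=2,
                     queue_capacity=8, skip_threshold=6, use_avm=True):
    """Toy model of a main core feeding a slower checker through a queue.

    Returns (cycles, verified, skipped). Every parameter is an assumption
    made for this sketch, not a value taken from the paper.
    """
    queue = deque()                # instructions waiting for verification
    pending = deque(instructions)  # instructions the main core still has to retire
    cycles = verified = skipped = 0

    while pending or queue:
        cycles += 1

        # The checker verifies at most checker_width instructions per cycle.
        for _ in range(min(checker_width, len(queue))):
            queue.popleft()
            verified += 1

        # The main core retires at most main_width instructions per cycle.
        retired = 0
        while pending and retired < main_width:
            instr = pending[0]
            congested = len(queue) >= skip_threshold
            if use_avm and instr.cni and congested:
                # Reactive policy: under congestion, CNI-marked instructions
                # bypass the checker instead of waiting in the queue.
                pending.popleft()
                skipped += 1
            elif len(queue) < queue_capacity:
                queue.append(pending.popleft())
            else:
                break  # queue full: the main core stalls until the checker drains
            retired += 1

    return cycles, verified, skipped


if __name__ == "__main__":
    # Assumed workload: every second instruction is marked non-critical.
    program = [Instruction(seq=i, cni=(i % 2 == 0)) for i in range(40)]
    print("no AVM  :", run_reactive_avm(program, use_avm=False))
    print("with AVM:", run_reactive_avm(program, use_avm=True))
```

In this toy setting, skipping marked instructions under congestion shortens the run because the main core stalls less often, at the cost of leaving the skipped instructions unverified. That is the same performance versus fault coverage trade-off the abstract describes, although the numbers produced here say nothing about the paper's measured results.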
INDEX TERMS
Primary Classification:
B. Hardware
B.8 Performance and Reliability
B.8.1 Reliability, Testing, and Fault-Tolerance
Additional Classification:
B. Hardware
B.7 Integrated Circuits
B.7.1 Types and Design Styles
Subjects: Microprocessors and microcomputers
C. Computer Systems Organization
C.4 Performance of Systems
General Terms: Design, Performance, Reliability, Verification
Keywords: Active verification management, Correctness non-criticality, Fault tolerance, Filter checker, Hierarchical verification, Performance