ABSTRACT
We report Introspection, a register-transfer-level technique for concurrent error detection and diagnosis (CEDD) in data-dominated designs. Introspection synergistically uses idle computation cycles in the data path and idle data transfer cycles in the interconnection network for CEDD. The resulting on-chip fault latencies are one ten-thousandth (10^-4) of previously reported system-level concurrent error detection and diagnosis latencies. The associated area overhead and performance penalty are negligible. We derive a cost function that accounts for introspection constraints such as (i) executing an operation on three disjoint function units for diagnosis and (ii) encouraging each function unit to participate in at least one CEDD operation. We formulate the integration of these constraints into the operation-to-operator binding phase of high-level synthesis as a weighted bipartite matching problem. The effectiveness of introspection and its implementation are illustrated on numerous industrial-strength benchmarks.
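The binding formulation mentioned above can be illustrated with a minimal sketch. It is not the paper's cost function: the weights, the names `bind_operations` and `make_cost`, and the data layout are all assumptions chosen for illustration. Operations scheduled in one control step are matched to function units so that the total edge weight is minimal; a large penalty discourages placing a redundant copy on a unit that already ran the same operation (constraint i), and a small reward favors units that have not yet taken part in any CEDD operation (constraint ii). The matching is found by brute force, which is adequate only for tiny instances.

```python
from itertools import permutations

BIG = 1000  # hypothetical penalty for reusing a unit across copies of one operation


def make_cost(parent, used_units, cedd_covered):
    """Build an illustrative edge-weight function for (operation, function unit) pairs."""
    def cost(op, fu):
        c = 2  # base binding cost (assumed uniform)
        # Constraint (i): copies of one operation should run on disjoint units.
        if fu in used_units.get(parent[op], set()):
            c += BIG
        # Constraint (ii): reward redundant copies placed on units that have
        # not yet participated in any CEDD operation.
        if parent[op] != op and fu not in cedd_covered:
            c -= 1
        return c
    return cost


def bind_operations(ops, units, cost):
    """Minimum-weight bipartite matching of ops to units by exhaustive search."""
    best_total, best_binding = None, None
    for perm in permutations(units, len(ops)):
        total = sum(cost(op, fu) for op, fu in zip(ops, perm))
        if best_total is None or total < best_total:
            best_total, best_binding = total, dict(zip(ops, perm))
    return best_total, best_binding


# "a'" is a redundant copy of operation "a", which already ran on fu1;
# "b" is an ordinary operation. fu2 already took part in a CEDD operation.
parent = {"a'": "a", "b": "b"}
used_units = {"a": {"fu1"}}
cedd_covered = {"fu2"}
units = ["fu1", "fu2", "fu3"]

total, binding = bind_operations(["a'", "b"], units,
                                 make_cost(parent, used_units, cedd_covered))
print(total, binding)
```

In this toy instance the copy `a'` is steered to `fu3`, the only unit that neither ran `a` before nor has been covered by a CEDD operation. For realistic problem sizes a polynomial-time assignment algorithm (for example the Hungarian method) would replace the exhaustive search.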