ABSTRACT
While Moore's Law predicts the ability of the semiconductor industry to engineer smaller and more efficient transistors and circuits, there are serious issues that the law does not contemplate. One concern is the verification effort for modern computing systems, which has grown to dominate the cost of system design. In addition, technology scaling is leading to the phase-out of burn-in. As a result, the in-the-field error rate may increase due to both actual errors and latent defects. Whereas data can be protected with arithmetic codes (like parity or ECC), there is a lack of cost-effective mechanisms for control logic. This paper presents a lightweight microarchitectural mechanism that ensures that data consumed through registers are correct. Microarchitecture presents a new way to manage reliability and testing without significantly sacrificing cost and performance, offering a unique opportunity to detect errors in the field at low cost. Our results show coverage of around 90% for the targeted structures at a cost in power and area of about 4%. The protected structures include the issue queue logic and its associated data (i.e., tags, control signals), input multiplexors, rename data, replay logic, register free list, bypass data and logic, MOB data and addresses, register file logic, register file storage, and functional units.