ABSTRACT
This paper presents a novel technique, Memory Mapped ECC, which reduces the cost of providing error correction for SRAM caches. Limiting such overheads is important as processor resources become constrained and error propensity increases. The continuing decrease in SRAM cell size and the growing capacity of caches increase the likelihood of errors in SRAM arrays. To address this, redundant information can be used to correct a value after an error occurs. Information redundancy is typically provided through error-correcting codes (ECC), which append bits to every SRAM row and increase the array's area and energy consumption. We make three observations regarding error protection and use them in our architecture: (1) much of the data in a cache is replicated throughout the hierarchy and is inherently redundant; (2) error detection is necessary for every cache access and is cheaper than error correction, which is very infrequent; (3) redundant information for correction need not be stored in high-cost SRAM. Our architecture dedicates SRAM only to error detection, while the ECC bits are stored within the memory hierarchy as data. We associate a physical memory address with each cache line for ECC storage and rely on locality to minimize the impact. The cache is dynamically and transparently partitioned between data and ECC, with the fraction of ECC growing with the number of dirty cache lines. We show that this has little impact on both performance (1.3% on average and less than 4% at most) and memory traffic (3%) across a range of memory-intensive applications.
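The organization described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the line size, the ECC size per line, the base address of the ECC region, the linear address mapping, and every function name below are assumptions chosen for illustration. The sketch shows a detection-only parity bit of the kind that would stay in SRAM, a per-line physical address for the ECC bits, and why neighboring dirty lines can share one cached line of ECC data.

```python
LINE_BYTES = 64          # assumed cache line size
ECC_BYTES = 8            # assumed ECC bytes per protected line
ECC_BASE = 0x8000_0000   # hypothetical base of the region reserved for ECC


def ecc_address(set_index, way, ways):
    """Physical address holding the ECC bits for the line at (set, way).

    A simple linear mapping, assumed here for illustration only.
    """
    return ECC_BASE + (set_index * ways + way) * ECC_BYTES


def parity_bit(data):
    """Even parity over a whole line: a detection-only code kept in SRAM."""
    acc = 0
    for byte in data:
        acc ^= byte
    return bin(acc).count("1") & 1


def recovery_source(dirty):
    """Where a correct value comes from once detection flags an error."""
    # Clean lines are already replicated lower in the hierarchy, so only
    # dirty lines need the ECC bits stored as data.
    return "memory-mapped ECC" if dirty else "refetch from next level"


def ecc_lines_needed(dirty_lines, ways):
    """Distinct cache lines of ECC data needed for the given dirty lines."""
    return len({ecc_address(s, w, ways) // LINE_BYTES for s, w in dirty_lines})
```

Under these assumed sizes, the ECC for eight adjacent lines fits in one 64-byte line, so `ecc_lines_needed([(0, 0), (0, 1), (0, 7)], ways=8)` evaluates to 1, while dirty lines in different sets of an 8-way cache need separate ECC lines. This is one way to see how the share of the cache holding ECC grows with the number of dirty lines.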