Flexible cache error protection using an ECC FIFO
Source: Conference on High Performance Networking and Computing
Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis
Portland, Oregon
SESSION: Technical papers
Article No.: 49  
Year of Publication: 2009
ISBN:978-1-60558-744-8
Authors
Doe Hyun Yoon  The University of Texas at Austin
Mattan Erez  The University of Texas at Austin
Sponsors
SIGARCH: ACM Special Interest Group on Computer Architecture
IEEE CS
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1654059.1654109

ABSTRACT

We present ECC FIFO, a mechanism that enables two-tiered last-level cache error protection using an arbitrarily strong tier-2 code without increasing on-chip storage. Instead of adding redundant ECC information to each cache line, ECC FIFO off-loads the extra information to off-chip DRAM. We augment each cache line with a tier-1 code, which provides error detection while consuming limited resources. The redundancy required for strong protection is provided by a tier-2 code placed in off-chip memory. Because errors that require tier-2 correction are rare, the cost of accessing DRAM for correction matters little. We show that this method can save 15-25% of on-chip cache area and 10-17% of on-chip cache power while minimally impacting performance, which decreases by 1% on average across a range of scientific and consumer benchmarks.
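
The two-tier split described in the abstract can be illustrated with a small behavioral sketch. This is a toy model under stated assumptions, not the authors' design: the abstract does not name the codes, so per-byte parity stands in for the tier-1 detection code and a single XOR checksum stands in for the tier-2 code. The class `EccFifoModel`, its methods, the bounded `deque` that plays the role of the off-chip FIFO, and the policy of giving up when a tier-2 entry has been overwritten are all hypothetical choices made for illustration.

```python
from collections import deque


def byte_parities(data):
    """Tier-1 stand-in (assumption): one parity bit per byte, detection only."""
    return [bin(b).count("1") & 1 for b in data]


def xor_checksum(data):
    """Tier-2 stand-in (assumption): XOR of all bytes.

    It can rebuild one byte once tier-1 has pointed out which byte is bad.
    """
    acc = 0
    for b in data:
        acc ^= b
    return acc


class EccFifoModel:
    """Toy model: tier-1 bits stay with the line, tier-2 bits go to a FIFO."""

    def __init__(self, fifo_entries):
        self.lines = {}                          # addr -> (data, tier-1 bits), "on-chip"
        self.fifo = deque(maxlen=fifo_entries)   # (addr, tier-2 bits), "off-chip" stand-in

    def write(self, addr, data):
        self.lines[addr] = (bytearray(data), byte_parities(data))
        # When the FIFO is full, appending silently drops the oldest entry.
        self.fifo.append((addr, xor_checksum(data)))

    def read(self, addr):
        data, tier1 = self.lines[addr]
        bad = [i for i, p in enumerate(byte_parities(data)) if p != tier1[i]]
        if not bad:
            return bytes(data)                   # common case: no off-chip access
        if len(bad) > 1:
            raise RuntimeError("beyond what this toy tier-2 code can correct")
        # Rare case: look up the newest tier-2 entry for this line.
        for entry_addr, checksum in reversed(self.fifo):
            if entry_addr == addr:
                others = 0
                for j, b in enumerate(data):
                    if j != bad[0]:
                        others ^= b
                data[bad[0]] = checksum ^ others  # rebuild the faulty byte
                return bytes(data)
        raise RuntimeError("tier-2 entry no longer in the FIFO")


model = EccFifoModel(fifo_entries=4)
model.write(0x40, b"cacheline")
model.lines[0x40][0][3] ^= 0x01                  # inject a single-bit fault
recovered = model.read(0x40)                     # tier-1 detects, tier-2 corrects
```

In this sketch only the faulty read touches the FIFO, which mirrors the abstract's argument that tier-2 accesses are rare. A stronger tier-2 code could be swapped in without changing what is stored alongside each line.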


