An analysis of latent sector errors in disk drives
Source
Joint International Conference on Measurement and Modeling of Computer Systems
Proceedings of the 2007 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
San Diego, California, USA
SESSION: Systems
Pages: 289-300
Year of Publication: 2007
ISBN:978-1-59593-639-4
Authors
Lakshmi N. Bairavasundaram  University of Wisconsin-Madison
Garth R. Goodson  Network Appliance, Inc.
Shankar Pasupathy  Network Appliance, Inc.
Jiri Schindler  Network Appliance, Inc.
Sponsors
SIGMETRICS: ACM Special Interest Group on Measurement and Evaluation
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1254882.1254917

ABSTRACT

The reliability measures in today's disk drive-based storage systems focus predominantly on protecting against complete disk failures. Previous disk reliability studies have analyzed empirical data in an attempt to better understand and predict disk failure rates. Yet, very little is known about the incidence of latent sector errors, i.e., errors that go undetected until the corresponding disk sectors are accessed.

Our study analyzes data collected from production storage systems over 32 months across 1.53 million disks (both nearline and enterprise class). We analyze factors that impact latent sector errors, observe trends, and explore their implications on the design of reliability mechanisms in storage systems. To the best of our knowledge, this is the first study of such large scale (our sample size is at least an order of magnitude larger than previously published studies) and the first one to focus specifically on latent sector errors and their implications on the design and reliability of storage systems.


