ABSTRACT
The reliability measures in today's disk drive-based storage systems focus predominantly on protecting against complete disk failures. Previous disk reliability studies have analyzed empirical data in an attempt to better understand and predict disk failure rates. Yet very little is known about the incidence of latent sector errors, i.e., errors that go undetected until the corresponding disk sectors are accessed. Our study analyzes data collected from production storage systems over 32 months across 1.53 million disks (both nearline and enterprise class). We analyze factors that impact latent sector errors, observe trends, and explore their implications on the design of reliability mechanisms in storage systems. To the best of our knowledge, this is the first study of such large scale (our sample size is at least an order of magnitude larger than previously published studies) and the first one to focus specifically on latent sector errors and their implications on the design and reliability of storage systems.