Improving file system reliability with I/O shepherding
Full text: Flv (30:35), Mp3 (12.85 MB), Pdf (550 KB)
Source
ACM Symposium on Operating Systems Principles archive
Proceedings of twenty-first ACM SIGOPS symposium on Operating systems principles
Stevenson, Washington, USA
SESSION: Storage
Pages: 293 - 306  
Year of Publication: 2007
ISBN:978-1-59593-591-5
Authors
Haryadi S. Gunawi  University of Wisconsin - Madison, Madison, WI
Vijayan Prabhakaran  Microsoft Research - Silicon Valley, Mountain View, CA
Swetha Krishnan  University of Wisconsin - Madison, Madison, WI
Andrea C. Arpaci-Dusseau  University of Wisconsin - Madison, Madison, WI
Remzi H. Arpaci-Dusseau  University of Wisconsin - Madison, Madison, WI
Sponsors
ACM: Association for Computing Machinery
SIGOPS: ACM Special Interest Group on Operating Systems
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 21,   Downloads (12 Months): 186,   Citation Count: 4
DOI: http://doi.acm.org/10.1145/1294261.1294290

APPENDICES and SUPPLEMENTS
p293-slides.zip (34.87 MB): Supplemental material for Improving file system reliability with I/O shepherding


ABSTRACT

We introduce a new reliability infrastructure for file systems called I/O shepherding. I/O shepherding allows a file system developer to craft nuanced reliability policies to detect and recover from a wide range of storage system failures. We incorporate shepherding into the Linux ext3 file system through a set of changes to the consistency management subsystem, layout engine, disk scheduler, and buffer cache. The resulting file system, CrookFS, enables a broad class of policies to be easily and correctly specified. We implement numerous policies, incorporating data protection techniques such as retry, parity, mirrors, checksums, sanity checks, and data structure repairs; even complex policies can be implemented in less than 100 lines of code, confirming the power and simplicity of the shepherding framework. We also demonstrate that shepherding is properly integrated, adding less than 5% overhead to the I/O path.
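To make the idea of a composed reliability policy concrete, here is a minimal sketch of a read policy that combines three of the techniques the abstract names: retry, checksums, and mirrors. It is an illustration only and is not the CrookFS interface; the function name shepherd_read, the BlockReader type, the use of CRC-32 as the checksum, and the retry count are all assumptions made for this example.

```python
import zlib
from typing import Callable

# Hypothetical type: a reader takes a block number and returns its bytes,
# or raises IOError when the read fails.
BlockReader = Callable[[int], bytes]


def shepherd_read(block: int,
                  primary: BlockReader,
                  mirror: BlockReader,
                  expected_crc: int,
                  max_retries: int = 2) -> bytes:
    """Illustrative read policy (not the CrookFS API): retry the primary
    copy on failed reads, verify a checksum, then fall back to a mirror."""
    for _ in range(1 + max_retries):
        try:
            data = primary(block)
        except IOError:
            continue  # failed read: try the primary copy again
        if zlib.crc32(data) == expected_crc:
            return data
        # Checksum mismatch: this sketch assumes rereading the same copy
        # will not help, so it stops retrying and goes to the mirror.
        break
    data = mirror(block)  # may raise IOError, which the caller then sees
    if zlib.crc32(data) != expected_crc:
        raise IOError(f"block {block}: both copies failed verification")
    return data
```

The ordering here (retry first, then checksum, then mirror) is one possible policy choice. A different policy might consult the mirror before retrying, or attempt a repair of the primary copy after a successful mirror read.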



