Improving the reliability of commodity operating systems
Source: ACM Symposium on Operating Systems Principles archive
Proceedings of the nineteenth ACM symposium on Operating systems principles
Bolton Landing, NY, USA
SESSION: Making operating systems more robust
Pages: 207 - 222
Year of Publication: 2003
ISBN:1-58113-757-5
Authors
Michael M. Swift  University of Washington, Seattle, WA
Brian N. Bershad  University of Washington, Seattle, WA
Henry M. Levy  University of Washington, Seattle, WA
Sponsors
SIGOPS: ACM Special Interest Group on Operating Systems
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/945445.945466

ABSTRACT

Despite decades of research in extensible operating system technology, extensions such as device drivers remain a significant cause of system failures. In Windows XP, for example, drivers account for 85% of recently reported failures. This paper describes Nooks, a reliability subsystem that seeks to greatly enhance OS reliability by isolating the OS from driver failures. The Nooks approach is practical: rather than guaranteeing complete fault tolerance through a new (and incompatible) OS or driver architecture, our goal is to prevent the vast majority of driver-caused crashes with little or no change to existing driver and system code. To achieve this, Nooks isolates drivers within lightweight protection domains inside the kernel address space, where hardware and software prevent them from corrupting the kernel. Nooks also tracks a driver's use of kernel resources to hasten automatic clean-up during recovery.

To prove the viability of our approach, we implemented Nooks in the Linux operating system and used it to fault-isolate several device drivers. Our results show that Nooks offers a substantial increase in the reliability of operating systems, catching and quickly recovering from many faults that would otherwise crash the system. In a series of 2000 fault-injection tests, Nooks recovered automatically from 99% of the faults that caused Linux to crash.

While Nooks was designed for drivers, our techniques generalize to other kernel extensions as well. We demonstrate this by isolating a kernel-mode file system and an in-kernel Internet service. Overall, because Nooks supports existing C-language extensions, runs on a commodity operating system and hardware, and enables automated recovery, it represents a substantial step beyond the specialized architectures and type-safe languages required by previous efforts directed at safe extensibility.
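The resource-tracking idea in the abstract can be illustrated with a toy model. The sketch below is not Nooks code and makes no claim about how Nooks is implemented: the names `ResourceTracker`, `run_isolated`, and `buggy_driver` are hypothetical, and a Python exception stands in for a driver fault that real hardware and kernel software would have to catch. It only shows why recording what a driver has acquired makes clean-up after a fault mechanical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ResourceTracker:
    """Toy per-driver record of held resources (hypothetical, not Nooks code)."""
    # Maps a resource name to the function that releases it.
    live: Dict[str, Callable[[], None]] = field(default_factory=dict)

    def acquire(self, name: str, release: Callable[[], None]) -> None:
        # Remember how to undo this acquisition.
        self.live[name] = release

    def release(self, name: str) -> None:
        # Normal path: the driver gives the resource back itself.
        self.live.pop(name)()

    def recover(self) -> List[str]:
        """Release everything the driver still holds, newest first."""
        reclaimed = []
        while self.live:
            name, release = self.live.popitem()  # dicts pop in LIFO order
            release()
            reclaimed.append(name)
        return reclaimed


def run_isolated(driver: Callable[[ResourceTracker], None],
                 tracker: ResourceTracker) -> List[str]:
    """Run driver code; on a simulated fault, clean up instead of propagating it."""
    try:
        driver(tracker)
        return []
    except Exception:
        # Assumption: an exception models a fault caught at the isolation boundary.
        return tracker.recover()


freed: List[str] = []


def buggy_driver(t: ResourceTracker) -> None:
    t.acquire("irq", lambda: freed.append("irq"))
    t.acquire("dma_buffer", lambda: freed.append("dma_buffer"))
    raise RuntimeError("simulated driver fault")


reclaimed = run_isolated(buggy_driver, ResourceTracker())
```

Here the caller survives the simulated fault, and `reclaimed` lists the resources that were released on the driver's behalf, in reverse order of acquisition.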


