Dependable ≠ unaffordable
Source: Architectural Support for Programming Languages and Operating Systems
Proceedings of the 1st workshop on Architectural and system support for improving software dependability
San Jose, California
Pages: 58 - 62  
Year of Publication: 2006
ISBN:1-59593-576-2
Authors
Alan L. Cox  Rice University, Houston, TX
Kartik Mohanram  Rice University, Houston, TX
Scott Rixner  Rice University, Houston, TX
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1181309.1181318

ABSTRACT

This paper presents a software architecture for hardware fault tolerance based on loosely-synchronized, redundant virtual machines (LSRVM). LSRVM will provide high levels of reliability by tolerating hardware faults at all levels of the system. Historically, such hardware fault tolerance has only been achievable using custom-designed hardware and proprietary operating systems. Today, however, technological trends and economic factors are driving a reduction in the amount of custom-designed hardware. We believe that this path should be followed to its ultimate conclusion: a highly-available, fault-tolerant computing system based entirely on commodity hardware and open-source operating systems. Our revolutionary approach utilizes virtualization to efficiently provide redundancy on modern commodity hardware. When combined with existing application-level fault tolerance mechanisms, LSRVM will provide very high levels of reliability at extremely low cost.



