ASSURE: automatic software self-healing using rescue points
Source
Architectural Support for Programming Languages and Operating Systems
Proceedings of the 14th international conference on Architectural support for programming languages and operating systems
Washington, DC, USA
SESSION: Reliable systems I
Pages 37-48
Year of Publication: 2009
ISBN: 978-1-60558-406-5
Authors
Stelios Sidiroglou  Columbia University, New York, USA
Oren Laadan  Columbia University, New York, USA
Carlos Perez  Columbia University, New York, USA
Nicolas Viennot  Columbia University, New York, USA
Jason Nieh  Columbia University, New York, USA
Angelos D. Keromytis  Columbia University, New York, USA
Sponsors
SIGPLAN: ACM Special Interest Group on Programming Languages
SIGOPS: ACM Special Interest Group on Operating Systems
ACM: Association for Computing Machinery
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 32,   Downloads (12 Months): 289,   Citation Count: 0
DOI: http://doi.acm.org/10.1145/1508244.1508250

ABSTRACT

Software failures in server applications are a significant problem for preserving system availability. We present ASSURE, a system that introduces rescue points that recover software from unknown faults while maintaining both system integrity and availability, by mimicking system behavior under known error conditions. Rescue points are locations in existing application code for handling a given set of programmer-anticipated failures, which are automatically repurposed and tested for safely enabling fault recovery from a larger class of (unanticipated) faults. When a fault occurs at an arbitrary location in the program, ASSURE restores execution to an appropriate rescue point and induces the program to recover execution by virtualizing the program's existing error-handling facilities. Rescue points are identified using fuzzing, implemented using a fast coordinated checkpoint-restart mechanism that handles multi-process and multi-threaded applications, and, after testing, are injected into production code using binary patching. We have implemented an ASSURE Linux prototype that operates without application source code and without base operating system kernel changes. Our experimental results on a set of real-world server applications and bugs show that ASSURE enabled recovery for all of the bugs tested with fast recovery times, has modest performance overhead, and provides automatic self-healing orders of magnitude faster than current human-driven patch deployment methods.
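
The recovery idea described in the abstract can be illustrated with a toy sketch. This is not ASSURE's implementation, which works on binaries using checkpoint-restart and binary patching; it is a minimal in-process analogy in Python. All names here (`Fault`, `parse_request`, `handle_request`, `with_rescue_point`, `serve`) are hypothetical, a dictionary copy stands in for the checkpoint, and an exception stands in for an unanticipated fault. The rescue point is a function that already has an error return its caller knows how to handle; on a fault deeper in the call chain, state is rolled back to the rescue point and that existing error return is forced.

```python
import copy


class Fault(Exception):
    """Stands in for an unanticipated fault raised deep in a call chain."""


def parse_request(state, request):
    # Deep callee with a latent bug: it mutates state, then faults on some input.
    state["parsed"] += 1
    if request == "malformed":
        raise Fault("unanticipated fault")
    return len(request)


def handle_request(state, request):
    # Existing code path with a programmer-anticipated failure: an empty
    # request yields -1, which the caller already handles.
    if not request:
        return -1
    return parse_request(state, request)


def with_rescue_point(func, error_value):
    """Wrap func so it acts as a rescue point (assumed, simplified model)."""
    def wrapper(state, *args):
        snapshot = copy.deepcopy(state)   # checkpoint on entry
        try:
            return func(state, *args)
        except Fault:
            state.clear()
            state.update(snapshot)        # roll back side effects
            return error_value            # force the known error return
    return wrapper


# Install the rescue point around the function with the known error path.
handle_request = with_rescue_point(handle_request, error_value=-1)


def serve(state, requests):
    # The caller's existing error handling now also covers the new fault.
    results = []
    for request in requests:
        rc = handle_request(state, request)
        results.append("error" if rc < 0 else "ok")
    return results
```

In this sketch, a request that triggers the fault is reported through the same path as an anticipated error, and the partial state change made before the fault is undone, so later requests are served normally.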


