ABSTRACT
Memory bugs in C/C++ programs severely affect system availability and security. This paper presents First-Aid, a lightweight runtime system that survives software failures caused by common memory management bugs and prevents future failures by the same bugs during production runs. Upon a failure, First-Aid diagnoses the bug type and identifies the memory objects that trigger the bug. To do so, it rolls back the program to previous checkpoints and uses two types of environmental changes that can prevent or expose memory bug manifestation during re-execution. Based on the diagnosis, First-Aid generates and applies runtime patches to avoid the memory bug and prevent its reoccurrence. Furthermore, First-Aid validates the consistent effects of the runtime patches and generates on-site diagnostic reports to assist developers in fixing the bugs. We have implemented First-Aid on Linux and evaluated it with seven applications that contain various types of memory bugs, including buffer overflow, uninitialized read, dangling pointer read/write, and double free. The results show that First-Aid can quickly diagnose the tested bugs and recover applications from failures (in 0.084 to 3.978 seconds). The results also show that the runtime patches generated by First-Aid can prevent future failures caused by the diagnosed bugs. Additionally, First-Aid provides detailed diagnostic information on both the root cause and the manifestation of the bugs. Finally, First-Aid incurs low overhead (0.4-11.6%, with an average of 3.7%) during normal execution for the tested buggy applications, SPEC INT2000, and four allocation-intensive programs.
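The abstract does not spell out which environmental changes are applied during re-execution. As a minimal sketch of the general idea only, and not First-Aid's implementation, the C fragment below assumes two changes at allocation time: zero-filling a new buffer (which can prevent an uninitialized read from manifesting) and appending canary-filled padding (which can absorb a small buffer overflow and expose it when the canary is later checked). The names `padded_alloc`, `padding_intact`, `padded_free`, `PAD_BYTES`, and `CANARY` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define PAD_BYTES 16   /* hypothetical padding size placed after each buffer */
#define CANARY 0xAB    /* hypothetical canary byte written into the padding */

/* Header stored in front of each user buffer so its size is known later.
 * The union with max_align_t keeps the user pointer suitably aligned. */
typedef union {
    size_t size;
    max_align_t align;
} hdr_t;

/* Allocate `size` zero-filled bytes followed by PAD_BYTES of canary. */
static void *padded_alloc(size_t size) {
    unsigned char *raw = malloc(sizeof(hdr_t) + size + PAD_BYTES);
    if (raw == NULL)
        return NULL;
    ((hdr_t *)raw)->size = size;
    unsigned char *user = raw + sizeof(hdr_t);
    memset(user, 0, size);                  /* preventive change: zero-fill */
    memset(user + size, CANARY, PAD_BYTES); /* exposing change: canary pad */
    return user;
}

/* Return 1 if the canary padding is untouched, 0 if an overflow hit it. */
static int padding_intact(const void *p) {
    assert(p != NULL);
    const unsigned char *user = p;
    const hdr_t *h = (const hdr_t *)(user - sizeof(hdr_t));
    for (size_t i = 0; i < PAD_BYTES; i++) {
        if (user[h->size + i] != CANARY)
            return 0;
    }
    return 1;
}

/* Release a buffer obtained from padded_alloc. */
static void padded_free(void *p) {
    if (p != NULL)
        free((unsigned char *)p - sizeof(hdr_t));
}
```

A diagnosis step in this style would re-run the failing region with such an allocator and call `padding_intact` on each live object: an object whose padding was overwritten is a candidate trigger of a buffer overflow, while the padding itself keeps the overflow from corrupting neighboring data.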