|
ABSTRACT
The last several years have seen a proliferation of static and runtime analysis tools for finding security violations that are caused by explicit information flow in programs. Much of this interest has been caused by the increase in the number of vulnerabilities such as cross-site scripting and SQL injection. In fact, these explicit information flow vulnerabilities commonly found in Web applications now outnumber vulnerabilities such as buffer overruns common in type-unsafe languages such as C and C++. Tools checking for these vulnerabilities require a specification to operate. In most cases the task of providing such a specification is delegated to the user. Moreover, the efficacy of these tools is only as good as the specification. Unfortunately, writing a comprehensive specification presents a major challenge: parts of the specification are easy to miss, leading to missed vulnerabilities; similarly, incorrect specifications may lead to false positives. This paper proposes Merlin, a new approach for automatically inferring explicit information flow specifications from program code. Such specifications greatly reduce manual labor, and enhance the quality of results, while using tools that check for security violations caused by explicit information flow. Beginning with a data propagation graph, which represents interprocedural flow of information in the program, Merlin aims to automatically infer an information flow specification. Merlin models information flow paths in the propagation graph using probabilistic constraints. A naive modeling requires an exponential number of constraints, one per path in the propagation graph. For scalability, we approximate these path constraints using constraints on chosen triples of nodes, resulting in a cubic number of constraints. We characterize this approximation as a probabilistic abstraction, using the theory of probabilistic refinement developed by McIver and Morgan. We solve the resulting system of probabilistic constraints using factor graphs, which are a well-known structure for performing probabilistic inference. We experimentally validate the Merlin approach by applying it to 10 large business-critical Web applications that have been analyzed with CAT.NET, a state-of-the-art static analysis tool for .NET. We find a total of 167 new confirmed specifications, which result in a total of 322 additional vulnerabilities across the 10 benchmarks. More accurate specifications also reduce the false positive rate: in our experiments, Merlin-inferred specifications result in 13 false positives being removed; this constitutes a 15% reduction in the CAT.NET false positive rate on these 10 programs. The final false positive rate for CAT.NET after applying Merlin in our experiments drops to under 1%.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
D. Chandra and M. Franz. Fine-grained information flow analysis and enforcement in a java virtual machine. In Annual Computer Security Applications Conference, pages 463--475, 2007.
|
 |
2
|
|
 |
3
|
Dawson Engler , David Yu Chen , Seth Hallem , Andy Chou , Benjamin Chelf, Bugs as deviant behavior: a general approach to inferring errors in systems code, Proceedings of the eighteenth ACM symposium on Operating systems principles, October 21-24, 2001, Banff, Alberta, Canada
|
| |
4
|
Fortify. Fortify code analyzer. http://www.ouncelabs.com/, 2008.
|
| |
5
|
|
| |
6
|
|
 |
7
|
|
 |
8
|
Yao-Wen Huang , Fang Yu , Christian Hang , Chung-Hung Tsai , Der-Tsai Lee , Sy-Yen Kuo, Securing web application code by static analysis and runtime protection, Proceedings of the 13th international conference on World Wide Web, May 17-20, 2004, New York, NY, USA
[doi> 10.1145/988672.988679]
|
| |
9
|
|
| |
10
|
Ted Kremenek , Paul Twohey , Godmar Back , Andrew Ng , Dawson Engler, From uncertainty to belief: inferring the specification within, Proceedings of the 7th symposium on Operating systems design and implementation, November 06-08, 2006, Seattle, Washington
|
 |
11
|
Maxwell Krohn , Alexander Yip , Micah Brodsky , Natan Cliffer , M. Frans Kaashoek , Eddie Kohler , Robert Morris, Information flow control for standard OS abstractions, Proceedings of twenty-first ACM SIGOPS symposium on Operating systems principles, October 14-17, 2007, Stevenson, Washington, USA
|
| |
12
|
F. R. Kschischang, B. J. Frey, and H. A. Loeliger. Factor graphs and the sum-product algorithm. IEEE Transactions on Information Theory, 47(2):498--519, 2001.
|
 |
13
|
|
| |
14
|
|
| |
15
|
|
 |
16
|
|
 |
17
|
Michael Martin , Benjamin Livshits , Monica S. Lam, Finding application errors and security flaws using PQL: a program query language, Proceedings of the 20th annual ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications, October 16-20, 2005, San Diego, CA, USA
|
| |
18
|
M. Martin, B. Livshits, and M. S. Lam. SecuriFly: Runtime vulnerability protection for Web applications. Technical report, Stanford University, Oct. 2006.
|
| |
19
|
|
 |
20
|
|
| |
21
|
Microsoft Corporation. Microsoft Code Analysis Tool .NET (CAT.NET). http://www.microsoft. com/downloads/details.aspx?FamilyId= 0178e2ef-9da8-445e-9348-c93f24cc9f9d&displaylang=en, 3 2009.
|
| |
22
|
T. Minka, J.Winn, J. Guiver, and A. Kannan. Infer.NET 2.2, 2009. Microsoft Research Cambridge. http://research.microsoft.com/infernet.
|
| |
23
|
A. Nguyen-Tuong, S. Guarnieri, D. Greene, J. Shirley, and D. Evans. Automatically hardening Web applications using precise tainting. In Proceedings of the IFIP International Information Security Conference, June 2005.
|
| |
24
|
OunceLabs, Inc. Ounce. http://www.ouncelabs.com/, 2008.
|
| |
25
|
T. Pietraszek and C. V. Berghe. Defending against injection attacks through context-sensitive string evaluation. In Proceedings of the Recent Advances in Intrusion Detection, Sept. 2005.
|
 |
26
|
|
| |
27
|
A. Sabelfeld and A. Myers. Language-based information-flow security. IEEE Journal on Selected Areas in Communications, 21(1):5--19, January 2003.
|
 |
28
|
|
 |
29
|
Steve Vandebogart , Petros Efstathopoulos , Eddie Kohler , Maxwell Krohn , Cliff Frey , David Ziegler , Frans Kaashoek , Robert Morris , David Mazières, Labels and event processes in the Asbestos operating system, ACM Transactions on Computer Systems (TOCS), v.25 n.4, p.11-es, December 2007
[doi> 10.1145/1314299.1314302]
|
| |
30
|
L. Wall. Perl security. http://search.cpan.org/dist/perl/ pod/perlsec.pod.
|
| |
31
|
W.Weimer and G. C. Necula. Mining temporal specifications for error detection. In Proceedings of the International Conference on Tools and Algorithms for the Construction and Analysis of Systems, pages 461--476, 2005.
|
 |
32
|
|
| |
33
|
|
 |
34
|
Jinlin Yang , David Evans , Deepali Bhardwaj , Thirumalesh Bhat , Manuvir Das, Perracotta: mining temporal API rules from imperfect traces, Proceedings of the 28th international conference on Software engineering, May 20-28, 2006, Shanghai, China
[doi> 10.1145/1134285.1134325]
|
| |
35
|
|
| |
36
|
|
|