|
ABSTRACT
Recent advances in static program analysis have made it possible to detect errors in applications that have been thoroughly tested and are in wide-spread use. The ability to find errors that have eluded traditional validation methods is due to the development and combination of sophisticated algorithmic techniques that are embedded in the implementations of analysis tools. Evaluating new analysis techniques is typically performed by running an analysis tool on a collection of subject programs, perhaps enabling and disabling a given technique in different runs. While seemingly sensible, this approach runs the risk of attributing improvements in the cost-effectiveness of the analysis to the technique under consideration, when those improvements may actually be due to details of analysis tool implementations that are uncontrolled during evaluation.In this paper, we focus on the specific class of path-sensitive error detection techniques and identify several factors that can significantly influence the cost of analysis. We show, through careful empirical studies, that the influence of these factors is sufficiently large that, if left uncontrolled, they may lead researchers to improperly attribute improvements in analysis cost and effectiveness. We make several recommendations as to how the influence of these factors can be mitigated when evaluating techniques.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
|
| |
2
|
J. C. Corbett, M. B. Dwyer, J. Hatcliff, and Robby. Expressing checkable properties of dynamic systems: The Bandera Specification Language. International Journal on Software Tools for Technology Transfer, 2002.
|
| |
3
|
|
| |
4
|
H. Do, S. G. Elbaum, and G. Rothermel. Subject infrastructure repository. http://esquared.unl.edu/sir.
|
| |
5
|
Y. Dong, X. Du, G. J. Holzmenn, and S. A. Smolka. Fighting livelock in the gnu i-protocol: a case study in explicit-state model checking. Int'l. Journal on Software Tools for Tech. Transfer, 4(4):505--528, 2003.
|
| |
6
|
M. B. Dwyer, J. Hatcliff, M. Hoosier, V. Ranganath, Robby, and T. Wallentine. Evaluating the effectiveness of program slicing for model reduction of concurrent object-oriented programs. In Proc. of the Twelfth Int'l. Conf. Tools and Algorithms for the Construction and Analysis of Systems, 2006. LNCS 3920.
|
| |
7
|
|
| |
8
|
S. Edelkamp, S. Leue, and A. Lluch-Lafuente. Partial-order reduction and trail improvement in directed model checking. International Journal on Software Tools for Technology Transfer, 6(4), 2004.
|
| |
9
|
|
| |
10
|
Y. Eytani, K. Havelund, S. D. Stoller, and S. Ur. Toward a framework and benchmark for testing tools for multi-threaded programs. Concurrency and Computation: Practice and Experience, to appear.
|
| |
11
|
Y. Eytani and S. Ur. Compiling a benchmark of documented multi-threaded bugs. In Proc. of the Workshop on Parallel and Distributed Systems: Testing and Debugging, 2004.
|
| |
12
|
|
| |
13
|
|
| |
14
|
A. Groce and W. Visser. Heuristics for model checking java programs. Int'l. Journal on Software Tools for Tech. Transfer, 6(4):260--276, 2004.
|
 |
15
|
|
| |
16
|
|
| |
17
|
|
| |
18
|
|
| |
19
|
M. Musuvathi and D. R. Engler. Model Checking Large Network Protocol Implementations. In Proc. of the First Symp. on Networked Systems Design and Implementation, Mar. 2004.
|
 |
20
|
Madanlal Musuvathi , David Y. W. Park , Andy Chou , Dawson R. Engler , David L. Dill, CMC: a pragmatic approach to model checking real code, Proceedings of the 5th symposium on Operating systems design and implementation Due to copyright restrictions we are not able to make the PDFs for this conference available for downloading, December 09-11, 2002, Boston, Massachusetts
[doi> 10.1145/1060289.1060297]
|
| |
21
|
C. Păsăreanu, M. B. Dwyer, and W. Visser. Finding feasible abstract counter-examples. Int'l. Journal on Software Tools for Tech. Transfer, 5(1):34--48, 2003.
|
| |
22
|
|
 |
23
|
|
| |
24
|
Robby, M. B. Dwyer, J. Hatcliff, and R. Iosif. Space-reduction strategies for model checking dynamic systems. In Proceedings of the 2003 Workshop on Software Model Checking, July 2003.
|
| |
25
|
S. D. Stoller. Testing concurrent java programs using randomized scheduling. In Proc. Workshop on Runtime Verification, 2002.
|
 |
26
|
Jianbin Tan , George S. Avrunin , Lori A. Clarke , Shlomo Zilberstein , Stefan Leue, Heuristic-guided counterexample search in FLAVERS, Proceedings of the 12th ACM SIGSOFT twelfth international symposium on Foundations of software engineering, October 31-November 06, 2004, Newport Beach, CA, USA
|
| |
27
|
|
| |
28
|
J. Yang, P. Twohey, D. R. Engler, and M. Musuvathi. Using Model Checking to Find Serious File System Errors. In Proc. of the Seventh Symp. on Operating Systems Design and Implementation, Dec. 2004.
|
|