ACM Home Page
Please provide us with feedback. Feedback
Automatic steering of behavioral model inference
Full text PdfPdf (523 KB)
Source
Foundations of Software Engineering archive
Proceedings of the 7th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering on European software engineering conference and foundations of software engineering symposium table of contents
Amsterdam, The Netherlands
SESSION: Analysis and testing 2 table of contents
Pages 345-354  
Year of Publication: 2009
ISBN:978-1-60558-001-2
Authors
David Lo  Singapore Management University, Singapore, Singapore
Leonardo Mariani  University of Milano Bicocca, Milan, Italy
Mauro Pezzè  University of Milano Bicocca, Milan, Italy
Sponsors
ACM: Association for Computing Machinery
SIGSOFT: ACM Special Interest Group on Software Engineering
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 15,   Downloads (12 Months): 37,   Citation Count: 0
Additional Information:

abstract   references   index terms  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1595696.1595761
What is a DOI?

ABSTRACT

Many testing and analysis techniques use finite state models to validate and verify the quality of software systems. Since the specification of such models is complex and time-consuming, researchers defined several techniques to extract finite state models from code and traces. Automatically generating models requires much less effort than designing them, and thus eases the verification and validation of large software systems. However, when models are inferred automatically, the precision of the mining process is critical. Behavioral models mined with imprecise processes can include many spurious behaviors, and can thus compromise the results of testing and analysis techniques that use those models.

In this paper, we increase the precision of automata inferred from execution traces, by leveraging two learning techniques. We first mine execution traces to infer statistically significant temporal properties that capture relations between non consecutive and possibly distant events. We then incrementally refine a simple initial automaton by merging likely equivalent states. We identify equivalent states by analyzing set of consecutive events, and we use the inferred temporal properties to evaluate whether two equivalent states can be merged or not. We merge equivalent states only if the merging does violate any temporal property, since a merging that violates temporal properties is likely to introduce an imprecise generalization. Our generalization process that preserves temporal properties while merging states avoids breaking non-local relations, and thus solves one of the major cause of overgeneralized models. Thus, mined properties steer the learning of behavioral models. The technique is completely automated and generates an automaton that both accepts the input traces and satisfies the mined temporal properties.

We evaluated our solution by comparing models inferred with and without checking mined temporal properties. Results show that our steering process can significantly improve precision without noticeable loss of recall.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
Eclipse Test and Performance Tools Platform. http://www.eclipse.org/tptp/, visited in 2008.
 
2
M. Acharya, T. Xie, J. Pei, and J. Xu. Mining API Patterns as Partial Orders from Source Code: From Usage Scenarios to Specifications. In proceedings of the 6th joint meeting of the European Software Engineering Conference and the ACM SIGSOFT International Symposium on Foundations of Software Engineering, 2007.
 
3
G. Ammons, R. Bodik, and J. R. Larus. Mining Specification. In proceedings of the 35th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, 2002.
 
4
A. Biermann and J. Feldman. On the synthesis of finite-state machines from samples of their behaviour. IEEE Transactions on Computers, 21:591--597, 1972.
 
5
L. C. Briand, Y. Labiche, and J. Leduc. Toward the reverse engineering of UML sequence diagrams for distributed java software. IEEE Transactions on Software Engineering, 32(9):642--663, 2006.
 
6
E. Clarke, O. Grumberg, and D. Peled. Model Checking. MIT Press, 1999.
 
7
J. E. Cook and A. L. Wolf. Discovering models of software processes from event-based data. ACM Transactions on Software Engineering and Methodology, 7(3):215--249, July 1998.
 
8
M. El-Ramly, E. Stroulia, and P. Sorenson. Interaction-pattern mining: Extracting usage scenarios from run-time behavior traces. In proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2002.
 
9
M. Ernst, J. Cockrell, W. Griswold, and D. Notkin. Dynamically discovering likely program invariants to support program evolution. IEEE Transaction on Software Engineering, 27(2):99--123, February 2001.
 
10
M. D. Ernst, J. H. Perkins, P. J. Guo, S. McCamant, C. Pacheco, M. S. Tschantz, and C. Xiao. The daikon system for dynamic detection of likely invariants. Science of Computer Programming, 69(1-3):35--45, 2007.
 
11
A. Fox. Addressing software dependability with statistical and machine learning techniques. In proceedings of the International Conference on Software Engineering, 2005. Invited Talk.
 
12
J. Han and M. Kamber. Data Mining Concepts and Techniques. Morgan Kaufmann, 2006.
 
13
M. Harder, J. Mellen, and M. D. Ernst. Improving test suites via operational abstraction. In proceedings of the 25th International Conference on Software Engineering, pages 60--71, 2003.
 
14
J. G. Hosking. Visualisation of object oriented program execution. In proceedings of the IEEE Symposium on Visual Languages, 1996.
 
15
Javazoom. jlgui - music player. http://www.javazoom.net/jlgui/jlgui.html, 2009.
 
16
D. Lo and S.-C. Khoo. QUARK: Empirical assessment of automaton-based specification miners. In proceedings of the 13th Working Conference on Reverse Engineering, 2006.
 
17
D. Lo and S.-C. Khoo. SMArTIC: Towards building an accurate, robust and scalable specification miner. In proceedings of the 14th ACM SIGSOFT Symposium on Foundations of Software Engineering, 2006.
 
18
D. Lo, S.-C. Khoo, and C. Liu. Efficient mining of iterative patterns for software specification discovery. proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2007.
 
19
D. Lo, S.-C. Khoo, and C. Liu. Mining past-time temporal rules from execution traces. In proceedings of the 6th International Workshop on Dynamic Analysis, 2008.
 
20
D. Lo, S.-C. Khoo, and C. Liu. Mining temporal rules for software maintenance. Journal of Software Maintenance and Evolution: Research and Practice, 20(4):227--247, 2008.
 
21
D. Lo, S. Maoz, and S.-C. Khoo. Mining Modal Scenario-Based Specification from Execution Traces of Reactive Systems. In proceedings of the 22nd ACM/IEEE International Conference on Automated Software Engineering, 2007.
 
22
D. Lo and S.Maoz. Mining scenario-based triggers and effects. In proceedings of the 23rd IEEE/ACM International Conference on Automated Software Engineering, 2008.
 
23
D. Lorenzoli, L. Mariani, and M. Pezzè. Automatic Generation of Software Behavioral Models. In proceedings of the International Conference on Software Engineering, 2008.
 
24
L. Mariani, S. Papagiannakis, and M. Pezzè. Compatibility and regression testing of COTS-component-based software. In proceedings of the 29th International Conference on Software Engineering, 2007.
 
25
L. Mariani and M. Pezzè. Dynamic detection of COTS components incompatibility. IEEE Software, 24(5):76--85, September/October 2007.
 
26
S. McCamant and M. D. Ernst. Predicting problems caused by component upgrades. In proceedings of the 9th European Software Engineering Conference and the 11th ACM SIGSOFT Symposium on the Foundations of Software Engineering, pages 287--296, 2003.
 
27
M. McGavin, T. Wright, and S. Marshall. Visualisations of execution traces (VET): an interactive plugin-based visualisation tool. In proceedings of the 7th Australasian User Interface Conference. Australian Computer Society, Inc., 2006.
 
28
J. Perkins and M. Ernst. Efficient incremental algorithms for dynamic detection of likely invariants. In proceedings of the 12th ACM SIGSOFT Symposium on the Foundations of Software Engineering, 2004.
 
29
A. V. Raman and J. D. Patrick. The sk-strings method for inferring PFSA. In proceedings of the Workshop on Automata Induction, Grammatical Inference and Language Acquisition, 1997.
 
30
O. Raz, P. Koopman, and M.Shaw. Semantic anomaly detection in online data sources. In proceedings of International Conference on Software Engineering, 2002.
 
31
S. P. Reiss and M. Renieris. Encoding program executions. In proceedings of the 23rd International Conference on Software Engineering, 2001.
 
32
H. Safyallah and K. Sartipi. Dynamic analysis of software systems using execution pattern mining. In proceedings of the International Conference on Program Comprehension, 2006.
 
33
N. Walkinshaw and K. Bogdanov. Inferring finite-state models with temporal constraints. In proceedings of the 23rd IEEE/ACM International Conference on Automated Software Engineering, 2008.
 
34
J. Whaley, M. Martin, and M. Lam. Automatic extraction of object oriented component interfaces. In proceedings of the International Symposium on Software Testing and Analysis, 2002.
 
35
J. Yang, D. Evans, D. Bhardwaj, T. Bhat, and M.Das. Perracotta: Mining temporal API rules from imperfect traces. In proceedings of the International Conference on Software Engineering, 2006.
 
36
Y. Zou, T. Lau, K. Kontogiannis, T. Tong, and R. McKegney. Model-driven busineess process recovery. In proceedings of the 11th Working Conference on Reverse Engineering, 2004.