Deriving input syntactic structure from execution
Source: Foundations of Software Engineering
Proceedings of the 16th ACM SIGSOFT International Symposium on Foundations of Software Engineering
Atlanta, Georgia
SESSION: Testing
Pages 83-93
Year of Publication: 2008
ISBN: 978-1-59593-995-1
Authors
Zhiqiang Lin, Purdue University, West Lafayette, Indiana
Xiangyu Zhang, Purdue University, West Lafayette, Indiana
Sponsor
SIGSOFT: ACM Special Interest Group on Software Engineering
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1453101.1453114

ABSTRACT

Program input syntactic structure is essential for a wide range of applications such as test case generation, software debugging and network security. However, such important information is often not available (e.g., most malware programs make use of secret protocols to communicate) or not directly usable by machines (e.g., many programs specify their inputs in plain text or other random formats). Furthermore, many programs claim they accept inputs with a published format, but their implementations actually support a subset or a variant. Based on the observations that input structure is manifested by the way input symbols are used during execution and most programs take input with top-down or bottom-up grammars, we devise two dynamic analyses, one for each grammar category. Our evaluation on a set of real-world programs shows that our technique is able to precisely reverse engineer input syntactic structure from execution.
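
The observation that input structure shows up in how input symbols are used during execution can be illustrated with a minimal sketch. It is not the authors' analysis: it assumes a hypothetical hand-written top-down parser for a made-up `key=value;key=value` format, and it simply logs which parsing routines are active when each character is consumed. All names (`TracedParser`, `build_tree`, `derive_structure`) are invented for this illustration, and the input is assumed to be well formed.

```python
class TracedParser:
    """Toy recursive-descent parser that logs the active call stack per character."""

    def __init__(self, text):
        self.text = text
        self.pos = 0
        self.calls = 0    # counter used to give every rule invocation a unique id
        self.stack = []   # (rule name, invocation id) of the routines now active
        self.trace = []   # (character, stack snapshot) for each consumed character

    def peek(self):
        return self.text[self.pos] if self.pos < len(self.text) else ""

    def read(self):
        ch = self.text[self.pos]
        self.trace.append((ch, tuple(self.stack)))
        self.pos += 1
        return ch

    def rule(self, name, body):
        # Run one parsing routine while its frame sits on the logged stack.
        self.calls += 1
        self.stack.append((name, self.calls))
        try:
            body()
        finally:
            self.stack.pop()

    def word(self):
        while self.peek().isalnum():
            self.read()

    def record(self):
        def body():
            self.rule("key", self.word)
            self.read()  # the '=' separator belongs to the record itself
            self.rule("value", self.word)
        self.rule("record", body)

    def parse(self):
        self.record()
        while self.peek() == ";":
            self.read()  # the ';' separator is consumed at the top level
            self.record()


def build_tree(trace):
    """Group consumed characters by the rule invocations that were active."""
    root = {"name": "input", "text": "", "children": []}
    open_nodes = []  # (frame, node) pairs, outermost first
    for ch, stack in trace:
        depth = 0
        while (depth < len(open_nodes) and depth < len(stack)
               and open_nodes[depth][0] == stack[depth]):
            depth += 1
        del open_nodes[depth:]  # close invocations that have returned
        for frame in stack[depth:]:
            parent = open_nodes[-1][1] if open_nodes else root
            node = {"name": frame[0], "text": "", "children": []}
            parent["children"].append(node)
            open_nodes.append((frame, node))
        root["text"] += ch
        for _, node in open_nodes:
            node["text"] += ch
    return root


def outline(node, depth=0):
    lines = ["  " * depth + f"{node['name']}: {node['text']!r}"]
    for child in node["children"]:
        lines.extend(outline(child, depth + 1))
    return lines


def derive_structure(text):
    parser = TracedParser(text)
    parser.parse()
    return outline(build_tree(parser.trace))


if __name__ == "__main__":
    print("\n".join(derive_structure("a=1;bc=22")))
```

In this sketch the recovered tree mirrors the nesting of the parsing routines, which is the intuition behind the top-down case; the bottom-up case described in the abstract is not covered here.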



