ABSTRACT
Protocol reverse engineering, the process of extracting the application-level protocol used by an implementation without access to the protocol specification, is important for many network security applications. Recent work [17] has proposed protocol reverse engineering by clustering network traces. Such an approach is limited by the lack of semantic information in network traces. In this paper we propose a new approach that uses program binaries. Our approach, shadowing, uses dynamic analysis and is based on a unique intuition: the way that an implementation of the protocol processes the received application data reveals a wealth of information about the protocol message format. We have implemented our approach in a system called Polyglot and evaluated it extensively using real-world implementations of five different protocols: DNS, HTTP, IRC, Samba and ICQ. We compare our results with the manually crafted message formats included in Wireshark, one of the state-of-the-art protocol analyzers. The differences we find are small and usually due to different implementations handling fields in different ways. Finding such differences between implementations is an added benefit, as they are important for problems such as fingerprint generation, fuzzing, and error detection.
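The intuition behind shadowing can be illustrated with a toy example. The sketch below is not Polyglot's implementation, which analyzes program binaries; it only mimics the idea in Python by wrapping the received bytes so that every access made by a parser is logged with its input offset. The class `ShadowedInput`, the functions `toy_parse` and `infer_fields`, and the toy message layout (a one-byte length, a name of that length, then a body ended by a newline) are all assumptions made for illustration.

```python
from collections import defaultdict


class ShadowedInput:
    """Wraps received bytes and logs how the parser uses each offset."""

    def __init__(self, data: bytes):
        self.data = data
        self.length_uses = []                 # (offset, span_start, span_end)
        self.comparisons = defaultdict(list)  # constant -> offsets compared

    def length_at(self, offset: int, span_start: int) -> int:
        # The byte at `offset` decides where the span starting at
        # `span_start` ends, so it behaves like a length field.
        n = self.data[offset]
        self.length_uses.append((offset, span_start, span_start + n))
        return n

    def equals(self, offset: int, constant: int) -> bool:
        # Log every comparison of an input byte against a constant.
        self.comparisons[constant].append(offset)
        return self.data[offset] == constant


def toy_parse(inp: ShadowedInput):
    """Hypothetical parser: [len][name: len bytes][body]['\\n']."""
    n = inp.length_at(0, span_start=1)
    name = inp.data[1:1 + n]
    pos = 1 + n
    # Scan forward until the newline delimiter is found.
    while pos < len(inp.data) and not inp.equals(pos, 0x0A):
        pos += 1
    body = inp.data[1 + n:pos]
    return name, body


def infer_fields(inp: ShadowedInput):
    """Derive (kind, start, end) field guesses from the access log."""
    fields = []
    for offset, start, end in inp.length_uses:
        fields.append(("length", offset, offset + 1))
        fields.append(("variable", start, end))
    for constant, offsets in inp.comparisons.items():
        first, last = min(offsets), max(offsets)
        # A constant compared against a run of offsets that finally
        # matches looks like a delimiter scan.
        if last > first and inp.data[last] == constant:
            fields.append(("delimited", first, last))
            fields.append(("separator", last, last + 1))
    return sorted(fields, key=lambda f: f[1])


if __name__ == "__main__":
    message = ShadowedInput(bytes([5]) + b"alice" + b"hello\n")
    print(toy_parse(message))
    for kind, start, end in infer_fields(message):
        print(kind, start, end)
```

The field boundaries here are recovered purely from how the parser touched the input, without any knowledge of the message layout, which is the kind of semantic information that network traces alone do not provide.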
REFERENCES
[1] How Samba Was Written. http://samba.org/ftp/tridge/misc/french_cafe.txt.
[2] Icqlib: The ICQ Library. http://kicq.sourceforge.net/icqlib.shtml.
[3] Libyahoo2: A C Library for Yahoo! Messenger. http://libyahoo2.sourceforge.net.
[4] MSN Messenger Protocol. http://www.hypothetic.org/docs/msn/index.php.
[5] Qemu: Open Source Processor Emulator. http://fabrice.bellard.free.fr/qemu/.
[6] Tcpdump. http://www.tcpdump.org/.
[7] The UnOfficial AIM/OSCAR Protocol Specification. http://www.oilcan.org/oscar/.
[8] Wireshark, Network Protocol Analyzer. http://www.wireshark.org.
[9] M. A. Beddoe. Network Protocol Analysis Using Bioinformatics Algorithms. http://www.baselineresearch.net/PI/.
[10] N. Borisov, D. J. Brumley, H. J. Wang, and C. Guo. Generic Application-Level Protocol Analyzer and Its Language. Network and Distributed System Security Symposium, San Diego, CA, February 2007.
[11] David Brumley, Juan Caballero, Zhenkai Liang, James Newsome, Dawn Song, Towards automatic discovery of deviations in binary implementations with applications to error detection and fingerprint generation, Proceedings of the 16th USENIX Security Symposium, p.1-16, August 06-10, 2007, Boston, MA
[12] J. Caballero, S. Venkataraman, P. Poosankam, M. G. Kang, D. Song, and A. Blum. FiG: Automatic Fingerprint Generation. Network and Distributed System Security Symposium, San Diego, CA, February 2007.
[13] Jim Chow, Ben Pfaff, Tal Garfinkel, Kevin Christopher, Mendel Rosenblum, Understanding data lifetime via whole system simulation, Proceedings of the 13th conference on USENIX Security Symposium, p.22-22, August 09-13, 2004, San Diego, CA
[14] Manuel Costa, Jon Crowcroft, Miguel Castro, Antony Rowstron, Lidong Zhou, Lintao Zhang, Paul Barham, Vigilante: end-to-end containment of internet worms, Proceedings of the twentieth ACM symposium on Operating systems principles, October 23-26, 2005, Brighton, United Kingdom
[15]
[16] D. Crocker and P. Overell. Augmented BNF for Syntax Specifications: ABNF. RFC 4234 (Draft Standard), October 2005.
[17]
[18] W. Cui, V. Paxson, N. C. Weaver, and R. H. Katz. Protocol-Independent Adaptive Replay of Application Dialog. Network and Distributed System Security Symposium, San Diego, CA, February 2006.
[19] Holger Dreger, Anja Feldmann, Michael Mai, Vern Paxson, Robin Sommer, Dynamic application-layer protocol analysis for network intrusion detection, Proceedings of the 15th conference on USENIX Security Symposium, p.18-18, July 31-August 04, 2006, Vancouver, B.C., Canada
[20] Concettina Del Grosso, Giuliano Antoniol, Massimiliano Di Penta, Philippe Galinier, Ettore Merlo, Improving network applications security: a new heuristic to generate stress testing data, Proceedings of the 2005 conference on Genetic and evolutionary computation, June 25-29, 2005, Washington DC, USA, doi:10.1145/1068009.1068185
[21] Patrick Haffner, Subhabrata Sen, Oliver Spatscheck, Dongmei Wang, ACAS: automated construction of application signatures, Proceedings of the 2005 ACM SIGCOMM workshop on Mining network data, August 26-26, 2005, Philadelphia, Pennsylvania, USA, doi:10.1145/1080173.1080183
[22]
[23]
[24]
[25] Justin Ma, Kirill Levchenko, Christian Kreibich, Stefan Savage, Geoffrey M. Voelker, Unexpected means of protocol inference, Proceedings of the 6th ACM SIGCOMM conference on Internet measurement, October 25-27, 2006, Rio de Janeiro, Brazil, doi:10.1145/1177080.1177123
[26] Phil McMinn, Mark Harman, David Binkley, Paolo Tonella, The species per path approach to Search-Based test data generation, Proceedings of the 2006 international symposium on Software testing and analysis, July 17-20, 2006, Portland, Maine, USA, doi:10.1145/1146238.1146241
[27]
[28] J. Newsome and D. Song. Dynamic Taint Analysis for Automatic Detection, Analysis, and Signature Generation of Exploits on Commodity Software. Network and Distributed System Security Symposium, San Diego, CA, February 2005.
[29] J. Newsome, D. Brumley, and D. Song. Vulnerability-Specific Execution Filtering for Exploit Prevention on Commodity Software. Network and Distributed System Security Symposium, San Diego, CA, February 2006.
[30] James Newsome, David Brumley, Jason Franklin, Dawn Song, Replayer: automatic protocol replay by binary analysis, Proceedings of the 13th ACM conference on Computer and communications security, October 30-November 03, 2006, Alexandria, Virginia, USA, doi:10.1145/1180405.1180444
[31]
[32] Ruoming Pang, Mark Allman, Mike Bennett, Jason Lee, Vern Paxson, Brian Tierney, A first look at modern enterprise traffic, Proceedings of the 5th ACM SIGCOMM conference on Internet Measurement, p.2-2, October 19-21, 2005, Berkeley, CA
[33]
[34]
[35] G. Edward Suh, Jae W. Lee, David Zhang, Srinivas Devadas, Secure program execution via dynamic information flow tracking, Proceedings of the 11th international conference on Architectural support for programming languages and operating systems, October 07-13, 2004, Boston, MA, USA
[36] P. Vogt, F. Nentwich, N. Jovanovic, E. Kirda, C. Kruegel, and G. Vigna. Cross-Site Scripting Prevention with Dynamic Data Tainting and Static Analysis. Network and Distributed System Security Symposium, San Diego, CA, February 2007.
[37] Heng Yin, Dawn Song, Manuel Egele, Christopher Kruegel, Engin Kirda, Panorama: capturing system-wide information flow for malware detection and analysis, Proceedings of the 14th ACM conference on Computer and communications security, October 28-31, 2007, Alexandria, Virginia, USA, doi:10.1145/1315245.1315261
CITED BY 6
Rui Wang, XiaoFeng Wang, Kehuan Zhang, Zhuowei Li, Towards automatic reverse engineering of software security configurations, Proceedings of the 15th ACM conference on Computer and communications security, October 27-31, 2008, Alexandria, Virginia, USA
Artem Dinaburg, Paul Royal, Monirul Sharif, Wenke Lee, Ether: malware analysis via hardware virtualization extensions, Proceedings of the 15th ACM conference on Computer and communications security, October 27-31, 2008, Alexandria, Virginia, USA
Weidong Cui, Marcus Peinado, Karl Chen, Helen J. Wang, Luis Irun-Briz, Tupni: automatic reverse engineering of input formats, Proceedings of the 15th ACM conference on Computer and communications security, October 27-31, 2008, Alexandria, Virginia, USA
Prateek Saxena, Pongsin Poosankam, Stephen McCamant, Dawn Song, Loop-extended symbolic execution on binary programs, Proceedings of the eighteenth international symposium on Software testing and analysis, July 19-23, 2009, Chicago, IL, USA