A targeted web crawling for building malicious javascript collection
Source
Conference on Information and Knowledge Management
Proceeding of the ACM first international workshop on Data-intensive software management and mining
Hong Kong, China
SESSION: Large-scale software corpus
Pages: 23-26
Year of Publication: 2009
ISBN: 978-1-60558-810-0
Authors
Peter Likarish  The University of Iowa, Iowa City, USA
Eunjin Jung  The University of Iowa, Iowa City, IA, USA
Sponsors
SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 30,   Downloads (12 Months): 39,   Citation Count: 0
DOI: http://doi.acm.org/10.1145/1651309.1651317

ABSTRACT

Malicious JavaScript frequently serves as the starting point of web-based attacks, in particular cross-site scripting. Detecting malicious JavaScript before it executes can therefore protect users from attacks such as malware infection and drive-by downloads, and can even keep them from participating in denial-of-service attacks as part of a botnet. A large collection of malicious JavaScript would help with detector development, but by the time a crawler arrives at blacklisted domains, the attackers and their malicious scripts are often long gone. We have used classifiers to direct a web crawler toward more likely locations of malicious scripts, and show how this targeted web crawler performs compared to a crawler seeded with blacklisted domains.


