ABSTRACT
Redirection spam presents a web page with false content to a crawler for indexing, but automatically redirects the browser to a different web page. Redirection is usually immediate (on page load) but may also be triggered by a timer or a harmless user event such as a mouse move. JavaScript redirection is the most notorious of the redirection techniques and is hard to detect because many of the prevalent crawlers are script-agnostic. In this paper, we study common JavaScript redirection spam techniques on the web. Our findings indicate that obfuscation techniques are very prevalent among JavaScript redirection spam pages. These obfuscation techniques limit the effectiveness of static analysis and static-feature-based systems. Based on our findings, we recommend a robust countermeasure using a lightweight JavaScript parser and engine.
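To illustrate why obfuscation limits static analysis, the following is a minimal sketch, not the detection method studied in the paper. It assumes a hypothetical signature-based detector (`REDIRECT_PATTERNS`, `looks_like_redirect`) that scans script source for literal redirect statements; the patterns, the sample scripts, and the `example.com` URL are all illustrative assumptions.

```python
import re

# Hypothetical static signatures for direct redirects (an assumption made
# for this sketch, not a feature set taken from the paper).
REDIRECT_PATTERNS = [
    re.compile(r"(?:window|document|self|top)\.location(?:\.href)?\s*="),
    re.compile(r"location\.(?:replace|assign)\s*\("),
]


def looks_like_redirect(script: str) -> bool:
    """Return True if any static signature matches the script source text."""
    return any(p.search(script) for p in REDIRECT_PATTERNS)


# Plain redirect: the assignment to window.location is visible in the source.
plain = 'window.location = "http://example.com/landing";'

# Obfuscated redirect: the property name is assembled at run time and the URL
# is percent-encoded, so the source contains no literal "location" token.
obfuscated = (
    'var k = "loc" + "ation";'
    'var u = unescape("%68%74%74%70%3A%2F%2Fexample.com/landing");'
    'window[k] = u;'
)

print(looks_like_redirect(plain))       # True
print(looks_like_redirect(obfuscated))  # False
```

Both scripts would send a browser to the same page, yet only the first matches the signatures. Under these assumptions, recovering the second redirect requires actually evaluating the script, which is the motivation for a parser-and-engine approach.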