ABSTRACT
Spammers use questionable search engine optimization (SEO) techniques to promote their spam links into top search results. In this paper, we focus on one prevalent type of spam, redirection spam, in which spam pages can be identified by the third-party domains to which they redirect traffic. We propose a five-layer, double-funnel model for describing end-to-end redirection spam, present a methodology for analyzing the layers, and identify prominent domains on each layer using two sets of commercial keywords, one targeting spammers and the other targeting advertisers. The methodology and findings are useful for search engines to strengthen their ranking algorithms against spam, for legitimate website owners to locate and remove spam doorway pages, and for legitimate advertisers to identify unscrupulous syndicators who serve ads on spam pages.