Web spam identification through content and hyperlinks
Source: AIRWeb; Vol. 295
Proceedings of the 4th international workshop on Adversarial information retrieval on the web
Beijing, China
Session: General
Pages 41-44
Year of Publication: 2008
ISBN: 978-1-60558-159-0
ABSTRACT
We present an algorithm, WITCH, that learns to detect spam hosts or pages on the Web. Unlike most other approaches, it simultaneously exploits the structure of the Web graph as well as page contents and features. The method is efficient, scalable, and provides state-of-the-art accuracy on a standard Web spam benchmark.