The anti-social tagger: detecting spam in social bookmarking systems
Source AIRWeb; Vol. 295
Proceedings of the 4th international workshop on Adversarial information retrieval on the web
Beijing, China
SESSION: Social networks
Pages 61-68  
Year of Publication: 2008
ISBN:978-1-60558-159-0
Authors
Beate Krause  University of Kassel, Germany
Christoph Schmitz  University of Kassel, Germany
Andreas Hotho  University of Kassel, Germany
Gerd Stumme  University of Kassel, Germany
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1451983.1451998

ABSTRACT

The annotation of web sites in social bookmarking systems has become a popular way to manage and find information on the web. The community structure of such systems attracts spammers: recent post pages, popular pages or specific tag pages can be manipulated easily. As a result, searching or tracking recent posts does not deliver quality results annotated in the community, but rather unsolicited, often commercial, web sites. To retain the benefits of sharing one's web content, spam-fighting mechanisms that can face the flexible strategies of spammers need to be developed.

A classical approach in machine learning is to determine relevant features that describe the system's users, train different classifiers with the selected features and choose the one with the most promising evaluation results. In this paper we transfer this approach to a social bookmarking setting to identify spammers. We present features based on the topological, semantic and profile-based information that people make public when using the system. The dataset is a snapshot of the social bookmarking system BibSonomy, built over the course of several months while spam was being removed from the system. Based on our features, we learn a large set of different classification models and compare their performance. Our results represent the groundwork for a first application in BibSonomy and for the building of more elaborate spam detection mechanisms.
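
The workflow described above (describe users by features, train several classifiers, keep the one that evaluates best) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the three features, all data values, and the two toy classifiers (a majority-class baseline and a nearest-centroid rule) are assumptions made for the example and are not taken from the paper or from BibSonomy.

```python
from statistics import mean

# Hypothetical per-user feature vectors:
# (posts_per_day, tags_per_post, fraction_of_posts_sharing_one_domain).
# Labels: 1 = spammer, 0 = legitimate user. All values are invented.
TRAIN = [
    ((40.0, 9.0, 0.9), 1),
    ((55.0, 12.0, 0.8), 1),
    ((30.0, 8.0, 0.95), 1),
    ((2.0, 3.0, 0.1), 0),
    ((1.0, 2.0, 0.2), 0),
    ((4.0, 4.0, 0.15), 0),
]
TEST = [
    ((35.0, 10.0, 0.85), 1),
    ((3.0, 2.5, 0.1), 0),
    ((50.0, 7.0, 0.7), 1),
    ((1.5, 3.5, 0.3), 0),
]


def train_majority(data):
    """Baseline: always predict the most frequent training label (ties -> 0)."""
    labels = [y for _, y in data]
    majority = 1 if sum(labels) * 2 > len(labels) else 0
    return lambda x: majority


def train_nearest_centroid(data):
    """Predict the class whose mean feature vector is closest to the input."""
    centroids = {}
    for label in {y for _, y in data}:
        rows = [x for x, y in data if y == label]
        centroids[label] = tuple(mean(col) for col in zip(*rows))

    def predict(x):
        # Squared Euclidean distance to each class centroid.
        return min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])),
        )

    return predict


def f1_spam(model, data):
    """F1 score for the spam class (label 1) on held-out data."""
    tp = sum(1 for x, y in data if model(x) == 1 and y == 1)
    fp = sum(1 for x, y in data if model(x) == 1 and y == 0)
    fn = sum(1 for x, y in data if model(x) == 0 and y == 1)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def select_best(trainers, train, test):
    """Train every candidate and return (name, score) pairs, best first."""
    scores = {name: f1_spam(fit(train), test) for name, fit in trainers.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


RANKING = select_best(
    {"majority": train_majority, "nearest_centroid": train_nearest_centroid},
    TRAIN,
    TEST,
)
print(RANKING)
```

The first entry of RANKING names the candidate that scored best on the held-out users. A realistic setup would use many more users, cross-validation instead of a single split, and stronger learners in place of these two stand-ins.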



