ACM Home Page
Please provide us with feedback. Feedback
Hyperincident connected components of tagging networks
Full text PdfPdf (1.32 MB)
Source
Conference on Hypertext and Hypermedia archive
Proceedings of the 20th ACM conference on Hypertext and hypermedia table of contents
Torino, Italy
SESSION: Networks properties table of contents
Pages 229-238  
Year of Publication: 2009
ISBN:978-1-60558-486-7
Authors
Nicolas Neubauer  Technische Universität Berlin, Berlin, Germany
Klaus Obermayer  Technische Universität Berlin, Berlin, Germany
Sponsors
SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 16,   Downloads (12 Months): 54,   Citation Count: 0
Additional Information:

abstract   references   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1557914.1557954
What is a DOI?

ABSTRACT

Data created by social bookmarking systems can be described as 3-partite 3-uniform hypergraphs connecting documents, users, and tags (tagging networks), such that the toolbox of complex network analysis can be applied to examine their properties. One of the most basic tools, the analysis of connected components, however cannot be applied meaningfully: Tagging networks tend to be almost entirely connected. We therefore propose a generalization of connected components, m-hyperincident connected components. We show that decomposing tagging networks into 2-hyperincident connected components yields a characteristic component distribution with a salient giant component that can be found across various datasets. This pattern changes if the underlying formation process changes, for example, if the hypergraph is constructed from search logs, or if the tagging data is contaminated by spam: It turns out that the second- to 129th largest components of the spam-labeled Bibsonomy dataset are inhabited exclusively by spam users. Based on these findings, we propose and unsupervised method for spam detection.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
F. Boesch. Synthesis of reliable networks, a survey. IEEE Trans. Reliab, 35:240--246, 1986.
 
2
 
3
P. Erdos and A. Renyi. On the evolution of random graphs. Publ. Math. Inst. Hung. Acad. Sci, 5:17--61, 1960.
 
4
 
5
A. Gkanogiannis and T. Kalamboukis. A novel supervised learning algorithm and its use for spam detection in social bookmarking systems. In ECML PKDD Discovery Challenge 2008 (RSDC'08), 2008.
 
6
 
7
C. Goldschmidt. Critical random hypergraphs: the emergence of a giant set of identifiable vertices,. Annals of Probability, 33(4):1573--1600, 2005.
 
8
 
9
A. Hotho, D. Benz, R. Jäschke, and B. Krause, editors. ECML PKDD Discovery Challenge 2008 (RSDC'08). Workshop at 18th Europ. Conf. on Machine Learning (ECML'08) / 11th Europ. Conf. on Principles and Practice of Knowledge Discovery in Databases (PKDD'08), 2008.
10
11
12
 
13
N. Neubauer and K. Obermayer. Predicting tag spam examining cooccurrences, network structures and url components. In ECML PKDD Discovery Challenge 2008 (RSDC'08), 2008.
 
14
Knowledge&Data Engineering Group, University of Kassel Benchmark folksonomy data from bibsonomy, version of june 30th, 2008.
15
 
16
J. Schmidt-Pruzan and E. Shamir. Component structure in the evolution of random hypergraphs. Combinatorica, 5(1):81--94, 1984.
 
17
 
18
R. Wetzker, C. Zimmermann, and C. Bauckhage. Analyzing social bookmarking systems: A del.icio.us cookbook. In Mining Social Data (MSoDa) Workshop Proceedings, ECAI 2008, pages 26--30, 2008.

Collaborative Colleagues:
Nicolas Neubauer: colleagues
Klaus Obermayer: colleagues