Classifying tags using open content resources
Source: Web Search and Web Data Mining
Proceedings of the Second ACM International Conference on Web Search and Data Mining
Barcelona, Spain
Session: Classification and clustering
Pages 64-73  
Year of Publication: 2009
ISBN: 978-1-60558-390-7
Authors
Simon Overell  Multimedia and Information Systems, Imperial College London, London, UK
Börkur Sigurbjörnsson  Yahoo! Research, Barcelona, Spain
Roelof van Zwol  Yahoo! Research, Barcelona, Spain
Sponsors
SIGMOD: ACM Special Interest Group on Management of Data
Google
SIGIR: ACM Special Interest Group on Information Retrieval
SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web
Yahoo! Research
Microsoft
Nokia
SIGKDD: ACM Special Interest Group on Knowledge Discovery in Data
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1498759.1498810

ABSTRACT

Tagging has emerged as a popular means of annotating online objects such as bookmarks, photos and videos. Tags vary in semantic meaning and can describe different aspects of a media object: the content of the media as well as locations, dates, people and other associated metadata. Automatically classifying tags into semantic categories lets us better understand how users annotate media objects and build tools for viewing and browsing those objects. In this paper we present a generic method for classifying tags using third-party open content resources, such as Wikipedia and the Open Directory. Our method uses structural patterns that can be extracted from resource metadata. We describe the implementation of our method on Wikipedia, using WordNet categories as our classification schema and ground truth. Two structural patterns found in Wikipedia are used for training and classification: categories and templates. We apply our system to classifying Flickr tags. Compared to a WordNet baseline, our method increases the coverage of the Flickr vocabulary by 115%. We can classify many important entities that are not covered by WordNet, such as London Eye, Big Island, Ronaldinho, geocaching and wii.
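
The idea described in the abstract, learning from articles whose titles already have a WordNet category and then classifying other tags by the Wikipedia categories and templates of their matching articles, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the article titles, category and template names, labels, and the simple vote-counting rule below are all hypothetical stand-ins chosen for illustration, and no real Wikipedia, WordNet or Flickr API is called.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: article title -> structural features.
# "cat:" marks a Wikipedia category, "tpl:" marks a template.
ARTICLES = {
    "paris": {"cat:Capitals in Europe", "tpl:Infobox settlement"},
    "madrid": {"cat:Capitals in Europe", "tpl:Infobox settlement"},
    "barcelona": {"cat:Cities in Spain", "tpl:Infobox settlement"},
    "pele": {"cat:Brazilian footballers", "tpl:Infobox football biography"},
    "ronaldinho": {"cat:Brazilian footballers", "tpl:Infobox football biography"},
    "london eye": {"cat:Ferris wheels", "tpl:Infobox building"},
}

# Hypothetical ground truth: titles that also have a WordNet-style category.
GROUND_TRUTH = {"paris": "location", "madrid": "location", "pele": "person"}


def train(articles, ground_truth):
    """Count how often each structural feature co-occurs with each label."""
    votes = defaultdict(Counter)
    for title, label in ground_truth.items():
        for feature in articles.get(title, ()):
            votes[feature][label] += 1
    return votes


def classify(tag, articles, votes):
    """Return the label with the most feature votes, or None if unknown."""
    features = articles.get(tag.lower())
    if not features:
        return None  # no matching article for this tag
    tally = Counter()
    for feature in features:
        tally.update(votes.get(feature, {}))
    if not tally:
        return None  # article found, but none of its features were seen in training
    return tally.most_common(1)[0][0]


MODEL = train(ARTICLES, GROUND_TRUTH)
```

In this toy setup, `classify("Ronaldinho", ARTICLES, MODEL)` is labeled through the category and template it shares with a labeled article, and `classify("Barcelona", ARTICLES, MODEL)` is labeled through its template alone, while a tag whose article shares no feature with the training data is left unclassified. A real system would need many more labeled articles and a stronger learner than vote counting.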


