Real-time automatic tag recommendation
Source
Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Singapore, Singapore
SESSION: Social tagging
Pages 515-522
Year of Publication: 2008
ISBN: 978-1-60558-164-4
Authors
Yang Song  The Pennsylvania State University, University Park, PA, USA
Ziming Zhuang  Yahoo! Applied Research, Santa Clara, CA, USA
Huajing Li  The Pennsylvania State University, University Park, PA, USA
Qiankun Zhao  AOL Research Lab, Beijing, China
Jia Li  The Pennsylvania State University, University Park, PA, USA
Wang-Chien Lee  The Pennsylvania State University, University Park, PA, USA
C. Lee Giles  The Pennsylvania State University, University Park, PA, USA
Sponsors
ACM: Association for Computing Machinery
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 74,   Downloads (12 Months): 642,   Citation Count: 3
DOI: http://doi.acm.org/10.1145/1390334.1390423

ABSTRACT

Tags are user-generated labels for entities. Existing research on tag recommendation focuses either on improving accuracy or on automating the process, while ignoring efficiency. We propose a novel, highly automated framework for real-time tag recommendation. The tagged training documents are treated as triplets of (words, docs, tags) and represented in two bipartite graphs, which are partitioned into clusters by Spectral Recursive Embedding (SRE). Tags in each topical cluster are ranked by our novel ranking algorithm. A two-way Poisson Mixture Model (PMM) is proposed to model the document distribution as mixture components within each cluster and, simultaneously, to aggregate words into word clusters. A new document is classified by the mixture model based on its posterior probabilities, so that tags are recommended according to their ranks. Experiments on large-scale tagging datasets of scientific documents (CiteULike) and web pages (del.icio.us) indicate that our framework is capable of making tag recommendations efficiently and effectively. The average tagging time for a test document is around 1 second, with over 88% of test documents correctly labeled with the top nine tags we suggested.
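
The classification step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a plain (one-way) Poisson mixture with independent per-word rates, skips the SRE partitioning and the word clustering entirely, and uses made-up per-tag scores in place of the paper's ranking algorithm. All names (`COMPONENTS`, `posteriors`, `recommend_tags`) and all numbers are hypothetical.

```python
import math

# Hypothetical toy model: two mixture components over a three-word vocabulary.
# "rates" are assumed Poisson means per word; "tag_scores" stand in for the
# output of a tag-ranking step and are NOT derived from the paper.
COMPONENTS = [
    {"weight": 0.5,
     "rates": {"graph": 3.0, "cluster": 2.0, "query": 0.2},
     "tag_scores": {"clustering": 0.9, "graphs": 0.6}},
    {"weight": 0.5,
     "rates": {"graph": 0.3, "cluster": 0.2, "query": 4.0},
     "tag_scores": {"search": 0.8, "ir": 0.7}},
]


def poisson_log_pmf(count, rate):
    """Log probability of observing `count` under a Poisson with mean `rate`."""
    return count * math.log(rate) - rate - math.lgamma(count + 1)


def posteriors(word_counts, components):
    """Posterior probability of each component given a document's word counts."""
    log_scores = []
    for comp in components:
        log_p = math.log(comp["weight"])
        # Words outside the component vocabulary are ignored; absent words count as 0.
        for word, rate in comp["rates"].items():
            log_p += poisson_log_pmf(word_counts.get(word, 0), rate)
        log_scores.append(log_p)
    # Subtract the maximum before exponentiating to avoid underflow.
    peak = max(log_scores)
    unnormalized = [math.exp(s - peak) for s in log_scores]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def recommend_tags(word_counts, components, top_n=3):
    """Rank tags by their component scores weighted by the component posteriors."""
    post = posteriors(word_counts, components)
    scores = {}
    for p, comp in zip(post, components):
        for tag, tag_score in comp["tag_scores"].items():
            scores[tag] = scores.get(tag, 0.0) + p * tag_score
    # Sort by descending score, breaking ties alphabetically.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]


doc = {"graph": 4, "cluster": 1}
print(recommend_tags(doc, COMPONENTS, top_n=2))
```

Because the expensive work (clustering, rate estimation, tag ranking) happens offline, tagging a new document in this sketch costs only one pass over the components and their vocabularies, which is the kind of property a real-time recommender depends on.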



