A sparse Gaussian processes classification framework for fast tag suggestions
Conference on Information and Knowledge Management
Proceeding of the 17th ACM conference on Information and knowledge management
Napa Valley, California, USA
SESSION: KM: classification
Pages 93-102
Year of Publication: 2008
ISBN: 978-1-59593-991-3
Authors
Yang Song, The Pennsylvania State University, University Park, PA, USA
Lu Zhang, The Pennsylvania State University, University Park, PA, USA
C. Lee Giles, The Pennsylvania State University, University Park, PA, USA
ABSTRACT
Tagged data is rapidly becoming more available on the World Wide Web. Web sites that provide tagging services offer Internet users a good way to share their knowledge. An interesting problem is how to suggest tags when a new resource becomes available. In this paper, we address the issue of efficient tag suggestion. We first propose a multi-class sparse Gaussian process classification framework (SGPS) that is capable of classifying data with very few training instances. We suggest a novel prototype selection algorithm to select the best subset of points for model learning. The framework is then extended to a novel multi-class multi-label classification algorithm (MMSG) that transforms tag suggestion into a multi-label ranking problem. Experiments on benchmark data sets and real-world data from Del.icio.us and BibSonomy suggest that our model can greatly improve the performance of tag suggestion compared to the state of the art. Overall, our model requires linear time to train and constant time to predict per case. Its memory consumption is also significantly lower than that of traditional batch learning algorithms such as SVMs. In addition, results on tagging digital data demonstrate that our model is capable of recommending relevant tags for images and videos by using their surrounding textual information.
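The abstract does not spell out the SGPS or MMSG algorithms, so the following is only a minimal sketch of the general idea it describes: fit a Gaussian process on a small subset of prototype points, then rank tags by their predicted scores. Several parts are assumptions made for illustration and are not taken from the paper: the greedy farthest-point prototype selection stands in for the authors' selection algorithm, GP regression on +1/-1 labels stands in for their classification model, and all function names (`rbf`, `select_prototypes`, `solve`, `fit`, `suggest_tags`), the toy data, and the tag names are hypothetical.

```python
import math

def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel between two feature vectors."""
    sq = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-0.5 * sq / length_scale ** 2)

def select_prototypes(X, m):
    """Greedy farthest-point selection of m indices (m <= len(X)).
    Assumption: a stand-in heuristic, not the paper's selection algorithm."""
    chosen = [0]
    dist = [math.dist(x, X[0]) for x in X]
    while len(chosen) < m:
        nxt = max(range(len(X)), key=dist.__getitem__)
        chosen.append(nxt)
        dist = [min(d, math.dist(x, X[nxt])) for d, x in zip(dist, X)]
    return chosen

def solve(A, B):
    """Solve A Z = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]  # augmented matrix [A | B]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [v - f * p for v, p in zip(M[r], M[col])]
    return [[M[i][n + j] / M[i][i] for j in range(len(B[0]))] for i in range(n)]

def fit(X, Y, m, noise=0.1):
    """Fit one GP regressor per tag using only the m selected prototypes.
    Y[i][t] is +1 if resource i carries tag t, else -1."""
    idx = select_prototypes(X, m)
    Xp = [X[i] for i in idx]
    Yp = [Y[i] for i in idx]
    K = [[rbf(p, q) + (noise if i == j else 0.0) for j, q in enumerate(Xp)]
         for i, p in enumerate(Xp)]
    return Xp, solve(K, Yp)  # alpha has shape (m, number of tags)

def suggest_tags(x_new, Xp, alpha, tag_names, k=2):
    """Rank all tags by GP predictive mean and return the top k."""
    k_star = [rbf(x_new, p) for p in Xp]
    scores = [sum(ks * row[t] for ks, row in zip(k_star, alpha))
              for t in range(len(tag_names))]
    order = sorted(range(len(tag_names)), key=lambda t: -scores[t])
    return [tag_names[t] for t in order[:k]]

# Toy data: two well-separated groups of resources in a 2-D feature space.
X = [[0.0, 0.0], [0.2, 0.0], [0.0, 0.3], [5.0, 5.0], [5.4, 5.0], [5.0, 5.5]]
TAGS = ["python", "cooking", "reference"]
Y = [[1, -1, 1], [1, -1, 1], [1, -1, 1], [-1, 1, 1], [-1, 1, 1], [-1, 1, 1]]

prototypes, alpha = fit(X, Y, m=4)
print(suggest_tags([0.1, 0.1], prototypes, alpha, TAGS, k=2))
print(suggest_tags([5.1, 5.1], prototypes, alpha, TAGS, k=2))
```

In this sketch, prediction touches only the `m` prototypes, so for a fixed `m` its cost does not grow with the number of training resources, and the selection step scans the training set `m` times, which is linear in its size. This mirrors the scaling behaviour claimed in the abstract, though it says nothing about how the paper itself achieves it.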