Large scale multi-label classification via metalabeler
International World Wide Web Conference
Proceedings of the 18th international conference on World wide web
Madrid, Spain
SESSION: Data mining/session: learning
Pages 211-220
Year of Publication: 2009
ISBN: 978-1-60558-487-4
ABSTRACT
The explosion of online content has made the management of such content non-trivial. Web-related tasks such as web page categorization, news filtering, query categorization, tag recommendation, etc. often involve the construction of multi-label categorization systems on a large scale. Existing multi-label classification methods either do not scale or have unsatisfactory performance. In this work, we propose MetaLabeler to automatically determine the relevant set of labels for each instance without intensive human involvement or expensive cross-validation. Extensive experiments conducted on benchmark data show that the MetaLabeler tends to outperform existing methods. Moreover, MetaLabeler scales to millions of multi-labeled instances and can be deployed easily. This enables us to apply the MetaLabeler to a large scale query categorization problem in Yahoo!, yielding a significant improvement in performance.
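The abstract does not describe how the label set is determined, so the following is only an illustrative sketch and not the paper's implementation. It assumes one plausible setup: each label already has a relevance score for an instance, and a separate per-instance predictor supplies the number of labels to keep. The function name `select_labels`, the example scores, and the value of `predicted_count` are all hypothetical.

```python
from typing import Dict, List


def select_labels(scores: Dict[str, float], k: int) -> List[str]:
    """Return the k highest-scoring labels, best first.

    `scores` maps each label to a relevance score (for example, the output
    of a per-label classifier). `k` is the number of labels to keep and is
    assumed to come from a separate per-instance predictor.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    # Sort by descending score; break ties alphabetically for determinism.
    ranked = sorted(scores, key=lambda label: (-scores[label], label))
    return ranked[:k]


# Hypothetical scores for one query and a hypothetical predicted label count.
query_scores = {"sports": 1.3, "travel": -0.4, "finance": 0.2, "health": -1.1}
predicted_count = 2
print(select_labels(query_scores, predicted_count))  # ['sports', 'finance']
```

Choosing the number of labels per instance, rather than applying one global score threshold, lets instances with many relevant labels keep more of them without a threshold search.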