Acclimatizing Taxonomic Semantics for Hierarchical Content Classification
Source
International Conference on Knowledge Discovery and Data Mining
Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Philadelphia, PA, USA
SESSION: Research track papers
Pages: 384 - 393
Year of Publication: 2006
ISBN: 1-59593-339-5
ABSTRACT
Hierarchical models have been shown to be effective in content classification. However, we observe through empirical study that the performance of a hierarchical model varies with the given taxonomy; even a semantically sound taxonomy can potentially be restructured for better classification. By scrutinizing typical cases, we elucidate why a given semantics-based hierarchy does not work well in content classification and how it could be improved for accurate hierarchical classification. Building on this understanding, we propose effective localized solutions that modify the given taxonomy for accurate hierarchical classification. We conduct extensive experiments on both toy and real-world data sets, report improved performance and interesting findings, and provide further analysis of algorithmic issues such as time complexity, robustness, and sensitivity to the number of features.
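To make the idea of a localized taxonomy modification concrete, the following is a minimal sketch only. The abstract does not specify the operations the paper uses, so the `promote` edit, the child-to-parent representation, and all category names below are assumptions chosen for illustration, not the authors' method. The sketch shows how one local edit changes the chain of decisions a top-down hierarchical classifier would have to make to reach a category.

```python
from typing import Dict, List, Optional

# Toy taxonomy as a child -> parent map; the root has parent None.
# All category names are made up for illustration.
Taxonomy = Dict[str, Optional[str]]


def path_from_root(tax: Taxonomy, node: str) -> List[str]:
    """Return the categories on the path from the root down to `node`.

    A top-down hierarchical classifier must make one correct choice
    per level along this path to reach `node`.
    """
    path: List[str] = []
    cur: Optional[str] = node
    while cur is not None:
        path.append(cur)
        cur = tax[cur]
    return path[::-1]


def promote(tax: Taxonomy, node: str) -> Taxonomy:
    """Hypothetical local edit: reattach `node` to its grandparent.

    Returns a new taxonomy; the input mapping is left unchanged.
    """
    parent = tax[node]
    if parent is None or tax[parent] is None:
        raise ValueError(f"{node!r} has no grandparent to move under")
    new_tax = dict(tax)
    new_tax[node] = tax[parent]
    return new_tax


taxonomy: Taxonomy = {
    "root": None,
    "sports": "root",
    "health": "root",
    "fitness": "sports",
    "yoga": "fitness",
}

before = path_from_root(taxonomy, "yoga")
after = path_from_root(promote(taxonomy, "yoga"), "yoga")
print(before)
print(after)
```

In this toy case the edit removes one level from the path to `yoga`. Whether such a change actually improves accuracy is an empirical question of the kind the paper studies; the sketch only illustrates the structural effect.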