Acclimatizing Taxonomic Semantics for Hierarchical Content Classification
Source: International Conference on Knowledge Discovery and Data Mining
Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Philadelphia, PA, USA
Session: Research track papers
Pages: 384 - 393  
Year of Publication: 2006
ISBN:1-59593-339-5
Authors
Lei Tang  Arizona State University, Tempe, Arizona
Jianping Zhang  AOL Inc., Dulles, Virginia
Huan Liu  Arizona State University, Tempe, Arizona
Sponsors
ACM: Association for Computing Machinery
SIGKDD: ACM Special Interest Group on Knowledge Discovery in Data
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1150402.1150446


ABSTRACT

Hierarchical models have been shown to be effective in content classification. However, we observe through empirical study that the performance of a hierarchical model varies with given taxonomies; even a semantically sound taxonomy has potential to change its structure for better classification. By scrutinizing typical cases, we elucidate why a given semantics-based hierarchy does not work well in content classification, and how it could be improved for accurate hierarchical classification. With these understandings, we propose effective localized solutions that modify the given taxonomy for accurate hierarchical classification. We conduct extensive experiments on both toy and real-world data sets, report improved performance and interesting findings, and provide further analysis of algorithmic issues such as time complexity, robustness, and sensitivity to the number of features.



