Text classification in a hierarchical mixture model for small training sets
Source
Conference on Information and Knowledge Management
Proceedings of the tenth international conference on Information and knowledge management
Atlanta, Georgia, USA
Session: Text Extraction and Summarization
Pages: 105 - 113
Year of Publication: 2001
ISBN: 1-58113-436-3
ABSTRACT
Documents are commonly categorized into hierarchies of topics, such as the ones maintained by Yahoo! and the Open Directory Project, to facilitate browsing and other interactive forms of information retrieval. Topic hierarchies can also be used to overcome the sparseness problem in text categorization with a large number of categories, which is the main focus of this paper. The paper presents a hierarchical mixture model that extends the standard naive Bayes classifier and previous hierarchical approaches. Estimates of the term distributions are improved by differentiating words in the hierarchy according to their level of generality or specificity. Experiments on the Newsgroups and Reuters-21578 datasets indicate improved performance of the proposed classifier compared with other state-of-the-art methods on datasets with a small number of positive examples.
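
The idea of mixing term distributions along a topic hierarchy can be illustrated with a minimal sketch. It is not the paper's model: the abstract does not give the estimation procedure, so the toy hierarchy, the add-one smoothing, the hand-fixed mixing weights, the uniform class prior, and all function names below are assumptions made purely for illustration. Each word of a document is scored under a mixture of the unigram distributions of the nodes on the path from the root to a leaf class, so general words can be explained by upper nodes and specific words by the leaf.

```python
import math
from collections import Counter

# Hypothetical toy hierarchy: each leaf class maps to its path from the root.
PATHS = {
    "hockey": ["root", "sport", "hockey"],
    "baseball": ["root", "sport", "baseball"],
    "graphics": ["root", "computing", "graphics"],
}

# Hypothetical training set: one tiny tokenized document per leaf class.
DOCS = [
    (["puck", "ice", "game", "team"], "hockey"),
    (["bat", "pitch", "game", "team"], "baseball"),
    (["pixel", "render", "image", "software"], "graphics"),
]

VOCAB = sorted({w for tokens, _ in DOCS for w in tokens})

# Assumed, hand-fixed mixing weights for (root, inner node, leaf); they sum to 1.
WEIGHTS = (0.2, 0.3, 0.5)


def node_term_distributions(docs, paths, vocab, alpha=1.0):
    """Smoothed unigram distribution per node, pooling the tokens of all
    training documents whose class lies at or below that node."""
    counts = {}
    for tokens, label in docs:
        for node in paths[label]:
            counts.setdefault(node, Counter()).update(tokens)
    dists = {}
    for node, c in counts.items():
        total = sum(c.values()) + alpha * len(vocab)
        dists[node] = {w: (c[w] + alpha) / total for w in vocab}
    return dists


def mixture_log_likelihood(tokens, label, paths, dists, weights):
    """log P(tokens | label), with each word drawn from a mixture of the
    node distributions on the path root -> label."""
    path = paths[label]
    known = dists[path[0]]  # the root distribution covers the whole vocabulary
    ll = 0.0
    for w in tokens:
        if w not in known:
            continue  # skip out-of-vocabulary words
        p = sum(lam * dists[node][w] for lam, node in zip(weights, path))
        ll += math.log(p)
    return ll


def classify(tokens, paths, dists, weights):
    """Return the leaf class with the highest likelihood (uniform prior assumed)."""
    return max(
        paths,
        key=lambda label: mixture_log_likelihood(tokens, label, paths, dists, weights),
    )


DISTS = node_term_distributions(DOCS, PATHS, VOCAB)
print(classify(["ice", "team"], PATHS, DISTS, WEIGHTS))
```

With very few documents per leaf, the leaf-level counts alone are unreliable; pooling counts at the ancestors gives every class a better-supported estimate for words shared across siblings. A fuller treatment would learn the mixing weights from data rather than fixing them by hand.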