Error-driven generalist+experts (edge): a multi-stage ensemble framework for text categorization
Source
Conference on Information and Knowledge Management
Proceeding of the 17th ACM conference on Information and knowledge management
Napa Valley, California, USA
SESSION: KM: classification
Pages 83-92
Year of Publication: 2008
ISBN: 978-1-59593-991-3
Authors
Jian Huang, Pennsylvania State University, University Park, PA, USA
Omid Madani, SRI International, Menlo Park, CA, USA
C. Lee Giles, Pennsylvania State University, University Park, PA, USA
ABSTRACT
We introduce a multi-stage ensemble framework, Error-Driven Generalist+Experts or Edge, for improved classification on large-scale text categorization problems. Edge first trains a generalist, capable of classifying under all classes, to deliver a reasonably accurate initial category ranking for a given instance. Edge then computes a confusion graph for the generalist and allocates learning resources to train experts on relatively small groups of classes that the generalist tends to confuse systematically with one another. The experts' votes, when invoked on a given instance, yield a reranking of the classes, thereby correcting the errors of the generalist. Our evaluations show improved classification and ranking performance on several large-scale text categorization datasets. Edge is particularly efficient when the underlying learners are efficient. Our study of confusion graphs is also of independent interest.
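The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`confusion_graph`, `confused_groups`, `rerank`), the `min_count` threshold, the use of connected components to form groups, and the one-vote-per-expert reranking rule are all assumptions made for illustration, since the abstract does not specify these details. The generalist and the experts are stand-ins supplied by the caller.

```python
from collections import Counter, defaultdict


def confusion_graph(true_labels, predicted_labels):
    """Count how often each unordered pair of classes is confused."""
    edges = Counter()
    for truth, guess in zip(true_labels, predicted_labels):
        if truth != guess:
            edges[frozenset((truth, guess))] += 1
    return edges


def confused_groups(edges, min_count=2):
    """Group classes linked by edges with at least min_count confusions.

    Using connected components is an assumption of this sketch.
    """
    neighbours = defaultdict(set)
    for pair, count in edges.items():
        if count >= min_count:
            a, b = tuple(pair)
            neighbours[a].add(b)
            neighbours[b].add(a)
    groups, seen = [], set()
    for start in sorted(neighbours):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:  # depth-first walk over the thresholded graph
            node = stack.pop()
            if node not in component:
                component.add(node)
                stack.extend(neighbours[node] - component)
        seen |= component
        groups.append(component)
    return groups


def rerank(ranking, groups, experts, instance):
    """Reorder the generalist's ranking using expert votes.

    Only experts whose group contains the top-ranked class are invoked
    (an assumed invocation rule). Ties keep the generalist's order.
    """
    votes = Counter()
    top = ranking[0]
    for group, expert in zip(groups, experts):
        if top in group:
            votes[expert(instance)] += 1
    position = {label: i for i, label in enumerate(ranking)}
    return sorted(ranking, key=lambda label: (-votes[label], position[label]))


# Toy usage: the generalist's held-out predictions confuse "cat" and "dog".
truth = ["cat", "cat", "dog", "dog", "car"]
guess = ["dog", "dog", "cat", "dog", "car"]
groups = confused_groups(confusion_graph(truth, guess))
# Hypothetical expert for the single confused group.
experts = [lambda text: "cat" if "meow" in text else "dog"]
new_ranking = rerank(["dog", "cat", "car"], groups, experts, "meow meow")
```

In this toy run the only confused group is `{"cat", "dog"}`, so the single expert is invoked whenever the generalist ranks one of those two classes first, and its vote can move the other class to the top. A real system would train each expert on the instances of its group rather than hard-code it.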