A meta-learning approach for text categorization
Source
Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
New Orleans, Louisiana, United States
Pages: 303 - 309
Year of Publication: 2001
ISBN: 1-58113-331-6
Authors
Wai Lam, The Chinese Univ. of Hong Kong, Shatin, Hong Kong
Kwok-Yin Lai, The Chinese Univ. of Hong Kong, Shatin, Hong Kong
ABSTRACT
We investigate a meta-model approach, called Meta-learning Using Document Feature characteristics (MUDOF), for automatic categorization of text documents. MUDOF employs a meta-learning phase that uses document feature characteristics. These characteristics, derived from the training document set, capture inherent category-specific properties of each category. Unlike existing categorization methods, MUDOF can automatically recommend a suitable algorithm for each category based on that category's statistical characteristics, so different algorithms may be employed for different categories. Experiments on a real-world document collection demonstrate the effectiveness of the approach. The results confirm that the meta-model approach can exploit the advantages of its component algorithms and performs better than existing algorithms.
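The idea of recommending one algorithm per category can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the recommendation is computed, so the sketch assumes a simple per-algorithm linear regression from a single category characteristic to the observed classification error, and then picks the algorithm with the lowest predicted error. The algorithm names, the characteristic, and all numbers are hypothetical toy values.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x with one predictor."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        return my, 0.0  # no variation in x: fall back to a constant model
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def train_meta_model(characteristic, errors_by_algorithm):
    """Fit one error-prediction line per algorithm (an assumed meta-learner).

    characteristic: {category: value of a category-level statistic}
    errors_by_algorithm: {algorithm: {category: observed error}}
    """
    models = {}
    for algo, errors in errors_by_algorithm.items():
        cats = sorted(errors)
        xs = [characteristic[c] for c in cats]
        ys = [errors[c] for c in cats]
        models[algo] = fit_line(xs, ys)
    return models


def recommend(models, x):
    """Return the algorithm with the lowest predicted error for statistic x."""
    def predicted_error(algo):
        a, b = models[algo]
        return a + b * x
    return min(models, key=predicted_error)


# Hypothetical toy data: one characteristic per training category, and the
# error each (hypothetical) component algorithm made on that category.
characteristic = {"cat1": 1.0, "cat2": 2.0, "cat3": 3.0}
errors_by_algorithm = {
    "algo_A": {"cat1": 0.50, "cat2": 0.40, "cat3": 0.30},
    "algo_B": {"cat1": 0.35, "cat2": 0.35, "cat3": 0.35},
}

models = train_meta_model(characteristic, errors_by_algorithm)
for x in (1.0, 4.0):
    print(x, recommend(models, x))
```

In this toy setup the recommendation changes with the category characteristic, which is the behaviour the abstract describes: different categories may end up with different component algorithms.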
CITED BY 8
Haoran Wu , Tong Heng Phang , Bing Liu , Xiaoli Li, A refinement approach to handling model misfit in text categorization, Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining, July 23-26, 2002, Edmonton, Alberta, Canada