Classifying news stories using memory based reasoning
Source
Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 15th annual international ACM SIGIR conference on Research and development in information retrieval
Copenhagen, Denmark
Pages: 59 - 65
Year of Publication: 1992
ISBN: 0-89791-523-2
Authors
Brij Masand, Thinking Machines Corporation, 245 First Street, Cambridge, Massachusetts
Gordon Linoff, Thinking Machines Corporation, 245 First Street, Cambridge, Massachusetts
David Waltz, Thinking Machines Corporation, 245 First Street, Cambridge, Massachusetts and Center for Complex Systems at Brandeis University, Waltham, MA
ABSTRACT
We describe a method for classifying news stories using Memory Based Reasoning (MBR), a k-nearest neighbor method, that does not require manual topic definitions. Using an already coded training database of about 50,000 stories from the Dow Jones Press Release News Wire, and SEEKER [Stanfill] (a text retrieval system that supports relevance feedback) as the underlying match engine, we assign codes to new, unseen stories with a recall of about 80% and a precision of about 70%. There are about 350 different codes to be assigned. Using a massively parallel supercomputer, we leverage the information already contained in the thousands of coded stories and are able to code a story in about 2 seconds. Given SEEKER, the text retrieval system, we achieved these results in about two person-months. We believe this approach is effective in reducing the development time needed to implement systems involving a large number of topics, for purposes such as classification and message routing.
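As a rough illustration of the k-nearest neighbor idea described in the abstract, the sketch below assigns codes to a new story by letting its most similar already-coded stories vote, then scores the result with precision and recall. It is a minimal sketch under stated assumptions, not the authors' system: the bag-of-words cosine similarity stands in for SEEKER, whose matching is not described here, and the toy training stories, the code names, `k`, and the vote `threshold` are all hypothetical.

```python
from collections import Counter, defaultdict
import math


def vectorize(text):
    """Bag-of-words term counts (an assumed stand-in for SEEKER's matching)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def assign_codes(story, training, k=2, threshold=0.5):
    """Assign every code whose similarity-weighted share of the k nearest
    neighbors' votes reaches `threshold` (a hypothetical cutoff)."""
    query = vectorize(story)
    neighbors = sorted(
        ((cosine(query, vectorize(text)), codes) for text, codes in training),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    total = sum(score for score, _ in neighbors)
    if total == 0:
        return set()
    votes = defaultdict(float)
    for score, codes in neighbors:
        for code in codes:
            votes[code] += score
    return {code for code, weight in votes.items() if weight / total >= threshold}


def precision_recall(predicted, actual):
    """Precision and recall of a predicted code set against the true codes."""
    hits = len(predicted & actual)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical, hand-made training data: (story text, set of codes).
TRAINING = [
    ("bank raises interest rates on loans", {"banking"}),
    ("central bank cuts interest rates", {"banking", "economy"}),
    ("oil prices rise as crude supply falls", {"energy"}),
    ("crude oil output cut lifts energy stocks", {"energy", "markets"}),
]

predicted = assign_codes("bank interest rates rise", TRAINING)
precision, recall = precision_recall(predicted, {"banking"})
```

No topic definitions are written by hand in this sketch: the codes come entirely from the neighbors' existing labels, which is the property the abstract emphasizes. Raising `threshold` trades recall for precision, since fewer weakly supported codes survive the vote.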
REFERENCES
1. Biebricher, Peter; Fuhr, Norbert et al., "The Automatic Indexing System AIR/PHYS -- From Research to Application." Internal report, TH Darmstadt, Department of Computer Science, Darmstadt, Germany.
2.
3. Dasarathy, B. V. Nearest Neighbor (NN) Norms: NN Pattern Classification Techniques. IEEE Computer Society Press, Los Alamitos, California (1991).
4.
5.
6. P. J. Hayes, P. M. Andersen, I. B. Nirenburg, L. M. Schmandt, TCS: a shell for content-based text categorization, Proceedings of the sixth conference on Artificial intelligence applications, p.320-326, January 1990, Santa Barbara, California, United States
7.
8. Lewis, David D., "An Evaluation of Phrasal and Clustered Representation on a Text Categorization Task." University of Chicago, personal communication, manuscript in progress.
9.
10.
11. Stanfill, C. and Waltz, D. L. "The Memory-Based Reasoning Paradigm." Proc. Case-Based Reasoning Workshop, Clearwater Beach, FL (May 1988), pp. 414-424.
12.
13. Young, Sheryl R., Hayes, Philip J., "Automatic Classification and Summarization of Banking Telexes." Proceedings of the Second IEEE Conference on AI Applications, 1985, Miami Beach, FL.
14. Waltz, D. L. "Memory-Based Reasoning." In M.A. Arbib and J.A. Robinson (eds), Natural and Artificial Parallel Computation, The MIT Press, Cambridge, Mass., (1990), pp. 251-276.
CITED BY 46
Dimitris Meretakis, Dimitris Fragoudis, Hongjun Lu, Spiros Likothanassis, Scalable association-based text classification, Proceedings of the ninth international conference on Information and knowledge management, p.5-11, November 06-11, 2000, McLean, Virginia, United States
Hinrich Schütze, David A. Hull, Jan O. Pedersen, A comparison of classifiers and document representations for the routing problem, Proceedings of the 18th annual international ACM SIGIR conference on Research and development in information retrieval, p.229-237, July 09-13, 1995, Seattle, Washington, United States
Yiming Yang, Jaime G. Carbonell, Ralf D. Brown, Thomas Pierce, Brian T. Archibald, Xin Liu, Learning Approaches for Detecting and Tracking News Events, IEEE Intelligent Systems, v.14 n.4, p.32-43, July 1999