Topic-conditioned novelty detection

Source
International Conference on Knowledge Discovery and Data Mining
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
Edmonton, Alberta, Canada
POSTER SESSION: Poster papers
Pages: 688 - 693
Year of Publication: 2002
ISBN: 1-58113-567-X
ABSTRACT
Automated detection of the first document reporting each new event in temporally sequenced streams of documents is an open challenge. In this paper we propose a new approach which addresses this problem in two stages: 1) using a supervised learning algorithm to classify the on-line document stream into pre-defined broad topic categories, and 2) performing topic-conditioned novelty detection for documents in each topic. We also focus on exploiting named entities for event-level novelty detection and using feature-based heuristics derived from the topic histories. Evaluated on a set of broadcast news stories, these methods show substantial performance gains over the traditional one-level approach to the novelty detection problem.
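
The two-stage idea can be illustrated with a minimal sketch. The abstract does not specify the classifier, the features, the similarity measure, or the decision threshold, so everything concrete here is an assumption made for illustration: a nearest-centroid classifier stands in for the supervised learner, raw term counts stand in for the document features, cosine similarity with a fixed threshold stands in for the novelty test, and the class name `TopicConditionedDetector` is hypothetical. The named-entity weighting and topic-history heuristics mentioned above are not modeled.

```python
import math
from collections import Counter, defaultdict


def vectorize(text):
    """Bag-of-words term counts (assumed stand-in for the real features)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class TopicConditionedDetector:
    """Hypothetical two-stage detector: classify first, then test novelty per topic."""

    def __init__(self, labeled_docs, threshold=0.3):
        # Stage 1 training: build one centroid per pre-defined topic
        # from (topic, text) pairs. Nearest-centroid is an assumption.
        self.centroids = defaultdict(Counter)
        for topic, text in labeled_docs:
            self.centroids[topic].update(vectorize(text))
        self.threshold = threshold          # assumed fixed novelty threshold
        self.history = defaultdict(list)    # past document vectors, per topic

    def classify(self, vec):
        # Assign the topic whose centroid is most similar to the document.
        # A document matching no topic falls back to the first topic seen.
        return max(self.centroids, key=lambda t: cosine(vec, self.centroids[t]))

    def process(self, text):
        """Return (topic, is_novel) for the next document in the stream."""
        vec = vectorize(text)
        topic = self.classify(vec)
        past = self.history[topic]
        # Stage 2: compare only against earlier documents in the same topic.
        best = max((cosine(vec, old) for old in past), default=0.0)
        past.append(vec)
        return topic, best < self.threshold


training = [
    ("sports", "team wins match final score goal"),
    ("disasters", "earthquake flood storm damage rescue"),
]
detector = TopicConditionedDetector(training, threshold=0.3)
stream = [
    "earthquake damage reported in coastal city",
    "rescue teams respond to earthquake damage in coastal city",
    "team wins final match with late goal",
]
for text in stream:
    print(detector.process(text))
```

In this toy stream the first and third documents are flagged as novel because their topic histories are empty, while the second is not, since it closely matches the earlier earthquake story. A one-level baseline would instead compare each document against the entire history regardless of topic.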