Topic-conditioned novelty detection
Source: International Conference on Knowledge Discovery and Data Mining
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
Edmonton, Alberta, Canada
POSTER SESSION: Poster papers
Pages: 688-693
Year of Publication: 2002
ISBN:1-58113-567-X
Authors
Yiming Yang  Carnegie Mellon University, Pittsburgh, PA
Jian Zhang  Carnegie Mellon University, Pittsburgh, PA
Jaime Carbonell  Carnegie Mellon University, Pittsburgh, PA
Chun Jin  Carnegie Mellon University, Pittsburgh, PA
Sponsors
SIGKDD: ACM Special Interest Group on Knowledge Discovery in Data
SIGMOD: ACM Special Interest Group on Management of Data
AAAI
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/775047.775150

ABSTRACT

Automated detection of the first document reporting each new event in temporally-sequenced streams of documents is an open challenge. In this paper we propose a new approach which addresses this problem in two stages: 1) using a supervised learning algorithm to classify the on-line document stream into pre-defined broad topic categories, and 2) performing topic-conditioned novelty detection for documents in each topic. We also focus on exploiting named entities for event-level novelty detection and using feature-based heuristics derived from the topic histories. Evaluated on a set of broadcast news stories, these methods show substantial performance gains over the traditional one-level approach to the novelty detection problem.
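
The two-stage idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the keyword rule `toy_classify` is a hypothetical stand-in for the supervised topic classifier, the raw term-frequency vectors and cosine similarity are assumed for simplicity, and the threshold of 0.5 is arbitrary. Named-entity features and the topic-history heuristics mentioned in the abstract are omitted.

```python
import math
from collections import Counter, defaultdict


def vectorize(text):
    """Bag-of-words term counts (an assumed, simplified representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def toy_classify(text):
    """Hypothetical stage 1: a keyword rule standing in for a trained classifier."""
    lowered = text.lower()
    if "earthquake" in lowered or "flood" in lowered:
        return "disaster"
    return "other"


class TopicConditionedDetector:
    """Stage 2: compare each document only with earlier documents in its topic."""

    def __init__(self, classify, threshold=0.5):
        self.classify = classify        # function mapping text -> topic label
        self.threshold = threshold      # assumed value; would need tuning
        self.history = defaultdict(list)  # topic label -> earlier vectors

    def is_novel(self, text):
        topic = self.classify(text)
        vec = vectorize(text)
        past = self.history[topic]
        # Highest similarity to any earlier document in the same topic.
        max_sim = max((cosine(vec, p) for p in past), default=0.0)
        past.append(vec)
        # A document is flagged as novel when nothing similar precedes it.
        return topic, max_sim < self.threshold


detector = TopicConditionedDetector(toy_classify)
stream = [
    "earthquake strikes coastal city",
    "earthquake strikes coastal city again",
    "flood hits river valley",
    "election results announced",
]
results = [detector.is_novel(doc) for doc in stream]
```

In this sketch, conditioning on the topic means a document is only ever compared against the history of its own category, which is the contrast with a one-level detector that compares against the whole stream.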

