Topic detection and tracking in English and Chinese
Source: International Workshop on Information Retrieval with Asian Languages
Proceedings of the fifth international workshop on Information retrieval with Asian languages
Hong Kong, China
Pages: 165 - 172  
Year of Publication: 2000
ISBN:1-58113-300-6
Author
Charles L. Wayne  Department of Defense, Ft. Meade, Maryland
Sponsors
SIGIR: ACM Special Interest Group on Information Retrieval
SIGLINK: Hypertext, Hypermedia, and Web
SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web
ACM Hong Kong Chapter : ACM Hong Kong Chapter Executive Committee
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/355214.355238

ABSTRACT

Topic Detection and Tracking (TDT) refers to automatic techniques for discovering, threading, and retrieving topically related material in streams of data. Newswire and broadcast news are the canonical sources. In 1999, TDT research was extended from English to Chinese, and carefully annotated multilingual corpora were created. Researchers devised clever approaches to the cross-language challenge, and formal performance evaluations yielded very promising results. This paper outlines the 1999 research tasks, corpora, evaluation procedures, technical approaches, and results. The multilingual, multimedia research and evaluations are continuing in 2000 and 2001 under the DARPA TIDES program.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
Martin, A., Doddington, G., Kamm, T., Ordowski, M. and Przybocki, M. (1997). The DET curve in assessment of detection task performance. In Proceedings of Eurospeech 1998. 1998. Available at http://www.itl.nist.gov/iaui/894.01/publications.
 
2
Doddington, G. The 1999 topic detection and tracking (TDT) task definition and evaluation plan. 1999. http://www.nist.gov/TDT/tdt99/doc/tdt3.eval.plan.99.v2.7.ps.
 
3
Proceedings of the TDT 1999 Workshop. 2000. http://www.nist.gov/TDT/tdt99/papers.
 
4
Presentations at the TDT 1999 Workshop. 2000. http://www.nist.gov/TDT/tdt99/presentations.
 
5
Franz, M., McCarley, J.S., Roukos, S., Ward, T. and Zhu, W.-J. Segmentation and detection at IBM: Hybrid statistical models and two-tiered clustering. In Proceedings of the TDT 1999 Workshop. 2000. http://www.nist.gov/TDT/tdt99/papers.
 
6
Leek, T., Jin, H., Sista, S. and Schwartz, R. The BBN crosslingual topic detection and tracking system. In Proceedings of the TDT 1999 Workshop. 2000. http://www.nist.gov/TDT/tdt99/papers.
 
7
Proceedings of DARPA Broadcast News Workshop. 1999. http://www.nist.gov/speech/publications/darpa99/index.htm.
 
8
Proceedings of DARPA Broadcast News Transcription and Understanding Workshop. 1998. http://www.nist.gov/speech/publications/darpa98/index.htm.
 
9
Cieri, C. Multiple annotation of reuseable data resources: Corpora for topic detection and tracking. In Actes des 5es Journees internationales d'analyse statistique des donnees textuelles, Rajman, M. and Chappelier, J., eds. 2000, volume 1.
 
10
Cieri, C., Graff, D., Liberman, M., Martey, N. and Strassel, S. Large multilingual broadcast news corpora for cooperative research in topic detection and tracking: The TDT2 and TDT3 corpus efforts. In Proceedings of the Second International Language Resources and Evaluation Conference. 2000.
 
11
Strassel, S., Graff, D., Martey, N. and Cieri, C. Quality control in large annotation projects involving multiple judges: The case of the TDT corpora. In Proceedings of the Second International Language Resources and Evaluation Conference. 2000.
 
12
 
13
14
15
 
16
 
17
van Mulbregt, P., Carp, I., Gillick, L., Lowe, S. and Yamron, J. Text segmentation and event tracking on broadcast news via a hidden Markov model approach. In Proceedings of the ESCA ETRW Workshop on Accessing Information in Spoken Audio. 1999, pp. 90-95.
 
18
Yamron, J., Carp, I., Gillick, L., Lowe, S. and van Mulbregt, P. A hidden Markov model approach to text segmentation and event tracking. In Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing. 1998, 1:pp. 333-336.
19
 
20
Dharanipragada, S., Franz, M., McCarley, J.S., Papineni, K., Roukos, S., Ward, T. and Zhu, W.-J. Statistical models for topic segmentation. In Proceedings of ICSLP 2000. 2000.
 
21
Dharanipragada, S., Franz, M., McCarley, J.S., Papineni, K., Roukos, S. and Ward, T. Story segmentation and topic detection for recognized speech. In Proceedings of Eurospeech 1998. 1998.
 
22
 
23
Tur, G., Hakkani-Tur, D., Stolcke, A. and Shriberg, E. Integrating prosodic and lexical cues for automatic topic segmentation. To appear in Computational Linguistics.
 
24
Papka, R. and Allan, J. Topic detection and tracking: Event clustering as a basis for first story detection. In Advances in Information Retrieval: Recent Research from the Center for Intelligent Information Retrieval, Croft, B., ed. Kluwer Academic Publishers, 2000, pp. 97-126.
25
 
26
 
27
Carthy, J. and Smeaton, A. The design of a topic tracking system. In Proceedings of BCS-IRSG 2000. 2000, pp. 84-93.
 
28
Hatch, P., Stokes, N., and Carthy, J. Topic detection, a new application for lexical chaining? In Proceedings of BCS-IRSG 2000. 2000, pp. 94-103.