ABSTRACT
Topic Detection and Tracking (TDT) refers to automatic techniques for discovering, threading, and retrieving topically related material in streams of data. Newswire and broadcast news are the canonical sources. In 1999, TDT research was extended from English to Chinese, and carefully annotated multilingual corpora were created. Researchers devised clever approaches to the cross-language challenge, and formal performance evaluations yielded very promising results. This paper outlines the 1999 research tasks, corpora, evaluation procedures, technical approaches, and results. The multilingual, multimedia research and evaluations are continuing in 2000 and 2001 under the DARPA TIDES program.