ABSTRACT
Automatic summarization of open-domain spoken dialogues is a new research area. This paper introduces the task and the challenges involved, and presents an approach to obtaining automatic extract summaries of multi-party dialogues from four different genres, without any restriction on domain. We address the following issues, which are intrinsic to spoken dialogue summarization and can typically be ignored when summarizing written text such as newswire data: (i) detection and removal of speech disfluencies; (ii) detection and insertion of sentence boundaries; (iii) detection and linking of cross-speaker information units (question-answer pairs). A global system evaluation using a corpus of 23 relevance-annotated dialogues containing 80 topical segments shows that, for the two more informal genres, our summarization system using dialogue-specific components significantly outperforms a baseline using TFIDF term weighting with maximum marginal relevance (MMR) ranking.