| Scalable summaries of spoken conversations |
Source
|
International Conference on Intelligent User Interfaces
Proceedings of the 13th international conference on Intelligent user interfaces
Gran Canaria, Spain
Pages 267-275
Year of Publication: 2008
ISBN: 978-1-59593-987-6
|
|
Authors
Sumit Basu (Microsoft Research, Redmond, WA)
Surabhi Gupta (Microsoft Research, Redmond, WA and Stanford University, Stanford, CA)
Milind Mahajan (Microsoft Research, Redmond, WA)
Patrick Nguyen (Microsoft Research, Redmond, WA)
John C. Platt (Microsoft Research, Redmond, WA)
ABSTRACT
In this work, we present a novel means of browsing recorded audio conversations. The method we develop produces scalable summaries of the recognized speech, in which the amount of text increases continuously with the desired level of detail to best fill the available space. We present an interface in which a user can view an entire conversation on one screen, but can also quickly zoom in to see the full transcript; the corresponding audio can be easily played as well. The scaling is achieved via a combination of topic segmentation and informative phrase selection, where the threshold for informativeness decreases with increasing level of detail. Finally, we evaluate our method and interface against a baseline interface in a user study.
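The scaling idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Phrase` type, the score range of 0 to 1, the linear mapping in `threshold_for`, and the example scores are all assumptions made for illustration. The sketch only shows the stated principle that the informativeness threshold drops as the requested level of detail rises, so that each topic segment contributes more phrases at higher detail.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Phrase:
    text: str
    score: float  # hypothetical informativeness score, assumed to lie in [0, 1]


def threshold_for(detail: float) -> float:
    """Map a detail level in [0, 1] to a score cutoff.

    Assumed linear mapping: more detail means a lower cutoff.
    """
    detail = min(max(detail, 0.0), 1.0)
    return 1.0 - detail


def summarize(segments: List[List[Phrase]], detail: float) -> List[List[str]]:
    """Keep, per topic segment, the phrases whose score meets the cutoff."""
    cutoff = threshold_for(detail)
    return [[p.text for p in seg if p.score >= cutoff] for seg in segments]


# Two hypothetical topic segments with made-up scores.
segments = [
    [Phrase("we booked", 0.9), Phrase("um well", 0.1), Phrase("flight to Boston", 0.7)],
    [Phrase("dinner plans", 0.8), Phrase("you know", 0.2)],
]

brief = summarize(segments, 0.25)  # cutoff 0.75
fuller = summarize(segments, 0.5)  # cutoff 0.5
full = summarize(segments, 1.0)    # cutoff 0.0, i.e. the whole transcript
```

Because the cutoff only decreases as `detail` grows, every phrase shown at a lower detail level also appears at any higher one, which is what lets a summary expand smoothly toward the full transcript.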