ABSTRACT
Because of the unique temporal and spatial properties of video data, different techniques for summarizing videos have been proposed. Key frames extracted directly from video inform users about content without requiring them to view the entire video. As part of ongoing work to develop video browsing interfaces, several interface displays based on key frames were investigated. Variations on dynamic key frame "slide shows" were examined and compared to a static key frame "filmstrip" display. The slide show mechanism displays key frames in rapid succession and is designed to facilitate visual browsing by exploiting human perceptual capabilities. User studies were conducted in a series of three experiments. User performance in object recognition and gist determination tasks, together with user perception of the displays, was investigated as a function of key frame display rate and the number of simultaneous displays. No significant performance degradation was detected at display rates up to 8 key frames per second, but performance degraded significantly at higher rates. Performance on gist determination tasks degraded less severely than performance on object recognition tasks as display rates increased. Furthermore, gist determination performance dropped significantly between three and four simultaneous slide shows in a single display. Users generally preferred key frame filmstrips to dynamic displays, although objective measures of performance were mixed. Implications for visual interface design and questions for future research are provided.