ABSTRACT
Automatic generation of video summaries is one of the key techniques in video management and browsing. In this paper, we present a generic framework for video summarization based on modeling the viewer's attention. Without requiring full semantic understanding of video content, this framework takes advantage of computational attention models and eliminates the need for complex heuristic rules in video summarization. A set of methods for computing audio-visual attention model features is proposed and presented. The experimental evaluations indicate that the computational attention based approach is an effective alternative to video semantic analysis for video summarization.