Feature fusion and redundancy pruning for rush video summarization
Source
International Multimedia Conference
Proceedings of the international workshop on TRECVID video summarization
Augsburg, Bavaria, Germany
Pages: 84-88
Year of Publication: 2007
ISBN: 978-1-59593-780-3
Authors
Jim Kleban  University of California: Santa Barbara, Santa Barbara, CA
Anindya Sarkar  University of California: Santa Barbara, Santa Barbara, CA
Emily Moxley  University of California: Santa Barbara, Santa Barbara, CA
Stephen Mangiat  University of California: Santa Barbara, Santa Barbara, CA
Swapna Joshi  University of California: Santa Barbara, Santa Barbara, CA
Thomas Kuo  University of California: Santa Barbara, Santa Barbara, CA
B. S. Manjunath  University of California: Santa Barbara, Santa Barbara, CA
Sponsors
SIGMULTIMEDIA: ACM Special Interest Group on Multimedia
ACM: Association for Computing Machinery
SIGGRAPH: ACM Special Interest Group on Computer Graphics and Interactive Techniques
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1290031.1290047

ABSTRACT

This paper presents a video summarization technique for rushes that employs high-level feature fusion to identify segments for inclusion. It aims to capture distinct video events using a variety of features: k-means based weighting, speech, camera motion, significant differences in HSV color space, and a dynamic time warping (DTW) based feature that suppresses repeated scenes. The feature functions drive a weighted k-means clustering that identifies the visually distinct, important segments constituting the final summary. The optimal weights for the individual features are obtained using a gradient descent algorithm that maximizes the recall of ground truth events from representative training videos. Analysis reveals a lengthy computation time but high-quality results (60% average recall over 42 test videos), based on manually judged inclusion of distinct shots. The summaries were judged relatively easy to view and had an average amount of redundancy.
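
The abstract names a DTW-based feature for suppressing repeated scenes but gives no details of it. As a rough illustration of the general idea only, the sketch below computes a textbook dynamic time warping cost between two per-frame feature sequences and flags them as a likely repeated take when the length-normalized cost is small. The one-dimensional features, the absolute-difference local cost, the normalization, the threshold value, and the function names `dtw_distance` and `is_repeated_take` are all assumptions made for this example, not the authors' method.

```python
from math import inf


def dtw_distance(a, b):
    """Classic DTW cost between two non-empty 1-D sequences of numbers."""
    if not a or not b:
        raise ValueError("both sequences must be non-empty")
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])  # assumed local distance
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats its frame
                cost[i][j - 1],      # b advances, a repeats its frame
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def is_repeated_take(a, b, threshold=0.05):
    """Hypothetical redundancy test: small normalized DTW cost means a repeat.

    The threshold is an arbitrary illustrative value.
    """
    return dtw_distance(a, b) / (len(a) + len(b)) <= threshold


# Two takes of the same action at different speeds align with zero cost.
take_1 = [0.1, 0.5, 0.9]
take_2 = [0.1, 0.1, 0.5, 0.9, 0.9]
print(is_repeated_take(take_1, take_2))
```

Because DTW allows one sequence to stretch against the other, retakes of the same scene that differ mainly in pacing can still score as near-duplicates, which is why it suits rushes, where takes are often repeated.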



