A time series clustering based framework for multimedia mining and summarization using audio features
Source: International Multimedia Conference archive
Proceedings of the 6th ACM SIGMM international workshop on Multimedia information retrieval
New York, NY, USA
SESSION: Learning II
Pages: 157-164
Year of Publication: 2004
ISBN: 1-58113-940-3
Authors
Regunathan Radhakrishnan, Mitsubishi Electric Research Laboratory, Cambridge, MA
Ajay Divakaran, Mitsubishi Electric Research Laboratory, Cambridge, MA
Ziyou Xiong, Mitsubishi Electric Research Laboratory, Cambridge, MA
Sponsors
SIGMULTIMEDIA: ACM Special Interest Group on Multimedia
ACM: Association for Computing Machinery
SIGGRAPH: ACM Special Interest Group on Computer Graphics and Interactive Techniques
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1026711.1026738

ABSTRACT

Past work on multimedia analysis has shown the utility of detecting specific temporal patterns for different content genres. In this paper, we propose a unified, content-adaptive, unsupervised mining framework to bring out such temporal patterns from different multimedia genres. We formulate the problem of pattern discovery from video as a time series clustering problem. We treat the sequence of low- and mid-level audio-visual features extracted from the video as a time series and perform a temporal segmentation. The segmentation is based on eigenvector analysis of the affinity matrix constructed from statistical models estimated from the subsequences of the time series. We are thus able to detect transitions and outliers from a sequence of observations from a stationary background process. We define a confidence measure on each of the detected outliers as the probability that it is an outlier. Then, we establish a relationship between the mining parameters and the confidence measure. Furthermore, the confidence measure can be used to rank the detected outliers in terms of their departures from the background process. Our experimental results with sequences of low- and mid-level audio features extracted from sports video show that "highlight" events can be extracted effectively as outliers from a background process using the proposed framework. We proceed to show the effectiveness of the proposed framework in bringing out patterns from surveillance videos without any a priori knowledge. Finally, we show that such temporal segmentation into background and outliers, along with the ranking based on the departure from the background, can be used to generate content summaries of any desired length.
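
The segmentation idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation, and everything in it is an assumption made for illustration: the input is a one-dimensional synthetic series rather than audio features, each window is modelled by a single Gaussian, the distance between windows is the symmetric KL divergence, the affinity scale defaults to the mean pairwise distance, and the second generalized eigenvector is split at the threshold that minimises the normalised cut. The function names (`window_gaussians`, `symmetric_kl`, `segment`) are hypothetical, and the ranking score (mean distance to the background windows) is only a stand-in for the confidence measure the abstract mentions.

```python
import numpy as np


def window_gaussians(x, win):
    """Fit a 1-D Gaussian (mean, variance) to each non-overlapping window."""
    n = len(x) // win
    chunks = np.asarray(x[: n * win], dtype=float).reshape(n, win)
    # Small floor on the variance avoids division by zero for flat windows.
    return chunks.mean(axis=1), chunks.var(axis=1) + 1e-6


def symmetric_kl(mu, var):
    """Pairwise symmetric KL divergence, KL(p||q) + KL(q||p), between Gaussians."""
    dmu2 = (mu[:, None] - mu[None, :]) ** 2
    ratio = var[:, None] / var[None, :]
    inv_sum = 1.0 / var[:, None] + 1.0 / var[None, :]
    return 0.5 * (ratio + 1.0 / ratio - 2.0 + dmu2 * inv_sum)


def segment(x, win=50, scale=None):
    """Split windows into (ranked outliers, background) by spectral bipartition."""
    mu, var = window_gaussians(x, win)
    dist = symmetric_kl(mu, var)
    if scale is None:
        # Assumed heuristic: use the mean pairwise distance as the kernel scale.
        scale = dist.mean() or 1.0
    affinity = np.exp(-dist / (2.0 * scale))
    deg = affinity.sum(axis=1)

    # Eigenvectors of D^{-1/2} A D^{-1/2}; eigh returns eigenvalues in ascending order.
    _, vecs = np.linalg.eigh(affinity / np.sqrt(np.outer(deg, deg)))
    y = vecs[:, -2] / np.sqrt(deg)  # second generalized eigenvector

    # Scan every split of the sorted eigenvector and keep the lowest normalised cut.
    order = np.argsort(y)
    best_k, best_ncut = 1, np.inf
    for k in range(1, len(order)):
        left, right = order[:k], order[k:]
        cut = affinity[np.ix_(left, right)].sum()
        ncut = cut / deg[left].sum() + cut / deg[right].sum()
        if ncut < best_ncut:
            best_k, best_ncut = k, ncut

    left, right = order[:best_k], order[best_k:]
    # Assumption: the smaller side of the cut holds the outliers.
    outliers, background = (left, right) if len(left) < len(right) else (right, left)

    # Rank outliers by mean distance to the background (stand-in for a confidence).
    score = dist[np.ix_(outliers, background)].mean(axis=1)
    return outliers[np.argsort(-score)], background


# Synthetic example: a stationary background with one burst in window 8.
rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, 1000)
x[400:450] += 8.0
outliers, background = segment(x, win=50)
print("outlier windows:", outliers.tolist())
```

In this toy setting the burst window should end up alone on one side of the cut. A summary of a desired length could then be built by taking outlier windows in ranked order until the length budget is used up, which is one plausible reading of the final sentence of the abstract.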


