ABSTRACT
Temporal consistency is ubiquitous in video data, where temporally adjacent video shots usually share similar visual and semantic content. This paper presents a thorough study of temporal consistency defined with respect to semantic concepts and query topics using quantitative measures, and discusses its implications for video analysis and retrieval tasks. We further show that, in interactive settings, using temporal consistency leads to considerable improvement in the performance of semantic concept detection and retrieval of video data. Specifically, an active learning method with a temporal sampling strategy is proposed for building classifiers of semantic concepts, and a temporal reranking method is proposed for improving the efficiency of interactive video search. Both methods outperform existing methods by considerable margins on the TRECVID dataset.
CITED BY 10
Meng Wang, Xian-Sheng Hua, Xun Yuan, Yan Song, Li-Rong Dai, Optimizing multi-graph learning: towards a unified video annotation scheme, Proceedings of the 15th international conference on Multimedia, September 25-29, 2007, Augsburg, Germany
Daragh Byrne, Peter Wilkins, Gareth J.F. Jones, Alan F. Smeaton, Noel E. O'Connor, Measuring the impact of temporal context on video retrieval, Proceedings of the 2008 international conference on Content-based image and video retrieval, July 07-09, 2008, Niagara Falls, Canada
Yanan Liu, Fei Wu, Yueting Zhuang, Jun Xiao, Active post-refined multimodality video semantic concept detection with tensor representation, Proceeding of the 16th ACM international conference on Multimedia, October 26-31, 2008, Vancouver, British Columbia, Canada
Meng Wang, Xian-Sheng Hua, Richang Hong, Jinhui Tang, Guo-Jun Qi, Yan Song, Unified video annotation via multigraph learning, IEEE Transactions on Circuits and Systems for Video Technology, v.19 n.5, p.733-746, May 2009