Video diver: generic video indexing with diverse features
Source
International Multimedia Conference
Proceedings of the International Workshop on Multimedia Information Retrieval
Augsburg, Bavaria, Germany
SESSION: Video retrieval
Pages: 61 - 70  
Year of Publication: 2007
ISBN: 978-1-59593-778-0
Authors
Dong Wang, Tsinghua University, Beijing, China
Xiaobing Liu, Tsinghua University, Beijing, China
Linjie Luo, Tsinghua University, Beijing, China
Jianmin Li, Tsinghua University, Beijing, China
Bo Zhang, Tsinghua University, Beijing, China
Sponsors
SIGMULTIMEDIA: ACM Special Interest Group on Multimedia
ACM: Association for Computing Machinery
SIGGRAPH: ACM Special Interest Group on Computer Graphics and Interactive Techniques
Publisher
ACM, New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 11,   Downloads (12 Months): 74,   Citation Count: 9
DOI: http://doi.acm.org/10.1145/1290082.1290094

ABSTRACT

Semantic video indexing is critical for practical video retrieval systems, and indexing a large semantic lexicon of over 1000 concepts requires a generic and scalable framework. This paper explores the idea of incorporating many kinds of diverse features into a single framework and combining them to obtain a larger degree of invariance than any component feature offers alone, thereby achieving genericness and scalability. We reduce the formidable computational expense through the design of the classification and fusion schemes. Specifically, about 20 kinds of diverse features are extracted to capture limited yet complementary variance in color, texture, and edge, with spatial constraints implicitly integrated; over 100 classifiers are then built and fused to produce a generic detector. Extensive experiments on a total of 310 hours of TRECVID news videos show that the proposed framework significantly outperforms the best single feature across a variety of concepts. A benchmark comparison demonstrates that the approach is state-of-the-art. The approach also generalizes well to previously unseen programs and stations and scales well to a lexicon of over 300 concepts in the LSCOM [18] ontology.
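
The abstract describes fusing the outputs of many single-feature classifiers into one detector but does not state the fusion rule. As a rough illustration only, the sketch below assumes the simplest form of late fusion: averaging the per-shot scores of each classifier and ranking shots by the fused score. The function names, feature names, and score values are all hypothetical and are not taken from the paper.

```python
from statistics import mean
from typing import Dict, List


def fuse_scores(scores_per_classifier: Dict[str, List[float]]) -> List[float]:
    """Average per-shot scores across classifiers (assumed fusion rule)."""
    score_lists = list(scores_per_classifier.values())
    if not score_lists:
        raise ValueError("need at least one classifier")
    n_shots = len(score_lists[0])
    if any(len(s) != n_shots for s in score_lists):
        raise ValueError("all classifiers must score the same shots")
    # zip(*...) groups the scores that different classifiers gave to one shot
    return [mean(shot_scores) for shot_scores in zip(*score_lists)]


def rank_shots(fused: List[float]) -> List[int]:
    """Return shot indices ordered from most to least confident."""
    return sorted(range(len(fused)), key=lambda i: fused[i], reverse=True)


# Hypothetical outputs of three single-feature classifiers for one concept,
# one score per video shot.
scores = {
    "color_moments": [0.25, 0.75, 0.5],
    "texture": [0.5, 0.75, 0.25],
    "edge_histogram": [0.75, 0.75, 0.0],
}
fused = fuse_scores(scores)
ranking = rank_shots(fused)
```

In this toy input, each classifier disagrees about the first and third shots, while the fused score orders them consistently; a weighted or learned combination could replace the plain average without changing the surrounding structure.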


REFERENCES


 
1
A. Amir, J. Argillander, M. Campbell, A. Haubold, G. Iyengar, S. Ebadollahi, F. Kang, M. R. Naphade, A. P. Natsev, J. R. Smith, J. Tešić, and T. Volkmer. IBM Research TRECVID-2005 video retrieval system. In Proc. of TRECVID workshop, 2006.
 
2
A. Amir et al. IBM Research TRECVID-2003 video retrieval system. In Proc. of TRECVID workshop, 2004.
3
 
4
H. Bay, T. Tuytelaars, and L. Van Gool. SURF: Speeded up robust features. In Proc. of ECCV, 2006.
 
5
S.-F. Chang, W. Hsu, W. Jiang, L. Kennedy, D. Xu, A. Yanagawa, and E. Zavesky. Columbia University TRECVID-2006 video search and high-level feature extraction. In Proc. of TRECVID workshop, 2007.
 
6
S.-F. Chang, W. Hsu, W. Jiang, L. Kennedy, D. Xu, A. Yanagawa, and E. Zavesky. Evaluating the impact of 374 visual-based LSCOM concept detectors on automatic search. www-nlpir.nist.gov/projects/tvpubs/tv.pubs.org.html.
 
7
S.-F. Chang, W. Hsu, L. Kennedy, L. Xie, A. Yanagawa, E. Zavesky, and D.-Q. Zhang. Columbia University TRECVID-2005 video search and high-level feature extraction. In Proc. of TRECVID workshop, 2007.
 
8
G. Csurka, C. R. Dance, L. Fan, J. Willamowski, and C. Bray. Visual categorization with bags of keypoints. In Workshop on Statistical Learning in Computer Vision, at ECCV, 2004.
 
9
 
10
 
11
J. Fan, A. Elmagarmid, X. Zhu, W. Aref, and L. Wu. ClassView: hierarchical video shot classification, indexing, and accessing. IEEE Trans. Multimedia, 6(1):70--86, 2004.
 
12
 
13
A. Hauptmann, M.-Y. Chen, M. Christel, W.-H. Lin, R. Yan, and J. Yang. Multi-lingual broadcast news retrieval. In Proc. of TRECVID workshop, 2007.
 
14
A. Hauptmann, M. Christel, R. Concescu, J. Gao, Q. Jin, W.-H. Lin, J.-Y. Pan, S. M. Stevens, R. Yan, J. Yang, and Y. Zhang. CMU Informedia's TRECVID 2005 skirmishes. In Proc. of TRECVID workshop, 2006.
 
15
W. Jiang, S.-F. Chang, and A. C. Loui. Context-based concept fusion with boosted conditional random fields. In Proc. of ICASSP, Hawaii, USA, April 2007.
16
17
 
18
 
19
M. R. Naphade, L. Kennedy, J. R. Kender, S.-F. Chang, J. R. Smith, P. Over, and A. Hauptmann. A light scale concept ontology for multimedia understanding for TRECVID 2005. 2005. www-nlpir.nist.gov/projects/tv2005/LSCOMlite_NKKCSOH.pdf.
20
 
21
J. Platt. Sequential minimal optimization: A fast algorithm for training support vector machines. Technical Report MSR-TR-98-14, Microsoft Research, 1998.
 
22
J. Platt. Advances in Large Margin Classifiers, chapter Probabilities for SV machines, pages 61--74. MIT Press, 2000.
 
23
M. Riesenhuber and T. Poggio. Hierarchical models of object recognition in cortex. Nature Neuroscience, 2(11):1019--1025, 1999.
 
24
25
 
26
C. Snoek, J. van Gemert, J. Geusebroek, B. Huurnink, D. Koelma, G. Nguyen, O. de Rooij, F. Seinstra, A. Smeulders, C. Veenman, and M. Worring. The MediaMill TRECVID 2005 semantic video search engine. In Proc. of TRECVID workshop, 2006.
 
27
C. Snoek, J. van Gemert, T. Gevers, B. Huurnink, D. Koelma, M. van Liempt, O. de Rooij, K. van de Sande, F. Seinstra, A. Smeulders, A. Thean, C. Veenman, and M. Worring. The MediaMill TRECVID 2006 semantic video search engine. In Proc. of TRECVID workshop, 2007.
 
28
 
29
 
30
D. Wang, J. Li, and B. Zhang. Relay boost fusion for learning rare concepts in multimedia. In Proc. of CIVR, 2006.
31
 
32
J. Zhang, M. Marszalek, S. Lazebnik, and C. Schmid. Local features and kernels for classification of texture and object categories: An in-depth study. Technical Report RR-5737, INRIA Rhône-Alpes, 2005.
