|
ABSTRACT
Great effort has been made to improve video concept detection and continuous progress has been reported. With the current evaluation method being confined to carefully annotated domains and thus quite forgiving, the reliability of the state-of-the-art concept classifiers remains in question. Adopting a more rigorous evaluation approach, we find that most concept classifiers built using the mainstream approach are unreliable because they generalize poorly to domains other than their training domain. Moreover, evidences show that SVM-based concept classifiers learn little beyond memorizing most of the positive training data, and behave close to memory-based models such as kNN indicated by comparable performance between the two models. Examining the properties of the reliable concept classifiers, we find that the classifiers of frequent concepts, "bloated" classifiers, and classifiers capable of learning the pattern of data, tend to be more reliable. This paper contributes to a better understanding of concept detection, suggests heuristics to identify reliable concept classifiers, and discusses solutions to improving concept detection reliability.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
M. Campbell, A. Haubold, S. Ebadollahi, M. Naphade, A. Natsev, J. Smith, J. Tesic, and L. Xie. IBM Research TRECVID-2006 Video Retrieval System. TREC Video Retrieval Evaluation Proceedings, 2006.
|
| |
2
|
C.-C. Chang and C.-J. Lin. LIBSVM: a library for support vector machines, 2001.
|
| |
3
|
S. Chang, W. Hsu, L. Kennedy, L. Xie, A. Yanagawa, E. Zavesky, and D. Zhang. Columbia University TRECVID-2005 Video Search and High-Level Feature Extraction. TREC Video Retrieval Evaluation Proceedings, 2005.
|
| |
4
|
S. Chang, W. Jiang, A. Yanagawa, and E. Zavesky. Columbia University TRECVID 2007 High-Level Feature Extraction. TREC Video Retrieval Evaluation Proceedings, 2007.
|
| |
5
|
D. M. Mount and S. Arya. ANN: A Library for Approximate Nearest Neighbor Searching.
|
| |
6
|
M. R. Naphade, L. Kennedy, J. R. Kender, S. F. Chang, J. Smith, P. Over, and A. Hauptmann. A light scale concept ontology for multimedia understanding for TRECVID 2005. In IBM Research Technical Report, 2005.
|
| |
7
|
M. R. Naphade, T. Kristjansson, B. Frey, and T. Huang. Probabilistic multimedia objects Multijects: A novel approach to video indexing and retrieval in multimedia systems. In Proc. of ICIP, 1998.
|
| |
8
|
C. Ngo, Y. Jiang, X. Wei, F. Wang, W. Zhao, H. Tan, and X. Wu. Experimenting VIREO-374: Bag-of-Visual-Words and Visual-Based Ontology for Semantic Video Indexing and Search. TREC Video Retrieval Evaluation Proceedings, 2007.
|
| |
9
|
J. Philbin, O. Chum, J. Sivic, V. Ferrari, M. Marin, A. Bosch, N. Apostolof, and A. Zisserman. Oxford TRECVid 2007 Notebook paper. TREC Video Retrieval Evaluation Proceedings, 2007.
|
 |
10
|
Guo-Jun Qi , Xian-Sheng Hua , Yong Rui , Jinhui Tang , Tao Mei , Hong-Jiang Zhang, Correlative multi-label video annotation, Proceedings of the 15th international conference on Multimedia, September 25-29, 2007, Augsburg, Germany
[doi> 10.1145/1291233.1291245]
|
 |
11
|
|
| |
12
|
C. Snoek, I. Everts, J. van Gemert, J. Geusebroek, B. Huurnink, D. Koelma, M. van Liempt, O. de Rooij, K. van de Sande, and A. Smeulders. The MediaMill TRECVID 2007 Semantic Video Search Engine. TREC Video Retrieval Evaluation Proceedings, 2007.
|
| |
13
|
R. Yan, M. yu Chen, and A. G. Hauptmann. Mining relationship between video concepts using probabilistic graphical model. In IEEE Int'l Conf. on Multimedia and Expo, 2006.
|
 |
14
|
|
 |
15
|
|
|