Online multi-label active annotation: towards large-scale content-based video search
Source
International Multimedia Conference archive
Proceeding of the 16th ACM international conference on Multimedia
Vancouver, British Columbia, Canada
SESSION: Content track C4: video search
Pages 141-150  
Year of Publication: 2008
ISBN:978-1-60558-303-7
Authors
Xian-Sheng Hua  Microsoft Research Asia, Beijing, China
Guo-Jun Qi  University of Science and Technology of China, Hefei, China
Sponsors
ACM: Association for Computing Machinery
SIGMULTIMEDIA: ACM Special Interest Group on Multimedia
Publisher
ACM  New York, NY, USA
Additional Information:

DOI: http://doi.acm.org/10.1145/1459359.1459379

ABSTRACT

Existing video search engines have not taken advantage of video content analysis and semantic understanding. Video search in academia uses semantic annotation to approach content-based indexing, and we argue that this is a promising direction for enabling real content-based video search. However, due to the complexity of both video data and semantic concepts, existing techniques for automatic video annotation still cannot handle large-scale video sets and large-scale concept sets, in terms of either annotation accuracy or computational cost. To address this problem, we propose a scalable framework for annotation-based video search, together with a novel approach to large-scale semantic concept annotation: online multi-label active learning. The framework scales along both the video sample dimension and the concept label dimension. Large-scale unlabeled video samples are assumed to arrive consecutively in batches, along with an initial pre-labeled training set on which a preliminary multi-label classifier is built. For each arriving batch, a multi-label active learning engine automatically selects a set of unlabeled sample-label pairs, which are then manually annotated. An online learner then updates the existing classifier by taking the newly labeled sample-label pairs into account. This process repeats until all data have arrived. During the process, new labels, even ones without any pre-labeled training samples, can be incorporated at any time. Experiments on the TRECVID dataset demonstrate the effectiveness and efficiency of the proposed framework.
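
The batch-wise loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method: the abstract does not specify the classifier, the selection criterion, or the update rule, so the sketch assumes independent per-label logistic models, plain uncertainty sampling (predictions closest to 0.5), and single-step SGD updates. All names (`OnlineMultiLabelLearner`, `select_pairs`, `process_batch`, the `oracle` callback standing in for the human annotator) are hypothetical.

```python
import math


class OnlineMultiLabelLearner:
    """Independent per-label logistic models updated by SGD (an assumed stand-in)."""

    def __init__(self, n_features, lr=0.5):
        self.n_features = n_features
        self.lr = lr
        self.weights = {}  # label -> weight vector; the last entry is the bias

    def add_label(self, label):
        # A new label starts with zero weights, so it predicts 0.5 everywhere
        # and needs no pre-labeled training samples.
        self.weights.setdefault(label, [0.0] * (self.n_features + 1))

    def predict_proba(self, x, label):
        w = self.weights[label]
        z = w[-1] + sum(wi * xi for wi, xi in zip(w, x))
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() well behaved
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, x, label, y):
        # One SGD step on the logistic loss for a single (sample, label) pair.
        self.add_label(label)
        g = self.predict_proba(x, label) - y
        w = self.weights[label]
        for i, xi in enumerate(x):
            w[i] -= self.lr * g * xi
        w[-1] -= self.lr * g


def select_pairs(model, batch, budget):
    """Return the `budget` (sample index, label) pairs with predictions closest to 0.5."""
    scored = [
        (abs(model.predict_proba(x, label) - 0.5), i, label)
        for i, x in enumerate(batch)
        for label in model.weights
    ]
    scored.sort()
    return [(i, label) for _, i, label in scored[:budget]]


def process_batch(model, batch, oracle, budget):
    """Select uncertain pairs, ask the annotator (oracle), then update online."""
    pairs = select_pairs(model, batch, budget)
    for i, label in pairs:
        model.update(batch[i], label, oracle(batch[i], label))
    return pairs
```

Calling `process_batch` once per arriving batch reproduces the select, annotate, update cycle. Calling `add_label` between batches introduces a new concept whose sample-label pairs are maximally uncertain, so under this selection rule they become candidates for annotation in the next batch.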



