Exploring knowledge of sub-domain in a multi-resolution bootstrapping framework for concept detection in news video
Source
International Multimedia Conference archive
Proceedings of the 16th ACM International Conference on Multimedia
Vancouver, British Columbia, Canada
SESSION: Content track C7: video analysis
Pages: 249-258  
Year of Publication: 2008
ISBN:978-1-60558-303-7
Authors
Gang Wang  National University of Singapore, Singapore, Singapore
Tat-Seng Chua  National University of Singapore, Singapore, Singapore
Ming Zhao  Google, Mountain View, USA
Sponsors
ACM: Association for Computing Machinery
SIGMULTIMEDIA: ACM Special Interest Group on Multimedia
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1459359.1459393
ABSTRACT

In this paper, we present a model based on a multi-resolution, multi-source and multi-modal (M3) bootstrapping framework that exploits knowledge of sub-domains for concept detection in news video. Because the characteristics and distributions of data differ across sub-domains, we model and analyze the video in each sub-domain separately using a transductive framework. Along with this framework, we propose a "pseudo-Vapnik combined error bound" to tackle the imbalanced distribution of training data in certain segments of sub-domains. For effective fusion of multi-modal features, we use multi-resolution inference and constraints so that evidence from different modalities can support each other. Finally, we employ a bootstrapping technique that leverages unlabeled data to boost overall system performance. We test our framework by detecting semantic concepts in the TRECVID 2004 dataset. Experimental results demonstrate that our approach is effective.
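
The bootstrapping step mentioned in the abstract can be illustrated with a generic self-training loop. The sketch below is not the authors' M3 framework: it assumes one-dimensional features, a nearest-centroid classifier, and a fixed confidence margin, and every name in it (`centroid_fit`, `predict_with_margin`, `bootstrap`, `min_margin`) is hypothetical. It only shows the general idea of promoting confidently labeled unlabeled samples into the training set and retraining.

```python
from statistics import mean


def centroid_fit(labeled):
    """Return one centroid per class from (value, label) pairs."""
    by_class = {}
    for x, y in labeled:
        by_class.setdefault(y, []).append(x)
    return {y: mean(xs) for y, xs in by_class.items()}


def predict_with_margin(centroids, x):
    """Return (label, margin) for x.

    The margin is the distance to the second-nearest centroid minus the
    distance to the nearest one; larger means more confident.
    Assumes at least two classes.
    """
    ranked = sorted(centroids.items(), key=lambda kv: abs(x - kv[1]))
    label, nearest = ranked[0]
    second = ranked[1][1]
    return label, abs(x - second) - abs(x - nearest)


def bootstrap(labeled, unlabeled, min_margin=1.0, max_rounds=10):
    """Generic self-training: repeatedly add confident predictions to the
    labeled set and refit, until no sample clears the margin."""
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        centroids = centroid_fit(labeled)
        confident, rest = [], []
        for x in pool:
            label, margin = predict_with_margin(centroids, x)
            (confident if margin >= min_margin else rest).append((x, label))
        if not confident:
            break  # nothing new to learn from in this round
        labeled.extend(confident)
        pool = [x for x, _ in rest]
    return centroid_fit(labeled), labeled, pool


if __name__ == "__main__":
    seed = [(0.0, "neg"), (10.0, "pos")]
    model, grown, leftover = bootstrap(seed, [1.0, 9.0, 4.0, 5.2])
    print(model, len(grown), leftover)
```

In this toy run the ambiguous sample stays unlabeled rather than being forced into a class, which is the usual safeguard against a self-training loop reinforcing its own mistakes.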


