Study on the combination of video concept detectors
Source
International Multimedia Conference
Proceeding of the 16th ACM international conference on Multimedia
Vancouver, British Columbia, Canada
SESSION: Content track short papers session 1: content analysis
Pages 647-650
Year of Publication: 2008
ISBN: 978-1-60558-303-7
Authors
Meng Wang  Microsoft Research Asia, Beijing, China
Xian-Sheng Hua  Microsoft Research Asia, Beijing, China
Sponsors
ACM: Association for Computing Machinery
SIGMULTIMEDIA: ACM Special Interest Group on Multimedia
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1459359.1459450

ABSTRACT

This paper studies the combination of video concept detectors with a labeled fusion set. We point out that the computational cost of a grid search for fusion weights increases exponentially with the number of detectors, and that such a search is therefore infeasible when dealing with a large number of detectors. To avoid this difficulty, we adopt an incremental fusion approach: in each round, two detectors are combined, so only a one-dimensional grid search is needed. We propose a Bottom-Up Incremental Fusion (BUIF) method, which repeatedly selects the detectors with the lowest performance for combination. We conduct experiments on the TRECVID benchmark dataset for 39 concepts with 38 detection methods. Ten different fusion strategies are compared, and the empirical results demonstrate the superiority of the proposed incremental fusion approach.
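
The incremental scheme described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not state the performance measure, the fusion rule, or the grid resolution, so the sketch assumes average precision as the measure, a convex combination w*a + (1-w)*b as the pairwise fusion rule, and an 11-point weight grid. All function names are hypothetical.

```python
def average_precision(scores, labels):
    """Non-interpolated average precision of the ranking induced by scores.
    Assumption: the abstract does not name the performance measure."""
    ranked = sorted(zip(scores, labels), key=lambda p: -p[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def fuse_pair(a, b, labels, grid):
    """One-dimensional grid search for w in the assumed rule w*a + (1-w)*b."""
    best_ap, best_fused = -1.0, None
    for w in grid:
        fused = [w * x + (1 - w) * y for x, y in zip(a, b)]
        ap = average_precision(fused, labels)
        if ap > best_ap:
            best_ap, best_fused = ap, fused
    return best_fused, best_ap


def bottom_up_incremental_fusion(detector_scores, labels, steps=11):
    """Repeatedly fuse the two lowest-performing entries on the fusion set
    until a single fused score list remains."""
    grid = [i / (steps - 1) for i in range(steps)]
    pool = [(average_precision(s, labels), s) for s in detector_scores]
    while len(pool) > 1:
        pool.sort(key=lambda p: p[0])          # lowest performance first
        (_, a), (_, b) = pool[0], pool[1]
        fused, ap = fuse_pair(a, b, labels, grid)
        pool = pool[2:] + [(ap, fused)]        # fused result re-enters the pool
    return pool[0][1], pool[0][0]
```

Under these assumptions, fusing n detectors takes n-1 rounds of one grid search each. With 38 detectors and an 11-point grid that is 37 × 11 = 407 evaluations of the performance measure, whereas a joint grid over 37 free weights would need on the order of 11^37. Because the grid includes the endpoints 0 and 1, each fused result in this sketch performs at least as well on the fusion set as either of its two inputs.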


