Study on the combination of video concept detectors
Source
International Multimedia Conference
Proceeding of the 16th ACM international conference on Multimedia
Vancouver, British Columbia, Canada
SESSION: Content track short papers session 1: content analysis
Pages 647-650
Year of Publication: 2008
ISBN: 978-1-60558-303-7
ABSTRACT
This paper studies the combination of video concept detectors with a labeled fusion set. We point out that the computational cost of a grid search for fusion weights grows exponentially with the number of detectors, and that such a search is therefore infeasible when many detectors must be combined. To avoid this difficulty, we adopt an incremental fusion approach: in each round two detectors are combined, so only a 1-dimensional grid search is needed. We propose a Bottom-Up Incremental Fusion (BUIF) method, which repeatedly selects the detectors with the lowest performance for combination. We conduct experiments on the TRECVID benchmark dataset for 39 concepts with 38 detection methods. Ten different fusion strategies are compared, and the empirical results demonstrate the superiority of the proposed incremental fusion approach.
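The incremental idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the performance measure, the fusion rule, or the grid resolution, so the code below assumes average precision on the labeled fusion set, a linear combination `w*a + (1-w)*b`, a grid of `steps + 1` weights, and that each fused detector is returned to the pool. All function names (`average_precision`, `fuse_pair`, `buif`) are hypothetical.

```python
from typing import List, Sequence, Tuple


def average_precision(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Non-interpolated average precision of the ranking induced by scores."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def fuse_pair(a: Sequence[float], b: Sequence[float],
              labels: Sequence[int], steps: int = 10) -> Tuple[List[float], float]:
    """1-dimensional grid search over w in [0, 1] for the fusion w*a + (1-w)*b.

    Assumption: linear fusion scored by average precision on the fusion set.
    """
    best_ap, best_scores = -1.0, list(a)
    for k in range(steps + 1):
        w = k / steps
        fused = [w * x + (1 - w) * y for x, y in zip(a, b)]
        ap = average_precision(fused, labels)
        if ap > best_ap:
            best_ap, best_scores = ap, fused
    return best_scores, best_ap


def buif(detectors: Sequence[Sequence[float]], labels: Sequence[int],
         steps: int = 10) -> Tuple[List[float], float]:
    """Bottom-up incremental fusion sketch.

    Each round fuses the two lowest-performing detectors in the pool and puts
    the result back (an assumption), until a single fused detector remains.
    """
    pool = [(average_precision(d, labels), list(d)) for d in detectors]
    while len(pool) > 1:
        pool.sort(key=lambda item: item[0])  # lowest performance first
        (_, a), (_, b) = pool[0], pool[1]
        fused, ap = fuse_pair(a, b, labels, steps)
        pool = pool[2:] + [(ap, fused)]
    return pool[0][1], pool[0][0]


# Toy fusion set: 6 shots, 3 detectors (made-up scores for illustration only).
labels = [1, 0, 1, 0, 0, 1]
detectors = [
    [0.9, 0.8, 0.2, 0.1, 0.3, 0.4],
    [0.2, 0.1, 0.9, 0.3, 0.4, 0.8],
    [0.5, 0.6, 0.4, 0.7, 0.2, 0.9],
]
fused_scores, fused_ap = buif(detectors, labels)
```

Under these assumptions, fusing n detectors takes n - 1 one-dimensional searches, i.e. (n - 1) * (steps + 1) evaluations, whereas a joint grid over all weights needs on the order of (steps + 1)^(n - 1). Because the grid includes w = 0 and w = 1, each fused detector scores at least as well on the fusion set as either of its inputs.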