ABSTRACT
We propose incorporating hundreds of pre-trained concept detectors to provide contextual information that improves multimodal video search. The approach takes initial search results from established video search methods (which are typically conservative in their use of concept detectors) and mines them to discover and leverage co-occurrence patterns with the detection results for hundreds of other concepts, thereby refining and reranking the initial results. We test the method on the TRECVID 2005 and 2006 automatic video search tasks and find improvements in mean average precision (MAP) of 15%-30%. We also find that the method is adept at discovering contextual relationships unique to the news stories in the search set, which would be difficult or impossible to discover even if external training data were available.
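The reranking idea can be illustrated with a minimal sketch. It is not the method evaluated in the paper, whose details the abstract does not give. It assumes that the top and bottom of the initial ranking serve as pseudo-positive and pseudo-negative shots, that a concept's relevance is measured by the difference in its mean detector score between those two groups (a simple stand-in for a co-occurrence measure), and that the context score is blended linearly with the initial score. The names `rerank`, `top_k`, `num_concepts`, and `alpha`, and all scores in the example, are hypothetical.

```python
def minmax(values):
    """Scale a list of numbers to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def rerank(initial_scores, concept_scores, top_k=2, num_concepts=2, alpha=0.5):
    """Rerank shots using concept detectors as context (illustrative only).

    initial_scores: one score per shot from the initial search method.
    concept_scores: concept_scores[i][c] is detector c's score on shot i.
    Returns shot indices ordered from most to least relevant.
    """
    n = len(initial_scores)
    order = sorted(range(n), key=lambda i: initial_scores[i], reverse=True)
    # Assumption: treat the top-k shots as pseudo-positives and the
    # bottom-k shots as pseudo-negatives; no ground-truth labels are used.
    pseudo_pos, pseudo_neg = order[:top_k], order[-top_k:]

    # Weight each concept by how differently it fires on the two groups.
    num_detectors = len(concept_scores[0])
    weights = []
    for c in range(num_detectors):
        pos_mean = sum(concept_scores[i][c] for i in pseudo_pos) / len(pseudo_pos)
        neg_mean = sum(concept_scores[i][c] for i in pseudo_neg) / len(pseudo_neg)
        weights.append(pos_mean - neg_mean)

    # Keep only the concepts most strongly associated with the query results.
    selected = sorted(range(num_detectors),
                      key=lambda c: abs(weights[c]), reverse=True)[:num_concepts]

    # Context score: weighted sum of the selected detectors, then normalized.
    context = minmax([sum(weights[c] * concept_scores[i][c] for c in selected)
                      for i in range(n)])
    base = minmax(initial_scores)
    combined = [(1 - alpha) * base[i] + alpha * context[i] for i in range(n)]
    return sorted(range(n), key=lambda i: combined[i], reverse=True)


# Hypothetical data: six shots, three concept detectors.
initial = [0.9, 0.8, 0.5, 0.4, 0.2, 0.1]
concepts = [
    [0.9, 0.5, 0.1],
    [0.8, 0.5, 0.2],
    [0.1, 0.5, 0.9],
    [0.9, 0.5, 0.1],
    [0.1, 0.5, 0.8],
    [0.2, 0.5, 0.9],
]
ranking = rerank(initial, concepts, top_k=2, num_concepts=2, alpha=0.5)
print(ranking)
```

In this toy example, shot 3 has the same concept profile as the top-ranked shots, so it moves ahead of shot 2 even though its initial score was lower. The second detector fires equally on every shot and is ignored.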