ABSTRACT
Multimedia search over distributed sources often results in recurrent images or videos that appear beyond the textual modality. To exploit such contextual patterns while keeping the simplicity of keyword-based search, we propose novel reranking methods that leverage the recurrent patterns to improve the initial text search results. The approach, context reranking, is formulated as a random walk along the context graph, where video stories are nodes and the edges between them are weighted by multimodal contextual similarities. The random walk is biased toward stories with higher initial text search scores, a principled way to consider both the initial text search results and their implicit contextual relationships. When evaluated on the TRECVID 2005 video benchmark, the proposed approach improves retrieval on average by up to 32% relative to the baseline text search method in terms of story-level Mean Average Precision. For people-related queries, which usually have recurrent coverage across news sources, the relative improvement is up to 40%. Most importantly, the proposed method requires neither additional input from users (e.g., example images) nor complex search models for special queries (e.g., named person search).
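The abstract does not state the update rule, so the following is only a minimal sketch of one way a text-biased random walk over a context graph could be computed, assuming a PageRank-style formulation with a restart distribution proportional to the initial text scores. The function name `context_rerank`, the mixing weight `alpha`, and the toy similarity values are all hypothetical and are not taken from the paper.

```python
def context_rerank(similarity, text_scores, alpha=0.8, iters=200, tol=1e-12):
    """Biased random walk over a context graph (illustrative sketch only).

    similarity  : n x n list of non-negative contextual similarities (assumed input)
    text_scores : n non-negative initial text search scores
    alpha       : assumed weight on following graph edges; (1 - alpha) is the
                  weight on jumping back to stories favored by the text search
    """
    n = len(text_scores)

    # Row-normalize similarities into transition probabilities.
    transition = []
    for row in similarity:
        total = sum(row)
        if total > 0:
            transition.append([w / total for w in row])
        else:
            # An isolated story jumps uniformly to any story.
            transition.append([1.0 / n] * n)

    # The restart (bias) distribution is proportional to the text scores.
    total_score = sum(text_scores)
    if total_score > 0:
        prior = [s / total_score for s in text_scores]
    else:
        prior = [1.0 / n] * n

    # Power iteration until the score vector stops changing.
    x = prior[:]
    for _ in range(iters):
        nxt = [
            alpha * sum(x[i] * transition[i][j] for i in range(n))
            + (1 - alpha) * prior[j]
            for j in range(n)
        ]
        delta = sum(abs(a - b) for a, b in zip(nxt, x))
        x = nxt
        if delta < tol:
            break
    return x


# Toy example: stories 0 and 1 are contextually similar (e.g., near-duplicate
# footage), story 2 is weakly connected. Story 1 has no text match at all.
similarity = [
    [0.0, 0.9, 0.1],
    [0.9, 0.0, 0.1],
    [0.1, 0.1, 0.0],
]
text_scores = [0.6, 0.0, 0.4]
scores = context_rerank(similarity, text_scores)
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

In this toy setup, story 1 receives a nonzero score purely through its contextual link to story 0, which is the intuition behind reranking with recurrent patterns; the actual similarity measures and parameters used in the paper may differ.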