Video search reranking through random walk over document-level context graph
Source
International Multimedia Conference
Proceedings of the 15th International Conference on Multimedia
Augsburg, Germany
Session: Content 6 - video search
Pages: 971 - 980  
Year of Publication: 2007
ISBN:978-1-59593-702-5
Authors
Winston H. Hsu  National Taiwan University, Taipei, Taiwan, ROC
Lyndon S. Kennedy  Columbia University, New York, NY
Shih-Fu Chang  Columbia University, New York, NY
Sponsors
ACM: Association for Computing Machinery
SIGGRAPH: ACM Special Interest Group on Computer Graphics and Interactive Techniques
SIGMULTIMEDIA: ACM Special Interest Group on Multimedia
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1291233.1291446

ABSTRACT

Multimedia search over distributed sources often results in recurrent images or videos that are manifested beyond the textual modality. To exploit such contextual patterns while keeping the simplicity of keyword-based search, we propose novel reranking methods that leverage the recurrent patterns to improve the initial text search results. The approach, context reranking, is formulated as a random walk problem along the context graph, where video stories are nodes and the edges between them are weighted by multimodal contextual similarities. The random walk is biased with a preference towards stories with higher initial text search scores, a principled way to consider both the initial text search results and their implicit contextual relationships. When evaluated on the TRECVID 2005 video benchmark, the proposed approach improves retrieval by up to 32% on average relative to the baseline text search method in terms of story-level Mean Average Precision. For people-related queries, which usually have recurrent coverage across news sources, the relative improvement reaches up to 40%. Most importantly, the proposed method does not require any additional input from users (e.g., example images) or complex search models for special queries (e.g., named person search).
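
The biased random walk described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a PageRank-style formulation in which the walker follows similarity-weighted edges with probability `alpha` and otherwise jumps to a story in proportion to its initial text search score. The function name `context_rerank`, the parameter `alpha` and its default value, and the toy similarity matrix are all hypothetical and chosen only for illustration.

```python
def context_rerank(similarity, text_scores, alpha=0.8, iterations=200, tol=1e-12):
    """Rerank stories with a random walk biased towards high text scores.

    similarity  -- square list of lists; similarity[i][j] is the assumed
                   contextual similarity between stories i and j
    text_scores -- initial text search score per story
    alpha       -- assumed probability of following a graph edge; with
                   probability 1 - alpha the walker jumps according to
                   the normalized text scores
    """
    n = len(text_scores)

    # Row-normalize similarities into transition probabilities.
    # A story with no neighbors jumps uniformly to any story.
    transition = []
    for row in similarity:
        total = sum(row)
        if total > 0:
            transition.append([w / total for w in row])
        else:
            transition.append([1.0 / n] * n)

    # Normalize the text scores into a probability distribution (the bias).
    score_sum = sum(text_scores)
    if score_sum > 0:
        prior = [s / score_sum for s in text_scores]
    else:
        prior = [1.0 / n] * n

    # Power iteration towards the stationary distribution of the walk.
    state = prior[:]
    for _ in range(iterations):
        new_state = [
            alpha * sum(state[i] * transition[i][j] for i in range(n))
            + (1 - alpha) * prior[j]
            for j in range(n)
        ]
        delta = sum(abs(a - b) for a, b in zip(new_state, state))
        state = new_state
        if delta < tol:
            break
    return state


# Toy example: stories 0 and 1 are near-duplicates in context,
# story 2 is only weakly connected to either of them.
similarity = [
    [0.0, 0.9, 0.1],
    [0.9, 0.0, 0.1],
    [0.1, 0.1, 0.0],
]
text_scores = [0.6, 0.1, 0.3]

reranked = context_rerank(similarity, text_scores)
ranking = sorted(range(len(reranked)), key=lambda i: reranked[i], reverse=True)
```

In this toy setup, story 1 starts with the lowest text score but ends up ranked above story 2, because it is strongly connected to the highly scored story 0. That is the kind of effect the abstract attributes to recurrent coverage across sources.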


