Web-based information content and its application to concept-based video retrieval
Source
Conference On Image And Video Retrieval
Proceedings of the 2008 international conference on Content-based image and video retrieval
Niagara Falls, Canada
SESSION: Improving the quality of retrieval
Pages 437-446
Year of Publication: 2008
ISBN: 978-1-60558-070-8
Authors
Alexander Haubold  Columbia University, New York, NY, USA
Apostol Natsev  IBM Thomas J. Watson, Hawthorne, NY, USA
Sponsors
SIGIR: ACM Special Interest Group on Information Retrieval
SIGMULTIMEDIA: ACM Special Interest Group on Multimedia
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1386352.1386408

ABSTRACT

Semantic similarity between words or phrases is frequently used to match search queries to documents when straightforward term matching fails. This is particularly important for searching in visual databases, where pictures or video clips have been automatically tagged with a small set of semantic concepts based on analysis and classification of the visual content. Here, the textual description of documents is very limited, and semantic similarity based on WordNet's cognitive synonym structure, along with information content derived from term frequencies, can help to bridge the gap between an arbitrary textual query and a limited vocabulary of visual concepts. This approach, termed concept-based retrieval, has received significant attention over the last few years, and its success is highly dependent on the quality of the similarity measure used to map textual query terms to visual concepts.

In this paper, we examine shortcomings of semantic similarity measures based on Information Content (IC) and propose a way to improve them. In particular, we note that most IC-based similarity measures are derived from a small and relatively outdated corpus (the Brown corpus), which does not adequately capture the usage patterns of many contemporary terms: for example, out of more than 150,000 WordNet terms, only about 36,000 are represented. This shortcoming sharply reduces the coverage of typical search query terms. We therefore suggest using alternative IC corpora that are larger and better aligned with the usage of modern vocabulary. We experimentally derive two such corpora using the Google web search engine, and show that they provide better coverage of vocabulary while showing comparable frequencies for Brown corpus terms. Finally, we evaluate the two proposed IC corpora in the context of a concept-based video retrieval application using the TRECVID 2005, 2006, and 2007 datasets, and we show that they increase average precision results by up to 200%.
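To make the idea of an IC-based similarity measure concrete, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical toy taxonomy and made-up term counts (standing in for corpus frequencies or web hit counts), computes IC(c) = -log p(c) from cumulative counts, and scores two terms by the IC of their most informative common ancestor, in the style of Resnik's measure. The names PARENT, COUNTS, ROOT, and best_concept are illustrative assumptions only.

```python
import math

# Hypothetical toy taxonomy (child -> parent); NOT taken from WordNet.
PARENT = {
    "entity": None,
    "vehicle": "entity",
    "car": "vehicle",
    "boat": "vehicle",
    "animal": "entity",
    "dog": "animal",
}
ROOT = "entity"

# Hypothetical raw term counts, standing in for corpus or web frequencies.
COUNTS = {"entity": 0, "vehicle": 10, "car": 60, "boat": 10, "animal": 5, "dog": 15}


def ancestors(term):
    """Return the term followed by all of its ancestors up to the root."""
    chain = []
    while term is not None:
        chain.append(term)
        term = PARENT[term]
    return chain


def cumulative_counts(counts):
    """Credit each term's count to the term itself and to every ancestor."""
    total = {t: 0 for t in PARENT}
    for term, n in counts.items():
        for node in ancestors(term):
            total[node] += n
    return total


def information_content(counts):
    """IC(c) = -log p(c), where p(c) is c's cumulative count over the root's."""
    cum = cumulative_counts(counts)
    root_total = cum[ROOT]
    return {t: -math.log(c / root_total) for t, c in cum.items() if c > 0}


def resnik_similarity(a, b, ic):
    """Resnik-style score: IC of the most informative common ancestor."""
    common = set(ancestors(a)) & set(ancestors(b))
    return max(ic[c] for c in common)


def best_concept(query_term, concepts, ic):
    """Map a query term to the most similar concept in a small vocabulary."""
    return max(concepts, key=lambda c: resnik_similarity(query_term, c, ic))


ic = information_content(COUNTS)
match = best_concept("car", ["boat", "dog"], ic)
```

In this toy setting, "car" maps to "boat" rather than "dog" because their shared ancestor "vehicle" is more specific (higher IC) than the root. Swapping COUNTS for frequencies from a different corpus changes the IC values, and therefore the similarity scores, without touching the taxonomy, which is the dependence on the corpus that the abstract describes.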

