ABSTRACT
Libraries have traditionally used manual image annotation for indexing and later retrieving their image collections. However, manual image annotation is an expensive and labor-intensive procedure, and hence there has been great interest in automatic ways to retrieve images based on content. Here, we propose an automatic approach to annotating and retrieving images based on a training set of images. We assume that regions in an image can be described using a small vocabulary of blobs. Blobs are generated from image features using clustering. Given a training set of images with annotations, we show that probabilistic models allow us to predict the probability of generating a word given the blobs in an image. This may be used to automatically annotate and retrieve images given a word as a query. We show that relevance models allow us to derive these probabilities in a natural way. Experiments show that the annotation performance of this cross-media relevance model is almost six times as good (in terms of mean precision) as a model based on word-blob co-occurrence and twice as good as a state-of-the-art model derived from machine translation. Our approach shows the usefulness of formal information retrieval models for the task of image annotation and retrieval.
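The abstract describes estimating the probability of a word given the blobs in an image from annotated training images. The following is a minimal sketch of one way such a relevance-model estimate could look, not the authors' implementation: each training image acts as a small joint "document" of words and blobs, and the score of a word is summed over training images. The function name `annotate`, the uniform prior over training images, the linear smoothing, and the weights `alpha` and `beta` are all assumptions made for illustration; the abstract does not specify them.

```python
from collections import Counter


def annotate(query_blobs, training, alpha=0.1, beta=0.9, top_k=3):
    """Rank words for an unannotated image described by its blob ids.

    training: list of (words, blobs) pairs, one per annotated image;
              both lists are assumed to be non-empty.
    alpha, beta: smoothing weights (hypothetical values, not from the abstract).
    """
    # Collection-wide counts, used to smooth the per-image estimates.
    word_bg = Counter(w for words, _ in training for w in words)
    blob_bg = Counter(b for _, blobs in training for b in blobs)
    n_words = sum(word_bg.values())
    n_blobs = sum(blob_bg.values())
    prior = 1.0 / len(training)  # assumed uniform prior over training images

    scores = Counter()
    for words, blobs in training:
        wc, bc = Counter(words), Counter(blobs)
        # Probability that this training image generates all query blobs.
        p_blobs = 1.0
        for b in query_blobs:
            p_blobs *= (1 - beta) * bc[b] / len(blobs) + beta * blob_bg[b] / n_blobs
        # Accumulate the joint score of each word with the query blobs.
        for w in word_bg:
            p_word = (1 - alpha) * wc[w] / len(words) + alpha * word_bg[w] / n_words
            scores[w] += prior * p_word * p_blobs

    total = sum(scores.values())
    if total == 0:  # a query blob never seen in training gives zero everywhere
        return []
    # Normalizing the joint scores yields an estimate of P(word | query blobs).
    return [(w, s / total) for w, s in scores.most_common(top_k)]


# Toy data: blob ids are integers that would come from clustering region features.
training = [
    (["tiger", "grass"], [1, 2]),
    (["sky", "plane"], [3, 4]),
    (["tiger", "water"], [1, 5]),
]
ranked = annotate([1, 2], training)
```

For retrieval with a one-word query, the same per-word probabilities could be computed for every test image and the images ranked by the probability of the query word; that step is omitted here.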
CITED BY 95
Jia-Yu Pan, Hyung-Jeong Yang, Christos Faloutsos, Pinar Duygulu, Automatic multimedia cross-modal correlation discovery, Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining, August 22-25, 2004, Seattle, WA, USA
G. Iyengar, P. Duygulu, S. Feng, P. Ircing, S. P. Khudanpur, D. Klakow, M. R. Krause, R. Manmatha, H. J. Nock, D. Petkova, B. Pytlik, P. Virga, Joint visual-text modeling for automatic retrieval of multimedia documents, Proceedings of the 13th annual ACM international conference on Multimedia, November 06-11, 2005, Hilton, Singapore
Kobus Barnard, Quanfu Fan, Ranjini Swaminathan, Anthony Hoogs, Roderic Collins, Pascale Rondot, John Kaufhold, Evaluation of Localized Semantics: Data, Methodology, and Experiments, International Journal of Computer Vision, v.77 n.1-3, p.199-217, May 2008
B. Vassiliadis, A. Stefani, L. Drossos, K. Ioannou, Knowledge discovery in multimedia repositories: the role of metadata, Proceedings of the 7th WSEAS International Conference on Mathematical Methods and Computational Techniques In Electrical Engineering, p.330-335, October 27-29, 2005, Sofia, Bulgaria
Changhu Wang, Feng Jing, Lei Zhang, Hong-Jiang Zhang, Scalable search-based image annotation of personal images, Proceedings of the 8th ACM international workshop on Multimedia information retrieval, October 26-27, 2006, Santa Barbara, California, USA
Jing Liu, Mingjing Li, Wei-Ying Ma, Qingshan Liu, Hanqing Lu, An adaptive graph model for automatic image annotation, Proceedings of the 8th ACM international workshop on Multimedia information retrieval, October 26-27, 2006, Santa Barbara, California, USA
Kai Song, Yonghong Tian, Wen Gao, Tiejun Huang, Diversifying the image retrieval results, Proceedings of the 14th annual ACM international conference on Multimedia, October 23-27, 2006, Santa Barbara, CA, USA
Gang Chen, Xiaoyan Li, Lidan Shou, Jinxiang Dong, Chun Chen, HISA: a query system bridging the semantic gap for large image databases, Proceedings of the 32nd international conference on Very large data bases, September 12-15, 2006, Seoul, Korea
Changhu Wang, Feng Jing, Lei Zhang, Hong-Jiang Zhang, Image annotation refinement using random walk with restarts, Proceedings of the 14th annual ACM international conference on Multimedia, October 23-27, 2006, Santa Barbara, CA, USA
Ritendra Datta, Weina Ge, Jia Li, James Z. Wang, Toward bridging the annotation-retrieval gap in image search by a generative modeling approach, Proceedings of the 14th annual ACM international conference on Multimedia, October 23-27, 2006, Santa Barbara, CA, USA
Xirong Li, Le Chen, Lei Zhang, Fuzong Lin, Wei-Ying Ma, Image annotation by large-scale content-based image retrieval, Proceedings of the 14th annual ACM international conference on Multimedia, October 23-27, 2006, Santa Barbara, CA, USA
Meng Wang, Xian-Sheng Hua, Xun Yuan, Yan Song, Li-Rong Dai, Optimizing multi-graph learning: towards a unified video annotation scheme, Proceedings of the 15th international conference on Multimedia, September 25-29, 2007, Augsburg, Germany
Ritendra Datta, Dhiraj Joshi, Jia Li, James Z. Wang, Image retrieval: Ideas, influences, and trends of the new age, ACM Computing Surveys (CSUR), v.40 n.2, p.1-60, April 2008
Jing Liu, Bin Wang, Mingjing Li, Zhiwei Li, Weiying Ma, Hanqing Lu, Songde Ma, Dual cross-media relevance model for image annotation, Proceedings of the 15th international conference on Multimedia, September 25-29, 2007, Augsburg, Germany
Xiangdong Zhou, Mei Wang, Qi Zhang, Junqi Zhang, Baile Shi, Automatic image annotation by an iterative approach: incorporating keyword correlations and region matching, Proceedings of the 6th ACM international conference on Image and video retrieval, p.25-32, July 09-11, 2007, Amsterdam, The Netherlands
Xiaoguang Rui, Mingjing Li, Zhiwei Li, Wei-Ying Ma, Nenghai Yu, Bipartite graph reinforcement model for web image annotation, Proceedings of the 15th international conference on Multimedia, September 25-29, 2007, Augsburg, Germany
Jonathon S. Hare, Paul H. Lewis, Peter G. B. Enser, Christine J. Sandom, Semantic facets: an in-depth analysis of a semantic image retrieval system, Proceedings of the 6th ACM international conference on Image and video retrieval, p.250-257, July 09-11, 2007, Amsterdam, The Netherlands
Omara Abdul Hamid, Muhammad Abdul Qadir, Nadeem Iftikhar, Mohib Ur Rehman, Mobin Uddin Ahmed, Imran Ihsan, Generic Multimedia Database Architecture Based upon Semantic Libraries, Informatica, v.18 n.4, p.483-510, December 2007
Yong Wang, Tao Mei, Shaogang Gong, Xian-Sheng Hua, Combining global, regional and contextual features for automatic image annotation, Pattern Recognition, v.42 n.2, p.259-266, February 2009
Roelof van Zwol, Vanessa Murdock, Lluis Garcia Pueyo, Georgina Ramirez, Diversifying image search with user generated content, Proceeding of the 1st ACM international conference on Multimedia information retrieval, October 30-31, 2008, Vancouver, British Columbia, Canada
Lei Wu, Xian-Sheng Hua, Nenghai Yu, Wei-Ying Ma, Shipeng Li, Flickr distance, Proceeding of the 16th ACM international conference on Multimedia, October 26-31, 2008, Vancouver, British Columbia, Canada
Jing Liu, Mingjing Li, Qingshan Liu, Hanqing Lu, Songde Ma, Image annotation via graph learning, Pattern Recognition, v.42 n.2, p.218-228, February 2009
Jinhui Tang, Haojie Li, Guo-Jun Qi, Tat-Seng Chua, Integrated graph-based semi-supervised multiple/single instance learning framework for image annotation, Proceeding of the 16th ACM international conference on Multimedia, October 26-31, 2008, Vancouver, British Columbia, Canada
Doina Ana Cernea, Esther Del Moral, Emilio Labra, Combining folksonomies and automatic information techniques for LO semantic indexing, Proceedings of the 7th conference on Data networks, communications, computers, p.79-84, November 07-09, 2008, Bucharest, Romania
Julien Ah-Pine, Marco Bressan, Stephane Clinchant, Gabriela Csurka, Yves Hoppenot, Jean-Michel Renders, Crossing textual and visual content in different application scenarios, Multimedia Tools and Applications, v.42 n.1, p.31-56, March 2009
Changyun Zhu, Kun Li, Qin Lv, Li Shang, Robert P. Dick, iScope: personalized multi-modality image search for mobile devices, Proceedings of the 7th international conference on Mobile systems, applications, and services, June 22-25, 2009, Wroclaw, Poland
Chien-Ju Ho, Tsung-Hsiang Chang, Jane Yung-Jen Hsu, PhotoSlap: a multi-player online game for semantic annotation, Proceedings of the 22nd national conference on Artificial intelligence, p.1359-1364, July 22-26, 2007, Vancouver, British Columbia, Canada