ACM Home Page
Please provide us with feedback. Feedback
Automatic image annotation and retrieval using cross-media relevance models
Full text PdfPdf (540 KB)
Source Annual ACM Conference on Research and Development in Information Retrieval archive
Proceedings of the 26th annual international ACM SIGIR conference on Research and development in informaion retrieval table of contents
Toronto, Canada
SESSION: Multimedia information retrieval table of contents
Pages: 119 - 126  
Year of Publication: 2003
ISBN:1-58113-646-3
Authors
J. Jeon  University of Massachusetts, Amherst, MA
V. Lavrenko  University of Massachusetts, Amherst, MA
R. Manmatha  University of Massachusetts, Amherst, MA
Sponsor
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 30,   Downloads (12 Months): 292,   Citation Count: 98
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/860435.860459
What is a DOI?

ABSTRACT

Libraries have traditionally used manual image annotation for indexing and then later retrieving their image collections. However, manual image annotation is an expensive and labor intensive procedure and hence there has been great interest in coming up with automatic ways to retrieve images based on content. Here, we propose an automatic approach to annotating and retrieving images based on a training set of images. We assume that regions in an image can be described using a small vocabulary of blobs. Blobs are generated from image features using clustering. Given a training set of images with annotations, we show that probabilistic models allow us to predict the probability of generating a word given the blobs in an image. This may be used to automatically annotate and retrieve images given a word as a query. We show that relevance models allow us to derive these probabilities in a natural way. Experiments show that the annotation performance of this cross-media relevance model is almost six times as good (in terms of mean precision) than a model based on word-blob co-occurrence model and twice as good as a state of the art model derived from machine translation. Our approach shows the usefulness of using formal information retrieval models for the task of image annotation and retrieval.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
 
2
K. Barnard and D. Forsyth. Learning the semantics of words and pictures. In International Conference on Computer Vision, Vol.2, pages 408--415, 2001.
3
4
 
5
 
6
W. B. Croft. Combining Approaches to Information Retrieval, in Advances in Information Retrieval ed. W. B. Croft, Kluwer Academic Publishers, Boston, MA.
 
7
 
8
 
9
 
10
 
11
D. Hiemstra Using Language Models for Information Retrieval. PhD dissertation, University of Twente, Enschede, The Netherlands, 2001.
12
13
14
 
15
Y. Mori, H. Takahashi, and R. Oka. Image-to-word transformation based on dividing and vector quantizing images with words. In MISRM'99 First International Workshop on Multimedia Intelligent Storage and Retrieval Management, 1999.
 
16
 
17
18

CITED BY  98

Collaborative Colleagues:
J. Jeon: colleagues
V. Lavrenko: colleagues
R. Manmatha: colleagues