Toward bridging the annotation-retrieval gap in image search by a generative modeling approach
Source International Multimedia Conference archive
Proceedings of the 14th annual ACM international conference on Multimedia
Santa Barbara, CA, USA
SESSION: Content session 6: multimedia exploration
Pages: 977-986
Year of Publication: 2006
ISBN:1-59593-447-2
Authors
Ritendra Datta  Pennsylvania State University
Weina Ge  Pennsylvania State University
Jia Li  Pennsylvania State University
James Z. Wang  Pennsylvania State University
Sponsors
ACM: Association for Computing Machinery
SIGMULTIMEDIA: ACM Special Interest Group on Multimedia
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1180639.1180856

ABSTRACT

While automatic image annotation remains an actively pursued research topic, enhancement of image search through its use has not been extensively explored. We propose an annotation-driven image retrieval approach and argue that under a number of different scenarios, this is very effective for semantically meaningful image search. In particular, our system is demonstrated to effectively handle cases of partially tagged and completely untagged image databases, multiple-keyword queries, and example-based queries with or without tags, all in near-realtime. Because our approach utilizes extra knowledge from a training dataset, it outperforms state-of-the-art visual-similarity-based retrieval techniques. For this purpose, a novel structure-composition model constructed from Beta distributions is developed to capture the spatial relationship among segmented regions of images. This model, combined with the Gaussian mixture model, produces scalable categorization of generic images. The categorization results are found to surpass previously reported results in speed and accuracy. Our novel annotation framework utilizes the categorization results to select tags based on term frequency, term saliency, and a WordNet-based measure of congruity, to boost salient tags while penalizing potentially unrelated ones. A bag-of-words distance measure based on WordNet is used to compute semantic similarity. The effectiveness of our approach is shown through extensive experiments.
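
The abstract does not spell out the bag-of-words distance, so the following is only a minimal sketch of the general idea, not the authors' measure. It assumes a small hand-written is-a taxonomy (`TOY_HYPERNYMS`, hypothetical) in place of WordNet, a simple path-length similarity, and a symmetric best-match average over the two tag sets; all function names are illustrative.

```python
# Hypothetical toy is-a taxonomy standing in for WordNet (child -> parent).
TOY_HYPERNYMS = {
    "dog": "animal",
    "cat": "animal",
    "animal": "entity",
    "tree": "plant",
    "grass": "plant",
    "plant": "entity",
    "sky": "entity",
}


def ancestors(word):
    """Map the word and each of its ancestors to the number of is-a steps up."""
    chain = {}
    steps = 0
    while word is not None:
        chain[word] = steps
        word = TOY_HYPERNYMS.get(word)
        steps += 1
    return chain


def path_similarity(a, b):
    """Return 1 / (1 + shortest is-a path length), or 0.0 if no shared ancestor."""
    up_a, up_b = ancestors(a), ancestors(b)
    common = set(up_a) & set(up_b)
    if not common:
        return 0.0
    shortest = min(up_a[c] + up_b[c] for c in common)
    return 1.0 / (1.0 + shortest)


def bag_distance(tags_a, tags_b):
    """Symmetric average best-match distance between two bags of tags (assumed form)."""
    if not tags_a or not tags_b:
        return 1.0

    def directed(src, dst):
        # For each source tag, take its closest tag in the other bag.
        gaps = [1.0 - max(path_similarity(s, d) for d in dst) for s in src]
        return sum(gaps) / len(gaps)

    return 0.5 * (directed(tags_a, tags_b) + directed(tags_b, tags_a))


if __name__ == "__main__":
    query_tags = ["dog", "grass"]
    print(bag_distance(query_tags, ["cat", "tree"]))
    print(bag_distance(query_tags, ["sky"]))
```

Under these assumptions, images whose tags sit close together in the taxonomy get a smaller distance than images with unrelated tags, which is the property a semantic ranking of annotated images would rely on. A real system would swap the toy taxonomy for WordNet lookups.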


