ABSTRACT
Online photo sharing systems, such as Flickr and Picasa, provide a valuable source of human-annotated photos. Textual annotations describe not only the visual content of an image but also its subjective, spatial, temporal and social dimensions, which complicates keyword-based search. In this paper we investigate a method that exploits visual annotations, e.g. notes in Flickr, to enhance the retrieval performance of keyword-based systems. For this purpose we adopt the bag-of-visual-words approach for content-based image retrieval as our baseline. We then apply rank aggregation to the top 25 results obtained with each of a set of visual annotations that match the keyword-based query. Our retrieval experiments show significant improvements in retrieval performance when comparing the aggregated approach with our baseline, which also slightly outperforms text-only search. Combining the aggregated approach with a textual filter on the search space yields an additional boost in retrieval performance, which underlines the need for large-scale content-based image retrieval techniques to complement text-based search.
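The rank aggregation step can be illustrated with a minimal sketch. The abstract does not say which aggregation method is used, so the Borda-style count below is an assumption, as are the function name `borda_aggregate`, the tie-breaking rule, and the image identifiers. Only the cutoff of 25 results per list comes from the text.

```python
from collections import defaultdict


def borda_aggregate(ranked_lists, top_k=25):
    """Fuse several ranked result lists with a Borda-style count.

    Assumption: the source only says "rank aggregation of the top 25
    results"; the Borda scoring used here is one possible choice.
    """
    scores = defaultdict(int)
    for ranking in ranked_lists:
        # Only the top_k results of each list contribute votes.
        for position, image_id in enumerate(ranking[:top_k]):
            # The first result earns top_k points, the next top_k - 1, etc.
            scores[image_id] += top_k - position
    # Highest total first; ties are broken by image id for determinism.
    return sorted(scores, key=lambda image_id: (-scores[image_id], image_id))


# Hypothetical result lists, one per visual annotation matching the query.
results_per_annotation = [
    ["img_a", "img_b", "img_c"],
    ["img_b", "img_a", "img_d"],
    ["img_b", "img_c", "img_a"],
]

fused = borda_aggregate(results_per_annotation)
print(fused)
```

An image that ranks highly for several visual annotations accumulates more points than one that ranks first for a single annotation only, which is the intuition behind aggregating over multiple query examples rather than relying on one.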