Flickr distance
Source
International Multimedia Conference
Proceeding of the 16th ACM international conference on Multimedia
Vancouver, British Columbia, Canada
SESSION: Best paper session
Pages 31-40
Year of Publication: 2008
ISBN: 978-1-60558-303-7
Authors
Lei Wu, University of Science and Technology of China, Hefei, China
Xian-Sheng Hua, Microsoft Research Asia, Beijing, China
Nenghai Yu, University of Science and Technology of China, Hefei, China
Wei-Ying Ma, Microsoft Research Asia, Beijing, China
Shipeng Li, Microsoft Research Asia, Beijing, China
Bibliometrics
Downloads (6 Weeks): 21, Downloads (12 Months): 506, Citation Count: 3
ABSTRACT
This paper presents Flickr distance, a novel measure of the relationship between semantic concepts (objects, scenes) in the visual domain. For each concept, a collection of images is obtained from Flickr, from which an improved latent-topic-based visual language model is built to capture the visual characteristics of the concept. The Flickr distance between two concepts is then measured as the square root of the Jensen-Shannon (JS) divergence between the corresponding visual language models. Compared with WordNet, Flickr distance can handle far more of the concepts that exist on the Web, and it scales as concept vocabularies grow. Compared with Google distance, which is generated in the textual domain, Flickr distance is more precise for visual-domain concepts, because it captures the visual relationship between concepts rather than their co-occurrence in text search results. Moreover, unlike Google distance, Flickr distance satisfies the triangle inequality, which makes it a more reasonable distance metric. Both a subjective user study and an objective evaluation show that Flickr distance is more consistent with human perception than Google distance. We also design several application scenarios, such as concept clustering and image annotation, to demonstrate the effectiveness of the proposed distance in image-related applications.
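The core computation named in the abstract, the square root of the Jensen-Shannon divergence, can be illustrated with a minimal sketch. This is not the authors' implementation: the paper builds latent-topic visual language models from Flickr images, whereas the sketch below assumes each concept has already been reduced to a single discrete probability distribution over a small vocabulary of visual words. The function names, the base-2 logarithm, and the toy distributions are all assumptions chosen for illustration.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits.

    Terms where p_i == 0 contribute nothing, by the usual convention.
    """
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: average KL of p and q to their mixture m."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def js_distance(p, q):
    """Square root of the JS divergence (clamped at 0 against rounding)."""
    return math.sqrt(max(js_divergence(p, q), 0.0))


# Hypothetical visual-word distributions for three concepts (made-up numbers).
sunset = [0.5, 0.3, 0.1, 0.1]
beach = [0.4, 0.3, 0.2, 0.1]
keyboard = [0.05, 0.05, 0.2, 0.7]

d_sb = js_distance(sunset, beach)
d_bk = js_distance(beach, keyboard)
d_sk = js_distance(sunset, keyboard)

# The square root of the JS divergence is a metric, so the direct
# distance never exceeds the distance through a third concept.
print(d_sk <= d_sb + d_bk)
```

With base-2 logarithms the JS divergence lies in [0, 1], so this distance is 0 for identical distributions and 1 for distributions with no overlapping support. The triangle-inequality check at the end reflects the property the abstract highlights as an advantage over Google distance.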
CITED BY 3
Bo Geng, Linjun Yang, Chao Xu, Xian-Sheng Hua, Collaborative learning for image and video annotation, Proceeding of the 1st ACM international conference on Multimedia information retrieval, October 30-31, 2008, Vancouver, British Columbia, Canada