Enhancing semantic and geographic annotation of web images via logistic canonical correlation regression
Source
International Multimedia Conference
Proceedings of the seventeenth ACM international conference on Multimedia
Beijing, China
SESSION: Content track C3: image annotation and tagging
Pages: 125-134
Year of Publication: 2009
ISBN: 978-1-60558-608-3
Authors
Liangliang Cao, University of Illinois at Urbana-Champaign, Urbana, IL, USA
Jie Yu, Kodak Research Laboratories, Eastman Kodak Company, Rochester, NY, USA
Jiebo Luo, Kodak Research Laboratories, Eastman Kodak Company, Rochester, NY, USA
Thomas S. Huang, University of Illinois at Urbana-Champaign, Urbana, IL, USA
Bibliometrics
Downloads (6 Weeks): 54, Downloads (12 Months): 131, Citation Count: 0
ABSTRACT
Photo community sites such as Flickr and Picasa Web Albums host a massive number of personal photos, with millions of new photos uploaded every month. These photos constitute an overwhelming source of images that requires effective management, and there is an increasingly urgent need for semantic annotation of these web images. This paper addresses the problem by considering two kinds of annotation: semantic annotation and geographic annotation. Both are useful for image search and retrieval and for facilitating communities and social networks. This paper proposes a novel method, Logistic Canonical Correlation Regression (LCCR), for the annotation task. The model exploits the canonical correlation between heterogeneous features and an annotation lexicon of interest, and builds a generalized annotation engine based on canonical correlations in order to produce enhanced annotation for web images. We validate the effectiveness of our algorithm using a dataset of over 380,000 images tagged with GPS coordinates.
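To make the notion of canonical correlation between image features and an annotation lexicon more concrete, the following is a minimal sketch of classical linear CCA between a feature matrix and a binary tag-indicator matrix, written with NumPy. It is not the LCCR model itself, whose formulation is not given in this abstract. The function name `cca`, the ridge term `reg`, and the synthetic data are all assumptions introduced purely for illustration.

```python
import numpy as np

def cca(X, Y, n_components=2, reg=1e-3):
    """Classical linear CCA via whitening and SVD (illustrative only, not LCCR).

    X: (n, dx) feature matrix; Y: (n, dy) tag-indicator matrix.
    reg is an assumed ridge term that keeps the covariances invertible.
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)

    def inv_sqrt(S):
        # Inverse square root of a symmetric positive-definite matrix.
        w, V = np.linalg.eigh(S)
        return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

    Kx, Ky = inv_sqrt(Sxx), inv_sqrt(Syy)
    # Singular values of the whitened cross-covariance are the
    # (regularized) canonical correlations, in descending order.
    U, s, Vt = np.linalg.svd(Kx @ Sxy @ Ky)
    A = Kx @ U[:, :n_components]      # projection for the features
    B = Ky @ Vt.T[:, :n_components]   # projection for the tags
    return A, B, s[:n_components]

# Synthetic stand-in data: one hidden factor drives both views.
rng = np.random.default_rng(0)
z = rng.normal(size=(500, 1))
X = z @ np.array([[1.0, -1.0, 0.5, 0.0, 0.0]]) + 0.1 * rng.normal(size=(500, 5))
Y = (z @ np.array([[1.0, -1.0, 1.0]]) + 0.1 * rng.normal(size=(500, 3)) > 0).astype(float)

A, B, rho = cca(X, Y, n_components=2)
print("canonical correlations:", rho)
```

In a sketch like this, the projected features `(X - X.mean(axis=0)) @ A` could then be fed to any classifier to score tags; how LCCR actually combines the correlations with logistic regression is described in the full paper rather than here.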