ABSTRACT
Automatic image annotation has been actively pursued by multimedia researchers in recent years. Modest performance guarantees and limited adaptability often restrict its applicability in real-world settings. We propose tagging over time (T/T) to push the technology toward real-world applicability. Of particular interest are online systems that receive user-provided images and feedback over time, where user focus may change and evolve. The T/T framework consists of a principled probabilistic approach to meta-learning, which acts as a go-between for a 'black-box' annotation system and its users. Inspired by inductive transfer, the approach attempts to harness available information, including the black-box model's performance, the image representations, and the WordNet ontology. Being computationally 'lightweight', the meta-learner can be re-trained efficiently over time to improve and to adapt to changes. The black-box annotation model does not need to be re-trained, which allows computationally intensive algorithms to be used for it. We experiment with standard image datasets and real-world data streams in both batch and online annotation settings, using two existing annotation systems as black-boxes. Adding the meta-learning layer yields much improved results that outperform the best-known results. In the online setting, the T/T approach produces progressively better annotation over time, significantly outperforming both the black-box and the static form of the meta-learner on real-world data.
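To make the idea of a lightweight layer between a black-box annotator and its users concrete, here is a minimal sketch. It is not the paper's probabilistic model: it uses neither image representations nor WordNet, and the class name, the Beta-style pseudo-counts, and the `decay` parameter are all assumptions introduced only for illustration. It tracks how often users confirm each tag the black-box proposes, re-ranks new predictions by that estimate, and discounts old feedback so the estimate can follow changing user focus.

```python
from collections import defaultdict


class TagReliabilityMetaLearner:
    """Hypothetical wrapper around a black-box annotator (illustration only).

    Keeps per-tag pseudo-counts of confirmed and rejected predictions and
    uses them to re-rank whatever tags the black-box proposes.
    """

    def __init__(self, prior_correct=1.0, prior_wrong=1.0, decay=0.99):
        # Prior pseudo-counts (assumed values) keep estimates away from 0 and 1.
        self.prior_correct = prior_correct
        self.prior_wrong = prior_wrong
        # decay < 1 down-weights old feedback; decay == 1 never forgets.
        self.decay = decay
        self.correct = defaultdict(float)
        self.wrong = defaultdict(float)

    def reliability(self, tag):
        """Smoothed estimate of P(tag is right | black-box predicted it)."""
        c = self.correct[tag] + self.prior_correct
        w = self.wrong[tag] + self.prior_wrong
        return c / (c + w)

    def rerank(self, predicted_tags, top_k=5):
        """Order black-box tags by estimated reliability, keep the top_k.

        sorted() is stable, so ties keep the black-box's original order.
        """
        ranked = sorted(predicted_tags, key=self.reliability, reverse=True)
        return ranked[:top_k]

    def update(self, predicted_tags, true_tags):
        """Fold in one round of user feedback without touching the black-box."""
        truth = set(true_tags)
        for tag in predicted_tags:
            self.correct[tag] *= self.decay
            self.wrong[tag] *= self.decay
            if tag in truth:
                self.correct[tag] += 1.0
            else:
                self.wrong[tag] += 1.0


if __name__ == "__main__":
    learner = TagReliabilityMetaLearner(decay=1.0)
    # The black-box proposed "sky" and "car"; the user confirmed only "sky".
    learner.update(["sky", "car"], ["sky"])
    print(learner.rerank(["car", "sky", "tree"], top_k=2))
```

Only the wrapper's counters change on each round of feedback, which mirrors the abstract's point that the black-box itself need not be re-trained; everything else about the sketch is a simplification.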