ABSTRACT
This paper proposes a hierarchical classification framework that bridges the semantic gap and performs multi-level image annotation automatically. First, the semantic gap between low-level computable visual features and users' real information needs is partitioned into four smaller gaps, and multiple approaches are proposed to bridge these smaller gaps more effectively. To learn more reliable contextual relationships between the atomic image concepts and the co-appearances of salient objects, a multi-modal boosting algorithm is proposed. To enable hierarchical image classification and avoid inter-level error transmission, a hierarchical boosting algorithm is proposed that incorporates a concept ontology and multi-task learning, so that hierarchical image classifiers are trained with automatic error recovery. To bridge the gap between the computable image concepts and the users' real information needs, a hyperbolic visualization framework is incorporated that gives users a global view of large-scale image collections and thereby supports intuitive query specification and evaluation. Experiments on large-scale image databases yielded very positive results.
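For readers unfamiliar with boosting, the following is a minimal sketch of plain AdaBoost with one-dimensional decision stumps. It is not the multi-modal or hierarchical boosting algorithm proposed in the paper, whose details the abstract does not give; it only illustrates the generic idea of combining weak classifiers by weighted vote while re-weighting misclassified training examples. All function names (`train_stump`, `adaboost`, `predict`) and the toy data are assumptions made for illustration.

```python
import math


def train_stump(xs, ys, weights):
    """Return (weighted_error, threshold, polarity) of the best 1-D stump.

    A stump predicts `polarity` when x >= threshold and `-polarity` otherwise.
    """
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            preds = [polarity if x >= threshold else -polarity for x in xs]
            err = sum(w for w, p, y in zip(weights, preds, ys) if p != y)
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best


def adaboost(xs, ys, rounds=5):
    """Train an ensemble of stumps; labels in `ys` must be +1 or -1."""
    n = len(xs)
    weights = [1.0 / n] * n  # start with uniform example weights
    ensemble = []
    for _ in range(rounds):
        err, threshold, polarity = train_stump(xs, ys, weights)
        # Clamp the error so the logarithm below stays finite.
        err = min(max(err, 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)  # vote weight of this stump
        ensemble.append((alpha, threshold, polarity))
        # Increase the weight of misclassified examples, decrease the rest.
        updated = []
        for w, x, y in zip(weights, xs, ys):
            pred = polarity if x >= threshold else -polarity
            updated.append(w * math.exp(-alpha * y * pred))
        total = sum(updated)
        weights = [w / total for w in updated]
    return ensemble


def predict(ensemble, x):
    """Weighted vote of all stumps; returns +1 or -1."""
    score = sum(
        alpha * (polarity if x >= threshold else -polarity)
        for alpha, threshold, polarity in ensemble
    )
    return 1 if score >= 0 else -1


if __name__ == "__main__":
    # Toy data (hypothetical): a single feature with a clean split at x = 4.
    features = [1, 2, 3, 4, 5, 6]
    labels = [-1, -1, -1, 1, 1, 1]
    model = adaboost(features, labels, rounds=3)
    print([predict(model, x) for x in features])
```

A hierarchical variant in the spirit of the abstract would presumably train such ensembles per concept node and share information across sibling concepts, but that design is not specified here and the sketch does not attempt it.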