Label to region by bi-layer sparsity priors
Source
International Multimedia Conference archive
Proceedings of the 17th ACM International Conference on Multimedia
Beijing, China
SESSION: Content track C3: image annotation and tagging
Pages 115-124  
Year of Publication: 2009
ISBN: 978-1-60558-608-3
Authors
Xiaobai Liu  National University of Singapore, Singapore; Huazhong University of Science and Technology, Wuhan, China
Bin Cheng  National University of Singapore, Singapore
Shuicheng Yan  National University of Singapore, Singapore
Jinhui Tang  National University of Singapore, Singapore
Tat Seng Chua  National University of Singapore, Singapore
Hai Jin  Huazhong University of Science and Technology, Wuhan, China
Sponsor
SIGMULTIMEDIA: ACM Special Interest Group on Multimedia
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1631272.1631291

ABSTRACT

In this work, we investigate how to automatically reassign manually annotated image-level labels to contextually derived semantic regions. First, we propose a bi-layer sparse coding formulation for uncovering how an image or semantic region can be robustly reconstructed from the over-segmented image patches of an image set. We then harness it for automatic label-to-region assignment over the entire image set. The bi-layer sparse coding problem is solved by convex l1-norm minimization. The underlying philosophy of bi-layer sparse coding is that an image or semantic region can be sparsely reconstructed from atomic image patches belonging to images with common labels, while robust label propagation requires that the selected atomic patches come from very few images. Each layer of sparse coding assigns image labels to the selected atomic patches and merged candidate regions based on the shared image labels. The results from all bi-layer sparse codings over all candidate regions are then fused to obtain the complete label-to-region assignment. In addition, the presented bi-layer sparse coding framework can be naturally applied to annotate new test images. Extensive experiments on three public image datasets demonstrate the effectiveness of the proposed framework in both label-to-region assignment and image annotation tasks.
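The core idea of sparse reconstruction by l1-norm minimization, followed by label propagation from the selected patches, can be illustrated with a minimal sketch. This is not the paper's bi-layer formulation: it is a plain single-layer Lasso solved with ISTA (iterative soft-thresholding), and it does not enforce the "very few images" constraint. The function names (`ista_lasso`, `propagate_labels`), the dictionary `D` of patch features, the weight `lam`, and the coefficient-weighted label scoring are all assumptions made for illustration.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (elementwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def objective(D, x, a, lam):
    """Lasso objective: 0.5 * ||x - D a||_2^2 + lam * ||a||_1."""
    r = x - D @ a
    return 0.5 * float(r @ r) + lam * float(np.abs(a).sum())

def ista_lasso(D, x, lam=0.1, n_iter=500):
    """Approximately minimize the Lasso objective with ISTA.

    D: (d, n) matrix whose columns are patch feature vectors (assumed).
    x: (d,) feature vector of the region or image to reconstruct.
    """
    step = 1.0 / (np.linalg.norm(D, 2) ** 2)  # 1 / Lipschitz constant
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)
        a = soft_threshold(a - step * grad, step * lam)
    return a

def propagate_labels(a, patch_labels, n_labels):
    """Score each label by the total |coefficient| of the patches whose
    source image carries that label (hypothetical scoring rule)."""
    scores = np.zeros(n_labels)
    for coef, labels in zip(a, patch_labels):
        for k in labels:
            scores[k] += abs(coef)
    return scores

# Toy data: 50 random unit-norm "patches", target built from two of them.
rng = np.random.default_rng(0)
D = rng.standard_normal((20, 50))
D /= np.linalg.norm(D, axis=0)
truth = np.zeros(50)
truth[[3, 17]] = [1.0, -0.5]
x = D @ truth
a = ista_lasso(D, x, lam=0.01, n_iter=2000)
```

Under these assumptions, the nonzero entries of `a` pick out the patches used to reconstruct `x`, and `propagate_labels` turns them into per-label scores. A closer match to the abstract would add a second, image-level sparsity term so that the selected patches concentrate in few images.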

