Transductive multi-label learning for video concept detection
Source
International Multimedia Conference
Proceeding of the 1st ACM international conference on Multimedia information retrieval
Vancouver, British Columbia, Canada
SESSION: Video concept, action, and retrieval
Pages: 298-304  
Year of Publication: 2008
ISBN: 978-1-60558-312-9
Authors
Jingdong Wang, Microsoft Research Asia, Beijing, China
Yinghai Zhao, University of Science and Technology of China, Hefei, China
Xiuqing Wu, University of Science and Technology of China, Hefei, China
Xian-Sheng Hua, Microsoft Research Asia, Beijing, China
Sponsors
SIGMULTIMEDIA: ACM Special Interest Group on Multimedia
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1460096.1460145

ABSTRACT

Transductive video concept detection is an effective way to cope with the shortage of labeled videos. However, existing transductive methods do not adequately address a second issue: multi-label interdependence. Most solutions either apply a transductive single-label approach to detect each concept separately, ignoring the relations between concepts, or simply impose a smoothness assumption over the multiple labels of each video without actually exploring the interdependence between the concepts. On the other hand, semi-supervised extensions of supervised multi-label classifiers, such as correlative multi-label support vector machines, are usually intractable and hence impractical because of their high computational cost. In this paper, we propose an effective transductive multi-label classification approach that simultaneously models, in an integrated framework, the labeling consistency between visually similar videos and the multi-label interdependence within each video. We compare the proposed approach with several representative transductive single-label and supervised multi-label classification approaches on the video concept detection task over the widely used TRECVID data set. The comparative results demonstrate the superiority of the proposed approach.
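
To make the contrast in the abstract concrete, the following is a minimal sketch of the kind of transductive single-label baseline it describes: graph-based label propagation run independently for each concept. It is not the method proposed in the paper. It assumes a standard local-and-global-consistency style iteration, and the function names (rbf_affinity, normalize_affinity, propagate_labels), the parameters sigma, alpha and iters, and the toy features are all hypothetical choices made for illustration.

```python
import math


def rbf_affinity(features, sigma=1.0):
    """Pairwise Gaussian affinities between videos (zero on the diagonal)."""
    n = len(features)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(features[i], features[j]))
                W[i][j] = math.exp(-d2 / (2.0 * sigma ** 2))
    return W


def normalize_affinity(W):
    """Symmetric normalization S = D^{-1/2} W D^{-1/2}."""
    n = len(W)
    deg = [sum(row) for row in W]
    return [
        [W[i][j] / math.sqrt(deg[i] * deg[j]) if deg[i] > 0 and deg[j] > 0 else 0.0
         for j in range(n)]
        for i in range(n)
    ]


def propagate_labels(S, Y, alpha=0.9, iters=200):
    """Iterate F <- alpha * S * F + (1 - alpha) * Y.

    Y has one row per video and one column per concept:
    +1 positive, -1 negative, 0 unlabeled. Each column is propagated
    on its own, so no relation between concepts is modeled here.
    """
    n, k = len(Y), len(Y[0])
    F = [[float(v) for v in row] for row in Y]
    for _ in range(iters):
        F = [
            [alpha * sum(S[i][j] * F[j][c] for j in range(n)) + (1 - alpha) * Y[i][c]
             for c in range(k)]
            for i in range(n)
        ]
    return F


# Toy data (assumed): six videos in two visual clusters, two concepts,
# with one labeled video per cluster.
features = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
            [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
labels = [[1, -1], [0, 0], [0, 0],
          [-1, 1], [0, 0], [0, 0]]
scores = propagate_labels(normalize_affinity(rbf_affinity(features)), labels)
```

In this sketch the unlabeled videos inherit the sign of the labeled video in their own cluster, which is the labeling-consistency assumption between visually similar videos. Because each concept column is handled separately, nothing couples the two concepts; that missing coupling is the multi-label interdependence the abstract says such baselines ignore.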



