Adapting appearance models of semantic concepts to particular videos via transductive learning
Source
International Multimedia Conference archive
Proceedings of the international workshop on multimedia information retrieval
Augsburg, Bavaria, Germany
POSTER SESSION: Video retrieval and annotation
Pages: 187 - 196  
Year of Publication: 2007
ISBN:978-1-59593-778-0
Authors
Ralph Ewerth  University of Marburg, Marburg, Germany
Bernd Freisleben  University of Marburg, Marburg, Germany
Sponsors
SIGMULTIMEDIA: ACM Special Interest Group on Multimedia
ACM: Association for Computing Machinery
SIGGRAPH: ACM Special Interest Group on Computer Graphics and Interactive Techniques
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1290082.1290110

ABSTRACT

The detection of high-level concepts in video data is an essential processing step of a video retrieval system. The meaning and the appearance of certain events or concepts are strongly related to contextual information. For example, the appearance of semantic concepts such as entertainment or news anchors is determined by the editing layout used, which is usually typical of a particular broadcasting station. In recent years, supervised machine learning approaches have been used extensively to learn and detect high-level concepts in video shots. Semi-supervised learning methods additionally incorporate unlabeled data in the learning process. Transductive learning is a subclass of semi-supervised learning: in the transductive setting, all training samples are labeled, but the unlabeled test samples are considered in the learning process as well. Up to now, transductive learning has not been applied to video indexing and retrieval. In this paper, we propose transductive learning, realized by transductive support vector machines (TSVM), for the detection of those high-level concepts whose appearance is strongly related to a particular video. For each video and each concept, a transductive model is learned separately and adapted to the appearance of the specific concept in that particular test video. Experimental results on TRECVID 2005 video data demonstrate the feasibility of the proposed transductive learning approach for several high-level concepts.
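
The transductive setting described above can be illustrated with a small sketch. This is not the authors' TSVM and makes no claim about their features or parameters: it swaps in a much simpler stand-in, a nearest-centroid classifier that is iteratively re-estimated from the labeled training shots plus the currently predicted labels of the unlabeled shots of one test video. All names (`adapt_to_video`, `adapt_all`, the toy feature vectors) are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def centroid(points: List[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of feature vectors."""
    dim = len(points[0])
    return [sum(p[i] for p in points) / len(points) for i in range(dim)]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_label(x: Vector, c_pos: Vector, c_neg: Vector) -> int:
    """+1 if x is at least as close to the positive centroid, else -1."""
    return 1 if sq_dist(x, c_pos) <= sq_dist(x, c_neg) else -1


def adapt_to_video(train_x: List[Vector], train_y: List[int],
                   test_x: List[Vector], max_iter: int = 20) -> List[int]:
    """Label the shots of ONE test video for ONE concept (labels in {-1, +1}).

    Assumption: a centroid-based self-labeling loop stands in for a TSVM.
    Training labels stay fixed; only the test labels are revised.
    """
    pos = [x for x, y in zip(train_x, train_y) if y == 1]
    neg = [x for x, y in zip(train_x, train_y) if y == -1]
    if not pos or not neg:
        raise ValueError("training data must contain both classes")

    # Inductive starting point: centroids from labeled training data only.
    c_pos, c_neg = centroid(pos), centroid(neg)
    labels = [nearest_label(x, c_pos, c_neg) for x in test_x]

    for _ in range(max_iter):
        # Transductive step: the unlabeled test shots, with their current
        # predicted labels, take part in re-estimating the model.
        c_pos = centroid(pos + [x for x, l in zip(test_x, labels) if l == 1])
        c_neg = centroid(neg + [x for x, l in zip(test_x, labels) if l == -1])
        new_labels = [nearest_label(x, c_pos, c_neg) for x in test_x]
        if new_labels == labels:  # converged: no test label changed
            break
        labels = new_labels
    return labels


def adapt_all(train_x: List[Vector], train_y: List[int],
              videos: Dict[str, List[Vector]]) -> Dict[str, List[int]]:
    """Learn a separate adapted model per test video, as the abstract describes."""
    return {vid: adapt_to_video(train_x, train_y, shots)
            for vid, shots in videos.items()}
```

On toy one-dimensional data where the test video's shots are shifted relative to the training data, the starting centroids place the boundary too low, and the re-estimation step moves it toward the gap between the two clusters of that video. A real system would run this once per concept and per video, with a proper TSVM implementation in place of the centroid model.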


