ABSTRACT
The detection of high-level concepts in video data is an essential processing step of a video retrieval system. The meaning and appearance of certain events or concepts are strongly related to contextual information. For example, the appearance of semantic concepts such as entertainment or news anchors is determined by the editing layout used, which is usually typical of a particular broadcasting station. In recent years, supervised machine learning approaches have been used extensively to learn and detect high-level concepts in video shots. Semi-supervised learning methods additionally incorporate unlabeled data in the learning process. Transductive learning is a subclass of semi-supervised learning: in the transductive setting, all training samples are labeled, but the unlabeled test samples are considered in the learning process as well. Up to now, transductive learning has not been applied to video indexing and retrieval. In this paper, we propose transductive learning, realized by transductive support vector machines (TSVM), for the detection of those high-level concepts whose appearance is strongly related to a particular video. For each video and each concept, a separate transductive model is learned and adapted to the appearance of that concept in the particular test video. Experimental results on TRECVID 2005 video data demonstrate the feasibility of the proposed transductive learning approach for several high-level concepts.
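The per-video adaptation described above can be illustrated with a minimal sketch. It is not the TSVM solver used in the paper: as a stand-in, it trains a linear SVM by subgradient descent and then repeatedly pseudo-labels the unlabeled shots of one test video and retrains on the combined set. The function names, the synthetic 2-D features, and all parameter values (`rounds`, `unlabeled_weight`, learning rate) are assumptions made for illustration only.

```python
import numpy as np

def train_linear_svm(X, y, sample_weight, epochs=200, lr=0.1, reg=0.01):
    """Fit a linear SVM (labels in {-1, +1}) by subgradient descent
    on the weighted hinge loss with L2 regularization."""
    w = np.zeros(X.shape[1])
    b = 0.0
    n = len(y)
    for _ in range(epochs):
        margins = y * (X @ w + b)
        # Only samples violating the margin contribute to the subgradient.
        active = (margins < 1).astype(float) * sample_weight
        grad_w = reg * w - (active * y) @ X / n
        grad_b = -np.sum(active * y) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def transductive_adapt(X_train, y_train, X_video, rounds=5, unlabeled_weight=0.5):
    """Adapt a concept classifier to one test video (hypothetical helper).

    The unlabeled shots of the video receive pseudo-labels from the current
    model and join the training set with a reduced weight. This is a
    self-training approximation, not a true TSVM optimization.
    """
    w, b = train_linear_svm(X_train, y_train, np.ones(len(y_train)))
    for _ in range(rounds):
        pseudo = np.where(X_video @ w + b >= 0, 1.0, -1.0)
        X_all = np.vstack([X_train, X_video])
        y_all = np.concatenate([y_train, pseudo])
        weights = np.concatenate(
            [np.ones(len(y_train)), np.full(len(pseudo), unlabeled_weight)]
        )
        w, b = train_linear_svm(X_all, y_all, weights)
    return X_video @ w + b  # decision scores for the shots of this video

# Synthetic stand-in data: labeled training shots and one test video whose
# features are shifted, mimicking a station-specific appearance.
rng = np.random.default_rng(0)
y_train = np.repeat([-1.0, 1.0], 50)
X_train = rng.normal(0.0, 0.5, size=(100, 2)) + np.outer(y_train, [3.0, 0.0])
y_video = np.repeat([-1.0, 1.0], 30)  # used only to evaluate the sketch
X_video = rng.normal(0.0, 0.5, size=(60, 2)) + np.outer(y_video, [3.0, 0.0]) + [0.5, 0.0]

scores = transductive_adapt(X_train, y_train, X_video)
accuracy = float(np.mean(np.sign(scores) == y_video))
print(f"accuracy on the test video: {accuracy:.2f}")
```

In the setting of the abstract, a function like `transductive_adapt` would be called once per (video, concept) pair, so that each model sees only the unlabeled shots of the video it is meant to label.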