ABSTRACT
This paper describes revised content-based search experiments in the context of the TRECVID 2003 benchmark. The experiments focus on measuring content-based video retrieval performance with the following search cues: visual features, semantic concepts, and text. The features are fused using weights and similarity ranks. Visual similarity is computed using Temporal Gradient Correlogram and Temporal Color Correlogram features, which are extracted from the dynamic content of a video shot. Automatic speech recognition transcripts and concept detectors enable higher-level semantic searching. The experiments used 60 hours of news video from the TRECVID 2003 search task. System performance was evaluated on 25 pre-defined search topics using average precision. In visual search, multiple examples improved the results over single-example search. Weighted fusion of text, concept, and visual features improved performance over the text search baseline. Expanding the query term list of text queries also gave a notable increase in performance over the baseline text search.
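The abstract names two mechanisms without giving formulas: fusion of features by weights and similarity ranks, and evaluation by average precision. The Python sketch below illustrates one plausible reading of each. It is an assumption, not the authors' implementation: the function names, the example weights, the shot identifiers, and the rule for shots missing from a ranked list are all hypothetical. The fusion shown is a plain weighted sum of rank positions, and the metric is the standard non-interpolated average precision.

```python
from typing import Dict, List, Set


def weighted_rank_fusion(rankings: Dict[str, List[str]],
                         weights: Dict[str, float]) -> List[str]:
    """Fuse per-feature ranked shot lists by a weighted sum of ranks.

    Lower fused score is better. This is an illustrative scheme only;
    the paper's exact fusion formula is not stated in the abstract.
    """
    shots = {shot for ranked in rankings.values() for shot in ranked}
    scores: Dict[str, float] = {}
    for shot in shots:
        total = 0.0
        for feature, ranked in rankings.items():
            # Assumption: a shot absent from a list gets rank len(list) + 1.
            rank = ranked.index(shot) + 1 if shot in ranked else len(ranked) + 1
            total += weights[feature] * rank
        scores[shot] = total
    # Ties are broken by shot id so the output is deterministic.
    return sorted(shots, key=lambda s: (scores[s], s))


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Non-interpolated average precision of one ranked result list."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for position, shot in enumerate(ranked, start=1):
        if shot in relevant:
            hits += 1
            precision_sum += hits / position
    return precision_sum / len(relevant)


# Hypothetical data: two search cues, each returning a ranked list of shots.
rankings = {
    "text": ["s1", "s2", "s3"],
    "visual": ["s3", "s1", "s4"],
}
weights = {"text": 0.6, "visual": 0.4}  # made-up weights for illustration

fused = weighted_rank_fusion(rankings, weights)
ap = average_precision(fused, relevant={"s1", "s2"})
print(fused, round(ap, 4))
```

With these made-up inputs, the text cue carries more weight, so a shot ranked highly by text alone can still outrank one that only the visual cue favors. Averaging `average_precision` over all search topics would give a single summary figure per system configuration.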