ABSTRACT
We present a complete and efficient framework for video shot indexing and retrieval. Video shots are described by their key-frames, which are in turn described by their regions. Region-based approaches suffer from the complexity of the segmentation and comparison tasks. A compact region-based shot representation is usually obtained with a vector quantization method. We thus introduce latent semantic analysis (LSA) to reduce the noise inherent in the segmentation and quantization processes. Then, to better capture the content of video shots, we propose two original methods. The first takes advantage of a multi-scale segmentation of frames, while the second uses multiple frames to represent a shot. Both approaches require more computation time during pre-processing but not for the indexing and comparison tasks, because the extra information is included in the original signatures of shots. Finally, we introduce a relevance feedback loop to optimize the search and propose a new method to optimize the effect of LSA. In the experimental section, we evaluate latent semantic analysis and the proposed approaches on two problems, namely object retrieval and semantic content estimation.
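As an illustration only, the general idea of applying LSA to quantized region signatures can be sketched as follows. The abstract gives no implementation details, so everything here is an assumption: the function names, the toy count matrix (rows are region clusters from a hypothetical vector quantizer, columns are shots), the number of retained dimensions `k`, and the use of cosine similarity for comparison.

```python
import numpy as np

def lsa_fit(counts, k):
    """Return the top-k left singular vectors of a (clusters x shots) count matrix."""
    U, s, Vt = np.linalg.svd(counts, full_matrices=False)
    return U[:, :k]

def lsa_project(U_k, signature):
    """Project a shot signature (vector of cluster counts) into the latent space."""
    return U_k.T @ signature

def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

# Hypothetical toy data: 4 region clusters (rows) x 4 shots (columns).
# Shots 0 and 1 mostly share clusters 0-1; shots 2 and 3 share clusters 2-3.
counts = np.array([
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 3, 1],
    [0, 0, 1, 3],
], dtype=float)

U_k = lsa_fit(counts, k=2)  # k=2 is an arbitrary choice for this toy example
latent = [lsa_project(U_k, counts[:, j]) for j in range(counts.shape[1])]

raw_same = cosine(counts[:, 0], counts[:, 1])  # similarity on raw signatures
sim_same = cosine(latent[0], latent[1])        # similarity after LSA projection
sim_diff = cosine(latent[0], latent[2])        # shots with disjoint clusters
```

In this toy case, shots 0 and 1 differ only in how their regions were distributed over two related clusters. Truncating the decomposition merges those clusters into one latent dimension, so the two shots become more similar than their raw signatures suggest, while shots with unrelated content stay apart.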