Continuous visual vocabulary models for pLSA-based scene recognition
Source
Conference On Image And Video Retrieval
Proceedings of the 2008 international conference on Content-based image and video retrieval
Niagara Falls, Canada
POSTER SESSION: Poster/reception
Pages 319-328
Year of Publication: 2008
ISBN: 978-1-60558-070-8
ABSTRACT
Topic models such as probabilistic Latent Semantic Analysis (pLSA) and Latent Dirichlet Allocation (LDA) have been shown to perform well in various image content analysis tasks. However, because these models originate in the text domain, almost all prior work uses discrete vocabularies even when they are applied in the image domain. In these works, the continuous local features used to describe an image must therefore be quantized to fit the model. In this work we propose and evaluate three different extensions to the pLSA framework in which words are modeled as continuous feature vector distributions rather than crudely quantized high-dimensional descriptors. The performance of these continuous vocabulary models is compared in an automatic scene recognition task. Our experiments clearly show that the continuous approaches outperform the standard pLSA model. All equations required for parameter estimation and inference are given for each of the three models.
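For orientation, the discrete baseline that the abstract contrasts against can be sketched in a few lines. This is a minimal sketch of standard pLSA fitted with EM on a matrix of quantized visual-word counts, not any of the three continuous models proposed in the paper. The function name `plsa_em`, its parameters, and the toy count matrix are illustrative assumptions and do not come from the paper.

```python
import numpy as np


def plsa_em(counts, n_topics, n_iter=50, seed=0):
    """Fit standard discrete pLSA to a (documents x words) count matrix with EM.

    Returns p_w_z with shape (n_topics, n_words) and p_z_d with shape
    (n_docs, n_topics). Rows of both matrices sum to 1 up to a small
    numerical epsilon.
    """
    rng = np.random.default_rng(seed)
    counts = np.asarray(counts, dtype=float)
    n_docs, n_words = counts.shape

    # Random initialisation, normalised so each row is a distribution.
    p_w_z = rng.random((n_topics, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)

    eps = 1e-12  # guards against division by zero
    for _ in range(n_iter):
        # E-step: posterior P(z | d, w), shape (n_docs, n_topics, n_words).
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]
        post = joint / (joint.sum(axis=1, keepdims=True) + eps)

        # M-step: re-estimate P(w | z) and P(z | d) from expected counts.
        expected = counts[:, None, :] * post
        p_w_z = expected.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + eps
        p_z_d = expected.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + eps
    return p_w_z, p_z_d


# Toy example (made-up data): 4 images, 4 quantized visual words.
toy_counts = np.array([
    [5, 4, 0, 0],
    [4, 6, 0, 0],
    [0, 0, 5, 5],
    [0, 1, 4, 6],
])
p_w_z, p_z_d = plsa_em(toy_counts, n_topics=2)
print(np.round(p_z_d, 3))
```

Each row of `counts` here stands for an image after its local descriptors have been mapped to the nearest entry of a discrete codebook. That quantization step is what the continuous vocabulary models described in the abstract are meant to avoid.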