Semantic context detection based on hierarchical audio models
Source: International Multimedia Conference
Proceedings of the 5th ACM SIGMM international workshop on Multimedia information retrieval
Berkeley, California
SESSION: Applications
Pages: 109 - 115  
Year of Publication: 2003
ISBN:1-58113-778-8
Authors
Wen-Huang Cheng  National Taiwan University, Taipei, Taiwan
Wei-Ta Chu  National Taiwan University, Taipei, Taiwan
Ja-Ling Wu  National Taiwan University, Taipei, Taiwan
Sponsor
SIGMULTIMEDIA: ACM Special Interest Group on Multimedia
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/973264.973282

ABSTRACT

Semantic context detection is a key technique for efficient multimedia retrieval. A semantic context is a scene that represents a complete, meaningful segment of information to a human viewer. In this paper, we propose a novel hierarchical approach that models the statistical characteristics of several audio events over a time series to detect semantic contexts. The approach consists of two stages: audio event detection and semantic context detection. In the first stage, hidden Markov models (HMMs) model basic audio events and perform event detection. In the second stage, Gaussian mixture models, which capture the temporal correlations among several audio events, perform semantic context detection. This framework bridges the gap between low-level features and semantic contexts that persist over time. Experimental evaluations indicate that the approach is effective in detecting high-level semantics.
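
The two-stage structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The discrete observation symbols, the event names (event_a, event_b), the context names (context_1, context_2), all parameter values, the use of mean per-frame log-likelihood as the stage-one score, and the diagonal covariances are assumptions chosen only to keep the example small and runnable.

```python
import math


def logsumexp(xs):
    """Numerically stable log(sum(exp(x))) for a list of log-values."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def hmm_log_likelihood(obs, start, trans, emit):
    """Forward algorithm in log space for a discrete-observation HMM."""
    n = len(start)
    alpha = [math.log(start[i]) + math.log(emit[i][obs[0]]) for i in range(n)]
    for o in obs[1:]:
        alpha = [
            logsumexp([alpha[i] + math.log(trans[i][j]) for i in range(n)])
            + math.log(emit[j][o])
            for j in range(n)
        ]
    return logsumexp(alpha)


def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of vector x under a diagonal-covariance Gaussian mixture."""
    comps = []
    for w, mu, var in zip(weights, means, variances):
        ll = math.log(w)
        for xi, m, v in zip(x, mu, var):
            ll += -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        comps.append(ll)
    return logsumexp(comps)


def event_scores(segments, event_hmms):
    """Stage 1: score each audio event over a window of segments.

    The score is the mean per-frame log-likelihood (an assumed choice).
    """
    scores = []
    for name in sorted(event_hmms):
        start, trans, emit = event_hmms[name]
        per_seg = [hmm_log_likelihood(s, start, trans, emit) / len(s)
                   for s in segments]
        scores.append(sum(per_seg) / len(per_seg))
    return scores


def detect_context(segments, event_hmms, context_gmms):
    """Stage 2: pick the context whose GMM best explains the event scores."""
    x = event_scores(segments, event_hmms)
    return max(context_gmms,
               key=lambda c: gmm_log_likelihood(x, *context_gmms[c]))


# Hypothetical toy models: (start, transition, emission) per audio event.
EVENT_HMMS = {
    "event_a": ([0.6, 0.4], [[0.7, 0.3], [0.4, 0.6]], [[0.9, 0.1], [0.8, 0.2]]),
    "event_b": ([0.5, 0.5], [[0.5, 0.5], [0.5, 0.5]], [[0.1, 0.9], [0.2, 0.8]]),
}

# Hypothetical context models: (weights, means, variances) over event scores.
CONTEXT_GMMS = {
    "context_1": ([0.5, 0.5], [[-0.15, -1.9], [-0.3, -1.6]],
                  [[0.25, 0.25], [0.25, 0.25]]),
    "context_2": ([0.5, 0.5], [[-1.9, -0.15], [-1.6, -0.3]],
                  [[0.25, 0.25], [0.25, 0.25]]),
}

if __name__ == "__main__":
    window = [[0, 0, 0, 0], [0, 0, 0]]  # quantized audio frames per segment
    print(detect_context(window, EVENT_HMMS, CONTEXT_GMMS))
```

In a real system the HMMs would typically be trained on continuous audio features and the mixture models fitted to event scores from labeled scenes. The sketch only shows how the output of the event stage becomes the input of the context stage.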



