Automatic classification of speech and music using neural networks
Source: ACM International Workshop on Multimedia Databases
Proceedings of the 2nd ACM International Workshop on Multimedia Databases
Washington, DC, USA
SESSION: Multimedia data mining
Pages: 94-99
Year of Publication: 2004
ISBN: 1-58113-975-6
Authors
M. Kashif Saeed Khan  King Fahd Univ. of Petroleum and Minerals, Dhahran, Saudi Arabia
Wasfi G. Al-Khatib  King Fahd Univ. of Petroleum and Minerals, Dhahran, Saudi Arabia
Muhammad Moinuddin  King Fahd Univ. of Petroleum and Minerals, Dhahran, Saudi Arabia
Sponsors
SIGIR: ACM Special Interest Group on Information Retrieval
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1032604.1032620

ABSTRACT

Automatic discrimination between speech and music signals has become an important research topic in recent years. Classifying audio into categories such as speech or music is an essential component of many multimedia document retrieval systems. Several approaches have previously been used to discriminate between speech and music data. In this paper, we propose using the mean and variance of the discrete wavelet transform in addition to features previously used for audio classification. We use a Multi-Layer Perceptron (MLP) neural network as the classifier. Our initial tests show encouraging results that indicate the viability of our approach.
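
The feature idea named in the abstract, the mean and variance of discrete wavelet transform coefficients, can be illustrated with a minimal sketch. The abstract does not state the wavelet family, the number of decomposition levels, or the frame length, so the Haar wavelet, the three levels, the 64-sample frames, and the function names below are all assumptions made for illustration. This is not the authors' implementation, and the MLP classification stage is not shown.

```python
import math
from statistics import fmean, pvariance


def haar_dwt_step(signal):
    """One level of the Haar DWT: returns (approximation, detail) coefficients."""
    if len(signal) % 2:
        signal = signal[:-1]  # drop the last sample so that pairs line up
    scale = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    return approx, detail


def dwt_mean_variance_features(frame, levels=3):
    """Mean and variance of the detail band at each level, then of the final approximation.

    Assumption: Haar wavelet and 3 levels; the source does not specify either.
    """
    features = []
    approx = list(frame)
    for _ in range(levels):
        if len(approx) < 2:
            break
        approx, detail = haar_dwt_step(approx)
        features.extend([fmean(detail), pvariance(detail)])
    features.extend([fmean(approx), pvariance(approx)])
    return features


# Hypothetical 64-sample frames: one slow sine period versus a fast alternating signal.
smooth = [math.sin(2 * math.pi * n / 64) for n in range(64)]
rough = [(-1.0) ** n for n in range(64)]
smooth_features = dwt_mean_variance_features(smooth)
rough_features = dwt_mean_variance_features(rough)
```

With these toy frames, the alternating signal puts all of its energy in the first detail band (mean of about 1.414, variance 0), while the slow sine leaves that band nearly empty. A feature vector of this kind would then be combined with other audio features and passed to a classifier such as an MLP.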


REFERENCES


1. Carey, M. J., Parris, E. S. and Lloyd-Thomas, H., A Comparison of Features for Speech, Music Discrimination. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP 99), Vol. 1, 1999.
2. Chou, W. and Gu, L., Robust Singing Detection In Speech/Music Discriminator Design. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP 01), Vol. 2, 2001.
3. El-Maleh, K., Klein, M., Petrucci, G. and Kabal, P., Speech/Music Discrimination For Multimedia Applications. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP 00), Vol. 6, 2000.
4. Harb, H. and Chen, L., Robust Speech Music Discrimination Using Spectrum's First Order Statistics And Neural Networks. In Proceedings of the Seventh International Symposium on Signal Processing and Its Applications, Vol. 2, 2003.
5. Harb, H., Chen, L. and Auloge, J. Y., Speech/Music/Silence and Gender Detection Algorithm. In Proceedings of the 7th International Conference on Distributed Multimedia Systems (DMS 01), 2001.
6.
7. Karneback, S., Discrimination between speech and music based on a low frequency modulation feature. In Proceedings of the European Conference on Speech Communication and Technology, 2001.
8. Panagiotakis, C. and Tziritas, G., A Speech/Music Discriminator Based On RMS And Zero-Crossings. IEEE Transactions on Multimedia, 2004.
9. Parris, E. S., Carey, M. J. and Lloyd-Thomas, H., Feature Fusion For Music Detection. In Proceedings of the European Conference on Speech Communication and Technology, 1999.
10. Pinquier, J., Rouas, J.-L. and André-Obrecht, R., A Fusion Study in Speech/Music Classification. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP 03), Vol. 2, 2003.
11. Pinquier, J., Rouas, J.-L. and André-Obrecht, R., Robust Speech / Music Classification in Audio Documents. In Proceedings of the International Conference on Spoken Language Processing (ICSLP 02), Vol. 3, 2002.
12. Pinquier, J., Sénac, C. and André-Obrecht, R., Speech and Music Classification in Audio Documents. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP 02), Vol. 4, 2002.
13. Saad, E. M., El-Adawy, M. I., Abu-El-Wafa, M. E. and Wahba, A. A., A Multifeature Speech/Music Discrimination System. In Proceedings of the Canadian Conference on Electrical and Computer Engineering (CCECE 02), Vol. 2, 2002.
14. Saunders, J., Real-Time Discrimination of Broadcast Speech/Music. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP 96), Vol. 2, 1996.
15.
16. Wang, W. Q., Gao, W. and Ying, D. W., A Fast and Robust Speech/Music Discrimination Approach. In Proceedings of the International Conference on Information, Communications and Signal Processing, 2003.
