On compensating the Mel-frequency cepstral coefficients for noisy speech recognition
Source: ACM International Conference Proceeding Series; Vol. 171
Proceedings of the 29th Australasian Computer Science Conference - Volume 48
Hobart, Australia
Pages: 49-54
Year of Publication: 2006
ISSN: 1445-1336, ISBN: 1-920682-30-9
Author
Eric H. C. Choi, Interfaces, Machines and Graphic Environments (IMAGEN), National ICT Australia, Alexandria, NSW, Sydney, Australia
Publisher
Australian Computer Society, Inc., Darlinghurst, Australia

ABSTRACT

This paper describes a novel noise-robust automatic speech recognition (ASR) front-end that combines Mel-filterbank output compensation with cumulative distribution mapping of cepstral coefficients onto a truncated Gaussian distribution. Recognition experiments on the Aurora II connected digits database show that the proposed front-end achieves an average digit recognition accuracy of 84.92% for a model set trained on clean speech data. Compared with the ETSI standard Mel-cepstral front-end, the proposed front-end obtains a relative error rate reduction of around 61%. Moreover, the proposed front-end provides recognition accuracy comparable to that of the ETSI advanced front-end, at less than half the computational load.
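
The abstract names cumulative distribution mapping with a truncated Gaussian target but gives no details, so the following is only a minimal sketch of the general idea, not the paper's implementation. It assumes a rank-based empirical CDF computed over a single utterance, a zero-mean, unit-variance Gaussian truncated at a hypothetical limit of ±4.0, and illustrative function names (`truncated_gaussian_inv_cdf`, `cdf_map`) that do not come from the paper.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # zero mean, unit variance


def truncated_gaussian_inv_cdf(u, limit=4.0):
    """Inverse CDF of a standard Gaussian truncated to [-limit, limit].

    `limit` is an assumed truncation point, not a value from the paper.
    """
    lo = _STD_NORMAL.cdf(-limit)
    hi = _STD_NORMAL.cdf(limit)
    # Rescale u from (0, 1) into the probability mass kept by the truncation.
    return _STD_NORMAL.inv_cdf(lo + u * (hi - lo))


def cdf_map(values, limit=4.0):
    """Map one cepstral coefficient track onto a truncated Gaussian.

    values: the coefficient's value in each frame of an utterance.
    Returns a list of the same length with the same rank ordering.
    """
    n = len(values)
    # Rank each frame by its value (sorted is stable, so ties keep frame order).
    order = sorted(range(n), key=lambda i: values[i])
    mapped = [0.0] * n
    for rank, idx in enumerate(order):
        # Midpoint empirical CDF keeps u strictly inside (0, 1).
        u = (rank + 0.5) / n
        mapped[idx] = truncated_gaussian_inv_cdf(u, limit)
    return mapped


if __name__ == "__main__":
    track = [12.1, -3.4, 0.7, 5.5, -8.9, 2.2, 9.3]  # made-up values
    print([round(v, 3) for v in cdf_map(track)])
```

Because the mapping depends only on the rank of each frame within the utterance, any monotonic distortion of the coefficient values leaves the output unchanged, which is the usual motivation for this family of normalisation methods. In a real front-end the same function would be applied independently to each cepstral dimension.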

