VoIP speech quality estimation in a mixed context with genetic programming
Source
Genetic And Evolutionary Computation Conference archive
Proceedings of the 10th annual conference on Genetic and evolutionary computation
Atlanta, GA, USA
SESSION: Real-world application papers
Pages: 1627-1634  
Year of Publication: 2008
ISBN: 978-1-60558-130-9
Authors
Adil Raja, University of Limerick, Limerick, Ireland
R. Muhammad Atif Azad, University of Limerick, Limerick, Ireland
Colin Flanagan, University of Limerick, Limerick, Ireland
Conor Ryan, University of Limerick, Limerick, Ireland
Sponsors
ACM: Association for Computing Machinery
SIGEVO: ACM Special Interest Group on Genetic and Evolutionary Computation
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1389095.1389402

ABSTRACT

Voice over IP (VoIP) speech quality estimation is crucial to providing optimal Quality of Service (QoS). This paper seeks to provide speech quality estimation models with better prediction accuracy by considering a richer set of input features than the current International Telecommunication Union, Telecommunication Standardization Sector (ITU-T) recommendations do. It addresses a transitional phase in which wideband (WB) networks are becoming available but must, for the time being, co-exist with existing narrowband (NB) setups. Quality estimation becomes a challenge in such a mixed context. The ITU-T recommendation known as the E-Model has recently been extended to deal with the mixed context. However, the extension evaluates speech degradation in the WB scenario based solely on codec-related distortions, which are only a subset of the factors affecting speech quality on a VoIP network. The extension is also derived from speech signals evaluated by human subjects, an expensive and difficult-to-reproduce exercise. This paper innovates by also considering a number of other network distortion types, producing generalised models that predict quality degradation with higher accuracy. To this end, an extensive set of speech samples is subjected to a wide variety of distortions. The degraded signals are then scored by the best currently available algorithmic approximation of human speech evaluation. Using the distortions as input features and the quality scores as targets, we employ Genetic Programming to produce parsimonious models that show a considerable prediction gain over the E-Model. In contrast to some existing approaches, in which models are tailored to individual telephony codecs, the evolved models generalise across a variety of modern codecs.
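
The modelling step described in the abstract (distortions as input features, quality scores as targets, Genetic Programming searching for a compact formula) can be illustrated with a minimal symbolic-regression sketch. This is not the authors' system: the feature names, the synthetic target, the tree encoding, the parsimony penalty, and every parameter value below are assumptions made purely for illustration, since the abstract does not specify them.

```python
import math
import random

# Hypothetical feature names; the abstract does not list the real feature set.
FEATURES = ["loss_rate", "burst_len", "bitrate"]
OPS = {"+": lambda a, b: a + b, "-": lambda a, b: a - b, "*": lambda a, b: a * b}
MAX_SIZE = 40  # assumed cap on tree size to limit bloat

def random_tree(rng, depth):
    """Grow a random expression: a nested tuple (op, left, right) or a leaf."""
    if depth == 0 or rng.random() < 0.3:
        if rng.random() < 0.7:
            return rng.choice(FEATURES)
        return round(rng.uniform(-5.0, 5.0), 2)
    return (rng.choice(list(OPS)), random_tree(rng, depth - 1), random_tree(rng, depth - 1))

def evaluate(tree, row):
    """Evaluate an expression tree on one sample (a dict of feature values)."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](evaluate(left, row), evaluate(right, row))
    return row[tree] if isinstance(tree, str) else tree

def size(tree):
    """Number of nodes in the tree."""
    return 1 + size(tree[1]) + size(tree[2]) if isinstance(tree, tuple) else 1

def fitness(tree, rows, targets, parsimony=0.001):
    """Mean squared error plus a small size penalty (lower is better)."""
    total = 0.0
    for row, target in zip(rows, targets):
        diff = evaluate(tree, row) - target
        total += diff * diff
    mse = total / len(rows)
    return mse + parsimony * size(tree) if math.isfinite(mse) else math.inf

def random_subtree(rng, tree):
    """Walk down the tree at random and return the subtree reached."""
    while isinstance(tree, tuple) and rng.random() < 0.7:
        tree = tree[rng.choice((1, 2))]
    return tree

def graft(rng, tree, donor):
    """Return a copy of tree with one randomly chosen subtree replaced by donor."""
    if not isinstance(tree, tuple) or rng.random() < 0.3:
        return donor
    op, left, right = tree
    if rng.random() < 0.5:
        return (op, graft(rng, left, donor), right)
    return (op, left, graft(rng, right, donor))

def evolve(rows, targets, rng, pop_size=60, generations=20):
    """Tournament selection with subtree crossover, mutation, and elitism."""
    score = lambda t: fitness(t, rows, targets)
    population = [random_tree(rng, 3) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=score)
        history.append(score(population[0]))
        next_gen = population[:2]  # elitism: the two best survive unchanged
        while len(next_gen) < pop_size:
            a = min(rng.sample(population, 3), key=score)
            b = min(rng.sample(population, 3), key=score)
            # Mostly crossover (subtree taken from b), sometimes mutation (fresh subtree).
            donor = random_subtree(rng, b) if rng.random() < 0.8 else random_tree(rng, 2)
            child = graft(rng, a, donor)
            next_gen.append(child if size(child) <= MAX_SIZE else a)
        population = next_gen
    best = min(population, key=score)
    return best, history + [score(best)]

rng = random.Random(0)
rows = [{"loss_rate": rng.uniform(0.0, 0.3),
         "burst_len": rng.uniform(1.0, 3.0),
         "bitrate": rng.uniform(0.5, 2.4)} for _ in range(40)]
# Synthetic stand-in target, NOT a measured quality score.
targets = [4.0 - 6.0 * r["loss_rate"] * r["burst_len"] + 0.2 * r["bitrate"] for r in rows]

best, history = evolve(rows, targets, rng)
print("best expression:", best)
print("fitness, first vs last generation:", history[0], history[-1])
```

The size penalty in `fitness` is one simple way to favour parsimonious expressions; in a real study the targets would come from an objective quality measure applied to degraded speech, and the evolved models would be compared against the E-Model on held-out data.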


