Progressive perceptual audio rendering of complex scenes
Source
Symposium on Interactive 3D Graphics
Proceedings of the 2007 Symposium on Interactive 3D Graphics and Games
Seattle, Washington
Session: Natural phenomena and audio
Pages: 189-196
Year of Publication: 2007
ISBN: 978-1-59593-628-8
Authors
Thomas Moeck  REVES/INRIA Sophia-Antipolis and University of Erlangen-Nuremberg
Nicolas Bonneel  REVES/INRIA Sophia-Antipolis
Nicolas Tsingos  REVES/INRIA Sophia-Antipolis
George Drettakis  REVES/INRIA Sophia-Antipolis
Isabelle Viaud-Delmon  CNRS-UPMC UMR
David Alloza  EdenGames
Sponsor
SIGGRAPH: ACM Special Interest Group on Computer Graphics and Interactive Techniques
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1230100.1230133

ABSTRACT

Despite recent advances, including sound source clustering and perceptual auditory masking, high quality rendering of complex virtual scenes with thousands of sound sources remains a challenge. Two major bottlenecks appear as the scene complexity increases: the cost of clustering itself, and the cost of pre-mixing source signals within each cluster.

In this paper, we first propose an improved hierarchical clustering algorithm that remains efficient for large numbers of sources and clusters while providing progressive refinement capabilities. We then present a lossy pre-mixing method based on a progressive representation of the input audio signals and the perceptual importance of each sound source. Our quality evaluation user tests indicate that the recently introduced audio saliency map is inappropriate for this task. Consequently we propose a "pinnacle", loudness-based metric, which gives the best results for a variety of target computing budgets. We also performed a perceptual pilot study which indicates that in audio-visual environments, it is better to allocate more clusters to visible sound sources. We propose a new clustering metric using this result. As a result of these three solutions, our system can provide high quality rendering of thousands of 3D-sound sources on a "gamer-style" PC.
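
The abstract does not say how a loudness-based importance value is turned into a per-source share of the computing budget. As a rough illustration only, the sketch below assumes each source has a single non-negative loudness estimate and that the budget is an integer number of processing units (for example, coefficients of the progressive signal representation) split in proportion to loudness. The function name `allocate_budget` and the largest-remainder rounding are assumptions made for this example, not the authors' method.

```python
def allocate_budget(loudness, total_budget):
    """Split an integer budget across sources in proportion to loudness.

    Hypothetical helper: 'loudness' is a list of non-negative importance
    values, 'total_budget' a non-negative integer number of units.
    """
    if total_budget < 0 or any(l < 0 for l in loudness):
        raise ValueError("budget and loudness values must be non-negative")

    total = sum(loudness)
    if total <= 0:
        # Silent scene: nothing is worth spending budget on.
        return [0] * len(loudness)

    # Ideal real-valued share for each source.
    shares = [total_budget * l / total for l in loudness]
    # Start from the integer part of each share.
    alloc = [int(s) for s in shares]

    # Hand the leftover units to the largest fractional remainders.
    remaining = max(0, total_budget - sum(alloc))
    order = sorted(range(len(loudness)),
                   key=lambda i: shares[i] - alloc[i],
                   reverse=True)
    for i in order[:remaining]:
        alloc[i] += 1
    return alloc


if __name__ == "__main__":
    # Four sources, the first one twice as loud as the second.
    print(allocate_budget([4.0, 2.0, 1.0, 1.0], 16))
```

With a smaller budget the same call simply returns smaller shares, so louder sources degrade last; a real system would also need a per-frame loudness estimate, which this sketch takes as given.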


