ABSTRACT
Audio rendering of impact sounds, such as those caused by falling objects or explosion debris, adds realism to interactive 3D audiovisual applications, and can be convincingly achieved using modal sound synthesis. Unfortunately, mode-based computations can become prohibitively expensive when many objects, each with many modes, are impacted simultaneously. We introduce a fast sound synthesis approach, based on short-time Fourier transforms, that exploits the inherent sparsity of modal sounds in the frequency domain. For our test scenes, this "fast mode summation" gives speedups of 5 to 8 times compared to a time-domain solution, with slight degradation in quality. We discuss different reconstruction windows, which affect the quality of impact sound "attacks". Our Fourier-domain processing method allows us to introduce a scalable, real-time audio processing pipeline for both recorded and modal sounds, with auditory masking and sound source clustering. To avoid abrupt computation peaks, such as during the simultaneous impacts of an explosion, we use crossmodal perception results on audiovisual synchrony to schedule sound computations over time. We also conducted a pilot perceptual user evaluation of our method. Our implementation results show that we can treat complex audiovisual scenes in real time with high quality.
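The sparsity that the abstract refers to can be illustrated with a small sketch. It is not the paper's implementation: it assumes the standard modal model, in which each mode is an exponentially damped sinusoid, and it uses a Hann analysis window and a naive DFT purely for illustration. The function names, the sample rate, the frame length, and the mode parameters are all hypothetical choices made for this example.

```python
import cmath
import math


def modal_frame(modes, sample_rate, n_samples):
    """Time-domain mode summation: add one damped sinusoid per mode.

    Each mode is a (frequency_hz, damping_per_second, amplitude) tuple.
    The cost grows with (number of modes) x (number of samples).
    """
    frame = [0.0] * n_samples
    for freq_hz, damping, amplitude in modes:
        for n in range(n_samples):
            t = n / sample_rate
            frame[n] += (amplitude * math.exp(-damping * t)
                         * math.sin(2.0 * math.pi * freq_hz * t))
    return frame


def hann(n_samples):
    """Periodic Hann window (an assumed choice of analysis window)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / n_samples)
            for n in range(n_samples)]


def dft_magnitudes(signal):
    """Naive DFT magnitudes for bins 0..N/2 (fine for short frames)."""
    n = len(signal)
    return [abs(sum(signal[k] * cmath.exp(-2j * math.pi * b * k / n)
                    for k in range(n)))
            for b in range(n // 2 + 1)]


def energy_fraction_near_peak(mags, half_width):
    """Return the peak bin and the share of energy within +/- half_width bins."""
    peak = max(range(len(mags)), key=mags.__getitem__)
    lo = max(0, peak - half_width)
    hi = min(len(mags), peak + half_width + 1)
    total = sum(m * m for m in mags)
    return peak, sum(m * m for m in mags[lo:hi]) / total


# Hypothetical example: one 1 kHz mode, 32 kHz sampling, 256-sample frame.
SAMPLE_RATE = 32000
FRAME_LEN = 256
frame = modal_frame([(1000.0, 5.0, 1.0)], SAMPLE_RATE, FRAME_LEN)
windowed = [s * w for s, w in zip(frame, hann(FRAME_LEN))]
peak_bin, fraction = energy_fraction_near_peak(dft_magnitudes(windowed), 2)
print("peak bin:", peak_bin, "energy share within 2 bins:", round(fraction, 4))
```

In this sketch, almost all of the windowed mode's energy falls in a handful of bins around its frequency, so a frequency-domain synthesizer could, in principle, write only those few coefficients per mode and frame instead of evaluating every time sample.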