Audio privacy: reducing speech intelligibility while preserving environmental sounds
Source
International Multimedia Conference
Proceedings of the 16th ACM International Conference on Multimedia
Vancouver, British Columbia, Canada
SESSION: Content track short papers session 2: content analysis and applications
Pages 733-736
Year of Publication: 2008
ISBN: 978-1-60558-303-7
ABSTRACT
Audio monitoring has many applications but also raises privacy concerns. To help alleviate these concerns, we have developed a method for reducing the intelligibility of speech while preserving intonation and the ability to recognize most environmental sounds. The method identifies vocalic regions and replaces the vocal tract transfer function of these regions with the transfer function from prerecorded vowels, where the identity of the replacement vowel is independent of the identity of the spoken syllable. The audio signal is then re-synthesized using the original pitch and energy, but with the modified vocal tract transfer function. An intelligibility study showed that environmental sounds remained recognizable, while speech intelligibility could be dramatically reduced to a 7% word recognition rate.
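The source-filter idea in the abstract can be illustrated with a minimal sketch. It assumes linear prediction (LPC) as the vocal tract model, which the abstract does not specify, and it processes a single frame only; detecting vocalic regions, choosing the replacement vowel, and overlap-adding frames are omitted. The names `lpc`, `all_pole`, and `replace_vocal_tract`, and the model order of 12, are hypothetical choices for illustration, not the authors' implementation.

```python
import numpy as np


def lpc(frame, order):
    """Autocorrelation-method LPC (an assumed model, not taken from the paper).

    Returns [1, a1, ..., ap], the coefficients of the inverse filter A(z).
    """
    w = frame * np.hamming(len(frame))
    # Autocorrelation at lags 0..order.
    r = np.correlate(w, w, mode="full")[len(w) - 1 : len(w) + order]
    # Toeplitz normal equations: R a = -r[1:].
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    R = R + (1e-9 * r[0] + 1e-12) * np.eye(order)  # small ridge for stability
    a = np.linalg.solve(R, -r[1 : order + 1])
    return np.concatenate(([1.0], a))


def all_pole(excitation, b):
    """Run the excitation through the all-pole filter 1 / B(z), zero initial state."""
    y = np.zeros(len(excitation))
    p = len(b) - 1
    for n in range(len(excitation)):
        acc = excitation[n]
        for k in range(1, min(p, n) + 1):
            acc -= b[k] * y[n - k]
        y[n] = acc
    return y


def replace_vocal_tract(frame, vowel_frame, order=12):
    """Swap the spectral envelope of `frame` for that of `vowel_frame`.

    The excitation (which carries the pitch) comes from `frame`; the output
    is rescaled so that its energy matches the original frame.
    """
    a = lpc(frame, order)        # envelope of the spoken frame
    b = lpc(vowel_frame, order)  # envelope of the prerecorded vowel
    # Inverse-filter with A(z) to estimate the excitation.
    excitation = np.convolve(frame, a)[: len(frame)]
    # Re-synthesize with the replacement envelope 1 / B(z).
    y = all_pole(excitation, b)
    # Restore the original frame energy.
    scale = np.sqrt(np.sum(frame ** 2) / (np.sum(y ** 2) + 1e-12))
    return y * scale
```

Because the excitation is left untouched, the pitch of the frame carries through, and the final rescaling restores the original energy. Passing the same frame as both arguments returns the input essentially unchanged, which makes a quick sanity check.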