Loudness measurement of human utterance to a robot in noisy environment
Source
ACM/IEEE International Conference on Human-Robot Interaction
Proceedings of the 3rd ACM/IEEE international conference on Human robot interaction
Amsterdam, The Netherlands
SESSION: Technical papers
Pages: 217-224
Year of Publication: 2008
ISBN: 978-1-60558-017-3
Authors
Satoshi Kagami  National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
Yoko Sasaki  Tokyo University of Science, Chiba, Japan
Simon Thompson  National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
Tomoaki Fujihara  Tokyo University of Science, Chiba, Japan
Tadashi Enomoto  Kansai Electric Power Company, Inc., Hyogo, Japan
Hiroshi Mizoguchi  Tokyo University of Science, Chiba, Japan
Sponsors
SIGCHI: ACM Special Interest Group on Computer-Human Interaction
ACM: Association for Computing Machinery
SIGART: ACM Special Interest Group on Artificial Intelligence
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1349822.1349851

ABSTRACT

To understand utterance-based human-robot interaction and to develop such a system, this paper first analyzes how loudly humans speak in a noisy environment. Experiments were conducted to measure how loudly humans speak under 1) different noise levels, 2) different numbers of sound sources, 3) different kinds of sound sources, and 4) different distances to a robot. Synchronized sound sources add noise to the auditory scene, and the resulting utterances are recorded and compared with a previously recorded noiseless utterance. The experiments show that humans produce basically the same sound pressure level at their own location irrespective of distance and background noise. More precisely, the level falls within a band that depends on the distance and on the sound sources, including sources that contain spoken language.
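The comparisons in this paragraph are stated as sound pressure levels. As a minimal sketch that is not taken from the paper, the standard definition of sound pressure level, 20 log10(p / p_ref) with p_ref = 20 µPa in air, can be computed as follows. The function names and the idea of comparing a noisy-condition utterance against a noiseless one through a level difference are illustrative assumptions.

```python
import math

# Standard reference pressure for sound pressure level in air (20 micropascals).
P_REF = 20e-6


def spl_db(p_rms):
    """Sound pressure level in dB for an RMS pressure given in pascals."""
    return 20.0 * math.log10(p_rms / P_REF)


def level_difference_db(p_rms_a, p_rms_b):
    """Level of utterance A relative to utterance B, in dB (hypothetical helper)."""
    return spl_db(p_rms_a) - spl_db(p_rms_b)
```

With these definitions, an RMS pressure of 1 Pa corresponds to roughly 94 dB, and doubling the pressure raises the level by about 6 dB.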

Based on this understanding, we developed an online spoken command recognition system for a mobile robot. The system consists of two key components: 1) a low side-lobe microphone array that works as an omni-directional telescopic microphone, and 2) DSBF combined with the FBS method for sound source localization and segmentation. The caller location and the segmented sound stream are calculated, and the segmented sound stream is then sent to a voice recognition system. The system works with up to five simultaneous sound sources with sound pressure differences of up to about 18 dB. Experimental results with the mobile robot are also shown.
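To make the localization component more concrete, the following is a rough sketch of a generic far-field delay-and-sum beamformer, assuming that DSBF in the abstract denotes delay-and-sum beamforming. It is not the authors' implementation: the function name, the 2D plane-wave model, the speed of sound constant, and the frequency-domain delay compensation are all assumptions made for illustration, and the FBS step is not covered.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def delay_and_sum(signals, mic_xy, azimuth_rad, fs):
    """Steer a microphone array toward a far-field source direction.

    signals:     array of shape (n_mics, n_samples), one row per microphone
    mic_xy:      array of shape (n_mics, 2), microphone positions in meters
    azimuth_rad: steering direction (angle of the source as seen from the origin)
    fs:          sampling rate in Hz
    """
    n_mics, n_samples = signals.shape
    # Unit vector pointing from the array origin toward the assumed source.
    direction = np.array([np.cos(azimuth_rad), np.sin(azimuth_rad)])
    # Plane-wave arrival time at each microphone relative to the origin:
    # microphones closer to the source receive the wavefront earlier.
    arrival = -(mic_xy @ direction) / SPEED_OF_SOUND
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)
    spectra = np.fft.rfft(signals, axis=1)
    # Undo each channel's delay with a linear phase shift (circular in time).
    phase = np.exp(2j * np.pi * freqs[None, :] * arrival[:, None])
    aligned = np.fft.irfft(spectra * phase, n=n_samples, axis=1)
    # Signals from the steered direction add coherently; others partly cancel.
    return aligned.mean(axis=0)
```

Scanning `azimuth_rad` over candidate directions and picking those with the highest output power gives a simple localization estimate. Because the delay compensation here is circular, a real-time system would apply it to windowed frames rather than to a whole recording.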


