ABSTRACT
Top-end mobile phones include a number of specialized sensors (e.g., accelerometer, compass, GPS) and general-purpose sensors (e.g., microphone, camera) that enable new people-centric sensing applications. Perhaps the most ubiquitous and unexploited sensor on mobile phones is the microphone, a powerful sensor that supports sophisticated inferences about human activity, location, and social events from sound. In this paper, we exploit this untapped sensor not in the context of human communications but as an enabler of new sensing applications. We propose SoundSense, a scalable framework for modeling sound events on mobile phones. SoundSense is implemented on the Apple iPhone and represents the first general-purpose sound sensing system specifically designed to work on resource-limited phones. The architecture and algorithms are designed for scalability, and SoundSense uses a combination of supervised and unsupervised learning techniques both to classify general sound types (e.g., music, voice) and to discover novel sound events specific to individual users. The system runs solely on the mobile phone with no back-end interactions. Through the implementation and evaluation of two proof-of-concept people-centric sensing applications, we demonstrate that SoundSense is capable of recognizing meaningful sound events that occur in users' everyday lives.