SoundSense: scalable sound sensing for people-centric applications on mobile phones
Source
International Conference on Mobile Systems, Applications, and Services
Proceedings of the 7th International Conference on Mobile Systems, Applications, and Services
Kraków, Poland
SESSION: Mobile sensing and inference
Pages 165-178  
Year of Publication: 2009
ISBN: 978-1-60558-566-6
Authors
Hong Lu  Dartmouth College, Hanover, NH, USA
Wei Pan  Dartmouth College, Hanover, NH, USA
Nicholas D. Lane  Dartmouth College, Hanover, NH, USA
Tanzeem Choudhury  Dartmouth College, Hanover, NH, USA
Andrew T. Campbell  Dartmouth College, Hanover, NH, USA
Sponsors
SIGMOBILE: ACM Special Interest Group on Mobility of Systems, Users, Data and Computing
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1555816.1555834

ABSTRACT

Top-end mobile phones include a number of specialized (e.g., accelerometer, compass, GPS) and general-purpose sensors (e.g., microphone, camera) that enable new people-centric sensing applications. Perhaps the most ubiquitous and unexploited sensor on mobile phones is the microphone, a powerful sensor that is capable of making sophisticated inferences about human activity, location, and social events from sound. In this paper, we exploit this untapped sensor not in the context of human communications but as an enabler of new sensing applications. We propose SoundSense, a scalable framework for modeling sound events on mobile phones. SoundSense is implemented on the Apple iPhone and represents the first general-purpose sound sensing system specifically designed to work on resource-limited phones. The architecture and algorithms are designed for scalability, and SoundSense uses a combination of supervised and unsupervised learning techniques both to classify general sound types (e.g., music, voice) and to discover novel sound events specific to individual users. The system runs solely on the mobile phone with no back-end interactions. Through implementation and evaluation of two proof-of-concept people-centric sensing applications, we demonstrate that SoundSense is capable of recognizing meaningful sound events that occur in users' everyday lives.

