ABSTRACT
Estimating a person's focus of attention is useful for various human-computer interaction applications, such as smart meeting rooms, where a user's goals and intent have to be monitored. In the work presented here, we are interested in modeling focus of attention in a meeting situation. We have developed a system capable of estimating participants' focus of attention from multiple cues. We employ an omnidirectional camera to simultaneously track participants' faces around a meeting table and use neural networks to estimate their head poses. In addition, we use microphones to detect who is speaking. The system predicts participants' focus of attention from acoustic and visual information separately, and then combines the outputs of the audio- and video-based focus of attention predictors. We have evaluated the system using data from three recorded meetings. The acoustic information provided an 8% error reduction on average compared to using a single modality.
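The abstract does not say how the audio- and video-based predictions are combined. As a minimal sketch only, assuming each predictor outputs a probability for every candidate focus target and that the two are merged by a weighted sum, the combination step could look like the following. The function names, the `audio_weight` parameter, and the example numbers are hypothetical and are not taken from the paper.

```python
from typing import Dict


def combine_focus_posteriors(
    video: Dict[str, float],
    audio: Dict[str, float],
    audio_weight: float = 0.5,
) -> Dict[str, float]:
    """Merge two per-target probability estimates by a weighted sum.

    Assumption: a linear combination with a fixed weight. The paper's
    actual combination rule is not given in the abstract.
    """
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    targets = set(video) | set(audio)
    combined = {
        t: (1.0 - audio_weight) * video.get(t, 0.0)
        + audio_weight * audio.get(t, 0.0)
        for t in targets
    }
    total = sum(combined.values())
    if total == 0.0:
        raise ValueError("both predictors assigned zero probability everywhere")
    # Renormalize so the result is again a probability distribution.
    return {t: p / total for t, p in combined.items()}


def predict_focus(
    video: Dict[str, float],
    audio: Dict[str, float],
    audio_weight: float = 0.5,
) -> str:
    """Return the target with the highest combined probability."""
    combined = combine_focus_posteriors(video, audio, audio_weight)
    return max(combined, key=combined.get)


# Hypothetical estimates for one participant looking at one of three others.
video_estimate = {"A": 0.5, "B": 0.4, "C": 0.1}  # from head pose
audio_estimate = {"A": 0.2, "B": 0.7, "C": 0.1}  # from who is speaking
focus = predict_focus(video_estimate, audio_estimate, audio_weight=0.5)
```

In this sketch, setting `audio_weight` to 0 reduces the prediction to the video-only estimate, which makes it easy to compare the combined result against a single modality.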