ABSTRACT
We propose a multi-sensor affect recognition system and evaluate it on the challenging task of classifying interest (or disinterest) in children trying to solve an educational puzzle on the computer. The multimodal sensory information from facial expressions and postural shifts of the learner is combined with information about the learner's activity on the computer. We propose a unified approach, based on a mixture of Gaussian Processes, for achieving sensor fusion under the problematic conditions of missing channels and noisy labels. This approach generates separate class labels corresponding to each individual modality. The final classification is based upon a hidden random variable, which probabilistically combines the sensors. The multimodal Gaussian Process approach achieves an accuracy of over 86%, significantly outperforming both classification using the individual modalities and several other combination schemes.
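The idea of probabilistically combining per-modality outputs can be illustrated with a minimal sketch. This is not the authors' mixture of Gaussian Processes: it assumes each modality already yields class probabilities from some classifier, and it stands in for the hidden random variable with fixed, hypothetical gating weights. The function name `fuse_posteriors`, the modality names, and all numbers are illustrative assumptions. A missing channel is handled by dropping it and renormalizing the remaining weights.

```python
import numpy as np

def fuse_posteriors(posteriors, gate_weights):
    """Combine per-modality class posteriors with a gating distribution.

    posteriors:   dict mapping modality name -> class-probability vector,
                  or None when that channel is missing.
    gate_weights: dict mapping modality name -> non-negative weight
                  (an assumed stand-in for the probability that the
                  hidden variable selects that modality).
    """
    # Keep only the channels that actually produced an output.
    available = [m for m, p in posteriors.items() if p is not None]
    if not available:
        raise ValueError("no modality available for fusion")

    # Renormalize the gating weights over the available channels.
    weights = np.array([gate_weights[m] for m in available], dtype=float)
    if weights.sum() <= 0:
        raise ValueError("gating weights must have a positive sum")
    weights /= weights.sum()

    # Weighted sum of the per-modality posteriors (a mixture).
    stacked = np.stack([np.asarray(posteriors[m], dtype=float) for m in available])
    return weights @ stacked

# Hypothetical example: classes are [interest, disinterest];
# the posture channel is missing for this sample.
posteriors = {
    "face": [0.7, 0.3],
    "posture": None,
    "game": [0.4, 0.6],
}
gate_weights = {"face": 0.5, "posture": 0.3, "game": 0.2}

fused = fuse_posteriors(posteriors, gate_weights)
label = ["interest", "disinterest"][int(np.argmax(fused))]
```

In the approach described above, the combination weights would come from the model's hidden variable and depend on the input, rather than being fixed constants as in this sketch.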