ABSTRACT
In this paper, we describe experiments with methods for learning the appropriateness of behaviors based on a model of the current social situation. We first review different approaches to social robotics and present a new approach based on situation modeling. We then review algorithms for social learning and propose three modifications to the classical Q-Learning algorithm. We describe five experiments with progressively more complex algorithms for learning the appropriateness of behaviors. The first three experiments illustrate how social factors can be used to improve learning by controlling the learning rate. In the fourth experiment, we demonstrate that proper credit assignment improves the effectiveness of reinforcement learning for social interaction. In the fifth experiment, we show that analogy can be used to accelerate learning in contexts composed of many situations.