Behavior transfer for value-function-based reinforcement learning
Source: International Conference on Autonomous Agents
Proceedings of the fourth international joint conference on Autonomous agents and multiagent systems
The Netherlands
SESSION: Papers: learning
Pages: 53 - 59
Year of Publication: 2005
ISBN: 1-59593-093-0
Authors
Matthew E. Taylor  The University of Texas at Austin, Austin, Texas
Peter Stone  The University of Texas at Austin, Austin, Texas
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1082473.1082482

ABSTRACT

Temporal difference (TD) learning methods [22] have become popular reinforcement learning techniques in recent years. TD methods have had some experimental successes and have been shown to exhibit some desirable properties in theory, but have often been found very slow in practice. A key feature of TD methods is that they represent policies in terms of value functions. In this paper we introduce behavior transfer, a novel approach to speeding up TD learning by transferring the learned value function from one task to a second related task. We present experimental results showing that autonomous learners are able to learn one multiagent task and then use behavior transfer to markedly reduce the total training time for a more complex task.
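The abstract does not say how the learned value function is carried over to the second task, so the following is only a minimal sketch under stated assumptions: a tabular action-value function and two hand-written mappings (`map_state`, `map_action`) from the new task back to the old one. The function name, the mappings, and the toy state and action labels are all hypothetical and are not taken from the paper. The sketch shows the general idea of starting TD learning on the second task from transferred values rather than from scratch.

```python
from typing import Callable, Dict, Hashable, Iterable, Tuple

# A tabular action-value function: (state, action) -> estimated return.
QTable = Dict[Tuple[Hashable, Hashable], float]


def transfer_q_values(
    source_q: QTable,
    target_states: Iterable[Hashable],
    target_actions: Iterable[Hashable],
    map_state: Callable[[Hashable], Hashable],
    map_action: Callable[[Hashable], Hashable],
    default: float = 0.0,
) -> QTable:
    """Initialise a target-task Q-table from a learned source-task Q-table.

    Assumption (not from the paper): each target state and action can be
    mapped to one source state and action by hand-written functions.
    """
    actions = list(target_actions)
    target_q: QTable = {}
    for s in target_states:
        for a in actions:
            source_key = (map_state(s), map_action(a))
            # Fall back to a default when the source task has no estimate.
            target_q[(s, a)] = source_q.get(source_key, default)
    return target_q


# Toy, hypothetical example: the target task adds one action ("pass2")
# that is treated like the source task's "pass1" at initialisation.
source_q = {
    ("near", "hold"): 1.0,
    ("near", "pass1"): 0.5,
    ("far", "hold"): 2.0,
    ("far", "pass1"): 0.2,
}

target_q = transfer_q_values(
    source_q,
    target_states=["near", "far"],
    target_actions=["hold", "pass1", "pass2"],
    map_state=lambda s: s,
    map_action=lambda a: "pass1" if a == "pass2" else a,
)
```

After this initialisation, ordinary TD updates would continue on `target_q` in the second task. Whether that reduces total training time depends on how closely the two tasks are related.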


REFERENCES


 
[4] M. Asada, S. Noda, S. Tawaratsumida, and K. Hosoda. Vision-based behavior acquisition for a shooting robot by using a reinforcement learning. In Proc. of IAPR/IEEE Workshop on Visual Behaviors-1994, pages 112-118, 1994.
[5] R. Boer and J. Kok. The Incremental Development of a Synthetic Multi-agent System: The UvA Trilearn 2001 Robotic Soccer Simulation Team. Master's thesis, University of Amsterdam, The Netherlands, February 2002.
[6] M. Colombetti and M. Dorigo. Robot Shaping: Developing Situated Agents through Learning. Technical Report TR-92-040, International Computer Science Institute, Berkeley, CA, 1993.
[7] C. Drummond. Accelerating reinforcement learning by composing solutions of automatically identified subtasks. Journal of Artificial Intelligence Research, 16:59-104, 2002.
[9] A. Fern, S. Yoon, and R. Givan. Approximate policy iteration with a policy language bias. In S. Thrun, L. Saul, and B. Schölkopf, editors, Advances in Neural Information Processing Systems 16. MIT Press, Cambridge, MA, 2004.
[10] C. Guestrin, D. Koller, C. Gearhart, and N. Kanodia. Generalizing plans to new environments in relational MDPs. In International Joint Conference on Artificial Intelligence (IJCAI-03), Acapulco, Mexico, August 2003.
[11] M. J. Mataric. Reward functions for accelerated learning. In International Conference on Machine Learning, pages 181-189, 1994.
[12] E. F. Morales. Scaling up reinforcement learning with a relational representation. In Proc. of the Workshop on Adaptability in Multi-agent Systems, January 2003.
[14] B. Price and C. Boutilier. Accelerating reinforcement learning through implicit imitation. Journal of Artificial Intelligence Research, 19:569-629, 2003.
[17] O. Selfridge, R. S. Sutton, and A. G. Barto. Training and tracking in robotics. Proceedings of the Ninth International Joint Conference on Artificial Intelligence, pages 670-672, 1985.
[19] P. Stone, G. Kuhlmann, M. Taylor, and Y. Liu. Keepaway Soccer: From Machine Learning Testbed to Benchmark. In Proceedings of RoboCup International Symposium, 2005. To appear.
[21] P. Stone, R. S. Sutton, and G. Kuhlmann. Reinforcement learning for RoboCup-soccer keepaway. Adaptive Behavior, 2005. To appear.
