Learning complementary multiagent behaviors: a case study
Source
International Conference on Autonomous Agents
Proceedings of the 8th International Conference on Autonomous Agents and Multiagent Systems - Volume 2
Budapest, Hungary
SESSION: Comprehensive/cross-cutting
Pages: 1359-1360
Year of Publication: 2009
ISBN: 978-0-9817381-7-8
Authors
Shivaram Kalyanakrishnan  The University of Texas at Austin
Peter Stone  The University of Texas at Austin
Sponsors
The Foundation for Intelligent Physical Agents
Microsoft Research
Whitestein Technologies
European Office of Aerospace Research and Development, Air Force Office of Scientific Research, United States Air Force Research Laboratory
Drexel University
Wiley-Blackwell Ltd

ABSTRACT

As the reach of multiagent reinforcement learning extends to increasingly complex tasks, it is likely that the diverse challenges encountered can only be surmounted by combining the strengths of different learning methods. We consider this aspect of learning through the case study of Keepaway, a popular benchmark for multiagent reinforcement learning from the robot soccer domain. Whereas previous successful results in this domain have limited learning to an isolated, infrequent decision that amounts to a turn-taking behavior (Pass), we expand the agents' learning capability to include the more ubiquitous action of moving without the ball (GetOpen), such that at any given time, multiple agents are executing learned behaviors simultaneously. We introduce a policy search method for learning GetOpen to complement the temporal difference learning approach employed for learning Pass [4]. The learned GetOpen policy matches the best hand-coded policy for this task, and outperforms the best policy found when Pass is learned. We demonstrate that Pass and GetOpen can be learned simultaneously, and indeed that these learned behaviors specialize towards the counterpart behaviors with which they are trained.
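
The abstract does not name the policy search method used for GetOpen. The reference list includes a tutorial on the cross-entropy method [3], so one plausible reading is a search of that kind over policy parameters. The sketch below illustrates that general technique under stated assumptions and is not the authors' implementation: `cross_entropy_search`, `toy_score`, and all hyperparameters are hypothetical, and the toy objective merely stands in for whatever performance measure a learned GetOpen policy would be evaluated on.

```python
import random
import statistics


def cross_entropy_search(score, dim, iterations=30, population=50,
                         elite=10, seed=0):
    """Maximize `score` over real-valued vectors with a Gaussian
    cross-entropy search (illustrative sketch, hypothetical settings)."""
    rng = random.Random(seed)
    mean = [0.0] * dim
    std = [2.0] * dim
    for _ in range(iterations):
        # Sample candidate parameter vectors from the current Gaussian.
        candidates = [
            [rng.gauss(m, s) for m, s in zip(mean, std)]
            for _ in range(population)
        ]
        # Keep the highest-scoring candidates (the "elite" set).
        candidates.sort(key=score, reverse=True)
        best = candidates[:elite]
        # Refit the Gaussian to the elite set, one dimension at a time.
        for i in range(dim):
            column = [c[i] for c in best]
            mean[i] = statistics.fmean(column)
            # A small floor keeps the search from collapsing to zero width.
            std[i] = statistics.pstdev(column) + 1e-3
    return mean


def toy_score(params):
    """Hypothetical stand-in objective: higher is better, peak at `target`."""
    target = [0.5, -1.0, 2.0]
    return -sum((p - t) ** 2 for p, t in zip(params, target))


best = cross_entropy_search(toy_score, dim=3)
print(best)
```

In a Keepaway-like setting, `score` would presumably run episodes with the candidate parameters and return a measured performance value; such evaluations are noisy, which this toy objective does not model.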


REFERENCES


[1] M. Benda, V. Jagannathan, and R. Dodhiawala. On optimal cooperation of knowledge sources - an empirical investigation. Technical Report BCS-G2010-28, Boeing Advanced Technology Center, Boeing Computing Services, Seattle, WA, July 1986.
[2] M. Chen, E. Foroughi, F. Heintz, Z. Huang, S. Kapetanakis, K. Kostiadis, J. Kummeneje, I. Noda, O. Obst, P. Riley, T. Steffens, Y. Wang, and X. Yin. Users manual: RoboCup soccer server - for soccer server version 7.07 and later. The RoboCup Federation, August 2002.
[3] P. T. De Boer, D. P. Kroese, S. Mannor, and R. Rubinstein. A tutorial on the cross-entropy method. Annals of Operations Research, 134(1):19-67, 2005.
[4] P. Stone, R. S. Sutton, and G. Kuhlmann. Reinforcement learning for RoboCup-soccer keepaway. Adaptive Behavior, 13(3):165-188, 2005.
