Learning of coordination: exploiting sparse interactions in multiagent systems
Source
International Conference on Autonomous Agents
Proceedings of The 8th International Conference on Autonomous Agents and Multiagent Systems - Volume 2
Budapest, Hungary
SESSION: Multi-agent learning
Pages 773-780
Year of Publication: 2009
ISBN: 978-0-9817381-7-8
Authors
Francisco S. Melo, Carnegie Mellon University, Pittsburgh, PA
Manuela Veloso, Carnegie Mellon University, Pittsburgh, PA
Sponsors
The Foundation for Intelligent Physical Agents
Microsoft Research
Whitestein Technologies
European Office of Aerospace Research and Development, Air Force Office of Scientific Research, United States Air Force Research Laboratory
Drexel University
Wiley-Blackwell Ltd

ABSTRACT

Creating coordinated multiagent policies in environments with uncertainty is a challenging problem. As previous work has shown, it can be greatly simplified if the coordination needs are known to be limited to specific parts of the state space. In this work, we assume that such needs are unknown and we investigate coordination learning in multiagent settings. We contribute a reinforcement-learning-based algorithm in which independent decision-makers/agents learn both individual policies and when and how to coordinate. We focus on problems in which the interaction between the agents is sparse, and we exploit this property to minimize the coupling between the learning processes of the different agents. We introduce a two-layer extension of Q-learning, in which we augment the action space of each agent with a coordination action that uses information from other agents to decide the correct action. Our results show that our agents learn both to coordinate and to act independently, in the regions of the state space where they do and do not need to coordinate, respectively.
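
The idea of augmenting each agent's action space with a coordination action can be illustrated with a small tabular sketch. This is not the authors' algorithm: the class name `TwoLayerQAgent`, the `COORDINATE` label, the table layout, and the update rule are all assumptions made for illustration, based only on the description in the abstract. The top-layer table scores the primitive actions plus the coordination action from the agent's local state; the bottom-layer table is consulted only when coordination is chosen and additionally conditions on the other agent's state.

```python
import random
from collections import defaultdict

COORDINATE = "coordinate"  # hypothetical label for the extra pseudo-action


class TwoLayerQAgent:
    """Illustrative Q-learner whose action set includes a coordination action."""

    def __init__(self, actions, alpha=0.1, gamma=0.95, epsilon=0.1, seed=None):
        self.actions = list(actions)
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)
        # Top layer: local state -> value of each primitive action and COORDINATE.
        self.q_local = defaultdict(
            lambda: {a: 0.0 for a in self.actions + [COORDINATE]}
        )
        # Bottom layer: (local state, other agent's state) -> primitive action values.
        self.q_joint = defaultdict(lambda: {a: 0.0 for a in self.actions})

    def _eps_greedy(self, values):
        # Explore with probability epsilon, otherwise pick the highest-valued key.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(values))
        return max(values, key=values.get)

    def act(self, state, other_state):
        """Return (top_layer_choice, primitive_action_to_execute)."""
        choice = self._eps_greedy(self.q_local[state])
        if choice != COORDINATE:
            return choice, choice  # act independently on local information
        # Coordinating: use the other agent's state to select the primitive action.
        return COORDINATE, self._eps_greedy(self.q_joint[(state, other_state)])

    def update(self, state, other_state, choice, action, reward, next_state):
        # Assumed update: both layers bootstrap from the top-layer value of next_state.
        target = reward + self.gamma * max(self.q_local[next_state].values())
        if choice == COORDINATE:
            row = self.q_joint[(state, other_state)]
            row[action] += self.alpha * (target - row[action])
        top = self.q_local[state]
        top[choice] += self.alpha * (target - top[choice])
```

In this sketch the joint table only grows for states in which the agent actually chose to coordinate, which is one way the sparsity of interaction could keep the two agents' learning processes largely decoupled.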


