MB-AIM-FSI: a model based framework for exploiting gradient ascent multiagent learners in strategic interactions
International Conference on Autonomous Agents
Proceedings of the 7th international joint conference on Autonomous agents and multiagent systems - Volume 1
Estoril, Portugal
SESSION: Agent and multi-agent learning
Pages 371-378
Year of Publication: 2008
ISBN: 978-0-9817381-0-9
ABSTRACT
Future agent applications will increasingly represent human users autonomously or semi-autonomously in strategic interactions with similar entities. Hence, there is a growing need for algorithmic approaches that can learn to recognize commonalities in opponent strategies and exploit them to improve the strategic response. Recently, a framework [9] has been proposed that aims for targeted optimality against a set of finite-memory opponents. We propose an approach that aims for targeted optimality against the set of all possible multiagent learning algorithms that perform gradient search to select a single-stage Nash equilibrium of a repeated game. Such opponents induce a Markov Decision Process as the learning environment, and appropriate responses to such environments are learned by assuming a generative model of the environment. In the absence of a generative model, we present a framework, MB-AIM-FSI, that models the opponent online based on interactions, solves the model off-line when sufficient information has been gathered, stores the resulting strategy in a repository, and finally uses it judiciously when playing against the same or a similar opponent at a later time.
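To illustrate why a gradient ascent opponent can be viewed as a Markov Decision Process, the following is a minimal sketch, not the paper's algorithm. It assumes a 2x2 repeated game in which the opponent (the column player) keeps a single probability q of playing its first action and nudges q along the gradient of its own expected payoff, in the spirit of the gradient dynamics studied in [10]. The function names, the payoff layout, and the step size are hypothetical choices made for this example. The point is that the opponent's next q depends only on its current q and on our own mixed strategy p, so q can serve as the state of the environment we face.

```python
# Illustrative sketch only: a gradient ascent opponent in a 2x2 repeated game.
# All names and the step size are assumptions for this example, not taken
# from the paper.

def opponent_gradient(p, payoff):
    """Gradient of the column player's expected payoff with respect to q.

    p      -- probability that the row player (us) plays its first action
    payoff -- column player's payoffs, payoff[i][j] for row action i and
              column action j

    The expected payoff is
        V(p, q) = p*q*c11 + p*(1-q)*c12 + (1-p)*q*c21 + (1-p)*(1-q)*c22,
    so dV/dq = p*(c11 - c12 - c21 + c22) + (c21 - c22).
    """
    (c11, c12), (c21, c22) = payoff
    return p * (c11 - c12 - c21 + c22) + (c21 - c22)


def gradient_ascent_step(q, p, payoff, step_size=0.1):
    """Next opponent state: move q along the gradient, then clip to [0, 1]."""
    q_next = q + step_size * opponent_gradient(p, payoff)
    return min(1.0, max(0.0, q_next))


if __name__ == "__main__":
    # Column player's payoffs in a matching-pennies style game (assumed).
    column_payoff = [[-1.0, 1.0], [1.0, -1.0]]
    q = 0.5
    for _ in range(3):
        # We always play our first action (p = 1.0) in this toy run.
        q = gradient_ascent_step(q, p=1.0, payoff=column_payoff)
        print(q)
```

Because the transition from q to the next q is fully determined by our own choice of p, a learner that estimates this transition online could, under these assumptions, plan against it offline as the abstract describes.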
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
[2] M. Bowling. Convergence and no-regret in multiagent learning. In Neural Information Processing Systems 17. MIT Press, 2005.
[4] M. H. Bowling and M. M. Veloso. Rational and convergent learning in stochastic games. In IJCAI, pages 1021-1026, 2001.
[5] S. J. Brams. Theory of Moves. Cambridge University Press, Cambridge, UK, 1994.
[6] Y. Chang and L. Kaelbling. Playing is believing: the role of beliefs in multi-agent learning. In NIPS-2001, 2001.
[7] V. Conitzer and T. Sandholm. Awesome: A general multiagent learning algorithm that converges in self-play and learns a best response against stationary opponents. pages 83-90, 2003.
[9] R. Powers and Y. Shoham. Learning against opponents with bounded memory. In IJCAI, pages 817-822, 2005.
[10] S. Singh, M. Kearns, and Y. Mansour. Nash convergence of gradient dynamics in general-sum games. pages 541-548.