RVσ(t): a unifying approach to performance and convergence in online multiagent learning
International Conference on Autonomous Agents
Proceedings of the fifth international joint conference on Autonomous agents and multiagent systems
Hakodate, Japan
SESSION: Learning and evolution
Pages: 798 - 800
Year of Publication: 2006
ISBN: 1-59593-303-4
ABSTRACT
We present a new multiagent learning algorithm, RVσ(t), that can guarantee both no-regret performance (in all games) and policy convergence (in some games of arbitrary size). Unlike its predecessor ReDVaLeR, it (1) does not need to distinguish whether its opponents are in self-play or otherwise non-stationary, and (2) is allowed to know its portion of any equilibrium, which, we argue, leads to convergence in some games in addition to no-regret. Although the regret of RVσ(t) is analyzed in continuous time, we show that it grows more slowly than in other no-regret techniques such as GIGA and GIGA-WoLF. We also show that RVσ(t) can converge to coordinated behavior in coordination games, while GIGA and GIGA-WoLF may converge to poorly coordinated (mixed) behaviors.
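To make the terms "no-regret" and "coordinated versus mixed behavior" concrete, the following is a minimal Python sketch of a GIGA-style learner (projected gradient ascent on the strategy simplex with step size 1/sqrt(t)) in self-play on a 2x2 coordination game. It is not RVσ(t), whose update rule is not given in this abstract. The function names, the identity payoff matrix, the step-size schedule, and the starting strategies are all assumptions chosen for illustration.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto the probability simplex."""
    n = len(v)
    u = np.sort(v)[::-1]                      # components in descending order
    css = np.cumsum(u)
    ks = np.arange(1, n + 1)
    rho = np.nonzero(u - (css - 1.0) / ks > 0)[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1)
    return np.maximum(v - theta, 0.0)

def giga_self_play(payoff, x0, y0, steps=2000):
    """Two GIGA-style learners on a common-payoff matrix game (illustrative).

    Returns the final mixed strategies of both players and the row player's
    external regret, measured in expected reward against the best fixed action.
    """
    x = np.array(x0, dtype=float)
    y = np.array(y0, dtype=float)
    earned = 0.0                              # expected reward actually collected
    action_totals = np.zeros(len(x))          # reward each fixed action would have earned
    for t in range(1, steps + 1):
        rx = payoff @ y                       # row player's reward per action
        ry = payoff.T @ x                     # column player's reward per action
        earned += x @ rx
        action_totals += rx
        eta = 1.0 / np.sqrt(t)                # assumed step-size schedule
        x = project_to_simplex(x + eta * rx)  # simultaneous updates: rx and ry
        y = project_to_simplex(y + eta * ry)  # were computed before either step
    regret = action_totals.max() - earned
    return x, y, regret

# Hypothetical 2x2 coordination game: both players earn 1 when actions match.
coordination = np.eye(2)

x_c, y_c, regret_c = giga_self_play(coordination, [0.6, 0.4], [0.6, 0.4])
x_m, y_m, regret_m = giga_self_play(coordination, [0.5, 0.5], [0.5, 0.5])
print("biased start :", x_c, y_c, regret_c)
print("uniform start:", x_m, y_m, regret_m)
```

In this toy setting, a start biased toward the first action drifts to the pure coordinated outcome, while an exactly uniform start stays at the mixed equilibrium, where each player earns only half of the coordinated payoff. Regret stays small in both runs, which suggests why a no-regret guarantee on its own does not rule out poorly coordinated behavior.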
REFERENCES
1. B. Banerjee and J. Peng. Performance bounded reinforcement learning in strategic interactions. In Proceedings of the 19th National Conference on Artificial Intelligence (AAAI-04), pages 2-7, San Jose, CA, 2004. AAAI Press.
2. B. Banerjee and J. Peng. Convergence of no-regret learning in multiagent systems. In Proceedings of the First International Workshop on Learning and Adaptation in Multiagent Systems (LAMAS), Utrecht, The Netherlands, 2005. Held in conjunction with AAMAS-05.
3. M. Bowling. Convergence and no-regret in multiagent learning. In Proceedings of NIPS 2004/5, 2005.
5. V. Conitzer and T. Sandholm. AWESOME: A general multiagent learning algorithm that converges in self-play and learns a best response against stationary opponents. In Proceedings of the 20th International Conference on Machine Learning, 2003.
6. M. Zinkevich. Online convex programming and generalized infinitesimal gradient ascent. In Proceedings of the 20th International Conference on Machine Learning, Washington DC, 2003.