RVσ(t): a unifying approach to performance and convergence in online multiagent learning
Source International Conference on Autonomous Agents archive
Proceedings of the fifth international joint conference on Autonomous agents and multiagent systems
Hakodate, Japan
SESSION: Learning and evolution
Pages: 798 - 800  
Year of Publication: 2006
ISBN:1-59593-303-4
Authors
Bikramjit Banerjee  Tulane University, New Orleans, LA
Jing Peng  Tulane University, New Orleans, LA
Sponsors
IFMAS : The International Foundation for Multiagent Systems
ATAL : The International Workshop on Agent Theories, Architectures, and Languages
SIGART: ACM Special Interest Group on Artificial Intelligence
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1160633.1160775

ABSTRACT

We present a new multiagent learning algorithm, RVσ(t), that can guarantee both no-regret performance (in all games) and policy convergence (in some games of arbitrary size). Unlike its predecessor ReDVaLeR, it (1) does not need to distinguish whether its opponents are in self-play or otherwise non-stationary, and (2) is allowed to know its portion of any equilibrium, which, we argue, leads to convergence in some games in addition to no-regret. Although the regret of RVσ(t) is analyzed in continuous time, we show that it grows more slowly than in other no-regret techniques such as GIGA and GIGA-WoLF. We show that RVσ(t) can converge to coordinated behavior in coordination games, while GIGA and GIGA-WoLF may converge to poorly coordinated (mixed) behaviors.
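
For readers unfamiliar with the baseline, the following is a minimal sketch of GIGA-style projected gradient ascent (the technique of reference 6) and of external regret, run in self-play on a 2x2 coordination game. It does not implement RVσ(t), whose update rule is not given in this abstract. The function names, the payoff matrix, the 1/sqrt(t) step size, and the starting strategies are all assumptions chosen for illustration.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto the probability simplex."""
    n = len(v)
    u = np.sort(v)[::-1]                      # sort descending
    css = np.cumsum(u)
    k = np.arange(1, n + 1)
    rho = np.nonzero(u - (css - 1.0) / k > 0)[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1)
    return np.maximum(v - theta, 0.0)

def giga_step(x, reward_vector, t):
    """One GIGA-style update: gradient step of size 1/sqrt(t), then projection."""
    return project_to_simplex(x + reward_vector / np.sqrt(t))

def external_regret(strategies, reward_vectors):
    """Total reward of the best fixed action minus the learner's expected total."""
    rewards = np.asarray(reward_vectors)
    earned = sum(float(x @ r) for x, r in zip(strategies, rewards))
    return float(rewards.sum(axis=0).max()) - earned

def self_play(payoff, x0, y0, steps):
    """Two GIGA-style learners in a common-payoff matrix game (hypothetical setup)."""
    x, y = np.array(x0, dtype=float), np.array(y0, dtype=float)
    xs, rs = [], []
    for t in range(1, steps + 1):
        rx = payoff @ y        # row player's expected reward per action
        ry = payoff.T @ x      # column player's expected reward per action
        xs.append(x)
        rs.append(rx)
        x, y = giga_step(x, rx, t), giga_step(y, ry, t)
    return xs, rs

# Assumed 2x2 coordination game: both players earn 1 only when actions match.
coordination = np.array([[1.0, 0.0], [0.0, 1.0]])
xs, rs = self_play(coordination, [0.6, 0.4], [0.6, 0.4], steps=200)
print("final row strategy:", xs[-1])
print("average regret:", external_regret(xs, rs) / len(xs))
```

In this toy setup the outcome depends on the starting point: a slightly biased start moves toward a pure, coordinated strategy, while an exactly uniform start stays at the uniform mixed strategy. This is a property of the sketch only and is not a reproduction of the paper's analysis.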


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
B. Banerjee and J. Peng. Performance bounded reinforcement learning in strategic interactions. In Proceedings of the 19th National Conference on Artificial Intelligence (AAAI-04), pages 2--7, San Jose, CA, 2004. AAAI Press.
 
2
B. Banerjee and J. Peng. Convergence of no-regret learning in multiagent systems. In Proceedings of the First International Workshop on Learning and Adaptation in Multiagent Systems (LAMAS), Utrecht, The Netherlands, 2005. Held in conjunction with AAMAS-05.
 
3
M. Bowling. Convergence and no-regret in multiagent learning. In Proceedings of NIPS 2004/5, 2005.
 
4
 
5
V. Conitzer and T. Sandholm. AWESOME: A general multiagent learning algorithm that converges in self-play and learns a best response against stationary opponents. In Proceedings of the 20th International Conference on Machine Learning, 2003.
 
6
M. Zinkevich. Online convex programming and generalized infinitesimal gradient ascent. In Proceedings of the 20th International Conference on Machine Learning, Washington DC, 2003.

