|
ABSTRACT
In large systems, it is important for agents to learn to act effectively, but sophisticated multi-agent learning algorithms generally do not scale. An alternative approach is to find restricted classes of games where simple, efficient algorithms converge. It is shown that stage learning efficiently converges to Nash equilibria in large anonymous games if best-reply dynamics converge. Two features are identified that improve convergence. First, rather than making learning more difficult, more agents are actually beneficial in many settings. Second, providing agents with statistical information about the behavior of others can significantly reduce the number of observations needed.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
M. Blonski. Equilibrium characterization in large anonymous games. Technical report, U. Mannheim, 2001.
|
 |
2
|
|
| |
3
|
|
| |
4
|
M. H. Bowling and M. M. Veloso. Rational and convergent learning in stochastic games. In 17th Int. Joint Conf. on Artificial Intelligence (IJCAI 2001), pages 1021--1026, 2001.
|
| |
5
|
|
| |
6
|
|
| |
7
|
D. P. Foster and P. Young. Regret testing: Learning to play Nash equilibrium without knowing you have an opponent. Theoretical Economics, 1:341--367, 2006.
|
 |
8
|
Eric J. Friedman , Joseph Y. Halpern , Ian Kash, Efficiency and nash equilibria in a scrip system for P2P networks, Proceedings of the 7th ACM conference on Electronic commerce, p.140-149, June 11-15, 2006, Ann Arbor, Michigan, USA
[doi> 10.1145/1134707.1134723]
|
| |
9
|
E. J. Friedman and S. Shenker. Learning and implementation on the internet. 1998.
|
| |
10
|
D. Fudenberg and D. Levine. Theory of Learning in Games. MIT Press, 1998.
|
| |
11
|
F. Germano and G. Lugosi. Global Nash convergence of Foster and Young's regret testing. Games and Economic Behavior, 60(1):135--154, July 2007.
|
| |
12
|
A. Greenwald, E. J. Friedman, and S. Shenker. Learning in networks contexts: Experimental results from simulations. Games and Economic Behavior, 35(1--2):80--123, 2001.
|
| |
13
|
S. Hart and A. Mas-Colell. A simple adaptive procedure leading to correlated equilibrium. Econometrica, 68(5):1127--1150, 2000.
|
| |
14
|
S. Hart and A. Mas-Colell. A reinforecement learning procedure leading to correlated equilibrium. In G. Debreu, W. Neuefeind, and W. Trockel, editors, Economic Essays, pages 181--200. Springer, 2001.
|
| |
15
|
|
| |
16
|
L. P. Kaelbling, M. L. Littman, and A. P. Moore. Reinforcement learning: A survey. J. Artif. Intell. Res. (JAIR), 4:237--285, 1996.
|
| |
17
|
J. R. Marden, G. Arslan, and J. S. Shamma. Connections between cooperative control and potential games. In Proc. 2007 European Control Conference (ECC), 2007.
|
 |
18
|
|
| |
19
|
J. R. Marden, H. P. Young, G. Arslan, and J. S. Shamma. Payoff-based dynamics for multi-player weakly acyclic games. SIAM J. Control and Optimization, 2008. To appear.
|
| |
20
|
P. Milgrom and J. Roberts. Rationalizability, learning, and equilibrium in games with strategic complementarities. Econometrica, 58(6):1255--1277, 1900.
|
| |
21
|
M. Osborne and A. Rubenstein. A Course in Game Theory. MIT Press, 1994.
|
| |
22
|
Y. Shoham, R. Powers, and T. Grenager. Multi-agent reinforcement learning: a critical survey. Technical report, Stanford, 2003.
|
| |
23
|
|
| |
24
|
|
| |
25
|
|
| |
26
|
H. P. Young. Learning by trial and error. http://www.econ.jhu.edu/People/Young/Learning22Feb08.pdf.
|
|