| Lossless clustering of histories in decentralized POMDPs |
| Full text |
Pdf
(291 KB)
|
Source
|
International Conference on Autonomous Agents
archive
Proceedings of The 8th International Conference on Autonomous Agents and Multiagent Systems - Volume 1
table of contents
Budapest, Hungary
Pages 577-584
Year of Publication: 2009
ISBN:978-0-9817381-6-1
|
|
Authors
|
|
| Sponsors |
|
| Publisher |
|
| Bibliometrics |
Downloads (6 Weeks): 9, Downloads (12 Months): 27, Citation Count: 0
|
|
|
ABSTRACT
Decentralized partially observable Markov decision processes (Dec-POMDPs) constitute a generic and expressive framework for multiagent planning under uncertainty. However, planning optimally is difficult because solutions map local observation histories to actions, and the number of such histories grows exponentially in the planning horizon. In this work, we identify a criterion that allows for lossless clustering of observation histories: i.e., we prove that when two histories satisfy the criterion, they have the same optimal value and thus can be treated as one. We show how this result can be exploited in optimal policy search and demonstrate empirically that it can provide a speed-up of multiple orders of magnitude, allowing the optimal solution of significantly larger problems. We also perform an empirical analysis of the generality of our clustering method, which suggests that it may also be useful in other (approximate) Dec-POMDP solution methods.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
C. Amato, D. S. Bernstein, and S. Zilberstein. Optimal fixed-size controllers for decentralized POMDPs. In Multi-Agent Sequential Decision Making in Uncertain Domains (AAMAS Workshop), May 2006.
|
| |
2
|
C. Amato, D. S. Bernstein, and S. Zilberstein. Optimizing memory-bounded controllers for decentralized POMDPs. In Proc. of Uncertainty in Artificial Intelligence, July 2007.
|
| |
3
|
R. Aras, A. Dutech, and F. Charpillet. Mixed integer linear programming for exact finite-horizon planning in decentralized POMDPs. In Int. Conf. on Automated Planning and Scheduling, 2007.
|
| |
4
|
|
| |
5
|
A. Boularias and B. Chaib-draa. Exact dynamic programming for decentralized POMDPs with lossless policy compression. In Int. Conf. on Automated Planning and Scheduling, 2008.
|
| |
6
|
|
| |
7
|
Rosemary Emery-Montemerlo , Geoff Gordon , Jeff Schneider , Sebastian Thrun, Approximate Solutions for Partially Observable Stochastic Games with Common Payoffs, Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems, p.136-143, July 19-23, 2004, New York, New York
[doi> 10.1109/AAMAS.2004.67]
|
| |
8
|
R. Emery-Montemerlo, G. Gordon, J. Schneider, and S. Thrun. Game theoretic control for robot teams. In IEEE Int. Conf. on Robotics and Automation, 2005.
|
| |
9
|
C. V. Goldman and S. Zilberstein. Decentralized control of cooperative systems: Categorization and complexity analysis. Journal of Artificial Intelligence Research, 22, 2004.
|
| |
10
|
|
| |
11
|
R. Nair, M. Tambe, M. Yokoo, D. V. Pynadath, and S. Marsella. Taming decentralized POMDPs: Towards efficient policy computation for multiagent settings. In IJCAI, 2003.
|
| |
12
|
F. A. Oliehoek, M. T. J. Spaan, and N. Vlassis. Optimal and approximate Q-value functions for decentralized POMDPs. Journal of Artificial Intelligence Research, 32, 2008.
|
| |
13
|
M. J. Osborne and A. Rubinstein. A Course in Game Theory. The MIT Press, July 1994.
|
 |
14
|
|
| |
15
|
S. Seuken and S. Zilberstein. Improved memory-bounded dynamic programming for decentralized POMDPs. In Proc. of Uncertainty in Artificial Intelligence, July 2007.
|
| |
16
|
|
| |
17
|
D. Szer, F. Charpillet, and S. Zilberstein. MAA*: A heuristic search algorithm for solving decentralized POMDPs. In Proc. of Uncertainty in Artificial Intelligence, 2005.
|
|