|
ABSTRACT
Decentralized partially observable Markov decision processes (Dec-POMDPs) constitute an expressive framework for multiagent planning under uncertainty, but solving them is provably intractable. We demonstrate how their scalability can be improved by exploiting locality of interaction between agents in a factored representation. Factored Dec-POMDP representations have been proposed before, but only for Dec-POMDPs whose transition and observation models are fully independent. Such strong assumptions simplify the planning problem, but result in models with limited applicability. By contrast, we consider general factored Dec-POMDPs for which we analyze the model dependencies over space (locality of interaction) and time (horizon of the problem). We also present a formulation of decomposable value functions. Together, our results allow us to exploit the problem structure as well as heuristics in a single framework that is based on collaborative graphical Bayesian games (CGBGs). A preliminary experiment shows a speedup of two orders of magnitude.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
R. Becker, S. Zilberstein, V. Lesser, and C. V. Goldman. Solving transition independent decentralized Markov decision processes. Journal of Artificial Intelligence Research (JAIR), 22:423--455, December 2004.
|
| |
2
|
|
| |
3
|
C. Boutilier, T. Dean, and S. Hanks. Decision-theoretic planning: Structural assumptions and computational leverage. Journal of Artificial Intelligence Research, 11:1--94, 1999.
|
| |
4
|
|
| |
5
|
X. Boyen and D. Koller. Tractable inference for complex stochastic processes. In Proc. of Uncertainty in Artificial Intelligence, pages 33--42. San Francisco: Morgan Kaufmann, 1998.
|
| |
6
|
C. Guestrin, D. Koller, and R. Parr. Multiagent planning with factored MDPs. In Advances in Neural Information Processing Systems 14, pages 1523--1530, 2002.
|
| |
7
|
E. A. Hansen, D. S. Bernstein, and S. Zilberstein. Dynamic programming for partially observable stochastic games. In AAAI '04: Proceedings of the Nineteenth National Conference on Artificial Intelligence, pages 709--715, 2004.
|
| |
8
|
E. A. Hansen and Z. Feng. Dynamic programming for POMDPs using a factored state representation. In Proceedings of the Fifth International Conference on Artificial Intelligence Planning Systems, pages 130--139, Apr. 2000.
|
| |
9
|
|
| |
10
|
|
| |
11
|
R. Nair, P. Varakantham, M. Tambe, and M. Yokoo. Networked distributed POMDPs: A synthesis of distributed constraint optimization and POMDPs. In Proc. of the National Conference on Artificial Intelligence, pages 133--139, 2005.
|
| |
12
|
F. A. Oliehoek, M. T. J. Spaan, and N. Vlassis. Optimal and approximate Q-value functions for decentralized POMDPs. Journal of Artificial Intelligence Research, 2008. (to appear).
|
 |
13
|
|
 |
14
|
|
 |
15
|
|
 |
16
|
|
 |
17
|
|
| |
18
|
D. Szer, F. Charpillet, and S. Zilberstein. MAA*: A heuristic search algorithm for solving decentralized POMDPs. In Proc. of Uncertainty in Artificial Intelligence, 2005.
|
 |
19
|
Pradeep Varakantham , Janusz Marecki , Yuichi Yabu , Milind Tambe , Makoto Yokoo, Letting loose a SPIDER on a network of POMDPs: generating quality guaranteed policies, Proceedings of the 6th international joint conference on Autonomous agents and multiagent systems, May 14-18, 2007, Honolulu, Hawaii
[doi> 10.1145/1329125.1329388]
|
| |
20
|
N. Vlassis. A Concise Introduction to Multiagent Systems and Distributed Artificial Intelligence. Synthesis Lectures on Artificial Intelligence and Machine Learning. Morgan & Claypool Publishers, 2007.
|
|