|
ABSTRACT
Coordination of multiple agents under uncertainty in the decentralized POMDP model is known to be NEXP-complete, even when the agents have a joint set of goals. Nevertheless, we show that the existence of goals can help develop effective planning algorithms. We examine an approach to model these problems as indefinite-horizon decentralized POMDPs, suitable for many practical problems that terminate after some unspecified number of steps. Our algorithm for solving these problems is optimal under some common assumptions -- that terminal actions exist for each agent and rewards for non-terminal actions are negative. We also propose an infinite-horizon approximation method that allows us to relax these assumptions while maintaining goal conditions. An optimality bound is developed for this sample-based approach and experimental results show that it is able to exploit the goal structure effectively. Compared with the state-of-the-art, our approach can solve larger problems and produce significantly better solutions.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
C. Amato, D. S. Bernstein, and S. Zilberstein. Optimizing memory-bounded controllers for decentralized POMDPs. In Proceedings of the Twenty-Third Conference on Uncertainty in Artificial Intelligence, 2007.
|
| |
2
|
R. Becker, S. Zilberstein, V. Lesser, and C. V. Goldman. Solving transition-independent decentralized Markov decision processes. Journal of AI Research, 22:423--455, 2004.
|
| |
3
|
D. S. Bernstein, E. Hansen, and S. Zilberstein. Bounded policy iteration for decentralized POMDPs. In Proceedings of the Nineteenth International Joint Conference on Artificial Intelligence, pages 1287--1292, 2005.
|
| |
4
|
|
| |
5
|
C. V. Goldman and S. Zilberstein. Decentralized control of cooperative systems: Categorization and complexity analysis. Journal of Artificial Intelligence Research, 22:143--174, 2004.
|
| |
6
|
E. A. Hansen. Indefinite-horizon POMDPs with action-based termination. In Proceedings of the Twenty-Second National Conference on Artificial Intelligence, pages 291--296, 2007.
|
| |
7
|
|
| |
8
|
M. Kearns, Y. Mansour, and A. Y. Ng. Approximate planning in large POMDPs via reusable trajectories. http://robotics.stanford.edu/~ang/papers/pomdp-long.pdf, 1999.
|
| |
9
|
R. Nair, D. Pynadath, M. Yokoo, M. Tambe, and S. Marsella. Taming decentralized POMDPs: Towards efficient policy computation for multiagent settings. In Proceedings of the Nineteenth International Joint Conference on Artificial Intelligence, pages 705--711, 2003.
|
| |
10
|
S. D. Patek. On partially observed stochastic shortest path problems. In Proceedings of the Fortieth IEEE Conference on Decision and Control, pages 5050--5055, 2001.
|
| |
11
|
J. Pineau, G. Gordon, and S. Thrun. Point-based value iteration: an anytime algorithm for POMDPs. In Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence, pages 1025--1032, 2003.
|
| |
12
|
S. Seuken and S. Zilberstein. Improved memory-bounded dynamic programming for decentralized POMDPs. In Proceedings of the Twenty-Third Conference on Uncertainty in Artificial Intelligence, 2007.
|
| |
13
|
|
| |
14
|
D. Szer and F. Charpillet. An optimal best-first search algorithm for solving infinite horizon DEC-POMDPs. In Proceedings of the Sixteenth European Conference on Machine Learning, 2005.
|
| |
15
|
D. Szer, F. Charpillet, and S. Zilberstein. MAA*: A heuristic search algorithm for solving decentralized POMDPs. In Proceedings of the Twenty-First Conference on Uncertainty in Artificial Intelligence, 2005.
|
|