|
ABSTRACT
Decentralized partially observable MDPs (DEC-POMDPs) provide a rich framework for modeling decision making by a team of agents. Despite rapid progress in this area, the limited scalability of solution techniques has restricted the applicability of the model. To overcome this computational barrier, research has focused on restricted classes of DEC-POMDPs, which are easier to solve yet rich enough to capture many practical problems. We present CBDP, an efficient and scalable point-based dynamic programming algorithm for one such model called ND-POMDP (Network Distributed POMDP). Specifically, CBDP provides magnitudes of speedup in the policy computation and generates better quality solution for all test instances. It has linear complexity in the number of agents and horizon length. Furthermore, the complexity per horizon for the examined class of problems is exponential only in a small parameter that depends upon the interaction among the agents, achieving significant scalability for large, loosely coupled multi-agent systems. The efficiency of CBDP lies in exploiting the structure of interactions using constraint networks. These results extend significantly the effectiveness of decision-theoretic planning in multi-agent settings.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
|
| |
2
|
|
| |
3
|
|
| |
4
|
|
| |
5
|
C. Guestrin, D. Koller, and R. Parr. Multiagent planning with factored MDPs. In Proc. of the Neural Information Processing Systems, pages 1523--1530, 2001.
|
| |
6
|
|
| |
7
|
|
| |
8
|
|
| |
9
|
|
| |
10
|
Janusz Marecki , Tapana Gupta , Pradeep Varakantham , Milind Tambe , Makoto Yokoo, Not all agents are equal: scaling up distributed POMDPs for agent networks, Proceedings of the 7th international joint conference on Autonomous agents and multiagent systems, May 12-16, 2008, Estoril, Portugal
|
| |
11
|
R. Nair, P. Varakantham, M. Tambe, and M. Yokoo. Networked distributed POMDPs: A synthesis of distributed constraint optimization and POMDPs. In Proc. of the Twentieth National Conference on Artificial Intelligence, pages 133--139, 2005.
|
| |
12
|
A. Petcu and B. Faltings. Dpop: A scalable method for multiagent constraint optimization. In Proc. of the International Joint Conference on Artificial Intelligence, pages 266--271, 2005.
|
| |
13
|
J. Pineau, G. Gordon, and S. Thrun. Point-based value iteration: An anytime algorithm for POMDPs. In Proc. of the Eighteenth International Joint Conference on Artificial Intelligence, pages 1025--1032, 2003.
|
| |
14
|
S. Seuken and S. Zilberstein. Memory-bounded dynamic programming for DEC-POMDPs. In Proc. of the Twentieth International Joint Conference on Artificial Intelligences, pages 2009--2015, 2007.
|
| |
15
|
G. Shani, R. I. Brafman, and S. E. Shimony. Forward search value iteration for POMDPs. In Proc. of the Twentieth International Joint Conference on Artificial Intelligence, pages 2619--2624, 2007.
|
| |
16
|
M. Spaan and N. Vlassis. Perseus: Randomized point-based value iteration for POMDPs. Journal of Artificial Intelligence Research, 24:195--220, 2005.
|
| |
17
|
D. Szer and F. Charpillet. MAA*: a heuristic search algorithm for solving decentralized POMDPs. In Proc. of the Twenty-First Conference on Uncertainty in Artificial Intelligence, pages 576--583, 2005.
|
 |
18
|
Pradeep Varakantham , Janusz Marecki , Yuichi Yabu , Milind Tambe , Makoto Yokoo, Letting loose a SPIDER on a network of POMDPs: generating quality guaranteed policies, Proceedings of the 6th international joint conference on Autonomous agents and multiagent systems, May 14-18, 2007, Honolulu, Hawaii
[doi> 10.1145/1329125.1329388]
|
|