ACM Home Page
Please provide us with feedback. Feedback
Exact solutions of interactive POMDPs using behavioral equivalence
Full text PdfPdf (281 KB)
Source International Conference on Autonomous Agents archive
Proceedings of the fifth international joint conference on Autonomous agents and multiagent systems table of contents
Hakodate, Japan
SESSION: Architectures: BDI and MDPs table of contents
Pages: 1025 - 1032  
Year of Publication: 2006
ISBN:1-59593-303-4
Authors
Bharanee Rathnasabapathy  University of Illinois at Chicago
Prashant Doshi  University of Georgia
Piotr Gmytrasiewicz  University of Illinois at Chicago
Sponsors
IFMAS : The International Foundation for Multiagent Systems
ATAL : The International Workshop on Agent Theories, Architectures, and Languages
SIGART: ACM Special Interest Group on Artificial Intelligence
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 4,   Downloads (12 Months): 40,   Citation Count: 3
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1160633.1160816
What is a DOI?

ABSTRACT

We present a method for transforming the infinite interactive state space of interactive POMDPs (I-POMDPs) into a finite one, thereby enabling the computation of exact solutions. I-POMDPs allow sequential decision making in multi-agent environments by modeling other agents' beliefs, capabilities, and preferences as part of the interactive state space. Since beliefs are allowed to be arbitrarily nested and are continuous, it is not possible to compute optimal solutions using value iteration as in POMDPs. We present a method that transforms the original state space into a finite one by grouping the other agents' behaviorally equivalent models into equivalence classes. This enables us to compute the complete optimal solution for the I-POMDP, which may be represented as a policy graph. We illustrate our method using the multi-agent Tiger problem and discuss features of the solution.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
 
2
 
3
D. S. Bernstein, E. A. Hansen, and S. Zilberstein. Bounded policy iteration for decentralized pomdps. In Proceedings of the The Nineteenth International Joint Conference on Artificial Intelligence (IJCAI-05), 2005.
 
4
A. R. Cassandra, M. L. Littman, and N. L. Zhang. Incremental pruning: A simple, fast, exact method for partially observable markov decision processes. In Uncertainty in Artificial Intelligence, Rhode Island, Providence, 1997.
5
 
6
P. Doshi and P. Gmytrasiewicz. A particle filtering based approach to approximating interactive pomdps. In Proceedings of the The Twentieth National Conference on Artificial Intelligence (AAAI-05), 2005.
 
7
 
8
P. J. Gmytrasiewicz and P. Doshi. A framework for sequential planning in multi-agent settings. In Journal of AI Research, 2005.
 
9
E. A. Hansen, D. S. Bernstein, and S. Zilberstein. Dynamic programming for partially observable stochastic games. In Proceedings of the 19th National Conference on Artificial Intelligence (AAAI-04), 2004.
 
10
J. C. Harsanyi. Games with incomplete information played by 'bayesian' players. In Management Science, pages 14(3):159--182, Nov 2005.
 
11
 
12
R. Nair, M. Tambe, M. Yokoo, D. Pynadath, and S. Marsella. Taming decentralized pomdps: Towards efficient policy computation for multiagent settings. In Proceedings of International Joint Conference in Artificial Intelligence (IJCAI), 2003.
 
13
J. M. Porta, M. T. Spaan, and N. Vlassis. Value iteration for continuous-state pomdps. In IAS Technical report IAS-UVA-04-04, 2004.
 
14
P. Poupart and C. Boutilier. Value directed compression of pomdps. In Seventeenth Annual Conference on Neural Information Processing Systems, 2003.
15
 
16
R. Smallwood and E. Sondik. The optimal control of partially observable markov decision processes over a finite horizon. Operations Research, 21:1071--1088, 1973.
 
17
S. Thrun. Monte carlo POMDPs. In S. Solla, T. Leen, and K.-R. Müller, editors, Advances in Neural Information Processing Systems 12, pages 1064--1070. MIT Press, 2000.


Collaborative Colleagues:
Bharanee Rathnasabapathy: colleagues
Prashant Doshi: colleagues
Piotr Gmytrasiewicz: colleagues