Coevolutive planning in Markov decision processes
Source: International Conference on Autonomous Agents
Proceedings of the first international joint conference on Autonomous agents and multiagent systems: part 2
Bologna, Italy
Session 6C: mobile embodied agents
Pages: 843 - 844  
Year of Publication: 2002
ISBN: 1-58113-480-0
Authors
Bruno Scherrer  LORIA, Campus Scientifique, Vandoeuvre-les-Nancy
François Charpillet  LORIA, Campus Scientifique, Vandoeuvre-les-Nancy
Sponsors
ACM: Association for Computing Machinery
SIGART: ACM Special Interest Group on Artificial Intelligence
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/544862.544939

ABSTRACT

We investigate the idea of having groups of agents coevolve in order to iteratively refine multi-agent plans. This idea, which we call coevolution, is formalized and analyzed in a general setting and then applied to stochastic control frameworks that use an explicit model of the world: coevolution can be adapted directly to the frameworks of Multi-Agent Markov Decision Processes (MMDP) and Multi-Agent Partially Observable MDPs (MPOMDP). We also consider the decentralized version of MPOMDP (DEC-POMDP), which is known to be a difficult problem: we show that the coevolution approach can be applied if we restrict the search to memoryless policies. We evaluate our coevolutive approach experimentally on a typical multi-agent problem.
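The abstract does not spell out the algorithm, so the following is only an illustrative sketch of one way "iteratively refining a multi-agent plan" could look in a fully observable two-agent setting. It assumes a simple alternating best-response scheme over memoryless policies: one agent's policy is held fixed while the other solves the induced single-agent MDP by value iteration. The toy model (`step`), the names `best_response`, `evaluate`, and `coevolve`, and the discount factor are all hypothetical and are not taken from the paper.

```python
GAMMA = 0.9          # assumed discount factor
STATES = [0, 1]
ACTIONS = [0, 1]

def step(s, a1, a2):
    """Hypothetical deterministic toy model: returns (next_state, reward)."""
    if s == 0:
        # Matching actions earn 1 and move the team to state 1.
        return (1, 1.0) if a1 == a2 else (0, 0.0)
    # In state 1 only the joint action (1, 1) pays off, and it keeps the state.
    return (1, 2.0) if (a1, a2) == (1, 1) else (0, 0.0)

def best_response(agent, other_policy, sweeps=200):
    """Value iteration on the single-agent MDP obtained by fixing the other agent."""
    def joint(s, a):
        return (a, other_policy[s]) if agent == 0 else (other_policy[s], a)

    def q(s, a, v):
        nxt, r = step(s, *joint(s, a))
        return r + GAMMA * v[nxt]

    v = {s: 0.0 for s in STATES}
    for _ in range(sweeps):
        v = {s: max(q(s, a, v) for a in ACTIONS) for s in STATES}
    # Ties are broken toward the lowest-numbered action.
    return {s: max(ACTIONS, key=lambda a: q(s, a, v)) for s in STATES}

def evaluate(p1, p2, sweeps=200):
    """Discounted value of the joint memoryless policy (p1, p2) in each state."""
    v = {s: 0.0 for s in STATES}
    for _ in range(sweeps):
        new_v = {}
        for s in STATES:
            nxt, r = step(s, p1[s], p2[s])
            new_v[s] = r + GAMMA * v[nxt]
        v = new_v
    return v

def coevolve(p1, p2, max_rounds=20):
    """Alternate best responses until neither agent changes its policy."""
    for _ in range(max_rounds):
        new_p1 = best_response(0, p2)
        new_p2 = best_response(1, new_p1)
        if (new_p1, new_p2) == (p1, p2):
            break
        p1, p2 = new_p1, new_p2
    return p1, p2

if __name__ == "__main__":
    for start in ({0: 0, 1: 0}, {0: 0, 1: 1}):
        p1, p2 = coevolve({0: 0, 1: 0}, dict(start))
        print(p1, p2, evaluate(p1, p2))
```

In this toy model the result depends on the starting point: when both agents start by always playing action 0, neither can improve alone and the pair stays at a weaker joint plan, while a start in which the second agent already plays action 1 in state 1 reaches the better plan. This sort of local convergence is typical of alternating schemes in general and is shown here only as a property of the sketch, not as a claim about the paper's results.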


