ACM Home Page
Please provide us with feedback. Feedback
A hierarchical approach to efficient reinforcement learning in deterministic domains
Full text PdfPdf (248 KB)
Source International Conference on Autonomous Agents archive
Proceedings of the fifth international joint conference on Autonomous agents and multiagent systems table of contents
Hakodate, Japan
SESSION: Agent planning and search table of contents
Pages: 313 - 319  
Year of Publication: 2006
ISBN:1-59593-303-4
Authors
Carlos Diuk  Rutgers University Piscataway, NJ
Alexander L. Strehl  Rutgers University Piscataway, NJ
Michael L. Littman  Rutgers University Piscataway, NJ
Sponsors
IFMAS : The International Foundation for Multiagent Systems
ATAL : The International Workshop on Agent Theories, Architectures, and Languages
SIGART: ACM Special Interest Group on Artificial Intelligence
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 7,   Downloads (12 Months): 49,   Citation Count: 2
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1160633.1160686
What is a DOI?

ABSTRACT

Factored representations, model-based learning, and hierarchies are well-studied techniques for improving the learning efficiency of reinforcement-learning algorithms in large-scale state spaces. We bring these three ideas together in a new algorithm. Our algorithm tackles two open problems from the reinforcement-learning literature, and provides a solution to those problems in deterministic domains. First, it shows how models can improve learning speed in the hierarchy-based MaxQ framework without disrupting opportunities for state abstraction. Second, we show how hierarchies can augment existing factored exploration algorithms to achieve not only low sample complexity for learning, but provably efficient planning as well. We illustrate the resulting performance gains in example domains. We prove polynomial bounds on the computational effort needed to attain near optimal performance within the hierarchy.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
{Boutilier et al., 1999} Craig Boutilier, Thomas Dean, and Steve Hanks. Decision-theoretic planning: Structural assumptions and computational leverage. Journal of Artificial Intelligence Research, 11:1--94, 1999.
 
2
 
3
{Dietterich, 2000a} Thomas G. Dietterich. Hierarchical reinforcement learning with the MAXQ value function decomposition. Journal of Artificial Intelligence Research, 13:227--303, 2000.
 
4
 
5
{Dietterich, 2000c} Thomas G. Dietterich. State abstraction in MAXQ hierarchical reinforcement learning. In Advances in Neural Information Processing Systems 12, pages 994--1000, 2000.
 
6
 
7
 
8
{Kaelbling, 1993} Leslie Pack Kaelbling. Hierarchical learning in stochastic domains: Preliminary results. In Proceedings of the Tenth International Conference on Machine Learning, Amherst, MA, 1993. Morgan Kaufmann.
 
9
{Kakade, 2003} Sham M. Kakade. On the Sample Complexity of Reinforcement Learning. PhD thesis, Gatsby Computational Neuroscience Unit, University College London, 2003.
 
10
 
11
 
12
 
13
 
14
 
15
 
16


Collaborative Colleagues:
Carlos Diuk: colleagues
Alexander L. Strehl: colleagues
Michael L. Littman: colleagues