| A hierarchical approach to efficient reinforcement learning in deterministic domains |
| Full text |
Pdf
(248 KB)
|
| Source
|
International Conference on Autonomous Agents
archive
Proceedings of the fifth international joint conference on Autonomous agents and multiagent systems
table of contents
Hakodate, Japan
SESSION: Agent planning and search
table of contents
Pages: 313 - 319
Year of Publication: 2006
ISBN:1-59593-303-4
|
|
Authors
|
|
| Sponsors |
|
| Publisher |
|
| Bibliometrics |
Downloads (6 Weeks): 7, Downloads (12 Months): 49, Citation Count: 2
|
|
|
ABSTRACT
Factored representations, model-based learning, and hierarchies are well-studied techniques for improving the learning efficiency of reinforcement-learning algorithms in large-scale state spaces. We bring these three ideas together in a new algorithm. Our algorithm tackles two open problems from the reinforcement-learning literature, and provides a solution to those problems in deterministic domains. First, it shows how models can improve learning speed in the hierarchy-based MaxQ framework without disrupting opportunities for state abstraction. Second, we show how hierarchies can augment existing factored exploration algorithms to achieve not only low sample complexity for learning, but provably efficient planning as well. We illustrate the resulting performance gains in example domains. We prove polynomial bounds on the computational effort needed to attain near optimal performance within the hierarchy.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
{Boutilier et al., 1999} Craig Boutilier, Thomas Dean, and Steve Hanks. Decision-theoretic planning: Structural assumptions and computational leverage. Journal of Artificial Intelligence Research, 11:1--94, 1999.
|
| |
2
|
|
| |
3
|
{Dietterich, 2000a} Thomas G. Dietterich. Hierarchical reinforcement learning with the MAXQ value function decomposition. Journal of Artificial Intelligence Research, 13:227--303, 2000.
|
| |
4
|
|
| |
5
|
{Dietterich, 2000c} Thomas G. Dietterich. State abstraction in MAXQ hierarchical reinforcement learning. In Advances in Neural Information Processing Systems 12, pages 994--1000, 2000.
|
| |
6
|
|
| |
7
|
|
| |
8
|
{Kaelbling, 1993} Leslie Pack Kaelbling. Hierarchical learning in stochastic domains: Preliminary results. In Proceedings of the Tenth International Conference on Machine Learning, Amherst, MA, 1993. Morgan Kaufmann.
|
| |
9
|
{Kakade, 2003} Sham M. Kakade. On the Sample Complexity of Reinforcement Learning. PhD thesis, Gatsby Computational Neuroscience Unit, University College London, 2003.
|
| |
10
|
|
| |
11
|
|
| |
12
|
|
| |
13
|
|
| |
14
|
|
| |
15
|
|
| |
16
|
|
CITED BY 2
|
|
|
|
|
Neville Mehta , Soumya Ray , Prasad Tadepalli , Thomas Dietterich, Automatic discovery and transfer of MAXQ hierarchies, Proceedings of the 25th international conference on Machine learning, p.648-655, July 05-09, 2008, Helsinki, Finland
|
|