Automatic discovery and transfer of MAXQ hierarchies
Source: ICML; Vol. 307
Proceedings of the 25th International Conference on Machine Learning
Helsinki, Finland
Pages 648-655
Year of Publication: 2008
ISBN: 978-1-60558-205-4
Authors
Neville Mehta  Oregon State University, Corvallis, OR
Soumya Ray  Oregon State University, Corvallis, OR
Prasad Tadepalli  Oregon State University, Corvallis, OR
Thomas Dietterich  Oregon State University, Corvallis, OR
Sponsors
Yahoo!
Xerox
IBM
NSF
Microsoft Research
Machine Learning Journal/Springer
Pascal
University of Helsinki
Federation of Finnish Learned Societies
Intel Corporation
Google
Helsinki Institute for Information Technology
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1390156.1390238

ABSTRACT

We present an algorithm, HI-MAT (Hierarchy Induction via Models And Trajectories), that discovers MAXQ task hierarchies by applying dynamic Bayesian network models to a successful trajectory from a source reinforcement learning task. HI-MAT discovers subtasks by analyzing the causal and temporal relationships among the actions in the trajectory. Under appropriate assumptions, HI-MAT induces hierarchies that are consistent with the observed trajectory and have compact value-function tables employing safe state abstractions. We demonstrate empirically that HI-MAT constructs compact hierarchies that are comparable to manually-engineered hierarchies and facilitate significant speedup in learning when transferred to a target task.
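The abstract's idea of analyzing causal and temporal relationships among the actions in a trajectory can be illustrated with a small sketch. This is not the HI-MAT algorithm itself; it only shows one plausible way to link each action to the earlier action that last changed a variable it depends on. The function name `causal_edges`, the toy action names, and the hand-written `reads`/`writes` sets are all assumptions made for illustration (in the paper, such dependencies would come from the dynamic Bayesian network models).

```python
from typing import Dict, List, Set, Tuple

# Hypothetical action models: the variables each action depends on ("reads")
# and the variables it can change ("writes"). Hand-written here as a stand-in
# for dependencies that would be read off DBN action models.
Models = Dict[str, Dict[str, Set[str]]]


def causal_edges(trajectory: List[str], models: Models) -> List[Tuple[int, int, str]]:
    """Return (i, j, v) whenever step j reads variable v last written by step i < j."""
    last_writer: Dict[str, int] = {}  # variable -> index of most recent writer
    edges: List[Tuple[int, int, str]] = []
    for j, action in enumerate(trajectory):
        # Link this action to the most recent earlier writer of each variable it reads.
        for v in sorted(models[action]["reads"]):
            if v in last_writer:
                edges.append((last_writer[v], j, v))
        # Record this action as the latest writer of the variables it changes.
        for v in models[action]["writes"]:
            last_writer[v] = j
    return edges


# Toy, made-up domain loosely resembling a pickup-and-delivery task.
models: Models = {
    "goto_passenger": {"reads": {"taxi_loc"}, "writes": {"taxi_loc"}},
    "pickup": {"reads": {"taxi_loc", "pass_loc"}, "writes": {"pass_loc"}},
    "goto_dest": {"reads": {"taxi_loc"}, "writes": {"taxi_loc"}},
    "putdown": {"reads": {"taxi_loc", "pass_loc"}, "writes": {"pass_loc"}},
}
trajectory = ["goto_passenger", "pickup", "goto_dest", "putdown"]

edges = causal_edges(trajectory, models)
for i, j, v in edges:
    print(f"{trajectory[i]} (step {i}) -> {trajectory[j]} (step {j}) via {v}")
```

In this toy example, the edges group actions by the variables that connect them, which hints at how a trajectory might be partitioned into candidate subtasks; how HI-MAT actually performs that partitioning is described in the full paper.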



