Using relative novelty to identify useful temporal abstractions in reinforcement learning
Source: ACM International Conference Proceeding Series; Vol. 69
Proceedings of the twenty-first international conference on Machine learning
Banff, Alberta, Canada
Page: 95  
Year of Publication: 2004
ISBN:1-58113-828-5
Authors
Özgür Şimşek  University of Massachusetts, Amherst, MA
Andrew G. Barto  University of Massachusetts, Amherst, MA
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 4,   Downloads (12 Months): 25,   Citation Count: 8
DOI: http://doi.acm.org/10.1145/1015330.1015353

ABSTRACT

We present a new method for automatically creating useful temporal abstractions in reinforcement learning. We argue that states that allow the agent to transition to a different region of the state space are useful subgoals, and propose a method for identifying them using the concept of relative novelty. When such a state is identified, a temporally-extended activity (e.g., an option) is generated that takes the agent efficiently to this state. We illustrate the utility of the method in a number of tasks.
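
The abstract does not define relative novelty, so the following is only a rough sketch under stated assumptions, not the authors' implementation. It assumes that the novelty of a state is 1/sqrt(n), where n is the number of visits so far, and that the relative novelty of a visit is the mean novelty of the states from that visit onward divided by the mean novelty of the states just before it, over a short window. The function names and the `lag` parameter are hypothetical.

```python
from collections import Counter
from math import sqrt


def novelty(counts, state):
    # Assumption: novelty decays with the visit count as 1 / sqrt(n).
    return 1.0 / sqrt(counts[state])


def relative_novelty_scores(trajectory, lag=2):
    """Map each state to the relative-novelty scores of its visits.

    Hypothetical helper: a score is the mean novelty of the `lag` states
    starting at the visit divided by the mean novelty of the `lag` states
    preceding it. Visits too close to either end are skipped.
    """
    counts = Counter()
    nov = []
    for s in trajectory:
        counts[s] += 1                      # update the count, then measure
        nov.append(novelty(counts, s))

    scores = {}
    for t in range(lag, len(trajectory) - lag + 1):
        before = sum(nov[t - lag:t]) / lag
        after = sum(nov[t:t + lag]) / lag   # window includes the state itself
        scores.setdefault(trajectory[t], []).append(after / before)
    return scores


# Toy trajectory: the agent paces between a1 and a2, then passes
# through a doorway state d into unvisited states b1 and b2.
trajectory = ["a1", "a2", "a1", "a2", "a1", "a2", "d", "b1", "b2"]
scores = relative_novelty_scores(trajectory, lag=2)
candidate = max(scores, key=lambda s: max(scores[s]))
print(candidate, scores[candidate])
```

In this toy trajectory the doorway state receives the highest score because it is preceded by frequently visited states and followed by unvisited ones, which matches the abstract's description of a subgoal as a state that lets the agent move to a different region of the state space. How scores are thresholded and how an option is then built are left out here.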


