Using relative novelty to identify useful temporal abstractions in reinforcement learning
Source
ACM International Conference Proceeding Series; Vol. 69
Proceedings of the twenty-first international conference on Machine learning
Banff, Alberta, Canada
Page: 95
Year of Publication: 2004
ISBN: 1-58113-828-5
ABSTRACT
We present a new method for automatically creating useful temporal abstractions in reinforcement learning. We argue that states that allow the agent to transition to a different region of the state space are useful subgoals, and propose a method for identifying them using the concept of relative novelty. When such a state is identified, a temporally-extended activity (e.g., an option) is generated that takes the agent efficiently to this state. We illustrate the utility of the method in a number of tasks.
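The abstract does not define relative novelty, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that the novelty of a set of states is the inverse square root of their mean visit count, and that the relative novelty of a visited state is the novelty of the `lag` states starting at it divided by the novelty of the `lag` states preceding it. The function names, the `lag` parameter, and the toy trajectory are all hypothetical. The sketch stops at scoring states and does not build the option that would take the agent to a detected subgoal.

```python
from collections import Counter
from math import sqrt


def novelty(states, counts):
    """Novelty of a set of states: 1 / sqrt(mean visit count). (Assumed definition.)"""
    mean_visits = sum(counts[s] for s in states) / len(states)
    return 1.0 / sqrt(mean_visits)


def relative_novelty_scores(trajectory, lag=2):
    """Score each visited state that has `lag` states before it and `lag` states
    from it onward. Returns a list of (index, state, score) tuples.

    Visit counts are updated online, so every state inside a window has been
    visited at least once and the mean count is never zero.
    """
    counts = Counter()
    scores = []
    for i, state in enumerate(trajectory):
        counts[state] += 1
        t = i - lag + 1  # index whose forward window trajectory[t:t+lag] just completed
        if t - lag < 0:
            continue  # not enough history before index t yet
        preceding = trajectory[t - lag:t]
        following = trajectory[t:t + lag]  # includes the state at index t itself
        score = novelty(following, counts) / novelty(preceding, counts)
        scores.append((t, trajectory[t], score))
    return scores


# Hypothetical trajectory: the agent wanders between two states of one region,
# then passes through a doorway "d" into a region it has not visited before.
trajectory = ["a1", "a2"] * 5 + ["d", "b1", "b2"]
scores = relative_novelty_scores(trajectory, lag=2)
best_index, best_state, best_score = max(scores, key=lambda item: item[2])
print(best_state, round(best_score, 3))
```

On this toy trajectory the doorway state `d` receives the highest score, because the states before it are well visited while the states from it onward are new. A state that repeatedly scores high in this way is the kind of candidate subgoal the abstract describes, at which point an option leading to it could be created.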