|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
ABSTRACT
We present a novel technique for automated problem decomposition to address the problem of scalability in reinforcement learning. Our technique makes use of a set of near-optimal trajectories to discover options and incorporates them into the learning process, dramatically reducing the time it takes to solve the underlying problem. We run a series of experiments in two different domains and show that our method offers up to 30 fold speedup over the baseline. REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
INDEX TERMS
Primary Classification:
Additional Classification:
Collaborative Colleagues:
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||