| The adaptive k-meteorologists problem and its application to structure learning and feature selection in reinforcement learning |
| Source
|
ACM International Conference Proceeding Series; Vol. 382
Proceedings of the 26th Annual International Conference on Machine Learning
Montreal, Quebec, Canada
Pages 249-256
Year of Publication: 2009
ISBN: 978-1-60558-516-1
|
ABSTRACT
The purpose of this paper is threefold. First, we formalize and study a problem of learning probabilistic concepts in the recently proposed KWIK framework. We give details of an algorithm, known as the Adaptive k-Meteorologists Algorithm, analyze its sample-complexity upper bound, and give a matching lower bound. Second, this algorithm is used to create a new reinforcement-learning algorithm for factored-state problems that enjoys significant improvement over the previous state-of-the-art algorithm. Finally, we apply the Adaptive k-Meteorologists Algorithm to remove a limiting assumption in an existing reinforcement-learning algorithm. The effectiveness of our approaches is demonstrated empirically in a couple of benchmark domains as well as a robotics navigation problem.