The adaptive k-meteorologists problem and its application to structure learning and feature selection in reinforcement learning
Source: ACM International Conference Proceeding Series; Vol. 382
Proceedings of the 26th Annual International Conference on Machine Learning
Montreal, Quebec, Canada
Pages 249-256
Year of Publication: 2009
ISBN: 978-1-60558-516-1
Authors
Carlos Diuk, Rutgers University, Piscataway, NJ
Lihong Li, Rutgers University, Piscataway, NJ
Bethany R. Leffler, Rutgers University, Piscataway, NJ
Sponsors
MITACS
NSF
Microsoft Research
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1553374.1553406

ABSTRACT

The purpose of this paper is threefold. First, we formalize and study a problem of learning probabilistic concepts in the recently proposed KWIK framework. We give details of an algorithm, known as the Adaptive k-Meteorologists Algorithm, analyze its sample-complexity upper bound, and give a matching lower bound. Second, this algorithm is used to create a new reinforcement-learning algorithm for factored-state problems that significantly improves on the previous state-of-the-art algorithm. Finally, we apply the Adaptive k-Meteorologists Algorithm to remove a limiting assumption in an existing reinforcement-learning algorithm. The effectiveness of our approaches is demonstrated empirically in a couple of benchmark domains as well as a robotics navigation problem.


