An object-oriented representation for efficient reinforcement learning
Source ICML; Vol. 307
Proceedings of the 25th International Conference on Machine Learning
Helsinki, Finland
Pages 240-247  
Year of Publication: 2008
ISBN:978-1-60558-205-4
Authors
Carlos Diuk  Rutgers University, Piscataway, NJ
Andre Cohen  Rutgers University, Piscataway, NJ
Michael L. Littman  Rutgers University, Piscataway, NJ
Sponsors
Yahoo!
Xerox
IBM
NSF
Microsoft Research
Machine Learning Journal/Springer
Pascal
University of Helsinki
Federation of Finnish Learned Societies
Intel Corporation
Google
Helsinki Institute for Information Technology
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1390156.1390187

ABSTRACT

Rich representations in reinforcement learning have been studied for the purpose of enabling generalization and making learning feasible in large state spaces. We introduce Object-Oriented MDPs (OO-MDPs), a representation based on objects and their interactions, which is a natural way of modeling environments and offers important generalization opportunities. We introduce a learning algorithm for deterministic OO-MDPs and prove a polynomial bound on its sample complexity. We illustrate the performance gains of our representation and algorithm in the well-known Taxi domain, plus a real-life videogame.


