| Constraint relaxation in approximate linear programs |
| Full text |
Pdf
(713 KB)
|
| Source
|
ACM International Conference Proceeding Series; Vol. 382
archive
Proceedings of the 26th Annual International Conference on Machine Learning
table of contents
Montreal, Quebec, Canada
Pages 809-816
Year of Publication: 2009
ISBN:978-1-60558-516-1
|
|
Authors
|
|
| Sponsors |
|
| Publisher |
|
| Bibliometrics |
Downloads (6 Weeks): 8, Downloads (12 Months): 26, Citation Count: 0
|
|
|
ABSTRACT
Approximate Linear Programming (ALP) is a reinforcement learning technique with nice theoretical properties, but it often performs poorly in practice. We identify some reasons for the poor quality of ALP solutions in problems where the approximation induces virtual loops. We then introduce two methods for improving solution quality. One method rolls out selected constraints of the ALP, guided by the dual information. The second method is a relaxation of the ALP, based on external penalty methods. The latter method is applicable in domains in which rolling out constraints is impractical. Both approaches show promising empirical results for simple benchmark problems as well as for a realistic blood inventory management problem.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
|
| |
2
|
|
| |
3
|
Guestrin, C., Koller, D., Parr, R., & Venkataraman, S. (2003). Efficient solution algorithms for factored MDPs. Journal of Artificial Intelligence Research, 19, 399--468.
|
| |
4
|
|
| |
5
|
Petrik, M., & Zilberstein, S. (2008). Learning heuristic functions through approximate linear programming. International Conference on Automated Planning and Scheduling (ICAPS) (pp. 248--255).
|
| |
6
|
|
| |
7
|
|
| |
8
|
|
|