| Markov decision processes in large state spaces |
| Source
|
Annual Workshop on Computational Learning Theory
Proceedings of the eighth annual conference on Computational learning theory
Santa Cruz, California, United States
Pages: 281 - 288
Year of Publication: 1995
ISBN: 0-89791-723-5
|
|
Authors
|
|
Lawrence K. Saul
|
Center for Biological and Computational Learning, Massachusetts Institute of Technology, 79 Amherst Street, E10-243, Cambridge, MA
|
|
Satinder P. Singh
|
Center for Biological and Computational Learning, Massachusetts Institute of Technology, 79 Amherst Street, E10-243, Cambridge, MA
|
|
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
D. J. Amit. Modeling brain function. Cambridge University Press, Cambridge, 1989.
|
| |
2
|
|
| |
3
|
|
| |
4
|
|
| |
5
|
David Haussler, H. Sebastian Seung, Michael Kearns, Naftali Tishby. Rigorous learning curve bounds from statistical mechanics. Proceedings of the seventh annual conference on Computational learning theory, p. 76-87, July 12-15, 1994, New Brunswick, New Jersey, United States.
doi: 10.1145/180139.181018
|
| |
6
|
R. Howard. Dynamic programming and Markov processes. MIT Press, Cambridge, MA, 1960.
|
| |
7
|
K. Huang. Statistical Mechanics. John Wiley & Sons, New York, NY, 1987.
|
| |
8
|
W. H. Press, B. P. Flannery, S. A. Teukolsky, and W. T. Vetterling. Numerical Recipes. Cambridge University Press, Cambridge, 1986.
|
| |
9
|
H. S. Seung, H. Sompolinsky, and N. Tishby. Statistical mechanics of learning from examples. Physical Review A 45: 6056-6091, 1992.
|
| |
10
|
|
| |
11
|
P. Tseng. Solving H-horizon, stationary Markov decision problems in time proportional to log(H), Operations Research Letters, 9:287-297, 1990.
|
| |
12
|
T. Watkin, A. Rau, and M. Biehl. The statistical mechanics of learning a rule. Reviews of Modern Physics 65:499-556, 1993.
|
| |
13
|
C. Watkins. Learning from delayed rewards. PhD thesis, Cambridge University, 1989.
|
| |
14
|
|
|