| Convergence analysis of temporal-difference learning algorithms with linear function approximation |
| Source
|
Annual Workshop on Computational Learning Theory
Proceedings of the twelfth annual conference on Computational learning theory
Santa Cruz, California, United States
Pages: 193 - 202
Year of Publication: 1999
ISBN: 1-58113-167-4
|
|
Author
|
|
Vladislav Tadić
|
Mihajlo Pupin Institute, Volgina 15, 11000 Belgrade, Serbia, Yugoslavia
|
|
REFERENCES
| |
1
|
|
| |
2
|
|
| |
3
|
|
| |
4
|
D. S. Clark, Necessary and sufficient conditions for the Robbins-Monro method, Stochastic Processes and their Applications, vol. 17, pp. 359-367, 1984.
|
| |
5
|
|
| |
6
|
|
| |
7
|
|
| |
8
|
T. Jaakkola, M. I. Jordan, and S. P. Singh, On the convergence of stochastic iterative dynamic programming algorithms, Neural Computation, vol. 6, pp. 1185-1201, 1994.
|
| |
9
|
S. R. Kulkarni and C. S. Horn, An alternative proof for convergence of stochastic approximation algorithms, IEEE Transactions on Automatic Control, vol. 41, pp. 419-424, 1996.
|
| |
10
|
|
| |
11
|
H. J. Kushner and D. S. Clark, Stochastic Approximation Methods for Constrained and Unconstrained Systems, Springer-Verlag, 1978.
|
| |
12
|
|
| |
13
|
S. P. Meyn and R. L. Tweedie, Markov Chains and Stochastic Stability, Springer-Verlag, 1993.
|
| |
14
|
|
| |
15
|
W. F. Stout, Almost Sure Convergence, Academic Press, 1974.
|
| |
16
|
|
| |
17
|
|
| |
18
|
V. Tadić, Convergence of stochastic approximation under general noise and stability conditions, Proceedings of the 36th IEEE Conference on Decision and Control, 1997.
|
| |
19
|
|
| |
20
|
J. N. Tsitsiklis and B. Van Roy, An analysis of temporal-difference learning with function approximation, IEEE Transactions on Automatic Control, vol. 42, pp. 674-690, 1997.
|
| |
21
|
I.-J. Wang, E. K. P. Chong, and S. R. Kulkarni, Equivalent and sufficient conditions on noise sequences for stochastic approximation algorithms, Advances in Applied Probability, vol. 28, pp. 784-801, 1996.
|