Convergence analysis of temporal-difference learning algorithms with linear function approximation
Full text: PDF (919 KB)
Source: Annual Workshop on Computational Learning Theory
Proceedings of the twelfth annual conference on Computational learning theory
Santa Cruz, California, United States
Pages: 193-202
Year of Publication: 1999
ISBN:1-58113-167-4
Author
Vladislav Tadić, Mihajlo Pupin Institute, Volgina 15, 11000 Belgrade, Serbia, Yugoslavia
Sponsors
SIGACT: ACM Special Interest Group on Algorithms and Computation Theory
SIGART: ACM Special Interest Group on Artificial Intelligence
University of California at Santa Cruz
Publisher
ACM, New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 1,   Downloads (12 Months): 16,   Citation Count: 0
DOI: http://doi.acm.org/10.1145/307400.307438

REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
4. D. S. Clark, Necessary and sufficient conditions for the Robbins-Monro method, Stochastic Processes and their Applications, vol. 17, pp. 359-367, 1984.

8. T. Jaakkola, M. I. Jordan, and S. P. Singh, On the convergence of stochastic iterative dynamic programming algorithms, Neural Computation, vol. 6, pp. 1185-1201, 1994.

9. S. R. Kulkarni and C. S. Horn, An alternative proof for convergence of stochastic approximation algorithms, IEEE Transactions on Automatic Control, vol. 41, pp. 419-424, 1996.

11. H. J. Kushner and D. S. Clark, Stochastic Approximation Methods for Constrained and Unconstrained Systems, Springer-Verlag, 1978.

13. S. P. Meyn and R. L. Tweedie, Markov Chains and Stochastic Stability, Springer-Verlag, 1993.

15. W. F. Stout, Almost Sure Convergence, Academic Press, 1974.

18. V. Tadić, Convergence of stochastic approximation under general noise and stability conditions, Proceedings of the 36th IEEE Conference on Decision and Control, 1997.

20. J. N. Tsitsiklis and B. Van Roy, An analysis of temporal-difference learning with function approximation, IEEE Transactions on Automatic Control, vol. 42, pp. 674-690, 1997.

21. I.-J. Wang, E. K. P. Chong, and S. R. Kulkarni, Equivalent and sufficient conditions on noise sequences for stochastic approximation algorithms, Advances in Applied Probability, vol. 28, pp. 784-801, 1996.

