Temporal difference learning and TD-Gammon
Source
Communications of the ACM archive
Volume 38, Issue 3 (March 1995)
Pages: 58-68
Year of Publication: 1995
ISSN: 0001-0782
Author
Gerald Tesauro, IBM Thomas J. Watson Research Center, P.O. Box 704, Yorktown Heights, NY
Publisher
ACM, New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 25, Downloads (12 Months): 171, Citation Count: 90
DOI: http://doi.acm.org/10.1145/203330.203343

ABSTRACT

Ever since the days of Shannon's proposal for a chess-playing algorithm [12] and Samuel's checkers-learning program [10], the domain of complex board games such as Go, chess, checkers, Othello, and backgammon has been widely regarded as an ideal testing ground for exploring a variety of concepts and approaches in artificial intelligence and machine learning. Such board games offer the challenge of tremendous complexity and sophistication required to play at expert level. At the same time, the problem inputs and performance measures are clear-cut and well defined, and the game environment is readily automated in that it is easy to simulate the board, the rules of legal play, and the rules regarding when the game is over and determining the outcome.


REFERENCES


 
1
Berliner, H. Computer backgammon. Sci. Amer. 243, 1 (1980), 64-72.
 
2
 
3
 
4
 
5
 
6
Isabelle, J.-F. Auto-apprentissage, à l'aide de réseaux de neurones, de fonctions heuristiques utilisées dans les jeux stratégiques. Master's thesis, Univ. of Montreal, 1993.
 
7
Magriel, P. Backgammon. Times Books, New York, 1976.
 
8
Robertie, B. Carbon versus silicon: Matching wits with TD-Gammon. Inside Backgammon 2, 2 (1992), 14-22.
 
9
 
10
Samuel, A. Some studies in machine learning using the game of checkers. IBM J. of Research and Development 3 (1959), 210-229.
 
11
Schraudolph, N.N., Dayan, P., and Sejnowski, T.J. Temporal difference learning of position evaluation in the game of Go. In J.D. Cowan, et al., Eds., Advances in Neural Information Processing Systems 6, 817-824. Morgan Kaufmann, San Mateo, Calif., 1994.
 
12
Shannon, C.E. Programming a computer for playing chess. Philosophical Mag. 41 (1950), 265-275.
 
13
 
14
Tesauro, G. Neurogammon wins Computer Olympiad. Neural Computation 1 (1989), 321-323.
 
15
 
16
Zadeh, N. and Kobliska, G. On optimal doubling in backgammon. Manage. Sci. 23 (1977), 853-858.


REVIEW

Reviewer: Jaak Tepandi

Complex board games are a natural testing ground for machine learning and artificial intelligence. They are based on experience; they are attractive; and they do not have the safety requirements that sometimes block the use of heur...