On-line discovery of temporal-difference networks
Source
ICML; Vol. 307
Proceedings of the 25th International Conference on Machine Learning
Helsinki, Finland
Pages 632-639
Year of Publication: 2008
ISBN: 978-1-60558-205-4
ABSTRACT
We present an algorithm for on-line, incremental discovery of temporal-difference (TD) networks. The key contribution is a set of three criteria for expanding a node in a TD network: a node is expanded when it is well-known, independent, and has a prediction error that requires further explanation. Because none of these criteria requires centralized computation, they can be evaluated in a parallel, distributed manner and scale to larger problems better than other discovery methods for predictive state representations. Computer experiments demonstrate the empirical effectiveness of the algorithm.
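The expansion rule described above can be illustrated with a minimal sketch. The abstract does not define how "well-known", "independent", or "requires further explanation" are measured, so the statistics (`visits`, `dependence`, `mean_abs_error`), the thresholds, and the names `NodeStats`, `should_expand`, and `nodes_to_expand` below are hypothetical placeholders, not the paper's definitions. The sketch only shows the shape of the decision: each node is tested using its own statistics, with no global pass over the network.

```python
from dataclasses import dataclass


@dataclass
class NodeStats:
    """Per-node statistics. Every field is an assumed stand-in, not the paper's measure."""
    visits: int            # how many times the node's prediction has been updated
    dependence: float      # placeholder score in [0, 1]: redundancy with other nodes
    mean_abs_error: float  # running average of the node's absolute prediction error


def should_expand(stats, min_visits=1000, max_dependence=0.9, min_error=0.05):
    """Return True when all three (hypothetical) criteria hold for one node."""
    well_known = stats.visits >= min_visits                 # enough experience
    independent = stats.dependence < max_dependence         # not redundant
    needs_explanation = stats.mean_abs_error > min_error    # error still unexplained
    return well_known and independent and needs_explanation


def nodes_to_expand(network):
    """Apply the per-node test to every node; each test uses only that node's stats."""
    return [name for name, stats in network.items() if should_expand(stats)]
```

Because `should_expand` reads only one node's statistics, the checks could in principle run independently per node, which is the property the abstract attributes to the criteria.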