ABSTRACT
Robust sequence prediction is an essential component of an intelligent agent acting in a dynamic world. We consider the case of near-future event prediction by an online learning agent operating in a non-stationary environment. The challenge for a learning agent under these conditions is to exploit the relevant experience from a limited environmental event history while preserving flexibility.

We propose a novel time- and space-efficient method for learning temporal sequences and making short-term predictions. Our method operates online, requires few exemplars, and adapts easily and quickly to changes in the underlying stochastic world model. Using a short-term memory of recent observations, the method maintains a dynamic space of candidate hypotheses, and the growth of this space is systematically and dynamically pruned using an entropy measure over the observed predictive quality of each candidate hypothesis.

The method compares well against Markov-chain predictions, and adapts faster than learned Markov-chain models to changes in the underlying distribution of events. We demonstrate the method using both synthetic data and empirical experience from a game-playing scenario with human opponents.
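The abstract does not spell out the algorithm, so the following is only an illustrative sketch of the general idea it describes (a short-term memory, a growing set of candidate hypotheses, and entropy-based pruning), not the authors' method. Every name and parameter here (`EntropyPrunedPredictor`, `memory`, `max_entropy`, `min_obs`) is hypothetical, and treating each suffix of the recent observations as a hypothesis is an assumption made for the example.

```python
import math
from collections import Counter, defaultdict, deque


def entropy(counts):
    """Shannon entropy (in bits) of a Counter of next-event counts."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


class EntropyPrunedPredictor:
    """Toy online predictor (hypothetical design, not the paper's algorithm).

    A hypothesis is a suffix of the short-term memory together with counts
    of the events that followed it. Hypotheses whose next-event
    distribution becomes too uncertain are discarded.
    """

    def __init__(self, memory=4, max_entropy=0.9, min_obs=3):
        self.recent = deque(maxlen=memory)   # short-term memory
        self.max_entropy = max_entropy       # pruning threshold, in bits
        self.min_obs = min_obs               # observations required before pruning
        self.hyps = defaultdict(Counter)     # context tuple -> next-event counts

    def _contexts(self):
        # All non-empty suffixes of the recent observations, longest first.
        r = tuple(self.recent)
        return [r[i:] for i in range(len(r))]

    def predict(self):
        """Return the most frequent next event of the lowest-entropy
        matching hypothesis, or None if no hypothesis matches."""
        best = None
        for ctx in self._contexts():
            counts = self.hyps.get(ctx)
            if not counts:
                continue
            h = entropy(counts)
            if best is None or h < best[0]:
                best = (h, counts.most_common(1)[0][0])
        return None if best is None else best[1]

    def observe(self, event):
        """Update every matching hypothesis, prune the uncertain ones,
        then push the event into short-term memory."""
        for ctx in self._contexts():
            counts = self.hyps[ctx]
            counts[event] += 1
            if (sum(counts.values()) >= self.min_obs
                    and entropy(counts) > self.max_entropy):
                del self.hyps[ctx]           # prune a poor predictor
        self.recent.append(event)
```

In this sketch, calling `observe` on each event of a repeating sequence such as `"abcabcabc"` leaves only zero-entropy hypotheses, so `predict()` returns the next symbol in the cycle. Deleting a pruned hypothesis also discards its stale counts, which is one simple way to let the predictor re-learn after the event distribution changes.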
CITED BY
Steven Jensen, Daniel Boley, Maria Gini, Paul Schrater, Non-stationary policy learning in 2-player zero sum games, Proceedings of the 20th national conference on Artificial intelligence, p.789-794, July 09-13, 2005, Pittsburgh, Pennsylvania