A stochastic memoizer for sequence data
Source
ACM International Conference Proceeding Series; Vol. 382
Proceedings of the 26th Annual International Conference on Machine Learning
Montreal, Quebec, Canada
Pages 1129-1136
Year of Publication: 2009
ISBN: 978-1-60558-516-1
Authors
Frank Wood, University College London, London, UK
Cédric Archambeau, University College London, London, UK
Jan Gasthaus, University College London, London, UK
Lancelot James, Hong Kong University of Science and Technology, Kowloon, Hong Kong
Yee Whye Teh, University College London, London, UK
ABSTRACT
We propose an unbounded-depth, hierarchical, Bayesian nonparametric model for discrete sequence data. This model can be estimated from a single training sequence, yet shares statistical strength between subsequent symbol predictive distributions in such a way that predictive performance generalizes well. The model builds on a specific parameterization of an unbounded-depth hierarchical Pitman-Yor process. We introduce analytic marginalization steps (using coagulation operators) to reduce this model to one that can be represented in time and space linear in the length of the training sequence. We show how to perform inference in such a model without truncation approximation and introduce fragmentation operators necessary to do predictive inference. We demonstrate the sequence memoizer by using it as a language model, achieving state-of-the-art results.
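As a rough illustration of the building block named in the abstract, the sketch below computes next-symbol predictive probabilities for a single Pitman-Yor process in its standard Chinese restaurant representation. It is not the authors' implementation and covers only one level of the hierarchy: the function name `py_predictive`, the table-count layout, and the example parameter values are all assumptions made for illustration. In a hierarchical model, `base_prob` would be supplied by the predictive distribution of the parent (shorter) context.

```python
def py_predictive(table_counts, discount, concentration, base_prob):
    """Predictive symbol probabilities for one Pitman-Yor restaurant.

    table_counts:  dict mapping symbol -> list of customer counts, one per table
                   serving that symbol (hypothetical layout for this sketch)
    discount:      discount parameter d, with 0 <= d < 1
    concentration: concentration parameter c, with c > -d
    base_prob:     dict mapping symbol -> probability under the base distribution
    """
    n = sum(sum(tables) for tables in table_counts.values())  # total customers
    t = sum(len(tables) for tables in table_counts.values())  # total tables
    if n == 0:
        # An empty restaurant simply falls back to the base distribution.
        return dict(base_prob)
    probs = {}
    for symbol, p0 in base_prob.items():
        tables = table_counts.get(symbol, [])
        c_s = sum(tables)   # customers eating this symbol
        t_s = len(tables)   # tables serving this symbol
        # Discounted count plus the mass redirected to the base distribution.
        probs[symbol] = (c_s - discount * t_s
                         + (concentration + discount * t) * p0) / (concentration + n)
    return probs


# Example with assumed values: three symbols, uniform base distribution.
counts = {"a": [2, 1], "b": [1]}
base = {"a": 1 / 3, "b": 1 / 3, "c": 1 / 3}
print(py_predictive(counts, discount=0.5, concentration=0.0, base_prob=base))
```

The discount subtracts a fixed amount from each occupied table and hands that mass to the base distribution, which is how an unseen symbol such as "c" still receives nonzero probability.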