ACM Home Page
Please provide us with feedback. Feedback
Inference and minimization of hidden Markov chains
Full text PdfPdf (1.17 MB)
Source Annual Workshop on Computational Learning Theory archive
Proceedings of the seventh annual conference on Computational learning theory table of contents
New Brunswick, New Jersey, United States
Pages: 147 - 158  
Year of Publication: 1994
ISBN:0-89791-655-7
Authors
David Gillman  Univ. of Minnesota, Minneapolis
Michael Sipser  Massachusetts Institute of Technology, Cambridge
Sponsors
SIGACT: ACM Special Interest Group on Algorithms and Computation Theory
SIGART: ACM Special Interest Group on Artificial Intelligence
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 19,   Downloads (12 Months): 46,   Citation Count: 5
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/180139.181091
What is a DOI?

ABSTRACT

A hidden Markov chain (hmc) is a finite ergodic Markov chain in which each of the states is labelled 0 or 1. As the Markov chain moves through a random trajectory, the hmc emits a 0 or a 1 at each times step according to the label of the state just entered. The inference problem is to construct a mechanism which will emit 0's and 1's and which is equivalent to a given hmc in the sense of having identical long-term behavior. We define the inference problem in a learning setting in which an algorithm can query an oracle for the long-term probability of any binary string. We prove that inference is hard: any algorithm for inference must make exponentially many oracle calls. Our method is information-theoretic and does not depend on separation assumptions for any complexity classes. We show that the related discrimination problem is also hard, but that on a nontrivial subclass of hmc's there is a randomized algorithm for discrimination. Finally, we give a polynomial-time algorithm for reducing a hidden Markov chain to its minimal form, and from this there follows a new algorithm for equivalence of hmc's.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
AW
 
AK*
R. Aleliunas, R. M. Karp, R. J. Lipton, L. Lovasz, and C. Rackoff, "Random walks, universal traversal sequences, and the complexity of maze problems," Proceedzngs of the ~Oth Annual IEEE Symposium on the Foundations of Computer Science (1979), 218-233.
 
Ang87
 
Ang88
 
Ang89
D. Angluin, "Minimum consistent DFA problem is NP-complete," unpublished manuscript.
 
Ch
H. Chernoff, "A measure of asymptotic efficiency for tests of a hypothesis based on the sum of observations," Annals of Mathematical Statistics 23 (1952), 493-507.
 
Co
A. Cobham, "Stochastic automata with large state spaces and low rank," Linear Algebra and Its Applications 75 (1986), 57-66.
 
CLR
 
ERV
W. Evans, S. Rajagopalan, and U. Vazirani, "Learning an unknown randomized algorithm from its behavior," to appear in Proceedings of the S~xth A CM Workshop on Computational Learmng Theory, 1993.
 
Fe
W. Feller, An Introductwn to Probabzhty The. ory and Its Applications, 2 volumes, 2d ed., Wiley, 1957.
 
Gi
E.J. Gilbert, "On the identifiability problem for functions of finite Markov chains," Annals of Mathematical Statistics 30 (1959), 688-697.
 
G93a
 
Go
E.M. Gold, "System identification via state characterizaton," Automatica 8 (1972), 621- 636.
 
Go78
E. M. Gold, "Complexity of automaton identification from given data," Information and Control 37 (1978), 302-320.
 
Ha
D J. Hand, Discrimination and Classification, Wiley, 1981.
 
IAK
H. Ito, S.-I. Amari, and K. Kobayashi, "Identifiability of hidden Markov information sources and their minimum degrees of freedom," IEEE Transaction on Information Theory 38 (1992), 324-333.
 
LZ
J. Ziv and A. Lempel, "Compression of individual sequences via variable-rate coding," IEEE Transactzons on Informatwn Theory 24 No. 5 (1978), 530-536.
 
LRS
S.E. Levinson, L.R. Rabiner, and M. M. Sondhi, "An introduction to the application of the theory of probabilistic functions of a Markov process to automatic speech recognition,'' The Bell System Technical Journal 62 (1983), 1035-1074.
 
MZ89
N. Merhav and J. Ziv, "Estimating with partial statistics the parameters of ergodic finite Markov sources," IEEE Transactions on Information Theory 35 No. 2 (1989), 326-333.
 
P
A. Paz, Introduction to Probabilistic Automata, 1971.
PW
 
R
S. Rudich, "Inferring the Structure ofa Markov Chain From Its Output,"Proceedings of the Twenty-Sizth IEEE Symposium on Foundations of Computer Science (1985), 321-326.
 
Tz92a
 
Tz92b
V84
 
WZ
A.D. Wyner and J. Ziv, "Classification with finite-memory," preprint - in preparation.
Y
 
Zi
J. Ziv, "On classification with empirically observed statistics and universal data compression,'' IEEE Transactions on Information Theory 34 No. 2 (1988), 278-286.
 
ZM
J. Ziv and N. Merhav, "Estimating the number of states of a finite-state source," IEEE Transactions on Information Theory 38 No. 1 (1992), 61-65.


Collaborative Colleagues:
David Gillman: colleagues
Michael Sipser: colleagues