|
ABSTRACT
We motivate, formalize, and study a computational problem in concrete inductive inference. A “pattern” is defined to be a concatenation of constants and variables, and the language of a pattern is defined to be the set of strings obtained by substituting constant strings for the variables. The problem we consider is, given a set of strings, find a minimal pattern language containing this set. This problem is shown to be effectively solvable in the general case and to lead to correct inference in the limit of the pattern languages. There exists a polynomial time algorithm for it in the restricted case of one-variable patterns. Inference from positive data is re-examined, and a characterization given of when it is possible for a family of recursive languages. Various collateral results about patterns and pattern languages are obtained. Section 1 is an introduction explaining the context of this work and informally describing the problem formulation. Section 2 is definitions. Section 3 is results concerning patterns and pattern languages. Section 4 concerns the abstract question of inference from positive data. Section 5 gives a polynomial time algorithm for finding minimal one-variable pattern languages compatible with a given set of strings. Section 6 contains remarks.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
D. Angluin, Easily Inferred Sequences. Technical report, Department of EECS, University of California, Berkeley, 1974.
|
| |
2
|
D. Angluin, On the complexity of minimum inference of regular sets, Information and Control, to appear.
|
| |
3
|
L. Blum and M. Blum, Toward a mathematical theory of inductive inference, Information and Control 28 (1975), 125-155.
|
| |
4
|
E.M. Gold, Complexity of automaton identification from given data, Information and Control 37 (1978), 302-320.
|
| |
5
|
E.M. Gold, Language identification in the limit, Information and Control 10 (1967), 447-474.
|
CITED BY 7
|
|
|
|
|
Hasan Davulcu , Guizhen Yang , Michael Kifer , I. V. Ramakrishnan, Computational aspects of resilient data extraction from semistructured sources (extended abstract), Proceedings of the nineteenth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, p.136-144, May 15-18, 2000, Dallas, Texas, United States
|
|
|
|
|
|
Hiroki Arimura , Hiroki Ishizaka , Takeshi Shinohara, Polynomial time inference of a subclass of context-free transformations, Proceedings of the fifth annual workshop on Computational learning theory, p.136-143, July 27-29, 1992, Pittsburgh, Pennsylvania, United States
|
|
|
|
|
|
|
|
|
|
|