|
ABSTRACT
An editing by example system is an automatic program synthesis facility embedded in a text editor that can be used to solve repetitive text editing problems. The user provides the editor with a few examples of a text transformation. The system analyzes the examples and generalizes them into a program that can perform the transformation to the rest of the user's text. This paper presents the design, analysis, and implementation of a practical editing by example system. In particular, we study the problem of synthesizing a text processing program that generalizes the transformation implicitly described by a small number of input/output examples. We define a class of text processing programs called gap programs, characterize their computational power, study the problems associated with synthesizing them from examples, and derive an efficient heuristic that provably synthesizes a gap program from examples of its input/output behavior. We evaluate how well the gap program synthesis heuristic performs on the text encountered in practice. This evaluation inspires the development of several modifications to the gap program synthesis heuristic that act both to improve the quality of the hypotheses proposed by the system and to reduce the number of examples required to converge to a target program. The result is a gap program synthesis heuristic that can usually synthesize a target gap program from two or three input examples and a single output example. The editing by example system derived from this analysis has been embedded in a production text editor. The system is presented as a group of editor commands that use the standard interfaces of the editor to collect examples, show synthesized programs, and run them. By developing an editing by example system that solves a useful class of text processing problems, we demonstrate that program synthesis is feasible in the domain of text editing.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
A.V. Aho, B.W. Kernighan, and P.J. Weinberger. AWK - A pattern matching and scanning language. Software - Practice & Experience, 9(4):267-280, April, 1979.
|
| |
2
|
D. Angluin. On the complexity of minimum inference of regular sets. Information and Control, 39:337-350, 1978.
|
| |
3
|
D. Angluin. Finding patterns common to a set of strings. JCSS, 21:46-62, 1980.
|
| |
4
|
D. Angluin and C. Smith. A survey of inductive inference: theory and methods. Research report No. 250, Yale University Department of Computer Science, October. 1982.
|
| |
5
|
D. Angluin. A note on the number of queries needed to identify regular languages. Information and Control, 51(1):76-87, October, 1982.
|
| |
6
|
A.W. Biermann and J.A. Feldman. On the synthesis of finite-state machines from samples of their behavior. IEEE Transactions on Computing, C-21:592-597, 1972.
|
| |
7
|
A.W. Biermann and J.A. Feldman. A survey of results in grammatical inference. In Frontiers of Pattern Recognition, Academic Press, N.Y, 1972.
|
| |
8
|
C.M. Cook, A. Rosenfeld and A.R. Aronson. Grammatical inference by hill-climbing. ISJ, 10:59-80, 1976.
|
 |
9
|
|
 |
10
|
|
| |
11
|
|
| |
12
|
John R. Ellis, Nathaniel Mishkin, Robert P. Nix, and Steven R. Wood. A BLISS programming environment. Research report No. 231, Yale University Department of Computer Science, June, 1982.
|
| |
13
|
John R. Ellis, Nathaniel Mishkin, Mary-Claire van Leunen, and Steven R. Wood. Tools: An environment for timeshared computing and programming. Research report No. 232. Yale University Department of Computer Science. To appear in Software—Practice & Experience.
|
| |
14
|
J.A. Feldman. First thoughts on grammatical inference. Research report No. 55. Stanford University Artificial Intelligence Laboratory, 1967.
|
| |
15
|
|
| |
16
|
K.S. Fu and T.L. Booth. Grammatical inference: introduction and survey, parts 1 and 2. IEEE Transactions on Systems Man and Cybernetics, SMC-5:95-111 and 409-423, 1975.
|
| |
17
|
E.M. Gold. Language identification in the limit. Information and Control, 10:447-474, 1967.
|
| |
18
|
E.M. Gold. Complexity of automaton identification from given data. Information and Control, 37:302-320, 1978.
|
| |
19
|
|
 |
20
|
|
| |
21
|
|
| |
22
|
|
 |
23
|
|
 |
24
|
|
| |
25
|
B. Knobe and K. Knobe. A method for inferring context-free grammars. Information and Control, 31:129-146, 1976.
|
 |
26
|
|
 |
27
|
|
 |
28
|
|
 |
29
|
|
| |
30
|
R.P. Nix and N.W. Mishkin. U Editor User's and Programmer's Manual. Internal memorandum of the Yale University Department of Computer Science, 1983.
|
| |
31
|
R.P. Nix. Editing by Example. PhD thesis, Yale University Department of Computer Science, 1983. Also appeared as Yale University Department of Computer Science Research Report No. 280.
|
| |
32
|
T.W. Pao and J.W. Carr III. A solution of the syntactical induction-inference problem for regular languages. Computer Languages, 3:53-64, 1978.
|
 |
33
|
|
| |
34
|
Jonathan A. Rees and Norman I. Adams IV. T User's Manual. Internal memorandum of the Yale University Department of Computer Science, 1982.
|
| |
35
|
T. Shinohara. Polynomial time inference of pattern languages and its application. In Proceedings of the 7th IBM Symposium on Mathematical Foundations of Computer Science, 1982.
|
| |
36
|
|
| |
37
|
D.C. Smith. Pygmalion: A Computer Program to Model and Stimulate Creative Thought. Ph.D. thesis, Stanford University Computer Science Department, 1975. Appeared as AIM-260, 1975, and as a book from Birkhauser Verlag, 1977.
|
 |
38
|
|
| |
39
|
Warren Teitelman. The Cedar programming environment: A midterm report and examination. Research report No. CSL-83-11, Computer Science Laboratory, Xerox Palo Alto Research Center, December, 1983 (in press).
|
| |
40
|
B.A. Trakhtenbrot and Y.M. Barzdin. Finite Automata. North-Holland, Amsterdam, 1973, pp. 98-99.
|
 |
41
|
|
| |
42
|
R.M. Wharton. Grammar enumeration and inference. Information and Control, 33:253-272, 1977.
|
 |
43
|
|
| |
44
|
M.M. Zloof, Query-by-example: a data base language. IBM Systems Journal, 16(4):324-343, 1977.
|
| |
45
|
M.M. Zloof. Office-by-example: a business language that unifies data and word processing and electronic mail. IBM Systems Journal, 21(3):272-304, 1982.
|
|