ABSTRACT
Research aimed at correcting words in text has focused on three progressively more difficult problems: (1) nonword error detection; (2) isolated-word error correction; and (3) context-dependent word correction. In response to the first problem, efficient pattern-matching and n-gram analysis techniques have been developed for detecting strings that do not appear in a given word list. In response to the second problem, a variety of general and application-specific spelling correction techniques have been developed. Some of them were based on detailed studies of spelling error patterns. In response to the third problem, a few experiments using natural-language-processing tools or statistical-language models have been carried out. This article surveys documented findings on spelling error patterns, provides descriptions of various nonword detection and isolated-word error correction techniques, reviews the state of the art of context-dependent word correction techniques, and discusses research issues related to all three areas of automatic error correction in text.
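The first two problems named in the abstract can be illustrated with a minimal Python sketch. It is not taken from the article: the tiny `WORD_LIST`, the function names, and the `max_distance` threshold are all assumptions chosen for illustration, and plain Levenshtein distance stands in for the many correction techniques the survey covers.

```python
# Hypothetical toy lexicon; a real system would load a large word list.
WORD_LIST = {"spelling", "correction", "word", "form", "from", "text"}


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insertions, deletions, substitutions cost 1 each."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def is_nonword(token: str, word_list=WORD_LIST) -> bool:
    """Problem 1: flag any string that does not appear in the word list."""
    return token.lower() not in word_list


def correct_isolated(token: str, word_list=WORD_LIST, max_distance: int = 2):
    """Problem 2: rank lexicon entries by distance to the token, ignoring context."""
    scored = sorted((edit_distance(token.lower(), w), w) for w in word_list)
    return [w for d, w in scored if d <= max_distance]


if __name__ == "__main__":
    print(is_nonword("speling"))        # flagged: not in the word list
    print(correct_isolated("speling"))  # candidate corrections, closest first
    print(is_nonword("form"))           # a real word is never flagged
```

The last call hints at why the third problem is harder: if a writer types "form" where "from" was intended, the result is a valid word, so a word-list lookup cannot detect the error and some model of the surrounding context is needed.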
INDEX TERMS
Primary Classification:
I.
Computing Methodologies
I.2
ARTIFICIAL INTELLIGENCE
I.2.7
Natural Language Processing
Subjects:
Text analysis
Additional Classification:
I.
Computing Methodologies
I.2
ARTIFICIAL INTELLIGENCE
I.2.7
Natural Language Processing
Subjects:
Language models;
Language parsing and understanding
I.5
PATTERN RECOGNITION
I.5.1
Models
Subjects:
Statistical;
Neural nets
I.7
DOCUMENT AND TEXT PROCESSING
I.7.1
Document and Text Editing
Subjects:
Spelling**
General Terms:
Algorithms,
Experimentation,
Human Factors,
Performance,
Theory
Keywords:
n-gram analysis,
Optical Character Recognition (OCR),
context-dependent spelling correction,
grammar checking,
natural-language-processing models,
neural net classifiers,
spell checking,
spelling error detection,
spelling error patterns,
statistical-language models,
word recognition and correction
REVIEW
"Graeme J. Hirst : Reviewer"
It is often easy to tell when a poor speller or poor typist has
used a spelling checker on a document: each word is correctly spelled,
but not all are the words that the author intended. And optical
character recognition of documents, with its ...