Abstract
A new model for multiple errors spelling correction is proposed. The model handles insert, delete, change, and transpose errors. In the new model, we put constraints on possible editing sequences to reflect the error occurrence phenomenon in spelling, resulting in an error measure different from the traditional editing distance error measure. Properties of the “error distance matrix” between two character strings are studied under the assumptions of the new model. A cut-off criterion has been discovered, which can detect whether the error distance between two character strings is greater than a prespecified value during the calculation. Based on this cut-off criterion, a fast algorithm has been developed to find the nearest neighbors of a given character string in a dictionary. Experiments have been conducted with results showing that the cut-off criterion can greatly cut down the computation time needed for the nearest neighbor searching.
Article PDF
Similar content being viewed by others
Avoid common mistakes on your manuscript.
References
Blair, C.R.: A program for correcting spelling errors. Inf. Control3, 60–67 (1960)
Bourne, C.P.: Frequency and impact of spelling errors in bibliographic data bases. Inf. Process. Manage.13(1), 1–12 (1977)
Spelling correction program for micros. Comput. Weekly 7 (July 3, 1980)
Du, M.W., Chang, S.C.: A new approach to shortest editing sequence and longest common subsequence problems. Tech. Rep. GTE Labs. TR-047-06-89-500 (June 1989)
Du, M.W., Chang, S.C.: A model and a fast algorithm for multiple errors spelling correction. Tech. Rep. GTE Labs. TR-065-09-89-500 (Sept. 1989)
Damerau, F.J.: A technique for computer detection and correction of spelling errors. Commun. ACM7(3), 171–176 (1964)
Galil, Z., Giancarlo, R.: Data structures and algorithms for approximate string matching. J. Complexity4, 33–72 (1988)
Hall, P.A.V., Dowling, G.R.: Approximate string matching. ACM Comput. Surv.12(4), 381–402 (1980)
Litecky, C.R., Davis, G.B.: A study of errors, error-proneness, and error diagnosis in COBOL. Commun. ACM19(1), 33–37 (1976)
Lowrance, R., Wagner, R.A.: An extension of the string-to-string correction problem. J. ACM22(2), 177–183 (1975)
Morgan, H.L.: Spelling correction in systems programs. Commun. ACM13(2), 90–94 (1970)
Mukhopadhyay, A.: A fast algorithm for the longest-common-subsequence problem. Inf. Sci.20, 69–82 (1980)
Muth, F.E., Jr., Tharp, A.L.: Correcting human error in alphanumeric terminal input. Inf. Process. Manage.13(6), 329–337 (1977)
Wagner, R.A., Fischer, M.J.: The string-to-string correction problem. J. ACM21(1), 168–173 (1974)
Author information
Authors and Affiliations
Rights and permissions
About this article
Cite this article
Du, M.W., Chang, S.C. A model and a fast algorithm for multiple errors spelling correction. Acta Informatica 29, 281–302 (1992). https://doi.org/10.1007/BF01185682
Received:
Issue Date:
DOI: https://doi.org/10.1007/BF01185682