Abstract
The effectiveness of stemming algorithms has usually been measured in terms of their effect on retrieval performance with test collections. This, however, provides no insight that might help in stemmer optimisation. This paper describes a method in which stemming performance is assessed against predefined concept groups in samples of words. This enables various indices of stemming performance and weight to be computed. Results are reported for three stemming algorithms. The validity and usefulness of the approach, and the problems of conceptual grouping, are discussed, and directions for further research are identified.
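The abstract does not state how the indices are defined, so the following is only an illustrative sketch of the general idea of scoring a stemmer against concept groups. It assumes a pairwise formulation: an understemming index is the fraction of same-group word pairs the stemmer fails to merge, an overstemming index is the fraction of different-group word pairs it wrongly merges, and "weight" is their ratio. The function name `stemming_indices`, the toy truncation stemmer, and the sample groups are all hypothetical and are not taken from the paper.

```python
from itertools import combinations


def stemming_indices(concept_groups, stem):
    """Score a stemmer against hand-made concept groups.

    concept_groups: list of lists of words; each inner list holds words
        judged to share one concept. Assumes each word occurs in one group.
    stem: function mapping a word to its stem.
    Returns (understemming, overstemming, weight); weight is None when the
    understemming index is zero and the ratio is undefined.
    """
    # Record which concept group each word belongs to.
    group_of = {w: g for g, words in enumerate(concept_groups) for w in words}

    same_pairs = missed = 0   # pairs that should merge / were left apart
    diff_pairs = wrong = 0    # pairs that should stay apart / were merged
    for a, b in combinations(group_of, 2):
        merged = stem(a) == stem(b)
        if group_of[a] == group_of[b]:
            same_pairs += 1
            missed += not merged
        else:
            diff_pairs += 1
            wrong += merged

    ui = missed / same_pairs if same_pairs else 0.0
    oi = wrong / diff_pairs if diff_pairs else 0.0
    weight = oi / ui if ui else None
    return ui, oi, weight


# Hypothetical sample: four concept groups and a crude truncation "stemmer".
groups = [
    ["connect", "connected", "connection"],
    ["general", "generally"],
    ["generous", "generosity"],
    ["run", "running"],
]
ui, oi, weight = stemming_indices(groups, lambda w: w[:5])
print(ui, oi, weight)
```

In this toy sample the truncation stemmer leaves "run" and "running" apart (an understemming error) and merges the "general" and "generous" groups (overstemming errors). A higher weight under this assumed definition indicates a more aggressive stemmer.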
© 1994 Springer-Verlag London Limited
Cite this paper
Paice, C.D. (1994). An Evaluation Method for Stemming Algorithms. In: Croft, B.W., van Rijsbergen, C.J. (eds) SIGIR ’94. Springer, London. https://doi.org/10.1007/978-1-4471-2099-5_5
DOI: https://doi.org/10.1007/978-1-4471-2099-5_5
Publisher Name: Springer, London
Print ISBN: 978-3-540-19889-5
Online ISBN: 978-1-4471-2099-5
eBook Packages: Springer Book Archive