Abstract
This paper is part of the development of an automated indexing system for the CISMeF quality-controlled health gateway. For disambiguation purposes, we wish to perform text categorization prior to indexing. This calls for a global approach, in contrast to the classical analytical methods based on counts of keywords extracted from the text. Statistical compression models allow us to categorize resources without extracting keywords at this stage. Preliminary results show that although this method is not as precise as others in terms of resource categorization, it can significantly benefit indexing.
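As an illustration of the general idea of compression-based categorization, here is a minimal Python sketch. It is not the system described in the paper: the abstract does not specify the compression model, so the standard-library `zlib` compressor is used as a stand-in, and the function names, category labels, and training texts are all hypothetical. A document is assigned to the category whose training text lets it be compressed most cheaply, with no keyword extraction step.

```python
import zlib
from typing import Mapping


def compressed_size(text: str) -> int:
    """Number of bytes zlib needs to encode the text at maximum compression."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def extra_cost(corpus: str, document: str) -> int:
    """Extra bytes needed to encode the document once the corpus is known.

    A small value means the document shares many character sequences
    with the corpus, so the compressor can reuse what it has seen.
    """
    return compressed_size(corpus + " " + document) - compressed_size(corpus)


def categorize(document: str, training: Mapping[str, str]) -> str:
    """Return the category whose training text compresses the document best."""
    return min(training, key=lambda label: extra_cost(training[label], document))


if __name__ == "__main__":
    # Hypothetical toy training texts, one per category (assumption for the demo).
    training = {
        "cardiology": (
            "the patient presented with chest pain and an abnormal "
            "electrocardiogram suggesting myocardial infarction"
        ),
        "dermatology": (
            "itchy red skin lesions, eczema flare-ups, psoriasis plaques "
            "on elbows; topical ointment"
        ),
    }
    print(categorize("chest pain and an abnormal electrocardiogram", training))
```

A general-purpose compressor such as `zlib` is a crude substitute for a dedicated statistical language model, and on very short texts the size differences can be dominated by format overhead, so this sketch only conveys the principle.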
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Rogozan, A., Néveol, A., Darmoni, S.J. (2003). Text Categorization prior to Indexing for the CISMEF Health Catalogue. In: Dojat, M., Keravnou, E.T., Barahona, P. (eds) Artificial Intelligence in Medicine. AIME 2003. Lecture Notes in Computer Science, vol 2780. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39907-0_12
DOI: https://doi.org/10.1007/978-3-540-39907-0_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20129-8
Online ISBN: 978-3-540-39907-0
eBook Packages: Springer Book Archive