Abstract
In epidemiological cancer registries, extensive data cleansing is unavoidable because the reporting organizations, such as hospitals, oncological centers, and physicians, are heterogeneous. Data cleansing ensures data quality, which is essential for epidemiological analysis and interpretation. Using the software tool CARELIS as an example, this paper shows that the process of data cleansing can be automated to a considerable extent. CARELIS integrates a probabilistic record linkage method with knowledge-based techniques to minimize the manual post-processing of notifications transmitted to the registry. The paper focuses on the experiences of CARELIS users, who are usually medical documentation assistants.
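To make the idea of probabilistic record linkage concrete, the following minimal sketch scores a pair of notifications by summing per-field log-likelihood ratios and then sorts the pair into one of three classes, the middle one being left for manual review. It is an illustration only, not the CARELIS implementation: the field names, the m/u probabilities, the thresholds, and the functions `match_weight` and `classify` are all hypothetical.

```python
import math

# Hypothetical per-field probabilities (illustrative values, not from the paper):
#   m = P(field agrees | both records refer to the same person)
#   u = P(field agrees | records refer to different persons)
FIELD_PARAMS = {
    "surname": (0.95, 0.01),
    "birth_date": (0.98, 0.003),
    "sex": (0.99, 0.5),
}


def match_weight(rec_a, rec_b, params=FIELD_PARAMS):
    """Sum log2 likelihood ratios over all compared fields."""
    total = 0.0
    for field, (m, u) in params.items():
        if rec_a.get(field) == rec_b.get(field):
            total += math.log2(m / u)              # agreement adds weight
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement subtracts weight
    return total


def classify(weight, lower=0.0, upper=10.0):
    """Three-way decision; both thresholds are placeholder assumptions."""
    if weight >= upper:
        return "match"
    if weight <= lower:
        return "non-match"
    return "manual review"
```

With these assumed values, two records that agree on every field are classified as a match, while a pair that differs only in the surname falls between the thresholds and is routed to manual review. Knowledge-based rules of the kind the abstract mentions would aim to shrink that middle band.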
Copyright information
© 1999 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hinrichs, H., Panienski, K. (1999). Experiences with Knowledge-Based Data Cleansing at the Epidemiological Cancer Registry of Lower-Saxony. In: Puppe, F. (ed.) XPS-99: Knowledge-Based Systems. Survey and Future Directions. XPS 1999. Lecture Notes in Computer Science, vol 1570. Springer, Berlin, Heidelberg. https://doi.org/10.1007/10703016_18
DOI: https://doi.org/10.1007/10703016_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65658-6
Online ISBN: 978-3-540-49149-1
eBook Packages: Springer Book Archive