Abstract
Raw data used in data mining often contain missing values, which inevitably degrade the quality of the derived knowledge. This paper proposes a new method for estimating missing attribute values. The method selects attributes one at a time using the mutual information of an attribute group, calculated by flattening the already selected attributes. As each new attribute is added, its missing values are filled in by generating a decision tree, so the previously filled-in values are naturally utilized. This ordered estimation of missing values is compared with several conventional methods, including Lobo’s ordered estimation, which uses a static ranking of attributes. Experimental results show that the method yields good recognition ratios in almost all domains with many missing values.
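The procedure summarized above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. "Flattening" is taken here to mean treating the selected attributes plus one candidate as a single compound attribute whose value is the tuple of member values. Rows where the candidate is missing are skipped when scoring. In place of the decision tree, the sketch uses a much simpler stand-in: the most common value among rows that share the class and the already selected attribute values. All names (`MISSING`, `group_mi`, `fill_attribute`, `ordered_imputation`) and the toy data are hypothetical.

```python
from collections import Counter
from math import log2

MISSING = None  # hypothetical marker for a missing attribute value


def mutual_information(xs, ys):
    """I(X;Y) in bits, estimated from paired samples of discrete variables."""
    n = len(xs)
    if n == 0:
        return 0.0
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * log2(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())


def group_mi(rows, labels, selected, candidate):
    """MI between the class and the flattened group selected + [candidate].

    Assumption: rows where the candidate is missing are ignored.
    """
    group = selected + [candidate]
    pairs = [(tuple(r[a] for a in group), y)
             for r, y in zip(rows, labels) if r[candidate] is not MISSING]
    return mutual_information([p[0] for p in pairs], [p[1] for p in pairs])


def fill_attribute(rows, labels, selected, target):
    """Fill missing values of `target` in place.

    Stand-in for the decision tree: vote among rows with the same class and
    the same values on the selected attributes, backing off to a shorter
    context (and finally to the overall mode) when no row matches.
    """
    known = [(r, y) for r, y in zip(rows, labels) if r[target] is not MISSING]
    if not known:
        return  # no observed value to learn from
    overall = Counter(r[target] for r, _ in known).most_common(1)[0][0]
    for r, y in zip(rows, labels):
        if r[target] is not MISSING:
            continue
        guess = overall
        for k in range(len(selected), -1, -1):
            context = selected[:k]
            votes = Counter(kr[target] for kr, ky in known
                            if ky == y and all(kr[a] == r[a] for a in context))
            if votes:
                guess = votes.most_common(1)[0][0]
                break
        r[target] = guess


def ordered_imputation(rows, labels):
    """Greedily order attributes by group MI, filling each one when chosen."""
    rows = [list(r) for r in rows]  # work on a copy
    remaining = list(range(len(rows[0])))
    selected = []
    while remaining:
        best = max(remaining,
                   key=lambda a: group_mi(rows, labels, selected, a))
        fill_attribute(rows, labels, selected, best)
        selected.append(best)
        remaining.remove(best)
    return selected, rows


# Toy data: two attributes, two classes, two missing values.
rows = [["a", "x"], ["a", "x"], ["b", "y"],
        ["b", MISSING], [MISSING, "x"], ["b", "y"]]
labels = [0, 0, 1, 1, 0, 1]
order, filled = ordered_imputation(rows, labels)
print(order, filled)
```

Because each attribute is filled as soon as it is selected, later scoring and filling steps see complete values for every earlier attribute, which is the sense in which the ordering is dynamic rather than a fixed ranking computed once.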
This work was supported by Korea Science Foundation under contract 97-01-02-04-01-3.
References
B. Cestnik et al., “Assistant-86: A Knowledge-elicitation Tool for Sophisticated Users,” Progress in Machine Learning, Sigma Press, UK, 1987.
I. Kononenko and E. Roscar, “Experiments in Automatic Learning of Medical Diagnostic Rules,” Technical Report, Jozef Stefan Institute, Yugoslavia, 1984.
K.C. Lee, “A Technique of Dynamic Feature Selection Using the FGMI,” Lecture Notes in Artificial Intelligence 1574, pp. 138–142, Springer, 1999.
O.O. Lobo and M. Numao, “Ordered Estimation of Missing Values,” Lecture Notes in Artificial Intelligence 1574, pp. 499–503, Springer, 1999.
J.R. Quinlan, “Unknown Attribute Values,” C4.5: Programs for Machine Learning, pp. 27–32, Morgan Kaufmann, 1993.
J.R. Quinlan, “Induction of Decision Trees,” Machine Learning, vol. 1, pp. 81–106, 1986.
J.R. Quinlan, “Unknown Attribute Values in Induction,” Proc. of the 6th International Machine Learning Workshop, pp. 164–168, Morgan Kaufmann, 1989.
UCI Machine Learning Repository, Univ. of California, Dept. of Info. and Computer Science, Irvine, CA, 1998. http://www.ics.uci.edu/~mlearn/MLRepository.html
© 2000 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lee, K.C., Park, J.S., Kim, Y.S., Byun, Y.T. (2000). Missing Value Estimation Based on Dynamic Attribute Selection. In: Terano, T., Liu, H., Chen, A.L.P. (eds) Knowledge Discovery and Data Mining. Current Issues and New Applications. PAKDD 2000. Lecture Notes in Computer Science, vol 1805. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45571-X_15
DOI: https://doi.org/10.1007/3-540-45571-X_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67382-8
Online ISBN: 978-3-540-45571-4
eBook Packages: Springer Book Archive