Abstract
Feature weighting in supervised learning concerns the development of methods for quantifying how well individual features discriminate instances from different classes. A popular method for this task, called RELIEF, generates a feature weight vector from a given training set, one weight per feature. It does so by greedily maximizing the sample margin of the nearest-neighbor classifier. The contribution of each class to this margin maximization defines a set of class-dependent feature weight vectors, one per class. This provides a tool for unraveling interesting properties of features relevant to a single class of interest.
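The RELIEF update described above can be sketched in a few lines. The following is a minimal illustration, not the paper's exact algorithm: it uses L1 distances, assumes features are already scaled to comparable ranges, and attributes each margin contribution to the class of the instance being updated as one plausible reading of the class-dependent decomposition.

```python
import numpy as np

def relief_weights(X, y):
    """Minimal RELIEF sketch: for each instance, reward features that
    separate it from its nearest miss (different class) and penalize
    features that separate it from its nearest hit (same class).
    Also accumulates an illustrative per-class split of the updates."""
    n, d = X.shape
    w = np.zeros(d)
    per_class = {c: np.zeros(d) for c in np.unique(y)}
    for i in range(n):
        dist = np.abs(X - X[i]).sum(axis=1)  # L1 distance to every instance
        dist[i] = np.inf                     # exclude the instance itself
        same = (y == y[i])
        hit = np.argmin(np.where(same, dist, np.inf))   # nearest same-class
        miss = np.argmin(np.where(~same, dist, np.inf)) # nearest other-class
        update = np.abs(X[i] - X[miss]) - np.abs(X[i] - X[hit])
        w += update
        per_class[y[i]] += update  # contribution attributed to the class of x_i
    return w / n, {c: v / n for c, v in per_class.items()}
```

By construction, the class-dependent vectors sum to the overall RELIEF-style weight vector, so comparing one class's vector against the total mirrors the correlation analysis discussed in the abstract.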
In this paper we analyze such class-dependent feature weight vectors. For instance, we show that in a machine learning dataset describing instances of recurrence and non-recurrence events in breast cancer, the features have different relevance in the two types of events, with tumor size estimated to be highly relevant in the recurrence class but not in the non-recurrence one. Furthermore, experimental results show that a high correlation between the feature weights of one class and those generated by RELIEF corresponds to an easier classification task.
In general, the results of this investigation indicate that class-dependent feature weights are useful for unraveling interesting properties of features with respect to a class of interest, and that they provide information on the relative difficulty of classification tasks.
© 2013 Springer-Verlag Berlin Heidelberg
Cite this paper
Marchiori, E. (2013). Class Dependent Feature Weighting and K-Nearest Neighbor Classification. In: Ngom, A., Formenti, E., Hao, JK., Zhao, XM., van Laarhoven, T. (eds) Pattern Recognition in Bioinformatics. PRIB 2013. Lecture Notes in Computer Science(), vol 7986. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39159-0_7
Print ISBN: 978-3-642-39158-3
Online ISBN: 978-3-642-39159-0