Abstract
Selecting features and instances before classification is an important preprocessing task that can lead to large improvements in both classifier accuracy and classifier speed. However, few papers address simultaneous or combined instance and feature selection for Nearest Neighbor classifiers in a deterministic way. This paper proposes a novel deterministic feature and instance selection algorithm that uses the recently introduced Minimum Neighborhood Rough Sets as the basis for the selection process. The algorithm relies on a metadata computation to guide instance selection. It handles mixed and incomplete data and arbitrary dissimilarity functions. Numerical experiments on repository databases compare the proposal with previous methods and with the classifier using the original sample. These experiments show that the proposal performs well in terms of classifier accuracy and of instance and feature reduction.
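To make the setting concrete, the following is a minimal sketch of how a Nearest Neighbor classifier can consume a selected subset of instances and features over mixed, incomplete data. It is not the selection algorithm proposed in the paper: the names `heom` and `nn_classify`, the toy data, and the HEOM-style (Heterogeneous Euclidean-Overlap Metric) dissimilarity are assumptions chosen only for illustration, and any other dissimilarity function could be plugged in.

```python
from math import sqrt


def heom(x, y, ranges, features):
    """HEOM-style dissimilarity restricted to the given feature indices.

    ranges[j] is the (positive) value range of numeric feature j, or None
    when feature j is categorical. None inside x or y marks a missing value.
    """
    total = 0.0
    for j in features:
        a, b = x[j], y[j]
        if a is None or b is None:
            d = 1.0  # a missing value counts as maximally different
        elif ranges[j] is None:
            d = 0.0 if a == b else 1.0  # overlap measure for categories
        else:
            d = abs(a - b) / ranges[j]  # range-normalized difference
        total += d * d
    return sqrt(total)


def nn_classify(query, instances, labels, kept_instances, kept_features,
                dissimilarity):
    """Label the query with the class of its nearest retained instance."""
    nearest = min(
        kept_instances,
        key=lambda i: dissimilarity(query, instances[i], kept_features),
    )
    return labels[nearest]


# Hypothetical toy sample: (numeric, categorical, numeric), None = missing.
instances = [
    (1.0, "red", 5.0),
    (2.0, "red", None),
    (9.0, "blue", 1.0),
    (8.0, None, 2.0),
]
labels = ["a", "a", "b", "b"]
ranges = [8.0, None, 4.0]


def dissim(x, y, features):
    return heom(x, y, ranges, features)


# Assume some selection step kept instances 0 and 2 and features 0 and 1.
predicted = nn_classify((7.5, "blue", None), instances, labels,
                        kept_instances=[0, 2], kept_features=[0, 1],
                        dissimilarity=dissim)
print(predicted)
```

With only two of the four instances and two of the three features retained, the query is still assigned to class `b`, which illustrates why reducing both dimensions can cut classification cost without necessarily hurting accuracy.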
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Villuendas-Rey, Y., Caballero-Mota, Y., García-Lorenzo, M.M. (2013). Intelligent Feature and Instance Selection to Improve Nearest Neighbor Classifiers. In: Batyrshin, I., González Mendoza, M. (eds) Advances in Artificial Intelligence. MICAI 2012. Lecture Notes in Computer Science, vol 7629. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37807-2_3
DOI: https://doi.org/10.1007/978-3-642-37807-2_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37806-5
Online ISBN: 978-3-642-37807-2
eBook Packages: Computer Science (R0)