Abstract
Feature selection is an important step in many classification tasks. Its aim is to reduce the number of features while maintaining or even improving the performance of the classifier. The selection methods described in the literature have limitations at several levels: some are too complex to run in reasonable time, some depend too heavily on the classifier used for evaluation, and others overlook interactions between features. To limit these drawbacks, we propose a fast feature selection method based on a genetic algorithm. Each feature is associated with a single-feature classifier. The weak classifiers we consider have several degrees of freedom and are optimized on the training dataset. Within the genetic algorithm, the individuals, which are subsets of classifiers, are evaluated by a fitness function based on a combination of single-feature classifiers. Several combination operators are compared. The whole method is implemented, and extensive trials are performed on four databases built from the MNIST handwritten digits database using four different descriptors. The results show the robustness and efficiency of the approach. On average, the selected feature set is about 70% smaller than the initial set, while the recognition rate is maintained.
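The general idea summarized in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes a binary problem, uses one-threshold decision stumps as the single-feature classifiers, majority voting as the only combination operator, and a hypothetical size penalty in the fitness function. All function names and parameter values are illustrative. Each stump is trained once, so evaluating a candidate subset only requires combining cached predictions, which is where the speed of such a scheme would come from.

```python
import random

def train_stump(values, labels):
    """Fit a one-feature threshold classifier for labels in {0, 1}.
    Returns (threshold, polarity); predicts 1 when polarity * (x - threshold) > 0."""
    best = (0.0, 1, -1.0)  # threshold, polarity, training accuracy
    for t in sorted(set(values)):
        for pol in (1, -1):
            hits = sum((1 if pol * (x - t) > 0 else 0) == y
                       for x, y in zip(values, labels))
            acc = hits / len(labels)
            if acc > best[2]:
                best = (t, pol, acc)
    return best[0], best[1]

def vote_accuracy(mask, preds, labels):
    """Accuracy of a majority vote over the stumps selected by the bit mask."""
    chosen = [p for bit, p in zip(mask, preds) if bit]
    if not chosen:
        return 0.0
    correct = 0
    for i, y in enumerate(labels):
        votes = sum(p[i] for p in chosen)
        guess = 1 if 2 * votes > len(chosen) else 0
        correct += guess == y
    return correct / len(labels)

def fitness(mask, preds, labels, size_penalty=0.01):
    """Assumed fitness: vote accuracy minus a small penalty per selected feature."""
    return vote_accuracy(mask, preds, labels) - size_penalty * sum(mask)

def select_features(X, y, pop_size=20, generations=30, p_mut=0.05, seed=0):
    """Genetic search over feature subsets encoded as bit masks (needs >= 2 features)."""
    rng = random.Random(seed)
    n_feat = len(X[0])
    # Train every single-feature classifier once and cache its predictions.
    preds = []
    for j in range(n_feat):
        col = [row[j] for row in X]
        t, pol = train_stump(col, y)
        preds.append([1 if pol * (x - t) > 0 else 0 for x in col])

    def score(mask):
        return fitness(mask, preds, y)

    pop = [[rng.randint(0, 1) for _ in range(n_feat)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        nxt = [pop[0][:]]  # elitism: carry the best individual over unchanged
        while len(nxt) < pop_size:
            a = max(rng.sample(pop, 3), key=score)  # tournament selection
            b = max(rng.sample(pop, 3), key=score)
            cut = rng.randrange(1, n_feat)          # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ (rng.random() < p_mut) for bit in child]  # bit-flip mutation
            nxt.append(child)
        pop = nxt
    return max(pop, key=score)
```

Calling `select_features(X, y)` on a list of feature rows `X` and binary labels `y` returns a 0/1 mask over the features. A multi-class task such as digit recognition would need multi-class single-feature classifiers and a different combination rule.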
Copyright information
© 2014 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Chouaib, H., Cloppet, F., Vincent, N. (2014). Combination of Single Feature Classifiers for Fast Feature Selection. In: Guillet, F., Pinaud, B., Venturini, G., Zighed, D. (eds) Advances in Knowledge Discovery and Management. Studies in Computational Intelligence, vol 527. Springer, Cham. https://doi.org/10.1007/978-3-319-02999-3_7
DOI: https://doi.org/10.1007/978-3-319-02999-3_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-02998-6
Online ISBN: 978-3-319-02999-3
eBook Packages: Engineering, Engineering (R0)