Abstract
In this paper, a novel feature selection method based on a normalization of the well-known mutual information measure is presented. Our method is derived from an existing approach, the max-relevance and min-redundancy (mRMR) approach. We propose, however, to normalize the mutual information used in the method so that neither the relevance term nor the redundancy term can dominate the selection criterion. We employ several commonly used recognition models, including the Support Vector Machine (SVM), k-Nearest-Neighbor (kNN), and Linear Discriminant Analysis (LDA), to compare our algorithm with the original mRMR and with a recently improved version of it, the Normalized Mutual Information Feature Selection (NMIFS) algorithm. To avoid data-specific conclusions, we conduct our classification experiments on various datasets from the UCI machine learning repository. The results confirm that, with regard to classification accuracy, our feature selection method is more robust than the others.
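The selection scheme the abstract describes can be sketched as a greedy loop: at each step, pick the feature whose normalized mutual information with the class is high (relevance) and whose average normalized mutual information with the already-selected features is low (redundancy). The sketch below is illustrative only; the specific normalization (dividing by the smaller marginal entropy) and the function names are assumptions, not the paper's exact formulation.

```python
import numpy as np
from collections import Counter

def entropy(x):
    """Shannon entropy (in nats) of a discrete sequence."""
    counts = np.array(list(Counter(x).values()), dtype=float)
    p = counts / counts.sum()
    return -np.sum(p * np.log(p))

def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for discrete sequences."""
    joint = list(zip(x, y))
    return entropy(x) + entropy(y) - entropy(joint)

def normalized_mi(x, y):
    """MI divided by the smaller marginal entropy, bounding the score
    in [0, 1]. This is one common normalization; the paper defines
    its own, so treat this choice as a placeholder."""
    denom = min(entropy(x), entropy(y))
    return mutual_information(x, y) / denom if denom > 0 else 0.0

def nmi_mrmr(X, y, k):
    """Greedy max-relevance / min-redundancy selection using
    normalized MI. X: (n_samples, n_features) array of discrete
    values; y: class labels; returns k selected column indices."""
    n_features = X.shape[1]
    selected, remaining = [], list(range(n_features))
    while len(selected) < k and remaining:
        best, best_score = None, -np.inf
        for f in remaining:
            relevance = normalized_mi(X[:, f], y)
            redundancy = (np.mean([normalized_mi(X[:, f], X[:, s])
                                   for s in selected])
                          if selected else 0.0)
            score = relevance - redundancy
            if score > best_score:
                best, best_score = f, score
        selected.append(best)
        remaining.remove(best)
    return selected
```

Because the redundancy term is normalized to the same [0, 1] scale as the relevance term, a near-duplicate of an already-selected feature is penalized by roughly as much as its relevance contributes, which is the balancing effect the abstract attributes to normalization.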
References
Asuncion A, Newman DJ (2007) UCI machine learning repository. University of California, Irvine, School of Information and Computer Sciences. http://www.ics.uci.edu/~mlearn/MLRepository.html
Battiti R (1994) Using mutual information for selecting features in supervised neural net learning. IEEE Trans Neural Netw 5(4):537–550
Bhanu B, Lin Y (2003) Genetic algorithm based feature selection for target detection in SAR images. Image Vis Comput 21(7):591–608
Cawley GC, Talbot NLC, Girolami M (2007) Sparse multinomial logistic regression via Bayesian l1 regularisation. Adv Neural Inf Process Syst 19:209–216
Chang T-W, Huang Y-P, Sandnes FE (2009) Efficient entropy-based features selection for image retrieval. In: Proceedings of the 2009 IEEE international conference on systems, man and cybernetics, pp 2941–2946
Dasgupta A, Drineas P, Harb B, Josifovski V, Mahoney MW (2007) Feature selection methods for text classification. In: Proceedings of the 13th ACM SIGKDD international conference on knowledge discovery and data mining, pp 230–239
Dash M, Liu H (1997) Feature selection for classification. Intell Data Anal 1:131–156
Dimililer N, Varoglu E, Altinçay H (2009) Classifier subset selection for biomedical named entity recognition. Appl Intell 31:267–282
Dy JG, Brodley CE, Kak A, Broderick LS, Aisen AM (2003) Unsupervised feature selection applied to content-based retrieval of lung images. IEEE Trans Pattern Anal Mach Intell 25(3):373–378
Estévez PA, Tesmer M, Perez CA, Zurada JM (2009) Normalized mutual information feature selection. IEEE Trans Neural Netw 20(2):189–201
Fodor IK (2002) A survey of dimension reduction techniques. Technical report, Center for Applied Scientific Computing, Lawrence Livermore National Laboratory
Forman G (2003) An extensive empirical study of feature selection metrics for text classification. J Mach Learn Res 3:1289–1305
Goulden CH (1956) Methods of statistical analysis, 2nd edn. Wiley, New York
Guyon I, Elisseeff A (2003) An introduction to variable and feature selection. J Mach Learn Res 3:1157–1182
Hall MA (1999) Correlation-based feature selection for machine learning. PhD thesis, The University of Waikato
Hall M, Frank E, Holmes G, Pfahringer B, Reutemann P, Witten IH (2009) The WEKA data mining software: an update. SIGKDD Explor 11(1):10–18
Kamimura R (2011) Structural enhanced information and its application to improved visualization of self-organizing maps. Appl Intell 34:102–115
Khor K-C, Ting C-Y, Amnuaisuk S-P (2009) A feature selection approach for network intrusion detection. In: Proceedings of the 2009 international conference on information management and engineering, pp 133–137
Kwak N, Choi C-H (2002) Input feature selection for classification problems. IEEE Trans Neural Netw 13(1):143–159
Li Y, Zeng X (2010) Sequential multi-criteria feature selection algorithm based on agent genetic algorithm. Appl Intell 33:117–131
Narendra PM, Fukunaga K (1977) A branch and bound algorithm for feature subset selection. IEEE Trans Comput 26(9):917–922
Oh I-S, Lee J-S, Moon B-R (2004) Hybrid genetic algorithms for feature selection. IEEE Trans Pattern Anal Mach Intell 26(11):1424–1437
Peng H, Long F, Ding C (2005) Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy. IEEE Trans Pattern Anal Mach Intell 27(8):1226–1238
Saeys Y, Inza I, Larrañaga P (2007) A review of feature selection techniques in bioinformatics. Bioinformatics 23(19):2507–2517
Shen K-Q, Ong C-J, Li X-P (2008) Novel multi-class feature selection methods using sensitivity analysis of posterior probabilities. In: Proceedings of the IEEE international conference on systems, man and cybernetics, pp 1116–1121
Shie J-D, Chen S-M (2008) Feature subset selection based on fuzzy entropy measures for handling classification problems. Appl Intell 28:69–82
Tsang C-H, Kwong S, Wang H (2007) Genetic-fuzzy rule mining approach and evaluation of feature selection techniques for anomaly intrusion detection. Pattern Recognit 40(9):2373–2391
Vinh LT, Thang ND, Lee Y-K (2010) An improved maximum relevance and minimum redundancy feature selection algorithm based on normalized mutual information. In: Proceedings of the 10th IEEE/IPSJ international symposium on applications and the Internet, pp 395–398
Xia H, Hu BQ (2006) Feature selection using fuzzy support vector machines. Fuzzy Optim Decis Mak 5(2):187–192
Yan R (2006) MatlabArsenal toolbox for classification algorithms. Informedia School of Computer Science, Carnegie Mellon University
Yang HH, Moody J (1999) Data visualization and feature selection: New algorithms for nongaussian data. In: Advances in neural information processing systems. MIT Press, Cambridge, pp 687–693
Yu L, Liu H (2004) Redundancy based feature selection for microarray data. In: Proceedings of the 10th ACM SIGKDD international conference on knowledge discovery and data mining, pp 737–742
Yuan G-X, Chang K-W, Hsieh C-J, Lin C-J (2010) A comparison of optimization methods and software for large-scale l1-regularized linear classification. J Mach Learn Res 11:3183–3234
Zhao Z, Morstatter F, Sharma S, Alelyani S, Anand A, Liu H (2010) Advancing feature selection research—ASU feature selection repository. Technical report, School of Computing, Informatics, and Decision Systems Engineering, Arizona State University
Cite this article
Vinh, L.T., Lee, S., Park, YT. et al. A novel feature selection method based on normalized mutual information. Appl Intell 37, 100–120 (2012). https://doi.org/10.1007/s10489-011-0315-y