Abstract
It is common for a user to receive hundreds of emails every day. Almost 93% of them are spam, consisting mainly of advertisements for software, gambling, stocks, electronics, pharmaceuticals, and loans, as well as phishing and malware attempts. Spam messages not only waste the user's time but also consume valuable storage space. In this paper, a nature-inspired metaheuristic technique is used for email classification, with an emphasis on reducing the false-positive problem of treating spam messages as ham. The approach uses metaheuristics-based feature selection and employs an extra-tree classifier to classify emails as spam or ham. The proposed model achieves an accuracy of 95.5%, a specificity of 93.7%, and an F1-score of 96.3%, which the authors report as an improvement over previous research in this field that used decision trees. A comparative analysis of the extra-tree classifier against other classifiers, such as decision trees and random forests, is also presented.
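The three reported figures (accuracy, specificity, and F1-score) can all be derived from a confusion matrix. The sketch below is illustrative only and is not taken from the paper: it assumes that ham is the positive class, which is inferred from the abstract's description of a false positive as spam treated as ham. The function names and the toy labels are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive="ham"):
    """Count TP, FP, TN, FN, treating `positive` (assumed: ham) as the positive class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1   # ham correctly delivered as ham
            else:
                fp += 1   # spam treated as ham (the error the abstract focuses on)
        else:
            if truth == positive:
                fn += 1   # ham wrongly flagged as spam
            else:
                tn += 1   # spam correctly flagged as spam
    return tp, fp, tn, fn


def metrics(tp, fp, tn, fn):
    """Return accuracy, specificity, and F1-score from confusion counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    # Specificity: share of actual spam that was correctly flagged.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "specificity": specificity, "f1": f1}


# Toy labels, invented purely for illustration.
y_true = ["ham", "ham", "ham", "spam", "spam", "ham", "spam", "ham"]
y_pred = ["ham", "ham", "spam", "spam", "ham", "ham", "spam", "ham"]

counts = confusion_counts(y_true, y_pred)
print(counts)
print(metrics(*counts))
```

Under this convention, lowering the number of spam messages that slip through as ham reduces FP and therefore raises specificity, which is why specificity is a natural figure to report alongside accuracy and F1-score.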
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Sharaff, A., Gupta, H. (2019). Extra-Tree Classifier with Metaheuristics Approach for Email Classification. In: Bhatia, S., Tiwari, S., Mishra, K., Trivedi, M. (eds) Advances in Computer Communication and Computational Sciences. Advances in Intelligent Systems and Computing, vol 924. Springer, Singapore. https://doi.org/10.1007/978-981-13-6861-5_17
DOI: https://doi.org/10.1007/978-981-13-6861-5_17
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-6860-8
Online ISBN: 978-981-13-6861-5
eBook Packages: Intelligent Technologies and Robotics (R0)