Abstract
Phishing is a rapidly growing threat on the modern internet, in which a phisher mimics a legitimate web page to lure users into a trap. The phisher's aim is to obtain sensitive information about the user, such as credit card details, email addresses and passwords. The Anti-Phishing Working Group recently reported that 86,276 unique phishing URLs were detected in September 2019. To counter this threat, various methods and techniques have been proposed, including blacklists, whitelists, URL heuristics, content-based analysis, image processing and Machine Learning. Machine Learning (ML) is a more recent technique whose algorithms offer better efficiency, accuracy and performance. This paper therefore critically reviews and evaluates ML-based classifiers in terms of the datasets used, the feature extraction techniques and the performance measures used for detecting phishing URLs. The literature review reveals that every ML-based approach has its advantages and limitations; hence, recommending one classifier over another is challenging. However, Random Forest (RF) and Support Vector Machine (SVM) are the most commonly used classifiers in the literature covered by this study: RF achieved the highest accuracy, using a larger dataset, whereas SVM achieved the second-highest accuracy, using a smaller dataset. This research concludes that RF is an efficient approach for detecting phishing URLs. Finally, insights are provided in the critical evaluation section, and this research proposes extracted features that can extend the literature towards developing an effective phishing URL detection method.
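To make the notion of URL feature extraction concrete, the following is a minimal sketch of how a few lexical features might be computed from a raw URL before being passed to a classifier such as RF or SVM. The function name `extract_url_features` and the particular features chosen are illustrative assumptions; they are not the feature set proposed in this paper or in any of the reviewed studies.

```python
import re
from urllib.parse import urlparse


def extract_url_features(url: str) -> dict:
    """Compute a few illustrative lexical features from a URL string.

    The feature choice is a hypothetical example, not the paper's feature set.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""  # hostname is None when the URL has no netloc
    return {
        # Total number of characters in the URL
        "url_length": len(url),
        # Number of dots in the host, a rough proxy for subdomain depth
        "num_dots_in_host": host.count("."),
        # Number of hyphens in the host
        "num_hyphens_in_host": host.count("-"),
        # Whether an '@' appears anywhere in the URL
        "has_at_symbol": "@" in url,
        # Whether the host looks like a dotted IPv4 address
        "has_ip_host": bool(re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host)),
        # Whether the scheme is HTTPS
        "uses_https": parsed.scheme == "https",
    }


if __name__ == "__main__":
    for example in ("https://www.example.com/", "http://192.168.0.1/login@secure"):
        print(example, extract_url_features(example))
```

Each URL is reduced to a fixed-length record of numeric or boolean values, which is the form that tree-based and margin-based classifiers typically expect as input.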
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Jalil, S., Usman, M. (2021). A Review of Phishing URL Detection Using Machine Learning Classifiers. In: Arai, K., Kapoor, S., Bhatia, R. (eds) Intelligent Systems and Applications. IntelliSys 2020. Advances in Intelligent Systems and Computing, vol 1251. Springer, Cham. https://doi.org/10.1007/978-3-030-55187-2_47
DOI: https://doi.org/10.1007/978-3-030-55187-2_47
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-55186-5
Online ISBN: 978-3-030-55187-2
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)