An Intelligent System for Spam Message Detection

Sartaj, Sahil; Mollah, Ayatullah Faruk

doi:10.1007/978-981-16-2248-9_37

Part of the book series: Algorithms for Intelligent Systems (AIS)


Abstract

More than half of worldwide email traffic, amounting to several billion messages per day, consists of spam, causing considerable disruption in telecommunications. This overwhelming volume of unwanted messages implies an urgent need for reliable and robust spam filters. Conventional filtering methods have largely failed to cope with the adaptive nature of spam messages. Machine learning methods, by contrast, may be able to detect and filter spam intelligently. Here, we present a spam message prediction system based on lexical analysis steps such as tokenization, stop-word removal, stemming, lemmatization, and feature extraction. Impressive results, i.e., over 97% accuracy with a random forest classifier, have been obtained in several experiments on the UCI SMS Spam Collection dataset. We have also hosted the developed spam-detection system on Heroku, a platform-as-a-service (PaaS).
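The preprocessing steps named in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the stop-word list, the suffix-stripping stemmer (a crude stand-in for a real stemmer such as Porter's), and all function names are assumptions made for illustration, and lemmatization is omitted because it needs a dictionary resource.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "the", "to", "you", "is", "are", "for",
              "of", "and", "in", "on", "your", "have"}

# Suffixes removed by the toy stemmer, tried in this order.
SUFFIXES = ("ing", "ed", "es", "s")


def tokenize(text):
    """Lowercase the message and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def stem(token):
    """Strip at most one suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Tokenize, drop stop words, then stem what remains."""
    return [stem(t) for t in tokenize(text) if t not in STOP_WORDS]


def build_vocabulary(messages):
    """Map every distinct preprocessed token to a column index."""
    vocab = sorted({tok for msg in messages for tok in preprocess(msg)})
    return {tok: i for i, tok in enumerate(vocab)}


def vectorize(text, vocabulary):
    """Bag-of-words feature extraction: one token count per vocabulary entry."""
    counts = Counter(preprocess(text))
    vector = [0] * len(vocabulary)
    for tok, n in counts.items():
        if tok in vocabulary:
            vector[vocabulary[tok]] = n
    return vector


# Two made-up messages (hypothetical data, not taken from the dataset).
messages = [
    "You have won a free prize, claim your winnings now",
    "Are you coming to the meeting on Monday",
]
vocabulary = build_vocabulary(messages)
features = [vectorize(m, vocabulary) for m in messages]
```

Count vectors of this kind, paired with spam/ham labels, could then be passed to a classifier such as a random forest, for example scikit-learn's `RandomForestClassifier`. Whether the chapter uses raw counts or a weighting such as TF-IDF is not stated in the abstract.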



Author information

Authors and Affiliations

Department of Computer Science and Engineering, Aliah University, IIA/27 New Town, Kolkata, 700160, India
Sahil Sartaj & Ayatullah Faruk Mollah


Editor information

Editors and Affiliations

Artificial Intelligence Institute, University of South Carolina, Columbia, SC, USA
Amit Sheth
JK Lakshmipat University, Jaipur, India
Amit Sinhal
University of Maryland, College Park, MD, USA
Abhinav Shrivastava
BeingAI Limited/Socients AI and Robotics, Paris, France
Amit Kumar Pandey


Cite this paper

Sartaj, S., Mollah, A.F. (2021). An Intelligent System for Spam Message Detection. In: Sheth, A., Sinhal, A., Shrivastava, A., Pandey, A.K. (eds) Intelligent Systems. Algorithms for Intelligent Systems. Springer, Singapore. https://doi.org/10.1007/978-981-16-2248-9_37


DOI: https://doi.org/10.1007/978-981-16-2248-9_37
Published: 22 July 2021
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-2247-2
Online ISBN: 978-981-16-2248-9
eBook Packages: Intelligent Technologies and Robotics (R0)
