SMS Phishing Dataset for Machine Learning and Pattern Recognition

Mishra, Sandhya; Soni, Devpriya

doi:10.1007/978-3-031-27524-1_57

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 648)

Included in the following conference series:

International Conference on Soft Computing and Pattern Recognition

Abstract

Dataset reliability is essential to solving classification problems, because data is needed to train, test, and evaluate machine learning models. SMS phishing (smishing) is a fraudulent activity in which an attacker sends a malicious text message to a smartphone user, causing financial or personal loss to the victim. Detecting it is a binary classification problem in which messages are categorized as malicious (smishing) or legitimate (ham). Few research works have addressed the identification of smishing messages, and according to our literature survey, no smishing dataset is publicly available yet. We have therefore compiled a dataset of smishing messages collected from different internet sources. It contains 5971 text messages: 638 smishing messages, 489 spam messages, and 4844 ham messages. The dataset can be used to extract smishing features and to classify text messages with machine learning algorithms. This paper also presents an experimental evaluation of the dataset for smishing message categorization using keyword classification. The dataset can serve as a baseline for future research on SMS phishing.
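To make the class balance and the idea of keyword classification concrete, here is a minimal Python sketch. The class counts come from the abstract. Everything else is an assumption made for illustration: the keyword list, the URL pattern, the `min_hits` threshold, and the function names `class_shares` and `keyword_flag` are hypothetical and are not the authors' method or feature set.

```python
import re

# Class counts as reported in the abstract.
CLASS_COUNTS = {"smishing": 638, "spam": 489, "ham": 4844}


def class_shares(counts):
    """Return each class's fraction of the whole dataset."""
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


# Hypothetical keyword list, for illustration only (not taken from the paper).
SUSPICIOUS_KEYWORDS = {"verify", "account", "kyc", "blocked", "otp", "prize", "click"}

# Rough, assumed pattern for spotting a link in the message text.
URL_PATTERN = re.compile(r"(https?://|www\.)\S+", re.IGNORECASE)


def keyword_flag(message, min_hits=2):
    """Flag a message when it contains enough suspicious keywords,
    or at least one keyword together with a URL."""
    tokens = set(re.findall(r"[a-z]+", message.lower()))
    hits = len(tokens & SUSPICIOUS_KEYWORDS)
    has_url = bool(URL_PATTERN.search(message))
    return hits >= min_hits or (has_url and hits >= 1)


if __name__ == "__main__":
    for label, share in class_shares(CLASS_COUNTS).items():
        print(f"{label}: {share:.1%}")
    print(keyword_flag("Your account is blocked, verify KYC at http://example.test/kyc"))
    print(keyword_flag("See you at lunch tomorrow"))
```

The shares show that ham dominates the dataset (roughly four messages in five), so accuracy alone can be misleading when evaluating any classifier on it; per-class precision and recall are a safer choice.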

Author information

Authors and Affiliations

Department of Computer Science and Engineering and Information Technology, Jaypee Institute of Information Technology, Sector-128, Noida, 201304, India
Sandhya Mishra & Devpriya Soni

Corresponding author

Correspondence to Sandhya Mishra.

Editor information

Editors and Affiliations

Faculty of Computing and Data Science, FLAME University, Pune, Maharashtra, India
Ajith Abraham
University of Applied Sciences and Arts Northwestern Switzerland, Olten, Switzerland
Thomas Hanne
Scientific Network for Innovation and Research Excellence, Machine Intelligence Research Labs, Auburn, AL, USA
Niketa Gandhi
Scientific Network for Innovation and Research Excellence, Machine Intelligence Research Labs, Mala, Kerala, India
Pooja Manghirmalani Mishra
Computer Science and Engineering Department, Thapar Institute of Engineering and Technology, Patiala, Punjab, India
Anu Bajaj
Université Paris-Est Créteil, Créteil, France
Patrick Siarry

Cite this paper

Mishra, S., Soni, D. (2023). SMS Phishing Dataset for Machine Learning and Pattern Recognition. In: Abraham, A., Hanne, T., Gandhi, N., Manghirmalani Mishra, P., Bajaj, A., Siarry, P. (eds) Proceedings of the 14th International Conference on Soft Computing and Pattern Recognition (SoCPaR 2022). SoCPaR 2022. Lecture Notes in Networks and Systems, vol 648. Springer, Cham. https://doi.org/10.1007/978-3-031-27524-1_57

DOI: https://doi.org/10.1007/978-3-031-27524-1_57
Published: 28 March 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-27523-4
Online ISBN: 978-3-031-27524-1
eBook Packages: Intelligent Technologies and Robotics (R0)