Abstract
Malicious software (malware) is any program or code designed to harm systems or steal information from them. It includes viruses, worms, and Trojan horses, among other types. Malware poses a serious threat to anyone connected to the cyberworld. Malware analysis has therefore been researched extensively as the variety and number of malware samples have increased dramatically. Until recently, signature-based detection was the prevailing approach to detecting malware. However, it is becoming ineffective because it can only recognize malware that has already been seen. To counter new types of malware, machine learning-based malware detection and analysis techniques have been on the rise. These techniques have developed rapidly thanks to their effectiveness, speed, safety, and depth of investigation of malware samples. Static malware analysis examines the static content of an executable without running it. It works on statically extracted features such as API calls, binary sequences, and control flow graphs (CFGs). This area of research is still growing, however, because packed files and other obfuscation techniques used to evade analysis remain a challenge for purely static methods.
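The contrast between signature matching and statically extracted features can be illustrated with a minimal sketch. It is not the method of any particular system: the byte string below is a made-up stand-in for an executable, a SHA-256 hash stands in for a signature, and the helper names (`sha256_signature`, `byte_ngrams`, `jaccard`) are hypothetical. The sketch assumes byte n-grams as one simple form of the "binary sequences" feature family.

```python
import hashlib
from collections import Counter


def sha256_signature(data: bytes) -> str:
    """Exact-match signature: any change to the bytes changes the hash."""
    return hashlib.sha256(data).hexdigest()


def byte_ngrams(data: bytes, n: int = 2) -> Counter:
    """Count overlapping byte n-grams without executing the content."""
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))


def jaccard(a: Counter, b: Counter) -> float:
    """Jaccard similarity over the sets of distinct n-grams."""
    keys_a, keys_b = set(a), set(b)
    union = keys_a | keys_b
    return len(keys_a & keys_b) / len(union) if union else 1.0


# Hypothetical "known" sample and a variant with one appended byte.
known = bytes.fromhex("558bec83ec10e8000000005dc3")
variant = known + b"\x90"

# Signature lookup misses the variant because the hash no longer matches.
known_signatures = {sha256_signature(known)}
signature_hit = sha256_signature(variant) in known_signatures

# The n-gram feature sets of the two samples still overlap heavily.
similarity = jaccard(byte_ngrams(known), byte_ngrams(variant))

print(f"signature hit: {signature_hit}, n-gram similarity: {similarity:.2f}")
```

In practice, n-gram counts like these would be turned into a fixed-length feature vector and passed to a classifier. The sketch also hints at the limitation noted above: packing or obfuscation rewrites the bytes themselves, so the n-gram overlap with known samples can disappear as well.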
© 2022 Springer Science+Business Media, LLC, part of Springer Nature
About this entry
Cite this entry
Mansour, Z., Molloy, C., Ding, S.H.H. (2022). Machine Learning for Static Malware Analysis. In: Phung, D., Webb, G.I., Sammut, C. (eds) Encyclopedia of Machine Learning and Data Science. Springer, New York, NY. https://doi.org/10.1007/978-1-4899-7502-7_981-1
DOI: https://doi.org/10.1007/978-1-4899-7502-7_981-1
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4899-7502-7
Online ISBN: 978-1-4899-7502-7
eBook Packages: Springer Reference Computer Sciences; Reference Module Computer Science and Engineering