Abstract
In today’s digital world, most anti-malware tools are signature based and are therefore ineffective at detecting advanced unknown malware such as metamorphic malware. In this paper, we study the frequency of opcode occurrence to detect unknown malware using machine learning techniques. For this purpose, we use the Kaggle Microsoft Malware Classification Challenge dataset. We compare the top 20 features obtained from five feature selection methods: Fisher score, information gain, gain ratio, chi-square and symmetric uncertainty. We also study multiple classifiers available in WEKA, a GUI-based machine learning tool, and find that five of them (Random Forest, LMT, NBT, J48 Graft and REPTree) detect the malware with almost 100% accuracy.
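As a rough illustration of the pipeline the abstract describes, the sketch below turns opcode lists into relative-frequency vectors and ranks opcodes with a two-class Fisher score. It is not the authors' implementation: the toy opcode lists, the function names, the two-class form of the score, (mean_a - mean_b)^2 / (var_a + var_b), and the small epsilon that guards against a zero denominator are all assumptions made for this example.

```python
from collections import Counter
from statistics import mean, pvariance


def opcode_frequencies(opcodes, vocabulary):
    """Relative frequency of each vocabulary opcode within one sample."""
    counts = Counter(opcodes)
    total = len(opcodes) or 1  # avoid division by zero on an empty sample
    return [counts[op] / total for op in vocabulary]


def fisher_scores(class_a, class_b, eps=1e-12):
    """Two-class Fisher score for every feature column.

    Assumed form: (mean_a - mean_b)^2 / (var_a + var_b + eps).
    """
    scores = []
    for col_a, col_b in zip(zip(*class_a), zip(*class_b)):
        numerator = (mean(col_a) - mean(col_b)) ** 2
        denominator = pvariance(col_a) + pvariance(col_b) + eps
        scores.append(numerator / denominator)
    return scores


def top_k_features(vocabulary, scores, k=20):
    """Names of the k opcodes with the highest score."""
    ranked = sorted(zip(vocabulary, scores), key=lambda pair: pair[1], reverse=True)
    return [op for op, _ in ranked[:k]]


# Hypothetical toy data; real samples would come from disassembled executables.
vocabulary = ["mov", "push", "xor", "call"]
malware_samples = [["xor", "xor", "mov", "call"], ["xor", "mov", "xor", "xor"]]
benign_samples = [["mov", "push", "mov", "call"], ["push", "mov", "mov", "mov"]]

malware_vectors = [opcode_frequencies(s, vocabulary) for s in malware_samples]
benign_vectors = [opcode_frequencies(s, vocabulary) for s in benign_samples]

scores = fisher_scores(malware_vectors, benign_vectors)
selected = top_k_features(vocabulary, scores, k=2)
print(selected)
```

In a setting like the one the abstract describes, the selected columns would then be passed to a classifier; the paper reports using WEKA for that step, which this sketch does not reproduce.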
Acknowledgements
Mr. Sanjay Sharma is thankful to Dr. Lini Methew, Associate Professor, and Dr. Rithula Thakur, Assistant Professor, Department of Electrical Engineering, for providing computer lab assistance from time to time.
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Sharma, S., Rama Krishna, C., Sahay, S.K. (2019). Detection of Advanced Malware by Machine Learning Techniques. In: Ray, K., Sharma, T., Rawat, S., Saini, R., Bandyopadhyay, A. (eds) Soft Computing: Theories and Applications. Advances in Intelligent Systems and Computing, vol 742. Springer, Singapore. https://doi.org/10.1007/978-981-13-0589-4_31
DOI: https://doi.org/10.1007/978-981-13-0589-4_31
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-0588-7
Online ISBN: 978-981-13-0589-4
eBook Packages: Intelligent Technologies and Robotics (R0)