Malware Detection Based on Opcode Sequence and ResNet

  • Conference paper
Security with Intelligent Computing and Big-data Services (SICBS 2018)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 895)

Abstract

Traditional static malware detection methods struggle to keep pace with the rapid growth of malware variants, so machine-learning-based detection approaches have begun to flourish. Typically, operation codes (opcodes) disassembled from binary programs are fed to classifiers such as SVM and KNN for classification. However, this feature extraction method does not make full use of the sequential relations between opcodes, and the classification models have relatively few dimensions and limited matching ability. This paper therefore proposes a malware detection model based on a residual network. First, the model extracts opcode sequences using a disassembler. To improve the expressiveness of the opcode vectors, Word2Vec is used to represent opcodes, and the word vector representations are further optimized during training iterations. Building the opcode matrix from overlapping windows and then applying convolution introduces redundant information. To overcome this problem, a downsampling method is adopted to organize opcode sequences into an opcode matrix, which effectively controls time and space complexity. To improve the classification ability of the model, a ResNet-based classifier with more layers and cross-layer connections is proposed to match malicious code in more dimensions. Experiments show that the proposed model achieves a malware classification accuracy of 98.2%, while its processing time overhead compared with traditional classifiers remains negligible.
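The abstract does not give implementation details, so the following is only a minimal sketch of two ideas it mentions: arranging an opcode sequence into a matrix without overlap, and a cross-layer (residual) connection. It assumes that "downsampling" can be read as using a stride equal to the row width, so each opcode appears once rather than being repeated across sliding windows. The names `build_vocab`, `to_matrix`, and `residual_block` are hypothetical, and integer ids stand in for the Word2Vec vectors the paper uses.

```python
from typing import Callable, Dict, List

PAD = "<pad>"  # hypothetical padding token, not taken from the paper


def build_vocab(sequences: List[List[str]]) -> Dict[str, int]:
    """Assign an integer id to each distinct opcode (id 0 is padding)."""
    vocab = {PAD: 0}
    for seq in sequences:
        for op in seq:
            if op not in vocab:
                vocab[op] = len(vocab)
    return vocab


def to_matrix(opcodes: List[str], vocab: Dict[str, int],
              width: int, overlap: bool = False) -> List[List[int]]:
    """Arrange opcode ids into rows of `width` columns.

    overlap=True  : sliding window with stride 1, so opcodes are repeated
                    across neighbouring rows (the redundant layout).
    overlap=False : stride equal to `width`, so each opcode appears once
                    (an assumed reading of the downsampled layout).
    """
    ids = [vocab[op] for op in opcodes]
    stride = 1 if overlap else width
    rows = []
    for start in range(0, len(ids), stride):
        row = ids[start:start + width]
        if overlap and len(row) < width:
            break  # sliding windows keep only full-width rows
        row = row + [vocab[PAD]] * (width - len(row))  # pad the last row
        rows.append(row)
    return rows


def residual_block(x: List[float],
                   transform: Callable[[List[float]], List[float]]) -> List[float]:
    """Cross-layer connection: output = transform(x) + x, element-wise."""
    fx = transform(x)
    return [a + b for a, b in zip(fx, x)]


if __name__ == "__main__":
    ops = ["push", "mov", "call", "add", "pop", "ret", "mov"]
    vocab = build_vocab([ops])
    print(len(to_matrix(ops, vocab, width=3, overlap=True)))   # rows with overlap
    print(len(to_matrix(ops, vocab, width=3, overlap=False)))  # rows without overlap
    print(residual_block([1.0, 2.0], lambda v: [0.5 * t for t in v]))
```

In this toy layout the non-overlapping matrix has roughly `1/width` as many rows as the sliding-window one, which is one way the time and space cost could be kept under control. In a real model the `transform` argument would be a stack of convolution layers rather than a simple scaling function.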



Acknowledgements

This work is supported by the Natural Science Foundation of Jiangsu Province for Excellent Young Scholars (BK20180080).

Author information

Corresponding author

Correspondence to Jinshuang Wang.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Zhang, X., Sun, M., Wang, J., Wang, J. (2020). Malware Detection Based on Opcode Sequence and ResNet. In: Yang, CN., Peng, SL., Jain, L. (eds) Security with Intelligent Computing and Big-data Services. SICBS 2018. Advances in Intelligent Systems and Computing, vol 895. Springer, Cham. https://doi.org/10.1007/978-3-030-16946-6_39
