1  Introduction

1.1 Background

Detailed investigations of cyber-attacks and threats reveal that attackers, and the organizations behind them, use common attack patterns to trick targets [1]. For this reason, security communities that develop defense strategies against cyber-attacks require intensive sharing and review of Cyber Threat Incident Reports (CTIR) to make these strategies more effective [2]. Because of the large size of CTIRs and the continuous addition of new attacks to the class of Advanced Persistent Threats (APT), it is almost impossible for traditional methods [1] to identify the characteristic signatures of the performed attacks [3]. Traditional methods for threat classification fall into two categories: machine-learning-based methods and lexicon-based methods [4]. The latter make calculations over lexicons to generate classification results [5]; however, the reliability of the results obtained with these methods depends largely on the quality and coverage of the threat reports [4]. Methods in the first category, on the other hand, rely on handcrafted feature engineering to capture statistical features, which are then fed to classifiers such as Support Vector Machines (SVM) to estimate threat characteristics [5]. These methods are difficult to apply in practice and often yield poor classification performance. To improve their performance, deep learning and machine learning methods can be combined to extract features from CTIRs automatically [6].

1.2 Review

The effectiveness and usefulness of machine learning methods in cyber threat intelligence have been demonstrated many times. Bayesian probabilistic machine learning is based on a joint probability function that graphically represents probability-based relationships between attack tactics, techniques, and procedures (TTPs). In addition, it reduces uncertainty and provides prediction reliability in threat intelligence reports [7]. The Bayesian method allows better extraction of threat features at low computational cost [8]. Despite these strengths, the performance of the Bayesian method still needs improvement, mainly in terms of the activation function, the loss function, and the parameters. To improve processing time and prediction accuracy, optimized algorithms can be used in threat characteristic classification [7]. Machine-learning models have improved processing time and classification accuracy with the help of various algorithms and techniques for threat classification [5]. For example, the prediction accuracy in [2] reaches 92%, while the processing time averages 0.043 s, which is better than other models that use conventional clustering methods [7]. In addition, the Naive Bayes algorithm reduces keyword search problems and performs better than other models [2]. However, this algorithm assumes that all predictive threats are independent of each other [7]. This slows down the prediction stage, degrades processing-time performance, and makes it difficult to understand associated attacks.

1.3 Aim

The purpose of this paper is to improve the prediction accuracy and the processing time of security mechanisms against cyber-attacks. This is done by combining two functions: a modified version of the Naïve Bayes posterior probability function and a modified risk assessment function. The proposed modified Bayesian probabilistic graphical function [7] is used to overcome the independent TTP detection problem of the current support function algorithm. Moreover, it represents the probability-based relationships between TTPs, reduces uncertainty, and provides prediction reliability [7]. For the modified risk assessment function, this paper applies the risk assessment framework provided in [3] to the TTP classifications. This framework [3] provides dynamic risk management by focusing on behavioural detection of complex TTPs. By combining these modified functions, the proposed solution addresses the problems of the existing solutions and increases classification and prediction accuracy.

1.4 Paper Structure

The rest of this paper is organized as follows: Sect. 2 reviews the literature on current solutions for cyber threat classification using machine-learning methods. The proposed solution is discussed in Sect. 3. Section 4 discusses the experiments and results of the proposed solution. Finally, Sect. 5 concludes the paper and outlines future work. Table 1 lists the abbreviations used in this paper.

Table 1 Abbreviations used in this paper

2 Literature Review

In this section, some related papers from the literature are summarized to give a better understanding of the studied problem, methods, and techniques.

2.1 New Emerging Techniques, Tactics and Procedures in Cyber Threat Intelligence Reports

The solution proposed in [9] highlights the benefits of using high-level indicators of compromise (IoCs) in attributing cyber threats and provides a machine-learning model that achieves effective accuracy in extracting high-level IoCs from unstructured cyber threat intelligence (CTI) reports. The solution enhances an automated cyber threat attribution framework (FinTech) to minimize unstructured-report errors in machine learning. The authors addressed the problem by using a distributional semantics technique and improved indexing of CTI reports. They conducted their research by integrating natural language processing into machine learning models, and they profiled cyber attackers and attack patterns with the FinTech algorithm. Their solution provides 97% accuracy in extracting high-level IoCs from unstructured cyber-threat intelligence documents. In the FinTech algorithm, semantic matching maps query terms to terms in text corpora [1], but synonyms and polysemous words cause false matches due to inaccurate concept matching [7]. This negatively affects latent semantic analysis when obtaining and connecting high-level attack processes [7]. As a result, the accuracy in terms of cyber threat actors is acceptable, but the false-matching error needs to be addressed to resolve the issue of matching with latent semantic queries.

Kim and Kim [10] enhanced a cyber threat intelligence dataset generated from cyber intelligence reports using CTIMiner, an automated dataset generation system. They aimed to increase the quality of data that can be used in cyber threat intelligence analysis techniques. Their solution uses a data extraction method enhanced by a malware analysis platform, which provides an additional valuable collection of detailed threat data and reveals characteristics of the attackers performing cyber-attacks. The research was conducted by running the CTIMiner system on 612 public reports and categorizing the different types of collected data, yielding a 43% increase in valuable data. In addition, this solution supplies a high-quality, structured dataset obtained from open sources, providing statistical features suitable for CTI analysis [11]. However, the quality of the results obtained during the IoC parsing and extraction stage [12] is critically affected by the parser's performance. Owing to the parsers' functional limitations, some IoCs may remain unextracted from the report database, which affects the threat analysis results and the prediction accuracy [12]. As a result, the accuracy in terms of valuable threat data is acceptable, but parser performance errors need to be addressed to resolve the extraction issue for attack patterns [5].

Subroto and Apriyana [13] proposed a statistical machine learning algorithmic model to minimize errors from unstructured and constantly changing data in cyber-threat intelligence reports. In their proposed solution, they used CVE-details, R software, and the dendrite/pyramid functions of the Plotrix package to improve the learning confusion matrix. They conducted their research by integrating a term-document matrix [14] into CVE databases to automate the statistical machine learning algorithm. Their work provides 96.73% prediction accuracy with an artificial neural network (ANN) algorithm that analyses vulnerability patterns. This solution reaches a good range of prediction accuracy with better positive predictions, providing a high true-positive rate [15] in the analysis of vulnerability patterns. In addition, when analysing cyber-threat big data, machine-learning algorithms are used to organise and clean the data, making the analysis of vulnerability patterns more accurate. However, false-negative errors are not considered [15] in this solution, and the lack of a risk management algorithm leads to short-term information delays [14] during cyber risk analysis in the CVE database, creating an environment for false-negative errors [15]. As a result, the prediction accuracy in terms of threat patterns is acceptable, but a risk management algorithm in which false-negative errors are defined needs to be considered.

Improving risk calculation accuracy in the security management process, so as to minimize security events that become incidents during cyber-threat intelligence risk assessment, is the purpose of the study by Riesco et al. [3]. Their solution minimises emerging-threat errors (also called unknown-threat errors) that occur during the CTI risk management process. The authors enhanced risk management frameworks using a dynamic risk assessment and management (DRA/DRM) algorithm to minimise emerging-threat errors. Web Ontology Language (OWL) and Semantic Web Rule Language (SWRL) are used to improve operational-level triggers. The authors conducted their research by integrating a value-added semantic algorithm format into DRM for further DRA/DRM implementation. This provides 65% risk assessment accuracy for security events. The developed dynamic risk-management framework is compatible with widely used management standards and risk assessments [16]. It also provides the degree of detail and effectiveness required of risk management frameworks and an acceptable range of risk assessment accuracy with tactical and strategic levels of risk relationships, supplying near-real-time dynamic risk assessment [16]. However, not all cyber threats encountered in the virtual environment were included in the risk calculation algorithm, which causes incorrect detection in real-time responses [8]. As a result, the prediction accuracy in terms of near-real-time severity gives good results, but the missing-threat error needs to be considered to resolve the issue of detection in real-time responses.

2.2 Modelling Attacker Activities Based on Close Attacks

Durkota et al. [17] developed an intelligent and rational security framework based on a mathematical (game-theoretic) algorithm to reduce decision-making errors in cyber threat prediction. Their work improves the accuracy of computing optimal defense strategies against complex cyber-attacks in real computer networks (multiple decision-making environments). The offered solution uses Stackelberg equilibrium (SE), a MILP formulation, and a Markov decision process (MDP) to improve the computation of optimal attack policies. The authors conducted their research by integrating attack graphs into the MDP algorithm to compute the optimal attacker policy. This gives good results in terms of decision-making accuracy; the provided framework successfully found the optimal strategy in 88% of cases. The proposed solution improves the accuracy of the decision-making process even when the attack complexity is high. With the provided algorithm, a method that can be computed quickly has been developed, and the attack prediction rates can be high [18]. However, the algorithm needs a large amount of processed data [4] (attacker motivations, attack success percentages, etc.), which increases the margin of error in the sensitivity of decision-making strategy development [18]. As a result, the provided solution gives high action-success probability accuracy, but the sensitivity error needs to be considered to resolve the strategy-development issue.

Noor et al. [2] developed a novel machine-learning-based framework to minimize cyber-threat prediction errors in cyber threat intelligence. They addressed the problem by using Latent Semantic Indexing (LSI) [1], which improved correlated attack detection. Their work improves the threat analysis process in attack prediction mechanisms, helping to identify TTPs based on observed artefacts with the help of appropriate machine learning algorithms. The proposed solution ranks threat incident reports and adversarial tactics, techniques, and common knowledge (ATT&CK) repositories based on historical data that measures maximal detection [1]. It provides an attack pattern prediction accuracy of 92% and a detection time of 0.04 s. This solution supplies high prediction accuracy and quite low detection time compared to the considerable time it typically takes to predict data breach incidents. This improvement supports the threat analysis process through the SIRS system with an effective security analysis mechanism against attacks [5]. However, the algorithm assumes that all predictive TTPs are independent of each other [7], which slows down the prediction stage and makes it difficult to understand associated attacks. The threat support function of the algorithm, which measures the maximal support of the detected TTPs towards a threat occurrence, tends to set all predictor TTPs as independent when the function value approaches 1 or 0 [7]. This affects the model's ability to recognise attacks and reduces the overall threat prediction reliability. As a result, the prediction accuracy in terms of attack prediction is acceptable, but TTP independence needs to be considered to resolve the issue of detecting unknown attacks.

Sun et al. [7] proposed a solution that enhances a machine learning method based on the Hawkes process. The solution models attacker activities with a latent distance model to effectively identify the activity patterns and structure of cyber-attacks. Their work uses only temporal information, without the need for complicated feature engineering, and it filters out dissimilar attacker patterns across clusters. Since the graphical clustering algorithm used in the developed model does not require prior knowledge of the number of clusters or the cluster sizes [2], the method generalizes well. This solution provides acceptable predictive log-likelihood and effectively models and clusters attacker activity using machine learning. The study integrates a Bayesian probabilistic graphical model (BPGM) and a quality-based clustering algorithm in machine learning. It achieves the lowest sparsity obtained by the network prior (−9.5) and effectively detects connections between attackers across a large number of events. However, the Gibbs sampling algorithm used in the BPGM is a time-consuming inference method [19]. Failure to detect the order of attack (the attack pattern) on time emerges as an important security problem for this solution [2]. As a result, the predictive log-likelihood in terms of attacker activity is acceptable, but cluster size needs to be considered to resolve the algorithm's performance issues.

2.3 Advanced Malware Prediction with Regression Models

Husak et al. [6] enhanced attack prediction using attack projection and intention recognition algorithms to minimize prediction mistakes in intrusion detection systems (IDS). In their study, the authors used artificial neural network and support vector machine (SVM) machine learning algorithms to improve attack prediction. They conducted their research by integrating data mining and neural networks into the intrusion detection system to reduce the complexity and learning time of the prediction algorithm [13]. Depending on the length of the applied attack scenario, the proposed work provides a 92.3–99.2% accuracy rate. This solution offers a good range of accuracy with minimal time delay, enabling prediction of even very specific attacks [12]. Attack prediction accuracy increased because data mining was added to the machine-learning algorithm for learning and generating attack models or attack plans [20]. However, the loss function, which causes small changes at the beginning of attack prediction in the SVM algorithm, slows down the prediction of attacks that use different models [20]. As a result, the accuracy in terms of the frequency of mistakes is acceptable, but automated attack-plan library generation needs to be considered to resolve the issue of prediction changes.

Lee et al. [5] aimed to improve the capacity of deep-learning-based methods to transform collected security incidents into individual activities in order to prevent advanced cyber threats. To minimize false-positive errors, they developed an artificial intelligence security information and event management cyber-threat detection technique (AI-SIEM). Their solution uses large-scale event profiles and deep learning detection methods to enhance accuracy. The integration of a term frequency-inverse document frequency (TF-IDF) indexing mechanism for very large-scale data in the AI-SIEM algorithm improves true-positive accuracy. The best overall accuracy was delivered by the proposed event-profile artificial neural network (EP-ANN) models, with accuracy scores of 0.93–0.99 across four experimental datasets in cyber-threat intelligence analysis. This work provides an acceptable improvement in true-positive accuracy with rapid response time, supplying cyber-threat detection ability in large-scale cyber security environments. The AI-SIEM system quickly and effectively compares long-term security analyses [21] and highlights important security alerts, thereby reducing false-positive alerts [6]. The AI-SIEM algorithm yields very good results on benchmark datasets, but accuracy inconsistencies are observed when the system is applied in the EP-ANN algorithm [21]. As a result, the true-positive and false-positive accuracy is acceptable, but a rare-attack data learning algorithm needs to be considered to resolve the issue of applying the system in the EP-ANN algorithm.

Bahtiyar et al. [20] enhanced advanced malware prediction with a multi-dimensional machine learning technique to reduce malware detection errors. The proposed solution uses linear, polynomial, and random forest regression models [22] to improve the correlation value. The study integrates regression algorithms over correlations among features and obtains an extracted correlation value of 0.8203 between advanced malware features and a closeness rate of 0.558 to advanced malware. The study provides improved prediction accuracy and efficiency through the extracted closeness score and correlation value, supplying more certain identification in advanced malware prediction. The machine learning approach uses correlations among five features and four regression algorithms to predict advanced malware [4]. In this study, random forest regression with four features yielded better results in the analysis, but an acceptable threshold value was not achieved for the precise definition of advanced malware [8]. This means that dependencies among advanced malware features are not taken into account [4]. Therefore, machine-learning datasets containing newly discovered advanced malware samples should be added to the multi-dimensional algorithm. Moreover, the threshold value should be entered into the system as a fixed value rather than a random value [8]. As a result, the accuracy in terms of malware features is acceptable, but a fixed threshold value needs to be considered to resolve this issue in the precise definition of advanced malware.

2.4 State of the Art

In this part, the system's features, which are highlighted inside the blue broken line in Fig. 1, and its limitations, which are highlighted inside the red broken line in Fig. 1, are presented. Noor et al. [2] proposed an enhanced novel machine-learning-based framework algorithm to minimize cyber-threat prediction errors. The use of Latent Semantic Indexing (LSI) improved correlated attack detection. The study ranks cyber threat incident reports (CTIR) and adversarial tactics, techniques, and common knowledge (ATT&CK) repositories based on historical data in order to measure maximal detection with the novel machine-learning-based framework [2]. It provides an attack pattern prediction accuracy of 92–100% and a detection time of 0.04 s. The model consists of the three stages shown in Fig. 1: (1) the semantic indexer and retrieval system stage, (2) the TTD semantic network stage, and (3) the cyber threat prediction stage.

Fig. 1

Block diagram of the state-of-the-art system [2]. Good features of the state of the art are shown inside blue broken lines and its limitations inside red broken lines

  • Stage 1 Semantic Indexer and Retrieval System (SIRS)

Cyber threat incident reports and ATT&CK documentation are the inputs of the system. While a CTIR corresponds to a single cyber threat, an ATT&CK document may correspond to several detection mechanisms associated with a TTP. To build the threat TTP detection (TTD) network, TTPs are extracted from cyber-threat incident reports encoded in the structured threat information expression (STIX) format, after which a TTP dictionary is built [2]. The second step of this stage is to semantically correlate every single TTP with malware attacks in the CTIR and with TTPs in ATT&CK documents. To connect TTPs with detection mechanisms, instead of a simple keyword search, a ranking process is applied with the help of LSI to the CTIR and ATT&CK documents for each TTP present in the TTD [2].
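As an illustration only, the following minimal Python sketch shows how such LSI-based ranking can work with scikit-learn; the corpus, query, and component count are invented placeholders, not the actual pipeline of [2].

```python
# Minimal sketch: ranking CTIR/ATT&CK documents against a TTP query with LSI.
# The corpus and query below are illustrative placeholders only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "spearphishing attachment delivers macro dropper",    # CTIR excerpt
    "credential dumping via lsass memory access",         # ATT&CK excerpt
    "phishing email with malicious attachment observed",  # CTIR excerpt
]
query = ["spearphishing attachment"]  # TTP to rank the documents against

tfidf = TfidfVectorizer()
X = tfidf.fit_transform(docs)

# LSI: project the term-document matrix into a low-rank latent semantic space,
# so near-synonymous terms (phishing / spearphishing) can still match.
lsi = TruncatedSVD(n_components=2, random_state=0)
X_lsi = lsi.fit_transform(X)
q_lsi = lsi.transform(tfidf.transform(query))

scores = cosine_similarity(q_lsi, X_lsi)[0]
ranking = sorted(enumerate(scores), key=lambda p: -p[1])
print(ranking)  # document indices ordered by semantic relevance to the TTP
```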

  • Stage 2 TTD Semantic Network Stage

In this stage, threats are linked to their TTPs and detection mechanisms. To represent the semantic relations of TTPs under three specific concepts (the detection mechanism set, the threat set, and the TTP set for cyber threat reports), the ranked cyber threat incident reports and ATT&CK threats are linked to their relative TTPs and detection mechanisms [2]. Then, to predict threats based on the existence of determined artefacts, a network of probable events is trained between threats and TTPs on historical data to measure the maximal support of detected TTPs towards a threat occurrence. Limitation Experimental results illustrate that the state-of-the-art solution achieves an average attack pattern prediction accuracy of 92% even in the worst case, where TTPs overlap highly, and in the ideal situation the threat prediction accuracy reaches 100%. However, in this model, the threat support function in the threat-TTP-detection semantic network algorithm assumes that all predictive TTPs are independent of each other [7]. This slows down the prediction stage [7], making it difficult to understand the associated cyber-attacks [5]. Limitation Justification The threat support function of the algorithm, which measures the maximal support of detected TTPs towards a threat occurrence, tends to set all predictor TTPs as independent when the function value approaches 1 or 0. This affects the model's ability to recognize attacks and reduces the overall threat prediction reliability [5].

  • Stage 3 Cyber Threat Prediction Stage

The first step of this stage is the threat investigation (TD) module, whose responsibility is to produce a predicted threat based on the detected TTPs in order to predict a set of threats. The next step is reliability assessment (RA), in which the reliability of the prediction is measured. In case of high reliability, the threat investigation is considered complete; otherwise, a set of existing TTPs is considered by detection mechanism selection. The RA step thus reduces time and resource consumption by minimizing the likelihood of incorrect predictions caused by low-reliability prediction and by determining the presence of TTPs [2]. The last step of this stage is detection mechanism selection (DMS). This step helps the cyber-security analyst investigate threat artefacts against the most likely attack family by recommending the most efficient and cost-effective detection mechanism; a set of existing TTPs linked with detection mechanisms is calculated based on cost efficiency. If the reliability grows sufficiently, the prediction is terminated; otherwise, the TD receives the set of predicted TTPs [2]. Limitation There is a problem in the prediction sets that give the highest probability-of-occurrence values for the classification of detected TTPs: it is unclear which malware instances should be included in the prediction sets [3]. Limitation Justification To construct prediction sets correctly, it is necessary to determine a threshold for reaching more reliable threat prediction values. Expert opinion is used to determine this threshold, which could be inaccurate and compromise the reliability of the threat prediction [2].
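As a rough illustration of the TD, RA, and DMS loop described above, the following Python sketch wires the three steps together; all stub functions and the reliability threshold are hypothetical placeholders, not the actual modules of [2].

```python
# Hypothetical sketch of the Stage 3 control flow (TD -> RA -> DMS).
def predict_threats(ttps):
    # TD stub: map detected TTPs to a candidate threat set.
    return {"t1"} if "ttp1" in ttps else {"t2"}

def reliability(threats, ttps):
    # RA stub: reliability grows with the number of supporting TTPs.
    return min(1.0, len(ttps) / 3)

def select_detection_mechanism(threats):
    # DMS stub: recommend the cheapest mechanism and the TTPs it can confirm.
    return {"ttp2", "ttp3"}

def cyber_threat_prediction(detected_ttps, threshold=0.9):
    """Iterate TD -> RA -> DMS until the prediction is reliable enough."""
    while True:
        predicted = predict_threats(detected_ttps)       # threat investigation
        if reliability(predicted, detected_ttps) >= threshold:
            return predicted                             # reliable: terminate
        detected_ttps |= select_detection_mechanism(predicted)  # confirm TTPs

print(cyber_threat_prediction({"ttp1"}))  # {'t1'} after confirming extra TTPs
```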

This model achieved an attack pattern prediction accuracy of 92–100%, a prediction reliability of 50–60% in terms of the number of detected TTPs, and an average threat incident prediction time of 0.04 s for the detected TTPs [2].

To link each threat in the dependency table to its respective TTPs and detection mechanism, a normalized table is used, and the normalized probability, or normalized conditional probability, is calculated. The normalized posterior probability defines the objective function, which is computed with the Naïve Bayes technique as shown in Eq. (1) [2]. However, this slows down the prediction stage, making it difficult to understand the associated attacks, and degrades the prediction reliability.

$$\mu\left(t_{i}\mid ttp_{i}\right)=\frac{\omega\left(ttp_{i}\mid t_{i}\right)\,p\left(t_{i}\right)}{\sum_{t_{i}\in T_{ttp_{i}}}\omega\left(ttp_{i}\mid t_{i}\right)\,p\left(t_{i}\right)}$$
(1)

where µ(ti|ttpi): normalized posterior probability using Naïve Bayes; ω(ttpi|ti): normalized conditional probability; p(ti): prior class probability; ti: threat in the dependency table built between threat incidents and TTPs; Tttpi: threat set of the detected TTP; p(ttpi|ti): conditional probability between TTPs and threats; ttpi|ti: event ttpi given event ti.

The threat set associated with the detected TTPs is built from the dependency table, and the historical threat occurrences, threat likelihoods, and threat set prior probabilities are obtained to build the Bayesian probabilistic graphical model for threat probability estimation. The normalized conditional probability is calculated using Eq. (2) [2].

$$\omega\left(ttp_{i}\mid t_{i}\right)=\frac{p\left(ttp_{i}\mid t_{i}\right)}{\sum_{t_{i}\in T_{ttp_{i}}}p\left(ttp_{i}\mid t_{i}\right)}$$
(2)

where ω(ttpi|ti): normalized conditional probability; p(ttpi|ti): conditional probability between TTPs and threats; ttpi: detected TTP; ti: threat in the dependency table; ttpi|ti: event ttpi given event ti; Tttpi: threat set of the detected TTP.

A belief network is implemented in the TTD semantic network stage. A network of probable events is trained between threats and TTPs on historical data to measure the maximal support of the detected TTPs towards a threat occurrence. To quantify this maximal support, the threat support function is calculated as shown in Eq. (3) [2]. However, accuracy and prediction reliability can be increased with better alignment techniques.

$$S\left(t_{i}\right)=\frac{\sum_{ttp_{i}\in TTPD_{i}}\mu\left(t_{i}\mid ttp_{i}\right)}{\sum_{ttp_{i}\in TTP_{t_{i}}}\mu\left(t_{i}\mid ttp_{i}\right)}$$
(3)

where S(ti): threat support function; TTPDi: set of TTPs detected due to threat ti; TTPti: set of TTPs associated with threat ti; μ(ti|ttpi): normalized posterior probability using Naïve Bayes; ti|ttpi: event ti given event ttpi; ti: threat in the dependency table; ttpi: detected TTP (Table 2).
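The following small numerical sketch, written in Python, works through Eqs. (1)–(3) end to end; the conditional probabilities and priors are made-up values chosen only to make the computation concrete.

```python
# Numerical sketch of Eqs. (1)-(3) with illustrative probabilities; the
# conditional probabilities and priors below are invented, not data from [2].
p_ttp_given_t = {("ttp1", "t1"): 0.6, ("ttp1", "t2"): 0.3}  # p(ttp_i | t_i)
p_t = {"t1": 0.5, "t2": 0.5}                                # prior p(t_i)
T_ttp = ["t1", "t2"]                                        # threats sharing ttp1

def omega(ttp, t):
    """Eq. (2): normalized conditional probability."""
    denom = sum(p_ttp_given_t[(ttp, tj)] for tj in T_ttp)
    return p_ttp_given_t[(ttp, t)] / denom

def mu(t, ttp):
    """Eq. (1): normalized Naive Bayes posterior probability."""
    denom = sum(omega(ttp, tj) * p_t[tj] for tj in T_ttp)
    return omega(ttp, t) * p_t[t] / denom

def support(t, detected_ttps, associated_ttps):
    """Eq. (3): maximal support of the detected TTPs towards threat t."""
    num = sum(mu(t, ttp) for ttp in detected_ttps)
    den = sum(mu(t, ttp) for ttp in associated_ttps)
    return num / den

print(support("t1", ["ttp1"], ["ttp1"]))  # 1.0 when all associated TTPs detected
```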

Table 2 Belief network algorithm

3 Proposed System

In the cyber threat intelligence environment, machine learning algorithms that use different feature extraction and classification techniques have been analysed in detail, and the pros and cons of each method have been determined. The analysis shows that accuracy, reliability, detection time, and false detection are the key factors that affect a threat prediction neural network algorithm. Among the collected and analysed methods, Noor et al. [2] was selected as the state of the art for the proposed solution in this paper. The main reason for this selection was its novel machine-learning-based framework for cyber-attack prediction. This technique semantically extracts threats and attack tactics, techniques, and procedures from known threat sources to create a semantic network [2]. The semantic network establishes probability relationships between threats and TTPs using Naïve Bayes machine learning to identify and predict threats. The Naïve Bayes computation normalises the conditional probability of the threat-TTP mapping and thereby finds the best candidate threat prediction set. In addition, to enhance prediction accuracy, the novel machine-learning-based technique is combined with a belief network model [2]. However, this work has several limitations. One limitation is that the threat support function of the algorithm, which measures the maximal support of detected TTPs towards a threat occurrence, tends to set all predictor TTPs as independent when the function value approaches 0 or 1. This affects the model's ability to recognise attacks, reduces the overall threat prediction reliability, and slows down the prediction stage, making it difficult to understand associated cyber-attacks. Another limitation concerns the prediction sets that give the highest probability-of-occurrence values for the classification of detected TTPs: it is unclear which malware instances should be included in these prediction sets. To overcome the independent TTP detection problem of the support function algorithm, a Bayesian probabilistic graphical model based on a joint probability function, inspired by Sun et al. [7], is used. This model graphically represents the probability-based relationships between TTPs, reduces uncertainty, and provides prediction reliability. Another new feature of the proposed solution is the application of the risk assessment framework proposed by Riesco et al. [3] to TTP classifications. This framework provides dynamic risk management by focusing on behavioural detection of complex TTPs. The application of these new features solves the problems of the existing solution, increases classification and prediction accuracy, and reduces processing time.

The proposed system consists of three major stages (Fig. 2): (1) the semantic indexer and retrieval system (SIRS), (2) the TTD semantic network, and (3) cyber threat prediction.

Fig. 2

Block diagram of the proposed threat prediction method using the Bayesian probabilistic graphical algorithm. Green borders refer to the new parts in the proposed system

  • Stage 1 Semantic Indexer and Retrieval System (SIRS)

This stage follows the architecture of the Noor et al. [2] solution, where threat incident reports and ATT&CK documentation are the inputs of the system (see Fig. 2). While a CTIR corresponds to a single cyber threat, an ATT&CK document may correspond to several detection mechanisms associated with a TTP. As shown in Fig. 2, to build the threat TTP detection network, TTPs are extracted from cyber threat incident reports encoded in the structured threat information expression format, after which the TTP dictionary is built. Instead of a simple keyword search, a ranking process is applied with the help of LSI to the CTIR and ATT&CK documents for each TTP present in the TTD. Therefore, every single TTP is semantically correlated with malware attacks in the CTIR and with TTPs in ATT&CK documents, and TTPs are connected with their detection mechanisms (see Fig. 2).

  • Stage 2 TTD Semantic Network

In the second stage of the proposed model, as shown in Fig. 2, the ranked cyber threat incident reports and ATT&CK threats are linked to their relative TTPs in order to represent the semantic relations of TTPs under three specific concepts (the detection mechanism set, the threat set, and the TTP set for cyber threat reports). After that, to predict threats based on the existence of determined artefacts, a Bayesian probabilistic graphical model is used to identify associated attacks [7]. Historical threat occurrences, threat likelihoods, and threat set probabilities are obtained to build the Bayesian probabilistic graphical model [7] (see Fig. 2). In addition, the joint-distribution threat posterior probability is calculated to build a dependency table between TTPs and the threat set. Therefore, the algorithm's threat probability estimation and normalised conditional probability are improved. Compared to the threat support function, the graphical model is simpler and solves the dependency problem of TTPs. Next, a network of probable events is trained between threats and TTPs on historical data to measure the maximal support of the detected TTPs towards a threat occurrence.

  • Stage 3 Cyber Threat Prediction

The responsibility of the first step is to produce a predicted threat based on the detected TTPs in order to predict a set of threats. Next, the reliability of the prediction is measured. In case of high reliability, the threat investigation is completed; otherwise, a set of existing TTPs is considered by detection mechanism selection (see Fig. 2). This step therefore reduces time and resource consumption by minimizing the likelihood of incorrect predictions caused by low-reliability prediction and by determining the presence of TTPs. At this stage, the proposed work applies a dynamic risk management framework (see Fig. 2). This framework assesses risk using the threat impact and the decreasing values of probability due to the implemented and newly proposed measures [3]. Therefore, all threat sets of the detected TTPs can be considered when assessing the maximal support of a set of detected TTPs towards the dependency table [3]. As a result, the threat set with the maximum posterior probability can be considered the predicted threat set. This framework solves the prediction-set threshold problem and helps the algorithm reach a more reliable threat prediction level. To help investigate threat artefacts against the most likely attack family by recommending an efficient detection mechanism, a set of existing TTPs linked with detection mechanisms is calculated (see Fig. 2). If the reliability grows sufficiently, the prediction is terminated; otherwise, the threat diagnosis receives a set of predicted TTPs.

3.1 Proposed Equation

Identifying prediction sets that induce the observed data and capturing the distributions that characterize relationships between hidden states and hidden variables are critical to threat prediction. A Bayesian probabilistic graphical model based on the joint distribution is used to calculate the posterior probability, avoiding the problem of which malware instances should be included in the prediction sets. It increases probability accuracy, thanks to the consideration of associated threats, compared to the posterior Naïve Bayes probability based on the normalised conditional probability. The joint distribution is defined as in Eq. (4) [7].

$$p\left(t_{i},T_{ttp_{i}}\mid ttp_{i}\right)=p\left(ttp_{i}\mid t_{i},T_{ttp_{i}}\right)\,p\left(t_{i}\mid T_{ttp_{i}}\right)\,p\left(T_{ttp_{i}}\right)$$
(4)

where p(ti, Tttpi|ttpi): joint distribution; ti: threat in the dependency table between threat incidents and TTPs; ttpi: detected TTP; Tttpi: threat set of the detected TTP; p(ttpi|ti, Tttpi): likelihood of the detected TTP given the threat and the threat set; p(ti|Tttpi): likelihood of ti given Tttpi; p(Tttpi): prior probability of Tttpi.

Historical artifacts, which show the presence of a cyber-attack, are used to calculate the probability between threats and TTPs. To configure this probability, historical data making up frequency tables are used for the TTP-threat mapping. Accordingly, the frequency table must be normalised to avoid null values; a sketch of one possible normalization is given after Eq. (5). The history probability of a threat for the detected threat set, p(ti|Tttpi), is used to find the threat associated with a certain threat set and the detected TTPs [7]. Therefore, we modify Eq. (4) into Eq. (5).

$$Mp\left(t_{i}\right)=p\left(t_{i}\mid T_{ttp_{i}}\right)\,p\left(T_{ttp_{i}}\right)$$
(5)

where Mp(ti): modified prior class probability; ti: threat in the dependency table; ttpi: detected TTP; Tttpi: threat set of the detected TTP; p(ti|Tttpi): history likelihood for the threat set; p(Tttpi): prior probability of Tttpi.
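As a sketch of how the frequency table could be normalised to avoid null values, the snippet below applies add-one (Laplace) smoothing; the smoothing scheme and the toy history are assumptions, since the exact normalisation is not specified here.

```python
# Sketch of building a normalized TTP-threat frequency table from historical
# artifacts. Add-one (Laplace) smoothing is assumed so no entry is null.
from collections import Counter

history = [("ttp1", "t1"), ("ttp1", "t1"), ("ttp2", "t1"), ("ttp1", "t2")]
counts = Counter(history)
ttps = {ttp for ttp, _ in history}

def p_ttp_given_t(ttp, t):
    """p(ttp_i | t_i) from smoothed frequencies (no zero entries)."""
    total = sum(counts[(x, t)] for x in ttps)
    return (counts[(ttp, t)] + 1) / (total + len(ttps))

print(p_ttp_given_t("ttp2", "t2"))  # nonzero despite no (ttp2, t2) observation
```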

For prediction, the malware-class TTP with the highest posterior probability is considered the predicted output. Similarly, among all TTPs, the TTD network estimates the threat event and the host and network artifacts using symptoms to calculate the highest probability. On this basis, the enhanced equation uses the prior threat class probability to obtain the threat posterior probability. Therefore, we modify Eq. (1) into Eq. (6).

$$M\mu\left(t_{i},T_{ttp_{i}}\mid ttp_{i}\right)=\left(\frac{\omega\left(ttp_{i}\mid t_{i}\right)}{\sum_{t_{i}\in T_{ttp_{i}}}\omega\left(ttp_{i}\mid t_{i}\right)}\right)Mp\left(t_{i}\right)$$
(6)

where Mµ(ti, Tttpi|ttpi): modified version of the Naïve Bayes posterior probability; ti: threat in the dependency table built between threat incidents and TTPs; ttpi: detected TTP; Tttpi: threat set of the detected TTP; ω(ttpi|ti): normalized conditional probability.

The risk assessment approach, used to identify the most relevant threats in the threat set, increases the accuracy of the probability function and reduces the time needed for threat prediction. The posterior-probability threat risk assessment is performed using the threat impact, the decreasing values of probability, and the newly proposed measures, as given in Eq. (7) [3].

$$R_{t_{i}}=P_{t_{i}}+I_{t_{i}}-C1_{t_{i}}-C2_{t_{i}}$$
(7)

where Rti: risk of threat ti; Pti: probability of threat ti; Iti: impact of threat ti; C1ti: decrease in probability Pti due to implemented measures; C2ti: decrease in probability Pti due to newly proposed measures.

To consider the threat probability after risk assessment and to assess the maximum support of a set of detected TTPs (depending on security events and time), the threat support function is used. Since risk assessments may be updated dynamically, risk management treatments and classifications may also be updated automatically. Therefore, we modify Eq. (7) into Eq. (8).

$$MR_{t_{i}}=I_{t_{i}}-C1_{t_{i}}-C2_{t_{i}}$$
(8)

where MRti: modified residual risk assessment; Iti: impact of threat ti once it has materialized; C1ti: decrease in probability Pti due to implemented measures; C2ti: decrease in probability Pti due to newly modified measures.

The threat support function defines the best candidate threat prediction set (the prediction set that gives the highest probability-of-occurrence values for the classification of detected TTPs) with the maximum probability value [21]. Therefore, we enhance Eq. (3) to propose Eq. (9).

$$ES\left(t_{i}\right)=\frac{\sum_{ttp_{i}\in TTPD_{i}}M\mu\left(t_{i},T_{ttp_{i}}\mid ttp_{i}\right)}{\sum_{ttp_{i}\in TTP_{t_{i}}}M\mu\left(t_{i},T_{ttp_{i}}\mid ttp_{i}\right)}+MR_{t_{i}}$$
(9)

where ES(ti): enhanced threat support function; TTPDi: set of TTPs detected due to threat ti, {ttp1, ttp2, ttp3, …, ttpn}; ti: detected threat, ti ∈ {t1, t2, t3, …, tn}, in the detected threat set Tttpi; TTPti: set of TTPs associated with threat ti; Mµ(ti, Tttpi|ttpi): modified version of the Naïve Bayes posterior probability; MRti: modified residual risk assessment.
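The following Python sketch composes the proposed Eqs. (5), (6), (8), and (9) end to end; all probabilities, impacts, and decrement values are illustrative placeholders rather than experimental data.

```python
# Sketch of Eqs. (5), (6), (8), and (9) composed end to end; the numeric
# inputs below are invented values chosen only to make the flow concrete.
def Mp(p_t_given_Tset, p_Tset):
    """Eq. (5): modified prior class probability."""
    return p_t_given_Tset * p_Tset

def M_mu(omega_ttp_t, omega_sum, mp):
    """Eq. (6): modified Naive Bayes posterior probability."""
    return (omega_ttp_t / omega_sum) * mp

def MR(impact, c1, c2):
    """Eq. (8): modified residual risk after implemented and new measures."""
    return impact - c1 - c2

def ES(mmu_detected, mmu_associated, mr):
    """Eq. (9): enhanced threat support function with residual risk term."""
    return sum(mmu_detected) / sum(mmu_associated) + mr

mp = Mp(p_t_given_Tset=0.7, p_Tset=0.4)            # history-based prior, Eq. (5)
mmu = M_mu(omega_ttp_t=0.6, omega_sum=0.9, mp=mp)  # posterior for one TTP, Eq. (6)
mr = MR(impact=0.3, c1=0.1, c2=0.05)               # residual risk, Eq. (8)
print(ES([mmu], [mmu], mr))                        # 1.0 + 0.15 when fully detected
```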

3.2 Area of Improvement

In this solution, two equations were proposed, and the performance of the current method was improved. First, the threat support function (Eq. (3)) is modified to calculate the best candidate threat prediction set with the maximum probability value, as shown in Eq. (9). It uses the Bayesian probabilistic graphical model, based on the joint distribution, to calculate the posterior probability. The purpose of the posterior probability is to create a dependency table between TTPs and the threat set. The dependency table, with the help of the prior threat class-probability calculation, is then used to find the threat associated with a certain threat set, which gives the best threat class probability. This solves the dependency problem of the function and improves threat prediction accuracy. Second, the dynamic risk management function is combined with cyber threat prediction. With the help of Eqs. (8) and (9), the posterior-probability threat risk assessment is performed using the threat impact and the decreasing values of probability. The purpose of the risk management is to assess the maximal support of a set of detected TTPs towards the dependency table and to take the threat set with the maximum posterior probability as the predicted threat set. This helps the algorithm provide more reliable threat prediction and improves processing time.

Why the enhanced Naïve Bayes posterior probability: The Naïve Bayes posterior probability reflects the idea that security incidents can be matched to tactics mapped to artificial objects in such a way that machines can identify these mappings with certain probabilities. Using the modified threat support function (TSF) as the activation function in the threat prediction algorithm effectively avoids the dependency problem of TTPs. It uses the threat set associated with the detected TTPs, the threat likelihood, the threat set prior, and the historical threat occurrences as input values. The proposed work can solve the dependency problem because the Bayesian probabilistic graphical model effectively finds the best threat class probability, and appropriate matching of threat classes enhances prediction performance. In addition, Naive Bayes is simpler and faster than other algorithms, resulting in a more effective training process. Moreover, the proposed study considers risk management during the threat prediction phase. The risk management framework considers the threat probability after risk assessment to assess the maximum support of a set of detected TTPs towards a threat using the enhanced TSF. This improves the overall performance and enhances the prediction accuracy and processing time.

Independent detection of TTPs by the threat-TTP-detection algorithm makes it almost impossible to detect associated threats while also slowing down the prediction stage. This affects the algorithm's ability to recognize attacks and reduces the overall threat prediction reliability. The proposed work therefore provides precise analysis of related threats using the posterior Naive Bayes based on the normalised conditional probability, which increases probability accuracy. In addition, in prediction algorithms, the lack of risk management during the threat prediction phase reduces the overall classification performance. This becomes a problem for the prediction sets that give the highest probability-of-occurrence values for the classification of detected TTPs. The proposed work considers the threat probability after risk assessment to generate the maximum-support results for the detected TTPs. This effectively prevents threshold mistakes in prediction sets and yields more reliable threat prediction values.

The threat support function used as the activation function in the state-of-the-art system faces the dependency problem of TTPs. The proposed study solves this problem with a modified threat support function based on the Bayesian probabilistic graphical model. Moreover, the state-of-the-art system has prediction reliability problems because risk assessment is not performed during the threat prediction stage. The proposed work addresses this problem by including a dynamic risk management framework in the prediction algorithm (Fig. 3, Table 3).

Fig. 3

Flowchart of the proposed modified Bayesian probabilistic graphical model for threat prediction

Table 3 Proposed Bayesian probabilistic graphical model (BPGM) algorithm

4 Results and Discussion

In this research, Python 3.6.9 with the Scikit-learn, Matplotlib, Keras, TensorFlow, and NumPy libraries was used in the application and test stages. Five datasets from NSL-KDD, CICIDS2017 [5], and ATT&CK [2] were used. These datasets are publicly accessible and free. The number of records differs between datasets; the specifications are given in detail in Table 4. The data were divided into five different sets: one was used for testing and the remaining sets for training (Figs. 4 and 5). These procedures were performed by applying holdout cross-validation; a sketch of the split is given below. The system configuration used for the experiments was an Intel® Core™ i7-8550U CPU @ 1.80 GHz with 16 GB of installed memory (RAM). The Keras metrics facilities were used to calculate the prediction accuracy and processing time values for the five datasets, and the average prediction accuracy and average processing time were calculated using the NumPy mean function. Figures 6 and 7 show the results for the different datasets.
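The following sketch illustrates the described five-way split with one part held out for testing; the feature matrix, label vector, and seed are placeholders rather than the actual datasets.

```python
# Sketch of the holdout split described above: the data is divided into five
# equal parts, one held out for testing (20%) and four used for training.
import numpy as np

rng = np.random.default_rng(seed=0)
X = rng.random((1000, 41))         # placeholder feature matrix
y = rng.integers(0, 5, size=1000)  # placeholder TTP class labels

idx = rng.permutation(len(X))
parts = np.array_split(idx, 5)     # five equal folds
test_idx, train_idx = parts[0], np.concatenate(parts[1:])

X_train, y_train = X[train_idx], y[train_idx]
X_test, y_test = X[test_idx], y[test_idx]
print(len(X_train), len(X_test))   # 800 200
```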

Table 4 Number of records in each dataset

Fig. 4

Dataset 2 classification accuracy for the state-of-the-art and the proposed solution during the training phase. a Orange line indicates the classification accuracy values of the state-of-the-art study [2]. b Blue line indicates the classification accuracy of the proposed solution

Fig. 5

Dataset 2 classification accuracy for the state-of-the-art and the proposed solution during the validation phase. a Orange line indicates the classification accuracy values of the state-of-the-art study [2]. b Blue line indicates the classification accuracy of the proposed solution

Fig. 6

Average prediction accuracy calculated for the five datasets (in percentage). Blue lines show the proposed solution values; orange lines show the state-of-the-art solution [2] values. a First two lines show the average accuracy of Dataset 1. b Second two lines show the average accuracy of Dataset 2. c Third two lines show the average accuracy of Dataset 3. d Fourth two lines show the average accuracy of Dataset 4. e Fifth two lines show the average accuracy of Dataset 5

Fig. 7

Average processing time calculated for the five datasets (in seconds). Blue lines show the proposed solution values; orange lines show the state-of-the-art solution [2] values. a First two lines show the average processing time of Dataset 1. b Second two lines show the average processing time of Dataset 2. c Third two lines show the average processing time of Dataset 3. d Fourth two lines show the average processing time of Dataset 4. e Fifth two lines show the average processing time of Dataset 5

Classification accuracy values for the different TTP types available in the used datasets were obtained using the predict method of the Keras library (Python 3.6.9). Processing time values were calculated with the help of the now method (Python 3.6.9). Microsoft Excel functions were used to calculate the average processing time and accuracy values. All results can be seen in Figs. 8, 9, 10, 11, 12, 13, 14, 15, 16 and 17.

Fig. 8

Average prediction accuracy calculated for true-positive and true-negative results (in percentage) from Dataset 1. Blue lines show the proposed solution values; orange lines show the state-of-the-art solution [2] values. a First two lines show the average accuracy of true positives. b Second two lines show the average accuracy of true negatives

Fig. 9

Average processing time calculated for true-positive and true-negative results (in seconds) from Dataset 1. Blue lines show the proposed solution values; orange lines show the state-of-the-art solution [2] values. a First two lines show the average processing time of true positives. b Second two lines show the average processing time of true negatives

Fig. 10

Average prediction accuracy calculated for true-positive and true-negative results (in percentage) from Dataset 2. Blue lines show the proposed solution values; orange lines show the state-of-the-art solution [2] values. a First two lines show the average accuracy of true positives. b Second two lines show the average accuracy of true negatives

Fig. 11

Average processing time calculated for true-positive and true-negative results (in seconds) from Dataset 2. Blue lines show the proposed solution values; orange lines show the state-of-the-art solution [2] values. a First two lines show the average processing time of true positives. b Second two lines show the average processing time of true negatives

Fig. 12

Average prediction accuracy calculated for true-positive and true-negative results (in percentage) from Dataset 3. Blue lines show the proposed solution values; orange lines show the state-of-the-art solution [2] values. a First two lines show the average accuracy of true positives. b Second two lines show the average accuracy of true negatives

Fig. 13

Average processing time calculated for true-positive and true-negative results (in seconds) from Dataset 3. Blue lines show the proposed solution values; orange lines show the state-of-the-art solution [2] values. a First two lines show the average processing time of true positives. b Second two lines show the average processing time of true negatives

Fig. 14

Average prediction accuracy calculated for true-positive and true-negative results (in percentage) from Dataset 4. Blue lines show the proposed solution values; orange lines show the state-of-the-art solution [2] values. a First two lines show the average accuracy of true positives. b Second two lines show the average accuracy of true negatives

Fig. 15

Average processing time calculated for true-positive and true-negative results (in seconds) from Dataset 4. Blue lines show the proposed solution values; orange lines show the state-of-the-art solution [2] values. a First two lines show the average processing time of true positives. b Second two lines show the average processing time of true negatives

Fig. 16

Average prediction accuracy calculated for true-positive and true-negative results (in percentage) from Dataset 5. Blue lines show the proposed solution values; orange lines show the state-of-the-art solution [2] values. a First two lines show the average accuracy of true positives. b Second two lines show the average accuracy of true negatives

Fig. 17

Average processing time calculated for true-positive and true-negative results (in seconds) from Dataset 5. Blue lines show the proposed solution values; orange lines show the state-of-the-art solution [2] values. a First two lines show the average processing time of true positives. b Second two lines show the average processing time of true negatives

A belief network model was created during the feature extraction and classification stages. Using this model, features obtained from the training data are extracted. The features that make up the feature maps are considered by the belief network in linear time [23]. After this, posterior probabilities are calculated using the support function to generate the classification of TTPs based on incident frequency. After the training process, the model was evaluated using the validation dataset.

The classification accuracy performance on Dataset 2 is shown in Fig. 4. The results compare the values of the state-of-the-art system [2] and the proposed solution during the training phase. The state-of-the-art system and the proposed solution have similar accuracy values. However, the proposed solution needs fewer epochs to reach its optimum accuracy values, which means that as the dataset size increases, the proposed solution will reduce the processing time of model training.

The accuracy performance on Dataset 2 is shown in Fig. 5. The results compare the values of the state of the art [2] and the proposed solution during the validation phase. When Fig. 5 is examined, it can be observed that the proposed solution offers about 4% higher classification accuracy than the state-of-the-art solution. After the training and validation stages, the classification accuracy and processing time results for all datasets can be seen in Table 5.

Table 5 Classification accuracy and processing time results of the proposed solution and the state-of-the-art solution for the five datasets after the training and validation stages

Data export and bar graphs were used to compare the proposed solution's results with the state-of-the-art system [2] in order to generate the presented tables and graphs. The results are based on three scenarios applied over the datasets. Different results were obtained for the small, medium, and large datasets, and these results were used for the comparison.

The results of the different scenarios are evaluated according to the training and validation stages. The results for each dataset provide the accuracy and the processing time. To elaborate: the accuracy values are determined by the ratio of correctly classified TTP samples to the total number of TTPs in the datasets [24]. The processing time values are calculated according to the number of predicted threat models required to reach a reliable prediction level. The test dataset comprises 20% of the samples covered by the five datasets (Dataset 1, Dataset 2, Dataset 3, Dataset 4, and Dataset 5).

Figure 6 shows the average accuracy, calculated by averaging the results obtained after the training and validation phases of the respective five datasets. Figure 7 illustrates the average processing time, calculated in the same way.

In the test phase, the average number of detected threats in each dataset was analysed based on three different scenarios. The results are presented in Tables 6, 7, 8, 9, 10, 11, 12, 13, 14 and 15. The prediction accuracy and processing time values were obtained for TTPs based on the number of records in the datasets, and the probability of correct TTP relations forms the basis of the prediction accuracy measurement.

Table 6 Prediction accuracy and processing time results for the proposed solution and the state-of-the-art solution [2], obtained for true-positive values from Dataset 1
Table 7 Prediction accuracy and processing time results for the proposed solution and the state-of-the-art solution [2], obtained for true-negative values from Dataset 1
Table 8 Prediction accuracy and processing time results for the proposed solution and the state-of-the-art solution [2], obtained for true-positive values from Dataset 2
Table 9 Prediction accuracy and processing time results for the proposed solution and the state-of-the-art solution [2], obtained for true-negative values from Dataset 2
Table 10 Prediction accuracy and processing time results for the proposed solution and the state-of-the-art solution [2], obtained for true-positive values from Dataset 3
Table 11 Prediction accuracy and processing time results for the proposed solution and the state-of-the-art solution [2], obtained for true-negative values from Dataset 3
Table 12 Prediction accuracy and processing time results for the proposed solution and the state-of-the-art solution [2], obtained for true-positive values from Dataset 4
Table 13 Prediction accuracy and processing time results for the proposed solution and the state-of-the-art solution [2], obtained for true-negative values from Dataset 4
Table 14 Prediction accuracy and processing time results for the proposed solution and the state-of-the-art solution [2], obtained for true-positive values from Dataset 5
Table 15 Prediction accuracy and processing time results for the proposed solution and the state-of-the-art solution [2], obtained for true-negative values from Dataset 5

The duration of the TTP classification process is taken into consideration in the processing time measurement. As explained previously, the results were analysed in two settings, and the classification stage of the belief network is where the results were obtained. With the help of Eq. (6), the proposed solution improves prediction accuracy by enhancing the threat posterior probability values. At the same time, according to the Eq. (8) calculations, the threat risk assessment is performed and the processing time required for the prediction probability decreases. The proposed system helps identify threat artifacts against the most likely attack scenarios, suggesting the lowest-cost and most likely mechanisms for real-time security analyses.

When the results are evaluated, it is observed that the proposed model improves the prediction accuracy and processing time values compared to the state-of-the-art model for TTP classification. The proposed solution offers an average classification accuracy of 96% with the Naive Bayes posterior probability and the modified prior class probability using joint distribution functions. This result is 4% higher than the results of the state-of-the-art model. Moreover, the proposed solution achieves an average processing time of 0.028 s with the help of the risk assessment for the maximal support of the detected TTP set. This value is 0.015 s less than that of the state-of-the-art solution.

The accuracy of the datasets for each TTP was evaluated using the predict function of the Keras library (Python 3.6.9). In this step, true positives and true negatives were used to calculate the accuracy of correctly retrieved documents. To calculate the processing time values, Python 3.6.9 functions were used: start-time and end-time intervals were determined with the now method. Moreover, the average accuracy and average time values were calculated with the Microsoft Excel average function. The improvements in accuracy and processing time were investigated for the proposed solution against the state-of-the-art algorithms. In [2], the accuracy is calculated using Eq. (10):

$${\text{Accuracy}}=\frac{{\text{True positive}}}{{\text{True positive}}+{\text{True negative}}}$$
(10)

where True positive: correctly retrieved TTPs from the dataset dictionary; True negative: correctly dropped TTPs from the dataset dictionary.
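The sketch below implements Eq. (10) as defined in [2] together with the start/end-time measurement described above; the counts and the commented-out predict call are illustrative placeholders.

```python
# Sketch of Eq. (10) and the start/end-time measurement; the counts and the
# timed predict() call are illustrative placeholders only.
from datetime import datetime

def accuracy(true_positive, true_negative):
    """Eq. (10) as defined in [2]: correctly retrieved over retrieved + dropped."""
    return true_positive / (true_positive + true_negative)

start = datetime.now()
# predictions = model.predict(X_test)  # placeholder for the Keras predict call
end = datetime.now()

print(accuracy(true_positive=92, true_negative=8))  # 0.92
print((end - start).total_seconds())                # processing time in seconds
```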

In summary, using the modified prior class probability threat support function (TSF) as the activation function in the cyber-threat prediction algorithm effectively avoids the dependency problem of TTPs in the proposed model. The TSF defines the best candidate threat prediction set with the maximum probability value. The Bayesian probabilistic graphical model based on the joint distribution calculates the posterior probability, which increases the probability accuracy because the graphical model effectively finds the best threat classification probability. The risk assessment function is another new feature of the proposed model. This function is used to identify the most relevant threats in the threat set and therefore increases the accuracy of the probability function and reduces the processing time for threat prediction. As a result, the proposed solution provides increased accuracy and decreased processing time in cyber-threat prediction.

Various techniques have been used to detect and predict cyber-attacks. The most important limitation of these techniques has always been the attack prediction accuracy and the processing time during identification of attacks. The proposed solution resolves the limitations encountered in the state-of-the-art model, achieving 96% prediction accuracy, a 4% improvement over the state-of-the-art solution's 92% accuracy. At the same time, the proposed solution is superior to the state of the art in processing time, improving the average processing time to 0.028 s against the current 0.043 s. The threat support function, used as the activation function to solve the TTP dependency problem, and the risk assessment function that improves processing time values were effective in obtaining improved results across the different dataset scenarios. The main comparison between the proposed solution and the state of the art is given in Table 16.

Table 16 Comparison between the proposed solution and the state-of-the-art solution

5 Conclusion and Future Work

The methods and results presented with the proposed solution show that security incidents can be matched with cyber threat tactics in cyber threat intelligence, and that machine learning can link these mappings using specific probabilities and algorithms. In this context, it is worth noting that the prediction accuracy and the processing time are still limited; this study worked on improving these two limitations. The proposed solution was inspired by the study that developed the second-best solution [7], and new features were developed, such as the modified version of the Naive Bayes posterior probability and the modified prior class probability. These functions increase the probability accuracy, thanks to the consideration of associated threats, compared to the posterior Naive Bayes probability based on the normalized conditional probability. Moreover, a new risk management framework feature was developed that improves the processing time limitation by using the third-best solution [3]. With the posterior probability of a threat, the risk assessment approach identifies the most relevant threats in the cyber-threat set (using the threat impact) with increased probability-function accuracy and reduced threat prediction time. Therefore, the proposed solution improves the average prediction accuracy by 4% and reduces the average processing time by 0.015 s. In the future, to enable the developed model to be used in wider domains, multiple-class datasets will be provided during the testing and training stages of machine learning. In addition, studies will be carried out on mitigation integration and the automation of threat incidents detected in cyber threat intelligence. In this regard, development methods will be used to improve the threat classification performance and feature extraction.