1 Introduction

At present, network security has become an important focus of computer security research. The attack on the network infrastructure is the threat against network and information security. An intrusion detection system (IDS) activities monitor and analysis user and system activity, system configuration auditing and ensuring critical system securities. The characteristics of effective intrusion detection systems are high detection rate, less false alarm, fewer CPU cycles and quick detection of intrusion. In [1], the IDS was developed by Denning in 1987. This paper was considered as a great statistics landmark in intrusion detection system (IDS) field. Every ID system must possess four characteristics. They are time, performance, dynamic reconfiguration and prediction performance. The IDS policy gets the requirements from the goal of IDS. The goals of the IDS involve enforcement of use policies, collection of evidence, detection and prevention of attacks. IDS classification techniques are shown in Fig. 1. IDS can also be classified into the following four categories on the basis of the detection approaches:

Fig. 1
figure 1

Intrusion detection classification techniques

  • Signature-based intrusion detection or misuse (knowledge-based)

In this approach, the comparison of the user’s activity and known intrusion detection pattern is called signatures. The main advantage of signature-based intrusion detection is very easy to understand and high detection speed because of dealing with false positives.

  • Anomaly-based intrusion detection system or behavior system

In this approach, normal user is compared with a new user to judge the new user’s activity. The advantages of this method are less dependent on system software and very efficient to detect unknown and attacks.

  • Host-based IDS (HIDS): HIDS is a method, which monitors the system activities are generally used for information collected by system logs and system audit trails.

  • Network-based IDS (NIDS): NIDS is a system, it runs at strategic points in the network (e.g., server and switch).

In this paper, we propose stacked ensemble learning model for network intrusion detection by using gradient boosting machine (GBM) and random forest (RF) algorithms. The proposed method integrates the features of both GBM and RF classifiers. RF is the ensemble regression and classification approach. RF was developed by Breiman [2]. Random forest generates various classification trees. The primary features of the random forest algorithm are handling unbalanced data sets, running efficiently at the large data sets with various features and there is no nominal data problem. Gradient boosting stands for gradient descent and boosting is called by gradient boosting [3]. It is a powerful machine learning algorithm. General boosting algorithm works with a variety of loss functions. The models included in GBM are resistant regression, K-class classification, regression and risk modeling. Gradient boosting can do regression, classification and ranking.

2 Related work

Aung et al. [4] has suggested two approaches, namely random forest and random forest-based K-means algorithm. They achieved good accuracy and able to extract the attacked illusion. Abdulhammed et al. [5] had proposed deep and variational auto-encoder (VAE), voting, random forest and stacking machine learning classifiers for anomaly-based intrusion detection of imbalanced network traffic. Ahmad et al. [6] proposed various machine learning techniques such as ELM, SVM and RF for IDS. The analysis shows that the ELM approach got good accuracy, precision and recall compare to other models. ELM is a suitable technique for IDS that is designed to analyze a large amount of data.

Aburomman et al. [7], had analyzed various ensemble and hybrid techniques in the intrusion detection system. Choudhury et al. [8] analyzed various classification algorithms. The analysis shows that RF and BayesNet are most precise when compare to the other algorithms. Chang et al. [9] presented the network IDS using random forest and SVM approaches to improve the accuracy in computer networks. The authors build two machine learning algorithms which are used for improve high detection rate in network intrusion detection.

Chabathula et al. [10] have analyzed IDS system that has using SVM, KNN, J48, random forest, adaboost, nearest neighbors generalized exemplars algorithm, voting features interval classification algorithm and Naivebayes probabilistic classifier. The analysis shows that principal component analysis (PCA) is the most precise among other approaches. Jayveer Singh et al. [11] surveyed different machine learning techniques and soft computing techniques for IDS.

Joshi et al. [12] proposed various classifications, clustering in intrusion detection system. Khan et al. [13] analyzed various classification techniques on intrusion detection systems. Li et al. [14] used hybrid methods such as particle swarm optimization (PSO) and random forest (RF) approaches are used to improve the better performance in detecting the attacks on the network.

Latah et al. [15] built software-defined networking (SDN) controller to improve the accuracy in anomaly-based intrusion detection. SDN is compared with many supervised machine learning approaches. Finally, the analysis shows that the decision tree (DT) is most precise when compared to other models using SDN controller.

Malik et al. [16] proposed a hybrid algorithm named PSO and random forests algorithm. That is, PSO is used for dimension reduction and RF is used for classification. Murugan et al. [17] analyzed various detection spams in social networks using machine learning such as SVM, NaïveBayes, random forest and decision tree (J48) algorithms.

Maniriho et al. [18] presented a combined machine learning approach with a two-feature selection techniques such as correlation ranking filter and gain ratio feature evaluator for the anomaly NIDS. Tsai et al. [19] reviewed the recent studies of intrusion detection by machine learning techniques. The authors are reviewed a large number of machine learning techniques which are used in the intrusion detection domain review has included single classifiers, hybrid classifiers and ensemble classifiers.

Wang et al. [20] have presented C-ELM approach to network intrusion detection. The authors built several models with fast learning speed in hidden neurons with binary search. Zhang et al. [21] proposed a network intrusion detection system based on data mining algorithm called random forest in anomaly-based, misuse-based and hybrid-network-based intrusion detection systems. Yin et al. [22] proposed a model based on recurrent neural networks (RNN) to improve the accuracy in NIDS.

Ingre et al. [23] have presented different methods of BFGS quasi-Newton backpropagation algorithm and Levenberg–Marquardt (LM) in ANN. ANN is used for supervised classification learning to improve the accuracy and detection rate. Murugan et al. [24] proposed a hybrid algorithm named feature extraction combination of logistic regression and principal component analysis methods to increase the classification accuracy using machine learning algorithms on twitter data.

3 Stacked ensemble learning

Ensemble learning has three types namely, bagging, boosting and stacking. Bagging and boosting are the alternatives of the voting methods. The bagging approach, homogeneous models are taken to predict the class of test data. Initially, homogeneous selected models predictions are recorded. Finally, the class which is predicted by the maximum number of models is assigned to the test data. Similarly in boosting, the models are trained heavily for the misclassified data in the training phase. Finally, the model which is showing maximum accuracy is considered as the classifier for the test data. The stacking is an effective approach because it is a generic framework, which combines many ensemble methods. It has two levels of learning, base learning and meta-learning. In base learning, the initial (base) learners are trained with training data set. After training, the base learners create a new data set for the meta-learner. Then, the meta-learner is trained with new training data set. The trained meta-learner is used to classify the test set. A crucial part in stacking is the selection of a best base learner. That is instead of selecting a single base learner, select many base learners for the training data set. The main difference between stacked and other methods of ensemble techniques is that in stacking meta-level learning-based classification is applied as final classification. Ensemble classification techniques are shown in Fig. 2. This paper has used stacked ensemble model, in which random forest and gradient boost are base learners. The above algorithm summarizes the proposed method for detecting the attacks. Gradient boost is a recent approach, it is an improved version of adaboost. Adaboost is an additive predictive model in machine learning algorithms. Its prediction based on step–by-step forward stage-wise manner that is in initial stage starts with weak learners for the predictions, for each iteration converts the walk learners to strong by increasing higher data weight points. The slighter variation of adaboost is gradianboost, which introduce a new learner at every iteration on existing weak learner rather than increasing data points. The gradient boost approach can boost any differentiable loss function. The regression trees are built at every iteration and individual trees are added sequentially. That is, the next tree is constructed based on the variation between actual value and predicted values. The random boost works on selecting features and target values which can build set of rules to generate multiple decision trees, finally taking the mean of the results. That is, creating the forest with many trees. Build the tree iteratively from remaining features with n nodes. Finally, calculate the average constructed trees.

Fig. 2
figure 2

Ensemble classification techniques

The random forest works on selecting features and target values, which can build set of rules to generate multiple decision trees, finally taking the mean of the results.

figure a

4 NSL-KDD data set description

The proposed stacked ensemble model combines gradient boost and random forest approaches. This model is evaluated with NSL-KDD data set [25], which is the new version of KDD-CUP99 [26, 27].The NSL-KDD data set contains 41 attributes (features) and five different classes. Different classifications in the NSL-KDD data set are shown in Table 1. Attack types present in the NSL-KDD data set and their categorization, which are shown in Table 2. The classes are normal and four different attacks, namely Dos, Probe, R2L and U2R. In the data set, the basic features are from one to ten columns, from 11 to 22 are content features and the rest are traffic features. The proposed model is implemented with R programming language. Experimental result reveals that proposed method increase the attack detection rates and reduces the training time compare to single decision algorithm. Table 3 shows the confusion matrix, which helps to compute the performance measures such as precision, recall, and accuracy.

Table 1 Various classifications of the NSL-KDD data set
Table 2 Attack types present in the NSL-KDD data set and their categorization
Table 3 Confusion matrix
  • Denial of Service (DoS) attack: Infinite decline services of a host. Features are source bytes and percentage of packets with errors.

  • Probe attack: Unauthenticated information gathering. Features are duration and source bytes.

  • User to Root (U2R) attack: Illegal access to local super user or root. Features are number of file creations and number of shell prompts invoked.

  • Root to Local (R2L) attack: Illegal access from a remote machine. Features are duration of connection and service requested.

A Dos attack is one of the cyber-attacks, in which the hacker is intentionally change the normal functions of the target system, so that the system become unavailable to its intention users. The main motive of this attack is to denying the requests which are raised from authenticated users. This DoS attack includes flooding, smruf, ping of death and so on. Flooding is one of the DoS attack broadcast the packets to the target machine, so that the machine becomes very busy in receiving the packets. It creates large number of packets and that system is started to deny any request. Likewise, many DoS attacks create the situation of denial of service to the target machine. The proposed approach detects all those attacks and classified as attack. The proposed method has detected 7443 DoS attacks correctly from the NSL-KDD dataset.

Teardrop is a DoS attack, in which the fragmented packets couldn’t reassemble by the target machine. Teardrop attack feature is specified in feature_wrong_fragment attribute in the NSL-KDD data set. The proposed approach is effectively identifies and classifies this attack. An Udpstorm attack is a UDP flood attack, in which the remote host sends huge number of UDP packets to the target machine. Mailwomb is a DoS attack, in which the large number emails are sent with garbage values to the target system. Land is a DoS attack, in which the attacker sends highly sends the garbage values messages as TCP-SYN packets to the target system. This attack can be possible only when both the sender and receiver IP port numbers are same. The features of the land attack in the given data set are feature_diff_srv_rate, feature_dst_host_serror_rate and feature_dst_host_diff_srv_rate. Neptune is a DoS attack, it is also called half-opened TCP-SYN attack. The attacker is continually sending large number of connection request to the target system. This Neptune attack is recognized using the feature_count, feature_diff_srv_rate, feature_dst_host_serror_rate, feature_src_bytes and feature_dst_host_diff_srv_rate of the NSL-KDD data set. POD is a DoS attack, where the attacker is sending large-sized packets to the target system in a single ping command. The features for this attack are feature_protocol, feature_type and feature_src_bytes and feature_wrong_fragment. Smurf attack is a DoS attack, in which large-sized ICMP packets with garbage values are broadcasted to the target machine. For this attack, the features used in the NSL-KDD data set are feature_protocol, feature_type, feature_src_bytes and feature_wrong_fragment. Probe is another type of attack in IDS, where the attacker scans the target system to find the weakness of the system for exploitation. The probe attack features are feature_service, feature_logged_in, feature_diff_srv_rate, feature_dst_host_count, feature_dst_host_diff_srv_rate and feature_dst_host_ same_src_ port_ rate.

Remote to user attacks (R2U) is another attack in IDS, in which the attacker access the target system as like local user of that system. Then, the attacker can access the target machine and exploit the whole privileges of that system. The features of the R2U attack are feature_service, feature_logged_in, feature_count, feature_same_srv_rate, feature_dst_host_count, feature_dst_host_srv_count and feature_ dst_host_serror_rate in the NSL-KDD data set. User to root attacks (U2R) is an attack, of IDS in which the attacker enter into the target system as a normal user, understanding the privileges of the system and then exploit the vulnerabilities. The features of this U2R attack are feature_service, feature_src_bytes, feature_dst_bytes, feature_hot, feature_num_compromised, feature_srv_count, feature_dst_host_diff_srv_rate, feature_dst_host_same_ src_ port_rate and feature_dst_host_serror_rate.

5 Proposed work

Stacked ensemble learning model is also called as multiple classifier system that uses a set of classifiers as base classifiers to build new training data to classify unknown data. Figure 3 shows the architecture of stacked ensemble method. Where are stacked ensemble learning model first select the different base classifiers say B1, B2,…Bn, train them using training data set and creates multiple learners L1, L2,…Ln from training process. These learner outputs are combined for create a new data set in the form of <(y′0… ym), yj> for the second level classifier. Here, y′0 is a predicted output of the first base classifier on the input a1′ and yj is the actual output on the input a1′. The second level classifier is also called as meta-level classifier.

Fig. 3
figure 3

Stacked ensemble-based IDS architecture

The base learners are homogeneous or heterogeneous ensembles. The learners are under same type is homogeneous otherwise heterogeneous. The meta-classifier takes <(y′0… ym), yj> as an input and trained on that input set. While training on this input set, the meta-learner identify the errors of base learners and adjust them for optimistic solution. This process is repeated with k times for k-fold cross-validation to minimize the error and optimize the output. After repeating this process the meta-learner becomes a generalization model for any input data. In conventional approach, the selected single classifier may perform poorly, that is the classifier behaves good on training data, but it saw an unseen new data, leads to poor classifier. This problem is eliminated in stacked ensemble approach. Because even though, the one of selected classifiers in ensemble is unfit for the approach, averaging of all the classifiers can reduce the risk of depending one approach. In general, ensemble method does not guarantee that it can provide best solution for every problem. But it can avoid the risk of poor selection of classifier.

5.1 Performance metrics

In proposed model, the most important performance indicator accuracy. This accuracy is used to measure the performance GBM-RF model in intrusion detection system. The metrics that have been used to evaluate the performance of the proposed system include the classification accuracy (AC), detection rate (DR), precision and recall. These metrics are expressed by Eqs. 1 to 3 where, true positive (TP), true negative (TN), false positive (FP) and false negative (FN), respectively.

  1. 1.

    True positive (TP)—attack data are classified as an attack.

  2. 2.

    True negative (TN)—normal data are as normal.

  3. 3.

    False positive (FP)—normal data are classified as an attack.

  4. 4.

    False negative (FN)—attack data are classified as normal.

$${\text{Accuracy}} = \frac{{{\text{TP}} + {\text{TN}}}}{{{\text{TP + TN}} + {\text{FP + TN}}}}$$
(1)
$${\text{Detection rate}} = \frac{\text{TP}}{{{\text{TP + FP}} + {\text{FN}} + {\text{TN}}}}$$
(2)
$${\text{Recall}} = \frac{\text{TP}}{\text{TP + TN}}$$
(3)

5.2 Experimental results and discussion

The proposed approach is tested on Intel Core (TM) i5, 8 GB RAM and coding is done by R language. The confused matrix for the five-category experiments on the KDDTest+ is shown in Table 4. Table 5 shows the precision and recall values are obtained by the proposed approach.

Table 4 Confusion matrix for the five-category experiments on the KDDTest+
Table 5 Results of the performance metrics (detection rate and recall) for the four-category classification of proposed work

Figures 4 and 5 depicts that the proposed method stacked ensemble is obtained higher percentage of detection rate, recall and accuracy compare to conventional machine learning on each attack types. As the gradient boost approach reduces residue error at every iteration, the prediction model improves its performance, thus the attacks detection rate is increased indirectly. The proposed model is compared with several machine learning algorithms and neural network models. The results shows that for each attack type, the RNN and ANN approaches precision and recall values are very less compare to proposed approach.

Fig. 4
figure 4

Different attack detection rate of stacked ensemble with other models

Fig. 5
figure 5

Different attack recall of ensemble with other models

In Fig. 6, accuracy is compared with the different machine learning approaches. In the experiment, the machine learning algorithms are used, namely Naïve Bayes, decision tree and random forest, perceptron and neural network models for performance evaluation. The overall accuracy is higher for the stacked ensemble learning method is compared with these above models.

Fig. 6
figure 6

Accuracy of proposed stacked ensemble learning and existing approach

6 Conclusion

The proposed stacked ensemble model is well suited for intrusion detection system to detect the attacks as well as classify them in five different classes. The proposed method is a combined approach of gradient descent and random forest algorithms. The proposed method increases the gradient on the weak learner in every iteration so that the weak learner is converted into a strong enough for predicting test data. The proposed method makes better performance when compare to the other ensemble methods like bagging and boosting. Thus, this paper has deployed stacked ensemble approach to detect the security attacks. The experimental result is also show that compare to machine learning and neural network models, the proposed stacked ensemble model performance is higher. In the future work, stacked ensemble learning classifiers will be used in many machine learning algorithms for the task of intrusion detection method. Furthermore, an investigation of probabilistic, decision tree, non-probabilistic and rule induction-based classification algorithms will be combined as ensemble for the better performance.