1 Introduction

The evolution of malicious software poses crucial challenges to the development of intrusion detection systems (IDS). Malicious attacks have become increasingly complex, and the primary problem is to detect obfuscated and unknown malware [1]. Additionally, security threats such as zero-day attacks aimed at end internet users are increasing [2]. Thus, computer security is becoming very important as information technology has become an essential part of users' day-to-day lives. An intrusion is an act, performed illegally or legally, in transmission networks and systems that deviates from the regular activity of that network. An IDS is a software- or hardware-based system used to catch intrusions at an early stage [3]. It integrates methods, tools, and resources that assist in identifying intrusions and resolving them in time. Prevention techniques such as access control, authentication, cryptography, and secure routing form the first line of defense against intrusions [4, 5]. Figure 1 illustrates the general process involved in IDS.

Fig. 1 General process involved in IDS. The flow chart takes system data as input, detects anomalies with AI methods or matches signatures, and raises alarms.
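
The two detection paths in Fig. 1 can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the signature strings, the event fields `payload` and `bytes`, and the simple z-score test that stands in for the AI-based anomaly detector are hypothetical and do not come from any surveyed system.

```python
from statistics import mean, pstdev
from typing import List, Optional

# Hypothetical signature database: substrings that mark known attacks.
SIGNATURES = ("' OR 1=1", "../../etc/passwd")

def matches_signature(payload: str) -> bool:
    """Misuse detection: flag payloads that contain a known attack pattern."""
    return any(sig in payload for sig in SIGNATURES)

def is_anomalous(value: float, baseline: List[float], threshold: float = 3.0) -> bool:
    """Anomaly detection stand-in: flag values far from the baseline mean (z-score)."""
    mu = mean(baseline)
    sigma = pstdev(baseline)
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold

def inspect(event: dict, baseline: List[float]) -> Optional[str]:
    """Return an alarm label for one event, or None if it looks benign."""
    if matches_signature(event["payload"]):
        return "signature-alarm"
    if is_anomalous(event["bytes"], baseline):
        return "anomaly-alarm"
    return None

# Toy system data: byte counts of past benign requests and three new events.
baseline_bytes = [500, 520, 480, 510, 495, 505]
events = [
    {"payload": "GET /index.html", "bytes": 502},
    {"payload": "GET /?id=' OR 1=1", "bytes": 498},
    {"payload": "GET /download", "bytes": 50_000},
]
alarms = [inspect(e, baseline_bytes) for e in events]
print(alarms)
```

In this toy run the first event raises no alarm, the second is caught by the signature path, and the third by the anomaly path. In a DL-based IDS the z-score test would be replaced by a trained model.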

The IDS concept was first presented in 1980. Since then, many advanced IDS products have emerged. Still, many IDS suffer from a high false alarm rate: they generate numerous alerts for low-risk, non-threatening situations, which increases the burden on security analysts and can cause seriously harmful attacks to be ignored [6]. Therefore, several authors have aimed at designing IDS with reduced false alarm rates and high detection rates. Another challenge is that current IDS cannot identify unknown attacks. Since the network environment changes rapidly, attack variants and novel attacks emerge continuously. Therefore, it is essential to design IDS that can identify unknown attacks. To resolve these issues, authors have focused on building IDS using machine learning (ML) techniques. ML is a type of artificial intelligence (AI) approach that can automatically extract valuable information from large data sets [7].
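
For reference, the detection rate (DR) and false alarm rate (FAR) mentioned above are commonly defined from confusion-matrix counts. The notation below (true positives \(TP\), false negatives \(FN\), false positives \(FP\), true negatives \(TN\)) is assumed here and is not taken from the surveyed works:

\[
\text{DR} = \frac{TP}{TP + FN}, \qquad \text{FAR} = \frac{FP}{FP + TN}
\]

For example, 20 false alarms among 1,000 benign events (\(FP = 20\), \(TN = 980\)) give \(\text{FAR} = 20/1000 = 0.02\).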

ML-based IDS can attain a satisfactory detection level when sufficient training data is available and the ML model generalizes well enough to identify novel attacks and attack variants [8]. Additionally, ML-based IDS do not depend heavily on domain knowledge; hence, they are easier to design and construct. Deep learning (DL) is a branch of ML that can attain remarkable performance. In comparison with conventional ML methods, DL approaches are good at handling big data [9]. Furthermore, DL techniques can automatically learn feature representations from raw data and then output results; they are practical and operate in an end-to-end manner. One prominent characteristic of DL is the deep network, which has numerous hidden layers. In contrast, conventional ML techniques [10], such as k-nearest neighbors (KNN) and support vector machines (SVM), contain no hidden layer or only one. Hence, these conventional ML methods are known as shallow models.

This study surveys recently presented DL models for IDS. It examines 35 existing DL-based intrusion detection approaches based on their performance evaluation, determines the research gap, and highlights the reviewed works. All the articles selected for analysis were chosen on the basis of their performance outcomes and accuracy results. The reviewed methods are examined in terms of objectives, methodology used, datasets used, and performance measures. In addition, a descriptive and comparative study of the surveyed DL-enabled IDS models is presented in tabular form. Then, a performance analysis of the different DL models is carried out to identify the best-performing solutions. Lastly, the research issues and potential future scope of the research are highlighted, with objectives such as evaluating an effective IDS model on different datasets and assessing ML-based optimization models for detecting and classifying intrusions, evaluated on measures such as precision, recall, F-score, computational time, and accuracy.

2 Review of Deep Learning Based IDS Models

In this section, a detailed review of various DL-based solutions developed for IDS is presented; a summary is given in Table 1. Farahnakian et al. [11] introduced a DL technique for IDS. This technique uses a deep autoencoder (DAE), one of the best-known DL techniques. The presented DAE was trained in a greedy layer-wise fashion to avoid overfitting and local optima. Hanselmann et al. [12] presented a novel unsupervised learning approach called CANet. It is reported to be the first DL-based IDS that handles the data structure of the high-dimensional CAN bus, in which different message types are sent at different times. This technique was evaluated on real and synthetic CAN data. A comparison with previous ML techniques demonstrates that CANet outperforms them by a significant margin.

Table 1 Comparison of different DL-based IDS models

Boukhalfa et al. [13] proposed a network IDS (NIDS) based on long short-term memory (LSTM) that recognizes threats and retains long-term memory of them, so that new attacks resembling earlier ones can be stopped, while providing a single means of blocking intrusions. The simulation results show that the LSTM technique is effective: it memorizes and differentiates normal and attack traffic in both binary and multi-class classification. Yin et al. [14] presented a DL technique for IDS using recurrent neural networks (RNN-IDS). They analyzed the performance of the model in binary and multiclass classification, as well as how the number of neurons and different learning rates influence it. The model was compared with J48, ANN, RF, SVM, and other ML techniques presented by previous researchers on the benchmark dataset.

Kunang et al. [15] introduced a DL-based IDS that uses a pre-training approach with a deep autoencoder (PTDAE) combined with a DNN. The models are built using hyperparameter optimization. The study offers an alternative to DL framework techniques through an automatic hyperparameter optimization procedure that combines grid and random search approaches. The automated procedure determines the hyperparameter values and the best categorical hyperparameter configuration for improving detection performance. Fatani et al. [16] examined an effective AI-based approach to IDS for IoT systems. It leverages advances in DL and metaheuristic (MH) techniques, which have proven effective in solving difficult engineering problems. A feature extraction technique based on CNNs is presented for extracting relevant features. In addition, a novel feature selection (FS) technique is developed using a new variant of the transient search optimization (TSO) technique, named TSODE, which uses the operators of the differential evolution (DE) technique.
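
The idea of combining grid and random search can be sketched in Python as follows. This is only an illustration under stated assumptions, not the procedure of [15]: the search space, the function names, and `toy_score` (a hypothetical stand-in for validation accuracy of a trained model) are invented for the example. Discrete and categorical choices are enumerated as a grid, while the continuous learning rate is sampled at random.

```python
import itertools
import random

def toy_score(activation: str, layers: int, lr: float) -> float:
    """Hypothetical stand-in for validation accuracy.

    By construction it peaks at activation="relu", layers=3, lr=0.01.
    """
    bonus = {"relu": 0.05, "tanh": 0.02, "sigmoid": 0.0}[activation]
    return 0.9 + bonus - 0.01 * abs(layers - 3) - abs(lr - 0.01)

def hybrid_search(n_random: int = 20, seed: int = 0):
    """Grid search over discrete choices, random search over the learning rate."""
    rng = random.Random(seed)
    best_score, best_cfg = float("-inf"), None
    # Grid part: enumerate every combination of activation and depth.
    for activation, layers in itertools.product(("relu", "tanh", "sigmoid"), (1, 2, 3, 4)):
        # Random part: sample the learning rate log-uniformly in [1e-4, 1e-1].
        for _ in range(n_random):
            lr = 10 ** rng.uniform(-4, -1)
            score = toy_score(activation, layers, lr)
            if score > best_score:
                best_score = score
                best_cfg = {"activation": activation, "layers": layers, "lr": lr}
    return best_cfg, best_score

best_cfg, best_score = hybrid_search()
print(best_cfg, round(best_score, 4))
```

In a real setting, `toy_score` would train the network with the given configuration and return a validation metric, which makes each evaluation far more expensive than in this sketch.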

In Kanna and Santhi [17], an accurate IDS technique was presented that unifies an optimized CNN (OCNN) and a hierarchical multi-scale LSTM (HMLSTM). The presented IDS method carries out pre-processing, feature extraction, training and testing of the network, and final classification. In this method, lion swarm optimization (LSO) was used to tune the hyperparameters of the CNN to an optimal configuration for learning spatial features. Aleesa et al. [18] enhanced the UNSW-NB15 data set for use with DL, arguing that older ML approaches take much time and are not influenced by the size of the data set, whereas the size of the data set affects the performance of DL approaches.

Lee and Park [19] resolved data imbalance using generative adversarial networks (GAN), an unsupervised DL approach that creates new virtual data similar to the existing data. They also presented a model that uses RF as the classifier to measure detection performance after the data imbalance has been addressed with GAN. In Liu et al. [20], an intrusion detection technique that combines ML with DL was presented. This technique uses k-means and RF as classifiers, and distributed computing of these techniques is executed on the Spark platform for rapidly classifying normal and attack events. Next, using CNN, LSTM, and other DL techniques, the events judged abnormal are further classified into different attack types. In addition, adaptive synthetic sampling (ADASYN) was selected to handle the unbalanced data set.
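
To make the idea of synthetic oversampling concrete, the following Python sketch balances a toy data set by interpolating between pairs of real minority-class samples. It is a deliberately simplified stand-in: it is neither the GAN of [19] nor ADASYN as used in [20] (ADASYN additionally weights samples by the density of nearby majority samples), and the function name and toy data are hypothetical.

```python
import random

def oversample_minority(minority, target_size, seed=0):
    """Grow `minority` to `target_size` samples by linear interpolation.

    Each synthetic point lies on the segment between two randomly chosen
    real minority samples. Requires at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    while len(minority) + len(synthetic) < target_size:
        a, b = rng.sample(minority, 2)   # two distinct real minority samples
        t = rng.random()                 # interpolation weight in [0, 1)
        synthetic.append([x + t * (y - x) for x, y in zip(a, b)])
    return minority + synthetic

# Toy data: 10 majority-class (normal) and 3 minority-class (attack) samples.
normal = [[0.1 * i, 1.0] for i in range(10)]
attack = [[5.0, 5.0], [6.0, 5.5], [5.5, 6.0]]

balanced_attack = oversample_minority(attack, target_size=len(normal))
print(len(normal), len(balanced_attack))
```

After the call, both classes contain the same number of samples, and every synthetic attack sample stays inside the region spanned by the real ones.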

Ullah and Mahmoud [21] designed and developed an anomaly-based IDS for IoT networks. First, a CNN is applied to create a multi-class classification model. Then, the presented method is implemented through CNN models in 1D, 2D, and 3D. The generation and processing of features focus on the actual network traffic flow. They developed four data sets using this approach and then merged them, increasing the number of attack classes. Aldallal and Alisa [22] proposed an ML-based hybrid IDS. They integrated SVM and genetic algorithm (GA) methods with an advanced fitness function designed for evaluating the performance of the system. The scheme was evaluated on the CICIDS2017 data set, which contains normal traffic and common attacks. The GA and SVM algorithms were run in parallel to attain two objectives: obtaining the optimal set of features and the highest performance.

Abusitta et al. [23] presented an ML-based IDS that effectively uses past feedback data to make decisions. Specifically, the presented method relies on a denoising autoencoder (DA), which is used as a building block to construct a DNN. The power of the DA lies in its capacity to learn how to reconstruct IDS feedback from partial feedback. This makes it possible to periodically decide on suspected intrusions without complete feedback from the IDS. Zhou et al. [24] introduced a hierarchical adversarial attack (HAA) technique that targets graph neural network (GNN)-based IDS in resource-constrained IoT systems. A hierarchical node selection method based on random walk with restart (RWR) is designed to select a group of vulnerable nodes with high attack priority, taking into account the overall loss changes and the structural features within the targeted IoT network.

Al Jallad et al. [25] presented a solution for identifying new threats with lower false positive rates and higher detection rates than previously employed IDS, as well as identifying contextual and collective security attacks. They obtain results through a networking chatbot, a deep RNN (LSTM) on top of the Apache Spark framework, whose input is traffic flow and traffic aggregation and whose output is a language of two words, normal or abnormal. Mighan and Kahani [26] presented a hybrid system that combines the benefits of ML and deep networks. First, a stacked autoencoder (SAE) is used for latent feature extraction, followed by various classification-based IDSs, such as SVM, RF, DT, and NB, which are used for fast and efficient detection of intrusions in large network traffic data.

In Vinayakumar et al. [27], a DNN, a kind of DL method, is used to design an effective and flexible IDS for detecting and classifying unpredictable and unforeseen cyber-attacks. The rapid changes in network behavior and the continuous evolution of attacks make it essential to evaluate different data sets that are produced by static and dynamic approaches. This kind of study facilitates finding the optimal model that can perform efficiently in identifying future cyber-attacks. Kasongo and Sun [28] presented a feed-forward DNN (FFDNN) wireless IDS with a wrapper-based feature extraction unit (WFEU). The WFEU employs the extra trees model to create an optimal feature vector. The effectiveness and efficiency of the WFEU-FFDNN are examined on the AWID and UNSW-NB15 IDS data sets.

Shone et al. [29] proposed a DL method for IDS called the non-symmetric deep auto-encoder (NDAE). Moreover, the authors introduced a DL classification technique using stacked NDAEs. The presented classification method was implemented in GPU-enabled TensorFlow and evaluated on the standard NSL-KDD and KDD Cup '99 data sets. Kasongo and Sun [30] designed a DL-based IDS using an FFDNN together with a filter-based feature selection method. The presented method is evaluated on the NSL-KDD (NSL knowledge discovery and data mining) dataset and compared with existing ML algorithms.

Hu et al. [31] presented an IDS based on DL (IDSDL) with fine-grained channel state information (CSI) that frees the access point (AP) placement. A CSI phase propagation component decomposition technique was implemented to obtain the blurred components of the CSI phase on various paths as more sensitive detection signals. A CNN was used to enable the computer to learn and detect intrusions without extracting numerical features. In Mendonça et al. [32], a novel IDS based on a hierarchical Tree-CNN technique with the Soft-Root-Sign (SRS) activation function was presented. For performance assessment, the method was deployed in a medium-sized company, analyzing the level of complexity of the presented solution.

Toldinas et al. [33] presented a novel approach to network intrusion detection using multistage DL image recognition. The network features are converted into four-channel (ARGB: alpha, red, green, and blue) images. The images are used for training and testing the pretrained DL model ResNet50. In Khan [34], a convolutional recurrent neural network (CRNN) was used to create a DL-based hybrid intrusion detection framework that predicts and classifies malicious cyberattacks in the network. Ashiku and Dagli [35] presented the use of DL architectures for developing an adaptive and resilient network IDS for detecting and classifying network attacks. The emphasis is on how DL or DNNs can enable a flexible IDS with the learning capability to detect known and novel or zero-day network behavioral features, consequently ejecting the system intruder and reducing the risk of compromise. In Wani and Khaliq [36], an SDN-based IDS was presented that uses a DL classifier for anomaly detection in IoT. The presented IDS does not burden the IoT devices with a security profile. The work was implemented in simulated environments. The experimental results are evaluated using different metrics and compared with other relevant techniques.
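
A possible way to turn a flow feature vector into a four-channel image is sketched below in Python. The encoding is an assumption chosen for illustration and is not taken from [33]: each feature is min-max scaled to one byte, four consecutive bytes are packed into one (A, R, G, B) pixel, and the tail is zero-padded to fill a square image. The function name and the toy feature values are hypothetical.

```python
import math

def features_to_argb(features, lo, hi):
    """Map a feature vector to a square grid of (A, R, G, B) pixels.

    Assumed encoding: min-max scale each feature to 0..255 using the
    per-feature bounds `lo` and `hi`, then pack four bytes per pixel.
    """
    scaled = []
    for x, a, b in zip(features, lo, hi):
        frac = 0.0 if b == a else (x - a) / (b - a)
        frac = min(max(frac, 0.0), 1.0)          # clip values outside the bounds
        scaled.append(round(255 * frac))
    n_pixels = math.ceil(len(scaled) / 4)
    side = math.ceil(math.sqrt(n_pixels))
    scaled += [0] * (side * side * 4 - len(scaled))   # zero-pad to a full square
    pixels = [tuple(scaled[i:i + 4]) for i in range(0, len(scaled), 4)]
    return [pixels[r * side:(r + 1) * side] for r in range(side)]

# Ten hypothetical flow features with per-feature bounds from a training set.
features = [10, 0, 5, 2.5, 100, 50, 0, 1, 3, 7]
lo = [0] * 10
hi = [10, 1, 10, 10, 100, 100, 1, 1, 10, 10]

image = features_to_argb(features, lo, hi)
print(len(image), len(image[0]), image[0][0])
```

Ten features need three pixels, so the sketch produces a 2 x 2 image whose last pixel is pure padding. Such images could then be resized and fed to an image classifier such as a pretrained ResNet50.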

In Jothi and Pushpalatha [37], a novel IDS was presented using powerful DL techniques. Motivated by the benefits of LSTM, a whale integrated LSTM (WILS) network was presented for designing an intelligent IDS that detects a range of distinct threats on IoT networks. The system includes four important functions, among them: (i) a data gathering unit that profiles the regular behavior of the IoT devices connected to the network, (ii) identifying the malicious devices on the network when an attack is happening, and (iii) predicting the type of attack used on the network. In Haghighat and Li [38], a novel voting-based DL framework named VNN was presented to take advantage of several kinds of DL architectures. Considering many models generated from distinct aspects of the data and many DL architectures, VNN offers the capability to aggregate the best models to produce more accurate and robust results. Thus, VNN helps security specialists detect more complicated attacks.

Yousefnezhad et al. [39] implemented ensemble methods for enhancing the performance of intrusion detection while reducing the false alarm rate (FAR). They used kNN as a multi-class classifier, as well as SVM, to approach the classification problem in anomaly-based detection. To combine the several outputs, they used the Dempster–Shafer approach, which allows explicit retrieval of uncertainty. In addition, they employed DL to extract features for training the instances chosen by an instance selection technique based on the ensemble margin. Mayuranathan et al. [40] presented an effective feature subset selection-based classification method to detect DDoS attacks. To detect DDoS attacks in an IDS, an optimal feature set with maximal detection was chosen using the random harmony search (RHS) optimization method. Once the features were chosen, a DL-based classifier using restricted Boltzmann machines (RBM) was executed to detect the DDoS attacks.
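
The Dempster–Shafer combination of two classifier outputs can be sketched in Python as follows. The sketch implements Dempster's rule of combination for a two-class frame; the mass values assigned to the kNN and SVM outputs are hypothetical numbers chosen for the example and do not come from [39].

```python
from itertools import product

def combine(m1, m2):
    """Dempster's rule of combination for two mass functions keyed by frozenset."""
    combined, conflict = {}, 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb          # mass falling on contradictory evidence
    if conflict >= 1.0:
        raise ValueError("total conflict: the sources cannot be combined")
    # Renormalize so that the remaining masses sum to one.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

N, A = frozenset({"normal"}), frozenset({"attack"})
EITHER = N | A   # mass on the whole frame expresses explicit uncertainty

# Hypothetical masses derived from two classifiers for the same event.
knn = {A: 0.6, N: 0.1, EITHER: 0.3}
svm = {A: 0.7, N: 0.2, EITHER: 0.1}

fused = combine(knn, svm)
print({tuple(sorted(k)): round(v, 3) for k, v in fused.items()})
```

With these inputs the conflicting mass is 0.19, and the fused belief in "attack" is 0.69 / 0.81, roughly 0.852, which is higher than either source alone, while a small mass of about 0.037 remains on the uncertain set.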

3 Performance Analysis

This section examines the recently developed DL models for IDS available in the literature. Table 2 offers a comparative analysis of the DL models in terms of different measures. Figure 2 shows the \({\text{prec}}_{n}\), \({\text{reca}}_{l}\), and \(F_{{{\text{score}}}}\) analysis of the DL models. The figure shows that the DNN model obtained lower \({\text{prec}}_{n}\), \({\text{reca}}_{l}\), and \(F_{{{\text{score}}}}\) values of 0.81, 0.785, and 0.765, respectively. The CCNN model obtained slightly higher \({\text{prec}}_{n}\) and \({\text{reca}}_{l}\) values of 0.815 and 0.791, with an \(F_{{{\text{score}}}}\) of 0.76. Moreover, the conditional VAE, FNN, and MLESM-RNN techniques resulted in moderate \({\text{prec}}_{n}\), \({\text{reca}}_{l}\), and \(F_{{{\text{score}}}}\) values. Furthermore, the RNN model achieved moderate \({\text{prec}}_{n}\), \({\text{reca}}_{l}\), and \(F_{{{\text{score}}}}\) values of 0.831, 0.833, and 0.82, respectively. However, the DAE-DNN model resulted in higher \({\text{prec}}_{n}\), \({\text{reca}}_{l}\), and \(F_{{{\text{score}}}}\) values of 0.86, 0.833, and 0.82, respectively.

Table 2 Comparative analysis of DL techniques for IDS in terms of different measures

Fig. 2 Comparative analysis of DL techniques with varying measures. A column graph of the precision, recall, and F1 score values of CNN, RNN, FNN, conditional VAE, deep neural network, DAE-DNN, and MLESM-RNN.

The \({\text{accu}}_{y}\) analysis of the DL models is shown in Fig. 3. The figure shows that the CCNN and DNN models attained lower \({\text{accu}}_{y}\) values of 0.791 and 0.785, respectively. Likewise, the FNN, conditional VAE, RNN, and MLESM-RNN techniques obtained moderate \({\text{accu}}_{y}\) values of 0.803, 0.801, 0.813, and 0.804, respectively. The DAE-DNN technique achieved the maximum \({\text{accu}}_{y}\) of 0.833.

Fig. 3 Accuracy analysis of DL techniques. A dot plot of accuracy versus method: the trend rises from method 1 to method 3 at 0.813, falls until method 5, and rises again to method 7 at 0.833.
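
For readers who want to reproduce such measures, the following Python sketch computes precision, recall, F-score, and accuracy from binary confusion-matrix counts. The standard binary definitions are used here as an assumption; the surveyed works may instead report multi-class or averaged values, and the counts in the example are hypothetical.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute standard binary classification measures from confusion counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f_score = 2 * precision * recall / (precision + recall)
    else:
        f_score = 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
        "accuracy": accuracy,
    }

# Hypothetical counts: 100 attacks (80 detected) and 900 benign events (20 false alarms).
metrics = classification_metrics(tp=80, fp=20, fn=20, tn=880)
print({name: round(value, 3) for name, value in metrics.items()})
```

With these counts, precision, recall, and F-score are all 0.8 while accuracy is 0.96, which illustrates why accuracy alone can look optimistic on imbalanced intrusion data.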

4 Challenges and Future Developments

The choice of an appropriate data set is one of the major problems in the development of DL-based IDS. Many of the presented methods do not offer reliable performance results because they are based on the NSL-KDD or KDD99 benchmark data sets, which contain older traffic, lack real-time properties, and do not characterize current traffic behaviors and attack scenarios. Hence, obtaining traffic from simulated environments or investigating current data sets, such as the N-BaIoT IoT data set [41] and the CICIDS2017 IDS/intrusion prevention system (IPS) data set [42], could overcome this problem. Published datasets are also available for several fields, like industrial control systems (ICS). Moreover, comparing distinct DL approaches that were evaluated in isolation does not yield a fair comparison of efficiency and effectiveness. This is because of differences in (1) preprocessing, (2) deep network configuration, (3) hardware platforms, (4) the data set used, and (5) the portion of the data set that was used. Hence, further comparative analysis that uses common influencing factors and unified computing platforms for distinct DL frameworks is necessary to attain a fair result.

Yet, DL methods do not cover intrusion detection in all fields. Thus, it is essential to reconsider the IDS problem in fields such as smart grids, 5G, various IoT platforms, and SCADA, which have previously been analyzed with shallow ML and other anomaly detection methods. Extending DL-based IDS to distinct fields requires a data set that truly reflects the targeted environment in order to attain remarkable results. Several DL-based IDS rely on GPUs and CPUs for intensive offline training computation. In response to this tremendous growth, chip vendors have created innovative AI accelerators; the AI chip market is predicted to reach $66.3 billion in 2025 [43]. The more commonly used chips are the application-specific integrated circuit (ASIC), the field programmable gate array (FPGA), and the neural network processing unit (NNPU), along with the Edge TPU, a tiny AI accelerator announced by Google in 2018 for IoT devices. Today's IoT devices and smartphones are equipped with these innovative chips. Thus, leveraging this development in research will produce real-time prototypes instead of relying on offline data sets. Additionally, it will permit the development of innovative IDS for resource-constrained devices. Additional research on hybrid DL frameworks such as the GAN model is essential. It is also valuable to leverage DL methods to move from collaborative IDS to collaborative DL-based IDS.

5 Conclusion

With the emergence of advanced technologies and the drastic increase in data generation, several research communities have investigated the design of DL models to detect intrusions. This survey extensively analyzes the different DL-based IDS models available in the literature. The reviewed methods are examined in terms of objectives, methodology used, datasets used, and performance measures. In addition, a comparative and descriptive analysis of the surveyed DL-based IDS models is provided as a side-by-side comparison in tabular form. Then, a performance analysis of the different DL models is carried out to identify the best-performing solutions. Lastly, the open research issues and future scope of the research are highlighted.