Introduction

Intrusion Detection Systems (IDSs), introduced in 1980 [1], have become one of the most essential defenses in network security and cybersecurity. They were designed to proactively monitor traffic and raise alerts when something malign or intrusive is detected [2]. IDS technology has evolved through many stages since its introduction [3]. However, despite several developments, detection rates have not improved as expected, and there has not been a significant decrease in the number of false alarms. To overcome such performance issues and widen the capabilities of IDSs, research began in the late 1990s to incorporate Machine Learning (ML) techniques into IDS development [4]. With the power of ML, IDSs gain the ability to detect unknown attacks. Attack behaviors change rapidly with time, and an IDS should be able to correctly recognize malign activities in a network. When traditional IDSs encounter new or sophisticated signatures, they may take a relatively long time to analyze the packets and respond [5].

As early as 2004, a study by N. Dalvi et al. [6] revealed a concerning vulnerability of machine learning algorithms to adversarial inputs. Later, it was shown that this vulnerability is deeply present in deep learning and neural networks when they are presented with adversarial perturbations [7,8,9,10,11,12,13]. Various adversarial attack scenarios were developed, and their impacts on classifiers were analyzed. Mechanisms have also been proposed to defend models from adversarial perturbations and minimize their impacts [14]. However, much of this progress was made in image-based areas, like computer vision, image processing, etcetera. Relatively less progress has been made in the IDS domain [15]. One of the major concerns in training IDSs is the datasets: the performance of an IDS hugely depends on the quality of the data it learns from.

The availability of good-quality IDS datasets is a challenge. A major portion of research work in this domain is being conducted and/or evaluated using old datasets [16]. Unlike in the image domain, data in the IDS domain quickly becomes outdated, as data patterns in networks change rapidly and attack behaviors grow more sophisticated. A dataset should reflect contemporary network behaviors and cover sufficient attack scenarios so that an IDS model learns a wide variety of traffic characteristics. On the bright side, there are some datasets that are relatively newer and can serve better than older benchmark datasets like NSL-KDD, DARPA, etcetera [12]. It is important to study the characteristics of modern datasets and analyze how they are affected by adversarial algorithms, so that the research community can more easily judge which dataset fits a project’s requirements.

The objective behind choosing recently published IDS datasets for this study is to understand how an IDS model, trained with such a dataset, behaves in adversarial environments. An IDS deployed in a modern network needs to have sufficient knowledge of modern traffic behaviors to properly analyze and correctly identify undesired data patterns in its network. To achieve this, the IDS needs to learn from a dataset that covers a fair range of traffic scenarios that commonly occur in a typical real-time network.

The novelty of this work lies in the combination of elements such as the contemporary IDS datasets, the adversarial white-box attack algorithms, and more significantly, the domain in which we want to evaluate the impacts of adversarial machine learning. The motive behind choosing the CSE-CIC-IDS2018 dataset is its characteristics, as highlighted in “CSE-CIC-IDS2018 Dataset”, which are close to a real-world environment. Network data that is far from reality might make a model behave as expected in an experimental/research setup, but cannot guarantee the model’s performance in a real-time network. The smaller the gap between a research IDS dataset and the traffic observed in a real-time network, the greater the chance that an experimental model will perform well in a real-world environment.

This work contributes an evaluation of the impacts of adversarial algorithms on contemporary datasets that represent modern traffic behaviors and attack scenarios. The datasets covered in this study are UNSW-NB15, published in 2015; Bot-IoT, published in 2018; and CSE-CIC-IDS2018, published in 2018. The adversarial attack algorithms studied are the Jacobian-based Saliency Map Attack (JSMA), the Fast Gradient Sign Method (FGSM), and Carlini Wagner (CW). Metrics such as Accuracy, Area Under the Curve (AUC), F1 Score, and Recall were used to evaluate the results and analyze the impact of the adversarial algorithms.

The remaining portion of this paper is organized as follows: “Background” presents an overview of adversarial machine learning, the adversarial methods used in this study, and briefly summarizes the datasets studied. “Related Work” presents related work on adversarial sample generation and adversarial machine learning. “Experimental Evaluation” discusses the experimental evaluation process implemented for the study. “Experimentation Results” presents the evaluation results. “Analysis and Discussion” provides an analysis of the adversarial attacks on the datasets. “Conclusions and Future Work” concludes the paper and presents our thoughts for future work.

Background

Adversarial Machine Learning: A Bird's-Eye View

Adversarial Machine Learning (AML) is the process of deceiving an ML model by providing a perturbed input that makes the model render an incorrect prediction. The perturbed input is imperceptible to humans but makes a considerable difference to a neural network. Neural networks are vulnerable to adversarial attacks during the training as well as the testing/validation phases. Variations in attack techniques can be introduced based on factors like the phase (training, testing, etc.), the knowledge of the model that the attacker has, the target of the attack, the influence of the attacker, etc. Attacks carried out in the training phase are termed Poisoning attacks, and those launched during the testing phase are called Evasion attacks. Barreno et al. [17] highlight three properties of an attack: influence, focus of violation (confidentiality, integrity, availability), and specificity of the target. For example, based on some of the factors stated above, an evasion attack can be classified as a white-box attack, where the attacker has complete knowledge of the model (including details like the training dataset, parameters, etcetera); a black-box attack, where the attacker has almost no knowledge of the model; or a gray-box attack, where the attacker has partial knowledge of it.

Methods used for Generation of Adversarial Samples

The adversarial algorithms chosen for this study are all white-box evasion attacks. Although black-box and gray-box attacks are more common in practice (i.e., in real-time environments), most of these techniques aim at collecting information about their target models in a variety of ways, implying that they gradually progress towards becoming white-box attacks, which tend to be more powerful than the other two categories. This reasoning motivated us to choose white-box attacks for our study. This section briefly explains the algorithms we chose for the experiment.

Jacobian-based Saliency Map Attack

The Jacobian-based Saliency Map Attack (JSMA), introduced by [11], is one of the attack techniques evaluated in this study. It is an evasion attack that minimizes the L0 norm: it iteratively generates a saliency map, which is used to choose the feature whose perturbation produces the largest error in the prediction [18]. The attack aims to perturb the least possible number of features to cause misclassification. The process starts by obtaining the Jacobian matrix, whose component (i, j) is the derivative of class output j with respect to input feature i [11]:

$$\begin{aligned} J_F(x) = \frac{\partial F(x)}{\partial x} = \Biggl [\frac{\partial F_j(x)}{\partial x_i}\Biggr ]_{i \times j} \end{aligned}$$
(1)

In the above equation, F represents the second-to-last layer [19]. For each feature selected, the perturbation is adjusted and the iterations continue until misclassification into the target class is achieved or the limit on the maximum number of perturbed features is met [11]. If the algorithm fails to achieve this, it selects the next feature and repeats the process [12]. The authors were able to modify as little as 4.02% of the features per sample and achieved a success rate of 97% [19]. It is a white-box attack algorithm and therefore requires complete knowledge of the architecture and parameters of the targeted model [11].
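The following is a minimal NumPy sketch of the per-feature saliency map described above, assuming the Jacobian has already been computed for a given input; the function and variable names are ours, not those of the original implementation.

```python
import numpy as np

def saliency_map(jacobian, target):
    """Per-feature saliency scores for increasing-feature perturbations.

    jacobian: array of shape (n_classes, n_features), where entry (j, i)
              is the derivative of class output F_j with respect to x_i.
    target:   index of the class the adversary wants the model to predict.
    """
    grad_target = jacobian[target]                    # dF_t / dx_i
    grad_others = jacobian.sum(axis=0) - grad_target  # sum over j != t of dF_j / dx_i
    # A feature is useful only if it pushes the target class up (>= 0)
    # and the other classes down (<= 0); all other features score zero.
    useful = (grad_target >= 0) & (grad_others <= 0)
    return np.where(useful, grad_target * np.abs(grad_others), 0.0)

# The highest-scoring feature is perturbed next, and the loop repeats until
# the target class is predicted or the feature budget is exhausted.
```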

Although the success rates achieved by JSMA and FGSM are almost similar, the number of features modified is relatively smaller and the computational cost higher with JSMA than with FGSM [18].

Fast Gradient Sign Method

The FGSM attack is a technique proposed by [9] for adversarial data generation. In this technique, the perturbation is defined as follows:

$$\begin{aligned} \eta = \epsilon \cdot \mathrm{sign}(\nabla _x J(\theta , x, y)) \end{aligned}$$
(2)

In the above equation, \(\theta\) represents the parameters of the model, x is the input, y is/are the corresponding target(s), and J(\(\theta\), x, y) is the cost used to train the neural network [9]. \(\epsilon\) represents the magnitude of the attack, and the gradient is obtained by backpropagation.

The attack is built around the model's loss function: the perturbation follows the sign of the loss gradient with respect to the input [15]. Unlike the JSMA attack, the FGSM attack does not aim at generating minimal adversarial perturbations. Instead, it is designed to speed up adversarial data generation [8], which is why it saves computation time compared to JSMA.
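As an illustration of Eq. (2), the following is a minimal sketch of the FGSM perturbation written in TensorFlow 2 style with GradientTape; it is not the Cleverhans/TensorFlow 1.x setup used in our experiments, and `model`, `x`, and `y` are hypothetical placeholders for a trained Keras classifier (assumed to output class probabilities) and a pre-processed batch.

```python
import tensorflow as tf

def fgsm_example(model, x, y, eps=0.1):
    # x: batch of normalized feature vectors, y: integer class labels (placeholders)
    x = tf.convert_to_tensor(x, dtype=tf.float32)
    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy()
    with tf.GradientTape() as tape:
        tape.watch(x)
        loss = loss_fn(y, model(x))             # J(theta, x, y) from Eq. (2)
    grad = tape.gradient(loss, x)
    eta = eps * tf.sign(grad)                   # eta = eps * sign(grad_x J)
    return tf.clip_by_value(x + eta, 0.0, 1.0)  # keep features in the [0, 1] range
```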

Carlini Wagner

This attack, proposed by [8], is considered to be one of the most powerful attacks against neural network models. It is often used as a benchmark algorithm to evaluate the vulnerability of a model, and also to assess the strength of an adversarial data generation technique. The L2-norm variant of the attack is used to generate adversarial samples and is defined as follows:

$$\begin{aligned} \mathrm{minimize} \ \Bigl \Vert \frac{1}{2}(\tanh (w) + 1) - x \Bigr \Vert _2 ^2 + c \cdot f\Bigl (\frac{1}{2}(\tanh (w) + 1)\Bigr ) \end{aligned}$$
(3)

The main goal of the algorithm is to minimize the distortion in the L2 metric. The evaluations conducted by the authors show that the CW attack defeats the defensive distillation mechanism, which is another indicator of its strength. The L2 attack implemented in this work is available in the Cleverhans library [20].
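As a rough illustration, the sketch below invokes the L2 attack through the Cleverhans v3 attack interface; `wrap` (a wrapped, trained model), `sess` (an active TensorFlow session), and the parameter values are assumptions for illustration rather than the exact settings used in this study (see Table 3).

```python
from cleverhans.attacks import CarliniWagnerL2

# wrap: trained model wrapped for Cleverhans; sess: active TF session (assumed to exist)
cw = CarliniWagnerL2(wrap, sess=sess)
x_test_cw = cw.generate_np(x_test,
                           binary_search_steps=9,  # searches for the constant c in Eq. (3)
                           max_iterations=1000,
                           initial_const=1e-2,
                           batch_size=128,
                           clip_min=0.0, clip_max=1.0)
```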

Overview of the Datasets

Data is a fundamental and essential ingredient for conducting research in any field of science. In the modern era, the research community has a great advantage in the publicly available datasets, a good number of which are used as benchmarks for research and development. In an IDS dataset, the records represent network traffic, and each data point is categorized as either normal or malicious; this categorization is used for the evaluation [21]. Generating a realistic dataset is not only tedious but also complicated to make publicly available, because of the sensitive information it contains about the network, its environment, and its users [22]. Despite these hurdles, a considerable number of datasets covering relatively modern network traffic scenarios have fortunately been made available recently [23]. They have been generated to overcome the shortcomings of older benchmark datasets like NSL-KDD [24] and to make the data more useful for research activities. There is a need to study their characteristics and properties to understand how useful they can be in various forms of research. This study uses three recently published datasets: UNSW-NB15, Bot-IoT, and CSE-CIC-IDS2018.

UNSW-NB15 Dataset

Developed in the Cyber Range Lab at UNSW (University of New South Wales) Canberra, UNSW-NB15 is a benchmark dataset that contains a hybrid of realistically generated normal traffic behaviors and synthetically generated contemporary attack behaviors. The IXIA PerfectStorm tool was used to generate the data [16, 25], and the tcpdump tool was used to capture 100 GB of raw traffic. The dataset covers nine types of attacks and has a total of 49 features, including the label attribute. A total of 12 algorithms, built using tools like Argus and Bro-IDS, were developed to generate the features of the dataset [26,27,28,29]. This dataset is well balanced compared with the other two datasets used in this study, because the difference between the numbers of benign and malign traffic instances is much smaller.

Bot-IoT Dataset

The Bot-IoT dataset was also developed in the Cyber Range Lab of UNSW Canberra, in 2018. A realistic network environment was created to generate this dataset. As the name suggests, it consists of IoT-based traffic, both benign and botnet. The raw data captured totals 69.3 GB and contains over 72 million records. For easier handling, the authors also published a smaller version of the dataset, extracting 5% of the data through specific MySQL queries [23, 30,31,32,33,34]. This smaller version, split into training and testing sets, with about 3 million records and around 1 GB in size, has been used in this study.

CSE-CIC-IDS2018 Dataset

The CSE-CIC-IDS2018 dataset, hereafter referred to as the CIC-IDS2018 dataset, was developed as a collaborative project between the Communications Security Establishment (CSE) and the Canadian Institute for Cybersecurity (CIC). The dataset covers seven different attack scenarios and was generated in an environment that is close to reality because of the massive resources used. The attack-generating network had up to 50 devices, and the victim network was divided into 5 departments with a total of 450 devices, including servers and other machines. CICFlowMeter-V3 was used to generate bidirectional network traffic and to extract features [35,36,37]. The traffic data was collected for 10 days and saved in 10 different files; 9 of those files have 79 features, and the remaining file has 83. The dataset is too large to be handled in full, so we used about 20% of it, making sure that all classes are included and that each has a balanced number of instances. A brief summary of the datasets is presented in Table 1.

Table 1 Overview of the datasets

Related Work

This section discusses various works that revolve around adversarial machine learning, including works that propose adversarial attack techniques, lay out taxonomies of approaches for generating adversarial data, put forth mechanisms for defending against adversarial techniques, etcetera.

One of the early studies on adversarial attack techniques and defenses was published in 2006, by [17]. The authors discussed how the learning algorithm can be corrupted when detailed information about the model and its properties is provided.

The authors of [38] propose a strategy to make linear classifiers more robust against adversarial settings, and in particular, investigate two methods, namely, random subspacing and bagging, for the construction of ensemble-classifier models.

In [39], the authors propose an adaptive adversarial technique for embedding a backdoor in a model’s training data and/or its parameters that can bypass the currently existing mechanisms for detecting the presence of backdoors.

The authors of [40] studied the vulnerability of the NSL-KDD dataset against the FGSM technique. They conducted experiments to investigate the presence of attack vectors in the data samples that can be used to let adversarial inputs bypass the detection mechanism.

In [15], the authors used the NSL-KDD dataset to study the impacts of adversarial learning algorithms on deep neural networks, with a Multi-Layer Perceptron (MLP) model. They also examined the uses of feature selection in adversarial sample generation. The attack techniques used in their work are FGSM, Deepfool, JSMA, and CW. Their evaluation results indicate that it is not so beneficial for an adversary to modify a large number of features in the adversarial sample generation.

The authors of [41] propose a GAN-based black-box adversarial technique and analyze how practical its impacts are on a network-based IDS (NIDS). Their results suggest that a black-box adversarial attack can also have a considerable impact on the performance of a deep neural network (DNN). The NSL-KDD dataset was used for their study.

In [12], the author studied the performance of IDS models trained with the NSL-KDD and KDD-99 datasets under two attacks, JSMA and FGSM. The classifiers used for the analysis include Random Forest (RF), MLP, Support Vector Machine (SVM), and Decision Tree (DT). Although the attacks used in that study were proposed for image-domain classifiers, the results showed that these attack methods affect IDS models, too.

In [18], the authors evaluated the performance of IDS models by training them with the NSL-KDD and CIC-IDS2017 datasets separately. The adversarial techniques they used were DeepFool, JSMA, FGSM, and CW. The study was performed only on Denial-of-Service (DoS) attack instances. The evaluation results show that the overall performance of the model decreased by up to 40% when trained with the CIC-IDS2017 dataset, and by 13% when trained with NSL-KDD.

The authors of [42] conducted a survey on the IDS datasets commonly used for AML research in the IDS domain, and on the attacks implemented. Their study suggests that up to 60% of the works use the NSL-KDD dataset, up to 30% use CTU-13, and up to 10% use the CIC-IDS2017 dataset. Additionally, it suggests that the more commonly used attack algorithms are JSMA, DeepFool, FGSM, and WGAN. The most affected classifiers include SVM, DT, and Naive Bayes (NB), while RF and SVM with a Radial Basis Function (RBF) kernel are relatively more robust than the others.

Aayush Arora and Shantanu [43] present a review of GAN applications in the cybersecurity domain on currently stable datasets. In this paper, they review the extensions of GAN frameworks relevant to the cybersecurity domain such as Deep Convolutional Generative Adversarial Networks (DCGANs), Bidirectional Generative Adversarial Networks (BiGANs), Cycle-Consistent Adversarial Networks (Cycle-GANs) and commonly used stable datasets. They also discuss applications of GAN like Steganography, Password Guessing, and Intrusion Detection Systems. Additionally, they provide a case study to evaluate the performance of the BiGANs for Anomaly Detection.

A survey by Kusha Sadeghi et al. [44] on attacks and defenses in adversarial ML provides system-driven taxonomies for the following aspects: datasets; the architectures of ML models; the adversary’s utilities (knowledge, capability, and goal); the strategies followed by adversaries; and the results of the defense mechanisms. The authors’ idea behind a system-oriented classification is that a system model is necessary to conduct and repeat experiments that launch adversarial attacks and implement their corresponding defenses. In the authors’ view, a race between attacks and defenses carried out using such a model can help enhance the robustness of the model, and of ML applications.

Experimental Evaluation

The study summarized in this paper is oriented around multi-class classification, as all the datasets used have multiple classes. To suit the nature of the datasets, four efficient classification algorithms have been chosen, namely, MLP, DT, RF, and SVM. Table 2 presents the hyperparameters chosen for the evaluations. To handle multi-class classification, the OneVsRestClassifier function is used to fit one classifier per class.
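A minimal scikit-learn sketch of this setup is shown below; the hyperparameter values are illustrative placeholders rather than the exact values of Table 2, and `X_train`/`y_train` denote the pre-processed training data.

```python
from sklearn.multiclass import OneVsRestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

# One-vs-rest wrapper fits one binary classifier per class (placeholder hyperparameters)
classifiers = {
    "MLP": OneVsRestClassifier(MLPClassifier(hidden_layer_sizes=(64,), max_iter=300)),
    "DT":  OneVsRestClassifier(DecisionTreeClassifier()),
    "RF":  OneVsRestClassifier(RandomForestClassifier(n_estimators=100)),
    "SVM": OneVsRestClassifier(SVC(kernel="rbf", probability=True)),
}

for name, clf in classifiers.items():
    clf.fit(X_train, y_train)   # X_train, y_train: pre-processed training split
```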

Table 2 Hyperparameters for the classifiers [48]

Software Specifications

The entire programming set-up is based on Python 3.6.5, Scikit-learn V.0.19.1 library [45], Tensorflow V.1.13.2 [46], and Keras V.2.1.5 [47]. For the implementation of the attack algorithms, Cleverhans V.3.0.1 library [20] has been used.

Data Pre-Processing

Data often needs processing before a learning algorithm is trained on it. Two pre-processing steps are implemented in this work - One-Hot Encoding and Min-Max Normalization.

One-Hot Encoding

This technique was chosen to convert the entire dataset to a numerical format. Some features in each dataset have non-numerical values, for example, categorical data. The One-Hot encoding method addresses this scenario.
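For instance, the categorical columns can be expanded into binary indicator columns as in the sketch below; the file and column names are illustrative placeholders (e.g., the protocol/service/state fields of a flow record), not an exact listing of each dataset's fields.

```python
import pandas as pd

df = pd.read_csv("ids_dataset.csv")                # placeholder file name
categorical_cols = ["proto", "service", "state"]   # example non-numerical features
df_encoded = pd.get_dummies(df, columns=categorical_cols)  # one binary column per category
```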

Min-Max Normalization

This technique was applied to all the datasets to scale their values between 0 and 1. Since different features in a dataset might have values distributed on different scales, this technique brings all the values to a common scale. Additionally, the attack methods require all features to lie within a common range to be effective [18].
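A minimal sketch of this step with scikit-learn is shown below, assuming the scaler is fit on the training split and then applied to the test split.

```python
from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler(feature_range=(0, 1))
X_train = scaler.fit_transform(X_train_raw)  # learn per-feature min/max from training data
X_test  = scaler.transform(X_test_raw)       # scale test data with the same parameters
```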

Steps Involved in the Experiment

There are two stages in the experiment: 1) training a learning algorithm with the original data; 2) generating adversarial samples from the original data. In the first stage, training and testing phases are carried out, as shown in Fig. 1. In both phases, the original data is pre-processed. MLP has been used as the baseline learning algorithm. Baseline results are therefore obtained when MLP is tested with the data (original or adversarial), and for evaluation purposes, each of the other algorithms (DT, RF, and SVM) is implemented over the baseline algorithm.

Figure 2 outlines the steps involved in the second stage, the adversarial sample generation. There are training and testing phases in this stage, too. The main difference is that, in the testing phase, after the test data is pre-processed, it is fed to the MLP, and each attack algorithm is invoked to introduce adversarial perturbations into the test data. The resulting adversarial test set is forwarded to the classifier for the final predictions. The attacks target the normal class in the chosen datasets, under white-box settings. Table 3 presents the parameters set for each attack.
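The sketch below illustrates this second stage using the Cleverhans v3 attack classes on a trained Keras MLP; the wrapper class, parameter values, and variable names are assumptions for illustration and not a transcript of our exact scripts (the attack parameters actually used are given in Table 3).

```python
import keras.backend as K
from cleverhans.utils_keras import KerasModelWrapper
from cleverhans.attacks import FastGradientMethod, SaliencyMapMethod

sess = K.get_session()               # TF1-style session behind Keras (assumed setup)
wrap = KerasModelWrapper(mlp_model)  # mlp_model: trained Keras MLP (placeholder name)

fgsm = FastGradientMethod(wrap, sess=sess)
X_test_fgsm = fgsm.generate_np(X_test, eps=0.1, clip_min=0.0, clip_max=1.0)

jsma = SaliencyMapMethod(wrap, sess=sess)
X_test_jsma = jsma.generate_np(X_test, theta=1.0, gamma=0.1,
                               clip_min=0.0, clip_max=1.0)

# The CW L2 attack is invoked analogously (see the sketch in "Carlini Wagner").
# Each adversarial test set is then passed to the DT, RF, and SVM classifiers
# for the final predictions.
```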

Fig. 1
figure 1

Sequence of steps involved using the original data

Fig. 2
figure 2

Sequence of steps involved using the adversarial data

Table 3 Parameters set for the attacks on all datasets [48]

The evaluation was initially conducted 10 times on one machine with the UNSW-NB15 and Bot-IoT datasets, and the average values were noted as the experimental results. Later, the evaluation with the UNSW-NB15 dataset was carried out an additional 3 times, and the evaluation with the CIC-IDS2018 dataset 3 times, on a different machine (a server) with the following configuration: 128 GB RAM, a dual-core processor, and 3.17 TB of secondary storage. The results included in this paper are the averages of the corresponding runs.

Evaluation Metrics

The last step in the evaluation with each attack algorithm is to test every classifier with the original test set and then with the adversarially perturbed set. The same process is applied to every dataset. The metrics used for evaluation are Accuracy, Area Under the Curve (AUC), F1-score, and Recall.
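A minimal sketch of this evaluation step is shown below; the helper function and variable names are ours, and multi-class AUC is computed here on binarized labels as one reasonable choice.

```python
from sklearn.metrics import accuracy_score, roc_auc_score, f1_score, recall_score
from sklearn.preprocessing import label_binarize

def evaluate(clf, X_eval, y_eval, classes):
    y_pred = clf.predict(X_eval)
    y_prob = clf.predict_proba(X_eval)
    y_bin = label_binarize(y_eval, classes=classes)   # one-hot labels for AUC
    return {
        "Accuracy": accuracy_score(y_eval, y_pred),
        "AUC":      roc_auc_score(y_bin, y_prob, average="weighted"),
        "F1":       f1_score(y_eval, y_pred, average="weighted"),
        "Recall":   recall_score(y_eval, y_pred, average="weighted"),
    }

# Called once with the original test set and once with each adversarial test set.
```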

Experimentation Results

This section presents the results obtained, ordered by the datasets, and discusses the impact of each attack algorithm on each of the datasets.

UNSW-NB15 Dataset

Tables 4, 5, 6, 7 summarize the results in terms of the various metrics used. The highest accuracy with normal data is obtained from the baseline algorithm, MLP, and the lowest from SVM. Considering the overall adversarial accuracy scores, the results indicate that the CW attack has the highest impact, and JSMA the least.

Table 4 Accuracy results for UNSW-NB15 dataset
Table 5 AUC results for UNSW-NB15 dataset
Table 6 F1 score results for UNSW-NB15 dataset
Table 7 Recall results for UNSW-NB15 dataset

Jacobian-based Saliency Map Attack

A total of 95 distinct features are altered by the JSMA attack in this dataset, with an average of 22 per data point. The total percentage of altered features is 11%. The average time taken to generate adversarial samples is 8 min. With UNSW-NB15, the overall results show that this attack has the highest impact on the SVM classifier and the lowest impact on the RF classifier. This makes SVM the most vulnerable to JSMA among the chosen classifiers, and RF the least vulnerable.

Fast Gradient Sign Method

A total of 192 features are altered by this attack, with an average of 162 features per data point. The total percentage of altered features is 78%. The time taken for adversarial sample generation is less than 5 seconds. The results suggest that this attack has more impact on the DT classifier than on the others, and the least impact on RF. RF and SVM are therefore almost equally robust against the FGSM attack, and both fare better than DT.

Carlini Wagner

A total of 196 features are altered by this attack, with an average of 133 features per data point. The total percentage of altered features is 65%. The time taken for adversarial sample generation is almost 50 min, the longest among all the selected attack algorithms. The results suggest that this attack has the highest impact on the DT classifier and the least impact on RF. Therefore, RF is more robust against the CW attack than the other two algorithms, and DT is the most vulnerable to CW.

Bot-IoT Dataset

Tables 8, 9, 10, 11 summarize the results for Bot-IoT in terms of the various metrics used. The highest accuracy with normal data is obtained from both the DT and RF classifiers, and the lowest from MLP. Considering the overall adversarial accuracy scores, the results indicate that the FGSM attack degrades accuracy by a greater magnitude than the other two, and JSMA has the least impact.

Table 8 Accuracy results for Bot-IoT dataset [48]
Table 9 AUC results for Bot-IoT dataset [48]
Table 10 F1 score results for Bot-IoT dataset [48]
Table 11 Recall results for Bot-IoT dataset [48]

Jacobian-Based Saliency Map Attack

A total of 57 features are altered, with an average of 28 per data point, making the total percentage of altered features 43%. The time taken for adversarial data generation is close to 14 min. The DT classifier is the most vulnerable to this attack, and RF is the least.

Fast Gradient Sign Method

A total of 60 distinct features are altered using this attack, with an average of 34 per data point. The percentage of altered features is 52%. The attack takes around 20 seconds to generate adversarial data with the Bot-IoT dataset. The DT and RF classifiers are almost equally robust against this attack, and both fare better than SVM.

Carlini Wagner

A total of 59 distinct features are altered, with an average of 42 per data point, and 52% as the total percentage of altered features. The attack takes close to 2 h to generate adversarial samples. The impact is almost the same on all the classifiers, with DT showing relatively less vulnerability than the other two, and SVM showing the most.

CIC-IDS2018 Dataset

Tables 12, 13, 14, 15 summarize the results for the CIC-IDS2018 dataset in terms of the various metrics used. The highest accuracy with normal data is obtained from the RF classifier, and the lowest from MLP. Considering the overall adversarial accuracy scores, the results indicate that the CW attack degrades accuracy by a greater magnitude than the other two, while FGSM has the least impact.

Table 12 Accuracy results for CSE-CIC-IDS2018 dataset
Table 13 AUC results for CSE-CIC-IDS2018 dataset
Table 14 F1 score results for CSE-CIC-IDS2018 dataset
Table 15 Recall results for CSE-CIC-IDS2018 dataset

Jacobian-based Saliency Map Attack

A total of 93 features are altered, with an average of 72 per data point. The percentage of altered features is 42%. The time taken for this attack to generate adversarial samples is close to 10 h. The SVM classifier has been affected the least of all, and the DT has been affected the most.

Fast Gradient Sign Method

A total of 187 features are altered, with an average of 136 per data point. The percentage of altered features is 85%. The time taken for this attack to generate adversarial samples is around 6 h. The RF classifier has been affected the least of all, and the SVM has been affected the most.

Carlini Wagner

A total of 189 features are altered, with an average of 157 per data point. The percentage of altered features is about 86%. The time taken for this attack to generate adversarial samples is around 14 h. The RF classifier has been affected the least of all, and the DT has been affected the most.

Analysis and Discussion

Considering datasets, classifiers, and attacks as three entities, the results obtained from the evaluation indicate that the influence of an entity varies with the other two. This section analyzes the results further and notes appropriate implications.

Implications of this Study

Although all three attack algorithms affected the performance of the classifiers, the variations in their impacts can help in investigating the characteristics of the datasets more deeply. Based on the results, the overall impact on the CIC-IDS2018 dataset is relatively lower, followed by the UNSW-NB15 dataset, and then Bot-IoT. One possible reason behind this pattern is the number of features in the datasets: with fewer features, the vulnerability may increase. If the entire volume of each dataset were considered for the study, the degree of class imbalance (being well balanced or imbalanced) would also become a factor in the variations in performance.

Looking at the overall results from the classifiers' end, the RF classifier stood almost steadily robust against all three attacks, with all three datasets. Another significant observation is that the impact patterns are not uniform across the evaluation metrics. This means an adversary needs to decide on a target performance metric and design the attack accordingly.

Although the CW attack is considered one of the most sophisticated and powerful algorithms, its result patterns on the IDS datasets chosen for this study are similar to those of the other two attack techniques and are not exceptional, per se.

Contribution to the Literature

Data is a precious entity, driving ML-based research in nearly every area of science. The quality and characteristics of a dataset are crucial in tuning the efficiency of a model. This work contributes to the literature by analyzing the behaviors of ML-based IDSs in adversarial environments using datasets that consist of realistic network patterns.

A consequential avenue for investigation is the extent of validity of these adversarial white-box attacks in the context of IDS datasets. Although the adversarial samples generated by the attack algorithms succeed in dropping the performance of an IDS model, there is a need to examine their efficiency in generating valid adversarial data. The goal of an adversarial algorithm targeting an IDS model is to modify an attack data instance in such a way that it looks like a benign instance to the target while retaining the properties that make it the attack it is supposed to be. In other words, an adversarial data instance, X’, generated from an original (non-adversarial) attack instance, X, is valid only if X’ can achieve exactly what X can in the network guarded by the target IDS. The real success of an adversarial attack lies in generating valid deceptive samples that can bypass detection and launch the attacks they are meant for. Pujari et al. [49] list some factors that indicate the validity of adversarial samples. We want to continue this research by analyzing how successful various white-box attacks can be on IDS research datasets.

Limitations

A substantial limitation is the resources required to process the huge volumes of the datasets used in the experiments. Datasets like Bot-IoT and CIC-IDS2018 are big data and need efficient frameworks to handle them. We used smaller portions of these datasets to accommodate the resource constraints. One extension of this work would be to repeat the experiments with the full datasets using a framework like Hadoop.

Insights into Mitigation Strategies

Improving the resistance of IDS models to adversarial inputs has been a substantial stream of research ever since these vulnerabilities were discovered. The insights into enhancing a model's resistance drawn from our experiments are presented here. The datasets chosen for this work have many features, but not all features in a dataset contribute significantly to the outputs. One approach to reducing the impact of adversarial inputs is to train a model extensively on the features that decide the output. Techniques such as feature selection, feature reduction, etcetera, can help filter out the features carrying little to no weight in predicting the output, as sketched below. Such a training process enables a model to focus more on the deciding attributes and ignore adversarial perturbations in the remaining features. Furthermore, some features in a dataset may allow values only within a specific range, in which case an extra step can be added to validate the values of those features before prediction. Another filtering strategy can be to validate the values held by the non-changeable features of an input. The approaches mentioned here are superficial, as a much more thorough defensive mechanism is required to make an IDS model effectively robust.
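As a minimal sketch of one such filtering step, the snippet below applies mutual-information-based univariate selection; the value of k is a placeholder, and this is only one of several possible feature-selection techniques mentioned above.

```python
from sklearn.feature_selection import SelectKBest, mutual_info_classif

selector = SelectKBest(score_func=mutual_info_classif, k=20)  # k is illustrative
X_train_sel = selector.fit_transform(X_train, y_train)  # keep the k most informative features
X_test_sel  = selector.transform(X_test)                # apply the same selection to test data
```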

Conclusions and Future Work

There is a need to study the properties of the available modern IDS datasets and switch from old, outdated datasets to contemporary ones. As important as it is to analyze how useful the modern datasets are in machine learning-based research, it is equally essential to know how useful they are under adversarial settings. This work studies three recently published IDS datasets, namely, UNSW-NB15, Bot-IoT, and CIC-IDS2018, in the light of three adversarial attack algorithms, namely, JSMA, FGSM, and CW. The performance is evaluated using multiple classifiers - SVM, DT, and RF - with MLP as the baseline classifier. The experimental results show that RF is relatively more robust in adversarial environments, and in terms of the datasets, CIC-IDS2018 offered more resilience to the classifiers. The impacts of the attacks varied with the datasets and classifiers.

We would like to extend this study in multiple directions. One of them is to analyze the impacts of the white-box attacks on recent datasets using other powerful algorithms, especially, deep learning algorithms. Another direction is to study black-box and gray-box attack techniques and develop defense mechanisms to tackle them.