1 Introduction

Software-defined networking (SDN) is a new architecture consisting of three layers, the data, control and application planes, in which the data and control layers are independent of each other, as shown in Fig. 1. The data plane consists of the switches and routers that forward network traffic; the control plane is the intelligent part of the network and consists of controllers such as NOX, POX, Beacon, Floodlight and OpenDaylight; and the application plane contains the applications used for SDN configuration [1].

Fig. 1 SDN architecture

IT organizations may encounter security threats such as DDoS attacks owing to the lack of network coherency while their networks are being re-configured to SDN [2]. DDoS is one of the most damaging attacks on the Internet: it weakens the network and the server by exhausting the network bandwidth or connectivity needed to provide regular service [3]. As shown in Fig. 2, the attackers send a large number of requests to the OpenFlow switch from different hosts simultaneously, which puts the network in difficulty.

Fig. 2 DDoS attack schema in SDN

DDoS attacks target a wide spectrum of resources and sites, from banks’ servers to news sites, and pose major challenges for the managers and users of these systems. On Feb 28, 2018, GitHub, one of the most important code hosting platforms, was hit by attack traffic of 1.3 Tbps, which took it off-line for 5 min and caused the site many problems [4]. Between February 5 and March 1, 2019, about 17 DDoS attacks were launched against the University of Albany site, which disrupted its servers for at least 5 min. Although the data of instructors and students were not affected, some of the servers went off-line [5]. Such persistent attacks call for procedures for detecting and preventing DDoS attacks.

The existing approaches in this context have advantages but also the following drawbacks: difficulty in selecting appropriate time periods for monitoring the traffic in periodic methods [6]; shortcomings and delays in detecting DDoS attacks, which may lead to the loss of resources such as bandwidth and CPU [7]; deactivation of the controller and switch; an unwanted increase in response time [8]; and the high cost of maintaining network security by adding hardware.

This article proposes a method that combines statistical and machine learning techniques in SDN to overcome these drawbacks in DDoS attack detection.

The method applies a mechanism for selecting the time periods used in monitoring attacks, which the available methods do not consider. The aim is to select the time period that achieves the maximum detection rate, which is not necessarily the shortest or the longest one. Periodic monitoring and scheduled traffic screening increase the efficiency of the controller in terms of workload. Another advantage is that no custom hardware is necessary to detect attacks. The method increases the accuracy of DDoS detection and is independent of the network topology.

The attacks assessed here are HTTP-based application layer attacks [9], which occur in low-volume or high-volume forms. High-volume attacks send many requests to a server or computer and consume extra bandwidth and processing capacity [10], whereas low-volume attacks generate less incoming traffic and can be disguised by expert attackers [11]. In this method, both forms are assessed. The model consists of collector, entropy-based and classification sections.

Statistical information from the switches and hosts is gathered in the collector section. The entropy value and the static and dynamic thresholds are calculated in the entropy-based section.

In the classification section, 15 features are extracted for the hosts of each flow from the recorded data samples of incoming packets. The samples are then fed into the classification section as training inputs to build models with different classification algorithms.

This method yields 99.85% accuracy with 0.1% FPR on the UNB-ISCX dataset and 99.12% accuracy on the CTU-13 dataset. These results indicate that the model outperforms its counterparts. The main contribution of this article is the combination of machine learning and statistical methods to improve the detection of DDoS attacks in SDN networks; the available methods do not exploit this combination to achieve higher detection performance.

This article is organized as follows: the literature is reviewed in Sect. 2; the method is proposed in Sect. 3; the datasets are presented in Sect. 4; the evaluation metrics are given in Sect. 5; the implementation is described in Sect. 6; the results are reported in Sect. 7; the analysis is presented in Sect. 8; the experiments are compared in Sect. 9; and the article is concluded in Sect. 10.

2 Literature review

There are many studies on DDoS attack detection. The findings of some of the available articles are summarized in this section.

Researchers in [12] applied K-means clustering and the Naive Bayes method to DDoS attack detection in two steps: (1) clustering data with similar behavior into groups and labeling all data according to the K clusters and (2) classifying the labeled data groups with the Naive Bayes algorithm.

A computer vision technique is applied to detect DDoS attacks in [13], where, unlike in statistical and machine learning methods, the traffic records are treated as images and attack detection is viewed as a computer vision problem. A multivariate correlation analysis method is introduced to characterize the traffic records accurately and convert them into images. Detection relies on the Earth mover’s distance (EMD), which is computed as the distance between two probability distributions.

For known and unknown DDoS attacks, researchers in [14] applied an artificial neural network (ANN) and showed that the method depends on training the algorithm with the given dataset. Their method is compared with counterparts such as backpropagation (BP), Chi-square, support vector machines (SVM) and Snort, and achieves a detection accuracy of 98%.

DDoS attack detection in cloud computing and SDN networks is assessed in [15], where different models with different features are applied to the datasets used for training and testing. The authors note that the models must be updated to increase efficiency. Among the three DDoS attack detection models proposed for SDN networks, the best is Mglobal, with 89.30% accuracy.

The authors in [16] applied different features to detect whether an attack has occurred. Because more than one major parameter is involved in judging DDoS attacks, the significant issue is how these parameters are determined; for example, the destination Internet Protocol (IP) address is one of the attack detection parameters, and it can be assessed through entropy. The detection method is evaluated with this model and many parameters.

A fast attack detection method is proposed in [17] to decrease the workload of controllers and switches, where a neural network algorithm is applied. A combination of entropy-based and classification algorithms is presented as well. This method can detect both high-volume and low-volume DDoS attacks.

To implement their model, researchers in [18] applied two data mining algorithms, C5.0 and Ripper. Their model is tested on the UNB-ISCX datasets and achieves a detection rate above 99%.

Researchers in [19] applied a statistical approach alongside machine learning techniques to detect attacks. In the statistical approach, predetermined distributions are usually applied to model the normal and abnormal behavior of the network traffic, in addition to distance measures; in the machine learning stage, K-means, SVM, decision tree, Naive Bayes and AI algorithms are applied as classifiers.

A new solution for detecting DDoS attacks in IoT network infrastructures is proposed in [20], where sFlow-based and adaptive polling-based sampling techniques are applied in the data plane to manage high traffic flows. After the distributed traffic is sampled in the data plane, Snort-IDS and stacked autoencoders (SAE), an unsupervised algorithm, are applied to distinguish normal traffic from attacks with high accuracy and a low FPR.

In a general assessment in [21], the deep learning models considered are convolutional neural networks, deep neural networks, recurrent neural networks and deep Boltzmann machines. The efficiency of each model is determined in both binary and multiclass classification by applying the CSE-CIC-IDS2018 and BoT–IoT datasets, which contain real traffic. The authors found that implementing their method is costly and complex because it requires special hardware such as a graphics processing unit (GPU) and hundreds of software machines.

The researchers in [22] proposed a dynamic multilayer perceptron (MLP) combined with a feature selection technique to detect DDoS attacks, where a feedback mechanism is applied to update and reconstruct the detector when detection is not accurate. In their model, as the network traffic grows more complex and changes, some of the selected features are no longer able to distinguish normal traffic from attacks, and the resulting failure is detected. Compared with its counterparts, the proposed method performs well, and the feedback mechanism can improve the FPR and FNR.

3 The proposed method

In this study, a combination of an entropy-based method and classification algorithms is applied to detect high-volume and low-volume DDoS attacks. The task is a two-class classification that distinguishes normal flows from attacks. The three applications introduced in the Floodlight controller [23] for collecting flows and calculating entropy are shown in Fig. 3.

Fig. 3 Proposed method

The method shown in Fig. 3 consists of the collector, entropy-based and classification sections, which operate together to detect the DDoS attacks that occur in the Floodlight controller. Each section is introduced in the following text.

3.1 Collector section

In this section, the statistics of the network flows and communications recorded by the switches over a specific period of time are collected. These statistics include the total number of bytes sent, the number of packets sent and the flow duration. When a connection is established between two hosts, the first packet is sent to the controller, where it is stored together with its source IP, source port, destination IP, destination port, packet bytes and packet arrival time [24]. The same holds for all packet-in messages. Once all the flows are available, the statistics between the two hosts are computed and passed to the controller.
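The aggregation described above can be sketched in a few lines. This is only an illustrative sketch, not the module used in the study: the `PacketIn` record and the `collect_flow_stats` function are hypothetical names, and a flow is assumed to be identified by its source IP, source port, destination IP and destination port.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PacketIn:
    """Hypothetical record of one packet-in message stored by the controller."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    num_bytes: int
    arrival_time: float  # seconds


def collect_flow_stats(packets):
    """Aggregate packet-in records into per-flow statistics.

    Assumption: a flow is keyed by (src_ip, src_port, dst_ip, dst_port).
    Returns, for each flow, the total bytes, the packet count and the
    flow duration (last arrival time minus first arrival time).
    """
    raw = {}
    for p in packets:
        key = (p.src_ip, p.src_port, p.dst_ip, p.dst_port)
        if key not in raw:
            raw[key] = {"bytes": 0, "packets": 0,
                        "first": p.arrival_time, "last": p.arrival_time}
        entry = raw[key]
        entry["bytes"] += p.num_bytes
        entry["packets"] += 1
        entry["first"] = min(entry["first"], p.arrival_time)
        entry["last"] = max(entry["last"], p.arrival_time)
    return {
        key: {"bytes": e["bytes"], "packets": e["packets"],
              "duration": e["last"] - e["first"]}
        for key, e in raw.items()
    }
```

In a controller, the same bookkeeping would be driven by packet-in events rather than by a list held in memory.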

3.2 Entropy-based section

Here, entropy is applied to detect most of the attacks. The main advantage of entropy-based filtering is that it provides a fast and convenient way of filtering suspicious flows. This section is easy to develop and implement in SDN environments, because it imposes only a low CPU load on the controller.

DDoS attacks impose additional overhead and disrupt Web activities; thus, the state of the target system is measured by calculating the entropy of each IP in the SDN network. To calculate the entropy, it is assumed that there is a time window, W, with n distinct elements, where \(X_{(i, t)}\) is observation i in the set at time t. The size of W in Eq. (1) is called the time window size [25].

$$\begin{aligned} W=\{{{X}_{(1,t)}},{{X}_{(2,t)}},\ldots ,{{X}_{(n,t)}}\} \end{aligned}$$
(1)

where W is the time window, and \({X}_{(i,t)}\) is the count of flows i in time t at n different possible states.

The probability of \({X}_{(i,t)}\) occurring in W is calculated through Eq. (2):

$$\begin{aligned} p({{X}_{(i,t)}})=\frac{{{X}_{(i,t)}}}{n} \end{aligned}$$
(2)

where \(p({{X}_{(i,t)}})\) is the occurrence probability of each \({{X}_{(i,t)}}\) in W.

To calculate the entropy \({{H}_{(i,t)}}\), the probability of each element in the set is multiplied by its logarithm and the products are summed, as in Eq. (3).

$$\begin{aligned} {{H}_{(i,t)}}=-\sum \limits _{i=1}^{n}{p({{X}_{(i,t)}})}\log p({{X}_{(i,t)}}) \end{aligned}$$
(3)

where \(p({{X}_{(i,t)}})\) is the occurrence probability of each IP.

If the calculated entropy is below the threshold (Thr), as expressed in Eq. (4), an attack is reported.

$$\begin{aligned} {{H}_{(i,t)}}<Thr \end{aligned}$$
(4)

where Thr is a threshold in this network.
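Equations (1)–(4) can be illustrated with a short sketch. Two assumptions are made here that the text does not state: the logarithm is taken to base 2, and the denominator in Eq. (2) is read as the total number of observations in the window, so that the probabilities sum to one. The function names are hypothetical.

```python
import math
from collections import Counter


def window_entropy(observations):
    """Shannon entropy of the observations in one time window.

    `observations` is a list of values, e.g. destination IPs seen in the
    window. Assumptions: base-2 logarithm; probability of a value is its
    count divided by the total number of observations in the window.
    """
    counts = Counter(observations)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def is_attack(entropy, threshold):
    """Eq. (4): report an attack when the entropy falls below the threshold."""
    return entropy < threshold
```

Under these assumptions, a window in which every request targets the same destination IP has entropy 0, whereas a window with four equally likely destinations has entropy 2, which is why a drop below the threshold is treated as a sign of attack.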

The optimal entropy for each period is determined by testing different time periods. Changing the time period is very easy in the SDN controller, and this flexibility is one of the advantages of SDN networks. Both the time period duration and the threshold size affect attack detection. The static and dynamic thresholds are introduced in [26], and the detection of high-volume DDoS attacks with DARPA2000 is assessed in [27]. The DARPA2000 datasets were produced by experts using DDoS attack software, so their attacks are simple in structure and type compared with the complexity of real data. In this study, the two thresholds are evaluated for both high- and low-volume attacks by running tests on datasets collected from actual SDN networks, and a method for threshold calculation is proposed and compared so as to select the best threshold value for each type of attack.

3.2.1 Static threshold

This threshold has a fixed value, based on the packets belonging to the DDoS attacks. Normal traffic and attack traffic are transmitted separately to the network at different time periods. The mean entropy over the different time periods is calculated once for the attack mode and once for the normal mode. The static threshold is then obtained through Eq. (5).

$$\begin{aligned} Thr={{T}_{1}}=\frac{\mathop {\overline{H}}\nolimits _{attack}+\mathop {\overline{H}}\nolimits _{normal}}{2} \end{aligned}$$
(5)

where \(\mathop {\overline{H}}_{attack}\) is the entropy average in the attack flows and \(\mathop {\overline{H}}_{normal}\) is the entropy average in the normal flows.
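As a minimal sketch of Eq. (5), the static threshold is the midpoint between the two mean entropies. The function name and the sample values in the usage note are hypothetical.

```python
from statistics import mean


def static_threshold(attack_entropies, normal_entropies):
    """Eq. (5): midpoint between the mean attack and mean normal entropy.

    Each argument is a non-empty list of per-period entropy values,
    measured while only attack traffic or only normal traffic is sent.
    """
    return (mean(attack_entropies) + mean(normal_entropies)) / 2
```

For example, with made-up attack entropies of 0.5 and 1.5 and normal entropies of 3.0 and 5.0, the threshold is (1.0 + 4.0) / 2 = 2.5.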

3.2.2 Dynamic threshold

A computational method based on time series is applied to calculate the dynamic threshold, because it is fast in detecting DDoS attacks in small time windows, as in Eq. (6):

$$\begin{aligned} Thr={{T}_{2}}={{\bar{H}}_{(i,t-1)}}+{{C}_{d}}\cdot {{\sigma }_{{{H}_{(i,t-1)}}}} \end{aligned}$$
(6)

where \({{\bar{H}}_{(i,t-1)}}\) is the mean of the entropy values, as in Eq. (7), \({{\sigma }_{{{H}_{(i,t-1)}}}}\) is the standard deviation (SD) at time \(t-1\), as in Eq. (8), and \(C_d\) is a constant coefficient determined experimentally, which does not depend on the time period or on the previous entropy values.

$$\begin{aligned} {{\overline{H}}_{(i,t-1)}}= & {} \frac{1}{t}\sum \limits _{i=1}^{t-1}{{{H}_{(i,t-1)}}} \end{aligned}$$
(7)
$$\begin{aligned} {{\sigma }_{{{H}_{(i,t-1)}}}}= & {} \frac{1}{t}\sum \limits _{i=1}^{t-1}{({{H}_{(i,t-1)}}}-{{\bar{H}}_{(i,t-1)}}{{)}^{2}} \end{aligned}$$
(8)

where \({{H}_{(i,t-1)}}\) denotes the entropy values for the different time periods and \({{\bar{H}}_{(i,t-1)}}\) is their average. At this stage, the entropy value and the dynamic threshold are calculated for each time period by applying a \(\text {C}_\text {d}\) value determined specifically for the dataset. If the entropy value is below the threshold, an attack is detected and the alarm rate parameter, which counts the attack alarms, is incremented. \(\text {C}_\text {d}\) is an experimental parameter, and its value affects the accuracy of attack detection. Because the best value of \(\text {C}_\text {d}\) depends on several parameters, it is better to consider the trade-off among them when calculating the best \(\text {C}_\text {d}\) for each time period: the chosen value should detect all attacks, should not change the number of time periods, should require little computation and should generate a low false alarm rate.
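The dynamic threshold of Eqs. (6)–(8) can be sketched as follows. This is a sketch under stated assumptions rather than the study's code: the mean is taken as the plain arithmetic mean of the previous entropies, the SD as their population standard deviation (square root of the mean squared deviation), and the `warmup` parameter, which sets how many periods must pass before a threshold is computed, is hypothetical.

```python
from statistics import mean, pstdev


def dynamic_threshold(previous_entropies, c_d):
    """Eq. (6): mean of the previous entropies plus C_d times their SD.

    Assumptions: arithmetic mean and population standard deviation over
    all entropies observed up to time t-1.
    """
    return mean(previous_entropies) + c_d * pstdev(previous_entropies)


def count_alarms(entropies, c_d, warmup=2):
    """Count the periods whose entropy falls below the dynamic threshold.

    `entropies` holds one entropy value per time period, in time order.
    The first `warmup` periods (a hypothetical parameter) only build
    history and never raise an alarm.
    """
    alarms = 0
    for t in range(warmup, len(entropies)):
        if entropies[t] < dynamic_threshold(entropies[:t], c_d):
            alarms += 1
    return alarms
```

A negative \(\text {C}_\text {d}\) places the threshold below the running mean and makes alarms rarer, while a positive value places it above the mean and makes the detector more sensitive.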

To select the best \(\text {C}_\text {d}\), first, for each time period, only the settings with a TPR of 100% are retained; next, among these, the setting with the lowest FPR is chosen. The resulting \(\text {C}_\text {d}\) value is taken as the best \(\text {C}_\text {d}\) at the best time period.

Once the best time period and the best \(\text {C}_\text {d}\) value are determined, the portion of the flows flagged as a potential attack is selected and forwarded to the classification section to increase the attack detection accuracy. Because this step eliminates the portion of the normal flows that is correctly identified, the numbers of normal and attack flows are balanced before they are delivered to the classification section.
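The selection rule just described amounts to a two-stage filter, sketched below. The record layout (`period`, `c_d`, `tpr`, `fpr`, with rates in percent) and the function name are hypothetical, and the numbers in the usage note are made up for illustration only.

```python
def select_best_cd(results):
    """Pick the (period, C_d) setting with TPR = 100% and the lowest FPR.

    `results` is a list of dicts with the hypothetical keys
    "period" (seconds), "c_d", "tpr" and "fpr" (both in percent).
    Returns the chosen dict, or None if no setting reaches 100% TPR.
    """
    full_detection = [r for r in results if r["tpr"] == 100.0]
    if not full_detection:
        return None
    return min(full_detection, key=lambda r: r["fpr"])
```

For instance, given made-up results of (15 s, \(\text {C}_\text {d}=1\), TPR 100, FPR 12), (30 s, \(\text {C}_\text {d}=0.5\), TPR 100, FPR 4) and (45 s, \(\text {C}_\text {d}=-1\), TPR 90, FPR 1), the rule returns the 30 s setting: the 45 s setting has the lowest FPR but misses attacks.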

3.3 Classification section

Here, the portion of the dataset identified as attack by the entropy-based section is taken as the input. As observed in Fig. 4, every flow is considered as an edge whose two endpoints are host nodes in the graph.

Fig. 4 Flow between two hosts in SDN

To collect the flows and extract the features of concern, each IP is first considered as a node, and all the connections between the two hosts of a flow and the other nodes are used to obtain the features. In feature collection, the neighbors of a node are of concern, as given in Table 1.
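The graph view of the flows can be sketched as an adjacency map from each host to its neighbors. The sketch below only builds the graph and derives three neighbor counts for a flow; these counts are hypothetical illustrations and are not the features listed in Table 1. All function names are assumptions.

```python
from collections import defaultdict


def build_host_graph(flows):
    """Build an undirected host graph: each IP is a node, each flow an edge.

    `flows` is an iterable of (src_ip, dst_ip) pairs.
    Returns a dict mapping every host to the set of its neighbors.
    """
    neighbours = defaultdict(set)
    for src_ip, dst_ip in flows:
        neighbours[src_ip].add(dst_ip)
        neighbours[dst_ip].add(src_ip)
    return {host: set(peers) for host, peers in neighbours.items()}


def neighbour_counts(flow, graph):
    """Illustrative (hypothetical) per-flow values derived from neighbors."""
    src_ip, dst_ip = flow
    return {
        "src_neighbours": len(graph[src_ip]),
        "dst_neighbours": len(graph[dst_ip]),
        "shared_neighbours": len(graph[src_ip] & graph[dst_ip]),
    }
```

Per-host values computed for each end of the edge, together with values shared by both ends, mirror the split between host features and shared features described for the classifier inputs.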

Table 1 Features extracted

A set of 15 features is extracted for training the classifiers: 12 features for the two hosts of a flow (six features for each host) and three features shared between the hosts. The extracted features are independent of the speed and type of attack, so the method is able to detect both high-volume and low-volume DDoS attacks (Table 1). After feature extraction, the training samples are given as inputs to the BayesNet, J48, RandomTree, logistic regression and REPTree classification algorithms [28] to construct the classification models.

To train and test the data, the K-fold method with \(K=10\) is applied [29]. In this method, the data are distributed randomly into ten folds. Each period in this implementation contains a number of flows, all of which are placed in folds of uniform size. The operation is iterated ten times, and the classification models are obtained.
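The fold construction can be sketched without any machine learning library. This is a generic K-fold split, not WEKA's implementation; the function name and the fixed `seed` are assumptions made so that the example is reproducible.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for K-fold cross-validation.

    The sample indices are shuffled once, dealt into k folds whose sizes
    differ by at most one, and each fold serves as the test set exactly once.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test
```

Each of the ten iterations trains a model on nine folds and tests it on the remaining one, and the reported figure is the average over the ten test folds.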

By comparing the obtained results, the best classification algorithm that improves the accuracy of attack detection is selected for each case.

4 The datasets

To evaluate the performance of this method, the well-known UNB-ISCX [30] and CTU-13 [31] datasets are selected and applied. In addition to these datasets, ISOT [32] is applied only for the normal traffic. The first dataset, UNB-ISCX, is prepared by the Canadian Institute for Cybersecurity and consists of different sections. In this article, two sections of the dataset, namely ISCX-SlowDDos-2016 and ISCX-IDS-2012, are applied. The ISCX-SlowDDos-2016 dataset contains both high- and low-volume DDoS attacks. The ISCX-IDS-2012 dataset contains different types of attacks, such as HTTP DDoS. The HTTP GET DDoS attack is generated by an IRC botnet, and the dataset also includes a brute force SSH attack. Each scenario contains a pcap file of both the attack and normal flows. The CTU-13 dataset is collected at the Czech Technical University, with the objective of capturing real botnet traffic combined with normal and background traffic. This dataset consists of 13 captures of different botnet scenarios. In this study, scenarios 10 and 11 are applied to detect the DDoS attacks. The traffic in the CTU-10 scenario is of the UDP DDoS type and that in the CTU-11 scenario of the ICMP DDoS type. The ISOT dataset is generated by the IT Research Center at the University of Victoria. In this study, the normal section of this dataset is applied in combination with the other datasets (Table 2).

Table 2 Dataset statistics

5 Evaluation

The performance of the proposed method is evaluated through the accuracy (ACC), precision, recall, F-measure, true positive rate (TPR) and false positive rate (FPR) metrics, whose equations are presented in Table 3. Let P be the number of actual positive (attack) examples and N the number of actual negative (normal) examples. The TPR is the ratio of attacks correctly recognized as attack, and the FPR is the ratio of normal cases wrongly classified as attack. The alarm rate is the ratio of the examples classified as attack to the total number of classified samples. By applying WEKA Software, the mean absolute error (MAE), root-mean-square error (RMSE), root-mean-square (RMS) and relative absolute error (RAE) [33] are obtained, as shown in Table 3.
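The metrics named above can be computed from the four confusion-matrix counts. The sketch uses the standard textbook definitions, which are assumed to match the equations of Table 3; the function name and the choice to return 0.0 for an undefined ratio are assumptions.

```python
def detection_metrics(tp, fp, tn, fn):
    """Compute detection metrics from confusion-matrix counts.

    tp: attacks classified as attack     fn: attacks classified as normal
    fp: normal classified as attack      tn: normal classified as normal
    Ratios are returned as fractions in [0, 1]; an undefined ratio
    (zero denominator) is reported as 0.0 by assumption.
    """
    def ratio(num, den):
        return num / den if den else 0.0

    p, n = tp + fn, tn + fp  # actual positives (attack) and negatives (normal)
    precision = ratio(tp, tp + fp)
    recall = ratio(tp, p)  # identical to TPR
    return {
        "accuracy": ratio(tp + tn, p + n),
        "precision": precision,
        "recall": recall,
        "f_measure": ratio(2 * precision * recall, precision + recall),
        "tpr": recall,
        "fpr": ratio(fp, n),
        "alarm_rate": ratio(tp + fp, p + n),
    }
```

With made-up counts of 90 detected attacks, 10 missed attacks, 5 false alarms and 95 correctly passed normal flows, the accuracy is 185/200 = 0.925, the TPR is 0.9 and the FPR is 0.05.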

Table 3 Equation parameters

6 Implementation

For this implementation, the necessary software and tools are introduced as follows:

6.1 The implementation environment and tools

The experiments are run on an ASUS laptop with an AMD FX-9830P (Bristol Ridge) 2.8 GHz processor and 12 GB of RAM. The operating system is Linux Ubuntu 14.04 LTS, run on a Windows 8.1 host machine. Floodlight is chosen as the network controller and Mininet 2.2.1 [34] for network simulation.

The Eclipse Neon.3 [35] is applied here for programming the modules in the Floodlight controller.

6.2 SDN network configuration

To implement this method in SDN, a module that detects DDoS attacks must first be implemented in the Floodlight controller. Before the SDN network is started, the prerequisites for installing and configuring the controller and the network must be met. First, a version of Java must be installed. In this study, Java version 8 is installed, followed by Eclipse Neon.3, which is used to configure the Floodlight controller according to the following steps [36]:

  1. First install JDK 8 (using “java8offline.txt”)
  2. $ cd
  3. Download the Eclipse Neon.3 installer:
  4. $ wget http://ftp.ussg.iu.edu/eclipse/oomph/epp/neon/R/eclipse-inst-linux64.tar.gz
  5. $ sudo apt-get remove eclipse eclipse-jdt eclipse-platform eclipse-rcp eclipse-pde eclipse-platform-data
     $ rm -r  /.eclipse/
  6. $ sudo tar -zxvf  /eclipse-*.tar.gz && cd eclipse-*
  7. $ ./eclipse-inst
  8. //wait a minute, and then choose Eclipse IDE for Java Developers
  9. //choose Install
  10. //choose accept (for everything)
  11. //select launch

The Eclipse Neon.3 is configured as follows:

  1. $ ant eclipse
  2. Open Eclipse and create a new workspace: File \(->\) Import \(->\) General \(->\) Existing Projects into Workspace. Then, click “Next.”
  3. From Select root directory, click Browse. Select the parent directory where you placed floodlight earlier.
  4. Check the box for Floodlight. No other projects should be present and none should be selected.
  5. Click Finish.

After Eclipse Neon.3 is installed and configured, the Floodlight controller is installed for the SDN network as follows:

  1. sudo apt-get install build-essential git ant maven python-dev
  2. git clone git://github.com/floodlight/floodlight.git
  3. cd floodlight
  4. git submodule init
  5. git submodule update
  6. ant
  7. sudo mkdir /var/lib/floodlight
  8. sudo chmod 777 /var/lib/floodlight

Now that the Floodlight controller is installed, it is run as follows:

  1. sudo apt-get remove openvswitch-testcontroller
  2. java -jar target/floodlight.jar
  3. To see the web GUI, open the link below in a browser: http://127.0.0.1:8080/ui/pages/index.html

With the Floodlight controller installed, the proposed DDoS detection module is created through the following steps:

  1. Expand the Floodlight item in the Package Explorer and find the src/main/java folder.
  2. Right click on the src/main/java folder and choose New/Class.
  3. Enter net.floodlightcontroller.ddosdetection in the “Package” box.
  4. Enter DdosDetection in the Name box.
  5. Next to the Interfaces box, choose Add ...
  6. Add the IOFMessageListener and the IFloodlightModule, click OK.
  7. Click Finish in the dialog.

The proposed algorithm is now added to the Floodlight controller in the form of the DdosDetection module, but it cannot run until it is registered in the controller through the following steps:

  1. In src/main/resources/META-INF/services/net.floodlightcontroller.core.module.IFloodlightModule, add the line net.floodlightcontroller.ddosdetection.DdosDetection
  2. In src/main/resources/floodlightdefault.properties, add net.floodlightcontroller.ddosdetection.DdosDetection

At this point, to simulate the attack, the dataset of concern is injected into the network by the attacker, and the Floodlight controller, through the proposed module, is able to detect the DDoS attacks. This process is run in the Mininet emulator with the Tcpreplay tool [37], which replays the pcap files in the network at a selected speed. To detect the DDoS attacks, the required data for the injected dataset must be collected from the switches, hosts and available communications. The data related to the packets and flows transmitted in the network, as seen in Wireshark, are shown in Fig. 5.

Fig. 5 Flow assessment in Wireshark

After flow data extraction and its transmission to the Floodlight controller, by applying this module, the attack detection assessment begins.

A portion of this module related to the entropy calculations is implemented through the Floodlight controller in Eclipse Neon.3 which is shown in Fig. 6.

Fig. 6 Implementation of the attack detector module in the Floodlight controller in Eclipse Neon.3

The implementation code reveals a portion of the control module for this purpose. The module implementation in the controller and obtaining the entropy volume are shown in Fig. 7.

Fig. 7 Implementation of this module and the entropy calculation

This figure shows the module implemented in the controller, which calculates the entropy by summing the flows and producing the flow list.

After attack detection in the entropy-based section, the results identified as attack are sent to the machine learning section, where the classification algorithms are run in WEKA Software with the K-fold method to detect the attacks.

7 Results of the experiments

The objective here is to identify low-volume and high-volume DDoS attacks separately. In this study, periods within the 10–240 s range are tested, among which those within the 15–45 s range are found to be the best, because more attacks are detected in a shorter time within this range.

If a long time period is chosen, the response time increases and attack detection is delayed; that is, detection may occur only after the attack has caused damage, forcing the controller and switches to handle large volumes of attack flows. If a very short time period is chosen, detection begins early, but the controller overuses CPU and network bandwidth resources, which degrades its performance; in addition, an attack is spread over many periods, which has a negative effect on detection. With respect to these two points, time periods within the 15–45 s range constitute the best choice.
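To make the role of the time period concrete, the sketch below groups timestamped observations into consecutive fixed-length periods, which is the step that precedes the per-period entropy calculation. The function name and the record layout (timestamp in seconds, observed value) are assumptions.

```python
def split_into_periods(records, period_s):
    """Group (timestamp_s, value) records into consecutive time periods.

    Periods start at the earliest timestamp and each lasts `period_s`
    seconds. Returns a list of lists of values, one per period, in time
    order; periods without traffic are kept as empty lists.
    """
    ordered = sorted(records, key=lambda r: r[0])
    if not ordered:
        return []
    start = ordered[0][0]
    n_periods = int((ordered[-1][0] - start) // period_s) + 1
    periods = [[] for _ in range(n_periods)]
    for timestamp, value in ordered:
        periods[int((timestamp - start) // period_s)].append(value)
    return periods
```

Running the same records through period lengths such as 15, 30 and 45 s changes how many observations fall into each period, and therefore how sharply the entropy of an attack period stands out from its neighbors.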

7.1 High-volume DDoS attack detection results

In this section, the results of the experiments on various datasets are described with the purpose of high-volume DDoS attack detection.

7.1.1 The results of entropy-based section

The results of applying entropy-based method with static threshold on UNB-ISCX and CTU-13 datasets are bar-charted in Fig. 8, where, as observed, this parameter fails to identify attacks in the periods that contain both normal and attack flows.

Fig. 8 Evaluation results of static threshold in high-volume DDoS attack detection

The findings in Fig. 8 indicate that the static threshold method does not work properly for detecting high-volume DDoS attacks in any dataset except CTU-11. The dynamic threshold results are assessed next.

The results of the entropy-based method for high-volume DDoS attack detection through the dynamic threshold are presented in Tables 4, 5, 6 and 7.

Table 4 Evaluation results of dynamic threshold for high-volume DDoS attack detection in the ISCX-SlowDDos-2016 database
Table 5 Results of dynamic threshold in high-volume DDoS attack detection in the ISCX-IDS-2012 dataset
Table 6 Results of dynamic threshold in high-volume DDoS attack detection in the CTU-10 dataset
Table 7 Results of dynamic threshold in high-volume DDoS attack detection in the CTU-11 dataset

All assessments in Tables 4, 5, 6 and 7 use time periods within the 15–45 s range. The value of \(\text {C}_\text {d}\), the experimental coefficient in the dynamic threshold equation, is assessed within the \(-4\) to 4 range. Not all of these values are listed in the tables, only those most effective in attack prediction.

In these tables, for each time period, the settings in which the TPR reaches 100% are considered first, and the one with the lowest FPR is selected as the optimized \(\text {C}_\text {d}\) value of the period (Tables 4, 5, 6 and 7). The portion of the dataset identified as attack in this section is directed to the classification section. The results of the different dataset evaluations are tabulated in Table 8.

Table 8 Value of the best \(\text {C}_\text {d}\) and the best time period for attack detection in different datasets

The results in Table 8 indicate that the appropriate range for selecting \(\text {C}_\text {d}\) values for different datasets is between \(-2\) and 2.

7.1.2 The results of classification algorithms

By applying the BayesNet, J48, logistic regression, RandomTree and REPTree classification algorithms, a module is developed to determine the normal and attack flows.

To train and test the data, the K-fold method with \(K=10\) is applied. The values of the parameters of concern in learning and testing for the different classification algorithms are shown in Table 9a–e.

Table 9 Volume of the parameters of concern in learning and testing for different classification algorithms applied in K-fold method

The data regarding the ten steps of K-fold method for training and test and the average therein for the ISCX-SlowDDos2016 dataset are given in Table 10a–e.

Table 10 Results of K-fold method as to the high-volume attack detection are obtained by applying ISCX-SlowDDos2016

The data regarding the ten steps of the K-fold method for training and test and the average therein for the ISCX-SlowDDoS2016, ISCX-IDS-2012, CTU-10 and CTU-11 datasets are presented in Table 11.

Table 11 Classification technique results for different datasets in different algorithms

Table 11 indicates that, in the classification section, the best algorithm for detecting high-volume DDoS attacks in the ISCX-SlowDDos-2016 dataset is REPTree, with 99.88% accuracy and 0.04% FPR. For the ISCX-IDS-2012 dataset, REPTree is again the best algorithm, with 99.85% accuracy and 0.1% FPR. For the CTU-10 dataset, the J48 algorithm is the best, with 99.12% accuracy and 0.35% FPR. For the CTU-11 dataset, the logistic regression algorithm achieves the highest accuracy in high-volume DDoS attack detection, but with a higher FPR. Of the RandomTree and REPTree algorithms, the former has the higher accuracy, while the latter has the lower FPR.

The comparative diagram of the best results in high-volume attack detection for different datasets is shown in Fig. 9.

Fig. 9 The best results on attack detection in each dataset

7.2 Low-volume DDoS attack detection results

The results of these proposed methods for different datasets are provided for low-volume DDoS attack detection. Both the static and dynamic threshold methods are applied to examine entropy-based section results.

7.2.1 The results of the entropy-based section

The results of the entropy-based section with the static threshold for low-volume DDoS attack detection are presented in Table 12.

Table 12 Evaluation results of static threshold in low-volume DDoS attack detection in the ISCX-SlowDDos-2016 dataset

As observed in Table 12, both the FPR and the TPR are low; because the attacks are concentrated in one or more specific time periods, the results are relatively better for short time periods. The results of the entropy-based section for detecting low-volume DDoS attacks with the dynamic threshold are presented in Table 13.

Table 13 Evaluation results of dynamic threshold in low-volume DDoS attack detection for the ISCX-SlowDDos-2016 dataset

The results in Table 13 indicate that, for detecting low-volume attacks in the ISCX-SlowDDos-2016 dataset, the best time period is 30 s and the best value of \(\text {C}_\text {d}\) is 1.
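To make the roles of the time period and \(\text {C}_\text {d}\) concrete, a per-window entropy check with a dynamic threshold might look like the sketch below. It rests on two assumptions that are not taken from the tables: that the dynamic threshold has the form mean + \(\text {C}_\text {d}\) × standard deviation of the entropy values observed in earlier windows, and that an alarm is raised when the entropy of the current window falls below that threshold. The function names and the choice of which flow field is passed in are hypothetical.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of values."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def dynamic_threshold(history, c_d):
    """ASSUMED form: mean + c_d * population std of past window entropies."""
    return mean(history) + c_d * pstdev(history)


def is_suspicious(window_values, history, c_d=1.0):
    """ASSUMED rule: flag the window if its entropy drops below the threshold."""
    return shannon_entropy(window_values) < dynamic_threshold(history, c_d)
```

In such a scheme, `window_values` would hold one field per flow (for example the destination address) collected during one time period such as 30 s, and `history` would hold the entropies of preceding periods.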

7.2.2 The results of classification algorithm

Similar to the approach for detecting high-volume attacks, the RandomTree, logistic regression, J48, BayesNet and REPTree classification algorithms and the K-fold method with \(K=10\) are applied in low-volume attack detection. The details of this process are presented in Table 14a–e.

Table 14 The K-fold method results for low-volume attack detection by applying ISCX-SlowDDoS2016 dataset

In these tables, the parameters are calculated for the different classification algorithms on the ISCX-SlowDDoS2016 dataset; the results are given in Table 15.

Table 15 Results of the ISCX-SlowDDos-2016 dataset for low-volume DDoS attack detection

As observed in Table 15, most classification algorithms, except Naive Bayes, have high detection accuracy and a low false alarm rate, which indicates that the proposed model is efficient for low-volume attack detection. The low-volume DDoS attack detection results for the different classification algorithms are shown as a bar chart in Fig. 10 for comparison.

Fig. 10 Comparison of the ISCX-SlowDDos-2016 dataset results for low-volume DDoS attack detection

High accuracy and low FPR in all classifying algorithms shown in Fig. 10 reveal the high efficiency and quality of the features extracted through this proposed method in low-volume attack detection.

The results indicate that the REPTree algorithm achieves a high accuracy of 99.96% and a low FPR of 0.04% in detecting low-volume DDoS attacks.

8 Analysis of computational complexity and time cost of this proposed method

The method proposed here combines an entropy-based method and a classification method, so its computational complexity is derived from the complexities of these two methods. In the entropy-based step, the entropy is calculated and compared with the threshold; these calculations are of \(O(n)\) computational complexity, where n is the flow count. Assuming that the number of time periods is d, the computational complexity of this step is given in Eq. (9):

$$\begin{aligned} {\hbox{Computational-Complexity}}_{\mathrm{Entropy}}=O\left( d\times n \right) \end{aligned}$$
(9)

where n is the flow count and d is the number of time periods. In the classification section, calculating the flow features is of \(O(n)\) order, while different classification algorithms have different computational complexities (e.g., decision tree algorithms are of \(O(\log n)\) order, which leads to a total of \(O(n)+O(\log n)\)). Assuming that the number of time periods is d, the computational complexity of the classification step is given in Eq. (10):

$$\begin{aligned} {\hbox{Computational-Complexity}}_{\mathrm{Machine-Learning}}=O(d \times (n+\log \,n)) \end{aligned}$$
(10)

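Each of these time periods has a constant coefficient, which is assumed to be one in calculating the computational complexity. The entropy-based step runs in every period, whereas the classification step runs only in the periods where an alarm is triggered. Assuming that alarms are triggered in a fraction \(\frac{1}{k}\) of the periods, the overall computational complexity of the method is given in Eq. (11):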

$$\begin{aligned} {\hbox{Computational-Complexity}}_{\mathrm{this-method}}=O\left(d\times n+\left(\frac{d}{k}\right)\times (n+\log n)\right) \end{aligned}$$
(11)

where n is the flow count, d is the number of time periods and \(\frac{1}{k}\) is the fraction of periods in which an alarm is triggered.
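The effect of Eqs. (9)–(11) can be illustrated with a simple operation-count model. The sketch below is only illustrative: it takes all constant coefficients as one, as stated above, and uses a base-2 logarithm, which is an arbitrary choice since the base does not affect the order of growth. The function names are hypothetical.

```python
import math


def entropy_cost(d, n):
    """Eq. (9): the entropy step touches n flows in each of d periods."""
    return d * n


def classification_cost(d, n):
    """Eq. (10): feature extraction (n) plus tree classification (log n) per period."""
    return d * (n + math.log2(n))


def combined_cost(d, n, k):
    """Eq. (11): entropy in every period, classification in d/k alarmed periods."""
    return d * n + (d / k) * (n + math.log2(n))


if __name__ == "__main__":
    d, n, k = 10, 1024, 5
    # Running both steps in every period versus classifying only on alarms.
    print("always classify:", entropy_cost(d, n) + classification_cost(d, n))
    print("classify on alarm:", combined_cost(d, n, k))
```

Under this model, the larger k is (that is, the rarer the alarms), the closer the combined cost comes to that of the entropy-based step alone.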

9 Comparative performance experiments

In DDoS attack detection, a model is considered highly efficient if it has high accuracy and a low FPR. The accuracy and FPR of the studies that use the UNB-ISCX dataset are shown in Table 16.

Table 16 Results of DDoS attack detection with different methods for the UNB-ISCX dataset

As observed in this table, the proposed method outperforms its counterparts, with 99.85% accuracy and 0.1% FPR; its false alarm level is lower than that of all the listed methods. These results indicate the high efficiency of the proposed method. The three techniques applied to the CTU-13 dataset for attack detection through the proposed method are compared with their counterparts in Table 17.

Table 17 Results of DDoS attack detection with different methods for the CTU-13 dataset

As observed in this table, the proposed method obtains an accuracy of 99.12%, the highest on the CTU-13 dataset.

The contents of Tables 16 and 17 are shown as bar charts in Figs. 11 and 12, respectively.

Fig. 11 Comparing the accuracy of this proposed method to other studies for the UNB-ISCX dataset

Fig. 12 Comparing the accuracy of this proposed method to other studies for the CTU-13 dataset

As observed in Figs. 11 and 12, this model is more efficient than the available methods on both datasets.

10 Conclusion and future works

SDNs are among the latest advances in computer networks, owing to their flexibility, reduced operational costs and support for security against DDoS attacks. To improve this security, a new method for detecting high-volume and low-volume DDoS attacks by combining statistical and machine learning techniques is proposed here. This method consists of the collector, entropy-based and classification sections.

The proposed method is evaluated and analyzed, and the experiments run on different datasets indicate that the entropy-based section with a static threshold does not yield appropriate results. Better results are obtained with the dynamic threshold, at the cost of a high FPR. To remove this drawback, different classification algorithms are run and more accurate results are obtained.

The significance of this method lies in its accuracy, which the results indicate is higher than that of other similar methods. Because the proposed model acts after an attack has occurred, the manner of DDoS attack prevention in SDN networks should be assessed in future work. Moreover, although in this article DDoS attacks are detected by only one controller in the SDN, in future studies this method can be extended to networks with more than one controller.