1 Introduction

Cloud computing systems are developing rapidly. No such systems existed 30 years ago, yet according to the report [16], 47 billion dollars were spent on cloud services worldwide in 2013. This sum is expected to double by 2017 as companies invest in cloud services to create new competitive offerings. For many people, life without the cloud would be unthinkable today, as there would be no Facebook, Twitter, or Google. The cloud has recently been adopted for business needs too. Millions of organizations around the world rely heavily on cloud services, from document management to the use of cloud resources.

When adopting cloud technologies in business processes, one should be aware that the availability of cloud services depends on the operation of the cloud platform (infrastructure). It is extremely important for every company that its cloud platform works around the clock (24/7). However, this is not always the case, because attackers try to gain unauthorized access or simply to damage the services for various gains. That is why it is important to have defensive methods implemented against malicious activities.

An international research group in the area of cloud computing security published a threats report in 2013 [4]. According to it, attacks on cloud infrastructure took 5th place in the list of current threats to clouds. Moreover, infrastructure attacks such as distributed denial of service (DDoS) attacks represent a huge threat to every element of the standard cloud computing service model (Infrastructure as a Service, IaaS; Platform as a Service, PaaS; Software as a Service, SaaS). Interestingly, in the report for 2010 these attacks were not mentioned in the list of the biggest threats to clouds [4].

Distributed Denial of Service attacks are especially harmful to companies that provide services to customers. Massive DDoS attacks often affect websites of government bodies in various countries and websites of leading IT corporations, including Amazon, Yahoo, Microsoft, etc. It is essential for companies providing cloud services to be protected against DDoS attacks, because successful attacks can lead to large financial losses [8].

In this paper, we present a novel approach for protecting cloud computing against DDoS attacks. We suggest distinguishing external and internal DDoS attacks depending on the location of the attack source relative to the cloud infrastructure. Such a classification makes it possible to select the correct countermeasures. For example, blocking all network traffic from an internal virtual node of the cloud system can affect the efficiency of all business processes in which the suspicious node participates. Therefore, when the detection module detects that the attack originates from inside the cloud, the counteraction module tries to block only the network traffic coming from specific ports of the suspicious node. We also propose an architecture of a security component that is able to detect both kinds of attack. The architecture does not assume installation of any sensors on the client side, and thus all processes in the cloud are kept confidential. At the same time, it analyzes not only incoming external traffic, as is done in some commercial tools [13], but also traffic inside the cloud.

Our contribution also lies in developing detection techniques based on data mining and machine learning, including self-learning models. We use supervised models to classify network traffic. Experiments showed that the developed algorithms work fast: malicious traffic is caught within 5 s after the attack starts. We collect attributes for the models using the NetFlow protocol. In addition, the use of self-learning algorithms makes it easier to maintain cloud security, because the models learn new types and scenarios of DDoS attacks.

All modules of the security component are flexible and can be deployed either on the nodes where the cloud platform is installed or on separate ones. The component prototype was implemented in the OpenStack cloud computing platform.

The paper is structured as follows. Section 2 presents related work. The common approach for detecting DDoS attacks in cloud computing is shown in Sect. 3. Section 4 considers the architecture of the OpenStack cloud computing platform, with particular focus on the components needed to implement the suggested approach. Section 5 explains how the experimental environment was set up and describes how the training and test data sets were collected and how the classification models were trained and tested. Section 6 presents and discusses the results of the experiments. Section 7 analyzes the results of the paper and provides an insight into future research.

2 Related Work

Nowadays, researchers are developing and implementing different defensive techniques for detecting malicious traffic and protecting cloud computing platforms against DDoS attacks.

2.1 Security Solutions for Clouds

Elastic Cloud Security System (ECS2) [17] provides a complex of security methods against malicious traffic. ECS2 has antibot IP reputation tables and antivirus engines. Its firewalls are rule-based and work in real time. The disadvantage of this approach is the time needed to update filter tables and antivirus signatures. Late table updates can lead to receiving malicious traffic from new, unlabeled IP addresses. That is why our approach is not based on signatures, malware IP tables, or other features that have to be updated frequently.

In [5], the authors propose a defense method that places detection processes inside virtual machines (VMs). The authors tested the security methods on 108 services launched on different VMs. The methods were based on data mining techniques and analyzed the applications launched on the VMs. The drawback is the placement of the detection system: some customers do not want background security processes inside their VMs because of company-specific security rules.

A confidence-based filtering method is presented in [7]. The method focuses on the transport and network layers and builds correlation characteristics from the co-appearance of attributes in the IP and TCP headers. Only a few attack types were tested. The authors conclude that the model is highly effective and requires little storage space when working with high-loaded networks.

In [20], the authors outline a neural network model for malicious traffic detection in cloud networks. Their model searches for anomalies in the traffic flows and raises alerts for administrators to prevent damage. The authors note that increasing the sample period for the learning phase improves results, so the higher accuracy cannot be reached in real time.

The paper [15] presents a novel flow-based anomaly detection scheme based on the K-means clustering algorithm. Training data containing unlabeled flow records are separated into clusters of normal and anomalous traffic. The corresponding cluster centroids are used as patterns for computationally efficient distance-based detection of anomalies in new monitoring data. The authors state that applying the clustering algorithm separately for different services (identified by their transport protocol and port number) improves the detection quality.

The authors of [21] propose an algorithm for detecting Denial of Service attacks that utilize the SSL/TLS protocol. The algorithm detects malicious traffic by filtering noise data and clustering. They trained the models on data obtained from a realistic cyber environment. The authors conclude that the proposed model allows detecting all intrusive flows with a very low number of false alarms.

2.2 Cloud Computing Systems

Nowadays there are many cloud computing systems. One of the most popular is Amazon Web Services (AWS) [1], a proprietary, vertically integrated solution for IaaS, PaaS, and SaaS.

Another example is OpenNebula [14], a cloud management system using the functions of hypervisors for providing IaaS. ESXi (vSphere Hypervisor), KVM (Kernel-based Virtual Machine), XEN (OpenSource virtualization platform), Hyper-V (Windows Server Virtualization) are hypervisors supported by the system.

OpenStack [18] is one of the most popular solutions for managing cloud services. It is an open-source project sponsored by IBM, HP, Intel, Ubuntu, and others. OpenStack developers implemented a universal architecture of a cloud computing platform that allows providing different services in all cloud service models. However, the system does not have enough components for detecting and mitigating DDoS attacks. The main security mechanism in OpenStack is the security group service. Obviously, this is not enough for successful counteraction to DDoS attacks. That is why OpenStack [18] was chosen as the environment for testing the developed components.

3 Common Approach for DDoS Attack Detection in Cloud Computing Platforms

3.1 Taxonomy of DDoS Attacks on Cloud Computing

We suggest differentiating DDoS attacks depending on the location of the attack source relative to the attack target located in the cloud. There are therefore two possible attack types: external attacks, when the attack source is located outside the cloud infrastructure, and internal attacks, when the attack source is located inside the cloud.

For a web server installed on an ordinary computer, there is only one type of DDoS attack: an attack from outside the computer network. Running a web server as a cloud instance makes it possible to attack it both from outside the cloud and from other instances located in the cloud.

In the IaaS service model, customers receive not only a service (as in the SaaS service model) but also a network consisting of virtual machines and routers. External attacks are similar to attacks against an ordinary web service. They can be deflected by filtering the traffic coming from the attacking sources. Internal attacks originate from virtual machines running in the same cloud. Network traffic coming from outside can easily be rejected. Rejecting a customer’s internal traffic together with the malicious traffic, however, may cause serious problems, because the cloud provider is then unable to provide full-time access from one instance to another. That is why DDoS attacks on cloud-based services differ from attacks on simple server-based services.

3.2 Cloud Security Component Architecture

The analysis of common security problems in cloud computing infrastructure showed that a security system should meet the following requirements:

  • be capable of serving high-loaded networks;

  • mitigate attacks with high accuracy as soon as possible, almost in real time;

  • meet the requirements of the customer’s security policies;

  • not consume many resources of the cloud platform.

Based on these requirements, the following security architecture for cloud infrastructure is proposed. The key modules of the security component are the gate sensor and the security controller, which consists of the collector, the analyzer, and the counteraction module. They are shown in Fig. 1. All traffic coming from outside the cloud network, as well as all traffic inside it, goes through the gate. This means that every instance communicates with other instances inside the cloud network through the gate.

Fig. 1. Security component architecture for cloud networks

The gate has a sensor that monitors traffic flows passing through the gate and sends data about them to the security controller. The security controller is a module that processes incoming flow data by determining the type of network traffic (benign or malicious) and blocks potentially malicious traffic by sending commands to the firewall. The collector stores the received data. The analyzer prepares input data for the prediction module. The prediction module uses data mining classification models; its output is the type of the traffic flow. This result goes to the counteraction module, which in turn sends commands to the firewall according to the received information.

The proposed architecture allows the analyzer to work without inspecting the payload of packets, which meets the requirements of the company’s security policy.

The gate sensor and the controller have to operate fast to provide monitoring in real time. The response time of the controller to monitored traffic should be less than 10 s in order to mitigate attacks in a timely manner.

The analyzer of the security controller is a part of the data mining process. First, the analyzer prepares data vectors from the collector’s data. Currently, the period for capturing traffic parameters is set to four seconds. Experiments showed that such a time slot is enough to obtain high prediction accuracy and to respond to malicious traffic flows quickly. The input vector is generated from the collector’s data every four seconds and is sent to the prediction module. The model predicts the type of the monitored traffic, and if at some moment it classifies incoming traffic as malicious, the analyzer sends information about the attack type to the counteraction module. The counteraction module decides how to mitigate the attack and sends a command to the gate’s firewall. This is how traffic is filtered.
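The four-second cycle can be illustrated with a minimal Python sketch. It is not the implementation used in the prototype: the record layout, the simplified two-element vector, and the `predict` and `counteract` callables are hypothetical names introduced only for this illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple

WINDOW_SECONDS = 4  # capture period stated in the text

# Hypothetical record layout: (timestamp in seconds, packets, bytes).
Record = Tuple[float, int, int]
Vector = Tuple[int, int]  # simplified vector: (bytes, packets)


def split_into_windows(records: Iterable[Record],
                       window: int = WINDOW_SECONDS) -> Dict[int, List[Record]]:
    """Group records by the index of the time window they fall into."""
    windows: Dict[int, List[Record]] = defaultdict(list)
    for record in records:
        windows[int(record[0] // window)].append(record)
    return dict(windows)


def run_controller(records: Iterable[Record],
                   predict: Callable[[Vector], str],
                   counteract: Callable[[int, str], None]) -> List[Tuple[int, str]]:
    """Analyzer -> prediction -> counteraction, once per window."""
    decisions = []
    for index, group in sorted(split_into_windows(records).items()):
        vector = (sum(r[2] for r in group), sum(r[1] for r in group))
        label = predict(vector)
        if label != "benign":
            # Only non-benign windows are reported to the counteraction step.
            counteract(index, label)
        decisions.append((index, label))
    return decisions


# Usage with a toy threshold rule standing in for the trained model.
records = [(0.5, 10, 8000), (3.9, 12, 9000), (4.1, 5000, 300000), (7.0, 6000, 360000)]
blocked: List[Tuple[int, str]] = []
decisions = run_controller(
    records,
    predict=lambda v: "syn_flood" if v[1] > 1000 else "benign",
    counteract=lambda index, label: blocked.append((index, label)),
)
print(decisions, blocked)
```

In a deployment, `predict` would wrap the trained classification model and `counteract` would issue the firewall command; both are left abstract here.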

The proposed architecture of the security component was tested using the OpenStack cloud computing system. The next section describes the OpenStack architecture and the tools used to implement the suggested cloud security component.

4 OpenStack Architecture

It is important to know the cloud architecture in order to understand how the security architecture is implemented. OpenStack was chosen as the test cloud computing platform for the proposed security component. OpenStack offers different ways of constructing a cloud environment, but all of them are built from the same tools and components. The main difference is the number of nodes and their location. The cloud infrastructure can be built in one room and one local network as well as across different countries and different providers’ networks. The most commonly used cloud architecture consists of three nodes: Controller, Compute, and Network (Fig. 2).

Fig. 2. OpenStack topology layout

The Controller node runs the Identity service, Image Service, management utilities for the Compute and Networking nodes, Networking plug-in, and the dashboard. It also includes supporting services such as a SQL database, message queue, and Network Time Protocol (NTP). The proposed Security Controller (Fig. 1) is implemented on the Controller node.

The Compute node runs the Compute hypervisor that operates tenant virtual machines or instances. By default, Compute uses KVM as the hypervisor. The Compute node also runs the Networking plug-in and an agent that connects tenant networks to instances and provides firewalling (security groups) services. It is possible to run more than one compute node.

The Network node runs the Networking plug-in and several agents that provide tenant networks, switching, routing, NAT, and DHCP services. This node also handles external (Internet) connectivity for tenant virtual machine instances.

If the architecture of the cloud infrastructure does not include a Network node, all network processes are handled by the Compute node using the same tools as on the Network node.

In our experiments, we built the OpenStack infrastructure on three identical servers (apart from the amount of RAM) with the following characteristics: processor: Intel Xeon i5520, 2.26 GHz, 4 cores; operating system: Ubuntu 14.04; RAM: 24 GB on the controller node, 16 GB on the network node, and 72 GB on the compute node.

In OpenStack, the route of a packet from an instance to external nodes or to other instances always goes through a special software router called qrouter [6]. The qrouter is a part of Open vSwitch (OVS) [12]. This component is a good place for the gate sensor and the firewall. In our experiments, the gate sensor was Cisco’s NetFlow sensor [3]. It can be placed at the entry point of the cloud’s internal network. In the case of OpenStack, the gate to the cloud network is OVS, which already has the sensor implemented and ready to be launched in the network infrastructure. Therefore, we only needed to reconfigure OVS to send the data to the collector.

The collector of the security component can be placed on a separate node to save cloud resources or on the controller node of the cloud system, as was done in our case. We used nfdump [10] as the collector. OpenFlow [19] can be used as an automated interface for cloud network management.

5 Data Mining Process and Experiment Setup

Data mining algorithms are implemented using scikit-learn [9], an open-source machine learning library for the Python programming language.

We use different classification models to make decisions on the incoming network traffic. Every four seconds, the data processing module produces a new data vector. We use supervised learning to train and test the data mining models. Currently, the analysis process does not determine the location of the attack source. In the future, we will focus on a second detection layer that determines the attack source. Depending on the location of the attack source, different data will be collected for making decisions and different countermeasures will be applied. If the source is internal, the algorithm will try to block not the whole IP address but only some ports, in order to keep the legitimate user’s services available. However, we set up an experimental environment suitable for detecting both internal and external attacks. The next subsections describe the experimental environment used to train and test the classification models.
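The planned choice of countermeasure by source location could look roughly like the sketch below. It is only an illustration of the idea and assumes that the cloud's tenant networks are known as CIDR ranges; the function names, the returned dictionary layout, and the example addresses are hypothetical.

```python
import ipaddress
from typing import Dict, Iterable, List, Union


def is_internal(source_ip: str, cloud_networks: Iterable[str]) -> bool:
    """Return True if the source address belongs to one of the cloud networks."""
    address = ipaddress.ip_address(source_ip)
    return any(address in ipaddress.ip_network(net) for net in cloud_networks)


def choose_counteraction(source_ip: str,
                         suspicious_ports: Iterable[int],
                         cloud_networks: Iterable[str]) -> Dict[str, Union[str, List[int]]]:
    """Pick a port-level block for internal sources, a full block for external ones."""
    if is_internal(source_ip, cloud_networks):
        # Internal node: block only the suspicious ports so that the
        # node's legitimate traffic keeps flowing.
        return {"action": "block_ports", "source": source_ip,
                "ports": sorted(set(suspicious_ports))}
    # External node: the whole source address can be filtered.
    return {"action": "block_source", "source": source_ip, "ports": []}


# Usage with an assumed tenant network.
tenant_networks = ["192.168.10.0/24"]
print(choose_counteraction("192.168.10.7", [80, 80, 443], tenant_networks))
print(choose_counteraction("203.0.113.9", [80], tenant_networks))
```

The returned dictionary is a neutral description of the decision; translating it into actual firewall commands is outside the scope of this sketch.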

5.1 Environment for Experimenting with External Attacks

To make predictions, the prediction module needs training datasets. For this purpose, we created the Real Service in Virtual Network Framework (RSVNet) [2]. The framework allows connecting real nodes to a virtual network. Compared with pure simulation, this feature improves the accuracy of experiments when modeling infrastructure attacks on cloud computing platforms. The framework also allows implementing any known protection mechanisms or creating new ones, including hybrid protection mechanisms. RSVNet was successfully verified. The framework helps to quickly construct a multi-level topology consisting of virtual routers, clients, and real services. Users are able to define new attack scenarios, defensive methods, and other features. During experiments, traffic can be logged into pcap files for further analysis. The experimental network consisted of 504 clients and 20 routers. Figure 3 shows the network topology used in the experiments.

Fig. 3. Topology made using RSVNet

The virtual workstation is represented by an OpenStack instance with the following characteristics:

  • CPU: 1 VCPU, RAM: 2048 MB;

  • operating system: Ubuntu 14.04;

  • web-server: Apache/2.4.7, PHP version: PHP 5.5.9-1.

The router is a D-LINK DIR-615 router. To make packets travel from the virtual network to the real one and back, routing tables are set inside the router. The range of IP addresses in the virtual network is set to 1.0.0.0/8, so all packets with a destination IP address in this subnet are forwarded to the computer with RSVNet installed. The framework captures the needed packets and forwards them into the virtual network.

5.2 Environment for Experimenting with Internal Attacks

To simulate internal attacks on the cloud, we set up a cloud network that consisted of 6 routers, 10 subnets, and 104 instances. All instances were launched on a Compute node with the following characteristics: processor: 24 * Intel(R) Xeon(R) CPU X5680 @ 3.33 GHz; RAM: 62 GB.

One instance was set up to manage attack deployment, and one instance was chosen as the victim. The attacking instance was launched with the following characteristics: 2 GB RAM and 1 VCPU. All other instances had 1 VCPU and 512 MB RAM.

To manage all nodes, the open-source software Ansible [11] was installed. Ansible is an IT automation tool used to configure systems, deploy software, and orchestrate more advanced IT tasks such as continuous deployments or zero-downtime rolling updates. It uses SSH to connect to the selected nodes and execute commands.

To generate HTTP traffic, Siege 3.1.0 was installed on every instance. Siege is an HTTP load testing and benchmarking utility. To generate TCP, UDP, and ICMP flooding attacks, hping was installed on every instance.

Each attack is independent of the others. Attacks can start and end simultaneously or one after another, and they can use the same computers to attack.

Figure 4 shows an example of a settings file for an experiment.

Fig. 4. Example of an experiment scenario

Here you can see two attacks on different victim servers. One attack starts at the 20th second and lasts 60 s (-t60S), and the other starts at the 40th second and also lasts 60 s. The parameters start_hunter_ip and end_hunter_ip define ranges of IP addresses listed in an additional settings file, which contains the IP address of every client that is able to attack. Therefore, as can be seen, some of the clients will participate in both attacks.
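How two scenario entries can share clients and overlap in time is sketched below. The settings file format itself is not reproduced here, so the `Attack` structure is hypothetical, and the sketch assumes that start_hunter_ip and end_hunter_ip are inclusive indices into the list of client addresses; the client addresses and index ranges are invented for the example.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class Attack:
    """Hypothetical in-memory form of one scenario entry."""
    victim: str
    start_s: int          # delay before the attack starts, in seconds
    duration_s: int       # attack duration in seconds
    start_hunter_ip: int  # assumed: index of the first attacking client
    end_hunter_ip: int    # assumed: index of the last attacking client (inclusive)


def attackers(attack: Attack, clients: List[str]) -> Set[str]:
    """Clients selected by the index range of one attack."""
    return set(clients[attack.start_hunter_ip:attack.end_hunter_ip + 1])


def shared_attackers(a: Attack, b: Attack, clients: List[str]) -> Set[str]:
    """Clients that take part in both attacks."""
    return attackers(a, clients) & attackers(b, clients)


def overlap_seconds(a: Attack, b: Attack) -> int:
    """Length of the interval in which both attacks are active."""
    start = max(a.start_s, b.start_s)
    end = min(a.start_s + a.duration_s, b.start_s + b.duration_s)
    return max(0, end - start)


# Usage: timings follow the text (20 s and 40 s start, 60 s each);
# addresses and index ranges are made up.
clients = ["10.0.0.%d" % i for i in range(1, 11)]
first = Attack("victim-a", start_s=20, duration_s=60, start_hunter_ip=0, end_hunter_ip=5)
second = Attack("victim-b", start_s=40, duration_s=60, start_hunter_ip=4, end_hunter_ip=9)
print(sorted(shared_attackers(first, second, clients)), overlap_seconds(first, second))
```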

5.3 Learning Network Flow Classification Models

To build the training datasets, the following experiments were carried out: HTTP Flooding, SYN Flooding, and NTP Flooding.

The empirical study showed that the most important attributes for detecting DDoS attacks are the number of bytes, the number of packets, and the number of unique pairs of source IP address with port and destination IP address with port. This information is enough to assess the network traffic quickly. For example, the traffic of a SYN Flooding attack is characterized by a huge number of short packets. HTTP Flooding causes a large number of packets (although fewer than SYN Flooding), and the number of bytes depends on the size of the pages requested by the attackers. For NTP Flooding, the numbers of bytes and packets are extremely large, so such an attack is not difficult to identify. The number of unique pairs can also show whether the network is under attack: if a web server receives many packets from many IP addresses using different ports, this can be a sign of an attack. Benign traffic is very similar to HTTP Flooding, but its power is significantly lower. In our experiments, its power was set to 15 % of the power of the HTTP Flooding traffic settings, and the experiment lasted 80 min. The sensor started sending data to the collector immediately after the experiment was launched. The data processing module generated data vectors for the classification model every 4 s. This time unit proved to be sufficient to clearly distinguish benign from malicious traffic. In training mode, the generated data vectors were stored in a file. At the end of the simulation, they were automatically passed on to train the data mining models. We used the cross-validation method to find the best results.
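The three attributes named above can be computed from the flow records of one 4 s window as in the following sketch. The `Flow` structure and its field names are assumptions made for the illustration and do not reflect the actual nfdump record layout.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Flow:
    """Hypothetical flow record; field names are illustrative only."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    packets: int
    octets: int


def feature_vector(flows: Iterable[Flow]) -> Tuple[int, int, int]:
    """Return (bytes, packets, unique source/destination endpoint pairs)."""
    total_bytes = 0
    total_packets = 0
    pairs = set()
    for flow in flows:
        total_bytes += flow.octets
        total_packets += flow.packets
        # One pair = (source IP with port, destination IP with port).
        pairs.add(((flow.src_ip, flow.src_port), (flow.dst_ip, flow.dst_port)))
    return total_bytes, total_packets, len(pairs)


# Usage: the first two flows share the same endpoint pair.
window = [
    Flow("1.0.0.5", 40000, "10.0.0.2", 80, packets=4, octets=2400),
    Flow("1.0.0.5", 40000, "10.0.0.2", 80, packets=6, octets=3600),
    Flow("1.0.0.9", 40001, "10.0.0.2", 80, packets=1, octets=60),
]
print(feature_vector(window))
```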

6 The Experiment Results

6.1 Testing the Classification Models on External Attacks

We have tested several classification models on datasets collected using RSVNet: k-NN, Decision Tree, SVM, and Naïve Bayes.
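A comparison of these four model families with cross-validation can be set up in scikit-learn roughly as follows. This is a sketch under stated assumptions, not the configuration used in the experiments: the data are synthetic, the class centers are invented values, and the hyperparameters and the feature scaling step are choices made for the example.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def synthetic_windows(n_per_class=60, seed=0):
    """Generate made-up (bytes, packets, unique pairs) vectors per traffic class."""
    rng = np.random.default_rng(seed)
    centers = {  # invented values, for illustration only
        "benign": (2e5, 3e2, 50),
        "http_flood": (2e6, 3e3, 400),
        "syn_flood": (6e5, 1e4, 900),
    }
    X, y = [], []
    for label, center in centers.items():
        center = np.asarray(center, dtype=float)
        X.append(rng.normal(center, 0.05 * center, size=(n_per_class, 3)))
        y += [label] * n_per_class
    return np.vstack(X), np.array(y)


def compare_models(X, y, folds=5):
    """Return the mean cross-validated accuracy of each model family."""
    models = {
        "k-NN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
        "Decision Tree": DecisionTreeClassifier(random_state=0),
        "SVM": make_pipeline(StandardScaler(), SVC()),
        "Naive Bayes": GaussianNB(),
    }
    return {name: float(cross_val_score(model, X, y, cv=folds).mean())
            for name, model in models.items()}


X, y = synthetic_windows()
for name, score in compare_models(X, y).items():
    print(f"{name}: {score:.3f}")
```

Because the synthetic classes are well separated, the scores printed here say nothing about the relative quality of the models on real traffic; the sketch only shows the evaluation procedure.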

We carried out the experiments with different attack power, defined by the share of clients involved in the attack: 50 %, 80 %, and 100 % of all clients. The total number of clients was 504.

The false positive and false negative rates obtained in the experiments are shown in Tables 1, 2, and 3.

Table 1. Results of experiments with 50 % attack power
Table 2. Results of experiments with 80 % attack power
Table 3. Results of experiments with 100 % attack power

The first series of experiments was carried out with 50 % attack power (Table 1).

Almost every model showed a low false positive rate, but Decision Tree had the best result.

The second series of experiments was carried out with 80 % of all clients involved in the attack (Table 2).

Again the Decision Tree model was the best at detecting attacks. Fewer than 1 benign flow per 100 flows was classified as malicious. This means that almost all benign traffic reached its destination. In the case of the SYN Flooding attack, the other algorithms showed an almost equal number of false positives. The k-NN algorithm showed the second-best result in detecting the HTTP Flooding attack.

The third series of experiments was carried out with all clients involved in the attack (Table 3).

As can be seen, the accuracy of traffic classification increases with the power of the attack. This happens because the higher the attack power, the greater the differences in the attribute values of vectors for different types of traffic. As in the previous experiment, Decision Tree showed the best false positive rate.

Data processing and prediction on all data took less than one second. This means that every five seconds the security controller has a result about the type of traffic flow in the cloud network. So if an attack starts immediately after the last prediction, it is identified within 5 s at most.

The next experiments used Decision Tree as the prediction algorithm in the cloud security component. The same attack types were used: SYN Flooding, HTTP Flooding, and NTP Flooding. The first series of experiments was carried out with the same benign model but with different numbers of clients involved in the attack: from 25 to 400 attacking clients, with an increment of 25 clients per experiment. Each experiment lasted 10 min. To analyze the results, we calculated the F-measure for each type of traffic.

F-measure can be a better single metric than precision (TP/(TP + FP)) or recall (TP/(TP + FN)) alone, where TP, FP, and FN denote the numbers of true positives, false positives, and false negatives. Precision and recall give different information that complement each other when combined. If one of them is much better than the other, the F-measure reflects it. The weights of recall and precision can be changed according to their importance for the results. In this paper, the traditional F-measure was chosen:

$$ F = 2 \cdot \frac{precision \cdot recall}{precision + recall} $$
(1)
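Eq. (1) and its weighted variant can be computed from raw counts as sketched below. The function name `precision_recall_f`, the `beta` parameter, and the counts in the usage lines are illustrative choices rather than values from the experiments; with `beta = 1` the general F-beta formula reduces to Eq. (1). Returning zero when a denominator is zero is a convention assumed here.

```python
from typing import Tuple


def precision_recall_f(tp: int, fp: int, fn: int,
                       beta: float = 1.0) -> Tuple[float, float, float]:
    """Return (precision, recall, F-beta); beta=1 gives the traditional F-measure."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision == 0.0 and recall == 0.0:
        # Assumed convention: F is zero when nothing was detected correctly.
        return precision, recall, 0.0
    b2 = beta * beta
    f = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f


# Usage with made-up counts for one traffic class.
precision, recall, f1 = precision_recall_f(tp=90, fp=10, fn=30)
print(f"precision={precision:.3f} recall={recall:.3f} F={f1:.3f}")
```

A `beta` above 1 weights recall more heavily, which may be preferable when missed attacks are costlier than false alarms.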

The F-measure values are shown in Fig. 5.

Fig. 5. F-measure in experiments with different numbers of attacking clients

As follows from the plot, Decision Tree has a very low F-measure when fewer than 200 clients are involved in SYN Flooding and HTTP Flooding attacks. SYN Flooding was often classified as HTTP Flooding, and HTTP Flooding was often classified as NTP Flooding traffic. This is not a problem for cloud services because of the low attack power: only 16 000 flows per 10 min for HTTP Flooding and 200 000 SYN requests per 10 min for SYN Flooding. Such attack power does not affect cloud services. A reduction of the F-measure was observed when 75 to 200 clients were involved in the benign traffic modeling. This happened because HTTP Flooding flows were interpreted as benign traffic, since the model of legitimate traffic generation resembles HTTP Flooding with 100 clients involved. NTP Flooding had the highest F-measure owing to the features of the attack.

It should be mentioned that Decision Tree was trained on a network consisting of 504 attacking clients. This means that the model predicts the traffic type adequately in high-loaded networks.

We also carried out several experiments with different power of legitimate traffic. The results show that when there is almost no traffic in the cloud network, the false positive rate grows. The solution to this problem is better modeling of benign traffic.

6.2 Testing the Classification Models on Internal Attacks

We carried out a series of experiments using different attack scenarios to test the classification models against internal attacks. The experiments covered different attack types (HTTP Flooding, SYN Flooding, and UDP Flooding) and a benign traffic scenario. We trained the same models as in the case of external attacks. The HTTP Flooding attack reached a power of 1 Gb/s in the virtual network of OpenStack.

We built a scenario consisting of each attack with increasing power. The whole experiment lasted 5 h: 2 h for the training dataset and 3 h for the testing dataset.

Each attack type and the legitimate scenario lasted almost the same time. The training and testing scenarios differed in the number of attackers during a particular period of time and in the number of attacking threads coming from one client.

To analyze the effectiveness of each classification algorithm, we calculated the false positive (FP) and false negative (FN) rates and the F-measure. The results for the HTTP Flooding and benign traffic scenarios are provided in Table 4.

Table 4. Results of experiments with inside HTTP Flooding attack

The Decision Tree model was again the best classification model. Its false positive rate is 0.05 %, so 99.95 % of legitimate traffic is not rejected, while 99.75 % of malicious traffic is rejected. In practice, this means that only 250 Mb/s would pass through the attack detection system if a 100 Gb/s attack hit the cloud network. Naïve Bayes showed the worst results, with a very high false negative rate.
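The 250 Mb/s figure can be checked from the stated detection rate, taking the 100 Gb/s attack as the hypothetical load used in the text and assuming that the undetected share of traffic is proportional to the share of undetected flows:

$$ (1 - 0.9975) \cdot 100\ \text{Gb/s} = 0.0025 \cdot 100\ \text{Gb/s} = 0.25\ \text{Gb/s} = 250\ \text{Mb/s} $$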

The results of the experiments with both outside and inside attacks are very good. Such detection techniques can improve any cloud defense system.

7 Conclusion and Future Work

In this paper, we presented an approach for detecting internal and external DDoS attacks in cloud computing using data mining techniques. We also proposed the architecture of a cloud security component for detecting DDoS attacks. The architecture consists of a sensor, a controller storing information about traffic flows, an analyzer processing data for the data mining model, and a counteraction module sending commands to the firewall to prevent damage from the attack. The proposed architecture does not touch the customer’s data, which meets the requirements of companies’ security policies, and can be used to protect services in cloud computing platforms against DDoS attacks. We implemented the component prototype in the OpenStack cloud computing system and carried out a set of experiments with DDoS attacks. The experiments with external DDoS attacks were done using the RSVNet framework; to analyze internal DDoS attacks, we created an experimental environment using Python, Siege, and hping. The tested models show high accuracy, low false positive and false negative rates, and a high F-measure. Decision Tree is the best model according to the series of experiments with different attack scenarios. It shows better results than the other models because it constructs well-defined dependencies between different features for detecting the most popular DDoS attacks. However, we expect that testing Decision Tree on more complicated data could show different results, so our future work will include an analysis of the models’ efficiency on more sophisticated data sets.

Future research will be devoted to enhancing the prediction module and developing self-training models. This can help to react correctly to changes in traffic power or in the number of public services. Subsequent papers will focus on DDoS attack mitigation based on the information gained by the improved prediction module described in this paper. In addition, we plan to improve the legitimate traffic model and to carry out experiments involving many instances facing DDoS attacks.