1 Introduction

Network infrastructure components such as firewalls, bridges, switches, and routers produce data about the traffic that passes through them. This data is called network flow data. It supports efficient analysis of network performance, and the resulting analysis is a valuable resource for network security teams working on network enhancement and optimization. Network flow data gives a real-time view of network traffic and is integrated with peripheral devices and point solutions. Peripheral devices form the outermost line of defense and keep most malicious traffic out of the network. Still, capturing or preventing 100% of malicious traffic is impossible. A single anomaly can cause serious damage, and once it is inside the network, peripheral devices are of no help. Localized solutions enhance security by addressing specific problems, but they cannot provide broad-based protection. Thus, even when these components are already in place, strengthening network security requires analysis and prediction of network traffic data (Fig. 1).

Fig. 1. Network traffic data flow

2 The Importance of Network Traffic Data

Traffic passing through network infrastructure produces a huge amount of data, termed network flow data. It is a good measure for analyzing network performance. Examined in fine detail, it also becomes a very important resource for securing the network against various kinds of attacks. The output of network flow analysis can be used to optimize the network infrastructure and to strengthen the defense mechanisms already implemented in it. Defense mechanisms can be strengthened if mitigating actions are initiated with minimal delay after an attack, which is possible when an attack or anomalous behavior is known or predicted beforehand. Analysis of past flow data helps predict anomalous behavior. If mitigating or preventive actions can be recommended automatically upon prediction, the time required to respond to anomalous network situations will improve drastically.

Other advantages of network flow data analysis are listed below [19] (Table 1).

Table 1. Importance of network traffic data

3 Techniques for Network Traffic Prediction

The techniques can be divided into statistical and composite techniques. Statistical techniques use linear and non-linear time series models. Composite techniques (statistical plus another domain) are based on data mining, neural networks, Hadoop, PSO, etc. Some authors use the term decomposed models when the time series is decomposed into four components. The linear time series techniques are AR (Auto Regressive) and MA (Moving Average). Combined, they form the ARMA (Auto Regressive Moving Average) model [22,23,24] (Fig. 2, Table 2).
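To make the linear time series idea concrete, the following is a minimal sketch of the simplest autoregressive case, AR(1), in which each traffic value is modeled as a constant plus a multiple of the previous value. The function names `fit_ar1` and `forecast_ar1` are illustrative and do not come from the surveyed works, and the ordinary least squares fit is only one possible estimation method. A full ARMA model would additionally include a moving average of past forecast errors, which this sketch omits.

```python
def fit_ar1(series):
    """Estimate c and phi in x[t] = c + phi * x[t-1] by ordinary least squares."""
    if len(series) < 3:
        raise ValueError("need at least three observations")
    prev = series[:-1]   # x[t-1]
    curr = series[1:]    # x[t]
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_curr = sum(curr) / n
    var_prev = sum((a - mean_prev) ** 2 for a in prev)
    if var_prev == 0:
        raise ValueError("series is constant; phi cannot be estimated")
    cov = sum((a - mean_prev) * (b - mean_curr) for a, b in zip(prev, curr))
    phi = cov / var_prev
    c = mean_curr - phi * mean_prev
    return c, phi


def forecast_ar1(last_value, c, phi, steps=1):
    """Iterate the fitted model forward, feeding each prediction back in."""
    forecasts = []
    value = last_value
    for _ in range(steps):
        value = c + phi * value
        forecasts.append(value)
    return forecasts


# Synthetic, noise-free series (assumed data, for illustration only).
traffic = [0.0]
for _ in range(10):
    traffic.append(2.0 + 0.5 * traffic[-1])

c, phi = fit_ar1(traffic)
next_two = forecast_ar1(traffic[-1], c, phi, steps=2)
```

On real traffic the series is noisy, so the fitted coefficients are estimates and multi-step forecasts drift toward the long-run mean.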

Fig. 2. Network traffic prediction techniques

Table 2. Network traffic prediction techniques details

4 Network Traffic Prediction System

4.1 System Architecture

See Fig. 3.

Fig. 3. Prediction system architecture

4.2 Algorithm for Prediction

  • Step 1: FLOW CAPTURE - The packet flow or network flow is captured and stored temporarily for analysis.

  • Step 2: FLOW EXPORTER - The exporter creates flow records from the network traces.

  • Step 3: FLOW COLLECTOR - The flow collector generates statistics from the stored flow data.

  • Step 4: FLOW ANALYZER - A behavior profile is created for each device.

  • Step 5: PREDICTION MODULE - The module predicts future network flow data and the behavior of the related nodes.

  • Step 6: ACTION TAKEN - Security policies and safeguarding actions appropriate to the type of attack are applied or invoked.
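The six steps above can be sketched as a small pipeline. This is an illustrative sketch only, not the architecture of Fig. 3: the `Packet` record, the `(src, dst)` flow key, the moving-mean predictor, and the policy names `rate_limit` and `allow` are all assumptions introduced for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    # Step 1 output: hypothetical minimal capture record.
    timestamp: float
    src: str
    dst: str
    size: int


def export_flows(packets):
    """Step 2: group captured packets into flow records keyed by (src, dst)."""
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for p in packets:
        record = flows[(p.src, p.dst)]
        record["packets"] += 1
        record["bytes"] += p.size
    return dict(flows)


def collect_stats(flows):
    """Step 3: compute total bytes sent per source device."""
    totals = defaultdict(int)
    for (src, _dst), record in flows.items():
        totals[src] += record["bytes"]
    return dict(totals)


def update_profiles(profiles, stats):
    """Step 4: append this interval's byte count to each device's history."""
    for device, total in stats.items():
        profiles.setdefault(device, []).append(total)
    return profiles


def predict_next(history, window=3):
    """Step 5: placeholder predictor, the mean of the last `window` intervals."""
    recent = history[-window:]
    return sum(recent) / len(recent)


def choose_action(predicted_bytes, threshold):
    """Step 6: map a prediction to a (hypothetical) security policy name."""
    return "rate_limit" if predicted_bytes > threshold else "allow"


# One capture interval with made-up packets.
captured = [
    Packet(0.0, "a", "b", 100),
    Packet(0.1, "a", "b", 50),
    Packet(0.2, "a", "c", 10),
    Packet(0.3, "b", "a", 5),
]
profiles = update_profiles({}, collect_stats(export_flows(captured)))
action = choose_action(predict_next(profiles["a"]), threshold=120)
```

In practice the predictor in Step 5 would be replaced by one of the statistical or composite techniques from Section 3, and the profile in Step 4 would track more than byte counts.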

5 Performance Evaluation Metrics

See Tables 3 and 4.

Table 3. Types of metrics used to evaluate network traffic prediction model [28]

Table 4. Network traffic prediction literature study summary
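The contents of Table 3 are not reproduced in this text. As an illustration only, the sketch below computes three error measures that are widely used to evaluate forecasts: mean absolute error (MAE), root mean squared error (RMSE), and mean absolute percentage error (MAPE). Whether these are the specific metrics listed in Table 3 is an assumption, and the sample values are made up.

```python
import math


def _check(actual, predicted):
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")


def mae(actual, predicted):
    """Mean absolute error."""
    _check(actual, predicted)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error; penalizes large errors more than MAE."""
    _check(actual, predicted)
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error; undefined when an actual value is 0."""
    _check(actual, predicted)
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined for zero actual values")
    ratios = (abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * sum(ratios) / len(actual)


# Made-up observed and predicted traffic volumes.
observed = [100.0, 200.0, 300.0]
forecast = [110.0, 190.0, 330.0]
scores = {
    "MAE": mae(observed, forecast),
    "RMSE": rmse(observed, forecast),
    "MAPE": mape(observed, forecast),
}
```

MAE and RMSE are in the units of the traffic measure, while MAPE is scale-free, which makes it easier to compare across datasets but unusable on intervals with zero traffic.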

6 Conclusion

With ever-growing network traffic, the present is the era of big data. This data can be explored and used to predict network traffic, and such prediction helps reduce the time to respond to anomalies. In this paper we surveyed various network traffic prediction techniques. Prediction methods based on statistics and neural networks are discussed. Performance metrics used in previous studies such as [10, 13, 16, 18] are listed. The tabular view of the surveyed papers focuses on prediction techniques for network traffic, and groups the standard datasets used by the implemented algorithms and the metrics used to evaluate their results. Such a review should give new researchers an insight into the topic.