1 Introduction

Network traffic, commonly defined as the amount of data being transferred across a network at a specific time, is increasing at a drastic rate as the Internet continues to grow in scope and complexity [1]. Network traffic can also be measured in terms of bandwidth or transmission capacity and is an important factor when determining the quality and speed of a network. The emergence of more and more applications running on Internet Protocol (IP) networks in different fields, including not only traditional Internet services such as WWW, FTP and e-mail but also multimedia services such as multimedia streaming, P2P file sharing and gaming, has driven network bandwidth from hundreds of Mbps to busier and faster wireless networks of more than 10 Gbps [2]. It is therefore crucial for networks to be monitored in order to understand their behavior in terms of applications, bandwidth usage and utilization of network resources, and to detect network anomalies and security issues, hence preventing overall network performance degradation or failure. The two main operations encompassing network analytics are traffic monitoring and traffic classification. Network traffic monitoring tools are employed by administrators to check availability and maintain system stability by fixing network problems promptly and ensuring network security. Traffic classification, on the other hand, helps to identify the different applications and protocols that utilize the network's resources. While network analytics is not essential for private networks, it is an indispensable tool for large business operators: it gives them a better understanding of their networks and eventually enables them to make smarter, data-driven decisions to attain the desired operational outcomes and meet customers' needs. In other words, the process involves the study of network data and statistics to identify trends and patterns for easy detection and elimination of anomalies [3]. An overview of recent publications that have proposed interesting approaches to IP traffic classification is given next.

In [4], Parsaei et al. applied ML algorithms to traffic captured from a Software-Defined Network (SDN). Four ML algorithms, namely feedforward neural network, Multi-Layer Perceptron (MLP), Levenberg–Marquardt and Naïve Bayes, were used. To identify specific flows, features such as source port, destination port, source IP, destination IP and transport layer protocol were used. Testing of the classifier models yielded an accuracy of 95.6% for the feedforward network, 97% for MLP and Levenberg–Marquardt, and 97.6% for the Naïve Bayes algorithm. The study successfully attained its objective of minimizing controller processing overhead and network traffic. In [5], a comparative analysis of ML algorithms for the classification of traffic from Internet applications was performed. For data collection, real-time network traffic was captured for one minute using the Wireshark software, and the Weka toolkit was used for classification. Traffic from WWW, DNS, FTP, P2P and Telnet applications was targeted. The classification model was constructed by applying four machine learning algorithms, namely Naive Bayes, Bayes Net, C4.5 and Support Vector Machine (SVM). It was found that the C4.5 algorithm gave the highest classification accuracy at 79%. The results also revealed that the recall and precision values for the DNS and WWW applications were lower than those of the remaining applications.

In [6], Singh and Agrawal conducted a classification of IP traffic using an ML approach. The performance of five ML algorithms was evaluated based on parameters such as classification accuracy, training time, and precision and recall values. It was found that, for the full feature dataset, the Bayes Net classifier gave the best classification accuracy of 85.3%. Recall and precision values of 100% were recorded with Bayes Net for FTP, P2P, VoIP and IM. In [7], Sohi et al. made use of three ML algorithms, Bayes Net, RBF and C4.5, for classifying Internet traffic into educational and non-educational applications. The educational websites used included IEEE, Science Direct and SparkNotes, while the non-educational ones included BitTorrent and Yahoo Messenger. It was found that Bayes Net gave a classification accuracy of 76.6%, making it the most accurate among the three classifiers. It also outperformed the RBF and C4.5 classifiers in terms of recall and precision for both educational and non-educational Internet applications.

In [8], the authors presented several criteria to assess existing network data capture mechanisms. An extensive review of state-of-the-art network data collection techniques, such as packet-, flow- and log-based methods, was performed with an in-depth analysis of their benefits and drawbacks, using the proposed criteria as a means for systematic evaluation. The evaluation criteria were based on system performance indicators such as instantaneity, effectiveness, scalability and expense, among others. A number of open problems and several possibilities for future research were also identified. In [9], a study on the selection of features from network traffic for the detection of anomalies was presented. The work focused on data preprocessing and outlined the importance of feature selection. This step helps to remove redundant features and hence allows for faster processing and storage of data by reducing resource consumption. To evaluate the performance of the selected feature sets, ML algorithms such as KNN, Naïve Bayes, Decision Trees, Artificial Neural Networks (ANN) and SVM were deployed. The authors assessed the performance of the classifiers with datasets consisting of 41, 30 and 16 features. It was observed that the classifiers performed better with feature sets of smaller size. With 41 features, the Bayes classifier showed a high false positive rate, considering almost every new sample to be an attack. Its performance greatly improved with 16 features, but at the cost of reduced anomaly detection power.

Building upon the works previously described, this paper aims at analysing the network traffic of an 802.11 wireless LAN by first capturing a maximum amount of traffic information from ongoing sessions of Internet applications using three network monitoring tools, namely PRTG, Wireshark and Colasoft Capsa. The applications employed are YouTube, Skype, BitTorrent, Google Drive, Browsing and FTP sharing. Traffic generated during downloading, uploading, streaming and idle states is also captured. The collected data are then used in the evaluation of 8 ML classification algorithms, serving as analytic tools. Moreover, the effect of anomalies in the form of a DDoS attack and rogue servers on the network performance is also examined.

The remainder of this paper proceeds as follows: Sect. 2 describes how each traffic capture tool is used for feature extraction. Section 3 describes the classification algorithms used for the analytics and how the analytics are performed with the Weka Toolkit. Section 4 describes the system model used for capturing and analyzing network traffic for different applications, states and anomalies. Section 5 presents the extracted features and the classification results of each scheme, together with an evaluation and analysis of the performance of each classifier with different feature sets. Section 6 concludes the paper with some recommendations for future work.

2 Feature extraction tools

Based on previous research, three network monitoring tools were chosen for this study: PRTG, Wireshark and Capsa. Their main features are outlined in the following subsections.

2.1 PRTG

PRTG [10, 11] is a network monitoring tool developed by Paessler. While PRTG is not capable of functioning as an intrusion detection system, it acts as a preventive system and warns against anomalous activities in a network.

Key features of PRTG:

  • Monitoring of network performance in terms of bandwidth and application usage.

  • Monitoring of system usage (CPU loads, free memory, free disk space) of hardware devices.

  • Makes use of a statistical approach by setting threshold values for traffic parameters, and hence detects and alerts on anomalies such as unexpected load peaks, abnormally heavy traffic, downtimes and slow servers. Spikes in activity can signal a threat.

  • User-friendly graphics engine that makes network activity accessible in the form of tables and graphs and hence facilitates analysis of network usage.

  • Efficient database system that provides storage of raw monitoring data and a report generator to create both live and scheduled reports as CSV, HTML or XML data files.

  • Network analysis modules for automatic discovery of network devices and sensors.

Several sensors are used by PRTG to track and display network traffic. Four sensors were deployed for traffic capture and feature extraction: the Windows Network Card, Ping [12], DNS and system health sensors.
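To illustrate how such sensor readings can be turned into classification features, the following minimal Python sketch loads a PRTG report exported as a CSV file and attaches a class label. The file name and column headers are hypothetical and depend on the configured sensors and report template; the sketch only shows the general shape of the preprocessing, not the exact procedure used in this study.

import pandas as pd

# Load a PRTG report exported as CSV (hypothetical file and column names;
# the actual headers depend on the sensors and the report template).
report = pd.read_csv("prtg_report.csv")

# Keep the columns used as classification features in this sketch:
# traffic volume from the network card sensor, ping time and the
# system-health readings (CPU load, available memory).
feature_cols = ["Traffic In (kbit/s)", "Traffic Out (kbit/s)",
                "Ping Time (ms)", "CPU Load (%)", "Available Memory (MB)"]
features = report[feature_cols].dropna().copy()

# Label every row with the application running during the capture;
# the label column is what the ML classifiers will later predict.
features["class"] = "YouTube"
features.to_csv("prtg_youtube_features.csv", index=False)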

2.2 Wireshark

Wireshark [13, 14] is an open-source network protocol analyser or sniffer that captures and displays data traversing a network in the form of packets. The main features of Wireshark include:

  • Ability to perform live capture of packets and deep offline analysis of protocols and packet contents.

  • Reading of live data from several interfaces such as IEEE 802.11, Ethernet, Bluetooth, ATM, USB, among others.

  • Provides powerful filters for selecting specific protocols for analysis.

  • Use of coloring rules to highlight packets for quick and easy identification of different protocols.

The captured traffic obtained from Internet applications is saved as CSV files for further processing.
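As a rough illustration of this step, the sketch below aggregates a packet-level CSV export from Wireshark into per-interval features. The file name is an assumption, and the column names follow Wireshark's default packet-list columns (No., Time, Source, Destination, Protocol, Length); the 15-second interval matches the sampling interval used later in this study.

import pandas as pd

# One row per packet, as produced by Wireshark's CSV export
# (hypothetical file name; columns follow the default packet list).
packets = pd.read_csv("wireshark_youtube.csv")

# Group the packet-level rows into 15-second intervals so that each
# aggregated row becomes one sample for the classifiers.
packets["interval"] = (packets["Time"] // 15).astype(int)
flows = packets.groupby("interval").agg(
    packet_count=("No.", "count"),
    total_bytes=("Length", "sum"),
    mean_length=("Length", "mean"),
    tcp_packets=("Protocol", lambda p: (p == "TCP").sum()),
    udp_packets=("Protocol", lambda p: (p == "UDP").sum()),
)

flows["class"] = "YouTube"   # label for the application being captured
flows.to_csv("wireshark_youtube_features.csv", index=False)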

2.3 Colasoft Capsa free

Colasoft Capsa [15] is a free network traffic and protocol analyser with a rich set of features [16]. It provides graphical statistics for the global network as well as for specific nodes in a dashboard tab. A graphical display of both broadcast and multicast packets [17] traversing the network is obtained with Capsa. It also gives the packet count for TCP and UDP traffic, along with the number of TCP FIN and TCP RST packets sent, and allows the displayed data to be saved in CSV format. Protocol statistics include features such as sent and received packets and bytes as well as average packets per second.

3 Classification of network traffic using machine learning in Weka

This section describes the main classification algorithms used and how the classification was performed using Weka.

3.1 Classification algorithms

Machine Learning techniques help to identify different applications and protocols in a network by grouping them based on packet flow parameters. These include minimum, maximum and mean number of packets, packet length, flow duration, traffic rate, volume, etc. ML classification techniques can be of two types: supervised and unsupervised [5].

In the supervised learning technique, a completely labeled dataset is required in order to classify unknown samples. This dataset is used to train a model which then predicts output responses for a new set of data. The unsupervised machine learning approach does not rely on a labeled dataset. It cannot be applied directly for classification because the output classes are unknown.

A set of 8 ML algorithms is used in this work: Naive Bayes, Bayesian Network (Bayes Net), Multi-Layer Perceptron (MLP), Support Vector Machine (SVM), Radial Basis Function (RBF) Neural Network, K-Nearest Neighbours (KNN), Bagging and the C4.5 Decision Tree. A detailed description of these techniques can be found in references [18,19,20,21,22].
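For readers who wish to reproduce a comparable experiment outside Weka, the sketch below assembles approximate scikit-learn counterparts of these algorithms. It is only an analogue of the Weka set-up used in this study: Bayes Net and the RBF network have no direct scikit-learn implementation and are omitted, and the CART decision tree stands in for C4.5 (J48).

from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

# Approximate counterparts of the Weka classifiers used in this work.
# Bayes Net and the RBF neural network are omitted (no direct scikit-learn
# implementation); DecisionTreeClassifier (CART) approximates C4.5/J48.
classifiers = {
    "Naive Bayes": GaussianNB(),
    "MLP":         MLPClassifier(max_iter=500),
    "SVM":         SVC(),
    "KNN":         KNeighborsClassifier(n_neighbors=3),
    "Bagging":     BaggingClassifier(),
    "C4.5 (CART)": DecisionTreeClassifier(),
}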

3.2 Classification using Weka Toolkit

The classification process was performed using the Weka toolkit [23], which is used as a data mining tool to implement IP traffic classification with ML algorithms. The overall process involves feeding the feature sets, containing information about each sample together with its label, into the machine learning algorithm to generate a classifier model. The efficiency and accuracy of the obtained model in capturing a pattern are then determined by comparing the labels generated by the model for the inputs in a test set with the correct labels for those inputs. This classification process is illustrated in Fig. 1.

Fig. 1
figure 1

ML Classification in Weka
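A minimal sketch of this train-and-test workflow is given below, again using scikit-learn as an analogue of Weka. The labelled CSV file name and its 'class' column are assumptions, and the 66/34 percentage split mirrors Weka's default hold-out evaluation rather than the exact procedure reported here.

import time
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical labelled feature set: the "class" column holds the
# application label, all remaining columns are numeric features.
data = pd.read_csv("capsa_applications.csv")
X, y = data.drop(columns=["class"]), data["class"]

# Hold out part of the samples so each model is scored on inputs it
# has never seen, mirroring the Weka train/test procedure above.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.34, stratify=y, random_state=1)

for name, clf in {"Naive Bayes": GaussianNB(),
                  "KNN": KNeighborsClassifier(n_neighbors=3)}.items():
    start = time.time()
    clf.fit(X_train, y_train)                     # build the classifier model
    train_time = time.time() - start
    acc = accuracy_score(y_test, clf.predict(X_test))
    print(f"{name}: accuracy = {acc:.3f}, training time = {train_time:.2f} s")

The full classifier line-up from the previous sketch can be substituted into the loop to compare all eight schemes in the same way.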

The performance of the classifiers was based on the following criteria:

  1. (i)

    Classification Accuracy.

    Accuracy is the simplest metric deployed to evaluate a classifier. It gives the percentage of inputs in the test set that the classifier correctly labeled.

    $$Accuracy = \frac{\sum TP + \sum TN}{\text{Total number of samples}}$$
    (1)

    where True Positives (TP) are relevant items correctly identified as relevant and True Negatives (TN) are irrelevant items correctly identified as irrelevant.

To define the remaining parameters, False Positives (FP) and False Negatives (FN) are also used. FP denotes irrelevant items incorrectly identified as relevant, while FN represents relevant items incorrectly identified as irrelevant.

  2. (ii)

    Precision (P).

    Precision indicates the proportion of items identified as relevant that are actually relevant and is given by:

    $$Precision = \frac{TP}{TP + FP}.$$
    (2)
  3. (iii)

    Recall (R).

    Recall indicates the proportion of relevant items that are correctly identified:

    $$Recall = \frac{TP}{TP + FN}.$$
    (3)
  4. (iv)

    The F-Measure (or F-Score).

    This combines precision and recall into a single score and is the harmonic mean of the two.

    $$F{-}Measure = \frac{2 \times Precision \times Recall}{Precision + Recall}.$$
    (4)
  5. (v)

    The Confusion Matrix.

The confusion matrix summarises the performance of a classifier. For a two-class problem in which P denotes the first class and N the second, the confusion matrix can be represented as shown in Fig. 2. A small worked example of these metrics follows Fig. 2.

Fig. 2
figure 2

Confusion matrix [24]
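As a worked example of Eqs. (1)–(4), the short Python snippet below evaluates the metrics on a hypothetical two-class confusion matrix whose counts are chosen purely for illustration.

# Hypothetical binary confusion matrix (illustrative values only).
TP, FN = 45, 5     # relevant samples labelled correctly / incorrectly
FP, TN = 10, 40    # irrelevant samples labelled incorrectly / correctly

accuracy  = (TP + TN) / (TP + TN + FP + FN)                 # Eq. (1) -> 0.85
precision = TP / (TP + FP)                                  # Eq. (2) -> 0.818
recall    = TP / (TP + FN)                                  # Eq. (3) -> 0.90
f_measure = 2 * precision * recall / (precision + recall)   # Eq. (4) -> 0.857

print(accuracy, precision, recall, f_measure)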

4 Experimental set-up and testing procedures

The overall set-up for the experiments is shown in Fig. 3. The tests were performed on a PC connected to a Wi-Fi network. For this project, a workstation with a 2.70 GHz Intel Core i5 CPU, 4 GB RAM and a 64-bit Windows 10 operating system was used. The network interface discovery feature in PRTG, Capsa and Wireshark was enabled to monitor IP traffic for the Wi-Fi network on the PC.

Fig. 3
figure 3

Overall implemented system

Data was captured for a duration of 30 min in intervals of 15 s for the on-going session of each application and state. For the classification of applications, three datasets of 700 samples each were built from raw data captured from the three monitoring tools and were saved as CSV files.

Streaming, uploading, downloading and idle states were considered for further classification. The datasets for state classification consisted of 470 samples.

As for anomaly classification, datasets of 700 samples with three classes labeled as normal, DDoS and Rogue Servers were used. The ‘normal’ class was obtained by running Internet applications under normal conditions.

The network traffic monitoring tools as well as the Weka classifier application were run on the PC. The classification algorithms were used to classify six different Internet applications, namely video streaming on YouTube, file download and upload via Google Drive, browsing, video conferencing, FTP transfer and P2P file sharing. The experiment was performed for four different states in which the PC can be set, namely streaming, uploading, downloading and idle. Moreover, the classification of two different anomalies, namely DDoS attacks and rogue servers, was also investigated. Details of these testing conditions are given in the following sub-sections.
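The sketch below illustrates, under assumed file names, how per-application feature files exported from one of the monitoring tools can be merged into a single labelled dataset of the kind described above.

import pandas as pd

# Hypothetical per-application feature files, one per captured application.
applications = ["youtube", "googledrive", "skype",
                "browsing", "ftp", "bittorrent"]

frames = []
for app in applications:
    df = pd.read_csv(f"capsa_{app}_features.csv")
    df["class"] = app                  # label each sample with its application
    frames.append(df)

# Concatenate into one labelled dataset ready for Weka or scikit-learn.
dataset = pd.concat(frames, ignore_index=True)
dataset.to_csv("capsa_applications.csv", index=False)
print(len(dataset), "samples,", dataset["class"].nunique(), "classes")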

4.1 Applications and protocols

Most Internet applications operate according to the client/server model in the application layer of the TCP/IP model. A client is a device that requests information and a server is the device that responds to the request. The format of the requests and responses exchanged between clients and servers is generally defined by application layer protocols [25].

The applications monitored in this study are hereafter described.

  1. (i)

    Online (Real-Time) Streaming.

Real-time streaming implies sending audio or video data that is played by the receiver at the other end with a negligible and consistent delay. This process can involve only a sender and a receiver, hence point-to-point, or one sender and several receivers, called broadcast. Real-time streaming prioritises the quick and continuous delivery of data. For this purpose, the User Datagram Protocol (UDP) is used to deliver continuous information without re-sending dropped packets as TCP does [26].

Application used: YouTube.

  2. (ii)

    Upload and Download via e-mail.

Upload refers to the transfer of data from a client to a server, while data transfer from a server to a client is called download. During e-mail operations, the Mail User Agent (MUA), or e-mail client application, is usually used. The e-mail client uses the Post Office Protocol (POP) to receive e-mail messages from an e-mail server, while the Simple Mail Transfer Protocol (SMTP) allows e-mail to be sent from either a client or a server.

Application used: Google Drive.

  3. (iii)

    Video Conferencing.

Video conferencing via the Internet makes use of Voice over Internet Protocol (VoIP). VoIP technology enables voice to be transmitted over the Internet as a digital signal [27].

Application used: Skype.

  4. (iv)

    Web Browsing.

World Wide Web (WWW) services are accessible through a web server. To establish a connection to a web service on a server, the web browser uses the Hypertext Transfer Protocol (HTTP). The server runs background services so that the files requested by the client are available. The browser converts the information received from the server into plain text or HTML format and displays it for the user.

Application used: WWW services.

  5. (v)

    FTP Transfer.

File Transfer Protocol (FTP) enables file transfer between a client and a server. FTP establishes two connections between the client and the server for a successful transfer. The first connection, used for commands and replies, is made by the client to the server on TCP port 21. The second connection is then made over TCP port 20 for the actual file transfer. A client-side sketch of such a transfer is given at the end of this subsection.

Application used: FileZilla Server.

  6. (vi)

    P2P Applications.

A Peer-to-Peer (P2P) application is one where a device can behave as both the client and the server during the same transfer process. P2P implies requesting information from other computers rather than from a dedicated server; each peer can therefore act as both client and server. Both ends can set up a connection and have equal priority.

Application used: BitTorrent.
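The sketch below illustrates the client side of the FTP transfer described in item (v), using Python's standard ftplib module. The server address, credentials and file name are hypothetical; the control connection is opened on TCP port 21, while the data connection for the actual transfer is negotiated separately.

from ftplib import FTP

# Hypothetical FTP server (e.g. the FileZilla Server used in this study),
# reachable on the default control port 21.
ftp = FTP()
ftp.connect("192.168.1.10", 21)      # control connection: commands and replies
ftp.login("user", "password")        # hypothetical credentials

# Retrieve a file over the separately negotiated data connection.
with open("sample_file.bin", "wb") as fh:
    ftp.retrbinary("RETR sample_file.bin", fh.write)

ftp.quit()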

4.2 Network anomalies

Much work has been done in the area of network anomaly detection. The problem is usually approached using Artificial Intelligence and Machine Learning techniques.

In this project, 2 types of anomalies are investigated: (1) Distributed Denial of Service (DDoS) and (2) Rogue Servers.

A DDoS attack refers to the disruption of the normal traffic of a server by bombarding the targeted server with excessive Internet traffic, eventually jamming the network infrastructure and preventing desired traffic from reaching its destination [28]. For this research work, a DDoS attack is generated through code written in JavaScript which continuously opens new tabs in Google Chrome, preventing the user from accessing the network and servers, loading the network infrastructure, and slowing down or completely shutting down the operation of Internet applications. The code was run in the NetBeans IDE.

Rogue servers are set up on a network to disrupt access to a target server. They make use of the Dynamic Host Configuration Protocol (DHCP), a network protocol that allows an IP address from a given range to be automatically assigned to a computer by a server. Rogue server attacks are launched by attackers in the form of sniffing and reconnaissance attacks, among others [29, 30]. To create rogue servers in the system under study, a script was written in JavaScript consisting of three rogue servers, each made to listen on ports 50300, 50302 and 50305 respectively. These port numbers fall within the dynamic/private port range of 49152–65535. The code was run on Node.js.
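For illustration, the following is a Python analogue of the Node.js rogue-server script described above: three TCP listeners bound to the same dynamic-range ports. It is a simplified stand-in for the actual code used in the experiments.

import socket
import threading

# Ports allocated to the three rogue servers (dynamic/private range).
ROGUE_PORTS = [50300, 50302, 50305]

def rogue_server(port):
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("0.0.0.0", port))
    srv.listen(5)
    print(f"Rogue server listening on port {port}")
    while True:
        conn, _addr = srv.accept()   # accept incoming clients and drop them
        conn.close()

for port in ROGUE_PORTS:
    threading.Thread(target=rogue_server, args=(port,), daemon=True).start()

threading.Event().wait()             # keep the listeners running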

5 Results and analysis

5.1 Features extracted from monitoring tools

Table 1 shows the list of features obtained from the three monitoring tools.

Table 1 List of extracted features from PRTG, Wireshark and Capsa

The performance and efficiency of the 8 ML classifiers were tested for the classification of applications, states and anomalies. The applications are YouTube, Google Drive, Skype, Browsing, FTP and BitTorrent. The four states are Downloading, Uploading, Streaming and Idle. For the classification of anomalies, a feature set with three classes labeled as ‘normal’, ‘DDoS’ and ‘Rogue servers’ is used.

5.2 Classification of applications

The classification accuracy (A), training time (T) and root mean square error (RMSE) obtained from the ML algorithms for classification of applications characterized by traffic flow features extracted from PRTG, Capsa and Wireshark are tabulated below (Table 2).

Table 2 Evaluation metrics for classification of applications

For application classification based on PRTG features, the KNN algorithm gives the best accuracy, at 98.7%, but also the highest training time, at 16.3 s. The RBF neural network is therefore considered the best classifier in this case, with a classification accuracy of 98.3%, a training time of 0.99 s and a root mean squared error of 7.1%.

For the Capsa feature set, Naïve Bayes best classifies the applications with an accuracy of 100%, shortest training time of 0.03 s and zero error.

The KNN classifier again has the highest classification accuracy, but its high training time of 7.9 s also makes it unsuitable for classifying applications using Wireshark features. Bayes Net gives an accuracy of 99.6% with the shortest training time, making it the most efficient application classifier in the Wireshark case.

Figure 4 compares the classification accuracy obtained with the three feature sets in order to identify the most appropriate one for classifying applications.

Fig. 4
figure 4

Comparison of classification accuracy between PRTG, Capsa and Wireshark

KNN has the best accuracy for all three tools, but at the cost of high training times. Besides, most ML classifiers best classify applications characterized by features from Capsa, except for the SVM classifier, which gives higher accuracy with the Wireshark features.

Figure 5 below displays the precision and recall values obtained from the ML classifiers in the classification of applications with PRTG features.

Fig. 5
figure 5

Precision and Recall of ML classifiers from classification of applications using PRTG dataset

It can be seen that Google Drive is the best classified application in terms of precision: all 8 ML classifiers give a precision value of 1, representing 100% precision. Since the Google Drive application was used to upload a 150 MB video file to the server, the traffic generated included a larger amount of sent bytes and packets and a smaller volume of incoming data compared to the other applications, and could therefore be easily distinguished by the ML classifiers.

It can also be seen that the Bayes Net, Naïve Bayes and RBF classifiers give 100% precision for the BitTorrent, Browsing, FTP and Google Drive applications, while the RBF network gives 100% precision for all applications except YouTube.

SVM is the worst classifier, with very low recall values for most applications compared to the other ML schemes. Bayes Net gives 100% recall for BitTorrent, FTP, Skype and YouTube, and MLP gives the same for BitTorrent, FTP, Google Drive and Skype. However, the RBF network gives 100% recall for five applications and is hence chosen as the best ML classifier in this case.

The behavior of RBFNN and SVM can be further explained by their respective confusion matrices as in Table 3.

Table 3 Confusion matrices for RBFNN and SVM

It can be clearly observed that all applications are correctly identified as themselves in the case of RBFNN, while SVM fails to distinguish between the different applications: 25 YouTube samples, 17 Google Drive samples and 28 Skype samples are classified as Browsing, and 38 FTP and 14 BitTorrent instances are also classified as Browsing. This validates the high FP rate of 62.2% obtained for the Browsing application as tabulated above.

Figure 6 shows the precision and recall values obtained from the ML classifiers with Capsa features. Google Drive proved to be the best classified application, with 100% precision and recall from all ML classifiers except SVM. Naïve Bayes, RBFNN, MLP and KNN give 100% recall and precision for all applications considered, while SVM has a very poor performance. Combined with the accuracy results in Table 2, Naïve Bayes is the best classifier for application classification based on Capsa features.

Fig. 6
figure 6

Precision and Recall of ML classifiers from classification of applications using Capsa dataset

The behavior of Naïve Bayes and SVM can be further explained by their respective confusion matrices, as shown in Table 4.

Table 4 Confusion matrices for Naive Bayes and SVM

All 238 instances in the Capsa test set are correctly identified by Naïve Bayes. SVM, on the other hand, classifies all Google Drive, Skype, FTP and BitTorrent samples as YouTube. This validates the high FP rate of 0.881 (88.1%) obtained for the YouTube application.

The precision and recall values obtained from the ML classifiers with Wireshark features are as shown in Fig. 7. BitTorrent application is best classified in terms of Precision while FTP is best classified in terms of Recall. An F-score of 100% is obtained by 5 classifiers for both applications. BitTorrent is easily distinguished by the classifiers since it consists of Peer-to-Peer sharing and involves more UDP packets than other applications.

Fig. 7
figure 7

Precision and Recall of ML classifiers from classification of applications using Wireshark dataset

It can also be observed that C4.5 Decision Tree demonstrates high efficiency by giving 100% precision for five of the six applications. It also gives the best recall value for all applications except for Browsing. On the other hand, Naïve Bayes classifier gives the worst performance in terms of both precision and recall. Although Bayes net gave the highest classification accuracy, it is less reliable than C4.5 in terms of recall and precision.

The classification accuracy and training time of C4.5 was found to be 99.1% and 0.38 s respectively, making it an acceptable ML scheme for the classification of applications.

The behavior of C4.5 DT and Naïve Bayes can be further explained by their respective confusion matrices as shown in Table 5.

Table 5 Confusion matrices for C4.5 DT and Naive Bayes

Most applications are correctly identified as themselves in the case of C4.5 DT; only two Browsing instances are mistaken for the YouTube application. On the contrary, Naïve Bayes largely fails to distinguish between the different applications.

Table 6 gives the average values of the performance evaluation metrics, i.e., TP and FP rates, precision (P), recall (R) and F-measure (F), for the overall classification of the six applications.

Table 6 Average precision and recall values for overall classification of applications

It further confirms that the Capsa feature set is the best for the classification of applications. KNN, MLP, Naïve Bayes and RBF Network give 100% precision and recall for Capsa, while none of them gives ideal values for PRTG. As for Wireshark, recall and precision values of 1 are only obtained with C4.5 and KNN. Thus, it can be deduced that for application classification, better performance is achieved by the ML algorithms when a dataset with more features is used. Also, only the Capsa dataset contains detailed information about IP, TCP and UDP traffic, which largely contributes to the proper classification of Internet applications.

5.3 Classification of states

The classification accuracy, training time and root mean square error obtained from the ML algorithms in the classification of the four states, namely Downloading, Uploading, Streaming and Idle are tabulated in Table 7.

Table 7 Algorithms’ performance evaluation metrics for state classification

Figure 8 compares the accuracy of the ML algorithms.

Fig. 8
figure 8

Comparison of classification accuracy between PRTG, Capsa and Wireshark for state classification

For state classification based on PRTG features, the Bayes Net, Naïve Bayes, MLP, KNN and Bagging algorithms give 100% classification accuracy. However, Bayes Net and Naïve Bayes are considered the best classifiers in this case, with a training time of 0.02 s and zero RMSE.

For the Capsa feature set, Naïve Bayes best classifies the states with an accuracy of 100%, shortest training time of 0.02 s and zero error.

The best classifier based on classification accuracy and training time using Wireshark feature set is Bayes Net.

Bayes Net displays the best overall classification performance for all three tools. Besides, most ML classifiers best classify states characterized by features from PRTG, except for the SVM classifier, which gives higher accuracy with Wireshark features.

The precision and recall values obtained from the ML classifiers in the classification of states with PRTG features are summarised in Fig. 9. In terms of precision, it can be seen that Uploading and Downloading are the best classified states: all 8 ML classifiers give a precision value of 1, representing 100% precision. It can be difficult to differentiate between uploading and downloading sessions of the same video file, as both mainly involve TCP traffic. However, downloading implies a larger volume of incoming traffic and less outgoing traffic, and vice versa for the uploading state. It can therefore be deduced that PRTG provides concise features that fully contribute to distinguishing and classifying these two states.

Fig. 9
figure 9

Precision and Recall of ML classifiers from classification of states using PRTG dataset

As for the other two states, i.e., Streaming and Idle, they are perfectly classified by all algorithms, except C4.5 and SVM for Idle and Streaming respectively.

From the recall chart, it is clearly seen that SVM is the worst classifier with very low recall value for 3 out of 4 states. Bayes Net, Naïve Bayes, MLP, RBFNN, KNN and Bagging are equally good classifiers and they give 100% recall and precision for all four states. Streaming is the best classified state in terms of recall.

The behavior of Bayes Net and SVM can be further explained by their respective confusion matrices as shown in Table 8.

Table 8 Confusion matrices for Bayes Net and SVM

All 160 instances in the PRTG test set are correctly identified by Bayes Net. SVM, on the other hand, classifies almost half of the Downloading samples, 20 Idle state samples and 24 out of 44 Uploading samples as Streaming. However, no Streaming samples are classified as other states. This is why a poor FP rate of 51.6% but a perfect TP rate of 100% are obtained for the SVM classifier with the PRTG feature set.

The precision and recall values obtained from the ML classifiers with Capsa features are summarized in Fig. 10. SVM gives non-zero precision and recall only for the Idle state. In addition, 100% recall is achieved with all ML algorithms for the Idle state. Thus, the features extracted from Capsa are best suited for the classification of the Idle state. The least amount of traffic is generated when the PC is idle and not being used for Internet applications, so the Idle state is easier to differentiate from the other three states.

Fig. 10
figure 10

Precision and Recall of ML classifiers from classification of states using Capsa dataset

Naïve Bayes, RBF network, MLP, KNN and Bagging give 100% precision and recall for all four states.

The behavior of the best classifiers and SVM can be further explained by their respective confusion matrices as shown in Table 9.

Table 9 Confusion matrices

SVM is unable to distinguish the states: 158 of the 160 samples in the test set are identified as Idle. This is reflected in the high false positive rate of 98.3% given by SVM for the Idle state in the table above.

The precision and recall values obtained from the ML classifiers in the classification of states with Wireshark features are shown in Fig. 11. Bayes Net gives 100% precision for the downloading, uploading and idle states and 100% recall for the downloading, streaming and uploading states, making it the most efficient algorithm for state classification using Wireshark features. On the other hand, the lowest precision and recall are obtained with Naïve Bayes and the RBF network, which explains their low classification accuracy values.

Fig. 11
figure 11

Precision and Recall of ML classifiers from classification of states using Wireshark dataset

During the Idle state, the protocols traversing the network are mainly ARP and ICMP requests, in contrast to the other states, which involve TCP and UDP packets. Since Wireshark characterizes samples by protocol, the Idle state is the one most easily identified and classified by the ML algorithms in terms of precision and recall.

Table 10 gives the Confusion matrices for the Bayes Net and Naive Bayes classifiers.

Table 10 Confusion matrices for Bayes Net and Naive Bayes

It can be clearly observed that all states are correctly identified as themselves in the case of Bayes Net, while Naïve Bayes fails to distinguish between the different states: Downloading samples are classified as Uploading, Idle and Streaming, 13 Uploading samples are classified as Streaming and Idle, and 20 Streaming samples are classified as Uploading and Idle. This validates the high FP rate of 31.4% obtained for the Idle state, 20.7% for Uploading and 7.4% for Streaming.

Table 11 gives the average values of the performance evaluation metrics for the overall classification of the 4 states.

Table 11 Average precision and recall values for overall classification of states

It can be concluded that state classification using Wireshark gives the poorest performance among the three monitoring tools. Moreover, 100% precision and recall are obtained with 6 out of 8 ML classifiers using PRTG compared to 5 classifiers when using Capsa. It can be confirmed that the PRTG feature set is the best for state classification. The PRTG set contains 17 features while that of Capsa consists of 30 features. However, three important features are provided by PRTG that enhance the performance of ML classifiers in the classification of states. They are system health, CPU load and available memory.

5.4 Classification of anomalies

The classification accuracy, training time and root mean square error obtained from the ML algorithms in the classification of two types of anomalies, namely DDoS and Rogue Servers, along with a class of normal traffic, characterized by traffic flow features extracted from PRTG, Capsa and Wireshark are tabulated in Table 12.

Table 12 Classification accuracy of ML algorithms for classifying anomalies

For anomaly classification based on PRTG features, the MLP and KNN algorithms give the best accuracy, at 99.1%. However, KNN has the highest training time, at 46 s. MLP is therefore considered the best classifier due to its lower training time of 2.88 s and RMSE of 7.7%.

For the Capsa feature set, MLP best classifies the anomalies with an accuracy of 99.1%, but at the cost of a relatively high training time of 4.1 s. The second best classifier is C4.5 Decision Tree. It gives a classification accuracy of 96.6%, considerably shorter training time of 0.33 s and RMSE of 14.4%.

The Bagging classifier has the highest classification accuracy for anomaly classification using Wireshark features, giving an accuracy of 75.2% with a training time of 0.27 s. Moreover, Naïve Bayes is the only classifier that takes 0 s to train the model using the Wireshark dataset, but its low classification accuracy of 52.5% makes it ineffective for classification.

Figure 12 compares the classification accuracy obtained with the three feature sets in order to identify the most appropriate one for classifying anomalies.

Fig. 12
figure 12

Comparison of classification accuracy between PRTG, Capsa and Wireshark

The precision and recall values obtained from the ML classifiers in the classification of anomalies with PRTG features are summarised in Fig. 13.

Fig. 13
figure 13

Precision and Recall of ML classifiers from classification of anomalies using PRTG dataset

An overall view of the two bar charts above shows that MLP is the classifier with 100% precision for DDoS and Rogue Servers, and 100% recall for DDoS and Normal class samples. On the other hand, the lowest precision and recall are obtained with Naïve Bayes. DDoS samples are relatively better classified in terms of recall and precision with PRTG features: during the DDoS attack, a significantly larger volume of traffic was recorded, which makes this class easily distinguishable from the others.

The behavior of MLP and Naïve Bayes can be further explained by their respective confusion matrices as in Table 13.

Table 13 Confusion matrices for MLP and Naive Bayes

It can be clearly observed that MLP classifies all instances without fail, except for 2 Rogue Server samples that it wrongly identifies as Normal. On the other hand, many false positives are obtained with Naïve Bayes: it mistakes 35 DDoS samples and 51 Rogue Server samples for Normal. This is why an FP rate of 56.2% is obtained with Naïve Bayes for the Normal class.

The precision and recall values obtained from the ML classifiers in the classification of anomalies with Capsa features are summarised in Fig. 14.

Fig. 14
figure 14

Precision and Recall of ML classifiers from classification of anomalies using Capsa dataset

From Fig. 14, it can be seen that the Normal class is the best classified in terms of precision. 100% precision is obtained with SVM, MLP, KNN and C4.5. The second diagram reveals that Rogue server anomaly is better classified in terms of recall.

Overall, the KNN algorithm exhibits the best classification results, while SVM displays the poorest classification performance, with very low recall values for two out of three classes (Table 14).

Table 14 Confusion matrices for KNN and SVM

KNN exhibits ideal classification results, in contrast to SVM, which wrongly classifies Normal and DDoS instances as Rogue Servers, explaining its high FP rate of 83.2%.

The precision and recall values obtained from the ML classifiers in the classification of anomalies with Wireshark features are summarised in Fig. 15.

Fig. 15
figure 15

Precision and Recall of ML classifiers from classification of anomalies using Wireshark dataset

Using the Wireshark feature set, DDoS is better classified in terms of precision and Rogue Servers in terms of recall. An important factor to consider here is that Wireshark provides source and destination port features. Since the Rogue Server attack implies connections to specific designated ports, Rogue Server samples have a greater chance of being distinguished from other samples and are therefore best classified by the ML algorithms. The classification performance varies from one ML algorithm to another and no best algorithm can be deduced from the above two figures. However, it can be clearly observed, even without further analysis, that the Wireshark feature set is not the best option for the classification of anomalies.

Table 15 gives the average values of the performance evaluation metrics for the overall classification of the 3 anomaly classes.

Table 15 Average precision and recall values for overall classification of anomalies

It can be concluded that anomaly classification using Wireshark gives the poorest performance among the three monitoring tools.

Moreover, both PRTG and Capsa features result in approximately the same performance from the ML classifiers. Only MLP and KNN give 100% precision and recall. Therefore, both feature extraction tools generate features that are suitable for the classification of anomalies.

6 Conclusion

The aim of this paper was to capture Internet traffic from web applications using three traffic monitoring tools (PRTG, Colasoft Capsa and Wireshark) and to deploy eight Machine Learning algorithms for the classification of six applications and four states derived from them. The states included Downloading, Uploading, Streaming and Idle. Two anomalies, namely DDoS and Rogue Server attacks, were also generated during traffic capture and were classified using the Weka Toolkit. It was noted that Capsa allowed for the extraction of the largest number of features, followed by PRTG. The classification results obtained showed that the performance of the ML classifiers varies in each case. It was further observed that the Capsa feature set was best suited for the classification of applications due to its large number of features, while the PRTG feature set outperformed that of Capsa in the classification of states. An important implication of this observation is that the contribution of the individual features to classification is more relevant than the overall number of features present in a dataset. Finally, the ML algorithms gave their poorest performance in the classification of anomalies. A possible explanation is the presence of only three classes and the high level of similarity between them. From an overall perspective, classification based on the Wireshark feature set displayed the worst results. Additionally, the SVM classifier gave the poorest performance in the overall classification of Internet traffic, which validates the fact that SVM is largely affected by irrelevant and noisy samples. On the other hand, KNN showed the highest classification accuracy in most cases, but it takes a significantly longer time to train the classifier model. The Naïve Bayes algorithm can be chosen as an alternative for its robustness to irrelevant samples. This study makes it clear that feature selection is an imperative step in the classification of IP traffic. The main limitation encountered in this work is that, due to resource constraints, network traffic capture was carried out for short intervals of time, resulting in fewer samples for classification. These observations lead to the conclusion that ML classification is a reliable technique for the analysis of Internet traffic, given an appropriate set of features. Interesting future work will be to optimize the performance of the ML algorithms by using a larger number of samples, and to perform a deeper analysis of the Capsa feature set by deducing the cost of generating individual features and eliminating those that barely contribute to classification performance, hence reducing network resource consumption.