1 Introduction

The popularity of the Internet of Things (IoT) has increased significantly. The total number of connected IoT devices is forecast to reach 27.1 billion in 2025. Today, having a smart device in our homes seems inevitable. The proliferation of smart devices is not confined to the domestic environment; it is also the driving force behind the development of an interconnected, knowledge-based world: economy, society, and machinery of government. However, IoT devices come with a tremendous number of security risks [3]. A 2017 Synopsys blog post revealed a lack of confidence in the security of medical devices: 67% of manufacturers believed an attack on a device was probable within a year, yet only 17% took steps to prevent one.

The protocols designed for the IoT protocol stack differ from those of the IP stack. IoT designers have to choose among WiFi, Bluetooth, ZigBee, Z-Wave, and LoRaWAN as their network protocol. IoT devices are power-constrained and have specific functionality, so using general-purpose protocols for every IoT device would result in higher battery consumption and lower quality of service. Hence, IoT has no standard approach for handling security issues comparable to IPsec or SSL for the Internet.

The insufficient security measures and lack of dedicated anomaly detection systems for these heterogeneous networks make them vulnerable to a range of attacks such as data leakage, spoofing, and denial of service (DoS/DDoS). These attacks can have disastrous effects: damaged hardware, system unavailability, and compromised privacy of sensitive data. For example, a deauthentication attack performed on a device with critical significance, such as a steering wheel in a wireless car, can pose a threat to human life. There is a gap between security requirements and the security capabilities of currently available IoT devices.

The traditional IT security ecosystem consists of static network defenses (firewalls, IDS), the ubiquitous use of end-point defenses (e.g., anti-virus), and software patches from vendors. We cannot employ these mechanisms in IoT either, due to the heterogeneity of devices and their use cases. This means that traditional approaches for discovering attack signatures (e.g., honeypots) will be insufficient or non-scalable [13].

Traditional anomaly detection systems are also ineffective within IoT ecosystems, since the range of possible normal behaviors of devices is significantly larger and more dynamic than in traditional IT environments. IDSs such as Snort and Bro only work on traditional IP-only networks, as they are static and use signature-based techniques [13]. To address this issue, multiple intrusion detection systems have been introduced in the context of IoT networks. Yet, the majority of them focus on detecting a limited set of attacks, in particular routing attacks and DoS. However, there are IDSs that introduce dynamicity by employing machine learning to detect malicious behavior. Soltani et al. focused on three deep learning algorithms and applied them to the context of intrusion detection [12]. They also introduced a new deep learning-based classifier. Amouri et al. employed supervised machine learning for IDS [1, 5]. Shukla et al. proposed an IDS that uses a combination of machine learning algorithms, such as K-means and decision trees, to detect wormhole attacks on 6LoWPAN networks [11]. Cao et al. proposed a machine learning intrusion detection system for industrial control systems [6]. Bangui et al. employed random forests for IDS in vehicular ad hoc networks. The aim of this paper is similar. Yet, its focal point is to make the machine learning IDS as lightweight and efficient as possible. An IDS that is both lightweight and effective may be embedded within certain IoT devices. We summarize the contributions of our work below.

  • We analyze a public dataset for malicious IoT packet classification.

  • We convert the raw traffic of the captured data (PCAP) to a comprehensible format for training, since lightweight models are unable to classify malicious packets by inspecting raw bytes.

  • We develop Python code to instantiate, train, and test the specified machine-learning models.

  • We document and report the evaluation metrics for these models and compare them.

The rest of the paper is organized as follows. Section 2 discusses data collection and how the collected data is converted into a form suitable for training the machine learning models. For both training and testing the models, we use the open-source Aposemat IoT-23 dataset [7]. In Sect. 3, we clarify which machine learning models we use, which features are selected, and how the data is labeled. Section 4 provides the results and the evaluation of these models; there, the performance of the lightweight model is compared with that of a more complicated model, a neural network. Section 5 concludes the paper, and Sect. 6 outlines future work.

2 Data Collection

2.1 Dataset

We use the public Aposemat IoT-23 dataset for our models [7]. The dataset consists of 23 captures of IoT network traffic, called scenarios: 20 captures of malware traffic and 3 captures of benign traffic. In total, it contains more than 760 million packets and 325 million labeled flows, covering more than 500 hours of traffic. We summarize this information in Figs. 1 and 2.

Fig. 1 Summary of the malicious IoT scenarios

Fig. 2 Summary of the benign IoT scenarios

The malicious scenarios were created by executing a specific malware sample on a Raspberry Pi. The researchers included traffic from Mirai, Torii, Hide and Seek, and Hajime attacks in the dataset. The traffic for the benign scenarios was captured from three different IoT devices: a Philips HUE smart LED lamp, a Somfy Smart Door Lock, and an Amazon Echo home intelligent personal assistant. These three IoT devices are real hardware, not simulations. Both malicious and benign scenarios ran in a controlled network environment with an unrestrained internet connection, like any other real IoT device.

2.2 Data Preparation

Our lightweight machine learning model cannot classify raw bytes of network traffic. Hence, we convert the raw PCAP files into the Packet Description Markup Language (PDML) format. PDML conforms to the XML standard and contains details about the packet layers. We then represent the PDML files as comma-separated values (CSV) by selecting only our desired features from each PDML packet. We have implemented this conversion in Python.
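The conversion from PDML to CSV can be sketched as follows. This is a minimal illustration, not the code used in the paper: the inline PDML sample is hand-written, the helper names (`pdml_to_rows`, `rows_to_csv`) are hypothetical, and the four Wireshark field names in `WANTED_FIELDS` are only an assumed stand-in for the features listed in Sect. 2.3.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hand-written PDML sample for illustration only (not taken from IoT-23).
PDML_SAMPLE = """<pdml>
  <packet>
    <proto name="frame">
      <field name="frame.len" show="74"/>
      <field name="frame.cap_len" show="74"/>
    </proto>
    <proto name="ip">
      <field name="ip.ttl" show="64"/>
    </proto>
    <proto name="tcp">
      <field name="tcp.flags.syn" show="1"/>
    </proto>
  </packet>
  <packet>
    <proto name="frame">
      <field name="frame.len" show="60"/>
      <field name="frame.cap_len" show="60"/>
    </proto>
    <proto name="ip">
      <field name="ip.ttl" show="128"/>
    </proto>
    <proto name="udp"/>
  </packet>
</pdml>"""

# Assumed subset of fields to keep; the paper's own feature names differ.
WANTED_FIELDS = ["frame.len", "frame.cap_len", "ip.ttl", "tcp.flags.syn"]


def pdml_to_rows(pdml_text, wanted=WANTED_FIELDS):
    """Return one row per <packet>, with None where a field is absent."""
    root = ET.fromstring(pdml_text)
    rows = []
    for packet in root.iter("packet"):
        found = {}
        # iter() also visits fields nested inside other fields.
        for field in packet.iter("field"):
            name = field.get("name")
            if name in wanted and name not in found:
                found[name] = field.get("show")
        rows.append([found.get(name) for name in wanted])
    return rows


def rows_to_csv(rows, header=WANTED_FIELDS):
    """Serialize the rows to CSV text; None becomes an empty cell."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


rows = pdml_to_rows(PDML_SAMPLE)
csv_text = rows_to_csv(rows)
print(csv_text)
```

The second sample packet is UDP, so its `tcp.flags.syn` cell is left empty at this stage.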

2.3 Feature Selection

As the feature space is relatively large (Table 4), not all packet features may be relevant. We manually selected the 13 features that appeared to have the highest correlation, based on Eirini’s research [2]. The feature names are listed below.

length, caplen, frame-encapType, frame-timeShift, ip-flags, ip-flagsMF, ip-flagsDF, ip-ttl, ip-fragOffset, tcp-flagsAck, tcp-flagsSyn, tcp-flagsPush, icmp-code

When a feature is missing in a packet, for example, the tcp.flag in a UDP packet, we replace the non-existing value, None, with −1. The idea is to use a unique value for missing features (i.e., not used by existing features) so the model learns the impact of a missing feature as well.
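The sentinel-value idea can be illustrated with a small sketch. The function name `encode_packet`, the four-feature subset, and the example packets are hypothetical; only the value −1 for missing features comes from the text.

```python
MISSING = -1  # sentinel from the text; assumed not to collide with real values

# Illustrative subset of the 13 selected features.
FEATURES = ["length", "ip-ttl", "tcp-flagsSyn", "icmp-code"]


def encode_packet(packet, features=FEATURES):
    """Return a numeric feature vector, using MISSING for absent features."""
    vector = []
    for name in features:
        value = packet.get(name)  # None when the packet lacks this feature
        vector.append(MISSING if value is None else int(value))
    return vector


# Hypothetical packets: a TCP packet has no ICMP code, a UDP packet
# has neither TCP flags nor an ICMP code.
tcp_packet = {"length": "74", "ip-ttl": "64", "tcp-flagsSyn": "1"}
udp_packet = {"length": "60", "ip-ttl": "128"}

print(encode_packet(tcp_packet))  # [74, 64, 1, -1]
print(encode_packet(udp_packet))  # [60, 128, -1, -1]
```

Because −1 cannot occur as a length, TTL, flag, or ICMP code, a tree can split on it and treat "feature absent" as information in its own right.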

Sarhan et al. examined feature extraction in machine learning-based IDS in more depth and concluded that the choice of dataset significantly alters the performance of feature extraction techniques [10].

2.4 Sample Size Reduction

Given the scale of this paper, we selected one malicious and one benign scenario to train and test our models. The benign scenario contains 75,356 packets, whereas the malicious scenario contains 83,068.

3 Machine Learning Models

We use the Python library scikit-learn for machine learning. We use decision trees as our lightweight model [4, 8, 9]. Decision trees are a non-parametric supervised learning method for classification and regression problems. Their purpose is to create a model that predicts the value of a target variable by learning simple decision rules inferred from the data features.

A problem to address when training a machine learning model is overfitting. In order to keep the accuracy as high as possible, the model may overfit the dataset and lose its ability to generalize. We use validation curves to measure the overfitting of our model, evaluated with respect to the depth of the decision tree. Figure 3 shows that when the depth exceeds 10, the model starts to overfit. Hence, we keep the max_depth hyperparameter less than or equal to 10. There are other ways of dealing with overfitting in decision trees, such as ensemble techniques and pruning.
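A validation curve over the tree depth can be computed with scikit-learn's `validation_curve`, as in the minimal sketch below. The data here is synthetic (`make_classification`) and only stands in for the 13-feature packet table; the depth range, the 5-fold cross-validation, and the accuracy scoring are assumptions, so the depth at which overfitting starts will not match Fig. 3.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import validation_curve
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the packet features (assumption, not IoT-23 data).
X, y = make_classification(
    n_samples=2000, n_features=13, n_informative=6, random_state=0
)

depths = list(range(1, 21))  # candidate values for max_depth
train_scores, val_scores = validation_curve(
    DecisionTreeClassifier(random_state=0),
    X,
    y,
    param_name="max_depth",
    param_range=depths,
    cv=5,
    scoring="accuracy",
)

# Average over the cross-validation folds (axis 1).
mean_train = train_scores.mean(axis=1)
mean_val = val_scores.mean(axis=1)

for depth, train_acc, val_acc in zip(depths, mean_train, mean_val):
    print(f"max_depth={depth:2d}  train={train_acc:.3f}  validation={val_acc:.3f}")

best_depth = depths[int(np.argmax(mean_val))]
print("depth with the highest mean validation accuracy:", best_depth)
```

A widening gap between the training and validation columns as the depth grows is the usual sign of overfitting.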

After finding the maximum suitable depth for our tree (10), we instantiate a DecisionTreeClassifier with max_depth set to 10 and train it on the dataset. Figure 4 depicts a subtree of the final model, showing the decision rules and the number of samples satisfying them.
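Training a depth-limited tree and printing its top-level rules can be sketched as follows. The data is again synthetic, and the 75/25 stratified split and the `random_state` values are assumptions; only `max_depth=10` and the feature names come from the text.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

# Feature names as listed in Sect. 2.3.
FEATURE_NAMES = [
    "length", "caplen", "frame-encapType", "frame-timeShift", "ip-flags",
    "ip-flagsMF", "ip-flagsDF", "ip-ttl", "ip-fragOffset", "tcp-flagsAck",
    "tcp-flagsSyn", "tcp-flagsPush", "icmp-code",
]

# Synthetic stand-in data (assumption); the real input is the CSV packet table.
X, y = make_classification(
    n_samples=2000, n_features=len(FEATURE_NAMES), n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# max_depth=10 is the value chosen from the validation curve in the text.
clf = DecisionTreeClassifier(max_depth=10, random_state=0)
clf.fit(X_train, y_train)

test_accuracy = clf.score(X_test, y_test)
print("tree depth:", clf.get_depth())
print("test accuracy:", round(test_accuracy, 3))

# Print only the first levels of the learned rules to keep the output short.
print(export_text(clf, feature_names=FEATURE_NAMES, max_depth=2))
```

The textual rule dump is a compact alternative to plotting a subtree when inspecting which features the top splits use.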

Figure 5 illustrates a larger subtree of our model. The subtree shows that the feature length (X[0]) appears in many decision rules and is an essential parameter for our decision tree.

Fig. 3 Validation score with respect to the tree depth

Fig. 4 A subtree of the final decision tree

Fig. 5 Detailed subtree

4 Evaluation

First, we report the evaluation metrics of our decision tree in Table 1. To show that our lightweight model’s performance is satisfactory in this use case, we develop a neural network using the same library and compare its metrics with those of the decision tree. Table 2 briefly describes our neural network.
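For reference, common binary-classification metrics can be derived from the confusion counts as in the sketch below. Which metrics the tables report is not restated here, so the choice of accuracy, precision, recall, and F1 is an assumption, as are the function names and the toy labels (1 = malicious, 0 = benign).

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1 (0.0 when undefined)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels: 3 true positives, 1 false negative, 3 true negatives, 1 false positive.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]

print(confusion_counts(y_true, y_pred))  # (3, 1, 3, 1)
print(binary_metrics(y_true, y_pred))    # every metric equals 0.75 here
```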

Table 1 Decision tree evaluation metrics
Table 2 Neural network parameters

The neural network is 96.6% accurate (Table 3), which is lower than the accuracy of our decision tree. Given that we have also avoided overfitting, we can claim that our model outperforms a classical neural network in this scenario. It is important to note that deep learning approaches may reach higher accuracy (99% and above), especially when the task includes detecting the attack type [12].
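A comparison of this kind can be set up in scikit-learn along the following lines. This is only a sketch on synthetic data: the hidden layer sizes, the feature scaling step, the iteration limit, and the split are all assumptions and do not reproduce the configuration in Table 2, so the resulting accuracies say nothing about the paper's results.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the packet features (assumption, not IoT-23 data).
X, y = make_classification(
    n_samples=1000, n_features=13, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

models = {
    "decision tree": DecisionTreeClassifier(max_depth=10, random_state=0),
    # Hypothetical network shape; scaling is added because MLPs are
    # sensitive to feature ranges.
    "neural network": make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500, random_state=0),
    ),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)  # accuracy on held-out data
    print(f"{name}: accuracy={scores[name]:.3f}")
```

Evaluating both models on the same held-out split keeps the comparison fair; on real data, training time and model size would be worth recording alongside accuracy.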

Table 3 Neural network evaluation metrics

5 Conclusion

In this paper, we presented an on-the-fly malicious packet classifier. It employs a decision tree to label IoT network packets as normal or malicious. We demonstrated that, in this scenario, our lightweight model outperforms a more complex neural network while keeping the processing and storage requirements at a minimum.

6 Future Work

For future work, we may integrate this model with a packet capturer to automatically label all (or some) of the traffic in the network.

We selected our features manually. Future research might automate feature selection using statistical or ML methods (e.g., intrinsic or wrapper methods).

One possible contribution to this research would be attack classification. The group of malicious packets found by the decision tree can be fed to a more complex machine learning model to detect the type of attack happening in the network. We may use several IoT attack categories for classification.

  • Denial of Service (DoS): The attacker aims to make IoT devices unavailable by overloading the network and disrupting the services.

  • Distributed Denial of Service (DDoS)/Botnets: An adversary compromises many IoT devices to launch a large-scale DoS attack.

  • Spoofing: The attacker tries to impersonate an authenticated identity by forging it.

  • Man-In-The-Middle: The communication channel is compromised. After this attack, the attacker can act as a proxy to read, write, and modify the packets.

  • Insecure Firmware: After gaining control over an IoT device, the attacker uses the device to attack other devices.

  • Data Leakage: If the data is not encrypted, the privacy of the user data is compromised, and the data may be used by an attacker to access the private network.

  • Botnets: An adversary controls a network of connected IoT devices and uses it to perform malicious actions.

  • Brute Force Password Attacks: A potential attacker can gather substantial processing power to try every possible value of a secret that a device holds.