1 Introduction

The Internet of Things (IoT) is currently leading the charge in the digital landscape. It offers driving forces such as cost reduction, business revenue growth, new business prospects, security, improved decision-making, improved infrastructure, and improved citizen experience [1]. It is a global revolution in the information industry and mighty changes in people’s lives by integrating the globally digital and physical into a single ecosystem due to this massive digital development, IoT networks are becoming more vulnerable to cyber-attacks [2, 3], the attackers make hard efforts to cause damage to the infrastructure of networks to carry out hostile acts such as stealing intellectual property and destroying crucial data using developed techniques [4, 5]. Therefore, more than 70 percent of IoT devices are vulnerable to security attacks, and this is now regarded the most open issues given the lack of protection systems that allow attackers to launch serious attacks such as denial of service (DoS) and routing attacks [6, 7]. Many IoT devices, such as sensors and actuators, consume low amounts of power to operate for longer. [8]. Routing layers are a passage port into the targeted devices through IPv6 over Low-Power Wireless Personal Area Networks (6LoWPAN), an open IoT networking protocol designed by resource-constrained devices [9].

However, routing protocol (RPL) is specified by IETF to handle the specific properties and constraints of networks, several routing attacks occur through malicious node activities over routing among data packets [10], and the rank value increases from the root node to the child node [11]. The attacker can manipulate the Destination Oriented Directed Acyclic Graph (DODAG) issuance system by raising their rank in the hierarchical tree and acquiring multiple children who route the packets through the attacker’s parent. Consequently, the attacker can lure multiple child nodes to choose them as a parent by intentionally changing the rank values and thus attracting significant traffic heading to the root node (the parent branch) to flow through itself [12].

Deep learning techniques have made a significant contribution to tracking the behavior of malicious nodes in the routing protocol; the detection and mitigation mechanisms to deal with routing attacks usually are classified either based on modifications to the existing RPL procedures or added procedures to RPL standards [13]. These methods can be classified into mitigation and intrusion detection systems IDSs such as relating the nodes or packets to their locations within the network using GPS, acknowledgment-based methods by sending a message and receiving an acknowledgment, and trust-based methods against the IoT networks [14]. DL-based IDS in IoT environment: IDS also uses deep learning in heterogeneous IoT networks. For instance, Kim et al. [15] trained the IDS model based on the long short-term memory (LSTM) architecture using a recurrent neural network (RNN). The authors ran tests to determine the best hyper-parameter for the best false alarm and detection rates. Similarly, the authors [16] implemented effective and quick anomaly-based IDS in low-power IoT networks using random neural networks (RaNN). The authors proposed a two-layer approach in which the system learns typical behavior at the first layer and detects various illegal memory access (IMA) issues and data integrity attacks on the network at the second layer. The suggested approach is centralized, delivering the results to a single server.

DL-based attack detection and mitigation: By utilizing the fog ecosystem, Dior et al. [17] developed a DL-based attack detection technique in IoT. In essence, the edge node closest to smart objects is where the attack detection methods are executed. The distributed attack detection mechanisms decide on the learning architecture’s output based on the available data, considering various learning mechanism parameters. Abeshu et al. [18] presented a distributed DL-based attack detection technique for the Internet of Things. They implemented DL approaches for threat detection using the fog computing architecture, one of the preferred architectures for implementing IoT applications.

Recent studies on routing attack detection of constrained resource devices have disregarded task distributions and parallel processing of detections scenarios during learning steps, where all the computations of deep learning networks to be addressed in constrained resource devices, any malicious attack of routing attacks on core IoT network can cause enormous loss in network resource consumption. However, it only required tracing malicious nodes in the network, power drain of constrained IoT devices [19]. However, parallelism training in the edge nodes reduces the training stage. Thus, intrusion detection should be in real time. Due to the nature of constrained resource objects in IoT environment, it puts as much computational process and continuous workload on the peripherals as possible.

In order to mitigate the exposure of restricted resources to potential attacks and real-time intrusion detection, the proposed DL-ESD model is designed in a high-level and lightweight method through many stages, starting with data processing which we eliminate the irrelevant features, features extraction using linear discriminant analysis (LDA), aims to generate features more distinguishable from each other, the features normalization using min–max scaling to eliminate the worst overfittings to presence of fewer data points in training samples, then selection the important features. The training accuracy has been achieved the best runtime using binary classification and adopted to keep the service’s survival.

The MLP classifier performance is presented in two different phases. (i) ConFigby specifies the optimization algorithm and tracks the loss and other metrics we apply. Initializing the form with these settings requires calling the model. The compile function is as follows: The word “sgd” denotes a random regression ratio. Also, “binary_crossentropy” is defined as a loss function for the outputs with the values 1 or 0—finally, a precision tracking process. (ii) Comprises the network’s training process by calling the model and specifying the data for the network training.

1.1 The contributions

In summary, the key contributions of the paper are provided as follows:

  1. (a)

    Proposing a DL-ESD model-based deep learning for routing attacks detection in early stage before the harmful node can declare on a new version number and create a new DODAG network.

  2. (b)

    Integration of LDA technique and min–max scaling contribute more distinct features, enhancing DL-ESD model performance in the training and testing stages.

  3. (c)

    Enhancing the detection accuracy of malicious nodes by the linearity of deep learning reduces the training time.

  4. (d)

    The binary classification proved that detection efficiency using MLP is higher than other shallow ML algorithms. That offers better prediction, high classification accuracy, and low error rates compared to recent studies.

The remainder of this paper is structured as follows: Section 2 discusses related work of deep learning and IDSs solutions for routing attacks in RPL protocol and attack scenarios in the DODAG network. Section 3 describes the framework architecture sequence, preprocessing data, and the implementation stages of the DL-ESD model. Section 4 shows the analysis and evaluation results, then classification by comparing DL-ESD model with state-of-the-art studies. Finally, Sect. 5 concludes and opens new perspectives for future research.

2 Related work

This section discusses the recent studies on routing attacks and the detection methods. These studies can be classified into two categories: DL-based routing attack detection methods and routing attack scenarios in the DODAG network are widely used in constrained resource devices.

2.1 DL-based RPL protocol in IoT

Recent studies have addressed direct and indirect routing attacks against node resources and countermeasure classifications using emerging mitigation and detection technologies and IDSs in RPL networks. These techniques are categorized as per the following schemes: (i) relating the nodes or routing the packets to their locations within the network using GPS, (ii) acknowledgment-based methods by sending a message and receiving an acknowledgment, and (iii) trust-based methods against the IoT networks.

The authors [19] researched the effects of the constrained resources consumption and the issue of effecting the routing attacks on energy consumption since the fake control messages and building of loops in the DODAGs reduce the lifetime RPL network. Another related study using the IRAD dataset proposed a reliable DL-based routing attack detection approach; the model considers adversarial training and develops a generative adversarial network classifier (GAN-C) with support vector machine (SVM). This study adopts DL parallel learning [20]. Also, the author [21] suggested a novel secure framework for detection routing attacks networks in IoT networks based on industrial IoT networks. The approach can detect hello flood, version number, black hole, and sinkhole attack. The framework performance is evaluated on performance parameters such as attack detection accuracy, true positive, false-positive rate, and end-to-end delay.

Meanwhile, hello flood causes saturation of routing nodes and traffic congestion in DODAG networks. However, the version number attack increases the control packet overhead, energy usage, and end-to-end delay. It also introduces rank inconsistencies and routing loops. It is worth mentioning that energy usage is critical in IoT networks as most nodes are battery-based, and it sometimes becomes a challenging task to recharge them [22]. Thus, it is highly desirable to conduct such a study to detect the malicious nodes early with less power consumption and network continuity of service. The processed data reduces the training duration time and increasing of training accuracy.

Authors [23] have developed an intelligent intrusion detection system (IDS) by combining deep learning algorithms with network virtualization to detect suspicious behavior on IoT networks. When the DNN detects an unknown intrusion, it saves the corresponding tuple of the only filtered features in the “cache” as feedback. This mechanism is utilized for re-training the DNN model, which contributes to the detection system labeling functionality and feature extraction. This study did not address the significant range of device identifiers. The main study [24] proposed a DNN model can detect attacks based on big data; the study created own real dataset called IoT routing attack dataset (IRAD) includes three types of attacks: hello flood (HF), decreased rank (DR), and version number (VN). The proposed model has been trained based on this IRAD dataset, and the performance results show high accuracy and F1 score up 98%. Another IoT dataset consists of five groups of attacks generated by Kamel SOM et al. [19] and proposed a new model based on convolution neural network (CNN). It predicts the suspicious traffic in IoT networks and detects routing attacks. Three methods have been used to preprocess the generated datasets of features selection, Chi-squared, and weight by tree importance to reduce the overfitting and noise to be a fitting input during training the proposed CNN model.

Authors [25] have also designed a novel scheme for detecting the decreased rank attack and verifying the harmful nodes from the DODAG network using round-trip time. In [10], authors have proposed a security routing been found that the critical point at N = 40 for many classes appeared in different attacks. A related study by the same authors in [26] enhanced a DNN approach based on supervised machine learning. Several scenarios have been implemented and simulated for the three attacks: hello flood, decreased rank, and version number. The results demonstrate that the malicious node of the hello flood generates the maximum number of packets among neighbor nodes in the DODAG network. Consequently, it raises the power expenditure of neighbors and does not impact the DODAG construction. Another model based on machine learning is presented in [29], consisting of data collection, feature extraction, and two classification methods. The IRAD dataset has been used to train ML-RPL model for new features that have been added manually; ML-RPL indicates an accuracy rate up to 97%. However, all the above approaches are considered models-based on RPL using various classification methods and the same dataset and still suffer from the DODAG Information Object (DIO) control message overhead and the uneven accuracy data of packet delivery ratio. Table 1 depicts the recent works related to IDS system and detection of routing attacks in RPL protocol. The highlight of recent literature studies was made according to the closest studies that used the same dataset or other data for the same attacks with different features.

Table 1 Recent works related to RPL protocol attacks in IoT networks

2.2 RPL attack scenarios

Many attack scenarios have been simulated to choose the preferred neighbors mote and keep the energy resource of constrained devices along (refer to Fig. 1). Almusaylim et al. [10] referred to choosing the best parent when node (N = 12, 6, and 26) sends a DAO control message to the sink node N starts the distribution module after selecting the preferred parent P, while the module calculates the MAC value via the specified parameters. The sink maintains the information table to store four groups of information about all nodes of messages received from the DAO. Likewise, the central unit running is extracted in the incoming information pool via the DAO message for N node. Then, the MAC value is calculated if the two MAC values match. It will ensure that the N node sends the message while maintaining the integrity of the received DAO message [31].

Fig. 1
figure 1

RPL network constructions: a hello flood, b decreased rank, and c version number

Thus, Palattella et al. [32] referred if N is an intermediate node, the sink node checks the N rank received from the node or child nodes it belongs to. Also, if the order does not match the order that the node received from the DAO message, the source declares N node malicious. A number of routing attacks target resource-constrained devices in IoT networks. In this section, we explain the three most types of attacks. In addition, the MAC value is validated, the sink starts to check the rank of the N node only when it is a leaf node, and then the pool checks for the presence of a low-rank and hyper-level attack [21].

2.2.1 Hello flood

This attack occurs in the routing layers. The malicious node sends DODAG Information Solicitation DIS messages successively to multiple nodes on the RPL network. The hello flood attack shortens the interval between each two successive DIS messages [28]. After infiltrating the RPL network, the malicious node will immediately begin sending out multiple DIS messages to all of their neighboring nodes in the network, as shown in Fig. 1a. Consequently, adjacent nodes receiving DIS messages must respond with DIO messages, resulting in a set of timer and repeated DIO messages that waste a significant amount of power from neighboring devices receiving a request from the malicious node [10]. Agiollo et al. [33] developed an intrusion detection system that can deal with multiple attacks to avoid the overhead of RPL. So, in hello flood and DIS attack scenarios, the malicious node has an abnormal amount of control packets. DETANOR’s attack classification mechanism identifies the attackers as those devices transmitting an abnormal amount of control packets.

2.2.2 Decreased rank

The harmful node in a decreased rank attack declares its false low rank through the DIO control message to attract traffic to its neighboring nodes, as shown in Fig. 1b. The (root node) takes the 1st rank in DODAG construction [10]. Node 1 (root) sends multicast DIO messages containing all the information of its neighbor nodes. The neighbor nodes in rank 1 choose the root node as a parent. Therefore, after connecting DODAG, the adjacent nodes of root nodes 2, 3, and 4 multicasts their DIO messages by setting the rank to 2nd. The rank of nodes increases in a descending direction. If nodes discard high-value DIO messages from the rank value, they visualize the DIO message coming from child nodes (down) [25]. Node 3 can add nodes 2 and 4 as a preferred parent as in the node three range. Moreover, all descending nodes receive DIO messages from neighbor nodes but decide on harmful nodes as the preferred parent based on the best rank.

Node 6 is harmful and declares a false rank value (rank = 1st) to enable neighboring nodes 5, 9, 8, and 7 to move toward the harmful node, indicated by dotted arrows [21]. The sixth node means that its rank value is rank 1st, while its actual ranking value is 4. In the current circumstances, nodes 5, 7, 8, and 9 decide the harmful node six as the favorite parent and reroute the traffic through node 6, as shown in Fig. 1b.

2.2.3 Version number

This attack is one of the most efficient attacks in routing layers; particularly in the network layer, the malicious node alters a DIO message [32]. In contrast, the malicious node receives a DIO message in the IoT network. The DODAG version number is incremented in a DIO message, and the malicious node forwards the infected DIO message [25]. These require overhauling the entire DODAG architecture. These frequently forced DODAG re-assessment also wastes the key parameter “power” from all nodes belonging to DODAG construction. Thus, the nodes in the network lose their energy rapidly, as shown in Fig. 1c. As a result, the life of the network is greatly affected. A. Mayzaud et al. [34] proposed a monitoring strategy with dedicated algorithms for detecting version number attacks; the solution’s performance has been evaluated through experiments and quantified with the sup. Almusaylim et al. [10] proposed a security routing protocol (SRPL-RP) for RPL rank and version number attacks. The proposed protocol detects and isolates attacks and adds them to the blocklist. The detection is based on a comparison of the ranking mechanism. The analysis results indicate that the PDR packet delivery rate of (98.48%) and SRPL-RP achieved an accuracy rate of (99.92%) under version number attacks. Sahay et al. [35] proposed an inclusive framework for the prediction of version number; the framework includes a feed-forward neural network that uses the traffic as an input for prediction version number attack. Therefore, the framework uses the smart contract-fortified blockchain technique to establish secure channels to access in IoT resources.

3 Proposed DL-ESD model

This section introduces the phases of the proposed model, describes the DL-ESD structure and implementation, also provides a detailed explanation of data processing, and then builds the deep neural network.

3.1 Framework overview

The framework structure consists of three levels: data preprocessing, deep learning networks, and classification, as depicted in Fig. 2. It describes the framework structure as follows: Data processing is divided into three phases; feature selection, the linear discriminant analysis (LDA) has been used for feature extraction and a linear projected transformation utilized for feature extractions in different aspects. It means that feature extraction based on machine learning techniques can obtain an optimal contrast level between the extracted features and improve the performance of the training stage. Data normalization and visualization; in this phase, min–max scaling methods normalize the dataset and adopt the standard quintile conversion to disperse marginal values, and then, the correlation coefficient is also measured to select the dependency level for best features. In the third stage, the dataset is split into a 75% training and 25% testing set using scikit-learning and Pandas function (). In this stage, features are scaled to be compared on a common basis, and then, the preprocessed data is fitted into our classifier to extract the most important features. Therefore, the experiments are performed the deep learning techniques. In the last level, the performance of deep learning techniques is measured and compared to the MLP technique. To achieve the research objectives in capability detection for the malicious nodes in the routing layer, we have proposed a novel deep learning-based early stage detection (DL-ESD) using IoT routing attack dataset (IRAD). The deep learning techniques have been compared to make detecting attacks most accessible. However, MLP technique has proved ability highly in training accuracy and duration, which is the most contribution of our model to improving detection accuracy. Binary classification methods also have been employed to improve performance efficiency.

Fig. 2
figure 2

Proposed framework for detection of routing attack of RPL-based IoT networks

DL-ESD model is presented under two different phases. The first phase: ConFigby specifies the optimization using adam optimizer as faster training in less time and more efficiency and tracks the loss and other metrics we apply. Initializing the form with these settings requires calling the model. The compile function is as follows: The word “sgd” denotes a random regression ratio. In the second line, “binary_crossentropy” is defined as a loss function for the outputs with the values 1 or 0—finally, a precision tracking process. The second phase includes the training process by calling the model to specify the data we want to train the network on, namely X_train and Y_train, and then setting the mini-batch size at 32 and choosing the training time epochs = 100. With ten iterations. Finally, we decide on our verification data which leads measure model performance that can verify at each point of the verification data.


Packet sniffer: Enter the interface’s name to sniff node information that can be seen now.

Training and testing our neural network: Enter the name of the CSV dataset file you wish to use. If you want to load a previous model, enter “y” and the model’s name. Otherwise, just press Enter. Based on the size of the dataset and model topology, the process may take a bit of time. Once completed, enter “y” to see the weights and intersections of the model after training, input “y” again to be saved model (end as “.sav,” must).

Data viewer: It allows displaying data within a dataset. Enter the name of (.CSV) dataset that you would display, input “a” to see all, input “n” to see numeric data only, and “c” to see categorical data only.

Live deep neural network: DNN uses a trainer to detect routing attacks from the menu. Enter the name of the interface that would detect RPL attacks when the input of the trained model’s file name, as that will run until stopped or an attack is detected.

Visualizing loss and accuracy: Displays a visual representation of how the ANN model sounds. It can change code, currently showing an input layer of 9n, two hidden layers of 100n, and “1n” as an output layer.

3.2 Data preprocessing

The IRAD dataset has been used in this study for training and testing stages within various scenarios. Three datasets samples are used, and each sample contains two classes: the malicious and benign samples listed in Table 2 [24]. Therefore, when the completion of the simulation stage. The packet capture (PCAP) files have been converted into a comma separated value (CSV) format using a Wireshark analyzer and developed a preprocessing script for Python data that applies a feature extraction process for the converted CSV files [24].

Table 2 IRAD datasets values

3.2.1 Features extraction

This phase aims to reduce the number of dataset features by discarding the original overfitting features, creating new features from existing ones, and summarizing the most information in raw features. To eliminate the overfitting and get issue-oriented attributes to distinguish between routing attack and normal RPL traffic. Moreover, it reduces the running of training and validation time. The LDA method reduces the dimensions and shows feature samples on a straight line to produce more distinct features. The number of extracted features must equal one since each subset has two classes of attack and benign [36]. The flow identifiers such as Source IP, destination IP, source port, destination port, packet length, time, and protocol type are eliminated to avoid bias toward malicious or legitimate nodes. The IRAD datasets contain qualitative and quantitative features. Our learning algorithm allows for quantitative values only. However, we applied feature conversion qualitative features to convert its integrated format. Thus, the selected features such as DAO are used for unicasting destination information according to the parents selected.

In RPL also, DIO is the message type. It holds the current sequence for the node and uses the specified metrics as distance or hop count to decide the optimal route over the base node. DIS is another message form, and nodes use DIS to join WSN. Other types of IRAD datasets are RPL nodes that are simulated data nodes. Firstly, we have calculated several transmitted and received packets for every node in 100 s in the scheduled time and then split these values into 1000 ms to obtain each node’s DIO transmission and receiving rates (DTR, DRR), respectively. In all time ranges, the time it takes for each node to be sent and received is calculated. It can also calculate the total transmission time and receiving time by adding up each transmission and receiving time, a 1000 ms data packet and each node’s transmission and receiving time. The number of control data packets was calculated in the window size for each node and extracted features as per the steps outlined above. The benign and hostile datasets have the same structure when mixed.

3.2.2 Features normalization

Normalization is one of the most used methods for shifting values between 0 and 1 in a given range. It cleans up the data and lowers bias, resulting in high detection accuracy and improving the performance and training stability of the model [37]. Thus, we have performed feature normalization to drag datasets into the same range. The min–max scaling methods proved the easiest, most intuitive, and more flexible for normalizing the values in the selected features, which X* is the new feature from 0 to 1, RPL.FEATURES is the original feature value and RPL.FEATUREmin and RPL.FEATUREmax are the maximum and minimum values of the selected features as shown in Eq. (1), respectively.

$$X_{*} = \frac{{\left( {RPL.FEATURE - RPL.FEATURE _{{{\varvec{min}}}} } \right)}}{{\left( {RPL.FEATURE_{{{\varvec{max}}}} - RPL.FEATURE _{{{\varvec{min}}}} } \right)}}$$
(1)

Each feature is imposed separately on the standard quintile conversion. The goal of the transformation is to disperse marginal values of DAO, DTR, and Trickle timers reset features, which could alter the connection between values [38]. The best nine features for training and testing have been chosen after data normalization and offset, including DIS, DIO, and DAO for transmitted and received 6LoWPAN attributes with high scores as shown in Fig. 3a–c. Finally, all concatenated datasets have different network topologies for each attack.

Fig. 3
figure 3

Feature selection and importance of data preprocessing

3.3 Selection of importance features

As the aforementioned result of steps, the importance features were selected by the strongest relationship with the output variable and have been selected by scikit-learn, removing the common and irrelevant features. The features are adjusted based on the DIO, DAO, and 6LoWPAN characteristics with comprehensive detection, as listed in Table 3. We addressed the missing values in Pandas DataFrame and use a function that is null () and not null (). Both functions help in checking whether a value is NaN or not. This function can also be used in Pandas series to find null values in a series and then split the dataset into training and test sets. The important features can use to enhance the prediction models. That can apply to selecting these features to keep the highest scores or remove the lowest scores. Figure 3a–c depicts the score of the best nine features after computing its correlation and variance analysis. The ablation experiments have been applied for 18 features using the Pearson coefficient correlation in each phase. Some features were subjected to measuring the correlation with the others, the results indicate that the nine features are considered with a high level of dependence, whereas further experiments indicate to the lack of correlation in other features. The PCC (r) for variables x and y is calculated using Eq. (2)

$$R = \frac{n(\sum xy) - (\sum x) (\sum y) }{{\sqrt {[n\sum x^{2} - \left( {\sum x)^{2} } \right] [n\sum y^{2} - \left( {\sum y)^{2} } \right]^{.} } }}$$
(2)
Table 3 Selection of the important features

The important features selected for the three attacks rely on the packets transmission rate and reception average for 6LoWPAN protocol and the DIO control messages. The important features were selected using a combination of deep neural networks, Pearson correlation coefficients, and histograms [39]. The extracted features’ value was assessed using the DNN technique to determine the optimal number of neurons needed in the network, bagging the means to combine unbiased and noisy variables to create a model with a lower variance. Therefore, the analysis has been conducted on selected features using MLP classifier, a feed-forward with at least three-node layers in ANN. For MLP classifier, only one hidden layer multilayer perceptron is utilized, relying on different activation functions.

3.4 Deep neural network (DNN)

The hidden layers have been decided in the DL-ESD model based on our experimental approach. We added the independent variables as input values (X) and dependent variables output values (0, 1) divided by 2. The network has been tuned by adding the extra nodes to reach optimal results with two hidden layers. Therefore, we tested the model’s accuracy by varying the number of layers and selecting the one that produces the best result. The neural network consists of 4 layers; the input layer has nine neurons. The output layer has just two neurons as the last layer; this is called a regression model as depicted in Fig. 4. The first hidden layer includes 100 neurons and 100 neurons in the second hidden layer. ReLU activation function is used in hidden’s layers. In contrast, the sigmoid function utilizes in the output layer as it is known that network training involves identifying the network model as a structure and then finding the best values from the data to fill in the model. Before starting training, the dataset is split again at a rate of 0.3 as a validation dataset to adjust the model’s training performance.

Fig. 4
figure 4

Structure of our deep neural network

Let X = \(X = \left\{ {X_{1} , X_{2} , X_{3} , \ldots , X_{n} } \right\}\) be the input vector with n = 9, the steps are used to enter the product sum activation function “SOP” to calculate the value of “S” of “X” as input values for our features as shown in Eq. (3) and “W” to measure weights. However, the nonlinear activation function is represented by A(.) and wi and bi indicate the weights and bias of hidden layers i as Eq. (4) and the activation function ReLU is used in hidden layers, while the Eq. (5) shows the mathematical representation in the neuron that achieved ANN as illustrated in Eq. (6)

$$S = X_{1} *W_{1} + X_{2} *W_{2} + b$$
(3)
$$H_{i } \left( x \right) = A\left( {w{}_{i}^{T} x + b_{i} } \right)$$
(4)
$$ReLU\left( x \right) \, = \, max\left( {0,X} \right)$$
(5)
$$Y = \sum \left( {inputs} \right)*\left( {weights} \right) + bias$$
(6)

To enhance the measurement accuracy in the output layer, we used the ReLU and Tanh functions, but the accuracy rate is ineffective as it exceeds 65.3% and 55.8%. Besides, the sigmoid function has a distinct “S” curve and a mathematical representation for ReLU and Tanh function in binary classification. After (7) sigmoid function is used in the output layer, our model’s accuracy exceeds 98.98%. We also decrease the sharp increase by nearly 62–98% during training phases by applying regularization and dropout. The randomly selected nodes increase additional time and cost to drop out in each stage. As a result, the performance of our deep layers is significantly reduced. In other words, when handling large datasets, interactions between neurons almost always result in overfitting. We also apply dropout and regularization for these reasons. As in [37], Keras is used as a framework for deep learning because it includes several advantages, such as its modularity makes it easy to construct and test complex neural networks. Firstly, Keras is a powerful, easy-to-use Python library. Secondly, it is a high-level API for building and training DL models. Therefore, it makes it possible to create deep neural networks quickly.

$$Y = \frac{1}{{1 + e^{ - x} }}$$
(7)
$$Output = f\left( {0,1} \right)$$
(8)
figure a

Tensorflow also includes several implementations for creating a complex DL model. In the training process, the datasets were shuttled to optimize the deep learning model performance and avoid overfitting [40]. The preprocessed dataset is split into x_train and Y_test. The first, X, is the unlabeled portion, and Y is the second portion. More specifically, Y is the supervised learning portion of our model that makes learning algorithms. X and Y are divided into X_train, X_test, Y_train, and Y_test. Train parts are utilized in the training section, while test components measure the training process performance.

4 Experimental results and evaluation

The primary objectives of this research are to develop methods for RPL attack detection to improve prediction accuracy rates in IoT networks with low error. As the outcome is actual, the system prediction is true; otherwise, it is all false. This case is called positive if the forecast is related to the attack. Otherwise, it is negative. Therefore, there are four logical possibilities: true and safe prediction, correct and attack, false-negative and safe attack, respectively, where: True Positive TP, True Negative TN, False Positive FP, and False Negative FN.” The classification error is the ratio of incorrect predictions to totality prediction numbers [19, 41].

Accuracy (ACC): is the percentage of true detection over total data instances.

$${\text{Accuracy}} = \frac{TP + TN}{{TP + TN + FP + FN}}$$
(9)

Precision: represents how many of the returned attacks are correct.

$${\text{Precision}} = \frac{TP}{{TP + FP}}$$
(10)

Recall: measures the ratio between a true positive and a total of both a true positive and a false negative.

$${\text{Recall}} = \frac{TP}{{TP + FN}}$$
(11)

F1 score: is the weighted harmonic mean of the precision and recall and reflects the balance between P and R.

$$F1 = 2{*}\frac{{\left( {Precision*Recall} \right)}}{{\left( {Precision + Recall} \right)}}$$
(12)

Upon nature and amount of the dataset, we applied different ratios and picked the 75–25 ratio to give the best performance result. Also, the picked ratio is proportional to the midsize of our data into 75% for the training set and 25% for the testing set to evaluate the provided model with a biased evaluation fit on the training dataset. Figure 5a shows that training epochs rate is very appropriate and it also can be inferred that the DL-ESD model has performed best in training and testing accuracy.

Fig. 5
figure 5

Measuring performance of training and testing epochs: a accuracy rate of DL-ESD, b loss error of DL-ESD

Table 7 represents routing attack has various methods of feature selection. The features have been applied to each problem in the dataset and fed to the neural network. It can be seen that the MLP technique has high accuracy, precision, recall, and F1 score values performed well. Also, Fig. 5b depicts the loss rate values to the model parameters by adjusting the weight vector values through various optimization approaches that have reduced the training time. It means how efficiently our model behaves after every optimization iteration, in which the weights change in each iteration of 10 iterations. In training phase, the higher number of neurons and epoch, the higher accuracy, and the lower the loss rate. In Fig. 6, the ROC curve close observation of performance matrices shows the ability of sensitivity to correctly predict malicious nodes as harmful nodes while specificity ability to predict normal nodes as malicious nodes correctly.

Fig. 6
figure 6

Receiver operating characteristic

4.1 Confusion matrix

The detection accuracy rate of the DL-ESD scheme success should be high, but the false-positive rate should be lower [42]. The misclassification rate is directly proportional to the false alarm rate, as presented in Tables 4 and 5. The classification results illustrate high performance in our scheme. Figure 7a shows that the bias rate of the safety packets is increasing, and the false-negative rate of the total hostile packets is decreasing; it can be summarized that the classifier understands the positive value of one class from another.

Table 4 Prediction rate of training stage
Table 5 Prediction rate of testing stage
Fig. 7
figure 7

Overall performance of measure the detection ratio of model: a precision, b recall, c F1 score, and d confusion matrix

Figure 7b illustrates the sensitive rate of the classifier and the simple raise in the bias ratio to the positive values. In contrast, the classifier decreases the sensitive rate of false negative for a package rate. It can be summarized that the classifier can understand the positive value of a class from another.

Figure 7c exhibits the F1 score as the weighted average of the sensitivity and precision rate for positive and negative values. F1 is usually more effective than precision, especially if the class distribution is uneven. But this model reflects the balance between P and R classes, which the confusion matrix indicates to uneven class distribution. Thus, that will be contributed much to classifier performance.

The proposed detection method effectively achieves the highest TPs, TNs, and lowest instances of FNs. Figure 7d indicates that the prediction ratio for TP’s input values is the classification result, and TN is largely high. At the same time, the false committed by the classifier for (FN, FP) is low. It also indicates that the classifier performance and the expected ratios are satisfactory. The detection rate with training confusion metrics and multiple datasets are listed in Table 6. The binary classification approach is obtained for training and testing stages based on DNN technique. Due to the inability to measure the bias ratio among the classes in the classifier, it is necessary to rely on classification reports to obtain a deeper concept of the strength and performance of the classifier more than the accuracy. The experimental results in class 0 and class 1 classification report indicate that the MLP classifier is more biased toward class 0 in training and testing, as listed in Tables 6 and 7.

Table 6 Training classification from confusion matrices for our model
Table 7 Testing classification from confusion metrics for our model

4.2 Training and testing analysis

The ablation experiments drive the enhancement of the neural network performance based on the correlation coefficient to considered features. The features are split into different levels accordingly to the node’s behavior. We have applied the incremental training stage and conducted several tests over IRAD preprocessed features, including the multiclass categories of three attacks and normal nodes for binary classification based on MLP classifier; we also compared MLP with shallow machine learning, KNN, SVM, NB, LR, and MLP techniques and state-of-the-art routing attacks. The test is applied using the weights learned during the training stage. This section reports the average of running the training model 10 iterations. The comparison results show the effectiveness of early detection and identify the best parent in the DODAG construction to keep the IoT network in service. Table 8 and Fig. 8a summarize the training and testing accuracy results compared to supervised DL algorithms.

Table 8 Calculating scoring time during the training and testing stages of our approach based on binary classification compared to different ML techniques in our work
Fig. 8
figure 8

Comparison of training and testing accuracy of our classifier with other techniques: training accuracy of HF (a), training accuracy of DR (b), training accuracy of VN (c), as well as the testing accuracy of HF (d), training accuracy of DR (e), and testing accuracy of VN (f)

The training accuracy of HF demonstrates that MLP technique achieves a performance reach of 0.992% higher than the four techniques within 78 ms of training duration and up to 0.5% of testing period. In Fig. 8b, the DR sample indicates that among both MLP and KNN a relative parity of training accuracy with an approximate increase of MLP with 0.98% within 33 ms as an optimal time is observed. In Fig. 8c, the MLP technique in VN shows high detection accuracy of 0.97% compared to other methods within 46% ms of training duration. In order to evaluate the robustness and effectiveness, Fig. 8d compares the test performance accuracy of our classifier with four other classifiers K-nearest neighbors (KNN), logistic regression (LR), support vector machine (SVM), and naïve Bayes (NB). HF in SVM has a lower accuracy than the other classifier. Therefore, it cannot be recommended because its learning time is also high, while the MLP classifier has higher accuracy and lowest time, as listed in Table 8.

Figure 8e shows the test performance accuracy compared with the other four classifiers; the test accuracy of NB and LR is low compared to the other classifier. Therefore, they cannot be recommended because its learning time is high, while the MLP classifier is higher accuracy and lowest time, as listed in Table 8.

Figure 8f shows the test performance accuracy compared with four other classifiers; the test accuracy of five classifiers is uneven and has a slight improvement from each other. Therefore, they can be recommended due to good learning time, as shown in Table 8’..

4.3 Performance discussion

Upon the analysis methods in Sect. 4, the performance results show a speed and decrease in epoch time, which is reflected in the perfect performance of the model. Table 9 compares the performance metrics for our model to recent studies that generated IRAD dataset, and other studies that used the same dataset. DNN model obtained the best performance of HF attack through training the detection model using five features, the DR and VN attacks were trained using ten features, and the performance accuracy of DR and VN models is higher with 0.94 and 0.95 F1 score. Whereas, our model used nine features for each attack during training with uneven correlation level. ML-RPL used binary classification and multi-classification for  a sample of IRAD dataset is DR attack. In the binary classification, the training accuracy reached 97.17% and 97.01% for testing accuracy, while in multi-classification used the same parameters with SoftMax as the activation function, the training and testing accuracy of the model obtained 96.59% and 96.39% for testing phase. GAN is used to detect any fake samples that could confuse the learning cycle of the detection model. The performance measuring is compared between GAN-C and an independent SVM classifier to select the proper model in IoT.

Table 9 Comparison of the quantitative measure of performance metrics with other Techniques

The training results showed slight improvement, although the proposed model evaluated the performance compared to one classifier. It took a lower number of epochs (about 50) to reach an accuracy of 91%. In contrast, our proposed classifier evaluated the performance accuracy with four algorithms. It leaves no doubt that our model is more efficient. In the same context, Table 10 states that the DL-ESD scheme has the highest accuracy compared with the DNN [24], the author of the IRAD dataset used in our study.

Table 10 Comparison classification results of our model with state-of-the-art studies

Almusaylim [10] proposed a security routing protocol (SRPL-RP) for RPL rank and version number attacks. The proposed protocol detects and isolates attacks and adds them to the blocklist. The detection is based on a comparison of the ranking mechanism. The analysis results indicate that the PDR packet delivery rate of (98.48%) and SRPL-RP achieved an accuracy rate of (99.92%) under routing attacks. A recent study [29] suggested a machine learning model consisting of three steps: data collection, feature extraction, and two classification methods. The decreased rank IRAD dataset has been used to train ML-RPL model for new features that have been added manually. MLRP indicates that the accuracy rate is up (97%). The authors depend on actual sensor code through the data generated in the simulation scenarios, and the performance accuracy reached 96%. CCN method [19] predicts suspicious traffic on IoT networks, and the authors generated an IoT dataset consisting of five datasets. Due to the lack of studies that use the IRAD dataset, we chose two subsets to compare with the used datasets in our research. The results indicate a relative decrease in the detection of version number motes; the detection accuracy rate in both HF and VN attacks reached 93.63%. In iIoT, [21] is based on Industrial IoT networks that detect hello flood, version number, black hole, and sinkhole attacks. The performance accuracy among the Interval rate (200–1000 s) of hello flood indicates 92%, and version number reaches 93%, respectively.

GAN-C model [20] takes adversarial training into account and has created a generative adversarial network classifier (GAN-C) with support vector machine (SVM). The study adopts DL parallel learning, and the results show a relatively much lower level of training to achieve an appreciable detection accuracy of 91%. What gives our adopted mechanism a preference over the proposed mechanisms is that DL-ESD can find easy arithmetic solutions at a high rate of efficiency, as the detection accuracy has reached (98.85%), the precision rate of (97.50%), recall rate of (98.33%), and F1 score rate of (97.01%), model performance values are evidence of the model scalability.

5 Conclusion and future work

This study proves that deep learning techniques are more efficient in complex security issues in IoT security. A new scheme called DL-ESD has been performed to detect routing attacks early. The LDA proved a potential to maximize the distances between the mean classes (between classes) and reduce the distance between the mean of the same class (intraclass), which produced more distinct features, it was implemented with the MLP classification algorithm. At the same time, the data was normalized using min–max scaling, which eliminated the worst overfittings of fewer data points in training samples. The important features are based on the highest correlation level features. The introduced approach applies the binary classification method in lightweight deep learning techniques. It can classify the behavior as a normal node or routing attack as available in the processed dataset. Therefore, we  observe a high enhancement in MLP classifier performance, it shows high accuracy in testing and training and a low runtime compared to other classifiers. The results of DL-ESD model performance also show better detection efficiency. This scheme requires firmware adjustment on IoT objects, and its computational complexity is still low. In future work, we plan to enhance the detection range using a better technique based on edge computing environment of widely comprehensive routing attacks in RPL protocol.