1 Introduction

Wireless sensor networks (WSNs) is a heterogeneous system containing small actuators and sensors which have general-purpose computing elements. WSN consists of hundreds or thousands of low-cost wireless nodes, low-power, self-organizing, and deployed to control and monitor the environment (Abdollahzadeh and Navimipour 2016; Rashid and Rehmani 2016). Five main characteristics have to be considered during the development of WSN: self-healing, reliability, scalability, robustness, and security (Emary et al. 2016). Moreover, WSNs can be used in many applications like ocean monitoring, performance monitoring for manufacturing machines, earthquake monitoring, and various military applications. Besides, future applications, such as monitoring of pollution, highway traffic, building security, wildfires, and water quality, are likely to use WSN principles in their architectures. WSNs have many advantages, such as converting raw data into useful aggregated and grouped information (Maleh and Ezzati 2015).

WSNs often have one (or more) centralized control unit called base station(s). A base station is typically playing a gateway role to another network, it provides a great storage center and data processing, and it offers human interface access point. It can also be used as a connector to disseminate control information and to extract data from the network. The base station has also referred to the sink. All of the sensor nodes build a routing forest, where the root of each tree is a base station. The base station has more storage and power capacity than the other sensor nodes. Typically, the base station has the required battery energy to cover sensor nodes lifetime, enough memory storage to save cryptographic keys, stronger (in terms of performance and speed) processors, and means to communicate with outside WSN (Khasawneh et al. 2020; Çavuşoğlu 2019).

Ensuring a high level of surety for the critical applications that use WSN is highly required to secure their data and infrastructure from intrusions. Therefore, an intrusion detection system (IDS) should be used to detect abnormal behaviors and intrusions. In WSN, sensors gather the information from the environment where they are distributed, and then they transmit it to the base station node. Information should be secured from external attackers, as cryptographic security is not fully efficient in protecting this information. Therefore, the second level of defense mechanism is required, such as IDS (Guo et al. 2014; Khor et al. 2012). IDS monitor the network’s traffic and sends alerts to the base station in case there is a malicious activity found by any sensor as shown in Fig. 1. In this figure, the black cycle presents a sensor node, white star presents an intruder, red star presents a cluster head, D presents the distance between to intruder, R presents the cluster area, and A presents the sensors area.

Fig. 1
figure 1

Example of the intrusion detection system in wireless sensor networks

Feature selection (FS) is a technique used to select the most pertinent features, where these features are used to create accurate and robust IDS models. FS offers a manner to identify the relevant features and removing redundant features among given attributes from the dataset (Chizi et al. 2009; Abualigah 2019). The main objectives of FS are minimizing data dimensionality and improving the performance of the detected performance (Chandrashekar and Sahin 2014). In real applications, there are many features used in data representation, including redundant features. Thus specific features would take the role of other features, and the extra features might be segregated. In addition, the relevant features have a direct impact on the output, and they have essential information describing the behavior of the dataset (Bell and Wang 2000). Formerly, it is unpractical to get a comprehensive search for the best set of features in a high dimensional space. Many kinds of research have been conducted to model the FS as a combinatorial optimization problem, where a set of given features leads to the optimal feature space (Nakamura et al. 2012).

Despite that several IDSs have been proposed to detect intrusions in WSN such as (Haque et al. 2015; Jin et al. 2017; Yu et al. 2016), they are still suffering from: (1) low detection accuracy, (2) high false alarm alert, and (3) high processing time. Therefore, proposing more efficient IDS in WSN is needed. The Grey Wolf Optimization (GWO) is a recent optimization algorithm, proposed by Mirjalili et al. (2014), that has been successfully used for solving different FS problems. In this paper, three binary improved versions of GWO are introduced using two different techniques. In the first technique, the first three best solutions are recognized by the individual, and then they are binarized, after that, the stochastic crossover is done between the three main moves in order to find the next location for the binary gray wolf. In the second technique, frequent update to wolves’ location is performed by sigmoidal function, and these values follow a threshold stochastically in order to find the next location of the binary gray wolf. This paper enhance the performance of GWO-IDS in WSN by increasing the number of wolves and using multi-objective function to increase the overall performance of the prediction system is based on how it predicts unknown classes efficiently in terms of accuracy and the total number of selected features, and speed-up the execution. These two techniques of binary GWO are commonly used in the FS domain to find an optimal subset of features used to maximize the classification accuracy (i.e., detection rate) with the minimum number of features used. Experiments are conducted using the NSL KDD’99 dataset to demonstrate the performance of the proposed method and compare it with other existing techniques including PSO-IDS (since it suffers from long execution time and low detection rate) and grey wolf optimizer with support vector machine (GWOSVM-IDS) with three wolves (since it suffers from long execution time and low accuracy). The proposed methods are evaluated in terms of accuracy, number of features, execution time, false alarm rate, and detection rate. The results showed that the proposed GWOSVM-IDS with five wolves overwhelm PSO-IDS by 3, 40, 42, 64, and 3%, and it enhances GWOSVM-IDS with three wolves by 16, 50, 13, 60, and 14% respectively. Besides, the proposed GWOSVM-IDS with seven wolves enhances PSO-IDS by 7, 40, 46, 88, and 15% respectively and GWOSVM-IDS with 3 wolves by 21, 50, 24, 87 and 15% respectively. Therefore, the proposed technique enhances the existing techniques noticeably in terms of accuracy, detection rate, number of features, and execution time.

The results of this paper are organized as: Sect. 2 presents the more related work to the intrusion detection using optimization algorithms. Section 3 presents the proposed method for solving intrusion detection. Section 4 presents experiments and results. Finally, Sect. 5 presents the conclusion and future works.

2 Literature review

In this section, a comprehensive review of the related works is presented to investigate the existing works related to feature selection based on IDS. Also, a comparison of the proposed technique is conducted against some of the prior techniques in order to validate the result of the proposed one.

2.1 An overview of wireless sensor networks

As aforementioned, WSN holds several devices called sensor nodes. These nodes are characterized by their small size and their limited power and processing capabilities. Besides, sensor nodes sense and gather information from the environment that is distributed (García-Hernández et al. 2007). Although, the WSN is used in many critical applications, it is still suffering from different challenges. These challenges are discussed in the next subsection.

2.1.1 WSNs challenges

WSN poses unique challenges because it deploys inaccessible (open) areas, presented in the added risks of physical attack; also the interaction with their users and physical environments may carry serious security problems (Mahmood et al. 2015; Curiac 2016). So, the traditional security mechanisms are not enough and inadequate to protect WSNs, and new ideas are needed. Indeed, many security issues may face WSNs, like (Guo et al. 2014). Data confidentiality is the process of keeping the information secure from unauthorized parties. It can be achieved since a network sensor does not trickle sensor readings to the sensors neighboring in WSNs. In different applications, network nodes highly communicate sensitive and critical data. Data encryption is the standard approach that can be used to protect sensitive data from unauthorized access. Public-key cryptography is used in data encryption used for encrypting the data. On the other hand, this approach is costly in terms of time and performance.

In WSNs, the receiver node has to make sure that form the originator of the message before making any decision because intruders can easily inject messages in the network. Therefore, data authentication protects the network from unauthorized parties’ participation in the network. In the case of two-party communication, data authentication can be done by using a purely symmetric mechanism. Data Integrity verifies that the receiver will receive an un-altered message, as an adversary can alter the message during the transmission of the message. Moreover, data authentication offers data integrity. Data freshness implies that the data is recent, and there are no replayed old messages by an adversary. Sensor network encryption protocol (SNEP) uses a common defense by using generating unique and order number for each message (Nakamura et al. 2012). This method ensures that the message with the first occurrence must be passed, and the fake message with more than one occurrence must be prevented. WSN should be robust and robust against various attacks. However, if an attack succeeds to enroll in the network, its impact must be decreased as much as possible. WSNs are vulnerable and susceptible to security attacks due to the medium of message transmission and the broadcasting nature. Furthermore, because of the WSNs nodes are placed in a hostile and dangerous environment, WSNs face an additional vulnerability.

2.1.2 WSN threats and attacks

In this section, these threats and attacks are discussed (Chelli 2015). The intruder can receive the messages interception, which is containing the sensor nodes’ physical locations. After that, an attacker can destroy the nodes after locating them. An adversary can observe the sensor nodes’ locations, the content of messages such as: (time-stamps, and message ID) (Hammoudeh et al. 2017). Robust encryption techniques can be used to reduce the threats of gathering passive information.

Capturing the network node may reveal this node information, including cryptographic keys, which can compromise the whole network. The intruders could add a fake node into the network to feed wrong or false data or to prevent accurate data passaging (Pritchard et al. 2017). Moreover, such messages consume resource energy and slow down network performance. This type of attack could be called “sleep deprivation.” A single node in the network has a present multi identity to the other nodes inside the WSN. It poses a critical threat to the protocol of geographic routing, as the location-based routing requires exchanging coordination information with the node and its neighbors to address the data packets geographically to the route efficiently. Authentication and encryption techniques can be used to protect the network from Sybil attack in WSNs. Every two nodes in WSNs can employ Needham-Schroeder protocol in order to verify their identities and to build public key. A neighboring node pair can use the generated key in order to implement an encrypted and authenticated link between them. A sinkhole attack attracts a considerable number of the transmission traffic by a specific node through a specific area, and it constructs fake road or sinkhole at the network center (Maza and Touahria 2019). A sinkhole attack typically makes a particular node that seems more so attractive for its neighbors.

An adversary in wormhole attack tunnels messages which is arrived at the receiver over a low latency network link and replays it to different parts. In this type of attack, between every two nodes, there is single that node has been situated to forward messages between them. Figure 2 presents a wormhole example. From Fig. 2, it can be noticed that the attacker receives packets at one point in the network, and then it tunnels them to another point in the network and replays them into the network from that point.

Fig. 2
figure 2

Wormholes attack (Chelli 2015)

2.2 Intrusion detection systems based on feature selection

As mentioned earlier in chapter one, the IDS is an essential element of security over any network type. It ensures a high level of security for the network against possible threats by preventing or detecting all intrusions and hosts (Devi and Suganthe 2017). However, its primary goal is to ensure the adaptability to detect any new type of attack. IDS can be classified into misuse IDS and anomaly IDS. The misuse IDS employs the signatures to find any new attack, but the anomaly IDS uses statistical patterns and intelligent methods to discover either the behavior is healthy or not (Stein et al. 2005; Amiri et al. 2011; Paulauskas and Auskalnis 2017).

Several techniques have been proposed to introduce IDS are based on anomaly detection, mainly, these techniques use intelligent classification methods and artificial intelligence algorithms to classify the normality and abnormality (Aljawarneh et al. 2018). The classifier in IDS ensures the detection of attacks and provides intelligent computation to the detection process. Every classifier might be characterized by its pattern, detection rate, false alarm rate, and accuracy.

Regrettably, most of the classifiers suffer from the low level of accuracy, detection rate, or high level of false alarm rate, in addition to the issue related to the complexity of the classifier model and computational cost. Moreover, the high level of classification, high complexity, long processing time, and massive storage capacity impact the performance of the detection process (Jaiganesh et al. 2013). Therefore, enhancing the classifier performance and minimizing the processing costs remain the main issues in IDS and need further improvement. However, reducing the dimensionality of the dataset would enhance the effectiveness of the detection process and overcome the overhead classification challenge. Recently, the feature selection process is used to overcome such problems (Ambusaidi et al. 2016).

Feature selection is emerging to minimize the dimensionality of data by selecting the essential features with minimum redundancy and maximum relevancy to get high performance and accuracy. Those are the main objectives that provide us with suitable data recognizing out noise, addressing the over-fitting issues, and select in the optimal feature that could be used in the classification process efficiently. Feature selection can be defined as “a preprocessing optimization process, which is mainly used to reduce the dimensionality of the dataset by selecting the interesting features without redundancy and irrelevance” (Bins and Draper 2001). These selected features resemble the optimal subset which provide an enhanced classification performance and accuracy, avoidance of the overhead of classification issue, a minimized processing time and storage capacities required to store data, a data out noise, and no over-fitting problem (Stein et al. 2005). Figure 3 illustrates the main steps of the feature selection process, which include subset generation from the dataset, evaluation of this subset, then result invalidation after stopping criteria has met.

Fig. 3
figure 3

Feature selection processes

2.3 Related works

A few related works in prior literature consider anomaly detection in WSN or propose detection techniques to address malicious attacks and intrusions in WSN. However, an IDS technique is proposed for WSN based on trust values (Jin et al. 2017). The features of the technique are as follows: (1) Mahalanobis distance is used to judge abnormalities for the node trust feature, which provides more accuracy. (2) To reduce the false-positive rate, a tolerance factor is used in the trust value computations. Thus the WSN security and flexibility are improved. (3) The proposed method is implemented using a multi-agent model to ensure scalability and fault tolerance. The findings revealed that the modified technique ensures a high detection rate and minimal false-positive rate.

A new anomaly detection model is introduced based traffic WSN (Yu et al. 2016). The proposed technique ensures a high level of detection accuracy by frequently updated model and static size sliding window; also, the exponentially weighted technique is used to enhance the detection rate, and using a reference and delay technique also increases the detection rate.

A hybrid IDS for WSN, and their technique is working based on hybridizing two existing techniques (i.e., anomaly and signature) (Maleh and Ezzati 2015). Hybridizing these two techniques provide an IDS with a 94% detection rate. Since these two models are integrated into a cluster topology in order to minimize communication costs between nodes, the lifetime of WSN is improved. In addition, their technique takes advantage of SVM and signature-based techniques to detect suspicious behaviors and to offer a lightweight intrusion detection in cluster topology. This technique employs a discovery protocol depending on the specifications in order to detect intruder’s nodes and to prohibit network disruptions. In specific, the role of this protocol is to classify the behavior into normal behavior or abnormal behavior based on a set of rules. However, this research ignores the other essential factors, such as accuracy, number of features, and execution time.

In addition, a new IDS is proposed in order to detect malicious attacks by classifying true alarms from false alarms, where this technique is applied in healthcare applications (Haque et al. 2015). Their technique is working based on historical data to predict sensor value, and then this value is directly compared with actual sensed value. Majority voting and dynamic threshold are used to identify alarm type. Their technique is evaluated using medical datasets. The findings show that their technique has a high detection rate (91%) and a low false-positive rate (16%). However, still, this technique needs more improvement in terms of execution time, number of features, and accuracy.

The PSO-IDS technique is introduced along with principal component analysis to detect an attack in the WSN environment (Ahmad 2015). The proposed technique obtained noticeable results in terms of the number of selected features and false alarm rate, but it requires more enhancements in terms of accuracy and detection rate. Besides, the execution time required to detect all attacks is totally ignored when proposing this technique. Besides, Sedjelmaci and Feham (2011) proposed a hybrid IDS technique in the WSN environment. Misuse IDs and SVM have been used to detect attacks. The proposed technique is evaluated using the NSL KDD’99 dataset, and the results revealed that the proposed technique achieves high accuracy (96%) and detection rate (93%), but it ignores false alarm rate and execution time. Other optimization algorithms can be investigated to solve the IDS (Abualigah and Khader 2017; Abualigah et al. 2018; Abualigah 2020; Abualigah and Diabat 2020).

As noticed from Table 1, the existing IDS techniques are still suffering from different issues, such as: high false alarm rate, low detection rate, high number of selected features, and/or high processing time. Therefore, this paper is conducted to overcome all these issues by proposing a modified binary GWO (GWOSVM-IDS) to detect intrusions in WSN environment efficiently.

Table 1 Overview of the related works

3 The proposed method

In this section, a new intrusion detection method is proposed to achieve the main objective of this study. As shown in Fig. 4, typically, the quantitative method starts with data acquisition, where the dataset is acquired from NSL-KDD dataset, because it is a verified and valid data set (Haque et al. 2015; Maleh and Ezzati 2015), which is extensively used in testing different intrusion detection technique.

Fig. 4
figure 4

Procedures of the proposed intrusion detection system

Figure 4 shows the steps of the proposed method that deals with the dataset to detect intrusions in WSN. The dataset is divided into two main parts with different sizes (i.e., training dataset with 80% size from the overall dataset and testing dataset with 20% size from the overall dataset). Generally, the proposed method as shown in Fig. 4 is divided into three main stages, the first stage is the future selection applied on the given data, the second stage is the classification process for the data, and the third stage is the evaluation stage to prove the performance of the applied methods. The following sub-sections describe the stages of the proposed method in detail.

3.1 Preparing dataset

In this section, the network security laboratory dataset is used (NSL-KDD’99) to evaluate the proposed technique. It is an open-source dataset, where it can be downloaded online. Indeed, this dataset is an enhanced version of the KDD-Cup 99 (Tavallaee et al. 2009). Several kinds of data analyses have been conducted on the NSL-KDD’99 dataset using different methods and tools with a global security aim, which is to develop active IDS. It has been used in this research due to several advantages such as no redundant records are included; therefore, no biased classification occurs. Indeed, it always provides more intrusion detection rates; and it can be used to compare different IDS techniques effectively. Table 2 summarizes statistics of NSL KDD’99 related to the number of attacks and standard records in the dataset.

Table 2 NSL KDD’99 statistics

The number of original records of NSL KDD’99 is 4,898,431 (3,925,650 attacks records and 972,781 official records). The filtering and reduction process is performed to increase the reliability, validity, and accuracy of this dataset by removing redundant and biased records. The number of distinct records that are eliminated from the refined version of NSL KDD’99 is 1,074,992 (262,178 attacks records and 812,814 standard records). In other words, the total reduction percentage is 78.05% (93.32% reduction in attacks records and 16.44% in standard records).

Besides, Fig. 5 depicts the attacks classes included in the dataset. These classes include: U2R attacks (user to root attacks where the attacker has local access to victim machine, hence tries to gain access), R2L attacks (remote to local attack where the attacker does not have an account on the victim machine, hence tries to gain access), DoS attacks (denial of service attack where the attacker tries to prevent legitimate users from using a service), and Probe attacks (attacker tries to gain information about the target host).

Fig. 5
figure 5

Types of attacks in NSL-KDD’99

Data normalization has been performed on NSL-KDD’99 to be more suitable for feature selection and classification techniques, in which dataset records are transferred in a form that is more appropriate to the classification and FS algorithms. Thus the size of this dataset is reduced, and the processing might be faster with fewer hardware processing requirements. Figure 6 shows a snapshot of the dataset before normalization. The steps of this normalization model are discussed in details as follows:

Fig. 6
figure 6

Snapshot of the original dataset

In this section, the preprocessing of the dataset is performed in two main phases, namely: (1) encoding of non-numerical values, (2) normalize numerical values. Figure 7 summarizes these phases.

Fig. 7
figure 7

Encoding dataset

Figure 7 is used to encode the content of the dataset. For example, each protocol type is given an integer number to covert the original content of dataset to numerical one, TCP = 1, UDP = 2, and ICMP = 3. The same encoding methodology has followed the encoding of the non-numerical content of the dataset.

The final form of the normalized dataset is as the following snapshot, which is shown below in Fig. 8.

Fig. 8
figure 8

Snapshot of dataset after preprocessing

After encoding the non-numerical content of the dataset, the normalization method is used to scale the content of the data set and ensure that the content of the dataset. Therefore, the following Eq. 1 has been used to scale the dataset values to get the normalized value (Jain and Bhandare 2011):

$$X^{\prime} = \left( {original\,value - \hbox{min} \_value} \right)/\left( {\max_{value} - min\_value} \right).$$
(1)

3.2 Feature selection using modified grey wolf optimization

In this paper, the feature selection algorithm (i.e., modified binary grey wolf algorithm) for attribute selection is used to identify the optimal feature set. Thus, it ensures to find out a subset out of full-length features in data set (i.e., 29 features), whose performance is higher than or equal the performance given by the 29 features. Figure 9 presents the flowchart of GWO.

Fig. 9
figure 9

Flowchart of GWO

GWO is one of the most used meta-heuristic feature selection algorithms (see Algorithm 1), which depends mainly on the hierarchy of leadership and the mechanism hunting by grey wolves in the real world. It has four types of wolves, including alpha wolves, beta wolves, delta wolves, and omega wolves (Al-Tashi et al. 2020). These wolves usually prefer residing in a pack, where the average size for each group is ranging from five to twelve. Alpha wolves are located in the 1st place, and beta and delta wolves are considered the second and the third places, respectively. The fourth wolves (omega) are the least expected solution. In other words, the first three wolves guide the optimization, and the fourth one is responsible for tracking the other wolves (i.e., alpha, beta, and delta).

The main element of modified GWO is: (1) increasing the number of wolves to be (five) wolves instead of being (three) wolves, and (2) proposing a new fitness function (multi-objective function) that aims to determine if a subset fits the core goals or not. The importance of accuracy and the total number of selected features are taking into consideration when proposing fitness function in order to evaluate each subset of the selected feature. Thus, the overall performance of the prediction system is based on how it predicts unknown classes efficiently in terms of accuracy and the total number of selected features. The algorithm of enhanced GWO shows in Algorithm 2.

figure a
figure b

3.3 Classification using SVM

After that, a reduced dataset with fewer features is produced to be used as input for the SVM algorithm for classification purposes (i.e., the output of a modified binary grey wolf is used as an input for SVM classifier).

The support vector machine is a machine learning algorithm used for classification purposes. In detail, data is represented in n-dimensional space (n is the number of features), where each feature has a particular coordinate. The classification process is performed by identifying the hyper-plane then classify the classes. Figure 10 summarizes the workflow of the SVM algorithm.

Fig. 10
figure 10

Flowchart of SVM (Ahmad 2015)

In detail, SVM uses the number of free parameters, which mainly depend on the separation margin that is used to spate different points, as illustrated below in Fig. 11. Thus, using SVM with a reduced number of feature resulted from a feature selection stage avoid over fitting. Besides, SVM introduces a generic technique to adjust the surface of the hyper plane to the selected features by using a kernel function.

Fig. 11
figure 11

Separating planes in SVM

However, SVM is adopted in this research because of two main reasons, namely: (1) speed, which allows it to be used in real-time application efficiently, and (2) high scalability, while its complexity is not affected by dimensionality of the feature space (Ahmad 2015).

4 Experimental results

In our experiments, Matlab R2019b is used to implement the proposed technique. Matlab R2019b is a widely used analysis and modeling tool used in various systems in different applications. Indeed, it could help in taking solutions’ ideas beyond the desktop by executing analyses using a massive volume of data sets. Besides, code written in Matlab might be integrated with other programming languages to provide algorithms deployment with web and enterprises. The hardware specifications used to demonstrate the performance of the proposed solutions is summarized in Table 3.

Table 3 Software and hardware specifications

4.1 Evaluation criteria

Moreover, five main evaluation metrics will be used to evaluate the proposed technique, namely:

  1. (a)

    Accuracy, which is the proportion of data correctly classified that is a true positive (TP) and true negative (TN) (Sabri et al. 2011):

    $${\text{Accuracy}} = \left( {\frac{{True_{positive} + True_{negative} }}{{True_{positive} + False_{negative} + False_{positive} + True_{negative} }}} \right).$$
    (12)
  2. (b)

    Number of features, which is average number of selected features used to classify the normal and abnormal behavior efficiently:

    $${\text{Number of Features}} = ({\text{Total}}_{{{\text{number}}\,{\text{of}}\,{\text{features}}}} - {\text{Nonselected}}_{\text{Features}} ).$$
    (13)
  3. (c)

    Execution time, which is the time consumed to finish normality and abnormality classification:

    $${\text{Execution time}} = ({\text{Finishing}}_{\text{Time}} - {\text{Starting}}_{\text{Time}} ).$$
    (14)
  4. (d)

    Detection rate, which is rate of true positive over true positive and false negative:

    $${\text{Detection rate}} = \left( {\frac{{{\text{True}}_{\text{Positive}} }}{{{\text{True}}_{\text{Positive}} + {\text{False}}_{\text{Negative}} }}} \right).$$
    (15)
  5. (e)

    False alarm rate, which is rate of false positive over true negative and false positive:

    $${\text{False alarm}} = \left( {\frac{{{\text{False}}_{\text{Positive}} }}{{{\text{True}}_{\text{Negative}} + {\text{False}}_{\text{Positive}} }}} \right) ,$$
    (16)

    where True-positive is the number of intrusions that are detected successfully, False-positive is the number of the typical behaviors that are classified as intrusions, True-negative is the number of normal behaviors that are classified as normal behaviors as well, and False-negative is the number abnormal behaviors that are classified as normal behaviors.

4.2 Testing and analysis

This section provides a detailed evaluation comparison of particle swarm optimization (PSO-IDS, original GWO-IDS, and the GWOSVM-IDS techniques.

4.2.1 Evaluation based on accuracy

This section provides a detailed evaluation comparison of particle swarm optimization (PSO-IDS, original GWO-IDS, and the GWOSVM-IDS techniques.

The proposed methods, GWOSVM-IDS7, GWOSVM-IDS5, and GWOSVM-IDS3, are investigated to show the ability of the proposed method in solving the IDS problem. These three versions of the improved GWO are adjusting to find better wolves configuration that help the basic version in avoiding its drawbacks such as premature conference, diversity of the solutions, and the slow search. Each change in the wolf size will affect the optimizer process, either better or worse. Hence, the number of wolves is examined in this section to find the most suitable number to make a better trade-off between the exploration and exploitation searches.

In terms of Accuracy, the proposed technique outperforms PSO-IDS and GWO-IDS. In details, GWOSVM-IDS with 7 wolves achieves 96% accuracy, GWOSVM-IDS with 5 wolves achieves 92% accuracy, PSO-IDS achieves 89% accuracy, and GWOSVM-IDS with 3 wolves achieves 79% accuracy. Giving the high weight to Accuracy over several features, in the proposed multi-objective function, increases classification accuracy. Therefore, enhancing the accuracy of intrusion detection in the WSN environment (i.e., objective number 1) is achieved. Figure 12 presents the results.

Fig. 12
figure 12

Evaluation results based on accuracy

4.2.2 Testing of false alarm rate

In terms of false alarm rate, the proposed technique outperforms PSO-IDSPSO-IDS and GWO based IDS. In detail, GWOSVM-IDS with seven wolves achieves a 3% false alarm rate, GWOSVM-IDS with five wolves achieves 9% false alarm rate, PSO-IDS and GWOSVM-IDS with three wolves achieve 26, 24% false alarm rates respectively. Therefore, decreasing the false alarms rates and the number of features resulted from the IDSs in the WSN environment (i.e., objective number 2) is achieved. Figure 13 summarizes the obtained results in terms of detection rate.

Fig. 13
figure 13

Evaluation results based on false alarm rate

4.2.3 Testing of detection rate

In terms of detection rate, the proposed technique outperforms PSO-IDS and GWO-IDS. In detail, GWOSVM-IDS with seven wolves and GWOSVM-IDS with five wolves achieve 96% detection rate, PSO-IDS, and GWOSVM-IDS with three wolves achieve 93, 83% false detection rate respectively. Therefore, enhancing the detection rate of IDS in the WSN environment is achieved (i.e., objective number 3). Figure 14 summarizes the obtained results in terms of detection rate.

Fig. 14
figure 14

Evaluation results based on detection rate

Based on repeatable experiments, using GWOSVM-IDS with 5 wolves and 7 wolves provide consistent and reliable results, since using more wolves would increase complexity and decrease overall detection rate.

4.2.4 Testing of number of features

In terms of number of features, the proposed technique outperforms PSO-IDS and GWO-IDS. In details, GWOSVM-IDS with 7 wolves and GWOSVM-IDS with 5 wolves select 12% features to detect intrusions efficiently, PSO-IDS and GWOSVM-IDS with 3 wolves select 20, 27% features to detect intrusions respectively. Therefore, decreasing the number of features resulted from IDS in WSN environment is achieved (i.e. objective number 4). Figure 15 summarizes the obtained results in terms of number of features.

Fig. 15
figure 15

Evaluation results based on number of features

Based on repeatable experiments, using GWOSVM-IDS with five wolves and seven wolves provide consistent and reliable results since using more wolves would increase complexity and increase the number of selected features.

4.2.5 Testing of number of execution time

In addition, the proposed method outperforms the other techniques in terms of execution time ash shown below in Fig. 16. As GWOSVM-IDS with seven wolves and GWOSVM-IDS with five wolves consume 69.6 and 74.4 h respectively, but PSO-IDS consumes 129.6 h and GWOSVM-IDS with three wolves consume 86.4 h to detect all intrusions. Therefore, decreasing the processing time required by the IDS in the WSN environment (i.e., objective number 5) is achieved (Table 4).

Fig. 16
figure 16

Evaluation results based on execution time (hours)

Table 4 Results summary of the three versions of the GWO compared with others

To determine the enhancement of the proposed technique, the enhancement percentages for GWOSVM-IDS with 7 wolves and with 5 wolves are calculated using the following equation:

$${\text{Enhancment}}_{\text{Percentage}} = \left( {\left| {{\text{Old}}_{\text{Value}} - {\text{New}}_{\text{Value}} } \right|} \right)/{\text{Old}}_{\text{Vlaue}} .$$
(17)

For example, let’s say we want to calculate the enhancement percentage of GWOSVM-IDS 7 wolves on GWOSVM-IDS 3 wolves in terms of accuracy, then:

$${\text{Enhancment}}_{{{\text{Percentage}}\_{\text{Accurcay}}}} = \left( {\left| {{\text{Accurcay}}_{{3 {\text{Wolves}}}} - {\text{Accurcay}}_{{7 {\text{Wolves}}}} } \right|} \right)/{\text{Accurcay}}_{{3 {\text{Wolves}}}} = \, \left( {| \, 0. 7 9- 0. 9 6 { }|} \right) \, / \, 0. 7 9 { } = \, 0. 2 1.$$

The enhancement percentages for the remaining metrics for each technique are calculated using the same manner. The obtained enhancement results are summarized in Figs. 17 and 18.

Fig. 17
figure 17

Enhancement percentage of GWOSM-IDS with 5 wolves on existing IDS techniques

Fig. 18
figure 18

Enhancement percentage of GWOSM-IDS with 7 wolves on existing IDS techniques

Figures 17 and 18 show the enhancement percentage of GWOSVM-IDS with seven wolves and GWOSVM-IDS with five wolves on PSO-IDS and GWOSVM-IDS with three wolves. Indeed, GWOSVM-IDS with five wolves enhances PSO-IDS in terms of accuracy, number of features, and execution time, false alarm, and detection rate 3, 40, 42, 64, and 3% respectively. Further, GWOSVM-IDS with five wolves enhances GWOSVM-IDS with three wolves in terms of accuracy and number of features, execution time, false alarm, and detection rate by 16, 50, 13, 60, and 14% respectively.

Besides, GWOSVM-IDS with seven wolves enhances PSO-IDS in terms of accuracy, number of features, and execution time, false alarm, and detection rate 7, 40, 46, 88, and 15% respectively. Further, GWOSVM-IDS with seven wolves enhances GWOSVM-IDS with three wolves in terms of accuracy and number of features, execution time, false alarm, and detection rate by 21, 50, 24, 87, and 15% respectively. Using the proposed multi-objective function with the proposed IDS techniques plays a significant role in enhancing classification accuracy, detection rate, and decreasing and false alarm rate number of features used to detect attacks efficiently. Also, using more number of wolves might decrease the execution time.

5 Conclusion and future works

GWO is one of the widespread and robust feature selection algorithms used in different fields, such as IDSs. It selects the most important features that can improve the classification accuracy and intrusion detection rate. Despite the effectiveness of the GWO, it has shortcomings related to the accuracy, number of selected features, and execution time. Several studies have been proposed to enhance the performance of the GWO-IDS, but, still, they need more improvements in terms of accuracy, number of selected features, and execution time. This study is conducted to enhance the performance of GWO-IDS in WSN by increasing the number of wolves and using a multi-objective function to increase the overall performance of the prediction system is based on how it predicts unknown classes efficiently in terms of accuracy and the total number of selected features, and speed-up the execution.

NSL KDD’99 has been used to train and test the proposed technique, and the results are compared with the results of original GWO-IDS and PSO-IDS techniques. Indeed, the findings demonstrate that the GWOSVM-IDS technique outperforms the performance of original GWO-IDS and PSO-IDS techniques in terms of accuracy, detection rate, false alarm rate, number of selected features, and execution time.

As future work, In the future, a mathematical prediction technique could be used to predict the next location of wolves, in order to increase the performance of GWO. Also, hybridizing the modified GWO with other efficient and recent feature selection could be conducted to increase the detection rate in the WSN environment. Besides, other classifiers might be used instead of using SVM to enhance the performance of the classification process.