1 Introduction

Network security has become one of the most significant challenges arising from the rapid evolution of information technology. Large volumes of data are continuously targeted by attackers and are therefore susceptible to external network intrusion. An intrusion occurs when an intruder sends malicious packets to a host machine or exploits a vulnerable network to access or manipulate sensitive data. Protection protocols may be applied on a network to minimize the number of intruders, so that unauthorized people cannot access its services. Attackers use various approaches to find weaknesses in a network’s security, and the process of identifying and tracking this malicious behavior in a network is called intrusion detection [1]. Monitoring a network manually is quite challenging. The Intrusion Detection System (IDS) was therefore designed to carry out this work automatically by observing network and device operations to identify fraudulent behavior. An IDS can be a Network-based Intrusion Detection System (NIDS) or a Host-based Intrusion Detection System (HIDS) [2]. A HIDS detects abnormalities in a single computer system, whereas a NIDS identifies abnormalities in network traffic. Network-based IDS are classified into two types: signature-based (or misuse-based) NIDS and anomaly-based NIDS. A signature-based NIDS identifies a threat by comparing the received data packet with signatures already stored in a signature database, where a signature is an established attack pattern or rule. However, previously unseen attacks cannot be detected this way. An anomaly-based NIDS, on the other hand, detects new attacks by modeling the usual behavior of the system and its users.

Even a slight difference between an observed event and normal activity is treated as intrusive. The drawback of anomaly-based NIDS is that normal behavior is complicated to model due to the diversity of internet traffic. The Intrusion Detection System (IDS), as shown in Fig. 1, has become a key component of security architecture. The proposed system addresses intrusion detection and identification. An intrusion detection framework usually handles a massive amount of data, so one of the critical tasks of an IDS is to preserve the most informative features that represent the entire data and to delete redundant information. Feature selection decreases the number of features in a noisy dataset, and the resulting relevant feature subset increases the detection rate. Feature selection methods are classified into (a) filter, (b) wrapper, and (c) hybrid approaches [3].

Fig. 1 Intrusion detection system setup

Feature selection aims to reduce classification time and improve the accuracy rate. Existing schemes classify the datasets using all of the dataset’s attributes. For complex problems, the Bat Algorithm is used. Sigmoid and hyperbolic tangent functions are commonly used to adapt it to non-continuous (binary) problems. The existing algorithm selects features using the Bat Algorithm, in which each bat moves in a d-dimensional binary space. As a result, a bat’s location is described as a vector of binary coordinates, and the bat may traverse the corners of a hypercube. In each iteration, the transformed values for the attribute subsets are changed so that the bat continues to travel toward the appropriate location. The existing framework uses this information to pick features and uses the bat algorithm to update the SVM [4] regularization and kernel parameters. If a bat cannot find a better fitness value within a pre-defined number of attempts, the Bat Algorithm’s original global solution is used. However, this only slightly improves the algorithm’s execution. The selected attributes are then passed to the SVM to evaluate classification accuracy.

The filter method relies on correlation to rank features and does not depend on the classifier. Wrapper methods, on the other hand, are entirely reliant on the classifier. The practical application is to analyze and monitor system vulnerabilities, which makes it more effective to identify abnormal activities and to enforce user tracking policies. An IDS ensures secured web services along with file integrity. Our framework employs the wrapper approach. In this paper, we use a comparatively recent method, the Bat Algorithm, to improve the SVM classifier, providing significant improvement. Consequently, we propose a new adaptive method that incorporates the Bat Algorithm and demonstrate experimentally that it outperforms the conventional Bat Algorithm when combined with SVM.

2 Related Works

Intrusions may be classified as explicit or implicit. Secondary intrusions are triggered by authorized or unauthorized persons from outside the network. Primary intrusions are carried out by authorized individuals within the internal network. Attackers commonly compromise computer systems through software defects, password cracking, traffic-flow collisions, and performance issues in networks, utilities, or network devices [5]. Applying a well-established supervised learning algorithm extracts information that is useful for designing an IDS. This paves the way for an easy and effective intrusion detection method that can be adopted quickly. Many current machine learning methods, such as Decision Tree, Neural Network, Back-Propagation NN, Naïve Bayesian, and Bayesian Network, are used for network data classification in comparative studies.

Research on IDS focuses on the development of machine learning algorithms. Various machine learning algorithms have been developed over the past years to deal with noisy data and to detect new attacks with a low false-positive rate, including neural networks, genetic algorithms, and decision trees [6]. Filter-based feature selection can manage data features that are sequential and nonlinear. An SVM classifier is used for sampling.

Evolutionary algorithms have been used to select features. One study used Particle Swarm Optimization (PSO) [7] for selecting features and performed classification using ensembles of tree-based classifiers. PSO has also been proposed as a technique for detecting intrusion. Another study picks features using a genetic algorithm and uses adaptive mutation to achieve gradual convergence. A hybrid algorithm that integrates a modified Artificial Bee Colony with Enhanced Particle Swarm Optimization has been used to achieve the best result, with tenfold cross-validation used for classification. The KDDCup’99 [8] benchmark dataset is used to assess the efficiency of that work.

One author proposed a new detection model that picks features using Binary Particle Swarm Optimization (BPSO) and validates the results using SVM and C4.5 classifiers. In comparison with PSO, BPSO achieves superior performance. Another author suggested an ensemble classifier as a hybrid of SVM and K-Nearest Neighbors [9]: a PSO search returns a subset of features, and the ensemble classifier identifies the attack.

The Dynamic Membrane-driven Bat Algorithm (DMBA) aims to enhance population diversity through a tradeoff in which the static membrane structures in DMBA are dynamically evolved by integration and separation rules that help maintain the diversity of the population. This method is used for classification with the SVM classifier, which is applied in many areas such as disease diagnostics, face recognition, text recognition, plant disease identification, sentiment analysis, and IDS for network security applications.

Additionally, that work uses the KDDCup’99 benchmark dataset. For selecting features, the Binary Bat Algorithm has been suggested, in which the bat’s location is described using a vector of binary coordinates and the bat travels across a d-dimensional binary space. The method is validated using two datasets: cancer and iris [10].

Techniques for feature selection in high-dimensional data have also been studied. The primary issue is to improve optimization efficiency in order to solve various optimization problems and to exploit the advantages of various dynamic membrane computing structures. To extend these technologies and make them more reliable, many data sets need to be processed. Feature selection approaches are described by how the features are combined during the evaluation process, namely feature subset-based and feature ranking-based, and by how they relate to the machine learning algorithm used, namely wrapper, embedded, hybrid, and filter [11]. It has been demonstrated that feature ranking-based methods are more efficient in memory and computation than subset-based methods, but that ranking methodologies do not reduce redundancy.

3 Proposed Model

The proposed framework selects features using the bat algorithm. To identify malicious activity and enhance classification accuracy, the selected features are fed into the SVM classification algorithm [12]. Even on high-dimensional noisy datasets, SVM provides excellent generalization, avoids local minima, and achieves good precision. Figure 2 depicts our system framework based on the Bat algorithm, while Fig. 3 describes the feature selection process. Feature selection, in this case, seeks to extract the most critical information from a given set of features. Although this task can be viewed as an optimization problem, the combinatorial growth of potential solutions can make an exhaustive search impractical [13].

Fig. 2 Proposed architecture

3.1 Bat Algorithm

Yang was inspired by microbats, which use echolocation: they emit a loud sound pulse to locate prey or obstacles. Based on this behavior, he proposed the Bat Algorithm in 2010 [14]. A bat flies randomly in search of its prey. The complete algorithm is shown in the flow chart in Fig. 3. The bat algorithm is built on three idealized rules:

  1. Step 1

    All bats use echolocation to sense distance and to distinguish between prey and obstacles.

  2. Step 2

    Bats fly randomly, and their movement is described by their position (xi) and velocity (vi). They search for prey using a varying wavelength (λ), a frequency (freqmin), and a loudness (A0). Bats can adjust the frequency of their emitted pulses and the pulse emission rate (r ∈ [0,1]) according to the distance to their target [15].

  3. Step 3

    Although the loudness can vary in many ways, it is assumed to range from a large value (A0) to a constant minimum value (Amin).

Fig. 3 The proposed framework of the feature selection process

3.2 Mathematical Model of Bat

  • Step 1 The hyperplane is determined by the formula

    $$f_{bat}(X) = sf \cdot X + Matrix_{b}$$
    (1)

    where \(sf\) denotes the scaling factor and \(Matrix_{b}\) represents the bias matrix.

  • Step 2 By transforming the dataset into a higher-dimensional feature space through a kernel function, the nonlinear SVM problem becomes a linear one. We use an SVM with the Radial Basis Function (RBF) kernel for testing the prototype [16]:

    $$Kernel(X_{i}, X) = \exp\left[ - \frac{1}{2\sigma^{2}} \left\| X_{i} - X \right\|^{2} \right]$$
    (2)
  • Step 3 The bat algorithm is a swarm-based algorithm that uses a population of agents (bats) to conduct the search. It can find the best regularization parameter C and kernel parameter σ based on the SVM’s performance, within the SVM parameter limits. Each agent has a current position,

    $$X_{i} = (X_{i,1}, X_{i,2}, \ldots, X_{i,Dim})^{\top}$$
    (3)
  • Step 4 Current flying velocity,

    $$Velocity_{i} = (Velocity_{i,1}, Velocity_{i,2}, \ldots, Velocity_{i,Dim})^{\top}$$
    (4)

    where Dim is the problem dimension. Each agent alters its position and velocity according to the following equations to determine the best position:

  • Step 5

    $$Freq_{i} = Freq_{Minimum} + (Freq_{Maximum} - Freq_{Minimum}) \cdot \beta$$
    (5)
  • Step 6

    $$Velocity_{i,j}^{T} = Velocity_{i,j}^{T-1} + (X_{i,j}^{T-1} - XBest_{j}) \cdot Freq_{i}$$
    (6)
  • Step 7

    $$X_{i,j}^{T} = X_{i,j}^{T-1} + Velocity_{i,j}^{T}$$
    (7)

    where β ∈ [0,1] is a randomly generated vector and XBest is the group's best solution found so far. The quality of a solution is decided solely by the fitness function. The fitness function used in this model is the precision of the SVM after it is trained on the dataset described by the bat's location. When a bat approaches its target, its loudness (Li) decreases and its pulse emission rate (PEi) increases [17].

  • Step 8

    $$L_{i}^{T+1} = \alpha \cdot L_{i}^{T}$$
    (8)
  • Step 9

    $$PE_{i}^{T+1} = PE_{i}^{0} \cdot \left[ 1 - e^{-\gamma T} \right]$$
    (9)

    where α (0 < α < 1) and γ (γ > 0) are constants. Yang uses random walks to enhance the exploration of the search space:

  • Step 10

    $$X_{fresh} = X_{Previous} + \delta \cdot PE_{Time}^{*}$$
    (10)

where δ ∈ [−1, 1] is a random number and \(PE_{Time}^{*}\) is the average loudness of all bats at the current time step.
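The update rules in Eqs. (5) to (10) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the function names (`bat_step`, `loudness_update`, `pulse_rate_update`, `local_walk`) are hypothetical, β is drawn as one scalar per bat, and positions are kept continuous.

```python
import math
import random


def bat_step(x, v, x_best, freq_min, freq_max, rng):
    """One position/velocity update of a single bat, following Eqs. (5)-(7)."""
    beta = rng.random()  # assumption: one scalar beta in [0, 1) per bat
    freq = freq_min + (freq_max - freq_min) * beta  # Eq. (5)
    # Eq. (6): the velocity is pulled by the offset from the global best
    v_new = [vj + (xj - bj) * freq for vj, xj, bj in zip(v, x, x_best)]
    # Eq. (7): the new position is the previous position plus the new velocity
    x_new = [xj + vj for xj, vj in zip(x, v_new)]
    return x_new, v_new, freq


def loudness_update(loudness, alpha):
    """Eq. (8): the loudness decays geometrically with 0 < alpha < 1."""
    return alpha * loudness


def pulse_rate_update(pe0, gamma, t):
    """Eq. (9): the pulse emission rate rises from 0 toward pe0 as t grows."""
    return pe0 * (1.0 - math.exp(-gamma * t))


def local_walk(x_prev, mean_loudness, rng):
    """Eq. (10): random walk scaled by the average loudness of all bats."""
    delta = rng.uniform(-1.0, 1.0)
    return [xj + delta * mean_loudness for xj in x_prev]
```

Note that a bat sitting exactly on the global best keeps its velocity unchanged in Eq. (6), because the difference term vanishes.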

In some respects, BA is similar to the well-known PSO. Both methods characterize a solution by a particle’s position; each swarm member has its own velocity and fitness, and these are updated at every iteration according to the fitness value. Some notable differences exist, since BA performs exploitation using random walks and by adjusting the loudness and pulse rate. In PSO, exploitation is governed by the global and personal best solutions, while exploration is managed using two learning factors [18].

3.3 Bat Feature Selection

To increase the accuracy of the classifier, bat-based feature selection attempts to improve the feature subset at each step. The essential elements of this optimization algorithm are [19] [20]:

  • Each bat has a location that represents a subset of all features. The bat trains and tests the SVM classifier using this subset.

  • After all bats have been evaluated, the fitness value of the swarm’s best solution is determined.

  • As the optimization proceeds, each bat updates its location, pulse rate, and frequency.

  • The bat randomizes the solution using Lévy flights, and the position is binarized with the sigmoid function S(Xij).

  • The frequency and velocity of the bat are updated if there is no significant improvement in the fitness value.

  • Finally, if the new fitness value outperforms the global best, the bat’s pulse rate is increased, its loudness is decreased, and the global optimum is updated.
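The sigmoid-based binarization mentioned in the list above can be illustrated as follows. This is a hedged sketch that assumes the common binary bat convention in which a feature is selected when a uniform random number falls below the sigmoid of its velocity; the helper names `binarize` and `selected_features` are hypothetical.

```python
import math
import random


def sigmoid(value):
    """Numerically stable logistic function S(v) = 1 / (1 + exp(-v))."""
    if value >= 0:
        return 1.0 / (1.0 + math.exp(-value))
    z = math.exp(value)
    return z / (1.0 + z)


def binarize(velocity, rng):
    """Map a real-valued velocity vector to a 0/1 feature mask.

    Assumption: bit j is set to 1 when a uniform random draw is below
    S(velocity[j]), so large positive velocities select the feature.
    """
    return [1 if rng.random() < sigmoid(vj) else 0 for vj in velocity]


def selected_features(mask, names):
    """Return the feature names whose mask bit is 1."""
    return [name for bit, name in zip(mask, names) if bit == 1]
```

A mask produced this way describes the bat's location as a corner of the binary hypercube, and the corresponding feature subset is what would be passed to the classifier.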

3.4 The General Bat Algorithm for Feature Selection

The following algorithm outlines feature subset selection with the bat algorithm (Fig. 4).

  1. Step 1

    Input: PS (population size), Maximum Threshold (maximum number of iterations).

  2. Step 2

    Output: the best solution XBest and its fitness fBat(XBest).

    Define the pulse frequency Freqi, pulse emission rate PEi, and loudness Li of each bat.

    Initialize the position Xi and evaluate fBat(Xi).

  3. Step 3

    While Time < Maximum Threshold Do

  4. Step 4

    For each bat i = 1 to PS Do

    Generate a new solution by updating the frequency, velocity, and position.

  5. Step 5

    If rand > PEi Then

    Select the XBest solution.

    Generate a local solution around it.

  6. Step 6

    End If

  7. Step 7

    If rand < Li and fBat(Xi) < fBat(XFresh) Then

    Accept the new solution, decrease Li, and increase PEi.

  8. Step 8

    End If

  9. Step 9

    End For

    Time = Time + 1

    End While

  10. Step 10

    Return XBest and fBat(XBest).

  11. Step 11

    End

Fig. 4 Flowchart of bat algorithm
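The loop of Sect. 3.4 can be sketched end to end in Python. The code below is an illustration under several assumptions rather than the implementation evaluated in this paper: `toy_fitness` is a hypothetical stand-in for the SVM score (it rewards three arbitrarily chosen features and penalizes subset size), the local search flips one random bit of the current best mask, the default parameter values are illustrative, and the fitness is maximized.

```python
import math
import random


def sigmoid(value):
    """Numerically stable logistic function."""
    if value >= 0:
        return 1.0 / (1.0 + math.exp(-value))
    z = math.exp(value)
    return z / (1.0 + z)


def toy_fitness(mask):
    """Hypothetical stand-in for SVM accuracy on the selected features."""
    informative = {0, 3, 5}  # assumption: only these features help
    hits = sum(1 for j, bit in enumerate(mask) if bit and j in informative)
    return hits - 0.1 * sum(mask)  # small penalty per selected feature


def binary_bat_search(fitness, dim, n_bats=10, max_iter=40, freq_min=0.0,
                      freq_max=2.0, loudness0=1.0, pulse0=0.5, alpha=0.9,
                      gamma=0.1, seed=0):
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(dim)] for _ in range(n_bats)]
    vel = [[0.0] * dim for _ in range(n_bats)]
    fit = [fitness(p) for p in pos]
    loud = [loudness0] * n_bats
    pulse = [0.0] * n_bats  # Eq. (9) yields a zero emission rate at T = 0
    best_i = max(range(n_bats), key=fit.__getitem__)
    best, best_fit = pos[best_i][:], fit[best_i]

    for t in range(1, max_iter + 1):
        for i in range(n_bats):
            # Generate a new solution: frequency, velocity, binary position
            freq = freq_min + (freq_max - freq_min) * rng.random()
            vel[i] = [v + (x - b) * freq
                      for v, x, b in zip(vel[i], pos[i], best)]
            cand = [1 if rng.random() < sigmoid(v) else 0 for v in vel[i]]
            if rng.random() > pulse[i]:
                # Local search around the best solution (one bit flip)
                cand = best[:]
                j = rng.randrange(dim)
                cand[j] = 1 - cand[j]
            cand_fit = fitness(cand)
            if rng.random() < loud[i] and cand_fit > fit[i]:
                # Accept: lower the loudness, raise the pulse emission rate
                pos[i], fit[i] = cand, cand_fit
                loud[i] = alpha * loud[i]
                pulse[i] = pulse0 * (1.0 - math.exp(-gamma * t))
            if fit[i] > best_fit:
                best, best_fit = pos[i][:], fit[i]
    return best, best_fit
```

Calling `binary_bat_search(toy_fitness, dim=8)` returns a feature mask together with its fitness. In a real wrapper setting, `toy_fitness` would be replaced by a function that trains and scores the SVM on the features selected by the mask.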

4 Results and Discussions

4.1 Test Feature Set

The KDD’99 data set is the most commonly used data set for evaluating IDS. It is built on the data captured in the DARPA’98 IDS evaluation program. DARPA’98 contains approximately 8 GB of compressed raw (binary) TCP dump data from seven weeks of network traffic, which can be processed into about 5 million connection records of approximately 110 bytes each. The 14 days of test data include almost 2 million connection records. The KDD training dataset has about 4,950,000 single connection vectors, each of which contains 41 features and is labeled as either normal or an attack, with the specific attack type identified.

Table 1 Set of Features

4.2 Support Vector Machine

The Support Vector Machine is a binary classification algorithm. Table 1 lists the feature set used for SVM feature selection. The SVM finds an optimal hyperplane that splits the data into two classes. The algorithm determines the support vectors that define the hyperplane achieving the highest accuracy. The kernel selection and classifier design processes were applied to the network anomaly detection problem. The effect of the kernel type and the parameter values on the precision of intrusion classification by an SVM was tested, and classification accuracy was shown to vary with both. SVMs can detect intrusions with improved efficiency and reduce the number of false alarms when the input parameters are appropriately selected. The kernel function has a few advantages, such as fewer parameters to tune and strong nonlinear prediction capability.
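To make the dependence on the kernel parameter concrete, the RBF kernel of Eq. (2) can be evaluated directly. The sketch below assumes the standard form with the squared Euclidean distance between the two vectors; the function name `rbf_kernel` is chosen here for illustration and is not taken from a particular library.

```python
import math


def rbf_kernel(x_i, x, sigma):
    """RBF kernel: exp(-||x_i - x||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x_i, x))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))
```

Identical vectors give a kernel value of 1, and for a fixed pair of distinct vectors a smaller σ gives a smaller value, which is one reason the classification accuracy is sensitive to this parameter.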

4.3 Data Set

The KDD’99 data set is used to train and test the proposed model. It is a benchmark data set offered to designers of intrusion detection systems. The data set contains 41 features that are grouped into four types:

  a. Basic Features: extracted from the packet header without inspecting the payload.

  b. Content Features: obtained by evaluating the payload of the original TCP packets using domain knowledge.

  c. Time-based Traffic Features: capture connections with the same characteristics between source and destination during the last two seconds.

  d. Host-based Traffic Features: computed over a window of a number of connections rather than a time window.

4.3.1 In the KDD Cup data set, all attacks are categorized as follows:

  a. Denial of Service (DoS): the intruder denies legitimate users access to a service.

  b. Probes: the intruder scans computer networks to collect details or identify vulnerabilities for possible attacks.

  c. Remote to Local (R2L): the attacker attempts to gain local user access from a remote machine.

  d. User to Root (U2R): an attacker with a standard user account gains access to the root account.
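As an illustration of this grouping, the attack labels found in the KDD’99 training data can be mapped to the four categories with a simple lookup table. The dictionary name `ATTACK_CATEGORY` and the helper `categorize` are hypothetical; the sketch assumes that raw labels may carry a trailing period, as they do in the distributed data files.

```python
# Attack labels of the KDD'99 training data grouped by category
ATTACK_CATEGORY = {
    "back": "dos", "land": "dos", "neptune": "dos",
    "pod": "dos", "smurf": "dos", "teardrop": "dos",
    "ipsweep": "probe", "nmap": "probe",
    "portsweep": "probe", "satan": "probe",
    "ftp_write": "r2l", "guess_passwd": "r2l", "imap": "r2l",
    "multihop": "r2l", "phf": "r2l", "spy": "r2l",
    "warezclient": "r2l", "warezmaster": "r2l",
    "buffer_overflow": "u2r", "loadmodule": "u2r",
    "perl": "u2r", "rootkit": "u2r",
}


def categorize(label):
    """Map a raw connection label to normal, dos, probe, r2l, or u2r."""
    name = label.strip().rstrip(".").lower()  # drop the trailing period
    if name == "normal":
        return "normal"
    # Labels outside the training set are reported as unknown
    return ATTACK_CATEGORY.get(name, "unknown")
```

Collapsing the labels in this way turns the multi-label records into a five-class problem, or into a binary normal-versus-attack problem if all four attack categories are merged.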

4.4 Performance Evaluation

Feature selection is evaluated using different metrics, including accuracy, precision, recall, and F1-score. These metrics are defined in terms of True Positive (Tp), False Positive (Fp), False Negative (Fn), and True Negative (Tn) counts.

  • True Positive (Tp): attack packets correctly identified as attacks.

  • False Positive (Fp): normal transmissions incorrectly flagged as intrusions.

  • True Negative (Tn): normal transmissions correctly identified as normal.

  • False Negative (Fn): attack packets incorrectly accepted as normal.

Accuracy measures the proportion of cases correctly recognized as normal or attack, as defined in the following equation:

\(Accuracy=\frac{T_p+T_n}{T_p+F_p+F_n+T_n}\)

Precision relates true-positive and false-positive cases, as given in the following formula:

\(Precision=\frac{T_p}{T_p+F_p}\)

Recall relates true-positive and false-negative cases, as given in the following formula:

\(Recall=\frac{T_p}{T_p+F_n}\)

The F1-score is the harmonic mean of recall and precision. It is defined as follows:

\(F1\text{-}Score=\frac{2 \times Precision \times Recall}{Precision+Recall}\)

Precision and recall alone may not be sufficient for performance evaluation. If a system has a low recall but a high precision, an additional criterion is needed, and the F1-score addresses this issue.
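The four metrics can be computed directly from the confusion counts. The following minimal sketch uses a hypothetical function name, `detection_metrics`, and returns 0.0 for any ratio whose denominator is zero, which is a convention assumed here rather than one stated in the paper.

```python
def detection_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from confusion counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

For example, with Tp = 80, Fp = 10, Fn = 20, and Tn = 90, the accuracy is 170/200 = 0.85, the precision is 80/90, the recall is 80/100, and the F1-score is 160/190.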

4.5 Results and Analysis

All tests were run on an Intel Core™ i7 processor with 8 GB of RAM under Windows 10. The following input parameters were used for the Bat algorithm:

  • Maximum loudness A0 = 10.

  • Minimum pulse rate r0 = 0.9.

  • Freqmin = 0.8 and Freqmax = 1.0.

  • α = 0.9 and γ = 0.1.
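To give a sense of what these settings imply, the loudness and pulse rate schedules of Eqs. (8) and (9) can be tabulated for the listed values. The sketch below assumes, for illustration only, that the loudness is reduced at every iteration (in the algorithm it is reduced only when a new solution is accepted); the function names are hypothetical.

```python
import math

A0 = 10.0     # initial (maximum) loudness
R0 = 0.9      # pulse rate parameter r0
ALPHA = 0.9   # loudness decay factor
GAMMA = 0.1   # pulse rate growth factor


def loudness_at(t):
    """Loudness after t consecutive reductions: A0 * alpha^t."""
    return A0 * ALPHA ** t


def pulse_rate_at(t):
    """Pulse emission rate at iteration t: r0 * (1 - exp(-gamma * t))."""
    return R0 * (1.0 - math.exp(-GAMMA * t))
```

Under this assumption the loudness drops below 1 after 22 reductions, and the pulse rate starts at 0 and approaches r0 = 0.9 from below as the iteration count grows.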


Table 2 Test results

Fig. 5 Performance analysis of different feature sets with SVM approach

SVM gives 86.45% precision with the full set of 41 features. Feature selection with the bat algorithm led to greater accuracy. The test results and the output for each feature set are given in Table 2 and Fig. 5. The proposed approach was compared with other feature selection methods, such as PSO, Ant Colony Optimization (ACO), and Artificial Bee Colony (Table 3; Fig. 6). The results in the table indicate that the suggested framework achieves significant improvements in detection rate and false alarm rate (FAR).

Table 3 Comparison with other IDS

Fig. 6 Performance analysis of dataset with different SVM approach

4.6 Experimental Results

The experiment measured the time needed to build a model as a function of the feature set size. The detection ratio of the model was evaluated on the same dataset used for development. The results show that the detection ratio changes only slightly when the feature set size decreases, whereas the search time and the model building time vary greatly. Figure 7 shows the classification characteristics, and Fig. 8 shows the True Positive Rate and False Positive Rate. Figures 9, 10, 11, and 12 compare the suggested technique with other techniques.

Fig. 7 SVM classification with 41 features

Fig. 8 True positive and false positive rate for 41 features

Fig. 9 Comparison chart with various IDS

Fig. 10 Feature size vs. detection ratio

Fig. 11 Feature size vs. search time

Fig. 12 Feature size vs. building time

5 Conclusions

The main aim of this article is to identify the appropriate features for an IDS, since eliminating irrelevant attributes is one of the most challenging tasks in intrusion detection. The feature selection approach reduces the dataset attributes while improving the classifier’s prediction performance, detection accuracy, and false alarm rate. To assess the suggested model’s efficiency, the KDDCup99 IDS benchmark data set has been used, and the presented algorithm achieves positive performance for different datasets. The proposed Bat algorithm with SVM forms a wrapper system that selects the appropriate features from the 41 available features. The KDDCup99 tests showed that the attack detection accuracy and false alarm rate of the suggested method are better than those of PSO and ABC.