1 Introduction

The emergence of wireless networking relies significantly on self-organized, multi-hop network environments. A wireless sensor network (WSN) aggregates a huge number of sensor nodes through wireless communication and is characterized by simple, low-cost network deployment [1]. It is extensively adopted in real-time environments such as military exploration, modern logistics, and environment perception, where the connected sensor nodes work collaboratively to detect, monitor, and track malicious nodes or intruders over the network [2]. Specifically, WSN-based intrusion detection systems are used to handle security issues encountered during post-disaster rescue, region monitoring, and border patrol, and have become a general field of modern research. Such systems need constant monitoring and tracking methods for predicting intrusions, and the design must deal with these multi-objective constraints to attain high-quality, persistent handling of the intruder [3].

Present investigations of intrusion detection fall into two categories. The first performs trace prediction and accurate localization of the target by adaptively sensing information from diverse nodes, based on local voting and decision fusion approaches [4]. The second relies on movement and deployment strategies for sensor nodes (SNs) to attain enhanced dynamic target coverage. It is considered an extension of the conventional coverage optimization problem and is the specific concern of this work [5]. The coverage quality is drastically influenced by the initial deployment of the SNs. However, owing to hostile or remote sensing environments, for example in region monitoring or border patrol, sensor deployment is not handled manually in most real-time settings [6]. Therefore, the sensors are usually scattered from aircraft; moreover, the landing position cannot be controlled owing to wind and to obstacles such as mountains and trees. Consequently, certain sub-areas lack appropriate sensor coverage: sensors are missing from some regions, and coverage holes (regions that do not fall within any sensor's coverage) appear in others [7].

Generally, it is crucial to resolve these issues, and adding sensors for predicting intrusions can be attained only with the adoption of miniaturized robots and embedded hardware. Some sensors possess the same sensing capability as static sensors but can also move towards appropriate locations to offer optimal coverage after node deployment [8]. Regrettably, these nodes are not capable of tracking and predicting intruders to enhance the coverage quality. The situation is made worse by the emergence of anti-reconnaissance methods used by intruders in real-world environments. Such an intruder is equipped with sensing devices, obtains location information about the detection nodes, and plans its movements to evade the detection process. These intruders are referred to as 'empowered intruders'; they differ from ordinary intruders, and their ability to evade SN tracking makes them hard to handle. Thus, the design of effective intrusion detection approaches for these sorts of intruders is a challenging task [9].

Conventional intrusion detection approaches for region monitoring or border patrol rely on a centralized network architecture. The detection nodes or intermediate nodes transfer information to the cluster nodes or base station, which takes the necessary action after processing or analyzing the information [10]. This method necessitates recurrent interaction among the cluster nodes, base station, and detection nodes. It occupies a large number of network nodes and increases the network's transmission delay. It therefore leads to delayed handling of events such as interruptions or intruder prediction [11]. Consequently, the conventional centralized framework is inappropriate for some real-time scenarios, specifically those involving highly capable intruders. The nodes have to maintain records of the process to perform local computation and trajectory tracking in the real-time environment [12]. Moreover, the nodes do not possess sufficient capability to deal with these problems.

In the modern era of computational intelligence, various non-classical approaches work like human beings, learning tasks from observations or data [13]. Such intelligent systems possess characteristics that make them feasible to adopt in constructing effective models in diverse fields. These features include fault tolerance, high computational speed, error resilience, and adaptability when modelling noisy information [14]. This research work considers Fuzzy Logic (FL), an intelligence technique inspired by the way the human brain handles uncertainty. It is also regarded as a logic or rule-based system that tolerates uncertainty and imprecision. Thus, it performs rule-based classification effectively. However, it is not self-adaptive, which makes it a candidate for optimization. Here, Particle Swarm Optimization (PSO) is considered, which is popular for handling multi-objective constraints and provides global optimization ability together with the Genetic Algorithm (GA). Thus, this work models a novel Fuzzy Genetic Algorithm with Multi-Objective Particle Swarm Optimization (MOPSO) that maximizes the detection accuracy, minimizes false alarms, and reduces computational complexity. The proposed model is tested and validated, and the results show superior accuracy, a lower false alarm rate (FAR), and improved classification accuracy for certain attacks. The features are chosen and analyzed using Principal Component Analysis (PCA). The data are taken from the publicly available NSL-KDD dataset. The simulation is carried out in the MATLAB environment using metrics such as accuracy, precision, and FAR.

The structure of the work is as follows: Sect. 2 comprises an in-depth survey of various existing approaches related to IDS, along with their associated pros and cons. Section 3 elaborates on the methodology in a broader sense, focusing on gaining insight into the prediction model. In Sect. 4, the discussion revolves around the results obtained from model evaluation, presented graphically. Finally, Sect. 5 presents the conclusion of the work, along with suggestions for future improvements.

2 Related Works

This section reviews recent developments in IDS, covering the taxonomy of data, current research ideas, and the classification systems used for prediction. It offers a comprehensive and structured overview of prevailing IDS approaches and highlights key factors in anomaly detection.

Osanaiye et al. [15] discuss signature-based IDS (SIDS), which uses pattern matching to detect known attacks. It is also termed misuse detection or knowledge-based detection. With this model, matching approaches are utilized to identify intruders: when an intrusion signature matches a signature already present in the signature database, an alarm is triggered. In SIDS, the host logs are inspected to find sequences of commands or actions previously identified as malware. Li et al. [16] discuss conventional approaches that inspect network packets and attempt to match them against signature databases. However, these approaches are unable to detect attacks that span several packets. Because modern malware is sophisticated and spread over multiple packets, it is essential to extract signature information across packets, which requires the IDS to recall the content of earlier packets. Generally, there are diverse methods for creating IDS signatures, such as state machines, semantic conditions, and formal-language string patterns.

Zhou et al. [17] discuss the significant benefit of anomaly-based IDS in detecting zero-day attacks, owing to the fact that the detection of abnormal user behaviour does not depend on a signature database. An alarm is raised when the observed behaviour deviates from the usual characteristics. This approach has several advantages. First, it is capable of detecting internal malicious activity: when an intruder performs transactions from a stolen account that do not match the user's typical activity, an alarm is triggered. Second, because the system is built from customized profiles, it is extremely difficult for a cyber-criminal to learn what a user's normal behaviour is without raising an alert. Almomani et al. [18] discuss the categories of such IDS methods, namely statistics-based, knowledge-based, and machine-learning-based approaches. The statistics-based model involves collecting and examining data records over a set of items and constructing a statistical model of normal user behaviour. The knowledge-based model attempts to identify the essential activities from existing system data such as network traffic instances and protocol specifications. Machine-learning approaches, in turn, learn complex pattern matching from training data.

Ioonnou et al. [19] discuss various machine learning approaches. Machine learning is the process of extracting knowledge from huge amounts of data. A model is composed of a set of rules, methods, and complex transfer functions that are used to discover essential data patterns and to predict or examine behaviour. Learning approaches are widely used in the field of IDS. Various techniques and algorithms such as neural networks (NN), decision trees (DT), clustering, association rules, GA, and k-nearest neighbours (k-NN) are adopted for learning knowledge from intrusion datasets. Ghosal et al. [20] discuss an approach to feature selection that integrates correlation attribute evaluation and Information Gain. The authors validate the selected features by applying diverse classification approaches such as naive Bayes (NB), C4.5, NB-tree, and multi-layer perceptron (MLP). Almomani et al. [21] apply a genetic-fuzzy rule mining approach to evaluate the significance of IDS features. Ke et al. [22] discuss IDS with the adoption of Random Forest to enhance the prediction accuracy, and Arun et al. [23] discuss how to reduce the FAR. Khraisat et al. [24] propose a classification approach using the NSL-KDD dataset with a DT algorithm, design a model with certain metrics, and examine the significance of DT approaches.

Ali et al. [25] discuss a classifier known as the Support Vector Machine (SVM), which is defined by a separating hyperplane. It adopts a kernel function to map the training data into a high-dimensional space, so that intrusions can be classified in a linear manner. It is well known for its generalization ability and is notably valuable when the number of attributes is large and the number of data points is small. Various kinds of separating hyperplanes are obtained with kernel functions such as the linear, polynomial, Gaussian radial basis function (RBF), and hyperbolic tangent kernels. In IDS datasets, some features are redundant and have little influence on separating the data points into the appropriate classes. Thus, feature selection is carried out during SVM training. SVM is also adopted for classification into multiple classes. Buczak et al. [26] describe an SVM with an RBF kernel that is used for categorizing the KDD'99 dataset into pre-defined classes. From the 41 attributes provided, the feature subset is carefully chosen using feature selection approaches.

Peng et al. [27] describe the k-NN classifier, a typical non-parametric classifier used in ML. The concept behind this approach is to assign an unlabelled data sample to the class of its k nearest neighbours. Here, 'k' is an integer that specifies the number of neighbours; generally, k = 5 in most cases. Let 'X' denote the unlabelled data instance that needs to be categorized. Of its five nearest neighbours, three belong to the intrusion class and two to the normal class. With majority voting, 'X' is therefore assigned to the intrusion class. Ibrahim et al. [28] propose a novel fuzziness-based semi-supervised learning model that uses unlabelled samples along with a supervised learning algorithm to improve IDS classifier performance [29, 30]. The SH-FFNN model is trained to output a fuzzy membership vector, and the unlabelled samples are categorized by their fuzziness (high, mid, and low). The classifier is then re-trained after incorporating each category separately into the original training set. The experimental results for semi-supervised intrusion detection on the NSL-KDD dataset show that unlabelled samples with high and low fuzziness contribute most to improving the IDS prediction accuracy in contrast to conventional approaches.

This section has presented a detailed review of various IDS methods, their types, and their methodologies, together with their significant advantages and constraints. Various machine learning approaches are used for predicting malicious activities and intruders over sensor networks. However, some of these approaches have constraints in generating and updating information about newer attacks, and they yield a high FAR or low accuracy. The results and methods are summarized, and contemporary models are explored in terms of the performance enhancements they bring to IDS.

3 Methodology

Here, a detailed discussion is provided to validate the performance of the proposed fuzzy genetic algorithm and MOPSO model. Preliminary steps such as data acquisition, feature selection, and classification are performed to identify intrusions over the network. The detection framework is shown in Fig. 1.

Fig. 1 Intrusion detection framework

3.1 Dataset Description

In this context, the NSL-KDD dataset is employed. A 20% subset of the training data, totalling 25,192 instances, serves as training data, while the remaining samples, totalling 22,544 instances, constitute the testing dataset. The dataset comprises 42 attributes, 41 of which are classified into four distinct classes:

  1. Basic (B) characteristics: TCP/IP connection attributes utilized in identifying delays.

  2. Traffic (T) characteristics: These attributes pertain to window intervals and encompass two prominent features, namely, same service and same host. The service feature evaluates the overall number of connections sharing the same services within a specific time frame.

  3. Host (H) characteristics: These attributes are assigned to assess attacks lasting for 2 s, scrutinizing the overall connections directed towards the destination during this duration.

  4. Content (C) characteristics: These attributes, informed by domain expertise, are suggested based on moment intervals.

This dataset encompasses four distinct traffic categories, each associated with 23 types of attacks, along with various features:

  1. Denial of Service (DoS): Attackers monopolize network resources, rendering them unavailable to legitimate users.

  2. User-to-Root (U2R): Attackers intercept passwords and exploit vulnerabilities on hosts to gain unauthorized access as legitimate users.

  3. Remote-to-Local (R2L): Attackers transmit messages from remote locations to hosts, exploiting vulnerabilities in the process.

  4. Probe: Attackers scan the network to gather information, leading to network breaches.

Tables 1 and 2 detail the dataset's records, labels, and attributes from the NSL-KDD dataset, while Table 3 delineates the four distinct attack categories.

Table 1 Dataset records
Table 2 Dataset labels and attributes
Table 3 Classifications of breaches

3.2 Feature Selection Using Principal Component Analysis

PCA is a statistical approach applied in various applications such as image compression, face recognition, and image processing. It is a common approach for finding patterns in high-dimensional data. It operates on a large dataset and analyzes the relationships among the individual data points (see Table 4). The objective of PCA here is to reduce the data dimensionality while retaining the variation present in the original NSL-KDD dataset. It identifies data patterns by expressing the similarities and differences within the dataset.

Table 4 Feature dimensionality reduction


Algorithm 1 The flow of PCA functionality
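
The PCA steps outlined above can be sketched as follows. This is a minimal illustration in Python with NumPy rather than the MATLAB environment used in this work; the function name `pca_reduce`, the synthetic six-feature matrix, and the choice of three retained components are assumptions made for the example and are not taken from the NSL-KDD experiments.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project the rows of X onto the top principal directions.

    Returns the projected data, the projection matrix, and the
    fraction of total variance explained by each kept component.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                 # centre every feature
    cov = np.cov(Xc, rowvar=False)          # feature covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh: symmetric input, ascending order
    order = np.argsort(eigvals)[::-1]       # re-sort by decreasing variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    components = eigvecs[:, :n_components]
    explained = eigvals[:n_components] / eigvals.sum()
    return Xc @ components, components, explained

# Synthetic stand-in for a feature matrix (assumption, not NSL-KDD data).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 6))
# Make one feature nearly redundant so that PCA has something to remove.
X_demo[:, 3] = 2.0 * X_demo[:, 0] + 0.01 * rng.normal(size=200)

Z, W, ratio = pca_reduce(X_demo, n_components=3)
print(Z.shape, ratio.round(3))
```

The projected features in `Z` are uncorrelated by construction, which is what allows the redundant attribute to be dropped without discarding most of the variance.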

3.3 Design of Fuzzy Genetic Algorithm

A classifier is the algorithm used to construct a classification model from the provided dataset in order to categorize the data. The behaviour of the model is governed by parameters such as the fuzzy sets, fuzzy rules, membership functions, and prioritization values. Generally, fuzzy logic lacks learning ability, which makes the optimization process more complex. Here, the fuzzy rules, membership functions, and fuzzy sets are optimized. The fuzzy rule set is specified by IF–THEN rules. The rule size depends on the number of features and is therefore determined by the dataset adopted. Moreover, to handle classification ignorance, the number of rules is constrained. Generally, membership functions and fuzzy sets are feature-dependent. The membership functions can be either trapezoidal or triangular. Three fuzzy sets are considered to reduce the computational complexity. The fuzzified input is mapped to the rule base through the inference process, which generates a fuzzified output for every applicable rule. The strength of each rule is computed using Eq. (1):

$$\alpha_{Ri} = \min \left\{ {\mu D_{1} \left( {d_{1} } \right), \mu D_{2} \left( {d_{2} } \right), \ldots ,\mu D_{n} \left( {d_{n} } \right)} \right\}$$
(1)

Here, \(\alpha_{Ri}\) is the strength of the \(i\)th fuzzy rule \(R_{i}\), \(n\) is the number of features, \(d_{1} , \ldots ,d_{n}\) are the input variables, \(\mu D_{i} \left( {d_{i} } \right)\) is the fuzzified membership degree of \(d_{i}\), and \(\mu D_{i}\) is the membership function of the fuzzy set \(D_{i}\). A single fuzzy value is allocated to each output. The final value associated with the output is obtained using the maximum operator, as expressed in Eq. (2):

$$\beta_{i} = \mathop {\max }\limits_{{\text{for all M}}} \left\{ {\alpha_{Ri} } \right\}$$
(2)

Here, \(\beta_{i}\) is the maximal value over all fuzzy rules, \(\alpha_{Ri}\) is the fuzzy rule strength, and \(M\) is the total number of fuzzy rules. The defuzzification process evaluates the centroid and transforms the fuzzy output into a crisp value using the fuzzy rules. It is expressed in Eq. (3):

$${\text{Output}} = \frac{{\mathop \sum \nolimits_{i = 1}^{n} \left( {\alpha_{Ri} * \mu D_{i} \left( {d_{i} } \right)} \right)}}{{\mathop \sum \nolimits_{i = 1}^{n} \alpha_{Ri} }}$$
(3)

Here, \(\alpha_{Ri} * \mu D_{i} \left( {d_{i} } \right)\) is the contribution of the \(i\)th rule to the defuzzified output, and \(n\) in this equation is the total number of fuzzy rules. The parameters are tuned with the Genetic Algorithm, and the resulting model is used for predicting and classifying attacks. Algorithm 2 illustrates the genetic fuzzy algorithm.

Algorithm 2 Genetic fuzzy algorithm
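
The inference steps of Eqs. (1) to (3) can be illustrated with a small Python sketch. The three triangular fuzzy sets, their breakpoints, the two-feature rule base, and the crisp output value attached to each rule are all hypothetical choices made for this example; in particular, the defuzzification below is a strength-weighted average of per-rule output values in the spirit of Eq. (3), not necessarily the exact form used in this work.

```python
# Three triangular fuzzy sets per normalized feature (low, medium, high).
# The breakpoints are illustrative assumptions.
FUZZY_SETS = [(-0.5, 0.0, 0.5), (0.0, 0.5, 1.0), (0.5, 1.0, 1.5)]

# Hypothetical rule base:
# (fuzzy-set index for each feature, class label, crisp output value).
RULES = [
    ((0, 0), "normal", 0.0),
    ((1, 0), "normal", 0.0),
    ((1, 2), "attack", 1.0),
    ((2, 2), "attack", 1.0),
]

def triangular(x, a, b, c):
    """Triangular membership function with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def rule_strength(antecedent, sample):
    """Eq. (1): minimum membership degree over all features."""
    return min(triangular(x, *FUZZY_SETS[s]) for s, x in zip(antecedent, sample))

def infer(sample):
    """Return per-class aggregated strengths and a crisp output."""
    strengths = [rule_strength(ante, sample) for ante, _, _ in RULES]
    beta = {}
    for (_, label, _), alpha in zip(RULES, strengths):
        # Eq. (2): keep the maximum strength among rules of the same class.
        beta[label] = max(beta.get(label, 0.0), alpha)
    total = sum(strengths)
    crisp = None  # no rule fires: the sample is left unclassified
    if total > 0:
        # Strength-weighted average of the rule outputs (cf. Eq. (3)).
        crisp = sum(a * out for a, (_, _, out) in zip(strengths, RULES)) / total
    return beta, crisp

beta, crisp = infer((0.9, 0.8))
print(beta, crisp)
```

A sample for which no rule fires returns `None` as its crisp output, which corresponds to the unclassified case that a constrained rule base has to guard against.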

The genetic algorithm encodes the fuzzy rules, and the chromosomes are modelled to encode the rule base. Each fuzzy rule is specified as an integer array whose size equals the number of features chosen from the NSL-KDD dataset. The encoding specifies, for each dataset feature, the membership function used in the chosen rule base. The fitness of an encoded chromosome is evaluated from the fuzzy sets and the chromosome, based on the classification error. The fitness and the error are expressed in Eqs. (4) and (5):

$${\text{fitness}} = \frac{1}{{\text{classification error}}}$$
(4)
$${\text{Error}} = 2E^{2} + E + 1$$
(5)

Here, \(E\) is the percentage of incorrectly classified records, so the classification error is a quadratic function of \(E\). Roulette wheel selection is used to select the parents for reproduction. During reproduction, crossover is applied to randomly paired chromosomes. The chromosomes have a fixed length. Random mutation is applied with a given mutation probability. Elitism preserves the best solution and helps to construct the successive generation: the older population is replaced, and the fittest candidates are carried over into the next generation. The chromosomes collaborate as a set of \(K\) rules to predict the category of the attack. Figure 2 illustrates the flow diagram of the proposed MOPSO.

Fig. 2 The proposed algorithm framework
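
The genetic operators described above (the fitness of Eqs. (4) and (5), roulette wheel selection, crossover, mutation, and elitism) can be sketched in Python as follows. The sketch treats \(E\) as a fraction in [0, 1], uses single-point crossover, and replaces the fuzzy classifier with a toy error function that counts mismatches against a hypothetical target rule; these choices, along with all parameter values and function names, are assumptions for illustration only.

```python
import random

def classification_error(E):
    """Eq. (5): quadratic error term, with E the misclassification rate."""
    return 2 * E ** 2 + E + 1

def fitness(E):
    """Eq. (4): fitness is the reciprocal of the classification error."""
    return 1.0 / classification_error(E)

def roulette_select(population, fitnesses, rng):
    """Pick one chromosome with probability proportional to its fitness."""
    pick = rng.uniform(0, sum(fitnesses))
    running = 0.0
    for chrom, fit in zip(population, fitnesses):
        running += fit
        if running >= pick:
            return chrom
    return population[-1]

def crossover(a, b, rng):
    """Single-point crossover on two fixed-length chromosomes."""
    point = rng.randint(1, len(a) - 1)
    return a[:point] + b[point:], b[:point] + a[point:]

def mutate(chrom, rate, n_sets, rng):
    """Replace each gene by a random fuzzy-set index with probability rate."""
    return [rng.randrange(n_sets) if rng.random() < rate else g for g in chrom]

def evolve(error_of, n_genes=8, n_sets=3, pop_size=20,
           generations=30, mutation_rate=0.05, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randrange(n_sets) for _ in range(n_genes)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        fits = [fitness(error_of(c)) for c in pop]
        history.append(max(fits))
        elite = pop[max(range(pop_size), key=fits.__getitem__)]
        nxt = [list(elite)]  # elitism: the best chromosome survives unchanged
        while len(nxt) < pop_size:
            p1 = roulette_select(pop, fits, rng)
            p2 = roulette_select(pop, fits, rng)
            for child in crossover(p1, p2, rng):
                if len(nxt) < pop_size:
                    nxt.append(mutate(child, mutation_rate, n_sets, rng))
        pop = nxt
    fits = [fitness(error_of(c)) for c in pop]
    history.append(max(fits))
    return pop[max(range(pop_size), key=fits.__getitem__)], history

TARGET = [0, 1, 2, 0, 1, 2, 0, 1]  # hypothetical rule used only by the toy error

def toy_error(chrom):
    """Fraction of genes that differ from TARGET (stand-in for E)."""
    return sum(g != t for g, t in zip(chrom, TARGET)) / len(TARGET)

best, history = evolve(toy_error)
print(best, round(history[-1], 3))
```

Because the elite chromosome is copied unchanged into each new generation, the best fitness recorded in `history` never decreases. Note that with \(E\) in [0, 1] the fitness of Eq. (4) only ranges from 0.25 to 1, so the selection pressure of the roulette wheel is fairly mild.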

3.4 Multi-Objective Particle Swarm Optimization (MOPSO)

PSO is a bionic concept that originates from the behaviour of bird flocks; the basic idea is to find the optimal solution through information sharing and cooperation between the individuals in the group. The speed and position of each bird are treated as independent variables, and the food density corresponds to the objective function value. Each individual adjusts its speed and direction based on the difference between its current location and the best locations found so far by itself and by the population. The entire swarm thus moves towards the optimal location found by the population, so that the search converges to an optimal solution. The predominant benefits of PSO are:

  1. Strong global search capability and fast computational speed.

  2. It is not very sensitive to the population size, which has only a small effect on the training speed.

  3. There is no need to compute gradient information while optimizing the objective function, and there is no constraint on the connectivity, differentiability, convexity, or continuity of the feasible region of the objective function.

Multi-objective PSO aims to solve problems from various domains efficiently. It is conceptualized as a random search across a \(D\)-dimensional space that aims to optimize the objective function. The population consists of \(N\) particles; the \(i\)th particle has a \(D\)-dimensional position vector \(x_{i} = \left( {x_{i1} , x_{i2} , \ldots ,x_{iD} } \right)^{T}\), a velocity vector \(v_{i} = \left( {v_{i1} , v_{i2} , \ldots ,v_{iD} } \right)^{T}\), and a best position found so far \(p_{i} = \left( {p_{i1} , p_{i2} , \ldots ,p_{iD} } \right)^{T}\). For every particle, a fitness value is obtained by evaluating the fitness function, which is expressed in Eq. (6):

$$F \left( X \right) = \alpha \left( {1 - p} \right) + \left( {1 - \alpha } \right)\left( {1 - \frac{{N_{f} }}{{N_{t} }}} \right)$$
(6)

Here, \(\alpha\) is a hyper-parameter, \(p\) reflects the classifier performance, \(N_{f}\) is the size of the selected feature subset, and \(N_{t}\) is taken to be the total number of features. The search starts by initializing random particles in the \(D\)-dimensional space, and the optimal solution is determined via iteration. As the search proceeds, the best position found so far by particle \(i\), \(p_{i} = \left( {p_{i1} , p_{i2} , \ldots ,p_{iD} } \right)^{T}\), is its local optimal solution, and its velocity is \(v_{i} = \left( {v_{i1} , v_{i2} , \ldots ,v_{iD} } \right)^{T}\). The best position found by the whole swarm, \(p_{g} = \left( {p_{g1} , p_{g2} , \ldots ,p_{gD} } \right)^{T}\), is the global optimal solution. In every iteration, each particle updates its velocity and position by tracking these two optimal solutions \(\left( {p_{i} , p_{g} } \right)\). The update is expressed in Eqs. (7) and (8):

$$v_{id} \left( {t + 1} \right) = \omega v_{id} \left( t \right) + c_{1} r_{1} \left( {p_{id} \left( t \right) - x_{id} \left( t \right)} \right) + c_{2} r_{2} \left( {p_{gd} \left( t \right) - x_{id} \left( t \right)} \right)$$
(7)
$$x_{id} \left( {t + 1} \right) = x_{id} \left( t \right) + v_{id} \left( {t + 1} \right), {\text{where}}\; i = 1,2, \ldots ,N;\; d = 1,2, \ldots ,D$$
(8)

Here, \(N\) is the total number of particles in the population, \(D\) is the dimension of the search space, \(t\) is the current iteration, and \(\omega\) is a non-negative inertia factor that balances the local and global optimization capabilities: when its value is larger, the global optimization capability is stronger and the local optimization capability is weaker. \(v_{id} \left( t \right)\) and \(v_{id} \left( {t + 1} \right)\) are the current and updated particle velocities; \(c_{1}\) and \(c_{2}\) are acceleration factors, with \(c_{1} = c_{2} = 2\); \(r_{1}\) and \(r_{2}\) are random numbers that increase the randomness of the particles and prevent blind search. The particle positions and velocities are constrained to \(\left[ { - x_{\max } , x_{\max } } \right]\) and \(\left[ { - v_{\max } , v_{\max } } \right]\), respectively. The algorithm for multi-objective PSO is given in Algorithm 3:

Algorithm 3 Multi-objective PSO
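
The velocity and position updates of Eqs. (7) and (8), including the clamping to \(\left[ { - x_{\max } , x_{\max } } \right]\) and \(\left[ { - v_{\max } , v_{\max } } \right]\), can be sketched in Python as below. The sketch is a plain single-objective PSO that minimizes a toy sphere function; it does not implement the multi-objective handling of Algorithm 3 or the fitness of Eq. (6). Only \(c_{1} = c_{2} = 2\) comes from the text; the inertia value, bounds, swarm size, and function names are assumptions.

```python
import random

def clamp(value, limit):
    """Restrict value to the interval [-limit, limit]."""
    return max(-limit, min(limit, value))

def pso(objective, dim, n_particles=20, iterations=100,
        omega=0.7, c1=2.0, c2=2.0, x_max=5.0, v_max=1.0, seed=0):
    """Minimize objective with the updates of Eqs. (7) and (8)."""
    rng = random.Random(seed)
    x = [[rng.uniform(-x_max, x_max) for _ in range(dim)] for _ in range(n_particles)]
    v = [[rng.uniform(-v_max, v_max) for _ in range(dim)] for _ in range(n_particles)]
    p_best = [xi[:] for xi in x]              # personal best positions p_i
    p_val = [objective(xi) for xi in x]
    g = min(range(n_particles), key=p_val.__getitem__)
    g_best, g_val = p_best[g][:], p_val[g]    # global best position p_g
    history = [g_val]
    for _ in range(iterations):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Eq. (7): inertia + cognitive pull + social pull.
                new_v = (omega * v[i][d]
                         + c1 * r1 * (p_best[i][d] - x[i][d])
                         + c2 * r2 * (g_best[d] - x[i][d]))
                v[i][d] = clamp(new_v, v_max)
                # Eq. (8): move the particle, then keep it inside the bounds.
                x[i][d] = clamp(x[i][d] + v[i][d], x_max)
            val = objective(x[i])
            if val < p_val[i]:
                p_best[i], p_val[i] = x[i][:], val
                if val < g_val:
                    g_best, g_val = x[i][:], val
        history.append(g_val)
    return g_best, g_val, history

def sphere(position):
    """Toy objective with its minimum at the origin (assumption)."""
    return sum(c * c for c in position)

best_pos, best_val, history = pso(sphere, dim=3)
print(best_pos, best_val)
```

Since the global best is only replaced by a strictly better position, the values in `history` are non-increasing, which gives a simple way to inspect convergence.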

4 Results and Analysis of Data

This section presents the numerical results and discussion of the proposed MOPSO model. The simulation is conducted within the MATLAB environment, evaluating various performance metrics. The NSL-KDD dataset is utilized for training, testing, and validation in intrusion detection. The data prediction encompasses four distinct cases: True Positives (TP), True Negatives (TN), False Positives (FP), and False Negatives (FN), with their corresponding analyses provided below.

  1. TP: Indicates cases where both the predicted and actual labels are positive.

  2. FN: Denotes instances where the predicted label is negative despite the actual label being positive.

  3. TN: Represents scenarios where both the predicted and actual labels are negative.

  4. FP: Refers to situations where the predicted label is positive despite the actual label being negative.

Table 5 depicts the confusion matrix of the proposed model. Based on the above definitions, metrics such as the False Alarm Rate (FAR), accuracy, and Detection Rate (DR) are measured to evaluate the IDS scheme. They are discussed below:

Table 5 Confusion matrix
  1. Detection Rate (DR): The proportion of actual positive instances that are correctly predicted as positive. It serves as a coverage measure that assesses the classifier's predictive capability over all positive instances. This is given in Eq. (9):

    $$DR = \frac{TP}{{TP + FN}}$$
    (9)

  2. Accuracy: The proportion of correct predictions relative to the total number of samples. It serves as a measure of the overall accuracy rate over the classified samples. This is expressed in Eq. (10):

    $$Accuracy = \frac{TP + TN}{{TP + TN + FP + FN}}$$
    (10)

  3. False Alarm Rate (FAR): The proportion of actual negative instances that are incorrectly predicted as positive. It is expressed in Eq. (11):

    $$FAR = \frac{FP}{{TN + FP}}$$
    (11)
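
As a small worked example of Eqs. (9) to (11), the following Python sketch counts the four confusion-matrix cases for a made-up set of eight labels and computes the three metrics. The label vectors and function names are hypothetical and are not taken from the NSL-KDD results.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, TN, FP, and FN for a binary labelling."""
    tp = tn = fp = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == positive and predicted == positive:
            tp += 1
        elif actual == positive:
            fn += 1
        elif predicted == positive:
            fp += 1
        else:
            tn += 1
    return tp, tn, fp, fn

def detection_rate(tp, fn):
    """Eq. (9)."""
    return tp / (tp + fn)

def accuracy(tp, tn, fp, fn):
    """Eq. (10)."""
    return (tp + tn) / (tp + tn + fp + fn)

def false_alarm_rate(fp, tn):
    """Eq. (11)."""
    return fp / (tn + fp)

# Made-up labels: 1 = attack (positive), 0 = normal (negative).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print(tp, tn, fp, fn)                # 3 3 1 1
print(detection_rate(tp, fn))        # 0.75
print(accuracy(tp, tn, fp, fn))      # 0.75
print(false_alarm_rate(fp, tn))      # 0.25
```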

Table 6 compares the prediction accuracy and FAR of the proposed MOPSO with existing ML approaches. The accuracy of the proposed MOPSO is 98.86%, which is 12.06% higher than PSO with lightweight GBM, 13.36% higher than decision tree, 15.76% higher than logistic regression, 16.8% higher than NB, 17.46% higher than multi-layer perceptron, 17.41% higher than ANN, and 20.46% higher than EM clustering (see Fig. 3). Similarly, the FAR of MOPSO is 9.5, which is 1.1, 6.2, 8.9, 9, 11.6, 11.8, and 14.2 lower than that of the other approaches, respectively. Table 7 reports the total training and testing time on the NSL-KDD dataset in terms of elapsed time and CPU time. For training, the elapsed time is 11.52 s and the CPU time is 0.30 s. For testing, the elapsed time is 2.689 s and the CPU time is 0.035 s (see Fig. 4).

Table 6 Accuracy and FAR computation
Fig. 3 Accuracy and FAR comparison

Table 7 Total training and testing time (s)
Fig. 4 Training and Testing time evaluation

Table 8 shows further metrics, namely precision, recall, F1-score, and FAR, of the proposed MOPSO. For the normal category, the precision is 0.947, the recall is 0.995, the F1-score is 0.968, and the FAR is 0.015. For the attack category, the precision is 0.999, the recall is 0.987, the F1-score is 0.993, and the FAR is 0.007. The weighted averages of these metrics are 0.986, 0.987, 0.989, and 0.008, respectively (see Fig. 5). Table 9 reports the precision, recall, F1-score, and FAR for the attack categories DoS, probe, R2L, and U2R. For the DoS attack, the precision is 0.9940, the recall is 0.9790, the F1-score is 0.9860, and the FAR is 0.00450. For the probe attack, the precision is 0.8600, the recall is 0.8855, the F1-score is 0.9195, and the FAR is 0.5715. For the R2L attack, the precision is 0.6920, the recall is 0.9195, the F1-score is 0.7895, and the FAR is 0.00550. Lastly, for the U2R attack, the precision is 0.8880, the recall is 0.5715, the F1-score is 0.6965, and the FAR is 0.00002. The weighted averages of these metrics are 0.99, 0.9886, 0.9988, and 0.0996, respectively (see Fig. 6). The execution time (training and testing, in ms) of the proposed MOPSO is compared with PSO-lightweight GBM, DT, and logistic regression in Table 10. The training time of MOPSO is 95.4565 ms, which is 93.5735 ms, 5.0002 ms, and 124.1083 ms lower than that of the other approaches, respectively. The testing time of MOPSO is 2.5465 ms, a reduction of 0.505 ms, 2.3489 ms, and 9.7895 ms compared with the alternative approaches (see Fig. 7). Based on these metrics, the proposed model works efficiently for predicting intrusions over the network, with the lowest FAR and the highest prediction accuracy.

Table 8 Performance metrics comparison based on attack categories
Fig. 5 Performance metrics comparison based on attack categories

Table 9 Weighted average measure of attack categories
Fig. 6 Weighted average measure of attack categories

Table 10 Average execution time (ms)
Fig. 7 Average execution time (ms)

5 Conclusion

In this work, a novel Fuzzy Genetic Algorithm with Multi-Objective Particle Swarm Optimization model is designed for predicting normal and attack traffic, and its execution time is evaluated. It covers both minor and major attack categories, specifically the rare ones, in the NSL-KDD dataset. The model includes three essential steps, namely feature selection, classification, and optimization, which support the interpretation of the results and facilitate human understanding and data analysis. The proposed model is contrasted with several existing approaches. Experimental results illustrate that the proposed model effectively extracts an appropriate rule-based model from network traffic, largely benefiting from the assistance provided by MOPSO. Moreover, the performance metrics show how well the proposed model meets the objectives of exploitation and exploration, rule evolution, and detection of attack categories, with a superior detection rate and a lower FAR than the other approaches. Overall, the model attains 98.86% accuracy, 9.5% FAR, 99% precision, 98.86% recall, and 99.88% F1-score.

The effective classification and detection of normal network traffic and intrusion attacks offers considerable scope for future work. The improved approach could be applied to diverse complex problem domains such as DNA computation. Additionally, other optimization approaches are candidates for attaining superior accuracy in this domain.