Abstract
Feature Subset Selection (FSS) is an NP-hard problem of removing redundant and irrelevant features, particularly from medical data, and it can be effectively addressed by metaheuristic algorithms. However, existing binary versions of metaheuristic algorithms have issues with convergence and lack an effective binarization method, resulting in suboptimal solutions that hinder diagnosis and prediction accuracy. This paper proposes an Improved Binary Quantum-based Avian Navigation Optimizer Algorithm (IBQANA) for FSS in medical data preprocessing to address the suboptimal solutions arising from binary versions of metaheuristic algorithms. The proposed IBQANA's contributions include the Hybrid Binary Operator (HBO) and the Distance-based Binary Search Strategy (DBSS). HBO is designed to convert continuous values into binary solutions, even for values outside the [0, 1] range, ensuring accurate binary mapping. DBSS, in turn, is a two-phase search strategy that enhances the performance of inferior search agents and accelerates convergence. By combining exploration and exploitation phases based on an adaptive probability function, DBSS effectively avoids local optima. The effectiveness of applying HBO is compared with five transfer function families and thresholding on 12 medical datasets with feature counts ranging from 8 to 10,509. IBQANA's effectiveness is evaluated in terms of accuracy, fitness, and the number of selected features and compared with seven binary metaheuristic algorithms. Furthermore, IBQANA is applied to COVID-19 detection. The results reveal that the proposed IBQANA outperforms all comparative algorithms on COVID-19 and 11 other medical datasets. The proposed method presents a promising solution to the FSS problem in medical data preprocessing.
1 Introduction
During the past years, with the increasing volume of data, the role of data mining techniques in extracting patterns and ultimately discovering hidden knowledge has become more prominent. Knowledge discovery involves searching a large amount of data to identify hidden patterns and extract relevant insights [1]. Applying data mining techniques to medical data offers increasing benefits for medical diagnosticians and can minimize the potential errors of inexperienced physicians [2]. An accurate and reliable diagnosis in the early stages of a disease can significantly affect the patient's life and expenses. Since medical datasets are often collected from multiple sources and for different purposes, besides relevant features, they may contain many irrelevant and redundant ones. Relevant features carry essential information and can be used to discover knowledge and hidden patterns. Redundant features duplicate information already offered by another feature; as a result, they do not provide the prediction model with any additional useful information and only add computational cost for the classification algorithm. Features that provide no useful information are irrelevant, and their presence reduces the classification accuracy of prediction models. Consequently, the existence of irrelevant and redundant features not only imposes an additional computational cost on data mining algorithms but also degrades the accuracy of the predictions [3].
Feature Subset Selection (FSS) is a common way to overcome the mentioned challenges by selecting a set of M effective and relevant features from an original set of N features, where M < N. Generally, there are three main types of FSS techniques: filter-based, wrapper-based, and embedded-based methods [4, 5]. Without utilizing machine learning techniques, filter-based methods evaluate the relevance of features based on criteria such as consistency, correlation, similarity, distance, information, variance, and statistical measures [6]. Filter-based feature selection using ant colony optimization [7] and filter-based binary particle swarm optimization [8] are two prominent FSS algorithms that employ the filter-based method. Although filter-based FSS techniques are generally less computationally expensive than other methods due to their independence from classification algorithms, they usually do not yield adequate performance. The wrapper-based method, popularized by Kohavi and John [9], evaluates the quality of the selected feature subset using a classification algorithm. Despite outperforming the filter-based method in accuracy, the frequent invocation of the classification algorithm makes it computationally more expensive and requires more execution time than the filter-based method [10]. Embedded-based methods are another category of FSS algorithms, performing selection during training and being specific to certain learning machines [11]. They are more efficient because they avoid retraining predictors for each subset, but they are limited to specific machines and can be intricate. Examples include FSS based on support vector machines [12, 13] and embedded-based genetic programming [14].
As the wrapper-based approach yields better results [15], three search strategies have been used to generate numerous FSS algorithms: sequential, exponential, and random search. In the sequential search strategy, features are progressively added or removed [16, 17], but this strategy is prone to becoming trapped in local optima [18]. The exponential search strategy guarantees finding the optimal feature subset by evaluating all possible feature subsets [19]; however, it is computationally costly and inapplicable to many real-world datasets [18]. The random search strategy begins with a random feature subset and then proceeds with a sequential search strategy. Metaheuristic algorithms, a prominent class of random search methods, use stochastic techniques to solve optimization problems. They explore the search space, promote population diversity, and bypass local optimum solutions by embedding randomness into their search procedures. Additionally, they locally search promising areas to enhance solution quality [20]. Particle Swarm Optimization (PSO) [21], Genetic Algorithm (GA) [22], Differential Evolution (DE) [23], Ant Colony Optimization (ACO) [24], Artificial Bee Colony (ABC) [25], Grey Wolf Optimizer (GWO) [26], Moth-Flame Optimization (MFO) [27], Whale Optimization Algorithm (WOA) [28], and Harris Hawks Optimization (HHO) [29] are some well-known metaheuristic algorithms proposed to tackle different optimization problems such as task scheduling [30,31,32], optimal power flow [33, 34], and engineering design [35,36,37,38,39,40]. Researchers continue developing new metaheuristic algorithms to address flaws of previous algorithms, such as a low convergence rate caused by randomization [41, 42], the lack of mechanisms to maintain population diversity, and the imbalance between search strategies [43].
For instance, the Marine Predators Algorithm (MPA) [44], Colony Predation Algorithm (CPA) [45], Aquila Optimizer (AO) [46], Starling Murmuration Optimizer (SMO) [47], dwarf mongoose optimization algorithm [48], and Quantum-based Avian Navigation Optimizer Algorithm (QANA) [49] are among the recent metaheuristic algorithms proposed for continuous problem-solving [50, 51]. Among these, QANA is a scalable population-based metaheuristic algorithm inspired by migratory bird navigation.
Since the FSS is a binary problem that deals with selecting or discarding features, researchers have developed different binary metaheuristic algorithms to solve various real-world optimization problems [52]. For instance, the Binary Particle Swarm Optimization (BPSO) [53], Binary Gravitational Search Algorithm (BGSA) [54], Binary Differential Evolution (BDE) [55], Binary Bat Algorithm (BBA) [56], Binary Grey Wolf Optimization (bGWO) [57], Binary Dragonfly Algorithm (BDA) [58], Binary Salp Swarm Algorithm (BSSA) [59], and Binary Artificial Bee Colony (BABC) [60] are some well-known binary metaheuristic algorithms developed by applying different binarization methods including, transfer functions and crossover operator. Although metaheuristic algorithms have their benefits, the No-Free-Lunch (NFL) theorem [61] asserts that no individual algorithm can solve all problems [62, 63]. As a result, in FSS domain, new binary metaheuristic algorithms, such as Binary Arithmetic Optimization Algorithm (BAOA) [64], Binary Simulated Annealing-Based Dynamic Step Shuffled Frog Leaping Algorithm (BDSSRLFLA) [65], Binary Coronavirus Disease Optimization Algorithm (BCOVIDOA) [66], binary enhanced gaussian bare-bones grasshopper optimization [67], and Binary Quantum-based Avian Navigation Optimizer Algorithm (BQANA) [68] are still developing.
This paper proposes IBQANA, an improved binary quantum-based avian navigation optimizer algorithm, to select an effective feature subset from different medical datasets. First, we propose a binarization method by introducing a novel Hybrid Binary Operator (HBO) to map continuous values produced by the QANA into binary solutions. This method combines a threshold approach with Boolean operators, addressing the limitations of existing binarization techniques. Consequently, the binary version of QANA, called BQANA-HBO, is developed, enabling more efficient and accurate binary solutions. Then, an improved version of BQANA-HBO, named IBQANA, is developed to tackle the issues of local optima entrapment and low convergence rate. This improvement is achieved by introducing the Distance-Based Binary Search Strategy (DBSS), consisting of exploration and exploitation phases. In the exploration phase, which mostly happens in the early iterations, inferior search agents are updated by crossing over their positions with the farthest search agent's position in the archive. Conversely, in the final iterations, the exploitation phase is employed to update the inferior agents' positions by crossing over with the best position. These contributions are expected to enhance the performance and effectiveness of IBQANA for FSS problems.
The efficacy of the proposed IBQANA was tested on 12 medical datasets with different numbers of features. First, the effectiveness of BQANA-HBO, developed with the introduced binarization method, is investigated and compared with BQANA developed using variable thresholding and with five representatives of 20 binary versions of QANA developed by incorporating five distinct transfer function categories: S-shaped, V-shaped, U-shaped, Z-shaped, and quadratic transfer functions. The performance evaluation of IBQANA assesses its effectiveness in terms of accuracy, fitness, and the number of selected features. To determine its competitiveness, IBQANA is compared with seven well-known algorithms commonly used in the literature for FSS, including Binary Differential Evolution (BDE) [55], Binary Bat Algorithm (BBA) [56], V-shaped Binary Particle Swarm Optimization (VPSO) [69], hybrid Binary Particle Swarm Optimization and Gravitational Search Algorithm (BPSOGSA) [70], Binary Dragonfly Algorithm (BDA) [58], Quadratic Binary Harris Hawk Optimization (QBHHO) [71], and the Whale Optimization Algorithm for FSS with a Time-Varying transfer function (BWOA-TV) [72]. In addition, the proposed IBQANA is compared with two other algorithms, namely the Binary Quantum-based Avian Navigation Optimizer Algorithm based on the thresholding approach (BQANA) [68] and the introduced BQANA-HBO. Moreover, a case study was conducted in which IBQANA was employed as a diagnostic tool for Coronavirus Disease 2019 (COVID-19). To demonstrate the superiority of the proposed algorithm, the findings were statistically analyzed using the Friedman test. Based on the experimental and statistical results, the proposed IBQANA performs better than the other comparative algorithms in identifying relevant features from COVID-19 and 11 other medical datasets.
2 Related Works
Because the number of candidate feature subsets grows exponentially with the number of features, evaluating all 2^N possible subsets of a dataset with N features is an NP-hard problem. As a result, researchers have extensively addressed the FSS problem by turning to metaheuristic algorithms, which have proven successful in approximating solutions. In this section, we review several prominent metaheuristic algorithms used in the wrapper approach, listed in the order of their appearance in the literature.
Kennedy and Eberhart [53] proposed a Binary version of the Particle Swarm Optimization (BPSO) algorithm using a sigmoid transfer function to tackle discrete optimization problems [73]. Sigmoid and its three other variants [69] are known as S-shaped transfer functions, frequently used in the literature. Marandi et al. [74] proposed Boolean Particle Swarm Optimization (BPSO) to solve a dual-band dual-polarized planar antenna design problem. This algorithm uses three Boolean operators, and (∧), or (∨), and xor (⊕), to generate binary solutions. Binary Differential Evolution (BDE), proposed by Gong et al. [55], utilizes discrete mutation and crossover operators to solve binary problems. Rashedi et al. [54] proposed the V-shaped transfer function and developed the Binary Gravitational Search Algorithm (BGSA) accordingly. Nakamura et al. [56] introduced the Binary Bat Algorithm (BBA) by mapping continuous search agents' velocities into binary solutions using the S-shaped transfer function. Mirjalili et al. [69] proposed the VPSO algorithm as another binary version of PSO by applying four variants of the V-shaped transfer function family. The comparison between applying S-shaped and V-shaped transfer functions to the PSO algorithm indicates that the V-shaped transfer function provides superior results. Mirjalili et al. [70] proposed a hybridization of the binary PSO and GSA algorithms (BPSOGSA) to solve FSS problems by applying a V-shaped transfer function. Mirjalili [58] proposed the Dragonfly Algorithm (DA), inspired by the natural swarming patterns of dragonflies. Its binary version, named BDA, was also developed using a V-shaped transfer function.
Aslan et al. [75] introduced JayaX, a binary form of the Jaya algorithm [76] that generates binary solutions using the xor (⊕) Boolean operator. The results reported in this research show that JayaX predominantly provides better solutions than the binary form of the Jaya algorithm based on the transfer function. Jordehi [77] proposed three versions of the quadratic transfer function to develop the quadratic binary PSO for solving shiftable-appliance scheduling problems in smart homes. The reported results show that the binary PSO developed using the S-shaped transfer function can generate better results than the V-shaped transfer function. Furthermore, the introduced quadratic transfer function outperforms all other binary versions of PSO developed using different transfer functions. Too et al. [71] proposed the Quadratic Binary Harris Hawk Optimization (QBHHO) by introducing a new variant of the quadratic transfer function. The findings demonstrate that the introduced transfer function provides superior results for most investigated FSS problems. Sayed et al. [78] developed a Chaotic Crow Search Algorithm (CCSA) to tackle the drawbacks of the crow search algorithm in solving FSS problems, including entrapment in local optima and low convergence speed. CCSA employs ten chaotic maps to alleviate the mentioned flaws and boost its performance. The reported findings demonstrate that the sine map improves the CSA's performance in the FSS field. The authors in [79] proposed an effective binary version of PSO using the U-shaped transfer function (UBPSO). The results demonstrate that the UBPSO algorithm generates superior binary solutions compared to the S-shaped and V-shaped binary PSO.
The Whale Optimization Algorithm (WOA), proposed by Mirjalili and Lewis [28] and inspired by the hunting behavior of humpback whales, is a well-known metaheuristic algorithm for finding the optimal solution with high speed using simple but powerful search mechanisms [80]. To address the FSS problem, Mohammadzadeh and Gharehchopogh [81] introduced a hybridization of the WOA and flower pollination algorithms based on opposition-based learning for email spam detection. In another study, Kahya et al. [72] developed a binary version of WOA, named BWOA-TV, incorporating a time-varying transfer function. Turkoglu et al. [82] proposed a binary artificial algae algorithm for FSS, inspired by the behavior of algae in nature. Piri et al. [83] developed a Discrete Artificial Gorilla Troop Optimization (DAGTO) algorithm to solve FSS problems in the healthcare sector, implementing four variants of DAGTO for different numbers and types of objective functions. Abualigah and Diabat [84] proposed a chaotic binary group search optimizer to solve the FSS problem by combining chaotic maps with a binary group search optimizer. Shaddeli et al. [85] introduced an improved African vulture optimization algorithm to alleviate convergence to local optima when solving discrete problems by hybridizing it with SCA and applying four strategies. Moreover, the authors used S-shaped and V-shaped transfer functions to map continuous solutions into binary ones.
Helmi et al. [86] proposed three binary versions of MPA using S-shaped and V-shaped transfer functions to address the FSS problem in human activity recognition (HAR). HAR refers to identifying a person's actions based on measurements obtained from various mechanisms, including cameras, interior sensors, radars, wireless signals, and other sources [87]. Ren et al. [88] proposed a new method for diagnosing brain tumors, employing a step-by-step process based on a deep learning-based Water Strider Algorithm (WSA). The proposed method involves feature extraction, FSS, and classification steps; specifically, the WSA is employed to select the most relevant features that contribute to the classification of brain tumors. A Binary QANA (BQANA) algorithm was presented by Nadimi-Shahraki et al. [68] to solve the FSS problem in the medical data field using two different approaches: the first employs various transfer functions to convert the canonical QANA to binary, while the second maps continuous solutions to binary by setting a variable threshold for each dimension. The results revealed that the BQANA developed using the threshold method generates better solutions than the transfer functions. The results also demonstrated that the S4, V1, U4, Z3, and Q3 transfer functions are the representatives of their groups. In a similar study, Nadimi-Shahraki et al. [89] employed transfer function and variable threshold approaches to develop the Binary Starling Murmuration Optimizer (BSMO) by mapping continuous solutions of the Starling Murmuration Optimizer (SMO). As in the previous research, the results show that the variable threshold approach provides superior results.
3 Quantum-based Avian Navigation Optimizer Algorithm
The Quantum-based Avian Navigation Optimizer Algorithm (QANA), proposed by Zamani et al. [49], is a recent DE-based algorithm inspired by migrating birds' remarkable accuracy in long-distance aerial navigation. QANA utilizes multiple operators such as population partitioning, a qubit-crossover, two mutation strategies, and self-adaptive quantum orientation to achieve competitive results in continuous search spaces. Moreover, the V-echelon communication topology facilitates information sharing among search agents. As seen in Algorithm 1, the QANA pseudocode includes four basic steps: initialization and multi-flock construction, movement strategy, fitness evaluation, and updating positions.
3.1 Initialization and Multi-flock Construction
To distribute the entire population of birds, represented by matrix A in Eq. (1), the population is divided into k different geographical areas using random centers. This partitioning creates k flocks, each consisting of n search agents, where n = N/k. As a result, the position of each bird in the D-dimensional problem space is denoted by the vector Xi = [xi1, xi2, …, xiD].
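The initialization and partitioning step can be sketched as follows. This is a simplified illustration: the paper partitions the population around random geographic centers, whereas, for brevity, the sketch below shuffles indices and splits them evenly so that each flock holds exactly n = N/k agents.

```python
import numpy as np

def initialize_and_partition(N, D, k, lb, ub, seed=0):
    """Initialize N agents in [lb, ub]^D and split them into k equal flocks.

    Simplification: QANA assigns agents to random geographic centers; here
    the indices are shuffled and split evenly (an assumption for brevity).
    """
    rng = np.random.default_rng(seed)
    X = lb + rng.random((N, D)) * (ub - lb)   # population matrix A, Eq. (1)
    idx = rng.permutation(N)
    n = N // k                                 # flock size n = N / k
    flocks = [idx[f * n:(f + 1) * n] for f in range(k)]
    return X, flocks
```

Each entry of `flocks` is the index set of one flock; the agents themselves remain rows of the population matrix `X`.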
After the flock construction, in subsequent iterations, the search agents simultaneously use the knowledge shared by the V-echelon topology. Furthermore, utilizing the Success-based Population Distribution (SPD) policy, the search agents traverse the search space by using a mutation strategy rooted in quantum mechanics.
3.2 V-Echelon Communication Topology
The flight formation of migrating birds patterns the V-echelon communication construction and allows search agents of each flock to communicate knowledge that has been obtained. A Header (H) and two subsets for Left (L) and Right (R) lines make up this communication structure. The aerial navigation of migrating birds utilizing V-echelon topology is depicted in Fig. 1.
3.3 Quantum-based Navigation
Bird flocks utilize a quantum-based navigation system for exploring the search space. This system comprises a Success-based Population Distribution (SPD) policy, qubit-crossover, and two mutation strategies named "DE/quantum/I" and "DE/quantum/II". The assignment of a specific mutation strategy to a flock is based on the policy defined in Eq. (2),
where SRm denotes the success rate of mutation strategy Mm, and τij is equal to 1 if Mm improved aj of the i-th flock; otherwise, τij equals 0.
The DE/quantum/I and DE/quantum/II are two quantum mutation strategies represented by Eqs. (3) and (4), respectively. These equations utilize several variables: xi (t) refers to the i-th search agent, xVechelon (t) represents the position of the search agent followed by ai, and xj and xk are two random positions selected from the Long-Term Memory (LTM) and Short-Term Memory (STM), respectively. In addition, the variable vH (t + 1) indicates the position of the header in the V-echelon topology and is computed using Eq. (5). The quantum orientation for avian ai, denoted by Si, is also utilized in the computation and is presented in [49, 90]. LB and UB are the search space's lower and upper boundaries.
By crossing the mutant vector vi (t + 1) with its parent xi (t), the trial vector ui (t + 1) is generated, where |ψi⟩d is a qubit-crossover probability for the d-th dimension [49].
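The qubit-crossover can be sketched as a per-dimension stochastic choice between the mutant and the parent. In the sketch below, `psi` stands for the vector of qubit-crossover probabilities |ψi⟩d; the name and its representation as a plain probability vector are illustrative assumptions, not the original implementation.

```python
import numpy as np

def qubit_crossover(x, v, psi, rng):
    """Form the trial vector u from parent x and mutant v (cf. Eq. (6)).

    psi[d] plays the role of the qubit-crossover probability |psi_i>_d:
    dimension d is taken from the mutant when a uniform draw falls below it,
    and from the parent otherwise.
    """
    x, v, psi = map(np.asarray, (x, v, psi))
    take_mutant = rng.random(x.shape) < psi   # per-dimension Bernoulli trial
    return np.where(take_mutant, v, x)
```

With `psi` close to 1, the trial vector inherits mostly mutant components; with `psi` close to 0, it stays near the parent.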
4 Improved Binary Quantum-based Avian Navigation Optimizer Algorithm (IBQANA)
As stated earlier, ineffective binarization methods, prolonged convergence, and local optimum entrapment are the main problems of most binary metaheuristic algorithms that limit their performance in the FSS problem. First, this section introduces a novel binarization method, namely the Hybrid Binary Operator (HBO), to effectively map the continuous values into binary solutions and develop the binary version of the QANA, named BQANA-HBO, accordingly. Then, an improved version of BQANA-HBO, named IBQANA is proposed to cope with local optimum trapping and slow convergence rate by introducing a Distance-Based Binary Search Strategy (DBSS), which includes two different phases to adjust the position of the inferior search agents.
4.1 Hybrid Binary Operator (HBO)
As stated in Sect. 2, although the most popular binarization technique for metaheuristic algorithms is the transfer function, it cannot produce efficient binary solutions for certain metaheuristic methods [68]. On the other hand, the thresholding method presented in Eq. (7) is more effective in developing a binary version of the QANA [68]; however, this method has a major drawback that hinders the metaheuristic algorithm from utilizing its full potential in finding the optimal solution.
The issue lies in the thresholding method, which can only produce valid binary solutions for continuous values within the range of [0, 1]. Consequently, this approach fails to appropriately map continuous values outside these boundaries, such as negative values and values exceeding 1. Therefore, this section proposes a new binary method, Hybrid Binary Operator (HBO), for effectively mapping continuous solutions into binary space.
In the proposed HBO, if the continuous solution produced by the QANA is in the interval [0, 1], the thresholding method is used to convert it into a binary solution. However, if the generated continuous solution violates the lower bound (lb) or the upper bound (ub), according to its binary position in the current iteration, the logical operator AND (∧) is used to update the binary solution based on Eq. (8).
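Under one plausible reading of Eqs. (7) and (8), the HBO can be sketched as below. The exact operands of the AND update in Eq. (8) are not spelled out in the text, so the two out-of-range branches (AND with 1 above the upper bound, AND with 0 below the lower bound) are assumptions made for illustration.

```python
import numpy as np

def hbo(x, b_current, threshold, lb=0.0, ub=1.0):
    """Hybrid Binary Operator sketch: thresholding inside [lb, ub] (Eq. (7)),
    AND-based fallback outside it (one reading of Eq. (8))."""
    x = np.asarray(x, dtype=float)
    b_current = np.asarray(b_current, dtype=int)
    b_new = np.empty_like(b_current)
    in_range = (x >= lb) & (x <= ub)
    # Eq. (7): threshold the in-range continuous values
    b_new[in_range] = (x[in_range] >= threshold).astype(int)
    # Assumed Eq. (8): AND the current bit with 1 above ub, with 0 below lb
    b_new[x > ub] = b_current[x > ub] & 1
    b_new[x < lb] = b_current[x < lb] & 0
    return b_new
```

The key property is that every dimension receives a valid bit even when the continuous value escapes [0, 1], which the plain thresholding method cannot guarantee.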
The procedure of the HBO to determine the binary solution for each dimension of a search agent in the BQANA-HBO algorithm is depicted in Fig. 2.
Despite the effective binary mapping provided by HBO, the developed BQANA-HBO still cannot offer high-accuracy solutions with fast convergence. Therefore, in the following, we propose IBQANA to enhance the solution quality and convergence rate of the BQANA-HBO algorithm by introducing DBSS. This innovative combination is expected to yield more efficient and accurate results, making it a promising avenue for advancing the field.
4.2 Distance-based Binary Search Strategy (DBSS)
This section proposes the IBQANA by introducing a DBSS to improve the performance of inferior search agents and speed up the convergence rate. The proposed search strategy consists of two phases: exploration and exploitation. In each iteration, if the new fitness of the i-th search agent is equal to or greater than its fitness in the current iteration, a new binary position is generated based on one of the two phases. As mentioned in Sect. 3, the QANA algorithm has an STM and an LTM to preserve the obtained solutions. In the proposed DBSS, these two memories are merged to form an Archive. Then, according to the value of P obtained through Eq. (9), a new binary position is generated based on either the exploration or the exploitation phase,
where t represents the present iteration, and MaxIt is the maximum number of iterations. According to this equation, the value of P gradually decreases from 1 to 0 as the number of iterations progresses.
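A linear schedule is one form consistent with this description; the sketch below assumes P = 1 − t/MaxIt, which may differ from the exact Eq. (9), and pairs it with the phase selection so that exploration dominates the early iterations (when P is near 1) and exploitation the final ones (when P is near 0).

```python
import random

def adaptive_probability(t, max_it):
    """P decays from 1 to 0 over the run; a linear schedule is assumed here,
    since the exact form of Eq. (9) is not reproduced in the text."""
    return 1.0 - t / max_it

def choose_phase(t, max_it, rng=random):
    """Exploration while the uniform draw stays below P (early iterations),
    exploitation once it reaches or exceeds P (late iterations)."""
    p = adaptive_probability(t, max_it)
    return "exploration" if rng.random() < p else "exploitation"
```

At t = 0 the draw always falls below P = 1, so exploration is certain; at t = MaxIt it never falls below P = 0, so exploitation is certain.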
4.2.1 Exploration Phase
If a random number between 0 and 1 is smaller than the P value in the current iteration, a new binary solution is generated based on the uniform crossover operator according to Eq. (10),
where r1 and r2 are random numbers within the range of (0,1), and Archivefar (t) indicates the binary position of the farthest member of the Archive from the position of bi (t + 1), which is determined based on Eq. (11),
where HDik (t) is the Hamming distance between bi (t + 1) and the k-th archive member; the Hamming distance is defined in Eq. (12), where M represents the total number of positions stored in the archive.
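The farthest-member selection of Eqs. (11) and (12) can be sketched as:

```python
import numpy as np

def farthest_archive_member(b, archive):
    """Return the archive position with the largest Hamming distance from b.

    b: binary vector of length D; archive: M x D binary matrix, where M is
    the number of positions stored in the archive (Eqs. (11)-(12)).
    """
    b = np.asarray(b)
    archive = np.asarray(archive)
    hd = (archive != b).sum(axis=1)   # Hamming distance to each member
    return archive[np.argmax(hd)]
```

Crossing an inferior agent with this maximally distant position drives the exploration phase toward unvisited regions of the binary space.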
4.2.2 Exploitation Phase
If the P value is less than or equal to a randomly generated number in the range (0, 1), the new binary solution is generated using the exploitation phase. This phase enhances the solutions' quality without degrading the algorithm's exploration capability. To achieve this, a fitness-dependent weight factor is calculated based on Eq. (13) [91],
where F(·) denotes the fitness value of a binary position. Since, according to the fitness function definition presented in the next section, the FSS is a minimization problem, the value produced by this equation is always in the range [0, 1]. Eq. (14) then generates the new binary position during the exploitation phase.
Finally, after calculating the fitness of the new position produced by each of the two introduced phases, a greedy selection between bnew (t + 1) and bi (t + 1) determines the value of bi (t + 1). The pseudocode of the proposed DBSS is presented in Algorithm 2, and the pseudocode of the proposed Improved Binary Quantum-based Avian Navigation Optimizer Algorithm (IBQANA) is shown in Algorithm 3.
4.3 Computational and Time Complexity
As shown in Algorithm 3, the proposed IBQANA consists of four phases: initialization, multi-flock construction, movement, and fitness evaluation. In the initialization phase (line 2), N search agents are randomly distributed in a D-dimensional search space with O(ND) computational complexity. In the multi-flock construction phase (line 5), k flocks with population size n (where n = N/k) are constructed with a computational complexity of O(kn). Then, the fitness value of each search agent is computed with a computational complexity of O(ND). The movement phase consists of memory construction, V-echelon topology formation, and three types of movement vectors: mutant, trial, and binary. The long-term and short-term memories with K′ and K″ capacities are constructed with O(K′D) + O(K″D) computational complexity. Then, the V-echelon topology (line 8) for each of the k flocks is formed with O(kn) complexity. Finally, line 9 computes the SPD policy with O(kn) complexity to assign mutation strategies to each flock. Mutant vectors (line 12) defined in Eqs. (3–5) are computed with O(ND) complexity, and the trial and binary vectors (lines 13 and 14) defined in Eqs. (6–8) are calculated with a complexity of O(2ND). In the fitness evaluation phase (line 15), the fitness value of each search agent is computed with O(ND) computational complexity, and Algorithm 2 runs with O(ND) computational complexity. The algorithm repeats these phases until it reaches the maximum iteration T. Thus, the total computational complexity of IBQANA is O(ND + T(kn + ND + K′D + K″D + kn + kn + ND + 2ND + ND + ND)). Since kn = N, the computational complexity of IBQANA simplifies to O(ND + T(3N + 6ND + K′D + K″D)), which is equal to O(TND) because N is always larger than K′ and K″.
To compute the time complexity, the Average Time (AT) required to find the solution, known as the run time, is computed using Eq. (15) [49, 92]. This equation considers the total number of runs (M) and the computational time of algorithm A for each run (RTA,i).
Table 1 compares the run times (in seconds) of the IBQANA algorithm and the comparative algorithms on different datasets. While IBQANA may not be the fastest algorithm, it consistently achieves an acceptable run time across most datasets. However, its relatively longer run times on certain datasets might limit its use in real-time applications. Nevertheless, IBQANA proves to be a promising candidate for accuracy-critical applications, particularly on medical datasets, making it a solid choice for researchers and practitioners in this domain.
5 Experimental Assessment
This section evaluates the IBQANA algorithm's performance through experiments on the FSS problem over 12 medical datasets with various feature sizes. First, to evaluate the effectiveness of the suggested binarization technique, the Hybrid Binary Operator (HBO) is applied to develop the binary QANA (BQANA-HBO) algorithm, and its performance is compared with the binary QANA developed using the thresholding method and with representatives of 20 binary versions of QANA developed using five groups of transfer functions. In the second experiment, we evaluate the improved version of this algorithm, called IBQANA, and contrast the obtained results with BQANA-HBO as well as seven other comparative algorithms, including BBA [56], VPSO [69], BPSOGSA [70], QBHHO [71], BDA [58], BWOA-TV [72], and BQANA [68]. These algorithms encompass highly cited, recent, and diverse approaches, allowing for a comprehensive comparison.
5.1 Parameter Settings
To verify that the comparisons are accurate and fair, the parameters for the comparative algorithms were set to the values stated in their respective original publications, which are provided in Table 2, while the initial values for the shared parameters, such as maximum number of iterations (MaxIt) and population size (N), were set to 20 and 300, respectively, for all algorithms. In this study, all experiments, including those for the other algorithms used in the comparison, were conducted 20 times independently on a laptop with an Intel Core i7-10750H CPU and 24.0 GB of RAM using MATLAB R2022a.
In this work, the K-Nearest Neighbor (K-NN) classifier with Euclidean distance and K = 5 is used to determine the classification accuracy of the produced feature subsets based on Eq. (16),
where TP represents the count of accurately identified positive samples, while TN refers to the count of correctly classified negative samples. FN indicates the count of positive samples incorrectly classified as negative, and FP represents the number of negative samples incorrectly classified as positive. Also, the Classification Error (CE) is calculated using Eq. (17).
The goals of this study, including maximization of the classification accuracy and minimization of the number of features, are aggregated in an objective function given in Eq. (18),
where α denotes the significance of classification accuracy, and Ntf and Nsf are the total number of features in the dataset and the number of selected features, respectively. Since classification accuracy is the most crucial metric for medical datasets, following related studies [57, 71], we set α = 0.99.
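Equations (16)–(18) are not reproduced in this excerpt; in the standard form used by comparable wrapper-based FSS studies (a reconstruction from the surrounding definitions, not copied from the paper), they read:

```latex
\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{16}
```

```latex
CE = 1 - \mathrm{Accuracy} = \frac{FP + FN}{TP + TN + FP + FN} \tag{17}
```

```latex
\mathrm{Fitness} = \alpha \cdot CE + (1 - \alpha) \cdot \frac{N_{sf}}{N_{tf}}, \qquad \alpha = 0.99 \tag{18}
```

Minimizing Eq. (18) therefore jointly maximizes accuracy and minimizes the selected-feature ratio, with α = 0.99 placing almost all of the weight on accuracy.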
5.2 Description of Medical Datasets
This research evaluates the effectiveness of IBQANA and the competing algorithms in selecting effective features using 12 medical benchmark datasets, primarily from the UCI machine learning repository [93]. Table 3 presents statistical information on the datasets under consideration. K-fold cross-validation with k = 10 was employed to prevent overfitting; this method divides the dataset into k folds, with the classifier using k − 1 folds for training and the remaining fold for testing.
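The evaluation pipeline above (K-NN with Euclidean distance and K = 5, scored via 10-fold cross-validation on a candidate feature subset) can be sketched in pure Python as follows; the function and variable names are illustrative, not the paper's MATLAB implementation:

```python
import math
import random

def knn_predict(train_X, train_y, x, k=5):
    """Classify x by majority vote among its k nearest training samples
    (Euclidean distance), mirroring the K-NN setup described above."""
    dists = sorted(
        (math.dist(row, x), label) for row, label in zip(train_X, train_y)
    )
    votes = [label for _, label in dists[:k]]
    return max(set(votes), key=votes.count)

def kfold_accuracy(X, y, subset, k_folds=10, k=5, seed=0):
    """Average K-NN accuracy over k_folds folds, using only the feature
    indices in `subset` (the candidate feature subset being evaluated)."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k_folds] for i in range(k_folds)]
    accs = []
    for fold in folds:
        test = set(fold)
        tr_X = [[X[i][j] for j in subset] for i in idx if i not in test]
        tr_y = [y[i] for i in idx if i not in test]
        correct = sum(
            knn_predict(tr_X, tr_y, [X[i][j] for j in subset], k) == y[i]
            for i in fold
        )
        accs.append(correct / len(fold))
    return sum(accs) / len(accs)
```

Each candidate binary solution produced by a metaheuristic is turned into a `subset` of selected feature indices and scored this way before computing the fitness of Eq. (18).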
5.3 Evaluation of the Proposed Hybrid Binary Operator (HBO)
This section investigates the effectiveness of the introduced HBO binarization method through the developed binary QANA (BQANA-HBO) in selecting effective features from the medical datasets, comparing the results with those of the binary QANA using the thresholding method (BQANA) and five representatives of the S-shaped, V-shaped, U-shaped, Z-shaped, and quadratic transfer-function families [68]. The results are tabulated in Table 4, where the best average fitness value achieved for each medical dataset is emphasized in bold font. The last row shows the average Friedman test ranking for each algorithm.
The results reported in Table 4 show that the introduced HBO provides a superior mapping of QANA's continuous values into binary solutions compared with the other binarization methods in terms of average fitness value. The findings also reveal that the binary QANA developed using the thresholding method (BQANA) yields better solutions than the investigated transfer functions and is ranked second. Among the investigated transfer functions, Q3 provides competitive results. Conversely, the S-shaped transfer function yields the poorest performance when used to convert continuous QANA values into binary solutions for selecting effective features from medical datasets.
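HBO itself is defined earlier in the paper and is not reproduced in this excerpt; for contrast, the two standard baselines it is compared against here, an S-shaped (sigmoid) transfer function and fixed thresholding, can be sketched as:

```python
import math
import random

_rng = random.Random(42)  # fixed seed for reproducibility of the sketch

def s_shaped_binarize(x, rng=_rng):
    """Classic S-shaped transfer: map a continuous position component to a
    probability via the sigmoid, then sample a bit (the stochastic scheme
    introduced for binary PSO)."""
    p = 1.0 / (1.0 + math.exp(-x))
    return 1 if rng.random() < p else 0

def threshold_binarize(x, theta=0.5):
    """Thresholding as used for BQANA: keep the feature (bit = 1) if the
    continuous value exceeds a fixed threshold."""
    return 1 if x > theta else 0
```

Both baselines implicitly assume values near or inside [0, 1]; HBO's stated advantage is accurate mapping even for values outside that range.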
5.4 Evaluation of the Proposed IBQANA
This section evaluates the efficacy of the proposed IBQANA on 12 medical datasets and compares the results with those generated by seven well-established metaheuristic algorithms. The outcomes are comprehensively presented in Tables 5 and 6, providing the average (Avg), minimum (Min), standard deviation (Std), and maximum (Max) of the fitness and classification accuracy.
Table 5 compares the fitness values of the investigated FSS algorithms on 12 medical datasets, where the bold values represent the best average fitness value achieved for each dataset. The fitness function, described in Eq. (18), considers both classification accuracy and the number of selected features. IBQANA consistently outperforms the other algorithms, attaining the lowest fitness value and standard deviation among all competitors. Comparing IBQANA with BQANA-HBO isolates the contribution of the introduced DBSS, which makes IBQANA a promising choice for FSS in medical datasets. BQANA-HBO ranks second, demonstrating an advantage over competitors on one medical dataset. IBQANA's gains stem mainly from its effective mapping of continuous solutions to binary ones using the introduced HBO and from updating inferior agents' positions using the introduced DBSS.
From the results tabulated in Table 6, it can be observed that IBQANA outperforms the other algorithms in terms of average classification accuracy on most datasets. For example, on the Pima dataset, IBQANA achieved an average accuracy of 77.459, higher than that of the other algorithms. Similar trends are observed on other datasets, such as Heart, Hepatitis, Lymphography, SPECT, WBCD, LSVT, Parkinson, Colon, and SRBCT, where IBQANA consistently demonstrates competitive or superior performance compared with the other algorithms.
Figure 3 depicts the comparative convergence behavior of IBQANA and the other relevant algorithms for FSS on the 12 medical datasets. The convergence curves are plotted using the average fitness over 20 independent runs. The curves demonstrate that the proposed IBQANA has the quickest convergence towards optimal solutions compared with the other algorithms for most datasets. Moreover, comparing the performance of IBQANA against the other algorithms on the colon, leukemia, and prostate tumor datasets shows that as the number of features grows, the proposed algorithm maintains its scalability and reaches better solutions than the comparative algorithms. The ability of IBQANA to escape local optima can be seen in the curves for the diabetes, lymphography, WBCD, colon, leukemia, and prostate tumor datasets. IBQANA's significant advantages over BQANA-HBO stem primarily from updating the positions of inferior search agents using the introduced DBSS, which enables accelerated convergence and helps the algorithm avoid being trapped in local optima in various scenarios.
Since classification accuracy is considered the most crucial criterion for medical datasets, the classification accuracy results of IBQANA and the comparative algorithms are shown as boxplots in Fig. 4. Overall, the plots demonstrate that the proposed IBQANA can enhance classification accuracy by selecting effective features and discarding irrelevant ones. IBQANA achieved a higher median classification accuracy on eight datasets and a higher maximum accuracy on six datasets. It is also noticeable that BQANA-HBO, as a low-computational-cost algorithm, shows potential for increasing classification accuracy through effective feature selection.
Given that a primary objective of FSS is to minimize the number of features utilized, Fig. 5 presents a comparison of the algorithms based on the rate of feature reduction achieved for each medical dataset, as determined through Eq. (19). In this equation, Nsf denotes the number of features selected by the algorithm, and Ntf represents the total number of features in the dataset of interest.
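Assuming the standard definition of the feature reduction rate, (Ntf − Nsf)/Ntf (the exact Eq. (19) is not reproduced in this excerpt), the two quantities driving Table 5 and Fig. 5 reduce to a few lines:

```python
def fitness(accuracy, n_selected, n_total, alpha=0.99):
    """Eq. (18): weighted sum of classification error and the fraction of
    features kept; lower is better."""
    return alpha * (1.0 - accuracy) + (1.0 - alpha) * (n_selected / n_total)

def feature_reduction_rate(n_selected, n_total):
    """Assumed standard form of Eq. (19): share of features discarded."""
    return (n_total - n_selected) / n_total
```

With α = 0.99, a 0.1% accuracy loss outweighs discarding roughly 10% of the features, which is why discarding effective features is penalized so heavily.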
The results in Fig. 5 demonstrate that the BDE algorithm has the weakest performance in minimizing the number of features across all 12 datasets. In contrast, QBHHO, BQANA-HBO, and IBQANA perform best in reducing the number of features. Although QBHHO predominantly selects fewer features than IBQANA, it should be noted that discarding effective features degrades classifier accuracy, which is not acceptable for medical datasets.
The nonparametric Friedman test [94] is conducted to compare and rank the algorithms' performance in selecting effective features from the investigated datasets based on the fitness values obtained over 20 runs. The findings reported in Table 7 reveal that the proposed IBQANA is ranked first on 11 datasets and third on one dataset. IBQANA is also scalable, performing best on the high-dimensional datasets, including Parkinson, Colon, SRBCT, Leukemia, and Prostate Tumor. Furthermore, BQANA-HBO is ranked first on one dataset, second on eight datasets, fourth on one dataset, and fifth on two datasets.
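The Friedman ranking reported in Table 7 is, at its core, an average of per-dataset ranks; a minimal sketch (ties receive their average rank; the χ² test statistic itself is omitted):

```python
def average_ranks(scores_per_dataset):
    """For each dataset, rank the algorithms by score (lower = better,
    rank 1), giving tied entries their average rank; return each
    algorithm's mean rank across datasets -- the ranking the Friedman
    test reports."""
    n_algs = len(scores_per_dataset[0])
    totals = [0.0] * n_algs
    for scores in scores_per_dataset:
        order = sorted(range(n_algs), key=lambda i: scores[i])
        ranks = [0.0] * n_algs
        pos = 0
        while pos < n_algs:
            end = pos
            # extend over a run of tied scores
            while end + 1 < n_algs and scores[order[end + 1]] == scores[order[pos]]:
                end += 1
            avg = (pos + end) / 2 + 1  # average of ranks pos+1 .. end+1
            for j in range(pos, end + 1):
                ranks[order[j]] = avg
            pos = end + 1
        for i, r in enumerate(ranks):
            totals[i] += r
    return [t / len(scores_per_dataset) for t in totals]
```

Feeding the 20-run fitness values per dataset into such a ranking yields the per-algorithm averages shown in the last row of Table 4 and in Table 7.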
6 Applicability of the Proposed IBQANA on a COVID-19 Case Study
The infectious disease COVID-19, caused by the viral agent SARS-CoV-2, has garnered immense global attention since its initial emergence in late 2019 [95]. The virus causes severe acute respiratory syndrome, and due to its rapid spread around the globe, the World Health Organization (WHO) declared it a global health emergency. As of September 2022, COVID-19 had reportedly infected more than 619,429,000 people and caused 6,537,236 deaths worldwide [96]. Machine learning has recently emerged as an effective approach to solving many problems and can be used to combat COVID-19 through screening [97], monitoring [98], prediction [99], and diagnosis [83, 100, 101]. To make this possible, medical feature selection techniques extract useful features from clinical datasets, which are then used to develop diagnosis algorithms to identify COVID-19 [102].
In this section, an evaluation of the potential applicability and performance of the proposed IBQANA is conducted on the preprocessed version [103] of the COVID-19 dataset [104], which is described in Table 8. This table shows that the dataset contains 13 features to predict the patient's condition. The categorical columns in the preprocessed dataset have been converted to numerical values by assigning a unique number to each category. The experimental environments and parameters of the algorithms have been set to the values specified in Sect. 5.
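The categorical-to-numerical conversion described above amounts to simple label encoding; a minimal sketch, with hypothetical input rows (the actual column names appear in Table 8):

```python
def label_encode(rows):
    """Replace each distinct string in a column with a unique integer,
    assigned in order of first appearance; numeric values pass through."""
    codes = {}  # (column index, category) -> assigned integer
    encoded = []
    for row in rows:
        new_row = []
        for j, v in enumerate(row):
            if isinstance(v, str):
                key = (j, v)
                if key not in codes:
                    # next free integer for this column
                    codes[key] = sum(1 for (c, _) in codes if c == j)
                new_row.append(codes[key])
            else:
                new_row.append(v)
        encoded.append(new_row)
    return encoded
```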
The experimental results in Fig. 6a reveal that the proposed IBQANA generates the best solutions, with the minimum fitness value among the investigated algorithms, while BQANA-HBO attains second rank with similar convergence behavior. IBQANA also achieves the quickest convergence, generating optimal solutions faster than the other algorithms. The boxplot in Fig. 6b shows that BQANA-HBO provides the highest maximum classification accuracy, while IBQANA and BDE have the highest median accuracy. Finally, Fig. 6c presents each algorithm's minimum and average number of selected features: IBQANA has the shortest bar for the average number of features, whereas QBHHO has the shortest bar for the minimum number.
7 Conclusion and Future Work
FSS is crucial in data analysis for identifying relevant features and reducing the computational burden. Existing literature suggests that metaheuristic algorithms can find near-optimal feature subsets quickly. However, current binary metaheuristic algorithms suffer from slow convergence and lack an effective binarization method, resulting in suboptimal solutions. This paper introduced the Improved Binary Quantum-based Avian Navigation Optimizer Algorithm (IBQANA), specifically designed for FSS on medical datasets. The algorithm employs a Hybrid Binary Operator (HBO) to effectively convert continuous values into binary solutions. In addition, a Distance-based Binary Search Strategy (DBSS) is introduced to enhance the performance of inferior search agents and accelerate convergence. DBSS operates through a two-phase search strategy, combining exploration and exploitation phases based on an adaptive probability function, which helps the search avoid local optima. Experimental evaluation on 12 medical datasets demonstrated that IBQANA outperforms seven established algorithms in terms of fitness, classification accuracy, and the number of selected features. It exhibits scalability in selecting features from high-dimensional datasets and demonstrates the fastest convergence among the tested algorithms. The application of IBQANA to COVID-19 detection underscores its practical significance and potential impact on the medical community. This research offers a promising solution to the FSS problem in medical data preprocessing, with implications for diagnostic tool development. Future research can explore HBO as a binarization method for other continuous metaheuristic algorithms and implement DBSS to mitigate limitations of other binary metaheuristic algorithms, such as slow convergence and local optima trapping.
Data Availability
The datasets used during the current study are available from the corresponding author upon reasonable request.
References
Maimon, O., & Rokach, L. (2005). Data mining and knowledge discovery handbook (p. 1). Springer.
Mirjalili, S., Faris, H., & Aljarah, I. (2019). Evolutionary machine learning techniques (p. 12). Springer.
Duangsoithong, R., Windeatt, T. (2009) Relevant and redundant feature analysis with ensemble classification. Seventh International Conference on Advances in Pattern Recognition, Kolkata, India 247–250
Jović, A., Brkić, K., Bogunović, N. (2015) A review of feature selection methods with applications. 38th International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO), Opatija, Croatia, 1200–1205
Chen, C.-W., Tsai, Y.-H., Chang, F.-R., & Lin, W.-C. (2020). Ensemble feature selection in medical datasets: combining filter, wrapper, and embedded feature selection results. Expert Systems., 37(5), e12553. https://doi.org/10.1111/exsy.12553
Taha, A., Cosgrave, B., & Mckeever, S. (2022). Using feature selection with machine learning for generation of insurance insights. Applied Sciences, 12(6), 3209.
Deriche, M. (2009) Feature selection using ant colony optimization. 6th International Multi-Conference on Systems, Signals and Devices, Djerba, Tunisia, 1–4.
Cervante, L., Xue, B., Zhang, M., Shang, L. (2012) Binary particle swarm optimisation for feature selection: A filter based approach. IEEE Congress on Evolutionary Computation, Brisbane, QLD, Australia, 1–8.
Kohavi, R., & John, G. H. (1997). Wrappers for feature subset selection. Artificial Intelligence, 97(1), 273–324. https://doi.org/10.1016/S0004-3702(97)00043-X
Huang, J., Cai, Y., & Xu, X. (2007). A hybrid genetic algorithm for feature selection wrapper based on mutual information. Pattern Recognition Letters, 28(13), 1825–1844.
Guyon, I., & Elisseeff, A. (2003). An introduction to variable and feature selection. Journal of Machine Leaning Research, 3(7–8), 1157–1182.
Guyon, I., Weston, J., Barnhill, S., & Vapnik, V. (2002). Gene selection for cancer classification using support vector machines. Machine Learning, 46(1), 389–422. https://doi.org/10.1023/A:1012487302797
Rakotomamonjy, A. (2003). Variable selection using SVM-based criteria. Journal of Machine Learning Research, 3(7–8), 1357–1370.
Purohit, A., Chaudhari, NS., Tiwari, A. (2010) Construction of classifier with feature selection based on genetic programming. IEEE Congress on Evolutionary Computation, Barcelona, Spain, 1–5.
Akinola, O., Agushaka, O., & Ezugwu, A. (2022). Binary dwarf mongoose optimizer for solving high-dimensional feature selection problems. PLoS ONE, 17(10), 1–26. https://doi.org/10.1371/journal.pone.0274850
Kumar, V., & Minz, S. (2014). Feature selection: a literature review. SmartCR, 4(3), 211–229.
Koller, D., Sahami, M. (1996) Toward optimal feature selection. Proceedings of the Thirteenth International Conference on International Conference on Machine Learning. Bari, Italy, 1996: 284–292.
Tabakhi, S., Moradi, P., & Akhlaghian, F. (2014). An unsupervised feature selection algorithm based on ant colony optimization. Engineering Applications of Artificial Intelligence, 32, 112–123.
Foroutan, I., & Sklansky, J. (1987). Feature selection for automatic classification of non-gaussian data. IEEE Transactions on Systems, Man, and Cybernetics, 17(2), 187–198.
Nadimi-Shahraki, M. H., Fatahi, A., Zamani, H., Mirjalili, S., Abualigah, L., & Abd Elaziz, M. (2021). Migration-based moth-flame optimization algorithm. Processes, 9(12), 2276.
Kennedy, J., Eberhart, R. (1995) Particle swarm optimization. Proceedings of ICNN'95-international conference on neural networks, Perth, WA, Australia, 1942–1948.
Holland, J. H. (1992). Genetic algorithms. Scientific American, 267(1), 66–73.
Storn, R., & Price, K. (1997). Differential evolution–a simple and efficient heuristic for global optimization over continuous spaces. Journal of Global Optimization, 11(4), 341–359.
Dorigo, M., Birattari, M., & Stutzle, T. (2006). Ant colony optimization. IEEE Computational Intelligence Magazine, 1(4), 28–39.
Karaboga, D., & Basturk, B. (2007). A powerful and efficient algorithm for numerical function optimization: artificial bee colony (ABC) algorithm. Journal of Global Optimization, 39(3), 459–471.
Mirjalili, S., Mirjalili, S. M., & Lewis, A. (2014). Grey wolf optimizer. Advances in Engineering Software, 69, 46–61.
Mirjalili, S. (2015). Moth-flame optimization algorithm: A novel nature-inspired heuristic paradigm. Knowledge-Based Systems, 89, 228–249.
Mirjalili, S., & Lewis, A. (2016). The whale optimization algorithm. Advances in Engineering Software, 95, 51–67.
Heidari, A. A., Mirjalili, S., Faris, H., Aljarah, I., Mafarja, M., & Chen, H. (2019). Harris hawks optimization: algorithm and applications. Future Generation Computer Systems, 97, 849–872.
Li, K., Xu, G., Zhao, G., Dong, Y., Wang, D. (2011) Cloud task scheduling based on load balancing ant colony optimization. Sixth Annual ChinaGrid Conference, Liaoning, China, 3–9.
Awad, A., El-Hefnawy, N., & Abdel-kader, H. (2015). Enhanced particle swarm optimization for task scheduling in cloud computing environments. Procedia Computer Science, 65, 920–929.
Omara, F. A., & Arafa, M. M. (2010). Genetic algorithms for task scheduling problem. Journal of Parallel and Distributed Computing, 70(1), 13–22. https://doi.org/10.1016/j.jpdc.2009.09.009
Nadimi-Shahraki, M. H., Fatahi, A., Zamani, H., Mirjalili, S., & Oliva, D. (2022). Hybridizing of whale and moth-flame optimization algorithms to solve diverse scales of optimal power flow problem. Electronics, 11(5), 831.
Farhat, M., Kamel, S., Atallah, A. M., & Khan, B. (2022). Developing a marine predator algorithm for optimal power flow analysis considering uncertainty of renewable energy sources. International Transactions on Electrical Energy Systems, 2022, 3714475. https://doi.org/10.1155/2022/3714475
Nadimi-Shahraki, M. H., Fatahi, A., Zamani, H., Mirjalili, S., & Abualigah, L. (2021). An improved moth-flame optimization algorithm with adaptation mechanism to solve numerical and mechanical engineering problems. Entropy, 23(12), 1637.
Yıldız, B. S., Pholdee, N., Bureerat, S., Erdaş, M. U., Yıldız, A. R., & Sait, S. M. (2021). Comparision of the political optimization algorithm, the Archimedes optimization algorithm and the levy flight algorithm for design optimization in industry. Materials Testing, 63(4), 356–359.
Sharma, S., Saha, A. K., Roy, S., Mirjalili, S., & Nama, S. (2022). A mixed sine cosine butterfly optimization algorithm for global optimization and its application. Cluster Computing, 25(6), 4573–4600. https://doi.org/10.1007/s10586-022-03649-5
Nadimi-Shahraki, M. H., & Zamani, H. (2022). DMDE: Diversity-maintained multi-trial vector differential evolution algorithm for non-decomposition large-scale global optimization. Expert Systems with Applications, 198, 116895.
Yildiz, B. S., Pholdee, N., Bureerat, S., Yildiz, A. R., & Sait, S. M. (2021). Robust design of a robot gripper mechanism using new hybrid grasshopper optimization algorithm. Expert Systems, 38(3), e12666.
Gupta, S., Abderazek, H., Yıldız, B. S., Yildiz, A. R., Mirjalili, S., & Sait, S. M. (2021). Comparison of metaheuristic optimization algorithms for solving constrained mechanical design optimization problems. Expert Systems with Applications, 183, 115351.
Yıldız, B. S., Kumar, S., Pholdee, N., Bureerat, S., Sait, S. M., & Yildiz, A. R. (2022). A new chaotic Lévy flight distribution optimization algorithm for solving constrained engineering problems. Expert Systems, 39(8), e12992.
Yıldız, B. S., Pholdee, N., Panagant, N., Bureerat, S., Yildiz, A. R., & Sait, S. M. (2022). A novel chaotic Henry gas solubility optimization algorithm for solving real-world engineering problems. Engineering with Computers, 38(2), 871–883. https://doi.org/10.1007/s00366-020-01268-5
Nadimi-Shahraki, M. H., Zamani, H., Fatahi, A., & Mirjalili, S. (2023). MFO-SFR: An enhanced moth-flame optimization algorithm using an effective stagnation finding and replacing strategy. Mathematics, 11(4), 862.
Faramarzi, A., Heidarinejad, M., Mirjalili, S., & Gandomi, A. H. (2020). Marine predators algorithm: a nature-inspired metaheuristic. Expert Systems with Applications, 152, 113377.
Tu, J., Chen, H., Wang, M., & Gandomi, A. H. (2021). The colony predation algorithm. Journal of Bionic Engineering, 18(3), 674–710.
Abualigah, L., Yousri, D., Abd Elaziz, M., Ewees, A. A., Al-Qaness, M. A., & Gandomi, A. H. (2021). Aquila optimizer: a novel meta-heuristic optimization algorithm. Computers & Industrial Engineering, 157, 107250.
Zamani, H., Nadimi-Shahraki, M. H., & Gandomi, A. H. (2022). Starling murmuration optimizer: A novel bio-inspired algorithm for global and engineering optimization. Computer Methods in Applied Mechanics and Engineering, 392, 114616.
Agushaka, J. O., Ezugwu, A. E., & Abualigah, L. (2022). Dwarf mongoose optimization algorithm. Computer Methods in Applied Mechanics and Engineering, 391, 114570.
Zamani, H., Nadimi-Shahraki, M. H., & Gandomi, A. H. (2021). QANA: Quantum-based avian navigation optimizer algorithm. Engineering Applications of Artificial Intelligence, 104, 104314.
Abdel-Basset, M., El-Shahat, D., Chakrabortty, R. K., & Ryan, M. (2021). Parameter estimation of photovoltaic models using an improved marine predators algorithm. Energy Conversion and Management, 227, 113491.
Wang, S., Jia, H., Abualigah, L., Liu, Q., & Zheng, R. (2021). An improved hybrid aquila optimizer and harris hawks algorithm for solving industrial engineering optimization problems. Processes, 9(9), 1551.
El-Kenawy, E.-S.M., Mirjalili, S., Khodadadi, N., Abdelhamid, A. A., Eid, M. M., El-Said, M., & Ibrahim, A. (2023). Feature selection in wind speed forecasting systems based on meta-heuristic optimization. PLoS ONE, 18(2), e0278491.
Kennedy, J., Eberhart, RC. (1997) A discrete binary version of the particle swarm algorithm. IEEE International Conference on Systems, Man, and Cybernetics. Computational cybernetics and simulation, Orlando, FL, USA, 1997, 4104–4108.
Rashedi, E., Nezamabadi-Pour, H., & Saryazdi, S. (2010). BGSA: Binary gravitational search algorithm. Natural Computing, 9(3), 727–745.
Gong, T., & Tuson, A. L. (2007). Differential evolution for binary encoding. In Soft computing in industrial applications (pp. 251–262). Springer, Berlin.
Nakamura, RYM., Pereira, LAM., Costa, KA., Rodrigues, D., Papa, JP., Yang, XS. (2012) BBA: A Binary Bat Algorithm for Feature Selection. 25th SIBGRAPI Conference on Graphics, Patterns and Images, Ouro Preto, Brazil, 2012, 291–297.
Emary, E., Zawbaa, H. M., & Hassanien, A. E. (2016). Binary grey wolf optimization approaches for feature selection. Neurocomputing, 172, 371–381.
Mirjalili, S. (2016). Dragonfly algorithm: a new meta-heuristic optimization technique for solving single-objective, discrete, and multi-objective problems. Neural Computing and Applications, 27(4), 1053–1073.
Faris, H., Mafarja, M. M., Heidari, A. A., Aljarah, I., Alam, A.-Z., Mirjalili, S., & Fujita, H. (2018). An efficient binary salp swarm algorithm with crossover scheme for feature selection problems. Knowledge-Based Systems, 154, 43–67.
He, Y., Xie, H., Wong, T.-L., & Wang, X. (2018). A novel binary artificial bee colony algorithm for the set-union knapsack problem. Future Generation Computer Systems, 78, 77–86.
Wolpert, D. H., & Macready, W. G. (1997). No free lunch theorems for optimization. IEEE Transactions on Evolutionary Computation, 1(1), 67–82.
Nama, S., Saha, A. K., Chakraborty, S., Gandomi, A. H., & Abualigah, L. (2023). Boosting particle swarm optimization by backtracking search algorithm for optimization problems. Swarm and Evolutionary Computation, 79, 101304.
Chakraborty, S., Saha, A. K., & Chhabra, A. (2023). Improving whale optimization algorithm with elite strategy and its application to engineering-design and cloud task scheduling problems. Cognitive Computation. https://doi.org/10.1007/s12559-022-10099-z
Bansal, P., Gehlot, K., Singhal, A., & Gupta, A. (2022). Automatic detection of osteosarcoma based on integrated features and feature selection using binary arithmetic optimization algorithm. Multimedia Tools and Applications, 81(6), 8807–8834.
Liu, Y., Heidari, A. A., Cai, Z., Liang, G., Chen, H., Pan, Z., Alsufyani, A., & Bourouis, S. (2022). Simulated annealing-based dynamic step shuffled frog leaping algorithm: optimal performance design and feature selection. Neurocomputing, 503, 325–362.
Khalid, A. M., Hamza, H. M., Mirjalili, S., & Hosny, K. M. (2022). BCOVIDOA: a novel binary coronavirus disease optimization algorithm for feature selection. Knowledge-Based Systems, 248, 108789.
Xu, Z., Heidari, A. A., Kuang, F., Khalil, A., Mafarja, M., Zhang, S., Chen, H., & Pan, Z. (2023). Enhanced Gaussian bare-bones grasshopper optimization: Mitigating the performance concerns for feature selection. Expert Systems with Applications, 212, 118642.
Nadimi-Shahraki, M. H., Fatahi, A., Zamani, H., & Mirjalili, S. (2022). Binary approaches of quantum-based avian navigation optimizer to select effective features from high-dimensional medical data. Mathematics, 10(15), 2770.
Mirjalili, S., & Lewis, A. (2013). S-shaped versus V-shaped transfer functions for binary particle swarm optimization. Swarm and Evolutionary Computation, 9, 1–14.
Mirjalili, S., Wang, G.-G., & Coelho, Ld. S. (2014). Binary optimization using hybrid particle swarm optimization and gravitational search algorithm. Neural Computing and Applications, 25(6), 1423–1435.
Too, J., Abdullah, A. R., & Mohd Saad, N. (2019). A new quadratic binary Harris hawk optimization for feature selection. Electronics, 8(10), 1130.
Kahya, M. A., Altamir, S. A., & Algamal, Z. Y. (2021). Improving whale optimization algorithm for feature selection with a time-varying transfer function. Numerical Algebra, Control & Optimization, 11(1), 87.
Liao, C.-J., Tseng, C.-T., & Luarn, P. (2007). A discrete version of particle swarm optimization for flowshop scheduling problems. Computers & Operations Research, 34(10), 3099–3111.
Marandi, A., Afshinmanesh, F., Shahabadi, M., Bahrami, F. (2006) Boolean particle swarm optimization and its application to the design of a dual-band dual-polarized planar antenna. IEEE International Conference on Evolutionary Computation, Vancouver, BC, Canada, 2006, 3212–3218.
Aslan, M., Gunduz, M., & Kiran, M. S. (2019). JayaX: jaya algorithm with xor operator for binary optimization. Applied Soft Computing, 82, 105576.
Rao, R. (2016). Jaya: A simple and new optimization algorithm for solving constrained and unconstrained optimization problems. International Journal of Industrial Engineering Computations, 7(1), 19–34.
Jordehi, A. R. (2019). Binary particle swarm optimisation with quadratic transfer function: A new binary optimisation algorithm for optimal scheduling of appliances in smart homes. Applied Soft Computing, 78, 465–480.
Sayed, G. I., Hassanien, A. E., & Azar, A. T. (2019). Feature selection via a novel chaotic crow search algorithm. Neural Computing and Applications, 31(1), 171–188.
Mirjalili, S., Zhang, H., Mirjalili, S., Chalup, S., Noman, N. (2020) A novel U-shaped transfer function for binary particle swarm optimisation. Soft Computing for Problem Solving 2019, Singapore, 241–259.
Nadimi-Shahraki, M. H., Zamani, H., Asghari Varzaneh, Z., & Mirjalili, S. (2023). A Systematic review of the whale optimization algorithm: theoretical foundation, improvements, and hybridizations. Archives of Computational Methods in Engineering. https://doi.org/10.1007/s11831-023-09928-7
Mohammadzadeh, H., & Gharehchopogh, F. S. (2021). A novel hybrid whale optimization algorithm with flower pollination algorithm for feature selection: Case study Email spam detection. Computational Intelligence, 37(1), 176–209.
Turkoglu, B., Uymaz, S. A., & Kaya, E. (2022). Binary artificial algae algorithm for feature selection. Applied Soft Computing, 120, 108630.
Piri, J., Mohapatra, P., Acharya, B., Gharehchopogh, F. S., Gerogiannis, V. C., Kanavos, A., & Manika, S. (2022). Feature selection using artificial gorilla troop optimization for biomedical data: a case analysis with COVID-19 data. Mathematics, 10(15), 2742.
Abualigah, L., & Diabat, A. (2022). Chaotic binary group search optimizer for feature selection. Expert Systems with Applications, 192, 116368.
Shaddeli, A., Soleimanian Gharehchopogh, F., Masdari, M., & Solouk, V. (2022). An improved african vulture optimization algorithm for feature selection problems and its application of sentiment analysis on movie reviews. Big Data and Cognitive Computing, 6(4), 104.
Helmi, A. M., Al-qaness, M. A., Dahou, A., & Abd Elaziz, M. (2023). Human activity recognition using marine predators algorithm with deep learning. Future Generation Computer Systems, 142, 340–350.
Al-Qaness, M. A., Helmi, A. M., Dahou, A., & Elaziz, M. A. (2022). The applications of metaheuristics for human activity recognition and fall detection using wearable sensors: A comprehensive analysis. Biosensors, 12(10), 821.
Ren, W., Bashkandi, A. H., Jahanshahi, J. A., AlHamad, A. Q. M., Javaheri, D., & Mohammadi, M. (2023). Brain tumor diagnosis using a step-by-step methodology based on courtship learning-based water strider algorithm. Biomedical Signal Processing and Control, 83, 104614.
Nadimi-Shahraki, M. H., Asghari Varzaneh, Z., Zamani, H., & Mirjalili, S. (2023). Binary starling murmuration optimizer algorithm to select effective features from medical data. Applied Sciences, 13(1), 564.
Tanabe, R., Fukunaga, A. (2013) Success-history based parameter adaptation for differential evolution. IEEE congress on evolutionary computation, Cancun, Mexico, 71–78.
Pelusi, D., Mascella, R., Tallini, L., Nayak, J., Naik, B., & Deng, Y. (2020). An improved moth-flame optimization algorithm with hybrid search phase. Knowledge-Based Systems, 191, 105277.
Gupta, S., & Deep, K. (2019). A novel random walk grey wolf optimizer. Swarm and Evolutionary Computation, 44, 101–112. https://doi.org/10.1016/j.swevo.2018.01.001
Kelly, M., Longjohn, R., Nottingham, K. The UCI machine learning repository. Retrieved April 1, 2022, from https://archive.ics.uci.edu
Friedman, M. (1940). A comparison of alternative tests of significance for the problem of m rankings. The Annals of Mathematical Statistics, 11(1), 86–92.
Kutsuna, S. (2021). Clinical manifestations of coronavirus disease 2019. JMA Journal, 4(2), 76–80.
Worldometers: COVID-19 Coronavirus Pandemic. Retrieved September 24, 2022, from https://www.worldometers.info/coronavirus/
Chowdhury, M. E., Rahman, T., Khandakar, A., Mazhar, R., Kadir, M. A., Mahbub, Z. B., Islam, K. R., Khan, M. S., Iqbal, A., & Al Emadi, N. (2020). Can AI help in screening viral and COVID-19 pneumonia? IEEE Access, 8, 132665–132676. https://doi.org/10.1109/ACCESS.2020.3010287
Snyder, S. E., & Husari, G. (2021). Thor: A deep learning approach for face mask detection to prevent the COVID-19 pandemic. SoutheastCon 2021, Atlanta, GA, USA.
Babukarthik, R., Adiga, V. A. K., Sambasivam, G., Chandramohan, D., & Amudhavel, J. (2020). Prediction of COVID-19 using genetic deep learning convolutional neural network (GDCNN). IEEE Access, 8, 177647–177666. https://doi.org/10.1109/ACCESS.2020.3025164
Chakraborty, S., Saha, A. K., Nama, S., & Debnath, S. (2021). COVID-19 X-ray image segmentation by modified whale optimization algorithm with population reduction. Computers in Biology and Medicine, 139, 104984. https://doi.org/10.1016/j.compbiomed.2021.104984
Kumar Sahoo, S., Houssein, E. H., Premkumar, M., Kumar Saha, A., & Emam, M. M. (2023). Self-adaptive moth flame optimizer combined with crossover operator and Fibonacci search strategy for COVID-19 CT image segmentation. Expert Systems with Applications, 227, 120367. https://doi.org/10.1016/j.eswa.2023.120367
Hemdan, E.E.-D., Shouman, M. A., & Karar, M. E. (2020). COVIDX-Net: a framework of deep learning classifiers to diagnose COVID-19 in X-ray images. arXiv preprint arXiv:2003.11055. https://doi.org/10.48550/arXiv.2003.11055
Novel Corona Virus 2019 Dataset, Kaggle. Retrieved September 25, 2020, from https://www.kaggle.com/datasets/sudalairajkumar/novel-corona-virus-2019-dataset
Iwendi, C., Bashir, A. K., Peshkar, A., Sujatha, R., Chatterjee, J. M., Pasupuleti, S., Mishra, R., Pillai, S., & Jo, O. (2020). COVID-19 patient health prediction using boosted random forest algorithm. Frontiers in Public Health, 8, 357.
Ethics declarations
Conflict of Interest
The authors declare that they have no conflict of interest.
Cite this article
Fatahi, A., Nadimi-Shahraki, M.H. & Zamani, H. An Improved Binary Quantum-based Avian Navigation Optimizer Algorithm to Select Effective Feature Subset from Medical Data: A COVID-19 Case Study. J Bionic Eng 21, 426–446 (2024). https://doi.org/10.1007/s42235-023-00433-y