1 Introduction

During the past years, with the increasing volume of data, the role of data mining techniques in extracting patterns and ultimately discovering hidden knowledge has become more prominent. Knowledge discovery involves searching large amounts of data to identify hidden patterns and extract relevant insights [1]. Applying data mining techniques to medical data offers increasing benefits for medical diagnosticians and can minimize the potential errors of inexperienced physicians [2]. An accurate and reliable diagnosis in the early stages of a disease can significantly affect the patient's life and expenses. Since medical datasets are often collected from multiple sources and for different purposes, they may contain many irrelevant and redundant features besides the relevant ones. Relevant features carry essential information and can be used to discover knowledge and hidden patterns. Redundant features carry information already provided by other features; as a result, they add no useful information to the prediction model and only increase the computational cost of the classification algorithm. Irrelevant features provide no useful information at all, and their presence reduces the classification accuracy of prediction models. Consequently, the existence of irrelevant and redundant features not only imposes an additional computational cost on data mining algorithms but also degrades prediction accuracy [3].

Feature Subset Selection (FSS) is a common way to overcome these challenges by selecting a set of M effective and relevant features from an original set of N features, where M < N. Generally, there are three main types of FSS techniques: filter-based, wrapper-based, and embedded-based methods [4, 5]. Without utilizing machine learning techniques, filter-based methods evaluate the relevance of features based on criteria such as consistency, correlation, similarity, distance, information, variance, and statistical measures [6]. Filter-based feature selection using ant colony optimization [7] and filter-based binary particle swarm optimization [8] are two prominent FSS algorithms that employ the filter-based approach. Although filter-based FSS techniques are generally less computationally expensive than other methods because they are independent of classification algorithms, they usually do not yield adequate performance. The wrapper-based method, popularized by Kohavi and John [9], evaluates the quality of the selected feature subset using a classification algorithm. Despite outperforming the filter-based method in accuracy, the frequent use of the classification algorithm makes it computationally more expensive and slower than the filter-based approach [10]. Embedded-based methods are another category of FSS algorithms; they perform selection during training and are specific to certain learning machines [11]. They are more efficient because they avoid retraining the predictor for each candidate subset, but they are limited to specific learning machines and can be intricate. Examples include FSS based on support vector machines [12, 13] and embedded-based genetic programming [14].

As the wrapper-based approach yields better results [15], three search strategies have been used to generate numerous FSS algorithms: sequential, exponential, and random search. In the sequential search strategy, features are progressively added or removed [16, 17], but this strategy is prone to becoming trapped in local optima [18]. The exponential search strategy guarantees finding the optimal feature subset by evaluating all possible subsets [19]; however, it is computationally costly and inapplicable to many real-world datasets [18]. The random search strategy begins with a random feature subset and then proceeds with a sequential search. Metaheuristic algorithms, a prominent class of random search methods, use stochastic techniques to solve optimization problems. They explore the search space, promote population diversity, and bypass local optima by embedding randomness into their search procedures. Additionally, they search promising regions locally to enhance solution quality [20]. Particle Swarm Optimization (PSO) [21], Genetic Algorithm (GA) [22], Differential Evolution (DE) [23], Ant Colony Optimization (ACO) [24], Artificial Bee Colony (ABC) [25], Grey Wolf Optimizer (GWO) [26], Moth-Flame Optimization (MFO) [27], Whale Optimization Algorithm (WOA) [28], and Harris Hawks Optimization (HHO) [29] are some well-known metaheuristic algorithms proposed to tackle different optimization problems such as task scheduling [30,31,32], optimal power flow [33, 34], and engineering design [35,36,37,38,39,40]. Researchers continue developing new metaheuristic algorithms to address flaws of earlier algorithms, such as a low convergence rate caused by randomization [41, 42], the lack of mechanisms to maintain population diversity, and the imbalance between search strategies [43]. For instance, the Marine Predators Algorithm (MPA) [44], Colony Predation Algorithm (CPA) [45], Aquila Optimizer (AO) [46], Starling Murmuration Optimizer (SMO) [47], Dwarf Mongoose Optimization Algorithm [48], and Quantum-based Avian Navigation Optimizer Algorithm (QANA) [49] are among the recent metaheuristic algorithms proposed for continuous problems [50, 51]. Among these, QANA is a scalable population-based metaheuristic algorithm inspired by migratory bird navigation.

Since FSS is a binary problem that deals with selecting or discarding features, researchers have developed different binary metaheuristic algorithms to solve various real-world optimization problems [52]. For instance, the Binary Particle Swarm Optimization (BPSO) [53], Binary Gravitational Search Algorithm (BGSA) [54], Binary Differential Evolution (BDE) [55], Binary Bat Algorithm (BBA) [56], Binary Grey Wolf Optimization (bGWO) [57], Binary Dragonfly Algorithm (BDA) [58], Binary Salp Swarm Algorithm (BSSA) [59], and Binary Artificial Bee Colony (BABC) [60] are well-known binary metaheuristic algorithms developed by applying different binarization methods, including transfer functions and crossover operators. Although metaheuristic algorithms have their benefits, the No-Free-Lunch (NFL) theorem [61] asserts that no individual algorithm can solve all problems [62, 63]. As a result, new binary metaheuristic algorithms for the FSS domain, such as the Binary Arithmetic Optimization Algorithm (BAOA) [64], Binary Simulated Annealing-Based Dynamic Step Shuffled Frog Leaping Algorithm (BDSSRLFLA) [65], Binary Coronavirus Disease Optimization Algorithm (BCOVIDOA) [66], binary enhanced Gaussian bare-bones grasshopper optimization [67], and Binary Quantum-based Avian Navigation Optimizer Algorithm (BQANA) [68], are still being developed.

This paper proposes IBQANA, an improved binary quantum-based avian navigation optimizer algorithm, to select effective feature subsets from different medical datasets. First, we propose a binarization method by introducing a novel Hybrid Binary Operator (HBO) to map the continuous values produced by QANA into binary solutions. This method combines a threshold approach with Boolean operators, addressing the limitations of existing binarization techniques. Consequently, the binary version of QANA, called BQANA-HBO, is developed, enabling more efficient and accurate binary solutions. Then, an improved version of BQANA-HBO, named IBQANA, is developed to tackle the issues of local optima entrapment and low convergence rate. This improvement is achieved by introducing the Distance-Based Binary Search Strategy (DBSS), which consists of exploration and exploitation phases. In the exploration phase, which mostly occurs in the early iterations, inferior search agents are updated by crossing over their positions with the position of the farthest search agent in the archive. Conversely, in the final iterations, the exploitation phase updates the inferiors' positions by crossing over with the best position. These contributions are expected to enhance the performance and effectiveness of IBQANA for FSS problems.

The efficacy of the proposed IBQANA was tested on 12 medical datasets with different numbers of features. First, the effectiveness of BQANA-HBO, developed using the introduced binarization method, is investigated and compared with BQANA developed using variable thresholding and with five representatives of 20 binary versions of QANA developed by incorporating five distinct transfer function categories: S-shaped, V-shaped, U-shaped, Z-shaped, and quadratic. The performance evaluation of IBQANA considers accuracy, fitness, and the number of selected features. To determine its competitiveness, IBQANA is compared with seven well-known algorithms commonly used in the literature for FSS, including Binary Differential Evolution (BDE) [55], Binary Bat Algorithm (BBA) [56], V-shaped Binary Particle Swarm Optimization (VPSO) [69], hybrid Binary Particle Swarm Optimization and Gravitational Search Algorithm (BPSOGSA) [70], Binary Dragonfly Algorithm (BDA) [58], Quadratic Binary Harris Hawk Optimization (QBHHO) [71], and the improved Whale Optimization Algorithm for FSS with a Time-Varying transfer function (BWOA-TV) [72]. In addition, the proposed IBQANA is compared with two other algorithms, namely the Binary Quantum-based Avian Navigation Optimizer Algorithm based on the thresholding approach (BQANA) [68] and the introduced BQANA-HBO. Moreover, a case study was conducted in which IBQANA was employed as a diagnostic tool for Coronavirus Disease 2019 (COVID-19). To demonstrate the superiority of the proposed algorithm, the findings were statistically analyzed using the Friedman test. Based on the experimental and statistical results, the proposed IBQANA performs better than the comparative algorithms in identifying relevant features from the COVID-19 dataset and 11 other medical datasets.

2 Related Works

For a dataset containing N features there are 2^N possible feature subsets, so an exhaustive evaluation of all candidates is computationally intractable and the FSS problem is NP-hard. As a result, researchers have extensively addressed the FSS problem by turning to metaheuristic algorithms, which have proven successful in approximating near-optimal solutions. In this section, we review several prominent metaheuristic algorithms used in the wrapper approach, listed in the order of their appearance in the literature.

Kennedy and Eberhart [53] proposed a Binary version of the Particle Swarm Optimization (BPSO) algorithm using a sigmoid transfer function to tackle discrete optimization problems [73]. The sigmoid and its three later variants [69] are known as S-shaped transfer functions and are frequently used in the literature. Marandi et al. [74] proposed Boolean Particle Swarm Optimization (BPSO) to solve a dual-band dual-polarized planar antenna design problem. This algorithm uses three Boolean operators, and (∧), or (∨), and xor (⊕), to generate binary solutions. Binary Differential Evolution (BDE), proposed by Gong et al. [55], utilizes discrete mutation and crossover operators to solve binary problems. Rashedi et al. [54] proposed the V-shaped transfer function and developed the Binary Gravitational Search Algorithm (BGSA) accordingly. Nakamura et al. [56] introduced the Binary Bat Algorithm (BBA) by mapping the continuous velocities of search agents into binary solutions using an S-shaped transfer function. Mirjalili et al. [69] proposed the VPSO algorithm as another binary version of PSO by applying four variants of the V-shaped transfer function family. The comparison between applying S-shaped and V-shaped transfer functions to the PSO algorithm indicates that the V-shaped transfer functions provide superior results. Mirjalili et al. [70] proposed a hybridization of the binary PSO and GSA algorithms (BPSOGSA) to solve FSS problems by applying a V-shaped transfer function. Mirjalili [58] proposed the Dragonfly Algorithm (DA), inspired by the natural swarming patterns of dragonflies; its binary version, BDA, was also developed using a V-shaped transfer function.
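For readers unfamiliar with these mappings, the sketch below illustrates how an S-shaped (sigmoid) rule and a representative V-shaped rule turn a continuous velocity into bits. The specific V-shaped function |tanh(v)| and all variable names are illustrative choices, not the exact formulas of the cited works.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def s_shaped(v):
    """Sigmoid (S-shaped) transfer function, as in the original BPSO."""
    return 1.0 / (1.0 + np.exp(-v))

def v_shaped(v):
    """A representative V-shaped transfer function, here |tanh(v)|."""
    return np.abs(np.tanh(v))

def binarize_s(velocity):
    """S-shaped rule: set each bit to 1 with probability S(v)."""
    return (rng.random(velocity.shape) < s_shaped(velocity)).astype(int)

def binarize_v(velocity, bits_current):
    """V-shaped rule: flip the current bit with probability V(v)."""
    flip = rng.random(velocity.shape) < v_shaped(velocity)
    return np.where(flip, 1 - bits_current, bits_current)

velocity = np.array([-2.0, -0.1, 0.0, 0.4, 3.0])
print(binarize_s(velocity))                           # probabilistic 0/1 vector
print(binarize_v(velocity, np.zeros(5, dtype=int)))   # bit-flip variant
```

The two rules differ in intent: the S-shaped family decides the bit value directly from the velocity, whereas the V-shaped family decides whether to flip the current bit, which is why V-shaped variants are often reported to preserve more search momentum.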

Aslan et al. [75] introduced JayaX, a binary form of the Jaya algorithm [76] that generates binary solutions using the xor (⊕) Boolean operator. The results reported in that study show that JayaX predominantly provides better solutions than the transfer-function-based binary form of the Jaya algorithm. Jordehi [77] proposed three versions of the quadratic transfer function to develop a quadratic binary PSO for scheduling shiftable appliances in smart homes. The reported results show that the binary PSO developed with the S-shaped transfer function generates better results than the V-shaped variant, and that the introduced quadratic transfer function outperforms all other binary versions of PSO developed using different transfer functions. Too et al. [71] proposed the Quadratic Binary Harris Hawk Optimization (QBHHO) by introducing a new variant of the quadratic transfer function; the findings demonstrate that it provides superior results for most of the investigated FSS problems. Sayed et al. [78] developed a Chaotic Crow Search Algorithm (CCSA) to tackle the drawbacks of the crow search algorithm in solving FSS problems, namely entrapment in local optima and low convergence speed. CCSA employs ten chaotic maps to alleviate these flaws and boost performance, and the reported findings demonstrate that the sine map improves the CSA's performance in the FSS field. The authors of [79] proposed an effective binary version of PSO using the U-shaped transfer function (UBPSO); the results demonstrate that UBPSO generates superior binary solutions compared to the S-shaped and V-shaped binary PSO.

The Whale Optimization Algorithm (WOA), proposed by Mirjalili and Lewis [28] and inspired by the hunting behavior of humpback whales, is a well-known metaheuristic algorithm that finds near-optimal solutions quickly using simple but powerful search mechanisms [80]. To address the FSS problem, Mohammadzadeh and Gharehchopogh [81] introduced a hybridization of the WOA and flower pollination algorithms based on opposition-based learning for email spam detection. In another study, Kahya et al. [72] developed a binary version of WOA, named BWOA-TV, incorporating a time-varying transfer function. Turkoglu et al. [82] proposed a binary artificial algae algorithm for FSS, inspired by the behavior of algae in nature. Piri et al. [83] developed a Discrete Artificial Gorilla Troop Optimization (DAGTO) algorithm to solve FSS problems in the healthcare sector and implemented four variants of DAGTO for different numbers and types of objective functions. Abualigah and Diabat [84] proposed a chaotic binary group search optimizer to solve the FSS problem by combining chaotic maps with a binary group search optimizer. Shaddeli et al. [85] introduced an improved African vulture optimization algorithm that alleviates convergence to local optima when solving discrete problems by hybridizing it with the SCA and applying four strategies; the authors used S-shaped and V-shaped transfer functions to map continuous solutions into binary ones.

Helmi et al. [86] proposed three binary versions of the MPA using S-shaped and V-shaped transfer functions to address the FSS problem in human activity recognition (HAR). HAR refers to identifying a person's actions based on measurements obtained from various sources, including cameras, interior sensors, radars, and wireless signals [87]. Ren et al. [88] proposed a new method for diagnosing brain tumors that employs a step-by-step process based on a deep learning-based Water Strider Algorithm (WSA); the method involves feature extraction, FSS, and classification steps, with the WSA used to select the features most relevant to classifying brain tumors. A Binary QANA (BQANA) algorithm was presented by Nadimi-Shahraki et al. [68] to solve the FSS problem on medical data using two different approaches: the first employs various transfer functions to convert the canonical QANA into a binary algorithm, while the second maps continuous solutions to binary ones by setting a variable threshold for each dimension. The results revealed that the BQANA developed using the threshold method generates better solutions than the transfer functions, and that the S4, V1, U4, Z3, and Q3 transfer functions are the representatives of their groups. In a similar study, Nadimi-Shahraki et al. [89] employed transfer functions and the variable threshold approach to develop a Binary Starling Murmuration Optimizer (BSMO) by mapping the continuous solutions of the Starling Murmuration Optimizer (SMO). As in the previous research, the results show that the variable threshold approach provides superior results.

3 Quantum-based Avian Navigation Optimizer Algorithm

The Quantum-based Avian Navigation Optimizer Algorithm (QANA), proposed by Zamani et al. [49], is a recent DE-based algorithm inspired by the remarkable accuracy of migrating birds in long-distance aerial navigation. QANA utilizes multiple operators, including population partitioning, a qubit-crossover, two mutation strategies, and self-adaptive quantum orientation, to achieve competitive results in continuous search spaces. Moreover, a V-echelon communication topology facilitates information sharing among search agents. As shown in Algorithm 1, the QANA pseudocode comprises four basic steps: initialization and multi-flock construction, movement strategy, fitness evaluation, and position updating.

3.1 Initialization and Multi-flock Construction

To distribute the entire population of birds, represented by matrix A in Eq. (1), the population is divided into k different geographical areas using random centers. This partitioning creates k flocks, each consisting of n search agents, where n = N/k. The position of each bird in the D-dimensional problem space is denoted by the vector Xi = [xi1, xi2, …, xiD].

$${\varvec{A}} = \left[ {\begin{array}{*{20}c} {x_{11} } & {x_{12} } & \cdots & {x_{1D} } \\ {x_{21} } & {x_{22} } & \cdots & {x_{2D} } \\ \vdots & \vdots & \ddots & \vdots \\ {x_{N1} } & {x_{N2} } & \cdots & {x_{ND} } \\ \end{array} } \right]$$
(1)

After the flock construction, in subsequent iterations, the search agents simultaneously use the knowledge shared through the V-echelon topology. Furthermore, following the Success-based Population Distribution (SPD) policy, the search agents traverse the search space using a mutation strategy rooted in quantum mechanics.
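A minimal sketch of this initialization and partitioning step is given below. For simplicity it shuffles the population and splits it into k equal flocks instead of using the random geographical centers of the canonical QANA, so it should be read as an approximation; all names are ours.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

def initialize_population(N, D, lb, ub):
    """Randomly scatter N search agents in the D-dimensional search space (matrix A of Eq. 1)."""
    return lb + (ub - lb) * rng.random((N, D))

def build_flocks(population, k):
    """Partition the population into k flocks of n = N/k agents each.

    The canonical QANA partitions by random geographical centers; here we
    simply shuffle and split, which preserves the equal flock sizes n = N/k.
    """
    N = population.shape[0]
    n = N // k
    order = rng.permutation(N)
    return [population[order[f * n:(f + 1) * n]] for f in range(k)]

A = initialize_population(N=20, D=10, lb=0.0, ub=1.0)
flocks = build_flocks(A, k=4)   # 4 flocks of 5 agents each
```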

3.2 V-Echelon Communication Topology

The V-echelon communication structure is patterned after the flight formation of migrating birds and allows the search agents of each flock to share the knowledge they have obtained. This communication structure consists of a Header (H) and two subsets for the Left (L) and Right (R) lines. The aerial navigation of migrating birds using the V-echelon topology is depicted in Fig. 1.

Fig. 1

The V-shaped formation

3.3 Quantum-based Navigation

Bird flocks utilize a quantum-based navigation system for exploring the search space. This system comprises a Success-based Population Distribution (SPD) policy, qubit-crossover, and two mutation strategies named "DE/quantum/I" and "DE/quantum/II". The assignment of a specific mutation strategy to a flock is based on the policy defined in Eq. (2),

$${\varvec{SR}}_{{\varvec{m}}} \left( t \right) = \left( {\left( {\mathop \sum \limits_{{i \in {\varvec{f}}_{{\varvec{m}}} }} \frac{{\mathop \sum \nolimits_{j = 1}^{n} \tau_{ij} }}{n}} \right)/\left| {{\varvec{f}}_{{\varvec{m}}} } \right|} \right) \times 100$$
(2)

where SRm denotes the success rate of mutation strategy Mm, and τij equals 1 if Mm improved the j-th agent (aj) of the i-th flock and 0 otherwise.
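The success rate of Eq. (2) can be computed directly from these improvement indicators. The following sketch uses our own naming and a hypothetical example rather than data from the paper.

```python
import numpy as np

def success_rate(tau, flock_ids):
    """Success rate SR_m of a mutation strategy (Eq. 2).

    tau       : 0/1 matrix of shape (num_flocks, n); tau[i, j] = 1 if the
                strategy improved agent j of flock i in the current iteration.
    flock_ids : indices of the flocks that used this strategy (the set f_m).
    """
    per_flock = tau[flock_ids].mean(axis=1)   # (sum_j tau_ij) / n for each flock
    return per_flock.mean() * 100.0           # average over |f_m| flocks, in percent

# Hypothetical example: the strategy was used by flocks 0 and 2, each with n = 4 agents
tau = np.array([[1, 0, 1, 1],
                [0, 0, 0, 0],
                [1, 1, 0, 0]])
print(success_rate(tau, flock_ids=[0, 2]))    # 62.5
```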

The DE/quantum/I and DE/quantum/II are two quantum mutation strategies defined in Eqs. (3) and (4), respectively. These equations use several variables: xi (t) is the position of the i-th search agent, xVechelon (t) is the position of the search agent followed by ai, and xj∈LTM (t) and xj∈STM (t) are two random positions selected from the Long-Term Memory (LTM) and the Short-Term Memory (STM), respectively. In addition, vH (t + 1) indicates the new position of the header in the V-echelon topology and is computed using Eq. (5). The quantum orientation of avian ai, denoted by Si, is also used in the computation and is presented in [49, 90]. LB and UB are the lower and upper boundaries of the search space.

$${\varvec{v}}_{{\varvec{i}}} \left( {t + 1} \right) = {\varvec{x}}_{{{\text{best}}}} \left( t \right) + {\varvec{S}}_{{\varvec{i}}} \left( t \right) \times \left( {{\varvec{x}}_{{{\mathbf{V}}_{{{\text{echelon}}}} }} \left( t \right) - {\varvec{x}}_{{{\varvec{j}} \in {\varvec{LTM}}}} \left( t \right)} \right) + {\varvec{S}}_{{\varvec{i}}} \left( t \right) \times \left( {{\varvec{x}}_{{{\mathbf{V}}_{{{\text{echelon}}}} }} \left( t \right) - {\varvec{x}}_{{{\text{best}}}} \left( t \right)} \right) + {\varvec{S}}_{{\varvec{i}}} \left( t \right) \times \left( {{\varvec{x}}_{{{\varvec{j}} \in {\varvec{LTM}}}} \left( t \right) - {\varvec{x}}_{{{\varvec{j}} \in {\varvec{STM}}}} \left( t \right)} \right)$$
(3)
$${\varvec{v}}_{{\varvec{i}}} \left( {t + 1} \right) = {\varvec{S}}_{{\varvec{i}}} \left( t \right) \times \left( {{\varvec{x}}_{{{\text{best}}}} \left( t \right) - {\varvec{x}}_{{{\mathbf{V}}_{{{\text{echelon}}}} }} \left( t \right)} \right) + {\varvec{S}}_{{\varvec{i}}} \left( t \right) \times \left( {{\varvec{x}}_{{\varvec{i}}} \left( t \right) - {\varvec{x}}_{{{\varvec{j}} \in {\varvec{LTM}}}} \left( t \right) - {\varvec{x}}_{{{\varvec{j}} \in {\varvec{STM}}}} \left( t \right)} \right)$$
(4)
$${\varvec{v}}_{{\varvec{H}}} \left( {t + 1} \right) =\, {\varvec{S}}_{{\varvec{i}}} \left( t \right) \times {\varvec{x}}_{{{\text{best}}}} + \left( {{\varvec{LB}} + \left( {{\varvec{UB}} - {\varvec{LB}}} \right) \times rand\left( {0,1} \right)} \right)$$
(5)

By crossing the mutant vector vi (t + 1) with its parent xi (t), the trial vector ui (t + 1) is generated according to Eq. (6), where ψid is the qubit-crossover probability for the d-th dimension [49].

$${\varvec{u}}_{{{\varvec{id}}}} \left( {t + 1} \right) = \left\{ {\begin{array}{*{20}c} {{\varvec{x}}_{{{\varvec{id}}}} \left( {t + 1} \right),\quad {\varvec{\psi}}_{{{\varvec{id}}}} < rand} \\ {{\varvec{v}}_{{{\varvec{id}}}} \left( {t + 1} \right),\quad {\varvec{\psi}}_{{{\varvec{id}}}} \ge rand} \\ \end{array} } \right.$$
(6)
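A dimension-wise sketch of the qubit-crossover of Eq. (6) is shown below; psi stands for the qubit-crossover probability ψid, whose derivation from the quantum orientation is given in [49] and is not reproduced here, and all names and example values are ours.

```python
import numpy as np

rng = np.random.default_rng(seed=2)

def qubit_crossover(x_parent, v_mutant, psi):
    """Build the trial vector u_i(t+1) from the parent x_i and the mutant v_i (Eq. 6).

    Dimensions where psi >= rand take the mutant value; the rest keep the parent value.
    """
    rand = rng.random(x_parent.shape)
    return np.where(psi >= rand, v_mutant, x_parent)

x = np.array([0.2, 0.7, 0.5, 0.9])
v = np.array([0.6, 0.1, 0.8, 0.3])
psi = np.array([0.9, 0.1, 0.5, 0.7])   # hypothetical qubit-crossover probabilities
u = qubit_crossover(x, v, psi)
```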
Algorithm 1 The pseudocode of the QANA

4 Improved Binary Quantum-based Avian Navigation Optimizer Algorithm (IBQANA)

As stated earlier, ineffective binarization methods, slow convergence, and local optimum entrapment are the main problems that limit the performance of most binary metaheuristic algorithms on the FSS problem. This section first introduces a novel binarization method, the Hybrid Binary Operator (HBO), to map continuous values into binary solutions effectively, and accordingly develops the binary version of QANA, named BQANA-HBO. Then, an improved version of BQANA-HBO, named IBQANA, is proposed to cope with local optimum trapping and the slow convergence rate by introducing a Distance-Based Binary Search Strategy (DBSS), which includes two different phases to adjust the positions of inferior search agents.

4.1 Hybrid Binary Operator (HBO)

As stated in Sect. 2, although the transfer function is the most popular binarization technique for metaheuristic algorithms, it cannot produce efficient binary solutions for certain metaheuristic methods [68]. On the other hand, the thresholding method presented in Eq. (7) is more effective for developing a binary version of QANA [68]; however, it has a major drawback that prevents the metaheuristic algorithm from utilizing its full potential in finding the optimal solution.

$$b_{i}^{d} \left( {t + 1} \right) = \left\{ {\begin{array}{*{20}c} {1, if x_{i}^{d} \left( t \right) > 0.5} \\ {0, if x_{i}^{d} \left( t \right) \le 0.5} \\ \end{array} } \right.$$
(7)

The issue lies in the thresholding method, which can only produce valid binary solutions for continuous values within the range of [0, 1]. Consequently, this approach fails to appropriately map continuous values outside these boundaries, such as negative values and values exceeding 1. Therefore, this section proposes a new binary method, Hybrid Binary Operator (HBO), for effectively mapping continuous solutions into binary space.

In the proposed HBO, if the continuous value produced by QANA lies in the interval [0, 1], the thresholding method is used to convert it into a binary value. However, if the generated continuous value violates the lower bound (lb) or the upper bound (ub), the binary value is updated using the logical AND (∧) operator together with the agent's binary position in the current iteration, according to Eq. (8).

$$b_{i}^{d} \left( {t + 1} \right) = \left\{ {\begin{array}{*{20}c} {ub \wedge b_{i}^{d} \left( t \right),\;if\;x_{i}^{d} \left( {t + 1} \right) > ub} \\ {lb \wedge b_{i}^{d} \left( t \right),\;if\;x_{i}^{d} \left( {t + 1} \right) < lb} \\ \end{array} } \right.$$
(8)

The procedure of the HBO to determine the binary solution for each dimension of a search agent in the BQANA-HBO algorithm is depicted in Fig. 2.

Fig. 2

Process of updating the binary positions using the HBO
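For illustration, a minimal Python sketch of this update rule is given below. It assumes the binary FSS setting with lb = 0 and ub = 1, so the AND with ub keeps the previous bit and the AND with lb clears it; the function and variable names are ours, not part of the original algorithm.

```python
import numpy as np

def hbo_binarize(x_new, b_current, lb=0, ub=1, threshold=0.5):
    """Hybrid Binary Operator: map a continuous position to a binary one.

    In-range values are thresholded (Eq. 7); out-of-range values are combined
    with the current binary position through a logical AND (Eq. 8). With
    lb = 0 and ub = 1, AND with ub keeps the previous bit and AND with lb
    clears it.
    """
    b_new = np.empty_like(b_current)
    in_range = (x_new >= lb) & (x_new <= ub)
    b_new[in_range] = (x_new[in_range] > threshold).astype(b_current.dtype)
    b_new[x_new > ub] = np.logical_and(ub, b_current[x_new > ub])
    b_new[x_new < lb] = np.logical_and(lb, b_current[x_new < lb])
    return b_new

x = np.array([0.3, 0.8, 1.7, -0.4])
b = np.array([1, 0, 1, 1])
print(hbo_binarize(x, b))   # -> [0 1 1 0]
```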

Despite the effective binary mapping provided by the HBO, the developed BQANA-HBO still cannot offer high-accuracy solutions with fast convergence. Therefore, in the following, we propose IBQANA, which enhances the solution quality and convergence rate of BQANA-HBO by introducing the DBSS.

4.2 Distance-based Binary Search Strategy (DBSS)

This section proposes IBQANA by introducing the DBSS to improve the performance of inferior search agents and speed up the convergence rate. The proposed search strategy consists of two phases, exploration and exploitation. In each iteration, if the new fitness of the i-th search agent is equal to or greater than its fitness in the current iteration, a new binary position is generated based on one of the two phases. As mentioned in Sect. 3, the QANA algorithm maintains an STM and an LTM to preserve the obtained solutions; in the proposed DBSS, these two memories are merged to form an Archive. Then, according to the value of P obtained through Eq. (9), a new binary position is generated in either the exploration or the exploitation phase,

$$P\left(t\right)=1-\frac{t}{MaxIt}$$
(9)

where t represents the present iteration, and MaxIt is the maximum number of iterations. According to this equation, the value of P gradually decreases from 1 to 0 as the number of iterations progresses.

4.2.1 Exploration Phase

If a random number drawn uniformly from (0, 1) is smaller than the P value of the current iteration, a new binary solution is generated using the uniform crossover operator according to Eq. (10),

$$b_{{{\text{new}}}}^{d} \left( {t + 1} \right) = \left\{ {\begin{array}{*{20}c} {Archive_{{{\text{far}}}}^{d} \left( t \right), if r_{1}^{d} \ge r_{2}^{d} } \\ {b_{i}^{d} \left( {t + 1} \right) , otherwise} \\ \end{array} } \right.$$
(10)

where r1 and r2 are random numbers in the range (0, 1), and Archivefar (t) indicates the binary position of the Archive member farthest from bi (t + 1), which is determined based on Eq. (11),

$$Archive_{{{\text{far}}}} \left( t \right) = \mathop {\arg \max }\limits_{{k \in \left[ {1,M} \right]}} \left( {HD_{ik} \left( t \right)} \right)$$
(11)

where HDik (t) is the Hamming distance between bi (t + 1) and the k-th Archive member, defined in Eq. (12), in which M represents the total number of positions stored in the Archive.

$$\begin{gathered} HD_{ik} \left( t \right) = \mathop \sum \limits_{d = 1}^{D} \left| {b_{i}^{d} \left( {t + 1} \right) - Archive_{k}^{d} \left( t \right)} \right| \hfill \\ i = 1, 2, \ldots ,N,\quad k = 1,2, \ldots ,M \hfill \\ \end{gathered}$$
(12)

4.2.2 Exploitation Phase

If the P value is less than or equal to a randomly generated number in the range (0, 1), the new binary solution is generated using the exploitation phase. This phase enhances solution quality without degrading the algorithm's exploration capability. To achieve this, a fitness-dependent weight factor is calculated based on Eq. (13) [91],

$$\omega \left( t \right) = \frac{{F\left( {b_{best} \left( {t + 1} \right)} \right)}}{{F\left( {b_{i} \left( {t + 1} \right)} \right)}}$$
(13)

where F(·) denotes the fitness value of a binary position. Since, according to the fitness function defined in the next section, FSS is a minimization problem, the value produced by this equation is always in the range [0, 1]. Therefore, Eq. (14) generates the new binary position during the exploitation phase.

$$b_{{{\text{new}}}}^{d} \left( {t + 1} \right) = \left\{ {\begin{array}{*{20}c} {b_{{{\text{best}}}}^{d} \left( t \right) , if \omega^{d} \ge r_{3}^{d} } \\ {b_{i}^{d} \left( {t + 1} \right), otherwise} \\ \end{array} } \right.$$
(14)

Finally, after calculating the fitness of the new position produced by either of the two phases, a greedy selection between bnew (t + 1) and bi (t + 1) determines the value of bi (t + 1). The pseudocode of the proposed DBSS is presented in Algorithm 2, and the pseudocode of the proposed Improved Binary Quantum-based Avian Navigation Optimizer Algorithm (IBQANA) is shown in Algorithm 3.

Algorithm 2 The pseudocode of the proposed DBSS
Algorithm 3 The pseudocode of the proposed IBQANA
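As an illustration only, the following sketch condenses Eqs. (9)-(14) and the greedy selection into a single update for one inferior search agent. The function and variable names are ours, and the fitness function and the Archive contents are assumed to be supplied by the surrounding BQANA-HBO loop.

```python
import numpy as np

rng = np.random.default_rng(seed=3)

def dbss_update(b_i, b_best, archive, fitness, t, max_it):
    """Distance-Based Binary Search Strategy for one inferior search agent.

    Exploration (early iterations): uniform crossover with the Archive member
    farthest from b_i in Hamming distance (Eqs. 10-12).
    Exploitation (late iterations): crossover with the best position, guided by
    the fitness-dependent weight omega (Eqs. 13-14). A greedy selection keeps
    the better of the old and new positions.
    """
    p = 1.0 - t / max_it                                   # Eq. (9): decreases from 1 to 0
    if rng.random() < p:                                   # exploration phase
        hd = np.abs(archive - b_i).sum(axis=1)             # Hamming distances, Eq. (12)
        far = archive[np.argmax(hd)]                       # farthest member, Eq. (11)
        mask = rng.random(b_i.shape) >= rng.random(b_i.shape)
        b_new = np.where(mask, far, b_i)                   # Eq. (10)
    else:                                                  # exploitation phase
        omega = fitness(b_best) / fitness(b_i)             # Eq. (13), in [0, 1]
        mask = omega >= rng.random(b_i.shape)
        b_new = np.where(mask, b_best, b_i)                # Eq. (14)
    return b_new if fitness(b_new) < fitness(b_i) else b_i  # greedy selection
```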

4.3 Computational and Time Complexity

As shown in Algorithm 3, the proposed IBQANA consists of four phases: initialization, multi-flock construction, movement, and fitness evaluation. In the initialization phase (line 2), N search agents are randomly distributed in a D-dimensional search space with a computational complexity of O(ND). In the multi-flock construction phase (line 5), k flocks with population size n (where n = N/k) are constructed with a complexity of O(kn). Then, the fitness value of each search agent is computed with a complexity of O(ND). The movement phase consists of memory construction, V-echelon topology formation, and three types of movement vectors: mutant, trial, and binary. The long-term and short-term memories with capacities K′ and K″ are constructed with a complexity of O(K′D) + O(K″D). Then, the V-echelon topology (line 8) for each of the k flocks is formed with O(kn) complexity. Finally, line 9 computes the SPD policy with O(kn) complexity to assign mutation strategies to the flocks. The mutant vectors (line 12) defined in Eqs. (3)-(5) are computed with O(ND) complexity, and the trial and binary vectors (lines 13 and 14) defined in Eqs. (6)-(8) are calculated with a complexity of O(2ND). In the fitness evaluation phase (line 15), the fitness value of each search agent is computed with O(ND) complexity, and Algorithm 2 runs with a complexity of O(ND). The algorithm repeats these phases until it reaches the maximum iteration T. Thus, the total computational complexity of IBQANA is O(ND + T(kn + ND + K′D + K″D + kn + kn + ND + 2ND + ND + ND)). Since kn = N, this simplifies to O(ND + T(3N + 5ND + K′D + K″D)), which is equal to O(TND) because N is always larger than K′ and K″.

To compute the time complexity, the Average Time (AT) required to find the solution, known as the run time, is computed using Eq. (15) [49, 92]. This equation considers the total number of runs (M) and the computational time of algorithm A for each run (RTA,i).

$${AT}_{A}=\frac{1}{M}\sum_{i=1}^{M}R{T}_{A,i}$$
(15)

Table 1 compares the run times (in seconds) of IBQANA and the comparative algorithms on the different datasets. While IBQANA is not always the fastest algorithm, it consistently achieves acceptable run times across most datasets. However, its relatively longer run times on certain datasets may limit its use in real-time applications. Nevertheless, IBQANA proves to be a promising candidate for accuracy-critical applications, particularly on medical datasets, making it a suitable choice for researchers and practitioners in this domain.

Table 1 Run times of IBQANA and comparative algorithms for different datasets

5 Experimental Assessment

This section evaluates the IBQANA algorithm's performance through experiments on the FSS problem using 12 medical datasets with various feature sizes. First, to evaluate the effectiveness of the suggested binarization technique, the Hybrid Binary Operator (HBO) is applied to develop the binary QANA (BQANA-HBO) algorithm, and its performance is compared with the binary QANA developed using the thresholding method and with representatives of 20 binary versions of QANA developed using five groups of transfer functions. In the second experiment, we evaluate the improved version of this algorithm, called IBQANA, and contrast the obtained results with those of BQANA-HBO as well as seven other comparative algorithms, including BBA [56], VPSO [69], BPSOGSA [70], QBHHO [71], BDA [58], BWOA-TV [72], and BQANA [68]. These algorithms encompass highly cited, recent, and diverse approaches, allowing for a comprehensive comparison.

5.1 Parameter Settings

To ensure accurate and fair comparisons, the parameters of the comparative algorithms were set to the values stated in their respective original publications, as provided in Table 2, while shared parameters such as the maximum number of iterations (MaxIt) and the population size (N) were set to 20 and 300, respectively, for all algorithms. All experiments, including those for the comparative algorithms, were conducted 20 times independently on a laptop with an Intel Core i7-10750H CPU and 24.0 GB of RAM using MATLAB R2022a.

Table 2 Parameters of the algorithms

In this work, the K-Nearest Neighbor (K-NN) classifier with Euclidean distance and K = 5 is used to determine the classification accuracy of produced feature subsets based on Eq. (16),

$$Accuracy=\frac{\left(TP+TN\right)}{\left(TP+FN+TN+FP\right)}$$
(16)

where TP represents the count of accurately identified positive samples, while TN refers to the count of correctly classified negative samples. FN indicates the count of positive samples incorrectly classified as negative, and FP represents the number of negative samples incorrectly classified as positive. Also, the Classification Error (CE) is calculated using Eq. (17).

$$CE=1-Accuracy$$
(17)

The goals of this study, including maximization of the classification accuracy and minimization of the number of features, are aggregated in an objective function given in Eq. (18),

$$Fitness=\alpha \times CE+(1-\alpha )\frac{{N}_{sf}}{{N}_{tf}}$$
(18)

where α denotes the significance of classification accuracy, and Ntf and Nsf are the total number of features in the dataset and the number of selected features, respectively. Since classification accuracy is the most crucial metric for medical datasets, following related studies [57, 71], we set α = 0.99.
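As a sketch of this wrapper evaluation, the following function combines Eqs. (16)-(18) with scikit-learn's K-NN classifier (K = 5, Euclidean distance) and the 10-fold cross-validation described in Sect. 5.2. The penalty returned for an empty subset and all names are our own choices, and the dataset variables X and y are placeholders.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def fitness(mask, X, y, alpha=0.99, k_neighbors=5, k_folds=10):
    """Wrapper fitness of a binary feature mask (Eqs. 16-18).

    The classification error comes from a 5-NN classifier with Euclidean
    distance evaluated by k-fold cross-validation; the second term penalizes
    the fraction of selected features.
    """
    n_selected = int(mask.sum())
    if n_selected == 0:                       # an empty subset is treated as invalid
        return 1.0
    knn = KNeighborsClassifier(n_neighbors=k_neighbors, metric="euclidean")
    accuracy = cross_val_score(knn, X[:, mask.astype(bool)], y, cv=k_folds).mean()
    classification_error = 1.0 - accuracy                                          # Eq. (17)
    return alpha * classification_error + (1 - alpha) * n_selected / X.shape[1]    # Eq. (18)
```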

5.2 Description of Medical Datasets

This research evaluates the effectiveness of IBQANA and the competing algorithms in selecting effective features using 12 medical benchmark datasets, primarily from the UCI machine learning repository [93]. Table 3 presents statistical information on the datasets under consideration. K-fold cross-validation with k = 10 was employed to prevent overfitting; this method divides the dataset into k folds, of which k − 1 are used for training and the remaining fold for testing.

Table 3 The description of the medical datasets

5.3 Evaluation of the Proposed Hybrid Binary Operator (HBO)

This section investigates the effectiveness of the introduced HBO binarization method on the developed binary QANA (BQANA-HBO) in selecting effective features from the medical datasets and compares the obtained results with those of the binary QANA using the thresholding method (BQANA) and five representatives of the S-shaped, V-shaped, U-shaped, Z-shaped, and quadratic transfer functions [68]. The results are tabulated in Table 4, where the best average fitness value achieved for each medical dataset is emphasized in bold font, and the last row shows the average Friedman test ranking of each algorithm.

Table 4 The comparison between the BQANA-HBO and representatives of each transfer function family

The results reported in Table 4 show that, in terms of average fitness value, the introduced HBO provides a superior mapping of the continuous values of QANA into binary solutions compared with the other binarization methods. The findings also reveal that the binary QANA developed using the thresholding method (BQANA) provides better solutions than the investigated transfer functions and ranks second. Among the investigated transfer functions, Q3 provides competitive results. Conversely, the S-shaped transfer function yields the poorest performance when used to convert continuous QANA values into binary solutions for selecting effective features from medical datasets.

5.4 Evaluation of the Proposed IBQANA

This section evaluates the efficacy of the proposed IBQANA on the 12 medical datasets and compares the results with those generated by seven well-established metaheuristic algorithms. The outcomes are comprehensively presented in Tables 5 and 6, which report the average (Avg), minimum (Min) or maximum (Max), and standard deviation (Std) of the fitness and classification accuracy values.

Table 5 Comparison in terms of fitness values of algorithms
Table 6 Comparison in terms of classification accuracy of algorithms

Table 5 compares the fitness values of the investigated FSS algorithms on the 12 medical datasets, where bold values represent the best average fitness achieved for each dataset. The fitness function, described in Eq. (18), considers both classification accuracy and the number of selected features. IBQANA consistently outperforms the other algorithms, achieving the lowest average fitness value and standard deviation among all competitors. Comparing IBQANA and BQANA-HBO highlights the effect of the introduced DBSS, which makes IBQANA a promising choice for FSS in medical datasets. It is also noticeable that BQANA-HBO ranks second and outperforms its competitors on one medical dataset. The achievements of IBQANA stem mainly from the effective mapping of continuous solutions to binary ones using the introduced HBO and from updating the positions of inferior search agents using the introduced DBSS.

From the results tabulated in Table 6, it can be observed that IBQANA outperforms other algorithms in terms of average classification accuracy on most datasets. For example, on the Pima dataset, IBQANA achieved an average accuracy of 77.459, which is higher than the average accuracy of other algorithms. Similar trends are observed on different datasets, such as Heart, Hepatitis, Lymphography, SPECT, WBCD, LSVT, Parkinson, Colon, and SRBCT, where IBQANA consistently demonstrates competitive or superior performance compared to other algorithms.

Figure 3 depicts the convergence behavior of IBQANA and the comparative algorithms on the 12 medical datasets. The convergence curves are plotted using the average fitness over 20 independent runs. The curves demonstrate that the proposed IBQANA converges the fastest towards optimal solutions on most datasets. Also, comparing the performance of IBQANA against the other algorithms on the Colon, Leukemia, and Prostate Tumor datasets shows that, as the number of features grows, the proposed algorithm maintains its scalability and reaches better solutions than the comparative algorithms. The ability of IBQANA to bypass local optima can be seen in the curves for the Diabetes, Lymphography, WBCD, Colon, Leukemia, and Prostate Tumor datasets. It is also noticeable that IBQANA demonstrates clear advantages over BQANA-HBO, primarily because it updates the positions of inferior search agents using the introduced DBSS; this improvement enables IBQANA to converge faster and to avoid being trapped in local optima in various scenarios.

Fig. 3

Convergence comparison of the IBQANA and comparative algorithms

Since classification accuracy is considered the most crucial criterion for medical datasets, the classification accuracy results of IBQANA and the comparative algorithms are shown as boxplots in Fig. 4. In general, the plots demonstrate that the proposed IBQANA can enhance classification accuracy by selecting effective features and discarding irrelevant ones. The plots indicate that IBQANA achieved a higher median classification accuracy for eight datasets and a higher maximum accuracy for six datasets. It can also be noticed that BQANA-HBO shows potential for increasing classification accuracy by selecting effective features at a low computational cost.

Fig. 4

Boxplot of the classification accuracy obtained by the IBQANA and comparative algorithms

Given that a primary objective of FSS is to minimize the number of selected features, Fig. 5 compares the algorithms based on the feature reduction rate achieved for each medical dataset, computed using Eq. (19), where Nsf denotes the number of features selected by the algorithm and Ntf represents the total number of features in the dataset of interest.

Fig. 5

The feature reduction rate of algorithms for each dataset

$$R(\%)=\left(1-\frac{{N}_{sf}}{{N}_{tf}}\right)\times 100$$
(19)

The results in Fig. 5 demonstrate that the BDE algorithm has the weakest performance in reducing the number of features across all 12 datasets. In contrast, QBHHO, BQANA-HBO, and IBQANA perform best in reducing the number of features. Although QBHHO usually selects fewer features than IBQANA, it should be noted that discarding effective features degrades the classifier's accuracy, which is not acceptable for medical datasets.

The nonparametric Friedman test [94] is conducted to compare and rank the performance of the algorithms in selecting effective features from the investigated datasets based on the fitness values obtained over 20 runs. The findings reported in Table 7 reveal that the proposed IBQANA ranks first for 11 datasets and third for one dataset. IBQANA also scales well, as it performs best on the high-dimensional datasets, including Parkinson, Colon, SRBCT, Leukemia, and Prostate Tumor. Furthermore, BQANA-HBO ranks first for one dataset, second for eight datasets, fourth for one dataset, and fifth for two datasets.

Table 7 The Friedman test for the fitness obtained by each algorithm
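For reference, the ranking step can be reproduced with SciPy's implementation of the Friedman test. The fitness matrix below is a hypothetical placeholder with one row per run and one column per algorithm, not the values reported in Table 7.

```python
import numpy as np
from scipy.stats import friedmanchisquare, rankdata

# Hypothetical fitness values: 20 independent runs (rows) x 3 algorithms (columns)
rng = np.random.default_rng(seed=4)
fitness_runs = rng.random((20, 3)) * np.array([0.10, 0.12, 0.15])

statistic, p_value = friedmanchisquare(*fitness_runs.T)    # test across algorithms
mean_ranks = rankdata(fitness_runs, axis=1).mean(axis=0)   # average Friedman ranks (lower is better)
print(f"chi-square = {statistic:.3f}, p = {p_value:.4f}, mean ranks = {mean_ranks}")
```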

6 Applicability of the Proposed IBQANA on a COVID-19 Case Study

COVID-19, the infectious disease caused by the SARS-CoV-2 virus, has garnered immense global attention since its initial emergence in late 2019 [95]. The virus causes severe acute respiratory syndrome, and due to its rapid spread around the globe, the World Health Organization (WHO) recognized it as a global crisis. Reportedly, COVID-19 has since infected about 619,429,000 people and killed nearly 6,537,236 worldwide [96]. Machine learning has recently emerged as an effective approach to many problems and can be used to combat COVID-19 through screening [97], monitoring [98], prediction [99], and diagnosis [83, 100, 101]. To make this possible, medical feature selection techniques extract useful features from clinical datasets, which are then used to develop diagnostic algorithms to identify COVID-19 [102].

In this section, the applicability and performance of the proposed IBQANA are evaluated on the preprocessed version [103] of the COVID-19 dataset [104], described in Table 8. The table shows that the dataset contains 13 features for predicting the patient's condition. The categorical columns in the preprocessed dataset have been converted to numerical values by assigning a unique number to each category. The experimental environment and algorithm parameters are set to the values specified in Sect. 5.

Table 8 Description of the novel coronavirus 2019 dataset

The experimental results in Fig. 6a reveal that the proposed IBQANA generates the best solutions, with the minimum fitness value among the investigated algorithms, and that BQANA-HBO attains the second rank with similar convergence behavior. It is also noticeable that IBQANA converges the fastest towards optimal solutions. The boxplot in Fig. 6b shows that BQANA-HBO provides the highest maximum classification accuracy, while IBQANA and BDE have the highest median accuracy. Finally, Fig. 6c presents each algorithm's minimum and average number of selected features, where IBQANA has the shortest bar for the average number of features, and QBHHO has the shortest bar for the minimum number of features.

Fig. 6

Comparison of the IBQANA and comparative algorithms on novel coronavirus 2019 dataset in terms of fitness value convergence (a), classification accuracy (b), and number of selected features (c)

7 Conclusion and Future Work

FSS is crucial in data analysis for identifying relevant features and reducing the computational burden. The existing literature suggests that metaheuristic algorithms can find near-optimal feature subsets quickly. However, current binary metaheuristic algorithms suffer from slow convergence and lack an effective binarization method, resulting in suboptimal solutions. This paper introduces the Improved Binary Quantum-based Avian Navigation Optimizer Algorithm (IBQANA), designed specifically for FSS on medical datasets. The algorithm employs a Hybrid Binary Operator (HBO) to effectively convert continuous values into binary solutions. In addition, a Distance-Based Binary Search Strategy (DBSS) is introduced to enhance the performance of inferior search agents and accelerate convergence. DBSS operates through a two-phase search strategy, combining exploration and exploitation phases based on an adaptive probability function, which helps the algorithm avoid local optima. Experimental evaluation on 12 medical datasets demonstrates that IBQANA outperforms seven established algorithms in terms of fitness, classification accuracy, and the number of selected features. It exhibits scalability in selecting features from high-dimensional datasets and demonstrates the fastest convergence among the tested algorithms. The application of IBQANA to COVID-19 detection underscores its practical significance and potential impact on the medical community. This research offers a promising solution to the FSS problem in medical data preprocessing, with implications for the development of diagnostic tools. Future research can explore HBO as a binarization method for other continuous metaheuristic algorithms and apply DBSS to mitigate the limitations of other binary metaheuristic algorithms, such as slow convergence and local optima trapping.