1 Introduction

Detection and classification of cancers are key issues in microarray technology [1]. The advent of DNA microarray technology helps biologists gain new insight into biological phenomena by monitoring the activity of thousands of genes (features) in a single experiment. DNA microarray technology typically generates gene expression datasets of very high dimensionality. Such a dataset consists of a huge number of genes and a limited number of samples, where the columns represent genes and the rows represent clinical patient samples [2, 3]. These data carry precious information with great potential for cancer diagnosis and classification. However, the existence of a huge number of genes is a challenging issue for the development of an efficient classifier (i.e., machine learning algorithm) [3]. To address this challenge and to improve the predictive accuracy of disease diagnosis, gene selection, also known as feature selection, a data preprocessing step in data mining, can be used to find the subset of most informative genes [4].

Conventionally, methods for gene selection are divided into three categories: the wrapper approach, the filter approach, and the embedded approach [4]. In the wrapper approach, a classifier is employed to assess the reliability of genes or gene subsets. In the filter approach, no machine learning algorithm is used to remove irrelevant and redundant features; instead, the principal characteristics of the training data are applied to rank the significance of genes or gene subsets [5], as in Minimum Redundancy Maximum Relevancy (MRMR) [6], ReliefF [7], Chi-square [8], and Kullback-Leibler [9]. Wrapper methods generally provide better results, whereas filter methods are computationally less expensive but yield lower classification accuracy [10]. The embedded method, the last category of gene selection approaches, is a hybrid of the filter and wrapper methods. Integrating the features of both approaches makes it possible to detect informative genes with high classification accuracy [11]. However, the embedded method is still in its infancy. Therefore, further investigations of both filter and wrapper methods are required to develop efficient hybrid/embedded methods for gene selection [3].

In filtering-based approaches, which are one of the core concerns of this paper, each filter relies on a different metric with different characteristics (distance, probability distribution, information theory, etc.). For a specific dataset, selecting a suitable filter is crucial for classification, since different filtering approaches lead to diverse subsets of genes [12]. Restated, the filter approach that obtains the best results for a specific dataset may not do so for another, which implies a high variability in the classification outcomes. Therefore, several filtering approaches are combined together in order to find a robust subset of genes. Examples of combined filtering-based approaches are: i) Seijo-Pardo et al. [13] studied ensembles of homogeneous and heterogeneous filter-based gene subset selection approaches; ii) Ebrahimpour and Eftekhari [14] utilized hesitant fuzzy sets to represent an ensemble of ranking algorithms and an ensemble of similarity measures for gene subset selection; iii) Bolón-Canedo et al. [12] presented an ensemble of filters and classifiers for microarray data classification.

In wrapper-based approaches, the selection of a gene subset is accomplished through two stages: searching and evaluation. In the search stage, a search-based method is utilized to generate a discriminative gene subset, which is evaluated by an efficient classifier [4]. Finding the optimal subset of genes has been shown to be an NP-hard problem [15]. Therefore, meta-heuristic approaches (i.e., nature-inspired evolutionary algorithms) have been adopted as search methods in wrapper-based approaches [16]. Several meta-heuristic approaches have been applied to solve the gene selection problem, such as Harmony Search with a Markov blanket (HSA-MB) [17], Binary Particle Swarm Optimization with a Combat Genetic Algorithm (BPSO-CGA) [18], and the Genetic Algorithm with Artificial Bee Colony [16]. However, most of these approaches still suffer from stagnation in local optima [17, 19]. Therefore, a strong search method for identifying a near-optimal subset of informative genes for accurate cancer classification is still required. One of the efficient meta-heuristic approaches recently proposed is the bat-inspired algorithm (BA) [20].

BA is a meta-heuristic swarm-based method introduced by Xin-She Yang [20] that imitates the echolocation behavior of swarming bats. BA combines a population-based algorithm with local search to empower both globally diverse exploration and locally intensive exploitation, which are the keys to the success of meta-heuristic methods. BA has further advantages over other meta-heuristic approaches, such as simplicity, flexibility, adaptability, and scalability. Therefore, BA has been successfully adapted to a wide variety of optimization problems, such as scheduling [21, 22], inverse problems and parameter estimation [23, 24], classification [25], gene selection [26], clustering [27], global optimization [28], and image processing [29]. To keep pace with the complexity of optimization problems, the theory of BA has also been tweaked, and several hybridized versions [30, 31] and modified versions [32, 33] have been produced. However, BA has not yet been properly investigated for the kind of gene selection problem addressed in the present work.

In this paper, a hybrid filter/wrapper gene selection method is proposed for cancer classification, based on an improved MRMR method and a BA algorithm [20] hybridized with \(\beta \)-hill climbing. In the proposed hybrid method, the filter approach (i.e., improved MRMR) initially works to find the highly discriminative genes. Thereafter, these genes are used by the hybrid BA (HBA), as a strong initial stage, to find the most informative subset of genes. To elaborate, the performance of the MRMR method is improved via an ensemble mechanism in which several filtering-based approaches (i.e., ReliefF, Chi-square, and Kullback-Leibler) are combined to obtain a robust gene list. This list is used by the MRMR method to improve the outcomes of the filtering stage. The hybrid version in the wrapper stage is suggested in order to improve the performance of the BA algorithm: \(\beta \)-hill climbing is hybridized with BA to empower its intensification capabilities and make it more suitable for the complexity of the gene selection search space. The proposed method is called robust MRMR with Hybrid Bat-inspired Algorithm (rMRMR-HBA). Note that an SVM classifier is employed to evaluate each candidate gene subset in HBA. Specifically, the main contributions of the proposed method are as follows:

  • A modified version of MRMR obtained through a novel hybridization strategy that combines the filtering mechanism of MRMR with an aggregate gene list obtained by an ensemble of filter methods, to select a more robust and stable gene list for high classification accuracy.

  • A new wrapper approach that embeds the hybridization of the BA algorithm with the \(\beta \)-hill climbing local search algorithm to guide the search toward more informative gene subsets.

  • An investigation of whether the proposed method (rMRMR-HBA) can obtain a gene subset with a small number of genes and better classification accuracy when compared with state-of-the-art gene selection approaches.

Extensive experiments were conducted on ten well-known microarray datasets to test the rMRMR-HBA method for the gene selection problem. These datasets differ in their numbers of genes, samples, and classes. For performance evaluation, the proposed rMRMR is initially tested against the standard MRMR and other well-regarded filtering approaches. Thereafter, HBA is evaluated by studying the convergence behavior of BA with and without \(\beta \)-hill climbing. For comparative evaluation, the results of the proposed rMRMR-HBA were compared with state-of-the-art methods using the same microarray datasets. The comparative results show that the proposed method achieved the best results on two out of ten datasets and competitive results on the others.

The remainder of this paper is organized as follows: related background is provided in Section 2. The proposed method is discussed in Section 3. The experimental results and comparative evaluation are analyzed in Section 4. Finally, Section 5 concludes the paper and provides possible future enhancements.

2 Research background

In order to make the paper self-contained, the research background is described here to provide the necessary knowledge about the research methodology. The Chi-Square, ReliefF, and Kullback-Leibler filtering methods are fully overviewed. The fundamental elements of the bat-inspired algorithm, together with its binary version, are provided. Finally, the \(\beta \)-hill climbing algorithm and the Support Vector Machine (SVM) are thoroughly illustrated.

2.1 Chi-Square filtering-based method

Chi-square is a statistical measure widely used in feature selection as a filtering-based method. In general, Chi-square evaluates each feature individually with respect to its related classes [8]. Note that the Chi-square measurement works with features of a discrete nature; therefore, continuous features should be discretized into discrete intervals. The Chi-square measurement can then determine whether the relative frequencies of the classes in adjacent intervals are similar enough to merge them. Out of the N examples, let \(N_{ij}\) be the number of samples of class \(C_{i}\) within the \(j^{th}\) interval and \(M_{j}\) be the number of samples in the \(j^{th}\) interval. The expected frequency of \(N_{ij}\) is \(E_{ij} = (M_{j} \times |C_{i}|)/N\). The Chi-square statistic of a feature is then calculated as:

$$ X^{2}=\sum\limits_{i = 1}^{c}\sum\limits_{j = 1}^{I}{\frac{(N_{ij}-E_{ij})^{2}}{E_{ij}}} $$
(1)

where I is the number of intervals and c is the number of classes. To put it clearly, this method tries to determine the type of relationship between a gene and the class label. If the gene and the class are dependent, the frequency of the gene can be used to predict the frequency of the class; the task is to select genes whose frequencies are highly dependent on the frequency of the class. If the gene and the class are independent, the observed frequency count is close to the expected frequency count, giving a small Chi-square \({X}^{2}\) score. A high \({X}^{2}\) score therefore indicates that the hypothesis of independence is incorrect. In other words, the larger the \({X}^{2}\) value, the more informative the corresponding feature is. By calculating \(X^{2}\) for every feature, the features can be further processed and ranked by their \(X^{2}\) scores.
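For illustration, a minimal Python sketch of the Chi-square score in (1) is given below; the function name and interface are ours, and the feature is assumed to be already discretized into intervals.

```python
import numpy as np

def chi_square_score(feature, labels):
    """Chi-square score of one discretized feature against class labels,
    following Eq. (1): sum over classes i and intervals j of
    (N_ij - E_ij)^2 / E_ij."""
    feature, labels = np.asarray(feature), np.asarray(labels)
    n = len(labels)
    score = 0.0
    for j in np.unique(feature):
        in_interval = feature == j
        m_j = in_interval.sum()                         # samples in interval j
        for c in np.unique(labels):
            n_ij = np.sum(in_interval & (labels == c))  # observed frequency
            e_ij = m_j * np.sum(labels == c) / n        # expected frequency
            if e_ij > 0:
                score += (n_ij - e_ij) ** 2 / e_ij
    return score
```

Genes can then be ranked in descending order of this score, with larger values indicating more informative features.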

2.2 RelifF filtering-based method

ReliefF [7] is an extension of the original Relief filtering-based algorithm [34] and is considered a distance-based measure [35]. The ReliefF algorithm works as follows: first, an instance is randomly sampled from the dataset. Second, the nearest neighbors of the sampled instance are located in both the same and the opposite classes. Finally, the relevance score of each feature is updated by comparing the feature values of these nearest neighbors with those of the sampled instance. The rationale is that a meaningful feature should distinguish between instances of different classes while having similar values for instances of the same class. ReliefF has advantages over the standard Relief: it is more robust, handles incomplete and noisy data better, can deal with multiclass problems, can be adapted to all situations, has low bias, accounts for interactions among features, and may be able to capture local dependencies that other methods cannot.
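The following is a minimal sketch of the original two-class Relief update described above (a simplification of ReliefF, which additionally uses k nearest neighbors and handles multiple classes); the function name and the Manhattan distance are our own choices.

```python
import numpy as np

def relief_scores(X, y, n_iter=100, seed=0):
    """Simplified two-class Relief: reward features that differ between a
    sampled instance and its nearest miss, penalize differences to its
    nearest hit."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_iter):
        i = rng.integers(n)
        hits = np.where(y == y[i])[0]
        hits = hits[hits != i]                      # exclude the instance itself
        misses = np.where(y != y[i])[0]
        dist = lambda idx: np.abs(X[idx] - X[i]).sum(axis=1)
        hit = hits[np.argmin(dist(hits))]           # nearest same-class neighbor
        miss = misses[np.argmin(dist(misses))]      # nearest other-class neighbor
        w += np.abs(X[miss] - X[i]) - np.abs(X[hit] - X[i])
    return w / n_iter
```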

2.3 Kullback-Liebler filtering-based method

Kullback-Leibler (KL) [9] is a probability-based measure that attempts to find features for which the distance between the class-conditional probability distributions is large. The KL divergence measures how different two probability distributions (over the same event space) are. The mutual information between X and Y can be written as the KL divergence between the joint distribution and the product of the marginals, and the KL divergence (or relative entropy) of probability distributions P and Q on a finite set X is defined as:

$$ MI(X,Y)= KL(P(X,Y)\,||\,P(X)P(Y)) $$
(2)
$$ D(P||Q)=\sum\limits_{x\in X} P(x) \log \frac{P(x)}{Q(x)} $$
(3)

If the random variables X and Y are independent, the joint probability equals the product of the marginal probabilities:

$$ P(X,Y)=P(X)P(Y) $$
(4)

Consequently, the KL divergence between the distributions will be zero. This implies that X and Y are independent: Y carries no information about X, i.e., X is not relevant to Y. Conversely, if X and Y are highly dependent and X carries information about Y, then their relevance is high.
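A short sketch of (2)-(4) on discrete distributions is given below; the small epsilon guarding zero probabilities is our own numerical safeguard.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """D(P||Q) = sum_x P(x) log(P(x)/Q(x)), Eq. (3)."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

def mutual_information(joint):
    """MI(X,Y) = KL(P(X,Y) || P(X)P(Y)), Eq. (2), from a joint table."""
    joint = np.asarray(joint, float)
    px = joint.sum(axis=1, keepdims=True)        # marginal P(X)
    py = joint.sum(axis=0, keepdims=True)        # marginal P(Y)
    return kl_divergence(joint.ravel(), (px * py).ravel())

# Independent X and Y (Eq. (4)): the joint equals the product of the
# marginals, so the divergence, and hence the mutual information, is ~0.
print(mutual_information([[0.25, 0.25], [0.25, 0.25]]))
```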

2.4 Bat-inspired algorithm

Bat-inspired algorithm (BA), introduced by Xin-She Yang in 2010 [20], emulates the echolocation behavior of bats. There are many kinds of bats; all have similar behavior when navigating and hunting, but they differ in size and weight. Microbats use echolocation to seek prey and/or avoid obstacles in complete darkness. In BA, the echolocation features of microbats are idealized according to the following rules:

  1. All bats use echolocation to sense distance and to distinguish between food/prey and background barriers in some magical way;

  2. Bats fly randomly with velocity \(V_{i}\) at position \(X_{i}\) with a fixed frequency \(f_{min}\), varying wavelength \(\lambda\) and loudness \(A_{0}\) to seek prey. They can automatically regulate the wavelength (or frequency) of their emitted pulses and adjust the rate of pulse emission \(r \in [0,1]\), depending on the closeness of their target;

  3. Although the loudness can vary in many ways, it is assumed to vary from a large (positive) \(A_{0}\) to a minimum constant value \(A_{min}\).

  • Bat Motion:

The frequency of each bat will be a positive integer or real value, depending on the selected upper and lower bounds of the frequency. The frequency value is calculated through (5). The upper and lower frequency bounds are determined based on the domain of interest.

$$\begin{array}{@{}rcl@{}} F_{i} = F_{min} + (F_{max} - F_{min}) \times \beta \end{array} $$
(5)

where \(\beta \) is a uniformly distributed random number in \([0,1]\), \(F_{max}\) is the upper bound of the frequency, and \(F_{min}\) is the lower bound of the frequency. Each bat updates its velocity according to the following equation.

$$\begin{array}{@{}rcl@{}} V_{i}(t + 1) = V_{i}(t) + (X_{i}(t)-Gbest) \times F_{i} \end{array} $$
(6)

where \(Gbest\) is the best solution found so far, \(F_{i}\) is the frequency of the \(i\)th bat, and \(X_{i}\) is the position of the \(i\)th bat. Each bat's position is updated as shown in (7).

$$\begin{array}{@{}rcl@{}} X_{i}(t + 1) = X_{i}(t) + V_{i}(t + 1) \end{array} $$
(7)

BA employs a random walk to improve its exploitation capability, as given below.

$$\begin{array}{@{}rcl@{}} x_{new} = x_{old} + \epsilon {A}_{t} \end{array} $$
(8)

where \(x_{new}\) is the new solution, \(x_{old}\) is the current solution, \(\epsilon \) is a random number in \([-1,1]\), and \(A_{t}\) is the average loudness of all bats at time step \(t\).

  • Variations of loudness and pulse rates:

Once a bat finds its prey, its loudness usually decreases and its rate of pulse emission increases. The loudness can be chosen as any value of convenience. Loudness \(A\) and pulse emission rate \(r\) are updated according to (9) and (10).

$$\begin{array}{@{}rcl@{}} A_{i}(t + 1) = \alpha A_{i}(t) \end{array} $$
(9)
$$\begin{array}{@{}rcl@{}} r_{i}(t + 1) = r_{i}{(0)} (1 - e^{(-\gamma \times t)}) \end{array} $$
(10)

where \(\alpha \) and \(\gamma \) are constant parameters that lie between 0 and 1 and are used to update the loudness \(A_{i}\) and the pulse rate \(r_{i}\), respectively, and \(r_{i}(0)\) is the initial pulse rate. The pseudo-code of the algorithm is presented below. Note that \(f(X_{i})\) is the fitness function value of \(X_{i}\).

[Algorithm: pseudo-code of the bat-inspired algorithm (BA)]
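To make the motion equations concrete, a minimal continuous-domain sketch of Eqs. (5)-(10) is shown below (minimization; the bounds, parameter values, and sphere test function are illustrative assumptions, not the settings used in this paper).

```python
import numpy as np

def bat_algorithm(f, dim=5, n_bats=20, n_iter=100,
                  fmin=0.0, fmax=2.0, alpha=0.9, gamma=0.9, seed=0):
    """Minimal continuous BA sketch following Eqs. (5)-(10); minimizes f."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(-5, 5, (n_bats, dim))   # positions
    V = np.zeros((n_bats, dim))             # velocities
    A = np.ones(n_bats)                     # loudness A_i
    r0 = rng.uniform(0, 1, n_bats)          # initial pulse rates r_i(0)
    r = np.zeros(n_bats)                    # r_i(t) starts near 0, per Eq. (10)
    fit = np.array([f(x) for x in X])
    b = fit.argmin()
    best, best_val = X[b].copy(), fit[b]
    for t in range(1, n_iter + 1):
        for i in range(n_bats):
            Fi = fmin + (fmax - fmin) * rng.uniform()          # Eq. (5)
            V[i] = V[i] + (X[i] - best) * Fi                   # Eq. (6)
            x_new = X[i] + V[i]                                # Eq. (7)
            if rng.uniform() > r[i]:                           # random walk
                x_new = best + rng.uniform(-1, 1, dim) * A.mean()  # Eq. (8)
            f_new = f(x_new)
            if rng.uniform() < A[i] and f_new <= fit[i]:       # accept move
                X[i], fit[i] = x_new, f_new
                A[i] *= alpha                                  # Eq. (9)
                r[i] = r0[i] * (1 - np.exp(-gamma * t))        # Eq. (10)
            if fit[i] < best_val:                              # track global best
                best, best_val = X[i].copy(), fit[i]
    return best, best_val

best, val = bat_algorithm(lambda x: float(np.sum(x ** 2)))  # sphere test function
```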

2.5 Binary bat-inspired algorithm

In the continuous version of BA, an artificial bat moves around the search space using position and velocity vectors within a continuous domain. In a binary search space, however, the position (solution) is a string of 0/1 bits. The position update cannot be performed using (7), because the binary search space deals only with the two values 0 and 1. Therefore, a transfer strategy must be defined that maps the velocity vector into changes of the position vector elements from 0 to 1 or vice versa. In the gene selection problem, the solution is represented as a binary vector, and the binary bat algorithm is effective in dealing with it.

Nakamura et al. [36] introduced a binary version of the bat algorithm that restricts the new bat positions to binary values using a sigmoid function:

$$\begin{array}{@{}rcl@{}} S\left( {V_{i}^{j}}\right)= \frac{1}{1+e^{-{v_{i}^{j}}}} \end{array} $$
(11)

With this transfer function, the position update in (7) can be replaced by:

$$ {x}_{i}^{j} \leftarrow \left\{\begin{array}{ll} 1, & \sigma < S\left({V_{i}^{j}}\right) \\ 0, & \text{otherwise.} \end{array}\right. $$
(12)

in which \(\sigma \sim U(0,1)\). Therefore, (12) provides only binary values for each bat's coordinates in the Boolean lattice, which represent the presence or absence of the features.
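A one-function sketch of this transfer strategy, under our own naming, is:

```python
import numpy as np

def binarize_velocity(v, rng=None):
    """Map a real-valued velocity vector to a 0/1 position using the
    sigmoid transfer of Eqs. (11)-(12): bit j is set with probability S(v_j)."""
    rng = np.random.default_rng(rng)
    s = 1.0 / (1.0 + np.exp(-np.asarray(v, float)))  # Eq. (11)
    sigma = rng.uniform(size=s.shape)                # sigma ~ U(0,1)
    return (sigma < s).astype(int)                   # Eq. (12)
```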

2.6 β-hill climbing algorithm

Hill climbing is the base algorithm of the local search-based techniques [37]. It does not involve an explorative strategy during its search process and is thus easily trapped in local optima. As a consequence, several extensions of hill climbing have been developed to handle this problem. Some extended algorithms compensate for the weakness of explorative strategies by embedding neighborhood functions that systematically define different search space regions. Other methods, such as simulated annealing (SA) [38] and great deluge (GD) [39], utilize a relaxed acceptance criterion to deal with uphill moves. Unlike these methods, \(\beta \)-hill climbing [40, 41] is a recent extension of basic hill climbing, proposed by Al-Betar in 2016, in which a new explorative operator (\(\beta \)) is utilized, based on an idea inspired by the uniform mutation operator of GA [42]. At each iteration, the current solution is moved around the search space using a neighborhood search function \(\mathcal {N}\), and an unbounded search space is explored using the \(\beta \) operator. Figure 1 shows the flowchart of the \(\beta \)-hill climbing algorithm.

Fig. 1 Flowchart of the β-hill climbing algorithm

The \(\beta \)-hill climbing starts with a random initial solution \(\textit {{x}}=(x_{1},x_{2},\ldots ,x_{N})\). At each iteration, a new solution \({x}^{\prime }=({x}^{\prime }_{1},{x}^{\prime }_{2},\ldots ,{x}^{\prime }_{N} )\) is generated using two operators: neighborhood navigation (the \(\mathcal {N}\)-operator) and the \(\beta \)-operator. In the \(\mathcal {N}\)-operator stage, a random neighboring solution of \(\textit {{x}}\) is adopted by means of the function \(improve (\mathcal {N})\) with a 'random walk' acceptance rule, which takes a single step without checking the change in the objective function, as follows:

$${x}^{\prime}_{i}=x_{i} \pm U(0,1) \times bw $$

where \(bw\) is the bandwidth of the neighborhood and \(i\) is a randomly selected variable index.

In the \(\beta \)-operator stage, with probability \(\beta \), each variable of the new solution is assigned either a value from the current solution or a random value from the available range, as follows:

$$ {x}^{\prime}_{i}\leftarrow \left\{\begin{array}{ll} x_{r}, & rnd\leq \beta \\ x_{i}, & \text{otherwise.} \end{array}\right. $$
(13)

where \(\beta \in [0,1]\), \(x_{r} \in X_{i}\) is a value drawn from the possible range of decision variable \({x}^{\prime }_{i}\), and \(rnd\) is a uniform random number between 0 and 1. Through these two operators, the convergence of \(\beta \)-hill climbing toward the optimal solution is boosted. Systematically, the \(\mathcal {N}\)-operator navigates the neighboring solutions of the current one and returns a solution with a better objective function value. With the \(\beta \)-operator, convergence is achieved by reconstructing a controlled portion of the current solution, and therefore the convergence rate can be boosted. The \(\beta \)-operator can be considered a source of exploration, while the \(\mathcal {N}\)-operator is a source of exploitation. In terms of the search space, the \(\beta \) operator is very useful in that it empowers the \(\beta \)-hill climbing algorithm to jump from one search space region to another at the same level or below. With this ability, the \(\beta \)-hill climbing algorithm can escape the trap of local minima by trying stochastic values for some decision variables.

[Algorithm: pseudo-code of the β-hill climbing algorithm]
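For illustration, a compact continuous-domain sketch of the two operators is given below (the bounds, bandwidth, and greedy acceptance rule are our assumptions; the paper later uses a binary variant).

```python
import numpy as np

def beta_hill_climbing(f, x0, bw=0.1, beta=0.3, n_iter=1000,
                       lower=-5.0, upper=5.0, seed=0):
    """Sketch of beta-hill climbing (minimization): N-operator random walk
    of bandwidth bw on one variable, then beta-operator resampling."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, float)
    fx = f(x)
    for _ in range(n_iter):
        xn = x.copy()
        i = rng.integers(len(x))
        xn[i] += rng.choice([-1.0, 1.0]) * rng.uniform() * bw   # N-operator
        mask = rng.uniform(size=len(x)) <= beta                 # beta-operator
        xn[mask] = rng.uniform(lower, upper, mask.sum())
        xn = np.clip(xn, lower, upper)
        fn = f(xn)
        if fn <= fx:                    # greedy acceptance of equal/better moves
            x, fx = xn, fn
    return x, fx
```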

2.7 Support vector machine

SVM is used as the evaluator of candidate gene subset solutions, serving as the objective function in the proposed method, as will be seen in Section 3. SVM is a supervised learning algorithm used for classification and regression [43]. It is a powerful classification algorithm due to its efficient performance in the pattern recognition domain. For example, the SVM classifier has been successfully applied to classify high-dimensional data, such as microarray gene expression data [2].

SVM constructs a hyperplane or a set of hyperplanes in a high-dimensional space, which can be utilized for classification, regression, or other tasks. It can deal with both linear and nonlinear datasets. For linear data, SVM tries to find an optimal separating hyperplane that maximizes the margin between the training examples and the class boundary. For nonlinear data, such as the microarray gene expression data used in this paper, a feature mapping function \(X\longmapsto \phi (x)\) is required. The mechanism that defines this feature mapping is called a kernel function. Common kernels include:

  1. Polynomial kernel function

    $$ k(x_{i},x_{j})=(x_{i} \cdot x_{j}+a)^{b} $$
    (14)

  2. Radial basis kernel function (RBF)

    $$ k(x_{i},x_{j})=e^{-||x_{i} - x_{j}||^{2}/2\sigma^{2}} $$
    (15)

  3. Sigmoidal kernel function

    $$ k(x_{i},x_{j})= \tanh(a x_{i} \cdot x_{j}-b) $$
    (16)

    where \(x_{i}\) and \(x_{j}\) are vectors, and a and b are parameters that define the kernel's behavior. The polynomial, RBF, and sigmoidal kernels are nonlinear. The generation of a predictive model for cancer classification using microarray datasets is considered a nonlinear classification task [16]; in this paper, the RBF kernel is therefore adopted for cancer classification.
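As an illustration of this setup, the following scikit-learn sketch trains an RBF-kernel SVM with a grid search over C and gamma, as described later in Section 4; the data here are random stand-ins for a real expression matrix, and the grid values are assumptions.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.svm import SVC

# Hypothetical expression matrix: 60 samples x 50 pre-selected genes.
rng = np.random.default_rng(0)
X = rng.random((60, 50))
y = rng.integers(0, 2, 60)

# Grid-search C and gamma for the RBF kernel, then estimate accuracy
# by 10-fold cross-validation.
grid = GridSearchCV(SVC(kernel='rbf'),
                    {'C': [1, 10, 100], 'gamma': [1e-3, 1e-2, 1e-1]},
                    cv=5)
grid.fit(X, y)
acc = cross_val_score(grid.best_estimator_, X, y, cv=10).mean()
```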

3 Proposed method for gene selection

The proposed method for gene selection has two stages, namely the filter approach stage and the wrapper approach stage, as shown in Fig. 2. Each stage is thoroughly discussed in this section.

Fig. 2 Flowchart of the proposed approach for gene selection

3.1 Stage I: Filter approach

In this stage, the Minimum Redundancy Maximum Relevancy (MRMR) filter approach [44] is modified by hybridizing its filtering process with an ensemble of different filter methods. The original MRMR is defined in Section 3.1.1, and the modified version of MRMR is proposed in Section 3.1.2.

3.1.1 Minimum redundancy maximum relevancy (MRMR)

The minimum redundancy maximum relevancy (MRMR) filter approach was developed by Hanchuan Peng [6]. MRMR tries to find the features most relevant to the class label while minimizing the redundancy among the features themselves [44]. This filtering process reveals the features that have maximum relevancy and minimum redundancy. To quantify both relevancy and redundancy, mutual information is used to estimate the mutual dependency of two variables. Mutual information is defined in (17).

$$ I(x_{i};c)=\iint P(x_{i},c) \, \log\frac{P(x_{i},c)}{P(x_{i})P(c)} \, dx_{i} \, dc $$
(17)

where \(I(x_{i};c)\) is the mutual information between feature \(x_{i}\) and class label c, \(P(x_{i})\) and \(P(c)\) are the marginal probability functions, and \(P(x_{i}, c)\) is the joint probability distribution. The mutual information of two completely independent random variables is 0 [45].

Given \(x_{i}\), representing feature i, and the class label c, the Maximum-Relevance criterion selects the top m features in descending order of \(I(x_{i};c)\), i.e., the m individual features most relevant to the class labels:

$$ \max_{S} \frac{1}{|S|} \sum\limits_{x_{i}\in S} I(x_{i};c) $$
(18)

To eliminate the redundancy among features, a Minimum-Redundancy criterion is defined as:

$$ \min_{S} \frac{1}{|S|^{2}} \sum\limits_{x_{i},x_{j}\in S} I(x_{i};x_{j}) $$
(19)

A sequential, incremental search algorithm can be used to solve the simultaneous optimization of criteria (18) and (19). Suppose we already have \(S_{m-1}\), the feature set with m-1 features; the task is to select the \(m\)-th feature from the set \(G-S_{m-1}\). This feature is chosen by maximizing the single-variable relevance minus the redundancy function, as follows:

$$ \max_{x_{i}\in G-S_{m-1}}\left( I(x_{i};c)-\frac{1}{m-1} \sum\limits_{x_{j}\in S_{m-1}}I(x_{i};x_{j})\right) $$
(20)

The method utilizes a series of intuitive measures of relevance and redundancy to pick out promising features of both continuous and discrete datasets, as shown in (17), (18) and (19).
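A compact sketch of this incremental selection, assuming the mutual-information terms of (17) have been precomputed, is:

```python
import numpy as np

def mrmr_select(mi_fc, mi_ff, m):
    """Greedy MRMR per Eq. (20).

    mi_fc: (n_genes,) mutual information I(x_i; c) with the class.
    mi_ff: (n_genes, n_genes) pairwise mutual information I(x_i; x_j).
    m:     number of genes to select.
    """
    selected = [int(np.argmax(mi_fc))]        # max-relevance seed
    candidates = set(range(len(mi_fc))) - set(selected)
    while len(selected) < m:
        def score(i):
            redundancy = mi_ff[i, selected].mean()
            return mi_fc[i] - redundancy      # relevance minus redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected
```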

3.1.2 Modified MRMR

The main work in this stage is to modify MRMR to overcome the weakness induced by single filter-based approaches (as mentioned in Section 1) and thus improve the robustness and stability of MRMR. MRMR is modified via a novel hybridization strategy that combines the filtering mechanism of MRMR with an aggregate gene list obtained by an ensemble of filter methods. The benefit of ensembling filters is to introduce diversity and reduce the variability associated with single filter methods [46]. In this manner, the hybridization of the filtering mechanism of MRMR with an ensemble of filters may improve its robustness and stability. The modified version of MRMR is called rMRMR; its pseudo-code is given in Algorithm 3.

Among the wide range of filters available in the literature, three filters were ensembled according to previous studies [14, 35], each of which uses a different metric. Thus, the ensemble method is a combination of the following filters: ReliefF, Chi-Square, and Kullback-Leibler. Each individual filter method is executed, and the ranks/scores of the genes are aggregated by taking the mean of the scores over the ranking lists obtained from the individual filter methods. The resulting ranking list and gene scores \(R(G_{i})\) are incorporated into the filtering process of MRMR as follows. Firstly, the ranking list is used to formulate the dimensionality of the search space used by MRMR; the main merit of this incorporation is to empower the outcome of MRMR by initiating its process with the highly informative genes produced by the ensemble method. Secondly, while MRMR computes its gene scores, the ensemble scores of the genes are used in the relevancy computation, as shown in (21). This bases the calculation of each individual gene score on several different metrics, which avoids the bias of a single metric. In doing so, the robustness and stability of MRMR are enhanced.

$$ Relevancy= I(G_{i},c) \ast R(G_{i}) $$
(21)

Lastly, in this stage, the SVM classifier is used to calculate the classification accuracy of the gene list resulting from rMRMR. To further reduce the gene subset and identify a subset of informative genes, the hybrid of BA with \(\beta \)-hill climbing, combined with the SVM classifier, searches for a better gene subset in the wrapper stage.
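The two incorporation steps can be sketched as follows; the mean aggregation follows the description above, while the min-max normalization that makes the three filter scores comparable is our own assumption.

```python
import numpy as np

def ensemble_scores(relieff, chi2, kl):
    """Aggregate per-gene scores from the three filters by their mean,
    after min-max normalization so the different metrics are comparable
    (the normalization step is an assumption of this sketch)."""
    norm = lambda s: (s - s.min()) / (s.max() - s.min() + 1e-12)
    return (norm(np.asarray(relieff)) + norm(np.asarray(chi2))
            + norm(np.asarray(kl))) / 3.0

def rmrmr_relevancy(mi_fc, R):
    """Eq. (21): weight the MRMR relevancy I(G_i, c) by the ensemble
    score R(G_i) of each gene."""
    return np.asarray(mi_fc) * np.asarray(R)
```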

3.2 Stage II: Wrapper approach

In this stage, we propose a hybridized version of the bat-inspired algorithm as a global optimizer with \(\beta \)-hill climbing as a local optimizer. Since \(\beta \)-hill climbing is a local search algorithm that deals with a single solution at a time, it can easily stagnate in local optima. On the other hand, BA, as a population-based algorithm, deals with multiple solutions at the same time; however, it performs a wide search without concentrating on a single area of the search space. Therefore, complementing their capabilities in a single hybrid optimization framework can produce an efficient search method. The Hybrid Bat-inspired Algorithm (HBA) is proposed to select the most informative subset of genes from the top-ranked genes obtained by rMRMR. As the evaluator of candidate gene subsets, the SVM classifier is iteratively used inside the fitness function. The goal of our proposed method is to maximize classification accuracy and minimize the size of the gene subset. The pseudo-code of the proposed method is shown in Algorithm 4. The components and processes of the proposed method in Stage II are illustrated in the following sections:

[Algorithm 3: pseudo-code of the proposed rMRMR]

3.2.1 Solution representation

Gene subset selection can be expressed as a combinatorial optimization problem, in which the search space involves the set of all possible subsets [47, 48]. The problem is known to be NP-hard [49], with a highly combinatorial search space. The number of solutions in the search space increases exponentially with the number of genes: there are \(2^{N}\) possible subsets of genes, where N is the number of genes.

The complete set of genes (i.e., solution \(\textit {{x}}\)) is represented by a binary string of length N, \(\textit {{x}} = (x_{1}, x_{2},{\ldots } , x_{N} )\), where a bit \(x_{i}\) in the solution is set to 1 if the corresponding gene is preserved and set to 0 if it is to be discarded. N is the original number of all genes.

3.2.2 Fitness function

HBA initially starts with a set of generated bats (solutions). Each bat is a string of 0/1 bits, where a bit value of 1 indicates that the gene is selected and 0 that it is discarded. Each bat is evaluated during the search using (22), where the evaluation function combines classification accuracy and gene subset length [50].

$$ fitness = \alpha \times \gamma_{R}(D) + \beta \times \frac{|C|-|R|}{|C|} $$
(22)

where \(\gamma_{R}(D)\) is the average classification accuracy obtained by performing ten-fold cross-validation with the SVM classifier on the training dataset, using the gene set R chosen from each bat in the population to predict decision D. \(|R|\) is the length of the selected gene subset, and \(|C|\) is the total number of genes. \(\alpha \) and \(\beta \) are two parameters weighting the significance of classification quality and subset length, with \(\alpha \in [0,1]\) and \(\beta = 1-\alpha \). Classification accuracy is considered more important than subset length. The quality of each bat is gauged by this objective function. In this paper, \(\alpha \) is set to 0.8 [3, 50, 51].
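A sketch of the fitness evaluation in (22), using scikit-learn's SVM and cross-validation in place of the paper's LIBSVM setup, is:

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

def fitness(mask, X, y, alpha=0.8):
    """Fitness of a binary gene mask per Eq. (22):
    alpha * CV accuracy + (1 - alpha) * fraction of genes discarded."""
    mask = np.asarray(mask)
    if mask.sum() == 0:
        return 0.0                       # empty subsets are worthless
    acc = cross_val_score(SVC(kernel='rbf'),
                          X[:, mask == 1], y, cv=10).mean()
    reduction = (len(mask) - mask.sum()) / len(mask)
    return alpha * acc + (1 - alpha) * reduction
```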

3.2.3 Hybrid bat-inspired and β-hill climbing algorithm (HBA)

In this stage, the BA algorithm is hybridized with \(\beta \)-hill climbing (\(\beta \)HC) to empower its exploitation capabilities; we name the resulting method the Hybrid Bat-inspired Algorithm (HBA). HBA is employed as a gene subset selection method to find a near-optimal gene subset from the reduced set generated by the filter approach stage. Since the gene selection problem is binary in nature, the binary version of BA introduced by Nakamura et al. [36], illustrated in Section 2.5, is used. The pseudo-code of the proposed HBA algorithm is presented in Algorithm 4.

After the bat population is initialized and each candidate solution is evaluated, BA iterates toward the optimal solution by generating new solutions according to the BA operators, as shown in (5), (6), and (7) and discussed in Sections 2.4 and 2.5. With a probability governed by the pulse rate \(r_{i}\) (Line 7 in Algorithm 4), instead of the local search strategy of the basic BA algorithm, we propose to use \(\beta \)HC to provide efficient exploitation around the current best solution (Lines 10 to 21 in Algorithm 4). In other words, \(\beta \)HC iteratively generates a new solution around the current best solution using two operators: neighborhood navigation (the \(\mathcal {N}\)-operator) and the \(\beta \)-operator. In the \(\mathcal {N}\)-operator, a flip-flop process is used, while in the \(\beta \)-operator a random digit (0 or 1) replaces the original one with a probability given by the \(\beta \) parameter, where \(\beta \in [0,1]\).

During the search process, the function \(improve(\mathcal {N}(X))\) (Line 11 in Algorithm 4) explores the neighborhood solutions \(X^{\prime }\) of the current solution X using the flip-flop process: a random variable is selected from X and its value is flipped from 0 to 1 or vice versa. In the \(\beta \)-operator (Line 13 in Algorithm 4), a new solution \(X^{\prime \prime }\) is generated from the current solution \({X}^{\prime }\) as follows: with probability \(\beta \), any variable in the new solution \({X}^{\prime \prime }\) is assigned a value of 0 or 1, where the function rnd generates a uniform random number in the range \([0,1]\). The new solution \(X^{\prime \prime }\) is accepted if it has an equal or higher fitness value, i.e., \(f({X}^{\prime \prime }) \geq f(X)\). The search process is repeated until the maximum number of generations is reached.
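One such \(\beta \)HC move on a binary gene mask can be sketched as follows (maximization, since the fitness of (22) is to be maximized; the function name is ours).

```python
import numpy as np

def beta_hc_step(x, f, beta=0.05, rng=None):
    """One binary beta-hill-climbing move used inside HBA:
    flip-flop N-operator, then beta-operator, with greedy acceptance."""
    rng = np.random.default_rng(rng)
    x1 = x.copy()
    i = rng.integers(len(x))
    x1[i] = 1 - x1[i]                        # N-operator: flip one bit
    x2 = x1.copy()
    mask = rng.uniform(size=len(x)) <= beta  # beta-operator: random bits
    x2[mask] = rng.integers(0, 2, mask.sum())
    return x2 if f(x2) >= f(x) else x        # keep equal-or-better moves
```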

Algorithm 4, Line 23 shows the loudness process. If the loudness \(A_{i}\) is higher than a random number generated by rnd, where rnd \(\in [0,1]\), a second condition is checked: if the fitness of the new solution is better than that of the best solution \(X^{*}\), then the new solution \({X}^{\prime \prime }\) is included in the bat population and the current solution X is excluded from it. In addition, the loudness is decreased and the rate of pulse emission is increased. Meanwhile, the best bat location is iteratively updated, as pseudo-coded in Algorithm 4, Line 27. Finally, when the termination condition is satisfied (Lines 10 to 28 in Algorithm 4), the best predictive gene subset, which has the highest fitness value, is returned.

[Algorithm 4: pseudo-code of the proposed HBA]

4 Experimental setup & results

In this section, the performance of the proposed method is evaluated using benchmark microarray datasets from the Kent Ridge Biomedical Dataset Repository. These datasets are detailed in Section 4.1. The filter and wrapper approaches were implemented in two languages (Java and Matlab). In the filter stage, ReliefF, Chi-Square, and Kullback-Leibler were implemented in Java using the Weka tool [52], while rMRMR was implemented in Matlab. In the wrapper stage, BA, \(\beta \)-hill climbing, and SVM were implemented in Java. The SVM used in this method is based on LIBSVM [53], with the RBF kernel adopted for the SVM classifier. As is conventional, a grid search algorithm was used for tuning the parameters of the SVM classifier [54]. The experiments were performed on an Intel Core Quad 2.66 GHz CPU with 4 GB of RAM under Microsoft Windows 7.

To recap, the proposed method has two stages: filter and wrapper. In Stage I (the filter approach), rMRMR is proposed to improve the performance of classical MRMR by combining the capabilities of three other filtering approaches (ReliefF, Chi-Square, and Kullback-Leibler) with the framework of MRMR. The effect of rMRMR on classification accuracy is studied in depth in Section 4.2, where the proposed rMRMR is compared against MRMR and other well-regarded filtering approaches. In Stage II (the wrapper approach), \(\beta \)HC is hybridized with BA to improve its exploitation. The effect of \(\beta \)HC on the convergence behavior of HBA is studied in Section 4.3, where BA is compared with and without \(\beta \)HC. Moreover, rMRMR with a Genetic Algorithm [55] (rMRMR-GA) and rMRMR with Particle Swarm Optimization [56] (rMRMR-PSO) were re-implemented for the gene selection problem, and their performance is compared with that of the proposed method (rMRMR-HBA). Finally, the comparative evaluation between the proposed rMRMR-HBA and other methods using the same datasets is fully discussed in Section 4.4.

For Stage I (the filtering approach), SVM builds a classification model using the top-ranked 50 genes generated by rMRMR, MRMR, ReliefF, Chi-Square, Kullback-Leibler, and the ensemble method (i.e., ReliefF, Chi-Square, and Kullback-Leibler). Note that using the top-ranked 50 genes for SVM follows previous studies [14, 17, 55, 57]. Subsequently, the highly ranked genes resulting from the filtering stage are passed to the wrapper stage. It is worth recalling that in the wrapper approach, SVM is used to validate and assess each candidate gene subset generated by the HBA algorithm. In the tables below (Tables 5 and 6), the results are evaluated based on three measurements: classification accuracy, predictive gene subset length, and fitness value; in the comparative table (Table 6), only classification accuracy and predictive gene subset length are used, as done previously in the literature [2, 17]. Classification accuracy is calculated by (23), where gene subset length is the number of selected genes.

$$ Classification\ Accuracy= \frac{\#\ of\ correctly\ predicted\ instances}{\#\ of\ instances} $$
(23)

Table 1 shows the parameter values of the proposed method used in the experiments. These parameter values were carefully selected based on preliminary experiments and on the parameter settings concluded by other studies using BA and \(\beta \)HC [36, 40]. Note that the number of iterations is 100 for both algorithms.

Table 1 Parameter setting of the proposed method

4.1 Dataset used

To evaluate the proposed method, we carried out our experiments using ten de facto standard gene expression datasets [58, 59]. The datasets and their characteristics are summarized in Table 2. In the table, each row corresponds to one dataset. For each row, four values are presented: #Genes, the number of genes in the dataset; #Samples, the number of patient samples selected for testing; #Classes, the corresponding disease statuses; and, in the last column, a description of each cancer disease.

Table 2 Datasets characteristic

4.2 Effect of rMRMR in the proposed method

In order to study the effect of the proposed rMRMR on the performance of the classification process, the classification behavior of rMRMR is compared with the basic MRMR, ReliefF, Chi-Square, and Kullback-Leibler, each working individually, and with an ensemble of filters (i.e., ReliefF, Chi-Square, and Kullback-Leibler). For comparison, we performed 10-fold cross-validation to estimate the classification accuracy of all compared methods.

Table 3 summarizes the results of all compared methods in terms of average classification accuracy along with standard deviation, represented as (average \(\pm \) standard deviation). The best results are highlighted in bold font. It can be noted that rMRMR obtains the best recorded results on four of the 10 datasets. Furthermore, it matches the best scores obtained by the other methods on four datasets (i.e., ALL_AML, Lymphoma, Ovarian, and SRBCT), and obtains the second-best and fourth-best results on the MLL and Colon datasets, respectively.

Table 3 The classification accuracy of the proposed rMRMR as compared to the other gene selection methods by 10-fold cross validation on 10 microarray datasets

In terms of the average over all datasets (AVG), shown in the last row of Table 3, the proposed method clearly achieves the best overall result. Therefore, rMRMR is effective in addressing problems with large numbers of genes.

In addition, rMRMR and the other methods are compared using the win/tie/loss (WTL) counts shown in the last row of Table 3. The WTL row denotes the number of times each method wins against, ties with, or loses to rMRMR. For example, the entry "2/4/4" in Table 3 indicates that MRMR wins on only two datasets, ties on four, and loses on four in comparison with rMRMR. Notably, the proposed method outperforms the other filter-based gene selection methods on almost all tested datasets. In a nutshell, the proposed filter-based method achieves a promising improvement in classifier performance. These outstanding results are owed to the ensemble filter component, which empowers the robustness and stability of MRMR.

4.3 Effect of βHC on the performance of HBA

In this section, the effect of \(\beta \)HC on the convergence behavior of the proposed method is studied. Note that the top-ranked 50 genes obtained in Stage I (the filtering approach) are used in HBA in order to cut the search space down to the most informative genes. BA with \(\beta \)HC (i.e., HBA), BA without \(\beta \)HC (i.e., BA), GA, and PSO are compared. Each method was executed for 30 independent runs. Recall that Table 1 shows the parameter settings of HBA. The experimental results are summarized in Table 4 in terms of \(|\#G|\), ACC, \(|F|\), T, and 'T-sig', which represent, respectively, the average number of selected genes, the average classification accuracy, the average fitness value introduced in (22), the average execution time (minutes), and the Wilcoxon signed-rank statistical test showing whether there exists a statistically significant difference between HBA and the other algorithms. The best results for \(|\#G|\), ACC, T, and \(|F|\) are marked in bold font.

Table 4 Comparison of the gene selection methods on ten selected datasets

In the 'T-sig' row of Table 4, at the significance level \(\alpha \leq 0.05\), '∗' denotes that the results of HBA are significantly better than those of the corresponding algorithm, while '−' denotes that they are not. The Wilcoxon signed-rank test is performed on the fitness value, since it is composed of the classification accuracy and the number of selected genes. As Table 4 shows, HBA obtains higher classification accuracy than BA and the other algorithms on four datasets (Breast, Colon, ALL_AML_4c, and CNS) and the same classification accuracy as BA and the other algorithms on six datasets (MLL, ALL_AML, ALL_AML_3c, Lymphoma, Ovarian, and SRBCT). In terms of the trade-off between classification accuracy and number of selected genes, HBA achieves a better balance than BA and the other algorithms on all datasets except 'Colon', where HBA achieves higher classification accuracy but with a slightly larger number of selected genes than BA.

In terms of fitness, HBA exhibits a higher fitness value (|F|) than the other algorithms on all datasets. Although execution time (T) is not a major factor in the gene selection domain, since the task is not real-time, the average execution time over the 30 runs of each version of the proposed method, together with those recorded by GA and PSO, is summarized in Table 4. The times recorded by HBA and BA are reasonably acceptable. GA has a lower T than PSO, BA, and HBA. Notably, HBA records the highest T, since it is a hybrid version and therefore takes longer to execute. Further, a significant difference exists between the results of HBA and those of the other algorithms: HBA achieves significantly better results than BA and the other algorithms on nine datasets, and is not significantly better than BA on only one dataset (Ovarian). In sum, the experimental results show that HBA yields a smaller set of reliable genes with higher or equivalent classification accuracy compared with BA and the other algorithms on almost all datasets. These fruitful results are owed to the integration of the \(\beta \)-hill climbing algorithm with BA, which significantly improves the local search capability of BA and enhances its exploitation. Thereby, HBA is capable of eliminating irrelevant and redundant genes, which empowers it to identify a smaller set of reliable genes and to obtain competitive or higher classification accuracy.

The convergence behavior of HBA against BA is drawn in Fig. 3 for all experimental datasets. Notably, in all plots, the convergence rate of HBA is superior to that of BA.

Fig. 3 The convergence behavior of BA and HBA for 10 datasets

4.4 Comparative evaluation

To further assess the effectiveness of the proposed method, rMRMR-HBA was compared with state-of-the-art methods, listed in Table 5. Table 6 shows the average criteria over multiple independent runs for each method, including the average classification accuracy (ACC) and the number of selected genes (#G). The best results are highlighted in bold.

Table 5 Key to comparative methods
Table 6 Results of comparison between rMRMR-HBA and the state-of-the-art methods

5 Conclusion and future work

In this paper, an efficient method combining filter and wrapper approaches, namely rMRMR, HBA, and the SVM classifier, was proposed to solve the gene selection problem. In the filter stage, the rMRMR method is proposed, combining an ensemble of filter methods (i.e., ReliefF, Chi-Square, and Kullback-Leibler) with MRMR to improve its robustness and stability. This combination works as follows: firstly, the ensemble methods generate a single gene ranking list by aggregating the results of each individual filter, and this ranked list is subsequently used by MRMR; secondly, the ensemble scores of the genes are coupled with the calculation of the relevancy score in MRMR. In the wrapper stage, we propose a hybrid of the BA algorithm with the β-hill climbing algorithm, integrated with SVM, to seek the optimal gene subset from the top informative genes obtained by the rMRMR method. This hybridization (HBA) is proposed to strike the right balance between exploration and exploitation capabilities and to improve the local exploitation aspects of BA.

Extensive experiments were conducted on ten microarray benchmark datasets to test the performance of the proposed method. These datasets are high-dimensional and highly complex. The evaluation process involved three stages: (i) evaluating the filtering-based approach, (ii) evaluating the wrapper-based approach, and (iii) evaluating the proposed rMRMR-HBA method. In the first stage, the proposed rMRMR was evaluated against the original MRMR and other classical filtering-based approaches; the results show that rMRMR outperforms the other filtering-based approaches on seven out of ten datasets. In the second stage, the wrapper-based approach (HBA) was evaluated by comparing the bat-inspired algorithm with and without the β-hill climbing algorithm, as well as against other well-known methods (GA and PSO). The evaluation was conducted on four main criteria: number of selected genes, classification accuracy, execution time, and fitness value. The proposed HBA exhibits a better compromise between the number of selected genes and classification accuracy than BA on all datasets except 'Colon', where HBA achieved higher classification accuracy but with a slightly larger number of selected genes than BA. In terms of fitness, HBA obtained a higher fitness value (|F|) than BA on all datasets.

In the third evaluation stage, a comparison with state-of-the-art methods shows that our proposed method obtained equivalent or higher classification accuracy than its competitors on nine out of ten datasets. Furthermore, it obtained the best overall results on two out of ten datasets (ALL_AML_3c and Ovarian) and competitive results on the remaining datasets in terms of number of genes and classification accuracy. This finding shows that the rMRMR-HBA approach is capable of effectively eliminating irrelevant and redundant genes, thanks to the integration of the rMRMR filter and the HBA wrapper, resulting in the identification of a small set of informative genes. We hope these achievements inspire other researchers to build on rMRMR-HBA, which offers many capabilities for the gene selection domain.

As the proposed rMRMR-HBA revealed very successful outcomes for gene selection, future work should improve the performance of individual filter-based gene selection methods (e.g., MRMR) by means of homogeneous/heterogeneous ensembles of filtering-based gene selection methods. In addition, rMRMR-HBA can be extended to other high-dimensional datasets, including image and text mining data.