1 Introduction

In machine learning, classification is normally known as a supervised learning mechanism in which the initial input data is analyzed and used to teach the system how to classify future data or observations (Brownlee 2016). Therefore, the classification process is considered an important task in data mining. In simple terms, classification identifies a set of data objects that resemble each other in the same dataset and are different from the objects in other datasets. In order to tackle a classification problem, a training set (input dataset) is required that contains example records with a number of attributes. The classification algorithm uses the training dataset to construct a model, and that model can then be used to allocate unclassified records to one of the defined classes (Alweshah 2018; Alweshah and Abdullah 2015). Examples of classification problems are speech recognition (Juang et al. 1997), handwriting recognition (LeCun et al. 1990), biometric identification (Yampolskiy and Govindaraju 2008), document classification (Alweshah et al. 2017b; Yan et al. 2018), image and video classification (Yan et al. 2019a, b), and many others, as reported in the latest comprehensive survey (Alweshah et al. 2017a).

Many techniques have been used to solve classification problems. Examples include the support vector machine (SVM) (Adankon and Cheriet 2009), the neural network (NN) (Specht 1991), the radial basis function (RBF) network (Craddock and Warwick 1996), naïve Bayes (NB) (Friedman et al. 1997) and many others (Alshareef et al. 2015b; Faris et al. 2016; Aljarah et al. 2018; Khan et al. 2019; Anagaw and Chang 2019; Boveiri et al. 2019). The NN is a common technique for solving classification problems, and there are different types of NN, including the feed-forward neural network and the multilayer perceptron (MLP), as well as the radial basis function (RBF) network, modular networks and the probabilistic neural network (PNN) (Specht 1990).

The NN is exceptionally effective and has been utilized to solve many classification problems. It was created in the late 1950s by Rosenblatt, as reported in Alweshah (2014). The NN consists of two fundamental components: the neuron and the link. The neuron is a type of processing element, while the link connects two neurons together and carries an individual weight. Each neuron receives stimulating signals from its neighboring neurons, processes these signals and produces an output result (Alshareef et al. 2015a).

The PNN is a powerful data mining tool that can model a large number of complex input/output relationships. The PNN is a special form of NN model that applies a statistical Bayesian decision rule. The general structure of a PNN consists of four layers: (1) an input layer, whose dimension is determined by the dimension of the input vector; (2) a pattern layer (second layer), whose dimension is equal to the number of examples in the training set; (3) a summation layer (third layer), which contains one neuron per class in the set of examples; and (4) a decision layer (fourth layer), which classifies the test example into one of the predefined classes.

The success of the PNN depends on the proper selection of the weights during the classification process (Iliou and Anagnostopoulos 2010). Selecting the optimal weight vector for the PNN has recently been modelled as an optimization problem (Iliou and Anagnostopoulos 2010). Several optimization techniques, such as metaheuristics, have recently been utilized to tackle this problem. Metaheuristic-based optimization approaches are categorized into either single-solution-based or population-based searches (Blum and Roli 2003).

Population-based approaches focus on maintaining and improving multiple candidate solutions. They normally manipulate several solutions at a time using recombination, mutation and selection operators. Although they are very efficient at handling several search space regions at the same time, they may fail to find the local optimal solution in each of the regions they handle. Two main families of population-based metaheuristics are widely known: evolutionary algorithms and swarm intelligence. Evolutionary algorithms include genetic algorithms (GAs) (Hu et al. 2016) and the harmony search algorithm (Geem et al. 2001). Examples of swarm intelligence include ant colony optimization (ACO) (Dorigo and Stützle 2010), PSO, social cognitive optimization, the penguin search optimization algorithm (PeSOA), the artificial bee colony (ABC) (Zhu and Kwong 2010) and the Grey Wolf Optimiser (GWO) algorithm (Al Nsour et al. 2019).

Single-solution approaches are initiated with a single provisional solution. Iteratively, that solution is improved by moving to a neighbouring solution until a local optimum is reached. Examples of single-solution-based methods used for NNs include simulated annealing (SA) (Ren and Qu 2014), iterated local search (ILS) (El-Bouri 2012), variable neighborhood search (VNS) (Xie et al. 2012), guided local search (GLS) (Tairan and Zhang 2010) and \(\beta\)-hill climbing (Al-Betar 2017).

\(\beta\)-Hill climbing is a recent local-search-based algorithm (Al-Betar 2017). It is able to escape the local optima trap using the \(\beta\)-operator. It has several attractive features: it has few parameters to set before the search, it is a very simple optimization framework, and it can be flexibly adapted to many kinds of optimization problems, such as economic load dispatch (Al-Betar et al. 2018), denoising electrocardiogram (ECG) signals (Alyasseri et al. 2018, 2017b, a), multiple-reservoir scheduling (Alsukni et al. 2017), the Sudoku game (Al-Betar et al. 2017), substitution boxes (Alzaidi et al. 2018), gene selection (Alomari et al. 2018) and feature selection (Abualigah et al. 2017). Furthermore, the theoretical concepts of \(\beta\)-hill climbing related to adaptive parameter settings have also been improved (Al-Betar et al. 2019). In addition, \(\beta\)-hill climbing has been hybridized with population-based algorithms to improve their exploitation capabilities (Abed-alguni and Alkhateeb 2018).

In this paper, the \(\beta\)-hill-climbing (\(\beta\)-HC) algorithm is adapted to seek the optimal weights for the PNN so as to improve its performance (accuracy). The weighting of the PNN is modelled as an optimization problem to be solved by \(\beta\)-HC. In order to evaluate the proposed method, 11 benchmark classification problems are used. A comparative evaluation of the original PNN (without \(\beta\)-HC), the PNN with hill climbing, and the PNN with \(\beta\)-HC shows that the PNN with \(\beta\)-HC significantly outperforms the other two. It can find the optimal weights for the PNN and strike the right balance between exploration and exploitation during the search.

The rest of this paper is organized as follows: In Sect. 2, the background of the PNN algorithm and \(\beta\)-HC is described. Then, in Sect. 3, the proposed \(\beta\)-hill-climbing-based PNN is discussed in detail. In Sect. 4, the experimental results are presented and analyzed. Finally, Sect. 5 concludes the paper and provides some possible directions for future improvements.

2 Research background

2.1 Probabilistic neural network (PNN)

The PNN, introduced by Specht (1990), has been shown to deliver effective classification performance across a wide variety of problems (Berrar et al. 2002; Wasserman 1993; Specht 1990; Sweeney et al. 1994; Mao et al. 2000; Kwigizile et al. 2004; Manimala and Selvi 2007; Araghi et al. 2009). Training a PNN does not require heuristic searches, apart from obtaining the best smoothing factor, which is itself an optimization problem. A PNN uses a statistical algorithm known as kernel discriminant analysis (Mika et al. 1999), in which the operations are organized into a four-layered feed-forward network consisting of an input layer, hidden layer, pattern/summation layer and decision/output layer, as shown in Fig. 1.

Fig. 1 Architecture of a typical probabilistic neural network

Figure 1 illustrates a typical PNN model. Each input neuron represents a distinct feature in the training/test data sets, so the number of inputs is equal to the number of features in the data set. A PNN trains considerably faster than a back-propagation (BP) network. Its structure ensures convergence to an optimal classifier as the size of the training set increases, and training samples can be added or removed without the need for substantial retraining (Wasserman 1993). The four layers of the PNN network are described below:

Input layer:

This contains one neuron for each indicator variable. For categorical factors, N − 1 neurons are used, where N is the number of categories. The input neurons standardize the value range by subtracting the median and dividing by the interquartile range. The input neurons then feed these standardized values to each neuron in the hidden layer.

Hidden layer:

This contains a single neuron for each case in the training dataset. The neurons store the indicator variable values together with the target values for their cases. When the vector x of input values is presented by the input layer, each hidden neuron computes the Euclidean distance between x and the neuron's centre point and then, using the sigma value, applies the RBF kernel function. The resultant value is passed to the neurons in the pattern layer.

Pattern/summation layer:

For each class of target variables, a single pattern neuron is available. The target class of each training sample is stored alongside each hidden neuron, and the weight value emerging from a hidden neuron is passed to the pattern neuron that corresponds to that hidden neuron's class. Each pattern neuron then adds up the values for the class it represents (hence, these values act as weighted votes for that particular category). In practice, this layer performs a dot product (Z) of the input vector \({{\varvec{X}}}=(x_1,x_2,\ldots ,x_n)\) and the weight vector \({{\varvec{W}}}=(w_1,w_2,\ldots ,w_n)\), such that \(Z=X\cdot W\). Thereafter, the non-linear activation function is applied, as shown in Eq. (1)

$$\begin{aligned} f(X)= \exp \left( -\frac{(w_i-x)^t (w_i-x)}{2\sigma ^2}\right) \end{aligned}$$
(1)

where \(\sigma\) is the smoothing parameter that should be set in the initial training process.

Decision/output layer:

This final layer compares the weighted votes gathered in the pattern layer for each target category and then uses the highest vote to predict the target group.

For the mathematical formulation, the PNN is interpreted as a multilayer feed-forward neural network (FFNN) that implements the Bayes decision rule shown in Eq. (2).

$$\begin{aligned} h_i\times c_i \times f_i > h_j\times c_j \times f_j \quad \forall i \ne j \end{aligned}$$
(2)

where \(h_i\) is the prior probability that the new object belongs to class i, \(c_i\) is the cost of misclassifying an object that belongs to class i, and \(f_i\) is the probability density function (PDF) of class i. Here, \({{\varvec{X}}}\) is the input vector that has to be classified. In general, the prior probabilities and the misclassification costs are treated as equal when dealing with PNNs; therefore the Bayes optimal decision rule can be reduced to Eq. (3).

$$\begin{aligned} g_i( {{\varvec{X}}})> g_j( {{\varvec{X}}}) \quad \forall i \ne j. \end{aligned}$$
(3)

This rule shows that the optimal classification decision can be made simply by performing straightforward evaluations once the PDFs of the various classes have been identified. In particular, the PDF for a single class can be estimated based on Eq. (4):

$$\begin{aligned} g_i( {{\varvec{X}}})=\frac{1}{n_i\times \sigma } \sum _{k=1}^{n_i} W\left( \frac{X-x_{ik}}{\sigma }\right) \end{aligned}$$
(4)

where \(\sigma\) is a smoothing parameter, W is the weighting function, X is the unidentified input sample that has to be classified, \(x_{ik}\) is the kth training input from the ith class, \(n_i\) is the number of training inputs for class i, and \(g_i( {{\varvec{X}}})\) is the PDF approximation for class i.

Generally, the Euclidean distance is used to determine the closeness between a training sample and the unidentified input. The Gaussian function is the typically used weighting function because it exhibits the desired attributes and can be computed quickly. Supervised learning provides an input pattern and changes the network parameters (weights) to reduce the distance between the calculated output and the expected output. Setting the weighting function to the Gaussian function gives Eq. (5):

$$\begin{aligned} g_i( {{\varvec{X}}})=\frac{1}{n_i\times \sigma } \sum _{k=1}^{n_i} \exp \left( -\frac{(X-x_{ik})^2}{\sigma ^2}\right) . \end{aligned}$$
(5)
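To make the classifier described by Eqs. (3)–(5) concrete, the following Python sketch estimates a per-class density with a Gaussian kernel (pattern and summation layers) and then picks the class with the largest estimate (decision layer). It is a minimal illustration using the \(2\sigma^2\) scaling of Eq. (1), not the exact implementation used in this paper; the function name `pnn_classify`, the parameter values and the toy data are our own assumptions.

```python
import numpy as np

def pnn_classify(X_train, y_train, x, sigma=0.5):
    """Minimal PNN decision rule: Gaussian kernel density per class
    (pattern + summation layers), then the class with the largest
    estimate wins (decision layer, Eq. (3))."""
    scores = {}
    for c in np.unique(y_train):
        members = X_train[y_train == c]               # pattern neurons of class c
        sq_dist = np.sum((members - x) ** 2, axis=1)  # squared Euclidean distances
        kernel = np.exp(-sq_dist / (2.0 * sigma ** 2))
        scores[c] = kernel.sum() / len(members)       # summation layer, cf. Eq. (5)
    return max(scores, key=scores.get)                # decision layer, Eq. (3)

# Toy usage with two Gaussian blobs
rng = np.random.default_rng(0)
X0 = rng.normal(0.0, 1.0, size=(30, 2))
X1 = rng.normal(3.0, 1.0, size=(30, 2))
X_train = np.vstack([X0, X1])
y_train = np.array([0] * 30 + [1] * 30)
print(pnn_classify(X_train, y_train, np.array([2.8, 3.1])))  # expected output: 1
```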

2.2 \(\beta\)-Hill climbing optimizer

\(\beta\)-Hill climbing is a recent local search-based algorithm introduced in Al-Betar et al. (2017). It is initialized with a random solution \({{\varvec{x}}}=(x_1,x_2, \ldots , x_n)\). Iteratively, that solution undergoes changes using three main operators: (1) the \(\mathcal {N}\)-operator, (2) the \(\beta\)-operator, and (3) the \(\mathcal {S}\)-operator. \(\beta\)-Hill climbing normally stops when the maximum number of iterations is reached, and the feasibility of the solution is preserved during the search. Note that the hill climbing (HC) algorithm is the same as \(\beta\)-hill climbing but without the \(\beta\)-operator. Algorithm 1 presents the pseudo-code of the basic version of \(\beta\)-hill climbing. The three operators are described as follows (Fig. 2):

Fig. 2 \(\beta\)-Hill climbing optimizer

\(\mathcal {N}\)-operator:

This operator moves the current solution \({{\varvec{x}}}\) to a neighboring solution \({{\varvec{x}}}'\) using a movement strategy with a probability of \(\mathcal {N}\), in which feasibility is maintained. Line 5 in Algorithm 1 is the pseudo-code for the \(\mathcal {N}\)-operator. For clarification, let \(x_i\) be assigned the value \(\nu _i(k)\) at the \(k^{th}\) location; the value of \(x'_i\) will then be assigned as follows:

$$\begin{aligned} x'_i= \nu _{i, k} \pm m \end{aligned}$$
(6)

where \(\nu _{i, k} \pm m\) is a neighboring value of \(\nu _i(k)\). It is worth mentioning that the \(\mathcal {N}\)-operator is invoked once at each iteration using a random walk mechanism. This means that the effect of moving to the neighbouring solution is not checked until the \(\mathcal {S}\)-operator is invoked.

\(\beta\)-operator:

The \(\beta\)-operator is responsible for the exploration process by applying a uniform mutation to the current solution. \(\forall i \in (1,2,\ldots ,n)\), the decision variable \(x_i\) is randomly selected to be changed using the \(\beta\) parameter, as shown below. Lines 6 to 10 in Algorithm 1 are the pseudo-code for the \(\beta\)-operator.

$$\begin{aligned} x'_{i}\leftarrow {\left\{ \begin{array}{ll} \nu _{r} &{} \qquad U[0,1] \le \beta \\ x'_i &{} \qquad otherwise. \end{array}\right. } \end{aligned}$$

It is worth mentioning that \(\nu _{r}\) is a random value drawn from the feasible range of the decision variable \(x_i\), and that \(\beta\) determines how often the uniform mutation is applied.

\(\mathcal {S}\)-operator:

The improved solution \({{\varvec{x}}}'\) is evaluated using \(f({{\varvec{x}}}')\). The current solution \({{\varvec{x}}}\) is replaced by the improved solution \({{\varvec{x}}}'\) if the latter is better (i.e., \(f({{\varvec{x}}}')\le f({{\varvec{x}}})\)). Lines 11 to 13 in Algorithm 1 are the pseudo-code for the \(\mathcal {S}\)-operator.

Algorithm 1 Pseudo-code of the \(\beta\)-hill climbing optimizer
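Because Algorithm 1 is reproduced here only as a figure, the following Python sketch illustrates how the three operators described above can be combined for a continuous decision vector. It is a hedged reconstruction, not the authors' code; the neighbourhood width `n_width`, the bounds and the toy objective are illustrative assumptions.

```python
import numpy as np

def beta_hill_climbing(f, lower, upper, n, n_width=0.1, beta=0.3,
                       max_iters=1000, seed=0):
    """Minimal beta-hill climbing for minimizing f over [lower, upper]^n."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(lower, upper, n)                  # random initial solution
    fx = f(x)
    for _ in range(max_iters):
        # N-operator: random-walk move to a neighbouring solution (Eq. 6)
        x_new = x + rng.uniform(-n_width, n_width, n)
        # beta-operator: uniform mutation of each variable with probability beta
        mutate = rng.uniform(0.0, 1.0, n) <= beta
        x_new[mutate] = rng.uniform(lower, upper, mutate.sum())
        x_new = np.clip(x_new, lower, upper)          # keep the solution feasible
        # S-operator: greedy acceptance of the improved solution
        f_new = f(x_new)
        if f_new <= fx:
            x, fx = x_new, f_new
    return x, fx

# Toy usage: minimize the sphere function
best_x, best_f = beta_hill_climbing(lambda v: np.sum(v ** 2), -5.0, 5.0, n=10)
print(best_f)
```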

3 \(\beta\)-Hill climbing optimizer for PNN classifier

In this paper, the \(\beta\)-HC algorithm is utilized to find the optimal weights that can be efficiently used in the PNN algorithm, with the aim of increasing the accuracy of the classification process. Initially, a random weight vector x is generated by the PNN algorithm, and the input values are multiplied by the corresponding weights according to the model determined by the PNN. Figure 3 shows an example of the solution representation.

Fig. 3 Representation of initial weights

As shown in Fig. 4, the proposed method consists of two parts: the first represents the PNN algorithm and the second represents the \(\beta\)-HC algorithm. In the first part, the training data is used to build the classifier and the test data is subjected to the classification process. Adjusting the weights of the PNN is the responsibility of the \(\beta\)-HC algorithm; the accuracy of the classified data is then calculated and the process is repeated until the stopping criterion is met.

Fig. 4 Flowchart of proposed \(\beta\)-HC-PNN
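A minimal sketch of the loop in Fig. 4, building on the `pnn_classify` and `beta_hill_climbing` sketches above and reusing their toy data: \(\beta\)-HC searches for a weight vector (the solution representation of Fig. 3) that maximizes the classification accuracy of the PNN. Treating the weights as per-feature scaling factors and evaluating fitness on the training data are illustrative assumptions, not the exact pipeline of the paper.

```python
# Builds on the pnn_classify and beta_hill_climbing sketches above,
# reusing the toy X_train / y_train data.

def weighted_pnn_accuracy(weights, X_train, y_train, X_eval, y_eval, sigma=0.5):
    """Fitness of a candidate weight vector: accuracy of a PNN whose
    features are scaled by `weights` (one weight per feature)."""
    correct = 0
    for x, y in zip(X_eval, y_eval):
        pred = pnn_classify(X_train * weights, y_train, x * weights, sigma)
        correct += int(pred == y)
    return correct / len(y_eval)

# beta-HC minimizes, so use (1 - accuracy) as the objective
objective = lambda w: 1.0 - weighted_pnn_accuracy(w, X_train, y_train, X_train, y_train)
best_w, best_err = beta_hill_climbing(objective, 0.0, 1.0, n=X_train.shape[1], max_iters=200)
print("best training accuracy:", 1.0 - best_err)
```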

The proposed method was assessed by computing four counts. True positives (TP) refer to the number of records correctly assigned to a class; true negatives (TN) refer to the number of instances correctly identified as not belonging to the class; false positives (FP) refer to the number of instances incorrectly assigned to the class; and false negatives (FN) refer to the positive tuples that are labelled incorrectly.

Classification accuracy, computed in Eq. (7), is a statistical measure of how well the classifier correctly assigns objects to their labelled classes. The error rate, computed in Eq. (8), measures the proportion of objects that are incorrectly recognized. Furthermore, specificity and sensitivity are calculated in Eqs. (9) and (10), respectively.

$$\begin{aligned} Accuracy= & {} \frac{TP+TN}{TP+TN+FP+FN} \end{aligned}$$
(7)
$$\begin{aligned} Error Rate= & {} 1-\frac{TP+TN}{TP+TN+FP+FN} \end{aligned}$$
(8)
$$\begin{aligned} Specificity= & {} \frac{TN}{TN+FP} \end{aligned}$$
(9)
$$\begin{aligned} Sensitivity= & {} \frac{TP}{TP+FN}. \end{aligned}$$
(10)
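For reference, the four measures in Eqs. (7)–(10) can be computed directly from the confusion-matrix counts, as in the short sketch below (note that sensitivity uses FN in its denominator); the counts in the usage line are made-up numbers for illustration.

```python
def classification_measures(tp, tn, fp, fn):
    accuracy = (tp + tn) / (tp + tn + fp + fn)   # Eq. (7)
    error_rate = 1.0 - accuracy                  # Eq. (8)
    specificity = tn / (tn + fp)                 # Eq. (9)
    sensitivity = tp / (tp + fn)                 # Eq. (10)
    return accuracy, error_rate, specificity, sensitivity

print(classification_measures(tp=80, tn=90, fp=10, fn=20))
```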

4 Experiments and results

In this section, the results produced by the proposed method are evaluated using a set of experiments. The proposed algorithm was programmed in Matlab R2010 and run on an Intel® Xeon® CPU E5-1630 v3 @ 3.70 GHz computer. Table 1 shows the parameter settings used for the proposed algorithm, which were determined after intensive preliminary experiments.

Table 1 Parameter settings

The aim of conducting these experiments is to check whether the classification accuracy is maximized when the PNN utilizes \(\beta\)-HC.

The quality of the solutions is given in terms of the best accuracy value (in percent) over 30 independent runs for each of the 11 datasets. In this research, four performance measures are used: accuracy, error rate, sensitivity and specificity, as formulated in the previous section. Table 2 presents the results for the HC-based PNN, the \(\beta\)-HC-based PNN and the original PNN classification techniques when applied to the 11 benchmark datasets, with the best outcomes highlighted in bold. The results summarized in the table show that the proposed \(\beta\)-HC-based PNN outperformed the original PNN algorithm on all datasets and mostly obtained better results than the HC-based PNN.

Table 2 Results obtained by PNN, HC and \(\beta\)-HC on the test datasets

Figures 5 and 6 present the convergence characteristics of the \(\beta\)-HC-based PNN and the HC-based PNN in comparison with the original PNN on the 11 datasets. Each algorithm was run for 200 iterations. These results show that the \(\beta\)-HC-based PNN converges faster than the HC-based PNN except when applied to GCD and ACA. Moreover, \(\beta\)-HC obtains a similar convergence trend for the Heart and API datasets. Interestingly, when applied to the Four class dataset, \(\beta\)-HC achieved 100% accuracy in all iterations.

Fig. 5 Convergence characteristics of \(\beta\)-HC-PNN and HC-PNN

Fig. 6 Convergence characteristics of \(\beta\)-HC-PNN and HC-PNN (continued)

We also investigate whether \(\beta\)-HC-PNN is statistically significantly different from HC-PNN and the firefly algorithm (FA) in terms of classification accuracy, sensitivity, and specificity. Accordingly, a T test with a 95% confidence level (\(\alpha\) = 0.05) is used. The results show that there is a significant difference between the comparative algorithms in favour of \(\beta\)-HC-PNN. To elaborate, with regard to accuracy, \(\beta\)-HC-PNN is statistically different from HC-PNN and FA (p value \(<0.05\)). As for sensitivity and specificity, \(\beta\)-HC is also statistically different from HC-PNN and FA. Figure 7 presents box plots of the distribution of solution quality obtained by \(\beta\)-HC, HC-PNN and FA for the six tested datasets used in Gorunescu (2011).

Fig. 7 Box plots for \(\beta\)-HC, HC and FA
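As an illustration of this kind of significance test, the snippet below applies SciPy's two-sample t test to the per-run accuracies of two methods. The accuracy arrays are placeholders rather than the actual results reported in this paper, and the paper's exact test configuration may differ.

```python
import numpy as np
from scipy import stats

# Accuracy over 30 independent runs for two methods (placeholder values)
acc_bhc_pnn = np.array([0.91, 0.93, 0.92, 0.94, 0.92, 0.93] * 5)
acc_hc_pnn  = np.array([0.88, 0.90, 0.89, 0.91, 0.90, 0.89] * 5)

t_stat, p_value = stats.ttest_ind(acc_bhc_pnn, acc_hc_pnn)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")  # p < 0.05 -> significant at alpha = 0.05
```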

We also assessed all the computational results (with regards to classification accuracy) of the proposed \(\beta\)-HC against those of six popular approaches in the literature. The abbreviations of the comparative approaches are listed in Table 3.

Table 3 Acronyms of the compared methods

Notably, none of the comparative approaches listed in Table 4 were applied to all of the datasets used in our study, except for FA. Therefore, a different combination of these approaches was compared for each dataset, as described in Table 4. The best results are highlighted in bold font.

Table 4 Comparison of the results of \(\beta\)-HC and the state-of-the-art

From the table, it can be seen that \(\beta\)-HC ranked first on all datasets except ACA and GCD, thereby achieving the best overall performance with regard to classification accuracy, with HC ranked second. Moreover, \(\beta\)-HC and HC were able to classify the Four class dataset with no misclassifications (100% accuracy). In a nutshell, \(\beta\)-HC outperformed the other comparative approaches on almost all of the datasets used.

5 Conclusion and future work

The key goal of this research is to improve the classification accuracy of the PNN by (1) modifying the PNN's inner weights to discover high-quality solutions, thus improving classification accuracy, and (2) attaining a high convergence speed. The approach proposed in this paper combines the \(\beta\)-HC metaheuristic algorithm with the PNN algorithm, where \(\beta\)-HC is utilized to optimize the weights of the PNN. The rationale for utilizing \(\beta\)-HC, a recent local search-based algorithm, is its capacity to explore the search space through the \(\beta\)-operator, its straightforwardness, and its potential for solving various complicated classification problems.

In order to evaluate the proposed method, 11 benchmark datasets that are widely used to evaluate new classification methods were employed in the experiments. Initially, the three approaches, namely PNN, HC-PNN, and \(\beta\)-HC-PNN, were compared in terms of accuracy, sensitivity, specificity, and error rate. Interestingly, for all measures, the proposed \(\beta\)-HC-PNN achieved the best results.

Furthermore, the proposed \(\beta\)-HC-PNN was also compared with six well-established approaches using the same datasets. The results produced by \(\beta\)-HC-PNN are better than those produced by the others for nine out of the eleven datasets.

Overall, the proposed approach was evaluated against 11 benchmark datasets, and the experimental results showed that the proposed \(\beta\)-HC with PNN approach performed better in terms of classification accuracy than the original PNN, HC-PNN and six other well-established approaches on the same benchmarks. In a nutshell, hybridizing \(\beta\)-HC with PNN improves the classification accuracy of the PNN.

In general, the results obtained affirm the strong performance of the proposed algorithm. In the future, the following improvements can be investigated:

  • The \(\beta\)-HCO can be hybridized with another classification method instead of PNN.

  • Other classification datasets with combinatorial features can be used for further validation.

  • Other local search methods can be investigated to show the strength of \(\beta\)-HCO.