An intelligent heart disease prediction system based on swarm-artificial neural network

Nandy, Sudarshan; Adhikari, Mainak; Balasubramanian, Venki; Menon, Varun G.; Li, Xingwang; Zakarya, Muhammad

doi:10.1007/s00521-021-06124-1

An intelligent heart disease prediction system based on swarm-artificial neural network

S.I: ML4BD_SHS
Published: 27 May 2021

Volume 35, pages 14723–14737, (2023)
Cite this article

Download PDF

Access provided by Autonomous University of Puebla

Neural Computing and Applications Aims and scope Submit manuscript

An intelligent heart disease prediction system based on swarm-artificial neural network

Download PDF

Sudarshan Nandy¹,
Mainak Adhikari²,
Venki Balasubramanian³,
Varun G. Menon ORCID: orcid.org/0000-0002-3055-9900⁴,
Xingwang Li⁵ &
…
Muhammad Zakarya⁶

1849 Accesses
50 Citations
Explore all metrics

Abstract

The accurate prediction of cardiovascular disease is an essential and challenging task to treat a patient efficiently before occurring a heart attack. In recent times, various intelligent healthcare frameworks have been designed with different machine learning and swarm optimization techniques for cardiovascular disease prediction. However, most of the existing strategies failed to achieve higher accuracy for cardiovascular disease prediction due to the lack of data-recognized techniques and proper prediction methodology. Motivated by the existing challenges, in this paper, we propose an intelligent healthcare framework for predicting cardiovascular heart disease based on Swarm-Artificial Neural Network (Swarm-ANN) strategy. Initially, the proposed Swarm-ANN strategy randomly generates predefined numbers of Neural Networks (NNs) for training and evaluating the framework based on their solution consistency. Additionally, the NN populations are trained by two stages of weight changes and their weight is adjusted by a newly designed heuristic formulation. Finally, the weight of the neurons is modified by sharing the global best weight with other neurons and predicts the accuracy of cardiovascular disease. The proposed Swarm-ANN strategy achieves 95.78% accuracy while predicting the cardiovascular disease of the patients from a benchmark dataset. The simulation results exhibit that the proposed Swarm-ANN strategy outperforms the standard learning techniques in terms of various performance matrices.

Risk Prediction of Heart Disease Based on Swarm Optimized Neural Network

RETRACTED ARTICLE: Effective heart disease prediction using improved particle swarm optimization algorithm and ensemble classification technique

Article 29 May 2023

Heart Attack Classification System Using Neural Network Trained with Particle Swarm Optimization

Discover the latest articles, news and stories from top researchers in related subjects.

Artificial Intelligence

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

1 Introduction

In recent times, cardiovascular/heart disease is one of the major causes of mortality worldwide. The World Health Organization (WHO) has observed that more than 18 million deaths occur per year in the world due to cardiovascular disease [1]. In developing countries and rural areas, this situation gets worse due to the lack of infrastructure and efficient healthcare administrators. Nowadays, one of the main reasons for death is due to heart disease which includes heart attack, hypertension, and stroke. In such a situation, early diagnosis of cardiac diseases can help to take immediate action with proper treatment and preventing death [2, 3]. Besides that, cardiovascular disease affects the other parts of the human body such as blood vessels, which makes a patient weaker, especially aged people. Furthermore, more than 1 billion dollars are spent by the Government of the United States of America on heart disease treatments [4]. Considering the increasing number of cardiovascular/heart disease patients worldwide due to the abnormality of health parameters, it is a challenging task to monitor patients' health status remotely and advise to stay at home isolation for patients with fewer chronic health problems. As a result, early prediction of heart disease before occurring a stroke or heart attack based on the monitoring dataset remotely is another challenging task in the healthcare domain.

Cardiovascular disease can be identified by using wearable sensors (such as ECG or heart sound sensors) or conducting medical tests in the hospital [5, 6]. Besides that, the advancements in communication and sensor technologies have enabled wearable devices to generate an enormous amount of data by monitoring various heart patients remotely [7, 8]. Thus, extracting valuable risk factors for heart disease and accurately diagnosing from such monitoring data is a difficult task. To tackle such a scenario, and analysis the heart disease with the monitoring data, various machine learning and swarm optimization techniques have been used to investigate the effectiveness of the healthcare framework [9, 10]. Thus, the main research questions that come out for cardiovascular/heart disease prediction are: (1) how to extract the features from the monitoring data and handle different volumes, velocities, and varieties of healthcare data efficiently? (2) Which type of machine learning or optimization technique is useful to handle a large amount of healthcare data and diagnose cardiovascular/heart disease inside a patient body efficiently? Thus, there is a need to design an intelligent healthcare framework using machine learning or optimization technique for handling a large volume of monitoring healthcare data and diagnose the heart disease efficiently with higher accuracy.

1.1 Motivation

Due to the outbreak and increasing number of deaths due to heart disease, the prediction, and diagnosis of cardiovascular/heart disease is one of the challenging tasks in the healthcare domain [11, 12]. In recent times, various models have been developed to predict and analyze heart disease with the monitoring dataset. However, from the literature review, it is observed that most of the existing models used for heart disease diagnosis are based on various data mining and machine learning techniques. Further, most of the monitoring data are not analyzed properly due to the lack of data-recognized techniques and proper prediction methodologies. As a result, due to the lack of theoretical support and selection of the suitable classification model, the existing models fail to achieve higher accuracy of heart disease prediction while minimizing the error rate. In such a scenario, Artificial Neural Networks (ANN) play a vital role by predicting heart disease efficiently from the large healthcare dataset of heart diseases. The main advantages of the ANN techniques are: (a) ANNs can learn by themselves and produce the results without depending on the input data and (b) ANN can perform multiple tasks in parallel without affecting the system performance. By motivating from the above-mentioned challenges and advantages of the ANN technique, we develop a new intelligent healthcare framework based on the Swarm-Artificial Neural Network (Swarm-ANN) technique for increasing the accuracy of the cardiovascular/heart disease prediction.

1.2 Contributions

The main aim of this strategy is to diagnosis the cardiovascular disease of a patient based on the existing parameters of a high-dimensional healthcare dataset of heart disease. In the initial phase, the proposed Swarm-ANN strategy randomly generates predefined numbers of Neural Networks (NNs) for training and evaluating the framework based on their solution consistency. In the next step, the NN populations are trained by two stages of weight changes, and the winner is decided based on the similarities between the winner and the predefined threshold value. The weight and bias of the winner neuron in the NN-populations are separated from other neurons, and their weight is adjusted by a newly designed heuristic formulation. Finally, the weight of the neurons is modified by sharing the global best weight with other neurons and predicts the accuracy of cardiovascular disease. The main contributions of this strategy are described as follows.

We design a Swarm-ANN strategy for reliable prediction of heart disease using a healthcare dataset. Further, we implement the 3-layers ANN for training the dataset in parallel for predicting the heart attack of the patients.
We implement a Swarm-ANN strategy based on a two-phase weight modification strategy with a modified stochastic weight technique for quick convergence and increases the accuracy of the heart disease prediction.
The proposed Swarm-ANN strategy is evaluated with a benchmark healthcare dataset over various standard learning models. The simulation results demonstrate that the proposed Swarm-ANN strategy outperforms existing models and achieves 95.78% accuracy with an equal rate of error 4.32% while training the healthcare dataset.

The remaining sections of the paper are organized as follows: Sect. 2 introduces various existing healthcare strategies and their contributions for heart disease prediction using standard machine learning and swarm optimization techniques. The proposed Swarm-ANN strategy has been discussed with a flowchart in Sect. 3. The overview of the benchmark dataset of heart disease and experimental evaluations of the Swarm-ANN strategy over the existing learning models have been elaborated in Sect. 4. Finally, Sect. 5 concludes the work.

2 Related studies

Nowadays, heart disease is one of the major causes of mortality across the globe. Therefore, accurate and early diagnosis of patients with heart disease is necessary to save human lives. To mitigate these issues, several clinical decision support systems using different machine learning models have been introduced to identify various symptoms of heart disease patients and take timely action. Authors in [13] have evaluated various classification models to enhance the prediction accuracy of heart disease by using dimensionality reduction and feature selection techniques. This study revealed that the combination of Chi-square and random forest techniques can improve the accuracy among all other classification models. In [10], an ensemble deep learning model has been designed to improve the accuracy of heart disease prediction. The most relevant features are extracted from two different data sources to generate the most significant set of features for evaluating the classification model. However, this study suggested that the large volume of healthcare records will further require a novel feature reduction strategy to remove the irrelevant features from the dataset.

Several attributes for chronic heart disease prediction are considered in [14], in which authors comparatively studied that the standard machine learning classification techniques and image fusion method used for diagnosing the heart disease. To identify and classify healthy people from chronic heart disease patients, authors in [15] developed a machine learning-based diagnosis system. Additionally, the performance of all machine learning classifiers is evaluated with the full features set while reducing the number of features set. Authors in [16] have introduced a hybrid machine learning approach to improve the prediction accuracy of heart disease patients. This study validated the classification models with different combinations of features to increase the performance of heart disease prediction. An efficient machine learning-based heart diagnosis system has been developed in [17]. In this study, the authors suggested that the irrelevant combination of features are mainly reduced the prediction accuracy and increased the computation time. To accurately monitor and detect the heart conditions of the patients, an adaptive neuro-fuzzy technique and a modified salp swarm optimization algorithm have been introduced in [18]. The proposed model achieved better performance results with the highest fitness values for all iterations. However, the prediction accuracy of the models has still required better feature selection and optimization techniques.

In [19], a heart disease prediction model has been designed to improve the prediction accuracy by minimizing the outlier data and unbalanced training dataset. Authors in [9], an automatic classification model has been designed with a machine learning technique based on the patient heart sounds to diagnose cardiac disorders. This study mainly used strategic processing to extract the most prominent features for evaluating machine learning models. Furthermore, authors in [20, 21] have been considered a hybrid machine learning approach to predict the heart conditions of the patients. In [22], a hybrid swarm optimization algorithm along with a machine learning technique is used for diagnosing the heart disease of the patients. The proposed model used a Fisher feature selection algorithm to select more discriminative feature subsets.

For better heart disease risk prediction, authors in [23] have introduced an advanced feature selection technique, where the most significant features are employed for evaluating the neural network models. The experimental results represent the significance of the proposed technique over the existing classification models. In [24], authors have developed an ANNs-enabled technique to design an accurate decision support system for heart disease diagnosis at an early stage. The feature selection technique is widely used to pre-processing the dataset before training the classification models. To improve the selection of an optimal set of significant features, a modified differential evolution (DE) algorithm has been proposed in [25] to effectively select the most critical features for cardiovascular disease prediction. The proposed hybrid model has been evaluated with the selected critical features and achieved better accuracy.

The aforementioned models suffer from the lack of intelligent learning strategy and optimization techniques that causes minimum prediction accuracy of heart disease. As a result, the existing models require further investigations to address the issues and enhance the prediction accuracy of the machine learning models. Besides that, the above-mentioned issues are highly motivated to introduce an intelligent healthcare model that needs to be designed for handling a large set of monitoring data by executing multiple tasks in parallel without affecting the system performance and prediction accuracy. By motivating the advantages of the ANN and swarm optimization techniques, in this paper, we have designed a new intelligent healthcare framework based on the Swarm-ANN technique for cardiovascular disease prediction with higher accuracy and minimum error.

3 Proposed Swarm-ANN strategy

The goal of the proposed Swarm-ANN Strategy is to present an ANN-based model that takes all the healthcare data of heart disease, observed as input, and offers a reliable prediction of heart disease. The implementation of NNs based on the prefixed population size is carried out by generating a random weight for each NN configuration. The swarm-ANN method is split into three stages based on data processing. In the first step, the algorithm is used to construct an ANN according to the population size. The proposed strategy produces the three-layer feedforward ANN in each of their iterations and randomly initiates their weight. The randomly generated ANN or swarm is now fed with a single pattern and each swarm-ANN, which is modified by a back-propagation strategy for the first step of weight adjustment, which has been done in the second phase. Next, the swarm changes the weight through a stochastic weight adjustment algorithm, developed in the third process. Finally, the population ANN winner is determined based on the success assessment function. The exchange of acquired weight and bias of winner neuron is further performed with other neurons in the population based on a stochastic function. Each of these activities is described in detail in the following sub-sections. The important notations of the paper are highlighted in Table 1.

Table 1 Important notations

Full size table

3.1 Parallel implementation of ANN model

The NNs in the population are generated according to the predefined population size. The structure of each NN is generated by a three-layer feedforward structure of the NN, depicted in Fig. 1. For this work, the size of the neuron layer for the heart attack prediction is set to three-layer NN.

In the proposed Swarm-ANN strategy, the parallel implementation of the neural network is divided into two phases. In the first phase, the weight and bias are initiated and adjusted as necessary. In such a scenario, each neural network's weights and bias are determined to initiate every neural network as a thread. Parallel processing is carried out for each thread or swarm. Further, in this step, the swarms attempt to perform the number of iterations over the patterns. In the second step, the weight of each Swarm-ANN is determined, and the outcome of this iteration is measured for each neural network in the populations. The weight and bias of each neural network are calculated by first considering the mean of each pattern in the dataset, which is calculated as follows.

$$P_{{\mu_{m} }} = \left( {\frac{{\mathop \sum \nolimits_{n = 1}^{r} A_{r} }}{R}} \right)$$

(1)

Here, P_µ is the mean of the mth pattern, A_r is the attributes in the pattern, and dataset D = (P₁, P₂, …, P_m), where the problem is represented as follows.

$${\text{argmax}}\left( {P_{{\mu_{m} }} } \right) \Rightarrow Q_{m}$$

(2)

Further, a rand function is introduced to generate a random number for representing β_g, where β_g has maintained the following properties while producing the result.

Property 1:

β_g ≠ 0 and 0 < β_g < Q_m.

Property 2:

Q_m ≠ β_g.

Thus, the weight and bias for every neural network in the population are calculated as follows.

$${\text{WE}} = \left( {\mathop \sum \limits_{l = 1}^{n} {\text{rand}}\left( {0,\beta_{g} } \right)} \right)$$

(3)

$${\text{BI}} = \left( {\mathop \sum \limits_{l = 1}^{n} {\text{rand}}\left( {0,\beta_{g} } \right)} \right)$$

(4)

where WE and BI represent the weight and bias array, respectively, which hold the matrix weight and bias of every neural network in the Swarm-ANN population. The rand function generates the value between 0 and β_g.

3.2 Two-phase weight-bias modification

The training process is accomplished by updating the weight in two phases. In the first phase, the neural weight of the network is processed through a back-propagation algorithm for one time. The weight update by the back-propagation learning algorithm is stored for each NN in the population, which then undergoes stochastic weight adjustment based on winning neurons or swarms, is evaluated in the second step.

The method of this learning is presented in Fig. 2. According to Fig. 2, after the population has been formed, each neuron is processed for weight modification by re-propagating the error. After the completion of the first phase of weight modification, the second phase of weight modification is done based on a stochastic equation, where information on the weight of the best neuron is shared with the other in the neuron population. At the end of this process, the selection of the best neurons or swarms is justified based on the performance assessment criteria. The back-propagation of error and the weight modification is performed over the three-layer NN structure (shown in Fig. 1) for the heart dataset.

The algorithmic process of the weight and bias modification is divided into five phases, which are discussed as follows.

Phase 1: In the first phase, the actual output of the NNs in the Swarm-ANN population is calculated.

In this process, the output of the ith neural network is calculated as follows.

$$A_{i} = f\left( {{\text{Net}}_{l} } \right) = f\mathop \sum \limits_{l = 1}^{n} {\text{WE}}_{n} \cdot {\text{PA}}_{k}^{T} + {\text{BI}}_{n}$$

(5)

where A_i is the actual output of the ith neuron, f represents the activation function, N is the net calculation and PA_K represents the kth pattern. WE_n and BI_n represent the nth weight and bias matrix, respectively. Hence, the actual output of the Swarm-ANN is expressed as follows.

$${\text{AO}}_{{{\text{pop}}}} = \prod\limits_{m = 1}^{y} {\left( {A_{i} } \right)_{m} = \prod\limits_{m = 1}^{y} {\left( {f\left( {{\text{Net}}_{l} } \right)} \right)_{m} } }$$

(5.1)

where AO_pop is the population of the Swarm-ANN, and m = (1, 2, 3, …, y) is the predefined size of the neural network population. y represents as a constant, and it is considered as a parameter in the Swarm-ANN. Hence, the updated form of Eq. (5.1) using Eq. (5) is derived as follows.

$${\text{AO}}_{{{\text{pop}}}} = \mathop \prod \limits_{m = 1}^{y} \left( {f\left( {\mathop \sum \limits_{l = 1}^{n} {\text{WE}}_{n} \cdot {\text{PA}}_{k}^{T} + {\text{BI}}_{n} } \right)_{l} } \right)_{m}$$

(5.2)

Phase 2: In the second phase, the error for each pattern (generated by the NNs of the population) is calculated.

The calculation of the error is performed for each 3-layer neural network in the Swam-ANN population. The kth pattern error is denoted as E_k, which is formulated as follows.

$$E_{k} = \left( {T_{k} - A_{k} } \right)$$

(6)

where T_K represents the kth target and A_k is the actual output for the kth pattern. Thus, the error E_k is calculated for the swarm-ANN population for the pattern PA_k and it is represented as E_pop and defined as follows.

$$E_{{{\text{pop}}}} = \prod\limits_{m = 1}^{y} {\left( {E_{K} } \right)_{m} }$$

(7)

where E_pop is the array that holds the error for each pattern, generated by the swarm-ANN population.

Phase 3: At the end of each iteration, the error is back-propagated for each m th neural network in the swarm-ANN population and then weight and bias modification is performed in this phase.

The error is back-propagated and the error of the nth layer is calculated from the (n + 1)th layers. The calculation of the error for a layer of the three-layer neural network is calculated as SE_n, which is defined as follows.

$${\text{SE}}_{n} = \frac{{\delta \left( {f_{n} \left( {{\text{Net}}_{n + 1} } \right)} \right)}}{{\delta {\text{Net}}_{n} }}$$

(8)

where Net_n is nth layer net calculation. The weight and bias of the three-layer neural network are defined as follows

$${\text{WE}}_{n + 1} = {\text{WE}}_{n} - \left( {\alpha {\text{SE}}_{n} } \right) \cdot \left( {{\text{AO}}^{i - 1} } \right)^{T}$$

(9)

$${\text{BI}}_{n + 1} = {\text{BI}}_{n} - \left( {\alpha {\text{SE}}_{n} } \right)$$

(10)

The weight and bias of all the layers of each neural network are modified based on Eqs. (9) and (10), respectively, which are derived as follows.

$${\text{WE}}_{{{\text{pop}}}} = \mathop \prod \limits_{m = 1}^{y} {\text{WE}}_{m} = \mathop \prod \limits_{m = 1}^{y} \left( {{\text{WE}}_{n} - \left( {\alpha {\text{SE}}_{n} } \right) \cdot \left( {{\text{AO}}^{i - 1} } \right)^{T} } \right)_{m}$$

(11)

$${\text{BI}}_{{{\text{pop}}}} = \mathop \prod \limits_{m = 1}^{y} {\text{BI}}_{m} = \mathop \prod \limits_{m = 1}^{y} \left( {{\text{BI}}_{n} - \left( {\alpha {\text{SE}}_{n} } \right)} \right)_{m}$$

(12)

where WE_pop and BI_pop are the arrays that consist of the weight and bias of the neural network population.

Phase 4: In this phase, the best neuron, B N ^t of t th iterations mapped from the Swam-ANN population based on the mapping function D _dis .

In the process, the best neural network is selected based on the calculation of the sum of squared error (SSE). Hence, the SSE is evaluated for the population, represented as SSE_pop. Thus, the SSE_pop is derived from the SSE formula, which is derived as follows.

$${\text{SSE}}_{{{\text{pop}}}} = \mathop \prod \limits_{m = 1}^{y} \left( {\mathop \sum \limits_{l = 1}^{n} E_{n} \cdot E_{n}^{T} } \right)_{m}$$

(13)

The SSE_pop consists of the error produced by each neural network but the output of the error must close to zero, which is denoted as θ, where 0 ≤ θ < 1. Hence, to measure the difference with θ, d_sse^m is calculated as follows.

$$d_{{{\text{sse}}}}^{m} = \left| {\sqrt {\left( {\theta - {\text{SSE}}_{i} } \right)^{2} } } \right|$$

(14)

Now, the d_sse^m is calculated for NNs of the population (D_SSE), which is derived as follows.

$$D_{{{\text{SSE}}}} = \left( {d_{{{\text{sse}}}}^{1} ,d_{{{\text{sse}}}}^{2} ,d_{{{\text{sse}}}}^{3} , \ldots ,d_{{{\text{sse}}}}^{m} } \right) = \mathop \prod \limits_{m = 1}^{y} d_{{{\text{sse}}}}^{m}$$

(15)

Hence, the selection of the best neural network is derived based on the following logic.

$$\begin{aligned} BN^{t} = & \{ R_{null} \;if\;D_{SSE} > \varphi \\ & R_{ind} \;if\;0 \le D_{SSE} < \varphi \\ & \} \\ \end{aligned}$$

Now, based on the above logic, the following rule is applied to decide the weight and bias of the tth best neuron BN^t.

As per the above-mentioned logic, if the R_null is found then the continuation of the previous best neuron is performed for the current iteration, otherwise, the weight and bias of the best neuron, WE_best, and BI_best are selected from the WE_pop and BI_pop array based on index information, stored in R_ind.

Phase 5: In the final phase, the modified weight and bias (calculated in the second phase) are evaluated through a stochastic equation. This step is adapted to recover the process from the local minimum.

Here, the weight and bias of the best neural network are represented as WE_best and Bi_best. Now, the difference of weight and bias matrix in between the best neuron over other neurons in the populations of the neural network is calculated as follows.

$$\xi_{{\text{dis - weight}}} = \xi_{{{\text{dis}}}} \left( {{\text{BN}}_{{{\text{Weight}}}}^{l} ,{\text{WE}}_{m}^{l} } \right) = \mathop \prod \limits_{{\left( {m = 1} \right)}}^{y} \left( {\mathop \sum \limits_{l = 1}^{n} \left( {\sqrt {\left( {{\text{WE}}_{{{\text{best}}}}^{l} - {\text{WE}}_{m}^{l} } \right)^{2} } } \right)} \right)_{m}$$

(16)

where represents the lth layer weight matrix of the best neuron BN^t and is the lth layer weight matrix of the mth neural network. Similarly,

$$\xi_{{\text{dis - bias}}} = \xi_{{{\text{dis}}}} \left( {{\text{BN}}_{{{\text{bias}}}}^{l} ,{\text{BI}}_{m}^{l} } \right) = \mathop \prod \limits_{{\left( {m = 1} \right)}}^{y} \left( {\mathop \sum \limits_{l = 1}^{n} \left( {\sqrt {\left( {{\text{BI}}_{{{\text{best}}}}^{l} - {\text{BI}}_{m}^{l} } \right)^{2} } } \right)} \right)_{m}$$

(17)

where is the lth layer bias matrix of the best neuron BN^t and represents the lth layer weight matrix of the mth neural network. Finally, the weight and bias are modified based on the following equations.

$$\Delta {\text{WE}}_{m}^{l} = \mathop \prod \limits_{l = 1}^{n} {\text{WE}}_{m}^{l - 1} + \beta \xi_{{\text{dis - weight}}}$$

(18)

$$\Delta {\text{BI}}_{m}^{l} = \mathop \prod \limits_{l = 1}^{n} {\text{BI}}_{m}^{l - 1} + \beta \xi_{{\text{dis - bias}}}$$

(19)

where the and is represents the modified weight and bias matrix, respectively and β is the constant, where $\beta \in {\text{rand}}\left( {0,1} \right)$.

The first part of Eqs. (18) and (19) represents the old lth layer weight and bias matrix of the current neural network in the Swarm-ANN population, as defined in Eqs. (9) and (10), respectively. The second part of the matrix is introduced to reduce the distance between the best and current neuron and a constant value β is applied to handle the differences between the two matrices. The algorithms are again continued from Phase 1 if the accuracy threshold is not reached. If the pre-defined accuracy threshold is matched, then the Swarm-ANN is ready for testing and presenting the result for disease prediction. The proposed Swarm-ANN strategy is explained through a flowchart and is presented in Fig. 3.

4 Results and discussion

The Swarm-ANN-based intelligent strategy for heart disease prediction is analyzed based on its performance over the heart dataset. The dataset is downloaded from the University of California, Irvine (UCI) machine learning repository [26]. The dataset consists of a total of 76 features, but the proposed method is tested with 14 important features for heart attack prediction. The details of the features of the dataset are given in Table 2. The proposed method is tested with the standard classification models including SVM, Naive Bayes classifier, k-neighbors classifier, decision tree, logistic regression, and random forest classifier [27, 28]. All the classifiers are fed with the 14 features to analyze the performance.

Table 2 Features description of the heart dataset

Full size table

4.1 Heart disease dataset analysis

In this section, the heart disease dataset is analyzed to understand its features. In this work, the Swarm-ANN is applied over the heart-dataset, which consists of 14 features. Initially, the dataset is analyzed for understanding the linear relationship among the 14 features. Thus, the correlation between the features needs to be calculated and is presented through a heat-map.

The correlation matrix represents the features and analyzes the relationship. In Fig. 4, the values in the correlation matrix are range between +1 to −1. If the value in the matrix is close to one then it decides that the features are dependent on each other and any one of them can be selected for the Swarm-ANN. Now, if the value close to a negative one then they are highly not correlated. In the above-mentioned correlation matrix, the features are highly independent, and the value is 1 if the correlation calculation is self. Now, the values in the correlation matrix lie in the ranges of 0.4 to 0.6. The A3, A8, A7, and A11 features of the dataset are highly related to the feature A14, but the higher value is under 0.5. So, the few features are further investigated to understand the relationship and it is presented by comparing the symptoms with positively affected diseases and normal person.

In Fig. 5, the categorical features are represented, and the graphs show the significant difference between all these categories of each feature. Now, apart from this visual presentation, one threshold value, 0.55 is used for checking the values in the correlation matrix and if it is found more than that of the threshold value then it is dropped from the dataset. In the heart dataset, no such highly correlated values are found with any combination of symptoms, and hence, features or symptoms are feed into the proposed method without any modification.

4.2 Performances measurement

The proposed method is tested based on solution quality, which is analyzed based on accurate calculation. The calculation of accuracy is performed as follows.

$${\text{Accuracy}} = \frac{{{\text{TP}} + {\text{TN}}}}{{{\text{TP}} + {\text{TN}} + {\text{FP}} + {\text{FN}}}}$$

(20)

The precision and recall values are calculated based on the following equations.

$${\text{Precision}} = \frac{{{\text{TP}}}}{{{\text{TP}} + {\text{FP}}}}$$

(21)

and

$${\text{Recall}} = \frac{{{\text{TP}}}}{{{\text{TP}} + {\text{FN}}}}$$

(22)

where TP = true positive, TN = True negative, FP = False Positive, and FN = False Negative.

4.3 Empirical analysis on Swam-ANN

The learning rate for the neural population is considered in the range and the assigning of the values in that range is performed before the initial phases of simulation. Another most important parameter is the neural population size, which can be increased or decreased to adjust the performance of the prediction. In Fig. 6, the performance of the algorithm is shown over different learning rate ranges. In reality, the proposed method is run with different ranges of learning rates but only three ranges are decided for this dataset, which are 0.001 to 0.02, 0.02 to 0.06, and 0.06 to 0.125. In Fig. 6, the iterations are limited for presentation, but it is usually performed over 100 iteration and independent tests are performed with different population sizes by varying the learning rate. In Tables 3 and 4, the details of training and testing are presented for this empirical parameter study. It is easy to observe from these tables that the performance is enhanced by fixing the learning rate range to 0.02 to 0.06 while fixing the population size to 10. The detailed parameter settings for the proposed method are given in Table 5.

Table 3 Performances of the proposed method when neural population size is 10

Full size table

Table 4 Performances of the proposed method when neural population size is 5

Full size table

Table 5 Description of the parameter

Full size table

4.4 Comparative analysis and discussion

The performance of the algorithm is presented in Fig. 7 and it is shown for 10 iterations for presentation purpose. The performance of the proposed method is compared with the standard classifiers. The other standard methods are SVM, Naive Bayes classifier, k-neighbors classifier, decision tree, logistic regression, and random forest classifier. The main criteria of the performance analysis are the prediction quality, which is presented in Table 6. The proposed method accuracy is reported 0.9578, which is better than the existing classifier models. The proposed method and other standard algorithms are tested without the extraction of any feature from the dataset. The proposed Swarm-ANN strategy performs well due to include two-phase weight and bias modification procedure.

Table 6 Comparative analysis of the proposed method and existing classifiers

Full size table

Equations (18) and (19) represent the second stage weight and bias modification, which are evaluated after the back-propagation technique. If the weight bias modification in the first stage gets stuck into a local minimum, then this second stage modification of weight and bias matrix is recovered. In the second stage, the distance measure part in that equation also helped the other neuron in the population to learn the weight and bias of the winner neuron. Thus, in each iteration, the neurons in the population are initiated with a new set of weight and bias matrix but these modified weight and bias matrix also guided by the best weight and bias matrix of that current iteration. The proposed method has produced an almost 0.9578 prediction accuracy rate, which proves the superiority of the swarm-ANN in the limited simulation environment.

5 Conclusions and future work

In this paper, we have designed an intelligent healthcare framework for predicting cardiovascular heart disease based on the Swarm-ANN technique. The main purpose of this work is to receive a large set of monitored parameters from heart disease patients and predicts disease using the proposed Swarm-ANN strategy. In the initial phase, the proposed Swarm-ANN strategy randomly generates a predefined set of NNs for training and evaluating the dataset. In the next phase, the NN populations are trained by two stages of weight changes, and their weight is adjusted by a newly designed heuristic formulation. Finally, the weight of the neurons is modified by sharing the global best weight with other neurons and predicts the accuracy of cardiovascular disease. The accuracy of the proposed technique is validated by a benchmark heart disease dataset with fourteen features. The proposed Swarm-ANN strategy achieves 95.78% accuracy with an equal rate of error 4.32% while predicting heart disease. Further, the simulation results exhibit that the proposed Swarm-ANN strategy outperforms the standard learning techniques in terms of various performance matrices.

In the future, we will apply various feature fusion and selection strategies for extracting the important features from the high-dimension dataset for increasing the accuracy of heart disease prediction. Further, to enhance the applicability of the proposed framework and increase prediction accuracy, we will apply various data mining techniques along with deep learning models for data preprocessing and heart disease prediction, respectively, in fog networks [29, 30].

References

Ahmed H, Younis EM, Hendawi A, Ali AA (2020) Heart disease identification from patients’ social posts, machine learning solution on spark. Fut Gener Comput Syst 111:714–722
Article Google Scholar
Kumar PM, Gandhi UD (2018) A novel three-tier Internet of Things architecture with machine learning algorithm for early detection of heart diseases. Comput Elect Eng 65:222–235
Article Google Scholar
Kwon J-M, Kim KH, Jeon K-H, Park J (2019) Deep learning for predicting in-hospital mortality among heart disease patients based on echocardiography. Echocardiography 36(2):213–218
Article Google Scholar
Hao Y, Usama M, Yang J, Hossain MS, Ghoneim A (2019) Recurrent convolutional neural network-based multimodal disease risk prediction. Fut Gener Comput Syst 92:76–83
Article Google Scholar
Jonnagaddala J, Liaw S-T, Ray P, Kumar M, Chang N-W, Dai H-J (2015) Coronary artery disease risk assessment from unstructured electronic health records using text mining. J Biomed Inform 58:S203–S210
Article Google Scholar
Melin P, Miramontes I, Prado-Arechiga G (2018) A hybrid model based on modular neural networks and fuzzy systems for classification of blood pressure and hypertension risk diagnosis. Exp Syst Appl 107:146–164
Article Google Scholar
Al-Makhadmeh Z, Tolba A (2019) Utilizing IoT wearable medical device for heart disease prediction using higher-order Boltzmann model: a classification approach”. Measurement 147:1–12
Article Google Scholar
Ali F, Kwak D, Khan P, El-Sappagh S, Ali A, Ullah S, Kim KH, Kwak K-S (2019) Transportation sentiment analysis using word embedding and ontology-based topic modeling. Knowl Based Syst 174:27–42
Article Google Scholar
Yadav A, Singh A, Dutta MK, Travieso CM (2019) Machine learning-based classification of cardiac diseases from PCG recorded heart sounds. Neural Comput Appl 32:1–14
Google Scholar
Ali F, El-Sappagh S, Islam SR, Kwak D, Ali A, Imran M, Kwak K-S (2020) A smart healthcare monitoring system for heart disease prediction based on ensemble deep learning and feature fusion. Inform Fusion 63:208–222
Article Google Scholar
Kishore A, Jayanthi V (2018) Neuro-fuzzy based medical decision support system for coronary artery disease diagnosis and risk level prediction. J Comput Theor Nanosci 15(3):1027–1037
Article Google Scholar
Ali F, Islam SR, Kwak D, Khan P, Ullah N, Yoo SJ, Kwak KS (2018) Type-2 fuzzy ontology–aided recommendation systems for IoT–based healthcare. Comput Commun 119:138–155
Article Google Scholar
Garate-Escamila AK, El-Hassani AH, Andres E (2020) Classification models for heart disease prediction using feature selection and PCA. Inform Med Unlocked 19:1–15
Article Google Scholar
Diwakar M, Tripathi A, Joshi K, Memoria M, Singh P et al (2020) Latest trends on heart disease prediction using machine learning and image fusion. Mater Today Proceed 37:3213–3218
Article Google Scholar
Haq AU, Li JP, Memon MH, Nazir S, Sun R (2018) A hybrid intelligent system framework for the prediction of heart disease using machine learning algorithms. Mob Inf Syst 2018:1–21
Google Scholar
Mohan S, Thirumalai C, Srivastava G (2019) Effective heart disease prediction using hybrid machine learning techniques. IEEE Access 7:81542–81554
Article Google Scholar
Li JP, Haq AU, Din SU, Khan J, Khan A, Saboor A (2020) Heart disease identification method using machine learning classification in e-healthcare. IEEE Access 8:107562–107582
Article Google Scholar
Khan MA, Algarni F (2020) A healthcare monitoring system for the diagnosis of heart disease in the IOMT cloud environment using msso-anfis. IEEE Access 8:122259–122269
Article Google Scholar
Fitriyani NL, Syafrudin M, Alfian G, Rhee J (2020) Hdpm: an effective heart disease prediction model for a clinical decision support system. IEEE Access 8:133034–133050
Article Google Scholar
Abdeldjouad FZ, Brahami M, Matta N (2020) A hybrid approach for heart disease diagnosis and prediction using machine learning techniques. International conference on smart homes and health telematics. Springer, New York, pp 299–306
Google Scholar
Ansari MF, AlankarKaur B, Kaur H (2020) A prediction of heart disease using machine learning algorithms. International conference on image processing and capsule networks. Springer, New York, pp 497–504
Google Scholar
Shahid AH, Singh MP, Roy B, Aadarsh A (2020) Coronary artery disease diagnosis using feature selection-based hybrid extreme learning machine. In: Proceedings of the 2020 3rd international conference on information and computer technologies (ICICT), IEEE, pp 341–346.
Javeed A, Rizvi SS, Zhou S, Riaz R, Khan SU, Kwon SJ (2020) Heart risk failure prediction using a novel feature selection method for feature refinement and neural network for classification. Mob Inf Syst 2020
Costa W, Figueiredo L, Alves E (2019) Application of an artificial neural network for heart disease diagnosis. In: Proceedings of the XXVI Brazilian congress on biomedical engineering, pp 753–758. Springer, New York
Vivekanandan T, Iyengar NCSN (2017) Optimal feature selection using a modified differential evolution algorithm and its effectiveness for prediction of heart disease. Comput Biol Med 90:125–136
Article Google Scholar
Goldberger AL, Amaral LA, Glass L, Hausdorff JM, Ivanov PC, Mark RG, Mietus JE, Moody GB, Peng C-K, Stanley HE (2000) Physiobank, physiotoolkit, and physionet: components of a new research resource for complex physiologic signals. Circulation 101(23):e215–e220
Article Google Scholar
Rish I, et al (2001) An empirical study of the naive Bayes classifier. In: IJCAI 2001 workshop on empirical methods in artificial intelligence, vol 3, pp 41–46.
Meyer D, Leisch F, Hornik K (2003) The support vector machine under test. Neurocomputing 55(1–2):169–186
Article Google Scholar
Jan MA, Khan F, Khan R, Mastorakis S, Menon VG, Watters P, Alazab M (2020) A lightweight mutual authentication and privacy-preservation scheme for intelligent wearable devices in industrial-CPS. IEEE Trans Ind Inform 1–11
Shynu PG, Menon VG, Kumar RL, Kadry S, Nam Y (2021) Blockchain-based secure healthcare application for diabetic-cardio disease prediction in fog computing. IEEE Access 9:45706–45720
Article Google Scholar

Download references

Author information

Authors and Affiliations

Computer Science and Engineering, ASETK, Amity University Kolkata, Kolkata, India
Sudarshan Nandy
Mobile & Cloud Lab, Institute of Computer Science, University of Tartu, Tartu, Estonia
Mainak Adhikari
School of Science, Engineering and Information Technology, Federation University, Mount Helen, Australia
Venki Balasubramanian
Computer Science and Engineering, SCMS School of Engineering and Technology, Ernakulam, India
Varun G. Menon
School of Physics and Electronic Information Engineering, Henan Polytechnic University, Jiaozuo, China
Xingwang Li
Department of Computer Science, Abdul Wali Khan University, Mardan, Pakistan
Muhammad Zakarya

Authors

Sudarshan Nandy
View author publications
You can also search for this author in PubMed Google Scholar
Mainak Adhikari
View author publications
You can also search for this author in PubMed Google Scholar
Venki Balasubramanian
View author publications
You can also search for this author in PubMed Google Scholar
Varun G. Menon
View author publications
You can also search for this author in PubMed Google Scholar
Xingwang Li
View author publications
You can also search for this author in PubMed Google Scholar
Muhammad Zakarya
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Varun G. Menon.

Ethics declarations

Conflict of interest

The authors declare that there are no potential conflicts of interest in this work.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Nandy, S., Adhikari, M., Balasubramanian, V. et al. An intelligent heart disease prediction system based on swarm-artificial neural network. Neural Comput & Applic 35, 14723–14737 (2023). https://doi.org/10.1007/s00521-021-06124-1

Download citation

Received: 21 February 2021
Accepted: 11 May 2021
Published: 27 May 2021
Issue Date: July 2023
DOI: https://doi.org/10.1007/s00521-021-06124-1

An intelligent heart disease prediction system based on swarm-artificial neural network

Abstract

Similar content being viewed by others

Risk Prediction of Heart Disease Based on Swarm Optimized Neural Network

RETRACTED ARTICLE: Effective heart disease prediction using improved particle swarm optimization algorithm and ensemble classification technique

Heart Attack Classification System Using Neural Network Trained with Particle Swarm Optimization