Abstract
Cardiac and cardiovascular diseases are among the most prevalent and dangerous ailments affecting human health. Detecting cardiac disease at an early stage from early symptoms remains a major problem. There is therefore a demand for technology that can identify cardiac disease non-invasively and at lower cost. In this research we develop a hybrid deep learning methodology for the classification of cardiac disease. Synthetic data are classified with a hybrid RNN and LSTM approach under different cross-validation settings. The system's performance is also evaluated against a variety of machine learning and deep learning methods and soft computing approaches. During classification, the RNN employs three separate activation functions. Pre-processing methods were used to balance, sort, and classify the data. Features were extracted using relational, bigram, and density-based approaches. The classification accuracy of each algorithm is shown in the results section. The results indicate that hybrid deep learning is more accurate than either classic deep learning or machine learning techniques used alone.
1 Introduction
The heart pumps oxygen-rich blood throughout the body via a network of arteries and veins, making it one of the most important organs in the body. The heart can be affected by a variety of conditions, collectively known as heart disease [1]. Heart disease is considered dangerous because a large share of deaths are attributed to it and to other heart-related ailments [2, 3]. Medical researchers have noted that, on many occasions, the majority of heart patients do not survive their heart attacks [4]. The rising incidence of cardiovascular disorders, which are associated with a high death rate, is a substantial concern and places a significant strain on healthcare systems across the world. Although males are more likely than females to suffer from cardiovascular disorders, particularly in middle or old age, young people can also suffer from comparable health problems [5,6,7]. Heart diseases are classified into several categories, including coronary artery disease, congenital heart disease, arrhythmia, and others. Heart disease manifests itself in a variety of ways, with symptoms such as chest discomfort, dizziness, and excessive perspiration. The most common causes of heart disease include smoking, high blood pressure, diabetes, and obesity [8]. Recent advances in health decision-making have led to the development of machine learning systems for health-related applications [9]. These systems are intended to increase the accuracy of cardiac diagnosis decisions through the use of computer-aided tools. They rely on optimization [10], clustering, and ML computation models [9, 11, 12]. With the development of machine learning and artificial intelligence, researchers can now build prediction models from the large amount of data that is already accessible.
Recent research on heart-related concerns in both adults and children has stressed the need to lower the mortality rate associated with cardiac and cardiovascular diseases (CVDs) [7]. Machine learning algorithms perform best when they are trained on appropriate datasets [13, 14]. Data can be prepared in many ways before a predictive algorithm is applied: data mining, Relief selection, or the LASSO method can be used to ready the data for a more accurate prediction. Once the right features have been chosen, classifiers and hybrid models can be used to predict how likely it is that a disease will occur. Researchers have used different methods to build classifiers and hybrid models [15, 16]. Heart disease prediction remains unreliable because medical datasets are scarce, many different ML algorithms are available to choose from, and detailed analysis is lacking [7, 17]. Classification relies heavily on feature selection, because the characteristics extracted from the object are the primary basis for categorization, and results can be improved by using the best characteristics [18]. It is critical to choose the relevant characteristics that may be employed as risk factors in forecasting models. To construct successful prediction models, it is important to pick the optimal combination of features and machine learning algorithms. Risk factors that meet the three criteria of high prevalence, considerable independent influence on heart disease, and controllability or treatability should be assessed for their impact in order to lower the risks [7, 18].
The following are the most significant contributions made by this paper:
- This paper discusses the application of RNN-LSTM in the implementation of collaborative classification approaches for detection and classification.
- A synthetic Cleveland heart disease dataset is used to carry out this research.
The remainder of this paper is organized as follows. Sect. 2 presents the motivation together with a literature study of several currently available approaches. Sect. 3 describes the research methodology, dataset selection, and the algorithm of the suggested implementation. Sect. 4 provides the results of the suggested approach as well as a comparison with other state-of-the-art procedures. Sect. 5 concludes the paper and examines the future potential of the work.
2 Motivation
This section examines the need for a new algorithm to predict heart disease and the pros and cons of existing research. The challenges and the literature review are summarized in Table 1.
For classification, researchers use different artificial intelligence techniques such as machine learning [26,27,28,29,30,31] and deep learning [32,33,34] algorithms. In the surveyed work on heart disease prediction, machine learning was employed 60% of the time, whereas deep learning techniques were used 30% of the time. Many machine learning approaches, including Naive Bayes [35, 36], Decision Tree, KNN [37, 38], Support Vector Machine [39, 40], Random Forest [41], Logistic Regression, optimization techniques [42] and others, are used to categorize disease datasets. Deep learning approaches based on the ANN have been used by many researchers, in particular the RNN and CNN algorithms. Convolutional neural networks (CNNs) [41,42,43] are often used in the classification of ECG images and other visual data. Recurrent neural networks (RNNs) and multi-layered feed-forward neural networks (MLFFNNs) [44] are kinds of artificial neural network [43, 45]; RNNs improve on prior networks that are restricted to fixed-size input and output vectors.
3 Research methodology and dataset selection
The proposed system is divided into two phases, training and testing. In this research, an effective disease prediction method using deep learning techniques is proposed. The dataset plays an important role in the entire execution process and in achieving good classification accuracy. The data were obtained from the Heart Disease Data Set in the University of California, Irvine (UCI) Machine Learning Repository. The dataset is made up of 14 columns, each of which is listed with a brief description in Table 2.
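As a rough illustration of the data layout, the sketch below parses comma-separated rows into the 14 attribute names commonly used for the processed Cleveland file in the UCI repository. The two sample rows are illustrative only, not real patient records, and collapsing the 0 to 4 `num` field into a binary label is a common convention assumed here rather than something the text specifies. The helper names `parse_rows` and `binary_label` are hypothetical.

```python
import csv
import io

# Attribute names commonly used for the processed Cleveland file (14 columns).
COLUMNS = ["age", "sex", "cp", "trestbps", "chol", "fbs", "restecg",
           "thalach", "exang", "oldpeak", "slope", "ca", "thal", "num"]

# Illustrative rows only (not real records); "?" marks a missing value.
SAMPLE = """63,1,1,145,233,1,2,150,0,2.3,3,0,6,0
57,0,4,140,241,0,0,123,1,0.2,2,?,7,1
"""

def parse_rows(text):
    """Turn comma-separated text into dicts of floats, using None for '?'."""
    records = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        values = [None if cell.strip() == "?" else float(cell) for cell in row]
        records.append(dict(zip(COLUMNS, values)))
    return records

def binary_label(record):
    """Assumed convention: num == 0 means absent (0), anything else present (1)."""
    return 0 if record["num"] == 0 else 1

records = parse_rows(SAMPLE)
labels = [binary_label(r) for r in records]
print(records[0]["age"], labels)
```

Rows with a `None` value would need to be dropped or imputed during pre-processing before they are passed to a classifier.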
Fig. 1 describes the training and testing phases of synthetic Cleveland data classification. The training module generates Background Knowledge (BK) for all classes, and the testing module predicts the class label for each new input record. The system also calculates the vascular age of the heart (VaH) using the formula below,
where \({S}_{0}\left(t\right)\) is the baseline survival at follow-up time t (where t = 10 years), \({\beta }_{i}\) is the estimated regression coefficient, \({\chi }_{i}\) denotes the log-transformed measured value of the ith risk factor, \({\overline{\chi }}_{i}\) is the corresponding mean, and \(p\) indicates the number of risk factors.
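The formula itself does not appear in this copy of the text. The symbols defined here match the general form of a Cox-model risk function of the kind used in Framingham-style cardiovascular risk profiles. Assuming that is the intended expression, the 10-year risk, written here as \(\mathrm{Risk}(t)\) (a symbol introduced only for this sketch), would read:

\[
\mathrm{Risk}(t) = 1 - {S}_{0}\left(t\right)^{\exp\left(\sum_{i=1}^{p}{\beta }_{i}{\chi }_{i} - \sum_{i=1}^{p}{\beta }_{i}{\overline{\chi }}_{i}\right)}
\]

Under the same assumption, a vascular age is usually read off as the age at which a person with reference levels of the other risk factors would have the same predicted risk.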
3.1 Experimental setup
We used a Windows 10 computer with 8 GB of RAM and the Python programming language to conduct our experiment. The datasets listed below are used in the implementation.
3.2 Algorithm
The proposed system uses a hybrid deep learning classification algorithm that combines RNN and LSTM. Each phase of system execution with the hybrid algorithm is shown below (Tables 3, 4).
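The tables with the algorithm are not reproduced here, so the following is only a minimal sketch of one way an RNN stage and an LSTM stage could be chained, not the authors' algorithm. It assumes scalar hidden states, untrained random weights, a simple RNN whose output feeds the LSTM at every step, and a selectable RNN activation (sigmoid, tanh, or ReLU, the three named in the paper). All function and parameter names are hypothetical.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    return max(0.0, x)

ACTIVATIONS = {"sigmoid": sigmoid, "tanh": math.tanh, "relu": relu}

def rnn_step(x, h_prev, w_x, w_h, b, activation):
    """Simple RNN step with a scalar hidden state: h = f(w_x . x + w_h * h_prev + b)."""
    pre = sum(wi * xi for wi, xi in zip(w_x, x)) + w_h * h_prev + b
    return ACTIVATIONS[activation](pre)

def lstm_step(x, h_prev, c_prev, params):
    """LSTM step with scalar input, hidden state and cell state."""
    pre = {name: w_x * x + w_h * h_prev + b
           for name, (w_x, w_h, b) in params.items()}
    i = sigmoid(pre["i"])        # input gate
    f = sigmoid(pre["f"])        # forget gate
    o = sigmoid(pre["o"])        # output gate
    g = math.tanh(pre["g"])      # candidate cell value
    c = f * c_prev + i * g       # cell state carries the longer-term memory
    h = o * math.tanh(c)
    return h, c

def hybrid_forward(sequence, rnn_params, lstm_params, w_out, b_out,
                   activation="sigmoid"):
    """Feed each RNN output into the LSTM, then score the final LSTM state."""
    h_rnn, h_lstm, c_lstm = 0.0, 0.0, 0.0
    for x in sequence:
        h_rnn = rnn_step(x, h_rnn, *rnn_params, activation)
        h_lstm, c_lstm = lstm_step(h_rnn, h_lstm, c_lstm, lstm_params)
    return sigmoid(w_out * h_lstm + b_out)  # value in (0, 1) for the positive class

# Usage with random, untrained weights and a made-up 3-step input sequence.
rng = random.Random(0)
n_features = 13  # assumed: 14 columns minus the class label
rnn_params = ([rng.uniform(-0.5, 0.5) for _ in range(n_features)],
              rng.uniform(-0.5, 0.5), 0.0)
lstm_params = {name: (rng.uniform(-0.5, 0.5), rng.uniform(-0.5, 0.5), 0.0)
               for name in ("i", "f", "o", "g")}
sequence = [[rng.random() for _ in range(n_features)] for _ in range(3)]
probabilities = {name: hybrid_forward(sequence, rnn_params, lstm_params,
                                      1.0, 0.0, name)
                 for name in ACTIVATIONS}
print(probabilities)
```

How tabular records are arranged into sequences is not specified in the text. In practice the weights would be learned by backpropagation in a deep learning library rather than drawn at random.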
3.3 Performance metrics
The performance metrics used to determine the efficacy of the proposed hybrid model for heart disease classification are listed below:
Precision: also known as positive predictive value, is the proportion of samples predicted as positive that are actually positive.
Sensitivity: also known as recall, is the proportion of actual positive samples that are correctly predicted as positive.
Accuracy: is the ratio of correct classifications to the total number of classifications.
F-Measure: is the harmonic mean of precision and recall.
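The four metrics above can be computed from the counts of true positives, false positives, true negatives, and false negatives. The sketch below does this for binary labels. The function names and the toy label vectors are illustrative and are not results from the paper.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary labels (1 = disease, 0 = no disease)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == 1 and truth == 1:
            tp += 1
        elif pred == 1 and truth == 0:
            fp += 1
        elif pred == 0 and truth == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def classification_metrics(y_true, y_pred):
    """Return precision, recall (sensitivity), accuracy and F-measure."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    # Harmonic mean of precision and recall.
    f_measure = (2 * precision * recall / (precision + recall)
                 if (precision + recall) else 0.0)
    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "f_measure": f_measure}

# Toy example: 3 TP, 1 FP, 3 TN, 1 FN.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Returning 0.0 when a denominator is zero is a design choice made here to avoid a division error; some libraries raise a warning instead.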
4 Results and discussions
A thorough experimental study was conducted on the systems deployed on the Windows platform, running Python 3.7 and the RESNET100 deep learning framework.
4.1 Experiment using RNN-LSTM (sigmoid)
Our goal in this experiment was to demonstrate the classification accuracy of RNN (Sigmoid) using the Cleveland heart disease dataset. The results of similar trials utilizing different cross validation methods are presented in Table 5. This analysis found that 15-fold cross validation has the highest average classification accuracy (95.0%) of the methods tested.
In this case, fivefold cross validation with RNN and the sigmoid function achieves 93.6% accuracy (Fig. 2). Figures 3 and 4 likewise depict cross validation on tenfold data. During module testing, the accuracy of both functions is almost the same.
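The 5-, 10- and 15-fold settings used in these experiments can be sketched with a small index splitter. The sample count of 303 (the size usually quoted for the processed Cleveland file), the fixed shuffle seed, and the `evaluate` callback are assumptions for illustration; the text does not state how the folds were drawn.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal shuffled indices round-robin so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for n, fold in enumerate(folds) if n != i for j in fold]
        yield train_idx, test_idx

def cross_validate(n_samples, k, evaluate):
    """Average a user-supplied evaluate(train_idx, test_idx) score over k folds."""
    scores = [evaluate(train, test) for train, test in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)

# Fold sizes for the three settings, assuming 303 records.
fold_sizes = {}
for k in (5, 10, 15):
    sizes = [len(test) for _, test in k_fold_indices(303, k)]
    fold_sizes[k] = (min(sizes), max(sizes))
print(fold_sizes)
```

With more folds each model is trained on a larger share of the data and tested on a smaller one, which is one reason average accuracy can differ between the 5-, 10- and 15-fold settings.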
4.2 Experiment using recurrent neural network (TanH)
The classification accuracy of RNNs on the Cleveland dataset is shown in Fig. 3, and the results of analogous experiments using the cross validation approach are shown in Table 6. According to our findings, 15-fold cross validation achieves the highest average classification accuracy for RNNs using Tan-h, with reported values of 93.55% and 94.90%. The variables and functions that were utilized in the suggested detection algorithm are listed below.
4.3 Experiment using recurrent neural network (ReLU)
In this experiment, we investigate the classification accuracy of the RNN with ReLU on the Cleveland dataset. Similar experiments were performed with other cross-validation settings (Fold5, Fold10, and Fold15), and the results are provided in Table 7 for comparison. Based on these findings, tenfold cross validation yields the highest classification accuracy for the RNN, with reported values of 95.30% and 97.10%.
Table 7 reports 5-, 10- and 15-fold cross validation training of RNN (Tan h activation function).
Fig. 4 depicts the suggested deep learning classification algorithm alongside a machine learning technique. The figure illustrates the difference between the results obtained with and without cross-validation. Disease identification was accomplished using a minimum of three hidden layers. Following the results of this experiment, we conclude that RNN with sigmoid gives better detection accuracy than the other two activation functions used in this study, as well as the random forest machine learning method.
4.4 Comparative analysis of system
A further analysis examines disease diagnosis using supervised machine learning classification. We make four comparisons between our results and the findings of other systems, all of which are based on comparable datasets (Table 8; Figs. 5, 6).
RNN, LSTM, and RNN + LSTM hybrid models have not previously been applied to the Cleveland heart disease dataset. This study's accuracy is 95.10%, which is higher than the accuracy of previous ML and DL models. Table 9 and Fig. 7 show the classification accuracy of the proposed model compared with several current machine learning methods. The suggested hybrid models outperform the Support Vector Machine, the Decision Tree, and the KNN algorithms in terms of accuracy. The data are organized into a training set and a test set. The training set consists of input feature vectors and their associated class labels. A classification model is constructed from the training set, which maps the inputs to their matching labels. Afterwards, the model is evaluated on a test set by comparing its predictions with the class labels of the test records.
5 Conclusion
To analyse the proposed system, we used a variety of machine learning methods, including a hybrid deep learning algorithm that we developed. The performance of the combined deep learning algorithms and of the individual algorithms is reported in the results section. When the RNN alone is used to run the system, memory difficulties arise in the feedback layers. By using LSTM, we can overcome the memory issue and handle vast amounts of data in a timely manner. The hybrid deep learning algorithms achieve an average classification accuracy of around 95.10% across a variety of cross-validation tests. The choice of activation function, the epoch size, and the kind of features are all variables that may be changed. The system's future development will involve experiments on real-time IoT data.
References
Rani P, Kumar R, Sid NMO, Anurag A (2021) A decision support system for heart disease prediction based upon machine learning. J Reliab Intell Environ 7(3):263–275. https://doi.org/10.1007/s40860-021-00133-6
Assari R, Azimi P, Reza Taghva M (2017) Heart disease diagnosis using data mining techniques. Int J Econ Manag Sci 06(03):750–753. https://doi.org/10.4172/2162-6359.1000415
Krishnaiah V, Srinivas M, Narsimha G, Chandra NS (2014) Diagnosis of heart disease patients using fuzzy classification technique. IEEE Int Conf Comput Commun Technol. https://doi.org/10.1109/ICCCT2.2014.7066746
Mamatha Alex P, Shaji SP (2019) Prediction and diagnosis of heart disease patients using data mining technique. In: Proceedings of the 2019 IEEE international conference on communication and signal processing. ICCSP 2019, pp 848–852. https://doi.org/10.1109/ICCSP.2019.8697977
Jousilahti P, Vartiainen E, Tuomilehto J, Puska P (1999) Sex, age, cardiovascular risk factors, and coronary heart disease. Circulation 99(9):1165–1172. https://doi.org/10.1161/01.cir.99.9.1165
Subhadra K, Vikas B (2019) Neural network based intelligent system for predicting heart disease. Int J Innov Technol Explor Eng 8(5):484–487. [Online]. https://www.researchgate.net/publication/332035370_Neural_network_based_intelligent_system_for_predicting_heart_disease
Ghosh P et al (2021) Efficient prediction of cardiovascular disease using machine learning algorithms with relief and LASSO feature selection techniques. IEEE Access 9:19304–19326. https://doi.org/10.1109/ACCESS.2021.3053759
Razmjooy N, Rashid Sheykhahmad F, Ghadimi N (2018) A hybrid neural network—world cup optimization algorithm for melanoma detection. Open Med 13(1):9–16. https://doi.org/10.1515/med-2018-0002
Swarnalatha GMP (2021) Optimal feature selection through a cluster—based DT learning (CDTL) in heart disease prediction. Evol Intell 14(2):583–593. https://doi.org/10.1007/s12065-019-00336-0
Moallem P, Razmjooy N, Ashourian M (2013) Computer vision-based potato defect detection using neural networks and support vector machine. Int J Robot Autom 28(2):137–145. https://doi.org/10.2316/Journal.206.2013.2.206-3746
Mousavi BS (2011) Digital image segmentation using rule-base classifier. Am J Sci Res 35(35):17–23. [Online]. https://www.academia.edu/38367918/Digital_Image_Segmentation_Using_Rule_Base_Classifier
Amin MS, Chiam YK, Varathan KD (2019) Identification of significant features and data mining techniques in predicting heart disease. Telemat Inform 36:82–93. https://doi.org/10.1016/j.tele.2018.11.007
Kondababu A, Siddhartha V, Kumar BHKB, Penumutchi B (2021) A comparative study on machine learning based heart disease prediction. Mater Today Proc. https://doi.org/10.1016/j.matpr.2021.01.475
Singh D, Samagh JS (2020) A comprehensive review of heart disease prediction using machine learning. J Crit Rev 7(12):281–285. https://doi.org/10.31838/jcr.07.12.54
Tama BA, Im S, Lee S (2020) Improving an intelligent detection system for coronary heart disease using a two-tier classifier ensemble. Biomed Res Int. https://doi.org/10.1155/2020/9816142
Youssef MM, Mousa SA, Baloola MO, Fouda BM (2020) The impact of mobile augmented reality design implementation on user engagement. CCIS. Springer book series, vol 1244
Kausar N, Palaniappan S, Samir BB, Abdullah A, Dey N (2016) Systematic analysis of applied data mining based optimization algorithms in clinical attribute extraction and classification for diagnosis of cardiac patients. Intell Syst Ref Libr 96:217–231. https://doi.org/10.1007/978-3-319-21212-8_9
Saranya G, Pravin A (2021) Hybrid global sensitivity analysis based optimal attribute selection using classification techniques by machine learning algorithm. Wirel Pers Commun. https://doi.org/10.1007/s11277-021-08796-3
Ali F et al (2021) Feature optimization by discrete weights for heart disease prediction using supervised learning. Soft Comput 25(3):1821–1831. https://doi.org/10.1007/s00500-020-05253-4
Saranya G, Pravin A (2021) Learning algorithm. Wirel Pers Commun. https://doi.org/10.1007/s11277-021-08796-3
Prakash B, Debnath D, Midhun B (2021) A hybrid machine learning approach to identify coronary diseases using feature selection mechanism on heart disease dataset. Distrib Parallel Databases. https://doi.org/10.1007/s10619-021-07329-y
Ali F et al (2020) A smart healthcare monitoring system for heart disease prediction based on ensemble deep learning and feature fusion. Inf Fusion 63:208–222. https://doi.org/10.1016/j.inffus.2020.06.008
Yazdani A, Varathan KD, Chiam YK, Malik AW, Azman W, Ahmad W (2021) A novel approach for heart disease prediction using strength scores with significant predictors. BMC Med Inform Decis Mak. https://doi.org/10.1186/s12911-021-01527-5
Thanga Selvi R, Muthulakshmi I (2021) An optimal artificial neural network based big data application for heart disease diagnosis and classification model. J Ambient Intell Humaniz Comput 12(6):6129–6139. https://doi.org/10.1007/s12652-020-02181-x
Pandian MSA (2021) Intelligent big data analytics model for efficient cardiac disease prediction with IoT devices in WSN using fuzzy rules. Wirel Pers Commun. https://doi.org/10.1007/s11277-021-08788-3
Muthulakshmi RTSI (2021) An optimal artificial neural network based big data application for heart disease diagnosis and classification model. J Ambient Intell Humaniz Comput 12(6):6129–6139. https://doi.org/10.1007/s12652-020-02181-x
Safa M, Pandian A (2021) Intelligent big data analytics model for efficient cardiac disease prediction with IoT devices in WSN using fuzzy rules. Wirel Pers Commun. https://doi.org/10.1007/s11277-021-08788-3
Shidnal S, Latte MV, Kapoor A (2021) Crop yield prediction: two-tiered machine learning model approach. Int J Inf Technol 13(5):1983–1991. https://doi.org/10.1007/s41870-019-00375-x
Niranjan D, Kavya M, Neethi KT, Prarthan KM, Manjuprasad B (2021) Machine learning based analysis of pulse rate using Panchamahabhutas and Ayurveda. Int J Inf Technol 13(4):1667–1670. https://doi.org/10.1007/s41870-021-00690-2
Nayakwadi N, Fatima R (2021) Automatic handover execution technique using machine learning algorithm for heterogeneous wireless networks. Int J Inf Technol 13(4):1431–1439. https://doi.org/10.1007/s41870-021-00627-9
Mangrulkar A, Rane SB, Sunnapwar V (2021) Automated skull damage detection from assembled skull model using computer vision and machine learning. Int J Inf Technol 13(5):1785–1790. https://doi.org/10.1007/s41870-021-00752-5
Mahajan J, Banal K, Mahajan S (2021) Estimation of crop production using machine learning techniques: a case study of J&K. Int J Inf Technol 13(4):1441–1448. https://doi.org/10.1007/s41870-021-00653-7
Bojamma AM, Shastry C (2021) A study on the machine learning techniques for automated plant species identification: current trends and challenges. Int J Inf Technol 13(3):989–995. https://doi.org/10.1007/s41870-019-00379-7
Divate MS (2021) Sentiment analysis of Marathi news using LSTM. Int J Inf Technol 13(5):2069–2074. https://doi.org/10.1007/s41870-021-00702-1
Pattekari A, Parveen SA (2012) Prediction system for heart disease using Naïve Bayes. Int J Adv Comput Math Sci 3(3):290–294
Dulhare UN (2018) Prediction system for heart disease using Naive Bayes and particle swarm optimization. Biomed Res 29(12):2646–2649. https://doi.org/10.4066/biomedicalresearch.29-18-620
Kulkarni TR, Dushyanth ND (2021) Performance evaluation of deep learning models in detection of different types of arrhythmia using photo plethysmography signals. Int J Inf Technol 13(6):2209–2214. https://doi.org/10.1007/s41870-021-00795-8
Pandey NN, Muppalaneni NB (2021) A novel algorithmic approach of open eye analysis for drowsiness detection. Int J Inf Technol 13(6):2199–2208. https://doi.org/10.1007/s41870-021-00811-x
Patil AR, Subbaraman S (2021) Performance analysis of static hand gesture recognition approaches using artificial neural network, support vector machine and two stream based transfer learning approach. Int J Inf Technol. https://doi.org/10.1007/s41870-021-00831-7
Chandra MA, Bedi SS (2021) Survey on SVM and their application in image classification. Int J Inf Technol 13(5):1867–1877. https://doi.org/10.1007/s41870-017-0080-1
Sharma LD, Sunkaria RK (2019) Detection and delineation of the enigmatic U-wave in an electrocardiogram. Int J Inf Technol 13(6):2525–2532. https://doi.org/10.1007/s41870-019-00287-w
Usha Kirana SP, D’Mello DA (2021) Energy-efficient enhanced Particle Swarm Optimization for virtual machine consolidation in cloud environment. Int J Inf Technol 13(6):2153–2161. https://doi.org/10.1007/s41870-021-00745-4
Mane DT, Tapdiya R, Shinde SV (2021) Handwritten Marathi numeral recognition using stacked ensemble neural network. Int J Inf Technol 13(5):1993–1999. https://doi.org/10.1007/s41870-021-00723-w
Kumar R, Srivastava S, Dass A, Srivastava S (2019) A novel approach to predict stock market price using radial basis function network. Int J Inf Technol 13(6):2277–2285. https://doi.org/10.1007/s41870-019-00382-y
Sharma LD, Chhabra H, Chauhan U, Saraswat RK, Sunkaria RK (2021) Mental arithmetic task load recognition using EEG signal and Bayesian optimized K-nearest neighbor. Int J Inf Technol 13(6):2363–2369. https://doi.org/10.1007/s41870-021-00807-7
Ethics declarations
Conflict of interest
It has been declared by the authors that they have no conflict of interest.
Cite this article
Bhavekar, G.S., Goswami, A.D. A hybrid model for heart disease prediction using recurrent neural network and long short term memory. Int. j. inf. tecnol. 14, 1781–1789 (2022). https://doi.org/10.1007/s41870-022-00896-y
DOI: https://doi.org/10.1007/s41870-022-00896-y