Predicting the Heart Disease Using Machine Learning Techniques

Goyal, Somya

doi:10.1007/978-981-19-5224-1_21

Somya Goyal¹²

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 517)

Abstract

Heart disease refers to a condition in which the heart cannot pump the required amount of blood to the body. Heart disease (HD) is the leading cause of death worldwide. Early prediction can save lives: when cardiovascular or heart disease is predicted in advance, the person can be warned and death may be prevented. Machine learning (ML) has contributed substantially to distinguishing people with heart disease from the healthy population. This paper proposes three heart disease prediction (HDP) models, namely LOFS-ANN, LOFS-SVM, and LOFS-DT, which combine a lion optimization-based feature selection (LOFS) method with three ML-based classifiers. The datasets are taken from the UCI repository. The comparative analysis shows that LOFS-ANN performs best among the three models, with an AUC of 97.1% and an accuracy of 90.5%. A statistical comparison with the competing models indicates that LOFS-ANN has significant potential for predicting heart disease.

1 Introduction

Heart disease (HD) is the leading cause of death around the world. The WHO reported that cardiovascular diseases caused about 17.7 million deaths worldwide in 2015 [1]. Early prediction of HD can help save lives by allowing warnings and precautionary measures to be issued. Machine learning (ML) techniques play a crucial role in heart disease prediction (HDP) using previously collected patient data [2]. A wide range of ML techniques is available for developing heart disease predictors [3]. Patient datasets contain numerous attributes, not all of which are useful for predicting heart disease. Feature selection (FS) helps improve prediction accuracy by removing non-contributing and irrelevant attributes [4,5,6,7,8]. Bio-inspired algorithms are gaining popularity for FS [9]. This study uses the lion optimization (LO) algorithm, which is derived from the social behavior of lions [10]. Lion optimization for feature selection (LOFS) has not yet been used in the ML-based HDP domain. To streamline the research, the following research goals are established:

R1 To report the best ML-based HDP model among the proposed models to predict heart disease effectively.
R2 To establish the statistical validation of the work.

The paper is organized as follows: Sect. 2 discusses the literature related to this study. Section 3 gives the experimental methods and setup. Section 4 reports the experimental results. Section 5 concludes the work and outlines future work.

2 Literature Work

This section surveys the literature on HDP using machine learning techniques. The survey is summarized in Table 1.

Table 1 Related work in the literature

3 Research Methodology

This section outlines the research methodology adopted for this work, including the experimental methods and setup.

This work uses three datasets from the UCI repository [15]. The dataset attributes are described in Table 2. Each patient dataset is partitioned into training and testing sets in a 70–30 ratio. Then, the lion optimization algorithm for feature selection (LOFS) [14] is applied to select the most significant features. The features selected by LOFS for the three datasets are listed in Table 3. Only the selected features are then fed to the ML-based classifiers for training. The classification algorithms chosen for heart disease prediction (HDP) are among the most widely used [2]: artificial neural network (ANN) [16], support vector machine (SVM) [17], and decision trees (DT) [18, 19]. The performance of all three classifiers is recorded over all three datasets. Figure 1 depicts the proposed experimental model.
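The data flow described above (a 70–30 partition followed by keeping only the selected attributes) can be sketched as follows. This is a minimal illustration, not the author's implementation: the helper names `split_70_30` and `apply_mask`, the toy data, and the feature mask are all hypothetical, and the mask is simply assumed to come from a feature selector such as LOFS.

```python
import random


def split_70_30(rows, labels, seed=0):
    """Shuffle the sample indices and split them 70/30 into train and test."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    cut = int(round(0.7 * len(idx)))

    def pick(ids):
        return [rows[i] for i in ids], [labels[i] for i in ids]

    return pick(idx[:cut]), pick(idx[cut:])


def apply_mask(rows, mask):
    """Keep only the columns whose mask entry is 1 (the selected features)."""
    return [[v for v, keep in zip(row, mask) if keep] for row in rows]


# Hypothetical toy data: 10 patients, 4 attributes, binary target.
rows = [[i, i * 2, i % 3, 10 - i] for i in range(10)]
labels = [i % 2 for i in range(10)]
mask = [1, 0, 1, 1]  # assumed output of a feature selector such as LOFS

(train_X, train_y), (test_X, test_y) = split_70_30(rows, labels)
train_X_sel = apply_mask(train_X, mask)  # features used to train a classifier
test_X_sel = apply_mask(test_X, mask)    # same columns kept for testing
```

The same mask is applied to both partitions so that a classifier trained on the reduced training set sees identical columns at test time.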

Table 2 Description of the datasets used

Table 3 Features Selected Using LOFS Algorithm

Fig. 1 Flow chart of the proposed experimental model: features are selected from three UCI datasets by LOFS, heart disease is predicted by three machine learning algorithms, and their performance is evaluated.

For the performance evaluation, ROC, AUC, and accuracy are considered [2, 3, 11,12,13, 16,17,18,19,20,21].
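For readers unfamiliar with these measures, the sketch below computes accuracy and AUC on a small example. It uses the pairwise (Mann-Whitney) formulation of AUC, which is one standard way to compute it and is not necessarily how the values in this paper were obtained. The labels, scores, and the 0.5 decision threshold are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Tied scores count as half a win (Mann-Whitney formulation).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = heart disease) and classifier scores.
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.2, 0.6]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"AUC      = {auc(y_true, scores):.3f}")
```

Accuracy depends on the chosen threshold, whereas AUC summarizes the ranking quality of the scores over all thresholds, which is why the two measures are reported separately.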

4 Results and Discussion

This section reports the experimental results and the inferences drawn from their analysis.

4.1 Finding the Best ML-Based HDP Model (R1)

LOFS-ANN, LOFS-SVM, and LOFS-DT are compared to find the best performer. First, the AUC values of all candidate models are recorded over all three datasets and reported in Table 4. Next, the accuracy measure is recorded (see Table 5). LOFS-ANN performs best on the accuracy criterion as well. The results are plotted in Fig. 2 to visualize the comparison.

Table 4 Comparison over AUC

Table 5 Comparison over accuracy

To achieve goal R1, ROC is considered for performance evaluation. The ROC plots for the three datasets, UCI Heart Disease Dataset (Cleveland) [15], UCI Statlog (Heart), and UCI Heart Failure Clinical Dataset, are reported in Figs. 3, 4, and 5, respectively.

Fig. 2 Bar chart of the performance metrics AUC and accuracy for the three HDP models LOFS-ANN, LOFS-SVM, and LOFS-DT; the ANN model performs best.

Fig. 3 ROC plot with three curves (LOFS-ANN, SVM, and DT), with the false positive rate on the x-axis and the true positive rate on the y-axis. All curves trend upward.
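An ROC plot such as the one described above is built by sweeping the decision threshold over the classifier scores and recording the false positive rate and true positive rate at each step. The following sketch shows that construction on hypothetical labels and scores. The function name `roc_points` is illustrative, and the plots in this paper may have been produced with a different tool.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) pairs obtained by lowering the decision threshold."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    ranked = sorted(zip(scores, y_true), reverse=True)  # highest score first
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after all samples sharing this score are counted.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / n_neg, tp / n_pos))
    return points


# Hypothetical labels (1 = heart disease) and classifier scores.
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.2, 0.6]

for fpr, tpr in roc_points(y_true, scores):
    print(f"FPR = {fpr:.2f}, TPR = {tpr:.2f}")
```

A curve that stays closer to the top-left corner corresponds to a larger area under the curve, which is the sense in which one model's ROC plot is better than another's.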

The experimental results show that LOFS-ANN achieves the best accuracy for predicting heart disease compared with the other models.

Response to R1—The proposed LOFS-ANN performs best among the proposed models for all datasets.

4.2 Statistical Justification (R2)

To provide statistical evidence, Friedman's test is conducted [20]. The test result indicates whether statistical evidence exists for the finding under goal R1. The test is conducted at a significance level of 5%. The results show that the p-value is less than 0.05 (see Fig. 6). Hence, the claim that the proposed LOFS-ANN-based HDP model is better than LOFS-SVM and LOFS-DT is statistically validated.

Fig. 6 Friedman's ANOVA table with six columns (Source, SS, df, MS, Chi-square, and Prob Chi-square) and four rows.
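As an illustration of how such a test works, the sketch below computes the Friedman chi-square statistic and its p-value for three models compared over several datasets. It is a simplified version under stated assumptions: no tied scores within a dataset, exactly three models (so the chi-square distribution has 2 degrees of freedom and its upper tail is exactly exp(-x/2)), and hypothetical scores that are not the values reported in this paper.

```python
import math


def friedman_three_models(scores):
    """Friedman chi-square statistic and p-value for an n x 3 score table.

    Rows are datasets (blocks) and columns are models (treatments).
    Assumes no tied scores within a row.
    """
    k = 3
    if any(len(row) != k for row in scores):
        raise ValueError("expected exactly three model scores per dataset")
    n = len(scores)
    rank_sums = [0] * k
    for row in scores:
        order = sorted(range(k), key=lambda j: row[j])  # lowest score first
        for rank, j in enumerate(order, start=1):
            rank_sums[j] += rank
    chi2 = 12.0 * sum(r * r for r in rank_sums) / (n * k * (k + 1)) - 3.0 * n * (k + 1)
    # With k = 3 there are 2 degrees of freedom, so the tail is exp(-x / 2).
    p_value = math.exp(-chi2 / 2.0)
    return chi2, p_value


# Hypothetical scores per dataset for (ANN, SVM, DT); not the paper's results.
table = [
    [0.95, 0.90, 0.85],
    [0.93, 0.89, 0.84],
    [0.96, 0.91, 0.88],
]
chi2, p = friedman_three_models(table)
print(f"chi-square = {chi2:.3f}, p = {p:.4f}")
```

With three datasets and three models, 6 is the largest value this statistic can take, so the p-value falls just below 0.05 only when the models are ranked identically on every dataset.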

Response to R2—There exists statistical proof to validate the research work carried out in this paper.

5 Conclusion

Heart disease is the leading cause of death worldwide. If it is predicted well in advance and the patient is forewarned, lives can be saved. ML classification algorithms are used to predict heart disease. The accuracy of a heart disease predictor improves when an appropriate subset of features, those well correlated with the target, is selected from the full feature set. In this paper, the lion optimization-based feature selection (LOFS) method is used to select the most significant features from three datasets: UCI Heart Disease Dataset (Cleveland), UCI Statlog (Heart), and UCI Heart Failure Clinical Dataset. The preprocessed data are used to train three classifiers (ANN, SVM, and DT), resulting in three HDP models: LOFS-ANN, LOFS-SVM, and LOFS-DT. The performance of the proposed models is then compared. The author concludes that ANN with LOFS performs best for heart disease prediction.

The author proposes to replicate the work in the future with larger clinical datasets to contribute more accurate heart disease predictors to the biomedical domain.

References

World Health Organization (WHO) (2017) Cardiovascular diseases (CVDs): Key Facts. http://www.who.int/news-room/fact-sheets/detail/cardiovascular-diseases-(cvds). Accessed 22 Mar 2022
Goyal S (2023) Software measurements with machine learning techniques-a review. Recent Adv Comput Sci Commun 16:1–17. https://dx.doi.org/10.2174/2666255815666220407101922
Safdar S, Zafar S, Zafar N et al (2018) Machine learning based decision support systems (DSS) for heart disease diagnosis: a review. Artif Intell Rev 50:597–623. https://doi.org/10.1007/s10462-017-9552-8
Goyal S (2022) FOFS: firefly optimization for feature selection to predict fault-prone software modules. In: Nanda P, Verma VK, Srivastava S, Gupta RK, Mazumdar AP (eds) Data engineering for smart systems. Lecture Notes in Networks and Systems, vol 238. Springer, Singapore. https://doi.org/10.1007/978-981-16-2641-8_46
Amin MS, Chiam YK, Varathan KD (2019) Identification of significant features and data mining techniques in predicting heart disease. Telemat Inform 36:82–93. https://doi.org/10.1016/j.tele.2018.11.007
Prakash S, Sangeetha K, Ramkumar N (2019) An optimal criterion feature selection method for prediction and effective analysis of heart disease. Cluster Comput 22(s5):11957–11963. https://doi.org/10.1007/s10586-017-1530-z
Gokulnath CB, Shantharajah SP (2019) An optimized feature selection based on genetic approach and support vector machine for heart disease. Cluster Comput 22(s6):14777–14787. https://doi.org/10.1007/s10586-018-2416-4
Darwish A (2018) Bio-inspired computing: algorithms review, deep analysis, and the scope of applications. Future Comput Inform J 3(2):231–246, ISSN 2314-7288. https://doi.org/10.1016/j.fcij.2018.06.001
Yazdani M, Jolai F (2016) Lion optimization algorithm (LOA): a nature-inspired metaheuristic algorithm. J Comput Design Eng 3(1):24–36, ISSN 2288-4300. https://doi.org/10.1016/j.jcde.2015.06.003
Haq AU, Li JP, Memon MH, Nazir S, Sun R (2018) A hybrid intelligent system framework for the prediction of heart disease using machine learning algorithms. Mobile Inform Syst
Bharti R, Khamparia A, Shabaz M, Dhiman G, Pande S, Singh P (2021) Prediction of heart disease using a combination of machine learning and deep learning. Comput Intell Neurosci
Benhar Charles V, Surendran D, SureshKumar A (2022) Heart disease data based privacy preservation using enhanced ElGamal and ResNet classifier. Biomed Signal Process Control 71(Part B):103185, ISSN 1746-8094. https://doi.org/10.1016/j.bspc.2021.103185
Fitriyani NL, Syafrudin M, Alfian G, Rhee J (2020) HDPM: an effective heart disease prediction model for a clinical decision support system. IEEE Access 8:133034–133050. https://doi.org/10.1109/ACCESS.2020.3010511
Goyal S (2022) Genetic evolution-based feature selection for software defect prediction using SVMs. J Circuits Syst Comput 31(11):2250161. https://doi.org/10.1142/S0218126622501614
Goyal S (2022) 3PcGE: 3-parent child-based genetic evolution for software defect prediction. Innovations Syst Softw Eng. https://doi.org/10.1007/s11334-021-00427-1
UCI Machine Learning Repository: Heart Disease Data Set.: Archive.ics.uci.edu. http://archive.ics.uci.edu/ml/datasets/Heart?
Goyal S (2021) Effective software defect prediction using support vector machines (SVMs). Int J Syst Assur Eng Manag. https://doi.org/10.1007/s13198-021-01326-1
Goyal S (2022) Static code metrics-based deep learning architecture for software fault prediction. Soft Comput pp 1–33. https://doi.org/10.1007/s00500-022-07365-5
Goyal S (2021) Predicting the defects using stacked ensemble learner with filtered dataset. Autom Softw Eng 28:14. https://doi.org/10.1007/s10515-021-00285-y
Goyal S (2021) Handling class-imbalance with KNN (neighbourhood) under-sampling for software defect prediction. Artif Intell Rev. https://doi.org/10.1007/s10462-021-10044-w

Author information

Authors and Affiliations

Manipal University Jaipur, Jaipur, Rajasthan, 303007, India
Somya Goyal

Corresponding author

Correspondence to Somya Goyal.

Editor information

Editors and Affiliations

University of Macau, Macau, Macao
Simon Fong
JIS University, Kolkata, India
Nilanjan Dey
Global Knowledge Research Foundation, Ahmedabad, India
Amit Joshi

About this paper

Cite this paper

Goyal, S. (2023). Predicting the Heart Disease Using Machine Learning Techniques. In: Fong, S., Dey, N., Joshi, A. (eds) ICT Analysis and Applications. Lecture Notes in Networks and Systems, vol 517. Springer, Singapore. https://doi.org/10.1007/978-981-19-5224-1_21

DOI: https://doi.org/10.1007/978-981-19-5224-1_21
Published: 06 November 2022
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-5223-4
Online ISBN: 978-981-19-5224-1
eBook Packages: Engineering, Engineering (R0)
