Abstract
Coronary artery disease is a major cause of ill health all over the world and accounts for a huge number of deaths across the nation. Heart disease can be diagnosed using various invasive and non-invasive methods. One of the most effective methods for detecting coronary artery disease is coronary angiography, which is expensive, has side effects, and requires a high level of technical expertise. Improvements in technology and low-cost storage devices have made it easy to store huge amounts of data, and the health sector is no exception. Machine learning methods are being used to analyze the collected data because of their capability to predict diseases. In this work, machine learning methods are implemented in order to achieve low-cost, reproducible, non-invasive, rapid, and precise identification of heart disease. This paper adopts an ensemble method alongside multiple classifiers to construct and validate the model. For the experiments, the Z-Alizadeh Sani coronary artery disease dataset is used. The ensemble method of prediction outperforms the other disease prediction methods.
1 Introduction
Cardiovascular diseases are among the foremost causes of death and disability all over the country. Deaths due to these diseases are increasing day by day in every age group. They are diseases of the heart and blood vessels. Coronary artery disease is a category of cardiovascular disease in which plaque accumulates in the vessels of the heart, interrupts the flow of blood, causes pain, and can result in heart attack or even death [1,2,3,4]. The major causes of heart disease are an unhealthy lifestyle, lack of exercise, smoking, and unhealthy eating habits. Developing as well as developed countries spend a large share of their national budgets on detection and treatment of the disease. In spite of advances in medical science, accurate diagnosis and treatment of heart disease is still a challenging task because of the complexity of diagnosis and treatment methods, especially in resource-poor settings. Accurate diagnosis and treatment of patients is necessary in order to save lives and to reduce the risk of more severe disease [5,6,7,8,9]. Heart disease can be prevented by adopting a healthy lifestyle and by timely tracking and treatment of the disease. There are many well-known invasive and non-invasive modalities available for identification of the disease [10]. Non-invasive methods include techniques such as ECG, echocardiogram, and stress test. Sometimes the results of these modalities are inconclusive and take time to assess, so coronary angiography is a popular examination modality for disease identification. It is, however, invasive, painful, and expensive, and it requires a costly clinical setup [11,12,13,14].
Improvements in technology and low-cost storage devices have made it easy to store huge amounts of data, and the health sector is no exception. Machine learning methods are being widely used to analyze the collected data because of their capability to predict diseases. Researchers are seeking reproducible, fast, and computationally inexpensive methods for detection of heart disease. They are exploring machine learning methods such as support vector machine (SVM), K-nearest neighbor (KNN), artificial neural network (ANN), decision tree, logistic regression, and naive Bayes for identification of heart disease [15,16,17,18,19,20]. This paper focuses on various machine learning methodologies for identifying heart disease. For the experiments, the Cleveland heart disease dataset and the Z-Alizadeh Sani heart disease dataset, both available at the University of California Irvine (UCI) machine learning repository, are used.
Nowadays, the healthcare sector generates large amounts of data related to patients, diseases, clinical reports, physician notes, laboratory tests, and administration. Knowledge miners use the collected data to extract useful patterns with the help of advanced computational intelligence methods, which results in low-cost healthcare services with reduced error and improved diagnostic methods.
2 Framework for Intelligent Coronary Artery Disease Prediction
The benchmark coronary artery disease datasets are collected from the UCI machine learning repository, where they are available for research purposes. The Z-Alizadeh Sani coronary artery disease dataset contains 303 records and 53 attributes, such as Age, Weight, Length, Gender, Body mass index, Diabetes mellitus, Hypertension, Current smoker, Ex-smoker, Obesity, Airway disease, Thyroid, and Chest pain. The Cleveland data consists of 14 attributes and 303 instances. The features are Age, Gender, Chest pain type, Resting blood pressure on admission, Serum cholesterol, Fasting blood sugar, Resting ECG outcome, Max heart rate achieved, Exercise-induced angina, Old peak, Slope, Number of major vessels colored by fluoroscopy, Thal (normal, fixed defect, or reversible defect), and Outcome [21].
Data is preprocessed so that predictive modeling with classification methods can be applied. The disease identification model is evaluated using performance measures such as accuracy, error rate, AUC, and F-measure. The experimental results show that the ensemble-based model is a better approach with regard to reliability and predictivity of diagnosis. Table 1 shows the description of the Z-Alizadeh Sani heart disease dataset.
The CAD dataset collected from the UCI machine learning repository was preprocessed. Then, logistic regression, deep learning, decision tree, random forest, gradient boosted tree, and SVM learning algorithms were applied to identify the presence or absence of coronary artery disease. The performance of the learning algorithms is recorded using measures such as accuracy, error rate, AUC, and F-measure.
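The comparison described above can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the paper does not state which toolkit was used, so scikit-learn is an assumption, a small multilayer perceptron stands in for the "deep learning" model, and a synthetic dataset with 303 records and 53 numeric attributes stands in for the preprocessed CAD data. All hyperparameters and the 5-fold cross-validation protocol are likewise assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# ASSUMPTION: synthetic stand-in data (303 records, 53 numeric attributes,
# binary label). Replace X and y with the preprocessed real dataset.
X, y = make_classification(
    n_samples=303, n_features=53, n_informative=10, random_state=0
)

# The six classifier families named in the text. Scale-sensitive models are
# wrapped in a pipeline so scaling is fitted on the training folds only.
models = {
    "logistic regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "deep learning (MLP stand-in)": make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=2000, random_state=0),
    ),
    "decision tree": DecisionTreeClassifier(random_state=0),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "gradient boosted tree": GradientBoostingClassifier(random_state=0),
    "SVM": make_pipeline(StandardScaler(), SVC()),
}


def compare(models, X, y, folds=5):
    """Return mean cross-validated accuracy, error rate, AUC and F-measure."""
    results = {}
    for name, model in models.items():
        scores = cross_validate(
            model, X, y, cv=folds, scoring=["accuracy", "roc_auc", "f1"]
        )
        accuracy = scores["test_accuracy"].mean()
        results[name] = {
            "accuracy": accuracy,
            "error_rate": 1.0 - accuracy,
            "auc": scores["test_roc_auc"].mean(),
            "f_measure": scores["test_f1"].mean(),
        }
    return results


results = compare(models, X, y)
for name, m in results.items():
    print(f"{name:30s} acc={m['accuracy']:.2f} err={m['error_rate']:.2f} "
          f"auc={m['auc']:.2f} f={m['f_measure']:.2f}")
```

Because the data here is synthetic, the printed scores say nothing about the figures reported in this paper; the sketch only shows the shape of such an experiment.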
Accuracy is the percentage of patients who are correctly identified, counting both those who suffer from coronary artery disease and those who do not. Error rate is the percentage of patients who are misclassified: patients without coronary artery disease who are identified as positive for the disease, and patients with the disease who are identified as negative. Figure 1 shows the framework for the intelligent coronary artery disease prediction system.
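The definitions above can be made concrete with a small worked example. The sketch below computes accuracy, error rate, and F-measure from confusion-matrix counts; the function names and the toy labels are hypothetical, and the F-measure is assumed to be the usual harmonic mean of precision and recall, since the text does not spell out its formula.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1  # diseased patient identified as diseased
            else:
                fp += 1  # healthy patient identified as diseased
        else:
            if truth == positive:
                fn += 1  # diseased patient identified as healthy
            else:
                tn += 1  # healthy patient identified as healthy
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, error rate, precision, recall and F-measure."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_measure = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": (tp + tn) / total,
        "error_rate": (fp + fn) / total,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Toy labels (hypothetical): 1 = coronary artery disease, 0 = normal.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In this toy case one diseased patient is missed and one healthy patient is flagged, so 8 of 10 patients are classified correctly and the error rate is the complement of the accuracy.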
3 Result Analysis
Table 2 presents the results on the Z-Alizadeh Sani dataset in terms of accuracy, error rate, F-measure, and AUC. Logistic regression achieves a prediction accuracy of 80%, the deep learning-based model 78%, decision tree 69%, SVM 71%, and random forest 82%. The ensemble-based gradient boosted tree model achieves the highest prediction accuracy of 84%. Figure 2 presents the accuracy of the classification methods.
In terms of misclassification error rate, logistic regression achieves 20%, the deep learning-based model 22%, decision tree 31%, random forest 18%, and SVM 29%, while gradient boosted tree achieves the lowest error rate of 16%. Figure 3 presents the misclassification error rate of the classifiers. Figures 4 and 5 present the AUC and F-measure of the prediction models.
Table 3 shows the experimental results on the Cleveland heart disease dataset. Logistic regression and deep learning each achieve a prediction accuracy of 83%, decision tree 74%, random forest 77%, and SVM 74%, while gradient boosted tree (the ensemble-based method) achieves the highest prediction accuracy of 84%. Correspondingly, gradient boosted tree achieves the lowest error rate of 16%, and support vector machine has the highest error rate of 26%. Logistic regression and deep learning each have an error rate of 17%, and random forest 23% (Figs. 6, 7 and 8).
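For readers less familiar with AUC, it can be read as the probability that a randomly chosen patient with the disease receives a higher predicted score than a randomly chosen patient without it, with ties counted as one half. The sketch below computes it directly from that pairwise definition; the function name and the toy scores are hypothetical and are not taken from the experiments.

```python
def auc_from_scores(y_true, scores, positive=1):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical): 1 = disease present, scores are model outputs.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = auc_from_scores(y_true, scores)
print(round(auc, 3))
```

Here eight of the nine positive/negative pairs are ordered correctly; the only miss is the diseased patient scored 0.4 against the healthy patient scored 0.7. The pairwise loop is quadratic in the number of patients, which is acceptable for datasets of a few hundred records.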
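The accuracy and error-rate figures quoted in the text for both datasets can be cross-checked with a few lines of arithmetic, assuming that the error rate is simply 100% minus the accuracy. The sketch below uses only the percentages stated in the prose; the dictionary layout and function names are hypothetical, and `None` marks the one error rate that the text does not report.

```python
# (accuracy %, reported error rate %); None where the text gives no figure.
REPORTED = {
    "Z-Alizadeh Sani": {
        "logistic regression": (80, 20),
        "deep learning": (78, 22),
        "decision tree": (69, 31),
        "random forest": (82, 18),
        "gradient boosted tree": (84, 16),
        "SVM": (71, 29),
    },
    "Cleveland": {
        "logistic regression": (83, 17),
        "deep learning": (83, 17),
        "decision tree": (74, None),
        "random forest": (77, 23),
        "gradient boosted tree": (84, 16),
        "SVM": (74, 26),
    },
}


def implied_error(accuracy_pct):
    """Error rate implied by an accuracy, under error = 100 - accuracy."""
    return 100 - accuracy_pct


def check_dataset(rows):
    """Return the most accurate model and any accuracy/error mismatches."""
    mismatches = [
        name
        for name, (acc, err) in rows.items()
        if err is not None and err != implied_error(acc)
    ]
    best = max(rows, key=lambda name: rows[name][0])
    return best, mismatches


for dataset, rows in REPORTED.items():
    best, mismatches = check_dataset(rows)
    print(f"{dataset}: best = {best}, mismatches = {mismatches}")
```

Under that assumption every reported pair is consistent, and gradient boosted tree is the most accurate model on both datasets. The decision tree's implied error rate on the Cleveland data is 26%, the same as the SVM's.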
4 Conclusion
The experimental results on the Z-Alizadeh Sani and Cleveland datasets show that the ensemble-based method is preferable to the other coronary artery disease detection models considered. The proposed model can be used to reduce the cost of initial detection of coronary artery disease. The clinical parameters can be easily collected from hospitals, and results can be reproduced in a fast, accurate, scalable, and reliable manner. The model can serve as an adjunct tool in clinical settings.
References
Terzic A, Waldman S (2011) Chronic diseases: the emerging pandemic. Clin Transl Sci 4(3):225–226. https://doi.org/10.1111/j.1752-8062.2011.00295.x
Reddy KS, Shah B, Varghese C, Ramadoss A (2005) Responding to the threat of chronic diseases in India. The Lancet 366(9498):1744–1749
Verma L, Srivastava S, Negi PC (2018) An intelligent noninvasive model for coronary artery disease detection. Complex Intell Syst 4(1):11–18
Wong ND (2014) Epidemiological studies of CHD and the evolution of preventive cardiology. Nat Rev Cardiol 11(5):276–289
Steele AJ, Denaxas SC, Shah AD, Hemingway H, Luscombe NM (2018) Machine learning models in electronic health records can outperform conventional survival models for predicting patient mortality in coronary artery disease. PLoS ONE 13(8):e0202344
Shameer K, Johnson KW, Glicksberg BS, Dudley JT, Sengupta PP (2018) Machine learning in cardiovascular medicine: are we there yet? Heart 104(14):1156–1164
Verma L, Srivastava S (2016) A data mining model for coronary artery disease detection using non-invasive clinical parameters. Indian J Sci Technol 9(48):1–6
Tiwaskar SA, Gosavi R, Dubey R, Jadhav S, Iyer K (2018) Comparison of prediction models for heart failure risk: a clinical perspective. In: 2018 Fourth international conference on computing communication control and automation (ICCUBEA). IEEE, pp 1–6
Verma L, Srivastava S, Negi PC (2016) A hybrid data mining model to predict coronary artery disease cases using non-invasive clinical data. J Med Syst 40(7):1–7
Takahashi N et al (2019) Computerized identification of early ischemic changes in acute stroke in noncontrast CT using deep learning. In: Medical imaging 2019: computer-aided diagnosis, vol 10950. International Society for Optics and Photonics
Alizadehsani R, Hosseini MJ, Sani ZA, Ghandeharioun A, Boghrati R (2012) Diagnosis of coronary artery disease using cost-sensitive algorithms. In: 2012 IEEE 12th international conference on data mining workshops (ICDMW). IEEE, pp 9–16
Arafat S, Dohrmann M, Skubic M (2005) Classification of coronary artery disease stress ECGs using uncertainty modeling. In: 2005 ICSC congress on computational intelligence methods and applications. IEEE, 4 pp
Sandhu JK, Verma AK, Rana PS (2018) A data-driven framework for survivable wireless sensor networks. In: 2018 Eleventh international conference on contemporary computing (IC3). IEEE, pp 1–6
Ayatollahi H, Gholamhosseini L, Salehi M (2019) Predicting coronary artery disease: a comparison between two data mining algorithms. BMC Public Health 19(1):448
Dhanaseelan R, Sutha MJ (2018) Diagnosis of coronary artery disease using an efficient hash table based closed frequent itemsets mining. Med Biol Eng Comput 56(5):749–759
Alizadehsani R, Habibi J, Sani ZA, Mashayekhi H, Boghrati R, Ghandeharioun A, Bahadorian B (2012) Diagnosis of coronary artery disease using data mining based on lab data and echo features. J Med Bioeng 1(1)
Bouali H, Akaichi J (2014) Comparative study of different classification techniques: heart disease use case. In: 2014 13th International conference on machine learning and applications (ICMLA). IEEE, pp 482–486
Sandhu JK, Verma AK, Rana PS (2018) A novel framework for reliable network prediction of small-scale wireless sensor networks (SSWSNs). Fundamenta Informaticae 160(3):303–341
Acharya UR, Faust O, Sree V, Swapna G, Martis RJ, Kadri NA, Suri JS (2014) Linear and nonlinear analysis of normal and CAD-affected heart rate signals. Comput Methods Prog Biomed 113(1):55–68
Copyright information
© 2021 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Sapra, L., Sandhu, J.K., Goyal, N. (2021). Intelligent Method for Detection of Coronary Artery Disease with Ensemble Approach. In: Hura, G.S., Singh, A.K., Siong Hoe, L. (eds) Advances in Communication and Computational Technology. ICACCT 2019. Lecture Notes in Electrical Engineering, vol 668. Springer, Singapore. https://doi.org/10.1007/978-981-15-5341-7_78
DOI: https://doi.org/10.1007/978-981-15-5341-7_78
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-5340-0
Online ISBN: 978-981-15-5341-7
eBook Packages: Engineering, Engineering (R0)