INTRODUCTION

According to the World Health Organization, cardiovascular disease (CVD) has long been the leading cause of death worldwide. In Russia and Kazakhstan, however, people die from diseases of the circulatory system almost twice as often as, for example, in European countries [1, 2]. The risk is highest for people with chronic heart failure, which most often develops as a result of arterial hypertension, coronary heart disease, rheumatic heart defects, and anemia of various origins.

Timely medical assistance is decisive for preserving patients' life and health and for reducing disability and mortality. Because drug treatment is long-term, often lifelong, and costly, more and more attention must be paid to the early primary prevention of these diseases. To reduce the risk of life-threatening arrhythmia, it is necessary to improve systems for the processing and diagnostic analysis of electrocardiogram (ECG) signals. A significant part of existing research recommends machine learning algorithms for predicting heart disease.

MATERIALS AND METHODS

In cardiodiagnostics, wide use is made of the QRS complex, which reflects the propagation of excitation through the ventricular myocardium (the so-called "depolarization" of the ventricles). The QRS complex consists of the Q, R, and S waves on an ECG (Fig. 1). Compared to traditional methods, intelligent systems based on machine learning can accurately predict some heart diseases at an early stage, even with complex input data [3]. To achieve effective recognition of ECG data, a convolutional neural network that encodes a single QRS complex and adds entropy features is presented in [4]. That study aimed to determine which combination of signal information gives the best result for the subsequent classification of cardiac signals. The analyzed information included the raw ECG signal, its entropy characteristics, and extracted QRS complexes. The methods used there for calculating entropy-based features and detecting R waves were limited in use by their high computational complexity.

Fig. 1. Scheme of depolarization of the heart ventricles on the ECG (structure of the QRS complex).

In cardiology, ultrasound investigations are used to diagnose heart disease associated with myocardial infarction. The research in [5] aims to develop methods for segmenting the left ventricle on ultrasound images in order to check myocardial movement during a heartbeat. The proposed method uses techniques such as active contours and convolutional neural networks. A hybrid approach that combines linear and nonlinear characteristics extracted from the ECG and heart rate variability (HRV) has been proposed in [6] for multiclass ECG classification based on a deep neural network. This method improves the efficiency of ECG diagnostics by combining optimized deep learning features with efficient aggregation of ECG features and HRV indicators based on dynamic chaos theory. Although the approach has been shown to be a promising tool for ECG classification, it should be developed further by examining large ECG data sets with many more patients in order to identify different classes of heart disease, including heart rhythm classes and myocardial infarction classes.

Paper [7] presents a method for predicting heart diseases based on support vector machines (SVM) supplemented with fuzzy fusion at the decision level. The research in [8] integrates cloud computing with machine learning by merging statistical data sets and applying fusion methods to ensure the accuracy and consistency of predictions. Real-time patient indicators were retrieved from the cloud repository and processed using a fuzzy model. The model used an artificial neural network, a decision tree, and a Bayesian method. In [9], polygenic risk scores, logistic regression, the naive Bayes model, random forests, support vector machines, and gradient boosting were compared as algorithms for classifying coronary heart disease (CHD). The resulting models were tested on an independent data set. Polygenic risk scores were found to be the most effective approach for classifying CHD.

An extended convolutional neural network with deep learning support was developed in [10] to predict cardiovascular disease and determine the level of health risk. The test results were compared with approaches such as the deep neural network, the recurrent neural network, and the neural network ensemble method. The system was implemented on the platform of the Internet of Medical Things and a medical decision support system. Paper [11] presents the HealthCloud system for monitoring the health status of patients using machine learning and cloud computing. That study aimed to integrate information from the various sources needed to describe heart disease in detail with an accurate prognosis. The presence of heart disease was determined using support vector machines, k-nearest neighbors, neural networks, and logistic regression. In [12], machine learning methods were used to construct predictive models of tachyarrhythmia after acute myocardial infarction. However, the authors noted that the machine learning approach needs further validation and optimization before clinical application.

The great importance and intensive development of machine learning methods for the study of heart disease is evidenced by the growing number of publications in this area [13–22]. This growth is associated not only with the relevance and high social significance of detecting cardiovascular diseases, but also with the development of machine learning methods themselves. Analysis of the literature showed that, although a well-developed mathematical apparatus and significant amounts of clinical data are available, there remains the scientific problem of determining the optimal parameters of machine learning methods and applying them to improve the quality of multicriteria analysis of the state of the human cardiovascular system, primary diagnosis, and timely prevention of heart disease.

We used raw data on 303 patients from the Hungarian Institute of Cardiology, the University Hospital of Zurich, the University Hospital of Basel, the Long Beach Medical Center, and the Cleveland Clinic for our research. The patient database has fourteen attributes. Information about the attributes of the data set is presented in Table 1. To demonstrate the data structure, Table 2 shows a part of the database used in this study.

Table 1. Description of data attributes
Table 2. Fragment of the data set for testing (by attribute numbers)

We used coding for categorical variables such as age, blood pressure, and cholesterol level, since these are independent discrete values that had been normalized beforehand. Some attributes, such as age and cholesterol level, were treated as fixed; others, such as chest pain, were treated as variable.
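The preprocessing steps mentioned above can be illustrated with a minimal sketch. It assumes min-max scaling for normalization and one-hot (indicator) coding for a categorical attribute; the text does not specify which schemes were actually used, and the function names and toy values below are hypothetical.

```python
def min_max_scale(values):
    """Rescale numeric values linearly to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column: avoid division by zero
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(values):
    """Encode a categorical column as 0/1 indicator vectors, one slot per category."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]


# Hypothetical toy records, not taken from the data set of this study.
age = [63, 37, 41, 56]
chest_pain_type = [3, 2, 1, 1]

age_scaled = min_max_scale(age)         # assumed normalization scheme
cp_encoded = one_hot(chest_pain_type)   # assumed coding scheme

print(age_scaled)
print(cp_encoded)
```

Scaling keeps attributes with large numeric ranges from dominating distance-based methods such as k-nearest neighbors, while indicator coding avoids imposing an artificial order on category labels.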

The aim of this study was to use the first thirteen attributes to predict the fourteenth, the presence of heart disease in a patient. To this end, various machine learning methods were applied and then compared in terms of their effectiveness.

RESULTS AND DISCUSSION

Fig. 2. Relationship between data attributes (explanations are in the text).

The correlation between each pair of attributes was analyzed. In Fig. 2, the plots on the main diagonal are histograms of each attribute broken down by the classification outcome (presence or absence of heart disease). The off-diagonal plots show the correlation between two different attributes, which are identified as follows:

• age;

• gender;

• chest pain type;

• resting blood pressure in mm Hg;

• serum cholesterol in mg/dl;

• fasting blood sugar > 120 mg/dl;

• resting electrocardiographic results (ECG at rest);

• maximum heart rate achieved (maximum heart rate);

• exercise induced angina;

• ST depression induced by exercise relative to rest;

• the slope of the peak exercise ST segment;

• number of major vessels colored by fluoroscopy;

• thalassemia;

• the presence or absence of heart disease.

It is seen from the correlation diagram in Fig. 2 that patients with and without heart disease differ significantly in the distribution of attributes such as chest pain type (cp); exercise-induced angina (exang); exercise-induced ST depression relative to rest (oldpeak); the slope of the peak exercise ST segment (slope); the number of major vessels colored by fluoroscopy (ca); and thalassemia (thal).

Thus, the most statistically significant attributes, those that carry information about the probability of heart disease, were identified. To display correlations between attributes, a color-coded matrix was constructed in which the color gradation corresponds to the degree of correlation (Fig. 3). It can be seen from this matrix that the slope of the peak exercise ST segment correlates positively with exercise-induced ST depression relative to rest; the correlation coefficient is 0.58. This means that as the value of the ST segment slope increases, ST depression tends to increase as well, and vice versa.

Fig. 3. Representation of correlation between attributes.

The heart disease target was found to have the highest positive correlation with the thalassemia indicator (0.52), followed by the number of major vessels colored by fluoroscopy, exercise-induced angina, ST depression, and chest pain.
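A correlation matrix of this kind can be computed from pairwise correlation coefficients. The sketch below assumes the Pearson coefficient (the text does not name the coefficient used); the column names follow the identifiers quoted above, and the toy values are hypothetical, so the printed number has no relation to the coefficients reported in this study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long numeric sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Return a nested dict with the coefficient for every pair of columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical toy columns (illustration only, not the study data).
toy = {
    "slope": [1, 2, 2, 3, 1, 2],
    "oldpeak": [0.0, 1.4, 0.8, 3.6, 0.6, 1.2],
    "target": [0, 1, 0, 1, 0, 1],
}

matrix = correlation_matrix(toy)
print(round(matrix["slope"]["oldpeak"], 2))
```

Each entry lies between -1 and 1, the diagonal equals 1, and the matrix is symmetric, which is why a color-coded display only needs one color scale.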

All data were analyzed using seven models: logistic regression, k-nearest neighbors, decision tree, support vector machine, naive Bayes classifier, random forest, and deep neural networks. To evaluate the machine learning methods, the preprocessed data were repeatedly divided at random into training and test subsets. Since the result may vary depending on the random initial values of this split, two series of experiments were carried out, with 20 and 15% of the data held out as the test set, respectively. The results of both series of experiments for all models are shown in Table 3.
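The random division described above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: the helper `split_indices` is hypothetical, and rounding the test set size to the nearest integer is one of several possible conventions.

```python
import random


def split_indices(n_samples, test_fraction, seed):
    """Shuffle the indices 0..n_samples-1 with a seeded generator and cut off the test part."""
    rng = random.Random(seed)          # same seed -> same split (reproducibility)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_test = round(n_samples * test_fraction)   # assumed rounding convention
    return indices[n_test:], indices[:n_test]   # (train, test)


N_PATIENTS = 303  # number of patients stated in the text

for fraction in (0.20, 0.15):
    train_idx, test_idx = split_indices(N_PATIENTS, fraction, seed=0)
    print(fraction, len(train_idx), len(test_idx))
```

Under this rounding convention, a 20% test share of 303 records corresponds to 61 test and 242 training records, and a 15% share to 45 and 258. Changing only the seed changes which patients fall into each subset, which is why the measured accuracy can vary between runs.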

Table 3. Results of two series of experiments

To analyze the results obtained, plots were constructed to compare the accuracy of the models (Fig. 4). The main parameters of all machine learning methods studied are given in Table 4.

Fig. 4. Comparison of the efficiency of machine learning models with different distributions between test and training samples.

Table 4. Parameters of supervised machine learning methods

The three-dimensional accuracy surfaces obtained as a result of a series of experiments are shown in Fig. 5. Using the developed procedural interactive program, a surface was plotted for each model, with the size of the test set as the independent variable X and the value of the argument of the pseudorandom generation function as the independent variable Y. The accuracy of the model is plotted along the Z axis.

Fig. 5. Accuracy surfaces for machine learning models: (a) logistic regression; (b) k-nearest neighbors; (c) decision tree; (d) support vector machine; (e) naive Bayes classifier; (f) random forest.

When constructing the three-dimensional surface for each type of model, test set sizes from 5 to 55% were generated along the X axis. Then, for each test set size, 250 machine learning models were generated from the data set with values of the initialization argument of the pseudorandom number generator from 0 to 250. This parameter uniquely determines the composition of the test and training sets and ensures reproducibility of the results. After the values of the independent variables were set, the two corresponding arrays were fed iteratively into each of the machine learning models, whose parameters were fixed within the given ranges of values.

As a result, 12 500 experiments were conducted for each type of machine learning model; in total, 75 000 computational experiments were performed with various parameters. Each resulting graph in Fig. 5 is an accuracy surface. The axes (Fig. 5) show the relative size of the test set, the value of the initializing argument of the pseudorandom number generator, and the accuracy of the model in arbitrary units. In the graphs, darker lines (peak values) correspond to a higher forecast accuracy.
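The structure of such a sweep can be sketched as a double loop over test set sizes and seed values. The following is a minimal illustration under several assumptions: the ranges are taken as half-open (5 to 54% in 1% steps, seeds 0 to 249) so that the counts match the 12 500 experiments per model stated above; the data are synthetic; and a simple nearest-mean classifier stands in for the six models of Fig. 5. All function names are hypothetical.

```python
import random


def make_toy_data(n_samples, seed=0):
    """Hypothetical two-class data: one noisy feature whose mean depends on the class."""
    rng = random.Random(seed)
    return [(rng.gauss(2.0 * (i % 2), 1.0), i % 2) for i in range(n_samples)]


def nearest_mean_accuracy(train, test):
    """Stand-in classifier: assign each test point to the class with the nearer mean."""
    means = {}
    for label in (0, 1):
        xs = [x for x, y in train if y == label]
        means[label] = sum(xs) / len(xs)
    hits = sum(
        (abs(x - means[1]) < abs(x - means[0])) == (y == 1) for x, y in test
    )
    return hits / len(test)


def accuracy_for(data, test_fraction, seed):
    """Accuracy of the stand-in model for one (test set size, seed) pair."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return nearest_mean_accuracy(shuffled[n_test:], shuffled[:n_test])


test_fractions = [p / 100 for p in range(5, 55)]  # X axis: 50 values (assumed range)
seeds = range(250)                                # Y axis: 250 values (assumed range)

data = make_toy_data(120)
# surface[i][j] is the accuracy (Z) for test_fractions[i] and seeds[j].
surface = [[accuracy_for(data, f, s) for s in seeds] for f in test_fractions]

n_experiments = len(surface) * len(surface[0])
print(n_experiments)      # 12500 experiments for one model
print(6 * n_experiments)  # 75000 experiments for six models
```

Storing the results as a two-dimensional grid makes it straightforward to plot accuracy over the two independent variables or to compare rows (test set sizes) and columns (seeds) directly.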

The points on the plots make it possible to trace the main patterns and the trends between them. It can be seen that the magnitude of the fluctuation in the accuracy of the forecasting models is related to the value of the argument of the pseudorandom generation function. At the same time, the accuracy of forecasts decreases as the value of this parameter increases. The model achieves relatively high accuracy (a local maximum) when the value of the argument of the pseudorandom generation function is in the range of 5–30 units. It was found that the accuracy of model prediction can be optimized by choosing the size of the test set and the pseudorandom generation parameter that controls the division into test and training sets. This result is useful for obtaining a stable method of splitting the data set and for debugging machine learning methods.

In some ranges of values, the accuracy of the models is relatively stable, while in others it fluctuates significantly. Such relatively high but unstable accuracy values are associated with model overfitting. Thus, the accuracy of each model is not constant.
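One simple way to quantify the stability discussed here is to summarize the spread of accuracy across seed values for a fixed test set size. The sketch below uses the standard deviation and the range as assumed stability measures (the text does not name a specific measure), and the accuracy values are hypothetical illustrations, not results of this study.

```python
import statistics


def stability(accuracies):
    """Summarize how much accuracy varies across seeds for one test set size."""
    return {
        "mean": statistics.mean(accuracies),
        "stdev": statistics.pstdev(accuracies),
        "range": max(accuracies) - min(accuracies),
    }


# Hypothetical accuracy values for two regions of the surface (illustration only).
stable_region = [0.82, 0.83, 0.82, 0.81, 0.83]
unstable_region = [0.95, 0.70, 0.90, 0.65, 0.92]

for name, values in (("stable", stable_region), ("unstable", unstable_region)):
    summary = stability(values)
    print(name, round(summary["mean"], 3), round(summary["stdev"], 3))
```

A region with a high peak accuracy but a large spread is a candidate for the unstable behavior described above, whereas a small spread indicates that the result does not hinge on a particular split.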

This is because, in normalizing the data, dividing them into test and training sets, and constructing supervised machine learning models, the prediction accuracy depends largely on the numerical parameters of the algorithms (Table 4).

To compare the models constructed for predicting heart disease, ROC analysis [23] was used, with the construction of an ROC diagram (Fig. 6). The quality of the machine learning models was assessed by the area under the ROC curve (AUC). The AUC values calculated for the models that participated in the series of experiments are shown in Table 3.

Fig. 6. ROC curves for the machine learning methods. The X axis shows the false positive rate (FPR); the Y axis shows the true positive rate (TPR).
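For reference, the ROC curve and the AUC can be computed from true labels and predicted scores as sketched below. This is a generic illustration, not the code used in this study: the function names are hypothetical, the area is obtained with the trapezoidal rule, and the four toy samples are invented for the example.

```python
def roc_points(labels, scores):
    """Return ROC points (FPR, TPR) obtained by lowering the decision threshold.

    labels are 0/1; both classes must be present, otherwise a rate is undefined.
    """
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Samples with the same score cross the threshold together.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under a piecewise-linear curve (trapezoidal rule)."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Toy example (illustration only): two negatives, two positives.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

curve = roc_points(labels, scores)
print(curve)
print(auc(curve))  # 0.75
```

An AUC of 0.5 corresponds to random guessing and 1.0 to perfect separation of the two classes, which is why the value serves as a single-number summary when comparing models.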

CONCLUSIONS

Given the intensive development of machine learning methods, cloud technologies, and neural networks and the high social significance of cardiovascular diseases, the relevance of this study is beyond doubt. The healthcare system requires the introduction of highly accurate and reliable methods for supporting medical decision-making, methods for collecting and consolidating big data, and methods for evaluating and validating predictive methods. At the same time, machine learning methods, including artificial neural networks, play an increasingly important role in the diagnosis of cardiovascular diseases, especially at an early stage, contributing to timely prevention and increasing the duration and quality of human life.

In the present study, a correlation analysis of medical parameters was carried out, and machine learning models for predicting the state of the human cardiovascular system were considered and analyzed. As a result, estimates of the effectiveness of each method were obtained. The presented methods have confirmed their effectiveness in solving practical problems of public health. A series of computational experiments showed that, for predicting heart disease, it is important not only to choose a particular machine learning method, but also to select its parameters, including the random characteristics of the testing algorithms.

When solving biomedical problems using machine learning methods, one should always take into account that only a doctor can make a diagnosis, and a decision support system based on machine learning algorithms can play only an advisory role. In this regard, the software and mathematical tools developed in this study can be recommended for practical use in the prevention of heart attacks and other heart diseases. It should be noted that the implementation of such systems in healthcare will have a positive impact on measures to preserve and improve the health of the population.