
1 Introduction

Critical care patients, with or without a history of diabetes, commonly develop hyperglycemia (Blood Glucose (BG) > 11.1 mmol/L) [1, 2] during their stay in the Intensive Care Unit (ICU). This condition may occur when cortisol and counterregulatory hormone levels increase together with insulin resistance [3]. These patients, especially those with diabetes, are additionally vulnerable to multiple organ failure, sepsis, or infection and, at worst, to higher rates of mortality and morbidity [4]. The Coronavirus Disease (COVID-19) pandemic raised concerns because 42.5% of the critically ill patients who died had Diabetes Mellitus (DM) [5]. Lim et al. [5] showed that 9.8% of COVID-19 patients had DM as a comorbidity, and the study reported a mortality rate of 1.2%. According to the From the Desk of the Director General of Health Malaysia report, as of 27 July 2021, 1,044,071 Malaysians had tested positive for COVID-19, with 8,201 cumulative deaths (0.78%) [6]. A report [7] showed that, out of 125 mortality cases, 52% had DM. Moreover, COVID-19 has symptoms similar to those of Severe Acute Respiratory Syndrome (SARS) [8]. Additionally, according to the Malaysian Registry of Intensive Care quarterly report, 35.8% of patients admitted to the ICU suffered from multiple organ failure in the first 24 h of admission. Respiratory failure represented one-third of the single organ failures by patient, at 21.9%. In the first 24 h of admission, 15.7% of patients were reported to have Acute Respiratory Distress Syndrome (ARDS) [9]. A few studies have also demonstrated that respiratory failure is relatively common in acute patients with diabetes [10,11,12]. These studies point to an interrelationship and correlation between respiratory failure and diabetes. Logette et al. [13] presented evidence linking elevated BG with COVID-19 patients.
When the glycemic level of diabetic patients is uncontrolled, these patients are susceptible to various adverse outcomes such as multiple organ failure. Thus, if the BG level of critically ill diabetic patients could be predicted early from the daily medical data available in ICU charts to ease glycaemic control, the occurrence of respiratory failure in critically ill patients might be reduced.

Respiratory failure can be assessed with the P/F score, an oxygenation index based on the Partial Pressure of Oxygen (PaO2) and the Fraction of Inspired Oxygen (FiO2) [14]. One of the standard procedures in diagnosing diabetes is to measure insulin resistance through the euglycaemic clamp [15], but this method can be too time-consuming for ICU use [16]. Therefore, Insulin Sensitivity (SI) estimation is used in place of insulin resistance for glycemic control, as SI is the inverse of insulin resistance. Moreover, SI has been suggested as a biomarker in several glycemic control studies [17,18,19]. To achieve targeted BG performance, SI was used in stochastic targeted studies [20,21,22,23] for glycaemic control, predicting BG within the 5 to 95% range. Since SI has been used for glycaemic control, we hypothesized that it can predict BG performance by exploiting the medical data available in ICU charts. Additionally, P/F scores may give an early prediction for glycaemic control by stratifying BG time in the target bands using a classification technique.

A study [24] has shown that there might be an association between insulin resistance and lung dysfunction. Therefore, there is a need to examine whether the respiratory P/F score and SI can be used together as strong factors in predicting ICU glycemic control, especially for diabetic patients. Thus, the objective of this paper is to assess the classification of ICU patients’ BG performance into three BG bands using P/F score, SI, and diabetes mellitus (DM) status as predictors. To identify which model has the best accuracy, seven classifiers are compared: Decision Tree, k-Nearest Neighbor (kNN), Support Vector Machine (SVM), Naïve Bayes, Discriminant, Ensemble, and Neural Network. This preliminary study was done to identify which classifier models are more feasible for glycaemic level prediction when P/F scores and DM status are included in model-based glycaemic control.

2 Materials and Method

2.1 Study Population

Retrospective data from 76 patients (47 DM and 29 non-DM (NDM)), covering 10,704 h of length of stay, were collected from University Malaya Medical Centre (UMMC) in 2018. The dataset comprised the following factors: P/F score (in kPa), SI (continuous, in L/mU.min), diabetes mellitus (DM) status (binary, with 1 for DM and 0 for non-DM), and BG performance (three discrete classes). Instead of using a daily worst representative P/F score, as is usually done for the Sequential Organ Failure Assessment (SOFA) score to decide on respiratory failure, P/F scores were extracted directly from the clinical chart whenever they were available. When PaO2 and BG values were available without a matching FiO2 entry, FiO2 was assumed to be the same as its previous value. On average, patients had three recorded P/F scores daily. Using BG measurements and the nutrition and insulin provided, hourly SI (L/mU.min) was first fitted with the Intensive Care Insulin Nutrition Glucose (ICING) model [25] through the integral fitting method [26]. Instead of using a constant value to represent individual insulin resistance, SI was estimated from the fitting process to represent the patient’s hourly varying metabolism.
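As a minimal sketch of the P/F extraction described above, the Python snippet below carries the last recorded FiO2 forward whenever a PaO2 reading has no matching FiO2 entry, then computes P/F as PaO2 divided by FiO2. The function name, the record layout, and the sample values are illustrative assumptions and are not taken from the study's data or code.

```python
from typing import List, Optional, Tuple

def pf_scores(records: List[Tuple[float, Optional[float]]]) -> List[Optional[float]]:
    """Compute P/F = PaO2 / FiO2 for each chart record.

    Each record is (pao2, fio2), with fio2 given as a fraction or None when
    it was not charted. A missing FiO2 is assumed equal to the previous
    recorded value; if no earlier value exists, the score is None.
    """
    scores: List[Optional[float]] = []
    last_fio2: Optional[float] = None
    for pao2, fio2 in records:
        if fio2 is not None:
            last_fio2 = fio2        # remember the latest charted FiO2
        if last_fio2 is None:
            scores.append(None)     # nothing to carry forward yet
        else:
            scores.append(pao2 / last_fio2)
    return scores

# Hypothetical chart entries (PaO2, FiO2); the second entry lacks FiO2
chart = [(20.0, 0.5), (10.0, None), (15.0, 0.25)]
print(pf_scores(chart))
```

The score is returned in the same pressure unit as the PaO2 input, so no unit conversion is implied by this sketch.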

Patient demographics are shown in Table 1. Patients’ age, Acute Physiology and Chronic Health Evaluation (APACHE II) scores, SI, and P/F scores are presented as mean and standard deviation (±SD). The rank-sum test was used to determine the differences in the distributions of age (years), total ICU stay (hours), SI, and P/F score. A p-value of less than 0.05 is considered significantly different. The distributions of age and total hours of ICU stay show no significant difference. Among the patients with diabetes, 11 and 7 were female in the training and test datasets, respectively. In total, 1902 rows of input variables were divided randomly into training and testing data, with a ratio of 80:20. Three attribute variables, that is, SI, DM, and P/F score, were used as the predictors (also known as features), and BG performance with three classes was selected as the desired output to represent BG time in the target bands. The data were divided into two partitions, with 4269 data points (1423 rows × 3 input variables) used for training and 1437 (479 rows × 3 input variables) for testing. Five-fold cross-validation was used on the training data to validate the accuracy of the model during the learning process.
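The five-fold cross-validation step can be sketched in a few lines of Python. This is only an illustration of how 1423 training rows might be partitioned into five validation folds. The function name, the shuffling seed, and the interleaved fold assignment are assumptions, and the study itself used MATLAB rather than this code.

```python
import random
from typing import List, Tuple

def k_fold_indices(n_rows: int, k: int = 5, seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Return k (train_idx, validation_idx) pairs over shuffled row indices."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)        # seed value is arbitrary
    folds = [indices[i::k] for i in range(k)]   # k folds of near-equal size
    splits = []
    for i in range(k):
        validation = folds[i]
        # every row outside the held-out fold is used for fitting
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, validation))
    return splits

splits = k_fold_indices(1423)   # 1423 training rows, as stated in the text
for train_idx, val_idx in splits:
    print(len(train_idx), len(val_idx))
```

Each row appears in exactly one validation fold, so every training row is used once to check the model fitted on the other four folds.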

Table 1. Patient demographics for train and test data.

2.2 Classification Learners

The seven classifier models used in the study were supervised machine learning techniques [27]: Decision Tree (Model 1), Discriminant (Model 2), Naïve Bayes (Model 3), Support Vector Machine (Model 4), k-Nearest Neighbor (Model 5), Ensemble (Model 6), and Neural Network (Model 7). Classification is often used for prediction in medical diagnosis [28] and has been shown to be efficient in using small data for decision-making [29]. Haque et al. compared the performance of 8 machine learning algorithms for diabetic neuropathy diagnosis, and Random Forest outperformed all the other classifiers [30]. Meanwhile, Singh et al. [28] showed that, out of 8 classifiers, the SVM classifier had the best accuracy in detecting the relationship between diabetes and hypertension. These previous studies used various types of classifiers to classify patients by diabetes status. However, the focus of classification in this study is to observe BG performance with 7 types of classifiers using SI, P/F, and DM as input predictors in diabetic ICU patients. P/F was included as one of the features to represent the respiratory score of the patients. BG performance, as the desired output, was classified into three BG level target ranges:

  • Class 1: BG ≤ 7.7 mmol/L

  • Class 2: BG = 7.8 to 11.1 mmol/L

  • Class 3: BG ≥ 11.2 mmol/L
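The three bands above can be expressed as a small labelling function. The sketch below is illustrative only: the function name is hypothetical, and rounding BG to one decimal place is an assumption made so that every value falls into exactly one of the listed bands.

```python
def bg_class(bg_mmol_per_l: float) -> int:
    """Map a BG measurement (mmol/L) to Class 1, 2 or 3."""
    bg = round(bg_mmol_per_l, 1)   # assumption: BG is charted to one decimal
    if bg <= 7.7:
        return 1                   # Class 1: BG <= 7.7 mmol/L
    if bg <= 11.1:
        return 2                   # Class 2: BG = 7.8 to 11.1 mmol/L
    return 3                       # Class 3: BG >= 11.2 mmol/L

# Hypothetical measurements
for bg in (5.0, 7.8, 11.1, 11.2):
    print(bg, bg_class(bg))
```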

Figure 1 shows the framework for applying the different classifiers in MATLAB version 2021a.

Fig. 1. Framework for classification learner

Model accuracy was calculated using Eq. 1. The confusion matrix is used to illustrate the model with the highest performance. From the confusion matrix, the train and test results were classified as True Positive (TP), True Negative (TN), False Positive (FP), and False Negative (FN). To further test classifier robustness, sensitivity and specificity were calculated using Eqs. 2 and 3, respectively. The nearer the sensitivity is to 1 (that is, 100%), the more robust the classifier.

$$\text{Accuracy}=\frac{TP+TN}{TP+TN+FP+FN} \times 100\% \tag{1}$$

$$\text{Sensitivity}=\frac{TP}{TP+FN} \times 100\% \tag{2}$$

$$\text{Specificity}=\frac{TN}{TN+FP} \times 100\% \tag{3}$$
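Because three classes are involved, TP, TN, FP, and FN have to be counted per class in a one-vs-rest manner. The Python sketch below shows one way to do this from a square confusion matrix. The function name and the example matrix are hypothetical, overall accuracy is taken as the share of correctly classified rows (a multi-class counterpart of Eq. 1), and values are returned as fractions rather than percentages.

```python
from typing import Dict, List

def one_vs_rest_metrics(cm: List[List[int]]) -> Dict[str, object]:
    """Overall accuracy plus per-class sensitivity and specificity.

    cm[i][j] counts rows of true class i predicted as class j. Every class
    is assumed to occur at least once, so no denominator is zero.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[i][i] for i in range(n)) / total
    sensitivity, specificity = [], []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                         # true c, predicted other
        fp = sum(cm[r][c] for r in range(n)) - tp    # other, predicted c
        tn = total - tp - fn - fp
        sensitivity.append(tp / (tp + fn))           # Eq. 2 as a fraction
        specificity.append(tn / (tn + fp))           # Eq. 3 as a fraction
    return {"accuracy": accuracy,
            "sensitivity": sensitivity,
            "specificity": specificity}

# Hypothetical 3-class confusion matrix (rows: true class, columns: predicted)
example = [[8, 2, 0],
           [1, 6, 3],
           [1, 2, 7]]
print(one_vs_rest_metrics(example))
```

Multiplying each returned value by 100 gives the percentages used in Eqs. 1 to 3.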

3 Results and Discussion

The accuracies of the seven classifiers based on the input predictors for train data are shown in Table 2. The test data accuracy was examined only with the three predictors, as it is important to find out how all three predictors, especially P/F, can influence BG performance for critically ill DM patients. We hypothesized that P/F score, DM, and SI could be used in classifying BG performance. When the three predictors SI-P/F-DM were used, Models 3, 4 and 7 showed the highest training accuracies, at 55.4%, 53.5% and 54.1%, respectively. Meanwhile, when only the two predictors SI-P/F were used, the training accuracies of Models 3 and 7 decreased by 1.4% and 0.6%, respectively. With the SI-DM predictors, the accuracies of Models 5 and 7 increased from 53.0% to 55% and from 54.1% to 55.3%, respectively.

Table 2. Model accuracies.

At first, amongst all classifiers, the Kernel Naive Bayes (Model 3), Fine Gaussian SVM (Model 4), and Narrow Neural Network (Model 7) classifiers gave the highest performance during training. However, when the trained classifiers were tested, Models 2, 5 and 7 had the better results using three predictors, with accuracies of 53.2%, 54.1%, and 54.5%, respectively. Despite the low prediction performance during training, the results with all three predictors indicate that P/F is feasible to include in giving an early prediction of BG performance for patients with DM.

Figure 2 shows the sensitivity and specificity of the BG performance results for the training and test data with the k-Nearest Neighbour (Model 5) and Neural Network (Model 7) classifiers, to compare model robustness with three predictors. These two classifiers were tested further for robustness because their results were the two highest. Model 5 had a sensitivity of 0.62 and 0.54 for the training and test data, respectively, and a specificity of 0.66 and 0.76. Model 7 had a sensitivity of 0.65 and 0.53 and a specificity of 0.63 and 0.67 for the train and test data, respectively. Model 5 has a better ability to stratify BG performance, while Model 7 is more sensitive and robust in classifying the predictor.

Fig. 2. Sensitivity and specificity for training and test data of k-Nearest Neighbour (Model 5) and Neural Network (Model 7).

Figure 3 shows that the numbers of correctly predicted BG bands in Class 1, 2 and 3 in the training data for Model 5 are 24, 680 and 50, respectively, while the corresponding test results are 33, 208 and 18. For Model 7, 80, 582 and 108 of the predicted Class 1, 2 and 3 are true in the train data, and 44, 192 and 25 in the test data. Our findings show that k-Nearest Neighbor (k-NN) classified better for larger data such as Class 2. In comparison to k-NN, the Neural Network classifier had higher accuracy in classifying Class 1 and 3, meaning that the Neural Network is more feasible for data with small volumes. From the specificity values, the increments from train to test data are 10% for Model 5 and 4% for Model 7.

Fig. 3. Confusion matrix of the (a) training data (b) test data
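As a quick arithmetic check, the correctly classified counts quoted in the text can be divided by the partition sizes (1423 training rows and 479 test rows) to recover the overall accuracies of Models 5 and 7. The sketch below assumes that the quoted counts are the diagonal entries of the confusion matrices; the variable and function names are illustrative.

```python
# Correctly classified rows for Class 1, 2 and 3, as quoted in the text
correct = {
    ("Model 5", "train"): [24, 680, 50],
    ("Model 5", "test"): [33, 208, 18],
    ("Model 7", "train"): [80, 582, 108],
    ("Model 7", "test"): [44, 192, 25],
}
rows = {"train": 1423, "test": 479}   # partition sizes stated in Sect. 2.1

def accuracy_percent(model: str, partition: str) -> float:
    """Overall accuracy in percent, rounded to one decimal place."""
    return round(100 * sum(correct[(model, partition)]) / rows[partition], 1)

for model, partition in correct:
    print(model, partition, accuracy_percent(model, partition))
```

Under that assumption the computed values agree with the reported accuracies of 53.0% and 54.1% (training) and 54.1% and 54.5% (test) for Models 5 and 7, respectively.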

Based on the classification results, TP predictions for both data partitions need to be calculated in two stages, first for model accuracy and then for model sensitivity as a measure of robustness. This is because there are more than two classes to be classified. From the overall mean of the classifier models, there is an increment of 4% from train to test data, showing that the combination of SI, P/F and DM status has positive potential for classifying BG time in the target bands. Moreover, sensitivity and specificity were examined for Model 5 and Model 7, which had the highest test data accuracies. From train to test data, sensitivity changed by 8% (Model 5) and 12% (Model 7), while specificity increased by 10% and 4%, respectively; the latter suggests an improvement in classifier robustness. There are several possible contributors to the low prediction performance. First, a study by Razak et al. [31] suggested increasing the value of SI to a physiologically relevant one that represents the human biomarker. In our study, the SI value used was 0 most of the time, which is not acceptable. Thus, constraining SI to a physiologically relevant range of 1e−5 to 1e−3 may also help improve the classifier features. Second, PaO2 and FiO2 data were collected from ICU charts only when they were available. Thus, in this study, FiO2 values were assumed in order to complete the PaO2 data. However, since SI is the hourly varying parameter of the ICING model, the P/F score data could also be interpolated to hourly values in a future study to complement all the features used.

In summary, this study was done to classify the BG performance bands using SI, P/F, and DM. Because P/F scores can be used to predict the classes of BG, this input predictor can serve as a non-invasive and less time-consuming method to predict glycemic levels. In future research with these attributes, the P/F predictor could be partitioned beforehand according to the different SOFA score ranges, or SI could be partitioned by day of stay. Additionally, the data for this study were collected before the pandemic, and only 16 out of 76 patients had respiratory failure caused by pneumonia. In the future, it would be interesting to study how P/F influences the results in a sample that also includes COVID-19 patients, as COVID-19 is categorized as a severe acute respiratory syndrome.

4 Conclusion

This preliminary work showed that the k-Nearest Neighbor and Neural Network models are feasible for classifying BG performance. Although the models’ accuracies are only 54.1% and 54.5%, P/F scores can be used to make an early prediction of glycemic level through classification. Based on the test results using SI-P/F-DM as predictors, the accuracy increased for both models. Similarly, the model robustness tests using sensitivity and specificity showed total changes of 20% and 14%, respectively, across both models. These results suggest a possible relationship between P/F scores and diabetes status that may influence BG performance. In future, given a dataset from COVID-19 patients, early prediction of BG performance can be made with the trained models of these two classifiers.