1 Introduction

Diabetes mellitus (DM), a global health crisis, is a group of metabolic disorders that result from dysregulation of blood glucose (BG), due either to the failure of the body to secrete insulin (type I diabetes mellitus, T1DM) or to the inability of the body to respond to insulin (type II diabetes mellitus, T2DM). The group also includes diabetes first recognized during pregnancy (gestational diabetes mellitus, GDM) and other specific types [69]. Patients may present with chronic hyperglycemia, manifesting as polydipsia, polyuria, and polyphagia.

Clinically, the mainstream diagnostic investigation for DM is venous plasma glucose measurement [28]. The 2-h oral glucose tolerance test (OGTT) remains the internationally accepted gold standard for DM diagnosis: venous plasma glucose levels are measured at fasting and at 1 h and 2 h after intake of a set amount of glucose (normally 50 g or 75 g).

The mainstream view of the pathophysiology of DM remains that genetic predisposition underlies its development: the genes that control beta-cell action, insulin secretion, insulin interaction with tissue cells, insulin receptor production, and insulin action inside the cells are altered or mutated [22]. T2DM patients become increasingly insensitive to the physiological effects of insulin, so more insulin is needed to induce cells to take up the same amount of glucose [29]. In T1DM patients, insulin production by beta cells is impaired. Eventually, for both T1DM and T2DM patients, pharmacological induction of insulin secretion or insulin absorption is no longer sufficient to maintain the euglycemic state, and external insulin supplementation becomes the sine qua non of diabetes management [39].

DM without proper management may lead to a variety of vascular and neural complications involving multiple organ systems in either the short or the long term. It is these multiple complications secondary to DM that place a heavy burden on patients, increasing medical costs and decreasing quality of life [22]. In this sense, regular community-based screening and prompt diagnosis of undiagnosed patients, sufficient patient education and support, continuous medical care, user-friendly continuous BG monitoring, and psychological counseling and social support are all required to prevent acute complications (e.g., ketoacidosis) and minimize the risk of long-term complications (e.g., nephropathy, retinopathy, diabetic foot, cardiovascular disease, or stroke) [14, 29].

Therefore, on the one hand, timely community screening for diabetes in undiagnosed patients could help prevent the development of diabetic complications, hence reducing disease burden and improving quality of life. On the other hand, for DM patients, BG monitoring is of vital significance. It is acknowledged that optimizing glycemic control by lowering BG levels and minimizing glucose variability could prevent the development of microvascular complications and long-term macrovascular disease [47, 54]. BG serves as the most important risk and prognostic factor in DM patients owing to its predictive value for disease progression. However, it is difficult to manage because of its multifactorial nature and its inter- and intra-personal variability associated with nutritional, behavioral, and pharmaceutical management, as shown in Fig. 1 [54].

Fig. 1 Systems and organs related to blood glucose level

Specifically, timely awareness of fluctuations in blood glucose levels is the foundation of diabetes management. Proper and timely blood glucose monitoring makes possible efficacious treatment, the identification of dysglycemia (especially undetected hypoglycemia), and the modification of treatment plans (including medical nutrition therapy, exercise therapy, and pharmaceutical interventions). Normally, the blood glucose level is checked before meals, 2 h after meals, and before sleep [1].

The emergence of self-monitoring of blood glucose (SMBG) has shaped diabetes management over the past decades, with euglycemia as the goal. Portable blood glucose meters have allowed patients and healthcare workers to obtain dynamic blood glucose data, yet their inconvenience in use may lead to incomplete BG data collection [1]. Although SMBG is available to many DM patients, the need for frequent testing and continuous replenishment of consumables has undermined patient compliance. With the development of technology, continuous glucose monitoring (CGM) has made it possible to track the pattern, frequency, level, and timing of BG fluctuations, and it has proven useful in raising hypoglycemia alarms. Nevertheless, CGM devices can be expensive and require continued capillary glucose testing for calibration. Despite the gradual transition from SMBG to more advanced glucose monitoring devices, some reluctance to monitor blood glucose has been noted, given the costs, complexity of use, and low awareness of its necessity.

Beyond traditional glucose monitoring, novel materials have also inspired new approaches. For instance, non-invasive and non-enzymatic sensing using advanced nanomaterials has gained popularity, although sufficient clinical evidence of its accuracy and stability in long-term glucose monitoring is still lacking [15, 56, 59]. Hence, accurately monitoring blood glucose while improving glycemic control and patients' quality of life is now one of the biggest challenges in DM management. The recent boom in BG level prediction has come with the explosion of interest in the Artificial Pancreas Project, a closed-loop control system for BG control [60]. A gross estimate of the number of academic papers concerning “blood glucose” and “machine learning” in the Google Scholar database is shown in Fig. 2.

Fig. 2 Number of articles in Google Scholar that include “blood glucose” and “machine learning”

Artificial intelligence (AI) is increasingly used in medicine to find patterns in complex sets of clinically collected and self-monitored data to improve health outcomes [38]. Among the many AI-based approaches, machine learning (ML) equips computers with the ability to learn without being explicitly programmed in advance [49]. ML algorithms add value to the expertise of clinicians, and combining the two is better than relying on either alone in disease treatment [11, 68], especially in DM management and the prevention of complications (Gadekallu et al. 2020; [51]).

The present chapter addresses the following: (1) the role of ML algorithms in DM management; (2) the differences between various ML algorithms; (3) insights into future ML applications.

2 The Role of ML Algorithms in DM Management

Specifically, ML contributes to DM management in three main aspects: (1) assisting precise BG level prediction; (2) detecting DM-associated complications and BG alarm events (BG anomalies); (3) establishing personalized decision support/education systems.

2.1 BG Levels Prediction

BG levels are variable and multifactorial: they are directly affected by insulin, physical activity, and dietary intake, and are influenced by numerous other factors. Owing to the dynamic nature of BG levels, some scholars have also conceptualized a physiological model that considers the daily events that influence BG levels, including insulin intake, food intake, exercise, sleep, and even seasonal variation [41].

A comprehensive understanding of the pathophysiology and biological mechanisms underlying DM development and progression is the foundation for incorporating physiological parameters into ML algorithms. Generally, a physiology-based approach divides the parameters related to BG regulation into three distinct categories, viz., BG dynamics, insulin dynamics, and meal absorption dynamics [40]. Two methods are generally used to incorporate physiology-based data, namely the lumped (semi-empirical) model and the comprehensive model. The former consists of only a few equations and parameters and treats all organs and tissues as a whole, whereas the latter models the various organs and tissues separately [6].
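To make the idea of a lumped model concrete, the sketch below integrates a two-equation glucose model of the kind often attributed to Bergman (the "minimal model"). The chapter does not name a specific model, so this choice is an assumption, and the parameter values `p1`, `p2`, and `p3` are illustrative placeholders rather than fitted or clinically validated values.

```python
def simulate_minimal_model(g0, gb, ib, insulin, p1=0.03, p2=0.02, p3=1e-5, dt=1.0):
    """Euler integration of a two-equation lumped glucose model.

    dG/dt = -(p1 + X) * G + p1 * Gb      (plasma glucose G, basal level Gb)
    dX/dt = -p2 * X + p3 * (I(t) - Ib)   (remote insulin action X, basal insulin Ib)

    `insulin` is a sequence of plasma insulin values, one per time step.
    All parameter values are hypothetical and chosen only for illustration.
    """
    g, x = g0, 0.0
    trace = [g]
    for i_t in insulin:
        dg = -(p1 + x) * g + p1 * gb
        dx = -p2 * x + p3 * (i_t - ib)
        g += dt * dg
        x += dt * dx
        trace.append(g)
    return trace


# With insulin held at its basal value, glucose relaxes towards the basal level.
basal = simulate_minimal_model(g0=200.0, gb=90.0, ib=10.0, insulin=[10.0] * 300)
# With insulin above basal, the same model settles at a lower glucose level.
dosed = simulate_minimal_model(g0=200.0, gb=90.0, ib=10.0, insulin=[60.0] * 300)
```

A comprehensive model would replace these two equations with separate compartments for individual organs and tissues, at the cost of many more parameters to identify.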

Moreover, the increasing popularity of mobile health applications, biosensors, wearables, and other devices for self-monitoring and healthcare management has made it possible to generate and collect automated, continuous health-related personal data to feed ML algorithms. Such data include body mass index, stress level, amount of sleeping time, underlying diseases, medication use, smoking habit, menstruation, alcoholism, allergies, and geological factors [62].

In contrast to the physiology-based approach, the data-driven approach uses self-collected data and other easily available parameters to predict BG. It is regarded as a black box: although it can sometimes achieve high accuracy, its results can be difficult to interpret because the algorithms lack biological and physiological theoretical support [62]. Data-driven models can be divided into three types, namely time series models, machine learning models, and hybrid models.
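As a minimal illustration of the time series type, the sketch below fits a first-order autoregressive model, AR(1), to a sequence of BG readings and forecasts a few steps ahead. AR(1) is chosen here only because it is the simplest time series model; the chapter does not prescribe it, and the function names are hypothetical.

```python
def fit_ar1(series):
    """Estimate the mean and lag-1 coefficient of an AR(1) model by least squares."""
    mean = sum(series) / len(series)
    centered = [v - mean for v in series]
    num = sum(a * b for a, b in zip(centered[1:], centered[:-1]))
    den = sum(a * a for a in centered[:-1])
    phi = num / den if den else 0.0  # a constant series carries no lag information
    return mean, phi


def forecast_ar1(series, steps, mean, phi):
    """Forecast `steps` values ahead: each deviation from the mean is scaled by phi."""
    deviation = series[-1] - mean
    forecasts = []
    for _ in range(steps):
        deviation *= phi
        forecasts.append(mean + deviation)
    return forecasts


readings = [90, 110] * 5  # toy, perfectly alternating readings in mg/dL
mean, phi = fit_ar1(readings)
next_two = forecast_ar1(readings, steps=2, mean=mean, phi=phi)
```

Real BG series are noisy and non-stationary, so practical time series models typically use higher orders and exogenous inputs such as insulin and meals.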

Specifically, for DM patients, BG prediction allows timely alarms that help avoid disease progression and the over- or under-regulation of BG levels that causes hypoglycemia or hyperglycemia. Sudharsan et al. have shown that robust ML models for hypoglycemia prediction in T2DM patients could effectively identify vulnerable patients who need hypoglycemia management [55, 64]. Oviedo et al. conducted a methodological review of prediction models for BG levels, risks, and events. They found that the algorithm setups and performance metrics reported so far focused mainly on closed-loop systems (an artificial pancreas) [40]. As reported by Woldaregay et al., feedforward neural networks remain the most used algorithms for BG level prediction (20%), followed by hybridization of the physiology-based model and machine learning techniques (19%), recurrent neural networks (18%), and support vector machines (SVMs) (11%) [62].
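The step from a BG forecast to an alarm can be sketched as a simple threshold check. The thresholds below (70 mg/dL and 180 mg/dL) and the 5-minute forecast step are assumptions made for illustration; they are not taken from the studies cited above, and real systems tune them per patient and device.

```python
HYPO_MGDL = 70.0    # assumed hypoglycemia alert threshold
HYPER_MGDL = 180.0  # assumed hyperglycemia alert threshold


def classify_bg(bg_mgdl):
    """Label a single (predicted) BG value."""
    if bg_mgdl < HYPO_MGDL:
        return "hypoglycemia"
    if bg_mgdl > HYPER_MGDL:
        return "hyperglycemia"
    return "in range"


def first_alarm(predictions, step_minutes=5):
    """Return (minutes_ahead, label) for the first out-of-range forecast, or None.

    predictions[i] is assumed to be the forecast (i + 1) * step_minutes ahead.
    """
    for i, bg in enumerate(predictions):
        label = classify_bg(bg)
        if label != "in range":
            return (i + 1) * step_minutes, label
    return None


alarm = first_alarm([110, 95, 82, 68, 60])  # forecast falling towards hypoglycemia
```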

2.2 Detection of DM-Associated Complications

An increasing number of ML models for managing DM-associated complications have been built and assessed. Studies have shown the efficacy of ML-assisted T2DM care programs in the community by identifying population-level effects and the patient sub-groups that benefit most [14, 66]. Makino et al. have demonstrated the use of machine learning (scikit-learn) to build a prediction model from 24 factors of interest for predicting the progression of diabetic kidney disease, achieving an accuracy of 71% [34].

A systematic review by Kavakiotis et al. summarized the efficacious role of ML and data mining techniques in diabetes screening and diagnosis and in the detection and management of complications [26]. Nevertheless, complications secondary to T1DM have scarcely been investigated using ML prediction models [26]. T2DM prediction in the community is beneficial for the early detection of T2DM in high-risk populations. It might robustly capture cases with early dysglycemia that present with no obvious clinical symptoms [13].

Pre-hospital screening is also an important application of ML algorithms. Haq et al. proposed a decision-tree-based filter method for selecting important features and incorporated two ensemble learning algorithms, AdaBoost and Random Forest, for feature selection. In identifying populations at risk of DM, the proposed algorithm reached a test accuracy of 99%, 99.8% with k-fold validation, and 99.9% with leave-one-subject-out (LOSO) validation [23].
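The two validation schemes mentioned above differ only in how samples are held out. The sketch below generates the train/test index splits for k-fold and for leave-one-subject-out validation; it is a generic illustration with hypothetical function names, not the procedure used by Haq et al.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs; every sample is tested exactly once."""
    folds = [list(range(start, n_samples, k)) for start in range(k)]
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(n_samples) if i not in held_out]
        yield train_idx, test_idx


def loso_indices(subject_ids):
    """Yield (subject, train_idx, test_idx); each subject is held out in turn."""
    for subject in sorted(set(subject_ids)):
        test_idx = [i for i, s in enumerate(subject_ids) if s == subject]
        train_idx = [i for i, s in enumerate(subject_ids) if s != subject]
        yield subject, train_idx, test_idx


five_fold = list(k_fold_indices(n_samples=10, k=5))
per_subject = list(loso_indices(["a", "a", "b", "c", "c", "c"]))
```

Holding out whole subjects keeps repeated measurements of one person from appearing in both the training and the test set. This sketch does not shuffle or stratify the samples, which a practical pipeline would usually do.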

DM risk classification is vital and challenging, as medical data are non-linear, non-normal, and complex [7, 35]. A variety of ML algorithms have been developed for the prediction and diagnosis of diabetes, most of them supervised, viz., decision tree (DT), random forest (RF), linear regression, logistic regression (LR), Gaussian process classification (GPC), naïve Bayes (NB), k-nearest neighbors (KNN), linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), and support vector machine (SVM), as well as neural networks such as the artificial neural network (ANN) and feedforward neural network (FFNN) [8, 35]. The efficacy of such algorithms has been evaluated and reported by various researchers, with the accuracy of DM prediction ranging from 70 to 99% [4, 5, 9, 21, 25, 27, 33, 42, 43, 48, 58, 61, 65].
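Because the figures above are reported as accuracy, it helps to recall how accuracy relates to sensitivity and specificity. The sketch below computes all three from binary labels (1 = DM, 0 = no DM); the toy labels are invented for illustration and do not come from any cited study.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity (recall on positives) and specificity for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


# A classifier that never predicts DM on an imbalanced toy sample.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [0] * 10
metrics = binary_metrics(y_true, y_pred)
```

In this toy case the accuracy is 0.8 even though no DM case is detected, which is why accuracy alone can be misleading when positive cases are rare.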

3 Different Machine Learning Algorithms

ML algorithms were established in the 1980s to reproduce human neural networks in silico. ML is generally composed of three key components: learning algorithms, computational power, and data [18]. As a subset of AI, ML models can be regarded as algorithms that can either self-learn or learn from preset parameters. The main objective is to identify effective variables and the underlying correlations [36, 38]. ML models are normally developed through the following steps: problem identification, goal setting, data collection and sorting, model building, validation, assessment of impact, deployment and monitoring, and future modification [12], as shown in Fig. 3.

Fig. 3 A graphical summary of the machine learning algorithm process

The ultimate aim of establishing machine-learning algorithms is to provide optimal personalized decision support for DM management, specifically by developing better closed-loop insulin delivery systems that take into account glycemic variability in DM patients [62].

The value of health-related data in expediting the development of precision medicine has been well underlined [37, 46, 50]. Therefore, biomarkers and pharmacogenetic parameters may also be incorporated into ML algorithms to predict management efficacy and patient responses [19, 31], the onset and progression of the disease, and BG levels [67].

Several factors considered during model design may influence the eventual clinical implementation, such as data type and size, model interpretability, and the use of a balancing model. Every type of ML algorithm has its limitations and may only work at full efficacy in specific circumstances. In a systematic review of ML models for community-based T2DM, ANN outperformed all its counterparts, closely followed by logistic regression, decision trees, and random forests [32]. No ML algorithm is universally acceptable or wins in every situation. Therefore, to generate relevant and robust results, the currently available ML frameworks should be tailored to the task at hand to further improve productivity and efficiency [18].

3.1 Artificial Neural Network (ANN)

An artificial neural network (ANN) is a computational model inspired by biological nervous systems. It comprises processing elements analogous to neurons, joined by axon-like connections whose strengths are called weights [44].

The topology of an ANN can be classified into two main types, namely feedforward networks and recurrent/feedback networks. The feedforward network is the most used: information can only be sent in one direction (forward), from an earlier layer to the next. In contrast, in a recurrent/feedback network, information can also be sent back to an earlier layer. Because its weights are learned from data, an ANN can adjust flexibly to the data when modeling and solving real-world problems.
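The forward-only flow of information can be sketched as a chain of layers, each computing weighted sums of its inputs followed by an activation. The layer sizes, weights, and inputs below are hypothetical values chosen for illustration; a trained network would learn its weights from data.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def identity(z):
    return z


def dense(inputs, weights, biases, activation):
    """One layer: each output neuron is activation(weighted sum of inputs + bias)."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Pass the inputs through the layers in order; nothing flows backwards."""
    activations = inputs
    for weights, biases, activation in layers:
        activations = dense(activations, weights, biases, activation)
    return activations


# Two inputs -> two hidden neurons (sigmoid) -> one linear output.
network = [
    ([[0.5, -0.2], [0.1, 0.3]], [0.0, 0.1], sigmoid),
    ([[1.0, -1.0]], [0.0], identity),
]
output = forward([1.0, 2.0], network)
```

A recurrent network would additionally feed some activations from one time step back in as inputs at the next, which is what lets it model sequences such as CGM traces.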

3.2 Support Vector Machines (SVM) and Gaussian Process Regression

Support vector machines (SVM), a family of supervised learning algorithms, have been widely used for various purposes, such as pattern identification and recognition, categorization or classification, regression, and prediction [10]. The use of SVM could minimize the errors incurred by empirical classification.

Among the many SVM algorithms, support vector regression (SVR) is the most widely used in BG level prediction and modeling. For instance, Reymann et al. have developed an SVR-based mobile platform with a radial basis function as the kernel to predict BG levels [45].
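Two ingredients define an SVR with a radial basis function kernel: the kernel itself, which measures similarity between feature vectors, and the epsilon-insensitive loss, which ignores errors smaller than a tolerance. The sketch below shows both, together with the form of the prediction function. The values of `gamma` and `epsilon`, the support vectors, and the coefficients are placeholders, not the settings of Reymann et al.

```python
import math


def rbf_kernel(x, y, gamma=0.1):
    """Radial basis function kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.5):
    """Errors within +/- epsilon cost nothing; larger errors grow linearly."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


def svr_predict(x, support_vectors, dual_coefs, bias, gamma=0.1):
    """Prediction of a fitted SVR: kernel-weighted sum over support vectors plus bias."""
    return sum(
        coef * rbf_kernel(sv, x, gamma)
        for sv, coef in zip(support_vectors, dual_coefs)
    ) + bias


# Hypothetical fitted model with a single support vector.
prediction = svr_predict([1.0, 2.0], support_vectors=[[1.0, 2.0]], dual_coefs=[2.0], bias=1.0)
```

Fitting the dual coefficients requires solving a constrained optimization problem, which is left to a library in practice.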

Gaussian process regression is a non-parametric method that can estimate the uncertainty of its predictions and learn noise and smoothness parameters from the input data [62]. For instance, Tomczak et al. have reported the feasibility of Gaussian process regression in BG level prediction using categorical inputs such as the type of measurement (e.g., insulin dose, meal intake, physical exercise, pre-prandial BG measurement, and others) [53].

3.3 Decision Tree and Random Forest

A decision tree (DT) uses a tree structure built from the input features to predict or classify the target outcome. The decision rules can be easily extracted, and hence the method has been generalized and extended to many kinds of application.

Random forests, also called random decision forests, serve as an ensemble learning approach for classification and regression applications. A random forest learns by constructing a multitude of decision trees and outputs the mode of the classes (classification) or the mean prediction (regression) of the individual trees; it also supports feature selection directly [24]. Two methods are generally used to measure variable importance, namely the Gini importance index and the permutation importance index [2].
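Gini importance is built from the Gini impurity of the labels at each tree node: a feature is credited with the impurity decrease of every split that uses it, accumulated over the forest. The sketch below computes the impurity and the decrease for one split on toy labels; the function names are hypothetical, and the accumulation over trees is omitted for brevity.

```python
from collections import Counter


def gini_impurity(labels):
    """1 - sum of squared class proportions; 0 means the node is pure."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def impurity_decrease(parent, left, right):
    """Parent impurity minus the size-weighted impurity of the two children."""
    n = len(parent)
    weighted_children = (
        len(left) / n * gini_impurity(left) + len(right) / n * gini_impurity(right)
    )
    return gini_impurity(parent) - weighted_children


# A split that separates the two classes perfectly removes all impurity.
decrease = impurity_decrease([0, 0, 1, 1], left=[0, 0], right=[1, 1])
```

Permutation importance takes a different route: it shuffles one feature at a time and records how much the model's score drops.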

For instance, Xiao et al. developed a kind of BG predictor using random forest and support vector regression to evaluate the improved performance gained using a mixed strategy to select an optimal feature pattern [63]. Moreover, Georga et al. predicted the BG levels using random forest regression in a multivariate and multidimensional dataset [17].

3.4 Logistic Regression

Logistic regression (LR) is generally used for classification rather than regression, so the dependent variable ought to be categorical. With its robustness and easy handling of non-linear data, logistic regression predicts the probability of a binary variable (the dummy output variable) based on one or more predictor variables [57].
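How logistic regression turns predictor variables into a probability can be sketched in a few lines: a weighted sum of the features is passed through the sigmoid function, and the weights are fitted by gradient descent on the log-loss. The toy data, learning rate, and number of epochs below are arbitrary illustrative choices, and the single centered feature is hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(features, weights, bias):
    """Probability of the positive class for one sample."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


def fit_logistic(X, y, learning_rate=0.1, epochs=2000):
    """Batch gradient descent on the mean log-loss."""
    weights, bias, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for xi, yi in zip(X, y):
            error = predict_proba(xi, weights, bias) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += error * xj
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


# One centered feature; larger values belong to the positive class.
X = [[-1.5], [-0.5], [0.5], [1.5]]
y = [0, 0, 1, 1]
weights, bias = fit_logistic(X, y)
probability = predict_proba([0.5], weights, bias)
```

A sample is assigned to the positive class when its probability exceeds a chosen threshold, commonly 0.5.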

4 An Example of the Application of ML Algorithms Predicting BG Levels in Pregnant Women with GDM in Resource-Limited Regions

Gestational diabetes mellitus (GDM) is glucose intolerance (hyperglycemia) with first onset or first recognition during pregnancy. Unmanaged GDM could lead to severe adverse outcomes for both mothers and offspring. Nevertheless, pregnant women living in low- and middle-income areas or countries may fail to undergo routine antenatal examinations, leading to missed diagnoses of GDM. Reluctance to go through the full course of the oral glucose tolerance test (OGTT) or the unavailability of sufficient testing kits may be to blame. To tackle these problems, an AI model that included 9 algorithms was trained using data collected from 12,304 pregnant women who underwent routine prenatal tests from November 2010 to October 2017 in the Obstetrics and Gynecology Department of the First Affiliated Hospital of Jinan University, Guangzhou, China.

The pregnant women’s age and fasting blood glucose level were chosen as the critical input parameters for model building. For validation, fivefold cross-validation was conducted on the internal dataset. The external validation dataset consisted of 1655 cases collected from the electronic database of the Prince of Wales Hospital, Hong Kong SAR.

Of the 9 ML algorithms built (SVM, RF, AdaBoost, kNN, NB, decision tree, LR, eXtreme gradient boosting, and gradient boosting decision tree), SVM performed best, with an accuracy of 88.7% in the external validation set.

Later, a mobile application was developed, and a prospective, multicenter study was conducted to test the clinical efficacy of the application, which incorporated the ML algorithms we developed, in GDM screening for pregnant women in resource-limited areas using only their fasting blood glucose value and age [52]. Although further experiments are needed, this study has provided direct evidence that ML algorithms could, on the one hand, identify undiagnosed patients with high accuracy and efficacy and, on the other hand, keep the cost extremely low. Hence, they can become an appropriate tool for use in the real world instead of merely an algorithm chasing high performance in silico.

5 Outlook

Considering the ML algorithms involved in DM management, several questions emerge: (1) who is using the algorithms; (2) what kinds of data are fed into the algorithms; and (3) how efficacious and interpretable are the models?

Owing to the “black box”-like low interpretability of ML algorithms, the promotion and wider application of ML have been questioned, even though the predictive performance is convincing and promising [20, 30]. Moreover, because of the multi-dimensionality of the input data, machine learning is only effective when large samples are used [3]; hence, the performance of models built in studies with small sample sizes may be underestimated. ML models without appropriate external validation also suffer from limited applicability and extendibility and lack clinical impact. Even if an ML model is suitable for clinical application, challenges exist in practice due to the complexity and variability of real-world scenarios.

Moreover, the user of the ML algorithms matters. Although some ML models achieve notable accuracy and clinical significance, the data needed to feed the model may not be easy to collect and use. Therefore, ML algorithm builders should consider the real-world situation and the future users of the algorithms instead of merely chasing accuracy. Concerning the future users, the parameters for model building could come both from the patients’ side (BG levels, insulin intake, calorie intake at each meal, exercise, and others) and from the clinician’s database (bio-physiological parameters, laboratory investigations, ancillary examinations, and others). It is also necessary to consider any relevant contextual information, such as intra- and inter-patient variability, lifestyle changes, environmental factors, the time of day (diurnal vs. nocturnal), and other relevant factors [62].

Therefore, it is not easy to achieve a universal model that predicts accurately and relies on data easily collected from the target population. Whether the high accuracy holds when the algorithms are extrapolated to a larger or a different population remains controversial. Lacking specific clinical evidence so far, ML algorithms still cannot replace routine diabetes screening and diagnosis, nor can they provide clinical suggestions for managing the DM-related complications they predict. In this sense, future studies should also value the interpretability and applicability of the ML algorithms developed. Assessment of the clinical efficacy and cost-effectiveness of ML algorithms in the clinic remains urgently needed.

6 Conclusion

Machine learning algorithms have been regarded as accurate, with lower operating cost and higher efficacy, in predicting potential diabetes in undiagnosed populations, profiling personalized BG dynamics, establishing personalized decision support systems, and raising BG alarm events in DM patients. However, real-world data on the clinical efficacy and cost-effectiveness of the various machine learning algorithms are still limited, and internationally accepted guidelines have not been established to estimate and quantify the potential lifestyle-relevant variables related to BG level.