1 Introduction

Diabetes mellitus (DM) is a growing global disease that strongly affects individual patients and also represents a global health burden with a financial impact on national health care systems. In 2013 approximately 382 million people were suffering from diabetes, and this number is estimated to reach 592 million by 2035. In addition, approximately 175 million people with diabetes are estimated to remain undiagnosed. In the U.S., the total estimated costs of diabetes were $174 billion for the year 2007 [1–3].

DM is a chronic illness of the metabolic system leading to high blood glucose levels. DM can be classified into two main clinical categories. Type 1 diabetes mellitus (T1DM) is caused by the loss of β-cells which are responsible for the storage and release of insulin and it mainly occurs in children, adolescents and young adults. In contrast, type 2 diabetes mellitus (T2DM) is determined by insulin resistance and develops due to a progressive insulin secretory defect, mostly in elderly people with overweight or obesity [4].

In both conditions continuous medical care is required to minimize the risk of acute (e.g. ketoacidosis) and long-term complications (e.g. diabetic foot syndrome, nephropathy, retinopathy, cardiovascular diseases or stroke) [5]. T1DM can only be treated with insulin, whereas a wide range of therapeutic options are available for patients with T2DM [4]. Adhering to therapy in chronic diseases like T2DM requires active participation and is often very burdensome for patients. Furthermore the effects of non-adherence are not immediately evident. Long-term complications like a diabetic foot syndrome or retinopathy take years to develop [6]. Diabetes therapy is complex and therapy decisions comprise various medical and life-style related information.

The availability of smart health technology [7] such as continuous glucose monitoring (CGM) [8], physical activity detection [9], location and movement data, image recognition for planned meals [10] and data from computerized diabetes diaries offers large data sets which can be used for therapy initialization or for further improving the therapy of an individual person with diabetes. The large amount of generated data shows the importance of knowledge discovery in data handling/processing for therapy personalization [11]. Computerized decision support systems (CDSS) aim to improve the treatment process in the hospital [12] as well as at home [13].

In this work we address the potential of CDSS in the personalization of diabetes therapy to support the therapy process in different health care sectors, as well as the role of machine learning. Moreover, we identify open problems and challenges for the personalization of diabetes therapy, focusing on CDSS and machine learning technology.

2 Glossary and Key Terms

Clinical Computerized Decision Support systems (CCDSS): ‘Clinical Decision Support systems link health observations with health knowledge to influence health choices by clinicians for improved health care’ - this definition has been proposed by Robert Hayward of the Centre for Health Evidence.

Computerized Physician Order Entry (CPOE) is a specialized sub-category of hospital electronic patient records for the management of physician orders. Such systems in general can offer reminders or prompts or even go further and perform calculations and offer decision support [14].

Diabetes Mellitus (DM) is a group of metabolic diseases in which high blood sugar levels occur over a prolonged period. DM is classified into two main clinical categories. Type 1 diabetes mellitus (T1DM) results from the body’s failure to produce enough insulin. This form was previously referred to as “insulin-dependent diabetes mellitus” (IDDM) or “juvenile diabetes”. The cause is unknown. In contrast, type 2 diabetes mellitus (T2DM) develops due to a progressive insulin secretory defect, mostly in elderly people with overweight or obesity [4, 6].

Diabetes Therapy: The success of a diabetes therapy depends on various factors. Regular measurement of the blood glucose level is the basic requirement for patients suffering from diabetes. The number of necessary measurements depends on the intensity of the therapy and the progression of the disease. In contrast to type 1 DM, which can only be treated with insulin, a wide range of therapeutic options is available for patients with type 2 DM. In the best case these are lifestyle changes with a change of diet and increased physical activity, but therapy options also include oral or injectable antidiabetic drugs and insulin administration. Furthermore, insulin therapy itself offers a wide variety of treatment options. These range from a once-daily injection of a basal insulin dose (least intensive insulin therapy) to basal-bolus insulin therapy, in which a basal insulin dose and several bolus insulin doses are administered every day (intensified insulin therapy).

Glycated Hemoglobin (HbA1c) is a laboratory parameter which serves as a biomarker for the average blood glucose level of a patient over the 2 to 3 months prior to the measurement. In specific situations it can also be used as a measure of compliance with diabetes therapy. In diabetes mellitus, higher amounts of glycated hemoglobin have been associated with an increased risk of microvascular complications (nephropathy, retinopathy) and, to a lesser extent, of macrovascular complications [6].

Glycemic Variability (GV) is the fluctuation of the blood glucose values and it is used as an indicator for the quality of diabetes management, as a high GV leads to increased risk of hypo- and hyperglycemic episodes.

Machine Learning (ML) is an algorithm-based and data-driven technique to automatically improve computer programs by learning from experience. A machine learning model is trained by estimating its unknown parameters from training sets. The literature distinguishes three main ML groups: supervised, unsupervised and reinforcement learning.

3 Personalization of Diabetes Therapy

Individualized glycemic management of diabetes patients using insulin or oral antidiabetics is only possible due to recent advances in diabetes therapy, which increased the therapy safety and efficacy. The development of new insulin analogs led to a more predictable behavior of the drugs’ blood glucose lowering effect [15, 16]. The first type of oral antidiabetic agents were developed in France in the 1940s [6]. Since then a multitude of new oral antidiabetic agents has been developed using different pharmacological and physiological strategies. Furthermore a paradigm shift happened in diabetes therapy over the past decades which led to patient empowerment and therapy personalization due to improved patient education.

The choice of therapy and potential personalization especially depends on the DM type. T1DM patients exclusively get insulin treatment. They either receive insulin via pump or by multiple daily injections. Here, personalization is possible by fine-tuning the parameters which drive the algorithms for the patient’s individual insulin dose calculation [17]. Patients with a high risk of developing T2DM (pre-diabetes) are treated by lifestyle changes (diet change and increase of physical activity). T2DM patients have a broader array of therapeutic choices. Early onset of T2DM is treated by lifestyle changes or oral antidiabetic agents. If an intensification of the diabetes therapy is necessary different strategies involving insulin are treatment options. Here, personalization is possible by setting different treatment goals for the different stages of intensification (stepwise approach) of the insulin therapy [4, 16]. Less intensive insulin therapies comprise fixed insulin doses once a day, either adjusted by the physician at the next routine appointment or by the patient according to a schema. More intensive insulin regimens require multiple insulin doses per day and the consideration of carbohydrate intake and correction insulin for blood glucose levels outside of a target range. Here, personalization is also possible by fine-tuning the parameters which drive the algorithms for the patient’s insulin dose. These algorithms are usually less complex than the ones used for T1DM and consequently they allow fewer options for personalization.

Recent guidelines recommend individualized diabetes therapy goals for people with DM [4]. In the current position statement for the management of T2DM the American Diabetes Association (ADA) and the European Association for the Study of Diabetes (EASD) placed great emphasis on patient-centered and personalized diabetes care [18]. Personalization of glycemic control targets is based on clinical parameters, including age, duration of DM, prevailing risk of hypoglycemia, presence of DM associated complications or co-morbidities and eco-system components [19]. In specific situations, the patient’s glycated hemoglobin (HbA1c) serves as a measure of adherence with diabetes therapy. It is a biomarker for average blood glucose levels over the 2 to 3 months prior to the measurement. In diabetes therapy, certain blood glucose target values and HbA1c targets are defined for the patient’s therapy. These targets are also determined by the choice of the patient’s therapy option. Insulin for example is very effective in lowering HbA1c but insulin administration also increases the risk of hypoglycemia [16].
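The relationship between HbA1c and average blood glucose can be made concrete with a short sketch. It assumes the linear regression reported by the ADAG study (estimated average glucose in mg/dl = 28.7 × HbA1c − 46.7), which is not part of the text above; the function names are illustrative only.

```python
def estimated_average_glucose_mgdl(hba1c_percent: float) -> float:
    """Estimated average glucose (mg/dl) for a given HbA1c (%).

    Assumption: the linear ADAG regression, eAG = 28.7 * HbA1c - 46.7.
    """
    if hba1c_percent <= 0:
        raise ValueError("HbA1c must be positive")
    return 28.7 * hba1c_percent - 46.7


def mgdl_to_mmoll(glucose_mgdl: float) -> float:
    """Convert glucose from mg/dl to mmol/l (about 18 mg/dl per mmol/l)."""
    return glucose_mgdl / 18.0


if __name__ == "__main__":
    for hba1c in (6.5, 7.0, 8.0):
        eag = estimated_average_glucose_mgdl(hba1c)
        print(f"HbA1c {hba1c:.1f} % -> {eag:.0f} mg/dl ({mgdl_to_mmoll(eag):.1f} mmol/l)")
```

Under this assumption an HbA1c target of 7 % corresponds to an average glucose of roughly 154 mg/dl, which illustrates why HbA1c targets and blood glucose targets are set together.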

Individual therapy goals are set to avoid co-morbidities caused by poor glycemic control. To avoid the deterioration of a retinopathy, a better glucose control which means achievement of lower blood glucose levels and HbA1c targets is recommended [20, 21].

Two other important factors in personalizing diabetes therapy are age and diabetes duration. Consequently, lower targets should be achieved in younger patients to reduce the long-term risk of DM associated complications. In contrast, therapy should aim for safer targets and achieving them more slowly in older patients [22].

The setting in which the therapy is performed also strongly influences the therapy targets. Patients in a nursing home setting have typically less stringent targets to avoid hypoglycemia and less frequent blood glucose monitoring compared to patients in intensive care units [23]. Even though the exact therapy goals for patients in intensive care units are discussed controversially, intensive insulin therapy to maintain blood glucose at lower targets reduces morbidity and mortality in critically ill patients [24, 25].

In this article we focus on personalization of diabetes treatment rather than on all strategies of Personalized Medicine for Diabetes (PMFD), because widespread adoption of this global approach will only occur when the identification of risk factors through genotype or through biomarkers is accompanied by an effective therapy [26]. PMFD uses information about the genetic makeup of a person with diabetes to customize strategies for preventing, detecting, treating and monitoring their diabetes.

The vast number of parameters for personalization makes diabetes management increasingly complex, and diabetes complications remain a great burden to individual patients and to society [27]. Therefore it is hypothesized that the quality of these medical decisions can be enhanced by personalized decision support tools that summarize patient clinical characteristics, treatment preferences and ancillary data at the point of care [28].

4 Towards Personalization Using Decision Support Systems

Diabetes therapy takes place in different health care sectors. Every sector has different goals for the patients’ diabetes therapy, as mentioned in the previous chapter. This results in specialized solutions for diabetes management available on the market, each specifically targeting a particular sector. Diabetes decision support systems are used in the following sectors:

  1. Patient self-management
    a. At home
    b. Primary care
    c. Outpatient care
  2. Institutional care
    a. Nursing homes
    b. Hospital
      i. Inpatient care
      ii. Intensive care

Decision support aiding health care professionals can primarily be found in institutional care, whereas decision support targeting decisions performed by patients can mostly be found in the patient self-management sector. DM patients outside of institutional care settings are on average younger, more independent and the focus of the therapy lies predominately on the diabetes disease. Patients in institutional care are primarily not admitted because of having DM, but for the complications associated with having DM (diabetic foot syndrome, nephropathy, retinopathy, cardiovascular diseases or stroke). DM is mostly regarded as concomitant disease and should therefore cause the least possible additional effort. Strategies for personalization of the diabetes therapy are therefore very different in the health care sectors. The following chapters summarize decision support systems and tools which facilitate a personalization of the diabetes therapy.

4.1 Diabetes Decision Support Applications for Self-Management

Medication support and therapy control: Self-management of the patient’s insulin therapy requires frequent measurement of blood glucose levels and adjustment of the patient’s medication. In insulin therapy, the calculation of the required insulin dose involves more or less complicated mathematical formulas. Therefore mathematical aids that model evidence-based protocols for insulin dosage [29], so-called Automated Bolus Calculators (ABC), have been developed and integrated into insulin pumps and glucose meters. A recent review summarized the current state of the art on ‘Glucose meters with built-in automated bolus calculator’ [30]. The authors concluded that ABC incorporated in glucose meters can be regarded as bringing real value to insulin-treated patients with diabetes. Software apps have so far not been recommended, as they are generally of poor quality [31]. ABC allow very detailed personalization of the insulin dosing decision support. Aside from blood glucose levels, ABC also consider carbohydrate intake and physical activity or health events to estimate insulin requirements. ‘Automated’ bolus calculation means that no manual bolus calculation is necessary. Identifying the correct parameters for personalization of the bolus calculation is a very individual and time-consuming process for every user [29].
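A minimal sketch of the kind of calculation an ABC performs is given below. It assumes the commonly used structure of a meal bolus plus a correction bolus minus the insulin still active from earlier doses; the parameter names, the rounding to half units and the example values are illustrative assumptions, not the protocol of any particular device, and physical activity or health events are not modelled.

```python
from dataclasses import dataclass


@dataclass
class BolusSettings:
    """Patient-specific parameters (hypothetical names) that personalize the calculation."""
    carb_ratio_g_per_unit: float      # grams of carbohydrate covered by 1 unit of insulin
    sensitivity_mgdl_per_unit: float  # expected glucose drop (mg/dl) per unit of insulin
    target_mgdl: float                # blood glucose target


def suggest_bolus(carbs_g: float, glucose_mgdl: float,
                  settings: BolusSettings, insulin_on_board: float = 0.0) -> float:
    """Suggest a bolus in units; the patient keeps the final decision (open loop)."""
    meal = carbs_g / settings.carb_ratio_g_per_unit
    correction = (glucose_mgdl - settings.target_mgdl) / settings.sensitivity_mgdl_per_unit
    dose = meal + correction - insulin_on_board
    # Never suggest a negative dose; round to half units (an assumption).
    return max(0.0, round(dose * 2) / 2)


if __name__ == "__main__":
    settings = BolusSettings(carb_ratio_g_per_unit=10.0,
                             sensitivity_mgdl_per_unit=40.0,
                             target_mgdl=120.0)
    print(suggest_bolus(carbs_g=60.0, glucose_mgdl=180.0,
                        settings=settings, insulin_on_board=1.0))
```

The three fields of `BolusSettings` are exactly the parameters whose identification the text describes as individual and time consuming.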

In the context of insulin-based diabetes therapy, a controller is an algorithm that controls the blood glucose values by titrating the amount of insulin. ABC are either rule-based or model-based open-loop diabetes control methods. Independent of the control method used, a system is categorized as open-loop when the patient retains the final decision [32].

Artificial pancreas systems are used for automated insulin injections. This type of diabetes control is characterized as closed-loop. Using these systems, model-predictive control algorithms are applied which use predictions of future glucose levels to estimate insulin requirement in insulin-pump therapy [33]. In these applications the input for the prediction models is continuous glucose monitoring data of T1DM patients.

Models of glucose dynamics for predictive purposes can mainly be divided into two categories: physiologically oriented models and data-driven methods. The latter approach can be further divided into time series analysis using autoregressive models and machine learning methodologies [34]. Physiological models for blood glucose estimation are very accurate for short-term predictions. They achieve a predictive capacity with a root mean square error (RMSE) of 3.6 mg/dl for a prediction horizon of 15 min [35]. The main advantages of these models compared to data-driven models are that there is no need to train them and that their output is physiologically explainable. The main disadvantage is that if a deviation cannot be explained by the input variables, no personalization of the algorithm is possible. Data-driven glucose prediction is a relatively new methodology compared to physiological glucose prediction. Similar to the development of the personal computer, these technologies advanced in the late 1990s [36]. The main advantages of these models are that they are adaptive (self-learning) and patient specific without the need to develop a physiological model. The main disadvantages are that the system depends on the quality of the training data (the garbage in, garbage out problem) and that the output of the system is not physiologically explainable.
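The data-driven, time series branch described above can be illustrated with a small sketch: a second-order autoregressive model fitted by least squares and evaluated with the RMSE at a fixed prediction horizon. The model order, the assumed 5-minute sampling interval and all function names are illustrative assumptions and do not reproduce the models of the cited studies.

```python
import math


def fit_ar2(series):
    """Least-squares fit of y[t] = a*y[t-1] + b*y[t-2] on mean-centred glucose values."""
    mean = sum(series) / len(series)
    y = [v - mean for v in series]
    s11 = s12 = s22 = r1 = r2 = 0.0
    for t in range(2, len(y)):
        s11 += y[t - 1] * y[t - 1]
        s12 += y[t - 1] * y[t - 2]
        s22 += y[t - 2] * y[t - 2]
        r1 += y[t] * y[t - 1]
        r2 += y[t] * y[t - 2]
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        raise ValueError("series is too short or degenerate")
    # Solve the 2x2 normal equations with Cramer's rule.
    a = (r1 * s22 - r2 * s12) / det
    b = (r2 * s11 - r1 * s12) / det
    return mean, a, b


def predict(history, model, steps):
    """Iterate the fitted model `steps` samples beyond the end of `history`."""
    mean, a, b = model
    prev2, prev1 = history[-2] - mean, history[-1] - mean
    for _ in range(steps):
        prev2, prev1 = prev1, a * prev1 + b * prev2
    return prev1 + mean


def rmse(predicted, observed):
    """Root mean square error between two equally long sequences."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed))
```

With CGM samples every 5 minutes, `predict(history, model, steps=3)` corresponds to a 15-minute horizon. Because the coefficients are estimated from the patient's own data, the model is patient specific, but, as noted above, its quality depends entirely on the training data.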

For artificial pancreas systems, relatively short prediction horizons and therefore comprehensive monitoring using CGM are needed to enable closed-loop diabetes control [37]. However, patients without CGM who are monitored less intensively could also benefit from the prediction of future blood glucose levels. In [38–40] the authors devised an engine that predicts the expected blood glucose level at the next meal and the pending risks of hypoglycemia. They performed a study on the safety and efficacy of using predicted data in dosing decision support for routine patient care. The prediction engine was used in patients who were referred to begin basal-bolus insulin therapy. HbA1c levels fell significantly from 9.7 ± 1.7 % (baseline) to 7.9 ± 1.2 % (end of study), and hypoglycemia dropped fourfold.

Decision support tools for physicians: The patient’s diabetes therapy is performed in close collaboration with primary care physicians and/or outpatient clinics. In [41] a computer application which helps primary care physicians in diabetes therapy decision making was developed and validated in a cluster-randomized clinical trial. The application was used to make decisions when starting, continuing or changing insulin and its dosage. The HbA1c in the intervention group was significantly reduced by the use of the decision support application (–0.69 %; p = 0.001). Electronic decision support tools for primary care physicians summarize information about the patients’ diabetes state, provide reminders of required diabetes care and support patient education [42]. In [66] a CDSS was designed to help outpatient clinicians manage glycemia in patients with T2DM. A rule-based expert system generates recommendations for changes in therapy and accompanying explanations. As mentioned earlier, T2DM is, in contrast to T1DM, a disease for which a variety of different treatment options exist. Therefore, the system considers 9 classes of medications and 69 regimens with combinations of up to 4 therapeutic agents. The program is integrated in a web-based system for diabetes case management and supports a method for uploading data from glucose meters via the telephone network. The system provides a report to the clinician regarding the overall quality of glycemic control and identifies problems, e.g., hyperglycemia, hypoglycemia, glycemic variability, and insufficient data.

Therapy aids and lifestyle support: Several tools are available to aid diabetes patients in the difficult task of estimating the correct personalized insulin requirement and in performing a meaningful personalized control of therapy.

Carbohydrate estimation: The success of the patient’s insulin therapy depends significantly on the correct estimation of how nutrition influences insulin requirements [43]. This relationship is used in insulin therapy and is called the Carbohydrate Factor. The factor is patient specific and may vary over the time of the day. Once accurate patient-specific factors have been developed for different times of the day, the correct estimation of the amount of carbohydrates in a meal represents another obstacle in insulin therapy. Many patients do not estimate carbohydrates accurately and commonly either over- or underestimate the carbohydrates in a given meal [44, 45]. Another source of inaccuracy in estimating the patient’s insulin requirement for meals based on carbohydrate counting is the composition of foods. Not only the amount of carbohydrates influences the physiological glycemic response but also how the meal is absorbed. For example, meals rich in fat need more time to be absorbed. Therefore these meals lead to prolonged hyperglycemia or to a risk of hypoglycemia if the insulin dose covering the expected blood glucose rise is administered all at once [46]. To address these problems, bolus calculators with nutrition data base software integrated into an insulin pump have been developed which are able to control the type of bolus [47]. For meals rich in fat, the bolus is administered using a wave profile, which delivers the insulin over a longer period of time than a single bolus.
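How a pump might spread a bolus over time for a meal rich in fat can be sketched as follows. The split into an immediate part and equal increments, the 30-minute step and the function name are assumptions chosen for illustration; actual pumps implement their own delivery profiles.

```python
def extended_bolus_schedule(total_units, immediate_fraction, duration_min, step_min=30):
    """Split a bolus into an immediate part and equal parts delivered over time.

    Returns a list of (minutes_after_meal, units) pairs. All parameters are
    hypothetical settings, not those of a specific device.
    """
    if not 0.0 <= immediate_fraction <= 1.0:
        raise ValueError("immediate_fraction must lie between 0 and 1")
    if duration_min <= 0 or step_min <= 0:
        raise ValueError("duration_min and step_min must be positive")
    immediate = total_units * immediate_fraction
    steps = max(1, int(duration_min // step_min))
    per_step = (total_units - immediate) / steps
    schedule = [(0, immediate)]
    schedule += [(i * step_min, per_step) for i in range(1, steps + 1)]
    return schedule


if __name__ == "__main__":
    # 8 units, half delivered immediately and the rest spread over two hours.
    for minute, units in extended_bolus_schedule(8.0, 0.5, 120):
        print(f"t+{minute:3d} min: {units:.2f} U")
```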

For easier estimation of the meals’ carbohydrate content, it has been proposed to implement nutrition data bases in food recognition systems. These systems use machine learning algorithms to categorize images of food [10, 48]. Therefore it is possible to identify the food by taking a picture of the meal using a smartphone. The systems are now able to detect food with an accuracy of up to 81 %. The final systems for diabetes therapy should include food segmentation such that images with multiple food types can also be addressed. Furthermore, to be eligible for diabetes therapy, the food volume should be estimated using multi-view reconstruction and the carbohydrate content should be calculated based on the computer vision results and nutrition data bases.

Activity recognition: The patient’s insulin requirement and therefore the blood glucose levels are strongly influenced by the amount of physical activity and the health status. In diabetes therapy, establishing health benefits from physical activity is primarily done on the basis of self-reported data, typically surveys asking patients to recall what physical activity they performed according to their diabetes treatment plan. This is usually performed in T2DM patients. In T1DM patients using bolus calculators, physical activity often plays a major role in insulin calculation. The extent to which the insulin dose has to be changed depends on the intensity and duration of physical activity and varies among patients [49]. Currently, this estimation process is very imprecise due to inaccurate reporting of physical activities. One solution to improve the accuracy of reporting could be automated activity recognition. Such systems consist of [50]:

  (1) A sensing module that continuously gathers information about activities using accelerometers, microphones, light sensors, heart rate sensors, etc.
  (2) A feature processing and selection module that processes the raw sensor data into features which characterize the activities.
  (3) A classification module that uses the features identified in the previous data processing step to infer which activity has been performed.
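The three modules listed above can be sketched in a few lines. The example assumes tri-axial accelerometer samples in units of g, two simple window features (mean and standard deviation of the acceleration magnitude) and a nearest-centroid classifier; these choices and all names are illustrative and are not the methods of the cited systems.

```python
import math
from statistics import mean, pstdev


def magnitude(sample):
    """Length of one (x, y, z) accelerometer sample from the sensing module."""
    return math.sqrt(sum(axis * axis for axis in sample))


def window_features(samples):
    """Feature module: mean and standard deviation of the magnitude in one window."""
    mags = [magnitude(s) for s in samples]
    return (mean(mags), pstdev(mags))


def train_centroids(labelled_windows):
    """Average the feature vectors of each activity label.

    labelled_windows: list of (label, samples) pairs.
    """
    grouped = {}
    for label, samples in labelled_windows:
        grouped.setdefault(label, []).append(window_features(samples))
    return {label: tuple(mean(column) for column in zip(*features))
            for label, features in grouped.items()}


def classify(samples, centroids):
    """Classification module: pick the label with the closest centroid."""
    features = window_features(samples)
    return min(centroids, key=lambda label: math.dist(features, centroids[label]))
```

In practice the feature set and the classifier would be far richer, but the separation into sensing, feature extraction and classification stays the same.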

Methods to predict activity-related energy expenditure have advanced from linear regression to innovative algorithms capable of determining physical activity types and the related metabolic costs. These novel techniques can measure the engagement in specific activity types [51]. Integrated into T2DM therapy, the therapy adherence to physical activity lifestyle interventions could be monitored. In T1DM, these new techniques could help to estimate the possibly required insulin reduction prior to sports using earlier recordings of similar intensive activities.

Activity recognition can also be implemented in a smart home-based health platform for behavior monitoring. In order to recognize activities being performed by smart home residents, machine learning algorithms could be used to classify sensor data streams. The smart home platform could be used to monitor the activity, diet, and exercise adherence of diabetes patients and evaluate the effects of alternative medicine and behavior regimens [52].

Lifestyle support/promotion: In T1DM patients, the loss of the insulin-producing beta cells of the islets of Langerhans in the pancreas causes the body to fail to produce insulin. T2DM is characterized by insulin resistance which, as the disease progresses, may be combined with a relatively reduced insulin secretion [6]. Therefore, the pathogenesis of T2DM, which is not a rapidly progressing disease, can be slowed by lifestyle interventions. Lifestyle intervention options are diets and/or an increase of physical activity, which are used to effectively manage patients in the pre-diabetes phase. Nevertheless, lifestyle management remains challenging for both patients and clinicians. To track lifestyle events, a variety of web- or mobile phone-based diabetes diaries are available. Petrella et al. developed a lifestyle support system which facilitates personalized, data-driven recommendations for people living with pre-diabetic and T2DM conditions [53]. The system suggests subtle lifestyle changes to improve overall blood glucose levels. To improve and support therapy adherence, a mobile phone app with a lifestyle diary for coaching the patient, based on multiple psychological theories of behavior change, has recently been developed. The user receives automatically generated messages with persuasive and personalized content [54]. Such systems can be used to reinforce the patient’s therapy adherence and to help patients better understand their diabetes.

Pattern recognition for optimization of insulin therapy: Diabetes therapy leads to an accumulation of data. Sources are glucose data from blood glucose meters or CGM devices, records of diabetes diaries and therapy plans in more or less structured forms and data from different kinds of therapy aids like bolus calculators. The sources of data are often complex and weakly structured resulting in massive amounts of unstructured information. The data interpretation by the physicians and the patients is often performed without or with only weak decision aids. Currently few products enable data analysis using state of the art technologies which could be found for example in predictive analytics.

In a state of the art article targeting emerging applications for intelligent diabetes management, machine learning classification of blood glucose plots was highlighted [55]. The authors address the identification of excessive glycemic variability (EGV). The focus of diabetes therapy is to mimic physiological blood glucose profiles as closely as possible. This means avoiding blood glucose levels that are too high or too low. To some extent, however, high and low blood glucose levels are physiologically normal, e.g. the blood glucose rise after meals. Both upward (postprandial) and downward (interprandial) acute fluctuations of glucose around a mean value activate oxidative stress. As a consequence, it is strongly suggested that a global antidiabetic strategy should aim to reduce HbA1c, pre- and post-prandial glucose, as well as glucose variability to a minimum [56]. To the best of our knowledge there exists neither a guideline-defined metric for classifying glycemic variability [57] nor a decision support system which aids in the detection of EGV [58]. Wiley et al. describe an automatic approach to detect EGV from CGM data [59]. For this purpose, two physicians independently built a knowledge data base from CGM data which was used for the training of machine learning algorithms for EGV detection. The best performing prediction model achieved an accuracy of 93.8 %. If a patient used CGM for the week preceding a routine appointment, the results of EGV predictions could inform clinical disease management and thereby suggest a personalization of the diabetes therapy approach.
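As the text notes, no guideline-defined metric for classifying glycemic variability exists. The sketch below therefore only computes a few common descriptive statistics that could serve as input features for a classifier of the kind described; the choice of statistics, the thresholds of 70 and 180 mg/dl and the function name are assumptions.

```python
from statistics import mean, pstdev


def glycemic_variability(glucose_mgdl, low=70.0, high=180.0):
    """Descriptive variability statistics for a sequence of glucose values (mg/dl)."""
    if not glucose_mgdl:
        raise ValueError("at least one glucose value is required")
    avg = mean(glucose_mgdl)
    sd = pstdev(glucose_mgdl)
    n = len(glucose_mgdl)
    return {
        "mean": avg,
        "sd": sd,
        "cv_percent": 100.0 * sd / avg,  # coefficient of variation
        "fraction_below": sum(v < low for v in glucose_mgdl) / n,
        "fraction_above": sum(v > high for v in glucose_mgdl) / n,
    }


if __name__ == "__main__":
    stable = [110, 125, 118, 130, 122, 115]
    variable = [60, 210, 95, 250, 70, 180]
    print(glycemic_variability(stable)["cv_percent"])
    print(glycemic_variability(variable)["cv_percent"])
```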

Pattern recognition can be used to meaningfully identify blood glucose patterns, highlighting potential opportunities for improving glycemic control in patients who self-adjust their insulin [60]. Skrøvseth et al. conducted a study to identify how self-gathered data can help users to improve their blood glucose management [61]. The participants were equipped with a mobile phone application, recording blood glucose, insulin, dietary information, physical activity and disease symptoms in a minimally intrusive way. Data-driven feedback to the user in form of graphic representation of results from scale-space trends and pattern recognition methods may help patients to gain deeper insight into their disease. Blood glucose pattern analysis can also be found in ABC.
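A very simple form of such pattern analysis is to look for low readings that recur at the same time of day. The sketch below assumes self-measured readings given as (day, hour, glucose) tuples; the 4-hour blocks, the 70 mg/dl threshold and the requirement of at least two affected days are illustrative assumptions and do not reproduce the scale-space methods of the cited study.

```python
from collections import defaultdict


def recurring_lows(readings, low_mgdl=70.0, block_hours=4, min_days=2):
    """Return time-of-day blocks (start_hour, end_hour) with lows on several days.

    readings: iterable of (day, hour_of_day, glucose_mgdl) tuples.
    """
    days_with_low = defaultdict(set)
    for day, hour, glucose in readings:
        if glucose < low_mgdl:
            block = (hour % 24) // block_hours
            days_with_low[block].add(day)
    return sorted((block * block_hours, (block + 1) * block_hours)
                  for block, days in days_with_low.items()
                  if len(days) >= min_days)


if __name__ == "__main__":
    diary = [(1, 3, 62), (2, 2, 65), (2, 13, 190), (3, 14, 66), (3, 3, 110)]
    print(recurring_lows(diary))  # blocks with lows on at least two different days
```

A flagged block such as the early morning hours could then be presented to the patient or the physician as a candidate for adjusting the basal insulin dose.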

Long-term disease management: During the last decades, research in medicine has given increasing attention to the study of risk factors for diabetes complications. A practical application of risk factor studies is the development of risk assessment models (UKPDS model [62], Framingham model [63]). These models are able to provide a prediction, based on patient characteristics, of the patient’s risk to develop diabetes associated complications [64].

In care management, which is facilitated from a payer perspective by health insurance companies, patients receive personalized care according to risk stratification. Stratification focuses on whether patients are ill enough to require ongoing support from a care manager. Patients with less serious chronic conditions warrant more intensive interventions to prevent their conditions from worsening. Fairly healthy patients just need preventive care and education [65].

Predictive risk modelling enables the prognosis of future high-risk and/or high-cost patients among patients with a chronic disease like T2DM. The models use a combination of factors, such as demographics, clinical parameters, lifestyle factors, family history of diabetes and metabolic traits [66]. Several machine learning techniques have been applied in clinical settings to predict disease progression and have shown higher accuracy for diagnosis than conventional methods [67]. Risk models have been integrated in guidelines and are increasingly advocated as tools to assist risk stratification and guide prevention and treatment decisions in diabetes care [68, 69]. It is hypothesized that with prior knowledge of disease risk, the incidence of T2DM could be reduced considerably by implementing preventive measures in high-risk patients [4].
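The combination of a risk model with risk stratification can be sketched with a logistic model, used here only as a simple stand-in for the machine learning techniques mentioned above. The feature names, weights, intercept and cut points are hypothetical values chosen for illustration and are not taken from the UKPDS, Framingham or any other published model.

```python
import math


def logistic_risk(features, weights, intercept):
    """Probability-like risk score from a weighted sum of patient features."""
    z = intercept + sum(weights[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


def stratify(risk, cut_points=(0.1, 0.3)):
    """Map a risk score to one of three care tiers (hypothetical cut points)."""
    low_cut, high_cut = cut_points
    if risk < low_cut:
        return "low"      # preventive care and education
    if risk < high_cut:
        return "medium"   # more intensive interventions
    return "high"         # ongoing support from a care manager


if __name__ == "__main__":
    # Hypothetical weights, for illustration only.
    weights = {"age_decades": 0.3, "bmi_over_25": 0.1, "family_history": 0.6}
    patient = {"age_decades": 6.0, "bmi_over_25": 5.0, "family_history": 1.0}
    risk = logistic_risk(patient, weights, intercept=-4.0)
    print(f"{risk:.2f}", stratify(risk))
```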

4.2 Diabetes Decision Support Applications for Institutional Care

Systems used in hospitals for the management of diabetes care are very generic and are designed to operate safely for the majority of patients. Currently, personalization according to patient characteristics plays a secondary role due to two factors: (1) A short length of stay does not allow the empiric development of patient-specific factors which are crucial for the personalization of diabetes therapy. (2) Rigid hospital workflows and the excessive workload of clinical personnel often prohibit the implementation of individualized diabetes therapies. Nonetheless, despite these restrictions personalization is possible to some extent. Clinical computerized decision support systems (CCDSS) often model evidence-based guidelines which facilitate personalization of the estimation of medication requirements according to laboratory and demographic parameters [70–73].

Medication and workflow support: Computerized physician order entry (CPOE) systems are a specialized sub-category of hospital electronic patient records for the management of physician orders. Among many other things, they can be configured to support glucose management. Such systems can generally offer reminders or prompts, or go even further and perform calculations and offer decision support [14].

A recent review of the impact of CCDSS on healthcare practitioner performance and patient outcomes found significant evidence that CCDSS can positively affect healthcare providers’ performance with drug ordering and preventive care reminders [74]. Furthermore, a recent diabetes guideline emphasizes the use of CCDSS and CPOE for insulin dosing [75]. This is a particularly important field of decision support because the handling of insulin in diabetes patients is prone to error. In a recent audit of the quality of inpatient diabetes care, 36.7 % of the patients experienced at least one diabetes medication error during their hospital stay [76]. A recent review estimated that adopting CPOE systems in hospitals, even without decision support functions, leads to a 12.5 % reduction in medication errors [77]. A Cochrane Review assessed whether computerized advice on drug dosage has beneficial effects on patient outcomes compared with routine care. It concluded that computerized advice on drug dosage (oral anticoagulants and insulin) keeps a physiological parameter in the desired range more often and tends to reduce the length of hospital stay compared with routine care. In addition, comparable or better cost-effectiveness ratios were achieved with computerized advice on drug dosage [78].

Diabetes medication CCDSS in the hospital range from administering and managing oral antidiabetic agents in non-critically ill patients to adjusting insulin infusion in critically ill patients. Insulin infusion in intensive care units is performed according to paper-based, nurse-directed insulin nomograms that adjust the rate of insulin infusion according to the current rate of infusion and the blood glucose reading. These nomograms usually do not take patient-specific blood glucose trends into consideration, and patients may oscillate between hypoglycemia and hyperglycemia [79]. A CCDSS with a computerized insulin infusion algorithm that also takes the patient’s sensitivity to insulin into account was used to safely achieve near normoglycemia in hospital inpatients. Additionally, the incidence of hypoglycemia was lower than in initial studies [80].

Whether clinical staff accept a CCDSS or CPOE system depends greatly on how well it is integrated into existing workflows [81, 82]. Decision support should be provided automatically as part of the clinicians’ workflow. Overall, the use of CCDSS and CPOE systems leads to a standardization of processes in clinical workflows.

Recently, a survey was performed to map the current state of implementation of CPOE and CCDSS in Switzerland. According to this survey, the introduction of CPOE in Swiss healthcare facilities is increasing. The types of CCDSS currently in service usually include only basic decision support related to the drug, the co-medication or the setting, and only rarely take patient characteristics into account [83]. Future decision support tools must be designed to account for both clinical and patient characteristics [28].

5 Decision Support Using Machine Learning Technology

5.1 A Glimpse into Machine Learning Methods for Health Care

Advances in medical signal, image and text acquisition have led to an extensive increase in available patient-related medical data. These amounts of data make it difficult for health care professionals or patients to reach a timely treatment decision [84]. CDSS support the medical decision making process in diagnostics, therapeutics and prognostics in the main medical disciplines [74]. Typical CDSS applications can be found, for example, in radiology, emergency medicine and intensive care, cardiovascular medicine, internal medicine or oncology [85–91].

Machine learning is an important underlying technology in many CDSS applications. For example, radiology-based CDSS usually apply pattern recognition techniques based on machine learning to detect abnormal findings [92–94]. ECG signal processing in cardiology is another promising application of machine learning in medical decision support [88, 95].

Machine learning is concerned with the question of how computer programs automatically improve with experience [96]. Witten et al. [97] proposed: “Things learn when they change their behavior in a way that makes them perform better in future.” In practice, machine learning algorithms are trained by estimating unknown parameters from training sets.

Duda et al. [98] distinguish between supervised, unsupervised and reinforcement learning. In supervised learning (classification), category labels are manually assigned to each pattern by human experts. The labelled set is divided into a training set and a test set. The algorithm learns from the training set, which means that discriminating features of the patterns are identified. The test set is used to evaluate classification quality. High accuracy means that the features maximize the difference between patterns of different categories and underline the similarity of patterns in the same category. Typical supervised machine learning models are, for example, Support Vector Machines (SVM), k-Nearest Neighbors (k-NN), Decision Trees, Naïve Bayes, Random Forests and Neural Networks. Unsupervised learning (clustering) is important if no human expert could or should label patterns. Unsupervised learning models build clusters based on the features of patterns. K-means, hierarchical clustering and expectation-maximization are typical algorithms for solving clustering problems. Reinforcement learning follows a feedback mechanism: feedback is given on whether a category is correct or incorrect. Based on this feedback, the algorithm should ‘take new paths’ and consequently improve with experience.
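The supervised workflow described above (labelled patterns, a training set, a test set and an accuracy estimate) can be sketched with a small k-NN classifier. This is a minimal illustration only: the two-dimensional synthetic patterns, the labels "A" and "B", the 70/30 split and the function names knn_predict and accuracy are assumptions made for this example and are not taken from any of the cited studies.

```python
import math
import random


def knn_predict(train, query, k=3):
    """Return the majority label among the k training patterns closest to query."""
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    labels = [label for _, label in neighbours]
    return max(set(labels), key=labels.count)


def accuracy(train, test, k=3):
    """Fraction of test patterns whose predicted label matches the expert label."""
    hits = sum(knn_predict(train, features, k) == label for features, label in test)
    return hits / len(test)


# Synthetic, hypothetical patterns: two well separated classes in a 2-D feature space.
rng = random.Random(0)
data = [((rng.gauss(0, 1), rng.gauss(0, 1)), "A") for _ in range(50)]
data += [((rng.gauss(5, 1), rng.gauss(5, 1)), "B") for _ in range(50)]
rng.shuffle(data)

# Divide the labelled set into a training set and a test set (70/30, an assumption).
train, test = data[:70], data[70:]
print("test accuracy:", accuracy(train, test))
```

The test set is never used to choose neighbours, so the reported accuracy estimates how well the classifier generalizes to unseen patterns.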

In the following section, typical applications of machine learning in the field of diabetes therapy are presented.

5.2 Application of Machine Learning for Diabetes Therapy

Diabetes therapy depends on medical, demographic and lifestyle-related parameters. These parameters include diabetes type, age, weight, diabetes duration, co-morbidities, blood glucose, physical activity and diet, to name a few examples. The latest innovations in sensor technology (CGM, movement sensors integrated into clothing, smartphone-based image recognition), together with improved documentation of medical history in electronic patient records, diabetes-related patient diaries or telemonitoring systems, provide large and valuable datasets for therapy-related decision making. Machine learning is regarded as a helpful technology to support diabetes therapy. In the following, selected fields of machine learning in diabetes therapy are described.

Data-driven blood glucose prediction: No information about the physiology of diabetes is necessary in the data-driven blood glucose prediction. This is in contrast to systems which simulate the human physiology of the glucose-insulin regulatory systems. Data-driven techniques mainly rely on collected data and exploit hidden information in the data to predict future blood glucose levels [99].

With the availability and improved accuracy of tight glucose monitoring using CGM devices, researchers raised the question of whether current and future blood glucose values can be predicted from the glucose history [100]. If this were possible, hypoglycemic events could be detected and short- and long-term medication could be titrated.

The data-driven prediction of blood glucose can be considered a nonlinear regression problem with medication, food intake, exercise, stress etc. as input parameters and the blood glucose value as output parameter [34]. Besides regression models [101, 102] and time series analysis [103], machine learning methods in particular, such as artificial neural networks (ANN) [102, 104–107], support vector machines [108] and Gaussian models [105], have proven to be successful. Daskalaki et al. [109] presented a promising ANN model with an RMSE of only 4.0 mg/dl for a prediction horizon of 45 min for adults with T1DM. 94 % of the predictions were clinically accurate in the hypoglycemic range. Instead of evaluating the models with real patients in a clinical study, previously measured patient data were used for training and evaluation. Thus, an evaluation with real patients in a clinical study is still needed for a final conclusion on the very good performance of the model. Pappada et al. [110] reported an RMSE of 43.9 mg/dl in their study with ten T1DM patients using a neural network model. The model correctly predicted 88.6 % of normal glucose concentrations (>70 and <180 mg/dl) and 72.6 % of hyperglycemia (>=180 mg/dl), but only 2.1 % of hypoglycemia (<=70 mg/dl) within a prediction horizon of 75 min. Data-driven prediction approaches often perform poorly in estimating hypoglycemic and/or hyperglycemic events because data on low and high blood glucose values are limited [110]. Another problem of blood glucose prediction is that performance decreases as the prediction horizon increases. Sufficient prediction is only possible for horizons of 5 to 75 min [34, 109].
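The RMSE figures and the loss of accuracy with increasing prediction horizon can be illustrated with a deliberately simple baseline. The sketch below evaluates a persistence forecast, which carries the last measured value forward, on a synthetic glucose trace. The sinusoidal trace, the 5 min sampling interval and the function names rmse and persistence_rmse are assumptions made for illustration; this is not one of the models cited above.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equally long sequences (same unit as input)."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))


def persistence_rmse(series, horizon_steps):
    """RMSE of predicting series[t + horizon_steps] by the value series[t]."""
    if not 1 <= horizon_steps < len(series):
        raise ValueError("horizon_steps must be between 1 and len(series) - 1")
    return rmse(series[:-horizon_steps], series[horizon_steps:])


# Synthetic 24 h trace in mg/dl, one sample every 5 min (assumed values).
SAMPLE_MINUTES = 5
trace = [120 + 40 * math.sin(2 * math.pi * i / 48) for i in range(288)]

for steps in (3, 9, 15):  # horizons of 15, 45 and 75 min
    error = persistence_rmse(trace, steps)
    print(f"{steps * SAMPLE_MINUTES:>2} min horizon: RMSE = {error:.1f} mg/dl")
```

A learned model is only useful if it beats such a baseline at the same horizon, which is why reported RMSE values should always be read together with the prediction horizon.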

Data-driven prediction methods depend on the frequency and accuracy of available data. CGM measurements are not yet standard in diabetes therapy because of their limited accuracy and the missing reimbursement by health insurance companies [111].

Hypo-/Hyperglycemia detection: In contrast to the regression problem of blood glucose prediction, the detection of hypo- or hyperglycemic events can be treated as a typical classification problem. For a given set of input parameters, the model should detect if a hypo- or hyperglycemic event will take place. The prediction can be reduced to a binary classification problem which is easier to achieve than a continuous prediction of blood glucose values.

Sudharsan et al. [112] showed that the detection of hypo- and hyperglycemic events for patients with T2DM is achievable with high accuracy, even if only sparse blood glucose values based on self-monitored blood glucose (SMBG) readings once or twice a day are available. They trained the model with data from approximately 10 weeks. The prediction of whether a hypoglycemic event will occur within the following 24 hours achieved a sensitivity of 92 % and a specificity of 70 %. By including medication information from the past days, the specificity was improved to 90 %, although the prediction was narrowed to the hour of hypoglycemia.

Machine learning can also be used to improve the accuracy of CGM systems. Incorrect measurements can occur especially in the hypoglycemic range. Bondia et al. [113] successfully used Gaussian SVM to detect incorrect CGM blood glucose values with a specificity of approximately 93 % and a sensitivity of 75 %.
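Sensitivity and specificity, the two figures quoted for the detectors above, follow directly from the confusion matrix of a binary classifier. The sketch below derives both from a list of true events and a list of alarms. The glucose readings, the alarm sequence and the function name sensitivity_specificity are hypothetical; only the 70 mg/dl hypoglycemia threshold is taken from the text.

```python
def sensitivity_specificity(actual, predicted):
    """Return (sensitivity, specificity) for two equally long boolean sequences.

    actual[i] is True if an event really occurred, predicted[i] is True if the
    classifier raised an alarm for the same interval.
    """
    tp = sum(a and p for a, p in zip(actual, predicted))          # events detected
    fn = sum(a and not p for a, p in zip(actual, predicted))      # events missed
    tn = sum(not a and not p for a, p in zip(actual, predicted))  # correct silence
    fp = sum(not a and p for a, p in zip(actual, predicted))      # false alarms
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one event and one non-event")
    return tp / (tp + fn), tn / (tn + fp)


HYPO_THRESHOLD = 70  # mg/dl, hypoglycemia defined as <= 70 mg/dl as in the text

# Hypothetical measured values and hypothetical classifier alarms.
measured = [65, 110, 58, 150, 69, 95, 130, 72, 60, 101]
alarms = [True, False, True, False, False, False, True, False, True, False]

events = [glucose <= HYPO_THRESHOLD for glucose in measured]
sens, spec = sensitivity_specificity(events, alarms)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

Because hypoglycemic events are rare, a detector can reach high specificity while still missing many events, so both values need to be reported together.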

Glycemic variability detection: Glycemic variability (GV), the fluctuation of blood glucose values, is an indicator for the quality of diabetes management due to increased risk of hypo- and hyperglycemic episodes [114]. In order to rate the quality of GV, numerous metrics have been defined in the last decades. Rodbard [58] rated metrics according to their importance and concluded that many metrics are overlapping. He suggested the following five metrics as the most relevant:

(1) SDT (total variability in data set), (2a) SDw (the average of the SDs within each day), or (2b) MAGE (average amplitude of upstrokes or downstrokes with magnitude greater than 1 SD), as a measure of within-day variability, and (3a) SDb hh:mm (average of all SDs for all times of day), or (3b) MODD (mean difference between glucose values obtained at the same time of day on two consecutive days under standardized conditions) as a measure of between-day variability.
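Several of these metrics reduce to standard deviations and differences over a day-by-time grid of glucose values. The following sketch computes SDT, SDw, SDb hh:mm and MODD for such a grid. It assumes that every day was sampled at the same times of day, uses the sample standard deviation, and takes MODD as the mean absolute difference; these choices, the function names and the example values are assumptions for illustration. MAGE is omitted because it requires peak and nadir detection.

```python
from statistics import mean, stdev


def sd_total(days):
    """SDT: standard deviation over all glucose values of all days."""
    return stdev([glucose for day in days for glucose in day])


def sd_within_day(days):
    """SDw: average of the standard deviations computed within each day."""
    return mean(stdev(day) for day in days)


def sd_between_day_by_time(days):
    """SDb hh:mm: average over all times of day of the SD across days."""
    return mean(stdev(values) for values in zip(*days))


def modd(days):
    """MODD: mean absolute difference between values at the same time of day
    on two consecutive days."""
    diffs = [
        abs(later - earlier)
        for previous_day, next_day in zip(days, days[1:])
        for earlier, later in zip(previous_day, next_day)
    ]
    return mean(diffs)


# Hypothetical example: two days, three readings per day at identical times (mg/dl).
days = [
    [100, 120, 140],
    [110, 130, 150],
]
print("SDT  =", round(sd_total(days), 2))
print("SDw  =", round(sd_within_day(days), 2))
print("SDb  =", round(sd_between_day_by_time(days), 2))
print("MODD =", round(modd(days), 2))
```

With real CGM data the grid would typically contain 288 values per day at a 5 min sampling interval, and missing readings would have to be handled before these functions are applied.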

Based on these metrics, automated classification tasks can support healthcare professionals in identifying patients at risk and in providing therapy suggestions [58]. Detection of GV is usually based on CGM signals which provide a comprehensive dataset of blood glucose values. Machine learning proved to be a valuable method to support the consensus building for a GV metric and to categorize CGM data according to this metric.

Marling et al. [57] applied multilayer perceptrons (MPs) and support vector regression (SVR) to derive a consensus perceived glycemic variability (CPGV) metric from 250 CGM plots of 24 h each, which had been manually classified into four GV classes (low, borderline, high, or extremely high) by twelve physicians. The manual classifications were averaged and ten-fold cross validation was used for evaluation. SVR performed better than MPs. The CPGV metric obtained an accuracy of 90.1 %, with a sensitivity of 97.0 % and a specificity of 74.1 %, and outperformed other metrics like MAGE or SD.
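Ten-fold cross validation, as used in this evaluation, repeatedly holds out one tenth of the samples for testing and trains on the rest. The sketch below only generates the index splits for 250 samples. The shuffling seed and the function name k_fold_splits are assumptions, and the code does not reproduce the models or data of the cited study.

```python
import random


def k_fold_splits(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of (almost) equal size.
    folds = [indices[i::k] for i in range(k)]
    for i, test_indices in enumerate(folds):
        train_indices = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_indices, test_indices


# 250 samples, matching the number of CGM plots mentioned above.
splits = list(k_fold_splits(250, k=10))
for fold_number, (train_idx, test_idx) in enumerate(splits, start=1):
    print(f"fold {fold_number:>2}: {len(train_idx)} training, {len(test_idx)} test samples")
```

Each sample is used for testing exactly once, and the ten test results are usually averaged to obtain the final performance estimate.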

Controller for insulin-based diabetes therapy: Besides rule-based and model-based control methods, machine learning can be used to control blood glucose values. Machine learning is categorized as a model-free method, which means that it does not need a mathematical model of the glucose-insulin interaction [32, 115].

Zitar et al. [116] applied two different artificial neural network models as controllers for insulin dose titration: a multilayer feed-forward neural network trained with the Levenberg-Marquardt algorithm (LM-NN) and a polynomial network (PN). Simulations were performed with a data set of 30,000 BG samples from 70 different patients. LM-NN proved to be superior to PN. The authors stated that LM-NN has the potential to be used as a model-free insulin controller.

Lifestyle support: Carbohydrate intake and physical activity are important parameters for the treatment of diabetes. While the former increases blood glucose values, the latter lowers them. Anthimopoulos et al. [10] presented an automated food recognition system using computer vision. They adapted the well-known bag-of-words approach from natural language processing to describe the identified image features. The classification was performed with three different supervised classifiers: SVM, ANN and Random Forests (RF). In total, 5,000 images of typical European foods were available in 11 food classes. 60 % of the images were used for training and the remaining 40 % formed the evaluation set. SVM performed best with an overall accuracy of 78 % for the image classification task. Future work will include automated food segmentation and food volume estimation to count carbohydrates. A smartphone-based real-time mobile food recognition system was presented by Kawano et al. [48]. They used bounding boxes to identify food items, which were then classified into one of fifty food categories using SVM. Accuracy was 81.55 % when the top five candidates were taken into account. In a small user study, the automated system also performed better than manual food selection from a hierarchical menu.
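An accuracy that takes the top five candidates into account counts a prediction as correct if the true class appears anywhere among the five highest ranked candidates. The sketch below shows this top-k evaluation in general form. The food labels, the ranked candidate lists and the function name top_k_accuracy are hypothetical and do not come from the cited systems.

```python
def top_k_accuracy(ranked_candidates, true_labels, k=5):
    """Fraction of samples whose true label is among the first k ranked candidates."""
    if len(ranked_candidates) != len(true_labels) or not true_labels:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(
        truth in candidates[:k]
        for candidates, truth in zip(ranked_candidates, true_labels)
    )
    return hits / len(true_labels)


# Hypothetical classifier output: candidates sorted from most to least likely.
ranked = [
    ["rice", "pasta", "bread"],
    ["salad", "soup", "rice"],
    ["pizza", "bread", "pasta"],
]
truth = ["pasta", "rice", "salad"]

for k in (1, 2, 3):
    print(f"top-{k} accuracy: {top_k_accuracy(ranked, truth, k):.2f}")
```

Top-k accuracy suits an interactive setting in which the user confirms the correct item from a short candidate list, but it is always at least as high as top-1 accuracy and should not be compared with it directly.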

Physical activity detection is an important prerequisite for estimating energy expenditure. Ruch et al. [117] used a tri-axial accelerometer together with parameters like age, gender and weight to train a decision tree based activity-specific prediction equation (Tree-ASPE) and an artificial neural network for energy expenditure estimation (ANNEE). Tree-ASPE outperformed ANNEE.

Ellis et al. [118] showed that an RF classifier can be used to predict physical activity type and energy expenditure using accelerometers. In this study, wrist accelerometers were more successful in physical activity detection, while hip accelerometers were superior in energy expenditure estimation.

6 Open Problems

In this chapter we highlight the main challenges for personalization of diabetes therapy. The focus lies on the problems regarding technical implementation rather than on the medical issues of therapy personalization.

Problem 1: DM is often regarded as being of secondary importance, especially in the clinical domain. This is understandable because patients are primarily hospitalized for reasons other than DM and clinicians need to focus on the reasons for the admission. Clinicians are often unable to spend much time on the patient’s diabetes therapy due to heavy workload and rigid clinical workflows. Therefore, one focus in the development of CDSS is the optimization of usability. In a systematic review investigating features critical to the success of CCDSS, the authors discovered that 75 % of interventions succeeded when the decision support was provided to clinicians automatically. None succeeded when clinicians were required to seek out the advice of the decision support system [82].

Problem 2: Modelling the human insulin system is a complex task. Different approaches have been developed in recent decades. The artificial pancreas is still a field of research and no end-consumer system is available on the market. The main reason for this is that the precision and usability of continuous glucose monitoring (CGM) in daily use currently do not meet the needs of such a system.

Problem 3: Diabetes therapy is complex and varies from patient to patient. Success of diabetes therapy depends on many different factors. Nutrition intake, physical activity and current health status influence the specific therapy. Whereas T1DM can only be treated with insulin, a wide range of therapeutic options is available for patients with T2DM. The combination of factors influencing the therapy and the therapeutic options makes personalized therapy initialization and optimization a complex task. In addition, physicians and patients are often reluctant to start insulin therapy and to intensify insulin treatment regimens due to the fear of hypoglycemia. Thus, the use of continuous monitoring with on-body sensors (blood glucose, nutrition intake, physical activity, health status) together with intelligent therapy prediction and optimization models can help to initiate and to optimize therapy with a reduced risk of safety-critical events like hypoglycemia.

Problem 4: Currently there are many freestanding software applications (apps) available for smartphones which calculate bolus doses of insulin. These apps regulate dosing of potentially dangerous insulin, which puts them in the domain of the Food and Drug Administration (FDA), but none has been approved by the FDA. Patients should not use such non-approved medical software because of the risk of being instructed to administer an unsafe dose of insulin [31]. In the institutional care sector, too, systems with decision support functionality are developed in this “grey area”. CPOE systems in Europe have not yet been classified as Medical Devices [119]. There is an ongoing discussion about whether vendors classify their products as Medical Devices Class IIa, Class I or not at all. The development process of CDSS is complicated and expensive due to the requirements of development that conforms to the Medical Device Directive (MDD).

Problem 5: New sensor technologies integrated in applications like wearable devices are very promising, especially for the personalization of insulin therapy. Using intelligent controllers, which are available for example in integrated machine learning approaches [120], in combination with an arrangement of different sensors can lead to a significant improvement of insulin therapy. However, the problem lies in the accuracy of currently available minimally invasive sensor systems. Sensors have to be very accurate to prevent errors in insulin dose calculations. Food and activity recognition systems also have to be improved to be eligible for insulin therapy. Closed loop systems, such as artificial pancreas systems, face the same problem. Currently, the biggest obstacle for safely running these systems is not the controller algorithm but the accuracy of CGM sensor systems.

Problem 6: Personalization of the patient’s diabetes treatment demands patient involvement. The development of factors for personalization requires frequent documentation of relevant events (e.g. blood glucose, meals, physical activity, health status etc.) and adherence to the therapy goals. This human-in-the-loop situation demands special adaptations of CDSS [121]. For elderly, inexperienced or less motivated patients this may quickly lead to a therapy overload. Unfortunately, the majority of T2DM patients are part of this group. The main challenge is the development of therapy aids which are as unobtrusive as possible and require as little interaction as possible.

Problem 7: The treatment of diabetes takes place in different health care sectors (at home, outpatient care, nursing home, hospital care etc.). Borders between the health care sectors make it difficult to provide decision support that can be used seamlessly in every sector. Consequently, the developed CDSS are focused on a particular sector and interfaces for data transfer are usually lacking. These developments make it difficult for patients and for healthcare professionals to initialize and optimize therapy. Future research should focus on the cross-sector treatment of patients with diabetes.

Problem 8: Machine learning is used to predict blood glucose values. As machine learning is a data-driven method, the quality of prediction depends on the quality of available data. Very low blood glucose (hypoglycemia) is an adverse event. Consequently, data on such events are sparse, which leads to unsatisfactory prediction results for these safety-critical situations.

7 Future Outlook

Recent DM guidelines and advances in research and development of diabetes therapy highlight the importance of therapy personalization.

The ultimate goal of technical research in the field of diabetes therapy is to develop an artificial pancreas system. But as long as artificial pancreas systems are still a research field and no commercial product is available, CDSS are valuable tools to assist in the personalized decision making process. On the one hand, machine learning used within the CDSS (e.g. short-term glucose prediction, pattern recognition, physical activity detection) has proven to be a valuable method to support personalized therapy, but on the other hand it has shortcomings in terms of accuracy and usability in the daily routine (e.g. long-term blood glucose predictions, energy expenditure calculation, carbohydrate estimation).

Consequently, future CDSS using machine learning need to improve to be eligible for DM therapy. Personalization of DM therapy using CDSS is a promising field for the future and various research routes exist.