1 Introduction

A key concept that has greatly influenced modern medical care has been the development of evidence-based medicine. A major aspect of evidence-based medicine includes the development of clinical guidelines for the diagnosis and/or treatment of common disorders. Such guidelines are developed by professional medical organizations, based on evidence accumulating from clinical research, often through randomized, controlled clinical trials. Over the past two decades, several types of clinical-guideline based decision-support systems (DSSs) have been developed to support clinicians in managing patients who have specific diagnoses, such as diabetes or cardiovascular diseases. Such DSSs represent evidence-based clinical guideline knowledge in computable specifications known as computer-interpretable guidelines (CIGs) (Peleg 2013). CIG formalisms provide detailed ontologies that capture data enquiry and abstraction, action management, and decision-making tasks. They provide semantics that go beyond those of existing telemonitoring systems (Lasierra et al. 2013) that use collections of rules to react to data values that are outside normal ranges.

A variety of CIG formalisms exist, each having its own application engine (Peleg 2013). CIG application engines match the CIG specification to a patient’s data set, ideally imported from medical institutions’ electronic health records. The collection of relevant patient data, to which each CIG refers, instantiates a patient model for that CIG. The behavior of the DSS is thus adapted to the patient model. Currently, most of these emerging CIG-based DSSs are used only by clinicians during the patient visit, to obtain patient-specific recommendations.

The MobiGuide system, described in this paper, goes beyond the current state of the art of CIG-based DSSs in two main ways. First, it is patient-centered, meaning that the users of the DSS are not only the clinicians but, first and foremost, the patients. To accommodate this novelty, we have developed and used new knowledge-acquisition and specification methods for capturing in CIGs the parallel workflows of the patients who are following the CIG’s recommendations, and of the clinicians caring for these patients (Sacchi et al. 2013).

Fig. 1

MobiGuide patient models. Features reported in the middle column of the figure are common to both the AF and GDM patient model, while the disease-specific ones are reported on the left and right column, respectively

The second way in which MobiGuide goes beyond traditional CIG-based DSSs is personalization of decision support: unlike other emerging CIG-based DSSs, the patient model of MobiGuide includes not only the patient’s clinical data but also the patient’s personal preferences and psychosocial context. Preferences could be values that the patient assigns to his/her present health state and to future possible health states to which he/she might transition, based on the therapy choices decided together with his/her clinician, but also daily life preferences, such as meal times or self-monitoring schedule. The patient model also includes a set of personal events. These personal events induce (with certain temporal constraints) one or more of a set of predetermined psychosocial contexts that appear in the customized CIG and that can impact medical decisions. Thus, following the initial formal specification of clinical practice guidelines as CIGs (with optionally parallel workflows), the CIGs are customized so as to include additional contexts that are not mentioned by the standard clinical practice guidelines, but which could occur for some periods of time when managing a patient at home through a mobile device. These customized contexts determine, for example, which sub-plans of the CIG will be activated or inactivated, and which knowledge will be applied to interpret the patient’s data. For example, a context of “semi-routine schedule” may be associated with care plans that require more frequent monitoring of blood pressure. The customized CIGs are then personalized to specific patients by specifying the individual, personal events that induce each of the predetermined customized contexts (e.g., a “going on vacation” event may induce the predetermined, customized “semi-routine schedule” context). Hence, the patient model is dynamic, reacting to current data from real-time monitoring, and is also highly adaptive to the personal preferences and contexts of individual patients, rather than to stereotypical patient profiles.

Figure 1 shows the MobiGuide patient model for AF and GDM patients. In addition to the clinical part (i.e., main diagnosis, comorbidities, and prescriptions), which is found in other emerging CIG-based DSSs, the MobiGuide patient model shows many personalized elements that are not found in other CIG-based DSSs and include (Fig. 1): (customized) context, patient-reported events that invoke DSS recommendations (symptoms, risky events for bleeding, reports of eating extra carbohydrates), and patients’ personal preferences. These features, which change dynamically, influence clinical acts, including (a) the types and frequency of recommendations and reminders, which are sensitive to context and to reporting of personal events, and (b) the timing of measurement reminders, which depends on context and personal preferences. In addition to these personalized elements, the patient model also uses other unique features, including clinical abstractions inferred from raw data, and patient-specific DSS recommendations that were delivered to the patient or to his care providers. The entire patient model resides in the dynamic PHR (see Sect. 2.2).

The following scenario describes the patient experience of using MobiGuide.

Michael is a 70-year-old patient with paroxysmal AF and hypertension. He has been experiencing more palpitations recently, which has been making him anxious. His doctor, Carlo, suggested that he might feel safer being monitored and advised by the MobiGuide system while he continues to function in his normal environment.

   After Michael signed the informed consent, Carlo activated the MobiGuide AF CIG for him, and his relevant hospital records were exported into the PHR. Then, additional information, such as current therapies, was added to the PHR. At this point, his patient model contains a relatively small number of relevant clinical data (shown in Figs. 1 and 2b, the latter presenting the Summary (of clinical history) and Therapy prescriptions).

   Using the shared decision making module of MobiGuide, Carlo and Michael elicited Michael’s preferences in the form of utility coefficients and selected Warfarin as the most suitable anticoagulant that would bring Michael the highest value of quality-adjusted life years according to his personal utilities attributed to his actual state and other potential health states he could experience in the future. This global preference resulted in the selection of a specific arm in the CIG.

   Next, the MobiGuide DSS guided Carlo in selecting from a relevant set of monitoring plans. He then set Michael’s local preferences regarding reminder days and times for different potential contexts, including routine schedule, semi-routine schedule, and increased exercise (e.g., periods when he bikes more than usual). These local preferences, like the global preferences, became part of his patient model (Fig. 1). Based on them, the system will remind Michael to perform his monitoring plans at the frequency appropriate to his current context.

   On the following day at 6:55 am, at his preferred reminder time, Michael received the reminder to take his medications. However, he decided to report that he’s not going to take the medication, indicating the reason “because of side effects” (Fig. 2c).

   He also received a reminder to measure his ECG. He accepted and activated his mobile sensor for 30 min (Fig. 2d). The signals were communicated via Bluetooth connection to the phone, analyzed by the AF detection algorithm and an ECG summary was saved in his PHR. It will be examined by nurse Roxana later that day.

   Michael’s BP monitor does not communicate via Bluetooth, so he entered his measured values manually. He clicked on the logbook and scrolled to the BP column (Fig. 2e). But instead of reporting a DBP of 90 mmHg he entered 900. The system detected the low-quality data and asked him to re-enter the value (Fig. 2f). While he was on the logbook, he also decided to visualize his data collected so far (e.g., Fig. 2g shows all BP data).

   Michael felt some palpitations that afternoon, so he clicked on the green face with a grimace (top of Fig. 2a) and reported unacceptable palpitations from the pull-down menu (Fig. 2h). He then received an automatic clinical recommendation from the MobiGuide DSS to monitor his ECG (Fig. 2i). Michael also remembered that he has a dental appointment 2 weeks from now, so he informed the system of this risky event for bleeding (Fig. 2j). The system recommended that he call his doctor to appropriately manage his anticoagulant therapy.

Fig. 2

MobiGuide patient UI. a regular first screen for Atrial Fibrillation (AF) (the Gestational Diabetes Mellitus [GDM] screen does not include the AF Monitoring icon); b active prescriptions; c medication reminders with a place to indicate non-compliance with reason; d ElectroCardioGram (ECG) monitoring screen; e logbook with Blood Pressure (BP) data entry; f bad quality of data message; g visualizing BP records; h selecting palpitations level; i recommendation to measure ECG due to symptom reporting; j informing the system of risky events (in this case risk of bleeding)

The paper is organized as follows. We start with a section that describes CIG formalisms and their patient models. This section covers the work of other research groups and also summarizes our previously published early work on the architecture of MobiGuide and on the process of customizing its formal knowledge base to the patients’ psycho-social and demographic characteristics (Peleg 2013). In Sect. 3, we explain in detail the knowledge elicitation and specification methodologies that we have developed (as part of the MobiGuide project) for making CIGs patient-centered and enabling their personalization via the patient model. We then present the evaluation methods and results of a multi-national feasibility study, conducted in two different clinical domains and at two geographic sites, of the personalized self-management MobiGuide architecture that we have designed and implemented, which applies complex CIGs to support patients and their care providers. We conclude in Sect. 6.

2 CIG formalisms and their patient models

In this section, we provide the necessary background regarding CIG formalisms in general, and the methods that we had previously developed to implement the MobiGuide system (Peleg 2013) in particular; we then complete the necessary background and paint the complete picture, by reviewing other existing approaches for CIG customization, pointing out their differences with respect to the MobiGuide model.

This background, which describes the principles of our knowledge engineering process, will help the reader understand the rest of our methodology, described in Sect. 3, and appreciate the results that we describe in Sect. 5.

2.1 CIG formalisms: their patient models and application to patient data

Most of the CIG formalisms, such as Asbru (Miksch et al. 1997), which is the CIG formalism used in the MobiGuide project, and also EON, GLIF3, GUIDE, and PROforma (Peleg et al. 2003), include procedural and declarative representations. In addition, all of them have application engines that can be used to run the model with a patient’s data. In MobiGuide, the Picard Asbru application engine (Shalom et al. 2016) is used.

The procedural representation specifies care processes as networks of task classes, such as medical actions, decisions, and compound plans. The control flow of task application is governed by scheduling constraints, task entry and exit criteria, and decision criteria. The decision criteria refer to the state of the care process and to abstractions of patient’s state, which are defined as declarative knowledge. Hence, the declarative CIG knowledge defines the patient model.

The various CIG ontologies make different assumptions regarding the available knowledge and patient data, but none of them explicitly refers to the patient. In addition, different CIG frameworks use various means to extract meaningful patterns (abstractions) from the raw, time-stamped clinical data. The definitions of clinical concepts and abstractions constitute the declarative patient model, and they make assumptions about the patient information model used (Peleg and Gonzalez-Ferrer 2014). The declarative temporal-abstraction knowledge in MobiGuide, which is used to abstract raw time-stamped clinical data into clinically meaningful higher-level concepts and patterns in a context-sensitive fashion, is represented using the knowledge-based temporal-abstraction (KBTA) ontology (Shahar and Musen 1996). This knowledge is applied to the patient’s longitudinal record by the KBTA problem-solving method (Shahar 1997), which is implemented by the IDAN temporal-abstraction mediator (Boaz and Shahar 2005) and has been thoroughly evaluated in several clinical domains (Shahar and Musen 1996; Martins et al. 2008). The abstractions of the patient’s data are often complex (e.g., a “repeating dietary non-compliance” pattern, based on two diet-associated non-compliance events, i.e., too much or not enough carbohydrates, during the past week).
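As a minimal illustration of such an abstraction (a sketch only, not the KBTA ontology or the interface of the IDAN mediator), the Python fragment below flags a “repeating dietary non-compliance” pattern when at least two diet non-compliance events fall within the past seven days. The event labels, the function name, and the reading of “past week” as a seven-day window are assumptions made for this example.

```python
from datetime import datetime, timedelta

# Hypothetical labels for diet-associated non-compliance events.
DIET_NONCOMPLIANCE = {"too_many_carbohydrates", "not_enough_carbohydrates"}


def repeating_dietary_noncompliance(events, now, window_days=7, min_events=2):
    """Return True if at least `min_events` diet non-compliance events
    are time-stamped within the `window_days` days ending at `now`.

    `events` is an iterable of (timestamp, label) pairs.
    """
    window_start = now - timedelta(days=window_days)
    recent = [
        ts for ts, label in events
        if label in DIET_NONCOMPLIANCE and window_start <= ts <= now
    ]
    return len(recent) >= min_events
```

A real temporal-abstraction mechanism would also take the current context into account and would handle interval-based concepts; this sketch covers only the counting of point events in a sliding window.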

Some CIG models include not just a patient model but also a model of the clinician user. The Dementia Management and Support System (Lindgren 2011) provides advice to healthcare professionals, tailored to individual and often exceptional patient cases. The system is unique in providing the user with a learning environment that promotes the development of skills while assessing a patient case. This strategy acknowledges the users’ different professional backgrounds, preferences, and needs for individually tailored support. The system alerts the user when there is an ambiguity, when information is missing, or when certain pre-defined types of ignorance are detected in the user.

2.2 The MobiGuide architecture: applying decision support to the dynamic patient model

Unlike commercial efforts such as Microsoft’s HealthVault (https://www.healthvault.com/us/en), which focus on patient-centered storage and sharing of health information online, MobiGuide is centered on its knowledge-based decision-support services. These decision-support services provide evidence-based clinical recommendations that are specific to a patient model that addresses the patient’s clinical, psychosocial and demographic state. The architecture that we have developed for MobiGuide (Peleg 2013) is generic and was reused in different medical domains with different clinical guidelines, various sensors for collecting patient data and several electronic health records (EHRs). To enhance efficiency and self-sufficiency, much of the DSS’s computation is performed locally for each patient. Thus, MobiGuide’s DSS incorporates a novel distributed architecture (see Fig. 3). Within this architecture, a fully-fledged backend DSS (BE-DSS), which operates for all patients, projects (via the projection and callback link) relevant parts of the CIG knowledge to the patient’s local, mobile-based DSS (mDSS). The mDSS runs on the patient’s smartphone and has access to data collected from mobile sensors that measure the patient’s biosignals (e.g., a mobile ECG monitor for AF patients and mobile blood glucose (BG) and blood pressure (BP) monitors for GDM patients). The mDSS can then act independently until the patient’s state changes substantially, necessitating a callback to the BE-DSS, which projects new knowledge to the mDSS. The BE-DSS has access to the complete CIG knowledge (which is customized to allow individualized personalization as explained below) and to the patient model (i.e., the full historical and monitoring data, current context, clinical state, current therapy and individual preferences). The BE-DSS can also send evidence-based recommendations to the patient’s care provider, through a dedicated interface. In addition, the BE-DSS interacts with a shared decision module that uses decision trees to select a CIG branch that best fits the patient’s utility function.
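The projection and callback interplay between the BE-DSS and the mDSS can be pictured with the following Python sketch. It is a toy model under stated assumptions, not the MobiGuide implementation: the class names, the `project` and `recommend` methods, the state labels, and the plan strings are all hypothetical placeholders, and a “substantial change” is simplified to any change of the state label.

```python
class BackendDSS:
    """Holds the complete (hypothetical) knowledge: one plan per patient state."""

    def __init__(self, plans_by_state):
        self.plans_by_state = plans_by_state

    def project(self, state):
        # Return only the part of the knowledge relevant to the given state.
        return {"state": state, "plan": self.plans_by_state[state]}


class MobileDSS:
    """Acts locally on the projected plan; calls back when the state changes."""

    def __init__(self, backend, initial_state):
        self.backend = backend
        self.callbacks = 0
        self.projection = backend.project(initial_state)

    def recommend(self, observed_state):
        if observed_state != self.projection["state"]:
            # State changed: call back to the backend for a new projection.
            self.callbacks += 1
            self.projection = self.backend.project(observed_state)
        return self.projection["plan"]


# Placeholder knowledge, invented for this example only.
backend = BackendDSS({
    "stable": "routine monitoring plan",
    "changed": "intensified monitoring plan",
})
mdss = MobileDSS(backend, "stable")
print(mdss.recommend("stable"))   # served locally, no callback
print(mdss.recommend("changed"))  # triggers one callback
```

The point of the design is that repeated calls under an unchanged state never contact the backend, which is what lets the mobile component keep working on its own between callbacks.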

Another central difference from efforts such as HealthVault is that MobiGuide’s personal model, which is called the personal health record (PHR) (Marcos et al. 2015), conveys medical-domain semantics and is based on a standard patient information model (HL7 vMR, http://www.hl7.org/implement/standards/product_brief.cfm?product_id=338). The PHR refers to standardized terms from controlled clinical vocabularies (Bodenreider 2004), such as UMLS and SNOMED-CT, which were used in MobiGuide. The PHR semantically integrates data from hospital EHRs, data collected from the mobile monitoring devices (e.g., electrocardiograms, BG), symptoms reported by patients, patient-specific preferences, and patient-specific recommendations and abstractions delivered by the DSS.

Fig. 3

The overall architecture of the MobiGuide system

2.3 CIG formal specification, customization, personalization, and application in MobiGuide

The patient-centered care process in MobiGuide introduces two new steps into the CIG development and application process, thus creating a process that we refer to as the CIG formal specification, customization, personalization, and application (SCPA) process (Peleg et al. 2013) (Fig. 4). As in the non patient-centered case, the process of CIG development starts with manual formal specification of the evidence-based guideline as a CIG by a modeling team of knowledge engineers and clinical experts (Shalom et al. 2008).

The next two steps are unique to the patient-centered focus of MobiGuide. In the customization phase, the modeling team extends the CIG so that it also includes various new CIG-customized contexts (CCCs) and corresponding CIG branches that were not taken into consideration in the original narrative guideline, which was typically intended to be applied in a hospital or an ambulatory clinic context. These new contexts are crucial for applying the guideline through a remote-care, mobile-based architecture such as MobiGuide. They address the patients’ recurring psychosocial domains (e.g., whether patients have a routine or semi-routine schedule) and technological domains (e.g., low smartphone battery), in addition to clinical contexts (examples provided in Fig. 4).

Fig. 4

The Computer Interpretable Guidelines (CIGs) formal specification, customization, personalization, and application (SCPA) process in the MobiGuide project

CCCs, like clinical contexts introduced into the CIG at the formal specification phase, can be induced by specified events, within predefined temporal constraints. For example, an event of the type “a potential risk of bleeding” induces the “anti-coagulant dose reduction” context, which starts 5 days before the scheduled bleeding risk event and finishes one day after the event. The methodology of Dynamic Induction Relations of Contexts (DIRCs) was originally introduced by Shahar (1998) as one component of the KBTA methodology (Shahar 1997). It can also handle composite (overlapping) interacting contexts, as well as multiple types of temporal relations and constraints between the inducing event and the induced context.
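The bleeding-risk example can be expressed as a small Python sketch of a single induction relation. This is an illustration under simplifying assumptions, not the DIRC formalism itself: the `InductionRelation` type and `induce_context` function are hypothetical names, the event is treated as a time point, and composite or overlapping contexts are not handled.

```python
from datetime import datetime, timedelta
from typing import NamedTuple


class InductionRelation(NamedTuple):
    event_type: str
    context: str
    start_offset: timedelta  # relative to the event time (negative = before)
    end_offset: timedelta    # relative to the event time


def induce_context(relation, event_type, event_time):
    """Return (context, start, end) if the event matches, else None."""
    if event_type != relation.event_type:
        return None
    return (
        relation.context,
        event_time + relation.start_offset,
        event_time + relation.end_offset,
    )


# The relation described in the text: the context starts 5 days before
# the scheduled event and finishes one day after it.
BLEEDING_RISK = InductionRelation(
    event_type="potential_risk_of_bleeding",
    context="anticoagulant_dose_reduction",
    start_offset=timedelta(days=-5),
    end_offset=timedelta(days=1),
)
```

Calling `induce_context(BLEEDING_RISK, "potential_risk_of_bleeding", t)` for a scheduled time `t` yields the interval during which the dose-reduction context would hold.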

In addition, during the CIG customization phase, the modeling team prepares the CIG for the inclusion of patient personal preferences:

  (a) Global clinical preferences: patient preferences that determine the selection of a major branch in the guideline over another. For example, preferring Dabigatran over Warfarin when both are reasonable options as anticoagulants, in the case of AF. This preparation involves adding the appropriate shared decision model that enables a choice of the CIG branch that best fits the patient’s utility function (Quaglini et al. 2013);

  (b) Local clinical preferences: local modifications within a guideline branch, or context. For example, preferring a particular morning hour for a reminder to measure a GDM patient’s fasting BG. This preparation involves making explicit options such as choosing the preferred time to get a reminder, etc. Once a CIG is customized to refer to meal times, specific patients may set their actual meal times during the following personalization phase.
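To make the idea of a global preference in (a) concrete, the following Python sketch chooses the branch with the highest expected utility for a given patient. It is a deliberately flattened, one-level stand-in for a decision tree, not the shared decision model of MobiGuide: the branch names, health states, probabilities, and utility coefficients are all invented for illustration and carry no clinical meaning.

```python
def expected_utility(outcomes, utilities):
    """Sum of probability * patient utility over the outcomes of one branch."""
    return sum(p * utilities[state] for state, p in outcomes.items())


def best_branch(branches, utilities):
    """Return the name of the branch with the highest expected utility."""
    return max(branches, key=lambda name: expected_utility(branches[name], utilities))


# Hypothetical branches: probability of ending up in each health state.
BRANCHES = {
    "branch_A": {"well": 0.7, "minor_burden": 0.3, "major_event": 0.0},
    "branch_B": {"well": 0.9, "minor_burden": 0.0, "major_event": 0.1},
}

# Two hypothetical patients who value the same health states differently.
patient_1 = {"well": 1.0, "minor_burden": 0.9, "major_event": 0.2}
patient_2 = {"well": 1.0, "minor_burden": 0.5, "major_event": 0.6}

print(best_branch(BRANCHES, patient_1))
print(best_branch(BRANCHES, patient_2))
```

With these made-up numbers the two patients end up on different branches even though the outcome probabilities are identical, which is the sense in which a patient’s utilities select an arm of the CIG.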

It is important to emphasize that at the end of the customization phase we are left with a single CIG that could be applied to any patient. This CIG does not contain information that is specific to any patient.

Personalization is the second process that is unique to patient-centered CIGs. This process usually takes place during the first encounter of the patient with a care provider. Together, they define (a) which personal events might induce any of the pre-defined CCCs (and when, i.e., the relevant dynamic temporal constraints); and (b) the patient’s global and local preferences regarding her treatment. These are stored in the PHR.

For example, in the AF domain, the physician asks the patient which events might lead to the predefined semi-routine context, and the patient may indicate that this CCC is induced by a vacation event. The physician then adds “vacation” as a personal event that induces the semi-routine context for this patient. Similarly, the physician presents to the patient a list of potential bleeding events (risky events) and the patient can select those that are applicable, such as “visiting a dentist”.

During the CIG application phase, patients will have the option to notify the system of their personal events. For example, the patient could report a vacation event or a dental appointment. These will be saved into the PHR. The BE-DSS will be notified by the PHR and the respective CCCs (semi-routine or anti-coagulant dose-reduction) would be induced (see Figs. 2j, 4).
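The division of labor between customization, personalization, and application can be sketched in a few lines of Python. This is a hedged illustration only: the context identifiers, the `personalize` and `on_event_reported` functions, and the dictionary representation are assumptions for this example and do not reflect how the PHR or the BE-DSS store or exchange this information.

```python
# Customization: CCCs are defined once in the CIG, for all patients.
CUSTOMIZED_CONTEXTS = {"semi_routine", "anticoagulant_dose_reduction"}


def personalize(event_to_context):
    """Personalization: validate one patient's mapping from personal
    events to the predefined CCCs and return a copy of it."""
    unknown = set(event_to_context.values()) - CUSTOMIZED_CONTEXTS
    if unknown:
        raise ValueError(f"not a predefined context: {sorted(unknown)}")
    return dict(event_to_context)


def on_event_reported(personal_events, event):
    """Application: return the CCC induced by a reported event, or None."""
    return personal_events.get(event)


# Mapping for one patient, following the examples in the text.
patient_events = personalize({
    "vacation": "semi_routine",
    "dental_appointment": "anticoagulant_dose_reduction",
})
print(on_event_reported(patient_events, "vacation"))
```

The validation step reflects the constraint stated above: patients and care providers choose which personal events induce a context, but the set of contexts itself is fixed at customization time.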

With the two added phases, the complete process of CIG Specification, Customization, Personalization, and Application is termed SCPA.

2.4 Other existing approaches for CIG customization

The dynamic context-based personalization mechanism used in MobiGuide is novel and has not been used elsewhere. However, other researchers, whose studies are reviewed below, have customized CIG knowledge to infer patient-specific recommendations that address more than the patient’s clinical data relating to a single disease. Such recommendations can address a patient’s multi-morbidities, her social history, or considerations of the setting of a particular healthcare organization at which the patient is cared for. Unlike the SCPA approach, none of the existing studies have attempted to designate personal patient events as inducers of pre-customized CIG contexts (which might, for example, lead to activation of a new CIG plan).

Riaño et al. (2012) represent CIGs as State-Decision-Action (SDA) diagrams that are linked with clinical domain ontologies. They developed algorithms for adjusting patients’ conditions based on disease profiles; these profiles are consulted to suggest additional signs and symptoms that the patient is likely to exhibit and which could be used to generate a more complete record. In turn, the complete record may help to establish the patient’s diagnosis. For patients with multiple morbidities, they suggest a tool that provides a graphical interface to edit and merge SDA diagrams of individual intervention plans for comorbidities of the patient. Intervention plans and decision criteria may address the clinical as well as the social context of the patient.

Grandi et al. (2012, 2016) suggest efficient management of multi-version CIG collections by representing, in a knowledge base, multi-version clinical guidelines and domain ontologies in XML or in relational schemas. Personalized CIGs can be created by building on demand, from the knowledge base, which also contains historical versions of the guideline, a version that is tailored to the patient’s current time (or desired temporal perspectives, e.g., what recommendations would have been delivered by a previous CIG version) and to the patient’s disease profile (set of comorbidities, e.g., hypertension in addition to AF). The versions could also be tailored to the organizational settings in which the patient is cared for. While this approach allows creating different versions of CIGs over time, which are not supported in MobiGuide, these versions do not include individual preferences; they differ only in the composition of the different parts of the customized guidelines that are assembled together to create a guideline version for a particular patient.

Lanzola et al. (2014) developed guideline-based process indicators related to stroke care. The computation of those indicators is done within a stroke registry, which also stores the historical evolution of the resources available at the participating stroke units. While this is useful to justify low values of some indicators in some time periods, again, patients’ preferences are not considered. Lasierra et al. (2013) describe the clinical concepts related to a monitoring task using two levels. The configuration level characterizes the clinical concept (e.g., systolic blood pressure is a vital body measurement), while the results data level characterizes the data used to acquire the concept and contains a more detailed patient model related to the current measurements, for example, whether the patient was sitting or lying down when the measurement was taken. Based on these characteristics, physicians can create personal versions of CIGs for each patient. Note that in their approach, there is no distinction between knowledge customization that fits a population of patients and personalization that links personal events to customized CIG contexts.

In their adaptive DSS called Personalized Emergency System for Disabled Humans (PRESYDIUM), Chittaro et al. (2011) developed a patient model that is partly based on the International Classification of Functioning, Disability and Health standard proposed by the World Health Organization. Important attributes from this standard include cognitive functions, body functions, mobility of joints, motor control, pain, and involuntary movements. The knowledge base of their system does not contain a procedural CIG model but rather a collection of frames defining rules. Given a patient profile, frames with corresponding conditions are examined and are filtered based on the current emergency code and user category, to provide operating instructions.

3 Methods

This section presents the methods that we developed for knowledge acquisition. State-of-the-art knowledge elicitation methods start from a narrative clinical guideline and define the semantics of the care process (Shalom et al. 2008). Our novel methods address aspects that are not present in the evidence-based clinical guidelines, which view clinicians, rather than patients, as the target of the DSS recommendations. Our methods are used to detect opportunities that allow patients to use information technologies for making CIGs patient-centered and enabling their personalization. They include acquiring and specifying knowledge of parallel workflows and projection, and acquiring CCCs and their effect on care plans. The parallel and customized CIGs could then be applied by the Asbru CIG engines mentioned in Sect. 2.1. This section also discusses the personalization of the AF and GDM CIGs that were implemented and used in the evaluation of MobiGuide.

3.1 Methods for acquiring relevant CCCs and their effect on care plans

The customization step requires addition of CCCs that were not included in the Specification phase of the original clinical practice guideline, and corresponding care plans that should be applied for the CCCs. To support the teams of clinical experts and knowledge engineers who carry out the SCPA process in considering potentially relevant CCCs and in assessing their impact on clinical care, we apply the following strategies:

  (a) Examining all CIG decision points, and thinking whether different psycho-social and demographic characteristics of patients could impact decision-making at those points, changing the guideline’s recommended plans and actions (Fux et al. 2012).

  (b) Thinking about scenarios that arise when patients are guided by a DSS such as MobiGuide, outside of clinically-controlled environments. For example, during travel.

  (c) Characterizing how care recommendations might change, by thinking about different types of modulations of the clinical goals and actions themselves.

To implement these strategies, we used qualitative research methods, including structured questionnaires, interviews, and text analysis. We then used the elicited psycho-social contexts and the effect types to develop an elicitation instrument for thinking of relevant CCCs while trying to customize a particular clinical guideline.

We now outline our context-elicitation methodology in more detail.

A. Elicitation of generic psycho-social and demographic concepts and their potential effects on clinical goals and actions during guideline customization

To elicit generic psycho-social and demographic concepts and their potential effects on clinical goals and actions, we developed a set of questionnaires. The questionnaires were scenario-based and included parts that are increasingly structured. This was done in order to focus the interviewee on a top–down thinking process, thereby eliciting increasingly detailed responses, because it has been indicated (Pitts and Browne 2007) that a systematic thinking process during elicitation helps to avoid cognitive biases (Kahneman and Tversky 1982). The questionnaires to clinicians and patients are shown in appendices A and B. The interviews were conducted using the questionnaires and were held with different stakeholders, including eight patients, twenty-one physicians, and nine other care providers (nurses, social workers, and nutritionists) from Israel, Italy, Spain, and the USA.

In the first part of the first questionnaire, the care providers were asked to assess the main psycho-social and demographic contexts that might affect the clinical decision-making process. The second part lists several general situations, describing a patient who complains that he/she suffers from different symptoms. The care providers were asked to describe the general scenario with emphasis on psycho-social and demographic information, including which information items they collect during the medical interview and how they take them into consideration when they set clinical goals and make therapy choices. In the third part, clinicians were asked to think of their patients who already have an established diagnosis (e.g., AF, hypertension) and describe the process of patient management, addressing major decision points and the role of psycho-social and demographic information items in their decision-making process. The fourth part considers the potential of the MobiGuide system and the goal of personalization. A scenario of a diabetic patient using MobiGuide is described in this part. The interviewees were asked to create an analogous scenario from their clinical domain.

Then, the interviewees were asked to think of the information that they expect to gain from the system in the process of patient follow-up, treatment and work up, considering decision points in the process, the potential use of ambulatory monitoring devices and the relevance of such data and of psycho-social and demographic information to decision-making. The interviewees were then asked more structured and technical questions regarding psycho-social and demographic data: their allowed range and valid time, who would be able to enter and modify those data items, and whether specific values of such data would require immediate notification to patients or care providers.

Similarly to the care providers’ questionnaire, the patients’ questionnaire referred to a series of treatment encounters that the patient had with his/her care provider. In the first part, the patients were asked to describe the course of the meeting with emphasis on treatments and decisions. They were asked how psycho-social and demographic considerations affected treatment on the one hand, and influenced their daily routine (e.g., work, family life, hobbies, habits, diet) on the other hand. As in the clinician’s questionnaire, a second part considered the introduction of the MobiGuide system. The patients were asked how the system and approach would change the way in which they handle their treatment and adherence.

After analyzing the interviews, we defined the emerging psycho-social and demographic context-inducing concepts: ability to comply with the treatment (i.e., fully by the patient herself, fully while assisted by care provider, partially by the patient herself, partially while assisted by an informal care giver, or no ability); communication level, which can be aggregated from cooperation level, desire to know the truth about the prognosis, education level, language level, and trust level; need for an accompanying person during visits to the care provider; degree of family support; whether the daily schedule and the daily diet are routine, semi-routine, or completely non-routine; distance from the nearby medical center; financial capability; living area accessibility, living area pollution, and family-support level.

These concepts were used in a second questionnaire to care providers (Appendix 3), which focused on elicitation of effects on clinical goals and actions. Elicited effects included: target thresholds of concepts (e.g., pre-prandial BG \(\le \) 100 mg/dL), appointment schedule, diet change, measurements change, medication change, physical activity change, and change in the mode of communication with the patient.

B. An instrument for elicitation of relevant CCCs and of their effects on treatment

Based on the generic psycho-social and demographic contexts and on the types of effects that they might have on clinical goals and actions, and on the experience gained from the first sets of interviews and on additional interviews held with clinicians, an elicitation instrument (Appendix 4) was developed to facilitate guideline customization.

Fig. 5 A methodology for elicitation of CCCs which customize clinical guidelines

The elicitation process is presented in Fig. 5 and is carried out by the SCPA modeling team. The elicitation instrument is used within the context of considering a specific clinical guideline and the specific clinical decision points established within the guideline. Ideally, the clinical guidelines should contain flowcharts representing the structure of the guideline, as we would like to go over it and check which non-clinical aspects might change a specific treatment step. Therefore, if such a flowchart is not included in the guideline, then the first step would be to prepare this flowchart, focusing on the decision points related to the clinical state of the patient, while paying attention to non-adherence points and to process risks and complications. To establish the treatment steps in the guideline that are affected based on the generic psycho-social and demographic contexts, the instrument contains a set of questions that guide the thinking process needed in order to introduce CCCs and respective care plans into a clinical guideline.

3.2 Knowledge acquisition and specification of parallel workflows and projection

After achieving a consensus regarding the semantics of the care process based on the evidence-based clinical guideline, the elicitation task splits into the two subtasks of eliciting two parallel workflows (Sacchi et al. 2013). The first is a “traditional” workflow directed at the care professional; the second is a parallel process that focuses on the patients’ behavior and on their interaction with the MobiGuide system (i.e., both the Smartphone and the sensors).

For example, a traditional guideline may define a plan for monitoring the patient’s compliance with diet as a set of instructions to a nurse to check whether the patient reported in her diary that she ate too many carbohydrates more than twice during the past 2 weeks (non-compliance). In the parallel workflow (García-Sáez et al. 2013), this recommendation is translated into an automatic evaluation of the patient’s non-compliance condition every week and delivery of an alert to the patient through the Smartphone. The automatic evaluation is carried out by retrieving data from the patient’s digital log book and checking for a temporal pattern that occurred during the past 2 weeks. This part of the parallel workflow is managed by the mDSS; hence, the relevant CIG plans are indicated as projection points (García-Sáez et al. 2014), which allow control to be passed to the mDSS at CIG application time. The mDSS receives the relevant knowledge in the projection format, which consists of one or more procedural scripts running in parallel.
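The weekly automatic evaluation in this example can be sketched as follows. This is a minimal illustration and not the projection script used by the mDSS: the `DiaryEntry` type, the `is_diet_non_compliant` function, and the parameter names are hypothetical, and only the thresholds (more than two high-carbohydrate reports within the past 2 weeks) come from the example above.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class DiaryEntry:
    """One logbook record (hypothetical structure, for illustration only)."""
    day: date
    too_many_carbs: bool


def is_diet_non_compliant(entries, today, window_days=14, max_events=2):
    """Return True if more than `max_events` high-carbohydrate reports
    fall within the `window_days` days ending on `today`."""
    window_start = today - timedelta(days=window_days)
    events = sum(
        1
        for e in entries
        if e.too_many_carbs and window_start < e.day <= today
    )
    return events > max_events


today = date(2015, 6, 15)
log = [
    DiaryEntry(date(2015, 5, 20), True),  # outside the 2-week window
    DiaryEntry(date(2015, 6, 3), True),
    DiaryEntry(date(2015, 6, 9), True),
    DiaryEntry(date(2015, 6, 14), True),
]
if is_diet_non_compliant(log, today):
    print("alert: diet non-compliance pattern detected")
```

In this sketch the check would be scheduled once a week, and a positive result would trigger the alert to the patient's Smartphone.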

3.3 The modeled and implemented CIGs and the recommendations they provide

Using the elicitation methods described above, we formalized, customized, and personalized two CIGs in very different clinical domains: atrial fibrillation (AF) and gestational diabetes mellitus (GDM). We selected these clinical domains as representing chronic conditions that require monitoring with different sensors and with different patient populations so as to demonstrate that the same architecture, modeling languages, and software components, could be used in different domains. The system architecture and patients’ and care providers’ user interfaces for the two clinical domains were the same (see Fig. 2 for the patients’ user interface). However, the implementations varied (Table 1) in the types of data collected and patterns monitored, in the amount of interaction between the BE-DSS and mDSS, in the types of customized context, and in the amount of patient-specific recommendations and notifications that they provided to patient and clinician users.

Table 1 compares the knowledge of the two CIGs according to overall characteristics of the two domains. The number of concepts (e.g., fasting BG result), data patterns (e.g., 2 weeks of negative ketonuria), and criteria provide a general impression of the complexity of the modeled clinical care flows.

Table 1 also shows the different types of monitoring plans followed by patients. Plan activation was controlled by plan entry and exit conditions, which address clinical or customized context and are part of the CIGs’ knowledge. Upon CIG activation, these conditions are matched against each patient’s personal model to yield patient-personal decision-support recommendations and reminders. In (MobiGuide Consortium 2016), we provide examples of the monitored patterns and conditions, and sample notifications and recommendations sent to patients and care providers, used in the two CIGs.

Table 1 Difference in CIG characteristics between the two clinical domains

CIG customization: CCCs and projection points

During customization, relevant CCCs are elicited using the methods described in Sect. 3.1, and added to the CIGs. Four CCCs were added to the AF CIG: semi-routine and routine schedule, 24 h monitoring, and increased physical activity. Monitoring plans for these contexts vary (two rather than one daily BP measurements for hypertensive patients with BP monitoring prescriptions while in semi-routine context, in which patients may not be as well controlled; two rather than one daily 30-min ECG monitoring sessions when in increased physical activity context; continuous ECG monitoring when in 24 h monitoring context).
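The context-dependent AF monitoring plans listed above can be summarized in a small lookup function. This is only a sketch under stated assumptions: the function name, the context labels, and the dictionary layout are hypothetical and do not reflect how the CIG encodes the plans. It also assumes that BP is measured once daily in every context other than semi-routine.

```python
def af_monitoring_plan(context, bp_prescribed_hypertensive=False):
    """Return an illustrative daily monitoring plan for an AF patient.

    `context` is one of the hypothetical labels "routine", "semi-routine",
    "increased-physical-activity", or "24h-monitoring".
    """
    plan = {
        "ecg_sessions_per_day": 1,     # baseline: one 30-min session daily
        "ecg_continuous": False,
        "bp_measurements_per_day": 0,  # BP only if prescribed
    }
    if bp_prescribed_hypertensive:
        # Two rather than one daily BP measurements in semi-routine context.
        plan["bp_measurements_per_day"] = 2 if context == "semi-routine" else 1
    if context == "increased-physical-activity":
        # Two rather than one daily 30-min ECG sessions.
        plan["ecg_sessions_per_day"] = 2
    elif context == "24h-monitoring":
        # Continuous ECG replaces discrete sessions.
        plan["ecg_sessions_per_day"] = None
        plan["ecg_continuous"] = True
    return plan


print(af_monitoring_plan("semi-routine", bp_prescribed_hypertensive=True))
```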

Two CCCs were added to the GDM CIG: routine and semi-routine context. However, CIG plans did not change; only preferred meal times and reminder preferences could be set by individual patients for these two contexts.

During individual personalization at enrollment, relevant CCCs were initialized and configured with meaningful names for the patients (e.g., “holiday” for semi-routine context). Patients could then activate relevant context using their UI. 24 h monitoring had to be triggered by the clinicians.

Using the knowledge elicitation methods defined in Sect. 3.2, projections and callback points were defined in the CIGs. In the case of AF, most of the knowledge was controlled by the mDSS, as is reflected by the small number of callbacks defined in the knowledge base. On the other hand, most of the knowledge of the GDM guideline was defined in the parallel workflows for patients and care providers.

Monitored measurements

Table 2 presents the expected number of weekly measurements for AF and GDM patients.

All AF patients were prescribed monitoring their ECG for 30 min daily via mobile monitors. These monitors send the biosignals to the patient’s Smartphone using Bluetooth, from which the data is sent to the Backend server at the hospital. The nurse would look at their ECGs once a day to assess whether the MobiGuide AF detection algorithms correctly identified AF events, and would inform the doctor if the situation necessitated intervention, such as change of diagnosis or therapy. All patients received reminders for ECG monitoring at their preferred times, which were context-dependent.

Optionally, AF patients were also prescribed measurement of their BP, weight, or INR, with the values entered manually into the system. The rate of recommended BP measurements depended on context (see Table 2).

The clinical recommendations and measurement frequency for GDM patients did not depend on the patients’ psychosocial context. The frequency of measuring BG, ketonuria and BP depended on the clinical context (stable or unstable), as shown in Table 2. In case an entry condition became true (see lower part of Table 2), monitoring plans were switched automatically by the BE-DSS and the respective knowledge was projected to the mDSS.

Table 2 Expected number of weekly measurements for AF and GDM patients

All GDM patients were provided with portable glucose meters and BP meters that sent data to the Smartphone via Bluetooth. The data was downloaded by the patients to the backend via a menu button on the phone. Patients also had to measure ketones in their urine and manually record results into the system. The GDM system included an option for weight monitoring prescription, although none of the patients on the study were prescribed to do so. Physical activity monitoring plans could be prescribed, but were not prescribed by clinicians in this study. This monitoring was supported using the physical activity detector (PAD) integrated in the MobiGuide application, which provides information about physical activity intensity, energy expenditure and steps and is based on the Smartphone’s accelerometer. Patients activated the PAD and ended sessions manually; continuous monitoring was not a feature of the system.

When patients enrolled into the MobiGuide system, they determined the setting of whether they wanted to receive reminders for monitoring and in the case of AF, also for medications (related to AF and to other comorbidities). They also set the timing of reminders, which could be different under different contexts.

Patients’ self reporting of events and clinical recommendations

AF patients could report symptoms related to AF by selecting them from a drop-down list and indicating their severity (unacceptable, acceptable, absent). Symptom reporting triggered a recommendation to measure their ECG. Additionally, patients could report personalized future events with a risk of bleeding (e.g., a dental visit), which could indicate a change in anticoagulation therapy, resulting in a system notification to schedule an appointment with their doctor. Upon delivery of medication reminders, patients could report taking the medication or not taking it, indicating their reason for non-compliance.

GDM patients could self-report diet non-compliance events and could answer questions sent via the mDSS regarding following diet prescription. The bottom part of Table 2 presents the different clinical recommendations delivered by the system in response to detection of a patient’s patterns of clinical data, depending in some cases on answers to questions posed by the system.

Recommendations to clinicians

In the case of the AF guideline, most of the recommendations to care providers concerned cardioversion decision support. However, in practice, only one patient was eligible for cardioversion. In GDM, recommendations to clinicians concerned starting insulin and changing diet prescriptions.

4 Evaluation

We tested the MobiGuide system in extensive pre-pilot testing with healthy volunteers (described in Appendix 5). This testing showed that the system responded (adapted) correctly to changes in the dynamic patient model (see Fig. 1) and delivered appropriate recommendations.

The evaluation of the system was then performed in a multi-national feasibility study in the clinical domains of AF and GDM. This paper reports results that focus on the patient model and its impact on users’ behavior. An analysis of compliance to specific recommendations and reminders is addressed in a separate paper (Peleg et al. 2017) that examines clinical outcomes.

In the case of AF, the system evaluation study was planned for a maximum of 9 months, considering that the study was done with clinicians’ follow up that was in addition to normal patient care.

Gestational diabetes usually starts after month 5 of pregnancy; the GDM patients could use the system till delivery of the babies, upon which GDM resolves. Thus the evaluation study was planned for a duration of at least 2 months for each patient and a maximum of 5 months.

In addition to the variability of the CIGs for these two guidelines (see Sect. 3.3), the patient user populations varied as well (Table 4). The AF patients were older Italian men and women with a chronic condition and additional comorbidities, and were much less experienced with technology. The GDM patients were Spanish women with an average age of 35, more experienced with technology, who were otherwise healthy apart from complications of pregnancy related to diabetes with or without hypertension. We recruited ten AF patients and twenty GDM patients for the evaluation study. They used the system during April–December 2015.

4.1 Hypotheses

The technical evaluation of the system was successful and showed that the system reacts to the dynamic patient model. In order to design an analysis that would focus on the users’ experience and how it was related to the personal patient model, we defined the following three hypotheses:

  1. Sustainable usage by patients. Most patients will not drop out and will use the system continuously for a long period of time. For AF patients, we considered that patients could be enrolled until 2 months before the end of the project; thus, we expected follow-up lengths from 2 to 7 months. For GDM patients, the expected time period was constrained by the gestational age at which the patient was enrolled in MobiGuide and the gestational age at delivery.

     We conjectured that this behavior could be attributed to a combination of two factors. First, patients should find the system useful and usable, and second, patients in high-risk domains are intrinsically motivated.

     One of the corollaries of this hypothesis is that patients might use the system more at first than later, but they would still perform at least about 60 actions a week for AF patients and 53 for GDM patients. This estimation is based on the numbers reported in Table 2. The number of reminders and recommendations that a typical patient should receive depends on the patient’s context and is estimated as 30 per week for AF patients and 26.5 per week for GDM patients (see Table 2). In addition, we expected that each patient should view (click on) each such reminder or recommendation and also respond by performing a measurement and, for manual monitoring (BP and weight for AF patients and ketonuria for GDM patients), entering the value into the system.

  2. Patients will have positive impressions of the system and will find it useful, especially for increasing their sense of safety and communications with clinicians.

  3. Clinicians will be examining the PHR data (patient model) of patients during visits and in between visits.

The norm is that clinicians examine the patients’ medical records only when the patients are coming in for scheduled visits. So, if the clinicians check the patients’ data more often while using the system, then this is a positive outcome, which shows that clinicians react to the changing patient model.

Note that the changes in patient model happen “under the hood”, within the MobiGuide architecture. Implicitly, the fact that clinicians are inspecting the patient model demonstrates the effect of the changes in the patient model on clinicians’ behavior.
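The usage thresholds in Hypothesis 1 (about 60 weekly actions for AF patients and 53 for GDM patients) follow from the estimated weekly reminders if each reminder is assumed to produce two patient actions, namely viewing it and performing the corresponding measurement. The short sketch below makes this arithmetic explicit; the function name and the `actions_per_reminder` parameter are illustrative, and the factor of two is our reading of the corollary rather than a formula stated by the authors.

```python
def expected_weekly_actions(reminders_per_week, actions_per_reminder=2):
    """Expected patient actions per week, assuming each reminder or
    recommendation leads to one view plus one measurement (two actions)."""
    return reminders_per_week * actions_per_reminder


af_threshold = expected_weekly_actions(30)     # AF: 30 reminders per week
gdm_threshold = expected_weekly_actions(26.5)  # GDM: 26.5 reminders per week
print(af_threshold, gdm_threshold)
```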

4.2 Methods for assessing the hypotheses

Table 3 maps the hypotheses to the methods for their evaluation and the data sources used to evaluate each hypothesis.

Table 3 Methods and data sources for assessing the hypotheses
Table 4 Characteristics of the two patient populations (AF and GDM)

5 Results

The MobiGuide system delivered correct recommendations and reminders according to the personal model. The results of the patients’ usage of the system are reported below, arranged according to the hypotheses, which are related to the effect of the personal patient model on users’ behavior. Clinical outcomes, including compliance to clinical recommendations, will be reported in a separate publication and are briefly mentioned in Sect. 6. We first present the characteristics of patients who participated in the study, followed by the results related to the three hypotheses.

5.1 Patient characteristics (demographics)

Table 4 compares the two patient populations in terms of the number of participating patients, their age and gender, level of education and technological skills, the average number of days spent using the system, dropout rate, and number of times the patients had switched context.

5.2 Hypothesis 1: Sustainable usage by patients

Tables 5 and 6 report in the column “Days on MG” the total number of days that patients had used the system. These tables also report results concerning the correct delivery of decision-support, which is specific to the dynamic and reactive patient model. This includes delivery of recommendations and reminders according to CIG logic that addresses the personal patient model.

Tables 5 and 6 show the number of recommendations that were sent to clinicians and patients (including clinical and technical recommendations for patients). The system monitors declarative knowledge expressions and, when they become true, reacts by sending the relevant recommendations (see MobiGuide Consortium 2016 for a list of the available recommendations). We did not have estimates for the expected frequency of recommendations, as they depend on the specific conditions that trigger them, which relate to the patients’ dynamic clinical state. We observed that AF patients reported AF symptoms, for which they received recommendations to add extra ECG monitoring sessions. On average, patients received one such recommendation a month (the patient who received ECG recommendations most frequently received 4.1 a month). Other recommendations, related to events that are associated with a risk for bleeding, were quite rare.

GDM patients received more clinical recommendations than AF patients: on average 1.9 clinical recommendations a week (8.1 a month), with the patient who received clinical recommendations most frequently receiving 6.4 a week. However, the numbers are not comparable, because the clinical conditions that trigger these recommendations are different. AF patients received recommendations when they reported risky events or AF symptoms, whereas GDM patients received recommendations when patterns relating to BG, ketonuria, or BP values occurred; some of these patterns could occur weekly or even every two days (for ketonuria). The frequency of technical recommendations delivered to AF patients was about five times higher than the rate of clinical recommendations. In the case of GDM patients, the frequency of technical recommendations was much lower than that of clinical recommendations (60%). Technical recommendations are related to quality of data entered, need to charge the battery, and problems in wearing the ECG monitor.

Recommendations to AF clinicians were quite sparse, because only one patient met the criteria for cardioversion advice. Recommendations for GDM clinicians were generated for all patients, but were still quite low as compared to the recommendations to patients (around 1.5 recommendations a month per patient to clinicians vs. 8.1 recommendations to patients).

Table 5 Delivery of DSS for AF patients: response to changes in dynamic patient model
Table 6 Delivery of DSS for GDM patients: response to changes in dynamic patient model

In addition, Tables 5 and 6 show the number of reminders that the system delivered to patients and the number of times that recommendations and reminders were viewed by patients, for which we had estimates. Nine of ten AF patients and all of the GDM patients chose to receive measurement reminders, and 6 of 8 AF patients with drug prescriptions chose to receive medication reminders. As Table 5 shows, the ECG reminders were issued by the MobiGuide system at roughly the expected frequency of two daily reminders (see Footnote 1). The number of drug reminders delivered depended on the therapeutic plan of each patient (some might have only one medication to be taken daily, while others may have multiple doses of several drugs each day), which can also change over time; thus, we did not expect a fixed frequency for this kind of reminder. GDM patients received BG and ketonuria reminders at a lower rate than expected. Technical problems were reported for some patients, such as patients 2, 13, 14, and 17, who received few reminders.

All of the columns of Tables 5 and 6, except for the last one, focus on correct system behavior. But an important component of Hypothesis 1 concerns expected patient behavior. There was much similarity between AF and GDM patients in terms of their usage of the system. All patients used the system mainly to monitor, record, and view the most important parameters of their health conditions, i.e., ECG and BP for AF patients and BG, ketonuria, and BP for GDM patients. Viewing of the logbook data was correlated with the amount of data entered for each parameter. The logbook data included monitored parameters, therapies, patient-reported symptoms, risky events, medication non-compliance for AF patients, and diet non-compliance by GDM patients, as reported in the last column of Tables 5 and 6. As can be seen, AF patients viewed at least 95% of the total number of recommendations and reminders that were delivered to them, and GDM patients viewed 92%. However, this is a lower limit, because clicking on any logbook view displayed all data collected until that time, so patients could have seen several recommendations or several data values within a single click. In addition, the calculations regarding the number of data visualizations and access to different features of the MobiGuide application, such as downloading sensor data, were based on the actions registered in a mobile log while patients were using the application. According to the design of the mobile application, patients need to visualize data before entering new measurements either manually or automatically. For this reason, the patients’ data visualization patterns were computed as the total number of visualizations performed minus the total number of monitoring data uploaded to the backend, per day. This calculation has limitations because some patients entered several measurements at the same time (e.g., glycemia values were measured on average 4.1 times a day but they were downloaded 2.63 times per week).
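The per-day visualization computation described above can be sketched as follows. The function and variable names are hypothetical, the counts are invented for illustration, and flooring the result at zero is our own assumption for days on which uploads exceed logged visualizations; the text does not say how such days were handled.

```python
def net_views_per_day(visualizations, uploads):
    """Subtract, for each day, the number of data uploads from the number
    of logged visualizations, since each upload forces one visualization.

    Both arguments map a day label to a count. Negative results are
    floored at zero (an assumption, not stated in the text).
    """
    days = sorted(set(visualizations) | set(uploads))
    return {
        day: max(visualizations.get(day, 0) - uploads.get(day, 0), 0)
        for day in days
    }


# Invented example counts from a mobile log.
visualizations = {"2015-06-01": 7, "2015-06-02": 3, "2015-06-03": 1}
uploads = {"2015-06-01": 2, "2015-06-02": 3, "2015-06-04": 1}
views = net_views_per_day(visualizations, uploads)
print(views)
```

As the text notes, batch uploads limit this estimate: one upload may carry several measurements, so subtracting one visualization per upload can overstate voluntary views.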

Figure 6 shows the number of actions performed by AF and GDM patients over time. As shown, the total counts of weekly actions performed by both groups of patients were high (average and standard deviation of actions per week: \(200.2\pm 56.2\) for AF and \(150.4\pm 66.6\) for GDM). Although most patients experienced a decline in usage over time, they kept using the system for the whole period (the average number of actions per week in the last 2 weeks was 157.4 for AF patients and 110. for GDM patients). These numbers are far higher than the number envisioned as indicating successful usage by patients, which was estimated to be at least 60 for AF patients and 53 for GDM patients. A few patients used the system less during some weeks (observe AF patient 7 at weeks 24–25 and AF patient 10 at weeks 9–11): AF patient 7 brought the smartphone to the doctor’s office to have his report of a potential problem with drug reminders checked, and AF patient 10 was measuring her ECG and using the app only when her daughter was around. One GDM patient used the system less when she started insulin treatment (see GDM patient 4 at week 4), while two patients had a lower number of actions in the first week due to technical problems (see GDM patient 1 and patient 2).

Most patients performed a higher number of actions during the first week of usage (average and standard deviation of actions in the first week was \(365.7\pm 112.1\) for AF patients and \(327.3\pm 121.7\) for GDM patients), most probably because they spent additional time learning about different scenarios of the application.

Fig. 6 Temporal trends of patients using the MobiGuide system. a AF patients; b GDM patients. Numbers on the y-axes represent total actions per week. Average number of actions shown in a dashed line. *Data for AF patient #4 was not available

5.3 Hypothesis 2: Patients will have positive impression of the MobiGuide system

Table 7 reports the results of the patients’ end-of-study questionnaires. The table provides the translation of the questions’ exact phrasing for the AF and GDM domains. Patients were asked to answer on a Likert scale of 1 (don’t agree) through 5 (perfect agreement). The number of patients who gave each rating is provided. Nine of ten AF patients filled out the questionnaire. Two GDM patients failed to complete the questionnaire because they lived far away from the hospital and could not deliver the questionnaire after the pregnancy; the patient who dropped out after 1 week was not asked to fill out the questionnaire.

Table 7 Results of patients’ end-of-study questionnaires

As shown in the table, AF and GDM patients agreed in general that the application was interesting, easy to use, and easy to learn, that the sequences of activities were clear, that the application’s response time was appropriate, and that they had rarely experienced errors. We were very pleased by the patients’ perceptions of usefulness: most patients agreed that the system increased their confidence, that it increased their peace of mind while travelling (as decision support was available to them anytime and everywhere), that it improved their interaction with the clinical staff, that it did not complicate their lives, and that it helped them visualize and interpret their data. Most patients liked the fact that the system could adapt to their personal context. Most patients said that they would recommend the system to other patients with their condition and to friends of theirs, and that they would continue to use the system themselves. In addition, AF patients were willing to pay for using the system, while GDM patients were not. Patients did not feel that the system improved the response time of the clinicians, perhaps because it was already good without MobiGuide.

5.4 Hypothesis 3: The clinicians will be examining the data (patient model) of patients during visits and in between visits

In the AF clinic, patients were expected to visit their doctor once a month. However, the AF nurse told us that visits of the MobiGuide AF patients were not formally scheduled and documented; patients were dropping in to see her more often than the planned visits of once every 1–2 months. Therefore, we compared the number of days on which clinicians had viewed the patient’s data (via their caregiver user interface) not to the expected monthly visit, but to a larger number of expected visits (once every 10 days), which accounts for patients’ unscheduled drop-in visits.

GDM patients had weekly visits each Tuesday. Figure 7a shows the clinicians’ total number of actions (mostly views) per day for each GDM patient. We were expecting to see a peak on Tuesday and were hoping to also see activity on other days (between visits). The GDM clinicians told us that they had sometimes used the patients’ Smartphone GUIs in order to view some of the patients’ data. Figure 7b shows the logged patients’ daily actions. The actions on that graph represent the total actions by patients and by the clinicians who used the patients’ GUI on visits. Hence, we were expecting to see a peak on Tuesday on that graph as well. Peak activity for clinicians was observed on Tuesdays but also on Thursdays and Fridays. Peak activity for patients was observed on Tuesdays.

Fig. 7 a Total number of GDM clinicians’ actions performed for each week day for each patient; b total number of GDM patient actions performed for each week day by each patient

Tables 8 and 9 present the Wilcoxon signed rank test results for the AF and GDM patients, comparing the #days of caregiver views to the planned number of days of views (Total #days on MobiGuide/10 for AF and Total #days on MobiGuide/7 for GDM). As reported below each of these tables, the number of views was in both cases significantly greater than the planned number of visits, indicating that clinicians have viewed patient data between visits.
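For readers unfamiliar with the test, the sketch below shows how a Wilcoxon signed-rank W statistic of this kind can be computed from paired counts. It is a textbook-style illustration, not the authors' analysis code: the function name and the sample numbers are invented, and the source does not say which software was used. The planned number of view days is derived as days on the system divided by the visit interval (10 for AF, 7 for GDM), as described above.

```python
def wilcoxon_w(observed, expected):
    """Return (W, n) for paired samples: W is the smaller of the positive
    and negative rank sums, n the number of non-zero differences."""
    diffs = [o - e for o, e in zip(observed, expected) if o != e]
    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        # Find the run of tied absolute differences starting at i.
        j = i
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, ordered) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, ordered) if d < 0)
    return min(w_plus, w_minus), len(ordered)


# Invented data: days on the system and days with caregiver views.
days_on_system = [120, 90, 200, 150, 60]
view_days = [30, 14, 41, 22, 9]
planned = [d / 10 for d in days_on_system]  # one planned visit per 10 days
w, n = wilcoxon_w(view_days, planned)
print(w, n)
```

The resulting W is then compared with the critical value for n at the chosen significance level; a W at or below the critical value indicates a significant difference. When every patient has more view days than planned visits, the negative rank sum is empty and W is 0.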

Table 8 Number of views by AF clinicians
Table 9 Number of views by GDM clinicians

In addition, we performed similar tests for the cohort of patients, testing the difference in number of actual versus planned day views for the entire patient population. Figure 8 summarizes the results.

The personalized decision support of the MobiGuide system impacted clinicians. The AF doctors changed their diagnosis for two of ten AF patients (and as a consequence changed their therapy). One of these patients had complained of arrhythmias for years, but these had never been observed during monitoring sessions at the hospital. She was diagnosed with suspected AF. With the aid of MobiGuide, the patient was diagnosed as having an arrhythmia that is not AF. The other AF patient had been diagnosed with paroxysmal AF, but self-monitoring using MobiGuide confirmed that in fact this patient had permanent AF. MobiGuide also impacted the behavior of the GDM clinicians. For eleven of the twenty GDM patients, the system anticipated a therapy change that was accepted by the GDM clinicians; ten patients started insulin treatment after receiving MobiGuide recommendations and one had a diet therapy change (not included in the group of patients with insulin treatment).

Analyzing what the most common physician actions were, we found that for AF patients, the most frequent action was visualizing the synthesis of the ECG monitoring sessions performed by the patients. Other frequent actions related to the visualization and prescription of medications and the visualization of the recommendations. For GDM, the most frequent action was to visualize important recommendations regarding the patient status. The second most frequent action was the visualization of the synthesis of the BG monitoring sessions.

The W-value is 0. The critical value of W for \(N=10\) at \(p\le 0.05\) is 10. Therefore, the result is significant at \(p\le 0.05\).

The W-value is 29. The critical value of W for \(N=19\) at \(p\le 0.05\) is 53. Therefore, the result is significant at \(p\le 0.05\).

Fig. 8 Bar chart comparing the planned frequency of visits to the actual views by clinicians for the cohorts of AF and GDM patients

6 Discussion

MobiGuide is an interactive decision-support system that can be customized and adapt itself dynamically to its users’ incrementally changing personal model, which includes longitudinal clinical data and personal preferences and context. In this paper we discussed the role of patient models in the adaptation process. Most of the current CIG-based DSSs are used only by clinicians, who are guided by the patient-specific recommendations provided by the CIG’s declarative knowledge, which constitutes the patient model. The declarative knowledge in other CIG-based DSSs is not customizable and it refers only to clinical concepts rather than to patients’ personal context and preferences. MobiGuide, on the other hand, is innovative both in having patients as users and in having a patient model that is adaptive to the patients’ personal local and global preferences and contexts. These different aspects of personalization are addressed using innovative means, including projections and DIRCs. Moreover, we have developed methods for eliciting parallel workflows, psycho-social and demographic context and personal utilities that once specified, can support knowledge projection, context customization and personalization, and global decisions based on decision trees with personal preferences (utilities) (Rubrichi et al. 2015; Parimbelli et al. 2015).

Our evaluation study used the same underlying MobiGuide system to create DSSs for two very different clinical domains, with different instances of the knowledge base (see Table 1), differences in some of the sensors, and different hospital EMRs from which data were imported into the MobiGuide PHR. This architecture can potentially be used in other clinical domains after appropriate CIGs are encoded and EMR exporters are written to export existing medical records into the MobiGuide PHR. The feasibility of using the HL7 virtual medical record (the data model used by MobiGuide’s PHR) to cover a large range of medical data items has already been shown (González-Ferrer 2016). In addition, the Asbru language used in MobiGuide and its supporting tools have been successfully used in various other clinical domains (Shalom et al. 2015).

The study reported in this paper focuses on three hypotheses that stem from the personal patient model and how it impacts the behavior of patients and clinicians.

Hypothesis 1: sustainable usage by patients

Our first hypothesis was confirmed. It is a well-known phenomenon that patients tire of using (medical) applications after several weeks, especially monitoring applications (Consumer Health Information Corporation 2012). Yet our results show that AF patients used the MobiGuide application consistently for a mean of 4.2 months (and up to 8.6 months) and GDM patients used it for a mean of 8.2 weeks (and up to 3.2 months). Moreover, AF patients performed an average of 200.2 weekly actions per patient and GDM patients an average of 150.5 weekly actions per patient (Fig. 6). While AF is a chronic condition, GDM does not start before the fifth month of pregnancy (and patients may have joined MobiGuide even later than that) and lasts until delivery. Hence the amount of time that GDM patients used the system was naturally shorter than that of the AF patients. The fraction of days on which GDM patients used the system, out of the potential number of days on which they could have used it (from enrollment until delivery), was 89%, which means that patients used the system most of the time (and continuously, as shown in Fig. 8). Patient 20 stayed for a shorter period than expected because the project ended, and six patients stopped using the system 1 week before delivery, decreasing the expected usage time per patient.

Patients’ actions were in accordance with the CIG-based reminders and recommendations, which were adaptive to the patient model. AF patients performed ECG monitoring at an overall rate of 0.65 ECGs per day, which is encouraging, especially considering that these were older adults not experienced in using technology. Compliance of AF patients to BP monitoring in routine context was 0.75. However, compliance with BP monitoring in semi-routine context was very low (\(0.03\pm 0.03\)). This may be due to over-burdening these patients with extra measurements (see Hypothesis 3).

Compliance rates of GDM patients were even higher, with observed compliance of 0.82 and 0.96 to recommended BP and ketonuria plans. Compliance with BG monitoring plans (either daily or twice weekly) was 0.99 and compliance with four measurements a day was 0.87. The latter matches the BG monitoring compliance rate of \(0.87\pm 0.28\) that was observed for a historical cohort of 247 patients diagnosed with GDM that was followed at the same hospital department 2 years before our experiment, during 2010–2013 (Villaplana et al. 2015). However, the patients in that cohort were requested to measure at least (and not exactly) 4 BG measurements a day and their compliance was calculated such that measuring 4 BG measurements a day yields full compliance (of 1.0) but more measurements could yield a compliance greater than 1. Using the same calculation for the MobiGuide cohort yielded a compliance rate of \(1.01\pm 0.10\), which was significantly higher than that of the historical cohort (p value of 0.0312). A similar baseline could not be obtained for the AF domain, as per current best practice, AF patients are not provided with ECG monitors at home nor with the ability to report symptoms in real time; hence the MobiGuide system empowers patients beyond current practices.
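The difference between the two ways of scoring BG monitoring compliance can be made concrete with a short sketch. It assumes that the stricter score counts at most the prescribed number of measurements per day, whereas the historical-cohort score divides all measurements taken by the number prescribed, so that extra measurements can push the value above 1. The exact formulas are not spelled out here, so both functions are hypothetical illustrations, and the daily counts are invented rather than taken from the study.

```python
def capped_compliance(daily_counts, prescribed_per_day=4):
    """Compliance where measurements beyond the prescribed number do not count."""
    counted = sum(min(c, prescribed_per_day) for c in daily_counts)
    return counted / (prescribed_per_day * len(daily_counts))


def uncapped_compliance(daily_counts, prescribed_per_day=4):
    """Compliance where extra measurements count, so the value can exceed 1."""
    return sum(daily_counts) / (prescribed_per_day * len(daily_counts))


# Hypothetical week of BG measurement counts for one patient (not study data).
week = [4, 5, 3, 4, 6, 4, 2]

strict = capped_compliance(week)      # 25 counted measurements out of 28 prescribed
lenient = uncapped_compliance(week)   # 28 measurements taken out of 28 prescribed
```

In this invented week the patient misses measurements on two days but measures more often on two others, so the uncapped score reaches full compliance while the capped score does not. This is why the two cohorts are only comparable when the same calculation is applied to both.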

While the compliance of GDM patients to monitoring plans (reminder-based) was higher than that of AF patients, their compliance with the more sparse clinical recommendations (regarding diet due to abnormal BG or ketonuria) was lower (0.3–0.5) than the perfect compliance of AF patients. However, this may be due to the fact that only two AF patients received one recommendation each (related to risky events and to anticoagulants) whereas all GDM patients received several recommendations each, related to BG or ketonuria control.

While monitoring of the most important health parameters (ECG and BP for AF and glycemia, ketonuria, and BP in GDM) was well followed in both domains, monitoring of other measurements was not followed as well. This may reflect the perceived importance that patients attribute to different recommendations and their connection to their primary health concern. Other reasons could relate to the fact that the other measurements were prescribed to be done less frequently; INR was to be recorded every 2 weeks by one patient, and weekly weight measurements by two patients, and the measured values were almost always within the normal range.

Part of patients’ empowerment is their ability to report information to the system, such as their symptoms and measurements that are not automatically uploaded from mobile biosensors, and also to view their data and recommendations. AF patients often reported symptoms related to their condition, with a mean of one symptom per month. The percentage of recommendations and reminders viewed by patients was also very high (95% for AF patients and 92% for GDM patients of total recommendations and reminders generated).

The patients’ sustainable usage of the system for monitoring their health and following personalized recommendations is in line with the fact that only one of the 20 GDM patients and none of the ten AF patients had dropped out and that three of the AF patients did not want to return the equipment at the end of the study and kept using the system while budget permitted for several additional weeks.

Hypothesis 2: positive impressions by patients

Despite the large difference between the two evaluation studies of the MobiGuide system, the system was effective and appreciated by the two patient populations (Table 7). Patients found the application useful. Overall, the sense of safety that the system has provided to the patients was its greatest asset. When interviewing the patients in person, many of them chose to tell us about this quality of the system. One patient said “With the system I feel the doctor by my side, as if he is hugging me”.

Hypothesis 3: clinicians will examine system-collected patient data also outside visits

The clinicians examined the patient-personal data that justified the personalized recommendations output by the system. For the AF domain, the most important data was ECG sessions and medication prescriptions. For the GDM domain, the clinicians viewed the MobiGuide recommendations sent to patients and correlated them with patient data, including BG, BP and ketonuria. Clinicians also examined patients’ global and local preferences (e.g., meal times) to make their decisions.

For both AF and GDM clinicians, the frequency of actions done by clinicians was significantly larger than the planned number of visits; the clinicians viewed the patients’ models also between patient visits. The AF nurse looked at the entire ECG signal every day, and if there was something worth noting, she alerted the cardiologist. Due to the pilot nature of the evaluation study, patients were invited to schedule visits with their clinicians often (every 1–2 months for AF patients and every week for GDM patients), yet the safety of patients who chose to only be monitored from home was maintained. In fact, some AF patients only had the enrollment and the end-of-study visit, and used the system from home for the whole duration of the study. This is exactly the intention of the MobiGuide system: that using the system the patients would be monitored and safely stay in their normal environment and only come in to the clinics when required by their clinical state.

The GDM clinicians told us that a few days after each Tuesday visit, and especially just before the weekend, they checked again the patients’ model to see how the patient was doing. In line with the statement of GDM clinicians that during the weekly patient visits they used the patient’s smart phone to check the patient’s data, which was more efficient for them, we observed a peak activity on the patients’ GUI on Tuesdays. An additional smaller peak in patient’s actions was observed on Fridays, in line with the request that patients should download their data every 3 days or so.

Perhaps the most impressive outcome was that as a result of observing the changes in the patient model and the patient-specific recommendations delivered to them, the AF clinicians changed the diagnosis for two AF patients and the GDM clinicians started insulin treatment for two of the twenty GDM patients (ten patients started insulin treatment after MobiGuide recommendations, but for two patients, the system detected the need to start insulin earlier than clinicians did).

6.1 Limitations of our approach

One of the largest limitations of our approach is that knowledge acquisition requires a lot of effort. Even the CIG Specification stage, which is also done in non-personalized clinical DSSs, is very laborious, despite the fact that it begins with a narrative clinical practice guideline as a starting point. However, as in the work of Camerini et al. (2011), “the design of a personalized health system, must consider the balance between evidence based medical guidelines, the feasibility of their implementation, and the modeling of the system”. Here, we need to elicit from clinicians information that is not contained in the original guideline regarding opportunities for involving and empowering the patient using mobile technology in the patient’s normal environment. These opportunities are usually not open to clinicians, who are not experienced in thinking about them. The elicitation methods that we have developed have been successful, but they require much time and effort from busy clinicians. Another problem is that reuse of encoded plans by other CIGs is in most cases not suitable.

Another disadvantage is that our personalization addresses personal preferences and context but only partially addresses comorbidities that are particular to different patients, by delivering recommendations regarding drugs for comorbidities, unlike the more complete approaches to handling comorbidities in Riaño et al. (2012), Grandi et al. (2012), and Grandi (2016). It is not realistic to think that older chronic patients will suffer from a single disease, yet clinical guidelines focus on a single disease. The AF and GDM CIGs addressed the hypertension comorbidity explicitly. In addition, we have added to the declarative knowledge concepts representing each medication that any of the patients were taking, so that they could receive personalized reminders. Nevertheless, systematic methods for integrating CIGs, each focusing on a different disease, should be developed as future research to support detection of conflicts and their resolution (Peleg 2013).

Except for the availability of BG monitoring compliance rates of a historical GDM cohort, we have not evaluated user behavior against a control; some of the impacts might have been achieved simply by providing the sensors and reminders, without the DSS.

An important limitation is that there was not much opportunity to demonstrate the impact of CCCs, but we think that CCCs would prove important for other clinical domains. Here, the different CCCs that were added during the customization of the two CIGs did not affect the care plans to a large extent; in the case of AF, the semi-routine context impacted only the frequency of measurement of BP, as compared to the frequency in routine context. For GDM, the different clinical recommendations did not change for different context, only the ability to adjust the reminder times per context.

The fact that certain system bugs were not noticed during the pre-pilot stage also limited our potential analysis. Specifically, in the AF system, reminders did not always work as planned; ECG reminders were the only ones that were always delivered, whereas other reminders did not work consistently. Hence we could not fairly evaluate the impact of the routine versus semi-routine CCCs. In addition, due to the small size of the patient cohorts, we did not have a chance to evaluate the full scope of the CIG knowledge, including the two other AF CCCs: intensive physical activity and 24 h ECG monitoring. Nor could we check the impact of variability in patients’ local preferences regarding reminders on patients’ behavior, because most patients did not change their reminder ON/OFF settings between contexts. Finally, a detailed analysis of clinical impact and compliance is beyond the scope of this paper and is reported elsewhere (Peleg et al. 2017).

7 Conclusion

MobiGuide is a personalized and patient-centric clinical DSS based on clinical practice guidelines that involves patients in the management of their disease in their normal environments. The results of our multi-national feasibility study demonstrated clearly the feasibility of the MobiGuide architecture and its adaptive behavior that adjusts itself to the patient’s personal model. The evaluation has confirmed the three hypotheses regarding the effect of the personal patient model on users’ behavior; hence the personalization has met its goal. First, the results demonstrate that the users of the system, both patients and physicians, have used the system consistently for substantial periods of time and found that it provided them with multiple benefits, which stem from the personalized patient model. Second, the most substantial benefit to patients was the increase in their sense of safety as well as in their involvement (demonstrated through a high compliance rate), which was enhanced by the personalization and patient-centrality features of the system. Third, clinicians have used the system to follow up on their patients’ models outside the planned visit, which fits the intention of the system to save visits for well-controlled patients. In addition, the system has affected clinicians’ behavior resulting in change of diagnosis for two of the ten AF patients and anticipated change in therapy for eleven of the twenty GDM patients.