1 Introduction

Recent IT advancements such as the internet of things (IoT) have led to the development of an increasing number of connected infrastructures of medical devices, software applications, and healthcare services [21]. A health information system (HIS) supported by health IoT not only allows care providers to monitor patients in real time based on patient data remotely acquired by sensors but also facilitates efficient self-management of health conditions by patients themselves. The smart asthma management (SAM) system implemented by Propeller Health is a good example [38]. In this system, a Bluetooth sensor is attached to the personal inhaler that asthma patients carry in their daily life. The sensor records the timestamp of each inhaler use. The detailed inhaler usage logs are then transmitted to the patient’s smartphone and, eventually, to a server at the company. This new type of HIS has opened a window of opportunity to enhance patient care because the data collected from IoT devices provide richer insights about patients than were readily available in traditional healthcare settings [13, 21].

The capability of health IoT to consistently interact with patients is expected to significantly improve care outcomes, especially in chronic disease management [51]. Managing chronic conditions, such as asthma and diabetes, often requires individual patients to make day-to-day decisions about their disease, and a HIS powered by health IoT can support these decisions by recommending appropriate care pathways without geographical or time limitations [31]. However, such an IoT-enabled intervention depends critically on the extent to which patients are willing to accept these recommendations because patients with chronic conditions tend to have a high degree of autonomy in making these decisions. Essentially, patients may or may not take the recommended action, depending on their perception of the system [42]. Patient adherence to a course of treatment plays a crucial role in improving care outcomes because managing a chronic disease such as asthma is done mostly outside the clinical setting and requires a high level of patient involvement [41]. In conventional clinical settings, a physician provides clinical guidance to patients. Following the suggestions of the care provider is undoubtedly important for patients to improve their medical condition. No matter how good the physician’s clinical judgment is, the patient’s health outcome will not improve if the patient does not adhere to the recommendations made by the medical professional. Likewise, for a HIS to improve patients’ health outcomes, it is crucial that patients follow the recommendations made by the system [37]. Because this nonadherence issue is naturally embedded in IoT-enabled HIS, it is important to consider the interaction between patients and health technologies as early as the system design phase.

As discussed in the literature, patient experience in following a medical intervention alters the adherence levels of individual patients. For instance, a false positive result from a medical test can deter future adherence as patients become less confident in following the care plan in response to the negative consequence of the inaccurate test result [24]. In this paper, we study how the outcomes of medical intervention made by a HIS can affect patient adherence to future intervention: a mistake made by the system (e.g., a false alarm or a misdetection) decreases patient adherence to the system, whereas an accurate diagnostic decision made by the system can increase patient adherence. To investigate this dynamic interaction between patients and HIS, we focus on the patient’s perceived trustworthiness of the HIS as a factor affecting patient adherence. It has been reported that patient trust in medical technologies, including health IT, could affect their overall perception of the healthcare delivery process and could enhance or diminish patient adherence to medical recommendations [31, 42].

However, despite the importance of the patient-HIS interaction, the question of how to properly incorporate this human factor into designing and operating a HIS has not been thoroughly explored in the current literature. In this study, we establish a partially observable Markov decision process (POMDP) model tailored to the SAM application. The POMDP model is the same as the conventional Markov decision process (MDP) model, except for the partial observability of the state [6]. In our model, the state includes two aspects: the patient’s asthma control status and trust level. The state evolves according to the state transition function (state transition probability matrix). In typical MDP models, the decision maker can directly observe the state. However, that is not the case in many clinical applications. Rather, the state is hidden and needs to be inferred from observations (biomarkers). The SAM system remotely collects the patient’s rescue inhaler usage, which is one of the key measures for accurately identifying the underlying asthma control status of patients. On the one hand, the clinical diagnosis of asthma control relies not only on the rescue inhaler usage pattern but also on various other factors, such as peak expiratory flow. In that sense, the patient’s true asthma control status is not fully observable. On the other hand, if the patient goes through an on-site clinical consultation, the asthma control status can be discovered by medical professionals. In this case, we may fully observe the asthma control level. However, the trust level of a patient is never fully observable by the SAM system, making the state partially observable at best. Therefore, given the partially observable nature of the state, the POMDP model is well suited to our study. In the POMDP model, the decision maker can take a certain action. In the SAM application, the system can either alert the patient (action Alert) or keep monitoring the patient while remaining silent (action Wait).
The goal is to find the best alerting strategy for maximizing patients’ quality of life by carefully defining the objective (reward) function. It considers both the short-term and long-term improvements of patients’ quality of life. In recent years, the POMDP model has been widely adopted in the field of medical decision-making to identify the optimal policy for certain clinical interventions [7], and we believe that the POMDP model is a plausible analytics tool for long-term chronic disease management applications as well.

To fill the research gap in the literature and be consistent with the vision of smart and connected health, we take an analytic approach using the POMDP model to investigate the nonadherence issue of the IoT-enabled HIS. Specifically, using the SAM system as a motivating example, we study how HIS should alert patients to maximize the patients’ quality of life considering the human-in-the-loop nature of HIS. Our modeling framework considers the dynamic interaction between patients and the SAM system. In this paper, we aim to accomplish the following research objectives:

  • Finding the optimal alerting strategy considering the trust-dependent patient adherence to the HIS recommendation.

  • Showing the positive impact of developing a trust-aware alerting strategy on patients’ quality of life through comparative studies with the current practice of asthma management.

  • Investigating the importance of trust-aware HIS design under various practical scenarios, such as heterogeneous tolerance levels among patients for receiving alerts.

The important role of an information system (IS) in terms of improving the quality of healthcare through advanced health technology and analytics is well recognized in the literature [11, 28]; hence, various algorithmic developments have been made [3, 9, 17, 32], which collectively take a valuable step toward data-driven, just-in-time, and just-for-me clinical intervention [31]. If the system were perfect, 100% adherence would yield the best health outcome for patients. Unfortunately, it is very challenging (if not impossible) to establish a HIS with perfect prognostic/diagnostic accuracy. In that sense, every alert sent out to patients inevitably carries a risk of being a false alarm, which reduces the trust level of patients and negatively affects their future adherence behavior. Consequently, it will have a negative impact on patients’ quality of life. If the system is too conservative in terms of alerting patients to avoid undesirable false alarms, the misdetection rate will increase. In this scenario, similar to the false alarm case, both patients’ trust and quality of life will be negatively affected. Considering this important trade-off, in this paper, we investigate the real-world issues faced by the IoT-enabled HIS industry and provide a meaningful discussion to help HIS administrators design the best alerting strategy.

The rest of this paper is organized as follows. Section 2 introduces our research setting describing the SAM system in detail, along with a brief review of the relevant literature on patient adherence. In Sect. 3, we introduce our POMDP model, based on which we provide analytical and numerical results in Sects. 4 and 5, respectively. Then, we discuss the practical implications and potential challenges for implementing our method in healthcare practice in Sect. 6. Last, Sect. 7 concludes the paper.

2 Research background

We first describe general asthma care practice and the SAM system to provide a better view of our research setting. Then, we discuss how the SAM system (or any IoT-enabled HIS) differs from traditional healthcare delivery systems by having human factors in the loop. Also, we give a brief discussion of trust-dependent patient adherence based on the existing literature.

2.1 Asthma care practice and the SAM system

Asthma is a prevalent respiratory disease affecting a large portion of the global population [49]. Adult patients diagnosed with asthma need to pay attention to their asthma control status because poorly controlled asthma could significantly reduce patients’ quality of life. Asthma patients inhale two types of medications: controller and rescue medicines. Care providers prescribe a proper daily dose of controller medicine so that patients can keep their asthma under control. When patients experience exacerbated asthmatic symptoms, such as shortness of breath or severe coughing, they are advised to administer a dose of rescue medicine for quick relief. The asthma control level can be rigorously diagnosed by medical professionals based on numerous biomarkers (e.g., a nighttime awakenings survey and the peak flow rate as measured by a breathing test). However, there is no gold standard for asthma control diagnosis, and various medical agencies have published their own guidelines (e.g., [4, 5]).

The HIS has been widely appreciated, especially in asthma management, because it can facilitate consistent and efficient asthma self-management [51]. The SAM system is implemented and operated by our industry collaborator, a healthcare IT company headquartered in Madison, WI, focusing on respiratory diseases, including asthma. The company has developed a Bluetooth sensor that is attachable to personal inhalers. The sensor has received 510(k) class II clearance from the U.S. Food and Drug Administration and has passed a series of tests on various capabilities, such as sensor actuations and data capturing, as required by Federal Communications Commission licensing standards [27]. The attachable sensor collects timestamps of every inhaler use and transmits the data to secure Health Insurance Portability and Accountability Act of 1996 (HIPAA)-compliant servers through patients’ smartphones or a designated wireless hub provided to patients who do not own a smartphone. The sensor itself can hold data for about 3,900 events in case a reliable wireless network cannot be established for a period of time [52]. The battery of the sensor can last for over a year without a charge, and the sensor constantly sends a signal to the server so that the system can identify any sensors that have been turned off due to a dead battery [51]. The SAM system offers a dashboard that summarizes the history of inhaler usage for each patient, and the dashboard can be accessed via web browsers or a smartphone app. The SAM system has great potential for innovating the current practice of asthma care because the key to successful long-term asthma control is effective self-management [50]. In the literature, it is shown that having an accessible data summary tool significantly improves health outcomes [23]. In fact, the SAM system can noticeably improve asthmatic symptoms by motivating patients to pay more attention to self-monitoring their asthma [27].
The SAM system monitors patients consistently and provides the data to care providers, filling the gap between periodic on-site clinical consultations. The SAM system, based on a data-driven analytics algorithm, can actively alert patients if needed. Such efforts can potentially prevent catastrophic events (such as emergency room visits) [51].

2.2 Human-in-the-loop HIS

The SAM system is an innovative HIS that utilizes IoT health devices to collect and analyze healthcare data in support of asthma management decision-making. As illustrated in Fig. 1, all HIS with such features have two stages where the system and end users (patients) directly interact: (i) when the HIS acquires data from patients and (ii) when patients receive a recommendation. In many cases, bias is inevitable in stage (i), which deteriorates the quality of decisions made by the analytics in the subsequent stages. For instance, in asthma management, patients’ self-reported symptoms and inhaler use can distort reality, which leads to sub-optimal recommendations by the HIS [40]. Designing a bias-aware algorithm could be a viable choice to mitigate the bias introduced in stage (i) [3], but a more straightforward way to address the issue is to simply avoid unreliable self-reporting practices and automate the data acquisition process instead.

Fig. 1 Illustration of the human-involved closed-loop HIS

Unlike stage (i), the second source of human-related bias has not been discussed much in the current literature. Stage (ii) shows that patients who have received a suggestion from the HIS may or may not act accordingly. The SAM system can alert patients when their asthma control seems to be worsening [50] or when their rescue inhaler usage exhibits an unusual pattern [51]. The alert sent from the SAM system is a short message recommending an in-person visit to the care provider for further diagnosis. Despite some algorithmic developments for detecting and predicting undesirable asthma progression trends, designing a good alerting policy for the SAM system is not a trivial task because the system never knows whether the patient will follow the decision made by the system. Thus, it is crucial to investigate the patient nonadherence issue in the HIS context, which is largely affected by how much patients trust the HIS and its suggestions.

2.3 Trust-dependent patient adherence

The patient nonadherence to medical recommendation in chronic care has been investigated extensively in the medical literature due to its prevalence and potential harm to patients [19]. Patients adhere to the doctor’s prescription and therapeutic recommendations better if they trust their care providers [29]. Likewise, when patients are recommended to take any action for care through a HIS, they may or may not take the action on the basis of their perception of the system [42].

Studying how patients diagnosed with asthma react to the alert message generated by the SAM system is closely related to the general theory on how human users interact with technologies. For instance, the technology acceptance model has received a significant amount of attention in the past and has produced various improved/extended versions over time [1, 2, 14, 15]. Following [20] and the trust-building model proposed in [37], we consider trust to be the most crucial factor influencing patient adherence. Specifically, the perceived trustworthiness of the system seems to have the most significant impact on developing a high level of user adherence [36]. According to expectation disconfirmation theory, perceived trustworthiness depends largely on how well users’ expectations of the system have been met [54]. Also, it has been reported that patient trust in medical technologies affects the overall perception of the healthcare delivery process and eventually enhances/diminishes patient adherence to medical recommendations [31, 42]. In other words, patients tend to retain a high level of trust when they have had a positive experience with the system in the past. Many IoT-based HISs, such as the SAM system, are designed to assist long-term health management. Therefore, the initial HIS adoption is not enough to ensure an improved clinical outcome. Rather, positive outcomes typically accrue from the sustained use of the system. This is why we need to take the trust-dependent patient adherence issue into consideration when designing the HIS and its alerting strategy.

Patient adherence is a function of many factors, ranging from socioeconomics to availability of and access to care to patient preferences. Heterogeneity in these factors (at least some of them) is important to capture in any modeling attempt. Accordingly, there has been valuable standalone research in both the healthcare and management literatures on the topic (e.g., [8, 24]). The issue of patient nonadherence is complex, with numerous contributing factors, and a single model may not be able to consider all of them. In this paper, we focus primarily on trust-dependent adherence so that we can develop a tractable model that is rich enough to identify and highlight the insights in providing medical intervention through such HISs.

3 Model

We use a discrete-time, finite-horizon POMDP model to study the role of trust and patient adherence in the SAM application. The objective is to maximize the patient’s quality-adjusted life days (QALDs), which account for both the quality and quantity of life [18], by choosing an appropriate alerting strategy considering uncertainties in the patient’s current asthma control status and the level of the patient’s trust in the system.

We denote t as decision time, where \(t=1,2,\ldots ,t_E\) and \(t_E<\infty\). At each time epoch t, the system can take an action denoted by \(a_t\), and there are two possible choices in the action space, that is, \(a_t\in {\mathcal {A}}=\{W,A\}\), where A and W refer to Alert and Wait, respectively. Action Alert implies that the SAM system recommends an on-site clinical diagnosis to see if a follow-up intervention is necessary. Action Wait simply means no recommendation from the system at that time. In other words, the system remains silent without sending any messages to patients. Once the patient receives an alert, he/she decides whether to seek on-site clinical consultation adhering to the SAM system’s suggestion or to ignore the alert. Please note that patients can visit their care provider without receiving an alert from the system, and such an unprompted clinic visit is also considered in the model.

We define three states at any discrete time t: (i) patients with good asthma control and high trust level, (ii) patients with good asthma control but low trust level, and (iii) patients with poorly controlled asthma and low trust level. Patients with bad asthma control but high trust level are excluded to reflect the theories adopted from the literature and practice. It has been shown that users do not rely on technology if the technology fails to meet their expectations [54] or provides unsatisfactory performance [30, 53]. The performance of the SAM system, as perceived by the patient, must be unsatisfactory for a patient with poorly controlled asthma, because patients often can notice the degraded asthma control due to escalated asthmatic symptoms [44]. Therefore, we argue that having a high trust level with poorly-controlled asthma is a very rare (if not impossible) scenario in the SAM setting. This is also consistent with the health belief model, which is frequently used in the health communication domain [12]. Excluding such a state is not only supported by the literature but also allows us to partially address the dependency between trust level and severity of asthma.

As discussed, the state at time t denoted by \(s_t\) is a combination of the asthma control status and trust level of a patient, that is, \(s_t=s_t^C\times s_t^T\), where \(s_t^C\) and \(s_t^T\) are the asthma control and trust states, respectively. We denote the aforementioned state by \(s_t\in {{\varvec{S}}}=\{GH,GL,BL\}=\{0,1,2\}\), where \({{\varvec{S}}}\) is the entire state space, G/B indicates good/bad asthma control, and H/L represents high/low trust level. We denote the state space \({{\varvec{S}}}\) by \(\{GH,GL,BL\}\) because it shows what each state means in the asthma management context. However, in a few places, we also use \(\{0,1,2\}\) to denote the state space \({{\varvec{S}}}\) when a numerical representation is preferred.
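As a small illustration of this notation, the composite state space and its decomposition can be written out directly. The labels and the integer coding below simply mirror the definition of \({{\varvec{S}}}\) in the text; the helper function name is ours, introduced only for illustration.

```python
# The composite state space S = {GH, GL, BL} = {0, 1, 2}: the first letter is
# the asthma control component (G = good, B = bad) and the second is the
# trust component (H = high, L = low). The BH combination is excluded,
# as argued in the text.
STATES = {'GH': 0, 'GL': 1, 'BL': 2}

def decompose(s: str):
    """Split a composite state label into its (control, trust) components."""
    return s[0], s[1]
```

For example, `decompose('GL')` yields `('G', 'L')`, i.e., good asthma control with a low trust level.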

The evolution of the asthma control state \(s_t^C\) is straightforward to understand. Bad asthma control can be improved through a proper therapeutic intervention, and good asthma control might deteriorate naturally over time [44]. At each time instance, based on the data collected by the sensor, the SAM system decides whether to alert the patient or to do nothing. Once the patient observes the decision made by the SAM system (alert or wait), the patient then decides whether to visit the care provider at the hospital for clinical consultation or not. The in-person visit plays an important role in our analytical investigation. Obviously, the clinical consultation has a direct impact on the health outcome (asthma control level). When the patient visits his/her care provider for further diagnosis, the care provider examines the patient through a series of lab tests, and the clinically-defined asthma control level at the time of diagnosis will be revealed [50]. If the patient’s asthma was poorly controlled, a therapeutic follow-up should be administered (e.g., making an adjustment to the prescription), and the intervention will improve the asthma control level of the patient. A patient with well-controlled asthma may not obtain a significant benefit from the consultation, but the patient still can receive helpful feedback on how his/her asthma has been controlled lately. If the patient does not seek a clinical consultation, however, the asthma control level may stay the same or worsen in the future because, in some cases, self-managing asthma may not be enough to reverse the trend of worsening asthma progression.

The on-site diagnosis affects the level of patients’ trust in the SAM system as well, but unlike the asthma control state, the trust state \(s_t^T\) transition involves a more complex mechanism. The on-site diagnosis provides an opportunity for patients to evaluate the performance of the system. Without the on-site clinical diagnosis, patients can only speculate about the trustworthiness of the SAM system with no visible evidence. As shown in Fig. 2a, when the patient has received an alert and decided to visit his/her care provider, the trust level of the patient may become higher if the true asthma control level was bad (i.e., a correct alert) or become lower if the true asthma control level was good (i.e., a false alert). Patients can see a care provider even without receiving an alert from the SAM system. In that case, shown in Fig. 2b, the trust level can become higher if the patient’s asthma was well controlled (i.e., correct no-alert) or lower if the patient’s asthma was poorly controlled (i.e., misdetection). When patients decide not to visit the clinic, as in Fig. 2c, they not only miss the chance of receiving a clinical diagnosis and therapeutic follow-up (if necessary) but also lose the opportunity to judge the appropriateness of the decision made by the system.

Fig. 2 Illustration of various cases for trust updating

Basically, in our model, the trust level is adjusted only when the patient decides to visit the clinic, where the patient can obtain accurate asthma control information from the medical expert and compare it to what the system suggested; a clinical diagnosis conducted by a healthcare professional is the most reliable way for the patient to observe the true accuracy of the alert (e.g., a true/false positive/negative). In summary, there are three trust level updating scenarios, as below:

  • Gaining trust Suppose the SAM system alerted the patient. If the patient visited a clinic following the system’s suggestion and found out that the current asthma control was indeed bad, then the SAM system will gain the patient’s trust. Sometimes, the patient may visit a clinic without receiving an alert from the system. If the patient learns from the on-site diagnosis that the asthma control was good and, hence, a visit to the clinic was unnecessary, then the SAM system will also gain the patient’s trust.

  • Losing trust Suppose the SAM system alerted the patient, and the patient visited a clinic on the basis of the alert. If the patient then found out that the current asthma control was good (a false alarm), the system will likely lose the patient’s trust. In the case where the patient decided to visit a clinic without receiving an alert from the system, if the patient learned that his/her asthma control was bad (misdetection), the SAM system likely will also lose the patient’s trust.

  • No trust updating If the patient did not visit a clinic regardless of whether he/she received an alert from the SAM system, the patient has no opportunity to evaluate the performance of the system; hence, there will be no trust updating.
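The three scenarios above amount to a simple decision rule. A minimal sketch is given below; the function name and the +1/-1/0 encoding of the trust change are ours, introduced only to make the logic explicit (the actual trust dynamics are encoded in the state transition probabilities).

```python
def trust_update(alerted: bool, visited: bool, control_good: bool) -> int:
    """Direction of the trust change: +1 (gain), -1 (loss), 0 (no update).

    Hypothetical helper encoding the three trust-updating scenarios from
    the text; in the POMDP model itself, these effects enter through the
    state transition matrix rather than a deterministic rule.
    """
    if not visited:
        # No clinic visit: the patient cannot evaluate the system's decision.
        return 0
    if alerted:
        # Alert followed by a visit: bad control confirms the alert (gain);
        # good control means the alert was a false alarm (loss).
        return 1 if not control_good else -1
    # Self-initiated visit without an alert: good control confirms the
    # system's silence (gain); bad control reveals a misdetection (loss).
    return 1 if control_good else -1
```

For instance, `trust_update(alerted=True, visited=True, control_good=False)` returns +1, the "correct alert" case of Fig. 2a.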

Based on the discussion above, we see that a visit to the clinic will always be beneficial in terms of asthma control but may hurt the trust level if the patient observes a mistake made by the system. In our setting, we assume that the on-site diagnosis is the only reliable resource for evaluating the performance of the system unless the patient has been able to physically notice the degraded asthma control level by experiencing substantially excessive asthmatic symptoms. Therefore, if the patient never goes through a clinical diagnosis, the patient’s trust level will likely remain the same because there is no way to assess the performance of the system. From the SAM system’s perspective, this implies minimal risk of losing trust. However, it would not be easy for the SAM system to maintain such a patient’s asthma condition in the well-controlled state. In our POMDP model, we accommodate this key trade-off so that we can study the impact of having an adherence-aware alerting strategy on the patient’s quality of life.

In addition to the definitions of action \(a_t\in {\mathcal {A}}\) and state \(s_t\in {{\varvec{S}}}=\{GH,GL,BL\}\), the POMDP model defines a belief \(\varvec{\pi }_t\in \varvec{\Pi }\), which is the occupation probability of a specific state in the entire state space \({{\varvec{S}}}\) at time t, representing the decision maker’s belief about the unobservable true state \(s_t\). The entire belief space is denoted by \(\varvec{\Pi }\). For instance, \(\varvec{\pi }_t=(0.1,0.7,0.2)\) means that the patient is most likely in state GL with a probability of 0.7, whereas the probabilities of the patient being in state GH and BL are 0.1 and 0.2, respectively. The belief is a well-known sufficient statistic covering the whole process history of the state prior to time t [6]. Another key term in the POMDP framework is the observation \(o_t\in \mathbf{O }\), where \(o_t\) is the observation obtained at time t after action \(a_t\) is taken but before the state transition. The observation space is \(\mathbf{O }=\{\mathbf{Y },s_t^C\}\), where \(\mathbf{Y }=\{y^0,\dots ,y^M\}\) is a discrete set of \(M+1\) possible observations. In the SAM application, \(\mathbf{Y }\) is simply the set of possible numbers of rescue inhaler uses for a given day t. Without loss of generality, we assume that the elements of \(\mathbf{Y }\) are ordered (\(y^j\) is a more desirable observation than \(y^{j+1}\)). For example, observing one rescue inhaler use is more desirable than observing two or more inhaler uses. As mentioned earlier, if a patient visits the clinic, the true asthma control state is revealed by the on-site clinical diagnosis. Thus, \(s_t^C\) might be observed directly by the patient. Nevertheless, the SAM system cannot directly observe the trust state \(s_t^T\); hence, the overall state \(s_t\) remains only partially observable. Further description of the state transition function and other notations follows.

\(q_t^{s,a}\): This is the probability of intervention (on-site diagnosis) at time t given a current state \(s_t\) and an action \(a_t\) taken. A patient with high adherence would have a high \(q_t^{s,A}\) (high intervention probability when the patient has received an alert) and a low \(q_t^{s,W}\) (low intervention probability when the system did not alert the patient) for all \(s\in {{\varvec{S}}}\).

\(\varLambda _t^{s,a}(o)=P(o_t|s_t,a_t)\): This is the observation probability for getting an observation \(o_t\) when the state is \(s_t\) and the action taken is \(a_t\). As implied earlier, \(\varLambda _t^{s,a}(o)\) depends on \(q_t^{s,a}\) as

$$\begin{aligned} \varLambda _t^{s,a}(o)= {\left\{ \begin{array}{ll} q_t^{s,a} &{} \text {for } o=s_t^C,\\ (1-q_t^{s,a}) \times P( y_t|s_t^C) &{} \text {for } o=y_t \in {\mathbf {Y}}, \end{array}\right. } \end{aligned}$$

where \(y_t\) represents the number of inhaler uses observed at time epoch t.
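The piecewise definition of \(\varLambda _t^{s,a}(o)\) can be sketched directly in code. The container shapes below (`inhaler_pmf`, the 'G'/'B' control labels, and integer inhaler counts as observations) are our illustrative assumptions, not part of the paper's notation.

```python
def observation_prob(o, s_control, q, inhaler_pmf):
    """Observation probability Lambda_t^{s,a}(o) from the equation above.

    `s_control` is the control component of the state ('G' or 'B'),
    `q` stands for the intervention probability q_t^{s,a}, and
    `inhaler_pmf[s_control][y]` encodes P(y | s_t^C), the probability of
    observing y rescue-inhaler uses given the control state.
    """
    if o in ('G', 'B'):
        # o = s_t^C: an on-site diagnosis reveals the true control state;
        # this observation occurs with the intervention probability q.
        return q if o == s_control else 0.0
    # o = y_t in Y: a sensor observation, seen when no intervention occurs.
    return (1.0 - q) * inhaler_pmf[s_control][o]
```

Note that the two branches sum to one over all observations: the diagnosis branch contributes \(q_t^{s,a}\) and the sensor branch contributes \(1-q_t^{s,a}\) times a proper probability mass function.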

\(\varGamma _t^{a,o}(s'|s)=P(s'_{t+1}|s_t,a_t,o_t)\): This is the state transition probability, which indicates the probability of moving from \(s\in {{\varvec{S}}}\) at time t to \(s'\in {{\varvec{S}}}\) at time \(t+1\) if action \(a_t\) was taken and \(o_t\) was observed at time t. Furthermore, we denote the overall state transition matrix as \({{\varvec{\Gamma }}}_t^{a,o}=[\varGamma _t^{a,o}(s'|s)]_{s\in {{\varvec{S}}}}\) for \(a\in {\mathcal {A}}\), \(o\in {{\varvec{O}}}\), and \(s,s'\in {{\varvec{S}}}\). The \(3\times 3\) matrix \({{\varvec{\Gamma }}}_t^{a,o}\) gives the probability of transitioning from state s to \(s'\). More details on how we construct the state transition matrix are provided in Appendix A. Observing \(s_t^C\) from an on-site diagnosis (\(o_t=s_t^C\)) affects the likelihood of both future asthma control status (\(s_{t+1}^C\)) and trust level (\(s_{t+1}^T\)). This is one of the unique features of the SAM application. In conventional POMDP models, state transition depends only on \(a_t\). However, to properly model the SAM practice, the transition probability of our POMDP model depends on both \(a_t\) and \(o_t\).

The objective of our POMDP model is to maximize the patient’s reward (i.e., utility) over the decision horizon, which is equivalent to minimizing the patient’s total disutility. Thus, we define

\(r_t(s,a,o)=r(s_t,o_t,a_t)\): This is the reward between two consecutive times t and \(t+1\), where the true state is \(s_t\), action \(a_t\) is taken, and observation \(o_t\) is seen at time t. We assume that the reward can be measured by the QALDs for a patient, where its maximum is 1 (full day), and its minimum is 0. Specifically, \(r_t(s,a,o)\) is computed on the basis of various disutility values. They are denoted as \(\phi _a\) (disutility for taking action \(a\in {\mathcal {A}}\)), \(\phi _s\) (disutility associated with state \(s\in {{\varvec{S}}}\)), and \(\phi _s^o\) (disutility for a patient in state \(s\in {{\varvec{S}}}\) to observe \(o\in {{\varvec{O}}}\)) such that \(0\le r_t\left( s,a,o\right) =1-{\phi }_a-{\phi }_s-{\phi }^o_s\le 1\). Furthermore, we define \(r_t\left( s,a\right) =\sum _{o\in O}{{{\Lambda }}^{s,a}_t\left( o\right) r_t\left( s,a,o\right) }=\sum _{o\in O}{{{\Lambda }}^{s,a}_t\left( o\right) \left[ 1-{\phi }_a-{\phi }_s-{\phi }^o_s\right] }\).

Suppose a well-controlled asthma patient with high trust in the SAM system receives an alert at time t and visits the clinic for further diagnosis. Then, the reward between time t and \(t+1\) for this patient would be \(r_t(s=GH,a=A,o=G)=1-\phi _{a=A}-\phi _{s=GH}-\phi _{s=GH}^{o=G}\). The disutility for receiving an alert (\(\phi _{a=A}\)) may vary across patients because some patients are annoyed by alert messages more easily than others. The disutility for being in the GH state should not be significant because it is the most desirable state. \(\phi _{s=GH}^{o=G}\) represents the cost of visiting a clinic and going through a consultation with care providers. This disutility could be large because visiting a clinic is often a time-consuming task that interrupts the patient’s daily routine and involves monetary costs. Furthermore, in this specific example, observing a good asthma control state after the diagnosis (\(o_t=s_t^C=G\) given \(s_t=GH\)) should leave the patient disappointed with the performance of the SAM system. From the definition of QALD, we note that \(0\le \phi _{a=A}+\phi _{s=GH}+\phi _{s=GH}^{o=G}\le 1\). In short, \(r_t\left( s,a,o\right)\) is the remaining QALD after subtracting all the SAM system-induced disutilities.
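A minimal sketch of this reward computation, using hypothetical disutility values (not the ones specified later in Table 2e), is:

```python
# Sketch of the per-period reward r_t(s, a, o) = 1 - phi_a - phi_s - phi_s^o.
# All disutility values below are hypothetical placeholders for illustration.
phi_action = {"W": 0.00, "A": 0.01}                 # alert fatigue
phi_state  = {"GH": 0.00, "GL": 0.05, "B": 0.30}    # burden of the state itself
phi_visit  = {"GH": 0.10, "GL": 0.10, "B": 0.08}    # clinic visit after observing o = s^C

def reward(s, a, clinic_visit):
    """QALD remaining after subtracting all system-induced disutilities."""
    r = 1.0 - phi_action[a] - phi_state[s] - (phi_visit[s] if clinic_visit else 0.0)
    assert 0.0 <= r <= 1.0   # a QALD is bounded by a full day
    return r
```

Under these placeholder values, `reward("GH", "A", True)` reproduces the false-alarm example above: \(1-\phi _{a=A}-\phi _{s=GH}-\phi _{s=GH}^{o=G}=1-0.01-0.00-0.10=0.89\) QALDs.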

\({\varvec{\tau }}^{a,o}_\pi ={[\tau ^{a,o}_\pi (s')]}_{s'\in {{\varvec{S}}}}={[P(s'_{t+1}|\varvec{\pi }_t,a_t,o_t)]}_{s'\in {{\varvec{S}}}}\): This is the updated belief, that is, the probability of occupying state \(s'\in {{\varvec{S}}}\) at time \(t+1\) given the previous belief \(\varvec{\pi }_t\) if action \(a_t\) is taken and \(o_t\) is observed at time t. The updated belief is defined as

$$\begin{aligned} \tau ^{a,o}_\pi (s')=\frac{\sum _{s\in {{\varvec{S}}}}{\pi (s)\varLambda ^{s,a}_t(o)\varGamma ^{a,o}_t(s'|s)}}{\sum _{s\in {{\varvec{S}}}}{\pi (s)\varLambda ^{s,a}_t(o)}}. \end{aligned}$$

It should be noted that the observation probability \(\varLambda _t^{s,a}(o)\) depends on the current state s rather than the future state \(s'\) because the observation occurs before the state transition in our framework. Therefore, \(\varLambda _t^{s,a}(o)\) cannot come out of the summation over s in the numerator as in typical POMDP models.
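The update can be sketched directly from the equation above; note in the code that the observation weight multiplies each term of the sum over s rather than factoring out of it:

```python
# Sketch of the belief update tau (the equation above). The observation weight
# Lam[s] = P(o | s, a) depends on the current state s, so it stays inside the
# sum over s in the numerator.
def update_belief(pi, Lam, Gam):
    """pi: prior belief; Lam[s] = P(o | s, a); Gam[s][sp] = P(sp | s, a, o)."""
    n = len(pi)
    denom = sum(pi[s] * Lam[s] for s in range(n))   # P(o | pi, a)
    return [sum(pi[s] * Lam[s] * Gam[s][sp] for s in range(n)) / denom
            for sp in range(n)]
```

With an identity transition matrix, the update reduces to plain Bayesian filtering of the current state; a non-identity matrix additionally propagates the belief one step forward.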

\(V_t^{*}(\varvec{\pi })\): This is the optimal value function for a given belief \(\varvec{\pi }\). The goal of the POMDP model is to find an action that maximizes the total expected reward on the basis of the current belief state \(\varvec{\pi }_t\in \varvec{\Pi }\) at time t, denoted by \(V_t^*(\varvec{\pi })\) and defined as

$$\begin{aligned}&V^*_t(\varvec{\pi })\\&\quad =\max _a\left[ \sum _{s\in {{\varvec{S}}}}{\sum _{o\in O}{\pi (s)\varLambda ^{s,a}_t(o)r_t(s,a,o)}}\right. \\&\qquad \left. +\sum _{s\in {{\varvec{S}}}}{\sum _{o\in O}{\sum _{s'\in {{\varvec{S}}}}{\pi (s)\varLambda ^{s,a}_t(o)\varGamma ^{a,o}_t(s'|s)V^*_{t+1}({\varvec{\tau }}^{a,o}_\pi )}}}\right] , \end{aligned}$$

for \(t=1,\dots ,t_E-1\), where the first term represents the expected immediate reward, and the second term is the expected reward of the resulting belief \(\varvec{\tau }^{a,o}_\pi\). After a few steps of algebra (noting that \(\varvec{\tau }^{a,o}_\pi\) does not depend on \(s'\), so that \(\sum _{s'\in {{\varvec{S}}}}{\varGamma ^{a,o}_t(s'|s)V^*_{t+1}({\varvec{\tau }}^{a,o}_\pi )}=V^*_{t+1}({\varvec{\tau }}^{a,o}_\pi )\)), we get

$$\begin{aligned} V^*_t(\varvec{\pi })&=\max _a\left[ \sum _{s\in {{\varvec{S}}}}{\pi (s)}\sum _{o\in O}\varLambda ^{s,a}_t(o)\left\{ r_t(s,a,o)\right. \right. \\&\quad \left. \left. +\sum _{s'\in {{\varvec{S}}}}{\varGamma ^{a,o}_t(s'|s)V^*_{t+1}({\varvec{\tau }}^{a,o}_\pi )} \right\} \right] \\&=\max _a\left[ \sum _{s\in {{\varvec{S}}}}{\pi (s)}\sum _{o\in O}\varLambda ^{s,a}_t(o)\right. \\&\quad \left. \left\{ r_t(s,a,o)+V^*_{t+1}({\varvec{\tau }}^{a,o}_\pi ) \right\} \right] . \end{aligned}$$

For the last time epoch \(t=t_E\), the optimal value function is simply defined as

$$\begin{aligned} V^*_{t_E}(\varvec{\pi })= & {} \max _a\left[ \sum _{s\in {{\varvec{S}}}}{\pi (s)}\sum _{o\in O}{\varLambda ^{s,a}_{t_E}(o)r_{t_E}(s,a,o)}\right] \\= & {} \max _a\left[ \sum _{s\in {{\varvec{S}}}}{\pi (s)r_{t_E}(s,a)}\right] . \end{aligned}$$
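The finite-horizon recursion above can be sketched as an exhaustive backward recursion. The toy model below uses two states and assumed parameter values, and it simplifies the reward to \(r_t(s,a)\) (already averaged over observations); it is meant only to show the structure of the optimality equation, not to solve the SAM model, for which exact or point-based value-iteration solvers would be used.

```python
# Exhaustive backward recursion for the optimality equation above, on a toy
# two-state model (0 = good, 1 = bad). All parameter values are assumptions.
ACTS, N_OBS, N = ["W", "A"], 2, 2

Lam = {"W": [[0.8, 0.2], [0.3, 0.7]],      # Lam[a][s][o] = P(o | s, a)
       "A": [[0.7, 0.3], [0.2, 0.8]]}
Gam = [[0.9, 0.1], [0.2, 0.8]]             # P(s' | s); kept (a, o)-independent here
R   = {"W": [1.0, 0.6], "A": [0.95, 0.7]}  # R[a][s] = r(s, a)

def V(pi, t, t_end):
    """Optimal value for belief pi at epoch t over horizon t_end (small horizons only)."""
    best = float("-inf")
    for a in ACTS:
        val = 0.0
        for o in range(N_OBS):
            p_o = sum(pi[s] * Lam[a][s][o] for s in range(N))
            # expected immediate reward collected when o is seen
            val += sum(pi[s] * Lam[a][s][o] * R[a][s] for s in range(N))
            if t < t_end and p_o > 1e-12:
                tau = [sum(pi[s] * Lam[a][s][o] * Gam[s][sp] for s in range(N)) / p_o
                       for sp in range(N)]
                val += p_o * V(tau, t + 1, t_end)   # continuation value
        best = max(best, val)
    return best
```

At the last epoch (`t == t_end`) the recursion collapses to the terminal expression \(\max _a\sum _s\pi (s)r_{t_E}(s,a)\), matching the equation above.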

In summary, Fig. 3 shows the overall decision process diagram, and Table 1 provides a list of all the notations used in our model.

Fig. 3
figure 3

Illustration of the decision-making process for the SAM system

Table 1 Summary of notations used in our study

Unlike other decision processes that are typically modeled by the standard POMDP framework, our process for the SAM system has several unique features. First, the state transition depends not only on the action taken but also on the observation seen at the current time, whereas the conventional POMDP approach assumes that the action is the only component that affects the state transition [48], which is considered inappropriate for some medical applications (such as the SAM system) in which diagnosis is a part of the decision process [7]. Second, there are unintended paths due to the misalignment between the decisions made by the SAM system and those made by the patients. The shaded boxes indicate the occurrence of an on-site clinical diagnosis and a follow-up intervention (if necessary), which should increase the probability of patients being in a better asthma control state at the next time epoch. Depending on the current state \(s_t\in {{\varvec{S}}}\) and the action taken at time t, \(a_t\in {\mathcal {A}}\), the probabilistic state transition from the current state \(s_t\) to the next state \(s_{t+1}\) follows different paths. For instance, if we look at the path shown at the very top of Fig. 3, the SAM system alerted the patient, and the patient visited the clinic following the system’s suggestion. At the clinic, the actual asthma control level is determined by on-site diagnosis, that is, \(o_t=s_t^C\). If \(s_t\) is either GH or GL, the patient will be disappointed by the false alarm made by the SAM system. In this case, although \(s_{t+1}\) can take any state in \({{\varvec{S}}}\) due to the probabilistic state transition function, the next state will likely be \(s_{t+1}=GL\). The system, on the basis of this information, updates the belief \(\varvec{\pi }_t\) accordingly so that a more appropriate decision can be made at the next time \(t+1\). The system will then identify the best alerting policy using the updated belief state to maximize patients’ quality of life.
The optimality equation also has a different structure compared with that of conventional POMDPs because of the aforementioned unique decision sequence of the SAM system.

4 Analytical study: structural properties

The solution of our POMDP model has a set of structural properties that provide valuable insights and form the basis of our computational analysis using real-world SAM data. Throughout the analytical and computational studies, we aim to highlight the importance of designing a HIS that accommodates the concept of the patient’s trust and to discuss the practical implications to the field. Before moving forward, we first introduce the following necessary definitions:

Definition 1

(from White [55]). If \(\sum ^{|X|}_{i=j}{{\varvec{x}}\left( i\right) }\le \sum ^{|X|}_{i=j}{{\varvec{x}}\varvec{'}\left( i\right) }\) for any \(j\in \left\{ 1,\ 2,\ 3,\dots ,|X|\right\}\) holds for two probability mass functions \({\varvec{x}}\) and \({\varvec{x}}\varvec{'}\) with the same dimension |X|, \({\varvec{x}}\) is stochastically smaller than \({\varvec{x}}\varvec{'}\) which is denoted by \({\varvec{x}}{\varvec{\le }}_s{\varvec{x}}\varvec{'}\).

Definition 2

(from Ferguson et al. [16]). If \({\varvec{x}}\left( i\right) \varvec{/}{\varvec{x}}\varvec{'}\left( i\right) \varvec{\ge }{\varvec{x}}\left( j\right) \varvec{/}{\varvec{x}}\varvec{'}\left( j\right)\) for all \(i\le j\) holds for two probability mass functions \({\varvec{x}}\) and \({\varvec{x}}\varvec{'}\) with the same dimension |X|, \({\varvec{x}}\) is smaller than \({\varvec{x}}\varvec{'}\) in the monotone likelihood ratio (MLR) which is denoted by \({\varvec{x}}{\varvec{\le }}_r{\varvec{x}}\varvec{'}\).

Definition 3

(from Karlin [25]). A matrix \({\varvec{H}}\) has a property of totally positive of order 2 (\({TP}_2\)) which is denoted by \({\varvec{H}}\varvec{\in }{TP}_2\) if all its second-order minors are non-negative. Equivalently, \({\varvec{H}}\varvec{\in }{TP}_2\) if the (\(i+1\))-th row MLR dominates the i-th row: that is, \({{\varvec{H}}}_{j,:}{\varvec{\le }}_r{{\varvec{H}}}_{i,:}\) for all \(i>j\), where \({{\varvec{H}}}_{i,:}\) denotes the i-th row of the matrix \({\varvec{H}}\).
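Definitions 2 and 3 translate directly into numerical checks, which can be handy for verifying an estimated transition matrix. The sketch below implements the stated conditions literally (with a small tolerance for floating-point error):

```python
# Numerical checks for Definitions 2 (MLR ordering) and 3 (TP2 property).
# The example matrices and vectors used in testing are illustrative only.
def mlr_leq(x, xp, tol=1e-12):
    """x <=_r xp: x(i)/xp(i) >= x(j)/xp(j) for all i <= j, via cross-multiplication
    to avoid division by zero."""
    n = len(x)
    return all(x[i] * xp[j] >= x[j] * xp[i] - tol
               for i in range(n) for j in range(i, n))

def is_tp2(H, tol=1e-12):
    """H is TP2 if every second-order (2x2) minor is non-negative."""
    m, n = len(H), len(H[0])
    return all(H[i1][j1] * H[i2][j2] - H[i1][j2] * H[i2][j1] >= -tol
               for i1 in range(m) for i2 in range(i1 + 1, m)
               for j1 in range(n) for j2 in range(j1 + 1, n))
```

With the ordering used here (index 0 is the most desirable state), [0.7, 0.3] is MLR-smaller than [0.3, 0.7]: the first belief concentrates on the better state.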

For a POMDP model, it is crucial to ensure that the optimal value function is monotone and nonincreasing in \(\varvec{\pi }\in \varvec{\Pi }\) [33]. Monotonicity of the optimal value function exists when the belief vectors are MLR ordered, that is, \(\varvec{\pi }_t\le _r\varvec{\pi '}_t\) [16]. In the SAM application, \(\varvec{\pi }_t\le _r\varvec{\pi '}_t\) means that a patient with belief \(\varvec{\pi }_t\) is expected to be in a more desirable state than the patient with belief \(\varvec{\pi '}_t\). Because our state space can be naturally ordered by desirability (i.e., it is desirable to have better asthma control and a higher trust level), it is straightforward to justify the condition. In addition to the MLR-ordered belief vector, both the state transition probability matrix and the observation probability matrix (\({{\varvec{\Gamma }}}_t^a\) and \({{\varvec{\Lambda }}}_t^a=\left[ \varLambda _t^{s,a}\left( o\right) \right] _{s\in {{\varvec{S}}}}\)) in the POMDP model need to possess the \({TP}_2\) property [25]. Therefore, to derive meaningful analytical results, we need to investigate our POMDP model to see if it has a monotone optimal value function nonincreasing in \(\varvec{\pi }\in \varvec{\Pi }\). We only summarize the key results here, whereas we show the mathematical details in Appendix A.

Analytical Result 1 (Lemma 1 and Proposition 1 in Appendix A) Suppose (C1)–(C3) hold, then both \({{\varvec{\Gamma }}}_t^{a,o}\) and \({{\varvec{\Gamma }}}_t^a\) have a \({TP}_2\) property for all \(a\in {\mathcal {A}}\) and \(t\le t_E\) where (C1)–(C3) are

$$\begin{aligned} \text {(C1) }&v^{loss}_{HL}\ge v^{none}_{HL}\ge v^{gain}_{HL} \text { and } c^{loss}_{LL}=c^{none}_{LL}=c^{gain}_{LL},&\nonumber \\ \text {(C2) }&v^{none}_{HH}\ge 1-v^{none}_{LL} \text { and } v^{loss}_{HL}\le v^{loss}_{LL},&\nonumber \\ \text {(C3) }&c^0_{BB}\ge c^1_{BB}, c^0_{GG}=c^1_{GG}\text {, and } c^1_{GG}c^1_{BB}-c^1_{GB}c^1_{BG}\ge 0.&\end{aligned}$$

The first analytical result (AR hereafter) proves that the state transition probability matrix of our POMDP model has the \({TP}_2\) property under conditions that make sense in the SAM context (the notations used in (C1)–(C3) are defined in Appendix A). (C1) simply says that having negative experiences with the SAM system (a false alarm or a misdetection) increases the transition probability from a high trust state to a low trust state and that, once the patient is in a low trust state, it may not be easy to come back to the high trust state. In other words, in many cases, it is easier for the SAM system to lose trust than to gain trust from patients. This anchoring phenomenon is not specific to the SAM system but is prevalent in many decision support systems in various fields [34, 35]. In the same spirit, (C2) suggests that people often do not drastically change their trust state. Lastly, (C3) dictates that receiving a proper intervention decreases the probability of being in a bad asthma control state, whereas there is no intervention effect if the patient’s asthma control was in a good state to begin with. All three conditions make practical sense in the SAM application. In addition to AR1, we can also show that, under the same conditions (C1)–(C3), the belief vectors defined in our POMDP model always remain MLR-ordered, as stated in the following AR2.

Analytical Result 2 (Proposition 2 in Appendix A) For any two beliefs \({{{\pi }}}\),\({{{\pi }}'}\in {{\varvec{\Pi }}}\) such that \({{{\pi }}}\le _r{{{\pi }}'}\), suppose (C1)–(C3) hold, then \({{{\tau }}}_{{{\pi }}}^{a,o}\le _r{{{\tau }}}_{{{{\pi }}'}}^{a,o}\) for any \(a\in {\mathcal {A}}\) and \(o\in {\mathbf {O}}\).

This property is critical to our SAM application because the SAM system updates its estimated belief as it collects new observations (either the number of inhaler uses or the actual asthma control level determined by a physician). AR2 guarantees that our belief-updating function retains the MLR ordering between two belief vectors.

The last proof we need concerns the \({TP}_2\) property of the observation probability matrix. However, the observation probability matrix specified in our POMDP model does not have the \({TP}_2\) property due to the characteristics of the SAM system. We could force the observation probability matrix to have the \({TP}_2\) property by assuming \(q_t^{s=GH,a=A}\le q_t^{s=GL,a=A}\) (see Lemma 2 in Appendix A). However, this condition cannot be justified in the SAM context because it implies that, of two patients with the same good asthma control who receive an alert, the one with a low trust level is more likely to visit a clinic for further diagnosis than the one with a high trust level. Thus, instead, we present AR3, which shows that the optimal value function of our POMDP model is still monotone nonincreasing in \({{{\pi }}}\in {{\varvec{\Pi }}}\), even without assuming a \({TP}_2\) observation probability matrix.

Analytical Result 3 (Theorem 1 in Appendix A) Suppose (C1)–(C3) and (C6)–(C8) hold. Then, for any belief vectors \({\pi },{\pi '}\in {{\varvec{\Pi }}}\) such that \({\pi }\le _r{\pi '}\), \(V_t^*\left( {\pi }\right) \ge V_t^{*}({\pi '})\) for all \(t\le t_E\), where (C1)–(C3) are listed in AR1 and (C6)–(C8) are

$$\begin{aligned} \text {(C6) }&\phi _{s=0}^{o=y}=\phi _{s=1}^{o=y}=\phi _{s=2}^{o=y}=\phi _{a=W}=\phi _{s=0}=0,&\\ \text {(C7) }&\phi _{a=A}+\phi _{s=0}^{o=s^C}\le \phi _{s=1}, \text {and}&\\ \text {(C8) }&\phi _{s=1}+\phi _{a=A}+\phi _{s=1}^{o=s^C}\le \phi _{s=2}\le 1-\phi _{a=A}-\phi _{s=2}^{o=s^C}.&\end{aligned}$$

The conditions (C6)–(C8) concern the disutility values. (C6) states that being monitored by the SAM system without receiving any alerts should incur 0 disutility. Using the SAM system should not change the usual way patients use their personal inhaler. The system only requires a small sensor attached to the inhaler, and everything is done wirelessly and automatically. Therefore, (C6) is a reasonable condition for the SAM system and many other IoT-enabled HISs. (C7) and (C8) collectively say that visiting a clinical facility due to a false alarm is better than actually transitioning to a bad asthma control state. The goal of everyday asthma management is to keep a patient’s asthma condition under control; hence, assigning disutility values according to (C7) and (C8) makes intuitive sense.

Building upon the previous results, we derive AR4, showing that the optimal alerting strategy for the SAM system, considering trust-dependent patient adherence, is a threshold-type policy.

Analytical Result 4 (Theorem 2 and Corollary 1 in Appendix A) Let \(a_t^*\left( {\pi }\right)\) denote the optimal action at time t for a given belief \({\pi }\in {{\varvec{\Pi }}}\) and define \(\pi _{GH}^*=\max \{\pi (0):\pi \left( 1\right) =0,a_t^*\left( {\pi }\right) =A\}\) and \(\pi _{GL}^*=\max \{\pi (1):\pi \left( 0\right) =0,a_t^*\left( {\pi }\right) =A\}\). Suppose (C1)–(C8) hold, then we get \(\pi _{GH}^*\le \pi _{GL}^*\). The conditions (C1)–(C3) and (C6)–(C8) are presented in AR1 and AR3, respectively, and (C4)–(C5) are

$$\begin{aligned} \text {(C4) }&{\xi }_G\le _r{\xi }_B \text { where } {\xi }_{s_t^C}(y_t)=[P(y_t|s_t^C)] \text { for all } y_t\in {\mathbf {Y}}, \text { and}&\\ \text {(C5) }&q_t^{0,a}\le q_t^{1,a}\le \delta (a) \text { for } a=W.&\end{aligned}$$

The first additional condition (C4) can be easily translated into the SAM setting. It essentially says that a patient with bad asthma control is expected to use the inhaler more often than the patient with good asthma control. \(\delta (a)\) in (C5) can be expressed as \(\delta (a)=q_t^{1,a}/[\{{\xi }_G(y_t=y^M)/{\xi }_B(y_t=y^M)\} (1-q_t^{1,a})+q_t^{1,a}]\). Therefore, (C5) implies that being in a worse state yields a higher probability of intervention than being in a better state when \(a=W\). In other words, when the SAM system stays silent (no alert), the patient with poorly controlled asthma is more likely to see his/her care provider than the patient in a good asthma control state. Considering that worsening asthma control typically triggers more asthmatic symptoms than properly controlled asthma does, both (C4) and (C5) are justifiable conditions in the SAM context.

Our final result (AR4) proves that the optimal alerting policy of the SAM system under our POMDP model can be defined by a threshold. In the SAM application, the threshold-type policy means that the optimal policy is simply dividing the entire belief space by several planes (or straight lines). In addition, AR4 also suggests that the maximum alert-triggering probability for a patient with a low trust level is always greater than or equal to the one for a patient whose trust level is high (see Corollary 1 in Appendix A).

Figure 4 illustrates the insights from AR4 on the two-dimensional belief space, because the entire three-dimensional belief state space \({{\varvec{\Pi }}}\) can be represented by a two-dimensional surface. For instance, suppose the SAM system estimates the current belief \({\pi }_{example}\) for a specific patient. This patient is believed to be in a good asthma control state with a high trust level with a probability of 0.35, that is, \(\pi (0)=0.35\). Similarly, the probability of this patient being in the state of good asthma control with low trust is 0.31. Therefore, the remaining probability (the probability associated with bad asthma control) is simply 0.34. Now, \({\pi }_{example}\) is in the alert region; hence, the optimal action for the SAM system is to alert the patient on the basis of the threshold-type alerting policy derived from our POMDP model. Furthermore, AR4 suggests that \(\pi _{GH}^*\) (the maximum alert-triggering probability for a patient in the GH state) cannot be greater than \(\pi _{GL}^*\), as shown in Fig. 4. In other words, the system needs to alert a patient more conservatively when the patient is believed to trust the SAM system. For instance, a patient with high trust in the SAM system would receive an alert if the estimated probability of good asthma control fell below \(\pi _{GH}^*\) (say, 0.7), but a patient with a low trust level would be alerted when the estimated probability of being in good asthma control fell under \(\pi _{GL}^*\) (say, 0.95). Because the threshold of 0.95 (\(\pi _{GL}^*\)) is higher, and thus triggers more easily, than 0.7 (\(\pi _{GH}^*\)), the patient with low trust in the system is expected to receive alerts more often than the patient with a high trust level. In the next section, we shed more light on the analytical results through a series of computational studies coupled with real-world SAM data to provide an in-depth discussion of practical insights and implications.
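As an illustration of such a threshold-type policy, the sketch below encodes a straight-line alert boundary through the two corner thresholds \(\pi _{GH}^*\) and \(\pi _{GL}^*\), using the hypothetical values 0.7 and 0.95 quoted above. The linear shape of the boundary is itself an assumption for illustration; the actual boundary comes from solving the POMDP.

```python
# Hypothetical corner thresholds (AR4 requires PI_GH_STAR <= PI_GL_STAR).
PI_GH_STAR = 0.70   # max alert-triggering probability of pi(0) (GH, high trust)
PI_GL_STAR = 0.95   # max alert-triggering probability of pi(1) (GL, low trust)

def alert(pi_gh, pi_gl):
    """Alert iff the belief lies on or below the line joining (PI_GH_STAR, 0)
    and (0, PI_GL_STAR). A linear boundary is assumed for illustration only;
    it is not the solved optimal policy."""
    return pi_gh / PI_GH_STAR + pi_gl / PI_GL_STAR <= 1.0
```

For the example belief with \(\pi (0)=0.35\) and \(\pi (1)=0.31\), this rule places the patient in the alert region, consistent with \({\pi }_{example}\) above.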

Fig. 4
figure 4

Summary of analytical results

5 Computational study based on the SAM data

Computational studies are widely adopted in various fields for comparative experiments between a newly proposed approach and a baseline method [7]. They are especially useful in healthcare applications because a numerical experiment does not raise ethical concerns about using a method that is not feasible in real-world clinical trials [39]. For instance, the comparison between our alerting policy and the simple no-alert policy (baseline method) would be challenging to conduct on actual patients using the SAM system.

Many parameters in our POMDP model are directly estimated from real-world SAM data. The SAM data come from our industry collaborator who conducted a beta test in a mid-sized city in the United States from March 2014 to December 2017. A total of 326 adult patients diagnosed with asthma participated in the study. Among them, 31.6% (103 patients) were male and 68.4% (223 patients) were female. The gender imbalance present in our data agrees with the findings in clinical literature, as asthma is known to have a higher prevalence in women than in men [46]. Of the 326 patients, 68.7% were White/Caucasian, 18.1% were African American, and 13.2% were classified as Others, including Asian and Native American. The city where the study was conducted is not a place with a diverse population (88% of the residents are White); hence, the ethnicity imbalance in the data is inevitable because it was an open-participation study. The average number of inhaler uses per day was 0.23, with a standard deviation of 1.06, and about 77% of the participants had their asthma well controlled at the time of enrollment.

Table 2 Parameter specifications

A list of our initial parameter specifications is shown in Table 2. The observation probabilities in Table 2a are estimated from the real-world SAM data using the inference method described in [50]. The asthma control state transition probabilities listed in Table 2b are also estimated from the data, with one exception: the transition probability from bad asthma control to good asthma control after an intervention could not be estimated directly because the SAM system lacks the capability to assess the intervention effect. Thus, we assume that a patient with poorly controlled asthma will regain good asthma control through a proper clinical intervention with a probability of 0.9. This probability is adopted from a clinical study by Bateman et al. [10], who showed that 90% of patients achieved well-controlled asthma after receiving a proper intervention.

The trust state transition probabilities for all three cases of trust updating (trust gain, no updating, and trust loss) are in Table 2c. Tables 2d and 2e show our parameter specifications for the intervention probabilities and disutility values, respectively. Those parameters are chosen to reflect the insights obtained from the literature and can be adjusted to fine-tune the model for the specific chronic condition of interest. For instance, the preset disutility values in Table 2e can be (and should be) adjusted for each patient. The degree of mental fatigue induced by the system’s alerts may differ across patients; therefore, the SAM system administrator should be able to change the value of \(\phi _{a=A}\) according to his/her own understanding of each patient. In the next subsection, we adjust some parameters, especially the ones that are not directly estimated from the data. By doing so, we not only conduct a sensitivity analysis but also highlight useful insights addressing various practical issues in the SAM application.

Throughout the computational analysis, a day is used as the time epoch t, and we simplify the observation space to \({\mathbf {Y}}=\{0,1,2^+\}\), where \(2^+\) refers to two or more inhaler uses per day. From the observation probabilities in Table 2a estimated from the SAM data, we can see that, for a patient with good asthma control, it is very unusual to observe more than two inhaler uses per day. Therefore, these three observation categories provide enough differentiating power. This inhaler usage pattern is commonly observed in quantitative asthma studies [51]. The goal of designing the SAM system’s alerting policy is to maximize the reward function defined as \(r_t(s,a,o)=1-\phi _a-\phi _s-\phi _s^o\), which can be translated into QALDs.

5.1 Computational experiments under various scenarios

As mentioned, some parameters in Table 2 used for solving the POMDP model are adopted from the relevant literature rather than directly estimated from the data. Therefore, here, we change those parameters to further confirm our analytical findings and to discuss some interesting issues of the SAM system.

First, patients may show different behaviors toward clinical interventions. Some patients are reluctant to seek interventions because they do not want to disturb their everyday life routines (intervention-averse patients). By contrast, some other patients seek interventions actively (intervention-seeking patients) because, for them, the fear of having poorly controlled asthma overshadows the hassle of going through an intervention. We should be able to study this issue by adjusting the intervention probabilities. For instance, for a patient in the GH state (good asthma control with high trust level), the initial intervention probability was set to 0.9 when \(a=A\) (Alert) in Table 2d. An intervention-averse patient should exhibit much lower intervention probability than 0.9. Similarly, we can adjust the intervention probabilities of patients in other states to reflect patients’ heterogeneous behavior toward intervention.

Second, patients may have different tolerance levels toward receiving alerts from the SAM system; that is, some patients can be annoyed by alert messages much more than some other patients who are willing to tolerate receiving a few alerts. The issue of different alert fatigue levels can be studied by adjusting disutility values in Table 2e. For instance, the initial value set for disutility caused by receiving an alert from the SAM system was relatively small (0.01). To reflect the inflated disutility for alert-averse patients, we can increase the value to a number much larger than 0.01, say 0.1. It should be noted that 0.1 is a substantial disutility because it is roughly translated into a 10% quality loss of the patient’s entire day.

Lastly, we investigate another interesting question: What if the primary goal of the system administrator (company) were to retain customers (patients) using its service? In other words, for the company, having patients in the GL state (good asthma control but low trust level) could be as bad as having them in the BL state (bad asthma control and low trust level), although the GL state is much better for patients than the BL state. By making the disutility value associated with the GL state comparable to the one for the BL state, we study how the alerting policy changes when the company focuses primarily on retaining as many customers as possible rather than on improving the health outcomes of the patients. This situation is undesirable from the perspective of asthma care yet still possible. Therefore, our analysis may reveal a potential risk in the HIS industry.

Various parameter specifications used in the computational experiments are listed in Tables 3 and 4. The two scenarios in Table 3, (S1) and (S2), represent intervention-averse and intervention-seeking patients, respectively. As shown, (S1) has a significantly reduced probability of intervention compared to the base case (S0), whereas (S2) assumes an inflated probability to reflect patients’ active intervention-seeking behavior.

Table 3 List of intervention probabilities changed for the experiments

In addition to (S1) and (S2) in Table 3, we introduce three more practical scenarios in Table 4.

  • (S3) High penalty on receiving alerts It assumes that the patient may suffer from psychological discomfort triggered by receiving alerts from the system, and the alert fatigue leads to a noticeably reduced quality of life (increased \(\phi _{a=A}\)).

  • (S4) High penalty on clinical visits It assumes that what patients find bothersome is the intervention rather than receiving alert messages. For this scenario, we increased the disutilities associated with a clinic visit (\(\phi _{s\in \{0,1\}}^{o=s^C}\) and \(\phi _{s=2}^{o=s^C}\)).

  • (S5) High penalty on trust loss It penalizes being in the GL state as strongly as being in the BL state, reflecting the perspective of the company.

Table 4 List of disutility values changed for the experiment

The optimal alerting policies for all six scenarios, including the base case, are shown in Fig. 5. In Fig. 5, we observe that the alert region (shaded area) is slightly larger for intervention-averse patients (S1) compared to the base case (S0). However, the changes in both thresholds are not noticeably large. This means that, when the patients enrolled in the SAM program are believed to be intervention averse, the system does not need to change its alerting policy significantly. By contrast, we see that the alert region shrinks noticeably for intervention-seeking patients (S2). In the SAM application, an intervention not only affects the transition from a bad asthma control state to a good asthma control state but also provides an opportunity for patients to evaluate the performance of the system and update their trust level accordingly. Hoff and Bashir [22] suggest that the reliability and performance of an information system are especially crucial if the system provides detailed feedback to users. In other words, patients who frequently visit the clinic have more opportunities to evaluate the performance of the SAM system; hence, the system should worry more about making mistakes such as false alerts. In contrast, the performance of the system becomes less influential when the patient is intervention averse, which partially explains why we observed negligible changes in the alerting policy under scenario (S1). When patients are assumed to experience significant discomfort from either receiving alerts (S3) or going through interventions (S4), the SAM system alerts patients cautiously to avoid excessive disutility. Lastly, if the company wants to keep its customers (patients) in a high-trust state, that is, (S5) in Table 4, the system should apply significantly more conservative alerting rules for patients who trust the system, which is reflected by the small alert region shown in Fig. 5.
Interestingly, for the same scenario (S5), the probability threshold for low-trust patients is comparable to the one under the base setting (S0). That means, when the SAM system believes that the patient is already in a low-trust state, it would alert those low-trust patients without worrying about false alarms.

Fig. 5
figure 5

Optimal alerting policies under various scenarios

5.2 Comparative discussion on the current practice of asthma management

In the current practice of asthma management, various medical institutes and government agencies have created asthma action plans so that patients may assess their asthma control level by referring to them [51]. These asthma action plans have become the standard way to provide guidance to asthma patients [45]. Therefore, it is plausible for the company that provides the SAM service to adopt one of the action plans and simply use it as its alerting policy. We would like to clarify that asthma action plans are not based solely on inhaler usage. They often involve many other symptoms and health indicators, such as the peak expiratory flow. Unfortunately, the current SAM system is unable to collect such diverse biomarkers from patients because the data acquisition is remote and wireless. Therefore, we adopt only the guidelines on inhaler use.

We consider two asthma action plans that are implementable in the SAM system. The American Lung Association [4] suggests that it may indicate poorly controlled asthma if the patient had to use his/her rescue inhaler more than twice a week. We denote this action plan as the ALA Plan. Similarly, the Asthma Society of Canada [5] recommends that patients contact their care provider when the number of rescue inhaler uses exceeds four times a week, and we denote this action plan as the ASC Plan. The misalignment among various asthma action plans is caused by the complex nature of asthma and, to the best of our knowledge, there is no single best asthma action plan. We also include two more benchmark policies. First, we design a conservative action plan in which intervention is recommended for patients whose number of inhaler uses per week is greater than or equal to 7, i.e., once a day on average. We denote this action plan as the 7PW (seven-per-week) Plan. Second, we include the no-alert policy, denoted as the NO Plan, as a baseline.
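Because each benchmark plan reduces to a weekly-count rule on rescue inhaler use, they can be sketched as follows ("more than twice" and "more than four times" a week translate to weekly totals of at least 3 and at least 5, respectively; the function names are ours):

```python
# Rule-based benchmark alerting plans as weekly-count thresholds on inhaler use.
def make_weekly_plan(min_uses_to_alert):
    """Return a rule that recommends contacting a care provider once the
    rescue-inhaler count over the past week reaches the given total."""
    def plan(daily_uses_last_7_days):
        return sum(daily_uses_last_7_days) >= min_uses_to_alert
    return plan

ala_plan = make_weekly_plan(3)   # ALA Plan: more than twice a week
asc_plan = make_weekly_plan(5)   # ASC Plan: more than four times a week
pw7_plan = make_weekly_plan(7)   # 7PW Plan: seven or more uses per week
```

Unlike the POMDP policy, these rules ignore the belief about the patient's underlying control and trust states; they react only to the raw usage count.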

We present the QALDs and the number of days in good asthma control for each alerting policy (mean values computed from 10,000 numerical experiments) in the left-hand graph in Fig. 6. The numerical simulation was carried out with a correlated gamma-based hidden Markov model specifically designed for asthma management applications and well validated against real-world health data [50]. The computational experiments were conducted under the base-case scenario (S0).

Fig. 6 Comparison between the optimal alerting policy and asthma action plans

Our alerting policy, which accounts for trust-dependent patient adherence, yields the highest QALDs among the benchmark plans. The 7PW Plan performs about as poorly as the NO Plan because it is nearly as conservative as the no-alert policy. Although the QALD obtained by implementing our alerting policy is the highest, the ALA Plan seems to perform comparably to our approach. In fact, on average, the ALA Plan yields slightly more days with good asthma control (1761.665) than our alerting policy (1761.426). Therefore, in the right-hand graph in Fig. 6, we examine our alerting policy and the ALA Plan more closely on the basis of the number of misdetection days (undetected bad asthma control days) and the number of alerts. We observe that the ALA Plan sends out substantially more alerts than our alerting policy, indicating that the ALA Plan is a very sensitive alerting policy. In other words, aggressive alerting is the ALA Plan's strategy for minimizing misdetection. Conversely, our alerting policy provides a comparable number of good asthma control days while sacrificing only a few more misdetections (39.568 vs. 43.371). This means that an alerting strategy that accounts for trust-dependent patient adherence can ensure high QALDs while minimizing alert fatigue. If we focused solely on maximizing the number of good control days, some existing asthma action plans might achieve that goal. However, properly considering the interaction dynamics between patients and the SAM system can reduce alert fatigue substantially while obtaining a comparable (if not better) health outcome. Lowering alert fatigue is important because patients who have accumulated excessive alert fatigue may abandon the SAM system, which is an undesirable scenario for both the company and the patients.

6 Practical implication and discussion

We developed a specialized POMDP model and derived the optimal alerting policy considering trust-dependent patient adherence. The mathematical model may not capture every aspect of the complex patient-HIS relationship, but the analytical process shown in this paper provides valuable and practical insights. In the conventional physician-patient relationship, it is well known that poor adherence is one of the primary reasons for suboptimal clinical benefit [56]. The same issue is present in the HIS-patient interaction. As the healthcare community adopts analytical methods more than ever before [51], it has become clear that patients' adherence to the recommendations of an analytics-based system is crucial for ensuring a satisfactory health outcome. Our study provides actionable insights to promote better healthcare practice using the HIS.

We shed light on the importance of considering the adherence issue in HIS applications. The best health outcome can be obtained only by carefully studying patients' adherence behavior and improving the algorithm embedded in the HIS. We numerically showed the potential benefit of bringing the concept of trust into HIS design for improving patients' overall quality of life. Our study also provides a guideline for alerting patients: patients with a high trust level need to be alerted cautiously, whereas a more aggressive alerting strategy is acceptable for patients with a low trust level. Although this insight is consistent with the existing findings in [8], we further explain the mechanism through the lens of patients' trust. Every alert carries a risk of being a false alarm and eroding the patient's trust; at the same time, there is a risk of misdetection whenever the system remains silent. In this study, we emphasized that optimally balancing this trade-off is crucial for maximizing patients' quality of life. This insight should be helpful for designers developing a HIS in various healthcare domains.
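The guideline above can be illustrated as a trust-dependent decision rule: the system alerts only when its belief that the patient is in poor asthma control exceeds a threshold that rises with the patient's estimated trust level, so high-trust patients are alerted more cautiously. This is a hypothetical sketch of that qualitative insight; the function names, the linear threshold form, and all numerical values are illustrative, not the fitted parameters or the optimal policy of our POMDP model.

```python
# Hypothetical trust-dependent alerting rule. A higher trust level raises the
# belief threshold required to trigger an alert, so high-trust patients are
# alerted cautiously while low-trust patients are alerted more aggressively.
# The linear form and the constants (base=0.5, slope=0.4) are illustrative.

def alert_threshold(trust: float, base: float = 0.5, slope: float = 0.4) -> float:
    """Belief threshold for alerting; increases with the patient's trust level."""
    if not 0.0 <= trust <= 1.0:
        raise ValueError("trust must lie in [0, 1]")
    return base + slope * trust

def should_alert(belief_bad_control: float, trust: float) -> bool:
    """Alert only if the belief of poor asthma control exceeds the threshold."""
    return belief_bad_control > alert_threshold(trust)
```

For example, with a belief of 0.7 that the patient is in poor control, this rule alerts a low-trust patient (trust 0.2, threshold 0.58) but stays silent for a high-trust patient (trust 0.9, threshold 0.86), reflecting the trade-off between misdetection risk and the cost of false alarms to accumulated trust.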

It should be noted that the actual implementation of our method in healthcare practice must be done with caution because, despite its promising features, our model has some limitations. First, the parameter specifications in our model are justified only in the context of asthma management. We acknowledge that our assumptions may not hold in applications for managing different medical conditions. For instance, a HIS for Type-2 diabetes management may collect blood glucose levels through sensors and wireless communication channels. A model tailored to the symptomatology of diabetes, which differs vastly from that of asthma, may need to be designed. Second, we focus primarily on trust in investigating patient adherence behavior. Other factors are involved in the interaction between patients (users) and the HIS. According to the survey results, most users of the SAM system operated by our industry collaborator are satisfied with the accessibility (ease of use) and usefulness of the system; hence, those factors are not considered in our study. However, there might be considerable variance among patients in the perceived quality of the system, especially for a newly developed HIS. In such a case, it might be worthwhile to study factors other than trust as well. Furthermore, we ignore any external forces that might affect patient behavior. For instance, if a patient recently witnessed a friend having a positive experience with the same HIS, the patient's adherence behavior toward alerts might be positively influenced. If patients' trust is heavily influenced by factors unknown to the model, the performance of the analytics method is expected to decrease. We conducted a sensitivity analysis on this aspect and report the results, including the potential negative impact, in Appendix B.
Our study aims to provide practical insights through an analytical lens. Extending the model to more general and complex scenarios can quickly make it analytically intractable; hence, a machine learning-based approach might be a plausible alternative.

7 Conclusion

In addition to making advancements in health technologies and analytics, a HIS must be designed with the human-technology relationship in mind to successfully realize desirable health outcomes. In the SAM context, the HIS may ask a patient to seek a clinical follow-up based on the patient's inhaler usage log collected by the IoT-enabled personal inhaler. The performance (accuracy) of the data-driven algorithms that detect undesirable asthma exacerbations is crucial. However, it is also important to note that the recommendation provided by the system may or may not be accepted by the patient. Furthermore, patient adherence to the recommendation may change over time as the patient's trust in the SAM system dynamically evolves based on the patient's experience of using the system. We analytically and numerically showed that considering trust-dependent patient adherence can be critical in the IoT-enabled HIS context.

Our study makes both methodological and practical contributions. We developed a specialized POMDP model and derived the optimal alerting policy considering trust-dependent patient adherence. The mathematical model may not capture every aspect of the complex patient-HIS relationship, but the analytical process shown in this paper provides valuable insights. The optimal alerting policy has desirable structural properties that can be easily translated into actionable guidelines in practice owing to its simple threshold-type characteristics. We show that the best alerting strategy may vary depending on the patient's trust level, and we demonstrate that alert fatigue can be significantly reduced by considering such information in the system design.

A set of studies extending our current research is needed in the future. First, it is important to find ways to accurately estimate the key parameters, such as the asthma control state and trust state transition probabilities. The current SAM system is not capable of collecting data on various biomarkers that are measured only at clinical facilities by medical experts. This is a well-known limitation of HISs based on remote patient monitoring technologies [50]. Similarly, accurate inference of the trust state transition probabilities is also challenging with the current SAM infrastructure. Taking advantage of recent advancements in healthcare technologies, a HIS can now be integrated with one or more databases to create a set of comprehensive electronic health records (EHRs). Many firms in the health information service domain, including our industry collaborator, are currently trying to merge their HIS with the EHR systems at local hospitals to achieve the aforementioned goals.

Overall, it seems clear that focusing solely on methodological advancement is not enough to ensure the best possible health outcome from a HIS. Fueled by technical improvements in artificial intelligence, many algorithms have been developed and tested in diverse healthcare applications, marking a quantum leap in algorithmic development. However, we may want to put more effort into understanding how stakeholders (patients, physicians, insurers, and companies) interact with the HIS. Without a deeper understanding of this matter, we will encounter various issues in having the HIS successfully implemented in real-world healthcare practice. By studying the complex relationship between the HIS and its stakeholders, advancements in healthcare technologies may truly transform the current practice of medicine.