1 Introduction

Run-time Verification (RV) [17] has become a crucial element in developing Cyber-Physical Systems (CPSs) [32, 40, 42]. In RV, a monitor checks the current execution, that is, a finite prefix of an infinite path, against a given property, typically expressed in Linear Temporal Logic (LTL) [23], that represents a set of acceptable infinite paths. If every infinite extension of a prefix belongs (resp. does not belong) to the set of infinite paths that satisfy the property, the monitor accepts (resp. rejects) the prefix. For example, \(\varphi _{\mathsf {F}}:\Diamond e\) (resp. \(\varphi _{\mathsf {G}}:\Box \lnot e\)) is satisfied (resp. not satisfied) on any infinite path with the prefix \(u_1: \lnot e,\lnot e, e\). Whenever the monitor is not able to reach a verdict on the given prefix \(\pi \), because \(\pi \) can be extended both to satisfy and to violate the property, the monitor outputs unknown [3]. For example, the prefix \(u_2: \lnot e,\lnot e\) can be extended both to a path that satisfies \(\varphi _{\mathsf {F}}:\Diamond e\) (e.g., any extension of \(u_1\)) and to a path that violates \(\varphi _{\mathsf {F}}\) (e.g., \((\lnot e)^{\omega }\)).

If the property is monitorable [12], the monitor is able to reach a verdict with some finite extension of the prefix. In this paper, we estimate the finite extensions of a prefix using a prediction model. The prediction model is trained from independent and identically distributed (iid) samples of previous execution paths of the system. We use Hidden Markov Models (HMMs) [26] to realize a prediction model of a system with partially observable behaviour.

We focus on properties that can be evaluated with regular extensions, that is, extensions expressible by a Deterministic Finite Automaton (DFA). Depending on the given property, the extensions may specify the prefixes that satisfy the property (good extensions) or violate it (bad extensions). We use an upper bound on the length of the estimated extensions. The monitor in our framework is the result of a bounded reachability analysis on the product of an HMM and a DFA. Using the product model, the monitor is able to predict a verdict, in terms of the probability of the extensions that satisfy or violate the property. To extend an execution path, the monitor needs to know the current state, which is estimated at run-time by the Viterbi algorithm [38]; the Viterbi algorithm computes the most likely hidden state for a given observation sequence.

We implemented our approach as a proof-of-concept tool, called \(\mathcal {P}revent\) (Predictive Runtime Verification Framework), and report on using it in two case studies: the original and a modified version of the randomized dining philosophers algorithm, and the QNX Neutrino [24] kernel traces. In summary, we make the following contributions:

  • introduce \(\mathcal {P}revent\), a predictive runtime verification framework to detect satisfaction/violation of a property based on partial execution,

  • present a methodology for constructing a prediction model, that is, the product of a trained HMM and a DFA specifying good and bad extensions,

  • define the prediction error on a partial trace and evaluate the monitor performance using hypothesis testing,

  • implement the runtime monitoring algorithm using Viterbi approximation,

  • evaluate \(\mathcal {P}revent\) with two case studies: a modified version of the randomized dining philosophers problem and the flight control of a hexacopter.

The rest of the paper is structured as follows: in Sect. 2, we give an overview of \(\mathcal {P}revent\), and in Sect. 3 we introduce the necessary preliminaries. In Sects. 4 and 5, we provide the details of, respectively, constructing the monitor and the run-time monitoring algorithm. We define a measure to assess the prediction accuracy and validate the performance of the monitor using hypothesis testing in Sect. 6. We provide the empirical evaluation of \(\mathcal {P}revent\) on two case studies in Sect. 7, discuss related work in Sect. 8, and conclude in Sect. 9.

2 An Overview of \(\mathcal {P}revent\)

The key idea in \(\mathcal {P}revent\) is to finitely extend the execution trace using a prediction model, and to check the extended path against the specification property. The prediction model is obtained from iid sample traces collected from past executions of the system. The prediction model enables the monitor to estimate the extensions that satisfy or violate the given property within a finite horizon, i.e., the maximum length of the finite extensions. This gives the monitor the ability to detect a property violation before its occurrence.

An overview of \(\mathcal {P}revent\) is shown in Fig. 1. The two main components of \(\mathcal {P}revent\), learning and monitoring, are described below:

Learning. We use the sample traces to train an HMM using the Baum-Welch algorithm [26]. The training samples are assumed to be independent and identically distributed (iid) over the execution traces of the system. The trained HMM represents the joint distribution of the paths over \(\varSigma ^*\) and \(S^*\), where \(\varSigma \) is the observation space and S is the state space of the system.

Fig. 1. The overview of the \(\mathcal {P}revent\) framework.

Monitoring. The monitor in our framework is the result of a bounded reachability analysis on the product of the HMM and the DFA that specifies the acceptable or unacceptable extensions for the property. The monitor is implemented as a lookup table. Each entry maps a composite state, consisting of a DFA state, a hidden state of the HMM, and an observation, to the probability that, from that state, the system satisfies or violates the property within a bounded number of steps. The current hidden state encodes the history of the previous observations (the prefix Y in Fig. 1). The monitor updates its estimation of the current state by running the Viterbi approximation to obtain \((\mathcal {H}\times \mathcal {A})_Y\). The output of the monitor is therefore \(Pr((\mathcal {H}\times \mathcal {A})_Y\models \Diamond ^{\le h}Accept)\), where h is the finite horizon, i.e., the maximum length of the extensions estimated by the monitor. Since \(\mathcal {H}\times \mathcal {A}\) has a small size, the probability results of the reachability analysis can be computed off-line for all states of \(\mathcal {H}\times \mathcal {A}\) and for all \(1\le h \le H_{MAX}\), and stored in a table. The value of \(H_{MAX}\) represents the maximum length of the extensions that the monitor needs to predict the evaluation of the property, and can be obtained empirically from the execution samples.

3 Preliminaries

In this section, we briefly introduce the definitions and notation used in the rest of the paper.

A probability distribution over a finite set S is a function \(P: S \rightarrow [0,1]\) such that \(\sum _{s \in S}P(s) = 1\). We use \(X_{1:\tau }\) to denote a sequence \(x_1,x_2,\dots ,x_\tau \) of values of a random variable X, and we use u and w to denote a finite and an infinite path, respectively.

Hidden Markov Model (HMM): An HMM defines the joint distribution over \(X_{1:\tau }\), the sequence of hidden state variables, and \(Y_{1:\tau }\), the sequence of observations (both of identical length). The joint distribution is such that \(Pr(y_i\mid X_{1:i}, Y_{1:i-1}) = Pr(y_i\mid x_i)\) for \(i\in [1..\tau ]\) (the current observation is conditioned only on the current state), and \(Pr(x_i\mid X_{1:i-1}, Y_{1:i-1}) = Pr(x_i\mid x_{i-1})\) for \(i\in [2..\tau ]\) (the current state is conditioned only on the previous state). We use \(\pi \) to denote the initial probability distribution over the state space, i.e., \(Pr(x_1) = \pi (x_1)\). As a result, an HMM can be defined with three probability distributions:

Definition 1

(HMM). A finite discrete Hidden Markov Model (HMM) is a tuple \(\mathcal {H}:(S, \varSigma , \pi , T, O)\), where S is the non-empty finite set of states, \(\varSigma \) is the non-empty finite set of observations, \(\pi : S \rightarrow [0,1]\) is the initial probability distribution over S, \(T: S\times S \rightarrow [0,1]\) is the transition probability, and \(O:S\times \varSigma \rightarrow [0,1]\) is the observation probability. We use \(\varTheta _{\mathcal {H}}\) to denote \(\pi \), T, and O.
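To make the notation concrete, the following minimal Python sketch shows one way to store such an HMM; the container and field names (initial, trans, obs) are ours, not part of the paper's notation, and the numbers in the toy instance are purely illustrative.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class HMM:
    """Finite discrete HMM (S, Sigma, pi, T, O); states and observations
    are indexed by integers 0..|S|-1 and 0..|Sigma|-1."""
    initial: np.ndarray  # pi, shape (|S|,), sums to 1
    trans: np.ndarray    # T,  shape (|S|, |S|), each row a distribution
    obs: np.ndarray      # O,  shape (|S|, |Sigma|), each row a distribution

# A toy two-state, two-observation instance (hypothetical numbers).
toy_hmm = HMM(initial=np.array([1.0, 0.0]),
              trans=np.array([[0.7, 0.3], [0.4, 0.6]]),
              obs=np.array([[0.9, 0.1], [0.2, 0.8]]))
```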

Discrete-Time Markov Chains (DTMC): We use Discrete-Time Markov Chains (DTMCs) for the reachability analysis necessary to construct our monitor.

Definition 2

(DTMC). A Discrete-Time Markov Chain (DTMC) is a tuple \(\mathcal {M}:(S, \varSigma , \pi , \mathbf {P}, L)\), where S is a non-empty finite set of states, \(\varSigma \) is a non-empty finite alphabet, \(\pi : S \rightarrow [0,1]\) is the initial probability distribution over S, \(\mathbf {P}: S\times S \rightarrow [0,1]\) is the transition probability, such that for any \(s \in S\), \(\mathbf {P}(s, \cdot )\) is a probability distribution, and \(L:S\rightarrow 2^{\varSigma }\) is the labeling function.

Deterministic Finite Automaton: We use Deterministic Finite Automata (DFAs) to describe the extensions of a prefix.

Definition 3

(DFA). A Deterministic Finite Automaton (DFA) is a tuple \(\mathcal {A}:(Q, \varSigma ,\) \( \delta , q_I, F)\), where Q is a finite set of states, \(\varSigma \) is a finite alphabet, \(\delta :Q\times \varSigma \rightarrow Q\) is a transition function determining the next state for a given state and symbol in the alphabet, \(q_I\in Q\) is the initial state, and \(F\subseteq Q\) is the set of final states (\(\mathcal {L}(\mathcal {A}) \subseteq \varSigma ^*\) denotes the language of \(\mathcal {A}\)).
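A DFA can likewise be kept as a transition dictionary, as in the sketch below; the two-state automaton is a hypothetical example (it accepts every word containing the symbol 1) and is not one of the DFAs used later in the case studies.

```python
from dataclasses import dataclass

@dataclass
class DFA:
    """DFA (Q, Sigma, delta, q_I, F) over integer-encoded symbols."""
    delta: dict           # (state, symbol) -> next state
    initial: int          # q_I
    accepting: frozenset  # F

    def step(self, q, symbol):
        return self.delta[(q, symbol)]

# Toy DFA over {0, 1} accepting every word that contains symbol 1.
toy_dfa = DFA(delta={(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 1},
              initial=0,
              accepting=frozenset({1}))
```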

4 Monitor Construction

A monitor is a finite-state machine (FSM) that consumes the output of the system execution sequentially, and produces the evaluation of a given property at each step, typically as a Boolean value [4]. The monitor in our framework is still an FSM, in the form of a look-up table, but it produces a value in [0, 1] instead of a Boolean verdict. The value indicates the probability of the extensions that satisfy or violate the specification, given that the property has not yet been satisfied or violated. These probability values are the result of a bounded reachability analysis on the product of the trained HMM and the DFA.

In the rest of this section, we describe how an HMM is built using the standard Expectation-Maximization (EM) learning technique [6] (Sect. 4.1), describe the product of the HMM and the DFA as a DTMC on which the reachability analysis is performed (Sect. 4.2), and present our monitor construction approach (Sect. 4.3).

4.1 Training HMM

We use the Maximum Likelihood Estimation (MLE) technique [29] to train an HMM. The log-likelihood function \(L(\varTheta )\) of the HMM \(\mathcal {H}:(S, \varSigma , \pi , T, O)\) over an observation sequence \(Y_{1:\tau }\) is defined as \(L(\varTheta )= \log (\sum _{X_{1:\tau }} Pr(X_{1:\tau },Y_{1:\tau }\mid \varTheta ))\).

Since the probability distribution over the state sequence \(X_{1:\tau }\) is unknown, \(L(\varTheta )\) does not have a closed form [37], leaving the training to heuristics such as EM. One well-known EM technique for training an HMM is the Baum-Welch algorithm (BWA) [26], in which the training alternates between estimating the distribution over the hidden state variable, \(Q: X\rightarrow [0,1]\), with some fixed choice of \(\varTheta \) (Expectation), and maximizing the log-likelihood to estimate the values of \(\varTheta \) while fixing Q (Maximization) [28].

The Expectation phase in BWA computes \(Pr(X_t=s\mid Y,\varTheta )\) and \(Pr(X_t=s, X_{t+1}=s'\mid Y, \varTheta )\) for \(s, s'\in S\) through the forward-backward algorithm [26]. Maximization is performed on a lower bound of \(L(\varTheta )\) obtained from Jensen’s inequality: \(L(\varTheta )\ge \sum _{X_{1:\tau }}Q(X)\log Pr(X_{1:\tau },Y_{1:\tau }\mid \varTheta )-\sum _{X_{1:\tau }}Q(X)\log Q(X)\).

Since the second term is independent of \(\varTheta \) [28], only the first term is maximized in each iteration: \(\varTheta ^{(k)} = {\mathop {\mathrm {argmax}}\nolimits _{\varTheta }} \sum _{X_{1:\tau }}Q^{(k)}(X)\log Pr(X_{1:\tau },Y_{1:\tau }\mid \varTheta )\), where \(Q^{(k)}\) is the distribution estimated in the Expectation step using \(\varTheta ^{(k-1)}\).

The training starts with random initial values for \(\varTheta ^{(0)}\), and then repeatedly runs the forward-backward algorithm to update the parameters of the model:

$$\begin{aligned} \begin{aligned} \pi ^*(s) = Pr(X_1=s\mid Y,\varTheta )&\qquad T^*(s,s') = \frac{\sum _{t=1}^{\tau -1} Pr(X_t=s, X_{t+1}=s'\mid Y, \varTheta )}{\sum _{t=1}^{\tau -1}Pr(X_t=s\mid Y,\varTheta )}\\ O^*(s,o)&= \frac{\sum _{t=1}^\tau \#(Y_t=o)\cdot Pr(X_t=s\mid Y,\varTheta )}{\sum _{t=1}^\tau Pr(X_t=s\mid Y,\varTheta )} \end{aligned} \end{aligned}$$

BWA is essentially a gradient-descent approach, and thus its outcome is highly sensitive to the initial values of \(\varTheta \) [37].

We use the Bayesian Information Criterion (BIC) [7] to choose the number of hidden states. BIC assigns a score to a model according to its likelihood, but also penalizes models with more parameters to avoid over-fitting: \( BIC(\mathcal {H}) = \log (n)|\varTheta |-2L(\varTheta )\), where \(|\varTheta |=|S|^2+|S||\varSigma |\) and n is the size of the training sample.
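The sketch below illustrates this model-selection loop in Python, assuming the hmmlearn library (whose CategoricalHMM runs Baum-Welch for discrete observations); the function name and the candidate-size loop are ours, and a realistic setup would also restart each fit from several random seeds because of BWA's sensitivity to initialization.

```python
import numpy as np
from hmmlearn import hmm  # assumption: hmmlearn is installed

def train_with_bic(sequences, n_symbols, candidate_sizes):
    """Fit one categorical HMM per candidate number of hidden states and
    keep the model with the lowest BIC(H) = log(n)|Theta| - 2 L(Theta)."""
    X = np.concatenate(sequences).reshape(-1, 1)   # stacked observations
    lengths = [len(s) for s in sequences]          # trace boundaries
    n = sum(lengths)                               # training sample size
    best_bic, best_model = None, None
    for k in candidate_sizes:
        model = hmm.CategoricalHMM(n_components=k, n_iter=100, random_state=0)
        model.fit(X, lengths)
        log_likelihood = model.score(X, lengths)   # L(Theta)
        n_params = k * k + k * n_symbols           # |Theta| = |S|^2 + |S||Sigma|
        bic = np.log(n) * n_params - 2 * log_likelihood
        if best_bic is None or bic < best_bic:
            best_bic, best_model = bic, model
    return best_model
```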

4.2 The Product of the Prediction Model and the Specification

From each state of the trained HMM, the monitor needs to extend the observed execution, u, and predict the expected evaluation of the given property. The extension of u is based on a DFA that specifies good and bad extensions of u. The monitor maintains the configurations of both the DFA and the trained HMM by creating the product of the two models [39, 41]:

Definition 4

(The Product of an HMM and a DFA). Let \(\mathcal {H}=(S, \varSigma , \pi , T, O)\) and \(\mathcal {A}=(Q,\varSigma ,\delta ,q_I,F)\) respectively be an HMM and a DFA. We define the DTMC \(\mathcal {M}_{\mathcal {H}\times \mathcal {A}}:( S'=S\times Q\times \varSigma , \{Accept\}, \pi ',\mathbf {P},L)\) as follows:

$$\begin{aligned} \begin{aligned} \pi '(s,q,o)={\left\{ \begin{array}{ll} \pi (s) &{} \text {if } q = q_I\\ 0 &{} \text {otherwise} \end{array}\right. }&\qquad L(s,q,o) ={\left\{ \begin{array}{ll} \{Accept\} &{} \text {if } q\in F\\ \emptyset &{} \text {otherwise} \end{array}\right. } \\ \mathbf {P}((s,q,o),(s',q',o'))&= {\left\{ \begin{array}{ll} T(s,s')\cdot O(s',o') &{} \text {if } \delta (q,o)=q'\\ 0 &{} \text {otherwise} \end{array}\right. } \end{aligned} \end{aligned}$$
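A direct Python sketch of this construction follows; it takes the HMM matrices and the DFA transition dictionary as plain arrays/dicts, and it reads Definition 4 as the DFA consuming the observation of the source composite state (our reading; the state ordering and helper names are assumptions).

```python
import numpy as np

def build_product(T, O, delta, accepting, q_init, n_symbols):
    """Build the transition matrix of the DTMC H x A over composite
    states (s, q, o), and return the indices labelled Accept."""
    n_hidden = T.shape[0]
    dfa_states = sorted({q for (q, _) in delta} | set(delta.values()) | {q_init})
    composites = [(s, q, o) for s in range(n_hidden)
                            for q in dfa_states
                            for o in range(n_symbols)]
    idx = {c: i for i, c in enumerate(composites)}
    P = np.zeros((len(composites), len(composites)))
    accept = [i for c, i in idx.items() if c[1] in accepting]
    for (s, q, o), i in idx.items():
        q_next = delta[(q, o)]                       # DFA reads the observation o
        for s_next in range(n_hidden):
            for o_next in range(n_symbols):
                # next hidden state s' and its emitted observation o'
                P[i, idx[(s_next, q_next, o_next)]] = T[s, s_next] * O[s_next, o_next]
    return P, accept, composites
```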

4.3 Constructing Monitor with Bounded Prediction Horizon

The monitor’s purpose is to estimate the probability of all the finite extensions of length at most h that satisfy a given property. The variable h is a positive integer that we call the prediction horizon. Let \(\sigma _0\sigma _1\cdots \sigma _t\) be an extension of a finite path that ends at the state \(\sigma \) (\(\sigma _0=\sigma \in S'\)), such that \(L(\sigma _t)=Accept\) in the product model \(\mathcal {M}\) (\(\sigma _i=(s_i,q_i,o_i)\) is the composite state of the product model \(\mathcal {M}\), for all \(0\le i\le t\)). The monitor’s output is the probability of all such extensions with \(t\le h\), which is computed by performing the following bounded reachability analysis on \(\mathcal {M}\) [1]: \(Pr(\sigma \models \Diamond ^{\le h} Accept)\).

In order to compute this probability we adopt the transformation from [16]:

$$\begin{aligned} \mathbf {P}_{Acc}(\sigma ,\sigma ') = {\left\{ \begin{array}{ll} 0 &{} \text {if } L(\sigma )=Accept \text { and } \sigma \ne \sigma '\\ 1 &{} \text {if } L(\sigma )=Accept \text { and } \sigma = \sigma '\\ \mathbf {P}(\sigma ,\sigma ') &{} \text {otherwise} \end{array}\right. } \end{aligned}$$
(1)

The transformation (1) allows us to recursively compute \(Pr(\sigma \models \Diamond ^{\le h} Accept)\) as follows: \(Pr(\sigma \models \Diamond ^{\le h} Accept) = \sum _{\sigma '}\mathbf {P}_{Acc}(\sigma ,\sigma ')Pr(\sigma '\models \Diamond ^{\le h-1} Accept)\,\,(2)\), with the base case \(Pr(\sigma \models \Diamond ^{\le 0} Accept)=1\) if \(L(\sigma )=Accept\) and 0 otherwise.

This is essentially the transient probability for \(\sigma _0\cdots \sigma _hw\) [16], that is, starting from \(\sigma _0\) the probability of being at state \(\sigma _h\) (i.e., after h steps), such that \(L(\sigma _h)=Accept\) (\(w\in \varSigma ^\omega \) is any infinite extension of the path). The probability measure of \(\sigma _0\cdots \sigma _hw\) is based on the prefix \(\sigma _0\sigma _1\cdots \sigma _h\) and can be written as the joint probability distribution of the hidden state variable and that of the observation.

Computing (2) for all the states at runtime is not practical, due to multiplications of large and typically sparse matrices [16]. Instead, for all \(t\le h\) we compute the probabilities off-line and store them in a table \(MT(\sigma ,t)\), where \(MT(\sigma ,t)=Pr(\sigma \models \Diamond ^{\le t}Accept)\). Our monitor thus becomes a look-up table of size at most \(O(|S|\times |Q|\times |\varSigma |\times h)\).
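The off-line table computation is a simple dynamic program over the recursion (2); the sketch below assumes the transition matrix P and the Accept indices come from a product construction such as the one sketched in Sect. 4.2, and the names MT and h_max are ours.

```python
import numpy as np

def monitor_table(P, accept, h_max):
    """MT[sigma, t] = Pr(sigma |= <>^{<=t} Accept) for 0 <= t <= h_max."""
    n = P.shape[0]
    P_acc = P.copy()
    P_acc[accept, :] = 0.0            # Eq. (1): make Accept states absorbing
    P_acc[accept, accept] = 1.0
    MT = np.zeros((n, h_max + 1))
    MT[accept, 0] = 1.0               # base case: already in an Accept state
    for t in range(1, h_max + 1):     # Eq. (2), computed bottom-up
        MT[:, t] = P_acc @ MT[:, t - 1]
    return MT
```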

5 Run-Time Monitoring with Viterbi Approximation

For each composite state \(\sigma =(s,q,o)\), the monitor needs to estimate the hidden state s (q is derived deterministically from the observations). We employ the Viterbi algorithm to find the most likely hidden state during monitoring.

For an observation sequence \(Y=Y_{1:\tau }\), the Viterbi algorithm [10, 38] derives \(X^*_{1:\tau }= {\mathop {\mathrm {argmax}}\nolimits _{X_{1:\tau }}}Pr(X_{1:\tau }|Y,\varTheta )\), the so-called Viterbi path. Let \(v_t(s)\) be the probability of the Viterbi path ending with state s at time t: \(v_t(s)=O(s,Y_t)\) \(\max _{s'\in S}(v_{t-1}(s')T(s',s))\,\,(3)\).

To find \(X^*_{t}\) at step t, the monitor only requires \(v_{t-1}(s')\) for all \(s'\in S\). Therefore, we can obtain \(X^*_{t}\) by using only two vectors that maintain the values of \(v_t(s)\) and \(v_{t-1}(s)\) (we call them Viterbi vectors).
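The two-vector update can be written in a few lines of Python; this is a sketch only (the initialisation \(v_1(s)=\pi (s)O(s,Y_1)\) is the usual convention, and in practice the vector would be normalised or kept in log space to avoid underflow on long traces).

```python
import numpy as np

def viterbi_init(pi, O, y0):
    """v_1(s) = pi(s) * O(s, Y_1)."""
    return pi * O[:, y0]

def viterbi_step(v_prev, T, O, y):
    """One on-line update of the Viterbi vector, Eq. (3):
    v_t(s) = O(s, Y_t) * max_{s'} v_{t-1}(s') * T(s', s).
    Returns the new vector and the most likely current hidden state."""
    v = O[:, y] * np.max(v_prev[:, None] * T, axis=0)
    return v, int(np.argmax(v))
```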

Procedure Monitor (runtime monitoring algorithm).

Procedure Monitor demonstrates our runtime monitoring algorithm; a Python sketch of it is given below. We assume that the monitor table MT is already constructed as described in Sect. 4 (line 3). Line 4 initializes the Viterbi vector. The horizon index t stores the prediction horizon at each iteration (initialized to h at the beginning – line 5). Each iteration of the for loop in lines 6–13 is over one observation in the sequence Y. For each observation \(Y_i\), the configuration \((s,q,Y_i)\) (lines 7–8) combined with t gives us the index to retrieve the probability value from the monitor table (line 9). If the path is not accepted by the DFA, the monitor shrinks its horizon index by one (t is decremented – line 10). Each time the observed path is accepted by the DFA, the horizon index is reset to h (line 10), for the prediction of the next extension. Similarly, once the prediction horizon reaches zero, i.e., the property is not satisfied within the given prediction horizon, the horizon index is reinitialized to h. At the end, the Viterbi vector is updated for the next iteration in lines 11–13 according to (3).
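Below is a minimal, self-contained Python sketch of this procedure, combining the table lookup with the on-line Viterbi update; the exact pairing of the DFA state with the current observation follows our reading of the product construction in Sect. 4.2, and all argument names are ours.

```python
import numpy as np

def monitor(Y, MT, idx, pi, T, O, delta, accepting, q_init, h):
    """Sketch of Procedure Monitor: for each observation, estimate the
    hidden state with the Viterbi vector, look the composite state up in
    the precomputed table MT, and manage the horizon index t."""
    q, t = q_init, h
    outputs = []
    v = pi * O[:, Y[0]]                      # initialise the Viterbi vector
    for i, y in enumerate(Y):
        if i > 0:                            # Eq. (3): update the Viterbi vector
            v = O[:, y] * np.max(v[:, None] * T, axis=0)
        s = int(np.argmax(v))                # most likely hidden state
        outputs.append(MT[idx[(s, q, y)], t])
        q = delta[(q, y)]                    # DFA consumes the observation
        if q in accepting or t == 0:         # prefix accepted, or horizon exhausted
            t = h                            # reset the horizon index
        else:
            t -= 1                           # shrink the horizon index
    return outputs
```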

In each monitoring iteration (the loop in lines 6–13), reading the value from the monitor table MT takes constant time. For a trained model with k hidden states, updating the Viterbi vector requires O(k) operations of finding maximums, which can be improved to \(\lg (k)\) using a Max-Heap. Therefore, each monitoring iteration takes \(O(k\lg k)\) time. The space complexity is mainly bounded by the size of the monitor table and the Viterbi vectors: \(O(k\,h)\).

6 Prediction Evaluation

In this section, we first define a lower bound on the prediction error of the monitor on a single trace, and then use two-sided hypothesis testing to evaluate the average prediction performance on a set of testing samples. Finally, we exploit the hypothesis testing results to find an empirical lower bound of the horizon.

6.1 Prediction Error

Let \((o_i\cdots o_{i+\lambda _i(\mathcal {A})})\) be an extension of length \(\lambda _i(\mathcal {A})\) at point i that is accepted by a given DFA \(\mathcal {A}\), i.e., \((o_i\cdots o_{i+\lambda _i})\in \mathcal {L}(\mathcal {A})\) (for brevity, we use \(\lambda _i\) instead of \(\lambda _i(\mathcal {A})\)). Recall that the monitor's output at point i is the probability of all the extensions of length at most h that are accepted by \(\mathcal {A}\) (\(Pr(\sigma _i\models \Diamond ^{\le h} Accept)\)). For any \(\lambda _i \le h\) we have: \(Pr(\sigma _i\models \Diamond ^{\le h} Accept) \ge Pr(\sigma _i\cdots \sigma _{i+\lambda _i}\models Accept)\) \(\implies \lambda _i\times Pr(\sigma _i\models \Diamond ^{\le h} Accept) \ge \lambda _i\times Pr(\sigma _i\cdots \sigma _{i+\lambda _i}\models Accept)\).

We define \(\hat{\lambda }_i = \lambda _i\times Pr(\sigma _i\models \Diamond ^{\le h} Accept)\) as the expected value of \(\lambda _i\) estimated by the monitor. Therefore, we can obtain the following minimum error of the prediction at point i: \(\varepsilon ^{min}_i = \lambda _i - \hat{\lambda }_i\).

Notice that since \(\lambda _i \ge \hat{\lambda }_i\), \(\varepsilon ^{min}\) is always non-negative. If there is no k with \(0< k < \lambda _i\) such that \((o_i\cdots o_{i+k})\in \mathcal {L}(\mathcal {A})\), i.e., \((o_i\cdots o_{i+\lambda _i})\) is the minimal extension that is accepted by \(\mathcal {A}\), then \(\varepsilon ^{min}_{i+t} = (\lambda _i-t) - \hat{\lambda }_{i+t}\), for \(0 \le t < \lambda _i \le h\), where t is the horizon index in Monitor. As a result, the value of \(\varepsilon _i^{min}\) can be computed on-the-fly.
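As a small illustration, the following sketch computes \(\varepsilon ^{min}_i\) pointwise from the monitor's outputs and the observed lengths \(\lambda _i\); the helper for extracting \(\lambda _i\) from a trace uses an acceptance predicate `accepted` as a stand-in for membership in \(\mathcal {L}(\mathcal {A})\), and its indexing is a simplification of the paper's notation.

```python
def shortest_accepted_lengths(trace, accepted):
    """lambda_i: length of the shortest extension starting at point i that
    the DFA accepts (None if no accepted extension exists in the trace)."""
    return [next((k for k in range(1, len(trace) - i + 1)
                  if accepted(trace[i:i + k])), None)
            for i in range(len(trace))]

def min_prediction_errors(probs, lambdas):
    """eps_min_i = lambda_i - lambda_i * Pr(sigma_i |= <>^{<=h} Accept)."""
    return [lam - lam * p if lam is not None else None
            for p, lam in zip(probs, lambdas)]
```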

In our implementation, we assume that there exists at least one point \(k \le h\) such that \((o_i\cdots o_{i+k})\in \mathcal {L}(\mathcal {A})\); otherwise, \(\varepsilon ^{min}_i\) is not well-defined, and the prediction accuracy cannot be verified. If such a point does not exist, we can extend the prediction horizon by increasing h such that there is at least one accepting extension in the trace. The rest of the path after the last point at which the trace is accepted by \(\mathcal {A}\) is discarded, as there is no observation against which to compare the prediction and compute the error.

In the following, we give an empirical evaluation of the monitor's prediction using hypothesis testing, which also leads to an empirical lower bound for h.

6.2 Empirical Evaluation Using Hypothesis Testing

To assess the performance of the prediction beyond a single execution trace, we use hypothesis testing on a set of test samples.

Let \(\varLambda =\tfrac{1}{\tau }\sum _{i=1}^\tau \lambda _i\) be the random variable that represents the mean of all \(\lambda _i\) values, for \(1 \le i \le \tau \). Notice that for iid samples, the value of \(\varLambda \) for a trace is independent of its value for the other traces.

Let \(\bar{\lambda }_M\) be the estimation of \(\varLambda \) by the monitor over a set of monitored traces, and \(\bar{\lambda }\) be the mean of \(\varLambda \) on a separate set of n iid samples with variance \(\nu \). We test the accuracy of the prediction using the following two-sided hypothesis test: \( H_0: \bar{\lambda }_M = \bar{\lambda }\). With significance level \(\alpha \), we use Student's t-distribution to test \(H_0\): \(\frac{\bar{\lambda } - \bar{\lambda }_M}{\sqrt{\nu /n}} \le t_{n-1,\alpha }\). Given the mean of the length of the shortest finite extensions in the test sample, we can use it to obtain a lower bound for h: \(h \ge \bar{\lambda }-t_{n-1,\alpha }\sqrt{\nu /n}\). That is, the prediction horizon h must be at least as large as the mean of the length of the extensions in the test sample that are accepted by \(\mathcal {A}\) (up to the test's margin of error).
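A small sketch of this test, assuming SciPy is available; it compares the monitor's mean estimate against the held-out sample mean using the two-sided critical value \(t_{n-1,\alpha /2}\), and the function and argument names are ours.

```python
import numpy as np
from scipy import stats  # assumption: SciPy is installed

def prediction_consistent(lambda_bar_monitor, lambda_test, alpha=0.05):
    """Two-sided t-test of H0: the monitor's mean predicted extension length
    equals the mean extension length on n held-out iid test traces."""
    lambda_test = np.asarray(lambda_test, dtype=float)
    n = len(lambda_test)
    mean, var = lambda_test.mean(), lambda_test.var(ddof=1)   # lambda_bar, nu
    t_stat = (mean - lambda_bar_monitor) / np.sqrt(var / n)
    t_crit = stats.t.ppf(1.0 - alpha / 2.0, df=n - 1)
    return abs(t_stat) <= t_crit
```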

7 Case Studies

We evaluate \(\mathcal {P}revent\) on two case studies: (1) the randomized dining philosophers from Prism [27], including both the original algorithm and a modified version that we introduce specifically for evaluating \(\mathcal {P}revent\), and (2) QNX Neutrino kernel traces collected from the flight control software of a hexacopter. We show the estimation of good extensions in the randomized dining philosophers traces and of bad extensions in the hexacopter traces; each case represents one of the most commonly used property patterns in the survey by Dwyer et al. [11]: the response pattern for the randomized dining philosophers algorithm, and the absence pattern, for monitoring a regular safety property [1], in the flight control of the hexacopter. In both experiments, monitoring is performed off-line.

7.1 Randomized Dining Philosophers

We adapt Rabin and Lehmann's [25] solution to the dining philosophers problem, which has the characteristics of a stochastic system suitable for training an HMM. We also present a modification of their algorithm that represents a generic form of decentralized on-line resource allocation [36], where our monitoring solution can be seen as a component of a liveness-enforcement supervisor [19].

We consider the classic form of the problem, where the philosophers are arranged in a ring topology and are selected for execution by a fair scheduler. Figure 2a shows the state diagram of one philosopher, with Th, H, T, P, D, and E representing the philosopher, respectively, thinking, being hungry, trying, picking a fork, dropping a fork, and eating. A philosopher starts at (Th) and immediately transitions to (H). Based on the outcome of a fair coin, the philosopher then chooses to pick the left or the right fork, and moves to (T). If the chosen fork is not available, the philosopher remains at (T) until it is granted access to the fork. The philosopher moves to (E) if the other fork is available; otherwise, the philosopher drops the obtained fork, moves to (D), and eventually transitions back to (H). After the philosopher finishes eating, it drops the forks in an arbitrary order (D) and moves back to (Th). The algorithm is deadlock-free, but lockouts are still possible [25].

Our modification of the algorithm is to add a self-transition at (P): a philosopher does not drop the first obtained fork with probability c, i.e., it stays at (P), which is shown with dotted lines in Fig. 2a (the transition from (P) to (D) has the probability \(1-c\), which is not shown in the figure). This modification enables the philosopher to control its waiting time, the period between when it becomes hungry for the first time after thinking, and when it eats. A higher value of c means that, instead of going back to (H), the philosopher is more likely to stay at (P) so that as soon as the other fork is available it will eat. It is not difficult to observe that as long as there is at least one philosopher with \(c\ne 1\), the symmetry that causes the deadlock [25] will eventually break, and the algorithm remains deadlock-free. In a distributed real-time system, where each philosopher represents a process with unfixed deadlines, changing the value of c enables the processes to dynamically adjust their waiting time according to their deadlines.
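For illustration only, the sketch below simulates a much-simplified version of this modified protocol and records the observation sequence of one philosopher; the random scheduler, the fork hand-off details, and the omission of separately emitted drops after eating are all our simplifications, not the implementation used in the experiments.

```python
import random

TH, H, T, P, D, E = "think", "hungry", "try", "pick", "drop", "eat"

def simulate(n_phil=3, c=0.5, steps=200, seed=0):
    """Simplified simulation of the modified protocol; returns the
    observation trace (state labels) of philosopher 0."""
    rng = random.Random(seed)
    forks = [None] * n_phil          # owner of each fork, or None if free
    state = [TH] * n_phil
    chosen = [None] * n_phil         # fork each philosopher is trying to get
    held = [None] * n_phil           # first fork obtained, if any
    trace = []
    for _ in range(steps):
        i = rng.randrange(n_phil)    # fair (random) scheduler picks a philosopher
        left, right = i, (i + 1) % n_phil
        if state[i] == TH:
            state[i] = H
        elif state[i] == H:
            chosen[i] = rng.choice([left, right])   # fair coin chooses a fork
            state[i] = T
        elif state[i] == T and forks[chosen[i]] is None:
            forks[chosen[i]] = i                    # granted the chosen fork
            held[i] = chosen[i]
            state[i] = P
        elif state[i] == P:
            other = right if held[i] == left else left
            if forks[other] is None:
                forks[other] = i
                state[i] = E
            elif rng.random() >= c:                 # drop first fork with prob. 1 - c
                forks[held[i]], held[i] = None, None
                state[i] = D
        elif state[i] == D:
            state[i] = H
        elif state[i] == E:
            forks[left] = forks[right] = None       # release both forks
            held[i] = None
            state[i] = TH
        if i == 0:
            trace.append(state[0])
    return trace
```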

Fig. 2. Training an HMM for the monitored philosopher in a program with 3 philosophers.

The purpose of our experiments is to implement a monitor that observes the outputs of a single philosopher and predicts a potential starvation (lockout) by estimating the extensions that lead to eating.

Predicting Starvation at Run-Time. We use the Matlab HMM toolbox to train HMMs on 100 iid samples collected from the implementation of our modified version, with \(c=1\) for all philosophers except the one being monitored. The trained model presents the behavioral signature of the system when a longer waiting time is likely. The size of the HMM (i.e., the number of hidden states) is chosen based on the BIC score of each model with different sizes (see Sect. 4.1). Figure 2b shows the trained HMM of one philosopher, constructed from the traces of a 5-second execution of three philosophers. The trained model reflects the distribution of the prefixes in the training sample, which in turn is determined by how the scheduler and the other philosophers behaved during training (i.e., by how the non-determinism of the model was resolved). For instance, multiple consecutive try events in the training sample create several states in the trained HMM, each emitting the symbol (T), but only one has a high probability of transitioning to (P), while the others model the situation where the philosopher cannot pick a fork. The finite extensions that we consider in the prediction are based on the following regular expression: \((\lnot hungry)^*(hungry(\lnot eat)^*eat(\lnot hungry)^*)^*\).

Figure 3 compares the prediction results (\(h=33\)) of two trained models, one trained using the samples from the original implementation (LR) and the other trained from the samples of our modified version (LR-sap), both containing three philosophers. The monitored trace is synthesized so that it does not contain any eat; up to point 33 the philosopher is only at state (T), and after that the philosopher frequently picks and drops a fork. When the last event of a prefix is pick, compared to when it ends with any other observation, the philosopher has a higher chance of reaching eat (e.g., with probability 0.98 at point 35); however, since the HMM maintains the history of the trace, a prefix with frequent consecutive (pick, drop) pairs shows a decline in the probability of observing eat (e.g., with probability 0.8 at point 57). The results in Fig. 3 demonstrate that the model trained on the bad extensions (LR-sap) provides an under-approximation of the model trained on the good traces (LR), thus producing more false positives.

Table 1. Prediction results on 100 test samples.

Fig. 3. The comparison of the prediction results from two trained models.

The summary of our results is displayed in Table 1. We use Prism to perform the reachability analysis on the product of the trained HMM and the DFA. The size of the product model is equal to the size of the HMM, as each state in the trained HMMs emits exactly one observation. The minimum prediction horizon (\(h^{min}\)) is obtained empirically from 100 test samples. We choose the prediction horizon to be three times as large as \(h^{min}\) during monitoring. The average length of the acceptable extensions estimated by the monitor is shown as \(\bar{\lambda }_M\), and the mean of the error on the entire testing set is denoted by \(mean(\varepsilon ^{min})\). On average, the monitor predicts the next eat (within the prediction horizon) with a one-step error. The monitor is not able to detect waiting periods that are approximately longer than \(3\times h^{min}\pm 1\). Increasing the prediction horizon decreases the error, at the cost of a larger monitor table (MT). The value of \(\bar{\lambda }_M\) is influenced by the total number of discrete events produced by the monitored philosopher: with more philosophers, \(\bar{\lambda }_M\) decreases because the monitored philosopher, and hence the monitor, is scheduled less often.

7.2 Hexacopter Flight Control

In this section, we apply \(\mathcal {P}revent\) to detect injected faults from QNX Neutrino's [24] kernel calls. The traces are obtained using the QNX tracelogger during the flight of a hexacopter. The vehicle is equipped with an autopilot, but can be controlled manually using a remote transmitter. The autopilot system uses a cascaded PID controller. QNX's microkernel follows a message-passing architecture, where almost all processes (even the kernel processes) communicate by sending and receiving messages that are handled by the kernel calls \(\mathtt {MSG\_SENDV}\), \(\mathtt {MSG\_RECEIVEV}\), and \(\mathtt {MSG\_REPLY}\). Figure 4a shows a sub-trace of the kernel call sample from the hexacopter flight control system.

In this case study, we inject faults by introducing an interference process, with the same priority as the autopilot process, that simply runs a while-loop to consume CPU time. The interference process disrupts message-passing between the processes of the same or lower priorities, causing a kernel call to handle the error (typically due to a timeout) and to unblock the sender (shown as event \(\mathtt {MSG\_ERROR}\) in Fig. 4a). The purpose of the monitor is to predict the existence of an interference process by observing only the kernel calls.

We use SFIHMM [8] on an Intel Xeon 2.40 GHz machine with 128 GB RAM running Debian 9.3 to train an HMM from 1 s of the autopilot execution, with the interference process in full effect. The HMM with the minimum BIC has 19 states. The regular expression \((\lnot \mathtt {MSG\_ERROR})^*\) \((\mathtt {MSG\_ERROR})\varSigma ^*\) is used to generate the finite extensions that contain the bad prefixes of the property \(\Box \lnot \mathtt {MSG\_ERROR}\).

Fig. 4. The monitoring of \(\Box \lnot \mathtt {MSG\_ERROR}\) on the flight control trace with the interference process.

The monitor’s prediction on part of the trace generated from another scenario, where the interference process is partially in effect and started executing in the middle of the flight, is depicted in Fig. 4. The event \(\mathtt {MSG\_ERROR}\) is emitted at index 10,861, and the probability of the prefix that contains \(\mathtt {MSG\_ERROR}\) within the next 50 steps is 0.15 at index 10,815. That means that the monitor predicts the message error with a 15% chance, almost 45 steps before its occurrence. The points where the probability is zero are those where the monitor was not able to correctly estimate the hidden state of the model. More training samples are required to enable the monitor to estimate the correct state of the model. In our case, three consecutive instances of \(\mathtt {MSG\_RECEIVEV}\) did not appear in the training sample; hence, the prefix cannot be associated with any state of the model by the monitor.

8 Related Work

There have been several proposals to define the semantics of LTL properties on finite paths [20]; however, to the best of our knowledge, our approach is the first to verify finite paths based on extensions obtained from a trained HMM. HMMs have recently been used in run-time monitoring of CPSs [2, 13, 30, 31, 33, 35, 40]. Sistla et al. [31] propose an internal monitoring approach (i.e., the property is specified over the hidden states) using specification automata and HMMs with infinitely many states. Learning an infinite-state HMM is a harder problem than learning a finite one, but it does not require inferring the size of the model [5].

The notions of acceptance accuracy and rejection accuracy in [30] are complementary to our notion of prediction error. According to their definition, our Viterbi approximation generates a threshold conservative monitor for any regular safety property and any regular finite horizon. The analytical method in [33] for finding an upper bound on the timeliness of a monitor can be applied to \(\mathcal {P}revent\) to find an upper bound for the prediction horizon.

Several works focus on efficiently estimating the internal states of an HMM at runtime using particle filtering [13, 35]. Particle filtering uses weights based on the number of particles in each state, and updates the weights with each observation. The Viterbi algorithm provides the most likely state, as an over-approximation. Adaptive Runtime Verification [2] couples state estimation [35] with a feedback control loop to generate several monitors that run at different frequencies. These works are orthogonal to our framework and can be combined with \(\mathcal {P}revent\).

Learning models for verification has also been applied to Markov chain models [18, 22]. HMMs are trained in [14] for statistical model checking. Our work focuses on predictive monitors using a similar technique. We also provide assessments for evaluating the learned model and inferring its size.

9 Conclusion

We introduced \(\mathcal {P}revent\), a predictive run-time monitoring framework for properties with finite regular extensions. The core part of \(\mathcal {P}revent\) involves learning a model from the traces, and constructing a tabular monitor using reachability analysis. The monitor produces a quantitative output that represents the probability that, from the current state, the system satisfies a property within a finite horizon. The current state is estimated using the Viterbi algorithm. We defined an empirical evaluation of the prediction using the expected length of the extension of the execution that satisfies the property. In the future, we are interested in exploring other evaluation methods, including comparing the prediction results of the trained model with those of the complete model by applying abstraction [15].

We provided a preliminary evaluation of our approach on two case studies: the randomized dining philosophers problem, and the flight control of a hexacopter. In both cases, the trained models are extracted from bad traces; thus, the monitor has a tendency to produce false positives. An interesting modification to our approach, which would reduce the number of false positives, is to use a mixture of models trained on both good and bad traces, employing only the ones with a higher likelihood of generating the current execution trace.

Lastly, an implementation of \(\mathcal {P}revent\) with on-line learning methods (such as state merging or splitting techniques [21, 34]) is necessary to apply the framework to real-world scenarios.