1 Introduction

In this paper we combine statistical (Monte Carlo) techniques and numerically sound decision procedures to reason about hybrid systems with random and nondeterministic parameters. In particular, we devise confidence-interval techniques for bounded probabilistic reachability, i.e., we aim at computing statistically valid enclosures for the probability that a hybrid system reaches a given set of states within a given time bound and number of discrete transitions. When nondeterministic parameters are present, a hybrid system will in general feature a range of reachability probabilities, depending on the value of the nondeterministic parameters. Reachability is an important class of behavioural properties, as many verification problems (e.g., proving system safety) can be reduced to reachability questions. A statistical approach to probabilistic reachability is important because statistical techniques trade correctness guarantees for efficiency, and so can scale much better with system size than other rigorous approaches. For example, statistical model checking [15] can be faster than probabilistic model checking, which is based on exhaustive state space search [14]. Also, statistical model checking can handle models for which no efficient verification tools exist, such as cyber-physical systems [2].

Monte Carlo techniques for probability estimation assume that one can sample the random variable representing the true system behaviour. However, while this is possible for certain finite-state systems, for nonlinear systems (e.g., ordinary differential equations (ODEs) with trigonometric functions) it is not. In fact, sampling the random variable representing the true system behaviour can be as hard as reachability, which is undecidable even for very simple systems (e.g., linear hybrid automata [1]). Thus, one has to deal with numerical imprecisions that could lead to missing important events in the true system evolution. For example, zero-crossings can be indistinguishable from “safe” trajectories [8].

A novel aspect of our work is that we explicitly take into account undecidability and numerical precision by employing \(\delta \)-complete decision procedures [4], which enable formal reasoning up to a user-defined numerical precision over bounded domains. In this way it is possible to handle in a sound and safe manner complex dynamical systems, such as nonlinear ODEs [6]. Given any \(\delta > 0\) and an arbitrary first-order formula \(\phi \) over the reals, a \(\delta \)-complete decision procedure returns unsat if \(\phi \) is false and \(\delta \)-sat if \(\phi ^{\delta }\) (a weaker version of formula \(\phi \)) is true. Note that the latter result does not imply satisfiability of the initial formula. Also, the value of \(\delta \) affects the precision of the result, and large values of \(\delta \) can cause false alarms (i.e., \(\delta \)-sat is returned for a formula which is in fact false). Statistical techniques must therefore take into account that samples are only approximations of the real random variable corresponding to the system evolution. In particular, we introduce an approach for computing statistically and numerically rigorous confidence intervals for probabilistic reachability. We illustrate our techniques on hybrid systems with random and/or nondeterministic parameters. For systems with both random and nondeterministic parameters we estimate the (nondeterministic) parameter values that result in the minimal and maximal reachability probabilities. Our algorithms can in principle be applied to other stochastic models (e.g., continuous-time Markov chains) should the corresponding \(\delta \)-complete decision procedure be available.

Related Work. We focus on works that combine statistical techniques with SMT procedures, which are the main subject areas of the paper. The tool SReach [13] combines statistical estimation with \(\delta \)-complete simulation procedures. However, SReach only considers overapproximations of the reachability probability, and thus can offer one-sided confidence intervals only. We instead compute confidence intervals that are guaranteed to contain both the under- and overapproximation of the reachability probability. Also, SReach does not handle nondeterministic parameters, while we do. In [3] the authors present a statistical model checking approach combined with SMT decision procedures, but it is restricted to fixed-sample size techniques, while we develop a more efficient sequential Bayesian approach and consider \(\delta \)-complete decision procedures.

2 Bounded Reachability in Hybrid Systems

Hybrid systems provide a framework for modelling real-world systems that combine continuous and discrete dynamics [1]. We consider parametric hybrid systems as a variant of hybrid systems featuring continuous and discrete parameters whose values are set in the initial state and do not change during the system’s evolution. Such parameters can be random when there is a probability measure associated with them, and nondeterministic otherwise. We now formally define the systems we consider in this paper.

Definition 1

(PHS). A Parametric Hybrid System is a tuple

$$\begin{aligned} H=\langle Q,\varUpsilon ,X,P,Y,R,\text {jump},\text {goal}\rangle \end{aligned}$$

where

  • \(Q = \{q_0, \cdots , q_m\}\) a set of modes (discrete components of the system),

  • \(\varUpsilon = \{(q,q^{\prime }): q,q^{\prime } \in Q\}\) a set of transitions between modes,

  • \(X = [u_1, v_1] \times \cdots \times [u_n, v_n] \subset \mathbb {R}^n\) a domain of continuous variables,

  • \(P = [a_{1}, b_{1}] \times \cdots \times [a_{k}, b_{k}] \subset \mathbb {R}^k\) the parameter space of the system,

  • \(Y = \{{\mathbf{y}_{q}(\mathbf{p}, t)} : q \in Q, \mathbf{p}\in X\times P, t\in [0,T]\}\) the continuous system dynamics where \(\mathbf{y}_q:X\times P\times [0, T] \rightarrow X\),

  • \(R = \{\mathbf{g}_{(q,q^{\prime })}(\mathbf{p},t) : (q, q^{\prime }) \in \varUpsilon , \mathbf{p}\in X\times P, t\in [0,T]\}\) ‘reset’ functions \(\mathbf{g}_{(q,q^{\prime })}: X\times P\times [0,T] \rightarrow X\times P\) defining the continuous state at time \(t = 0\) in mode \(q^{\prime }\) after taking the transition from mode q,

and predicates (or relations)

  • \( jump _{(q, q^\prime )}(\mathbf{x})\) defines a discrete transition \((q, q^{\prime }) \in \varUpsilon \) which may (but does not have to) occur upon reaching the jump condition in state \((\mathbf{x},q)\in X\times P\times Q\),

  • \( goal _{q}(\mathbf{x})\) defines the goal state \(\mathbf{x}\) in mode q.

The continuous system dynamics Y is represented by initial value problems with Lipschitz-continuous ODEs, which by the well-known Picard-Lindelöf theorem have a unique solution for any given initial condition \(\mathbf{p} \in X\times P\) and \(t_{0} \in [0, T]\). We treat system parameters as any other variable, except that their derivatives are zero. Thus, the parameters are part of the initial conditions.
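For instance (a toy illustration of this encoding, not one of the case studies considered later), a mode with dynamics \(\dot{x} = -\lambda x\) and a single parameter \(\lambda \) is encoded as the extended initial value problem

$$\begin{aligned} \dot{x} = -\lambda x, \quad \dot{\lambda } = 0, \qquad x(0) = x_{0}, \ \lambda (0) = \lambda _{0}, \end{aligned}$$

whose unique solution on [0, T] is \(x(t) = x_{0}e^{-\lambda _{0} t}\) with \(\lambda (t) = \lambda _{0}\).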

Bounded reachability in PHSs aims to decide whether, for given initial conditions, the system reaches a goal state in a finite number of discrete transitions. Given a PHS and a reachability depth l we can derive the set \(\mathbf{Path}(l)\) of all paths \(\pi \) of length \(|\pi | = l + 1\) whose first (\(\pi (0)\)) and last (\(\pi (l)\)) elements are the initial and the goal mode, respectively. The bounded reachability property for a path \(\pi \in \mathbf{Path}(l)\) and initial condition \(\mathbf{p}\) can be checked by evaluating the formula:

$$\begin{aligned} \begin{aligned}&\phi (\pi ,\mathbf{p}) := \exists ^{[0, T]} t_{0}, \cdots , \exists ^{[0, T]} t_{|\pi |-1} : \big (\mathbf{x}_{\pi (0)}^{t} = \mathbf{y}_{\pi (0)}(\mathbf{p}, t_{0}) \big ) \wedge \\&\bigwedge _{i = 0}^{|\pi |-2} \Big [\text {jump}_{(\pi (i),\pi (i+1))}(\mathbf{x}_{\pi (i)}^{t}) \wedge \big (\mathbf{x}_{\pi (i+1)}^{t} = \mathbf{y}_{\pi (i)}(\mathbf{g}_{(\pi (i),\pi (i+1))}(\mathbf{x}_{\pi (i)}^{t}, t_{i}), t_{i+1})\big ) \Big ] \\&\wedge \text {goal}_{\pi (|\pi |-1)}(\mathbf{x}_{\pi (|\pi |-1)}^{t}) \ . \end{aligned} \end{aligned}$$
(1)

where \(\exists ^{[0, T]} t_{i}\) is a shorthand for \(\exists t_{i} \in [0, T]\).
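For instance, for a path \(\pi \) with \(|\pi | = 2\) (a single discrete transition), formula (1) unrolls to

$$\begin{aligned} \begin{aligned}&\phi (\pi ,\mathbf{p}) = \exists ^{[0, T]} t_{0}, \exists ^{[0, T]} t_{1} : \big (\mathbf{x}_{\pi (0)}^{t} = \mathbf{y}_{\pi (0)}(\mathbf{p}, t_{0}) \big ) \wedge \text {jump}_{(\pi (0),\pi (1))}(\mathbf{x}_{\pi (0)}^{t}) \wedge \\&\big (\mathbf{x}_{\pi (1)}^{t} = \mathbf{y}_{\pi (0)}(\mathbf{g}_{(\pi (0),\pi (1))}(\mathbf{x}_{\pi (0)}^{t}, t_{0}), t_{1})\big ) \wedge \text {goal}_{\pi (1)}(\mathbf{x}_{\pi (1)}^{t}). \end{aligned} \end{aligned}$$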

Note that the terms \(\mathbf{x}_{\pi (i+1)}^{t}=\mathbf{y}_{\pi (i)}(\mathbf{g}_{(\pi (i),\pi (i+1))}(\mathbf{x}_{\pi (i)}^{t}, t_i), t_{i+1})\) and \(\mathbf{x}_{\pi (0)}^{t}=\mathbf{y}_{\pi (0)}(\mathbf{p}, t_{0})\) are purely syntactic substitutions. Formulas over the reals like (1) are undecidable in general [9], but a relaxed version (\(\delta \)-weakening [4]) is instead decidable.

Definition 2

( \(\delta \) -Weakening [4]). Given a bounded \(\varSigma _{1}\) sentence and an arbitrarily small positive \(\delta \)

$$\begin{aligned} \exists ^{X} \mathbf{x} : \bigwedge _{i=1}^{m}(\bigvee _{j=1}^{k_{i}}(f_{i,j}(\mathbf{x}) = 0)) \end{aligned}$$

(where the \(f_{i,j}\) are Type-2 real computable functions) its \(\delta \)-weakening is

$$\begin{aligned} \exists ^{X} \mathbf{x} : \bigwedge _{i=1}^{m}(\bigvee _{j=1}^{k_{i}}(|f_{i,j}(\mathbf{x})| \le \delta )) \end{aligned}$$

It is easy to see that the bounded reachability property (1) can be rewritten in the format of Definition 2 (see [4]). A \(\delta \)-complete decision procedure [4] correctly decides whether an arbitrary bounded \(\varSigma _{1}\) (existentially quantified) sentence is false (unsat answer) or its \(\delta \)-weakening is true (\(\delta \)-sat answer). Note that with a \(\delta \)-complete decision procedure unsat can always be trusted, while \(\delta \) -sat might in fact be a false alarm due to a coarse overapproximation characterised by \(\delta \).
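As a simple illustration (ours, not taken from [4]), the bounded sentence \(\exists ^{[0,1]} x : x^{2} + 10^{-3} = 0\) is false, but its \(\delta \)-weakening \(\exists ^{[0,1]} x : |x^{2} + 10^{-3}| \le \delta \) is true for any \(\delta \ge 10^{-3}\) (take \(x = 0\)); for such a \(\delta \) a \(\delta \)-complete procedure may therefore return \(\delta \)-sat even though the original sentence is unsatisfiable.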

Evaluating (1) by a \(\delta \)-complete decision procedure returns unsat only if for the given parameter value \(\mathbf{p}\) the path does not reach a goal state. If \(\delta \) -sat is returned, we may try to sharpen the answer by checking an appropriate formula. For example, an unsat answer to formula \(\phi ^{\forall }(\pi ,\mathbf{p})\) below implies reachability:

$$\begin{aligned} \begin{aligned}&\phi ^{\forall }(\pi ,\mathbf{p}) := \forall ^{[0, T]} t_{0}, \cdots , \forall ^{[0, T]} t_{|\pi |-1} : \big (\mathbf{x}_{\pi (0)}^{t} \ne \mathbf{y}_{\pi (0)}(\mathbf{p}, t_{0}) \big ) \vee \\&\bigvee _{i = 0}^{|\pi |-2} \Big [\lnot \text {jump}_{(\pi (i),\pi (i+1))}(\mathbf{x}_{\pi (i)}^{t}) \vee \big (\mathbf{x}_{\pi (i+1)}^{t} \ne \mathbf{y}_{\pi (i)}(\mathbf{g}_{(\pi (i),\pi (i+1))}(\mathbf{x}_{\pi (i)}^{t},t_i), t_{i+1})\big ) \Big ] \\&\vee \lnot \text {goal}_{\pi (|\pi |-1)}(\mathbf{x}_{\pi (|\pi |-1)}^{t}) \end{aligned} \end{aligned}$$

In the previous formula the time variables are quantified universally. Current implementations of \(\delta \)-complete decision procedures [5] can only handle formulas where the universal quantification is introduced over a single time variable. The goal predicate in \(\phi ^\forall (\pi ,\mathbf{p})\) depends on \(|\pi |\) variables and thus cannot be handled directly. To resolve this issue we instead evaluate a series of formulas \(\psi _j\):

$$\begin{aligned} \begin{aligned}&\psi _{j}(\pi ,\mathbf{p}):= \exists ^{[0, T]} t_{0}, \cdots , \forall ^{[0, T]} t_{j} : \big (\mathbf{x}_{\pi (0)}^{t} = \mathbf{y}_{\pi (0)}(\mathbf{p}, t_{0}) \big ) \wedge \\&\bigwedge _{i = 0}^{j-1} \Big [\mathbf{x}_{\pi (i+1)}^{t} = \mathbf{y}_{\pi (i)}(\mathbf{g}_{(\pi (i),\pi (i+1))}(\mathbf{x}_{\pi (i)}^{t},t_i), t_{i+1}) \Big ] \wedge \lnot \text {jump}_{(\pi (j),\pi (j+1))}(\mathbf{x}_{\pi (j)}^{t}) \end{aligned} \end{aligned}$$
(2)

if \(j < |\pi | - 1\) and

$$\begin{aligned} \begin{aligned}&\psi _{j}(\pi ,\mathbf{p}) := \exists ^{[0, T]} t_{0}, \cdots , \forall ^{[0, T]} t_{j} : \big (\mathbf{x}_{\pi (0)}^{t} = \mathbf{y}_{\pi (0)}(\mathbf{p}, t_{0}) \big ) \wedge \\&\bigwedge _{i = 0}^{j-1} \Big [\mathbf{x}_{\pi (i+1)}^{t} = \mathbf{y}_{\pi (i)}(\mathbf{g}_{(\pi (i),\pi (i+1))}(\mathbf{x}_{\pi (i)}^{t},t_i), t_{i+1}) \Big ] \wedge \lnot \text {goal}_{\pi (j)}(\mathbf{x}_{\pi (j)}^{t}) \end{aligned} \end{aligned}$$
(3)

if \(j = |\pi | - 1\). The next proposition shows that the formulas \(\psi _{j}\) yield a sufficient condition for reachability: if every \(\psi _{j}(\pi ,\mathbf{p})\) is false, then \(\phi (\pi ,\mathbf{p})\) holds.

Proposition 1

With the definitions in (1), (2) and (3) we have

$$\begin{aligned} \bigwedge _{j=0}^{|\pi |-1} \lnot \psi _{j}(\pi ,\mathbf{p})\ \Rightarrow \ \phi (\pi ,\mathbf{p}) \end{aligned}$$

Proof

Consider the case \(|\pi | = 1\). It can be seen that \(\lnot \psi _{0}(\pi ,\mathbf{p}) \Leftrightarrow \phi (\pi ,\mathbf{p})\) as

$$\begin{aligned} \lnot \psi _{0}(\pi ,\mathbf{p}) := \exists ^{[0, T]} t_{0}: \text {goal}_{\pi (0)}(\mathbf{x}_{\pi (0)}^{t}) \Leftrightarrow \phi (\pi ,\mathbf{p}) \end{aligned}$$

Consider now the case \(|\pi | > 1\).

$$\begin{aligned} \begin{aligned}&\bigwedge _{j=0}^{|\pi |-1} \lnot \psi _{j}(\pi ,\mathbf{p}) := \bigwedge _{j=0}^{|\pi |-2} \Bigg [ \forall ^{[0, T]} t_{0}, \cdots ,\forall ^{[0, T]} t_{j-1}, \exists ^{[0, T]} t_{j} : \big (\mathbf{x}_{\pi (0)}^{t} \not = \mathbf{y}_{\pi (0)}(\mathbf{p}, t_{0}) \big ) \vee \\&\bigvee _{i = 0}^{j-1} \Big (\mathbf{x}_{\pi (i+1)}^{t} \not = \mathbf{y}_{\pi (i)}(\mathbf{g}_{(\pi (i),\pi (i+1))}(\mathbf{x}_{\pi (i)}^{t},t_i), t_{i+1}) \Big ) \vee \text {jump}_{(\pi (j),\pi (j+1))}(\mathbf{x}_{\pi (j)}^{t})\Bigg ] \wedge \\&\Bigg [\forall ^{[0, T]} t_{0}, \cdots , \forall ^{[0, T]} t_{|\pi |-2}, \exists ^{[0, T]} t_{|\pi |-1} : \big (\mathbf{x}_{\pi (0)}^{t} \not = \mathbf{y}_{\pi (0)}(\mathbf{p}, t_{0}) \big ) \vee \\&\bigvee _{i = 0}^{|\pi |-2} \Big (\mathbf{x}_{\pi (i+1)}^{t} \not = \mathbf{y}_{\pi (i)}(\mathbf{g}_{(\pi (i),\pi (i+1))}(\mathbf{x}_{\pi (i)}^{t},t_i), t_{i+1}) \Big ) \vee \text {goal}_{\pi (|\pi |-1)}(\mathbf{x}_{\pi (|\pi |-1)}^{t})\Bigg ] \end{aligned} \end{aligned}$$

We recall that the terms \(\mathbf{x}_{\pi (0)}^{t} = \mathbf{y}_{\pi (0)}(\mathbf{p}, t_{0})\) and \(\mathbf{x}_{\pi (i+1)}^{t} = \mathbf{y}_{\pi (i)}(\mathbf{g}_{(\pi (i),\pi (i+1))}(\mathbf{x}_{\pi (i)}^{t},t_i), t_{i+1})\) are just syntactic substitutions which cannot be falsified, as the system dynamics always exist (by the Picard-Lindelöf theorem). Hence, the formula above implies the following:

$$\begin{aligned} \begin{aligned}&\bigwedge _{j=0}^{|\pi |-2} \Bigg [ \forall ^{[0, T]} t_{0}, \cdots ,\forall ^{[0, T]} t_{j-1}, \exists ^{[0, T]} t_{j} : \big (\mathbf{x}_{\pi (0)}^{t} = \mathbf{y}_{\pi (0)}(\mathbf{p}, t_{0}) \big ) \wedge \\&\bigwedge _{i = 0}^{j-1} \Big (\mathbf{x}_{\pi (i+1)}^{t} = \mathbf{y}_{\pi (i)}(\mathbf{g}_{(\pi (i),\pi (i+1))}(\mathbf{x}_{\pi (i)}^{t},t_i), t_{i+1}) \Big ) \wedge \text {jump}_{(\pi (j),\pi (j+1))}(\mathbf{x}_{\pi (j)}^{t})\Bigg ] \wedge \\&\Bigg [\forall ^{[0, T]} t_{0}, \cdots , \forall ^{[0, T]} t_{|\pi |-2}, \exists ^{[0, T]} t_{|\pi |-1} : \big (\mathbf{x}_{\pi (0)}^{t} = \mathbf{y}_{\pi (0)}(\mathbf{p}, t_{0}) \big ) \wedge \\&\bigwedge _{i = 0}^{|\pi |-2} \Big (\mathbf{x}_{\pi (i+1)}^{t} = \mathbf{y}_{\pi (i)}(\mathbf{g}_{(\pi (i),\pi (i+1))}(\mathbf{x}_{\pi (i)}^{t},t_i), t_{i+1}) \Big ) \wedge \text {goal}_{\pi (|\pi |-1)}(\mathbf{x}_{\pi (|\pi |-1)}^{t})\Bigg ] \end{aligned} \end{aligned}$$

The next step can be equivalently derived by moving universal quantifiers from the second part of the formula (square brackets containing the goal predicate) outside the entire formula:

$$\begin{aligned} \begin{aligned}&\forall ^{[0, T]} t_{0}, \cdots , \forall ^{[0, T]} t_{|\pi |-2} : \bigwedge _{j=0}^{|\pi |-2} \Bigg [ \exists ^{[0, T]} t_{j} : \big (\mathbf{x}_{\pi (0)}^{t} = \mathbf{y}_{\pi (0)}(\mathbf{p}, t_{0}) \big ) \wedge \\&\bigwedge _{i = 0}^{j-1} \Big (\mathbf{x}_{\pi (i+1)}^{t} = \mathbf{y}_{\pi (i)}(\mathbf{g}_{(\pi (i),\pi (i+1))}(\mathbf{x}_{\pi (i)}^{t},t_i), t_{i+1}) \Big ) \wedge \text {jump}_{(\pi (j),\pi (j+1))}(\mathbf{x}_{\pi (j)}^{t})\Bigg ] \wedge \\&\Bigg [\exists ^{[0, T]} t_{|\pi |-1} : \big (\mathbf{x}_{\pi (0)}^{t} = \mathbf{y}_{\pi (0)}(\mathbf{p}, t_{0}) \big ) \wedge \\&\bigwedge _{i = 0}^{|\pi |-2} \Big (\mathbf{x}_{\pi (i+1)}^{t} = \mathbf{y}_{\pi (i)}(\mathbf{g}_{(\pi (i),\pi (i+1))}(\mathbf{x}_{\pi (i)}^{t},t_i), t_{i+1}) \Big ) \wedge \text {goal}_{\pi (|\pi |-1)}(\mathbf{x}_{\pi (|\pi |-1)}^{t})\Bigg ] \end{aligned} \end{aligned}$$

The existential quantifiers \(\exists ^{[0, T]} t_{j}\) can be eliminated as the variables \(t_{j}\) are already quantified universally. Also, \(\exists ^{[0, T]} t_{|\pi |-1}\) can be moved to the front of the formula, as its first part (the square brackets containing the jump predicates) does not depend on \(t_{|\pi |-1}\). Hence, the formula above can be written as:

$$\begin{aligned} \begin{aligned}&\forall ^{[0, T]} t_{0}, \cdots , \forall ^{[0, T]} t_{|\pi |-2}, \exists ^{[0, T]} t_{|\pi |-1} : \bigwedge _{j=0}^{|\pi |-2} \Bigg [ \big (\mathbf{x}_{\pi (0)}^{t} = \mathbf{y}_{\pi (0)}(\mathbf{p}, t_{0}) \big ) \wedge \\&\bigwedge _{i = 0}^{j-1} \Big (\mathbf{x}_{\pi (i+1)}^{t} = \mathbf{y}_{\pi (i)}(\mathbf{g}_{(\pi (i),\pi (i+1))}(\mathbf{x}_{\pi (i)}^{t},t_i), t_{i+1}) \Big ) \wedge \text {jump}_{(\pi (j),\pi (j+1))}(\mathbf{x}_{\pi (j)}^{t})\Bigg ] \wedge \\&\Bigg [\big (\mathbf{x}_{\pi (0)}^{t} = \mathbf{y}_{\pi (0)}(\mathbf{p}, t_{0}) \big ) \wedge \\&\bigwedge _{i = 0}^{|\pi |-2} \Big (\mathbf{x}_{\pi (i+1)}^{t} = \mathbf{y}_{\pi (i)}(\mathbf{g}_{(\pi (i),\pi (i+1))}(\mathbf{x}_{\pi (i)}^{t},t_i), t_{i+1}) \Big ) \wedge \text {goal}_{\pi (|\pi |-1)}(\mathbf{x}_{\pi (|\pi |-1)}^{t})\Bigg ] \end{aligned} \end{aligned}$$

By idempotency of conjunction (\(A \wedge A = A\)) the terms \(\mathbf{x}_{\pi (0)}^{t} = \mathbf{y}_{\pi (0)}(\mathbf{p}, t_{0})\) and \(\mathbf{x}_{\pi (i+1)}^{t} = \mathbf{y}_{\pi (i)}(\mathbf{g}_{(\pi (i),\pi (i+1))}(\mathbf{x}_{\pi (i)}^{t},t_i), t_{i+1})\) can be merged, yielding the equivalent formula:

$$\begin{aligned} \begin{aligned}&\forall ^{[0, T]} t_{0}, \cdots , \forall ^{[0, T]} t_{|\pi |-2}, \exists ^{[0, T]} t_{|\pi |-1} : \big (\mathbf{x}_{\pi (0)}^{t} = \mathbf{y}_{\pi (0)}(\mathbf{p}, t_{0}) \big ) \wedge \\&\bigwedge _{j=0}^{|\pi |-2} \Bigg [ \Big (\mathbf{x}_{\pi (j+1)}^{t} = \mathbf{y}_{\pi (j)}(\mathbf{g}_{(\pi (j),\pi (j+1))}(\mathbf{x}_{\pi (j)}^{t},t_j), t_{j+1}) \Big ) \wedge \text {jump}_{(\pi (j),\pi (j+1))}(\mathbf{x}_{\pi (j)}^{t})\Bigg ] \wedge \\&\text {goal}_{\pi (|\pi |-1)}(\mathbf{x}_{\pi (|\pi |-1)}^{t}) \end{aligned} \end{aligned}$$

Finally, the following is implied:

$$\begin{aligned} \begin{aligned}&\exists ^{[0, T]} t_{0}, \cdots , \exists ^{[0, T]} t_{|\pi |-2}, \exists ^{[0, T]} t_{|\pi |-1} : \big (\mathbf{x}_{\pi (0)}^{t} = \mathbf{y}_{\pi (0)}(\mathbf{p}, t_{0}) \big ) \wedge \\&\bigwedge _{j=0}^{|\pi |-2} \Bigg [ \Big (\mathbf{x}_{\pi (j+1)}^{t} = \mathbf{y}_{\pi (j)}(\mathbf{g}_{(\pi (j),\pi (j+1))}(\mathbf{x}_{\pi (j)}^{t},t_j), t_{j+1}) \Big ) \wedge \text {jump}_{(\pi (j),\pi (j+1))}(\mathbf{x}_{\pi (j)}^{t})\Bigg ] \wedge \\&\text {goal}_{\pi (|\pi |-1)}(\mathbf{x}_{\pi (|\pi |-1)}^{t}) \Leftrightarrow \phi (\pi ,\mathbf{p}) \end{aligned} \end{aligned}$$

   \(\square \)

Proposition 1 enables us to define an evaluate procedure (Algorithm 1) which, given a parametric hybrid system H, reachability depth l, a parameter value \(\mathbf{p} \in X\times P\) and a precision \(\delta \) for the \(\delta \)-complete decision procedure, returns sat if \(\exists \pi \in \mathbf{Path}(l) : \phi (\pi , \mathbf{p})\), unsat if \(\forall \pi \in \mathbf{Path}(l) : \lnot \phi (\pi , \mathbf{p})\) and undet if neither of the above two can be concluded. In general, the undet outcome suggests that either the chosen precision \(\delta \) is not sufficient to decide the satisfiability of \(\phi (\pi , \mathbf{p})\), or that \(\phi (\pi , \mathbf{p})\) is undecidable (i.e., non-robust [4]).

The evaluate procedure is crucial for building the random variables that under- and over-approximate the true system behaviour on the reachability question, as we show in the next section.

(Algorithm 1: the evaluate procedure – listing omitted.)
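To make the structure of evaluate concrete, the following sketch (ours, not the ProbReach implementation; the types and the solver interface delta_decide_phi/delta_decide_psi are hypothetical placeholders for a \(\delta \)-complete decision procedure such as dReal) follows Proposition 1: a \(\delta \)-sat answer for \(\phi (\pi ,\mathbf{p})\) is upgraded to sat only if every \(\psi _{j}(\pi ,\mathbf{p})\) is unsat.

```cpp
#include <vector>

// Placeholder types: a path through the modes and a parameter valuation p in X x P.
struct Path { int len; int length() const { return len; } };
struct Params { std::vector<double> values; };

enum class Result { SAT, UNSAT, UNDET };        // possible outcomes of evaluate
enum class SolverAnswer { UNSAT, DELTA_SAT };   // answers of a delta-complete procedure

// Hypothetical solver interface (e.g. a wrapper around dReal): decide formula (1),
// respectively formulas (2)-(3), for a given path, parameter value and precision delta.
SolverAnswer delta_decide_phi(const Path& pi, const Params& p, double delta);
SolverAnswer delta_decide_psi(const Path& pi, const Params& p, double delta, int j);

// One plausible realisation of evaluate(H, l, p, delta): a delta-sat answer for
// phi(pi, p) is upgraded to sat only if every psi_j(pi, p) is unsat (Proposition 1).
Result evaluate(const std::vector<Path>& paths, const Params& p, double delta) {
    bool all_unsat = true;
    for (const Path& pi : paths) {                          // all paths in Path(l)
        if (delta_decide_phi(pi, p, delta) == SolverAnswer::UNSAT)
            continue;                                       // this path surely misses the goal
        all_unsat = false;                                  // phi(pi, p) is delta-sat
        bool reach_confirmed = true;
        for (int j = 0; j < pi.length(); ++j)               // check psi_0, ..., psi_{|pi|-1}
            if (delta_decide_psi(pi, p, delta, j) != SolverAnswer::UNSAT) {
                reach_confirmed = false;                    // cannot conclude reachability
                break;
            }
        if (reach_confirmed) return Result::SAT;            // all psi_j unsat => phi holds
    }
    return all_unsat ? Result::UNSAT : Result::UNDET;
}
```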

3 Monte Carlo Probability Estimation

In this section we consider hybrid systems with random parameters only, so that the reachability probability is well-defined. We add nondeterministic parameters in the next section. For any given \(\delta > 0\) and any \(\mathbf{p}\) sampled from the parameter distribution we introduce the Bernoulli random variables:

$$\begin{aligned} X&= \begin{cases} 1 &\text {if system } H \text { reaches the goal in } l \text { steps for a given } \mathbf{p} \\ 0 &\text {otherwise} \end{cases}\end{aligned}$$
(4)
$$\begin{aligned} X_{sat}&= \begin{cases} 1 &\text {if } \mathbf{evaluate}(H,l,\mathbf{p},\delta ) = \mathbf{sat} \\ 0 &\text {otherwise} \end{cases}\end{aligned}$$
(5)
$$\begin{aligned} X_{usat}&= \begin{cases} 0 &\text {if } \mathbf{evaluate}(H,l,\mathbf{p},\delta ) = \mathbf{unsat} \\ 1 &\text {otherwise.} \end{cases}\end{aligned}$$
(6)

Thus, for a given parameter \(\mathbf{p}\), \(X_{sat}\) is 1 if we can correctly decide that system H reaches the goal, while \(X_{usat}\) is 0 if we can correctly decide that H does not reach the goal. If no decision can be made (because of the precision \(\delta \) being used or the nature of the reachability question), \(X_{sat}\) and \(X_{usat}\) take the values 0 and 1, respectively. From the definition of evaluate it follows directly that:

$$\begin{aligned} X_{sat} \le X \le X_{usat}. \end{aligned}$$
(7)

We now introduce a Bayesian technique for calculating confidence intervals for the reachability probability \(p=\mathbb {E}[X]\) by sampling \(X_{sat}\) and \(X_{usat}\) instead of X (which cannot be sampled in general). Given n random variables iid (independent and identically distributed) as \(X_{sat}\) and \(X_{usat}\), respectively, we define the random variables:

$$\begin{aligned} \hat{S}_n = \frac{\sum _{i=1}^n X_{sat,i}}{n} \quad \quad \hat{U}_n = \frac{\sum _{i=1}^n X_{usat,i}}{n}. \end{aligned}$$

The Bayesian approach assumes that the (unknown) reachability probability p is itself a random quantity (here we give a brief overview only, more details can be found in [17]). Bayes’ theorem enables computing the posterior distribution of the unknown quantity given its prior distribution and the likelihood of the data (i.e., samples of X). The posterior distribution of p can be directly used to build confidence (credibility) intervals. In our setting we cannot sample X, so we aim at bounding the posterior of p by the posteriors built from \(X_{sat}\) and \(X_{usat}\), as we show below. We use Beta distribution priors since they are conjugate to the Bernoulli likelihood; the cumulative distribution function (CDF) of a Beta with parameters \(\alpha ,\beta >0\) is denoted \(F_{(\alpha ,\beta )}(\cdot )\). We first need a technical lemma about the Beta CDF.

Lemma 1

For any \(n>0\), \(s\le x \le u \le n\), \(\alpha ,\beta > 0\) (\(n,s,x,u \in \mathbb {N}\)), \(t\in [0,1]\) the following holds:

$$\begin{aligned} F_{(u+\alpha , n-u+\beta )}(t) \le F_{(x+\alpha , n-x+\beta )}(t)\le F_{(s+\alpha , n-s+\beta )}(t) \end{aligned}$$
(8)

Proof

We prove the RHS inequality of (8); the proof of the LHS inequality follows similar steps. When \(s = x\) the inequality holds trivially.

Consider the case \(s < x\). By definition of the Beta distribution function:

$$\begin{aligned} \begin{aligned} F_{(s+\alpha ,n-s+\beta )}(t) = \int _{0}^{t} \frac{v^{s+\alpha -1} (1-v)^{n-s+\beta -1}}{B(s+\alpha ,n-s+\beta )} dv \end{aligned} \end{aligned}$$
(9)

In the proof below we refer to the following formulas from [7]:

$$\begin{aligned} B_{x}(a,b) = \int _{0}^{x} v^{a-1}(1-v)^{b-1}\,dv, \qquad I_{x}(a,b) = \frac{B_{x}(a,b)}{B(a,b)} \qquad \text {(8.17.1, 8.17.2)} \end{aligned}$$
$$\begin{aligned} I_{x}(a+1,b-1) = I_{x}(a,b) - \frac{x^{a}(1-x)^{b-1}}{a B(a,b)} \qquad \text {(8.17.18)} \end{aligned}$$

By 8.17.1 and 8.17.2 the Beta distribution function (9) can be presented as an incomplete Beta function \(I_{t}(s+\alpha , n-s+\beta )\) (the Beta distribution functions for the variables x and u can be written in the same form). Now we show by induction that the following holds:

$$\begin{aligned} I_{t}(s+\alpha , n-s+\beta ) \ge I_{t}(x+\alpha , n-x+\beta ) \end{aligned}$$
(10)

As \(s < x\) and \(s,x \in \mathbb {N}\), the base case is \(s=0\) and \(x=1\). Thus, we need to prove that \(I_{t}(\alpha , n+\beta ) \ge I_{t}(\alpha +1, (n+\beta )-1)\). By 8.17.18:

$$\begin{aligned} I_{t}((\alpha )+1, (n+\beta )-1) = I_{t}(\alpha , n+\beta ) - \frac{t^{\alpha }(1-t)^{n+\beta -1}}{\alpha B(\alpha , n+\beta )} \end{aligned}$$

It is easy to see that \(\frac{t^{\alpha }(1-t)^{n+\beta -1}}{\alpha B(\alpha , n+\beta )} \ge 0\), and therefore, the base case holds.

Suppose now that \(x = s+1\). By the same formula 8.17.18 [7]:

$$\begin{aligned} \begin{aligned} I_{t}((s+\alpha )+1, (n-s+\beta )-1)&= I_{t}(s+\alpha , n-s+\beta ) - \\&\frac{t^{s+\alpha }(1-t)^{n-s+\beta -1}}{(s+\alpha )B(s+\alpha , n-s+\beta )} \end{aligned} \end{aligned}$$

As \(\frac{t^{s+\alpha }(1-t)^{n-s+\beta -1}}{(s+\alpha )B(s+\alpha , n-s+\beta )} \ge 0\), the induction step holds as well. Hence, (10) holds for any \(s \le x\) with \(s,x \in \mathbb {N}\), and the proof is complete.    \(\square \)

Now, Proposition 2 below tells us how to bound the posterior distribution of the unknown probability p, by using the posteriors built from \(X_{sat}\) and \(X_{usat}\). Given n samples of \(X_{sat}, X_{usat}\) and a Beta prior with parameters \(\alpha , \beta > 0\) it is easy to show that the posterior means are:

$$\begin{aligned} \hat{p}_{sat} = \frac{s+\alpha }{n+\alpha +\beta } \quad \quad \quad \hat{p}_{usat} = \frac{u+\alpha }{n+\alpha +\beta } \end{aligned}$$
(11)

where \(s = \sum _{i=1}^n X_{sat,i}\) and \(u = \sum _{i=1}^n X_{usat,i}\).
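These are instances of the standard Beta-Bernoulli conjugate update: a Beta\((\alpha ,\beta )\) prior combined with x successes in n Bernoulli trials yields the posterior Beta\((x+\alpha , n-x+\beta )\), whose mean is \((x+\alpha )/(n+\alpha +\beta )\). As a purely illustrative numerical example (the counts are made up), a uniform prior (\(\alpha =\beta =1\)) with \(n=100\), \(s=20\) and \(u=30\) gives \(\hat{p}_{sat} = 21/102 \approx 0.206\) and \(\hat{p}_{usat} = 31/102 \approx 0.304\).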

Proposition 2

Given \(\xi >0\), the posterior probability with respect to n samples of X of the interval \([\hat{p}_{sat}-\xi , \hat{p}_{usat}+\xi ]\) is bounded below as follows

$$\begin{aligned} \mathbf{Pr}(P \in [\hat{p}_{sat} - \xi , \hat{p}_{usat} +&\xi ]| X_1,\ldots ,X_n) \ge \\&F_{(u+\alpha , n-u+\beta )}(\hat{p}_{usat} + \xi ) - F_{(s+\alpha , n-s+\beta )}(\hat{p}_{sat} - \xi ) \end{aligned}$$

where \(X_1,\ldots ,X_n\) are iid as X, and \(\hat{p}_{sat}\) and \(\hat{p}_{usat}\) are the posterior means (11).

Proof

By definition of posterior CDF and Lemma 1:

$$\begin{aligned} \begin{aligned}&\mathbf{Pr}(P \le \hat{p}_{sat} - \xi | X_1,\ldots ,X_n) \le F_{(s+\alpha , n-s+\beta )}(\hat{p}_{sat} - \xi )\\&\mathbf{Pr}(P \ge \hat{p}_{usat} + \xi | X_1,\ldots ,X_n) \le 1 - F_{(u+\alpha , n-u+\beta )}(\hat{p}_{usat} + \xi ) \end{aligned} \end{aligned}$$

and therefore

$$\begin{aligned} \begin{aligned}&\mathbf{Pr}(P \in [\hat{p}_{sat} - \xi , \hat{p}_{usat} + \xi ]| X_1,\ldots ,X_n) = \\&1 - \mathbf{Pr}(P \le \hat{p}_{sat} - \xi | X_1,\ldots ,X_n) - \mathbf{Pr}(P \ge \hat{p}_{usat} + \xi | X_1,\ldots ,X_n) \ge \\&1 - F_{(s+\alpha , n-s+\beta )}(\hat{p}_{sat} - \xi ) - 1 + F_{(u+\alpha , n-u+\beta )}(\hat{p}_{usat} + \xi ) = \\&F_{(u+\alpha , n-u+\beta )}(\hat{p}_{usat} + \xi ) - F_{(s+\alpha , n-s+\beta )}(\hat{p}_{sat} - \xi ) \end{aligned} \end{aligned}$$

   \(\square \)

Our algorithm is shown in Algorithm 2. Unlike SReach [13], our algorithm first uses procedure evaluate to compute under- and overapproximations of the system behaviour (line 7), and then builds upper and lower posterior probability estimates (lines 13, 14). The probability computed for the interval (line 15) is, by Proposition 2, a lower bound on its true posterior probability, so when the algorithm terminates we know that the returned interval contains the true probability with the required (or a larger) confidence. Like SReach [13], our algorithm is sequential, since it stops only when the desired confidence is achieved. We show its (probabilistic) termination in the next proposition.
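The main loop of Algorithm 2 can be sketched as follows (our illustration, not the ProbReach code; it reuses the placeholder types and evaluate from the sketch of Algorithm 1 above, sample_random_parameters is a hypothetical sampler for the random parameters, and the Beta CDF comes from Boost.Math):

```cpp
#include <algorithm>
#include <vector>
#include <boost/math/distributions/beta.hpp>

struct Interval { double lo, hi; };

// Beta(a,b) CDF F_{(a,b)}(t), taken from Boost.Math.
static double beta_cdf(double a, double b, double t) {
    return boost::math::cdf(boost::math::beta_distribution<>(a, b), t);
}

Params sample_random_parameters();   // placeholder: draw p from the parameter distribution

// Sketch of Algorithm 2: sequential Bayesian estimation with interval half-width xi,
// confidence c, solver precision delta and a Beta(alpha, beta) prior.
Interval bayesian_reach(const std::vector<Path>& paths, double xi, double c,
                        double delta, double alpha, double beta) {
    long n = 0, s = 0, u = 0;                        // #samples and sums of X_sat, X_usat
    while (true) {
        Params p = sample_random_parameters();
        Result r = evaluate(paths, p, delta);        // line 7 of Algorithm 2
        ++n;
        if (r == Result::SAT)        { ++s; ++u; }   // X_sat = 1, X_usat = 1
        else if (r == Result::UNDET) { ++u; }        // X_sat = 0, X_usat = 1
        // unsat: X_sat = 0, X_usat = 0
        double p_sat  = (s + alpha) / (n + alpha + beta);    // posterior means (11)
        double p_usat = (u + alpha) / (n + alpha + beta);    // (lines 13, 14)
        double lo = std::max(p_sat - xi, 0.0), hi = std::min(p_usat + xi, 1.0);
        // Lower bound on the posterior probability of [lo, hi] (Proposition 2, line 15)
        double conf = beta_cdf(u + alpha, n - u + beta, hi)
                    - beta_cdf(s + alpha, n - s + beta, lo);
        if (conf >= c) return {lo, hi};
    }
}
```

Clamping the interval endpoints to [0, 1] only strengthens the bound, since the Beta CDF is 0 below 0 and 1 above 1.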

Proposition 3

Algorithm 2 terminates almost surely.

Proof

Recall that Algorithm 2 generates two sequences of random variables \(\{X_{sat,n}\}_{n\in \mathbb {N}}\) and \(\{X_{usat,n}\}_{n\in \mathbb {N}}\). From [17, Theorem 1] we get that the posterior mean built from \(X_{sat,1},\ldots ,X_{sat,n}\) (resp. \(X_{usat,1},\ldots ,X_{usat,n}\)) converges a.s., for \(n\rightarrow \infty \), to \(\mathbb {E}[X_{sat}]\) (resp. \(\mathbb {E}[X_{usat}]\)). In particular, the posterior probability of any open interval containing the posterior mean (11) must converge to 1. Therefore, the posterior probability of any interval not including the posterior mean must converge to 0.

Now, the interval \((0, \hat{p}_{usat}+\xi )\) contains the posterior mean \((\hat{p}_{usat})\) of \(X_{usat,n}\) and therefore the posterior probability \(F_{(u+\alpha , n-u+\beta )}(\hat{p}_{usat} + \xi )\) converges to 1. Also, the interval \((0, \hat{p}_{sat}-\xi )\) does not contain the mean \((\hat{p}_{sat})\) of \(X_{sat,n}\), so \(F_{(s+\alpha , n-s+\beta )}(\hat{p}_{sat} - \xi )\) tends to 0, and this concludes the proof.    \(\square \)

(Algorithm 2 – listing omitted.)

In the next section we extend our technique to hybrid systems that feature both nondeterministic and random parameters.

4 Cross-Entropy Algorithm

We perform probabilistic reachability analysis for hybrid systems featuring both random and nondeterministic parameters by solving an optimisation problem aimed at finding parameter values for which the system achieves maximum (minimum) reachability probability. We present an algorithm (Algorithm 3) based on the cross-entropy (CE) method [10], a powerful stochastic technique for solving estimation and optimisation problems. The main idea behind the CE method is obtaining the optimal parameter distribution by minimizing the distance between two probability density functions. The cross-entropy (or Kullback-Leibler divergence) between two probability density functions g and f is:

$$\begin{aligned} \varTheta (g,f) = \int g(\varvec{\lambda })\ln {\frac{g(\varvec{\lambda })}{f(\varvec{\lambda })}}d\varvec{\lambda }. \end{aligned}$$

The CE is nonnegative and \(\varTheta (g,f)=0\) iff \(g = f\), but it is not symmetric (\(\varTheta (g,f) \not = \varTheta (f,g)\)), so it is not a distance in the formal sense.
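For example (a standard identity, recalled here only to illustrate the asymmetry), for two univariate normal densities \(g = N(\mu _{0},\sigma _{0}^{2})\) and \(f = N(\mu _{1},\sigma _{1}^{2})\) one has

$$\begin{aligned} \varTheta (g,f) = \ln \frac{\sigma _{1}}{\sigma _{0}} + \frac{\sigma _{0}^{2} + (\mu _{0}-\mu _{1})^{2}}{2\sigma _{1}^{2}} - \frac{1}{2}, \end{aligned}$$

which in general differs from \(\varTheta (f,g)\).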

The optimisation problem solved by the CE method can be formulated as follows: given a family of densities \(\{f(\cdot ;\mathbf{v})\}_{\mathbf{v} \in V}\), find the value \(\mathbf{v}^*\in V\) that minimizes \(\varTheta (g^{*}, f(\cdot ;\mathbf{v}))\), where \(g^{*}\) is the optimal density. The CE method comprises two general steps: (1) generating random samples from some initial distribution; (2) updating the distribution based on the obtained samples in order to obtain better samples in the next iteration. Note that for solving optimisation problems it is necessary that the family \(\{f(\cdot ;\mathbf{v})\}_{\mathbf{v} \in V}\) contains distributions that can approximate arbitrarily well single-point distributions.

In Algorithm 3 we use a parametrized family of normal distributions \(f(\varvec{\lambda };\mathbf{v})\) (the first element of \(\mathbf{v}\) is the mean and the second element is the standard deviation). Initially the standard deviation should be relatively large in order to sample a larger space on the first iteration of the algorithm. Let D be the definition domain of the nondeterministic parameters (obtained by projecting the hybrid system parameter space P over the nondeterministic parameters only). Starting with \(\mathbf{v}_{0} = \{\mu _{0}, \sigma _{0}\}\), such that \(\mu _{0}\) is the center of D and each element of \(\sigma _{0}\) is half the size of the corresponding interval of D, the algorithm draws s samples from \(f(\varvec{\lambda }| \mu _{0}, \sigma _{0})\) and evaluates them using the sample performance function:

$$\begin{aligned} P(\varvec{\lambda }) = \begin{cases} \text {probability that } H(\varvec{\lambda }) \text { reaches the goal} &\text {if}\ \varvec{\lambda } \in D \\ -\infty &\text {otherwise}. \end{cases} \end{aligned}$$

To compute \(P(\cdot )\) we run Algorithm 2 and take the midpoint of the returned interval. Note that when solving probability minimization problems the second option in the definition of \(P(\cdot )\) should be changed to \(\infty \).

For a fixed number of samples, it becomes harder to draw samples lying inside D as the number of nondeterministic parameters increases. In fact, given n nondeterministic parameters the probability that a sample \(\varvec{\lambda }\) belongs to D is equal to:

$$\begin{aligned} \mathbf{Pr}(\varvec{\lambda } \in D) = \prod _{j = 1}^{n} \int _{D_j} f(\lambda _{j}|\mu _{j}, \sigma _{j}) d\lambda _{j} \end{aligned}$$
(12)

where \(D_j\) is the domain of the j-th parameter, and we assumed that the parameters are sampled independently. Hence, in order to increase the likelihood that s samples lie in D it is sufficient to generate \(s^{*} = \lceil \frac{s}{\eta }\rceil \) samples, where \(\eta = \mathbf{Pr}(\varvec{\lambda } \in D)\) is obtained using (12). The performance of each sample is then evaluated, and the samples are sorted in descending order (ascending in the case of probability minimization) according to their performance value. We label a number \(k = \lceil \rho s^{*}\rceil \) of them as elite samples, where \(\rho \in [10^{-2}, 10^{-1}]\) is a positive constant chosen by the user. The set of elite samples E is then used for updating the distribution parameters \(\mu _{i}\) and \(\sigma _{i}\) on the i-th iteration of the algorithm using the formulas from [10, Chapter 8.7]:

$$\begin{aligned} \begin{aligned} \mu _{i}&= \frac{\sum _{j \in [1,k]} E_{j}}{k} \\ \sigma _{i}&= \sqrt{\frac{\sum _{j \in [1,k]}(E_{j} - \mu _{i})^{2}}{k}} \end{aligned} \end{aligned}$$
(13)
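As a quick numerical illustration of the sample-size adjustment (with purely illustrative numbers): for a single nondeterministic parameter with \(D = [1,2]\), \(\mu = 1.5\) and \(\sigma = 0.5\), formula (12) gives \(\eta = \varPhi (1) - \varPhi (-1) \approx 0.683\) (\(\varPhi \) being the standard normal CDF), so a requested \(s = 10\) leads to \(s^{*} = \lceil 10/0.683\rceil = 15\) draws.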

The algorithm terminates when the largest element of vector \(\sigma \) falls below a user-defined precision \(\hat{\sigma }\), and it outputs the estimated maximum (or minimum) reachability probability \(\mathbf{P}\) and a (nondeterministic) parameter value \(\varvec{\lambda }\) for which \(P(\varvec{\lambda }) = \mathbf{P}\).
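The sketch below (ours, not the ProbReach code) shows one such iteration for probability maximisation; the performance callback stands for \(P(\varvec{\lambda })\) above (e.g., running Algorithm 2 and taking the midpoint, and returning \(-\infty \) outside D), and Box is a hypothetical representation of the domain D:

```cpp
#include <algorithm>
#include <cmath>
#include <random>
#include <utility>
#include <vector>

// Standard normal CDF, used to evaluate the integrals in (12).
static double Phi(double z) { return 0.5 * (1.0 + std::erf(z / std::sqrt(2.0))); }

struct Box { std::vector<double> lo, hi; };   // domain D of the nondeterministic parameters

// Sketch of one iteration of Algorithm 3 (probability maximisation). The callback
// 'performance' stands for the sample performance function P(lambda); samples outside
// D are assumed to receive -infinity and therefore sort to the bottom.
void ce_iteration(const Box& D, std::vector<double>& mu, std::vector<double>& sigma,
                  int s, double rho, std::mt19937& gen,
                  double (*performance)(const std::vector<double>&)) {
    const std::size_t dim = mu.size();
    // (12): probability that a sample falls inside D (independent coordinates)
    double eta = 1.0;
    for (std::size_t j = 0; j < dim; ++j)
        eta *= Phi((D.hi[j] - mu[j]) / sigma[j]) - Phi((D.lo[j] - mu[j]) / sigma[j]);
    const int s_star = static_cast<int>(std::ceil(s / eta));   // adjusted sample size

    // Draw s_star samples from the current normal distribution and score them
    std::vector<std::pair<double, std::vector<double>>> scored;
    for (int i = 0; i < s_star; ++i) {
        std::vector<double> lambda(dim);
        for (std::size_t j = 0; j < dim; ++j)
            lambda[j] = std::normal_distribution<double>(mu[j], sigma[j])(gen);
        scored.push_back({performance(lambda), lambda});
    }
    // Elite set E: the k = ceil(rho * s_star) best-performing samples
    std::sort(scored.begin(), scored.end(),
              [](const auto& a, const auto& b) { return a.first > b.first; });
    const int k = static_cast<int>(std::ceil(rho * s_star));

    // (13): update mean and standard deviation coordinate-wise from the elite samples
    for (std::size_t j = 0; j < dim; ++j) {
        double m = 0.0, v = 0.0;
        for (int i = 0; i < k; ++i) m += scored[i].second[j];
        m /= k;
        for (int i = 0; i < k; ++i) v += std::pow(scored[i].second[j] - m, 2);
        mu[j] = m;
        sigma[j] = std::sqrt(v / k);
    }
}
```

A driver would call ce_iteration repeatedly until \(\max _{j} \sigma _{j} < \hat{\sigma }\), and then report the best performance value observed together with the corresponding \(\varvec{\lambda }\).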

5 Experiments

We apply our algorithms to three models (two of which are hybrid): a model of irradiation therapy for psoriasis [16], a car collision scenario, and a model of human starvation [12]. The algorithms have been implemented in C++, and the experiments have been carried out on a 32-core (2.9 GHz) Linux machine.

UVB Irradiation Therapy. We consider a simplified version of a hybrid UVB irradiation therapy model [16] used for treating psoriasis, an immune-system-mediated chronic skin condition which is characterised by overproduction of keratinocytes. The simplified model comprises three (six in the original model) categories of normal and three (five in the original model) categories of psoriatic keratinocytes, whose dynamics are given by nonlinear ODEs. The therapy consists of several episodes of UVB irradiation, which is simulated in the model by increasing the apoptosis rate constants (\(\beta _{1}\) and \(\beta _{2}\)) for stem cells (SC) and transit amplifying (TA) cells by \(In_{A}\) times. Every such episode lasts for 48 h and is followed by 8 h of rest (\(In_{A}=1\)) before starting the next irradiation. The efficiency of the therapy depends on the number of alternations between the irradiation and rest stages: an insufficient number of treatment episodes can result in early psoriasis relapse, and the deterministic version of this model predicts psoriasis relapse when the number of therapy episodes is less than seven [16]. We consider the parameter \(In_{A}\), characterising the therapy strength, to be normally distributed with mean value \(6\cdot 10^{4}\) and standard deviation \(10^{4}\), and \(\lambda \in [0.2,0.5]\), characterising the strength of psoriatic stem cells, to be nondeterministic, and we calculate the maximum and the minimum probabilities of psoriasis relapse within 2,000 days after the last therapy episode for nine alternations (\(l=9\)) between the ‘on’ and ‘off’ therapy modes (five therapy cycles). The results (Table 1) show that the estimated maximum probability lies in the interval [0.8268, 1] for \(\lambda =0.4953\) and the minimum probability is in the interval [0, 0.1086] for \(\lambda =0.1303\). Algorithm 3 required two iterations in both cases and generated 24 (out of a total of 26) and 23 (out of 26) samples from the domain of nondeterministic parameters D.

Table 1. UVB irradiation therapy: results with \(\xi =10^{-1}\), \(c=0.99\), \(\delta =10^{-3}\), \(\rho =10^{-1}\), \(s=10\) and \(\hat{\sigma }=10^{-2}\), where \(\lambda \) – estimated value of nondeterministic parameter, \(\mu _{\lambda }\) and \(\sigma _{\lambda }\) – mean and standard deviation of the resulting distribution, CI – confidence interval returned, \(s_{R}\) – total number of random samples used during \(s^{*}_{N}\) executions of Algorithm 2, \(s_{N}\) – total number of nondeterministic samples, \(s^{*}_{N}\) – number of nondeterministic samples drawn from D, i – number of iterations of Algorithm 3, Time – CPU time in seconds.
Table 2. The minimum probability for the cars collision scenario with \(\xi =5\cdot 10^{-3}\), \(c=0.99\), \(\delta =10^{-3}\), \(\rho =10^{-1}\) and \(\hat{\sigma }=10^{-1}\), where \(\tau _{react}\) – CarB driver reaction time, \(\tau _{safe}\) – time interval between the cars, \(\mu _{\tau _{react}}\) and \(\sigma _{\tau _{react}}\) – mean and standard deviation of the resulting distribution for \(\tau _{react}\), \(\mu _{\tau _{safe}}\) and \(\sigma _{\tau _{safe}}\) – mean and standard deviation of the resulting distribution for \(\tau _{safe}\), CI – confidence interval returned, \(s_{R}\) – total number of random samples used during \(s^{*}_{N}\) executions of Algorithm 2, \(s_{N}\) – total number of nondeterministic samples, \(s^{*}_{N}\) – number of nondeterministic samples drawn from D, Time – CPU time in seconds.

Cars Collision Scenario. We consider an overtaking and deceleration scenario modelled as a hybrid system. Initially two cars are moving with speed \(\upsilon _{A0}=\upsilon _{B0}=11.12 ~m/s\) at a distance \(\upsilon _{A} \cdot \tau _{safe}\) from each other, where \(\tau _{safe} \in [1,2]~s\) is nondeterministic. In the initial mode CarA changes the lane and starts accelerating until it gets ahead of CarB by \(\upsilon _{B} \cdot \tau _{safe}\) meters. After that CarA changes the lane back and starts decelerating with normally-distributed acceleration \(a_{dA} \sim N(-2,0.2)\). The driver in CarB reacts within \(\tau _{react} \in [0.5,1.5]~s\) and starts decelerating with acceleration \(a_{dB} \sim N(-1.35,0.1)\) until both cars stop completely.

The model contains three modes: CarA overtakes CarB, CarA decelerates while CarB keeps moving for \(\tau _{react}\) seconds, and both cars decelerate until they stop. There are two nondeterministic (\(\tau _{safe}\) and \(\tau _{react}\)) and two random (\(a_{dA}\) and \(a_{dB}\)) parameters in the system. We aim at determining whether there is a non-zero probability of the cars colliding (\(l=2\)).

We apply Algorithm 3 to this model with different values of s, the CE sample size. The obtained results (Table 2) confirm that choosing smaller values of \(\tau _{react}\) and larger values of \(\tau _{safe}\) decreases the collision probability. Also, choosing a larger s increases the accuracy of the obtained result from \(P(0.609, 1.791) = [0.0252,0.0352]\) for \(s = 10\) to \(P(0.522, 1.953) = [0.0121,0.0221]\) for \(s = 20\). The execution of the algorithm took three iterations in both cases, drawing 32 (out of 43) and 57 (out of 90) samples lying in D for \(s=10\) and \(s=20\), respectively.

Human Starvation. The human starvation model [12] tracks the amount of fat (F), protein in muscle mass (M), and ketone bodies (K) in the human body after glucose reserves have been depleted from three to four days of fasting. These three variables are modelled using material and energy balances to ensure that the behaviour of the model tracks what is observed in actual experiments involving fasting. Randomising two model parameters, we evaluate the probability of a 40 % decrease in the muscle mass by day \(\tau _{g}\) of fasting, where \(\tau _{g} \in [20,27]\) is a nondeterministic parameter. The reachability depth value l is 0. The results (Table 3) demonstrate that the maximum probability of losing 40 % of the muscle mass is within the interval [0.99131, 1] for \(\tau _{g} = 26.47\) and the minimum probability is inside [0, 0.0057] for \(\tau _{g} = 20.22\). The execution of the algorithm took three iterations in both cases, drawing 31 (out of 37) and 34 (out of 36) samples from D for calculating the minimum and the maximum probabilities, respectively.

Table 3. The minimum and the maximum reachability probabilities for the human starvation model with \(\xi =5\cdot 10^{-3}\), \(c=0.99\), \(\delta =10^{-3}\), \(\rho =10^{-1}\), \(s=10\) and \(\hat{\sigma }=10^{-1}\), where \(\tau _{g}\) – time (days) from the beginning of fasting, \(\mu _{\tau _{g}}\) and \(\sigma _{\tau _{g}}\) – mean and standard deviation of the resulting distribution, CI – confidence interval returned, \(s_{R}\) – total number of random samples used during \(s^{*}_{N}\) executions of Algorithm 2, \(s_{N}\) – total number of nondeterministic samples, \(s^{*}_{N}\) – number of nondeterministic samples drawn from D, Time – CPU time in seconds.

Discussion. From our results we see that the chosen value of \(\delta \) did not affect the length (\(2\xi \)) of the returned confidence intervals in any experiment. Also, choosing a larger number of samples per iteration (s) in Algorithm 3 and a higher precision (i.e., a smaller \(\xi \)) for Algorithm 2 increases the accuracy of the obtained result. The sample size adjustment in Algorithm 3 increases the likelihood of drawing the desired number of samples from the domain of nondeterministic parameters. For example, in the cars collision scenario, featuring two nondeterministic parameters, almost a third of all drawn samples were outliers; nevertheless, the desired number of samples belonging to the domain of nondeterministic parameters was still obtained. Finally, the performance of Algorithms 2 and 3 depends significantly on the complexity of the system’s dynamics. For example, the UVB irradiation therapy model is more complex than the other two models; as a result, Algorithm 1 required more CPU time for evaluating each pair of (random and nondeterministic) samples.

Implementation. All algorithms presented in this paper were implemented in our tool ProbReach [11], which can be downloaded from https://github.com/dreal/probreach. We also used dReal [5] as an SMT solver (\(\delta \)-complete decision procedure). The models used in this section can be found at https://github.com/dreal/probreach/tree/master/model/hvc2016.

6 Conclusions and Future Work

We introduce novel Monte Carlo (i.e., statistical) techniques for computing both numerically and statistically rigorous confidence intervals for bounded reachability probability in hybrid systems with random and nondeterministic parameters. To enable formal numerical reasoning we employ \(\delta \)-complete SMT decision procedures, and we combine them with sequential Bayesian estimation and the cross-entropy method. We exploit \(\delta \)-complete procedures to build under- and over-approximations of the reachability probability. We prove the correctness of such approximations, the statistical validity of our techniques, and termination of our Bayesian algorithm. Our techniques compute confidence intervals that are formally and statistically correct independently of the numeric precision (\(\delta \)) used. This offers users the choice of trading accuracy of the returned interval for computational cost, thereby enabling faster verification. Our experiments with highly nonlinear hybrid systems show that our techniques are useful in practice.

For future work, understanding the relation between the numerical precision (\(\delta \)) and the returned interval size is an important avenue of research. Also, we plan to extend the range of models analysable (e.g., probabilistic jumps and stochastic differential equations).