
1 Introduction

1.1 Falsification

In the design of Cyber-Physical Systems (CPSs), the importance of quality assurance is ever-rising; thus model-based development (MBD)—building virtual models (e.g. Simulink/Stateflow blocks) of products and verifying properties on these models by mathematical methodologies—has become standard. However, at present, the complexity of industrial virtual models overwhelms the scalability of state-of-the-art formal verification methodologies.

In this situation, falsification is attracting attention as a viable approach to quality assurance [2, 6, 9, 11]. The falsification problem is formulated as follows.

  • Given: a system model \(\mathcal {M}\) with its input domain D, and a specification \(\varphi \)

  • Return: a counterexample input \(x \in D\) such that its corresponding output \(\mathcal {M}(x)\) violates the specification \(\varphi \) (if such an input exists).

By solving the above falsification problem, we expect to obtain the following insights: (1) we detect errors in which the system violates the specification \(\varphi \); and (2) in case no such error is found, we can say that “a violation of the specification \(\varphi \) is unlikely to happen.”

1.2 Robustness Guided Falsification

As formal expressions of real-time specifications on CPSs, metric interval temporal logic (MITL) [1] and its adaptation signal temporal logic (STL) [12] are actively studied. For these specifications, one common class of algorithms to solve falsification is robustness guided falsification [2, 6]. A technical core of these algorithms is the robust semantics [7, 8] of these logics. In robust semantics, in contrast to conventional Boolean semantics, a truth value is quantitative, \({\llbracket \mathcal {M}(x),\, \varphi \rrbracket } \in {\mathbb {R}}\): it is greater than 0 if the formula \(\varphi \) is satisfied, and its magnitude denotes “how robustly the current output \(\mathcal {M}(x)\) satisfies \(\varphi \).” With robust semantics, we can reduce falsification problems to numerical optimization problems, that is, we search for a counterexample input \(x \in D\) by iterating the following steps (for \(t=1 \dots N\)).

  1. Pick an input \(x_t \in D\) (in a stochastic manner).

  2. Compute the output \(\mathcal {M}(x_t)\) by numerical simulation (e.g. the \(\mathsf {sim}\) function in Simulink).

  3. Check the robustness \({\llbracket \mathcal {M}(x_t),\, \varphi \rrbracket }\).

  4. If the robustness is less than or equal to 0, then return \(x_t\); otherwise pick a new input \(x_{t+1}\), hopefully one with which the robustness becomes smaller. (A minimal sketch of this loop is given after the list.)
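
Below is a minimal Python sketch of this loop, assuming the model \(\mathcal {M}\) and the robustness computation are stubbed by hypothetical `simulate` and `robustness` functions; inputs are drawn uniformly at random here, rather than by the optimization algorithms discussed below.

```python
import numpy as np

def falsify_random(simulate, robustness, input_low, input_high, n_iterations=100, seed=0):
    """Robustness-guided falsification by naive random sampling.

    simulate(x)       -> output trace M(x) of the model (hypothetical stub)
    robustness(trace) -> quantitative satisfaction [[M(x), phi]] (hypothetical stub)
    """
    rng = np.random.default_rng(seed)
    best_x, best_rob = None, np.inf
    for t in range(n_iterations):
        # 1. pick an input x_t in D (here: uniformly at random)
        x = rng.uniform(input_low, input_high)
        # 2. compute the output M(x_t) by numerical simulation
        trace = simulate(x)
        # 3. check the robustness [[M(x_t), phi]]
        rob = robustness(trace)
        if rob < best_rob:
            best_x, best_rob = x, rob
        # 4. robustness <= 0 means x_t is a counterexample
        if rob <= 0:
            return x, rob
    return best_x, best_rob  # no counterexample found within the budget
```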

In industrial practice, a system model \(\mathcal {M}\) is often huge and complex; hence, among the above four steps, the second one, the numerical simulation step, tends to be the most time-consuming: it sometimes takes several tens of seconds per simulation. Therefore, reducing the number of iterations needed to minimize the robustness \({\llbracket \mathcal {M}(x_t),\, \varphi \rrbracket }\) is essential. To this end, the application of various numerical optimization algorithms (e.g. simulated annealing [2], the cross-entropy method [14], and so on) is actively studied.

In this paper, we mainly employ Gaussian process upper confidence bound (GP-UCB) [15, 16], one of the most powerful numerical optimization algorithms. Applying GP-UCB and other optimization techniques based on Gaussian process regression to falsification of temporal logic properties is actively studied [3–5]. We give a further illustration of GP-UCB in Sect. 3.

1.3 Our Motivation: Falsification of Conditional Safety Property

In this paper, as the class of specifications to be falsified, we focus on conditional safety properties, a common class of specifications in the development of CPSs:

Whenever a model satisfies an antecedent condition \(\varphi _{\mathsf {cond}}\), then at that time, the model also satisfies a safety property \(\varphi _{\mathsf {safe}}\).

With this class of formulas, we can express various requirements on the behavior of the system under various specific conditions. Hence, for a given system, verifying a conditional safety property is as important as verifying a safety property.

In STL, we usually encode such a property as a formula of the form \(\square _{I}(\lnot \varphi _{\mathsf {cond}} \vee \varphi _{\mathsf {safe}})\). Note that, in conventional Boolean semantics, this formula is equivalent to \(\square _{I}(\varphi _{\mathsf {cond}} \rightarrow \varphi _{\mathsf {safe}})\). In robustness guided falsification, we search for a counterexample by minimizing the robustness of the formulas \(\lnot \varphi _{\mathsf {cond}}\) and \(\varphi _{\mathsf {safe}}\) simultaneously.

However, there is the following gap between this straightforward reduction to numerical optimization and what we expect to obtain through falsification: when we write down a conditional safety property, we would like to say something meaningful about the dynamics of the model when the antecedent condition \(\varphi _{\mathsf {cond}}\) holds; but over the iterated simulations, we cannot guarantee that enough behaviors are observed in which the system satisfies the antecedent condition \(\varphi _{\mathsf {cond}}\). From this point of view, we expect an optimization algorithm that falsifies a conditional safety property

  • with as small a number of iterations as possible to find a counterexample \(x \in D\); and

  • while picking enough inputs \(x_{j_1} \dots x_{j_n}\) that steer the whole model to satisfy the antecedent condition \(\varphi _{\mathsf {cond}}\).

To this end, we propose a novel algorithm that picks a suitable input in each step of the iteration while satisfying the above twofold requirements. A technical highlight is that, with Gaussian process regression, we estimate the function \(F^*: x \mapsto {\llbracket \mathcal {M}(x),\, \square _{I}\lnot \varphi _{\mathsf {cond}} \rrbracket }\), and obtain an input subspace \(D' \subset D\) such that, for any input \(x \in D'\), the output \(\mathcal {M}(x)\) satisfies the antecedent condition \(\varphi _{\mathsf {cond}}\) with high probability.

Related Work. The difficulty of falsification lies in observing a rare event (here, that the conditional safety property is false). Our technique is based on the following idea: we consider a superset-event that happens much more likely than the original one (namely, that \(\varphi _{\mathsf {cond}}\) holds), and from the input space, we “prune” the region in which the superset-event does not happen. This idea is shared with importance sampling. Indeed, our Proposition 2.4 is an instance of the decomposition in Sect. 4.1 of [10].

While importance sampling explores inputs by stochastic sampling, GP-UCB deterministically chooses the next input; hence combining these two optimization techniques is not straightforward. One of our contributions is that we realize the above “pruning” in GP-UCB style optimization by employing regression.

2 Signal Temporal Logic (STL)

Definition 2.1

(syntax). Let \(\mathbf {Var}\) be a set of variables. The set of STL formulas is inductively defined as follows.

$$\begin{aligned} \varphi&\,::=\, f(v_1, \dots , v_n) > 0 \mid \bot \mid \top \mid \lnot \varphi \mid \varphi \vee \varphi \mid \varphi \mathbin {\mathcal {U}_{I}} \varphi \end{aligned}$$

where f is an n-ary function \(f:{\mathbb {R}}^n \rightarrow {\mathbb {R}}\cup \{-\infty , \infty \}\), \(v_1, \dots , v_n \in \mathbf {Var}\), and I is a closed non-singular interval in \({\mathbb {R}}_{\ge 0}\), i.e. \(I=[a,b]\) or \([a, \infty )\) where \(a<b\) and \(a \in {\mathbb {R}}\). We also define the following derived operators, as usual: \(\varphi _1 \wedge \varphi _2 \equiv \lnot (\lnot \varphi _1 \vee \lnot \varphi _2)\), \(\varphi _1 \mathbin {\mathcal {R}_{I}} \varphi _2 \equiv \lnot (\lnot \varphi _1 \mathbin {\mathcal {U}_{I}} \lnot \varphi _2)\), \(\Diamond _{I} \varphi \equiv \top \mathbin {\mathcal {U}_{I}} \varphi \), and \(\square _{I} \varphi \equiv \bot \mathbin {\mathcal {R}_{I}} \varphi \).

Definition 2.2

(robust semantics of STL). Let \(\sigma :{\mathbb {R}}_{\ge 0}\rightarrow {\mathbb {R}}^\mathbf {Var}\) be a signal and \(\varphi \) be an STL formula. We define the robustness \({\llbracket \sigma ,\, \varphi \rrbracket } \in {\mathbb {R}}\cup \{-\infty , \infty \}\) inductively as follows. Here \(\sqcap \) and \(\sqcup \) denote infima and suprema of real numbers, respectively.
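
Following the standard quantitative semantics of STL [7, 8], the clauses are as follows, where \(\sigma ^t\) denotes the signal \(\sigma \) shifted by t, i.e. \(\sigma ^t(t') = \sigma (t+t')\).

$$\begin{aligned} {\llbracket \sigma ,\, f(v_1,\dots ,v_n)>0 \rrbracket }&= f\bigl (\sigma (0)(v_1),\dots ,\sigma (0)(v_n)\bigr )\\ {\llbracket \sigma ,\, \bot \rrbracket }&= -\infty \qquad \qquad {\llbracket \sigma ,\, \top \rrbracket } = \infty \\ {\llbracket \sigma ,\, \lnot \varphi \rrbracket }&= -{\llbracket \sigma ,\, \varphi \rrbracket }\\ {\llbracket \sigma ,\, \varphi _1 \vee \varphi _2 \rrbracket }&= {\llbracket \sigma ,\, \varphi _1 \rrbracket } \sqcup {\llbracket \sigma ,\, \varphi _2 \rrbracket }\\ {\llbracket \sigma ,\, \varphi _1 \mathbin {\mathcal {U}_{I}} \varphi _2 \rrbracket }&= \bigsqcup _{t \in I} \Bigl ( {\llbracket \sigma ^t,\, \varphi _2 \rrbracket } \sqcap \mathop {\sqcap }_{t' \in [0,t)} {\llbracket \sigma ^{t'},\, \varphi _1 \rrbracket } \Bigr ) \end{aligned}$$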

Notation 2.3

Let \(f:{\mathbb {R}}^n \rightarrow {\mathbb {R}}\cup \{-\infty , \infty \}\). We define the Boolean abstraction of f as the function \(\overline{f}:{\mathbb {R}}^n \rightarrow {\mathbb {B}}\) such that \(\overline{f}(v) = \top \) if \(f(v) > 0\), and \(\overline{f}(v) = \bot \) otherwise. Similarly, for an STL formula \(\varphi \), we denote by \(\overline{\varphi }\) the formula obtained by replacing every atomic function f occurring in \(\varphi \) with its Boolean abstraction \(\overline{f}\). We see that \({\llbracket \sigma ,\, \varphi \rrbracket } > 0\) implies \({\llbracket \sigma ,\, \overline{\varphi } \rrbracket } > 0\).

As we saw in Sect. 1.3, conditional safety properties are written as STL formulas of the form \(\square _{I}(\lnot \varphi _{\mathsf {cond}} \vee \varphi _{\mathsf {safe}})\), whose intuitive meaning is “\(\varphi _{\mathsf {safe}}\) holds whenever \(\varphi _{\mathsf {cond}}\) is satisfied.” To enforce that our algorithm in Sect. 4 picks inputs satisfying the antecedent condition \(\varphi _{\mathsf {cond}}\), we convert the formula into a logically equivalent one. The converted formula consists mainly of two parts, one of which stands for whether “the antecedent condition \(\varphi _{\mathsf {cond}}\) is satisfied or not.”

Proposition 2.4

For any signal \(\sigma \) and STL formulas \(\varphi _1\), \(\varphi _2\), the following holds.

$$\begin{aligned} {\llbracket \sigma ,\, \square _{I}(\lnot \varphi _1 \vee \varphi _2) \rrbracket }> 0 \iff {\llbracket \sigma ,\, \square _{I}\lnot \varphi _1 \rrbracket } \sqcup {\llbracket \sigma ,\, \square _{I}(\lnot \overline{\varphi _1} \vee \varphi _2) \rrbracket } > 0 \end{aligned}$$
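
As an illustration of this equivalence, the following Python sketch evaluates both sides on a finite, sampled signal, taking \(\varphi _1 \equiv (x > 0)\) and \(\varphi _2 \equiv (y > 1)\); the discrete max/min over samples approximates \(\sqcup \) and \(\sqcap \), and the signal values are made-up data for illustration only.

```python
import numpy as np

# A sampled signal: sigma[v][i] is the value of variable v at sample i (made-up data).
sigma = {
    "x": np.array([-1.0, 0.5, 2.0, 3.0, -0.5]),
    "y": np.array([ 0.0, 1.5, 0.8, 2.5,  4.0]),
}

# Atomic robustness traces: rob_phi1[i] = [[sigma^i, phi1]] with phi1 = (x > 0),
# and rob_phi2[i] = [[sigma^i, phi2]] with phi2 = (y > 1).
rob_phi1 = sigma["x"]          # f(x) = x
rob_phi2 = sigma["y"] - 1.0    # f(y) = y - 1

def always(trace):
    """Robustness of []_I phi over the whole (finite) horizon: the infimum."""
    return np.min(trace)

def bool_abs(trace):
    """Boolean abstraction of an atomic trace (Notation 2.3): +inf where it holds, -inf otherwise."""
    return np.where(trace > 0, np.inf, -np.inf)

# Left-hand side:  [[sigma, []_I (not phi1 \/ phi2)]]
lhs = always(np.maximum(-rob_phi1, rob_phi2))

# Right-hand side: [[sigma, []_I not phi1]] |_| [[sigma, []_I (not phi1bar \/ phi2)]]
rhs = max(always(-rob_phi1),
          always(np.maximum(-bool_abs(rob_phi1), rob_phi2)))

# Proposition 2.4: the two sides agree on being > 0.
print(lhs > 0, rhs > 0)   # False False: phi1 holds at sample 2 but phi2 does not
```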

3 Gaussian Process Upper Confidence Bound (GP-UCB)

As mentioned in Sect. 1.3, in robustness guided falsification we minimize \(F^*: x \mapsto {\llbracket \mathcal {M}(x),\, \varphi \rrbracket }\), picking inputs iteratively and hopefully with smaller and smaller robustness values. For this purpose, Gaussian process upper confidence bound (GP-UCB) [15, 16] is one of the most powerful algorithms, as seen in [3–5].

The key idea of the algorithm is that, in each iteration round \(t = 1,\dots ,N\), we estimate the Gaussian process [13] \(\mathrm {GP}(\mu , k)\) that is most likely to have generated the points observed up to round t. Here the two parameters \(\mu : D \rightarrow {\mathbb {R}}\) and \(k: D^2 \rightarrow {\mathbb {R}}\) are called the mean function and the covariance function, respectively. Very roughly speaking, for each \(x \in D\), the value \(\mu (x)\) of the mean function stands for the expected value of \(F^*(x)\), and the value \(k(x,x)\) of the covariance function at each diagonal point stands for the magnitude of uncertainty about \(F^*(x)\).
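
The following NumPy sketch illustrates this estimation: given observed pairs \((x_i, F^*(x_i))\), it computes the posterior mean \(\mu (x)\) and posterior variance k(x, x) of a zero-mean Gaussian process at query points. The squared-exponential kernel and its hyperparameters are assumptions chosen for illustration, not those used in the cited work.

```python
import numpy as np

def sq_exp_kernel(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential covariance for 1-D inputs a, b."""
    d = a[:, None] - b[None, :]
    return signal_var * np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_obs, y_obs, x_query, noise_var=1e-6):
    """Posterior mean mu(x) and variance k(x, x) of a zero-mean GP at x_query,
    conditioned on the observations (x_obs, y_obs)."""
    K = sq_exp_kernel(x_obs, x_obs) + noise_var * np.eye(len(x_obs))
    K_star = sq_exp_kernel(x_query, x_obs)            # cross-covariances
    K_ss = sq_exp_kernel(x_query, x_query)
    alpha = np.linalg.solve(K, y_obs)
    mu = K_star @ alpha                               # posterior mean mu(x)
    v = np.linalg.solve(K, K_star.T)
    var = np.diag(K_ss - K_star @ v)                  # posterior variance k(x, x)
    return mu, var

# Toy usage: three observed robustness values of an unknown F*.
x_obs = np.array([0.0, 1.0, 2.5])
y_obs = np.array([1.2, 0.4, 0.9])
mu, var = gp_posterior(x_obs, y_obs, np.linspace(0.0, 3.0, 7))
```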

Pseudocode for the GP-UCB algorithm is given in Algorithm 1. As we see in the code, we pick \(x[t] = {{\mathrm{\mathsf {argmin}}}}_{x \in D}\, \mu (x) - {\beta ^{1/2}(t)}\, k(x,x)\) as the next input. Here, the first term tries to minimize the expected value \(F^*(x[t])\), and the second term tries to decrease uncertainty globally. Figure 1 illustrates how the estimated Gaussian process is updated in each iteration round of the optimization. This strategy, balancing exploration and exploitation, helps us find a minimizing input with as small a number of iterations as possible.

Fig. 1. An intuitive illustration of the GP-UCB algorithm. Each figure shows the estimated Gaussian process \(\mathrm {GP}(\mu ,k)\) at iteration rounds t and \(t+1\): the middle curve is a plot of the mean function \(\mu \), and the upper and lower curves are plots of \(\mu +\beta ^{1/2}k\) and \(\mu -\beta ^{1/2}k\). In each iteration round t, we pick the point x[t] (the red point in the left figure) that minimizes the lower curve. Once we observe the value \(F^*(x[t])\), the uncertainty at x[t] becomes smaller in the next round \(t+1\). In general, we choose an increasing function as the confidence parameter \(\beta \) so that the algorithm does not get stuck in local optima (e.g. \(\beta (t) = 2 \log (c t^2)\) for some constant c); see [15, 16].

Algorithm 1. The GP-UCB algorithm (pseudocode).
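
To make the loop concrete, here is a minimal Python sketch of GP-UCB style minimization of \(F^*\) over a finite candidate grid. It is not a reproduction of Algorithm 1: the `objective` stub, the RBF kernel, the use of the posterior standard deviation as the uncertainty term, and the schedule \(\beta (t) = 2\log (c t^2)\) (borrowed from the caption of Fig. 1) are assumptions made for illustration.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def gp_ucb_minimize(objective, candidates, n_iterations=30, c=2.0, seed=0):
    """GP-UCB style minimization of objective (e.g. x -> [[M(x), phi]])
    over a finite candidate set of shape (m, d)."""
    rng = np.random.default_rng(seed)
    X = [candidates[rng.integers(len(candidates))]]   # one random initial observation
    y = [objective(X[0])]
    for t in range(1, n_iterations + 1):
        gp = GaussianProcessRegressor(RBF(length_scale=1.0),
                                      normalize_y=True).fit(np.array(X), np.array(y))
        mu, sigma = gp.predict(candidates, return_std=True)
        beta = 2.0 * np.log(c * t ** 2)               # confidence parameter beta(t)
        # Pick the minimizer of the lower confidence bound mu(x) - beta^{1/2} * sigma(x).
        x_next = candidates[np.argmin(mu - np.sqrt(beta) * sigma)]
        y_next = objective(x_next)
        X.append(x_next); y.append(y_next)
        if y_next <= 0:                               # robustness <= 0: counterexample found
            return x_next, y_next
    i_best = int(np.argmin(y))
    return X[i_best], y[i_best]                       # best input found within the budget
```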

4 Our Algorithm: GP-UCB with Domain Estimation

We now give our algorithm for falsification of conditional safety properties with enough tests in which the model satisfies the antecedent condition. As shown in Proposition 2.4, falsification of the specification \(\square _{I}{(\lnot \varphi _{\mathsf {cond}} \vee \varphi _{\mathsf {safe}})}\) can be reduced to the following problem.

Find x such that \({\llbracket \mathcal {M}(x),\, \square _{I}\lnot \varphi _{\mathsf {cond}} \rrbracket } \sqcup {\llbracket \mathcal {M}(x),\, \square _{I}(\lnot \overline{\varphi _{\mathsf {cond}}} \vee \varphi _{\mathsf {safe}}) \rrbracket } \le 0\).

A key observation is that, when the first part of the robustness, \({\llbracket \mathcal {M}(x),\, \square _{I}\lnot \varphi _{\mathsf {cond}} \rrbracket }\), becomes less than zero, the corresponding behavior \(\mathcal {M}(x)\) of the system satisfies the antecedent condition \(\varphi _{\mathsf {cond}}\).

Based on this observation, we propose the GP-UCB with domain estimation algorithm. Pseudocode of the algorithm is given in Algorithm 2. The algorithm takes a hyperparameter R, which stands for the target hit rate, that is, what ratio of the inputs x[1], ..., x[N] should steer the model to satisfy the antecedent condition. In each iteration round of the falsification, to guarantee both fast minimization and enough tests in which \(\varphi _{\mathsf {cond}}\) holds, we pick the next input by the following strategy (see also the sketch after the pseudocode below): (1) calculate what ratio r of the inputs should make \(\varphi _{\mathsf {cond}}\) true over the remaining iterations; (2) estimate an input sub-domain \(D' \subset D\) in which the antecedent condition \(\varphi _{\mathsf {cond}}\) holds with probability r; (3) from the restricted domain \(D'\), pick a new input x to falsify the whole specification in the GP-UCB manner.

Algorithm 2. The GP-UCB with domain estimation algorithm (pseudocode).
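
The following Python sketch shows one way to realize the three-step strategy above on a finite candidate grid. It is an illustrative approximation rather than a reproduction of the pseudocode: the way the ratio r is recomputed from the target hit rate, the probability threshold used for the domain estimate, and the two stubbed robustness functions are all assumptions. `rob_cond` evaluates \(x \mapsto {\llbracket \mathcal {M}(x),\, \square _{I}\lnot \varphi _{\mathsf {cond}} \rrbracket }\) (negative means the antecedent is satisfied), and `rob_spec` evaluates the second part of the decomposed robustness.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def gp_ucb_domain_estimation(rob_cond, rob_spec, candidates,
                             n_iterations=30, target_hit_rate=0.8, c=2.0, seed=0):
    rng = np.random.default_rng(seed)
    X = [candidates[rng.integers(len(candidates))]]
    y_cond, y_spec = [rob_cond(X[0])], [rob_spec(X[0])]
    hits = int(y_cond[0] < 0)                      # inputs that satisfied phi_cond so far
    for t in range(1, n_iterations + 1):
        gp_cond = GaussianProcessRegressor(RBF(), normalize_y=True).fit(np.array(X), y_cond)
        gp_spec = GaussianProcessRegressor(RBF(), normalize_y=True).fit(np.array(X), y_spec)

        # (1) ratio r of the remaining inputs that should make phi_cond true
        remaining = n_iterations + 1 - t
        r = np.clip((target_hit_rate * (n_iterations + 1) - hits) / remaining, 0.0, 1.0)

        # (2) estimated sub-domain D': P([[M(x), []_I not phi_cond]] < 0) >= r
        mu_c, sd_c = gp_cond.predict(candidates, return_std=True)
        in_domain = norm.cdf(-mu_c / np.maximum(sd_c, 1e-9)) >= r
        if not in_domain.any():
            in_domain = np.ones(len(candidates), dtype=bool)   # fall back to the whole domain

        # (3) GP-UCB selection of the next input, restricted to D'
        mu_s, sd_s = gp_spec.predict(candidates[in_domain], return_std=True)
        beta = 2.0 * np.log(c * t ** 2)
        x_next = candidates[in_domain][np.argmin(mu_s - np.sqrt(beta) * sd_s)]

        yc, ys = rob_cond(x_next), rob_spec(x_next)
        X.append(x_next); y_cond.append(yc); y_spec.append(ys)
        hits += int(yc < 0)
        if max(yc, ys) <= 0:                       # whole decomposed robustness <= 0
            return x_next                          # counterexample found
    return None                                    # no counterexample within the budget
```

With a well-calibrated `gp_cond`, the restriction in step (3) keeps the hit rate close to the target R, while the GP-UCB selection still drives the robustness of the whole specification down.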

5 Experiments

We examine whether our GP-UCB with domain estimation algorithm achieves both fast minimization and enough tests satisfying the antecedent condition \(\varphi _{\mathsf {cond}}\).

As a CPS model, we choose the powertrain control verification benchmark [11]. This is an engine model with a controller that regulates the air/fuel ratio in the exhaust gas. The model has a 3-dimensional input parameter space, and the controller has two main modes: a feedback mode and a feed-forward mode. As the conditional safety specification to falsify, we experiment with the following STL formula \(\varphi \). In this formula, the antecedent condition is \(\mathsf {mode} = \mathsf {feedforward}\), that is, we would like to observe the behavior of the system in the feed-forward mode.

$$\begin{aligned} \square _{[\tau ,\infty )}{\big (\lnot (\mathsf {mode} = \mathsf {feedforward}) \vee |\mathsf {ratio_{A/F}}| < 0.2 \big )} \end{aligned}$$
(1)

In fact, for this model, the formula (1) has no counterexample input, and with the original GP-UCB algorithm, about 58% of the inputs lead the whole system to the feed-forward mode. We then run our GP-UCB with domain estimation algorithm with the target hit rate set to \(R = 0.8\), and observe that about 79% of the inputs satisfy the antecedent condition.

6 Conclusion

To falsify conditional safety properties while enforcing the generated inputs to satisfy the antecedent condition, we have provided an optimization algorithm based on Gaussian process regression techniques.