
1 Introduction

Timed automata [1] is a widely used formalism for the modeling and verification of systems and software with time-dependent behavior. In timed automata models, erroneous or unsafe behavior (that is to be avoided during operation) is often modeled by error locations. The location reachability problem deals with the question whether a given error location is reachable from an initial state along the transitions of the automaton.

As timed automata contain real-valued clock variables, model checkers for timed automata apply abstraction over clock variables to ensure performance and termination. The standard solution involves performing a forward exploration in the zone abstract domain [7], combined with extrapolation [3] parametrized by bounds appearing in guards, extracted by static analysis [2]. Other zone-based methods propagate bounds lazily for all transitions [11] or along an infeasible path [10], and perform efficient inclusion checking with respect to a non-convex abstraction induced by the bounds [12]. Alternatively, some methods perform lazy abstraction directly over the zone abstract domain [19, 20]. However, in the context of timed automata, methods rarely address the problem of abstraction for the discrete data variables that often appear in specifications of practical real-time systems, or do so by applying a fully SMT-based approach that relies on the efficiency of the underlying decision procedures for the abstraction of both continuous and discrete variables.

In our work, we address the location reachability problem of timed automata with discrete variables by proposing an abstraction method that lazily controls the visibility of discrete variables occurring in such specifications. If the abstraction is too coarse to disable an infeasible transition, then we propagate the pre-image of the transition backward using weakest precondition computation, and use interpolation (defined for variable assignments) to extract a set of variables that is sufficient to block the transition from the abstract state. We use interpolation in a similar fashion to enforce, when possible, coverage of a newly discovered state by an already visited state, this way effectively pruning the search space. Our method does not rely on an interpolating SMT solver, and can be freely combined with eager or lazy zone-based forward search methods for the efficient handling of clock variables.

We evaluated the proposed abstraction method by combining it with lazy refinement techniques for continuous variables. Results show that in terms of execution time our method performs similarly to lazy methods without abstraction of discrete variables, but generates a smaller (in some cases significantly smaller) state space.

Comparison to Related Work. Lazy abstraction [9], a form of counterexample-guided abstraction refinement [6], is an approach widely used for reachability checking, and in particular for model checking software. It consists of building an abstract reachability graph on-the-fly, representing an abstraction of the system, and refining a part of the tree in case a spurious counterexample is found. For timed automata, a lazy abstraction approach based on non-convex LU-abstraction and on-the-fly propagation of bounds has been proposed [10]. A significant difference of this algorithm compared to usual lazy abstraction algorithms is that it builds an abstract reachability graph that preserves exact reachability information (a so-called adaptive simulation graph or ASG). As a consequence, it is able to apply refinement as soon as the abstraction admits a transition that is disabled in the concrete system. Similar abstraction techniques based on building an ASG include difference bound constraint abstraction [20] and the zone interpolation-based technique of [19]. In our work, we follow the same approach, but for discrete variables instead of clock variables. The proposed abstraction method is orthogonal to the aforementioned techniques and can be freely combined with any of them.

Symbolic handling of integer variables for timed automata is often supported by unbounded, fully symbolic SMT-based approaches. Symbolic backward search techniques like [5, 17] are based on the computation and satisfiability checking of pre-images. In [13], reachability checking for timed automata is addressed by solving Horn clauses. In the IC3-based technique of [15], the problem of discrete variables is not addressed directly, but the possibility of generalization over discrete variables is (to some extent) inherent to the technique. In [14], also based on IC3, generalization of counterexamples to induction is addressed for both discrete and clock variables by zone-based pre-image computation. In our work, we propose an abstraction method over discrete variables that is completely theory agnostic and does not rely on an SMT solver.

In [8], an abstraction refinement algorithm is proposed for timed automata that handles clock and discrete variables in a uniform way. There, given a set of visible variables, an abstracted timed automaton is derived from the original by removing all assignments to abstracted variables, and by replacing each constraint by the strongest implied constraint that does not contain abstracted variables. In case the model checker finds an abstract counterexample, a linear test automaton is constructed for the path, which is then composed with the original system to check whether the counterexample is spurious. If the final location of the test automaton is unreachable, a set of relevant variables is extracted from the disabled transition and included in the next iteration of the abstraction refinement loop. In our work, we use a similar approach, but instead of building abstractions globally on the system level and then calling a model checker for both model checking and counterexample analysis, we use a more integrated, lazy abstraction method, where the abstraction is built on-the-fly, and refinement is performed locally in the part of the state space where more precision is necessary.

Interpolation for variable assignments was first described in [4]. There, the interpolant is computed for a prefix and a suffix of a constraint sequence, and an inductive sequence of interpolants is computed by propagating interpolants forward using the abstract post-image operator. In our work, we define interpolation for a variable assignment and a formula, and compute inductive sequences of interpolants by propagating interpolants backward using weakest precondition computation. In our context, this enables us to consider a suffix of an infeasible path, instead of the whole path, for computing inductive sequences of interpolants.

Organization of the Paper. The rest of the paper is organized as follows. In Sect. 2, we define the notations used throughout the paper, and present the theoretical background of our work. In Sect. 3, we propose a lazy reachability checking algorithm for timed automata based on the visibility of discrete variables. Section 4 describes experiments performed with the proposed algorithm. Finally, conclusions are given in Sect. 5.

2 Background and Notations

Let \(V\) be a set of data variables over \(\mathbb {Z}\), and \(X\) a set of clock variables over \(\mathbb {R}_{\ge 0}\). A data constraint over \(V\) is a well-formed formula \(\varphi \in DC (V)\) built from variables in \(V\) and arbitrary function and predicate symbols interpreted over \(\mathbb {Z}\). A clock constraint over \(X\) is a formula \(\varphi \in CC (X)\) that is a conjunction of atoms of the form \(x \prec c\) and \({x_i - x_j \prec c}\) where \(x, x_i, x_j \in X\), \(c \in \mathbb {Z}\) and \(\prec {} \in \left\{ <, \le \right\} \). A data update over \(V\) is an assignment \(u \in DU (V)\) of the form \({v}:={t}\) where \(v \in V\) and t is a term built from variables in \(V\) and function symbols interpreted over \(\mathbb {Z}\). A clock update (clock reset) over \(X\) is an assignment \(u \in CU (X)\) of the form \({x}:={n}\) where \(x \in X\) and \(n \in \mathbb {Z}\). The set of variables appearing in a formula \(\varphi \) is denoted by \({\mathsf {vars}(\varphi )}\).

A valuation over a finite set of variables is a function that maps variables to their respective domains. A data valuation is a valuation over a set of data variables \(V\), that is, a function \({\nu : V \rightarrow \mathbb {Z}}\). Similarly, a clock valuation is a valuation over a set of clock variables \(X\), that is, a function \({\eta : X \rightarrow \mathbb {R}_{\ge 0}}\). We will denote by \({ Eval (Q)}\) the set of valuations over a set of variables Q.

Throughout the paper we will allow partial functions as valuations. We extend valuations to range over terms and formulas the usual way, with the possibility that the value of a term is undefined over a valuation. We will write \(\sigma \models \varphi \) iff formula \(\varphi \) is satisfied under valuation \(\sigma \). Note that in the context of partial valuations \(\sigma \models \lnot \varphi \) is a strictly stronger statement than \(\sigma \not \models \varphi \) (e.g. \(\top \not \models v \doteq 0\), but it is not the case that \(\top \models \lnot (v \doteq 0)\)).

We will denote by \({\mathsf {def}(\sigma )}\) the domain of definition of a valuation, that is, \({{\mathsf {def}(\sigma )} = \left\{ q \mid \sigma (q) \ne \bot \right\} }\), and by \(\mathsf {form}(\sigma )\) the formula characterizing the valuation, that is, \({\mathsf {form}(\sigma ) = \bigwedge _{q \in {\mathsf {def}(\sigma )}} q \doteq \sigma (q)}\). Valuation \(\top \) is the unique valuation such that \({{\mathsf {def}(\top )} = \emptyset }\). We write \(\sigma \sqsubseteq \sigma '\) iff \(\sigma (q) = \sigma '(q)\) for all \(q \in {\mathsf {def}(\sigma ')}\). Note that \(\sqsubseteq \) is a partial order, as expected. Moreover, if \(\sigma \sqsubseteq \sigma '\) and \(\sigma ' \models \varphi \) then \(\sigma \models \varphi \), and \(\sigma \sqsubseteq \sigma '\) iff \(\sigma \models \mathsf {form}(\sigma ')\).
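To make these notions concrete, the following minimal Python sketch (our own illustration; the names definition, form and refines are not taken from the paper) represents partial valuations as dictionaries mapping variable names to integers:

```python
# Minimal sketch of partial valuations as Python dicts (illustrative only).

def definition(sigma):
    """def(sigma): the set of variables on which sigma is defined."""
    return set(sigma)

def form(sigma):
    """form(sigma): the characterizing formula, read as a set of equalities q = c."""
    return set(sigma.items())

def refines(sigma, sigma_prime):
    """sigma ⊑ sigma': sigma agrees with sigma' on every variable sigma' defines."""
    return all(sigma.get(q) == c for q, c in sigma_prime.items())

# The more defined valuation is the smaller one with respect to ⊑:
assert refines({'a': 1, 'b': 2}, {'a': 1})        # {a=1, b=2} ⊑ {a=1}
assert not refines({'a': 1}, {'a': 1, 'b': 2})    # but not the other way around
assert refines({'a': 1}, {})                      # every valuation refines ⊤
```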

We will denote by \(\otimes \) the partial function over valuations that is defined as

$$(\sigma \otimes \sigma ')(q) = {\left\{ \begin{array}{ll} \sigma (q) &{} \text {if } q \in {\mathsf {def}(\sigma )} \\ \sigma '(q) &{} \text {if } q \in {\mathsf {def}(\sigma ')} \\ \bot &{} \text {otherwise} \end{array}\right. }$$

if \(\sigma (q) = \sigma '(q)\) for all \(q \in {\mathsf {def}(\sigma )} \cap {\mathsf {def}(\sigma ')}\), and is undefined otherwise.

Given a valuation \(\sigma \in { Eval (Q)}\) and an assignment \({q}:={t}\), we denote by \(\sigma {\{ {q}:={t} \}}\) the valuation \(\sigma ' \in { Eval (Q \cup \left\{ q \right\} )}\) such that \({\sigma '(q) = \sigma (t)}\) and \({\sigma '(q') = \sigma (q')}\) for all \(q' \ne q\). For a sequence of updates \(\mu \) and a set of updates U we define

$$\sigma {\{ \mu \}}_{U} = {\left\{ \begin{array}{ll} \sigma &{} \text {if } \mu = \epsilon \\ \sigma {\{ u \}}{\{ \mu ' \}}_{U} &{} \text {if } \mu = u \cdot \mu ' \text { and } u \in U \\ \sigma {\{ \mu ' \}}_{U} &{} \text {if } \mu = u \cdot \mu ' \text { and } u \notin U \end{array}\right. }$$
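Continuing the dictionary-based sketch above (again our own illustration, with hypothetical helper names), \(\otimes \) and the filtered update application \(\sigma {\{ \mu \}}_{U}\) can be written as:

```python
# Sketch of the ⊗ operation and of filtered update application over dict-based
# partial valuations (illustrative only).

def combine(sigma, sigma_prime):
    """sigma ⊗ sigma': merge the two valuations; defined only if they agree on
    all shared variables, otherwise return None (undefined)."""
    if any(sigma[q] != sigma_prime[q] for q in sigma.keys() & sigma_prime.keys()):
        return None
    return {**sigma_prime, **sigma}

def apply_updates(sigma, mu, relevant):
    """sigma{mu}_U: apply the updates of the sequence mu in order, skipping
    those not contained in the set 'relevant' (the set U). An update is a pair
    (variable, term), where term evaluates over the current valuation."""
    result = dict(sigma)
    for update in mu:
        if update in relevant:
            var, term = update
            result[var] = term(result)
    return result

# Example: v := v + 1 applied twice, with only that (data) update considered relevant.
inc_v = ('v', lambda s: s['v'] + 1)
print(apply_updates({'v': 0}, [inc_v, inc_v], relevant={inc_v}))  # {'v': 2}
```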

2.1 Timed Automata

In the area of real-time verification, timed automata [1] is the most prominent formalism. To make the specification of practical systems more convenient, the traditional formalism is often extended with various syntactic and semantic constructs, in particular with the handling of discrete variables. In the following, we describe such an extension.

Definition 1

(Syntax). Syntactically, a timed automaton with discrete variables is a tuple \(\mathcal {A}= {(L, V, X, T, \ell _0)}\) where

  • \(L\) is a finite set of locations,

  • \(V\) is a finite set of data variables of integer type,

  • \(X\) is a finite set of clock variables,

  • \({T\, \subseteq L\times {\mathcal {P}(C)} \times {U}^{*} \times L}\) is a finite set of transitions with sets C and U defined as \({C= DC (V) \cup CC (X)}\) and \({U= DU (V) \cup CU (X)}\), where for a transition \((\ell , G, \mu , \ell ')\), the set \(G \subseteq C\) is a set of guards and \(\mu \in {U}^{*}\) is a sequence of updates,

  • \(\ell _0\in L\) is the initial location.

Throughout the paper, we will refer to a timed automaton with discrete variables simply as a timed automaton.

A state of \(\mathcal {A}\) is a triple \({\left( \ell , \nu , \eta \right) }\) where \(\ell \in L\), \(\nu \in { Eval (V)}\) and \(\eta \in { Eval (X)}\). We will denote by \(\nu _0\) the unique total function \({\nu _0 : V \rightarrow \left\{ 0 \right\} }\) and by \(\eta _0\) the unique total function \({\eta _0 : X \rightarrow \left\{ 0 \right\} }\).

Definition 2

(Semantics). The operational semantics of a timed automaton is given by a labeled transition system with initial state \({\left( \ell _0, \nu _0, \eta _0 \right) }\) and two kinds of transitions:

  • Delay: \({{\left( \ell , \nu , \eta \right) } \xrightarrow {\delta } {\left( \ell , \nu , \eta ' \right) }}\) for some real number \(\delta \ge 0\) where \(\eta ' = \eta + \delta \) with \({(\eta + \delta )(x) = \eta (x) + \delta }\) for all \(x \in X\);

  • Action: \({{\left( \ell , \nu , \eta \right) } \xrightarrow {t} {\left( \ell ', \nu ', \eta ' \right) }}\) for some transition \(t = (\ell , G, \mu , \ell ')\) where we have \({\nu ' = {{\mathsf {dpost}_{t}}(\nu )}}\) and \({\eta ' = {{\mathsf {cpost}_{t}}(\eta )}}\) with partial functions

    $${{\mathsf {dpost}_{t}}(\nu )} = {\left\{ \begin{array}{ll} \bot &{} \text {if } \nu \models \lnot g \text { for some } g \in G \cap DC (V) \\ \nu {\{ \mu \}}_{ DU (V)} &{} \text {otherwise} \end{array}\right. }$$
    $${{\mathsf {cpost}_{t}}(\eta )} = {\left\{ \begin{array}{ll} \bot &{} \text {if } \eta \models \lnot g \text { for some } g \in G \cap CC (X) \\ \eta {\{ \mu \}}_{ CU (X)} &{} \text {otherwise} \end{array}\right. }$$

Here, \({{\mathsf {dpost}_{t}}(\nu )}\) denotes the strongest (discrete) postcondition of \(\nu \) with respect to transition t. Note that for any \(t \in T\), function \({\mathsf {dpost}_{t}}\) is monotonic with respect to \(\sqsubseteq \), as expected. Moreover, we define the weakest (discrete) precondition \(\mathsf {wp}_{t}(\varphi )\) as the formula such that \(\nu \models \mathsf {wp}_{t}(\varphi )\) iff \({{\mathsf {dpost}_{t}}(\nu )} \models \varphi \) for all \(\nu \) and \(\varphi \), with respect to t.
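As a small worked instance (our own illustration, not taken from the paper): for a transition t whose only data update is \({v}:={v+1}\) and whose guards do not constrain the data variables, the weakest precondition of a formula is obtained by substituting the update into it, e.g.

$$\mathsf {wp}_{t}(v \doteq 2) \equiv (v + 1 \doteq 2) \equiv (v \doteq 1),$$

so a data valuation \(\nu \) satisfies \(\mathsf {wp}_{t}(v \doteq 2)\) exactly when \({{\mathsf {dpost}_{t}}(\nu )}\) maps v to 2.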

A run of a timed automaton is a sequence of states from the initial state along the transition relation

$$ {{\left( \ell _0, \nu _0, \eta _0 \right) } \xrightarrow {\alpha _1} {{\left( \ell _1, \nu _1, \eta _1 \right) } \xrightarrow {\alpha _2} {\ldots \xrightarrow {\alpha _n} {\left( \ell _n, \nu _n, \eta _n \right) }}}} $$

where \(\alpha _i \in T\cup \mathbb {R}_{\ge 0}\) for all \(1 \le i \le n\). A location \(\ell \in L\) is reachable iff there exists a run such that \(\ell _n = \ell \).

2.2 Symbolic Semantics

As the concrete semantics of a timed automaton is infinite due to real-valued clock variables, model checkers are often based on a symbolic semantics defined in terms of zones. A zone is the solution set of a clock constraint \(\varphi \in CC (X)\). For sets of clock valuations Z and \(Z'\), we will write \(Z \sqsubseteq Z'\) iff \(Z \subseteq Z'\). Moreover, if Z is a zone and \(t \in T\), then

  • \(\bot = \emptyset \),

  • \(Z_0 = \left\{ \eta \mid \eta = \eta _0 + \delta \text { for some } \delta \ge 0 \right\} \) and

  • \({{\mathsf {zpost}_{t}}(Z)} = \left\{ \eta ' \mid {{\left( \cdot , \cdot , \eta \right) } \xrightarrow {t} {s \xrightarrow {\delta } {\left( \cdot , \cdot , \eta ' \right) }}} \text { for some } \eta \in Z \text { and } \delta \ge 0 \right\} \)

are also zones. Here, \({{\mathsf {zpost}_{t}}(Z)}\) represents the strongest postcondition of Z with respect to a transition t of a timed automaton. As defined above, function \({\mathsf {zpost}_{t}}\) is monotonic with respect to \(\sqsubseteq \) for any \(t \in T\).

Definition 3

(Symbolic semantics). The symbolic semantics of a timed automaton is given by a labeled transition system with states of the form \({\left( \ell , \nu , Z \right) }\), with initial state \({\left( \ell _0, \nu _0, Z_0 \right) }\), and, for \(t = (\ell , \cdot , \cdot , \ell ')\), with transitions of the form \({{\left( \ell , \nu , Z \right) } \xrightarrow {t} {\left( \ell ', {{\mathsf {dpost}_{t}}(\nu )}, {{\mathsf {zpost}_{t}}(Z)} \right) }}\) whenever \({{\mathsf {dpost}_{t}}(\nu )} \ne \bot \).

We will say that a transition t is enabled from a symbolic state \({\left( \ell , \nu , Z \right) }\) iff \({{\left( \ell , \nu , Z \right) } \xrightarrow {t} {\left( \ell ', \nu ', Z' \right) }}\) for some \(\ell '\), \(\nu '\) and \(Z' \ne \bot \), otherwise it is disabled. Note that a transition \(t = (\ell , \cdot , \cdot , \cdot )\) is disabled from a symbolic state \({\left( \ell , \nu , Z \right) }\) iff \({{\mathsf {dpost}_{t}}(\nu )} = \bot \) or \({{\mathsf {zpost}_{t}}(Z)} = \bot \).

Definition 4

(Symbolic run). A symbolic run of a timed automaton is a sequence \({{\left( \ell _0, \nu _0, Z_0 \right) } \xrightarrow {t_1} {{\left( \ell _1, \nu _1, Z_1 \right) } \xrightarrow {t_2} {\ldots \xrightarrow {t_n} {\left( \ell _n, \nu _n, Z_n \right) }}}}\) along the transitions of the symbolic semantics, where \({Z_n \ne \bot }\).

Proposition 1

For a timed automaton, a location \(\ell \in L\) is reachable iff there exists a symbolic run with \(\ell _n = \ell \).

3 Algorithm for Lazy Reachability Checking

In this section, we present our algorithm for lazy reachability checking of timed automata with discrete variables. During the description, we will focus on the handling of discrete variables, but formulate the algorithm so that it is straightforward to combine the method with a corresponding (eager or lazy) method for the handling of clock variables.

3.1 Adaptive Simulation Graph

The central structure of the algorithm is an adaptive simulation graph (ASG). The presented formulation generalizes the definition of [19] to the handling of discrete variables and to the possibility of using various methods for the handling of clock variables.

Definition 5

(Unwinding). An unwinding of a timed automaton \(\mathcal {A}= {(L, V, X, T, \ell _0)}\) is a tuple \(U = (N, E, n_0, M_n, M_e, \triangleright )\) where

  • \((N, E)\) is a directed tree rooted at node \(n_0 \in N\),

  • \({M_n : N \rightarrow L}\) is the node labeling,

  • \({M_e : E \rightarrow T}\) is the edge labeling and

  • \(\triangleright {} \subseteq N \times N\) is the covering relation.

For an unwinding we require that the following properties hold:

  • \(M_n(n_0) = \ell _0\),

  • for each edge \((n, n') \in E\) the transition \(M_e(n, n') = (\ell , \cdot , \cdot , \ell ')\) is such that \(M_n(n) = \ell \) and \(M_n(n') = \ell '\),

  • for all nodes n and \(n'\) such that \({n\triangleright {}n'}\) it holds that \(M_n(n) = M_n(n')\).

The purpose of the covering relation \({\triangleright {}}\) is to mark that a node of the search tree has been pruned due to another node that admits all runs that are possible from the covered node. We define the following shorthand notations for convenience: \(\ell _n = M_n(n)\) and \(t_{n, n'} = M_e(n, n')\).

Definition 6

(Adaptive simulation graph). An adaptive simulation graph (ASG) for a timed automaton \(\mathcal {A}\) is a tuple \(\mathcal {G}= (U, \psi _\nu , \psi _{\hat{\nu }}, \psi _Z, \psi _{\hat{Z}})\) where

  • U is an unwinding of \(\mathcal {A}\),

  • \({\psi _\nu , \psi _{\hat{\nu }} : N \rightarrow { Eval (V)}}\) are labelings of nodes by data valuations and

  • \({\psi _Z, \psi _{\hat{Z}} : N \rightarrow {\mathcal {P}({ Eval (X)})}}\) are labelings of nodes by sets of clock valuations.

We will use the following shorthand notations: \(\nu _n = \psi _\nu (n)\), \(\hat{\nu }_n = \psi _{\hat{\nu }}(n)\), \(Z_n = \psi _Z(n)\) and \(\hat{Z}_n = \psi _{\hat{Z}}(n)\).

A node n is expanded iff for all transitions \(t \in T\) such that \(t = (\ell , \cdot , \cdot , \cdot )\) and \(\ell _n = \ell \), either t is disabled from \((\ell _n, \nu _n, Z_n)\), or n has a successor for t. A node n is covered iff \({n\triangleright {}n'}\) for some node \(n'\). It is excluded iff it is covered or it has an excluded parent. A node is complete iff it is either expanded or excluded. A node n is \(\ell \)-safe iff \(\ell _n \ne \ell \).
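To illustrate, these node statuses can be read as simple predicates over the search tree. The following Python sketch (with hypothetical field and helper names, not taken from the paper) mirrors the definitions above:

```python
# Sketch of the node-status predicates over a simple unwinding node structure
# (field and helper names are ours, for illustration only).

class Node:
    def __init__(self, location, parent=None):
        self.location = location      # M_n(n)
        self.parent = parent          # parent in the unwinding, or None for n_0
        self.successors = {}          # transition -> child node
        self.covering_node = None     # n' with n ▷ n', or None if not covered

def is_covered(n):
    return n.covering_node is not None

def is_excluded(n):
    # Covered, or has an excluded parent (checked recursively up the tree).
    return is_covered(n) or (n.parent is not None and is_excluded(n.parent))

def is_expanded(n, outgoing, is_disabled):
    # Every outgoing transition of the node's location is either disabled from
    # the node's concrete state or already has a successor in the tree.
    return all(is_disabled(n, t) or t in n.successors for t in outgoing(n.location))

def is_complete(n, outgoing, is_disabled):
    return is_expanded(n, outgoing, is_disabled) or is_excluded(n)

def is_safe(n, location):
    # n is ℓ-safe iff its location differs from ℓ.
    return n.location != location
```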

For an ASG to be useful for reachability checking, we have to introduce restrictions on the labeling. Therefore while building the ASG we will ensure that \((\ell _n, \nu _n, Z_n)\) represents an exact set of reachable states for n (thus with \(Z_n\) being a zone), and that \(\nu _n \sqsubseteq \hat{\nu }_n\) and \(Z_n \sqsubseteq \hat{Z}_n\). We formalize this notion in the next definition.

Definition 7

(Well-labeled node). A node n of an ASG \(\mathcal {G}\) for a timed automaton \(\mathcal {A}\) is well-labeled iff the following conditions hold:

  • (initiation) if \(n = n_0\), then

    1. (a)

      \(\nu _n = \nu _0\) and \(Z_n = Z_0\)

    2. (b)

      \(\nu _0 \sqsubseteq \hat{\nu }_n\) and \(Z_0 \sqsubseteq \hat{Z}_n\)

  • (consecution) if \(n \ne n_0\), then for its parent m and the transition \(t = t_{m,n}\)

    1. (a)

      \(\nu _n = {{\mathsf {dpost}_{t}}(\nu _m)}\) and \(Z_n = {{\mathsf {zpost}_{t}}(Z_m)}\)

    2. (b)

      \({{\mathsf {dpost}_{t}}(\hat{\nu }_m)} \sqsubseteq \hat{\nu }_n\) and \({{\mathsf {zpost}_{t}}(\hat{Z}_m)} \sqsubseteq \hat{Z}_n\)

  • (coverage) if \({n\triangleright {}n'}\) for some node \(n'\), then \(\hat{\nu }_n \sqsubseteq \hat{\nu }_{n'}\) and \(\hat{Z}_n \sqsubseteq \hat{Z}_{n'}\) and \(n'\) is not excluded

  • (simulation) if n is expanded, then any transition disabled from \((\ell _n, \nu _n, Z_n)\) is also disabled from \((\ell _n, \hat{\nu }_n, \hat{Z}_n)\).

The above definitions for nodes can be extended to ASGs. An ASG is complete, \(\ell \)-safe or well-labeled iff all its nodes are complete, \(\ell \)-safe or well-labeled, respectively. The main challenge for the construction of a well-labeled ASG as defined above is how the labelings \(\psi _{\hat{\nu }}\) and \(\psi _{\hat{Z}}\) are computed. A well-labeled ASG preserves reachability information, which is expressed by the following proposition.

Proposition 2

Let \(\mathcal {G}\) be a complete, well-labeled ASG for a timed automaton \(\mathcal {A}\). Then \(\mathcal {A}\) has a symbolic run \({{\left( \ell _0, \nu _0, Z_0 \right) } \xrightarrow {t_1} {\ldots \xrightarrow {t_k} {\left( \ell _k, \nu _k, Z_k \right) }}}\) iff \(\mathcal {G}\) has a non-excluded node n such that \(\ell _k = \ell _n\).

Proof

The right-to-left direction is a consequence of the subsequent Lemma 1, and the converse follows from Lemma 2.

   \(\square \)

Lemma 1

Let \(\mathcal {G}\) be a well-labeled ASG for a timed automaton \(\mathcal {A}\). If \(\mathcal {G}\) has a node n, then \(\mathcal {A}\) has a symbolic run \({{\left( \ell _0, \nu _0, Z_0 \right) } \xrightarrow {t_1} {\ldots \xrightarrow {t_k} {\left( \ell _k, \nu _k, Z_k \right) }}}\) such that \(\ell _k = \ell _n\).

Proof

The statement is a direct consequence of conditions initiation(a) and consecution(a).

   \(\square \)

Lemma 2

Let \(\mathcal {G}\) be a complete, well-labeled ASG for a timed automaton \(\mathcal {A}\). If \(\mathcal {A}\) has a symbolic run \({{\left( \ell _0, \nu _0, Z_0 \right) } \xrightarrow {t_1} {\ldots \xrightarrow {t_k} {\left( \ell _k, \nu _k, Z_k \right) }}}\), then \(\mathcal {G}\) has a non-excluded node n such that \(\ell _k = \ell _n\) and \(\nu _k \sqsubseteq \hat{\nu }_n\) and \(Z_k \sqsubseteq \hat{Z}_n\).

Proof

We prove the statement by induction on the length k of the symbolic run. If \(k = 0\), then \(\ell _k = \ell _0\) and \(\nu _k = \nu _0\) and \(Z_k = Z_0\), thus \(n_0\) is a suitable witness by condition initiation(b). Suppose the statement holds for runs of length at most \(k-1\). Hence there exists a non-excluded node m such that \(\ell _{k-1} = \ell _{m}\) and \(\nu _{k-1} \sqsubseteq \hat{\nu }_m\) and \({Z_{k-1} \sqsubseteq \hat{Z}_m}\).

The transition \(t_k\) is not disabled from \((\ell _m, \hat{\nu }_m, \hat{Z}_m)\): as \(\nu _{k-1} \sqsubseteq \hat{\nu }_m\) and \(Z_{k-1} \sqsubseteq \hat{Z}_m\), it would otherwise also be disabled from \((\ell _{k-1}, \nu _{k-1}, Z_{k-1})\), which contradicts our assumption that the run takes \(t_k\) from that state. As m is complete and not excluded, it is expanded; hence, by condition simulation, \(t_k\) is not disabled from \((\ell _m, \nu _m, Z_m)\) either, and thus m has a successor n for transition \(t_k\) with \(\ell _n = \ell _k\). By condition consecution(b), we have \({{\mathsf {dpost}_{t_k}}(\hat{\nu }_m)} \sqsubseteq \hat{\nu }_n\). As \(\nu _{k-1} \sqsubseteq \hat{\nu }_m\) and \({\mathsf {dpost}_{t_k}}\) is monotonic w.r.t. \(\sqsubseteq {}\), we have \(\nu _k \sqsubseteq \hat{\nu }_n\). We can obtain \(Z_k \sqsubseteq \hat{Z}_n\) symmetrically.

Thus if n is not covered, then it is a suitable witness for the statement. Otherwise there exists a node \(n'\) such that \({n\triangleright {}n'}\). By condition coverage, we know that \(\hat{\nu }_n \sqsubseteq \hat{\nu }_{n'}\) and \(\hat{Z}_n \sqsubseteq \hat{Z}_{n'}\) and \(n'\) is not excluded, thus \(n'\) is a suitable witness.

   \(\square \)

3.2 Reachability Algorithm

The pseudocode of the algorithm is shown in Algorithm 1. The algorithm takes as input a timed automaton \(\mathcal {A}\) and a distinguished error location \(\ell _e \in L\). The goal of the algorithm is to decide whether \(\ell _e\) is reachable for \(\mathcal {A}\). To this end the algorithm gradually builds an ASG for \(\mathcal {A}\) and continually maintains its well-labeledness. Upon termination, it either witnesses reachability of \(\ell _e\) by a node n such that \(\ell _n = \ell _e\), which by Lemma 1 corresponds to a symbolic run of \(\mathcal {A}\) to \(\ell _e\), or produces a complete, well-labeled, \(\ell _e\)-safe ASG that proves unreachability of \(\ell _e\) by Lemma 2.

Algorithm 1. Procedure explore

The main data structures of the algorithm are the ASG \(\mathcal {G}\) and the sets \( passed \) and \( waiting \). The set \( passed \) is used to store nodes that are expanded, and \( waiting \) stores nodes that are incomplete. The algorithm consists of subprocedures close, expand and refine, and of procedures zcover and zblock. Procedures zcover and zblock perform abstraction refinement over clock variables. These procedures can be soundly implemented in various ways [3, 10,11,12, 19, 20], and we assume such an implementation. Procedure close attempts to cover a node by some other node. Procedure expand expands a node by creating its successors for all non-blocked transitions of the given location. Procedure refine (see Sect. 3.3) can be used to ensure for a node n and some formula \(\varphi \) that if \(\nu _n \models \varphi \) then \(\hat{\nu }_n \models \varphi \) as well. Both close and expand maintain well-labeledness by calls to refine. In particular, close calls refine in order to enforce condition coverage, and expand calls refine to establish condition simulation.

The algorithm consists of a single loop (line 8) that employs the following strategy. The loop consumes nodes from \( waiting \) one by one. If \( waiting \) becomes empty, then \(\mathcal {A}\) is deemed safe. Otherwise, a node n is removed from \( waiting \). If the node is labeled with the error location, then \(\mathcal {A}\) is deemed unsafe. Otherwise, in order to avoid unnecessary expansion of the node, the algorithm first tries to cover it by a call to close. If there is no suitable candidate for coverage, then the algorithm establishes completeness of the node by expanding it using expand, which puts it in \( passed \) and puts all its successors in \( waiting \).
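The following Python fragment is a condensed sketch of this loop (it is not the listing of Algorithm 1; the callbacks dpost, zpost, wp_false, refine, zblock and try_cover, as well as the node representation, are assumptions of the sketch, and the maintenance of the ASG labelings is elided):

```python
# Condensed sketch of the explore loop described above (illustrative only).

def explore(initial_location, outgoing, error_location,
            dpost, zpost, wp_false, refine, zblock, try_cover):
    """outgoing(loc) yields pairs (t, target_loc); dpost/zpost return the
    successor data valuation / zone, or None if the transition is disabled;
    wp_false(t) is wp_t(false); refine strengthens the data abstraction of a
    node; zblock refines the clock abstraction; try_cover attempts to install
    a covering edge to some node in passed."""
    root = {'loc': initial_location, 'parent': None}
    passed, waiting = [], [root]

    while waiting:
        n = waiting.pop()
        if n['loc'] == error_location:
            return 'UNSAFE'                    # by Lemma 1, a symbolic run to the error exists
        if try_cover(n, passed):               # close: node covered, no expansion needed
            continue
        for t, target in outgoing(n['loc']):   # expand
            nu, Z = dpost(t, n), zpost(t, n)
            if nu is None:
                refine(n, wp_false(t))         # data-disabled: enforce condition 'simulation'
            elif Z is None:
                zblock(n, t)                   # clock-disabled: refine the clock abstraction
            else:
                waiting.append({'loc': target, 'parent': n, 'nu': nu, 'Z': Z})
        passed.append(n)
    return 'SAFE'                              # complete, well-labeled, error-safe ASG
```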

We show that explore is correct with respect to the annotations (procedure contracts) in Algorithm 1. As termination of the algorithm is trivial given a suitable refinement method for clock variables, we focus on partial correctness.

Proposition 3

Procedure explore is partially correct: if \({\textsc {explore}}(\mathcal {A}, \ell _e)\) terminates, then the result is \(\textsc {safe}\) iff \(\ell _e\) is unreachable for \(\mathcal {A}\).

Proof

(sketch). Let \( covered = \left\{ n \in N \mid n \text { is covered} \right\} \). It is easy to verify that the algorithm maintains the following invariants:

  • \(N = passed \cup waiting \cup covered \),

  • \( passed \) is a set of non-excluded, expanded, \(\ell _e\)-safe nodes,

  • \( waiting \) is a set of non-excluded, non-expanded nodes,

  • \( covered \) is a set of covered, non-expanded, \(\ell _e\)-safe nodes.

It is easy to see that under the above assumptions the sets \( passed \), \( waiting \) and \( covered \) form a partition of N. Assuming that \(\mathcal {G}\) is well-labeled, partial correctness of the algorithm is then a direct consequence. If the algorithm returns unsafe at line 11, a node has been encountered that is not \(\ell _e\)-safe, thus by Lemma 1 there is a symbolic run of \(\mathcal {A}\) to \(\ell _e\). Conversely, if the algorithm returns safe at line 16, the set \( waiting \) is empty, so \(\mathcal {G}\) is complete and \(\ell _e\)-safe, and as a consequence of Lemma 2 the location \(\ell _e\) is indeed unreachable for \(\mathcal {A}\).

What remains to show is that the algorithm maintains well-labeledness. We assume that procedures zcover and zblock and procedure refine maintain well-labeledness (this latter statement we prove to hold in Sect. 3.3). Initially node \(n_0\) is well-labeled as it satisfies initiation. Procedure close trivially maintains well-labeledness, as it just possibly adds a covering edge for two nodes such that condition coverage is not violated. For procedure expand, if a given transition t is enabled, then a node is created that satisfies consecution. Otherwise the corresponding refinement procedure is called, ensuring that simulation holds for the given transition. In particular, if t is blocked due to \({{{\mathsf {dpost}_{t}}(\nu _n)} = \bot }\), we have \(\nu _n \models \mathsf {wp}_{t}(\bot )\), and thus can call refine to update \(\hat{\nu }_n\) so that \({\hat{\nu }_n \models \mathsf {wp}_{t}(\bot )}\), ensuring \({{\mathsf {dpost}_{t}}(\hat{\nu }_n)} \models \bot \) and effectively disabling t from \((\cdot , \hat{\nu }_n, \cdot )\).    \(\square \)

3.3 Abstraction Refinement

To maintain well-labeledness, the algorithm relies on procedure refine that performs abstraction refinement by safely adjusting abstract data valuations labeling nodes of the ASG. The pseudocode of the refinement algorithm is shown in Algorithm 2.

Algorithm 2. Procedure refine

Informally, refine works as follows. Given a node n and a formula \(\varphi \) such that \(\nu _n \models \varphi \) holds, a weakening \(\nu _I\) of \(\nu _n\) is computed such that \(\nu _I \models \varphi \) by calling procedure interpolate, which simply removes from the domain of definition those variables that are not necessary for satisfying the formula. Then all covering edges that would violate condition coverage after the strengthening are dropped. To maintain condition consecution(b), procedure refine is then recursively called for the predecessor m of n. The computed interpolant is then used to strengthen the current labeling by including the variables occurring in the interpolant in the current abstraction. We show that refine maintains well-labeledness and is correct with respect to the annotations in Algorithm 2.
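A possible realization of this procedure over dictionary-based valuations is sketched below (our own illustration; satisfies, wp and the node field names are assumptions, and the actual Algorithm 2 may differ in its details):

```python
# Sketch of interpolate and refine over dict-based partial data valuations
# (illustrative only; helper and field names are ours).

def form(sigma):
    """form(sigma), read here simply as the valuation itself (a conjunction of
    equalities)."""
    return dict(sigma)

def interpolate(nu_a, phi, satisfies):
    """Weaken nu_a by dropping variables that are not needed to satisfy phi;
    assumes nu_a itself satisfies phi."""
    nu_i = dict(nu_a)
    for var in list(nu_i):
        candidate = {q: c for q, c in nu_i.items() if q != var}
        if satisfies(candidate, phi):
            nu_i = candidate
    return nu_i

def refine(n, phi, satisfies, wp):
    """Ensure that the abstract valuation of node n satisfies phi, assuming the
    concrete valuation n['nu'] already satisfies it."""
    if satisfies(n['nu_hat'], phi):
        return                                   # abstraction is already precise enough
    nu_i = interpolate(n['nu'], phi, satisfies)
    # Drop covering edges m ▷ n that the strengthening would invalidate
    # (condition 'coverage' requires the abstraction of m to entail form(nu_i)).
    for m in list(n['covers']):
        if not satisfies(m['nu_hat'], form(nu_i)):
            n['covers'].remove(m)
            m['covering_node'] = None            # m has to be revisited
    # Propagate backwards to the parent to preserve condition 'consecution(b)'.
    if n['parent'] is not None:
        refine(n['parent'], wp(n['incoming'], form(nu_i)), satisfies, wp)
    # Strengthen the abstraction: make the interpolant's variables visible.
    n['nu_hat'] = {**n['nu_hat'], **nu_i}
```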

Proposition 4

Procedure refine is totally correct: if \(\nu _n \models \varphi \), then \({\textsc {refine}}(n, \varphi )\) terminates and ensures \(\hat{\nu }_n \models \varphi \). Moreover, it maintains well-labeledness.

Proof

Termination of the procedure is trivial, so we focus on partial correctness and the preservation of well-labeledness.

Function interpolate has no side effects, thus it trivially maintains well-labeledness. Moreover, it is easy to see that it satisfies its contract, as it simply drops variables not necessary to ensure satisfaction of \(\varphi _B\) from the domain of definition of \(\nu _A\).

In procedure refine, if \(\hat{\nu }_n \models \varphi \) then no refinement is needed, and the contract is trivially satisfied. Otherwise, the interpolant \(\nu _I\) is computed by function interpolate. As \(\nu _n \sqsubseteq \hat{\nu }_n\) by well-labeledness and \(\nu _n \sqsubseteq \nu _I\) by the precondition, we know that \(\hat{\nu }_n \otimes \nu _I\), and thus the new value of \(\hat{\nu }_n\), is defined. As \(\hat{\nu }_n \otimes \nu _I \sqsubseteq \nu _I\) and \(\nu _I \models \varphi \), we have \(\hat{\nu }_n \otimes \nu _I \models \varphi \), which ensures the postcondition.

Next we show that well-labeledness is maintained. Condition simulation is trivially ensured, as if \(\hat{\nu }_n \models \lnot g\) for some guard g, then \(\hat{\nu }_n \otimes \nu _I \models \lnot g\) as well. After the loop we have \(\hat{\nu }_m \sqsubseteq \nu _I\) for all m such that \({m\triangleright {}n}\). Moreover, \(\hat{\nu }_m \sqsubseteq \hat{\nu }_n\) by well-labeledness. Thus \(\hat{\nu }_m \sqsubseteq \hat{\nu }_n \otimes \nu _I\), which ensures condition coverage. If n has no parent then condition initiation(b) is trivially maintained. Otherwise we have \(\nu _n \sqsubseteq \nu _I\), thus \({{\mathsf {dpost}_{t}}(\nu _m)} \models \mathsf {form}(\nu _I)\), from which \(\nu _m \models \mathsf {wp}_{t}(\mathsf {form}(\nu _I))\) follows. Hence refine can be called to ensure \(\hat{\nu }_m \models \mathsf {wp}_{t}(\mathsf {form}(\nu _I))\), and thus \({{\mathsf {dpost}_{t}}(\hat{\nu }_m)} \sqsubseteq \nu _I\). Moreover, \({{\mathsf {dpost}_{t}}(\hat{\nu }_m)} \sqsubseteq \hat{\nu }_n\) by well-labeledness. It follows that \({{\mathsf {dpost}_{t}}(\hat{\nu }_m)} \sqsubseteq \hat{\nu }_n \otimes \nu _I\), which ensures condition consecution(b).    \(\square \)

3.4 Example

In this subsection, we give an example that demonstrates how the algorithm described above lazily controls the visibility of discrete variables of the system during construction of the abstraction.

Figure 1 shows automaton \(\mathcal {A}_k\), a modified version of the examples given in [10, 16] where clock variables are replaced by discrete variables and a component is added that nondeterministically increments all variables. The resulting automaton is the parallel composition of four components, and has 2k discrete variables, namely \(a_1, a_2, \ldots , a_k\) and \(b_1, b_2, \ldots , b_k\).

Fig. 1. Automaton \(\mathcal {A}_k\)

As an example, we are going to consider \(\mathcal {A}_1\), the simplest version of the automaton. For simplicity, we are going to omit the indexes in names whenever possible. Figure 2 shows part of the ASG produced by the algorithm. Here, normal edges represent edges of the unwinding (elements of the relation E), dashed edges represent covering edges (elements of the relation \(\triangleright {}\)), and dotted edges represent edges of the unwinding that lead to subtrees omitted from the figure. For each node n, the set of visible variables \({\mathsf {def}(\hat{\nu }_n)}\) is shown.

Fig. 2. ASG of \(\mathcal {A}_1\)

The algorithm starts by instantiating the root node \(n_0\) with \(\hat{\nu }_{n_0} = \top \). As transition \(t_c\) is disabled from \(\nu _{n_0}\) but not from \(\hat{\nu }_{n_0}\), the set of visible variables has to be refined in \(n_0\). Hence during refinement, a will be included in the set of visible variables, ensuring that \(t_c\) is disabled from \(\hat{\nu }_{n_0}\) as well. For the same reason, a will become visible when expanding \(n_1\) and \(n_2\). For any other node n however, \(t_c\) is either not an outgoing transition of location \(\ell _n\), or is enabled from \(\nu _n\), thus no refinement will be triggered during expansion, resulting in abstraction \(\hat{\nu }_n = \top \). This enables coverage between nodes that assign different concrete values to the variables. For example, covering edges \((n_5, n_4)\) and \((n_{10}, n_9)\) are only possible because b is not visible in either node (as their concrete valuations differ on b). More importantly, the algorithm is able to quickly cover nodes that result from the second firing of \(t_d\) along a path, thus the resulting ASG remains finite. Even if the number of times \(t_d\) can be taken is bounded by some number N, an algorithm that handles discrete variables explicitly would generate a significantly larger state space depending on N. Similarly, as k increases, the advantage of the abstraction-based method compared to the explicit handling of variables becomes increasingly notable.

4 Evaluation

We implemented a prototype version of our algorithm in the open source model checking framework Theta [18]. In order to enable abstraction refinement for clock variables, we implemented a variant of the lazy abstraction method of [10] based on \( LU \)-bounds, and the method described in [19] based on interpolation for zones (with refinement strategy seq). These strategies are then combined both with the explicit handling of discrete variables, resulting in algorithms similar to those of the original papers [10, 19], and with the abstraction and refinement method proposed in this paper. The algorithms are evaluated for both breadth-first and depth-first search orders. Combining the above alternatives yields 8 algorithm configurations:

  • explicit (E) or abstraction-based \((\textsf {A})\) handling of discrete variables,

  • lazy abstraction (L) or interpolation \((\textsf {I})\) for clock variables and

  • breadth-first \((\textsf {B})\) or depth-first \((\textsf {D})\) search order.

For the configurations that handle discrete variables explicitly, we partitioned the set of nodes based on the value of the data valuation, this way saving the \(\mathcal {O}(n)\) cost of checking inclusion for valuations. This optimization also significantly reduces the number of nodes for which coverage is checked and attempted during close. Apart from this and the difference in refinement strategies, the implementation of the configurations is shared.
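One possible realization of this partitioning (our sketch; not necessarily how the Theta implementation stores it) is to index the visited nodes by location and concrete data valuation, so that coverage is only attempted among nodes with identical valuations:

```python
# Sketch of partitioning visited nodes by their concrete data valuation, so that
# coverage candidates are found by lookup instead of scanning all visited nodes.
from collections import defaultdict

passed_index = defaultdict(list)

def valuation_key(node):
    # A hashable key built from the location and the concrete data valuation.
    return (node['loc'], frozenset(node['nu'].items()))

def register(node):
    passed_index[valuation_key(node)].append(node)

def coverage_candidates(node):
    return passed_index[valuation_key(node)]
```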

As inputs we considered 15 timed automata models in Uppaal 4.0 XTA format that contain integer variables. For each model, the number of discrete variables/number of clock variables is given in parentheses.

  • bocdp (26/3), bocdpf (26/3): models of the Bang & Olufsen Collision Detection Protocol obtained from the Uppaal benchmark set

  • brp (9/7): a model of the Bounded Retransmission Protocol

  • c1 (12/3), c2 (14/3), c3 (15/3), c4 (17/3): models of a real-time mutual exclusion protocol obtained from the Mcta benchmark set

  • m1 (11/4), m2 (13/4), m3 (13/4), m4 (15/4), n1 (11/7), n2 (13/7), n3 (13/7), n4 (15/7): industrial case studies obtained from the Mcta benchmark set

We performed our measurements on a machine running Windows 10 with a 2.6 GHz dual core CPU and 8 GB of RAM. We evaluated the algorithm configurations for both execution time (Table 1) and the number of nodes in the resulting ASG (Table 2). The timeout (denoted by “—” in the tables) was set to 120 s. In the tables, the best values among both the explicit and the abstraction-based configurations are emphasized in bold font for each model. The execution time is the average of 10 runs, obtained from 12 deterministic runs by removing the slowest and the fastest one.

Table 1. Execution time in seconds per model and configuration
Table 2. Number of nodes in the ASG per model and configuration

As can be seen in Table 1, in general, the performance of the fastest configurations of the two categories (explicit and abstraction-based) with respect to execution time is balanced (the difference is never more than 100%). For models c1–3, the explicit configuration was faster, but the absolute difference in execution time is not significant. For the other Mcta models, the fastest configurations perform similarly with respect to execution time. For model bocdpf the abstraction-based variant was almost twice as fast, whereas the opposite is true for models bocdp and brp. In total, the abstraction-based variant is faster than the corresponding configuration without abstraction in one fourth of the cases, and configuration AID is faster than a given configuration without abstraction in two thirds of the cases.

When comparing the methods based on the number of ASG nodes generated, the difference is more significant, as can be seen in Table 2. As expected, the abstraction-based method produces a smaller ASG than the corresponding configuration without abstraction in most (97%) of the cases, and the state space generated by configuration AID is smaller in all cases. On average, the reduction in size in favor of the abstraction-based handling of discrete variables is around 50%. In the worst case (model c1), the reduced state space is around 80% of the original, and in the best case (model bocdpf) it is 15%, i.e. the introduction of abstraction yields a significant gain.

To characterize the fastest configurations, Fig. 3 depicts the execution time (first column in blue) and number of nodes generated (second column in red) for the fastest configuration with abstraction relative to the performance of the fastest configuration without abstraction. Similarly, Fig. 4 depicts the relative performance when considering the configurations generating the least number of nodes. According to Fig. 3, if the configuration with abstraction performs well in execution time, then it also performs well in the number of nodes generated. Conversely, according to Fig. 4, if significant reduction is achieved in the size of the state space, then the algorithm with abstraction also tends to perform well in terms of execution time (except for model bocdp). Moreover, as can be seen on both charts, within a group of models (c, m and n), the relative performance of the abstraction method tends to increase with increasing model complexity.

Fig. 3. Relative execution time and number of nodes generated of the fastest configurations

Fig. 4. Relative execution time and number of nodes generated of the configurations with the smallest ASG

Moreover, for the models considered, configuration AID (Abstraction of discrete variables, Interpolation-based abstraction of clock variables, Depth-first search order) approximates the best configuration well for both execution time and ASG size, as this configuration tends to perform well on the more complex models. This is depicted in Figs. 5 and 6, where we compare configuration AID with the E-configurations in terms of execution time and size of the generated state space, respectively. In Fig. 5, we denote by BEST the virtual best configuration, calculated from the best results of all other configurations. This data is omitted in Fig. 6, as BEST greatly overlaps with configuration AID in terms of states generated. Moreover, to focus on the significant differences, we only depicted data for the hardest six models (denoted as \(10\ldots 15\) on the horizontal axis) for each configuration.

Fig. 5. Time to solve the hardest model instances (seconds)

Fig. 6. Number of nodes generated for the hardest model instances

5 Conclusions

In this paper we proposed a lazy algorithm for the location reachability problem of timed automata with discrete variables. The method is based on controlling the visibility of discrete variables by using interpolation for valuations of variables. We demonstrated with experiments that our abstraction and refinement strategy, combined with lazy methods for the abstraction of continuous clock variables, can achieve a significant reduction in the size of the state space generated during search, typically with low or no overhead in execution time, and in some cases even with an additional speedup.

Future Work. In the method described in this paper, refinement is triggered upon encountering a disabled transition. In the future, we intend to experiment with counterexample-guided refinement for the abstraction of both discrete and continuous variables. In addition, we plan to experiment with different abstract domains (e.g. intervals), and to investigate alternative refinement strategies for the discrete variables of timed systems. In particular, we are interested in how the forward interpolation technique described in [4] performs for timed automata. Moreover, we plan to explore more sophisticated strategies for finding covering states, as this can potentially yield considerable speedups for our method. Furthermore, although we evaluated our abstraction method in the context of timed systems, the technique itself can be applied in a more general context, and we plan to investigate its use for model checking imperative programs.