1 Introduction

Symbolic techniques can be used to represent possibly infinite sets of states by means of symbolic constraints. These techniques have been developed and adapted to many other verification methods such as SAT solving, Satisfiability Modulo Theories (SMT), rewriting, and model checking. A key open research issue of current symbolic techniques is extensibility. Techniques that combine different methods have been proposed, e.g., decision procedures [28, 29], unification algorithms [7, 11], theorem provers with decision procedures [1, 10, 32], and SMT solvers in model checkers [3, 18, 27, 36, 38]. However, there is still a lack of general extensibility techniques for symbolic analysis that simultaneously combine the power of SMT solving, rewriting- and narrowing-based analysis, and model checking.

This paper proposes a new symbolic technique that seamlessly combines rewriting modulo theories, SMT solving, and model checking. For brevity, this technique is called rewriting modulo SMT, although it could more precisely be called rewriting modulo SMT+B, where \(B\) is an equational theory having a matching algorithm. It complements another symbolic technique combining narrowing modulo theories and model checking, namely narrowing-based reachability analysis [8, 26]. Neither of these two techniques subsumes the other.

Rewriting modulo SMT can be applied to increase the power of equational reasoning, e.g., [16, 17, 21], but its full power, including its model checking capabilities, is better exploited when applied to concurrent open systems. Deterministic systems can be naturally specified by equational theories, but specification of concurrent, non-deterministic systems requires rewrite theories [24], i.e., triples \(\mathcal {R} = (\varSigma ,E,R)\) with \((\varSigma ,E)\) an equational theory describing system states as elements of the initial algebra \(\mathcal {T}_{\varSigma /E}\), and \(R\) rewrite rules describing the system’s local concurrent transitions. An open system is a concurrent system that interacts with an external, non-deterministic environment. When such a system is specified by a rewrite theory \(\mathcal {R}=(\varSigma ,E,R)\), it has two sources of non-determinism, one internal and the other external. Internal non-determinism comes from the fact that in a given system state different instances of rules in \(R\) may be enabled. The local transitions thus enabled may lead to completely different states. What is peculiar about an open system is that it also has external, and often infinitely-branching, non-determinism due to the environment. That is, the state of an open system must include the state changes due to the environment. Technically, this means that, while a system transition in a closed system can be described by a rewrite rule \(t {\rightarrow } t'\) with \({\textit{vars}}(t') \subseteq {\textit{vars}}(t)\), a transition in an open system is instead modeled by a rule of the form \(t(\overrightarrow{x}) \rightarrow t'(\overrightarrow{x},\overrightarrow{y})\), where \(\overrightarrow{y}\) are fresh new variables. 
Therefore, a substitution for the variables \(\overrightarrow{x} {\uplus } \overrightarrow{y}\) decomposes into two substitutions, one, say \(\theta \), for the variables \(\overrightarrow{x}\) under the control of the system and another, say \(\rho \), for the variables \(\overrightarrow{y}\) under the control of the environment. In rewriting modulo SMT, such open systems are described by conditional rewrite rules of the form \(t(\overrightarrow{x}) \rightarrow t'(\overrightarrow{x},\overrightarrow{y})\;\mathbf{if }\; {\phi }\), where \(\phi \) is a constraint solvable by an SMT solver. This constraint \(\phi \) may still allow the environment to choose an infinite number of substitutions \(\rho \) for \(\overrightarrow{y}\), but can exclude choices that the environment will never make.

The non-trivial challenges of modeling and analyzing open systems can now be better explained. They include: (1) the enormous and possibly infinitary non-determinism due to the environment, which typically renders finite-state model checking impossible or infeasible; (2) the impossibility of executing the rewrite theory \(\mathcal {R}=(\varSigma ,E,R)\) in the standard sense, due to the non-deterministic choice of \(\rho \); and (3) the challenge, undecidable in general, of checking the rule’s condition \(\phi \), since without knowing \(\rho \), the condition \(\phi \theta \) is non-ground, so that its \(E\)-satisfiability may be undecidable. As further explained in the paper, challenges (1)–(3) are all met successfully by rewriting modulo SMT because: (1) states are represented not as concrete states, i.e., ground terms, but as symbolic constrained terms \(\langle t; \varphi \rangle \) with \(t\) a term with variables ranging in the domains handled by the SMT solver and \(\varphi \) an SMT-solvable formula, so that the choice of \(\rho \) is avoided; (2) rewriting modulo SMT can symbolically rewrite such pairs \(\langle t; \varphi \rangle \) (describing possibly infinite sets of concrete states) to other pairs \(\langle t'; \varphi ' \rangle \); and (3) decidability of \(\phi \theta \) (more precisely of \(\varphi \wedge \phi \theta \)) can be settled by invoking an SMT solver.

Rewriting modulo SMT can be integrated with model checking by exploiting the fact that rewriting logic is reflective [15]. Hence, rewriting modulo SMT can be reduced to standard rewriting. In particular, all the techniques, algorithms, and tools available for model checking of closed systems specified as rewrite theories, such as Maude’s search-based reachability analysis [14], become directly available to perform symbolic reachability analysis on systems that are now infinite-state.

The technique proposed in this paper is illustrated with the formal analysis of the CASH scheduling protocol [13]. This protocol specifies a real-time system whose formal analysis is beyond the scope of timed automata [2].

2 Preliminaries

Notation on terms, term algebras, and equational theories is used as in [6, 19].

An order-sorted signature \(\varSigma \) is a tuple \(\varSigma =(S,\le ,F)\) with a finite poset of sorts \((S,\le )\) and set of function symbols \(F\). The binary relation \(\equiv _\le \) denotes the equivalence relation generated by \(\le \) on \(S\) and its point-wise extension to strings in \(S^*\). The function symbols in \(F\) can be subsort-overloaded and satisfy the condition that, for \(w,w' \in S^*\) and \(s,s'\in S\), if \(f : w \longrightarrow s\) and \(f : w' \longrightarrow s'\) are in \(F\), then \(w \equiv _\le w'\) implies \(s \equiv _\le s'\). A top sort in \(\varSigma \) is a sort \(s \in S\) such that if \(s' \in S\) and \(s \equiv _\le s'\), then \(s' \le s\). For any sort \(s\in S\), the expression \(\left[ s\right] \) denotes the connected component of \(s\), that is, \(\left[ s\right] = \left[ s\right] _{\equiv _\le }\).

The symbol \(X\) denotes an \(S\)-indexed family \(X = \{X_s\}_{s\in S}\) of disjoint variable sets with each \(X_s\) countably infinite. Expressions \(T_\varSigma (X)_s\) and \(T_{\varSigma ,s}\) denote, respectively, the set of terms of sort \(s\) and the set of ground terms of sort \(s\); accordingly, \(\mathcal {T}_\varSigma (X)\) and \(\mathcal {T}_\varSigma \) denote the corresponding order-sorted \(\varSigma \)-term algebras. All order-sorted signatures are assumed preregular [19], i.e., each \(\varSigma \)-term \(t\) has a least sort \({ ls }(t) \in S\) s.t. \(t \in T_\varSigma (X)_{{ ls }(t)}\). For \(S'\subseteq S\), a term is called \(S'\)-linear if no variable with sort in \(S'\) occurs in it twice. The set of variables of \(t\) is written \({\textit{vars}}(t)\).

A substitution is an \(S\)-indexed mapping \(\theta : X \longrightarrow T_\varSigma (X)\) that is different from the identity only for a finite subset of \(X\). The identity substitution is denoted by \({\textit{id}}\) and \(\theta |_Y\) denotes the restriction of \(\theta \) to a family of variables \(Y\subseteq X\). Expression \({ dom}(\theta )\) denotes the domain of \(\theta \), i.e., the subfamily of \(X\) for which \(\theta (x)\ne x\), and \({ ran}(\theta )\) denotes the family of variables introduced by \(\theta (x)\), for \(x \in { dom}(\theta )\). Substitutions extend homomorphically to terms in the natural way. A substitution \(\theta \) is called ground iff \({ ran}(\theta ) = \emptyset \). The application of a substitution \(\theta \) to a term \(t\) is denoted by \(t\theta \) and the composition of two substitutions \(\theta _1\) and \(\theta _2\) is denoted by \(\theta _1\theta _2\). A context \(C\) is a \(\lambda \)-term of the form \(C = \lambda x_1,\ldots ,x_n.c\) with \(c \in T_\varSigma (X)\) and \(\{x_1,\ldots ,x_n\}\subseteq {\textit{vars}}(c)\); it can be viewed as an \(n\)-ary function \(C(t_1,\ldots ,t_n) = c\theta \), where \(\theta (x_i) = t_i\) for \(1 \le i \le n\) and \(\theta (x) = x\) otherwise.
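The notions of substitution application, composition, and groundness can be made concrete with a minimal sketch. The encoding below is an assumption made only for illustration: variables are Python strings, applications are tuples `(symbol, arg1, ..., argn)`, sorts are ignored, and the composition \(\theta _1\theta _2\) is read in postfix style, i.e., \(t(\theta _1\theta _2) = (t\theta _1)\theta _2\). The names `apply`, `compose`, `variables`, and `is_ground` are hypothetical helpers, not part of any tool discussed here.

```python
# Terms: a variable is a str; an application is a tuple (symbol, *args).
# Constants are 1-tuples such as ("a",). Sorts are ignored in this sketch.

def apply(term, theta):
    """Return term·theta, extending theta homomorphically to terms."""
    if isinstance(term, str):                 # a variable
        return theta.get(term, term)          # identity outside dom(theta)
    return (term[0], *(apply(a, theta) for a in term[1:]))

def variables(term):
    """Return vars(term) as a set of variable names."""
    if isinstance(term, str):
        return {term}
    return set().union(*(variables(a) for a in term[1:]))

def compose(theta1, theta2):
    """Return theta1 theta2: apply theta1 first, then theta2 (assumed convention)."""
    out = {x: apply(t, theta2) for x, t in theta1.items()}
    for x, t in theta2.items():
        out.setdefault(x, t)
    return {x: t for x, t in out.items() if t != x}   # drop identity bindings

def is_ground(theta):
    """A substitution is ground iff it introduces no variables."""
    return all(not variables(t) for t in theta.values())
```

With this encoding, applying `{"x": ("g", "z")}` and then `{"z": ("a",), "y": ("b",)}` to `("f", "x", "y")` gives the same term as applying their composition once.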

A \(\varSigma \) -equation is an unoriented pair \(t =u\) with \(t \in T_\varSigma (X)_{s_t}\), \(u \in T_\varSigma (X)_{s_u}\), and \(s_t \equiv _\le s_u\). A conditional \(\varSigma \) -equation is a triple \(t =u \; {\mathbf {if}}\; \gamma \), with \(t = u\) a \(\varSigma \)-equation and \(\gamma \) a finite conjunction of \(\varSigma \)-equations; it is called unconditional if \(\gamma \) is the empty conjunction. An equational theory is a tuple \((\varSigma ,E)\), with \(\varSigma \) an order-sorted signature and \(E\) a finite collection of (possibly conditional) \(\varSigma \)-equations. It is assumed that \(T_{\varSigma ,s}\ne \emptyset \) for each \(s \in S\). An equational theory \(\mathcal {E}= (\varSigma ,E)\) induces the congruence relation \(=_\mathcal {E}\) on \(T_\varSigma (X)\) defined for \(t,u \in T_\varSigma (X)\) by \(t =_\mathcal {E}u\) iff \(\mathcal {E}\vdash t =u\) by the deduction rules for order-sorted equational logic in [25]. Similarly, \(=_\mathcal {E}^1\) denotes provable \(\mathcal {E}\)-equality in one step of deduction. The \(\mathcal {E}\) -subsumption ordering \(\ll _{\mathcal {E}}\) is the binary relation on \(T_\varSigma (X)\) defined for any \(t,u\in T_\varSigma (X)\) by \(t \ll _{\mathcal {E}} u\) iff there is a substitution \(\theta : X \longrightarrow T_\varSigma (X)\) such that \(t =_\mathcal {E}u\theta \). A set of equations \(E\) is called collapse-free for a subset of sorts \(S'\subseteq S\) iff for any \(t=u \in E\) and for any substitution \(\theta : X \longrightarrow T_\varSigma (X)\) neither \(t\theta \) nor \(u\theta \) map to a variable for some sort \(s \in S'\). 
The expressions \(\mathcal {T}_{\mathcal {E}}(X)\) and \(\mathcal {T}_\mathcal {E}\) (also written \(\mathcal {T}_{\varSigma /E}(X)\) and \(\mathcal {T}_{\varSigma /E}\)) denote the quotient algebras induced by \(=_\mathcal {E}\) on the term algebras \(\mathcal {T}_\varSigma (X)\) and \(\mathcal {T}_\varSigma \), respectively; \(\mathcal {T}_{\varSigma /E}\) is called the initial algebra of \((\varSigma ,E)\). A theory inclusion \((\varSigma ,E)\subseteq (\varSigma ',E')\), with \(\varSigma \subseteq \varSigma '\) and \(E\subseteq E'\), is called protecting iff the unique \(\varSigma \)-homomorphism \(\mathcal {T}_{\varSigma /E} \longrightarrow \mathcal {T}_{\varSigma '/E'}|_{\varSigma }\) to the \(\varSigma \) -reduct of the initial algebra \(\mathcal {T}_{\varSigma '/E'}\) is a \(\varSigma \)-isomorphism, written \(\mathcal {T}_{\varSigma /E} \simeq \mathcal {T}_{\varSigma '/E'}|_{\varSigma }\). A set of equations \(E\) is called regular iff \({\textit{vars}}(t) = {\textit{vars}}(u)\) for any equation \((t =u \; {\mathbf {if}}\; \gamma ) \in E\).

Appropriate requirements are needed to make an equational theory \(\mathcal {E}\) admissible, i.e., executable in rewriting languages such as Maude [14]. In this paper, it is assumed that the equations of \(\mathcal {E}\) can be decomposed into a disjoint union \(E \uplus B\), with \(B\) a collection of structural axioms (such as associativity, and/or commutativity, and/or identity) for which there exists a matching algorithm modulo \(B\) producing a finite number of \(B\)-matching solutions, or failing otherwise. Furthermore, it is assumed that the equations \(E\) can be oriented into a set of (possibly conditional) sort-decreasing, operationally terminating, and confluent conditional rewrite rules \(\overrightarrow{E}\) modulo \(B\). The conditional rewrite system \(\overrightarrow{E}\) is sort decreasing modulo \(B\) iff for each \((t \rightarrow u \; {\mathbf {if}}\; \gamma ) \in \overrightarrow{E}\) and substitution \(\theta \), \({ ls }(t\theta ) \ge { ls }(u\theta )\) if \((\varSigma ,B,\overrightarrow{E}) \vdash \gamma \theta \). The system \(\overrightarrow{E}\) is operationally terminating modulo \(B\) iff there is no infinite well-formed proof tree in \((\varSigma ,B,\overrightarrow{E})\). Furthermore, \(\overrightarrow{E}\) is confluent modulo \(B\) iff for all \(t,t_1,t_2 \in T_\varSigma (X)\), if \(t \rightarrow ^*_{E/B} t_1\) and \(t \rightarrow ^*_{E/B} t_2\), then there is \(u \in T_\varSigma (X)\) such that \(t_1 \rightarrow ^*_{E/B} u\) and \(t_2 \rightarrow ^*_{E/B} u\). The term \({t}\!\downarrow _{E/B} \in T_\varSigma (X)\) denotes the \(E\) -canonical form of \(t\) modulo \(B\) so that \(t \rightarrow _{E/B}^* {t}\!\downarrow _{E/B}\) and \({t}\!\downarrow _{E / B}\) cannot be further reduced by \(\rightarrow _{E/B}\). Under the above assumptions \({t}\!\downarrow _{E/B}\) is unique up to \(B\)-equality.
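As an illustration of canonical forms, the following sketch normalizes terms with the usual oriented equations for Peano addition, \(0 + N \rightarrow N\) and \(s(M) + N \rightarrow s(M + N)\). It rests on simplifying assumptions that are not made in the text: the axioms \(B\) are empty (so matching is purely syntactic), the rules are unconditional, and sorts are ignored. The names `match`, `apply`, `RULES`, and `canonical` are hypothetical.

```python
# Terms: str = variable, tuple = (symbol, *args); constants are 1-tuples.

def match(pat, t, th):
    """Extend th so that pat·th == t syntactically, or return None."""
    if isinstance(pat, str):                  # pattern variable
        if pat in th:
            return th if th[pat] == t else None
        return {**th, pat: t}
    if isinstance(t, str) or pat[0] != t[0] or len(pat) != len(t):
        return None
    for p, a in zip(pat[1:], t[1:]):
        th = match(p, a, th)
        if th is None:
            return None
    return th

def apply(t, th):
    """Apply the substitution th to the term t."""
    if isinstance(t, str):
        return th.get(t, t)
    return (t[0], *(apply(a, th) for a in t[1:]))

# Oriented equations: 0 + N -> N and s(M) + N -> s(M + N).
RULES = [
    (("+", ("0",), "N"), "N"),
    (("+", ("s", "M"), "N"), ("s", ("+", "M", "N"))),
]

def canonical(t):
    """Innermost normalization; it terminates because RULES is terminating."""
    if isinstance(t, str):
        return t
    t = (t[0], *(canonical(a) for a in t[1:]))    # normalize arguments first
    for lhs, rhs in RULES:
        th = match(lhs, t, {})
        if th is not None:
            return canonical(apply(rhs, th))
    return t
```

Because the two rules are terminating and do not overlap, the result does not depend on the order in which they are tried, which mirrors the uniqueness of \({t}\!\downarrow _{E/B}\) in this special case.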

A \(\varSigma \) -rule is a triple \(l \rightarrow r\;\mathbf{if }\; {\phi }\), with \(l,r \in T_\varSigma (X)_s\), for some sort \(s \in S\), and \(\phi = \bigwedge _{i\in I} t_i = u_i\) a finite conjunction of \(\varSigma \)-equations. A rewrite theory is a tuple \(\mathcal {R}= (\varSigma ,E,R)\) with \((\varSigma ,E)\) an order-sorted equational theory and \(R\) a finite set of \(\varSigma \)-rules. The rewrite theory \(\mathcal {R}\) induces a rewrite relation \(\rightarrow _{\mathcal {R}}\) on \(T_{\varSigma }(X)\) defined for every \(t,u \in T_\varSigma (X)\) by \(t \rightarrow _\mathcal {R}u\) iff there is a rule \((l \rightarrow r\;\mathbf{if }\; {\phi }) \in R\) and a substitution \(\theta : X \longrightarrow T_\varSigma (X)\) satisfying \(t =_E l\theta \), \(u =_E r\theta \), and \(E \vdash \phi \theta \). The relation \(\rightarrow _\mathcal {R}\) is undecidable in general, unless conditions such as coherence [37] are given. A key point of this paper is to make such a relation decidable when \(E\) decomposes as \(\mathcal {E}_{0}\uplus B_{1}\), where \(\mathcal {E}_{0}\) is a built-in theory for which formula satisfiability is decidable and \(B_{1}\) has a matching algorithm. A topmost rewrite theory is a rewrite theory \(\mathcal {R}= (\varSigma ,E,R)\), such that for some top sort \({{\textit{State}}}\), no operator in \(\varSigma \) has \({{\textit{State}}}\) as argument sort and each rule \(l \rightarrow r\;\mathbf{if }\; {\phi }\in R\) satisfies \(l,r \in T_\varSigma (X)_{{\textit{State}}}\) and \(l \notin X\).

3 Rewriting Modulo a Built-In Subtheory

This section introduces the concept of rewriting modulo a built-in equational subtheory and presents its main properties. Detailed proofs can be found in [33, 34].

Definition 1

(Signature with Built-ins). An order-sorted signature \(\varSigma = (S,\le ,F)\) is a signature with built-in subsignature \(\varSigma _{0}\subseteq \varSigma \) iff \(\varSigma _{0}= (S_{0},F_{0})\) is many-sorted, \(S_{0}\) is a set of minimal elements in \((S,\le )\), and if \(f : w \longrightarrow s \in F_{1}\), then \(s \notin S_{0}\) and \(f\) has no other typing in \(F_{0}\), where \(F_{1}= F{\setminus } F_{0}\).

The notion of built-in subsignature in an order-sorted signature \(\varSigma \) is modeled by a many-sorted signature \(\varSigma _{0}\) defining the built-in terms \(T_{\varSigma _{0}}(X_{0})\). The restriction imposed on the sorts and the function symbols in \(\varSigma \) w.r.t. \(\varSigma _{0}\) provides a clear syntactic distinction between built-in terms (the only ones with built-in sorts) and all other terms.

If \(\varSigma \supseteq \varSigma _{0}\) is a signature with built-ins, then an abstraction of built-ins for \(t\) is a context \(\lambda x_1\cdots x_n.{t}^{{\circ }}\) such that \({t}^{{\circ }} \in T_{\varSigma _{1}}(X)\) and \(\{x_1,\ldots ,x_n\} = {\textit{vars}}({t}^{{\circ }})\cap X_{0}\), where \(\varSigma _{1}= (S,\le , F_{1})\) and \(X_{0}= \{X_s\}_{s \in S_{0}}\). Lemma 1 shows that such an abstraction can be chosen so as to provide a canonical decomposition of \(t\) with useful properties.

Lemma 1

Let \(\varSigma \) be a signature with built-in subsignature \(\varSigma _{0}=(S_{0},F_{0})\). For each \(t\in T_{\varSigma }(X)\), there exist an abstraction of built-ins \(\lambda x_1\cdots x_n.{t}^{{\circ }}\) for \(t\) and a substitution \({\theta }^{{\circ }} : X_{0} \longrightarrow T_{\varSigma _{0}}(X_{0})\) such that (i) \(t = {t}^{{\circ }}{\theta }^{{\circ }}\) and (ii) \({ dom}({\theta }^{{\circ }}) = \{x_1,\ldots ,x_n\}\) are pairwise distinct and disjoint from \({\textit{vars}}(t)\); moreover, (iii) \({t}^{{\circ }}\) can always be selected to be \(S_{0}\)-linear and with \(\{x_1,\ldots ,x_n\}\) disjoint from an arbitrarily chosen finite subset \(Y\) of \(X_{0}\).

In the rest of the paper, for any \(t \in T_\varSigma (X)\) and \(Y\subseteq X_{0}\) finite, the expression \({\textit{abstract}}_{\varSigma _{1}}(t,Y)\) denotes the choice of a triple \(\langle \lambda x_1 \cdots x_n.{t}^{{\circ }} \,; {\theta }^{{\circ }} \,; {\phi }^{{\circ }} \rangle \) such that the context \(\lambda x_1 \cdots x_n.{t}^{{\circ }}\) and the substitution \({\theta }^{{\circ }}\) satisfy the properties (i)–(iii) in Lemma 1 and \({\phi }^{{\circ }} = \bigwedge _{i=1}^n \left( x_i = {\theta }^{{\circ }}(x_i)\right) \).
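One possible way to compute such a triple is sketched below. It is only an illustration under stated assumptions: terms are nested tuples with string variables, the built-in function symbols and built-in variables are passed in explicitly as sets (in place of sort information), and fresh variables are named `z0`, `z1`, and so on. The function `abstract` and its parameters are hypothetical names. Every maximal built-in subterm, including each occurrence of a built-in variable, is replaced by a distinct fresh variable, which makes the result \(S_{0}\)-linear.

```python
from itertools import count

# Terms: str = variable, tuple = (symbol, *args); constants are 1-tuples.

def variables(t):
    if isinstance(t, str):
        return {t}
    return set().union(*(variables(a) for a in t[1:]))

def apply(t, th):
    if isinstance(t, str):
        return th.get(t, t)
    return (t[0], *(apply(a, th) for a in t[1:]))

def abstract(term, builtin_syms, builtin_vars, avoid=frozenset()):
    """Return (t_abs, theta, phi) with term == apply(t_abs, theta).

    theta maps each fresh variable to the built-in subterm it replaces;
    phi lists the pairs (x, theta(x)), read as the equations x = theta(x).
    """
    taken = variables(term) | set(avoid)      # names the fresh variables must avoid
    fresh = (f"z{i}" for i in count())
    theta = {}

    def walk(t):
        if isinstance(t, str):
            is_builtin = t in builtin_vars
        else:
            is_builtin = t[0] in builtin_syms
        if is_builtin:                        # maximal built-in subterm
            x = next(v for v in fresh if v not in taken)
            theta[x] = t
            return x
        if isinstance(t, str):                # non-built-in variable
            return t
        return (t[0], *(walk(a) for a in t[1:]))

    t_abs = walk(term)
    phi = list(theta.items())
    return t_abs, theta, phi
```

For example, with `+` and `1` as built-in symbols and `x` as a built-in variable, the term `st(x + 1, q, x)` is abstracted to `st(z0, q, z1)` with `z0` bound to `x + 1` and `z1` bound to `x`.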

Under certain restrictions on axioms, matching a \(\varSigma \)-term \(t\) to a \(\varSigma \)-term \(u\) can be decomposed modularly into \(\varSigma _1\)-matching of the corresponding \(\lambda \)-abstraction and \(\varSigma _{0}\)-matching of the built-in subterms. This is described in Lemma 2.

Lemma 2

Let \(\varSigma = (S,\le ,F)\) be a signature with built-in subsignature \(\varSigma _{0}=(S_{0},F_{0})\). Let \(B_{0}\) be a set of \(\varSigma _{0}\)-axioms and \(B_{1}\) a set of \(\varSigma _{1}\)-axioms. For \(B_{0}\) and \(B_{1}\) regular, linear, collapse free for any sort in \(S_{0}\), and sort-preserving, if \(t \in T_{\varSigma _{1}}(X_{0})\) is linear with \({\textit{vars}}(t) = \{x_1,\ldots ,x_n\}\), then for each \(\theta : X_{0} \longrightarrow T_{\varSigma _{0}}(X_{0})\):

(a) if \(t\theta =_{B_{0}}^1 t'\), then there exist \(x \in \{x_1,\ldots ,x_n\}\) and \(w \in T_{\varSigma _{0}}(X_{0})\) such that \(\theta (x) =_{B_{0}}^1 w\) and \(t'= t\theta '\), with \(\theta '(x) = w\) and \(\theta '(y) = \theta (y)\) otherwise;

(b) if \(t\theta =_{B_{1}}^1 t'\), then there exists \(v \in T_{\varSigma _{1}}(X_{0})\) such that \(t =_{B_{1}}^1 v\) and \(t' = v\theta \); and

(c) if \(t\theta =_{B_{0}\uplus B_{1}} t'\), then there exist \(v \in T_{\varSigma _1}(X_{0})\) and \(\theta ' : X_{0} \longrightarrow T_{\varSigma _{0}}(X_{0})\) such that \(t' = v\theta '\), \(t=_{B_{1}} v\), and \(\theta =_{B_{0}} \theta '\) (i.e., \(\theta (x) =_{B_{0}} \theta '(x)\) for each \(x \in X_{0}\)).

Definition 2 introduces the notion of rewriting modulo a built-in subtheory.

Definition 2

(Rewriting Modulo a Built-in Subtheory). A rewrite theory modulo the built-in subtheory \(\mathcal {E}_{0}\) is a topmost rewrite theory \(\mathcal {R}=(\varSigma , E, R)\) with:

(a) \(\varSigma {=} (S,\le ,F)\) a signature with built-in subsignature \(\varSigma _{0}{=} (S_{0},F_{0})\) and top sort \({{\textit{State}}}{\in } S\);

(b) \(E = E_{0}\uplus B_{0}\uplus B_{1}\), where \(E_{0}\) is a set of \(\varSigma _{0}\)-equations, \(B_{0}\) (resp., \(B_{1}\)) are \(\varSigma _{0}\)-axioms (resp., \(\varSigma _{1}\)-axioms) satisfying the conditions in Lemma 2, \(\mathcal {E}_{0}= (\varSigma _{0},E_{0}\uplus B_{0})\) and \(\mathcal {E}= (\varSigma ,E)\) are admissible, and the theory inclusion \(\mathcal {E}_{0}\subseteq \mathcal {E}\) is protecting;

(c) \(R\) is a set of rewrite rules of the form \(l(\,\overrightarrow{x_1},\overrightarrow{y}) \rightarrow r(\,\overrightarrow{x_2},\overrightarrow{y})\;\mathbf{if }\; {\phi (\,\overrightarrow{x_3})}\) such that \(l,r \in T_\varSigma (X)_{{\textit{State}}}\), \(l\) is \((S\setminus S_{0})\)-linear, \(\overrightarrow{x_i}{:}\overrightarrow{s_i}\) with \(\overrightarrow{s_i} \in S^{*}_{0}\), for \(i \in \{1,2,3\}\), \(\overrightarrow{y}{:}\overrightarrow{s}\) with \(\overrightarrow{s} \in (S\setminus S_{0})^*\), and \(\phi \in {\textit{QF}}_{\varSigma _{0}}({X_{0}})\), where \({\textit{QF}}_{\varSigma _{0}}({X_{0}})\) denotes the set of quantifier-free \(\varSigma _{0}\)-formulas with variables in \(X_{0}\).

Note that no assumption is made on the relationship between the built-in variables \(x_1\) in the left-hand side, \(x_2\) in the right-hand side, and \(x_3\) in the condition \(\phi \) of a rewrite rule. This freedom is key for specifying open systems with a rewrite theory because, for instance, \(x_2\) can have more variables than \(x_1\). On the other hand, due to the presence of conditions \(\phi \) in the rules of \(\mathcal {R}\) that are general quantifier-free formulas, as opposed to a conjunction of atoms, properly speaking \(\mathcal {R}\) is more general than a standard rewrite theory as defined in Sect. 2.

The binary rewrite relation induced by a rewrite theory \(\mathcal {R}\) modulo \(\mathcal {E}_{0}\) on \(T_{\varSigma ,{{\textit{State}}}}\) is called the ground rewrite relation of \(\mathcal {R}\).

Definition 3

(Ground Rewrite Relation). Let \(\mathcal {R}= (\varSigma , E, R)\) be a rewrite theory modulo \(\mathcal {E}_{0}\). The relation \(\rightarrow _{\mathcal {R}}\) induced by \(\mathcal {R}\) on \(T_{\varSigma ,{{\textit{State}}}}\) is defined for \(t,u \in T_{\varSigma ,{{\textit{State}}}}\) by \(t \rightarrow _{\mathcal {R}} u\) iff there is a rule \(l \rightarrow r\;\mathbf{if }\; {\phi }\) in \(R\) and a ground substitution \(\sigma : X \longrightarrow T_\varSigma \) such that (a) \(t =_E l\sigma \), \(u =_E r\sigma \), and (b) \(\mathcal {T}_{\mathcal {E}_{0}}\models \phi \sigma \).

The ground rewrite relation \(\rightarrow _\mathcal {R}\) is the topmost rewrite relation induced by \(R\) modulo \(E\) on \(T_{\varSigma ,{{\textit{State}}}}\). This relation is defined even when a rule in \(R\) has extra variables in its right-hand side: the rule is then non-deterministic and such extra variables can be arbitrarily instantiated, provided that the corresponding instantiation of \(\phi \) holds. Also, note that non-built-in variables can occur in \(l\), but \(\phi \sigma \) is a variable-free formula in \({\textit{QF}}_{\varSigma _{0}}({\emptyset })\), so that either \(\mathcal {T}_{\mathcal {E}_{0}}\models \phi \sigma \) or \(\mathcal {T}_{\mathcal {E}_{0}}\not \models \phi \sigma \).

A rewrite theory \(\mathcal {R}\) modulo \(\mathcal {E}_{0}\) always has a canonical representation in which all left-hand sides of rules are \(S_{0}\)-linear \(\varSigma _{1}\)-terms.

Definition 4

(Normal Form of a Rewrite Theory Modulo \(\mathcal {E}_{0}\)). Let \(\mathcal {R}= (\varSigma , E, R)\) be a rewrite theory modulo \(\mathcal {E}_{0}\). Its normal form \({\mathcal {R}}^{{\circ }} = (\varSigma , E, {R}^{{\circ }})\) has rules:

$$\begin{aligned} {R}^{{\circ }} = \{ {l}^{{\circ }} \rightarrow r\;\mathbf{if }\; {\phi \wedge {\phi }^{{\circ }}} \mid (\exists l \rightarrow r\;\mathbf{if }\; {\phi } \in R) \langle \lambda \overrightarrow{x}.{l}^{{\circ }} \,; {\theta }^{{\circ }} \,; {\phi }^{{\circ }} \rangle = {\textit{abstract}}_{\varSigma _{1}}(l, {\textit{vars}}(\{l,r,\phi \}))\}. \end{aligned}$$
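
As a small worked example, assume for illustration only that the built-in theory is integer arithmetic and that \(c\) is a unary constructor of sort \({{\textit{State}}}\); neither assumption is fixed by the definitions above. The rule \(c(x+1) \rightarrow c(x)\;\mathbf{if }\; x \ge 0\) has the built-in subterm \(x+1\) in its left-hand side. Abstracting it with a fresh built-in variable \(z\) gives \({l}^{{\circ }} = c(z)\), \({\theta }^{{\circ }} = \{z \mapsto x+1\}\), and \({\phi }^{{\circ }} = (z = x+1)\), so the normalized rule is

$$\begin{aligned} c(z) \rightarrow c(x)\;\mathbf{if }\; x \ge 0 \wedge z = x+1. \end{aligned}$$

Both rules justify the ground step \(c(3) \rightarrow c(2)\): the original rule with \(\sigma = \{x \mapsto 2\}\), using \(c(3) =_E c(2+1)\), and the normalized rule with \(\sigma = \{x \mapsto 2, z \mapsto 3\}\), for which \(2 \ge 0 \wedge 3 = 2+1\) holds in the standard model.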

Lemma 3

(Invariance of Ground Rewriting under Normalization). Let \(\mathcal {R}= (\varSigma , E, R)\) be a rewrite theory modulo \(\mathcal {E}_{0}\). Then \({\rightarrow _{\mathcal {R}}} = {\rightarrow _{{\mathcal {R}}^{{\circ }}}}\).

By the properties of the axioms in a rewrite theory modulo built-ins \(\mathcal {R}= (\varSigma ,E_{0}\uplus B_{0}\uplus B_{1}, R)\) (see Definition 2), \(B_{1}\)-matching a term \(t\in T_{\varSigma }(X_{0})\) to a left-hand side \({l}^{{\circ }}\) of a rule in \({R}^{{\circ }}\) provides a complete unifiability algorithm for ground \(B_{1}\)-unification of \(t\) and \({l}^{{\circ }}\).

Lemma 4

(Matching Lemma). Let \(\mathcal {R}= (\varSigma , E_{0}\uplus B_{0}\uplus B_{1}, R)\) be a rewrite theory modulo \(\mathcal {E}_{0}\). For \(t \in T_\varSigma (X_{0})_{{\textit{State}}}\) and \({l}^{{\circ }}\) a left-hand side of a rule in \({R}^{{\circ }}\) with \({\textit{vars}}(t)\cap {\textit{vars}}({l}^{{\circ }})=\emptyset \), \(t \ll _{ B_{1}} {l}^{{\circ }}\) iff \({\textit{GU}}_{ B_{1}}(t ={l}^{{\circ }}) \ne \emptyset \) holds, where \({\textit{GU}}_{ B_{1}}(t ={l}^{{\circ }}) = \{\sigma : X \longrightarrow T_\varSigma \mid t\sigma =_{B_{1}} {l}^{{\circ }}\sigma \}\).

4 Symbolic Rewriting Modulo a Built-In Subtheory

This section explains how a rewrite theory \(\mathcal {R}\) modulo \(\mathcal {E}_{0}\) defines a symbolic rewrite relation on terms in \(T_{\varSigma }(X_{0})_{{\textit{State}}}\) constrained by formulas in \({\textit{QF}}_{\varSigma _{0}}({X_{0}})\). The key idea is that, when \(\mathcal {E}_{0}\) is a decidable theory, transitions on the symbolic terms can be performed by rewriting modulo \(B_{1}\), and satisfiability of the formulas can be handled by an SMT decision procedure. This approach provides an efficiently executable symbolic method called rewriting modulo SMT that is sound and complete with respect to the ground rewrite relation of Definition 3 and yields a complete symbolic reachability analysis method. Detailed proofs of the theorems presented in this section can be found in [34].

Definition 5

(Constrained Terms and their Denotation). Let \(\mathcal {R}= (\varSigma , E, R)\) be a rewrite theory modulo \(\mathcal {E}_{0}\). A constrained term is a pair \(\langle t; \varphi \rangle \) in \(T_{\varSigma }(X_{0})_{{\textit{State}}}\times {\textit{QF}}_{\varSigma _{0}}({X_{0}})\). Its denotation \([\![t ]\!]_{\varphi }\) is defined as \([\![t ]\!]_{\varphi } = \{t' {\in } T_{\varSigma ,{{\textit{State}}}} \mid (\exists \sigma : X_{0}{\longrightarrow } T_{\varSigma _{0}}) \; t' {=} t\sigma \wedge \mathcal {T}_{\mathcal {E}_{0}}\models \varphi \sigma \}\).

The domain of \(\sigma \) in Definition 5 ranges over all built-in variables \(X_{0}\) and consequently \([\![t ]\!]_{\varphi } \subseteq T_{\varSigma ,{{\textit{State}}}}\) for any \(t \in T_{\varSigma }(X_{0})_{{\textit{State}}}\), even if \({\textit{vars}}(t) \not \subseteq {\textit{vars}}(\varphi )\). Intuitively, \([\![t ]\!]_{\varphi }\) denotes the set of all ground states that are instances of \(t\) and satisfy \(\varphi \).
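For instance, assuming (only for the sake of illustration) that the built-ins are the integers and that \(c\) is a unary constructor of sort \({{\textit{State}}}\), the constrained term \(\langle c(x); x > 3 \rangle \) has denotation

$$\begin{aligned} [\![c(x) ]\!]_{x > 3} = \{ c(w) \mid w \in T_{\varSigma _{0}} \text { and } \mathcal {T}_{\mathcal {E}_{0}}\models w > 3 \}. \end{aligned}$$

Since \(t' = t\sigma \) is a syntactic equality, this set contains \(c(4)\) and \(c(2+3)\) as distinct ground terms, but it does not contain \(c(3)\).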

Before introducing the symbolic rewrite relation on constrained terms induced by a rewrite theory \(\mathcal {R}\) modulo \(\mathcal {E}_{0}\), auxiliary notation for variable renaming is required. In the rest of the paper, the expression \({\textit{fresh-vars}}(Y)\), for \(Y\subseteq X\) finite, represents the choice of a variable renaming \(\zeta : X \longrightarrow X\) satisfying \(Y \cap { ran}(\zeta ) = \emptyset \).

Definition 6

(Symbolic Rewrite Relation). Let \(\mathcal {R}= (\varSigma , E, R)\) be a rewrite theory modulo built-ins \(\mathcal {E}_{0}\). The symbolic rewrite relation \(\rightsquigarrow _{\mathcal {R}}\) induced by \(\mathcal {R}\) on \(T_{\varSigma }(X_{0})_{{\textit{State}}}\times {\textit{QF}}_{\varSigma _{0}}({X_{0}})\) is defined for \(t,u \in T_{\varSigma }(X_{0})_{{\textit{State}}}\) and \(\varphi ,\varphi ' \in {\textit{QF}}_{\varSigma _{0}}({X_{0}})\) by \(\langle t; \varphi \rangle \rightsquigarrow _{\mathcal {R}} \langle u; \varphi ' \rangle \) iff there is a rule \(l \rightarrow r\;\mathbf{if }\; {\phi }\) in \(R\) and a substitution \(\theta : X \longrightarrow T_{\varSigma }(X)\) such that (a) \(t =_E l\zeta \theta \) and \(u = r\zeta \theta \), (b) \(\mathcal {E}_{0}\vdash (\varphi ' \Leftrightarrow \varphi \wedge \phi \zeta \theta )\), and (c) \(\varphi '\) is \(\mathcal {T}_{\mathcal {E}_{0}}\)-satisfiable, where \(\zeta = {\textit{fresh-vars}}({\textit{vars}}(t,\varphi ))\).

The symbolic relation \(\rightsquigarrow _\mathcal {R}\) on constrained terms is defined as a topmost rewrite relation induced by \(R\) modulo \(E\) on \(T_\varSigma (X_{0})\) with extra bookkeeping of constraints. Note that \(\varphi '\) in \(\langle t; \varphi \rangle \rightsquigarrow _\mathcal {R}\langle u; \varphi ' \rangle \), when witnessed by \(l \rightarrow r\;\mathbf{if }\; {\phi }\) and \(\theta \), is semantically equivalent to \(\varphi \wedge \phi \zeta \theta \), rather than required to be syntactically equal to it. This extra freedom allows for simplification of constraints if desired. Also, such a constraint \(\varphi '\) is satisfiable in \(\mathcal {T}_{\mathcal {E}_{0}}\), implying that \(\varphi \) and \(\phi \zeta \theta \) are both satisfiable in \(\mathcal {T}_{\mathcal {E}_{0}}\), and therefore \([\![t ]\!]_{\varphi } \!\ne \emptyset \! \ne \! [\![u ]\!]_{\varphi '}\). Note that, up to the choice of the semantically equivalent \(\varphi '\) for which a fixed strategy is assumed, the symbolic relation \(\rightsquigarrow _\mathcal {R}\) is deterministic because the renaming of variables in the rules is fixed by \({\textit{fresh-vars}}\). This is key when executing \(\rightsquigarrow _\mathcal {R}\), as explained in Sect. 5.
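The shape of one symbolic step can be sketched as follows. This is a toy illustration under several assumptions that are not part of the definition: matching is syntactic (i.e., \(B_{1}\) is empty and the rule is already in normal form), the built-ins are integers with `+`, `>`, `=`, and `and`, and satisfiability is approximated by enumerating assignments in a small range. That bounded search merely stands in for an SMT solver and is not a decision procedure. All function names are hypothetical, and the caller is assumed to pick the index `k` so that the renamed rule variables are fresh.

```python
from itertools import product

# Terms and formulas: str = variable, int/bool = built-in constant,
# tuple = (symbol, *args). True is used as the trivial constraint.

def apply(t, th):
    if isinstance(t, str):
        return th.get(t, t)
    if isinstance(t, int):
        return t
    return (t[0], *(apply(a, th) for a in t[1:]))

def variables(t):
    if isinstance(t, str):
        return {t}
    if isinstance(t, int):
        return set()
    return set().union(*(variables(a) for a in t[1:]))

def match(pat, t, th):
    """Syntactic matching: extend th so that pat·th == t, or return None."""
    if isinstance(pat, str):
        if pat in th:
            return th if th[pat] == t else None
        return {**th, pat: t}
    if isinstance(pat, int) or isinstance(t, (str, int)):
        return th if pat == t else None
    if pat[0] != t[0] or len(pat) != len(t):
        return None
    for p, a in zip(pat[1:], t[1:]):
        th = match(p, a, th)
        if th is None:
            return None
    return th

OPS = {"+": lambda a, b: a + b, ">": lambda a, b: a > b,
       "=": lambda a, b: a == b, "and": lambda a, b: a and b}

def evaluate(e, env):
    """Evaluate a built-in term or formula under an integer assignment."""
    if isinstance(e, str):
        return env[e]
    if isinstance(e, int):
        return e
    return OPS[e[0]](evaluate(e[1], env), evaluate(e[2], env))

def satisfiable(phi, bound=5):
    """Toy stand-in for an SMT call: search assignments in [-bound, bound]."""
    xs = sorted(variables(phi))
    rng = range(-bound, bound + 1)
    return any(evaluate(phi, dict(zip(xs, vs)))
               for vs in product(rng, repeat=len(xs)))

def step(t, phi, rule, k):
    """One symbolic step <t; phi> ~> <u; phi'> using rule = (l, r, cond), or None."""
    l, r, cond = rule
    zeta = {x: f"{x}_{k}" for x in variables(l) | variables(r) | variables(cond)}
    th = match(apply(l, zeta), t, {})         # t = l·zeta·theta
    if th is None:
        return None
    u = apply(apply(r, zeta), th)             # u = r·zeta·theta
    phi2 = ("and", phi, apply(apply(cond, zeta), th))
    return (u, phi2) if satisfiable(phi2) else None
```

For the open-system rule \(c(z) \rightarrow c(y)\;\mathbf{if }\; y > z\), where \(y\) is chosen by the environment, a step from \(\langle c(x); x > 0 \rangle \) yields \(\langle c(y_1); x > 0 \wedge y_1 > x \rangle \): the fresh variable \(y_1\) stays symbolic, and only the satisfiability of the accumulated constraint is checked.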

The important question to ask is whether this symbolic relation soundly and completely simulates its ground counterpart. The rest of this section affirmatively answers this question in the case of normalized rewrite theories modulo built-ins. Thanks to Lemma 3, the conclusion is therefore that \(\rightsquigarrow _{{\mathcal {R}}^{{\circ }}}\) soundly and completely simulates \(\rightarrow _{\mathcal {R}}\) for any rewrite theory \(\mathcal {R}\) modulo built-ins \(\mathcal {E}_{0}\).

The soundness of \(\rightsquigarrow _{{\mathcal {R}}^{{\circ }}}\) w.r.t. \(\rightarrow _{{\mathcal {R}}^{{\circ }}}\) is stated in Theorem 1.

Theorem 1

(Soundness). Let \(\mathcal {R}= (\varSigma ,E,R)\) be a rewrite theory modulo built-ins \(\mathcal {E}_{0}\), \(t,u\in T_\varSigma (X_{0})_{{\textit{State}}}\), and \(\varphi ,\varphi '\in {\textit{QF}}_{\varSigma _{0}}({X_{0}})\). If \(\langle t; \varphi \rangle \rightsquigarrow _{{\mathcal {R}}^{{\circ }}} \langle u; \varphi ' \rangle \), then \(t\rho \rightarrow _{{\mathcal {R}}^{{\circ }}} u\rho \) for all \(\rho : X_{0} \longrightarrow T_{\varSigma _{0}}\) satisfying \(\mathcal {T}_{\mathcal {E}_{0}}\models \varphi '\rho \).

The completeness of \(\rightsquigarrow _{{\mathcal {R}}^{{\circ }}}\) w.r.t. \(\rightarrow _{{\mathcal {R}}^{{\circ }}}\) is stated in Theorem 2. Intuitively, completeness states that a symbolic relation yields an over-approximation of its ground rewriting counterpart.

Theorem 2

(Completeness). Let \(\mathcal {R}= (\varSigma ,E,R)\) be a rewrite theory modulo built-ins \(\mathcal {E}_{0}\), \(t \in T_\varSigma (X_{0})_{{\textit{State}}}\), \(u' \in T_{\varSigma ,{{\textit{State}}}}\), and \(\varphi \in {\textit{QF}}_{\varSigma _{0}}({X_{0}})\). For any \(\rho : X_{0} \longrightarrow T_{\varSigma _{0}}\) such that \(t\rho \in [\![t ]\!]_{\varphi }\) and \(t\rho \rightarrow _{{\mathcal {R}}^{{\circ }}} u'\), there exist \(u\in T_\varSigma (X_{0})_{{\textit{State}}}\) and \(\varphi '\in {\textit{QF}}_{\varSigma _{0}}({X_{0}})\) such that \(\langle t; \varphi \rangle \rightsquigarrow _{{\mathcal {R}}^{{\circ }}}\langle u; \varphi ' \rangle \) and \(u'\in [\![u ]\!]_{\varphi '}\).

Although the above soundness and completeness theorems, plus Lemma 3, show that \(\rightarrow _\mathcal {R}\) is characterized symbolically by \(\rightsquigarrow _{{\mathcal {R}}^{{\circ }}}\), for any rewrite theory \(\mathcal {R}\) modulo \(\mathcal {E}_{0}\), because of Condition (c) in Definition 6, the relation \(\rightsquigarrow _{{\mathcal {R}}^{{\circ }}}\) is in general undecidable. However, \(\rightsquigarrow _{{\mathcal {R}}^{{\circ }}}\) becomes decidable for built-in theories \(\mathcal {E}_{0}\) that can be extended to a decidable theory \(\mathcal {E}_{0}^+\) (typically by adding some inductive consequences) such that

$$\begin{aligned} (\forall \phi \in {\textit{QF}}_{\varSigma _{0}}({X_{0}})) \; \phi ~{\text {is}}~\mathcal {E}_{0}^+{\text {-satisfiable}} \; \iff \; (\exists \sigma : X_{0} \longrightarrow T_{\varSigma _{0}})\; \mathcal {T}_{\mathcal {E}_{0}}\models \phi \sigma . \end{aligned}$$
(1)

Many decidable theories \(\mathcal {E}_{{0}}^+\) of interest are supported by SMT solvers satisfying this requirement. For example, \(\mathcal {E}_{0}\) can be the equational theory of natural number addition and \(\mathcal {E}_{0}^+\) Presburger arithmetic. That is, \(\mathcal {T}_{\mathcal {E}_{0}}\) is the standard model of both \(\mathcal {E}_{0}\) and \(\mathcal {E}_{0}^+\), and \(\mathcal {E}_{0}^+\)-satisfiability coincides with satisfiability in such a standard model. Under such conditions, satisfiability of \(\varphi \wedge \phi \zeta \theta \) (and therefore of \(\varphi '\)) in a step \(\langle t; \varphi \rangle \rightsquigarrow _{{\mathcal {R}}^{{\circ }}} \langle u; \varphi ' \rangle \) becomes decidable by invoking an SMT-solver for \(\mathcal {E}_{0}\), so that \(\rightsquigarrow _{{\mathcal {R}}^{{\circ }}}\) can be naturally described as symbolic rewriting modulo SMT (and modulo \(B_{1}\)).
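As a small illustration of Condition (1), and assuming that numerals abbreviate ground \(\varSigma _{0}\)-terms of the theory of natural number addition, consider the following two formulas, which are illustrative only and are not taken from the case study:

$$\begin{aligned} \phi _a = (x + y = 3) \wedge \lnot (x = y), \qquad \phi _b = (x + x = 3). \end{aligned}$$

The formula \(\phi _a\) is satisfiable in Presburger arithmetic, and the ground substitution \(\sigma = \{x \mapsto 1, y \mapsto 2\}\) witnesses \(\mathcal {T}_{\mathcal {E}_{0}}\models \phi _a\sigma \), since \(1 + 2 = 3\) and \(1 \ne 2\). The formula \(\phi _b\) is unsatisfiable in Presburger arithmetic because \(3\) is odd, and accordingly no ground substitution \(\sigma \) makes \(\phi _b\sigma \) hold in the standard model.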

The symbolic reachability problems considered in this paper for a rewrite theory \(\mathcal {R}\) modulo \(\mathcal {E}_{0}\) are existential formulas of the form \((\exists \overrightarrow{z})\;t \rightarrow ^* u \wedge \varphi \), with \(\overrightarrow{z}\) the variables appearing in \(t\), \(u\), and \(\varphi \), \(t\in T_{\varSigma }(X_{0})_{{\textit{State}}}\), \(u \in T_{\varSigma }(X)_{{\textit{State}}}\), and \(\varphi \in {\textit{QF}}_{\varSigma _{0}}({X_{0}})\). By abstracting the \(\varSigma _{0}\)-subterms of \(u\), the ground solutions of such a reachability problem are those witnessing the model-theoretic satisfaction relation

$$\begin{aligned} \mathcal {T}_\mathcal {R}\models (\exists \overrightarrow{x} \uplus \overrightarrow{y}) \; t(\overrightarrow{x}) \rightarrow ^* {u}^{{\circ }}(\overrightarrow{y}) \wedge \varphi _1(\overrightarrow{x}) \wedge \varphi _2(\overrightarrow{x},\overrightarrow{y}), \end{aligned}$$
(2)

where \(\mathcal {T}_\mathcal {R}= (\mathcal {T}_{\varSigma /E},\rightarrow _\mathcal {R}^*)\) is the initial reachability model of \(\mathcal {R}\) [12], \(t \in T_{\varSigma }(X_{0})\) and \({u}^{{\circ }} \in T_{\varSigma _{1}}(X)\) are \(S_{0}\)-linear, \({\textit{vars}}(t) \subseteq \overrightarrow{x} \subseteq X_{0}\), and \(\overrightarrow{y}\subseteq X\). Thanks to the soundness and completeness results (Theorems 1 and 2), the solvability of Condition (2) for \(\rightarrow _\mathcal {R}\) can be achieved by reachability analysis with \(\rightsquigarrow _{{\mathcal {R}}^{{\circ }}}\), as stated in Theorem 3.

Theorem 3

(Symbolic Reachability Analysis). Let \(\mathcal {R}= (\varSigma ,E,R)\) be a rewrite theory modulo built-ins \(\mathcal {E}_{0}\). The model-theoretic satisfaction relation in (2) has a solution iff there exist a term \(v \in T_{\varSigma }(X)_{{\textit{State}}}\), a constraint \(\varphi ' \in {\textit{QF}}_{\varSigma _{0}}({X_{0}})\), and a substitution \(\theta : X \longrightarrow T_\varSigma (X)\), with \({ dom}(\theta ) \subseteq \overrightarrow{y}\), such that (a) \(\langle t; \varphi _1 \rangle \rightsquigarrow _{{\mathcal {R}}^{{\circ }}}^* \langle v; \varphi ' \rangle \), (b) \(v =_{B_{1}} {u}^{{\circ }}\theta \), and (c) \(\varphi ' \wedge \varphi _2\theta \) is \(\mathcal {T}_{\mathcal {E}_{0}}\)-satisfiable.

In Theorem 3, since \({ dom}(\theta ) \subseteq \overrightarrow{y}\), and \(\overrightarrow{x}\) and \(\overrightarrow{y}\) are disjoint, the variables of \(\overrightarrow{x}\) in \(\varphi _{2}\theta \) are left unchanged. Therefore, \(\varphi _2\theta \) links the requirements for the variables \(\overrightarrow{x}\) in the initial state and \(\overrightarrow{y}\) in the final state according to both \(\varphi _{1}\) and \(\varphi _{2}\). Also note that the inclusion of formula \(\varphi _1\) as a conjunct in the formula in Condition (c) of Theorem 3 is superfluous because \(\langle t; \varphi _1 \rangle \rightsquigarrow _{{\mathcal {R}}^{{\circ }}} \langle v; \varphi ' \rangle \) implies that \(\varphi _1\) is a semantic consequence of \(\varphi '\).

5 Reflective Implementation of \(\rightsquigarrow _{{\mathcal {R}}^{{\circ }}}\)

This section discusses the design and implementation of a prototype that offers support for symbolic rewriting modulo SMT in the Maude system. The prototype relies on Maude’s meta-level features, which implement rewriting logic’s reflective capabilities, and on SMT solving for \({\mathcal {E}_{0}^+}\) integrated in Maude as CVC3’s decision procedures. The extension of Maude with CVC3 is available from the Matching Logic Project [35]. In the rest of this section, \(\mathcal {R}= (\varSigma ,E_{0}\uplus B_{0}\uplus B_{1},R)\) is a rewrite theory modulo built-ins \(\mathcal {E}_{0}\), where \(\mathcal {E}_{0}\) satisfies Condition (1) in Sect. 4. The theory mapping \(\mathcal {R}\mapsto \mathbf{u }(\mathcal {R})\) removes the constraints from the rules in \(R\).

In Maude, reflection is efficiently supported by its \({\textit{META-LEVEL}}\) module [14], which provides key functionality for rewriting logic’s universal theory \(\mathcal {U}\) [15]. In particular, rewrite theories \(\mathcal {R}\) are meta-represented in \(\mathcal {U}\) as terms \(\overline{\mathcal {R}}\) of sort \({\textit{Module}}\), and a term \(t\) in \(\mathcal {R}\) is meta-represented in \(\mathcal {U}\) as a term \(\overline{t}\) of sort \({\textit{Term}}\). The key idea of the reflective implementation is to reduce symbolic rewriting with \(\rightsquigarrow _{{\mathcal {R}}^{{\circ }}}\) to standard rewriting in an associated reflective rewrite theory extending the universal theory \(\mathcal {U}\). This is especially important for formal analysis purposes, because it makes available to \(\rightsquigarrow _{{\mathcal {R}}^{{\circ }}}\) some formal analysis features provided by Maude for rewrite theories, such as reachability analysis by search. This is illustrated by the case study in Sect. 6.

The prototype defines a parametrized functional module \({\textit{SAT}}(\varSigma _{0}, E_{0}\uplus B_{0})\) of quantifier-free formulas with \(\varSigma _{0}\)-equations as atoms. In particular, this module extends \((\varSigma _{0}, E_{0}\uplus B_{0})\) with new sorts \({\textit{Atom}}\) and \({\textit{QFFormula}}\), and new constants  \({\textit{var}}(X_{0})\) identifying the variables \(X_{0}\). It has, among other functions, a function \({\textit{sat}} : {\textit{QFFormula}} \longrightarrow {\textit{Bool}}\) such that for \(\phi \), \({\textit{sat}}(\phi ) = \top \) if \(\phi \) is \(\mathcal {E}_{0}^+\)-satisfiable, and \({\textit{sat}}(\phi ) = \bot \) otherwise.

The process of computing the one-step rewrites of a given constrained term \(\langle t; \varphi \rangle \) under \(\rightsquigarrow _{{\mathcal {R}}^{{\circ }}}\) is decomposed into two conceptual steps using Maude’s metalevel. First, all possible triples \(\langle u \,; \theta \,; \phi \rangle \) such that \(t \rightarrow _{\mathbf{u }({\mathcal {R}}^{{\circ }})} u\) is witnessed by a matching substitution \(\theta \) and a rule with constraint \(\phi \) are computed. Second, these triples are filtered by keeping only those for which the quantifier-free formula \(\varphi \wedge \phi \theta \) is \(\mathcal {E}_{0}^+\)-satisfiable.

The first step in the process is mechanized by function \({\textit{next}}\), available from the parametrized module \({\textit{NEXT}}(\overline{\mathcal {R}},\overline{{{\textit{State}}}},\overline{{{\textit{QFFormula}}}})\) where \(\overline{\mathcal {R}}\), \(\overline{{{\textit{State}}}}\), and \(\overline{QFFormula}\) are the metalevel representations, respectively, of the rewrite theory module \(\mathcal {R}\), the state sort \({{\textit{State}}}\), and the quantifier-free formula sort \({\textit{QFFormula}}\). Function \({\textit{next}}\) uses Maude’s \({\textit{meta-match}}\) function and the auxiliary function \({\textit{new-vars}}\) for computing fresh variables (see Sect. 4). In particular, the call \( {\textit{next}}(\overline{((S,\le ,F\uplus {\textit{var}}(X_{0})), E_{0}\uplus B_{0}\uplus B_{1}, {R}^{{\circ }})},\overline{t},\overline{\varphi })\) computes all possible triples \(\langle \overline{u} \,; \overline{\theta '} \,; \overline{\phi '} \rangle \) such that \(t \rightsquigarrow _{{\mathcal {R}}^{{\circ }}} u\) is witnessed by a substitution \(\theta '\) and a rule with constraint \(\phi '\). More precisely, such a call first computes a renaming \(\zeta = {\textit{fresh-vars}}({\textit{vars}}(t,\varphi ))\) and then, for each rule\(({l}^{{\circ }} \rightarrow r\;\mathbf{if }\; {\phi })\), it uses the function meta-match to obtain a substitution \(\overline{\theta }\in {\textit{meta-match}}(\overline{((S,\le ,F\uplus {\textit{var}}(X_{0})), B_{0}\uplus B_{1})},\overline{{t}\!\downarrow _{E_{0}/B_{0}\uplus B_{1}}},\overline{{l}^{{\circ }}\zeta })\), and returns \(\langle \overline{u} \,; \overline{\theta '} \,; \overline{\phi '} \rangle \) with \(\overline{u} = \overline{r\zeta \theta }\), \(\overline{\theta '}=\overline{\zeta \theta }\), and \(\overline{\phi '} = \overline{\phi \zeta \theta }\). 
Note that by having a deterministic choice of fresh variables (including those in the constraint), function \({\textit{next}}\) is actually a deterministic function.
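The two conceptual steps can be sketched in Python as follows. This is only an illustration under stated assumptions, not the Maude prototype: the two rules for a `counter` state are hypothetical, the matching substitution is left implicit, and `bounded_sat` enumerates naturals below `BOUND` instead of calling an SMT solver, so it is not a decision procedure.

```python
from itertools import product

BOUND = 8  # assumption: naturals below BOUND replace a real SMT query


def bounded_sat(constraints, variables):
    """Return a satisfying assignment found by enumeration, or None."""
    for values in product(range(BOUND), repeat=len(variables)):
        env = dict(zip(variables, values))
        if all(check(env) for check in constraints):
            return env
    return None


def candidates(state, variables):
    """Step 1: match the (hypothetical) rules on the term, ignoring constraints."""
    kind, x = state
    if kind != "counter":
        return []
    y = f"v{len(variables)}"  # deterministic choice of the fresh variable
    return [
        # counter(x) -> counter(y) if y > x   (y is supplied by the environment)
        (("counter", y), [y], lambda env: env[y] > env[x]),
        # counter(x) -> stuck(x) if x < 0     (never enabled over the naturals)
        (("stuck", x), [], lambda env: env[x] < 0),
    ]


def symbolic_step(state, constraints, variables):
    """Step 2: keep the candidates whose accumulated constraint is satisfiable."""
    result = []
    for new_state, fresh, guard in candidates(state, variables):
        new_constraints = constraints + [guard]
        new_variables = variables + fresh
        if bounded_sat(new_constraints, new_variables) is not None:
            result.append((new_state, new_constraints, new_variables))
    return result


start = ("counter", "v0")
steps = symbolic_step(start, [lambda env: env["v0"] >= 3], ["v0"])
print([state for state, _, _ in steps])  # [('counter', 'v1')]
```

Starting from the constrained term with state `counter(v0)` and constraint `v0 >= 3`, only the first rule survives the filter, because `v0 >= 3` together with `v0 < 0` has no solution. Naming the fresh variable from the number of variables already in use mirrors the deterministic choice of fresh variables mentioned above.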

Using the above-mentioned infrastructure, the parametrized module NEXT implements the symbolic rewrite relation \(\rightsquigarrow _{{\mathcal {R}}^{{\circ }}}\) as a standard rewrite relation in the theory \({\textit{NEXT}}\), extending \({\textit{META-LEVEL}}\), by means of the following conditional rewrite rule:

where \({\mathcal {R}}^{{\bullet }} = ((S,\le ,F\uplus {\textit{var}}(X_{0})), B, {R}^{{\circ }})\). Therefore, a call to an external SMT solver is just an invocation of the function \({\textit{sat}}\) in \({\textit{SAT}}(\varSigma _{0}, E_{0}\uplus B_{0})\) in order to achieve the above functionality more efficiently and in a built-in way.

Given that the symbolic rewrite relation \(\rightsquigarrow _{{\mathcal {R}}^{{\circ }}}\) is encoded as a standard rewrite relation, symbolic search can be directly implemented in Maude by its search command. In particular, for terms \(t,{u}^{{\circ }}\), constraints \(\varphi _1,\varphi _2\), \(F\) a variable of sort \({\textit{QFFormula}}\), the following invocation solves the inductive reachability problem in Condition (2):

$$\begin{aligned} {\textit{search}}\;\; \langle t; \varphi _1 \rangle \rightarrow ^* \langle {u}^{{\circ }}; F \rangle \;\textit{ such that } \; {\textit{sat}}(F \wedge \varphi _2). \end{aligned}$$
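The shape of such a search can be sketched in Python as a breadth-first exploration of constrained terms. This is a minimal sketch under assumptions and does not reproduce Maude's search command: the `level`/`alarm` rules are hypothetical, variables are numbered 0, 1, 2, ..., and `witness` enumerates naturals below `BOUND` as a stand-in for the function \({\textit{sat}}\).

```python
from collections import deque
from itertools import product

BOUND = 8  # assumption: naturals below BOUND stand in for an SMT query


def witness(constraints, n_vars):
    """Return a satisfying assignment (tuple indexed by variable) or None."""
    for env in product(range(BOUND), repeat=n_vars):
        if all(check(env) for check in constraints):
            return env
    return None


def successors(kind, x, n_vars):
    """Hypothetical rules; variable number n_vars is the fresh variable."""
    if kind != "level":
        return []
    z = n_vars
    return [
        # level(x) -> level(z) if x < z <= x + 2   (z comes from the environment)
        ("level", z, n_vars + 1, lambda env: env[x] < env[z] <= env[x] + 2),
        # level(x) -> alarm(x) if x >= 4
        ("alarm", x, n_vars, lambda env: env[x] >= 4),
    ]


def search(initial, goal_kind, goal_check, max_depth=5):
    """Breadth-first search over constrained terms <state; constraints>."""
    queue = deque([(initial, 0)])
    while queue:
        (kind, x, n_vars, constraints), depth = queue.popleft()
        if kind == goal_kind:
            # Match the goal pattern, then check the analogue of sat(F /\ phi_2).
            goal = constraints + [lambda env: goal_check(env[x])]
            found = witness(goal, n_vars)
            if found is not None:
                return depth, found
        if depth < max_depth:
            for new_kind, y, m, guard in successors(kind, x, n_vars):
                new_constraints = constraints + [guard]
                if witness(new_constraints, m) is not None:
                    queue.append(((new_kind, y, m, new_constraints), depth + 1))
    return None


# Initial constrained term: <level(v0); v0 = 0>; goal: alarm(u) with u >= 5.
initial = ("level", 0, 1, [lambda env: env[0] == 0])
result = search(initial, "alarm", lambda value: value >= 5)
print(result)
```

For this toy system the search reports a goal at depth 4 together with a satisfying assignment for the accumulated constraint. Such an assignment can be used to instantiate the symbolic path into a concrete one. Because the enumeration is bounded, a `None` result here does not establish unreachability, unlike a query answered by a decision procedure.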

6 Analysis of the CASH Algorithm

This section presents an example, developed jointly with Kyungmin Bae, of a real-time system that can be symbolically analyzed in the prototype tool described in Sect. 5. The analysis applies model checking based on rewriting modulo SMT. Some details are omitted. Full details and the prototype tool can be found in [9].

The example involves the symbolic analysis of the CASH scheduling algorithm [13], which attempts to maximize system performance while guaranteeing that critical tasks are executed in a timely manner. This is achieved by maintaining a queue of unused execution budgets that can be reused by other jobs to maximize processor utilization. CASH poses non-trivial modeling and analysis challenges because it contains an unbounded queue. Unbounded data types cannot be modeled in timed-automata formalisms, such as those of UPPAAL [22] or Kronos [39], which assume a finite discrete state.

The CASH algorithm was specified and analyzed in Real-Time Maude by explicit-state model checking in an earlier paper by Ölveczky and Caccamo [30], which showed that, under certain variations on both the assumptions and the design of the protocol, it could miss deadlines. Explicit-state model checking has intrinsic limitations which the new analysis by rewriting modulo SMT presented below overcomes. The CASH algorithm is parametrized by: (i) the number \(N\) of servers in the system, and (ii) the values of a maximum budget \(b_i\) and period \(p_i\), for each server \(1 \le i \le N\). Even if \(N\) is fixed, there are infinitely many initial states for \(N\) servers, since the maximum budgets \(b_i\) and periods \(p_i\) range over the natural numbers. Therefore, explicit-state model checking cannot perform a full analysis. If a counterexample for \(N\) servers exists, it may be found by explicit-state model checking for some chosen initial states, as done in [31], but it could be missed if the wrong initial states are chosen.

Rewriting modulo SMT is useful for symbolically analyzing infinite-state systems like CASH. Infinite sets of states are symbolically described by terms which may involve user-definable data structures such as queues, but whose only variables range over decidable types for which an SMT solving procedure is available. For the CASH algorithm, the built-in data types used are the Booleans (sort iBool) and the integers (sort iInt). Integer built-in terms are used to model discrete time. Boolean built-in terms are used to impose constraints on integers.

A symbolic state is a pair {iB,Cnf} of sort Sys consisting of a Boolean constraint iB, with and denoted , and a multiset configuration of objects Cnf, with multiset union denoted by juxtaposition, where each object is a record-like structure with an object identifier, a class name, and a set of attribute-value pairs. In each object configuration there is a global object (of class global) that models the time of the system (with attribute name time), the priority queue (with attribute name cq), the availability (with attribute name available), and a deadline missed flag (with attribute name deadline-miss). A configuration can also contain any number of server objects (of class server). Each server object models the maximum budget (the maximum time within which a given job will be finished, with attribute name maxBudget), period (with attribute name period), internal state (with attribute name state), time executed (with attribute name timeExecuted), budget time used (with attribute name usedOfBudget), and time to deadline (with attribute name timeToDeadline). The symbolic transitions of CASH are specified by 14 conditional rewrite rules whose conditions specify constraints solvable by the SMT decision procedure. For example, rule [deadlineMiss] below models the detection of a deadline miss for a server with non-zero maximum budget.

figure a

That is, the protocol misses a deadline for server S whenever the value of attribute maxBudget exceeds the addition of values for usedOfBudget and timeToDeadline (i.e., iNZT > iT + iT’), so that the allocated execution time cannot be exhausted before the server’s deadline.

The goal is to verify symbolically the existence of missed deadlines of the CASH algorithm for the infinite set of initial configurations containing two server objects \(s_0\) and \(s_1\) with maximum budgets \(b_0\) and \(b_1\) and periods \(p_0\) and \(p_1\) as unspecified natural numbers, and such that each server’s maximum budget is strictly smaller than its period (i.e., \(0 \le b_0 < p_0 \wedge 0 \le b_1 < p_1\)). This infinite set of initial states is specified symbolically by the equational definition (not shown) of term symbinit. Maude’s search command can then be used to symbolically check if there is a reachable state for any ground instance of symbinit that misses the deadline:

figure b

A counterexample is found at (modeling) time two, after exploring 233 symbolic states in less than 3 seconds. By using a satisfiability witness of the constraint iB computed by the search command, a concrete counterexample is found by exploring only 54 ground states. This result compares favorably, in both time and computational resources, with the ground counterexample found by explicit-state model checking in [30], where more than 52,000 concrete states were explored before finding a counterexample.

7 Related Work and Concluding Remarks

The idea of combining term rewriting/narrowing techniques and constrained data structures is an active area of research, especially since the advent of modern theorem provers with highly efficient decision procedures in the form of SMT solvers. The overall aim of these techniques is to advance applicability of methods in symbolic verification where the constraints are expressed in some logic that has an efficient decision procedure. In particular, the work presented here has strong similarities with the narrowing-based symbolic analysis of rewrite theories initiated in [26] and extended in [8]. The main difference is the replacement of narrowing by SMT solving and the decidability advantages of SMT for constraint solving.

M. Ayala-Rincón [5] investigates, in the setting of many-sorted equational logic, the expressiveness of conditional equational systems whose conditions may use built-in predicates. This class of equational theories is important because the combination of equational and built-in premises yields a type of clause that is more expressive than purely conditional equations. Rewriting notions like confluence, termination, and critical pairs are also investigated. S. Falke and D. Kapur [16] studied the problem of termination of rewriting with constrained built-ins. In particular, they extended the dependency pairs framework to handle termination of equational specifications with semantic data structures and evaluation strategies in the Maude functional sublanguage. The same authors used the idea of combining rewriting induction and linear arithmetic over constrained terms [17]. Their aim is to obtain equational decision procedures that can handle semantic data types represented by the constrained built-ins. H. Kirchner and C. Ringeissen proposed the notion of constrained rewriting and have used it by combining symbolic constraint solvers [20]. The main difference between their work and rewriting modulo SMT presented in this paper is that the former uses narrowing for symbolic execution, both at the symbolic ‘pattern matching’ and the constraint solving levels. In contrast, rewriting modulo SMT solves the symbolic pattern matching task by rewriting while constraint solving is delegated to an SMT decision procedure. More recently, C. Kop and N. Nishida [21] have proposed a way to unify the ideas regarding equational rewriting with logical constraints. More generally, while the approaches in [5, 16, 17, 20, 21] address symbolic reasoning for equational theorem proving purposes, none of them addresses the kind of non-deterministic rewrite rules needed for open system modeling. Also recently, A. Arusoaie et al. [4] have proposed a language-independent symbolic execution framework, within the \(K\) framework [23], for languages endowed with a formal operational semantics based on term rewriting. There, the built-in subtheories are the datatypes of a programming language and symbolic analysis is performed on constrained terms (called “patterns”); unification is also implemented by matching for a restricted class of rewrite rules and uses SMT solvers to check constraints.

This paper has presented rewrite theories modulo built-ins and has shown how they can be used for symbolically modeling and analyzing concurrent open systems, where non-deterministic values from the environment can be represented by built-in terms [33, 34]. In particular, the main contributions of this paper can be summarized as follows: (1) it presents rewriting modulo SMT as a new symbolic technique combining the powers of rewriting, SMT solving, and model checking; (2) this combined power can be applied to model and analyze systems outside the scope of each individual technique; (3) in particular, it is ideally suited to model and analyze the challenging case of open systems; and (4) because of its reflective reduction to standard rewriting, current algorithms and tools for model checking closed systems can be reused in this new symbolic setting without requiring any changes to their implementation.

Under reasonable assumptions, including decidability of \(\mathcal {E}_{0}^+\), a rewrite theory modulo built-ins is executable by term rewriting modulo SMT. This feature makes it possible to use, for symbolic analysis, state-of-the-art tools already available for Maude, such as its state-space search commands, with no change whatsoever required to use such tools. We have proved that the symbolic rewrite relation is sound and complete with respect to its ground counterpart, have presented an overview of the prototype that offers support for rewriting modulo SMT in Maude, and have presented a case study on the symbolic analysis of the CASH scheduling algorithm illustrating the use of these techniques.

Future work on a mature implementation and on extending the idea of rewriting modulo SMT with other symbolic constraint solving techniques such as narrowing modulo should be pursued. Also, the extension to symbolic LTL model checking, together with state space reduction techniques, should be investigated. The ideas presented here extend results in [33] and have been successfully applied to the symbolic analysis of NASA’s PLEXIL language to program open cyber-physical systems [33]. Future applications to PLEXIL and other languages should also be pursued.