Keywords

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

1 Introduction

Since its original formulation [1] by José Meseguer, Rewriting Logic (rwl) has been defined both as a semantical framework, suitable for describing concurrent and distributed systems, and as a logical framework, i.e. a meta-logic where other logics can be naturally represented. Both directions have dynamic and strong development. A concurrent system is specified by a rewrite theory \((\varSigma , E,R)\), where \(\varSigma \) defines the syntax of the system and of its states, E defines the states of the system as an algebraic data type, and R is a set of rewriting rules defining the local transitions of the system. rwl deduction consists of a set of inference rules used to prove sequents \(t\rightarrow t'\), meaning that the term-pattern t “becomes” \(t'\) by concurrently applying the rewrite rules R modulo the equations E. Here the relation “becomes” refers the dynamics of the specified concurrent system.

Maude [2] is the most well-known and most used implementation of rwl. Maude is accompanied by a set of tools for analysing rewrite theories and the systems they describe.

Reachability Logic (hereafter, rl) was introduced in a series of papers [36] as means for specifying the operational semantics of programming languages and for stating reachability properties between states of program executions. Reachability Logic is a very promising framework. It has emerged from the Rewriting Logic Semantics project introduced by José Meseguer and Grigore Roşu [7]. Briefly, an rl formula \({\varphi } \Rightarrow {\varphi '}\) expresses reachability relationships between two sets of states of a system, denoted by the patterns \(\varphi \) and \(\varphi '\) respectively. Depending on the interpretation of formulas, the relationships can express either programming-language semantic steps, or safety properties of programs in the languages in question. Several widely-used languages (including C [8], Java [9]) have been completely specified using an rl-based semantics in the \(\mathbb {K}\) tool [10], and nontrivial programs have been proved using an implementation of rl Footnote 1.

Contributions. In this paper we show that rl can be used beyond its (by now, traditional) domain of programming languages. Specifically, we adapt rl for stating properties of systems described in Rewriting Logic [11] (hereafter, rwl). We propose a procedure for proving rl properties of rwl specifications, we prove the soundness of our procedure, and illustrate its use by verifying rl properties of a communication protocol written in Maude [2].

Our contribution with respect to rl is a proved-sound verification procedure. Previous works [36] include sound and relatively complete proof systems for various versions of rl, but these systems lack strategies for rule applications, making them unpractical for verification; our procedure can be seen as such a strategy.

With respect to rwl, our contribution is the adaptation of the above procedure for verifying rwl theories against reachability properties \({\varphi } \Rightarrow {\varphi '}\), which say that all terminating executions starting from a state in the set \(\varphi \) eventually reach a state in the set \(\varphi '\). Both \(\varphi \) and \(\varphi '\) denote possibly infinite sets of states. We note that rl properties for rwl theories are different from the reachability properties that can be checked in Maude using the search command or the Linear Temporal Logic (ltl) model checker [2]. The difference resides in the possibility of using first-order logic for constraining the initial and the final state terms, and in the interpretation of rl formulas. Specifically, the version of rl that we consider (the all-paths interpretation) corresponds to a subset of ltl interpreted on finite paths, whereas Maude’s ltl model checker uses the standard (infinite-paths) interpretation of ltl; and both the model checker and the search command are bound to checking reachability properties starting from finitely many initial states by exploring finitely many execution paths.

Related work. We focus only on related verification approaches for rwl specifications. These fall under the usual classification of verification techniques: algorithmic ones, which essentially consist in using an automatic model checker; deductive ones, which involve an interaction with a theorem prover; and abstraction-based ones, which consist in first reducing the state-space of a system from unmanageable (e.g., infinite/large) to manageable (e.g., finite/small), a step that typically involves human interaction, and then using a model checker on the reduced system.

Algorithmic techniques include Maude’s finite-state ltl model checker [12] with its more recent extensions to the temporal logic of rewriting [13], and a narrowing-based symbolic model-checker for handling classes of infinite-state systems [14]. Among the deductive techniques, [15, 16] propose two different approaches for reducing safety properties of rwl to equational reasoning, and then using equational reasoning tools for proving the resulting encoded properties. We note that the encoding of rwl into equational logic was proposed earlier in [17] for defining the semantics of rwl. Among the abstraction-based techniques, equational abstractions [18], and algebraic simulations [19] are key contributions.

Finally, our verification procedure uses an operation called derivative that consists in computing the symbolic successors of a given set of states (represented by a formula). This symbolic computation is inspired from [20, 21].

Outline. After preliminaries in Sect. 2, we introduce in Sect. 3 the notion of derivative, which is essential for our approach. We then introduce in Sect. 4 our procedure for verifying rl properties on transition systems also defined by rl formulas, and state its soundness. In Sect. 5, we adapt our approach to transition systems defined by rwl theories. We illustrate in Sect. 6 the usability of our procedure by applying it to a communication protocol described in Maude. Proofs of the technical results are included in an Appendix.

2 Preliminaries

2.1 Matching Logic

We recall the syntax and the semantics of Matching Logic (ml) as presented in [3]. Since ml is based on the many-sorted first-order logic (fol), we recall first the basic definitions from fol.

Given S a set of sorts, an S-sorted first order signature \(\varPhi \) is a pair \((\varSigma , \varPi )\), where \(\varSigma \) is an algebraic S-sorted signature and \(\varPi \) is an indexed set of the form \(\{\varPi _w \mid w \in S^*\}\) whose elements are called predicate symbols, where \(\pi \in \varPi _w\) is said to have arity w. A \(\varPhi \)-model consists of a \(\varSigma \)-algebra M together with a subset \(M_p \subseteq M_{s_1} \times \cdots \times M_{s_n}\) for each predicate \(p \in \varPi _w\), where \(w=s_1\ldots s_n\).

Next, we define the syntax of fol formulas over a first order signature \(\varPhi =(\varSigma , \varPi )\) and a possible infinite S-indexed set of variables \( Var \). We let S denote the set of sorts in \(\varPhi \) and \(T_\varSigma ( Var )\) the algebra of \(\varSigma \)-terms with variables in \( Var \).

The set of \(\varPhi \)-formulas is defined by

$$\begin{aligned} \phi \,\,{:}:= \top \mid p(t_1,\ldots ,t_n)\mid \lnot \phi \mid \phi \wedge \phi \mid (\exists V) \phi \end{aligned}$$

where p ranges over predicate symbols \(\varPi \), each \(t_i\) ranges over \(T_\varSigma ( Var )\) of appropriate sort, and V over finite subsets of \( Var \).

Given a first order \(\varPhi \)-model M,a \(\varPhi \)-formula \(\phi \), and a valuation \(\rho : Var \rightarrow M\), the satisfaction relation \( \rho \,\models \, \phi \) is defined as follows:

  1. 1.

    \(\rho \,\models \, \top \);

  2. 2.

    \(\rho \,\models \,p(t_1,\ldots ,t_n)\) iff \((\rho (t_1),\ldots ,\rho (t_n))\in M_p\);

  3. 3.

    \(\rho \,\models \,\lnot \phi \) iff \(\rho \,\models \phi \) does not hold;

  4. 4.

    \(\rho \,\models \,\phi _1\wedge \phi _2\) iff \(\rho \,\models \phi _1\) and \(\rho \,\models \,\phi _2\);

  5. 5.

    \(\rho \,\models \,(\exists V)\phi \) iff there is \(\rho ': Var \rightarrow M\) with \(\rho '(X)=\rho (X)\), for all \(X\not \in V\), such that \(\rho ' \,\models \,\phi \).

A formula \(\phi \) is valid in M, denoted by \(M \,\models \,\phi \), if it is satisfied by all valuations \(\rho \).

We recall below the ml concepts and results used in this paper. Their presentation is based on  [3].

Definition 1

( ml Formulas). An ml signature \(\varPhi =(\varSigma ,\varPi , State )\) is a first-order signature \((\varSigma ,\varPi )\) together with a distinguished sort \( State \) for states. The set of ml-formulas over \(\varPhi \) is defined by

$$\begin{aligned} {\varphi \,\, {:}:= \pi \mid \top \mid p(t_1,\ldots ,t_n) \mid \lnot \varphi \mid \varphi \wedge \varphi \mid (\exists V) \varphi } \end{aligned}$$

where the basic pattern \(\pi \) ranges over \(T_{\varSigma , State }( Var )\), p ranges over predicate symbols \(\varPi \), each \(t_i\) ranges over \(T_{\varSigma }( Var )\) of appropriate sorts, and V over finite subsets of \( Var \).

The sort \( State \) is intended to model system states. The free occurrence of variables in ml formulas is defined as usual (i.e., like in fol) and we let \( FreeVars (\varphi )\) denote the set of variables freely occurring in \(\varphi \). We often use particular ml formulas \(\varphi \triangleq \pi \wedge \phi \), where \(\pi \) represents a state and \(\phi \) is a fol formula used for constraining this state.

Example 1

Assume that S includes the sorts \( Nat , State \), \(\varSigma \) includes a binary operation symbol \(\langle \_,\_\rangle : Nat \times Nat \rightarrow State \), and \(\varPi \) the predicate symbols \( div \) and \({\_{>}\_}\), with arguments of sort \( Nat \). Then \(\varphi \triangleq \langle x,y\rangle \wedge (\exists z)((z>1)\wedge div (z,x)\wedge div (z,y))\) is an ml formula. We have \( FreeVars (\varphi )=\{x,y\}\).

Definition 2

( ml satisfaction relation). Given \(\varPhi =(\varSigma , \varPi , State )\) an ml signature, M a \((\varSigma , \varPi )\)-model, \(\varphi \) an ml formula over \(\varPhi \), \(\gamma \in M_ State \) a state, and \(\rho : Var \rightarrow M\) a valuation, the satisfaction relation \((\gamma , \rho ) \,\models \varphi \) is defined as follows:

  1. 1.

    \((\gamma , \rho ) \,\models \pi \) iff \(\rho (\pi ) = \gamma \);

  2. 2.

    \((\gamma , \rho ) \,\models \top \);

  3. 3.

    \((\gamma , \rho ) \,\models p(t_1,\ldots ,t_n)\) iff \((\rho (t_1),\ldots ,\rho (t_n))\in M_p\);

  4. 4.

    \((\gamma , \rho ) \,\models \lnot \varphi \) iff \((\gamma , \rho ) \,\models \varphi \) does not hold;

  5. 5.

    \((\gamma , \rho ) \,\models \varphi _1\wedge \varphi _2\) iff \((\gamma , \rho ) \,\models \varphi _1\) and \((\gamma , \rho ) \,\models \varphi _2\); and

  6. 6.

    \((\gamma , \rho ) \models (\exists V)\varphi \) iff there is \(\rho ': Var \rightarrow M\) with \(\rho '(X)=\rho (X)\), for all \(X\not \in V\), such that \((\gamma , \rho ') \models \varphi \).

Example 2

Let M be a model defined such that \(M_{ Nat}\) is the set of natural numbers, \(M_<\) is the inequality “greater than” over natural numbers, and \(M_{ div}(m,n)\) holds iff m divides n. Let \(\varphi \) denote the ml formula \(\langle x,y\rangle \wedge ((\exists z)(z>1)\wedge div (z,x)\wedge div (z,y))\). If we consider \(\rho (x)=4\) and \(\rho (y)=6\) then we have \((\langle 4,6\rangle ,\rho ) \models \varphi \) because \(\rho (\langle x,y\rangle )=\langle 4,6\rangle \) and \(\rho \models ((\exists z)(z>1)\wedge div (z,x)\wedge div (z,y))\). We do not have \((\langle 3,5\rangle ,\rho )\models \varphi \) because \(\rho (\langle x,y\rangle )\not =\langle 3,5\rangle \). Even if we consider \(\rho '(x)=3\) and \(\rho '(y)=5\) we still have \((\langle 3,5\rangle ,\rho ')\not \models \varphi \) because there is no m greater than 1 such that \(M_{ div}(m,3)\) and \(M_{ div}(m,5)\) hold.

Definition 3

( fol encoding of ml ). If \(\varphi \) is an ml-formula then \(\varphi ^{=?}\) is the fol formula \((\exists z)\varphi '\), where \(\varphi '\) is obtained from \(\varphi \) by replacing each basic pattern occurrence \(\pi \) with \(z=\pi \), and z is a variable that does not occur in \(\varphi \).

Example 3

Here are a few examples of ml formulas and their fol encodings:

The relationship between ml formulas and their fol encodings is given by the following result:

Proposition 1

\(\rho \models \varphi ^{=?}\) iff there is \(\gamma \) such that \((\gamma ,\rho )\models \varphi \).

The following proposition is needed later for our soundness result.

Proposition 2

If \(\phi \) is a fol (i.e. structureless) formula, then \((\phi \wedge \varphi )^{=?}\) is equivalent to \(\phi \wedge \varphi ^{=?}\). Moreover, if \(\rho \models \phi \) and \((\gamma , \rho ) \models \varphi \) then \((\gamma , \rho ) \models \phi \wedge \varphi \).

Example 4

We consider first an ml formula including two state terms. Let \(\varphi \) denote the ml formula \(\langle x,y\rangle \wedge x < 5 \wedge \langle u,v\rangle \wedge 8 < v\). Then \(\varphi ^{=?}\) is equivalent to \((\exists z)z=\langle x,y\rangle \wedge z = \langle u,v\rangle \wedge x < 5 \wedge 8 < v\), which in turn is equivalent to \(\langle x,y\rangle \wedge = \langle u,v\rangle \wedge x < 5 \wedge 8 < v\). We have \(\rho \models \varphi ^{=?}\) iff \(\rho (\langle x,y\rangle ) = \rho (\langle u,v\rangle )\wedge \rho (x)< 5\wedge \rho (v) > 8\) iff \((\gamma ,\rho ) \models \varphi \), where \(\gamma =\rho (\langle x,y\rangle ) = \rho (\langle u,v\rangle )\).

If \(\varphi _1\) is an ml formula not including a state term (i.e., it is structureless according to ml terminology), then \(\varphi ^{=?}_1\) is the same with \(\varphi _1\).

2.2 Reachability Logic

In this section we recall reachability-logic formulas, the transition systems that they induce, and their all-paths interpretation [6]. We consider a fixed ml signature \(\varPhi = (\varSigma ,\varPi , State )\), a set of variables \( Var \), and a fixed \(\varPhi \)-model M.

Definition 4

( rl Formulas). An rl formula is a pair \({\varphi } \Rightarrow {\varphi '}\) of ml-formulas.

Definition 5

(Transition-System Specification). An rl transition-system specification is a set \(\mathcal {S}\) of rl-formulas. The transition system defined by \(\mathcal {S}\) over M is \((M_ State ,\Rightarrow ^{\!}_{\!\mathcal {S}})\), where \({\Rightarrow ^{\!}_{\!\mathcal {S}}}=\{(\gamma ,\gamma ')\mid (\exists {\varphi } \Rightarrow {\varphi '}\in \mathcal {S})(\exists \rho )(\gamma ,\rho ) \models \varphi \wedge (\gamma ',\rho )\models \varphi '\}\). We write \(\gamma \Rightarrow ^{\!}_{\!\mathcal {S}}\gamma '\) for \((\gamma ,\gamma ')\in {\Rightarrow ^{\!}_{\!\mathcal {S}}}\).

Example 5

The following set of rules is meant to compute the gcd of two natural numbers:

$$\begin{aligned} \mathcal {S}=\{&{\langle x,y\rangle \wedge x > y\wedge y >0} \Rightarrow {(\exists k)\langle y, x - k*y\rangle \wedge x\ge k*y\wedge k >0},\\&{\langle x,y\rangle \wedge y \ge x} \Rightarrow {\langle y,x\rangle }\} \end{aligned}$$

We further assume that M interprets \(+\) and \(*\) as being the usual operations over natural numbers; \(m-n\) is the difference between m and n, if \(m>n\), or 0, otherwise. Examples of transitions are: \(\langle 8,2\rangle \Rightarrow ^{\!}_{\!\mathcal {S}}\langle 2,0\rangle \) as instance of the first rule, and \(\langle 8,10\rangle \Rightarrow ^{\!}_{\!\mathcal {S}}\langle 10,8\rangle \) and \(\langle 2,2\rangle \Rightarrow ^{\!}_{\!\mathcal {S}}\langle 2,2\rangle \) as instances of the second rule.

In the sequel we consider a fixed transition system \((M_ State ,\Rightarrow ^{\!}_{\!\mathcal {S}})\).

Definition 6

(Execution Paths). An execution path is a (possibly infinite) sequence of transitions \(\tau \triangleq \gamma _0\Rightarrow ^{\!}_{\!\mathcal {S}}\gamma _1\Rightarrow ^{\!}_{\!\mathcal {S}}\cdots \).

If \(i\ge 0\) then \(\tau |_{i..}\) is the execution path consisting of the (possibly infinite) subsequence starting from \(\gamma _i\), if any. An execution path is complete iff it is not a strict prefix of an another execution path.

A pair \((\tau ,\rho )\), consisting of an execution path \(\tau \triangleq \gamma _0\Rightarrow ^{\!}_{\!\mathcal {S}}\cdots \) and a valuation \(\rho \), starts from an ml formula \(\varphi \) if \((\gamma _0,\rho )\models \varphi \).

Example 6

Examples of executions are \(\tau \triangleq \langle 8,10\rangle \Rightarrow ^{\!}_{\!\mathcal {S}}\langle 10,8\rangle \Rightarrow ^{\!}_{\!\mathcal {S}}\langle 8,2\rangle \Rightarrow ^{\!}_{\!\mathcal {S}}\langle 2,0\rangle \) and \(\tau '\triangleq \langle 8,10\rangle \Rightarrow ^{\!}_{\!\mathcal {S}}\langle 10,8\rangle \Rightarrow ^{\!}_{\!\mathcal {S}}\langle 8,2\rangle \Rightarrow ^{\!}_{\!\mathcal {S}}\langle 2,2\rangle \Rightarrow ^{\!}_{\!\mathcal {S}}\langle 2,2\rangle \Rightarrow ^{\!}_{\!\mathcal {S}}\cdots \). Both executions are complete.

Since an infinite execution path cannot be the prefix of an another one, it follows that infinite execution paths are complete and hence the above definition is slightly different from that given in [6].

Definition 7

(All-Paths Interpretation of rl formulas). We say that a pair \((\tau ,\rho )\) satisfies an rl formula \({\varphi } \Rightarrow {\varphi '}\), written \((\tau ,\rho )\models {\varphi } \Rightarrow {\varphi '}\), iff \((\tau ,\rho )\) starts from \(\varphi \) and one of the following two conditions holds: there exists \(i\ge 0\) such that \((\gamma _i,\rho )\models \varphi '\) or \(\tau \) is infinite. We say that \({\Rightarrow ^{\!}_{\!\mathcal {S}}}\) satisfies \({\varphi } \Rightarrow {\varphi '}\), written \({\Rightarrow ^{\!}_{\!\mathcal {S}}}\models {\varphi } \Rightarrow {\varphi '}\), iff \((\tau ,\rho ) \models {\varphi } \Rightarrow {\varphi '}\) for all \((\tau ,\rho )\) starting from \(\varphi \) with \(\tau \) complete.

We let \([\![{\varphi } \Rightarrow {\varphi '}]\!]\triangleq \{\tau \mid (\exists \rho )(\tau ,\rho )\models {\varphi } \Rightarrow {\varphi '}\}\). If F is a set of rl formulas, then \([\![F]\!]=\bigcup _{{\varphi } \Rightarrow {\varphi '}\in F}[\![{\varphi } \Rightarrow {\varphi '}]\!]\).

Example 7

An rl formula specifying that any execution path satisfying it computes the greatest common divisor (gcd) is \({\langle x,y\rangle } \Rightarrow {(\exists x',y')\langle x', y'\rangle \wedge gcd (x',x,y)}\), where \( gcd \) is a predicate symbol with the interpretation: \(M_{ gcd}(d,m,n)\) holds iff d is the gcd of m and n. If \(\rho (x)=10, \rho (y)=8\), \(\tau \) and \(\tau '\) are the execution paths defined in Example 6, then both \((\tau , \rho )\) and \((\tau ',\rho )\) satisfy the given formula.

The definition of all-paths interpretation of ml formulas given above is slightly different from that given in [6], where \({\Rightarrow ^{\!}_{\!\mathcal {S}}} \models {\varphi } \Rightarrow {\varphi '}\) iff \((\tau ,\rho ) \models {\varphi } \Rightarrow {\varphi '}\) for all \((\tau ,\rho )\) starting from \(\varphi \) with \(\tau \) finite and complete. By contrast, our definition lets infinite paths satisfy formulas vacuously. The infinite paths are introduced from technical reasons,e.g., in order to prove \((\tau ,\rho ) \models {\varphi } \Rightarrow {\varphi '}\) we do not need to prove or know that \(\tau \) is finite and complete.

2.3 Rewrite Theories

In this paper we propose a new approach for analysing rewrite theories. Our approach is dedicated to rewrite theories that model systems having at least some terminating executions, since nonterminating executions satisfy rl formulas vacuously. Communication protocols, such as the one we illustrate our approach on later in the paper, are an example of such systems. A counterexample are reactive systems, which (in the ideal case) execute forever in interaction with their environment.

Here we briefly recall the definition for (a particular case of) rewrite theories and their rewriting relation. A rewrite theory \(\mathcal {R}=(\varSigma ,E\cup A,R)\) consists of a signature \(\varSigma \), a set of equations E, a set of axioms A, e.g., associativity, commutativity, identity or combinations of these, a set of rewrite rules R of the form \(l\rightarrow r\mathbf ~if~ b\), where l and r are terms with variables and b is a term of a distinguished sort Bool. We further assume that there is a special constant \( true \) of sort Bool.

In this paper we shall consider rewrite theories \(\mathcal {R}= (\varSigma , E \cup A, R)\) with a distinguished sort \( State \) such that R is topmost w.r.t. \( State \). Moreover, the actual theories \(\mathcal {R}\) that we can analyse impose some additional technical restrictions on their components, which are briefly discussed in Sect. 5.

We use the standard notation for rwl artefacts: \(=_{E\cup A}\) denotes the equality modulo the equations given by E and A, \(T_{\varSigma ,E\cup A}(X)\) denotes the set of \(=_{E\cup A}\)-equivalences classes of \(\varSigma \)-terms with variables in X, \(T_{\varSigma ,E\cup A}\triangleq T_{\varSigma ,E\cup A}(\emptyset )\) is the set of \(=_{E\cup A}\)-equivalences classes of ground \(\varSigma \)-terms, and \( FreeVars (t)\) denotes the set of variables occurring in the term t Footnote 2. The relation \(\rightarrow _{\mathcal {R}}\) denotes the one-step rewriting relation defined by applying a rule from R modulo axioms \(E\cup A\) over ground terms of sort \( State \): \([u] \rightarrow _{\mathcal {R}} [v]\) iff there are a rule \(l\rightarrow r\mathbf ~if~ b\) in R and a (ground) substitution \(\sigma : FreeVars (l,r,b)\rightarrow T_{\varSigma ,E\cup A}\) such that \(\sigma (l)=_{E\cup A}\sigma (u)\), \(\sigma (r)=_{E\cup A}\sigma (v)\), and \(\sigma (b)=_{E\cup A} true \).

3 Derivatives of ml and rl Formulas

The notion of derivative is essential for our approach. Roughly speaking, the derivative of a formula specifies states/execution paths obtained from those satisfying the initial formula after executing one step. For the remaining part of this section we consider a fixed transition system specification \(\mathcal {S}\) and its associated transition system \((M_ State ,\Rightarrow ^{\!}_{\!\mathcal {S}})\) over a fixed model M.

Assumption 1

In what follows we consider only ml formulas \(\varphi \) with the following property: if \(\varphi \) does not occur as a member of a rule in \(\mathcal {S}\) and \({\varphi _l} \Rightarrow {\varphi _r}\in \mathcal {S}\) then \( FreeVars (\varphi )\cap FreeVars (\varphi _l,\varphi _r)=\emptyset \). This is not a real restriction since the free variable in rules can always be renamed.

Definition 8

(Semantic Definition of Derivatives for rl Formulas). We say that \({\varphi _1} \Rightarrow {\varphi '}\) is a \(\mathcal {S}\)-derivative of \({\varphi } \Rightarrow {\varphi '}\) if for all \((\tau _1,\rho ) \models {\varphi _1} \Rightarrow {\varphi '}\) there is \((\tau ,\rho ) \models {\varphi } \Rightarrow {\varphi '}\) such that \(\tau _1=\tau |_{1..}\).

Example 8

An \(\mathcal {S}\)-derivative for \({\langle x,y\rangle } \Rightarrow {(\exists x',y')\langle x', y'\rangle \wedge gcd (x',x,y)}\) is the following formula: \({\langle y, x-y\rangle \wedge x > y\wedge y >0} \Rightarrow {(\exists x',y')\langle x', y'\rangle \wedge gcd (x',x,y)}\).

Definition 9

(Complete Sets of Derivatives). A set D of \(\mathcal {S}\)-derivatives of \({\varphi } \Rightarrow {\varphi '}\) is complete iff \([\![{\varphi _1} \Rightarrow {\varphi '}]\!]\subseteq [\![D]\!]\) for each \(\mathcal {S}\)-derivative \({\varphi _1} \Rightarrow {\varphi '}\) of \({\varphi } \Rightarrow {\varphi '}\).

Example 9

The set

$$\begin{aligned} \{&{(\exists k) \langle y, x - k*y\rangle \wedge y > 0 \wedge x\ge k*y\wedge k >0} \Rightarrow {(\exists x',y')\langle x', y'\rangle \wedge gcd (x',x,y)},\\&{\langle y, x\rangle \wedge y \ge x} \Rightarrow {(\exists x',y')\langle x', y'\rangle \wedge gcd (x',x,y)}\} \end{aligned}$$

is a complete set of \(\mathcal {S}\)-derivatives for \({\langle x,y\rangle } \Rightarrow {(\exists x',y')\langle x', y'\rangle \wedge gcd (x',x,y)}\).

The next definition and lemma provide us with syntactical means of computing complete sets of derivates for rl formulas.

Definition 10

(Syntactic Definition of Derivative for rl Formulas). If \(\varphi \) is an ml formula then

$$\begin{aligned} \varDelta _\mathcal {S}(\varphi )\triangleq \{(\exists FreeVars (\varphi _l,\varphi _r))(\varphi _l\wedge \varphi )^{=?}\wedge \varphi _r\mid {\varphi _l} \Rightarrow {\varphi _r}\in \mathcal {S}\}. \end{aligned}$$

If \({\varphi } \Rightarrow {\varphi '}\) is an rl-formula then

$$\begin{aligned} \varDelta _\mathcal {S}({\varphi } \Rightarrow {\varphi '}) \triangleq \{{\varphi _1} \Rightarrow {\varphi '}\mid \varphi _1\in \varDelta _\mathcal {S}(\varphi )\}. \end{aligned}$$

Lemma 1

If \(\varphi _1\in \varDelta _\mathcal {S}(\varphi )\) then \({\varphi _1} \Rightarrow {\varphi '}\) is an \(\mathcal {S}\)-derivative of \({\varphi } \Rightarrow {\varphi '}\).

Lemma 2

Let \({\varphi _1} \Rightarrow {\varphi '}\) be an \(\mathcal {S}\)-derivative of \({\varphi } \Rightarrow {\varphi '}\), \(\tau _1\) be an execution path and \(\rho \) a valuation. If \((\tau _1,\rho )\models {\varphi _1} \Rightarrow {\varphi '}\) then there is \(\varphi '_1\in \varDelta _\mathcal {S}(\varphi )\) such that \((\tau _1,\rho )\models {\varphi '_1} \Rightarrow {\varphi '}\).

From Lemmas 1 and  2 we directly obtain:

Proposition 3

\(\varDelta _\mathcal {S}({\varphi } \Rightarrow {\varphi '})\) is a complete set of \(\mathcal {S}\)-derivatives for \({\varphi } \Rightarrow {\varphi '}\).

Example 10

$$\begin{aligned} \varDelta _\mathcal {S}(\langle x, y\rangle {\wedge } y\ge 0){=}\{&(\exists x',y',k) \langle y', x' {-} k*y'\rangle \wedge \langle x',y'\rangle {=}\langle x,y\rangle \wedge y' > 0\wedge x'>y'\\&\wedge x'\ge k*y'\wedge k >0\wedge y\ge 0,\\&(\exists x',y') \langle y', x'\rangle \wedge \langle x',y'\rangle =\langle x,y\rangle \wedge y'\ge x'\wedge y\ge 0\}, \end{aligned}$$

which can be simplified to

$$\begin{aligned} \varDelta _\mathcal {S}(\langle x, y\rangle \wedge y\ge 0)=\{&(\exists k) \langle y, x - k*y\rangle \wedge y > 0 \wedge x\ge k*y\wedge k >0,\\&\langle y, x\rangle \wedge y\ge x\wedge y\ge 0\}, \end{aligned}$$

using the implications \(M\models \langle x',y'\rangle =\langle x,y\rangle \longrightarrow (x=x'\wedge y=y')\), \(M\models (x\ge k*y\wedge k >0)\longrightarrow x > y\) and \(M\models y >0\longrightarrow y\ge 0\), where M is the model defined in the previous examples, and \(\longrightarrow \) is the usual implication in fol.

\(\varDelta _\mathcal {S}({\langle x,y\rangle } \Rightarrow {(\exists x',y')\langle x', y'\rangle \wedge gcd (x',x,y)})\) is the set given in Example 9.

The following definition of \(\mathcal {S}\)-derivability is used in our verification procedure for rl formulas. The lemma following it gives an equivalent characterisation in terms of fol, which enables the checking of \(\mathcal {S}\)-derivability using smt solvers.

Definition 11

( \(\mathcal {S}\) -derivability of ml -formulas). An ml formula \(\varphi \) is \(\mathcal {S}\)-derivable iff there is at least a transition starting from it, i.e., there exist a model \((\gamma ,\rho )\models \varphi \) and a transition \(\gamma \Rightarrow ^{\!}_{\!\mathcal {S}}\gamma _1\).

Lemma 3

\(\varphi \) is \(\mathcal {S}\)-derivable iff \(\bigvee _{\varphi _1\in \varDelta _\mathcal {S}(\varphi )}\varphi _1^{=?}\) is satisfiable.

Lemma 3 also shows the strong relationship between the \(\mathcal {S}\)-derivability of a ml formula \(\varphi \) and the \(\mathcal {S}\)-derivatives of rl-formulas \({\varphi } \Rightarrow {\varphi '}\). Hence it does make sense to name the elements of the set \(\varDelta _\mathcal {S}(\varphi )\) as being \(\mathcal {S}\)-derivatives of \(\varphi \).

The notion of totality, defined below, is essential for the soundness of our verification procedure. Intuitively, a transition-system specification \(\mathcal {S}\) is total if its rules cover all models \((\gamma ,\rho )\) of any \(\mathcal {S}\)-derivable formula \(\varphi \). For instance, if \({\langle x,y\rangle \wedge y\not = 0} \Rightarrow {\langle y, x\,\%\,y\rangle }\in \mathcal {S}\) then in order to be total \(\mathcal {S}\) must also include a rule for the case \(y=0\).

Definition 12

\(\mathcal {S}\) is total iff for for each \(\mathcal {S}\)-derivable \(\varphi \) and each pair \((\gamma ,\rho )\) such that \((\gamma ,\rho )\models \varphi \), there is \(\gamma _1\) such that \(\gamma \Rightarrow ^{\!}_{\!\mathcal {S}}\gamma _1\).

Note the difference between \(\mathcal {S}\)-derivability and totality: \(\mathcal {S}\)-derivability requires to have at least one transition starting from \(\varphi \) and the totality requires to have at least one transition starting from \(\gamma \) for any model \((\gamma ,\rho )\) of \(\varphi \).

The next result enables the use of sit solvers for checking totality.

Proposition 4

\(\mathcal {S}\) is total iff for each \(\mathcal {S}\)-derivable \(\varphi \),

$$\begin{aligned} M\models \varphi ^{=?}\longrightarrow \bigvee _{\varphi _1\in \varDelta _\mathcal {S}(\varphi )}\varphi _1^{=?}. \end{aligned}$$

We note that in general smt solvers do not support theories for high-level algebraic structures. However, in practice, one can either introduce the theories in the solver, or use simplifications rules before sending the formulas to the solver.

4 A Procedure for Verifying rl Properties

We now introduce our procedure for verifying rl properties on transition systems also defined by rl formulas. We assume given an ml signature \(\varPhi \) and \(\varPhi \)-model M.

Fig. 1.
figure 1

rl verification procedure.

The soundness result, stated below, says that if the procedure returns \(\mathtt {success}\) when presented with a given input (consisting of a transition system rl specification \(\mathcal {S}\), a set of goals \(G_0\), and the \(\mathcal {S}\)-derivatives of \(G_0\), then the transition system \(\Rightarrow ^{\!}_{\!\mathcal {S}}\) satisfies all the goals. We note that this result is not a trivial consequence of the soundness of the rl proof system [6]; our initial attempts at proving soundness by reducing it to the soundness of the rl proof system showed that one step of our procedure corresponds to several (many) steps of the rl proof system. Thus, the soundness of several nontrivial derived rules of the rl proof system would have to be proved first before attempting to prove the soundness of our procedure. We thus opted for a direct proof.

Theorem 1

(Soundness). Let prove be the procedure given in Fig. 1. Assume that \(\mathcal {S}\) is total. Let \(G_0\) be such that for each \({\varphi _c} \Rightarrow {\varphi '_c} \in G_0\), \(\varphi _c\) is \(\mathcal {S}\)-derivable and satisfies \( FreeVars (\varphi '_c)\subseteq FreeVars (\varphi _c)\). If prove( \(\mathcal {S}, G_0, \varDelta _\mathcal {S}(G_0)\) ) returns \(\mathtt {success}\) then \({\Rightarrow ^{\!}_{\!\mathcal {S}}}\models G_0\).

Example 11

Let S be the rl specification defined in Example 5 and \(G_0\triangleq \{{\varphi _0} \Rightarrow {\varphi '_0}\}\), where \({\varphi _0} \Rightarrow {\varphi '_0}\) is

$$\begin{aligned} {\langle x,y\rangle \wedge y\ge 0} \Rightarrow {(\exists x',y')\langle x', y'\rangle \wedge gcd (x',x,y)}. \end{aligned}$$

We illustrate in a step-by-step manner the procedure call prove(\(\mathcal {S}, G_0, G\)), where G is initially \(\varDelta _\mathcal {S}(G_0)=\{{\varphi _1} \Rightarrow {\varphi '_0}, {\varphi _2} \Rightarrow {\varphi '_0}\}\), and

$$\begin{aligned} \varphi _1\triangleq {}&{(\exists k) \langle y, x - k*y\rangle \wedge y > 0 \wedge x\ge k*y\wedge k >0} \Rightarrow {\varphi '_0},\\ \varphi _2\triangleq {}&{\langle y, x\rangle \wedge y\ge x\wedge y\ge 0} \Rightarrow {\varphi '_0}\}. \end{aligned}$$

Let us consider that \({\varphi _1} \Rightarrow {\varphi '_0}\) is the current chosen goal from G. Obviously, \(M\models \varphi _1\longrightarrow \varphi '_0\) does not hold. Since \(M\models x\ge k*y\longrightarrow x -k*y\ge 0\), we obtain \(M\models \varphi _1\longrightarrow (\exists x',y')\langle x',y'\rangle \wedge y'\ge 0\) (i.e. the condition of the if statement at line 4 holds) and hence the goal \({\varphi _1} \Rightarrow {\varphi '_0}\) is replaced with \({\varphi _{11}} \Rightarrow {\varphi '_0}\) by the statement in line 5, where

$$\begin{aligned} \varphi _{11}\triangleq (\exists x_1,y_1,x'_1,y'_1,k) \langle x'_1,y'_1\rangle&\wedge gcd (x'_1,x_1,y_1)\wedge \langle y,x-k*y\rangle =\langle x_1, y_1\rangle \\&\wedge y > 0 \wedge x\ge k*y\wedge k >0 \end{aligned}$$
Fig. 2.
figure 2

The graph \(\mathcal {G}\) corresponding to the prove procedure call for Example 11

Since \(M\models ( gcd (x',y,x-k*y)\wedge k> 0\wedge x\ge k*y)\longrightarrow gcd (x',x,y)\), it follows that \(M\models \varphi _{11}\longrightarrow \varphi '_0\) and the goal \({\varphi _{11}} \Rightarrow {\varphi '_0}\) is removed from G by the if statement in line 2.

Now the only goal in G is \({\varphi _{2}} \Rightarrow {\varphi '_0}\). It is easy to see that \(M\not \models \varphi _{2}\longrightarrow \varphi '_0\) and \(M\not \models \varphi _{2}\longrightarrow (\exists FreeVars (\varphi _0'))\varphi '_0\), i.e. the conditions on then lines 3 and 4 do not hold. We have \(\varDelta _\mathcal {S}({\varphi _{2}} \Rightarrow {\varphi '_0})=\{{\varphi _{21}} \Rightarrow {\varphi '_0}, {\varphi _{22}} \Rightarrow {\varphi '_0}\}\), where

$$\begin{aligned} \varphi _{21}\triangleq {}&{(\exists k) \langle x, y - k*x\rangle \wedge x > 0 \wedge y\ge k*x\wedge k >0\wedge y > x} \Rightarrow {\varphi '_0},\\ \varphi _{22}\triangleq {}&{\langle x, y\rangle \wedge x\ge y \wedge y\ge x\wedge y\ge 0} \Rightarrow {\varphi '_0}\}. \end{aligned}$$

The \({\varphi _{21}} \Rightarrow {\varphi '_0}\) is processed in the same way like \({\varphi _{1}} \Rightarrow {\varphi '_0}\). Since \(M\models \varphi _{22}\longrightarrow (\exists x',y')\langle x',y'\rangle \wedge y'\ge 0\) and hence the goal \({\varphi _{22}} \Rightarrow {\varphi '_0}\) is replaced with \({\varphi _{221}} \Rightarrow {\varphi '_0}\) by the if statement in line 4, where

$$\begin{aligned} \varphi _{221}\triangleq (\exists x_1,y_1,x'_1,y'_1,k) \langle x'_1,y'_1\rangle&\wedge gcd (x'_1,x_1,y_1)\wedge \langle x, y\rangle =\langle x_1, y_1\rangle \\&\wedge x\ge y \wedge y\ge x\wedge y\ge 0 \end{aligned}$$

It is easy to see that \(M\models \varphi _{221}\longrightarrow \varphi '_{0}\) and hence the goal \({\varphi _{221}} \Rightarrow {\varphi '_0}\) is removed from G by the if statement in line 2.

Now the set of current goals G is empty and the execution of the procedure call prove \((\mathcal {S},G_0,\varDelta _\mathcal {S}(G_0))\) returns \(\mathtt {success}\). The execution of the procedure prove corresponding to this call is graphically represented in Fig. 2: the sinks correspond to the implications on line 3, the forward arrows correspond to the calls on line 7, and the backward arrows correspond to the calls on line 5 in the procedure. This graph covers the symbolic executions starting from \(\varphi _0\).

Remark 1

A sound approximated check for validity statements \(M \models \varphi \) is a procedure that, when presented with M and \(\varphi \), if it answers \( true \) then \(M \models \varphi \). We conjecture that Theorem 1 still holds when one uses sound approximated checks for the various validity statements occurring in the procedure (including \(\mathcal {S}\)-derivability and totality of \(\mathcal {S}\), which amount to validity, cf. the previous section). Approximated checks, such as those implemented in smt solvers, are required here since exact checks do not exist due to undecidability issues.

Remark 2

Theorem 1 says nothing about executions of the procedure that return \(\mathtt {failure}\) or that do not terminate. Such outcomes may mean either \({\Rightarrow ^{\!}_{\!\mathcal {S}}}\not \models G_0\), or \({\Rightarrow ^{\!}_{\!\mathcal {S}}} \models G_0\) but the information contained in the goals \(G_0\) is not sufficient for proving them, or, again, that the approximation induced by the validity checkers was too coarse. It is the user’s burden to come up with a set of goals containing enough information such that the procedure terminates successfully.

This is similar to proving loop invariants in imperative programs, which requires users to provide a strong-enough invariant (i.e., one that can be proved).

Remark 3

Unlike the original proof system [6] we do not aim at (relative) completeness for our procedure. The relative completeness result is a very nice but essentially theoretical construction, which is based on strong assumptions (an oracle for deciding first-order theories) and is essentially of no practical use (it does not actually help in finding proofs).

5 Reachability Properties for Rewrite Theories

In this section we show that rl formulas can be used to specify properties of transition systems defined by rwl theories. This is achieved by extending the signature of an rwl theory \(\mathcal {R}\) to an ml-signature, which includes predicates that can be used to define rl properties of the transition system defined by \(\mathcal {R}\).

We also show how the verification procedure in Sect. 4 can be adapted in order to take advantage of rwl-specific operations such as matching. More precisely, we prove that, under reasonable assumptions, a complete set of derivatives of an rl formula can be computed using standard matching-modulo algorithms.

We note that rl properties for rewrite theories are different from the reachability properties that can be checked in Maude using the search command. The difference is given by the possibility of using fol for constraining the initial and the final state terms, and by the interpretation of rl formulas.

Definition 13

( ml Extension of a Rewrite Theory). Consider a rewrite theory \(\mathcal {R}= (\varSigma , E \cup A, R)\) with a distinguished sort \( State \) such that \(\mathcal {R}\) is topmost w.r.t. \( State \). An ml extension of \(\mathcal {R}\) consists of an ml signature \((\varSigma , \varPi , State )\) together with an interpretation \((T_{\varSigma , E \cup A})_p\subseteq T_{\varSigma , E \cup A, s_1}\times \cdots \times T_{\varSigma , E \cup A, s_n}\) for each predicate symbol \(p\in \varPi _{s_1,\ldots ,s_n}\). In this way, \(T_{\varSigma , E \cup A}\) is a model of the ml extension of \(\mathcal {R}\).

The above definition allows one to see the operations \( bop \in \varSigma _{s_1\ldots s_n, Bool }\) as predicates \( bop \in \varPi _{s_1\ldots s_n}\) with the interpretation \(([t_1],\ldots ,[t_n])\in (T_{\varSigma , E \cup A})_ bop \) iff \( bop (t_1,\ldots ,t_n)=_{E\cup A} true \). Consequently, each term b of sort \( Bool \) defines a fol formula such that, if \(\rho : Var \rightarrow T_{\varSigma ,E\cup A}\), then \(\rho \models b\) iff \(\rho (b)=_{E \cup A} true \).

Any rewrite rule \(l\rightarrow r\mathbf ~if~ b\) can be viewed as an rl formula \({l\wedge b} \Rightarrow {r}\) and the transition relation \(\Rightarrow ^{\!}_{\!R}\) is exactly the same with the one-step topmost rewriting relation \(\rightarrow _\mathcal {R}\). This allows one to naturally define when rl formulas specify properties for rwl theories.

Definition 14

( rl Properties for rwl Theories). An rl property for \(\mathcal {R}= (\varSigma , E \cup A, R)\) is an rl formula \({\varphi } \Rightarrow {\varphi '}\), where \(\varphi \) and \(\varphi '\) are ml formulas defined over an ml extension of \(\mathcal {R}\) (cf. Definition 13). We say that \(\mathcal {R}\) satisfies \({\varphi } \Rightarrow {\varphi '}\) iff \({\rightarrow _{\mathcal {R}}}\models {\varphi } \Rightarrow {\varphi '}\) (cf. Definition 7).

We next focus on adapting the prove procedure for verifying rl properties of rwl theories. Specifically, we show the derivatives can be computed using matching algorithms (under certain assumptions). We give first a technical definition.

Definition 15

(Ground \((E\cup A)\) -unifier). Consider a rewrite theory \(\mathcal {R}= (\varSigma , E \cup A, R)\). Two \(\varSigma \)-terms t and \(t'\) are ground \((E\cup A)\) -unifiable if there is \(\sigma : FreeVars (t,t')\rightarrow T_{\varSigma }\) such that \(\sigma (t)=_{E\cup A}\sigma (t')\). The substitution \(\sigma \) is called \((E\cup A)\)-unifier of t and \(t'\).

The following assumptions are required for computing derivatives using matching. The first assumption restricts the class of rl formulas that can be used as properties to those that are useful in practice (i.e., having more than one pattern, or having negations of patterns, in either side of formulas is not really useful).

Assumption 2

We assume that \( FreeVars (b)\subseteq FreeVars (l,r)\) for each rule \(l\rightarrow r\mathbf ~if~ b\) in R. In this paper we also assume that the rl properties of rwl theories \(\mathcal {R}\) are of particular form \({(\exists X)(\pi \wedge \phi )} \Rightarrow {(\exists X')(\pi '\wedge \phi ')}\) with \( FreeVars (\phi )\subseteq FreeVars (\pi )\) (and hence \(X\subseteq FreeVars (\pi )\)).

The second assumption relates ground unifiers to matching substitutions.

Assumption 3

We assume that for each \(l\rightarrow r\mathbf ~if~ b\) in R with l and \(\pi \) ground \((E\cup A)\)-unifiable, there is a set of matching substitutions \( match (l,\pi )\) such that

  • \(\sigma _0(l)=\pi \) for each \(\sigma _0\in match (l,\pi )\), and

  • for each ground \(E\cup A\)-unifier \(\sigma \) of l and \(\pi \) there are \(\sigma _0\in match (l,\pi )\) and \(\sigma ': FreeVars (\pi )\rightarrow T_{\varSigma }\) satisfying \(\sigma =_{E\cup A}\sigma '\circ \sigma _0\) Footnote 3.

Assumption 3 holds under reasonable constraints, cf. the Matching Lemma [21]. In a nutshell, the constraints distinguish a builtin subtheory for the equational subtheory of the rewrite theory \(\mathcal {R}\), with corresponding builtin equations and axioms, assumed to be manageable by an smt solver; there are no non-builtin equations, while non-builtin axioms may include the usual combinations of associativity, commutativity, and unity. The next results show that the matching substitutions can be used to compute the derivatives of ml formulas and, consequently, the derivatives of rl formulas.

Lemma 4

(Computing ml Derivatives by Matching). Let \({\varphi } \Rightarrow {\varphi '}\) be an rl property for \(\mathcal {R}= (\varSigma , E \cup A, R)\), where \(\varphi \triangleq (\exists X)\pi \wedge \phi \). Then, for each derivative \(\varphi _1\in \varDelta _R(\varphi )\) there exists a rewrite rule \(l\rightarrow r\mathbf ~if~ b\) in R such that

$$\begin{aligned} T_{\varSigma ,E\cup A}\models \varphi _1\longleftrightarrow \bigvee \limits _{\sigma _0\in match (l,\pi )}(\exists X\cup FreeVars (r)\setminus FreeVars (l))\sigma _0(r)\wedge \sigma _0(b)\wedge \phi . \end{aligned}$$

The following theorem directly follows from Lemma 4:

Theorem 2

(Computing rl Derivatives by Matching). For each \({\varphi _1} \Rightarrow {\varphi '}\in \varDelta _R({\varphi } \Rightarrow {\varphi '})\) with \(\varphi \triangleq (\exists X)\pi \wedge \phi \), there is \(l\rightarrow r\mathbf ~if~ b \in R\) such that

$$\begin{aligned}{}[\![{\varphi _1} \Rightarrow {\varphi '}]\!]=[\![\bigvee _{\sigma _0\in match (l,\pi )}{(\exists X\cup FreeVars (r)\setminus FreeVars (l))\sigma _0(r)\wedge \sigma _0(b)\wedge \phi } \Rightarrow {\varphi '}]\!]. \end{aligned}$$

Symbolic Rewrite Rules. We now show that Theorem 2 enables the use of symbolic rewrite rules to efficiently compute derivatives and hence to implement the procedure prove in rwl. For \(\varphi \triangleq (\exists X)(\pi \wedge \phi )\), let \(\varDelta ^{ match}_R({\varphi } \Rightarrow {\varphi '})\) be the set

$$\begin{aligned} \{(\exists X\cup FreeVars (r)\setminus FreeVars (l)&)\,{\sigma _0(r)\wedge \sigma _0(b)\wedge \phi } \Rightarrow {\varphi '}\nonumber \\&\mid {} l\rightarrow r\mathbf ~if~ b \in R, \sigma _0\in match (l,\pi )\} \end{aligned}$$
(1)

We have \([\![\varDelta ^{ match}_R({\varphi } \Rightarrow {\varphi '})]\!]=[\![\varDelta _R({\varphi } \Rightarrow {\varphi '})]\!]\) by Theorem 2, which implies that \(\varDelta ^{ match}_R({\varphi } \Rightarrow {\varphi '})\) is a complete set of \(\mathcal {R}\)-derivatives. This allows us to use of \(\varDelta ^{ match}_R\) in the procedure prove instead of \(\varDelta _R\). Next, we note that formulas \({\varphi _1} \Rightarrow {\varphi '}\) in \(\varDelta ^{ match}_R({\varphi } \Rightarrow {\varphi '})\) (where \(\varphi \triangleq (\exists X)(\pi \wedge \phi )\)) can be computed by applying a symbolic rewrite rule of the form \({l\wedge \psi } \Rightarrow {r\wedge b\wedge \psi }\) to the left-hand side of \({\varphi } \Rightarrow {\varphi '}\), where \(\psi \) is a fresh variable of sort \( Bool \), with a matching substitution \(\sigma _0\) such that \(\sigma _0(\psi )=\phi \). Moreover, \(\varphi \) is R-derivable iff there are a rule \(l\rightarrow r\mathbf ~if~ b\) in R and \(\sigma _0\in match (l,\pi )\) such that \((\exists X\cup FreeVars (r)\setminus FreeVars (l)) \sigma _0(b)\wedge \phi \) is satisfiable, by Lemma 4. This is equivalent to saying that \(\varphi \) is R-derivable iff the symbolic rewrite rule \({l\wedge \psi } \Rightarrow {r\wedge b\wedge \psi }\) is applicable to \(\varphi \).

Overall, R-derivatives of rl formulas \({\varphi } \Rightarrow {\varphi '}\) (where \(\varphi \triangleq (\exists X)(\pi \wedge \phi )\)) can be computed by transforming each rule \(l\rightarrow r\mathbf ~if~ b \in R\) into a symbolic rewrite rule \({l\wedge \psi } \Rightarrow {r\wedge b\wedge \psi }\) and by applying the symbolic rewrite rule. This is how derivatives are computed in our rwl adaptation of the prove procedure.

6 Verifying RL Properties of a Communication Protocol

We illustrate the theory on a simple communication protocol described in Maude.

Protocol Description.  The protocol transmits a file between a sender and a receiver. The file is a sequence of records. The sender and receiver communicate through unidirectional, lossy channels, one of which carries messages (a record in the file, together with a sequence number) from send to receiver, while the other one carries retransmission requests (natural numbers) from receiver to sender.

Both the sender and the receiver maintain a counter in order to keep track of the next record to be sent, respectively received. The sender transmits the next record in the file together with the current value of its counter, which then is incremented by one. The receiver accepts a message only if the sequence number of the message is equal to the receiver’s counter; if this is the case, the counter is incremented and the record from the message is saved. The receiver discards all other messages (i.e., whose sequence number is not the expected one). It may also (nondeterministically) request a retransmission, by sending the current value of its counter over the retransmission request channel. This nondeterminism can be seen as an abstraction of a timeout mechanism, not modeled here for simplicity.

Upon reception of a retransmission request, the sender ignores it if it is greater than or equal to its counter (indicating a wrong retransmission request). Otherwise the sender updates its counter to the number it received on retransmission request channel, in order to start resending messages from that number on.

The protocol’s State structure is given as a constructor with five arguments:

figure a

,

where Pair is a sort of pairs consisting of an Element and a natural number, and \(\mathtt {List\{\}}\) are parameterised lists. They respectively denote: the index of the next record to be sent, the sender-to-receiver channel, the receiver-to-sender channel, the next expected record on the receiver side, and the list of records currently accepted and stored by the receiver.

The file to be sent is modelled by a function fileToSend : Nat -> Elements, of size Max. The sender and receiver’s rules are shown in Fig. 3. There are also rules for the channels losing elements, not shown here due to lack of space.

Fig. 3.
figure 3

Communication Protocol (excerpt).

Reachability Properties. The protocol’s initial state is

figure b

. Its expected reachability property states that all terminating executions starting from the initial state should end up in a state of the form

figure c

where file should satisfy \((\forall j)\, 0 \le j \le \mathtt Max \longrightarrow \mathtt fileToSend(j) = file[j] \), where

figure d

is a function returning an element at a given position in a list. (That is, the file received is the same as the file sent.) In order to specify the constraints on the final states we defined in Maude a subset of rl, so that the reachability property specifying the protocol is written as the following Maude rewrite rule:

figure e

The operation

figure f

is the constructor for our defined subset of ml, which takes a term of sort State and term of sort FOL and builds a term of sort ML. Note that the Maude search command cannot be used to prove this RL formula: to do so it would have to explore all terminating executions starting from the initial state, which are infinitely many (and can be arbitrarily long).

Thus, we use our verification procedure. Unsurprisingly, the above rl formula is not enough by itself for our procedure to succeed. For this to happen, a “helper” rl formula is required, whose right-hand side is the same as for the one above, but whose left-hand side describes an invariant (to hold for all states reachable from the initial states):

figure g

This formula says that the currently received file (whose size is m -1) equals the portion of the file being sent (up to that size); and that all messages currently in transition from sender to receiver are records in the fileToSend file. It was obtained by trial-and-error, while applying the following verification technique.

Verification. We have implemented key functionality from our verification procedure at Maude’s metalevel. A first transformation is applied to rewrite rules as described in the Symbolic rewrite rules paragraph of Sect. 5. This reduces the application of rules with unification to application with matching. Derivatives are computed based on matching and rewriting as described in Sect. 5. We also use Maude’s metalevel to achieve the following executions of our verification procedure:

  • for the first rl formula (the protocol’s specification): deriving it with respect to the protocol’s rules \(\mathcal {S}\), then applying the second formula as a circularity;

  • for the second rl formula (designated above as the “helper” formula): deriving it with respect to the protocol’s rules, then applying itself as a circularity.

By requiring that these two executions return success, Maude generates several proof obligations: essentially, that the condition for applying circularities holds (at line 4 in our verification procedure), and that the condition for returning success (line 3) also holds. Several of those proof obligations are discharged automatically by simplification rules we included in Maude (e.g., that fol disjunction is commutative). The remaining ones are axioms satisfied by the assumed model for the various elements in our problem domain (e.g., lossy channels, files consisting on records, and natural numbers). For example, one proof obligation says that if all messages in a channel contain records from the fileToSend file, then by losing a message all the remaining messages satisfy the same property.

There are four such proof obligations left after automatic simplification. We have (manually) checked that they hold (in the assumed model for our problem domain). The trial-and-error process for finding the helper rl formula consisted in examining the generated proof obligations, and noting that some do not hold unless more information is added about the problem domain.

7 Conclusion and Future Work

In this paper we propose a procedure for verifying reachability properties on symbolic transition systems. While the reachability properties are stated as rl formulas, we allow symbolic transition systems to be described by either rl specifications or by rwl specifications. We prove that our procedure is sound. We show with a concrete example that our procedure works in practice.

The paper also contributes to establishing connections between rl and rwl. In [22] it is shown how rl specifications can be encoded as rwl theories. Here, we take an alternative approach, which consists in using rl as a property language for rwl. The proposed procedure adapted for rwl can be implemented in Maude, using reflection and the recently added support for sat checking and one-step rewriting modulo smt using the CVC4 library (http://cvc4.cs.nyu.edu/).

In terms of future work there are several directions to follow. First, starting from our prototype, we shall develop a tool in the Maude environment that will efficiently implement our procedure; we envision that the tool will generate proof obligations to be discharged by Maude’s inductive theorem prover itp [23]. We also intend to formalise in the Coq proof assistant our procedure and its soundness proof in order to be able not only to verify properties but also to generate certificates. Finally, we will use the extraction mechanism of Coq to obtain certified OCaml code for our procedure and use it as a reference implementation.