1 Introduction

It is commonly accepted that a reliable technique for deductive program verification should be designed with the formal semantics of the programming language as foundation. With the formal semantics used as axioms, a mathematical proof of a desired property for the target program can be constructed. Direct program proofs based on operational semantics are often cumbersome. Due to language constructs that may incur unbounded program behavior, inductive proofs along the structure of semantic derivations (e.g., [27]) are expected.

An established method for simplifying the verification is by devising a program logic (e.g., [18, 34]) for the programming language. Program logics effectively reduce the burdens in dealing with many aspects of the verification, such as the reasoning about loops, recursive function calls, memory layout of objects, concurrency, etc. The effectiveness of program logics has been demonstrated by powerful tools (e.g., [6, 9, 10, 20]) and significant projects (e.g., [30, 37]).

A price to pay for enjoying the power of program logics, however, is the considerable amount of effort often needed in establishing their soundness and completeness wrt. the baseline semantics – often an operational semantics. There have been a plethora of programming languages designed and implemented to meet the needs of different domains. The recent development of blockchain technology alone has led to the creation of multiple languages, such as Solidity [5], Yul [7], Scilla [36], Move [3], Michelson [2], EVM bytecode language [41], etc. Developing one program logic for each language that could be used in scenarios where correctness is of serious concern would require a huge amount of efforts.

To combat the cumbersomeness of direct program proofs based on operational semantics, while avoiding the full complexity in the development of program logics, one could seek to establish the infrastructure necessary for reasoning about specific kinds of language features, for any languages with those features. The results in [26] and [25] show how to deal with fundamental language features that may cause unbounded behavior, such as iteration and recursion, in a language-independent fashion. In [26], a technique is proposed to generate inductive invariants from annotated loop invariants. In [25], a method is presented to turn the semantics of a programming language into a program verifier by applying coinductive reasoning principles. Both developments are built on the small-step execution relation of programs.

Small-step semantics [31] is known to be a fine-grained approach to the definition of operational semantics. It supports a way to model concurrent execution. It also enables the differentiation of looping and abnormal termination. Big-step semantics (or natural semantics [15, 21]), on the other hand, can be easier to formulate. For instance, the design of the semantic configurations need not track the intermediate control states. Big-step semantics can also be easier to use. It does not require the consideration of both derivation sequences and derivation trees at the same time, in performing proofs. There exist many formalizations of big-step semantics (e.g., [4, 11, 17, 23, 28, 42]) with practical uses.

In this work, we propose a technique for reasoning about iteration and recursion in deductive program verification based on big-step operational semantics. For any programming language with a big-step semantics, once a generic predicate is defined to hold on the premises and corresponding conclusions for the semantic rules, a theorem becomes available – the theorem turns the verification of partial correctness results into symbolic execution of the target program with auxiliary information from the user specification. For loops and recursive function calls, this auxiliary information is provided in the same form via the specification, enabling the same pattern of reasoning. We illustrate our technique using verification tasks involving simplified imperative and functional languages. We mechanize the proofs of all formal results [8] in the Coq proof assistant [1].

The main technical contributions of this article are:

  • a language-independent technique simplifying the deductive verification of iterative and recursive program structures based on big-step semantics,

  • proofs for the soundness and relative completeness of the technique,

  • illustration of the technique with the verification of example programs in simplified programming languages of different paradigms,

  • mechanization of proofs and verification examples in the Coq proof assistant.

We provide an infrastructure that handles the routine part of the work in reasoning about programming constructs with potentially unbounded behavior, based on a common model of big-step execution in a proof assistant. This provides a basis for a language-independent deductive program verifier.

Structure. The remaining part of this article is structured as follows. In Sect. 2, we introduce the reasoning technique, and prove its soundness. In Sect. 3, we illustrate the technique with a toy example that is developed in detail. In Sect. 4, we present further verification examples targeting simplified imperative and functional languages. In Sect. 5, we discuss the completeness of the technique. In Sect. 6, we discuss related work. In Sect. 7, we discuss potential extensions of the current development. Finally, we conclude in Sect. 8.

2 The Technique

The proposed verification technique can be used to check that the potential execution results of a program satisfy pre-specified conditions. The potential execution results are estimated by a combination of concrete computation according to the big-step semantics of the programming language, and abstract inference according to the auxiliary information in the specification. The abstract inference helps realize what is usually accomplished with loop invariants in reasoning about loops, and with function contracts in reasoning about function calls.

2.1 Specifications

We capture the execution status of programs by configurations. We capture the results of program execution by result configurations. For imperative languages, a configuration can be a pair of a program and a state, and a result configuration can be a state. For functional languages, a configuration can be a functional expression, and a result configuration can be a canonical form.

Let \(C\) be the set of all possible configurations ranged over by \(c\), for programs written in some language. Let \(R\) be the set of all possible result configurations ranged over by \(r\), for programs in the same language. We do not rely on any assumptions about the structure of the elements in \(C\) or in \(R\).

A specification is a function \(\varPhi \in C\rightarrow \mathcal {P}({R})\). For a configuration \(c\), if \(c\) contains the complete program to be verified, then \(\varPhi (c)\) is the set of result configurations capturing the required range for the execution results of the program. Otherwise, \(\varPhi (c)\) is the expected set of potential results obtained by executing some statement within the overall program. This set provides auxiliary information for the verification.

2.2 Semantic Derivation and Correctness

We model the set of rules of a big-step operational semantics by a predicate \( rule \in (C\times R)^*\rightarrow (C\times R)\rightarrow \{ tt , ff \}\). Each semantic rule is captured as

$$ rule ~{[({c_1},{r_1}),\dots , ({c_n},{r_n})]}~{({c},{r})} $$

Here, the list \([{({c_1},{r_1}), \dots , ({c_n},{r_n})}]\) models the list of premises of the rule, and \(({c},{r})\) models the conclusion of the rule. Each premise or conclusion consists of a configuration in the set \(C\) and a corresponding result configuration in the set \(R\). A side condition in a semantic rule can be captured by a condition on the parameters \(c_1\), ..., \(c_n\), \(r_1\), ..., \(r_n\), \(c\), and \(r\), in a concrete definition of \( rule \).

A semantic derivation concluding that the configuration \(c\) can be evaluated to the result configuration \(r\) in the big-step semantics is captured by

$$\begin{aligned} deriv ({c,r}) ~{:=}~ \exists k\!:&\, \exists c_1,\dots ,c_k\!: \exists r_1,\dots ,r_k\!: \\&\, rule ~{[({c_1},{r_1}),\dots , ({c_k},{r_k})]}~{({c},{r})} \, \wedge \, \\&\,\, \forall i\in \{1,\dots ,k\}\!: deriv ({c_i,r_i}) \end{aligned}$$

Hence, the configuration \(c\) can be evaluated to the result configuration \(r\), or \((c,r)\) can be derived in the big-step semantics, if there is a semantic rule with \((c,r)\) as conclusion, and each premise of the rule can itself be derived in the big-step semantics. Intuitively, if \( deriv ({c,r})\) can be established, then there is a finite derivation tree rooted at \(({c},{r})\).

With the notion of semantic derivation defined above, we formalize the notion of partial correctness as the validity of specifications.

$$ valid ({\varPhi }) ~{:=}~ \forall c, r: deriv ({c, r}) \Rightarrow r\in \varPhi (c) $$

A specification \(\varPhi \) is valid, if for each configuration \(c\), any result configurations semantically derivable from \(c\) is a member of \(\varPhi (c)\).

2.3 Specification-Aware Inference and Verification

We infer the potential execution results of a configuration under a given specification \(\varPhi \) according to the following definition.

$$\begin{aligned} infer ^{\varPhi }({c},{r}) ~{:=}~&\exists k\!: \exists c_1,\dots ,c_k\!: \exists r_1,\dots ,r_k\!:\,\\&\quad ~~\, rule ~{[({c_1},{r_1}),\dots , ({c_k},{r_k})]}~{\!({c},{r})} \,\wedge \, \\&\quad ~~\,\, \forall i\!\in \! \{1,\dots ,k\}: res ^{\varPhi }({c_i},{r_i}) \\ res ^{\varPhi }({c},{r}) ~{:=}~&r\in \varPhi (c) \,\wedge \, (\varPhi (c)=R\Rightarrow infer ^{\varPhi }({c},{r}) ) \end{aligned}$$

The result configuration \(r\) is infered from the configuration \(c\) with the help of the specification \(\varPhi \), if there is a semantic rule with \((c,r)\) as conclusion, and for each premise \((c_i,r_i)\) of the semantic rule, \(r_i\) is a potential result for \(c_i\) according to \(\varPhi \), as is captured by the auxiliary predicate \( res ^{\varPhi }\). The expression \( res ^{\varPhi }({c_i},{r_i})\) says that the possible candidates for \(r_i\) are constrained by the information contained in the specification about \(c_i\). In addition, if \(\varPhi \) does not provide any useful information about \(c_i\) (i.e., \(\varPhi (c_i)=R\)), then \(r_i\) should be inferable from \(c_i\).

Intuitively, the application of the semantic rules in the inference corresponds to the symbolic execution of the target program. The information in the specification can be used to overcome the inability to symbolically execute the constructs with potentially unbounded behavior, such as iteration and recursion.

We formulate the condition to be verified on specifications \(\varPhi \) using the predicate \( verif \). In other words, \( verif ({\varPhi })\) is the syntactical correctness condition.

$$\begin{aligned} verif ({\varPhi }) ~{:=}~&\forall c, r: infer ^{\varPhi }({c},{r}) \Rightarrow r\in \varPhi (c) \end{aligned}$$

A specification \(\varPhi \) is verified, if for each configuration \(c\), any result configurations that can be infered from \(c\) with the help of \(\varPhi \) are contained in \(\varPhi (c)\).

2.4 Soundness

We prove the implication from \( verif ({\varPhi })\) to \( valid ({\varPhi })\). The following lemma is a key component of this proof.

Lemma 1

If \( verif ({\varPhi })\) holds, and \( deriv ({c, r})\) holds, then \( infer ^{\varPhi }({c},{r})\) holds.

Proof

According to the definition of \( deriv ({c, r})\), if this predicate holds, then there is a finite derivation tree generated by the following inference rule.

The proof is by induction on the derivation tree for \( deriv ({c,r})\).

From \( deriv ({c, r})\), we have \( deriv ({c_1,r_1})\), ..., \( deriv ({c_m,r_m})\), and

$$\begin{aligned} rule ~{[ ({c_1},{r_1}), \dots , ({c_m},{r_m}) ]}~{({c},{r})} \end{aligned}$$
(1)

for some m, \(c_1\), ..., \(c_m\), \(r_1\), ..., \(r_m\).

For each \(i\in \{1,\dots ,m\}\), we have \( infer ^{\varPhi }({c_i},{r_i})\) from \( deriv ({c_i,r_i})\) and the induction hypothesis. We show that \( res ^{\varPhi }({c_i},{r_i})\) holds by distinguishing between the cases where \(\varPhi (c_i)=R\) and where \(\varPhi (c_i)\ne R\).

  • Suppose \(\varPhi (c_i)=R\). Then, it holds that \(r\in \varPhi (c_i)\). Hence, we have \( res ^{\varPhi }({c_i},{r_i})\) because of \( infer ^{\varPhi }({c_i},{r_i})\), and the definition of \( res ^\varPhi \).

  • Suppose \(\varPhi (c_i)\ne R\). From \( infer ^{\varPhi }({c_i},{r_i})\), and \( verif ({\varPhi })\), we have \(r_i\in \varPhi (c_i)\). Hence, we have \( res ^{\varPhi }({c_i},{r_i})\) according to the definition of \( res ^{\varPhi }\).

Hence, for each \(i\in \{1,\dots ,m\}\), we have \( res ^{\varPhi }({c_i},{r_i})\). Thus, we can deduce \( infer ^{\varPhi }({c},{r})\) using (1) and the definition of \( infer ^\varPhi \). This completes the proof.    \(\square \)

Using this lemma, the soundness theorem can be obtained directly.

Theorem 1 (Soundness)

If \( verif ({\varPhi })\) can be established, then \( valid ({\varPhi })\) holds.

Proof

Assume \( verif ({\varPhi })\) and \( deriv ({c,r})\). Then, we have \( infer ^{\varPhi }({c},{r})\) according to Lemma 1. Thus, we can deduce \(r\in \varPhi (c)\) using \( verif ({\varPhi })\).    \(\square \)

The application of this theorem reliably turns the problem of establishing the validity of a specification \(\varPhi \) into the problem of proving \( verif ({\varPhi })\), irrespective of the language used for the program that is specified in \(\varPhi \). The examples in Sect. 3 and Sect. 4 show that the proof of \( verif ({\varPhi })\) is free from induction for reasoning about iterative and recursive programming constructs, once auxiliary information summarizing the effects of these constructs is provided.

Remark 1

Lemma 1 suggests that an abstract form of computation is obtained leveraging verified user specification. This abstract computation over-approximates the concrete computation, which indicates a potential connection with abstract interpretation [16]. However, we do not attempt at a formal interpretation of our technique in the framework of abstract interpretation in this work.

3 Illustrative Example

In this section, we illustrate our technique using a toy example. In this example, a program computing the factorial of a natural number is written in the While language [27]. We show how the big-step semantics of the While language can be formulated with the \( rule \) predicate introduced in Sect. 2.2. We then show how the functional correctness of the factorial program can be specified and proven.

3.1 Big-Step Semantics of the While Language

The main syntactical categories of the While language are arithmetic expressions a, Boolean expressions b, and statements \(S\). A statement can be \(\mathsf {skip}\) that performs no operation, an assignment \({x}:={a}\), a sequential composition \(S_1;S_2\), a branching statement \(\mathsf {if}~{b}~\mathsf {then}~{S_1}~\mathsf {else}~{S_2}\), or a loop \(\mathsf {while}~{b}~\mathsf {do}~{S}\). Let the set of all statements be \( Stmt \).

For programs in the While language, the states \(\sigma \) are elements of \(\varSigma {:=} Var \rightarrow \mathbb {Z}\). Here, \( Var \) is the set of variables and \(\mathbb {Z}\) is the set of integers. The evaluation of arithmetic expressions and Boolean expressions in states can be formalized by defining evaluation functions \(\mathcal {A}\) and \(\mathcal {B}\), respectively, as in [27].

For the semantics of the While language, the set \(C\) of configurations is \( Stmt \times \varSigma \), and the set \(R\) of result configurations is \(\varSigma \). We formulate the big-step semantics by defining the predicate \( rule \), as in Fig. 1. In each line, a combination of the parameter values for which \( rule \) holds is given.

Fig. 1.
figure 1

The semantic rules for the statements of the While language

There is a direct correspondence between the formulation in Fig. 1 and a formulation using inference rules (e.g., [27]). For instance, the inference rule for the loop \(\mathsf {while}~{b}~\mathsf {do}~{S}\) in the case where the conditional expression evaluates to true can be formulated as

It is captured exactly by the second last line in the definition of \( rule \) in Fig. 1.

3.2 Factorial Program and Its Specification

Consider the program \(S_{\mathrm {fac}}\) in the While language. The program computes the factorial \({m}!\) where m is the initial value of the program variable \(\mathtt {m}\).

$$\begin{aligned} S_{\mathrm {fac}}~{:=}~&({\mathtt {fac}}:={\mathtt {m}}; S_{\mathrm {wh}}) \\ S_{\mathrm {wh}}~{:=}~&(\mathsf {while}~1<\mathtt {m}~\mathsf {do}~ ({\mathtt {m}}:={\mathtt {m}-\mathtt {1}}; {\mathtt {fac}}:={\mathtt {fac}*\mathtt {m}})) \end{aligned}$$

Let \(P_{m}\) be the set of states where \(\mathtt {fac}\) has the value \({m}!\). Let \(P'_{{m},{ fac }}\) be the set of states where \(\mathtt {fac}\) has the value \( fac *{(m-1)}!\).

$$\begin{aligned} P_{m}~{:=}~&\{\sigma '[\mathtt {fac}\mapsto {m}!]\mid \sigma '\in \varSigma \} \\ P'_{{m},{ fac }}~{:=}~&\{ \sigma '[\mathtt {fac}\mapsto fac *{(m-1)}!] \mid \sigma '\in \varSigma \} \end{aligned}$$

We consider the following specification for the program.

$$ \begin{array}{rll} \varPhi _{\mathrm {fac}}(\langle {S_{\mathrm {fac}}}, {\sigma } \rangle ) ~{:=}~ &{} P_{m} \quad &{} \mathrm {if}~ m = \sigma (\mathtt {m})\,\wedge \, m> 0\,\wedge \, \sigma \in \varSigma \\ \\ \varPhi _{\mathrm {fac}}(\langle {S_{\mathrm {wh}}}, {\sigma } \rangle ) ~{:=}~ &{} P'_{{m},{ fac }} \quad &{} \mathrm {if}~ m = \sigma (\mathtt {m})\,\wedge \, m > 0\,\wedge \, fac = \sigma (\mathtt {fac})\,\wedge \, \sigma \in \varSigma \\ \\ \varPhi _{\mathrm {fac}}(c) ~{:=}~ &{} \varSigma \quad &{} \mathrm {if}~c~\mathrm {is~not~of~the~above~forms} \end{array} $$

The specification says that when \(S_{\mathrm {fac}}\) finishes execution started in a state where the value of \(\mathtt {m}\) is \(m>0\), the value of \(\mathtt {fac}\) will be \({m}!\). The specification also contains the auxiliary claim that when the loop \(S_{\mathrm {wh}}\) finishes execution started in a state where \(\mathtt {fac}\) has the value \( fac \) and \(\mathtt {m}\) has the value \(m>0\), the value of \(\mathtt {fac}\) will be equal to the product of \( fac \) and \({(m-1)}!\) (noting that \({0}!=1\)).

3.3 Proof of the Factorial Program

A direct proof of the factorial program \(S_{\mathrm {fac}}\) based on the big-step operational semantics of the While language would require an induction on the shape of derivation trees (e.g., [27]) to establish a suitable invariant for the loop \(S_{\mathrm {wh}}\).

Using the technique of Sect. 2, we aim at establishing \( valid ({\varPhi _{\mathrm {fac}}})\). With Theorem 1, it suffices to show \( verif ({\varPhi _{\mathrm {fac}}})\) – for all \(c\) and \(r\), assuming \( infer ^{\varPhi _{\mathrm {fac}}}({c},{r})\), we attempt to show \(r\in \varPhi _{\mathrm {fac}}(c)\).

  1. 1.

    Firstly, assume \(c\) is \(\langle {S_{\mathrm {fac}}}, {\sigma } \rangle \), where \({\sigma }(\mathtt {\mathtt {m}})>0\). Then, \(\varPhi _{\mathrm {fac}}(c)\) is \(P_{m}\), where \(m={\sigma }(\mathtt {\mathtt {m}})\). Using \( infer ^{\varPhi _{\mathrm {fac}}}({\langle {S_{\mathrm {fac}}}, {\sigma } \rangle },{r})\) and the semantics of the While language in Fig. 1, it is not difficult to obtain

    $$ rule ~{[(\langle {{\mathtt {fac}}:={\mathtt {m}}}, {\sigma } \rangle ,\sigma ''), (\langle {S_{\mathrm {wh}}}, {\sigma ''} \rangle ,r)]}~{(\langle {S_{\mathrm {fac}}}, {\sigma } \rangle ,r)} $$

    for some \(\sigma ''\) such that \( res ^{\varPhi _{\mathrm {fac}}}({\langle {{\mathtt {fac}}:={\mathtt {m}}}, {\sigma } \rangle },{\sigma ''})\) and \( res ^{\varPhi _{\mathrm {fac}}}({\langle {S_{\mathrm {wh}}}, {\sigma ''} \rangle },{r})\). Since \(\varPhi _{\mathrm {fac}}(\langle {{\mathtt {fac}}:={\mathtt {m}}}, {\sigma } \rangle )=R\), we deduce \( infer ^{\varPhi _{\mathrm {fac}}}({\langle {{\mathtt {fac}}:={\mathtt {m}}}, {\sigma } \rangle },{\sigma ''})\) from \( res ^{\varPhi _{\mathrm {fac}}}({\langle {{\mathtt {fac}}:={\mathtt {m}}}, {\sigma } \rangle },{\sigma ''})\). Hence, we deduce \(\sigma ''=\sigma [\mathtt {fac}\mapsto {\sigma }(\mathtt {\mathtt {m}})]\). Hence, we have \({\sigma ''}(\mathtt {\mathtt {m}})={\sigma }(\mathtt {\mathtt {m}})>0\). Hence, we have \(\varPhi _{\mathrm {fac}}(\langle {S_{\mathrm {wh}}}, {\sigma ''} \rangle )= P'_{{{\sigma ''}(\mathtt {\mathtt {m}})},{{\sigma ''}(\mathtt {\mathtt {fac}})}} =\{\sigma '[\mathtt {fac}\mapsto {\sigma ''}(\mathtt {\mathtt {fac}})* {({\sigma ''}(\mathtt {m})-1)}!]\mid \sigma '\in \varSigma \}= \{\sigma '[\mathtt {fac}\mapsto {{\sigma }(\mathtt {\mathtt {m}})}!] \mid \sigma '\in \varSigma \}= P_m\). Moreover, from \( res ^{\varPhi _{\mathrm {fac}}}({\langle {S_{\mathrm {wh}}}, {\sigma ''} \rangle },{r})\) we have \(r\in \varPhi _{\mathrm {fac}}(\langle {S_{\mathrm {wh}}}, {\sigma ''} \rangle )\). Ultimately, we have \(r\in P_{m}\).

  2. 2.

    Secondly, assume \(c\) is \(\langle {S_{\mathrm {wh}}}, {\sigma } \rangle \), where \({\sigma }(\mathtt {\mathtt {m}})>0\). Then, \(\varPhi _{\mathrm {fac}}(c)\) is \(P'_{{m},{ fac }}\), where \(m={\sigma }(\mathtt {\mathtt {m}})\), and \( fac ={\sigma }(\mathtt {\mathtt {fac}})\). Using \( infer ^{\varPhi _{\mathrm {fac}}}({\langle {S_{\mathrm {wh}}}, {\sigma } \rangle },{r})\) and the semantics of the While language in Fig. 1, we have the following two cases.

    1. (a)

      We have \(m\le 1\), \( rule ~{[~]}~{(\langle {S_{\mathrm {wh}}}, {\sigma } \rangle ,\sigma )}\), and \(r=\sigma \). Since \(m>0\) and \(m\le 1\), we have \(m=1\). Hence, it is not difficult to deduce \(r\in P'_{m, fac }\).

    2. (b)

      We have \(m>1\), and

      $$ rule ~{[(\langle {{\mathtt {m}}:={\mathtt {m}-\mathtt {1}}; {\mathtt {fac}}:={\mathtt {fac}*\mathtt {m}}}, {\sigma } \rangle , \sigma ''), (\langle {S_{\mathrm {wh}}}, {\sigma ''} \rangle , r)]}~{(\langle {S_{\mathrm {wh}}}, {\sigma } \rangle ,r)} $$

      for some \(\sigma ''\) such that \( res ^{\varPhi _{\mathrm {fac}}}({\langle { {\mathtt {m}}:={\mathtt {m}-\mathtt {1}}; {\mathtt {fac}}:={\mathtt {fac}*\mathtt {m}} }, {\sigma } \rangle },{\sigma ''})\) and

      \( res ^{\varPhi _{\mathrm {fac}}}({\langle {S_{\mathrm {wh}}}, {\sigma ''} \rangle },{r})\). From the former we have

      $$ infer ^{\varPhi _{\mathrm {fac}}}({\langle { {\mathtt {m}}:={\mathtt {m}-\mathtt {1}}; {\mathtt {fac}}:={\mathtt {fac}*\mathtt {m}}}, {\sigma } \rangle },{\sigma ''}) $$

      The specification \(\varPhi _{\mathrm {fac}}\) provides no information about the two assignments, \({\mathtt {m}}:={\mathtt {m}-\mathtt {1}}\) and \({\mathtt {fac}}:={\mathtt {fac}*\mathtt {m}}\). Hence, \( infer ^{\varPhi _{\mathrm {fac}}}\) applies to these two individual assignments, and it can be deduced that \(\sigma ''=\sigma [\mathtt {m}\mapsto m-1, \mathtt {fac}\mapsto fac *(m-1)]\). Hence, we have \({\sigma ''}(\mathtt {\mathtt {m}})=m-1>0\). Hence, \(\varPhi _{\mathrm {fac}}(\langle {S_{\mathrm {wh}}}, {\sigma ''} \rangle )= P'_{{{\sigma ''}(\mathtt {\mathtt {m}})},{{\sigma ''}(\mathtt {\mathtt {fac}})}}= \{ \sigma '[\mathtt {fac}\mapsto ( fac *(m-1))*{(m-1-1)}!] \mid \sigma '\in \varSigma \}= P'_{{m},{ fac }}\). Moreover, from \( res ^{\varPhi _{\mathrm {fac}}}({\langle {S_{\mathrm {wh}}}, {\sigma ''} \rangle },{r})\) we have \(r\in P'_{{\sigma ''}(\mathtt {\mathtt {m}}), {\sigma ''}(\mathtt {\mathtt {fac}})}\). Ultimately, we have \(r\in P'_{m, fac }\).

In the other cases, we have \(\varPhi _{\mathrm {fac}}(c)=R\). Hence, it trivially holds that \(r\in \varPhi _{\mathrm {fac}}(c)\) The proof is thus complete.    \(\square \)

The above proof of the factorial program does not require the use of induction. Essentially, the induction required for the loop is already encapsulated in the proof of Theorem 1.

3.4 Comparison with Hoare-Style Program Verification

A Hoare-style specification of the factorial program would be \( \{\mathtt {m}=n\wedge n>0\}~ S_{\mathrm {fac}}~\{\mathtt {fac}={n}!\} \) Here, \(n\) is a logical variable that is used to record the initial value of the program variable \(\mathtt {m}\). This specification corresponds to our definition of \(\varPhi _{\mathrm {fac}}(\langle {S_{\mathrm {fac}}}, {\sigma } \rangle )\) for \({\sigma }(\mathtt {m})>0\). The latter is more verbose for its explicit reference to states. On the other hand, the use of the latter specification spares the efforts to define an assertion language for each specific programming language.

In Hoare logic, the verification of the program can be performed with the loop invariant \( 1\le \mathtt {m} \le n\,\wedge \, \mathtt {fac}=n*(n-1)* \dots * \mathtt {m} \). It captures the condition that is preserved under the effects of a single round of loop. In comparison, the specification \(\varPhi _{\mathrm {fac}}\) features the loop variant \(\varPhi _{\mathrm {fac}}(\langle {S_{\mathrm {wh}}}, {\sigma } \rangle )\) (with \({\sigma }(\mathtt {m})>0\)). It captures the cumulative effects of the loop from the start of any round to the end of the last round. It can be seen that different ways of thinking are required in coming up with the two kinds of specifications. With the proposed technique, the same style as \(\varPhi _{\mathrm {fac}}\) can be used for different programming languages, for both loops and recursive functions, as can be seen in Sect. 4.

In Hoare logic, the reasoning about programs is often performed in a backward fashion. For a statement that is neither a loop nor a function call, a pre-condition is derived from the post-condition based on the logical rule for the statement. For a loop or a function call, the pre-condition is inferred based on the invariant of the loop or the contract of the function. In our technique, the reasoning is performed in a forward fashion. If a specification provides no information about a configuration, a result configuration is derived directly using the semantics. For instance, the result configuration \(\sigma [\mathtt {fac}\mapsto {\sigma }(\mathtt {\mathtt {m}})]\) is derived from the configuration \(\langle {{\mathtt {fac}}:={\mathtt {m}}}, {\sigma } \rangle \) using the semantics in the factorial example. Otherwise, the specification is used to infer the potential result configurations. For instance, the specification \(\varPhi _{\mathrm {fac}}\) is used to infer the potential result configurations for the configuration \(\langle {S_{\mathrm {wh}}}, {\sigma [\mathtt {fac}\mapsto {\sigma }(\mathtt {\mathtt {m}})]} \rangle \) in the factorial example.

In Hoare-style program verification, a loop invariant is justified by assuming that it holds after a round of loop, and showing that it also holds before that round. In our technique, a loop variant is justified by executing one round of loop from each configuration satisfying the pre-condition of the loop variant, and showing that no more result configurations are possible according to the loop variant for each configuration reached after that round (e.g., \(\langle {S_{\mathrm {wh}}}, {\sigma [\mathtt {m}\mapsto m-1, \mathtt {fac}\mapsto fac *(m-1)]} \rangle \) in the factorial example), than for the original configuration (e.g., \(\langle {S_{\mathrm {wh}}}, {\sigma } \rangle \) in the factorial example) before that round.

4 Verification of Iterative and Recursive Programs

In this section, we evaluate our technique with two further examples. In the two examples, programming languages of the imperative and functional paradigms are used, respectively, to implement the functionality of merging two sorted lists of integers into a single sorted list of integers.

4.1 Extended While Language and Array-Merging Program

Extended While Language. The programming language of this section is an extension of the While language. This extension contains the extra features of one-dimensional arrays and functions.

We give the syntax for arithmetic expressions \(a\), Boolean expressions \(b\), and statements \(S\). We then explain the constructs present in the extension only.

$$ \begin{aligned} a~::=~&n\mid x\mid X\mid {X}[{a}] \mid {a}+{a} \mid {a}-{a} \mid {a}*{a} \mid {a}\,/\,{a} \\ b~::=~&\mathsf {true}\mid \mathsf {false}\mid a=a\mid a< a\mid {b}{\, \& \! \& \,}{b} \mid {!{b}} \\ S~::=~&\mathsf {var}~{x} \mid \mathsf {arr}~{X}[{n}] \mid {x}:={a} \mid {{X}[{a}]}:={a} \mid \mathsf {skip}\mid \\&\mathsf {if}~{b}~\mathsf {then}~{S}~\mathsf {else}~{S} \mid \mathsf {while}~{b}~\mathsf {do}~{S} \mid {S};{S} \mid {f}({a,\dots ,a})\rightarrow {[{x,\dots ,x}]} \end{aligned}$$

Here, \(X\) is an array identifier, and \({X}[{a}]\) is the expression used to retrieve the element of the array \(X\) at the index \(a\). In addition, \(\mathsf {var}~{x}\) is the declaration of the variable \(x\), \(\mathsf {arr}~{X}[{n}]\) is the declaration of the array with identifier \(X\) and size \(n\), \({{X}[{a_1}]}:={a_2}\) is an assignment of the result of \(a_2\) to the element of the array \(X\) indexed at \(a_1\), and \({f}({a_1,\dots ,a_m})\rightarrow {[{x_1,\dots ,x_n}]} \) is a call to the function with identifier \(f\) with arguments \(a_1\), ..., \(a_m\) and return variables \(x_1\), ..., \(x_n\). If some argument \(a_i\) is an array, then it is passed by reference in the call. We denote the set of all statements of the extended While language by \( Stmt _{\mathrm {ewh}}\).

A program in the extended While language is a mapping \(\rho \) from each function identifier \(f\) to a triple \(([{w_1,\dots ,w_m}], [{x_1,\dots ,x_n}], S)\) or \(\bot \). Here, each \(w_i\) (\(i\in \{1,\dots ,m\}\)) is a parameter of the function that is either a variable \(x\) or an array \(X\). Each \(x_i\) (\(i\in \{1,\dots ,n\}\)) is a return variable of the function. The \(S\) is the statement of the function. If \(\rho (f)=\bot \), then there is no function defined for the function identifier in the program.

For programs of the extended While language, a state \(\sigma \) is a pair \(({s}, {\iota })\). Here, \(s\in ( Var \cup Arr \rightarrow {\mathbb {Z}}_{\bot })\cup (\mathbb {Z}\rightarrow \mathbb {Z})\) is a store that maps each variable to an optional integer that is the value of the variable, maps each array name to an optional integer representing the starting location of the array, and maps each location to an integer that is the value stored at the location. In addition, \(\iota \in \mathbb {Z}\) is the next fresh location that can be used as the starting location of an array. For \(\sigma =({s}, {\iota })\), we write \(\sigma (a)\) for \(s(a)\). We denote the set of all states by \(\varSigma _{\mathrm {ewh}}\).

For the extended While language, the set \(C\) of configurations is \( Stmt _{\mathrm {ewh}}\times \varSigma _{\mathrm {ewh}}\), and the set \(R\) of result configurations is \(\varSigma _{\mathrm {ewh}}\). For space reasons, we omit the definition of the \( rule \) predicate that captures the big-step semantics of the extended While language. This definition can be found in the extended version of this paper, as well as the formalization in the Coq proof assistant.

Fig. 2.
figure 2

The program \(\rho _{\mathrm {mg}}\) that merges sorted array fragments

Array-Merging Program and Its Verification. The program \(\rho _{\mathrm {mg}}\) as shown in Fig. 2 merges the elements in two sorted fragments of an array \(\mathsf {S}\) into one sorted fragment in a different array \(\mathsf {T}\).

The only function in this program is \(\mathsf {merge}\). Formally, this function is the triple \(([{\mathsf {S},\mathsf {T},\mathtt {i},\mathtt {m},\mathtt {n}}], [\,], S_{\mathrm {mg}})\). The parameters \(\mathtt {i}\) and \(\mathtt {m}\) represent the initial and final index, respectively, for the first fragment of the array \(\mathsf {S}\) participating in the merger. The second fragment participating in the merger is from the index represented by \(\mathtt {m}+1\) to the index represented by \(\mathtt {n}\) in the same array \(\mathsf {S}\). The target array fragment of the merger is from the index represented by \(\mathtt {i}\) to the index represented by \(\mathtt {n}\), in the array \(\mathsf {T}\).

For the specification of the program, we use a few pieces of auxiliary notation. We write \({X}^{h}_{l}\) for a triple \((X,l,h)\) that represents the fragment of the array \(X\) from the index l to the index h. We write \((\!|{{X}^{h}_{l}}|\!)_{\sigma }\) for the list \([{\sigma (\ell +l), \dots , \sigma (\ell +h)}]\) where \(\ell =\sigma (X)\), i.e., the list of elements of the array \(X\) from the index l to the index h. We write \( occ \,{[{z_1,\dots ,z_n}]}\) for the function \(h\) mapping each integer \(z\) to the number of occurrences of \(z\) in the list \([{z_1,\dots ,z_n}]\) of integers. For two such functions \(h_1\) and \(h_2\), we write \({h_1}\oplus {h_2}\) for the function \(\lambda z. h_1(z)+h_2(z)\). We write \( sorted \,{[{z_1,\dots ,z_n}]}\) to express that the list \([{z_1,\dots ,z_n}]\) of integers is sorted in ascending order. We write \( sep ({{X}^{h_1}_{l_1}},{{Y}^{h_2}_{l_2}},{\sigma })\) to express that the elements of the array X from the index \(l_1\) to the index \(h_1\) occupy a separate memory area from that occupied by the elements of the array Y from the index \(l_2\) to the index \(h_2\), in the state \(\sigma \). In addition, we write \([{{u_1,\dots ,u_n}}]^{\sigma '}_{\sigma }\) to express for each \(i\in \{1,\dots ,n\}\), the value of each \(u_i\) is the same in the states \(\sigma \) and \(\sigma '\). Here, \(u_i\) can be a variable \(x\) or an array fragment \({X}^{h}_{l}\). In the latter case, that the value of \({X}^{h}_{l}\) is the same in the two states means \(\forall i: l \le i \le h\Rightarrow \sigma (\sigma (X)+i)= \sigma '(\sigma '(X)+i)\).

For the program \(\rho _{\mathrm {mg}}\), we devise the specification \(\varPhi _{\mathrm {mga}}\). We denote the starting index for the first source array fragment in \(\mathsf {S}\) as well as for the target array fragment in \(\mathsf {T}\) by \(l\). We use \(l\) as a global parameter in the specification.

We specify the function \(\mathsf {merge}\) as

$$\begin{aligned}&\varPhi _{\mathrm {mga}}( \langle {{\mathsf {merge}}({X,Y,a_{\mathrm {l}},a_{\mathrm {m}},a_{\mathrm {h}}})\rightarrow {[{}]}},{\sigma } \rangle _{\rho _{\mathrm {ms}}} ) \,{:=}\, \\&\quad \begin{aligned}&\{ \sigma ' \mid occ \,{(\!|{{X}^{h}_{l}}|\!)_{\sigma }} = occ \,{(\!|{{Y}^{h}_{l}}|\!)_{\sigma '}} \,\wedge \, sorted \,{(\!|{{Y}^{h}_{l}}|\!)_{\sigma '}} \, \} \\\\[-2.5ex]&\mathrm {if}~ \mathcal {A}\llbracket {a_{\mathrm {l}}}\rrbracket \sigma \!=\! l\,\wedge \, 0\!\le \! l\! \le \! m\!<\! h\,\wedge \, sorted \,{ (\!|{{X}^{m}_{l}}|\!)_{\sigma }} \wedge sorted \,{ (\!|{{X}^{h}_{m+1}}|\!)_{\sigma }} \wedge sep ({{X}^{h}_{l}},{{Y}^{h}_{l}},{\sigma }) \\&\mathrm {where}~ m=\mathcal {A}\llbracket {a_{\mathrm {m}}}\rrbracket \sigma \,\wedge \, h=\mathcal {A}\llbracket {a_{\mathrm {h}}}\rrbracket \sigma \end{aligned} \end{aligned}$$
$$\begin{aligned}&\varPhi _{\mathrm {mga}}( \langle {{\mathsf {merge}}({X,Y,a_{\mathrm {l}},a_{\mathrm {m}},a_{\mathrm {h}}})\rightarrow {[{}]}},{\sigma } \rangle _{\rho _{\mathrm {ms}}} ) \,{:=}\, \\&\quad \begin{aligned}&\{ \sigma ' \mid occ \,{(\!|{{X}^{h}_{l}}|\!)_{\sigma }} = occ \,{(\!|{{Y}^{h}_{l}}|\!)_{\sigma '}} \,\wedge \, sorted \,{(\!|{{Y}^{h}_{l}}|\!)_{\sigma '}} \, \} \\\\&\mathrm {if}~ \mathcal {A}\llbracket {a_{\mathrm {l}}}\rrbracket \sigma \!=\! l\,\wedge \, 0\!\le \! l\! \le \! m\!<\! h\,\wedge \, sorted \,{ (\!|{{X}^{m}_{l}}|\!)_{\sigma }} \wedge sorted \,{ (\!|{{X}^{h}_{m+1}}|\!)_{\sigma }} \wedge sep ({{X}^{h}_{l}},{{Y}^{h}_{l}},{\sigma }) \\&\mathrm {where}~ m=\mathcal {A}\llbracket {a_{\mathrm {m}}}\rrbracket \sigma \,\wedge \, h=\mathcal {A}\llbracket {a_{\mathrm {h}}}\rrbracket \sigma \end{aligned} \end{aligned}$$

This specification says that if we call the function \(\mathsf {merge}\) with two array identifiers \(X\) and \(Y\), and expressions \(a_{\mathrm {l}}\), \(a_{\mathrm {m}}\), \(a_{\mathrm {h}}\) that evaluate to \(l\), \(m\) and \(h\), such that

  • \(0\le l\le m<h\) holds,

  • the array fragments \({X}^{m}_{l}\) and \({X}^{h}_{m+1}\) are sorted in the pre-state,

  • the array fragments \({X}^{m}_{l}\) and \({X}^{h}_{m+1}\) are separated in the pre-state,

then the number of occurrences of each integer in the target array fragment \({Y}^{h}_{l}\) in the post-state is the same as its number of occurrences in the source array fragment \({X}^{h}_{l}\) in the pre-state, and the target array fragment \({Y}^{h}_{l}\) is sorted in ascending order in the post-state.

The core part of the function \(\mathsf {merge}\) is the loop statement \(S_{\mathrm {wh}}\) (see Fig. 2). We specify this loop as

$$\begin{aligned}&\varPhi _{\mathrm {mga}}(\langle {S_{\mathrm {wh}}},{\sigma } \rangle _{\rho _{\mathrm {ms}}}) \,{:=}\, \\&\quad \begin{aligned}&\{ \sigma ' \mid (i \le \sigma '(\mathtt {i})=m+1 \wedge j\le \sigma '(\mathtt {j}) \le n\vee j \le \sigma '(\mathtt {j})=n+1 \wedge i\le \sigma '(\mathtt {i}) \le m) \,\wedge \, \\&\qquad ~ \sigma '(\mathtt {k})= k+ \sigma '(\mathtt {i}) - i+ \sigma '(\mathtt {j}) - j\,\wedge \, [{{\mathtt {m}, \mathtt {n}, \mathsf {S}, \mathsf {T}, {\mathsf {S}}^{n}_{l}, {\mathsf {T}}^{k-1}_{l}}}]^{\sigma '}_{\sigma } \,\wedge \, \\&\qquad ~ { occ \,{(\!|{{\mathsf {S}}^{\sigma '(\mathtt {i})-1}_{i}}|\!)_{\sigma }} }\oplus { occ \,{(\!|{{\mathsf {S}}^{\sigma '(\mathtt {j})-1}_{j}}|\!)_{\sigma }} }= occ \,{(\!|{{\mathsf {T}}^{\sigma '(\mathtt {k})-1}_{k}}|\!)_{\sigma '}} \,\wedge \, sorted \,{ (\!|{{\mathsf {T}}^{\sigma '(\mathtt {k})-1}_{l}}|\!)_{\sigma '} } \,\wedge \, \\&\qquad ~ ( \sigma '(\mathtt {i})\le m \wedge \sigma '(\mathtt {k})\ge l+1 \Rightarrow \mathcal {A}\llbracket {{\mathsf {S}}[{\mathtt {i}}]}\rrbracket \sigma ' \!\ge \! \mathcal {A}\llbracket {{\mathsf {T}}[{\mathtt {k}\!-\!1}]}\rrbracket \sigma ') \,\wedge \, \\&\qquad ~ ( \sigma '(\mathtt {j})\le n \wedge \sigma '(\mathtt {k}) \ge l+1 \Rightarrow \mathcal {A}\llbracket {{\mathsf {S}}[{\mathtt {j}}]}\rrbracket \sigma ' \!\ge \! \mathcal {A}\llbracket {{\mathsf {T}}[{\mathtt {k}\!-\!1}]}\rrbracket \sigma ') \, \} \\\\[-2.5ex]&\mathrm {if}~0\le l\le i\le m< j\le n\,\wedge \, k = i+j-m-1 \,\wedge \, \\&\quad (k \ge l+1 \,\Rightarrow \, \mathcal {A}\llbracket {{\mathsf {S}}[{\mathtt {i}}]}\rrbracket \sigma \!\ge \! \mathcal {A}\llbracket {{\mathsf {T}}[{\mathtt {k}\!-\!1}]}\rrbracket \sigma \wedge \mathcal {A}\llbracket {{\mathsf {S}}[{\mathtt {j}}]}\rrbracket \sigma \!\ge \! \mathcal {A}\llbracket {{\mathsf {T}}[{\mathtt {k}\!-\!1}]}\rrbracket \sigma )\,\wedge \, \\&\quad sorted \,{(\!|{{\mathsf {S}}^{m}_{i}}|\!)_{\sigma }} \,\wedge \, sorted \,{(\!|{{\mathsf {S}}^{n}_{j}}|\!)_{\sigma }} \,\wedge \, sorted \,{(\!|{{\mathsf {T}}^{k-1}_{l}}|\!)_{\sigma }} \,\wedge \, sep ({{\mathsf {S}}^{n}_{l}},{{\mathsf {T}}^{n}_{l}},{\sigma }) \\&\mathrm {where}~ i= \sigma (\mathtt {i})\,\wedge \, j= \sigma (\mathtt {j})\,\wedge \, k= \sigma (\mathtt {k})\,\wedge \, m= \sigma (\mathtt {m})\,\wedge \, n= \sigma (\mathtt {n}) \end{aligned} \end{aligned}$$
$$\begin{aligned}&\varPhi _{\mathrm {mga}}(\langle {S_{\mathrm {wh}}},{\sigma } \rangle _{\rho _{\mathrm {ms}}}) \,{:=}\, \\&\quad \begin{aligned}&\{ \sigma ' \mid (i \le \sigma '(\mathtt {i})=m+1 \wedge j\le \sigma '(\mathtt {j}) \le n\vee j \le \sigma '(\mathtt {j})=n+1 \wedge i\le \sigma '(\mathtt {i}) \le m) \,\wedge \, \\&\qquad ~ \sigma '(\mathtt {k})= k+ \sigma '(\mathtt {i}) - i+ \sigma '(\mathtt {j}) - j\,\wedge \, [{{\mathtt {m}, \mathtt {n}, \mathsf {S}, \mathsf {T}, {\mathsf {S}}^{n}_{l}, {\mathsf {T}}^{k-1}_{l}}}]^{\sigma '}_{\sigma } \,\wedge \, \\&\qquad ~ { occ \,{(\!|{{\mathsf {S}}^{\sigma '(\mathtt {i})-1}_{i}}|\!)_{\sigma }} }\oplus { occ \,{(\!|{{\mathsf {S}}^{\sigma '(\mathtt {j})-1}_{j}}|\!)_{\sigma }} }= occ \,{(\!|{{\mathsf {T}}^{\sigma '(\mathtt {k})-1}_{k}}|\!)_{\sigma '}} \,\wedge \, sorted \,{ (\!|{{\mathsf {T}}^{\sigma '(\mathtt {k})-1}_{l}}|\!)_{\sigma '} } \,\wedge \, \\&\qquad ~ ( \sigma '(\mathtt {i})\le m \wedge \sigma '(\mathtt {k})\ge l+1 \Rightarrow \mathcal {A}\llbracket {{\mathsf {S}}[{\mathtt {i}}]}\rrbracket \sigma ' \!\ge \! \mathcal {A}\llbracket {{\mathsf {T}}[{\mathtt {k}\!-\!1}]}\rrbracket \sigma ') \,\wedge \, \\&\qquad ~ ( \sigma '(\mathtt {j})\le n \wedge \sigma '(\mathtt {k}) \ge l+1 \Rightarrow \mathcal {A}\llbracket {{\mathsf {S}}[{\mathtt {j}}]}\rrbracket \sigma ' \!\ge \! \mathcal {A}\llbracket {{\mathsf {T}}[{\mathtt {k}\!-\!1}]}\rrbracket \sigma ') \, \} \\\\&\mathrm {if}~0\le l\le i\le m< j\le n\,\wedge \, k = i+j-m-1 \,\wedge \, \\&\quad (k \ge l+1 \,\Rightarrow \, \mathcal {A}\llbracket {{\mathsf {S}}[{\mathtt {i}}]}\rrbracket \sigma \!\ge \! \mathcal {A}\llbracket {{\mathsf {T}}[{\mathtt {k}\!-\!1}]}\rrbracket \sigma \wedge \mathcal {A}\llbracket {{\mathsf {S}}[{\mathtt {j}}]}\rrbracket \sigma \!\ge \! \mathcal {A}\llbracket {{\mathsf {T}}[{\mathtt {k}\!-\!1}]}\rrbracket \sigma )\,\wedge \, \\&\quad sorted \,{(\!|{{\mathsf {S}}^{m}_{i}}|\!)_{\sigma }} \,\wedge \, sorted \,{(\!|{{\mathsf {S}}^{n}_{j}}|\!)_{\sigma }} \,\wedge \, sorted \,{(\!|{{\mathsf {T}}^{k-1}_{l}}|\!)_{\sigma }} \,\wedge \, sep ({{\mathsf {S}}^{n}_{l}},{{\mathsf {T}}^{n}_{l}},{\sigma }) \\&\mathrm {where}~ i= \sigma (\mathtt {i})\,\wedge \, j= \sigma (\mathtt {j})\,\wedge \, k= \sigma (\mathtt {k})\,\wedge \, m= \sigma (\mathtt {m})\,\wedge \, n= \sigma (\mathtt {n}) \end{aligned} \end{aligned}$$

In the specification, we are concerned with pre-states in which either the overall loop is yet to be executed, or some rounds of the loop have been completed and some further rounds are to be executed. We constrain these pre-states with a few further conditions. One of these conditions states that the elements with indexes \(\mathtt {i}\) and \(\mathtt {j}\) that are to be compared in the next round are both greater than or equal to the last element that has been set in the target array fragment. For each pre-state that satisfies all the conditions in the “if” part, several conditions are asserted for the potential post-state \(\sigma '\). A key condition here says that the two fragments \({\mathsf {S}}^{\sigma '(\mathtt {i})-1}_{i}\) and \({\mathsf {S}}^{\sigma '(\mathtt {j})-1}_{j}\) in the source array that are scanned between the reaching of the pre-state and the post-state agree with the fragment \({\mathsf {T}}^{\sigma '(\mathtt {k})-1}_{k}\) that is filled between the reaching of the pre-state and the post-state. Another key condition says that the fragment \({\mathsf {T}}^{\sigma '(\mathtt {k})-1}_{l}\) of the target array that is already filled in the post-state for the loop is sorted in ascending order.

Without specification inference, the two remaining loops in the array-merging program also need to be explicitly specified. The specification of these two loops is much less involved than that for the first loop, and it is omitted here. With the technique of Sect. 2, the validity of \(\varPhi _{\mathrm {mga}}\) can be established.

Theorem 2

It holds that \( valid ({\varPhi _{\mathrm {mga}}})\).

With the help of Theorem 1, the proof requires no induction for reasoning about the loops. This proof boils down to symbolic execution with the help of a series of auxiliary lemmas about the memory layout.

Remark 2

The global parameter \(l\) in the specification \(\varPhi _{\mathrm {mga}}\) relates the auxiliary information about calls to \(\mathsf {merge}\) and about the loops in this function. The role of \(l\) can be compared to that of a logical variable in a concrete program logic. Such global parameters are captured in the Coq formalization by an explicit argument in the specifications. The type of this argument can be instantiated according to the needs in verifying each specific program. The verification of a program is required to go through for all possible values of this argument.

4.2 Eager Functional Language and List-Merging Program

Eager Functional Language. The language considered in this section is a fragment of the eager functional language as discussed in [33].

Fig. 3.
figure 3

The expressions and canonical forms of the eager functional language

A program of the eager functional language is an expression. The syntax for expressions is given in the left part of Fig. 3. Here, \(n\) is a numeral, \(x\) is a variable, \({e}\,{e'}\) is an application, \(\lambda {x}.{e}\) is a lambda abstraction, \(\mathsf {nil}\) is the empty list, and \({e_1}::{e_2}\) is the list obtained by prefixing the list \(e_2\) with the element \(e_1\). The expression \(\mathsf {listcase}~{e}~\mathsf {of}~({e'},{e''})\) branches to \(e'\) or \(e''\) depending on whether the result of \(e\) is the empty list \(\mathsf {nil}\). The expression \(\mathsf {letrec}~{x}={\lambda {x'}.{e'}}~\mathsf {in}~{e}\) binds \(x\) to \(\lambda {x'}.{e'}\) in \(e\). This expression allows \(x\) to be used in \(e'\), thereby allowing recursion. We denote the set of all expressions by \( Expr \).

The evaluation of expressions results in canonical forms \( cf \) as given in the right part of Fig. 3. A canonical form \( cf \) can be a canonical form \( icf \) for integers, a canonical form \( bcf \) for Boolean values, a canonical form \( fcf \) for functions, or a canonical form \( lcf \) for lists. We denote the set of all canonical forms by \( Cf \).

For the eager functional language, the set \(C\) of configurations is \( Expr \), and the set \(R\) of result configurations is \( Cf \). For space reasons, we omit the definition of the \( rule \) predicate that captures the big-step semantics of the eager functional language. This definition can be found in the extended version of this paper, as well as the formalization in the Coq proof assistant.

List-Merging Program and Its Verification. The program \(e_{\mathrm {mg}}( lcf _{\!1}, lcf _{\!2})\) below merges two sorted lists, \( lcf _{\!1}\) and \( lcf _{\!2}\), into a single sorted list. More concretely, the variable \(\mathsf {merge}\) is bound to the expression \(\lambda \mathtt {x}.\lambda \mathtt {x'}.e_{\mathrm {lcase}}\) that destructs the lists that are bound to \(\mathtt {x}\) and \(\mathtt {x'}\), respectively. In case one of the lists is empty, the result of the merger is the other list. Otherwise, the result of the merger is obtained by prefixing the smaller head element of the two given lists over the merging result of the remaining parts of the lists.

$$\begin{aligned} e_{\mathrm {mg}}( lcf _{\!1}, lcf _{\!2}) \,{:=}\,&\mathsf {letrec}~ \mathtt {merge}= (\lambda \mathtt {x}.\lambda \mathtt {x'}.e_{\mathrm {lcase}})~ \mathsf {in}~ {{\mathtt {merge}}\,{ lcf _{\!1}}}\,{ lcf _{\!2}} \\ e_{\mathrm {lcase}}\,{:=}\,&\mathsf {listcase}~\mathtt {x}~\mathsf {of}~ (\mathtt {x'}, \lambda \mathtt {i}. \lambda \mathtt {r}. \mathsf {listcase}~\mathtt {x'}~\mathsf {of}~ (\mathtt {x}, \lambda \mathtt {i'}. \lambda \mathtt {r'}. e_{\mathrm {if}})) \\ e_{\mathrm {if}}\,{:=}\,&\mathsf {if}~{\mathtt {i}\le \mathtt {i'}}~\mathsf {then}~{{\mathtt {i}}::{{{\mathtt {merge}}\,{\mathtt {r}}}\,{\mathtt {x'}}}}~\mathsf {else}~{{\mathtt {i}'}::{{{\mathtt {merge}}\,{\mathtt {x}}}\,{\mathtt {r'}}}} \end{aligned}$$

To develop a specification for the list-merging program, we define a piece of auxiliary notation. We write \(\langle \!|{ lcf }|\!\rangle \) for the mathematical list of integers represented by the canonical form \( lcf \) for lists. Formally, we define \(\langle \!|{\mathsf {nil}}|\!\rangle {:=}[\,]\), \(\langle \!|{{ icf }::{ lcf }}|\!\rangle \) \({:=}{ icf }::{ zs }\) if \( zs = \langle \!|{ lcf }|\!\rangle \,\wedge \, zs \in \mathbb {Z}^{*}\), and \(\langle \!|{ lcf }|\!\rangle {:=}\bot \) otherwise.

We devise the a specification for the list-merging program, \(\varPhi _{\mathrm {mgl}}\). Using the function \( occ \) and the predicate \( sorted \) introduced in Sect. 4.1, we specify the expression \(e_{\mathrm {mg}}( lcf _{\!1}, lcf _{\!2})\) as

$$\begin{aligned}&\begin{aligned}&\varPhi _{\mathrm {mgl}}(e_{\mathrm {mg}}( lcf _{\!1}, lcf _{\!2})) \,{:=}\, \\&\qquad \{ lcf \mid \exists zs \in \mathbb {Z}^*\!: zs = \langle \!|{ lcf }|\!\rangle \,\wedge occ \,{ zs }= { occ \,{ zs _1}}\oplus { occ \,{ zs _2}}\,\wedge \, sorted \,{ zs } \} \end{aligned} \\&\mathrm {if}~ zs _1\in \mathbb {Z}^*\,\wedge \, zs _2\in \mathbb {Z}^*\,\wedge \, sorted \,{ zs _1}\,\wedge \, sorted \,{ zs _2} \\&\mathrm {where}~ zs _1=\langle \!|{ lcf _{\!1}}|\!\rangle \,\wedge \, zs _2=\langle \!|{ lcf _{\!2}}|\!\rangle \end{aligned}$$

This specification says that given list canonical forms \( lcf _{\!1}\) and \( lcf _{\!2}\) that are both sorted in ascending order, the result of executing \(e_{\mathrm {mg}}( lcf _{\!1}, lcf _{\!2})\) is a list canonical form \( lcf \). The list canonical form \( lcf \) contains the elements as contained in either \( lcf _{\!1}\) or \( lcf _{\!2}\). Furthermore, the list canonical form \( lcf \) is sorted in ascending order.

To support the verification of the specification for \(e_{\mathrm {mg}}( lcf _{\!1}, lcf _{\!2})\), we specify an unfolded form of this expression. The execution of this unfolded form either terminates directly, or gives the same form again.

$$\begin{aligned}&\begin{aligned}&\varPhi _{\mathrm {mgl}}(\, (\lambda {\mathtt {x}}.{\mathsf {letrec}~{\mathsf {merge}}={\lambda {\mathtt {x}}.{\lambda {\mathtt {x'}}.{e_{\mathrm {lcase}}}}}~\mathsf {in}~{\lambda {\mathtt {x'}}.{e_{\mathrm {lcase}}}}}) \,\, lcf _{\!1}\, lcf _{\!2}\,) \, {:=}\, \\&\qquad \{ lcf \mid \exists zs \in \mathbb {Z}^*\!: zs = \langle \!|{ lcf }|\!\rangle \,\wedge \, occ \,{ zs }= { occ \,{ lcf _{\!1}}}\oplus { occ \,{ lcf _{\!2}}}\,\wedge \, sorted \,{ zs } \} \end{aligned} \\&\mathrm {if}~ zs _1\in \mathbb {Z}^*\,\wedge \, zs _2\in \mathbb {Z}^*\,\wedge \, sorted \,{ zs _1}\,\wedge \, sorted \,{ zs _2} \\&\mathrm {where}~ zs _1=\langle \!|{ lcf _{\!1}}|\!\rangle \,\wedge \, zs _2=\langle \!|{ lcf _{\!2}}|\!\rangle \end{aligned}$$

This specification reflects that the unfolded expression \((\lambda \mathtt {x}. \mathsf {letrec}~\mathsf {merge}\,=\,\lambda \mathtt {x}. \lambda \mathtt {x'}.\) \(e_{\mathrm {lcase}}\,\mathsf {in}\, \lambda \mathtt {x'}.e_{\mathrm {lcase}}) \,\, lcf _{\!1}\, lcf _{\!2}\) delivers analogous guarantees to those delivered by the original expression \(e_{\mathrm {mg}}( lcf _{\!1}, lcf _{\!2})\).

With the technique of Sect. 2, the validity of \(\varPhi _{\mathrm {mgl}}\) can be established.

Theorem 3

It holds that \( valid ({\varPhi _{\mathrm {mgl}}})\).

With the help of Theorem 1, the proof requires no induction for reasoning about the recursive applications of the function bound to \(\mathsf {merge}\). This proof boils down to symbolic execution with the help of a few auxiliary lemmas about substitution and evaluation related to canonical forms.

Remark 3

It might appear that the auxiliary information needed for the verification of the list-merging program should be for expressions of the form \(\mathsf {merge}~{\_}~{\_}\). However, these expressions cannot be evaluated, because information about the actual function bound to \(\mathsf {merge}\) is missing. The form that recurs in the evaluation of \(e_{\mathrm {mg}}( lcf _{\!1}, lcf _{\!2})\) is actually \((\lambda {\mathtt {x}}.{\mathsf {letrec}~{\mathsf {merge}}={\lambda {\mathtt {x}}.{\lambda {\mathtt {x'}}.{e_{\mathrm {lcase}}}}}~\mathsf {in}~{\lambda {\mathtt {x'}}.{e_{\mathrm {lcase}}}}}) \,\, {\_}\,\, {\_}\).

5 On Completeness of the Technique

It is untrue that any valid specification can be verified. Intuitively, a specification \(\varPhi \) that is valid but missing the necessary auxiliary information such as loop variants might not be verifiable.

Consider the factorial example in Sect. 3, and the specification \(\varPhi _{\mathrm {fac}}'\) that is the same as \(\varPhi _{\mathrm {fac}}\) except that \(\varPhi _{\mathrm {fac}}'\) maps \(\langle {S_{\mathrm {wh}}}, {\sigma } \rangle \) where \(\sigma (\mathtt {m})>0\) to \(\varSigma \). The specification \(\varPhi _{\mathrm {fac}}'\) is valid as the specification \(\varPhi _{\mathrm {fac}}\) is. This is because \(\varPhi _{\mathrm {fac}}'\) is a loosened version of \(\varPhi _{\mathrm {fac}}\). However, \(\varPhi _{\mathrm {fac}}'\) cannot be verified using our proposed technique. Due to missing auxiliary information, the verification procedure leads to a non-terminating symbolic execution of the factorial program.

In the following, we show that for a given specification that is valid, there is always a more informative specification \(\varPhi '\) than \(\varPhi \) that is verifiable. Formally, a specification \(\varPhi '\) is at least as informative as a specification \(\varPhi \), as denoted by \({\varPhi }\preceq {\varPhi '}\), if for each configuration \(c\), it holds that \(\varPhi (c)\supseteq \varPhi '(c)\).

The lemma below says the specification mapping each configuration to the set of all the semantically derivable result configurations can be verified.

Lemma 2

Let \(\varPhi _\star \) be the specification satisfying \(\varPhi _\star (c)=\{r\mid deriv ({c, r})\}\) for all configurations \(c\). Then, \( verif ({\varPhi _\star })\) can be established.

Proof

We show that for all \(c\) and \(r\), if \( infer ^{\varPhi _\star }({c},{r})\), then \(r\in \varPhi _\star (c)\). This boils down to showing if \( infer ^{\varPhi _\star }({c},{r})\), then \( deriv ({c,r})\). Below, we give an inductive proof of this statement.

Assume \( infer ^{\varPhi _\star }({c},{r})\). Then, there exist some m, \(c_1\), ..., \(c_m\), \(r_1\), ..., \(r_m\), such that \( res ^{\varPhi _\star }({c_1},{r_1})\), ..., \( res ^{\varPhi _\star }({c_m},{r_m})\), and

$$\begin{aligned} rule ~{[ ({c_1},{r_1}), \dots , ({c_m},{r_m}) ]}~{({c},{r})} \end{aligned}$$
(2)

For each i, we show that \( deriv ({c_i,r_i})\) holds by distinguishing between the cases where \(\varPhi _\star (c_i)=R\) and \(\varPhi _\star (c_i)\ne R\).

  • Suppose \(\varPhi _\star (c_i)=R\). Then we deduce \( infer ^{\varPhi _\star }({c_i},{r_i})\) from \( res ^{\varPhi _\star }({c_i},{r_i})\). Hence, we have \( deriv ({c_i,r_i})\) from the induction hypothesis.

  • Suppose \(\varPhi _\star (c_i)\ne R\). We have \(r_i\in \varPhi _\star (c_i)\) using \( res ^{\varPhi _\star }({c_i},{r_i})\). Hence, we have \( deriv ({c_i,r_i})\) using the definition of \(\varPhi _\star \).

Ultimately, we have \( deriv ({c_i,r_i})\) for each \(i\in \{1,\dots ,m\}\), and we obtain \( deriv ({c,r})\) using (2). This completes the proof.    \(\square \)

The following theorem says that for each valid specification \(\varPhi \), there is a specification that is at least as informative as \(\varPhi \), and that can be verified.

Theorem 4 (Relative Completeness)

For each valid specification \(\varPhi \), there exists a specification \(\varPhi '\) such that \({\varPhi }\preceq {\varPhi '}\), and \( verif ({\varPhi '})\) can be established.

Proof

We first show that the specification \(\varPhi _\star \) in Lemma 2 is at least as informative as any valid specification. Let \(\varPhi \) be a specification satisfying \( valid ({\varPhi })\). Let \(c\) be an arbitrary configuration. Let \(r\) be any result configuration satisfying \(r\in \varPhi _\star (c)\). We have \( deriv ({c,r})\) from the definition of \(\varPhi _\star \). Hence, we have \(r\in \varPhi (c)\) because of \( valid ({\varPhi })\). Hence, \(\varPhi _\star (c)\subseteq \varPhi (c)\) holds. Hence, we have \({\varPhi }\preceq {\varPhi _\star }\). Moreover, we have \( verif ({\varPhi _\star })\) using Lemma 2. This completes the proof.    \(\square \)

If the program contained in a configuration exhibits only bounded behavior, then the corresponding result configuration can be obtained through symbolic execution. Hence, it is not necessary that a verifiable specification should cover these configurations. In an informal sense, this argument supports that for a specification to be verified, it is only necessary to provide auxiliary information about constructs such as loops and recursive function calls in the specification.

6 Related Work

Inductive invariants [24] are well-studied means to sound program verification directly based on operational execution models. An inductive invariant needs to be preserved by all the possible atomic steps that can be taken in the execution of the target program. This requirement often leads to difficulties in identifying the exact condition that qualifies as an inductive invariant, and that enables the verification of the target program.

In [26], a method is proposed to generate inductive invariants from inductive assertions. The method is based on a small-step execution relation. Minimal information about the syntactical structure of the programming language is required in the generation of the inductive invariants. In comparison, our technique targets big-step operational semantics, and its soundness does not rely on the reduction of the verification problem to the generation of inductive invariants.

In [25], a technique is proposed to generate sound program verifiers based on existing formalizations of small-step semantics in proof assistants. The soundness of the technique is established with a coinductive argument. In comparison, our technique targets big-step operational semantics, and is based on inductive reasoning. Nevertheless, we are inspired by this work in the style of language-independent program specifications and the form of completeness statements.

In [40], a language-independent verification technique based on reachability logics and semantics formulated in rewriting systems is introduced. In comparison, our technique can only be used for big-step semantics. However, our technique can be used with semantic definitions using inductive predicates in a proof assistant, and requires only the logical foundation of the proof assistant to function. Our technique also has a succinct, inductive argument for soundness.

Several developments provide means to systematically derive abstract semantics from concrete semantics such as big-step operational semantics and its variants [12, 13, 35]. Among these, [12] proposes a language-independent notion of skeletal semantics that can be instantiated to obtain concrete and abstract semantic interpretations. However, the emphasis of these developments is in obtaining automated static analyses of programs, rather than in exploiting user-provided specification in the deductive verification of deep correctness properties.

To some extent, language-independent program verification can also be supported by encoding the target languages or target programs in the same language (e.g., WhyML, Boogie, etc.) or calculus (e.g., CSP, the \(\pi \)-calculus, etc.) supporting verification. This encoding can be considerably more light-weight than the direct formalization of the syntax and semantics of the source language. However, when the features of the source language are sufficiently complicated, it can be highly non-trivial to justify the encoding.

In Unifying Theories of Programming [14, 19, 22, 29, 32, 38], the semantics of programming constructs (e.g., assignment, conditional, sequential composition, parallel composition, etc.) involved in diverse languages is formulated in a relational calculus. The connection between different kinds of semantics – algebraic semantics, denotational semantics, and operational semantics – is investigated. In comparison, we study the verification of programs based on a common model of big-step opperational semantics. We do not look at concrete programming constructs, or investigate the connection between different types of semantics.

7 Future Directions

Reuse of Existing Formalization of Semantics. For the related language-independent verification techniques based on small-step operational semantics [25, 26], it is not difficult to obtain a verification infrastructure by directly reusing an existing formalization of semantics. This is because a small-step semantics readily provides a step relation that can be used to interface with the verification framework. In comparison, we have only shown that our language-independent verification technique can be applied after the big-step semantics of the target language is formalized via a predicate that explicitly captures the premises and conclusions of the semantic rules. Although the big-step semantics formulated using this predicate closely resemble their classical formulation, it is desirable if a higher level of reusability can be enabled. A potential solution is to construct a program that automatically transforms a formalization of big-step semantics into a formulation with the \( rule \) predicate. Such transformation could be attempted using the MetaCoq framework [39] to achieve seamless integration with the Coq proof assistant.

Integration of Techniques for Other Aspects of Deductive Verification. The purpose of the present work is not to simplify the overall task of deductive program verification beyond achievable by existing techniques. Instead, the focus has been the ability to reason about different types of programming constructs that potentially cause unbounded behavior, in a uniform way. This ability helps simplify the reasoning about these programming constructs, relative to direct inductive proofs based on big-step operational semantics. To construct a full-fledged language-independent program verifier in a proof assistant, effective treatment of other aspects of deductive program verification (e.g., memory layout, mathematical reasoning in diverse problem domains, etc.) is required. In principle, it is desirable to deal with the language-generic and language specific aspects of program verification separately (as advocated in UTP [19]). Concretely, existing formalization of program logics and mathematical theories in proof assistants are expected to provide the essential technical ingredients for simplifying the remaining aspects of verification tasks.

8 Conclusion

To tackle the problem caused by the proliferation of programming languages in deductive program verification, we provide a language-independent verification technique that addresses the cross-cutting concern of reasoning about programming constructs potentially causing unbounded behavior. Typically, these constructs include loops and recursive functions in different forms. The proposed technique can be applied to any programming language with a big-step operational semantics. The user of this technique need not set up inductions for the loops and recursive calls in performing a program proof, but performs symbolic execution of the program based on the big-step semantics, and with the help of a specification containing auxiliary information about these constructs. The technique admits succinct, inductive arguments for soundness and relative completeness that are verified in the Coq proof assistant along with other formal claims [8]. It has been illustrated with verification examples targeting languages of different paradigms. It provides a basis for a language-independent program verifier based on big-step operational semantics in proof assistants.