1 Introduction

We are interested in the problem of safety of recursive programs, i.e., deciding whether an assertion at a program location always holds. The first step in Software Model Checking is to approximate the input program by a program model where the program operations are terms in a first-order theory \(\mathcal {D}\). Many program models exist today, e.g., Boolean programs [1] of SLAM [2], Goto programs of CBMC [3], BoogiePL of Boogie [4], and, indirectly, internal representations of many tools such as UFO [5], HSF [6], etc. Given a safety property and a program model over \(\mathcal {D}\), it is possible to analyze bounded executions using an oracle for Satisfiability Modulo Theories (SMT) for \(\mathcal {D}\). However, in the presence of (unbounded) recursion, safety is undecidable in general.

Throughout this paper, we assume that procedures cannot be passed as parameters. There exist several program models where safety is efficiently decidable (with the stated assumption [7]), e.g., Boolean programs with (unbounded) recursion and the unbounded use of stack [1, 8]. The general observation behind these algorithms is that one can summarize the input-output behavior of a procedure, where a summary of a procedure is an input-output relation describing what is currently known about its behavior. Thus, if a summary has enough details, it can be used to analyze a procedure call without considering the procedure body of the callee [9, 10]. For a Boolean program, the number of states is finite and hence, a summary can only be updated finitely many times. This observation led to a number of efficient algorithms that are polynomial in the number of states, e.g., the analysis framework by Reps et al. (RHS) [8], recursive state machines [11], and symbolic BDD-based algorithms of Bebop [1] and Moped [12]. When safety is undecidable (e.g., when \(\mathcal {D}\) is Linear Rational Arithmetic (LRA) or Linear Integer Arithmetic (LIA)), several existing software model checkers work by iteratively obtaining Boolean program abstractions using Predicate Abstraction [2, 13]. In this paper, we are concerned with alternative approaches that work directly on the original program model without an explicit step of Boolean abstraction. Despite the undecidability, we are interested in proving safety for many programs in practice with a guarantee of finding a counterexample when the program is unsafe.

Several algorithms have been recently proposed for verifying recursive programs without Predicate Abstraction. Notable examples are Whale [14], HSF [6], GPDR [15], Ultimate Automizer [16, 17] and Duality [18]. With the exception of GPDR, these algorithms are based on a combination of Bounded Model Checking (BMC) [19] and Craig Interpolation [20]. First, they use an SMT-solver to check for a counterexample of bounded size, where the bound is on the depth of the call stack (ie the number of nested procedure calls). Second, they use (tree) interpolation to obtain over-approximating summaries of procedures for the current bound. This is repeated with increasing values of the bound until a counterexample is found or the approximate summaries for the current bound are also invariant ( i.e., independent of the bound). The reduction to BMC ensures that the algorithms are guaranteed to find a counterexample if one exists. However, the size of the SMT instance can grow exponentially with the bound on the call-stack depth in the worst-case, due to the tree-like unrolling of the call-graph.

Fig. 1
figure 1

A Boolean program with exponential unwinding size

To illustrate the exponential growth in the SMT instances created by existing approaches for BMC, consider the program in Fig. 1 with Boolean variables and finitely many \(\mathtt{Level<i>}\) procedures (adapted from [1]). Here, nd is a routine that returns an unknown Boolean value, i.e., assume that the behavior of nd is unknown and hence, for the purpose of verification, nd effectively returns either true or false non-deterministically. For a bound n on the number of such procedures, the figure also shows its tree-like unrolling which grows exponentially in n. With one Boolean parameter per procedure, note that the number of program states is linear in n, where a state corresponds to a valuation of the program counter and the variables in scope. Therefore, many of present day BMC-based model checking algorithms are at least worst-case exponential in the number of states for Boolean programs. However, note that the operational semantics of a Boolean program can be defined in terms of a pushdown automaton where the push and pop operations on the stack correspond to procedure calls and returns, respectively, and the accepting states denote the safe program states. This reduces safety in Boolean programs to state-reachability in pushdown automata and there exist polynomial-time (cubic) algorithms for the latter [1, 8, 21].

On the other hand, the algorithm GPDR [15] follows the approach of IC3 [22] by solving BMC incrementally without unrolling the call-graph. In GPDR, interpolation is used to obtain over-approximating summaries and partial models denoting undesirable reachable states are cached for future. For some configurations (e.g., explicit-state reasoning), GPDR is worst-case polynomial for Boolean Programs. However, it gets more challenging when the program operations and formulas are in a first-order language. In this case, GPDR might even fail to find a counterexample despite the presence of an SMT oracle, unlike the guarantee given by other BMC-based algorithms mentioned above (see Appendix for an example).

To address the problems mentioned above, we present a new SMT-based algorithm RecMC that analyzes the program compositionally. That is, RecMC iteratively checks safety properties of individual procedures by inferring and utilizing approximating summaries of procedures. Our main insight is to maintain not only over-approximating summaries but also under-approximating summaries of the procedures. Syntactically, our approximations are formulas over the parameters of a procedure and auxiliary variables denoting the initial values of the parameters. Clarke showed that such formulas are sufficient to obtain a relatively complete Hoare proof system by making use of a Rule of Adaptation [9].

We use the terms may-summary and must-summary, respectively, to refer to such an over- and under-approximation. While may-summaries are used to block spurious counterexamples, must-summaries are used to analyze a procedure call without inlining the body of the callee. Thus, if the under-approximations given by the must-summaries can be reused at call-sites, they help avoid redundant explorations of the state-space. However, given a bound on the call-stack depth, the must-summaries can be too strong and the may-summaries can be too weak to show falsification or satisfaction of safety. In this case, our compositional algorithm creates and checks new safety properties of the callee procedures, updates the approximations, and enters a new iteration.

For Boolean programs, as mentioned previously, the number of states is finite and hence, the summaries can only be updated finitely many times. As the summaries are reused at call-sites in a compositional manner, RecMC has a polynomial time complexity for Boolean Programs, by using an argument similar to that of RHS [8]. Moreover, in general, assuming an SMT oracle for the first-order language of the formulas and the program operations, we show that RecMC terminates for a given bound on the call-stack depth. To the best of our knowledge, this is the first SMT-based algorithm with such guarantees.

Almost every step of RecMC introduces existential quantifiers in the formulas created. RecMC tries to eliminate these quantified variables as, otherwise, they would accumulate exponentially in the value of the current bound.Footnote 1 This is because, if no quantified variable is eliminated, the compositional algorithm essentially breaks down into an algorithm that unrolls the call-graph into a tree where, as we mentioned earlier, the size of the SMT problems created may grow exponentially in the value of the bound. A naïve solution is to use algorithms for quantifier elimination (QE), which results in an equivalent quantifier-free formula, but which is also expensive in practice. Instead, we develop an alternative approach that under-approximates QE, i.e., obtains a quantifier-free formula stronger than the original formula. However, obtaining arbitrary under-approximations can lead to divergence of the algorithm. To this end, we introduce the concept of Model Based Projection (MBP), for covering \(\exists \overline{x} \cdot \varphi (\overline{x},\overline{y})\) by finitely-many quantifier-free under-approximations obtained using satisfying models of \(\varphi (\overline{x},\overline{y})\) (see Sect. 5). We developed efficient MBPs for Linear Rational Arithmetic (LRA) and Presburger Arithmetic (also known as Linear Integer Arithmetic (LIA)) based on the QE methods by Loos-Weispfenning [23] for LRA and Cooper [24] for LIA. We use MBP to under-approximate existential quantification in \(\textsc {RecMC} \). In the best case, a partial under-approximation suffices and a complete quantifier elimination can be avoided.

In summary, we present: (a) an efficient, compositional SMT-based algorithm for model checking recursive programs, that uses under- and over-approximating summaries of procedure behavior (Sect. 4), (b) MBP functions for under-approximating quantifier elimination for LRA and LIA (Sect. 5), (c) a new, complete algorithm for Boolean Programs, with complexity polynomial in the number of states, similar to the best known method [1] (see Sect. 4), and (d) an implementation and an empirical evaluation of the approach (Sect. 6).

2 Overview

In this section, we give an overview of \(\textsc {RecMC} \) and illustrate it on an example. Let \({\mathcal {A}}\) be a recursive program. We assume that there are no internal procedures and that procedures cannot be passed as parameters. Furthermore, for simplicity of presentation, assume no loops, no global variables and that arguments are passed by reference. Let \(P(\overline{v}) \in {\mathcal {A}}\) be a procedure with parameters \(\overline{v}\), and let \(\overline{v}_0\) be fresh variables not appearing in P with \(|v_0| = |v|\), denoting the initial values of \(\overline{v}\). A safety property for P is a formula \(\varphi (\overline{v}_0,\overline{v})\). We say that P satisfies \(\varphi \), denoted \(P(\overline{v}) \models \varphi (\overline{v}_0,\overline{v})\), iff the Hoare-triple \(\{{\overline{v}=\overline{v}_0}\} ~{P(\overline{v})}~ \{{\varphi (\overline{v}_0,\overline{v})}\}\) is valid. Note that every Hoare-triple corresponds to a safety property in this sense, as shown by Clarke [9] using a Rule of Adaptation. We say that \(\varphi \) is a bounded safety property for P and a natural number \(n \ge 0\), denoted \(P(\overline{v}) \models _n \varphi (\overline{v}_0,\overline{v})\), iff all executions of P using a call-stack bounded by n satisfy \(\varphi \).

Fig. 2
figure 2

Flow of the algorithm RecMC to check if \(M \models \varphi _{{ safe}}. \sigma _o\) and \(\sigma _u\) denote the may and must-summary maps

The key steps of \(\textsc {RecMC} \) are shown in Fig. 2. RecMC decides safety for the main procedure M of \({\mathcal {A}}\). \(\textsc {RecMC} \) maintains two formula maps \(\sigma _u\) and \(\sigma _o\). The must-summary map \(\sigma _u\) maps each procedure \(P(\overline{v}) \in {\mathcal {A}}\) to a set of formulas over \(\overline{v}_0 \cup \overline{v}\) that under-approximate its behavior. Similarly, the may-summary map \(\sigma _o\) maps a procedure P to a set of formulas that over-approximate its behavior. Given P, the maps are partitioned according to the bound on the call-stack. Therefore, if \(\delta (\overline{v}_0,\overline{v}) \in \sigma _u(P, n)\) for some \(n \ge 0\), then \(\delta \) under-approximates the behavior of all executions of P that use a call-stack of depth at most n. In other words, for every model m of \(\delta \), there is an execution of P that begins in \(m(\overline{v}_0)\), the value of \(\overline{v}_0\) under m, and ends in \(m(\overline{v})\), the value of \(\overline{v}\) under m, using a call-stack bounded by n. Similarly, if \(\delta (\overline{v}_0,\overline{v}) \in \sigma _o(P, n)\), then \(\delta \) over-approximates the behavior of all executions of P using a call-stack of depth at most n, i.e., \(P(\overline{v}) \models _n \delta (\overline{v}_0,\overline{v})\).

\(\textsc {RecMC} \) alternates between two steps: (A) deciding bounded safety (that also updates \(\sigma _u\) and \(\sigma _o\) maps) and (B) checking whether the current proof of bounded safety also proves unbounded safety. It terminates when a counterexample or a proof is found.

Fig. 3
figure 3

Flow of the algorithm BndSafety to check \(P \models _b \varphi \)

Bounded safety, i.e., whether \(P \models _b \varphi \), is decided using the algorithm BndSafety shown in Fig. 3. Step 1 checks whether \(\varphi \) is falsified using the current must-summaries \((\sigma _u)\) of the callees of P at bound \(b-1\). If so, it infers a new must-summary for P at bound b witnessing the falsification of \(\varphi \). Step 2 checks whether \(\varphi \) is satisfied using the current may-summaries \((\sigma _o)\) of the callees at bound \(b-1\). If so, it infers a new may-summary for P at bound b witnessing the satisfaction of \(\varphi \). If the prior two steps fail, there is a potential counterexample \(\pi \) in P where the must-summaries of the callees are too strong to witness \(\pi \) but the may-summaries are too weak to block it. Step 3 checks the feasibility of such a path \(\pi \) by creating new bounded safety properties for the callees of P at bound \(b-1\), recursively checking the new properties, and updating the formula maps.

We conclude this section with an illustration of \(\textsc {RecMC} \) on the program in Fig. 4 (adapted from [9]). The program has 3 procedures: the main procedure M, and procedures T and D. The procedure M calls T and D. The procedure T modifies its argument t and calls itself recursively. The procedure D decrements its argument d. Suppose that we want to check if the (main procedure of the) program satisfies the safety property \(\varphi \equiv m_0 \ge 2m + 4\). The formula maps \(\sigma _u\) and \(\sigma _o\) are initially empty.

Fig. 4
figure 4

A recursive program with 3 procedures

Fig. 5
figure 5

A run of BndSafety for the program in Fig. 4 and the bounded safety property \(M(m) \models _1 m_0 \ge 2m+4\)

In the first iteration of \(\textsc {RecMC} \), the bound n on the call-stack is 0, i.e., the bounded safety problem is to check whether all executions that do not have any procedure calls are safe. Given that the only path in M has procedure calls, no such executions exist and safety trivially holds for bound 0. Figure 5 shows the four iterations of BndSafety for the next bound \(n=1\), i.e., for checking whether \(M(m) \models _1 \varphi \) holds or not. In the first iteration of BndSafety, the current may and must-summaries of the callees are insufficient to satisfy or falsify the property, and there is a potential counterexample along the only path in M. Next, we create a new property for a callee, by performing a backward analysis along the potential counterexample path beginning with the negation of the safety property, and making use of the current summaries of the callees. In practice, one need not be restricted to a backward analysis; see Sects. 4 and 6 for details. As shown in Fig. 5, assume that a new bounded safety property is created for D and \(\sigma _u(\texttt {D},0)\), the must-summary map of D at bound 0, is updated with a new must-summary that witnesses the falsification of the property. In the second iteration of BndSafety, the current summaries are still insufficient and assume that a new property is created for T and \(\sigma _o(\texttt {T},0)\) is updated with a new may-summary that witnesses the satisfaction of the property. To create the new property for T, we make use of the must-summary of D computed in the previous iteration for both the calls to D in M. This is where the compositionality of the algorithm helps avoid the potential re-computation of the must-summary of D. Similarly, in the third iteration of BndSafety, let \(\sigma _o(\texttt {D},0)\) be updated with a new may-summary. At this point, the may-summaries for T and D at bound 0 are sufficient to establish bounded safety at \(n=1\) in the fourth iteration of BndSafety, resulting in an update of \(\sigma _o(\texttt {M},1)\).

Now, the may-summary map \(\sigma _o\) is:

$$\begin{aligned} \sigma _o(\texttt {M}, 1)&= \{m_0 \ge 2m + 4\},&\sigma _o(\texttt {T}, 0)&= \{t_0 \ge 2t\},&\sigma _o(\texttt {D}, 0)&= \{d \le d_0 -1\} \end{aligned}$$

Ignoring the bounds, the may-summaries are invariant. For example, we can prove that the body of T satisfies \(t_0 \ge 2t\), assuming that the calls do, i.e., \(\{{t=t_0}\} ~{\texttt {T} (t)}~ \{{t_0 \ge 2t}\} \vdash \{{t=t_0}\} ~{{ Body}(\texttt {T})}~ \{{t_0 \ge 2t}\}\), where \({ Body}\) denotes the body of a procedure. Thus, step B of \(\textsc {RecMC} \) succeeds and the algorithm terminates declaring the program SAFE.

In summary, RecMC checks safety of a recursive program in a compositional manner by inferring under- and over-approximations of the behavior of procedures. We use an SMT-solver for automating the steps of RecMC and BndSafety.

3 Preliminaries

Consider a first-order language with equality and let \(\mathcal {S}\) be its signature, i.e., the set of non-logical function and predicate symbols (including equality). An \(\mathcal {S}\)-structure I consists of a domain of interpretation, denoted \({ dom}({I})\), and assigns elements of \({ dom}({I})\) to variables, and functions and predicates on \({ dom}({I})\) to the symbols of \(\mathcal {S}\). Let \(\varphi \) be a formula in the first-order language. We assume the usual definition of satisfaction of \(\varphi \) by I, denoted \(I \models \varphi \). I is called a model of \(\varphi \) iff \(I \models \varphi \) and this can be extended to a set of formulas. A first-order \(\mathcal {S}\)-theory \({ Th}\) is a set of deductively closed \(\mathcal {S}\)-sentences. I satisfies \(\varphi \) modulo \({ Th}\), denoted \(I \models _{ Th}\varphi \), iff \(I \models { Th}\cup \{\varphi \}\). \(\varphi \) is valid modulo \({ Th}\), denoted \(\models _{ Th}\varphi \), iff every model of \({ Th}\) is also a model of \(\varphi \).

Let I be an \(\mathcal {S}\)-structure and \(\overline{w}\) be a list of fresh function/predicate symbols not in \(\mathcal {S}\). A \((\mathcal {S}\cup \overline{w})\)-structure J is called an expansion of I to \(\overline{w}\) iff \({ dom}({J}) = { dom}({I})\) and J agrees with I on the assignments to all variables and the symbols of \(\mathcal {S}\). We use the notation \({I}\{{\overline{w}} \mapsto {\overline{u}}\}\) to denote the expansion of I to \(\overline{w}\) that assigns the function/predicate \(u_i\) to the symbol \(w_i\).

For an \(\mathcal {S}\)-sentence \(\varphi \), we write \(I(\varphi )\) to denote the truth value of \(\varphi \) under I. For a formula \(\varphi (\overline{x})\) with free variables \(\overline{x}\), we overload the notation \(I(\varphi )\) to mean \(\{\overline{a} \in { dom}({I})^{|\overline{x}|} \mid {I}\{{\overline{x}} \mapsto {\overline{a}}\} \models \varphi \}\). For simplicity of presentation, we sometimes identify the truth value true with \({ dom}({I})\) and false with \(\emptyset \).

We assume that programs do not have internal procedures and that procedures cannot be passed as parameters. Furthermore, without loss of generality, we assume that programs do not have loops or global variables. In the following, we define programs using a logical representation, as opposed to giving a concrete syntax.

Definition 1

(Programs and Procedures) A program \({\mathcal {A}}\) is a finite list of procedures with a designated main procedure M where the program begins. A procedure P is a tuple \(\langle \overline{\iota }_{P}, \overline{o}_{P}, \Sigma _{P}, \overline{\ell }_{P}, \beta _{P} \rangle \), where

  1. 1.

    \(\overline{\iota }_{P}\), \(\overline{o}_{P}\), and \(\overline{\ell }_{P}\) are disjoint finite lists of variables denoting the input values of the parameters, the output values of the parameters, and the local variables, respectively,

  2. 2.

    \(\Sigma _{P}\) is a fresh predicate symbol of arity \(|\overline{\iota }_{P}| + |\overline{o}_{P}|\),

  3. 3.

    \(\beta _{P}\) is a quantifier-free sentence over the signature \((\mathcal {S}\cup \{\Sigma _{Q} \mid Q\in {\mathcal {A}}\} \cup \overline{\iota }_{P} \cup \overline{o}_{P} \cup \overline{\ell }_{P})\) denoting the body of the procedure, where a predicate symbol \(\Sigma _{Q}\) appears only positively, i.e., under even number of negations.

We use \(\overline{v}_{P}\) to denote \(\overline{\iota }_{P} \cup \overline{o}_{P}\).

Intuitively, for a procedure P, \(\Sigma _{P}\) is used to denote its semantics and \(\beta _{P}\) encodes its body using the predicate symbol \(\Sigma _{Q}\) for a call to the procedure Q. We require that a predicate symbol \(\Sigma _{Q}\) appears only positively in \(\beta _{P}\) to ensure a fixed-point characterization of the semantics as shown later on. For example, for the signature \(\mathcal {S}= \langle 0, { Succ}, -, +, \le , >, = \rangle \), the program in Fig. 4 is represented as \(\langle M, T, D \rangle \) with the main procedure \(M = \langle m_0, m, \Sigma _{M}, \langle \ell _0,\ell _1 \rangle , \beta _{M} \rangle \), \(T = \langle t_0, t, \Sigma _{T}, \langle \ell _0,\ell _1 \rangle , \beta _{T} \rangle \), and \(D = \langle d_0, d, \Sigma _{D}, \emptyset , \beta _{D} \rangle \), where

$$\begin{aligned} \beta _{M}= & {} \Sigma _{T}(m_0,\ell _0) \wedge \Sigma _{D}(\ell _0,\ell _1) \wedge \Sigma _{D}(\ell _1,m) \qquad \beta _{D} = (d = d_0 - 1)\nonumber \\ \beta _{T}= & {} \left( t_0 \le 0 \wedge t_0 = t \right) ~\vee ~ \left( t_0 > 0 \wedge \ell _0 = t_0 - 2 \wedge \Sigma _{T}(\ell _0,\ell _1) \wedge t = \ell _1 + 1 \right) \end{aligned}$$
(1)

Here, we abbreviate \({ Succ}^i(0)\) by i and \((m_0, t_0, d_0)\) and (mtd) denote the input and the output values of the parameters of the original program, respectively. For a procedure P, let \({ Paths}({P})\) denote the set of all prime-implicants of \(\beta _{P}\). Intuitively, each element of \({ Paths}({P})\) encodes a path in the procedure.

Let \({\mathcal {A}}= \langle P_0,\dots ,P_n \rangle \) be a program and I be an \(\mathcal {S}\)-structure. Let \(\overline{X}\) be a list of length n such that each \(X_i\) is either (i) a truth value if \(P_i\) has no parameters, i.e., \(|\overline{v}_{P_i}| = 0\), or (ii) a subset of tuples from \({ dom}({I})^{|\overline{v}_{P_i}|}\) if \(|\overline{v}_{P_i}| \ge 1\). Let \(J(I,\overline{X})\) denote the expansion \({{I}\{{\Sigma _{P_0}} \mapsto {X_0}\}\dots }\{{\Sigma _{P_n}} \mapsto {X_n}\}\). The semantics of a procedure \(P_i\) given I, denoted \(\llbracket {P_i} \rrbracket _I\), characterizes all the terminating executions of \(P_i\) and is defined as follows. \(\langle \llbracket {P_0} \rrbracket _I, \dots , \llbracket {P_n} \rrbracket _I \rangle \) is the (pointwise) least \(\overline{X}\) such that for all \(Q \in {\mathcal {A}}\), \(J(I,\overline{X}) \models \forall \overline{v}_{Q} \cup \overline{\ell }_{Q} \cdot (\beta _{Q}{\!}\implies {\!}\Sigma _{Q}(\overline{v}_{Q}))\). This has a well-known least fixed-point characterization [9].

For a natural number \(b \ge 0\), denoting a bound on the call-stack, the bounded semantics of a procedure \(P_i\) given I, denoted \(\llbracket {P_i} \rrbracket ^b_I\), characterizes all the executions using a stack of depth bounded by b and is defined by induction on b:

$$\begin{aligned} \llbracket {P_i} \rrbracket ^0_I= & {} J(I,\langle \emptyset ,\dots ,\emptyset \rangle ) (\exists \overline{\ell }_{P_i} \cdot \beta _{P_i}),\\ \llbracket {P_i} \rrbracket ^b_I= & {} J(I,\langle \llbracket {P_0} \rrbracket ^{b-1}_I, \dots , \llbracket {P_n} \rrbracket ^{b-1}_I \rangle ) (\exists \overline{\ell }_{P_i} \cdot \beta _{P_i}), ~b > 0 \end{aligned}$$

Intuitively, \(\llbracket {P_i} \rrbracket ^0_I\) consists of all input-output values of the parameters of \(P_i\) reachable along paths that do not make any procedure calls, i.e., by interpreting every predicate symbol \(\Sigma _{Q}\) in the body \(\beta _{P_i}\) as \(\emptyset \). Similarly, \(\llbracket {P_i} \rrbracket ^b_I\), for \(b > 0\), consists of all input-output values of the parameters reachable along paths that use a stack of depth bounded by b.

An environment is a function that maps a predicate symbol \(\Sigma _{P}\) to a formula over \(\overline{v}_{P}\). Given a formula \(\tau \) and an environment E, we abuse the notation \(\llbracket {\cdot } \rrbracket \) and write \(\llbracket {\tau } \rrbracket _E\) for the formula obtained by instantiating every predicate symbol \(\Sigma _{P}\) by \(E(\Sigma _{P})\) in \(\tau \).

Let \({ Th}\) be an \(\mathcal {S}\)-theory. A safety property for a procedure \(P \in {\mathcal {A}}\) is a formula over \(\overline{v}_{P}\). P satisfies a safety property \(\varphi \) w.r.t \({ Th}\), denoted \(P \models _{ Th}\varphi \), iff for all models I of \({ Th}\), \(\llbracket {P} \rrbracket _I \subseteq I(\varphi )\). A safety property \(\psi \) of the main procedure M of a program \({\mathcal {A}}\) is also called a safety property of the program itself. Given a safety property \(\psi (\overline{v}_{M})\), a safety proof for \(\psi \) is an environment \(\Pi \) that is both safe and invariant:

$$\begin{aligned}&\models _{ Th}\llbracket {\forall \overline{x} \cdot \Sigma _{M}(\overline{x}) \implies \psi (\overline{x})} \rrbracket _\Pi \quad (\text {safety})\end{aligned}$$
(2)
$$\begin{aligned}&\forall P \in {\mathcal {A}}\cdot \models _{ Th}\llbracket {\forall \overline{v}_{P} \cup \overline{\ell }_{P} \cdot (\beta _{P} \implies \Sigma _{P}(\overline{v}_{P}))} \rrbracket _\Pi \quad (\text {invariance}) \end{aligned}$$
(3)

Given a formula \(\varphi (\overline{v}_{P})\) and a natural number \(b \ge 0\), denoting a bound on the call-stack, a procedure P satisfies bounded safety w.r.t \({ Th}\), denoted \(P \models _{b,{ Th}} \varphi \), iff for all models I of \({ Th}\), \(\llbracket {P} \rrbracket ^b_I \subseteq I(\varphi )\). In this case, we also call \(\varphi \) a may-summary for \(\langle P,b \rangle \). We call \(\varphi \) a must-summary for \(\langle P,b \rangle \) iff \(I(\varphi ) \subseteq \llbracket {P} \rrbracket ^b_I\), for all models I of \({ Th}\). Intuitively, may-summaries and must-summaries for \(\langle P,b \rangle \), respectively, over- and under-approximate \(\llbracket {P} \rrbracket ^b_I\) for every model I of \({ Th}\).

A bounded formula map maps a procedure P and a natural number \(b \ge 0\) to a set of formulas over \(\overline{v}_{P}\). Given a bounded formula map m and \(b \ge 0\), we define two special environments \(U_m^b\) and \(O_m^b\) as follows.

$$\begin{aligned} U_m^b : \Sigma _{P} \mapsto \bigvee \big \{\delta \in m(P,b') \mid b' \le b\big \}&\quad \quad O_m^b: \Sigma _{P} \mapsto \bigwedge \big \{\delta \in m(P,b') \mid b' \ge b\big \} \end{aligned}$$

We use \(U_m^b\) and \(O_m^b\) to under- and over-approximate the bounded semantics. For convenience, let \(U_m^{-1}\) and \(O_m^{-1}\) be environments that map every symbol to \(\bot \).

4 Model checking recursive programs

In this section, we present our algorithm \(\textsc {RecMC} ({\mathcal {A}}, \varphi _{{ safe}})\) for determining whether a program \({\mathcal {A}}\) satisfies a safety property \(\varphi _{{ safe}}\). Let \(\mathcal {S}\) be the signature of the first-order language under consideration and assume a fixed \(\mathcal {S}\)-theory \({ Th}\). To avoid clutter, we drop the subscript \({ Th}\) from the notation \(\models _{ Th}\) and \(\models _{b,{ Th}}\). We also show the soundness of RecMC and discuss its complexity guarantees. An efficient instantiation of \(\textsc {RecMC} \) to Linear Arithmetic is presented in Sect. 5.

Fig. 6
figure 6

Pseudo-code of RecMC

4.1 Top-level loop

RecMC maintains two bounded formula maps \(\sigma _u\) and \(\sigma _o\) for must and may-summaries, respectively. For brevity, for a first-order formula \(\tau \), we write \(\llbracket {\tau } \rrbracket _u^b\) and \(\llbracket {\tau } \rrbracket _o^b\) to denote \(\llbracket {\tau } \rrbracket _{U_{\sigma _u}^b}\) and \(\llbracket {\tau } \rrbracket _{O_{\sigma _o}^b}\), respectively, where the environments \(U_m^b\) and \(O_m^b\), for a bounded formula map m, are as defined in Sect. 3. Intuitively, \(\llbracket {\tau } \rrbracket _u^b\) and \(\llbracket {\tau } \rrbracket _o^b\), respectively, under- and over-approximate \(\tau \) using \(\sigma _u\) and \(\sigma _o\).

The pseudo-code of the main loop of RecMC (corresponding to the flow diagram in Fig. 2) is shown in Fig. 6. RecMC follows an iterative deepening strategy. In each iteration, \(\textsc {BndSafety} \) (described below) checks whether all executions of \({\mathcal {A}}\) satisfy \(\varphi _{ safe}\) for a bound \(n \ge 0\) on the call-stack, i.e., if \(M \models _n \varphi _{{ safe}}\). \(\textsc {BndSafety} \) also updates the maps \(\sigma _u\) and \(\sigma _o\). Whenever \(\textsc {BndSafety} \) returns \({ UNSAFE}\), the must-summaries in \(\sigma _u\) are sufficient to construct a counterexample to safety and the loop terminates. Whenever \(\textsc {BndSafety} \) returns \({ SAFE}\), the may-summaries in \(\sigma _o\) are sufficient to prove the absence of a counterexample for the current bound n on the call-stack. In this case, if \(\sigma _o\) is also invariant [see (3)], as determined by CheckInvariance, \(O_{\sigma _o}^n\) is a safety proof and the loop terminates. Otherwise, the bound on the call-stack is incremented and a new iteration of the loop begins. Note that, as a side-effect of CheckInvariance, some may-summaries are propagated to the bound \(n+1\). This is similar to the push generalization phase in the IC3 algorithm [22].

Fig. 7
figure 7

Rules defining \(\textsc {BndSafety} \big ({\mathcal {A}},\varphi _{{ safe}},n,\sigma _u^{{ Init}},\sigma _o^{{ Init}}\big )\)

4.2 Bounded safety

We describe the routine \(\textsc {BndSafety} ({\mathcal {A}},\varphi _{{ safe}},n,\sigma _u^{{ Init}},\sigma _o^{{ Init}})\) as an abstract transition system [25] defined by the inference rules shown in Fig. 7. Here, n is the current bound on the call-stack, and \(\sigma _u^{{ Init}}\) and \(\sigma _o^{{ Init}}\) are the maps of must and may-summaries input to the routine. A state of \(\textsc {BndSafety} \) is a triple \({\mathcal {Q}}\Vert {\sigma _u}\Vert {\sigma _o}\), where \(\sigma _u\) and \(\sigma _o\) are the current maps and \(\mathcal {Q}\) is a set of triples \(\langle P, \varphi , b \rangle \) for a procedure P, a formula \(\varphi \) over \(\overline{v}_{P}\), and a number \(b \ge 0\). A triple \(\langle P, \varphi , b \rangle \in \mathcal {Q}\) is called a bounded reachability query and asks whether \(P \not \models _b \lnot \varphi \), i.e., whether there is an execution in P using a call-stack bounded by b where the values of \(\overline{v}_{P}\) satisfy \(\varphi \).

BndSafety starts with a single query \(\langle M, \lnot \varphi _{ safe}, n \rangle \) and initializes the maps of must and may-summaries (rule Init). It checks whether \(M \models _n \varphi _{ safe}\) by generating new queries as necessary (rule Query) and answering existing queries using existing summaries (rules May and Must), the latter resulting in new summaries. When there are no queries left to answer, i.e., \(\mathcal {Q}\) is empty, BndSafety terminates with a result of either \({ UNSAFE}\) or \({ SAFE}\) (rules Unsafe and Safe). We explain the rules May, Must and Query below.

May infers a new may-summary when a query \(\langle P, \varphi , b \rangle \) can be answered negatively. In this case, there is an over-approximation of the bounded semantics of P at bound b, obtained using the may-summaries of callees at bound \(b-1\), that is unsatisfiable with \(\varphi \). That is, \(\models \llbracket {\beta _{P}} \rrbracket _o^{b-1} \implies \lnot \varphi \). The inference of the new summary is by interpolation [20] (denoted by Itp in the side-condition of the rule). Thus, the new may-summary \(\psi \) is a formula over \(\overline{v}_{P}\) such that \(\models \left( \llbracket {\beta _{P}} \rrbracket _o^{b-1} \implies \psi (\overline{v}_{P}) \right) ~\wedge ~ (\psi (\overline{v}_{P}) \implies \lnot \varphi )\). Note that \(\psi \) over-approximates the bounded semantics of P at b. Every query \(\langle P,\eta ,c \rangle \in \mathcal {Q}\) such that \(\eta \) is unsatisfiable with the updated environment \(O^c_{\sigma _o}(\Sigma _{P})\) is immediately answered and removed.

Must infers a new must-summary when a query \(\langle P, \varphi , b \rangle \) can be answered positively. In this case, there is an under-approximation of the bounded semantics of P at b, obtained using the must-summaries of callees at bound \(b-1\), that is satisfiable with \(\varphi \). That is, \(\not \models \llbracket {\beta _{P}} \rrbracket _u^{b-1} \implies \lnot \varphi \). In particular, there exists a path \(\pi \) in \({ Paths}(P)\) such that \(\not \models \llbracket {\pi } \rrbracket _u^{b-1} \implies \lnot \varphi \). The new must-summary \(\psi \) is obtained by choosing such a path \(\pi \) non-deterministically and existentially quantifying all local variables from \(\llbracket {\pi } \rrbracket _u^{b-1}\). Note that \(\psi \) under-approximates the bounded semantics of P at b. Every query \(\langle P,\eta ,c \rangle \in \mathcal {Q}\) such that \(\eta \) is satisfiable with the updated environment \(U^c_{\sigma _u}(\Sigma _{P})\) is immediately answered and removed.

Query creates a new query when an existing query \(\langle P,\varphi ,b \rangle \) cannot be answered using current summary maps \(\sigma _u\) and \(\sigma _o\). In this case, the current over-approximation of the bounded semantics of P at b is satisfiable with \(\varphi \) while its current under-approximation is unsatisfiable with \(\varphi \). That is, \(\not \models \llbracket {\beta _{P}} \rrbracket _o^{b-1} \implies \lnot \varphi \) and \(\models \llbracket {\beta _{P}} \rrbracket _u^{b-1} \implies \lnot \varphi \). In particular, there exists a path \(\pi \) in \({ Paths}(P)\) such that \(\not \models \llbracket {\pi } \rrbracket _o^{b-1} \implies \lnot \varphi \) and \(\models \llbracket {\pi } \rrbracket _u^{b-1} \implies \lnot \varphi \). Intuitively, \(\pi \) is a potential counterexample path that needs to be checked for feasibility. Such a path \(\pi \) is chosen non-deterministically. \(\pi \) is guaranteed to have a conjunct \(\Sigma _{R}(\overline{a})\), corresponding to a call to some procedure R, such that the under-approximation \(\llbracket {\Sigma _{R}(\overline{a})} \rrbracket _u^{b-1}\) is too strong to witness an execution along \(\pi \) that satisfies \(\varphi \) but the over-approximation \(\llbracket {\Sigma _{R}(\overline{a})} \rrbracket _o^{b-1}\) is too weak to block such an execution. That is, \(\pi \) can be partitioned into a prefix \(\pi _{ pre}\), a conjunct \(\Sigma _{R}(\overline{a})\) corresponding to a call to R, and a suffix \(\pi _{ suf}\) such that the following hold:

$$\begin{aligned} \models \llbracket {\Sigma _{R}(\overline{a})} \rrbracket _u^{b-1}&\implies \left( \big (\llbracket {\pi _{ pre}} \rrbracket _o^{b-1} \wedge \llbracket {\pi _{ suf}} \rrbracket _u^{b-1}\big ) \implies \lnot \varphi \right) \\ \not \models \llbracket {\Sigma _{R}(\overline{a})} \rrbracket _o^{b-1}&\implies \left( \big (\llbracket {\pi _{ pre}} \rrbracket _o^{b-1} \wedge \llbracket {\pi _{ suf}} \rrbracket _u^{b-1}\big ) \implies \lnot \varphi \right) \end{aligned}$$

Note that the prefix \(\pi _{ pre}\) and the suffix \(\pi _{ suf}\) are over- and under-approximated, respectively. A new query \(\langle R,\psi ,b-1 \rangle \) is created where \(\psi \) is obtained by existentially quantifying all variables from \(\llbracket {\pi _{ pre}} \rrbracket _o^{b-1} \wedge \llbracket {\pi _{ suf}} \rrbracket _u^{b-1} \wedge \varphi \) except the arguments \(\overline{a}\) of the call, and renaming appropriately. If the new query is answered negatively (using May), all executions along \(\pi \) where the values of \(\overline{v}_{P} \cup \overline{\ell }_{P}\) satisfy \(\llbracket {\pi _{ suf}} \rrbracket _u^{b-1}\) are spurious counterexamples. An additional side-condition requires that \(\psi \) “does not overlap” with \(\eta \) for any other query \(\langle R,\eta ,b-1 \rangle \) in \(\mathcal {Q}\). This is necessary for termination of BndSafety (Theorem 2). In practice, the side-condition is trivially satisfied by always applying the rule to \(\langle P, \varphi , b \rangle \) with the smallest b.

Fig. 8
figure 8

Approximations of the only path \(\pi \) of the procedure M in Fig. 4

For example, consider the program in Fig. 4 represented by (1) and the query \(\langle M, \varphi , 1 \rangle \) where \(\varphi \equiv m_0 < 2m+4\). Let \(\sigma _o= \emptyset \), \(\sigma _u(D,0) = \{d=d_0-1\}\) and \(\sigma _u(T,0) = \emptyset \). Let \(\pi = (\Sigma _{T}(m_0,\ell _0) \wedge \Sigma _{D}(\ell _0,\ell _1) \wedge \Sigma _{D}(\ell _1,m))\) denote the only path in the procedure M. Figure 8 shows \(\llbracket {\pi _i} \rrbracket _u^0\) and \(\llbracket {\pi _i} \rrbracket _o^0\) for each conjunct \(\pi _i\) of \(\pi \). As the figure shows, \(\llbracket {\pi } \rrbracket _o^0\) is satisfiable with \(\varphi \), witnessed by the execution \(e \equiv \langle m_0=3, \ell _0=3, \ell _1=2, m=1 \rangle \). Note that this execution also satisfies \(\llbracket {\pi _2 \wedge \pi _3} \rrbracket _u^0\). But, \(\llbracket {\pi _1} \rrbracket _u^0\) is too strong to witness it, where \(\pi _1\) is the call \(\Sigma _{T}(m_0,\ell _0)\). To create a new query for T, we first existentially quantify all variables other than the arguments \(m_0\) and \(\ell _0\) from \(\pi _2 \wedge \pi _3 \wedge \varphi \), obtaining \(m_0 < 2\ell _0\). Renaming the arguments by the parameters of T results in the new query \(\langle T,t_0<2t,0 \rangle \). Further iterations of BndSafety would answer this query negatively making the execution e spurious. Note that this would also make all other executions where the values to \(\langle m_0, \ell _0, \ell _1, m \rangle \) satisfy \(\llbracket {\pi _2\wedge \pi _3} \rrbracket _u^0\) spurious.

4.3 Soundness of BndSafety and RecMC

Soundness of \(\textsc {RecMC} \) follows from that of \(\textsc {BndSafety} \), which can be shown by a case analysis on the inference rules.

Theorem 1

BndSafety and RecMC are sound.

Proof

We only show the soundness of BndSafety; the soundness of RecMC easily follows. In particular, for \(\textsc {BndSafety} (M,\varphi _{ safe},n,\emptyset ,\emptyset )\) we show the following:

  1. 1.

    If the premises of Unsafe hold, then \(M \not \models _n \varphi _{ safe}\), and

  2. 2.

    If the premises of Safe hold, then \(M \models _n \varphi _{ safe}\).

It suffices to show that the environments \(U_{\sigma _u}^b\) and \(O_{\sigma _o}^b\), respectively, under- and over-approximate the bounded semantics of the procedures, for every \(0 \le b \le n\). In particular, we show that the following is an invariant of BndSafety: for every model I of the background theory \({ Th}\), for every procedure \(Q \in {\mathcal {A}}\) and \(b \in [0,n]\),

$$\begin{aligned} I(U_{\sigma _u}^{b}(\Sigma _{Q})) \subseteq \llbracket {Q} \rrbracket _I^b \subseteq I(O_{\sigma _o}^b(\Sigma _{Q})). \end{aligned}$$
(4)

Initially, \(\sigma _u\) and \(\sigma _o\) are empty and the invariant holds trivially. BndSafety updates \(\sigma _o\) and \(\sigma _u\) in the rules May and Must, respectively. We show that these rules preserve (4). We only show the case of May. The case of Must is similar.

Let \(\langle P, \varphi , b \rangle \in \mathcal {Q}\) be such that May is applicable, i.e., \(\models \llbracket {\beta _{P}} \rrbracket _o^{b-1} \implies \lnot \varphi \). Let \(\psi = \textsc {Itp} (\llbracket {\beta _{P}} \rrbracket _o^{b-1} ,\lnot \varphi )\). Note that \(\varphi \), and hence \(\psi \), does not depend on the local variables \(\overline{\ell }_{P}\). Hence, we know that

$$\begin{aligned} \models \left( \exists \overline{\ell }_{P} \cdot \llbracket {\beta _{P}} \rrbracket _o^{b-1}\right) \implies \psi . \end{aligned}$$
(5)

The case of \(b=0\) is easy and we will skip it. Let I be an arbitrary model of \({ Th}\). Assume that (4) holds at \(b-1\) before applying the rule. In particular, assume that for all \(Q \in {\mathcal {A}}\), \(\llbracket {Q} \rrbracket _I^{b-1} \subseteq I(O_{\sigma _o}^{b-1}(\Sigma _{Q}))\).

We will first show that the new may-summary \(\psi \) over-approximates \(\llbracket {P} \rrbracket _I^b\). Let \(J(I,\overline{X})\) be an expansion of I as defined in Sect. 3.

$$\begin{aligned} \llbracket {P} \rrbracket _I^b&= J\left( I,\langle \llbracket {P_0} \rrbracket ^{b-1}_I, \dots , \llbracket {P_n} \rrbracket ^{b-1}_I \rangle \right) \left( \exists \overline{\ell }_{P_i} \cdot \beta _{P_i}\right)&\\&\subseteq J\left( I,\langle I(O_{\sigma _o}^{b-1}(\Sigma _{P_0})), \dots , I(O_{\sigma _o}^{b-1}(\Sigma _{P_n})) \rangle \right) \left( \exists \overline{\ell }_{P_i} \cdot \beta _{P_i}\right)&(\hbox {hypothesis})\\&= I\left( \llbracket {\exists \overline{\ell }_{P} \cdot \beta _{P}} \rrbracket _{O^{b-1}_{\sigma _o}}\right)&(O^{b-1}_{\sigma _o} \hbox { is FO-definable})\\&= I\left( \exists \overline{\ell }_{P} \cdot \llbracket {\beta _{P}} \rrbracket _{O^{b-1}_{\sigma _o}}\right)&(\hbox {logic})\\&= I\left( \exists \overline{\ell }_{P} \cdot \llbracket {\beta _{P}} \rrbracket _o^{b-1}\right)&(\hbox {notation})\\&\subseteq I(\psi )&(\hbox {from (5)}) \end{aligned}$$

Next, we show that the invariant continues to hold. The map of may-summaries is updated to \(\sigma _o' = \sigma _o\cup \{\langle P,b \rangle \mapsto \psi \}\). Now, \(\sigma _o'\) differs from \(\sigma _o\) only for the procedure P and for bounds in [0, b]. Let \(b' \in [0,b]\) be arbitrary. Since (4) was true before applying \(\textsc {May}\), we know that \(\llbracket {P} \rrbracket _I^{b'} \subseteq I(O_{\sigma _o}^{b'}(\Sigma _{P}))\). As \(\llbracket {P} \rrbracket _I^{b'} \subseteq \llbracket {P} \rrbracket _I^b \subseteq I(\psi )\), it follows that \(\llbracket {P} \rrbracket _I^{b'} \subseteq I(O_{\sigma _o}^{b'}(\Sigma _{P})) \cap I(\psi ) \subseteq I(O_{\sigma _o}^{b'}(\Sigma _{P}) \wedge \psi ) = I(O_{\sigma _o'}^{b'}(\Sigma _{P}))\). \(\square \)

4.4 Termination and complexity of BndSafety

We will now show that \(\textsc {BndSafety} \) is complete relative to an oracle for satisfiability modulo \({ Th}\). Intuitively, a must-summary inferred by BndSafety corresponds to a path in a procedure and given a bound on the call-stack, the number of such formulas is finite. This bounds the number of may/must-summaries inferred by BndSafety, guaranteeing termination. Throughout the following, assume an oracle for SAT modulo \({ Th}\).

The following lemma shows that when a query is removed from \(\mathcal {Q}\), it is actually answered. The proof is immediate from the definitions of \(O_{\sigma _o}^b\) and \(U_{\sigma _u}^b\) given in Sect. 3.

Lemma 1

(Answered queries) Whenever BndSafety removes a query from \(\mathcal {Q}\), it is answered using the known must and may-summaries. In particular, for every query \(\langle P, \eta , b \rangle \in \mathcal {Q}\) removed from \(\mathcal {Q}\) by BndSafety,

  1. 1.

    If the query is removed by May, then \(\models \llbracket {\Sigma _{P}} \rrbracket _o^b \implies \lnot \eta \), and

  2. 2.

    If the query is removed by Must, then \(\not \models \llbracket {\Sigma _{P}} \rrbracket _u^b \implies \lnot \eta \).

Next, we show that current summaries are insufficient to answer existing queries in \(\mathcal {Q}\).

Lemma 2

(Pending queries) \(\mathcal {Q}\) only has the queries which cannot be immediately answered by \(\sigma _u\) or \(\sigma _o\), i.e., as long as \(\langle P, \eta , \ell \rangle \) is in \(\mathcal {Q}\), the following are invariants of BndSafety.

  1. 1.

    \(\not \models \llbracket {\Sigma _{P}} \rrbracket _o^{\ell } \implies \lnot \eta \), and

  2. 2.

    \(\models \llbracket {\Sigma _{P}} \rrbracket _u^{\ell } \implies \lnot \eta \).

Proof

We first show that the invariants hold when a query is newly created by Query. Let P, \(\eta \) and \(\ell \) be, respectively, R, \(\psi [\overline{a} \leftarrow \overline{v}_{R}]\) and \(b-1\), as in the conclusion of the rule. The last-but-one premise of Query is

$$\begin{aligned} \models \llbracket {\pi _{ pre}} \rrbracket _o^{b-1} \wedge \llbracket {\Sigma _{R}(\overline{a})} \rrbracket _u^{b-1} \wedge \llbracket {\pi _{ suf}} \rrbracket _u^{b-1} \implies \lnot \varphi \end{aligned}$$

which implies that

$$\begin{aligned} \models \llbracket {\Sigma _{R}(\overline{a})} \rrbracket _u^{b-1} \implies \lnot \left( \llbracket {\pi _{ pre}} \rrbracket _o^{b-1} \wedge \llbracket {\pi _{ suf}} \rrbracket _u^{b-1} \wedge \varphi \right) . \end{aligned}$$

The variables not in common, viz., \((\overline{v}_{P} \cup \overline{\ell }_{P}) \setminus \overline{a}\), can be universally quantified from the right hand side resulting in \(\models \llbracket {\Sigma _{R}} \rrbracket _u^{b-1} \implies \lnot \eta \). Similarly, \(\not \models \llbracket {\Sigma _{R}} \rrbracket _o^{b-1} \implies \lnot \eta \) follows from the last premise of the rule. Next, we show that May and Must preserve the invariants.

Let May answer a query \(\langle P, \varphi , \ell \rangle \) with a new may-summary \(\psi \) and let the updated map of may-summaries be \(\sigma _o' = \sigma _o\cup \{\langle P,\ell \rangle \mapsto \psi \}\). Now, consider \(\langle P, \eta , \ell ' \rangle \in \mathcal {Q}\) after the application of the rule. If \(\ell ' > \ell \), \(O^{\ell '}_{\sigma _o'} = O^{\ell '}_{\sigma _o}\) and the invariant continues to hold. So, assume \(\ell ' \le \ell \). From the conclusion of May, we have \(\not \models \llbracket {\Sigma _{P}} \rrbracket ^{\ell '}_o \wedge \psi \implies \lnot \eta \). Now, \(O^{\ell '}_{\sigma _o'}(\Sigma _{P}) = O^{\ell '}_{\sigma _o}(\Sigma _{P}) \wedge \psi \). So, the invariant continues to hold.

Similarly, let Must answer a query \(\langle P, \varphi , \ell \rangle \) with a new must-summary \(\psi \) and let the updated map of must-summaries be \(\sigma _u' = \sigma _u\cup \{\psi \mapsto \langle P,\ell \rangle \}\). Now, consider \(\langle P, \eta , \ell ' \rangle \in \mathcal {Q}\) after the application of the rule. If \(\ell ' < \ell \), \(U^{\ell '}_{\sigma _u'} = U^{\ell '}_{\sigma _u}\) and the invariant continues to hold. So, assume \(\ell ' \ge \ell \). From the conclusion of Must, we have \(\models \psi \implies \lnot \eta \). Assuming the invariant holds before the rule application, we also have \(\models \llbracket {\Sigma _{P}} \rrbracket _u^{\ell '} \implies \lnot \eta \). Therefore, we have \(\models \llbracket {\Sigma _{P}} \rrbracket _u^{\ell '} \vee \psi \implies \lnot \eta \). Now, \(U^{\ell '}_{\sigma _u'}(\Sigma _{P}) = U^{\ell '}_{\sigma _u}(\Sigma _{P}) \vee \psi \). So, the invariant continues to hold. \(\square \)

The next few lemmas show that the rules of the algorithm cannot be applied indefinitely, leading to a termination argument. Let N be the number of procedures in the program \({\mathcal {A}}\), p be the maximum number of paths in a procedure, and c be the maximum number of procedure calls along any path in \({\mathcal {A}}\).

Lemma 3

(Finitely-many must summaries) Given a predicate symbol \(\Sigma _{P}\) and a bound b, the environment \(U^b_{\sigma _u}\) is updated only \(O(N^b \cdot p^{b+1})\)-many times.

Proof

The environment \(U^b_{\sigma _u}\) can be updated for \(\Sigma _{P}\) and b whenever a must-summary is inferred for P at a bound \(b' \le b\). Now, a must-summary is obtained per path (after eliminating the local variables) of a procedure, using the currently known must-summaries about the callees. Moreover, Lemmas 1 and 2 imply that no must-summary is inferred twice. This is because whenever a query is answered using Must, the query could not have been answered using already existing must-summaries and a new must-summary is inferred.

This gives the following recurrence \({ Must}(b)\) for the number of updates to \(U^b_{\sigma _u}\) for a given \(\Sigma _{P}\):

$$\begin{aligned} { Must}(b) = {\left\{ \begin{array}{ll} p, &{} b = 0\\ (p \cdot N + 1) \cdot { Must}(b-1), &{} b > 0. \end{array}\right. } \end{aligned}$$

In words, for \(b=0\), the number of updates is given by the number of must-summaries that can be inferred, which is bounded by the number of paths p in the procedure P. For \(b>0\), the environment \(U^{b}_{\sigma _u}\) is updated when a must-summary is learnt for the procedure at a bound smaller than or equal to b. For the former, the number of updates is simply \({ Must}(b-1)\). For the latter, a new must-summary is inferred at bound b along a path whenever \(U^{b-1}_{\sigma _u}\) changes for a callee. For N procedures and p paths, this is given by \((p \cdot N \cdot { Must}(b-1))\).

This gives us \({ Must}(b) = O(N^b \cdot p^{b+1})\). \(\square \)

Lemma 4

(Finitely-many queries) For \(\langle {P}, {\varphi }, {b} \rangle \in \mathcal {Q}\), Query is applicable only \(O(c \cdot N^b \cdot p^{b+1})\)-many times.

Proof

First, assume that the environments \(U^{b-1}_{\sigma _u}\) and \(O^{b-1}_{\sigma _o}\) are fixed. The number of possible queries that can be created for a given path of P is bounded by the number of ways the path can be divided into a prefix, a procedure call, and a suffix. This is bounded by c, the maximum number of calls along the path. For p paths, this is bounded by \(c \cdot p\).

Consider a path \(\pi \) and its division, and let a query be created for a callee R along \(\pi \). Now, while the query is still in \(\mathcal {Q}\), updates to the environments \(O^{b-1}_{\sigma _o}\) and \(U^{b-1}_{\sigma _u}\) do not result in a new query for R for the same division along \(\pi \). This is because, the new query would overlap with the existing one and this is disallowed by the second side-condition of Query.

Suppose that the new query is answered by May. With the updated map of may-summary, the last premise of Query can be shown to fail for the current division of \(\pi \). If \(O^{b-1}_{\sigma _o}\) is updated, the last premise continues to fail. So, a new query can be created for the same prefix and suffix along \(\pi \) only if \(U^{b-1}_{\sigma _u}\) is updated for some callee along \(\pi \). The other possibility is that the query is answered by Must which updates \(U^{b-1}_{\sigma _u}\) as well.

Thus, for a given path, and a given division of it into prefix and suffix, the number of queries that can be created is bounded by the number of updates to \(U^{b-1}_{\sigma _u}\) which is \((N \cdot { Must}(b-1))\). Here, \({ Must}\) is as in Lemma 3. So, the number of times Query is applicable for a given query \(\langle {P}, {\varphi }, {b} \rangle \) is \(O(p \cdot c \cdot N \cdot { Must}(b-1))\). As \({ Must}(b) = N^b \cdot p^{b+1}\), we obtain the bound \(O(c \cdot N^b \cdot p^{b+1})\). \(\square \)

Lemma 5

(Progress) As long as \(\mathcal {Q}\) is non-empty, either May, Must or Query is always applicable.

Proof

First, we show that for every query in \(\mathcal {Q}\), either of the three rules is applicable, without the second side-condition in Query. Let \(\langle P,\varphi ,b \rangle \in \mathcal {Q}\). If \(\models \llbracket {\beta _{P}} \rrbracket _o^{b-1} \implies \lnot \varphi \), then May is applicable. Otherwise, there exists a path \(\pi \in { Paths}({P})\) such that \(\llbracket {\pi } \rrbracket _o^{b-1}\) is satisfiable with \(\varphi \), i.e., \(\not \models \llbracket {\pi } \rrbracket _o^{b-1} \implies \lnot \varphi \). Now, if \(\llbracket {\pi } \rrbracket _u^{b-1}\) is also satisfiable with \(\varphi \), i.e., \(\not \models \llbracket {\pi } \rrbracket _u^{b-1} \implies \lnot \varphi \), Must is applicable. Otherwise, \(\models \llbracket {\pi } \rrbracket _u^{b-1} \implies \lnot \varphi \). Note that this can only happen if \(b > 0\), as otherwise, there will not be any procedure calls along \(\pi \) and \(\llbracket {\pi } \rrbracket _o^{b-1}\) and \(\llbracket {\pi } \rrbracket _u^{b-1}\) would be equivalent.

Let \(\pi = \pi _0 \wedge \pi _1 \wedge \dots \pi _l\) for some finite l. Then, \(\llbracket {\pi } \rrbracket _o^{b-1}\) is obtained by taking the conjunction of the formulas

$$\begin{aligned} \langle \llbracket {\pi _0} \rrbracket _o^{b-1}, \llbracket {\pi _1} \rrbracket _o^{b-1}, \dots \rangle . \end{aligned}$$

Similarly, \(\llbracket {\pi } \rrbracket _u^{b-1}\) is obtained by taking the conjunction of the formulas

$$\begin{aligned} \langle \llbracket {\pi _0} \rrbracket _u^{b-1}, \llbracket {\pi _1} \rrbracket _u^{b-1}, \dots \rangle . \end{aligned}$$

From Theorem 1, we can think of obtaining the latter sequence of formulas by conjoining \(\llbracket {\pi _i} \rrbracket _u^{b-1}\) to \(\llbracket {\pi _i} \rrbracket _o^{b-1}\) for every i. When this is done backwards for decreasing values of i, an intermediate sequence looks like

$$\begin{aligned} \langle \llbracket {\pi _0} \rrbracket _o^{b-1}, \dots , \llbracket {\pi _{j-1}} \rrbracket _o^{b-1}, \llbracket {\pi _j} \rrbracket _u^{b-1} \dots \rangle . \end{aligned}$$

As \(\llbracket {\pi } \rrbracket _u^{b-1}\) is unsatisfiable with \(\varphi \), there exists a maximal j such that the conjunction of constraints in such an intermediate sequence are unsatisfiable with \(\varphi \). Moreover, \(\pi _j\) must be a literal of the form \(\Sigma _{R}(\overline{a})\) as otherwise, \(\llbracket {\pi _j} \rrbracket _o^{b-1} = \llbracket {\pi _j} \rrbracket _u^{b-1}\) violating the maximality condition on j. Thus, all premises of Query hold and the rule is applicable.

Now, the second side-condition in Query can be trivially satisfied by always choosing a query in \(\mathcal {Q}\) with the smallest bound for the next rule to apply. This is because, if \(\langle {R}, {\eta }, {b-1} \rangle \) is the newly created query, there is no other query in \(\mathcal {Q}\) for R and \(b-1\). \(\square \)

Lemmas 4 and 5 imply that every query in \(\mathcal {Q}\) is eventually answered by May or Must, as shown below.

Lemma 6

(Eventual answer) Every \(\langle {P}, {\varphi }, {b} \rangle \in \mathcal {Q}\) is eventually answered by May or Must, in \(O(b \cdot c^b \cdot (Np)^{O(b^2)})\) applications of the rules.

Proof

Firstly, to answer any given query in \(\mathcal {Q}\), Lemma 4 guarantees that the algorithm can only create finitely many queries. Lemma 5 guarantees that some rule is always applicable, as long as \(\mathcal {Q}\) is non-empty. Thus, when Query cannot be applied for any query in \(\mathcal {Q}\), either May or Must must be applicable for some query. Thus, eventually, all queries are answered.

The total number of rule applications to answer \(\langle {P}, {\varphi }, {b} \rangle \) is then linear in the cumulative number of applications of Query, which has the following recurrence:

$$\begin{aligned} T(b) = {\left\{ \begin{array}{ll} Q(0), &{} b = 0\\ Q(b) (1 + T(b-1)), &{} b > 0. \end{array}\right. } \end{aligned}$$

where Q(b) denotes the number of applications of Query for a fixed query in \(\mathcal {Q}\) at bound b. From Lemma 4, \(Q(b) = O(c \cdot N^{b} \cdot p^{b+1})\). This gives us \(T(b) = O(b \cdot c^b \cdot (Np)^{O(b^2)})\).       \(\square \)

The main termination theorem is an immediate consequence of the above lemma:

Theorem 2

Given a satisfiability oracle for \({ Th}\), \(\textsc {BndSafety} ({\mathcal {A}}, \varphi , n, \emptyset , \emptyset )\) decides bounded safety in finitely many iterations and terminates.

As an immediate corollary, RecMC is guaranteed to find a counterexample if one exists.

Corollary 1

\(\textsc {RecMC} ({\mathcal {A}},\varphi )\) is guaranteed to return UNSAFE with a counterexample if \({\mathcal {A}}\not \models \varphi \).

In contrast, the closest related algorithm GPDR [15], mentioned briefly in Sect. 1, does not have such guarantees. Finally, for Boolean programs \(\textsc {RecMC} \) is a complete decision procedure. Unlike the general case, the number of reachable states of a Boolean program, and hence the number of summaries, is finite. Let N denote the number of procedures of a program \({\mathcal {A}}\) and \(k = \max \{|\overline{v}_{P}| \mid P(\overline{v}_{P}) \in {\mathcal {A}}\}\).

Theorem 3

Let \({\mathcal {A}}\) be a Boolean program. Then \(\textsc {RecMC} ({\mathcal {A}}, \varphi )\) terminates in \(O(N^2 \cdot 2^{2k})\)-many applications of the rules in Fig. 7.

Proof

First, assume a bound n on the call-stack. The number of queries that can be created for a procedure at any given bound is \(O(2^k)\), the number of possible valuations of the parameters (note that Query disallows overlapping queries to be present simultaneously in \(\mathcal {Q}\)). For N procedures and n possible values of the bound, the complexity of \(\textsc {BndSafety} ({\mathcal {A}}, \varphi , n, \emptyset , \emptyset )\), for a Boolean program, is \(O(N \cdot 2^k \cdot n)\).

Now, the total number of may-summaries that can be inferred for a procedure is also bounded by \(O(2^k)\). As \(O_{\sigma _o}^b\) is monotonic in b, the number of iterations of RecMC is bounded by \(O(N \cdot 2^k)\), the cumulative number of states of all procedures. Thus, we obtain the complexity of RecMC as \(O(N^2 \cdot 2^{2k})\). \(\square \)

Note that the number of states of a Boolean program is \(O(N \cdot 2^k)\), so the above complexity is polynomial in the number of states. In contrast, other SMT-based algorithms, such as Whale [14], are worst-case exponential in the number of states of a Boolean program. Also, note that the complexity is quadratic in the number of procedures as opposed to the known upper-bound which has a linear dependency [1]. This is a manifestation of the iterative deepening strategy of RecMC and in particular, the may-summaries computed by the algorithm, which is necessary for handling programs over first-order theories. In contrast, the known optimal algorithms for Boolean programs do not compute may-summaries.

In summary, RecMC checks safety of a recursive program by inferring the necessary under- and over-approximations of procedure semantics and using them to analyze procedures individually.

5 Model based projection

The algorithm RecMC described in the previous section works for an arbitrary first-order signature \(\mathcal {S}\) and a \(\mathcal {S}\)-theory \({ Th}\) as long as there is an oracle for satisfiability modulo \({ Th}\). In this section, we consider the case of \({ Th}\) being either Linear Rational Arithmetic (LRA) or Linear Integer Arithmetic (LIA). Note that the program can also have Boolean parameters or local variables. RecMC can be used as-is, but recall that BndSafety introduces quantifiers in the formulas maintained by the algorithm. This is because the may and must-summaries are formulas over the parameters of a procedure and auxiliary variables denoting their initial values, and when creating a new summary, all other variables will be quantified away. Same is the case with creating new bounded safety properties. Unless eliminated, these quantifiers accumulate exponentially in the value of the bound corresponding to the bounded safety problem. This is because, if no quantified variable is eliminated, the compositional algorithm essentially breaks down into an algorithm that unrolls the call-graph into a tree where, as we mentioned earlier, the size of the SMT problems created may grow exponentially in the bound on the call-stack. On the other hand, it is expensive to use quantifier elimination (QE) to obtain an equivalent quantifier-free formula. Instead, we propose an alternative approach that approximates QE with quantifier-free formulas lazily and efficiently.

In particular, we (a) introduce a model-based under-approximation of QE for existentially quantified formulas, called Model Based Projection (MBP), (b) give efficient (linear in the size of formulas involved) MBP procedures for Propositional Logic, Linear Rational Arithmetic (LRA), and Linear Integer Arithmetic (LIA; also well known as Presburger Arithmetic), and (c) present a modified version of BndSafety that uses MBP to under-approximate the existential quantification of variables out of scope, and show that it remains sound and terminating. Our MBP procedures for LRA and LIA are based on the QE algorithms by Loos and Weispfenning [23] and Cooper [24], respectively.

Definition 2

(Model Based Projection) Let \(\eta (\overline{y}) = \exists \overline{x} \cdot \eta _m(\overline{x},\overline{y})\) be an existentially quantified formula where \(\eta _m\) is quantifier free. A function \( Proj _{\eta }\) from models of \(\eta _m\) to quantifier-free formulas over \(\overline{y}\) is a Model Based Projection (for \(\eta \)) iff

  1. 1.

    \( Proj _{\eta }\) has a finite image,

  2. 2.

    \(\eta \equiv \bigvee _{M \models \eta _m} Proj _{\eta }(M)\), and

  3. 3.

    for every model M of \(\eta _m\), \(M \models Proj _{\eta }(M)\).

In other words, \( Proj _{\eta }\) covers the space of all models of \(\eta _m(\overline{x},\overline{y})\) by a finite set of quantifier-free formulas over \(\overline{y}\). Note that there is a trivial MBP that maps every model of \(\eta _m\) to a quantifier-free formula equivalent to \(\eta \). However, when QE is expensive, it is not the most efficient MBP and our objective is to obtain an MBP that maps models to quantifier-free under-approximations of \(\eta \). In the following, we describe MBP procedures whose computation is linear in time and space given a model.

5.1 MBP for propositional logic

Let \(\eta (\overline{y}) = \exists \overline{x} \cdot \eta _m(\overline{x},\overline{y})\) be an existentially quantified formula where the quantified variables in \(\overline{x}\) are all Boolean (propositional). Without loss of generality, assume that \(\overline{x}\) is singleton. Our MBP procedure is based on the following equivalence:

$$\begin{aligned} \exists x \cdot \eta _m(x,\overline{y}) ~\equiv ~ \eta _m[{x} \mapsto {\bot }] \vee \eta _m[{x} \mapsto {\top }] \end{aligned}$$
(6)

where \(\eta _m[{x} \mapsto {t}]\) denotes the result of substituting the term t for x in \(\eta _m\).

We now define an MBP \( BoolProj _{\eta }\) for Propositional Logic as a map from models of \(\eta _m\) to one of the disjuncts above depending on the assignment to x in the given model:

$$\begin{aligned} BoolProj _{\eta }(M) = {\left\{ \begin{array}{ll} \eta _m[{x} \mapsto {\bot }], &{} M \models x=\bot \\ \eta _m[{x} \mapsto {\top }], &{} M \models x=\top \end{array}\right. } \end{aligned}$$

This procedure is also used in the GPDR model checking algorithm [15] implemented in the tool Z3 [26] and a similar approach is used in SAT-based iterative quantifier elimination in hardware verification [27]. The following is now immediate.

Theorem 4

\( BoolProj _{\eta }\) is a Model Based Projection.

5.2 MBP for linear rational arithmetic (LRA)

We begin with a brief overview of Loos-Weispfenning (LW) method [23] for quantifier elimination in LRA. We borrow our presentation from Nipkow [28] to which we refer the reader for more details. Let \(\eta (\overline{y}) = \exists \overline{x} \cdot \eta _m (\overline{x},\overline{y})\) as above. Assume that \({ Th}\) is LRA. Without loss of generality, assume that \(\overline{x}\) is singleton containing a rational variable x, \(\eta _m\) is in Negation Normal Form, and x only appears in the literals of the form \(\ell < x\), \(x < u\), and \(x=e\), where \(\ell \), u, and e are x-free. Let \(\text {lits }({\eta })\) denote the literals of \(\eta \). The LW-method states that \(\exists x \cdot \eta _m(x, \overline{y}) ~\equiv \)

$$\begin{aligned} \bigvee _{(x \,=\, e) \in \text {lits }({\eta })} \eta _m[{x} \mapsto _v {e}] ~\vee ~ \bigvee _{(\ell < x) \in \text {lits }({\eta })} \eta _m [{x} \mapsto _v {(\ell + \epsilon )}] ~\vee ~~ \eta _m[{x} \mapsto _v {-\infty }] \end{aligned}$$
(7)

where \(\eta _m[{x} \mapsto _v {t}]\) denotes the result of a virtual substitution of the term t for x in \(\eta _m\). Note that the symbols \(\epsilon \) and \(-\infty \) do not appear in the results of the substitutions (which is why the substitution is called virtual). Intuitively, \(\eta _m[{x} \mapsto _v {e}]\) covers the case when a literal \((x=e)\) is true. Otherwise, the set of \(\ell \)’s in the literals \((\ell < x)\) identify intervals in which x can lie which are covered by the remaining substitutions.

To perform the virtual substitution, the LW-method defines a substitution map \({ Sub}_{t}\) of literals containing x corresponding to \(\eta _m[{x} \mapsto _v {t}]\), defined as follows:

$$\begin{aligned}&{ Sub}_e(x=e) = \top , \quad { Sub}_e(\ell<x) = (\ell<e), \quad { Sub}_e(x<u) = (e<u) \end{aligned}$$
(8)
$$\begin{aligned}&{ Sub}_{\ell +\epsilon }(x=e) = \bot ,\quad { Sub}_{\ell +\epsilon }(\ell '<x) = (\ell ' \le \ell ),\quad { Sub}_{\ell +\epsilon }(x<u) = (\ell <u)\end{aligned}$$
(9)
$$\begin{aligned}&{ Sub}_{-\infty }(x=e) = \bot ,\quad { Sub}_{-\infty }(\ell<x) = \bot , \quad { Sub}_{-\infty }(x<u) =\top \end{aligned}$$
(10)

For example, let \(\eta _m\) be \((x=e \wedge \phi _1) \vee (\ell< x \wedge x< u) \vee (x<u \wedge \phi _2)\), where \(\ell ,e,u,\phi _1,\phi _2\) are x-free. Then,

\(\eta _m[{x} \mapsto _v {e}]\)

$$\begin{aligned}= & {} \left( { Sub}_{e}(x=e) \wedge \phi _1\right) \vee \left( { Sub}_{e}(\ell<x) \wedge { Sub}_{e}(x<u)\right) \vee \left( { Sub}_{e}(x<u) \wedge \phi _2\right) \\\equiv & {} \big ( \phi _1 \vee \left( \ell< e \wedge e< u \right) \vee \left( e<u \wedge \phi _2 \right) \big ), \end{aligned}$$

\(\eta _m[{x} \mapsto _v {(\ell +\epsilon )}]\)

$$\begin{aligned} =&\left( { Sub}_{\ell +\epsilon }(x=e) \wedge \phi _1\right) \vee \left( { Sub}_{\ell +\epsilon }(\ell<x) \wedge { Sub}_{\ell +\epsilon }(x<u)\right) \vee \left( { Sub}_{\ell +\epsilon }(x<u) \wedge \phi _2\right) \\ \equiv&~ \big ( \ell< u \vee (\ell <u \wedge \phi _2) \big ), \end{aligned}$$

and \(\eta _m[{x} \mapsto _v {-\infty }]\)

$$\begin{aligned} =&\left( { Sub}_{-\infty }(x=e) \wedge \phi _1\right) \vee \left( { Sub}_{-\infty }(\ell<x) \wedge { Sub}_{-\infty }(x<u)\right) \vee \left( { Sub}_{-\infty }(x<u) \wedge \phi _2\right) \\ \equiv&~ \phi _2. \end{aligned}$$

Together, we obtain \(\exists x \cdot \eta _m \equiv \phi _1 \vee (\ell < u) \vee \phi _2\).

We now define an MBP \( LRAProj _{\eta }\) for LRA as a map from models of \(\eta _m\) to disjuncts in (7). Given \(M \models \eta _m\), \( LRAProj _{\eta }\) picks a disjunct that covers M based on values of the literals of the form \(x=e\) and \(\ell < x\) in M. Ties are broken by a syntactic ordering on terms (e.g., when \(M \models \ell ' = \ell \) for two literals \(\ell < x\) and \(\ell ' < x\)).

$$\begin{aligned} LRAProj _{\eta }(M) = {\left\{ \begin{array}{ll} \eta _m[{x} \mapsto _v {e}], &{} \text {if } (x=e)\in \text {lits }({\eta }) \wedge M \models x=e \\ \eta _m[{x} \mapsto _v {(\ell +\epsilon )}], &{} \text {else if } (\ell< x) \in \text {lits }({\eta }) \wedge M \models \ell<x \wedge {} \\ &{} \forall (\ell '<x) \in \text {lits }({\eta }) \cdot \\ &{} M \models \left( (\ell ' < x) \implies (\ell ' \le \ell ) \right) \\ \eta _m[{x} \mapsto _v {-\infty }], &{} \text {otherwise} \end{array}\right. } \end{aligned}$$

Theorem 5

\( LRAProj _{\eta }\) is a Model Based Projection.

Proof

By definition, \( LRAProj _{\eta }\) has a finite image, as there are only finitely many disjuncts in (7). Thus, it suffices to show that for every \(M \models \eta _m\), \(M \models LRAProj _{\eta }(M)\).

Let \(M \models \eta _m\) and \( LRAProj _{\eta }(M) = \eta _m[{x} \mapsto _v {t}]\) where t is either e or \(\ell +\epsilon \) or \(-\infty \). As \(\eta _m\) is in NNF, it suffices to show that for every literal \(\mu \) of \(\eta _m\) containing x, the following holds:

$$\begin{aligned} M \models (\mu \implies { Sub}_t(\mu )) \end{aligned}$$
(11)

We consider the different possibilities of t below. For a term v, let M[v] denote the value of v in M.

In this case, we know that \(M\models x=e\). Now, for a literal \(\ell < x\),

$$\begin{aligned} M[\ell< x]&\implies M[\ell ]< M[x]&\\&= M[\ell ]< M[e]&\\&= M[\ell<e]&\\&= M[{ Sub}_{t}(\ell<x)]&\{{ Sub}_{t}(\ell<x) = (\ell < e)\}. \end{aligned}$$

Similarly, literals of the form \(x<u\) and \(x=e'\) can be considered.

Case \(t = e\).Case \(t =\ell +\epsilon \). In this case, we know that \(M[\ell <x]\) is true, i.e., \(M[\ell ] < M[x]\) and whenever \(M[\ell ' < x]\) is true, \(M[\ell ' \le \ell ]\) is also true. Now, for a literal \(\ell ' < x\),

$$\begin{aligned} M[\ell '< x]&\implies&M[\ell ' \le \ell ] \\= & {} M[{ Sub}_{t}(\ell '< x)] \quad \{{ Sub}_{t}(\ell '<x) = (\ell ' \le \ell )\}. \end{aligned}$$

For a literal \(x < u\),

$$\begin{aligned} M[x< u]&\implies M[x]< M[u]&\\&\implies M[\ell ]< M[u]&\{M[\ell ]< M[x]\}\\&\implies M[\ell<u]&\\&= M[{ Sub}_{t}(x<u)]&\{{ Sub}_{t}(x<u) = (\ell <u)\} \end{aligned}$$

For a literal \(x=e\), (11) vacuously holds as \(M[x=e]\) is false.

Case \(t = e\).Case \(t = -\infty \). In this case, we know that \(M[x=e]\) and \(M[\ell < x]\) are false for every literal of the form \(x=e\) and \(\ell <x\). So, for such literals (11) vacuously holds. For a literal \(x<u\), \({ Sub}_{t}(x<u) = \top \) and hence, (11) holds again. \(\square \)

5.3 MBP for linear integer arithmetic (LIA)

We will now present our MBP \({ LIAProj}_{\eta }\) for LIA. It is based on Cooper’s method for Quantifier Elimination procedure for LIA [24]. Let \(\eta (\overline{y}) = \exists x \cdot \eta _m (x,\overline{y})\), where \(\eta _m\) is quantifier free and in negation normal form. Assume that \({ Th}\) is LIA and x is an integer variable. Without loss of generality, let the only literals containing x be the form \(\ell < x\), \(x < u\), \(x = e\) or \((d \mid \pm x + w)\), where \(a \mid b\) denotes that a divides b, the terms \(\ell \), u, e and w are x-free, and \(d \in \mathbb {Z} \setminus \{0\}\). Let \(E = \{e \mid (x = e) \in \text {lits }({\eta _m})\}\) be the set of equality terms of x and \(L = \{\ell \mid (\ell < x) \in \text {lits }({\eta _m})\}\) be the set of lower-bounds of x. Then, by Cooper’s method, \(\exists x \cdot \eta _m (x,\overline{y}) ~\equiv \)

$$\begin{aligned} \bigvee _{(x\, =\, e) \in \text {lits }({\eta })} \eta _m[{x} \mapsto _v {e}] \vee \bigvee _{(\ell < x) \in \text {lits }({\eta })} \left( \bigvee _{i=\, 0}^{D-1} \eta _m[{x} \mapsto _v {(\ell + 1 + i)}] \right) \vee \bigvee _{i \,=\, 0}^{D-1} \eta _m^{-\infty }[{x} \mapsto _v {i}] \end{aligned}$$
(12)

where D is the least common multiple of all the divisors in the divisibility literals of \(\eta _m\), \([{x} \mapsto _v {t}]\) denotes a virtual substitution of the term t for x and \(\eta _m^{-\infty }\) is obtained from \(\eta _m\) by substituting all non-divisibility literals as follows:

$$\begin{aligned} (\ell< x)&\mapsto \bot&(x < u)&\mapsto \top&(x=e)&\mapsto \bot \end{aligned}$$
(13)

Intuitively, the disjunction partitions the space of the possible values of x. A disjunct for \((x=e)\) covers the case when x is equal to an equality term. Otherwise, the lower-bounds identify various intervals in which x can be present. The disjuncts for \((\ell < x)\) cover the case when x satisfies a lower-bound, and the last disjunct is for the case when x is smaller than all lower-bounds. The disjunction over the possible values of i covers the different ways in which the divisibility literals can be satisfied.

Model based projection \({ LIAProj}_{\eta }\) is defined as follows, conflicts are resolved by some arbitrary, but fixed, syntactic ordering on terms:

$$\begin{aligned} { LIAProj}_{\eta }(M) = {\left\{ \begin{array}{ll} \eta _m[{x} \mapsto _v {e}], &{} \text {if } x=e \in \text {lits }({\eta }) \wedge M\models (x=e) \\ \eta _m[{x} \mapsto _v {(\ell + 1 + i_\ell )}], &{} \text {else if } (\ell<x) \in \text {lits }({\eta }) \wedge M \models (\ell<x) \wedge {}\\ &{} \forall (\ell '<x) \in \text {lits }({\eta }) \cdot \\ &{} \quad \left( M \models ((\ell ' < x) \implies (\ell ' \le \ell )) \right) \\ \eta _m^{-\infty }[{x} \mapsto _v {i_{-\infty }}], &{} \text {otherwise} \end{array}\right. } \end{aligned}$$

where \(i_\ell = M[x-(\ell +1)] \bmod D\), \(i_{-\infty } = M[x] \bmod D\), and M[x] is the value of x in M.

The following theorem shows that \({ LIAProj}_{\eta }\) is indeed a model based projection. The proof is similar to that of Theorem 5.

Theorem 6

\({ LIAProj}_{\eta }\) is a Model Based Projection.

5.4 Bounded safety with MBP

Given an MBP \( Proj _{\eta }\) for an existentially quantified formula \(\eta \), we have seen above that each quantifier-free formula in the image of \( Proj _{\eta }\) under-approximates \(\eta \). As above, we use \(\eta _m\) for the quantifier-free matrix of \(\eta \). We can now modify the side-condition \(\psi =\eta \) of Must and Query in the algorithm BndSafety to use quantifier-free under-approximations as follows: (i) for Must, the new side-condition is \(\psi = Proj _{\eta }(M)\) where \(M \models \eta _m \wedge \varphi \), and (ii) for Query, the new side-condition is \(\psi = Proj _{\eta }(M)\) where \(M \models \eta _m \wedge \llbracket {\Sigma _{R}(a)} \rrbracket ^{b-1}_o\). Note that to avoid redundant applications of the rules, we require M to satisfy a formula stronger than \(\eta _m\). Intuitively, (i) ensures that the newly inferred reachability fact answers the current query and (ii) ensures that the new query cannot be immediately answered by known facts. In both cases, the required model M can be obtained as a side-effect of discharging the premises of the rules. Soundness of BndSafety is unaffected and termination of \(\textsc {BndSafety} \) follows from the image-finiteness of \( Proj _{\eta }\).

Theorem 7

Assuming an oracle and an MBP for \({ Th}\), BndSafety is sound and terminating after modifying the rules as described above.

Proof sketch

Here, we show that BndSafety with MBP is sound and terminating.

First of all, in presence of MBP, May is unaffected and a reachability fact inferred by Must is only strengthened. Thus, soundness of BndSafety (Theorem 1) is preserved.

Then, it is easy to show that the modified side-conditions to Must and Query preserve Lemmas 1 and 2 and we skip the proof.

Then, we will show that the finite-image property of an MBP preserves the finiteness of the number of reachability facts inferred and the number of queries generated by the algorithm. Let d be the size of the image of an MBP. In the proof of Lemma 3, the recurrence relation will now have an extra factor of d. The rest of the proof of finiteness of the number of reachability facts remains the same. Similarly, in the proof of Lemma 4, the number of times Query can be applied along a path for a fixed division and fixed environments \(O^{b-1}_\sigma \) and \(U^{b-1}_\rho \) will increase by a factor of d. Again, the rest of the proof of finiteness of the number of queries generated remains the same. That is, Lemmas 3 and 4, and hence, Lemma 6, are preserved with scaled up complexity bounds.

Note that Theorem 5 is unaffected by under-approximations.

Together, we have that Theorem 2 is preserved, with a scaled up complexity bound. \(\square \)

Thus, BndSafety with a linear-time MBP (such as \( LRAProj _{\eta }\)) keeps the size of the formulas small by efficiently inferring only the necessary under-approximations of the quantified formulas.

6 Implementation and experiments

We have implemented RecMC for analyzing C programs as part of our tool Spacer. The back-end is based on Z3 [26] which is used for SMT-solving and interpolation. It supports propositional logic, linear arithmetic, and bit-vectors (via bit-blasting). The front-end is based on the tool UFO [29]. It encodes safety of a C program by converting it to the Horn-SMT format of Z3, which corresponds to the logical program representation described in Sect. 3. Loops are handled by creating fresh predicate symbols denoting the loop invariants and encoding the corresponding verification conditions. The implementation and benchmarks are available online.Footnote 2 We evaluated Spacer on three sets of benchmarks:

  1. (a)

    2908 Boolean programs obtained from the SLAM toolkit,Footnote 3

  2. (b)

    1535 procedural programs from Microsoft’s SDV project,Footnote 4 and

  3. (c)

    797 C programs from the Software Verification Competition (SVCOMP) 2014 [30].

The numbers of programs mentioned for the second and third sets of benchmarks above exclude programs with non-linear operations and memory-related properties as Spacer cannot handle them yet. The 797 programs in the third set of benchmarks also exclude programs that can be easily verified by our frontend (which converts a C program to the Horn-SMT format) using common compiler optimizations. Note that the programs in the last set have non-recursive procedures (which may still have loops) and our current frontend inlines all procedure calls. We call the resulting set of encodings Svcomp-1. Note that Svcomp-1 essentially corresponds to while-programs. We introduced procedural modularity in Svcomp-1 by two distinct means: (a) factoring out maximal loop-free fragments into new loop-free, recursion-free procedures (the main procedure may still have loops) to obtain Svcomp-2, and (b) factoring out loops into tail-recursive procedures (in an inside-out fashion for nested loops) to obtain Svcomp-3. Our simple outlining procedure could not handle some large programs and Svcomp-3 has 45 fewer programs.

Figure 9 shows some characteristics of the Horn-SMT encodings for the benchmarks, when viewed according to the logical program representation described in Sect. 3. The number of calls along a path roughly identifies the procedural modularity of the encodings. The number of calls of a procedure (in the entire program) identifies the potential number of times a summary (may or must) of the procedure can be reused for a given bound on the call-stack. Note that summaries can also be reused across different bounds on the call-stack. Despite the fact that the average number of procedure calls is low from the figure, we can show the practical advantage of RecMC using these benchmarks, as we show below.

Fig. 9
figure 9

Some characteristics of the Horn-SMT encodings of the benchmarks, averaged over all programs in the corresponding set

We compared Spacer against the implementation of GPDR in Z3 [15]. GPDR is inspired by the IC3 hardware model checking algorithm [22] and avoids unrolling the call-graph. Thus, it creates and checks reachability queries for individual procedures similar to RecMC. However, it only computes may summaries and because of the lack of must summaries, its query creation mechanism is quite different from the rule Query. Moreover, it does not use MBP.

In our experiments, the resource limits were set to 30 min of time and 16GB of memory, on an Ubuntu machine with a 2.2 GHz AMD Opteron(TM) Processor 6174 and 516GB RAM. Figure 10 and 11 show a high level summary of the results in terms of the number of programs verified by Spacer and Z3. Since there are some programs verified by only one of the tools, the figures also report the number of programs verified by at least one tool in the third row. We provide a more detailed discussion of the experimental results in the following. In the scatter plots shown below, a diamond indicates a time-out and a star indicates a mem-out.

Fig. 10
figure 10

Number of programs verified for Slam and Sdv benchmarks

Fig. 11
figure 11

Number of programs verified for SVCOMP benchmarks

Fig. 12
figure 12

Spacer versus Z3 for the Slam benchmarks (with ±5 min boundaries)

Fig. 13
figure 13

Spacer versus Z3 for the Boolean program in Fig. 1 which is parametric in the number of procedures

6.1 Boolean program benchmarks

Figure 12 shows the scatter plot of runtimes for Spacer and Z3 for the SLAM benchmarks. The runtimes of both the tools are within ±5 min for over 98% of the benchmarks. Of the remaining, Spacer is better on 1 benchmark, Z3 is better on 42 benchmarks which includes 13 benchmarks where Spacer runs out of time. Recall that Z3 utilizes may summaries which heuristically avoid the possible exponential blow-up associated with unwinding the call-graph and as the plot shows, such a heuristic approach can be better than Spacer in some cases. However, when we compared the tools on the parametric Boolean program from Fig. 1, in which the size of the unrolled call-tree necessarily grows exponentially in the number of procedures, Spacer handles the increasing complexity significantly better than Z3, as shown in Fig. 13.

Fig. 14
figure 14

Spacer versus Z3 for the Sdv benchmarks

6.2 SDV Benchmarks

Figure 14 shows the scatter plot of runtimes for Spacer and Z3 for the SDV benchmarks. Spacer clearly outperforms Z3 including a benchmark where Z3 runs out of time.

Fig. 15
figure 15

The advantage of MBP over Svcomp-1 encodings. As each of these encodings corresponds to a single procedure, must summaries are not required and MBP is the only key difference between Spacer and Z3. Axes are logarithmically scaled

6.3 SVCOMP 2014 Benchmarks

We begin with the scatter plot in Fig. 15 for Svcomp-1 encodings. As mentioned above, Svcomp-1 encodings correspond to while-programs and therefore, do not require must summaries. As the GPDR algorithm also computes may summaries, the plot in Fig. 15 essentially shows the advantage of using MBP in creating a new query as opposed to Z3’s variable substitution based on a given model.Footnote 5

Fig. 16
figure 16

The advantage of must summaries over Svcomp-2 encodings. Axes are logarithmically scaled

To understand the effect of must summaries, we also created a version of Spacer that only infers and utilizes may summaries. We obtained this by modifying Z3 to use MBP in creating new queries. As shown in Fig. 16, the advantage of using must summaries is quite significant on Svcomp-2 encodings.

Fig. 17
figure 17

Spacer versus Z3 for Svcomp-2 encodings. Axes are logarithmically scaled

Fig. 18
figure 18

Spacer versus Z3 for Svcomp-3 encodings. Axes are logarithmically scaled

So, a combination of MBP and must summaries is expected to result in significant improvements over using may summaries alone. This is shown experimentally in Fig. 17 and 18 for the Svcomp-2 and Svcomp-3 encodings which show that Spacer is significantly better than Z3 on most of the programs.

Recall that the rule Query checks the feasibility of a potential counterexample path \(\pi \) by recursively creating a new reachability query for a procedure R called along \(\pi \). Due to our logical representation of a program, one can consider an arbitrary permutation of the conjuncts of \(\pi \) when applying the rule and the choice of the procedure R is not deterministic. Our current implementation in Spacer can order the conjuncts either in the given order or in the reversed order and for lack of good heuristics, we do not consider other permutations. These two orderings correspond to top-down and bottom-up feasibility analyses. In particular, the plot shown in Fig. 17 corresponds to a bottom-up analysis.

Fig. 19
figure 19

Effect of changing the order of query creation in Query in (a) Spacer and (b) the corresponding change in Z3, on Svcomp-2 encodings. Axes are logarithmically scaled

As mentioned in the beginning, the Svcomp-2 encodings are obtained by taking the while-program encodings in Svcomp-1 and factoring out maximal loop-free fragments into new loop-free, recursion-free procedures. Furthermore, as also mentioned in the beginning, loops are encoded in Svcomp-1 by introducing new predicate symbols that denote loop invariants and by encoding the corresponding verification conditions. So, a path in a procedure in the resulting logical encoding (see Sect. 3) contains at most two calls, one corresponding to an invariant at a control location and the other corresponding to a newly introduced procedure for a loop-free fragment. Thus, a top-down analysis refines the may summaries of the new procedures only when necessary, similar to a CEGAR-style reasoning where the may summaries of the new procedures abstract the loop-free fragments. We call this a lazy refinement strategy. In contrast, a bottom-up analysis on these encodings corresponds to an eager refinement strategy which is shown in Fig. 17. Figure 19a shows a scatter plot of runtimes on Svcomp-2 comparing the behavior of Spacer for the two orderings. While it is unclear from the figure which ordering is better, Spacer continues to outperform Z3 even with the lazy strategy, as shown in Fig. 19b.

Fig. 20
figure 20

Comparison of Spacer ’s behavior on the encodings Svcomp-1 and Svcomp-2 of the same SVCOMP benchmarks. This plot only includes data for the benchmarks where Spacer takes more than 5 min on the Svcomp-1 encodings. We use Spacer in the lazy mode for Svcomp-2 encodings

Fig. 21
figure 21

Comparison of Spacer ’s behavior on the encodings Svcomp-1 and Svcomp-3 of the same SVCOMP benchmarks

Finally, as an interesting exercise, we compared the behavior of Spacer on various encodings of the SVCOMP benchmarks. For the comparison of runtimes between Svcomp-2 and Svcomp-1, we restricted ourselves to the benchmarks where Spacer takes more than 5 min on the Svcomp-1 encodings and we considered the lazy mode of Spacer for the Svcomp-2 encodings. Note that checking safety of the Svcomp-2 encodings in the lazy mode essentially corresponds to abstract reasoning by inferring sufficient summaries of the loop-free fragments. So, the rationale behind restricting to this subset of benchmarks is that we can see how helpful abstraction is on hard benchmarks. Figure 20 shows the runtime comparison for the benchmarks where we see that there is no clear winner. However, as we saw in an earlier work [31], abstraction can be quite powerful for SMT-based model checking and we plan to incorporate those ideas into the framework of RecMC in the future. Then, Fig. 21 shows the runtime comparison for the encoding Svcomp-3 against Svcomp-1. Recall that Svcomp-3 encodings are obtained by factoring out loops into tail-recursive procedures. In other words, we are replacing the inference of loop invariants by that of summaries of the corresponding tail-recursive procedures. Whereas a loop invariant depends on the variables in scope, the signature of the corresponding tail-recursive procedure, and hence its summary, depends on two copies of the variables in scope which denote their values before and after a loop iteration. As the plot shows, this can negatively affect the performance of verification.

Overall, we have shown significant practical benefits of the core ideas behind RecMC using our implementation in Spacer and various realistic benchmarks.

7 Related work

There is a large body of work on interprocedural program analysis. It was pointed out early on that safety verification of recursive programs is reducible to the computation of a fixed-point over relations representing the input-output behavior of each procedure [9]. Such relations are also called summaries in the functional approach of Sharir and Pnueli [10]. Reps et al. [8] showed that for a large class of finite, interprocedural dataflow problems, the summaries can be computed in time polynomial in the number of dataflow facts and procedures. Ball and Rajamani [1] adapted the RHS algorithm to the verification of Boolean programs as part of the SLAM project.

Following SLAM, other software model checkers – such as blast [32] and magic [33] – also implemented the CEGAR loop with predicate abstraction. These approaches do not use under-approximating summaries as we do.

In the context of predicate abstraction, the algorithm Smash also combines over- and under-approximations for analyzing procedural programs [34]. However, the summaries in Smash can have auxiliary variables which differ from one calling context to another, restricting the reusability of the summaries. Smash also under-approximates existential quantification in computing the results of the post and pre operations, but unlike RecMC, the under-approximations are obtained using concrete values encountered during testing of the program.

As mentioned earlier in the paper, several SMT-based algorithms have been proposed for safety verification of recursive programs, including Whale [14], HSF [6], Duality [18], Ultimate Automizer [16], and Corral [35]. These algorithms share a similar structure – they use SMT-solvers to look for counterexamples and interpolation to compute over-approximating procedure summaries. The algorithms differ in the SMT encoding and the heuristics used. However, in the worst-case, they completely unroll the call graph into a tree.

The work closest to ours is GPDR [15], which extends the hardware model checking algorithm IC3 of Bradley [22] to SMT-supported theories and recursive programs. Unlike RecMC, GPDR does not maintain must-summaries. In the context of Fig. 7, this means that \(\sigma _u\) is always empty and there is no Must rule. Instead, the Query rule is modified to use a model M that satisfies the premises (instead of our use of the entire path \(\pi \) when creating a query). Furthermore, the reachable queries are cached. In the context of Boolean programs, this ensures that every query is asked at most once (and either cached or blocked by a may-summary). Since there are only finitely many models, the algorithm is guaranteed to terminate. However, in the case of Linear Arithmetic, a formula can have infinitely many models and GPDR might end up applying the Query rule indefinitely (see Appendix). In contrast, RecMC creates only finitely many queries for a given bound on the call-stack depth and is guaranteed to find a counterexample if one exists.

8 Conclusion

We presented RecMC, a new SMT-based algorithm for model checking safety properties of recursive programs. For programs and properties over decidable theories, RecMC is guaranteed to find a counterexample if one exists. To our knowledge, this is the first SMT-based algorithm with such a guarantee while being polynomial for Boolean programs. The key idea is to use a combination of under- and over-approximations of the semantics of procedures, avoiding re-exploration of parts of the state-space. We described an efficient instantiation of RecMC for Linear Arithmetic (over rationals and integers) by introducing Model Based Projection to under-approximate the expensive quantifier elimination. We have implemented it in our tool Spacer and shown empirical evidence that it significantly improves on the state-of-the-art.

In the future, we would like to explore extensions to other theories. Of particular interest are the theory EUF of uninterpreted functions with equality and the theory of arrays. The challenge is to deal with the lack of quantifier elimination. Another direction of interest is to combine \(\textsc {RecMC} \) with Proof-based Abstraction [31, 36, 37] to explore a combination of the approximations of procedure semantics with transition-relation abstraction.