1 Introduction

When trying to prove the correctness of a program, finding useful abstractions in the form of state assertions is the most important part of the process [15, 20]. In this context, an abstraction is useful if it allows correctness to be proven as efficiently as possible. Hence, in order to be able to analyze large programs, it is important to find state assertions automatically. The class of methods based on abstract interpretation [12] is well-known for being able to find state assertions automatically. Abstract interpretation computes an over-approximation of a program’s states by using an up-front and largely program-independent abstraction. Many such abstractions exist [7, 13, 23, 24] and all of them are useful, because they give rise to different kinds of state assertions that can be used to prove the correctness of different kinds of programs. It is the strength of abstract interpretation that it always terminates and always computes a fixpoint in the selected abstraction. If the program contains loops, the computed fixpoint also contains a loop invariant, which allows for an easy abstraction of loops. While abstract interpretation scales favorably with the size of the program, the computed over-approximation is often not precise enough to be useful to prove the correctness of a program.

Another way of proving program correctness is to use software model checking tools like Blast  [6], SLAM  [3], and more recently, CPAchecker  [9] and Ultimate Automizer  [17], that follow the counterexample-guided abstraction refinement (CEGAR) approach [11]. In CEGAR, an abstraction is continuously refined by synthesizing state assertions from paths through the control flow graph of the program that (1) are not contained in the current abstraction, (2) can reach an error location, and (3) are not executable. By extracting state assertions from those paths, the abstraction can be refined to fit the program at hand, which gives the user greater flexibility because a suitable abstraction does not have to be fixed beforehand. The path analysis has to be precise, i.e., it has to ensure that paths that represent real errors can be identified. Unfortunately, this precision often causes the analysis to produce state assertions that are not loop invariants, thus forcing the CEGAR algorithm to unwind loops of the program. If this happens, the refinement may diverge, e.g., because the loop of the analyzed program can be unwound infinitely often.

In this paper we propose a unification of both techniques, abstract interpretation and CEGAR-based software model checking, so that each can profit from the strengths of the other: we use abstract interpretation to find loop invariants, an interpolating SMT solver to analyze single paths, and we combine both in a CEGAR-based abstraction refinement loop. We implemented our approach in the tool Ultimate Taipan, which uses a CEGAR loop to iteratively refine an abstraction of an input program, given as a control flow graph, with the help of state assertions deduced from the analysis of path programs with abstract interpretation.

Fig. 1. Example program \(\mathcal {P}_{1}\) with its source code and its corresponding control flow graph (CFG). The location \(\ell _0\) of the CFG is the initial location, \(\ell _{7}\) is the error location.

1.1 Example

Consider the example program \(\mathcal {P}_{1}\) in Fig. 1 and its corresponding control flow graph. We are interested in proving that the error location of \(\mathcal {P}_{1}\) ’s control flow graph (see Fig. 1b), \(\ell _7\), is unreachable. A CEGAR-based approach to generate the proof by iteratively refining an abstraction of the program begins by picking a sequence of statements from the CFG, which starts in the initial location and ends in an error location. Next, an analysis decides whether the selected sequence of statements is executable or not, and if not, the abstraction is refined such that this particular sequence is no longer contained in it.

Assume that in the first iteration, the shortest sequence of statements \(\tau _1\) from the initial location \(\ell _0\) to the error location \(\ell _7\) is picked.

[Listing: the trace \(\tau _1\)]

This sequence of statements is not executable, because the first two statements contradict each other. A possible proof for this contradiction consists of the following sequence of state assertions.

[Listing: the sequence of state assertions for \(\tau _1\)]

This sequence of state assertions allows the CEGAR tool to refine its abstraction such that \(\tau _1\) is removed from it. In the next iteration, we assume that \(\tau _2\) is selected.

[Listing: the trace \(\tau _2\)]

Again, this sequence of statements is not executable: several of its statements contradict each other, and from this contradiction we can extract the following proof.

[Listing: the sequence of state assertions extracted from \(\tau _2\)]

We could continue in this fashion until we have constructed a proof for each unwinding of the while loop of \(\mathcal {P}_{1}\). However, we would prefer to obtain other proofs that contain state assertions that allow us to find a more general refinement of our abstraction, thus eliminating the need for multiple loop unwindings.

In our example, the state assertions are not general enough to prove the program’s correctness efficiently, although they were obtained using a state-of-the-art interpolating SMT solver. The reason is that, from a finite sequence of statements, SMT solvers in general do not infer a relation between the variables x and y that holds for any number of loop unwindings. If, for example, we obtained a relational state assertion over the variables x and y in the second iteration, the CEGAR tool would be able to refine its abstraction of the CFG in such a way that no sequence of statements remains that leads to the error location. Such an assertion is very useful because it is a loop invariant at location \(\ell _1\), and therefore sufficient to prove the correctness of our example.

Approaches based on static program analysis, such as abstract interpretation, can deduce loop invariants by computing a fixpoint for each program location. In our example, an abstract interpreter that uses a relational abstract domain, e.g., octagons [21], would suffice to find such an invariant. However, such an analysis of the whole program may not be able to find an invariant strong enough to prove the program to be correct.

Fig. 2. The two path programs computed for \(\mathcal {P}_{1}\).

Fig. 3. State assertions computed for both path programs of Fig. 2.

In this paper, we propose to improve the precision by not analyzing the whole program, but just a fragment of it. We can compute such a fragment by projecting the CFG of the program to the statements occurring in the selected sequence of statements. The resulting CFG is called a path program [8]. Figure 2a shows the path program computed from \(\mathcal {P}_{1}\) and the sequence of statements \(\tau _2\). We can now calculate the fixpoint of, e.g., an octagon-based abstraction for this path program, which then yields the state assertions shown in Fig. 3a. The state assertion at \(\ell _1\) contains a loop invariant strong enough to prove correctness of the path program in the second iteration of the CEGAR loop. The CEGAR tool removes the program fragment covered by this path program from its abstraction of the program, such that the resulting abstraction only contains sequences of statements that lead through the other branch of the if-statement inside the loop. Thus, in iteration 3, we obtain the path program depicted in Fig. 2b from one of those sequences. After computing the corresponding state assertions with abstract interpretation (see Fig. 3b), the CEGAR tool will remove this path program from the program abstraction, thus removing all sequences of statements that lead to the error location. In general, the chosen abstraction may not be precise enough to find a useful invariant. In this case, we fall back to a conventional analysis of a single trace such that the CEGAR loop is guaranteed to remove at least one error trace.

In the following, we present our approach that combines the analysis of single sequences of statements and the analysis of path programs in an automata-theoretic setting. We focus on obtaining loop invariants with abstract interpretation if possible, but can fall back on the analysis of single traces if the computed abstraction is too weak to prove infeasibility of a trace.

2 Preliminaries

In this section, we present our understanding of programs and their semantics, give a brief overview of abstract interpretation, and explain the trace abstraction algorithm that we use as the basis of our approach.

Programs and Traces. We consider a simple programming language whose statements are assignment, assume, and sequential composition. We use the syntax that is defined by the grammar

$$\begin{aligned} \texttt {s} \ := \ \texttt {assume~bexpr} \ \mid \ \texttt {x:=expr} \ \mid \ \texttt {s;s} \end{aligned}$$

where \( Var \) is a finite set of program variables, \(\texttt {x} \in Var \), \(\texttt {expr}\) is an expression over \( Var \) and \(\texttt {bexpr}\) is a Boolean expression over \( Var \). For brevity we use bexpr to denote the assume statement assume bexpr.

We represent a program over a given set of statements \( Stmt \) as a labeled graph \(\mathcal {P} = ( Loc , \delta , \ell _{0}) \) with a finite set of nodes \( Loc \) called locations, a set of edges labeled with statements, i.e., \(\delta \subseteq Loc \times Stmt \times Loc \), and a distinguished node \(\ell _{0} \in Loc \) which we call the initial location.
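To make the graph representation concrete, the following minimal sketch (in Python, not Ultimate's actual data structures) models a program as a set of locations, a set of labeled edges, and an initial location; statements are kept as plain strings.

```python
# A minimal sketch of a program P = (Loc, delta, l0); statements are plain
# strings and locations are labels, mirroring the definition above.
from dataclasses import dataclass

@dataclass(frozen=True)
class Program:
    locations: frozenset   # Loc
    edges: frozenset       # delta, a subset of Loc x Stmt x Loc
    initial: str           # l0

example = Program(
    locations=frozenset({"l0", "l1", "l2"}),
    edges=frozenset({("l0", "x := 0", "l1"),
                     ("l1", "assume x < 10", "l2"),
                     ("l2", "x := x + 1", "l1")}),
    initial="l0",
)
```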

We call a sequence of statements \(\tau =\texttt {s}_0\texttt {s}_1\texttt {s}_2\ldots \in Stmt ^{*}\) a trace of the program \(\mathcal {P} \) if \(\tau \) is the edge labeling of a path that starts at the initial location \(\ell _{0}\). We define the set of all program traces formally as follows.

$$\begin{aligned} T(\mathcal {P}) = \{\texttt {s}_0\texttt {s}_1\ldots \in Stmt ^{*} \mid \exists \ell _1,\ell _2,\ldots \bullet (\ell _i,\texttt {s}_{i},\ell _{i+1}) \in \delta \text {, for } i \ge 0 \} \end{aligned}$$

Note that each trace starts in the unique initial location \(\ell _{0}\), i.e., for the first transition of a trace we only require that \(\ell _{1}\) of \((\ell _0,\texttt {s}_{0},\ell _{1})\) exists.

Let \(\mathcal {D}\) be the set of values of the program’s variables. We denote a concrete program state \(c \) as a function \(c: Var \rightarrow \mathcal {D}\) that maps program variables to values. We use \(\mathcal {C} \) to denote the set of all concrete program states. Each statement s \(\in Stmt \) defines a binary relation \(\rho _{\texttt {s}}\) over concrete program states which we call the successor relation. Let Expr be the set of all expressions over the program variables \( Var \). We assume a given interpretation function \(\mathcal {I}: Expr \times ( Var \rightarrow \mathcal {D}) \rightarrow \mathcal {D}\) and define the relation \(\rho _{\texttt {s}} \subseteq \mathcal {C} \times \mathcal {C} \) inductively as follows:

$$\begin{aligned} \rho _{\texttt {s}} = {\left\{ \begin{array}{ll} \{ (c,c ') \mid \mathcal {I}(\texttt {bexpr}, c) = true \text { and }c =c ' \} &{} \text { if }{} \texttt {s} \equiv \texttt {assume~bexpr}\\ \{ (c,c ') \mid c ' = c [\texttt {x} \mapsto \mathcal {I}(\texttt {expr}, c) ] \} &{} \text { if }{} \texttt {s} \equiv \texttt {x:=expr}\\ \{(c,c ') \mid \exists c '' \bullet (c,c '') \in \rho _{\texttt {s}_1} \text { and } (c '',c ') \in \rho _{\texttt {s}_2} \} &{} \text { if }{} \texttt {s} \equiv \texttt {s}_1;\texttt {s}_2\\ \end{array}\right. } \end{aligned}$$

Given a trace \(\tau = \texttt {s}_0\texttt {s}_1\texttt {s}_2\ldots \), a sequence of concrete program states \(\pi = c _0c _1c _2\ldots \) is called a program execution of trace \(\tau \) if each successive pair of concrete program states is contained in the successor relation of the corresponding statement of the trace, i.e., \((c _{i},c _{i+1}) \in \rho _{\texttt {s}_i}\) for \(i\in \{0,1,\ldots \}\). We call a trace \(\tau \) infeasible if it does not have any program execution, otherwise we call \(\tau \) feasible.
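As an illustration of the successor relation and of (in)feasibility, the sketch below models concrete states as dictionaries and statements as functions from a state to the set of successor states; expressions and Boolean expressions are ordinary Python callables standing in for the interpretation function \(\mathcal {I}\).

```python
# Sketch of rho_s and of executing a trace on a concrete initial state.
def assume(bexpr):
    # rho_{assume bexpr} = {(c, c) | I(bexpr, c) = true}
    return lambda c: [c] if bexpr(c) else []

def assign(x, expr):
    # rho_{x := expr} = {(c, c[x -> I(expr, c)])}
    return lambda c: [{**c, x: expr(c)}]

def execute(trace, c0):
    """Return the final states of all executions of the trace starting in c0."""
    states = [c0]
    for s in trace:
        states = [c2 for c1 in states for c2 in s(c1)]
    return states

# The trace "x := 0; assume x > 0" is infeasible: it has no execution.
tau = [assign("x", lambda c: 0), assume(lambda c: c["x"] > 0)]
print(execute(tau, {"x": 5}))   # prints [] -> no execution, hence infeasible
```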

Path Programs [8]. Given the program \(\mathcal {P} = ( Loc , \delta , \ell _{0})\) over the set of statements \( Stmt \) and the trace \(\tau = \texttt {s}_0 \texttt {s}_1 \ldots \) of \(\mathcal {P}\), the path program \(\mathcal {P_{\tau }} = ( Loc _\tau , \delta _\tau , \ell _{0_\tau })\) is defined as follows.

  • The set of program locations \( Loc _\tau \) consists of the locations on the path of \(\mathcal {P}\) which is labeled by the trace \(\tau \), i.e., \( Loc _\tau = \{ \ell \in Loc \mid \exists \texttt {s}_i \in \tau , \ell ' \in Loc \; s.t.\; (\ell ,\texttt {s}_i,\ell ') \in \delta \vee (\ell ',\texttt {s}_i,\ell ) \in \delta \}\),

  • the transition relation \(\delta _\tau \) consists of the transitions that lie on the path, i.e., \(\delta _\tau = \{ (\ell ,\texttt {s}_i,\ell ') \mid \texttt {s}_i \in \tau \wedge (\ell ,\texttt {s}_i,\ell ') \in \delta \}\), and

  • the initial location \(\ell _{0_\tau }\) is the initial location of the path, i.e., \(\ell _{0_\tau } = \ell _0\).
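The projection is straightforward to implement. The following sketch applies the three conditions above to a program given as a set of labeled edges, with statements again represented as plain strings; it is an illustration, not Ultimate Taipan's code.

```python
# Sketch of the path program P_tau: keep exactly those edges of P whose
# statement occurs in the trace tau, together with the adjacent locations.
def path_program(edges, initial, tau):
    stmts = set(tau)
    delta_tau = {(l, s, l2) for (l, s, l2) in edges if s in stmts}
    loc_tau = ({l for (l, _, _) in delta_tau}
               | {l2 for (_, _, l2) in delta_tau} | {initial})
    return loc_tau, delta_tau, initial

edges = {("l0", "x := 0", "l1"), ("l1", "assume x < 10", "l2"),
         ("l2", "x := x + 1", "l1"), ("l1", "assume x >= 10", "l3")}
tau = ["x := 0", "assume x < 10", "x := x + 1"]
print(path_program(edges, "l0", tau))   # drops the edge leaving the loop
```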

Abstract Interpretation. Abstract interpretation [12] is a well-known static analysis technique that computes a fixpoint of abstract values of an input program’s variables for each program location. This fixpoint is an over-approximated abstraction of the program’s concrete behavior. To this end, abstract interpretation uses an abstract domain defining allowed abstract values of the program’s variables in the form of a complete lattice.

Formally, an abstract domain is defined as follows. Let \(L = (\sqsubseteq , \sqcup , \sqcap , \bot , \top )\) be a complete lattice, i.e. a partially ordered set with partial ordering relation \(\sqsubseteq \), such that for all \(a, b \in L\), a and b have a least upper bound \(a \sqcup b\) and a greatest lower bound \(a \sqcap b\), and every subset \(X \subseteq L\) has a least upper bound \(\sqcup X\) and a greatest lower bound \(\sqcap X\). The least element of L is \(\bot = \sqcup \emptyset \), and the greatest element of L is \(\top = \sqcup L\). For two lattices \(L_1 = (\sqsubseteq _1, \sqcup _1, \sqcap _1, \bot _1, \top _1)\) and \(L_2 = (\sqsubseteq _2, \sqcup _2, \sqcap _2, \bot _2, \top _2)\), \(\alpha : L_1 \rightarrow L_2\) is called an abstraction function and \(\gamma : L_2 \rightarrow L_1\) is called a concretization function if and only if \(\forall x \in L_1, \forall y \in L_2 \centerdot \alpha (x) \sqsubseteq _2 y \iff x \sqsubseteq _1 \gamma (y)\).

An abstract domain is a tuple \(A^\# = (\mathcal {D}^{\#}, \nabla , \alpha , \gamma )\), where \(\mathcal {D}^{\#}\) is a complete lattice representing the domain of possible abstract values, \(\nabla : \mathcal {D}^{\#} \times \mathcal {D}^{\#} \rightarrow \mathcal {D}^{\#}\) is a widening operator, and \(\alpha : \mathcal {D}\rightarrow \mathcal {D}^{\#}\) and \(\gamma : \mathcal {D}^{\#} \rightarrow \mathcal {D}\) are the abstraction function and the concretization function, respectively, which map concrete values of the complete lattice of values of a program’s variables \(\mathcal {D}\) to abstract values and vice versa. An abstract program state is a function \(\sigma : Var \rightarrow \mathcal {D}^{\#}\) which assigns each variable occurring in the program an abstract value. We use \( S \) to denote the set of all abstract program states of the program. Given a concrete state \(c \in \mathcal {C} \), we use \(\sigma = \alpha (c)\) to denote the application of \(\alpha \) to every value of every variable in \(c \) in order to obtain the corresponding abstract state \(\sigma \in S \).

The fixpoint computation algorithm traverses the input program and assigns to each location of the program an abstract state by iteratively applying an abstract transformer, \( post ^{\#}: S \times Stmt \rightarrow S \), starting at the initial location. This abstract transformer computes an abstract post state for a given abstract state and a statement, i.e., it computes the effect a statement has on a given abstract state. In case of branching in the program, the fixpoint computation algorithm may choose to either merge the states at the join point of the branches with the join operator \(\sqcup \) defined by the complete lattice of abstract values in the abstract domain, or to keep an arbitrary number of disjunctive abstract states. In the latter case, precision is increased at the cost of additional computations due to more abstract states in the abstraction.

The fixpoint computation algorithm is guaranteed to achieve progress and to eventually terminate. Progress is achieved by the application of the widening operator \(\nabla \), defined by the abstract domain: when the fixpoint computation algorithm traverses the statements of a loop, an infinite repetition of the application of the abstract transformer to the loop’s statement is avoided by widening the approximation of the loop’s body.

Upon termination, an over-approximated abstraction of the program is guaranteed to have been computed. The resulting abstraction is represented as a mapping \( fp : Loc \rightarrow 2^ S \), which maps each location to a disjunctive set of abstract states.
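The sketch below runs such a fixpoint computation for a one-variable program over a simple interval domain with a standard widening to infinity. It only illustrates the mechanics described above and is far simpler than the octagon and congruence domains used later.

```python
# Toy fixpoint computation with widening over a single-variable interval domain.
BOT, TOP = None, (float("-inf"), float("inf"))

def join(a, b):
    if a is BOT: return b
    if b is BOT: return a
    return (min(a[0], b[0]), max(a[1], b[1]))

def widen(old, new):                      # push unstable bounds to infinity
    if old is BOT: return new
    lo = old[0] if new[0] >= old[0] else float("-inf")
    hi = old[1] if new[1] <= old[1] else float("inf")
    return (lo, hi)

def leq(a, b):                            # a is subsumed by b
    return a is BOT or (b is not BOT and b[0] <= a[0] and a[1] <= b[1])

# CFG of "x := 0; while (x < 10) x := x + 1"; each edge carries its post#.
edges = [
    ("l0", lambda v: (0, 0), "l1"),                                      # x := 0
    ("l1", lambda v: BOT if v is BOT or v[0] >= 10
                     else (v[0], min(v[1], 9)), "l2"),                   # assume x < 10
    ("l2", lambda v: BOT if v is BOT else (v[0] + 1, v[1] + 1), "l1"),   # x := x + 1
]

fp = {"l0": TOP, "l1": BOT, "l2": BOT}
changed = True
while changed:
    changed = False
    for src, post, dst in edges:
        new = widen(fp[dst], join(fp[dst], post(fp[src])))
        if not leq(new, fp[dst]):
            fp[dst], changed = new, True

print(fp)   # fp["l1"] == (0, inf): a sound (if coarse) loop invariant x >= 0
```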

Trace Abstraction. The trace abstraction algorithm [18, 19] is a CEGAR-based software model checking approach that proves the correctness of a program \(\mathcal {P} \) by partitioning the set of possible error traces into feasible and infeasible traces. In the following, we briefly explain this approach. Consider the trace abstraction algorithm shown in Fig. 4. The input program \(\mathcal {P} \) over the set of statements \( Stmt \) is first translated into a program automaton \(\mathcal {A}_{\mathcal {P}} \), which encodes the correctness property of \(\mathcal {P} \) by marking some of its locations as error locations. Those error locations serve as the accepting states of the program automaton \(\mathcal {A}_{\mathcal {P}} \), and the set of statements \( Stmt \) serves as its alphabet. By construction, the language of the program automaton represents all traces of \(\mathcal {P} \) that reach an error location.

Fig. 4. The trace abstraction software model checking algorithm.

The goal of the algorithm is to iteratively construct a data automaton \(\mathcal {A}_{D} \) whose language only consists of infeasible traces. If the language of the data automaton contains the language of the program automaton, we know that all traces of the program that may reach an error location are infeasible, and thus the program is correct. Whenever the language of the data automaton does not contain the language of the program automaton, there exists a trace \(\tau \) for which we do not know if it is infeasible and which reaches an error location. If the trace \(\tau \) is feasible, it represents at least one valid program execution that can reach an error location. Hence, \(\tau \) is a valid counterexample. If the trace \(\tau \) is infeasible, a proof of infeasibility of \(\tau \) in the form of a set of Hoare triples \(\mathcal {H}\) is constructed. The algorithm then constructs from this set a new data automaton, whose language contains the language of the old data automaton and at least \(\tau \) as a new word. Adding only one trace per iteration is in most cases not sufficient, because, e.g., programs with loops contain infinitely many traces. Therefore, the trace abstraction algorithm constructs a data automaton not only from the trace \(\tau \), but from a set of Hoare triples \(\mathcal {H}\). Additionally, before constructing the data automaton, the algorithm tries to generalize the proof \(\mathcal {H}\) by adding more valid Hoare triples to it.
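The overall loop can be summarized as follows; all automaton and solver operations are passed in via `ops`, and the function names are placeholders rather than Ultimate Automizer's actual API.

```python
# High-level sketch of the trace abstraction loop of Fig. 4.
def trace_abstraction(program_automaton, ops):
    data_automaton = ops.empty_automaton()
    while True:
        remaining = ops.difference(program_automaton, data_automaton)
        if ops.is_empty(remaining):
            return "SAFE"                       # every error trace is infeasible
        tau = ops.pick_trace(remaining)         # an error trace not yet covered
        if ops.is_feasible(tau):
            return ("UNSAFE", tau)              # tau is a real counterexample
        H = ops.infeasibility_proof(tau)        # valid Hoare triples along tau
        H = ops.generalize(H, ops.htc)          # add further valid Hoare triples
        data_automaton = ops.union(data_automaton,
                                   ops.floyd_hoare_automaton(H))
```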

More formally, a data automaton is a Floyd-Hoare automaton [14, 19]. A Floyd-Hoare automaton \(\mathcal {A}_{} =(Q,\delta ,q_0,F)\) is an automaton over the alphabet of the program’s statements \( Stmt \) together with a mapping that assigns to each state \(q\in Q\) a state assertion \(\varphi _q\) such that the following holds:

  • The initial state is annotated by the state assertion \( true \),

  • for each transition \((q, \texttt {s}, q') \in \delta \) the triple \(\{\varphi _q\} \, \texttt {s} \, \{\varphi _{q'}\} \) is a valid Hoare triple, and

  • each accepting state \(q \in F\) is annotated by the state assertion \( false \).

The generalization of a proof \(\mathcal {H}\) is performed by a function \(\textsf {generalize}: \mathcal {H} \times \textit{htc} \rightarrow \mathcal {H} \) for a program \(\mathcal {P}\) over a set of statements \( Stmt \), where

  • \(\mathcal {H} \) is a set of valid Hoare triples \(\{\varphi _q\} \, \texttt {s} \, \{\varphi _{q'}\}\), and

  • \(\textit{htc}: \mathcal {H} \rightarrow \{\top , \bot , ?\}\) is a function that determines whether a given Hoare triple is valid, invalid, or unknown.

The set of all predicates contained in a set of Hoare triples \(\mathcal {H}\) is \( Pred (\mathcal {H}):= \{ \varphi \mid \{\psi \} \, \texttt {s} \, \{\varphi \} \in \mathcal {H} \; or \; \{\varphi \} \, \texttt {s} \, \{\psi \} \in \mathcal {H} \}\). The function \(\textsf {generalize}\) generalizes a proof \(\mathcal {H}\) as follows.

[Algorithm: the function \(\textsf {generalize}\)]
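Since the figure with the algorithm is not reproduced here, the following sketch shows one plausible reading of \(\textsf {generalize}\): every candidate Hoare triple over \( Pred (\mathcal {H})\) and the program's statements is checked with \(\textit{htc}\), and the valid ones are added to the proof.

```python
# Sketch of generalize: enlarge H with every candidate Hoare triple over the
# predicates Pred(H) that htc reports as valid (htc may also answer False or
# None for "unknown").  Hoare triples are modeled as (pre, stmt, post) tuples.
from itertools import product

def generalize(H, htc, statements):
    preds = {p for (pre, s, post) in H for p in (pre, post)}   # Pred(H)
    for pre, s, post in product(preds, statements, preds):
        if (pre, s, post) not in H and htc((pre, s, post)) is True:
            H = H | {(pre, s, post)}
    return H
```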

If the set of predicates \( Pred (\mathcal {H})\) of a proof \(\mathcal {H}\) contains the predicates true and false, we can construct a Floyd-Hoare automaton \(\mathcal {A}_{\mathcal {H}} = (Q,\delta , q_0, F)\) from it as follows.

  • The set of locations Q consists of one location for each predicate \(\varphi \in Pred (\mathcal {H}) \), i.e., \(Q = \{ q \mid \varphi _q \in Pred (\mathcal {H}) \}\),

  • the set of transitions \(\delta \) contains one transition for each Hoare triple in \(\mathcal {H}\), i.e., \(\delta = \{ (q,\texttt {s},q') \mid \{\varphi _q\} \, \texttt {s} \, \{\varphi _{q'}\} \in \mathcal {H} \}\),

  • the initial location \(q_0\) is the location labeled with true, and

  • the set of accepting states F contains only the location labeled with false.
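The construction can be written down almost literally. In the sketch below, predicates double as automaton states, and the strings "true" and "false" stand for the predicates true and false.

```python
# Sketch of the construction of A_H from a proof H (triples as (pre, s, post)).
def floyd_hoare_automaton(H):
    preds = {p for (pre, s, post) in H for p in (pre, post)}   # Pred(H)
    assert "true" in preds and "false" in preds
    Q = preds                               # one state per predicate
    delta = set(H)                          # one transition per Hoare triple
    q0, F = "true", {"false"}               # initial and accepting states
    return Q, delta, q0, F

H = {("true", "x := 0", "x == 0"),
     ("x == 0", "assume x != 0", "false")}
print(floyd_hoare_automaton(H))
```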

Fig. 5. The basic idea behind our software model checking approach.

3 Algorithm

In this section, we present our software model checking algorithm and the basic idea behind it. Our approach is centered around an automata-theoretic counterexample-guided trace partitioning approach. Figure 5 shows a simplified version of our algorithm that already integrates trace abstraction, abstract interpretation and path programs. Similar to trace abstraction, we translate the input program \(\mathcal {P} \) into a program automaton \(\mathcal {A}_{\mathcal {P}} \) and try to find a data automaton \(\mathcal {A}_{D}\) that represents only infeasible traces. As long as we have not covered all traces represented by \(\mathcal {A}_{\mathcal {P}}\), we continue to update \(\mathcal {A}_{D}\) and pick a trace \(\tau \) from the part of \(\mathcal {A}_{\mathcal {P}}\) that is not yet covered. Instead of directly analyzing the trace \(\tau \), we construct a path program \(\mathcal {P_{\tau }} \). Next, we use abstract interpretation to compute a fixpoint of \(\mathcal {P_{\tau }} \) and to determine whether \(\mathcal {P_{\tau }} \) is correct, i.e., whether the error location of \(\mathcal {P_{\tau }} \) is unreachable. Using the path program of \(\tau \) allows us to analyze multiple traces of \(\mathcal {P} \) at once. If abstract interpretation is able to show correctness of the path program \(\mathcal {P_{\tau }}\), we directly add the program (automaton) \(\mathcal {P_{\tau }}\) to the data automaton \(\mathcal {A}_{D} \), because we now know that all traces of \(\mathcal {P_{\tau }}\) are infeasible. In contrast to trace abstraction, this step does not create a Floyd-Hoare automaton, since \(\mathcal {P_{\tau }} \) is added to \(\mathcal {A}_{D} \) directly instead of first producing a proof and then constructing an automaton from it.

Since abstract interpretation is used to determine correctness of the constructed path programs, the result may also be “unknown”. In this case, our simple algorithm cannot deduce any information about the correctness of \(\mathcal {P_{\tau }}\), and thus none about the correctness of \(\mathcal {P} \) either. To avoid this problem, we extend our basic algorithm with the analysis of single traces as a fallback.

Fig. 6. Our complete software model checking algorithm. The differences to Fig. 4 are highlighted.

Figure 6 shows our full software model checking algorithm. Our algorithm retains the precision of software model checking but also utilizes abstract interpretation's ability to find loop invariants, thus preventing divergence due to loop unwinding. As it is based on trace abstraction, we extended Fig. 4 and highlighted all new parts. As with our simplified algorithm and with trace abstraction, we try to construct a data automaton that represents all traces that we know to be infeasible. We know that the program is safe when all traces of the program automaton are contained in the data automaton. Whenever we find a trace \(\tau \) of \(\mathcal {P}\) for which we do not yet know whether it is feasible or not, we construct a path program \(\mathcal {P_{\tau }} \) from it and analyze this instead. We first try to find an abstraction of \(\mathcal {P_{\tau }} \) with abstract interpretation. If the abstraction is not sufficiently precise to prove the correctness of the path program, we use an SMT solver to check the initial counterexample trace \(\tau \) for infeasibility. If \(\tau \) is infeasible, we also obtain a sequence of state assertions from the SMT solver – either by Craig interpolation or by using a combination of strongest post and unsatisfiable cores. These can be directly used to construct the proof \(\mathcal {H}\) and continue the algorithm. If \(\tau \) is feasible, we have found a valid counterexample and the original program \(\mathcal {P}\) is unsafe.
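In the same style as the sketch of the trace abstraction loop above, the complete algorithm can be outlined as follows; all operations are again injected via `ops`, and the names are placeholders rather than Ultimate Taipan's actual API.

```python
# Sketch of the combined algorithm of Fig. 6.
def taipan(program_automaton, ops):
    data_automaton = ops.empty_automaton()
    while True:
        remaining = ops.difference(program_automaton, data_automaton)
        if ops.is_empty(remaining):
            return "SAFE"
        tau = ops.pick_trace(remaining)
        pp = ops.path_program(tau)                      # project P onto tau
        fp = ops.abstract_interpretation(pp)            # None encodes "unknown"
        if fp is not None and ops.proves_safe(fp, pp):
            H = ops.proof_from_fixpoints(fp, pp)        # Sect. 3.1
            H = ops.weaken(H)                           # Sect. 3.2
            H = ops.generalize(H, ops.htc_abstract)     # htc# instead of SMT
        else:                                           # fall back to the trace
            if ops.is_feasible(tau):
                return ("UNSAFE", tau)                  # real counterexample
            H = ops.generalize(ops.interpolant_proof(tau), ops.htc_smt)
        data_automaton = ops.union(data_automaton,
                                   ops.floyd_hoare_automaton(H))
```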

In the case where the abstraction computed with abstract interpretation is sufficient to show that \(\mathcal {P_{\tau }} \) is correct, we have to provide trace abstraction's \(\textsf {generalize}\) function with a proof \(\mathcal {H}\) and a function \(\textit{htc}\) that can be used to check additional Hoare triples for validity. The next subsection describes how we initially obtain \(\mathcal {H}\) from the abstraction and how our \(\textit{htc}\) function works. Afterwards, we explain in Sect. 3.2 how we optimize the initial proof \(\mathcal {H}\) with the function \(\textsf {weaken}\) such that we get a more general proof that still contains all loop invariants for the path program.

3.1 Proofs from Fixpoints

After abstract interpretation proves the path program \(\mathcal {P_{\tau }}\) safe, we need to construct a proof of infeasibility \(\mathcal {H}\) from the computed fixpoints. For this construction we depend on some function \( p \) that converts an abstract state to a predicate without any loss of precision. Note that in general, it might not be possible to express the abstract states of a particular abstract domain as SMT-compatible predicates, e.g., because there might be no suitable theory available. Given a function \( p \), we construct the proof \(\mathcal {H}\) for a path program \(\mathcal {P_{\tau }} = (Loc, \delta , \ell _0)\) and an abstraction \( fp \) as follows.

First, we compute a state assertion \(\varphi _\ell \) for each location \(\ell \) of \(\mathcal {P_{\tau }}\) by taking the disjunction of all abstract states resulting from an application of \( fp \) on \(\ell \):

$$\begin{aligned} \varphi _\ell = \displaystyle \bigvee _{\sigma ^\ell _i \in fp (\ell )} p (\sigma ^\ell _i) \end{aligned}$$

Then, we construct Hoare triples by collecting all \(\varphi _\ell \) along the transition relation \(\delta \), i.e., \(\mathcal {H} = \left\{ \{\varphi _\ell \} \, \texttt {s} \, \{\varphi _{\ell '}\} \mid (\ell , \texttt {s}, \ell ') \in \delta \right\} \).
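Both steps fit into a few lines. The sketch below assumes that the conversion function \( p \) maps an abstract state to a formula string, and it builds one Hoare triple per edge of the path program; it is an illustration, not the tool's implementation.

```python
# Sketch of constructing the proof H from the fixpoint fp of a path program:
# one predicate per location (the disjunction of its converted abstract
# states, "false" if the set is empty) and one Hoare triple per edge.
def proof_from_fixpoints(delta_tau, fp, p):
    phi = {l: " || ".join(sorted(p(sigma) for sigma in fp[l])) or "false"
           for l in fp}
    return {(phi[l], s, phi[l2]) for (l, s, l2) in delta_tau}
```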

In the original trace abstraction algorithm, the next step is the generalization of the proof with \(\textsf {generalize}\) and \(\textit{htc}\). The function \(\textit{htc}\) is implemented by letting an SMT solver decide whether the Hoare triple is valid or not. Because the state assertions computed by abstract interpretation usually contain many conjuncts, SMT solver queries that involve them can be quite slow. Therefore, we use a different \(\textit{htc}\) function, namely \(\textit{htc}^{\#}\). As each state assertion \(\varphi _{\ell }\) in \( Pred (\mathcal {H})\) is constructed from a set of abstract states \( fp (\ell )\) at the location \(\ell \), we can use the abstract transformer \( post ^{\#}\) instead of an SMT query. For each candidate Hoare triple \(\{\varphi _{\ell }\} \, \texttt {s} \, \{\varphi _{\ell '}\} \) in \(\mathcal {H}\) we check whether each abstract successor state of a state in \( fp (\ell )\) is subsumed by one of the abstract states in \( fp (\ell ')\). If the following formula is valid for the given Hoare triple, the Hoare triple itself is valid.

$$\begin{aligned} \forall \sigma ^{\ell }_i \in fp (\ell ) \centerdot \exists \sigma ^{\ell '}_j \in fp (\ell ') \centerdot post ^{\#} (\sigma ^{\ell }_i,\texttt {s}) \sqsubseteq \sigma ^{\ell '}_j \end{aligned}$$
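The check can be phrased directly over the abstract states, as the following sketch shows; `post_sharp` and the domain's ordering `leq` are supplied by the abstract domain, so no SMT query is needed.

```python
# Sketch of htc#: the Hoare triple {phi_l} s {phi_l'} is reported valid if every
# abstract successor of a state at l is subsumed by some abstract state at l'.
def htc_abstract(fp_l, s, fp_l2, post_sharp, leq):
    return all(any(leq(post_sharp(sigma, s), sigma2) for sigma2 in fp_l2)
               for sigma in fp_l)
```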

The construction of a data automaton from a proof of infeasibility \(\mathcal {H}\) for a path program is then conducted as described in Sect. 2. Note that by construction, we retain the property of Floyd-Hoare automata that for each transition \((q, \texttt {s}, q')\) of the constructed automaton the triple \(\{\varphi _q\} \, \texttt {s} \, \{\varphi _{q'}\} \) is a valid Hoare triple. Therefore, the automaton accepts at least all the traces represented by the path program. Table 2 shows the locations and Table 1 the transition relation of the data automaton constructed in this fashion for the example path program from Sect. 1.1.

Table 1. Transition relation \(\delta \) with labeling for the generalized data automaton of the path program from Sect. 1.1.
Table 2. Locations of the generalized data automaton of the path program from Sect. 1.1. \(\ell _0\) is the initial location, \(\ell _4\) is an accepting state.

3.2 Weakening of State Assertions

A state assertion \(\varphi _{\ell }\), derived from a set of abstract states \( fp (\ell )\), might be very large, depending on the precision of the abstract domain. For example, let \( fp (\ell ) = \left\{ \left( x \in [0;10], y \in [-100;100]\right) \right\} \) be a set containing a single abstract state that stores interval values of variables. Then, the state assertion \(\varphi _{\ell } = x \ge 0 \wedge x \le 10 \wedge y \ge -100 \wedge y \le 100\) already contains four conjuncts for the two variables present in the abstract state. A more precise abstract domain, e.g., the relational domain based on octagons, not only stores the bounds of each value interval per variable, but also the relations between all pairs of variables. A corresponding state assertion thus grows quadratically with the number of variables. The size of the state assertions, i.e., the number of conjuncts, is crucial for the runtime of various operations. We already mentioned that using an SMT solver to check whether a Hoare triple is valid is too expensive because of the size of the state assertions. But the application of \( post ^{\#}\) also scales unfavorably with the number of variables in an abstract state; for example, calculating the closure of an octagon matrix requires cubic time in the number of variables.

The predicates (i.e., the pre- and postconditions of the Hoare triples) of the proof \(\mathcal {H}\) we obtained from abstract interpretation are still backed by abstract states. Hence we are interested in reducing the number of variables in those states and thus the number of conjuncts in the predicates as much as possible while still retaining the proof of correctness for the path program.

Our approach contains a simple method to achieve such a reduction, namely the function \(\textsf {weaken} \) that takes a set of Hoare triples and yields a set of Hoare triples. \(\textsf {weaken}\) uses a simple data-flow based analysis to remove variables and still retain the proof. It also exploits the fact that the proofs we obtain have a certain form. Because they correspond to traces, they can be represented as a joined sequence of Hoare triples

$$\begin{aligned} \mathcal {H} = \left\{ \varphi _0\right\} \texttt {s}_0 \left\{ \varphi _1\right\} \texttt {s}_1 \ldots \left\{ \varphi _n\right\} \texttt {s}_n \left\{ \varphi _{n+1}\right\} \end{aligned}$$

where \(\varphi _0 = true\) and \(\varphi _{n+1} = false\). Note that the proofs for path programs also have this form, because we can just use the trace from which the path program was constructed; we only have to account for the fact that some \(\varphi _i\) may be equal to a \(\varphi _j\) with \(i \ne j\). All loop invariants are preserved, because any single unrolling of the loop suffices to reason about the inductivity of a loop invariant. Assuming this form for \(\mathcal {H}\), the function \(\textsf {weaken}\) implements the following algorithm.

[Algorithm: the function \(\textsf {weaken}\); its lines 1, 4, 6, and 7 are referenced below]

The algorithm first initializes a set K of variables that should not be removed in line 1. Then, it iterates backwards over the set of Hoare triples \(\mathcal {H}\) that is in the form described above. The important aspects are the updates of the set of variables that should not be removed, K, in lines 4 and 7, and the computation of a new Hoare triple in line 6. In line 4, we remove all variables that are written by the current statement from the set of variables that should not be removed. The reasoning is that we do not need to keep information about variables that will change as a consequence of the execution of the current statement. In line 6, the algorithm transforms each Hoare triple by removing all conjuncts from the precondition that contain only variables that are neither read by the current statement nor in the set of variables that should not be removed. In line 7, we add all variables now occurring in the new precondition to the set of variables that should not be removed.

The resulting Hoare triple is still valid because the post-condition can only contain information about variables that still have to be kept or were written in the current statement.
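Under the reading described above, \(\textsf {weaken}\) can be sketched as a single backward pass. In the sketch, conjuncts are modeled as pairs of a variable set and a formula text, statements carry read and write sets, K starts out empty, and the line-number comments refer to the steps just discussed; this illustrates the data-flow idea rather than reproducing the exact algorithm of the figure.

```python
# Sketch of weaken over a chained proof {phi_0} s_0 {phi_1} ... s_n {phi_n+1}.
from collections import namedtuple

Stmt = namedtuple("Stmt", ["text", "reads", "writes"])

def weaken(triples):
    """triples: list of (pre_conjuncts, stmt, post_conjuncts) in trace order;
    each conjunct is a (variables, text) pair."""
    keep = set()                                    # line 1: K (here: empty set)
    post = triples[-1][2]                           # phi_{n+1}, i.e. false
    out = []
    for pre, stmt, _ in reversed(triples):
        keep -= stmt.writes                         # line 4: written vars drop out
        new_pre = [c for c in pre                   # line 6: keep a conjunct only if
                   if set(c[0]) & (stmt.reads | keep)]  # it mentions a read or kept var
        keep |= {v for c in new_pre for v in c[0]}  # line 7: protect what remains
        out.append((new_pre, stmt, post))
        post = new_pre                              # post of the preceding triple
    return list(reversed(out))

s = Stmt("y := y + x", reads={"x", "y"}, writes={"y"})
pre = [({"x"}, "x >= 0"), ({"z"}, "z == 7")]        # conjuncts of the precondition
print(weaken([(pre, s, [(set(), "false")])]))       # drops the z-conjunct
```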

4 Implementation and Evaluation

In this section, we present the implementation and evaluation of our approach.

We implemented our algorithm in Ultimate, an open-source program analysis framework. The resulting tool, Ultimate Taipan, is based on Ultimate Automizer, a state-of-the-art software model checker that implements trace abstraction (see Sect. 2). Ultimate Taipan extends Ultimate Automizer and Ultimate with its own fixpoint computation engine based on abstract interpretation, various abstract domains, and methods for the extraction and creation of path programs. Our engine currently supports sets of octagons [21] as relational abstractions, and sets of intervals and sets of divisibility congruences [16] as non-relational abstractions. It also allows for parameterized combinations of these domains. We currently do not support arrays, bitvectors, or floats.

Our implementation of Ultimate Taipan also uses two small optimizations:

  • We only compute a path program and use abstract interpretation when the trace \(\tau \) contains a loop. If the trace contains only straight-line code, the analysis with SMT solvers is usually faster and provides more general state assertions.

  • We cache the path programs we have already analyzed to prevent re-analyzing path programs for which the used abstract domain was too weak to provide a proof. This may happen if both abstract interpretation and the SMT-based analysis were unable to find a loop invariant for the path program and the algorithm unrolls a loop. In this case, instead of re-analyzing an already analyzed path program, the abstract interpretation module reports unknown and the algorithm continues.

In our evaluation, we compare Ultimate Automizer (Automizer) with two variants of Ultimate Taipan (Taipan and LazyTaipan). The three configurations differ only in the way they obtain state assertions from traces. In each iteration, they all try multiple methods to obtain state assertions that are loop invariants for the path program induced by the trace. These methods are applied one after another. If one method fails to provide loop invariants, the next method is used. If all methods fail to provide loop invariants, but some of them could show infeasibility, the abstraction is refined with all of the state assertions obtained. If no method could show infeasibility (or feasibility), the tools abort and return “Unknown” as the final result.

For all configurations, the first method is an application of the SMT solver SMTInterpol  [10] using Craig interpolation. This SMT solver is tightly integrated into Ultimate and can thus be called very efficiently.

Next, the three configurations use either the interpolation engine of the two SMT solvers Z3  [22] and CVC4  [4] on the trace, or our abstract interpretation engine on the path program induced by the trace. Only the order of these methods differs.

  • Automizer tries Z3 followed by CVC4,

  • Taipan tries abstract interpretation followed by Z3 and then CVC4, and

  • LazyTaipan tries Z3 followed by CVC4 and lastly abstract interpretation.

For our evaluation, we applied the three configurations to C programs taken from the SV-COMP  2017 [5] repository. Each of the verification tasks in the SV-COMP reachability category contains one error location, which is either reachable or unreachable. We concentrated on two subcategories, namely “ReachSafety-Loops” (Loops) and “ReachSafety-ECA” (ECA). We chose these categories because they represent control-intensive programs that do not contain arrays, floats, or bitvectors. One main difference between the two sets is their size: while Loops contains files with 8 to 1644 lines of code, ECA’s samples range from 591 to 185053 lines of code. Loops also contains programs with more intricate loop invariants requiring relations between variables, while ECA is very control-intensive with many branches in a single loop. We used all 159 examples from Loops and 200 random examples from ECA. All benchmarks were run on an Intel Core i7-2600 with 3.40 GHz using a timeout of 90 s and a memory limit of 4 GB for the tool itself and 2 GB for the SMT solver.

Table 3. The evaluation results. The complete benchmark set contained 359 samples (column “#”). Each cell in the column “Succ.” contains the number of samples this particular setting could solve. Each cell in the column “Excl.” shows how many samples were solved exclusively by this setting. The row “Portfolio” shows how many benchmarks could be solved by any of the settings, and the row “Common” shows how many benchmarks could be solved by all of them.

Table 3 shows the results of the evaluation. Out of the 359 input programs, the default trace abstraction variant Automizer was able to solve 124 programs, compared to 145 solved by LazyTaipan and 176 solved by Taipan. The clear advantage is visible in both benchmark sets. Nevertheless, the Ultimate Automizer setting can solve 17 samples exclusively, 12 of them in ECA. These examples are due to cases where the fixpoint calculation lost precision because of branching.

Fig. 7. Statistics collected during the execution of the benchmarks. All plots show the measured data on the y-axis and range over the samples on the x-axis. The order of the samples is sorted by the measurement value for each plot. This allows us to show trends but prevents the comparison of single samples. The upper-left chart “Runtime” compares the total runtime of the different settings. The upper-right chart “Overall iterations” compares the number of iterations, the middle-left chart “Iterations using abstract interpretation” compares the number of iterations where abstract interpretation was applied to path programs with loops, and the middle-right chart “Usable abstract interpretation iterations” shows the number of refinements where abstract interpretation computed a proof for the infeasibility of the path program. Finally, the lower-left and lower-right charts “Abstract interpretation time” and “Relative time” show the absolute amount of time and the time relative to the overall time that the fixpoint computation took.

Figure 7 shows several metrics collected during the evaluation. The upper-left chart shows the runtime in \(\log (s)\) for all individual benchmark programs, ordered by time. It shows that Taipan was not only able to solve the most samples, but also took a comparable amount of time.

The upper-right chart in Fig. 7 shows the number of refinements in the CEGAR loop. This number indicates how often a new data automaton was constructed during the refinement step. The graph shows that the number of iterations is very similar for all three configurations; they only differ in their offset. This corresponds to the importance of loop invariants: each benchmark requires a number of loop invariants that, if found, solve the benchmark. If one path program remains without a loop invariant, all three approaches diverge.

The middle-left chart of Fig. 7 shows the number of iterations in which a path program was constructed and analyzed with abstract interpretation. Depending on the setting, this was done early (Taipan) or late (LazyTaipan) during an iteration. Approximately 40 (resp. 60) examples could be solved using the interpolating SMT solvers alone, but for the remaining ones abstract interpretation was required to infer suitable state assertions. In many cases, only a single abstract interpretation iteration was necessary for solving the sample – all other parts of the proof could be provided by the SMT solvers.

The middle-right chart shows the number of iterations in which abstract interpretation could prove the infeasibility of the path program. Compared to the total number of abstract interpretation refinements, this was the case in roughly 70% of the benchmarks.

One observation from our experiments is that the analysis of a path program with abstract interpretation takes significantly more time than the analysis of a single trace with an SMT solver. In the current state, this may be an artifact of the implementation and we believe this can be improved further.

The results of our evaluation show that Taipan substantially outperforms, in particular, the Automizer variant without abstract interpretation. Delaying the more expensive analysis of path programs, as done in the LazyTaipan setting, is not as helpful as one might expect. One reason for this is that for difficult traces, the solvers may also take a long time and then only provide unsuitable state assertions.

5 Related Work

In their work on Craig Interpretation [1, 2], Albarghouthi et al. use a CEGAR-based approach with abstract interpretation to refine infeasible program traces. In contrast to our work, they use abstract interpretation to compute an initial abstraction of the whole program. Then, a trace to an error location is picked from the abstraction, instead of the original program, and analyzed using a bounded model checker. If the trace is infeasible, this results in a set of state assertions, which may be too precise, i.e., non-inductive, to be used to refine the initial abstraction. Abstract interpretation is used again, this time to weaken the found state assertions in an attempt to achieve inductivity before the last abstraction is refined and the next iteration begins. Because the analysis is done on an abstraction that depends on the fixpoint computed by abstract interpretation, many iterations may be needed in the worst case to identify infeasible program traces. The fact that we use abstract interpretation to compute fixpoints of path programs, which are fragments of the original program rather than an abstraction of it, allows us to circumvent the problem that an abstraction of the whole program might be too weak to prove the program correct. Additionally, we often eliminate the need to use expensive model checking techniques to refine the abstraction iteratively. Therefore, our generalization with abstract interpretation is more localized and more precise than an abstraction obtained by analyzing the whole program.

Beyer et al. use path programs in a CEGAR approach to compute invariants for the locations in the control flow graph of a program [8]. The refinement of the abstraction is done with a constraint-based invariant synthesis algorithm that computes an invariant map, assigning to the locations of the path program predicates that form invariants. These invariants exclude already-visited parts of the original program. This is done until a counterexample for the program's correctness has been found or the program has been proven correct. In contrast to our work, their approach uses an interpolant generator to generate the invariant mapping, whereas we use both an interpolant generator and a fixpoint computation engine to obtain suitable state assertions. In addition, their approach is only able to synthesize loop invariants by using invariant templates, i.e., parametric assertions over the program variables present at each location of the program. Although they propose to use other approaches to generate invariants, including abstract interpretation, they do not present a combination of those methods.

6 Conclusion

In this paper, we presented a CEGAR approach that benefits from the precision of trace abstraction and the scalability of abstract interpretation. We use an automata-theoretic approach to pick traces from a program automaton, which are then checked for infeasibility. If a trace is infeasible, we construct a path program and compute an abstraction of the path program by using abstract interpretation. With the help of this abstraction, we are guaranteed to obtain state assertions, in particular loop invariants, which help us to exclude a generalization of the found infeasible trace from the program. Because abstract interpretation may yield an abstraction that is not precise enough to synthesize usable loop invariants, we use the precise default trace abstraction approach as a fallback.

Our experiments show that, by using abstract interpretation to generate loop invariants of path programs, we are able to prove a substantially larger set of benchmark programs correct.