Abstract
We propose a new relational abstract domain for analysing programs with numeric and Boolean variables. The main idea is to represent an abstract state as a set of linear constraints over numeric variables, with every constraint being enabled by a formula over Boolean variables. This allows us, unlike in some existing approaches, to avoid duplicating linear constraints shared by multiple Boolean formulas. To perform domain operations, we adapt algorithms from constraint-only representation of convex polyhedra, most importantly Fourier-Motzkin elimination and projection-based convex hull. We made a prototype implementation of the new domain in our abstract interpreter for Horn clauses. Our initial experiments are, in our opinion, promising and show directions for future improvement.
A. Bakhirkin and D. Monniaux—Institute of Engineering Univ. Grenoble Alpes.
Access provided by CONRICYT-eBooks. Download conference paper PDF
Similar content being viewed by others
1 Introduction and Related Work
Static program analysis by abstract interpretation over-approximates the set of reachable states of a program by a set with a simple description, for instance, by attaching one interval to each program variable at every location in the program. Intervals however cannot express relationships between variables, so a richer approach is to attach to every location a set of valid linear inequalities, which geometrically is, a convex polyhedron [12, 17].
Convex polyhedra are already quite formidable objects to compute with efficiently, yet they are insufficient for expressing certain invariants, and what is often needed is a disjunction of convex polyhedra. For instance, the strongest invariant of the following loop: is \((n < 0) \vee (0 \le i \le n)\). Note how the disjunction arises from the partition of executions into those that execute the loop at least once and those that do not. Better analysis precision may often be achieved by partitioning executions according to an abstraction of the control flow [30], or by partitioning abstract states with respect to conditions extracted from the program [11], etc. Some analyses of programs operating over arrays and maps abstract properties over these objects onto disjunctive relations between the scalar variables of the program and the values in the array cells [25, 27]. For instance, a loop that fills an array can be proved correct using an invariant \(\forall k~ (0 \le i \le n) \wedge (0 \le k < i \mathrel {\rightarrow }a[k]=42)\), with a disjunction between the cases \(k < i\) (filled) and \(k \ge i\) (unfilled).Footnote 1
In all cases, the analysis needs to efficiently represent sets of convex polyhedra, possibly (but not necessarily [4]) tagged by elements of a finite set T (abstract traces, Boolean vectors, etc.). Earlier works proposed to represent an abstract element by an explicit map from T to convex polyhedra, either as an array of pairs \((T_i,P_i)\) where \(T_i \subseteq T\) and \(P_i\) are polyhedra, or as a decision tree or DAG with polyhedra at the leaves. Both approaches are implemented in Jeannet’s BddApron library [19].
One issue with this approach is that the possible number of abstract partitions is often exponential in some parameter (length of the recorded trace, number of Boolean variables) and thus every operation (post-condition, convex hull) is potentially repeated for each of the exponentially many polyhedra. At the same time, the polyhedra in different abstract partitions often share most of the constraints and only differ in few that are related to the partitioning criterion. Thus, it is tempting to store a set of polyhedra in some structure that does not require duplicating shared constraints, and to use symbolic algorithms that, as much as possible, avoid enumerating abstract partitions individually.
One approach to this is offered by different kinds of decision diagrams over linear constraints. A notable example is Linear Decision Diagrams (LDD) developed by Chaki, Gurfinkel, and Strichman [10]. An LDD is a DAG, where internal nodes are labelled with linear constraints, and the two leaves are \( true \) and \( false \), thus a path through an LDD corresponds to a convex polyhedron. Based on the LDD algorithms, the same authors later developed an abstract domain of boxes [15], that only allows to have a comparison of a variable with a constant in an interior node.
Theoretical Contribution. In this paper, we propose an alternative approach: to represent an abstract state as a set of implications \(\{ B_i \mathrel {\rightarrow }c_i \}_{i=0..k}\), where \(B_i\) are arbitrary Boolean formulas, and \(c_i\) are linear constraints. This way, an abstract element can still be seen as an implicit map from a partition of \(\mathbb {B}^m\) to convex polyhedra (similar to a BddApron element), but we do not have to duplicate storage and computations for constraints shared by multiple partitions. Another appeal of this approach is that some operations on constraint-only polyhedra can be naturally adapted to sets of implications of the form \(B_i\mathrel {\rightarrow }c_i\). The algorithms in this paper are based on Fourier-Motzkin elimination [28] and the reduction of convex hull to projection of Benoy, King, and Mesnard [7, 31]. Whether it is possible to also adapt the more recent algorithms based on parametric linear programming and raytracing by Maréchal, Monniaux, and Périn [21,22,23] is a question for future work.
The Boolean variables occurring in the formulas \(B_i\) may be program variables (from small enumerated types) but may also be observers, partitioning according to trace history or calling context. This solves one issue with untagged disjunctions of polyhedra: when applying the widening operator to \(\bigcup _i P_i\) and \(\bigcup _j Q_j\), how does one “match” the \(P_i\)’s and \(Q_j\)’s to perform conventional widening over polyhedra [4]? Similarly, for the “join” operation \((\bigcup _i P_i) \sqcup (\bigcup _j Q_j)\), does one simply concatenate the two unions while removing duplicates (thus creating longer and longer lists), or does one “match” some \(P_i\) and \(Q_j\) for convex hull, and if so under which criteria? In our case, widening and join are guided by the Boolean variables: the polyhedra associated to the same Boolean choice are matched together.
Experimental Evaluation. We made a prototype implementation of the proposed abstract domain in our abstract interpreter for Horn clauses [5, 6].
2 Notation
We consider programs with Boolean and rational variables: a concrete program state is a tuple \((\mathbf {b},\mathbf {x}) \in \mathbb {B}^m \times \mathbb {Q}^n\). We use bold lowercase symbols \(\mathbf {b} \in \mathbb {B}^m\) and \(\mathbf {x} \in \mathbb {Q}^n\) to denote valuations of Boolean and numeric variables respectively. We use lowercase Italic symbols b and x respectively to denote vectors of Boolean and rational variables. We refer to the j-th variable in x as \({x}_{(j)}\). We use other lowercase Italic symbols, e.g., a, d, to denote vector and scalar coefficients, and their meaning will be clear within their context.
Without loss of generality, we make a number of assumptions on the syntactic form of linear constraints. We assume that there exists the unique unsatisfiable linear constraint \( cfalse \), i.e. we will not distinguish logically equivalent, but syntactically different falsities: \(0 < 0\), \(1 \ge 2\), etc. We assume that every linear constraint \(c_i\) is written as a greater-or-equal constraint with integer coefficients, i.e. \(c_i = a_i x \succ d_i\), where \(a_i \in \mathbb {N}^n\), \(d_i \in \mathbb {N}\), and \(\succ \in \{ =, \ge , > \}\).
We sometimes write a Boolean or a linear constraint as B[b] or c[x] to emphasize that free variables in B and c come from vectors b and x respectively. We use the [ / ] notation to denote substitution. For example, \(B[{b}_{(j)}/ true ]\) denotes the result of substituting in B the variable \({b}_{(j)}\) with \( true \). As a shortcut, we write \(B[\mathbf {b}]\) to denote the result of substituting every free variable in B with its valuation given by \(\mathbf {b}\).
3 Abstract Domain of Boolean and Linear Constraints
We propose to represent an abstract state as a set of implications:
where \(B_i\) is a propositional formula over Boolean variables, and \(c_i\) is a linear constraint (equality or inequality) over numeric variables. We do not want \(B_i\) to be a partition of \(\mathbb {B}^m\). Our intention is to never duplicate linear constraints that are shared by multiple valuations of Boolean variables.
An abstract state S represents the set of concrete states:
Alternatively, one can see an abstract state as a function that maps every valuation of Boolean variables to a convex polyhedron that describes the possible values of numeric variables. This is captured by the partial concretization \(\gamma _\mathrm{b}\):
The notion of partial concretization is useful when we want to show that we correctly lift operations on sets of constraints (e.g., projection) from linear constraints to implications \(B_i\mathrel {\rightarrow }c_i\). We normally want a operation to commute with \(\gamma _\mathrm{b}\), i.e., \(\gamma _\mathrm{b}(f_\mathrm{lifted}(S))(\mathbf {b}) = f_\mathrm{original}(\gamma _\mathrm{b}(S)(\mathbf {b}))\), which would mean that there is no loss of precision on the Boolean level.
Without loss of generality, we assume that in every abstract state, the 0-th constraint has the form \(B_0 \mathrel {\rightarrow } cfalse \), and no other constraint has \( cfalse \) on the right-hand side. In particular, the empty polyhedron is represented by \(\bot = \{ true \mathrel {\rightarrow } cfalse \}\) and the universal polyhedron is represented by \(\top = \{ false \mathrel {\rightarrow } cfalse \}\).
Example 1
The abstract state where \({x}_{(0)}\) is always non-negative, and in addition, if \({b}_{(0)}\) holds, \({x}_{(0)}\) is not greater than 1 can be represented as
3.1 Elimination of a Rational Variable
In constraint-only representation, existential quantifier elimination (projection) is the main operation on polyhedra, and most other operations are expressed using projection.
We can naturally adapt Fourier-Motzkin elimination [14, 18, 28] to our abstract domain in the following way. Let an abstract element S be \(\{ B_i \mathrel {\rightarrow }c_i \}_{i=0..k}\) and let \({x}_{(j)}\) be the variable to eliminate. First, we split every equality where \({x}_{(j)}\) appears with nonzero coefficient into a pair of inequalities. Then, we partition the constraints into three sets:
-
1.
\(E_0\), where in the linear part \({x}_{(j)}\) appears with coefficient 0;
-
2.
\(E_+\), where in the linear part \({x}_{(j)}\) appears with a positive coefficient;
-
3.
\(E^-\), where in the linear part \({x}_{(j)}\) appears with a negative coefficient.
The constraints from \(E_0\) we keep as-is, and from every pair of constraints in \(E_+\) and \(E_-\), we produce a positive combination, in which \({x}_{(j)}\) has coefficient 0. The difference from the original Fourier-Motzkin algorithm is that when we combine two constraints, we conjoin their Boolean parts. This is summarized in Fig. 1.
Example 2
Let
and let us apply the Fourier-Motzkin-based elimination to the variable \({x}_{(0)}\). First, we partition the constraints into the three sets:
We keep the elements of \(E_0\) and combine the elements of \(E_+\) and \(E_-\), producing the set
In this case, we only need to eliminate the trivially valid constraint \( true \mathrel {\rightarrow }0 \ge 0\); in general Fourier-Motzkin elimination can produce constraints that are non-trivially redundant.
Lemma 1
For every abstract state S, rational variable \({x}_{(j)}\), and \(\mathbf {b}\in \mathbb {B}^m\),
Proof Idea. To prove Lemma 1, we can pick an arbitrary \(\mathbf {b} \in \mathbb {B}^m\) and show that the set of linear constraints \(\{ c_i \mid B_i \mathrel {\rightarrow }c_i \in eliminateR (j,S) \wedge B_i[\mathbf {b}] \}\) is the same as the set of constraints produced by applying standard Fourier-Motzkin elimination to \(\gamma _\mathrm{b}(S)(\mathbf {b})\).
To eliminate multiple rational variables, we apply \( eliminateR \) iteratively. The standard heuristic is to pick and eliminate in every step a variable that minimizes \(|E_+||E_-| - |E_+| - |E_-|\), which is the upper bound on the growth of the number of constraints.
Gaussian Elimination. When an abstract element contains an equality \( true \mathrel {\rightarrow }a x = d\), where \({a}_{(j)} \ne 0\), this equality can be used as a definition of the variable \({x}_{(j)}\). Then, to eliminate the \({x}_{(j)}\) from an abstract element, we can replace it with this definition in every remaining constraint, instead of performing Fourier-Motzkin elimination. This is useful for eliminating, e.g., temporary variables that an analysis may introduce when pre-processing the program.
Example 3
Let
and let us apply the Gaussian elimination to the variable \({x}_{(0)}\), using the equality \( true \mathrel {\rightarrow }{x}_{(0)} - {x}_{(1)} = 0\). That is, we replace \({x}_{(0)}\) with \({x}_{(1)}\) in the two remaining constraints, getting
This can be generalized to the case when the abstract element contains a subset of equalities \(\{ B_j \mathrel {\rightarrow }a_j x = d_j \}_{j=1..m} \subseteq S\), s.t. \(\bigvee _{j=1}^m B_j = true \), as shown in Fig. 2
3.2 Equivalent and Redundant Constraints
When working with constraint-only representation of polyhedra, one of the big challenges is eliminating redundant constraints. As shown above, every round of Fourier-Motzkin elimination creates a quadratic number of new constraints, an most of them are usually redundant. When eliminating multiple variables (notably, during join computation, see Sect. 3.3), redundant constraints have to be eliminated regularly, otherwise their number might grow in a double-exponential way (while McMullen’s upper bound theorem [24] implies that the number of non-redundant constraints cannot grow more than exponentially with the number of projected dimensions). In his work on constraint-only representation of polyhedra [13], A. Fouilhé argues for redundancy elimination after eliminating every variable. The conventional approach to redundancy elimination in a list of n constraints is to go over every constrain and use linear programming test whether it is redundant with respect to the \(n-1\) other ones (some other criteria [18, 20] cannot eliminate all redundancies. “Raytracing” [23] is a fast method to identify redundancies, but it degenerates into the conventional linear programming approach in the worst case). We adapt that approach to the Boolean setting: to check whether a constraint is redundant, we call an SMT solver. We also implement a number of less costly redundancy checks.
Pairwise Redundancy Checks. There is a number of reductions that can be implemented without necessarily calling an SMT solver.
First, we can combine constraints with identical linear part:
This is an important step that allows to not duplicate linear constraints; duplication would be amplified by Fourier-Motzkin elimination.
Second, we can eliminate a constraint if it is implied by another constraint:
This requires a procedure to efficiently check implication between Boolean formulas, which is available, e.g., if they are represented as BDDs. Implication between a pair of linear constraints is a straightforward syntactic check.
Pairwise reduction checks reduce the number of SMT calls, which are costly. This is especially important in lower dimensions and when few constraints are relational. In these cases, most of redundant constraints can be eliminated with pairwise checks.
SMT-Based Redundancy Check. Let \(S = \{ B_i \mathrel {\rightarrow }c_i \}_{i=0..k}\). Then the j-th constraint is redundant, if its negation is unsatisfiable with respect to the other constraints:
An SMT-based redundancy check is an expensive operation, but it has to be performed regularly to limit the growth of the number of constraints.
In general, for a given abstract state S, there may be no unique smallest set of non-redundant constraints. Currently, we implement a greedy strategy: we successively check every constraint and if it is redundant, immediately eliminate it, before checking other constraints. We can artificially make this procedure deterministic by ordering the constraints; in particular it is beneficial to first attempt to remove constraints with larger absolute values of coefficients, both for human-readable output and for performance of an SMT solver.
Example 4
Let
Let us remove redundant constraints from this system. First, we note that \(( true \mathrel {\rightarrow }{x}_{(0)} \ge 0) \mathrel {\rightarrow }({b}_{(0)} \mathrel {\rightarrow }{x}_{(0)} \ge -1)\), since \({b}_{(0)} \mathrel {\rightarrow } true \) and \({x}_{(0)} \ge 0 \mathrel {\rightarrow }{x}_{(0)} \ge -1\), thus the latter constraint is redundant. Second, we note that:
Thus, the remaining non-redundant constraints are:
3.3 Join
To perform join of two abstract elements, we adapt the projection-based convex hull computation of Benoy et al. [7, 31]. The original algorithm is based on the observation that every point in the convex hull is a convex combination of a pair of points from the original polyhedra. Figure 3 expresses this more formally and adapted to our setting. Given the two abstract elements \(S_1\) and \(S_2\), first we construct the set of constraints \(S_{12}\). The variables \(\lambda _1, \lambda _2\) are the scaling coefficients of the two points in \(S_1\) and \(S_2\) respectively, s.t. \(\lambda _1, \lambda _2 \ge 0\) and \(\lambda _1 + \lambda _2 = 1\); \(y_1\) an \(y_2\) are the vectors of coordinates of the two points, pre-multiplied by scaling coefficients, and thus should satisfy the pre-multiplied constraints of \(S_1\) and \(S_2\); finally x is the vector of coordinates of a point in the convex hull, and thus \(x = y_1 + y_2\). Eliminating \(y_1, y_2, \lambda _1\), and \(\lambda _2\) from \(S_{12}\) produces the closure of the convex hull. With some extra bookkeeping, it is then possible to express the resulting linear constraints in terms of the original constraints [13] and turn some closed constraints back into open (a positive combination a set of constraints, where at least one constraints is open, is also open).
Lemma 2
For every pair of abstract states \(S_1, S_2\) and every \(\mathbf {b} \in \mathbb {B}^m\),
Proof Idea. To prove Lemma 2, similarly to Lemma 1, we can pick an arbitrary \(\mathbf {b} \in \mathbb {B}^m\) and show that the set of constraints \(S_{12}\) in Fig. 3 is the same as the set of constraints generated by F. Benoy’s convex hull applied to \(\gamma _\mathrm{b}(S_1)(\mathbf {b})\) and \(\gamma _\mathrm{b}(S_2){\mathbf {b}}\).
Join of Elements with Disjoint Pure Boolean Constraints. Let \(S_1, S_2\) be a pair of abstract elements:
where \(\lnot B_0^1 \wedge \lnot B_0^2 = false \), i.e., their pure Boolean constraints are disjoint, and for a given valuation of Boolean variables \(\mathbf {b}\), at least one of the polyhedra \(\gamma _\mathrm{b}(S_1)(\mathbf {b})\), \(\gamma _\mathrm{b}(S_2)(\mathbf {b})\) is empty. In this case, \(S_1\) and \(S_2\) can be joined exactly and without computing the convex hull as follows:
As we later show, this optimization is important for efficient elimination of Boolean variables. Soundness can be shown by writing down the disjunction of logical formulas corresponding to \(S_1\) and \(S_2\), distributing the disjunction over the conjunctions and applying equivalences that follow from \(\lnot B_0^1 \wedge \lnot B_0^2 = false \). Let \(S_1, S_2\) be a pair of abstract elements:
where \(\lnot B_0^1 \wedge \lnot B_0^2 = false \). Let us observe the disjunction of their corresponding logical characterizations:
Conjoining the pure boolean constraints to numeric constraints
Distributing the disjunction
From \(\lnot B_0^1 \wedge \lnot B_0^2 = false \) it follows that \(B_0^1 \vee B_0^2 = true \), \(\lnot B_0^1 \mathrel {\rightarrow }B_0^2\), and \(\lnot B_0^2 \mathrel {\rightarrow }B_0^1\)
Which is the logical characterization of \( boolDisjointJoin (S_1, S_2)\).
3.4 Other Operations
Intersection with a Constraint. To intersect an abstract state S with a constraint \(B \mathrel {\rightarrow }c\), we add \(B \mathrel {\rightarrow }c\) to S. To intersect an abstract state S with a linear constraint c, we add \( true \mathrel {\rightarrow }c\) to S. To intersect an abstract state S with a Boolean constraint B, we add \(\lnot B \mathrel {\rightarrow } cfalse \) to S.
Linear Assignment. The general way to apply a linear assignment \({x}_{(j)} := ax + d\) is by renaming and elimination. We introduce a fresh variable \({x'}_{(j)}\) that denotes the value of \({x}_{(j)}\) after or before the assignment, relate it to \({x}_{(j)}\), eliminate \({x}_{(j)}\) and then rename \({x'}_{(j)}\) into \({x}_{(j)}\):
This applies to both invertible (where in \(ax+d\), \({x}_{(j)}\) has a non-zero coefficient) and non-invertible assignments. Invertible assignments (e.g. \(a := 2a+1\)) can also be implemented by substituting the inverted expressions (e.g. \(a \mapsto (a-1)/2\)) into the constraints [12, Sect. 4.2.2.1].
Elimination of a Boolean Variable. We use the equivalence \(\exists b \in \mathbb {B}.\,\varphi = \varphi [b/ true ] \vee \varphi [b/ false ]\) over-approximate logical disjunction with the join operation:
Example 5
Let
That is, when \({b}_{(0)}\) is \( true \), \({x}_{(0)} = {x}_{(1)} = 0\), and when \({b}_{(0)}\) is \( false \), \({x}_{(0)} = {x}_{(1)} = 0\). To eliminate the single Boolean variable \({b}_{(0)}\), we take the join of the two abstract elements:
One possible representation of the result is
Example 6
For an example of a join of two Boolean-disjoint abstract states, let us consider the abstract state
and let us eliminate the variable \({b}_{(0)}\) from it. Notice that this abstract state asserts that \({b}_{(0)} = {b}_{(1)}\), and thus we expect that the elimination will result in substituting \({b}_{(0)}\) with \({b}_{(1)}\) in every constraint. First, we compute
Then, we observe that these abstract states are Boolean-disjoint, since \(\lnot \lnot {b}_{(1)} \wedge \lnot {b}_{(1)} = false \), i.e., we can apply the specialized version on join and, as expected, get
This example demonstrates a common scenario when eliminating temporary Boolean variables. The eliminated variable may be introduced using an explicit equality, like in this example, or in some similar way that makes it so that restricting this variable to \( true \) and \( false \) respectively produces Boolean-disjoint elements. Having a specialized join operation for Boolean-disjoint abstract states is important when an analysis may transform the input program and introduce such variables.
Boolean Assignment. An assignment of the form \({b}_{(j)} := B\) we implement, similarly to the linear case, using renaming and elimination:
Linear to Boolean Assignment. In some cases, during the analysis we want to introduce an observer variable – a Boolean variable that the stores truth value of some linear constraint at some point of program execution. When c is an inequality (not an equality), the assignment \({b}_{(j)} := c\) is straightforward to implement, since the equivalence \(b \mathrel {\leftrightarrow }c\) can be represented as a pair of constraints: \(b \mathrel {\rightarrow }c, \lnot b \mathrel {\rightarrow }\lnot c\). That is,
and similarly for \( pre \). For an equality \(ax = d\), though, we cannot assign its truth value to a single Boolean variable. Instead, we have to use two Boolean variables to separately assign to them the truth values of \(ax \ge d\) and \(-ax \ge -d\).
Widening. Widening in convex polyhedra is based on the idea of keeping the constraints of the previous approximation that are also satisfied by the new approximation [3, 17]. In our setting, we want, for every linear constraint from the previous approximation, to find for which values of Boolean variables it is implied by the new approximation. To find for which values of Booleans an inequality c is implied by an abstract state S, we can conjoin \( true \mathrel {\rightarrow }\lnot c\) to S and then eliminate all the rational variables. This produces an abstract state of the form \(\{ B \mathrel {\rightarrow } cfalse \}\) which is interpreted as: when B holds, \(\lnot c\) is unsatisfiable in S and thus \(B \mathrel {\rightarrow }c\) is implied by S. Thus, assuming that every equality is first split into a pair of inequalities and that \(S_1 \sqsubseteq S_2\), we get:
Inclusion Test. To check for inclusion between abstract states, we currently use an SMT solver. Let \(S_1 = \{ B^1_i \mathrel {\rightarrow }c^1_i \}_{i=0..k_1}\) and \(S_2 = \{ B^2_j \mathrel {\rightarrow }c^2_j \}_{j=0..k_2}\). Then
Checking, whether an abstract state S is empty, i.e., whether \(S \sqsubseteq \bot \) also requires an SMT solver call.
3.5 Implementation Details
Representing Boolean Formulas. We currently propose to represent Boolean formulas with BDDs, the main reason being that BDDs allow to represent formulas in a canonical way and avoid unbounded syntactic growth, when formulas are repeatedly conjoined (during elimination) and disjoined (when combining constraints with coinciding linear part).
Constraints over Integer Variables. To achieve additional precision, we can rewrite linear constraints when every variable with a non-zero coefficient is integer. In this case, a strict inequality can be rewritten as non-strict:
For an inequality over integer variables, we can divide the coefficients of a constraint over integer variables by the GCD of the variable coefficients, rounding the free coefficient towards 0:
For an equality over integer variables, the free coefficient has to be divisible by the GCD of the variable coefficients, otherwise the equality is unsatisfiable.
4 Implementation and Experiments
We implemented the proposed abstract domain in our abstract interpreter for Horn clauses [5, 6]. Our tool can find models of systems of constrained Horn clauses [9] with predicates over numeric and Boolean variables. It is based on the technique of path focusing [26] and uses an SMT solver (Z3) to iterate over relevant disjuncts of the direct consequence relation. As the abstract domain, it supports BddApron [19] and now also the abstract domain that we propose in this paper. The tool is implemented in OCaml.
4.1 Example
Figure 4 shows an example of a kind of a program that we are interested in. Figure 4 is a typical result of instrumenting a program with Boolean observer variables that record which branches were taken during an execution. At every step, this program non-deterministically chooses whether to assume a constraint on a numeric variable \({x}_{(i)}\), and the choice is recorder in a Boolean variable \({b}_{(i)}\). At this point, we do not care how exactly this program was obtained, and we are interested in efficiently computing invariants in a way that allows to relate Boolean and numeric variables. Our original motivation though comes from using observer variables for trace partitioning in array-manipulating programs [27], where different branches correspond to different relations between array indices.
Figure 5 encodes the example program as a system of Horn clauses that can be processed by our tool. In this system, predicates \(P_0,\cdots ,P_3\) denote the invariants of the four program locations, and every Horn clause corresponds to one transition (in general, a clause may encode multiple sequences of statements). The smallest model of the system in Fig. 5 is the collecting semantics of the program in Fig. 4.
The original implementation of our tool used the BddApron abstract domain, and the invariant that it infers for the predicate \(P_3\) (the final program location) consists of 8 polyhedra, one for every valuation of Boolean variables:
In a larger program, such an invariant would be propagated further, with every post-condition computation begin essentially repeated for each of the eight polyhedra (i.e., exponentially many times in the number of Boolean variables).
The implementation of the domain that we propose in this paper allows to represent \(P_3\) in a much more compact form:
4.2 Experiments
We evaluate the performance of the implementation using two sets of programs. For both sets, we measure the total time it took to run on every program a single forward analysis with narrowing. We summarize the results in Table 1. Time figures were obtained on a PC with a Core i7-3630QM CPU and 8GB RAM.
SV-COMP Programs. For the first set of experiments, we selected a number of programs from “loop” and “recursive” categories of the Competition on Software Verification SV-COMP [1] and translated them into Horn clauses (the input language of our tool) with the tool SeaHorn [16] using two different Clang optimization levels -O3 and -O0 (SeaHorn operates on LLVM bytecode). This way we obtained 123 systems of Horn clauses. By default, SeaHorn uses a version of large block encoding [8] and produces programs with relatively few locations, but with complicated transition relations and a large number of temporary Boolean and numeric variables; even a simple C program can produce a good benchmark for the implementation of an abstract domain. In Appendix A, we show an example of a C program and the corresponding system of Horn clauses produced by SeaHorn. On SV-COMP programs, the implementation of the proposed domain is 2–10 times slower than BddApron; about 5 times slower on average.
Hand-Crafted Programs. For the second set, we selected 10 hand-crafted programs coming from different sources: array-manipulating programs encoded using array abstraction of Gonnord and Monniaux [27], other programs that use trace partitioning with observer variables, etc. With hand-crafted examples, we noticed that some of SMT queries that test constraint for redundancy cause the solver (Z3 4.5.0) to reach timeout, which we set at 10 seconds. This does not make the analysis unsound; a timeout of a redundancy check only causes the analysis to keep a redundant constraint in an abstract element. We have not yet found a workaround, and we display the hand-crafted programs in two rows: all programs (10) and programs that do not cause solver timeouts (8). On hand-crafted programs without solver timeout, the implementation of the proposed domain is 2–10 times slower than BddApron; about 7 times slower on average.
Conclusion. On average, the current implementation of the proposed abstract domain is about 5–7 times slower than BddApron. We find this result promising (given that this is our initial prototype implementation) and it shows directions for future improvement. In particular, much of the analysis time is spent in SMT solver calls in order to detect redundant constraints. These calls are costly, but have to be performed regularly. We are going to address the performance of eliminating redundant constraints in future work.
5 Conclusion and Future Work
In this paper, we propose a new relational abstract domain for analysing programs with numeric and Boolean variables. The main idea is to represent an abstract state as a set of linear constraints over numeric variables, with every constraint being enabled by a formula over Boolean variables. This allows, unlike in some existing approaches, avoiding the duplication of linear constraints shared by multiple Boolean formulas. Currently, we use the simple formulations of Fourier-Motzkin elimination [28] and projection-based convex hull [7, 31], and we rely on an SMT solver for redundancy elimination and inclusion checks (the counterpart of systematically using linear programming). Our experiments have shown that this is a worthy combination, which avoids some of the inefficiencies of earlier works.
The main direction for future work is to improve the performance of eliminating redundant constraints. There may be multiple ways to do this.
First, we may find additional heuristics that will reduce the number calls to a complete elimination procedure (that now calls an SMT solver). For example, Maréchal and Périn propose a fast incomplete procedure to detect non-redundant constraints based on raytracing [23], and there may be a way to adapt it (or a similar heuristic) to our setting.
Second, we may replace the SMT calls with a specialized procedure that combines LP-based and BDD-based reasoning. In particular, the observation is that a constraint is non-redundant, if it is non-redundant for at least one valuation of Boolean variables. While, an abstract element in the worst case describes exponentially many (in the number of constraints) convex polyhedra, there may be a way to not enumerate all of them during the redundancy check, at least in the average case.
Third, we may attempt to adapt to our setting the state-of-the art algorithms for constraint-only polyhedra, but this is not straightforward. For example, we cannot immediately adapt algorithms that require an interior point, such as those on parametric linear programming (for projection and convex hull) and ray-tracing (for redundancy elimination), by Maréchal, Monniaux, and Périn [21,22,23]: different Boolean assignments may have different interior points (in case of polyhedra with empty interior, we consider the interior relative to the affine span; but again this depends on the affine span). This is unfortunate, since much time is currently spent inside the SMT solver for checking for redundancy, and parametric linear programming is more efficient than Fourier-Motzkin elimination. A possible workaround, to be explored, is to partition the Boolean space according to affine span and point in the relative interior.
Regardless of the issues related to redundancy, presence of Boolean constraints often prevents us from using some standard approaches to representing polyhedra. In computations over convex polyhedra, one usually maintains, in addition to a system of inequalities, a system of linear inequalities that defines the affine span of the polyhedron. Given an ordering over the dimensions, this system of equalities may be echelonized and used to eliminate variables from the system of inequalities. The resulting system of equalities and non-redundant, normalized inequalities is canonicalFootnote 2. In our case, the affine span may depend on the Boolean part, thus it is impossible to canonicalize the inequalities uniformly with respect to the Booleans. We intend to investigate partitioning the Boolean space according to the affine span.
Notes
- 1.
\(\mathrel {\rightarrow }\) denotes logical implication.
- 2.
In the case of polyhedra with nonempty interior, an non-redundant system of normalized inequalities canonically describes a polyhedron: each inequality corresponds to a face. This is not true in the general case: both \(x \le y \wedge y \le x \wedge 0 \le x \wedge x \le 1\) and \(x \le y \wedge y \le x \wedge 0 \le x \wedge y \le 1\) are non-redundant systems defining the same polyhedron. Its affine span is defined by \(x=y\), then one can rewrite the inequalities using this equality and obtain \(x =y \wedge 0 \le y \wedge y \le 1\), which is canonical.
References
Competition on software verification (SV-COMP). http://sv-comp.sosy-lab.org/. Accessed Apr 2018
Proceedings of 9th International Conference on Formal Methods in Computer-Aided Design, FMCAD 2009, 15–18 November 2009, Austin, Texas, USA. IEEE (2009)
Bagnara, R., Hill, P.M., Ricci, E., Zaffanella, E.: Precise widening operators for convex polyhedra. Sci. Comput. Program. 58(1–2), 28–56 (2005)
Bagnara, R., Hill, P.M., Zaffanella, E.: Widening operators for powerset domains. STTT 9(3–4), 413–414 (2007)
Bakhirkin, A.: HCAI, a path focusing abstract interpreter for Horn clauses. https://gitlab.com/abakhirkin/hcai. Accessed Apr 2018
Bakhirkin, A., Monniaux, D.: Combining forward and backward abstract interpretation of Horn clauses. In: Ranzato [29], pp. 23–45
Benoy, F., King, A., Mesnard, F.: Computing convex hulls with a linear solver. TPLP 5(1–2), 259–271 (2005)
Beyer, D., Cimatti, A., Griggio, A., Keremoglu, M.E., Sebastiani, R.: Software model checking via large-block encoding. In: Proceedings of 9th International Conference on Formal Methods in Computer-Aided Design, FMCAD 2009, 15–18 November 2009, Austin, Texas, USA [2], pp. 25–32
Bjørner, N., Gurfinkel, A., McMillan, K., Rybalchenko, A.: Horn clause solvers for program verification. In: Beklemishev, L.D., Blass, A., Dershowitz, N., Finkbeiner, B., Schulte, W. (eds.) Fields of Logic and Computation II. LNCS, vol. 9300, pp. 24–51. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-23534-9_2
Chaki, S., Gurfinkel, A., Strichman, O.: Decision diagrams for linear arithmetic. In: Proceedings of 9th International Conference on Formal Methods in Computer-Aided Design, FMCAD 2009, 15–18 November 2009, Austin, Texas, USA [2], pp. 53–60
Chen, J., Cousot, P.: A binary decision tree abstract domain functor. In: Blazy, S., Jensen, T. (eds.) SAS 2015. LNCS, vol. 9291, pp. 36–53. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-662-48288-9_3
Cousot, P., Halbwachs, N.: Automatic discovery of linear restraints among variables of a program. In: Aho, A.V., Zilles, S.N., Szymanski, T.G. (eds.) Conference Record of the Fifth Annual ACM Symposium on Principles of Programming Languages, Tucson, Arizona, USA, January 1978, pp. 84–96. ACM Press (1978)
Fouilhé, A.: Revisiting the abstract domain of polyhedra : constraints-only representation and formal proof. (Le domaine abstrait des polyèdres revisité : représentation par contraintes et preuve formelle). Ph.D. thesis, Université Grenoble Alpes, France (2015)
Fourier, J.: Note, second extrait. Histoire de l’Académie pour 1824, p. xlvii, vol. 2, pp. 325–328. Gauthier-Villars, Paris (1890). http://gallica.bnf.fr/ark:/12148/bpt6k33707/f330
Gurfinkel, A., Chaki, S.: Boxes: a symbolic abstract domain of boxes. In: Cousot, R., Martel, M. (eds.) SAS 2010. LNCS, vol. 6337, pp. 287–303. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-15769-1_18
Gurfinkel, A., Kahsai, T., Komuravelli, A., Navas, J.A.: The seahorn verification framework. In: Kroening, D., Păsăreanu, C.S. (eds.) CAV 2015. LNCS, vol. 9206, pp. 343–361. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-21690-4_20
Halbwachs, N.: Détermination automatique de relations linéaires vérifiées par les variables d’un programme. Ph.D. thesis, Université Scientifique et Médicale de Grenoble & Institut National Polytechnique de Grenoble, March 1979. https://tel.archives-ouvertes.fr/tel-00288805
Imbert, J.: Fourier’s elimination: which to choose? In: PPCP, pp. 117–129 (1993)
Jeannet, B.: Bddapron. http://pop-art.inrialpes.fr/~bjeannet/bjeannet-forge/bddapron/. Accessed Apr 2018
Kohler, D.: Projections of convex polyhedral sets. Ph.D. thesis, University of California, Berkeley (1967)
Maréchal, A.: New Algorithmics for Polyhedral Calculus via Parametric Linear Programming. (Nouvelle Algorithmique pour le Calcul Polyédral via Programmation Linéaire Paramétrique). Ph.D. thesis, Université Grenoble Alpes, France (2017)
Maréchal, A., Monniaux, D., Périn, M.: Scalable minimizing-operators on polyhedra via parametric linear programming. In: Ranzato [29], pp. 212–231
Maréchal, A., Périn, M.: Efficient elimination of redundancies in polyhedra by raytracing. In: Bouajjani, A., Monniaux, D. (eds.) VMCAI 2017. LNCS, vol. 10145, pp. 367–385. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-52234-0_20
McMullen, P.: The maximum numbers of faces of a convex polytope. Mathematika 17, 179–184 (1970)
Monniaux, D., Alberti, F.: A Simple abstraction of arrays and maps by program translation. In: Blazy, S., Jensen, T. (eds.) SAS 2015. LNCS, vol. 9291, pp. 217–234. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-662-48288-9_13
Monniaux, D., Gonnord, L.: Using bounded model checking to focus fixpoint iterations. In: Yahav, E. (ed.) SAS 2011. LNCS, vol. 6887, pp. 369–385. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-23702-7_27
Monniaux, D., Gonnord, L.: Cell morphing: from array programs to array-free horn clauses. In: Rival, X. (ed.) SAS 2016. LNCS, vol. 9837, pp. 361–382. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-662-53413-7_18
Motzkin, T.S.: Beiträge zur Theorie der Linearen Ungleichungen. Ph.D. thesis, Universität Zürich (1936)
Ranzato, F. (ed.): Static Analysis - 24th International Symposium, SAS 2017,New York, NY, USA, August 30 - September 1, 2017, Proceedings, Lecture Notesin Computer Science, vol. 10422. Springer (2017)
Rival, X., Mauborgne, L.: The trace partitioning abstract domain. ACM Trans. Program. Lang. Syst. 29(5), 26 (2007)
Simon, A., King, A.: Exploiting sparsity in polyhedral analysis. In: Hankin, C., Siveroni, I. (eds.) SAS 2005. LNCS, vol. 3672, pp. 336–351. Springer, Heidelberg (2005). https://doi.org/10.1007/11547662_23
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Bakhirkin, A., Monniaux, D. (2018). Extending Constraint-Only Representation of Polyhedra with Boolean Constraints. In: Podelski, A. (eds) Static Analysis. SAS 2018. Lecture Notes in Computer Science(), vol 11002. Springer, Cham. https://doi.org/10.1007/978-3-319-99725-4_10
Download citation
DOI: https://doi.org/10.1007/978-3-319-99725-4_10
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-99724-7
Online ISBN: 978-3-319-99725-4
eBook Packages: Computer ScienceComputer Science (R0)