
1 Introduction

In abstract interpretation [2, 3], the key component is the abstract domain. Many domains have been developed to address the multiple challenges of program analysis. The most expressive one is the polyhedra abstract domain [4], but its analysis is time consuming. To deal with that, researchers in the field have put considerable effort into finding a good trade-off between expressiveness and efficiency, producing a family of domains known as the sub-polyhedra or weakly relational abstract domains [6, 8, 9, 11, 12]. The authors in [10] present an abstract domain based on support functions, noted \({{\mathbb P}^\sharp _\varDelta }\). This domain offers a good balance between expressiveness and computational time. The lattice of \({{\mathbb P}^\sharp _\varDelta }\) is close to the lattice of the template abstract domain [9], but its result is more accurate than the one obtained using the template analysis, because the precision of the polyhedral analysis is preserved by the \({{\mathbb P}^\sharp _\varDelta }\) analysis up to the choice of a finite set of directions \(\varDelta \). The larger the cardinality of \(\varDelta \), the higher the precision of the result.

The execution time of the \({{\mathbb P}^\sharp _\varDelta }\) analysis is linear in the cardinality of \(\varDelta \), so we can obtain a precise post fixed point using a large number of random directions. The problem is that taking a large number of directions means that the obtained polyhedron contains the same number of constraints, so its minimization is very time consuming: the minimization method first deletes the redundant constraints and then computes the intersection of the remaining ones. In this article, we present a new version of the polyhedral minimization method, called the k-minimization. This method is based on the max-plus pruning method [5].

2 Background

2.1 The Sub-polyhedral Abstract Domain Based on Support Functions

The sub-polyhedral abstract domain presented in [10] is based on support functions [7]. This domain is an abstraction of convex polyhedra over \({\mathbb R}^n\), where n is the space dimension. We denote by \({{\mathbb P}^\sharp _\varDelta }\) the abstract domain using support functions. Its lattice definition is close to that of the Template abstract domain [9]. \({{\mathbb P}^\sharp _\varDelta }\) is parametrized by a finite set of directions \(\varDelta = \lbrace d_1,\dots , d_l\rbrace \). The directions in \(\varDelta \) are uniformly distributed on the unit sphere, noted \(B^n\). The definition of \({{\mathbb P}^\sharp _\varDelta }\) is given as follows:

Let \(\varDelta \subseteq B^n\) be the set of directions. We define \({\mathbb P}^{\sharp }_{\varDelta }\) as the set of all functions from \(\varDelta \) to \({\mathbb R}_{\infty }\), i.e. \( {\mathbb P}^{\sharp }_{\varDelta } = \varDelta \rightarrow {\mathbb R}_{\infty }\). We denote \(\bot _\varDelta \) (resp. \(\top _\varDelta \)) the function such that \(\forall d\in \varDelta ,\ \bot _\varDelta (d)=-\infty \) (resp. \(\top _\varDelta (d)=+\infty \)).

For each \(\varOmega \in {{\mathbb P}^\sharp _\varDelta }\), we write \(\varOmega (d)\) the value of \(\varOmega \) in direction \(d\in \varDelta \). Intuitively, \(\varOmega \) is a support function with finite domain.
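To make this representation concrete, the following sketch (our own illustration, not code from [10]) stores an abstract element \(\varOmega \) as a NumPy array of support values, one per direction of \(\varDelta \), with \(\pm \infty \) encoding \(\top _\varDelta \) and \(\bot _\varDelta \); the choice of directions and of the example polytope is ours.

```python
import numpy as np

def uniform_directions(l):
    """l directions uniformly distributed on the unit sphere B^n (here n = 2, the unit circle)."""
    angles = np.linspace(0.0, 2.0 * np.pi, l, endpoint=False)
    return np.stack([np.cos(angles), np.sin(angles)], axis=1)  # shape (l, n)

Delta = uniform_directions(8)

# Abstract elements of P#_Delta: one support value per direction.
Omega_top = np.full(len(Delta), np.inf)     # top: unbounded in every direction
Omega_bot = np.full(len(Delta), -np.inf)    # bottom: the empty set

# Abstraction of a concrete polytope (the unit square), using the fact that the
# support function of a polytope is attained at one of its vertices.
square_vertices = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
Omega_square = np.max(square_vertices @ Delta.T, axis=0)   # Omega(d) = max_x <x, d>
print(Omega_square)
```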

The abstraction and concretization functions of \({{\mathbb P}^\sharp _\varDelta }\) are defined as follows:

Let \(\varDelta \subseteq B^n\) be the set of directions.

We define the concretization function \(\gamma _{\varDelta }:{{\mathbb P}^\sharp _\varDelta }\rightarrow {\mathbb P}\) by:

$$\begin{aligned} \forall \varOmega \in {{\mathbb P}^\sharp _\varDelta },\ \gamma _{\varDelta }(\varOmega ) = \bigcap _{d\in \varDelta } \lbrace x\in {\mathbb R}^n , < x,d> \le \varOmega (d) \rbrace \;. \end{aligned}$$

where \(< x,d>\) is the scalar product of x and the direction d.

The abstraction function \(\alpha _\varDelta :{\mathbb P}\rightarrow {{\mathbb P}^\sharp _\varDelta }\) is defined by:

$$\begin{aligned} \forall {\mathsf {P}}\in {\mathbb P}, \alpha _{\varDelta }({\mathsf {P}}) = \left\{ \begin{array}{l l} \bot &{} \quad \text{ if } {\mathsf {P}}= \emptyset \\ \top &{} \quad \text{ if } {\mathsf {P}}= {\mathbb R}^n\\ \lambda d.\ \delta _{{\mathsf {P}}}(d) &{} \quad \text{ otherwise }\\ \end{array} \right. \;. \end{aligned}$$

where \(\delta _{{\mathsf {P}}}(d)\) is the support function of the polyhedron \({\mathsf {P}}\) in the direction d, and \({\mathbb P}\) represents the polyhedra abstract domain.

Note that the concretization of an abstract element of \({{\mathbb P}^\sharp _\varDelta }\) is a polyhedron defined as the intersection of half-spaces, each one characterized by its normal vector \(d\in \varDelta \) and the coefficient \(\varOmega (d)\). The abstraction function, on the other hand, is the restriction of the support function of the polyhedron to the set of directions \(\varDelta \). The order structure of \({{\mathbb P}^\sharp _\varDelta }\) is defined using properties of support functions [7]. The static analysis of a program consists in computing the least fixed point of a monotone map; the most widely used method to do so is the Kleene algorithm. By combining the Kleene algorithm with the \({{\mathbb P}^\sharp _\varDelta }\) abstract domain, the resulting algorithm has a complexity polynomial in the number of iterations and linear in the number of directions in \(\varDelta \). In addition, its result is as accurate as possible: at each iterate we have \(\varOmega _i=\alpha _{\varDelta }({\mathsf {P}}_i)\), where \(\varOmega _i\) (resp. \({\mathsf {P}}_i\)) is the result of the \(i^{th}\) Kleene iteration using \({{\mathbb P}^\sharp _\varDelta }\) (resp. the polyhedra abstract domain). So \(\varOmega _{\infty }=\alpha _{\varDelta }({\mathsf {P}}_{\infty })\), where \(\varOmega _{\infty }\) is the fixed point obtained by the \({{\mathbb P}^\sharp _\varDelta }\) analysis and \({\mathsf {P}}_{\infty }\) is the one obtained using the polyhedra domain. That is why the \({{\mathbb P}^\sharp _\varDelta }\) analysis is more precise than the Template analysis [9]. In other terms, the \({{\mathbb P}^\sharp _\varDelta }\) analysis is carried out with the precision of the polyhedra domain, and the over-approximation happens only at the end, in the concretization function, whereas with the Template domain the whole analysis is done in a less expressive domain.
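As an illustration of these two functions (a minimal sketch under our own assumptions, not the implementation of [10]), for a polyhedron given in constraint form \(Ax \le b\) the value \(\delta _{{\mathsf {P}}}(d)\) is the optimum of the LP \(\max \langle x,d\rangle \) subject to \(Ax \le b\), so \(\alpha _\varDelta \) can be computed with one LP per direction, while \(\gamma _\varDelta \) simply returns the constraints \(\langle x,d\rangle \le \varOmega (d)\). The function names `alpha` and `gamma` and the use of SciPy are our choices.

```python
import numpy as np
from scipy.optimize import linprog

def alpha(A, b, Delta):
    """alpha_Delta(P): restrict the support function of P = {x : Ax <= b} to Delta.
    delta_P(d) = max <x, d> s.t. Ax <= b, computed with one LP per direction."""
    Omega = np.empty(len(Delta))
    for k, d in enumerate(Delta):
        # linprog minimizes, so maximize <x, d> as minimize <x, -d>; variables are free.
        res = linprog(-d, A_ub=A, b_ub=b, bounds=[(None, None)] * A.shape[1])
        if res.status == 3:            # unbounded in direction d
            Omega[k] = np.inf
        elif res.status == 2:          # infeasible: P is empty, alpha(P) is bottom
            return np.full(len(Delta), -np.inf)
        else:
            Omega[k] = -res.fun
    return Omega

def gamma(Omega, Delta):
    """gamma_Delta(Omega): the polyhedron {x : <x, d> <= Omega(d), d in Delta},
    returned in constraint form (A, b), dropping the trivial +inf constraints."""
    keep = np.isfinite(Omega)
    return Delta[keep], Omega[keep]

# Example: abstract the triangle x >= 0, y >= 0, x + y <= 1 on four axis-aligned directions.
A = np.array([[-1.0, 0.0], [0.0, -1.0], [1.0, 1.0]])
b = np.array([0.0, 0.0, 1.0])
Delta = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
print(alpha(A, b, Delta))   # approximately [1., 1., 0., 0.]
```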

2.2 The Max Plus Pruning Method

In [5], the authors present a method to reduce the curse of dimensionality in solving an optimal control problem. In this method, the value function is over-approximated using a set of max-plus basis functions. The general formulation of the pruning problem appearing in max-plus basis methods is the following: let \(F = \lbrace 1, 2, \dots , m \rbrace \) be a set of integers, and let \(g (x): {\mathbb R}^n \mapsto {\mathbb R}\) be the function defined as follows:

$$\begin{aligned} g (x) = \sup _{i \in F} g_i (x) \end{aligned}$$

where \(\forall i\in [1,m], g_i (x)\) is a basis function. Let \( B =\lbrace g_1, \dots , g_m \rbrace \) be the set of these m basis functions. Note that when solving an optimal control problem the cardinality of B can be very large. The idea is to approximate g(x) by keeping only \( 0 \le k \le m\) basis functions from the set B. For that, the authors in [5] compute a set \( S\subset F\) of cardinality k and then approximate the function g by:

$$\begin{aligned} g (x) \simeq \sup _{i \in S} g_i (x) \end{aligned}$$

The obtained set S should minimize the approximation error. This is known as a pruning problem. To solve it, a set of witness points W is used to measure the approximation error, such that \(W = \lbrace x_1, \dots , x_m \rbrace \subset {\mathbb R}^n\), where n is the space dimension. This set is constructed using random points in the space. Afterwards, \(\forall x_i \in W, \forall g_j \in B\), the importance metric is computed; it represents the distance between \(g(x_i)\) and \(g_j(x_i)\) and is denoted by \(c_{ij}\), such that:

$$\begin{aligned} c_{ij} = g (x_i) - g_j (x_i) \end{aligned}$$

The obtained values form the cost matrix \(C \in {\mathbb R}^{m \times m}\). For a better comprehension, let us take the example of Fig. 1. In this example, we want to approximate the smooth convex function g (the red graph) using the basis functions \(g_0, g_1, g_2\) (the green lines). The point \(x_0\) is one witness point; it is used to compute the distance between the function g and all basis functions. The distance between \(g(x_0)\) and \(g_0(x_0)\) is represented by the red dashed line. In this example, if we want to keep only two of the three basis functions, the pair (\(g_0\),\(g_2\)) is the best choice.

Fig. 1. In this figure, we represent the function g (red graph) and some basis functions (green lines). The dashed red line is the distance between g and \(g_0\) using the point \(x_0\). (Color figure online)

Afterwards, the cost matrix C is used to compute the set S with \(\vert S \vert = k\), by applying one of the following two methods:

  • the k-median problem [1]: minimize the sum of the losses, such that:

    $$\begin{aligned} \min _{S\subset F,\ \vert S \vert = k} \sum _{i=1}^{m} \min _{j\in S} c_{ij} \end{aligned}$$
  • the k-center problem: minimize the maximum loss, such that:

    $$\begin{aligned} \min _{S\subset F,\ \vert S \vert = k} \max _{i\in [1,m]} \min _{j\in S} c_{ij} \end{aligned}$$

The obtained set S is used to approximate the function g as follows:

$$\begin{aligned} g (x) \simeq \sup _{i \in S} g_i (x) \end{aligned}$$

with \(g_i \in B\).
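As a toy illustration of this pruning scheme (our own sketch, not the implementation of [5]): we take a few affine basis functions and random witness points, build the cost matrix \(c_{ij} = g(x_i) - g_j(x_i)\), and, since m is tiny here, select S by brute force over all subsets of size k using the k-median objective. In practice a dedicated k-median or k-center solver would be used; the names `g_all` and `kmedian_cost` are ours.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)

# m affine basis functions g_i(x) = <p_i, x> + q_i over R^n (an illustrative choice).
n, m, k = 2, 6, 3
P = rng.normal(size=(m, n))
q = rng.normal(size=m)

def g_all(x):
    """Values of all basis functions at x, so that g(x) = max_i g_i(x)."""
    return P @ x + q

# Witness points: random points of R^n, as in the formulation above.
W = rng.normal(size=(m, n))

# Importance metric c_ij = g(x_i) - g_j(x_i): how far basis j falls below the envelope at x_i.
G = np.array([g_all(x) for x in W])      # G[i, j] = g_j(x_i)
C = G.max(axis=1, keepdims=True) - G     # C[i, j] >= 0

def kmedian_cost(S):
    """k-median objective sum_i min_{j in S} c_ij for a candidate subset S."""
    return C[:, list(S)].min(axis=1).sum()

# Brute-force search over all subsets of size k (only practical for tiny m).
best_S = min(itertools.combinations(range(m), k), key=kmedian_cost)
print(best_S, kmedian_cost(best_S))
```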

3 The k-Minimization Method

In this section, we develop our main contribution: a novel approach that reduces the complexity of the polyhedron construction by reducing the number of constraints. This method, called the k-minimization, is inspired by the max-plus pruning method [5].

Let \({\mathsf {P}}^m\) be a polyhedron represented by the intersection of \(m\in {\mathbb N}\) half-spaces, \({i.e.}{}\) \({\mathsf {P}}^m = \bigcap _{i=1}^m H_i\) where \(H_i = \lbrace x\in {\mathbb R}^n: \langle x, a_i \rangle \le b_i \rbrace \) with \(a_i\in {\mathbb R}^n\) and \(b_i\in {\mathbb R}\). For a very large m, the computation of \({\mathsf {P}}^m\) is time consuming. To improve this computation, we over-approximate \({\mathsf {P}}^m\) by keeping only \(k<m\) of its half-spaces. Let \({\mathsf {P}}^k = \bigcap _{i=1}^k H_i\) be the resulting polyhedron, such that:

  • \({\mathsf {P}}^k \supseteq {\mathsf {P}}^m\).

  • \({\mathsf {P}}^k\) is the best approximation of \({\mathsf {P}}^m\) using k half-spaces. Note that the definition of the best approximation strongly depends on the definition of the Hausdorff distance between two polyhedra.

For a better comprehension, let us explain the motivation using the example of Fig. 2. In this figure, the left polyhedron \({\mathsf {P}}\) is defined as the intersection of seven half-spaces. Each half-space is represented by one direction (given in red at the bottom of the left figure). We want to over-approximate \({\mathsf {P}}\) by keeping only 6 of the 7 half-spaces. The resulting polyhedron \({\mathsf {P}}'\) is given in the right figure, where the half-space to delete is the blue dashed one. Note that \({\mathsf {P}}\subseteq {\mathsf {P}}'\), and \({\mathsf {P}}'\) is the best approximation of \({\mathsf {P}}\) using only 6 of the initial half-spaces. This approximation is known as the pruning problem; to solve it automatically, we propose a method called the k-minimization method.

The k-minimization method is essentially based on three steps:

  1. The computation of the witness points.

  2. The computation of the cost matrix.

  3. The application of the k-median algorithm.

Let us detail these steps:

Fig. 2. (The left part) The intersection of the red dashed half-spaces defines the polyhedron \({\mathsf {P}}\). The red directions at the bottom represent the normal vectors of the lines that support these half-spaces. (The right part) The polyhedron \({\mathsf {P}}'\) is an over-approximation of \({\mathsf {P}}\), obtained by deleting the dashed blue half-space. (Color figure online)

The witness points computation: Let \(w\subseteq {\mathbb R}^n\) be a set of points, where each point of w belongs to one face of \({\mathsf {P}}^m\). So the cardinality of w is equal to m, \({i.e.}{}\) \(\vert w \vert = m\). In the following, each half-space \(H_i\) will be characterized by its corresponding point \(x_i\) in the set w; these points are called witness points. Note that the computation of these points is a convex optimization problem; to solve it, we solve m LP problems. For all \(i\in \lbrace 1,\dots , m \rbrace \):

  • Solve the following LP problem using the interior point algorithm:

    $$\begin{aligned} \min \langle x,a_i \rangle - b_i \end{aligned}$$
    $$\begin{aligned} {s.t.}:\forall j\in \lbrace 1,\dots , m \rbrace \setminus \lbrace i \rbrace , \langle x,a_j \rangle \le b_j \end{aligned}$$
  • Add the obtained point to the witness point set w (a code sketch of these two steps is given below).
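The sketch below transcribes this step for \({\mathsf {P}}^m = \lbrace x : Ax \le b \rbrace \), solving the LP stated above with SciPy ("highs-ipm" is used as an interior-point variant); it assumes every relaxed LP is bounded, otherwise the solver reports a failure. The function name `witness_points` and the toy example polyhedron are ours.

```python
import numpy as np
from scipy.optimize import linprog

def witness_points(A, b):
    """One witness point per half-space of P^m = {x : Ax <= b}.
    For each i, solve the LP stated above:
        min <x, a_i> - b_i   s.t.   <x, a_j> <= b_j  for all j != i
    (the constant -b_i does not change the minimizer, so it is dropped)."""
    m, n = A.shape
    w = np.empty((m, n))
    for i in range(m):
        mask = np.arange(m) != i
        res = linprog(A[i], A_ub=A[mask], b_ub=b[mask],
                      bounds=[(None, None)] * n, method="highs-ipm")
        if not res.success:
            raise RuntimeError(f"LP {i} failed (unbounded or infeasible): {res.message}")
        w[i] = res.x
    return w

# Example: the unit box 0 <= x <= 1, 0 <= y <= 1 (our own toy polyhedron).
A = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
b = np.array([1.0, 0.0, 1.0, 0.0])
print(witness_points(A, b))
```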

The cost matrix computation. Let \(C \in {\mathbb R}^{m \times m}\) be a square matrix. We have that \(\forall i,j \in [1,m]\), \(C_{ij}\) represents the Euclidean distance between \(x_i\), the \(i^{th}\) point in the set w, and \(proj(x_i,L_j)\), the orthogonal projection of \(x_i\) on the plane \(L_j\) that supports the \(j^{th}\) face of \({\mathsf {P}}^m\). This distance is obtained as follows:

$$\begin{aligned} \begin{array}{ccc} C_{ij} &{} = &{} \Vert x_i - proj(x_i,L_{j}) \Vert \\ &{} = &{} \frac{\vert \langle x_i, a_j \rangle - b_j\vert }{\Vert a_j \Vert }. \end{array} \end{aligned}$$

So, each row i of the matrix C contains the distances between \(x_i\) and all the planes that support the faces of \({\mathsf {P}}^m\). The matrix C is called the cost matrix.
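In the same (A, b) representation, this step is a direct vectorized transcription of the formula above (a sketch; the function name `cost_matrix` is ours):

```python
import numpy as np

def cost_matrix(w, A, b):
    """C[i, j] = |<x_i, a_j> - b_j| / ||a_j||: the Euclidean distance from the witness
    point x_i (row i of w) to the plane L_j = {x : <x, a_j> = b_j}."""
    norms = np.linalg.norm(A, axis=1)      # ||a_j||, assumed non-zero
    return np.abs(w @ A.T - b) / norms     # shape (m, m)
```

Combined with the previous sketch, `cost_matrix(witness_points(A, b), A, b)` yields the full cost matrix.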

The application of the k-median algorithm. To choose k half-spaces from the m initial ones while minimizing the approximation error, we propose the use of the k-median algorithm [1]. The k-median problem is one of the most studied clustering problems: given n points in a metric space, the aim is to identify \(k<n\) of them that minimize the sum of the distances from each point to its nearest selected point.

In our problem, we want to select k witness points (equivalently, k half-spaces) such that the sum, over all witness points, of the distance to the nearest kept hyperplane is minimized. For that, we use the cost matrix C to formalize the k-median problem as follows:

$$\begin{aligned} \min _{S\subset F ,\vert S \vert = k} \sum _{i=1}^m \min _{j\in S} C_{ij}. \end{aligned}$$

with \(F = \{ 1, 2, \dots , m \}\) the set of the witness point indices in w.

Several algorithms are known to solve this problem; they return the set S of indices of the witness points in w that minimizes the sum of distances. Recall that each witness point in w represents one half-space used to define \({\mathsf {P}}^m\). So, the resulting polyhedron \({\mathsf {P}}^k\) is defined as follows:

$$\begin{aligned} {\mathsf {P}}^k = \bigcap _{i\in S} H_i . \end{aligned}$$

Thus, we have that \({\mathsf {P}}^m \subseteq {\mathsf {P}}^k\).
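The text leaves the choice of the k-median solver open; as one possible concrete instantiation (a heuristic sketch, not the authors' algorithm), the following greedy procedure builds S column by column from the cost matrix C and then keeps the corresponding half-spaces. The names `kmedian_greedy` and `prune_polyhedron` are ours.

```python
import numpy as np

def kmedian_greedy(C, k):
    """Greedy heuristic for  min_{|S| = k} sum_i min_{j in S} C[i, j]:
    repeatedly add the column that most decreases the objective.
    (Exact k-median is NP-hard in general; this only gives an approximation.)"""
    m = C.shape[0]
    S = []
    best = np.full(m, np.inf)              # current min_{j in S} C[i, j] for each row i
    for _ in range(k):
        scores = [np.minimum(best, C[:, j]).sum() if j not in S else np.inf
                  for j in range(m)]
        j_star = int(np.argmin(scores))
        S.append(j_star)
        best = np.minimum(best, C[:, j_star])
    return sorted(S)

def prune_polyhedron(A, b, C, k):
    """P^k = intersection of the k selected half-spaces, in constraint form."""
    S = kmedian_greedy(C, k)
    return A[S], b[S]
```

Any off-the-shelf k-median solver (for instance one based on local search or integer programming) can replace the greedy step.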

To summarize, the k-minimization algorithm is given in Algorithm 1.

Algorithm 1. The k-minimization algorithm

In Algorithm 1, the set of witness points, noted w, is computed using the function witnessPoint. This function solves m LP problems. Then, the Euclidean distance is computed between all the points of w and their orthogonal projections on L, where L is the set of planes that support the faces of the polyhedron \({\mathsf {P}}^m\). These distances are computed using the function Distance and the results are stored in the matrix C. Afterwards, the k-median algorithm is applied using the matrix C.
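The three steps described above can also be strung together into a single routine, as in the consolidated sketch below (ours, under the same assumptions as the previous sketches: bounded relaxed LPs and a greedy k-median step; it is an illustration, not a reproduction of Algorithm 1).

```python
import numpy as np
from scipy.optimize import linprog

def k_minimization(A, b, k):
    """Consolidated sketch of the three steps above for P^m = {x : Ax <= b}:
    witness points (m LPs), cost matrix, greedy selection of k half-spaces."""
    m, n = A.shape
    # Step 1: one witness point per half-space (the LP stated in the text).
    w = np.empty((m, n))
    for i in range(m):
        mask = np.arange(m) != i
        res = linprog(A[i], A_ub=A[mask], b_ub=b[mask], bounds=[(None, None)] * n)
        if not res.success:
            raise RuntimeError(f"LP {i} failed: {res.message}")
        w[i] = res.x
    # Step 2: C[i, j] = distance from witness point x_i to the plane supporting face j.
    C = np.abs(w @ A.T - b) / np.linalg.norm(A, axis=1)
    # Step 3: greedy k-median on C, then keep the selected half-spaces.
    S, best = [], np.full(m, np.inf)
    for _ in range(k):
        scores = [np.minimum(best, C[:, j]).sum() if j not in S else np.inf
                  for j in range(m)]
        S.append(int(np.argmin(scores)))
        best = np.minimum(best, C[:, S[-1]])
    S = sorted(S)
    return A[S], b[S]
```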

In the case of the \({{\mathbb P}^\sharp _\varDelta }\) analysis, let \(\varOmega \) be the obtained fixed point. Then \(\gamma _{\varDelta }(\varOmega ) = \bigcap _{d\in \varDelta } \lbrace x\in {\mathbb R}^n , < x,d> \le \varOmega (d) \rbrace \) is the concretisation of \(\varOmega \) in the polyhedra abstract domain. Note that the cardinality of the set of directions \(\varDelta \) can be very large, so the concretisation of \(\varOmega \) can be time consuming. That is why the k-minimization method is applied at the end of the \({{\mathbb P}^\sharp _\varDelta }\) analysis, to over-approximate the result of \(\gamma _{\varDelta }(\varOmega )\). The concretisation of \(\varOmega \) is very useful: it allows us to compare our result with the one obtained using the polyhedral analysis, and it can also be used as the input of another analysis. Recall that the \({{\mathbb P}^\sharp _\varDelta }\) analysis uses a polyhedron as initial input set. The preliminary results are given in the next section.

4 Benchmarks

We implemented the k-minimization algorithm on top of the Parma Polyhedra Library (PPL: http://bugseng.com/products/ppl/). We applied it to several programs representing digital filters. The obtained results are given in Fig. 3.

Fig. 3. The execution time obtained using the k-minimization method and the standard polyhedral minimization

At the end of the \({{\mathbb P}^\sharp _\varDelta }\) analysis, we concretise the result in the polyhedra abstract domain. In the table of Fig. 3, we compare the results obtained using the combination of the k-minimization and the \({{\mathbb P}^\sharp _\varDelta }\) analysis [10] with the ones obtained using only the \({{\mathbb P}^\sharp _\varDelta }\) analysis. Note that in the first case we apply the k-minimization method before the application of the concretisation function, whereas in the second case we apply the concretisation function directly to the result of the \({{\mathbb P}^\sharp _\varDelta }\) analysis, using all the directions in \(\varDelta \). The first results are very encouraging: with our method the analysis terminates within a few minutes, whereas with the standard minimization function the analysis did not terminate, even after 10 h of execution. However, even with our method, the analysis of the last program did not terminate in a reasonable execution time. The weak point of the k-minimization algorithm is the computation of the witness points, which is time consuming because we solve one LP problem per constraint. The improvement of this point is the subject of our ongoing work.

5 Conclusion

In the \({{\mathbb P}^\sharp _\varDelta }\) analysis, the execution time is linear in the cardinality of \(\varDelta \). So, to improve the precision of the analysis, we can take a \(\varDelta \) with a large cardinality. The result of this analysis can be concretised in the polyhedra abstract domain, where the number of constraints of the obtained polyhedron is equal to the cardinality of \(\varDelta \). So, if we take a very large \(\varDelta \), the computation of the resulting polyhedron can be time consuming. For that reason, we present in this paper a method called k-minimization. This method can be applied before the concretisation to reduce its complexity. The k-minimization method is inspired by the max-plus pruning method. It over-approximates the obtained polyhedron by keeping only k of its half-spaces, where k is chosen statically and is smaller than the cardinality of \(\varDelta \). The obtained over-approximation is the one that minimizes the loss of precision.