1 Introduction

Computing a static analysis in the framework of abstract interpretation [11, 12] typically amounts to solving a system of equations

$$\begin{aligned} \left\{ \begin{aligned} x_1&= F_1(x_1, \ldots , x_n)\\&\vdots \\ x_n&= F_n(x_1, \ldots , x_n) \end{aligned}\right. \end{aligned}$$
(1)

describing the program behavior. Each index \(i \in \{ 1, \ldots , n\}\) represents a control point of the program to be analyzed and each \(F_i\) is a monotone, state-transition operator. The unknownsFootnote 1 \(x_1,\ldots ,x_n\) associated to each control point \(i \in \{ 1, \ldots , n \}\) range over an abstract domain V, which encodes the property we want to analyze. An element of V is called abstract object and represents a set of concrete states.

We are interested in finding the (least) solution, over the domain V, of the set of equations \(F = (F_1,\ldots ,F_n)\) associated to the program to be analyzed. The abstract interpretation framework ensures that any solution of the set of equations correctly approximates the concrete behavior of the program, and the smaller the solution, the more precise is the result of the analysis. In theory, the least solution of the system can be exactly computed as the limit of a Kleene iteration, starting from the least element of \(V^n\). In practice, such a method can be unfeasible, since many abstract domains exhibit infinite ascending chains, and thus the computation may not terminate. Moreover, even for finite abstract domains, it may happen that the ascending chains are very long, making the approach impractical.

Therefore, the standard method to perform the analysis is to compute an over approximation of the least solution of the system of equations, using widening and narrowing operators [10, 13]. For specific abstract domains or for restricted classes of programs, we may find in the literature alternatives, such as acceleration operators [17] and strategy/policy iteration [9, 15, 16], but these methods are not generally applicable and their complexity may be impractical.

A widening, generally denoted by \(\mathbin {\triangledown }\), is a binary operator over the abstract domain V such that:

  • it is an upper bound on V;

  • when used in equations of the kind \(x_i = x_i \mathbin {\triangledown }F_i(x_1, \ldots , x_n)\), it precludes the appearance of infinite ascending chains for \(x_i\).

The widening operator compares the value of \(x_i\) in the previous iteration with its value in the current iteration and, in some cases, returns an approximated value. Widening is used to ensure the termination of the analysis, while introducing a loss in precision. This is realized by replacing some of the original equations \(x_i = F_i(x_1, \ldots , x_n)\) with \(x_i = x_i \mathbin {\triangledown }F_i(x_1, \ldots , x_n)\). The replacement may involve all unknowns or, more commonly, only the ones corresponding to loop heads in the dependency graph of the equation system. Applying widening in this way ensures the termination of a Kleene iteration, but we only get a post-fixpoint of the function \(F=(F_1,\ldots ,F_n)\), instead of the least one.

Once we reach a post-fixpoint, we can start a new Kleene iteration, giving origin to a descending chain which improves the result of the analysis. However, due to infinite descending chains in the abstract domain, the descending iteration might not terminate. The next exampleFootnote 2 shows this phenomenon using the abstract domain \(\textsf {Int}_{\mathbb {Z}}\) of integer intervals, defined as:

$$\begin{aligned} \textsf {Int}_{\mathbb {Z}}= \{ [l,u] \subseteq {\mathbb {Z}}\mid l \le u\in {\mathbb {Z}}\cup \{-\infty ,\infty \} \} \cup \{\emptyset \} , \end{aligned}$$

where \(\emptyset \) denotes the empty set of concrete states, i.e., an unreachable control point. The standard widening on intervals [10] is defined as follows:

$$\begin{aligned} \emptyset \mathbin {\triangledown }I&= I\\ I \mathbin {\triangledown }\emptyset&= I\\ [l_1, u_1]\mathbin {\triangledown }[l_2,u_2]&= [l', u'] \end{aligned}$$

where

$$\begin{aligned} l' = {\left\{ \begin{array}{ll} l_1 &{} \text {if } l_1 \le l_2\\ -\infty &{} \text {otherwise} \end{array}\right. } \qquad u' = {\left\{ \begin{array}{ll} u_1 &{} \text {if } u_1 \ge u_2 \\ +\infty &{} \text {otherwise} \end{array}\right. } \end{aligned}$$

Essentially, it works by preserving stable bounds and removing unstable ones. For instance, \([0,3] \mathbin {\triangledown }[0,4] = [0,\infty ]\). In this way, infinite ascending chains are precluded.

Example 1

Consider the example program unreachableLoop in Fig. 1a, and the corresponding flowchart and set of equations in Fig. 1b, c. We perform the analysis using the integer interval domain \(\textsf {Int}_{\mathbb {Z}}\) with the standard widening. Therefore, we replace the second and the tenth equation in Fig. 1c with

$$\begin{aligned} x_2&= x_2 \mathbin {\triangledown }(x_1 \vee x_8)\\ x_{10}&= x_{10} \mathbin {\triangledown }(x_9 \vee x_{12}) . \end{aligned}$$

Note that these two equations correspond to the loop joins. We assume to follow a work-list based iteration sequence, although the result is analogous with other standard iteration schemas.

The first time \(x_2\) is considered, we have \(x_1 = [0,0]\) and \(x_2 = x_8 = \emptyset \). Widening does not trigger and \(x_2\) gets updated to \(x_2 := x_1 \vee x_8 = [0,0]\). However, the second time \(x_2\) is considered we have \(x_8 = [1,1]\), hence \(x_1 \vee x_8 = [0,1]\), which is widened to \([0,+\infty ]\). This eventually leads to \(x_9 :=[10,+\infty ]\), \(x_{10} :=[10,+\infty ]\) and \(x_{12} :=[11,+\infty ]\) which is a post-fixpoint and the result of the ascending phase of the analysis.

Fig. 1
figure 1

The example program unreachableLoop . Note that \(x_5\) appears in the right hand side of equation \(x_6\) since unreachability must be propagated. a Program. b Flowchart. c Equation system

Starting from the post-fixpoint, we continue to evaluate the semantic equations, without applying neither widening nor narrowing, thus using the original equations \(x_2 = x_1 \vee x_8\) and \(x_{10}= x_9 \vee x_{12}\). We get a descending sequence, which turns out to be infinite. In fact, the first time \(x_2\) is re-evaluated, we have

$$\begin{aligned} x_2 := x_1 \vee x_8 = [0,0] \vee [0,8] = [0,8] \end{aligned}$$

which leads to \(x_9 := \emptyset \). When we evaluate the equations in the second while loop, we get

$$\begin{aligned} x_{10} :=x_9 \vee x_{12} = \emptyset \vee [11,+\infty ] = [11,+\infty ] \end{aligned}$$

and \(x_{12}=[12,+\infty ]\). At the second iteration we get

$$\begin{aligned} x_{10} := x_9 \vee x_{12} = \emptyset \vee [12,+\infty ] = [12,+\infty ] \end{aligned}$$

and \(x_{12} :=[13,+\infty ]\). It is immediate to see that, while keeping on iterating, the values computed at the control point \(x_{10}\) are \([11,+\infty ]\), \([12,+\infty ]\), \([13,+\infty ]\), \([14,+\infty ], \ldots \) which is an infinite descending sequence, whose limit is the empty set. \(\square \)

It is worth noting that, in the previous example, the existence of an infinite descending sequence depends on the fact that the second while loop is unreachable, although the initial ascending phase of the analysis computes a non-empty over approximation. This leads to a descending sequence whose limit is the empty set. This situation is not peculiar of our example. On the contrary, we will show that this is the only way infinite descending sequences may arise in the integer interval domain.

To avoid the appearance of infinite descending chains, we may stop the descending iteration at an arbitrary step, still obtaining a post-fixpoint, or we may use a narrowing operator. Narrowing, generally denoted by \(\mathbin {\vartriangle }\), is a binary operator on a abstract domain V such that:

  • \(a_1 \mathbin {\vartriangle }a_2\) is only defined when \(a_2 \le a_1\);

  • it holds that \(a_2 \le a_1 \mathbin {\vartriangle }a_2 \le a_1\);

  • when used in equations of the kind \(x_i = x_i \mathbin {\vartriangle }F_i(x_1, \ldots , x_n)\), it precludes the appearance of infinite descending chains for \(x_i\).

The standard narrowing for intervals [10], for example, is defined as:

$$\begin{aligned} I \mathbin {\vartriangle }\emptyset&= \emptyset \\ [l_1,u_1] \mathbin {\vartriangle }[l_2, u_2]&= [l', u'] \end{aligned}$$

where

$$\begin{aligned} l' = {\left\{ \begin{array}{ll} l_2 &{} \text {if } l_1 = -\infty \\ l_1 &{} \text {otherwise} \end{array}\right. } \qquad u' = {\left\{ \begin{array}{ll} u_2 &{} \text {if } u_1 = +\infty \\ u_1 &{} \text {otherwise} \end{array}\right. } \end{aligned}$$

Essentially, it works by refining only unbounded extremes. For instance, \([0,\infty ] \mathbin {\vartriangle }[0,10] = [0,10]\) but \([0,10] \mathbin {\vartriangle }[0,9] = [0,10]\). Let us reconsider Example 1 and show what happens when we use narrowing in the descending phase.

Example 2

Consider the same program, flowchart and equations of Example 1, together with the result of the analysis after the ascending phase. We now replace the equations for \(x_2\) and \(x_{10}\) with \(x_2 = x_2 \mathbin {\vartriangle }(x_1 \vee x_8)\) and \(x_{10} = x_{10} \mathbin {\vartriangle }(x_9 \vee x_{12})\) and start a descending iteration.

When the second equation is first re-evaluated, the current value for \(x_2\) is \([0,+\infty ]\), hence the standard narrowing allows to change \(+\infty \) into 8, and we have \(x_2 := [0,8]\) as for the case without narrowing. However, when \(x_{10}\) is evaluated for the first time in the decreasing sequence, we have \(x_{10} := [10,+\infty ] \mathbin {\vartriangle }[11,+\infty ] = [10,+\infty ]\): the standard narrowing precludes further improvements on the second loop. The descending sequence terminates at the cost of a big loss of precision, since we are not able to detect anymore that control points 10–12 are unreachable. \(\square \)

In the rest of the paper we will show that, under certain conditions, narrowing may be omitted even in domains with infinite descending chains. We start by proving it for the domain of integer intervals and for structured programs. We progressively relax these assumptions and show that it holds in the more general case of integer template abstract domains [20] and non-structured programs. More generally, we show that this property holds for any domain which satisfies a newly defined bottom chain condition. In order to ensure termination, we need to introduce a new join operator (for structured programs) and a special acceleration procedure (in the general case). This is the first example of acceleration operator for descending chains. On template abstract domains with real bounds we cannot avoid using narrowing. However, inspired by the results on integers, we provide a family of narrowing operators which are more precise than the standard ones.

The question whether narrowing can be omitted naturally arises since its very first definition. Actually, in many practical cases, the descending chain without narrowing terminates in a few steps. We believe that our result answers this question and, in addition, sheds some light on the behaviour of solvers on descending chains.

Plan of the paper. In Sect. 2 we recall some basic notions on equation systems. In Sect. 3 we deal with the case of integer intervals and structured programs. In Sect. 4 we generalize the results to any domain which satisfies the bottom chain condition and in Sect. 5 we prove that this condition is satisfied by all the template domains with integer bounds. Section 6 shows that, even if convergerce is ensured, the length of the descending chain may be arbitrarily long. We generalize these results to non-structured programs in Sect. 7, while in Sect. 8 we deal with the case of abstract domains with real bounds. Finally, Sect. 9 concludes with some final remark and related work.

Sections 356 and 8 are based on results appeared in [2] in preliminary form and without proofs. The generalizations to domains satisfying the bottom chain condition and to non-structured programs in Sects. 4 and 7 are entirely new. Moreover, the whole presentation has been formalized in a more general setting using flow graphs and proofs are provided for all the results.

2 Equation systems, flowcharts and dependencies

In this paper we consider the general problem of solving equation systems whose unknowns take values in a bounded join-semilattice \((V,\bot ,\vee )\). Given a set X of unknowns, an assignment is a map \(\rho : X \rightarrow V\). An equation system is a map \(F: (X \rightarrow V) \rightarrow (X \rightarrow V)\) from assignments to assignments. We will only consider equation systems with a finite number of unknowns. A post-fixpoint for F is an assignment \(\rho \) such that \(F(\rho ) \le \rho \), where \(\le \) is the standard pointwise extension of the ordering on V to assignments. In the setting of static analysis, solving an equation system means to find out a post-fixpoint for it. In many cases, an assignment which is bigger than a post-fixpoint also suffices. Determining a fixpoint (or even the least fixpoint) of the equation system is just an additional bonus.

In the previous section, we have derived an equation system from the flowchart of a program, by associating an unknown to each edge. Equation systems which arise in this way have a particular form: for each unknown x, the corresponding equation is either of the form \(x = x_1 \vee \cdots \vee x_n\) or of the form \(x = E\) where \(x_1, \ldots , x_n\) are other unknowns and E is an expression with at most one unknown.

In our theoretical treatment we will work with equation systems which are more general than those which arise from flowcharts, since each right hand side is of the form \(x = E_1 \vee \cdots \vee E_n\) and each \(E_i\) is an expression which contains at most one unknown. We will derive these equation systems from graphs we will call semantic flow graphs, which are similar to flowcharts but more abstract since not tied to the syntax of a particular language.

Definition 1

(Flow graph) A flow graph is a tuple \(\mathscr {G}=(N, E, \iota )\) where N is the set of nodes, \(E \subseteq N \times N\) is the set of edges and \(\iota \in N\) is the root node and has no incoming edges.

A path in the graph \(\mathscr {G} = (N, E, \iota )\) is a sequence \(\pi = n_0 \ldots n_k\) with \(k \ge 0\) such that \((n_{i-1}, n_{i}) \in E\) for each \(i \in \{1, \ldots , k\}\). We say that \(\pi \) starts from \(n_0\) and ends in \(n_k\).

We say that the node n dominates the node m if each path from \(\iota \) to m passes trough n. If \(n \ne m\) we say that n strictly dominates m. A back-edge is any edge (mn) such that n dominates m.

A semantic flow graph (over the poset V) is a flow graph with a distinguished value (the initial value for the root node) and whose edges are labeled with monotone functions (the state-transition functions). The labels and the distinguished value contain the semantic information that in a flowchart is conveyed by the shape and label of nodes.

Definition 2

(Semantic flow graph) A semantic flow graph over the poset V is a tuple \(\mathscr {G}=(N, E, \iota ,\{f_e\}_{e\in E},v_{\iota })\) where \((N, E, \iota )\) is a flow graph, each \(f_e: V \rightarrow V\) is a monotone function and \(v_\iota \in V\).

Definition 3

(Equation system derived from a semantic flow graph) From any semantic flow graph \(\mathscr {G} =(N, E, \iota ,\{f_e\}_{e\in E},v_{\iota })\) over the bounded join-semilattice V, we derive an equation system \(F_{\mathscr {G}}\) such that the unknowns of \(F_{\mathscr {G}}\) are the nodes of the graph and

$$\begin{aligned} F_{\mathscr {G}}(\rho )(n) = {\left\{ \begin{array}{ll} v_\iota &{} \text {if } n=\iota \\ {\bigvee }_{(m,n) \in E} f_{(m,n)}(\rho (m))) &{} \text {otherwise}. \end{array}\right. } \end{aligned}$$

The equation system F is called strict if each \(f_e\) is strict, i.e. \(f_e(\bot )=\bot \). When we say that F is derived from the semantic flow graph \(\mathscr {G}\), we mean that \(F=F_{\mathscr {G}}\).

Usually, solvers of equations systems keep a current assignment \(\rho \) which is updated by selecting some unknown to be recomputed according to their corresponding equations. This is formalized by the following definition.

Definition 4

(Update of an assignment) Fixed a semantic flow graph \(\mathscr {G}\), for each assignment \(\rho \) and \(X \subseteq N\), the assignment

$$\begin{aligned} \rho ^{X}(n) = {\left\{ \begin{array}{ll} F_{\mathscr {G}}(\rho )(n) &{} \text {if } n \in X,\\ \rho (n) &{} \text {otherwise,} \end{array}\right. } \end{aligned}$$

is called the update of \(\rho \) for X.

The following facts are well known and easy to check. However, since we use them extensively in the rest of the paper, we prefer to state them clearly.

Proposition 1

(Properties of \(F_{\mathscr {G}}\) ) For any semantic flow graph \(\mathscr {G}\), the equation system \(F_{\mathscr {G}}\) is monotone on assignments. Moreover, the function which maps an assignment \(\rho \) to its update \(\rho ^X\) is monotone, too. Finally, if \(\rho \) is a post-fixpoint, then \(\rho ^{X}\) is a post-fixpoint.

Example 3

The equation system in Fig. 1c is derived from the semantic flow graph in Fig. 2. In this semantic flow graph, the functions labeling the edges are depicted by boolean expression and assignments in order to ease the comparison with the flowchart. \(\square \)

Fig. 2
figure 2

The semantic flow graph for the equation system in Fig. 1c

Semantic flow graphs and flowcharts are related. Essentially, one may be obtained from the other by changing edges in nodes and vice-versa. In this paper, we prefer semantic flow graphs since they are used by many libraries for solving fixpoint equations such as Fixpoint Footnote 3 and ScalaFix.Footnote 4

3 Descending chains on intervals of integers

In this section we informally analyze how termination of the descending phase may be ensured, in the absence of narrowing, with the domain of integer intervals. In Sect. 4 we will formalize our reasoning and prove correctness results in a more general setting. Note that what is generally called interval domain in the literature is an extension of the domain we are considering here, and it would be better called box domain since its elements are boxes in \({\mathbb {R}}^n\). However, the box domain is a particular case of a template abstract domain, and will be dealt with in Sect. 5.

Example 1 shows an analysis which leads to an infinite descending chain of intervals. In particular, the chain is \([11,+\infty ]\), \([12,+\infty ]\), \([13,+\infty ], \ldots \) and its limit is the empty set. It turns out that the only infinite descending chains of intervals are of the kind

$$\begin{aligned}{}[n_0,+\infty ], [n_1,+\infty ], [n_2,+\infty ], \ldots \end{aligned}$$

or

$$\begin{aligned} {[}-\infty , -n_0], [-\infty , -n_1], [-\infty , -n_2], \ldots \end{aligned}$$

where \(\{n_i\}_{i \in {\mathbb {N}}}\) is an infinite ascending chain of integers. The limit of all these chains is the empty set.

Proposition 2

Let \(\{I_i\}_{i\in {\mathbb {N}}}\) be an infinite descending chain of integer intervals. Then \(\bigwedge _{i\in {\mathbb {N}}} I_i = \emptyset \).

Proof

If \(\{I_i\}_{i \in {\mathbb {N}}}\) in an infinite descending chain of intervals and \(I_i=[l_i, u_i]\), then either \(l_i\) or \(-u_i\) is an infinite ascending chain of integers. Therefore, either \(l_i \rightarrow +\infty \) or \(u_i \rightarrow -\infty \). Without loss of generality, assume we are in the first case. Then, for each \(x \in {\mathbb {R}}\), there is \(j \in {\mathbb {N}}\) such that \(l_j > x\), hence \(x \notin \bigwedge _{i \in {\mathbb {N}}} I_i\). We have \(\bigwedge _{i \in {\mathbb {N}}} I_i = \emptyset \). \(\square \)

In the rest of this section, we work with equation systems which have been generated by structured programs, i.e., programs whose control flow constructs are while, for and repeat loops, if-then(-else), break and continue. Intuitively, this means that every loop has a single well defined entry point.

Assume \( loop \) is the entry point of a loop and its corresponding equation is \(x_{ loop } = x_{ in } \vee x_{ back }\), where \( in \) is the edge in the flow graph which comes from outside the loop and \( back \) the back-edge. Since in a reducible graph the entry point of a loop dominates all the nodes inside the loop, if control point \( in \) is unreachable (i.e., \(x_ in = \emptyset \) in the interval domain) the same holds for control point \( loop \).

Therefore, we may change the abstract semantics of the program by replacing each equation corresponding to a loop join \(x_{ loop } = x_{ in } \vee x_{ back }\) with \(x_{ loop } = x_{ in } \vee ^\bot x_{ back }\), where \(\vee ^\bot \) is a left-strict variant of the join operator defined as:

$$\begin{aligned} I_1 \vee ^{\bot } I_2 = {\left\{ \begin{array}{ll} \emptyset &{} \text {if } I_1=\emptyset \\ I_1 \vee I_2 &{} \text {otherwise} \end{array}\right. } \end{aligned}$$
(2)

The new set of equations is correct (only for structured programs) and more precise. Moreover, during the descending phase of the analysis, narrowing is not required to achieve termination. Actually, assume that an infinite descending chain arises during the descending phase. Let \( loop \) be one of the outermost loop heads whose variable \(x_ loop \) infinitely decreases. In the presence of left-strict joins, this leads to a contradiction. The equation of \(x_ loop \) is \(x_ loop = x_ in \vee ^\bot x_ back \). The value of \(x_ in \) is definitively constant. Once it reaches its definitive value \({\bar{x}}_ in \), we may have only two cases:

  • if \(\bar{x}_ in = \emptyset \), then the first time \(x_ loop \) is re-evaluated we have \(x_ loop := \emptyset \) and \(x_ loop \) cannot descend anymore, contradicting our hypothesis;

  • if \(\bar{x}_ in \ne \emptyset \), then \(x_ loop \ge \bar{x}_ in \) always, and therefore it cannot descend infinitely, due to Proposition 2.

The above considerations hold not only for integer intervals, but for any numerical abstract domain with a distinguished value \(\bot \) denoting unreachability.

Moreover, they can be easily lifted to n-ary loop join nodes. For example, if a loop join node has equation

$$\begin{aligned} x_ loop = x_{ in _1} \vee \cdots \vee x_{ in _u} \vee x_{ back _1} \vee \cdots \vee x_{ back _v}, \end{aligned}$$

where all the edges \( in _i\) come from outside the loop and all the \( back _j\)’s are back-edges, we may use left-strict join in this way:

$$\begin{aligned} x_ loop = (x_{ in _1} \vee \cdots \vee x_{ in _u}) \vee ^\bot (x_{ back _1} \vee \cdots \vee x_{ back _v}). \end{aligned}$$

The next section is devoted to formalize the above reasoning in a formal, more general setting, and to prove the correctness of the approach.

4 The general case

In the general case, we deal with an equation system F derived from a flow graph \(\mathscr {G}\). The first step is to define a new equation system \(F^{\bot }\) which uses a left-strict variant of \(\vee \) and which is still correct.

Definition 5

If F is a strict equation system derived from \(\mathscr {G}\), we define the new equation system \(F^{\bot }\) as follows:

$$\begin{aligned} F^{\bot }(\rho )(n) = {\left\{ \begin{array}{ll} v_\iota &{} \text {if } n = \iota \\ {\bigvee }_{e=(m, n) \in E {\setminus } \mathsf {BackE}} f_e(\rho (m)) \vee ^\bot {\bigvee }_{e=(m,n) \in \mathsf {BackE}} f_e(\rho (m)) &{} \text {otherwise}, \end{array}\right. } \end{aligned}$$

where \(\vee ^\bot \) is a variant of \(\vee \) which is strict on the left argument and \(\mathsf {BackE}\) is the set of all back-edges of \(\mathscr {G}\).

The following theorem shows that if we have a correct solution for \(F^\bot \) (namely, an assignment which is greater than a post-fixpoint), that is also a correct solution for F. The opposite also holds. This means we can use \(F^\bot \) everywhere, instead of F.

Theorem 1

If F is a strict equation system derived from \(\mathscr {G}\) and \(\rho \) is a post-fixpoint of \(F^{\bot }\), there is a \(\rho ' \le \rho \) which is a post-fixpoint of F. On the contrary, if \(\rho \) is a post-fixpoint of F, it is also a post-fixpoint of \(F^\bot \).

Proof

Assume \(\rho \) is a post-fixpoint of \(F^{\bot }\). Consider the set X of nodes which have no contributions from incoming non-backedge nodes, i.e.,

$$\begin{aligned} X= \left\{ n \in N {\setminus } \{\iota \} ~ \left| ~ \bigvee _{e=(m,n) \in E {\setminus } \mathsf {BackE}}\, f_e(\rho (m))=\bot \right. \right\} . \end{aligned}$$

Note that if \(n \notin X\) then \(F^{\bot }(\rho )(n) = F(\rho )(n)\). Let us define the new assignment \(\rho ^{\bot }\) as follows:

$$\begin{aligned} \rho ^{\bot }(n) = {\left\{ \begin{array}{ll} \bot &{} \text {if there is } m \in X \text { such that } m \text { dominates } n;\\ \rho (n) &{} \text {otherwise} \end{array}\right. } \end{aligned}$$

We need to prove that \(\rho ^{\bot }\) is a post-fixpoint of F, i.e. \(\rho ^{\bot }(n) \ge F(\rho ^{\bot })(n)\) for all nodes n. There are several cases:

  • if n is not dominated by any node in X, then \(\rho ^{\bot }(n) = \rho (n) \ge F^{\bot }(\rho )(n) = F(\rho )(n) \ge F(\rho ^{\bot })(n)\).

  • if \(n \in X\), then \(\rho ^{\bot }(n) = \bot \). We have

    $$\begin{aligned} F(\rho ^{\bot })(n) = \bigvee _{e=(m,n) \in E {\setminus } \mathsf {BackE}} f_e(\rho ^{\bot }(m)) \vee \bigvee _{e=(m,n) \in \mathsf {BackE}} f_e(\rho ^{\bot }(m)). \end{aligned}$$

    Since \(n \in X\), then \(\bigvee _{e=(m,n) \in E {\setminus } \mathsf {BackE}} f_e(\rho ^{\bot }(m)) \le \bigvee _{e=(m,n) \in E {\setminus } \mathsf {BackE}} f_e(\rho (m)) = \bot \). If \((m,n) \in \mathsf {BackE}\), then n dominates m, hence \(\rho ^{\bot }(m)=\bot \). As a consequence, \(F(\rho ^{\bot })(n)=\bot \).

  • if \(n \notin X\) is dominated by a node \(m \in X\) then \(\rho ^{\bot }(n) = \bot \). For any edge (ln), since m strictly dominates n, then m dominates l and therefore \(F(\rho ^{\bot })(n)=\bot \).

For the second part of the theorem, let \(\rho \) be a post-fixpoint of F. Then \(\rho \) is a post-fixpoint of \(F^{\bot }\), since \(F^{\bot }(\rho ) \le F(\rho )\). \(\square \)

The next definition generalizes and formalizes the property already seen on integer intervals, that all the infinitely descending chains converge to the bottom of the abstract domain.

Definition 6

(Bottom chain condition) We say that a poset V satisfies the bottom chain condition (\(\bot \)CC ) when, for each infinitely descending chain \(v_1> v_2> \cdots> v_i > \cdots \), the greatest lower bound \(\bigwedge _i v_i\) exists and is equal to the least element \(\bot \) of V.

From Proposition 2 it turns out that the integer interval domain satisfies the \(\bot \)CC . We will show in Sect. 5 that any template domain satisfies the \(\bot \)CC .

Definition 7

(Chaotic iteration sequence) Given an assignment \(\rho \) for the equation system F derived from \(\mathscr {G}\), a chaotic iteration sequence is a sequence of assignments \(\rho = \rho _0, \rho _1, \ldots , \rho _i, \ldots \) such that:

  1. 1.

    each \(\rho _{i+1}\) is obtained from \(\rho _i\) by selecting a set \(X_i \subseteq N\) and defining \(\rho _{i+1}=\rho _i^{X_i}\);

  2. 2.

    fairness: for each \(i \in {\mathbb {N}}\) and \(n \in N\), if \(F(\rho _i)(n) \ne \rho _i(n)\), then there exists \(j \ge i\) such that \(n \in X_j\).

According to our definition, all the chaotic iteration sequences are infinite. Nonetheless, any solver for equation systems immediately stops when it recognizes that the sequence stabilizes, i.e., has reached a fixpoint. In this case, abusing terminology, we call it a finite sequence.

The next theorem shows what is one of the main results of this paper, namely that every chaotic iteration sequence on a domain which satisfies \(\bot \)CC for an equation system generated by a reducible graph eventually stabilizes. For the definition of reducible graph, depth-first ordering, forward, retreating and cross edge we refer to [1].

Theorem 2

Assume that V satisfies the \(\bot \)CC , F is an equation system derived from \(\mathscr {G}\) and \(\rho \) is a post-fixpoint of \(F^{\bot }\). If \(\mathscr {G}\) is reducible, every chaotic iteration sequence for \(F^{\bot }\) starting from \(\rho \) eventually stabilizes.

Proof

Let \(\rho = \rho _0, \rho _1, \ldots , \rho _i, \ldots \) be a chaotic iteration sequence. Assume the sequence does not stabilize, i.e., there is a node n such that the chain \(\{ \rho _i(n) \mid i \in {\mathbb {N}}\}\) is infinitely descending. Consider a depth-first ordering of \(\mathscr {G}\) and assume without loss of generality that n is the first node in the ordering which does not stabilize. We recall that, for each i,

$$\begin{aligned} F^{\bot }(\rho _i)(n) = \bigvee _{e=(m,n) \in E {\setminus } \mathsf {BackE}} f_e(\rho _i(m)) \vee ^\bot \bigvee _{e=(m,n) \in \mathsf {BackE}} f_e(\rho _i(m)) \end{aligned}$$

If \((m,n) \in E {\setminus } \mathsf {BackE}\), it is not a back-edge. Since \(\mathscr {G}\) is reducible, it is not a retreating edge either, therefore m comes before n in the depth-first ordering. This means that, for a big enough i, the first half of the formula for \(F^{\bot }(\rho _i)(n)\) stabilizes to a value v. Assume the stable value is reached at iteration j. Since \(\{ \rho _i(n) \}_i\) does not stabilize and V satisfies \(\bot \)CC , it should be that \(\bigwedge _{i \in N} \rho _i(n) = \bot \ge v\), hence \(v = \bot \). This immediately implies that, when the node n is selected at an iteration \(l \ge j\), we have \(\rho _l(n) = \bot \), and therefore \(\{ \rho _i(n) \}_i\) stabilizes against the original hypothesis. \(\square \)

The special case of static analysis of structured programs on the abstract domain of integer intervals immediately follows from the above theorem.

Corollary 1

Assume we have a strict equation system F generated by a structured program whose loop head nodes are of the form \(x_ loop = x_ in \vee x_ back \).

  • If the abstract domain has a distinguished value \(\bot \) denoting unreachability, replacing \(\vee \) with \(\vee ^\bot \) in all the loop heads, the new set of equations is still correct.

  • When using the abstract domain of integer intervals, every chaotic iteration sequence in the new set of equations starting from a post-fixpoint leads to a finite descending sequence.

Proof

The first point immediately derives from Theorem 1. Since the program is structured, it follows that the flow graph is reducible. By Proposition 2, the abstract domain of integer intervals satisfies Definition 6, thus we can apply Theorem 2, from which the thesis. \(\square \)

Note that a descending sequence without narrowing always leads to a fixpoint of the equation system, instead of a post-fixpoint.

5 Template abstract domains

Template abstract domains are numerical domains where each abstract object is described by a finite set of linear constraints on the program variables, and the homogeneous coefficients of these constraints are fixed in advance, before starting the analysis. Most important template abstract domains are the domain of intervals (also called box domain) [10], octagons [19], template parallelotopes [3] and template polyhedra [20]. Non-template abstract domains are, among others, polyhedra [14], parallelotopes [4] and two-variables for linear inequality [21].

A template abstract domain (with integer bounds) \(\textsf {Template}_{\mathbb {Z}}^A\) is defined by an \(m \times n\) real matrix A, called the constraint matrix. An element of \(\textsf {Template}_{\mathbb {Z}}^A\) is a subset of \({\mathbb {R}}^n\) of the form \(\{ \mathbf {x} \in {\mathbb {R}}^n \mid \mathbf {l} \le A \mathbf {x} \le \mathbf {u}\}\), where \(\mathbf {l} \in {\bar{{\mathbb {R}}}}^n\) and and \(\mathbf {u} \in {\bar{{\mathbb {R}}}}^n\) are, respectively, the lower and upper bounds. In the following, one such object will be denoted by \([\mathbf {l}, \mathbf {u}]\). Ordering is given by subset, \(\emptyset \) is the bottom element and the least upper bound \([\mathbf {l}, \mathbf {u}] \vee [\mathbf {l}', \mathbf {u}'] = [\min (\mathbf {l}, \mathbf {l}'), \max (\mathbf {u}, \mathbf {u}')]\) where \(\min \) and \(\max \) are defined component-wise. Note that, differently from \(\textsf {Int}_{\mathbb {Z}}\), an element of \(\textsf {Template}_{\mathbb {Z}}^A\) is a subset of \({\mathbb {R}}^n\) and not of \({\mathbb {Z}}^n\): only the bounds are required to be integer. Also, the constraint matrix A may contain any real number.

A box is an abstract object where A is the identity matrix. Octagons are those objects where the coefficient matrix A allows constraints of the form \(\pm x \pm y \le c\). Parallelotopes, instead, are those objects whose matrix A is invertible.

Under the hypothesis of Theorem 1, it is possible to extend Corollary 1 to all the template abstract domains. In fact, given a narrowing operator on intervals, we can immediately define a corresponding component-wise narrowing operator on any template abstract domain. We first show that template abstract domains with integer bounds enjoy a property similar to Proposition 2.

Proposition 3

Let \(\{I_i\}_{i\in {\mathbb {N}}}\) be an infinite descending chain of objects \(I_i\in \textsf {Template}_{\mathbb {Z}}^A\). Then \(\bigwedge _{i\in {\mathbb {N}}} I_i = \emptyset \).

Proof

If \(\{[\mathbf {l}^i,\mathbf {u}^i]\}_{i \in {\mathbb {N}}}\) is an infinite descending chain in \(\textsf {Template}_{\mathbb {Z}}^A\), then there exists j such that either \(\{l_j^i\}_i\) or \(\{-u_j^i\}_i\) is an infinite ascending chain of integers. Therefore, either \(l_j^i \rightarrow +\infty \) or \(u_j^i \rightarrow -\infty \). Without loss of generality, assume we are in the first case. Then, for each \(x \in {\mathbb {R}}\), there is \(k \in {\mathbb {N}}\) such that \(l_k^i > x\), hence \(x \notin \bigwedge _{i \in {\mathbb {N}}} [\mathbf {l}_j^i,\mathbf {u}_j^i]\). Therefore \(\bigwedge _{i \in {\mathbb {N}}} [\mathbf {l}^i,\mathbf {u}^i] = \emptyset \). \(\square \)

Exploiting the above proposition and Theorem 1, we can prove a result analogous to Corollary 1 which, in presence of a left-strict join, allows us to avoid narrowing, still guaranteeing termination.

Corollary 2

Assume we have a strict equation system F generated by a structured program whose loop head nodes are of the form \(x_ loop = x_ in \vee x_ back \). When using \(\textsf {Template}_{\mathbb {Z}}^A\), if we replace \(\vee \) with \(\vee ^\bot \) in all the loop heads, every chaotic iteration strategy in the new set of equations starting from a post-fixpoint leads to a finite descending sequence.

Proof

The proof is analogous to that one of Corollary 1. \(\square \)

6 Finite, arbitrarily long descending chains

The use of \(F^\bot \) allows us to get rid of narrowing. As a consequence, however, we may find programs whose descending chain is finite but arbitrarily long. We now show this phenomenon.

Consider the example program unracheableLoop2 in Fig. 3a, and the corresponding flow graph and set of equations in Fig. 3b, c. We first perform the analysis using the integer interval domain \(\textsf {Int}_{\mathbb {Z}}\) with the standard widening and narrowing and then we recompute the analysis without narrowing.

In the ascending phase we use widening on the join loops: \(x_2 = x_2 \mathbin {\triangledown }(x_1 \vee ^\bot x_4)\) and \(x_{6} = x_{6} \mathbin {\triangledown }(x_5 \vee ^\bot x_{8})\). The post-fixpoint is:

$$\begin{aligned} x_1&= [0,0]&x_4&= [1,11]&x_7&= [-\infty ,100]\\ x_2&= [0,\infty ]&x_5&= [11,\infty ]&x_{8}&= [-\infty ,99]\\ x_3&= [0,10]&x_6&= [-\infty ,\infty ]&x_{9}&= [101,\infty ] \end{aligned}$$

Now we start the descending phase with the standard narrowing, using the equations \(x_2 = x_2 \mathbin {\vartriangle }(x_1 \vee ^\bot x_4)\) and \(x_{6} = x_{6} \mathbin {\vartriangle }(x_5 \vee ^\bot x_{8})\). When we first apply narrowing in the second equation, we get:

$$\begin{aligned} x_2 = x_2 \mathbin {\vartriangle }(x_1 \vee ^\bot x_4) = [0,\infty ] \mathbin {\vartriangle }[0,11] = [0,11] \end{aligned}$$

and therefore \(x_5 = [11,11]\). We now apply narrowing in the sixth equation:

$$\begin{aligned} x_6 = x_6 \mathbin {\vartriangle }(x_5 \vee ^\bot x_8) = [-\infty ,\infty ] \mathbin {\vartriangle }[-\infty ,99] = [-\infty ,99] \end{aligned}$$

and therefore we have \(x_7 = [-\infty ,99]\), \(x_8 = [-\infty ,98]\) and \(x_9= \emptyset \), which is the fixpoint.

We now recompute the descending phase without narrowing, using the equations

$$\begin{aligned} x_2&= x_1 \vee ^\bot x_4\\ x_{6}&= x_5 \vee ^\bot x_{8}. \end{aligned}$$

The first while loop behaves as before with \(x_5 = [11,11]\). Now we enter the second while loop. The first iteration is the same as before, when using narrowing, and we get:

$$\begin{aligned} x_6&= [-\infty ,99]&x_{8}&= [-\infty ,98]\\ x_7&= [-\infty ,99]&x_{9}&= \emptyset \end{aligned}$$

But now we are able to continue the descending phase, which is:

figure a

Note that, by continuing the descending phase till the fixpoint, we are able to detect that the guard in the second while loop is over-dimensioned, since the variable i never reaches the value 100.

Fig. 3
figure 3

The example program unracheableLoop2 . a Program. b Flow graph. c Equation system

7 Non-reducible graphs

We first show that Theorem 2 does not hold when the flow graph is not reducible. Consider the program in Fig. 4a, where “\(?\)” denotes a non-deterministic boolean expression. The first while loop is the same which appears in the program unreachableLoop . The second while loop is unreachable but has two entry points. The subgraph composed of the nodes from 9 to 14 is not reducible, hence in the right hand side for \(x_{12}\) and \(x_{14}\) we cannot replace \(\vee \) with \(\vee ^\bot \). After the end of the ascending chain with the standard widening, we have \(x_{10} = x_{11} = x_{12} = x_ {14} = [10,+\infty ]\) and \(x_{13} = [11,+\infty ]\). During the descending chain, if we do not use narrowing, \(x_9, x_{10}\) and \(x_{11}\) become \(\emptyset \). However, for \(x_{12}, x_{13}\) and \(x_{14}\) an infinite descending chain begins, whose limit is the empty set.

Fig. 4
figure 4

The example program nonReducible . a Program. b Flow graph. c Equation system

Note that it would be possible to break the descending chain by recognizing that the node 9 dominates all the lower subgraph. Since \(x_{9} = \emptyset \), all the unknowns for the subgraph dominated by 9 may be set to \(\emptyset \). We may turn this idea into a new procedure for the descending chain which ensures termination also for non-reducible graphs. However, something slightly more complex is required for the general case.

First of all, we need to extend the concept of dominance between nodes to the case of dominance between edges and nodes.

Definition 8

(Edge dominance) Given a flow graph \(\mathscr {G} = (N, E, \iota )\), we say that a set of edges \(X \subseteq E\) dominates the node n if each path from \(\iota \) to n passes trough at least one of the edges in X.

Example 4

In the flow graph of Fig. 4b, edges (9, 10) and (9, 11) dominate nodes 12, 13 and 14. \(\square \)

The following lemma formalizes the fact that we can set some unknown to \(\bot \) in an assignment, preserving the property of being a post-fixpoint, and thus improving the unreachability information.

Lemma 1

If F is a strict equation system derived from \(\mathscr {G}\) and \(\rho \) is a post-fixpoint of F, let \(X \subseteq E\) be a set of edges such that, for each \((m,n) \in X\), \(f_{(m,n)}(\rho (m)) = \bot \). Then, consider the assignment

$$\begin{aligned} \rho ^{\bot }(n) = {\left\{ \begin{array}{ll} \bot &{} \text {if } X \text { dominates } n \\ \rho (n) &{} \text {otherwise} \end{array}\right. } \end{aligned}$$

We have that \(\rho ^{\bot }\) is a post-fixpoint of F.

Proof

We prove that \(\rho ^\bot (n) \ge F(\rho ^\bot )(n)\) for each \(n \in N\). We distinguish several cases:

  • X does not dominate n. By monotonicity of F, it is immediate that \(\rho ^{\bot }(n) = \rho (n) \ge F(\rho )(n) \ge F(\rho ^{\bot })(n)\).

  • X dominates n. By definition \(\rho ^{\bot }(n)=\bot \) and \(F(\rho ^{\bot })(n) = \bigvee _{e=(m,n) \in E} f_e(\rho ^{\bot }(m))\). For each \(e=(m,n) \in E\), either \(e \in X\), hence \(f_e(\rho ^\bot (m))\le f_e(\rho (m))=\bot \), by hypothesis, or X dominates m, hence \(\rho ^{\bot }(m)=\bot \). In both cases \(f_e(\rho ^\bot (m))=\bot \), and therefore \(F(\rho ^{\bot })(n) = \bot \).\(\square \)

We can augment a chaotic iteration sequences with new steps which apply Lemma 1 to the current assignment.

Definition 9

(Chaotic iteration sequence with bottom acceleration) Given an assignment \(\rho \) for the equation system F derived from \(\mathscr {G}\), a chaotic iteration sequence with bottom acceleration is a sequence of assignments \(\rho = \rho _0, \rho _0^\bot , \rho _1, \rho _1^\bot , \ldots , \rho _i, \rho _i^\bot , \ldots \) such that

  1. 1.

    \(\rho _{i+1} = \rho _i^{X}\) for some \(X \subseteq N\);

  2. 2.

    \(\rho _i^\bot \) is obtained from \(\rho _i\) by Lemma 1 by choosing \(X = \{(m,n) \in E \mid f_{(m,n)}(\rho (m)))=\bot \}\).

Theorem 3

Assume that V satisfies the \(\bot \)CC , F is a strict equation system derived from \(\mathscr {G}\) and \(\rho \) is a post-fixpoint of F, any chaotic iteration sequence with bottom acceleration starting from \(\rho \) is ultimately stationary.

Proof

First of all, note that each iteration sequence with bottom acceleration starting from \(\rho \) is a descending chain, since at each step we preserve the property of being in a post-fixpoint of F. Assume the sequence \(\rho = \rho _0, \rho _0^\bot , \rho _1, \rho _1^\bot , \ldots , \rho _i, \rho _i^\bot , \ldots \) is not ultimately stationary, i.e., there is a node n such that the chain \(\{ \rho _i(n) \mid i \in {\mathbb {N}}\}\) is infinitely descending. Consider a depth-first ordering of \(\mathscr {G}\) and assume without loss of generality that n is the first node in the ordering which does not stabilize.

Since V satisfies the \(\bot \)CC we have \(\rho _i(n) \rightarrow \bot \) and therefore \(\rho _i(m) \rightarrow \bot \) for each edge \((m,n) \in E\). If m comes before n in the depth-first ordering, then the sequence \(\{ \rho _i(m) \}_{i \in {\mathbb {N}}}\) should be ultimately stationary. This means that, if X is the set of all forward or cross edges pointing to n, there exists an index j such that \(f_{(m,n)}(\rho _j(m))=\bot \) for each \((m,n) \in X\). Note that X dominates n, therefore \(\rho _j^\bot (n)=\bot \), against the original hypothesis. \(\square \)

Example 5

Consider the program in Fig. 4a. During the descending chain, as soon as \(x_{2}\) becomes [0, 9] the acceleration step sets to \(\bot \) all unknowns from \(x_9\) to \(x_{14}\). \(\square \)

Although in this example we could just track node dominance and realize that unknowns from \(x_{10}\) to \(x_{14}\) may be set to \(\bot \) once \(x_9\) reaches \(\bot \), this is not true in general, as the following example shows.

Example 6

Consider the equation system in Fig. 5c over the domain of integer intervals. Consider the assignment \(\rho = \{ x_1 \mapsto [1,1], x_2 \mapsto \emptyset , x_3 \mapsto [1,1], x_4 \mapsto [0,+\infty ], x_5 \mapsto [0,+\infty ] \}\), which is a post-fixpoint. If we start a descending chain from \(\rho \), \(x_4\) and \(x_5\) assume all the values of the form \([i,+\infty ]\) for \(i \ge 0\). Note that neither 2 nor 3 dominate either 4 or 5. However the set of edges \(\{(2,4),(3,5)\}\) dominates both 4 and 5. Therefore, a bottom accelerated descending chain immediately converges. \(\square \)

Fig. 5
figure 5

The example program nonReducible2 . a Program. b Flow graph. c Equation system

The use of bottom acceleration is more effective than using the new equation system \(F^\bot \), since it works for both reducible and non-reducible graphs. However, it is more difficult to implement, since it require changes to the solver algorithm, while \(F^\bot \) may be generally implemented by just changing the equation system that we feed to the solver. Moreover, checking for dominance might slow down the computation in the vast majority of cases when there is no unreachable code, while using \(F^\bot \) has no adverse performance implications.

8 Narrowing on reals

The left-strict join we have introduced for integer domains may also be used with abstract domains over real variables. This improves the precision of the analysis, but does not ensure that the descending phase terminates. This depends on the fact that, once we admit real variables, we can have infinite descending chains whose limit is not the empty set. Nonetheless, in this case the left-strict join may be exploited to define a narrowing more precise than the standard one.

The next example shows that on the standard interval domain \(\textsf {Int}_{\mathbb {R}}\) for real variables, the descending phase of the analysis may lead to an infinite descending chain whose limit is not the empty set. We recall that

$$\begin{aligned} \textsf {Int}_{\mathbb {R}}= \{ [l,u] \subseteq {\mathbb {R}}\mid l \le u\in {\mathbb {R}}\cup \{-\infty ,\infty \} \} \cup \{\emptyset \} . \end{aligned}$$
Fig. 6
figure 6

The example program realLoop . a Program. b Flow graph. c Equation system

Example 7

Consider the example program realLoop in Fig. 6a, and the corresponding flow graph and equations in Fig. 6b, c. The ascending phase using left-strict join and standard widening, i.e., \(x_2=x_2\mathbin {\triangledown }(x_1 \vee ^\bot x_7)\), reaches a post-fixpoint in two iterations.

figure b

We now start from the post-fixpoint a descending iteration without applying narrowing, using the original equation \(x_2=x_1 \vee ^\bot x_7\).

figure c

At the next iterations, we obtain:

$$\begin{aligned} x_2&=[0,3]&x_7&=\left[ 1,\frac{5}{2}\right] \\ x_2&=\left[ 0,\frac{5}{2}\right]&x_7&=\left[ 1,\frac{9}{4}\right] \end{aligned}$$

and so on, without terminating. The fixpoint, which is \(x_2 = [0,2]\) and \(x_7=[1,2]\), is not the empty set. \(\square \)

Taking advantage of the result in Proposition 2, we can define a new notion of weak narrowing. A weak narrowing is very similar to the standard narrowing, the only difference being that a descending chain built with weak narrowing might either stabilize or converge to \(\bot \).

Definition 10

(Weak narrowing) A weak narrowing over the pointed poset \((V,\bot )\) is a binary operator \(\mathbin {\vartriangle }: V \times V \rightarrow V\) such that:

  • \(v_1 \mathbin {\vartriangle }v_2\) is only defined when \(v_2 \le v_1\);

  • \(v_2 \le v_1 \mathbin {\vartriangle }v_2 \le v_1\);

  • given any decreasing chain \(\{v_i\}_{i \in {\mathbb {N}}}\), the sequence \(\{v'_i\}_{i \in {\mathbb {N}}}\) defined as \(v'_0 = v_0\) and \(v'_{i+1} = v'_i \mathbin {\vartriangle }v_{i+1}\) is either ultimately stationary or converges towards \(\bot \).

The following holds for weak narrowing.

Theorem 4

Let F be an equation system derived from a reducible semantic flow graph \(\mathscr {G}\) and \(\rho \) a post-fixpoint of \(F^{\bot }\). Let \(F_{\mathbin {\vartriangle }}^{\bot }\) the equation system derived from \(F^\bot \) by applying the weak narrowing \(\mathbin {\vartriangle }\) to each loop head. Every chaotic iteration sequence for \(F^{\bot }_{\mathbin {\vartriangle }}\) starting from \(\rho \) eventually stabilizes.

Proof

It is similar to the proof of Theorem 2. However, when node n is chosen, which is the least non-stabilizing node in a depth-first ordering of \(\mathscr {G}\), two cases arise:

  • if there are no back-edges pointing to n, then

    $$\begin{aligned} F^{\bot }_{\mathbin {\vartriangle }}(\rho _i)(n) = \bigvee _{e=(m,n) \in E {\setminus } \mathsf {BackE}} f_e(\rho _i(m)) \end{aligned}$$

    and \(\{\rho _i(n)\}_i\) stabilizes since all the nodes which come before n in the depth-first ordering do stabilize;

  • if there are back-edges pointing to n, i.e., n is a loop head, then

    $$\begin{aligned} F^{\bot }_{\mathbin {\vartriangle }}(\rho _i)(n) = \rho _i(n) \mathbin {\vartriangle }\left( \bigvee _{e=(m,n) \in E {\setminus } \mathsf {BackE}} f_e(\rho _i(m)) \vee ^\bot \bigvee _{e=(m,n) \in \mathsf {BackE}} f_e(\rho _i(m)) \right) \end{aligned}$$

    Weak narrowing ensures that if the chain \(\{\rho _i(n)\}_i\) is infinitely descending, then \(\bigwedge _i \rho _i(n)=\bot \). With the same reasoning in Theorem 2, we know that this case never occurs.\(\square \)

Exploiting Proposition 2, we can define a weak narrowing on intervals of reals which refines successive descending iterations at the nearest integer, since we cannot have an infinite descending chain whose bounds are all integers.

Definition 11

(Weak narrowing on reals) We define a weak narrowing operator \(\mathbin {\vartriangle }^1\) on \(\textsf {Int}_{\mathbb {R}}\) as follows:

$$\begin{aligned} I \mathbin {\vartriangle }^1 \emptyset&= \emptyset \\ [l_1,u_1] \mathbin {\vartriangle }^1 [l_2,u_2]&= [l', u'] \end{aligned}$$

where

$$\begin{aligned} l' = {\left\{ \begin{array}{ll} l_2 &{} \text {if } l_1=-\infty \\ \max (l_1,\lfloor l_2 \rfloor ) &{} \text {otherwise} \end{array}\right. } \end{aligned}$$
$$\begin{aligned} u' = {\left\{ \begin{array}{ll} u_2 &{} \text {if } u_1=+\infty \\ \min (u_1, \lceil u_2 \rceil ) &{} \text {otherwise} \end{array}\right. } \end{aligned}$$

Note that this operator may be immediately extended componentwise to template domains. The same also holds for the other weak narrowings we will introduce later in this section. We define them and prove they are weak narrowings only for intervals, for the sake of conciseness.

The new weak narrowing \(\mathbin {\vartriangle }^1\) refines infinite bounds to finite values, as the standard narrowing, and refines finite bounds only to new integer values. Since infinite descending sequences on integer template domains are precluded by the use of left-strict joins, any descending sequence terminates.

Theorem 5

The operator \(\mathbin {\vartriangle }^1\) is a weak narrowing.

Proof

We need to show that for any \([l_2,u_2] \le [l_1,u_1]\), it holds that \([l_2,u_2] \le [l_1,u_1] \mathbin {\vartriangle }^1 [l_2,u_2] \le [l_1,u_1]\) and any infinite descending chain of narrowing eventually stabilizes or converges to \(\emptyset \).

Let \([l', u'] = [l_1,u_1] \mathbin {\vartriangle }^1 [l_2,u_2]\). By Definition  11, we need to prove that \(u_2 \le u' \le u_1\) and \(l_1 \le l' \le l_2\). If \(u_1=+\infty \) then \(u' = u_2\) and \(u_2 \le u' \le u_1\). Otherwise, since \(u_2 \le u_1\) by definition of narrowing operator, it follows that \(u_2 = \min (u_1, u_2) \le \min (u_1, \lceil u_2 \rceil ) = u' \le u_1\). The proof for the lower bound is symmetric.

Given any chain \(\{[l_i,u_i]_{i\in {\mathbb {N}}}\}\), where \([l_{i+1},u_{i+1}] \le [l_i,u_i]\) for any \(i \in {\mathbb {N}}\), we need to prove that the sequence:

$$\begin{aligned}{}[l'_0,u'_0]&= [l_0,u_0] \\ [l'_{i+1},u'_{i+1}]&= [l'_i,u'_i] \mathbin {\vartriangle }^1 [l_{i+1},u_{i+1}] \end{aligned}$$

eventually stabilizes or converges towards \(\emptyset \).

First assume that \(u_0 \ne +\infty \). Thus for any \(i\in {\mathbb {N}}\) we have that \(u_i \ne +\infty \), From Definition 11, it follows that, for any \(i\in {\mathbb {N}}\), \(u'_{i+1} = \min (u'_i, \lceil u_{i+1} \rceil )\), from which it immediately follows that either \(u'_{i+1}=u_0\) or \(u'_{i+1}\in {\mathbb {N}}\). Thus, either the sequence is constant (and is equal to \(u_0\)) or there exists \(k\in {\mathbb {N}}\) such that \(u'_{k}\in {\mathbb {N}}\). In the first case, there is nothing to prove, since the sequence stabilizes to \(u_0\). In the second case, the narrowing sequence starting from \(u'_{k}\) is a sequence of integers, thus it converges towards \(-\infty \) and \(\bigwedge _i {[l'_i, u_i']} = \emptyset \).

Now we assume that \(u_0 = +\infty \). In this case, either the narrowing sequence is stable to \(+\infty \), or there exists \(k \in {\mathbb {N}}\) such that \(u_k \ne +\infty \): we fall in the same case as \(u_0 \ne +\infty \), just shifted by k.

The proof for lower bounds is symmetric. The extension to template domains is immediate. \(\square \)

In the next example we compare the standard narrowing with the new weak narrowing on reals \(\mathbin {\vartriangle }^1\).

Example 8

We compute the descending chain of Example 7 using the standard narrowing on intervals. We start from the post-fixpoint and use the equation \(x_2 = x_2 \mathbin {\vartriangle }(x_1 \vee ^\bot x_7)\). At the first descending iteration we get

$$\begin{aligned} x_2 = [0,+\infty ]\mathbin {\vartriangle }([0,0] \vee ^\bot [1,6]) = [0,+\infty ]\mathbin {\vartriangle }[0,6] = [0,6] . \end{aligned}$$

Note that we get exactly the same value as in the first descending iteration without narrowing. Therefore, we compute for the other unknowns exactly the same values, in particular \(x_7 = [1,4]\). It is immediate to see that this is a fixpoint for the computation using the standard narrowing, since no more unbounded values appear. In fact, we have that

$$\begin{aligned} x_2 = x_2 \mathbin {\vartriangle }(x_1 \vee ^\bot x_7) = [0,6] \mathbin {\vartriangle }[0,4] = [0,6] . \end{aligned}$$

We now recompute the descending chain of Example 7 using the weak narrowing on reals \(\mathbin {\vartriangle }^1\) in Definition 11. The first descending iteration is the same as for the standard narrowing, and we get \(x_2 =[0,6]\) and \(x_7 = [1,4]\). In the second descending iteration we have

$$\begin{aligned} x_2 = x_2 \mathbin {\vartriangle }^1 (x_1 \vee ^\bot x_7) = [0,6] \mathbin {\vartriangle }^1 [0,4] = [0,4] \end{aligned}$$

and \(x_7 = [1,3]\). In the third descending iteration we have

$$\begin{aligned} x_2 = [0,4] \mathbin {\vartriangle }^1 [0,3] = [0,3] \end{aligned}$$

and \(x_7 = [1,\frac{5}{2}]\). This is the fixpoint, since

$$\begin{aligned} x_2 = [0,3] \mathbin {\vartriangle }^1 \left[ 0,\frac{5}{2}\right] = [0,3] . \end{aligned}$$

In this case, we get a result strictly more precise than with the standard narrowing. \(\square \)

It is worth noting that \(\mathbin {\vartriangle }^1\) could be easily generalized by rounding numbers at the multiple of any strictly positive constant value \(\delta \in {\mathbb {R}}\).

Definition 12

( \(\delta \)-narrowing) Let \(\delta \in {\mathbb {R}}\) such that \(\delta >0\). We define a new weak narrowing on intervals of reals:

$$\begin{aligned} I \mathbin {\vartriangle }^{\delta } \emptyset&= \emptyset \\ [l_1,u_1] \mathbin {\vartriangle }^{\delta } [l_2,u_2]&= [l', u'] \end{aligned}$$

where

$$\begin{aligned} l' = {\left\{ \begin{array}{ll} l_2 &{} \text {if } \, l_1=-\infty \\ \max (l_1,\delta \lfloor l_2/\delta \rfloor ) &{} \text {otherwise} \end{array}\right. } \end{aligned}$$
$$\begin{aligned} u' = {\left\{ \begin{array}{ll} u_2 &{} \text {if } \, u_1=+\infty \\ \min (u_1,\delta \lceil u_2/\delta \rceil ) &{} \text {otherwise} \end{array}\right. } \end{aligned}$$

The above weak narrowing produces a descending chain whose elements differ for a multiple of \(\delta \), which is fixed in advance. It generalizes \(\mathbin {\vartriangle }^1\) given in Definition 11. In fact, Definition 12 boils down to Definition 11 when \(\delta =1\).

Proposition 4

Let \(\{[l_i, u_i]\}_{i\in {\mathbb {N}}}\) be an infinite descending chain of real intervals. Assume that there exists \(\delta >0 \in {\mathbb {R}}\) and \(k\in {\mathbb {N}}\) such that, for any \(i\ge k\), it holds that:

  • either \(u_i = u_{i+1}\) or \(u_i - u_{i+1}\ge \delta \);

  • either \(l_i = l_{i+1}\) or \(l_{i+1}-l_i \ge \delta \).

Then \(\bigwedge _{i\in {\mathbb {N}}} [l_i, u_i] = \emptyset \).

Proof

We show the result when \(k=0\). The other case immediately follows by considering the sequence \(\{[l_i, u_i]\}_{i \ge k \in {\mathbb {N}}}\). If \(\{[l_i, u_i]\}_{i \in {\mathbb {N}}}\) is an infinite descending chain of real intervals, then either \(\{l_i\}_{i \in {\mathbb {N}}}\) or \(\{-u_i\}_{i \in {\mathbb {N}}}\) is an infinite ascending chain of reals. Without loss of generality, assume we are in the first case. By hypothesis, it holds that either \(l_i = l_{i+1}\) or \(l_{i+1}-l_i \ge \delta \) for a fixed \(\delta > 0\). Since the chain is infinitely descending, \(l_{i+1}-l_i \ge \delta \) infinitely many times, hence \(l_i \rightarrow +\infty \). Then, for each \(x \in {\mathbb {R}}\), there is \(j \in {\mathbb {N}}\) such that \(l_j > x\), hence \(x \notin \bigwedge _{i \in {\mathbb {N}}} [l_i, u_i]\). We have \(\bigwedge _{i \in {\mathbb {N}}} [l_i, u_i] = \emptyset \). \(\square \)

Theorem 6

The operator \(\mathbin {\vartriangle }^\delta \) is a weak narrowing.

Proof

We need to show that for any \([l_2,u_2] \le [l_1,u_1]\), it holds that \([l_2,u_2] \le [l_1,u_1] \mathbin {\vartriangle }^{\delta } [l_2,u_2] \le [l_1,u_1]\) and any infinite descending chain of narrowing either stabilizes or converges to \(\emptyset \).

Let \([l', u'] = [l_1,u_1] \mathbin {\vartriangle }^{\delta } [l_2,u_2]\). By Definition 13, it is immediate to see that either \(l'=l_1\), or \(l'=l_2\), or \(l'=\delta \lfloor l_2/\delta \rfloor \). Note that, for any \(\delta \in {\mathbb {R}}\), we have that \(l_2 \ge \delta \lfloor l_2/\delta \rfloor \). Since \(l_1 \le l_2\), it follows that \(l_1 \le \max (l_1,\delta \lfloor l_2/\delta \rfloor ) \le l_2\). In all cases, \(l_1 \le l' \le l_2\). The symmetric reasoning holds for upper bounds.

Given any chain \(\{[l_i,u_i]_{i\in {\mathbb {N}}}\}\), where \([l_{i+1},u_{i+1}] \le [l_i,u_i]\) for any \(i \in {\mathbb {N}}\), we need to prove that the sequence:

$$\begin{aligned}{}[l'_0,u'_0]&= [l_0,u_0] \\ [l'_{i+1},u'_{i+1}]&= [l'_i,u'_i] \mathbin {\vartriangle }^{\delta } [l_{i+1},u_{i+1}] \end{aligned}$$

either stabilizes or converges to \(\emptyset \).

First assume that \(u_0\) is a multiple of \(\delta \). From Definition 12, it follows that, for any \(i\in {\mathbb {N}}\), \(u'_{i+1} = \min (u'_i, \delta \lceil u_{i+1}/\delta \rceil )\) is a multiple of \(\delta \). Therefore, either \(u'_{i+1}=u'_i\) or \(u'_{i}- u'_{i+1} \ge \delta \). The same eventually happens also when \(u_0\) is not a multiple of \(\delta \), with the possible exception of the case when \(u'_i=u_0\) for each \(i \in {\mathbb {N}}\). Similar considerations apply to the lower bounds. As a consequence, either \(\{[l_i, u_i]\}_i\) stabilizes, or it satisfies Proposition 4 and its limit is \(\emptyset \). \(\square \)

The next example applies the new weak narrowing \(\mathbin {\vartriangle }^{\delta }\) to the program realLoop .

Example 9

We compute the descending chain for the example program realLoop in Fig. 6a using \(\delta \)-narrowing with \(\delta =\frac{1}{100}\). We get the following values for \(x_2\):

$$\begin{aligned} \begin{aligned} \,[0,6], [0,4], [0,3], \left[ 0,\frac{5}{2} \right] \!, \left[ 0,\frac{9}{4} \right] \!, \left[ 0,\frac{213}{100} \right] \!, \left[ 0,\frac{207}{100} \right] \!, \left[ 0,\frac{204}{100} \right] \!, \left[ 0,\frac{202}{100} \right] \!, \left[ 0,\frac{201}{100} \right] \end{aligned} \end{aligned}$$

where the last one is the fixpoint. \(\square \)

As an alternative, instead of rounding bounds to a multiple of \(\delta \), we may refine bounds with the new value only if the difference w.r.t. the previous value is greater than a given \(\delta \). We call this \(\delta \)*-narrowing.

Definition 13

( \(\delta \)*-narrowing) Let \(\delta \in {\mathbb {R}}\) such that \(\delta > 0\). We define a new weak narrowing on intervals of reals:

$$\begin{aligned} I \mathbin {\vartriangle }^{\delta *} \emptyset&= \emptyset \\ [l_1,u_1] \mathbin {\vartriangle }^{\delta *} [l_2,u_2]&= [l', u'] \end{aligned}$$

where

$$\begin{aligned} l' = {\left\{ \begin{array}{ll} l_2 &{} \text {if } \, l_1=-\infty \text { or } l_2 - l_1 \ge \delta \\ l_1 &{} \text {otherwise} \end{array}\right. } \end{aligned}$$
$$\begin{aligned} u' = {\left\{ \begin{array}{ll} u_2 &{} \text {if } \, u_1=+\infty \text { or } u_1 - u_2 \ge \delta \\ u_1 &{} \text {otherwise} \end{array}\right. } \end{aligned}$$

The above weak narrowing keeps iterating while the difference between two successive iterations is greater than \(\delta \).

Theorem 7

The operator \(\mathbin {\vartriangle }^{\delta *}\) is a weak narrowing.

Proof

We need to show that for any \([l_2,u_2] \le [l_1,u_1]\), it holds that \([l_2,u_2] \le [l_1,u_1] \mathbin {\vartriangle }^{\delta *} [l_2,u_2] \le [l_1,u_1]\) and any infinite descending chain of narrowing either stabilizes or converge towards \(\emptyset \).

Let \([l', u'] = [l_1,u_1] \mathbin {\vartriangle }^{\delta *} [l_2,u_2]\). By Definition 13, it is immediate to see that either \(l'=l_1\) or \(l'=l_2\), and symmetrically for the upper bounds, from which we have that \([l_2,u_2] \le [l', u'] \le [l_1,u_1]\)

Given any chain \(\{[l_i,u_i]_{i\in {\mathbb {N}}}\}\), where \([l_{i+1},u_{i+1}] \le [l_i,u_i]\) for any \(i \in {\mathbb {N}}\), we need to prove that the sequence:

$$\begin{aligned}{}[l'_0,u'_0]&= [l_0,u_0] \\ [l'_{i+1},u'_{i+1}]&= [l'_i,u'_i] \mathbin {\vartriangle }^{\delta *} [l_{i+1},u_{i+1}] \end{aligned}$$

either stabilizes or converge towards \(\emptyset \). The proof proceed as for Theorem 6. \(\square \)

The next example shows the weak narrowing \(\mathbin {\vartriangle }^{\delta *}\) in the program realLoop .

Example 10

We compute the descending chain for the example program realLoop in Fig. 6a using \(\delta \)*-narrowing with \(\delta =\frac{1}{100}\). We get the following values for \(x_2\):

$$\begin{aligned} \begin{aligned} \,[0,6],[0,4],[0,3],\left[ 0,\frac{5}{2}\right] , \left[ 0,\frac{9}{4}\right] , \left[ 0,\frac{17}{8}\right] , \left[ 0,\frac{33}{16}\right] , \left[ 0,\frac{65}{32}\right] , \left[ 0,\frac{129}{64}\right] \end{aligned} \end{aligned}$$

where the last one is the fixpoint. \(\square \)

9 Conclusion and related work

We believe that the main contribution of this paper is a deeper theoretical understanding of termination issues during descending iterations within the framework of static analysis by abstract interpretation. In details, we have:

  • defined a left-strict join operators for loop heads, which improves precision by preserving unreachability;

  • proved that, when using the new join operator on a domain satisfying the bottom chain condition (such as any template abstract domain with integer bounds), the descending phase of the analysis terminates even without using a narrowing operator, provided the equation system comes from a reducible flow graph;

  • defined an acceleration operation for descending iterations which immediately propagates unreachability; this is the first instance of an accelerator operator for the descending phase of the analysis;

  • proved that, when using acceleration on an abstract domain satisfying the bottom chain condition, narrowing is not needed even for non-reducible graphs;

  • provided several improved (more precise) weak narrowings for template domains with real bounds, to be used with the new join operator;

  • shown, for the first time, examples of programs over integers and reals where the descending phase of the analysis is either infinite or arbitrarily long.

Both the new join and the improved weak narrowings may be easily implemented in existent analyzers with little effort, since they only require a single check in the abstract join in order to make it left-strict.

The new join operator may be used systematically with structured programs, since it improves both precision and speed at the same time. The same cannot be said for the new weak narrowings over reals or for the idea of avoiding narrowing at all with integer domains. In this case, we may get better precision, as shown in Example 7, but at the expense of a greater computational cost, since the analysis of the loops might be repeated several times. The good point is that we increase the computational cost only when we improve precision w.r.t. the standard narrowing.

The cost of the repeated computations in loops might be probably reduced by delaying analysis of the inner loops until outer loops are stabilized, so that a long descending sequence in a loop does not force to repeatedly analyze the inner loops. However the impact of the new weak narrowings on the precision and performance of the analysis on realistic test cases will be the topic of a future work.

Only a few papers in the literature deal with narrowing and the descending phase of the analysis. In [5,6,7] the authors propose to combine widening and narrowing during the analysis, resulting in multiple intertwined ascending and descending phases. In [5] the alternation between ascending and descending phases is driven by a hierarchical ordering [8] of the equation system, while in [6, 7] the alternation is driven by a choice of priorities for the unknowns. Both these approaches, when paired with favourable ordering or priorities, are able to completely analyze the upper part (the first loop) of the program unreachableLoop in Fig. 1a before starting the analysis of the lower part. In this case, the infinite descending chain does not arise even with the standard join operator. However, it is not clear if a favourable ordering of equations or priorities exist for any program. This seems unlikely, especially for programs with complex nested loops or non reducible equation systems.

Moreover, [5] also proposes to restart (part of) the analysis when the abstract value associated to the exit node of a loop is refined during the descending phase. Our left-strict join operator may be viewed as a variant of the restarting policy in [5], where restart is triggered only when unreachability is detected. For instance, in the example program unreachableLoop in Fig. 1a, the restarting policy triggers a full analysis (widening and narrowing phases) of the second loop with an initial assignment which maps every unknown of the second loop to bottom. However, while in the previous work restarting is a feature of the equation solver, here it is realized directly at the semantic level.

In [18] the authors try to recover precision by restarting the analysis after that a post-fixpoint has been reached. An heuristic chooses a set of predecessors for each join node. The analysis is then restarted from an initial value which is equal to the result of the previous analysis for the chosen nodes, and bottom for all the others. For the example program unreachableLoop in Fig. 1a this does not help since, at the end of the first analysis, \(x_{9}\) is bottom anyway.

Mostly, our work is orthogonal to the ones cited above: the new operators we have defined may be used within these frameworks to get more precise results.

The idea of avoiding narrowing in the descending phase is used in many papers, with the proviso of bounding the number of descending iterations to ensure termination. In this paper we show that, under certain conditions and ignoring performance issues, we do not need to bound the number of iterations.

While acceleration in the ascending phase (for numerical domains) has been introduced already in [17], the new acceleration operator is, up to our knowledge, the only example of acceleration to be used during the descending phase of the analysis.