1 Introduction

Many program analyses and transformations require handling conjunctions of affine constraints C and \(C'\) under a universal quantification, as in \(\forall x: x \models C \Rightarrow x \models C'\). For instance, this appears in loop scheduling [6, 7], loop tiling [2], program termination [1] and generation of invariants [3]. The affine form of Farkas lemma provides a way to get rid of that universal quantification, at the price of introducing quadratic terms. In the context of program termination and loop scheduling, it is even possible to use Farkas lemma to turn universally quantified quadratic constraints into existentially quantified affine constraints. This requires tricky algebraic manipulations, which are not easy to apply by hand, nor to implement.

In this paper, we propose a scripting tool, fkcc, which makes it easy to manipulate Farkas lemma and benefit from these properties. More specifically, we make the following contributions:

  • A general formulation for the resolution of equations \(\forall x: S(\textit{\textbf{x}}) = 0\) where S is a summation of affine forms including Farkas terms. So far, this resolution had only been applied to specific instances of Farkas summations. This result is the basic engine of the fkcc scripting language.

  • A scripting language to apply and exploit Farkas lemma on polyhedra, affine functions and affine forms.

  • Our tool, fkcc, implementing these principles, available at http://foobar.ens-lyon.fr/fkcc. fkcc may be downloaded or tried online via a web interface. fkcc comes with many examples, making it easy to adopt the tool.

This paper is structured as follows. Sect. 2 presents the affine form of Farkas lemma, our resolution theorem, and explains how it applies to compute scheduling functions. Then, Sect. 3 defines the syntax and outlines informally the semantics of the fkcc language. Section 4 presents two complete use-cases of fkcc. Finally, Sect. 5 concludes this paper and outlines future research perspectives.

2 Farkas Lemma in Program Analysis and Compilation

This section presents the theoretical background of this paper. We first introduce the affine form of Farkas lemma. Then, we present our theorem to solve equations \(S(\textit{\textbf{x}}) = 0\) where S is a summation of affine forms including Farkas terms. This formalization will then be exploited to design the fkcc language.

Lemma 1 (Farkas Lemma, affine form)

Consider a convex polyhedron \(\mathcal {P} = \{\textit{\textbf{x}} \;|\; A\textit{\textbf{x}} + \textit{\textbf{b}} \ge \textit{\textbf{0}}\}\) and an affine form \(\phi \) such that \(\phi (\textit{\textbf{x}})~\ge ~0\) \(\forall \textit{\textbf{x}} \in \mathcal {P}\).

Then: \(\exists \varvec{\lambda }\ge \textit{\textbf{0}}, \lambda _0 \ge 0\) such that:

$$\begin{aligned} {\phi (\textit{\textbf{x}}) = \ ^t\varvec{\lambda }(A \textit{\textbf{x}} + \textit{\textbf{b}}) + \lambda _0 \quad \forall \textit{\textbf{x}}} \end{aligned}$$

Hence, Farkas lemma makes it possible to remove the quantification \(\forall \textit{\textbf{x}} \in {\mathcal {P}}\) by encoding directly the positivity over \({\mathcal {P}}\) into the definition of \(\phi \), thanks to the Farkas multipliers \(\varvec{\lambda }\) and \(\lambda _0\). In the remainder, Farkas terms will be denoted by: \(\mathfrak {F}(\lambda _0,\varvec{\lambda },A,\textit{\textbf{b}})(\textit{\textbf{x}}) = \ ^t\varvec{\lambda }(A \textit{\textbf{x}} + \textit{\textbf{b}}) + \lambda _0\). We now propose a theorem to solve equations \(S(\textit{\textbf{x}}) = 0\) where S involves Farkas terms. The result is expressed as a conjunction of affine constraints, which is suited for integer linear programming:

Theorem 1

Consider a summation \(S(\textit{\textbf{x}}) = \textit{\textbf{u}} \cdot \textit{\textbf{x}} + v + \sum _i \mathfrak {F}({\lambda _i}_0,\varvec{\lambda _i},A_i,\textit{\textbf{b}}_i)(\textit{\textbf{x}})\) of affine forms, including Farkas terms. Then:

$$\begin{aligned} \forall \textit{\textbf{x}}: \; S(\textit{\textbf{x}}) = 0 \quad \hbox {iff} \quad \left\{ \begin{array}{l} \textit{\textbf{u}} + \sum _i \ ^tA_i \varvec{\lambda }_i = \textit{\textbf{0}} \; \wedge \\ v + \sum _i \left( \varvec{\lambda }_i \cdot \textit{\textbf{b}}_i + {\lambda _i}_0 \right) = 0 \end{array}\right. \end{aligned}$$

Proof

We have:

$$\begin{aligned} S(\textit{\textbf{x}})= & {} \ ^t\textit{\textbf{x}} \left( \sum _i \ ^tA_i\varvec{\lambda }_i \right) + \sum _i \left( \varvec{\lambda }_i \cdot \textit{\textbf{b}}_i + {\lambda _i}_0 \right) + \textit{\textbf{u}} \cdot \textit{\textbf{x}} + v\\= & {} \ ^t\textit{\textbf{x}} \left( \textit{\textbf{u}} + \sum _i \ ^tA_i\varvec{\lambda }_i \right) + v + \sum _i \left( \varvec{\lambda }_i \cdot \textit{\textbf{b}}_i + {\lambda _i}_0 \right) \end{aligned}$$

\(S(\textit{\textbf{x}}) = \varvec{\tau }\cdot \textit{\textbf{x}} + \tau _0 = 0\) for any \(\textit{\textbf{x}}\) iff \(\varvec{\tau }= \textit{\textbf{0}}\) and \(\tau _0 = 0\). Hence the result.    \(\square \)
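
To make the resolution mechanical, here is a minimal Python sketch (using sympy; an illustration, not fkcc's implementation) that expands a summation S containing a single Farkas term over a toy polyhedron and collects its coefficients, recovering exactly the conjunction of affine constraints of Theorem 1. The polyhedron, u and v are arbitrary values chosen for the example:

```python
import sympy as sp

# Toy instance: P = {(i,j) | i >= 0, 5 - i >= 0, j >= 0}
i, j = sp.symbols('i j')
lam = sp.Matrix(sp.symbols('lam1:4'))          # multipliers lam1..lam3
lam0 = sp.Symbol('lam0')
A = sp.Matrix([[1, 0], [-1, 0], [0, 1]])
bvec = sp.Matrix([0, 5, 0])
x = sp.Matrix([i, j])
u, v = sp.Matrix([2, -1]), 3                   # the plain affine part u.x + v

# S(x) = u.x + v + F(lam0, lam, A, b)(x), expanded:
S = sp.expand((u.T * x)[0] + v + (lam.T * (A * x + bvec))[0] + lam0)

# S = 0 for all (i,j) iff every collected coefficient vanishes -- exactly
# the conjunction of affine constraints given by Theorem 1:
print(sp.Eq(S.coeff(i), 0))            # u_1 + (^tA lam)_1 = 0
print(sp.Eq(S.coeff(j), 0))            # u_2 + (^tA lam)_2 = 0
print(sp.Eq(S.subs({i: 0, j: 0}), 0))  # v + lam.b + lam0 = 0
```

Each printed equation is affine in the multipliers, which is what makes the result suitable for (integer) linear programming.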

Application to Scheduling. Consider the polynomial product kernel depicted in Fig. 3.(a). Farkas lemma and Theorem 1 may be applied to compute a schedule, i.e., a way to reorganize the computation of the program to fulfill various criteria (overall latency, locality, parallelism, etc.). On this example, a schedule may be expressed as an affine form \(\theta : (i,j) \mapsto t\) assigning a timestamp to each iteration \((i,j)\). This way, a schedule prescribes an execution order \(\prec _\theta \; := \{ \left( (i,j),(i',j') \right) \;|\; \theta (i,j)<\theta (i',j') \}\). Figure 3.(b) illustrates the order prescribed by the schedule \(\theta (i,j) = i\): a sequence of vertical wavefronts, whose iterations are executed in parallel.

A schedule must be positive everywhere on the set of iteration vectors \({\mathcal {D}}_N = \{(i,j)\;|\; A\ ^t(i,j,N) + \textit{\textbf{b}} \ge \textit{\textbf{0}}\}\) (referred to as the iteration domain). In general, iteration domains are parametrized (typically by the array size N) and the schedule may depend on N. Hence, we have to consider vectors \((i,j,N)\) instead of \((i,j)\):

$$\begin{aligned} \theta (i,j,N) \ge 0 \quad \forall (i,j) \in {\mathcal {D}}_N \end{aligned}$$
(1)

Applying Farkas lemma, this translates to:

$$\begin{aligned} \exists \lambda _0 \ge 0, \varvec{\lambda }\ge 0 \quad \hbox {such that} \quad \theta (i,j,N) = \mathfrak {F}(\lambda _0,\varvec{\lambda },A,\textit{\textbf{b}})(i,j,N) \end{aligned}$$
(2)

Moreover, a schedule must satisfy the data dependences \((i,j) \rightarrow (i',j')\). The relation \(\rightarrow \) is generally expressed as a Presburger relation [8], in turn abstracted as a rational convex polyhedron \(\varDelta _N\) containing the correct vectors \((i,j,i',j')\) and sometimes false positives. Here again, \(\varDelta _N = \{(i,j,i',j')\;|\;C\ ^t(i,j,i',j',N) + \textit{\textbf{d}} \ge 0\}\) is parametrized by the structure parameter N. This way, the correctness condition translates to:

$$\begin{aligned} \theta (i',j',N) > \theta (i,j,N) \quad \forall (i,j,i',j') \in \varDelta _N \end{aligned}$$
(3)

Note that \(\theta (i',j',N) > \theta (i,j,N)\) is equivalently written as the positivity of an affine form over a convex polyhedron: \(\theta (i',j',N) - \theta (i,j,N) - 1 \ge 0\). Applying Farkas lemma:

$$\begin{aligned} \exists \mu _0 \ge 0, \varvec{\mu }\ge 0 \; \hbox {such that} \; \theta (i',j',N) - \theta (i,j,N) - 1 = \mathfrak {F}(\mu _0,\varvec{\mu },C,\textit{\textbf{d}})(i,j,i',j',N) \end{aligned}$$

Substituting \(\theta \) using Eq. (2), this translates to \(S(i,j,i',j',N) = 0\), where \(S(i,j,i',j',N)\) is defined as the summation:

$$\begin{aligned} \mathfrak {F}(\lambda _0,\varvec{\lambda },A,\textit{\textbf{b}})(i',j',N) - \mathfrak {F}(\lambda _0,\varvec{\lambda },A,\textit{\textbf{b}})(i,j,N) - \mathfrak {F}(\mu _0,\varvec{\mu },C,\textit{\textbf{d}})(i,j,i',j',N) - 1 \end{aligned}$$

Since \(-\mathfrak {F}(\lambda _0,\varvec{\lambda },A,\textit{\textbf{b}}) = \mathfrak {F}(-\lambda _0,-\varvec{\lambda },A,\textit{\textbf{b}})\), we may apply Theorem 1 to obtain a system of affine constraints on \(\lambda _0,\varvec{\lambda },\mu _0,\varvec{\mu }\). Linear programming may then be applied to find the desired schedule [2, 7]. The same principle might be applied in termination analysis to derive a ranking function [1], as will be developed in Sect. 4.
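
The whole derivation can be replayed numerically. The Python sketch below (using numpy and scipy, independent of fkcc) encodes the positivity and correctness conditions of the polynomial product through Theorem 1 and solves the resulting linear program. The dependence polyhedron is our own encoding of the dependences of Fig. 3.(b) (iterations accessing the same cell c[i+j], source executed first), and minimizing the total multiplier mass is an arbitrary objective chosen to keep the LP bounded and pick a simple schedule:

```python
import numpy as np
from scipy.optimize import linprog

# Iteration domain D_N = {(i,j) | A.(i,j,N) + b >= 0}: 0 <= i,j <= N,
# with N handled as a variable, as in Sect. 2.
A = np.array([[ 1,  0, 0],   # i >= 0
              [-1,  0, 1],   # N - i >= 0
              [ 0,  1, 0],   # j >= 0
              [ 0, -1, 1]])  # N - j >= 0
b = np.zeros(4)

# Dependence polyhedron Delta_N over (i,j,i',j',N): same cell c[i+j],
# source executed before target (i < i'), both points in the domain.
C = np.array([[-1, -1,  1,  1, 0],   # (i'+j') - (i+j) >= 0
              [ 1,  1, -1, -1, 0],   # (i+j) - (i'+j') >= 0
              [-1,  0,  1,  0, 0],   # i' - i - 1 >= 0
              [ 1,  0,  0,  0, 0],   # i >= 0
              [-1,  0,  0,  0, 1],   # N - i >= 0
              [ 0,  1,  0,  0, 0],   # j >= 0
              [ 0, -1,  0,  0, 1],   # N - j >= 0
              [ 0,  0,  1,  0, 0],   # i' >= 0
              [ 0,  0, -1,  0, 1],   # N - i' >= 0
              [ 0,  0,  0,  1, 0],   # j' >= 0
              [ 0,  0,  0, -1, 1]])  # N - j' >= 0
d = np.zeros(11); d[2] = -1.0

# Unknown vector: tau = (t_i,t_j,t_N,t_0), lam (4), lam0, mu (11), mu0.
n = 21
T, LAM, LAM0, MU, MU0 = slice(0, 4), slice(4, 8), 8, slice(9, 20), 20
A_eq, b_eq = [], []

# Positivity -- Theorem 1 on theta(x) - F(lam0,lam,A,b)(x) = 0:
# (t_i,t_j,t_N) = ^tA.lam  and  t_0 = lam.b + lam0.
for k in range(3):
    row = np.zeros(n); row[k] = 1.0; row[LAM] = -A[:, k]
    A_eq.append(row); b_eq.append(0.0)
row = np.zeros(n); row[3] = 1.0; row[LAM] = -b; row[LAM0] = -1.0
A_eq.append(row); b_eq.append(0.0)

# Correctness -- theta(i',j',N) - theta(i,j,N) - 1 = F(mu0,mu,C,d)(...):
# the left-hand side has coefficients (-t_i,-t_j,t_i,t_j,0), constant -1.
G = np.array([[-1, 0], [0, -1], [1, 0], [0, 1], [0, 0]])
for k in range(5):
    row = np.zeros(n); row[0:2] = G[k]; row[MU] = -C[:, k]
    A_eq.append(row); b_eq.append(0.0)
row = np.zeros(n); row[MU] = d; row[MU0] = 1.0
A_eq.append(row); b_eq.append(-1.0)   # d.mu + mu0 = -1

# Any feasible point is a valid schedule; minimize the multiplier mass.
cost = np.zeros(n); cost[4:] = 1.0
bounds = [(None, None)] * 4 + [(0, None)] * 17
res = linprog(cost, A_eq=np.array(A_eq), b_eq=np.array(b_eq),
              bounds=bounds, method="highs")
t = np.where(np.abs(res.x[T]) < 1e-9, 0.0, res.x[T])
print(f"theta(i,j,N) = {t[0]:g}*i + {t[1]:g}*j + {t[2]:g}*N + {t[3]:g}")
```

On this instance, the LP returns \(\theta (i,j,N) = i\), i.e. the vertical wavefront schedule of Fig. 3.(b).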

Fig. 1. fkcc syntax

3 Language

This section specifies the input language of fkcc and outlines informally its semantics. Figure 1 depicts the input syntax of fkcc. Keywords and syntactic sugar are written with verbatim letters, identifiers with italic letters and syntactic categories with roman letters. Among identifiers, p is a parameter, v is a variable (typically a loop counter) and id is an fkcc identifier.

Program, Instructions, Polyhedra. An fkcc program consists of a sequence of instructions. There is no control structure other than the sequence. An instruction may assign an fkcc object (polyhedron, affine form or affine function) to an fkcc identifier, or may be an fkcc object alone. In the latter case, the fkcc object is streamed out to the standard output. Also, we often need to compute the lexicographic optimum of a polyhedron, typically to pick an optimal schedule. fkcc uses parametric integer linear programming [5] via the Piplib library. The result is a discussion on the parameter values:

[Listing a]

would give:

[Listing b]

Note that structure parameters must be declared with the parameters construct. When no parameters are involved, the parameters construct may be omitted. To ensure compatibility with the iscc [10] syntax, the parameters of a polyhedron may be declared in preceding brackets: [N] -> .... This is purely optional: fkcc actually does not analyze this part. The instruction set id emits id := to the standard output. This makes it possible to generate iscc scripts for further analysis. Finally, the set intersection of two polyhedra P and Q is obtained with P*Q.

Affine Forms. An affine form may be defined as a Farkas term:

[Listing c]

If iterations is \(\{ \textit{\textbf{x}} \;|\; A\textit{\textbf{x}} + \textit{\textbf{b}} \ge 0 \}\), then theta is defined as \(\mathfrak {F}(\lambda _0,\varvec{\lambda },A,\textit{\textbf{b}})\), where \(\lambda _0\) and \(\varvec{\lambda }\) are fresh positive variables. In that case, the polyhedron is never parametrized: the parameters must be handled as variables. In particular, do not name variables with identifiers declared as parameters with parameters :=, as they would be treated as parameters regardless of the context. Affine forms might be summed, scaled and composed with affine functions, typically to adjust the input dimension:

[Listing d]

In a summation, all affine forms must have the same input dimension. Also, a constant (1) is automatically interpreted as an affine form ([i,j,i',j',N] -> 1). Affine forms may also be stated explicitly ({[i,j,i',j',N] -> 2*i-i'}). The terms of the summation are simply separated with + and -; no parentheses are allowed.
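
For illustration, the following sympy sketch (not fkcc itself) mimics these constructs: the hypothetical helper farkas_term builds a Farkas term with fresh nonnegative multipliers, composition with an affine function amounts to a substitution, and the resulting summation is the correctness constraint of Sect. 2:

```python
import sympy as sp

def farkas_term(A, bvec, xs, prefix):
    """Build ^t lam (A x + b) + lam0 with fresh nonnegative multipliers,
    mimicking the role of fkcc's Farkas terms (hypothetical helper)."""
    m = A.rows
    lam = sp.Matrix(sp.symbols(f'{prefix}1:{m+1}', nonnegative=True))
    lam0 = sp.Symbol(f'{prefix}0', nonnegative=True)
    return sp.expand((lam.T * (A * sp.Matrix(xs) + bvec))[0] + lam0)

# Domain 0 <= i,j <= N of the polynomial product, N handled as a variable:
i, j, ii, jj, N = sp.symbols('i j ii jj N')   # ii, jj stand for i', j'
A = sp.Matrix([[1, 0, 0], [-1, 0, 1], [0, 1, 0], [0, -1, 1]])
bvec = sp.zeros(4, 1)

theta = farkas_term(A, bvec, [i, j, N], 'lam')
# Composition with an affine function is a substitution; the correctness
# summation of Sect. 2 then reads:
S = theta.subs({i: ii, j: jj}) - theta - 1
print(S)
```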

Resolution. The main feature of fkcc is the resolution of equations \(S(\textit{\textbf{x}}) = 0\) where S is a summation of affine forms including Farkas terms. This is obtained with the instruction solve:

[Listing e]

The result is a polyhedron with Farkas multipliers (obtained after applying Theorem 1):

[Listing f]

At this point, we need to recover the coefficients of our affine form theta in terms of \(\varvec{\lambda }\) (lambda_0, lambda_1, lambda_2, lambda_3) and \(\lambda _0\) (lambda_4). Observe that \(\theta (\textit{\textbf{x}}) = \ ^t\varvec{\lambda }(A\textit{\textbf{x}} + \textit{\textbf{b}}) + \lambda _0 = (^t\varvec{\lambda }A)\,\textit{\textbf{x}} + \varvec{\lambda }\cdot \textit{\textbf{b}} + \lambda _0\). If the coefficients of theta are written \(\theta (\textit{\textbf{x}}) = \varvec{\tau }\cdot \textit{\textbf{x}} + \tau _0\), we simply have: \(\varvec{\tau }= \ ^t\varvec{\lambda } A\) and \(\tau _0 = \varvec{\lambda }\cdot \textit{\textbf{b}} + \lambda _0\). This is obtained with define:

[Listing g]

The result is a conjunction of definition equalities, gathered in a polyhedron:

[Listing h]

The first coefficients tau_k define \(\varvec{\tau }\), the last one defines the constant \(\tau _0\). In our example, theta(i,j,N) = tau_0*i + tau_1*j + tau_2*N + tau_3. Now we may gather the results and eliminate the \(\varvec{\lambda }\) to keep only \(\varvec{\tau }\) and \(\tau _0\):

[Listing i]

The result is a polyhedron with the solutions. Here, there are no solutions: the result is an empty polyhedron. All these steps may be applied at once with the find command:

[Listing j]

The coefficients are automatically named theta_0, theta_1, etc. with the same convention as define. We point out that define chooses fresh names for coefficients (e.g. tau_4, tau_5 the second time it is invoked with "tau") whereas find always chooses the same names. Hence find should be preferred when separately deriving constraints on the same coefficients of theta. find may filter the coefficients of several affine forms expressed as Farkas terms in a summation:

[Listing k]

This is typically used to compute schedules for programs with multiple assignments (here S and T, with dependences from iterations of S to iterations of T). Finally, note that keep tau_0,tau_1,tau_2,tau_3 in P; projects P onto the variables tau_0,tau_1,tau_2,tau_3: the result is a polyhedron with integral points of coordinates (tau_0,tau_1,tau_2,tau_3). This way, the order in which tau_0,tau_1,tau_2,tau_3 are specified to keep directly impacts any further lexicographic optimization, as the sketch below illustrates.
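
As a rough Python analogue (fkcc itself relies on exact parametric ILP through Piplib, while this sketch works on the rational relaxation with scipy), lexicographic minimization can be emulated by minimizing the coordinates one at a time, in the order in which they were kept:

```python
import numpy as np
from scipy.optimize import linprog

def lexmin(A_ub, b_ub, bounds):
    """Lexicographic minimum of {x | A_ub @ x <= b_ub} within bounds:
    minimize x[0], pin it, then minimize x[1], and so on."""
    n = A_ub.shape[1]
    A_eq, b_eq, x = [], [], None
    for k in range(n):
        c = np.zeros(n); c[k] = 1.0
        res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                      A_eq=np.array(A_eq) if A_eq else None,
                      b_eq=np.array(b_eq) if b_eq else None,
                      bounds=bounds, method="highs")
        row = np.zeros(n); row[k] = 1.0
        A_eq.append(row); b_eq.append(res.fun)   # pin the k-th coordinate
        x = res.x
    return x

# {(u,v) | u + v >= 3, 0 <= u,v <= 5}: lexmin over (u,v) is (0,3); with
# the coordinates reordered -- as with keep -- it becomes (3,0) instead.
A_ub = np.array([[-1.0, -1.0]]); b_ub = np.array([-3.0])
print(lexmin(A_ub, b_ub, [(0, 5), (0, 5)]))      # -> [0. 3.]
```

Reordering the coordinates changes the lexicographic optimum, which is exactly why the order passed to keep matters.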

4 Examples

This section shows how fkcc might be used to specify termination analysis and loop scheduling in a few lines.

Fig. 2. Termination example

4.1 Termination Analysis

Consider the example depicted in Fig. 2. The program computes the gcd of two integers \(x_0\) and \(y_0\) (a). It is translated to an affine automaton (b) (also called an integer interpreted automaton), in turn analyzed to check termination (c): does the program terminate for any input \((x_0,y_0)\) satisfying the precondition \(x_0>0 \wedge y_0>0\)?

This problem is – like most topics in static analysis – undecidable in general. However, we may conclude when it is possible to statically derive an abstraction of the program execution that is precise enough. In [1], we provide a termination algorithm based on the computation of a ranking. A ranking is a mapping from each reachable state \(\langle label,\textit{\textbf{x}} \rangle \) of the automaton to a rank belonging to a well-founded set. On our example, a reachable state could be \(\langle loop, (x:3,y:3,x_0:3,y_0:6) \rangle \), after firing the initial transition and the right transition.

The ranking must decrease on the transitions: for any transition \(\langle label,\textit{\textbf{x}} \rangle \rightarrow \langle label',\textit{\textbf{x}}' \rangle \), we have \(\rho (label',\textit{\textbf{x}}') \ll \rho (label,\textit{\textbf{x}})\), where \(\rho \) denotes the ranking. Since ranks belong to a well-founded set, there is – by definition – no infinite decreasing chain of ranks. Hence infinite chains of transitions from an initial state never happen.

In [1], we provide a general method for computing a ranking of an affine automaton. Our ranking \(\rho \) is affine per label. Figure 2.(c) depicts the ranking found on the example. Ranks are ordered with the lexicographic ordering \(\ll \); the well-founded set is \((\mathbb {N}^2,\ll )\). This means that, by decreasing order, start comes first (2), then all the iterations of loop (1), and finally stop (0). The transitions involved to compute those constants are the transitions from start to loop and from loop to stop. Then, the transitions from loop to loop (left, denoted \(\tau _1\), and right, denoted \(\tau _2\)) are used to compute the second dimension of \(\rho \). In the remainder, we will focus on the computation of this second dimension (\(x+y-2\)) from the transitions \(\tau _1\) and \(\tau _2\), and simply write \(\rho \) for it to simplify the presentation.

Positivity on Reachable States. The ranking must be positive on the reachable states of loop. The set of \(\textit{\textbf{x}}\) such that \(\langle loop,\textit{\textbf{x}} \rangle \) is reachable from an initial state is called the accessibility set of loop. In general, we cannot compute it – this is the undecidable part of the analysis. Rather, we compute an over-approximation thanks to linear relation analysis [4, 9]. This set is called an invariant and will be denoted by \(\mathcal {I}_{loop}\). Figure 2.(c) depicts the invariants of the program. The whole challenge is to make the invariant close enough to the accessibility set so that a ranking can be computed. In fkcc, the positivity assertion \(\rho \ge 0\) over \(\mathcal {I}_{loop}\) translates to:

[Listing l]

Decreasing on Transitions. Now it remains to find a ranking decreasing on the transitions \(\tau _1\) and \(\tau _2\). We first consider \(\tau _1\). The decrease assertion on \(\tau _1\) translates to:

[Listing m]

Similarly, we compute a solution set s2 from \(\tau _2\) and the invariant. Finally, the ranking is found with the instruction lexmin (s1*s2);, which outputs the result:

[Listing n]

This corresponds to the second dimension \(x+y-2\).
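
The same computation can be replayed with an off-the-shelf LP solver. The Python sketch below (independent of fkcc) encodes, via Theorem 1, the positivity of \(\rho (x,y) = ax+by+c\) over the invariant and its decrease on \(\tau _1\) and \(\tau _2\); it assumes \(\tau _1\) is the branch \(x>y,\ x \leftarrow x-y\), \(\tau _2\) the symmetric one, and the loop invariant \(x \ge 1 \wedge y \ge 1\). Minimizing the total multiplier mass is an arbitrary tie-breaker that happens to return \(\rho (x,y) = x+y-2\), the ranking above:

```python
import numpy as np
from scipy.optimize import linprog

# Unknowns: the ranking rho(x,y) = a*x + b*y + c (indices 0..2), then the
# Farkas multipliers: lam1,lam2,lam0 (positivity over the invariant
# {x-1>=0, y-1>=0}), mu1,mu2,mu3,mu0 (tau1), nu1,nu2,nu3,nu0 (tau2).
n = 14
A_eq, b_eq = [], []
def eq(coeffs, rhs=0.0):
    row = np.zeros(n)
    for idx, val in coeffs: row[idx] = val
    A_eq.append(row); b_eq.append(rhs)

# Positivity: a*x + b*y + c = lam1*(x-1) + lam2*(y-1) + lam0, for all x,y.
eq([(0, 1), (3, -1)])                      # coeff x: a = lam1
eq([(1, 1), (4, -1)])                      # coeff y: b = lam2
eq([(2, 1), (3, 1), (4, 1), (5, -1)])      # const:   c = -lam1 - lam2 + lam0

# tau1 (x > y, x <- x-y): the decrease rho(x,y) - rho(x-y,y) - 1 = a*y - 1
# must be a Farkas term over {x-1 >= 0, y-1 >= 0, x-y-1 >= 0}.
eq([(6, 1), (8, 1)])                       # coeff x: mu1 + mu3 = 0
eq([(0, 1), (7, -1), (8, 1)])              # coeff y: a = mu2 - mu3
eq([(6, 1), (7, 1), (8, 1), (9, -1)], 1.0) # const:   mu1+mu2+mu3-mu0 = 1

# tau2 (y > x, y <- y-x): the decrease is b*x - 1, over the same domain
# with x-y-1 >= 0 replaced by y-x-1 >= 0.
eq([(1, 1), (10, -1), (12, 1)])            # coeff x: b = nu1 - nu3
eq([(11, 1), (12, 1)])                     # coeff y: nu2 + nu3 = 0
eq([(10, 1), (11, 1), (12, 1), (13, -1)], 1.0)

# Minimizing the multiplier mass is an arbitrary tie-breaker.
cost = np.zeros(n); cost[3:] = 1.0
bounds = [(None, None)] * 3 + [(0, None)] * 11
res = linprog(cost, A_eq=np.array(A_eq), b_eq=np.array(b_eq),
              bounds=bounds, method="highs")
a, b, c = res.x[:3]
print(f"rho(x,y) = {a:g}*x + {b:g}*y {c:+g}")   # -> 1*x + 1*y -2
```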

Fig. 3. Scheduling example

4.2 Scheduling

Figure 3 depicts an example program (a) computing the product of two polynomials specified by their coefficient arrays a and b; the iteration domain with the data dependences across iterations (b); and an example schedule \(\theta \) prescribing a parallel execution by vertical wavefronts, as discussed in Sect. 2.

Positivity. Similarly to the ranking, the positivity condition (1) translates to:

[Listing o]

Correctness. We enhance the correctness condition (3) by making it possible to select the dependences to satisfy. For each dependence class d, we use a 0-1 variable \(\epsilon _d\). Here we have a single dependence class, from S to S, so we have only one 0-1 variable \(\epsilon \):

$$\begin{aligned} \theta (i',j',N) \ge \theta (i,j,N) + \epsilon \quad \forall (i,j,i',j') \in \varDelta _N \end{aligned}$$

On the ranking example, we would have four classes (\(i = start\rightarrow loop\), \(\tau _1\), \(\tau _2\), \(e = loop \rightarrow stop\)). This makes it possible to choose whether a dependence class is satisfied (\(\epsilon _d = 1\)) or just respected (\(\epsilon _d=0\)). This is the way multidimensional schedules are built [7]: on the termination example, we would have \(\epsilon _i = \epsilon _e = 1, \epsilon _{\tau _1} = \epsilon _{\tau _2} = 0\) for the first dimension, then \(\epsilon _{\tau _1} = \epsilon _{\tau _2} = 1\) for the second dimension. Here it is somewhat artificial, since we have a single dependence class. However, the presentation generalizes easily to several dependence classes. This translates as:

[Listing p]

Here is the trick: parameters are forbidden in the definition of Farkas terms; however, parameters are perfectly allowed in summations. In that case, the resolution interprets parameters as constants. Hence the trick: set \(\epsilon \) as a parameter and put it in the summation by declaring an explicit affine form {[i,j,i',j',N] -> -1*eps}. We then keep the definition of the theta coefficients in terms of Farkas multipliers (theta_def) and the domain of \(\epsilon \) (eps_correct).

Optimality. We seek a schedule \(\theta \) with a minimal latency \(\ell (\theta )\) (number of steps). When \(\theta \) is an affine form, \(\ell (\theta )\) may be bounded by an affine form L(N) of the structure parameters [6]: \(\ell (\theta ) \le L(N)\). This means that:

$$\begin{aligned} \forall (i,j) \in {\mathcal {D}}_N: \quad \theta (i,j,N) \le L(N) \end{aligned}$$

This is, again, completely Farkas-compliant. It remains to express L(N), which has to be positive provided \({\mathcal {D}}_N\) is not empty, i.e. \(N \ge 0\). This translates to:

[Listing q]

Finally, it remains to gather the constraints (positivity, correctness, optimality) to obtain the result:

[Listing r]

In priority order, we want (i) to maximize the number of satisfied dependences (minimize inv_eps), then (ii) to minimize the latency (L(N) = latency_0*N + latency_1). This amounts to finding the lexicographic minimum with the variables ordered as (inv_eps,latency_0,latency_1). Note that eps and inv_eps are parameters. Adding them to the variable list of keep has the effect of turning them into counters eps_counter and inv_eps_counter. We obtain the following result, pretty-printed using the -pretty option:

[Listing s]

Hence \(\theta (i,j,N) = N-j\), \(L(N) = N\) and the dependence was satisfied (eps_counter = 1).
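
As a sanity check, this solution is easy to validate by brute force on small instances; the sketch below (plain Python, independent of fkcc) assumes the same dependences as before, namely iterations accessing the same cell c[i+j] with the source executed first:

```python
import itertools

# Check theta(i,j,N) = N - j against positivity, the latency bound
# L(N) = N, and strict satisfaction (eps = 1) of every dependence.
for N in range(1, 8):
    theta = lambda i, j: N - j
    dom = [(i, j) for i in range(N + 1) for j in range(N + 1)]
    assert all(theta(i, j) >= 0 for (i, j) in dom)          # positivity
    assert all(theta(i, j) <= N for (i, j) in dom)          # latency <= L(N)
    for (i, j), (ip, jp) in itertools.product(dom, dom):    # correctness
        if i + j == ip + jp and i < ip:
            assert theta(ip, jp) >= theta(i, j) + 1         # eps = 1
print("theta = N - j checks out on all instances tried")
```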

5 Conclusion

In this paper, we have presented fkcc, a scripting tool to prototype program analyses and transformations using the affine form of Farkas lemma. The scripting language of fkcc is powerful enough to express tricky scheduling algorithms and termination analyses in a few lines. The object representation (polyhedra, affine functions) is compatible with iscc, a widespread polyhedral tool featuring the manipulation of affine relations. fkcc provides features to generate iscc code and, conversely, the output of iscc might be injected into fkcc. This makes it possible to benefit from both worlds.

We believe that scripting tools are mandatory to evaluate research ideas rapidly. So far, Farkas lemma-based approaches were locked by two facts: (i) applying Farkas lemma by hand is nearly impossible, and (ii) implementing an analysis with Farkas lemma is usually tricky, time-consuming and highly bug-prone. With fkcc, computer scientists are now freed from these constraints.