1 Introduction

Data dependence analysis is an essence to identify and exploit parallelism in programs. It has been more than 40 years since the first data dependence test was proposed. Data dependence techniques for linear subscripts were well developed. They have been widely used in modern optimizing compilers. However, the multi-core technology raises an in-depth parallelization request for almost all the applications of modern computer systems, especially the light-weight multi-core mobile computing platforms. In the meantime, various kinds of scientific and engineering computing tasks, including the environmental and biological large-scale applications, are in need of parallelization. Conventional analysis approaches are not able to satisfy these demands, as they are all based on an assumption that each array subscript is an affine reference of its enclosing loops’ indexes. As a result, they usually return a conservative result for nonlinear references, which are becoming more and more ubiquitous in large-scale irregular applications, thereby preventing the parallelization.

It has been proved [1] that the dependence testing problem amounts to determining whether there is an integer solution to the system of equations

$$\begin{aligned} \left\{ {\begin{array}{l} f_1 (i_1 ,\ldots ,i_n ,j_1 ,\ldots ,j_n )=h_1 (i_1 ,\ldots ,i_n )-g_1 (j_1 ,\ldots ,j_n )=0 \\ f_2 (i_1 ,\ldots ,i_n ,j_1 ,\ldots ,j_n )=h_2 (i_1 ,\ldots ,i_n )-g_2 (j_1 ,\ldots ,j_n )=0 \\ \vdots \\ f_m (i_1 ,\ldots ,i_n ,j_1 ,\ldots ,j_n )=h_m (i_1 ,\ldots ,i_n )-g_m (j_1 ,\ldots ,j_n )=0 \\ \end{array} } \right. \end{aligned}$$
(1)

which is subject to the region \(\varvec{R}=\{L_k \le i_k ,j_k \le U_k \mid 1\le k\le n,\ i_k ,j_k \in \mathrm{Z}\}\). Here, \(L_{k}\) and \(U_{k}\) are the lower and upper bounds of the \(k\)th enclosing loop. In each equation, \(h\) and \(g\) represent the subscript expressions of the two array references. In recent years, driven by the challenges of multi-core technology and the parallelism requirements of large-scale irregular computing tasks, data dependence techniques for nonlinear subscripts have also been widely studied.

Blume and Eigenmann [2] proposed a nonlinear dependence test called the Range test. It considers \(h\) and \(g\) of each equation separately and calculates their extreme values individually. If both are monotonic and the maximum value of \(h\), \(h_\mathrm{max}\), is less than the minimum value of \(g\), \(g_\mathrm{min}\), the analysis determines that there is no dependence. Literature [3] described the Quadratic test, which is designed for cases where a dependence equation is quadratic and only one loop index appears in the subscripts. Since loop-carried dependences must be considered, there will be two variables \(x\) and \(y\) in the equation, which are different instances of the same loop index. The algorithm repeatedly propagates the interval of one variable through the equation to narrow the interval of the other. When either interval becomes null, the iteration terminates and "no dependence" is reported. Otherwise, each interval is reduced to a single point, thereby yielding the solution of the equation. If it is not an integer solution, there is no dependence. Literature [4] presented an integer interval theory-based nonlinear dependence test called the polynomial variable interval (PVI) test. It views a dependence equation as an interval equation and eliminates its variables one by one. Finally, only a constant term remains in the left-hand expression and an integer interval in the right-hand expression. If the constant term is not in the interval, there is no dependence.
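As an illustrative sketch of the Range test's core comparison (the function name and interface are ours; the real test handles per-dimension bounds, symbolic ranges, and non-monotonic cases), monotonicity lets the extrema be read off the interval endpoints:

```python
def range_test_independent(h, g, lo, hi):
    # When h and g are monotonic on [lo, hi], their extrema occur at the
    # endpoints, so the two references are independent whenever their
    # value ranges do not overlap.
    h_vals = (h(lo), h(hi))
    g_vals = (g(lo), g(hi))
    return max(h_vals) < min(g_vals) or max(g_vals) < min(h_vals)
```

For instance, \(h(i)=i^{2}\) and \(g(i)=i^{2}+100\) over \([1,10]\) are reported independent, since \(h_\mathrm{max}=100<g_\mathrm{min}=101\).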

Solving a system of nonlinear equations is far from easy. In practice, a quadratic subscript is more common than other nonlinear cases [3, 5, 6]. Hence, the research community has developed data dependence techniques specific to quadratic subscripts. For instance, the Quadratic test described above is such a technique. It is based on a subscript-by-subscript idea, which solves each equation step by step instead of the whole system. The dependence equation is required to be in the form of

$$\begin{aligned} ax^{2}+by=c \end{aligned}$$
(2)

where \(x\) and \(y\) are different instances of the same loop index. The drawback of the Quadratic test is manifest, but it is an exact test, meaning it will find the integer solutions of the equation when there is a dependence. To overcome this drawback, literature [5] designed a quadratic programming-based nonlinear dependence test called the quadratic programming (QP) test. It regards the left-hand expression of a dependence equation as the objective function of a quadratic programming problem and calculates its extreme value. If the minimum value is greater than zero, there is no dependence. When it is not greater than zero, literature [6] analyzed, via the branch and bound method, whether a solution making the objective function equal to zero is integer-valued. If no integer solution exists, there is no dependence. However, the QP test is effective only when the coefficient matrix of the quadratic polynomials is positive semi-definite; in other words, the QP test always reports dependences when this hypothesis is not satisfied. The PVI test proposed in [4] is not a quadratic-specific test, and it can also be applied to a quadratic equation, but mixed polynomials must not appear in the equation.

Focusing on the quadratic subscripts in modern nonlinear and irregular programs, we propose an improved nonlinear data dependence test in this paper. First, we transform a general quadratic equation into its canonical form. Second, we determine whether the canonical equation is integer solvable in the region of interest using the interval equation theory. Compared with existing techniques, our method can analyze more general quadratic dependences than the PVI test [4] and has a lower time complexity than the QP test [5, 6].

The remainder of this paper is structured as follows. Section 2 reviews existing quadratic data dependence tests and presents our motivation. Section 3 describes the process of normalizing a general quadratic equation. Section 4 introduces the interval equation theory and the theorems used to eliminate polynomials from an equation. Section 5 describes the improved test algorithm. In Sect. 6, a case study is presented to help readers understand the algorithm. Section 7 reports our experimental evaluation and Sect. 8 gives the theoretical analysis. Related work is reviewed in Sect. 9. Finally, our conclusions are given in the last section.

2 Motivation

As described above, the Quadratic test [3] can only analyze a set of equations with one loop iteration variable in each equation. It requires that the dependence equation be in the form of Eq. (2). Reorganizing it as

$$\begin{aligned} y=(c-ax^{2})/b \end{aligned}$$
(3)

the monotonicity of the variable \(y\) over the range of interest can then be inferred from the restricted range \([L, U]\) of \(x\) and the difference function at \(\Delta x=1\). Here, \(L\) and \(U\) are the lower and upper bounds of the enclosing loop. If \(y\) is monotonically increasing or decreasing, the dependence equation (2) can also be written as

$$\begin{aligned} x^{2}=(c-by)/a \end{aligned}$$
(4)

Accordingly, the generated lower and upper bounds of \(x\) are

$$\begin{aligned} \begin{array}{l} L^\prime =\sqrt{\left( -\frac{b}{a}\right) ^+L-\left( -\frac{b}{a}\right) ^-U+\frac{c}{a}} \\ U^\prime =\sqrt{\left( -\frac{b}{a}\right) ^+U-\left( -\frac{b}{a}\right) ^-L+\frac{c}{a}} \\ \end{array} \end{aligned}$$
(5)

If we intersect \([L, U]\) with \([L', U']\), a new range of \(x\) is obtained. Since Eq. (3) is monotonic, this repeated process converges, and the range of \(x\) will be reduced either to a single point or to the empty set.

If \(y\) is not monotonic, the graph of Eq. (3) is a parabola. In this case, the test finds the extreme point of the parabola and then splits the interval into two parts, proceeding on each part separately.
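The narrowing loop described above can be sketched as follows for the monotonic case. This is our simplified rendering, not the published algorithm: it assumes the loop range is non-negative so that \(x^{2}\) is monotonic on it, and the return strings are ours.

```python
import math

def quadratic_test(a, b, c, lo, hi, max_iter=50):
    # Interval-narrowing loop for a*x**2 + b*y == c, where x and y are
    # instances of the same loop index in [lo, hi].  Assumes lo >= 0 so
    # that x**2 is monotonic; the real test splits the interval at the
    # parabola's vertex otherwise.
    xlo, xhi, ylo, yhi = lo, hi, lo, hi
    for _ in range(max_iter):
        old = (xlo, xhi, ylo, yhi)
        # Eq. (3): propagate x's range onto y via y = (c - a*x**2) / b.
        vals = [(c - a * t * t) / b for t in (xlo, xhi)]
        ylo, yhi = max(ylo, math.ceil(min(vals))), min(yhi, math.floor(max(vals)))
        if ylo > yhi:
            return "independent"
        # Eq. (4): propagate y's range back onto x via x**2 = (c - b*y) / a.
        vals = [(c - b * t) / a for t in (ylo, yhi)]
        sq_lo, sq_hi = max(0, min(vals)), max(vals)
        if sq_hi < 0:
            return "independent"
        xlo = max(xlo, math.ceil(math.sqrt(sq_lo)))
        xhi = min(xhi, math.floor(math.sqrt(sq_hi)))
        if xlo > xhi:
            return "independent"
        if (xlo, xhi, ylo, yhi) == old:
            break
    if xlo == xhi and ylo == yhi:
        return "dependent" if a * xlo * xlo + b * ylo == c else "independent"
    return "possibly dependent"
```

For \(x^{2}-y=20\) with \(x, y\in [1,4]\), the very first propagation empties \(y\)'s interval, so independence is reported.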

To analyze more general dependence equations, the QP test was proposed; it solves a quadratic programming problem whose objective function is the left-hand expression \(f(\varvec{x})\)

$$\begin{aligned}&\mathrm{min} \quad f(\varvec{x})=\varvec{C}^{T}\varvec{x}+\frac{1}{2}\varvec{x}^T \varvec{H}\varvec{x}, \nonumber \\&\mathrm{s.t.}\quad {A\varvec{x}\le b,\ \varvec{x}\ge 0} \end{aligned}$$
(6)

In this model, \(f(\varvec{x})\) represents the objective function of the system, while \(A\varvec{x}\le b\) and \(\varvec{x}\ge 0\) are the constraints introduced by the bounds of the loop nests. \(A\) is a matrix whose elements are the coefficients of the loop indexes in their references. First, the technique determines whether the coefficient matrix \(\varvec{H}\) of the quadratic polynomials is positive semi-definite. If it is, the minimum \(f(\varvec{x})_\mathrm{min}\) of the quadratic programming system (6) can be computed. If \(f(\varvec{x})_\mathrm{min}\) is greater than zero, the dependence equation has no solution, meaning there is no dependence. If \(f(\varvec{x})_\mathrm{min}\) is not greater than zero, the method tries to find integer solutions making \(f(\varvec{x})=0\). If no integer feasible solution is found, there is no dependence.
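The QP test's precondition can be checked directly. The sketch below (names ours) decides positive semi-definiteness of a small symmetric integer matrix via the principal-minor criterion: a symmetric matrix is positive semi-definite iff every principal minor, not only the leading ones, is non-negative. The check is exponential in the matrix size, which is acceptable for the handful of loop indexes in a typical nest.

```python
from itertools import combinations

def _det(M):
    # Determinant by Laplace expansion; fine for the few loop indexes
    # of a typical nest.
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * _det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def is_psd(H):
    # A symmetric matrix is positive semi-definite iff every principal
    # minor (determinant of the submatrix on a row/column subset) is
    # non-negative.
    n = len(H)
    for k in range(1, n + 1):
        for idx in combinations(range(n), k):
            if _det([[H[i][j] for j in idx] for i in idx]) < 0:
                return False
    return True
```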

The PVI test is not a specific quadratic technique. For a quadratic subscript, a dependence equation must be in the following form

$$\begin{aligned} f(\varvec{x})=\sum \limits _{i=1}^n {a_i x_i^2 +\sum \limits _{i=1}^n {b_i x_i } =c} \end{aligned}$$
(7)

which implies that a mixed polynomial should not appear in the equation. Here, \(x_{i} (1\le i\le n)\) represents an instance of a loop index. That is to say, the PVI test can analyze a quadratic dependence equation only when the coefficient matrix \(\varvec{H}\) of the quadratic polynomials is in the form of

$$\begin{aligned} \varvec{H}=2\times \left[ {{\begin{array}{llll} {h_{11} } &{} &{} &{} \\ &{} {h_{22} } &{} &{} \\ &{} &{} \ddots &{} \\ &{} &{} &{} {h_{nn} } \\ \end{array} }} \right] \end{aligned}$$

Therefore, we summarize the quadratic data dependence tests and their limitations in Table 1. To solve the problems listed in Table 1, we attempt to propose and implement an improved nonlinear data dependence technique for quadratic subscripts. We suppose that the dependence equation is of the form

$$\begin{aligned} f(\varvec{x})=\sum \limits _{i=1}^n {a_i x_i^2 +\sum \limits _{i=1}^n {\sum \limits _{j=i+1}^n {b_{ij} x_i x_j +\sum \limits _{i=1}^n {c_i x_i } =d} }} \end{aligned}$$
(8)

In the first place, any \(x_{i} (1\le i\le n)\) may appear in Eq. (8), while only two variables are allowed in Eq. (2). As a result, there is no constraint on the number of variables in this equation, which means any number of loop index variables may appear in the subscripts. In the second place, any mixed polynomial \(x_{i}x_{j} (1\le i<j\le n)\) may appear in Eq. (8), which removes the PVI test's limitation. Finally, as the following sections show, our method does not require the matrix \(\varvec{H}\) to be positive semi-definite.

Table 1 Quadratic dependence tests and their limitations

3 The canonical form of dependence equations

The limitation of the PVI test is that mixed polynomials are not allowed. Clearly, the only difference between Eqs. (7) and (8) is the mixed polynomials. Therefore, if we can eliminate the mixed polynomials from an equation without altering its meaning, we can still determine whether the equation is integer solvable.

Theorem 1

All mixed polynomials of the Eq. (8) can be eliminated without altering its meaning.

Proof

First, we recall a well-known identity. For any integers \(x_{i}\) and \(x_{j}\) representing different instances of loop indexes, the following holds

$$\begin{aligned} (x_i +x_j )^2=x_i^2 +x_j^2 +2x_i x_j \end{aligned}$$

Hence, we have

$$\begin{aligned} x_i x_j =((x_i +x_j )^2-(x_i^2 +x_j^2 ))/2 \end{aligned}$$

and we can reorder Eq. (8) as

$$\begin{aligned} f(\varvec{x})&= \sum \limits _{i=1}^n {a_i x_i^2 +\sum \limits _{i=1}^n {\sum \limits _{j=i+1}^n {b_{ij} x_i x_j } } } +\sum \limits _{i=1}^n {c_i x_i }\nonumber \\&= \sum \limits _{i=1}^n {a_i x_i^2 } +\sum \limits _{i=1}^n {\sum \limits _{j=i+1}^n {\frac{b_{ij} }{2}((x_i +x_j )^2-x_i^2 -x_j^2 )} } +\sum \limits _{i=1}^n {c_i x_i } =d \end{aligned}$$
(9)

At this point, there is no mixed polynomial in Eq. (9), and no new variable has been introduced. So, the mixed polynomials of Eq. (8) can be eliminated without altering its meaning. \(\square \)

However, the generated equation introduces a quadratic polynomial \((x_{i}+x_{j})^{2}\) for each mixed polynomial \(x_{i}x_{j}\), although no new variable is produced. If we let \(y_{ij}=x_{i}+x_{j}\), there will also be a constraint on \(y_{ij}\) derived from its two components. So, the dependence analysis problem is equivalent to determining whether Eq. (9) is integer solvable subject to

$$\begin{aligned} \varvec{R}'=\{L_k \le x_k \le U_k ,\ y_{ij} =x_i +x_j \mid 1\le i<j\le n,\ 1\le k\le n,\ x_k \in \mathrm{Z}\}\nonumber \\ \end{aligned}$$
(10)
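The transformation of Theorem 1, together with the bookkeeping for the new names \(y_{ij}\), can be mechanized. Below is a sketch (interfaces ours) that works over exact rationals, since \(b_{ij}/2\) need not be integral:

```python
from fractions import Fraction

def canonicalize(a, b, c, d):
    # Rewrite  sum a_i x_i^2 + sum_{i<j} b_ij x_i x_j + sum c_i x_i = d
    # into the mixed-term-free form of Eq. (9), introducing the name
    # y_ij = x_i + x_j for each mixed term.  Exact rationals are used
    # because b_ij / 2 need not be integral.
    sq = [Fraction(ai) for ai in a]   # coefficients of x_i^2
    ysq = {}                          # coefficients of y_ij^2 = (x_i + x_j)^2
    for (i, j), bij in b.items():
        half = Fraction(bij, 2)
        ysq[(i, j)] = half
        sq[i] -= half
        sq[j] -= half
    return sq, ysq, [Fraction(ci) for ci in c], Fraction(d)
```

For example, \(x_1^2+4x_1x_2+x_2^2\) becomes \(-x_1^2-x_2^2+2y_{12}^2\) with \(y_{12}=x_1+x_2\), which expands back to the original polynomial.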

4 Interval equation and its theorems

The interval equation theory can be used to address the dependence testing problem. Examples include the I test [7], a conventional dependence analysis designed for linear subscripts, and the PVI test described in the previous sections, a dependence test for nonlinear subscripts. The I test provides an approach to eliminate a linear polynomial from the left-hand expression of the equation, while the PVI test eliminates nonlinear polynomials.

4.1 Interval equation

Since finding the integer-valued solutions to an equation is extraordinarily difficult, we determine whether the equation is integer solvable with interval equations instead of computing such solutions explicitly. The following are the related definitions.

Definition 1

For each integer \(a\), let \(a^{+}\) and \(a^{-}\) denote the positive and negative parts of \(a\), respectively, where

$$\begin{aligned} \begin{array}{l} a^+=(\left| a \right| +a)/2 \\ a^-=(\left| a \right| -a)/2 \\ \end{array} \end{aligned}$$
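Definition 1 translates directly into the Pos and Neg helpers used later in the algorithm listing:

```python
def pos(a):
    # Positive part of an integer: (|a| + a) / 2
    return (abs(a) + a) // 2

def neg(a):
    # Negative part of an integer: (|a| - a) / 2
    return (abs(a) - a) // 2
```

Note that \(a = a^{+} - a^{-}\) and \(\vert a\vert = a^{+} + a^{-}\) for every integer \(a\).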

Definition 2

Given an integer region \(\varvec{R}\subseteq \varvec{Z}^{n}\), and two functions \(L\) and \(U\) from \(\varvec{R}\) to \(\varvec{Z}\), the integer interval \([L(\varvec{x}), U(\varvec{x})]\) is the union of the integer intervals over all points \(\varvec{x}_{i}\) of \(\varvec{R}\)

$$\begin{aligned}{}[L(\varvec{x}),U(\varvec{x})]=\bigcup \limits _{\varvec{x}_i \in \varvec{R}} {[L(\varvec{x}_i ),U(\varvec{x}_i )]} \end{aligned}$$

Definition 3

Given an integer region \(\varvec{R}\subseteq \varvec{Z}^{n}\), and three functions \(F\), \(L\) and \(U\) from \(\varvec{R}\) to \(\varvec{Z}\), the following

$$\begin{aligned} F(\varvec{x})=[L(\varvec{x}),U(\varvec{x})] \end{aligned}$$
(11)

is an interval equation.

The interval equation (11) is integer solvable in the integer region \(\varvec{R}\) if and only if there exists an \(\varvec{x}_{0}\in \varvec{R}\) such that \(L(\varvec{x}_{0})\le F(\varvec{x}_{0})\le U(\varvec{x}_{0})\). Hence, we can rewrite the dependence equation in the following form

$$\begin{aligned} \sum \limits _{i=1}^n {a_i x_i^2 } +\sum \limits _{i=1}^n {\sum \limits _{j=i+1}^n {\frac{b_{ij} }{2}(y_{ij}^2 -x_i^2 -x_j^2 )} } +\sum \limits _{i=1}^n {c_i x_i } =[d,d] \end{aligned}$$
(12)

If we can eliminate all the quadratic polynomials from the left-hand side of Eq. (12), it is transformed into a linear interval equation. Since such equations can be decided by the I test, we can settle the dependence problem in this way. As for the quadratic polynomials, the interval equation theory transfers them from the left-hand side to the right-hand side and eliminates them on the basis of their own constraints.

4.2 Polynomial elimination

The PVI test provides a theorem to eliminate nonlinear polynomials from a dependence equation.

Lemma 1

[4] Consider the following variable interval equation

$$\begin{aligned} F(\varvec{x})+aX^n=[L(\varvec{x})+bX^n,U(\varvec{x})+cX^n] \end{aligned}$$

subject to a set of constraints on \(\varvec{x}\) in \(\varvec{R}\), \(P^{n}(\varvec{x})\le X^{n}\le Q^{n}(\varvec{x})\) and \(n\in \varvec{Z}^{+}\), where \(X^{n}\) does not appear in any of the constraints in \(\varvec{R}\). If \((b-a)(c-a) \le 0\), or \((b-a)(c-a)>0\) and \(\mathrm{min}(U(\varvec{x})-L(\varvec{x})+(c-b)^{+}P^{n}(\varvec{x})-(c-b)^{-}Q^{n}(\varvec{x})+1)\) \(\ge \) \(\mathrm{min}(\vert b-a\vert ,\vert c-a\vert )\), then the equation above is integer solvable if and only if the interval equation

$$\begin{aligned} F(\varvec{x})&= [L(\varvec{x})+(b-a)^+P^n(\varvec{x})-(b-a)^-Q^n(\varvec{x}),\\&U(\varvec{x})+(c-a)^+Q^n(\varvec{x})-(c-a)^-P^n(\varvec{x})] \end{aligned}$$

is integer solvable subject to the same constraints on \(\varvec{x}\) in \(\varvec{R}\) and the constraint \(P^{n}(\varvec{x})\le Q^{n}(\varvec{x})\).

Because we only need to consider linear and quadratic polynomials in this paper, we specialize this lemma into the following two theorems.

Theorem 2

Consider the following variable interval equation

$$\begin{aligned} F(\varvec{x})+aX^2=[L(\varvec{\varvec{x}})+bX^2,U(\varvec{x})+cX^2] \end{aligned}$$

subject to a set of constraints on \(\varvec{x}\) in \(\varvec{R}\), \(P^{2}(\varvec{x})\le X^{2}\le Q^{2}(\varvec{x})\), where \(X^{2}\) does not appear in any of the constraints in \(\varvec{R}\). If \((b-a)(c-a)\le 0\), or \((b-a)(c-a)>0\) and \(\mathrm{min}(U(\varvec{x})-L(\varvec{x})+(c-b)^{+}P^{2}(\varvec{x})-(c-b)^{-}Q^{2}(\varvec{x})+1)\ge \mathrm{min}(\vert b-a\vert ,\vert c-a\vert )\), then the equation above is integer solvable if and only if the interval equation

$$\begin{aligned} F(\varvec{x})&= [L(\varvec{x})+(b-a)^+P^2(\varvec{x})-(b-a)^-Q^2(\varvec{x}),\\&U(\varvec{x})+(c-a)^+Q^2(\varvec{x})-(c-a)^-P^2(\varvec{x})] \end{aligned}$$

is integer solvable subject to the same constraints on \(\varvec{x}\) in \(\varvec{R}\) and the constraint \(P^{2}(\varvec{x})\le Q^{2}(\varvec{x})\).

Theorem 3

Consider the following variable interval equation

$$\begin{aligned} F(\varvec{x})+aX=[L(\varvec{x})+bX,U(\varvec{x})+cX] \end{aligned}$$

subject to a set of constraints on \(\varvec{x}\) in \(\varvec{R}\), \(P(\varvec{x})\le X\le Q(\varvec{x})\), where \(X\) does not appear in any of the constraints in \(\varvec{R}\). If \((b-a)(c-a) \le 0\), or \((b-a)(c-a)>0\) and \(\mathrm{min}(U(\varvec{x})-L(\varvec{x})+(c-b)^{+}P(\varvec{x})-(c-b)^{-}Q(\varvec{x})+1)\ge \mathrm{min}(\vert b-a\vert ,\vert c-a\vert )\), then the equation above is integer solvable if and only if the interval equation

$$\begin{aligned} F(\varvec{x})&= [L(\varvec{x})+(b-a)^+P(\varvec{x})-(b-a)^-Q(\varvec{x}),U(\varvec{x})\\&+\, (c-a)^+Q(\varvec{x})-(c-a)^-P(\varvec{x})] \end{aligned}$$

is integer solvable subject to the same constraints on \(\varvec{x}\) in \(\varvec{R}\) and the constraint \(P(x)\le Q(x)\).

As a matter of fact, Theorems 2 and 3 are two special cases of Lemma 1, so we do not repeat their proofs. We include Theorem 3 because the constraint of a loop iteration variable may be a function of other loop indexes; eliminating such a quadratic variable introduces the linear form of those indexes into the right-hand expression. Since we want to construct an integer interval on the right-hand side, we must also eliminate the linear variables that appear in the right-hand expression.
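Theorems 2 and 3 perform the same interval update, differing only in whether the moved term is \(X^{2}\) or \(X\). The sketch below (names ours) assumes constant bounds \(P\), \(Q\), \(L\), \(U\); in general they are functions of the remaining variables, which this simplification ignores:

```python
def eliminate_term(a, b, c, L, U, P, Q):
    # Move the term a*X (or a*X**2) from F(x) + a*X = [L + b*X, U + c*X]
    # to the right-hand side, given the constraint P <= X <= Q, following
    # the interval update shared by Theorems 2 and 3.
    pos = lambda t: max(t, 0)   # positive part, Definition 1
    neg = lambda t: max(-t, 0)  # negative part, Definition 1
    new_L = L + pos(b - a) * P - neg(b - a) * Q
    new_U = U + pos(c - a) * Q - neg(c - a) * P
    return new_L, new_U
```

For instance, moving a quadratic term with \(a=1\), \(b=c=0\), \([L,U]=[-22,-22]\), \(P=4\), \(Q=100\) yields the interval \([-122,-26]\).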

After eliminating all the quadratic polynomials, together with any linear polynomials appearing on the right-hand side, a linear interval equation is generated. At this point, the equation can be decided in two ways. One is to eliminate all the remaining linear polynomials with Theorem 3 until the left-hand expression becomes zero; as the right-hand expression represents an integer interval, the original dependence equation is integer solvable if zero lies in this interval. However, two accuracy conditions must be checked before each use of Theorem 3. The other is to invoke the I test, which requires no accuracy conditions. Hence, we invoke the I test algorithm at this point instead of Theorem 3, so whether the original equation is integer solvable depends on the result of the I test.

5 The algorithm

From the above, we can summarize the improved quadratic dependence algorithm. It consists of four steps.

(1) Reorganize the quadratic dependence equation into its canonical form (9) with Theorem 1, and add the constraint of each new variable \(y_{ij}\) to the region of interest \(\varvec{R}\).

(2) Rewrite Eq. (9) as an interval equation of the form (11), and repeatedly eliminate the quadratic polynomials from the left-hand expression with Theorem 2 until only linear polynomials remain on that side.

(3) If a linear polynomial was introduced while eliminating a quadratic polynomial, or if its constraint is not constant, eliminate it from the equation with Theorem 3.

(4) At this point, Eq. (11) has been transformed into a linear interval equation. Invoke the I test algorithm to determine whether it has an integer-valued solution. If it is integer solvable, report that there are dependences; otherwise report that there are none.

Besides, two accuracy conditions must be checked before using Theorem 2 or 3. They are as follows [4].

Accuracy Condition 1. For each variable \(X\) eliminated from the variable interval equation

$$\begin{aligned} F(\varvec{x})+aX^k=[L(\varvec{x})+bX^k,U(\varvec{x})+cX^k] \end{aligned}$$

subject to a set of constraints on \(\varvec{x}\) in \(\varvec{R}\), \(P^{k}(\varvec{x})\le X^{k}\le Q^{k}(\varvec{x})\), where \(X^{k}\) does not appear in any of the constraints in \(\varvec{R}\), the following inequalities need to be satisfied

$$\begin{aligned} (b-a)(c-a)&\le 0\quad \mathrm{or} \\ (b-a)(c-a)&> 0,\ \min ( U(\varvec{x})-L(\varvec{x})+(c-b)^+P^k(\varvec{x})-(c-b)^-Q^k(\varvec{x})+1)\\&\ge \min ( \vert b-a \vert ,\vert c-a\vert )\quad (k=1,2) \\ \end{aligned}$$

Accuracy Condition 2. For each variable \(X\) eliminated from the variable interval equation

$$\begin{aligned} F(\varvec{x})+aX^k=[L(\varvec{x})+bX^k,U(\varvec{x})+cX^k] \end{aligned}$$

subject to a set of constraints on \(\varvec{x}\) in \(\varvec{R}\), \(P^{k}(\varvec{x})\le X^{k}\le Q^{k}(\varvec{x})\), where \(X^{k}\) does not appear in any of the constraints in \(\varvec{R}\), the following inequality needs to be satisfied

$$\begin{aligned} \mathrm{min}(Q^k(\varvec{x})-P^k(\varvec{x}))\ge 0\quad (k=1,2) \end{aligned}$$
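For constant bounds, the two accuracy conditions can be checked mechanically. This is again a simplification of the general case, where the minima range over \(\varvec{R}\); the function name is ours.

```python
def accuracy_ok(a, b, c, L, U, P, Q):
    # Check Accuracy Conditions 1 and 2 for eliminating the term a*X^k
    # from F(x) + a*X^k = [L + b*X^k, U + c*X^k] with P <= X^k <= Q,
    # assuming all bounds are constants.
    pos = lambda t: max(t, 0)
    neg = lambda t: max(-t, 0)
    cond1 = ((b - a) * (c - a) <= 0 or
             U - L + pos(c - b) * P - neg(c - b) * Q + 1
             >= min(abs(b - a), abs(c - a)))
    cond2 = Q - P >= 0
    return cond1 and cond2
```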

We therefore list the algorithm of this dependence test as follows.

[Algorithm listing: figure a]
[Algorithm listing: figure b]

Pos and Neg compute the positive and negative parts of an integer from Definition 1, respectively. The function Const determines whether the input constraint is constant. Readers can find the I test procedure in the literature [7]. Its main idea is to eliminate the variables by two basic theorems until the left-hand expression of the equation becomes zero, without checking any accuracy conditions.
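A much simplified rendering of the I test's elimination loop for constant loop bounds is sketched below. It is ours, not the published algorithm: the real test chooses which term to move, handles symbolic bounds, and applies a GCD check, all of which are omitted here.

```python
def i_test(coeffs, bounds, L, U):
    # Simplified I test for a linear interval equation
    #   sum a_i * x_i = [L, U],  with constant bounds lo_i <= x_i <= hi_i.
    # Each term a_i * x_i is moved to the right-hand side, widening the
    # interval; the move is exact only when |a_i| <= U - L + 1.
    for a, (lo, hi) in zip(coeffs, bounds):
        if abs(a) > U - L + 1:
            return None  # term not movable: this sketch is inconclusive
        ap, an = max(a, 0), max(-a, 0)
        L, U = L - ap * hi + an * lo, U - ap * lo + an * hi
    return L <= 0 <= U
```

The final answer is whether zero lies in the fully widened interval; `None` signals that this simplified sketch cannot decide the equation.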

6 A case study

To help readers understand the algorithm, we demonstrate a case study in this section. Given the code shown below

$$\begin{aligned}&\hbox {for}(i=1;i<=5;i++) \\&\quad \hbox {for}(j=5-i;j<=i+10;j++) \\&\qquad \hbox {for}(k=1;k<=10;k++) \\&\hbox {S}_{1}\quad {A}[i*(i-j)+i+3*k-2]=\cdots \\&\hbox {S}_{2}\quad \cdots ={A}[i*i-2*i+3] \\ \end{aligned}$$

We need to determine whether statement S\(_{1}\) depends on statement S\(_{2}\), or vice versa. According to the principle of dependence testing, we construct the dependence equation as follows

$$\begin{aligned} i_1^2 -i_1 j-i_2^2 +i_1 +2i_2 +3k=5 \end{aligned}$$

It is subject to the constraints defined by the loop bounds. \(i_{1}\) and \(i_{2}\) represent different instances of the loop iteration variable \(i\). As loop-carried dependences must be taken into account, we introduce two variables for the index \(i\) here. The equation above can be rewritten as follows for the purpose of discussion

$$\begin{aligned} I_1^2 -I_1 I_3 -I_2^2 +I_1 +2I_2 +3I_4 =5 \end{aligned}$$

\(I_{1}\), \(I_{2}\), \(I_{3}\) and \(I_{4}\) represent the variables \(i_{1}\), \(i_{2}\), \(j\) and \(k\), respectively. We reorganize this equation by Theorem 1 as

$$\begin{aligned} 3I_1^2 -I_5^2 -2I_2^2 +I_3^2 +2I_1 +4I_2 +6I_4 =10 \end{aligned}$$

where \(I_{5}=I_{1}+I_{3}\). From the definitions of interval equations, we know that the dependence testing problem is to determine whether the interval equation

$$\begin{aligned} 3I_1^2 -I_5^2 -2I_2^2 +I_3^2 +2I_1 +4I_2 +6I_4 =[10,10] \end{aligned}$$

subject to the region \(\varvec{R}=\)

$$\begin{aligned} \begin{array}{l} 1\le I_1 ,I_2 \le 5,\quad 5-I_1 \le I_3 \le I_1 +10 \\ 1\le I_4 \le 10,\quad 6-I_1 \le I_5 =I_1 +I_3 \le I_1 +15 \\ (5-I_1 )^2\le I_3^2 \le (I_1 +10)^2 \\ (6-I_1 )^2\le I_5^2 \le (I_1 +15)^2 \\ \end{array} \end{aligned}$$

is integer solvable.

It is not difficult to obtain the Lower_Bounds and Upper_Bounds sets from the region \(\varvec{R}\). Since \(I_{1}^{2}\) appears in both sets, we have Stack \(=\) {\(I_{2}\), \(I_{3}\), \(I_{5}\)}. \(I_{3}^{2}\) or \(I_{5}^{2}\) should be eliminated first according to Theorem 2. We start with \(I_{5}^{2}\). In this case, \(a=-1\), \(b=c=0\), \(L(I)=U(I)=10\), \(P^{2}(I_{5})=(6-I_{1})^{2}\) and \(Q^{2}(I_{5})=(I_{1}+15)^{2}\). Check the accuracy conditions:

Accuracy Condition 1

$$\begin{aligned} (b-a)(c-a)=(0-(-1))(0-(-1))=1>0 \end{aligned}$$

and

$$\begin{aligned}&\min (U(I)-L(I)+(c-b)^+P^2(I_5 )-(c-b)^-Q^2(I_5 )+1) \\&\quad =\min (10-10+0(6-I_1 )^2-0(I_1 +15)^2+1) \\&\quad \ge \min (\vert 0-(-1)\vert ,\vert 0-(-1)\vert ) \end{aligned}$$

Accuracy Condition 2

$$\begin{aligned} \min (Q^2(I_5 )-P^2(I_5 ))=\min ((I_1 +15)^2-(6-I_1 )^2)=\min (42I_1 +189)\ge 0 \end{aligned}$$

Both are satisfied, so \(I_{5}^{2}\) can be eliminated and the equation is transformed into

$$\begin{aligned}&3I_1^2 -2I_2^2 +I_3^2 +2I_1 +4I_2 +6I_4 \\&\quad =[10+(0-(-1))(6-I_1 )^2,10+(0-(-1))(I_1 +15)^2] \\&\quad =[10+(6-I_1 )^2,10+(I_1 +15)^2] \\&\quad =[I_1^2 -12I_1 +46,I_1^2 +30I_1 +235] \end{aligned}$$

We continue by eliminating variable \(I_{3}^{2}\). In this case, \(a=1\), \(b=c=0\), \(L(I)=I_{1}^{2}-12I_{1}+46\), \(U(I)= I_{1}^{2}+30I_{1}+235\), \(P^{2}(I_{3})= (5-I_{1})^{2}\) and \(Q^{2}(I_{3})= (I_{1}+10)^{2}\). Check the accuracy conditions:

Accuracy Condition 1

$$\begin{aligned} (b-a)(c-a)=(0-1)(0-1)=1>0 \end{aligned}$$

and

$$\begin{aligned}&\min (I_1^2 +30I_1 +235-(I_1^2 -12I_1 +46)+0\times (5-I_1 )^2-0\times (I_1 +10)^2+1)\\&\quad \ge \min (\vert 0-1\vert ,\vert 0-1\vert ) \end{aligned}$$

Accuracy Condition 2

$$\begin{aligned} \min (Q^2(I_3 )-P^2(I_3 ))=\min ((I_1 +10)^2-(5-I_1 )^2)\ge 0 \end{aligned}$$

Both are satisfied, so \(I_{3}^{2}\) can be eliminated and the equation is transformed into

$$\begin{aligned}&3I_1^2 -2I_2^2 +2I_1 +4I_2 +6I_4 \\&\quad =[10+(6-I_1 )^2-(0-1)^-(I_1 +10)^2,10+(I_1 +15)^2-(0-1)^-(5-I_1 )^2] \\&\quad =[-32I_1 -54,40I_1 +210] \end{aligned}$$

Next is \(I_{2}^{2}\). In this case, \(a=-2\), \(b=c=0\), \(L(I)= -32I_{1}-54\), \(U(I)= 40I_{1}+210\), \(P^{2}(I_{2})=1\) and \(Q^{2}(I_{2})=25\). Check the accuracy conditions:

Accuracy Condition 1 \((b-a)(c-a)=(0-(-2))(0-(-2))=4>0\) and

$$\begin{aligned}&\min (40I_1 +210-(-32I_1 -54)+0\times 1-0\times 25+1)\\&\quad \ge \min (\vert 0-(-2)\vert ,\vert 0-(-2)\vert ) \end{aligned}$$

Accuracy Condition 2

$$\begin{aligned} \min (Q^2(I_2 )-P^2(I_2 ))\ge \min (25-1)\ge 0 \end{aligned}$$

Both are satisfied, so \(I_{2}^{2}\) can be eliminated and the equation is transformed into

$$\begin{aligned}&3I_1^2 +2I_1 +4I_2 +6I_4 \\&\quad =[-32I_1 -54+(0-(-2))\times 1,40I_1 +210+(0-(-2))\times 25] \\&\quad =[-32I_1 -52,40I_1 +260] \\ \end{aligned}$$

At this point, no variable remains in the Lower_Bounds or Upper_Bounds set, so the Stack is {\(I_{1}\)}. In this case, \(a=3\), \(b=c=0\), \(L(I)=-32I_{1}-52\), \(U(I)=40I_{1}+260\), \(P^{2}(I_{1})=1\) and \(Q^{2}(I_{1})=25\). Check the accuracy conditions:

Accuracy Condition 1

$$\begin{aligned} (b-a)(c-a)=(0-3)(0-3)=9>0 \end{aligned}$$

and

$$\begin{aligned} \min (40I_1 +260-(-32I_1 -52)+0\times 1-0\times 25+1)\ge \min (\vert 0-3\vert ,\vert 0-3\vert ) \end{aligned}$$

Accuracy Condition 2

$$\begin{aligned} \min (Q^2(I_1 )-P^2(I_1 ))\ge \min (25-1)\ge 0 \end{aligned}$$

Both are satisfied, so \(I_{1}^{2}\) can be eliminated and the equation is transformed into

$$\begin{aligned}&2I_1 +4I_2 +6I_4 \\&\quad =[-32I_1 -52-3\times 25,40I_1 +260-3\times 1] \\&\quad =[-32I_1 -127,40I_1 +257] \\ \end{aligned}$$

We have therefore finished eliminating all quadratic polynomials from the equation. However, a linear polynomial in \(I_{1}\) was introduced into the right-hand expression when eliminating \(I_{5}^{2}\), so it should be eliminated according to the algorithm. In this case, \(a=2\), \(b=-32\), \(c=40\), \(L(I)=-127\), \(U(I)=257\), \(P(I_{1})=1\) and \(Q(I_{1})=5\).

Check the accuracy conditions:

Accuracy Condition 1

$$\begin{aligned} (b-a)(c-a)=(-32-2)(40-2)\le 0 \end{aligned}$$

Accuracy Condition 2

$$\begin{aligned} \min (Q(I_1 )-P(I_1 ))\ge \min (5-1)\ge 0 \end{aligned}$$

Both are satisfied, so \(I_{1}\) can be eliminated and the equation is transformed into

$$\begin{aligned} 4I_2 +6I_4 =[-127-34\times 5,257+38\times 5]=[-297,447] \end{aligned}$$

The right-hand expression becomes an integer interval. According to the algorithm, the I test algorithm should be invoked to analyze this interval equation. It returns the equation is integer solvable, so there are dependences between the statements S\(_{1}\) and S\(_{2}\).
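Because the loop bounds in this example are small, the verdict can be cross-checked by exhaustively enumerating the iteration space:

```python
def case_study_dependent():
    # Exhaustive check of the dependence equation
    #   i1**2 - i1*j - i2**2 + i1 + 2*i2 + 3*k == 5
    # over the loop region of this example (1 <= i1, i2 <= 5,
    # 5 - i1 <= j <= i1 + 10, 1 <= k <= 10).
    for i1 in range(1, 6):
        for i2 in range(1, 6):
            for j in range(5 - i1, i1 + 11):
                for k in range(1, 11):
                    if i1 * i1 - i1 * j - i2 * i2 + i1 + 2 * i2 + 3 * k == 5:
                        return True
    return False
```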

The above example illustrates the process of our algorithm well; however, it ends with dependences being reported. Consider now the code shown below

$$\begin{aligned}&\mathrm{for}(i=1;i<=5;i++) \\&\quad \mathrm{for}(j=1;j<=5;j++) \\&\mathrm{S}_{1}\quad {A}[i*(i+2j)+j*j+20]=\cdots \\&\mathrm{S}_{2}\quad \cdots ={A}[i*i-2] \\ \end{aligned}$$

We need to determine whether statement S\(_{1}\) depends on statement S\(_{2}\), or vice versa. According to the principle of dependence testing, we construct the dependence equation as follows

$$\begin{aligned} i_1^2 +2i_1 j+j^2-i_2^2 =-22 \end{aligned}$$

It is subject to the constraints defined by the loop bounds. \(i_{1}\) and \(i_{2}\) represent different instances of the loop iteration variable \(i\). As loop-carried dependences must be taken into account, we introduce two variables for the index \(i\) here. The equation above can be rewritten as follows for the purpose of discussion

$$\begin{aligned} I_1^2 +2I_1 I_3 +I_3^2 -I_2^2 =-22 \end{aligned}$$

\(I_{1}\), \(I_{2}\) and \(I_{3}\) represent the variables \(i_{1}\), \(i_{2}\) and \(j\), respectively. We reorganize this equation by Theorem 1 as

$$\begin{aligned} I_4^2 -I_2^2 =-22 \end{aligned}$$

where \(I_{4}=I_{1}+I_{3}\). From the definitions of interval equations, we know that the dependence testing problem is to determine whether the interval equation

$$\begin{aligned} I_4^2 -I_2^2 =[-22,-22] \end{aligned}$$

subject to the region \(\varvec{R}=\)

$$\begin{aligned}&1\le I_1 ,I_2 ,I_3 \le 5,\quad 2\le I_4 \le 10 \\&1\le I_1^2 ,I_2^2 ,I_3^2 \le 25 \\&4\le I_4^2 \le 100 \\ \end{aligned}$$

is integer solvable.

We can solve it just like the previous example and obtain the final interval equation

$$\begin{aligned} 0=[-121,-1] \end{aligned}$$

Since zero is not in the interval, no integer-valued solution exists for the equation, and there is no dependence between S\(_{1}\) and S\(_{2}\).
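As before, the verdict can be cross-checked by exhaustive enumeration over the small iteration space:

```python
def second_case_dependent():
    # Exhaustive check of (i1 + j)**2 - i2**2 == -22 over 1 <= i1, j, i2 <= 5.
    return any((i1 + j) ** 2 - i2 ** 2 == -22
               for i1 in range(1, 6)
               for j in range(1, 6)
               for i2 in range(1, 6))
```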

From the analysis process of this case study, we draw the following conclusions. First, the PVI test loses its efficiency due to the mixed polynomials. Second, a simple calculation shows that the coefficient matrix of the quadratic polynomials is not positive semi-definite in either example, so the QP test gives conservative results. Third, the equation in this case clearly does not match the form of Eq. (2), so the Quadratic test cannot handle it effectively. These results illustrate that all existing quadratic dependence testing methods are ineffective for these cases, whereas our approach can analyze them and give exact results.

7 Experimental evaluations

A nonlinear dependence test can be applied more widely if it can analyze more general subscripts. To maximize parallelism in applications, a parallelizing compiler always aims to reduce the conservative results of its dependence tests. Consequently, a dependence test is also expected to find the integer solutions of the system of Eq. (1) when it confirms that the dependence equations are solvable. If a testing technique can always determine whether a dependence equation has integer solutions, it is said to be exact. However, an exact test is inevitably time-consuming, such as the Omega test [8]. To evaluate the performance of our algorithm, we compare our method with other nonlinear dependence testing techniques from different aspects. All the tested programs are executed on personal computers equipped with a 2.0 GHz CPU, 2 GB memory, and RedHat Linux Enterprise 5. The compiler used in the experiments is Open64-5.0.

7.1 Efficiency and applicability

First, let us illustrate the applicability of different nonlinear dependence tests with two practical applications. Figure 1a shows a much-simplified version of the nonlinear array reference codes in TRFD, extracted from the Perfect benchmark suite. As the program contains irregular subscripts, compilers cannot determine its dependences directly. However, the subscripts can be rewritten after interprocedural analysis: an aggressive interprocedural analysis can prove that \(p[i] = i (i -1)/2\), so the optimized codes take the form shown in Fig. 1b, in which the subscript is now quadratic. Figure 2 shows the kernel loop nest of OCEAN, another member of the Perfect benchmark suite, which also contains nonlinear subscripts. We analyze these codes with our algorithm and the other nonlinear dependence tests. The results are listed in Table 2.

Table 2 Testing results of TRFD and OCEAN programs with different nonlinear tests
Fig. 1
figure 1

Nonlinear array references in TRFD

Fig. 2
figure 2

Nonlinear array references in OCEAN

We can see from Table 2 that all nonlinear tests except the Quadratic test can prove that the codes in TRFD are independent. The Quadratic test fails in this case because two loop indexes are involved. This demonstrates that our work can analyze practical nonlinear dependence testing problems efficiently. As for OCEAN, the Quadratic test and PVI test fail to detect whether it is dependent, so both return maybe: the subscripts in Fig. 2 do not match the form of Eq. (2), so the Quadratic test fails, and mixed polynomials exist, causing the PVI test's failure. Based on this analysis, we construct the sets of equations that the different nonlinear dependence tests can analyze and summarize them in Fig. 3, which illustrates that our method is able to handle more general quadratic dependence problems than the Quadratic test and PVI test.

Fig. 3
figure 3

Relationship between different quadratic equation sets of three nonlinear tests

In contrast to the failing tests, our algorithm proves that the program is independent, as the QP test and Range test do. However, the QP test needs to calculate the principal minors of the coefficient matrix \(\varvec{H}\) and solve a quadratic program through a complicated process. As a result, it consumes much more time than the Range test and our method when analyzing OCEAN. Figure 4 shows the time comparison between these three methods, normalized to the time consumed by our technique; the execution time of each dependence testing technique is measured in milliseconds. It illustrates that our method has a much lower time cost than the QP test.

Fig. 4
figure 4

Normalized time comparison between three nonlinear tests

7.2 Comparison with existing nonlinear dependence tests

We can also see from Table 2 that our method performs similarly to the Range test on TRFD and OCEAN. The difference between our method and the Range test can be explained as follows. Compared with the other nonlinear dependence testing methods mentioned in this paper, the Range test is fundamentally different. All the above-mentioned methods determine dependences by searching for, or deciding the existence of, integer-valued solutions, whereas the Range test considers the two subscript expressions \(h\) and \(g\) of a dependence equation individually. If both are monotonic and the maximum value of \(h\), \(h_\mathrm{max}\), is less than the minimum value of \(g\), \(g_\mathrm{min}\), the test reports that there is no dependence. It places no constraint on the form of the subscript expressions, but relies on their monotonicity. Unlike the Range test, our method determines dependences by solving the system of Eq. (1).
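The Range test's disjointness check can be sketched as follows (a simplified single-index illustration under the stated monotonicity assumption; the function name is ours, not the Range test's actual implementation):

```python
def ranges_disjoint(h, g, lo, hi):
    """For monotonic subscript expressions h and g, their extremes on
    [lo, hi] occur at the loop-bound endpoints; report independence
    when the two value ranges do not overlap."""
    h_min, h_max = sorted((h(lo), h(hi)))
    g_min, g_max = sorted((g(lo), g(hi)))
    return h_max < g_min or g_max < h_min   # True -> no dependence

# Writes A[i*i], reads A[i*i + 30] for 1 <= i <= 5: the ranges
# [1, 25] and [31, 55] are disjoint, so the dependence is disproved.
print(ranges_disjoint(lambda i: i * i, lambda i: i * i + 30, 1, 5))  # True
```

Note that when the ranges do overlap, a check of this kind can only return maybe, which is exactly the limitation discussed below.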

It is manifest that the QP test has a higher time complexity, but it is an exact technique whenever it does not return maybe. As mentioned above, the reason is that the QP test always seeks the integer solutions when it proves a dependence. Nevertheless, current parallelizing compilers usually analyze a dependence with a time-saving but imprecise test first, and then use an exact but time-consuming test; the latter serves as a backup when the former fails. This strategy tends to decrease the analysis time while maintaining the accuracy of dependence testing. Take the Open64 compiler as an example. Its testing suite is composed of two main tests, the GCD test [9] and the Omega test. When analyzing a dependence, the compiler always tries the GCD test first and turns to the Omega test only when the GCD test fails. The Open64 compiler also uses some simple testing strategies for certain simplified cases, but it is unnecessary to discuss them here.

Therefore, we use the QP test as an exact backup test. Since the Range test analyzes the two subscript expressions \(h\) and \(g\) of a dependence equation separately, its analysis results cannot be reused by backup tests such as the QP test: if we combined the Range test and the QP test into a test suite, the QP test would have to construct the dependence equations itself when invoked. In contrast, the compiler can invoke the QP algorithm directly, without reconstructing dependence equations, if it uses our method instead of the Range test. Consequently, we compare our method with the Quadratic, PVI and Range tests in this subsection.

To evaluate their performance on quadratic subscripts, we select various nonlinear programs from five numerical packages [10–12]. We find 303 pairs of one-dimensional (1D) cases with constant bounds and quadratic equations, and 168 pairs of multidimensional cases. Here, a 1D case refers to a situation where only one loop index is allowed in an array subscript, while a multidimensional case implies that more than one loop index may appear. Among these multidimensional cases, 67 pairs contain mixed polynomials. We categorize the cases into three groups: 1D, multidimensional without mixed polynomials, and multidimensional with mixed polynomials. The evaluation results are shown in Figs. 5, 6 and 7, which are normalized to the total numbers of 1D cases, multidimensional cases without mixed polynomials, and multidimensional cases with mixed polynomials, respectively.

Fig. 5
figure 5

Testing results of 1D cases

Fig. 6
figure 6

Testing results of multidimensional cases without mixed polynomials

Fig. 7
figure 7

Testing results of multidimensional cases with mixed polynomials

As Fig. 5 shows, the Quadratic test never returns maybe for one-dimensional cases; in other words, for a 1D case the Quadratic test is exact. Our method performs the same as the PVI test, since mixed polynomials cannot appear in this case. Compared with the Quadratic test, our method returns some maybes, since our technique, like the PVI test, needs to check accuracy conditions. However, as Figs. 6 and 7 show, the Quadratic test always fails for multidimensional cases, as the dependence equations do not match the form of Eq. (2). When the tested subscripts contain no mixed polynomials, as shown in Fig. 6, the performance of our algorithm is still the same as that of the PVI test. However, the PVI test loses its efficiency when mixed polynomials appear, as illustrated in Fig. 7, while our technique still works in this case: it can confirm whether there are integer solutions to the equation rather than returning a conservative result.

From the evaluation results, we can see that the Range test can disprove dependences in all cases, and its performance is comparable with that of our method. Nonetheless, it always returns maybe when it cannot prove that there is no dependence, since it only confirms that solutions exist but cannot tell whether an integer solution exists. Our method fixes this problem when it is applicable, although it does not find these integer solutions.

As a consequence of the above discussions, we conclude that our algorithm can perform as well as the other three tests in one-dimensional and mixed polynomials free cases. When mixed polynomials appear, the Quadratic test and PVI test fail but our method still works.

7.3 Working together with the QP test

As described in previous subsections, the QP test is an exact dependence testing technique, but its time complexity is troublesome. To reduce the time spent on quadratic dependence analysis, we implement our algorithm in a parallelizing compiler and combine it with the QP test to constitute a sophisticated dependence testing suite. As a result, not only is the time spent in the dependence testing process reduced, but more spurious dependences can be eliminated, thereby exposing more parallelism in applications. The reasons for choosing our method and the QP test have been explained above.

We implement our algorithm, together with the QP test, on top of the Open64-5.0 compiler. During the quadratic dependence analysis process, the compiler collects the necessary information about the tested array subscripts and constructs the set of quadratic dependence equations. After delivering this system of equations to the dependence testing procedure, the compiler first invokes the algorithm described in this work. If it disproves the dependences, the process does not enter the QP test and the testing results are returned. Otherwise, the compiler transfers the system of equations to the QP test, which finds the integer solutions once it proves the existence of dependences and further determines other dependence information, e.g., dependence directions and distances.
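The cascade just described can be sketched as follows (hypothetical interfaces; the stubs merely stand in for the two real algorithms):

```python
def interval_test(rhs_lo, rhs_hi):
    """Stub for our fast interval-based algorithm: it disproves the
    dependence when the final right-hand interval excludes zero."""
    return "independent" if not (rhs_lo <= 0 <= rhs_hi) else "maybe"

def qp_test(rhs_lo, rhs_hi):
    """Stub for the exact but costly QP backup test."""
    return "dependent"

def combined_test(rhs_lo, rhs_hi):
    """Run the cheap test first; invoke the QP test only on 'maybe'."""
    result = interval_test(rhs_lo, rhs_hi)
    return result if result == "independent" else qp_test(rhs_lo, rhs_hi)

print(combined_test(-121, -1))   # independent (QP test never runs)
print(combined_test(-3, 5))      # dependent (falls through to QP test)
```

The point of the design is that the expensive backup runs only on the residue of cases the cheap test cannot decide, which is what drives the time savings reported below.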

The experiment is divided into two phases. In the first phase, we detect nonlinear dependences with each test individually and record the time cost of dependence analysis; we then test the same programs with the two methods together and record the time spent in the dependence testing procedure. The tested programs are extracted from the Perfect benchmark suites: DYFESM, OCEAN, QCD and TRFD. These programs contain numerous quadratic subscripts and other kinds of nonlinear subscripts that can be transformed into quadratic form, so experiments on them can demonstrate the efficiency of nonlinear dependence tests. In the second phase, we record the dependence testing results of the different tests so as to compare their accuracies. The time cost and testing performance comparisons are shown in Figs. 8 and 9, normalized to the time consumed by our method and to the total number of tested cases, respectively.

Fig. 8
figure 8

Comparison of time costs between different methods

Fig. 9
figure 9

Comparison of testing results between different methods

From Fig. 8, we can see that for the tested programs, the time cost of the QP test far outstrips those of our method and of the two tests combined. When the compiler uses only our method, the time cost is lower than when it uses the two methods together. Figure 9 shows that, compared with the QP test alone, combining the two tests can disprove more dependences. The reason is as follows. When testing a quadratic dependence with the QP test only, the compiler can analyze the problem only when the coefficient matrix is positive semi-definite; if this hypothesis is not satisfied, it has to give a conservative result, whereas our algorithm may still disprove the dependence. Compared with using each method separately, combining the two algorithms finds more independent cases while decreasing the number of maybes. That is to say, the accuracy is enhanced.

8 Theoretical analysis

The previous section verifies the technique's efficiency experimentally, but this only illustrates that the methods are efficient for the data and the programs tested in the experiments; it cannot serve a general purpose. Compared with experimental evaluation, a theoretical evaluation is more persuasive and general. Hence, we perform a theoretical analysis in this section. Since all quadratic dependence testing methods use a subscript-by-subscript approach, the efficiency of testing a single dependence equation is representative of that of testing the whole system of Eq. (1). We therefore only analyze the efficiency of testing a single dependence equation.

As described in Sect. 2, the Quadratic test and QP test are two methods specific to quadratic subscripts. The former can only deal with dependence equations written in the form of Eq. (2), while the latter relies on a positive semi-definite coefficient matrix. When the number of variables of a dependence equation is larger than 2, that is, when the tested subscripts include more than one loop index, the Quadratic test loses its efficiency and gives a conservative result. However, the QP test is still able to give an exact result when its hypothesis is satisfied. Hence, the QP test is more efficient than the Quadratic test in this case.

Now consider the other case. When a dependence equation is written in the form of Eq. (2), the coefficient matrix \(\varvec{H}\) of the quadratic polynomials is certainly positive semi-definite (it has been proved that the QP test follows the same process as the standard form in the case of \(a\le 0\), provided \(e(\varvec{i})=-f(\varvec{i})\)). Besides, with the extension algorithm [6], the QP test can give results as exact as those of the Quadratic test; the QP test is therefore as accurate as the Quadratic test in this case. From the above analysis, we conclude that the QP test is more efficient than the Quadratic test for a general dependence equation (9).

Unlike the Quadratic test and QP test, the PVI test is not a dependence technique specific to quadratic subscripts, but a general nonlinear dependence testing approach. The PVI test can certainly deal with Eq. (2), and it has been shown above that the QP test is more efficient than the Quadratic test, so we compare the QP test and the PVI test. As shown in Table 1, the QP test can determine a dependence equation with mixed polynomials, but its limitation lies in the requirement of a positive semi-definite matrix. Conversely, the PVI test places no restriction on the coefficient matrix, but it requires that there be no mixed polynomials. As a result, when a mixed polynomial appears in the dependence equation, the QP test is more efficient than the PVI test.

Now consider the case without mixed polynomials. It has been proved in [5] that the coefficient matrix \(\varvec{H}\) can be positive semi-definite only when each \(h_{kk}\ge 0\) (\(1\le k\le n\)) is satisfied. The PVI test can determine whether there is an integer-valued solution to the dependence equation, although it does not search for such a solution. So, we consider the PVI test more efficient than the QP test in this case. On the basis of the above discussion, we cannot say that the QP test is always more efficient than the PVI test, or the contrary, for a general dependence equation (2); it depends on whether there are mixed polynomials in the equation.
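The necessary condition \(h_{kk}\ge 0\) is easy to check mechanically. The sketch below (our illustration, not the QP test's code) applies it to the quadratic part \(I_1^2+2I_1I_3+I_3^2-I_2^2\) of the earlier case study:

```python
def diagonal_nonnegative(H):
    """Necessary condition for positive semi-definiteness: every
    diagonal entry h_kk of the coefficient matrix must be >= 0."""
    return all(H[k][k] >= 0 for k in range(len(H)))

# Variable order (I1, I2, I3); the -I2^2 term puts -1 on the diagonal,
# and the mixed term 2*I1*I3 contributes the off-diagonal 1s.
H = [[1, 0, 1],
     [0, -1, 0],
     [1, 0, 1]]
print(diagonal_nonnegative(H))   # False -> the QP test must return maybe
```

A nonnegative diagonal is only necessary, not sufficient, for positive semi-definiteness, but a single negative \(h_{kk}\) already rules the QP test out, as in this example.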

Next, let us compare our proposed method with the PVI test. The common ground of these two techniques is that both are based on the interval equation theory: both first eliminate nonlinear variables via this theory and then determine whether the equation is integer solvable. The PVI test cannot deal with mixed polynomials, as described above, but our method can transform any such equation into canonical form and give an exact result; hence, our method is more efficient than the PVI test in this case. When there is no mixed polynomial in a dependence equation, the PVI test needs to continue eliminating linear variables, while our method only needs to eliminate the introduced or trapezoidal linear variables and call the I test algorithm, because the equation is already in canonical form; the I test then checks no accuracy conditions. So, our method is more efficient in this case as well. Therefore, the method proposed in this paper is more efficient than the PVI test for the general dependence equation (9).

Now we compare our method with the QP test. When the coefficient matrix \(\varvec{H}\) is not positive semi-definite, our algorithm can still give an exact answer, while the QP test cannot. Conversely, when \(\varvec{H}\) is positive semi-definite, the QP test can determine whether there is an integer-valued solution to the dependence equation, whereas our test needs to check accuracy conditions. However, as the experimental results show, the QP test is much more time-consuming than our algorithm, although it is likely able to disprove more dependences than ours.

The difference between our work and the Range test is illustrated in Sect. 7.2, so we do not repeat it here.

Of course, our technique has its limitations. Since we focus on quadratic cases, it can only analyze quadratic subscripts, unlike the PVI test and Range test. The QP test always seeks the integer solutions when it proves the dependence equations, while our technique only determines whether such solutions exist. Compared with the Quadratic test, our drawback is the need to check accuracy conditions in the 1D case. To summarize, we list the advantages and disadvantages of these nonlinear dependence testing techniques in Table 3.

Table 3 Comparisons between our technique with other nonlinear dependence tests

9 Related work

It has been more than 40 years since the first data dependence testing method was proposed. Early testing techniques determine dependences by constructing and solving the Diophantine equation [13] of the subscript expressions. When testing multidimensional arrays with linear subscripts, a subscript position is said to be separable if its indexes do not occur in the other subscripts [14, 15]; if two positions contain the same index, they are coupled [16]. The research community uses a subscript-by-subscript method to analyze separable subscripts.

The GCD test [9] is a dependence testing technique based on a fundamental theorem about the Diophantine equation: if the greatest common divisor of all the coefficients of the loop induction variables does not divide the constant term of the equation, the dependence equation has no integer-valued solution and "no dependence" is reported. The Banerjee inequality [17] is used to determine whether the equation has real solutions in the region of interest. It first calculates the extreme values of the left-hand expression to obtain a real interval, and checks whether the constant on the right-hand side lies in this interval; there is no dependence if this condition cannot be satisfied. Neither the GCD test nor the Banerjee inequality fully respects the region of interest: the former does not consider the bound constraints, while the latter does not require an integer-valued solution. However, they indicated two specific directions for the subsequent methods. Considering the problems of these two methods, Kong et al. [7] proposed the I test, which combines their advantages. Its main idea is also to eliminate variables from the dependence equation with the interval equation theory. One limitation of the I test is that the constraints of each variable have to be integers.
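For concreteness, both classical checks can be sketched in a few lines of Python (our rendering of the textbook formulations, not any compiler's implementation):

```python
from math import gcd
from functools import reduce

def gcd_test(coeffs, c):
    """GCD test: a1*x1 + ... + an*xn = c has integer solutions iff
    gcd(a1, ..., an) divides c (loop bounds ignored)."""
    g = reduce(gcd, (abs(a) for a in coeffs), 0)
    return c % g == 0 if g else c == 0

def banerjee_test(coeffs, c, bounds):
    """Banerjee inequality: real solutions exist inside the bounds iff
    c lies between the extreme values of the linear form (integrality
    ignored)."""
    lo = sum(a * (L if a > 0 else U) for a, (L, U) in zip(coeffs, bounds))
    hi = sum(a * (U if a > 0 else L) for a, (L, U) in zip(coeffs, bounds))
    return lo <= c <= hi

# 2*i - 2*j = 1: gcd 2 does not divide 1 -> independent.
print(gcd_test([2, -2], 1))                             # False
# i - j = 100 with 1 <= i, j <= 10: 100 outside [-9, 9] -> independent.
print(banerjee_test([1, -1], 100, [(1, 10), (1, 10)]))  # False
```

The two examples show the complementary blind spots described above: the GCD test ignores the bounds, while the Banerjee inequality ignores integrality; the I test was designed to close both gaps at once.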

All these methods are subscript-by-subscript techniques. If the subscripts are coupled, the system of Eq. (1) may be unsolvable even if each equation has an integer-valued solution. Hence, the research community developed dependence testing techniques for coupled subscripts. Knuth [18] extended the GCD test algorithm and proposed a general algorithm to solve a set of linear equations with integer coefficients; just like the GCD test, this method does not take the bound constraints into account. The \(\lambda \) test [16, 19] is the first technique to consider all the positions of coupled subscripts together. The algorithm, which combines all equations into one with some \(\lambda \)s, can be viewed as the multidimensional form of the Banerjee inequality. Since it introduces new variables, the algorithm can decide the equation only when all \(\lambda \)s can be eliminated. Focusing on the problem left by the general GCD test, the Power test [20] was presented: when the general GCD test returns solutions to a dependence equation, the Power test determines whether these solutions are in the region of interest using Fourier–Motzkin elimination (FME) [21]. The Omega test [8] is a similar approach; it also tries to solve an integer programming problem by FME. As integer programming is well known to be NP-hard, such techniques tend to have a high time complexity. The Delta test [22] is an exact and efficient testing method. Its main idea is to propagate the constraints produced by some single index variable (SIV) subscripts to others in the same group without losing accuracy. It has been proved that the Delta test is just a restricted form of the \(\lambda \) test [1].

The above techniques can determine the dependences of linear subscripts well in most cases. So, some developers began to evaluate these tests in the early 1990s in order to construct a complete dependence test suite. Shen et al. [23] evaluated some dependence tests and analyzed their efficiencies experimentally, performing a preliminary empirical study on some numerical packages, including Linpack and Eispack. Petersen and Padua [24] also carried out an experimental evaluation of a proposed sequence of dependence tests based on packages such as Linpack, Eispack and the Perfect Benchmarks. Besides evaluating accuracy, Psarris and Kyriakopoulos [25] also showed the tradeoffs between accuracy and efficiency and the time costs of three representative techniques. All these studies tried to give experimental or empirical results so as to enhance the ability of compilers. Guided by this issue, some newer data dependence tests combined a suite of testing methods for different cases. Goff et al. [22] divided the subscripts into three categories, namely zero index variable (ZIV), SIV and multiple index variable (MIV), and presented a set of testing methods to deal with these different cases. Maydan et al. [26] also proposed a subscript pattern-based dependence test suite, adopting FME as an expensive backup test when all the primary methods failed. All the above tests are designed under the assumption that the subscripts are linear affine expressions of the loop index variables, while the algorithm proposed in this paper, as well as the Quadratic test [3], QP test [5, 6], PVI test [4] and Range test [2], are nonlinear dependence tests.

In addition, there are some other specific data dependence analysis techniques. Hommel [27] proposed a general data dependence test for dynamic, pointer-based data structures based on a logic-proof idea; it can also be viewed as a nonlinear test. Paek et al. [28, 29] categorized the nonlinear subscripts and proposed an array access pattern, the LMAD, in which a subscript is represented as a general expression; the ART test [30, 31] was described to address the dependence problem based on it. van Engelen et al. [32] presented a unified approach for nonlinear dependence testing, proposing the Nonlinear GCD test, the Nonlinear Value Range test and the Nonlinear Extreme Value test (EVT) for different cases.

The literature [33] proposed a knowledge-based learning system, the K test, which integrates existing tests and makes good use of their advantages. A single instruction multiple data (SIMD) oriented dependence test, the D test, was proposed in [34]; it extended the Banerjee inequality and proved some theorems for the case where the dependence distance is greater than or equal to the number of data items processed in a SIMD register.

With the development of speculative parallelizing compilers, the research community began to study data dependence profiling, which provides runtime dependence information for speculative optimizations. Chen et al. [35] proposed a hash-based rather than pair-wise data dependence profiling technique so as to perform compiler-based instrumentation. To take full advantage of multi-core architectures, shadow profiling [36] was introduced to perform sampling on long traces of instrumented code. Other representative data dependence profilers can mainly be categorized into pair-wise profilers [37, 38], which try to track all pairs of dependences that occur at runtime, and software signature-based profilers [39], which determine dependences by grouping many pair-wise relationships into a single set operation.

10 Conclusions

Conventional linear data dependence tests cannot satisfy the demands of scientific and engineering applications due to the existence of irregular and nonlinear subscripts. To address this problem, the research community has proposed some nonlinear tests, among which the QP test and PVI test are two representative methods. However, both have their own limitations in practice. This paper proposed an improved nonlinear data dependence test, which can deal with the cases where the PVI test fails and can reduce the time cost of dependence testing when combined with the QP test. Experimental results show that, compared with existing quadratic dependence analysis techniques, the method proposed in this paper either has a wider range of application or has a much lower time complexity. As this technique is designed for quadratic subscripts, our next research plan is to extend the algorithm to more complicated nonlinear cases.