2.1 Rectangular Spectra

We have developed a satisfactory theory for the Peaceman–Rachford ADI iterative solution of model problems where matrices H and V have the same spectral intervals. A simple example demonstrates that improvement is possible when these intervals differ. Let the interval for H be [0.001, 4] and for V be [0.025, 4]. A prescribed error reduction yields a value for the nome q. Referring to Eqs. 1–43, we find that the number of iterations varies as K ∕ K′. When k′ ≪ 1 this varies as \(\ln \frac{4} {k^\prime}\). For straightforward use of Eq. 2 of Chap. 1, we would choose parameters for the eigenvalue interval [0.001, 4], and the number of iterations would be \(J\doteq s\,\ln \frac{4} {{k}^{{\prime}}}= s\,\ln \frac{16} {0.001} = 9.68s\) for some constant s depending on the prescribed error reduction. Suppose we redefine H and V by adding \(\frac{c-a} {2} = 0.012\) times the identity matrix to H and subtracting this from V. The new eigenvalue intervals are [0.013, 4.012] and [0.013, 3.988]. We find that for these intervals \(J\doteq s\,\ln \frac{16.048} {0.013} = 7.12s\), and we have a significant gain in efficiency.
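As a quick numeric check (an illustrative sketch, not from the original development), both estimates follow from \(J\doteq s\,\ln \frac{4}{k^\prime}\) with k′ equal to the ratio of the interval endpoints:

```python
# Quick numeric check (illustrative) of the iteration-count estimates above:
# J ~ s * ln(4/k'), with k' = (lower spectral bound)/(upper spectral bound).
import math

print(math.log(4 * 4 / 0.001))       # original interval [0.001, 4]:    ~9.68
print(math.log(4 * 4.012 / 0.013))   # shifted interval  [0.013, 4.012]: ~7.12
```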

Inspection of Eq. 2 of Chap. 1 reveals that this is equivalent to retaining the original H and V matrices but using different iteration parameters in Eqs. 1–2.1 and 1–2.2. If the parameters for the redefined matrices are \(p_{j}\), then we could use \(p_{j}^{{\prime}} = p_{j} + 0.012\) with the original H in Eq. 1–2.1 and \(q^\prime_{j} = p_{j} - 0.012\) with the original V in Eq. 1–2.2. We therefore generalize the Peaceman–Rachford equations to Eq. 3 of Chap. 1 (with matrix F equal to the identity for the present). One now considers optimization of these generalized equations. In our illustrative example, the simple shift led to almost identical eigenvalue ranges, and little gain could be achieved by further optimization. However, suppose the intervals for the eigenvalues of H and V were [0.01, 1] and [1, 100]. The shift to equate lower bounds at 0.505 leads to upper bounds of 1.495 and 99.505. This gives a partial improvement from k′ = 0.0001 to k′ = 0.005, but greater improvement is possible. Before describing how this is accomplished, we consider the ADI minimax problem for Eqs. 3 of Chap. 1. The spectral radius of the generalized ADI iteration (GADI) matrix is

$$\displaystyle{\rho (G_{J})\ =\mathop{\max }\limits_{\lambda \ ,\ \gamma }\left \vert \prod _{j=1}^{J}\frac{(q_{j}-\lambda )(p_{j}-\gamma )} {(p_{j}+\lambda )(q_{j}+\gamma )}\right \vert \ ,}$$

where λ ranges over the eigenvalues of \(F^{-1}H\) and γ ranges over the eigenvalues of \(F^{-1}V\).

When F is the identity matrix, this is the 2-norm of the ADI iteration matrix. The B-norm of a vector v for any SPD matrix B is defined as the square root of the inner product (v, B v). The subordinate matrix norm is called the B-norm of the matrix. In general, the spectral radius of the ADI iteration matrix G J is equal to the F-norm of G J , and we choose to define our minimax problem as minimization of this norm. This norm is equal to the 2-norm of \({F}^{\frac{1} {2} }G_{J}{F}^{-\frac{1} {2} }\) which is equal to ρ(G J ). Thus, the minimax problem for the generalized ADI equations is for a given J to choose sets of iteration parameters p j and q j to minimize ρ(G J ). The role of matrix F will be developed later. For the present, we choose F as the identity matrix. Suppose λ and γ both vary over the same interval. Then we may revert to Eq. 2 of Chap. 1 by choosing p j  = q j . It happens that this choice is optimal. Although this seems evident from symmetry considerations with respect to the two eigenvalue variables, the proof is not trivial and will be given subsequently. It follows that the additional degrees of freedom in Eq. 3 of Chap. 1 lead to a more efficient scheme only when the eigenvalue intervals differ.
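The minimax quantity just defined is easy to evaluate numerically for trial parameter sets. The following minimal sketch (not from the text; the function name and grid resolution are arbitrary choices) scans λ and γ over their intervals and returns the maximum of the product in the spectral-radius expression above:

```python
# Minimal sketch (illustrative): evaluate the GADI spectral-radius expression
# by scanning lambda and gamma over their eigenvalue intervals on a grid.
import numpy as np

def rho_gadi(p, q, lam_interval, gam_interval, npts=1001):
    lam = np.linspace(*lam_interval, npts)[:, None]   # eigenvalues of F^{-1}H
    gam = np.linspace(*gam_interval, npts)[None, :]   # eigenvalues of F^{-1}V
    prod = np.ones((npts, npts))
    for pj, qj in zip(p, q):
        prod *= (qj - lam) * (pj - gam) / ((pj + lam) * (qj + gam))
    return np.abs(prod).max()

# When the two intervals coincide, choosing p_j = q_j reverts to Eq. 2 of Chap. 1:
print(rho_gadi([0.1, 0.5], [0.1, 0.5], (0.01, 1.0), (0.01, 1.0)))
```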

2.2 W.B. Jordan’s Transformation

The algorithm for \(J = 2^{n}\) was generalized by Jordan to yield optimum parameters when the eigenvalue intervals were [a, b] and [c, d] with a + c > 0 [Wachspress, 1963, 1966]. Before each reduction of order (fan-in) by spectrum folding, the spectra were shifted by adding a constant to one and subtracting that constant from the other so that the product of the endpoints was identical for the shifted spectra. This enabled a folding that preserved the original form of the error function. Significant improvement was demonstrated when these intervals differed widely. Just as the earlier algorithm with its AGM theme stimulated W.B. Jordan to develop the elliptic-function theory for all J for Eq. 2 of Chap. 1, the generalized algorithm for Eq. 3 of Chap. 1 led Jordan to solution of this minimax problem. He found a transformation of variables which preserved the form of the minimax problem but with identical ranges for the new variables [Wachspress, 1963, 1966].

A linear fractional transformation y = B(z) is of the form

$$\displaystyle{ y = \frac{\alpha z+\beta } {\gamma z+\delta }. }$$
(1)

The composite transformation B(z) = B 2[B 1(z)] is isomorphic to matrix multiplication with

$$\displaystyle{ B \sim \left [\begin{array}{*{10}c} \alpha &\beta \\ \gamma &\delta \\ \end{array} \right ]. }$$
(2)

Thus, the composite transformation is obtained with B = B 2 B 1. Moreover, if we define \(B_{-}(z) = B(-z)\), then

$$\displaystyle{ B_{-} = B\left [\begin{array}{*{10}c} -1&0\\ 0 &1\\ \end{array} \right ]. }$$
(3)

The two-variable ADI minimax problem is to find the parameters p j and q j which minimize the maximum absolute value of the function

$$\displaystyle{ g(x,y,\mathbf{p},\mathbf{q}) =\prod _{ j=1}^{J}\frac{(x - q_{j})(y - p_{j})} {(x + p_{j})(y + q_{j})} }$$
(4)

for x ∈ [a, b] and y ∈ [c, d], where a + c > 0. Define the linear fractional transformation

$$\displaystyle{R_{j}(z) = \frac{z - q_{j}} {z + p_{j}}.\quad \text{ Then }\quad \frac{(x - q_{j})(y - p_{j})} {(x + p_{j})(y + q_{j})} = \frac{R_{j}(x)} {R_{j}(-y)}.}$$

We now seek a relationship between transformations \(B_{1}\) and \(B_{2}\) such that when we define \(x = B_{1}({x}^{{\prime}})\) and \(y = B_{2}({y}^{{\prime}})\) there exist \({\mathbf{p}}^{{\prime}},\ {\mathbf{q}}^{{\prime}}\) such that \(g(x,y,\mathbf{p},\mathbf{q}) = g({x}^{{\prime}},{y}^{{\prime}},{\mathbf{p}}^{{\prime}},{\mathbf{q}}^{{\prime}})\). This can be accomplished if for each j

$$\displaystyle{ \frac{R_{j}(x)} {R_{j}(-y)} = \frac{R_{j}[B_{1}({x}^{{\prime}})]} {R_{j}[-B_{2}({y}^{{\prime}})]} = \frac{S_{j}({x}^{{\prime}})} {S_{j}(-{y}^{{\prime}})} }$$
(5)

for some linear fractional transformation S j .

The matrix isomorphism yields R j B 1 = S j and \(R_{j-}B_{2} = S_{j-}\). Thus, \(R_{j}B_{1} = (R_{j-}B_{2})_{-}\ \), and it follows that

$$\displaystyle{ R_{j}B_{1} = R_{j}\ \left [\begin{array}{*{10}c} -1&0\\ 0 &1\\ \end{array} \right ]B_{2}\left [\begin{array}{*{10}c} -1&0\\ 0 &1\\ \end{array} \right ], }$$
(6)

and multiplying on the left by \(R_{j}^{-1}\), we find that our goal is achieved when

$$\displaystyle{ B_{1} = \left [\begin{array}{*{10}c} -1&0\\ 0 &1\\ \end{array} \right ]B_{2}\left [\begin{array}{*{10}c} -1&0\\ 0 &1\\ \end{array} \right ]. }$$
(7)

This yields the desired relationship between B 1 and B 2:

$$\displaystyle{\text{If }B_{1}\ =\ \left [\begin{array}{*{10}c} \alpha &\beta \\ \gamma &\delta \\ \end{array} \right ]\ \text{, then }B_{2}\ =\ \left [\begin{array}{*{10}c} \alpha &-\beta \\ -\gamma & \delta \\ \end{array} \right ].}$$
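As a small numerical check of Eqs. 5–7 (illustrative only; the matrices below are arbitrary and are not the Jordan matrices derived next), one can verify that with \(B_{2}\) related to \(B_{1}\) as above, the ratio \(R_{j}(x)/R_{j}(-y)\) is preserved under x = B 1(x′), y = B 2(y′) with \(S_{j} = R_{j}B_{1}\):

```python
# Numerical check (arbitrary illustrative values) of Eqs. 5-7:
# with B2 = D B1 D, D = diag(-1, 1), the ratio R(x)/R(-y) is preserved
# under x = B1(x'), y = B2(y'), with S = R B1.
import numpy as np

def lft(M, z):
    """Apply the linear fractional transformation with coefficient matrix M."""
    a, b, c, d = M.ravel()
    return (a * z + b) / (c * z + d)

D = np.diag([-1.0, 1.0])
B1 = np.array([[2.0, 1.0], [0.5, 3.0]])
B2 = D @ B1 @ D
p, q = 1.3, 0.8
R = np.array([[1.0, -q], [1.0, p]])          # R(z) = (z - q)/(z + p)
S = R @ B1                                   # composition = matrix product

xp, yp = 0.4, 0.9
x, y = lft(B1, xp), lft(B2, yp)
print(lft(R, x) / lft(R, -y), lft(S, xp) / lft(S, -yp))   # the two ratios agree
```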

In Chap. 1 it was demonstrated that the optimum parameters for the one-variable problem with x ∈ [a, b] are \(p_{j} = q_{j} = b\,\mathrm{dn}[\frac{(2j-1)K} {2J} ,k]\), where dn[z, k] is the Jacobian elliptic dn-function of argument z and modulus \(k = \sqrt{1 -{ {k}^{{\prime} } }^{2}}\). Here, k′ is the complementary modulus, which in this application is equal to \(\frac{a} {b}\). Having this result in mind, Jordan chose to normalize the common interval of x and y to [k′, 1]. We now derive Jordan’s result, which is that there is a unique k′ < 1 and transformation matrix \(B_{1}\) which accomplish this task. The four conditions, when \(x = a,\ {x}^{{\prime}} = {k}^{{\prime}}\); when \(x = b,\ {x}^{{\prime}} = 1\); when \(y = c,\ {y}^{{\prime}} = {k}^{{\prime}}\); and when \(y = d,\ {y}^{{\prime}} = 1\), yield the homogeneous matrix equation C ϕ = 0, where \({\phi }^{T} = [\alpha ,\gamma ,\beta ,\delta ]\) and

$$\displaystyle{ C = \left [\begin{array}{*{10}c} {k}^{{\prime}}&-a{k}^{{\prime}}& 1 &-a \\ 1 & -b & 1 & -b \\ {k}^{{\prime}}& \ c{k}^{{\prime}} &-1& -c \\ 1 & \ d &-1&-d\\ \end{array} \right ]. }$$
(8)

This system has a nontrivial solution only when the determinant of matrix C vanishes. It will be shown that there are only two values of k′ for which this occurs, one greater than unity and the other less than unity. We first define the three matrices:

$$\displaystyle{ K = \left [\begin{array}{*{10}c} {k}^{{\prime}}&0 \\ 0 &1\\ \end{array} \right ],\quad A = \left [\begin{array}{*{10}c} 1&-a \\ 1& -b\\ \end{array} \right ],\text{ and }F = \left [\begin{array}{*{10}c} 1& c\\ 1 &d\\ \end{array} \right ]. }$$
(9)

Then

$$\displaystyle{ C = \left [\begin{array}{*{10}c} KA& A\\ KF &-F\\ \end{array} \right ] = \left [\begin{array}{*{10}c} KA{F}^{-1} & 0 \\ K &-I\\ \end{array} \right ]\left [\begin{array}{*{10}c} F & F{A}^{-1}{K}^{-1}A \\ 0\ &F + KF{A}^{-1}{K}^{-1}A\\ \end{array} \right ]. }$$
(10)

Since A, F, and K are nonsingular, C is singular only when \(F + KF{A}^{-1}{K}^{-1}A = (F{A}^{-1}K + KF{A}^{-1}){K}^{-1}A\) is singular or when \(G\mathop{ =}\limits^{ \text{def}}F{A}^{-1}K + KF{A}^{-1}\) is singular. We determine that

$$\displaystyle{ G = \frac{1} {a - b}\left [\begin{array}{*{10}c} -2{k}^{{\prime}}(b + c) &(1 + {k}^{{\prime}})(a + c) \\ -(1 + {k}^{{\prime}})(b + d)& 2(a + d)\\ \end{array} \right ]. }$$
(11)

Let \(\tau = \frac{2(a+d)(b+c)} {(a+c)(b+d)}\). Then det(G) = 0 when k′ satisfies the quadratic equation:

$$\displaystyle{{ {k}^{{\prime}}}^{2} - 2(\tau -1){k}^{{\prime}} + 1 = 0. }$$
(12)

Now define the positive quantity

$$\displaystyle{ m\ \mathop{ =}\limits^{ \text{def}}\ \frac{2(b - a)(d - c)} {(a + c)(b + d)}. }$$
(13)

It is easily shown that \(\tau -1 = m + 1\) and the solution to Eq. 12 which is less than unity is

$$\displaystyle{ {k}^{{\prime}}\ =\ \frac{1} {1 + m + \sqrt{m(2 + m)}}. }$$
(14)

The other solution is its reciprocal, which is greater than unity. From Eq. 10,

$$\displaystyle{ \left [\begin{array}{*{10}c} F &F{A}^{-1}{K}^{-1}A \\ \mathbf{0} & G{K}^{-1}A\\ \end{array} \right ]\left [\begin{array}{*{10}c} \alpha \\ \gamma \\ \beta \\ \delta \\ \end{array} \right ] = \mathbf{0}\text{ and }G{K}^{-1}A\left [\begin{array}{*{10}c} \beta \\ \delta \\ \end{array} \right ] = \mathbf{0}.}$$
(15)

We have

$$\displaystyle{ G{K}^{-1}A = \left [\begin{array}{*{10}c} (1 + {k}^{{\prime}})(a + c) - 2(b + c)&2a(b + c) - b(1 + {k}^{{\prime}})(a + c) \\ -\frac{1+{k}^{{\prime}}} {{k}^{{\prime}}} (b + d) + 2(a + d) & \frac{1+{k}^{{\prime}}} {{k}^{{\prime}}} a(b + d) - 2b(a + d)\\ \end{array} \right ]. }$$
(16)

We now define \(\sigma = 2(a + d)/(b + d)\) and obtain from the second row of Eq. 16:

$$\displaystyle{ [-(1 + {k}^{{\prime}}) +\sigma {k}^{{\prime}}]\beta + [a(1 + {k}^{{\prime}}) - b\sigma {k}^{{\prime}}]\delta = 0. }$$
(17)

We preempt division by zero by setting

$$\displaystyle{ \delta = (1 + {k}^{{\prime}}-\sigma {k}^{{\prime}})\text{ and }\beta = a(1 + {k}^{{\prime}}) - b\sigma {k}^{{\prime}}. }$$
(18)

The first row of C in Eq. 10 yields the relationship

$$\displaystyle{ KA\left [\begin{array}{*{10}c} \alpha \\ \gamma \\ \end{array} \right ]+A\left [\begin{array}{*{10}c} \beta \\ \delta \\ \end{array} \right ] = \mathbf{0}\ , }$$
(19)

from which we obtain

$$\displaystyle{ {k}^{{\prime}}(a\gamma -\alpha ) =\beta -a\delta \text{ and }(b\gamma -\alpha ) =\beta -b\delta. }$$
(20)

Substituting the values for β and δ given in Eq. 18, we get

$$\displaystyle{ \alpha = b\sigma - a(1 + {k}^{{\prime}})\text{ and }\gamma =\sigma -(1 + {k}^{{\prime}}). }$$
(21)

We must show that the transformation matrices B 1 and B 2 are nonsingular or that αδ − βγ ≠ 0 for any intervals [a, b] and [c, d] for which a + c > 0. We have

$$\begin{array}{rlrlrl} \alpha \,\delta -\beta \,\gamma & = [b\sigma - a(1 + k^\prime)](1 + k^\prime -\sigma k^\prime) - [a(1 + k^\prime) - b\sigma k^\prime][\sigma -(1 + k^\prime)] & & \\ & =\sigma (b - a)(1 -{ k^\prime}^{2}) > 0 &\end{array}$$
(22)

We must also show that B 1 transforms the interior of [k′, 1] into the interior of [a, b] and that B 2 transforms the interior of [k′, 1] into the interior of [c, d]. Since the transformations were generated to transform the endpoints properly, we need only show that one point x′ outside of [k′, 1] is such that B 1(x′) is outside [a, b] and one point y′ outside [k′, 1] is such that B 2(y′) is outside [c, d]. First, we consider the case where γ = 0. We have \(\sigma = 1 + k^\prime,\alpha = (b - a)(1 + k^\prime),\delta = 1 -{ k^\prime}^{2},\,\text{and}\,\,\beta = (1 + k^\prime)(a - bk^\prime)\). It follows that

$$\displaystyle{ B_{1}(x^\prime) = \frac{(b - a)(1 + k^\prime)x^\prime + (1 + k^\prime)(a - bk^\prime)} {(1 -{ k^\prime}^{2})} = \frac{(b - a)x^\prime + (a - bk^\prime)} {(1 - k^\prime)}. }$$
(23)

Thus, x′ = ∞ transforms into x = ∞. The corresponding expression for B 2(y′) differs only in a negative sign for the second term in the numerator. Thus y′ = ∞ transforms into y = ∞. The case of γ = 0 is thus resolved. When γ ≠ 0, we choose \(x^\prime = -\frac{\delta }{\gamma }\) and \(y^\prime = \frac{\delta }{\gamma }\) so that B 1(x′) = ∞ and B 2(y′) = ∞. We then obtain from Eqs. 18 and 21:

$$\displaystyle{ \left \vert \frac{\delta } {\gamma }\right \vert = \left \vert \frac{1 + k^\prime -\sigma k^\prime} {1 + k^\prime-\sigma } \right \vert = \left \vert \frac{1} {1 - \frac{\sigma (1-k^\prime)} {1+k^\prime-\sigma k^\prime}}\right \vert. }$$
(24)

This is greater than unity when \(0 < \frac{\sigma (1-k^\prime)} {1+k^\prime-\sigma k^\prime} < 2\). We note from the definition of σ that 0 < σ < 2. It follows that \(1 + k^\prime -\sigma k^\prime > 1 - k^\prime > 0\) and hence that

$$\displaystyle{ 0 < \frac{\sigma (1 - k^\prime)} {1 + k^\prime -\sigma k^\prime} < \frac{\sigma (1 - k^\prime)} {1 - k^\prime} =\sigma < 2, }$$
(25)

as was to be shown. We have proved that the points at infinity for x and y correspond to points outside [k′, 1] and hence that B 1([k′, 1]) = [a, b] and B 2([k′, 1]) = [c, d].

The formulas derived here are of a simpler form than those given in [Wachspress, 1963, 1966]. The two formulations do, however, give identical iteration parameters. The iteration parameters for J iterations over the interval [k′, 1] are \(w_{j} = dn[(2j - 1)K/2J,k]\). To determine p j and q j from w j , we equate the roots of g(x, y, p, q) and \(g({x}^{{\prime}},{y}^{{\prime}},\mathbf{w},\mathbf{w})\) to obtain \({x}^{{\prime}}-w_{j} = B_{1}^{-1}(x)-w_{j} = 0\) when \(x = B_{1}(w_{j}) = q_{j}\), and \({y}^{{\prime}}-w_{j} = B_{2}^{-1}(y)-w_{j} = 0\) when \(y = B_{2}(w_{j}) = p_{j}\). Thus,

$$\displaystyle{ p_{j} = \frac{\alpha w_{j}-\beta } {-\gamma w_{j}+\delta },\text{ and }q_{j} = \frac{\alpha w_{j}+\beta } {\gamma w_{j}+\delta }. }$$
(26)

The possibility of significant gain in efficiency is illustrated by the following example: Let the intervals be [0.01, 10] and [100, 1000]. For Eq. 2 of Chap. 1, we would use \({k}^{{\prime}} = \frac{0.01} {1000}\), which yields J varying as \(\ln \frac{4} {{k}^{{\prime}}} = 12.9\). The transformation equations yield

$$\begin{array}{rlrlrl} m & = \frac{2(10 - 0.01)(1000 - 100)} {(0.01 + 100)(10 + 1000)} = 0.17802, & & \\ {k}^{{\prime}} & = \frac{1} {1 + m + \sqrt{m(2 + m)}} = 0.555. & & \end{array}$$

Now \(\ln \frac{4} {{k}^{{\prime}}} = 1.97\) and the number of iterations is reduced by a factor of \(\frac{12.9} {1.97} = 6.53\).
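The whole construction is compact enough to sketch in a few lines of code. The following is a minimal computational sketch (not from the book; the function names are invented, and SciPy's convention that ellipk and ellipj take the parameter m = k² is assumed) that reproduces k′ ≈ 0.555 for this example and then produces the parameters of Eq. 26:

```python
# Minimal sketch of Jordan's transformation and the resulting ADI parameters.
# Assumes SciPy's convention: ellipk(m), ellipj(u, m) with parameter m = k**2.
import numpy as np
from scipy.special import ellipk, ellipj

def jordan_transform(a, b, c, d):
    """Return k' and the coefficients (alpha, beta, gamma, delta) of B_1."""
    m = 2.0 * (b - a) * (d - c) / ((a + c) * (b + d))      # Eq. 13
    kp = 1.0 / (1.0 + m + np.sqrt(m * (2.0 + m)))          # Eq. 14
    sigma = 2.0 * (a + d) / (b + d)
    delta = 1.0 + kp - sigma * kp                          # Eq. 18
    beta = a * (1.0 + kp) - b * sigma * kp
    alpha = b * sigma - a * (1.0 + kp)                     # Eq. 21
    gamma = sigma - (1.0 + kp)
    return kp, alpha, beta, gamma, delta

def gadi_parameters(a, b, c, d, J):
    """Optimum p_j, q_j for spectral intervals [a,b] (x) and [c,d] (y), Eq. 26."""
    kp, al, be, ga, de = jordan_transform(a, b, c, d)
    K = ellipk(1.0 - kp ** 2)
    u = (2.0 * np.arange(1, J + 1) - 1.0) * K / (2.0 * J)
    w = ellipj(u, 1.0 - kp ** 2)[2]                        # dn[(2j-1)K/2J, k]
    p = (al * w - be) / (-ga * w + de)                     # p_j = B_2(w_j)
    q = (al * w + be) / (ga * w + de)                      # q_j = B_1(w_j)
    return p, q

kp, *_ = jordan_transform(0.01, 10.0, 100.0, 1000.0)
print(kp)                                                  # ~0.555, as above
print(gadi_parameters(0.01, 10.0, 100.0, 1000.0, 3))
```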

We also note that the generalized formulation only requires that matrix A be SPD. This ensures a + c > 0 and allows a splitting with either a or c less than zero. Convergence rate and relationships among J, k′, and R are established in the transformed space.

2.3 The Three-Variable ADI Problem

Analysis of ADI iteration for three space variables is less definitive. Let X, Y, Z be the commuting components of the matrix A which are associated with line sweeps parallel to the x, y, z axes, respectively. Douglas [1962] proposed the iteration

$$\begin{array}{rlrlrl} (X + p_{j}I)\mathbf{u}_{j-2/3} & = -2\left (Y + Z + \frac{X} {2} -\frac{p_{j}} {2} I\right )\mathbf{u}_{j-1} + 2\mathbf{b}, &\end{array}$$
(27.1)
$$\begin{array}{rlrlrl} (Y + p_{j}I)\mathbf{u}_{j-1/3} & = Y \mathbf{u}_{j-1} + p_{j}\mathbf{u}_{j-2/3}, &\end{array}$$
(27.2)
$$\begin{array}{rlrlrl} (Z + p_{j}I)\mathbf{u}_{j} & = Z\mathbf{u}_{j-1} + p_{j}\mathbf{u}_{j-1/3}. &\end{array}$$
(27.3)

Although Douglas suggested methods for choosing parameters 30 years ago, I am unaware at this time of any determination of optimum parameters as a function of spectral bounds. Moreover, error reduction as a function of parameter choice is not easily computed a priori. Perhaps a thorough literature search would uncover more extensive analysis. Rather than pursue this approach, we shall consider an alternative which allows a more definitive analysis.

Two of the three commuting matrices may be treated jointly. Let these be designated as H and V and let the third be P. We wish to solve the system

$$\displaystyle{ A\mathbf{u} \equiv (H + V + P)\mathbf{u} = \mathbf{b}. }$$
(28)

The standard ADI iteration

$$\begin{array}{rlrlrl} (H + V + p_{j}I)\mathbf{u}_{j-1/2} & = (p_{j}I - P)\mathbf{u}_{j-1} + \mathbf{b} &\end{array}$$
(29.1)
$$\begin{array}{rlrlrl} (P + q_{j}I)\mathbf{u}_{j} & = (q_{j}I - H - V )\mathbf{u}_{j-1/2} + \mathbf{b} &\end{array}$$
(29.2)

applies when solution of Eq. 29.1 is expedient, but this is not often the case. The analysis is simplified when applied in the transformed space where the eigenvalue intervals of X′ ≡ H′ + V′ and of Z′ ≡ P′ are both [k′, 1]. In this space the iteration parameters for the two sweeps are the same, and Eqs. 29 become

$$\begin{array}{rlrlrl} (X^\prime + w_{j}I)\mathbf{u}_{j-1/2} & = (w_{j}I - Z^\prime)\mathbf{u}_{j-1} + \mathbf{b}, &\end{array}$$
(30.1)
$$\begin{array}{rlrlrl} (Z^\prime + w_{j}I)\mathbf{u}_{j} & = (w_{j}I - X^\prime)\mathbf{u}_{j-1/2} + \mathbf{b}. &\end{array}$$
(30.2)

Suppose we approximate \(\mathbf{u}_{j-1/2}\) by standard ADI iteration applied to the commuting matrices \((H^\prime + \frac{w_{j}} {2} I)\) and \((V ^\prime + \frac{w_{j}} {2} I)\). If this “inner” ADI iteration matrix is T j , then Eq. 30.1 is replaced by

$$\displaystyle{ \mathbf{u}_{j-1/2} = T_{j}\mathbf{u}_{j-1} + (I - T_{j}){(X^\prime + w_{j}I)}^{-1}[(w_{ j}I - Z^\prime)\mathbf{u}_{j-1} + \mathbf{b}]. }$$
(31)

The error vector e j  ≡ u j  − u after the double sweep of Eqs. 30 is L j e j − 1, where

$$\displaystyle{ L_{j} = {(Z^\prime + w_{j}I)}^{-1}{(X^\prime + w_{ j}I)}^{-1}(w_{ j}I - X^\prime)[(w_{j}I - Z^\prime) + T_{j}(X^\prime + Z^\prime)]. }$$
(32)

T j commutes with X′ and Z′. Let the error reduction of the inner ADI iteration be ɛ j . If this value is not sufficiently small, the iteration can diverge. This is illustrated by considering a limiting case of the eigenvector whose X′-eigenvalue is 1 and whose Z′-eigenvalue is k′. The corresponding eigenvalue of T j is ɛ j . The corresponding eigenvalue of L j is

$$\displaystyle{ \lambda = \frac{(w_{j} - 1)[w_{j} - k^\prime +\varepsilon _{j}(1 + k^\prime)]} {(w_{j} + 1)(w_{j} + k^\prime)}. }$$
(33)

For one of the outer ADI iterations, w j can be close to k′ and thus small compared to unity. We consider the case where \(w_{j}\doteq k^\prime\). Then \(\vert \lambda \vert \doteq\frac{\varepsilon _{j}} {2k^\prime}\). We observe that ɛ j must be less than 2k′ for this eigenvalue to be less than unity. The composite J-step outer ADI iteration may still converge, but convergence can be seriously hampered by insufficient convergence of the inner ADI iteration. When sufficient inner ADI iterations are performed to ensure \(\|T_{j}\| < 2k^\prime\) for all j, the norm of the composite ADI iteration is bounded by the square root of the value achieved with Eq. 29. This is due to the factor of \({(X^\prime + w_{j}I)}^{-1}(w_{j}I - X^\prime)\) in Eq. 32. In Chap. 3 we shall discuss use of ADI iteration as a preconditioner for a conjugate gradient iteration. In this application, modest error reduction is required of the ADI iteration.
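A tiny numerical illustration of the divergence hazard (values chosen arbitrarily for illustration): with \(w_{j}\doteq k^\prime\), the eigenvalue of Eq. 33 has magnitude roughly ɛ j ∕(2k′), so the mode grows whenever the inner error reduction exceeds 2k′:

```python
# Illustration (assumed values) of Eq. 33 with w_j ~= k': the mode with
# X'-eigenvalue 1 and Z'-eigenvalue k' grows unless eps_j < 2k'.
kp = 0.01
wj = kp
for eps in (0.005, 0.02, 0.05):
    lam = (wj - 1.0) * (wj - kp + eps * (1.0 + kp)) / ((wj + 1.0) * (wj + kp))
    print(eps, lam, eps / (2.0 * kp))   # |lam| ~ eps/(2k'); exceeds 1 when eps > 2k'
```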

The three-variable ADI iteration is not performed in the transformed space, and the analysis leading to Eqs. 32–33 must be modified accordingly. We find that with \(X = H + V\) and Z = P, Eq. 32 becomes

$$\displaystyle{ L_{j} = {(Z + q_{j}I)}^{-1}{(X + p_{ j}I)}^{-1}(q_{ j}I - X)[(p_{j}I - Z) + T_{j}(X + Z)]. }$$
(32A)

Applying the WBJ transformation to this equation, we find that Eq. 33 becomes

$$\displaystyle{ \lambda = \frac{(w_{j} - x)} {(w_{j} + x)}\left [(1 -\varepsilon _{j})(w_{j} - z) +\varepsilon _{j}(w_{j} + x)\frac{(\delta -\gamma z)} {(\delta +\gamma x)}\right ]. }$$
(33A)

A careful analysis of the spectrum reveals that the square root of the convergence rate attained by Eq. 29 is guaranteed when

$$\displaystyle{ \varepsilon _{j} < \mathrm{min}\left [\frac{w_{j}(\delta +\gamma )} {(\delta -\gamma w_{j})}, \frac{2k^\prime(\delta +\gamma )} {(1 + k^\prime)(\delta -\gamma w_{j})}\right ]. }$$
(34)

This bound on ɛ j is approximately equal to the smaller of 2k′ and w j . This iteration does not appear to be particularly efficient when significant error reduction is required, because of the many H, V iterations needed for each P-step. We defer further analysis until after we have discussed ADI preconditioning for conjugate gradients in Chap. 3.

2.4 Analysis of the Two-Variable Minimax Problem

Shortly after my book on “Iterative Solution of Elliptic Systems” was published, I received a phone call from Bruce Kellogg (University of Maryland) asking if anyone had ever solved the two-variable ADI minimax problem. I thought that Bill Jordan and I had done so. After all, it was obvious from symmetry considerations, after Jordan’s transformation to yield identical ranges for the two variables, that the two-variable solution to the transformed problem was equal to the one-variable solution. Or was it obvious? After careful consideration, I determined that it was not evident and that, in fact, I could find no simple proof. I spent a good deal of time on this problem during the summer of 1967 and the analysis was of sufficient depth that I submitted it as my RPI PhD thesis, from which this section has been extracted. The thesis flavor is retained by the attention to detail here.

We consider the spectral radius of the generalized ADI equations (Eq. 3 of Chap. 1) after Jordan’s transformation. Let a j  ≡ p j and b j  ≡ q j . Then

$$\displaystyle{\rho (G_{J})\ =\ \mathop{\max }\limits_{ {k}^{{\prime}}\leq x,y \leq 1}\left \vert \prod _{ j=1}^{J}\frac{(b_{j} - x)(a_{j} - y)} {(a_{j} + x)(b_{j} + y)}\right \vert .}$$

We consider the three parts of Chebyshev minimax analysis: existence, alternance, and uniqueness. We first note that if any a j or b j is less than k′, then replacing that value by k′ will decrease the magnitude of each nonzero factor in this product. Similarly, replacing any a j or b j greater than unity by unity will also decrease the magnitude of each nonzero factor. In our search for optimum parameters, we may restrict them to lie in the interval [k′, 1]. When all the parameters are in this interval each factor has magnitude less than unity, and hence ρ < 1. Once it is shown that ρ is a continuous function of the parameters, standard compactness arguments may be used to establish the existence of a solution to the minimax problem.

The spectral radius is not affected by any change in the order in which the parameters are applied. We choose the nondecreasing ordering: a j  ≤ a j + 1 and b j  ≤ b j + 1. It will be demonstrated eventually that the optimum parameters in each set are distinct. Uniqueness will then be established for the ordered optimum parameter sets. In the ensuing analysis all parameter sets are restricted to the interval [k′, 1]. We now establish continuity of ρ. We define g(x, a, b) as

$$\displaystyle{ g(x,\mathbf{a},\mathbf{b}) =\prod _{ j=1}^{J}\frac{a_{j} - x} {b_{j} + x}. }$$
(35)

Then

$$\displaystyle{ \rho (G_{J}) =\mathop{\max }\limits_{ k^\prime \leq x,y \leq 1}\vert g(x,\mathbf{a},\mathbf{b})g(y,\mathbf{b},\mathbf{a})\vert. }$$
(36)

Let \(Z =\mathop{\max }\limits_{ j}\vert z_{j}\vert \) for any J-tuple z. Consider a perturbation from parameter sets a and b to a + c and b + f. Let ρ(a, b) be attained at (x 1, y 1) and let ρ(a + c, b) be attained at (x 2, y 2), where ρ(a, b) ≤ ρ(a + c, b). (The argument is similar if the reverse inequality is assumed.) Since g is uniformly continuous over \({[k^\prime,1]}^{2J+1}\), there exists for any e > 0 a d > 0 such that \(\vert g(x_{2},\mathbf{a + c},\mathbf{b})g(y_{2},\mathbf{b},\mathbf{a + c}) - g(x_{2},\mathbf{a},\mathbf{b})g(y_{2},\mathbf{b},\mathbf{a})\vert < e/2\) for any c for which C < d.

For any real numbers w and u, \(\big\vert \vert w\vert -\vert u\vert \big\vert \leq \vert w - u\vert \). Thus,

$$\displaystyle{\big\vert \rho (\mathbf{a + c},\mathbf{b}) -\vert g(x_{2},\mathbf{a},\mathbf{b})g(y_{2},\mathbf{b},\mathbf{a})\vert \big\vert < e/2.}$$

Moreover,

$$\displaystyle{\rho (\mathbf{a + c},\mathbf{b}) \geq \rho (\mathbf{a},\mathbf{b}) \geq \vert g(x_{2},\mathbf{a},\mathbf{b})g(y_{2},\mathbf{b},\mathbf{a})\vert .}$$

Therefore, when C < d,

$$\displaystyle{\vert \rho (\mathbf{a + c},\mathbf{b}) -\rho (\mathbf{a},\mathbf{b})\vert \leq \big\vert \rho (\mathbf{a + c},\mathbf{b}) -\vert g(x_{2},\mathbf{a},\mathbf{b})g(y_{2},\mathbf{b},\mathbf{a})\vert \big\vert < e/2.}$$

Similarly, there is an h > 0 such that when F < h, we have

$$\displaystyle{\vert \rho (\mathbf{a + c},\mathbf{b + f}) -\rho (\mathbf{a + c},\mathbf{b})\vert < e/2.}$$

Therefore, when C < d and F < h,

$$\begin{array}{rlrlrl} \vert \rho (\mathbf{a + c},\mathbf{b + f}) -\rho (\mathbf{a},\mathbf{b})\vert & = \vert \rho (\mathbf{a + c},\mathbf{b + f}) -\rho (\mathbf{a + c},\mathbf{b}) +\rho (\mathbf{a + c},\mathbf{b}) -\rho (\mathbf{a},\mathbf{b})\vert & & \\ &\leq \vert \rho (\mathbf{a + c},\mathbf{b + f}) -\rho (\mathbf{a + c},\mathbf{b})\vert + \vert \rho (\mathbf{a + c},\mathbf{b}) -\rho (\mathbf{a},\mathbf{b})\vert & & \\ & < e. & & \end{array}$$

Thus, ρ(a, b) is continuous over \({[k^\prime,1]}^{2J}\) and it follows that ρ must attain its minimum value over \({[k^\prime,1]}^{2J}\) for at least one pair of J-tuples. We have established the existence of a solution to the two-variable ADI minimax problem, and we now address the alternance property. In the ensuing discussion, a o and b o are J-tuples for which ρ attains its least value and perturbations in the analysis are restricted so that all components remain in [k′, 1]. We will prove the following theorem:

Theorem 5 (The two-variable Poussin Alternance Property). 

If

$$\displaystyle{\rho ({\mathbf{a}}^{o},{\mathbf{b}}^{o}) =\mathop{\min }\limits_{ \mathbf{a},\mathbf{b}}\rho (\mathbf{a},\mathbf{b}),}$$

then both g(x,a o,b o) and g(y, b o,a o) attain their maximum absolute values with alternating signs J + 1 times on [k′,1].

The proof is long, and we require three lemmas:

Lemma 6.

The components of a o are distinct and the components of b o are distinct.

Proof.

We show that the assumption \(a_{k}^{o} = a_{k+1}^{o}\) leads to a contradiction. The identical argument applies to b o. Let

$$\displaystyle{ G =\mathop{\max }\limits_{ {k}^{{\prime}}\leq x \leq 1}\vert g(x,{\mathbf{a}}^{o},{\mathbf{b}}^{o})\vert }$$
(37)

and

$$\displaystyle{ H =\mathop{\max }\limits_{ {k}^{{\prime}}\leq y \leq 1}\vert g(y,{\mathbf{b}}^{o},{\mathbf{a}}^{o})\vert. }$$
(38)

Then ρ(a o, b o) = GH. Let P(x) = 1 when J = 2 and for J > 2 define the polynomial

$$\displaystyle{P(x) =\mathop{\prod _{ j=1}^{J}}\limits_{j\neq k,k + 1}(a_{ j}^{o} + x).}$$

Now consider

$$\displaystyle{ g(x,\mathbf{a}_{e},{\mathbf{b}}^{o}) = \frac{\prod _{j=1}^{J}(a_{ j}^{o} - x) - exP(-x)} {\prod _{j=1}^{J}(b_{j}^{o} + x)} , }$$
(39)

where e is a positive number which will subsequently be defined more precisely and where a e is the J-tuple whose components are the zeros of the numerator on the right-hand side. The value of e is chosen sufficiently small that all these zeros are positive. These zeros include the J − 2 roots of P( − x) and the two roots in [k′, 1] of the quadratic \({(a_{k}^{o} - x)}^{2} - ex = 0\). For all components of a o in [k′, 1] and e positive, this quadratic has two real positive roots. Hence, all J roots are positive.

In general, \(g(x,\mathbf{b},\mathbf{a}) ={ g(-x,\mathbf{a},\mathbf{b})}^{-1}\). Hence,

$$\displaystyle{ g(y,{\mathbf{b}}^{o},\mathbf{a}_{ e}) = \frac{\prod _{j=1}^{J}(\mathbf{b}_{j}^{o} - y)} {\prod _{j=1}^{J}(\mathbf{a}_{j}^{o} + y) + eyP(y)}, }$$
(40)

where both terms in the denominator are positive when e, y and all components of a o are positive. Therefore, if we define

$$\displaystyle{ H_{e} \equiv \mathop{\max }\limits_{ k^\prime \leq y \leq 1}\vert g(y,{\mathbf{b}}^{o},\mathbf{a}_{ e})\vert , }$$
(41)

then H e  < H. We next define

$$\displaystyle{ z(x) = g(x,{\mathbf{a}}^{o},{\mathbf{b}}^{o}) - g(x,\mathbf{a}_{ 1},{\mathbf{b}}^{o}) = \frac{xP(-x)} {\prod _{j=1}^{J}(\mathbf{b}_{j}^{o} + x)}. }$$
(42)

(We note that when \(e = 1,\ \mathbf{a}_{e} = \mathbf{a}_{1}\).) We observe that \(g(x,\mathbf{a}_{e},{\mathbf{b}}^{o}) = g(x,{\mathbf{a}}^{o},{\mathbf{b}}^{o}) - ez(x)\). When all components of a o and x are in [k′, 1], \(\vert a_{j}^{o} - x\vert < 1\) and \(\vert b_{j}^{o} + x\vert \geq 2k^\prime\). Thus, if we define \(M \equiv 2{(2k^\prime)}^{-J}\) then | z(x) |  < M. Let \(e_{o} = G/M\). Then for e ∈ (0, e o ), 0 < e | z(x) |  < G when z(x) ≠ 0, and g(x, a o, b o) = 0 when z(x) = 0. Moreover, sign g(x, a o, b o) = sign z(x) when g ≠ 0. It follows that

$$\begin{array}{rlrlrl} \big\vert g(x,\mathbf{a}_{e},{\mathbf{b}}^{o})\vert & = \vert g(x,{\mathbf{a}}^{o},{\mathbf{b}}^{o}) - ez(x)\big\vert & & \\ & =\big \vert \vert g(x,{\mathbf{a}}^{o},{\mathbf{b}}^{o})\vert - e\vert z(x)\vert \big\vert < G. &\end{array}$$
(43)

If we define \(G_{e} \equiv \mathop{\max }\limits_{ k^\prime \leq x \leq 1}\vert g(x,\mathbf{a}_{e},{\mathbf{b}}^{o})\vert \), then G e  < G. We have already shown that H e  < H. Hence, G e H e  < GH = ρ(a o, b o), in contradiction to the hypothesis that the latter is a lower bound on the spectral radius. This establishes the lemma.

We next prove

Lemma 7.

If G and H are as defined in Lemma 6:

  1. i.

    \(g(k^\prime,{\mathbf{a}}^{o},{\mathbf{b}}^{o}) = G\,{ and }\,g(k^\prime,{\mathbf{b}}^{o},{\mathbf{a}}^{o}) = H\)

  2. ii.

    \(g(1,{\mathbf{a}}^{o},{\mathbf{b}}^{o}) = {(-1)}^{J}G\,{and}\,g(1,{\mathbf{b}}^{o},{\mathbf{a}}^{o}) = {(-1)}^{J}H\)

Proof.

The components of the J-tuples a o and b o are in [k′, 1] so that if we define V by \(g(k^\prime,{\mathbf{a}}^{o},{\mathbf{b}}^{o}) = G - V\), then 0 ≤ V ≤ G. Let a′ differ from a o only in its first element: \(a^\prime_{1} = a_{1}^{o} + e\) with e ∈ [0, e o ], where e o is a nonnegative number to be defined. Let \(G^\prime \equiv \mathop{\max }\limits_{ k^\prime \leq x \leq 1}\vert g(x,\mathbf{a}^\prime,{\mathbf{b}}^{o})\vert \), and let \(H^\prime \equiv \mathop{\max }\limits_{ k^\prime \leq y \leq 1}\vert g(y,{\mathbf{b}}^{o},\mathbf{a}^\prime)\vert \). Let e 1 ≡ a 2 o − a 1 o. By Lemma 6, e 1 > 0. Excluding the values x = a j o for \(j = 2,3,\ldots ,J\) and y = b j o for \(j = 1,2,\ldots ,J\), where \(g(x,\mathbf{a}^\prime,{\mathbf{b}}^{o}) = g(y,{\mathbf{b}}^{o},\mathbf{a}^\prime) = 0\), we have for x ≥ a 1 o + e and y ∈ [k′, 1],

$$\displaystyle{ \left \vert \frac{g(x,\mathbf{a}^\prime,{\mathbf{b}}^{o})g(y,{\mathbf{b}}^{o},\mathbf{a}^\prime)} {g(x,{\mathbf{a}}^{o},{\mathbf{b}}^{o})g(y,{\mathbf{b}}^{o},{\mathbf{a}}^{o})}\right \vert = \left \vert \frac{(x - a_{1}^{o} - e)(y + a_{1}^{o})} {(x - a_{1}^{o})(y + a_{1}^{o} + e)}\right \vert < 1. }$$
(44)

Therefore,

$$\begin{array}{rlrlrl} &\mathop{\max }\limits_{a_{1}^{o} + e \leq x \leq 1,k^\prime \leq y \leq 1}\vert g(x,\mathbf{a}^\prime,{\mathbf{b}}^{o})g(y,{\mathbf{b}}^{o},\mathbf{a}^\prime)\vert <\mathop{\max }\limits_{k^\prime \leq x,y \leq 1}\vert g(x,{\mathbf{a}}^{o},{\mathbf{b}}^{o})g(y,{\mathbf{b}}^{o},{\mathbf{a}}^{o})\vert = GH.& \end{array}$$
(45)

When \(y = b_{j}^{o},\ g(y,{\mathbf{b}}^{o},\mathbf{a}^\prime) = g(y,{\mathbf{b}}^{o},{\mathbf{a}}^{o}) = 0\) for \(j = 1,2,\ldots ,J\). For all other y ∈ [k′, 1],

$$\displaystyle{ \left \vert \frac{g(y,{\mathbf{b}}^{o},\mathbf{a}^\prime)} {g(y,{\mathbf{b}}^{o},{\mathbf{a}}^{o})}\right \vert = \left \vert \frac{y + a_{1}^{o}} {y + a_{1}^{o} + e}\right \vert < 1. }$$
(46)

Hence,

$$\displaystyle{ H^\prime < H. }$$
(47)

For \(k^\prime \leq x \leq a_{1}, \dfrac{\partial \big\vert \frac{a_{j}-x} {b_{j}+x} \big\vert } {\partial x} = - \frac{a_{j}+b_{j}} {{(b_{j}+x)}^{2}} < 0\). Hence, g(x, a, b) increases in absolute value as x decreases from a 1 to k′. It follows that for e ∈ (0, e 1),

$$\displaystyle{ \mathop{\max }\limits_{k^\prime \leq x \leq a_{1}^{o} + e}\vert g(x,\mathbf{a}^\prime,{\mathbf{b}}^{o})\vert = g(k^\prime,\mathbf{a}^\prime,{\mathbf{b}}^{o}). }$$
(48)

If we define \(S \equiv \frac{\prod _{j=2}^{J}(a_{ j}^{o}-k^\prime)} {\prod _{j=1}^{J}(b_{j}^{o}+k^\prime)}\) we have \(g(k^\prime,\mathbf{a}^\prime,{\mathbf{b}}^{o}) = G - V + eS\). Suppose V ≠ 0 and let \(e_{o} =\min (e_{1}\,,\,V/2S)\). Then for 0 < e < e o ,

$$\displaystyle{ g(k^\prime,\mathbf{a}^\prime,{\mathbf{b}}^{o}) < G - V + e_{ o}S \leq G -\frac{V } {2}. }$$
(49)

Combining Eqs. 45–47, we have G′H′ < GH = ρ(a o, b o), contrary to the hypothesis that ρ(a o, b o) is a lower bound on the spectral radius. The contradiction is resolved only if V = 0, in which case e o  = 0 and g(k′, a o, b o) = G. The same argument applied to g(k′, b o, a o) establishes that this is equal to H, and part (i) of the lemma is proved.

Part (ii) of the lemma can be proved by symmetry properties. Let \(x = k^\prime/x^\prime\) and \(y = k^\prime/y^\prime\). Then the minimax problem in terms of the primed variables is the same as the original problem with J-tuples related as follows: the components of a′ equal the components of k′ ∕ a in reverse order, and the components of b′ equal the components of k′ ∕ b in reverse order. Since \(g(x^\prime,{\mathbf{a}}^{\prime o},{\mathbf{b}}^{\prime o}) = {(-1)}^{J}g(x,{\mathbf{a}}^{o},{\mathbf{b}}^{o})\) and \(g(y^\prime,{\mathbf{b}}^{\prime o},{\mathbf{a}}^{\prime o}) = {(-1)}^{J}g(y,{\mathbf{b}}^{o},{\mathbf{a}}^{o})\), part (ii) of the lemma is established by substituting k′ for x in these equations. One reasons that if (ii) were not true for some minimizing set of parameters, then (i) would not be true in the primed system. But we have already established (i) for any minimizing set.

For a fixed pair of positive J-tuples, g is a rational function of x and is continuous for positive x. One more lemma will be proved before we establish the Chebyshev alternance property of the optimizing parameters. We first partition the interval [k′, 1] into subintervals such that g(x) has only positive extrema, G, or only negative extrema, − G, with opposite signs in successive intervals. Since g can have at most J changes of sign, there can be at most J + 1 subintervals. Let g have only I alternations (i.e., I + 1 subintervals). Let the leftmost extreme point in subinterval i + 1 be x i (1) and the rightmost extreme point in this subinterval be x i (2). If there is only one extreme in the interval, x i (1) = x i (2). By Lemma 7, x 0(1) = k′ and x I (2) = 1. The function g is continuous over [k′, 1] and must therefore have at least one zero between x i − 1(2) and x i (1). We choose any set of these zeros as u i with x i − 1(2) < u i  < x i (1) for \(i = 1,2,\ldots ,I\), and we set u 0 = k′ and u I + 1 = 1. There must be a positive V such that one of the following inequalities holds in each interval (u i , u i + 1) for \(i = 0,1,\ldots ,I\):

$$\displaystyle{ -G + V < g(x) \leq G,\quad u_{i} \leq x \leq u_{i+1}\quad i\text{ even }\!, }$$
(50.1)
$$\displaystyle{ -G \leq g(x) < G - V,\quad u_{i} \leq x \leq u_{i+1}\quad i\text{ odd }\!. }$$
(50.2)

Similarly, if h(y) ≡ g(y, b, a) has K alternations, we can select a set of v k (with v 0 = k′ and v K + 1 = 1) and a positive W such that for \(k = 0,1,\ldots ,K\):

$$\displaystyle{ -H + W < h(y) \leq H,\quad v_{k} \leq y \leq v_{k+1}\quad k\text{ even }, }$$
(51.1)
$$\displaystyle{ -H \leq h(y) < H - W,\quad v_{k} \leq y \leq v_{k+1}\quad k\text{ odd }. }$$
(51.2)

Let U be the smaller of V and W and define

$$\displaystyle{ F(x) \equiv -x\prod _{i=1}^{I}(u_{ i} - x)\prod _{k=1}^{K}(v_{ k} + x). }$$
(52)

Since both a and b are positive, the products \(\prod _{j=1}^{J}(a_{j} - x)\) and \(\prod _{j=1}^{J}(b_{j} + x)\) have no common root. The Divisor Lemma in Chap. 1 establishes the existence of polynomials P(x) and R(x) of maximal degree J such that for \(I + K + 1 \leq 2J\),

$$\displaystyle{ R(x)\prod _{j=1}^{J}(a_{ j} - x) - P(-x)\prod _{j=1}^{J}(b_{ j} + x) = F(x). }$$
(53)

Since g and h can have at most J alternations in \([k^\prime,1],\ I + K + 1 > 2J\) if and only if \(I = K = J\). It will be shown that this is indeed the case for any set of parameters for which ρ attains its lowest bound. If we assume to the contrary, we will find that polynomials P and R may be used to construct other sets of J-tuples for which the spectral radius is decreased. In the ensuing discussion, a and b are assumed to be optimal so that the conditions of Lemmas 6 and 7 are satisfied. Polynomials P and R satisfy Eq. 53 for these J-tuples. We are now ready to prove the final lemma:

Lemma 8.

Suppose g and h do not both have J Chebyshev alternations over [k′,1]. Then there is a positive value, e 0 , such that for all e ∈ (0,e 0 ) if we define

$$\displaystyle{ g_{1}(x) = \frac{\prod _{j=1}^{J}(a_{j} - x) - eP(-x)} {\prod _{j=1}^{J}(b_{j} + x) - eR(x)} }$$
(54.1)

and

$$\displaystyle{ h_{1}(y) = \frac{\prod _{j=1}^{J}(b_{j} - y) - eR(-y)} {\prod _{j=1}^{J}(a_{j} + y) - eP(y)} , }$$
(54.2)

then

  1. i.

    All the zeros of g 1 (x) and of h 1 (y) are real.

  2. ii.

    G 1 H 1 < GH, where \(G_{1} =\mathop{\max }\limits_{ k^\prime \leq x \leq 1}\vert g_{1}(x)\vert \text{ and }H_{1} =\mathop{\max }\limits_{ k^\prime \leq y \leq 1}\vert h_{1}(y)\vert .\)

Proof.

Let N,  X,  Y,  D be real numbers. When D and D − Y are nonzero,

$$\begin{array}{rlrlrl} \frac{N - X} {D - Y } = \frac{D(N - X)} {D(D - Y )} & = \frac{D(N - X) + N(D - Y ) - N(D - Y )} {D(D - Y )} & & \\ & = \frac{N} {D} + \frac{(NY - DX)} {D(D - Y )}. &\end{array}$$
(55)

Applying this identity to g 1 and h 1, we get

$$\displaystyle{ g_{1}(x) = g(x) + \frac{eF(x)} {\prod _{j=1}^{J}(b_{j} + x)[\prod _{j=1}^{J}(b_{j} + x) - eR(x)]}, }$$
(56.1)

and

$$\displaystyle{ h_{1}(y) = h(y) - \frac{eF(-y)} {\prod _{j=1}^{J}(a_{j} + y)[\prod _{j=1}^{J}(a_{j} + y) - eP(y)]}. }$$
(56.2)

Let M be an upper bound on the magnitudes of the three polynomials F(x),  P(x), and R(x) for − 1 ≤ x ≤ 1. We note that \(\prod _{j=1}^{J}(a_{j} + x)\ \mathrm{and}\ \prod _{j=1}^{J}(b_{j} + x)\) are each ≥ (2k′)J. Let \(e_{1} = {(2k^\prime)}^{J}/M\). Then for e ∈ (0, e 1) and k′ ≤ x, y ≤ 1

$$\displaystyle{ \prod _{j=1}^{J}(b_{ j} + x) - eR(x) \geq {(2k^\prime)}^{J} - eM > 0, }$$
(57.1)

and

$$\displaystyle{ \prod _{j=1}^{J}(a_{ j} + y) - eP(y) \geq {(2k^\prime)}^{J} - eM > 0. }$$
(57.2)

From Eqs. 54–55, we conclude that

$$\begin{array}{rlrlrl} \text{sign}[g_{1}(x) - g(x)] & = \text{sign}F(x), &\end{array}$$
(58.1)
$$\begin{array}{rlrlrl} \text{sign}[h_{1}(y) - h(y)] & = -\text{sign}F(-y) &\end{array}$$
(58.2)

for e ∈ (0, e 1) and k′ ≤ x,  y ≤ 1.

From the definition of F(x) in Eq. 52, we obtain

$$\begin{array}{rlrlrl} F(x) & < 0\qquad u_{i} < x < u_{i+1}\text{ and }i\text{ even }, &\end{array}$$
(59.1)
$$\begin{array}{rlrlrl} F(x) & > 0\qquad u_{i} < x < u_{i+1}\text{ and }i\text{ odd }, &\end{array}$$
(59.2)
$$\begin{array}{rlrlrl} F(-y) & > 0\qquad v_{k} < y < v_{k+1}\text{ and }k\text{ even }, &\end{array}$$
(59.3)
$$\begin{array}{rlrlrl} F(-y) & < 0\qquad v_{k} < y < v_{k+1}\text{ and }k\text{ odd }. &\end{array}$$
(59.4)

Recalling the definition of U (after Eqs. 51) and of M (after Eqs. 56), we define

$$\displaystyle{ e_{2}^\prime \equiv \frac{{(2k^\prime)}^{2J}U} {M[1 + {(2k^\prime)}^{J}U]}\text{ and }e_{2} = \mathrm{min}(e_{1},e_{2}^\prime). }$$
(60)

Then for e ∈ (0, e 2) and k′ ≤ x ≤ 1

$$\begin{array}{rlrlrl} \vert g_{1}(x) - g(x)\vert & = \left \vert \frac{eF(x)} {\prod _{j=1}^{J}(b_{j} + x)[\prod _{j=1}^{J}(b_{j} + x) - eR(x)]}\right \vert & & \\ &\leq \frac{eM} {{(2k^\prime)}^{J}[{(2k^\prime)}^{J} - eM]} < U \leq V. &\end{array}$$
(61)

Similarly, there is an e 3 such that for e ∈ (0, e 3) and k′ ≤ y ≤ 1, | h 1(y) − h(y) |  < U ≤ W. Let e 4 = min(e 2, e 3). For e ∈ (0, e 4) and k′ = u 0 ≤ x ≤ u 1, we have from Eq. 50

$$\displaystyle{ -G + V < g(x) \leq G. }$$
(62.1)

By Eq. 59,

$$\displaystyle{ F(x) < 0, }$$
(62.2)

and since \(\mathrm{sign}\,[g_{1}(x) - g(x)] = \mathrm{sign}\,F(x)\) is negative,

$$\displaystyle{ g_{1}(x) < g(x) \leq G. }$$
(62.3)

Moreover, by Eq. 61, \(\vert g_{1}(x) - g(x)\vert = g(x) - g_{1}(x) < U \leq V\) so that

$$\displaystyle{ g_{1}(x) > g(x) - V > -G. }$$
(62.4)

From Eqs. 62.3 and 62.4, − G < g 1(x) < G. Also, \(g(u_{1}) = F(u_{1}) = 0\). Hence, g 1(u 1) = 0. For e ∈ (0, e 4) and u 1 < x < u 2, we have

$$\displaystyle{ -G \leq g(x) < G - V \text{ from}\ \,\mathrm{Eq}.\,50, }$$
(63.1)
$$\displaystyle{ F(x) > 0\text{ from }\,\,\mathrm{Eq}.\,59, }$$
(63.2)

and \(\mathrm{sign}\,[g_{1}(x) - g(x)] = \mathrm{sign}\,F(x)\) is positive so that

$$\displaystyle{ g_{1}(x) > g(x) \geq -G. }$$
(63.3)

Moreover, by Eq. 61, \(\vert g_{1}(x) - g(x)\vert = g_{1}(x) - g(x) < U \leq V\). Hence,

$$\displaystyle{ g_{1}(x) < g(x) + V \leq G. }$$
(63.4)

From Eqs. 63.3 and 63.4, − G < g 1(x) < G. Also, \(g(u_{2}) = F(u_{2}) = 0\) so that g 1(u 2) = 0.

Continuing through all the intervals in this fashion, we find that | g 1(x) |  < G over [k′, 1]. The same argument suffices to prove that | h 1(y) |  < H. The lemma is thus proved.

The construction in the proof of Lemma 8 fails only when I = K = J. Since g 1(x) and h 1(y) are continuous over [k′, 1], they can alternate J times over this interval only if all their zeros are in this interval. In fact, they are bounded rational functions in this interval whose numerators are polynomials of maximal degree J and accordingly have precisely J zeros in [k′, 1].

Since we have proved that a solution to the minimax problem exists, it follows immediately from Lemma 8 that for any J-tuples which achieve the least maximum there must be J Chebyshev alternations. We have thus proved:

Theorem 9 (Chebyshev alternance theorem). 

Let a o and b o be J-tuples for which the spectral radius of the two-variable ADI error-reduction matrix is minimized. Then g(x, a o,b o ) and h(y, b o,a o ) both have J Chebyshev alternations on [k′,1].

Our final task is to establish uniqueness. Once we have proved that only one pair of ordered J-tuples can satisfy the Chebyshev theorem, we can assert that since the choice of a = b equal to the optimizing J-tuple for the one-variable problem yields the Chebyshev alternance property, this choice is the unique solution to the two-variable problem.

Let a be the optimizing J-tuple for the one-variable problem with maximum value for | g(x) | equal to G and let a′, b′ be another set which yields the Chebyshev alternance property with maximum values for | g′(x) | and | h′(y) | equal to G′ and H′, respectively. We define the continuous function over [k′, 1]:

$$\displaystyle{ d(x) \equiv g(x,\mathbf{a},\mathbf{a}) - g(x,\mathbf{a}^\prime,\mathbf{b}^\prime) \equiv g(x) - g^\prime(x). }$$
(64)

When G ≠ G′, it is easily shown that d(x) alternates J times on [k′, 1], for if G > G′ then d has the sign of g at its alternation points, and if G < G′ then d has the sign of g′ at its alternation points. It follows that d(x) has at least J zeros in [k′, 1].

When G = G′, the analysis is slightly more complicated. If d(x) = 0 at an interior alternation point, two sign changes are removed and only one zero identified at this alternation point. However, we note that the derivatives of both g and g′ vanish at this common alternation point. Hence the derivative of d with respect to x also vanishes at this point and it is at least a double root. We thus recover the “lost” zero. Of course, the endpoint alternation points are common to both functions and each yields only one zero since the derivatives do not vanish at these points. However, each of these alternation points only accounts for one zero when G ≠ G′. We have thus proved that d(x) has at least J roots in [k′, 1] even when G = G′.

A similar argument applies to the difference between h(y) and h′(y). Now define

$$\displaystyle{ n(x) \equiv \prod _{j=1}^{J}(a_{ j} - x)(b^\prime_{j} + x) -\prod _{j=1}^{J}(a_{ j} + x)(a^\prime_{j} - x). }$$
(65)

Then

$$\displaystyle{ d(x) = \frac{n(x)} {\prod _{j=1}^{J}(a_{j} + x)(b^\prime_{j} + x)}. }$$
(66)

Thus, since we have established that d has at least J zeros in [k′, 1], it follows that n(x) has these same zeros. Applying the same argument to h(y) − h′(y), we conclude that the polynomial

$$\displaystyle{ m(y) \equiv \prod _{j=1}^{J}(a_{ j} - y)(a^\prime_{j} + y) -\prod _{j=1}^{J}(a_{ j} + y)(b^\prime_{j} - y). }$$
(67)

has at least J zeros in [k′, 1]. We now observe that \(n(-x) = -m(x)\). Therefore, the negatives of the zeros of m(y) are also zeros of n(x). Hence, n(x) has at least 2J zeros. Inspection of Eq. 65 reveals that n(x) is of maximal degree 2J − 1. A contradiction is established unless n(x) is the zero polynomial, in which case \(\mathbf{a}^\prime = \mathbf{b}^\prime = \mathbf{a}\). We have proved the following:

Theorem 10 (Main Theorem). 

The two-variable ADI minimax problem has as its unique solution the pair of J-tuples a = b which are equal to the J-tuple that solves the one-variable ADI minimax problem.
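A numerical illustration of Theorems 9 and 10 (a sketch under the same SciPy conventions assumed earlier, not part of the thesis material): with a = b equal to the one-variable optimum, |g(x, a, a)| attains a common extreme value, with alternating sign, at J + 1 points of [k′, 1].

```python
# Illustrative equioscillation check: a = b = one-variable optimum
# w_j = dn[(2j-1)K/(2J), k] gives J + 1 alternations of g(x, a, a) on [k', 1].
import numpy as np
from scipy.special import ellipk, ellipj
from scipy.signal import argrelextrema

kp, J = 0.05, 4
K = ellipk(1.0 - kp ** 2)
w = ellipj((2.0 * np.arange(1, J + 1) - 1.0) * K / (2.0 * J), 1.0 - kp ** 2)[2]

x = np.linspace(kp, 1.0, 200001)
g = np.prod([(wj - x) / (wj + x) for wj in w], axis=0)

interior = argrelextrema(np.abs(g), np.greater)[0]
print(np.abs(g[interior]))                  # J - 1 interior extrema, all ~ equal
print(abs(g[0]), abs(g[-1]))                # endpoint values match the interior extrema
print(np.sign(g[np.r_[0, interior, -1]]))   # signs alternate: J + 1 alternations
```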

2.5 Generalized ADI Iteration

Periodically, my interest in ADI model-problem theory wanes. I see little need for further analysis. Then some new research area is uncovered and my enthusiasm is revived. One example is the discovery around 1982 of the applicability of ADI iteration to Lyapunov and Sylvester matrix equations. This led to the need for generalization of the theory into the complex plane, a subject which will be covered in Chap. 4. In December of 1992 Dick Varga forwarded to me for comments and suggestions a draft of a paper by N. Levenberg and L. Reichel on “GADI” iteration. This “GADI” method differs from classical ADI (which they call CADI) in that one allows a different number of mesh sweeps in the two directions. This stimulated the analysis presented here.

The “GADI” iteration introduced by Levenberg and Reichel in 1994 addresses possible improvement by performing a different number of sweeps in the two directions in each iteration. Their analysis is based on potential theory developed by Bagby [1969]. There are two situations where GADI can outperform PR ADI (which they call CADI). One is where the work required to iterate in one direction is less than the work required in the other direction. They observe that this is the case for Sylvester’s equation when the orders of matrices A and B (see Eqs. 366) differ significantly. Another example is the three-variable approach described in Sect. 2.3, where the H, V iteration even with one inner iteration per outer iteration requires twice the work of the P sweep. The second situation is where the two eigenvalue intervals differ appreciably. We will develop a more precise measure of this disparity.

Let

$$\displaystyle{ g(x,y) =\prod _{ j=1}^{m}\frac{p_{j} - x} {p_{j} + y}\prod _{k=1}^{n}\frac{q_{k} - y} {q_{k} + x}. }$$
(68)

We apply Jordan’s transformation as described in Sect. 2.2 and find that

$$\displaystyle{ g(x,y,\mathbf{p},\mathbf{q}) ={ \left (\frac{\delta -\gamma y^\prime} {\delta +\gamma x^\prime}\right )}^{m-n}g(x^\prime,y^\prime,\mathbf{p}^\prime,\mathbf{q}^\prime). }$$
(69)

When m = n this reduces to the result of Sect. 2.2, but when m ≠ n there is an additional factor of

$$\displaystyle{ K_{m,n} ={ \left (\frac{\delta -\gamma y^\prime} {\delta +\gamma x^\prime}\right )}^{m-n}. }$$
(70)

When γ = 0, we have reduced the parameter optimization problem to one where both intervals are [k′, 1]. We have already proved that in general \(\vert \frac{\delta }{\gamma }\vert > 1\). If the work for the two directions is the same, we may choose m ≥ n when γ > 0 and n ≥ m when γ < 0. Then K m, n is in (0, 1) for all x′ and y′ in [k′, 1]. The following theorem establishes the preferential sweep direction in terms of the spectral intervals:

Theorem 11.

If the spectral interval for x is [a,b] and for y is [c,d], then γ > 0 if and only if \((d - c)(b + c) > (b - a)(a + d)\) .

Proof.

From the analysis in Sect. 2.2, we have

$$\begin{array}{rlrlrl} 1 + k^\prime & = 2 + m -\sqrt{m(2 + m)} =\tau -\sqrt{\tau (\tau -2)} =\tau \left [1 -\sqrt{1 - \frac{2} {\tau }} \ \right ], &\end{array}$$
(71.1)
$$\begin{array}{rlrlrl} \frac{2} {\tau } & = \frac{(a + c)(b + d)} {(a + d)(b + c)}, &\end{array}$$
(71.2)
$$\begin{array}{rlrlrl} \sigma & = \frac{2(a + d)} {(b + d)} =\tau \frac{(a + c)} {(b + c)} , &\end{array}$$
(71.3)
$$\begin{array}{rlrlrl} \gamma & =\sigma -(1 + k^\prime) =\tau \left (\frac{(a + c)} {(b + c)} - 1 + \sqrt{1 - \frac{2} {\tau }} \ \right ). &\end{array}$$
(71.4)

Since τ > 2, we obtain from Eq. 71.4 γ > 0 when \((1 -\frac{2} {\tau } ) >{\bigl ({ \frac{b-a} {b+c} \bigr )}}^{2}\). Using Eq. 71.2 we find after a little algebra that this inequality reduces to \((d - c)(b + c) > (b - a)(a + d)\).

It follows that the greater number of sweeps should be in the direction of the variable with the larger normalized spectral interval. This is consistent with the potential analysis in Levenberg and Reichel.
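A quick numerical spot-check of this criterion (illustrative intervals chosen arbitrarily; γ computed from the formulas of Sect. 2.2):

```python
# Spot-check (arbitrary illustrative intervals) of Theorem 11:
# gamma > 0 exactly when (d - c)(b + c) > (b - a)(a + d).
import numpy as np

def gamma_of(a, b, c, d):
    m = 2.0 * (b - a) * (d - c) / ((a + c) * (b + d))      # Eq. 13
    kp = 1.0 / (1.0 + m + np.sqrt(m * (2.0 + m)))          # Eq. 14
    sigma = 2.0 * (a + d) / (b + d)
    return sigma - (1.0 + kp)                              # Eq. 21

for (a, b, c, d) in [(0.01, 10, 100, 1000), (0.01, 1, 1, 100), (1, 100, 0.01, 1)]:
    print((gamma_of(a, b, c, d) > 0) == ((d - c) * (b + c) > (b - a) * (a + d)))
```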

In many applications \(\vert \frac{\delta }{\gamma }\vert \gg 1\) and K m, n is close to unity. It will now be shown that in this case CADI outperforms GADI when the work is the same in both directions. Let G(m, n) be the maximum absolute value of g(x′, y′, p′, q′) for the optimum parameter sets. Since x′ and y′ vary over the same interval, each value with x′ = y′ occurs in g. The value for G(m, n) must be at least as great as that attained with the optimum CADI parameters for n + m sweeps. The CADI error reduction is \(C(n + m) = G{(n + m,n + m)}^{2}\) for the corresponding 2(n + m) steps. Thus, \(G(m,n) \geq \sqrt{C(n + m)}\). If the CADI asymptotic convergence rate is ρ(C), then \(C{(s)\doteq\kappa \rho }^{s}\) for some constant κ. The asymptotic convergence rate of GADI, ρ(G), must therefore satisfy

$$\displaystyle{ \rho (G) =\mathop{\lim }\limits_{ m + n \rightarrow \infty }G{(m,n)}^{ \frac{1} {m+n} } \geq \mathop{\lim }\limits_{ m + n \rightarrow \infty }C{(n + m)}^{ \frac{1} {2(n+m)} } =\rho (C) }$$
(72)

with equality only when m = n. One cannot anticipate significant improvement over CADI when the work is the same for the two ADI steps of each iteration and \(K_{m,n}\doteq1\). Any possible improvement arises from K m, n in Eq. 70, which can in certain circumstances render GADI more efficient. Suppose the y-direction is preferred (γ > 0). One strategy is to choose an integer value for r and let m = rn. Then the inequality in Eq. 72 becomes

$$\displaystyle{ \rho (G) \geq \frac{(\delta -\gamma )} {(\delta +\gamma )}\rho (C). }$$
(73)

Even when K is close to unity, significant improvement may be achieved with GADI when the work differs for the two steps. As mentioned previously, this is true for the three-variable ADI iteration and for the Sylvester matrix equation when the orders of A and B differ appreciably. The minimax theory from which optimum CADI parameters were derived has not been generalized to GADI at this writing. The Bagby points described by Levenberg and Reichel do yield asymptotically optimal parameters. Their “generalized” Bagby points are easy to compute and provide a convenient means for choosing good parameters.

We leave GADI now and return to our discussion of “classical” ADI. The theory for determining optimum parameters and associated error reduction as a function of eigenvalue bounds for \(F^{-1}H\) and \(F^{-1}V\) is firm when these matrices commute and the sum of their lower bounds is positive. We first examine in Chap. 3 how to choose F to yield these “model problem” conditions for a class of elliptic boundary value problems. We then describe how this model problem may be used as a preconditioner for an even more general class of problems.