
3.1 Five-Point Laplacians

The heat-diffusion problem in two space dimensions was treated by Peaceman and Rachford [1955] in their seminal work on ADI iteration. They considered both time-dependent parabolic problems and steady-state elliptic problems. The Laplacian operator may be discretized over a rectangular region by standard differencing over a grid with spacing h in both the x and y directions. If one multiplies the equations by h^2, one obtains five-point interior equations with a diagonal coefficient of 4 and off-diagonal coefficients of −1 connecting each interior node to its four nearest neighbors. Boundary conditions are incorporated in the difference equations. This is a model ADI problem when the boundary condition on each side is uniform. Given values need not be constant on a side, but one cannot impose a given value on part of a side and another condition, such as zero normal derivative, on the remainder of that side. It was shown by Birkhoff, Varga and Young (1962) that there must be a full rectangular grid in order that model conditions prevail. For the Dirichlet problem (with values given on all boundaries), the horizontal coupling for a grid with m rows and n columns of unknowns when the equations are row-ordered is

$$\displaystyle{ H = \mathrm{diag}_{m}[L_{n}], }$$
(1.1)
$$\displaystyle{ L_{n} = \mathrm{tridiag}_{n}[-1,2,-1]. }$$
(1.2)

The subscripts designate the orders of the matrices. The vertical coupling is similar with m and n interchanged when the equations are column-ordered. When row-ordered this coupling is

$$\displaystyle{ V = \mathrm{tridiag}_{m}[-I_{n},2I_{n},-I_{n}], }$$
(2)

where I n is the identity matrix of order n. Matrices H and V commute and the simultaneous eigenvectors for \(r = 1,2,\ldots,m\) and \(s = 1,2,\ldots,n\) have components at the node in column i and row j of

$$\displaystyle{ v(r,s;\,i,j) =\sin \frac{is\pi } {n + 1}\,\sin \frac{jr\pi } {m + 1}. }$$
(3)

The corresponding eigenvalues are

$$\begin{array}{rlrlrl} \lambda (H) & = 2\left (1 -\cos \frac{s\pi } {n + 1}\right ), &\end{array}$$
(4.1)
$$\begin{array}{rlrlrl} \gamma (V ) & = 2\left (1 -\cos \frac{r\pi } {m + 1}\right ). &\end{array}$$
(4.2)

When the spacing is h along the x-axis and k along the y-axis, one may multiply the difference equations by the mesh-box area hk to yield matrices \(H^{\prime} = \frac{k} {h}H\) and \(V ^{\prime} = \frac{h} {k}V\). The eigenvectors remain the same but the eigenvalues are now multiplied by these mesh ratios. It is seen that when the ratio of these increments (the “aspect ratio”) differs greatly from unity, the spectra for the two directions differ significantly even when m = n. For optimal use of ADI iteration, one must consider the two-variable problem and apply Jordan’s transformation to obtain parameters for use in the generalized equations, Eqs. 3 of Chap. 1.

Now consider variable increments, h i between columns i and i + 1 and k j between rows j and j + 1. The equation at node i, j may be normalized by the mesh-box area: \(\frac{1} {4}(h_{i-1} + h_{i})(k_{j-1} + k_{j})\). Then

$$\displaystyle{ L_{n} = \mathrm{tridiag}_{n}\left [- \frac{1} {h_{i-1}(h_{i-1} + h_{i})},\, \frac{1} {h_{i-1}h_{i}},\,- \frac{1} {h_{i}(h_{i-1} + h_{i})}\right ]. }$$
(5)

Note that the elements of L_n do not depend on the row index j. The eigenvalues of matrix H are now the eigenvalues of the tridiagonal matrix L_n, each of multiplicity m. The Jordan normal form of this matrix is diagonal since it is the product of a positive diagonal matrix and a symmetric matrix. Bounds on these eigenvalues must be computed in order to determine optimum iteration parameters. If the V matrix is ordered by columns, then the corresponding diagonal blocks of order m are tridiagonal matrices with k_j replacing h_i in Eq. 5. Thus, the column-ordered V = diag_n[S_m], with

$$\displaystyle{ S_{m} = \mathrm{tridiag}_{m}\left [- \frac{1} {k_{j-1}(k_{j-1} + k_{j})},\, \frac{1} {k_{j-1}k_{j}},\,- \frac{1} {k_{j}(k_{j-1} + k_{j})}\right ]. }$$
(6)

Eigenvalue bounds for S_m must also be estimated for determining iteration parameters. Instead of dividing the equations by the mesh-box areas, we may retain the H and V matrices so that H + V is the difference approximation to the differential operator integrated over the mesh box. We now multiply the iteration parameters by the normalizing (diagonal) matrix F whose entries are the mesh-box areas. This approach has ramifications which are beneficial in a more general context. Iteration Eqs. 4 of Chap. 1 yield a matrix whose eigenvectors are independent of the iteration parameters when \(H{F}^{-1}V - V {F}^{-1}H = 0\). This is evidently true for this case, where F^{-1}H and F^{-1}V commute. Commutation is revealed by the fact that the elements in F^{-1}H (which are displayed in Eq. 5) depend only on the index i while the elements in F^{-1}V (which are displayed in Eq. 6) depend only on the index j. The spectra for which parameters are computed remain those of F^{-1}H and F^{-1}V.

The ADI model-problem conditions are attainable in any orthogonal coordinate system for a full rectangular grid. When the Laplacian operator is discretized by integrating over the mesh box around node ij, the diagonal matrix of mesh-box areas is the appropriate matrix F. In fact, the first application of ADI iteration with Eq. 3 of Chap. 1 included cylindrical and polar coordinates [Wachspress,  1957 ].

A comparison with Fast Fourier Transform solution of such problems is revealing [Concus and Golub,  1973 ]. When the spacing is uniform in each direction, the eigensolutions are known. When high accuracy is desired the FFT outperforms ADI in this case. However, when only modest error reduction is demanded ADI is quite competitive. The FFT suffers somewhat when the number of rows or columns is not a power of two, but that is more a programming complication than a deficiency of the approach. Now consider variable increments. For ADI iteration we need only eigenvalue bounds. For the FFT we need the complete eigensolutions for both the H and the V matrices. This is time-consuming, and ADI in general outperforms FFT in such cases. Only when the same grid is used with many forcing vectors can FFT become competitive in this more general case. There are other “Fast Poisson Solvers” which may outperform ADI when very high accuracy is demanded [Buzbee, Golub and Nielson,  1970 ].

Eigenvalue bounds for the tridiagonal matrices, L n and S m , are relatively easy to compute. The maximum absolute row sum provides an adequate upper bound. The iteration is insensitive to loose (but conservative) upper bounds. Lower bounds can be computed with shifted inverse iteration, starting with a guess of zero. There is only one tridiagonal matrix for each direction and the time for the eigenvalue bound computation is negligible compared to the iteration time.
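
As an illustration of this bound computation, the following Python sketch assembles the three diagonals of the variable-spacing L_n of Eq. 5 and estimates its extreme eigenvalues. The function names and sample spacings are hypothetical, and a dense banded solver stands in for whatever tridiagonal solver is at hand.

```python
import numpy as np
from scipy.linalg import solve_banded

def Ln_diagonals(h):
    """Diagonals of the tridiagonal L_n of Eq. 5 for spacings h[0], ..., h[n]
    (n interior columns).  Returns (lower, diag, upper) as length-n arrays."""
    hm, hp = h[:-1], h[1:]                  # h_{i-1} and h_i at interior node i
    diag  = 1.0 / (hm * hp)
    lower = -1.0 / (hm * (hm + hp))         # coupling to node i-1 (first entry unused)
    upper = -1.0 / (hp * (hm + hp))         # coupling to node i+1 (last entry unused)
    return lower, diag, upper

def eigenvalue_bounds(lower, diag, upper, sweeps=25):
    """Upper bound from the maximum absolute row sum; lower bound estimated by
    inverse iteration with shift zero (power iteration on L_n^{-1})."""
    n = len(diag)
    rowsum = diag.copy()
    rowsum[1:]  += np.abs(lower[1:])
    rowsum[:-1] += np.abs(upper[:-1])
    upper_bound = rowsum.max()
    ab = np.zeros((3, n))                   # banded storage for solve_banded
    ab[0, 1:]  = upper[:-1]
    ab[1, :]   = diag
    ab[2, :-1] = lower[1:]
    x = np.ones(n)
    for _ in range(sweeps):
        y = solve_banded((1, 1), ab, x)     # y = L_n^{-1} x
        mu = np.abs(y).max()                # L_n^{-1} > 0, so even the first sweep
        x = y / mu                          # gives a conservative bound 1/mu <= lambda_min
    return 1.0 / mu, upper_bound

lower_b, upper_b = eigenvalue_bounds(*Ln_diagonals(np.array([0.1, 0.2, 0.1, 0.4, 0.2, 0.1])))
```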

3.2 The Neutron Group-Diffusion Equation

The neutron group-diffusion equation is

$$\displaystyle{ -\triangledown \,\cdot \, D(x,y)\triangledown u(x,y) +\sigma (x,y)u(x,y) = s(x,y), }$$
(7)

where D(x, y) > 0 and σ(x, y) ≥ 0. This is an ADI model problem when the region is rectangular with uniform boundary condition on each side and the coefficients are separable in that

$$\displaystyle{ D(x,y) = D(x)D^{\prime}(y)\text{ and }\sigma (x,y) = D(x)D^{\prime}(y)[\sigma (x) +\sigma ^{\prime}(y)], }$$
(8)

for we may then divide the equation by D(x)D′(y) and express the operator as the sum of two commuting operators, ℋ and \(\mathcal{V}\), where

$$\displaystyle{ \mathcal{H} = -\frac{1} {D(x)} \dfrac{\partial } {\partial x}\,D(x) \dfrac{\partial } {\partial x} +\sigma (x) }$$
(9.1)

and

$$\displaystyle{ \mathcal{V} = -\frac{1} {D^{\prime}(y)} \dfrac{\partial } {\partial y}\,D^{\prime}(y) \dfrac{\partial } {\partial y} +\sigma ^{\prime}(y). }$$
(9.2)

This is a slight generalization of the model problem displayed by Young and Wheeler ( 1964 ) in which σ was restricted to KD(x)D′(y) with K constant.

When the neutron group-diffusion equation is discretized by the box-integration method, the difference forms of Eqs. 9 are each three-point equations. We need not divide the equations by D(x, y) if we define the F matrix by

$$\displaystyle{ F = \text{diag}[g(i)\,g^{\prime}(j)] = GG^{\prime} = \text{diag}[g(i)]\,\text{diag}[g^{\prime}(j)], }$$
(10)

where

$$\displaystyle{ g(i) = \frac{1} {2}[D_{i}h_{i} + D_{i-1}h_{i-1}], }$$
(11.1)

and

$$\displaystyle{ g^{\prime}(j) = \frac{1} {2}[D_{j}^{\prime}k_{j} + D_{j-1}^{\prime}k_{j-1}]. }$$
(11.2)

In these equations, D_i = D(x) between columns i and i + 1 while D′_j = D′(y) between rows j and j + 1. The coefficient matrix obtained by box-integration can now be expressed as

$$\displaystyle{ A = LG^{\prime} + L^{\prime}G, }$$
(12)

where for row-ordered equations

$$\displaystyle{ L \equiv \text{diagonal}_{m}[L_{n}], }$$
(13)

with the matrix L n repeated as the m diagonal blocks in L given by

$$\displaystyle{ L_{n} = \text{tridiagonal}\left \{-\frac{D_{i-1}} {h_{i-1}},\ \left [D_{i-1}\left ( \frac{1} {h_{i-1}}+\frac{h_{i-1}\sigma _{i-1}} {2} \right )+D_{i}\left ( \frac{1} {h_{i}}+\frac{h_{i}\sigma _{i}} {2} \right )\right ],\ -\frac{D_{i}} {h_{i}} \right \}, }$$
(14)

and for column-ordered equations

$$\displaystyle{ L^{\prime} \equiv \text{diagonal}_{n}[L^{\prime}_{m}], }$$
(15)

with the matrix L′ m repeated as the n diagonal blocks in L′ given by

$$\displaystyle{ L^{\prime}_{m} = \text{tridiagonal}\left \{-\frac{D^{\prime}_{j-1}} {k_{j-1}},\ \left [D^{\prime}_{j-1}\left ( \frac{1} {k_{j-1}}+\frac{k_{j-1}\sigma ^{\prime}_{j-1}} {2} \right )+D^{\prime}_{j}\left ( \frac{1} {k_{j}}+\frac{k_{j}\sigma ^{\prime}_{j}} {2} \right )\right ],\ -\frac{D^{\prime}_{j}} {k_{j}} \right \}. }$$
(16)

Here, σ i is the value between columns i and i + 1 while σ′ j is the value between rows j and j + 1.

The primed and unprimed matrices of order mn commute. The ADI equations can be expressed in the form

$$\begin{array}{rlrlrl} (LG^{\prime} + w_{s}GG^{\prime})\mathbf{u}_{s-\frac{1} {2} } & = -(L^{\prime}G - w_{s}GG^{\prime})\mathbf{u}_{s-1} + \mathbf{s}, &\end{array}$$
(17.1)
$$\begin{array}{rlrlrl} (L^{\prime}G + w_{s}^{\prime}GG^{\prime})\mathbf{u}_{s} & = -(LG^{\prime} - w_{s}^{\prime}GG^{\prime})\mathbf{u}_{s-\frac{1} {2} } + \mathbf{s}, & \\ s & = 1,2,\ldots,J. & & \end{array}$$
(17.2)

The right-hand side of Eq. 17.1 may be computed with the column-ordered block diagonal matrix L′ and column-ordered u and s. The resulting vector may then be reordered by rows as the forcing term for Eq. 17.1 with row ordering. Similarly, the right-hand side of Eq. 17.2 may be computed in row order and transposed to column order.
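
One double sweep of Eqs. 17 can be written compactly if the unknowns are held as an m-by-n array indexed by (row j, column i); the reordering between row and column form then amounts to acting along different array axes. The sketch below is illustrative only: the names are hypothetical, a single parameter pair (w_s, w′_s) is applied, and dense solves stand in for the tridiagonal (Thomas-algorithm) solves used in practice. Note that the same tridiagonal matrix serves every row of the first half sweep and every column of the second, since L_n and the weights g(i) do not depend on j, and L′_m and g′(j) do not depend on i.

```python
import numpy as np

def adi_double_sweep(U, S, Ln, Lpm, g, gp, w, wp):
    """One double sweep of Eqs. 17 for the five-point problem A = LG' + L'G.
    U, S are m-by-n arrays of unknowns and sources (row j, column i); Ln (n x n)
    and Lpm (m x m) are the tridiagonal blocks of Eqs. 14 and 16; g, gp hold the
    diagonal entries g(i), g'(j) of Eqs. 11; (w, wp) is one parameter pair."""
    GG = np.outer(gp, g)                         # entries g(i) g'(j) of F = GG'

    # Row sweep, Eq. 17.1: right side formed with L' acting down columns,
    # then every row is solved with the same tridiagonal matrix.
    R = S - (Lpm @ U) * g[None, :] + w * GG * U
    A_row = Ln + w * np.diag(g)
    U_half = np.linalg.solve(A_row, (R / gp[:, None]).T).T

    # Column sweep, Eq. 17.2: right side formed with L acting along rows,
    # then every column is solved with the same tridiagonal matrix.
    R = S - (U_half @ Ln.T) * gp[:, None] + wp * GG * U_half
    A_col = Lpm + wp * np.diag(gp)
    return np.linalg.solve(A_col, R / g[None, :])
```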

Eigenvalue bounds must be computed for the commuting tridiagonal matrices G_n^{-1}L_n and G′_m^{-1}L′_m for determining optimum parameters and associated convergence. These matrices are similar to SPD matrices, and the methods described for the model Laplace equation suffice for computing these eigenvalue bounds.

3.3 Nine-Point (FEM) Equations

When the Laplace or neutron group-diffusion operator is discretized by the finite element method over a rectangular mesh with bilinear basis functions, the equations are nine-point rather than five-point. It is by no means obvious that these are model ADI problems. Although Peaceman and Rachford introduced ADI iteration in the 1950s and the theory relating to convergence and choice of optimum parameters was in place by 1963, it was not until 1983 that I discovered how to express the nine-point equations as a model ADI problem [Wachspress, 1984]. The catalyst for this generalization was the analysis of the generalized five-point model problem discussed in Sect. 3.2 and in particular the form of the ADI iteration in Eqs. 17. This method was first implemented in 1990 [Dodds, Sofu and Wachspress], roughly 35 years after the seminal work by Peaceman and Rachford. One might question the practical worth of such effort in view of the restrictions imposed by the model conditions. However, application of model-problem analysis to more general problems will be exposed in Sect. 3.4.

Finite element discretization is based on a variational principle applied with a set of basis functions over each element. The basis functions from which the nine-point equations over a rectangular grid are obtained are bilinear. These nine-point finite element equations are related to the five-point box-integration equations.

A detailed analysis reveals that when the model conditions of Eq. 8 are satisfied, the finite element equations can be expressed as in Eq. 12:

$$\displaystyle{ A\mathbf{u} \equiv (LG^{\prime} + L^{\prime}G)\mathbf{u} = \mathbf{s}, }$$
(18)

where we define the unprimed matrices when the equations are ordered by rows as

$$\begin{array}{rlrlrl} L & \equiv \text{diagonal}_{m}[L_{n}], &\end{array}$$
(19.1)
$$\begin{array}{rlrlrl} G & \equiv \text{diagonal}_{m}[G_{n}], &\end{array}$$
(19.2)

with tridiagonal matrices repeated as diagonal blocks:

$$\displaystyle{ L_{n} = \text{tridiagonal}\left \{D_{i-1}\left (\frac{h_{i-1}\sigma _{i-1}} {6} - \frac{1} {h_{i-1}}\right ),\right. }$$
$$\displaystyle{ \quad \left.\left [D_{i-1}\left (\frac{h_{i-1}\sigma _{i-1}} {3} + \frac{1} {h_{i-1}}\right ) + D_{i}\left (\frac{h_{i}\sigma _{i}} {3} + \frac{1} {h_{i}}\right )\right ],\ D_{i}\left (\frac{h_{i}\sigma _{i}} {6} - \frac{1} {h_{i}}\right )\right \} }$$
(20)

and

$$\displaystyle{ G_{n} = \text{tridiagonal}[D_{i-1}h_{i-1},\ 2(D_{i-1}h_{i-1} + D_{i}h_{i}),\ D_{i}h_{i}]/6. }$$
(21)

The primed matrices are of the same form when the equations are ordered by columns:

$$\begin{array}{rlrlrl} L^{\prime} & \equiv \text{diagonal}_{n}[L^{\prime}_{m}], &\end{array}$$
(22.1)
$$\begin{array}{rlrlrl} G^{\prime} & \equiv \text{diagonal}_{n}[G^{\prime}_{m}], &\end{array}$$
(22.2)

with tridiagonal matrices:

$$\displaystyle{ L_{m}^{\prime} = \text{tridiagonal}\left \{D^{\prime}_{j-1}(\frac{k_{j-1}\sigma _{j-1}^{\prime}} {6} - \frac{1} {k_{j-1}}),\right. }$$
$$\displaystyle{ \quad \left.\left [D^{\prime}_{j-1}\left (\frac{k_{j-1}\sigma _{j-1}^{\prime}} {3} + \frac{1} {k_{j-1}}\right )+D^{\prime}_{j}\left (\frac{k_{j}\sigma _{j}^{\prime}} {3} + \frac{1} {k_{j}}\right )\right ],\ D^{\prime}_{j}\left (\frac{k_{j}\sigma _{j}^{\prime}} {6} -\frac{1} {k_{j}}\right )\right \} }$$
(23)

and

$$\displaystyle{ G^{\prime}_{m} = \text{tridiagonal}[D^{\prime}_{j-1}k_{j-1},\ 2(D^{\prime}_{j-1}k_{j-1} + D^{\prime}_{j}k_{j}),\ D^{\prime}_{j}k_{j}]/6. }$$
(24)

The σ terms in the L and L′ matrices are characteristic of finite element rather than box-integration equations, but this difference is sometimes eliminated by the “lumped mass” finite element approach, which reduces the σ contribution to the box-integration diagonal contribution. Matrices L_n and L′_m in Eqs. 20 and 23 are then identical to matrices L_n and L′_m in Eqs. 14 and 16. This has no effect on the ADI analysis. The G and G′ matrices are now tridiagonal diffusion-coefficient-weighted Simpson rule quadrature matrices. The fact that these matrices are tridiagonal rather than diagonal seems to preclude efficient ADI iteration, but we shall soon show how this is remedied.

We consider the ADI-type iteration defined in Eq. 17:

$$\begin{array}{rlrlrl} (LG^{\prime} + w_{s}GG^{\prime})\mathbf{u}_{s-\frac{1} {2} } & = -(L^{\prime}G - w_{s}GG^{\prime})\mathbf{u}_{s-1} + \mathbf{s}, &\end{array}$$
(25.1)
$$\begin{array}{rlrlrl} (L^{\prime}G + w_{s}^{\prime}GG^{\prime})\mathbf{u}_{s} & = -(LG^{\prime} - w_{s}^{\prime}GG^{\prime})\mathbf{u}_{s-\frac{1} {2} } + \mathbf{s},\ & \\ s & = 1,2,\ldots,J. & & \end{array}$$
(25.2)

Since G and G′ are tridiagonal rather than diagonal, the systems to be solved in each step are not block tridiagonal but have the same structure as the coefficient matrix A. They are systems of nine-point equations. We must somehow reduce these iteration equations to the form of Eqs. 1–3 with tridiagonal systems on the left-hand sides. For this purpose we define the vectors

$$\displaystyle{ \mathbf{v}_{s-\frac{1} {2} } = G^{\prime}\mathbf{u}_{s-\frac{1} {2} } }$$
(26.1)

and

$$\displaystyle{ \mathbf{v}_{s} = G\mathbf{u}_{s}. }$$
(26.2)

One starts the iteration by computing v 0 = G u 0 and by virtue of commutativity of primed and unprimed matrices rewrites Eqs. 25 as

$$\begin{array}{rlrlrl} (L + w_{s}G)\mathbf{v}_{s-\frac{1} {2} } & = -(L^{\prime} - w_{s}G^{\prime})\mathbf{v}_{s-1} + \mathbf{s}, &\end{array}$$
(27.1)
$$\begin{array}{rlrlrl} (L^{\prime} + w_{s}^{\prime}G^{\prime})\mathbf{v}_{s} & = -(L - w_{s}^{\prime}G)\mathbf{v}_{s-\frac{1} {2} } + \mathbf{s}, & \\ s & = 1,2,\ldots,J. & & \end{array}$$
(27.2)

These equations are almost the same as the five-point iteration equations. They differ only in that the iteration parameters are multiplied by tridiagonal rather than diagonal matrices. However, the matrices on each side of these equations have the same structure as the corresponding five-point matrices. The coefficient matrix on the left side of Eq. 27.1 for update of all rows is the tridiagonal matrix (L n  + w s G n ), and the coefficient matrix on the left side of Eq. 27.2 for update of all columns is the tridiagonal matrix (L′ m  + w′ s G′ m ). The iteration is terminated with recovery of u J after J iterations by solving the tridiagonal systems G u J  = v J .

The eigenvalue bounds for G_n^{-1}L_n and G′_m^{-1}L′_m must be computed. These may be treated as generalized eigenvalue problems: L_n e = λG_n e and L′_m e′ = γG′_m e′. Shifted inverse iteration has been used to compute upper and lower bounds for these eigenvalues. Some simple observations facilitate the computation. Matrices L_n and L′_m have positive inverses [Varga, 1962] and matrices G_n and G′_m are irreducible and nonnegative. Therefore, matrices L_n^{-1}G_n and L′_m^{-1}G′_m are positive. The Perron theorem asserts that the largest eigenvalues of these matrices have positive eigenvectors. If we choose e_0 as a vector with all components equal to unity and solve the tridiagonal systems L_n e_1 = G_n e_0 and L′_m e′_1 = G′_m e_0, then the largest components of e_1 and e′_1 are upper bounds on the largest eigenvalues of these positive matrices. Their reciprocals are therefore lower bounds for the smallest eigenvalues of G_n^{-1}L_n and G′_m^{-1}L′_m, respectively. These bounds may be used as a first shift in the computation of the lower eigenvalue bounds. First estimates for upper bounds may be computed with the Rayleigh quotients \(\frac{\mathbf{f}_{0}^{T}L_{n}\,\mathbf{f}_{0}} {\mathbf{f}_{0}^{T}G_{n}\,\mathbf{f}_{0}}\) and \(\frac{\mathbf{f}_{0}^{T}L^{\prime}_{m}\,\mathbf{f}_{0}} {\mathbf{f}_{0}^{T}G^{\prime}_{m}\,\mathbf{f}_{0}}\), where the components of f_0 alternate between plus one and minus one.
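
A minimal sketch of these first estimates, assuming the tridiagonal L_n and G_n of Eqs. 20–21 are available as dense arrays (the function name is hypothetical; in practice tridiagonal solves and a few shifted-inverse-iteration sweeps would follow to refine both ends of the spectrum):

```python
import numpy as np

def generalized_bound_estimates(Ln, Gn):
    """First estimates for the extreme eigenvalues of the pencil L_n e = lambda G_n e."""
    n = Ln.shape[0]
    e0 = np.ones(n)
    e1 = np.linalg.solve(Ln, Gn @ e0)          # L_n e_1 = G_n e_0
    lower = 1.0 / e1.max()                     # Perron bound: rigorous lower bound on lambda_min
    f0 = np.where(np.arange(n) % 2 == 0, 1.0, -1.0)
    upper = (f0 @ Ln @ f0) / (f0 @ Gn @ f0)    # Rayleigh quotient: first estimate for lambda_max
    return lower, upper

# the estimates can be checked against the full generalized eigensolution, e.g.
# scipy.linalg.eigh(Ln, Gn, eigvals_only=True), on small test matrices
```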

3.4 ADI Model-Problem Limitations

We have described a class of boundary value problems to which ADI model-problem theory applies. There is no other iterative method for which precise convergence prediction is possible that has this logarithmic dependence on problem condition. (We measure the condition of an SPD system by the ratio of the maximum to minimum eigenvalue of the coefficient matrix. This condition often varies as the number of nodes in the grid when spacing retains the same uniformity as the grid is refined.) Preconditioned conjugate gradient and multigrid computation may be competitive and even superior for some of these problems, but their convergence theory is less definitive. Successive overrelaxation and Chebyshev extrapolation converge as the square root of the condition of the problem. For moderately difficult ADI model problems, ADI iteration is more efficient. For example, the five-point Laplace problem with equal spacing on a 100 × 100 grid requires about 150 SOR iterations but only 10 ADI iterations for an error reduction by a factor of 10^{-4}. One model-problem ADI iteration, including both sweeps, requires about twice the work of one SOR iteration, but ADI has a clear advantage here. This advantage tends to manifest itself with smaller grids when mesh spacing is not uniform.

The greatest failing of ADI iteration is not in solution of model problems, but rather in restrictions imposed by the model conditions. Practitioners often demand methods which are applicable to a greater variety of problems. ADI iteration is often applied to problems for which model conditions are not met. Although considerable success has been realized for a variety of problems, departure from model conditions can lead to significant deterioration of the rapid convergence characteristic of ADI applied to model problems. Varga [ 1962 ] illustrated this with a simple problem contrived so that ADI iteration diverges with parameters chosen as though model conditions are satisfied when in reality they are not. Theory relating to parameter selection for general problems is sketchy. Although convergence can be guaranteed with some choices, the rate of convergence can rarely be predicted with variable parameters when the model conditions are not satisfied. It is this lack of sound theoretical foundations that motivated restriction of this monograph to application of ADI iteration only to model problems. In the next section we describe how model-problem ADI iteration may be applied to solve problems for which model conditions are not satisfied.

3.5 Model-Problem Preconditioners

3.5.1 Preconditioned Iteration

Several significant concepts were introduced in Wachspress [1963]. The Peaceman–Rachford ADI equations (Eq. 1 of Chap. 1) were generalized with different parameters for the two sweeps of each iteration (Eqs. 1–3) to improve efficiency in solution of problems with different spectral intervals for the two directions. The earlier AGM algorithm for computing parameters when J = 2^n (Sect. 1.4) was extended to this generalized iteration. This algorithm motivated Jordan’s transformation of variables (Sect. 2–1.3). Both the variable transformation and Jordan’s elliptic-function solution to the minimax problem were published for the first time as an appendix in Wachspress [1963].

The method now known as “preconditioned conjugate gradients” was also introduced in this paper as “compound iteration.” Studies performed in 1962 established the potency of this new procedure, but the sparse numerical studies reported in this paper stimulated little interest and the method lay dormant for several years. It was rediscovered, was enhanced with a variety of preconditioners, and is now one of the more universally used methods for solving large elliptic type systems.

Compound iteration with ADI inner iteration was introduced by D’Yakonov [1961] to extend application of model-problem ADI iteration to problems for which the model conditions were violated. The model problem was thus used as a “preconditioner” for the true problem. The term preconditioner was not introduced until several years after D’Yakonov’s paper appeared. D’Yakonov used a two-term “outer” iteration with a constant extrapolation that converged about the same as Gauss–Seidel applied to the preconditioned system. The combination of ADI preconditioning and Lanczos-type outer iteration was the new aspect of the analysis in my 1963 paper. This is in general much more efficient than Gauss–Seidel iteration.

The following description of compound iteration is taken directly from the 1963 paper. The wording parallels that of modern texts on this method. I have been unable to find an earlier published account of preconditioned conjugate gradients, and refer the reader to the comprehensive historical review by Golub and O’Leary [ 1987 ].

3.5.2 Compound Iteration (Quotations from Wachspress,  1963 )

“Application of compound iteration with inner iteration other than ADI was described by Cesari [1937] and by Engeli et al. [1959]. Use of ADI iteration in this manner was discussed first by D’Yakonov [1961]. We wish to solve the matrix equation A z = s for z when given the vector s and the real, positive definite matrix A. It is not often possible to express A as the sum of two symmetric commuting matrices, H and V, such that the matrix inversions in [Eq. 3 of Chap. 1] are readily performed. There may, however, be a model problem matrix M which approximates A in the sense that p(M^{-1}A) ≪ p(A), where p is the p-condition number, equal in this case to the ratio of the maximum to minimum eigenvalues. The closer p(M^{-1}A) is to unity, the more efficient compound iteration becomes.”

The paper continued with proof of a theorem on the effect of termination of the ADI model problem iteration with error reduction ε on the condition of this compound iteration. The ADI iteration actually replaces M by an SPD matrix B ε . A more detailed proof with useful innovations will be given in Sect. 3.6. The theorem asserts that the effective condition is

$$\displaystyle{ p(B_{\varepsilon }^{-1}A) \leq \frac{1+\varepsilon } {1-\varepsilon }p({M}^{-1}A). }$$
(28)

Next, details were given for a symmetric conjugate gradient algorithm applied directly to the system B^{-1}A. This was the first published account of the applicability of this algorithm to a product of SPD matrices. Hestenes and Stiefel [1952] discussed preconditioning of nonsymmetric systems with their transposes to yield symmetric systems. The observation that these algorithms could be applied to a product of SPD matrices is trivial and can be cast as application to A with inner products defined as \((\mathbf{w}\,,\,\mathbf{z}) ={ \mathbf{w}}^{T}{B}^{-1}\mathbf{z}\). When B is SPD one can define a norm consistent with this inner product as \(\|\mathbf{u}\| = {(\mathbf{u}\,,\,\mathbf{u})}^{\frac{1} {2} }\). The conjugate gradient algorithm then minimizes the norm of the residual vector.

After giving the inner products and recursion formulas of the conjugate gradient algorithm, my 1963 paper continued with: “The number of Lanczos iterations for a prescribed error reduction varies as \(\sqrt{ p({B}^{-1}A)}\). [A footnote attributed this result to Lanczos iteration being at least as efficient as Chebyshev extrapolation.] To gain some insight regarding best strategy for compound iteration, we …observe that the total number of ADI iterations …varies as

$$\displaystyle{ J\sqrt{\frac{1 +\varepsilon _{J } } {1 -\varepsilon _{J}}p({M}^{-1}A)}. }$$
(29)

…When Jordan’s [parameter selection] is used, J is optimum when ε_J is approximately equal to 0.36. In numerical application, however, one must consider relative time requirements of inner and outer iterations…It may then be best to choose J so that ε_J is an order of magnitude smaller. This may increase the total number of inner (ADI) iterations, but the overall time may be reduced significantly…A desirable feature of compound iteration is that, having decided upon strategy according to machine limitations, one may find efficient iteration parameters with negligible computation time.”

The paper continued with analysis of dependence on mesh spacing as a function of normalization of A in an attempt to approach model conditions and with numerical studies comparing different normalizations. The paper concluded with the statement that “Numerical results support prediction based on theory of rapid convergence rates in the numerical solution of the diffusion equation over a rectangular domain. Further studies are contemplated, including extension to nonrectangular domains.” This latter study was pursued with a few examples in my 1966 book.

3.5.3 Updated Analysis of Compound Iteration

Although much of the early analysis is still valid, developments during the past 25 years have shed new light on this approach and have led to improvements. We first consider generation of a model problem. The early studies were done with the Laplace operator as a model for the diffusion operator with diffusion coefficient D(x, y). D’Yakonov proved that p(M^{-1}A) is equal to the ratio of the maximum to minimum values of D(x, y). This is independent of grid geometry. Thus, the number of outer iterations is independent of spacing h as h → 0. Computation time per iteration increases as h^{-2}, and the number of inner ADI iterations per outer iteration to achieve a fixed error reduction increases as \(\log \frac{1} {h}\).

In my 1984 paper an algorithm was presented for choosing a separable model problem to solve the diffusion equation in the absence of the σ term. This requires a “best” approximation to D(x, y) by the separable coefficient D(x)D′(y). If one considers the approximation of lnD(x, y) by lnD(x) + lnD′(y), one has the problem treated by Diliberto and Strauss [ 1951 ]: “On the approximation of a function of several variables by a sum of functions of fewer variables.” In our application we have a precise measure of merit in that now

$$\displaystyle{ p({M}^{-1}A) \leq \frac{\max \frac{D(x,y)} {D(x)D^{\prime}(y)}} {\min \frac{D(x,y)} {D(x)D^{\prime}(y)}}. }$$
(30)

The algorithm for determining separable diffusion coefficients entails alternating improvement of D(x) and D′(y) until further improvement yields negligible reduction in p. The algorithm is as follows (a short code sketch is given after the list):

  1. For \(i = 1,2,\ldots,m\), set \(D_{i} = 1.0\).

  2. For \(j = 1,2,\ldots,n\), set \(D^{\prime}_{j} ={ \left (\mathop{\max }\limits_{i}D_{ij} \cdot \mathop{\min }\limits_{i}D_{ij}\right )}^{\frac{1} {2} }\).

  3. For \(i = 1,2,\ldots,m\), set \(D_{i} ={ \left (\mathop{\max }\limits_{j}\frac{D_{ij}} {D^{\prime}_{j}} \cdot \mathop{\min }\limits_{j}\frac{D_{ij}} {D^{\prime}_{j}} \right )}^{\frac{1} {2} }\).

  4. For \(j = 1,2,\ldots,n\), set \(D^{\prime}_{j} ={ \left (\mathop{\max }\limits_{i}\frac{D_{ij}} {D_{i}} \cdot \mathop{\min }\limits_{i}\frac{D_{ij}} {D_{i}} \right )}^{\frac{1} {2} }\).

  5. Cycle through steps 3 and 4 until values do not change appreciably. Convergence is quite rapid and high accuracy is not required. Two or three iterations often suffice.
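
A short sketch of steps 1–5, assuming the cell values D_ij are stored in an (m, n) array (the function name is hypothetical). Applied to the 3-by-3 pattern of the example that follows, it yields ratios D_ij/(D_iD′_j) spanning roughly [0.18, 5.48], i.e. a condition ratio of about 30 as quoted below; the individual factors are determined only up to a common scale.

```python
import numpy as np

def separable_fit(Dij, cycles=3):
    """Alternating geometric-mean fit of D_ij by D_i * D'_j (steps 1-5 above)."""
    Dp = np.sqrt(Dij.max(axis=0) * Dij.min(axis=0))        # steps 1-2 (all D_i = 1)
    for _ in range(cycles):
        R = Dij / Dp[None, :]
        D = np.sqrt(R.max(axis=1) * R.min(axis=1))         # step 3
        R = Dij / D[:, None]
        Dp = np.sqrt(R.max(axis=0) * R.min(axis=0))        # step 4
    return D, Dp

Dij = np.array([[9.0, 25.0, 1.0], [16.0, 100.0, 1600.0], [1.0, 4.0, 36.0]])
D, Dp = separable_fit(Dij)
ratios = Dij / np.outer(D, Dp)
p = ratios.max() / ratios.min()                            # about 30 for this pattern
```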

The example given in [Wachspress,  1984 ] was for the pattern of diffusion coefficients in the matrix

$$\displaystyle{ D_{ij} =\begin{array}{*{10}c} 9& 25 & 1\\ 16 &100 &1600 \\ 1 & 4 & 36 \end{array}. }$$

The values for D i and D′ j obtained by two cycles of the algorithm were

\(D_{1} = 6.931\quad D_{2} = 28.88\quad D_{3} = 23.10\)

\(D^{\prime}_{1} = 0.465\quad D^{\prime}_{2} = 12.65\quad D^{\prime}_{3} = 0.237\)

This resulted in

$$\displaystyle{ D_{i}D^{\prime}_{j} =\begin{array}{*{10}c} 1.643 & 6.845 & 5.475\\ 87.677 &365.332 &292.215 \\ 3.223 & 13.429 & 10.742 \end{array}. }$$

The ratios of diffusion coefficients were then

$$\displaystyle{ \frac{D_{ij}} {D_{i}D^{\prime}_{j}} =\begin{array}{*{10}c} 5.478 &3.652 &0.183\\ 0.183 &0.274 &5.475 \\ 0.310 & 0.298 &3.351 \end{array}. }$$

Thus, \(p({M}^{-1}A) = \frac{5.478} {0.183} = 29.93\) in contrast with the Laplacian model-problem value of 1600. Since the solution effort varies as the square root of p, there is a gain by a factor greater than seven through use of the best separable problem. Note that the “best” D_i and D′_j are not necessarily unique. In this example, D′_1 may vary within the interval [0.286, 0.760] without increasing p.

For the more general diffusion equation with removal σ, we first compute the separable diffusion coefficient as above and then approximate \(\tau _{ij} \equiv \frac{\sigma _{ij}} {D_{i}D_{j}^{\prime}}\) by τ_i + τ′_j. One scheme which has been used successfully is to approximate exp(τ_ij) by the product exp(τ_i) exp(τ′_j), using the same algorithm as for approximating the nonseparable diffusion coefficient. Care must be taken to disallow negative removal. This can be accomplished by replacing an exponential value less than unity by unity in the algorithm.
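
A sketch of this removal fit, reusing the alternating geometric-mean idea on exp(τ_ij) and clamping factors below unity to unity so the separable removal stays nonnegative; the clamping detail and the function name are assumptions made for illustration, not taken from the original.

```python
import numpy as np

def separable_removal(tau, cycles=3):
    """Fit tau_ij by tau_i + tau'_j via a separable fit of exp(tau_ij);
    factors below 1 are replaced by 1 so the fitted removal is nonnegative."""
    E = np.exp(tau)
    ej = np.maximum(np.sqrt(E.max(axis=0) * E.min(axis=0)), 1.0)
    for _ in range(cycles):
        R = E / ej[None, :]
        ei = np.maximum(np.sqrt(R.max(axis=1) * R.min(axis=1)), 1.0)
        R = E / ei[:, None]
        ej = np.maximum(np.sqrt(R.max(axis=0) * R.min(axis=0)), 1.0)
    return np.log(ei), np.log(ej)           # tau_i and tau'_j
```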

If \(\alpha > \frac{D_{ij}} {D_{i}D_{j}^{\prime}} > \frac{1} {\alpha }\) and \(\beta > \frac{\tau _{ij}} {\tau _{i}+\tau _{j}^{\prime}} > \frac{1} {\beta }\), then p(M  − 1 A) is bounded by (α + β)2. Competition between diffusion and removal is a function of the geometry and changes with mesh spacing. The removal term will have its maximum effect on eigenvectors associated with smaller eigenvalues of the matrix A. The geometric buckling of a rectangle of length X and height Y is defined as \({B}^{2} = ( \frac{{\pi }^{2}} {{X}^{2}}+ \frac{{\pi }^{2}} {{Y }^{2}} )\). A reasonable estimate for p is

$$\displaystyle{ p({M}^{-1}A)\doteq\frac{\mathop{\max }\limits_{ij} \frac{{B}^{2}D_{ ij}+\sigma _{ij}} {({B}^{2}+\tau _{i}+\tau _{j}^{\prime})D_{i}D_{j}^{\prime}}} {\mathop{\min }\limits_{ij} \frac{{B}^{2}D_{ij}+\sigma _{ij}} {({B}^{2}+\tau _{i}+\tau _{j}^{\prime})D_{i}D_{j}^{\prime}}}. }$$
(31)

The value computed in the absence of removal is precise when there is an interior node in each region of constant D_iD′_j. There is an eigenvector of M^{-1}A with a component of unity at each such node and zero elsewhere, belonging to the eigenvalue \(\frac{D_{ij}} {D_{i}D_{j}^{\prime}}\). In the absence of such interior nodes, the value computed is a close upper bound on p. The value in Eq. 31 is only an estimate that can be used to assess the model problem prior to the actual iteration. In the absence of removal, precise bounds are computable for the eigenvalues of M^{-1}A. This facilitates use of Chebyshev extrapolation as the outer iteration. In the absence of such bounds, conjugate gradient iteration seems preferable. The cost of the additional inner products is not significant.

3.6 Interaction of Inner and Outer Iteration

Let A be the coefficient matrix of the discretized diffusion operator \(-\triangledown \,\cdot \, D(x,y)\triangledown \) over a rectangular partitioning of a rectangle, resulting from either five-point differencing or nine-point bilinear finite elements. The vector u whose components are the approximations to the desired field values at the grid nodes is obtained as the solution to the linear system

$$\displaystyle{ A\mathbf{u}\ =\ \mathbf{b}, }$$
(32)

where b is a given vector. Let B be the corresponding matrix with the separable diffusion coefficient D(x)D′(y), and let the model-problem matrix equation be

$$\displaystyle{ B\mathbf{v}\ =\ \mathbf{r}. }$$
(33)

Let F be the SPD normalizing matrix defined in Sect. 3.1 for which the matrix splitting \(B = H + V\) satisfies \(H{F}^{-1}V - V {F}^{-1}H = 0\). It follows that F  − 1 B commutes with F  − 1 H and F  − 1 V. For any matrix X, define \(\tilde{X} = {F}^{-\frac{1} {2} }\ X\ {F}^{-\frac{1} {2} }\). Then if we define

$$\displaystyle{ \tilde{\mathbf{v}} \equiv {F}^{\frac{1} {2} }\mathbf{v},\ \ \tilde{\mathbf{r}} \equiv {F}^{-\frac{1} {2} }\mathbf{r},\ \ \tilde{\mathbf{b}} \equiv {F}^{-\frac{1} {2} }\mathbf{b},\ \text{ and }\tilde{\mathbf{u}} \equiv {F}^{\frac{1} {2} }\mathbf{u}, }$$
(34)

we have the transformed problem to be solved:

$$\displaystyle{ \tilde{A}\tilde{\mathbf{u}} =\tilde{ \mathbf{b}}\, }$$
(35)

and the corresponding model problem:

$$\displaystyle{ \tilde{B}\tilde{\mathbf{v}}\ =\ \tilde{ \mathbf{r}} }$$
(36)

with \(\tilde{B}\ =\ \tilde{ H}\ +\ \tilde{ V } \ \), where

$$\displaystyle{ \tilde{H}\tilde{V } -\tilde{ V }\tilde{H} = {F}^{-\frac{1} {2} }[H{F}^{-1}V - V {F}^{-1}H]{F}^{-\frac{1} {2} } = 0. }$$
(37)

Matrices \(\tilde{A}\), \(\tilde{B}\), \(\tilde{H}\), and \(\tilde{V }\) are all SPD. Let \(\tilde{T}\) be the ADI iteration matrix for the symmetric normalized equations. This iteration matrix is symmetric with eigenvalues in the interval [ − ε, ε]. The base matrix on which the outer iteration acts is

$$\displaystyle{ \tilde{W} = (I -\tilde{ T})\tilde{{B}}^{-1}\tilde{A}, }$$
(38)

where

$$\displaystyle{ \tilde{T}\tilde{B} -\tilde{ B}\tilde{T} = 0. }$$
(39)

A similarity transformation with \(\tilde{{B}}^{\frac{1} {2} }\) yields

$$\displaystyle{ \tilde{W} \sim G \equiv (I -\tilde{T})\tilde{{B}}^{-\frac{1} {2} }\tilde{A}\tilde{{B}}^{-\frac{1} {2} }. }$$
(40)

G is the product of two SPD matrices, \((I -\tilde{ T})\) and \(\tilde{{B}}^{-\frac{1} {2} }\tilde{A}\tilde{{B}}^{-\frac{1} {2} }\). Therefore, the eigenvalues of G are all real and positive and its Jordan normal form is diagonal. Let

$$\displaystyle{ {b}^{{\prime}}\equiv \lambda _{\max }(\tilde{{B}}^{-1}\tilde{A}) =\lambda _{\max }(\tilde{{B}}^{-\frac{1} {2} }\tilde{A}\tilde{{B}}^{-\frac{1} {2} }), }$$
(41)

and let

$$\displaystyle{ {a}^{{\prime}}\equiv \lambda _{\min }(\tilde{{B}}^{-1}\tilde{A}) =\lambda _{\min }(\tilde{{B}}^{-\frac{1} {2} }\tilde{A}\tilde{{B}}^{-\frac{1} {2} }). }$$
(42)

Let \(b \equiv \lambda _{\max }(\tilde{W})\text{ and }a \equiv \lambda _{\min }(\tilde{W})\). Then

$$\displaystyle{ b\ \leq \ \| G\|\ \leq \ \| I -\tilde{ T}\|\ \|\tilde{{B}}^{-\frac{1} {2} }\tilde{A}\tilde{{B}}^{-\frac{1} {2} }\| = (1+\varepsilon ){b}^{{\prime}} }$$
(43)

and

$$\displaystyle{ a\ \geq \ \| {G{}^{-1}\|}^{-1}\ \geq \ {[\|{(I -\tilde{ T})}^{-1}\|\ \|\tilde{{B}}^{\frac{1} {2} }\tilde{{A}}^{-1}\tilde{{B}}^{\frac{1} {2} }\|]}^{-1} = (1-\varepsilon ){a}^{{\prime}}. }$$
(44)

Thus, we have as rigorous bounds on the eigenvalues of \(\tilde{W}\):

$$\displaystyle{ a = (1-\varepsilon ){a}^{{\prime}}\ \text{ and }\ b = (1+\varepsilon ){b}^{{\prime}}. }$$
(45)

The ADI equations are not normalized with the square-root matrix. The matrix on which the outer iteration acts is now \(W = (I - T){B}^{-1}A\). However, a similarity transformation with \({F}^{\frac{1} {2} }\) reveals that \(W \sim \tilde{ W}\). Hence, the eigenvalues of W are all real and positive with the same bounds, and the Jordan form of W is also diagonal. Let K be the matrix of eigenvectors of W. Then \(W = K\boldsymbol{\Lambda }{K}^{-1}\) where Λ is the positive diagonal matrix of eigenvalues of W. Any polynomial P n (W) can be expressed as \(P_{n}(W) = KP_{n}(\boldsymbol{\Lambda }){K}^{-1}\). Therefore,

$$\displaystyle{ \|P_{n}(W)\|\ \leq \ \| K\|\|{K}^{-1}\|\ \mathop{\max }\limits_{\lambda }\vert P_{ n}(\lambda )\vert =\kappa (K)\mathop{\max }\limits_{a \leq \lambda \leq b}\vert P_{n}(\lambda )\vert \, }$$
(46)

where κ is the condition number of matrix K. When Chebyshev extrapolation is used for the outer iteration with the eigenvalue bounds a and b,

$$\displaystyle{ \mathop{\max }\limits_{\lambda }\vert P_{n}(\lambda )\vert ={ \left (\cosh \left [{n\cosh }^{-1}\left (\frac{b + a} {b - a}\right )\right ]\right )}^{-1}. }$$
(47)

Thus, the norm of the error reduction after n outer iterations, with inner ADI error reduction ε each outer iteration, is bounded by

$$\displaystyle{ \sigma =\kappa { \left (\cosh \left [{n\cosh }^{-1}\left (\frac{b + a} {b - a}\right )\right ]\right )}^{-1}\, }$$
(48)

where the dependence on ε occurs through \(a = (1-\varepsilon ){a}^{{\prime}}\) and \(b = (1+\varepsilon ){b}^{{\prime}}\). Rigorous bounds on b′ and a′ are found readily. In finite element discretization, the contribution from rectangle q to \(\mathbf{x}^{\top }A\mathbf{x}\) divided by the contribution to \(\mathbf{x}^{\top }B\mathbf{x}\) is \(\frac{D(x,y)} {D(x)D^{\prime}(y)}\vert _{q}\). Therefore, the maximum eigenvalue of B^{-1}A is equal to

$$\displaystyle{ {b}^{{\prime}} =\mathop{\max }\limits_{ x,y} \frac{D(x,y)} {D(x)D^{\prime}(y)}. }$$
(49)

Similarly,

$$\displaystyle{ {a}^{{\prime}} =\mathop{\min }\limits_{ x,y} \frac{D(x,y)} {D(x)D^{\prime}(y)}. }$$
(50)

Let point i, j be interior to a region of constant D(x, y) and D(x)D′(y). Then the vector with nonzero value only at i, j is an eigenvector of B^{-1}A with eigenvalue equal to \(\frac{D(x,y)} {D(x)D^{\prime}(y)}\). Thus, the computed bounds are actually attained in the presence of such interior nodes. The other eigenvectors are in general not easily found and have components which are mostly nonzero. The separable model problem is generated to minimize the ratio b ∕ a. Although the number of ADI inner iterations required to attain a prescribed error reduction increases logarithmically with grid refinement, the number of outer iterations remains fixed. Conjugate gradient outer iteration seems appropriate in the presence of space-dependent removal terms (as in neutron diffusion problems), but when accurate eigenvalue bounds are easily found Chebyshev extrapolation may be slightly more efficient since one then avoids the need for computing two inner products per iteration.
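
The outer conjugate gradient iteration can be organized so that the model problem enters only through a routine that applies an approximate B^{-1} (J inner ADI sweeps). The following generic sketch assumes two user-supplied callables, apply_A and apply_Binv, both hypothetical stand-ins; it is the standard preconditioned conjugate gradient recursion, not a transcription of the 1963 formulation.

```python
import numpy as np

def pcg(apply_A, apply_Binv, b, tol=1e-8, maxit=200):
    """Conjugate gradient outer iteration preconditioned by a model-problem solve.
    apply_Binv(r) stands for J inner ADI sweeps on B v = r; apply_A(x) forms A x."""
    x = np.zeros_like(b)
    r = b - apply_A(x)
    z = apply_Binv(r)
    p = z.copy()
    rz = r @ z
    for _ in range(maxit):
        Ap = apply_A(p)
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        z = apply_Binv(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x
```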

Optimum choice of the number of inner iterations per outer may be determined in advance by minimizing the work required for a prescribed accuracy. Each inner iteration requires about the same work as the residual evaluation for the next outer iteration. Let t be the number of inners per outer and s the number of outers. Then the total work varies as \(f(t) = s(1 + t)\). For significant error reduction, s varies as

$$\displaystyle{ s = C\sqrt{\frac{1 +\varepsilon _{t } } {1 -\varepsilon _{t}}}. }$$
(51)

Optimum strategy often requires few inners per outer, so that asymptotic inner-iteration convergence estimates are not valid. The AGM algorithm for t = 2^n is useful in this analysis. We define

$$\begin{array}{rlrlrl} \theta _{1} & \equiv \sqrt{k^{\prime}}, &\end{array}$$
(52.1)
$$\begin{array}{rlrlrl} \theta _{m} & \equiv {\left [ \frac{2\theta _{m-1}} {1 +\theta _{ m-1}^{2}}\right ]}^{\frac{1} {2} }. &\end{array}$$
(52.2)

The inner iteration error reduction for t iterations is

$$\displaystyle{ \varepsilon (t) ={ \left (\frac{1 -\theta _{t}} {1 +\theta _{t}}\right )}^{2}. }$$
(53)

The number of outer iterations s varies as \({(\theta +\frac{1} {\theta } )}^{1/2}\), and

$$\displaystyle{ f(t = {2}^{n}) = C^{\prime}(1 + t){\left (\theta _{ t} + \frac{1} {\theta _{t}} \right )}^{\frac{1} {2} }. }$$
(54)

The most efficient strategy depends on the value of k′, and we examine a range of values (Table 3.1):

Table 3.1 Inner–outer iteration

For most problems of interest, k′ ≪ 0.01 and a value close to t = 4 is optimum. One may compute ε(t) by one of the methods described in Sect. 1.6 to optimize t. For example, when \(k^{\prime} = 1{0}^{-6}\), Eq. 1–54 gives

$$\begin{array}{rlrlrl} \varepsilon (t) & = 4\exp \left [- \frac{{\pi }^{2}t} {\ln (4/1{0}^{-6})}\right ] & & \\ & = 4{(0.5224)}^{t}. &\end{array}$$
(55)

For comparison with the values in the above table,

$$\displaystyle{ f(t) = C^{\prime}\sqrt{2}(1 + t){\left (\frac{1+\varepsilon } {1-\varepsilon }\right )}^{\frac{1} {2} }, }$$
(56)

and we compute \(f(3)/C^{\prime}\doteq10.82\) and \(f(5)/C^{\prime}\doteq9.93\). For a fair comparison we reevaluate f(4) ∕ C′ with this approximation as \(f(4)/C^{\prime}\doteq9.61\). This does not differ appreciably from the value of 9.54 in the table. In this case, t = 4 is indeed optimal.
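
These numbers are easily reproduced from Eqs. 55 and 56; a small sketch with the asymptotic error-reduction formula as its only input:

```python
import math

def eps(t, kp=1e-6):
    """Asymptotic inner ADI error reduction for t sweeps, Eq. 55."""
    return 4.0 * math.exp(-math.pi ** 2 * t / math.log(4.0 / kp))

def work(t, kp=1e-6):
    """Relative work f(t)/C' of Eq. 56 for t inner sweeps per outer iteration."""
    e = eps(t, kp)
    return math.sqrt(2.0) * (1 + t) * math.sqrt((1.0 + e) / (1.0 - e))

print([round(work(t), 2) for t in (3, 4, 5)])   # about [10.81, 9.61, 9.93]
```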

Having established that the number of outer iterations varies as \(\sqrt{ p({B}^{-1}A)}\) and a means for relating the number of inner iterations per outer to k′, we return to the question of whether or not a nine-point model preconditioner is more efficient than a five-point model preconditioner when A is a nine-point finite element matrix. The smallest eigenvalues of B_5 and B_9 do not differ significantly. However, the largest eigenvalues differ significantly in general. One can compute these values before actually deciding on the preconditioner for a particular problem. Some insight is gained by considering the discrete Laplacian with equal mesh spacing. The B_5 and B_9 matrices have common eigenvectors in this case. However, their eigenvalues differ. The maximum absolute row sum in B_5 is 8 and the corresponding value in B_9 is 16 ∕ 3. It follows that \(p({B_{5}}^{-1}A)\doteq1.5\,p({B_{9}}^{-1}A)\). The additional work of significance when the nine-point preconditioner is used is the recovery of the solution vector from the last iteration each cycle. This requires three flops per node. Each ADI inner iteration requires ten flops per node. Thus, if t inners are performed per outer, the additional work per outer with B_9 is by a factor of \((10t + 3)/10t\). The work ratio of nine-point to five-point iteration is then approximately equal to \((10t + 3)/(10t\sqrt{1.5}) = (10t + 3)/12.247t\). This is greater than one only when t = 1. When t = 4, which is often close to optimal, the work saving through use of the nine-point preconditioner is by a factor of approximately 1.14. One must weigh the complexity of programming a nine-point preconditioner against the gain of approximately 14 % in computation efficiency. The effect of unequal spacing should be investigated.

3.7 Cell-Centered Nodes

The five-point Laplacian discussed in Sect. 3.1 and the nine-point FEM discretization described in Sect. 3.3 are both associated with vector components computed at intersections of grid lines. An alternative cell-centered formulation also enjoys widespread application. The discretization technique is exposed by considering the operator \(-\frac{\mathrm{d}} {\mathrm{d}x}D(x) \frac{\mathrm{d}} {\mathrm{d}x}\) at segment i of width h_i and diffusion coefficient D_i. The right neighboring segment is of width h_{i+1} and has diffusion coefficient D_{i+1}. The equation is integrated over segment i. The coupling between i and i + 1 is the two-point approximation to \([-D \frac{\mathrm{d}} {\mathrm{d}x}]\) at the right end of segment i. We assume a continuous piecewise linear solution between the cell centers with a joint at the segment junction. Continuity of value and current \([-D \frac{\mathrm{d}} {\mathrm{d}x}]\) at this junction yields a value there in terms of the cell-centered values:

$$\displaystyle{ u_{o} = \frac{D_{i}h_{i+1}u_{i} + D_{i+1}h_{i}u_{i+1}} {D_{i}h_{i+1} + D_{i+1}h_{i}}. }$$
(57)

The current \([-D \frac{\mathrm{d}} {\mathrm{d}x}]\) at the junction is then approximated by

$$\displaystyle{ D_{i}\frac{2(u_{i} - u_{o})} {h_{i}} = \frac{2D_{i}D_{i+1}} {D_{i}h_{i+1} + D_{i+1}h_{i}}(u_{i} - u_{i+1}). }$$
(58)
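
For a one-dimensional row of cells, the couplings of Eq. 58 can be assembled in vectorized form; a brief sketch (array names are assumptions), which reduces to D∕h for uniform data:

```python
import numpy as np

def interface_coupling(D, h):
    """Coupling of Eq. 58 between cells i and i+1:
    2 D_i D_{i+1} / (D_i h_{i+1} + D_{i+1} h_i)."""
    return 2.0 * D[:-1] * D[1:] / (D[:-1] * h[1:] + D[1:] * h[:-1])

# with constant D and h every coupling equals D/h, the familiar uniform-grid value
c = interface_coupling(np.array([2.0, 2.0, 2.0]), np.array([0.5, 0.5, 0.5]))
```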

We now consider solution of Poisson’s equation with a separable approximation as a preconditioner: − ▽  ⋅D(x, y) ▽ u approximated by − ▽  ⋅D(x)D(y) ▽ u. We prove that when the nonseparable cell diffusion coefficient D_{i,j} is approximated by the separable D_iD_j, the eigenvalue bounds in Eqs. 49–50 are valid. Let \(\alpha _{i,j} \equiv \frac{D_{i,j}} {D_{i}D_{j}}\) and let α ≤ α_{i,j} ≤ 1 ∕ α. The ratio of the true coupling between nodes i, j and i + 1, j and the separable approximation is

$$\displaystyle{ R(i,j) =\alpha _{i,j}\alpha _{i+1,j}\frac{D_{j}(h_{i+1}D_{i} + h_{i}D_{i+1})} {h_{i+1}D_{i,j} + h_{i}D_{i+1,j}} }$$
$$\displaystyle{ =\alpha _{i,j}\alpha _{i+1,j} \frac{h_{i+1}D_{i} + h_{i}D_{i+1}} {\alpha _{i,j}h_{i+1}D_{i} +\alpha _{i+1,j}h_{i}D_{i+1}}. }$$
(59)

It follows that R(i, j) lies between α_{i,j} and α_{i+1,j}. All coefficient ratios satisfy similar relationships. Hence, the eigenvalues of B^{-1}A are in the interval [α, 1 ∕ α], as asserted, when cell-centered equations are used for the true and the model problem.

3.8 The Lyapunov Matrix Equation

Let the n ×n matrix A and the SPD n ×n matrix C be given. Then the Lyapunov matrix problem is to find the symmetric matrix X such that

$$\displaystyle{ AX + X{A}^{\top } = C. }$$
(60)

That this Lyapunov matrix equation (and more generally the Sylvester matrix equation \(AX + XB = C\), where A is of order n and B of order m) is a model ADI problem was discovered in 1982 in connection with determination of “infinitesimal scaling” impedance matrices [Hurwitz, 1984] and [Wachspress, 1988a]. Although ADI was developed for application to SPD systems with real spectra, the iteration equations do not rely on symmetry. The model condition that the component matrices commute is retained. However, the SPD condition may be relaxed to require only that the eigenvalues of the coefficient matrix lie in the positive-real half plane. Such matrices are said to be “N-stable.” (The eigenvalues of a “stable” matrix are in the negative-real half plane. The “N” in N-stable is for negative, and this notation implies the double negative which flips the eigenvalues into the positive-real half plane.) When A is N-stable, it is known that Eq. 60 has a unique SPD solution matrix X. A major deterrent to use of ADI iteration for solving elliptic partial differential equations is possible loss of convergence in the absence of a convenient commuting splitting. The N-stable Lyapunov matrix problem is seen to be a model ADI problem when one recognizes that it is equivalent to a linear operator \(\mathcal{A}\) mapping X into C, where \(\mathcal{A}\) is the sum of two commuting operators: premultiplication of X by A and postmultiplication by A^⊤. Thus, commutation is inherent in the Lyapunov application.

The ADI equations applied directly to Eq. 60 are

$$\begin{array}{rlrlrl} &X_{0} = \mathbf{0}, &\end{array}$$
(61.1)
$$\begin{array}{rlrlrl} &(A + p_{j}I)X_{j-\frac{1} {2} } = C - X_{j-1}({A}^{\top }- p_{ j}I), &\end{array}$$
(61.2)
$$\begin{array}{rlrlrl} &(A + p_{j}I)X_{j} = C - X_{j-\frac{1} {2} }^{\top }({A}^{\top }- p_{ j}I), & \\ &\text{ with }j = 1,2,\ldots,J. & & \end{array}$$
(61.3)

Matrix X is not in general symmetric after the first sweep of each iteration, but the result of the double sweep is symmetric. Each row of grid points in ADI solution of a Laplacian-type system corresponds to a column of the matrix X and each column of the Laplace grid corresponds to a row of matrix X. Equation 61.3 is actually the transpose of the conventional ADI second step. An iterative method introduced by Smith [ 1968 ] is closely related to ADI with all the p j the same. Each of Smith’s iterations effectively doubles J at the expense of three matrix multiplications.
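
A dense-matrix sketch of the double sweep of Eqs. 61, assuming a list of real shifts p_j is supplied; complex shifts and banded arithmetic are omitted, and numpy's dense solver stands in for the banded solves used after reduction. The data in the quick check are hypothetical.

```python
import numpy as np

def lyapunov_adi(A, C, shifts):
    """ADI iteration of Eqs. 61 for A X + X A^T = C, starting from X_0 = 0."""
    I = np.eye(A.shape[0])
    X = np.zeros_like(C)
    for p in shifts:
        # Eq. 61.2: (A + p I) X_{j-1/2} = C - X_{j-1} (A^T - p I)
        X_half = np.linalg.solve(A + p * I, C - X @ (A.T - p * I))
        # Eq. 61.3: (A + p I) X_j = C - X_{j-1/2}^T (A^T - p I)
        X = np.linalg.solve(A + p * I, C - X_half.T @ (A.T - p * I))
    return X

# quick check on a small N-stable example
rng = np.random.default_rng(0)
A = np.eye(4) + 0.1 * rng.standard_normal((4, 4))   # eigenvalues near 1, positive real parts
C = np.eye(4)
X = lyapunov_adi(A, C, shifts=[1.0] * 12)
residual = np.linalg.norm(A @ X + X @ A.T - C)
```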

Application of ADI iteration to N-stable Lyapunov matrix equations requires generalization of the ADI theory into the complex plane. This is described in depth in Chap. 4. The initial work concerned generalization of the elliptic-function theory and was reported in a series of papers by Ellner (nee Saltzman), Lu, and Wachspress (1986–1991). This analysis centered around embedding a given spectrum in a region bounded by a curve of the form

$$\displaystyle{ \Gamma =\{ z = b\,dn[u \pm ir,\,k]\vert 0 \leq u \leq 1\}. }$$
(62)

Such regions were denoted as “elliptic-function” regions. Additional theory relating to ADI iteration with complex spectra and methods for determining optimal ADI parameters for spectra not well represented by the elliptic-function regions used in the earlier work were reported by Starke [ 1989 ]. Alternative effective parameters for rectangular spectra were developed in [Wachspress,  1991 ]. Subsequent analysis by Istace and Thiran [ 1993 ] applied nonlinear optimization techniques to this problem.

Popular techniques for solving Eq. 60 include the method proposed by Smith [1968] and the B–S scheme developed by Bartels and Stewart (1972). The B–S algorithm requires about 15n^3 flops to solve for X. In many applications, neither Smith’s method nor ADI iteration is competitive with B–S when applied directly to a full matrix. Even if A has a known real spectrum so that the ADI theory is precise and convergence is rapid, each iteration requires several n^3 flops. It was found that one feasible technique which makes ADI iteration competitive is to first reduce the system to banded form. ADI iterative solution when A has bandwidth b ≪ n requires only O(bn^2) flops. An additional advantage of this method is that the spectrum can be determined with little increase in computation time. This facilitates choice of iteration parameters for specific spectra.

Any similarity transformation with a matrix G reduces the Lyapunov equation to

$$\displaystyle{ SZ + Z{S}^{\top } = D, }$$
(63)

where

$$\displaystyle{ S = GA{G}^{-1}\quad Z = GX{G}^{\top }\quad \text{and }D = GC{G}^{\top }. }$$
(64)

Once Z is found, X may be recovered from

$$\displaystyle{ X = {G}^{-1}Z{G}^{-\top }. }$$
(65)

Reduction to diagonal form yields the solution \(z_{ij} = d_{ij}/(s_{ii} + s_{jj})\), but this reduction is too costly. It is equivalent to finding all the eigenvalues and eigenvectors of matrix A. When A is symmetric, Householder reduction to tridiagonal form is efficient and robust. The spectrum is real and ADI iteration rests on theory already described. When A is not symmetric, Householder reduction may be used to transform A into upper Hessenberg form, H. ADI iteration with H is often not competitive with B–S. One may attempt to reduce H to tridiagonal form with gaussian transformations. This is a classical problem in linear algebra, known to have many pitfalls [Wilkinson, 1965]. Large multipliers often arise, and these lead to rapid loss in accuracy. Several researchers addressed this problem in seeking efficient means for finding the eigenvalues of A [Dax and Kaniel, 1981, Hare and Tang, 1989, Tang, 1988, Watkins, 1988]. Once A or H is reduced to tridiagonal form, shifted LR transformations which preserve the band structure yield the eigenvalues more efficiently than the shifted QR transformations conventionally applied to H for this purpose. Wilkinson and later researchers showed that multipliers as large as \({2}^{\frac{t} {3} }\) would not detract from eigenvalue accuracy for calculations performed with roundoff error of order 2^{−t}. However, for solution of the Lyapunov equation, more stringent bounds are needed.

In the first numerical studies of ADI applied to the Lyapunov equation, three features were introduced. First, the gaussian reduction was applied to the Hessenberg matrix by columns, starting at the last column. Second, a recovery algorithm was applied when a large multiplier was encountered. This consisted in creating a bulge at the (n − 2, n) element and chasing the bulge up to the “breakdown” column [Wachspress,  1988b ]. Although this often succeeded, there were situations where this did not remedy the problem. To ensure robustness, on failure of the recovery algorithm, the offending column was left intact and the algorithm was continued. This resulted in a tridiagonal system (from bounded gaussian transformations) with a few added vertical “spikes” above the diagonal. Although this was reasonably successful for the ADI iteration, it was not suitable for the eigenvalue computation since the LR iterations fill in to a full Hessenberg matrix when there are spikes. The ADI iteration lost efficiency due to insufficient spectral knowledge.

A significant variant introduced in [Geist, 1989] reduced rows and columns of A sequentially from row/col 1 to row/col n. Before each row/col reduction he permuted rows and columns in an attempt to reduce the magnitude of the gaussian multipliers. Such permutations were not possible when reducing from Hessenberg form. When the row and column to be reduced are close to orthogonal, large multipliers cannot be avoided. Al’s program was made robust by abandoning the reduction at the point of breakdown, applying a random Householder transformation to matrix A, and restarting the reduction. With a grant from ORNL, my graduate student at the University of Tennessee (An Lu) incorporated Geist’s program ATOTRI into our ADI Lyapunov solver [Geist, Lu and Wachspress, 1989]. Geist’s shifted LR eigenvalue solver was then used to determine the matrix spectrum for the ADI parameter optimization.

Although most problems are solved efficiently with this procedure, the lack of robustness and the computation time expended in recovery from breakdown detract from the method. Subsequently, motivated by discussions with Al and me at ORNL, Howell [ 1994 ] handled breakdown by allowing the bandwidth to expand above the diagonal. The row reduction lagged behind the column reduction with an increase in upper-half bandwidth each time another large multiplier was encountered. In the worst case, matrix A was reduced to upper Hessenberg form by stable gaussian transformations. Howell’s program BHESS [Howell and Diaa,  2005 ] is well suited for ADI solution of the Lyapunov equation.

The success of Geist’s permutation to reduce large multipliers was puzzling since after reducing the column (which was arbitrarily reduced first) the pivot for the row is small when the row and column to be reduced are nearly orthogonal. No initial permutation can change the product of the two pivots. Large multipliers from different row/col reductions can interact to yield large norms for the composite transformation matrix and its inverse. For the Lyapunov application one should monitor the accumulated condition number of the transformation matrix.

In 1994 I suggested a BHESS modification (described in the Howell et al. paper) which could possibly reduce interaction of large multipliers and thereby improve stability. This has been realized and is developed in Chap. 5 with application to Lyapunov and Sylvester equations.

When excessive multiplication factors do not occur, theoretical improvement over the B–S method by a factor of around two is possible with combination of reduction to banded form followed by ADI iteration. The iterative method facilitates approximate solution when solving nonlinear (Riccati) equations with Newton iteration. Each Newton iteration requires solution of a Lyapunov equation. Another beneficial property of the iterative method is that it appears to be more readily parallelizable than the B–S method, in which QR transformations consume significant computer time. The ADI iteration itself on the banded equation requires O(bn^2) flops. The arithmetic associated with the similarity transformation adds up to about 7n^3 flops.

The B–S algorithm applies to all nonsingular matrices A. The ADI iteration applies directly only when A is N-stable. When A is nonsingular but not N-stable, it is possible to transform the problem into an equivalent N-stable system [Watkins,  1988 ]. However, this transformation may be too expensive to justify the entire procedure. The B–S scheme seems preferable in such cases. Fortunately for the ADI alternative, many of the problems encountered are N-stable. This is evidenced by widespread use over the years of the Smith algorithm which also requires N-stability.

The minimax theory was extended for the ADI problem in a fashion analogous to the polynomial approximation problem [Opfer and Schober,  1984 ], with Rouché’s theorem replacing the Chebyshev alternating extremes property. Elliptic-function regions play the role in ADI iteration that ellipses play in polynomial approximation. The logarithms of the elliptic-function regions are close to elliptical in shape. The theory is quite definitive and yields close to optimal parameters when the spectrum can be embedded in an elliptic-function region without excessive expansion. These regions have logarithmic symmetry with respect to the real and translated imaginary axes. When such regions are not appropriate one must seek alternative parameters [Istace and Thiran,  1993 , Starke,  1989 , Wachspress,  1991 ]. Fortunately, elliptic-function regions and unions of such regions apply to many problems of concern. The ADI minimax problem is more tractable than the corresponding polynomial problem in that, when the parameters are positive or appear as conjugate pairs with positive real part, the spectral radius of the iteration matrix is bounded by unity.

3.9 The Sylvester Matrix Equation

The Sylvester matrix equation

$$\displaystyle{ AX + XB = C }$$
(66)

has a unique solution X for any C when no combination of eigenvalues λ(A) and γ(B) sums to zero. The system is then said to be nonsingular. The ADI iteration is applicable only when the sum of the real parts is positive for all such combinations. Although it is possible to construct from any nonsingular system another system with the same solution for which all real-part combinations are positive, this construction often involves prohibitive computation. ADI iteration does not seem to be viable for such problems.
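As a minimal illustration (not part of the original exposition), both spectral conditions can be checked directly with dense eigenvalue computations; in practice one works with spectral bounds, since computing the full spectra of A and B costs \(O({n}^{3} + {m}^{3})\) flops:

```python
import numpy as np

def sylvester_spectral_checks(A, B):
    """Check nonsingularity of AX + XB = C (no eigenvalue pair summing to
    zero) and applicability of ADI iteration (every pairwise sum has
    positive real part).  Illustrative sketch only."""
    lam_a = np.linalg.eigvals(A)
    lam_b = np.linalg.eigvals(B)
    sums = lam_a[:, None] + lam_b[None, :]     # all combinations lambda(A) + gamma(B)
    nonsingular = bool(np.all(np.abs(sums) > 0.0))
    adi_applicable = bool(np.all(sums.real > 0.0))
    return nonsingular, adi_applicable
```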

If A and B are symmetric, solution by the method of Golub, Nash and VanLoan [ 1979 ] is quite efficient, and ADI iteration is not competitive in this case. Reduction of one of A and B to tridiagonal form and the other to diagonal form with the symmetric QR algorithm provides a robust and elegant basis for solution of the Sylvester equation. On the other hand, when A and B are not symmetric, the Householder reduction to Hessenberg form does not yield a tridiagonal matrix. The method of Golub et al. requires further reduction of only one of these Hessenberg matrices to Schur form. Nevertheless, that additional reduction of a matrix of order n from Hessenberg to Schur form takes about \(13{n}^{3}\) flops. Thus, considerable time savings may be realized through use of gaussian reduction to banded form and ADI iterative solution of the reduced equations.

Let the similarity transformations that reduce A and B to the banded matrices S and T be G and H, respectively. Then the Sylvester equation reduces to

$$\begin{array}{rlrlrl} SZ + ZT & = F, &\end{array}$$
(67.1)
$$\begin{array}{rlrlrl} \text{ where }S & = GA{G}^{-1}, &\end{array}$$
(67.2)
$$\begin{array}{rlrlrl} T & = HB{H}^{-1}, &\end{array}$$
(67.3)
$$\begin{array}{rlrlrl} F & = GC{H}^{-1}, &\end{array}$$
(67.4)
$$\begin{array}{rlrlrl} \text{ and }Z & = GX{H}^{-1}. &\end{array}$$
(67.5)
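For illustration only, forming F of Eq. 67.4 and recovering X by inverting Eq. 67.5 may be written with linear solves in place of explicit inverses; here Gm and Hm are assumed to hold the accumulated transformations G and H from the gaussian reduction:

```python
import numpy as np

def transform_rhs(Gm, Hm, C):
    """Eq. 67.4: F = G C H^{-1}, using a transposed solve for C H^{-1}."""
    return Gm @ np.linalg.solve(Hm.T, C.T).T

def recover_solution(Gm, Hm, Z):
    """Invert Eq. 67.5: X = G^{-1} Z H."""
    return np.linalg.solve(Gm, Z) @ Hm
```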

The spectra of A and \({A}^{\top }\) in the Lyapunov equation were the same. Hence, parameters \(p_{j}\) and \(q_{j}\) in Eq. 62 were the same for the two steps of each iteration applied to the Lyapunov equation. Here, the spectra of A and B differ in most cases and the more general two-variable ADI theory is applicable. A generalization to complex spectra of the transformation of W.B. Jordan described in Chap. 2 will be presented in Chap. 4. This transformation provides a basis for choice of parameters \(p_{j}\) and \(q_{j}\).

Once A and B have been reduced to S and T of bandwidth b, one can solve the Sylvester equation by ADI iteration with O(bnm) flops per iteration, where A is of order n and B is of order m. The iteration equations for the reduced system are

$$\begin{array}{rlrlrl} Z_{0} & = \mathbf{0}, &\end{array}$$
(68.1)
$$\begin{array}{rlrlrl} (S + p_{j}I_{n})Z_{j-\frac{1} {2} } & = F - Z_{j-1}(T - p_{j}I_{m}), &\end{array}$$
(68.2)
$$\begin{array}{rlrlrl} ({T}^{\top } + q_{ j}I_{m})Z_{j}^{\top } & = {[F - (S - q_{ j}I_{n})Z_{j-\frac{1} {2} }]}^{\top }, & \\ \text{ for }j & = 1,2,\ldots,J. & & \end{array}$$
(68.3)

Let the right-hand sides in Eqs. 68.2 and 68.3 be denoted by \(G_{j-\frac{1} {2} }\) and \(G_{j}\). The ADI iteration arithmetic is reduced if one computes these terms recursively:

$$\begin{array}{rlrlrl} \text{ For the first half step, }G_{\frac{1} {2} } & = F &\end{array}$$
(69.1)
$$\begin{array}{rlrlrl} \text{ and thereafter on the half steps }G_{j-\frac{1} {2} } & = F + {[(p_{j} + q_{j-1})Z_{j-1} -G_{j-1}]}^{\top }. & \end{array}$$
(69.2)
$$\begin{array}{rlrlrl} \text{ For the whole steps: }G_{j} & = {[F + (p_{j} + q_{j})Z_{j-\frac{1} {2} } - G_{j-\frac{1} {2} }]}^{\top }. &\end{array}$$
(69.3)
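A minimal dense-arithmetic sketch of the sweep in Eqs. 68.1–68.3 follows; the recursive right-hand sides of Eqs. 69.1–69.3 are omitted for clarity, and in an efficient implementation banded solves would replace the dense solves shown here:

```python
import numpy as np

def adi_sylvester_reduced(S, T, F, p, q):
    """ADI sweep for S Z + Z T = F (Eqs. 68.1-68.3) with parameter lists
    p and q of equal length J.  Dense solves stand in for banded ones."""
    n, m = F.shape
    I_n, I_m = np.eye(n), np.eye(m)
    Z = np.zeros((n, m))                                   # Eq. 68.1
    for p_j, q_j in zip(p, q):
        # Eq. 68.2: (S + p_j I) Z_half = F - Z_{j-1} (T - p_j I)
        Z_half = np.linalg.solve(S + p_j * I_n, F - Z @ (T - p_j * I_m))
        # Eq. 68.3: (T^T + q_j I) Z_j^T = [F - (S - q_j I) Z_half]^T
        Z = np.linalg.solve(T.T + q_j * I_m, (F - (S - q_j * I_n) @ Z_half).T).T
    return Z
```

If Z already satisfies SZ + ZT = F, both half steps reproduce Z, which confirms consistency of the sweep with Eq. 67.1.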

A rough estimate of the number of flops required to solve the Sylvester equation when m = n is \(21{n}^{3}\) for the Golub et al. method and \(10{n}^{3}\) for the ADI method. The savings with iteration are essentially the flops associated with reduction of A or B from Hessenberg to Schur form. The iterative method uses \(\frac{5} {3}({n}^{3} + {m}^{3})\) flops to reduce A and B to banded form while accumulating the gaussian transformations, nm(n + m) flops to transform the right-hand side, and another nm(n + m) flops to recover X from Z. The estimate of \(10{n}^{3}\) flops includes an allowance for the ADI iterations and verification of the approximate solution.

3.10 The Generalized Sylvester Equations

The generalized Sylvester equations may be expressed in the form

$$\begin{array}{rlrlrl} AX + Y B & = C, &\end{array}$$
(70)
$$\begin{array}{rlrlrl} EX - Y F & = G. &\end{array}$$
(71)

Matrices A and E are n × n, B and F are m × m, and X, Y, C, and G are n × m. These equations arise in solution of eigenvalue problems [Golub, Nash and VanLoan,  1979 ] and in control theory [Byers,  1983 ]. In these applications it is often true that

$$\displaystyle{ Re\lambda ({E}^{-1}A) + Re\lambda (B{F}^{-1}) > 0. }$$
(72)

This is a stability condition which ensures existence of a unique solution to the generalized Sylvester equations. The ADI iteration equations for numerical solution of Eqs. 70 and 71 are

$$\begin{array}{rlrlrl} Y _{0} & = \mathbf{0}, &\end{array}$$
(73.1)
$$\begin{array}{rlrlrl} (A + p_{j}E)X_{j} & = C + p_{j}G - Y _{j-1}(B - p_{j}F), &\end{array}$$
(73.2)
$$\begin{array}{rlrlrl} ({B}^{\top } + q_{ j}{F}^{\top })Y _{ j}^{\top } & = {[(C - q_{ j}G) + (q_{j}E - A)X_{j}]}^{\top }, & \\ \text{ for }j & = 1,2,\ldots,J. & & \end{array}$$
(73.3)
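A sketch of Eqs. 73.1–73.3 with dense solves is given below (illustrative only; the reduced banded form of Eqs. 74.1–74.3 that follows is what one would use in practice):

```python
import numpy as np

def adi_generalized_sylvester(A, B, C, E, F, G, p, q):
    """ADI sweep for AX + YB = C, EX - YF = G (Eqs. 73.1-73.3).
    p and q are the ADI parameter lists; dense solves for clarity."""
    X = np.zeros_like(C)
    Y = np.zeros_like(C)                                   # Eq. 73.1
    for p_j, q_j in zip(p, q):
        # Eq. 73.2: (A + p_j E) X_j = C + p_j G - Y_{j-1} (B - p_j F)
        X = np.linalg.solve(A + p_j * E, C + p_j * G - Y @ (B - p_j * F))
        # Eq. 73.3: (B^T + q_j F^T) Y_j^T = [(C - q_j G) + (q_j E - A) X_j]^T
        Y = np.linalg.solve(B.T + q_j * F.T, ((C - q_j * G) + (q_j * E - A) @ X).T).T
    return X, Y
```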

These equations may be reduced to banded form. Let \(S = HA{E}^{-1}{H}^{-1}\) and \(T = K{F}^{-1}B{K}^{-1}\) be of bandwidth b. These matrices are computed in approximately \(\frac{7} {2}({n}^{3} + {m}^{3})\) flops. One also must compute \({C}^{{\prime}} = HC{K}^{-1}\) and \({G}^{{\prime}} = HG{K}^{-1}\). This takes 2nm(n + m) flops. The reduced equations are

$$\begin{array}{rlrlrl} Z_{0} & = \mathbf{0}, &\end{array}$$
(74.1)
$$\begin{array}{rlrlrl} (S + p_{j}I_{n})V _{j} & = {C}^{{\prime}} + p_{ j}{G}^{{\prime}}- Z_{ j-1}^{\top }(T - p_{ j}I_{m}), &\end{array}$$
(74.2)
$$\begin{array}{rlrlrl} ({T}^{\top } + q_{ j}I_{m})Z_{j} & = {[{C}^{{\prime}}- q_{ j}{G}^{{\prime}}- (S - q_{ j}I_{n})V _{j}]}^{\top }, & \\ \text{ for }j & = 1,2,\ldots,J. & & \end{array}$$
(74.3)

Note that each iteration updates both V and Z. A simple recursive relationship may be used to reduce the arithmetic in computing successive right-hand sides, denoted by L:

$$\begin{array}{rlrlrl} L_{\frac{1} {2} } & = {C}^{{\prime}} + p_{ 1}{G}^{{\prime}}, &\end{array}$$
(75.1)
$$\begin{array}{rlrlrl} L_{j-\frac{1} {2} } & = ({C}^{{\prime}} + p_{ j}{G}^{{\prime}}) + {[(p_{ j} + q_{j-1})Z_{j-1} - L_{j-1}]}^{\top }, &\end{array}$$
(75.2)
$$\begin{array}{rlrlrl} L_{j} & = {[{C}^{{\prime}}- q_{ j}{G}^{{\prime}}- L_{ j-\frac{1} {2} } + (p_{j} + q_{j})V _{j}]}^{\top }. &\end{array}$$
(75.3)
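The reduced sweep, including the recursive right-hand-side updates of Eqs. 75.1–75.3, might be realized as follows (a sketch; Cp and Gp stand for C′ and G′, and dense solves again stand in for banded ones):

```python
import numpy as np

def adi_reduced_generalized(S, T, Cp, Gp, p, q):
    """ADI sweep for the reduced generalized Sylvester system
    (Eqs. 74.1-74.3) with the recursive updates of Eqs. 75.1-75.3.
    p and q are indexable parameter sequences of equal length J."""
    n, m = Cp.shape
    I_n, I_m = np.eye(n), np.eye(m)
    Z = np.zeros((m, n))                                   # Eq. 74.1; Z is m x n
    V = np.zeros((n, m))
    L = np.zeros((m, n))
    for j, (p_j, q_j) in enumerate(zip(p, q), start=1):
        if j == 1:
            L_half = Cp + p_j * Gp                         # Eq. 75.1
        else:
            L_half = Cp + p_j * Gp + ((p_j + q[j - 2]) * Z - L).T   # Eq. 75.2
        V = np.linalg.solve(S + p_j * I_n, L_half)         # Eq. 74.2
        L = (Cp - q_j * Gp - L_half + (p_j + q_j) * V).T   # Eq. 75.3
        Z = np.linalg.solve(T.T + q_j * I_m, L)            # Eq. 74.3
    return V, Z
```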

This is an O(bnm) algorithm with time small compared to the \(O({n}^{3} + {m}^{3})\) operations performed before and after solution of the equations. Matrices X and Y are recovered from V and Z with

$$\begin{array}{rlrlrl} X & = {E}^{-1}{H}^{-1}V _{ J}K &\end{array}$$
(76.1)
$$\begin{array}{rlrlrl} \text{ and }Y & = {H}^{-1}{Z}^{\top }K{F}^{-1}. &\end{array}$$
(76.2)

This requires another 3nm(n + m) flops. When n = m, the total arithmetic is thus around \(17{n}^{3}\) flops.
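The recovery of Eqs. 76.1–76.2 can likewise be written with linear solves rather than explicit inverses (a sketch, with E, F, H, K, V, and Z as defined above):

```python
import numpy as np

def recover_generalized(E, F, H, K, V, Z):
    """Eqs. 76.1-76.2: X = E^{-1} H^{-1} V K and Y = H^{-1} Z^T K F^{-1}."""
    X = np.linalg.solve(E, np.linalg.solve(H, V)) @ K
    M = np.linalg.solve(H, Z.T) @ K                        # H^{-1} Z^T K
    Y = np.linalg.solve(F.T, M.T).T                        # right-multiply by F^{-1}
    return X, Y
```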

3.11 The Three-Variable Laplacian-Type Problem

In Sect. 2.3 of Chap. 2 we discussed the three-variable ADI model problem and described an iteration designed primarily as a preconditioner. We will now examine this preconditioner in more detail. If we were able to use Eqs. 29.1 and 29.2 of Chap. 2 we would obtain the usual ADI preconditioning matrix, say

$$\displaystyle{ {[B(t)]}^{-1} = [I - M(t)]{B}^{-1}, }$$
(77)

where \(B = H + V + P\) is the model-problem matrix and M(t) is the standard ADI iteration matrix for t double sweeps. The analysis already presented in this chapter would then apply. However, when Eqs. 2–32 are used, the preconditioner becomes

$$\displaystyle{ {[B(t)]}^{-1} = \left [I -\prod _{ j=1}^{t}L_{ j}\right ]{B}^{-1}, }$$
(78)

where \(L_{j}\) includes the inner ADI iteration matrix for double-sweep j of the ADI iteration. This preconditioner must be SPD for the conjugate gradient procedure to succeed. Since the \(L_{j}\) and B commute with one another, the preconditioner is symmetric. The norm of the ADI iteration matrix is now the spectral radius of

$$\displaystyle{ L(t) \equiv \prod _{j=1}^{t}L_{ j}. }$$
(79)

The spectral radius, say ε, of L(t) must be less than unity for the preconditioner to be positive definite. Sufficient inner iterations must be performed to guarantee an SPD preconditioner. In Sect. 3.6 we discussed the interaction of inner ADI and outer CG iteration. We found that a value of ε of order of magnitude 0.1 was reasonable and that t = 4 was often near optimal.

For three-variable iteration, the optimum value for t tends to be smaller than for corresponding two-variable problems. The smaller value of \(k^{\prime}_{j}\) for the inner ADI iterations when \(w_{j}\) is small tends to reduce the relative efficiency as t is increased. Precise optimization of the inner ADI, the outer ADI, and the CG iteration is possible but requires evaluation of various options. This may be clarified by an example.

The Dirichlet problem with Laplace’s equation over a uniform grid with 100 nodes on each side yields k′ = 0.000281 for the outer ADI iteration. The optimal parameters for t = 4 are [0.000507, 0.00508, 0.05536, 0.55439] in the transformed space. The corresponding error reduction with Eqs. 29.1 and 29.2 of Chap. 2 is 0.0644. The inner ADI iterations required for an error reduction of \(\varepsilon _{j} < 2k^{\prime} = 0.000562\) are given in Table 3.2.

Table 3.2 Inner iterations for a 3D problem

The spectral radius of the ADI iteration is bounded by \(\varepsilon _{4} = \sqrt{0.0644} = 0.2537\). Iterative approximation of the model-problem inverse increases the number of CG iterations by a factor of 1.3. The number of mesh sweeps per CG iteration is 50, and each CG acceleration requires the work of around three mesh sweeps. We therefore estimate the work factor as (1.3)(53) = 68.9.

A similar computation for t = 2 yields \(\varepsilon _{2} = 0.69\). This is achieved with 11 inner ADI iterations for a total of 24 mesh sweeps per CG iteration. The CG loss factor is now 2.33 and the estimated work factor is (2.33)(27) = 62.9. This is slightly better than t = 4.

For a two-variable computation with the same value for k′, four ADI iterations per CG step would yield a work factor of 11 and two ADI iterations per CG step a factor of (1.678)(7) = 11.75. Although the optimum number of ADI iterations per step is now greater, these computations display an insensitivity of efficiency to the number of ADI iterations per CG step, with relatively few ADI iterations being optimal.

In three-variable iteration, insufficient inner ADI iterations lead to growth in high-mode H + V error components. These are the oscillatory modes, and their growth is similar to that associated with roundoff instability. One must not confuse this behavior with roundoff error.