
1 Iterative Refinement and Structured Backward Error

Let us begin with the simplest possible iterative method for solving a linear system. We first consider a 3 × 3 example that hardly needs iteration, but we will shortly extend to larger matrix sizes. So suppose we wish to solve

$$\displaystyle\begin{array}{rcl} \left [\begin{array}{c@{\enskip }c@{\enskip }c} 4 &1 &0\\ 1 &4 &1 \\ 0 &1 &4 \end{array} \right ]\left [\begin{array}{c} x_{1} \\ x_{2} \\ x_{3}\end{array} \right ] = \left [\begin{array}{c} 1\\ - 1 \\ 1 \end{array} \right ].& & {}\\ \end{array}$$

The exact solution, which is easy to find by any method, is \(\mathbf{x} = [5,-6,5]/14\). Let us imagine that we don’t know that, but that due to a prior computation, we do know that the matrix

$$\displaystyle\begin{array}{rcl} \mathbf{B} = \left [\begin{array}{c@{\enskip }c@{\enskip }c} 2 + \sqrt{3} &1\\ 1 &4 &1 \\ &1 &4 \end{array} \right ]& & {}\\ \end{array}$$

has the Cholesky factoring \({\mathbf{LDL}}^{T}\) with

$$\displaystyle\begin{array}{rcl} \mathbf{L} = \left [\begin{array}{c@{\enskip }c@{\enskip }c} 1\\ \alpha &1 \\ & \alpha &1 \end{array} \right ],& & {}\\ \end{array}$$

\(\alpha = 1/(2+\sqrt{3})\), and \(\mathbf{D} =\mathrm{ diag}(2+\sqrt{3}, 2+\sqrt{3}, 2+\sqrt{3})\). As a result, \({\mathbf{B}}^{-1} ={ \mathbf{L}}^{-T}{\mathbf{D}}^{-1}{\mathbf{L}}^{-1}\) is easy to compute, or, more properly,

$$\displaystyle\begin{array}{rcl} \mathbf{B}\mathbf{x} = \mathbf{b}\quad \Leftrightarrow \quad {\mathbf{LDL}}^{T}\mathbf{x} = \mathbf{b}& & {}\\ \end{array}$$

is easy to solve. Here, if we let \(\mathbf{P} ={ \mathbf{B}}^{-1}\) (at least in thinking about it, not in actually doing it), we have

$$\displaystyle\begin{array}{rcl} \mathbf{PA}\mathbf{x} = \mathbf{P}\mathbf{b} = \left [\begin{array}{c} 0.3847\\ - 0.4359 \\ 0.35898 \end{array} \right ].& & {}\\ \end{array}$$

Notice that \(\mathbf{PA}\) is nearly the identity, that is, \(\mathbf{PA} = \mathbf{I} -\mathbf{S}\), where \(\mathbf{S}\) is a matrix with small entries:

$$\displaystyle\begin{array}{rcl} \mathbf{S} = \left [\begin{array}{c@{\enskip }c@{\enskip }c} - 0.0773216421430700 &0 &0\\ 0.0206191045714862 &0 &0 \\ - 0.00515477614287156 &0 &0 \end{array} \right ].& & {}\\ \end{array}$$

Our equation has thus become

$$\displaystyle\begin{array}{rcl} (\mathbf{I} -\mathbf{S})\mathbf{x} = \mathbf{P}\mathbf{b} = \left [\begin{array}{c} 0.3847\\ - 0.4359 \\ 0.35898 \end{array} \right ],& & {}\\ \end{array}$$

and we are left with the problem, seemingly as difficult, of solving a linear system with matrix \(\mathbf{I} -\mathbf{S}\). However, we have made some progress, since we can use the smallness of \(\mathbf{S}\) to solve the system by means of an iterative scheme. First, observe that \((\mathbf{I} -\mathbf{S})\mathbf{x} = \mathbf{P}\mathbf{b} = \mathbf{x}_{0}\) implies \(\mathbf{x} = \mathbf{x}_{0} + \mathbf{S}\mathbf{x}\). Hence, we can then write the following natural iteration:

$$\displaystyle\begin{array}{rcl} \mathbf{x}_{k+1} = \mathbf{x}_{0} + \mathbf{S}\mathbf{x}_{k}.& & {}\\ \end{array}$$

This is the Richardson iteration, which is about as simple an iterative method as it gets. Then we obtain

$$\displaystyle\begin{array}{rcl} \mathbf{x}_{1}& =& \mathbf{x}_{0} + \mathbf{S}\mathbf{x}_{0}. {}\\ \end{array}$$

Similarly, we find that

$$\displaystyle\begin{array}{rcl} \mathbf{x}_{2}& =& \mathbf{x}_{0} + \mathbf{S}\mathbf{x}_{0} +{ \mathbf{S}}^{2}\mathbf{x}_{ 0} {}\\ \mathbf{x}_{3}& =& \mathbf{x}_{0} + \mathbf{S}\mathbf{x}_{0} +{ \mathbf{S}}^{2}\mathbf{x}_{ 0} +{ \mathbf{S}}^{3}\mathbf{x}_{ 0}. {}\\ \end{array}$$

In general, the kth iteration results in

$$\displaystyle\begin{array}{rcl} \mathbf{x}_{k} =\sum _{ j=0}^{k}{\mathbf{S}}^{j}\mathbf{x}_{ 0}.& & {}\\ \end{array}$$

This series converges if \(\|{\mathbf{S}}^{k}\|\) goes to zero, which it does, exactly as for the geometric series, if there is a \(\rho < 1\) for which \(\|{\mathbf{S}}^{k}\| \leq {\rho }^{k}\). In this case, \(\|\mathbf{S}\|\doteq0.1\). An obvious induction gives \(\|{\mathbf{S}}^{k}\| \leq {\|\mathbf{S}\|}^{k}\doteq{(0.1)}^{k}\), and so this iteration converges; indeed, already \(\mathbf{x}_{4}\) is correct to four digits. Note that \(\max \vert \lambda \vert \leq \|\mathbf{S}\|\) in general, and it is very possible that \(\max \vert \lambda \vert < 1\) while \(\|\mathbf{S}\| > 1\); in that case, the powers \({\mathbf{S}}^{k}\) still eventually decay even though \(\|\mathbf{S}\| > 1\). We will see examples shortly.
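As a quick numerical check (an illustrative sketch, not part of the original text), the whole computation for this 3 × 3 example fits in a few lines of Matlab:

 A = [4 1 0; 1 4 1; 0 1 4];  b = [1; -1; 1];
 B = A;  B(1,1) = 2 + sqrt(3);       % the matrix whose LDL^T factoring we know
 S = eye(3) - B\A;                   % S = I - PA, with P = B^{-1} used only implicitly
 x0 = B\b;                           % x_0 = P*b
 x = x0;
 for k = 1:6
     x = x0 + S*x;                   % Richardson iteration: x_{k+1} = x_0 + S*x_k
 end
 norm( x - [5; -6; 5]/14, inf )      % compare with the exact solution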

Before we look at larger matrices, let’s look at this iteration in a different way. Using a matrix \(\mathbf{P}\), which is close to the inverse of \(\mathbf{A}\), we make the initial guess \(\mathbf{x}_{0} = \mathbf{P}\mathbf{b}\) (since \(\mathbf{A}\mathbf{x} = \mathbf{b}\) then implies \(\mathbf{x} \approx \mathbf{P}\mathbf{b}\)). The residual resulting from this choice is

$$\displaystyle\begin{array}{rcl} \mathbf{r}_{0} = \mathbf{b} -\mathbf{A}\mathbf{x}_{0} = \mathbf{b} -\mathbf{AP}\mathbf{b}.& & {}\\ \end{array}$$

Since \(\mathbf{0} = \mathbf{b} -\mathbf{A}\mathbf{x}\), we find that

$$\displaystyle\begin{array}{rcl} \mathbf{r}_{0}& =& \mathbf{b} -\mathbf{A}\mathbf{x}_{0} - (\mathbf{b} -\mathbf{A}\mathbf{x}) = \mathbf{A}\mathbf{x} -\mathbf{A}\mathbf{x}_{0} = \mathbf{A}(\mathbf{x} -\mathbf{x}_{0}) = \mathbf{A}\varDelta \mathbf{x}. {}\\ \end{array}$$

Thus, we see that \(\varDelta \mathbf{x} = \mathbf{x} -\mathbf{x}_{0}\) solves

$$\displaystyle\begin{array}{rcl} \mathbf{A}\varDelta \mathbf{x} = \mathbf{r}_{0}.& & {}\\ \end{array}$$

Now, with this equation, we can use \(\mathbf{P}\) as above and let \(\mathbf{x}_{1} -\mathbf{x}_{0} = \mathbf{P}\mathbf{r}_{0}\). Then

$$\displaystyle\begin{array}{rcl} \mathbf{x}_{1} = \mathbf{x}_{0} + \mathbf{P}\mathbf{r}_{0}.& & {}\\ \end{array}$$

The process can clearly be repeated:

$$\displaystyle\begin{array}{rcl} \mathbf{x}_{2}& =& \mathbf{x}_{1} + \mathbf{P}\mathbf{r}_{1} {}\\ \mathbf{x}_{3}& =& \mathbf{x}_{2} + \mathbf{P}\mathbf{r}_{2}, {}\\ \end{array}$$

where \(\mathbf{r}_{2} = \mathbf{b} -\mathbf{A}\mathbf{x}_{2}\) and \(\mathbf{r}_{1} = \mathbf{b} -\mathbf{A}\mathbf{x}_{1}\) are the corresponding residuals. This process is called iterative refinement. Note that

$$\displaystyle\begin{array}{rcl} \mathbf{x}_{1}& =& \mathbf{x}_{0} + \mathbf{P}(\mathbf{b} -\mathbf{A}\mathbf{x}_{0}) = \mathbf{x}_{0} + \mathbf{P}\mathbf{b} -\mathbf{PA}\mathbf{x}_{0} = \mathbf{x}_{0} + \mathbf{x}_{0} -\mathbf{PA}\mathbf{x}_{0} {}\\ & =& \mathbf{x}_{0} + (\mathbf{I} -\mathbf{PA})\mathbf{x}_{0} = \mathbf{x}_{0} + \mathbf{S}\mathbf{x}_{0}, {}\\ \end{array}$$

since \(\mathbf{PA} = \mathbf{I} -\mathbf{S}\) in our earlier notation. Similarly, one obtains

$$\displaystyle\begin{array}{rcl} \mathbf{x}_{2}& =& \mathbf{x}_{1} + \mathbf{P}(\mathbf{b} -\mathbf{A}\mathbf{x}_{1}) = \mathbf{x}_{0} + \mathbf{S}\mathbf{x}_{0} + \mathbf{P}\mathbf{b} -\mathbf{PA}(\mathbf{x}_{0} + \mathbf{S}\mathbf{x}_{0}) {}\\ & =& \mathbf{x}_{0} + \mathbf{S}\mathbf{x}_{0} + \mathbf{x}_{0} - (\mathbf{I} -\mathbf{S})(\mathbf{x}_{0} + \mathbf{S}\mathbf{x}_{0}) {}\\ & =& \mathbf{x}_{0} + \mathbf{S}\mathbf{x}_{0} +{ \mathbf{S}}^{2}\mathbf{x}_{ 0}, {}\\ \end{array}$$

which is mathematically equivalent to what we had before and converges under the same conditions.

The matrix \(\mathbf{P}\), our approximate inverse, is called a preconditioner (and its inverse is usually denoted \(\mathbf{M}\)). Probably the most important part of any iterative method is choosing the right preconditioner. For solving \(\mathbf{A}\mathbf{x} = \mathbf{b}\), we need for \(\mathbf{P}\) to allow fast evaluation of products \(\mathbf{P}\mathbf{v}\) and simultaneously be close to \({\mathbf{A}}^{-1}\). Unfortunately, these goals are often in opposition. It is useful in practice to use even quite crude approximations to \({\mathbf{A}}^{-1}\) as preconditioners, though.

Let us illustrate the usefulness of this method. Suppose we want to solve \(\mathbf{A}\mathbf{x} = \mathbf{b}\) and, moreover, suppose \(\mathbf{A} = \mathbf{F}_{n}(\mathbf{I} + \mathbf{S})\), where

$$\displaystyle\begin{array}{rcl} \mathbf{F}_{n} = \left [\begin{array}{c@{\enskip }c@{\enskip }c@{\enskip }c@{\enskip }c} 2 + \sqrt{3} &1\\ 1 &4 &1 \\ &1 &4 &1\\ & & \ddots & \ddots &\ddots \end{array} \right ]& & {}\\ \end{array}$$

is n × n and \(\mathbf{S}\) is small off the diagonal [we will allow \(s_{11} = (4 - (2 + \sqrt{3}))/(2 + \sqrt{3})\) to be sort of big]. Then, let \(\mathbf{P} = \mathbf{F}_{n}^{-1}\), although because \(\mathbf{F}_{n}^{-1}\) is full, we never compute it. Instead, we note that by symmetric factoring, we have \(\mathbf{F}_{n} = \mathbf{L}_{n}\mathbf{D}\mathbf{L}_{n}^{T}\), where

$$\displaystyle\begin{array}{rcl} \mathbf{L}_{n} = \left [\begin{array}{cccc} 1\\ \alpha &1 \\ & \alpha &1\\ & & \ddots&\ddots \end{array} \right ]& & {}\\ \end{array}$$

and \(\mathbf{D} =\mathrm{ diag}(2 + \sqrt{3},2 + \sqrt{3},\ldots,2 + \sqrt{3})\). Note that we won’t compute \(\mathbf{S}\), either. Instead, we solve the sequence of equations

$$\displaystyle\begin{array}{rcl} \mathbf{L}_{n}\mathbf{z}_{0}& =& \mathbf{b} {}\\ \mathbf{D}_{n}\mathbf{y}_{0}& =& \mathbf{z}_{0} {}\\ \mathbf{L}_{n}^{T}\mathbf{x}_{ 0}& =& \mathbf{y}_{0} {}\\ \end{array}$$

in O(n) flops to get \(\mathbf{x}_{0}\), by means of which we will use iterative refinement to get an accurate value of \(\mathbf{x}\) as shown below:

       for \(k = 1,2,\ldots\) do

           Compute \(\mathbf{r}_{k-1} = \mathbf{b} -\mathbf{A}\mathbf{x}_{k-1}\)

           % Now, we compute \(\mathbf{x}_{k} -\mathbf{x}_{k-1} = \mathbf{P}\mathbf{r}_{k-1}\)

           Solve \(\mathbf{L}\mathbf{z}_{k} = \mathbf{r}_{k-1}\)

           Solve \(\mathbf{D}\mathbf{y}_{k} = \mathbf{z}_{k}\)

           Solve \({\mathbf{L}}^{T}\varDelta \mathbf{x}_{k} = \mathbf{y}_{k}\)

           Let \(\mathbf{x}_{k} = \mathbf{x}_{k-1} +\varDelta \mathbf{x}_{k}\)

       end for

This is an iterative refinement formulation of the iteration. Because \(\|\mathbf{S}\|\doteq0.1\), a dozen or so iterations of this process get \(\mathbf{x}\) accurate to most significant digits; and each iteration costs O(n) flops. Thus, in O(n) flops, we have solved our system. This is significantly better than the \(O({n}^{3})\) cost for full matrices!
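As a concrete illustration (a minimal sketch, not the book's code), the loop above can be written in Matlab as follows, taking \(\mathbf{A}\) to be the n × n tridiagonal matrix with 4 on the diagonal and 1 on the off-diagonals, and using \(\mathbf{F}_{n}\) as the preconditioner through its factors:

 n = 1000;  e = ones(n,1);
 A = spdiags([e 4*e e], -1:1, n, n);    % the matrix we actually wish to solve with
 b = randn(n,1);
 alpha = 1/(2+sqrt(3));
 L = spdiags([alpha*e e], -1:0, n, n);  % unit lower bidiagonal factor of F_n
 d = 2 + sqrt(3);                       % D = d*I
 x = (L')\((L\b)/d);                    % x_0 = F_n^{-1} b, in O(n) flops
 for k = 1:20
     r = b - A*x;                       % residual, O(n) flops for tridiagonal A
     if norm(r,inf) <= 1e-15*norm(b,inf), break, end
     dx = (L')\((L\r)/d);               % solve L*D*L'*dx = r
     x = x + dx;
 end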

Note that \(\mathbf{A}\) need not really be tridiagonal: It can have a few more entries here and there off the main diagonals, contributing to \(\mathbf{S}\), if they’re not too large. Even if there are lots of them, the cost of computing the residual is at most \(O({n}^{2})\) per iteration, and if \(\mathbf{S}\) is small, we will need only O(1) iterations.

It’s hard to overemphasize the importance of this seemingly trivial change from a direct, finite-number-of-steps algorithm to a convergent iteration: Most large systems are, in practice, solved with such iterative methods. As Greenbaum notes,

With a sufficiently good preconditioner, each of these iterative methods can be expected to find a good approximate solution quickly. In fact, with a sufficiently good preconditioner \(\mathbf{M}\), an even simpler iteration method such as \(\mathbf{x}_{k} = \mathbf{x}_{k-1} +{ \mathbf{M}}^{-1}(\mathbf{b} - A\mathbf{x}_{k-1})\) may converge in just a few iterations, and this avoids the cost of inner products and other things in the more sophisticated Krylov space methods (in Hogben 2006 p. 41–10)

(which highlights the importance of choosing \(\mathbf{P}\) well). The iterative methods included in Matlab are (for \(\mathbf{A}\mathbf{x} = \mathbf{b}\))

  • bicg—biconjugate gradient

  • bicgstab—biconjugate gradient stabilized

  • cgs—conjugate gradient squared

  • gmres—generalized minimum residual

  • lsqr—least squares

  • minres—minimum residual

  • pcg—preconditioned conjugate gradient

  • qmr—quasiminimal residual

  • symmlq—symmetric LQ

but there is no explicit program for iterative refinement, because it is so simple. See, for example, Olshevsky (2003b) for pointers to the literature, or perhaps Hogben (2006).

It was Skeel who first noticed that a single pass of iterative refinement could be used to improve the structured backward error. He observed that computing the residual in the same precision (not twice the precision, which might not be easily available) gives the exact residual of a nearby system \((\mathbf{A} + \varDelta \mathbf{A})\mathbf{x} = \mathbf{b} -\mathbf{r}\) for some \(\vert \varDelta \mathbf{A}\vert \leq O(\mu _{M})\vert \mathbf{A}\vert \). That is, the computed residual is the exact residual for only \(O(\mu _{M})\) relative backward errors in \(\mathbf{A}\), preserving structure. Notice that the computed solution \(\mathbf{x}\) usually comes only with a normwise backward error guarantee: It is the correct solution to \((\mathbf{A} + \varDelta \mathbf{A})\mathbf{x} = \mathbf{b} +\varDelta \mathbf{b}\) with \(\|\varDelta \mathbf{A}\| = O(\mu _{M}\|\mathbf{A}\|)\) and \(\|\varDelta \mathbf{b}\| = O(\mu _{M}\|\mathbf{b}\|)\), which does not preserve structure. A single pass of iterative refinement can, if the condition number of \(\mathbf{A}\) is not too large, improve this situation considerably. Let \(\mathbf{x}_{1} = \mathbf{x} +\varDelta \mathbf{x}\), where

$$\displaystyle\begin{array}{rcl} \mathbf{A}(\varDelta \mathbf{x}) = \mathbf{r}.& & {}\\ \end{array}$$

Then solving this system gives us, more nearly, a solution of the same sort of problem.

The following argument, though not “tight,” gives some idea of why this is so. Suppose we have approximately solved \(\mathbf{A}\mathbf{x} = \mathbf{b}\) and found a computed solution, which we will call \(\mathbf{x}_{0}\). Then, on computing the residual \(\mathbf{r}_{0} = \mathbf{b} -\mathbf{A}\mathbf{x}_{0}\) in the working precision, we know that we have found the exact solution of

$$\displaystyle\begin{array}{rcl} \left (\mathbf{A} + \varDelta \mathbf{A}_{0}\right )\mathbf{x}_{0} = \mathbf{b} -\mathbf{r}_{0},& & {}\\ \end{array}$$

where \(\vert \varDelta \mathbf{A}_{0}\vert \leq c\mu _{M}\vert \mathbf{A}\vert \) and c is a small constant that depends linearly on the dimension n. Notice that the \(\varDelta \mathbf{A}_{0}\) is componentwise small. The working-precision residual \(\mathbf{r}_{0}\) is included (it might not be very small), and what this statement says is merely that we have an accurate residual for a closely perturbed system. How small is \(\mathbf{r}_{0}\)? It is easy to see that, normwise,

$$\displaystyle\begin{array}{rcl} \|\mathbf{r}_{0}\|& \doteq& \rho \|\mathbf{A}\|\,\|{\mathbf{A}}^{-1}\|\|\mathbf{b}\|\mu _{ M} \\ & \doteq& \rho \kappa (\mathbf{A})\|\mathbf{b}\|\mu _{M},{}\end{array}$$
(7.1)

at most (being sloppy with constants, though). ρ is called a growth factor. Now we suppose that in solving \(\mathbf{A}\varDelta \mathbf{x} = \mathbf{r}_{0}\) in the same approximate fashion (call the solution \(\varDelta \mathbf{x}_{0}\)), we get the same approximate growth, so that the residual in this equation can be written

$$\displaystyle\begin{array}{rcl} \left (\mathbf{A} + \varDelta \mathbf{A}_{1}\right )\varDelta \mathbf{x}_{0} = \mathbf{r}_{0} -\mathbf{s}_{0},& & {}\\ \end{array}$$

where again the perturbation \(\varDelta \mathbf{A}_{1}\) is small componentwise compared to \(\mathbf{A}\), and \(\mathbf{s}_{0}\) is the residual that we could compute using working precision in the update equation:

$$\displaystyle\begin{array}{rcl} \mathbf{s}_{0} = \mathbf{r}_{0} -\mathbf{A}\varDelta \mathbf{x}_{0}.& & {}\\ \end{array}$$

Our “similar growth” assumption says that \(\|\mathbf{s}_{0}\|\doteq\rho \kappa (\mathbf{A})\|\mathbf{r}_{0}\|\mu _{M}\). This will be, roughly speaking, \({\rho }^{2}\kappa {(\mathbf{A})}^{2}\|\mathbf{b}\|{\mu _{M}}^{2}\) and might, if we are lucky, be quite a bit smaller. Adding together the two equations, we find that

$$\displaystyle\begin{array}{rcl} \left (\mathbf{A} + \varDelta \mathbf{A}_{0}\right )\left (\mathbf{x}_{0} +\varDelta \mathbf{x}_{0}\right )& =& \mathbf{b} + \left (\varDelta \mathbf{A}_{0} -\varDelta \mathbf{A}_{1}\right )\varDelta \mathbf{x}_{0} -\mathbf{s}_{0} {}\\ & =& \mathbf{b} + O({\mu _{M}}^{2}), {}\\ \end{array}$$

where we have suppressed the \({\rho }^{2}{\kappa }^{2}(\mathbf{A})\) and the dependence on \(\kappa (\mathbf{A})\) from the other small term in the order symbol. This loose argument leads us to expect that a single pass ought to give us nearly the exact solution to a perturbed problem where the perturbation is componentwise small.

Of course, it takes more effort to establish in detail that it actually does so under many circumstances, and to describe exactly what those circumstances are. We can easily see in the above argument though that if the condition number of \(\mathbf{A}\) or the growth factor ρ or both are “too large,” there will be trouble. Full details of a much tighter argument are in Skeel (1980).

Example 7.1.

This idea helps in coping with examples where the residual is unacceptably large. This can happen even with well-scaled matrices (in theory, though as we have discussed it is almost unheard of in practice). Consider the family of matrices shaped like the following (we show the n = 6 case):

$$\displaystyle\begin{array}{rcl} \mathbf{A} = \left [\begin{array}{c@{\enskip }c@{\enskip }c@{\enskip }c@{\enskip }c@{\enskip }c} 1 & 0 & 0 & 0 & 0 &1\\ - 1 & 1 & 0 & 0 & 0 &0 \\ - 1 & - 1 & 1 & 0 & 0 &0\\ - 1 & - 1 & - 1 & 1 & 0 &0 \\ - 1 & - 1 & - 1 & - 1 & 1 &0\\ - 1 & - 1 & - 1 & - 1 & - 1 &1 \end{array} \right ].& &{}\end{array}$$
(7.2)

This well-known example has a growth factor for Gaussian elimination with partial pivoting (although pivoting doesn’t actually happen because it is arranged that the pivots are already in the right place) that is about as bad as possible: The largest element in \(\mathbf{U}\), where \(\mathbf{A} = \mathbf{L}\mathbf{U}\), is \(2^{n-2} + 1\). The condition number of the matrix is quite reasonable, however; it is only 33 or so when n = 32. But the solution with GEPP is not acceptable, without iterative refinement, as we will see. As proved in Skeel (1980), a single pass of iterative refinement is enough to stabilize the algorithm in the strong sense discussed above.

Suppose we take \(\mathbf{b}\) to be the vector \(\mathbf{v}_{n}\) corresponding to the smallest singular value of \(\mathbf{A}\). The choice of \(\mathbf{b}\) doesn’t really matter very much, though this choice is especially cruel. When we compute (for n = 32) the solution of \(\mathbf{A}\mathbf{x} = \mathbf{b}\), we should get \(\mathbf{u}_{n}\), the final vector of the \(\mathbf{U}\) matrix from the SVD. Call our computed solution \(\mathbf{x}_{0}\). We compute the residual \(\mathbf{r}_{0} = \mathbf{b} -\mathbf{A}\mathbf{x}_{0}\), using the same 15-digit precision used to compute \(\mathbf{x}_{0}\). The norm of \(\mathbf{r}_{0}\) is about \(1{0}^{-9}\), and thus the nearest matrix \(\mathbf{A} + \varDelta \mathbf{A}\) for which \(\mathbf{x}_{0}\) really solves the problem is about the same distance away, componentwise. If we now solve \(\mathbf{A}\varDelta \mathbf{x} = \mathbf{r}_{0}\) and put \(\mathbf{x}_{1} = \mathbf{x}_{0} +\varDelta \mathbf{x}\), then when we compute the residual again, we find that \(\|\mathbf{r}_{1}\|_{\infty }\) is about \(1{0}^{-17}\). This produces an entirely satisfactory backward error.
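A minimal Matlab sketch of this experiment (an illustration, not the authors' code; change n to 64 to see the behavior described in the next paragraph):

 n = 32;
 A = eye(n) - tril(ones(n),-1);  A(1,n) = 1;   % the matrix of Eq. (7.2)
 [U,Sig,V] = svd(A);
 b = V(:,n);                     % singular vector for the smallest singular value
 [L,Uf,P] = lu(A);               % GEPP (no rows actually move for this matrix)
 x0 = Uf\(L\(P*b));
 r0 = b - A*x0;                  % residual computed in the working precision
 dx = Uf\(L\(P*r0));             % one pass of iterative refinement
 x1 = x0 + dx;
 r1 = b - A*x1;
 [ norm(r0,inf), norm(r1,inf) ]  % residual before and after refinement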

For n = 64, the situation is much worse, at the beginning. The zeroth solution has a residual with infinity norm nearly 1; that is, almost no figures in the solution are correct. A single pass of iterative refinement gives \(\mathbf{x}_{1}\) with \(\|\mathbf{r}_{1}\|_{\infty }\doteq1.22 \cdot 1{0}^{-13}\), 13 orders of magnitude better. The 2-norm condition number of the matrix is only about 56.8, mind, and the \(\infty \)-norm condition number is 128. The Skeel condition number (see Eq. (6.9)) \(\mathrm{cond}(\mathbf{A}) =\| \vert {\mathbf{A}}^{-1}\vert \vert \mathbf{A}\vert \|_{\infty }\) is not very different, being very close to 66. However, the structured condition number for this \(\mathbf{x}\) is quite a bit smaller:

$$\displaystyle\begin{array}{rcl} \mathrm{cond}(\mathbf{A},\mathbf{x}) = \frac{\|\vert {\mathbf{A}}^{-1}\vert \,\vert \mathbf{A}\vert \,\vert \mathbf{x}\vert \,\|_{\infty }} {\|\mathbf{x}\|_{\infty }} \doteq5.548.& & {}\\ \end{array}$$

Thus, for n = 64, we can expect nearly 13 figures of accuracy in \(\mathbf{x}_{1}\), because the residual is so small. ⊲

Remark 7.1.

We should point out that \(\vert \mathbf{A}\vert \) does not commute with \(\vert {\mathbf{A}}^{-1}\vert \) in general, and in particular does not commute for this example. The Skeel condition number uses the inverse first. ⊲

2 What Could Go Wrong with an Iterative Method?

Let us now return to the iterative idea itself, and no longer think about the effects of just one pass, but rather about what happens if many iterations are needed. Indeed, thousands of iterations are common in some applications. The basic theoretical question is now: when does \({\mathbf{S}}^{k} \rightarrow 0\), and how fast does it do so? A theorem about eigenvalues—\({\mathbf{S}}^{k} \rightarrow 0\) if all eigenvalues of \(\mathbf{S}\) have \(\vert \lambda \vert \leq \rho < 1\)—seems to characterize things completely. However, as we saw in Sect. 5.5.2, pseudospectra turn out to play a role for nonnormal \(\mathbf{S}\). There are other ways to look at this problem, and there is an extensive discussion in Higham (2002 chapter 18). We content ourselves here with an example.

Example 7.2.

Suppose that \(\mathbf{A} = \mathbf{I} -\mathbf{S}\), where \(\mathbf{S}\) is bidiagonal, with all diagonal entries equal to \(8/9\) and all entries of the first superdiagonal equal to − 1. This is similar to the example matrix that was used in Sect. 5.5.2. Now, we wish to solve \(\mathbf{A}\mathbf{x} = \mathbf{b}\), where, say, \(\mathbf{b}\) has all entries equal to 1. Because all eigenvalues of \(\mathbf{S}\) are less than 1 in magnitude, we know that the series \(\mathbf{I} + \mathbf{S} +{ \mathbf{S}}^{2} + \cdots \) converges. Moreover, we know that ultimately the error goes to zero like “some constant” times \((8/9)^{k}\), and that k = 400 gives \((8/9)^{400}\doteq1 \times 1{0}^{-21}\). Therefore, the Richardson iteration

$$\displaystyle\begin{array}{rcl} \mathbf{x}_{k+1} = \mathbf{b} + \mathbf{S}\mathbf{x}_{k}& & {}\\ \end{array}$$

should converge to the reference solution. Incidentally, the reference solution has \(x_{n} = 9\) and \(x_{j} = O(9^{n-j})\) for \(j = n - 1,\ldots,1\), by back substitution. This exponential growth in the solution suggests that we should evaluate the quality of our solution by examining the scaled residual,

$$\displaystyle\begin{array}{rcl} \delta = \frac{\|\mathbf{b} -\mathbf{A}\mathbf{x}\|} {\|\mathbf{A}\|\|\mathbf{x}\|}.& & {}\\ \end{array}$$

We will use the kth iterate to scale the residual of the kth solution in the figures below.
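A minimal sketch that generates these scaled residuals (an illustration only; try both n = 5 and n = 89 to see the two behaviors discussed next):

 n = 89;  e = ones(n,1);
 S = spdiags([(8/9)*e -e], [0 1], n, n);   % diagonal 8/9, superdiagonal -1
 A = speye(n) - S;
 b = e;
 x = b;                                    % a simple starting guess
 delta = zeros(400,1);
 for k = 1:400
     x = b + S*x;                          % Richardson iteration
     delta(k) = norm(b - A*x,inf)/(norm(A,inf)*norm(x,inf));   % scaled residual
 end
 semilogy( delta, 'k.' )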

Because the pseudospectrum of this matrix (when the dimension is large) pokes out into the region \(\vert \lambda \vert > 1\)—that is, the pseudospectral radius \(\rho _{\epsilon }\) of Eq. (5.13) is larger than 1—we expect that this iteration will encounter trouble for large dimensions. In other words, the “constant” that we hid under the blanket called “some constant” in the previous discussion actually grows exponentially with the dimension n. While it is indeed constant with respect to the iteration number k for any fixed n, its size gets ridiculously large as n grows. In Problem 7.5, you are asked to give an explicit lower bound, confirming this. Thus, as might be expected, the iteration works quite well for a 5 × 5 matrix, as shown in Fig. 7.1. Also, as predicted, our expectation of trouble is confirmed by an 89 × 89 matrix, as shown in Fig. 7.2. ⊲

Fig. 7.1 Scaled residuals for the Richardson iteration solution of a nonnormal matrix with n = 5. We see fairly monotonic convergence

Fig. 7.2 Scaled residuals for the Richardson iteration solution of a nonnormal matrix of dimension 89 × 89. Convergence is very slow, which would be unexpected if we were not aware of the pseudospectra of the matrix \(\mathbf{S}\)

3 Some Classical Variations

In this section, we look at a few variations of the iterative method we have discussed thus far, namely, Jacobi iteration, Gauss–Seidel iteration, and successive overrelaxation (SOR).

Let us begin with Jacobi iteration. Take \(\mathbf{P} ={ \mathbf{D}}^{-1}\), the inverse of the diagonal part of the matrix (so, write the matrix as \(\mathbf{D} + \mathbf{E}\)). Then, mathematically, \(\mathbf{PA} ={ \mathbf{D}}^{-1}\mathbf{A}\) and \(\mathbf{S} = \mathbf{I} -{\mathbf{D}}^{-1}\mathbf{A}\) is pretty simple, but unless the off-diagonal elements of \(\mathbf{A}\) are small compared to \(\mathbf{D}\), this won’t converge: \(\mathbf{I} -{\mathbf{D}}^{-1}\mathbf{A}\) has only off-diagonal elements, \(-a_{ij}/a_{ii}\), and we want (ideally) \(\|\mathbf{S}\| < 1\). As an iteration to solve \(\mathbf{A}\mathbf{x} = \mathbf{b}\), we proceed as follows. \(\mathbf{A}\mathbf{x} = \mathbf{b}\) is equivalent to \((\mathbf{D} + \mathbf{E})\mathbf{x} = \mathbf{b}\). Therefore,

$$\displaystyle\begin{array}{rcl} \mathbf{D}\mathbf{x}& =& \mathbf{b} -\mathbf{E}\mathbf{x} {}\\ \mathbf{x}_{k+1}& =&{ \mathbf{D}}^{-1}(\mathbf{b} -\mathbf{E}\mathbf{x}_{ k}) {}\\ & =& \mathbf{x}_{k} -\mathbf{x}_{k} +{ \mathbf{D}}^{-1}(\mathbf{b} -\mathbf{E}\mathbf{x}_{ k}) {}\\ & =& \mathbf{x}_{k} +{ \mathbf{D}}^{-1}(\mathbf{b} -\mathbf{D}\mathbf{x}_{ k} -\mathbf{E}\mathbf{x}_{k}) {}\\ & =& \mathbf{x}_{k} +{ \mathbf{D}}^{-1}(\mathbf{b} -\mathbf{A}\mathbf{x}_{ k}), {}\\ \end{array}$$

which is the Jacobi iteration.
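In Matlab, the Jacobi update needs only the diagonal of \(\mathbf{A}\); here is a minimal sketch (an illustration, with a small diagonally dominant test system supplied so that it runs on its own):

 n = 100;  e = ones(n,1);
 A = spdiags([-e 4*e -e], -1:1, n, n);   % a diagonally dominant test matrix
 b = randn(n,1);
 d = full(diag(A));                      % the diagonal part D, stored as a vector
 x = zeros(n,1);                         % initial guess
 for k = 1:200
     r = b - A*x;                        % residual
     if norm(r,inf) <= 1e-12*norm(b,inf), break, end
     x = x + r./d;                       % x_{k+1} = x_k + D^{-1}(b - A*x_k)
 end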

The Gauss–Seidel method is also worth considering. As Strang (1986 p. 406) said, “[T]his is called the Gauss–Seidel method, even though Gauss didn’t know about it and Seidel didn’t recommend it. Nevertheless it is a good method.” Take \(\mathbf{P} ={ \mathbf{L}}^{-1}\), where \(\mathbf{L}\) is the lower-triangular part of \(\mathbf{A}\), including the diagonal:

$$\displaystyle\begin{array}{rcl} \mathbf{L} = \left [\begin{array}{cccc} a_{11} \\ a_{21} & a_{22}\\ \vdots & & \ddots \\ a_{n1} & a_{n2} & \cdots &a_{nn} \end{array} \right ].& & {}\\ \end{array}$$

The iteration demands, for \(\mathbf{A} = \mathbf{L} + \mathbf{U}\), that we solve

$$\displaystyle\begin{array}{rcl} \mathbf{L}\mathbf{x}_{k+1} = \mathbf{b} -\mathbf{U}\mathbf{x}_{k}& & {}\\ \end{array}$$

for \(\mathbf{x}_{k+1}\) or, alternatively, that we use the map

$$\displaystyle\begin{array}{rcl} \mathbf{x}_{k+1} ={ \mathbf{L}}^{-1}\mathbf{b} -{\mathbf{L}}^{-1}\mathbf{U}\mathbf{x}_{ k}& & {}\\ \end{array}$$

(at least in theory—in practice, we can write this as a simple iteration, reusing the same vector \(\mathbf{x}\) as we go, so it uses less storage than Jacobi iteration). Because \(\mathbf{L}\) is a better approximation to \(\mathbf{A}\), this often converges twice as fast as Jacobi. This is usually win–win, although Jacobi iteration can in some cases win by use of parallelism.
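Written this way in Matlab, one Gauss–Seidel sweep overwrites \(\mathbf{x}\) entry by entry, so that the newest values are used immediately (a sketch reusing the test matrix A, right-hand side b, and tolerance from the Jacobi fragment above):

 x = zeros(n,1);                         % initial guess
 for k = 1:200
     for i = 1:n                         % forward sweep, updating x in place
         x(i) = ( b(i) - A(i,1:i-1)*x(1:i-1) - A(i,i+1:n)*x(i+1:n) ) / A(i,i);
     end
     r = b - A*x;
     if norm(r,inf) <= 1e-12*norm(b,inf), break, end
 end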

But there is a dramatically better method using only trivially more effort, successive overrelaxation (SOR). Split \(\mathbf{A} = \mathbf{L} + \mathbf{D} + \mathbf{U}\), with \(\mathbf{L}\) now being strictly lower-triangular. We get, with an “overrelaxation parameter” ω ∈ (0, 2),

$$\displaystyle\begin{array}{rcl} (\mathbf{D} +\omega \mathbf{L})\mathbf{x} =\omega \mathbf{b} - (\omega \mathbf{U} - (1 -\omega )\mathbf{D})\mathbf{x}& & {}\\ \end{array}$$

from the following:

$$\displaystyle\begin{array}{rcl} \mathbf{A}\mathbf{x}& =& \mathbf{b} {}\\ \omega \mathbf{A}\mathbf{x}& =& \omega \mathbf{b} {}\\ \mathbf{D}\mathbf{x} +\omega \mathbf{A}\mathbf{x}& =& \omega \mathbf{b} + \mathbf{D}\mathbf{x} {}\\ \mathbf{D}\mathbf{x} +\omega (\mathbf{L} + \mathbf{D} + \mathbf{U})\mathbf{x}& =& \omega \mathbf{b} + \mathbf{D}\mathbf{x} {}\\ (\mathbf{D} +\omega \mathbf{L})\mathbf{x}& =& \omega \mathbf{b} + \mathbf{D}\mathbf{x} -\omega \mathbf{D}\mathbf{x} -\omega \mathbf{U}\mathbf{x} {}\\ & =& \omega \mathbf{b} - (\omega \mathbf{U} - (1 -\omega )\mathbf{D})\mathbf{x}. {}\\ \end{array}$$

Here, \(\mathbf{P} =\omega {(\mathbf{D} +\omega \mathbf{L})}^{-1}\) and we have a free parameter ω, the relaxation parameter, to choose. We may choose it differently for every iteration, to try to minimize the largest eigenvalue (in magnitude) of what we have been calling \(\mathbf{S}\). As the iteration proceeds and information is extracted that estimates the largest eigenvalue of the Jacobi iteration matrix, we may improve our choice. Here \(\mathbf{S} = {(\mathbf{D} +\omega \mathbf{L})}^{-1}((1 -\omega )\mathbf{D} -\omega \mathbf{U})\), and for some finite-difference applications the optimal ω is known. For the right choice of ω, this can seriously outperform Gauss–Seidel.

Example 7.3.

We use A = delsq( numgrid( ‘B’, n ) ) as an example for SOR, even though direct methods are actually better for this nearly banded matrix. We look first at small-dimension matrices, specifically for n = 5, 8, 13, 21, and 34. The dimension of \(\mathbf{A}\) is \(O({n}^{2}) \times O({n}^{2})\). By fitting the data from these smaller matrices, the largest eigenvalue of the Jacobi iteration matrix \({\mathbf{D}}^{-1}\left (\mathbf{A} -\mathbf{D}\right )\) seems to be \(\mu = 1 - 16.65/{n}^{2}\), which means that the optimal \(\omega = 2/(1 + \sqrt{1 - {\mu }^{2}})\) is about \(2/(1 + 5.77/n)\) (since \(\sqrt{1 - {\mu }^{2}}\approx \sqrt{2 \times 16.65}/n\doteq5.77/n\)), and the eigenvalues of the SOR error matrix are then less than \((1 - 5.77/n)/(1 + 5.77/n)\), approximately.

When we use 150 iterations of SOR to solve the system for n = 80 (so the matrix is 4808 × 4808), we find that the residual behaves on the kth iteration as approximately \(1{0}^{3} \times {(\omega -1)}^{k}\), and after 150 iterations, the residual is \(4.5 \times 1{0}^{-7}\). In contrast, the same number of Jacobi iterations cannot be expected even to give one figure of accuracy, and Gauss–Seidel is not much better. The difference between \({(1 - O(1/n))}^{k}\) and \({(1 - O(1/{n}^{2}))}^{k}\) is huge. The constant \(1{0}^{3}\) above changes, of course, with the dimension n. It seems experimentally to vary as \({({n}^{2})}^{2}\), the square of the dimension of \(\mathbf{A}\), which, though growing with n, is at least not growing exponentially with n. ⊲

Remark 7.2.

These classical methods are still useful in some circumstances, but there have been serious advances in iterative methods since these were invented. Multigrid methods and conjugate gradient methods seem to be the methods of choice. See Hogben (2006 chapter 41), by Anne Greenbaum, for an entry point to the literature. ⊲

4 Large Eigenvalue Problems

All methods for finding eigenvalues are iterative; so, unlike the case where we were solving \(\mathbf{A}\mathbf{x} = \mathbf{b}\), where there was a distinction between finite, terminating “direct” methods (such as QR factoring or LU factoring) and nonterminating “iterative” methods such as SOR, when we tackle \(\mathbf{A}\mathbf{x} =\lambda \mathbf{x}\), the distinction in algorithm classes is a bit fuzzy and depends chiefly on how large a “large matrix” is today. On a tablet PC in 2010, not a high-end machine by any means, it took Matlab five seconds to compute all 1000 eigenvalues and eigenvectors of a random 1000 × 1000 matrix, as follows:

 %% Eigenvalues of a 1000 by 1000 Random Matrix
 A = rand( 1000 );
 e = eig( A );
 plot( real(e), imag(e), 'k.' )
 axis('square'), axis([-10,10,-10,10]), set(gca,'Fontsize',16)
 xlabel('Real Part'), ylabel('Imaginary Part')

So today a 1000 × 1000 matrix is not large, even though it and its matrix of eigenvectors have a million entries each. See Fig. 7.3.

Fig. 7.3 Nine hundred ninety-nine eigenvalues of a random 1000 × 1000 real matrix. The odd eigenvalue is about 500.3294 (because all entries of this matrix are positive, the Perron–Frobenius theorem applies, and thus there is a unique eigenvalue with largest magnitude, which is real). Note the conjugate symmetry, and the confinement to a disk with radius about 10

For many applications, however, we might not need all 1000 eigenvalues and eigenvectors, but perhaps just the six largest, or six smallest. Consider the following situation. Suppose we execute

a=rand(1000);

eigs(a)

in Matlab and receive the following warning:

Warning: Only 5 of the 6 requested eigenvalues converged.

In eigs>processEUPDinfo at 1474

In eigs at 367

This command had some sort of iteration failure—it only found five of the six largest eigenvalues. We will see in a moment a possible way to work around this failure. But before that, notice that if we execute

eigs(a,6,0)

we successfully and quickly find the six smallest eigenvalues. Note that eigs is not eig. The “s” is for “sparse,” although it works (as in this case) on a dense matrix. The following simple kludge avoids the convergence failure in this example:

eigs( a - 10.032*speye(1000) )

ans + 10.032

That is, we simply shifted the matrix a random amount, and this was enough to kick the iteration over its difficulties. Then we correctly find the eigenvalues:

$$\displaystyle\begin{array}{rcl} 1{0}^{2}\left [\begin{array}{c} 5.0033\\ - 0.0908 - 0.0118i \\ - 0.0908 + 0.0118i \\ - 0.0882 + 0.0119i \\ - 0.0882 - 0.0119i \\ - 0.0880 + 0.0016i \end{array} \right ].& & {}\\ \end{array}$$

This is, of course, not entirely satisfactory, but we shall pursue this in a bit of detail shortly.

For large sparse matrices, special methods of iterating are needed: The construction of an upper Hessenberg intermediate matrix is already too expensive, so the QR iteration (as is) is also too expensive. The techniques of choice are Arnoldi iteration (as implemented in ARPACK and in Matlab’s eigs routine) and other special-purpose routines, such as Rayleigh quotient iteration for the symmetric eigenproblem. Before moving on to this method, we consider the so-called Krylov subspaces, spanned by the columns of

$$\displaystyle\begin{array}{rcl} \left [\begin{array}{c@{\quad }c@{\quad }c@{\quad }c@{\quad }c@{\quad }c} \mathbf{v}\quad &\mathbf{A}\mathbf{v}\quad &{\mathbf{A}}^{2}\mathbf{v}\quad &{\mathbf{A}}^{3}\mathbf{v}\quad &\ldots \quad &{\mathbf{A}}^{k}\mathbf{v} \end{array} \right ],& & {}\\ \end{array}$$

which can be generated using only k matrix–vector multiplications. The power method considered only the latest \({\mathbf{A}}^{k}\mathbf{v}\) (and perhaps the previous). In exact arithmetic, as noted before, the characteristic polynomial can be constructed from the finite sequence \([\mathbf{v},\mathbf{A}\mathbf{v},\ldots,{\mathbf{A}}^{n}\mathbf{v}]\) because these vectors must be linearly dependent; but in the presence of rounding errors, we are much better off using other techniques; if we’re at all lucky, we will get good eigenvalue information with k iterations for k ≪ n.
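To make this concrete, here is a minimal Arnoldi-style sketch (an illustration, not the ARPACK implementation) that builds an orthonormal basis for such a Krylov subspace using one matrix–vector product per step; the eigenvalues of the small matrix H are the Ritz values that appear later in this section:

 n = 400;  k = 20;
 A = spdiags([ones(n,1) 4*ones(n,1) ones(n,1)], -1:1, n, n);   % any sparse test matrix
 v = randn(n,1);
 Q = zeros(n,k+1);  H = zeros(k+1,k);
 Q(:,1) = v/norm(v);
 for j = 1:k
     w = A*Q(:,j);                      % the only use of A: one product per step
     for i = 1:j                        % orthogonalize against the basis so far
         H(i,j) = Q(:,i)'*w;
         w = w - H(i,j)*Q(:,i);
     end
     H(j+1,j) = norm(w);
     Q(:,j+1) = w/H(j+1,j);
 end
 ritz = eig(H(1:k,1:k));                % Ritz values approximate eigenvalues of A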

Rayleigh quotient iteration—or RQI—is easily described (see Problem 6.16). Given an initial guess for an eigenvector \(\mathbf{x}_{0}\), form

$$\displaystyle\begin{array}{rcl} \mu = \frac{\mathbf{x}_{0}^{H}\mathbf{A}\mathbf{x}_{0}} {\mathbf{x}_{0}^{H}\mathbf{x}_{0}},& & {}\\ \end{array}$$

the Rayleigh quotient. We make the crucial simplification of assuming \(\mathbf{A} \in {\mathbb{R}}^{n\times n}\) and \({\mathbf{A}}^{H} = \mathbf{A}\); that is, \(\mathbf{A}\) is symmetric. Moreover, let \(\mathbf{A}\) be positive-definite, and sparse (or at least fast to make matrix–vector products \(\mathbf{y} = \mathbf{A}\mathbf{v}\) with). Finally, we suppose the eigenvalues are simple. Once we have μ, which is the best least-squares approximation to an eigenvalue corresponding to \(\mathbf{x}_{0}\), we now use it to improve \(\mathbf{x}_{0}\). Solve

$$\displaystyle\begin{array}{rcl} (\mathbf{A} -\mu \mathbf{I})\mathbf{z} = \mathbf{x}_{0},& &{}\end{array}$$
(7.3)

and put \(\mathbf{x}_{1} = \mathbf{z}/\|\mathbf{z}\|\). You may use any convenient method to solve Eq. (7.3); since \(\mathbf{A}\) is sparse (or \(\mathbf{A}\mathbf{v}\) is easy), you may choose a sparse iterative method. You may choose not to solve it very accurately; after all, \(\mathbf{x}_{1}\) will just be another approximate eigenvector, and we’re going to do the iteration again. When do we stop? If

$$\displaystyle\begin{array}{rcl} \|\mathbf{A}\mathbf{x}_{i} -\mu _{i}\mathbf{x}_{i}\| <\epsilon,& & {}\\ \end{array}$$

then we know that μ i is an exact eigenvalue for \(\mathbf{A} + \varDelta \mathbf{A}\) with \(\|\varDelta \mathbf{A}\| \leq \epsilon \|\mathbf{A}\|\). Hence, this is a reliable test for convergence, from a backward error point of view. Since symmetric matrices have perfectly conditioned eigenvalues (normwise), this may be satisfactory from the forward point of view, too. Thus, we get Algorithm 7.1.

Algorithm 7.1 Rayleigh quotient iteration

Require: A vector \(\mathbf{x}_{0}\), a method to compute \(\mathbf{y} = \mathbf{A}\mathbf{v}\), a method to solve \((\mathbf{A} -\mu \mathbf{I})\mathbf{z} = \mathbf{b}\)

       for \(i = 1,2,\ldots\) until converged do

           \(\mu _{i-1} = \mathbf{x}_{i-1}^{T}(\mathbf{A}\mathbf{x}_{i-1})/(\mathbf{x}_{i-1}^{T}\mathbf{x}_{i-1})\)

           Solve \((\mathbf{A} -\mu _{i-1}\mathbf{I})\mathbf{z} = \mathbf{x}_{i-1}\)

           \(\mathbf{x}_{i} = \mathbf{z}/\|\mathbf{z}\|\)

       end for
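A minimal Matlab transcription of Algorithm 7.1 (a sketch; the shifted systems are solved with backslash here, though any convenient method would do, and the shifted matrix becomes nearly singular as the iteration converges, which is harmless for the purpose of getting a direction):

 n = 500;  e = ones(n,1);
 A = spdiags([e 4*e e], -1:1, n, n);    % a symmetric positive-definite test matrix
 x = randn(n,1);  x = x/norm(x);        % initial guess x_0
 for i = 1:10
     mu = (x'*(A*x))/(x'*x);            % Rayleigh quotient
     if norm(A*x - mu*x) <= 1e-12*norm(A,1)
         break                          % backward-error-based stopping test
     end
     z = (A - mu*speye(n))\x;           % solve the shifted system
     x = z/norm(z);
 end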

We may want to find generalizations of this method; for example, we wish to find more than one eigenvector at a time. Suppose \(\mathbf{x}_{0} \in {\mathbb{R}}^{n\times k}\) (k ≪ n). Then if \(\mathbf{x}_{0}^{T}\mathbf{x}_{0} = \mathbf{I}\),

$$\displaystyle\begin{array}{rcl} \mathbf{H} = \mathbf{x}_{0}^{T}\mathbf{A}\mathbf{x}_{ 0} \in {\mathbb{R}}^{k\times k}& & {}\\ \end{array}$$

shares some interesting features with the 1 × 1 case. The eigenvalues of \(\mathbf{H}\), called Ritz values, are approximations to eigenvalues of \(\mathbf{A}\), in some sense. Alternatively, one can think of the following iteration:

       for \(i = 1,2,\ldots\) until converged do

           \(\mathbf{H} = \mathbf{x}_{i-1}^{T}\mathbf{A}\mathbf{x}_{i-1}\)

           \(\mu =\mathrm{ diag}(\mathbf{H})\)

           for \(j = 1,2,\ldots,k\) do

                Solve \(\left (\mathbf{A} -\mu _{j}\mathbf{I}\right )\mathbf{z}_{j} = (\mathbf{x}_{i-1})_{j}\)

               \((\mathbf{x}_{i})_{j} = \mathbf{z}_{j}\)

           end for

            \((\mathbf{x}_{i},\mathbf{R}) = \mathtt{qr}(\mathbf{x}_{i})\)

       end for

This essentially does k independent Rayleigh iterations at once; the qr step just keeps the columns orthonormal, so that the iterates converge to different eigenvectors instead of all collapsing onto the same one.

We might also wish to solve unsymmetric problems. The difficulties here are worse, as we must solve for left eigenvectors, too; this is called broken iteration, or Ostrowski iteration for some variations. In the symmetric case, convergence is often cubic; for the nonsymmetric case, this is true only sometimes. More seriously, if all we can do with \(\mathbf{A}\) is make \(\mathbf{A}\mathbf{v}\), how do we make \({\mathbf{y}}^{H}\mathbf{A}\)? This can be done without constructing \(\mathbf{A}\) explicitly [which costs \(O({n}^{2})\)], but it can be awkward. Still, we have a method:

Require: For \(\mathbf{x}_{0},\mathbf{y}_{0} \in {\mathbb{C}}^{n}\), a way to compute \(\mathbf{A}\mathbf{v}\) and a way to solve both \((\mathbf{A} -\mu \mathbf{I})\mathbf{z} = \mathbf{x}\) and \(\left ({\mathbf{A}}^{H} -\overline{\mu }\mathbf{I}\right )\mathbf{w} = \mathbf{y}\)

       for \(i = 1,2,\ldots\) until converged do

           \(\mu _{i-1} = (\mathbf{y}_{i-1}^{H}\mathbf{A}\mathbf{x}_{i-1})/(\mathbf{y}_{i-1}^{H}\mathbf{x}_{i-1})\) (N.B. fails if \(\mathbf{y}_{i-1}^{H}\mathbf{x}_{i-1}\) is too small)

           Solve \((\mathbf{A} -\mu _{i-1}\mathbf{I})\mathbf{z} = \mathbf{x}_{i-1}\)

           \(\mathbf{x}_{i} = \mathbf{z}/\|\mathbf{z}\|\)

            Solve \(({\mathbf{A}}^{H} -\overline{\mu }_{i-1}\mathbf{I})\mathbf{w} = \mathbf{y}_{i-1}\)

           \(\mathbf{y}_{i} = \mathbf{w}/\|\mathbf{w}\|\)

       end for

Convergence in residual happens if

$$\displaystyle\begin{array}{rcl} \|\mathbf{A}\mathbf{x}_{i} -\mu _{i}\mathbf{x}_{i}\| \leq \epsilon & & {}\\ \end{array}$$

as before, but note that now the eigenvalue may be very ill-conditioned, in which case \(\mu _{i} \in \varLambda _{\epsilon }(\mathbf{A})\) does not mean \(\vert \lambda -\mu _{i}\vert = O(\epsilon )\) for a modest multiple of ε.

Again, when to stop the iteration? Since the residuals are being computed at each stage, one can in principle stop if the residuals get small enough that the backward error interpretation of \(\mathbf{r}\), namely, that we have solved \(\mathbf{A}\mathbf{x} = \mathbf{b} -\mathbf{r}\), suggests that the residual is negligible. However, rounding errors (especially if the matrix \(\mathbf{S}\) is not normal) can prevent the residuals from getting as small as we like.

Example 7.4.

The popular Jenkins–Traub method (Jenkins and Traub 1970) for finding roots of polynomials expressed in the monomial basis has at its core an iteration related to the Rayleigh quotient iteration on the companion matrix for the polynomial. In this example, we use RQI on the companion matrix of a polynomial to find some of its roots, as follows. Recall that a companion matrix for a monic polynomial \(p(z) = a_{0} + a_{1}z + \cdots + {z}^{n}\) can be written as a sparse matrix, all zero except for the first subdiagonal, which is just 1s, and the final column, which contains the negatives of the polynomial coefficients. It is a short exercise to see that if z is a root of p(z), then the vector \([1,z,{z}^{2},\ldots,{z}^{n-1}]\) is a left eigenvector of \(\mathbf{C}\), and a corresponding right eigenvector is \([\alpha _{1}(z),\alpha _{2}(z),\ldots,\alpha _{n}(z)]\), where \(\alpha _{n}(z) = 1\), \(\alpha _{n-1}(z) = a_{n-1} + z\), \(\alpha _{n-2}(z) = a_{n-2} + z(a_{n-1} + z)\), and so on down to \(\alpha _{1}(z) = a_{1} + z(a_{2} + z(a_{3} + \cdots \,))\), which must also equal \(-a_{0}/z\) if \(z\not =0\) (and, of course, \(a_{0} = 0\) if z = 0). These are the successive evaluations of the polynomial that one gets by executing Horner’s method. That is, for this kind of matrix, a guess at an eigenvalue λ will automatically give us a pair of approximate left and right eigenvectors. It is simple to form the Rayleigh quotient \(({\mathbf{x}}^{H}\mathbf{C}\mathbf{x})/({\mathbf{x}}^{H}\mathbf{x})\) or the Ostrowski quotient \(({\mathbf{y}}^{H}\mathbf{C}\mathbf{x})/({\mathbf{y}}^{H}\mathbf{x})\) from these to give us a hopefully improved estimate of the eigenvalue (which then can be fed back into the eigenvector formulae to use on the next iteration). This works, and it’s faster than solving a shifted linear system for the eigenvectors (which also works, and works more generally).

Consider Newton’s example, \(p(z) = {z}^{3} - 2z - 5\). A companion matrix for this is

$$\displaystyle\begin{array}{rcl} \mathbf{C} = \left [\begin{array}{c@{\enskip }c@{\enskip }c} 0 &0 &5\\ 1 &0 &2 \\ 0 &1 &0 \end{array} \right ].& & {}\\ \end{array}$$

If we start with an initial approximation \(z_{0} = -1 + i\) and use the formulae above for Ostrowski iteration, we get convergence in five iterations. If instead we solve for our approximate eigenvectors at each step via \((\mathbf{C} - {z}^{(i)}\mathbf{I}){\mathbf{x}}^{(i+1)} ={ \mathbf{x}}^{(i)}\), and similarly for the left eigenvector, neither of which is hard because this matrix is sparse, then this is more like a normal Rayleigh quotient case where we don’t know what the eigenvectors look like. In both cases the convergence appears to be quadratic, but Rayleigh quotient iteration converges only if we solve for a new eigenvector at each step. That is, with the explicit formulae for the left and right eigenvectors instead of solving, only Ostrowski (also called “broken”) iteration converges; Rayleigh quotient iteration converges if the new eigenvectors are solved for.

Once a root has been found, it is necessary to deflate the matrix (or the polynomial); we do not discuss this in any detail here, although note that this is entirely possible within the framework of matrices—using either the left or right eigenvectors, one can in theory find a matrix one dimension smaller that has all the remaining roots as eigenvalues. Let

$$\displaystyle\begin{array}{rcl} \mathbf{X} = \left [\begin{array}{c@{\enskip }c@{\enskip }c} \alpha _{1} &0 &0 \\ \alpha _{2} &1 &0 \\ \alpha _{3} &0 &1 \end{array} \right ],& & {}\\ \end{array}$$

where the first column is the right eigenvector corresponding to the root z that we have found. Note that \(\alpha _{1} = -a_{0}/z\), which we assume is nonzero, so that \(\mathbf{X}\) is invertible. Then \({\mathbf{X}}^{-1}\mathbf{C}\mathbf{X}\) has [z, 0, 0]T as its first column, and the remaining two eigenvalues of \(\mathbf{C}\) are the two eigenvalues of the trailing 2 × 2 block in the last two rows and columns. Similarly, one could deflate instead with the left eigenvector (which works even if \(a_{0} = 0\), though trivially since the matrix is already deflated in that case).
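A quick numerical check of this deflation for Newton's cubic (an illustrative sketch):

 C = [0 0 5; 1 0 2; 0 1 0];
 z = 2.094551481542327;                 % the real root of z^3 - 2z - 5
 alpha = [-2 + z^2; z; 1];              % right eigenvector of C for the eigenvalue z
 X = [alpha, [0;1;0], [0;0;1]];
 B = X\(C*X);                           % first column is [z; 0; 0] up to roundoff
 eig( B(2:3,2:3) )                      % the two remaining (complex) roots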

This is mathematically equivalent to synthetic division if the right eigenvector is used, and the deflated matrix is also a companion matrix; if the left eigenvector is used, then a different matrix is obtained. However, there is a tendency for rounding errors to accumulate in this process when one works with polynomials of high degree.

One can use a code such as this to implement this idea:

 %% Rayleigh Quotient Iteration for a Companion Matrix
 %
 % Newton's example polynomial was p(z) = z^3 - 2z - 5 = 0.
 %
 C = [0 0 1.67608204095197550; 1 0 2; 0 1 -0.66478359180960489];
 x0 = -6 + 5i;
 x = @(z) [-C(2,end)+z*(C(3,end)+z); C(3,end)+z; 1];
 niters = 19;
 xi  = zeros( niters, 1 );
 xia = zeros( niters, 1 );
 % Now solve at each step for the new eigenvector.
 xi(1)  = x0;
 xia(1) = x0;
 x1 = x(x0);                          % initial eigenvector
 xa = x1;
 x1 = x1/norm(x1,2);
 for i = 2:niters
     x1 = (C - xi(i-1)*eye(3))\x1;
     x1 = x1/norm(x1,2);
     xi(i)  = x1'*C*x1;               % (x1'*x1) = 1
     xia(i) = (xa'*C*xa)/(xa'*xa);
     xa = x(xi(i));                   % analytic eigenvector formula
 end
 ers = xi(:) - xi(end);
 close( figure(1) )
 figure(1), semilogy( abs(ers), 'ko' ), set(gca,'fontsize',16), hold on
 ersa = xia(:) - xi(end);
 semilogy( abs(ersa), 'kS' )

It is straightforward to adapt this code for other similar problems. ⊲

Problems

  1. 7.1.

    Add an iterative refinement step to your solution of Problem 6.6. Note that evaluation of the residual is comparable in cost to the solution of the system, so this is a significantly costly step in this case. Does this help?

  2. 7.2.

    Consider the following system:

    $$\displaystyle\begin{array}{rcl} 2x_{1} - x_{2}& =& 1 {}\\ -x_{j-1} + 2x_{j} - x_{j+1}& =& j,\qquad j = 2,\ldots,n - 1 {}\\ -x_{n-1} + 2x_{n}& =& n {}\\ \end{array}$$

    with n = 100. Parts 1–2 are from Moler (2004 prob. 2.19).

    1. 1.

      Use diag or spdiags to form the coefficient matrix and then use lu, backslash (\), and tridisolve to solve the system.

    2. 2.

      Use condest to estimate the condition of the coefficient matrix.

    3. 3.

    Solve the same problem as above, but changing 2 to be θ > 2, say θ = 2.1, and using the approach of Seneca.m to encode the matrix–vector product, use Jacobi iteration instead (note that \(\mathbf{{P}^{-1}} =\theta \mathbf{I}\) and so \(\mathbf{P}\mathbf{x} = \frac{1}{\theta }\mathbf{x}\) is particularly easy). How large can the size of the problem be, before it takes Matlab at least 60 s to solve the problem this way? How large can the problem be using a direct method? (And, even more, the comparison is unfair; Matlab’s method is built-in, and Jacobi iteration must be “interpreted.” Still, …)

  3. 7.3.

    Implement in Matlab the SOR method as described in the text. Be careful not to invert any matrices. Use your implementation with \(\omega = 2 - O(1/n)\) to solve the linear system described in Problem 7.2 with θ = 2.1.

  4. 7.4.

    Take \(\mathbf{A} = \mathtt{hilb}(8)\), the 8 × 8 Hilbert matrix. Use MGS to factor \(\mathbf{A}\) approximately:

    $$\displaystyle\begin{array}{rcl} \mathbf{A} = \mathbf{QR}& & {}\\ \end{array}$$

    with \({\mathbf{Q}}^{T}\mathbf{Q}\doteq\mathbf{I}\). In fact, \({\mathbf{Q}}^{T}\mathbf{Q} = \mathbf{I} + \mathbf{E}\), where \(\|\mathbf{E}\| \leq \kappa (\mathbf{A}) \cdot c \cdot \mu _{M}\), where c is a modest constant and μ M is the unit roundoff. Solve \(\mathbf{A}\mathbf{x} = \mathbf{b}\) by using this \(\mathbf{Q}\) and \(\mathbf{R}\) in a factoring, as follows:

    $$\displaystyle\begin{array}{rcl} \mathbf{Q}\mathbf{y}& =& \mathbf{b} {}\\ \mathbf{R}\mathbf{x}& =& \mathbf{y}, {}\\ \end{array}$$

    and use the solution process

    $$\displaystyle\begin{array}{rcl} \hat{\mathbf{y}}& =&{ \mathbf{Q}}^{T}\mathbf{b} {}\\ \hat{\mathbf{x}}& =& \mathbf{R}\setminus \hat{\mathbf{y}}. {}\\ \end{array}$$

    Use one or two iterations of refinement to improve your solution. Discuss.

  5. 7.5.

    Consider the matrix from Example 7.2. Use the formula for the pseudospectral radius, namely, Eq. (5.13), and the estimate \(\|{\left (\mathbf{S} - z\mathbf{I}\right )}^{-1}\|_{2} \geq \vert z - 8/9{\vert }^{-n}\) [this is easy to see, because the corner entry of the resolvent has exactly this magnitude, and the 2-norm must be at least as large as the magnitude of any element of the matrix] to derive a reasonably tight lower bound on the maximum \(\|{\mathbf{S}}^{k}\|_{2}\) when n = 89. Verify your bound by computation of \({\mathbf{S}}^{k}\) for 1 ≤ k ≤ 1600. Hint: Take \(\varepsilon = e/9^{n}\) and use \({e}^{1/n} > 1 + 1/n\). Ultimately, of course, \(\|{\mathbf{S}}^{k}\|_{2}\) must go to zero as k → ∞, but this analysis shows that it gets quite large along the way. This is why Richardson iteration is so slow for the system \(\left (\mathbf{I} -\mathbf{S}\right )\mathbf{x} = \mathbf{b}\).

  6. 7.6.

    The diagonal dominance of the matrix

    $$\displaystyle{\mathbf{A} = \left [\begin{array}{rrrrrr} - 10& 1& & & & \\ 1 & - 10 & 1 & & & \\ & 1& - 10& 1& & \\ & & 1 & - 10 & 1 & \\ & & & 1& - 10& 1\\ & & & & 1 & - 10 \end{array} \right ]}$$

    tempts us to try Jacobi iteration \(\mathbf{x}_{k+1} = \mathbf{x}_{k} +{ \mathbf{D}}^{-1}\left (\mathbf{b} -\mathbf{A}\mathbf{x}_{k}\right )\).

    1. 1.

      For \(\mathbf{b} = [1,1,1,1,1,1]^{T}\) and an initial guess of \(\mathbf{x}_{0} = -[1,1,1,1,1,1]^{T}/10\), carry out two iterations by hand. (The arithmetic for this problem is not out of reach: The numbers were chosen to be nice enough to do on a midterm exam.) Can you estimate how accurate your final answer is?

    2. 2.

      Using symmetry and the eigenvalue formula for tridiagonal Toeplitz matrices \(\lambda _{k} = -10 + 2\cos (\pi k/(n+1))\) (here n = 6), estimate the 2-norm condition number. The Skeel condition number cond\((\mathbf{A}) =\|\, \vert {\mathbf{A}}^{-1}\vert \,\vert \mathbf{A}\vert \,\|\) can be shown to have exactly the same value. Using the phrases “structured condition number” and “structured backward error” in a sentence, explain what this means.