Abstract
Matrices of two-by-two block form with matrix blocks of equal order arise in various important applications, such as when solving complex-valued systems in real arithmetic, in linearized forms of the Cahn–Hilliard diffusive phase-field differential equation model, and in constrained partial differential equations with distributed control. It is shown how an efficient preconditioner can be constructed which, under certain conditions, leads to a spectral condition number of about 2. The preconditioner avoids the use of Schur complement matrices and needs only solutions with matrices that are linear combinations of the matrices appearing in each block row of the given matrix, for which efficient preconditioners are often already available.
Mathematics Subject Classification (2010): 65F10, 65F35, 76T10, 49J20
Keywords
- Two-by-two block-structured matrices
- Preconditioning
- Complex-valued system
- Cahn–Hilliard phase-field model
- Optimal control
- Distributed control
1 Introduction
To motivate the study, we first give some examples of two-by-two block matrices with blocks of equal order, i.e. square blocks. Although the matrices are of special type, as we shall see there are several important applications where they arise. One such example is related to the solution of systems with complex-valued matrices. Complex-valued systems arise, for instance, when solving certain partial differential equations (PDEs) appearing in electromagnetics and wave propagation; see [1]. Complex arithmetic requires more memory storage and may require a more involved implementation. Therefore it is desirable to rewrite a complex-valued matrix system in a form that can be handled using real arithmetic.
Using straightforward derivations, for a complex-valued matrix A + i B, where A and B are real and A is nonsingular, it holds
$$(A + iB)(I - i{A}^{-1}B) = A + B{A}^{-1}B,$$
so
$${(A + iB)}^{-1} = (I - i{A}^{-1}B)\,{(A + B{A}^{-1}B)}^{-1}.$$
It follows that a complex-valued system
$$(A + iB)(\mathbf{x} + i\mathbf{y}) = \mathbf{f} + i\mathbf{g},$$
where x, y, f, g are real vectors, can be solved by solving two real-valued systems with matrix \(A + B{A}^{-1}B\) with right-hand sides f and g respectively, in addition to a matrix vector multiplication with B and two solutions of systems with the matrix A.
In many applications, \(A + B{A}^{-1}B\) can be ill conditioned and costly to construct and solve systems with, in particular as it involves solutions with inner systems with the matrix A. Therefore, this approach is normally less efficient.
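As a numerical sketch of this procedure (all matrices and data below are randomly generated and purely illustrative), the complex system can be solved entirely in real arithmetic and checked against a direct complex solve:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
A = rng.standard_normal((n, n)) + n * np.eye(n)   # real, nonsingular A
B = rng.standard_normal((n, n))                   # real B
f = rng.standard_normal(n)
g = rng.standard_normal(n)

# Solve (A + iB)(x + iy) = f + ig entirely in real arithmetic:
# two solves with S = A + B A^{-1} B, two solves with A, multiplications with B.
S = A + B @ np.linalg.solve(A, B)
u = np.linalg.solve(S, f)                 # S u = f
v = np.linalg.solve(S, g)                 # S v = g
x = u + np.linalg.solve(A, B @ v)         # x = u + A^{-1} B v
y = v - np.linalg.solve(A, B @ u)         # y = v - A^{-1} B u

z = np.linalg.solve(A + 1j * B, f + 1j * g)   # reference complex-arithmetic solve
assert np.allclose(x + 1j * y, z)
```

Note that this is precisely the count stated above: two solves with \(A + B{A}^{-1}B\) (right-hand sides f and g), matrix-vector multiplications with B, and two solves with A.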
As has been shown in [2] (see also [1, 3]), it may be better to rewrite the equation in real-valued form
$$\left [\begin{array}{cc} A & - B\\ B & A \end{array} \right ]\left [\begin{array}{c} \mathbf{x}\\ \mathbf{y} \end{array} \right ] = \left [\begin{array}{c} \mathbf{f}\\ \mathbf{g} \end{array} \right ].\quad (1)$$
A matrix factorization shows that
$$\left [\begin{array}{cc} A & - B\\ B & A \end{array} \right ] = \left [\begin{array}{cc} I & 0\\ B{A}^{-1} & I \end{array} \right ]\left [\begin{array}{cc} A & - B\\ 0 & A + B{A}^{-1}B \end{array} \right ],$$
where I is the identity matrix. It is seen that here one solution with the matrix \(A + B{A}^{-1}B\) suffices, in addition to two solves with A. However, we will show that the form (1) allows for an alternative solution method, based on iteration with an efficient preconditioner that involves only two systems with matrices that are linear combinations of the matrices A and B, and that such an iterative solution of (1) can substantially lower the computational expense. We shall show that such a preconditioner can be constructed for a matrix in the more general form
$$\mathcal{A} = \left [\begin{array}{cc} A & - {B}^{T}\\ {\beta }^{2}B &{ \alpha }^{2}A \end{array} \right ],\quad (2)$$
where α, β are positive numbers. By the introduction of a new, scaled second variable vector \(\hat{\mathbf{y}}\, :\, ={ \alpha }^{2}\mathbf{y}\), the system transforms into the alternative form
$$\left [\begin{array}{cc} A & - a{B}^{T}\\ bB & A \end{array} \right ],\quad (3)$$
where \(a = \frac{1} {{\alpha }^{2}},\;b ={ \beta }^{2}\). This form arises in the two-phase version of the Cahn–Hilliard equation, used to track interfaces between two fluids with different densities on a stationary grid; see [4, 5].
As we shall see in the sequel, a matrix in the form (2), with β = 1 arises also in optimization problems for PDE, with a distributed control function, that is, a control function defined in the whole domain of definition of the PDE. For an introduction to such problems, see [6, 7].
Problems of this kind appear in various applications in engineering and geosciences but also in medicine [8] and finance [9]. As a preamble to this topic, we recall that the standard form of a constrained optimization problem with a quadratic function takes the form
$$\min _{\mathbf{u}}\ \tfrac{1} {2}\,{\mathbf{u}}^{T}A\mathbf{u} -{\mathbf{f}}^{T}\mathbf{u},$$
subject to the constraint B u = g. Here, u, f ∈ ℜ^n, g ∈ ℜ^m, A is a symmetric and positive definite (spd) matrix of order n × n, and B has order m × n, m ≤ n. For the existence of a solution, if m = n we must assume that dim ℜ(B) < m, where ℜ(B) denotes the range of B. The corresponding Lagrangian function with multiplier p and regularization term \(-\tfrac{\alpha }{2}\,{\mathbf{p}}^{T}C\mathbf{p}\), where α is a small positive number and C is spd, takes the form
$$L(\mathbf{u},\mathbf{p}) = \tfrac{1} {2}\,{\mathbf{u}}^{T}A\mathbf{u} -{\mathbf{f}}^{T}\mathbf{u} +{ \mathbf{p}}^{T}(B\mathbf{u} -\mathbf{g}) -\tfrac{\alpha } {2}\,{\mathbf{p}}^{T}C\mathbf{p}.$$
By the addition of the regularization term, the Lagrange multiplier vector p becomes unique.
The necessary first-order conditions for an optimal, saddle point solution lead to
$$\left [\begin{array}{cc} A & {B}^{T}\\ B & -\alpha C \end{array} \right ]\left [\begin{array}{c} \mathbf{u}\\ \mathbf{p} \end{array} \right ] = \left [\begin{array}{c} \mathbf{f}\\ \mathbf{g} \end{array} \right ].\quad (4)$$
Here, we can extend the matrix B with n − m zero rows and the vector g with n − m zero components, to make B of the same order as A. Similarly, C is extended. It is possible to let C = A. (Then the n − m correspondingly added components of p become zero.) As we shall see, in optimal control problems with a distributed control, we get such a form with no need to add zero rows to B.
If we change the sign of p, the corresponding matrix takes the form \(\left [\begin{array}{cc} A& - {B}^{T} \\ B & A \end{array} \right ]\), i.e. the same form as in (1). The matrix in (4) is indefinite. It can be preconditioned with a block-diagonal matrix, but this leads to eigenvalues on both sides of the origin, which slows down the convergence of the corresponding iterative acceleration method, typically of conjugate gradient type, such as MINRES [10]. In this paper we show that much faster convergence can be achieved if instead we precondition \(\mathcal{A}\) with a matrix that is a particular perturbation of it, since this leads to positive eigenvalues, and no Schur complements need to be handled. We consider then preconditioning of matrices of the form (2) or (3). Thereby we assume that A is symmetric and positive definite, or at least positive semidefinite with \(\ker (A) \cap \ker (B) =\{ \mathbf{0}\}\), which will be shown to guarantee that \(\mathcal{A}\) is nonsingular.
In Sect. 2 we present a preconditioner to this matrix, but given in the still more general form
$$\mathcal{A} = \left [\begin{array}{cc} A & - aB_{2}\\ bB_{1} & A \end{array} \right ],\quad (5)$$
where it is assumed that \(H_{i} = A + \sqrt{ab}B_{i},\,i = 1,2\) are nonsingular. The preconditioner involves only solutions with the matrices H 1 and H 2; hence, no Schur complements need to be handled here.
In Sect. 3 we perform an eigenvalue analysis of the preconditioning method. This result extends the applicability of the previous results, e.g. in [2] and [4]. Furthermore, the present proofs are sharper and more condensed.
In Sect. 4 we show that certain constrained optimal control problems for PDE with a distributed control can be written in the above two-by-two block form. The results in that section extend related presentations in [7].
Further development of the methods, together with numerical tests, will be presented in Part II of this paper.
The notation A ≤ B for symmetric matrices A, B means that B − A is positive semidefinite.
2 The Preconditioner and Its Implementation
Given a matrix in the form (2), we consider first a preconditioner to \(\mathcal{A}\) in the form
$$\mathcal{B} = \left [\begin{array}{cc} A & - {B}^{T}\\ {\beta }^{2}B &\tilde{{\alpha }}^{2}A +\tilde{\alpha }\beta (B + {B}^{T}) \end{array} \right ],\quad (6)$$
where \(\tilde{\alpha }\) is a positive preconditioning parameter to be chosen. A computation shows that
$$\mathcal{B} = \left [\begin{array}{cc} I & 0\\ {\beta }^{2}B{A}^{-1} & I \end{array} \right ]\left [\begin{array}{cc} A & - {B}^{T}\\ 0 & (\tilde{\alpha }A +\beta B){A}^{-1}(\tilde{\alpha }A +\beta {B}^{T}) \end{array} \right ],$$
that is, the Schur complement of \(\mathcal{B}\) factors into the product \((\tilde{\alpha }A +\beta B){A}^{-1}(\tilde{\alpha }A +\beta {B}^{T})\).
We show now that an action of its inverse requires little computational work.
Proposition 1.
An action of the inverse of the matrix \(\mathcal{B}\) in ( 6) requires one solution with each of the matrices \(A\), \(\tilde{\alpha }A +\beta B\), and \(\tilde{\alpha }A +\beta {B}^{T}\), in this order.
Proof.
To solve a system
$$\mathcal{B}\left [\begin{array}{c} \mathbf{x}\\ \mathbf{y} \end{array} \right ] = \left [\begin{array}{c} \mathbf{f}_{1}\\ \mathbf{f}_{2} \end{array} \right ],$$
solve first
$$A\mathbf{h} = \mathbf{f}_{1},\qquad (\tilde{\alpha }A +\beta B)\mathbf{w} = \mathbf{f}_{2} -{\beta }^{2}B\mathbf{h},$$
which requires a solution with A and with \(\tilde{\alpha }A +\beta B\). Solve then
$$(\tilde{\alpha }A +\beta {B}^{T})\mathbf{y} = A\mathbf{w},$$
to finally obtain
$$\mathbf{x} = \mathbf{h} + \frac{1} {\beta }\,(\mathbf{w} -\tilde{\alpha }\mathbf{y}).$$
A direct computation shows that these vectors solve the given system.
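The procedure can be checked numerically. The second diagonal block of \(\mathcal{B}\) assembled below, \(\tilde{{\alpha }}^{2}A +\tilde{\alpha }\beta (B + {B}^{T})\), is our reading of the form (6); all data are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
G = rng.standard_normal((n, n))
A = G @ G.T + n * np.eye(n)                     # spd A
B = rng.standard_normal((n, n))
beta, at = 0.7, 1.3                             # beta and alpha-tilde (illustrative)
f1, f2 = rng.standard_normal(n), rng.standard_normal(n)

# One solve with each of A, at*A + beta*B and at*A + beta*B.T, in this order:
h = np.linalg.solve(A, f1)
w = np.linalg.solve(at * A + beta * B, f2 - beta**2 * (B @ h))
y = np.linalg.solve(at * A + beta * B.T, A @ w)
x = h + (w - at * y) / beta

# Check against the assembled preconditioner (assumed block form):
Bprec = np.block([[A, -B.T],
                  [beta**2 * B, at**2 * A + at * beta * (B + B.T)]])
assert np.allclose(Bprec @ np.concatenate([x, y]), np.concatenate([f1, f2]))
```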
In applications, often A is a mass matrix and B is a stiffness matrix. When A depends on heterogeneous material coefficients, the matrices \(\tilde{\alpha }A +\beta B\) and \(\tilde{\alpha }A +\beta {B}^{T}\) can be better conditioned than A. We show now that by applying the explicit expression for \({\mathcal{B}}^{-1}\), the separate solution with A in (6) can be avoided.
We find it convenient to show this first for preconditioners \(\mathcal{B}\) applied to the matrix \(\mathcal{A}\) in the form (3). Here,
$$\mathcal{B} = \left [\begin{array}{cc} A & - a{B}^{T}\\ bB & A + \sqrt{ab}(B + {B}^{T}) \end{array} \right ].$$
For its inverse the following proposition holds. For its proof, we assume first that A is spd.
Proposition 2.
Let A be spd. Then
$${\mathcal{B}}^{-1} = \left [\begin{array}{cc} {H}^{-1} + {H}^{-T}(I - A{H}^{-1})&\quad \sqrt{\frac{a} {b}}\,\big({H}^{-1} - {H}^{-T}A{H}^{-1}\big)\\ -\sqrt{\frac{b} {a}}\,{H}^{-T}(I - A{H}^{-1}) &\quad {H}^{-T}A{H}^{-1} \end{array} \right ],$$
where \(H = A + \sqrt{ab}\,B\), which is assumed to be nonsingular, and \({H}^{-T} ={ ({H}^{T})}^{-1}\).
Proof.
For the derivation of the expression for the inverse we use the form of the inverse of a general matrix in two-by-two block form. (Clearly, we can also verify the correctness of the expression directly by computing the matrix times its inverse; an alternative derivation can be based on the Schur–Banachiewicz form of the inverse.) Assume that \(A_{ii}\), i = 1, 2 are nonsingular. Then
$${\left [\begin{array}{cc} A_{11} & A_{12}\\ A_{21} & A_{22} \end{array} \right ]}^{-1} = \left [\begin{array}{cc} S_{1}^{-1} & \quad - A_{11}^{-1}A_{12}S_{2}^{-1}\\ - S_{2}^{-1}A_{21}A_{11}^{-1} & \quad S_{2}^{-1} \end{array} \right ].$$
Here, the Schur complements \(S_{i}\), i = 1, 2 equal
$$S_{1} = A_{11} - A_{12}A_{22}^{-1}A_{21},\qquad S_{2} = A_{22} - A_{21}A_{11}^{-1}A_{12}.$$
Further, \(S_{2}^{-1}A_{21}A_{11}^{-1} = A_{22}^{-1}A_{21}S_{1}^{-1}\).
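This identity is easy to check numerically, taking \(S_{1} = A_{11} - A_{12}A_{22}^{-1}A_{21}\) and \(S_{2} = A_{22} - A_{21}A_{11}^{-1}A_{12}\) (the numbering that makes the identity hold); the data are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
A11 = rng.standard_normal((n, n)) + n * np.eye(n)
A12 = rng.standard_normal((n, n))
A21 = rng.standard_normal((n, n))
A22 = rng.standard_normal((n, n)) + n * np.eye(n)

S1 = A11 - A12 @ np.linalg.solve(A22, A21)   # Schur complement pivoted on A22
S2 = A22 - A21 @ np.linalg.solve(A11, A12)   # Schur complement pivoted on A11

lhs = np.linalg.solve(S2, A21 @ np.linalg.inv(A11))   # S2^{-1} A21 A11^{-1}
rhs = np.linalg.inv(A22) @ A21 @ np.linalg.inv(S1)    # A22^{-1} A21 S1^{-1}
assert np.allclose(lhs, rhs)
```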
For the given matrix it holds
Further,
Similarly,
Finally, since the pivot block in the inverse matrix equals the inverse of the Schur complement, the corresponding equality holds for the pivot block in the matrix itself, that is,
Therefore,
Remark 1.
Incidentally, relation (8) can be seen as a proof of the familiar Sherman–Morrison–Woodbury formula.
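For completeness, a quick numerical check of the Sherman–Morrison–Woodbury formula \({(A + UCV )}^{-1} = {A}^{-1} - {A}^{-1}U{({C}^{-1} + V {A}^{-1}U)}^{-1}V {A}^{-1}\), with illustrative random data:

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 6, 2
A = rng.standard_normal((n, n)) + n * np.eye(n)
U = rng.standard_normal((n, k))
C = rng.standard_normal((k, k)) + k * np.eye(k)
V = rng.standard_normal((k, n))

# (A + U C V)^{-1} = A^{-1} - A^{-1} U (C^{-1} + V A^{-1} U)^{-1} V A^{-1}
Ainv = np.linalg.inv(A)
correction = Ainv @ U @ np.linalg.solve(np.linalg.inv(C) + V @ Ainv @ U, V @ Ainv)
assert np.allclose(Ainv - correction, np.linalg.inv(A + U @ C @ V))
```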
We show now that Proposition 2 implies that an action of the matrix \({\mathcal{B}}^{-1}\) needs only a solution with each of the matrices H and H T. This result has appeared previously in [4], but the present proof is more condensed and more generally applicable. We will then show it for a matrix in the general form (5).
Guided by the result in Proposition 2, we give now the expression for the inverse of the preconditioner to a matrix in the form (5).
Proposition 3.
Let
$$\mathcal{B} = \left [\begin{array}{cc} A & \quad - aB_{2}\\ bB_{1} & \quad A + \sqrt{ab}(B_{1} + B_{2}) \end{array} \right ];$$
then
$${\mathcal{B}}^{-1} = \left [\begin{array}{cc} H_{1}^{-1} + H_{2}^{-1}(I - AH_{1}^{-1})&\quad \sqrt{\frac{a} {b}}\,\big(H_{1}^{-1} - H_{2}^{-1}AH_{1}^{-1}\big)\\ -\sqrt{\frac{b} {a}}\,H_{2}^{-1}(I - AH_{1}^{-1}) &\quad H_{2}^{-1}AH_{1}^{-1} \end{array} \right ],$$
where \(H_{i} = A + \sqrt{ab}B_{i},\,i = 1,2,\) which are assumed to be nonsingular.
Proof.
We show first that \(\mathcal{B}\) is nonsingular. If
$$\mathcal{B}\left [\begin{array}{c} \mathbf{x}\\ \mathbf{y} \end{array} \right ] = \left [\begin{array}{c} \mathbf{0}\\ \mathbf{0} \end{array} \right ],\quad (9)$$
then \(A\mathbf{x} = aB_{2}\mathbf{y}\) and
$$bB_{1}\mathbf{x} + \big(A + \sqrt{ab}(B_{1} + B_{2})\big)\mathbf{y} = \mathbf{0}.$$
Then, using \(\sqrt{ab}\,B_{2}\mathbf{y} = \sqrt{\frac{b} {a}}\,A\mathbf{x}\),
$$bB_{1}\mathbf{x} + \sqrt{ab}\,B_{1}\mathbf{y} + A\mathbf{y} + \sqrt{\frac{b} {a}}\,A\mathbf{x} = \mathbf{0},$$
or
$$\big(A + \sqrt{ab}\,B_{1}\big)\left (\mathbf{y} + \sqrt{\frac{b} {a}}\,\mathbf{x}\right ) = \mathbf{0}.$$
Hence, \(\mathbf{x} = -\sqrt{\frac{a} {b}}\mathbf{y},\) so \(\sqrt{\frac{a} {b}}(A + \sqrt{ab}B_{2})\mathbf{y} = 0\) or y = 0, so (9) has only the trivial solution. The expression for \({\mathcal{B}}^{-1}\) follows by direct inspection.
Proposition 4.
Assume that \(A + \sqrt{ab}B_{i},i = 1,2\) are nonsingular. Then \(\mathcal{B}\) is nonsingular and a linear system with the preconditioner \(\mathcal{B}\),
can be solved with only one solution with \(A + \sqrt{ab}B_{1}\) and one with \(A + \sqrt{ab}B_{2}\).
Proof.
It follows from Proposition 3 that an action of the inverse of \(\mathcal{B}\) can be written in the form
$$\left [\begin{array}{c} \mathbf{x}\\ \mathbf{y} \end{array} \right ] = \left [\begin{array}{c} \mathbf{g} + \mathbf{h}\\ -\sqrt{\frac{b} {a}}\,\mathbf{h} \end{array} \right ],$$
where
$$\mathbf{g} = H_{1}^{-1}\left (\mathbf{f}_{1} + \sqrt{\frac{a} {b}}\,\mathbf{f}_{2}\right ),\qquad \mathbf{h} = H_{2}^{-1}(\mathbf{f}_{1} - A\mathbf{g}).$$
The computation can take place in the following order:
-
(i)
Solve \(H_{1}\mathbf{g} = \mathbf{f}_{1} + \sqrt{\frac{a} {b}}\mathbf{f}_{2}\).
-
(ii)
Compute \(A\mathbf{g}\) and f 1 − A g.
-
(iii)
Solve \(H_{2}\mathbf{h} = \mathbf{f}_{1} - A\mathbf{g}\).
-
(iv)
Compute \(\mathbf{x} = \mathbf{g} + \mathbf{h}\) and \(\mathbf{y} = -\sqrt{ \frac{b} {a}}\mathbf{h}\).
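The four steps can be verified numerically; the block form of \(\mathcal{B}\) assembled below is the one implied by these solve steps (here with \(B_{2} = B_{1}^{T}\)), and the data are illustrative:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5
G = rng.standard_normal((n, n))
A = G @ G.T + n * np.eye(n)            # spd A
B1 = rng.standard_normal((n, n))
B2 = B1.T                              # the case B1 = B, B2 = B^T
a, b = 0.5, 2.0
s = np.sqrt(a * b)
H1, H2 = A + s * B1, A + s * B2
f1, f2 = rng.standard_normal(n), rng.standard_normal(n)

g = np.linalg.solve(H1, f1 + np.sqrt(a / b) * f2)   # (i)
r = f1 - A @ g                                      # (ii)
h = np.linalg.solve(H2, r)                          # (iii)
x, y = g + h, -np.sqrt(b / a) * h                   # (iv)

# The block form of the preconditioner implied by these steps:
Bprec = np.block([[A, -a * B2], [b * B1, A + s * (B1 + B2)]])
assert np.allclose(Bprec @ np.concatenate([x, y]), np.concatenate([f1, f2]))
```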
Remark 2.
In some applications \(H_{1} = A + \sqrt{ab}B_{1}\) and \(H_{2} = A + \sqrt{ab}B_{2}\) may be better conditioned than A itself. Even if they are not, efficient software for such matrix combinations often already exists.
3 Condition Number Bounds
To derive condition number bounds for the preconditioned matrix \({\mathcal{B}}^{-1}\mathcal{A}\), we consider two cases:
- (i) \(B_{1} = B,\,B_{2} = {B}^{T}\), A is symmetric, A and B + B^{T} are positive semidefinite, and
$$\ker (A) \cap \ker (B_{i}) = \{\mathbf{0}\},\quad i = 1,2;$$
- (ii) A is symmetric and positive definite and certain conditions, to be specified later, hold for B 1 and B 2.
3.1 A Is Symmetric and Positive Semidefinite
Assume that conditions (i) hold. Then it follows that \(A + \sqrt{ab}B\) and \(A + \sqrt{ab}{B}^{T},\) and hence also \(\mathcal{B},\) are nonsingular. We show first that then \(\mathcal{A}\) is also nonsingular.
Proposition 5.
Let condition (i) hold. Then \(\mathcal{A}\) is nonsingular.
Proof.
If
$$\left [\begin{array}{cc} A &a{B}^{T}\\ - bB & A \end{array} \right ]\left [\begin{array}{c} \mathbf{x}\\ \mathbf{y} \end{array} \right ] = \left [\begin{array}{c} \mathbf{0}\\ \mathbf{0} \end{array} \right ],$$
then
$$\frac{1} {a}{\mathbf{x}}^{{\ast}}A\mathbf{x} +{ \mathbf{x}}^{{\ast}}{B}^{T}\mathbf{y} -{\mathbf{y}}^{{\ast}}B\mathbf{x} + \frac{1} {b}{\mathbf{y}}^{{\ast}}A\mathbf{y} = 0,$$
so, taking real parts, \(\frac{1} {a}{\mathbf{x}}^{{\ast}}A\mathbf{x} + \frac{1} {b}{\mathbf{y}}^{{\ast}}A\mathbf{y} = 0\), where \({\mathbf{x}}^{{\ast}},{\mathbf{y}}^{{\ast}}\) denote the complex conjugate transposed vectors.
Since A is positive semidefinite, it follows that \(\mathbf{x},\mathbf{y} \in \ker A\). But then \({B}^{T}\mathbf{y} = \mathbf{0}\) and \(B\mathbf{x} = \mathbf{0}\), implying that \(\mathbf{x},\mathbf{y} \in \ker B\), so \(\mathcal{A}\left [\begin{array}{c} \mathbf{x}\\ \mathbf{y} \end{array} \right ] = \left [\begin{array}{c} \mathbf{0}\\ \mathbf{0} \end{array} \right ]\) has only the trivial solution.
Proposition 6.
Let \(\mathcal{A} = \left [\begin{array}{cc} A &a{B}^{T} \\ - bB & A \end{array} \right ]\) , where a,b are nonzero and have the same sign and let \(\mathcal{B} = \left [\begin{array}{cc} A & a{B}^{T} \\ - bB &A + \sqrt{ab}(B + {B}^{T}) \end{array} \right ]\) . If conditions (i) hold, then the eigenvalues of \({\mathcal{B}}^{-1}\mathcal{A}\) are contained in the interval \([\frac{1} {2},1].\)
Proof.
For the generalized eigenvalue problem
it follows from Proposition 5 that \(\lambda \not =0\). It holds
Here, λ = 1 if \(\mathbf{y} \in \ker (B + {B}^{T})\). If λ ≠ 1, then
and
or
Since both A and B + B T are positive semidefinite, it follows that λ ≤ 1.
Further it holds,
so
or
Since B + B T is positive semidefinite, \(\mid \mathbf{x}\mid + \mid \mathbf{y}\mid \not =0\), and a and b have the same sign, it follows that
that is, \(\lambda \geq \frac{1} {2}.\)
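A numerical illustration of Proposition 6, with random data satisfying conditions (i) (the sizes and the coefficients a, b are illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 6
G = rng.standard_normal((n, n))
A = G @ G.T + np.eye(n)                 # symmetric positive definite A
C = rng.standard_normal((n, n))
shift = -np.linalg.eigvalsh((C + C.T) / 2).min() + 1.0
B = C + shift * np.eye(n)               # nonsymmetric, B + B^T positive definite
a, b = 0.5, 2.0
s = np.sqrt(a * b)

Amat = np.block([[A, a * B.T], [-b * B, A]])
Bmat = np.block([[A, a * B.T], [-b * B, A + s * (B + B.T)]])
lam = np.linalg.eigvals(np.linalg.solve(Bmat, Amat))
assert np.abs(lam.imag).max() < 1e-6          # eigenvalues are (numerically) real
assert lam.real.min() > 0.5 - 1e-6 and lam.real.max() < 1 + 1e-6
```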
3.2 A Is Symmetric and Positive Definite
Assume now that A is symmetric and positive definite. Let \(\mathcal{A}\) be defined in (5) and let \(\tilde{B}_{i} = \sqrt{ab}\,{A}^{-1/2}B_{i}{A}^{-1/2},\,i = 1,2.\) Assume that the eigenvalues of the generalized eigenvalue problem,
are real and μ max ≥ μ ≥ μ min > − 1.
Proposition 7.
Let \(\mathcal{A}\) be defined in (5) , let \(\tilde{B}_{i} = \sqrt{ab}\,{A}^{-1/2}B_{i}{A}^{-1/2},\,i = 1,2\) , and assume that \(\tilde{B}_{1} +\tilde{ B}_{2}\) is spd and ( 10) holds. Then the eigenvalues of \({\mathcal{B}}^{-1}\mathcal{A}\) are contained in the interval \(\left [ \frac{1} {1+\mu _{\max }}, \frac{1} {1+\mu _{\min }}\right ].\)
Proof.
\(\lambda \mathcal{B}\,\left [\begin{array}{c} \mathbf{x}\\ \mathbf{y} \end{array} \right ] = \mathcal{A}\,\left [\begin{array}{c} \mathbf{x}\\ \mathbf{y} \end{array} \right ]\) implies
Hence, a block-diagonal transformation with \(\left [\begin{array}{cc} {A}^{-1/2} & 0 \\ 0 &{A}^{-1/2} \end{array} \right ]\) shows that
where \(\tilde{\mathbf{x}} = {A}^{1/2}\mathbf{x},\tilde{\mathbf{y}} = {A}^{1/2}\mathbf{y}.\)
If \(\lambda \not =1,\) then
Hence, by (10),
which implies the stated eigenvalue bounds.
Corollary 1.
If \(B_{1} = B,\,B_{2} = {B}^{T}\) , and \(I +\tilde{ B}\) is nonsingular, then
where μ min > −1. If the symmetric part of B is positive semidefinite, then
Proof.
Since
it follows that
which implies μ ≤ 1 in (10). Similarly,
that is,
implies μ min ≥ − 1. But μ min > − 1 since \(I +\tilde{ B},\) and hence \(I +\tilde{ {B}}^{T},\) are nonsingular. If B + B T ≥ 0, then μ min = 0.
Corollary 2.
If \(B_{1} = B,\,B_{2} = B -\delta /\sqrt{ab}\,A\) for some real number δ, where B is spd and \(2B >\delta /\sqrt{ab}\,A,\) that is, \(B_{1} + B_{2} = 2B -\delta /\sqrt{ab}\,A\) is spd, then
Proof.
Here, (10) takes the form
where \(\tilde{B} = \sqrt{ab}\,{A}^{-1/2}B{A}^{-1/2}.\) Let β be an eigenvalue of \(\tilde{B}.\)
Then
Since δ < 2β, it follows that μ > 0, that is, λ ≤ 1. Further, a computation shows that μ takes its largest value when
or
that is when
Then \(\mu = 2/\sqrt{4 {-\delta }^{2}}\) and the statement follows from \(\lambda = 1/(1+\mu ).\)
Remark 3.
Matrices in the form as given in Corollary 2 appear in phase-field models; see, e.g. [4, 5]. For complex-valued systems, normally the coefficients are \(a = b = 1\). In other applications, such as those in Sects. 4.1 and 4.2, a form such as in Proposition 1 arises. One can readily transform from one form into the other.
Propositions 6 and 7 show that if A is spd and B + B^{T} is positive semidefinite, then the condition number of the preconditioned matrix satisfies \(\kappa ({\mathcal{B}}^{-1}\mathcal{A}) \leq 2\).
Using a preconditioning parameter, as in (6), we derive now a further-improved condition number bound under the assumption that matrix B is symmetric. We consider then the form (2) of matrix \(\mathcal{A}.\)
Proposition 8.
Let \(\mathcal{A} = \left [\begin{array}{cc} A & - {B}^{T} \\ {\beta }^{2}B &{ \alpha }^{2}A \end{array} \right ]\) , where α > 0, β > 0, and let \(\mathcal{B}\) be defined in ( 6) . Assume that A and B are symmetric and that A is positive definite. Let \(\tilde{B} =\beta {A}^{-1/2}B{A}^{-1/2}\) and assume that \(\tilde{B}\) has eigenvalues μ in the interval \(\left [\mu _{\min },\mu _{\max }\right ]\) , where \(0 \leq \mid \mu _{\min }\mid <\mu _{\max }\) , and that \(\frac{\tilde{\alpha }}{\alpha } = \vert \tilde{\mu }_{\min }\vert + \sqrt{1 +\tilde{\mu }_{ \min }^{2}}\) where \(\tilde{\mu }_{\min } =\mu _{\min }/\alpha,\,\tilde{\mu }_{\max } =\mu _{\max }/\alpha\) . Then the eigenvalues of \({\mathcal{B}}^{-1}\,\mathcal{A}\) satisfy
For its condition number it holds
where \(\delta = \mid \mu _{\min }\mid /\mu _{\max }\) and \(\gamma = \sqrt{(1 +\tilde{\mu }_{ \min }^{2 })/(1 +\tilde{\mu }_{ \max }^{2 })}.\) Here it holds
If B is positive semidefinite, then
where the upper bound is taken for
Proof.
Since both \(\mathcal{A}\) and \(\mathcal{B}\) are nonsingular, the eigenvalues λ of the generalized eigenvalue problem,
are nonzero. Using (2) and (6), we find
If y = 0, then for all \(\mathbf{x}\not =\mathbf{0}\) it follows that λ = 1. For \(\lambda \not =1\), it follows that A x = B T y and, since A is spd,
or
where \(\tilde{B} =\beta {A}^{-1/2}B{A}^{-1/2}\) and \(\tilde{\mathbf{y}} = {A}^{1/2}\mathbf{y}\). Since \(\tilde{B}\) is symmetric, if \(\tilde{B}\tilde{\mathbf{y}} =\mu \tilde{ \mathbf{y}},\,\tilde{\mathbf{y}}\not =0,\) i.e. μ is an eigenvalue of \(\tilde{B},\) it follows that μ is real and
$$\lambda = \frac{{\alpha }^{2} {+\mu }^{2}} {{(\tilde{\alpha }+\mu )}^{2}}.$$
The eigenvalues vary as indicated in Fig. 1.
Consider first the case where there exists negative eigenvalues. To get λ < 1 for negative values of μ, we must choose \({(\tilde{\alpha }+\mu )}^{2} {>\alpha }^{2} {+\mu }^{2},\) i.e. \({\tilde{\alpha }}^{2} + 2\tilde{\alpha }\mu {-\alpha }^{2} > 0\), that is,
The minimum value of λ(μ) can be found from
$$\frac{d\lambda } {d\mu } = 0,$$
that is,
$$\mu = \frac{{\alpha }^{2}} {\tilde{\alpha }},\qquad \text{for which}\quad \lambda = \frac{{\alpha }^{2}} {{\alpha }^{2} +\tilde{ {\alpha }}^{2}}.$$
To minimize the condition number, it can be seen (cf. Fig. 1) that we must choose \(\tilde{\alpha }\) such that
that is,
or
Here γ < 1, since by assumption \(\mu _{\max } > \mid \mu _{\min }\mid\). Hence,
Then
It holds
Hence,
If B is positive semidefinite, then we let μ min = 0 so \(\delta = 0,\gamma = 1/\sqrt{1 +\tilde{\mu }_{ \max }^{2}}\) and
which is taken for
Remark 4.
If μ min = 0 then \(\kappa ({\mathcal{B}}^{-1}\mathcal{A}) < 2\) and if μ max → ∞ then \(\tilde{\alpha }\rightarrow \alpha\) and \(\kappa ({\mathcal{B}}^{-1}\mathcal{A}) \rightarrow 2\). If \(\tilde{\mu }_{\max } = 1\) then \(\tilde{\alpha }/\alpha = 1 + \sqrt{2}\) and
Remark 5.
As is well known, when eigenvalue bounds of a preconditioned matrix, as in the case with \({\mathcal{B}}^{-1}\mathcal{A}\), are known, then one can replace the conjugate gradient (CG) with a Chebyshev acceleration method. This can be important, for instance, if one uses some domain decomposition method for massively parallel computations, as it avoids the global communication of inner products used in CG methods.
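A minimal sketch of the Chebyshev iteration for a system whose spectrum is known to lie in [1∕2, 1]; for simplicity a diagonal stand-in for the preconditioned matrix is used, and the recurrence follows the standard second-order Chebyshev form (no inner products are needed):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 50
M = np.diag(rng.uniform(0.5, 1.0, n))   # stand-in matrix, spectrum in [1/2, 1]
b = rng.standard_normal(n)

lmin, lmax = 0.5, 1.0                   # the known eigenvalue bounds
theta, delta = (lmax + lmin) / 2, (lmax - lmin) / 2
sigma1 = theta / delta
x = np.zeros(n)
r = b - M @ x
rho = 1 / sigma1
d = r / theta
for _ in range(30):                     # Chebyshev iteration
    x = x + d
    r = r - M @ d
    rho_next = 1 / (2 * sigma1 - rho)
    d = rho_next * rho * d + (2 * rho_next / delta) * r
    rho = rho_next

assert np.linalg.norm(M @ x - b) < 1e-10 * np.linalg.norm(b)
```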
4 Distributed Optimal Control of Elliptic and Oseen Equations
Let Ω be a bounded domain in ℜ d, d = 1, 2 or 3, and let ∂ Ω be its boundary which is assumed to be sufficiently smooth. Let L 2(Ω), H 1(Ω) and H 0 1(Ω) denote the standard Lebesgue and Sobolev spaces of functions in Ω, where H 0 1(Ω) denotes functions with homogeneous Dirichlet boundary values at \(\Gamma _{0} \subset \partial \Omega \) where Γ 0 has a nonzero measure. Further, let ( ⋅, ⋅) and ∥ ⋅ ∥ denote the inner product and norm, respectively, in L 2(Ω), both for scalar and vector functions. Extending, but following [7], and based on [6], we consider now two optimal control problems. In [7] a block-diagonal preconditioner is used. Here we apply instead the preconditioner presented in Sect. 2.
4.1 An Elliptic State Equation
The problem is to find the state u ∈ H 0 1(Ω) and the control function y ∈ L 2(Ω) that minimizes the cost function
subject to the state equation
Here b is a given, smooth vector. For simplicity, assume that \(\mathbf{b} \cdot \mathbf{n}\mid _{\Gamma _{1}} = 0\). Further, u d denotes a given, desired state (possibly obtained by measurements at some discrete points and then interpolated to the whole of Ω). The forcing term y acts as a control of the solution to the state equation. By including the control in the cost functional, the problem becomes well posed. The regularization parameter α > 0, chosen a priori, should be small enough that the computed state is close to the desired state, but not so small that the problem becomes ill conditioned. This is similar to the familiar Tikhonov regularization. The variational (weak) formulation of (11) reads
The Lagrangian formulation associated with the optimization problem takes the form
where p ∈ H 0 1(Ω) is the Lagrange multiplier corresponding to the constraint (12). The weak formulation of the corresponding first-order necessary conditions,
gives now the system of optimality equations:
which defines the solution \((u,y) \in H_{0}^{1}(\Omega ) \times {L}^{2}(\Omega )\) of the optimal control problem with Lagrange multiplier p ∈ H 0 1(Ω). From the second equation, it follows that the control function y is related to the Lagrange multiplier as \(y = \frac{1} {\alpha } p\). Eliminating y and applying the divergence theorem, this leads to the reduced system
Since the problem is regularized, we may here use equal-order finite element approximations, for instance, piecewise linear basis functions on a triangular mesh (in 2D), for both the state variable u and the co-state variable p. This leads to a system with a matrix of the form
$$\left [\begin{array}{cc} M & {K}^{T}\\ K & -\frac{1} {\alpha }M \end{array} \right ],$$
where index h denotes the corresponding mesh parameter. Here M corresponds to a mass matrix and K, which has the same order as M, to the second-order elliptic operator with a first-order advection term.
By a change of sign of p h , it can be put in the form (2), with A = M, B = K, β = 1 and the coefficient \({\alpha }^{2}\) in (2) replaced by \(\frac{1} {\alpha }\):
$$\left [\begin{array}{cc} M & - {K}^{T}\\ K & \frac{1} {\alpha }M \end{array} \right ],$$
and we can directly apply the preconditioner from Sects. 2 and 3, and the derived spectral condition number bounds. If
i.e. if the operator is semi-coercive, then K + K T is positive semidefinite and it follows from Proposition 6 that the corresponding spectral condition number is bounded by 2, with eigenvalues in the interval 1 ∕ 2 ≤ λ ≤ 1.
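A one-dimensional sketch of this situation, with M a finite element mass matrix and K a convection–diffusion matrix, so that K + K^{T} is positive semidefinite; the mesh size, advection speed, regularization parameter, and the choice a = b = α are all illustrative:

```python
import numpy as np

n = 20
h = 1.0 / (n + 1)
e = np.ones(n - 1)
M = h / 6 * (4 * np.eye(n) + np.diag(e, 1) + np.diag(e, -1))   # 1D FEM mass matrix
K = (1 / h) * (2 * np.eye(n) - np.diag(e, 1) - np.diag(e, -1)) \
    + (10.0 / 2) * (np.diag(e, 1) - np.diag(e, -1))            # -u'' + 10 u'
alpha = 1e-2                                                   # illustrative
a = b = alpha
s = np.sqrt(a * b)

Amat = np.block([[M, a * K.T], [-b * K, M]])
Bmat = np.block([[M, a * K.T], [-b * K, M + s * (K + K.T)]])
lam = np.linalg.eigvals(np.linalg.solve(Bmat, Amat))
# Spectral bounds from Proposition 6: eigenvalues in [1/2, 1].
assert lam.real.min() > 0.5 - 1e-6 and lam.real.max() < 1 + 1e-6
```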
Remark 6.
In [7], a block-diagonal preconditioner,
is used for the saddle point matrix
where B = B T and A is symmetric and positive semidefinite, and \(\ker (A) \cap \ker (B) =\{ 0\}\), so \(A {+\alpha }^{1/2}B\) is symmetric and positive definite.
By assumptions made, from the generalized eigenvalue problem
it follows that here μ ∈ [0, 1] and it follows further readily that the preconditioned matrix \({\mathcal{D}}^{-1}\mathcal{A}\) has eigenvalues that satisfy
that is, \(1/\sqrt{2} \leq \mid \lambda \mid \leq 1\). Hence, the eigenvalues are located in the double interval \([-1,\,-1/\sqrt{2}\,] \cup [1/\sqrt{2},\,1]\).
For such eigenvalues in intervals on both sides of the origin, an iterative method of conjugate gradient type, such as MINRES, typically needs twice the number of iterations to reach convergence, compared with eigenvalues in a single interval on one (positive) side of the origin; see e.g. [11]. This can be seen from the polynomial approximation problem
$$\min _{P_{k}\in \pi _{k}^{0}}\ \max _{\lambda \in S}\,\vert P_{k}(\lambda )\vert,$$
where S denotes the eigenvalue set and \(\pi _{k}^{0}\) denotes the set of polynomials of degree k, normalized at the origin, i.e. \(P_{k}(0) = 1\).
Since the number of iterations increases as \(O(\sqrt{\kappa })\), where \(\kappa = \mid \lambda _{\max }\mid /\mid \lambda _{\min }\mid\) is the condition number, it follows that an indefinite interval condition number \(\kappa = \sqrt{2}\) typically corresponds to a one-sided condition number of \(4\sqrt{2}\).
The method proposed in the present paper has a condition number bounded by 2 and therefore needs about \(\frac{\sqrt{2}} {{2}^{5/4}} = \frac{{2}^{1/4}} {2} \simeq 0.6\) times as many iterations as a corresponding block-diagonal preconditioner. However, even if the block-diagonal preconditioning method requires more iterations, each iteration may be cheaper than in the method proposed in this paper. An actual comparison of the methods will appear in Part II.
4.2 Distributed Optimal Control of the Oseen Problem
In [7], Stokes equation is considered. Here, we extend the method to the Oseen equation and consider the velocity tracking problem for the stationary case, which reads as follows:
Find the velocity \(\mathbf{u} \in H_{0}^{1}{(\Omega )}^{d}\); the pressure p ∈ L 0 2(Ω), where \(L_{0}^{2}(\Omega ) =\{ q \in {L}^{2}(\Omega ),\,\int _{\Omega }q\,dx = 0\}\); and the control function f, which minimize the cost function
subject to state equation for an incompressible fluid velocity u, such that
and boundary conditions u = 0 on ∂ Ω 1, u ⋅n = 0 on ∂ Ω 2 = ∂ Ω ∖ ∂ Ω 1, where n denotes the outward normal vector to the boundary ∂ Ω.
Here u d is the desired solution and α > 0 is a regularization parameter, used to penalize too large values of the control function. Further, b is a given, smooth vector. For simplicity we assume that b = 0 on ∂ Ω 1 and b ⋅n = 0 on ∂ Ω 2.
In a Navier–Stokes problem, solved by a Picard iteration using the frozen coefficient framework, b equals the previous iterative approximation of u, in which case normally ∇ ⋅u = 0 in Ω. For simplicity, we assume that this holds here also, that is, ∇ ⋅b = 0.
The variational form of the state equation reads as follows:
The Lagrangian functional, corresponding to the optimization problem, is given by
where v is the Lagrange multiplier function for the state equation and q for its divergence constraint. Applying the divergence theorem, the divergence condition ∇ ⋅b = 0 and the boundary conditions, we can write
The five first-order necessary conditions for an optimal solution take then the form
Here u, p, f are the solutions of the optimal control problem with v, q as Lagrange multipliers for the state equation, and \(\tilde{\mathbf{u}},\tilde{\mathbf{v}},\tilde{p},\tilde{q},\tilde{\mathbf{f}}\) denote corresponding test functions.
As in the elliptic control problem, the control function f can be eliminated, \(\mathbf{f} {=\alpha }^{-1}\mathbf{v}\), resulting in the reduced system,
To discretize (14) we use an LBB-stable pair of finite element spaces for the pairs (u, v) and (p, q). In [7] the Taylor–Hood pair {Q2, Q2, Q1, Q1} is used, namely, piecewise biquadratic basis functions for u, v and piecewise bilinear basis functions for p, q on a quadrilateral mesh. The corresponding discrete system takes the form
where we have changed the sign of v. Here D comes from the divergence terms. Further, M is the mass matrix, and L + C is the discrete operator corresponding to the convection–diffusion term \(-\Delta \mathbf{u} + \mathbf{b} \cdot \nabla \mathbf{u}\), while \(-L + C\) corresponds to \(\Delta \mathbf{v} + \mathbf{b} \cdot \nabla \mathbf{v}\). Due to the use of an inf–sup (LBB)-stable pair of finite element spaces, the divergence matrix D has full rank.
As for saddle point problems of similar type, one can use either a grad–div stabilization or a div–grad stabilization. In the first case we add the matrix D T W − 1 D to M and \({\alpha }^{-1}{D}^{T}{W}^{-1}D\) to α − 1 M, respectively, possibly multiplied with some constant factor, where W is a weight matrix. If W is taken as the discrete Laplacian matrix, then D T W − 1 D becomes a projection operator onto the orthogonal complement of the solenoidal vectors.
The other type of stabilization consists of perturbing the zero block matrix in (15) by \(\varepsilon \left [\begin{array}{cc} \Delta & 0\\ 0 &\Delta \end{array} \right ]\), where \(\varepsilon\) is a small parameter, typically \(\varepsilon = O({h}^{2})\) with h being the space discretization parameter. In that case there is no need to use LBB-stable elements; see, e.g. [12] for more details. In the present paper, however, we use LBB-stable elements, so no additional regularization is needed; instead we consider the solution of the Schur complement matrix system:
This system can be solved by inner–outer iterations. To compute the residuals, we must then solve inner systems with the matrix \(\left [\begin{array}{cc} M & - L + C\\ L + C &{ \alpha }^{-1 } M \end{array} \right ]\), which takes place in the way discussed earlier in Sect. 2. To recall, only systems with \(M + \sqrt{\alpha }(L + C)\) and \(M + \sqrt{\alpha }(L - C)\) have to be solved. Further, as is seen from (16), the systems which actually arise have the form \(D{[M + \sqrt{\alpha }(L + C)]}^{-1}{D}^{T}\) and \(D{[M + \sqrt{\alpha }(L - C)]}^{-1}{D}^{T}\). At least when the convection term is not too large relative to the diffusion term, these systems are well conditioned and can be preconditioned with a mass matrix, or with a mass matrix minus a small multiple of the Laplacian.
To avoid the need to solve inner systems and for stronger convections, it may be better to use a block-triangular factorization of the matrix in (15). For the arising inner systems with \(M + \sqrt{\alpha }(L + C)\) and \(M + \sqrt{\alpha }(L - C)\), it can be efficient to use some off-the-shelf software, such as some algebraic multigrid (AMG) method; see [13, 14]. In [15] and [13] numerical tests are reported, showing that AGMG [13], as one choice of an AMG method, performs much better than some other possible methods.
The perturbations due to the use of inner iterations with stopping criteria lead in general to complex eigenvalues. A generalized conjugate gradient method of GMRES [16] type can then be used. Such methods go under different names and have been referred to as nonlinear conjugate gradient, variable preconditioned conjugate gradient [17], and flexible GMRES [18]. Since the accurate preconditioning keeps the number of iterations small, the additional cost of a full-length Krylov subspace, involving all previous search directions, is not much heavier than that of a conjugate gradient method with short recursions, whose vectors are orthogonal with respect to a proper inner product.
We remark, however, that such a method has been constructed for indefinite matrices in [19], based on an inner product defined by the matrix
$$\left [\begin{array}{cc} \hat{M} -\hat{M}_{0} & 0\\ 0 &S_{0} \end{array} \right ],$$
where \(\hat{M}_{0}\) is an approximation of \(\hat{M}\) such that \(\hat{M}_{0} <\hat{ M}\), and \(S_{0}\), with \(S_{0} <\hat{ B}\hat{{M}}^{-1}\hat{{B}}^{T}\), is an spd approximation of the Schur complement matrix for the two-by-two block system \(\left [\begin{array}{cc} \hat{M}&\hat{{B}}^{T} \\ \hat{B} & 0 \end{array} \right ].\) This makes the matrix \({\left [\begin{array}{cc} \hat{M}_{0} & 0 \\ \hat{B} & - S_{0} \end{array} \right ]}^{-1}\left [\begin{array}{cc} \hat{M}&\hat{{B}}^{T} \\ \hat{B} & 0 \end{array} \right ]\) self-adjoint with respect to that inner product. The drawback of the method is the need to scale the approximation \(\hat{M}_{0}\) properly so that \(\hat{M}_{0} <\hat{ M}\); furthermore, \(\hat{M}_{0}\) must be fixed, i.e. it cannot be defined implicitly via variable inner iterations.
In our case, the corresponding preconditioning matrix defined in Sect. 2 satisfies \(\hat{M}_{0} >\hat{ M}\), but there is no need to scale it. Furthermore, we may apply inner iterations for this preconditioner and also for the Schur complement matrix, hence the corresponding matrix \(\hat{M}_{0}\) is in general not fixed so the above inner product method is not applicable.
The presentation of the block-triangular factorization preconditioner and of approximations of the arising Schur complement preconditioners, together with numerical tests, will be given in Part II of this paper.
References
[1] van Rienen, U.: Numerical Methods in Computational Electrodynamics. Linear Systems in Practical Applications. Springer, Berlin (1999)
[2] Axelsson, O., Kucherov, A.: Real valued iterative methods for solving complex symmetric linear systems. Numer. Linear Algebra Appl. 7, 197–218 (2000)
[3] Benzi, M., Bertaccini, D.: Block preconditioning of real-valued iterative algorithms for complex linear systems. IMA J. Numer. Anal. 28, 598–618 (2008)
[4] Axelsson, O., Boyanova, P., Kronbichler, M., Neytcheva, M., Wu, X.: Numerical and computational efficiency of solvers for two-phase problems. Comput. Math. Appl. 65, 301–314 (2012). http://dx.doi.org/10.1016/j.camva.2012.05.020
[5] Boyanova, P.: On numerical solution methods for block-structured discrete systems. Doctoral thesis, Department of Information Technology, Uppsala University, Sweden (2012). http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-173530
[6] Lions, J.-L.: Optimal Control of Systems Governed by Partial Differential Equations. Springer, Berlin (1971)
[7] Zulehner, W.: Nonstandard norms and robust estimates for saddle-point problems. SIAM J. Matrix Anal. Appl. 32, 536–560 (2011)
[8] Arridge, S.R.: Optical tomography in medical imaging. Inverse Probl. 15, 41–93 (1999)
[9] Egger, H., Engl, H.W.: Tikhonov regularization applied to the inverse problem of option pricing: convergence analysis and rates. Inverse Probl. 21, 1027–1045 (2005)
[10] Paige, C.C., Saunders, M.A.: Solution of sparse indefinite systems of linear equations. SIAM J. Numer. Anal. 12, 617–629 (1975)
[11] Axelsson, O., Barker, V.A.: Finite Element Solution of Boundary Value Problems. Theory and Computation. Academic, Orlando, FL (1984)
[12] Brezzi, F., Fortin, M.: Mixed and Hybrid Finite Element Methods. Springer, Berlin (1991)
[13] Notay, Y.: The software package AGMG. http://homepages.ulb.ac.be/~ynotay/
[14] Vassilevski, P.: Multilevel Block Factorization Preconditioners. Springer, New York (2008)
[15] Notay, Y.: Aggregation-based algebraic multigrid for convection-diffusion equations. SIAM J. Sci. Comput. 34, A2288–A2316 (2012)
[16] Saad, Y., Schultz, M.H.: GMRES: a generalized minimal residual algorithm for solving nonsymmetric linear systems. SIAM J. Sci. Stat. Comput. 7, 856–869 (1986)
[17] Axelsson, O., Vassilevski, P.S.: A black box generalized conjugate gradient solver with inner iterations and variable-step preconditioning. SIAM J. Matrix Anal. Appl. 12(4), 625–644 (1991)
[18] Saad, Y.: A flexible inner-outer preconditioned GMRES algorithm. SIAM J. Sci. Comput. 14, 461–469 (1993)
[19] Bramble, J.H., Pasciak, J.E.: A preconditioning technique for indefinite systems resulting from mixed approximations of elliptic problems. Math. Comput. 50, 1–17 (1988)
Acknowledgements
This work was supported by the European Regional Development Fund in the IT4 Innovations Centre of Excellence project (CZ.1.05/1.1.00/02.0070).
Discussions with Maya Neytcheva on implementation aspects of the method are gratefully acknowledged.
© 2013 Springer Science+Business Media New York
Axelsson, O. (2013). Preconditioners for Some Matrices of Two-by-Two Block Form, with Applications, I. In: Iliev, O., Margenov, S., Minev, P., Vassilevski, P., Zikatanov, L. (eds) Numerical Solution of Partial Differential Equations: Theory, Algorithms, and Their Applications. Springer Proceedings in Mathematics & Statistics, vol 45. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-7172-1_3