1 Introduction

Flow control is widely used in the petroleum, chemical, and aeronautical engineering fields, and has become a very active research area in scientific computing. Developing efficient numerical methods for flow control is clearly one of the keys to its successful application. In this paper, we consider numerical solutions of Stokes control problems as a step toward more complicated fluid dynamics problems, such as Navier-Stokes control problems. This study focuses on the solution of the multiple saddle point problems generated by the discretize-then-optimize approach. Our goal is to construct iterative solvers that are independent not only of the mesh size of the finite element discretization, but also of the regularization parameter of the optimization. Some successful solvers and preconditioners for Stokes control optimization problems have been designed from different perspectives, most of them according to the properties of the saddle point matrices. In [25], a parameter-robust block-diagonal preconditioner was derived from a nonstandard norm argument. In [14], block-diagonal and block-triangular preconditioners were generated based on a commutator argument [9]. Additionally, some iteration methods and preconditioners have been designed for the following reduced structured block 2 × 2 linear system:

$$ \begin{array}{@{}rcl@{}} \textbf{A}\textbf{x}\equiv \left[ \begin{array}{cc} \textbf{W} & -\textbf{T} \\ \textbf{T} & \textbf{W} \end{array} \right] \left[ \begin{array}{c} \textbf{y} \\ \textbf{z} \end{array} \right] = \left[ \begin{array}{c} \textbf{p} \\ \textbf{q} \end{array} \right] \equiv\textbf{g}. \end{array} $$
(1.1)

In [2], the authors studied the properties and numerical behaviors of a preconditioned square block (PRESB) preconditioner for solving (1.1). There have also been some studies focusing on solutions for the equivalent formulation of (1.1) as a complex symmetric indefinite linear system:

$$ (\textbf{W} + \mathrm{i} \textbf{T})(\textbf{y} + \mathrm{i} \textbf{z}) = \textbf{p} + \mathrm{i} \textbf{q}. $$

Complex arithmetic algorithms were proposed in [21, 22], and real arithmetic algorithms were proposed in [13, 16] under the assumption that \(-\textbf{T} \preceq \textbf{W} \preceq \textbf{T}\). Additional iterative methods derived from different matrix splitting techniques were studied in [17, 23, 24].

In this paper, we discuss the preconditioned modified Hermitian/skew-Hermitian splitting (PMHSS) iteration method and the corresponding preconditioner for distributed Stokes control problems. The PMHSS iteration method and preconditioner were proposed in [4] for Poisson control optimization. The PMHSS preconditioner has nearly the same workload as the PRESB preconditioner; however, it can also be used in short-term recurrence iteration methods (e.g., the minimal residual method (MINRES) and the Chebyshev semi-iteration method) [5, 18, 19]. The convergence of the PMHSS iteration method was analyzed in [4] for symmetric positive semidefinite \(\textbf{W}, \textbf{T} \in \mathbb{R}^{n \times n}\). When W or T is indefinite, the convergence rate of the PMHSS method and the eigenvalue distribution of the preconditioned matrix deteriorate. In this study, we analyze the convergence of the PMHSS iteration method and the eigenvalue distribution of the PMHSS preconditioned matrix when W is a symmetric positive semidefinite matrix and T is a saddle point matrix. Similar to the PRESB preconditioner, the PMHSS preconditioner requires the solution of two saddle point linear systems in every iteration. These saddle point problems can be solved by inner iterations, such as the flexible generalized minimal residual (FGMRES) method or parameterized and preconditioned Uzawa iterations [6, 7, 10]. Alternatively, to save computing cost and avoid choosing an inner tolerance, we propose a modified version of the PMHSS preconditioner called the rotated block constraint (RBC) preconditioner. The eigenvalue distribution of the corresponding preconditioned matrix is also analyzed.

The remainder of this paper is organized as follows. In Section 2, we describe the PMHSS iteration method and preconditioner for solving the Stokes control optimization problem. The convergence results and eigenvalue distribution are also derived. In Section 3, we derive the RBC preconditioner and analyze the eigenvalue distribution of the preconditioned matrix. In Section 4, we present existing feasible preconditioners for solving the Stokes control problems and describe their implementations in detail. In Section 5, the numerical performances of the introduced preconditioners are compared via testing on model problems. Our final conclusions are summarized in Section 6.

2 PMHSS iteration and preconditioner for optimality systems

We consider the following Stokes distributed control problem:

$$ \underset{y,u}{\min}\frac{1}{2} \|y-y_{d}\|_{2}^{2} +\frac{1}{2}\upbeta \|u\|_{2}^{2}, $$
$$ \begin{array}{@{}rcl@{}} \begin{array}{rcl} \text{subject to} - \nabla^{2} y + \nabla p &= u&\text{in}\quad {\Omega}, \\ \nabla \cdot y& = 0&\text{in}\quad {\Omega},\\ y&=g_{D}& \text{on}\quad \partial{\Omega}, \end{array} \end{array} $$
(2.1)

where Ω is a domain in \(\mathbb {R}^{2}\) or \(\mathbb {R}^{3}\) and ∂Ω is its boundary. The desired state function yd and the boundary value function gD are given. Because the Stokes equation is self-adjoint, the discretize-then-optimize and optimize-then-discretize approaches are mathematically equivalent and lead to the same solution. In this section, we discuss the algebraic systems obtained from the discretize-then-optimize approach. The rectangular Taylor-Hood finite element method is adopted because it is inf-sup stable. Specifically, the velocity y and control u are approximated by linear combinations of the Q2-basis functions {ϕj}, j = 1, ⋯ , nv, while the pressure p is approximated by linear combinations of the Q1-basis functions {ψk}, k = 1, ⋯ , np. The first-order necessary optimality conditions for the discretized optimization problem then yield the following linear system:

$$ \mathcal{A}^{R} \left[ \begin{array}{c} \textbf{y} \\ \textbf{p}\\ \textbf{l}\\ \textbf{m} \end{array} \right] \equiv \left[ \begin{array}{llll} M & 0 & -F^{T}& -B^{T}\\ 0 & 0 & -B & 0\\ F & B^{T}& M & 0\\ B & 0 & 0 & 0 \end{array} \right]\left[ \begin{array}{c} \textbf{y} \\ \textbf{p}\\ \textbf{l}\\ \textbf{m} \end{array} \right] = \left[ \begin{array}{c} \textbf{b} \\ \textbf{0}\\ \textbf{f}\\ \textbf{g} \end{array} \right], $$
(2.2)

where

$$ \begin{array}{@{}rcl@{}} && F = \left[\sqrt{\upbeta} {\int}_{\Omega} \nabla \phi_{i} : \nabla \phi_{j}\right], \quad M = \left[{\int}_{\Omega} \phi_{i} \phi_{j}\right], \quad B = \left[- \sqrt{\upbeta} {\int}_{\Omega} \psi_{k} \nabla \cdot \phi_{j}\right],\\ && b_{i} = {\int}_{\Omega} y_{d} \phi_{i} - \sum\limits_{j = n_{y}+1}^{n_{y}+n_{\partial}} y_{j} {\int}_{\Omega}\nabla \phi_{i} : \nabla \phi_{j},\\ && f_{i} = - \sqrt{\upbeta}\sum\limits_{j = n_{y}+1}^{n_{y}+n_{\partial}} y_{j} {\int}_{\Omega}\nabla \phi_{i} : \nabla \phi_{j}, \quad g_{i} = \sqrt{\upbeta} \sum\limits_{j = n_{y}+1}^{n_{y}+n_{\partial}}y_{j} {\int}_{\Omega} \psi_{i} \nabla \cdot \phi_{j},\\ && b = [b_{i}], \quad f = [f_{i}], \quad g = [g_{i}], \end{array} $$

and l and m are scaled Lagrange multipliers corresponding to y and p, respectively. The matrices \(M \in \mathbb {R}^{n_{\nu } \times n_{\nu }}\) and \(F\in \mathbb {R}^{n_{\nu } \times n_{\nu }}\), which are referred to as the mass matrix and scaled stiffness matrix, respectively, are symmetric positive definite. The matrix \(B\in \mathbb {R}^{n_{p} \times n_{\nu }}\) is of full row rank. We refer to [1, 11, 14] for the details of finite element discretization. The 4 × 4 block matrix \(\mathcal {A}^{R}\) in (2.2) can be partitioned into the following 2 × 2 block form:

$$ \mathcal{A}^{R} := \left[ \begin{array}{cc} \textbf{W} & -\textbf{T} \\ \textbf{T} & \textbf{W} \end{array} \right], $$
(2.3)

where

$$ \textbf{W} = \left[ \begin{array}{cc} M & 0 \\ 0 & 0 \end{array} \right] \quad \text{and} \quad \textbf{T} = \left[ \begin{array}{cc} F & B^{T} \\ B & 0 \end{array} \right] $$
(2.4)

are a symmetric positive semidefinite matrix and a nonsingular saddle point matrix, respectively. In [4], Bai et al. defined the PMHSS iteration method for the case in which W and T are both symmetric positive semidefinite, based on the following matrix splitting:

$$ \mathcal{A}^{R} = \textbf{F} - \textbf{G}, $$

where

$$ \begin{array}{@{}rcl@{}} \textbf{F}:= \left[ \begin{array}{cc} \textbf{I} & -\textbf{I} \\ \textbf{I} & \textbf{I} \end{array} \right] \left[ \begin{array}{cc} \textbf{W}+\textbf{T} & 0 \\ 0 & \textbf{W}+\textbf{T} \end{array} \right],\quad \text{and} \quad \textbf{G}:= \left[ \begin{array}{cc} \textbf{T} & -\textbf{W} \\ \textbf{W} & \textbf{T} \end{array} \right]. \end{array} $$
(2.5)

The PMHSS iteration scheme for solving a 2 × 2 block system can be written as:

$$ \textbf{F} \textbf{x}^{(k+1)} = \textbf{G} \textbf{x}^{(k)} +\textbf{g}. $$

The matrix F is referred to as the PMHSS preconditioner. The PMHSS iteration method and preconditioner perform excellently owing to the clustered eigenvalue distribution and the fact that, when W and T are symmetric positive definite, the eigenvectors form a normal matrix. In this study, we explore the performance of the PMHSS iteration method and preconditioner for the linear system in (2.2).
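To fix ideas, the following minimal sketch (in Python, on hypothetical random test data with the block structure (2.4); the sizes and matrices are our own, not from the paper) assembles the splitting (2.5) and runs the stationary PMHSS iteration:

```python
# Stationary PMHSS iteration F x^(k+1) = G x^(k) + g on a toy instance of
# the 2x2 block system (1.1) with W, T as in (2.4).  All data here are
# hypothetical test matrices, chosen only to have the right structure.
import numpy as np

rng = np.random.default_rng(0)
nv, npp = 8, 3                                    # toy velocity/pressure sizes
Mm = rng.standard_normal((nv, nv)); Mm = Mm @ Mm.T + nv * np.eye(nv)  # SPD "mass"
Fm = rng.standard_normal((nv, nv)); Fm = Fm @ Fm.T + nv * np.eye(nv)  # SPD "stiffness"
B = rng.standard_normal((npp, nv))                # full row rank (generically)
n = nv + npp

W = np.zeros((n, n)); W[:nv, :nv] = Mm            # W = blkdiag(M, 0)
T = np.block([[Fm, B.T], [B, np.zeros((npp, npp))]])
A = np.block([[W, -T], [T, W]])                   # coefficient matrix (2.3)
Fp = np.block([[W + T, -(W + T)], [W + T, W + T]])  # PMHSS preconditioner (2.5)
G = Fp - A                                        # equals [[T, -W], [W, T]]

g = rng.standard_normal(2 * n)
x = np.zeros(2 * n)
for _ in range(60):                               # contraction factor <= 1/sqrt(2)
    x = np.linalg.solve(Fp, G @ x + g)
print(np.linalg.norm(A @ x - g) / np.linalg.norm(g))  # small relative residual
```

With the contraction factor bounded by \(1/\sqrt{2}\), as shown below, roughly 60 iterations reduce the relative residual to about \(10^{-9}\) regardless of the data.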

Lemma 2.1

Let W and T be the matrices in (2.4), where M and F are symmetric positive definite and B is of full row rank. Then, the eigenvalues of \(\textbf{T}^{-1}\textbf{W}\) are all nonnegative.

Proof

Let τ be an eigenvalue and x = [x1;x2] a corresponding eigenvector of the following generalized eigenvalue problem:

$$ \left[ \begin{array}{cc} M & 0 \\ 0 & 0 \end{array} \right]\left[ \begin{array}{c} x_{1} \\ x_{2} \end{array} \right] = \tau\left[ \begin{array}{cc} F & B^{T} \\ B & 0 \end{array} \right] \left[ \begin{array}{c} x_{1} \\ x_{2} \end{array} \right] , $$

which can be rewritten as:

$$ \left\{ \begin{array}{l c l} \tau (F x_{1} + B^{T}x_{2}) &=& M x_{1},\\ \tau B x_{1} & =& 0. \end{array} \right. $$
(2.6)

It is clear that τ = 0 is an eigenvalue of multiplicity np, because [0;x2] is a corresponding eigenvector for an arbitrary nonzero \(x_{2} \in \mathbb {R}^{n_{p}}\). In the case of τ≠ 0, the second equation yields Bx1 = 0. Multiplying both sides of the first equality in (2.6) by \(x_{1}^{*}\), we obtain:

$$ \tau (x_{1}^{*}F x_{1} + x_{1}^{*}B^{T}x_{2}) = x_{1}^{*}M x_{1}. $$

Because Bx1 = 0, the term \(x_{1}^{*}B^{T}x_{2} = (Bx_{1})^{*}x_{2}\) vanishes, and therefore:

$$ \tau = \displaystyle\frac{x_{1}^{*}M x_{1}}{x_{1}^{*}F x_{1}}. $$
(2.7)

Note that x1 ≠ 0 when τ ≠ 0; otherwise, the first equation in (2.6) would give BTx2 = 0, and hence x2 = 0 because BT has full column rank, so x would not be an eigenvector. Therefore, τ > 0 because F and M are symmetric positive definite. The pressure component of the eigenvector follows as:

$$ x_{2} = \frac{1}{\tau} (B B^{T})^{-1}B(M-\tau F)x_{1} $$

for any \(x_{1} \in \text{null}(B)\setminus\{0\}\) and τ in (2.7). □

Define E = (W + T)− 1(WT). It is trivial to verify that \(\mu = \displaystyle \frac {\tau -1}{\tau +1}\) are the eigenvalues of E and − 1 ≤ μ < 1. Furthermore, μ = − 1 are eigenvalues with a multiplicity at least np.

Theorem 2.1

Let \(\mu_{j}\), j = 1, ⋯ , nv + np, be the eigenvalues of E. Then, the eigenvalues of \(\textbf {F}^{-1}\mathcal {A}^{R}\) are:

$$ \lambda^{\pm}_{j} = \displaystyle\frac{1}{2} (1\pm \mathrm{i} \mu_{j}). $$

Proof

The PMHSS preconditioned matrix is written as:

$$ \begin{array}{lcl} \textbf{F}^{-1}\mathcal{A}^{R} & = & \displaystyle\frac{1}{2} \left[ \begin{array}{cc} \textbf{W} +\textbf{T} & 0\\ 0 & \textbf{W} + \textbf{T} \end{array} \right]^{-1} \left[ \begin{array}{cc} \textbf{I} & \textbf{I}\\ -\textbf{I} & \textbf{I} \end{array} \right] \left[ \begin{array}{cc} \textbf{W} &-\textbf{T} \\ \textbf{T} & \textbf{W} \end{array} \right] \\ & = & \displaystyle\frac{1}{2} \left[ \begin{array}{cc} \textbf{W} +\textbf{T} & 0\\ 0 & \textbf{W} + \textbf{T} \end{array} \right]^{-1} \left[ \begin{array}{cc} \textbf{W} +\textbf{T} &\textbf{W}-\textbf{T} \\ \textbf{T} -\textbf{W} & \textbf{W}+ \textbf{T} \end{array} \right]\\ & = & \displaystyle\frac{1}{2} \left[ \begin{array}{cc} \textbf{I} & (\textbf{W} + \textbf{T})^{-1}(\textbf{W} -\textbf{T})\\ (\textbf{W} + \textbf{T})^{-1}(\textbf{T}-\textbf{W}) &\textbf{I} \end{array} \right]\\ & = & \displaystyle\frac{1}{2} \left[ \begin{array}{cc} \textbf{I} & E\\ -E &\textbf{I} \end{array} \right]. \end{array} $$

Let λj and \(w_{j} = ({u^{T}_{j}}, ~{v^{T}_{j}})^{T}\) be the eigenvalues and eigenvectors of \(\textbf {F}^{-1}\mathcal {A}^{R}\), respectively. It holds that:

$$ \displaystyle\frac{1}{2}\left[ \begin{array}{cc} \textbf{I} & E\\ -E &\textbf{I} \end{array} \right] \left[ \begin{array}{c} u_{j}\\ v_{j} \end{array} \right] = \lambda_{j} \left[ \begin{array}{c} u_{j}\\ v_{j} \end{array} \right]. $$

Consequently, the two block rows give \(Ev_{j} = (2\lambda_{j}-1)u_{j}\) and \(Eu_{j} = (1-2\lambda_{j})v_{j}\), and therefore:

$$ E^{2} v_{j}= -(2 \lambda_{j} -1)^{2} v_{j}. $$

This indicates that \(-(2\lambda_{j}-1)^{2}\) and vj are eigenvalues and corresponding eigenvectors of E2. Because the eigenvalues of E2 are \(\mu_{j}^{2}\), we obtain \((2\lambda_{j}-1)^{2} = -\mu_{j}^{2}\), and hence \(\lambda _{j}^{\pm }= \frac {1}{2} (1 \pm \mathrm {i} \mu _{j})\). □

According to Theorem 2.1, the eigenvalues of the preconditioned matrix \(\textbf {F}^{-1}\mathcal {A}^{R}\) are located on the line segment connecting \(\frac {1}{2}(1+\mathrm {i} )\) and \(\frac {1}{2}(1-\mathrm {i})\) in the complex plane. The eigenvalues of the PMHSS iteration matrix:

$$ \textbf{L} = \textbf{F}^{-1}\textbf{G} = I - \textbf{F}^{-1}\mathcal{A}^{R} $$

are \(1 - \lambda _{j}^{\pm } =\frac {1}{2} (1 \mp \mathrm {i} \mu _{j})\), with moduli \(\frac{1}{2}\sqrt{1+\mu_{j}^{2}} \leq \frac{\sqrt{2}}{2}\). The PMHSS iteration method is therefore convergent, because the spectral radius of the iteration matrix is not greater than \(\frac {1}{\sqrt {2}}\).
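This location of the spectrum is easy to check numerically; the following sketch (our own illustration, again on hypothetical random data with the structure (2.4)) verifies that the computed eigenvalues have real part 1/2 and imaginary parts in [−1/2, 1/2]:

```python
# Numerical check of Theorem 2.1: the eigenvalues of F^{-1} A^R lie on the
# segment between (1+i)/2 and (1-i)/2.  Hypothetical toy data.
import numpy as np

rng = np.random.default_rng(1)
nv, npp = 8, 3
Mm = rng.standard_normal((nv, nv)); Mm = Mm @ Mm.T + nv * np.eye(nv)
Fm = rng.standard_normal((nv, nv)); Fm = Fm @ Fm.T + nv * np.eye(nv)
B = rng.standard_normal((npp, nv))
n = nv + npp
W = np.zeros((n, n)); W[:nv, :nv] = Mm            # W = blkdiag(M, 0)
T = np.block([[Fm, B.T], [B, np.zeros((npp, npp))]])
A = np.block([[W, -T], [T, W]])
Fp = np.block([[W + T, -(W + T)], [W + T, W + T]])  # PMHSS preconditioner

lam = np.linalg.eigvals(np.linalg.solve(Fp, A))
print(np.allclose(lam.real, 0.5))                 # True: Re(lambda) = 1/2
print(np.max(np.abs(lam.imag)) <= 0.5 + 1e-8)     # True: |Im(lambda)| <= 1/2
```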

When PMHSS preconditioning is used in Krylov subspace methods, the generalized residual equation \(\textbf{F}\textbf{z} = \textbf{r}\) must be solved in every iteration. The main workload lies in solving two linear saddle point problems with:

$$ \textbf{W} +\textbf{T} = \left[ \begin{array}{cc} F+M & B^{T} \\ B & 0 \end{array} \right] $$
(2.8)

in every iteration. These saddle point equations can be solved utilizing inner iteration methods, as discussed in [1].

3 RBC preconditioner for optimality systems

In this section, we propose a new preconditioner to avoid solving saddle point equations. We introduce the matrices:

$$ \widetilde{\textbf{W}} = \left[ \begin{array}{cc} M & 0 \\ 0 & S \end{array} \right] \quad \text{and} \quad \widetilde{\textbf{T}} = \left[ \begin{array}{cc} F & B^{T} \\ B & -\hat{S} \end{array} \right], $$
(3.1)

where \(S = B(M + F)^{-1}B^{T}\) is the (negative) Schur complement of (2.8) and \(\hat {S}\) serves as a symmetric positive definite approximation of S. Therefore, the matrices \(\widetilde {\textbf {W}}\) and \(\widetilde {\textbf {T}}\) are symmetric positive definite and symmetric indefinite, respectively. The RBC preconditioner is defined as:

$$ \begin{array}{@{}rcl@{}} \widetilde{\textbf{F}}:= \left[ \begin{array}{cc} \textbf{I} & -\textbf{I} \\ \textbf{I} & \textbf{I} \end{array} \right] \left[ \begin{array}{cc} \widetilde{\textbf{W}}+\widetilde{\textbf{T}} & 0 \\ 0 & \widetilde{\textbf{W}}+\widetilde{\textbf{T}} \end{array} \right]. \end{array} $$
(3.2)

The preconditioner \(\widetilde {\textbf {F}}\) does not need to be generated explicitly because:

$$ \widetilde{\textbf{W}}+\widetilde{\textbf{T}} = \left[ \begin{array}{cc} M+ F & B^{T} \\ B &S -\hat{S} \end{array} \right]= \left[ \begin{array}{cc} \textbf{I} & 0 \\ B(M+ F)^{-1} &\textbf{I} \end{array} \right] \left[ \begin{array}{cc} M+ F & B^{T} \\ 0 & -\hat{S} \end{array} \right], $$

is the product of a block unit lower triangular matrix and a block upper triangular matrix. The matrix \(\widetilde {\textbf {W}}+\widetilde {\textbf {T}}\) can be viewed as a constraint preconditioner for the saddle point matrix W + T. The computing cost of solving the generalized residual equation:

$$ \widetilde{\textbf{F}} \textbf{z} = \textbf{r}, $$

consists of two solutions with M + F and one solution with \(\hat {S}\) for each of the two block solves with \(\widetilde {\textbf {W}}+\widetilde {\textbf {T}}\). A reasonable \(\hat {S}\) will yield an effective and efficient preconditioner \(\widetilde {\textbf {F}}\). We discuss how well the preconditioner \(\widetilde {\textbf {F}}\) approximates \(\mathcal {A}^{R}\) in the following theorem.
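The following sketch spells out one application of \(\widetilde{\textbf{F}}^{-1}\) via the triangular factors above (our illustration; `solve_MpF` and `solve_Shat` are placeholder inner solvers, e.g., AMG for M + F and, as discussed below, the approximation \(S_{p}\) for \(\hat{S}\)):

```python
# A sketch of the RBC preconditioning step z = F~^{-1} r.  Each block solve
# with W~ + T~ uses the factorization above: two solves with M + F and one
# solve with S_hat.  B is the discrete divergence matrix, nv the velocity
# dimension, and n = nv + np the block size; all solver arguments are
# placeholders.
import numpy as np

def solve_WtTt(s, B, solve_MpF, solve_Shat, nv):
    """Solve (W~ + T~) t = s via the block triangular factors."""
    s1, s2 = s[:nv], s[nv:]
    w1 = solve_MpF(s1)                  # forward solve: (M+F)^{-1} s1
    t2 = -solve_Shat(s2 - B @ w1)       # pressure block from the upper factor
    t1 = solve_MpF(s1 - B.T @ t2)       # velocity block: second M + F solve
    return np.concatenate([t1, t2])

def apply_rbc_inv(r, B, solve_MpF, solve_Shat, nv, n):
    """Return z = F~^{-1} r; the rotation P is inverted as in PMHSS."""
    r1, r2 = r[:n], r[n:]
    z1 = solve_WtTt(0.5 * (r1 + r2), B, solve_MpF, solve_Shat, nv)
    z2 = solve_WtTt(0.5 * (r2 - r1), B, solve_MpF, solve_Shat, nv)
    return np.concatenate([z1, z2])
```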

Theorem 3.1

Let \(M \in \mathbb {R}^{n_{\nu } \times n_{\nu }}\) and \(F\in \mathbb {R}^{n_{\nu } \times n_{\nu }}\) be symmetric positive definite matrices and let \(B \in \mathbb {R}^{n_{p} \times n_{\nu }}\) have full row rank. Let \(\widetilde {\textbf {F}} \in \mathbb {R}^{2n \times 2n}\), defined in (3.2), be the RBC preconditioner for \(\mathcal {A}^{R} \in \mathbb {R}^{2n \times 2n}\) in (2.2), where n = nν + np. Let \(\hat {S}\) be a symmetric positive definite approximation of S = B(M + F)− 1BT, and define \({\Delta } S = S -\hat {S}\). Let the Jordan decomposition of \(\textbf {F}^{-1}\mathcal {A}^{R}\) be:

$$ \textbf{F}^{-1}\mathcal{A}^{R} = \textbf{Q}\textbf{J} \textbf{Q}^{-1}, \quad \textbf{J} = \left[ \begin{array}{ccc} J_{1} & & \\ & {\ddots} & \\ & & J_{k} \end{array} \right], $$
(3.3)

where

$$ J_{i} = \left[ \begin{array}{ccccc} \lambda_{i} &1 & \\ & \lambda_{i} & 1 \\ & & {\ddots} & {\ddots} & \\ & & & \lambda_{i} &1\\ & & &&\lambda_{i} \end{array} \right] \quad \text{or} \quad J_{i} = \left[ \begin{array}{ccccc} \lambda_{i} & & \\ & \lambda_{i} &\\ & & {\ddots} & & \\ & & & \lambda_{i} &\\ & & &&\lambda_{i} \end{array} \right], \quad i = 1,\cdots,k $$

are the Jordan blocks in J and λi are the eigenvalues of \(\textbf {F}^{-1}\mathcal {A}^{R}\). If m is the order of the largest Jordan block in J and

$$ \|\hat{S}^{-1}{\Delta} S\| \leq \frac{\sqrt{2}}{1+\sqrt{2}}\frac{1}{m+1}\frac{1}{\kappa(\textbf{Q})}\frac{1}{\sqrt{ 1+ \|(M+F)^{-1}B^{T}\|^{2}}} , $$

then for any eigenvalue σ of \(\widetilde {\textbf {F}}^{-1}\mathcal {A}^{R}\), there is an eigenvalue λ of \(\textbf {F}^{-1}\mathcal {A}^{R}\) such that

$$ |\sigma -\lambda| \leq \frac{m}{m+1}(m+1)^{\frac{1}{m}}\left( 1+ \frac{\sqrt{2}}{2}\right)^{\frac{1}{m}}\kappa(\textbf{Q})^{\frac{1}{m}} \left( \sqrt{ 1+ \|(M+F)^{-1}B^{T}\|^{2}}\right)^{\frac{1}{m}}\|\hat{S}^{-1}{\Delta} S\|^{\frac{1}{m}}, $$

where \(\kappa(\textbf{Q}) = \|\textbf{Q}^{-1}\|\|\textbf{Q}\|\). Furthermore, if m0 is the order of the largest Jordan block to which λ belongs, then:

$$ |\sigma -\lambda| \leq \frac{m_{0}}{m_{0}+1}(m_{0}+1)^{\frac{1}{m_{0}}}\left( 1+ \frac{\sqrt{2}}{2}\right)^{\frac{1}{m_{0}}}\kappa(\textbf{Q})^{\frac{1}{m_{0}}} \left( \sqrt{ 1+ \|(M+F)^{-1}B^{T}\|^{2}}\right)^{\frac{1}{m_{0}}}\|\hat{S}^{-1}{\Delta} S\|^{\frac{1}{m_{0}}}. $$

Proof

We denote:

$$ \textbf{D} = \left[ \begin{array}{cc} \textbf{W} +\textbf{T} &0 \\ 0 & \textbf{W} +\textbf{T} \end{array} \right], \quad \widetilde{\textbf{D}} = \left[ \begin{array}{cc} \widetilde{\textbf{W}} +\widetilde{\textbf{T}} &0 \\ 0 & \widetilde{\textbf{W}} +\widetilde{\textbf{T}} \end{array} \right],\quad \text{and} \quad \textbf{P} = \left[ \begin{array}{cc} \textbf{I} &-\textbf{I} \\ \textbf{I} & \textbf{I} \end{array} \right]. $$

Then, the preconditioned matrix can be expressed as:

$$ \widetilde{\textbf{F}}^{-1}\mathcal{A}^{R} = \widetilde{\textbf{D}}^{-1}\textbf{P}^{-1}\mathcal{A}^{R}. $$

Note that

$$ \widetilde{\textbf{D}}^{-1} = \textbf{D}^{-1}-\widetilde{\textbf{D}}^{-1} (\widetilde{\textbf{D}}-\textbf{D})\textbf{D}^{-1} $$

and we denote

$$ \delta_{\textbf{D}} =\widetilde{\textbf{D}}-\textbf{D} = \left[ \begin{array}{cccc} 0 & 0\\ 0 & S -\hat{S}\\ & & 0 & 0\\ & & 0 & S -\hat{S} \end{array} \right]. $$

Therefore, we have:

$$ \widetilde{\textbf{F}}^{-1}\mathcal{A}^{R} = \textbf{F}^{-1}\mathcal{A}^{R} - \widetilde{\textbf{D}}^{-1}\delta_{\textbf{D}}\textbf{F}^{-1}\mathcal{A}^{R}. $$

Recalling that \({\Delta } S = S -\hat {S}\), we have:

$$ \left\|\widetilde{\textbf{D}}^{-1}\delta_{\textbf{D}}\right\| = \|(\widetilde{\textbf{W}} +\widetilde{\textbf{T}})^{-1}(\widetilde{\textbf{W}} +\widetilde{\textbf{T}} -(\textbf{W} +\textbf{T}))\|= \left\| \left[ \begin{array}{cc} 0 &(M+F)^{-1}B^{T}\hat{S}^{-1}{\Delta} S \\ 0 & -\hat{S}^{-1}{\Delta} S \end{array} \right]\right\|. $$

It is clear that \(\left \|\widetilde {\textbf {D}}^{-1}\delta _{\textbf {D}}\right \|\) is the square root of the maximum eigenvalue of

$$ \left[ \begin{array}{cc} 0 &0 \\ 0 & ((M+F)^{-1}B^{T}\hat{S}^{-1}{\Delta} S)^{T}((M+F)^{-1}B^{T}\hat{S}^{-1}{\Delta} S) +(\hat{S}^{-1}{\Delta} S)^{T}(\hat{S}^{-1}{\Delta} S) \end{array} \right]. $$

Because

$$ \begin{array}{@{}rcl@{}} \begin{array}{rl} &\|((M+F)^{-1}B^{T}\hat{S}^{-1}{\Delta} S)^{T}((M+F)^{-1}B^{T}\hat{S}^{-1}{\Delta} S) +(\hat{S}^{-1}{\Delta} S)^{T}(\hat{S}^{-1}{\Delta} S)\|\\ [2mm] \leq & \|((M+F)^{-1}B^{T}\hat{S}^{-1}{\Delta} S)^{T}((M+F)^{-1}B^{T}\hat{S}^{-1}{\Delta} S)\| +\|(\hat{S}^{-1}{\Delta} S)^{T}(\hat{S}^{-1}{\Delta} S)\|\\ \leq & \|(M+F)^{-1}B^{T}\hat{S}^{-1}{\Delta} S\|^{2} +\|\hat{S}^{-1}{\Delta} S\|^{2}\\ \leq& (1+ \|(M+F)^{-1}B^{T}\|^{2})\|\hat{S}^{-1}{\Delta} S\|^{2}, \end{array} \end{array} $$

we have

$$ \left\|\widetilde{\textbf{D}}^{-1}\delta_{\textbf{D}}\right\| \leq \sqrt{ 1+ \left\|(M+F)^{-1}B^{T}\right\|^{2}}\left\|\hat{S}^{-1}{\Delta} S\right\|. $$

Considering the eigenvalue distribution of \(\textbf {F}^{-1}\mathcal {A}^{R}\) established in Theorem 2.1, every eigenvalue satisfies \(|\lambda_{i}| \leq \frac{\sqrt{2}}{2}\), and hence \(\|\textbf {J}\| \leq 1+ \frac {\sqrt {2}}{2}\). We derive the estimates of σ as follows. When

$$ \|\hat{S}^{-1}{\Delta} S\| \leq \frac{1}{\sqrt{ 1+ \|(M+F)^{-1}B^{T}\|^{2}}} \frac{1}{\|\textbf{Q}^{-1}\|\|\textbf{Q}\|}\frac{\sqrt{2}}{1+\sqrt{2}}\frac{1}{m+1}, $$

it is clear that

$$ \begin{array}{@{}rcl@{}} \|\textbf{Q}^{-1}\widetilde{\textbf{D}}^{-1}\delta_{\textbf{D}}\textbf{F}^{-1}\mathcal{A}^{R}\textbf{Q}\| &=& \|\textbf{Q}^{-1}\widetilde{\textbf{D}}^{-1}\delta_{\textbf{D}}\textbf{Q} \textbf{Q}^{-1}\textbf{F}^{-1}\mathcal{A}^{R}\textbf{Q}\|\\ & = &\|\textbf{Q}^{-1}\widetilde{\textbf{D}}^{-1}\delta_{\textbf{D}}\textbf{Q} \textbf{J} \| \\ &\leq& \|\textbf{Q}^{-1}\widetilde{\textbf{D}}^{-1}\delta_{\textbf{D}}\textbf{Q}\| \|\textbf{J} \| \\ & \leq & \frac{1}{m+1}. \end{array} $$

According to Theorem 2 in [12], for any eigenvalue σ of \(\widetilde {\textbf {F}}^{-1}\mathcal {A}^{R}\), there is an eigenvalue λ of \(\textbf {F}^{-1}\mathcal {A}^{R}\) such that:

$$ \begin{array}{@{}rcl@{}} |\sigma -\lambda| &\leq & \frac{m}{m+1}(m+1)^{\frac{1}{m}}\|\textbf{Q}^{-1}\widetilde{\textbf{D}}^{-1}\delta_{\textbf{D}}\textbf{Q} \textbf{J} \|^{\frac{1}{m}} \\ & \leq & \frac{m}{m+1}(m+1)^{\frac{1}{m}}\|\textbf{Q}^{-1}\|^{\frac{1}{m}}\|\textbf{Q}\|^{\frac{1}{m}} \left\|\widetilde{\textbf{D}}^{-1}\delta_{\textbf{D}}\right\|^{\frac{1}{m}} \|\textbf{J} \|^{\frac{1}{m}}\\ & \leq & \frac{m}{m+1}(m+1)^{\frac{1}{m}}\|\textbf{Q}^{-1}\|^{\frac{1}{m}}\|\textbf{Q}\|^{\frac{1}{m}} \left( \sqrt{ 1+ \|(M+F)^{-1}B^{T}\|^{2}}\right)^{\frac{1}{m}}\|\hat{S}^{-1}{\Delta} S\|^{\frac{1}{m}} \|\textbf{J} \|^{\frac{1}{m}}\\ & \leq & \frac{m}{m+1}(m+1)^{\frac{1}{m}}\left( 1+ \frac{\sqrt{2}}{2}\right)^{\frac{1}{m}}\|\textbf{Q}^{-1}\|^{\frac{1}{m}}\|\textbf{Q}\|^{\frac{1}{m}} \left( \sqrt{ 1+ \|(M+F)^{-1}B^{T}\|^{2}}\right)^{\frac{1}{m}}\|\hat{S}^{-1}{\Delta} S\|^{\frac{1}{m}}. \end{array} $$

Furthermore, if m0 is the order of the largest Jordan block to which λ belongs, then:

$$ \begin{array}{@{}rcl@{}} |\sigma -\lambda| &\leq & \frac{m_{0}}{m_{0}+1}(m_{0}+1)^{\frac{1}{m_{0}}}\|\textbf{Q}^{-1}\widetilde{\textbf{D}}^{-1}\delta_{\textbf{D}}\textbf{Q} \textbf{J} \|^{\frac{1}{m_{0}}} \\ & \leq & \frac{m_{0}}{m_{0}+1}(m_{0}+1)^{\frac{1}{m_{0}}}\|\textbf{Q}^{-1}\|^{\frac{1}{m_{0}}}\|\textbf{Q}\|^{\frac{1}{m_{0}}} \left\|\widetilde{\textbf{D}}^{-1}\delta_{\textbf{D}}\right\|^{\frac{1}{m_{0}}} \|\textbf{J} \|^{\frac{1}{m_{0}}}\\ & \leq &\frac{m_{0}}{m_{0}+1}(m_{0}+1)^{\frac{1}{m_{0}}}\left( 1+ \frac{\sqrt{2}}{2}\right)^{\frac{1}{m_{0}}}\|\textbf{Q}^{-1}\|^{\frac{1}{m_{0}}}\|\textbf{Q}\|^{\frac{1}{m_{0}}} \left( \sqrt{ 1+ \|(M+F)^{-1}B^{T}\|^{2}}\right)^{\frac{1}{m_{0}}}\|\hat{S}^{-1}{\Delta} S\|^{\frac{1}{m_{0}}}. \end{array} $$ □

It is clear that σ tends to λ as ΔS tends to zero. Furthermore, it always holds that \(|\sigma -\lambda | \leq \frac {m}{m+1}\) when the premise of Theorem 3.1 is satisfied. A good approximation of the Schur complement therefore yields eigenvalues σ of the RBC preconditioned matrix that are close to the eigenvalues λ of the PMHSS preconditioned matrix. For the Stokes problem in a convex domain, \(S_{p} = \upbeta \left (\sqrt {\upbeta }M_{p}^{-1}+F_{p}^{-1}\right )^{-1}\) is a well-known approximation of S that was proposed in [9], where Mp and Fp are the mass matrix and stiffness matrix of the pressure space, respectively. This approximation was also adopted in [1, 14, 25].
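Since \(S_{p}^{-1} = \frac{1}{\upbeta}\left(\sqrt{\upbeta}M_{p}^{-1} + F_{p}^{-1}\right)\), this approximation costs one pressure mass solve and one pressure stiffness solve per application; a minimal sketch (with placeholder inner solvers) reads:

```python
# A sketch of applying S_p^{-1} = (sqrt(beta) M_p^{-1} + F_p^{-1}) / beta
# to a vector v.  `solve_Mp` (e.g., Chebyshev semi-iteration) and `solve_Fp`
# (e.g., AMG) are placeholders for the pressure mass and stiffness solvers.
import numpy as np

def apply_Sp_inv(v, solve_Mp, solve_Fp, beta):
    return (np.sqrt(beta) * solve_Mp(v) + solve_Fp(v)) / beta
```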

4 Feasible preconditioners and computational implementations

We now discuss some effective preconditioners for the Stokes control problem and their corresponding main workloads. In [25, 26], a preconditioner of the form:

$$ \mathcal{P}_{nsn} = \left[ \begin{array}{cccc} M+F & 0 & 0 & 0\\ 0&\frac{1}{\upbeta}(M+F) & 0 & 0\\ 0 & 0&\left( F_{p}^{-1}+\sqrt{\upbeta} M_{p}^{-1}\right)^{-1} & 0\\ 0 & 0 &0& \upbeta \left( F_{p}^{-1}+\sqrt{\upbeta} M_{p}^{-1}\right)^{-1} \end{array} \right] $$
(4.1)

was proposed for the linear system in (2.2), where Mp and Fp are the pressure mass matrix and pressure stiffness matrix, respectively.

In [14], a preconditioner of the form:

$$ \mathcal{P}_{cta} = \left[ \begin{array}{cccc} M & 0 & 0 & 0\\ 0& \frac{1}{\upbeta}(F+M)M^{-1}(F+M)^{T} & 0 & 0\\ 0 & 0&F_{p} & 0\\ 0 & 0 &0& (M_{p}^{-1}F_{p}M_{p}^{-1}+\frac{1}{\upbeta}F_{p}^{-1})^{-1} \end{array} \right] $$
(4.2)

was designed to approximate the reordered coefficient matrix:

$$ \mathcal{A} = \left[ \begin{array}{cccc} M & F& B^{T} & 0\\ F&-\frac{1}{\upbeta}M & 0 & B^{T}\\ B & 0&0 & 0\\ 0 & B &0& 0 \end{array} \right] $$
(4.3)

utilizing an operator commutator argument. The preconditioners \(\mathcal {P}_{nsn}\) and \(\mathcal {P}_{cta}\) can be utilized in the MINRES method because they are symmetric positive definite. Numerical experiments have verified that the convergence of the preconditioned MINRES methods is independent of the mesh size h of the discretization; however, it does depend on the regularization parameter β. The preconditioner:

$$ \mathcal{P}_{presb} = \left[ \begin{array}{cccc} M & 0 & -F & -B^{T}\\ 0&0 & -B & 0\\ F & B^{T}&M+2F & 2B^{T}\\ B & 0 &2B& 0 \end{array} \right] $$
(4.4)

was proposed in [1] and later referred to as the PRESB preconditioner in [3]. It was proven that the condition number of the preconditioned system is no greater than 2 if the preconditioning is performed exactly. The performance of this preconditioner is not only parameter independent, but also mesh size independent. In the implementation of PRESB, two subsystems with the saddle point matrix (2.8) must be solved in every iteration. Similarly, PMHSS must also solve two subsystems with the matrix (2.8) in every iteration; however, these two subsystems can be solved independently. In practice, the linear saddle point systems should be solved approximately utilizing inner iterations, such as Uzawa-type methods [6, 10] or the flexible GMRES method with the block triangular preconditioner:

$$ \mathcal{P}_{tri} = \left[ \begin{array}{cc} M+F & \\ B & -\hat{S} \end{array} \right], $$
(4.5)

where \(\hat {S}\) serves as an approximation of the Schur complement of the matrix (2.8). In our numerical experiments, we set \(\hat {S} = S_{p}\). Table 1 lists the linear systems involved in every iteration for the different preconditioners. The common workload of the preconditioning processes is the solution of linear systems with Fp, Mp, M, and M + F. As in previous works [1, 14], the linear systems with Fp were solved utilizing algebraic multigrid (AMG) methods. According to the eigenvalue distributions in [15, 20], linear systems with Mp can be solved in 20 steps of the Chebyshev semi-iteration applied to the relaxed Jacobi splitting with damping parameter ω = 4/5, and linear systems with M can likewise be solved in 20 steps with damping parameter ω = 32/29, as sketched below. Linear systems with the coefficient matrix M + F are solved utilizing AMG methods.
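A hedged sketch of such a Chebyshev semi-iteration solve follows (our own illustration of the classical recurrence, see, e.g., [18, 19]; the eigenvalue bounds of diag(M)−1M are assumed known, and the stated damping parameters correspond to the bounds [1/4, 9/4] for the Q1 mass matrix and [1/4, 25/16] for the Q2 mass matrix):

```python
# Chebyshev semi-iteration for a mass matrix system M x = b, built on the
# relaxed Jacobi splitting.  `lmin` and `lmax` bound the eigenvalues of
# diag(M)^{-1} M; omega = 2/(lmin+lmax) then centres the scaled spectrum
# at 1 (omega = 4/5 for Q1, omega = 32/29 for Q2, matching the text).
import numpy as np

def chebyshev_mass_solve(M, b, lmin, lmax, steps=20):
    d = M.diagonal()                     # Jacobi diagonal
    omega = 2.0 / (lmin + lmax)          # damping parameter
    theta = 1.0                          # centre of the scaled spectrum
    delta = omega * (lmax - lmin) / 2.0  # half-width of the scaled spectrum
    sigma1 = theta / delta
    x = np.zeros_like(b)
    r = omega * (b / d)                  # residual of the scaled system
    rho = 1.0 / sigma1
    p = r / theta
    for _ in range(steps):
        x = x + p
        r = r - omega * ((M @ p) / d)    # r <- r - omega D^{-1} M p
        rho_new = 1.0 / (2.0 * sigma1 - rho)
        p = rho_new * rho * p + (2.0 * rho_new / delta) * r
        rho = rho_new
    return x
```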

Table 1 Computational complexity of different preconditioners

5 Numerical results

In this section, we compare the numerical performance of the preconditioners discussed above by solving the Stokes control model (2.1) in Ω = [0, 1]2 with different desired states yd and different boundary conditions.

Example 5.1

y = yd on the boundary and yd(x1,x2) = (yd,1(x1,x2),yd,2(x1,x2)), given by \(y_{d,1}(x_{1},x_{2}) = 10 \displaystyle \frac {\partial }{\partial x_{2}}(\varphi (x_{1})\varphi (x_{2}))\) and \(y_{d,2}(x_{1},x_{2}) = -10 \displaystyle \frac {\partial }{\partial x_{1}}(\varphi (x_{1})\varphi (x_{2}))\) with \(\varphi (z)= (1-\cos (0.8\pi z))(1-z)^{2}\).

Example 5.2

The desired state is \(y_{d} = x_{2}\mathbf{i} - x_{1}\mathbf{j}\), and y = 0 on the boundary, except on \(x_{1} = 1, 0 \leq x_{2} \leq 1\), where \(y = -\mathbf{j}\).

In our experiments, all systems with the mass matrices M and Mp were solved utilizing the Chebyshev semi-iteration method, performed until the relative residual is reduced below \(10^{-4}\), with a maximum of 20 steps. We employed the AMG routine HSL_MI20 from the Harwell Subroutine Library [8]; this routine performs two V-cycles with two pre- and post-smoothing (symmetric Gauss-Seidel) steps to solve linear systems with Fp and M + F. Saddle point subsystems with W + T are solved utilizing the flexible preconditioned GMRES method with the preconditioner \(\mathcal {P}_{tri}\); these inner iterations are terminated once the relative residual is reduced below \(10^{-4}\). The symmetric preconditioners \(\mathcal {P}_{nsn}\) and \(\mathcal {P}_{cta}\) from (4.1) and (4.2), respectively, are utilized to precondition the MINRES method for solving the linear system with the matrix in (4.3). The systems preconditioned by the non-symmetric preconditioners \(\mathcal {P}_{presb}\), F, and \(\widetilde{\textbf {F}}\) from (4.4), (2.5), and (3.2), respectively, are solved by the preconditioned GMRES method. In our implementations, all iteration processes start from zero initial vectors and terminate as soon as the relative residual of (2.2) is less than \(10^{-6}\) or the maximum number of iterations \(IT_{\max} = 500\) is reached. The number of iterations (denoted IT) and the CPU time in seconds (denoted CPU) are reported for different mesh sizes h and regularization parameters β, together with the number of degrees of freedom (denoted DOF). For the PMHSS and PRESB preconditioned GMRES methods, the average number of inner iterations (denoted IT_inner) for solving the saddle point subsystems is listed in brackets. All tests were performed in Matlab 8.0.0 (64 bit).
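For reference, the outer loop can be expressed compactly as follows (a sketch in Python/SciPy rather than the Matlab used in our experiments; `A` is the sparse coefficient matrix of (2.2) and `rbc_inv` is a one-argument closure over B, the inner solvers, and the block sizes, as sketched in Section 3):

```python
# Outer GMRES with the RBC preconditioner wrapped as a LinearOperator.
# The keyword `rtol` assumes SciPy >= 1.12; older releases call it `tol`.
from scipy.sparse.linalg import LinearOperator, gmres

def solve_stokes_control(A, g, rbc_inv, rtol=1e-6, maxiter=500):
    n2 = A.shape[0]
    M_op = LinearOperator((n2, n2), matvec=rbc_inv)   # z ~= F~^{-1} r
    x, info = gmres(A, g, M=M_op, rtol=rtol, maxiter=maxiter)
    return x, info                                    # info == 0: converged
```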

In Table 2, we list the IT and CPU values of the different preconditioned Krylov subspace methods for solving Example 5.1. As noted in previous studies, NSN preconditioned MINRES and CTA preconditioned MINRES are mesh-independent methods, but their convergence depends on the regularization parameter β; both preconditioners perform better with smaller values of β.

Table 2 Problem 1: IT, IT_inner (in brackets), and CPU of different Krylov subspace methods

When \(\upbeta = 10^{-2}\) and \(h = 2^{-7}\), the CTA preconditioned MINRES method breaks down because of the inexact implementation of the preconditioning. The data in Table 2 reveal that, similar to the PRESB preconditioned GMRES method, the PMHSS and RBC preconditioned GMRES methods are not only mesh size independent, but also parameter independent. Furthermore, the RBC preconditioner saves more than 50% of the CPU time compared with the PMHSS preconditioner for small mesh sizes, and it outperforms the other preconditioners, including the PRESB preconditioner, in terms of CPU time in most cases. Regarding the number of iterations, the PRESB preconditioned GMRES method outperforms the other iteration methods; however, it has to solve saddle point problems by inner loops, as does the PMHSS preconditioned GMRES method. The data in Table 3 support these conclusions.

Table 3 Problem 2: IT, IT_inner (in brackets), and CPU of different Krylov subspace methods

We report the cost functional \(\mathcal {J}\) of the distributed control problem in Example 5.1 for \(h = 2^{-6}\) in Table 4. The results generated by the two methods show no apparent difference. As β decreases, the cost functional decreases and the state y approaches the desired state yd; however, the control u becomes large. This indicates that \(\upbeta = 10^{-10}\) is a proper regularization parameter for Example 5.1.

Table 4 The cost functional of the distributed control problem 5.1

In Figs. 1 and 2, we plot the velocity, control, and pressure for Examples 5.1 and 5.2 for the case of \(h = 2^{-6}\) and \(\upbeta = 10^{-6}\), computed by the RBC preconditioned GMRES method. The maximum value of the velocity y in the 2-norm is 1.00, and the maximum value of the control u in the 2-norm is 57.7. These results are consistent with those in [1, 25]. Additionally, the pressure and velocity plots in Fig. 2 coincide with the results in [1].

Fig. 1 Problem 1: State y, control u, and pressure p for \(h = 2^{-6}\) and \(\upbeta = 10^{-6}\)

Fig. 2 Problem 2: State y, control u, and pressure p for \(h = 2^{-6}\) and \(\upbeta = 10^{-6}\)

6 Conclusion

In this paper, we studied PMHSS-type iteration methods and the corresponding preconditioners for solving Stokes control problems. By approximating the saddle point matrix in the PMHSS preconditioner by a product of block lower and upper triangular factors, a more practical preconditioner, the RBC preconditioner, was designed; it avoids solving saddle point systems in every iteration. The eigenvalue distribution of the preconditioned matrix is favorable, and numerical experiments demonstrated that the preconditioning behavior is independent of the mesh size of the discretization as well as of the regularization parameter. The proposed preconditioning can also be extended to more complicated Navier-Stokes control problems.