
1 Introduction

Fluid flow in porous media appears in many applications in geomechanics, environmental problems, biomechanics etc. This paper is devoted to a model of nonstationary flow in fully saturated media. The model is based on Darcy's law and an assumption of nonzero storativity, i.e. the ability of a volume to hold more fluid as the fluid pressure increases. The storativity results from a slight compressibility of the fluid and deformability of the solid matrix.

The discretization of the porous media flow problem in space is done by mixed finite elements, namely the lowest order Raviart-Thomas elements, see [10]. The Euler type systems with generalized saddle point matrices, which appear in each time step of the implicit Euler method, are then solved by the MINRES or FGMRES method with a Schur complement type preconditioner.

The computational cost is concentrated in solving the Euler type systems, and especially in the preconditioning of the velocity block of the preconditioner. To this end, a highly parallelizable one-level additive Schwarz method can be used, see the analysis in [1]. Both the analysis and numerical experiments show that this Schwarz method is highly efficient for a class of flow problems with material parameters corresponding to many applications in geosciences.

The content of this paper is as follows. Section 2 describes the nonstationary Darcy problem and its discretization. The block preconditioners for the systems arising in each time step of the implicit Euler method are described in Sect. 3. The Schwarz method for solving the velocity block systems is then described in Sect. 4, together with an analysis which strengthens the results from [1]. Section 5 discusses the implementation on parallel computers and provides numerical experiments on a massively parallel computer.

The main conclusion is that the combination of the block preconditioners and the additive Schwarz method provides efficient and highly parallelizable preconditioners for a class of nonstationary Darcy flow problems with parameters corresponding to many applications in geosciences and other fields.

2 Nonstationary Darcy Flow Problem and Its Discretization

The nonstationary Darcy problem for very slightly compressible liquid and matrix can be written in the following mixed form

$$\begin{aligned} \begin{aligned}K^{-1}v+\nabla p&=0\,\,\,\text {in}\,\,\Omega ,\\ \nabla \cdot v+c_{pp}\partial _{t}p&=f\,\,\,\text {in}\,\,\Omega , \end{aligned} \end{aligned}$$
(1)

where \(\Omega \subset R^{d}\) is the problem domain, p is the fluid pressure, v is the Darcy velocity, \(K\in R^{d\times d}\) is the permeability, represented in general anisotropic media by a symmetric positive definite matrix, and \(c_{pp}\) is the storativity constant. Note that in Sects. 4 and 5, we restrict ourselves to isotropic media \(K=kI\), where \(k=k(x)\ge k_{0}>0\) and I is the identity matrix. The model is described in detail e.g. in [11].

The weak formulation of the problem (1) leads to finding the pair \((v,\,p)\), \(v=v(x,t)\) and \(p=p(x,t)\), which fulfils a mixed variational identity in \(V\times X=H(div,\,\Omega )\times L_{2}(\Omega )\), see e.g. [13].

We assume discretization in \(V_{h}\times X_{h}\) with Raviart-Thomas finite elements on squares [10] for velocity and piecewise constant functions for pressure. The choice of bases \(V_{h}=span\left\{ \psi _{i}\right\} \) and \(X_{h}=span\left\{ \phi _{i}\right\} \) and the induced isomorphisms \(V_{h}\leftrightarrow \varvec{V}_{h}\equiv R^{N}\), \(v_{h}\leftrightarrow \varvec{v}\), \(X_{h}\leftrightarrow \varvec{X}_{h}\equiv R^{Z}\), \(p_{h}\leftrightarrow \varvec{p}\) then provide a differential algebraic system of the form

$$ \mathcal {A}_{1}\frac{d}{dt}\mathcal {U}+\mathcal {A}\mathcal {U}=\mathcal {F} $$
$$ \mathcal {A}_{1}=\left[ \begin{array}{cc} 0 &{} 0\\ 0 &{} -C \end{array}\right] ,\,\,\,\mathcal {A}=\left[ \begin{array}{cc} M &{} B^{T}\\ B &{} 0 \end{array}\right] ,\,\,\,\mathcal {U}=\left[ \begin{array}{c} \varvec{v}\\ \varvec{p} \end{array}\right] , $$

where \(M\in R^{N\times N}\), \(B\in R^{Z\times N}\) and \(C\in R^{Z\times Z}\) are matrices defined by the following identities

$$\begin{aligned} \left\langle M\varvec{u},\,\varvec{v}\right\rangle&=\intop _{\Omega }K^{-1}u_{h}\cdot v_{h}\,d\Omega ,\,\,\,\left\langle B\varvec{v},\,\varvec{p}\right\rangle =-\intop _{\Omega }div(v_{h})p_{h}\,d\Omega \,\,\,\,\forall v_{h}\in V_{h},\,p_{h}\!\in \! X_{h}, \end{aligned}$$
(2)
$$\begin{aligned} \left\langle C\varvec{p},\,\varvec{q}\right\rangle =\intop _{\Omega }c_{pp}p_{h}q_{h}\,d\Omega ,\,\,\,p_{h},\,q_{h}\in X_{h}, \end{aligned}$$
(3)

where \(\left\langle \cdot ,\cdot \right\rangle \) denotes Euclidean inner product.

Note that the regularity of the solution will be low when there are large jumps in the permeability, so we restrict ourselves to the lowest order Raviart-Thomas elements for the discretization of velocity.

Implicit Euler method [13] uses time discretization \(0=t_{0}<\ldots<t_{k}<\ldots \) and computes the values \(\mathcal {U}^{k}=\left[ \begin{array}{c} v^{k}\\ p^{k} \end{array}\right] \) in the time steps \(t_{k}\), \(k\ge 1\) by solving the systems with the matrices \(\mathcal {A}_{E}\),

$$\begin{aligned} \mathcal {A}_{E}\mathcal {U}^{k+1}=\mathcal {F}^{k+1}+\frac{1}{\tau _{k}}\mathcal {A}_{1}\mathcal {U}^{k}, \end{aligned}$$
(4)
$$\begin{aligned} \mathcal {A}_{E}=\frac{1}{\tau _{k}}\mathcal {A}_{1}+\mathcal {A}=\left[ \begin{array}{cc} M &{} B^{T}\\ B &{} -\frac{1}{\tau _{k}}C \end{array}\right] . \end{aligned}$$
(5)

The time step \(\tau _{k}=t_{k+1}-t_{k}\) can be variable or fixed. In the analysis of preconditioners, we use the notation \(\tau _{k}\equiv \tau \) without a loss of generality since we always consider solving system with matrix (5) during one specific timestep.
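The time stepping (4)-(5) can be illustrated by a minimal NumPy sketch. It assumes small dense stand-in matrices (a random symmetric positive definite M, a random B and a diagonal C) rather than an actual finite element assembly, and the helper names `euler_matrix` and `euler_step` are hypothetical.

```python
import numpy as np

def euler_matrix(M, B, C, tau):
    """Assemble A_E = [[M, B^T], [B, -C/tau]] from (5) as a dense array."""
    return np.block([[M, B.T], [B, -C / tau]])

def euler_step(M, B, C, tau, U_old, F_new):
    """One implicit Euler step (4): A_E U_new = F_new + (1/tau) A_1 U_old."""
    N = M.shape[0]
    rhs = F_new.copy()
    # (1/tau) A_1 U_old only acts on the pressure part: -(1/tau) C p_old
    rhs[N:] -= C @ U_old[N:] / tau
    return np.linalg.solve(euler_matrix(M, B, C, tau), rhs)

# Stand-in data (assumption: random matrices, not an RT(0)-P0 assembly)
rng = np.random.default_rng(0)
N, Z, tau = 6, 3, 0.1
G = rng.standard_normal((N, N))
M = G @ G.T + N * np.eye(N)             # symmetric positive definite
B = rng.standard_normal((Z, N))
C = np.diag(rng.uniform(1.0, 2.0, Z))   # diagonal, as for piecewise constant pressure
U0 = rng.standard_normal(N + Z)
F1 = rng.standard_normal(N + Z)
U1 = euler_step(M, B, C, tau, U0, F1)
```

In a real computation the dense solve would be replaced by the preconditioned Krylov methods discussed in the following sections.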

3 Preconditioning of the Euler Type Systems

The preconditioners for \(\mathcal {A}_{E}\) are based on the two by two partition shown in (5). Since the matrix C is diagonal for the piecewise constant approximation for pressure, we can consider the Schur complement \(M_{C}=M+\tau B^{T}C^{-1}B\) and either block diagonal or block triangular preconditioners with the Schur complement block \(M_{C}\),

$$ \mathcal {P}_{D}=\left[ \begin{array}{cc} M_{C} &{} 0\\ 0 &{} \frac{1}{\tau }C \end{array}\right] \,\,\,\text {and}\,\,\,\mathcal {P}_{T}=\left[ \begin{array}{cc} M_{C} &{} B^{T}\\ 0 &{} -\frac{1}{\tau }C \end{array}\right] $$

The block diagonal preconditioner \(\mathcal{P}_{D}\) is positive definite and in the ideal case (when the problems connected with \(M_{C}\) and C are solved exactly) can be combined with MINRES method. The convergence is then driven by spectral properties, specifically by the following localization of the spectrum of the preconditioned matrix

$$ \sigma (\mathcal{P}_{D}^{-1}\mathcal {A}_{E})\subset \left\langle \frac{-1-\sqrt{5}}{2},-1\right\rangle \cup \left\langle \frac{-1+\sqrt{5}}{2},\,1\right\rangle . $$

The proof of this localization can be found e.g. in [6, 9]. Note that this result depends only on the algebraic structure of the matrix \(\mathcal {A}_{E}\) and the localization result is robust with respect to the material parameters (permeability, storativity) of the model and discretization parameters (\(h,\,\tau )\).
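The localization can be checked numerically on a small example. The sketch below assumes random stand-in matrices with the same algebraic structure as \(\mathcal {A}_{E}\) (symmetric positive definite M, diagonal positive C) and computes the spectrum through a symmetric reformulation with the Cholesky factor of \(\mathcal{P}_{D}\); it is an illustration, not the authors' code.

```python
import numpy as np

rng = np.random.default_rng(1)
N, Z, tau = 8, 4, 0.1
G = rng.standard_normal((N, N))
M = G @ G.T + N * np.eye(N)             # SPD stand-in for the velocity block
B = rng.standard_normal((Z, N))
C = np.diag(rng.uniform(0.5, 2.0, Z))   # diagonal positive stand-in

M_C = M + tau * B.T @ np.linalg.solve(C, B)
A_E = np.block([[M, B.T], [B, -C / tau]])
P_D = np.block([[M_C, np.zeros((N, Z))], [np.zeros((Z, N)), C / tau]])

# Eigenvalues of P_D^{-1} A_E equal those of L^{-1} A_E L^{-T}, P_D = L L^T
L = np.linalg.cholesky(P_D)
X = np.linalg.solve(L, A_E)
S = np.linalg.solve(L, X.T).T
eigs = np.linalg.eigvalsh((S + S.T) / 2)

tol = 1e-8
lo = (-1 - np.sqrt(5)) / 2
hi = (-1 + np.sqrt(5)) / 2
in_neg = (eigs >= lo - tol) & (eigs <= -1 + tol)
in_pos = (eigs >= hi - tol) & (eigs <= 1 + tol)
inside = bool(np.all(in_neg | in_pos))
print(eigs)
print(inside)
```

Changing the scaling of M, C or the value of `tau` moves the eigenvalues within the two intervals but should not move them outside, in line with the robustness noted above.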

Even stronger localization of spectrum of the preconditioned system occurs for the block triangular preconditioner,

$$ \mathcal{P}_{T}^{-1}\mathcal {A}_{E}=\mathcal{P}_{T}^{-1}\left[ \begin{array}{cc} M &{} B^{T}\\ B &{} -\frac{1}{\tau }C \end{array}\right] =\mathcal{P}_{T}^{-1}\begin{bmatrix}M_{C}&B^{T}\\ 0&-\frac{1}{\tau }C \end{bmatrix}\begin{bmatrix}I_{1}&0\\ -\tau C^{-1}B&I_{2} \end{bmatrix}. $$

Thus in the ideal case (exact solvers for \(M_{C}\) and C, no influence of finite arithmetic) it holds that

$$ \mathcal{P}_{T}^{-1}\mathcal {A}_{E}=\begin{bmatrix}I_{1}&0\\ -\tau C^{-1}B&I_{2} \end{bmatrix} $$

and consequently \(\sigma \left( \mathcal{P}_{T}^{-1}\mathcal {A}_{E}\right) =\left\{ 1\right\} \) and \(\mathcal {P}_{T}^{-1}\mathcal {A}_{E}\) has a minimal polynomial of order two, \(\left( \mathcal {P}_{T}^{-1}\mathcal {A}_{E}-I\right) ^{2}=0\). The block triangular preconditioner spoils the symmetry, which requires the use of a nonsymmetric solver, e.g. the GMRES method. In the ideal case, two iterations of a Krylov space method such as GMRES are then sufficient to solve the system.
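A short numerical illustration of the ideal case, again assuming random stand-in matrices: since \(T=\mathcal{P}_{T}^{-1}\mathcal {A}_{E}\) satisfies \((T-I)^{2}=0\), its inverse is \(2I-T\), so the solution lies in the Krylov space spanned by \(y\) and \(Ty\) with \(y=\mathcal{P}_{T}^{-1}b\).

```python
import numpy as np

rng = np.random.default_rng(2)
N, Z, tau = 8, 4, 0.1
G = rng.standard_normal((N, N))
M = G @ G.T + N * np.eye(N)             # SPD stand-in
B = rng.standard_normal((Z, N))
C = np.diag(rng.uniform(0.5, 2.0, Z))

M_C = M + tau * B.T @ np.linalg.solve(C, B)
A_E = np.block([[M, B.T], [B, -C / tau]])
P_T = np.block([[M_C, B.T], [np.zeros((Z, N)), -C / tau]])

# Preconditioned matrix and the nilpotent part T - I
T = np.linalg.solve(P_T, A_E)
E = T - np.eye(N + Z)
nilpotency_error = np.linalg.norm(E @ E)

# (T - I)^2 = 0 implies T^{-1} = 2I - T, i.e. x = 2y - T y with y = P_T^{-1} b
b = rng.standard_normal(N + Z)
y = np.linalg.solve(P_T, b)
x = 2 * y - T @ y
residual = np.linalg.norm(A_E @ x - b)
print(nilpotency_error, residual)
```

With inexact inner solvers this identity holds only approximately, which is why a flexible outer method is used in Sect. 5.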

In the implementation described in Sect. 5, we use an exact solver for the block C but inexact solver for the velocity block \(M_{C}\). This solver uses conjugate gradient (CG) method with one level additive Schwarz preconditioning described in the next section. Numerical experiments show that when the accuracy of the inner solver is reasonably good, the outer iterations realized by flexible GMRES behave similarly to the ideal case.

4 Additive Schwarz Method for the Velocity Block

Both preconditioners presented in the previous section require the solution of the system with the Schur complement matrix \(M_{C}\). This matrix is symmetric and positive definite, so the system can be solved by the conjugate gradient method. This section describes a one level additive Schwarz preconditioner for the Schur complement system, including the theory based on element-by-element analysis. The results presented in this section extend the results from [1] by considering a nonzero block C and by utilising the elementwise computed maximum contrast \(c_{pp}^{-1}k\).

The preconditioner \(P_{AS}\) is defined via a decomposition of the finite element space \(V_{h}=V_{1}+\cdots +V_{m}\), where the subspaces \(V_{k}\) are defined via an overlapping decomposition of the domain \(\Omega \), \(\bar{\Omega }=\bar{\Omega }_{1}\cup \ldots \cup \bar{\Omega }_{m}\). We assume that \(\bar{\Omega }_{k}\) are aligned with the finite element division \(\mathcal {T}_{h}\), which is used for definition of \(V_{h}\) (lowest order Raviart-Thomas RT(0) elements). Then

$$ V_{k}=\left\{ v_{h}\in V_{h},\,\,v_{h}\equiv 0\,\,\text {in}\,\,\Omega \setminus \Omega _{k}\right\} . $$

Functions \(v_{h}\in V_{h}\) are represented by algebraic vectors \(\varvec{v}\in \varvec{V}_{h}\equiv R^{N}\) through the isomorphism \(V_{h}\leftrightarrow \varvec{V}_{h}\); the same isomorphism provides the relations \(V_{k}\leftrightarrow \varvec{V}_{k}\equiv R^{N_{k}}\). The inclusion \(V_{k}\subset V_{h}\) induces the restriction \(\varvec{V}_{h}\rightarrow \varvec{V}_{k}\) represented by the matrix \(R_{k}\in R^{N_{k}\times N}\). Then the preconditioner \(P_{AS}\) to \(M_{C}\) can be defined as

$$\begin{aligned} P_{AS}^{-1}=\sum _{k=1}^{m}\,R_{k}^{T}M_{C_{k}}^{-1}R_{k},\,\,\,M_{C_{k}}=R_{k}M_{C}R_{k}^{T}. \end{aligned}$$
(6)
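The algebraic form (6) can be sketched in a few lines of NumPy. The example assumes a 1D stand-in matrix (a mass-like tridiagonal matrix plus a small stiffness-like term, mimicking \(M+\tau B^{T}C^{-1}B\)) instead of an RT(0) assembly, uses contiguous overlapping index blocks for the restrictions \(R_{k}\), and inverts the local blocks densely. All function names are hypothetical.

```python
import numpy as np

def overlapping_blocks(n, m, overlap):
    """Split indices 0..n-1 into m contiguous blocks, each enlarged by `overlap`."""
    bounds = np.linspace(0, n, m + 1).astype(int)
    return [np.arange(max(bounds[k] - overlap, 0), min(bounds[k + 1] + overlap, n))
            for k in range(m)]

def make_additive_schwarz(A, blocks):
    """Return r -> sum_k R_k^T (R_k A R_k^T)^{-1} R_k r, cf. (6)."""
    local_inv = [np.linalg.inv(A[np.ix_(idx, idx)]) for idx in blocks]
    def apply(r):
        z = np.zeros_like(r)
        for idx, A_inv in zip(blocks, local_inv):
            z[idx] += A_inv @ r[idx]
        return z
    return apply

def pcg(A, b, apply_prec, tol=1e-6, maxit=500):
    """Preconditioned CG, stopped on the relative unpreconditioned residual."""
    x = np.zeros_like(b)
    r = b.copy()
    z = apply_prec(r)
    p = z.copy()
    rz = r @ z
    b_norm = np.linalg.norm(b)
    for it in range(1, maxit + 1):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) <= tol * b_norm:
            return x, it
        z = apply_prec(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, maxit

# Stand-in SPD matrix (assumption): mass-like part plus a small stiffness-like part
n = 200
mass = (4 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)) / 6
stiff = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
A = mass + 1e-2 * stiff

blocks = overlapping_blocks(n, m=8, overlap=4)
P_inv = make_additive_schwarz(A, blocks)
b = np.ones(n)
x, iters = pcg(A, b, P_inv)
print(iters)
```

The local solves are independent of each other, which is the source of the parallelism: each subdomain block can be factorized and applied on its own process.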

The matrix \(M_{C}=M+\tau B^{T}C^{-1}B\) is created from the matrices defined variationally in (2) and (3). For a subsequent analysis, it is important that \(M_{C}\) can be also defined variationally, in particular

$$ \left\langle M_{C}\varvec{u},\,\varvec{v}\right\rangle =m(u_{h},\,v_{h})+\tau d(u_{h},\,v_{h})=a(u_{h},\,v_{h}), $$

where \(m(u_{h},\,v_{h})=\intop _{\Omega }k^{-1}u_{h}v_{h}\,dx,\) we presume that k is constant on mesh elements from \(\mathcal {T}_{h}\), \(k=k_{E}\) on \(E\in \mathcal {T}_{h}\), and \(d(u_{h},\,v_{h})\) is defined as follows

$$\begin{aligned} d(u_{h},\,v_{h})&=\left\langle B^{T}C^{-1}B\varvec{u},\,\varvec{v}\right\rangle =\left\langle C^{-1}B\varvec{u},\,B\varvec{v}\right\rangle \\&=\sum _{i}c_{ii}^{-1}\left( \intop _{\Omega }\,div(u_{h})\phi _{i}\,dx\right) \left( \intop _{\Omega }\,div(v_{h})\phi _{i}\,dx\right) \\&=\sum _{E}c_{pp}^{-1}\left| E\right| ^{-1}\left( \intop _{E}\,div(u_{h})\,dx\right) \left( \intop _{E}\,div(v_{h})\,dx\right) . \end{aligned}$$

The summation above is over \(E\in \mathcal {T}_{h}\) and we use the fact that the basis functions \(\phi _{i}\) of the space \(X_{h}\) are equal to 1 on \(E=E_{i}\in \mathcal {T}_{h}\) and to zero on the other elements \(E\ne E_{i}\). Moreover, \(c_{ij}=\delta _{ij}\intop _{E_{i}}c_{pp}\,dx=\delta _{ij}c_{pp}\left| E_{i}\right| \), where \(\delta _{ij}\) is the Kronecker symbol. As \(div(u_{h})\) is constant on E for an RT(0) function \(u_{h}\),

$$\begin{aligned} d(u_{h},\,v_{h})&=\sum _{E}c_{pp}^{-1}\left| E\right| ^{-1}div(u_{h})div(v_{h})\left| E\right| ^{2}\\&=\sum _{E}\intop _{E}c_{pp}^{-1}\,div(u_{h})div(v_{h})\,dx \end{aligned}$$

and we conclude that \(a(u_{h},\,v_{h})\) is a weighted H(div) inner product, which guarantees positive definiteness of a.

The condition number of the preconditioned matrix \(P_{AS}^{-1}M_{C}\) can be bounded by

$$\begin{aligned} cond(P_{AS}^{-1}M_{C})\le c_{0}c_{1}, \end{aligned}$$
(7)

see e.g. [7, 8], where the constants \(c_{0}\), \(c_{1}\) come from the conditions

$$\begin{aligned} \forall v_{h}\in V_{h},\,\exists v_{k}\in V_{k},\quad v_{h}=\sum _{k=1}^{m}\,v_{k}:\,\,\,\sum _{k=1}^{m}a(v_{k},\,v_{k})\le c_{0}a(v_{h},\,v_{h}), \end{aligned}$$
(8)
$$\begin{aligned} \forall v_{h}\in V_{h},\,\forall v_{k}\in V_{k},\quad v_{h}=\sum _{k=1}^{m}\,v_{k}:\,\,\,a(v_{h},\,v_{h})\le c_{1}\sum _{k=1}^{m}a(v_{k},\,v_{k}). \end{aligned}$$
(9)

The rest of this section is devoted to determining the values of \(c_{0},c_{1}.\) It is easy to show that \(c_{1}\) can be taken as \(c_{nis}\), the maximal number of mutually intersecting subdomains.

The estimate of \(c_{0}\) is more complicated and requires a suitable construction of the decomposition of the elements \(v\in \varvec{V}\). To derive the estimate we analyse the decomposition provided by

$$ v=\sum _{k=1}^{m}\,v_{k},\,\,\,v_{k}=\Pi _{h}^{RT}(\theta _{k}v), $$

where \(\theta _{k}\) are functions of a partition of unity [18],

$$ 1=\sum _{k=1}^{m}\theta _{k},\,\,\,supp(\theta _{k})=\bar{\Omega }_{k},\,\,0\le \theta _{k}\le 1,\,\,\left\| grad(\theta _{k})\right\| \le c\delta ^{-1}, $$

where \(\delta \) is an overlap (usually a nonoverlapping decomposition \(\bar{\Omega }=\bar{\Omega }_{1}^{0}\cup \ldots \cup \bar{\Omega }_{m}^{0}\) is enlarged to overlapping one by construction of subdomains \(\Omega _{k}=\left\{ x\in \Omega ,\,\,dist(x,\,\Omega _{k}^{0})\le \delta \right\} \)).

Our analysis will make use of the Raviart-Thomas interpolation \(\Pi _{h}^{RT}:\,\,C(\Omega )\rightarrow RT_{0}\) given by

$$ \Pi _{h}^{RT}v=\sum _{i}\,\left( \frac{1}{\left| e_{i}\right| }\intop _{e_{i}}v\cdot n_{e_{i}}\,ds\right) \psi _{i}, $$

where the summation goes over the degrees of freedom (located on edges of the elements), see [15].

We will show that

$$\begin{aligned} \sum _{k}m(v_{k},\,v_{k})\le \kappa c_{nis}m(v_{h},\,v_{h}), \end{aligned}$$
(10)
$$\begin{aligned} \sum _{k}d(v_{k},\,v_{k})\le 2c_{nis}d(v_{h},\,v_{h})+2c_{nis}\delta ^{-2}\max _{E}\left\{ c_{pp}^{-1}k_{E}\right\} m(v_{h},\,v_{h}). \end{aligned}$$
(11)

The constant \(\kappa \) will be determined later in the analysis.

To derive the estimate (10) we consider

$$\begin{aligned} \sum _{k}m(v_{k},\,v_{k})&=\sum _{k}\sum _{E\subset \bar{\Omega }_{k}}\intop _{E}\,k_{E}^{-1}v_{k}\cdot v_{k}=\sum _{k}\sum _{E\subset \bar{\Omega }_{k}}k_{E}^{-1}\intop _{E}\,v_{k}\cdot v_{k}\\&\le c_{nis}\sum _{E\in \mathcal {T}_{h}}k_{E}^{-1}\intop _{E}\,\left\| \Pi _{h}^{RT}\left( \theta _{k}v_{h}\right) \right\| ^{2}\le c_{nis}\kappa \sum _{E}\intop _{E}\,k_{E}^{-1}\left\| v_{h}\right\| ^{2}\\&\le c_{nis}\kappa m(v_{h},\,v_{h}). \end{aligned}$$

Above, we used fact that \(\Pi _{h}^{RT}\left( \theta _{k}v_{h}\right) |_{E}=\sum _{i}z_{i}\hat{\psi }_{i}\) where \(\hat{\psi }_{i}\) are local (element) \(RT_{0}\) basis functions and

$$ z_{i}=\frac{1}{\left| e_{i}\right| }\intop _{e_{i}}(\theta _{k}v_{h})\cdot n_{e_{i}}\,ds=\frac{1}{\left| e_{i}\right| }\intop _{e_{i}}\theta _{k}(v_{h}\cdot n_{e_{i}})\,ds=(v_{h}\cdot n_{e_{i}})\frac{1}{\left| e_{i}\right| }\intop _{e_{i}}\theta _{k}\,ds,\quad \text {hence}\quad \left| z_{i}\right| \le \left| v_{h}\cdot n_{e_{i}}\right| , $$

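as \(v_{h}\cdot n_{e_{i}}\) is constant on \(e_{i}\) for \(v_{h}\in RT_{0}\) and \(0\le \theta _{k}\le 1\). Therefore, denoting by \(\varvec{z}=(z_{i})\) and \(\varvec{w}=(w_{i})\), \(w_{i}=v_{h}\cdot n_{e_{i}}\), the local degrees of freedom of \(\Pi _{h}^{RT}(\theta _{k}v_{h})\) and \(v_{h}\) on E, so that \(\left| z_{i}\right| \le \left| w_{i}\right| \), we get

$$ \intop _{E}\,\left\| \Pi _{h}^{RT}(\theta _{k}v_{h})\right\| ^{2}dx=\left\langle M_{E}\varvec{z},\,\varvec{z}\right\rangle \le \mu _{\max }(M_{E})\left\langle \varvec{z},\,\varvec{z}\right\rangle \le \mu _{\max }(M_{E})\left\langle \varvec{w},\,\varvec{w}\right\rangle \le \frac{\mu _{\max }(M_{E})}{\mu _{\min }(M_{E})}\left\langle M_{E}\varvec{w},\,\varvec{w}\right\rangle =\kappa \intop _{E}\,\left\| v_{h}\right\| ^{2}dx, $$

where \(M_{E}\) is the velocity mass matrix, \((M_{E})_{ij}=\intop _{E}\,\hat{\psi }_{i}\cdot \hat{\psi }_{j}\) and \(\kappa =\frac{\mu _{\max }(M_{E})}{\mu _{\min }(M_{E})}\), i.e. the condition number of the local mass matrix.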

To prove (11), we investigate \(\sum _{k}d(v_{k},\,v_{k})\)

$$\begin{aligned} \sum _{k}d(v_{k},\,v_{k})&=\sum _{k}\sum _{E}c_{pp}^{-1}\left| E\right| ^{-1}\left( \intop _{E}\,div\left( \Pi _{h}^{RT}(\theta _{k}v_{h})\right) \,dx\right) ^{2}\\&=\sum _{k}\sum _{E}c_{pp}^{-1}\left| E\right| ^{-1}\left( \intop _{E}\,div(\theta _{k}v_{h})\,dx\right) ^{2}\qquad (12)\\&=\sum _{k}\sum _{E}c_{pp}^{-1}\left| E\right| ^{-1}\left( \intop _{E}\,\theta _{k}div(v_{h})+grad(\theta _{k})\cdot v_{h}\,dx\right) ^{2}\\&\le 2c_{nis}\sum _{E}c_{pp}^{-1}\left| E\right| ^{-1}\left( \intop _{E}\,\theta _{k}div(v_{h})\,dx\right) ^{2}\\&\,\,\,+2c_{nis}\sum _{E}c_{pp}^{-1}\left| E\right| ^{-1}\left( \intop _{E}\,grad(\theta _{k})\cdot v_{h}\,dx\right) ^{2}\\&\le 2c_{nis}\sum _{E}c_{pp}^{-1}\left| E\right| ^{-1}\left( div(v_{h})\intop _{E}\,\theta _{k}\,dx\right) ^{2}\qquad (13)\\&\,\,\,+2c_{nis}\sum _{E}c_{pp}^{-1}\left| E\right| ^{-1}\left( \intop _{E}\,\left\| grad(\theta _{k})\right\| ^{2}\,dx\right) \left( \intop _{E}\,\left\| v_{h}\right\| ^{2}\,dx\right) \\&\le 2c_{nis}\sum _{E}c_{pp}^{-1}\left| E\right| ^{-1}div(v_{h})^{2}\left| E\right| ^{2}\\&\,\,\,+2c_{nis}\sum _{E}c_{pp}^{-1}\left| E\right| ^{-1}\left( \delta ^{-2}\left| E\right| \right) k_{E}\left( \intop _{E}k_{E}^{-1}\,\left\| v_{h}\right\| ^{2}\,dx\right) \\&\le 2c_{nis}d(v_{h},\,v_{h})+2c_{nis}\delta ^{-2}\max _{E}\left\{ c_{pp}^{-1}k_{E}\right\} m(v_{h},\,v_{h}). \end{aligned}$$

In (13) we use the fact that \(div(v_{h})\) is constant on elements from \(\mathcal {T}_{h}\), while (12) follows from

$$ \intop _{E}\,div\left( \Pi _{h}^{RT}(\theta _{k}v)\right) \,dx=\intop _{E}\,div(\theta _{k}v)\,dx, $$

i.e.

$$\begin{aligned} 0&=\intop _{E}\,div\left( \theta _{k}v-\Pi _{h}^{RT}(\theta _{k}v)\right) \,dx\\&=\intop _{\partial E}\,\left( \theta _{k}v-\Pi _{h}^{RT}(\theta _{k}v)\right) \cdot n\,ds\\&=\sum _{i}\,\intop _{e_{i}}\theta _{k}v\cdot n\,ds-\intop _{e_{i}}\left( \frac{1}{\left| e_{i}\right| }\intop _{e_{i}}\theta _{k}v\cdot n_{e_{i}}\,ds\right) n_{e_{i}}\cdot n\,ds\\&=\sum _{i}\,\intop _{e_{i}}\theta _{k}v\cdot n\,ds-\intop _{e_{i}}\left( \frac{1}{\left| e_{i}\right| }\intop _{e_{i}}\theta _{k}v\cdot (n_{e_{i}}\cdot n)n\,ds\right) n_{e_{i}}\cdot n\,ds\\&=\sum _{i}\,\intop _{e_{i}}\theta _{k}v\cdot n\,ds-(n_{e_{i}}\cdot n)^{2}\intop _{e_{i}}\theta _{k}v\cdot n\,ds=0 \end{aligned}$$

Note that \(n_{e_{i}}\) is an a priori selected normal, which is used for the definition of the degrees of freedom, and n is the outer normal to the element E, so that \(n_{e_{i}}\cdot n=\pm 1\).

The whole estimate is now

$$\begin{aligned} \sum _{k}a(v_{k},v_{k})&=\sum _{k}m(v_{k},\,v_{k})+\tau \sum _{k}d(v_{k},\,v_{k})\\&\le \left( \kappa c_{nis}+2c_{nis}\tau \delta ^{-2}\max _{E}\left\{ c_{pp}^{-1}k_{E}\right\} \right) m(v_{h},\,v_{h})+2c_{nis}\tau d(v_{h},\,v_{h})\\&\le c_{nis}\max \left\{ 2,\,\kappa +2\tau \delta ^{-2}\max _{E}\left\{ c_{pp}^{-1}k_{E}\right\} \right\} a(v_{h},\,v_{h}) \end{aligned}$$

and as \(c_{nis}\) and \(\kappa \) are independent of physical and discretization parameters, the efficiency and robustness of the estimate depend mostly on the term

$$\begin{aligned} c_{AS}=\tau \delta ^{-2}\max _{E}\left\{ c_{pp}^{-1}k_{E}\right\} . \end{aligned}$$
(14)

The above results can be summarized in the following theorem.

Theorem

Let us consider the time step matrix \(\mathcal {A}_{E}\) from (5) with time step \(\tau \) and the Schur complement \(M_{C}=M+\tau B^{T}C^{-1}B\). Let \(P_{AS}\) be the additive Schwarz preconditioner from (6). Then

$$ cond(P_{AS}^{-1}M_{C})\le c_{nis}^{2}\max \left\{ 2,\,\kappa +2\tau \delta ^{-2}\max _{E}\left\{ c_{pp}^{-1}k_{E}\right\} \right\} . $$

Recall that \(c_{nis}\) is the maximum number of mutually overlapping subdomains, \(\delta \) is the overlap of the decomposition, \(\kappa \) is the maximum condition number of the element mass matrices \(M_{E}\), \(c_{pp}\) and \(k_{E}\) are the storativity and permeability, assumed to be constant on the finite elements, and \(\max _{E}\left\{ c_{pp}^{-1}k_{E}\right\} \) is taken over all elements of the finite element division.
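To get a feeling for the size of the bound, the following sketch evaluates the right-hand side of the theorem for illustrative values. The function name and all numbers are assumptions chosen for the example (strip decomposition with \(c_{nis}=2\), \(\kappa =3\), \(\tau =0.1\), overlap \(\delta =0.01\)); they are not taken from the experiments.

```python
def schwarz_condition_bound(c_nis, kappa, tau, delta, max_k_over_cpp):
    """Upper bound from the theorem: c_nis^2 * max(2, kappa + 2 * c_AS)."""
    c_as = tau * max_k_over_cpp / delta**2   # the term (14)
    return c_nis**2 * max(2.0, kappa + 2.0 * c_as)

# Hypothetical inputs: only the contrast max_E{k_E / c_pp} differs between the two calls
favourable = schwarz_condition_bound(2, 3.0, 0.1, 0.01, 1e-5)
unfavourable = schwarz_condition_bound(2, 3.0, 0.1, 0.01, 1.0)
print(favourable, unfavourable)
```

For the small contrast the bound is dominated by \(\kappa \), whereas for a contrast of order one the term (14) dominates and the bound grows like \(\delta ^{-2}\).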

Remark

Note that for 2D and RT(0) elements on squares \(\left\langle x_{01},\,x_{01}+h\right\rangle \times \left\langle x_{02},\,x_{02}+h\right\rangle \), we get

$$ M_{E}=\frac{1}{6}h^{2}\left[ \begin{array}{cccc} 2 &{} 0 &{} -1 &{} 0\\ 0 &{} 2 &{} 0 &{} -1\\ -1 &{} 0 &{} 2 &{} 0\\ 0 &{} -1 &{} 0 &{} 2 \end{array}\right] ,\,\,\,\sigma (M_{E})=\frac{1}{6}h^{2}\left\{ 1;\,3\right\} ,\,\,\,\kappa =3. $$
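The spectrum stated in the remark is easy to confirm numerically; the sketch below uses NumPy and an arbitrary mesh size h, since \(\kappa \) does not depend on it.

```python
import numpy as np

h = 0.25  # any positive mesh size gives the same condition number
M_E = (h**2 / 6) * np.array([[ 2,  0, -1,  0],
                             [ 0,  2,  0, -1],
                             [-1,  0,  2,  0],
                             [ 0, -1,  0,  2]], dtype=float)

mu = np.linalg.eigvalsh(M_E)          # eigenvalues in ascending order
scaled = mu / (h**2 / 6)              # eigenvalues in units of h^2/6
kappa = mu[-1] / mu[0]
print(scaled, kappa)
```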

5 Implementation and Numerical Experiments

The numerical experiments are computed with our own code available at [19] written in C on top of PETSc [5, 16]. Matrices M, \(M_{C}\), B and C are assembled element by element from local contributions. All of these matrices are created and stored in a distributed form using PETSc MatCreateAIJ operation. The system matrix \(\mathcal {A}_{E}\) and the preconditioner matrix \(\mathcal{P}\) are then formed implicitly from blocks using PETSc MATNEST matrix type. The action of the preconditioner, which combines separate preconditioners for individual fields, is provided by PETSc PCFIELDSPLIT operation.

Fig. 1. Model problem

For numerical experiments we use a model problem described by (1) with the zero volume source \(f\equiv 0\), boundary conditions as shown in Fig. 1 and initial condition \(v=0\) and \(p=0\) in \(\Omega \). The problem domain \(\Omega =\left\langle 0,\,1\right\rangle ^{2}\) is regularly divided into square elements with the meshsize characterized by the parameter \(n=1/h\) being the number of segments on the side. Timestep \(\tau =0.1\) is used for all experiments. The permeability of the material is supposed to be isotropic and elementwise constant. Values on each element are in the form \(k=k_{s}k_{r}\), \(\log k=\log k_{s}+\log k_{r},\) where \(k_{r}\) are sampled from lognormal distribution with parameters \(\mu =0\) and \(\sigma =2\), \(\log k\in N(\log k_{s},\,\sigma ^{2})\).
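A permeability field of this type can be sampled as in the following sketch. It assumes that log denotes the natural logarithm and uses NumPy's lognormal generator; the function name `sample_permeability`, the seed and the mesh size are hypothetical.

```python
import numpy as np

def sample_permeability(n, k_s, sigma=2.0, seed=0):
    """Elementwise constant k = k_s * k_r on an n x n mesh, log k_r ~ N(0, sigma^2)."""
    rng = np.random.default_rng(seed)
    k_r = rng.lognormal(mean=0.0, sigma=sigma, size=(n, n))
    return k_s * k_r

k_s = 1e-15
k = sample_permeability(n=100, k_s=k_s)
log_ratio = np.log(k / k_s)           # should be approximately N(0, sigma^2)
print(log_ratio.mean(), log_ratio.std())
```

Note that with \(\sigma =2\) the sampled values spread over several orders of magnitude, so \(\max _{E}k_{E}\) can be much larger than \(k_{s}\).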

FGMRES with the block preconditioner is used to solve the outer system with the matrix \(\mathcal {A}_{E}\) and conjugate gradients with Schwarz preconditioner are used to solve the inner system corresponding to the block \(M_{C}\). The stopping criterion for both outer and inner iterations is the reduction of relative unpreconditioned residual to be equal or less than \(10^{-6}\).

The Schwarz preconditioner for the matrix \(M_{C}\) uses the PETSc PCASM functionality. The decomposition to subdomains for the model problem corresponds to the splitting of the domain into horizontal strips which for the regular mesh and natural numbering of nodes corresponds to row-wise matrix decomposition. In PETSc, the overlap is imposed by adding a proper number of matrix rows to implicitly performed nonoverlapping row-wise matrix splitting. The preconditioner uses LU decomposition for the solution of systems on subdomains. The LU decomposition is computed during the setup of the preconditioner and then it is repeatedly applied during the iterations. Possible generalization of this approach to 3D problem discretized with regular mesh and numbering aligned with domain decomposition on layers is straightforward.

Table 1 shows the scalability of the implementation. We investigate “weak” scaling, with the problem size increasing with the number of subdomains (processors), so that the size of the subproblems does not decrease strongly. Note that the subproblems arise by decomposition of \(\mathcal {A}_{E}\in R^{N_{t}\times N_{t}}\) and \(M_{C}\in R^{N\times N}\). For the problem on an \(n\times n\) mesh and RT(0)-P0 elements, \(N_{t}\sim 3n^{2}\) and \(N\sim 2(nsd\cdot n)^{2}\), \(nsd=2no+n/np\), where no is the size of the overlap and np is the number of processors. The parameter no corresponds to the number of rows of the \(n\times n\) mesh common to neighbouring subdomains; the geometrical overlap \(\delta \) is obtained by multiplying no by the width of the row/strip. The number no is varied to keep the value of \(\delta \) only weakly dependent on the mesh size (h).

The iteration counts in Tables 1-3 report the average number of both outer FGMRES and inner CG iterations over one time step (averaged over the first ten timesteps, with zero initial guess within each timestep). Dividing the number of inner iterations by the number of outer iterations shows how many inner iterations are needed per inner system. Note that for the class of parameters which we investigate, there is frequently just one inner iteration per outer one. The times in all tables are the times spent in the solvers, i.e. without the times for matrix assembly and initialization of the preconditioner. The computations were performed on the Salomon supercomputer, see [17].

The results in Table 1 correspond to the material parameters \(k_{s}=10^{-15}\), \(c_{pp}=10^{-10}\), \(k_{s}/c_{pp}=10^{-5}\) and the use of the more efficient triangular preconditioner. Tables 2 and 3 report the number of outer/inner iterations in dependence on the material parameters, especially on the ratio \(k_{s}/c_{pp}\). It can be seen that for both the triangular preconditioner (Table 2) and the diagonal preconditioner (Table 3) the efficiency is excellent if \(k_{s}/c_{pp}\le 10^{-4}\). This condition is fulfilled for many applications in geomechanics and biomechanics, see Table 5.

Table 1. Test on scaling: mesh size \(n\times n\), number of DOFs for RT(0)-P0 elements \(\sim 3n^{2}\). Material parameters \(k_{s}=10^{-15}\), \(c_{pp}=10^{-10}\), \(k_{s}/c_{pp}=10^{-5}\).
Table 2. Dependence of number of iterations for FGMRES - \(\mathcal {P}_{T}\) (triangular preconditioner) and CG - Schwarz on material parameters \(k_{s},\,\sigma =2\), \(c_{pp}\). Other parameters \(n=1000\), 24 subdomains, overlap 8 are kept constant.
Table 3. Dependence of number of iterations for FGMRES - \(\mathcal {P}_{D}\) (diagonal preconditioner) and CG - Schwarz on material parameters \(k_{s},\,\sigma =2\), \(c_{pp}\). Other parameters \(n=1000\), 24 subdomains, overlap 8 are kept constant.

Tables 2 and 3 illustrate the dependence of the iterative processes on the material parameters \(k_{s}\) and \(c_{pp}\). The first number in each cell is the average number of outer iterations and the second is the average number of inner iterations over one time step, both averaged over the first ten timesteps, with zero initial guess within each timestep to assess the convergence without the influence of a good initial guess from the previous timestep. Table 4 shows the dependence on the overlap measured by the number of rows. The overlap is increasing in the rows, decreasing in the column. The test is done for the unfavourable ratio \(k_{s}/c_{pp}=1\). For a favourable ratio \(k_{s}/c_{pp}\le 10^{-4}\) the dependence on the overlap is very weak.

Table 4. Dependence on the geometrical overlap \(\delta \) for triangular preconditioner, \(k_{s}=10^{-10},\) \(c_{pp}=10^{-10},\) \(\sigma =0\), 24 processes.

6 Conclusions

This paper presents an iterative technique for solving the systems arising from non-stationary Darcy flow problems discretized by mixed finite elements in space and the implicit Euler method in time. The technique combines outer iterations by FGMRES with a block preconditioner, whose velocity block is solved by inner CG iterations with a Schwarz type preconditioner.

It is shown that the convergence of the outer iterations is practically independent of the material parameters and that the inner iterations converge extremely fast when the ratio of permeability to storativity is small enough (\(k_{s}/c_{pp}\le 10^{-4}\)). This observation derived from the numerical tests is also in good agreement with our theoretical result (14), which follows from extending the analysis provided in [1]. As another robust inner iterative method, we can mention e.g. [14].

Such a favourable ratio of permeability to storativity is characteristic of many geoscience applications dealing with semi-pervious and impervious materials, see the following values provided e.g. by references [11, 12]:

Table 5. Ranges of material parameters

The presented iterative solution technique is also highly parallelizable and numerical experiments demonstrate its scalability.

The technique can be also used in the case that time discretization is done by higher order scheme, such as e.g. the Radau IIA method [3]. The systems arising within time steps of Radau method can be preconditioned by block preconditioner involving Euler type matrices as blocks, see [4]. The results of this paper can be also used for solving poroelasticity problems, cf. [2,3,4].