
1 Introduction

Substructuring algorithms such as Balancing Neumann-Neumann (BNN) or Finite Element Tearing and Interconnecting (FETI) are defined for non-overlapping domain decompositions but not for overlapping subdomains. The Schwarz method (Schwarz, 1870) is defined only for overlapping subdomains. With the help of a coarse space correction, the two-level versions of both types of methods are weakly scalable, see Toselli and Widlund (2005) and references therein.

The domain decomposition method introduced by Lions (1990) can be applied to both overlapping and non-overlapping subdomains. It is based on improving Schwarz methods by replacing the Dirichlet interface conditions by Robin interface conditions. This algorithm was extended to the Helmholtz problem by Després (1993). Robin interface conditions can be replaced by more general interface conditions that can be optimized for better convergence (Optimized Schwarz methods, OSM), see Gander et al. (2002), Gander (2006) and references therein. When the domain is decomposed into a large number of subdomains, these methods are, from a practical point of view, scalable if a second level is added to the algorithm via the introduction of a coarse space, see Japhet et al. (1998), Farhat et al. (2000), Conen et al. (2014). But there is no systematic procedure to build coarse spaces with a provable efficiency.

The purpose of this article is to define a general framework for building adaptive coarse spaces for OSM methods for decompositions into overlapping subdomains. We prove that we can achieve the same robustness as was obtained for the Schwarz (Spillane et al., 2014) and FETI-BDD (Spillane et al., 2013) domain decomposition methods with the so-called GenEO (Generalized Eigenvalue in the Overlap) coarse spaces. Compared to these previous works, we have to introduce a non-standard symmetric variant of the ORAS method as well as two generalized eigenvalue problems. Although the theory is valid only in the symmetric positive definite case, the method scales very well for saddle point problems such as highly heterogeneous nearly incompressible elasticity problems as well as the Stokes system.

2 Symmetrized ORAS Method

The problem to be solved is defined via a variational formulation on a domain \(\varOmega \subset \mathbb{R}^{d}\) for \(d \in \mathbb{N}\):

$$\displaystyle{\text{Find }u \in V \text{ such that: }a_{\varOmega }(u,v) = l(v)\,,\ \ \forall v \in V \,,}$$

where V is a Hilbert space of functions from Ω with real values. The problem we consider is given through a symmetric positive definite bilinear form that is defined in terms of an integral over any open set ω ⊂ Ω. A typical example is the elasticity system (\(\boldsymbol{C}\) is the fourth-order stiffness tensor and \(\boldsymbol{\varepsilon }(\boldsymbol{u})\) is the strain tensor of a displacement field \(\boldsymbol{u}\)):

$$\displaystyle{a_{\omega }(\boldsymbol{u},\,\boldsymbol{v}):=\int _{\omega }\boldsymbol{C}:\boldsymbol{\varepsilon } (\boldsymbol{u}):\boldsymbol{\varepsilon } (\boldsymbol{v})\,dx\,.}$$

The problem is discretized by a finite element method. Let \(\mathcal{N}\) denote the set of degrees of freedom and \((\phi _{k})_{k\in \mathcal{N}}\) be a finite element basis on a mesh \(\mathcal{T}_{h}\). Let \(A \in \mathbb{R}^{\#\mathcal{N}\times \#\mathcal{N}}\) be the associated finite element matrix, \(A_{kl} := a_{\varOmega }(\phi _{l},\phi _{k})\), \(k,l \in \mathcal{N}\). For some given right-hand side \(\mathbf{F} \in \mathbb{R}^{\#\mathcal{N}}\), we have to solve a linear system in \(\mathbf{U}\) of the form

$$\displaystyle{A\mathbf{U} = \mathbf{F}\,.}$$

Domain Ω is decomposed into N overlapping subdomains (Ω i )1 ≤ i ≤ N so that all subdomains are a union of cells of the mesh \(\mathcal{T}_{h}\). This decomposition induces a natural decomposition of the set of indices \(\mathcal{N}\) into N subsets of indices \((\mathcal{N}_{i})_{1\leq i\leq N}\):

$$\displaystyle{ \mathcal{N}_{i}:=\{ k \in \mathcal{N}\ \vert \ meas(\mathop{\mathrm{supp}}\nolimits (\phi _{k}) \cap \varOmega _{i})> 0\}\,,\ 1 \leq i \leq N. }$$
(1)

For all 1 ≤ i ≤ N, let \(R_{i}\) be the restriction matrix from \(\mathbb{R}^{\#\mathcal{N}}\) to the subset \(\mathbb{R}^{\#\mathcal{N}_{i}}\) and \(D_{i}\) be a diagonal matrix of size \(\#\mathcal{N}_{i} \times \#\mathcal{N}_{i}\), so that we have a partition of unity at the algebraic level, \(I_{d} =\sum _{i=1}^{N}R_{i}^{T}D_{i}R_{i}\), where \(I_{d} \in \mathbb{R}^{\#\mathcal{N}\times \#\mathcal{N}}\) is the identity matrix.
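The algebraic partition of unity can be checked on a toy example. The following NumPy sketch (the 1D index sets and sizes are invented purely for illustration) builds Boolean restriction matrices and multiplicity-based weights, then verifies \(\sum_i R_i^T D_i R_i = I_d\):

```python
import numpy as np

# Hypothetical setup: 10 unknowns split into two overlapping index sets.
n = 10
N1 = np.arange(0, 6)           # d.o.f. of subdomain 1
N2 = np.arange(4, 10)          # d.o.f. of subdomain 2 (overlap = {4, 5})

def restriction(idx, n):
    """Boolean restriction matrix R_i from R^n to R^{#N_i}."""
    R = np.zeros((len(idx), n))
    R[np.arange(len(idx)), idx] = 1.0
    return R

R1, R2 = restriction(N1, n), restriction(N2, n)

# Multiplicity of each global index: 2 inside the overlap, 1 elsewhere.
mult = (R1.T @ R1 + R2.T @ R2).diagonal()
D1 = np.diag(1.0 / mult[N1])   # D_i weights each d.o.f. by 1/multiplicity
D2 = np.diag(1.0 / mult[N2])

# Algebraic partition of unity: sum_i R_i^T D_i R_i = I_d
Id = R1.T @ D1 @ R1 + R2.T @ D2 @ R2
assert np.allclose(Id, np.eye(n))
```

The 1/multiplicity weighting is only one common choice for the \(D_i\); any diagonal weights summing to one across subdomains would do.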

For each subdomain 1 ≤ i ≤ N, let \(B_{i}\) be an SPD matrix of size \(\#\mathcal{N}_{i} \times \#\mathcal{N}_{i}\), which typically comes from the discretization of local boundary value problems using optimized transmission conditions. The ORAS preconditioner (St-Cyr et al., 2007) is then defined as

$$\displaystyle{ M_{ORAS,1}^{-1}:=\sum _{ i=1}^{N}R_{ i}^{T}D_{ i}B_{i}^{-1}R_{ i}\,. }$$
(2)

Due to the matrices \(D_{i}\), this preconditioner is not symmetric. We introduce here a non-standard variant of the ORAS preconditioner (2), the symmetrized ORAS (SORAS) algorithm:

$$\displaystyle{ M_{SORAS,1}^{-1}:=\sum _{ i=1}^{N}R_{ i}^{T}D_{ i}B_{i}^{-1}D_{ i}R_{i}\,. }$$
(3)

More details are given in Dolean et al. (2015).
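The structural difference between (2) and (3) can be seen on a toy problem. In the hedged NumPy sketch below, \(B_i\) is replaced by the Dirichlet block \(R_i A R_i^T\) purely for illustration (in ORAS/SORAS it would come from a local Robin problem instead); the point is only that (2) is nonsymmetric while (3) is symmetric by construction:

```python
import numpy as np

# Toy 1D Laplacian and two overlapping subdomains (illustrative sizes).
n = 10
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
subsets = [np.arange(0, 6), np.arange(4, 10)]

mult = np.zeros(n)
for idx in subsets:
    mult[idx] += 1.0           # multiplicity of each d.o.f.

M_oras = np.zeros((n, n))
M_soras = np.zeros((n, n))
for idx in subsets:
    R = np.zeros((len(idx), n))
    R[np.arange(len(idx)), idx] = 1.0
    D = np.diag(1.0 / mult[idx])
    Binv = np.linalg.inv(R @ A @ R.T)   # stand-in for B_i^{-1}
    M_oras += R.T @ D @ Binv @ R        # (2): one-sided D_i, not symmetric
    M_soras += R.T @ D @ Binv @ D @ R   # (3): D_i on both sides, symmetric

assert not np.allclose(M_oras, M_oras.T)
assert np.allclose(M_soras, M_soras.T)
```

In practice one never assembles these matrices densely; only their action on a vector is applied, with local solves replacing the explicit inverses.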

3 Two-Level SORAS Algorithm

In order to define the two-level SORAS algorithm, we introduce two generalized eigenvalue problems.

First, for all subdomains 1 ≤ i ≤ N, we consider the following problem:

Definition 1

$$\displaystyle{ \begin{array}{c} \text{Find }(\mathbf{U}_{ik},\mu _{ik}) \in \mathbb{R}^{\#\mathcal{N}_{i}}\setminus \{0\} \times \mathbb{R}\mbox{ such that} \\ D_{i}R_{i}AR_{i}^{T}D_{i}\mathbf{U}_{ik} =\mu _{ik}B_{i}\,\mathbf{U}_{ik}\,\,\,. \end{array} }$$
(4)

Let γ > 0 be a user-defined threshold. We define \(Z_{geneo}^{\gamma } \subset \mathbb{R}^{\#\mathcal{N}}\) as the vector space spanned by the family of vectors \((R_{i}^{T}D_{i}\mathbf{U}_{ik})_{\mu _{ik}>\gamma \,,1\leq i\leq N}\) corresponding to eigenvalues larger than γ.
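On a single toy subdomain, the selection in Definition 1 can be sketched with SciPy's generalized symmetric-definite eigensolver. All matrices below are illustrative stand-ins: \(B_i\) is again approximated by the local Dirichlet block, and the partition-of-unity weights are chosen by hand.

```python
import numpy as np
from scipy.linalg import eigh

# Toy global 1D Laplacian and one subdomain with 8 d.o.f.
n = 12
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
idx = np.arange(0, 8)
Ai = A[np.ix_(idx, idx)]                 # stand-in for B_i (SPD)
D = np.diag([1.0] * 6 + [0.5, 0.5])      # hand-made partition-of-unity weights

# Problem (4): D_i R_i A R_i^T D_i U = mu B_i U  (here R_i A R_i^T = Ai too)
mu, U = eigh(D @ Ai @ D, Ai)

gamma = 0.9                              # user-defined threshold
coarse = U[:, mu > gamma]                # local contributions to Z_geneo^gamma
```

The columns of `coarse`, prolonged by \(R_i^T D_i\), are the subdomain's contribution to the coarse space controlling the upper bound.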

In order to define the second generalized eigenvalue problem, we introduce for all subdomains 1 ≤ j ≤ N, \(\tilde{A}_{j}\), the \(\#\mathcal{N}_{j} \times \#\mathcal{N}_{j}\) matrix defined by

$$\displaystyle{ \mathbf{V}_{j}^{T}\tilde{A}_{ j}\mathbf{U}_{j}:= a_{\varOmega _{j}}\left (\sum _{l\in \mathcal{N}_{j}}\mathbf{U}_{jl}\phi _{l},\,\sum _{l\in \mathcal{N}_{j}}\mathbf{V}_{jl}\phi _{l}\right )\,,\ \ \mathbf{U}_{j},\,\mathbf{V}_{j} \in \mathbb{R}^{\#\mathcal{N}_{j} }\,. }$$
(5)

When the bilinear form a results from the variational formulation of a Laplace problem, the matrix \(\tilde{A}_{j}\) corresponds to the discretization of a local Neumann boundary value problem.

Definition 2

We introduce the generalized eigenvalue problem

$$\displaystyle{ \begin{array}{c} \mbox{ Find }(\mathbf{V}_{jk},\lambda _{jk}) \in \mathbb{R}^{\#\mathcal{N}_{j}}\setminus \{0\} \times \mathbb{R}\mbox{ such that} \\ \tilde{A}_{j}\mathbf{V}_{jk} =\lambda _{jk}B_{j}\mathbf{V}_{jk}\,\,.\end{array} }$$
(6)

Let τ > 0 be a user-defined threshold. We define \(Z_{geneo}^{\tau } \subset \mathbb{R}^{\#\mathcal{N}}\) as the vector space spanned by the family of vectors \((R_{j}^{T}D_{j}\mathbf{V}_{jk})_{\lambda _{jk}<\tau \,,1\leq j\leq N}\) corresponding to eigenvalues smaller than τ.
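The second selection can be sketched the same way. In the toy example below, \(\tilde{A}_j\) is the Neumann matrix of a 1D Laplacian on the subdomain (singular, with the constant mode in its kernel) and \(B_j\) is a stand-in Robin matrix obtained by adding boundary penalties; both choices are illustrative, not the actual FE matrices.

```python
import numpy as np
from scipy.linalg import eigh

# Local Neumann matrix of a 1D Laplacian on 8 d.o.f. (natural ends).
m = 8
A_neu = 2.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)
A_neu[0, 0] = A_neu[-1, -1] = 1.0

# Stand-in Robin matrix B_j: Neumann matrix plus boundary penalty terms,
# which makes it SPD even though A_neu itself is singular.
B = A_neu.copy()
B[0, 0] += 1.0
B[-1, -1] += 1.0

# Problem (6): A~_j V = lambda B_j V; keep the modes with lambda < tau.
lam, V = eigh(A_neu, B)
tau = 0.1
coarse = V[:, lam < tau]   # the constant mode (lambda = 0) is always caught
```

The zero eigenvalue here is the 1D analogue of the rigid body modes seen later in the elasticity experiments: low-energy local modes that the coarse space must absorb.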

We are now ready to define the two-level SORAS preconditioner.

Definition 3 (The SORAS-GenEO-2 Preconditioner)

Let P 0 denote the A-orthogonal projection on the coarse space

$$\displaystyle{Z_{\text{GenEO-2}}:= Z_{geneo}^{\gamma }\bigoplus Z_{ geneo}^{\tau }\,,}$$

the two-level SORAS-GenEO-2 preconditioner is defined as follows:

$$\displaystyle{ M_{SORAS,2}^{-1}:= P_{ 0}A^{-1} + (I_{ d} - P_{0})\,\sum _{i=1}^{N}R_{ i}^{T}D_{ i}B_{i}^{-1}D_{ i}R_{i}(I_{d} - P_{0}^{T})\,. }$$
(7)
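Formula (7) can be prototyped densely for a generic SPD matrix and a given coarse basis. The sketch below is a toy check under stated assumptions: `Z`, the coarse basis, is taken to be exact low eigenvectors of A, and the one-level preconditioner is replaced by the identity; neither choice reflects the actual GenEO construction, but the mechanism of the projection \(P_0\) is the same.

```python
import numpy as np

def soras2_matrix(A, Z, M1inv):
    """Dense prototype of (7) for SPD A, coarse basis Z, symmetric M1inv."""
    E = np.linalg.inv(Z.T @ A @ Z)        # coarse operator
    P0 = Z @ E @ Z.T @ A                  # A-orthogonal projection onto range(Z)
    Id = np.eye(A.shape[0])
    # Note P0 A^{-1} = Z E Z^T, so no global solve with A is needed.
    return Z @ E @ Z.T + (Id - P0) @ M1inv @ (Id - P0.T)

# Toy check: put the lowest eigenvectors of A in the coarse space and watch
# the smallest eigenvalue of M^{-1} A move up, as the theorem below predicts.
n = 20
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # 1D Laplacian
w, v = np.linalg.eigh(A)
Z = v[:, :2]                              # two lowest modes as "coarse space"
Minv = soras2_matrix(A, Z, np.eye(n))     # M1inv = I for the toy
eigs = np.linalg.eigvals(Minv @ A).real
assert eigs.min() > w[0]                  # coarse correction lifts the low end
```

In a real implementation only the action of (7) on a vector is applied, with the small coarse matrix \(Z^T A Z\) factorized once.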

Note that this definition is reminiscent of the balancing domain decomposition preconditioner (Mandel, 1992) introduced for Schur complement based methods, as well as of the Broyden-Fletcher-Goldfarb-Shanno (BFGS) update formula, see Nocedal and Wright (2006). We have the following theorem.

Theorem 1 (Spectral Estimate for the SORAS-GenEO-2 Preconditioner)

Let \(k_{0}\) be the maximum number of neighbors of a subdomain (a subdomain is a neighbor of itself), let \(k_{1}\) be the maximal multiplicity of the subdomain intersections, and let γ, τ > 0 be the user-defined thresholds of Definitions 1 and 2.

Then, the eigenvalues of the two-level preconditioned operator satisfy the following spectral estimate

$$\displaystyle{ \frac{1} {1 + \frac{k_{1}} {\tau } } \leq \lambda (M_{SORAS,2}^{-1}\,A) \leq \max (1,k_{ 0}\,\gamma )}$$

where \(\lambda (M_{SORAS,2}^{-1}\,A)\) is an eigenvalue of the preconditioned operator.

The proof is based on the fictitious space lemma (Nepomnyaschikh, 1991) and is given in Haferssas et al. (2015).

Remark 1

The following heuristic provides an interpretation of both generalized eigenvalue problems (4) and (6).

We first remark that for the ASM preconditioner we have a very good upper bound for the preconditioned operator that does not depend on the number of subdomains but only on the number of neighbors of a subdomain:

$$\displaystyle{\lambda _{max}(M_{ASM}^{-1}A) \leq k_{ 0}\,.}$$

Thus, from the definitions of ASM and SORAS, we can infer that vectors for which the actions of the local matrices \((R_{i}AR_{i}^{T})^{-1}\) and \(D_{i}B_{i}^{-1}D_{i}\) differ notably may lead to a bad upper bound for \(M_{SORAS}^{-1}A\). By taking the inverse of both operators, this condition means that \(R_{i}AR_{i}^{T}\) and \(D_{i}^{-1}B_{i}D_{i}^{-1}\) differ notably. By left and right multiplication by \(D_{i}\), it means we have to look at vectors \(\mathbf{V}_{i}\) for which \(D_{i}R_{i}AR_{i}^{T}D_{i}\mathbf{V}_{i}\) and \(B_{i}\mathbf{V}_{i}\) have very different values. This is a way to interpret the generalized eigenvalue problem (4), which controls the upper bound of the eigenvalues of \(M_{SORAS}^{-1}A\).

Second, we introduce the following preconditioner \(M_{NN}^{-1}\):

$$\displaystyle{ M_{NN}^{-1}:=\sum _{ 1\leq i\leq N}D_{i}\,\widetilde{A_{i}}D_{i} }$$
(8)

which is reminiscent of the Neumann-Neumann preconditioner (Tallec et al., 1998) for substructuring methods. We have a very good lower bound for the preconditioned operator \(M_{NN}^{-1}A\) that does not depend on the number of subdomains but only on the maximum multiplicity of intersections:

$$\displaystyle{ \frac{1} {k_{1}} \leq \lambda _{min}(M_{NN}^{-1}\,A)\,.}$$

If we compare the formulas for \(M_{NN}^{-1}\) (8) and \(M_{SORAS,1}^{-1}\) (3), we see that we have to look at vectors \(\mathbf{V}_{i}\) for which \(D_{i}\,\widetilde{A_{i}}\,D_{i}\,\mathbf{V}_{i}\) and \(B_{i}\mathbf{V}_{i}\) have very different values. This is a way to interpret the generalized eigenvalue problem (6), which controls the lower bound of the eigenvalues of \(M_{SORAS}^{-1}A\).

4 Nearly Incompressible Elasticity

Although our theory does not apply in a straightforward manner to saddle point problems, we use it for these difficult problems, for which it is not possible to preserve both the symmetry and the positivity of the problem. Note that the generalized eigenvalue problems (4) and (6) still make sense if A is the matrix of a saddle point problem and the matrices \(B_{i}\) and \(\widetilde{A}_{i}\) are properly defined for each subdomain 1 ≤ i ≤ N. The new coarse space was tested quite successfully on Stokes and nearly incompressible elasticity problems with a discretization based on saddle point formulations in order to avoid locking phenomena. The mechanical properties of a solid can be characterized by its Young's modulus E and Poisson's ratio ν, or alternatively by its Lamé coefficients λ and μ. These coefficients are related by the following formulas:

$$\displaystyle{ \lambda = \frac{E\nu } {(1+\nu )(1 - 2\nu )}\ \text{ and }\ \mu = \frac{E} {2(1+\nu )}\,. }$$
(9)
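A quick numeric check of (9), using the two materials of the experiment reported below (steel and rubber), makes the near-incompressibility visible: as ν → 1/2, λ blows up relative to μ, since λ/μ = 2ν/(1 − 2ν).

```python
def lame(E, nu):
    """Lamé coefficients (lambda, mu) from Young's modulus and Poisson ratio."""
    lam = E * nu / ((1.0 + nu) * (1.0 - 2.0 * nu))
    mu = E / (2.0 * (1.0 + nu))
    return lam, mu

lam_s, mu_s = lame(210e9, 0.3)     # steel:  lambda/mu = 2*0.3/0.4 = 1.5
lam_r, mu_r = lame(0.1e9, 0.4999)  # rubber: lambda/mu = 0.9998/0.0002 = 4999
```

This huge contrast in λ/μ between the layers is exactly the kind of heterogeneity the GenEO-2 coarse space is designed to handle.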

The variational problem consists in finding \((\boldsymbol{u}_{h},p_{h}) \in \mathcal{V}_{h}:= (\mathbb{P}_{2}^{d} \cap H_{0}^{1}(\varOmega )) \times \mathbb{P}_{1}\) such that for all \((\boldsymbol{v}_{h},q_{h}) \in \mathcal{V}_{h}\)

$$\displaystyle{ \begin{array}{l} \left \{\begin{array}{@{}l@{\quad }l@{}} \int _{\varOmega }2\mu \varepsilon (\boldsymbol{u}_{h}): \varepsilon (\boldsymbol{v}_{h})dx\quad &-\int _{\varOmega }p_{h}\boldsymbol{\text{div}}\,(\boldsymbol{v}_{h})dx =\int _{\varOmega }\boldsymbol{f}\cdot \boldsymbol{v}_{h}dx \\ -\int _{\varOmega }\boldsymbol{\text{div}}\,(\boldsymbol{u}_{h})q_{h}dx\quad &-\int _{\varOmega }\frac{1} {\lambda } p_{h}q_{h}dx = 0\end{array} \right. \\ \qquad \Longrightarrow A\mathbf{U} = \left [\begin{array}{*{10}c} H&B^{T} \\ B& C \end{array} \right ]\left [\begin{array}{*{10}c} \boldsymbol{u}\\ p \end{array} \right ] = \left [\begin{array}{*{10}c} \mathbf{f} \\ 0 \end{array} \right ] = \mathbf{F}.\end{array} }$$
(10)

Matrix \(\widetilde{A}_{i}\) arises from the variational formulation (10) where the integration over domain Ω is replaced by the integration over subdomain Ω i and finite element space \(\mathcal{V}_{h}\) is restricted to subdomain Ω i . Matrix B i corresponds to a Robin problem and is the sum of matrix \(\widetilde{A}_{i}\) and of the matrix of the following variational formulation restricted to the same finite element space:

$$\displaystyle{\int _{\partial \varOmega _{i}\setminus \partial \varOmega }\frac{2\alpha \mu (2\mu +\lambda )} {\lambda +3\mu } \,\boldsymbol{u}_{h} \cdot \boldsymbol{ v}_{h}\,d\sigma \,,\ \text{ with }\alpha = 10\text{ in our tests}.}$$

In Dolean et al. (2015), we tested our method on a heterogeneous beam of eight layers of steel, \((E_{1},\nu _{1}) = (210\cdot 10^{9},\,0.3)\), and rubber, \((E_{2},\nu _{2}) = (0.1\cdot 10^{9},\,0.4999)\), see Fig. 1. The beam is clamped on its left and right sides.

Fig. 1 2D elasticity: coefficient distribution of steel and rubber

Table 7.1 of Dolean et al. (2015) shows that our method performs consistently much better than various domain decomposition methods: the one-level Additive Schwarz (AS) and SORAS methods, the two-level AS and SORAS methods with a coarse space consisting of rigid body motions, which are zero energy modes (ZEM), and finally AS with a GenEO coarse space. In our test, the GenEO-2 coarse space defined in Definition 3 was built with τ = 0.4 and \(\gamma = 10^{3}\). Eigenvalue problem (6) accounts for roughly 90% of the GenEO-2 coarse space size. In Figs. 2 and 3, we plot the eigenvectors of the generalized eigenvalue problems (4) and (6) for the linear elasticity case. The domain decomposition is such that all subdomains contain the eight alternating layers of steel and rubber. The GenEO coarse space for the lower bound (Fig. 3) consists of the first 12 modes. The first three are the well-known rigid body modes. The other nine eigenmodes display very different behaviors for the steel and the rubber. The 13th and subsequent eigenvalues are larger than 0.25 and are not incorporated into the coarse space. Interestingly enough, steel and rubber have the same deformations in these modes.

Fig. 2 Largest eigenvalues and corresponding eigenmodes of the GenEO-2 generalized eigenproblem for the upper bound (4)

Fig. 3 Lowest eigenvalues and corresponding eigenmodes of the GenEO-2 generalized eigenproblem for the lower bound (6)

In this paragraph, we perform a parametric study of the dependence of the convergence on the thresholds γ and τ of the coarse space. In Fig. 4 we study the influence of the parameter τ alone, keeping the parameter \(\gamma = 1/0.001\) fixed. We see that for \(\tau < 10^{-2}\), there are plateaus in the convergence curves. But for larger values of τ, the convergence curves are almost straight lines. This is in agreement with the gap in the spectrum of the eigenvalue problem (6), see Fig. 4.

Fig. 4 Left: Convergence history vs. threshold τ. Right: Eigenvalues for the lower bound eigenvalue problem (6)

A comparable study was made for the impact of the threshold γ. We see in Fig. 5 that this parameter has only a small impact on the iteration count.

Fig. 5 Left: Convergence history vs. threshold γ. Right: Eigenvalues for the upper bound eigenvalue problem (4)

We also performed large 3D simulations on 8192 cores of an IBM/Blue Gene Q machine with 1.6 GHz Power A2 processors for both the elasticity (200 million d.o.f. in 200 s) and Stokes (200 million d.o.f. in 150 s) equations. Computing facilities were provided by an IDRIS-GENCI project. We focus on results for the nearly incompressible elasticity problem. The problem is solved with a geometric overlap of two mesh elements, and a preconditioned GMRES is used to solve the resulting linear system, where the stopping criterion for the relative residual norm is fixed to \(10^{-6}\). All the test cases were performed with the FreeFem++ code (Hecht, 2012) interfaced with the domain decomposition library HPDDM (Jolivet and Nataf, 2014; Jolivet et al., 2013). The factorizations are computed for each local problem and also for the global coarse problem using MUMPS (Amestoy et al., 2001). The generalized eigenvalue problems that generate the GenEO coarse space are solved using ARPACK (Lehoucq et al., 1998). The coarse space is formed only with the generalized eigenvalue problem (6), since we noticed that the other one (4) has only a small effect on the convergence.

These computations, see Fig. 6, assess the weak scalability of the algorithm with respect to the problem size and the number of subdomains. All times are wall clock times. The domain is decomposed automatically into subdomains with a graph partitioner, ranging from 256 subdomains to 8192, and the problem size is increased by mesh refinement. In 3D, the initial problem has about 6 million d.o.f. decomposed into 256 subdomains and is solved in 145.2 s, and the final problem has about 197 million d.o.f. decomposed into 8192 subdomains and is solved in 196 s, which gives an efficiency close to 75%.

Fig. 6 Weak scaling experiments