Keywords

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

1 Introduction

The PDE eigenvalue problems can be divided into several categories depending on different criterion. This article intends to introduce the state-of-art results and new advancements for elliptic, second-order, selfadjoint and compact partial differential operators. A well-known example of this class is the Laplace eigenvalue problem: Find \(\lambda \in \mathbb{R}\) and u ∈ H 0 1(Ω) such that

$$\displaystyle{ \begin{array}{r l} \varDelta u& =\lambda u\quad \mbox{ in }\varOmega \\ u & = 0\quad \mbox{ on}\partial \varOmega, \end{array} }$$
(9.1)

where \(\varOmega \in \mathbb{R}^{d},d = 1,2,\ldots\) is a bounded, polyhedral Lipschitz domain and ∂Ω is its boundary. This simple but how important model problem can, however, illustrate many interesting phenomena of eigenvalues and eigenfunctions of general elliptic selfadjoint partial differential equations. We refer the reader to a wonderful survey article [71] on Laplace eigenvalue problem which contains many references to the original papers.

Throughout this survey we are going to concentrate on selfadjoint problems, however, we mention some results and further reading for the non-selfadjoint case when relevant.

The paper is organized as follows. We present a variationally stated PDE eigenvalue problem, discuss its properties and present its Galerkin approximation in Sects. 9.1.1 and 9.1.2. In Sect. 9.1.3 we introduce the main ingredients of adaptive FEM. The whole Sect. 9.2 is dedicated to error analysis of AFEM. We discuss an a priori as well as an a posteriori error estimators in Sects. 9.2.1 and 9.2.2, respectively. The state-of-art eigensolvers used in the context of adaptive FEM are presented in Sect. 9.3, whereas issues like convergence and optimality are the subject of Sect. 9.4. Last but not least, Sect. 9.5 sheds some light on the fundamental role of linear algebra not only in eigenvalue, but in all adaptive FE computations.

1.1 An Eigenvalue Problem for Partial Differential Equation

Let V and U be real Hilbert spaces with V ⊂ U densely and continuously embedded in H, e.g. V: = H 0 1(Ω) and U: = L 2(Ω), and \(\|\cdot \|_{V }\) and \(\|\cdot \|_{U}\) the associated norms, respectively. Let \(a(\cdot,\cdot ): V \times V \rightarrow \mathbb{R}\) and \((\cdot,\cdot )_{U}: U \times U \rightarrow \mathbb{R}\) be symmetric and continuous bilinear forms. We consider the following variationally stated eigenvalue problem: Find \(\lambda \in \mathbb{R}\) and u ∈ V such that

$$\displaystyle{ a(u,v) =\lambda (u,v)_{U}\quad \mbox{ for all }\quad v \in V. }$$
(9.2)

We assume that a(⋅ , ⋅ ) is continuous, i.e., there exists \(\beta:=\| a\|_{V } < \infty \) such that

$$\displaystyle{ \vert a(w,v)\vert \leq \beta \| w\|_{V }\|v\|_{V }, }$$
(9.3)

and V -elliptic (coercive), i.e., there exists α > 0 such that

$$\displaystyle{ a(v,v) \geq \alpha \| v\|_{V }^{2}\quad \mbox{ for all }\ v \in V. }$$
(9.4)

Remark 1

The bilinear form a(⋅ , ⋅ ) satisfies all properties of the scalar product on V. The norm induced by a(⋅ , ⋅ ) is the so-called energy norm

$$\displaystyle{ \vert \!\vert \!\vert \cdot \vert \!\vert \!\vert:= a(\cdot,\cdot )^{1/2}, }$$
(9.5)

which is equivalent to the standard norm \(\|\cdot \|_{V }\) in V. Namely

$$\displaystyle{ \alpha \|v\|_{V }^{2} \leq \vert \!\vert \!\vert v\vert \!\vert \!\vert ^{2} \leq \beta \| v\|_{ V }^{2}. }$$
(9.6)

Due to conditions (9.3) and (9.4) the existence and uniqueness of a weak eigenpair (λ, u) is a simple consequence of the classical Lax-Milgram Lemma [95], see, e.g., [66, Theorem 2.12, p. 52], [77, Theorem 6.5.9, p. 140], [118, Theorem 5.5.1, p. 133] or [122, §5.5, Theorem 13]. For the selfadjoint eigenvalue problem the existence of a unique eigenpair (λ, u) can also be proved using the Riesz Representation Theorem [122, §5.4, Theorem 11].

Let us consider, for any f ∈ V, the following (variational) boundary value problem

$$\displaystyle{\mathit{Find}w \in V \quad \mathit{such\ that}\quad a(w,v) = (f,v)_{U}\quad \mbox{ for all }\ v \in V.}$$

Following the results in classical spectral approximation theory [44, 83, 132, 133] we introduce a linear, compact solution operator \(\mathfrak{T}: V \rightarrow V\), such that for any given f ∈ V

$$\displaystyle{a(\mathfrak{T}f,v) = (f,v)_{U}\quad \mbox{ for all }\ v \in V.}$$

Now, using the definition of the operator \(\mathfrak{T}\) with f = u and Eq. (9.2) yields

$$\displaystyle{ a(\mathfrak{T}u,v) = (u,v)_{U} = \frac{1} {\lambda } a(u,v) = a(\frac{1} {\lambda } u,v)\quad \mbox{ for all }\ v \in V. }$$
(9.7)

Hence (λ (i), u (i)) is a solution of the eigenvalue problem (9.2) if and only if (μ (i), u (i)), \(\mu ^{(i)} = \frac{1} {\lambda ^{(i)}}\), is an eigenpair of the operator \(\mathfrak{T}\). The assumptions on a(⋅ , ⋅ ) guarantees that the operator \(\mathfrak{T}\) has countably many real and positive eigenvalues whose inverses are denoted by

$$\displaystyle{0 <\lambda ^{(1)} \leq \lambda ^{(2)} \leq \ldots \leq \lambda ^{(i)} \leq \ldots \;,}$$

according to their multiplicity, and the corresponding (normalized) eigenfunctions u (i) form the orthogonal basis for the space U. Obviously, the eigenfunctions u (i) are orthogonal with respect to a(⋅ , ⋅ ) and (⋅ , ⋅ ) U , i.e.,

$$\displaystyle{a(u^{(i)},u^{(j)}) =\lambda ^{(i)}(u^{(i)},u^{(j)})_{ U} = 0,\quad \mbox{ for }\ i\neq j.}$$

For further details see, e.g. [14, Chapter 2, pp. 692–714], [44, Chapter 4, pp. 203–204] or [25, 90, 127].

1.2 The Galerkin Approximation and the Finite Element Method (FEM)

In order to find an approximation (λ h , u h ) to the exact solution of a variational problem (9.2), we consider the idea of approximating the exact solution by an element from a given finite dimensional subspace V h , known as the Galerkin method (also known as Bubnov-Galerkin or Ritz-Galerkin method in the selfadjoint case) [48].

For V h  ⊆ V the variationally stated eigenvalue problem (9.2) is approximated by the discrete eigenvalue problem: Find \(\lambda _{h} \in \mathbb{R}\) and u h ∈ V h such that

$$\displaystyle{ a(u_{h},v_{h}) =\lambda _{h}(u_{h},v_{h})_{U}\quad \mbox{ for all }\quad v_{h} \in V _{h}. }$$
(9.8)

Remark 2

One should mention that there exist a number of other approximate methods, i.e., the Petrov-Galerkin method, the generalized Galerkin method, the method of weighted residuals, collocation methods etc., see [118, 122]. In general, the trial space U h where the solution u h lives and test space V h are not related to each other. Intuitively, the trial space is responsible for the approximability of the solution, whereas the test space for stability (or quasi-optimality) of the discretization method, see [9, 131]. The Galerkin method is simple to analyze, since both spaces are taken to be the same, i.e., U h  = V h , however, in many cases one should consider them to be distinct [82].

Since V h  ⊆ V, the bilinear form a(⋅ , ⋅ ) is also bounded and coercive on V h . Therefore, the existence of a unique Galerkin solution (λ h , u h ) is inherited from the well-posedness of the original problem [122]. Analogously, there exists a discrete compact solution operator \(\mathfrak{T}_{h}: V \rightarrow V\) such that for any given f ∈ V

$$\displaystyle{a(\mathfrak{T}_{h}f,v) = (f,v)_{U}\mbox{ for all }\quad \ v \in V _{h},}$$

and therefore the eigenpairs of \(\mathfrak{T}_{h}\) can be identified with those of (9.8), accordingly.

At this point, let us discuss some of the possible choices for space V h , namely, basics of the Finite Element Method (FEM) [48]. For simplicity, we restrict ourself to consider only polygonal domains in \(\mathbb{R}^{2}\).

Let \(\mathcal{T}_{h}\) be a partition (triangulation) of a domain Ω into elements (triangles) T, such that

$$\displaystyle{\mathcal{T}_{h}:=\bigcup \limits _{T\in \mathcal{T}_{h}} = \overline{\varOmega },}$$

and any two distinct elements in \(\mathcal{T}_{h}\) share at most a common edge E or a common vertex υ. For each element \(T \in \mathcal{T}_{h}\) by \(\mathcal{E}(T)\) and \(\mathcal{N}(T)\) we denote the set of corresponding edges and vertices, respectively, where \(\mathcal{E}_{h}\) and \(\mathcal{N}_{h}\) denote all edges and vertices in \(\mathcal{T}_{h}\). Likewise, we define a diameter (the length of the longest edge) of an element as h T . For each edge E we denote its length by h E and the unit normal vector by \(\mathbf{n}_{E}\). The label h associated with the triangulation \(\mathcal{T}_{h}\) denotes its mesh size and is given as \(h:=\max \limits _{T\in \mathcal{T}_{h}}h_{T}\). We say that the triangulation is regular in the sense of Ciarlet [48] if there exist a positive constant ρ such that

$$\displaystyle{\frac{h_{T}} {d_{T}} <\rho,}$$

with d T being the diameter of the largest ball that may be inscribed in element T, i.e., the minimal angle of all triangles in \(\mathcal{T}_{h}\) is bounded away from zero. Of course the choice of triangle elements is not a restriction of the finite element method and is made only in order to clarify the notation.

Consider a regular triangulation \(\mathcal{T}_{h}\) of Ω and the set of polynomials \(\mathbb{P}_{p}\) of total degree p ≥ 1 on \(\mathcal{T}_{h}\), which vanish on the boundary of Ω, see, e.g., [29]. Then the Galerkin discretization (9.8) with V h p ⊂ V, dim V h p = n h , taken as

$$\displaystyle{V _{h}^{p}(\varOmega ):= \left \{v_{ h} \in C^{0}(\overline{\varOmega }): v_{ h}\vert _{T} \in \mathbb{P}_{p}\mbox{ for all }T \in \mathcal{T}_{h}\mbox{ and }v_{h} = 0\mbox{ on }\partial \varOmega \right \},}$$

is called a finite element discretization. The Finite Element Method (FEM) [48] is a Galerkin method with a special choice of the approximating subspace, namely, the subspace of piecewise polynomial functions, i.e., continuous in Ω and polynomials on each \(T \in \mathcal{T}_{h}\). For the sake of simplicity we consider only \(\mathbb{P}_{1}\) finite elements, i.e., p = 1, and use V h : = V h 1. The last condition, which is crucial from the practical point of view, states that the space V h should have a canonical basis of functions with small supports over \(\mathcal{T}_{h}\). It is easily seen that the simplest set satisfying this condition is the set \(\left \{\varphi _{h}^{(1)},\ldots,\varphi _{h}^{(n_{h})}\right \}\) of the Lagrange basis (also known as nodal or hat functions) [48]. With this special choice of basis, the solution u h is determined by its values at the n h grid points of \(\mathcal{T}_{h}\) and it can be written as

$$\displaystyle{u_{h} =\sum _{ i=1}^{n_{h} }u_{h,i}\varphi _{h}^{(i)}.}$$

Then the discretized problem (9.8) reduces to a generalized algebraic eigenvalue problem of the form

$$\displaystyle{ A_{h}\mathbf{u}_{h} =\lambda _{h}B_{h}\mathbf{u}_{h}, }$$
(9.9)

where the matrices

$$\displaystyle{A_{h}:= [a(\varphi _{h}^{(i)},\varphi _{ h}^{(j)})]_{ 1\leq i,j\leq n_{h}},\quad B_{h}:= [(\varphi _{h}^{(i)},\varphi _{ h}^{(j)})_{ U}]_{1\leq i,j\leq n_{h}}}$$

are called stiffness and mass matrix, respectively. The representation vector u h is defined as

$$\displaystyle{\mathbf{u}_{h}:= [u_{h,i}]_{1\leq i\leq n_{h}}.}$$

Since \(\left \{\varphi _{h}^{(1)},\ldots,\varphi _{h}^{(n_{h})}\right \}\) are chosen to have a small support over \(\mathcal{T}_{h}\), the resulting matrices A h , B h are sparse. The symmetry of A h , B h and positive definiteness of B h are inherited from the properties of the bilinear forms a(⋅ , ⋅ ) and (⋅ , ⋅ ) U , accordingly.

For conforming approximations, i.e., V h  ⊂ V, the Courant-Fisher min-max characterization, see, e.g. [14, Equation (8.42), p. 699], [130, Equation (23), p. 223] or [49, 138], implies that exact eigenvalues are approximated from above, i.e.,

$$\displaystyle{\lambda ^{(i)} \leq \lambda _{ h}^{(i)},\quad i = 1,2,\ldots,n_{ h}.}$$

On the contrary, for the nonconforming discretizations, e.g., the Crouzeix-Raviart method [29, 50] or [56, Sections 1.2.6, 3.2.3], the discrete eigenvalues provide lower bounds [8, 25, 40] for the exact eigenvalues. The convergence of discrete eigenvalues/eigenfunctions towards their continuous counterparts preserving the multiplicities and preventing spurious solutions is discussed in details in [25, 26].

1.3 The Adaptive Finite Element Method (AFEM)

The standard finite element method would proceed from the selection of a mesh and basis to the generation of a solution. However, it is well-known that the overall accuracy of the numerical approximation in determined by several factors: the regularity of the solution (smoothness of the eigenfunctions), the approximation properties of the finite element spaces, i.e., the search and test space, the accuracy of the eigenvalue solver and its influence on the total error. The most efficient approximations of smooth functions can be obtained using large higher-order finite elements (p-FEM), where the local singularities, arising from re-entrant corners, interior or boundary layers, can be captured by small low-order elements (h-FEM) [47]. Unfortunately, in real-world applications, none of those phenomena are known a priori. Therefore, constructing an optimal finite dimensional space to improve the accuracy of the solution requires refining the mesh and (or) basis and performing the computations again. A more efficient procedure try to decrease the mesh size (h-adaptivity) or (and) increase the polynomial degree of the basis (p-refinement) automatically such that the accurate approximation can be obtained at a lower computational cost, retaining the overall efficiency. This adaptation is based on the local contributions of the global error estimates, the so-called refinement indicators, extracted from the numerical approximation. This simple algorithmic idea is called Adaptive Finite Element Method (AFEM) and can be described as a following loop

The number of manuscripts addressing adaptive finite element methods is constantly growing, and its importance can not be underestimated. In the following sections we present a small fraction of material presented in [4, 14, 16, 17, 29, 32, 48, 5557, 66, 67, 73, 77, 82, 83, 93, 94, 113, 114, 117, 118, 121, 123, 128, 130, 136].

The application of the adaptive FEM to the variationally stated eigenvalue problem (9.2) yields to the following scheme: first the eigenvalue problem is solved on some initial mesh \(\mathcal{T}_{0}\) to provide a finite element approximation (λ h , u h ) of the continuous eigenpair (λ, u). Afterwards, the total error in the computed solution is estimated by some error estimator η h . Unfortunately, even the most accurate global error estimators itself do not guarantee the efficiency of the adaptive algorithm. If the global error is sufficiently small, the adaptive algorithm terminates and returns (λ h , u h ) as a final approximation, otherwise, the local contributions of the error are estimated on each element. A local error indicator (refinement indicator) for an element \(T \in \mathcal{T}_{h}\) is usually denoted by η T and related to a global error estimator η h through

$$\displaystyle{\eta _{h} =\big (\sum \limits _{T\in \mathcal{T}_{h}}\eta _{T}^{2}\big)^{1/2}.}$$

Based on those, the elements for refinement are selected and form the set \(\mathcal{M}\subset \mathcal{T}_{h}\) of marked elements. The final step involves the actual refinement of marked elements and creating a new mesh. Since the resulting mesh may possess hanging nodes, an additional closure procedure is applied is order to obtain a new regular (conforming) mesh. As a consequence, the adaptive finite element method (AFEM) generates a sequence of nested triangulations \(\mathcal{T}_{0},\mathcal{T}_{1},\ldots\) with corresponding nested spaces

$$\displaystyle{V _{0} \subseteq V _{1} \subseteq \ldots \subseteq V _{n_{h}} \subset V.}$$

In the upcoming chapters we will concentrate particularly on SOLVE and ESTIMATE steps, however, let us shortly discuss the marking and refinement procedures.

As we already know the set \(\mathcal{M}\) of marked elements is determined based on the sizes of refinement indicators η T . Now, a question arises: How do we decide which elements \(T \in \mathcal{T}_{h}\) should be added to the set \(\mathcal{M}\) such that the newly obtained adaptive mesh fulfil the regularity condition. The process of selecting the elements of \(\mathcal{M}\) is called the marking criterion or the marking strategy. Let us keep in mind that by marking an element we actually mean marking all its edges. The simplest marking strategy takes a fixed rate (e.g. 50 %) of elements of \(\mathcal{T}_{h}\) with the largest values of η T or elements T for which the refinement indicators η T are larger than some fixed threshold \(\tau \in \mathbb{R},\ \tau > 0\), i.e.,

$$\displaystyle{\mathcal{M}:=\{ T \in \mathcal{T}_{h}:\tau \leq \eta _{T}\}.}$$

Notice that the choice of a threshold τ is essential for the efficiency of the adaptive algorithm since it directly determines the size of the set \(\mathcal{M}\). A more sophisticated strategy is the maximum criterion, where the elements selection is based on a fixed fraction Θ of the maximal refinement indicator in \(\mathcal{T}_{h}\), i.e.,

$$\displaystyle{\tau:=\varTheta \max \limits _{T\in \mathcal{T}_{h}}\eta _{T},}$$

with 0 ≤ Θ ≤ 1. The most interesting, especially in the context of optimality of a standard adaptive finite element method, is a Dörfler marking strategy [53], where the set of marked elements is defined as the subset \(\mathcal{M}\subseteq \mathcal{T}_{h}\) of smallest cardinality such that

$$\displaystyle{ (1-\varTheta )^{2}\sum _{ T\in \mathcal{T}_{h}}\eta _{T}^{2} \leq \sum _{ T\in \mathcal{M}}\eta _{T}^{2}, }$$
(9.10)

where 0 ≤ Θ ≤ 1. i.e., Θ = 0 corresponds to \(\mathcal{M} = \mathcal{T}_{h}\) and Θ = 1 to \(\mathcal{M} = \varnothing \). For more details see [31].

The refinement of the finite element space can be performed using various techniques like moving grid points (r-refinement), subdividing elements of a fixed grid (h-refinement), applying locally higher-order basis functions (p-refinement) or any combinations of those [47], see Fig. 9.1 for illustration.

Fig. 9.1
figure 1

(a) Original mesh, (b) a uniform h-refinement and (c) a uniform p-refinement

For the sake of exposition, here, we consider only the h-refinement of the triangle elements. The most common h-refinement subdivision rules (techniques) based on edge marking are presented in Fig. 9.2.

Fig. 9.2
figure 2

Bisec3, red, green and blue refinement. The edges marked by MARK step are colored. The new reference edge is marked through a second line in parallel opposite the new vertices new 1, new 2 or new 3 [38]

As we have mentioned before, applying these refinement procedures may lead to nonconforming meshes with the so-called hanging nodes. Therefore, the closure algorithm [38] is applied to overcome this drawback and get a regular triangulation. Summarizing, if any edge E of the element T is marked for the refinement, the reference edge (e.g. longest edge) of T should also be marked. Since each element has k = 0, 1, 2, or 3 edges marked for the refinement, if k ≥ 1, the reference edge belongs to it. Moreover, the choice of a refinement method, see Fig. 9.2, depends on k, for instance, if k = 1 the green refinement is used etc. For more details about adaptive refinement strategies see, e.g., [4, 47, 135]. In the remainder of this survey we will focus on the h-adaptivity, hence, we will denote the finite dimensional space over the partition \(\mathcal{T}_{h}\) as V h and the associated Galerkin approximation as (λ h , u h ).

2 Error Estimates

Over the years, research in fundamental spectral approximation by [1013, 59, 83, 91, 130] resulted in the unified spectral approximation framework, nowadays referred to as a so-called Babuška-Osborn Theory [14, Theorem 7.1–7.4, 8.1–8.4].

Theorem 1 ([14, 25])

Let u (i) be a unit eigenfunction associated with the eigenvalue λ (i) of multiplicity m and let u h (i) ,…,u h (i+m−1) denote the eigenfunctions associated with the m discrete eigenvalues converging to λ (i) , then for each index i

$$\displaystyle{\lambda ^{(i)} \leq \lambda _{ h}^{(j)} \leq \lambda ^{(i)} + C\sup \limits _{\stackrel{ u\in E_{\lambda ^{(i)}}}{\|u\|=1}}\inf \limits _{v\in V _{h}}\|u - v\|_{V }^{2},\quad j = i,\ldots,i + m - 1,}$$

and there exists w h (i) ∈span{u h (i) ,…,u h (i+m−1) } such that

$$\displaystyle{\|u^{(i)} - w_{ h}^{(i)}\|_{ V } \leq C\sup \limits _{\stackrel{u\in E_{\lambda ^{(i)}}}{\|u\|=1}}\inf \limits _{v\in V _{h}}\|u - v\|_{V },}$$

where \(E_{\lambda ^{(i)}}\) denotes the eigenspace associated with λ (i) and C is a generic constant.

Remark 3

For λ (i) being simple, we get the following estimates

$$\displaystyle{\vert \lambda ^{(i)}-\lambda _{ h}^{(i)}\vert \leq C\sup \limits _{\stackrel{ u\in E_{\lambda ^{(i)}}}{\|u\|=1}}\inf \limits _{v\in V _{h}}\|u-v\|_{V }^{2}\quad \mbox{ and}\quad \|u^{(i)}-u_{ h}^{(i)}\|_{ V } \leq C\sup \limits _{\stackrel{u\in E_{\lambda ^{(i)}}}{\|u\|=1}}\inf \limits _{v\in V _{h}}\|u-v\|_{V }.}$$

Theorem 1 have some important consequences. First, it is easy to notice that the convergence rate of the eigenvalue/eigenfunction of interest is directly related to the approximability of the associated eigenfunctions (eigenspace). Namely, the approximation rate for the eigenvalue is double with respect to the approximation rate of the corresponding eigenfunctions, which is a well-known fact in the standard perturbation theory for the matrix eigenvalue problems and explains nicely why the eigenvalues are usually much more accurate than the corresponding eigenfunctions. On the other hand, the approximability of the eigenfunctions depends strongly on their regularity and the approximability conditions of the discrete solution space V h , e.g. the finite element space. Hence, following [127, Theorem 3.6], we consider the regularity result for the eigenvalue problem (9.2) as

$$\displaystyle{\|u\|_{H^{2}(\varOmega )} = C\sqrt{\lambda },}$$

where C is a generic constant depending on the data, but not on λ itself. The latter condition, is a well-known phenomena, called the best approximation property of the solution space. The choice of V h being a space of piecewise polynomials of degree p guarantees the best approximation property of the finite element space [25, 48], i.e.,

$$\displaystyle\begin{array}{rcl} \inf \limits _{v\in V _{h}}\|u - v\|_{L^{2}(\varOmega )}& \leq & \mathit{Ch}^{\min \{p+1,r\}}\|u\|_{ H^{r}(\varOmega )}, {}\\ \inf \limits _{v\in V _{h}}\|u - v\|_{H^{1}(\varOmega )}& \leq & \mathit{Ch}^{\min \{p,r-1\}}\|u\|_{ H^{r}(\varOmega )}, {}\\ \end{array}$$

where r is the regularity of the eigenspace (in the case of a simple eigenvalue, the regularity of the corresponding eigenfunction). These two facts immediately explain the common fact of deteriorating rates of convergence in the presence of singularities (singular eigenfunctions). Moreover, the eigenfunctions present more oscillations when the associated eigenvalue increases and the largest eigenvalues are much harder to approximate [38].

The Babuška-Osborn Theory provided important basis for further developments in a priori and a posteriori estimates and designing efficient numerical algorithms. In the remainder of this section we will focus in more details on estimating the eigenvalue/eigenfunction errors.

2.1 A Priori Error Estimates

In general, a priori error estimators give information about the asymptotic behavior of the error or the stability of the applied solver independent of the actually computed approximation. Likewise, they require particular regularity conditions of the solution, the stability properties of the discrete operator or the continuous solution u itself [4, 135] and approximability conditions of the discrete solution spaces. Except of some simple one-dimensional boundary value problems, where an optimal finite element space can be constructed based on a priori estimates, see [112, Lecture 1] or [113] for details. All these conditions make a priori error estimators not computable and hardly applicable in practice. Of course, if some a priori information about the solution is known it can be relevant for the construction of efficient numerical algorithms, e.g. a priori adaptivity technique [117], unfortunately, this is often not the case.

One of the simplest a priori error results obtained in [121] gives estimates to the piecewise linear eigenvalue/eigenfunction error in L 2(Ω) and energy norm depending on the regularity of the solution space, i.e.,

$$\displaystyle{\vert \!\vert \!\vert u - u_{h}\vert \!\vert \!\vert \leq \mathit{Ch}^{r},\quad \|u - u_{ h}\|_{L^{2}(\varOmega )} \leq \mathit{Ch}^{r}\vert \!\vert \!\vert u - u_{ h}\vert \!\vert \!\vert,\quad \vert \lambda -\lambda _{h}\vert \leq C\vert \!\vert \!\vert u - u_{h}\vert \!\vert \!\vert ^{2} \leq \mathit{Ch}^{2r},}$$

for u ∈ H 1+r(Ω), where r ∈ (0, 1] is a regularity constant and C > 0 is a constant depending on the eigenvalue λ, the eigenfunction u and on the triangulation \(\mathcal{T}_{h}\), see, e.g. [77, Corollary 11.2.21, p. 264]. Note that if r = 1 (convex domain), the solution u has to fulfil a H 2(Ω)-regularity condition, which is very restrictive and excludes a large class of problems, e.g., the Laplace eigenvalue problem on the L-shape domain. More details can be found in [72] for nonconvex domains and in [1, 2, 65] for higher order polynomials. As mentioned before, also here, the eigenvalue approximation is much more accurate (double) than the corresponding eigenfunction approximation, i.e., the eigenfunctions are first order convergent in H 1(Ω) and second order convergent in L 2(Ω). In [14] this result was generalized for multiple eigenvalues. Also, in this case, the error behaves like \(\mathcal{O}(h)\) for the energy norm of the eigenfunction and \(\mathcal{O}(h^{2})\) for the eigenvalue. Several further results include improved eigenvector estimates in [12, 13], refined estimates in H 1(Ω)−norm together with the lower and upper bound for the eigenvalue error in the case of multiple eigenvalues [43, 45], to mention just a few.

In [90, Theorems 3.1, 3.2 and 3.3] some new a priori FEM error estimates for simple, multiple and clustered eigenvalues were proposed. The error estimate for a simple eigenvalue [90, Theorem 2.7] depend on the continuous eigenfunction u (i) and its approximability properties in space V h , but do not involve any underdetermined constants. Analogously, for the multiple or clustered eigenvalues [90, Theorem 2.11], the a priori error estimates depend only on approximability properties of eigenfunctions in the corresponding eigenspace (invariant subspace). Moreover for clustered eigenvalues the presented estimates are cluster robust, i.e., they do not depend on the width of the cluster. This work has improved several previous results involving the approximability properties of all previous eigenvectors and easily explained different convergence rates of Ritz values approximating a multiple eigenvalue [1214]. To conclude, we present an a priori eigenvalue error estimator for the FEM approximation of a simple eigenvalue introduced in [90, Theorems 3.1 and 3.2], i.e.,

Theorem 2

Knyazev [90, Theorem  3 .2] For a fixed index i such that 1 ≤ i ≤ n h suppose that

$$\displaystyle{d_{i,V _{h}}:=\min \limits _{j=1,\ldots,i-1}\vert \lambda _{h}^{(j)} -\lambda ^{(i)}\vert \neq 0,}$$

then

$$\displaystyle\begin{array}{rcl} 0 \leq \frac{\lambda _{h}^{(i)} -\lambda ^{(i)}} {\lambda _{h}^{(i)}} & \leq & \left (1 +\max _{j=1,\ldots,i-1} \frac{\big(\lambda _{h}^{(j)}\lambda ^{(i)}\big)^{2}} {\vert \lambda _{h}^{(j)} -\lambda ^{(i)}\vert ^{2}}\vert \!\vert \!\vert (I - P_{h})\mathfrak{T}P_{h,1,\ldots,i-1}\vert \!\vert \!\vert ^{2}\right ) {}\\ & & \sin ^{2}\angle _{\vert \!\vert \!\vert \cdot \vert \!\vert \!\vert }(u_{i},V _{h}), {}\\ \end{array}$$

where P h : V → V h is an \(\vert \!\vert \!\vert \cdot \vert \!\vert \!\vert\) -orthogonal projector on V h , i.e., for all u ∈ V, a(Phu − u,v) = 0, for all v ∈ Vh and P h,1,…,i−1 is the \(\vert \!\vert \!\vert \cdot \vert \!\vert \!\vert\) -orthogonal projector onto \(E_{h,1,\ldots,i-1} = \mathit{span}\left \{u_{h}^{(1)},\ldots,u_{h}^{(i-1)}\right \}\) .

Remark 4

If i = j, Theorem 2 turns into [90, Theorem 3.1], namely

$$\displaystyle{ 0 \leq \frac{\lambda _{h}^{(i)} -\lambda ^{(i)}} {\lambda _{h}^{(i)}} \leq \vert \!\vert \!\vert (I - P_{h})P_{1,\ldots,i}\vert \!\vert \!\vert ^{2}. }$$
(9.11)

Finally, we would like to mention an a priori error estimate obtained in [125, Theorem 3], [85, Theorem 3.3], [127, Theorem 4.11], i.e.,

Theorem 3 (Saad [125, Theorem 3], [127, Theorem 4.11])

Let (μ (i) ,u (i) ), 1 ≤ i ≤ n h be the i-th eigenpair of the operator \(\mathfrak{T}\) defined in  (9.7) with normalization \(\|u^{(i)}\|_{H^{1}(\varOmega )} = 1\) . Suppose that

$$\displaystyle{\widehat{d}_{i,V _{h}}:=\min \limits _{j\neq i}\vert \mu _{h}^{(j)} -\mu ^{(i)}\vert \neq 0,}$$

then there exists u h (i) such that

$$\displaystyle{\|u^{(i)} - u_{ h}^{(i)}\|_{ H^{1}(\varOmega )} \leq \bigg (1 + \frac{\|(I - P_{h})\mathfrak{T}P_{h}\|_{H^{-1}(\varOmega )}^{2}} {\widehat{d}_{i,V _{h}}^{2}} \bigg)^{1/2}\inf \limits _{ v\in V _{h}}\|u^{(i)} - v\|_{ H^{1}(\varOmega )}.}$$

The aforementioned theorem is a special case of a more general result for eigenspaces and invariant subspaces proposed in [85, Theorem 2.1, 3.1]. Obviously, the a priori error estimators are not limited to those listed above and include work presented in [75, 87, 92, 115] etc.

2.2 A Posteriori Error Estimates

Although a priori error estimates are usually not available for practical problems, we would still like to determine the quality of the numerical approximation, i.e., obtain a reliable estimate of the error e h  = uu h in a specified norm \(\|\cdot \|\) or quantity of interest (quality measure), e.g., energy norm \(\vert \!\vert \!\vert \cdot \vert \!\vert \!\vert\) [4, 34, 135], or to terminate the algorithm as soon as a prescribed tolerance \(\varepsilon\) is reached, e.g. \(\|e_{h}\| \leq \varepsilon\). Therefore, we need some computable quantity η h (a posteriori error estimator) which can estimate the actual error \(\|e_{h}\|\), i.e.,

$$\displaystyle{\|u - u_{h}\| \approx \eta _{h}.}$$

The formal definition, see [34], states as follows:

Definition 1 (A posteriori error estimator)

A computable quantity η h is called a posteriori error estimator if it can be extracted from the computed numerical solution u h and the given data of the problem, e.g. the known domain Ω and its boundary ∂ Ω.

There are several important practical requirements on a posteriori error estimators. First, as the definition states, they should be computable. Secondly, in contrast to a priori error estimators, they depend on the stability properties of the continuous operator which are known and use the approximate solution itself to check its quality. Last but not least, calculating the estimator should be cheaper than computing the new numerical approximation (e.g., assembling the matrices). Besides, it is of great importance, especially in the context of the AFEM, to be able to extract the local contribution of the error estimator, i.e., the refinement indicators η T ,

$$\displaystyle{\eta _{h} =\Big (\sum \limits _{T\in \mathcal{T}_{h}}\eta _{T}^{2}\Big)^{1/2}.}$$

As far as a global upper bound is sufficient to assure the accuracy of the solution, an a posteriori error estimator should also provide local lower bound for the true error.

These properties of a posteriori error estimator η h are called reliability (guaranteed upper bound)

$$\displaystyle{\vert \!\vert \!\vert e_{h}\vert \!\vert \!\vert \leq C_{\mathit{rel}}\eta _{h} + h.o.t_{\mathit{rel}}}$$

and local efficiency

$$\displaystyle{\eta _{T} \leq C_{\mathit{eff }}\|e_{h}\|_{\widehat{T}} + h.o.t._{\mathit{eff }},}$$

with constants C rel , C eff  > 0 independent of the mesh size h or polynomial degree p, \(\widehat{T}\) the union of T and neighbouring elements and higher-order terms h. o. t rel , h. o. t eff related to data oscillations. Both these bounds are crucial from the point of convergence and optimality, respectively. Namely, it is well-known that a reliable and efficient a posteriori error estimator, i.e.,

$$\displaystyle{ \frac{1} {C_{\mathit{eff }}}\eta _{h} \leq \vert \!\vert \!\vert e_{h}\vert \!\vert \!\vert \leq C_{\mathit{rel}}\eta _{h}.}$$

decays with the same rate as the actual computational error up to higher-order terms. We discuss this issue in detail in Sect. 9.4. The aforementioned definition of the reliability, where the constant C rel is present, is called a weak form of reliability. In the ideal situation we would like the estimator to satisfy

$$\displaystyle{\vert \!\vert \!\vert e_{h}\vert \!\vert \!\vert \leq \eta _{h},}$$

which is very rarely the case.

In order to conclude, the main goal is to obtain an accurate solution with an optimal use of resources and guarantee that the a posteriori error estimator captures the behavior of the actual error as h → 0. In practice, we are often interested in the asymptotical exactness or efficiency of the a posteriori error estimator. Following [4], we call the error estimator η h asymptotically exact if

$$\displaystyle{\lim _{h\rightarrow 0}\theta = 1,}$$

where \(\theta:= \frac{\eta _{h}} {\|e_{h}\|}\) is called a global efficiency index. An error estimator is called efficient if its efficiency index θ and its inverse are bounded independent on the mesh size [34].

The pioneering work of Babuška and Rheinboldt [15] initiated decades of research devoted to a posteriori error estimates. We strongly encourage the reader to further explore the vast literature on the a posteriori error estimates, see e.g., [4, 16, 32, 135, 136]. Following [47, 135], the a posteriori error estimators can be classified as residual error estimators (explicit error estimators), local problem-based estimators (implicit error estimators), averaging estimators (recovery-based estimators), hierarchical estimators (multilevel estimators) and goal-oriented error estimators. For the sake of exposition, let us concentrate on the residual type estimators and provide only some general information for the other classes.

2.2.1 The Residual Error Estimators (Explicit Error Estimators)

Whereas a priori error estimators relate the error \(\|u - u_{h}\|_{V }\) to the regularity of the exact solution, residual a posteriori error estimators consider the connection of the error to the residual of the computed finite element solution u h .

Let us consider the residual

$$\displaystyle{\mathit{Res}_{h}(\cdot ):=\lambda _{h}(u_{h},\cdot )_{U} - a(u_{h},\cdot )_{U} \in V ^{{\ast}}}$$

and the residual equation

$$\displaystyle{ \mathit{Res}_{h}(v) = a(u - u_{h},v) - (\lambda u -\lambda _{h}u_{h},v)_{U}\quad \mbox{ for all }\ v \in V, }$$
(9.12)

where V denotes the dual of the space V.

First of all, notice that the Galerkin orthogonality property does not hold for variationally stated eigenvalue problems, namely

$$\displaystyle{ a(u - u_{h},v_{h}) = (\lambda u -\lambda _{h}u_{h},v_{h})_{U}\neq 0\quad \mbox{ for some }\ v_{h} \in V _{h}. }$$
(9.13)

Secondly, since e h  ∈ V, Eq. (9.12) combined with the higher-order term [54]

$$\displaystyle{ (\lambda u -\lambda _{h}u_{h},e_{h})_{U} = \frac{\lambda +\lambda _{h}} {2} \|e_{h}\|_{U}^{2} }$$
(9.14)

imply

$$\displaystyle{ \vert \!\vert \!\vert u - u_{h}\vert \!\vert \!\vert ^{2} = \frac{\lambda +\lambda _{h}} {2} \|u - u_{h}\|_{U}^{2} + \mathit{Res}_{ h}(u - u_{h}), }$$
(9.15)

which provides the crucial relation in the residual type error analysis, namely, the equivalence between the energy norm of the error and the residual, which, up to the higher-order terms was proved for the selfadjoint eigenvalue problems in [38].

Theorem 4

Let e h = u − u h and Res h (⋅):= λ h (u h ,⋅) U − a(u h ,⋅). Then the following holds

$$\displaystyle{ \alpha \|u - u_{h}\|_{V } \lesssim \|\mathit{Res}_{h}\|_{V ^{{\ast}}} \lesssim \beta \| u - u_{h}\|_{V }, }$$
(9.16)

where 0 < α ≤β < ∞ are the coercivity and continuity constants introduced in  (9.4) and  (9.3) , respectively.

Proof

The coercivity (9.4), the residual equation (9.12) and (9.14) imply

$$\displaystyle\begin{array}{rcl} \alpha \|u - u_{h}\|_{V }& \leq & \frac{a(e_{h},e_{h})} {\|e_{h}\|_{V }} = \frac{\mathit{Res}_{h}(e_{h})} {\|e_{h}\|_{V }} + \frac{(\lambda +\lambda _{h})} {2} \frac{\|e_{h}\|_{U}^{2}} {\|e_{h}\|_{V }} {}\\ & \leq & \sup \limits _{v\in V }\frac{\mathit{Res}_{h}(v)} {\|v\|_{V }} + \frac{(\lambda +\lambda _{h})} {2} \frac{\|e_{h}\|_{U}^{2}} {\|e_{h}\|_{V }} {}\\ & =& \|\mathit{Res}_{h}\|_{V ^{{\ast}}} + \frac{(\lambda +\lambda _{h})} {2} \frac{\|e_{h}\|_{U}^{2}} {\|e_{h}\|_{V }}. {}\\ \end{array}$$

Since \(\frac{\lambda +\lambda _{h}} {2} \|e_{h}\|_{U}^{2}\) was proved to be of higher order, see e.g. [54], it can be neglected and the left inequality holds. Furthermore the continuity (9.3) implies

$$\displaystyle\begin{array}{rcl} \mathit{Res}_{h}(v)& =& \lambda _{h}(u_{h},v)_{U} - a(u_{h},v) \\ & =& \lambda _{h}(u_{h},v)_{U} - a(u_{h},v) + a(u,v) -\lambda (u,v)_{U} \\ & =& a(u - u_{h},v) + (\lambda _{u}u_{h} -\lambda u,v)_{U} \\ & =& a(u - u_{h},v) + \frac{(\lambda +\lambda _{h})} {2} \|v\|_{U}^{2} \\ & \leq & \beta \|u - u_{h}\|_{V }\|v\|_{V } + \frac{(\lambda +\lambda _{h})} {2} \|v\|_{U}^{2} {}\end{array}$$
(9.17)

and therefore

$$\displaystyle{\|\mathit{Res}_{h}\|_{V ^{{\ast}}} = \frac{\mathit{Res}_{h}(v)} {\|v\|_{V }} \leq \beta \| u - u_{h}\|_{V } + \frac{(\lambda +\lambda _{h})} {2} \frac{\|v\|_{U}^{2}} {\|v\|_{V }},}$$

which completes the proof.

Theorem 4 proves that the dual norm of the residual, i.e., \(\|\mathit{Res}_{h}\|_{V ^{{\ast}}}\) is equivalent to the error \(\|u - u_{h}\|_{V }\). Nevertheless, it is still a challenge to estimate the dual norm of the residual Res h (v) in the most reliable and efficient way.

Remark 5

Since the standard norm \(\|\cdot \|_{V }\) in V is equivalent to the energy norm \(\vert \!\vert \!\vert \cdot \vert \!\vert \!\vert\), see Remark 1, the dual norm of the residual \(\|\mathit{Res}_{h}\|_{V ^{{\ast}}}\) is also equivalent to the energy norm of the error, i.e., \(\vert \!\vert \!\vert u - u_{h}\vert \!\vert \!\vert\).

Now, exploiting the variational eigenvalue problem (9.2) and its Galerkin discretization (9.8) it is easy to derive a simple residual type a posteriori estimator

$$\displaystyle{\eta _{\mathit{res},h} \equiv \left (\sum _{T\in \mathcal{T}_{h}}h_{T}^{2}\|\varDelta u_{ h} +\lambda _{h}u_{h}\|_{L^{2}(T)}^{2} +\sum _{ E\in \mathcal{E}_{h}}h_{E}\|\,[\nabla u_{h} \cdot \mathbf{n}_{E}]\,\|_{L^{2}(E)}^{2}\right )^{1/2}\,,}$$

h T : = diam(T) and h E : = length(E), such that

$$\displaystyle{ \vert \!\vert \!\vert u - u_{h}\vert \!\vert \!\vert ^{2} \leq C\,\eta _{\mathit{ res},h}\vert \!\vert \!\vert u - u_{h}\vert \!\vert \!\vert + \frac{\lambda +\lambda _{h}} {2} \|u - u_{h}\|_{L^{2}(\varOmega )}^{2}\,, }$$
(9.18)

see [27, Section 6.3], [54, Theorem 3.1], [137, Section 4], or the earlier work of Larson [92]. Here, the constant C > 0 depends on the minimal angle allowed in the mesh elements, on the Poincaré-Friedrichs inequality constant (which is a function of the volume of Ω and the area of the portion of ∂ Ω corresponding to the Dirichlet condition, see [135, p. 11]) and on the constants α, β from the Lax-Milgram conditions (9.3) and (9.4), see [95]. However, the possibly large value of C can produce a significant overestimate of the error, see e.g. [36, 46].

In (9.18) the energy norm of the error is bounded by the sum of local contributions of the interior (volumetric) element residuals Δ u h +λ h u h , measuring how good the finite element approximations λ h , u h satisfy the original PDE in its strong form on the interior of the domain, and of the edge residuals, the jumps of the gradient of u h over the element edges E, reflecting the accuracy of the approximation [31, 135]. Here h T , h E denote the mesh-depending weights and \(\|\cdot \|_{L^{2}(T)}\), \(\|\cdot \|_{L^{2}(E)}\) the problem dependent, local norms.

As it was shown in [54] that the L 2-norm of the error is of higher order than the energy norm of the error (9.18) represents an a posteriori estimate for the energy norm of the error. This residual a posteriori error estimator is reliable in a weaker form with the constant C in, e.g. [36, 37, 134], and it is locally efficient, see e.g. [4, 135]. The asymptotic exactness of the estimator usually does not hold in practical computations. Many interesting results on residual type a posteriori error estimates for eigenvalue/eigenvector approximations were proposed in the last two decades, see e.g. [74, 92, 108, 115, 137], to mention only few.

The residual a posteriori estimators, though well-understood and well-established in practice, may significantly overestimate the actual error due to the possibly very large constant C present in the bound. Therefore, several other techniques, which we now briefly review, were introduced over the last years, see, e.g. [4, 16, 47, 69, 136].

2.2.2 Local Problem-Based Estimators (Implicit Estimators)

In the case of explicit error estimators all information about the total error is obtained only from the computed approximation. The main idea behind implicit estimators is to enrich these information by solving some supplementary problem, e.g., local analogues of the residual equations. In order to capture the local behavior of the solution and to get accurate information about the error, the local problems usually involve only small subdomains of Ω (subdomain or element residual method) and more accurate finite element spaces, see [69] for more details. In terms of complexity the solution of all local problems should cost less than assembling the stiffness matrix of the original discrete problem. Implicit error estimators for boundary value problems, e.g. partition of unity or equilibration estimators, are discussed in [4, Chapter 3], [47, Section 6.3.2], [66, Section 15.3], [135, Section 1.3] and [3, 31, 42]. A proof of the local efficiency for this type of estimator can be found, e.g., in [4], whereas reliability and asymptotic exactness are usually satisfied in practical computations. To the best of the author’s knowledge there are no local problem-based error estimators designed specifically for eigenvalue problems.

2.2.3 Averaging Estimators (Recovery-Based Estimators)

These error estimators, also known as ZZ-estimators [145], exploit a local extrapolation or averaging techniques. Due to the high accuracy, practical effectiveness and robustness they are widely used in engineering applications. In general, the error of the approximation is controlled by a difference of a low-order approximation (e.g., a piecewise constant function) and a finite element solution obtained in the space of higher-order elements (e.g., globally continuous piecewise linear functions) satisfying more restrictive continuity conditions than the approximation itself e.g. [4, Chapter 4], [47, Section 6.3.3] or [135, Section 1.5]. For example, if a quantity to be recovered is a gradient ∇u h , the main idea is to compare the smoothed and unsmoothed gradients to estimate the actual error. Reference [34] gives a nice overview of averaging techniques in a posteriori finite element error analysis in general, whereas [16, 143, 144, 146, 147] discuss the gradient recovery-based estimators in details. A local averaging technique for eigenvalue problems was proposed in [97]. Here, we present an improved averaging a posteriori error estimator neglecting the volumetric contributions introduced in [38].

Let \(\mathcal{A}_{h}: L^{2}(\varOmega ) \rightarrow \mathcal{S}^{1}(\mathcal{T}_{h})\), \(\mathcal{S}^{1}(\mathcal{T}_{h}):= \mathbb{P}_{1} \cap H_{0}^{1}(\varOmega )\) be a local averaging operator, i.e.,

$$\displaystyle{\mathcal{A}_{h}(v):=\sum \limits _{z\in \mathcal{N}_{h}} \frac{1} {\vert \omega _{z}\vert }\Big(\int \limits _{\omega _{z}}v\;\mathit{dx}\Big)\varphi _{z},}$$

with a nodal hat function \(\varphi _{z}\) and a nodal patch ω z associated with node z. Then the averaging error estimator for problem (9.2) reads

$$\displaystyle{ \eta _{\mathit{avg},h}^{2}:=\sum \limits _{ T\in \mathcal{T}_{h}}\|\mathcal{A}_{h}(\nabla u_{h}) -\nabla u_{h}\|_{L^{2}(T)}^{2}. }$$
(9.19)

The reliability of the averaging error estimator (9.19) is proved in [38], whereas the efficiency follows from the fact that the averaging estimator is locally equivalent to the residual estimator [33, 34, 135]. The proof of the asymptotic exactness can be found, e.g., in [4, 135]. More details on recovery-based a posteriori error estimators for higher-order polynomials can be found in [35, Section 9.4]. Recovery type a posteriori error estimates for the eigenvalues and eigenfunctions of selfadjoint elliptic equations by the projection method are derived in [104, 139] and [96] for conforming and nonconforming finite elements, respectively.

2.2.4 Hierarchical Estimators (Multilevel Estimators)

The main idea of a hierarchical error estimator is to evaluate the residual obtained for the finite element solution u h  ∈ V h with respect to another finite element space V h satisfying V h  ⊂ V h ⊂ V. Then the error \(\vert \!\vert \!\vert u - u_{h}\vert \!\vert \!\vert\) can be bounded by

$$\displaystyle{\eta _{\mathit{hie},h}:=\vert \!\vert \!\vert u_{h'} - u_{h}\vert \!\vert \!\vert,}$$

where u h ∈ V h, see [20, 52, 58], [47, Chapter 6], or [135, Section 1.4] for details. The finite element space V h corresponds usually to a refinement \(\mathcal{T}_{h'}\) of \(\mathcal{T}_{h}\) or consists of higher-order finite elements. The idea behind goes back to a so-called saturation assumption [21] stating that the error of a fine discrete solution u h is supposed to be smaller than the error of the coarse solution u h in the sense of an error reduction property, i.e.,

$$\displaystyle{\vert \!\vert \!\vert u_{h'} - u\vert \!\vert \!\vert \leq \sigma \vert \!\vert \!\vert u_{h} - u\vert \!\vert \!\vert,}$$

where σ ∈ (0, 1). Good general references concerning hierarchical estimators are [20, 21, 52, 58]. Hierarchical error estimators for eigenvalue problems are discussed in [98, 108].

2.2.5 Goal-Oriented Estimators

The objective in goal-oriented error estimation is to determine the accuracy of the finite element solution u h with respect to some physically relevant scalar quantity of interest given as a linear functional \(J(\cdot ): V \rightarrow \mathbb{R}\) of the solution u, e.g. velocity, flow rates, deformations, stresses or lifts and drags in the case of Navier-Stokes problems etc. The error in the quantity of interest is then related to the residual, i.e.,

$$\displaystyle{\eta _{h}:= \vert J(u) - J(u_{h})\vert \approx \sum _{T\in \mathcal{T}_{h}}\rho _{T}(u_{h})\omega _{T},}$$

where ρ T (u h ) denotes the so-called “cell residuals” of the approximate solution, and ω T a corresponding “cell weights”. The latter are obtained from the solution u of the so-called dual problem, which, in practice, is replaced by its locally postprocessed discrete approximation u h . In order to make this abstract concept a little more explicit, for a simple boundary value problem \(\mathcal{L}u = f\) the cell residuals read

$$\displaystyle{\|\mathcal{L}u_{h} - f\|_{L^{2}(T)} + h_{T}^{1/2}\|[\nabla u_{ h} \cdot \mathbf{ n}_{E}]\|_{L^{2}(\partial T)},}$$

with ∂ T being the boundary of an element \(T \in \mathcal{T}_{h}\). Probably, one of the most well-known techniques of goal-oriented error estimation is the Dual Weighted Residual (DWR) method introduced in [119]. The reliability, efficiency and asymptotic exactness of goal-oriented estimators are typically hard to prove, however, they are very successful in many challenging practical applications. For eigenvalue problems, the full potential of a goal-oriented a posteriori error estimation was demonstrated in [81] as a successful application of the DWR method to non-selfadjoint operators. For eigenvalues λ and λ h being simple, the DWR a posteriori error estimator of the following form is proposed

$$\displaystyle{\vert \lambda -\lambda _{h}\vert \leq c\sum \limits _{T\in \mathcal{T}_{h}}h_{T}^{2}\Big(\rho _{ T}(\lambda _{h},u_{h}) +\rho _{ T}^{{\ast}}(\lambda _{ h}^{{\ast}},u_{ h}^{{\ast}})\Big),}$$

where ρ T (λ h , u h ) and ρ T (λ h , u h ) denote cell residuals of the primal and dual problem, respectively. See [18, Chapter 7], [81, 116] for more details.

3 Adaptive Finite Element Eigenvalue Solver

The choice of a proper iterative eigenvalue solver is an integral part of the successful adaptive finite element scheme. We present some well-established iterative methods which admit the quasi-optimal computational complexity on uniform meshes. However, since generated meshes are refined adaptively, there is an increasing demand for designing efficient and reliable matrix-free eigensolvers with mesh size independent convergence rates. We will discuss this issue, in more details, in the following sections.

Let us consider the generalized algebraic eigenvalue problem

$$\displaystyle{ \mathit{Ax} =\lambda \mathit{Bx} }$$
(9.20)

resulting from the finite element discretization of (9.8), namely,

$$\displaystyle{A = A_{h},\quad B = B_{h}\quad \mbox{ and }\quad x = \mathbf{u}_{h},}$$

defined as in (9.9).

3.1 PINVIT

The preconditioned inverse iteration (PINVIT), introduced and analyzed in series of papers [89, 105, 106, 109], is an iterative method for solving the generalized eigenvalue problem (9.20) written as a system of linear equations Ax k+1 = λ(x k )Bx k , where the new iterate x k+1 is determined as

$$\displaystyle{x_{k+1} = x_{k} - M^{-1}\big(\mathit{Ax}_{ k} -\lambda (x_{k})\mathit{Bx}_{k}\big),}$$

with a symmetric and positive definite optimally scaled preconditioner (e.g., multigrid preconditioner) M −1 of the matrix A such that

$$\displaystyle{\|I - M^{-1}A\|_{ A} \leq \gamma,\quad \gamma \in [0,1).}$$

The corresponding error propagation equation

$$\displaystyle{x_{k+1} -\lambda (x_{k})A^{-1}\mathit{Bx}_{ k} = (I - M^{-1}A)\big(x_{ k} -\lambda (x_{k})A^{-1}\mathit{Bx}_{ k}\big),}$$

not only illustrates the dependence between the initial error x k λ(x k )A −1 Bx k , the new iterate error x k+1λ(x k )A −1 Bx k and the error propagation matrix (reducer) IM −1 A, but presents a more general relation between preconditioned gradient eigensolvers [68, 126] and preconditioned inverse iteration. PINVIT can be viewed as a counterpart of multigrid algorithms for the solution of boundary value problems. As a simple mesh-free eigensolver, with convergence independent on the largest eigenvalue and the mesh size, it is perfectly suitable for grid-dependent eigenvalue problems. More details about the subspace version of the method can be found in [30, 107]. A continuous counterpart of the preconditioned inverse iteration was proposed and analyzed in [142].

3.2 LO(B)PCG

Let us assume that we are given the smallest eigenvalue λ 1 of our problem. Then obtaining the corresponding eigenvector requires solving the homogeneous linear system (Aλ 1 B)x 1 = 0. The method of choice in this case would be a (preconditioned) conjugate gradient ((P)CG) method [28]. Though, in practice, the exact eigenvalue is not known, the underlying idea is still useful and can be combined with the standard preconditioned steepest descent (PSD) method [78, 126]. A sharp convergence estimate and a subspace variant of PSD combined with AFEM are discussed in [110, 111].

The Locally Optimal (Block) Preconditioned Conjugate Gradient (LO(B)PSCG) [86] method combines a three-term recurrence method with the robust and simple Rayleigh-Ritz minimization procedure which allows (allowing) efficient solutions of large and sparse eigenvalue problems. The main idea of the method is to determine a new eigenvalue/eigenvector approximation as the smallest Ritz value/vector with respect to the three-dimensional space span{x k ,  M −1(Ax k λ(x k )Bx k ),  x k−1}. The new iterate x k+1 is now determined as

$$\displaystyle{x_{k+1} = x_{k} -\vartheta _{k}x_{k-1} -\xi _{k}M^{-1}(\mathit{Ax}_{ k} -\lambda (x_{k})\mathit{Bx}_{k}),}$$

where

$$\displaystyle{(\vartheta _{k},\xi _{k}) =\arg \min \limits _{(\vartheta,\xi )}\lambda (x_{k} -\vartheta x_{k-1} -\xi M^{-1}(\mathit{Ax}_{ k} -\lambda (x_{k})\mathit{Bx}_{k})).}$$

It is important to notice that the preconditioner is not used in inner iterations to solve the linear system, but it is directly integrated into a Krylov-based iteration. LO(B)PCG is broadly used eigensolver within the AFEM due to its low memory requirements (only one additional vector has to be stored), reasonable complexity (few additional inner products to determine the Rayleigh-Ritz projection) and its convergence. On every step, the LO(B)PSCG is not slower than the preconditioned steepest descent in terms of the maximizing the Rayleigh quotient [84], however, in practice the convergence is much faster and robust than PSD or Jacobi-Davidson. A commonly used implementation of the method, released by its developer under GNU Lesser GPL, together with several benchmark model problems, is available as Block Locally Optimal Preconditioned Eigenvalue Xolver (BLOPEX) [88].

3.3 Two-Grid Discretization

Already in 1979, the idea of using the multigrid-method for solving mesh eigenproblems was introduced in [76]. A simple one-stage method requires computations of one eigenpair on the coarse grid and approximates further fine grid eigenpairs in a recursive way. Its computational effort is proportional to the dimension of the finite dimensional space and convergence is proved also for the approximate eigenpairs. A well-known example of the class of multigrid approaches is the two-grid discretization method introduced in [140, 141]. The idea of the method is quite simple and uses the underlying expertise from the study of boundary value problems. In particular, we consider to linear finite element spaces V H (Ω) ⊂ V h (Ω) ⊂ H 0 1(Ω), e.g., coarse and fine space, respectively. The solution of an eigenvalue problem on a fine grid is reduced to the solution of an eigenvalue problem on a much coarser grid (mesh size H) followed by the solution of a boundary value problem on the fine grid (mesh size h), whereas the resulting solution maintains an asymptotically optimal accuracy for \(H = \mathcal{O}(\sqrt{h})\). We can summarize the method within three steps:

Step 1 :

Find \((\lambda _{H},u_{H}) \in (\mathbb{R},V _{H})\) s.t. a(u H , v H ) = λ H (u H , v H ) U , for all v H  ∈ V H .

Step 2 :

Find u h  ∈ V h  s.t. a(u h , v h ) = λ H (u H , v h ) U , for all v h  ∈ V h .

Step 3 :

Compute λ h as Rayleigh quotient of u h .

In other words the method can be reformulated as finding a correction e h in the fine space, such that

$$\displaystyle{a(e_{h},v_{h}) =\lambda _{H}(u_{H},v_{h})_{U} - a(u_{H},v_{h})\mbox{ for all }v_{h} \in V _{h}}$$

and setting

$$\displaystyle{u_{h} = u_{H} + e_{h}.}$$

4 Convergence and Optimality Results

In the classical sense the convergence of the FEM requires that for each value h → 0 the approximation error is of required order or accuracy. When dealing with AFEM the goal is to show that the method is a contraction between two consecutive loops. The algebraic convergence rates for adaptive FEM, under the assumption of the exact solution of the algebraic eigenvalue problem, were first proved in [103] and later on improved in [129]. A first convergence results for adaptive finite element methods for eigenvalue problems have been obtained in [64]. Assuming a sufficiently fine initial mesh, Dörfler’s strategy for marking separately error and oscillation indicators, and enforcing the interior node property, the authors proved an error reduction result for consecutive iterates, which is essential for proving quasi-optimality, but very hard to satisfy in practice. Uniform convergence and optimal complexity, relaxing the assumptions of [64], was introduced in [51]. In order to prove convergence, marking of oscillation terms is not required. Moreover, the optimal complexity was shown without any additional assumptions on the data. At the same time, an independent adaptive finite element eigenvalue solver (AFEMES) enabling a contraction property up to higher-order terms (also known as Q-linear convergence) and global strong convergence, was proposed in [38]. Also this result requires no assumptions on the inner node property and small enough mesh size. Furthermore, the same authors provided the first adaptive finite element method combined with an iterative algebraic eigenvalue solver of asymptotic quasi-optimal computational complexity [39]. Another important contribution to be mentioned is the convergence result given in [62]. Here, despite less restrictive initial assumptions (any initial triangulation and marking strategy is allowed) and only minimal refinement of marked elements, the convergence was proved for simple as well as for the multiple eigenvalues. A recent article [35] presents a general framework on optimality of adaptive schemes covering linear as well as nonlinear problems, which embeds the previous results of [38, 51]. The authors consider optimality and convergence of the adaptive algorithm with an optimal convergence rate guaranteed by the efficiency of the error estimator η h , see [35, Theorem 4.5]. In particular, in the case of determining a simple eigenvalue, following [35, Lemma 3.4 and Proposition 10.5], [39, Lemma 4.2] and the convergence of the conforming finite element discretization [130], one can prove that the following four properties,

  1. (A1)

    Stability on non-refined elements,

  2. (A2)

    Reduction property on refined elements,

  3. (A3)

    General quasi-orthogonality,

  4. (A4)

    Discrete reliability of the error estimator;

together with sufficiently small initial mesh size h 0, are sufficient for optimal convergence of an adaptive scheme. Finally, in conclusion, we point the reader to [39, 60, 61] for more results on clustered eigenvalues, nonconforming AFEM, inexact solves and algorithms of optimal computational complexity.

5 The Role of Linear Algebra in AFEM for PDE Eigenvalue Problems

In this section we would like to point the attention of the reader to a very important, though commonly neglected, aspect of practical realization of adaptive FEM. The majority of the AFEM publications consider exact solutions of the algebraic problems (linear systems or eigenvalue problems). When the cost for solving these problems is small and the problems itself are well conditioned independently of the mesh refinement, see [19] and [32, Section 9.5], this assumption is acceptable. However, in real-world applications, adaptive finite element methods are used for challenging, very large and often ill-conditioned problems, for which an exact (up to machine precision) solution is not available. Notice that even a small algebraic residual does not guarantee a good accuracy of the resulting solution, neither for linear systems nor eigenvalue problems. We refer to [79], [5, Section 4] for more details. Moreover, solving the linear algebraic problems to a (much) higher accuracy than the order of the discretization error not only does not improve the overall accuracy but also significantly increases the computational cost [66, Section 13.4.1].

Because of these reasons, in the following, we will advocate for considering the algebraic error as an integral part of the adaptive FEM, especially, in practical applications. Hence, when estimating the total error we will aim at estimates of the form

$$\displaystyle{ \|u - u_{h}^{(n)}\| \approx \eta _{ h,n}, }$$
(9.21)

where η h, n is a function of the approximate solution u h (n) (or λ h (n) and u h (n)) of the linear algebraic problem. Moreover, the fact that the algebraic problems are not solved exactly (and the Galerkin orthogonality does not hold when u h is replaced by u h (n)) should be also taken into account in the derivation of all a posteriori error estimators discussed in Sect. 9.2.2.

A pioneering work in this direction was published in 1995 by Becker, Johnson, and Rannacher [22]. Although dedicated to boundary value problems, it proposes a posteriori error estimates in the H 1(Ω)- and L 2(Ω)-norms that incorporate algebraic errors and design of the adaptive algorithm, and they suggest stopping criterion for the multigrid computations. Several aspects concerning the interplay between discretization and algebraic computation in adaptive FEM are discussed in a recent survey [6].

Now, at step SOLVE of the adaptive FEM applied to problem (9.2), the generalized eigenvalue problem is solved inexactly and we obtain an eigenvector approximation u h (n) and a corresponding eigenvalue approximation λ h (n), associated with the following algebraic errors

$$\displaystyle{\mathbf{u}_{h} -\mathbf{u}_{h}^{(n)}\; \in \; \mathbb{R}^{n_{h} }\quad \mbox{ or}\quad u_{h} - u_{h}^{(n)}\; \in \; V _{ h},\quad \mbox{ and}\quad \lambda _{h} -\lambda _{h}^{(n)}.}$$

The total errors are then given as a sum of the discretization and the algebraic error, i.e.,

$$\displaystyle{ u - u_{h}^{(n)} = (u - u_{ h}) + (u_{h} - u_{h}^{(n)})\quad \mbox{ and} }$$
(9.22)
$$\displaystyle{ \lambda -\lambda _{h}^{(n)} = (\lambda -\lambda _{ h}) + (\lambda _{h} -\lambda _{h}^{(n)}). }$$
(9.23)

For boundary value problems minimizing the total error can be achieved by applying the CG method, which naturally minimizes the algebraic energy norm of the error. However, the same task is much more complicated in the case of eigenvalue problems which, by their nature, are nonlinear. Even the definition of an appropriate (in the physical sense) norm to measure the error for the eigenvalue problem is not trivial and still under intensive consideration, see [80].

In [98], exploiting backward error analysis and saturation assumption, the authors introduce a residual a posteriori error estimators for total errors (9.22) and (9.23) and develop an adaptive FEM, called AFEMLA (LA standing for linear algebra), which incorporates the inexact iterative eigensolver, i.e., the Lanczos method. In particular, this new approach allows for mesh-free adaptation, which is of great interest in the context of the discrete finite element modeling [99], being known in engineering practice for decades.

The concept of a functional backward error and condition number introduced in [7] for boundary value problems is used again in [101] for selfadjoint eigenvalue problems in order to analyze the continuous dependence of the inexact solution on the data, in particular to analyze the approximation error and the backward stability of the algebraic eigenvalue problem. This resulted in a combined residual a posteriori error estimator and a balanced AFEM algorithm, where the stopping criteria are based on the variant of the shift-invert Lanczos method introduced in [80]. A similar direction was considered in [70] in the context of bound-constrained optimization; the ideas introduced there can be applied to the minimization of the Rayleigh-quotient in the case of eigenvalue computations.

When dealing with inexact AFEM, issues such as convergence and optimality are of even greater interest. The convergence of the perturbed preconditioned inverse iteration (PPINVIT), see Sect. 9.3.1, i.e., an algorithm in which the application of the operator is performed approximately, was proved in [24] with bounds for the convergence rate depending on the eigenvalue gap and the quality of the preconditioner. Regarding the optimality of AFEM for eigenvalue problems, in [124] the authors exploited the theory of best N-term approximation. Namely, the number of degrees of freedom needed to obtain the AFEM solution of a given accuracy should be proportional to the number of degrees of freedom needed to approximate the exact solution up to the same accuracy. Under the assumption that the iteration error \(\vert \!\vert \!\vert u_{h} - u_{h}^{(n)}\vert \!\vert \!\vert ^{2} + \vert \lambda _{h} -\lambda _{h}^{(n)}\vert \) for two consecutive AFEM steps is small in comparison with the size of the residual a posteriori error estimate quasi-optimality of the inexact inverse iteration coupled with adaptive finite element method (AFEM) for a class of elliptic eigenvalue problems was proved in [39]. Moreover, the proposed method admits also a quasi-optimal complexity. A similar analysis of convergence and a quasi-optimality of the inexact inverse iteration coupled with adaptive finite element methods was presented in [142] for operator eigenvalue problem.

The aforementioned results have been derived in the context of selfadjoint eigenvalue problems. To deal with non-selfadjoint problems, one can follow results in [23, 100] and their DWR approach. Here duality techniques are used to estimate the error in the target quantities in terms of the weighted primal and dual residuals, i.e.,

$$\displaystyle{ \mathit{Res}_{h}(u_{h},\lambda _{h})(v) \equiv \lambda _{h}(u_{h},v)_{U} - a(u_{h},v), }$$
(9.24)
$$\displaystyle{ \mathit{Res}_{h}^{{\ast}}(u_{ h}^{{\ast}},\lambda _{ h}^{{\ast}})(v) \equiv \lambda _{ h}^{{\ast}}(v,u_{ h}^{{\ast}})_{ U} - a(v,u_{h}^{{\ast}}), }$$
(9.25)

respectively. The resulting estimates, based on a perturbation argument, can be written as

$$\displaystyle{ \lambda -\lambda _{h}^{(n)} \lesssim \left (\eta _{ h,n} +\eta _{ h,n}^{{\ast}} +\eta _{ h,n}^{(\mathit{it})}\right ), }$$
(9.26)

with the primal and the dual eigenvalue residual estimators

$$\displaystyle{ \eta _{h,n} \equiv \frac{1} {2}\mathit{Res}_{h}(u_{h}^{(n)},\lambda _{ h}^{(n)})(I_{ 2h}^{(2)}u_{ h}^{{\ast}(n+1)} - u_{ h}^{{\ast}(n)}), }$$
(9.27)
$$\displaystyle{ \eta _{h,n}^{{\ast}}\equiv \frac{1} {2}\mathit{Res}_{h}^{{\ast}}(u_{ h}^{{\ast}(n)},\lambda _{ h}^{{\ast}(n)})(I_{ 2h}^{(2)}u_{ h}^{(n+1)} - u_{ h}^{(n)}), }$$
(9.28)

the iteration error indicator

$$\displaystyle{ \eta _{h,n}^{(\mathit{it})} = \mathit{Res}_{ h}(u_{h}^{(n)},\lambda _{ h}^{(n)})(u_{ h}^{{\ast}(n)}), }$$
(9.29)

and the interpolation operator I 2h (2). For more details we refer to [120]. Another approach, based on a homotopy method which allows adaptivity in space, in the homotopy step-size as well as in the stopping criteria for the iterative algebraic eigenvalue solvers has been derived in [41], see also [63, 102].

6 Concluding Remarks

This short survey gives a very brief introduction to the adaptive approximation of PDE eigenvalue problems, but it is far away from being complete in any sense. At this point, we excuse for any missing contributions about whose existence we were not aware in the time of preparation of this manuscript. Due to the lack in space, we mentioned only shortly some results on non-selfadjoint eigenvalue problems, and did not consider at all a very important class of nonlinear eigenvalue problems. As our study on adaptive FEM has no end, we will leave the reader with their own thoughts, questions and ideas to contemplate. There are still many doors to be open and we encourage researchers from many fields such as mathematical and numerical PDE analysis and discretization, functional analysis and matrix computations to write further chapters of this wonderful story.