1 Introduction

Adaptive finite element methods (AFEMs) are well-established and efficient tools for solving boundary value problems. A common form is the iteration of

$$\displaystyle \begin{aligned} \textsf{SOLVE}\quad \rightarrow\quad \textsf{ESTIMATE}\quad \rightarrow\quad \textsf{MARK}\quad \rightarrow\quad \textsf{REFINE}, \end{aligned} $$
(1)

i.e. solve for the current finite element solution, estimate its error by means of so-called a posteriori indicators, and use this information in a marking strategy to make refinement decisions. The resultant problem-dependent adaptation of the mesh often significantly improves efficiency, compared to classical uniform refinement. There are theoretical results corroborating this practical observation; see the surveys [3, 9] and [5], which even obtains a nonlinear quasi-optimality result with respect to an augmented energy norm error.

Almost all results about the convergence behavior of AFEMs invoke extra regularity for the data which does not fit to the proven convergence rate in the light of [2, 6]. To exemplify and explain this, let us consider the approximation of the weak solution of the Poisson problem

$$\displaystyle \begin{aligned} -\varDelta u = f \text{ in }\varOmega \subset \mathbb{R}^d, \qquad u=0\text{ on }\partial\varOmega \end{aligned} $$
(2)

by a realization of (1) with linear finite elements. Here one usually requires the regularity f ∈ L 2(Ω) beyond H −1(Ω), irrespective of the actual regularity of the target function u. One reason for requiring f ∈ L 2(Ω) lies in a posteriori error analysis. Aiming at computability, local H −1-norms are replaced with L 2-norms scaled by the local meshsize h T, leading to terms like the classical oscillation indicator \(h_T \min _{f_T\in \mathbb {R}}\|f-f_T\|{ }_{L^2(T)}\), which needs f ∈ L 2(Ω) to be defined. Note however that such a term is not yet really computable and may cause serious overestimation; cf. [7]. Another reason lies in many convergence analyses themselves. They often rely on the fact that the scaling factors of the aforementioned L 2-norms strictly reduce under refinement and, e.g., in the case of oscillation terms, this strict reduction hinges on extra regularity of f. It is worth mentioning that the second reason does not apply to the derivations of plain convergence in [8, 10].

In [7] we present a new approach to a posteriori analysis covering arbitrary f ∈ H −1(Ω) and prove in particular

$$\displaystyle \begin{aligned} \|\nabla (u- U_{\mathcal{T}})\|{}_{\varOmega} \lesssim \mathcal{E}_{\mathcal{T}}+\delta_{\mathcal{T}} \lesssim \|\nabla (u- U_{\mathcal{T}})\|{}_{\varOmega}, \end{aligned}$$

where \(U_{\mathcal {T}}\) is the linear finite element solution over the mesh \(\mathcal {T}\), the estimator \(\mathcal {E}_{\mathcal {T}}\) is computable in a similar sense as \(U_{\mathcal {T}}\), while the computation of the oscillation \(\delta _{\mathcal {T}}\) requires additional information on f.

Here we prove plain convergence of adaptive linear finite elements for (2) by combining the a posteriori analysis [7] and the convergence analyses [8, 10]. The result forgoes extra regularity and so convergence rates are precluded. Section 2 presents the adaptive finite method in detail, along with an account of [7], and Sect. 3 is concerned with the convergence proof, thereby characterizing convergent marking strategies. Section 4 then concludes with a brief discussion of a few convergent marking strategies.

2 Adaptive Algorithm with Abstract Marking Strategy

In this section we introduce adaptive finite element methods for approximating the weak solution \(u\in H_0^1(\varOmega )\) of (2) for some fixed but arbitrary load

$$\displaystyle \begin{aligned} f\in H^{-1}(\varOmega). \end{aligned} $$
(3)

Here \(H_0^1(\varOmega )\) is the subspace of Sobolev functions in H 1(Ω) with vanishing trace and H −1(Ω) denotes its topological dual space. Bisection and Conforming Linear Finite Elements

Let \(\mathcal {T}_0\) be a suitable simplicial, face-to-face (conforming) mesh of Ω and denote by \(\mathbb {T}\) the family of its refinements using iterative or recursive bisection; cf. the discussion in [9] and the references therein. In what follows, ‘\(\lesssim \)’ stands for ‘≤ C’, where the generic constant C may depend on the shape coefficient of \(\mathcal {T}_0\). For example, for any simplex arising in \(\mathbb {T}\), we have that its shape coefficient \(\lesssim 1\).

Let \(\mathcal {T}\in \mathbb {T}\) be a mesh. We denote the set of its vertices by \(\mathcal {V}_{\mathcal {T}}\) and the set of its faces by \(\mathcal {F}_{\mathcal {T}}\). The associated space of continuous piecewise affine functions is

$$\displaystyle \begin{aligned} \mathbb{V}(\mathcal{T}) := \left\{V\in C^0(\bar{\varOmega}) \mid V_{|T}\in\mathbb{P}_1 ~\text{for all}~T\in\mathcal{T} \right\} \subset H^1(\varOmega). \end{aligned} $$

Its nodal basis is given by

Note . The finite element functions are conforming and the associated Galerkin approximation \(U_{\mathcal {T}}\) of (2) is characterized by

(4)

where 〈⋅, ⋅〉 is the duality pairing between H −1(Ω) and \(H_0^1(\varOmega )\). In order to be able to compute \(U_{\mathcal {T}}\), we suppose that

(5)

A Posteriori Error Analysis with Error-Dominated Oscillation

We outline the approach to a posteriori error analysis in [7], which prepares the ground for formulating our convergence theorem. To this end, we fix the mesh \(\mathcal {T}\) and drop it as subscript in the following discussion. Further, given a subdomain ω ⊂ Ω, we choose \(\|\nabla \cdot \|{ }_\omega :=\| \nabla \cdot \|{ }_{L^2(\omega )}\) as norm on \(H_0^1(\varOmega )\) and denote by ∥⋅∥−1;ω its dual norm on H −1(ω).

We then have the global relationship ∥∇(u − U)∥Ω = ∥f + ΔU−1;Ω between error and residual as well as, locally,

$$\displaystyle \begin{aligned} \forall z \in \mathcal{V} \quad \|f+\varDelta U\|{}_{-1;\omega_z} \le \|\nabla(u-U)\|{}_{\omega_z}. \end{aligned} $$
(6)

Exploiting Galerkin orthogonality and the fact that \(\{\varPhi _z\}_{z\in \mathcal {V}}\) provides a partition of unity, one derives that, for any \(v\in H_0^1(\varOmega )\) and ,

$$\displaystyle \begin{aligned} \langle f+\varDelta U, v\rangle = \langle f+\varDelta U, v-V\rangle \lesssim \sum_{z\in\mathcal{V}} \|f+\varDelta U\|{}_{-1,\omega_z}\|\nabla v\|{}_{\omega_z}; \end{aligned} $$
(7)

see, e.g., [4, 7]. Applying the Cauchy-Schwarz inequality on the sum and recalling (6), one obtains the error localization

$$\displaystyle \begin{aligned} \|\nabla (u-U)\|{}_{\varOmega}^2 \eqsim \sum_{z\in\mathcal{V}} \|f+\varDelta U\|{}_{-1;\omega_z}^2. \end{aligned}$$

In general, upper bounds for the local residual norms \(\|f+\varDelta U \|{ }_{-1;\omega _z}\), as for the global one ∥f + Δu−1;Ω, cannot be computed by a finite number of operations from the information in (5). The reason lies in the possibly infinite-dimensional nature of f; see [7]. One therefore may isolate this difficulty by inserting a finite-dimensional approximation of f, e.g., some piecewise constant function \(\bar {f}\). Unfortunately, the arising terms \(\|f-\bar {f}\|{ }_{-1;\omega _z}\), or the larger classical oscillation terms \(\|h(f-\bar {f})\|{ }_{\omega _z}\), may overestimate the error by far on a given mesh, see [7], and even lead to worse convergence rates, see [4].

To overcome these drawbacks, we have developed in [7] a projection operator \(\mathcal {P}:H^{-1}(\varOmega ) \to \mathbb {D}(\mathcal {T})\) with the following properties. The evaluation \(\mathcal {P}f\) is computable under (5) and the range

$$\displaystyle \begin{aligned} \mathbb{D}(\mathcal{T}) := \big\{ & \ell \in H^{-1}(\varOmega) \mid \langle \ell,v\rangle = \sum_{T\in\mathcal{T}} \int_T c_T v\,\mathrm{d}x + \sum_{F\in\mathcal{F}} \int_F c_F v\,\mathrm{d}s \\ &\text{for all } v\in H^1_0(\varOmega) \text{ with } c_T,c_F\in\mathbb{R} \text{ for }T\in\mathcal{T}, F\in\mathcal{F} \big\} \end{aligned} $$

contains ΔU ∈ H −1(Ω). Moreover, \(\mathcal {P}\) is locally stable and invariant in that, for any  ∈ H −1(Ω), \(z\in \mathcal {V}\), and \(D\in \mathbb {D}(\mathcal {T})\), we have

$$\displaystyle \begin{aligned} \|{\mathcal{P} \ell}\|{}_{-1;\omega_z} \lesssim \|\ell\|{}_{-1;\omega_z}, \end{aligned} $$
(8a)

as well as

$$\displaystyle \begin{aligned} \ell = \mathcal{P}\ell \text{ in } H^{-1}(\omega_z) \quad \text{whenever}\quad \ell = D \text{ in } H^{-1}(\omega_z). \end{aligned} $$
(8b)

Consequently, we have \(\mathcal {P}(\varDelta U) = \varDelta U\) and the local splittings

$$\displaystyle \begin{aligned} \|f+\varDelta U\|{}_{-1;\omega_z} \eqsim \| \mathcal{P}f + \varDelta U \|{}_{-1;\omega_z} + \| f-\mathcal{P}f \|{}_{-1;\omega_z}, \end{aligned} $$
(9)

where \(\| \mathcal {P}f + \varDelta U \|{ }_{-1;\omega _z}\) is quantifiable under (5), but the computation or bounding of \(\| f-\mathcal {P}f \|{ }_{-1;\omega _z}\) requires information beyond. To sum up, we formulate the following theorem .

Theorem 1 (A Posteriori Bounds)

For any mesh \(\mathcal {T}\in \mathbb {T}\) , we have

as well as, for any \(z\in \mathcal {V}_{\mathcal {T}}\) ,

Adaptive Algorithm

The bounds in Theorem 1 are vertex-wise, while bisection is applied element-wise. Therefore, we assume that there are element indicators \(\mathcal {E}_{\mathcal {T}}(T)\), \(T\in \mathcal {T}\), such that

(10a)

and

(10b)

Such local equivalences hold for various indicators resembling the ones from the standard residual estimator, the hierarchical estimator, or estimators based upon discrete local problems; cf. [7].

In order to achieve a similar grouping for the oscillation, we introduce, for any \(T\in \mathcal {T}\), its minimal ring \(\omega (T) := \bigcap \{ \omega _{\mathcal {T}'}(T) : \mathcal {T}' \in \mathbb {T}, \mathcal {T}' \ni T \}\), where \(\omega _{\mathcal {T}}(T) := \cup \{T'\in \mathcal {T} : T'\cap T\neq \emptyset \}\). Then

$$\displaystyle \begin{aligned} \delta(T) := \| f-\mathcal{P}_{\mathcal{T}}f \|{}_{-1;\omega(T)} \end{aligned} $$
(11)

does not depend on the surrounding mesh, and we can derive the inequalities

(12)

Algorithm (AFEM)

Starting from \(\mathcal {T}_0\) and k = 0, compute \((\mathbb {V}_k)_k\), (U k)k, \((\mathcal {T}_k)_k\) iteratively as follows, where step 3 will be specified further below.

  1. 1.

    Set and let \(U_k \in \mathbb {V}_k\) be the solution of (4) with \(\mathcal {T} = \mathcal {T}_k\).

  2. 2.

    Compute the indicators of estimator and oscillation in (10) and (11), respectively, and write \(\mathcal {E}_k\) as an abbreviation for \(\mathcal {E}_{\mathcal {T}_k}\).

  3. 3.

    Choose a subset \(\mathcal {M}_k \subset \mathcal {T}_k\) with the help of the values of the indicators \(\{ \mathcal {E}_k(T) \}_{T\in \mathcal {T}_k}\) and \( \{ \delta (T) \}_{T\in \mathcal {T}_k}\).

  4. 4.

    Let \(\mathcal {T}_{k+1}\) be the smallest refinement of \(\mathcal {T}_k\) in \(\mathbb {T}\) such that \(\mathcal {M}_k \cap \mathcal {T}_{k+1} = \emptyset \).

  5. 5.

    Increment k and go to step 1.

3 Convergence and Marking Strategy

The above adaptive method generalizes global uniform refinement, which is the special case corresponding to \(\mathcal {M}_k = \mathcal {T}_k\) for all k. The new key feature is that certain elements may not be refined anymore. In other words: it may happen that

$$\displaystyle \begin{aligned} \mathcal{T}^* := \{ T \mid \exists m \in \mathbb{N}_0 \, \forall k \geq m \; T \in \mathcal{T}_k \} \neq \emptyset \quad \text{and so}\quad \mathbb{V}_\infty := \overline{\bigcup_{k} \mathbb{V}_k} \neq H^1_0(\varOmega). \end{aligned}$$

It is then the task of step 3, the marking strategy, to preclude \(u \in H^1_0(\varOmega ) \setminus \mathbb {V}_\infty \), which is equivalent to non-convergence.

Let us first derive a necessary condition for the marking strategy. If limk∥∇(u − U k)∥Ω = 0, then, for any element \(T\in \mathcal {T}^*\), the local lower bounds in (10), (11), and Theorem 1 imply \(\lim _{k\to \infty } \mathcal {E}_k(T) + \delta (T) = 0\) for the associated indicators. Hence it is necessary for convergence that the marking strategy ensures

$$\displaystyle \begin{aligned} \forall T\in\mathcal{T}^* \quad \delta(T) = 0 \text{ and } \lim_{k\to\infty} \mathcal{E}_k(T) = 0, \end{aligned} $$
(13)

which specifies [8, (5.1)]. It turns out that this condition is also sufficient for convergence, as it complements the following property of our refinement procedure. Let h k denote the meshsize function of \(\mathcal {T}_k\) given by h k |T = |T|1∕d for all \(T\in \mathcal {T}_k\) and let χ k the characteristic function of \(\bigcup \{ T\in \mathcal {T}_k : T \not \in \mathcal {T}_{k+1} \}\). Then [9, Lemma 9] shows

$$\displaystyle \begin{aligned} \lim_{k\to\infty} \| h_k \chi_k \|{}_{L^\infty(\varOmega)} = 0, \end{aligned} $$
(14)

which generalizes maxΩ h k → 0 for non-adaptive, uniform refinement.

Theorem 2 (Adaptive Convergence)

Let f  H −1(Ω) be arbitrary and assume that the indicators of the estimator satisfy (10). Then the approximate solutions (U k)k of the AFEM converge to the exact solution u if and only if the marking strategy ensures (13).

Proof

We adopt the proof of [9, Theorem 8], which essentially follows [10]. Using the Lax-Milgram theorem, let \(U_\infty \in \mathbb {V}_\infty \) be such that

$$\displaystyle \begin{aligned} \int_{\varOmega} \nabla U_\infty \cdot \nabla V\,\mathrm{d}x = \langle f, V \rangle \quad \text{for all } V \in \mathbb{V}_\infty. \end{aligned}$$

Thanks to [9, Lemma 7], we have

$$\displaystyle \begin{aligned} \lim_{k\to\infty} \| \nabla(U_\infty-U_k) \|{}_\varOmega = 0 \end{aligned} $$
(15)

and it remains to show that U  = u or, as an alternative, that its residual vanishes:

$$\displaystyle \begin{aligned} 0 = \langle R_\infty, v \rangle := \langle f, v \rangle - \int_\varOmega \nabla U_\infty\cdot \nabla v\,\mathrm{d}x \quad \text{for all } v\in H^1_0(\varOmega). \end{aligned} $$
(16)

Here we can take the test functions only from \(C^\infty _0(\varOmega )\), because \(C^\infty _0(\varOmega )\) is a dense subset of \(H^1_0(\varOmega )\). Consequently, the convergence (15) shows that (16) follows from

$$\displaystyle \begin{aligned} \forall \varphi\in C^\infty_0(\varOmega) \qquad \lim_{k\to\infty} \langle R_k, \varphi \rangle=0, \end{aligned} $$
(17)

where R k:= f + ΔU k ∈ H −1(Ω) is the residual of U k.

In order to verify this, let \(\varphi \in C^\infty _0(\varOmega )\) and \(k,\ell \in \mathbb {N}_0\) with k ≥  and introduce the set \(\mathcal {T}^*_\ell := \mathcal {T}_\ell \cap \mathcal {T}^*\). The inclusion \(\mathbb {V}_\ell \subset \mathbb {V}_k\), the abstract error bound (7), the local equivalences (9), as well as the upper bounds in (10) and (11) imply

$$\displaystyle \begin{aligned} \langle R_k, \varphi \rangle = \langle R_k, \varphi - I_\ell\varphi \rangle \lesssim S_{\ell,k} + S^*_{\ell,k}, \end{aligned} $$
(18)

where I denotes Lagrange interpolation onto \(\mathbb {V}_\ell \) and we expect that

$$\displaystyle \begin{aligned} S_{\ell,k} := \sum_{T\in\mathcal{T}_k\setminus\mathcal{T}_\ell^*} \big( \mathcal{E}_k(T) + \delta(T) \big) \|\nabla(\varphi-I_\ell\varphi)\|{}_{\omega_k(T)} \end{aligned}$$

gets small because of decreasing meshsize, whereas

$$\displaystyle \begin{aligned} S^*_{\ell,k} := \sum_{T\in\mathcal{T}_\ell^*} \big( \mathcal{E}_k(T) + \delta(T) \big) \|\nabla(\varphi-I_\ell\varphi)\|{}_{\omega_k(T)} \end{aligned}$$

gets small thanks to condition (13) on the marking strategy. Here \(\omega _k(T)=\omega _{\mathcal {T}_k}(T)\) is the ring around T in \(\mathcal {T}_k\).

We first deal with S ,k. The lower bounds in (10), (11), and Theorem 1 entail that

$$\displaystyle \begin{aligned} \sum_{T\in\mathcal{T}_k\setminus\mathcal{T}_\ell^*} \big( \mathcal{E}_k(T) + \delta(T) \big)^2 \lesssim \| \nabla(U_k-U_\infty) \|{}_\varOmega^2 + \| \nabla(U_\infty-u) \|{}_\varOmega^2 \lesssim 1 \end{aligned} $$
(19)

is uniformly bounded thanks to (15). Furthermore, standard error bounds for Lagrange interpolation on \(\mathcal {T}_\ell \) yield

$$\displaystyle \begin{aligned} \begin{aligned} \sum_{T\in\mathcal{T}_k\setminus\mathcal{T}_\ell^*} \|\nabla(\varphi-I_\ell\varphi)\|{}_{\omega_k(T)}^2 &\lesssim \sum_{T\in\mathcal{T}_\ell\setminus\mathcal{T}_\ell^*} \|\nabla(\varphi-I_\ell\varphi)\|{}_{\omega_\ell(T)}^2 \\ &\lesssim \| h_\ell \chi_\ell \|{}_{L^\infty(\varOmega)}^2 \| D^2\varphi \|{}_{L^\infty(\varOmega)}^2 |\varOmega|. \end{aligned} \end{aligned}$$

Hence the Cauchy-Schwarz inequality on the sum S ,k and (14) imply

$$\displaystyle \begin{aligned} S_{\ell,k} \to 0 \text{ as }\ell\to\infty \text{ uniformly in }k. \end{aligned} $$
(20)

For \(S^*_{\ell ,k}\), condition (13) leads to

$$\displaystyle \begin{aligned} \sum_{T\in\mathcal{T}_\ell^*} \big( \mathcal{E}_k(T) + \delta(T) \big)^2 = \sum_{T\in\mathcal{T}_\ell^*} \mathcal{E}_k(T)^2 \to 0 \quad \text{as}\quad k \to \infty \end{aligned}$$

and, since \(\sum _{T\in \mathcal {T}_\ell ^*} \|\nabla (\varphi -I_\ell \varphi )\|{ }_{L^2(\omega _k(T))}^2\) is uniformly bounded, the Cauchy-Schwarz inequality provides

$$\displaystyle \begin{aligned} S^*_{\ell,k} \to 0 \text{ as }k\to\infty \text{ for any fixed }\ell. \end{aligned} $$
(21)

To conclude, let 𝜖 > 0 be arbitrary. We exploit (20) and (21) by first choosing so that S ,k ≤ 𝜖∕2 and next k ≥  so that \(S_{\ell ,k}^*\le \epsilon /2\). Inserting this into (18) yields the desired convergence (17) and finishes the proof. □

4 Convergent Marking Strategies

In this concluding section, we discuss how to ensure condition (13) on the marking strategy, which is necessary and sufficient for the convergence of the AFEM.

To this end, the following observation about a vanishing limit is helpful. If \((T_k)_{k\in \mathbb {N}_0}\) is a sequence of elements satisfying \(T_k\in \mathcal {T}_k\setminus \mathcal {T}_{k+1}\), \(k\in \mathbb {N}_0\), then

$$\displaystyle \begin{aligned} \begin{aligned} \mathcal{E}_k(T_k)+\delta(T_k) &\lesssim \|\nabla (u-U_k)\|{}_{\omega_k(T_k)} \\ &\le \|\nabla (U_\infty-U_k)\|{}_{\omega_k(T_k)} + \|\nabla(u-U_\infty)\|{}_{\omega_k(T_k)} \to 0 \end{aligned} \end{aligned} $$
(22)

as k →. In fact, the first term on the right-hand side vanishes thanks to (15) and the second term vanishes in view of \(\omega _k(T_k) \lesssim |T_k| \lesssim \| h_k \chi _k \|{ }_{L^\infty (\varOmega )}^d\) and property (14).

Condition (13) is thus satisfied if we have

$$\displaystyle \begin{aligned} \max_{T'\in\mathcal{T}_k} \big( \delta(T')+\mathcal{E}_k(T') \big) \le \max_{T'\in\mathcal{T}_k\setminus\mathcal{T}_{k+1}} \big( \delta(T')+\mathcal{E}_k(T') \big), \end{aligned} $$

which, thanks to \(\mathcal {M}_k\subset \mathcal {T}_k\setminus \mathcal {T}_{k+1}\), can be achieved by requiring that at least one element with maximal combined indicators is marked:

$$\displaystyle \begin{aligned} \mathcal{M}_k \cap \left\{ T\in \mathcal{T}_k \mid \delta(T)+\mathcal{E}_k(T) \ge \max_{T'\in\mathcal{T}_k} \big( \delta(T')+\mathcal{E}_k(T') \big)\right\} \neq \emptyset. \end{aligned} $$
(23)

This property is verified by most of the common marking strategies applied to the combined indicators \(\delta (T)+\mathcal {E}_k(T)\), \(T\in \mathcal {T}_k\). Important examples are maximum-, Dörfler-, and equal distribution strategy; cf. [8].

As an alternative to combined marking, one may mark the two indicators separately. Similarly as before, requiring (23) for the single indicator \(\mathcal {E}_k(T)\) instead of the combined indicator \(\mathcal {E}_k(T)+\delta (T)\) then implies

$$\displaystyle \begin{aligned} \forall T\in\mathcal{T}^* \quad \lim_{k\to\infty} \mathcal{E}_k(T) = 0. \end{aligned}$$

To ensure also δ(T) = 0 for all \(T\in \mathcal {T}^*\), one may employ again the respective counterpart of (23) or a different approach, which, as fast tree approximation [1], capitalizes on the locality and history of the oscillation indicator. The following simple consequence of Theorem 2, which is also of interest by its own, is useful in this context.

Remark 3 (Modified Oscillation)

For any simplex T in \(\mathbb {T}\), let \(\tilde {\delta }(T)\) be a modification or approximation of δ(T) such that \(\tilde {\delta }(T)= 0 \implies \delta (T)=0\). Then the AFEM , where each δ(T) is replaced by \(\tilde {\delta }(T)\), converges if and only if \(\lim _{k\to \infty } \mathcal {E}_k(T) = 0\) and \(\tilde {\delta }(T) = 0\) for all \(T \in \mathcal {T}^*\).