1 Introduction

In this paper we analyze a discontinuous Galerkin method applied to the following model Helmholtz problem:

$$\begin{aligned} -\varDelta u-k^{2}u&= f\quad \text{ in } \varOmega ,\end{aligned}$$
(1.1)
$$\begin{aligned} \frac{\partial u}{\partial \mathbf{n}}+\mathrm{i}ku&= g\quad \text{ on } \partial \varOmega . \end{aligned}$$
(1.2)

Here, \(\varOmega \) is a bounded Lipschitz domain in \(\mathbb R ^{d}\), \(d\in \{2,3\}\), and \(k\ge k_0 > 0\) is the real and positive wavenumber bounded away from zero. The outer normal vector to \(\partial {\varOmega }\) is denoted \(\mathbf{n}\), and we write \(\mathrm{i}=\sqrt{-1}\) for the imaginary unit. We assume \(f\in L^{2}(\varOmega )\) and \(g\in L^{2}(\partial {\varOmega })\). By \(H^{s}(\varOmega )\) we denote the usual Sobolev space with norm \(\Vert \cdot \Vert _{H^{s}(\varOmega )}\), [1]. The seminorm which contains only the derivatives of order \(s\) is denoted by \(\vert \cdot \vert _{H^{s}(\varOmega )}\).

The weak formulation for (1.1) is given by: Find \(u\in V:=H^{1} (\varOmega )\) such that

$$\begin{aligned} a\left( u,v\right) =F(v)\quad \forall v\in H^{1}(\varOmega ), \end{aligned}$$
(1.3)

where

$$\begin{aligned} a\left( u,v\right)&:= \int \limits _{\varOmega }\left( \nabla u\nabla \bar{v} -k^{2}u\bar{v}\right) +\mathrm{i}k\int \limits _{\partial {\varOmega }}u\bar{v},\end{aligned}$$
(1.4)
$$\begin{aligned} F(v)&:= \int \limits _{\varOmega }f\bar{v}+\int \limits _{\partial {\varOmega }}g\bar{v}. \end{aligned}$$
(1.5)
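
For orientation, the forms (1.4) and (1.5) can be recovered from (1.1) and (1.2) by a formal integration by parts. The following sketch assumes that \(u\) is smooth enough for Green's formula to hold; this assumption is ours and is made only for illustration. Testing (1.1) with \(\bar{v}\) and inserting \(\partial u/\partial \mathbf{n}=g-\mathrm{i}ku\) from (1.2) gives

$$\begin{aligned} \int \limits _{\varOmega }f\bar{v}&= \int \limits _{\varOmega }\left( \nabla u\cdot \nabla \bar{v}-k^{2}u\bar{v}\right) -\int \limits _{\partial {\varOmega }}\frac{\partial u}{\partial \mathbf{n}}\bar{v}\\&= \int \limits _{\varOmega }\left( \nabla u\cdot \nabla \bar{v}-k^{2}u\bar{v}\right) +\mathrm{i}k\int \limits _{\partial {\varOmega }}u\bar{v}-\int \limits _{\partial {\varOmega }}g\bar{v}. \end{aligned}$$

Moving the boundary term containing \(g\) to the left-hand side yields \(a(u,v)=F(v)\).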

Existence and uniqueness for the continuous problem were proved in [34] for bounded Lipschitz domains.

Problems in high-frequency scattering of acoustic or electro-magnetic waves are highly indefinite, and the design of discretization methods that behave robustly with respect to the amount of indefiniteness is of great importance. For our model problem, the highly indefinite case arises for high wavenumbers \(k\), and the solution \(u\) is highly oscillatory. It is well known for such problems that low order finite elements suffer from the pollution effect, which mandates very fine meshes [30]. For example, the classical analysis for lowest order \(\mathbb P _{1}\)-finite element spaces (see, e.g., [41], [30, Sec. 4]) guarantees unique solvability and quasi-optimality only under the condition that the number of degrees of freedom \(N\) satisfies \(N\gtrsim k^{2d}\), where \(d\) is the spatial dimension. We hasten to add that the conditions on the mesh size are less stringent for higher order FEM. A particular example is the analysis of [36, 37], which shows for high order methods that linking the polynomial degree \(p\) logarithmically to the wavenumber can lead to a stable method with few degrees of freedom per wavelength. We mention that on regular meshes the pollution error can also be understood by a dispersion analysis that quantifies the phase difference between the exact solution and the numerical solution [2–5, 13, 16, 30–32].

While the existence of discrete solutions for classical, conforming finite element discretizations is understood, it is worth stressing that a minimal resolution condition is required to ensure their existence. This observation motivates the quest for stabilized variational formulations that always guarantee the discrete stability of the method (existence and uniqueness of the discrete solution). Prominent examples of these types of methods are those incorporating least squares ideas [17, 26, 27, 38] and Discontinuous Galerkin (DG) methods. Several variants of DG methods based on standard piecewise polynomial spaces are analyzed, for example, in [19–21, 44, 45]. They feature unique solvability of the discrete systems without any resolution conditions; yet, it is worth pointing out that reduced or no convergence takes place in the preasymptotic regime.

The Ultra Weak Variational Formulation (UWVF) of Cessenat and Després [8, 9, 14] can be understood as a DG method that permits using non-standard, discontinuous local discretization spaces such as plane waves (see [7, 23, 28, 29]). In the present paper we follow the idea of [23], where a DG method was derived from the UWVF for the Helmholtz problem. For plane waves as local ansatz spaces in this DG method, [23] shows linear convergence of the method under appropriate resolution conditions. By specializing to homogeneous Helmholtz problems, [28] establishes quasi-optimal convergence (in a norm dictated by the method) without any resolution condition.

The goal of our work is to develop a theory for the same DG formulation as in [23] that allows us to infer the convergence behavior of abstract conforming and non-conforming generalized finite element spaces from certain local approximation properties and local inverse estimates, which may be easy to check, possibly even at run-time.

This paper is structured as follows: In Sect. 2, we recall from [23] a DG method for the Helmholtz problem (1.1). Section 3 is devoted to discrete stability and convergence. The unified theory presented there covers two popular choices of approximation spaces, namely, spaces consisting of piecewise plane waves and conforming as well as non-conforming polynomial \(hp\)-finite element spaces on affine simplicial meshes. Nevertheless, we also derive an abstract approximation criterion for general finite element spaces that implies existence and uniqueness of the discrete solution. Based on these results, we obtain quasi-optimal convergence in the DG-norm for general finite element spaces [40].

In Sect. 4 we apply the results of Sect. 3 to the \(hp\)-version of the polynomial FEM. We obtain a convergence theory that is explicit in the wavenumber \(k\) as well as the mesh width \(h\) and the polynomial degree \(p\). These results may be viewed as an extension of the results [36, 37] for classical \(H^{1}\)-conforming discretizations to the DG-setting. In these papers, a scale resolution condition of the form

$$\begin{aligned} \frac{kh}{p}\le c_{1}\quad \text{ and } \quad p\ge c_{2}\log k \end{aligned}$$
(1.6)

(for suitable \(c_{1}\), \(c_{2}\)) is sufficient to guarantee quasi-optimality. For the \(hp\)-version of the DG-FEM on regular meshes, or, more generally, meshes that permit sufficiently rich \(H^{1}\)-conforming subspaces of the non-conforming DG-space, the same condition yields quasi-optimality. In the general case, the slightly stronger condition (4.16) is a sufficient condition for quasi-optimality [40]. In particular, we show, for the first time for a DG method on regular meshes, that quasi-optimality can be obtained for a fixed number of degrees of freedom per wavelength. Two appendices conclude the article. Appendix 1 gives details for the regularity result Theorem 4.5. Appendix 2 is concerned with elementwise defined \(hp\)-approximations that are optimal in the broken \(H^2\)-norm; this result is required for the proof of Theorem 4.11.
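
To make the scale resolution condition (1.6) concrete, the following minimal Python sketch picks a polynomial degree and a mesh width that satisfy it and then estimates the resulting number of degrees of freedom per wavelength. The function names and the default values of `c1` and `c2` are placeholders of ours, not constants from the analysis, and the count of roughly \(p/h\) unknowns per unit length in each coordinate direction is a heuristic.

```python
import math

def resolve(k, c1=1.0, c2=1.0):
    """Return (p, h_max) with k*h_max/p <= c1 and p >= c2*log(k).

    c1 and c2 are illustrative placeholders for the constants in (1.6).
    """
    p = max(1, math.ceil(c2 * math.log(k)))
    h_max = c1 * p / k
    return p, h_max

def dofs_per_wavelength(k, h, p):
    """Heuristic count per coordinate direction: wavelength 2*pi/k times p/h."""
    return (2.0 * math.pi / k) * p / h
```

With \(h\) chosen as large as (1.6) permits, the estimate evaluates to \(2\pi /c_{1}\) for every \(k\), which is the sense in which the number of degrees of freedom per wavelength stays fixed.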

2 Discontinuous Galerkin Method

2.1 Meshes and Spaces

To formulate the DG method we first introduce some notation. Let \(\varOmega \subset \mathbb R ^{d}\), \(d\in \{2,3\}\), denote a polygonal (\(d=2\)) or polyhedral (\(d=3\)) Lipschitz domain. The DG problem is based on a partition \(\fancyscript{T}\) of \(\varOmega \) into non-overlapping curvilinear polygonal/polyhedral subdomains (“finite elements”) \(K\); hanging nodes are allowed. The local and global mesh widths are denoted by

$$\begin{aligned} h_{K}:=\mathrm{diam}K\quad \text{ and }\quad h:= \max _{K\in \fancyscript{T}}h_{K}. \end{aligned}$$
(2.1)

In the case \(d=3\), the boundary of \(K\) can be split into faces and for \(d=2\) into edges. For ease of notation we use the terminology “faces” in both cases. For \(K\in \fancyscript{T}\), we denote the set of faces by \(\fancyscript{E}(K)\). The subset of interior faces, i.e., the set of faces of \(K\) which are not lying on \(\partial \varOmega \), is denoted by \(\fancyscript{E}^{\fancyscript{I}}(K)\). For instance the number \(\sharp \fancyscript{E}(K)=d+1\) if \(K\) is a simplex. As a convention we consider the finite elements \(K\in \fancyscript{T}\) always as open sets and the faces \(e\in \fancyscript{E}(K)\) as relatively open sets.

The interior skeleton \(\mathfrak S _{\fancyscript{T}}^{\fancyscript{I}}\) and the boundary skeleton \(\mathfrak S _{\fancyscript{T}}^{\fancyscript{B}}\) are given by

$$\begin{aligned} \mathfrak S _{\fancyscript{T}}^{\fancyscript{I}}:= {\displaystyle \bigcup \limits _{K\in \fancyscript{T}}} {\displaystyle \bigcup \limits _{e\in \fancyscript{E}^{\fancyscript{I}}\left( K\right) }} e,\quad \mathfrak S _{\fancyscript{T}}^{\fancyscript{B}}:= {\displaystyle \bigcup \limits _{K\in \fancyscript{T}}} {\displaystyle \bigcup \limits _{\begin{array}{c} e\in \fancyscript{E}\left( K\right) \\ e\subset \partial \varOmega \end{array}}}e. \end{aligned}$$

Note that \(\mathfrak S _{\fancyscript{T}}^{\fancyscript{I}}\), \(\mathfrak S _{\fancyscript{T}}^{\fancyscript{B}}\) are unions of the relative interiors of faces and, consequently, for any point \(x\in \mathfrak S _{\fancyscript{T} }^{\fancyscript{I}}\), there exist exactly two elements in \(\fancyscript{T}\) (denoted by \(K_{x}^{+}\), \(K_{x}^{-}\)) with \(x\in \overline{K_{x}^{+}}\cap \overline{K_{x}^{-}}\).

Also define \(\nabla _{\fancyscript{T}}\) and \(\varDelta _{\fancyscript{T}}\) as elementwise applications of the operators \(\nabla \) and \(\varDelta \), respectively. The one-sided restrictions of some \(\fancyscript{T}\)-piecewise smooth function \(v\) for \(x\in \mathfrak S _{\fancyscript{T}}^{\fancyscript{I}}\) are denoted by

$$\begin{aligned} v^{+}\left( x\right) :=\lim _{\begin{array}{c} y\in K_{x}^{+}\\ y\rightarrow x \end{array}}v\left( y\right) \quad \text{ and }\quad v^{-}\left( x\right) :=\lim _{\begin{array}{c} y\in K_{x}^{-}\\ y\rightarrow x \end{array}}v\left( y\right) . \end{aligned}$$

We use the same notation for vector-valued functions.

We define the averages and jumps for \({\fancyscript{T}}\)-piecewise smooth scalar-valued functions \(v\) and vector-valued functions \(\varvec{\sigma }_{S}\) on \(\mathfrak S _{\fancyscript{T}}^{\fancyscript{I}}\) by

$$\begin{aligned} \text{ the } \text{ averages: } \left\{ v\right\}&:= \dfrac{1}{2}\left( v^{+} +v^{-}\right) , \,\,\left\{ \varvec{\sigma }_{S}\right\} :=\dfrac{1}{2}\left( \varvec{\sigma }_{S}^{+}+ \varvec{\sigma }_{S}^{-}\right) ,\\ \text{ the } \text{ jumps: } \,[\![v]\!]_{N}&:= v^{+}\mathbf{n}^{+} +v^{-}\mathbf{n}^{-}, [\![\varvec{\sigma }_{S}]\!]_{N} :=\varvec{\sigma }_{S}^{+}\cdot \mathbf{n}^{+}+\varvec{\sigma }_{S}^{-} \cdot \mathbf{n}^{-}. \end{aligned}$$

where \(\mathbf{n}^{+}(x)\), \(\mathbf{n}^{-}(x)\) denote the (outer) normal vectors of elements \(K_{x}^{+}\), \(K_{x}^{-}\).
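
The definitions above can be evaluated pointwise. The following Python sketch does so at a single point of an interior face, using the fact that the two outer normals there satisfy \(\mathbf{n}^{-}=-\mathbf{n}^{+}\). The function names and the representation of vectors as tuples are our own illustrative choices.

```python
def avg_scalar(v_plus, v_minus):
    """{v} = (v+ + v-)/2."""
    return 0.5 * (v_plus + v_minus)

def avg_vector(s_plus, s_minus):
    """{sigma} = (sigma+ + sigma-)/2, componentwise."""
    return tuple(0.5 * (a + b) for a, b in zip(s_plus, s_minus))

def jump_scalar(v_plus, v_minus, n_plus):
    """[[v]]_N = v+ n+ + v- n- = (v+ - v-) n+; the result is a vector."""
    return tuple((v_plus - v_minus) * c for c in n_plus)

def jump_vector(s_plus, s_minus, n_plus):
    """[[sigma]]_N = sigma+ . n+ + sigma- . n- = (sigma+ - sigma-) . n+; a scalar."""
    return sum((a - b) * c for a, b, c in zip(s_plus, s_minus, n_plus))
```

Both jumps are unchanged when the roles of \(K_{x}^{+}\) and \(K_{x}^{-}\) are interchanged, so the labelling of the two neighbouring elements does not matter.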

Based on the partition \({\fancyscript{T}}\) we can introduce broken Sobolev spaces in the standard way: For \(s\ge 0\), we set

$$\begin{aligned} H_{\mathrm{pw}}^{s}\left( \varOmega \right) :=L^{2}\left( \varOmega \right) \cap {\displaystyle \prod \limits _{K\in \fancyscript{T}}} H^{s}\left( K\right) . \end{aligned}$$
(2.2)

2.2 Discrete Formulation

We approximate the solution of (1.3) from an abstract finite-dimensional space \(S \subset H^{2} _{\mathrm{pw}}(\varOmega )\), i.e., only the following two conditions are imposed:

$$\begin{aligned} S\subset L^{2}\left( \varOmega \right) \quad \text{ and }\quad S\subset \prod \limits _{K\in \fancyscript{T}}H^{2}\left( K\right) . \end{aligned}$$
(2.3)

We briefly recall the derivation of the DG formulation from the UWVF as in [23]. We denote by \((\cdot ,\cdot )\) the \(L^{2}\) inner product on \(\varOmega \), i.e., \((u,v)=\int _{\varOmega }u\overline{v}dV\). Let \(S\) be a discrete space as in (2.3). Let \(\alpha \in L^{\infty }(\overline{\mathfrak{S }_{\fancyscript{T}}^{\fancyscript{I}}})\), \(\beta \in L^{\infty }(\overline{ \mathfrak{S }_{\fancyscript{T}}^{\fancyscript{I}}})\), and \(\delta \in L^{\infty }(\overline{\mathfrak{S }_{\fancyscript{T}}^{\fancyscript{B}}})\) be some positive and bounded functions on the mesh skeletons. (It will turn out that these functions can be chosen to be piecewise constant on a certain partition of the skeleton as elaborated in Remark 2.2.) Then, the DG formulation can be written in the following form, [23, 28]:

Find \(u_{S}\in S\) such that, for all \(v\in S\),

$$\begin{aligned} a_{\fancyscript{T}}(u_{S},v)-k^{2}(u_{S},v)= (f,v)-\int \limits _\mathfrak{S _{\fancyscript{T} }^{\fancyscript{B}}}\delta \frac{1}{\mathrm{i}k}g\overline{\nabla _{\fancyscript{T}}v\cdot \mathbf{n}}dS+\int \limits _\mathfrak{S _{\fancyscript{T} }^{\fancyscript{B}}}(1-\delta )g\overline{v}dS=:F_{\fancyscript{T}}(v),\nonumber \\ \end{aligned}$$
(2.4)

where \(a_{\fancyscript{T}}(\cdot ,\cdot )\) is the DG-bilinear form on \(S\times S\) defined by

$$\begin{aligned} a_{\fancyscript{T}}(u,v)&:= (\nabla _{\fancyscript{T}}u, \nabla _{\fancyscript{T}} v)- \int \limits _\mathfrak{S _{\fancyscript{T}}^{ \fancyscript{I}}} [\![u]\!]_{N} \cdot \{\overline{\nabla _{\fancyscript{T}}v}\}dS- \int \limits _\mathfrak{S _{\fancyscript{T} }^{\fancyscript{I}}} \{\nabla _{\fancyscript{T}}u\} \cdot [\![\overline{v}]\!]_{N}dS\nonumber \\&\quad -\int \limits _\mathfrak{S _{\fancyscript{T}}^{\fancyscript{B}}}\delta u\overline{\nabla _{\fancyscript{T}}v\cdot \mathbf{n}}dS- \int \limits _\mathfrak{S _{\fancyscript{T} }^{\fancyscript{B}}} \delta \nabla _{\fancyscript{T}}u\cdot \mathbf{n}\overline{v}dS\nonumber \nonumber \\&\quad -\frac{1}{\mathrm{i}k} \int \limits _\mathfrak{S _{\fancyscript{T} }^{\fancyscript{I} }}\beta [\![\nabla _{\fancyscript{T}} u]\!]_{N}[\![\overline{\nabla _{\fancyscript{T}} v}]\!]_{N}dS-\frac{1}{\mathrm{i}k} \int \limits _\mathfrak{S _{\fancyscript{T}}^{\fancyscript{B}}} \delta \nabla _{\fancyscript{T}}u\cdot \mathbf{n}\overline{\nabla _{ \fancyscript{T}}v\cdot \mathbf{n}}dS\nonumber \\&\quad +\,\mathrm{i}k\int \limits _\mathfrak{S _{\fancyscript{T}}^{\fancyscript{I}}} \alpha [\![u]\!]_{N}[\![\overline{v}]\!]_{N} dS+\mathrm{i}k\int \limits _\mathfrak{S _{\fancyscript{T}}^{\fancyscript{B}}} (1-\delta )u\overline{v}dS. \end{aligned}$$
(2.5)

Note that \(a_{\fancyscript{T}}(\cdot ,\cdot )\) can be extended to a sesquilinear form on \(H_{\mathrm{pw}}^{3/2+{\varepsilon }}( \varOmega )\times H_{\mathrm{pw}}^{3/2+{\varepsilon }}( \varOmega )\) for any \({\varepsilon }>0\). So far, the functions \(\alpha \), \(\beta \), \(\delta \) are arbitrary, positive \(L^{\infty }\) functions. Our analysis will rely on certain properties of \(\alpha \) that depend on some trace inverse estimates for the space \(S\). We therefore introduce:

Definition 2.1

(inverse trace inequality) For each element \(K\), the constant \(C_{\mathrm{trace}}(S,K)\) is the smallest constant such that

$$\begin{aligned} \Vert \nabla \left( \left. v\right| _{K}\right) \Vert _{L^{2}\left( \partial K\right) }\le C_{\mathrm{trace}}(S,K)\Vert \nabla v\Vert _{L^{2}\left( K\right) }\quad \forall v\in S. \end{aligned}$$
(2.6)

Remark 2.2

The analysis of the continuity and coercivity will lead to the condition

$$\begin{aligned} \alpha \left( x\right) \ge \frac{4}{3k}\max _{K\in \left\{ K_{x}^{+},K_{x} ^{-}\right\} }C_{\mathrm{trace}}^{2}\left( S,K\right) \quad \forall x\in \mathfrak S _{\fancyscript{T}}^{\fancyscript{I}}. \end{aligned}$$
(2.7)

For the special case that \(S\) is a conforming/non-conforming polynomial \(hp\)-finite element space, the estimate of the approximation property of \(S\) with respect to the \(\Vert \cdot \Vert _{DG}\) and \(\Vert \cdot \Vert _{DG^{+}}\) norms (cf. Sect. 4.2 ahead) leads to the choices

$$\begin{aligned} \alpha \left( x\right) =\mathfrak a \max _{K\in \left\{ K_{x}^{+},K_{x} ^{-}\right\} }\frac{p^{2}}{kh_{K}},\quad \beta =\mathfrak b \frac{kh}{p} ,\quad \delta =\mathfrak d \frac{kh}{p}, \end{aligned}$$
(2.8)

where the parameter \(\mathfrak a \) is selected fixed but sufficiently large; the parameters \(\mathfrak b \), \(\mathfrak d \) are selected to be of size \(O(1)\). \(\square \)
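
As an illustration of (2.8), the following Python sketch evaluates the three parameters at a point of an interior face whose neighbouring elements have diameters `h_plus` and `h_minus`. The function name and the default values of `a`, `b`, `d` are hypothetical; the analysis only asks that \(\mathfrak a \) be sufficiently large and that \(\mathfrak b \), \(\mathfrak d \) be of size \(O(1)\). The check on \(\delta \) reflects the restriction \(0<\delta <\frac{1}{3}\) used in Proposition 3.1.

```python
def dg_parameters(k, h, p, h_plus, h_minus, a=10.0, b=1.0, d=0.1):
    """Evaluate alpha, beta, delta as in (2.8); a, b, d are placeholder values."""
    alpha = a * max(p ** 2 / (k * h_plus), p ** 2 / (k * h_minus))
    beta = b * k * h / p
    delta = d * k * h / p
    if not 0.0 < delta < 1.0 / 3.0:
        # the coercivity result is stated for 0 < delta < 1/3 only
        raise ValueError("delta outside (0, 1/3); reduce d or refine the mesh")
    return alpha, beta, delta
```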

Remark 2.3

It is easy to see that \(x \mapsto \alpha (x)\) can be chosen piecewise constant with respect to a sub-partition \(\fancyscript{E}\) of the set of all faces. More precisely, we define a subdivision of the set of inner faces by

$$\begin{aligned} \fancyscript{E}^{\fancyscript{I}}:=\left\{ \overset{\circ }{\partial K}\cap \overset{\circ }{\partial K^{\prime }}\cap \varOmega \mid K\in \fancyscript{T}, \quad K^{\prime }\in \fancyscript{T}\backslash \left\{ K\right\} \right\} , \end{aligned}$$

where \(\overset{\circ }{\partial K}:=\bigcup \limits _{e\in \fancyscript{E}(K)}e\). For any \(e^{\prime }\in \fancyscript{E}^{\fancyscript{I}}\), the maximum in (2.7) over \(x\in e^{\prime }\) can always be chosen as one fixed element \(K\) so that the value of \(\alpha \) is constant along \(e^{\prime }\). Hence, without loss of generality we may assume in the following that \(\alpha \) is chosen as an \(\fancyscript{E}\)-piecewise constant function. Note that the assumption “\(\alpha \) is positive” then implies for each \(K \in \fancyscript{T}\)

$$\begin{aligned} \alpha _{\partial K}^{\min }:=\inf \limits _{x\in \partial K}\alpha \left( x\right) =\alpha \left( X\right) \end{aligned}$$
(2.9)

for some \(X\in \overset{\circ }{\partial K} \cap \varOmega \). \(\square \)
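
The facewise choice of \(\alpha \) can be sketched in a few lines of Python. The data layout below (a dictionary mapping each interior face to its two neighbouring elements, and a dictionary of trace constants per element) is a hypothetical one chosen for illustration; the value assigned to each face is the smallest one permitted by (2.7).

```python
def alpha_lower_bound(k, faces, c_trace):
    """Facewise constant alpha equal to the right-hand side of (2.7).

    faces:   dict face_id -> (K_plus, K_minus), interior faces only
    c_trace: dict element -> C_trace(S, K)
    """
    return {e: 4.0 / (3.0 * k) * max(c_trace[kp], c_trace[km]) ** 2
            for e, (kp, km) in faces.items()}

def alpha_min_on_boundary(alpha, faces, element):
    """Minimum of alpha over the interior faces of one element, cf. (2.9)."""
    return min(alpha[e] for e, pair in faces.items() if element in pair)
```

Because the maximum is taken over both neighbours, the minimum over the faces of an element \(K\) is never smaller than \(\frac{4}{3k}C_{\mathrm{trace}}^{2}(S,K)\).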

In the rest of this section we will show that the discretization given by the sesquilinear form \(a_{\fancyscript{T}}\) is consistent as well as adjoint consistent. The latter property will prove particularly useful to obtain error estimates.

Lemma 2.4

(consistency) Let the exact solution \(u\) of (1.1)–(1.2) be in \(H^{3/2+{\varepsilon }}(\varOmega )\) for some \({\varepsilon }>0\). Then \(u\) satisfies, with the right-hand side \(F_{\fancyscript{T}}\) given in (2.4), the consistency condition

$$\begin{aligned} a_{\fancyscript{T}}(u,v) - k^{2} (u,v) = F_{\fancyscript{T}}(v) \quad \forall v \in S. \end{aligned}$$

Proof

From the \(H^{3/2+{\varepsilon }}\)-regularity of \(u\) it follows that \(u\) and \(\nabla u\) have well-defined traces on \(\partial K\) for each \(K\in {\fancyscript{T}}\) and

$$\begin{aligned}{}[\![u]\!]_{N}=0,\quad [\![\nabla u]\!]_{N} =0,\quad \{\nabla u\}= \nabla u\quad \text{ on }\quad \mathfrak S _{\fancyscript{T} }^{\fancyscript{I}}. \end{aligned}$$

We multiply both sides of Eq. (1.1) by a test function \(v\in S\), integrate elementwise, sum over all elements, and integrate by parts to get

$$\begin{aligned} \sum _{K\in \fancyscript{T}}\left( \int \limits _{\partial {K}}(-\nabla u\cdot \mathbf{n} )\bar{v}+\int \limits _{K}\nabla u\cdot \nabla \bar{v}\right) -\int \limits _{\varOmega }k^{2} u\bar{v}=\int \limits _{\varOmega }f\bar{v}. \end{aligned}$$
(2.10)

From the definition of the jumps on the inner faces and the boundary condition (1.2), we get

$$\begin{aligned} -\sum _{K\in \fancyscript{T}}\int \limits _{\partial {K}}(\nabla u\cdot \mathbf{n})\bar{v}dS&= -\int \limits _\mathfrak{S _{\fancyscript{T}}^{\fancyscript{B}}} \delta \nabla u\cdot \mathbf{n}\overline{v}dS- \int \limits _\mathfrak{S _{\fancyscript{T}}^{\fancyscript{B}} }(1-\delta )g\overline{v}dS\\&\quad +\int \limits _\mathfrak{S _{\fancyscript{T}}^{\fancyscript{B}}}\mathrm{i} k(1-\delta )u\overline{v}dS- \int \limits _\mathfrak{S _{\fancyscript{T}}^{\fancyscript{I}} }\nabla u\cdot [\![\overline{v}]\!]_{N}dS. \end{aligned}$$

The boundary condition (1.2) gives us

$$\begin{aligned} -\sum _{K\in \fancyscript{T}}\int \limits _{\partial {K}}(\nabla u\cdot \mathbf{n})\bar{v}dS&= -\int \limits _\mathfrak{S _{\fancyscript{T}}^{\fancyscript{B}}} \!\!\! \delta \nabla u\cdot \mathbf{n}\overline{v}dS- \int \limits _{ \mathfrak S _{ \fancyscript{T}}^{\fancyscript{B}} } \!\!\!(1- \delta )g\overline{v}dS+ \int \limits _\mathfrak{S _{\fancyscript{T}}^{ \fancyscript{B}}}\!\!\!\mathrm{i}k(1-\delta )u\overline{v}dS\\&-\int \limits _\mathfrak{S _{\fancyscript{T}}^{\fancyscript{I}}}\nabla u\cdot [\![\overline{v}]\!]_{N}dS +\frac{1}{\mathrm{i}k} \int \limits _\mathfrak{S _{\fancyscript{T}}^{\fancyscript{B}}}\delta g\,\overline{\nabla _{\fancyscript{T}}v\cdot \mathbf{n}}dS\\&-\frac{1}{\mathrm{i}k} \int \limits _\mathfrak{S _{\fancyscript{T}}^{\fancyscript{B}}}\delta \nabla u\cdot \mathbf{n}\overline{\nabla _{\fancyscript{T}}v\cdot \mathbf{n}}dS-\int \limits _\mathfrak{S _{\fancyscript{T}}^{\fancyscript{B}}}\delta u\overline{\nabla _{\fancyscript{T}}v\cdot \mathbf{n}}dS. \end{aligned}$$

Inserting this result into Eq. (2.10) leads to

$$\begin{aligned} a_{\fancyscript{T}}(u,v)-k^{2}(u,v)=(f,v)-\int \limits _\mathfrak{S _{\fancyscript{T} }^{\fancyscript{B}}}\delta \frac{1}{\mathrm{i}k}g\overline{\nabla _{\fancyscript{T}}v\cdot \mathbf{n}}dS+\int \limits _\mathfrak{S _{\fancyscript{T} }^{\fancyscript{B}}}(1-\delta )g\overline{v}dS,\quad \forall v\in S, \end{aligned}$$

which is (2.4), as desired. \(\square \)

Lemma 2.7 below will establish the consistency with respect to the following adjoint problem.

Definition 2.5

(adjoint solution operator \(\varvec{N_{k}^{*}}\) ) The adjoint Helmholtz problem is given by:

$$\begin{aligned} \text{ For } w\in L^{2}\left( \varOmega \right) \text{ find } \phi \in H^{1} (\varOmega ) \text{ such } \text{ that } a\left( v,\phi \right) =\left( v,w\right) \quad \forall v\in H^{1}\left( \varOmega \right) .\qquad \end{aligned}$$
(2.11)

The solution operator \(N_{k}^{*}:L^{2}( \varOmega )\rightarrow H^{1}(\varOmega )\) is characterized by the condition

$$\begin{aligned} a\left( v,N_{k}^{*}(w)\right) =\left( v,w\right) . \end{aligned}$$
(2.12)

We say that problem (2.11) has \(H^{s}(\varOmega )\) -regularity for some \(s>1\) if for any given right-hand side \(w\in L^{2}(\varOmega )\) the solution \(\phi \) of (2.11) is in \(H^{s}(\varOmega )\) and satisfies

$$\begin{aligned} \left\| \phi \right\| _{H^{s}\left( \varOmega \right) }\le C_{\mathrm{reg}}\left\| w\right\| _{L^{2}\left( \varOmega \right) } \end{aligned}$$

for some positive constant \(C_{\mathrm{reg}}\) that is independent of \(w\).

Remark 2.6

The adjoint problem (2.11) is a well-posed problem, for which even \(k\)-explicit regularity is available. For example, if \(\varOmega \) is convex (or smooth and star-shaped), then \(\phi \in H^{2}(\varOmega )\) and

$$\begin{aligned} k\Vert \phi \Vert _{L^{2}(\varOmega )}+\Vert \nabla \phi \Vert _{L^{2}(\varOmega )}&\le C_{1}(\varOmega )\Vert w\Vert _{L^{2}\left( \varOmega \right) },\\ \Vert \nabla ^{2}\phi \Vert _{L^{2}(\varOmega )}&\le C_{2}(\varOmega )(1+k)\Vert w\Vert _{L^{2}\left( \varOmega \right) }, \end{aligned}$$

with \(C_{1}(\varOmega )\), \(C_{2}(\varOmega )>0\) independent of \(k\ge k_{0}>0\) (\(k_{0}\) is arbitrary but fixed), [34, Prop. 8.1.4] for \(d=2\) and [10] for \(d=3\). For general Lipschitz domains, we have by [15, Thm. 2.4]

$$\begin{aligned} k\Vert \phi \Vert _{L^{2}(\varOmega )}+ \Vert \nabla \phi \Vert _{L^{2}(\varOmega )}\le C_{3}(\varOmega )k^{5/2}\Vert w\Vert _{L^{2}\left( \varOmega \right) } \end{aligned}$$

for a constant \(C_{3}(\varOmega )\) independent of \(k\ge k_{0}\). For polygonal/polyhedral Lipschitz domains \(\varOmega \) the classical elliptic regularity theory provides \(\phi \in H^{3/2+{\varepsilon }}(\varOmega )\) for some \({\varepsilon }>0\), which depends on the geometry of \(\varOmega \). \(\square \)

Lemma 2.7

(adjoint consistency) Let the adjoint Helmholtz problem be \(H^{3/2+{\varepsilon }}(\varOmega )\)-regular for some \(\varepsilon > 0\). Then for any \(w\in L^{2}(\varOmega )\), the solution \(\phi :=N_{k}^{*}(w)\) of the adjoint problem (2.11) satisfies

$$\begin{aligned} a_{\fancyscript{T}}(v,\phi )-k^{2}(v,\phi )=(v,w)\quad \forall v\in H_{\mathrm{pw}}^{3/2+{\varepsilon }}\left( \varOmega \right) . \end{aligned}$$
(2.13)

Proof

From the \(H^{3/2+{\varepsilon }}(\varOmega )\)-regularity of \(\phi \) it follows that \(\phi \) and \(\nabla \phi \) have well-defined traces on \(\partial K\) for each \(K\in {\fancyscript{T}}\) and

$$\begin{aligned}{}[\![\phi ]\!]_{N}=0, \quad [\![\nabla \phi ]\!]_{N} =0,\quad \{\nabla \phi \}= \nabla \phi \quad \text{ on }\quad \mathfrak S _{\fancyscript{T} }^{\fancyscript{I}}. \end{aligned}$$

The rest of the proof is just a repetition of the arguments in the proof of Lemma 2.4 by taking into account the zero Robin boundary conditions for the adjoint problem. \(\square \)

On \(H_{\mathrm{pw}}^{3/2+{\varepsilon }}(\varOmega )\) for \({\varepsilon }>0\) we will use the mesh-dependent norms \(\Vert \cdot \Vert _{DG}\) and \(\Vert \cdot \Vert _{DG^{+}}\) that were introduced in [23]:

$$\begin{aligned} \Vert v\Vert _{DG}^{2}&:= \Vert \nabla _{\fancyscript{T}}v\Vert _{L^{2} \left( \varOmega \right) }^{2}+ k^{-1}\Vert \beta ^{1/2} [\![\nabla _{\fancyscript{T} }v]\!]_{N}\Vert _{L^{2}\left( \mathfrak S _{\fancyscript{T}}^{\fancyscript{I} }\right) }^{2}+k\Vert \alpha ^{1/2}[\![v]\!]_{N}\Vert _{L^{2}\left( \mathfrak S _{\fancyscript{T}}^{\fancyscript{I}}\right) }^{2}\\&\quad +k^{-1}\Vert \delta ^{1/2}\nabla _{\fancyscript{T}}v\cdot \mathbf{n}\Vert _{L^{2}\left( \mathfrak S _{\fancyscript{T}}^{\fancyscript{B}}\right) }^{2} +k\Vert (1-\delta )^{1/2}v\Vert _{L^{2}\left( \mathfrak S _{\fancyscript{T} }^{\fancyscript{B}}\right) }^{2}+k^{2}\Vert v\Vert _{L^{2}\left( \varOmega \right) }^{2},\\ \;\Vert v\Vert _{DG^{+}}^{2}&:= \Vert v\Vert _{DG}^{2}+k^{-1}\Vert \alpha ^{-1/2}\{\nabla _{\fancyscript{T}}v\}\Vert _{L^{2}\left( \mathfrak S _{\fancyscript{T}}^{\fancyscript{I}}\right) }^{2}. \end{aligned}$$
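
To see how the individual terms of \(\Vert \cdot \Vert _{DG}\) are assembled, the following Python sketch evaluates the squared norm in a one-dimensional analogue. This setting is an assumption made purely for illustration (the paper treats \(d\in \{2,3\}\)): the elements are intervals, the faces are mesh nodes so that skeleton integrals become point evaluations, \(v\) is a discontinuous piecewise linear function, and \(\alpha \), \(\beta \), \(\delta \) are constants.

```python
def dg_norm_sq_1d(nodes, vals, k, alpha, beta, delta):
    """Squared DG-norm of a discontinuous piecewise linear function in 1D.

    nodes: increasing mesh points x_0 < ... < x_N
    vals:  one pair (left value, right value) per element
    """
    n_el = len(nodes) - 1
    h = [nodes[i + 1] - nodes[i] for i in range(n_el)]
    slope = [(b - a) / hi for (a, b), hi in zip(vals, h)]

    # volume terms, integrated exactly for linear functions
    grad = sum(abs(s) ** 2 * hi for s, hi in zip(slope, h))
    mass = sum(hi / 3.0 * (abs(a) ** 2 + (a * b.conjugate()).real + abs(b) ** 2)
               for (a, b), hi in zip(vals, h))

    # interior faces are the interior nodes
    jump_v = sum(abs(vals[i - 1][1] - vals[i][0]) ** 2 for i in range(1, n_el))
    jump_dv = sum(abs(slope[i - 1] - slope[i]) ** 2 for i in range(1, n_el))

    # boundary faces are the two end points; |grad v . n| = |slope| there
    bdry_dv = abs(slope[0]) ** 2 + abs(slope[-1]) ** 2
    bdry_v = abs(vals[0][0]) ** 2 + abs(vals[-1][1]) ** 2

    return (grad + beta / k * jump_dv + k * alpha * jump_v
            + delta / k * bdry_dv + k * (1.0 - delta) * bdry_v + k ** 2 * mass)
```

For a globally continuous \(v\) the term weighted by \(\alpha \) vanishes, so the penalty parameter \(\alpha \) only affects the norm of genuinely discontinuous functions.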

3 Discrete Stability and Convergence Analysis

This section is devoted to the analysis of the discrete problem for the finite dimensional space \(S\) satisfying the condition (2.3).

3.1 Continuity and Coercivity

Proposition 3.1

Define \(b_{\fancyscript{T}}(u,v):=a_{\fancyscript{T}}(u,v)+k^{2} (u,v)\). For any \(0<\delta <\frac{1}{3}\) and \(\alpha \) satisfying (2.7), there exist constants \(c_\mathrm{coer}\), \(C_{\mathrm{c} }>0\) independent of \(h\), \(k\), \(\alpha \), \(\beta \), \(\delta \), and \(C_{\mathrm{trace}}(S,K)\) such that the following two statements are true:

  1. (a)

    The sesquilinear form \(b_{\fancyscript{T}}(\cdot ,\cdot )\) is coercive:

    $$\begin{aligned} |b_{\fancyscript{T}}(v,v)|\ge c_{\mathrm{coer}}\Vert v\Vert _{DG}^{2}\quad \quad \forall v\in S. \end{aligned}$$
  2. (b)

    For any \({\varepsilon }>0\), the sesquilinear form \(b_{\fancyscript{T}}(\cdot ,\cdot )\) satisfies the following continuity estimates

    $$\begin{aligned} |b_{\fancyscript{T}}(v,w)|&\le C_{\mathrm{c}}\Vert v\Vert _{DG^{+}}\Vert w\Vert _{DG^{+}}\quad \quad \forall v,w\in H_{\mathrm{pw}}^{3/2+{\varepsilon }}\left( \varOmega \right) ,\end{aligned}$$
    (3.1)
    $$\begin{aligned} |b_{\fancyscript{T}}(v,w_{S})|&\le C_{\mathrm{c}}\Vert v\Vert _{DG^{+}}\Vert w_{S}\Vert _{DG}\quad \quad \forall v\in H_{\mathrm{pw}}^{3/2+{\varepsilon }}\left( \varOmega \right) , \quad \forall w_{S}\in S,\end{aligned}$$
    (3.2)
    $$\begin{aligned} |b_{\fancyscript{T}}(w_{S},v)|&\le C_{\mathrm{c}}\Vert v\Vert _{DG^{+}}\Vert w_{S}\Vert _{DG}\quad \quad \forall v\in H_{\mathrm{pw}}^{3/2+{\varepsilon }}\left( \varOmega \right) ,\quad \forall w_{S}\in S. \end{aligned}$$
    (3.3)

Proof

The proof uses the same argument as [23, Props. 4.2, 4.4]; we trace the dependence on our abstract framework and work out constants explicitly.

  1. (a)

    The definition of \(b_{\fancyscript{T}}(.,.)\) leads to

    $$\begin{aligned} b_{\fancyscript{T}}(v,v)&= \Vert \nabla _{\fancyscript{T}}v\Vert _{L^{2}\left( \varOmega \right) }^{2}-2\mathrm{Re}\left( \int \limits _\mathfrak{S _{\fancyscript{T}}^{\fancyscript{I}}}[\![v]\!]_{N}\cdot \{\overline{\nabla _{\fancyscript{T}}v}\}dS\right) -2\mathrm{Re} \left( \int _\mathfrak{S _{\fancyscript{T}}^{\fancyscript{B}}} \delta v\overline{\nabla _{\fancyscript{T}}v\cdot \mathbf{n}}dS\right) \\&+\,\mathrm{i}k^{-1}\Vert \beta ^{1/2} [\![\nabla _{\fancyscript{T} }v]\!]_{N}\Vert _{L^{2}\left( \mathfrak S _{\fancyscript{T}}^{\fancyscript{I} }\right) }^{2}+\mathrm{i}k^{-1}\Vert \delta ^{1/2}\nabla _{\fancyscript{T} }v\cdot \mathbf{n}\Vert _{L^{2}\left( \mathfrak S _{\fancyscript{T}}^{\fancyscript{B}}\right) }^{2}\\&+\,\mathrm{i}k\Vert \alpha ^{1/2}[\![v]\!]_{N}\Vert _{L^{2}\left( \mathfrak S _{\fancyscript{T}}^{\fancyscript{I}}\right) } ^{2}+\mathrm{i}k\Vert (1-\delta )^{1/2}v\Vert _{0,\mathfrak S _{\fancyscript{T}}^{\fancyscript{B}}}^{2}+k^{2}\Vert v\Vert _{L^{2}\left( \varOmega \right) }^{2}. \end{aligned}$$

By using Young’s inequality for some positive function \(s\in L^{\infty }( \overline{\mathfrak{S }_{\fancyscript{T}}^{\fancyscript{I}}})\) we get for the second term in the representation of \(b_{\fancyscript{T}}(\cdot ,\cdot )\)

$$\begin{aligned} \left| 2\mathrm{Re} \int \limits _\mathfrak{S _{\fancyscript{T}}^{\fancyscript{I}} }[\![v]\!]_{N}\cdot \{\overline{\nabla _{\fancyscript{T}}v}\}dS\right| \le k\Vert \sqrt{\frac{s}{\alpha }}\alpha ^{1/2} [\![v]\!]_{N}\Vert _{L^{2}\left( \mathfrak S _{\fancyscript{T} }^{\fancyscript{I}}\right) }^{2} +\frac{1}{k}\Vert \frac{1}{\sqrt{s}}\nabla \left( \left. v\right| _{K}\right) \Vert _{L^{2}\left( \mathfrak S _{\fancyscript{T}}^{\fancyscript{I}}\right) }^{2}. \end{aligned}$$

We choose \(s:=4\alpha /5\). By using (2.7) we get

$$\begin{aligned} \left| 2\mathrm{Re}\int \limits _\mathfrak{S _{\fancyscript{T}}^{\fancyscript{I}} }[\![v]\!]_{N}\cdot \{\overline{\nabla _{\fancyscript{T}}v}\}dS\right| \le \frac{4}{5}k\Vert \alpha ^{1/2} [\![v]\!]_{N}\Vert _{L^{2}\left( \mathfrak S _{\fancyscript{T} }^{\fancyscript{I}}\right) }^{2} +\sum _{K\in \fancyscript{T}}\frac{5}{4k}\Vert \frac{1}{\alpha ^{1/2}} \nabla \left( \left. v\right| _{K}\right) \Vert _{L^{2}\left( \varOmega \cap \partial K\right) }^{2}. \end{aligned}$$

For the second summand, we get with \(\alpha _{\partial K}^{\min }\) as in (2.9)

$$\begin{aligned} \sum _{K\in \fancyscript{T}}\frac{5}{4k}\Vert \frac{1}{\alpha ^{1/2}} \nabla \left( \left. v\right| _{K}\right) \Vert _{L^{2}\left( \varOmega \cap \partial K\right) }^{2}\le \sum _{K\in \fancyscript{T}}\frac{5}{4k}\frac{C_{\mathrm{trace}}^{2}\left( S,K\right) }{\alpha _{\partial K}^{\min }}\Vert \nabla v\Vert _{L^{2}\left( K\right) }^{2}. \end{aligned}$$

Let \(X\in \overset{\circ }{\partial K}\cap \varOmega \) be defined as in Remark 2.3. Since \(K\in \{ K_{X}^{+},K_{X}^{-}\}\), the condition on \(\alpha \) [cf. (2.7)] implies

$$\begin{aligned} \alpha _{\partial K}^{\min }=\alpha \left( X\right) \ge \frac{4}{3k} \max _{K^{\prime }\in \left\{ K_{X}^{+},K_{X}^{-}\right\} } C_{\mathrm{trace}}^{2}\left( S,K^{\prime }\right) \ge \frac{4}{3k}C_{\mathrm{trace}}^{2}\left( S,K\right) . \end{aligned}$$
(3.4)

Hence,

$$\begin{aligned} \sum _{K\in \fancyscript{T}}\frac{5}{4k}\Vert \frac{1}{\alpha ^{1/2}} \nabla \left( \left. v\right| _{K}\right) \Vert _{L^{2}\left( \varOmega \cap \partial K\right) }^{2}\le \frac{15}{16}\Vert \nabla _{\fancyscript{T}}v\Vert _{L^{2}\left( \varOmega \right) }^{2}. \end{aligned}$$

All in all we have derived

$$\begin{aligned} \left| 2\mathrm{Re}\int \limits _\mathfrak{S _{\fancyscript{T}}^{\fancyscript{I}} }[\![v]\!]_{N}\cdot \{\overline{\nabla _{\fancyscript{T}}v}\}dS\right| \le \frac{4k}{5}\Vert \alpha ^{1/2}[\![v]\!]_{N} \Vert _{L^{2}\left( \mathfrak S _{\fancyscript{T}}^{\fancyscript{I}}\right) } ^{2}+\frac{15}{16}\Vert \nabla _{\fancyscript{T}}v\Vert _{L^{2}\left( \varOmega \right) }^{2}. \end{aligned}$$

The third term in \(b_{\fancyscript{T}}(\cdot ,\cdot )\) can be estimated in a similar fashion for any \(t>0\) by

$$\begin{aligned} \left| 2\mathrm{Re} \int \limits _\mathfrak{S _{\fancyscript{T}}^{\fancyscript{B}} }\delta v\overline{\nabla _{\fancyscript{T}}v\cdot \mathbf{n}}dS\right| \le tk\frac{\delta }{1-\delta }\Vert (1-\delta )^{1/2}v\Vert _{L^{2}\left( \mathfrak S _{\fancyscript{T}}^{\fancyscript{B}}\right) }^{2}+\frac{1}{tk}\Vert \delta ^{1/2}\nabla _{\fancyscript{T}}v\cdot \mathbf{n}\Vert _{L^{2}\left( \mathfrak S _{\fancyscript{T}}^{\fancyscript{B}}\right) }^{2}. \end{aligned}$$

By choosing \(0<\delta <\frac{1}{3}\) as well as \(t=3/2\) we obtain

$$\begin{aligned} \left| b_{\fancyscript{T}}(v,v)\right|&\ge \frac{1}{\sqrt{2}}\left( \left| \mathrm{Re}(b_{\fancyscript{T} }(v,v))\right| +\left| \mathrm{Im}(b_{\fancyscript{T}} (v,v))\right| \right) \nonumber \\&\ge \frac{1}{\sqrt{2}}\Bigl ( \frac{1}{16}\Vert \nabla _{\fancyscript{T}} v\Vert _{L^{2}\left( \varOmega \right) }^{2}+\frac{k}{5}\Vert \alpha ^{1/2}[\![v]\!]_{N}\Vert _{L^{2}\left( \mathfrak S _{\fancyscript{T} }^{\fancyscript{I}}\right) }^{2} + \frac{k}{4} \Vert (1-\delta )^{1/2}v\Vert _{L^{2}\left( \mathfrak S _{\fancyscript{T}}^{\fancyscript{B}}\right) }^{2} \nonumber \\&\quad +\frac{1}{3k}\Vert \delta ^{1/2}\nabla _{\fancyscript{T}}v\cdot \mathbf{n}\Vert _{0,\mathfrak S _{\fancyscript{T}}^{\fancyscript{B}}}^{2} +k^{-1}\Vert \beta ^{1/2}[\![\nabla _{\fancyscript{T}} v]\!]_{N}\Vert _{L^{2}\left( \mathfrak S _{\fancyscript{T}}^{\fancyscript{I}}\right) }^{2}+k^{2}\Vert v\Vert _{L^{2}\left( \varOmega \right) }^{2}\Bigr ) \nonumber \\&\ge c_{\mathrm{coer}}\Vert v\Vert _{DG}^{2}. \end{aligned}$$
(3.5)
  1. (b)

    By the triangle inequality we get

    $$\begin{aligned}&\!\!\! |b_{\fancyscript{T}}(v,w)|\nonumber \\&\!\!\!\quad \le |(\nabla _{\fancyscript{T}}v,\nabla _{\fancyscript{T} }w)| +k^{2}|(v,w)| +\!\left| ~\int \limits _\mathfrak{S _{\fancyscript{T}}^{\fancyscript{I}}} [\![v]\!]_{N}\cdot \{\overline{\nabla _{\fancyscript{T}}w} \}dS\right| +\left| ~\int \limits _\mathfrak{S _{\fancyscript{T}}^{\fancyscript{I}}} \{\nabla _{\fancyscript{T} }v\}\cdot [\![\overline{w} ]\!]_{N}dS\right| \nonumber \\&\!\!\!\quad \quad \!+\!\left| ~\int \limits _\mathfrak{S _{\fancyscript{T}}^{\fancyscript{B}}} \delta v\overline{\nabla _{\fancyscript{T}} w\cdot \mathbf{n}}dS\right| +\left| ~\int \limits _\mathfrak{S _{\fancyscript{T}}^{\fancyscript{B}}} \delta \nabla _{\fancyscript{T} }v\cdot \mathbf{n}\overline{w}dS\right| + \frac{1}{k}\left| ~\int \limits _\mathfrak{S _{\fancyscript{T} }^{\fancyscript{I}}}\left( \beta [\![\nabla _{\fancyscript{T}}v ]\!]_{N} [\![\overline{\nabla _{\fancyscript{T}}w} ]\!]_{N}\right) dS\right| \nonumber \\&\!\!\!\quad \quad +\frac{1}{k}\left| ~\int \limits _\mathfrak{S _{\fancyscript{T}}^{\fancyscript{B}}} \left( \delta \nabla _{\fancyscript{T}}v\cdot \mathbf{n}\overline{\nabla _{\fancyscript{T}} w\cdot \mathbf{n}}\right) dS\right| \!+\!\left| ~\int \limits _\mathfrak{S _{\fancyscript{T}}^{\fancyscript{I}} }\left( k\alpha [\![v]\!]_{N}[\![\overline{w} ]\!]_{N}\right) dS\right| \!+\!k\left| ~\int \limits _\mathfrak{S _{\fancyscript{T}}^{\fancyscript{B}}} (1\!-\!\delta )v\overline{w}dS\right| .\nonumber \\ \end{aligned}$$
    (3.6)

For \(0<\delta <1/3\) and for any \(v\), \(w\in H_{\mathrm{pw}} ^{3/2+{\varepsilon }}(\varOmega )\) we finally obtain

$$\begin{aligned} |b_{\fancyscript{T}}(v,w)|\le C_{\mathrm{c}}\Vert v\Vert _{DG^{+}}\Vert w\Vert _{DG^{+}}. \end{aligned}$$

Estimates in weaker norms are possible if one of these two functions is from the discrete space \(S\), e.g., \(w\in S\). A careful inspection of Eq. (3.6) shows that the only term which requires the \(DG^{+}\)-norm instead of the \(DG\)-norm for \(w\) in the continuity estimate is \(\int _\mathfrak{S _{\fancyscript{T}}^{\fancyscript{I}}} [\![v]\!]_{N} \cdot \{\overline{ \nabla _{\fancyscript{T}}w}\}dS\). Using the Cauchy-Schwarz inequality we get

$$\begin{aligned} \left| ~\int \limits _\mathfrak{S _{\fancyscript{T}}^{\fancyscript{I}}}[\![v]\!]_{N}\cdot \{\overline{\nabla _{\fancyscript{T}}w}\}dS\right| \le \sum _{K\in \fancyscript{T}}\left\{ \left\| [\![v]\!]_{N} \right\| _{L^{2}\left( \varOmega \cap \partial K\right) }\left\| \nabla \left( \left. w\right| _{K}\right) \right\| _{L^{2}\left( \varOmega \cap \partial K\right) }\right\} . \end{aligned}$$

We apply the trace inequality in (2.6) and also (2.7) to obtain

$$\begin{aligned} \left| ~\int \limits _\mathfrak{S _{\fancyscript{T}}^{\fancyscript{I}}} [\![v]\!]_{N}\cdot \{\overline{\nabla _{\fancyscript{T}}w} \}dS\right|&\le \!\!\sum _{K\in \fancyscript{T}}\left\{ \! \frac{1}{\sqrt{\alpha _{\partial K}^{\min }}}\left\| \alpha ^{\frac{1}{2}} [\![v]\!]_{N}\right\| _{L^{2}\left( \varOmega \cap \partial K\right) }C_{\mathrm{trace}}\left( S,K\right) \left\| \nabla _{\fancyscript{T}}w\right\| _{L^{2}\left( K\right) }\!\right\} \\&\!\!\!\overset{(3.4)}{\le }\sqrt{\frac{3k}{4}}\sum _{K\in \fancyscript{T}}\left\{ \left\| \alpha ^{\frac{1}{2}} [\![v]\!]_{N}\right\| _{L^{2}\left( \varOmega \cap \partial K\right) }\left\| \nabla _{\fancyscript{T}}w\right\| _{L^{2}\left( K\right) }\right\} \\&\le \sqrt{\frac{3k}{2}}\left\| \alpha ^{\frac{1}{2}} [\![v]\!]_{N}\right\| _{L^{2}\left( \mathfrak S _{\fancyscript{T}}^{\fancyscript{I}}\right) }\left\| \nabla _{\fancyscript{T} }w\right\| _{L^{2}\left( \varOmega \right) }. \end{aligned}$$

Hence, we finally obtain (3.2). The estimate (3.3) can be shown using the same techniques or derived from (3.2) by observing that for \(v\), \(w\in H_{\mathrm{pw}}^{3/2+\varepsilon } (\varOmega )\) we have

$$\begin{aligned} b_{{\fancyscript{T}},k}(v,w)=\overline{b_{{\fancyscript{T}},-k}(w,v)}, \end{aligned}$$

where we have added the subscript \(k\) (or \(-k\)) to emphasize how the parameter \(k\) enters the definition. \(\square \)

Remark 3.2

The restriction \(0<\delta <1/3\) in Proposition 3.1 was made to simplify the proof and may be relaxed to \(0<\delta <1/2\). Then, the coercivity constant is bounded from below but degenerates to zero as \(\delta \rightarrow 1/2\). This can be shown by assuming \(0<\delta \le 1/2-{\varepsilon }\) and \(t=1/(1-2{\varepsilon })\) with \(0<{\varepsilon }<1/2\). Following similar steps as in (3.5), one can show

$$\begin{aligned} C_{\mathrm{coer}}=\frac{1}{\sqrt{2}}\min \left\{ \frac{1}{16},\frac{2{\varepsilon }}{1+2{\varepsilon }}\right\} . \end{aligned}$$

\(\square \)
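The constant in Remark 3.2 can be traced through a short computation. The following is a sketch only; it assumes, as the structure of (3.5) suggests, that the boundary terms \(k\Vert (1-\delta )^{1/2}v\Vert ^{2}\) and \(k^{-1}\Vert \delta ^{1/2}\nabla _{\fancyscript{T}}v\cdot \mathbf{n}\Vert ^{2}\) enter with unit coefficients before the cross term is subtracted. With \(0<\delta \le 1/2-{\varepsilon }\) and \(t=1/(1-2{\varepsilon })\) one has

$$\begin{aligned} \frac{\delta }{1-\delta }&\le \frac{1/2-{\varepsilon }}{1/2+{\varepsilon }}=\frac{1-2{\varepsilon }}{1+2{\varepsilon }},\qquad t\,\frac{\delta }{1-\delta }\le \frac{1}{1+2{\varepsilon }},\\ k-tk\frac{\delta }{1-\delta }&\ge \frac{2{\varepsilon }}{1+2{\varepsilon }}\,k,\qquad \frac{1}{k}-\frac{1}{tk}=\frac{2{\varepsilon }}{k}\ge \frac{2{\varepsilon }}{1+2{\varepsilon }}\,\frac{1}{k}. \end{aligned}$$

The remaining coefficients \(1/16\), \(1/5\), and \(1\) in (3.5) are unaffected by the choice of \(\delta \) and \(t\), so the smallest coefficient is \(\min \{1/16,\,2{\varepsilon }/(1+2{\varepsilon })\}\).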

As a corollary of (3.3) we have the following continuity assertion, which will be useful for certain adjoint problems:

Corollary 3.3

For any \({\varepsilon }>0\), it holds

$$\begin{aligned} |a_{\fancyscript{T}}\left( v,u\right) -k^{2}\left( v,u\right) |\le C_{\mathrm{c}}\Vert u\Vert _{DG^{+}}\Vert v\Vert _{DG}\quad \forall u\in H_{\mathrm{pw}}^{3/2+{\varepsilon }}\left( \varOmega \right) \quad \forall v\in S. \end{aligned}$$
(3.7)

3.2 Quasi-Optimality

We start with a definition: We say that a pair \((u,u_{S})\in H_{\mathrm{pw}}^{3/2+\varepsilon }(\varOmega )\times S\) of functions satisfies the Galerkin orthogonality if

$$\begin{aligned} a_{\fancyscript{T}}(u-u_{S},v)=0\quad \forall v\in S. \end{aligned}$$
(3.8)

Our starting point for the analysis of our DG problem is a quasi-optimality result which is proved under the assumption that the above Galerkin orthogonality is valid. The existence and uniqueness of a solution \(u_{S}\) of the discrete problem (2.4) is then shown in a second step based on the quasi-optimality result.

Proposition 3.4

There exists a constant \(\widetilde{C} > 0\) depending solely on the constants \(C_{c}\), \(c_{\mathrm{coer}}\) of Proposition 3.1 such that the following is true: Any pair \((u,u_{S}) \in H^{3/2+\varepsilon }_{\mathrm{pw}}(\varOmega ) \times S\) meeting the orthogonality condition (3.8) satisfies

$$\begin{aligned} \Vert u-u_{S}\Vert _{DG}\le \widetilde{C}\left( \inf _{v\in S}\Vert u-v\Vert _{DG^{+}}+\sup _{0\ne w_{S}\in S}\frac{k|(u-u_{S},w_{S})|}{\Vert w_{S}\Vert _{L^{2}\left( \varOmega \right) }}\right) . \end{aligned}$$

Proof

For the reader’s convenience, we include the proof taken from [23, Proposition 4.4]. We start with a triangle inequality

$$\begin{aligned} \Vert u-u_{S}\Vert _{DG}\le \Vert u-v\Vert _{DG}+\Vert v-u_{S}\Vert _{DG} \quad \quad \forall v\in S \end{aligned}$$
(3.9)

and employ the coercivity of \(b_{\fancyscript{T}}(\cdot ,\cdot )\)

$$\begin{aligned} \Vert v-u_{S}\Vert _{DG}^{2}&\le \frac{1}{c_{\mathrm{coer}}}|b_{\fancyscript{T}}(v-u_{S},v-u_{S})|\nonumber \\&\le \frac{1}{c_{\mathrm{coer}}}|b_{\fancyscript{T}}(v-u,v-u_{S})|+\frac{1}{c_{\mathrm{coer}}}|b_{\fancyscript{T}}(u-u_{S},v-u_{S})|\nonumber \\&= \frac{1}{c_{\mathrm{coer}}}|b_{\fancyscript{T}}(v-u,v-u_{S})|+\frac{2k^{2} }{c_{\mathrm{coer}}}|(u-u_{S},v-u_{S})|, \end{aligned}$$
(3.10)

where in the last step we employed the orthogonality condition (3.8). The continuity of \(b_{\fancyscript{T}}(\cdot ,\cdot )\) expressed in (3.1) together with (3.10) implies

$$\begin{aligned} \Vert v-u_{S}\Vert _{DG}^{2}\le \frac{C_{\mathrm{c}}}{c_{\mathrm{coer}}}\Vert v-u\Vert _{DG^{+}}\Vert v-u_{S}\Vert _{DG}+\frac{2k^{2}}{c_{\mathrm{coer}} }|(u-u_{S},v-u_{S})|. \end{aligned}$$

We combine this result with (3.9) and obtain

$$\begin{aligned} \Vert u-u_{S}\Vert _{DG}\le \Vert u-v\Vert _{DG}+\frac{C_{\mathrm{c}} }{c_{\mathrm{coer}}}\Vert v-u\Vert _{DG^{+}}+\frac{2k}{c_{\mathrm{coer}}} \sup _{0\ne w_{S}\in S}\frac{|(u-u_{S},w_{S})|}{\Vert w_{S}\Vert _{L^{2}\left( \varOmega \right) }}. \end{aligned}$$

\(\square \)
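The last step of the proof can be spelled out as follows. This is a sketch that assumes \(k\Vert w\Vert _{L^{2}(\varOmega )}\le \Vert w\Vert _{DG}\) and \(\Vert w\Vert _{DG}\le \Vert w\Vert _{DG^{+}}\), which hold if the \(DG\)-norm contains the term \(k^{2}\Vert w\Vert _{L^{2}(\varOmega )}^{2}\) (cf. the last term in (3.5)) and the \(DG^{+}\)-norm only adds nonnegative terms. For \(v\ne u_{S}\) we divide the quadratic inequality by \(\Vert v-u_{S}\Vert _{DG}\) and estimate

$$\begin{aligned} \frac{2k^{2}}{c_{\mathrm{coer}}}\frac{|(u-u_{S},v-u_{S})|}{\Vert v-u_{S}\Vert _{DG}}\le \frac{2k^{2}}{c_{\mathrm{coer}}}\frac{|(u-u_{S},v-u_{S})|}{k\Vert v-u_{S}\Vert _{L^{2}\left( \varOmega \right) }}\le \frac{2k}{c_{\mathrm{coer}}}\sup _{0\ne w_{S}\in S}\frac{|(u-u_{S},w_{S})|}{\Vert w_{S}\Vert _{L^{2}\left( \varOmega \right) }}. \end{aligned}$$

The case \(v=u_{S}\) is trivial. Taking the infimum over \(v\in S\) then suggests that \(\widetilde{C}=\max \{1+C_{\mathrm{c}}/c_{\mathrm{coer}},\,2/c_{\mathrm{coer}}\}\) is an admissible choice.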

Next, we will use the adjoint problem to gauge the contribution \(\sup _{w_{S}\in S}\frac{k|(u-u_{S},w_{S})|}{\Vert w_{S}\Vert _{L^{2}(\varOmega )}}\) in Proposition 3.4.

Proposition 3.5

Assume that the adjoint Helmholtz problem is \(H^{3/2+\varepsilon }(\varOmega )\) regular for some \(\varepsilon > 0\). Let the coefficients in the definition of \(a_{\fancyscript{T}}(\cdot ,\cdot )\) satisfy \(0<\delta <1/3\) and (2.7). Then the following is true: For any pair \((u,u_{S}) \in H^{3/2+\varepsilon }_{\mathrm{pw}}(\varOmega ) \times S\) that satisfies (3.8) we have

$$\begin{aligned} \sup _{0\ne w_{S}\in S}\frac{k|(u-u_{S},w_{S})_{L^{2}\left( \varOmega \right) }|}{\Vert w_{S}\Vert _{L^{2}\left( \varOmega \right) }}\le \left( 1+3C_{\mathrm{c}}\right) \eta _{k}(S)\left( \inf _{v\in S}\Vert u-v\Vert _{DG^{+}}+\Vert u-u_{S}\Vert _{DG}\right) , \end{aligned}$$

where the adjoint approximation property is defined by

$$\begin{aligned} \eta _{k}(S):=\sup _{f\in L^{2}(\varOmega )\setminus \{0\}}\inf _{\psi _{S}\in S} \frac{k\Vert N_{k}^{*}(f)-\psi _{S}\Vert _{DG^{+}}}{\Vert f\Vert _{L^{2}\left( \varOmega \right) }}. \end{aligned}$$
(3.11)

Proof

Write \(\phi = N_{k}^{*}(w_S)\) for the solution of (2.12) with right-hand side \(w_S \in S \subset L^2(\varOmega )\). Our regularity assumption implies \(\phi \in H^{3/2+{\varepsilon }}(\varOmega )\) for some \({\varepsilon }>0\) (cf. Remark 2.6). The adjoint consistency of the method stated in Lemma 2.7 then provides

$$\begin{aligned} (u-u_{S},w_{S})=a_{\fancyscript{T}}(u-u_{S},\phi )-k^{2}(u-u_{S},\phi ). \end{aligned}$$

Using the definition of the sesquilinear form \(a_{\fancyscript{T}}\) and the Galerkin orthogonality, we get for any \(v\), \(\psi _{S}\in S\)

$$\begin{aligned} |(u-u_{S},w_{S})|&\le |a_{\fancyscript{T}}(u-v,\phi -\psi _{S})|+ |a_{\fancyscript{T} }(v-u_{S},\phi -\psi _{S})| +k^{2}|(u-u_{S},\phi -\psi _{S})|\\&\le \left( C_{\mathrm{c}}\Vert u-v\Vert _{DG^{+}}+C_{\mathrm{c}}\left\| v-u_{S}\right\| _{DG} +\left\| u-u_{S}\right\| _{DG}\right) \left\| \phi -\psi _{S}\right\| _{DG^{+}}\\&\le \left( 2C_{\mathrm{c}}\Vert u-v\Vert _{DG^{+}}+(1+C_{\mathrm{c}})\Vert u-u_{S}\Vert _{DG}\right) \Vert \phi -\psi _{S}\Vert _{DG^{+}}. \end{aligned}$$

Since \(v\), \(\psi _{S}\in S\) are arbitrary, the statement follows. \(\square \)

The combination of the previous results leads to the following wavenumber-explicit error estimate (still under the assumption of existence of a discrete solution).

Theorem 3.6

(quasi-optimal convergence) Assume that the adjoint Helmholtz problem is \(H^{3/2+\varepsilon }(\varOmega )\) regular for some \(\varepsilon > 0\). Let the coefficients in the definition of \(a_{\fancyscript{T}}\left( \cdot ,\cdot \right) \) satisfy \(0<\delta <1/3\) and (2.7). If the condition

$$\begin{aligned} \eta _{k}(S)<\frac{c_{\mathrm{coer}}}{4(1+C_{c})} \end{aligned}$$

holds, then for any pair \((u,u_{S}) \in H^{3/2+\varepsilon } _{\mathrm{pw}}(\varOmega ) \times S\) that satisfies (3.8) we have

$$\begin{aligned} \Vert u-u_{S}\Vert _{DG}\le C\inf _{v\in S}\Vert u-v\Vert _{DG^{+}}, \end{aligned}$$
(3.12)

where \(C\) depends solely on \(C_{c}\) and \(c_{\mathrm{coer}}\).

Proof

By combining the results of Propositions 3.4 and 3.5, we get the following:

$$\begin{aligned} \Vert u-u_{S}\Vert _{DG}\le \left( 1+\frac{C_{c}}{c_{\mathrm{coer}}} \!+\!\frac{4C_{c}}{c_{\mathrm{coer}}}\eta _{k}(S)\right) \inf _{v\in S}\Vert u-v\Vert _{DG^{+}}+\frac{2(1+C_{c})}{c_{\mathrm{coer}}}\eta _{k}(S)\Vert u-u_{S}\Vert _{DG}. \end{aligned}$$

The condition \(\frac{2(1+C_{c})}{c_{\mathrm{coer}}}\eta _{k}(S)<1/2\) allows us to absorb the error term on the right-hand side in the left-hand side. \(\square \)
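For illustration, the absorption argument can be made explicit. Writing \(\theta :=\frac{2(1+C_{c})}{c_{\mathrm{coer}}}\eta _{k}(S)<1/2\), the combined estimate reads \((1-\theta )\Vert u-u_{S}\Vert _{DG}\le (\ldots )\inf _{v\in S}\Vert u-v\Vert _{DG^{+}}\), and therefore

$$\begin{aligned} \Vert u-u_{S}\Vert _{DG}&\le 2\left( 1+\frac{C_{c}}{c_{\mathrm{coer}}}+\frac{4C_{c}}{c_{\mathrm{coer}}}\eta _{k}(S)\right) \inf _{v\in S}\Vert u-v\Vert _{DG^{+}}\\&\le 2\left( 2+\frac{C_{c}}{c_{\mathrm{coer}}}\right) \inf _{v\in S}\Vert u-v\Vert _{DG^{+}}, \end{aligned}$$

where the second step uses \(\frac{4C_{c}}{c_{\mathrm{coer}}}\eta _{k}(S)<\frac{C_{c}}{1+C_{c}}<1\). Under these assumptions, \(C=2(2+C_{c}/c_{\mathrm{coer}})\) is one possible value of the constant in (3.12).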

3.3 Discrete Stability

The preceding section provides an error analysis under the assumption of existence of the discrete solution \(u_{S}\in S\) of (2.4). Extra conditions have to be imposed for existence as the following Example 3.7 shows. That is, the discontinuous Galerkin method for the Helmholtz problem is not necessarily stable for an arbitrary discrete space \(S\) that only satisfies the minimal condition (2.3).

Example 3.7

Let \(\varOmega := \mathrm{conv}\{(0,0)^{\intercal }, (1,0)^{\intercal }, (0,1)^{\intercal }\}\) and let the mesh \(\fancyscript{T}\) consist of the single element \(\{\varOmega \}\). A (one-dimensional) space \(S\) that satisfies condition (2.3) is defined by the span of the squared cubic bubble function, \(S=\mathrm{span}\{(27\lambda _{1}\lambda _{2}\lambda _{3})^{2}\}\), where \(\lambda _{1}=\xi _{1},\,\lambda _{2}=\xi _{2},\,\lambda _{3}=1-\xi _{1}-\xi _{2}\) and \(0\le \xi _{1}\le 1,\,0\le \xi _{2}\le 1-\xi _{1}\). In this case, Eq. (3.16) reduces to

$$\begin{aligned} (\nabla w_{S},\nabla v_{S})-k^{2}(w_{S},v_{S})=0\quad \forall v_{S}\in S. \end{aligned}$$
(3.13)

As \(S\) is a one-dimensional space we get the following \(1\times 1\) system \((A-k^{2}B)w=0,\) where \(A=\int _{\widehat{K}}\nabla b_{1}\cdot \nabla b_{1}=5.1125,\, B=\int _{\widehat{K}}b_{1}^{2}= 0.0843\) and \(b_{1}=(27\lambda _{1}\lambda _{2}\lambda _{3})^{2}\). Obviously, the value of \(k=\sqrt{\frac{A}{B}}\) is a critical wavenumber where the system matrix becomes singular. \(\square \)
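The numbers in Example 3.7 can be reproduced with exact rational arithmetic. The following Python sketch is not part of the original analysis; the helper name `integrate_barycentric` is hypothetical, and the script relies on the classical formula \(\int _{K}\lambda _{1}^{a}\lambda _{2}^{b}\lambda _{3}^{c}=2|K|\,a!\,b!\,c!/(a+b+c+2)!\) for barycentric monomials on a triangle, here with \(|K|=1/2\).

```python
from fractions import Fraction
from math import factorial, sqrt


def integrate_barycentric(a, b, c):
    """Exact integral of l1^a * l2^b * l3^c over the reference triangle.

    Uses 2|K| a! b! c! / (a + b + c + 2)! with |K| = 1/2.
    """
    return Fraction(factorial(a) * factorial(b) * factorial(c),
                    factorial(a + b + c + 2))


# Mass entry B = int b1^2 with b1 = 729 * (l1 l2 l3)^2.
B = 729**2 * integrate_barycentric(4, 4, 4)

# Stiffness entry A = int |grad b1|^2, where grad b1 = 1458 * phi * grad phi,
# phi = l1 l2 l3, d(phi)/d(xi1) = l2 (l3 - l1), d(phi)/d(xi2) = l1 (l3 - l2).
# Expanding phi^2 * (d phi / d xi_i)^2 gives three monomials each.
dx_sq = (integrate_barycentric(2, 4, 4)
         - 2 * integrate_barycentric(3, 4, 3)
         + integrate_barycentric(4, 4, 2))
dy_sq = (integrate_barycentric(4, 2, 4)
         - 2 * integrate_barycentric(4, 3, 3)
         + integrate_barycentric(4, 4, 2))
A = 1458**2 * (dx_sq + dy_sq)

k_squared = A / B                    # exact rational value of the critical k^2
k_critical = sqrt(float(k_squared))  # floating-point critical wavenumber

print(f"A = {float(A):.4f}, B = {float(B):.4f}")
print(f"critical k^2 = {k_squared}, k = {k_critical:.4f}")
```

Under these assumptions the script recovers the rounded values \(A=5.1125\) and \(B=0.0843\), and gives the critical wavenumber exactly as \(k^{2}=182/3\), i.e., \(k\approx 7.79\).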

In this section, we will study conditions under which the DG problem admits a unique solution in the discrete space \(S\). One possible condition (3.14) is formulated in Theorem 3.8 and it is shown that this condition is always satisfied for plane wave methods as well as for conforming and non-conforming polynomial \(hp\)-finite element spaces on affine simplicial meshes (cf. Remark 3.9). Thus, Theorem 3.8 presents a unified stability theory for these types of methods and shows that a unique numerical solution always exists for these important choices of spaces. This is in contrast to conventional Galerkin methods applied to (1.3), where a minimal resolution condition on the finite element space, e.g., on the mesh width, has to be imposed in order to guarantee unique solvability of the discrete equations.

Alternatively, as in the classical Galerkin discretization, a condition on the adjoint approximation property on the abstract space can be employed to prove existence, uniqueness, and quasi-optimality of the discretization. This is proved in Theorem 3.10.

Theorem 3.8

Let the discrete space \(S\) satisfy (2.3). Let \(\beta \ge 0\), \(0<\delta <1/3\), and choose \(\alpha \) such that (2.7) is satisfied. Then, the DG problem (2.4) has a unique solution \(u_{S} \in S\) if

$$\begin{aligned} C_{S}<\frac{k}{2\left( 1+C_{c}\right) }\quad \text{ with }\quad C_{S} :=\sup _{w_{S}\in S\cap H_{0}^{2}(\varOmega )\setminus \{0\}} \inf _{v_{S}\in S} \frac{\Vert \left\langle x,\nabla w_{S}\right\rangle -v_{S}\Vert _{DG^{+}} }{\Vert w_{S}\Vert _{L^{2}\left( \varOmega \right) }}. \end{aligned}$$
(3.14)

Furthermore, let the exact solution of (1.3) satisfy \(u\in H^{3/2+{\varepsilon }}(\varOmega )\), and let the adjoint Helmholtz problem be \(H^{3/2+{\varepsilon }}(\varOmega )\) regular for some \({\varepsilon }>0\). Assume the adjoint approximation condition

$$\begin{aligned} \eta _{k}(S)<\frac{c_{\mathrm{coer}}}{4(1+C_{c})}. \end{aligned}$$

Then, the quasi-optimal error estimate

$$\begin{aligned} \Vert u-u_{S}\Vert _{DG}\le C\inf _{v\in S}\Vert u-v\Vert _{DG^{+}} \end{aligned}$$

holds, where \(C\) is independent of \(k\) and the space \(S\).

Proof

If the discrete solution \(u_{S} \in S\) of (2.4) exists, then the consistency statement Lemma 2.4 implies the orthogonality condition (3.8) so that the quasi-optimality assertion follows from Theorem 3.6. It therefore remains to assert existence of \(u_{S} \in S\). By dimension arguments, existence of a solution \(u_{S} \in S\) of (2.4) follows, if we can verify the following uniqueness assertion:

$$\begin{aligned} \forall w_{S}\in S\setminus \{0\}\quad \exists v_{S}\in S\quad \text{ s.t. } \quad |a_{\fancyscript{T}}(w_{S},v_{S})-k^{2}(w_{S},v_{S})|>0. \end{aligned}$$
(3.15)

We prove (3.15) indirectly, by showing the equivalent implication:

For any \(w_{S}\in S\) it holds:

$$\begin{aligned} \left( \forall v_{S}\in S\quad a_{\fancyscript{T}}(w_{S},v_{S})-k^{2}(w_{S} ,v_{S})=0\right) \Rightarrow w_{S}=0. \end{aligned}$$
(3.16)

Our assumption in (3.16) implies for any \(v_{S}\in S\)

$$\begin{aligned} \mathrm{Im}\left( a_{\fancyscript{T}}(w_{S},v_{S})-k^{2}(w_{S} ,v_{S})\right) =0\quad \text{ and }\quad \mathrm{Re}\left( a_{\fancyscript{T} }(w_{S},v_{S})-k^{2}(w_{S},v_{S})\right) =0.\qquad \quad \end{aligned}$$
(3.17)

First we choose \(v_{S}=w_{S}\) in (3.17). From the equation for the imaginary part we obtain

$$\begin{aligned}{}[\![\nabla _{\fancyscript{T}}w_{S}]\!]_{N}&= 0\quad \text{ on }\quad \mathfrak S _{\fancyscript{T}}^{\fancyscript{I}}, \\ \nabla _{\fancyscript{T}}w_{S}\cdot \mathbf{n}&= 0\quad \text{ on } \quad \mathfrak S _{\fancyscript{T}}^{\fancyscript{B}},\\ [\![w_{S}]\!]_{N}&= 0\quad \text{ on }\quad \mathfrak S _{\fancyscript{T}}^{\fancyscript{I}}, \\ w_{S}&= 0\quad \text{ on }\quad \mathfrak S _{\fancyscript{T}}^{\fancyscript{B}}, \end{aligned}$$

and this implies \(w_{S}\in H_{0}^{2}(\varOmega )\cap S\) (in particular, it implies \(\nabla _{\fancyscript{T}}w_{S}=\nabla w_{S}\)). Hence, the real part of Eq. (3.17) gives us

$$\begin{aligned} \left\| \nabla w_{S}\right\| _{L^{2}\left( \varOmega \right) }^{2} -k^{2}\left\| w_{S}\right\| _{L^{2}\left( \varOmega \right) }^{2}=0. \end{aligned}$$
(3.18)

Define \(v_{S}^{*}(x)=\langle x,\nabla w_{S}\rangle \). From the real part of Eq. (3.17) it follows

$$\begin{aligned} 0&= \mathrm{Re}\left( a_{\fancyscript{T}}(w_{S},v_{S}^{*} )-k^{2}(w_{S},v_{S}^{*})\right) +\mathrm{Re}\left( a_{\fancyscript{T} }(w_{S},v_{S}-v_{S}^{*})-k^{2}(w_{S},v_{S}-v_{S}^{*})\right) \\&\ge \mathrm{Re}\left( a_{\fancyscript{T}}(w_{S},v_{S}^{*} )-k^{2}(w_{S},v_{S}^{*})\right) -|a_{\fancyscript{T}}(w_{S},v_{S}^{*} -v_{S})|-|k^{2}(w_{S},v_{S}^{*}-v_{S})|. \end{aligned}$$

By using \(2\mathrm{Re}(w_{S}\nabla \overline{w_{S}})= \nabla (|w_{S}|^{2})\) for the first term, the continuity of \(a_{\fancyscript{T}}\), and the Cauchy-Schwarz inequality we get (see also [19, 34])

$$\begin{aligned} 0&\ge (2-d)\Vert \nabla w_{S}\Vert _{L^{2}\left( \varOmega \right) }^{2} +dk^{2}\Vert w_{S}\Vert _{L^{2}\left( \varOmega \right) }^{2}-2C_{c}\Vert w_{S}\Vert _{DG}\Vert v_{S}^{*}-v_{S}\Vert _{DG^{+}}\nonumber \\&\quad -2k^{2}\Vert w_{S}\Vert _{L^{2}\left( \varOmega \right) }\Vert v_{S}^{*}- v_{S}\Vert _{L^{2}\left( \varOmega \right) }\nonumber \\&\ge (2-d)\Vert \nabla w_{S}\Vert _{L^{2}\left( \varOmega \right) }^{2} +dk^{2}\Vert w_{S}\Vert _{L^{2}\left( \varOmega \right) }^{2}-2C_{c}C_{S}\Vert w_{S}\Vert _{DG}\Vert w_{S}\Vert _{L^{2}\left( \varOmega \right) }\nonumber \\&\quad -2C_{S}k\Vert w_{S}\Vert _{L^{2}\left( \varOmega \right) }^{2}. \end{aligned}$$
(3.19)

Using the definition of DG-norm and taking into account that \(w_{S}\in H_{0}^{2}(\varOmega )\cap S\) we get \(\Vert w_{S}\Vert _{DG}=\Vert w_{S} \Vert _{\fancyscript{H}}\), where \(\Vert w_{S}\Vert _{\fancyscript{H}}^{2}:=\Vert \nabla w_{S}\Vert _{L^{2}( \varOmega )}^{2}+k^{2}\Vert w_{S}\Vert _{L^{2}( \varOmega )}^{2}\). For \(d=1\), we get

$$\begin{aligned} 0&\ge \Vert w_{S}\Vert _{\fancyscript{H}}^{2}-2C_{c}C_{S}\Vert w_{S} \Vert _{\fancyscript{H}}\Vert w_{S}\Vert _{L^{2}\left( \varOmega \right) } -2C_{S}k\Vert w_{S}\Vert _{L^{2}\left( \varOmega \right) }^{2}\\&\ge \left( 1-\frac{2C_{c}C_{S}}{k}-\frac{2C_{S}}{k}\right) \Vert w_{S}\Vert _{\fancyscript{H}}^{2}. \end{aligned}$$

If \(C_{S}<\frac{k}{2(1+C_{c})}\) then it follows that \(w_{S} =0\,\text{ in }\,\varOmega \). For \(d=2\), 3 we add \((d-1)\) times (3.18) to (3.19) and then proceed with the same argument as for \(d=1\). \(\square \)
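One way to carry out the step for \(d\in \{2,3\}\) is sketched here. Since the left-hand side of (3.18) vanishes, adding \((d-1)\) times (3.18) to the first two terms of (3.19) does not change their value; for \(d=2\) this amounts to adding (3.18) once. This gives

$$\begin{aligned} (2-d)\Vert \nabla w_{S}\Vert _{L^{2}\left( \varOmega \right) }^{2}+dk^{2}\Vert w_{S}\Vert _{L^{2}\left( \varOmega \right) }^{2}+(d-1)\left( \Vert \nabla w_{S}\Vert _{L^{2}\left( \varOmega \right) }^{2}-k^{2}\Vert w_{S}\Vert _{L^{2}\left( \varOmega \right) }^{2}\right) =\Vert w_{S}\Vert _{\fancyscript{H}}^{2}, \end{aligned}$$

so that (3.19) takes the same form as in the case \(d=1\), and the bounds \(k\Vert w_{S}\Vert _{L^{2}(\varOmega )}\le \Vert w_{S}\Vert _{\fancyscript{H}}\) lead to the factor \(1-\frac{2C_{S}(1+C_{c})}{k}\) as before.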

Remark 3.9

For general finite-dimensional spaces \(S\), condition (3.14) could be interpreted as a condition on the scale resolution. However, the condition (3.14) is always satisfied in the following two important cases:

  • In [23] the variational formulation (2.4) was derived for the discretization by locally (discontinuous) plane waves. In that setting, condition (3.14) is not imposed since it is trivially satisfied as then \(S\cap H_{0}^{2}(\varOmega )=\{0\}\) (this equality follows from the unique continuation principle for elliptic PDEs—see, e.g., the discussion in [15, Sec. 6.3] for details).

  • DG-methods based on classical piecewise polynomials on affine triangulations (consisting of simplices) satisfy (3.14) automatically as \(\langle x,\nabla _{\fancyscript{T}}w_{S}\rangle \in S\). The proof is closely related to the arguments presented in [19–21]. Indeed, the key observation in these references is that, for given \(u\in S\), elementwise defined test functions of the form \(u\) and \(x\cdot \nabla u\) or, more generally, \(\alpha (x-x_{\varOmega })\cdot \nabla u+\beta u\) (for constants \(\alpha \), \(\beta \) and a chosen point \(x_{\varOmega }\)) are useful to provide stability and error estimates. \(\square \)
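The inclusion \(\langle x,\nabla _{\fancyscript{T}}w_{S}\rangle \in S\) for piecewise polynomials on affine meshes can be illustrated on monomials. The following sketch for \(d=2\) is an elementary check and is not taken from the cited references:

$$\begin{aligned} \left\langle x,\nabla \left( x_{1}^{a}x_{2}^{b}\right) \right\rangle =x_{1}\cdot a\,x_{1}^{a-1}x_{2}^{b}+x_{2}\cdot b\,x_{1}^{a}x_{2}^{b-1}=(a+b)\,x_{1}^{a}x_{2}^{b}. \end{aligned}$$

Hence \(x\cdot \nabla \) maps polynomials of degree at most \(p\) to polynomials of degree at most \(p\). On an affine element, a function is a polynomial of degree at most \(p\) in the reference coordinates if and only if it is one in the physical coordinates, so the elementwise defined function \(\langle x,\nabla _{\fancyscript{T}}w_{S}\rangle \) remains in the piecewise polynomial space. The infimum in (3.14) is then attained with value zero, i.e., \(C_{S}=0\).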

For new generalized finite element spaces, it might be complicated to verify condition (3.14). In the following theorem, we present a different criterion which also implies discrete stability.

Theorem 3.10

Let the exact solution of (1.3) satisfy \(u\in H^{3/2+{\varepsilon }}(\varOmega )\) and let the adjoint Helmholtz problem be \(H^{3/2+{\varepsilon }}(\varOmega )\) regular for some \({\varepsilon }>0\). Assume that the coefficients in the definition of \(a_{\fancyscript{T}}(\cdot ,\cdot )\) satisfy \(0<\delta <1/3\) and (2.7). If the condition

$$\begin{aligned} \eta _{k}(S)<\frac{c_{\mathrm{coer}}}{4(1+C_{c})} \end{aligned}$$

holds, then the DG problem (2.4) has a unique solution \(u_{S}\in S\) and satisfies the quasi-optimality property (3.12).

Proof

The proof follows the lines in [33, Thm. 3.9]. We merely have to show existence of \(u_{S}\). Since (2.4) corresponds to a linear system of equations, it suffices to show uniqueness. Therefore, let \(u_{S}\in S\) be in the kernel of the discrete operator, i.e., \(a_{\fancyscript{T}}(u_{S},v)-k^{2}( u_{S},v)=0\) for all \(v\in S\). Then the pair \((0,u_{S}) \in H^{3/2+\varepsilon }(\varOmega ) \times S\) satisfies the orthogonality condition (3.8). Hence, Theorem 3.6 implies \(\Vert 0-u_{S}\Vert _{DG}\le C\inf _{v\in S} \Vert 0-v\Vert _{DG^{+}}=0\), which shows \(u_{S}=0\). Again, the quasi-optimality follows as a combination of Theorem 3.6 and Lemma 2.4. \(\square \)

4 Application to Polynomial \(hp\)-Finite Elements

Theorem 3.6 provides a quasi-optimal error estimate for abstract approximation spaces \(S\) that satisfy the conditions (2.3) and (3.14). The concrete choice of the space \(S\) enters the analysis via (a) the constant \(C_{\mathrm{trace}}(S,K)\), (b) the estimate of the approximation error \(\inf _{v\in S}\Vert u-v\Vert _{DG^{+}}\), (c) the adjoint approximation property \(\eta _{k}(S)\), and (d) the constant \(C_{S}\) in (3.14). As explained in Remark 3.9, the condition on \(C_{S}\) is “automatically” satisfied for polynomial \(hp\)-finite element spaces if affine meshes are considered. The focus in the present section is on non-affine meshes, so that the stability of the DG method will be inferred from the condition on the adjoint approximability as discussed in Theorem 3.10. Our primary reason for considering curved elements is that our regularity theory for Helmholtz problems (see Theorem 4.5) is done for smooth (more precisely: analytic) geometries. In this setting, we derive estimates for these quantities in the context of polynomial \(hp\)-finite element spaces which are explicit with respect to the polynomial degree \(p\) and the mesh size \(h\).

4.1 Preliminaries

We consider a partition of the domain \(\varOmega \) into “simplicial” elements. That is, the finite element mesh \(\fancyscript{T}\) consists of elements \(K\) that are the images of the reference element \(\widehat{K}\), i.e., the reference triangle (in 2D) or the reference tetrahedron (in 3D), under the element map \(F_{K}:\widehat{K}\rightarrow K\). The mesh width is denoted by \(h:=\max _{K\in \fancyscript{T}}\mathrm{diam}K\) [cf. (2.1)].

We use the symbol \(\nabla ^{n}\) to denote derivatives of order \(n\); more precisely, for a function \(u:\varOmega \rightarrow \mathbb R ,\varOmega \subset \mathbb R ^{d}\), we set

$$\begin{aligned} |\nabla ^{n}u(x)|^{2}=\sum _{\alpha \in \mathbb N _{0}^{d}:\, |\alpha |=n}\frac{n!}{\alpha !}|D^{\alpha }u(x)|^{2}. \end{aligned}$$
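As a brief illustration of this definition (a worked instance, not part of the original text), consider \(d=2\) and \(n=2\). The multi-indices with \(|\alpha |=2\) are \((2,0)\), \((1,1)\), \((0,2)\) with weights \(n!/\alpha !=1,2,1\), so that

$$\begin{aligned} |\nabla ^{2}u(x)|^{2}=|\partial _{1}^{2}u(x)|^{2}+2\,|\partial _{1}\partial _{2}u(x)|^{2}+|\partial _{2}^{2}u(x)|^{2}, \end{aligned}$$

which is the squared Frobenius norm of the Hessian of \(u\) at \(x\). For \(n=1\) the definition reduces to the squared Euclidean norm of the gradient. In general, the weight \(n!/\alpha !\) counts the number of orderings of the partial derivatives that yield \(D^{\alpha }u\).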

We will need some conditions on the element maps \(F_{K}\) of the triangulations in order to capture the approximation properties of the polynomial \(hp\)-FEM spaces. The following assumption will make this more precise. We emphasize that, in contrast to the case of \(H^1(\varOmega )\)-conforming subspaces, we do not require in the present context of DG-methods a “compatibility” condition for element maps of neighboring elements.

Assumption 4.1

(“simplicial” finite element mesh). Each element map \(F_{K}\) can be written as \(F_{K}=R_{K}\circ B_{K}\), where \(B_{K}\) is an affine map (containing the scaling by \(h_{K}\)) and \(R_{K}\) is analytic. Let \(\widetilde{K}:=B_{K}(\widehat{K})\). The maps \(R_{K}\) and \(B_{K}\) satisfy for shape regularity constants \(C_{\mathrm{affine}},C_{\mathrm{metric} },\gamma >0\) independent of \(h\):

$$\begin{aligned}&\Vert B_{K}^{\prime }\Vert _{L^{\infty }\left( \widehat{K}\right) }\le C_{\mathrm{affine}}h_{K},\quad \quad \Vert (B_{K}^{\prime })^{-1}\Vert _{L^{\infty }\left( \widehat{K}\right) }\le C_{\mathrm{affine}}h_{K}^{-1}\\&\Vert (R_{K}^{\prime })^{-1}\Vert _{L^{\infty }(\widetilde{K})}\le C_{\mathrm{metric}},\quad \quad \Vert \nabla ^{n}R_{K}\Vert _{L^{\infty } (\widetilde{K})}\le C_{\mathrm{metric}}\gamma ^{n}n!\quad \forall n\in \mathbb N _{0}. \end{aligned}$$

Remark 4.2

If the mappings \(R_{K}\) in Assumption 4.1 are affine we say that \(\fancyscript{T}\) is an affine triangulation.

The constants \(C\) in the estimates below may depend on the shape regularity constants in a continuous way and, possibly, increase with increasing values of \(C_{\mathrm{affine}}\), \(C_{\mathrm{metric}}\), and \(\gamma \). \(\square \)

In this paper we allow non-conforming meshes with general interfaces, i.e., one mesh can be a submesh of the other one, or meshes can have entirely unmatched interfaces.

For meshes \(\fancyscript{T}\) satisfying Assumption 4.1 we define the following non-conforming space of piecewise (mapped) polynomials by

$$\begin{aligned} S^{p,0}({\fancyscript{T}}):=\{u\in L^{2}(\varOmega )|\quad \forall K\in \fancyscript{T}:\,u|_{K}\circ F_{K}\in \fancyscript{P}_{p}\}, \end{aligned}$$

where \(\fancyscript{P}_{p}\) denotes the space of polynomials of degree at most \(p\). The mesh size function \(h_{\fancyscript{T}}\) is defined by \(h_{\fancyscript{T} }|_{K}:=\text{ diam }\,K\) for all \(K\in \fancyscript{T}\). The estimate of \(C_{\mathrm{trace}}(S,K)\) in these cases is a local trace estimate for multivariate polynomials:

Lemma 4.3

Let \({\fancyscript{T}}\) satisfy Assumption 4.1. Then there exists \(c_{\mathrm{inv}}>0\) independent of \(K\in {\fancyscript{T}}\) and \(p\) such that for the polynomial \(hp\)-finite element space \(S^{p,0} ({\fancyscript{T}})\) we have [cf. (2.6)]

$$\begin{aligned} C_{\mathrm{trace}}\left( S,K\right) \le \frac{c_{\mathrm{inv} }p}{\sqrt{h_{K}}}. \end{aligned}$$

Furthermore, for

$$\begin{aligned} \mathfrak a >\frac{4}{3}c_{\mathrm{inv}}^{2}, \end{aligned}$$
(4.1)

which is independent of \(K\), \(p\), and \(k\), the choice of \(\alpha \) given in (2.8) implies the condition (2.7).

Proof

We merely prove the inverse estimate. On the reference element \(\widehat{K}\), we have with the multiplicative trace inequality and a standard polynomial inverse estimate (see, e.g., [42, Thm. 4.76], where the case \(d=2\) is covered) for any \(v\in {\fancyscript{P}}_{p}\)

$$\begin{aligned} \Vert v\Vert _{L^{2}(\partial \widehat{K})}^{2}\le C\Vert v\Vert _{L^{2} (\widehat{K})}\Vert v\Vert _{H^{1}(\widehat{K})}\le Cp^{2}\Vert v\Vert _{L^{2}(\widehat{K})}^{2}. \end{aligned}$$

The assumptions on the element maps \(F_{K}\) are such that the same \(h\)-dependence as in the classical scaling argument is obtained, i.e., for \(v\in S^{p,0}({\fancyscript{T}})\) we get for each \(K\in {\fancyscript{T}}\)

$$\begin{aligned} \Vert v\Vert _{L^{2}(\partial K)}\le Cph^{-1/2}\Vert v\Vert _{L^{2}(K)}. \end{aligned}$$
(4.2)
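The scaling step behind (4.2) can be sketched as follows, using the relations of Lemma 4.7 for the \(L^{2}\)-norms on \(K\) and \(\partial K\) together with the reference-element estimate above; here \(\widehat{v}:=v|_{K}\circ F_{K}\) and \(h\) stands for the local element size:

$$\begin{aligned} \Vert v\Vert _{L^{2}(\partial K)}\le Ch^{(d-1)/2}\Vert \widehat{v}\Vert _{L^{2}(\partial \widehat{K})}\le Cp\,h^{(d-1)/2}\Vert \widehat{v}\Vert _{L^{2}(\widehat{K})}\le Cp\,h^{(d-1)/2}h^{-d/2}\Vert v\Vert _{L^{2}(K)}=Cp\,h^{-1/2}\Vert v\Vert _{L^{2}(K)}. \end{aligned}$$

The constant \(C\) changes from step to step and depends only on the constants of Assumption 4.1.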

For the actual estimate of interest, we let \(v\in S^{p,0}({\fancyscript{T}})\), fix \(K\), and set \(\widehat{v}:=v|_{K}\circ F_{K}\). We note \(\nabla v=(\nabla \widehat{v})\circ F_{K}\circ (F_{K}^{\prime })^{-1}\) with, by the assumptions on the properties of \(B_{K}\) and \(R_{K}\),

$$\begin{aligned} \Vert (F_{K}^{\prime })^{-1}\Vert _{L^{\infty }(\widehat{K})}\le Ch_{K} ^{-1},\quad \Vert (F_{K}^{\prime })\Vert _{L^{\infty }(\widehat{K})}\le Ch_{K}. \end{aligned}$$
(4.3)

Applying the estimate (4.2) to the components of \(\nabla \widehat{v}\circ F_{K}\) and observing (4.3), one can show the desired result. \(\square \)

The trace inequality of Lemma 4.3 shows that the constant \(\mathfrak a \) in (2.8) can be selected such that (2.7) is satisfied. This observation implies the following result:

Theorem 4.4

Let \(\alpha \), \(\beta \), and \(\delta \) be chosen according to (2.8) with \(\mathfrak a \) sufficiently large. Let \(S=S^{p,0}({\fancyscript{T}})\) be the polynomial \(hp\)-finite element space based on a mesh \(\fancyscript{T}\) that satisfies Assumption 4.1.

  • If \(C_{S}\) satisfies condition (3.14) then the DG problem has a unique solution in \(S\).

  • If \(\fancyscript{T}\) is an affine triangulation of \(\varOmega \) and satisfies Assumption 4.1, then the DG problem has a unique solution in \(S\).

4.2 Convergence Analysis

In this section we will show that the solution \(u\) of the model boundary value problem (1.1), (1.2) can be approximated from the finite element space \(S^{p,0}(\fancyscript{T})\) provided that \(kh/\sqrt{p}\) is small enough and \(p\ge c\log k\) (with \(c\) sufficiently large independent of \(h\), \(k\), \(p\)). Under more stringent conditions on the mesh, we will show that this condition can be relaxed to the condition that \(kh/p\) be small enough and \(p\ge c\log k\).

The proof of this approximation property is based on the following decomposition lemma, which is a generalization of [37, Theorem 4.10], where the special case \(s = 0\) is covered:

Theorem 4.5

(Decomposition Lemma) Let \(\varOmega \subset \mathbb R ^{d} \), \(d\in \{2,3\}\) be a bounded Lipschitz domain. Assume additionally that \(\varOmega \) has an analytic boundary. Assume furthermore that the solution operator \((f,g)\mapsto u:=S_{k}(f,g)\) for the Helmholtz boundary value problem (1.1), (1.2) satisfies

$$\begin{aligned} \Vert u\Vert _{\fancyscript{H},\varOmega }\le C_{\mathrm{stab}}k^{\vartheta }\left( \Vert f\Vert _{L^{2}(\varOmega )}+\Vert g\Vert _{L^{2}(\partial \varOmega )}\right) \end{aligned}$$
(4.4)

for some \(C_{\mathrm{stab}}\) and \(\vartheta \ge 0\) independent of \(k\). Fix \(s\in \mathbb{N }_{0}\). Then there exist constants \(C\), \(\lambda >0\) independent of \(k\ge k_{0}\) such that for every \(f\in H^{s}(\varOmega )\) and \(g\in H^{s+1/2}(\partial \varOmega )\) the solution \(u=S_{k}(f,g)\) of the Helmholtz problem (1.3) can be written as \(u=u_{H^{s+2}}+u_{\fancyscript{A} }\), where, for all \(n\in \mathbb N _{0}\)

$$\begin{aligned} \Vert u_{\fancyscript{A}}\Vert _{\fancyscript{H},\varOmega }&\le Ck^{\vartheta } \left( \Vert f\Vert _{L^{2}(\varOmega )}+\Vert g\Vert _{H^{1/2}(\partial \varOmega )}\right) , \end{aligned}$$
(4.5)
$$\begin{aligned} \Vert \nabla ^{n+2}u_{\fancyscript{A}}\Vert _{L^{2}(\varOmega )}&\le C\lambda ^{n}k^{\vartheta -1}\max \{n,k\}^{n+2}\left( \Vert f\Vert _{L^{2} (\varOmega )}+\Vert g\Vert _{H^{1/2}(\partial \varOmega )}\right) , \end{aligned}$$
(4.6)
$$\begin{aligned} \Vert u_{H^{s+2}}\Vert _{H^{s+2}(\varOmega )}+ k^{s+2} \Vert u_{H^{s+2}} \Vert _{L^{2}(\varOmega )}&\le C\left( \Vert f\Vert _{H^{s}(\varOmega )}+\Vert g\Vert _{H^{s+1/2}(\partial \varOmega )}\right) . \end{aligned}$$
(4.7)

Proof

The proof follows the lines of [37, Theorem 4.10]. The key modifications are collected in Appendix 1. \(\square \)

Remark 4.6

For the present model problem (1.1), (1.2) the assumption (4.4) holds with \(\vartheta = 5/2\) by [15, Thm. 2.4]. For star-shaped domains, \(\vartheta = 0\) is possible as shown in [34, Prop. 8.1.4] for \(d=2\) and subsequently for \(d=3\) in [10]. \(\square \)

4.2.1 Convergence Analysis for General Non-conforming Polynomial \(hp\)-Finite Elements

In this section we consider general non-conforming polynomial \(hp\)-finite elements, where no interelement compatibility conditions are imposed on the element maps \(F_{K}\) that relate element maps of neighboring elements to each other. Hence, the conforming subspace \(S \cap H^{1}(\varOmega ) \subset S\) may be small. As we will discuss in more detail in Sect. 5 below, better results can be expected if the conforming subspace \(S \cap H^{1}(\varOmega ) \subset S\) is sufficiently rich.

We start with a lemma that takes the role of the standard scaling argument:

Lemma 4.7

Let \({\fancyscript{T}}\) be a shape-regular mesh in the sense of Assumption 4.1. Fix \(s\in \mathbb{N }_{0}\). Then for each \(K\in {\fancyscript{T}}\) and every sufficiently smooth \(v\) the following relations between \(v\) and \(\widehat{v}:=v|_{K}\circ F_{K}\) are true:

$$\begin{aligned} \Vert v\Vert _{L^{2}(K)}&\sim h^{d/2}\Vert \widehat{v}\Vert _{L^{2} (\widehat{K})},\\ \Vert \nabla v\Vert _{L^{2}(K)}&\sim h^{d/2-1} \Vert \nabla \widehat{v}\Vert _{L^{2} (\widehat{K})},\\ \Vert \nabla ^{s+2}\widehat{v}\Vert _{L^{2}(\widehat{K})}&\le Ch^{s+2-d/2}\Vert v\Vert _{H^{s+2}(K)},\\ \Vert v\Vert _{L^{2}(\partial K)}&\sim h^{(d-1)/2}\Vert \widehat{v}\Vert _{L^{2}(\partial \widehat{K})},\\ \Vert \nabla v\Vert _{L^{2}(\partial K)}&\sim h^{(d-1)/2-1}\Vert \nabla \widehat{v}\Vert _{L^{2} (\partial \widehat{K})}, \end{aligned}$$

where \(C\) and the implied constants depend solely on the constants appearing in Assumption 4.1.

Proof

We will only consider the case of the \((s+2)\)nd derivatives. We note the form \(F_{K}=R_{K}\circ A_{K}\), where \(A_{K}\) is affine. This implies the estimates

$$\begin{aligned} \Vert F_{K}^{\prime }\Vert _{L^{\infty }(\widehat{K})}\le Ch_{K},\quad \sum _{\alpha \in \mathbb{N }_{0}^{d}:|\alpha |=s+2}\Vert D^{\alpha }F_{K} \Vert _{L^{\infty }(\widehat{K})}\le Ch_{K}^{s+2}, \end{aligned}$$

where the constants depend only on the constants appearing in Assumption 4.1. The chain rule then implies the estimates for \(\Vert \nabla ^{s+2}\widehat{v}\Vert _{L^{2}(\widehat{K})}\). \(\square \)

For shape-regular triangulations (cf. Assumption 4.1) we have the following result:

Theorem 4.8

Let \(\varOmega \subset \mathbb R ^{d}\), \(d\in \{2,3\}\), be a bounded Lipschitz domain with analytic boundary. Let the mesh \({\fancyscript{T}}\) be shape-regular in the sense of Assumption 4.1. Fix \(s \in \mathbb{N }_{0}\). Let \(\alpha \), \(\beta \), \(\delta \) be chosen according to (2.8). Fix \(\overline{C} > 0\) and assume \(p \ge s+1\) as well as \(kh/p \le \overline{C}\). Then there exist constants \(C\), \(\sigma >0\) independent of \(h\), \(k\), and \(p\) such that, for every \(f\in H^{s}(\varOmega )\) and \(g\in H^{s+1/2}(\partial \varOmega )\), there holds

$$\begin{aligned} \inf _{v \in S} k \Vert u-v\Vert _{DG^{+}}\le C_{f,g}\left( \left( \frac{h}{p}\right) ^{s} \frac{kh}{\sqrt{p}} + k^{\vartheta }\left\{ \left( \frac{h}{h+\sigma }\right) ^{p}+k \left( \frac{kh}{\sigma p}\right) ^{p}\right\} \right) , \end{aligned}$$
(4.8)

where \(C_{f,g}:=\Vert f\Vert _{H^{s}\left( \varOmega \right) }+\Vert g\Vert _{H^{s+1/2}(\partial \varOmega ) }\) and \(\vartheta \ge 0\) is given by (4.4) (note also Remark 4.6).

Proof

We employ the splitting \(u=u_{H^{s+2}}+u_{\fancyscript{A}}\) of Theorem 4.5 with \(u_{H^{s+2}}\in H^{s+2}(\varOmega )\) and the analytic part \(u_{\fancyscript{A}}\).

Following [36, Thm. 5.5], we approximate \(u_{H^{s+2}}\) and \(u_{\fancyscript{A}}\) separately in the ensuing steps 1 and 2.

Step 1: From, e.g., [36, Lemma B.3], we know that for every \(s^{\prime }>d/2\) and every \(p\ge s^{\prime }-1\) there exists a bounded linear operator \(\pi _{p}:H^{s^{\prime }} (\widehat{K})\rightarrow \fancyscript{P}_{p}\) such that

$$\begin{aligned} \Vert u-\pi _{p}u\Vert _{H^{t}(\widehat{K})}&\le Cp^{-(s^{\prime } -t)}|u|_{H^{s^{\prime }}(\widehat{K})}\quad \text{ for }\quad 0\le t\le s^{\prime },\end{aligned}$$
(4.9)
$$\begin{aligned} \Vert u-\pi _{p}u\Vert _{H^{t}(\widehat{e})}&\le Cp^{-(s^{\prime } -1/2-t)}|u|_{H^{s^{\prime }} (\widehat{K})}\quad \text{ for }\quad 0\le t\le s^{\prime }-1/2. \end{aligned}$$
(4.10)

Here, the constant \(C>0\) depends only on \(s^{\prime }\). By \(\widehat{K}\) we denote the reference element and by \(\widehat{e}\) one of its edges (in 2D) or faces (in 3D). We apply this approximation result with \(s^{\prime }=s+2\). The elementwise application of the operator \(\pi _{p}\) to \(u_{H^{s+2}}\) (pulled back to the reference element \(\widehat{K}\)) defines an approximation \(w_{H^{s+2}}\in S^{p,0}({\fancyscript{T}})\). By a scaling argument (cf. Lemma 4.7) and summation over all elements, the bound (4.9) with \(s^{\prime }=s+2\) implies that \(w_{H^{s+2}}\) satisfies

$$\begin{aligned}&k\left( k\Vert u_{H^{s+2}}-w_{H^{s+2}} \Vert _{L^{2}(\varOmega )} +\Vert \nabla _{\fancyscript{T}}(u_{H^{s+2}}-w_{H^{s+2}})\Vert _{L^{2}(\varOmega )}\right) \\&\quad \le C\left( k\left( \frac{h}{p}\right) ^{s+1}+k^{2}\left( \frac{h}{p}\right) ^{s+2}\right) \left( \Vert f\Vert _{H^{s}(\varOmega )}+\Vert g\Vert _{H^{s+1/2}(\partial \varOmega )}\right) . \end{aligned}$$

In order to estimate the terms of the \(DG^{+}\)-norm associated with the skeleton, we employ the choice of the parameters \(\alpha \), \(\beta \), \(\delta \) given in (2.8), viz.,

$$\begin{aligned} \alpha \left( x\right) =\frac{4}{3}\max _{K\in \left\{ K_{x}^{+},K_{x} ^{-}\right\} }\frac{p^{2}}{kh_{K}}\quad \quad \forall x\in \mathfrak S _{\fancyscript{T}}^{\fancyscript{I}}\quad \text{ and }\quad \beta =O\left( \frac{kh}{p} \right) ,\quad \delta =O\left( \frac{kh}{p}\right) .\qquad \quad \end{aligned}$$
(4.11)

Recall the definition of \(\alpha _{\partial K}^{\min }\) as in Remark 2.3 and estimate (3.4). On the inner skeleton \(\mathfrak S _{\fancyscript{T}}^{I}\) we get

$$\begin{aligned} k\Vert \alpha ^{-1/2}\{\nabla _{\fancyscript{T}}(u_{H^{s+2}}-w_{H^{s+2}} )\}\Vert _{L^{2}(\mathfrak S _{\fancyscript{T}}^{I})}^{2} \le \sum _{K\in \fancyscript{T} }\frac{k}{\alpha _{\partial K}^{\min }}\Vert \{\nabla _{\fancyscript{T}}(u_{H^{s+2} }-w_{H^{s+2}})\}\Vert _{L^{2}(\varOmega \cap \partial K)}^{2}. \end{aligned}$$

Let \(X\) denote the minimizer as in (3.4). Then, with the definition (4.11) we get

$$\begin{aligned} \alpha _{\partial K}^{\min }=\alpha \left( X\right) =\frac{4}{3}\max _{K^{\prime }\in \left\{ K_{X}^{+},K_{X}^{-}\right\} }\frac{p^{2} }{kh_{K^{\prime }}}\ge \frac{4}{3}\frac{p^{2}}{kh_{K}} \end{aligned}$$
(4.12)

so that

$$\begin{aligned}&k\Vert \alpha ^{-1/2}\{\nabla _{\fancyscript{T}}(u_{H^{s+2}}-w_{H^{s+2}} )\}\Vert _{L^{2}(\mathfrak S _{\fancyscript{T}}^{I})}^{2}\\&\quad \le \sum _{K\in \fancyscript{T}}\frac{3k^{2}h_{K}}{4p^{2}}\Vert \nabla (\left. \left( u_{H^{s+2}}-w_{H^{s+2}}\right) \right| _{K} )\Vert _{L^{2}(\varOmega \cap \partial K)}^{2}. \end{aligned}$$

Thus, we get by scaling (4.9), (4.10) to the mesh \(\fancyscript{T}\)

$$\begin{aligned}&k\Vert \alpha ^{-1/2}\{\nabla _{\fancyscript{T}}(u_{H^{s+2}}-w_{H^{s+2}} )\}\Vert _{L^{2}(\mathfrak S _{\fancyscript{T}}^{I})}^{2}\le C\sum _{K\in \fancyscript{T}}\frac{k^{2}h}{p^{2}}\left( \frac{h_{K}}{p}\right) ^{2s+1}\Vert u_{H^{s+2}}\Vert _{H^{s+2}(K)}^{2}\\&\quad \le C\frac{k^{2}}{p}\left( \frac{h}{p}\right) ^{2s+2}\Vert u_{H^{s+2}}\Vert _{H^{s+2}(\varOmega )}^{2}\le C\frac{k^{2}}{p}\left( \frac{h}{p}\right) ^{2s+2}\left( \Vert f\Vert _{H^{s}(\varOmega )}^{2}+\Vert g\Vert _{H^{s+1/2}(\partial \varOmega )}^{2}\right) . \end{aligned}$$

The following estimates can be obtained by similar arguments:

$$\begin{aligned} k^{1/2}\Vert \beta ^{1/2}[\![\nabla _{\fancyscript{T}}(u_{H^{s+2} }\!-\!w_{H^{s+2}})]\!]_{N}\Vert _{L^{2}\left( \mathfrak S _{\fancyscript{T} }^{\fancyscript{I}}\right) }&\le Ck\left( \frac{h}{p}\right) ^{s+1}\left( \Vert f\Vert _{H^{s}(\varOmega )}+\Vert g\Vert _{H^{s+1/2}(\partial \varOmega )}\right) ,\\ k^{3/2}\Vert \alpha ^{1/2}[\![u_{H^{s+2}}\!-\!w_{H^{s+2}}]\!]_{N} \Vert _{L^{2}\left( \mathfrak S _{\fancyscript{T}}^{\fancyscript{I}}\right) }&\le Ck\sqrt{p} \left( \frac{h}{p}\right) ^{s+1}\left( \Vert f\Vert _{H^{s} (\varOmega )}\!+\!\Vert g\Vert _{H^{s+1/2}(\partial \varOmega )}\right) ,\\ k^{1/2}\Vert \delta ^{1/2}\nabla _{\fancyscript{T}}(u_{H^{s+2}}-w_{H^{s+2}} )\cdot \mathbf{n}\Vert _{L^{2}\left( \mathfrak S _{ \fancyscript{T}}^{\fancyscript{B} }\right) }&\le Ck\left( \frac{h}{p}\right) ^{s+1}\left( \Vert f\Vert _{H^{s}(\varOmega )}+\Vert g\Vert _{H^{s+1/2}(\partial \varOmega )}\right) ,\\ k^{3/2}\Vert (1\!-\!\delta )^{1/2}(u_{H^{s+2}}-w_{H^{s+2}})\Vert _{L^{2}\left( \mathfrak S _{\fancyscript{T}}^{\fancyscript{B}}\right) }&\le Ck^{3/2}\left( \frac{h}{p}\right) ^{s+3/2}\left( \Vert f\Vert _{H^{s}(\varOmega )}\!+\!\Vert g\Vert _{H^{s+1/2}(\partial \varOmega )}\right) . \end{aligned}$$

In total, we get the following approximation property for the \(H^{s+2}\)-part:

$$\begin{aligned}&k\Vert u_{H^{s+2}}-w_{H^{s+2}}\Vert _{DG^{+}}\\&\quad \le C\left( \frac{h}{p}\right) ^{s}\left( \frac{kh}{\sqrt{p}}+\left( \frac{kh}{p}\right) ^{3/2}+\left( \frac{kh}{p}\right) ^{2}\right) \left( \Vert f\Vert _{H^{s}(\varOmega )}+\Vert g\Vert _{H^{s+1/2}(\partial \varOmega )}\right) . \end{aligned}$$

Using the assumption \(kh/p\le \overline{C}\), this can be simplified to

$$\begin{aligned} k\Vert u_{H^{s+2}}-w_{H^{s+2}}\Vert _{DG^{+}}\le C\left( \frac{h}{p}\right) ^{s}\frac{kh}{\sqrt{p}}\left( \Vert f\Vert _{H^{s}(\varOmega )}+\Vert g\Vert _{H^{s+1/2}(\partial \varOmega )}\right) . \end{aligned}$$

Step 2: For the approximation of the analytic part \(u_{\fancyscript{A}}\), we construct an element \(w_{\fancyscript{A}}\in S^{p,0}({\fancyscript{T}})\) as follows. For each \(K\in \fancyscript{T}\), let the constant \(C_{K}\) be defined by

$$\begin{aligned} C_{K}^{2}:=\sum _{n\in \mathbb N _{0}}\frac{\Vert \nabla ^{n}u_{\fancyscript{A}} \Vert _{L^{2}(K)}^{2}}{(2\lambda \max \left\{ n,k\right\} )^{2n}}. \end{aligned}$$

Then, we have

$$\begin{aligned} \Vert \nabla ^{n}u_{\fancyscript{A}}\Vert _{L^{2}\left( K\right) }&\le (2\lambda \max \left\{ n,k\right\} )^{n}C_{K}\quad \forall n\in \mathbb N _{0},\nonumber \\ \sum _{K\in \fancyscript{T}}C_{K}^{2}&\le C\left( \frac{1}{\lambda k}\right) ^{2}k^{2\vartheta }\left( \Vert f\Vert _{L^{2}(\varOmega )}^{2}+ \Vert g\Vert _{H^{1/2}(\partial \varOmega )}^{2}\right) . \end{aligned}$$
(4.13)

For \(q\in \{0,1,2\}\) we get the following estimate (see [36, Proof of Theorem 5.5]) for suitable \(\sigma >0\):

$$\begin{aligned} \Vert u_{\fancyscript{A}}-w_{\fancyscript{A}}\Vert _{H^{q}(K)}\le Ch_{K}^{-q} C_{K}\left\{ \left( \frac{h_{K}}{h_{K}+\sigma }\right) ^{p+1}+\left( \frac{kh_{K}}{\sigma p}\right) ^{p+1}\right\} . \end{aligned}$$
(4.14)

It is convenient to define the abbreviations:

$$\begin{aligned} E(\sigma )&:= \left( \frac{h}{h+\sigma }\right) ^{p}+k\left( \frac{kh}{\sigma p}\right) ^{p},\\ M&:= k^{\vartheta }\left( \Vert f\Vert _{L^{2}(\varOmega )}+\Vert g\Vert _{H^{1/2}(\partial \varOmega )}\right) . \end{aligned}$$

By summing over all elements, it follows as in [36] by suitably adjusting the constant \(\sigma \)

$$\begin{aligned} k\Vert u_{\fancyscript{A}}-w_{\fancyscript{A}}\Vert _{\fancyscript{H}}\le C\left( \frac{1}{p}+\frac{kh}{p}\right) E(\sigma )M. \end{aligned}$$
(4.15)

In order to treat the terms associated with the skeleton \(\mathfrak S _{\fancyscript{T}}^{\fancyscript{I}}\cup \mathfrak S _{\fancyscript{T}}^{\fancyscript{B}}\) we use the multiplicative trace inequality (on \(\widehat{K}\) and Lemma 4.7)

$$\begin{aligned} \Vert v\Vert _{L^{2}(\partial K)}^{2}\le C\left( \Vert v\Vert _{L^{2} (K)}|v|_{H^{1}(K)}+h_{K}^{-1}\Vert v\Vert _{L^{2}(K)}^{2}\right) \end{aligned}$$

to obtain

$$\begin{aligned} k\Vert \alpha ^{-1/2}\{\nabla _{\fancyscript{T}}(u_{\fancyscript{A}}- w_{\fancyscript{A} })\}\Vert _{L^{2}(\mathfrak S _{\fancyscript{T}}^{I})}^{2} \le \sum _{K\in \fancyscript{T}}\frac{k}{\alpha _{\partial K}^{\min }}\Vert \nabla _{\fancyscript{T} }(\left. \left( u_{\fancyscript{A}}-w_{\fancyscript{A}}\right) \right| _{K})\Vert _{L^{2}(\varOmega \cap \partial K)}^{2}. \end{aligned}$$

By using the estimate (4.12) we obtain

$$\begin{aligned}&k\left\| \alpha ^{-1/2}\{\nabla _{\fancyscript{T}}(u_{\fancyscript{A} }-w_{\fancyscript{A}})\}\right\| _{L^{2}(\mathfrak S _{\fancyscript{T}}^{I})}^{2}\\&\le \sum _{K\in \fancyscript{T}}\frac{3k^{2}h_{K}}{4p^{2}}\left\| \nabla (\left. \left( u_{\fancyscript{A}}-w_{\fancyscript{A}}\right) \right| _{K})\right\| _{L^{2}(\varOmega \cap \partial K)}^{2}\\&\le \sum _{K\in \fancyscript{T}}\frac{3}{4}\left( \frac{k^{2}h_{K}}{p^{2} }\right) \left( \left\| \nabla \left( u_{\fancyscript{A}}-w_{\fancyscript{A} }\right) \right\| _{L^{2}\left( K\right) }\left| \nabla \left( u_{\fancyscript{A}}-w_{\fancyscript{A}}\right) \right| _{H^{1}\left( K\right) }\right. \\&\left. +h_{K}^{-1}\left\| \nabla \left( u_{\fancyscript{A}}-w_{\fancyscript{A}}\right) \right\| _{L^{2}\left( K\right) }^{2}\right) . \end{aligned}$$

By using the estimates in Eq. (4.14) we get

$$\begin{aligned} k\Vert \alpha ^{-1/2}\{\nabla _{\fancyscript{T}}(u_{\fancyscript{A}}\!-\! w_{\fancyscript{A} })\}\Vert _{L^{2}(\mathfrak S _{\fancyscript{T}}^{I})}^{2} \!\le \!\sum _{K\in \fancyscript{T}}\frac{3Ck^{2}}{4p^{2}}\left\{ h_{K}\left( \frac{h_{K}}{h_{K}\!+\!\sigma }\right) ^{p-1}\!+\!\frac{k}{p}\left( \frac{kh_{K}}{\sigma p}\right) ^{p}\right\} ^{2}C_{K}^{2}. \end{aligned}$$

Finally Eq. (4.13) gives us after suitably adjusting the constant \(\sigma \)

$$\begin{aligned} k^{1/2}\Vert \alpha ^{-1/2}\{\nabla _{\fancyscript{T}}(u_{\fancyscript{A}}- w_{\fancyscript{A} })\}\Vert _{L^{2}\left( \mathfrak S _{\fancyscript{T}}^{\fancyscript{I}}\right) }\le C\frac{1}{p^{2}}E(\sigma )M. \end{aligned}$$

By similar arguments we obtain the following estimates:

$$\begin{aligned} k^{1/2}\Vert \beta ^{1/2}[\![\nabla _{\fancyscript{T}} (u_{\fancyscript{A} }-w_{\fancyscript{A}})]\!]_{N}\Vert _{L^{2}\left( \mathfrak S _{\fancyscript{T}}^{\fancyscript{I}}\right) }&\le C \frac{1}{p^{3/2}} E(\sigma ) M,\\ k^{3/2}\Vert \alpha ^{1/2}[\![u_{\fancyscript{A}}-w_{\fancyscript{A}} ]\!]_{N}\Vert _{L^{2}\left( \mathfrak S _{\fancyscript{T}}^{\fancyscript{I} }\right) }&\le C E(\sigma ) M,\\ k^{1/2}\Vert \delta ^{1/2}\nabla _{\fancyscript{T}}(u_{\fancyscript{A}}- w_{\fancyscript{A} })\cdot \mathbf{n}\Vert _{L^{2}\left( \mathfrak S _{\fancyscript{T}}^{\fancyscript{B} }\right) }&\le C\frac{1}{p^{3/2}} E(\sigma ) M,\\ k^{3/2}\Vert (1-\delta )^{1/2}(u_{\fancyscript{A}}-w_{\fancyscript{A}})\Vert _{L^{2}\left( \mathfrak S _{\fancyscript{T}}^{\fancyscript{B}}\right) }&\le C\frac{(kh)^{1/2}}{p} E(\sigma ) M. \end{aligned}$$

The approximation property for the analytic part \(u_{\fancyscript{A}}\) with respect to the \(DG^{+}\) norm is then

$$\begin{aligned} k\Vert u_{\fancyscript{A}}-w_{\fancyscript{A}}\Vert _{DG^{+}}\le C&\left( 1 + \frac{1}{p} + \frac{kh}{p} + \frac{\sqrt{kh}}{p}\right) E(\sigma ) M \le C E(\sigma ) M, \end{aligned}$$

where, in the last estimate, we used the assumption \(kh/p \le \overline{C}\). The combination of the estimates of steps 1 and 2 leads to the assertion. \(\square \)
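To get a feeling for the relative size of the two contributions in (4.8), the following minimal sketch evaluates the algebraic term \((h/p)^{s}kh/\sqrt{p}\) and the exponential term \(k^{\vartheta }E(\sigma )\) for sample parameters. The function names, the sample wavenumber, and in particular the value \(\sigma =1\) are assumptions made for illustration only; Theorem 4.8 merely asserts that some \(\sigma >0\) exists, and the factor \(C_{f,g}\) is omitted.

```python
import math

def algebraic_term(k, h, p, s):
    """(h/p)^s * k*h/sqrt(p): contribution of the H^{s+2} part in (4.8)."""
    return (h / p) ** s * k * h / math.sqrt(p)

def analytic_term(k, h, p, theta, sigma):
    """k^theta * E(sigma) with E(sigma) = (h/(h+sigma))^p + k*(k*h/(sigma*p))^p."""
    e_sigma = (h / (h + sigma)) ** p + k * (k * h / (sigma * p)) ** p
    return k ** theta * e_sigma

# Hypothetical sample values; sigma = 1.0 is NOT taken from the theorem.
k, s, theta, sigma = 100.0, 0, 2.5, 1.0
for p in (2, 4, 8, 16):
    h = 0.25 * p / k  # keeps k*h/p = 0.25 fixed while p grows
    print(p, algebraic_term(k, h, p, s), analytic_term(k, h, p, theta, sigma))
```

With \(kh/p\) held fixed, the exponential term decays geometrically in \(p\), whereas the algebraic term grows like \(\sqrt{p}\); this is one way to see why the later results combine a smallness condition on \(kh/\sqrt{p}\) with \(p\ge c\log k\).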

The approximation result Theorem 4.8 permits us to estimate the adjoint approximation property \(\eta _{k}(S)\) of (3.11):

Corollary 4.9

Let \(\varOmega \subset \mathbb R ^{d}\), \(d\in \{2,3\}\), be a bounded Lipschitz domain with analytic boundary. Let the mesh \(\fancyscript{T}\) be shape-regular in the sense of Assumption 4.1. Let \(\alpha \), \(\beta \), \(\delta \) be chosen according to (2.8). Fix \(\overline{C} > 0\) and assume \(kh/p \le \overline{C}\). Then there exist constants \(C\), \(\sigma >0\) such that \(\eta _{k}(S)\) defined in (3.11) satisfies

$$\begin{aligned} \eta _{k}(S)\le C \left[ \frac{kh}{\sqrt{p}} + k^{\vartheta }\left( \left( \frac{h}{h +\sigma }\right) ^{p} + k \left( \frac{kh}{\sigma p}\right) ^{p} \right) \right] . \end{aligned}$$

Proof

We apply Theorem 4.8 with \(s = 0\) and \(g = 0\). Given \(f \in L^{2}(\varOmega )\) let \(v=N_{k}^{*}(f)= \overline{N_{k}(\overline{f})}\). Hence, the regularity estimates of Theorem 4.5 (with \(g = 0\)) are applicable. The assumption \(kh/p \le \overline{C}\) allows us to estimate \((kh/p)^{2} \le C kh/\sqrt{p}\). \(\square \)

Finally, the convergence estimate for polynomial \(hp\)-FEM can be stated in the following theorem:

Theorem 4.10

(Convergence Estimate) Let \(\varOmega \subset \mathbb R ^{d}\), \(d\in \{2,3\}\), be a bounded Lipschitz domain with analytic boundary. Let the mesh \({\fancyscript{T}}\) be shape-regular in the sense of Assumption 4.1. Fix \(s\in \mathbb{N }_{0}\). Let \(\alpha \), \(\beta \), \(\delta \) be chosen according to (2.8) with \(\mathfrak a \) sufficiently large. Moreover, let \(0<\delta <1/3\). Then, there exist constants \(c_{1}\), \(c_{2}\), \(C>0\) independent of \(k\), \(h\), and \(p\) such that under the assumptions

$$\begin{aligned} \frac{kh}{\sqrt{p}}\le c_{1}\quad \text{ together } \text{ with }\quad p\ge c_{2} \,\log (k)\quad \text{ as } \text{ well } \text{ as } \quad p\ge s+1 \end{aligned}$$
(4.16)

there holds for \(f\in H^{s}(\varOmega )\) and \(g\in H^{s+1/2}(\partial \varOmega )\) the a priori estimate

$$\begin{aligned} \Vert u-u_{S}\Vert _{DG}&\le C\left[ \sqrt{p}\left( \frac{h}{p}\right) ^{s+1}+k^{\vartheta -1}\left\{ \left( \frac{h}{h+\sigma }\right) ^{p}+k\left( \frac{kh}{\sigma p}\right) ^{p}\right\} \right] \\&\times \left[ \Vert f\Vert _{H^{s}(\varOmega )}+\Vert g\Vert _{H^{s+1/2}(\partial \varOmega )}\right] . \end{aligned}$$

In particular, under the additional assumption that \(\mathfrak b \) and \(\mathfrak d \) satisfy \(\mathfrak b \), \(\mathfrak d \ge c_{0}>0\), there holds

$$\begin{aligned}&\Vert \nabla _{\fancyscript{T}}(u-u_{S})\Vert _{L^{2}(\varOmega )}+ \sqrt{\frac{h}{p} }\Vert [\![\nabla _{\fancyscript{T}}(u-u_{S})]\!]_{N} \Vert _{L^{2}\left( \mathfrak S _{\fancyscript{T}}^{\fancyscript{I}}\right) }+\frac{p}{\sqrt{h}} \Vert [\![u-u_{S}]\!]_{N}\Vert _{L^{2}\left( \mathfrak S _{\fancyscript{T}}^{\fancyscript{I}}\right) }\\&\quad \le C\Vert u-u_{S}\Vert _{DG}. \end{aligned}$$

Proof

By taking the constant \(\mathfrak a \) in (2.8) sufficiently large, we can ensure by Lemma 4.3 the condition (2.7). Hence the assertion is a combination of Theorems 3.10, 4.8 and Corollary 4.9. \(\square \)
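The resolution condition (4.16) can be checked mechanically once values for \(c_{1}\) and \(c_{2}\) are fixed. The sketch below does this and also returns the smallest integer degree compatible with the two conditions on \(p\). The theorem only asserts the existence of \(c_{1}\), \(c_{2}\); the numerical values used in the usage lines, as well as the function names, are hypothetical.

```python
import math

def satisfies_resolution_condition(k, h, p, s, c1, c2):
    """Check (4.16): k*h/sqrt(p) <= c1, p >= c2*log(k), and p >= s+1."""
    return (
        k * h / math.sqrt(p) <= c1
        and p >= c2 * math.log(k)
        and p >= s + 1
    )

def smallest_admissible_degree(k, s, c2):
    """Smallest integer p >= 1 with p >= c2*log(k) and p >= s+1."""
    return max(1, s + 1, math.ceil(c2 * math.log(k)))

# Hypothetical constants c1 = 0.5, c2 = 1.0 (not provided by the theorem).
p = smallest_admissible_degree(100.0, 0, 1.0)
print(p, satisfies_resolution_condition(100.0, 0.01, p, 0, 0.5, 1.0))
```

The point of the sketch is that the admissible degree grows only logarithmically in \(k\), while the admissible mesh size scales like \(\sqrt{p}/k\).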

4.2.2 Convergence Analysis for \(hp\)-FEM on Regular Meshes

When contrasting the estimate for the adjoint approximation property \(\eta _{k}(S)\) given in Corollary 4.9 and the final convergence result Theorem 4.10 with the corresponding ones for the classical conforming \(hp\)-FEM presented in [36, 37] one observes the suboptimality in \(p\) by half an order. This suboptimality is typical of \(p\)-explicit DG-methods and in general sharp, [22]. It can be removed if the \(hp\)-approximation space \(S\) is such that it contains an \(H^{1}(\varOmega )\)-conforming subspace that is sufficiently rich. The essential point of the argument is that the approximant \(w_{H^{s+2}}\) in the proof of Theorem 4.8 can be chosen to be in \(H^{1}(\varOmega )\) so that the following skeleton term vanishes:

$$\begin{aligned} k^{3/2}\Vert \alpha ^{1/2}[\![u_{H^{s+2} }-w_{H^{s+2}}]\!]_{N}\Vert _{L^{2}\left( \mathfrak S _{\fancyscript{T} }^{\fancyscript{I}}\right) }=0. \end{aligned}$$
(4.17)

We illustrate this procedure for a specific setting, namely, that of a regular mesh \({\fancyscript{T}}\) whose element maps satisfy the standard compatibility conditions for an \(H^{1}(\varOmega )\)-conforming discretization. Specifically, we require the mesh to be \(H^1\)-regular by which we mean: first, the partition has no hanging nodes or edges and, second, in addition to the conditions of Assumption 4.1 we require the element maps \(F_{K}\) and \(F_{K^{\prime }}\) of two elements \(K\), \(K^{\prime }\) that share an edge or face to induce the same parametrization on this edge or face. One way of constructing such a mesh is to start from a fixed coarse macro triangulation of \(\varOmega \) into “patches” using curved elements (e.g., constructed with “transfinite blending” [24, 25] and [12, Chap. 5]) and then construct the actual triangulation with elements of size \(h\) by transporting refinements of the reference elements to physical space with the patch maps of the coarse triangulation. More details for such a procedure are given in [36, Example 5.1]. On such regular meshes, the standard \(H^{1}(\varOmega )\)-conforming \(hp\)-FEM spaces given as \(S^{p,1}({\fancyscript{T}}):= \{u \in H^1(\varOmega )\,|\, \forall K \in {\fancyscript{T}} :u|_K \circ F_K \in {\fancyscript{P}}_p\}\) have good approximation properties, which results in the following improvement over Theorem 4.8:

Theorem 4.11

Assume the hypotheses of Theorem 4.8. Assume additionally that the mesh \({\fancyscript{T}}\) is \(H^1\)-regular in the above sense. Then for \(S = S^{p,1}({\fancyscript{T}})\):

$$\begin{aligned} \inf _{v \in S} k \Vert u-v\Vert _{DG^{+}}\le C_{f,g}\left( \left( \frac{h}{p}\right) ^{s} \frac{kh}{{p}} + k^{\vartheta }\left\{ \left( \frac{h}{h+\sigma }\right) ^{p}+k \left( \frac{kh}{\sigma p}\right) ^{p}\right\} \right) . \end{aligned}$$
(4.18)

Proof

As in the proof of Theorem 4.8, we decompose \(u=u_{H^{s+2} }+u_{\fancyscript{A}}\). We will not discuss the approximation of \(u_{\fancyscript{A}}\) since its approximation follows the lines of [36, Thm. 5.5]. We construct an \(H^{1}(\varOmega )\)-conforming approximation \(w_{H^{s+2}}\in S\) to \(u_{H^{s+2}}\). This ensures the desired property (4.17). It remains to guarantee that \(w_{H^{s+2}}\) is constructed such that the optimal rate of convergence is achieved in the broken \(H^{1}\)-norm and \(L^{2}\)-norm and also for the trace of the gradient on the skeleton. Recall \(p \ge s+1\). In Appendix 2 (Cor. 7.4) we construct, for \(t>5/2\) (for \(d=2\)) and \(t>5\) (for \(d=3\)) a linear operator \(I:H^{t}(\varOmega )\rightarrow S\cap H^{1}(\varOmega )\) with the following approximation properties:

$$\begin{aligned}&\left( \frac{h_{K}}{p}\right) ^{2}\Vert \nabla ^{2}(u-Iu)\Vert _{L^{2} (K)}+\left( \frac{h_{K}}{p}\right) \Vert \nabla (u-Iu)\Vert _{L^{2}(K)}+\Vert u-Iu\Vert _{L^{2}(K)}\\&\quad \le C\left( \frac{h_{K}}{p}\right) ^{t}\Vert u\Vert _{H^{t}(K)}. \end{aligned}$$

Set \(t^{*}=5/2\) for \(d=2\) and \(t^{*}=5\) for \(d=3\). If \(s+2>t^{*}\), we obtain the desired estimate for \(\Vert u - I u\Vert _{DG^{+}}\) from this by summation over all elements. If \(s+2\le t^{*}\), then we employ the following interpolation argument due to [6]: Fix \(\sigma >t^{*}\). The Sobolev space \(H^{s+2}(\varOmega )\) can be characterized by interpolation (using the so-called “\(K\)-method” as described, for example, in [43]), and we have \(H^{s+2}(\varOmega )=(L^{2}(\varOmega ),H^{\sigma } (\varOmega ))_{\theta ,2}\) with \(\theta =(s+2)/\sigma \). Hence, we can find, for any \(t>0\), a function \(v_{t}\in H^{\sigma }(\varOmega )\) such that

$$\begin{aligned} \Vert u-v_{t}\Vert _{L^{2}(\varOmega )}+t\Vert v_{t}\Vert _{H^{\sigma }(\varOmega )}=:K(u,t)\le Ct^{\theta }\Vert u\Vert _{H^{s+2}(\varOmega )}. \end{aligned}$$

Then [6, Lemma] gives the stability estimate \(\Vert u-v_{t}\Vert _{H^{s+2}(\varOmega )}\le C\Vert u\Vert _{H^{s+2}(\varOmega )}.\) Using interpolation estimates, we therefore arrive at

$$\begin{aligned} \Vert v_{t}\Vert _{H^{\sigma }(\varOmega )}&\le Ct^{\theta -1}\Vert u\Vert _{H^{s+2}(\varOmega )},\\ \Vert u-v_{t}\Vert _{L^{2}(\varOmega )}&\le Ct^{\theta }\Vert u\Vert _{H^{s+2}(\varOmega )},\\ \Vert u-v_{t}\Vert _{H^{1}(\varOmega )}&\le C\Vert u-v_{t}\Vert _{L^{2}(\varOmega )}^{(s+1)/(s+2)}\Vert u-v_{t}\Vert _{H^{s+2}(\varOmega )}^{1/(s+2)}\le Ct^{\theta (s+1)/(s+2)}\Vert u\Vert _{H^{s+2}(\varOmega )},\\ \Vert u-v_{t}\Vert _{H^{2}(\varOmega )}&\le C\Vert u-v_{t}\Vert _{L^{2}(\varOmega )}^{s/(s+2)}\Vert u-v_{t}\Vert _{H^{s+2}(\varOmega )}^{2/(s+2)}\le Ct^{\theta s/(s+2)}\Vert u\Vert _{H^{s+2}(\varOmega )}. \end{aligned}$$

We select \(t=(h/p)^{\sigma }\). Then, the above estimates take the following form:

$$\begin{aligned} \Vert v_{t}\Vert _{H^{\sigma }(\varOmega )}&\le C(h/p)^{s+2-\sigma }\Vert u\Vert _{H^{s+2}(\varOmega )},\\ \Vert u-v_{t}\Vert _{L^{2}(\varOmega )}&\le C(h/p)^{s+2}\Vert u\Vert _{H^{s+2}(\varOmega )},\\ \Vert u-v_{t}\Vert _{H^{1}(\varOmega )}&\le C(h/p)^{s+1}\Vert u\Vert _{H^{s+2}(\varOmega )},\\ \Vert u-v_{t}\Vert _{H^{2}(\varOmega )}&\le C(h/p)^{s}\Vert u\Vert _{H^{s+2}(\varOmega )}. \end{aligned}$$

Using elementwise appropriate multiplicative trace inequalities yields

$$\begin{aligned} \Vert u-v_{t}\Vert _{DG^{+}}\le C\left[ k(h/p)^{s+2}+(h/p)^{s+1}+k^{1/2} (h/p)^{s+3/2}\right] \Vert u\Vert _{H^{s+2}(\varOmega )}. \end{aligned}$$

Finally, \(v_{t}\) is sufficiently smooth to allow us to apply the approximation operator \(I\) of Appendix 2 and bound \(\Vert v_{t} - I v_{t}\Vert _{DG^{+}}\) with the aid of Corollary 7.4. \(\square \)
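The exponent bookkeeping in the interpolation argument of the preceding proof (the choice \(t=(h/p)^{\sigma }\) together with \(\theta =(s+2)/\sigma \)) can be verified with exact rational arithmetic. The following sketch is only a check of that arithmetic; the function name and the dictionary keys are hypothetical, and the sample values of \(s\) and \(\sigma \) are chosen for illustration.

```python
from fractions import Fraction

def exponents_of_h_over_p(s, sigma):
    """Exponents of (h/p) that result from t = (h/p)^sigma, theta = (s+2)/sigma."""
    sigma = Fraction(sigma)
    theta = Fraction(s + 2) / sigma
    return {
        # t^(theta-1) bounds the H^sigma norm of v_t
        "H^sigma norm of v_t": sigma * (theta - 1),
        # t^theta bounds the L2 error
        "L2 error": sigma * theta,
        # t^(theta*(s+1)/(s+2)) bounds the H1 error
        "H1 error": sigma * theta * Fraction(s + 1, s + 2),
        # t^(theta*s/(s+2)) bounds the H2 error
        "H2 error": sigma * theta * Fraction(s, s + 2),
    }

# Sample case d = 2, s = 0, sigma = 3 > t* = 5/2 (illustrative choice).
print(exponents_of_h_over_p(0, 3))
```

In every case the exponents come out as \(s+2-\sigma \), \(s+2\), \(s+1\), and \(s\), independently of the particular \(\sigma \), in agreement with the four displayed estimates.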

Remark 4.12

For \(H^1\)-regular meshes (in the above sense) the estimate for the adjoint approximation property \(\eta _{k}(S)\) in Corollary 4.9 can be improved to

$$\begin{aligned} \eta _{k}(S)\le C\left[ \frac{kh}{{p}}+k^{\vartheta }\left( \left( \frac{h}{h+\sigma }\right) ^{p}+k\left( \frac{kh}{\sigma p}\right) ^{p}\right) \right] . \end{aligned}$$

In turn, this results in an improvement of Theorem 4.10: the resolution condition (4.16) can be relaxed to

$$\begin{aligned} \frac{kh}{p}\le c_{1}\quad \text{ together } \text{ with }\quad p\ge c_{2}\,\log (k)\quad \text{ as } \text{ well } \text{ as } \quad p\ge s+1 \end{aligned}$$
(4.19)

and the approximation result also improves to

$$\begin{aligned} \Vert u-u_{S}\Vert _{DG}&\le C\left[ \left( \frac{h}{p}\right) ^{s+1}\!\!+k^{\vartheta -1}\left\{ \left( \frac{h}{h+\sigma }\right) ^{p}+k\left( \frac{kh}{\sigma p}\right) ^{p}\right\} \right] \\&\times \left[ \Vert f\Vert _{H^{s}(\varOmega )}+\Vert g\Vert _{H^{s+1/2}(\partial \varOmega )}\right] . \end{aligned}$$

\(\square \)
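The practical effect of relaxing (4.16) to (4.19) is a larger admissible mesh size for the same wavenumber and degree. The sketch below compares the two upper bounds on \(h\). It assumes, purely for simplicity, the same value of \(c_{1}\) in both conditions, which the results above do not guarantee; the function names and sample values are hypothetical.

```python
import math

def max_mesh_size_general(k, p, c1):
    """Largest h allowed by (4.16): k*h/sqrt(p) <= c1."""
    return c1 * math.sqrt(p) / k

def max_mesh_size_h1_regular(k, p, c1):
    """Largest h allowed by (4.19): k*h/p <= c1."""
    return c1 * p / k

# Hypothetical sample values; c1 = 0.5 is an assumed constant.
k, c1 = 200.0, 0.5
for p in (4, 9, 16):
    h_general = max_mesh_size_general(k, p, c1)
    h_regular = max_mesh_size_h1_regular(k, p, c1)
    print(p, h_general, h_regular, h_regular / h_general)
```

Under this simplifying assumption the ratio of the two bounds equals \(\sqrt{p}\), which reflects the half order in \(p\) gained on \(H^1\)-regular meshes.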

5 Conclusions

In this paper, we have formulated the discontinuous Galerkin method for abstract finite dimensional test and trial spaces (conforming and non-conforming ones). The concrete choice of this space \(S\) enters the stability and convergence analysis via the following four quantities.

(a) Trace constant \(C_{\mathrm{trace}}\left( S,K\right) \). Due to the formulation as a discontinuous Galerkin method, which contains integral jump terms across element faces, it is quite natural that local trace estimates for the space \(S\) are required for the error analysis.

(b) Approximation property \(\inf _{v\in S}\Vert u-v\Vert _{DG^{+}}\). In order to derive quantitative error estimates it is obvious that approximation results for \(S\) for functions with higher Sobolev regularity are required. The trace estimate (cf. (a)) allows us to “transfer” the local approximation results for the elements \(K\in \fancyscript{T}\) to the skeleton norm.

(c) Adjoint approximation property \(\eta _{k}\left( S\right) \). The decomposition lemma formulated as Theorem 4.5 provides a regularity theory for Helmholtz problems that splits the solution into several contributions, each of which can be approximated by piecewise polynomials with error estimates that are explicit in \(h\), \(k\), and \(p\).

(d) The constant \(C_{S}\) of (3.14). This condition ensures unique solvability of the discrete system (2.4) (see Theorem 3.8). For the important cases of polynomial \(hp\)-finite elements on affine, simplicial triangulations or plane wave approximation spaces, the condition (3.14) is automatically satisfied. If the adjoint approximation property can be controlled, then Theorem 3.10 provides an alternative way to ensure unique solvability for (2.4).

As an application of our abstract theory we considered the polynomial \(hp\)-finite elements, and we derived sharp stability and convergence estimates for non-conforming polynomial \(hp\)-finite element spaces. The a priori estimate in Theorem 4.10 is optimal in \(h\) (note that \(f\in H^{s}(\varOmega )\) with \(g\in H^{s+1/2}(\partial \varOmega )\) implies \(u\in H^{s+2}(\varOmega )\) by the assumed smoothness of \(\partial \varOmega \)) but suboptimal in \(p\) by half an order. This is typical in \(p\)-explicit DG methods. This suboptimality in \(p\) can be removed (in both the scale resolution condition (4.16) and the a priori estimate of Corollary 4.9) by assuming that the approximation space contains an \(H^{1}(\varOmega )\)-conforming subspace that is sufficiently rich. As an example, we considered the special case of meshes that are \(H^1\)-regular in Theorem 4.11 and the ensuing Remark 4.12. These results are formulated for meshes without hanging nodes, but we believe that similar results also hold for certain meshes with hanging nodes; the essential tool is the existence of an \(H^{1}(\varOmega )\)-conforming interpolant with appropriate approximation properties. Such a situation arises, e.g., if a conforming \(hp\)-finite element mesh is further refined locally in a controlled way by introducing hanging nodes.

We restricted the convergence analysis for polynomial \(hp\)-finite element spaces in Sect. 4 to Lipschitz domains with analytic boundaries in order not to further increase the technicalities in this paper. In [37], the case of polygonal domains for the standard variational formulation of the Helmholtz equation with conforming polynomial \(hp\)-finite element spaces was considered and regularity estimates in weighted Sobolev spaces were derived. We expect that the generalization of our theory for the DG method for non-conforming finite element spaces to polygonal domains is possible along those lines.