1 Introduction

In this paper, we study time-harmonic wave propagation problems in unbounded waveguides. Waveguides are an important technology with a variety of applications in acoustics, optical communications and so on. Many of these applications are posed in large, effectively unbounded, domains. A challenge for the numerical solution of wave propagation problems posed in such domains is the construction and application of highly accurate domain truncation techniques. The boundary conditions imposed on the artificial boundaries resulting from domain truncation, so-called absorbing boundary conditions (ABCs), should have the following properties:

  • the artificial boundary produces as little reflection as we wish and so the solution on the truncated domain can be made arbitrarily close to the solution on the original unbounded domain,

  • the artificial boundary conditions are easy to implement in the discretized problems using, e.g., the finite element method (FEM) or the finite difference method (FDM),

  • the numerical methods incorporated with the artificial boundary conditions are stable and robust.

Many ABCs satisfying the properties listed above have been developed, for example, nonlocal boundary conditions based on Dirichlet-to-Neumann (DtN) mappings [3, 13, 19], high-order local boundary conditions [24, 26, 27, 31], and perfectly matched layers (PMLs) [2, 29]. We note that the design of efficient ABCs is also important for scattering problems in exterior domains, which we will consider in a subsequent paper. For general reviews of this subject, see [4, 9, 15, 25, 35].

This paper is devoted to developing local high-order absorbing boundary conditions for time-harmonic wave propagation problems in waveguides motivated by complete radiation boundary conditions (CRBCs) for wave propagation problems in the time-domain [17, 18]. For time-domain calculations, CRBCs exploit the auxiliary function formulation proposed in [17], which leads to a more efficient and natural implementation of high order radiation conditions than those proposed by Higdon [20, 21] and by Givoli and Neta [11]. In addition, it is shown in [17] how optimal parameters can be chosen based on the simulation time, T, the separation, b, of sources and inhomogeneities from the artificial boundary, and the error tolerance, \(\tau \). The parameterizations are quite efficient, with the total number of auxiliary functions, P, obeying

$$\begin{aligned} P \propto \ln {\left( \frac{1}{\tau } \right) } \cdot \ln {\left( \frac{cT}{b} \right) }, \end{aligned}$$
(1.1)

with a positive constant c.

The new method that we investigate not only fulfills the necessary requirements for ABCs but also has certain advantages. First of all, compared with methods based on DtN mappings [3, 13, 19], CRBCs require knowledge of neither the eigenfunctions of the transverse Laplace operator on the cross-section of the waveguide nor the number of propagating modes, though easily obtained partial information on the distribution of the eigenvalues can be used to improve efficiency.

In addition, as CRBCs are local, the sparsity of the system matrix is retained. In contrast with earlier local boundary condition sequences or PML, CRBCs are constructed to treat evanescent modes as well as propagating modes. Thus they can be placed quite close to wave sources or scatterers without compromising accuracy. This fact will be illustrated in the numerical examples later. Here we note that, to handle evanescent modes, the PML width needs to be inversely proportional to the smallest decay rate of the evanescent modes, so the layer can become arbitrarily wide, whereas in such a case we can use suitably chosen nodes, e.g., Newman nodes, and guarantee accuracy independent of how small the smallest decay rate is.

Via the introduction of auxiliary variables, CRBCs, as well as some of the other methods mentioned above, avoid the higher order derivatives involved in the product boundary operators of Higdon. Hence, these boundary conditions are compatible with FEM. The literature [10, 12, 16, 31] reports many computational results for these ABCs combined with FEM for wave propagation problems in the time and frequency domains. However, the analysis of the finite element problems, e.g., well-posedness and quasi-optimal convergence, has not been available in any case. In the present paper, we provide such an analysis for the finite element application to time-harmonic wave propagation problems with CRBCs in waveguides. In general, for indefinite problems that satisfy a Gårding type inequality and whose adjoint problem is regular, the unique solvability and quasi-optimal convergence of finite element approximations are obtained by an argument of Schatz [33]. Schatz’s argument requires that the regularity of the continuous variational problem be established and that the mesh size h be small enough, that is, \(0<h<h_0\), where \(h_0\) depends on the regularity constant of the elliptic problem. In CRBC applications, it turns out that the regularity constant may increase polynomially as P grows (a similar result holds for PML, where the stability constant depends polynomially on the width of the layer [5]), which means that for large P a smaller mesh size h may be required to retain the unique solvability and quasi-optimal convergence. As the error due to the approximate boundary condition typically converges exponentially with increasing order, this possible restriction on the mesh is not likely to be important. We note that in our numerical simulations no dependence on P of the mesh size for the solvability of the discretized problem or the quasi-optimality of the finite element approximations was observed.

This paper is organized as follows. In Sect. 2 we study analytic solutions of a time-harmonic waveguide model. We define the CRBCs for wave propagation problems in the frequency domain in Sect. 3. Section 4 is devoted to the reformulation of the model problem in variational form, and in Sect. 5 existence and uniqueness of solutions to the Helmholtz equation satisfying CRBCs are established. Section 6 contains the convergence analysis of the continuous problem, and parameter optimization is discussed in Sect. 7. We analyze the stability and regularity of the variational problem in Sect. 8 and discuss the finite element analysis in Sect. 9. Finally, in Sect. 10 numerical examples that confirm the theory are presented. Note that we cannot directly use the time-domain analysis in the frequency domain, as in the time domain we use the finite simulation time, T, in an essential way. As a result the parameter optimization problem considered here is different and, in fact, more difficult.

2 Fourier series of solutions to the Helmholtz equation in waveguides

Fig. 1 Geometry of the semi-infinite waveguide \({\varOmega }_\infty \) in \({\mathbb {R}}^2\), \({\tilde{{\varGamma }}}_T = \tilde{{\varGamma }}_N \cup \tilde{{\varGamma }}_S\)

We consider a time-harmonic waveguide problem

$$\begin{aligned} {\varDelta }u +k^2 u =0\quad \hbox { in }{\varOmega }_\infty \end{aligned}$$
(2.1)

on a semi-infinite waveguide \({\varOmega }_\infty = \{(x,y)\in {\mathbb {R}}\times {\mathbb {R}}^{d-1} \ : \ x>0,\ y\in {\varTheta }\}\), \(d=2\) or 3. Here \({\varTheta }\) is a bounded subset of \({\mathbb {R}}^{d-1}\) with a smooth boundary. (For the numerical experiments we will specialize to \({\mathbb {R}}^2\) with \({\varTheta }=(0,W)\). See Fig. 1). Here k is a positive wavenumber. For definiteness we assume the lateral waveguide boundary is sound-hard, i.e., the normal flux is equal to zero,

$$\begin{aligned} \frac{\partial u}{\partial \nu } = 0\quad \hbox { on }{\tilde{{\varGamma }}}_T \equiv (0, \infty ) \times \partial {\varTheta }, \end{aligned}$$
(2.2)

where \(\nu \) is the outward unit normal vector on \({\tilde{{\varGamma }}}_T\). In addition, we assume that wave sources come from the west boundary \({\varGamma }_W\) of \({\varOmega }_\infty \), located at \(x=0\), and thus determine the boundary data on \({\varGamma }_W\),

$$\begin{aligned} u=f \quad \hbox { on }{\varGamma }_W. \end{aligned}$$
(2.3)

This models the practically important case where more complicated physics, geometry, or distributed sources are located in the region \(x < 0\).

Solutions of the Helmholtz equation (2.1) can be expressed in a Fourier series in terms of the eigenfunctions of the negative transverse Laplace operator

$$\begin{aligned} \begin{aligned} { {{\varDelta }_y Y_n}}+\lambda _n^2Y_n&=0\quad \hbox { in }{\varTheta },\\ \frac{\partial Y_n}{\partial \nu }&=0 \quad \hbox { on }\partial {\varTheta }, \end{aligned} \end{aligned}$$
(2.4)

where \((\lambda _n^2, Y_n)\) is the nth eigenpair. We denote \(\mu _n^2=k^2-\lambda _n^2\). By choosing normalized eigenfunctions, we have an orthonormal basis consisting of eigenfunctions \(Y_n\). Moreover, as

$$\begin{aligned} \lim _{n \rightarrow \infty } \mu _n^2 = - \infty , \end{aligned}$$
(2.5)

there are only finitely many \(\mu _n^2>0\), infinitely many \(\mu _n^2<0\) and there may be cutoff modes \(\mu _n^2=0\). We also note that the asymptotic behavior of the eigenvalues is well-known (e.g. [6, Ch. VI, Thm. 20–21]): for some constant A

$$\begin{aligned} \mu _n^2 \sim -A n^{\frac{2}{d-1}} . \end{aligned}$$
(2.6)

Now, under the time-harmonic assumption \(e^{-i\omega t}\) with angular frequency \(\omega \), for each \(\mu _n\), we only take solutions that propagate to the right or are bounded for \(x>0\),

$$\begin{aligned} z_n(x)=e^{i\mu _n x}. \end{aligned}$$

This represents a propagating mode for \(\mu _n^2>0\) with \(\mu _n>0\) and an evanescent mode for \(\mu _n^2<0\) with \({\tilde{\mu }}_n:=\mathfrak {I}(\mu _n)>0\). In some cases, there is a mode, a so-called cutoff mode, associated with \(\mu _n=0\), for which special care needs to be taken. For ease of exposition we now assume that there exists \(N\ge 0\) such that \(\mu _N=0\), \(\mu _n^2>0\) for all \(n<N\) and \(\mu _n^2<0\) for all \(n>N\). However, we will make clear when the absence of such a mode yields substantial improvements in the error and stability estimates. Note that extensions to the case of multiple cutoff modes could similarly be obtained.
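For the two-dimensional strip \({\varTheta }=(0,W)\) used in the numerical experiments, the sound-hard eigenvalues are known in closed form, \(\lambda _n=n\pi /W\), so the classification into propagating, cutoff and evanescent modes can be sketched directly. The function name `classify_modes`, the tolerance used to detect a cutoff mode, and the sample values of k and W are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def classify_modes(k, W, n_max, tol=1e-12):
    """Classify the modes n = 0..n_max of the sound-hard strip Theta = (0, W).

    Returns the axial wavenumbers mu_n (mu_n > 0 or Im(mu_n) > 0) and a label per mode.
    """
    n = np.arange(n_max + 1)
    lam = n * np.pi / W                  # transverse eigenvalues lambda_n
    mu2 = k**2 - lam**2                  # mu_n^2 = k^2 - lambda_n^2
    mu = np.sqrt(mu2.astype(complex))    # principal root selects the outgoing branch
    labels = []
    for m2 in mu2:
        if abs(m2) <= tol * k**2:
            labels.append("cutoff")
        elif m2 > 0:
            labels.append("propagating")
        else:
            labels.append("evanescent")
    return mu, labels

# Assumed sample values: k = 10 and W = 1, so lambda_n = n * pi.
mu, labels = classify_modes(k=10.0, W=1.0, n_max=6)
```

With these sample values the modes \(n=0,\ldots ,3\) propagate and the remaining ones are evanescent; a cutoff mode appears only when k coincides with some \(\lambda _n\).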

Thus, a general solution to the Helmholtz equation satisfying the outgoing radiation condition is represented by the Fourier series

$$\begin{aligned} \begin{aligned} u(x,y)&=\sum _{n=0}^\infty A_ne^{i\mu _n x}Y_n(y)\\&=\sum _{n=0}^N A_ne^{i\mu _nx}Y_n(y) + \sum _{n=N+1}^\infty A_ne^{-{\tilde{\mu }}_n x}Y_n(y), \end{aligned} \end{aligned}$$
(2.7)

which is a superposition of finitely many propagating modes (including a cutoff mode) and infinitely many evanescent modes. Here the Fourier coefficient \(A_n\) is determined by the sources from \({\varGamma }_W\),

$$\begin{aligned} A_n=\int _{{\varTheta }} u(0,y)Y_n(y)\ \mathrm {d}y. \end{aligned}$$
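As an illustration of (2.7), the following sketch evaluates the truncated series for \({\varTheta }=(0,W)\), where the orthonormal sound-hard eigenfunctions are \(Y_0=W^{-1/2}\) and \(Y_n=\sqrt{2/W}\cos (n\pi y/W)\) for \(n\ge 1\). The coefficients \(A_n\) are approximated with a composite trapezoid rule. The function names, the number of quadrature points and the truncation level are assumptions made only for this example.

```python
import numpy as np

def transverse_basis(n, y, W):
    """Orthonormal Neumann eigenfunction Y_n on Theta = (0, W)."""
    y = np.asarray(y, dtype=float)
    if n == 0:
        return np.full(y.shape, 1.0 / np.sqrt(W))
    return np.sqrt(2.0 / W) * np.cos(n * np.pi * y / W)

def outgoing_solution(f, k, W, n_modes, x, y, n_quad=2001):
    """Evaluate the series (2.7), truncated to n_modes terms, for u(0, y) = f(y)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    yq = np.linspace(0.0, W, n_quad)
    w = np.full(n_quad, yq[1] - yq[0])   # composite trapezoid weights
    w[0] *= 0.5
    w[-1] *= 0.5
    u = np.zeros(np.broadcast(x, y).shape, dtype=complex)
    for n in range(n_modes):
        # A_n = int_Theta f(y) Y_n(y) dy
        A_n = np.sum(w * f(yq) * transverse_basis(n, yq, W))
        # mu_n > 0 for propagating modes, mu_n = i * decay rate for evanescent modes
        mu_n = np.sqrt(complex(k**2 - (n * np.pi / W) ** 2))
        u = u + A_n * np.exp(1j * mu_n * x) * transverse_basis(n, y, W)
    return u
```

If f is itself one of the eigenfunctions \(Y_m\), only the coefficient \(A_m\) is nonzero up to quadrature error and the sketch returns the single mode \(e^{i\mu _m x}Y_m(y)\).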

The constant C throughout the paper is a generic constant and may be different at different places, but it does not depend on the functions involved. Where the dependence of constants on the parameters of the approximate radiation condition is important we will indicate the dependence via a subscript, \(C_a\). We remark that the construction and analysis can easily be extended to problems with variable coefficients depending only on the transverse coordinates, y, including the important case of layered materials. Also, the theory can be established for a case where the domain \({\varOmega }_\infty \) includes any bounded smooth cavity with any inhomogeneity in \(x<0\), and the analysis for this case can be found in [23].

3 Complete radiation boundary conditions

Complete radiation boundary conditions were introduced in [17, 18] to provide a rapidly convergent local boundary condition sequence for time-domain calculations. Fundamental differences between the time-domain and frequency-domain cases are:

  i. In the frequency domain only a discrete set of modes exists, while in the time domain we must consider the continuum of modes present as k varies along an entire inversion contour;

  ii. In the time domain we are only concerned about accuracy up to the simulation time, T, which allows for the continuation of k in the complex plane. In the frequency domain this would be akin to solving a limiting absorption approximation to the Helmholtz system, and thus the size of the imaginary part would be tied to the accuracy.

Directly, the conditions proposed in the time domain can be simply translated to the frequency domain by the replacement \(c^{-1} \frac{\partial }{\partial t} \rightarrow -i k\), where c is the wave speed. However, both the analysis and parameter optimization differ.

Fig. 2 Geometry of the truncated computational domain \({\varOmega }_b\), \({ {{\varGamma }_T = {{\varGamma }}_N \cup {{\varGamma }}_S}}\)

We truncate the unbounded strip \({\varOmega }_\infty \) to a bounded region \({\varOmega }_b = (0,b) \times {\varTheta }\), whose east boundary \({\varGamma }_E\) is located at \(x=b\) (see Fig. 2). The problem in the finite computational domain \({\varOmega }_b\) is

$$\begin{aligned} {\varDelta }u + k^2 u&=0\quad \hbox { in }{\varOmega }_b, \end{aligned}$$
(3.1)
$$\begin{aligned} \frac{\partial u}{\partial \nu }&=0\quad \hbox { on }{\varGamma }_T=(0,b)\times \partial {\varTheta }, \end{aligned}$$
(3.2)
$$\begin{aligned} u&=f\quad \hbox { on }{\varGamma }_W . \end{aligned}$$
(3.3)

To close the problem, we need to supplement it with the CRBC on the east boundary \({\varGamma }_E\). The boundary condition is defined by the following recursive formulas satisfied by auxiliary variables \(\phi _j\), that also satisfy the Helmholtz equation (3.1) with the sound-hard boundary condition (3.2) on \({\varGamma }_T\):

$$\begin{aligned} \begin{aligned} \phi _0&= u,\\ \left( { {\frac{\partial }{\partial x}}}+ a_j\right) \phi _j&= \left( -{ {\frac{\partial }{\partial x}}}+a_j\right) \phi _{j+1},\\ \end{aligned} \end{aligned}$$
(3.4)

for \(j=0, 1, 2,\ldots \), where \(a_j\) are parameters to be chosen for reducing reflection from the artificial boundary. As motivation we note that the recursion terminates if u is a superposition of modes annihilated by one of the operators \(({ {\frac{\partial }{\partial x}}}+ a_j)\). The parameters \(a_j\) are chosen as follows:

$$\begin{aligned} a_j=\left\{ \begin{array}{cl} -ikc_j &{}\hbox { for }j=0,\ldots , n_p-1,\\ \sigma _{j} &{}\hbox { for }j=n_p,\ldots ,n_p+n_e \end{array} \right. \end{aligned}$$
(3.5)

with

$$\begin{aligned} 0< c_j \le 1 \hbox { for }j=0,\ldots , n_p-1, \ \hbox { and }\ \ 0<\sigma _j \hbox { for }j=n_p,\ldots , n_p+n_e. \end{aligned}$$
(3.6)

In practice, the parameters we take satisfy

$$\begin{aligned} \mu _{N-1}\le kc_j \le k \ \hbox { and }\ \ {\tilde{\mu }}_{N+1}\le \sigma _j\le M_\sigma , \end{aligned}$$
(3.7)

where \(\mu _{N-1}\) represents the smallest axial frequency of propagating modes and \({\tilde{\mu }}_{N+1}\) is the smallest decay rate of evanescent modes. Also, \(M_\sigma \) is an upper bound for the decay rates \(\sigma _j\) of evanescent modes that the CRBC can damp effectively and it can be chosen so that \(e^{-M_\sigma b}\) is less than an error tolerance of numerical simulations. These bounds and selection of parameters in practice will be discussed in more detail in Sect. 7. We could choose repeated parameters \(a_j\), however from now on we assume that \(a_j\) are all distinct since the parameters in the optimal selection are all different. These recursions are terminated by

$$\begin{aligned} \phi _{n_p+n_e+1} =0\quad \hbox { on }{\varGamma }_E. \end{aligned}$$
(3.8)

Here \((n_p, n_e)\) is called the order of the CRBC, and we set \(P=n_p+n_e\). If \(a_j\) is selected to be purely imaginary so that \(kc_j=\mu _n>0\), then the recursion exactly eliminates the corresponding propagating mode, and if \(a_j\) is chosen to be real so that \(\sigma _j\) equals the decay rate \({\tilde{\mu }}_n\) of an evanescent mode, then it produces no reflection of the corresponding evanescent mode.
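The effect of the parameters on a single mode can be sketched numerically. Assuming that the recursions (3.4) are applied separately to the outgoing and reflected exponentials \(e^{\pm i\mu x}\) of one transverse mode and are closed by (3.8), a short calculation (made here for illustration, not quoted from the paper) gives the reflection magnitude \(|R|=\prod _j |(a_j+i\mu )/(a_j-i\mu )|^2\). The helper names below, and the choice \(M_\sigma =\ln (1/\tau )/b\) as the smallest value with \(e^{-M_\sigma b}\le \tau \), are likewise assumptions of this sketch.

```python
import numpy as np

def crbc_parameters(k, c, sigma):
    """Stack a_j = -i k c_j (propagating part) followed by a_j = sigma_j (evanescent part)."""
    prop = -1j * k * np.asarray(c, dtype=float)
    evan = np.asarray(sigma, dtype=float).astype(complex)
    return np.concatenate([prop, evan])

def reflection_magnitude(a, mu):
    """|R| for the mode exp(i mu x); use mu = 1j * decay_rate for an evanescent mode."""
    a = np.asarray(a, dtype=complex)
    return float(np.prod(np.abs((a + 1j * mu) / (a - 1j * mu)) ** 2))

def sigma_upper_bound(b, tol):
    """Smallest M_sigma with exp(-M_sigma * b) <= tol."""
    return np.log(1.0 / tol) / b

# Assumed sample data: k = 10, W = 1, so lambda_n = n * pi.
k = 10.0
mu_1 = np.sqrt(k**2 - np.pi**2)               # a propagating mode
mu_t4 = np.sqrt((4 * np.pi) ** 2 - k**2)      # decay rate of an evanescent mode
a = crbc_parameters(k, c=[mu_1 / k, 1.0], sigma=[mu_t4, 20.0])
r_matched = reflection_magnitude(a, mu_1)     # essentially zero: the mode is annihilated
```

In this sketch the purely imaginary parameters leave the magnitude of evanescent modes unchanged and the real parameters leave propagating modes unchanged, which is consistent with using both families in (3.5).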

Remark 3.1

As suggested for time-domain problems in [17], we may also use parameters \(a_j\) of the form

$$\begin{aligned} a_j = \sigma _j -ikc_j \end{aligned}$$
(3.9)

for \(j=0,\ldots ,P\) with the conditions (3.6). In this case, although the recursions do not annihilate any mode exactly, they damp reflection of propagating modes and evanescent modes simultaneously. In this paper, however, we only investigate CRBCs employing \(a_j\) as given in (3.5), which are generally more effective for frequency-domain problems.

For numerical implementation of these boundary conditions, we need to eliminate the derivative of the auxiliary variables with respect to the normal direction from the recursive formulas (3.4). To do this, we apply the operator \(\partial /\partial x\) to the Eq. (3.4) for the \((j-1)\)th and jth recursion, which yields

$$\begin{aligned} { {\frac{\partial ^2}{\partial x^2}}} \phi _{j-1} + { {\frac{\partial ^2}{\partial x^2}}} \phi _{j} = a_{j-1} { {\frac{\partial }{\partial x}}} \phi _{j} -a_{j-1} { {\frac{\partial }{\partial x}}} \phi _{j-1}, \end{aligned}$$
(3.10)

and

$$\begin{aligned} { {\frac{\partial ^2}{\partial x^2}}} \phi _j + { {\frac{\partial ^2}{\partial x^2}}} \phi _{j+1} = a_j { {\frac{\partial }{\partial x}}} \phi _{j+1} -a_j { {\frac{\partial }{\partial x}}} \phi _j. \end{aligned}$$
(3.11)

Here we eliminate \(\partial \phi _{j-1}/\partial x\) from (3.10) and \(\partial \phi _{j+1}/\partial x\) from (3.11) by using (3.4) for the \((j-1)\)th and jth recursion, respectively, which shows that

$$\begin{aligned} \begin{aligned} { {\frac{\partial ^2}{\partial x^2}}} \phi _{j-1} + { {\frac{\partial ^2}{\partial x^2}}} \phi _{j}&= a_{j-1} { {\frac{\partial }{\partial x}}} \phi _{j} -a_{j-1} \left( -{ {\frac{\partial }{\partial x}}} \phi _{j}+a_{j-1}\phi _j-a_{j-1} \phi _{j-1}\right) \\&=2a_{j-1}{ {\frac{\partial }{\partial x}}}\phi _j -a_{j-1}^2\phi _j+a_{j-1}^2\phi _{j-1}, \end{aligned} \end{aligned}$$
(3.12)

and

$$\begin{aligned} { {\frac{\partial ^2}{\partial x^2}}} \phi _j + { {\frac{\partial ^2}{\partial x^2}}} \phi _{j+1}&= a_j \left( -{ {\frac{\partial }{\partial x}}} \phi _j+a_j\phi _{j+1}-a_j\phi _j\right) -a_j { {\frac{\partial }{\partial x}}} \phi _j\nonumber \\&=-2a_j{ {\frac{\partial }{\partial x}}}\phi _j + a_j^2\phi _{j+1}-a_j^2\phi _j. \end{aligned}$$
(3.13)

Now, multiplying (3.12) by \(1/a_{j-1}\) and (3.13) by \(1/a_{j}\) and subsequently adding them together produces

$$\begin{aligned} \begin{aligned}&L_{j,j-1}\frac{\partial ^2}{\partial x^2}\phi _{j-1}+L_{j,j}\frac{\partial ^2}{\partial x^2}\phi _j +L_{j,j+1}\frac{\partial ^2}{\partial x^2}\phi _{j+1}\\&\quad + M_{j,j-1}\phi _{j-1}+ M_{j,j}\phi _j+ M_{j,j+1}\phi _{j+1}=0, \end{aligned} \end{aligned}$$
(3.14)

where

$$\begin{aligned} L_{j,j-1}&=\frac{1}{a_{j-1}},&L_{j,j}&=\frac{1}{a_{j-1}}+\frac{1}{a_j},&L_{j,j+1}&=\frac{1}{a_{j}},\nonumber \\ M_{j,j-1}&=-a_{j-1},&M_{j,j}&= a_{j-1}+a_j,&M_{j,j+1}&=-a_j. \end{aligned}$$
(3.15)

To find the connection between the solution \(u(=\phi _0)\) and the auxiliary variables on \({\varGamma }_E\), as in the above derivation, we have

$$\begin{aligned} \frac{\partial ^2}{\partial x^2}\phi _0+\frac{\partial ^2}{\partial x^2}\phi _1&= a_0 \frac{\partial }{\partial x} \phi _1 -a_0\frac{\partial }{\partial x}\phi _0\\&=a_0\left( -\frac{\partial }{\partial x}\phi _0 +a_0\phi _1-a_0\phi _0\right) -a_0\frac{\partial }{\partial x}\phi _0\\&=-2a_0\frac{\partial }{\partial x}\phi _0 +a_0^2\phi _1 -a_0^2\phi _0. \end{aligned}$$

Therefore,

$$\begin{aligned} -2{ {\frac{\partial }{\partial x}}}\phi _0 = \frac{1}{a_0}\left( { {\frac{\partial ^2}{\partial x^2}}}\phi _0 +{ {\frac{\partial ^2}{\partial x^2}}}\phi _1\right) +a_0\phi _0 -a_0\phi _1. \end{aligned}$$
(3.16)

To obtain our final system, with

$$\begin{aligned} L_{0,0}=\frac{1}{a_0} \ \hbox { and }\ M_{0,0}=a_0, \end{aligned}$$

we define L and M by the \((P+1)\times (P+1) \) symmetric (but not Hermitian) tridiagonal matrices whose non-zero elements \(L_{i,j}\) and \(M_{i,j}\) are given as above, respectively. We can write the boundary condition in matrix form

$$\begin{aligned} -\left( 2\frac{\partial u}{\partial x}\right) {\varvec{{e}}}_0=L\frac{\partial ^2}{\partial x^2}{\varPhi }+M{\varPhi }, \end{aligned}$$

where \({\varvec{{e}}}_j\) is the standard \((P+1)\times 1\) basis vector whose non-zero element is one at the jth component and \({\varPhi }=(\phi _0,\ldots ,\phi _P)^t\) with \(\phi _0=u\) on \({\varGamma }_E\).
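The assembly of L and M from (3.15) together with \(L_{0,0}\) and \(M_{0,0}\) can be sketched as follows. Each parameter \(a_j\) couples \(\phi _j\) and \(\phi _{j+1}\), and the contribution of \(\phi _{P+1}=0\) is dropped. The function name `crbc_matrices` and the dense storage are assumptions made for readability; an actual implementation would likely store only the three diagonals.

```python
import numpy as np

def crbc_matrices(a):
    """Tridiagonal matrices L and M for parameters a_0, ..., a_P (dense, complex)."""
    a = np.asarray(a, dtype=complex)
    n = a.size                           # n = P + 1
    L = np.zeros((n, n), dtype=complex)
    M = np.zeros((n, n), dtype=complex)
    for j in range(n):
        # diagonal contribution of a_j to row j
        L[j, j] += 1.0 / a[j]
        M[j, j] += a[j]
        if j + 1 < n:
            # a_j also contributes to row j+1 and to the off-diagonal entries
            L[j + 1, j + 1] += 1.0 / a[j]
            L[j, j + 1] += 1.0 / a[j]
            L[j + 1, j] += 1.0 / a[j]
            M[j + 1, j + 1] += a[j]
            M[j, j + 1] -= a[j]
            M[j + 1, j] -= a[j]
    return L, M

# Assumed sample parameters: two propagating and two evanescent entries.
L, M = crbc_matrices([-2j, -5j, 1.5, 4.0])
```

Both matrices are complex symmetric, so \(L=L^t\) and \(M=M^t\), but they are not Hermitian as soon as one \(a_j\) is purely imaginary.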

Finally, the Helmholtz equation removes all x-derivatives in the equation,

$$\begin{aligned} -\left( 2\frac{\partial u}{\partial x}\right) {\varvec{{e}}}_0=-L{\varDelta }_y{\varPhi }+(-k^2L+M){\varPhi }. \end{aligned}$$

Thus the model problem completed by the CRBCs on \({\varGamma }_E\) is to find functions u defined in \({\varOmega }_b\) and \({\varPhi }=({\phi _0},\ldots ,{\phi _P})^t\) defined on \({\varGamma }_E\) with \(u=\phi _0\) on \({\varGamma }_E\) such that

$$\begin{aligned} {\varDelta }u + k^2 u&= 0\quad \hbox { in }{\varOmega }_b, \end{aligned}$$
(3.17)
$$\begin{aligned} \frac{\partial u}{\partial \nu }&=0\quad \hbox { on }{\varGamma }_T ,\end{aligned}$$
(3.18)
$$\begin{aligned} u&=f \quad \hbox { on }{\varGamma }_W,\end{aligned}$$
(3.19)
$$\begin{aligned} \frac{\partial u}{\partial x}{\varvec{{e}}}_0&= \frac{-1}{2}(-L{\varDelta }_y{\varPhi }+(-k^2L+M){\varPhi })\quad \hbox { on }{\varGamma }_E \end{aligned}$$
(3.20)

with

$$\begin{aligned} \frac{\partial {\varPhi }}{\partial \nu }=0\quad \hbox { on }\partial {\varGamma }_E. \end{aligned}$$
(3.21)
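For a single transverse mode \(Y_n\), using \(-{\varDelta }_yY_n=\lambda _n^2Y_n\) and \(\lambda _n^2-k^2=-\mu _n^2\), condition (3.20) reduces to the algebraic system \(-2u_x{\varvec{{e}}}_0=(M-\mu _n^2L){\varPhi }^n\) with \(\phi _0^n=u\). The following sketch solves this system for the ratio \(u_x/u\) on \({\varGamma }_E\), which equals \(i\mu _n\) when the boundary condition is exact for that mode. The reduction is carried out here for illustration, and the names `crbc_matrices` and `modal_impedance` are assumptions.

```python
import numpy as np

def crbc_matrices(a):
    """Tridiagonal matrices L and M of (3.15) for parameters a_0, ..., a_P."""
    a = np.asarray(a, dtype=complex)
    inv = 1.0 / a
    L = np.diag(inv) + np.diag(inv[:-1], 1) + np.diag(inv[:-1], -1)
    L[1:, 1:] += np.diag(inv[:-1])       # adds 1/a_{j-1} to L_{j,j} for j >= 1
    M = np.diag(a) - np.diag(a[:-1], 1) - np.diag(a[:-1], -1)
    M[1:, 1:] += np.diag(a[:-1])         # adds a_{j-1} to M_{j,j} for j >= 1
    return L, M

def modal_impedance(a, mu):
    """Ratio u_x / u on Gamma_E implied by the CRBC for one mode with axial wavenumber mu."""
    L, M = crbc_matrices(a)
    S = M - mu**2 * L
    e0 = np.zeros(S.shape[0], dtype=complex)
    e0[0] = 1.0
    g = np.linalg.solve(S, e0)[0]        # (S^{-1})_{00}, since Phi = -2 u_x S^{-1} e_0
    return -1.0 / (2.0 * g)

# Assumed sample: a_0 matches the propagating mode mu = 5 exactly, so the result is 5j.
z = modal_impedance([-5j, 2.0], 5.0)
```

Comparing `modal_impedance(a, mu)` with the exact value `1j * mu` gives a quick per-mode measure of how well a given parameter set absorbs that mode.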

Remark 3.2

A similar algebraic computation for time-domain problems, in which the contribution of evanescent modes is not negligible, can be found in [16]. For time-domain problems the process of removing the \(\partial /\partial x\) operators required a seam function to transition from the recursions for propagating modes to those for evanescent modes. This is not needed in the recursions for frequency-domain problems, as time derivatives are not involved and there is no difference between the recursions for propagating modes and those for evanescent modes.

4 Variational reformulation

In this section, we reformulate the problem (3.17)–(3.21) in variational form for a given order \((n_p,n_e)\) of CRBCs with \(n_p+n_e=P\). We begin by defining the appropriate Sobolev spaces,

$$\begin{aligned} {\widetilde{H}}^1({\varOmega }_b)&=\{ \xi \in H^1({\varOmega }_b) \ : \ \xi |_{{\varGamma }_E} \in H^1({\varGamma }_E) \},\\ {\widetilde{H}}^1_0({\varOmega }_b)&=\{ \xi \in {\widetilde{H}}^1({\varOmega }_b) \ : \ \xi =0 \hbox { on }{\varGamma }_W\}. \end{aligned}$$

In the sequel, we will use the notations \((\cdot ,\cdot )_{{\varOmega }_b}\) and \((\cdot ,\cdot )_{{\varGamma }_E}\) for the \(L^2\)-inner products on \({\varOmega }_b\) and \({\varGamma }_E\), respectively.

For the space of auxiliary variables, we first introduce the symmetric positive definite matrices \({\mathcal {L}}\) and \({\mathcal {M}}\), which are obtained by replacing \(a_j\) with \(|a_j|\) in L and M, and define

$$\begin{aligned} \Vert {\varPhi }\Vert _{{\mathcal {L}}}^2&:=({\mathcal {L}}{\varPhi },{\varPhi })_{{\varGamma }_E}=\sum _{j=0}^P \frac{1}{|a_j|}\Vert \phi _j+\phi _{j+1}\Vert _{L^2({\varGamma }_E)}^2,\\ \Vert {\varPhi }\Vert _{{\mathcal {M}}}^2&:=({\mathcal {M}}{\varPhi },{\varPhi })_{{\varGamma }_E}=\sum _{j=0}^P |a_j|\Vert \phi _j-\phi _{j+1}\Vert _{L^2({\varGamma }_E)}^2 \end{aligned}$$

and for \(\ell =1,2\)

$$\begin{aligned} \Vert {\varPhi }\Vert _{{\mathcal {L}},\ell }^2:=\sum _{j=0}^P \frac{1}{|a_j|}\Vert \phi _j+\phi _{j+1}\Vert _{H^{\ell }({\varGamma }_E)}^2,\quad \Vert {\varPhi }\Vert _{{\mathcal {M}},\ell }^2:=\sum _{j=0}^P |a_j|\Vert \phi _j-\phi _{j+1}\Vert _{H^{\ell }({\varGamma }_E)}^2 \end{aligned}$$

for \({\varPhi }=(\phi _0,\ldots ,\phi _P)^t \in (L^2({\varGamma }_E))^{P+1}\) with \(\phi _{P+1}=0\). We define the Sobolev space \({\varvec{{V}}}_{{\varGamma }_E} =(H^1({\varGamma }_E))^{P+1}\) with the norm

$$\begin{aligned} \Vert {\varPhi }\Vert _{{\varvec{{V}}}_{{\varGamma }_E}}^2&=\Vert {\varPhi }\Vert _{{\mathcal {L}},1}^2+\Vert {\varPhi }\Vert _{{\mathcal {M}}}^2, \end{aligned}$$

which is equivalent to the standard product norm of \((H^1({\varGamma }_E))^{P+1}\) but the constants involved in the equivalence may depend on P. Furthermore, we introduce fractional Sobolev spaces \(H^s({\varGamma }_E)\) for \(-1\le s\le 2\) characterized by the norm

$$\begin{aligned} \Vert u\Vert _{H^s({\varGamma }_E)}^2=\sum _{n=0}^\infty (\lambda _n^2+1)^s|u_n|^2 \end{aligned}$$

for \(u=\sum _{n=0}^\infty u_nY_n\).
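For \({\varTheta }=(0,W)\), where \(\lambda _n=n\pi /W\), this spectral norm can be evaluated directly from a finite set of Fourier coefficients. The following is a minimal sketch under that assumption; the function name `hs_norm` is hypothetical.

```python
import numpy as np

def hs_norm(coeffs, W, s):
    """H^s(Gamma_E) norm of u = sum_n u_n Y_n, given u_0, ..., u_{n_max}, for Theta = (0, W)."""
    u = np.asarray(coeffs, dtype=complex)
    lam2 = (np.arange(u.size) * np.pi / W) ** 2          # lambda_n^2
    return float(np.sqrt(np.sum((lam2 + 1.0) ** s * np.abs(u) ** 2)))

# Assumed sample coefficients: s = 0 reduces to the l^2 norm of the coefficient vector.
value = hs_norm([3.0, 4.0j], W=1.0, s=0.0)
```

Negative values of s weight high modes less, which is how the dual norms for \(-1\le s<0\) appear in this characterization.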

Remark 4.1

We note that \(H^s({\varGamma }_E)\) for \(3/2\le s\le 2\) in this paper is different from a usual fractional Sobolev space. In this case, \(H^s({\varGamma }_E)\) is the space of functions which are in a usual fractional Sobolev space obtained by real interpolation \([H^1({\varGamma }_E),H^2({\varGamma }_E)]_{s-1}\) and whose normal derivatives vanish on \(\partial {\varGamma }_E\). However \(H^s({\varGamma }_E)\) for \(-1\le s<3/2\) is a usual fractional Sobolev space

$$\begin{aligned} H^s({\varGamma }_E)= \left\{ \begin{array}{ll} {[(H^1({\varGamma }_E))^*, L^2({\varGamma }_E)]_{s+1}}, &{} -1\le s\le 0,\\ {[ L^2({\varGamma }_E), H^1({\varGamma }_E)]_s,} &{} \ \ 0\le s\le 1,\\ {[H^1({\varGamma }_E),H^2({\varGamma }_E)]_{s-1},} &{} \ \ 1\le s<3/2 \end{array}\right. \end{aligned}$$

with \((H^1({\varGamma }_E))^*\) the dual space of \(H^1({\varGamma }_E)\).

If we use the same notations \(\Vert \cdot \Vert _{{\mathcal {L}}}\) and \(\Vert \cdot \Vert _{{\mathcal {M}}}\) for vectors in \({\mathbb {C}}^{P+1}\), the norm in \({\varvec{{V}}}_{{\varGamma }_E}\) can be written as

$$\begin{aligned} \Vert {\varPhi }\Vert _{{\varvec{{V}}}_{{\varGamma }_E}}^2=\sum _{n=0}^\infty (\lambda _n^2+1)\Vert {\varPhi }^n\Vert _{{\mathcal {L}}}^2+\Vert {\varPhi }^n\Vert _{{\mathcal {M}}}^2 \end{aligned}$$

for functions \({\varPhi }\) in \({\varvec{{V}}}_{{\varGamma }_E}\) with Fourier series \({\varPhi }=\sum _{n=0}^\infty {\varPhi }^n Y_n\).

The solution space \({\varvec{{V}}}\) is defined by

$$\begin{aligned} {\varvec{{V}}}:=\{(u,{\varPhi })\in {\widetilde{H}}^1({\varOmega }_b)\times {\varvec{{V}}}_{{\varGamma }_E} \ : \ u=\phi _0 \quad \hbox { on }{\varGamma }_E\hbox { for }{\varPhi }=(\phi _0,\ldots ,\phi _P)^t\}, \end{aligned}$$

which is equipped with the Sobolev norm

$$\begin{aligned} \Vert (u,{\varPhi })\Vert ^2_{{\varvec{{V}}}} =\Vert u\Vert ^2_{H^1({\varOmega }_b)}+\Vert {\varPhi }\Vert _{{\varvec{{V}}}_{{\varGamma }_E}}^2. \end{aligned}$$

We note that since \({\varvec{{V}}}\) is closed in \(H^1({\varOmega }_b)\times (H^1({\varGamma }_E))^{P+1}\), it is a Hilbert space. For regularity estimates, more regular spaces \({\varvec{{V}}}_{{\varGamma }_E}^2\) and \({\varvec{{V}}}^2\) are required, where \({\varvec{{V}}}_{{\varGamma }_E}^2\) is the set \((H^2({\varGamma }_E))^{P+1}\) with the norm

$$\begin{aligned} \Vert {\varPhi }\Vert _{{\varvec{{V}}}_{{\varGamma }_E}^2}^2:=\Vert {\varPhi }\Vert _{{\mathcal {L}},2}^2+\Vert {\varPhi }\Vert _{{\mathcal {M}},1}^2= \sum _{n=0}^\infty \left( (\lambda _n^2+1)^2\Vert {\varPhi }^n\Vert _{\mathcal {L}}^2+(\lambda _n^2+1)\Vert {\varPhi }^n\Vert ^2_{\mathcal {M}}\right) \end{aligned}$$

(which is also equivalent to the standard product norm in \((H^2({\varGamma }_E))^{P+1}\)) and \({\varvec{{V}}}^2\) is a subspace of \({\varvec{{V}}}\) consisting of \((u,{\varPhi })\) satisfying

$$\begin{aligned} \Vert (u,{\varPhi })\Vert _{{\varvec{{V}}}^2}^2:=\Vert u\Vert _{H^2({\varOmega }_b)}^2+\Vert {\varPhi }\Vert _{{\varvec{{V}}}_{{\varGamma }_E}^2}^2<\infty . \end{aligned}$$

Finally, we introduce the test space \({\varvec{{V}}}_0\), the set of functions \((\xi ,{\varPsi }) \in {\widetilde{H}}_0^1({\varOmega }_b)\times {\varvec{{V}}}_{{\varGamma }_E}\) such that \(\xi =\psi _0\) on \({\varGamma }_E\) for \({\varPsi }=(\psi _0,\ldots ,\psi _P)^t\). Now, we take a test function \((\xi ,{\varPsi })\in {\varvec{{V}}}_0\), multiply (3.17) by \(2\xi \) and (3.20) by \(2{\varPsi }\), and integrate them by parts, which transforms the problem (3.17)–(3.21) to the variational problem of finding \((u,{\varPhi })\in {\varvec{{V}}}\) with \(u=f\) on \({\varGamma }_W\) such that

$$\begin{aligned} A((u,{\varPhi }),(\xi ,{\varPsi })) = 0 \end{aligned}$$
(4.1)

for all \((\xi ,{\varPsi })\in {\varvec{{V}}}_0\), where

$$\begin{aligned} A((u,{\varPhi }),(\xi ,{\varPsi })) =2(\nabla u,\nabla \xi )_{{\varOmega }_b} -2k^2(u,\xi )_{{\varOmega }_b}+ J({\varPhi },{\varPsi }), \end{aligned}$$
(4.2)

and

$$\begin{aligned} J({\varPhi },{\varPsi })=(L\nabla _y {\varPhi },\nabla _y {\varPsi })_{{\varGamma }_E}+((-k^2L+M){\varPhi },{\varPsi })_{{\varGamma }_E} \end{aligned}$$

is the sesquilinear form defined on \({\varvec{{V}}}_{{\varGamma }_E}\times {\varvec{{V}}}_{{\varGamma }_E}\). Also, we define

$$\begin{aligned} {\widetilde{A}}((u,{\varPhi }),(\xi ,{\varPsi }))=2(\nabla u,\nabla \xi )_{{\varOmega }_b}+2(u,\xi )_{{\varOmega }_b}+{\widetilde{J}}({\varPhi },{\varPsi }) \end{aligned}$$

and

$$\begin{aligned} {\widetilde{J}}({\varPhi },{\varPsi })= (L\nabla _y {\varPhi },\nabla _y {\varPsi })_{{\varGamma }_E}+(L{\varPhi },{\varPsi })_{{{\varGamma }_E}} +({\bar{M}}{\varPhi },{\varPsi })_{{\varGamma }_E}, \end{aligned}$$

where \({\bar{M}}\) is the \((P+1)\times (P+1)\) tridiagonal symmetric matrix whose components are the complex conjugate of those of M.

Lemma 4.2

For \({\varPhi },{\varPsi }\) in \((L^2({\varGamma }_E))^{P+1}\), it holds that

$$\begin{aligned} |(L{\varPhi },{\varPsi })_{{\varGamma }_E}|&\le \Vert {\varPhi }\Vert _{\mathcal {L}}\Vert {\varPsi }\Vert _{\mathcal {L}},\\ |(M{\varPhi },{\varPsi })_{{\varGamma }_E}|&\le \Vert {\varPhi }\Vert _{\mathcal {M}}\Vert {\varPsi }\Vert _{\mathcal {M}},\\ |({\bar{M}}{\varPhi },{\varPsi })_{{\varGamma }_E}|&\le \Vert {\varPhi }\Vert _{\mathcal {M}}\Vert {\varPsi }\Vert _{\mathcal {M}}. \end{aligned}$$

Proof

Noting the symmetry of the matrix L, application of the Cauchy–Schwarz inequality shows that

$$\begin{aligned} \begin{aligned} |(L{\varPhi },{\varPsi })_{{\varGamma }_E}|&=\left| \sum _{j=0}^P \frac{1}{a_j} (\phi _j+\phi _{j+1},\psi _j+\psi _{j+1})_{{\varGamma }_E}\right| \le \Vert {\varPhi }\Vert _{\mathcal {L}}\Vert {\varPsi }\Vert _{\mathcal {L}}\end{aligned} \end{aligned}$$
(4.3)

The other cases are proved similarly. \(\square \)

The boundedness of J and \({\widetilde{J}}\) is easily obtained from Lemma 4.2.

Lemma 4.3

For \({\varPhi },{\varPsi }\in \mathbf{{V} }_{{\varGamma }_E}\), it holds that

$$\begin{aligned} |J({\varPhi },{\varPsi })|&\le C\Vert {\varPhi }\Vert _{\mathbf{V }_{{\varGamma }_E}}\Vert {\varPsi }\Vert _{\mathbf{V }_{{\varGamma }_E}},\\ |{\widetilde{J}}({\varPhi },{\varPsi })|&\le C\Vert {\varPhi }\Vert _{\mathbf{V }_{{\varGamma }_E}}\Vert {\varPsi }\Vert _{\mathbf{V }_{{\varGamma }_E}} \end{aligned}$$

with a positive constant C depending only on k.

The following boundedness and coercivity of the sesquilinear form \({\widetilde{A}}(\cdot ,\cdot )\) will play an important role for the existence of solutions in the next section.

Lemma 4.4

It holds that

$$\begin{aligned} |{\widetilde{A}}((u,{\varPhi }),(\xi ,{\varPsi }))|\le C\Vert (u,{\varPhi })\Vert _{\mathbf{V }}\Vert (\xi ,{\varPsi })\Vert _{\mathbf{V }} \end{aligned}$$

and

$$\begin{aligned} |{\widetilde{A}}((u,{\varPhi }),(u,{\varPhi }))|\ge C\Vert (u,{\varPhi })\Vert _{\mathbf{V }}^2 \end{aligned}$$

for all \((u,{\varPhi }), (\xi ,{\varPsi }) \in \mathbf{V }\).

Proof

The boundedness of \({\widetilde{A}}(\cdot ,\cdot )\) is an immediate consequence of Lemma 4.3 and the Cauchy–Schwarz inequality. For the coercivity, we first examine the real and imaginary parts of \({\widetilde{A}}((u,{\varPhi }),(u,{\varPhi }))\),

$$\begin{aligned} \begin{aligned}&\mathfrak {R}({\widetilde{A}}((u,{\varPhi }),(u,{\varPhi })))\\&\quad =2\Vert u\Vert _{H^1({\varOmega }_b)}^2+\sum _{j=n_p}^{n_p+n_e} \bigg (\frac{1}{a_j}\Vert \nabla _y(\phi _j+\phi _{j+1})\Vert _{L^2({\varGamma }_E)}^2 \\&\qquad +\frac{1}{a_j}\Vert \phi _j+\phi _{j+1}\Vert _{L^2({\varGamma }_E)}^2 +a_j\Vert \phi _j-\phi _{j+1}\Vert _{L^2({\varGamma }_E)}^2\bigg )\\ \end{aligned} \end{aligned}$$
(4.4)

and

$$\begin{aligned} \begin{aligned}&\mathfrak {I}({\widetilde{A}}((u,{\varPhi }),(u,{\varPhi })))\\&\quad =\sum _{j=0}^{n_p-1} \bigg (\frac{1}{|a_j|}\Vert \nabla _y(\phi _j+\phi _{j+1})\Vert _{L^2({\varGamma }_E)}^2\\&\qquad +\frac{1}{|a_j|}\Vert \phi _j+\phi _{j+1}\Vert _{L^2({\varGamma }_E)}^2+ |a_j|\Vert \phi _j-\phi _{j+1}\Vert _{L^2({\varGamma }_E)}^2\bigg ), \end{aligned} \end{aligned}$$
(4.5)

and we obtain that

$$\begin{aligned} |{\widetilde{A}}((u,{\varPhi }),(u,{\varPhi }))|&\ge C(\mathfrak {R}({\widetilde{A}}((u,{\varPhi }),(u,{\varPhi })))+\mathfrak {I}({\widetilde{A}}((u,{\varPhi }),(u,{\varPhi }))))\\&= C(\Vert u\Vert _{H^1({\varOmega }_b)}^2+\Vert {\varPhi }\Vert _{{\varvec{{V}}}_{{\varGamma }_E}}^2), \end{aligned}$$

which completes the proof. \(\square \)

We close this section with a lemma about a property of the norms \(\Vert \cdot \Vert _{{\mathcal {L}}}\) and \(\Vert \cdot \Vert _{{\mathcal {M}}}\), which will be used for the stability analysis of cutoff modes.

Lemma 4.5

Let \(a_j\) be the parameters defined by (3.5) satisfying (3.7). It holds that

$$\begin{aligned} \Vert {\varPhi }\Vert _{{\mathcal {L}}}\le C_a(P+1)\Vert {\varPhi }\Vert _{{\mathcal {M}}} \end{aligned}$$

for \({\varPhi }\in {\mathbb {C}}^{P+1}\), where \(C_a\) is a constant depending on \(\max _{0\le j\le P}\{1/|a_j|\}\).

Proof

Noting that

$$\begin{aligned} \sum _{\ell =0}^P|\phi _\ell +\phi _{\ell +1}|^2 \le C(P+1)^2 \sum _{\ell =0}^P|\phi _\ell -\phi _{\ell +1}|^2 \end{aligned}$$

for \({\varPhi }=(\phi _0,\ldots ,\phi _P)^t\in {\mathbb {C}}^{P+1}\) with \(\phi _{P+1}=0\) (see e.g., [34]), it can be proved that

$$\begin{aligned} \Vert {\varPhi }\Vert _{{\mathcal {L}}}^2&=\sum _{\ell =0}^P\frac{1}{|a_\ell |}|\phi _\ell +\phi _{\ell +1}|^2 \le C_a\sum _{\ell =0}^P|\phi _\ell +\phi _{\ell +1}|^2\\&\le C_a(P+1)^2\sum _{\ell =0}^P|\phi _\ell -\phi _{\ell +1}|^2 \le C_a^2(P+1)^2\Vert {\varPhi }\Vert _{{\mathcal {M}}}^2. \end{aligned}$$

\(\square \)
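
The discrete inequality used at the start of this proof can be checked numerically. The following is a minimal sketch, not part of the analysis: it samples random vectors in \({\mathbb {C}}^{P+1}\), adds the constant vector as a slowly varying candidate, and reports the largest observed ratio of the two sums. The function names are hypothetical, and the constant 4 used in the comparison is one admissible value obtained from a telescoping argument with \(\phi _{P+1}=0\), not necessarily the constant of [34].

```python
import random

def plus_minus_sums(phi):
    """Return (sum |phi_l + phi_{l+1}|^2, sum |phi_l - phi_{l+1}|^2) with phi_{P+1} = 0."""
    ext = list(phi) + [0j]  # append the terminal value phi_{P+1} = 0
    plus = sum(abs(ext[l] + ext[l + 1]) ** 2 for l in range(len(phi)))
    minus = sum(abs(ext[l] - ext[l + 1]) ** 2 for l in range(len(phi)))
    return plus, minus

def worst_ratio(P, trials=200, seed=0):
    """Largest observed ratio plus/minus over random vectors and the constant vector."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        phi = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(P + 1)]
        plus, minus = plus_minus_sums(phi)
        worst = max(worst, plus / minus)
    # The constant vector has small differences but large sums.
    plus, minus = plus_minus_sums([1.0] * (P + 1))
    return max(worst, plus / minus)

for P in (1, 5, 20):
    # Compare the observed ratio with the admissible bound 4 (P+1)^2.
    print(P, worst_ratio(P), 4 * (P + 1) ** 2)
```

The observed ratios stay below \(4(P+1)^2\), which is the growth in P that enters Lemma 4.5.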

5 Existence and uniqueness of solutions to the Helmholtz equation with the CRBCs

This section is devoted to establishing the existence and uniqueness of solutions to the problem (3.17)–(3.21). For establishing the uniqueness of solutions, assume that \(f=0\) on \({\varGamma }_W\) and let the solution u be represented by the Fourier series

$$\begin{aligned} u(x,y)=(A_N+B_Nx)Y_N(y)+\sum _{n\ne N} (A_ne^{i\mu _n x}+B_ne^{-i\mu _nx})Y_n(y). \end{aligned}$$
(5.1)

The boundary condition on \({\varGamma }_W\) implies

$$\begin{aligned} A_n&=0\quad \hbox { for }n=N, \end{aligned}$$
(5.2)
$$\begin{aligned} A_n+B_n&=0\quad \hbox { for }n \ne N. \end{aligned}$$
(5.3)

Let \(C_n^0\) and \(D_n^0\) be the Fourier coefficients of the trace of u and \(\partial u/\partial x \) on \({\varGamma }_E\), respectively,

$$\begin{aligned} \begin{aligned} C_n^0&=\left\{ \begin{array}{ll} B_nb&{}\hbox { for }n=N,\\ A_ne^{i\mu _nb}+B_ne^{-i\mu _nb} &{}\hbox { for }n\ne N, \end{array} \right. \\ D_n^0&=\left\{ \begin{array}{ll} B_n &{}\hbox { for }n=N,\\ i\mu _n(A_ne^{i\mu _nb}-B_ne^{-i\mu _nb}) &{}\hbox { for }n\ne N. \end{array}\right. \end{aligned} \end{aligned}$$
(5.4)

The auxiliary variable \(\phi _j\) on \({\varGamma }_E\) has the Fourier expansion

$$\begin{aligned} \phi _j(y)=\sum _{n=0}^\infty C_n^jY_n(y). \end{aligned}$$

Now we note that the vector \({\varvec{{C}}}_n=(C_n^0,\ldots ,C_n^P)^t\) consisting of the nth Fourier coefficients of the auxiliary variables satisfies

$$\begin{aligned} -2D_n^0{\varvec{{e}}}_0=(-\mu _n^2L+M){\varvec{{C}}}_n. \end{aligned}$$
(5.5)

Indeed, since \(Y_n\) is an eigenfunction associated with the eigenvalue \(\lambda _n^2\), the nth Fourier mode of the right hand side of (3.20) is

$$\begin{aligned} \frac{-1}{2}(\lambda _n^2L{\varvec{{C}}}_n+(-k^2L+M){\varvec{{C}}}_n)Y_n=\frac{-1}{2}(-\mu _n^2L+M){\varvec{{C}}}_nY_n, \end{aligned}$$

while that of the left hand side is \(D_n^0{\varvec{{e}}}_0Y_n\). Taking the inner product \((\cdot ,\cdot )_{{\mathbb {C}}^{P+1}}\) of (5.5) with \({\varvec{{C}}}_n\) leads to

$$\begin{aligned} \begin{aligned} -2D_n^0{{\bar{C}}_n}^0&=-\mu _n^2(L{\varvec{{C}}}_n,{\varvec{{C}}}_n)_{{\mathbb {C}}^{P+1}}+(M{\varvec{{C}}}_n,{\varvec{{C}}}_n)_{{\mathbb {C}}^{P+1}}\\&=\sum _{j=0}^{P}\left[ \frac{-\mu _n^2}{a_j}|C_n^j+C_n^{j+1}|^2 +a_j|C_n^j-C_n^{j+1}|^2\right] , \end{aligned} \end{aligned}$$
(5.6)

where \({{\bar{C}}_n}^j\) is the complex conjugate of \(C_n^j\) and \(C_n^{P+1}=0\). Owing to (5.3) and (5.4), the left hand side of (5.6) is given by

$$\begin{aligned} 4\mu _n\mathfrak {I}(A_n{\bar{B}}_ne^{2i\mu _nb}) -2\mu _n(|A_n|^2-|B_n|^2)i {{= -4\mu _n|A_n|^2\mathfrak {I}(e^{2i\mu _nb})}} \end{aligned}$$
(5.7)

for \(n<N\) (propagating modes, \(\mu _n>0\)),

$$\begin{aligned} 2{\tilde{\mu }}_n(|A_n|^2e^{-2{\tilde{\mu }}_nb}-|B_n|^2e^{2{\tilde{\mu }}_nb}) +4{\tilde{\mu }}_n\mathfrak {I}(A_n{\bar{B}}_n)i {{=2{\tilde{\mu }}_n|A_n|^2(e^{-2{\tilde{\mu }}_nb}-e^{2{\tilde{\mu }}_nb})}} \end{aligned}$$
(5.8)

for \(n>N\) (evanescent modes, \(\mu _n^2<0\)) and

$$\begin{aligned} -2b|B_N|^2 \end{aligned}$$
(5.9)

for \(n=N\) (cutoff mode, \(\mu _n=0\)).
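
The closed forms on the right of (5.7) and (5.8) can be verified directly; the following short computation is a sketch that uses only (5.3) and (5.4). For \(n<N\), substituting \(B_n=-A_n\) into (5.4) gives

$$\begin{aligned} C_n^0=2iA_n\sin (\mu _nb),\qquad D_n^0=2i\mu _nA_n\cos (\mu _nb), \end{aligned}$$

so that

$$\begin{aligned} -2D_n^0{{\bar{C}}_n}^0=-8\mu _n|A_n|^2\sin (\mu _nb)\cos (\mu _nb) =-4\mu _n|A_n|^2\mathfrak {I}(e^{2i\mu _nb}). \end{aligned}$$

For \(n>N\), with \(\mu _n=i{\tilde{\mu }}_n\) and \({\tilde{\mu }}_n>0\) (the convention implied by (5.8)), the same substitution gives

$$\begin{aligned} C_n^0=-2A_n\sinh ({\tilde{\mu }}_nb),\qquad D_n^0=-2{\tilde{\mu }}_nA_n\cosh ({\tilde{\mu }}_nb), \end{aligned}$$

and hence

$$\begin{aligned} -2D_n^0{{\bar{C}}_n}^0=-8{\tilde{\mu }}_n|A_n|^2\sinh ({\tilde{\mu }}_nb)\cosh ({\tilde{\mu }}_nb) =2{\tilde{\mu }}_n|A_n|^2(e^{-2{\tilde{\mu }}_nb}-e^{2{\tilde{\mu }}_nb}). \end{aligned}$$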

Now, we are ready to prove the uniqueness of solutions.

Lemma 5.1

Suppose that the parameters \(a_j\) are given by (3.5) and k is a positive wavenumber. Then solutions to the problem (3.17)–(3.21) are unique.

Proof

For \(n<N\) (\(\mu _n^2>0\)), by (5.6) and (5.7)

$$\begin{aligned} -4\mu _n|A_n|^2\mathfrak {I}(e^{2i\mu _nb})&= \sum _{{{j}}=0}^{n_p-1} \left[ \frac{-\mu _n^2}{-ikc_j}|C_n^j+C_n^{j+1}|^2-ikc_j|C_n^j-C_n^{j+1}|^2\right] \nonumber \\&\quad +\sum _{{{j}}=n_p}^{n_p+n_e} \left[ \frac{-\mu _n^2}{\sigma _j}|C_n^j+C_n^{j+1}|^2+\sigma _j|C_n^j-C_n^{j+1}|^2\right] . \end{aligned}$$
(5.10)

Comparing the imaginary parts of both sides, we see that

$$\begin{aligned} C_n^j=0 \hbox { for }j=0,\ldots ,n_p \quad \hbox { and }\quad n=0,\ldots ,N-1. \end{aligned}$$
(5.11)

In addition, since \(C_n^0=C_n^1=0\), it follows from the zeroth row of (5.5) that \(D_n^0=0\), which yields \(A_n=B_n=0\) for \(n=0,\ldots ,N-1\) by solving (5.4). Then, (5.5) becomes

$$\begin{aligned} (-\mu _n^2L+M){\varvec{{C}}}_n=0. \end{aligned}$$
(5.12)

Since the superdiagonal entries of \(-\mu _n^2L+M\) below the \((n_p-1)\)th row are non-zero,

$$\begin{aligned} -\frac{\mu _n^2}{a_j}-a_j=-\frac{\mu _n^2}{\sigma _j}-\sigma _j< 0 \end{aligned}$$

for \(j=n_p,\ldots ,n_p+n_e\), applying forward substitution to (5.12) from the \(n_p\)th row by using \(C_n^{j}=0\) for \(j=0,\ldots ,n_p\) gives \(C_n^j=0\) for \(j=n_p+1,\ldots ,n_p+n_e\).

For \(n>N\) (\(\mu _n^2<0\)), (5.10) with (5.8) used instead of (5.7) leads to

$$\begin{aligned} 2{\tilde{\mu }}_n|A_n|^2(e^{-2{\tilde{\mu }}_nb}-e^{2{\tilde{\mu }}_nb})&= \sum _{j=0}^{n_p-1} \left[ \frac{-\mu _n^2}{-ikc_j}|C_n^j+C_n^{j+1}|^2-ikc_j|C_n^j-C_n^{j+1}|^2\right] \\&\quad +\sum _{j=n_p}^{n_p+n_e} \left[ \frac{-\mu _n^2}{\sigma _j}|C_n^j+C_n^{j+1}|^2+\sigma _j|C_n^j-C_n^{j+1}|^2\right] . \end{aligned}$$

Since the real part of the left hand side is non-positive while that of the right hand side is non-negative, both must be zero, which implies that \(A_n=B_n=0\) and \(C_n^j=0\) for \(j=n_p,\ldots ,n_p+n_e\). We observe that \(A_n=B_n=0\) implies \(D_n^0=0\), and so we again obtain the linear equation (5.12) from (5.5) as above. In this case, since the subdiagonal entries of \(-\mu _n^2L+M\) above the \((n_p+1)\)th row are non-zero,

$$\begin{aligned} \frac{-\mu _n^2}{a_j}-a_j=\frac{-\mu _n^2}{-ikc_j}+ikc_j \ne 0 \end{aligned}$$

for \({{j=0,\ldots ,n_p-1}}\), we solve (5.12) by backward substitution from the \(n_p\)th row by using \(C_n^{j}=0\) for \(j=n_p,\ldots ,n_p+n_e\) and then we can see that \(C_n^j=0\) for \(j=0,\ldots ,n_p-1\).

For \(n=N\) (\(\mu _n^2=0\)), (5.6) becomes

$$\begin{aligned} -2b|B_N|^2=\sum _{j=0}^{n_p-1} -ikc_j|C_n^j-C_n^{j+1}|^2 +\sum _{j=n_p}^{n_p+n_e}\sigma _j|C_n^j-C_n^{j+1}|^2. \end{aligned}$$

By comparing the real and imaginary parts of both sides, it can be easily shown that \(C_n^j=0\) for all \(j=0,\ldots , P\). In addition, due to \(C_N^0=B_Nb\) and (5.2), we have \(A_N=B_N=0\).

Finally, the fact that \(A_n=B_n=0\) and \(C_n^j=0\) for all \(n\ge 0\) and \(j=0,\ldots ,P\) results in \(u=0\) in \({\varOmega }_b\) and \(\phi _j=0\) on \({\varGamma }_E\) for \(j=0,\ldots ,P\), which completes the proof of the uniqueness of solutions. \(\square \)

Theorem 5.2

The problem (3.17)–(3.21) has a unique solution \((u,{\varPhi })\in \mathbf{V }\).

Proof

By invoking Lemma 4.3, we can show boundedness of \(A(\cdot ,\cdot )\), i.e., there exists a positive constant \(C_1\) such that

$$\begin{aligned} |A((u,{\varPhi }),(\xi ,{\varPsi }))|\le C_1\Vert (u,{\varPhi })\Vert _{{\varvec{{V}}}}\Vert (\xi ,{\varPsi })\Vert _{{\varvec{{V}}}}. \end{aligned}$$

Furthermore, Lemma 4.3 and Lemma 4.4 show that there exist positive constants \(C_2\) and \(C_3\) such that

$$\begin{aligned} \begin{aligned} A((u,{\varPhi }),(u,{\varPhi }))&={\widetilde{A}}((u,{\varPhi }),(u,{\varPhi }))-2(k^2+1)\Vert u\Vert _{L^2({\varOmega }_b)}^2\\&\qquad -(k^2+1)(L{\varPhi },{\varPhi })_{{\varGamma }_E}+((M-{\bar{M}}){\varPhi },{\varPhi })_{{\varGamma }_E}\\&\ge C_2\Vert (u,{\varPhi })\Vert _{{\varvec{{V}}}}^2 -C_3\big (\Vert u\Vert ^2_{L^2({\varOmega }_b)}+\Vert {\varPhi }\Vert _{\mathcal {L}}^2+\Vert {\varPhi }\Vert _{\mathcal {M}}^2\big ) \end{aligned} \end{aligned}$$
(5.13)

for all \((u,{\varPhi }),(\xi ,{\varPsi })\in {\varvec{{V}}}_0\). Since \({\varvec{{V}}}_0\) is compactly embedded in \(L^2({\varOmega }_b)\times (L^2({\varGamma }_E))^{P+1}\), the existence of solutions is a consequence of the Fredholm alternative theorem and the uniqueness of solutions given in Lemma 5.1. \(\square \)

In the proof, it is not established how the stability constant depends on the number of parameters, \(P+1\). This will be studied in more detail in Sect. 8.

Remark 5.3

Let \(\mathbf{V }_0^*\) be the dual space of \(\mathbf{V }_0\) with the norm

$$\begin{aligned} \Vert {\mathcal {G}}\Vert _{\mathbf{V }_0^*}=\sup _{0\ne (\xi ,{\varPsi })\in \mathbf{V }_0} \frac{|{\mathcal {G}}(\xi ,{\varPsi })|}{\Vert (\xi ,{\varPsi })\Vert _{\mathbf{V }}} \end{aligned}$$

for \({\mathcal {G}}\in \mathbf{V }_0^*\). The same argument used in the proof of Theorem 5.2 can show that the problem \(A((u,{\varPhi }),(\xi ,{\varPsi }))={\mathcal {G}}(\xi ,{\varPsi })\) for all \((\xi ,{\varPsi })\in \mathbf{V }_0\) admits a unique solution in \(\mathbf{V }_0\).

We can find a formula for the approximate solution u and \(\phi _j\) satisfying the CRBC on \({\varGamma }_E\) in terms of a prescribed condition \(f\in H^{1/2}({\varGamma }_W)\). To this end, let \(f\in H^{1/2}({\varGamma }_W)\) be a boundary datum, which has a Fourier series

$$\begin{aligned} f(y)=\sum _{n=0}^\infty f_n Y_n(y), \end{aligned}$$

and introduce

$$\begin{aligned} \displaystyle Q^n_{j,m}=\left\{ \begin{array}{ll} \displaystyle \prod _{\ell =j}^m \frac{a_\ell +i\mu _n}{a_\ell -i\mu _n} &{}\hbox { for }m\ge j,\\ 1 &{}\hbox { for }m<j, \end{array}\right. \end{aligned}$$
(5.14)

for \(n\ne N\). Now, \(\phi _j\) in the recursions (3.4) are represented by a Fourier series similar to (5.1),

$$\begin{aligned} \phi _j(x,y)=(A_N^j+B_N^jx)Y_N(y) +\sum _{n\ne N} (A_n^je^{i\mu _nx}+B_n^je^{-i\mu _nx})Y_n(y) \end{aligned}$$

with \(A^0_n=A_n\) and \(B^0_n=B_n\).

\({\underline{\text {Non-cutoff modes}, n\ne N{:}}}\) By (3.4) it is easily shown that

$$\begin{aligned} \begin{aligned} (a_j-i\mu _n)A_n^{j+1}&=(a_j+i\mu _n)A_n^j,\\ (a_j+i\mu _n)B_n^{j+1}&=(a_j-i\mu _n)B_n^j \end{aligned} \end{aligned}$$
(5.15)

for all j. If \(a_j+i\mu _n\ne 0\) for all \(0\le j\le P\), then it holds that

$$\begin{aligned} A_{n}^{j}=Q^n_{0,j-1}A_n \ \hbox { and }B_n^j =\frac{1}{Q^n_{0,j-1}}B_n \quad \hbox { for }\quad 0\le j\le P. \end{aligned}$$

The coefficients \(A_n\) and \(B_n\) of the approximate solution u in (5.1) are determined by the system of linear equations

$$\begin{aligned} A_n +B_n&=f_n,\\ e^{i\mu _nb}Q^n_{0,P}A_n + (e^{i\mu _nb}Q^n_{0,P})^{-1}B_n&=0, \end{aligned}$$

from which one can easily see that

$$\begin{aligned} A_n=\frac{f_n}{1-(e^{i\mu _nb}Q^n_{0,P})^2} \ \ \hbox { and }\ \ B_n=\frac{-(e^{i\mu _nb}Q^n_{0,P})^2f_n}{1-(e^{i\mu _nb}Q^n_{0,P})^2}. \end{aligned}$$
(5.16)

If \(a_j+i\mu _n= 0\) for some j, then a similar computation shows that \(A_n^j=Q^n_{0,j-1}A_n\) and \(B_n^j=0\) for all j and hence (5.16) is still valid.
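
To make the formulas (5.14) and (5.16) concrete, the following sketch evaluates \(Q^n_{j,m}\), \(A_n\) and \(B_n\) for one propagating and one evanescent axial frequency. The parameter values, the wavenumber and the function names are illustrative assumptions and are not taken from the experiments of this paper; an evanescent mode is entered as \(\mu _n=i{\tilde{\mu }}_n\).

```python
import cmath

def Q(a, mu, j, m):
    """Product Q^n_{j,m} of (5.14); the empty product (m < j) equals 1."""
    q = 1 + 0j
    for l in range(j, m + 1):
        q *= (a[l] + 1j * mu) / (a[l] - 1j * mu)
    return q

def mode_coefficients(a, mu, b, f_n):
    """Coefficients A_n, B_n of (5.16) for a non-cutoff axial frequency mu."""
    q = cmath.exp(1j * mu * b) * Q(a, mu, 0, len(a) - 1)
    A = f_n / (1 - q * q)
    B = -q * q * f_n / (1 - q * q)
    return A, B

# Assumed sample data: k = 10, two parameters of the form -i*k*c_j, two real parameters.
a = [-10j, -5j, 3.0, 8.0]
b = 0.1
for mu in (6.0, 4j):  # a propagating frequency and an evanescent one (mu = i*mu_tilde)
    A, B = mode_coefficients(a, mu, b, 1.0)
    print(mu, abs(A - 1.0), abs(B))  # deviation from the exact coefficients A_n = f_n, B_n = 0
```

Both printed deviations are governed by \(|e^{i\mu _nb}Q^n_{0,P}|^2\), so they shrink as parameters are placed closer to \(-i\mu _n\).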

\({\underline{\text {Cutoff modes, } n=N:}}\) By the recursive relations (3.4), we observe

$$\begin{aligned} B_N^j = B_N^{j+1} , \ \ B_N^j + a_j A_N^j = -B_N^{j+1} + a_j A_N^{j+1}, \end{aligned}$$
(5.17)

which implies

$$\begin{aligned} B_N^j=B_N \quad \hbox { and }\quad A_N^j=A_N+2\sum _{\ell =0}^{j-1}\frac{1}{a_\ell }B_N \end{aligned}$$

for \(j=1,\ldots ,P+1\). From the boundary condition \(A_N^0=f_N\) and the terminal condition \(\phi _{P+1}=0\) at \(x=b\), that is

$$\begin{aligned} A_N^{P+1} + B_N^{P+1} b = 0, \end{aligned}$$
(5.18)

we find

$$\begin{aligned} A_N=f_N \ \hbox { and }\ B_N = \frac{ -f_N}{b+ 2 \sum _{j=0}^{P} a_j^{-1}}. \end{aligned}$$
(5.19)

The formula (5.19) reveals the convergence of cutoff modes provided \(\sum _{j=0}^Pa_j^{-1}\rightarrow \infty \).
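
The cutoff-mode formula can be illustrated in the same way. The sketch below evaluates (5.19), regenerates the coefficients \(A_N^j\) from the recursion (5.17), and checks that the last recursion value plus \(B_Nb\) vanishes, which is the condition that reproduces (5.19). The parameter sets and function names are hypothetical.

```python
def cutoff_coefficients(a, b, f_N):
    """A_N and B_N of (5.19) for the cutoff mode (mu_N = 0)."""
    s = sum(1 / a_j for a_j in a)
    return f_N, -f_N / (b + 2 * s)

def cutoff_profile(a, b, f_N):
    """Coefficients A_N^j, j = 0, ..., P+1, generated by the recursion (5.17)."""
    A, B = cutoff_coefficients(a, b, f_N)
    out = [A]
    for a_j in a:
        out.append(out[-1] + 2 * B / a_j)  # A_N^{j+1} = A_N^j + 2 B_N / a_j
    return out, B

b = 0.1
for a in ([1.0, 0.5], [1.0, 0.5, 0.25, 0.125]):
    profile, B = cutoff_profile(a, b, 1.0)
    # |B_N| is the cutoff-mode error; the second entry is the terminal residual.
    print(len(a), abs(B), abs(profile[-1] + B * b))
```

Adding the smaller parameters 0.25 and 0.125 increases \(\sum _j a_j^{-1}\) and therefore reduces \(|B_N|\), in line with the convergence statement above.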

We note that, if a cutoff mode is known to be present, better results could be obtained by changing the termination condition (3.8) to

$$\begin{aligned} {\frac{\partial }{\partial x}} \phi _{P+1} =0, \end{aligned}$$
(5.20)

since cutoff modes do not have any variation along the axis of the waveguide. In fact, the CRBC terminated by (5.20) yields coefficients \(A_n\) and \(B_n\) of approximate solutions such that

$$\begin{aligned} A_n=\frac{f_n}{1+(e^{i\mu _nb}Q^n_{0,P})^2} \ \hbox { and }\ B_n=\frac{(e^{i\mu _nb}Q^n_{0,P})^2f_n}{1+(e^{i\mu _nb}Q^n_{0,P})^2}, \end{aligned}$$

which converge to the exact coefficients at the same rate as those of (5.16) obtained with the Dirichlet termination condition, whereas \(A_N=f_N\) and \(B_N=0\), which coincide with those of the exact solution. However, this would change the form of the boundary system and require further analysis. Thus we do not consider it here but refer readers to [23].

Alternatively, we can guarantee rapid convergence independent of the distribution of eigenvalues by using nodes which converge to 0 geometrically, for example Newman’s nodes \(a_j=-ik e^{-j/\sqrt{P}}\) for propagating modes and/or their analogous form in the evanescent regime [7, 22]. With such a choice our bounds on the stability constants degenerate like \(e^{\sqrt{P}}\); nevertheless, our experiments, presented in Sect. 10, indicate that the discretized problem retains the convergence rate expected at the continuous level with increasing P, as long as the mesh size is small enough to compensate for the degenerating stability constants.
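
The trade-off described here can be made visible with a few lines of code. The sketch below assumes geometrically decaying nodes of the form \(a_j=-ike^{-j/\sqrt{P}}\), \(j=0,\ldots ,P\), and reports two quantities: the factor \(\max _j 1/|a_j|\), on which the constant \(C_a\) of Lemma 4.5 depends, and the cutoff-mode factor \(|b+2\sum _j a_j^{-1}|^{-1}\) from (5.19). The values of k and b and the function names are illustrative assumptions.

```python
import math

def geometric_decay_nodes(k, P):
    """Assumed nodes a_j = -i*k*exp(-j/sqrt(P)), j = 0, ..., P."""
    return [-1j * k * math.exp(-j / math.sqrt(P)) for j in range(P + 1)]

def stability_factor(nodes):
    """max_j 1/|a_j|, the quantity on which C_a in Lemma 4.5 depends."""
    return max(1 / abs(a) for a in nodes)

def cutoff_error(nodes, b):
    """|b + 2*sum_j 1/a_j|^{-1}, the cutoff-mode factor appearing in (5.19)."""
    return 1 / abs(b + 2 * sum(1 / a for a in nodes))

k, b = 10.0, 0.1
for P in (4, 16, 36):
    nodes = geometric_decay_nodes(k, P)
    print(P, k * stability_factor(nodes), math.exp(math.sqrt(P)), cutoff_error(nodes, b))
```

For these nodes \(k\max _j 1/|a_j|\) equals \(e^{\sqrt{P}}\), while the cutoff-mode factor decreases with P.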

6 Convergence of approximate solutions satisfying CRBCs

In this section, we show convergence of approximate solutions satisfying CRBCs. As we have seen above, the error of the cutoff mode is estimated in terms of

$$\begin{aligned} S_P=|b+2\sum _{j=0}^P a_j^{-1}|^{-1}, \end{aligned}$$

which approaches zero as the order P increases. For non-cutoff modes the error is controlled by the following factor

$$\begin{aligned} \left| -e^{i\mu _nb}(Q^n_{0,P})^2\right| =\left\{ \begin{array}{ll} \displaystyle \prod _{j=0}^{n_p-1} \left| \frac{{a}_j + i \mu _n}{{a}_j -i \mu _n} \right| ^2 &{} \hbox { for }0\le n\le N-1,\\ e^{-{\tilde{\mu }}_n b}\displaystyle \prod _{j=n_p}^{n_p+n_e} \left| \frac{{a}_j - \tilde{\mu }_n}{{a}_j +\tilde{\mu }_n} \right| ^2 &{}\hbox { for }N+1 \le n. \end{array} \right. \end{aligned}$$

Since \(\lim _{n\rightarrow \infty }|Q^n_{0,P}|=1\), the error does not decay exponentially as a function of P. However, since the factor \(e^{i\mu _nb}\) decays exponentially for large n, we can bound the error almost by an exponential function of P (except for the cutoff mode) in the sense of the following theorem. The optimal choice of parameters would depend on knowledge of the axial frequencies \(\mu _n\), \(\tilde{\mu }_n\). Later on we will advocate a simpler approach based only on the knowledge of intervals containing the axial frequencies. We then introduce the min–max problems determining the reflection coefficients for each \(n\ne N\),

$$\begin{aligned} \rho _p&= \min _{a_0,\ldots ,a_{n_p-1}\in i{\mathbb {R}}_-} \max _{\mu _{N-1} \le \eta \le k} \prod _{j=0}^{n_p-1} \left| \frac{{a}_j + i \eta }{{a}_j -i \eta } \right| ^2, \end{aligned}$$
(6.1)
$$\begin{aligned} \rho _e&= \min _{a_{n_p},\ldots ,a_{n_p+n_e}\in {\mathbb {R}}_{+}}\max _{{\tilde{\mu }}_{N+1} \le {\tilde{\eta }} \le M_\sigma } e^{-{\tilde{\eta }} b} \prod _{j=n_p}^{n_p+n_e} \left| \frac{{a}_j - \tilde{\eta }}{{a}_j +\tilde{\eta }} \right| ^2. \end{aligned}$$
(6.2)

Here we recall that \(M_\sigma \) is chosen so that \(e^{-M_\sigma b}\) is less than a prescribed error tolerance. It is shown in [30] that the reflection coefficients can be reduced at an exponential rate with respect to the number of parameters used,

$$\begin{aligned} \begin{aligned} \rho _p&\le e^{-Cn_p/\ln (k/\mu _{N-1})},\\ \rho _e&\le e^{-{\tilde{\mu }}_{N+1}b}e^{-Cn_e/\ln (M_\sigma /{\tilde{\mu }}_{N+1})}. \end{aligned} \end{aligned}$$
(6.3)

by selecting parameters which attain the minima in (6.1)–(6.2). These are easy to compute in practice using the Remez algorithm, and in the case of (6.1) they are known analytically (see [7]).

Theorem 6.1

Suppose that f is in \(H^{1/2}({\varGamma }_W)\), \(u^{ex}\) is the exact radiating solution to the problem (2.1)–(2.3) and u is the solution to the problem (3.17)–(3.21). Then it holds that

$$\begin{aligned} \Vert u-u^{ex}\Vert _{H^1({\varOmega }_b)}\le C\rho (M_\sigma ,n_p,n_e) \Vert f\Vert _{H^{1/2}({\varGamma }_W)}, \end{aligned}$$
(6.4)

where

$$\begin{aligned} \rho (M_\sigma ,n_p,n_e)= \max \{ S_P, e^{-Cn_p/\ln (k/\mu _{N-1})}, e^{-{\tilde{\mu }}_{N+1}b} e^{-Cn_e/\ln (M_\sigma /{\tilde{\mu }}_{N+1})}, e^{-M_\sigma b} \}. \end{aligned}$$

Remark 6.2

We have not attempted to sharply estimate the dependence of the inequality (6.4) on the wave number k, or on the k-dependence of inequalities (8.3), (8.4), (9.3), or (9.4). From the arguments given we can only derive bounds which grow very rapidly with k. Numerical experiments with k as large as 100 show that the actual k-dependence of the stability and error constants is in fact quite mild.

Remark 6.3

Note that the term \(S_P\) is absent if no cutoff modes exist. Then we have that with node choices satisfying (6.1)–(6.2) and an error tolerance \(\tau \)

$$\begin{aligned} P \propto \ln {\left( \frac{1}{\tau } \right) } \cdot \ln {\left( \ln {\left( \frac{1}{\tau } \right) }\right) } \end{aligned}$$
(6.5)

suffices.

To prove Theorem 6.1, we start by studying the regularity of solutions satisfying the exact radiation condition given by the Dirichlet-to-Neumann map on the artificial boundary \({\varGamma }_E\). For \(0\le s\le 2\), let \(T:H^{s}({\varGamma }_E)\rightarrow H^{s-1}({\varGamma }_E)\) be the Dirichlet-to-Neumann map defined by

$$\begin{aligned} Tv=\sum _{n=0}^\infty i\mu _n v_n Y_n \end{aligned}$$

for \(v=\sum _{n=0}^\infty v_n Y_n\) in \(H^{s}({\varGamma }_E)\). We consider the problem with the exact boundary condition associated with the Dirichlet-to-Neumann map T: For \(g_{in}\in H^{s}({\varOmega }_b)\) and \(g_{bd}\in H^{s+1/2}({\varGamma }_E)\) with \(-1\le s\le 0\),

$$\begin{aligned} {\varDelta }u + k^2 u&=g_{in}\quad \hbox { in }\quad {\varOmega }_b,\nonumber \\ u&=0 \quad \hbox { on }{\varGamma }_W, \quad \frac{\partial u}{\partial \nu }=0\quad \hbox { on }{\varGamma }_T,\nonumber \\ \frac{\partial u}{\partial x}-Tu&=g_{bd} \quad \hbox { on }{\varGamma }_E. \end{aligned}$$
(6.6)

As in [3], it can be shown that the regularity of solutions satisfying the exact boundary condition holds by transforming the problem to one without the Dirichlet condition on \({\varGamma }_W\) via the odd reflection with respect to \({\varGamma }_W\).

Lemma 6.4

For \(g_{in}\in H^{s}({\varOmega }_b)\) and \(g_{bd}\in H^{s+1/2}({\varGamma }_E)\) with \(-1\le s\le 0\), the problem (6.6) admits a unique solution in \(H^{s+2}({\varOmega }_b)\). Moreover, there exists a positive constant C such that

$$\begin{aligned} \Vert u\Vert _{H^{s+2}({\varOmega }_b)} \le C(\Vert g_{in}\Vert _{H^{s}({\varOmega }_b)} +\Vert g_{bd}\Vert _{H^{s+1/2}({\varGamma }_E)}). \end{aligned}$$

Now, the proof of Theorem 6.1 is as follows.

Proof of Theorem 6.1

We first note that the error function \(z=u-u^{ex}\) satisfies

$$\begin{aligned} {\varDelta }z + k^2 z&= 0 \quad \hbox { in }{\varOmega }_b,\\ z&=0 \quad \hbox { on }{\varGamma }_W, \quad \frac{\partial z}{\partial \nu }=0\quad \hbox { on }{\varGamma }_T,\\ \frac{\partial z}{\partial x}-Tz&=g_{bd}\quad \hbox { on }{\varGamma }_E, \end{aligned}$$

where \(g_{bd}\) has the Fourier series

$$\begin{aligned} g_{bd}=\frac{\partial u}{\partial x}-Tu= \frac{-1}{b+2\sum _{j=0}^P a_j^{-1}}f_{N}Y_{N}+ \sum _{n\ne N} \frac{2i\mu _ne^{i\mu _nb}(Q^n_{0,P})^2}{1-(e^{i\mu _nb}Q^n_{0,P})^2}f_nY_n \end{aligned}$$
(6.7)

by using (5.16) and (5.19).

Let \(N_*>N\) be the largest integer such that \({\tilde{\mu }}_{N_*} \le M_\sigma \). Since \(|1-(e^{i\mu _nb}Q^n_{0,P})^2|\) is bounded away from zero for all \(n\ge 0\) and \(|\mu _n|^2\le C(\lambda _n^2+1)\) for all \(n\ne N\), by (6.3) we obtain

$$\begin{aligned} \Vert g_{bd}\Vert _{H^{-1/2}({\varGamma }_E)}^2&\le C\left( \sum _{0\le n\le N-1} e^{-2Cn_p/\ln (k/\mu _{N-1})} \frac{|\mu _nf_n|^2}{(1+\lambda _n^2)^{1/2}}\right. \\&\qquad \left. +\sum _{N+1\le n \le N_*} e^{-2{\tilde{\mu }}_nb} e^{-2Cn_e/\ln (M_\sigma /{\tilde{\mu }}_{N+1})} \frac{|\mu _nf_n|^2}{(1+\lambda _n^2)^{1/2}}\right. \\&\qquad \left. +\sum _{N_*+1\le n} e^{-2M_\sigma b} \frac{|\mu _nf_n|^2}{(1+\lambda _n^2)^{1/2}} +\frac{1}{|b+2\sum _{j=0}^Pa_j^{-1}|^2}|f_{N}|^2\right) \\&\le C\rho (M_\sigma ,n_p,n_e)^2\Vert f\Vert _{H^{1/2}({\varGamma }_W)}^2. \end{aligned}$$

Finally, Lemma 6.4 completes the proof of (6.4). \(\square \)

Remark 6.5

When the parameters \(a_j\) are chosen such that

$$\begin{aligned} a_j=-i\mu _j \hbox { for }j=0,\ldots ,N-1 \quad \hbox { and }\quad a_j={\tilde{\mu }}_{j+1} \hbox { for }j=N,\ldots ,P, \end{aligned}$$

the CRBCs behave as the exact boundary conditions for the important \(P+1\) modes, which are all propagating modes combined with slowly decaying evanescent modes. These are the modes which would produce the largest reflections without efficient absorbing boundary conditions. Since \(Q^n_{0,P}=0\) for \(n=0,\ldots , P\) and \(n\ne N\), the error is estimated as

$$\begin{aligned} \Vert u-u^{ex}\Vert _{H^1({\varOmega }_b)}\le C(S_P+ e^{-{\tilde{\mu }}_{P+1}b}) \Vert f\Vert _{H^{1/2}({\varGamma }_W)}, \end{aligned}$$

where again \(S_P\) is absent if there are no cutoff modes.
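
Remark 6.5 can be illustrated on a small model problem. The sketch below assumes cross-sectional eigenvalues \(\lambda _n=n\) and \(k=3\), so that \(N=3\) is a cutoff index; these values, like the function name, are chosen only for illustration. It builds the parameters of the remark for \(P=5\) and evaluates \(|Q^n_{0,P}|\) for the non-cutoff modes.

```python
import cmath

def Q(a, mu, j, m):
    """Product Q^n_{j,m} of (5.14); the empty product (m < j) equals 1."""
    q = 1 + 0j
    for l in range(j, m + 1):
        q *= (a[l] + 1j * mu) / (a[l] - 1j * mu)
    return q

# Assumed model spectrum: lambda_n = n and k = 3, hence mu_3 = 0 (cutoff index N = 3).
k, N, P = 3, 3, 5
mu = [cmath.sqrt(k * k - n * n) for n in range(P + 3)]  # mu_n = i*mu_tilde_n for n > N

# Parameters of Remark 6.5: a_j = -i*mu_j for j < N, a_j = mu_tilde_{j+1} for N <= j <= P.
a = [-1j * mu[j] for j in range(N)] + [mu[j + 1].imag for j in range(N, P + 1)]

reflection = {n: abs(Q(a, mu[n], 0, P)) for n in range(P + 3) if n != N}
for n, r in reflection.items():
    print(n, r)
```

In this instance the factor vanishes for every non-cutoff mode up to \(n=P+1\), because \(a_P={\tilde{\mu }}_{P+1}\), and is nonzero for \(n=P+2\), whose contribution is then controlled by the exponential decay across \({\varOmega }_b\).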

7 Parameter selection

The general error formulas derived in the preceding section can be used to guide the selection of optimal parameters. Experiments with an automatic parameter selection algorithm will be reported elsewhere; here we will make selections which, though suboptimal, show that the number of parameters will be small even for difficult cases.

Optimal parameters for a fixed P, chosen independent of f and minimizing the error in the Fourier coefficients at \(x=b\), would be those which minimize the maximum over \(n\ne N\) of

$$\begin{aligned} \rho \equiv \left| -e^{i\mu _nb}(Q^n_{0,P})^2\right| = \left\{ \begin{array}{ll} \left| (Q^n_{0,P})^2\right| &{} \hbox { for }\mu _n^2 >0, \\ \left| e^{-\tilde{\mu }_n b}(Q^n_{0,P})^2\right| &{} \hbox { for }\mu _n^2 <0. \end{array} \right. \end{aligned}$$
(7.1)

Note that the number of propagating modes is finite, as is the number of evanescent modes satisfying \(e^{- \tilde{\mu }_n b} > \tau \) for any error tolerance \(\tau \). The remaining evanescent modes are sufficiently small at the boundary, so the value of \(\left| (Q^n_{0,P})^2 \right| \le 1\) is unimportant. Moreover, the number of important modes increases with increasing k; for k small it is feasible to directly compute this small number of modes and choose parameters which are exact on these modes. (For a discussion of conditions using a different set of auxiliary variables which are exact for propagating modes, see Bendali and Guillaume [3].)

Here we look at the simpler problem of minimizing \(\rho \) over an entire interval rather than over a discrete set. We introduce the following scalings:

$$\begin{aligned} \eta \equiv \mu /k \ \ ({\tilde{\eta }}\equiv {\tilde{\mu }}/k), \ \ {\tilde{a}}_j \equiv {a}_j/k , \ \ b= 2 \pi k^{-1} n_{\lambda } , \end{aligned}$$

where \(n_{\lambda }\) is the number of wavelengths of the normally propagating mode, \(e^{ikx}\), on the interval [0, b]. Now, we explicitly assume that \(\mu _n \ne 0\); that is, the cutoff mode is absent. To perform the optimizations we quantify the gap in the spectrum near 0

$$\begin{aligned} \eta ^2 \ge c_0^2 \ \ \quad \mathrm{and}\quad \ \ {\tilde{\eta }}^2 \ge g_0^2 \end{aligned}$$
(7.2)

for some constants \(c_0\) and \(g_0\). In practice, \(c_0\) and \(g_0\) would approximate the scaled smallest axial frequency, \(\mu _{N-1}/k\), of propagating modes and the scaled smallest decay rate, \({\tilde{\mu }}_{N+1}/k\), of evanescent modes, respectively. We then consider the reflection coefficients

$$\begin{aligned} \rho _p&= \max _{c_0 \le \eta \le 1} \prod _{j=0}^{n_p-1} \left| \frac{\tilde{a}_j + i \eta }{\tilde{a}_j -i \eta } \right| ^2, \end{aligned}$$
(7.3)
$$\begin{aligned} \rho _e&= \max _{\tilde{\eta } \ge g_0} e^{-2 \pi n_{\lambda } \tilde{\eta }} \prod _{j=n_p}^{n_p+n_e} \left| \frac{\tilde{a}_j - \tilde{\eta }}{\tilde{a}_j +\tilde{\eta }} \right| ^2. \end{aligned}$$
(7.4)

For fixed values of \(n_p\) and \(n_e\), we can compute optimal parameters using the Remez algorithm (see, e.g., [30]). For instance, consider the truncated waveguide \({\varOmega }_b\) defined with \(W=1, b=0.1\). When the wavenumber is \(k=100\), there are 32 propagating modes in the acoustic pressure field. For \(n_p=4\) and \(n_e=3\), the Remez algorithm applied to the minimization of the maximal reflection coefficients (7.3) and (7.4) produces damping parameters for which the reflection coefficients, as functions of n, are presented in Fig. 3. The figure indicates that the reflection of all propagating modes and evanescent modes can be reduced to at most \(3.9590\times 10^{-6}\) and \(5.3492\times 10^{-5}\), respectively. Here the upper bound for \({\tilde{\eta }}\) in the Remez algorithm is chosen so that the modes between the vertical green lines are damped effectively. Note that our simple Matlab implementation of the Remez algorithm, which uses a geometric sequence as an initial guess, has converged rapidly for all the cases considered here. The authors will provide it to any interested readers.

To determine the smallest P for a given tolerance, \(\tau \), as a function of \(c_0\), \(g_0\) and \(n_{\lambda }\) we simply find the smallest values of \(n_p\) and \(n_e\) such that the optimal nodes chosen by the Remez algorithm lead to \(\rho _p \le \tau \), \(\rho _e \le \tau \).
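
The search for the smallest \(n_p\) can be sketched as follows. This is not the Remez computation described above: as a stand-in, the sketch uses log-uniform (geometric) nodes in \((c_0,1)\), which are suboptimal, and it estimates the maximum in (7.3) on a finite grid. All function names, the node formula and the grid size are assumptions made for illustration.

```python
def geometric_nodes(c0, n_p):
    """Hypothetical suboptimal nodes a~_j = -i*c_j with c_j log-uniform in (c0, 1)."""
    return [-1j * c0 ** ((2 * j + 1) / (2 * n_p)) for j in range(n_p)]

def rho_p(nodes, c0, samples=2000):
    """Grid estimate of the maximum in (7.3) over c0 <= eta <= 1."""
    worst = 0.0
    for i in range(samples + 1):
        eta = c0 ** (1 - i / samples)  # log-spaced grid, both endpoints included
        prod = 1.0
        for a in nodes:
            prod *= abs((a + 1j * eta) / (a - 1j * eta)) ** 2
        worst = max(worst, prod)
    return worst

def smallest_n_p(c0, tau, n_max=200):
    """Smallest n_p for which the geometric nodes meet the tolerance tau on the grid."""
    for n_p in range(1, n_max + 1):
        if rho_p(geometric_nodes(c0, n_p), c0) <= tau:
            return n_p
    raise ValueError("tolerance not reached with n_max nodes")

for c0 in (1e-2, 1e-4):
    for tau in (1e-3, 1e-5):
        print(c0, tau, smallest_n_p(c0, tau))
```

Since the nodes are not optimal, the counts printed here can only overestimate those obtained with the Remez algorithm; the same loop applies with optimized nodes substituted for `geometric_nodes`.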

Note that these approximations can be directly related to optimal approximation of the square root function, which was solved by Zolotarev using elliptic functions [30]. The error estimates developed in [7, 22] state the error in the Zolotarev approximation of degree \((d-1,d)\) on the interval \([z_0 , z_1]\) to be of the order \(e^{- \pi ^2 d/\ln {(z_1/z_0)}}\). For propagating modes this implies

$$\begin{aligned} n_p \propto \ln {\left( \frac{1}{\tau } \right) } \cdot \ln {\left( \frac{1}{c_0} \right) } . \end{aligned}$$
(7.5)

For evanescent modes we note that the largest relevant value of \(\tilde{\eta }\) scales like \(n_{\lambda }^{-1} \ln {\left( \frac{1}{\tau } \right) }\). Thus we conclude that

$$\begin{aligned} n_e \propto \ln {\left( \frac{1}{\tau } \right) } \cdot \ln {\left( \frac{1}{n_\lambda g_0} \right) } + \ln {\left( \frac{1}{\tau } \right) } \cdot \ln {\ln {\left( \frac{1}{\tau } \right) }} . \end{aligned}$$
(7.6)
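
The count (7.6) can be traced as follows; this is a sketch which assumes that the Zolotarev rate quoted above applies on the interval \([g_0,{\tilde{\eta }}_{\max }]\), where \({\tilde{\eta }}_{\max }\) is a symbol introduced here for the largest relevant value of \({\tilde{\eta }}\). The exponential factor in (7.4) is below \(\tau \) once \(2\pi n_\lambda {\tilde{\eta }}\ge \ln (1/\tau )\), so that

$$\begin{aligned} {\tilde{\eta }}_{\max }=\frac{1}{2\pi n_\lambda }\ln {\left( \frac{1}{\tau } \right) },\qquad n_e \propto \ln {\left( \frac{1}{\tau } \right) } \cdot \ln {\left( \frac{{\tilde{\eta }}_{\max }}{g_0} \right) } =\ln {\left( \frac{1}{\tau } \right) }\left[ \ln {\left( \frac{1}{n_\lambda g_0} \right) }+\ln {\ln {\left( \frac{1}{\tau } \right) }}-\ln (2\pi )\right] , \end{aligned}$$

and dropping the constant \(\ln (2\pi )\) gives (7.6).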

Fig. 3 Reflection coefficients \(|\rho _p|\) and \(|\rho _e|\) as functions of n with the optimal parameters obtained by the Remez algorithm

We carried out the optimizations discussed above for the parameters

$$\begin{aligned} c_0\in \{10^{-2},10^{-4}\}, \quad g_0\in \{10^{-2},10^{-4}\}, \quad n_{\lambda }\in \{1,0.1\}, \quad \tau \in \{10^{-3},10^{-5}\}. \end{aligned}$$

The results are shown in Table 1. The parameter counts obtained with the Remez algorithm are consistent with the estimates (7.5)–(7.6). We emphasize that these results are definitely suboptimal as they do not take account of the actual modal distributions. Methods for constructing better parameters may be based, for example, on rational Krylov algorithms [8, 14, 28] applied to the finite element discretization of the cross-sectional Laplace operator.

Table 1 Number of terms needed to meet the tolerance, \(\tau \), for select values of \(c_0\), \(g_0\), and \(n_{\lambda }\)

In practice, then, we recommend the following procedure to select the method parameters. Given a choice of b, which can be taken as the separation between the radiation boundary and any sources, scatterers, or inhomogeneities, and an error tolerance, \(\tau \):

  i. If possible, estimate the number of important modes; in many cases this can be done based on the frequency, k, and the geometry of the cross-section using standard inequalities on the spectrum of elliptic operators [6]. If this is small enough, for propagating modes, evanescent modes, or both, application of a Lanczos algorithm [32] will produce them at minimal cost. Then choose the parameters to exactly absorb these modes.

  ii. If the use of exact conditions is deemed inefficient, again for propagating modes, evanescent modes, or both, use the Lanczos algorithm to compute the eigenvalues nearest \(k^2\) and use that information to define the intervals for input into the Remez algorithm.
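
The first step of this procedure can be sketched for the model waveguide used earlier in this section. The code below assumes that the cross-sectional eigenvalues are known in closed form as \(\lambda _n=n\pi /W\), \(n=0,1,\ldots \), which replaces the spectral estimates and the Lanczos computation mentioned above; the function name and this eigenvalue formula are assumptions made for illustration.

```python
import math

def count_important_modes(k, W, b, tau, n_max=10_000):
    """Count propagating modes and evanescent modes with exp(-mu_tilde_n * b) > tau.

    Assumes cross-sectional eigenvalues lambda_n = n*pi/W, n = 0, 1, ...
    """
    n_prop = n_evan = 0
    for n in range(n_max):
        mu2 = k * k - (n * math.pi / W) ** 2  # mu_n^2 = k^2 - lambda_n^2
        if mu2 > 0:
            n_prop += 1
        elif mu2 < 0:
            if math.exp(-math.sqrt(-mu2) * b) > tau:
                n_evan += 1
            else:
                break  # decay rates increase with n, so all later modes are negligible
    return n_prop, n_evan

n_prop, n_evan = count_important_modes(k=100, W=1, b=0.1, tau=1e-3)
print(n_prop, n_evan)
```

For \(W=1\), \(b=0.1\) and \(k=100\) the propagating count is 32, the number quoted for this example above. If both counts are small, the parameters can be chosen to absorb these modes exactly; otherwise the extreme values of \(\mu _n\) and \({\tilde{\mu }}_n\) found in the loop supply the intervals for the Remez algorithm.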

8 Stability and regularity of the variational problem

In this section, we study the stability and regularity of the variational problems

$$\begin{aligned} A((u,{\varPhi }),(\xi ,{\varPsi }))=(f_s,\xi )_{{\varOmega }_b} \end{aligned}$$
(8.1)

for all \((\xi ,{\varPsi })\in {\varvec{{V}}}_0\) with \(f_s \in L^2({\varOmega }_b)\) supported away from \({\varGamma }_E\), and

$$\begin{aligned} A((u,{\varPhi }),(\xi ,{\varPsi }))=(L{\varUpsilon }, {\varPsi })_{{\varGamma }_E} \end{aligned}$$
(8.2)

for all \((\xi ,{\varPsi })\in {\varvec{{V}}}_0\) with the source \(L{\varUpsilon }\), \({\varUpsilon }\in (L^2({\varGamma }_E))^{P+1}\) being given as auxiliary variables. The study of the problem (8.1) suffices to verify the stability and regularity of solutions to the problem (4.1), since the boundary value problem can be reduced to the source problem by a lifting of the boundary condition. These results will also be used in the finite element analysis.

8.1 Stability and regularity of solutions to Problem (8.1)

We note that the problem (8.1) has a unique solution in \({\varvec{{V}}}_0\) by Remark 5.3. The energy norm estimates for the solution u and the auxiliary variables \({\varPhi }\) are given in the following theorem.

Theorem 8.1

Let \(a_j\) be the parameters defined by (3.5) satisfying (3.6). Then for any \(f_s\in L^2({\varOmega }_b)\) supported away from \({\varGamma }_E\), the solution \((u,{\varPhi })\) to the problem (8.1) satisfies

$$\begin{aligned} \Vert u\Vert _{H^1({\varOmega }_b)}\le C\Vert f_s\Vert _{L^2({\varOmega }_b)} \end{aligned}$$

and

$$\begin{aligned} \Vert {\varPhi }\Vert _{\mathbf{V }_{{\varGamma }_E}}\le C_a(P+1)\Vert f_s\Vert _{L^2({\varOmega }_b)}. \end{aligned}$$

In addition, the regularity result holds,

$$\begin{aligned} \Vert u\Vert _{H^2({\varOmega }_b)} \le C\Vert f_s\Vert _{L^2({\varOmega }_b)} \end{aligned}$$
(8.3)

and

$$\begin{aligned} \Vert {\varPhi }\Vert _{\mathbf{V }^2_{{\varGamma }_E}}\le C_a(P+1)\Vert f_s\Vert _{L^2({\varOmega }_b)}. \end{aligned}$$
(8.4)

If cutoff modes are excluded, the constants \(C_a\) for the stability and regularity estimates are independent of \(a_j\) and the exponents on \((P+1)\) are halved; that is, the constants in the estimates of \({\varPhi }\) become \(C (P+1)^{1/2}\).

The proof of Theorem 8.1 rests on a sequence of lemmas giving solution formulas for the auxiliary variables. To obtain the stability estimate for problem (8.1), we need to analyze the auxiliary variables solving the problem

$$\begin{aligned} \begin{aligned} -L{\varDelta }_y {\varPhi }+ (-k^2L+M){\varPhi }&= E_0{\varvec{{e}}}_0 \quad \hbox { in }{\varGamma }_E,\\ \frac{\partial {\varPhi }}{\partial \nu }&=0 \quad \hbox { on }\partial {\varGamma }_E. \end{aligned} \end{aligned}$$
(8.5)

The nth Fourier coefficients \({\varPhi }^n\) of \({\varPhi }\) satisfy the equation

$$\begin{aligned} -\mu _n^2L{\varPhi }^n +M{\varPhi }^n = E_0^n{\varvec{{e}}}_0. \end{aligned}$$
(8.6)

We start by finding the explicit form of the solution \({\varPhi }^n\) with \(E_j^n{\varvec{{e}}}_j\) for \(j=0,\ldots ,P\) on the right hand side of (8.6), recalling the definition (5.14) of \(Q^n_{j,m}\) for \(n\ne N\).

Lemma 8.2

Suppose that \(a_j\ne -i\mu _n\) and \(\mu _n\) is not a cutoff axial frequency, i.e., \(\mu _n \ne 0\). Let \({\varPhi }^n \in {\mathbb {C}}^{P+1}\) be a solution to the linear system (8.6) with \(E_j^n{\varvec{{e}}}_j\) on the right hand side. Then \(\phi _\ell ^n\) is given by the formula \(\phi _\ell ^n=s^n_{\ell ,j}E^n_j\), where

$$\begin{aligned} s^n_{\ell ,j}=\left\{ \begin{array}{ll} \displaystyle \frac{(1+(Q^n_{0,\ell -1})^2)Q^n_{\ell ,j-1}(1-(Q^n_{j,P})^2)}{-4i\mu _n(1+(Q^n_{0,P})^2)} &{} \hbox { if }\ell \le j,\\ \displaystyle \frac{(1+(Q^n_{0,j-1})^2)Q^n_{j,\ell -1} (1-(Q^n_{\ell ,P})^2)}{-4i\mu _n(1+(Q_{0,P}^n)^2)} &{} \hbox { if }\ell \ge j. \end{array} \right. \end{aligned}$$
(8.7)

Proof

We will find the solution \({\varPhi }^n\) in the form

$$\begin{aligned} \phi ^n_\ell =\left\{ \begin{array}{ll} Q^n_{0,\ell -1}{\widetilde{A}}_n+\displaystyle \frac{1}{Q^n_{0,\ell -1}}{\widetilde{B}}_n &{}\hbox { for }\ell =0,1,\ldots ,j,\\ Q^n_{j,\ell -1}{\widetilde{C}}_n+\displaystyle \frac{1}{Q^n_{j,\ell -1}}{\widetilde{D}}_n &{}\hbox { for }\ell =j,j+1,\ldots ,P \end{array}\right. \end{aligned}$$
(8.8)

for \(0<j<P\). When \(j=0\) or P, we assume that \(\phi ^n_\ell \) is defined by the upper formula with \(\ell =0,1,\ldots ,P\). Here we will verify the formulas for \(0<j<P\), as the other cases can be treated with only small modifications.

By the definition of \(Q^n_{j,m}\) one can easily show that the three term recursions

$$\begin{aligned}&(-\mu _n^2 L_{\ell ,\ell -1}+M_{\ell ,\ell -1}) \phi ^n_{\ell -1} +(-\mu _n^2 L_{\ell ,\ell }+M_{\ell ,\ell }) \phi ^n_\ell \\&\quad +(-\mu _n^2 L_{\ell ,\ell +1}+M_{\ell ,\ell +1}) \phi ^n_{\ell +1}=0 \end{aligned}$$

hold for \(\ell \ne 0, j, P\). Thus, the four unknowns \({\widetilde{A}}_n\), \({\widetilde{B}}_n\), \({\widetilde{C}}_n\) and \({\widetilde{D}}_n\) are to be determined by

$$\begin{aligned} -2i\mu _n({\widetilde{A}}_n-{\widetilde{B}}_n)=0 \end{aligned}$$
(8.9)

from the 0th equation,

$$\begin{aligned} Q^n_{0,\ell -1}{\widetilde{A}}_n+\displaystyle \frac{1}{Q^n_{0,\ell -1}}{\widetilde{B}}_n={\widetilde{C}}_n+{\widetilde{D}}_n \end{aligned}$$
(8.10)

from the definition of \(\phi ^n_\ell \) with \(\ell =j\),

$$\begin{aligned} \left( Q^n_{0,\ell -1}{\widetilde{A}}_n-\displaystyle \frac{1}{Q^n_{0,\ell -1}}{\widetilde{B}}_n\right) -({\widetilde{C}}_n-{\widetilde{D}}_n)=\frac{1}{2i\mu _n}E^n_j \end{aligned}$$
(8.11)

from the jth equation (again with \(\ell =j\)) and

$$\begin{aligned} Q^n_{j,P}{\widetilde{C}}_n + \frac{1}{Q^n_{j,P}}{\widetilde{D}}_n = 0 \end{aligned}$$
(8.12)

from the Pth equation. Solving the Eqs. (8.9)–(8.12) leads to

$$\begin{aligned} {\widetilde{A}}_n&=\frac{(1-(Q^n_{j,P})^2)Q^n_{0,j-1}}{-4i\mu _n(1+(Q^n_{0,P})^2)}E^n_j, \quad \quad {\widetilde{B}}_n=\frac{(1-(Q^n_{j,P})^2)Q^n_{0,j-1}}{-4i\mu _n(1+(Q^n_{0,P})^2)}E^n_j,\\ {\widetilde{C}}_n&=\frac{(1+(Q^n_{0,j-1})^2)}{-4i\mu _n(1+(Q^n_{0,P})^2)}E^n_j, \quad \quad {\widetilde{D}}_n=\frac{(1+(Q^n_{0,j-1})^2)(Q^n_{j,P})^2}{4i\mu _n(1+(Q^n_{0,P})^2)}E^n_j \end{aligned}$$

and hence the formula (8.7) is obtained. \(\square \)
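As a sanity check on the algebra, the closed forms above can be substituted back into (8.9)–(8.12) numerically. The following is a minimal sketch in Python and is not part of the original analysis. The function names and sample values are illustrative, and the check assumes only the multiplicative property \(Q^n_{0,j-1}Q^n_{j,P}=Q^n_{0,P}\), with \(\ell =j\) in (8.10) and (8.11).

```python
def lemma82_coeffs(q0, qj, mu, E=1.0):
    """Closed forms of (A~, B~, C~, D~) from the proof of Lemma 8.2.

    q0 stands for Q^n_{0,j-1} and qj for Q^n_{j,P}; their product plays
    the role of Q^n_{0,P} (assumed multiplicative property).
    """
    q0P = q0 * qj
    denom = -4j * mu * (1 + q0P**2)
    A = (1 - qj**2) * q0 / denom * E
    B = A                                   # (8.9) forces A~ = B~
    C = (1 + q0**2) / denom * E
    D = -(1 + q0**2) * qj**2 / denom * E    # same as the +4i*mu form in the text
    return A, B, C, D


def lemma82_residuals(q0, qj, mu, E=1.0):
    """Residuals of (8.9), (8.10), (8.11), (8.12); all should vanish."""
    A, B, C, D = lemma82_coeffs(q0, qj, mu, E)
    r9 = -2j * mu * (A - B)
    r10 = q0 * A + B / q0 - (C + D)
    r11 = (q0 * A - B / q0) - (C - D) - E / (2j * mu)
    r12 = qj * C + D / qj
    return r9, r10, r11, r12


if __name__ == "__main__":
    # Arbitrary illustrative values, not taken from the paper.
    print(max(abs(r) for r in lemma82_residuals(0.3 + 0.4j, -0.2 + 0.5j, 1.7)))
```

The check treats \(Q^n_{0,j-1}\) and \(Q^n_{j,P}\) as free complex numbers, so it tests the linear algebra only and says nothing about the size of these quantities for particular parameters \(a_j\).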

The next lemma gives solution formulas when there exists an index J such that \(a_J+i\mu _n=0\). In this case the problem can be written as two block systems. The first block system is reduced to the case in Lemma 8.2, and the formulas for the second one can be derived by a similar computation to that used in Lemma 8.2 and hence we omit the proof.

Lemma 8.3

Suppose that there exists an index J such that \(a_J+i\mu _n=0\). Let \({\varPhi }^n \in {\mathbb {C}}^{P+1}\) be a solution to the linear system (8.6) with \(E_j^n{\varvec{{e}}}_j\) on the right hand side. Then \(\phi _\ell ^n\) are given by the formula \(\phi _\ell ^n=s^n_{\ell ,j}E^n_j\), where if \(j\le J\)

$$\begin{aligned} s^n_{\ell ,j}=\left\{ \begin{array}{cl} \displaystyle \frac{-1}{4i\mu _n} (1+(Q^n_{0,\ell -1})^2)Q^n_{\ell ,j-1} &{} \hbox { if }\ell \le j,\\ \displaystyle \frac{-1}{4i\mu _n} (1+(Q^n_{0,j-1})^2)Q^n_{j,\ell -1} &{} \hbox { if }\ell \ge j \end{array} \right. \end{aligned}$$
(8.13)

and if \(j>J\)

$$\begin{aligned} s^n_{\ell ,j}=\left\{ \begin{array}{cl} \displaystyle \frac{-1}{4i\mu _n}Q^n_{\ell ,j-1}(1-(Q^n_{j,P})^2) &{} \hbox { if }\ell \le j,\\ \displaystyle \frac{-1}{4i\mu _n}Q^n_{j,\ell -1}(1-(Q^n_{\ell ,P})^2) &{} \hbox { if }\ell \ge j. \end{array} \right. \end{aligned}$$
(8.14)

We notice that these formulas in Lemma 8.3 are consistent with (8.7) since \(Q^n_{c,d}=0\) for \(c\le J\le d\).

As a special case the solution to (8.6) is given in the following lemma.

Lemma 8.4

Let \({\varPhi }^n \in {\mathbb {C}}^{P+1}\) be a solution to the linear system (8.6). For \(n\ne N\), the \(\phi _\ell ^n\) are given by

$$\begin{aligned} \phi _\ell ^n=\frac{-Q^n_{0,\ell -1} (1-(Q^n_{\ell ,P})^2)}{2i\mu _n(1+(Q^n_{0,P})^2)}E_0^n \end{aligned}$$
(8.15)

and

$$\begin{aligned} \phi _\ell ^n=\frac{Q^n_{0,\ell -1}(1-(Q^n_{\ell ,P})^2)}{(1-(Q^n_{0,P})^2)}\phi _0^n \end{aligned}$$
(8.16)

for \(\ell =0,\ldots ,P\). For \(n=N\),

$$\begin{aligned} \phi _\ell ^n =\sum _{j=\ell }^P \frac{1}{a_j}E^n_0. \end{aligned}$$
(8.17)

Proof

When \(a_j+i\mu _n\ne 0\) for all j, the formula (8.15) is obtained from (8.7) with \(j=0\). If there exists J such that \(a_J+i\mu _n=0\), then (8.15) follows immediately from (8.13) with \(j=0\), noting that \(Q^n_{\ell ,P}=0\) for \(\ell \le J\) and \(Q^n_{0,\ell -1}=0\) for \(\ell \ge J+1\). In addition, we have (8.16) by rewriting \(E_0^n\) in terms of \(\phi _0^n\).

The formula (8.17) for \(n=N\) is obtained straightforwardly by Gaussian elimination. \(\square \)
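For completeness, the rewriting that leads to (8.16) can be spelled out. This is a short worked step using only (8.15), under the assumptions that \(1-(Q^n_{0,P})^2\ne 0\) and that \(Q^n_{0,-1}=1\) (the empty-product convention, assumed here since the definition (5.14) is not restated in this section). Taking \(\ell =0\) in (8.15) gives

$$\begin{aligned} \phi _0^n=\frac{-(1-(Q^n_{0,P})^2)}{2i\mu _n(1+(Q^n_{0,P})^2)}E_0^n, \quad \hbox { i.e., }\quad E_0^n=\frac{-2i\mu _n(1+(Q^n_{0,P})^2)}{1-(Q^n_{0,P})^2}\phi _0^n, \end{aligned}$$

and substituting this expression for \(E_0^n\) into (8.15) cancels the factor \(2i\mu _n(1+(Q^n_{0,P})^2)\) and leaves exactly (8.16).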

We note that by the arithmetic-geometric mean inequality

$$\begin{aligned} \begin{aligned} \frac{1}{\sqrt{|a_\ell |}}|1+Q^n_{\ell ,\ell }|&=\frac{2\sqrt{|a_\ell \mu _n|}}{|a_\ell -i\mu _n|} \frac{1}{\sqrt{|\mu _n|}}\le \frac{C}{\sqrt{|\mu _n|}},\\ \sqrt{|a_\ell |}|1-Q^n_{\ell ,\ell }|&=\frac{2\sqrt{|a_\ell \mu _n|}}{|a_\ell -i\mu _n|}\sqrt{|\mu _n|} \le C\sqrt{|\mu _n|} \end{aligned} \end{aligned}$$
(8.18)

and \((\lambda _n^2+1)\le C|\mu _n|^2\) for \(n\ne N\).
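The two equalities in (8.18) can be checked directly. The sketch below assumes the single-factor form \(Q^n_{\ell ,\ell }=(a_\ell +i\mu _n)/(a_\ell -i\mu _n)\), which is inferred from (8.18) itself and from the fact that \(Q^n_{\ell ,\ell }\) vanishes when \(a_\ell +i\mu _n=0\), rather than quoted from the definition (5.14). Under this assumption,

$$\begin{aligned} 1+Q^n_{\ell ,\ell }=\frac{2a_\ell }{a_\ell -i\mu _n}, \qquad 1-Q^n_{\ell ,\ell }=\frac{-2i\mu _n}{a_\ell -i\mu _n}, \end{aligned}$$

so that

$$\begin{aligned} \frac{1}{\sqrt{|a_\ell |}}|1+Q^n_{\ell ,\ell }|=\frac{2\sqrt{|a_\ell |}}{|a_\ell -i\mu _n|}, \qquad \sqrt{|a_\ell |}|1-Q^n_{\ell ,\ell }|=\frac{2\sqrt{|a_\ell |}|\mu _n|}{|a_\ell -i\mu _n|}, \end{aligned}$$

which are the middle expressions in (8.18). The arithmetic-geometric mean inequality gives \(2\sqrt{|a_\ell \mu _n|}\le |a_\ell |+|\mu _n|\); the final bounds by C then additionally require \(|a_\ell |+|\mu _n|\le C|a_\ell -i\mu _n|\), which we take to be a consequence of the parameter conditions (3.5) and (3.6) and do not verify here.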

Lemma 8.5

Let \(a_j\) be the parameters defined by (3.5) satisfying (3.6). We assume that \({\varPhi }\in {\mathbf{V }}_{{\varGamma }_E}\), \(\phi _0\in H^{s+1/2}({\varGamma }_E)\) and \(E_0\in H^{s-1/2}({\varGamma }_E)\) for \(s\ge 0\). If \({\varPhi }\) and \(E_0\) satisfy (8.5), then it holds that

$$\begin{aligned} \Vert {\varPhi }\Vert _{{\mathbf{V }}_{{\varGamma }_E}}\le C_a(P+1)(\Vert E_0\Vert _{H^{-1/2}({\varGamma }_E)}+\Vert \phi _0\Vert _{H^{1/2}({\varGamma }_E)}) \hbox { for }s=0. \end{aligned}$$

In addition, we have the regularity estimate

$$\begin{aligned} \Vert {\varPhi }\Vert _{{\mathbf{V }^2}_{{\varGamma }_E}}\le C_a(P+1)(\Vert E_0\Vert _{H^{1/2}({\varGamma }_E)}+\Vert \phi _0\Vert _{H^{3/2}({\varGamma }_E)}) \hbox { for }s=1. \end{aligned}$$

If cutoff modes are excluded, the constants \(C_a\) for the stability and regularity estimates are independent of \(a_j\) and the exponents on \((P+1)\) are halved; that is, the constants in the estimates of \({\varPhi }\) become \(C (P+1)^{1/2}\).

Proof

\({\underline{\text {Cutoff modes, } n=N}}\): By using the solution formula (8.17), we have

$$\begin{aligned} \Vert {\varPhi }^N\Vert _{{\mathcal {M}}}^2&=\sum _{\ell =0}^P |a_\ell ||\phi _\ell ^N-\phi _{\ell +1}^N|^2 =\sum _{\ell =0}^P \frac{1}{|a_\ell |}|E_0^N|^2 = |E_0^N|\sum _{\ell =0}^P\frac{1}{|a_\ell |}|E_0^N|\nonumber \\&\le \sqrt{2}|E_0^N||\phi _0^N| = \frac{\sqrt{2}}{|\sum _{\ell =0}^Pa_\ell ^{-1}|}|\phi _0^N|^2 \le C|\phi ^N_0|^2. \end{aligned}$$
(8.19)

Here we used (8.17) with \(\ell =0\) for the first inequality. Also, invoking Lemma 4.5 and (8.19), we are led to

$$\begin{aligned} \Vert {\varPhi }^N\Vert _{{\mathcal {L}}}^2&\le C_a^2(P+1)^2\Vert {\varPhi }^N\Vert _{{\mathcal {M}}}^2 \le C_a^2(P+1)^2|\phi _0^N|^2. \end{aligned}$$

Thus, since \(\lambda _N=k\) is a constant, we have

$$\begin{aligned} (\lambda _N^2+1)^s((\lambda _N^2+1)\Vert {\varPhi }^N\Vert _{{\mathcal {L}}}^2+\Vert {\varPhi }^N\Vert _{{\mathcal {M}}}^2) \le C_a^2(P+1)^2(\lambda _N^2+1)^{s+1/2}|\phi _0^N|^2. \end{aligned}$$
(8.20)

\({\underline{\text {Non-cutoff modes, }n\ne N}}\): For the estimation of non-cutoff modes, we decompose \({\mathbb {N}}{\setminus }\{N\}\) into two disjoint sets \({\mathcal {N}}_1\) and \({\mathcal {N}}_2\),

$$\begin{aligned} {\mathcal {N}}_1=\{n \in {\mathbb {N}}{\setminus }\{N\} \ : \ |1+(Q^n_{0,P})^2|\ge 1 \} \quad \hbox { and }\quad {\mathcal {N}}_2= {\mathbb {N}}{\setminus } ({\mathcal {N}}_1\cup \{N\}). \end{aligned}$$

Since \(|1+z|+|1-z|\ge |(1+z)+(1-z)|=2\) for every \(z\in {\mathbb {C}}\), we have \(|1+(Q^n_{0,P})^2|\ge 1\) or \(|1-(Q^n_{0,P})^2|\ge 1\) for each \(n\ge 0\); in particular, if \(n \in {\mathcal {N}}_2\), then \(|1-(Q^n_{0,P})^2|\ge 1\). Therefore, for \(n\in {\mathcal {N}}_1\) the solution formula (8.15) implies

$$\begin{aligned} |\phi _\ell ^n+\phi ^n_{\ell +1}|= \left| \frac{Q^n_{0,\ell -1}(1-(Q^n_{\ell +1,P})^2Q^n_{\ell ,\ell })}{(1+(Q^n_{0,P})^2)} \frac{(1+Q^n_{\ell ,\ell })E_0^n}{2i\mu _n}\right| \le C\left| \frac{(1+Q^n_{\ell ,\ell })E_0^n}{2i\mu _n}\right| , \end{aligned}$$

and by (8.18) we have

$$\begin{aligned} \begin{aligned} \frac{1}{|a_\ell |}|\phi ^n_\ell +\phi ^n_{\ell +1}|^2 \le C \frac{|E_0^n|^2}{|\mu _n|^3}. \end{aligned} \end{aligned}$$
(8.21)
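The factorization behind the display preceding (8.21) is a short computation. As a sketch, write \(p=Q^n_{0,\ell -1}\), \(r=Q^n_{\ell ,\ell }\) and \(s=Q^n_{\ell +1,P}\) (shorthand introduced only for this remark), and assume the multiplicative property \(Q^n_{0,\ell }=pr\) and \(Q^n_{\ell ,P}=rs\). Then the numerators of (8.15) at the indices \(\ell \) and \(\ell +1\) combine as

$$\begin{aligned} p(1-r^2s^2)\pm pr(1-s^2)=p(1\pm r)(1\mp rs^2), \end{aligned}$$

so the sum \(\phi ^n_\ell +\phi ^n_{\ell +1}\) carries the factor \((1+Q^n_{\ell ,\ell })\) and the difference \(\phi ^n_\ell -\phi ^n_{\ell +1}\) carries the factor \((1-Q^n_{\ell ,\ell })\). These are precisely the factors to which (8.18) applies.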

A similar computation yields that

$$\begin{aligned} \begin{aligned} |a_\ell ||\phi _\ell ^n-\phi _{\ell +1}^n|^2&=|a_\ell |\left| \frac{Q^n_{0,\ell -1}(1+(Q^n_{\ell +1,P})^2Q^n_{\ell ,\ell })}{(1+(Q^n_{0,P})^2)} \frac{(1-Q^n_{\ell ,\ell })E_0^n}{2i\mu _n}\right| ^2\\&\le C|a_\ell ||1-Q^n_{\ell ,\ell }|^2 \frac{|E_0^n|^2}{|\mu _n|^2} \le \frac{C}{|\mu _n|}|E_0^n|^2. \end{aligned} \end{aligned}$$
(8.22)

Combining (8.21) and (8.22) yields

$$\begin{aligned} \begin{aligned} (\lambda _n^2+1)^s((\lambda _n^2+1)\Vert {\varPhi }^n\Vert _{\mathcal {L}}^2+\Vert {\varPhi }^n\Vert _{\mathcal {M}}^2)&\le C(P+1)\left( \frac{(\lambda _n^2+1)^{s+1}}{|\mu _n|^3} + \frac{(\lambda _n^2+1)^s}{|\mu _n|}\right) |E_0^n|^2\\&\le C(P+1)(\lambda _n^2+1)^{s-1/2}|E_0^n|^2. \end{aligned} \end{aligned}$$
(8.23)

On the other hand, the same calculation as above but using (8.16) instead of (8.15) shows that for \(n\in {\mathcal {N}}_2\)

$$\begin{aligned} \frac{1}{|a_\ell |}|\phi ^n_\ell +\phi ^n_{\ell +1}|^2&\le \frac{C}{|\mu _n|} |\phi _0^n|^2,\\ |a_\ell ||\phi _\ell ^n-\phi _{\ell +1}^n|^2&\le C|\mu _n||\phi _0^n|^2, \end{aligned}$$

from which it follows that

$$\begin{aligned} \begin{aligned} (\lambda _n^2+1)^s((\lambda _n^2+1)\Vert {\varPhi }^n\Vert _{\mathcal {L}}^2+\Vert {\varPhi }^n\Vert _{\mathcal {M}}^2) \le C(P+1)(\lambda _n^2+1)^{s+1/2}|\phi _0^n|^2. \end{aligned} \end{aligned}$$
(8.24)

Finally, we obtain the stability and regularity estimates by using (8.20), (8.23) and (8.24) for \(s=0\) and \(s=1\), respectively. \(\square \)

Proof of Theorem 8.1

It suffices to prove the regularity estimates (8.3) and (8.4). Let \(u^{ex}\) be the solution to the problem (6.6) with \(g_{in}=f_s\) and \(g_{bd}=0\) satisfying

$$\begin{aligned} \Vert u^{ex}\Vert _{H^2({\varOmega }_b)}\le C\Vert f_s\Vert _{L^2({\varOmega }_b)} \end{aligned}$$
(8.25)

by Lemma 6.4. Also, by u we denote the solution satisfying CRBCs, i.e.

$$\begin{aligned} \begin{aligned} {\varDelta }u + k^2 u&= f_s\quad \hbox { in }{\varOmega }_b,\\ u&=0 \quad \hbox { on }{\varGamma }_W, \quad \frac{\partial u}{\partial \nu } =0 \quad \hbox { on }{\varGamma }_T,\\ B_P(u)&= 0\quad \hbox { on }{\varGamma }_E, \end{aligned} \end{aligned}$$
(8.26)

where \(B_P(u)=\phi _{P+1}\) is the trace of the \((P+1)\)th auxiliary variable \(\phi _{P+1}\) on \({\varGamma }_E\). Since \(u^{ex}\) is expressed as \(u^{ex}=\sum _{n=0}^\infty A_n^{ex}e^{i\mu _nx}Y_n\) beyond the support of \(f_s\), the error function \(z=u-u^{ex}\) satisfies

$$\begin{aligned} B_P(z)=B_P(-u^{ex})=-A_N^{ex}Y_N -\sum _{n\ne N} Q^n_{0,P}A_n^{ex}e^{i\mu _nb}Y_n. \end{aligned}$$

Assume that z is written as \(z=(A_N+B_Nx)Y_N+\sum _{n\ne N} (A_ne^{i\mu _n x}+B_ne^{-i\mu _n x})Y_n.\) If there exists an index J such that \(a_J+i\mu _n=0\) for some n, then the error does not include the corresponding mode, i.e., \(A_n=A_n^{ex}\) and \(B_n=0\). Otherwise, the boundary conditions on \({\varGamma }_E\) and \({\varGamma }_W\) lead to the linear problem for \(A_n\) and \(B_n\),

$$\begin{aligned} A_n+B_n&=0,\\ Q^n_{0,P}e^{i\mu _nb}A_n+\frac{1}{Q^n_{0,P}e^{i\mu _nb}}B_n&= -Q^n_{0,P}e^{i\mu _nb }A_n^{ex}, \end{aligned}$$

for \(n\ne N\) and

$$\begin{aligned} A_N=0 \ \ \hbox { and }\ \ A_N+B_N\left( b+2\sum _{j=0}^Pa_j^{-1}\right) =-A_N^{ex} \end{aligned}$$

for \(n=N\). It then follows that

$$\begin{aligned} A_n=\frac{(Q^n_{0,P}e^{i\mu _nb})^2}{1-(Q^n_{0,P}e^{i\mu _nb})^2}A_n^{ex} \ \ \hbox { and }\ \ B_n=\frac{-(Q^n_{0,P}e^{i\mu _nb})^2}{1-(Q^n_{0,P}e^{i\mu _nb})^2}A_n^{ex} \end{aligned}$$
(8.27)

for \(n\ne N\) and

$$\begin{aligned} A_N=0 \ \ \hbox { and }\ \ B_N=\frac{-A_N^{ex}}{b+2\sum _{j=0}^Pa_j^{-1}} \end{aligned}$$

for \(n=N\).
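The elimination leading to (8.27) is brief and can be written out as follows. The abbreviation \(\rho _n=Q^n_{0,P}e^{i\mu _nb}\) is introduced only for this remark, and we assume \(\rho _n\ne 0\) and \(\rho _n^2\ne 1\). Substituting \(B_n=-A_n\) into the second equation of the linear problem gives

$$\begin{aligned} \left( \rho _n-\frac{1}{\rho _n}\right) A_n=-\rho _nA_n^{ex} \quad \Longrightarrow \quad (\rho _n^2-1)A_n=-\rho _n^2A_n^{ex} \quad \Longrightarrow \quad A_n=\frac{\rho _n^2}{1-\rho _n^2}A_n^{ex}, \end{aligned}$$

and \(B_n=-A_n\) then yields the second formula in (8.27). For \(n=N\), inserting \(A_N=0\) into the second condition leaves \(B_N(b+2\sum _{j=0}^Pa_j^{-1})=-A_N^{ex}\), which is the stated value of \(B_N\).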

Then z solves the problem (6.6) with \(g_{in}=0\) and

$$\begin{aligned} g_{bd}=\frac{\partial z}{\partial x}-T(z)= \frac{-1}{b+2\sum _{j=0}^P a_j^{-1}}A^{ex}_{N}Y_{N}+ \sum _{n\ne N} \frac{2i\mu _ne^{i\mu _nb}(Q^n_{0,P})^2}{1-(e^{i\mu _nb}Q^n_{0,P})^2}A^{ex}_nY_n. \end{aligned}$$

Here, we note that \(g_{bd}\) is in \(H^{1/2}({\varGamma }_E)\). Indeed, from the boundedness of the coefficients

$$\begin{aligned} \frac{1}{b+2\sum _{j=0}^Pa_j^{-1}} \ \hbox { and }\frac{2i(Q^n_{0,P})^2}{1-(e^{i\mu _nb}Q^n_{0,P})^2} \end{aligned}$$

of \(g_{bd}\), a trace theorem and (8.25), it follows that

$$\begin{aligned} \Vert g_{bd}\Vert _{H^{1/2}({\varGamma }_E)}^2&\le C\left[ (\lambda _N^2+1)^{1/2}|A^{ex}_N|^2 + \sum _{n\ne N}(\lambda _n^2+1)^{1/2} |\mu _n|^2|e^{i\mu _nb}A_n^{ex}|^2\right] \\&\le C\Vert u^{ex}\Vert _{H^{3/2}({\varGamma }_E)}^2 \le C\Vert f_s\Vert _{L^2({\varOmega }_b)}^2. \end{aligned}$$

Therefore, Lemma 6.4 reveals that

$$\begin{aligned} \Vert u-u^{ex}\Vert _{H^2({\varOmega }_b)}\le C\Vert g_{bd}\Vert _{H^{1/2}({\varGamma }_E)} \le C\Vert f_s\Vert _{L^2({\varOmega }_b)}, \end{aligned}$$

which, in turn, results in (8.3)

$$\begin{aligned} \Vert u\Vert _{H^2({\varOmega }_b)}\le \Vert z\Vert _{H^2({\varOmega }_b)}+\Vert u^{ex}\Vert _{H^2({\varOmega }_b)} \le C\Vert f_s\Vert _{L^2({\varOmega }_b)}. \end{aligned}$$

In addition, a trace inequality yields that

$$\begin{aligned} \Vert u\Vert _{H^{3/2}({\varGamma }_E)} \hbox { and }\Vert \frac{\partial u}{\partial x}\Vert _{H^{1/2}({\varGamma }_E)}\le C\Vert f_s\Vert _{L^2({\varOmega }_b)}, \end{aligned}$$
(8.28)

and hence Lemma 8.5 with \(\phi _0=u\) and \(E_0=-2\partial u/\partial x\) on \({\varGamma }_E\) shows (8.4). \(\square \)

8.2 Regularity of solutions to Problem (8.2)

It is clear that the solution \((u,{\varPhi })\) to the problem (8.2) solves

$$\begin{aligned} \begin{aligned} \left( -2\frac{\partial u}{\partial x}\right) {\varvec{{e}}}_0&=-L{\varDelta }_y {\varPhi }+(-k^2L+M){\varPhi }- {\varXi }\ \ \hbox { in }{\varGamma }_E,\\ \frac{\partial {\varPhi }}{\partial \nu }&= 0 \ \ \hbox { on }\partial {\varGamma }_E, \end{aligned} \end{aligned}$$
(8.29)

where \({\varXi }=L{\varUpsilon }\).

As done in the previous subsection, we will derive explicit formulas for the solution. We know that the solution has the series representation

$$\begin{aligned} u(x,y)=(A_N+B_Nx)Y_N(y)+\sum _{n\ne N} (A_ne^{i\mu _nx}+B_ne^{-i\mu _nx})Y_n(y) \end{aligned}$$
(8.30)

and the linear systems for the nth Fourier coefficients

$$\begin{aligned} 2i\mu _n(A_ne^{i\mu _nb}-B_ne^{-i\mu _nb}){\varvec{{e}}}_0 -\mu _n^2L{\varPhi }^n + M{\varPhi }^n = {\varXi }^n \end{aligned}$$
(8.31)

for \(n\ne N\) and

$$\begin{aligned} 2B_n{\varvec{{e}}}_0+M{\varPhi }^n={\varXi }^n \end{aligned}$$
(8.32)

for \(n=N\) hold with \({\varXi }^n\) being the nth Fourier coefficients of \({\varXi }\). In the case \(n\ne N\), it suffices to derive the formula when \(a_j\ne -i\mu _n\) for all j. Otherwise the system matrix can be written as a \(2\times 2\) block diagonal matrix and solutions of the lower block are given by the same formulas as (8.14) in Lemma 8.3.

Lemma 8.6

Suppose that \(a_j\ne -i\mu _n\) and \(\mu _n\) is not a cutoff axial frequency, i.e. \(\mu _n\ne 0\). Then for \({\varXi }=E_j{\varvec{{e}}}_j=\sum _{n=0}^\infty E_j^nY_n{\varvec{{e}}}_j\), there exists a unique solution to the problem (8.31) given by the formula, \(\phi ^n_\ell =t^n_{\ell ,j}E^n_j\), where

$$\begin{aligned} t^n_{\ell ,j}=\left\{ \begin{array}{ll} \displaystyle \frac{(1-e^{2i\mu _nb}(Q^n_{0,\ell -1})^2)Q^n_{\ell ,j-1} (1-(Q^n_{j,P})^2)}{-4i\mu _n(1-e^{2i\mu _nb}(Q^n_{0,P})^2)} &{} \hbox { if }\ell \le j,\\ \displaystyle \frac{(1-e^{2i\mu _nb}(Q^n_{0,j-1})^2)Q^n_{j,\ell -1} (1-(Q^n_{\ell ,P})^2)}{-4i\mu _n(1-e^{2i\mu _nb}(Q_{0,P}^n)^2)} &{} \hbox { if }\ell \ge j. \end{array} \right. \end{aligned}$$
(8.33)

Also, the normal derivative of the nth Fourier mode \(u_n\) on \({\varGamma }_E\) satisfies

$$\begin{aligned} \frac{\partial u_n}{\partial x}=\frac{(1+e^{2i\mu _nb})Q^n_{0,j-1}(1-(Q^n_{j,P})^2)}{4(1-e^{2i\mu _nb}(Q^n_{0,P})^2)}E^n_jY_n. \end{aligned}$$
(8.34)

Proof

The same computation used in the proof of Lemma 8.2 will be applied. We only provide the proof for \(0<j<P\), as the other cases \(j=0\) and \(j=P\) can be treated with small modifications. The only difference from the proof of Lemma 8.2 is that instead of (8.9) we employ the boundary conditions

$$\begin{aligned} \begin{aligned} A_n+B_n&=0\quad \hbox { on }{\varGamma }_W,\\ A_ne^{i\mu _nb}+B_ne^{-i\mu _nb}&={\widetilde{A}}_n+{\widetilde{B}}_n \quad \hbox { on }{\varGamma }_E\end{aligned} \end{aligned}$$
(8.35)

and

$$\begin{aligned} 2i\mu _n(A_ne^{i\mu _nb}-B_ne^{-i\mu _nb})-2i\mu _n({\widetilde{A}}_n-{\widetilde{B}}_n)=0 \end{aligned}$$
(8.36)

from the 0th equation.

By solving (8.10), (8.11), (8.12), (8.35) and (8.36) in terms of \(E_{j}^n\), we obtain that

$$\begin{aligned} A_n&=\frac{e^{i\mu _nb}(1-(Q^n_{j,P})^2)Q^n_{0,j-1}}{4i\mu _n(1-e^{2i\mu _nb}(Q^n_{0,P})^2)}E^n_j, \quad B_n=\frac{-e^{i\mu _nb}(1-(Q^n_{j,P})^2)Q^n_{0,j-1}}{4i\mu _n(1-e^{2i\mu _nb}(Q^n_{0,P})^2)}E^n_j,\\ {\widetilde{A}}_n&=\frac{e^{2i\mu _nb}(1-(Q^n_{j,P})^2)Q^n_{0,j-1}}{4i\mu _n(1-e^{2i\mu _nb}(Q^n_{0,P})^2)}E^n_j, \quad {\widetilde{B}}_n=\frac{-(1-(Q^n_{j,P})^2)Q^n_{0,j-1}}{4i\mu _n(1-e^{2i\mu _nb}(Q^n_{0,P})^2)}E^n_j,\\ {\widetilde{C}}_n&=\frac{-(1-e^{2i\mu _nb}(Q^n_{0,j-1})^2)}{4i\mu _n(1-e^{2i\mu _nb}(Q^n_{0,P})^2)}E^n_j, \quad {\widetilde{D}}_n=\frac{(1-e^{2i\mu _nb}(Q^n_{0,j-1})^2)(Q^n_{j,P})^2}{4i\mu _n(1-e^{2i\mu _nb}(Q^n_{0,P})^2)}E^n_j. \end{aligned}$$

Finally, the formulas (8.33) and (8.34) result from substituting them into (8.8) and

$$\begin{aligned} \frac{\partial u_n}{\partial x}=i\mu _n(A_ne^{i\mu _nb}-B_ne^{-i\mu _nb})Y_n, \end{aligned}$$

which completes the proof. \(\square \)
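The six coefficients above can be checked by substituting them back into (8.35), (8.36) and (8.10)–(8.12), and the resulting normal derivative can be compared with (8.34). The following Python sketch does this numerically. It is an illustration rather than part of the proof: the function names and sample values are hypothetical, \(Q^n_{0,j-1}\) and \(Q^n_{j,P}\) are treated as free complex numbers whose product plays the role of \(Q^n_{0,P}\), and \(\ell =j\) is used in (8.10) and (8.11).

```python
import cmath


def lemma86_coeffs(q0, qj, mu, b, E=1.0):
    """Closed forms (A, B, A~, B~, C~, D~) from the proof of Lemma 8.6.

    q0 stands for Q^n_{0,j-1}, qj for Q^n_{j,P}; q0*qj plays the role of
    Q^n_{0,P} (assumed multiplicative property).
    """
    e1 = cmath.exp(1j * mu * b)
    e2 = e1 * e1
    denom = 4j * mu * (1 - e2 * (q0 * qj) ** 2)
    K = (1 - qj**2) * q0 / denom * E
    A, B = e1 * K, -e1 * K
    At, Bt = e2 * K, -K
    Ct = -(1 - e2 * q0**2) / denom * E
    Dt = (1 - e2 * q0**2) * qj**2 / denom * E
    return A, B, At, Bt, Ct, Dt


def lemma86_residuals(q0, qj, mu, b, E=1.0):
    """Residuals of (8.35) (two rows), (8.36), (8.10), (8.11), (8.12)."""
    A, B, At, Bt, Ct, Dt = lemma86_coeffs(q0, qj, mu, b, E)
    e1 = cmath.exp(1j * mu * b)
    return (
        A + B,
        A * e1 + B / e1 - (At + Bt),
        2j * mu * (A * e1 - B / e1) - 2j * mu * (At - Bt),
        q0 * At + Bt / q0 - (Ct + Dt),
        (q0 * At - Bt / q0) - (Ct - Dt) - E / (2j * mu),
        qj * Ct + Dt / qj,
    )


def normal_derivative_gap(q0, qj, mu, b, E=1.0):
    """|i*mu*(A e^{i mu b} - B e^{-i mu b}) - coefficient in (8.34)|."""
    A, B, *_ = lemma86_coeffs(q0, qj, mu, b, E)
    e1 = cmath.exp(1j * mu * b)
    e2 = e1 * e1
    from_modes = 1j * mu * (A * e1 - B / e1)
    closed = (1 + e2) * q0 * (1 - qj**2) / (4 * (1 - e2 * (q0 * qj) ** 2)) * E
    return abs(from_modes - closed)


if __name__ == "__main__":
    # Arbitrary illustrative values, not taken from the paper.
    args = (0.3 + 0.4j, -0.2 + 0.5j, 1.7, 1.3)
    print(max(abs(r) for r in lemma86_residuals(*args)))
    print(normal_derivative_gap(*args))
```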

In order to study the regularity result of the problem (8.2), properties of \(t^n_{\ell ,j}\) are required. Let us define for \(n\ne N\),

$$\begin{aligned} {\mathfrak {t}}_{\ell ,j}=t^n_{\ell ,j}+t^n_{\ell ,j+1} \ \hbox { and }\ {\varDelta }_{\ell ,j}^{\pm }={\mathfrak {t}}_{\ell ,j}\pm {\mathfrak {t}}_{\ell +1,j} \end{aligned}$$

(the formula (8.33) can be extended to j or \(\ell =P+1\) by setting \(t^n_{\ell ,j}=0\) for j or \(\ell =P+1\), since \((1-Q^n_{P+1,P})=0\)). The following lemma provides estimates of \({\varDelta }_{\ell ,j}^\pm \); its proof will be given in the Appendix.

Lemma 8.7

The following inequalities hold,

$$\begin{aligned} \frac{1}{\sqrt{|a_\ell |}}|{\varDelta }^+_{\ell ,j}|\frac{1}{\sqrt{|a_j|}}&\le \frac{C}{|\mu _n|^2},\nonumber \\ \sqrt{|a_\ell |}|{\varDelta }^-_{\ell ,j}|\frac{1}{\sqrt{|a_j|}}&\le \frac{C}{|\mu _n|}. \end{aligned}$$
(8.37)

Also, for the analysis in case of \(a_J+i\mu _n=0\) for some J, we need to estimate the analogues to \({\varDelta }^\pm _{\ell ,j}\) for \(s^n_{\ell ,j}\), defined in (8.14). As above, let us define

$$\begin{aligned} {\mathfrak {s}}_{\ell ,j}=s^n_{\ell ,j}+s^n_{\ell ,j+1} \ \hbox { and }\ {\varSigma }_{\ell ,j}^{\pm }={\mathfrak {s}}_{\ell ,j}\pm {\mathfrak {s}}_{\ell +1,j}. \end{aligned}$$

The estimates of \({\varSigma }_{\ell ,j}^\pm \) analogous to those of \({\varDelta }^{\pm }_{\ell ,j}\) are stated in the following lemma, which can be proved in the same way as Lemma 8.7.

Lemma 8.8

The following inequalities hold,

$$\begin{aligned} \frac{1}{\sqrt{|a_\ell |}}|{\varSigma }^+_{\ell ,j}|\frac{1}{\sqrt{|a_j|}}&\le \frac{C}{|\mu _n|^2}, \nonumber \\ \sqrt{|a_\ell |}|{\varSigma }^-_{\ell ,j}|\frac{1}{\sqrt{|a_j|}}&\le \frac{C}{|\mu _n|}. \end{aligned}$$
(8.38)

Lemma 8.9

Let \(a_j\) be the parameters defined by (3.5) satisfying (3.6). Then for any \({\varUpsilon }\in (L^2({\varGamma }_E))^{P+1}\) the solution \((u,{\varPhi })\) to the problem (8.2) satisfies the regularity result,

$$\begin{aligned} \Vert u\Vert _{H^2({\varOmega }_b)} \le C_a(P+1)\Vert {\varUpsilon }\Vert _{{\mathcal {L}}} \end{aligned}$$
(8.39)

and

$$\begin{aligned} \Vert {\varPhi }\Vert _{\mathbf{V }^2_{{\varGamma }_E}}\le C_a^2(P+1)^2\Vert {\varUpsilon }\Vert _{{\mathcal {L}}}. \end{aligned}$$
(8.40)

If cutoff modes are excluded, the constants \(C_a\) for the stability and regularity estimates are independent of \(a_j\) and the exponents on \((P+1)\) are halved; that is, the constant in the estimate of u becomes \(C (P+1)^{1/2}\) and that for \({\varPhi }\) becomes \(C (P+1)\).

Proof

We first prove (8.40).

Assume that \({\varXi }=\sum _{n=0}^\infty L{\varUpsilon }^nY_n\) with \({\varUpsilon }^n=(\gamma _0^n,\gamma _1^n,\ldots ,\gamma _P^n)^t\).

\({\underline{\text {Non-cutoff modes, } n\ne N}}\): We note that

$$\begin{aligned} |(1\pm e^{2i\mu _nb}(Q^n_{p,q})^2)(1\pm (Q^n_{r,s})^2)Q^n_{c,d}|<4, \end{aligned}$$
(8.41)

for any \(0\le p,q,r,s,c,d\le P\) and \(|1-e^{2i\mu _nb}(Q^n_{0,P})^2|\) is bounded below away from zero for all \(n\ne N\).

If \(a_j+i\mu _n\ne 0\) for \(0\le j\le P\), then Lemma 8.6 shows that the solution \(\phi _{\ell }^n\) can be written as \(\phi _{\ell }^n=\sum _{j=0}^P t^n_{\ell ,j}(L{\varUpsilon }^n)_{j}\) and a simple computation gives

$$\begin{aligned} \phi _\ell ^n= & {} \sum _{j=0}^P t^n_{\ell ,j}\left[ \frac{1}{a_{j-1}}(\gamma _{j-1}^n+\gamma _j^n) +\frac{1}{a_{j}}(\gamma _{j}^n+\gamma _{j+1}^n)\right] \nonumber \\= & {} \sum _{j=0}^P {\mathfrak {t}}_{\ell ,j}\frac{1}{a_{j}}(\gamma _{j}^n+\gamma _{j+1}^n). \end{aligned}$$
(8.42)
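The "simple computation" in (8.42) is an index shift. As a sketch, assume that the term containing \(a_{-1}\) is absent for \(j=0\) and use the extension \(t^n_{\ell ,P+1}=0\). Then the first half of the sum can be rewritten as

$$\begin{aligned} \sum _{j=1}^P t^n_{\ell ,j}\frac{1}{a_{j-1}}(\gamma _{j-1}^n+\gamma _j^n) =\sum _{j=0}^{P-1} t^n_{\ell ,j+1}\frac{1}{a_{j}}(\gamma _{j}^n+\gamma _{j+1}^n) =\sum _{j=0}^{P} t^n_{\ell ,j+1}\frac{1}{a_{j}}(\gamma _{j}^n+\gamma _{j+1}^n), \end{aligned}$$

and adding the second half \(\sum _{j=0}^P t^n_{\ell ,j}a_j^{-1}(\gamma _j^n+\gamma _{j+1}^n)\) collects the coefficient \(t^n_{\ell ,j}+t^n_{\ell ,j+1}={\mathfrak {t}}_{\ell ,j}\) in front of each term.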

Now, the Cauchy–Schwarz inequality and (8.37) show that

$$\begin{aligned} \frac{1}{\sqrt{|a_\ell |}}|\phi _\ell ^n+\phi _{\ell +1}^n|&\le \sum _{j=0}^P \frac{1}{\sqrt{|a_\ell |}}|{\varDelta }^+_{\ell ,j}|\frac{1}{\sqrt{|a_j|}} \frac{1}{\sqrt{|a_j|}}|\gamma _j^n+\gamma _{j+1}^n|\\&\le \left( \sum _{j=0}^P \left| \frac{1}{\sqrt{|a_\ell |}}{\varDelta }^+_{\ell ,j}\frac{1}{\sqrt{|a_j|}} \right| ^2\right) ^{1/2} \left( \sum _{j=0}^P \frac{1}{|a_j|}|\gamma _j^n+\gamma _{j+1}^n|^2\right) ^{1/2}\\&\le C\frac{\sqrt{P+1}}{|\mu _n|^2}\Vert {\varUpsilon }^n\Vert _{\mathcal {L}}\end{aligned}$$

and hence we obtain that

$$\begin{aligned} (\lambda _n^2+1)^2\Vert {\varPhi }^n\Vert _{{\mathcal {L}}}^2 \le C(P+1)^2\frac{(\lambda _n^2+1)^2}{|\mu _n|^4} \Vert {\varUpsilon }^n\Vert _{{\mathcal {L}}}^2 \le C(P+1)^2\Vert {\varUpsilon }^n\Vert _{\mathcal {L}}^2. \end{aligned}$$
(8.43)
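The passage from the bound on a single index \(\ell \) to (8.43) can be made explicit. Here \(\Vert {\varPhi }^n\Vert _{{\mathcal {L}}}^2\) is read as \(\sum _{\ell =0}^P|a_\ell |^{-1}|\phi ^n_\ell +\phi ^n_{\ell +1}|^2\), the same form that appears for \(\Vert {\varUpsilon }^n\Vert _{\mathcal {L}}\) in the Cauchy–Schwarz step; this reading is an assumption about a definition given earlier in the paper. Squaring the bound for each \(\ell \) and summing gives

$$\begin{aligned} \Vert {\varPhi }^n\Vert _{{\mathcal {L}}}^2 =\sum _{\ell =0}^P\frac{1}{|a_\ell |}|\phi ^n_\ell +\phi ^n_{\ell +1}|^2 \le \sum _{\ell =0}^P C\frac{P+1}{|\mu _n|^4}\Vert {\varUpsilon }^n\Vert _{{\mathcal {L}}}^2 =C\frac{(P+1)^2}{|\mu _n|^4}\Vert {\varUpsilon }^n\Vert _{{\mathcal {L}}}^2, \end{aligned}$$

so one factor of \((P+1)\) comes from the Cauchy–Schwarz inequality in j and the other from the sum over \(\ell \). The last inequality in (8.43) then uses \((\lambda _n^2+1)\le C|\mu _n|^2\) for \(n\ne N\).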

In addition, the same computation as above gives that

$$\begin{aligned} \sqrt{|a_\ell |}|\phi _\ell ^n-\phi _{\ell +1}^n|&\le \sum _{j=0}^P \sqrt{|a_\ell |}|{\varDelta }^-_{\ell ,j}|\frac{1}{\sqrt{|a_j|}} \frac{1}{\sqrt{|a_j|}}|\gamma _j^n+\gamma _{j+1}^n|\\&\le \left( \sum _{j=0}^P \left| \sqrt{|a_\ell |}{\varDelta }^-_{\ell ,j}\frac{1}{\sqrt{|a_j|}} \right| ^2\right) ^{1/2} \left( \sum _{j=0}^P\frac{1}{|a_j|}|\gamma _j^n+\gamma _{j+1}^n|^2\right) ^{1/2}\\&\le C\frac{\sqrt{P+1}}{|\mu _n|}\Vert {\varUpsilon }^n\Vert _{\mathcal {L}}, \end{aligned}$$

which shows that

$$\begin{aligned} (\lambda _n^2+1)\Vert {\varPhi }^n\Vert _{\mathcal {M}}^2 \le C(P+1)^2\frac{\lambda _n^2+1}{|\mu _n|^2}\Vert {\varUpsilon }^n\Vert _{\mathcal {L}}^2 \le C(P+1)^2\Vert {\varUpsilon }^n\Vert _{\mathcal {L}}^2. \end{aligned}$$
(8.44)

Thus, it follows from (8.43) and (8.44) that

$$\begin{aligned} (\lambda _n^2+1)^2\Vert {\varPhi }^n\Vert _{\mathcal {L}}^2+(\lambda _n^2+1)\Vert {\varPhi }^n\Vert _{\mathcal {M}}^2 \le C(P+1)^2\Vert {\varUpsilon }^n\Vert _{\mathcal {L}}^2. \end{aligned}$$
(8.45)

In the case when \(a_J+i\mu _n=0\) for some J, the system of Eqs. (8.29) breaks into two block diagonal systems. We notice that \(\phi _\ell ^n\) is represented by

$$\begin{aligned} \begin{aligned} \phi _\ell ^n=\left\{ \begin{array}{ll} \displaystyle \sum _{j=0}^J t_{\ell ,j}^n(L{\varUpsilon }^n)_j &{}\hbox { for }\ell \le J,\\ \displaystyle \sum _{j=J+1}^Ps_{\ell ,j}^n(L{\varUpsilon }^n)_j &{}\hbox { for }\ell \ge J+1, \end{array} \right. \end{aligned} \end{aligned}$$
(8.46)

where \(t^n_{\ell ,j}\) and \(s^n_{\ell ,j}\) are defined by (8.33) (with P replaced by J) and (8.14), respectively. By using Lemmas 8.7 and 8.8 as in the argument used above, the same result as (8.45) can be derived.

\({\underline{\text {Cutoff modes, }n=N}}\): In this case, \({\varPhi }^N\) satisfies (8.32), which is equivalent to

$$\begin{aligned} 2b^{-1}\phi _0^N{\varvec{{e}}}_0+M{\varPhi }^N=L{\varUpsilon }^N \end{aligned}$$
(8.47)

since \(A_N=0\) from the boundary condition on \({\varGamma }_W\) and \(A_N+B_Nb=\phi ^N_0\). By examining the real and imaginary parts of the inner product of the left hand side of (8.47) with \({\varPhi }^N\), we observe that

$$\begin{aligned} \begin{aligned} \frac{2}{b}|\phi _0^N|^2 +\Vert {\varPhi }^N\Vert _{{\mathcal {M}}}^2&\le C\bigg |(\frac{2}{b}\phi _0^N{\varvec{{e}}}_0+M{\varPhi }^N,{\varPhi }^N)_{{\mathbb {C}}^{P+1}}\bigg | \le C\Vert {\varUpsilon }^N\Vert _{{\mathcal {L}}}\Vert {\varPhi }^N\Vert _{{\mathcal {L}}}\\&\le C_a(P+1)\Vert {\varUpsilon }^N\Vert _{{\mathcal {L}}}\Vert {\varPhi }^N\Vert _{{\mathcal {M}}}. \end{aligned} \end{aligned}$$
(8.48)

The last inequality follows from Lemma 4.5. Consequently,

$$\begin{aligned} \Vert {\varPhi }^N\Vert _{\mathcal {M}}\le C_a(P+1)\Vert {\varUpsilon }^N\Vert _{\mathcal {L}}. \end{aligned}$$
(8.49)
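The step from (8.48) to (8.49) is the usual absorption argument, spelled out here for convenience. Dropping the nonnegative term \(\frac{2}{b}|\phi _0^N|^2\) on the left hand side of (8.48) leaves

$$\begin{aligned} \Vert {\varPhi }^N\Vert _{{\mathcal {M}}}^2\le C_a(P+1)\Vert {\varUpsilon }^N\Vert _{{\mathcal {L}}}\Vert {\varPhi }^N\Vert _{{\mathcal {M}}}. \end{aligned}$$

If \(\Vert {\varPhi }^N\Vert _{{\mathcal {M}}}=0\), then (8.49) holds trivially; otherwise dividing both sides by \(\Vert {\varPhi }^N\Vert _{{\mathcal {M}}}\) gives (8.49).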

Applying Lemma 4.5 again to the above inequality (8.49) yields that

$$\begin{aligned} \Vert {\varPhi }^N\Vert _{\mathcal {L}}\le C_a^2(P+1)^2\Vert {\varUpsilon }^N\Vert _{\mathcal {L}}\end{aligned}$$

and hence it is concluded that

$$\begin{aligned} (\lambda _N^2+1)^2\Vert {\varPhi }^N\Vert _{{\mathcal {L}}}^2+(\lambda _N^2+1)\Vert {\varPhi }^N\Vert _{\mathcal {M}}^2 \le C_a^4(P+1)^4\Vert {\varUpsilon }^N\Vert _{{\mathcal {L}}}^2. \end{aligned}$$
(8.50)

Finally, combining (8.45) and (8.50) implies

$$\begin{aligned} \Vert {\varPhi }\Vert _{{{\varvec{{V}}}}^2_{{\varGamma }_E}}\le C_a^2(P+1)^2\Vert {\varUpsilon }\Vert _{\mathcal {L}}, \end{aligned}$$

which completes the proof of (8.40).

To prove (8.39), we shall estimate \(g_{bd}=\partial u/\partial x -T(u)\) in \(H^{1/2}({\varGamma }_E)\),

$$\begin{aligned} \Vert g_{bd}\Vert _{H^{1/2}({\varGamma }_E)}&\le C_a(P+1)\Vert {\varUpsilon }\Vert _{{\mathcal {L}}}. \end{aligned}$$
(8.51)

Once the inequality is established, Lemma 6.4 with (8.51) yields that

$$\begin{aligned} \Vert u\Vert _{H^2({\varOmega }_b)}\le C_a(P+1)\Vert {\varUpsilon }\Vert _{{\mathcal {L}}}, \end{aligned}$$

which completes the proof of (8.39).

Now, we are left with proving (8.51). To do this, as done in (8.42) we use (8.33) with \(\ell =0\) and (8.34) for \(n\ne N\) and \(a_j+i\mu _n\ne 0\) to obtain

$$\begin{aligned} \frac{\partial u_n}{\partial x}-i\mu _nu_n=\sum _{j=0}^P \frac{Q_{0,j-1}^n(1-Q_{j,j}^n(Q_{j+1,P}^n)^2)(1+Q_{j,j}^n)}{2(1-e^{2i\mu _nb}(Q_{0,P}^n)^2)} \frac{1}{a_j}(\gamma _j^n+\gamma _{j+1}^n)Y_n. \end{aligned}$$

The Cauchy–Schwarz inequality and (8.18) show that

$$\begin{aligned} \Vert \frac{\partial u_n}{\partial x}-i\mu _nu_n\Vert _{L^2({\varGamma }_E)}^2&\le \left( \sum _{j=0}^P C|1+Q^n_{j,j}|\frac{1}{|a_j|}|\gamma ^n_j+\gamma ^n_{j+1}|\right) ^2\\&\le \left( \sum _{j=0}^P C\frac{|1+Q^n_{j,j}|^2}{|a_j|}\right) \left( \sum _{j=0}^P\frac{1}{|a_j|}|\gamma ^n_j+\gamma ^n_{j+1}|^2\right) \\&\le C\frac{P+1}{|\mu _n|}\Vert {\varUpsilon }^n\Vert _{\mathcal {L}}^2 \end{aligned}$$

and so we obtain

$$\begin{aligned} (\lambda _n^2+1)^{1/2}\Vert \frac{\partial u_n}{\partial x}-i\mu _nu_n\Vert _{L^2({\varGamma }_E)}^2\le C(P+1)\Vert {\varUpsilon }^n\Vert _{\mathcal {L}}^2. \end{aligned}$$
(8.52)

In the case when \(a_J+i\mu _n=0\) for some n and J, since \(\partial u_n/\partial x\) and \(u_n\) are affected by only the first \((J+1)\) components of \({\varUpsilon }\), it holds that

$$\begin{aligned} (\lambda _n^2+1)^{1/2}\Vert \frac{\partial u_n}{\partial x}-i\mu _nu_n\Vert _{L^2({\varGamma }_E)}^2\le C(J+1)\Vert {\varUpsilon }^n\Vert _{\mathcal {L}}^2 \le C(P+1)\Vert {\varUpsilon }^n\Vert _{\mathcal {L}}^2. \end{aligned}$$
(8.53)

For the cutoff mode, i.e., \(n=N\), we use (8.48) and (8.49) to see

$$\begin{aligned} \begin{aligned} \Vert \frac{\partial u_N}{\partial x}\Vert _{L^2({\varGamma }_E)}^2&=|B_N|^2=\frac{1}{b^2}|\phi _0^N|^2\\&\le C_a(P+1)\Vert {\varUpsilon }^N\Vert _{\mathcal {L}}\Vert {\varPhi }^N\Vert _{\mathcal {M}}\le C_a^2(P+1)^2\Vert {\varUpsilon }^N\Vert _{\mathcal {L}}^2. \end{aligned} \end{aligned}$$
(8.54)

Finally by combining (8.52), (8.53) and (8.54) we obtain

$$\begin{aligned} \Vert \frac{\partial u}{\partial x}-T(u)\Vert _{H^{1/2}({\varGamma }_E)} \le C_a(P+1)\Vert {\varUpsilon }\Vert _{{\mathcal {L}}}, \end{aligned}$$

which completes the proof. \(\square \)

9 Finite element approximations

Now, we are in a position to discuss the solvability and quasi-optimal convergence of the finite element approximation \((u_h,{\varPhi }_h)\) to the solution u and the auxiliary variables \({\varPhi }=(\phi _0,\ldots ,\phi _P)^t\) to the variational problem (4.1).

Let \({\mathcal {T}}_h\) denote a partition of \({\varOmega }_b\) with shape-regular meshes and let h represent the diameter of elements, e.g., \(h=\max _{K\in {\mathcal {T}}_h} \hbox {diam}(K)\). By extracting the boundary nodes on \({\varGamma }_E\) generated by \({\mathcal {T}}_h\), we define the boundary meshes, which are denoted by \({\mathcal {T}}^b_h\). Let \({\widetilde{S}}_h\) denote a subspace of \({\widetilde{H}}^1({\varOmega }_b)\) consisting of piecewise polynomial finite element functions and let \(S_h^0\) denote the subset of functions in \({\widetilde{S}}_h\) which vanish on \({\varGamma }_W\). Also, \(G_h\) is defined analogously as a finite element subspace of \(H^1({\varGamma }_E)\). We assume that f is the trace on \({\varGamma }_W\) of a function in our approximation space, as the errors associated with boundary quadrature in the finite element method are well understood. Let \(S_h\) be the set of functions in \({\widetilde{S}}_h\) which coincide with f on \({\varGamma }_W\). Denoting by \({\varvec{{V}}}_h\) the set of all elements \((u_h,{\varPhi }_h)\) in \(S_h\times (G_h)^{P+1}\) such that \(u_h=\phi _{h,0}\) on \({\varGamma }_E\) for \({\varPhi }_h=(\phi _{h,0},\ldots ,\phi _{h,P})^t\) and by \({\varvec{{V}}}_h^0\) the set of all elements \((u_h,{\varPhi }_h)\) in \(S_h^0\times (G_h)^{P+1}\) such that \(u_h=\phi _{h,0}\) on \({\varGamma }_E\), the finite element approximation to \((u,{\varPhi })\) is the function \((u_h,{\varPhi }_h)\in {{\varvec{{V}}}_h}\) satisfying

$$\begin{aligned} A((u_h,{\varPhi }_h),(\xi _h,{\varPsi }_h))=0 \quad \hbox { for all }(\xi _h,{\varPsi }_h)\in {{{\varvec{{V}}}_h^0}}. \end{aligned}$$
(9.1)

As mentioned earlier, we will now invoke an argument due to Schatz [33] to establish the unique solvability and quasi-optimal convergence of finite element approximations. This requires that the mesh size h satisfies \(0<h<h_0\) for a constant \(h_0\), which may depend on the stability and regularity estimates of the elliptic problem studied in Sect. 8.

In our case, for a given order \((n_p,n_e)\) with \(P=n_p+n_e\) and the damping parameters \(a_j\) given by (3.5) satisfying (3.6), we already know that the sesquilinear form \(A(\cdot ,\cdot )\) is bounded,

$$\begin{aligned} |A((u,{\varPhi }),(\xi ,{\varPsi }))|\le C\Vert (u,{\varPhi })\Vert _{{\varvec{{V}}}}\Vert (\xi ,{\varPsi })\Vert _{{\varvec{{V}}}}. \end{aligned}$$

Also, since

$$\begin{aligned} |((M-{\bar{M}}){\varPhi },{\varPhi })_{{\varGamma }_E}|=\sum _{j=0}^{n_p-1}2|a_j|\Vert \phi _j-\phi _{j+1}\Vert _{L^2({\varGamma }_E)}^2 \le Cn_p^2\Vert {\varPhi }\Vert _{{\mathcal {L}}}^2 \end{aligned}$$

due to the fact that \(|a_j|\le k\) for \(j=0,\ldots ,n_p-1\), it follows from (5.13) that the sesquilinear form \(A(\cdot ,\cdot )\) satisfies the inequality

$$\begin{aligned} |A((u,{\varPhi }),(u,{\varPhi }))|\ge C_1\Vert (u,{\varPhi })\Vert _{{\varvec{{V}}}}^2 -C_2n_p^2(\Vert u\Vert _{L^2({\varOmega }_b)}^2+\Vert {\varPhi }\Vert _{{\mathcal {L}}}^2) \end{aligned}$$
(9.2)

in \({\varvec{{V}}}_0\times {\varvec{{V}}}_0\) for some positive constants \(C_1\) and \(C_2\). The solvability and quasi-optimal convergence of finite element approximations are stated in the following theorem. The proof follows the standard argument of Schatz [33], combined with the regularity results given in Theorem 8.1 and Lemma 8.9.

Theorem 9.1

Let \(a_j\) be the parameters defined by (3.5) satisfying (3.6). Then there exists an \(h_0>0\) such that for \(0<h<h_0\), (9.1) has a unique solution \((u_h,{\varPhi }_h)\in \mathbf{V }_h\) satisfying

$$\begin{aligned} \Vert (u,{\varPhi })-(u_h,{\varPhi }_h)\Vert _{\mathbf{V }}\le Ch\Vert (u,{\varPhi })\Vert _{\mathbf{V }^2}. \end{aligned}$$
(9.3)

Furthermore, the solution \(u_h\) satisfies the \(L^2\)-error estimate

$$\begin{aligned} \Vert u-u_h\Vert _{L^2({\varOmega }_b)}\le C_a(P+1)h^2\Vert (u,{\varPhi })\Vert _{\mathbf{V }^2}. \end{aligned}$$
(9.4)

Here the constant \(C_a\) is independent of \(a_j\) if cutoff modes are not involved.

Proof

Let \((e,E)=(u,{\varPhi })-(u_h,{\varPhi }_h)\in {\varvec{{V}}}_0\) be the error function. Since the sesquilinear form \(A(\cdot ,\cdot )\) is symmetric (not Hermitian), that is, \(A((u,{\varPhi }),(\xi ,{\varPsi }))=A(({\bar{\xi }},{\bar{{\varPsi }}}),({\bar{u}},{\bar{{\varPhi }}}))\) for \((u,{\varPhi }),(\xi ,{\varPsi })\in {\varvec{{V}}}_0\), the solution \((w,{\varUpsilon })\in {\varvec{{V}}}_0\) to the dual problem

$$\begin{aligned} A((\xi ,{\varPsi }),(w,{\varUpsilon }))=(\xi ,e)_{{\varOmega }_b}+(L{\varPsi }, E)_{{\varGamma }_E} \quad \hbox { for all }(\xi ,{\varPsi })\in {\varvec{{V}}}_0 \end{aligned}$$

also satisfies the regularity estimates in Theorem 8.1 and Lemma 8.9. By choosing a linear or bilinear interpolation \({\varUpsilon }_h=(\gamma _{h,0},\ldots ,\gamma _{h,P})^t\) of \({\varUpsilon }=(\gamma _0,\ldots ,\gamma _P)^t\), it is obvious that

$$\begin{aligned} \Vert {\varUpsilon }-{\varUpsilon }_h\Vert _{{\mathcal {L}},1}^2&=\sum _{j=0}^P\frac{1}{|a_j|} \Vert \gamma _{j}+\gamma _{j+1}-\gamma _{h,j}-\gamma _{h,j+1}\Vert _{H^1({\varGamma }_E)}^2\\&\le Ch^2\sum _{j=0}^P\frac{1}{|a_j|}\Vert \gamma _j+\gamma _{j+1}\Vert _{H^2({\varGamma }_E)}^2 =Ch^2\Vert {\varUpsilon }\Vert _{{\mathcal {L}},2}^2 \end{aligned}$$

and

$$\begin{aligned} \Vert {\varUpsilon }-{\varUpsilon }_h\Vert _{{\mathcal {M}}}^2&=\sum _{j=0}^P|a_j|\Vert (\gamma _j-\gamma _{j+1})-(\gamma _{h,j}-\gamma _{h,j+1})\Vert _{L^2({\varGamma }_E)}^2\\&\le Ch^2\sum _{j=0}^P|a_j|\Vert \gamma _j-\gamma _{j+1}\Vert _{H^1({\varGamma }_E)}^2 =Ch^2\Vert {\varUpsilon }\Vert _{{\mathcal {M}},1}^2, \end{aligned}$$

which reveals that

$$\begin{aligned} \Vert (w,{\varUpsilon })-(w_h,{\varUpsilon }_h)\Vert _{{\varvec{{V}}}} \le Ch\Vert (w,{\varUpsilon })\Vert _{{\varvec{{V}}}^2} \end{aligned}$$
(9.5)

with a linear or bilinear interpolation \(w_h\) of w. The approximation property (9.5) and Lemma 8.9 show that

$$\begin{aligned} \begin{aligned} \Vert e\Vert _{L^2({\varOmega }_b)}^2+\Vert E\Vert _{\mathcal {L}}^2&\le C|A((e,E),(w,{\varUpsilon })-(w_h,{\varUpsilon }_h))|\\&\le Ch\Vert (e,E)\Vert _{{\varvec{{V}}}}\Vert (w,{\varUpsilon })\Vert _{{\varvec{{V}}}^2}\\&\le C_a^2(P+1)^2h\Vert (e,E)\Vert _{{\varvec{{V}}}}(\Vert e\Vert _{L^2({\varOmega }_b)}^2+\Vert E\Vert _{{\mathcal {L}}}^2)^{1/2}, \end{aligned} \end{aligned}$$
(9.6)

which in turn gives

$$\begin{aligned} (\Vert e\Vert _{L^2({\varOmega }_b)}^2+\Vert E\Vert _{\mathcal {L}}^2)^{1/2}\le C_a^2(P+1)^2h\Vert (e,E)\Vert _{{\varvec{{V}}}}. \end{aligned}$$
(9.7)

From Gårding’s inequality (9.2) for \((e,E)\),

$$\begin{aligned} C_1\Vert (e,E)\Vert _{\varvec{{V}}}^2&-C_2n_p^2(\Vert e\Vert _{L^2({\varOmega }_b)}^2+\Vert E\Vert _{\mathcal {L}}^2) \le |A((e,E),(e,E))|\\&=|A((e,E),(u,{\varPhi }))|\le C\Vert (e,E)\Vert _{\varvec{{V}}}\Vert (u,{\varPhi })\Vert _{\varvec{{V}}}, \end{aligned}$$

we see that

$$\begin{aligned} C_1\Vert (e,E)\Vert _{\varvec{{V}}}-C_2n_p^2(\Vert e\Vert _{L^2({\varOmega }_b)}^2+\Vert E\Vert _{\mathcal {L}}^2)^{1/2}\le C\Vert (u,{\varPhi })\Vert _{{\varvec{{V}}}} \end{aligned}$$

and apply (9.7) to the inequality to obtain

$$\begin{aligned} (C_1 -C_2C_a^2n_p^2(P+1)^2h)\Vert (e,E)\Vert _{\varvec{{V}}}\le C\Vert (u,{\varPhi })\Vert _{{\varvec{{V}}}}. \end{aligned}$$
(9.8)

For unique solvability of the finite dimensional problem, suppose that \(f=0\), and so \((u,{\varPhi })=0\). There exists \(h_0\) such that \(C_1-C_2C_a^2n_p^2(P+1)^2h_0>0\). For every \(0<h<h_0\), (9.8) then gives \((e,E)=0\), which implies the unique solvability of the finite element problem.
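As a hedged illustration of this step, one admissible (not necessarily sharp) threshold can be written down explicitly. Taking

$$\begin{aligned} h_0=\frac{C_1}{2C_2C_a^2n_p^2(P+1)^2}, \end{aligned}$$

every \(0<h<h_0\) satisfies \(C_1-C_2C_a^2n_p^2(P+1)^2h>C_1/2\), so (9.8) yields

$$\begin{aligned} \Vert (e,E)\Vert _{\varvec{{V}}}\le \frac{2C}{C_1}\Vert (u,{\varPhi })\Vert _{{\varvec{{V}}}}. \end{aligned}$$

With \(f=0\) the right-hand side vanishes, which forces \((e,E)=0\). The factor 2 is an arbitrary choice made here for definiteness and is not taken from the analysis above.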

Also, the error estimate (9.3) in the energy norm is proved from Gårding’s inequality for \(0<h<h_0\) and Theorem 8.1,

$$\begin{aligned} C\Vert (e,E)\Vert _{\varvec{{V}}}^2&\le |A((e,E),(e,E))|=|A((e,E),(u,{\varPhi })-(u_h,{\varPhi }_h))|\\&\le Ch\Vert (e,E)\Vert _{\varvec{{V}}}\Vert (u,{\varPhi })\Vert _{{\varvec{{V}}}^2} \end{aligned}$$

with a linear or bilinear interpolation \((u_h,{\varPhi }_h)\) of \((u,{\varPhi })\), which leads to (9.3).

For the \(L^2\)-error estimate, let \((w_e,{\varUpsilon }_e) \in {\varvec{{V}}}_0\) be the solution to the adjoint problem

$$\begin{aligned} A((\xi ,{\varPsi }),(w_e,{\varUpsilon }_e))=(\xi ,e)_{{\varOmega }_b} \end{aligned}$$

for all \((\xi ,{\varPsi })\in {\varvec{{V}}}_0\). Then the same argument used for (9.6) with Theorem 8.1 instead of Lemma 8.9 shows again that

$$\begin{aligned} \Vert e\Vert _{L^2({\varOmega }_b)}^2&=A((e,E),(w_e,{\varUpsilon }_e))\\&\le Ch\Vert (e,E)\Vert _{{\varvec{{V}}}}\Vert (w_e,{\varUpsilon }_e)\Vert _{{\varvec{{V}}}^2}\\&\le C_a(P+1)h\Vert (e,E)\Vert _{{\varvec{{V}}}}\Vert e\Vert _{L^2({\varOmega }_b)}, \end{aligned}$$

which implies that

$$\begin{aligned} \Vert e\Vert _{L^2({\varOmega }_b)}\le C_a(P+1)h\Vert (e,E)\Vert _{{\varvec{{V}}}} \le C_a(P+1)h^2\Vert (u,{\varPhi })\Vert _{{\varvec{{V}}}^2} \end{aligned}$$

and completes the proof. \(\square \)

We note that the regularity constant in Lemma 8.9 may increase polynomially (quadratically, but linearly if cutoff modes are excluded) as P grows, and so a smaller mesh size h may be required for large P to retain the unique solvability and quasi-optimal convergence, though this has not been encountered in our experiments. However, when a cutoff mode is present, \(C_a\), which depends on \(\max _{j=0,\ldots ,P}\{1/|a_j|\}\), enters the regularity constant, and numerical tests show that the convergence of finite element approximations is affected by the smallest parameter used for CRBCs. The convergence with respect to \(C_a\) and h is discussed in the following section.

10 Numerical experiments

In this section we provide numerical examples that confirm the well-posedness and convergence theories that were developed in the preceding sections. We specialize to \({\mathbb {R}}^2\) and take \({\varTheta }= (0,W)\). Note that now

$$\begin{aligned} Y_n(y)=\sqrt{\frac{2}{W}}\cos \left( \frac{n\pi }{W}y\right) \end{aligned}$$

are transverse eigenfunctions associated with eigenvalues \(\lambda _n^2=(n\pi /W)^2\) for \(n\ge 0\). The domain \({\varOmega }_b=(0,b)\times (0,W)\) is a rectangular region obtained by truncating the semi-infinite waveguide \({\varOmega }_\infty \) at \(x=b\) (see Fig. 2). We set \(W=1\).

In the first example, we take \(k=20\) and choose f corresponding to the analytic solution of (2.1)–(2.3):

$$\begin{aligned} u^{ex}(x,y)=\sum _{n=0}^6 \frac{1}{7\sqrt{2}}e^{i\mu _n x}Y_n(y) \end{aligned}$$

in \({\varOmega }_b\) with \(b=0.2\). The exact solution \(u^{ex}\) is a superposition of seven propagating modes. In order to apply an efficient CRBC on \({\varGamma }_E\), the optimal parameters discussed in Sect. 7 are computed on the interval \([\mu _6, k]\approx [6.6853, 20]\) by the Remez algorithm, and their distributions for \(n_p=1,2,\ldots ,5\) are depicted in Fig. 4. Their maximal reflection coefficients for propagating modes are presented in Table 2 as well. We compute piecewise bilinear finite element approximations \(u_h\) with mesh sizes \(h=1/800\), \(1/1600\) and \(1/3200\) by using the finite element library deal.II [1]. To assess the convergence of the approximate solutions, we measure relative \(L^2\)- and \(H^1\)-errors and report them in Fig. 5. The approximate solutions obtained with CRBCs converge as the order of the CRBCs increases, until mesh errors dominate. In particular, when the mesh is fine enough that the mesh error is negligible, the relative \(L^2\)-error decays at the same rate as the maximal reflection coefficients.
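The interval endpoints quoted above can be checked with a few lines of Python. This is a minimal sketch, assuming the axial wavenumbers are \(\mu _n=\sqrt{k^2-\lambda _n^2}\) with \(\lambda _n=n\pi /W\), which is consistent with the value \(\mu _6\approx 6.6853\); the function names are hypothetical and are not part of the deal.II computations.

```python
import cmath
import math


def axial_wavenumbers(k, width, n_modes):
    """Return mu_n = sqrt(k^2 - (n*pi/width)^2) for n = 0, ..., n_modes - 1.

    cmath.sqrt maps a negative real argument to the positive imaginary axis,
    so an evanescent mode comes out as mu_n = i * (decay rate).
    """
    return [cmath.sqrt(k * k - (n * math.pi / width) ** 2) for n in range(n_modes)]


def count_propagating(k, width):
    """Count the modes whose transverse frequency n*pi/width lies below k."""
    n = 0
    while n * math.pi / width < k:
        n += 1
    return n


# First example: k = 20, W = 1, modes n = 0, ..., 6.
mus = axial_wavenumbers(20.0, 1.0, 7)
for n, mu in enumerate(mus):
    print(n, round(mu.real, 4))
print("propagating modes for k = 20:", count_propagating(20.0, 1.0))
```

The last wavenumber printed is the left endpoint \(\mu _6\) of the interval passed to the Remez algorithm; the right endpoint is k itself, attained by the mode \(n=0\).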

The second example illustrates the effect of CRBCs on evanescent modes. To do this, we take \(k=20\) and choose an analytic solution \(u^{ex}\) including seven propagating modes and ten evanescent modes

$$\begin{aligned} u^{ex}(x,y)=\sum _{n=0}^{16} \frac{1}{17\sqrt{2}}e^{i\mu _nx}Y_n(y). \end{aligned}$$

We also assume that the source coming from the left boundary \({\varGamma }_W\) is close to the artificial boundary \({\varGamma }_E\), e.g., \(b=0.1\) (\(W=1\)). For this example, we use the same purely imaginary parameters as those obtained with \(n_p=4\), since the CRBC with \(n_p=4\) serves as an accurate absorbing boundary condition for propagating modes on the meshes \(h=1/800\), \(1/1600\) and \(1/3200\). For the real parameters responsible for damping evanescent modes, we numerically solve the min–max problem (6.2) on the interval \([{\tilde{\mu }}_{7}, M_\sigma ]\approx [9.1438,147.0887]\), where \(M_\sigma \) is determined by \(e^{-M_\sigma b}=\rho _p\). The distribution of the real parameters and the maximal reflection coefficients \(\rho _e\) for each \(n_e\) are shown in Fig. 6 and Table 2, respectively. The numerical results given in Fig. 7 also illustrate the convergence of solutions with increasing \(n_e\). Moreover, the convergence rate of the relative \(L^2\)-errors coincides with the decay rate of \(\rho _e\) as long as the mesh is fine enough.
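The two endpoints of the evanescent interval can be reproduced as follows. This is a sketch under the assumption that \({\tilde{\mu }}_n=\sqrt{\lambda _n^2-k^2}\) with \(\lambda _n=n\pi /W\); the helper names are hypothetical. Since the value of \(\rho _p\) from Table 2 is not repeated in the text, the code only inverts the stated relation \(e^{-M_\sigma b}=\rho _p\) from the quoted \(M_\sigma \) rather than assuming a tolerance.

```python
import math

k, W, b = 20.0, 1.0, 0.1

# Decay rate of the first evanescent mode (n = 7 for k = 20, W = 1).
mu_tilde_7 = math.sqrt((7 * math.pi / W) ** 2 - k ** 2)


def upper_endpoint(rho_p, b):
    """Solve exp(-M * b) = rho_p for M."""
    return -math.log(rho_p) / b


def implied_tolerance(m_sigma, b):
    """Invert the same relation: rho_p = exp(-M_sigma * b)."""
    return math.exp(-m_sigma * b)


# Tolerance implied by the quoted right endpoint, and the round trip back.
rho_p = implied_tolerance(147.0887, b)
print(mu_tilde_7, rho_p, upper_endpoint(rho_p, b))
```

Because \(M_\sigma \) scales like \(1/b\), halving the distance b doubles the right endpoint for a fixed tolerance, which is why a source close to \({\varGamma }_E\) makes the evanescent range wide.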

In the third example, the performance of CRBCs for the cutoff mode is examined. We set \(k=6\pi \) and choose \(u^{ex}\) such that the exact solution is composed of six propagating modes and one cutoff mode:

$$\begin{aligned} u^{ex}(x,y)=\sum _{n=0}^6\frac{1}{7\sqrt{2}}e^{i\mu _nx}Y_n(y) \end{aligned}$$

defined on \({\varOmega }_b\) with \(b=0.2\) (\(W=1\)). We increase the number of purely imaginary parameters, chosen in the optimal way for propagating modes, over \(n_p=1,\ldots ,30\). As indicated in Theorem 6.1, the error of the cutoff mode is controlled by \(S_P=|b+2\sum _{j=0}^Pa_j^{-1}|^{-1}\), which is illustrated in Fig. 8. We notice that the optimal parameters used for propagating modes do not seem to be the best choice when a cutoff mode is present.
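The quantity \(S_P\) is cheap to evaluate for any candidate parameter set. The sketch below implements the formula exactly as stated; the two parameter lists are hypothetical illustrations chosen here, not the optimal parameters of Sect. 7.

```python
import math


def cutoff_indicator(params, b):
    """Return S_P = |b + 2 * sum_j 1/a_j|^(-1) for complex parameters a_j."""
    return 1.0 / abs(b + 2.0 * sum(1.0 / a for a in params))


k = 6 * math.pi
b = 0.2

# Hypothetical parameter sets: a single parameter a_0 = -ik, and the same set
# extended by one purely imaginary parameter of small modulus.
base = [-1j * k]
extended = base + [-0.1j]

print(cutoff_indicator(base, b), cutoff_indicator(extended, b))
```

Since each term \(1/a_j\) grows as \(|a_j|\) shrinks, a single parameter of small modulus reduces \(S_P\) far more than several parameters of modulus close to k.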

Fig. 4 Distribution of optimal parameters for \(n_p=1,2,\ldots ,5\). The seven red circles represent the exact propagation frequencies \(\mu _n\) and the blue * marks are the optimal parameters of \(n_p=1\) in (a) through \(n_p=5\) in (e)

Table 2 Maximal reflection coefficients for propagating modes and evanescent modes resulting from CRBCs with the optimal parameters for \(k=20\)

Fig. 5 Relative \(L^2\)- and \(H^1\)-errors for the exact propagating solution

Fig. 6 Distribution of optimal parameters for \(n_e=1,2,\ldots ,5\). The red circles represent the exact decay rates of evanescent modes \({\tilde{\mu }}_n\) and the blue * marks are the optimal parameters of \(n_e=1\) in (a) through \(n_e=5\) in (e)

Fig. 7 Relative \(L^2\)- and \(H^1\)-errors for the solution including both propagating modes and evanescent modes

Fig. 8 Relative \(L^2\)- and \(H^1\)-errors for the solution including both propagating modes and a cutoff mode satisfying CRBCs with optimal parameters

When cutoff modes are involved, we may want to try other choices of parameters with which CRBCs can reduce \(S_P\) to a much smaller level while the reflection coefficients for propagating modes do not deteriorate too much, e.g., Newman’s nodes \(a_j=-ike^{-j/\sqrt{P}}\), based on geometric sequences, for \(j=0,\ldots , P\). As the relative \(L^2\)-errors in Fig. 9 show, the CRBCs with geometric sequences produce improved results, though the errors obtained from this approach behave irregularly for large P. This can be explained in terms of the small parameter \(a_{P}\) for large P. According to the formula for \(S_P\), it seems that one might improve the accuracy of CRBCs at the continuous level by adding a small parameter, such as the smallest parameter \(a_P\) of the geometric sequences, which reduces \(S_P\) to the error tolerance. However, the cutoff mode on the discrete level does not satisfy the actual equation on the continuous level

$$\begin{aligned} M{\varPhi }^N= -2\frac{\partial u}{\partial x}{\varvec{{e}}}_0 \end{aligned}$$

but solves an equation of a propagating or evanescent mode

$$\begin{aligned} (-\mu ^2_{N,h}L+M){\varPhi }^N_h=-2\frac{\partial u_h}{\partial x}{\varvec{{e}}}_0 \end{aligned}$$

for some discrete axial frequency \(\mu _{N,h}\ne 0\), since no discrete eigenvalue of the transverse Laplace operator will typically coincide with the cutoff transverse eigenvalue \(\lambda _n^2\). When small parameters are used, some components of L become large while the corresponding components of M become small. Therefore, when h is not small enough, so that \(\mu _{N,h}\) is large, \(-\mu _{N,h}^2L\) may dominate the actual cutoff mode system matrix M, and the resulting solution would not be accurate. The dependence of the required mesh size on the small parameter used for CRBCs can be examined in Fig. 9. We observe the minimum errors at \(P=13, 17, 22\) for \(h=1/800, 1/1600, 1/3200\), respectively, and they shift to larger P as h is halved. The ratios of the smallest parameter \(a_P=-ike^{-\sqrt{P}}\), which determines \(C_a=O(a_P^{-1})\), between two consecutive minimum error points are \(e^{\sqrt{13}}/e^{\sqrt{17}}\approx 0.5960\) and \(e^{\sqrt{17}}/e^{\sqrt{22}}\approx 0.5670,\) both comparable to the factor 1/2 by which h is reduced. This suggests that \(C_ah\) in Gårding’s inequality is the main factor governing solvability and quasi-optimality of the finite element analysis (9.8), and that h must be chosen small enough when cutoff modes exist and \(a_P\) is small. To examine this observation in more detail, we refine the mesh according to P in such a way that \(e^{\sqrt{P}}h\) is a constant \(C_\mathrm{{newman}}\). For example, \(C_\mathrm{{newman}}\) is taken to be \(e^{\sqrt{10}}/800\approx 0.03\) and we run numerical tests with \(h=C_\mathrm{{newman}}e^{-\sqrt{P}}\) for each P. The results are given in Fig. 10a, with the mesh size for each P in Fig. 10b. While the relative \(L^2\)-errors for the optimal parameters do not improve with decreasing h due to reflection errors, those for Newman’s nodes decrease asymptotically at the same rate as \(S_P\), without any oscillatory behavior, as long as the meshes are refined according to P.
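The arithmetic behind the quoted ratios and the refinement rule can be sketched as follows. The nodes are taken as \(a_j=-ike^{-j/\sqrt{P}}\), so that the smallest parameter is \(a_P=-ike^{-\sqrt{P}}\) as stated above; the function names and the default reference values (\(P=10\), \(h=1/800\)) are assumptions made for illustration.

```python
import math


def newman_nodes(k, P):
    """Geometric sequence a_j = -i*k*exp(-j/sqrt(P)) for j = 0, ..., P."""
    return [-1j * k * math.exp(-j / math.sqrt(P)) for j in range(P + 1)]


def smallest_parameter_ratio(P_from, P_to):
    """Ratio |a_P| at order P_to over |a_P| at order P_from."""
    return math.exp(math.sqrt(P_from) - math.sqrt(P_to))


def refined_mesh_size(P, P_ref=10, h_ref=1.0 / 800):
    """Mesh size h = C_newman * exp(-sqrt(P)), with C_newman = exp(sqrt(P_ref)) * h_ref."""
    c_newman = math.exp(math.sqrt(P_ref)) * h_ref
    return c_newman * math.exp(-math.sqrt(P))


nodes = newman_nodes(6 * math.pi, 13)
print(abs(nodes[0]), abs(nodes[-1]))
print(smallest_parameter_ratio(13, 17), smallest_parameter_ratio(17, 22))
print([refined_mesh_size(P) for P in (10, 15, 20)])
```

Under this rule the product \(e^{\sqrt{P}}h\) stays fixed, so the mesh shrinks at the same rate as the smallest node, which is the scaling suggested by the shift of the error minima.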

Fig. 9 Relative \(L^2\)-errors for the solution including both propagating modes and a cutoff mode satisfying CRBCs with Newman’s nodes

Fig. 10 Relative \(L^2\)-errors in approximate solutions with h refined according to P for the solution including both propagating modes and a cutoff mode satisfying CRBCs with Newman’s nodes

Aside from this, Fig. 11 shows that the norm of the auxiliary variables, \((\Vert {\varPhi }\Vert _{\mathcal {L}}^2+\Vert {\varPhi }\Vert _{{\mathcal {M}}}^2)^{1/2}\), in the second and third examples increases with increasing P, as in the stability analysis of Theorem 8.1, but its variation is small. The observed insensitivity of the finite element problem to P seems to be caused by this small variation of the norm with respect to P.

Fig. 11 Norm of auxiliary variables, \((\Vert {\varPhi }\Vert _{\mathcal {L}}^2+\Vert {\varPhi }\Vert _{\mathcal {M}}^2)^{1/2}\)

In the last example, we are concerned with finite element convergence as h approaches zero. To do this, we set \(k=100\) and take the computational domain to be \({\varOmega }_b=(0,0.1)\times (0,1)\), i.e., \(b=0.1\) and \(W=1\), for which the number of propagating modes is 32. We choose the CRBC of order \((n_p,n_e)=(4,3)\) for which \(\rho _p= 3.9590\times 10^{-6}\) and \(\rho _e=5.3492\times 10^{-5}\) and so reflection errors are negligible compared with mesh errors. The wave source f on \({\varGamma }_W\) is given so that the exact solution is defined by

$$\begin{aligned} u(x,y)=\sum _{n=0}^{31} \frac{1}{64\sqrt{2}}e^{i\mu _nx}Y_n(y) +\sum _{n=32}^{63} \frac{1}{64\sqrt{2}}e^{-{\tilde{\mu }}_nx}Y_n(y) \end{aligned}$$

having 32 propagating modes and 32 evanescent modes. The plot in Fig. 12 shows the quasi-optimal convergence of relative \(L^2\)- and \(H^1\)-errors in finite element approximations with \((n_p,n_e)=(4,3)\).
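Convergence orders such as those predicted by (9.3) and (9.4) are usually read off from errors on successively halved meshes. The following sketch shows the standard estimate \(\log (e_{h_i}/e_{h_{i+1}})/\log (h_i/h_{i+1})\); the error values are synthetic placeholders generated from a model \(e=h^2\), not the measured errors of Fig. 12, and the function name is hypothetical.

```python
import math


def observed_orders(mesh_sizes, errors):
    """Estimate the convergence order between each pair of consecutive meshes."""
    return [
        math.log(errors[i] / errors[i + 1]) / math.log(mesh_sizes[i] / mesh_sizes[i + 1])
        for i in range(len(mesh_sizes) - 1)
    ]


# Mesh sizes used in the last example.
hs = [1 / 200, 1 / 400, 1 / 800, 1 / 1600, 1 / 3200]

# Synthetic model errors (NOT measured data): exact second-order decay.
model_errors = [h ** 2 for h in hs]

print(observed_orders(hs, model_errors))
```

Applied to measured relative errors, values near 2 for the \(L^2\)-norm and near 1 for the \(H^1\)-norm would be consistent with the rates in Theorem 9.1 for bilinear elements.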

Fig. 12 Relative \(L^2\)- and \(H^1\)-errors with \(h=1/200\), \(1/400\), \(1/800\), \(1/1600\), \(1/3200\)