1 Introduction

In this article, we consider the efficient numerical solution of Helmholtz problems

$$\begin{aligned}&-\Delta u - \kappa ^2 u = 0 \quad \text { in }\varOmega ^c,\end{aligned}$$
(1a)
$$\begin{aligned}&u+\alpha \partial _\nu u=u_0 \qquad \text {on }\varGamma :=\partial \varOmega \end{aligned}$$
(1b)

which are used to model acoustic and electromagnetic scattering. Herein, \(\kappa \) denotes the wave number and \(\varOmega ^c:={\mathbb {R}}^3{\setminus } \overline{\varOmega }\) the exterior domain of the obstacle \(\varOmega \subset {\mathbb {R}}^3\). The parameter \(\alpha \) and the right-hand side \(u_0\) appearing in the impedance condition (1b) are given. A convenient way to solve exterior problems is to reformulate them as an integral equation [11, 13, 14] over the boundary \(\varGamma \) of the scatterer \(\varOmega \). The Galerkin discretization leads to large-scale fully populated matrices \(A\in {\mathbb {C}}^{M\times N}\),

$$\begin{aligned} a_{ij}=\int _{\varGamma } \int _{\varGamma } K(x,y) {\varphi }_i(x)\psi _j(y){\,\text {d}}s_y{\,\text {d}}s_x, \end{aligned}$$
(2)

with \(i\in I:=\{1,\dots ,M\},\,j\in J:=\{1,\dots ,N\}\), test and ansatz functions \({\varphi }_i,\,\psi _j\), having supports \(X_i:={\text {supp}\,}{\varphi }_i\) and \(Y_j:={\text {supp}\,}\psi _j\), respectively. We consider kernel functions \(K\) of the form

$$\begin{aligned} K(x,y):=f(x,y)\exp ({\text {i}}\kappa |x-y|) \end{aligned}$$
(3)

with an arbitrary asymptotically smooth (with respect to \(x\) and \(y\)) function \(f\), i.e., there are constants \(c_{\text {as},1},c_{\text {as},2}>0\) such that for all \(\alpha ,\beta \in {\mathbb {N}}^3\)

$$\begin{aligned} |\partial _x^\alpha \partial _y^\beta f(x,y)|\le c_{\text {as},1} c_{\text {as},2}^p \, \alpha !\beta ! \, \frac{|f(x,y)|}{|x-y|^p},\quad p:=|\alpha +\beta |. \end{aligned}$$
(4)

An example is \(K(x,y)=S(x-y)\) used in the single layer ansatz, where \(S(x)=\exp ({\text {i}}\kappa |x|)/(4\pi |x|)\) denotes the fundamental solution. Notice that the double layer potential \(K(x,y)=\partial _{\nu _y} S(x-y)\) is of the form (3) only if \(\varGamma \), i.e. the unit outer normal \(\nu \), is sufficiently smooth.

Depending on the application, low- or high-frequency problems are to be solved. For low-frequency problems, i.e. for \(\kappa \,{\text {diam}}{\varOmega }\le 1\), the treecode algorithm [5] and fast multipole methods (FMM) [24–26, 35] were introduced to treat \(A\) with log-linear complexity. The panel clustering method [32] is directed towards more general kernel functions. All previous methods rely on degenerate approximations

$$\begin{aligned} K(x,y)\approx \sum _{i=1}^k u_i(x) v_i(y),\quad x\in X,\, y\in Y, \end{aligned}$$
(5)

using a short sum of products of functions \(u_i\) and \(v_i\), each depending on only one of the two variables \(x\) and \(y\), which are chosen from a pair of domains \(X\times Y\) satisfying the far-field condition

$$\begin{aligned} \eta _{\text {low}}\, {\text {dist}}(X,Y)\ge \max \{{\text {diam}}X,{\text {diam}}Y\} \end{aligned}$$
(6)

with a given parameter \(\eta _{\text {low}}>0\). Since replacing the kernel function \(K\) in the integrals (2) with degenerate approximations (5) leads to matrices of low rank, a more direct approach to the efficient treatment of matrices (2) is provided by algebraic methods such as mosaic-skeletons [39] and hierarchical matrices [27, 29]. These methods also allow one to define approximate replacements of usual matrix operations such as addition, multiplication, inversion, and LU factorization; cf. [23]. An efficient and convenient way to construct low-rank approximations is the adaptive cross approximation (ACA) method [6]. The advantage of this approach compared with explicit kernel approximation is that significantly better approximations can be expected due to its quasi-optimal approximation properties; cf. [7]. Furthermore, ACA has the practical advantage that only a few of the original entries of \(A\) are used for its approximation. A second class are wavelet compression techniques [1], which lead to sparse and asymptotically well-conditioned approximations of the coefficient matrix.
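For illustration, the low-rank property behind (5) is easy to verify numerically. The following minimal numpy sketch (the sample geometry and parameters are placeholders) evaluates the single layer kernel \(S\) on two point clouds satisfying (6) in the low-frequency regime and prints the decay of the singular values:

```python
import numpy as np

# Minimal numerical check of (5)/(6): for two well-separated point clouds
# (far-field condition (6)) in the low-frequency regime, the kernel matrix
# of S(x - y) = exp(i*kappa*|x - y|)/(4*pi*|x - y|) has rapidly decaying
# singular values, i.e. it is numerically of low rank.
rng = np.random.default_rng(0)
kappa = 1.0
X = rng.uniform(0.0, 1.0, size=(200, 3))                     # diam X ~ sqrt(3)
Y = rng.uniform(0.0, 1.0, size=(200, 3)) + [5.0, 0.0, 0.0]   # dist(X,Y) ~ 4

D = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=-1)
A = np.exp(1j * kappa * D) / (4.0 * np.pi * D)

sv = np.linalg.svd(A, compute_uv=False)
print("relative singular values:", sv[:10] / sv[0])
# Fast decay confirms that a short sum (5) suffices on admissible blocks.
```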

It is known that the fundamental solution \(S\) (and its derivatives) of any elliptic operator allows for a degenerate approximation (5) on a pair of domains \((X,Y)\) satisfying (6); see [7]. This applies to the Yukawa operator \(-\Delta +\kappa ^2\) for any \(\kappa \), because the decay of \(S\) benefits from the positive shift \(\kappa ^2\). However, the negative shift \(-\kappa ^2\) in the Helmholtz operator introduces oscillations in \(S\). Hence, for high-frequency Helmholtz problems, i.e. for \(\kappa \,{\text {diam}}{\varOmega }>1\), the wave number \(\kappa \) enters the degree of degeneracy \(k\) in (5) in a way that \(k\) grows linearly with \(\kappa \). In addition to this difficulty, the mesh width \(h\) of the discretization has to be chosen such that \(\kappa h\sim 1\) for a sufficient accuracy of the solution. We assume that

$$\begin{aligned} \kappa h\le \frac{1}{2},\quad h:=\max \{{\text {diam}}{X_i},{\text {diam}}{Y_j},\,i\in I,\,j\in J\}\sim 1/\sqrt{N}, \end{aligned}$$
(7)

which implies that \(\kappa \sim \sqrt{N}\sim \sqrt{M}\). Notice that the recent formulation [16] allows one to avoid the previous condition and hence leads to significantly smaller \(N\). For high-frequency Helmholtz problems, one- and two-level versions of the FMM [36, 37] were presented with complexity \(O(N^{3/2})\) and \(O(N^{4/3})\), respectively. Multi-level algorithms [2, 18] achieve logarithmic-linear complexity. The previous methods are based on an extensive analytic apparatus that is tailored to the kernel function \(K\). To overcome the instability of some of the employed expansions at low frequencies, a wideband version of the FMM was presented in [17]. The \({\mathcal {H}}^2\)-matrix approach presented in [4] is based on the explicit kernel expansions used in [2, 36] for two-dimensional problems.

A well-known idea from physical optics (cf. [21]) is to approximate \(K(\cdot ,y)\) in a given direction \(e\in {{\mathbb {S}}}^2\) by a plane wave. The desired boundedness of \(k\) with respect to \(\kappa \) when approximating

$$\begin{aligned} \hat{K}(x,y):=K(x,y)\exp (-{\text {i}}\kappa (x-y,e)) \end{aligned}$$

can be achieved if (6) is replaced by a condition which depends on \(\kappa \) and which is directionally dependent. This is exploited by the fast multipole methods presented in [12, 19, 20, 33] and the so-called butterfly algorithm [15, 34]. The aim of this article is to combine this approach with the ease of use of ACA, i.e., our aim is to construct approximations to \(A\) with complexity \(k^2\,N\log {N}\) using only a few of the original entries of \(A\). In this sense, this article generalizes ACA (which achieves log-linear complexity only for low frequencies) to high-frequency Helmholtz problems. An interesting and important property of the new method is that it allows for a continuous and numerically stable transition from low to high wave numbers \(\kappa \) by a generalized far-field condition that fades to the usual condition (6) if the wave number becomes small. Since we approximate the operator rather than just its application to a vector, this article is expected to lay the groundwork for future work on approximate arithmetic operations and hence on efficient preconditioners for high-frequency problems.
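The smoothing effect of the plane-wave division is also easy to observe numerically. A minimal sketch (toy geometry; \(f(x,y)=1/(4\pi |x-y|)\) and \(y=0\)) comparing the variation of \(K\) and \(\hat{K}\) along the direction \(e\):

```python
import numpy as np

# The modulated kernel K^(x,y) = K(x,y)*exp(-i*kappa*((x-y),e)) varies
# slowly along the direction e, while K itself oscillates with wavelength
# 2*pi/kappa. Toy check with f(x,y) = 1/(4*pi*|x-y|) and y = 0.
kappa, e = 50.0, np.array([1.0, 0.0, 0.0])
t = np.linspace(10.0, 11.0, 200)
x = t[:, None] * e + np.array([0.0, 0.2, 0.0])   # points in a narrow cone around e

r = np.linalg.norm(x, axis=-1)
K = np.exp(1j * kappa * r) / (4.0 * np.pi * r)
K_hat = K * np.exp(-1j * kappa * (x @ e))

dt = t[1] - t[0]
print("variation of K    :", np.abs(np.diff(K)).max() / dt)
print("variation of K_hat:", np.abs(np.diff(K_hat)).max() / dt)
# The variation of K_hat along e is smaller by orders of magnitude.
```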

The remaining part of this article is organized as follows. In Sect. 2, we prove estimates like (4) for \(\hat{K}\) in a cone around \(e\). The desired asymptotic smoothness of \(\hat{K}\) leads to a far-field condition on the pair of domains \((X,Y)\) on which such estimates are valid. In Sect. 3, these conditions will be accounted for by subdividing the matrix indices hierarchically. It will be seen that the number of blocks resulting from this partitioning is too large to allow for hierarchical matrix approximations with log-linear complexity. Therefore, nested bases approximations are required, and in Sect. 4 directional \({\mathcal {H}}^2\)-matrices will be introduced as a generalization of usual \({\mathcal {H}}^2\)-matrices [28, 31] that incorporate a directional hierarchy. To define approximate arithmetic operations for such matrices, the \({\mathcal {H}}^2\)-matrix operations introduced in [9] have to be generalized to take into account the directional hierarchy.

The results obtained in the first part of this article apply to any method of constructing the low-rank approximation, e.g. polynomial interpolation. Sect. 5 is devoted to the construction of directional \({\mathcal {H}}^2\)-matrices using only a few of the entries of \(A\). Error estimates for the constructed nested bases are presented, and complexity estimates prove the log-linear overall storage and the log-linear number of operations required by the new technique. Finally, Sect. 6 presents numerical experiments that validate our analysis.

2 Directional asymptotic smoothness

In [7] it is proved that the singularity function of any elliptic second-order partial differential operator is asymptotically smooth. The latter property can be used to prove convergence of ACA and hence the existence of degenerate kernel approximations (5). The wave number \(\kappa \) enters the estimates on \(k\) in (5) only through the expression \(c_\kappa :=\max \{1,\kappa \,\max _{x\in X,\,y\in Y}|x-y|\}\), which in general becomes unbounded in the limit \(\kappa \rightarrow \infty \). For parts \(X\) of the domain which are of small size, i.e. \(\kappa \,{\text {diam}}{X}\le 1\), and which satisfy (6), \(c_\kappa \) and hence \(k\) in (5) are bounded independently of \(\kappa \). This follows from the fact that the recursive construction of domains satisfying (6) ensures that (6) is sharp in the sense that there is a constant \(q>1\) such that

$$\begin{aligned} \eta _{\text {low}}\,{\text {dist}}(X,Y)\le q\min \{{\text {diam}}X,{\text {diam}}Y\}. \end{aligned}$$

If the diameters of \(X\) and \(Y\) are comparable, i.e. \({\text {diam}}Y\le c\,{\text {diam}}X\) (which is a valid assumption for \({\mathcal {H}}\)-matrix partitions), then

$$\begin{aligned} \max _{x\in X,\,y\in Y} |x-y|\le {\text {dist}}(X,Y)+{\text {diam}}X+{\text {diam}}Y \le \left( \frac{q}{\eta _{\text {low}}}+1+c\right) {\text {diam}}X \end{aligned}$$

and thus

$$\begin{aligned} c_\kappa \le \max \{1,\left( \frac{q}{\eta _{\text {low}}}+1+c\right) \kappa \,{\text {diam}}X\}\le \frac{q}{\eta _{\text {low}}}+1+c \end{aligned}$$
(8)

is bounded independently of \(\kappa \). In the other case \(\kappa \,{\text {diam}}X>1\), we will not be able to prove asymptotic smoothness with bounded constants. However, a similar property can be proved if the far-field condition (6) is replaced with a frequency dependent condition and if the corresponding far field is subdivided into directions.

For ease of presentation, \(K\) defined in (3) will be investigated as a function of \(x\) with fixed \(y\). Hence, after shifting \(x\) to \(x+y\) we consider

$$\begin{aligned} \hat{K}(x)=f(x)\exp ({\text {i}}\kappa [|x|-(x,e)]), \end{aligned}$$

which can be regarded as \(K\) divided by the plane wave \(\exp ({\text {i}}\kappa (x,e))\) with some given vector \(e\in {\mathbb {S}}^2\); cf. Fig. 1. It will be shown that \(\hat{K}\) is asymptotically smooth with respect to \(x\) in a cone around \(e\). To this end, \(\hat{K}\) will be investigated in the coordinates \(r:=|x|\) and

$$\begin{aligned} \xi :=|x|-(x,e). \end{aligned}$$

As a first step, we investigate the derivative of \(\xi \) in direction \(e\), i.e., we assume that \(\cos {\varphi }>0\), where \({\varphi }:=\measuredangle (x,e)\). Notice that the desired smoothness cannot be expected for \(\cos {\varphi }\le 0\); see Fig. 1. Furthermore, it will be seen later that it is required that \(\sin {\varphi }\rightarrow 0\) as \(\kappa \rightarrow \infty \). Hence, the following upper bound on \(|\sin {\varphi }|\) is a valid technical assumption.

Fig. 1 \({\text {Re}}\, K(x_1,x_2,0)\) and \({\text {Re}}\, \hat{K}(x_1,x_2,0)\) with \(e=(1,0,0)^T\)

Lemma 1

Let \(x\in {\mathbb {R}}^3\) such that \(\cos {\varphi }>0\) and \(|\sin {\varphi }|<1/4\). Then there is a constant \(c_{\text {as},3}>0\) such that

$$\begin{aligned} |\partial _e^p\xi (x)|\le c_{\text {as},3} \, 2^p\, p!\, |x|^{1-p} (\sin {\varphi })^2,\quad p\in {\mathbb {N}}. \end{aligned}$$

Proof

We may assume that \(e=e_1:=(1,0,0)^T\). Hence, \(x_1=|x|\cos \measuredangle (x,e_1)>0\) and \(\xi (x)=|x|-x_1\). In order to estimate the \(p\)-th order derivative of \(\xi \) with respect to \(x_1\), we define

$$\begin{aligned} b^2:=x_2^2+x_3^2=|x|^2-x_1^2=(|x|\sin {\varphi })^2 \end{aligned}$$

and extend \(\xi \), regarded as a function of \(x_1\), to

$$\begin{aligned} \hat{\xi }(z):=\sqrt{z^2+b^2}-z, \end{aligned}$$

which is holomorphic in \(B_\rho (x_1),\,\rho :=|x|/2\). Consider \(z=\alpha +{\text {i}}\beta \in B_\rho (x_1)\). It is easy to see that any ball \(B_r(x_1)\) of radius \(0<r<x_1/\sqrt{2}\) is contained in the set \(\{\alpha +{\text {i}}\beta \in {\mathbb {C}}:\alpha >|\beta |\}\). The assumption \(|\sin {\varphi }|<1/4\) implies that \(\cos {\varphi }>1/\sqrt{2}\) and hence that \(\rho =|x|/2=x_1/(2\cos {\varphi })<x_1/\sqrt{2}\). In particular, we obtain that \(\alpha >b>0\) and \(\alpha ^2>\beta ^2\). With \(A:=|z^2+b^2|=\sqrt{B^2+4\alpha ^2\beta ^2},\,B:=\alpha ^2-\beta ^2+b^2\), we have

$$\begin{aligned} \sqrt{z^2+b^2}=\sqrt{\frac{1}{2}(A+B)}+{\text {i}}\,\text {sgn}(\beta )\sqrt{\frac{1}{2}(A-B)}. \end{aligned}$$

Due to \(|\sqrt{x^2\pm y^2}-|x||\le y^2/\sqrt{x^2\pm y^2}\) for all \(x,y\in {\mathbb {R}}\), it follows that

$$\begin{aligned} |\frac{1}{2}(A+B)-\alpha ^2| =\frac{1}{2}|\sqrt{(\alpha ^2+\beta ^2-b^2)^2+4\alpha ^2b^2}-(\alpha ^2+\beta ^2-b^2)| \le 2\frac{b^2\alpha ^2}{A} \end{aligned}$$

and

$$\begin{aligned} |\frac{1}{2}(A-B)-\beta ^2| =\frac{1}{2}|\sqrt{(\alpha ^2+\beta ^2+b^2)^2-4\beta ^2b^2}-(\alpha ^2+\beta ^2+b^2)| \le 2\frac{b^2\beta ^2}{A}. \end{aligned}$$

Hence,

$$\begin{aligned} |\sqrt{z^2+b^2}-z|^2&=|\sqrt{\frac{1}{2}(A+B)}-\alpha |^2 +|\sqrt{\frac{1}{2}(A-B)}-|\beta ||^2\\&\le \frac{8b^4}{A^2}\left( \frac{\alpha ^4}{A+B} +\frac{\beta ^4}{A-B}\right) \\&\le \frac{8b^4}{A^2}\left( \frac{\alpha ^4}{2\alpha ^2-4b^2\alpha ^2/A} +\frac{\beta ^4}{2\beta ^2-4b^2\beta ^2/A}\right) \\&=\frac{4b^4}{A^2}\frac{|z|^2}{1-2b^2/A} \le \frac{4b^4}{A}\frac{1}{1-8\sqrt{2}(\sin {\varphi })^2}, \end{aligned}$$

which follows from

$$\begin{aligned} A&=|z^2+b^2|=\sqrt{|z|^4+2b^2(\alpha ^2-\beta ^2)+b^4} \ge \sqrt{|z|^4+b^4}\ge \frac{1}{\sqrt{2}}(|z|^2+b^2)\\&\ge \frac{1}{\sqrt{2}}((x_1-\rho )^2+b^2)\ge \frac{1}{\sqrt{2}}(|x|^2-2\rho |x|+\rho ^2) =\frac{|x|^2}{4\sqrt{2}}=\frac{b^2}{4\sqrt{2}(\sin {\varphi })^2} \end{aligned}$$

and in particular \(A\ge |z|^2\). From Cauchy’s differentiation formula we obtain

$$\begin{aligned} |\partial _{x_1}^p \xi (x)|&=|\hat{\xi }^{(p)}(x_1)| \le \frac{p!}{2\pi }\int _{\partial B_\rho (x_1)} \frac{|\hat{\xi }(z)|}{|z-x_1|^{p+1}}|{\,\text {d}}z|\\&\le \frac{2^pp!}{|x|^{p-1}}\frac{4\root 4 \of {2}(\sin {\varphi })^2}{\sqrt{1-8\sqrt{2}(\sin {\varphi })^2}}. \end{aligned}$$

\(\square \)

Using the previous estimate on the derivatives of \(\xi \), we are now able to estimate the derivatives of \(\hat{K}\) in directions \(e\) and \(e_\perp \in {\mathbb {S}}^2\) perpendicular to \(e\). To this end, we exploit (4) and make use of

$$\begin{aligned} |\partial ^p_{v} \hat{K}(x)|&\le \sum _{i+j=p} \binom{p}{i} |\partial ^i_{v} \exp ({\text {i}}\kappa \xi )| \, |\partial ^{j}_{v} f(x)|\nonumber \\&\le c_{\text {as},1}p! |\hat{K}(x)| \sum _{i+j=p} \frac{ |\partial ^i_{v} \exp ({\text {i}}\kappa \xi )|}{i!} \, \left( \frac{c_{\text {as},2}}{r} \right) ^{j}, \end{aligned}$$
(9)

which holds true for any direction \(v\). Thus, we require an estimate for \(|\partial ^i_{v} \exp ({\text {i}}\kappa \xi )|\). This will be done in the following two lemmas for the cases \(v=e\) and \(v=e_{\perp }\), respectively.

Lemma 2

Let \(x\) and \({\varphi }\) be as in Lemma 1. Let \(d>1/\kappa \) and define \(\eta :=\kappa \, d^2/r,\,\gamma :=\kappa \, d |\sin {\varphi }|\). Then

$$\begin{aligned} |\partial ^p_e \hat{K}(x)|\le c\, p! \left( \frac{\rho }{d}\right) ^p | \hat{K}(x)|,\quad p\in {\mathbb {N}}, \end{aligned}$$

where \(c := 2 c_{\text {as},1}\exp (4c_{\text {as},3}\gamma )\) and \(\rho := \max \{c_{\text {as},2},4\} \max \{\gamma ,\eta \}\).

Proof

In order to estimate \(|\partial ^i_{e} \exp ({\text {i}}\kappa \xi )|\), we apply Faà di Bruno’s formula expressed in terms of Bell polynomials \(B_{n,k}\)

$$\begin{aligned} \frac{{\,\text {d}}^i}{{\,\text {d}}x^i} (f\circ g)(x)=\sum _{k=0}^i f^{(k)}(g(x)) B_{i,k}(g'(x),\dots ,g^{(i-k+1)}(x)). \end{aligned}$$

Using Lemma 1 and \(\kappa \,d>1\), we obtain

$$\begin{aligned} |\partial ^i_e\exp ({\text {i}}\kappa \xi )|&= |\sum _{k=0}^i \partial ^k_\xi \exp ({\text {i}}\kappa \xi ) \sum _{\begin{array}{c} \sum _\nu j_\nu =k \\ \sum _\nu \nu j_\nu =i \end{array}} \frac{i!}{j_1!j_2!\ldots } \prod _{\ell =1}^{i-k+1} \left( \frac{\partial ^\ell _e \xi (x)}{\ell !}\right) ^{j_\ell }| \\&\le i! \sum _{k=0}^i \kappa ^k \sum _{\begin{array}{c} \sum _\nu j_\nu =k \\ \sum _\nu \nu j_\nu =i \end{array}} \frac{1}{j_1!j_2!\ldots } \prod _{\ell =1}^{i-k+1} \left( \frac{c_{\text {as},3} \, 2^\ell (\sin \varphi )^2}{r^{\ell -1}}\right) ^{j_\ell } \\&\le 2^i i! \sum _{k=0}^i (c_{\text {as},3}\,\kappa (\sin \varphi )^2)^k r^{k-i}\sum _{\begin{array}{c} \sum _\nu j_\nu =k \\ \sum _\nu \nu j_\nu =i \end{array}} \frac{1}{j_1!j_2!\ldots } \\&= \left( \frac{2}{\kappa \, d^2}\right) ^i i! \sum _{k=0}^i (c_{\text {as},3} \gamma ^2)^k \eta ^{i-k} \sum _{\begin{array}{c} \sum _\nu j_\nu =k \\ \sum _\nu \nu j_\nu =i \end{array}} \frac{1}{j_1!j_2!\ldots }\\&< \left( \frac{2\tilde{\rho }}{d} \right) ^i i! \sum _{k=0}^i ( c_{\text {as},3}\gamma )^k \sum _{\begin{array}{c} \sum _\nu j_\nu =k \\ \sum _\nu \nu j_\nu =i \end{array}} \frac{1}{j_1!j_2!\ldots }, \end{aligned}$$

where \(\tilde{\rho }:=\max \{\eta ,\gamma \}\) and \(j_\nu = 0\) for all \(\nu > i-k+1\). From the multinomial theorem for \(\varvec{j}\in {\mathbb {N}}^L\) and \(L:=i-k+1\)

$$\begin{aligned} \sum _{|\varvec{j}|=k} \binom{k}{\varvec{j}} x^{\varvec{j}}=\Big (\sum _{i=1}^L x_i\Big )^k \end{aligned}$$

it follows that

$$\begin{aligned} \sum _{\begin{array}{c} \sum _\nu j_\nu =k \\ \sum _\nu \nu j_\nu =i \end{array}} \frac{1}{j_1!\cdots j_L!} = \frac{2^k}{k!} \sum _{\begin{array}{c} \sum _\nu j_\nu =k \\ \sum _\nu \nu j_\nu =i \end{array}} \binom{k}{\varvec{j}} \prod _{\ell =0}^L(2^{-\ell })^{j_\ell } \le \frac{2^k}{k!} \Big (\sum _{\ell =0}^L 2^{-\ell }\Big )^k \le \frac{4^k}{k!} \end{aligned}$$

and hence \(|\partial ^i_e\exp ({\text {i}}\kappa \xi )| \le \tilde{c}\, i!(2\tilde{\rho }/d)^i\) with \(\tilde{c} := \exp (4 c_{\text {as},3}\gamma )\). Together with (9) this yields

$$\begin{aligned} |\partial ^p_e \hat{K}(x)|&\le \tilde{c}\,c_{\text {as},1}\,p!\frac{|\hat{K} (x)|}{d^p} \sum _{i+j =p} (2\tilde{\rho })^i \left( \frac{c_{\text {as},2} \,d}{r}\right) ^{j} \\&\le \tilde{c}\,c_{\text {as},1} p! \left( \frac{2\tilde{\rho }}{d}\right) ^p |\hat{K}(x)| \sum _{i+j = p} \left( \frac{c_{\text {as},2}}{2}\right) ^{j} \\&\le 2 \tilde{c}\, c_{\text {as},1} p! \left( \frac{\max \{c_{\text {as},2},4\}\tilde{\rho }}{d}\right) ^p |\hat{K}(x)| \end{aligned}$$

due to \(d/r=\eta /(\kappa \,d)<\eta \le \tilde{\rho }\) and \(\sum _{j=0}^p t^j \le 2\,t^p\) for \(t\ge 2\). \(\square \)

Lemma 3

Let \(e_\perp \in {\mathbb {S}}^2\) be any direction perpendicular to \(e\in {\mathbb {S}}^2\) and let \(d,\eta ,\gamma \) be as in Lemma 2 such that \(\eta ,\gamma <1\). Then

$$\begin{aligned} |\partial _{e_\perp }^p \hat{K}(x)| \le 2\,c_{\text {as},1 } p! \left( \frac{\rho }{d}\right) ^p |\hat{K}(x)|, \end{aligned}$$

where \(\rho := \max \{6/\sqrt{\tilde{\rho }}, 2 c_{\text {as},2}\}\, \tilde{\rho }\) and \(\tilde{\rho }:= \max \{\eta ,\gamma \}\).

Proof

First, we claim that \(\partial _{e_\perp }^i \exp ({\text {i}}\kappa \xi )\) consists of at most \(3^i\) summands of the form

$$\begin{aligned} g(x) := c_g \frac{({\text {i}}\kappa )^n}{r^{n+2m}} (x,e_\perp )^{2(n+m)-i} \exp ({\text {i}}\kappa \xi ), \end{aligned}$$

where \(n,m\in {\mathbb {N}}\) with \(2(m+n) \ge i \ge m+n\) and \(|c_g| \le 2^i i!\). This can be seen by induction using

$$\begin{aligned} \partial _{e_\perp } g(x)&= c_g \frac{({\text {i}}\kappa )^{n+1}}{r^{n+1+2m}} (x,e_\perp )^{2(n+1+m)-(i+1)} \exp ({\text {i}}\kappa \xi ) \\&\quad + (2n+2m-i)\,c_g \frac{({\text {i}}\kappa )^n}{r^{n+2m}} (x,e_\perp )^{2(n+m)-(i+1)} \exp ({\text {i}}\kappa \xi ) \\&\quad - (n+2m)\, c_g \frac{({\text {i}}\kappa )^n}{r^{n+2(m+1)}} (x,e_\perp )^{2(n+m+1)-(i+1)} \exp ({\text {i}}\kappa \xi ). \end{aligned}$$

We estimate

$$\begin{aligned} \frac{|g(x)|}{i!}&\le \frac{|c_g|}{i!} \frac{\kappa ^n}{r^{n+2m}} |(x,e_\perp )|^{2(n+m)-i} \le 2^i \frac{\kappa ^n}{r^{n+2m}} (r|\sin {\varphi }|)^{2(n+m)-i}\\&\le 2^i \frac{\kappa ^{i-(2m+n)}}{r^{i-n}} \left( \frac{\gamma }{d}\right) ^{2(n+m)-i}. \end{aligned}$$

Here, we used that \(|(x,e_\perp )| \le r|\sin {\varphi }| = r\gamma /(\kappa \,d)\). As in the proof of Lemma 2 we have that \(d/r <\eta \) and hence

$$\begin{aligned} \frac{d^i}{i!}|g(x)|&\le 2^i \frac{\kappa ^{i-(2m+n)}}{r^{i-n}} \gamma ^{2(n+m)-i} d^{2(i-n-m)} = 2^i \eta ^{i-(2m+n)} \gamma ^{2(n+m)-i} \left( \frac{d}{r} \right) ^{2m} \\&\le 2^i \tilde{\rho }^{2m+n} \le 2^i \tilde{\rho }^{i/2}. \end{aligned}$$

The last estimate follows from \(2m+n\ge i/2\). This implies that \(| \partial ^{i}_{e_\perp } \exp ({\text {i}}\kappa \xi ) | \le i! (6/d)^i \,\tilde{\rho }^{i/2}\) and together with (9) we get

$$\begin{aligned} \frac{d^p}{p!}|\partial ^p_{e_\perp } \hat{K}(x)|&\le c_{\text {as},1} |\hat{K}(x)| \sum _{i+j=p} 6^i \tilde{\rho }^{i/2} \left( \frac{ c_{\text {as},2}\,d}{r} \right) ^{j} \\&\le c_{\text {as},1} \, (c_{\text {as},2}\,\tilde{\rho })^{p} |\hat{K}(x)| \sum _{i+j=p} \left( \frac{6}{c_{\text {as},2}\,\sqrt{\tilde{\rho }}}\right) ^i \\&\le c_{\text {as},1}\, (c_{\text {as,2}}\,\tilde{\rho })^p|\hat{K}(x)| \sum _{i=0}^p \left( \frac{\hat{c}}{ c_{\text {as,2}}} \right) ^i \le 2\, c_{\text {as,1}} ( \hat{c}\,\tilde{\rho })^p |\hat{K}(x)|, \end{aligned}$$

where \(\hat{c} := \max \{6/\sqrt{\tilde{\rho }}, 2 c_{\text {as,2}}\}\). \(\square \)

We return to the general case of estimating the derivatives of \(\hat{K}(x,y)\) for \(x\in X\) and \(y\in Y\). The last two lemmata show that the derivatives of \(\hat{K}\) can be controlled by the parameters \(\eta ,\,\gamma \), and \(d\). Let \(\chi (X)\) denote the Chebyshev center of \(X\). Using the angle condition

$$\begin{aligned} |\sin \measuredangle (\chi (X)-y,e)|\le \frac{\gamma _{{\text {high}}}}{\kappa \,{\text {diam}}{X}},\quad y\in Y, \end{aligned}$$
(10)

and the high-frequency far-field condition

$$\begin{aligned} \eta _{{\text {high}}}{\text {dist}}(X,Y)\ge \kappa ({\text {diam}}{X})^2 \end{aligned}$$
(11)

with \(0<\gamma _{{\text {high}}},\eta _{{\text {high}}}<1\), we obtain for the choice \(d={\text {diam}}{X}\) and \(x\mapsto x-y\) that \(d>1/\kappa \) and

$$\begin{aligned} \eta =\frac{\kappa \,d^2}{r}=\frac{\kappa ({\text {diam}}{X})^2}{|x-y|} \le \frac{\kappa ({\text {diam}}{X})^2}{{\text {dist}}(X,Y)}\le \eta _{{\text {high}}}. \end{aligned}$$

The following lemma shows that \(\gamma =\kappa \,{\text {diam}}{X}\sin \measuredangle (x-y,e)\) is bounded by \(\tfrac{\gamma _{{\text {high}}}+\eta _{{\text {high}}}}{1-\eta _{{\text {high}}}}\).

Lemma 4

Let \(X\) and \(Y\) satisfy (10) and (11). Then for \(x\in X\) and \(y\in Y\)

$$\begin{aligned} |\sin \measuredangle (x-y,e)|\le \frac{\gamma _{{\text {high}}}+\eta _{{\text {high}}}}{1-\eta _{{\text {high}}}} \frac{1}{\kappa \, \mathrm{diam}{X}}. \end{aligned}$$

Proof

Let \(u\times v\) denote the cross product of \(u,v\in {\mathbb {R}}^3\). Then

$$\begin{aligned} |(x-y)\times e|\le |x-\chi (X)|+|(\chi (X)-y)\times e|\le {\text {diam}}{X}+|(\chi (X)-y)\times e|. \end{aligned}$$

It follows that

$$\begin{aligned} |\sin \measuredangle (x-y,e)|=\frac{|(x-y)\times e|}{|x-y|} \le \frac{|(\chi (X)-y)\times e|+{\text {diam}}{X}}{|\chi (X)-y|-{\text {diam}}{X}}. \end{aligned}$$

Due to \(\eta _{\text {high}}|\chi (X)-y|\ge \kappa ({\text {diam}}{X})^2\ge {\text {diam}}{X}\), we obtain that the denominator of the last expression is bounded from below by \((1-\eta _{\text {high}})|\chi (X)-y|\). Hence,

$$\begin{aligned} |\sin \measuredangle (x-y,e)|&\le \frac{1}{1-\eta _{\text {high}}}\frac{|(\chi (X)-y)\times e|+{\text {diam}}{X}}{|\chi (X)-y|}\\&\le \frac{1}{1-\eta _{\text {high}}}\left( \frac{\gamma _{{\text {high}}}}{\kappa \,{\text {diam}}{X}} +\frac{{\text {diam}}{X}}{|\chi (X)-y|}\right) \\&\le \frac{\gamma _{{\text {high}}} +\eta _{\text {high}}}{1-\eta _{\text {high}}}\frac{1}{\kappa \,{\text {diam}}{X}}. \end{aligned}$$

\(\square \)

As a consequence of the angle condition (10) and the far-field condition (11) we obtain from Lemmas 2 and 3 for \(x\in X\) and \(y\in Y\)

$$\begin{aligned} \max \Big \{ |\partial ^p_{e,x} \hat{K}(x,y)|,|\partial ^p_{e_\perp ,x} \hat{K}(x,y)|\Big \} \le c\, p!\left( \frac{\rho }{{\text {diam}}{X}}\right) ^p|\hat{K}(x,y)|, \end{aligned}$$
(12)

where \(c\) is independent of \(\kappa \), the directions \(e,e_\perp \in {\mathbb {S}}^2\) satisfy \((e,e_\perp )=0\) and \(0<\rho <1\) for small enough \(\eta _{{\text {high}}}\) and \(\gamma _{{\text {high}}}\). Figure 2 shows that the angle condition is not a result of overestimation. The angle of the region in which \(\hat{K}\) is smooth decreases with \(\kappa \).

Fig. 2 \({\text {Re}}\, \hat{K}(x_1,x_2,0)\) with \(e=(1,0,0)^T\) for \(\kappa =2\) and \(\kappa =4\)

3 Matrix partitioning

The aim of this section is to partition the set of indices \(I\times J,\,I=\{1,\dots ,M\}\) and \(J=\{1,\dots ,N\}\), of the matrix defined in (2) into sub-blocks \(t\times s,\,t\subset I\) and \(s\subset J\), such that the associated supports

$$\begin{aligned}X_t:=\bigcup _{i\in t} X_i \quad \text {and}\quad Y_s:=\bigcup _{j\in s} Y_j \end{aligned}$$

satisfy (11) in the high-frequency case \(\kappa \,{\text {diam}}X_t>1\) and (6) if \(\kappa \,{\text {diam}}{X_t}\le 1\).

Before we discuss the matrix partition, let us make some assumptions that are in line with the usual finite element discretization. The first assumption is that the overlap of the sets \(X_i,\,i\in I\), is bounded in the sense that there is a constant \(\nu >0\) such that

$$\begin{aligned} \sum _{i\in t} \mu (X_i)\le \nu \mu (X_t)\quad \text {for any }t\subset I. \end{aligned}$$
(13)

Furthermore, the surface measure \(\mu \) has the property that there is \(c_\varGamma >0\) such that

$$\begin{aligned} \mu (X)\le c_\varGamma ({\text {diam}}X)^2 \end{aligned}$$

for all \(X\subset \varGamma \). The usual way of constructing hierarchical matrix partitions is based on cluster trees; see [7, 30]. A cluster tree \(T_I\) for the index set \(I\) is a binary tree with root \(I\), where each \(t \in T_I\) and its successors \(t',t'' \in T_I\) (if they exist) have the properties \(t = t' \cup t'',\,t'\ne \emptyset \ne t''\), and \(t' \cap t'' = \emptyset \). We refer to \(\mathcal {L}(T_I) = \{t \in T_I:t\text { has no successors}\}\) as the leaves of \(T_I\) and define

$$\begin{aligned} T_I^{(\ell )} = \{t \in T_I : {\text {dist}}(t,I) = \ell \} \subset T_I, \end{aligned}$$

where \({\text {dist}}(t, s)\) is the minimum distance between \(t\) and \(s\) in \(T_I\). Furthermore,

$$\begin{aligned} L:=L(T_I):=\max \{{\text {dist}}(t,I),\,t\in T_I\}+1 \end{aligned}$$

denotes the depth of \(T_I\). We assume that a cluster tree \(T_I\) is constructed such that there are constants \(c_g,c_G>0\) with

$$\begin{aligned} 2^{-\ell }/c_G\le \mu (X_t) \quad \text {and}\quad ({\text {diam}}X_t)^2\le c_g 2^{-\ell } \end{aligned}$$
(14)

for all \(t\) from the \(\ell \)-th level \(T_I^{(\ell )}\) of \(T_I\). We will make use of the notation \(S_I(t)\) for the set of sons of a cluster \(t\in T_I\). The same properties are also assumed for the sets \(Y_j,\,j\in J\). Under these assumptions it follows that the depth \(L\) of the cluster trees is of the order \(L\sim \log {N}\sim \log {M}\).
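A cluster tree with the properties (14) can be obtained by recursive bisection of bounding boxes. The following minimal sketch illustrates this construction (the class layout and attribute names are chosen for this sketch only; the cluster center and diameter are approximated via the bounding box):

```python
import numpy as np

# Sketch of a cluster tree T_I: each cluster t stores an index set and is
# split by bisecting the bounding box of its support at the midpoint of the
# longest edge, so that (diam X_t)^2 ~ 2^{-level}, cf. (14).
class Cluster:
    def __init__(self, idx, points, level=0, n_min=16):
        self.idx, self.level, self.sons = idx, level, []
        lo, hi = points[idx].min(axis=0), points[idx].max(axis=0)
        self.center = 0.5 * (lo + hi)          # stand-in for the Chebyshev center
        self.diam = np.linalg.norm(hi - lo)    # diameter of the bounding box
        if len(idx) > n_min:
            axis = np.argmax(hi - lo)
            mask = points[idx, axis] <= 0.5 * (lo[axis] + hi[axis])
            if mask.any() and (~mask).any():
                self.sons = [Cluster(idx[mask], points, level + 1, n_min),
                             Cluster(idx[~mask], points, level + 1, n_min)]

pts = np.random.default_rng(1).uniform(size=(1000, 3))
root = Cluster(np.arange(len(pts)), pts)       # root cluster = I
```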

A block cluster tree \(T_{I\times J}\) is a quad-tree with root \(I\times J\) satisfying conditions analogous to a cluster tree. It can be constructed from the cluster trees \(T_I\) and \(T_J\) in the following way. Starting from the root \(I\times J\in T_{I\times J}\), let the sons of a block \(t\times s\in T_{I\times J}\) be \(S_{I\times J}(t,s):=\emptyset \) if \(t\times s\) satisfies the low-frequency far-field condition

$$\begin{aligned}&\kappa \min \{{\text {diam}}{X_t},{\text {diam}}{Y_s}\}\le 1, \end{aligned}$$
(15a)
$$\begin{aligned}&\eta _{\text {low}}\,{\text {dist}}(X_t,Y_s)\ge \max \{{\text {diam}}{X_t},{\text {diam}}{Y_s}\} \end{aligned}$$
(15b)

or the high-frequency far-field condition

$$\begin{aligned} \kappa \min \{{\text {diam}}{X_t},{\text {diam}}{Y_s}\}&>1, \end{aligned}$$
(16a)
$$\begin{aligned} \eta _{\text {high}}\,{\text {dist}}(X_t,Y_s)&\ge \kappa \max \{({\text {diam}}{X_t})^2,({\text {diam}}{Y_s})^2\} \end{aligned}$$
(16b)

or \(\min \{|t|,|s|\}\le {n_{\min }}\) with some given constants \(\eta _{\text {low}},\eta _{\text {high}},{n_{\min }}>0\). In all other cases, we set \(S_{I\times J}(t,s):=S_I(t)\times S_J(s)\). Notice that for \(\kappa =0\) we obtain the usual far-field condition (6). Furthermore, the transition from the low- to the high-frequency regime is continuous in the sense that for \(\kappa \max \{{\text {diam}}{X_t},{\text {diam}}{Y_s}\}=1\) the conditions (15b) and (16b) coincide provided \(\eta _{\text {high}}=\eta _{\text {low}}\). We will, however, keep both constants in order to be able to optimize them independently in the implementation of the algorithm.
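In code, the construction of the partition reads as follows. This sketch builds on the Cluster class above; the distance \({\text {dist}}(X_t,Y_s)\) is bounded from below via the cluster centers and diameters, which is an assumption made to keep the sketch short:

```python
# Sketch of the block cluster tree: a block t x s becomes a leaf if it
# satisfies the low-frequency condition (15), the high-frequency condition
# (16), or if it is small; otherwise all pairs of sons are subdivided.
def partition(t, s, kappa, P_low, P_high, P_non,
              eta_low=1.0, eta_high=1.0, n_min=16):
    dist = max(np.linalg.norm(t.center - s.center)
               - 0.5 * (t.diam + s.diam), 0.0)    # lower bound for dist(X_t, Y_s)
    dmin, dmax = min(t.diam, s.diam), max(t.diam, s.diam)
    if kappa * dmin <= 1.0 and eta_low * dist >= dmax:               # (15)
        P_low.append((t, s))
    elif kappa * dmin > 1.0 and eta_high * dist >= kappa * dmax**2:  # (16)
        P_high.append((t, s))
    elif min(len(t.idx), len(s.idx)) <= n_min or not t.sons or not s.sons:
        P_non.append((t, s))                                         # near field
    else:
        for tp in t.sons:
            for sp in s.sons:
                partition(tp, sp, kappa, P_low, P_high, P_non,
                          eta_low, eta_high, n_min)

P_low, P_high, P_non = [], [], []
partition(root, root, 16.0, P_low, P_high, P_non)
print(len(P_low), len(P_high), len(P_non))
```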

The set of leaves of \(T_{I\times J}\) defines a partition \(P\) of \(I\times J\). As usual, we partition \(P\) into admissible and non-admissible blocks

$$\begin{aligned} P=P_{{\text {adm}}}\cup P_{{\text {nonadm}}}, \end{aligned}$$

where each \(t\times s\in P_{{\text {adm}}}\) satisfies (15) or (16) and each \(t\times s\in P_{{\text {nonadm}}}\) is small, i.e. satisfies \(\min \{|t|,|s|\}\le {n_{\min }}\). In order to distinguish low and high-frequency blocks, we further subdivide

$$\begin{aligned} P_{{\text {adm}}}=P_{{\text {low}}}\cup P_{{\text {high}}}, \end{aligned}$$

where \(P_{{\text {low}}} :=\{t\times s\in P:t\times s\text { satisfies (15) }\}\) and \(P_{{\text {high}}}:=P_{{\text {adm}}}{\setminus } P_{{\text {low}}}\).

The following lemma will be the basis for the complexity analysis of the algorithms presented in this article. Notice that this lemma analyzes the so-called sparsity constant of hierarchical matrix partitions introduced in [23] for the far-field condition (6). Since the lemma states that this constant is unbounded with respect to \(\kappa \), usual \({\mathcal {H}}\)-matrices are not able to guarantee logarithmic-linear complexity for the high-frequency far-field condition (16b). Therefore, in the next section a variant of \({\mathcal {H}}^2\)-matrices will be introduced.

Lemma 5

Let \(t\in T_I^{(\ell )}\). The set \(\{s\in T_J:t\times s\in P_{\text {high}}\}\) has cardinality \({\mathcal {O}(1+2^{-\ell }\kappa ^2)}\).

Proof

The assumptions (13) and (14) guarantee that each set

$$\begin{aligned} N_\rho :=\{s\in T_J^{(\ell )}: \max _{y\in Y_s} |\chi (X_t)-y|\le \rho \},\quad \rho >0, \end{aligned}$$

contains at most \(\nu c_G c_\varGamma 2^\ell (2\rho )^2\) clusters \(s\) from the same level \(\ell \) in \(T_J\). This follows from

$$\begin{aligned} |N_\rho | 2^{-\ell }/c_G\le \sum _{s\in N_\rho } \mu (Y_s) \le \nu \mu (Y_{N_\rho })\le \nu c_\varGamma (2\rho )^2. \end{aligned}$$
(17)

Let \(s\in T_J\) such that \(t\times s\in P_{\text {high}}\). Furthermore, let \(t^*\) and \(s^*\) be the father clusters of \(t\) and \(s\), respectively. Suppose that \(\max _{y\in Y_s}|\chi (X_t)-y| \ge \rho _0\), where

$$\begin{aligned} \rho _0:= \kappa /\eta _{\text {high}}\max \{({\text {diam}}X_{t^*})^2,({\text {diam}}Y_{s^*})^2\} +{\text {diam}}{X_{t^*}}+{\text {diam}}{Y_{s^*}}. \end{aligned}$$

Then

$$\begin{aligned} {\text {dist}}(X_{t^*},Y_{s^*})&\ge \max _{y\in Y_s} |\chi (X_t)-y| -{\text {diam}}{X_{t^*}}-{\text {diam}}{Y_{s^*}}\\&\ge \kappa /\eta _{\text {high}}\max \{({\text {diam}}{X_{t^*}})^2,({\text {diam}}{Y_{s^*}})^2\} \end{aligned}$$

implies that \(t^*\times s^*\in P_{\text {high}}\). Hence, \(P_{\text {high}}\) cannot contain \(t\times s\), which is a contradiction. It follows that

$$\begin{aligned} \max _{y\in Y_s} |\chi (X_t)-y| <\rho _0\le (c_g 2^{-(\ell -1)})^{1/2}(2+(c_g 2^{-(\ell -1)})^{1/2}\kappa /\eta _{\text {high}}). \end{aligned}$$

From (17) we obtain that

$$\begin{aligned} |\{s\in T_J:t\times s\in P_{\text {high}}\}| \le 8\nu c_\varGamma c_g c_G(2+(c_g2^{-(\ell -1)})^{1/2}\kappa /\eta _{\text {high}})^2 \end{aligned}$$

and hence the assertion. \(\square \)

3.1 Directional subdivision of high-frequency blocks

In the high-frequency regime, i.e. \(\kappa \min \{{\text {diam}}{X_t},{\text {diam}}{Y_s}\}>1\), the matrix block corresponding to \(t\times s\in P_{\text {high}}\) cannot be approximated independently of \(\kappa \) unless it is directionally subdivided; see the discussion in Sect. 2. In view of the angle condition (10), we partition the space \({\mathbb {R}}^3\) recursively into a hierarchy of unbounded pyramids. The first subdivision partitions \({\mathbb {R}}^3\) into the 6 pyramids defined by the origin and the faces of the unit cube as the pyramids’ bases. In each of the next steps, a pyramid is subdivided by dividing its base perpendicular to a largest side of the base into two equally sized halves. A pyramid \(Z\) resulting from \(\nu \) subdivisions satisfies

$$\begin{aligned} |\sin \measuredangle (x,e)|\le 2^{(1-\nu )/2}\quad \text {for all }x\in Z, \end{aligned}$$
(18)

where \(e\in {\mathbb {S}}^2\) denotes the vector pointing from the origin to the center of the base of \(Z\). For a given cluster \(t\in T_I^{(\ell )}\) from the \(\ell \)-th level of \(T_I\) let \(\nu _t\) be the smallest non-negative integer such that

$$\begin{aligned} \nu _t\ge 2(\nu _0+\log _2 \kappa )-\ell +1, \end{aligned}$$
(19)

where \(\nu _0\in {\mathbb {N}}\) is a fixed value which will be specified later on. Denote by \({\mathcal {E}}(t)\) the set of directions \(e\in {\mathbb {S}}^2\) associated with all pyramids \(Z_e\) after \(\nu _t\) subdivisions. Notice that depending on \(t\) some of the directions will not be used when there are no targets in that particular direction. Such directions can be removed from \(\mathcal {E}(t)\). Then

$$\begin{aligned} |{\mathcal {E}}(t)|\le 6\cdot 2^{\nu _t} \sim \kappa ^2 2^{-\ell }. \end{aligned}$$
(20)
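For illustration, the directions associated with the pyramids after \(\nu \) subdivisions can be generated as follows (a sketch; each face of the cube \([-1,1]^3\) is split into \(2^\nu \) congruent rectangles by alternating the split axis, and each direction points to a sub-rectangle center):

```python
import numpy as np
from itertools import product

# Sketch of the direction sets: after nu subdivisions each face of the cube
# [-1,1]^3 is split into 2^nu rectangles, giving 6 * 2^nu directions,
# cf. (20). The level-dependent choice nu_t follows (19).
def directions(nu):
    n1, n2 = 2 ** ((nu + 1) // 2), 2 ** (nu // 2)   # grid on each face
    u = (np.arange(n1) + 0.5) / n1 * 2.0 - 1.0      # sub-rectangle centers
    v = (np.arange(n2) + 0.5) / n2 * 2.0 - 1.0
    dirs = []
    for axis, sign in product(range(3), (-1.0, 1.0)):
        for a, b in product(u, v):
            e = np.zeros(3)
            e[axis], e[(axis + 1) % 3], e[(axis + 2) % 3] = sign, a, b
            dirs.append(e / np.linalg.norm(e))
    return np.array(dirs)

def nu_t(level, kappa, nu0=1):                      # smallest nu_t with (19)
    return max(0, int(np.ceil(2 * (nu0 + np.log2(kappa)) - level + 1)))

print(len(directions(nu_t(4, kappa=8.0))))          # 6 * 2^{nu_t} directions
```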

Given \(t\in T_I\) and \(e\in {\mathcal {E}}(t)\), for \(t'\in S_I(t)\) we define \(e'\in {\mathcal {E}}(t')\) as the axis of the pyramid \(Z_{e'}\) from which \(Z_e\) results after subdivision. Notice that despite \(t'\subset t\) we have \(Z_e\subset Z_{e'}\). Furthermore, let

$$\begin{aligned} {\mathcal {F}}_e(X_t)&:=\mathcal {D}_e(X_t) \cap {\mathcal {F}}(X_t)\quad \text {with}\\ \mathcal {D}_e(X)&:=\left\{ y\in {\mathbb {R}}^3:|\sin \measuredangle (\chi (X)-y,e)|\le \frac{\gamma _{\text {high}}}{\kappa \,{\text {diam}}{X}}\right\} , \end{aligned}$$

be the directional far field of \(X_t\). Here, \({\mathcal {F}}(X):=\{y\in {\mathbb {R}}^3:\eta _{\text {high}}\,{\text {dist}}(X,y)\ge \kappa ({\text {diam}}{X})^2\}\) denotes the far field of \(X\subset {\mathbb {R}}^3\).

The following lemma will be important for the construction of nested bases used for the approximation. Notice that the directional far fields are nested up to constants.

Lemma 6

Let \(t'\in S_I(t)\) and \(e'\) be defined as above. Then the directional far field satisfies \({\mathcal {F}}_e(X_t)\subset \tilde{{\mathcal {F}}}_{e'}(X_{t'})\) with

$$\begin{aligned} \tilde{{\mathcal {F}}}_{e'}(X_{t'}):=\left\{ y\in \varGamma :|\sin \measuredangle (\chi (X_{t'})-y,e')|\le \frac{\tilde{\gamma }}{\kappa \,\mathrm{diam}{X_{t'}}}\right\} \cap {\mathcal {F}}(X_{t'}) \end{aligned}$$

and the constant \(\tilde{\gamma }:= 2^{1-\nu _0}\sqrt{c_g} + (\gamma _{\text {high}}+ \eta _{\text {high}})/(1-\eta _{\text {high}})\).

Proof

For \(y\in {\mathcal {F}}_e(X_t)\) let \(\zeta =\chi (X_{t'})-y-(\chi (X_{t'})-y,e)e,\,\zeta ':=\chi (X_{t'})-y-(\chi (X_{t'})-y,e')e'\), and \(\delta :=\zeta '-\zeta \). Then

$$\begin{aligned} |\sin \measuredangle (\chi (X_{t'})-y,e')|&=\frac{|\zeta '|}{|\chi (X_{t'})-y|} \le \frac{|\zeta |+|\delta |}{|\chi (X_{t'})-y|}\\&=|\sin \measuredangle (\chi (X_{t'})-y,e)| +\frac{|\delta |}{|\chi (X_{t'})-y|} \end{aligned}$$

and

$$\begin{aligned} |\delta |&= |(\chi (X_{t'})-y,e)[e-(e,e')e']-(\chi (X_{t'})-y,e'-(e,e')e)e'|\\&\le |\chi (X_{t'})-y|[|e-(e,e')e'|+|e'-(e,e')e|] \le 2|\chi (X_{t'})-y||\sin \measuredangle (e',e)|. \end{aligned}$$

Using Lemma 4, we obtain from \(\chi (X_{t'})\in X_{t'}\subset X_t\)

$$\begin{aligned} |\sin \measuredangle (\chi (X_{t'})-y,e)|\le \frac{\gamma _{\text {high}}+\eta _{\text {high}}}{1-\eta _{\text {high}}}\frac{1}{\kappa \,{\text {diam}}{X_t}}. \end{aligned}$$

From \(e'\in \partial Z_e\) it follows

$$\begin{aligned} |\sin \measuredangle (e',e)|\le 2^{(1-\nu _t)/2}\le \frac{2^{-\nu _0}}{\kappa } 2^{\ell /2}\le \frac{2^{-\nu _0} \sqrt{c_g}}{\kappa \,{\text {diam}}X_t} \end{aligned}$$

due to (18) and (14). We obtain

$$\begin{aligned} |\sin \measuredangle (\chi (X_{t'})-y,e')|\le \left( \frac{\gamma _{\text {high}}+\eta _{\text {high}}}{1-\eta _{\text {high}}}+2^{1-\nu _0}\sqrt{c_g}\right) \frac{1}{\kappa \,{\text {diam}}{X_t}}. \end{aligned}$$

The inclusion \({\mathcal {F}}(X_t)\subset {\mathcal {F}}(X_{t'})\) is obvious due to \(X_{t'}\subset X_t\). \(\square \)

From the previous proof it can be seen that \(|\sin \measuredangle (x,e)|\le \frac{2^{-\nu _0} \sqrt{c_g}}{\kappa \,{\text {diam}}X_t}\) for all \(x\in Z_e\). With Lemma 6 it follows that

$$\begin{aligned} (\chi (X_t) + Z_e) \cap {\mathcal {F}}(X_t) \subset {\mathcal {F}}_{e'}(X_{t'})\quad \text {for all }e\in {\mathcal {E}}(t) \end{aligned}$$
(21)

provided that \(\nu _0\) from (19) is chosen sufficiently large and \(\eta _{\text {high}}\) is chosen sufficiently small.

Since high-frequency blocks require special attention, we gather column clusters \(s\) which are admissible with \(t\) in direction \(e\in {\mathcal {E}}(t)\) in the (cluster) far field with direction \(e\)

$$\begin{aligned} F_e(t):=\bigcup \{s\in T_J: \exists \hat{t}\supset t\text { such that }\hat{t}\times s\in P_{\text {high}}\text { and }Y_s\subset \mathcal {D}_e(X_t)\}. \end{aligned}$$

4 Directional \(\varvec{{\mathcal {H}}^2}\)-matrices

From Lemma 5 we see that \({\mathcal {H}}\)-matrices are not able to achieve log-linear complexity. Therefore we employ a nested representation similar to \({\mathcal {H}}^2\)-matrices introduced in [31]. To account for the required directional subdivision, we generalize the concept of nested cluster bases.

Definition 1

A directional cluster basis \(\mathcal {U}\) for the rank distribution \((k_t^e)_{t\in T_I, e\in {\mathcal {E}}(t)}\) is a family \(\mathcal {U}=(U_e(t))_{t\in T_I, e\in {\mathcal {E}}(t)}\) of matrices \(U_e(t) \in {\mathbb {C}}^{t\times k_t^e}\). It is called nested if for each \(t\in T_I{\setminus }\mathcal {L}(T_I)\) there are transfer matrices \(T_e^{t't}\in {\mathbb {C}}^{k^{e'}_{t'}\times k^e_t}\) such that for the restriction of the matrix \(U_e(t)\) to the rows \(t'\) it holds that

$$\begin{aligned} U_e(t)|_{t'}=U_{e'}(t')T_e^{t't}\quad \text {for all }t'\in S_I(t). \end{aligned}$$
(22)
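The practical consequence of (22) is that basis matrices are stored explicitly only at the leaves of \(T_I\); for all other clusters they are available implicitly through the transfer matrices. A minimal sketch with random placeholder matrices (\(e'\) denotes the direction assigned to the sons):

```python
import numpy as np

# Sketch of the nestedness relation (22): the restriction of U_e(t) to a son
# t' is U_{e'}(t') T_e^{t't}, so U_e(t) itself is never stored. All matrices
# are random placeholders of rank k.
rng, k = np.random.default_rng(2), 4
U1 = rng.standard_normal((30, k)) + 1j * rng.standard_normal((30, k))  # U_{e'}(t_1)
U2 = rng.standard_normal((25, k)) + 1j * rng.standard_normal((25, k))  # U_{e'}(t_2)
T1 = rng.standard_normal((k, k)) + 1j * rng.standard_normal((k, k))    # T_e^{t_1 t}
T2 = rng.standard_normal((k, k)) + 1j * rng.standard_normal((k, k))    # T_e^{t_2 t}

U_t = np.vstack([U1 @ T1, U2 @ T2])   # U_e(t), assembled on demand only
print(U_t.shape)                      # (55, 4)
```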

For estimating the complexity of storing a nested cluster basis \(\mathcal {U}\), we assume that \(k_t^e \le k\) for all \(t\in T_I,\,e\in {\mathcal {E}}(t)\) with some \(k\in {\mathbb {N}}\). It follows from (14) that the depth \(L\in {\mathbb {N}}\) of \(T_I\) is of the order \(L\sim \log _2(\kappa ^2)\). The set of leaf clusters \(\mathcal {L}(T_I)\) constitutes a partition of \(I\), and according to (20) at most \(|{\mathcal {E}}(t)|\sim \kappa ^2 2^{-\ell }\) matrices \(U_e(t)\), each with at most \(k|t|\) entries, have to be stored for each cluster \(t\in \mathcal {L}(T_I)\). Hence, \(k|I|\kappa ^2 2^{-L}\sim k\,N\) units of storage are required for the leaf matrices \(U_e(t),\,t\in \mathcal {L}(T_I)\).

Using (7), we can estimate the storage required for the transfer matrices

$$\begin{aligned} k^2\sum _{t\in T_I} |{\mathcal {E}}(t)| = k^2 \sum _{\ell =0}^{L-1} |T_I^{(\ell )}|\cdot |{\mathcal {E}}(t)| \sim k^2 \sum _{\ell =0}^{L-1} \kappa ^2=k^2\kappa ^2 L \sim k^2 N \log N. \end{aligned}$$

Definition 2

A matrix \(M\in {\mathbb {C}}^{I\times J}\) is a directional \({\mathcal {H}}^2\)-matrix if \(M_b\) is of low rank for all \(b\in P_{\text {low}}\) and there are nested directional cluster bases \({\mathcal {U}}\) and \({\mathcal {V}}\) such that for \(t\times s \in P_{\text {high}}\)

$$\begin{aligned} M_{ts} = U_e(t) S(t,s) V_{-e}^H(s) \end{aligned}$$
(23)

with coupling matrices \(S(t,s)\in {\mathbb {C}}^{k_t^e\times k_s^{-e}}\) and \(e\in {\mathcal {E}}(t)\) such that \(s\subset F_e(t)\).

The storage cost of the blocks in \(P_{\text {nonadm}}\) and \(P_{\text {low}}\) is bounded by the storage cost of a hierarchical matrix, which is known to be \({\mathcal {O}(\max \{k,{n_{\min }}\}N\log N)}\). We estimate the storage required for the coupling matrices. According to Lemma 5, the number of blocks \(t\times s\in P_{\text {high}}\) is \({\mathcal {O}(2^{-\ell }\kappa ^2)}\) for \(t\in T_I^{(\ell )}\). Thus, the coupling matrices require at most

$$\begin{aligned} k^2\sum _{t\in T_I} |\{s\in T_J:t\times s\in P_{\text {high}}\}| \sim k^2\kappa ^2\sum _{\ell =0}^{L-1}\sum _{t\in T_I^{(\ell )}} 2^{-\ell } \sim k^2\kappa ^2L \sim k^2 N\log N \end{aligned}$$

units of storage. Hence, the overall cost for a directional \({\mathcal {H}}^2\)-matrix is of the order \({\mathcal {O}(k^2N\log N)}\).

4.1 Matrix-vector multiplication

Let \(M\in {\mathbb {C}}^{I\times J}\) be a directional \({\mathcal {H}}^2\)-matrix. Since its structure is similar to an \({\mathcal {H}}^2\)-matrix, the matrix-vector multiplication \(y:=y+Mx\) of \(M\) by a vector \(x\in {\mathbb {C}}^J\) can be done via the usual three-phase algorithm (cf. [31]), which we modify to account for the directions \(e\). The following algorithm is a consequence of the decomposition

$$\begin{aligned} Mx= \sum _{t\times s\in P_{{\text {nonadm}}}} M_{ts}x_s +\sum _{t\times s\in P_{{\text {low}}}} W_{ts} Z_{ts}^H x_s +\sum _{t\times s\in P_{\text {high}}} U_e(t)S(t,s)V_{-e}(s)^Hx_s \end{aligned}$$

with \(e \in {\mathcal {E}}(t)\) such that \(s\subset F_e(t)\).

  1. Forward transform

    The auxiliary vectors \(\hat{x}_e(s):=V_e(s)^H x_s,\,e\in {\mathcal {E}}(s)\), are computed for all \(s\in T_J\). Exploiting the nestedness of the cluster bases \(\mathcal {V}\) (with transfer matrices \(\bar{T}_e^{s's}\)), one has the following recursive relation

    $$\begin{aligned} \hat{x}_e(s)=V_e(s)^H x_s=\sum _{s'\in S_J(s)}(\bar{T}^{s's}_e)^H V_{e'}(s')^H x_{s'} =\sum _{s'\in S_J(s)}(\bar{T}^{s's}_e)^H \hat{x}_{e'}(s'), \end{aligned}$$

    which has to be applied starting from the leaf vectors \(\hat{x}_e(s),\,e\in {\mathcal {E}}(s),\,s\in \mathcal {L}(T_J)\).

  2. Far field interaction

    The products \(S(t,s)\hat{x}_{-e}(s)\) are computed and summed up over all blocks \(t\times s\in P_{\text {high}}\):

    $$\begin{aligned} \hat{y}_e(t):=\sum _{s:t\times s\in P_{\text {high}}} S(t,s)\hat{x}_{-e}(s),\quad e\in {\mathcal {E}}(t),\, t\in T_I. \end{aligned}$$
  3. Backward transform

    The vectors \(\hat{y}_e(t)\) are transformed to the target vector \(y\). The nestedness (22) of \(\mathcal {U}\) yields a recursion which descends \(T_I\) for the computation of \(y:=\sum _{t\in T_I} \sum _{e\in {\mathcal {E}}(t)} U_e(t)\hat{y}_e(t)\):

    (a)

      Compute \(\hat{y}_{e'}(t'):=\hat{y}_{e'}(t') + T^{t't}_e\hat{y}_e(t)\) for all \(e\in {\mathcal {E}}(t)\) and all \(t'\in S_I(t)\);

    (b)

      Compute \(y_t:=y_t+\sum _{e\in {\mathcal {E}}(t)}U_e(t)\hat{y}_e(t)\) for all clusters \(t\in \mathcal {L}(T_I)\).

  4. Low-frequency interaction

    For all \(t\times s \in P_{{\text {low}}}\) compute \(y_t:=y_t+W_{ts} Z_{ts}^H x_s\).

  5. Near field interaction

    For all \(t\times s\in P_{{\text {nonadm}}}\) compute \(y_t:=y_t+M_{ts} x_s\).

The total number of operations of the latter algorithm is bounded by \({\mathcal {O}(k^2 N\log N)}\), which can be proven by observing that at most two floating-point operations are performed for each matrix coefficient stored in the directional \({\mathcal {H}}^2\)-matrix representation.
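A compact sketch of the algorithm above follows. All containers and attribute names are hypothetical placeholders: blocks_high holds tuples \((t,s,e,-e,S(t,s))\), leaf clusters store their basis matrices \(U_e(t)\) and \(V_e(s)\), and dir_map maps \(e\in {\mathcal {E}}(t)\) to the direction \(e'\) used by the sons:

```python
import numpy as np

# Sketch of the three-phase multiplication y := y + M x for a directional
# H^2-matrix. s.V[e] and t.U[e] are leaf basis matrices, s.T[(e, son)] are
# transfer matrices, and t.dir_map[e] = e' is the sons' direction.
def matvec(root_I, root_J, blocks_high, blocks_low, blocks_near, x, y):
    xhat, yhat = {}, {}

    def forward(s):                                  # 1. forward transform
        for sp in s.sons:
            forward(sp)
        for e in s.dirs:
            if not s.sons:
                xhat[(s, e)] = s.V[e].conj().T @ x[s.idx]
            else:
                xhat[(s, e)] = sum(s.T[(e, sp)].conj().T
                                   @ xhat[(sp, s.dir_map[e])] for sp in s.sons)

    def backward(t):                                 # 3. backward transform
        for e in t.dirs:
            v = yhat.get((t, e))
            if v is None:
                continue
            if not t.sons:
                y[t.idx] += t.U[e] @ v
            else:
                for tp in t.sons:
                    key = (tp, t.dir_map[e])
                    yhat[key] = yhat.get(key, 0) + t.T[(e, tp)] @ v
        for tp in t.sons:
            backward(tp)

    forward(root_J)
    for (t, s, e, e_minus, S) in blocks_high:        # 2. far field interaction
        yhat[(t, e)] = yhat.get((t, e), 0) + S @ xhat[(s, e_minus)]
    backward(root_I)
    for (t, s, W, Z) in blocks_low:                  # 4. low-frequency interaction
        y[t.idx] += W @ (Z.conj().T @ x[s.idx])
    for (t, s, M_ts) in blocks_near:                 # 5. near field interaction
        y[t.idx] += M_ts @ x[s.idx]
```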

5 Construction of approximations

Our aim is to approximate \(A\in {\mathbb {C}}^{I\times J}\) defined in (2) with a directional \({\mathcal {H}}^2\)-matrix. Blocks in \(P_{{\text {nonadm}}}\) are stored entrywise without approximation. From (8) it follows that \(K\) is asymptotically smooth with constants independent of \(\kappa \) on domains \(X_t\times Y_s\) corresponding to blocks \(t\times s\in P_{{\text {low}}}\). It follows from the convergence analysis in [7] that the adaptive cross approximation

$$\begin{aligned} A_{ts}\approx A_{t\sigma } (A_{\tau \sigma })^{-1} A_{\tau s} \end{aligned}$$
(24)

with appropriately chosen \(\tau \subset t\) and \(\sigma \subset s,\,k_{\varepsilon }:=|\tau |=|\sigma |\sim |\log {\varepsilon }|^2\) independent of \(\kappa \), can be used to generate low-rank approximations \(W_{ts}Z_{ts}^H\approx A_{ts}\), where \(W_{ts}\in {\mathbb {C}}^{t\times k_{\varepsilon }}\) and \(Z_{ts}\in {\mathbb {C}}^{s\times k_{\varepsilon }}\).
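For illustration, a minimal sketch of cross approximation with full pivoting follows; the practical ACA method [6] uses partial pivoting and evaluates only a few entries of \(A\), whereas full pivoting is used here only to keep the sketch short. The returned factors realize the cross approximation (24) for the selected pivot rows \(\tau \) and columns \(\sigma \):

```python
import numpy as np

# Sketch of cross approximation: rank-1 updates with (full) pivoting until
# the residual drops below the prescribed accuracy. Returns W, Z with
# A ~ W @ Z^H, mathematically the cross approximation (24).
def aca(A, eps=1e-8):
    R = A.copy()
    W, Z = [], []
    scale = np.abs(A).max()
    while len(W) < min(A.shape):
        i, j = np.unravel_index(np.abs(R).argmax(), R.shape)
        if np.abs(R[i, j]) <= eps * scale:
            break
        w = R[:, j] / R[i, j]
        z = R[i, :].copy()
        R -= np.outer(w, z)
        W.append(w)
        Z.append(z.conj())
    return np.array(W).T, np.array(Z).T

# Toy check on an admissible low-frequency block:
rng = np.random.default_rng(3)
X = rng.uniform(size=(80, 3))
Y = rng.uniform(size=(80, 3)) + [4.0, 0.0, 0.0]
D = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=-1)
A = np.exp(1j * 1.0 * D) / (4.0 * np.pi * D)
W, Z = aca(A)
print(W.shape[1], np.linalg.norm(A - W @ Z.conj().T) / np.linalg.norm(A))
```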

In the rest of this section, we will consider the remaining case \(t\times s\in P_{{\text {high}}}\), i.e. we assume that (16) is valid. To be able to apply the results from Sect. 2 and prove existence of low-rank matrix approximations it is required to additionally partition the far field \({\mathcal {F}}(X_t)\) into subsets \({\mathcal {F}}_e(X_t)\) corresponding to directions \(e\in {\mathcal {E}}(t)\). Although ACA generates approximations of high quality, the number of blocks is too large (see Lemma 5) to construct and store the approximations as in the low-frequency case. To overcome this difficulty, we consider the approximation (see [8] for the application of this kind of approximation to Laplace-type problems)

$$\begin{aligned} A_{ts}\approx A_{t\sigma ^e_t}(A_{\tau _t\sigma ^e_t})^{-1}A_{\tau _t\sigma _s}(A_{\tau ^{-e}_s\sigma _s})^{-1}A_{\tau _s^{-e} s},\quad Y_s\subset {\mathcal {F}}_e(X_t), \end{aligned}$$
(25)

instead of (24), which is of type (23) with coupling matrices \(S(t,s) = A_{\tau _t\sigma _s}\). The aim of this section is to prove error estimates for the special type of low-rank approximation

$$\begin{aligned} A_{ts}\approx U_e(t) S(t,s) V_{-e}(s)^H \end{aligned}$$
(26)

with nested bases \(\mathcal {U}\) and \(\mathcal {V}\) approximating \(A_{t\sigma ^e_t}(A_{\tau _t\sigma ^e_t})^{-1}\) and \((A_{\tau ^{-e}_s\sigma _s})^{-1}A_{\tau _s^{-e}s}\), respectively.
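To make the structure of (25) concrete, the following toy sketch applies it to a single admissible block. For brevity, the pivots are selected by greedy full pivoting on the block itself and are reused for both clusters, i.e. \(\sigma _t^e=\sigma _s\) and \(\tau _s^{-e}=\tau _t\); in the actual method, \(\tau _t\) and \(\sigma _t^e\) are shared among all \(s\subset F_e(t)\) and chosen as in [8]:

```python
import numpy as np

# Sketch of the five-factor approximation (25): only pivot rows/columns of A
# enter, and the coupling matrix is S(t,s) = A[tau_t, sigma_s]. The pivot
# selection below is a placeholder for the strategy of [8].
rng = np.random.default_rng(4)
kappa = 20.0
X = rng.uniform(size=(120, 3))
Y = rng.uniform(size=(120, 3)) * [1.0, 0.3, 0.3] + [6.0, 0.0, 0.0]  # cone around e
D = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=-1)
A = np.exp(1j * kappa * D) / (4.0 * np.pi * D)

R, tau, sig = A.copy(), [], []
while len(tau) < min(A.shape):                # greedy cross pivots
    i, j = np.unravel_index(np.abs(R).argmax(), R.shape)
    if np.abs(R[i, j]) <= 1e-8 * np.abs(A).max():
        break
    tau.append(i)
    sig.append(j)
    R -= np.outer(R[:, j] / R[i, j], R[i, :])

P = np.linalg.inv(A[np.ix_(tau, sig)])
U = A[:, sig] @ P                             # ~ A_{t,sigma} (A_{tau,sigma})^{-1}
V = P @ A[tau, :]                             # ~ (A_{tau,sigma})^{-1} A_{tau,s}
S = A[np.ix_(tau, sig)]                       # coupling matrix S(t,s)
err = np.linalg.norm(A - U @ S @ V) / np.linalg.norm(A)
print(len(tau), err)
```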

A crucial part of the approximation (25) is the construction of what we call proper pivots \(\tau _t\subset t\) and \(\sigma _t^e\subset F_e(t),\,|\tau _t|=|\sigma ^e_t|\). They have to guarantee that \(A_{\tau _t\sigma ^e_t}\) is invertible, and for proving error estimates they have to be chosen so that

$$\begin{aligned} {\Vert A_{ts}-A_{t\sigma ^e_t} (A_{\tau _t\sigma ^e_t})^{-1} A_{\tau _t s}\Vert } \le c_R \,{\varepsilon }\,{\Vert A_{tJ}\Vert }\quad \text {for all }s\subset F_e(t) \end{aligned}$$

with some \(c_R>0\); cf. Lemma 9. Hence \(\tau _t\) and \(\sigma _t^e\) represent \(t\) and its far field \(F_e(t)\), respectively. We refer to [8] for a method for choosing \(\tau _t\) and \(\sigma ^e_t\). Note that it is sufficient to choose \(\sigma ^e_t\) so that

$$\begin{aligned} Y_{\sigma _t^e} \subset (\chi (X_t) +Z_e)\cap {\mathcal {F}}(X_t). \end{aligned}$$
(27)

Here and in what follows, \({\varepsilon }>0\) denotes a given accuracy that (up to constants) is to be achieved by the approximations. Let \(\{\zeta _1,\dots , \zeta _{k_{\varepsilon }}\}\) be a basis of the space

$$\begin{aligned} \hat{\varPi }_{p_{\varepsilon }-1}^3:=\{u \, \text {e}^{{\text {i}}\kappa (\cdot ,e)}:u\in \varPi _{p_{\varepsilon }-1}^3\}, \end{aligned}$$

where \(\varPi _p^3\) denotes the space of polynomials of degree at most \(p\) with respect to each of the three variables, \(k_{\varepsilon }:=\dim \varPi _{p_{\varepsilon }-1}^3=p_{\varepsilon }^3\), and \(p_{\varepsilon }\in {\mathbb {N}}\) is the smallest number such that \(p_{\varepsilon }\ge |\log _\rho {\varepsilon }|\) with \(\rho \) from (12).

Assumption 1

Let \(t\in T_I\). If \(|t|\ge k_{\varepsilon }\), we assume that there is \(\tau _t=\{i_1,\dots ,i_{k_{\varepsilon }}\}\subset t\) such that the following two conditions are satisfied.

  1. (i)

    Let \(\Lambda \in {\mathbb {R}}^{|t|\times k_{\varepsilon }}\) have the entries \(\Lambda _{ij}=({\varphi }_i/{\Vert {\varphi }_i\Vert }_{L^2(\varGamma )},\zeta _j)_{L^2(\varGamma )},\,i\in t,\,j=1,\dots ,k_{\varepsilon }\). There are coefficients \(\xi _{i\ell }\) such that

    $$\begin{aligned} \Lambda _{ij} = \sum _{\ell =1}^{k_{\varepsilon }} \xi _{i\ell } \Lambda _{i_\ell j},\quad i\in t,\;j=1,\dots ,k_{\varepsilon }, \end{aligned}$$
    (28)

    and the norm \({\Vert \Xi \Vert }_\infty \) of the matrix \(\Xi =(\xi _{i\ell })_{i\ell }\in {\mathbb {C}}^{t\times k_{\varepsilon }}\) is bounded by a multiple of \(2^{p_{\varepsilon }}\),

  2. (ii)

    each matrix \(A_{\tau _t F_e(t)},\,e\in {\mathcal {E}}(t)\), has full rank.

In the remaining case \(|t|<k_{\varepsilon }\), we set \(\tau _t=t\) and assume only that \(A_{tF_e(t)}\) has full rank.

To see that these assumptions are reasonable, notice that \(\text {rank}\,\Lambda \le k_{\varepsilon }\), which implies (28). The coefficients \(\xi _{i\ell }\) depend only on the underlying grid. Hence, the boundedness of the norm of \(\Xi \) in \((i)\) is an assumption on the regularity of the discretization and on the choice of the pivots \(\tau _t\). Notice that the assumed bound grows exponentially with \(p_{\varepsilon }\). In practice, it seems that a polynomial growth \({\Vert \Xi \Vert }_\infty \le c p_{\varepsilon }^2\) with respect to \(p_{\varepsilon }\) can be achieved. Hence, this should be regarded as a mild assumption.

The set \(\tau _t\) is not unique. Usually, any subset of \(t\) having \(k_{\varepsilon }\) elements will do. To see this, assume for the time being that \({\varphi }_i=\delta _{x_i}\). In this case, the sub-matrix \((\Lambda _{i_\ell j})_{\ell j}\in {\mathbb {R}}^{k_{\varepsilon }\times k_{\varepsilon }}\) of \(\Lambda \) is non-singular if and only if \(\hat{\varPi }_{p_{\varepsilon }-1}^3\) is unisolvent with respect to the points \(x_{i_\ell },\,\ell =1,\dots ,k_{\varepsilon }\), i.e.

$$\begin{aligned} \text {e}^{{\text {i}}\kappa (x_{i_\ell },e)}q(x_{i_\ell })=0\iff q(x_{i_\ell })=0, \quad \ell =1,\dots ,k_{\varepsilon }, \end{aligned}$$

for some \(q\in \varPi _{p_{\varepsilon }-1}^3\) implies that \(q=0\). The set of tuples \((x_{i_1},\dots ,x_{i_{k_{\varepsilon }}})\) for which \(\varPi _{p_{\varepsilon }-1}^3\) is not unisolvent is known to be of measure zero (in the parameter domain of \(\varGamma \)); see [38]. Hence, we may assume that \(\tau _t\subset t\) is chosen such that also \((ii)\) is valid. Otherwise, the rank of \(A_{t F_e(t)}\) is already bounded by \(k_{\varepsilon }\).

Notice that with the previous assumptions it is possible to guarantee that the rows \(\tau _t\) used for the approximation of \(A_{ts}\) can be chosen independently of \(s\subset F_e(t)\). This will be crucial for the construction of nested bases.

In the following lemma, we prove error estimates for the multivariate tensor product Chebyshev interpolant \({{\mathfrak {I}}}_p \hat{K}(\cdot ,y)\in \varPi ^3_{p-1}\) of \(\hat{K}(\cdot ,y):=K(\cdot ,y)\exp (-{\text {i}}\kappa (\cdot -y,e))\) for fixed \(y\in Y\).

Lemma 7

Let \(X, Y \subset {\mathbb {R}}^3\) such that \(Y\subset {\mathcal {F}}_e(X)\). Then

$$\begin{aligned} |\hat{K}(x,y) - {{\mathfrak {I}}}_{x,p} \hat{K}(x,y)| \le c_{{\mathfrak {I}}}(p) \left( \frac{\rho }{\rho +2} \right) ^p \max _{x'\in X}|\hat{K}(x',y)|\quad \text {for all }x\in X,\,y\in Y, \end{aligned}$$

where \(c_{{\mathfrak {I}}}(p) := 8\text {e}cp(1+\rho )(1+\frac{2}{\pi }\log p)^3\) with \(c\) and \(\rho \) from (12).

Proof

Without loss of generality we may assume that \(X\) is contained in a cube \(Q=\prod _{i=1}^3 Q_i\) which is aligned with \(e\) and that \(Y\subset {\mathcal {F}}_e(Q)\). Notice that this can be achieved by slightly modifying the constants \(\gamma _{\text {high}},\eta _{\text {high}}\). Let \(\hat{K}_i\) denote \(\hat{K}(\cdot ,y)\) regarded as a function of its \(i\)-th argument, \(i=1,2,3\). From (12) we obtain

$$\begin{aligned} {\Vert \hat{K}_i^{(k)}\Vert }_{Q_i,\infty } \le c\, k! \left( \frac{\rho }{{\text {diam}}{Q_i}} \right) ^k{\Vert \hat{K}_i\Vert }_{Q_i,\infty },\quad k\in {\mathbb {N}}. \end{aligned}$$

Using [10, Lemma 3.13], this implies

$$\begin{aligned} \min _{q\in \varPi _{p-1}} {\Vert \hat{K}_i-q\Vert }_{Q_i,\infty } \le \tilde{c}\, p \left( \frac{\rho }{\rho + 2} \right) ^p{\Vert \hat{K}_i\Vert }_{Q_i,\infty }, \end{aligned}$$

where \(\tilde{c} := 4\text {e}c(1 + \rho )\). With this estimate the proof can be done analogously to Theorem 3.18 in [7]. \(\square \)

Although the following estimates will hold in any absolute norm, throughout this article the max norm

$$\begin{aligned} {\Vert A\Vert }:=\max _{i=1,\dots ,m} \max _{j=1,\dots ,n}|a_{ij}| \end{aligned}$$

of \(A\in {\mathbb {C}}^{m\times n}\) will be used if not otherwise indicated. Notice that the max norm is not a matrix norm but satisfies \({\Vert AB\Vert }\le {\Vert A\Vert }{\Vert B\Vert }_1\) and \({\Vert AB\Vert }\le {\Vert A\Vert }_\infty {\Vert B\Vert }\), where \({\Vert \cdot \Vert }_\infty ,\,{\Vert \cdot \Vert }_1\) denote the absolute row and the absolute column sum, respectively.

Lemma 8

Let assumption (i) be valid and let \({\varphi }_i,\,\psi _j\), and \(f\) in (2) be non-negative. Assume that \(c_{\text {as},2}\,\eta _{\text {high}}<1\). For \(t\in T_I\) satisfying \(|t|\ge k_{\varepsilon }\) and \(e\in {\mathcal {E}}(t)\) there is \(\Xi \in {\mathbb {R}}^{t\times k_{\varepsilon }}\) and an \({\varepsilon }\)-independent constant \(c_1>0\) such that

$$\begin{aligned} {\Vert A_{ts}-\Xi A_{\tau _t s}\Vert }\le c_1 {\varepsilon }{\Vert A_{ts}\Vert } \end{aligned}$$

for all \(s\subset J\) satisfying \(Y_s\subset {\mathcal {F}}_e(X_t)\).

Proof

Due to \(K(x,y)=\exp ({\text {i}}\kappa (x-y,e))\hat{K}(x,y)\), we can apply Lemma 7 and obtain for \(x\in X_t,\,y\in Y_s\) that with \(T_{p_{\varepsilon }}(x,y) := \exp ({\text {i}}\kappa (x,e)){{\mathfrak {I}}}_{x,p_{\varepsilon }} \hat{K}(x,y)\in \hat{\varPi }_{p_{\varepsilon }-1}^3\)

$$\begin{aligned} |K(x,y)-T_{p_{\varepsilon }}(x,y)|&=|\hat{K}(x,y)-{{\mathfrak {I}}}_{x,p_{\varepsilon }} \hat{K}(x,y)|\\&\le c_{\mathfrak {I}}(p_{\varepsilon }) \left( \frac{\rho }{\rho +2} \right) ^{p_{\varepsilon }}\max _{x'\in X_t}|\hat{K}(x',y)|. \end{aligned}$$

According to a remark following (12), we may assume that \(\rho <1\). In addition, without loss of generality we may assume that \({\Vert {\varphi }_i\Vert }_{L^2(\varGamma )}=1\). Then assumption (28) is equivalent to

$$\begin{aligned} \int _{\varGamma } \left( {\varphi }_i(x)-\sum _{\ell =1}^{k_{\varepsilon }} \xi _{i\ell } {\varphi }_{i_\ell }(x)\right) \zeta (x){\,\text {d}}s_x=0\quad \text {for all }\zeta \in \hat{\varPi }_{p_{\varepsilon }-1}^3. \end{aligned}$$

Hence,

$$\begin{aligned}&a_{ij}-\sum _{\ell =1}^{k_{\varepsilon }} \xi _{i\ell } a_{i_\ell j} =\int _{\varGamma } \int _{\varGamma } \left( {\varphi }_i(x)-\sum _{\ell =1}^{k_{\varepsilon }} \xi _{i\ell } {\varphi }_{i_\ell }(x)\right) K(x,y)\psi _j(y){\,\text {d}}s_y{\,\text {d}}s_x\\&\quad =\int _{\varGamma } \left( {\varphi }_i(x)-\sum _{\ell =1}^{k_{\varepsilon }} \xi _{i\ell } {\varphi }_{i_\ell }(x)\right) \int _{\varGamma }[K(x,y)-T_{p_{\varepsilon }}(x,y)]\psi _j(y){\,\text {d}}s_y{\,\text {d}}s_x \end{aligned}$$

and we see using \(\mu (X_i)\le \pi h^2\) that

$$\begin{aligned} |a_{ij}-\sum _{\ell =1}^{k_{\varepsilon }} \xi _{i\ell } a_{i_\ell j}|&\le c_{\mathfrak {I}}(p_{\varepsilon })\left( \frac{\rho }{\rho +2} \right) ^{p_{\varepsilon }}(1+{\Vert \Xi \Vert }_\infty ) \left( \mu (X_i\cup \bigcup _{\ell =1}^{k_{\varepsilon }}X_{i_\ell })\right) ^{1/2}\\&\quad \cdot \max _{x'\in X_t} \int _\varGamma |\hat{K}(x',y)||\psi _j(y)|{\,\text {d}}s_y\\&\le \tilde{c}h\left( \frac{\rho }{\rho +2} \right) ^{p_{\varepsilon }} \max _{x'\in X_t} \int _\varGamma |\hat{K}(x',y)||\psi _j(y)|{\,\text {d}}s_y \end{aligned}$$

with \(\tilde{c}:= c_{\mathfrak {I}}(p_{\varepsilon })(1+{\Vert \Xi \Vert }_\infty ) \sqrt{(k_{\varepsilon }+1)\pi }\). From the Taylor expansion and the asymptotic smoothness of \(f\) it can be seen that for \(c_{\text {as},2}\,\eta _{\text {high}}<1\)

$$\begin{aligned} |f(x',y)|\le \hat{c}\,|f(x,y)|\quad \text {for all }x, x'\in X_t \end{aligned}$$
(29)

with a constant \(\hat{c}>0\). Estimate (29) and \(1={\Vert {\varphi }_i\Vert }_{L^2(\varGamma )}\le c'h^{-1}{\Vert {\varphi }_i\Vert }_{L^1(\varGamma )}\) imply for \(y\in Y_s\)

$$\begin{aligned} \max _{x'\in X_t} |\hat{K}(x',y)|&=\max _{x'\in X_t} |f(x',y)| \le \hat{c} \min _{x\in X_i} |f(x,y)|\\&\le \hat{c}\,c'h^{-1}\int _\varGamma |{\varphi }_i(x)||f(x,y)|{\,\text {d}}s_x. \end{aligned}$$

From

$$\begin{aligned} |a_{ij}|&=|\text {e}^{{\text {i}}\kappa [-|\xi (X_i)-\xi (Y_j)|]} a_{ij}|\\&\ge {\text {Re}}\int _\varGamma \int _\varGamma {\varphi }_i(x) \psi _j(y) f(x,y) \text {e}^{{\text {i}}\kappa [|x-y|-|\xi (X_i)-\xi (Y_j)|]}{\,\text {d}}s_y {\,\text {d}}s_x\\&=\int _\varGamma \int _\varGamma {\varphi }_i(x) \psi _j(y) f(x,y) \cos (\kappa [|x-y|-|\xi (X_i)-\xi (Y_j)|]){\,\text {d}}s_y{\,\text {d}}s_x\\&\ge c''\int _\varGamma \int _\varGamma {\varphi }_i(x)\psi _j(y) f(x,y){\,\text {d}}s_y{\,\text {d}}s_x \end{aligned}$$

with \(c''>0\) independent of \(\kappa \) and \(h\), which holds due to the estimate (cf. (7))

$$\begin{aligned}\kappa [|x-y|-|\xi (X_i)-\xi (Y_j)|]\le \kappa [|x-\xi (X_i)|+|y-\xi (Y_j)|]\le 2\kappa h<\frac{\pi }{2}, \end{aligned}$$

we obtain, exploiting the non-negativity of \({\varphi }_i,\,\psi _j\), and \(f\), that

$$\begin{aligned} |a_{ij}-\sum _{\ell =1}^{k_{\varepsilon }} \xi _{i\ell } a_{i_\ell j}| \le \frac{\tilde{c} \, \hat{c}\, c'}{c''} \left( \frac{\rho }{\rho +2} \right) ^{p_{\varepsilon }} |a_{i j}|. \end{aligned}$$

Hence, the matrix \(\Xi \) satisfies

$$\begin{aligned} {\Vert A_{ts}-\Xi A_{\tau _t s}\Vert }\le \frac{\tilde{c}\, \hat{c} \, c'}{c''(\rho +2)^{p_{\varepsilon }}} \rho ^{p_{\varepsilon }} {\Vert A_{ts}\Vert } \le c_1 {\varepsilon }{\Vert A_{ts}\Vert }, \end{aligned}$$

because \(\frac{c_{\mathfrak {I}}(p_{\varepsilon })\hat{c} c'}{c''(\rho +2)^{p_{\varepsilon }}}(1+{\Vert \Xi \Vert }_\infty ) \sqrt{(p_{\varepsilon }^3+1)\pi }\) is bounded from above by an \({\varepsilon }\)-independent constant \(c_1\): by assumption, \({\Vert \Xi \Vert }_\infty \) is bounded by a multiple of \(2^{p_{\varepsilon }}\), and \(c_{\mathfrak {I}}(p_{\varepsilon })\) grows at most algebraically in \(p_{\varepsilon }\). \(\square \)

The expression \(c_S:=\max \{c_S^{(r)},c_S^{(c)}, {\Vert (A_{\tau _s^e \sigma _s})^{-1}A_{\tau _s^e s}\Vert }_1: s\in T_J\}\), where

$$\begin{aligned} c_S^{(c)}&:=\max \{{\Vert (A_{\tau _t \sigma ^e_t})^{-1}A_{\tau _t s}\Vert }_1:s\subset J,\,Y_s\subset {\mathcal {F}}_e(X_t),\,e\in {\mathcal {E}}(t),\, t\in T_I\},\\ c_S^{(r)}&:=\max \{{\Vert A_{t \sigma _s} (A_{\tau ^e_s \sigma _s})^{-1}\Vert }_\infty :t\subset I,\,X_t\subset {\mathcal {F}}_e(Y_s),\,e\in {\mathcal {E}}(s),\, s\in T_J\}, \end{aligned}$$

will play a central role in the following error analysis. Notice that \(c_S\) does not depend on the cardinality of the clusters. However, it depends on the choice of the pivots \(\{\tau _t,\sigma _t^e,\tau _s^e,\sigma _s:t\in T_I,\,s\in T_J\}\) and hence on \({\varepsilon }\). Numerical experiments show that \(c_S\le cp_{\varepsilon }^2\).
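In an implementation, the individual terms entering \(c_S\) are cheap to monitor. The following sketch (our illustration; it assumes the coefficient matrix is available as a dense array A and that tau, sigma, s are global index arrays) evaluates one such term \({\Vert (A_{\tau \sigma })^{-1}A_{\tau s}\Vert }_1\):

import numpy as np

def stability_term(A, tau, sigma, s):
    # || (A[tau, sigma])^{-1} A[tau, s] ||_1 -- one of the terms entering c_S
    X = np.linalg.solve(A[np.ix_(tau, sigma)], A[np.ix_(tau, s)])
    return np.abs(X).sum(axis=0).max()  # maximum absolute column sum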

Lemma 9

Let Assumption 1 be valid. Then for \(t\in T_I\) and \(e\in {\mathcal {E}}(t)\) there are proper pivots \((\tau _t,\sigma ^e_t),\,|\tau _t|=|\sigma _t^e|=\min \{k_{\varepsilon },|t|\}\), i.e., for all \(s\subset J\) satisfying \(Y_s\subset {\mathcal {F}}_e(X_t)\)

$$\begin{aligned} {\Vert A_{ts}-A_{t\sigma ^e_t} (A_{\tau _t\sigma ^e_t})^{-1} A_{\tau _t s}\Vert } \le c_2 {\varepsilon }{\Vert A_{tJ}\Vert }, \end{aligned}$$
(30)

where \(c_2({\varepsilon }):=c_1(1+c_S({\varepsilon }))\).

Proof

Since \(A_{\tau _t F_e(t)}\) is assumed to have full rank, there is \(\sigma ^e_t\subset F_e(t),\,|\sigma _t^e|=|\tau _t|=\min \{k_{\varepsilon },|t|\}\), such that \(A_{\tau _t\sigma _t^e}\) is invertible. If \(|t|<k_{\varepsilon }\), we have \(\tau _t=t\) and \(A_{t\sigma _t^e}(A_{\tau _t\sigma ^e_t})^{-1}=I\), so (30) is trivially satisfied. Hence, we may assume that \(|t|\ge k_{\varepsilon }\). Let \(\Xi \in {\mathbb {R}}^{t\times k_{\varepsilon }}\) be as in Lemma 8. We have

$$\begin{aligned} A_{ts}-A_{t\sigma ^e_t}(A_{\tau _t\sigma ^e_t})^{-1}A_{\tau _t s} =\{ A_{ts}-\Xi A_{\tau _t s}\}- \{A_{t\sigma ^e_t} -\Xi A_{\tau _t\sigma ^e_t}\} (A_{\tau _t\sigma ^e_t})^{-1}A_{\tau _t s} \end{aligned}$$

and thus

$$\begin{aligned} {\Vert A_{ts}-A_{t\sigma ^e_t}(A_{\tau _t\sigma ^e_t})^{-1}A_{\tau _t s}\Vert }&\le {\Vert A_{ts}-\Xi A_{\tau _t s}\Vert } +{\Vert A_{t\sigma ^e_t}-\Xi A_{\tau _t\sigma ^e_t}\Vert }\,c_S\\&\le c_1 {\varepsilon }({\Vert A_{ts}\Vert }+c_S{\Vert A_{t\sigma ^e_t}\Vert }) \le c_2 {\varepsilon }{\Vert A_{tJ}\Vert }. \end{aligned}$$

The second-to-last estimate follows from Lemma 8, because \(Y_{\sigma ^e_t},Y_s\subset {\mathcal {F}}_e(X_t)\). \(\square \)

5.1 Construction of directional cluster bases

Based on the previous estimates, we are going to construct and analyze nested basis approximations. The construction of nested bases is usually done by explicit approximation of the kernel function \(K\); see, for instance, the fast multipole method [37] and the method in [33], which uses interpolation. In this section, we construct the nested bases via a purely algebraic technique which is based on the original matrix entries and thus avoids explicit kernel expansions. In this sense, the presented construction belongs to the class of kernel-independent fast multipole methods; see [3, 19, 40].

We will define a nested basis \(\mathcal {U}\) consisting of matrices \(U_e(t)\in {\mathbb {C}}^{t\times k_{\varepsilon }}\) for each \(t\in T_I\) and \(e\in {\mathcal {E}}(t)\) in a recursive manner starting from the leaves. Due to (16a), it is actually sufficient to consider the sub-tree

$$\begin{aligned} \hat{T}_I:=\{t\in T_I: \kappa \,{\text {diam}}X_t>1\} \end{aligned}$$

of \(T_I\). For its leaf clusters \(t\in \mathcal {L}(\hat{T}_I)\) and \(e\in {\mathcal {E}}(t)\) we set \(U_e(t)=B^{tt}_e\), where for \(t'\subset t\)

$$\begin{aligned} B_e^{t't}:=A_{t'\sigma ^e_t}(A_{\tau _t\sigma ^e_t})^{-1}. \end{aligned}$$

Assume that matrices \(U_e(t')\) have already been constructed for the sons \(t'\in S_I(t)\) of \(t\in \hat{T}_I\setminus \mathcal {L}(\hat{T}_I)\) and \(e\in {\mathcal {E}}(t')\). Then in view of (22) we define for \(e\in {\mathcal {E}}(t)\)

$$\begin{aligned} U_e(t)|_{t'}:=U_{e'}(t')B_e^{\tau _{t'} t}, \quad t'\in S_I(t), \end{aligned}$$

where \(e'\in {\mathcal {E}}(t')\) is defined before Lemma 6.
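In an implementation, this recursion takes a simple form: leaf bases are computed directly from the pivot cross of \(A\), and a father basis is assembled from the son bases and small \(k_{\varepsilon }\times k_{\varepsilon }\) transfer factors. The following sketch is illustrative only (cluster-tree bookkeeping and pivot selection are omitted; rows, tau, sigma denote global index arrays):

import numpy as np

def B_factor(A, rows, tau, sigma):
    # B_e^{t't} = A[t', sigma_t^e] (A[tau_t, sigma_t^e])^{-1} with rows = t'
    M = A[np.ix_(tau, sigma)]
    # solve X M = A[rows, sigma] instead of forming the inverse
    return np.linalg.solve(M.T, A[np.ix_(rows, sigma)].T).T

def father_basis(son_bases, son_transfers):
    # U_e(t) stacked sonwise: U_e(t)|_{t'} = U_{e'}(t') B_e^{tau_{t'} t}
    return np.vstack([U @ T for U, T in zip(son_bases, son_transfers)])

Note that a non-leaf cluster only has to store the small transfer factors \(B_e^{\tau _{t'}t}\), which is what yields the nested storage structure.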

The following lemma estimates the accuracy when expressing \(B_e^{t't}\) by the product \(B_{e'}^{t't'}B_e^{\tau _{t'}t}\). As stated in [8], \(B_e^{t't}\) may be regarded as the algebraic form of an interpolation operator.

Lemma 10

Let \(t'\in \hat{T}_I\) satisfy \(t'\subset t\in \hat{T}_I{\setminus }\mathcal {L}(\hat{T}_I)\) and \(e\in {\mathcal {E}}(t)\). Then for all \(s\subset J\) satisfying \(Y_s\subset {\mathcal {F}}_e(X_t)\) it holds that

$$\begin{aligned} {\Vert [B_e^{t't}-B_{e'}^{t't'}B_e^{\tau _{t'}t}]A_{\tau _t s}\Vert }\le c_2 c_S {\varepsilon }{\Vert A_{t' J}\Vert } \end{aligned}$$

with \(c_S = c_S({\varepsilon })\) and \(c_2 = c_2({\varepsilon })\) from Lemma 9.

Proof

It is clear that

$$\begin{aligned} {[B_e^{t't}-B_{e'}^{t't'}B_e^{\tau _{t'}t}]}A_{\tau _t s} =[A_{t'\sigma ^e_t} - A_{t'\sigma ^{e'}_{t'}}(A_{\tau _{t'}\sigma ^{e'}_{t'}})^{-1}A_{\tau _{t'}\sigma ^e_t}] (A_{\tau _t\sigma ^e_t})^{-1}A_{\tau _t s}. \end{aligned}$$

Due to (27) and (21), it holds that \(Y_{\sigma _t^e} \subset {\mathcal {F}}_{e'}(X_{t'})\). Hence, Lemma 9 yields

$$\begin{aligned} {\Vert [B_e^{t't}-B_{e'}^{t't'}B_e^{\tau _{t'}t}]A_{\tau _t s}\Vert } \le c_2 {\varepsilon }{\Vert A_{t'J}\Vert }{\Vert (A_{\tau _t\sigma ^e_t})^{-1}A_{\tau _t s}\Vert }_1. \end{aligned}$$

The last factor is bounded by \(c_S^{(c)}\le c_S\) by definition of \(c_S\), which proves the assertion. \(\square \)

Theorem 1

Let Assumption 1 be valid. Let \(t\in \hat{T}_I\) and let \(\ell =L(\hat{T}_t)\) denote the depth of the sub-tree \(\hat{T}_t\) of \(\hat{T}_I\) rooted at \(t\). Then for all \(e\in {\mathcal {E}}(t)\)

$$\begin{aligned} {\Vert [U_e(t)-B_e^{tt}]A_{\tau _t s}\Vert } \le c_3 {\varepsilon }{\Vert A_{tJ}\Vert }\quad \text {for all }s\subset F_e(t), \end{aligned}$$

where

$$\begin{aligned} c_3({\varepsilon }):=c_1(1+c_S({\varepsilon }))\frac{(2c_S({\varepsilon }))^\ell }{2c_S({\varepsilon })-1}. \end{aligned}$$

Proof

From Lemma 10 we have for \(t\in \hat{T}_I{\setminus }\mathcal {L}(\hat{{T}}_I)\)

$$\begin{aligned} {\Vert {[U_e(t)-B_e^{tt}]}A_{\tau _ts}\Vert }&\le \sum _{t'\in S_I(t)} {\Vert [U_e(t)|_{t'}-B_e^{t't}]A_{\tau _ts}\Vert }\\&=\sum _{t'\in S_I(t)} {\Vert [U_{e'}(t')B_e^{\tau _{t'}t}-B_e^{t't}]A_{\tau _t s}\Vert }\\&\le \sum _{t'\in S_I(t)} {\Vert [U_{e'}(t')\!-\!B_{e'}^{t't'}]B_e^{\tau _{t'}t}A_{\tau _ts}\Vert }\!+\!{\Vert [B_e^{t't}-B_{e'}^{t't'}B_e^{\tau _{t'}t}]A_{\tau _ts}\Vert }\\&\le \sum _{t'\in S_I(t)} c_S{\Vert [U_{e'}(t')-B_{e'}^{t't'}]A_{\tau _{t'}\sigma ^e_t}\Vert }+{\varepsilon }c_2c_S{\Vert A_{t'J}\Vert }\\&\le 2 c_2c_S {\varepsilon }{\Vert A_{tJ}\Vert } +c_S\sum _{t'\in S_I(t)}{\Vert [U_{e'}(t')-B_{e'}^{t't'}]A_{\tau _{t'}\sigma ^e_t}\Vert }. \end{aligned}$$

We set \(\alpha _{t'}:={\Vert [U_{e'}(t')-B_{e'}^{t't'}]A_{\tau _{t'}\sigma ^e_t}\Vert }\) for \(t'\in S_I(t)\). From (27) and (21) we obtain that \(Y_{\sigma ^e_t}\subset {\mathcal {F}}_{e'}(X_{t'})\). Hence, we can apply the latter inequality recursively replacing \(s\) by \(\sigma _t^e\) and obtain the recurrence relation

$$\begin{aligned} \alpha _t \le 2c_2c_S{\varepsilon }{\Vert A_{tJ}\Vert }+c_S\sum _{t'\in S_I(t)} \alpha _{t'}, \quad t\in T_I\setminus \mathcal {L}(T_I). \end{aligned}$$
(31)

We show that

$$\begin{aligned} \alpha _t\le 2c_2c_S{\varepsilon }\frac{(2c_S)^{\ell -1}-1}{2c_S-1}{\Vert A_{tJ}\Vert },\quad t\in T_I, \end{aligned}$$
(32)

where \(\ell =\ell (t)\) denotes the depth of the sub-tree \(\hat{T}_t\). This estimate is obviously true for leaf clusters \(t\in \mathcal {L}(T_I)\) as \(\alpha _t=0\). Assume that (32) is valid for the sons \(S_I(t)\) of \(t\in \hat{T}_I{\setminus } \mathcal {L}(\hat{T}_I)\). Then (31) proves

$$\begin{aligned} \alpha _t&\le 2c_2c_S {\varepsilon }{\Vert A_{tJ}\Vert }+c_S\sum _{t'\in S_I(t)} \alpha _{t'}\\&\le 2c_2c_S{\varepsilon }{\Vert A_{tJ}\Vert }+2c_2c_S^2{\varepsilon }\frac{(2c_S)^{\ell -2}-1}{2c_S-1}\sum _{t'\in S_I(t)} {\Vert A_{t'J}\Vert }\\&\le 2c_2c_S{\varepsilon }\left( 1+2c_S\frac{(2c_S)^{\ell -2}-1}{2c_S-1}\right) {\Vert A_{tJ}\Vert } =2c_2c_S{\varepsilon }\frac{(2c_S)^{\ell -1}-1}{2c_S-1}{\Vert A_{tJ}\Vert }. \end{aligned}$$

Hence,

$$\begin{aligned} {\Vert [U_e(t)-B_e^{tt}]A_{\tau _ts}\Vert }&\le 2c_2c_S{\varepsilon }{\Vert A_{tJ}\Vert }+c_S\sum _{t'\in S_I(t)}\alpha _{t'}\\&\le 2c_2c_S{\varepsilon }\frac{(2c_S)^{\ell -1}}{2c_S-1}{\Vert A_{tJ}\Vert } \le c_2 {\varepsilon }\frac{(2c_S)^{\ell }}{2c_S-1}{\Vert A_{tJ}\Vert } \end{aligned}$$

which together with \(c_2=c_1(1+c_S)\) proves the assertion. \(\square \)
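The closed form (32) of the recurrence (31) can also be sanity-checked numerically. The following sketch (with arbitrary parameter values; a binary cluster tree and \({\Vert A_{t'J}\Vert }\) normalized to \({\Vert A_{tJ}\Vert }=1\) are assumed) iterates (31) with equality and compares the result with the geometric-sum formula:

def alpha(ell, c2=1.0, cS=2.0, eps=1e-4):
    # iterate (31) with equality on a binary tree of depth ell; alpha = 0 at the leaves
    if ell <= 1:
        return 0.0
    return 2 * c2 * cS * eps + cS * 2 * alpha(ell - 1, c2, cS, eps)

ell, c2, cS, eps = 6, 1.0, 2.0, 1e-4
closed = 2 * c2 * cS * eps * ((2 * cS) ** (ell - 1) - 1) / (2 * cS - 1)
assert abs(alpha(ell) - closed) < 1e-12 * closed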

Similar results as for the row clusters \(t\) can be obtained for column clusters \(s\in T_J\) and \(e\in {\mathcal {E}}(s)\) provided assumptions analogous to Assumption 1 are made. In particular, this defines clusters \(\sigma _s\subset s\) and \(\tau ^e_s\subset F'_e(s),\,|\tau ^e_s|=|\sigma _s|=\min \{k_{\varepsilon },|s|\}\), where

$$\begin{aligned} F'_e(s):=\bigcup \{t\in T_I:\exists \hat{s}\supset s\text { such that }t\times \hat{s}\in P_{\text {high}}\text { and }X_t\subset \mathcal {D}_e(Y_s)\}. \end{aligned}$$

For \(s'\subset s\) we define

$$\begin{aligned} C_e^{s's}:=(A_{\tau ^e_s \sigma _s})^{-1}A_{\tau ^e_s s'}. \end{aligned}$$

For leaf clusters \(s\in \mathcal {L}(\hat{T}_J)\) and \(e\in {\mathcal {E}}(s)\) we set \(V_e(s)=(C^{ss}_e)^H\). Assuming that matrices \(V_e(s')\) have already been constructed for the sons \(s'\in S_J(s)\) of \(s\in \hat{T}_J\setminus \mathcal {L}(\hat{T}_J)\) and \(e\in {\mathcal {E}}(s')\), we define for \(e\in {\mathcal {E}}(s)\)

$$\begin{aligned} V_e(s)|_{s'}:=V_{e'}(s')(C_e^{\sigma _{s'} s})^H, \quad s'\in S_J(s), \end{aligned}$$

where \(e'\in {\mathcal {E}}(s')\) is defined before Lemma 6. Due to the analogy with Lemma 9 and Theorem 1, we omit the proofs of the following error estimates.
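In an implementation, the column factors mirror the row factors from the sketch given above; only the roles of pivot rows and pivot columns are exchanged (again tau, sigma, cols denote global index arrays; this is our illustration, not the paper's code):

import numpy as np

def C_factor(A, tau, sigma, cols):
    # C_e^{s's} = (A[tau_s^e, sigma_s])^{-1} A[tau_s^e, s'] with cols = s'
    return np.linalg.solve(A[np.ix_(tau, sigma)], A[np.ix_(tau, cols)])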

Lemma 11

There is \(c_4=c_4({\varepsilon })>0\) such that for all \(s\in T_J\) and \(e\in {\mathcal {E}}(s)\) it holds that

$$\begin{aligned} {\Vert A_{ts}-A_{t\sigma _s}(A_{\tau _s^e \sigma _s})^{-1}A_{\tau _s^e s}\Vert } \le c_4{\varepsilon }{\Vert A_{Is}\Vert } \end{aligned}$$

for all \(t\subset I\) satisfying \(X_t\subset {\mathcal {F}}_e(Y_s)\).

Theorem 2

Let \(s\in \hat{T}_J\) and let \(\ell =L(\hat{T}_s)\) denote the depth of the cluster tree \(\hat{T}_s\). Then there is \(c_5=c_5({\varepsilon })>0\) such that for all \(e\in {\mathcal {E}}(s)\)

$$\begin{aligned} {\Vert A_{t\sigma _s}[C_e^{ss}-V_e(s)^H]\Vert } \le c_5 {\varepsilon }{\Vert A_{I s}\Vert },\quad t\subset F'_e(s). \end{aligned}$$

Using the previously constructed bases \(\mathcal {U}\) and \(\mathcal {V}\), we employ \(S(t,s):=A_{\tau _t\sigma _s}\) in (26) for each block \(A_{ts},\,t\times s\in P_{\text {high}}\). In the following theorem, the accuracy of the nested approximation based on the matrix entries of \(A\) is estimated.

Theorem 3

Let all previous assumptions be valid and \(t\times s\in P_{\text {high}}\). Then there is \(e\in {\mathcal {E}}(t)\) such that \(s\subset F_e(t)\) and the approximation error is bounded by

$$\begin{aligned} {\Vert A_{ts}-U_e(t) S(t,s) V_{-e}(s)^H\Vert } \le c_S(c_2+c_3){\varepsilon }{\Vert A_{tJ}\Vert }+(c_4+c_5{\Vert U_e(t)\Vert }_\infty ) {\varepsilon }{\Vert A_{Is}\Vert } \end{aligned}$$
(33)

with \(c_S,c_2,c_3,c_4,c_5\) depending on \({\varepsilon }\) defined above.

Proof

From \(s\subset F_e(t)\) it follows that \(t\subset F'_{-e}(s)\). We have that

$$\begin{aligned} A_{ts}-B_e^{tt} S(t,s) C_{-e}^{ss} = A_{ts} - A_{t \sigma _s}C_{-e}^{ss}+[A_{t \sigma _s}-B_e^{tt}A_{\tau _t\sigma _s}]C_{-e}^{ss}. \end{aligned}$$

From Lemma 9 it follows that \({\Vert A_{t\sigma _s}-B_e^{tt}A_{\tau _t \sigma _s}\Vert }\le c_2 {\varepsilon }{\Vert A_{tJ}\Vert }\), and from Lemma 11 we have that \({\Vert A_{ts} -A_{t\sigma _s} C_{-e}^{ss}\Vert } \le c_4 {\varepsilon }{\Vert A_{Is}\Vert }\). Therefore,

$$\begin{aligned} {\Vert A_{ts}-B_e^{tt} S(t,s) C_{-e}^{ss}\Vert } \le {\varepsilon }(c_4{\Vert A_{Is}\Vert }+c_2{\Vert A_{tJ}\Vert }{\Vert C_{-e}^{ss}\Vert }_1). \end{aligned}$$

Furthermore, Theorems 1 and 2 yield

$$\begin{aligned}&{\Vert U_e(t)S(t,s)V_{-e}(s)^H - B_e^{tt}S(t,s)C_{-e}^{ss}\Vert }\\&\le {\Vert U_e(t)\Vert }_\infty {\Vert S(t,s)[V_{-e}(s)^H - C_{-e}^{ss}]\Vert }\\&\quad +{\Vert [U_e(t) - B_e^{tt}]S(t,s)\Vert }{\Vert C_{-e}^{ss}\Vert }_1\\&\le c_5{\varepsilon }{\Vert U_e(t)\Vert }_\infty {\Vert A_{Is}\Vert }+ c_3c_S{\varepsilon }{\Vert A_{tJ}\Vert }, \end{aligned}$$

which proves the assertion. \(\square \)

Remark 1

The \({\Vert \cdot \Vert }_\infty \)-norm of \(U_e(t)\) can be expected to be close to that of \(B_e^{tt}\), since \(U_e(t)\) approximates \(B_e^{tt}\); cf. Theorem 1. The actual values of \({\Vert U_e(t)\Vert }_\infty \) can easily be evaluated during the computation. Hence, it is easy to check the quality of the nested approximation via (33). The numerical results in the next section show that these terms are bounded in practice and do not destroy the convergence of the method.
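As an illustration of such a check (our sketch, not the implementation used for the experiments below), the relative max-norm error of a single far-field block can be monitored as follows, where U and V denote the directional bases restricted to \(t\) and \(s\), and tau_t, sigma_s are the local pivot indices:

import numpy as np

def block_rel_error(A_ts, U, V, tau_t, sigma_s):
    # relative max-norm error of A_ts ~ U S V^H with S(t,s) = A[tau_t, sigma_s]
    S = A_ts[np.ix_(tau_t, sigma_s)]
    E = A_ts - U @ S @ V.conj().T
    return np.abs(E).max() / np.abs(A_ts).max()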

6 Numerical results

We consider the sound-soft/sound-hard scattering problem (1), imposing either the Dirichlet datum \(u = u_D\) (sound soft) or the Neumann trace \(\partial _\nu u = v_N\) (sound hard) on the boundary \(\varGamma \). We denote by \({\mathfrak {V}}\) the single-layer and by \({\mathfrak {K}}\) the double-layer operator, with kernels \(S(x-y)\) and \(\partial _{\nu _y} S(x-y)\), respectively. Using Green’s representation formula \(u = {\mathfrak {K}}u - {\mathfrak {V}} (\partial _\nu u)\) in \(\varOmega ^c\) and the jump relations, we end up with the integral equation

$$\begin{aligned} {\mathfrak {V}}(\partial _\nu u) = \left( {\mathfrak {K}} - \frac{1}{2} {\mathfrak {I}}\right) u\quad \text {on }\varGamma , \end{aligned}$$
(34)

which can be solved either for the unknown \(\partial _\nu u|_{\varGamma }\) in the case of a Dirichlet problem or for the unknown \(u|_{\varGamma }\) in the Neumann case. The Brakhage–Werner ansatz

$$\begin{aligned} u = {\mathfrak {K}}\phi - {\text {i}}\beta {\mathfrak {V}} \phi \quad \text {in }\varOmega ^c \end{aligned}$$
(35)

with an arbitrary coupling parameter \(\beta >0\) uses an unknown density function \(\phi \) that satisfies the integral equation

$$\begin{aligned} \left( \frac{1}{2} {\mathfrak {I}} + {\mathfrak {K}} -{\text {i}}\beta {\mathfrak {V}}\right) \phi = u_D. \end{aligned}$$
(36)

In either case, Galerkin discretization of the integral equations leads to a linear system with matrices of the form (2) that can be approximated by directional \({\mathcal {H}}^2\)-matrices. The solution can then be obtained via GMRES using the matrix-vector multiplication from Sect. 4.1, which we proved to have logarithmic-linear complexity.
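Since GMRES requires only the action of the matrix on a vector, the compressed operator can be passed to any Krylov solver through its matrix-vector product. A minimal sketch (a small dense matrix stands in for the directional \({\mathcal {H}}^2\)-approximation; all names are ours):

import numpy as np
from scipy.sparse.linalg import LinearOperator, gmres

rng = np.random.default_rng(1)
N = 200
# dense stand-in for the compressed coefficient matrix; in practice only the
# fast matvec of the directional H^2-approximation would be supplied
A = np.eye(N) + (rng.standard_normal((N, N))
                 + 1j * rng.standard_normal((N, N))) / (10 * np.sqrt(N))
rhs = rng.standard_normal(N) + 1j * rng.standard_normal(N)

op = LinearOperator((N, N), matvec=lambda x: A @ x, dtype=complex)
phi, info = gmres(op, rhs)  # info == 0 indicates convergence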

6.1 Approximation of \(\varvec{{\mathfrak {V}}}\)

As a first step, we validate the logarithmic-linear complexity of the directional \({\mathcal {H}}^2\)-matrices (labeled dir\({\mathcal {H}}^2\)-ACA). Moreover, we compare our new approach with the standard \({\mathcal {H}}\)-matrix approximation via ACA (labeled \({\mathcal {H}}\)-ACA) and the \({\mathcal {H}}^2\)-matrix approach from [8] (labeled \({\mathcal {H}}^2\)-ACA). In this section, we focus on the approximation of the single-layer operator \({\mathfrak {V}}\). Since \(\kappa h\) is assumed to be constant, we increase \(\kappa \) as the number of degrees of freedom \(N\) grows. By “acc.” we denote the relative error with respect to the full matrix in the Frobenius norm.

Table 1 shows the memory consumption for the approximation of the Galerkin discretization of \({\mathfrak {V}}\) with piecewise constant ansatz and test functions on the prolate spheroid, i.e. an ellipsoid elongated by a factor of ten in the \(x\)-direction. The accuracy for \(N=1{,}905{,}242\) was not computed, because calculating the difference from the full matrix in the Frobenius norm takes \({\mathcal {O}}(N^2)\) time.

Table 1 Prolate spheroid: \(\kappa h \sim 0.15,\,\eta _\text {high} = 5\)

For small \(N\), the three methods show about the same performance. This is due to the fact that directional \({\mathcal {H}}^2\)-matrices adapt to the wave number and reduce to usual \({\mathcal {H}}\)-matrices for low frequencies. Nevertheless, \({\mathcal {H}}^2\)-ACA seems to be superior with respect to memory consumption. The latter approach, however, is not suitable for Helmholtz-type kernel functions, as its running time increases for higher wave numbers. The reason for this behaviour is that the blockwise rank cannot be guaranteed to be constant; thus, the number of a priori pivots must be increased accordingly in order to achieve a prescribed accuracy. The gain of dir\({\mathcal {H}}^2\)-ACA in terms of both time and memory becomes visible for larger \(N\) (see Figs. 3, 4).

Fig. 3 Memory consumption in MB on the prolate spheroid

Fig. 4 Construction time on the prolate spheroid

6.2 Neumann problem

We consider the sound-hard scattering problem and use piecewise linear ansatz and test functions for the Galerkin discretization of (34). The approximate Dirichlet datum \(\tilde{u}_D \approx u|_{\varGamma }\) is obtained by solving (34) with approximations (\({\varepsilon }= 10^{-4}\)) of the discrete operators \(V\) and \(K\) of \({\mathfrak {V}}\) and \({\mathfrak {K}}\). We use the Neumann datum \(v_N:=\partial _\nu S(\cdot -x_0)\) with an interior point \(x_0\in \varOmega \). In this case, we are able to compute the \(L^2\)-error \({\Vert \tilde{u}_D - u\Vert }_{L^2(\varGamma )} /{\Vert u\Vert }_{L^2(\varGamma )}\) with respect to the exact Dirichlet trace given by \(u|_{\varGamma } = S(\cdot -x_0)\). Tables 2 and 4 show the behaviour of the error on the sphere with radius 1 for fixed \(\kappa h\) and fixed \(\kappa \), respectively. Furthermore, the CPU time and the memory consumption required by the approximations of \(V\) and \(K\) are shown; both can be seen to behave logarithmic-linearly for fixed as well as growing wave numbers. As a reference, we carried out the same computations using \({\mathcal {H}}\)-matrices. The corresponding results are shown in Tables 3 and 5. As before, the advantage of dir\({\mathcal {H}}^2\)-ACA becomes visible for a growing number of waves. It is remarkable, however, that even in the fixed-frequency case the directional approach outperforms \({\mathcal {H}}\)-ACA in terms of memory and computation time for larger numbers of degrees of freedom. For smaller numbers of degrees of freedom, the performance of \({\mathcal {H}}\)-ACA is only slightly better.

Table 2 \(\hbox {dir}{\mathcal {H}}^2\)-ACA: \(L^2\)-error of \(\tilde{u}_D\) with \(\kappa h = 0.19\)
Table 3 \({\mathcal {H}}\)-ACA: \(L^2\)-error of \(\tilde{u}_D\) with \(\kappa h = 0.19\)
Table 4 \(\hbox {dir}{\mathcal {H}}^2\)-ACA: \(L^2\)-error of \(\tilde{u}_D\) with fixed \(\kappa = 6\)
Table 5 \({\mathcal {H}}\)-ACA: \(L^2\)-error of \(\tilde{u}_D\) with fixed \(\kappa = 6\)

6.3 Visualization of Dirichlet problem with Brakhage–Werner

We consider the sound-soft scattering problem, i.e. we seek a solution \(u = u_i + u_s\) of the Helmholtz equation (1), where \(u_i(x) := \exp ({\text {i}}\kappa (x,e))\) with \(e = (1,0,1)^T/\sqrt{2}\) denotes the incident acoustic wave and \(u_s\) is the unknown scattered field. The incident wave is reflected by the sound-soft obstacle \(\varOmega \), which is modeled by the Dirichlet condition \(u_s|_{\varGamma } = -u_i|_{\varGamma }\).

The obstacle is composed of 4 cylindrical spheres with 507,904 panels and 253,960 vertices. We solve (36) for the unknown density \(\phi \) with piecewise linear ansatz and test functions. Following [22], we use the coupling parameter \(\beta = \kappa /2\). In a second step, we evaluate the potential (35) on a uniform discretization of a screen behind the obstacle in order to compute the scattered wave \(u_s\). Figure 5 shows the resulting pressure field of the total wave \(|u_i+u_s|\) for \(\kappa = 10\) and \(\kappa = 40\).

Fig. 5 Pressure field \(|u_s+u_i|\) for \(\kappa = 10\) and \(\kappa =40\)