1 Introduction

Stochastic PDEs are commonly used to model uncertain physical phenomena. One such model that is used, for instance, in groundwater modeling is the diffusion equation

$$\begin{aligned} -{\text {div}}(a \nabla u ) = f \text { in } D, \quad u|_{\partial D} = 0, \end{aligned}$$
(1)

with lognormally distributed coefficient, that is, \(a = \exp (b)\), where b is a centered Gaussian random field defined on the computational domain \(D\subset \mathbb {R}^d\) (where typically \(d=2\) or \(d=3\)).

Uncertainty quantification aims at describing the statistical properties of the resulting solution u, with various computational objectives: evaluating the mean field \(\bar{u}=\mathbb {E}(u)\), estimating a plausible a based on some measurement data of the solution, describing the law of a scalar quantity of interest Q(u).

For such objectives, it is most convenient to represent b in the form of an expansion

$$\begin{aligned} b = \sum _{j \ge 1} y_j \psi _j, \end{aligned}$$
(2)

where \(y_j\) are i.i.d. \({\mathcal {N}}(0,1)\), that is, independent scalar standard Gaussian random variables, and \(\psi _j\) are suitable functions on D. Once such an expansion is given, one may introduce approximations to the solution map

$$\begin{aligned} y \mapsto u(y), \quad y=(y_j)_{j\ge 1}, \end{aligned}$$
(3)

for example by multivariate polynomials in the variables \(y_j\). Such approximations provide a fast way to evaluate u(y) for any choice of y up to some prescribed accuracy, which is of crucial help for the above mentioned tasks. Expansions of the form (2) can also be of practical use for the fast generation of trajectories, provided that the \(\psi _j\) have simple analytic expressions or are easy to compute numerically.

The centered Gaussian field b is characterized by its covariance function

$$\begin{aligned} (x,x')\mapsto K(x,x') = \mathbb {E}(b(x)\,b(x')), \quad x,x'\in D. \end{aligned}$$
(4)

A standard way of obtaining a representation (2) is by using the Karhunen–Loève (KL) basis, that is, the \(L^2(D)\)-orthonormal eigenfunctions \((\varphi _j)_{j\ge 1}\) of the integral operator

$$\begin{aligned} T: v \mapsto Tv= \int _D K(\cdot ,z)\,v(z)\,dz, \end{aligned}$$
(5)

with corresponding eigenvalues \(\lambda _j\ge 0\) arranged in decreasing order. One then obtains (2) by setting

$$\begin{aligned} \psi _j:= \sqrt{\lambda _j} \varphi _j. \end{aligned}$$
(6)

The distinguishing feature of this particular choice is that in addition to the statistical orthogonality \(\mathbb {E}(y_jy_k)=\delta _{j,k}\), the functions \(\psi _j\) are orthogonal in \(L^2(D)\). However, other expansions of the form (2) may also be considered if one does not impose the \(L^2(D)\)-orthogonality.

As shown in [23], general expansions of the form (2) with i.i.d. \({\mathcal {N}}(0,1)\) coefficients are characterized by the fact that the \(\psi _j\) form a tight frame of the reproducing kernel Hilbert space (RKHS) \({\mathcal {H}}\) induced by K. Recall that \({\mathcal {H}}\) is the closure of the finite linear combinations of the functions \(K_z:=K(\cdot ,z)\) for \(z\in D\), with respect to the norm induced by the inner product \(\langle K_x,K_z\rangle _{{\mathcal {H}}}:=K(x,z)\), see [1] for a general treatment. Recall also that a tight frame of a Hilbert space \({\mathcal {H}}\) is a complete system that satisfies the identity

$$\begin{aligned} \sum _{j\ge 1} |\langle g,\psi _j\rangle _{\mathcal {H}}|^2=\Vert g\Vert _{{\mathcal {H}}}^2, \quad g\in {\mathcal {H}}. \end{aligned}$$
(7)

We refer to [11] for classical examples of time-frequency or time-scale frames. In contrast to orthonormal bases, such systems may be redundant. The possible redundancy in (2) can be illustrated by the following trivial example: if \(y_1\) and \(y_2\) are i.i.d. \({\mathcal {N}}(0,1)\) and \(\psi \) is a given function, then \(y_1\psi +y_2\psi = z (\sqrt{2} \psi )\) with \(z=(y_1+y_2)/\sqrt{2}\) also \({\mathcal {N}}(0,1)\).

As an elementary yet useful example of the different possibilities for expanding a Gaussian process, consider the Brownian bridge on \(D=[0,1]\) whose covariance is given by \(K(x,x')=\min \{x,x'\}-xx'\). On the one hand, the representation based on the KL expansion is given by the trigonometric functions

$$\begin{aligned} \psi _j(x):=\frac{\sqrt{2}}{\pi j}\sin (\pi j x), \quad j\ge 1. \end{aligned}$$
(8)

On the other hand, another classical representation of this process is given by the Schauder basis, which consists of the hat functions

$$\begin{aligned} \psi _j(x):=2^{-\ell /2}\sigma (2^\ell x-k), \quad \ell \ge 0, \quad k=0,\dots ,2^\ell -1, \quad j=2^{\ell }+k, \end{aligned}$$
(9)

where

$$\begin{aligned} \sigma (x):=\tfrac{1}{2}\max \{1-|2x-1|,0\}. \end{aligned}$$
(10)

Both systems are orthonormal bases (and thus tight frames) of the RKHS, which in this case is \({\mathcal {H}}=H^1_0(D)\) endowed with norm \(\Vert v\Vert _{\mathcal {H}}:=\Vert v'\Vert _{L^2(D)}\).
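Any expansion (2) with i.i.d. \({\mathcal {N}}(0,1)\) coefficients reproduces the covariance through \(K(x,x')=\sum _{j\ge 1}\psi _j(x)\psi _j(x')\). As a minimal numerical sketch, the following Python code checks this identity for the KL functions (8) of the Brownian bridge. The function names and the truncation level are illustrative choices and not part of the construction.

```python
import math


def bridge_cov(x, xp):
    """Brownian bridge covariance K(x, x') = min(x, x') - x x'."""
    return min(x, xp) - x * xp


def kl_psi(j, x):
    """KL function (8): sqrt(2) / (pi j) * sin(pi j x)."""
    return math.sqrt(2.0) / (math.pi * j) * math.sin(math.pi * j * x)


def kl_cov_partial(x, xp, n_terms):
    """Partial sum of psi_j(x) psi_j(x') over j = 1, ..., n_terms."""
    return sum(kl_psi(j, x) * kl_psi(j, xp) for j in range(1, n_terms + 1))


if __name__ == "__main__":
    for x, xp in [(0.3, 0.7), (0.5, 0.5), (0.1, 0.25)]:
        print(x, xp, bridge_cov(x, xp), kl_cov_partial(x, xp, 2000))
```

Since \(|\psi _j(x)\psi _j(x')|\le 2/(\pi j)^2\), the truncation error after n terms is at most of order \(1/n\), which is the slow decay typical of globally supported KL functions.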

Lax–Milgram theory ensures that for each individual \(y=(y_j)_{j\ge 1} \in U := \mathbb {R}^\mathbb {N}\) such that \(\sum _{j\ge 1} y_j\psi _j\) converges in \(L^\infty (D)\), the corresponding solution u(y) is well defined in \(V:=H^1_0(D)\), with a priori bound

$$\begin{aligned} \Vert u(y)\Vert _{H^1_0}\le C \exp \biggl (\Bigl ||\sum _{j\ge 1} y_j\psi _j\Bigr ||_{L^\infty }\biggr ), \quad C:=\Vert f\Vert _{V'}. \end{aligned}$$
(11)

Sufficient conditions have been established, either in terms of the covariance function K or of the size properties of the \(\psi _j\), which ensure that the solution map \(y\mapsto u(y)\) belongs to \(L^k(U,V,\gamma )\) for all \(k<\infty \), see [3, 6, 10, 17, 21]. Here \(L^k(U,V,\gamma )\) is the usual Bochner space where \(\gamma \) denotes the countable tensor product of the univariate standard Gaussian measure.

In turn, approximation can be obtained in \(L^2(U,V,\gamma )\), corresponding to mean-square convergence, by truncation of the tensor product Hermite expansion

$$\begin{aligned} u(y) = \sum _{\nu \in {\mathcal {F}}} u_\nu H_\nu (y) ,\qquad u_\nu =\int _U u(y)H_\nu (y)d\gamma (y) \in V, \quad H_\nu (y) := \prod _{j\ge 1} H_{\nu _j}(y_j). \end{aligned}$$
(12)

Here we have denoted by \((H_n)_{n\ge 0}\) the sequence of univariate Hermite polynomials with normalization in \(L^2(\mathbb {R}, g(t) dt)\) where \(g(t)=\frac{1}{\sqrt{2\pi }} e^{-t^2/2}\), and by \({\mathcal {F}}\) the set of finitely supported sequences \(\nu =(\nu _j)_{j\ge 1}\) of non-negative integers.
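To make the normalization concrete, the following sketch treats a univariate toy case that is not taken from the paper: for \(u(y)=e^{ay}\) with scalar y, the generating function of the Hermite polynomials gives the coefficients \(u_n=e^{a^2/2}a^n/\sqrt{n!}\) in closed form. The code compares them with a trapezoidal approximation of the integral in (12). All function names and quadrature parameters are assumptions of this illustration.

```python
import math


def hermite_normalized(n_max, y):
    """Values H_0(y), ..., H_{n_max}(y) of the Hermite polynomials that are
    orthonormal with respect to the standard Gaussian density."""
    he = [1.0, y]  # probabilists' Hermite polynomials He_0 and He_1
    for n in range(1, n_max):
        he.append(y * he[n] - n * he[n - 1])  # He_{n+1} = y He_n - n He_{n-1}
    return [he[n] / math.sqrt(math.factorial(n)) for n in range(n_max + 1)]


def gaussian_density(t):
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def hermite_coefficients(func, n_max, half_width=12.0, step=0.01):
    """Approximate u_n = int func(y) H_n(y) g(y) dy by the trapezoidal rule."""
    n_steps = int(round(2.0 * half_width / step))
    coeffs = [0.0] * (n_max + 1)
    for i in range(n_steps + 1):
        y = -half_width + i * step
        weight = step * (0.5 if i in (0, n_steps) else 1.0)
        w = weight * func(y) * gaussian_density(y)
        for n, h in enumerate(hermite_normalized(n_max, y)):
            coeffs[n] += w * h
    return coeffs


def exact_coefficients(a, n_max):
    """Closed form for func(y) = exp(a y): u_n = exp(a^2 / 2) a^n / sqrt(n!)."""
    return [math.exp(0.5 * a * a) * a ** n / math.sqrt(math.factorial(n))
            for n in range(n_max + 1)]


if __name__ == "__main__":
    a = 0.5
    numeric = hermite_coefficients(lambda y: math.exp(a * y), 5)
    for n, (u_num, u_ex) in enumerate(zip(numeric, exact_coefficients(a, 5))):
        print(n, u_num, u_ex)
```

In this toy case Parseval's identity reads \(\sum _{n\ge 0}u_n^2=\mathbb {E}(e^{2ay})=e^{2a^2}\), which can be checked directly from the closed form.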

The best error for a given number n of terms retained in the above expansion is attained by the so-called best n-term Hermite approximation \(y\mapsto u_n(y)\) obtained by retaining the indices \(\nu \) corresponding to the n largest \(||u_\nu ||_V\). By combining Parseval’s identity with Stechkin’s lemma [13], the resulting error can be quantified in terms of the \(\ell ^p\)-summability of the sequence \((||u_\nu ||_V)_{\nu \in {\mathcal {F}}}\) for \(p<2\), namely

$$\begin{aligned} \Vert u-u_n\Vert _{L^2(U,V,\gamma )}\le Cn^{-s}, \quad s:=\frac{1}{p}-\frac{1}{2}, \quad C:=\Vert (||u_\nu ||_V)_{\nu \in {\mathcal {F}}}\Vert _{\ell ^p({\mathcal {F}})}. \end{aligned}$$
(13)
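The estimate (13) can be illustrated on a finite model sequence. In the sketch below, the list `norms` plays the role of \((||u_\nu ||_V)_{\nu \in {\mathcal {F}}}\). The algebraically decaying test sequence and all names are hypothetical choices made only for this illustration.

```python
import math


def best_n_term_error(norms, n):
    """l2 norm of what remains after keeping the n largest entries."""
    tail = sorted(norms, reverse=True)[n:]
    return math.sqrt(sum(a * a for a in tail))


def stechkin_bound(norms, n, p):
    """Right-hand side of (13): ||norms||_{l^p} * n^{-s} with s = 1/p - 1/2."""
    lp_norm = sum(a ** p for a in norms) ** (1.0 / p)
    return lp_norm * n ** (-(1.0 / p - 0.5))


if __name__ == "__main__":
    norms = [(k + 1) ** -1.5 for k in range(2000)]  # model decay, not PDE data
    for n in (1, 10, 100):
        print(n, best_n_term_error(norms, n), stechkin_bound(norms, n, 1.0))
```

By Parseval's identity, the \(\ell ^2\) norm of the discarded entries equals the \(L^2(U,V,\gamma )\) error of the corresponding n-term truncation, so the first printed column corresponds to the left-hand side of (13).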

As shown in [3], this extra summability may depend strongly on the particular representation (2) that is used, or in other words, on the choice of coordinates \((y_j)_{j\ge 1}\). Whereas previous results establishing summability properties of \((||u_\nu ||_V)_{\nu \in {\mathcal {F}}}\) are based only on the summability of \((||\psi _j||_{L^\infty (D)})_{j\ge 1}\), that is, on the absolute sizes of the \(\psi _j\), the results in [2, 3] also take into account the localization properties of \(\psi _j\). More specifically, the following is shown in [3], assuming that D is a bounded Lipschitz domain.

Theorem 1.1

Let \(0<p<2\) and let \(q=q(p):=\frac{2p}{2-p}\). Assume that there exists a positive sequence \((\rho _j)_{j\ge 1}\) such that

$$\begin{aligned} \sup _{x\in D}\sum _{j\ge 1} \rho _j|\psi _j(x)| <\infty \end{aligned}$$
(14)

and

$$\begin{aligned} (\rho _j^{-1})_{j\ge 1} \in \ell ^q(\mathbb {N}). \end{aligned}$$
(15)

Then the solution map \(y\mapsto u(y)\) belongs to \(L^k(U,V,\gamma )\) for all \(0\le k<\infty \). Moreover, \((\Vert u_\nu \Vert _V)_{\nu \in {\mathcal {F}}} \in \ell ^p({\mathcal {F}})\). In particular, best n-term Hermite approximations converge in \(L^2(U,V,\gamma )\) with rate \(n^{-s}\) for \(s=\frac{1}{p}-\frac{1}{2}=\frac{1}{q}\).

Let us mention that although this theorem refers to the particular elliptic equation (1), inspection of its proof shows that a similar result holds for other types of equations. This includes in particular fourth-order elliptic equations or parabolic equations, again with lognormal coefficients (see the discussion at the end of Sect. 2 of [2] in the case of affine coefficients, which also applies to the lognormal case).

The above result draws an important distinction between representations using globally or locally supported functions \(\psi _j\). If nothing is assumed on the supports of \(\psi _j\), we may only apply the following immediate consequence of Theorem 1.1.

Corollary 1.2

If \((\psi _j)_{j\ge 1}\) is a family of functions with arbitrary support such that \((\Vert \psi _j\Vert _{L^\infty })_{j\ge 1}\) belongs to \(\ell ^r(\mathbb {N})\) for some \(r<1\), then best n-term Hermite approximations converge in \(L^2(U,V,\gamma )\) with rate \(n^{-s}\) for \(s=\frac{1}{r}-1\).
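The exponent bookkeeping behind Theorem 1.1 and Corollary 1.2 can be checked in exact arithmetic. The sketch below assumes the weights \(\rho _j:=\Vert \psi _j\Vert _{L^\infty }^{r-1}\), a natural choice for deducing the corollary: then \(\rho _j|\psi _j(x)|\le \Vert \psi _j\Vert _{L^\infty }^{r}\), so that (14) holds, and \((\rho _j^{-1})_{j\ge 1}\in \ell ^q\) with \(q=\frac{r}{1-r}\). The function names are illustrative.

```python
from fractions import Fraction


def theorem_exponents(p):
    """Return (q, s) from Theorem 1.1 for 0 < p < 2."""
    p = Fraction(p)
    if not 0 < p < 2:
        raise ValueError("p must lie in (0, 2)")
    q = 2 * p / (2 - p)
    s = 1 / p - Fraction(1, 2)
    return q, s


def corollary_exponents(r):
    """Return (q, p, s) for l^r-summable sup norms, 0 < r < 1, assuming the
    weights rho_j = ||psi_j||^(r - 1)."""
    r = Fraction(r)
    if not 0 < r < 1:
        raise ValueError("r must lie in (0, 1)")
    q = r / (1 - r)      # (rho_j^{-1}) = (||psi_j||^(1 - r)) lies in l^q
    p = 2 * q / (2 + q)  # inverse of the relation q = 2p / (2 - p)
    s = 1 / r - 1        # rate stated in Corollary 1.2
    return q, p, s


if __name__ == "__main__":
    q, p, s = corollary_exponents(Fraction(1, 2))
    print(q, p, s)               # exponents obtained from r = 1/2
    print(theorem_exponents(p))  # the same (q, s) recovered from p
```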

Consider for example the above mentioned Brownian bridge. On the one hand, the KL functions given by (8) are globally supported with size of order \(j^{-1}\). In turn, for any \(q<\infty \), there exists no sequence \((\rho _j)_{j\ge 1}\) satisfying (14) and (15), and therefore the best n-term truncation cannot be proved to converge with any algebraic rate. On the other hand, the Schauder functions given by (9) have local support properties which allow us to fulfill (14) with \(\rho _j= j^{s}\) for any \(s<\frac{1}{2}\). Hence best n-term Hermite truncation based on the Schauder representation is ensured to converge with rate \(n^{-s}\) for any \(s<\frac{1}{2}\).
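The contrast between the two systems with respect to condition (14) can be observed numerically. The following sketch evaluates the weighted sums \(\sum _j j^{s}|\psi _j(x)|\) for hat functions indexed as in (9) and for the sine functions (8). The hat profile used in the code has peak height one; constant normalization factors do not affect whether the sums stay bounded. All names and parameter values are illustrative assumptions.

```python
import math


def hat(t):
    """Hat profile max(1 - |2t - 1|, 0), supported on [0, 1]."""
    return max(1.0 - abs(2.0 * t - 1.0), 0.0)


def schauder_weighted_sum(x, s, max_level):
    """Sum of j^s 2^(-l/2) hat(2^l x - k) over levels l = 0..max_level,
    with j = 2^l + k as in (9) and x in [0, 1]."""
    total = 0.0
    for level in range(max_level + 1):
        scale = 2 ** level
        k = min(int(scale * x), scale - 1)  # only hat on this level that can be nonzero at x
        j = scale + k
        total += j ** s * 2.0 ** (-level / 2.0) * hat(scale * x - k)
    return total


def sine_weighted_sum(x, s, n_terms):
    """Sum of j^s |psi_j(x)| for j <= n_terms, with psi_j from (8)."""
    return sum(j ** s * math.sqrt(2.0) / (math.pi * j) * abs(math.sin(math.pi * j * x))
               for j in range(1, n_terms + 1))


def geometric_bound(s):
    """Level-wise bound sum_l 2^((l+1)s) 2^(-l/2), finite for s < 1/2."""
    return 2.0 ** s / (1.0 - 2.0 ** (s - 0.5))


if __name__ == "__main__":
    grid = [i / 1000.0 for i in range(1001)]
    print(max(schauder_weighted_sum(x, 0.4, 20) for x in grid), geometric_bound(0.4))
    print(sine_weighted_sum(0.5, 0.0, 100), sine_weighted_sum(0.5, 0.0, 10000))
```

For the hat functions, at most one term per level is active at a given point, so the sum is dominated by a convergent geometric series when \(s<\frac{1}{2}\). For the sine functions, the partial sums at \(x=\frac{1}{2}\) grow logarithmically even for \(s=0\).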

The same analysis applies to a general process on a bounded domain \(D\subset \mathbb {R}^d\) if it admits an expansion (2) where the \(\psi _j\) have wavelet-like localization properties. In this case, it is convenient to index the basis functions \(\psi _j\) by a scale-space index \(\lambda \), with \(|\lambda |\) denoting the corresponding scale level. We denote by \(\mathcal {I}\) the set of these indices, where \(\#\{ \lambda \in \mathcal {I}:|\lambda | = \ell \} \sim 2^{d\ell }\) for \(\ell \ge 0\).

Corollary 1.3

Let \((\psi _j)_{j\ge 1}= (\psi _\lambda )_{\lambda \in \mathcal {I}}\) be a wavelet basis such that for some \(\alpha >0\),

$$\begin{aligned} \sup _{x\in D} \sum _{|\lambda | = \ell } |\psi _\lambda (x)| \le C 2^{-\alpha \ell } , \quad \ell \ge 0. \end{aligned}$$
(16)

If \((\Vert \psi _\lambda \Vert _{L^\infty })_{\lambda \in \mathcal {I}}\) belongs to \(\ell ^q(\mathcal {I})\), which holds for \(q>\frac{d}{\alpha }\), then the best n-term Hermite approximations converge in \(L^2(U,V,\gamma )\) with rate \(n^{-s}\) for all \(s< \frac{1}{q} < \frac{\alpha }{d}\).

This leads to the question whether suitable wavelet-type systems \((\psi _j)_{j\ge 1}\) forming orthonormal bases or tight frames of \(\mathcal {H}\) can also be found for more general b with given covariance kernel K. In view of the above corollary, we are interested in the decay exponent \(\alpha \) that can be ensured in (16). For practical purposes, we also need these systems to either have exact analytic expressions or be computable by simple and efficient numerical procedures.

We shall focus on stationary random fields, that is, covariances of the form

$$\begin{aligned} K(x,x') = k(x - x'), \quad x,x'\in D, \end{aligned}$$
(17)

where k is an even function defined over \(\mathbb {R}^d\) which is the inverse Fourier transform of a positive measure. One typical class of examples is given by the family of Matérn covariances

$$\begin{aligned} k(x) = \frac{2^{1-\nu }}{\Gamma (\nu )} \biggl ( \frac{\sqrt{2\nu }|x|}{\lambda } \biggr )^\nu K_\nu \biggl ( \frac{\sqrt{2\nu }|x|}{\lambda } \biggr ), \end{aligned}$$
(18)

where \(\nu ,\lambda > 0\) and \(K_\nu \) is the modified Bessel function of the second kind, with Fourier transform given by

$$\begin{aligned} \hat{k}(\omega ) = c_{\nu ,\lambda }\, \biggl ( \frac{2\nu }{\lambda ^2} + |\omega |^2 \biggr )^{-(\nu + d/2)}, \quad c_{\nu ,\lambda } := \frac{ 2^d \pi ^{d/2} \Gamma (\nu + d/2) (2\nu )^\nu }{ \Gamma (\nu ) \lambda ^{2\nu }}. \end{aligned}$$
(19)

Here, for the Fourier transform, we use the convention

$$\begin{aligned} \hat{f}(\omega ) = \int _{\mathbb {R}^d} f(x)\,e^{-i x\cdot \omega } \,dx. \end{aligned}$$
(20)

The parameters \(\nu \) and \(\lambda \) quantify the smoothness and correlation length, respectively, of the process.
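As a sanity check of (19) under the convention (20), the following sketch treats the case \(d=1\) with \(\nu \in \{\frac{1}{2},\frac{3}{2}\}\), for which (18) reduces to the standard closed forms \(e^{-z}\) and \((1+z)e^{-z}\) with \(z=\sqrt{2\nu }|x|/\lambda \). These closed forms are well known but are not stated in the text. The quadrature parameters and function names are assumptions of this illustration.

```python
import math


def matern_half_integer(x, nu, lam):
    """Matern covariance (18) in closed form for nu = 1/2 or nu = 3/2."""
    z = math.sqrt(2.0 * nu) * abs(x) / lam
    if nu == 0.5:
        return math.exp(-z)
    if nu == 1.5:
        return (1.0 + z) * math.exp(-z)
    raise ValueError("only nu = 0.5 and nu = 1.5 are implemented in this sketch")


def matern_spectrum(omega, nu, lam, d=1):
    """Fourier transform (19) of the Matern covariance in dimension d."""
    c = (2.0 ** d * math.pi ** (d / 2.0) * math.gamma(nu + d / 2.0) * (2.0 * nu) ** nu
         / (math.gamma(nu) * lam ** (2.0 * nu)))
    return c * (2.0 * nu / lam ** 2 + omega ** 2) ** (-(nu + d / 2.0))


def spectrum_by_quadrature(omega, nu, lam, cutoff=60.0, step=0.001):
    """2 * int_0^cutoff k(x) cos(omega x) dx by the trapezoidal rule (d = 1)."""
    n_steps = int(round(cutoff / step))
    total = 0.0
    for i in range(n_steps + 1):
        x = i * step
        weight = 0.5 if i in (0, n_steps) else 1.0
        total += weight * matern_half_integer(x, nu, lam) * math.cos(omega * x)
    return 2.0 * step * total


if __name__ == "__main__":
    for nu in (0.5, 1.5):
        for omega in (0.0, 1.0, 3.0):
            print(nu, omega, matern_spectrum(omega, nu, 1.0),
                  spectrum_by_quadrature(omega, nu, 1.0))
```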

The idea of using wavelet-type systems for the representation of Gaussian processes was put forward in the pioneering works of Ciesielski [7], motivated by the problem of analyzing the Hölder smoothness of univariate Gaussian processes. This program was pursued in [8] where Sobolev–Besov smoothness was investigated, in particular for the fractional Brownian motion. These papers were based on representing the processes of interest in the Schauder basis, with resulting components that are generally not independent and therefore not of the form (2).

A general approach was proposed in [4] to obtain wavelet-type representations of Gaussian processes defined on \(\mathbb {R}^d\), with independent components. This approach requires that the covariance function is the Green’s function of an operator of pseudo-differential type, and it includes Matérn covariances, see also [9]. A related approach was proposed in [15] for stationary Gaussian processes, motivated by fast methods for generating trajectories. Both approaches strongly rely on the Fourier transform over \(\mathbb {R}^d\), and do not carry over in a simple manner to the case of a bounded domain \(D\subset \mathbb {R}^d\).

A general construction of wavelet expansions of the form (2) for Gaussian processes was recently proposed in [22], in the general framework of Dirichlet spaces. Here the needed assumptions are that the covariance operator commutes with the operator that defines the Dirichlet structure. The latter does not have a simple explicit form in the case of Matérn covariances on a domain D, which makes this approach difficult to analyze and implement in our setting. Let us also mention the construction in [18], where an orthonormal basis for the RKHS is built by a direct Gram–Schmidt process, but generally without size and localization bounds on the resulting basis functions.

In the present paper, we propose an approach where the complications that may arise due to the geometry of D are circumvented by performing a periodic continuation of the random field b on a larger torus \({\mathbb {T}}\). The existence of such a continuation is discussed in Sect. 2. This leads to a simple construction of KL-type and wavelet-type expansions, by restrictions to D of similar expansions defined on \({\mathbb {T}}\). While the systems introduced on \({\mathbb {T}}\) are bases, their restrictions to D are redundant frames of the RKHS, the amount of redundancy being essentially reflected by the ratio \(|{\mathbb {T}}| / |D|\). Our construction thus achieves numerical simplicity at the price of redundancy. Here, D is a general bounded domain with no particular regularity assumption.

The periodic continuation has some parallels to circulant embedding, proposed independently in [14] and [28] as an algebraic technique for evaluating a stationary Gaussian random field, given by k, at the points of a uniform grid. Here, the (block) Toeplitz matrix formed by the grid values of k is embedded into a (block) circulant matrix, enabling a factorization by fast Fourier transform. Although it was shown in [12] that a positive definite Toeplitz matrix can always be embedded into a sufficiently large positive definite circulant matrix, its required size depends on the particular k under consideration, similarly to the size of \({\mathbb {T}}\) in our construction. Under more restrictive assumptions on k, simple procedures produce an embedding into a matrix of size proportional to the original one [14]. This strategy has also been applied in the numerical treatment of lognormal diffusion problems by quasi-Monte Carlo (QMC) methods [19].

Also in the setting of sampling-based methods such as QMC, our results have several potentially advantageous features. They yield a periodically extended random field on all of D, rather than on a uniform spatial grid. Any further approximation (e.g. on unstructured finite element meshes) can thus be adjusted independently of the periodic extension and of a chosen expansion of the random field. Furthermore, the decay properties of the KL eigenvalues of the periodic process are directly controlled by the decay of the Fourier transform of k.

In the case of KL-type expansions, which are studied in Sect. 3, the functions \(\psi _j\) that we obtain are simply the restrictions to D of trigonometric functions. This is an advantage in terms of numerical simplicity compared to the \(L^2(D)\)-orthogonal KL functions of b which may not be easy to compute accurately, and in addition may not satisfy uniform \(L^\infty \) bounds. Wavelet-type expansions are defined in Sect. 4, and we establish their localization and size properties. In the case of Matérn covariances, they correspond to the value \(\alpha =\nu \) in (16), where \(\nu \) is the smoothness parameter in (19). Therefore, corresponding best n-term Hermite approximations converge with algebraic rate \(n^{-s}\) for any \(0<s<\nu /d\). Finally, in Sect. 5 details are given on numerical procedures which may be applied to define the periodic continuation and to construct the resulting wavelets, in the case of Matérn covariance, depending on the parameters \(\lambda ,\nu \).

Remark 1.4

Our efforts in this paper are directed at improving the choice of the basis \((\psi _j)_{j\ge 1}\) in the representation of a Gaussian process, in the sense of achieving the property

$$\begin{aligned} \sup _{x\in D}\sum _{j\ge 1} \rho _j |\psi _j(x)| <\infty \end{aligned}$$
(21)

with \((\rho _j^{-1})\in \ell ^q\) for the smallest possible value of q. While the wavelet bases that we construct are superior to KL bases in that particular sense, in the case of Matérn processes, an open question is whether they are optimal among all possible representations. Closely related is the question of the choice of a representation that optimizes the convergence of the truncated random series \(b_n=\sum _{j=1}^n y_j \psi _j\) in the \(L^\infty \) norm towards the random field b. While KL bases are known to be optimal in the sense of minimizing the mean square error \(\mathbb {E}(\Vert b-b_n\Vert _{L^2}^2)\), they are generally suboptimal when replacing \(L^2\) by \(L^\infty \) in the norm measuring the truncation error. The condition (21) can be used to estimate the convergence \(\Vert b-b_n\Vert _{L^\infty }\) in various ways, for example in the almost sure sense. Indeed, we may write

$$\begin{aligned} \Vert b-b_n\Vert _{L^\infty } \le C \sup _{j>n} \rho _j^{-1} |y_j|, \quad C:= \sup _{x\in D}\sum _{j\ge 1} \rho _j |\psi _j(x)|. \end{aligned}$$
(22)

Next, if \(\omega _j\) is a slowly increasing sequence such that \(\sum _{j\ge 1} \exp (-\omega _j^2)<\infty \) (for example, take \(\omega _j=\sqrt{2\ln j+1}\)), then it is easily checked that \(\sup _{j\ge 1} \omega _j^{-1} |y_j|\) is finite with probability one. This allows us to derive that, almost surely,

$$\begin{aligned} \sup _{n\ge 1} \alpha _n^{-1} \Vert b-b_n\Vert _{L^\infty } <\infty , \quad \alpha _n:=\sup _{j>n} \rho _j^{-1} \omega _j. \end{aligned}$$
(23)

If \((\rho _j^{-1})\in \ell ^q\), and assuming without loss of generality that the \(\rho _j\) are in increasing order, then it follows that \(\rho _j^{-1}\le Cj^{-1/q}\) and therefore we obtain the almost sure convergence rate \(n^{-s}\) in \(L^\infty \) for any \(0<s<\frac{1}{q}\). In the case of Matérn covariances, using the constructed wavelet bases, we obtain such convergence rates for all values of \(\nu >0\), which does not seem to be achievable when using KL bases. It is again an open question to understand whether wavelet bases provide the best possible convergence rate.

Notation: Throughout the paper, as in the above introduction, we use C to denote a constant that may change between occurrences (even in a chain of inequalities), and when necessary we indicate its value or the parameters on which it depends.

2 Periodic Continuation of a Stationary Process

Let \((b(x))_{x\in \mathbb {R}^d}\) be a real-valued, stationary and centered Gaussian process defined on \(\mathbb {R}^d\), whose covariance is of the form

$$\begin{aligned} \mathbb {E}(b(x)b(x'))=k(x-x'), \end{aligned}$$
(24)

where k is a real-valued and even function which is the inverse Fourier transform of a non-negative function \(\hat{k}\). We work under the assumption that \(\hat{k}\) is such that

$$\begin{aligned} 0\le \hat{k}(\omega ) \le C(1+|\omega |^2)^{-r},\quad \omega \in \mathbb {R}^d, \end{aligned}$$
(25)

for some \(r>d/2\) and \(C>0\). Obviously, the Matérn covariances (18) satisfy this assumption with \(r=\nu +d/2\) and C depending on \((d,\lambda ,\nu )\). We consider the restricted process \((b(x))_{x\in D}\) defined on the bounded domain D of interest.

We aim for representations of the general form (2) where the \(y_j\) are i.i.d. \({\mathcal {N}}(0,1)\) and the \((\psi _j)_{j\ge 1}\) are a given sequence of functions defined on D. As explained in the introduction, one natural choice is \(\psi _j=\sqrt{\lambda _j}\varphi _j\), where \((\varphi _j,\lambda _j)\) are the eigenfunctions and eigenvalues of the covariance operator. However, this renormalized KL representation may not meet our requirements due to the possibly global support of the \(\psi _j\) and due to the slow decay of their \(L^\infty \) norms, while other representations could be more appropriate.

Our strategy for deriving better representations of the process over D is to view it as the restriction to D of a periodic stationary Gaussian process \(b_\mathrm{p}\) defined on a suitable larger torus \({\mathbb {T}}\). As a consequence, any representation

$$\begin{aligned} b_\mathrm{p}= \sum _{j \ge 1} y_j \tilde{\psi }_j, \end{aligned}$$
(26)

with \(y_j\) i.i.d. \({\mathcal {N}}(0,1)\) and \((\tilde{\psi }_j)_{j\ge 1}\) a given system of functions, yields a representation

$$\begin{aligned} b = \sum _{j \ge 1} y_j \psi _j, \quad \psi _j := \tilde{\psi }_j|_D. \end{aligned}$$
(27)

The construction of \(b_\mathrm{p}\) requires additional assumptions on the covariance function k.

Let \(\delta := \mathrm{diam}(D)\), so that in a suitable coordinate system, D can be embedded into the box \([-\frac{\delta }{2}, \frac{\delta }{2}]^d\). We want to construct a periodic process \((b_\mathrm{p}(x))_{x\in {\mathbb {T}}}\) on a torus \({\mathbb {T}}= [-\gamma ,\gamma ]^d\) with \(\gamma >\delta \), whose restriction \(b_\mathrm{p}|_D\) on D is such that \(b_\mathrm{p}|_D\sim b\), that is, \(b_\mathrm{p}|_D\) and b share the same law. This is feasible provided that we can find an even and \({\mathbb {T}}\)-periodic function \(k_\mathrm{p}\) which agrees with k over \([-\delta ,\delta ]^d\) and such that the Fourier coefficients

$$\begin{aligned} c_n(k_\mathrm{p}):= \int _{\mathbb {T}}k_\mathrm{p}(z)\,e^{-i\frac{\pi }{\gamma }n\cdot z}\,dz, \quad n\in \mathbb {Z}^d, \end{aligned}$$
(28)

are non-negative. In addition, we would like these Fourier coefficients to have a rate of decay similar to that of the function \(\hat{k}\), that is,

$$\begin{aligned} 0\le c_n(k_\mathrm{p})\le C(1+|n|^2)^{-r}, \quad n\in \mathbb {Z}^d, \end{aligned}$$
(29)

for some \(C>0\). Note that \(k_\mathrm{p}\) generally differs from the periodization \(\sum _{n\in \mathbb {Z}^d} k(\cdot +2\gamma n)\), which corresponds to a periodic process that does not agree with b on D.

One natural way of constructing the function \(k_\mathrm{p}\) is by truncation and periodization: first we choose a sufficiently smooth and even cutoff function \(\phi :\mathbb {R}^d\rightarrow \mathbb {R}\), to be specified further, such that \(\phi |_{[-\delta ,\delta ]^d} = 1\) and \(\phi (x)=0\) for \(x\notin [-\kappa ,\kappa ]^d\) where \(\kappa :=2\gamma - \delta \), and define the truncation

$$\begin{aligned} k_\mathrm{t}(z):=k(z)\,\phi (z). \end{aligned}$$
(30)

We now define \(k_\mathrm{p}\) as the periodization of \(k_\mathrm{t}\), that is,

$$\begin{aligned} k_\mathrm{p}(z) = \sum _{n\in \mathbb {Z}^d} k_\mathrm{t}(z+2\gamma n) . \end{aligned}$$
(31)

Obviously, \(k_\mathrm{p}\) agrees with k over \([-\delta ,\delta ]^d\), and

$$\begin{aligned} c_n(k_\mathrm{p})=\hat{k}_\mathrm{t}\left( \frac{\pi }{\gamma }n\right) . \end{aligned}$$
(32)

Therefore (29) follows if we can establish

$$\begin{aligned} 0\le \hat{k}_\mathrm{t}(\omega ) \le C(1+|\omega |^2)^{-r},\quad \omega \in \mathbb {R}^d, \end{aligned}$$
(33)

for some \(C>0\). Since we have

$$\begin{aligned} \hat{k}_\mathrm{t}=(2\pi )^{-d}\, \hat{k} * \hat{\phi }, \end{aligned}$$
(34)

it is easily seen that the upper inequality in (33) follows from the upper inequality in (25), provided that \(\phi \) is chosen sufficiently smooth such that

$$\begin{aligned} |\hat{\phi }(\omega )| \le C(1+|\omega |^2)^{-r}, \end{aligned}$$
(35)

for some \(C>0\). Indeed, combining (34) with (25) and (35), we obtain

$$\begin{aligned} (2\pi )^d |\hat{k}_\mathrm{t}(\omega )|&\le \left| \int _{|\xi | \le |\omega |/2}\hat{k}(\xi )\, \hat{\phi }(\omega -\xi )\,d\xi \right| +\left| \int _{|\xi | \ge |\omega |/2}\hat{k}(\xi )\, \hat{\phi }(\omega -\xi )\, d\xi \right| \nonumber \\&\le \Vert \hat{k}\Vert _{L^1} \max _{|\xi |\ge |\omega |/2} |\hat{\phi }(\xi )| + \Vert \hat{\phi }\Vert _{L^1} \max _{|\xi |\ge |\omega |/2}|\hat{k}(\xi )|\nonumber \\&\le C(1+|\omega |^2)^{-r}. \end{aligned}$$
(36)
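The truncation and periodization steps (30) and (31), together with the identity (32), can be sketched in one dimension as follows. The sketch assumes the Matérn covariance with \(\nu =\frac{1}{2}\) and \(\lambda =1\), that is, \(k(x)=e^{-|x|}\), and a \(C^2\) quintic cutoff. Both are illustrative choices, as are all function names, and the smoothness of this cutoff is not claimed to meet the requirement (35) for a given r.

```python
import math


def smooth_step(t):
    """C^2 quintic step: 0 for t <= 0, 1 for t >= 1."""
    t = min(max(t, 0.0), 1.0)
    return t * t * t * (10.0 - 15.0 * t + 6.0 * t * t)


def cutoff(x, delta, kappa):
    """Even cutoff phi: 1 on [-delta, delta], 0 outside [-kappa, kappa]."""
    return 1.0 - smooth_step((abs(x) - delta) / (kappa - delta))


def k_exp(x):
    """Matern covariance with nu = 1/2 and lambda = 1 (assumed example)."""
    return math.exp(-abs(x))


def k_trunc(x, delta, kappa):
    """Truncated covariance (30)."""
    return k_exp(x) * cutoff(x, delta, kappa)


def k_per(x, delta, gamma):
    """Periodized covariance (31) for x in [-gamma, gamma]. Since k_trunc
    vanishes outside [-kappa, kappa] and kappa < 2 gamma, only the shifts
    n = -1, 0, 1 can contribute there."""
    kappa = 2.0 * gamma - delta
    return sum(k_trunc(x + 2.0 * gamma * n, delta, kappa) for n in (-1, 0, 1))


def trapezoid(func, a, b, step=0.001):
    n_steps = int(round((b - a) / step))
    h = (b - a) / n_steps
    total = 0.5 * (func(a) + func(b))
    for i in range(1, n_steps):
        total += func(a + i * h)
    return h * total


def coeff_periodic(n, delta, gamma):
    """c_n(k_p) from (28); k_p is even, so only the cosine part remains."""
    return trapezoid(lambda z: k_per(z, delta, gamma) * math.cos(math.pi * n * z / gamma),
                     -gamma, gamma)


def coeff_truncated(n, delta, gamma):
    """Fourier transform of k_trunc at pi n / gamma, the right side of (32)."""
    kappa = 2.0 * gamma - delta
    return trapezoid(lambda z: k_trunc(z, delta, kappa) * math.cos(math.pi * n * z / gamma),
                     -kappa, kappa)


if __name__ == "__main__":
    delta, gamma = 1.0, 2.0
    for n in range(6):
        print(n, coeff_periodic(n, delta, gamma), coeff_truncated(n, delta, gamma))
```

The printed coefficients can be inspected for their sign, but the sketch makes no claim that they are non-negative for this particular choice of \(\gamma \) and cutoff.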

The main problem is to guarantee the lower inequality in (33), that is, the non-negativity of \(\hat{k}_\mathrm{t}\). Note that \(\hat{\phi }\) cannot be non-negative: since \(1=\phi (0)=(2\pi )^{-d}\int _{\mathbb {R}^d} \hat{\phi }(\omega ) d\omega \), the non-negativity of \(\hat{\phi }\) would imply that

$$\begin{aligned} |\phi (x)|=(2\pi )^{-d}\left| \int _{\mathbb {R}^d} \hat{\phi }(\omega )\,e^{ix \cdot \omega }\, d\omega \right| <1, \quad x\ne 0, \end{aligned}$$
(37)

therefore contradicting the assumption \(\phi |_{[-\delta ,\delta ]^d} = 1\).

It follows that for any such \(\phi \), the convolution operator

$$\begin{aligned} v\mapsto v*\hat{\phi }, \end{aligned}$$
(38)

does not preserve positivity for all functions v. Here, we are only interested in preserving positivity for the particular function \(\hat{k}\). However, the following result shows that this is in general not feasible under the assumption (25) alone.

Theorem 2.1

For any \(r>d/2\), there exists an even function k that satisfies (25) and such that for any \(\phi \) satisfying \(|\hat{\phi }(\omega )| \le C(1+|\omega |^2)^{-s}\) for some \(s>r\), \(\phi |_{[-\delta ,\delta ]^d} = 1\), \(\phi (x)=0\) for \(x\notin [-\kappa ,\kappa ]^d\) for some \(\kappa >\delta \), the function \(\hat{k}_\mathrm{t}=(2\pi )^{-d}\,\hat{k} * \hat{\phi }\) is not non-negative.

Proof

Let h be a non-negative, smooth, even function on \(\mathbb {R}^d\) with \(h(0) = 1\) and support contained in the unit ball. For \(\ell \in \mathbb {N}\), we choose arbitrary but fixed \(\omega _\ell \in \mathbb {R}^d\) such that \(|\omega _\ell | = 2^\ell \). We now define k by its Fourier transform as

$$\begin{aligned} \hat{k}(\omega ) := \sum _{\ell \ge 1} 2^{-2r\ell } \Bigl ( h\bigl (\ell (\omega - \omega _\ell )\bigr ) + h\bigl (\ell (\omega + \omega _\ell )\bigr ) \Bigr ). \end{aligned}$$
(39)

Then clearly, (25) is satisfied. As demonstrated above, there exists \(\omega ^*\in \mathbb {R}^d\) such that \(\hat{\phi }(\omega ^*)<0\). For \(\ell > 1\), consider

$$\begin{aligned} \hat{k}_\mathrm{t}(\omega ^* + \omega _\ell ) = (2\pi )^{-d} \int _{\mathbb {R}^d} \hat{\phi }(\omega ^* - \xi ) \,\hat{k}(\xi + \omega _\ell )\,d\xi = (2\pi )^{-d} \bigl ( I_1(\ell ) + I_2(\ell ) \bigr ), \end{aligned}$$
(40)

where

$$\begin{aligned} I_1(\ell )&:=\int _{\mathbb {R}^d} \hat{\phi }(\omega ^* - \xi ) 2^{-2r\ell } h(\ell \xi ) \,d\xi = \ell ^{-d} 2^{-2r\ell }\int _{\mathbb {R}^d} \hat{\phi }(\omega ^* - \ell ^{-1} \xi ) \, h(\xi ) \,d\xi ,\\ I_2(\ell )&:= \int _{\mathbb {R}^d} \hat{\phi }(\omega ^* - \xi ) \bigl ( \hat{k}(\xi + \omega _\ell ) - 2^{-2r\ell } h(\ell \xi ) \bigr ) \,d\xi . \end{aligned}$$

On the one hand,

$$\begin{aligned} \lim _{\ell \rightarrow \infty } \ell ^d 2^{2r\ell } I_1(\ell ) = \hat{\phi }(\omega ^*) \int _{\mathbb {R}^d} h(\xi )\,d\xi <0. \end{aligned}$$
(41)

On the other hand, \(\hat{k} \in L^1(\mathbb {R}^d)\) and \(\hat{k}(\xi + \omega _\ell ) - 2^{-2r\ell } h(\ell \xi )\) vanishes for \(|\xi | \le 2^{\ell -2}\). As a consequence,

$$\begin{aligned} |I_2(\ell )| \le ||\hat{k}||_{L^1(\mathbb {R}^d)} \max _{|\xi | \ge 2^{\ell -2}} |\hat{\phi }(\omega ^* - \xi )| . \end{aligned}$$
(42)

For \(\ell \) such that \(2^{\ell -2} > 2|\omega ^*|\), we thus have

$$\begin{aligned} |I_2(\ell )| \le C ||\hat{k}||_{L^1(\mathbb {R}^d)} ( 1+ 2^{2\ell -6})^{-s}. \end{aligned}$$
(43)

Therefore

$$\begin{aligned} \lim _{\ell \rightarrow \infty } \ell ^d 2^{2r\ell } |I_2(\ell )|=0. \end{aligned}$$
(44)

As a consequence, \(\hat{k}_\mathrm{t}(\omega ^* + \omega _\ell ) < 0\) for sufficiently large \(\ell \). \(\square \)

The above counterexample reveals that further assumptions are needed on the covariance function k. Specifically, we work under the stronger assumptions

$$\begin{aligned} c(1+|\omega |^2)^{-s}\le \hat{k}(\omega )\le C(1+|\omega |^2)^{-r}, \end{aligned}$$
(45)

for some \(s\ge r >d/2\) and \(0<c\le C\), and

$$\begin{aligned} \lim _{R\rightarrow \infty } \int _{|x|>R} | \partial ^\alpha k (x) | \,dx = 0, \quad |\alpha | \le 2\lceil s\rceil , \end{aligned}$$
(46)

where \(\lceil s\rceil \) is the smallest integer greater than or equal to s.

Remark 2.2

In the case of the Matérn covariance (18) with parameters \(\nu ,\lambda > 0\), the assumption (45) holds with \(s=r = \nu +d/2\) as a consequence of (19). The assumption (46) actually holds for all derivation orders \(\alpha \), as a consequence of the exponential decay of the modified Bessel functions of the second kind \(K_\nu \) and of their derivatives. This exponential decay can be seen, for example, from the integral representation

$$\begin{aligned} K_\nu (x) = \int _0^\infty e^{-x \cosh t} \cosh \nu t \, dt, \end{aligned}$$
(47)

see [26, p. 181]. While in the present paper we focus on the example of Matérn covariances, there are of course other relevant processes that satisfy (45), in particular anisotropic processes. As an example consider a d-dimensional stationary process with anisotropic covariance given by

$$\begin{aligned} k(x)=k_{\nu _1,\lambda _1}(x_1)\dots k_{\nu _d,\lambda _d}(x_d), \end{aligned}$$
(48)

where \(k_{\nu ,\lambda }\) stands for the one-dimensional Matérn covariance as given in (18). Taking the Fourier transform, it is easily seen that (45) again holds, however with values \(s>r\) in contrast to the isotropic case.
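The integral representation (47) also gives a simple way to evaluate \(K_\nu \) numerically, and hence the Matérn covariance (18). The following sketch applies the trapezoidal rule to (47). The truncation point `t_max` and the step size are assumptions that are adequate for arguments that are not too close to zero, and the function names are illustrative.

```python
import math


def bessel_k(nu, x, t_max=12.0, step=0.01):
    """K_nu(x) for x > 0 from (47), by the trapezoidal rule on [0, t_max]."""
    n_steps = int(round(t_max / step))
    total = 0.0
    for i in range(n_steps + 1):
        t = i * step
        weight = 0.5 if i in (0, n_steps) else 1.0
        total += weight * math.exp(-x * math.cosh(t)) * math.cosh(nu * t)
    return step * total


def matern(x, nu, lam):
    """Matern covariance (18); the value at x = 0 is the limit 1."""
    if x == 0.0:
        return 1.0
    z = math.sqrt(2.0 * nu) * abs(x) / lam
    return 2.0 ** (1.0 - nu) / math.gamma(nu) * z ** nu * bessel_k(nu, z)


if __name__ == "__main__":
    # Known special case K_{1/2}(x) = sqrt(pi / (2 x)) exp(-x).
    for x in (0.3, 1.0, 2.5):
        print(x, bessel_k(0.5, x), math.sqrt(math.pi / (2.0 * x)) * math.exp(-x))
    print(matern(0.4, 1.5, 0.7))
```

The factor \(e^{-x\cosh t}\) decays doubly exponentially in t, which is why a moderate truncation point suffices and why the exponential decay of \(K_\nu (x)\) in x is visible directly from the integrand.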

Theorem 2.3

Let k be an even function such that (45) and (46) hold. Then for \(\kappa > \delta \) sufficiently large, there exists \(\phi \) satisfying (35), \(\phi |_{[-\delta ,\delta ]^d} = 1\), and \(\phi (x)=0\) for \(x\notin [-\kappa ,\kappa ]^d\) such that \(\hat{k}_\mathrm{t}= (2\pi )^{-d}\,\hat{k} * \hat{\phi }\) is positive.

Proof

We first choose a function \(\phi _{2\delta } \in C^{2p}(\mathbb {R}^d)\) with \(p := \lceil s\rceil \), supported on \([-2\delta ,2\delta ]^d\) and such that \({\phi _{2\delta }}|_{[-\delta ,\delta ]^d} = 1\). Then for each \(\kappa \ge 2\delta \), we define

$$\begin{aligned} \phi _\kappa (x)=\phi _{2\delta }(2\delta x/\kappa ). \end{aligned}$$
(49)

We thus have, for all \(\kappa \ge 2\delta \),

$$\begin{aligned} \max _{|\alpha |\le 2p} ||\partial ^\alpha \phi _\kappa ||_{L^\infty } \le \max _{|\alpha |\le 2p} ||\partial ^\alpha \phi _{2\delta }||_{L^\infty } =: M<\infty . \end{aligned}$$
(50)

Note that \(\phi _\kappa \) satisfies (35) with a constant C that depends on \(\kappa \). For a value of \(\kappa \) to be fixed later, we take \(\phi :=\phi _\kappa \), and let \(\theta := 1 - \phi \). Then

$$\begin{aligned} \hat{k}_\mathrm{t}(\omega ) = \hat{k}(\omega ) - \widehat{k\theta }(\omega ) \ge c(1 + |\omega |^2)^{-s} - \widehat{k\theta }(\omega ), \end{aligned}$$
(51)

and

$$\begin{aligned} \bigl |\widehat{k\theta }(\omega )\bigr | \le ( 1 + |\omega |^2)^{-p} \int _{\mathbb {R}^d} |(I - \Delta )^{p} ( k \theta ) | \,dx \le ( 1 + |\omega |^2)^{-s} \int _{\mathbb {R}^d} |(I - \Delta )^{p} ( k \theta ) | \,dx. \end{aligned}$$
(52)

By repeated application of Leibniz’ rule and separately bounding each term, one finds

$$\begin{aligned} \int _{\mathbb {R}^d} |(I - \Delta )^{p} ( k \theta ) | \,dx \le C(\kappa ) := (1 + 4d)^{p} M \max _{|\alpha | \le 2p} \int _{|x|>\kappa /2} | \partial ^\alpha k (x) | \,dx. \end{aligned}$$
(53)

Since \(\lim _{\kappa \rightarrow \infty } C(\kappa ) = 0\) by (46), it follows that \(\hat{k}_\mathrm{t}\) is positive for \(\kappa \) chosen large enough such that \(C(\kappa )< c\). \(\square \)

Remark 2.4

From the proof of Theorem 2.3, for a given k and a choice of \(\phi _{2\delta }\) one may also extract an upper bound for the required size of \(\kappa \) (and hence \(\gamma \)) from the decay of the right side of (53); note that here, p and M also depend on k via s. For the case of Matérn covariances, the resulting requirements on \(\gamma \) are illustrated numerically in Fig. 1 of Sect. 5.1.

3 Karhunen–Loève Representations

Let us recall that the standard Karhunen–Loève (KL) decomposition of the stationary Gaussian process b has the form (2) with

$$\begin{aligned} \psi _j=\psi _j^\mathrm{KL}:= \sqrt{ \lambda _j}\, \varphi _j, \end{aligned}$$
(54)

where \((\lambda _j)_{j\ge 1}\) is the sequence of positive eigenvalues of the covariance operator

$$\begin{aligned} T: v \mapsto Tv=\int _D k(\cdot -x) \,v(x)\,dx, \end{aligned}$$
(55)

arranged in decreasing order, and \((\varphi _j)_{j\ge 1}\) is the associated \(L^2(D)\)-orthonormal basis of eigenfunctions.

Working under assumptions (45) and (46), the periodized construction described in the previous section provides us with an alternative decomposition based on the covariance operator associated to the periodized process \(b_\mathrm{p}\), that is,

$$\begin{aligned} T_\mathrm{p}: v \mapsto T_\mathrm{p} v=\int _{{\mathbb {T}}} k_\mathrm{p}(\cdot -x)\, v(x)\, dx, \end{aligned}$$
(56)

with \({\mathbb {T}}=[-\gamma ,\gamma ]^d\). The \(L^2({\mathbb {T}})\)-orthonormal eigenfunctions of this operator are explicitly given by the trigonometric functions

$$\begin{aligned} \theta _n(z):= t_{n_1}(z_1)\cdots t_{n_d}(z_d), \quad n=(n_1,\dots ,n_d) \in \mathbb {N}_0^d, \end{aligned}$$
(57)

where \(t_0(z)=(2\gamma )^{-1/2}\) and

$$\begin{aligned} t_{2m}(z)=\gamma ^{-1/2}\cos \left( \frac{m\pi z}{\gamma }\right) \quad \mathrm{and }\quad t_{2m-1}(z)=\gamma ^{-1/2}\sin \left( \frac{m\pi z}{\gamma }\right) , \quad m\ge 1. \end{aligned}$$
(58)
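As a quick sanity check of the normalization in (58), the following sketch computes the Gram matrix of the first few univariate functions \(t_n\) on \([-\gamma ,\gamma ]\) with a midpoint rule, which is exact up to rounding for low-degree trigonometric polynomials. The value of \(\gamma \), the number of quadrature nodes and the function names are illustrative assumptions.

```python
import math

def t(n, z, gamma):
    """Univariate trigonometric function t_n from (58) on [-gamma, gamma]."""
    if n == 0:
        return (2.0 * gamma) ** -0.5
    m = (n + 1) // 2  # n = 2m gives a cosine, n = 2m - 1 gives a sine
    arg = m * math.pi * z / gamma
    return gamma ** -0.5 * (math.cos(arg) if n % 2 == 0 else math.sin(arg))

def gram(n_max, gamma, q=64):
    """Gram matrix of t_0, ..., t_{n_max} in L^2(-gamma, gamma), midpoint rule with q nodes."""
    h = 2.0 * gamma / q
    nodes = [-gamma + (i + 0.5) * h for i in range(q)]
    return [[h * sum(t(a, z, gamma) * t(b, z, gamma) for z in nodes)
             for b in range(n_max + 1)]
            for a in range(n_max + 1)]

G = gram(4, gamma=1.5)
for row in G:
    print([round(v, 12) for v in row])
```

The multivariate functions \(\theta _n\) in (57) are tensor products of these factors, so their orthonormality in \(L^2({\mathbb {T}})\) follows from the univariate case.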

The eigenvalues are related to the Fourier coefficients \(c_n(k_\mathrm{p})\) defined in (28). For the above eigenfunction \(\theta _n\), with \(n=(n_1,\dots ,n_d)\) and each \(n_i\) being either of the form \(2m_i\) or \(2m_i-1\), the corresponding eigenvalue is

$$\begin{aligned} c_m(k_\mathrm{p}), \quad m=(m_1,\dots ,m_d). \end{aligned}$$
(59)

We denote by \((\lambda _{\mathrm {p}, j})_{j\ge 1}\) a decreasing rearrangement of these eigenvalues, with corresponding eigenfunctions \((\varphi _{\mathrm {p}, j})_{j\ge 1}\). We may thus write

$$\begin{aligned} b_\mathrm{p}=\sum _{j\ge 1} y_j \psi _{\mathrm {p}, j}, \end{aligned}$$
(60)

where the \(y_j\) are i.i.d. \({\mathcal {N}}(0,1)\) and

$$\begin{aligned} \psi _{\mathrm {p}, j}:=\sqrt{\lambda _{\mathrm {p}, j}}\,\varphi _{\mathrm {p}, j}. \end{aligned}$$
(61)

Since \(b_\mathrm{p}\sim b\) on D, this yields a decomposition of b by restriction, that is, taking

$$\begin{aligned} \psi _j=\psi _j^\mathrm{R}:=\sqrt{\lambda _{\mathrm {p}, j}}\,\varphi _{\mathrm {p}, j}|_{D}. \end{aligned}$$
(62)

From (29), we obtain that, for some \(C>0\),

$$\begin{aligned} \#\{n\in \mathbb {Z}^d \,: \, |c_n(k_\mathrm{p})| \ge \eta \} \le C\eta ^{-\frac{d}{2r}}, \quad \eta >0. \end{aligned}$$
(63)

Thus, by decreasing rearrangement, we obtain a decay estimate of the form

$$\begin{aligned} \lambda _{\mathrm {p}, j} \le C j^{-\frac{2r}{d}},\quad j\ge 1, \end{aligned}$$
(64)

for some \(C>0\). A first observation is that a similar decay estimate holds for the original eigenvalues \(\lambda _j\).

Theorem 3.1

If the covariance function satisfies (45) and (46), then we have

$$\begin{aligned} \lambda _j \le C j^{-\frac{2r}{d}},\quad j\ge 1, \end{aligned}$$
(65)

for some \(C>0\).

Proof

We denote by

$$\begin{aligned} V_n = {{\mathrm{span}}}\{ \varphi _1,\ldots ,\varphi _n \}, \end{aligned}$$
(66)

the spaces generated by the Karhunen–Loève functions. These spaces satisfy the optimality property

$$\begin{aligned} \sum _{j>n} \lambda _j= \mathbb {E}\bigl ( ||b - P_{V_n} b||^2_{L^2(D)} \bigr )=\min _{\dim (V)=n} \mathbb {E}\bigl ( ||b - P_V b||^2_{L^2(D)} \bigr ), \end{aligned}$$
(67)

where \(P_V\) is the \(L^2(D)\)-orthogonal projector. With

$$\begin{aligned} V_{\mathrm {p},n} := {{\mathrm{span}}}\{ \varphi _{\mathrm {p},1},\ldots ,\varphi _{\mathrm {p},n}\}, \end{aligned}$$
(68)

the spaces generated by the KL functions of \(b_\mathrm{p}\), we denote by

$$\begin{aligned} W_n:= {{\mathrm{span}}}\{ \varphi _{\mathrm {p},1}|_{D},\ldots ,\varphi _{\mathrm {p},n}|_{D}\} \end{aligned}$$
(69)

their restriction to D. We thus have

$$\begin{aligned} \sum _{j>n} \lambda _j\le \mathbb {E}\bigl ( ||b - P_{W_n} b||^2_{L^2(D)} \bigr ), \end{aligned}$$
(70)

and since b and \(b_\mathrm{p}\) agree on D, it follows that

$$\begin{aligned} \sum _{j>n} \lambda _j\le \mathbb {E}\bigl ( ||b_\mathrm{p}- P_{V_{\mathrm {p},n}} b_\mathrm{p}||^2_{L^2({\mathbb {T}})} \bigr ), \end{aligned}$$
(71)

where \(P_{V_{\mathrm {p},n}}\) is the \(L^2({\mathbb {T}})\)-orthogonal projector. Therefore

$$\begin{aligned} \sum _{j>n} \lambda _j\le \sum _{j>n}\lambda _{\mathrm {p}, j} \le Cn^{1-\frac{2r}{d}}, \quad n\ge 1, \end{aligned}$$
(72)

where the second inequality follows from (64). Since the \(\lambda _j\) are positive non-increasing, we obtain

$$\begin{aligned} \lambda _{j}\le \frac{2}{j} \sum _{\lfloor \frac{j}{2}\rfloor <l\le j}\lambda _{l} \le \frac{2}{j} \sum _{l>\lfloor \frac{j}{2}\rfloor }\lambda _{l} \le Cj^{-\frac{2r}{d}}, \end{aligned}$$
(73)

which is (65). \(\square \)
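The two elementary steps used above, namely the decay of a decreasing rearrangement as in (64) and the averaging inequality in (73), can be illustrated on a model sequence. The sketch below takes the hypothetical coefficients \((1+|n|^2)^{-r}\) on a finite box in \(\mathbb {Z}^2\) as a stand-in for \(c_n(k_\mathrm{p})\); the values of r and of the box size K are assumptions made only for this illustration.

```python
def sorted_coefficients(r, K):
    """Decreasing rearrangement of the model values (1+|n|^2)^(-r), n in [-K, K]^2."""
    vals = [(1.0 + n1 * n1 + n2 * n2) ** (-r)
            for n1 in range(-K, K + 1)
            for n2 in range(-K, K + 1)]
    return sorted(vals, reverse=True)

d, r = 2, 1.5
lam = sorted_coefficients(r, K=20)

# Decay as in (64): j^(2r/d) * lambda_j should stay bounded (indices are 1-based).
ratios = [(j + 1) ** (2.0 * r / d) * lam[j] for j in range(len(lam))]
print("max of j^(2r/d) * lambda_j:", max(ratios))

# Averaging step in (73): lambda_j <= (2/j) * sum of lambda_l over floor(j/2) < l <= j.
def averaging_bound(j):
    return 2.0 / j * sum(lam[j // 2:j])

print(all(lam[j - 1] <= averaging_bound(j) + 1e-12 for j in range(1, 201)))
```

The averaging inequality only uses that the sequence is positive and non-increasing, which is why it applies to the \(\lambda _j\) in the proof without any further information on them.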

Remark 3.2

In the case of the Matérn covariance, in view of (19), we therefore obtain (65) with the value \(r:=\nu +d/2\). This estimate was derived in [20] by a different approach, using the theory developed by Widom for the eigenvalues of convolution-type operators. This theory makes the assumption that \(\hat{k}\) is unimodal in each variable, see [27, p. 290], which holds for Matérn covariances, but is not needed in the above construction based on assumptions (45) and (46).

One interest of using the representation based on the functions \(\psi ^\mathrm{R}_j\) defined by restriction according to (62) is that the functions \(\varphi _{\mathrm {p}, j}\) are explicitly given by tensorized trigonometric functions. In particular, they are uniformly bounded. It follows that

$$\begin{aligned} \Vert \psi _j^\mathrm{R}\Vert _{L^\infty } \le C\lambda _{\mathrm {p}, j}^{1/2},\quad j\ge 1, \quad \end{aligned}$$
(74)

and therefore, by (64),

$$\begin{aligned} \Vert \psi _j^\mathrm{R}\Vert _{L^\infty } \le Cj^{-\frac{r}{d}},\quad j\ge 1. \end{aligned}$$
(75)

In contrast, such uniform bounds for the \(L^\infty (D)\) norms are generally not available for the KL functions \(\varphi _j\), which are in addition not easily computable in the case of a general multivariate domain.

In the particular case of Matérn covariances, the \(L^\infty \) norms of these functions have been estimated in [20], by arguments which use their natural connections with Hilbertian Sobolev spaces. We give below a similar argument which improves on the estimates established in [20]. Fixing an s such that \(\frac{d}{2}< s < r=\nu +d/2\), and assuming that the domain D satisfies the uniform cone condition, we may use Sobolev embedding to obtain

$$\begin{aligned} ||\varphi _j||_{L^\infty (D)} \le C||\varphi _j||_{H^s(D)}. \end{aligned}$$
(76)

We then find by interpolation inequalities between Sobolev spaces that

$$\begin{aligned} ||\varphi _j||_{L^\infty (D)} \le C||\varphi _j||_{L^2(D)}^{1 - s/r}\, ||\varphi _j||_{H^r(D)}^{s/r}=C||\varphi _j||_{H^r(D)}^{s/r} . \end{aligned}$$
(77)

In order to estimate the \(H^r(D)\) norms of the functions \(\varphi _j\), we use the following bound for the covariance operator T: for any \(v\in L^2(D)\), denoting by w its extension by zero to \(\mathbb {R}^d\), we have

$$\begin{aligned} \Vert Tv\Vert _{H^r(D)}^2&= \Vert (k*w)|_D\Vert _{H^r(D)}^2 \le \Vert k*w\Vert _{H^r(\mathbb {R}^d)}^2\\&= C\int _{\mathbb {R}^d}(1+|\omega |^2)^{r} |\hat{k}(\omega ) \hat{w}(\omega )|^2 d\omega \\&\le C\int _{\mathbb {R}^d}(1+|\omega |^2)^{-r} |\hat{w}(\omega )|^2 d\omega \\&\le C\int _{\mathbb {R}^d} \hat{k}(\omega ) |\hat{w}(\omega )|^2 d\omega \\&= C\langle k*w,w\rangle _{L^2(\mathbb {R}^d)} \le C \Vert Tv\Vert _{L^2(D)}\Vert v\Vert _{L^2(D)}, \end{aligned}$$

where we have used the characterization of Hilbertian Sobolev spaces by Fourier transforms and the particular form of the Matérn covariance (recall that by convention the constant C is allowed to change value in the above chain of inequalities). Taking \(v=\varphi _j\) and using \(T\varphi _j=\lambda _j\varphi _j\), we thus obtain

$$\begin{aligned} ||\varphi _j||_{H^r(D)} \le C\lambda _j^{-1/2}. \end{aligned}$$
(78)

In summary we have obtained the non-uniform bound

$$\begin{aligned} ||\varphi _j||_{L^\infty (D)} \le C \lambda _j^{-\frac{s}{2r}} \end{aligned}$$
(79)

and therefore, by (65),

$$\begin{aligned} ||\psi ^\mathrm{KL}_j||_{L^\infty (D)} \le C j^{-\frac{r-s}{d}}. \end{aligned}$$
(80)

In particular, we may take \(s = \frac{d}{2} + \varepsilon \) for any sufficiently small \(\varepsilon >0\) to obtain

$$\begin{aligned} ||\psi ^\mathrm{KL}_j||_{L^\infty (D)} \le C j^{-\frac{r}{d} + \frac{1}{2} + \varepsilon }. \end{aligned}$$
(81)

This needs to be compared with (64), which in the present case of the Matérn covariance yields

$$\begin{aligned} ||\psi ^\mathrm{R}_j||_{L^\infty (D)} \le C j^{-\frac{r}{d}}, \end{aligned}$$
(82)

since \(||\varphi _{\mathrm {p},j}||_{L^\infty (D)} \le 1\).
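To make the comparison between (81) and (82) concrete, the following short sketch computes both decay exponents for given Matérn smoothness \(\nu \) and dimension d, using \(r=\nu +d/2\). The function name and the default value of \(\varepsilon \) are illustrative assumptions.

```python
def decay_exponents(nu, d, eps=0.01):
    """Return the decay exponents of the sup-norm bounds (82) and (81).

    The bounds read C * j**(-exponent); a larger exponent means faster decay.
    """
    r = nu + d / 2.0
    restricted = r / d            # exponent in (82), restricted periodic basis
    kl_bound = r / d - 0.5 - eps  # exponent in (81), Karhunen-Loeve bound
    return restricted, kl_bound

for nu, d in ((0.5, 1), (1.0, 2), (2.0, 3)):
    print(nu, d, decay_exponents(nu, d))
```

For instance, \(\nu =1\) and \(d=2\) give \(r=2\), so that (82) provides the rate \(j^{-1}\) while (81) only gives \(j^{-1/2+\varepsilon }\); the gap of \(\frac{1}{2}+\varepsilon \) in the exponent is the same for all parameter values.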

Let us mention that in the particular univariate case \(d=1\), numerical experiments seem to indicate that \(||\varphi _j||_{L^\infty (D)}\) stays bounded independently of j, and therefore that the upper bound (79) is not always sharp. On the other hand, one can also exhibit examples of stationary processes such that the corresponding KL functions on the domain D are not uniformly bounded. Take for example the case \(D=[-1,1]\) and k such that \(\hat{k}=\chi _{[-F,F]}\) for which the KL functions \(\varphi _j\) coincide with the univariate prolate spheroidal functions introduced in [25]. It is known that these functions are uniformly close to the Legendre polynomials \(L_j\) as \(j\rightarrow \infty \), which shows that \(||\varphi _j||_{L^\infty (D)}\sim j^{1/2}\), see [5].
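The growth \(j^{1/2}\) of the limiting Legendre polynomials can be verified directly. The sketch below assumes that \(L_j\) denotes the Legendre polynomial of degree j normalized in \(L^2([-1,1])\), that is, \(L_j=\sqrt{j+1/2}\,P_j\) with the classical \(P_j\); it evaluates \(P_j\) by the three-term recurrence and takes the maximum over a grid containing the endpoints. The grid size and function names are illustrative.

```python
import math

def legendre(j, x):
    """Classical Legendre polynomial P_j(x) via the three-term recurrence."""
    p_prev, p = 1.0, x
    if j == 0:
        return p_prev
    for n in range(1, j):
        # (n + 1) P_{n+1}(x) = (2n + 1) x P_n(x) - n P_{n-1}(x)
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p

def sup_norm_normalized(j, grid=2001):
    """Grid approximation of the sup norm on [-1, 1] of sqrt(j + 1/2) * P_j."""
    xs = [-1.0 + 2.0 * i / (grid - 1) for i in range(grid)]
    return math.sqrt(j + 0.5) * max(abs(legendre(j, x)) for x in xs)

for j in (1, 4, 16, 64):
    print(j, sup_norm_normalized(j), math.sqrt(j + 0.5))
```

Since \(|P_j|\le 1\) on \([-1,1]\) with equality at the endpoints, the computed sup norm equals \(\sqrt{j+1/2}\), which grows like \(j^{1/2}\).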

In summary, we have obtained substantially better bounds on the decay of \(||\psi ^\mathrm{R}_j||_{L^\infty (D)}\) than available for \(||\psi ^\mathrm{KL}_j||_{L^\infty (D)}\), and in addition the \(\psi ^\mathrm{R}_j\) are easily computed numerically while this is generally not the case for the \(\psi ^\mathrm{KL}_j\). However, (82) still leads to rather severe restrictions on the values of r for which Theorem 1.1 is applicable via Corollary 1.2, due to the global supports of the functions \(\psi ^\mathrm{R}_j\). In the following section we consider an alternative wavelet-type construction for which Theorem 1.1, with Corollary 1.3, yields an approximation rate for the corresponding solutions u for any \(r>\frac{d}{2}\).

4 Wavelet Representations

Our starting point is an \(L^2(\mathbb {R})\)-orthonormal wavelet basis, that is a basis of the form

$$\begin{aligned} \{\varphi (\cdot -n)\}_{n\in \mathbb {Z}} \cup \{ 2^{\ell /2}\psi (2^\ell \cdot -n)\}_{\ell \ge 0,n\in \mathbb {Z}} \end{aligned}$$
(83)

where \(\varphi \) and \(\psi \) are the scaling function and mother wavelet, respectively. For simplicity we use the Meyer wavelets, whose construction is detailed in [11, 24], and for which

$$\begin{aligned} {{\mathrm{supp}}}(\hat{\varphi })=\biggl [-\frac{4\pi }{3},\frac{4\pi }{3}\biggr ]\quad \mathrm{and}\quad {{\mathrm{supp}}}(\hat{\psi })=\biggl [-\frac{8\pi }{3}, -\frac{2\pi }{3}\biggr ]\cup \biggl [\frac{2\pi }{3},\frac{8\pi }{3}\biggr ]. \end{aligned}$$
(84)

The functions \(\hat{\varphi }\) and \(\hat{\psi }\) may be chosen to be smooth, but for our purpose it will be enough to assume that they have M uniformly bounded derivatives with an integer \( M \ge d+1\).

Denoting \(\psi _0 := \varphi \) and \(\psi _1 := \psi \), the multivariate scaling function and wavelets are defined by

$$\begin{aligned} \Phi (x) := \varphi (x_1)\cdots \varphi (x_d),\qquad \Psi _\varepsilon (x) := \psi _{\varepsilon _1}(x_1)\cdots \psi _{\varepsilon _d}(x_d), \quad \varepsilon \in \mathcal {C}, \end{aligned}$$
(85)

where \(\mathcal {C}:=\{0,1\}^d\setminus \{(0,\ldots ,0)\}\). Then

$$\begin{aligned} \{ \Phi (\cdot - n):n\in \mathbb {Z}^d\} \cup \{ \Psi _{\varepsilon ,n,\ell } :n\in \mathbb {Z}^d , \ell \ge 0, \varepsilon \in \mathcal {C}\}, \end{aligned}$$
(86)

is an orthonormal basis of \(L^2(\mathbb {R}^d)\), where we have used the notation

$$\begin{aligned} \Psi _{\varepsilon ,n,\ell }:=2^{d\ell /2} \Psi _\varepsilon (2^\ell \cdot -n). \end{aligned}$$
(87)

We obtain an orthonormal basis of \(L^2({\mathbb {T}})\) by rescaling and periodization. This basis consists of the constant scaling function

$$\begin{aligned} \Phi ^\mathrm{p}(x) := \sum _{m\in \mathbb {Z}^d} (2\gamma )^{-d/2} \Phi \bigl ((2\gamma )^{-1}x - m\bigr ) = (2\gamma )^{-d/2}, \end{aligned}$$
(88)

and the \({\mathbb {T}}\)-periodic wavelets

$$\begin{aligned} \Psi ^\mathrm{p}_{\varepsilon ,\ell ,n} (x) := \sum _{m\in \mathbb {Z}^d} (2\gamma )^{-d/2} \Psi _{\varepsilon ,n,\ell } \bigl ((2\gamma )^{-1} x - m \bigr ), \end{aligned}$$
(89)

for \(n\in \{0,\dots ,2^{\ell }-1\}^d, \;\ell \ge 0,\; \varepsilon \in \mathcal {C}\). From the Poisson summation formula, and the support properties of \(\hat{\varphi }\) and \(\hat{\psi }\), it is easily seen that the above wavelets, at a given scale level \(\ell \), are finite linear combinations of the Fourier exponentials \(e_n\) with \(\Vert n\Vert _\infty \le 2^{\ell +2}\). In other words, they are trigonometric polynomials of degree at most \(2^{\ell +2}\) in each variable.

We now make the following general observation: let \((g_j)_{j\ge 1}\) be any orthonormal basis of \(L^2({\mathbb {T}})\), with each basis function having the Fourier expansion

$$\begin{aligned} g_j=(2\gamma )^{-d/2}\sum _{n\in \mathbb {Z}^d} c_n(g_j)\,e_n, \end{aligned}$$
(90)

where

$$\begin{aligned} e_n(z):= (2\gamma )^{-d/2} e^{i\frac{\pi }{\gamma }n\cdot z},\quad n\in \mathbb {Z}^d. \end{aligned}$$
(91)

Then, defining the functions \((\bar{g}_j)_{j\ge 1}\) by \(\bar{g}_j:=Sg_j\), where S is the filtering operator

$$\begin{aligned} v\mapsto Sv:=(2\gamma )^{-d/2} \sum _{n\in \mathbb {Z}^d} \sqrt{c_n (k_\mathrm{p})} \,c_n(v)\,e_n, \end{aligned}$$
(92)

we obtain a decomposition

$$\begin{aligned} b_\mathrm{p}= \sum _{j\ge 1} y_j \bar{g}_j, \end{aligned}$$
(93)

where the \(y_j\) are i.i.d. \({\mathcal {N}}(0,1)\), and therefore \(b = \sum _{j\ge 1} y_j \bar{g}_j|_{D}\).
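The reason why any orthonormal basis can be used in (93) is that the covariance of \(\sum _j y_j \bar{g}_j\) equals \(\sum _j \bar{g}_j\otimes \bar{g}_j = SS^*\), independently of the basis. The following sketch checks this in a finite-dimensional analogue, with the torus replaced by a cyclic grid of N points and the Fourier coefficients by a discrete Fourier transform. The model spectrum `c` and the two test bases (the standard basis and a normalized Hadamard basis) are assumptions made only for this illustration.

```python
import cmath
import math

N = 8
# Hypothetical positive, even spectrum standing in for the coefficients c_n(k_p).
c = [1.0 / (1.0 + min(n, N - n) ** 2) for n in range(N)]

def apply_S(v):
    """Discrete analogue of (92): multiply the unitary DFT coefficients of v by sqrt(c_n)."""
    v_hat = [sum(v[x] * cmath.exp(-2j * math.pi * n * x / N) for x in range(N)) / math.sqrt(N)
             for n in range(N)]
    return [sum(math.sqrt(c[n]) * v_hat[n] * cmath.exp(2j * math.pi * n * x / N)
                for n in range(N)) / math.sqrt(N)
            for x in range(N)]

def covariance_from_basis(basis):
    """Covariance matrix of sum_j y_j * S g_j for i.i.d. standard normal y_j."""
    g_bar = [apply_S(g) for g in basis]
    return [[sum(g[x] * g[y].conjugate() for g in g_bar) for y in range(N)]
            for x in range(N)]

# Two different orthonormal bases of the N-dimensional space.
standard = [[1.0 if x == j else 0.0 for x in range(N)] for j in range(N)]
hadamard = [[(-1) ** bin(j & x).count("1") / math.sqrt(N) for x in range(N)]
            for j in range(N)]

# Target: circulant covariance with spectrum c (discrete analogue of k_p(x - y)).
target = [[sum(c[n] * cmath.exp(2j * math.pi * n * (x - y) / N) for n in range(N)).real / N
           for y in range(N)]
          for x in range(N)]

for name, basis in (("standard", standard), ("hadamard", hadamard)):
    cov = covariance_from_basis(basis)
    err = max(abs(cov[x][y] - target[x][y]) for x in range(N) for y in range(N))
    print(name, err)
```

Both bases reproduce the same circulant covariance up to rounding, which mirrors the freedom in the choice of \((g_j)_{j\ge 1}\) that is exploited in the wavelet construction.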

We apply this procedure to the above described periodic wavelet basis, therefore obtaining new periodic functions

$$\begin{aligned} \bar{\Phi }^\mathrm{p}=S\Phi ^\mathrm{p} \quad \mathrm{and} \quad \bar{\Psi }^\mathrm{p}_{\varepsilon ,\ell ,n}:=S\Psi ^\mathrm{p}_{\varepsilon ,\ell ,n}, \end{aligned}$$
(94)

which are adapted to the decomposition of \(b_\mathrm{p}\). By construction \(\bar{\Phi }^\mathrm{p}\) is again a constant function with value \((2\gamma )^{-d/2}\sqrt{c_0 (k_\mathrm{p})}\), and the functions \(\bar{\Psi }^\mathrm{p}_{\varepsilon ,\ell ,n}\) are trigonometric polynomials of degree at most \(2^{\ell +2}\) in each variable. We next study in more detail the size and localization properties of these functions and show that they essentially behave like a wavelet basis with normalization \(2^{\ell (d/2-r)}\) in place of \(2^{\ell d/2}\). This geometric decay of the \(L^\infty \) norms, combined with the localization properties, will allow us to apply Theorem 1.1 and Corollary 1.3 for any value of \(r>\frac{d}{2}\).

Note that since \(k_\mathrm{p}\) has been obtained by periodizing the truncated function \(k_\mathrm{t}=k\phi \), an equivalent construction of the functions \(\bar{\Psi }^\mathrm{p}_{\varepsilon ,\ell ,n}\) is obtained by first defining over \(\mathbb {R}^d\) the rescaled and filtered wavelets \(\bar{\Psi }_{\varepsilon ,\ell ,n}\) according to

$$\begin{aligned} \widehat{\bar{\Psi }}_{\varepsilon ,\ell ,n}(\omega )=\hat{k}_\mathrm{t}^{\frac{1}{2}}(\omega ) \,(2\gamma )^{d/2}\widehat{\Psi }_{\varepsilon ,\ell ,n}(2\gamma \omega ), \end{aligned}$$
(95)

and then applying \({\mathbb {T}}\)-periodization, that is,

$$\begin{aligned} \bar{\Psi }^\mathrm{p}_{\varepsilon ,\ell ,n}(x)=\sum _{m\in \mathbb {Z}^d}\bar{\Psi }_{\varepsilon ,\ell ,n}(x+2\gamma m). \end{aligned}$$
(96)

We thus focus our attention on the size and localization properties of the functions \(\bar{\Psi }_{\varepsilon ,\ell ,n}\).

Note that these functions inherit from the wavelet basis the translation invariance structure, since, at any given scale level \(\ell \), the functions \(\bar{\Psi }_{\varepsilon ,\ell ,n}\) are translates by \(2\gamma 2^{-\ell }n\) of \(\bar{\Psi }_{\varepsilon ,\ell ,0}\), that is,

$$\begin{aligned} \bar{\Psi }_{\varepsilon ,\ell ,n}=\bar{\Psi }_{\varepsilon ,\ell ,0}(\cdot -2\gamma 2^{-\ell }n). \end{aligned}$$
(97)

However, they do not inherit the dilation invariance structure, since the functions \(\bar{\Psi }_{\varepsilon ,\ell ,0}\) are not obtained by a simple rescaling of \(\bar{\Psi }_{\varepsilon ,0,0}\). We introduce rescaled functions \(F_{\varepsilon ,\ell }\) such that

$$\begin{aligned} \bar{\Psi }_{\varepsilon ,\ell ,0}(x)=2^{\ell (d/2-r)}F_{\varepsilon ,\ell }(2^\ell x), \end{aligned}$$
(98)

that is, \(F_{\varepsilon ,\ell }\) is defined by

$$\begin{aligned} F_{\varepsilon ,\ell }( x)=2^{-\ell (d/2-r)} \bar{\Psi }_{\varepsilon ,\ell ,0}(2^{-\ell } x). \end{aligned}$$
(99)

Our objective is now to show that the functions \(F_{\varepsilon ,\ell }\) satisfy a uniform localization estimate, independently of \(\varepsilon \) and \(\ell \).

For this purpose, we require some additional assumptions on the covariance function, namely that (45) holds with \(s=r\), and that the partial derivatives of \(\hat{k}\) satisfy improved decay estimates

$$\begin{aligned} |\partial ^\alpha \hat{k} (\omega )| \le C ( 1+ |\omega |^2)^{-(r+|\alpha |/2)} , \quad |\alpha | \le M, \end{aligned}$$
(100)

for an \(M \ge d+1\). It is easily seen that Matérn covariances satisfy such estimates, by straightforward differentiation of (19).

Lemma 4.1

Let k satisfy (45), (46) with \(s=r\) and (100) with an integer \(M \ge d+1\). Then \(\phi \) in Theorem 2.3 can be chosen such that

$$\begin{aligned} |\partial ^\alpha \hat{k}_\mathrm{t}^{\frac{1}{2}} (\omega )| \le C ( 1+ |\omega |)^{-(r+|\alpha |)} , \quad 0\le |\alpha | \le M. \end{aligned}$$
(101)

Proof

We proceed exactly as in the proof of Theorem 2.3 to obtain a family of functions \(\phi _\kappa \in C^{2p}(\mathbb {R}^d)\) for \(\kappa \ge 2 \delta \), supported on \([-\kappa ,\kappa ]^d\) and such that \({\phi _{\kappa }}|_{[-\delta ,\delta ]^d} = 1\), and satisfying

$$\begin{aligned} \max _{|\alpha |\le 2p} ||\partial ^\alpha \phi _\kappa ||_{L^\infty } \le D <\infty \end{aligned}$$
(102)

for some \(D>0\), but here with \(p := \lceil r + M/2\rceil \).

As in the proof of Theorem 2.3, we obtain that for \(\phi := \phi _\kappa \) with \(\kappa \) sufficiently large, there exist \(\tilde{c},\tilde{C}>0\) such that

$$\begin{aligned} \tilde{c} ( 1 + |\omega |^2 )^{-r} \le \hat{k}_\mathrm{t}(\omega ) \le \tilde{C} ( 1 + |\omega |^2)^{-r}. \end{aligned}$$
(103)

Note that \(\phi \) obtained in this manner satisfies

$$\begin{aligned} |\hat{\phi }(\omega )| \le C ( 1 + |\omega |^2)^{-p}. \end{aligned}$$
(104)

Since \(\hat{k}_\mathrm{t}= (2\pi )^{-d}\, \hat{k} * \hat{\phi }\), we have \(\partial ^\alpha \hat{k}_\mathrm{t}=(2\pi )^{-d}\,\partial ^\alpha \hat{k} * \hat{\phi }\) and therefore

$$\begin{aligned} | \partial ^\alpha \hat{k}_\mathrm{t}(\omega ) | = (2\pi )^{-d} |(\partial ^\alpha \hat{k} * \hat{\phi })(\omega )| \le C ( 1 + |\omega |^2 )^{-(r + |\alpha |/2)}, \quad |\alpha | \le M, \end{aligned}$$
(105)

by the same argument used for the upper inequality in (33).

We now turn to (101). For \(\alpha = 0\), this simply follows from the upper bound in (103) combined with \((1+|\omega |)^2\le 2(1+|\omega |^2)\). For \(\alpha \ne 0\) such that \(|\alpha | \le M\), we obtain by induction that \(\partial ^\alpha \hat{k}_\mathrm{t}^{\frac{1}{2}}\) is of the form

$$\begin{aligned} \partial ^\alpha \hat{k}_\mathrm{t}^{\frac{1}{2}} = \sum _{m=1}^{|\alpha |}\sum _{\beta _1 + \cdots +\beta _m = \alpha } C_{\beta _1,\ldots ,\beta _m} \hat{k}_\mathrm{t}^{\frac{1}{2} - m} \prod _{\ell =1}^m \partial ^{\beta _\ell } \hat{k}_\mathrm{t}\end{aligned}$$
(106)

for certain \(C_{\beta _1,\ldots ,\beta _m} \in \mathbb {R}\). Note that we are allowed to divide by \(\hat{k}_\mathrm{t}\) since it is positive by (103). Since for \(\beta _1,\ldots ,\beta _m\) such that \(\beta _1 + \cdots + \beta _m = \alpha \), we have

$$\begin{aligned} \hat{k}_\mathrm{t}^{\frac{1}{2} - m}(\omega ) \prod _{\ell =1}^m \partial ^{\beta _\ell } \hat{k}_\mathrm{t}(\omega ) \le C ( 1 + |\omega |^2 )^{-\frac{r}{2} + mr} \prod _{\ell =1}^m ( 1 + |\omega |^2 )^{-(r + |\beta _\ell |/2)}, \end{aligned}$$
(107)

and thus

$$\begin{aligned} \hat{k}_\mathrm{t}^{\frac{1}{2} - m}(\omega ) \prod _{\ell =1}^m \partial ^{\beta _\ell } \hat{k}_\mathrm{t}(\omega ) \le C ( 1 + |\omega |)^{-(r+|\alpha |)}, \end{aligned}$$
(108)

we arrive at (101). \(\square \)

Theorem 4.2

Let k satisfy (45), (46) with \(s=r\) and (100) with an integer \(M \ge d+1\), and let \(\phi \) be chosen as in Lemma 4.1. Then the functions \(F_{\varepsilon ,\ell }\) satisfy

$$\begin{aligned} |F_{\varepsilon ,\ell }(x)|\le C(1+|x|)^{-M}, \end{aligned}$$
(109)

for some \(C>0\) that is independent of \(\ell \) and \(\varepsilon \).

Proof

From its definition (99) we have

$$\begin{aligned} \widehat{F}_{\varepsilon ,\ell }(\omega )=2^{\ell (d/2+r)} \widehat{\bar{\Psi }}_{\varepsilon ,\ell ,0}(2^{\ell } \omega ), \end{aligned}$$
(110)

and therefore by (95),

$$\begin{aligned} \widehat{F}_{\varepsilon ,\ell }(\omega )=(2\gamma )^{d/2} 2^{\ell (d/2+r)} \hat{k}_\mathrm{t}^{\frac{1}{2}}(2^\ell \omega ) \,\widehat{\Psi }_{\varepsilon ,\ell ,0}(2\gamma 2^\ell \omega )=(2\gamma )^{d/2}2^{\ell r} \hat{k}_\mathrm{t}^{\frac{1}{2}}(2^\ell \omega )\, \widehat{\Psi }_{\varepsilon }(2\gamma \omega ), \end{aligned}$$
(111)

where we have used the scaling relation between \(\Psi _{\varepsilon ,\ell ,0}\) and \(\Psi _{\varepsilon }\). The functions \(\widehat{F}_{\varepsilon ,\ell }\) are uniformly compactly supported since

$$\begin{aligned} |\omega |_\infty \ge \frac{8\pi }{6\gamma } \quad \implies \quad \widehat{\Psi }_{\varepsilon }(2\gamma \omega )=0. \end{aligned}$$
(112)

Applying partial differentiation for any \(\alpha \) such that \(|\alpha |\le M\), and using the multivariate Leibniz formula, we find that

$$\begin{aligned} \partial ^\alpha \widehat{F}_{\varepsilon ,\ell }(\omega )=(2\gamma )^{d/2}2^{\ell r}\sum _{\beta \le \alpha } {\alpha \atopwithdelims ()\beta } \left( 2^{\ell |\beta |}\partial ^\beta \hat{k}_\mathrm{t}^{\frac{1}{2}}(2^\ell \omega )\right) \left( (2\gamma )^{|\alpha |-|\beta |}\partial ^{\alpha -\beta } \widehat{\Psi }_{\varepsilon }(2\gamma \omega )\right) . \end{aligned}$$
(113)

The second factor \((2\gamma )^{|\alpha |-|\beta |}\partial ^{\alpha -\beta }\widehat{\Psi }_{\varepsilon }(2\gamma \omega )\) in each term is uniformly bounded independently of \(\omega \) and \(\beta \), in view of the smoothness assumption that we have imposed on \(\hat{\psi }\). As to the first factor, since we only consider \(\frac{2\pi }{6\gamma } \le |\omega | \le \sqrt{d} \frac{8\pi }{6\gamma }\), we may use (101) to conclude that

$$\begin{aligned} |2^{\ell |\beta |}\partial ^\beta \hat{k}_\mathrm{t}^{\frac{1}{2}}(2^\ell \omega )|\le C2^{-r\ell }. \end{aligned}$$
(114)

It follows that the derivatives \(\partial ^\alpha \widehat{F}_{\varepsilon ,\ell }(\omega )\) are uniformly bounded, independently of \(\ell \) and \(\varepsilon \), for all \(|\alpha |\le M\), which implies (109) since they are in addition uniformly compactly supported. \(\square \)

As we shall show next, Theorem 4.2 implies that Corollary 1.3 can be applied to the wavelet basis defined by (96), that is, with the basis in the corollary chosen as the scaling function \(\bar{\Phi }^\mathrm{p}\) and the wavelets \(\bar{\Psi }^\mathrm{p}_{\varepsilon ,\ell ,n}\) ordered by increasing scale level.

Corollary 4.3

Under the assumptions of Theorem 4.2, for each \(\ell \ge 0\) one has

$$\begin{aligned} \sup _{x\in {\mathbb {T}}} \sum _{\varepsilon \in \mathcal {C}} \sum _{n \in \{0,\ldots ,2^\ell -1\}^d} |\bar{\Psi }^\mathrm{p}_{\varepsilon ,\ell ,n}(x)| \le C 2^{-\alpha \ell }, \end{aligned}$$
(115)

with \(\alpha := r - \frac{d}{2}\) and C independent of \(\ell \).

Proof

For the summation over n in (115), by (96) and (98) we obtain

$$\begin{aligned} \sum _{n \in \{0,\ldots ,2^\ell -1\}^d} |\bar{\Psi }^\mathrm{p}_{\varepsilon ,\ell ,n}(x)|&\le 2^{\ell (d/2-r)} \sum _{m \in \mathbb {Z}^d} \sum _{n \in \{0,\ldots ,2^\ell -1\}^d} |F_{\varepsilon ,\ell }\bigl (2^\ell x + 2\gamma ( 2^\ell m - n) \bigr )|\\&= 2^{\ell (d/2-r)} \sum _{k \in \mathbb {Z}^d} |F_{\varepsilon ,\ell }(2^\ell x + 2\gamma k)|. \end{aligned}$$

By Theorem 4.2,

$$\begin{aligned} \sum _{k \in \mathbb {Z}^d} |F_{\varepsilon ,\ell }(2^\ell x + 2\gamma k)| \le C \sum _{k \in \mathbb {Z}^d} ( 1 + |2^\ell x + 2\gamma k|)^{-M} \le C \max _{z \in {\mathbb {T}}} \sum _{k \in \mathbb {Z}^d} (1 + |z + 2\gamma k|)^{-M}, \end{aligned}$$
(116)

and the expression on the right is bounded since \(M \ge d+1\). \(\square \)
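The role of the condition \(M\ge d+1\) in the last step can be seen in a small experiment. The sketch below evaluates partial sums of the lattice sum on the right of (116) in the univariate case \(d=1\), once with \(M=2\), for which the sum converges, and once with \(M=1\), for which the partial sums keep growing. The value \(\gamma =1\), the truncation levels and the function name are illustrative assumptions.

```python
def lattice_sum(z, gamma, M, k_max):
    """Partial sum over |k| <= k_max of (1 + |z + 2*gamma*k|)^(-M) in dimension d = 1."""
    return sum((1.0 + abs(z + 2.0 * gamma * k)) ** (-M)
               for k in range(-k_max, k_max + 1))

gamma = 1.0
zs = [-gamma + 2.0 * gamma * i / 50 for i in range(51)]  # sample points in [-gamma, gamma]

# M = 2 = d + 1: the maximum over z is finite and barely changes with k_max.
worst = max(lattice_sum(z, gamma, 2, 2000) for z in zs)
print("M = 2, max over z:", worst)
print("M = 2, change from k_max 1000 to 2000:",
      lattice_sum(0.0, gamma, 2, 2000) - lattice_sum(0.0, gamma, 2, 1000))

# M = 1 = d: the partial sums grow logarithmically in k_max.
print("M = 1, change from k_max 1000 to 2000:",
      lattice_sum(0.0, gamma, 1, 2000) - lattice_sum(0.0, gamma, 1, 1000))
```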

Remark 4.4

In the estimate (115), the precise value of M enters only into the constant C. For numerical purposes, however, larger values of M corresponding to stronger spatial localization of the functions \(\bar{\Psi }^\mathrm{p}_{\varepsilon ,\ell ,n}\) can be advantageous. Note that in the case of the Matérn covariance, if \(\phi \in C^\infty \), then Lemma 4.1 can be applied for any integer \(M\ge d+1\). If in addition \(\hat{\varphi }, \hat{\psi }\in C^\infty \) holds for the functions generating the Meyer wavelets, then Theorem 4.2 can be applied for any such M as well, and the resulting spatial decay of \(\bar{\Psi }^\mathrm{p}_{\varepsilon ,\ell ,n}\) is faster than any polynomial order.

5 Numerical Aspects

We now discuss in more concrete detail the periodic continuation and the resulting KL and wavelet representations in the case of the Matérn covariances (18). In particular we discuss how these representations can be efficiently computed by using FFT and show some numerical examples which reveal the effect of the parameters \(\lambda \) and \(\nu \) of the Matérn covariance. While these computational principles apply in any dimension, we work in the univariate setting, both for the sake of notational simplicity and for visualization purposes. Our computational domain is thus an interval \(\left]-\frac{\delta }{2},\frac{\delta }{2}\right[\) for some \(\delta >0\). Note however that varying the correlation length parameter \(\lambda \) in (18) amounts to rescaling the interval. Therefore we fix \(\delta =1\), that is

$$\begin{aligned} D:=\Bigl ]-\frac{1}{2},\frac{1}{2}\Bigr [, \end{aligned}$$
(117)

and study the effect of varying \(\lambda \).

5.1 Truncation and Positivity

We need to first choose a \(\phi \) satisfying the conditions in Theorem 2.3 such that \(k_\mathrm{t}= k \phi \) has nonnegative Fourier transform. One option, which yields \(\phi \in C^\infty (\mathbb {R})\), is based on the function \(\theta \) defined by

$$\begin{aligned} \theta (x) := {\left\{ \begin{array}{ll} \exp (-x^{-1}), &{} x>0, \\ 0, &{} x \le 0. \end{array}\right. } \end{aligned}$$
(118)

We then set

$$\begin{aligned} \phi (x) := \frac{ \theta \Bigl (\frac{\kappa - |x|}{\kappa - \delta }\Bigr ) }{\theta \Bigl (\frac{\kappa - |x|}{\kappa - \delta }\Bigr ) + \theta \Bigl (\frac{|x| - \delta }{\kappa - \delta }\Bigr )}. \end{aligned}$$
(119)

Recall that \(\kappa =2\gamma -\delta >\delta \). In view of Theorem 2.3, in order to ensure \(\hat{k}_\mathrm{t}\ge 0\), it then suffices to take \(\gamma \) sufficiently large.
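A direct transcription of (118) and (119) may look as follows. This is a minimal sketch; the function names and the sample values of \(\delta \) and \(\gamma \) are chosen for illustration only.

```python
import math

def theta(x):
    """The function theta from (118): exp(-1/x) for x > 0, and 0 otherwise."""
    return math.exp(-1.0 / x) if x > 0 else 0.0

def phi(x, delta, kappa):
    """Smooth cutoff from (119): equal to 1 on [-delta, delta], 0 outside [-kappa, kappa]."""
    a = theta((kappa - abs(x)) / (kappa - delta))
    b = theta((abs(x) - delta) / (kappa - delta))
    # The two arguments sum to 1, so a and b never vanish simultaneously.
    return a / (a + b)

delta, gamma = 1.0, 1.5
kappa = 2.0 * gamma - delta  # kappa = 2 in this example

for x in (0.0, 1.0, 1.25, 1.5, 1.75, 2.0, 2.5):
    print(x, phi(x, delta, kappa))
```

By symmetry of the construction, the cutoff takes the value 1/2 at the midpoint \(|x|=(\delta +\kappa )/2\) of the transition region.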

In order to illustrate the dependence of the required value of \(\gamma \) on the parameters \(\nu \), \(\lambda \) of the Matérn covariance, we now describe, for any chosen value of \(\gamma \), a simple scheme for approximating \(\hat{k}_\mathrm{t}\) based on the discrete Fourier transform. Let \(L> \kappa \) (so that \({{\mathrm{supp}}}k_\mathrm{t}\subset [-L,L]\)), and let \(N= 2^J\) for some \(J>0\). We consider the approximation of \(\hat{k}_\mathrm{t}\) by the trapezoidal rule,

$$\begin{aligned} \hat{k}_\mathrm{t}(\omega ) = \int _{-L}^L k_\mathrm{t}(x)\, e^{-i\omega x}\,dx \approx h \sum _{n={-N/2}}^{N/2-1} k_\mathrm{t}(x_n) e^{-i\omega x_n} =: D_{L,N}(\omega ), \end{aligned}$$
(120)

where \(h := 2L/N\) and \(x_n := nh\). The sum on the right side can be evaluated by the FFT to obtain the values \(D_{L,N}(\omega _k)\) with

$$\begin{aligned} \omega _k := \frac{\pi k}{L}, \quad k = -\frac{N}{2},\ldots , \frac{N}{2} - 1. \end{aligned}$$
(121)

Thus, by making N and L large we may approximately compute \(\hat{k}_\mathrm{t}\) on an arbitrarily large range of \(\omega \) and with arbitrarily fine sampling rate.
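The quadrature (120) can be tried out on a function with known Fourier transform. The sketch below uses, as an assumption made for testing only, the exponential covariance \(k(x)=e^{-|x|/\lambda }\) in place of \(k_\mathrm{t}\), for which \(\hat{k}(\omega )=2\lambda /(1+\lambda ^2\omega ^2)\). For brevity the sum is evaluated directly for a few frequencies rather than by an FFT; the parameter values and function names are illustrative.

```python
import math

lam, L, N = 0.25, 8.0, 256

def k_model(x):
    """Exponential covariance exp(-|x|/lam), used here as a stand-in for k_t."""
    return math.exp(-abs(x) / lam)

def k_hat_exact(omega):
    """Exact Fourier transform of k_model."""
    return 2.0 * lam / (1.0 + (lam * omega) ** 2)

def dft_approx(k_func, L, N, k_index):
    """D_{L,N}(omega_k) from (120) by direct summation.

    For an even real function the sine part cancels at the grid
    frequencies omega_k, so only the cosine part is summed.
    """
    h = 2.0 * L / N
    omega = math.pi * k_index / L
    return h * sum(k_func(n * h) * math.cos(omega * n * h)
                   for n in range(-N // 2, N // 2))

for k_index in (0, 5, 20):
    omega = math.pi * k_index / L
    print(k_index, dft_approx(k_model, L, N, k_index), k_hat_exact(omega))
```

In practice all values \(D_{L,N}(\omega _k)\) would be obtained at once by a single FFT of the samples \(k_\mathrm{t}(x_n)\); the direct sum above is only meant to make the definition explicit.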

Since \(k_\mathrm{t}\) agrees on \([-L,L]\) with its 2L-periodic extension, we have

$$\begin{aligned} k_\mathrm{t}(x_n) = \sum _{\ell \in \mathbb {Z}} \biggl (\frac{1}{2L} \int _{-L}^L k_\mathrm{t}(y)\, e^{-\pi i y \ell /L} \,dy \biggr ) e^{\pi i x_n \ell / L}, \end{aligned}$$
(122)

and consequently

$$\begin{aligned} D_{L,N}(\omega _k) = h \sum _{n={-N/2}}^{N/2-1} \Bigl ( \sum _{\ell \in \mathbb {Z}} \frac{1}{2L} \hat{k}_\mathrm{t}(\omega _\ell ) \,e^{\pi i x_n \ell / L} \Bigr ) e^{- \pi i x_n k /L} = \sum _{m\in \mathbb {Z}} \hat{k}_\mathrm{t}(\omega _{k + m N}). \end{aligned}$$
(123)

We thus have the error representation

$$\begin{aligned} |\hat{k}_\mathrm{t}(\omega _k) - D_{L,N}(\omega _k)| = \Bigl | \sum _{\begin{array}{c} m\in \mathbb {Z}\\ m\ne 0 \end{array}} \hat{k}_\mathrm{t}(\omega _{k+mN}) \Bigr |. \end{aligned}$$
(124)
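The aliasing identity (123) behind this error representation can be observed numerically. The following sketch again uses the exponential covariance \(e^{-|x|/\lambda }\) as a hypothetical stand-in for \(k_\mathrm{t}\) (it is not compactly supported, but its values beyond \(\pm L\) are negligible for the parameters chosen here) and compares the discrete sum with a truncated version of the series on the right of (123). All parameter values and names are assumptions for illustration.

```python
import math

lam, L, N = 0.25, 8.0, 256

def k_hat(omega):
    """Fourier transform of exp(-|x|/lam)."""
    return 2.0 * lam / (1.0 + (lam * omega) ** 2)

def dft_value(k_index):
    """Left side of (123): the trapezoidal sum D_{L,N}(omega_k), summed directly."""
    h = 2.0 * L / N
    omega = math.pi * k_index / L
    return h * sum(math.exp(-abs(n * h) / lam) * math.cos(omega * n * h)
                   for n in range(-N // 2, N // 2))

def aliased_sum(k_index, m_max=2000):
    """Right side of (123), truncated to |m| <= m_max."""
    return sum(k_hat(math.pi * (k_index + m * N) / L)
               for m in range(-m_max, m_max + 1))

for k_index in (0, 7, 40):
    omega = math.pi * k_index / L
    d_val = dft_value(k_index)
    print(k_index, d_val, aliased_sum(k_index), abs(d_val - k_hat(omega)))
```

The last printed column is the quadrature error, which by (124) is the contribution of the terms with \(m\ne 0\).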

In view of (19) and (36), in our present setting we obtain

$$\begin{aligned} |\hat{k}_\mathrm{t}(\omega _k) - D_{L,N}(\omega _k)| \le C N^{-(2\nu + 1)}, \quad k = -\frac{N}{2},\ldots , \frac{N}{2} - 1, \end{aligned}$$
(125)

where \(C>0\) is independent of k, L and N.

Based on this approximation, we can check positivity of the obtained approximate values \(D_{L,N}(\omega _k)\) for each \(\gamma \) and combine this with a simple bisection scheme to obtain an estimate of the minimum required value \(\gamma _{\min }\) for which \(\hat{k}_\mathrm{t}\) remains non-negative on the chosen grid. As illustrated in Fig. 1, we observe that \(\gamma _{\min }\) remains close to its lower bound \(\delta =1\) for \(\nu ,\lambda < 1\), and shows approximately bilinear growth for larger \(\nu ,\lambda \). In other words, the continuation process requires a significantly larger domain as smoothness or correlation length increase. This also implies that the KL or wavelet frames of the RKHS obtained in Sects. 3 and 4 become more redundant as these parameters increase.
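The positivity check and the bisection can be sketched as follows. This is an illustration under stated assumptions, not the code used for Fig. 1: it treats only the exponential covariance \(e^{-|x|/\lambda }\) (assumed here to correspond to the Matérn case \(\nu =1/2\), since (18) is not restated in this section), it evaluates the sum (120) directly instead of by an FFT, and it assumes that positivity on the grid is monotone in \(\gamma \), so that bisection is meaningful. The grid parameters, the tolerance and the bracket `gamma_hi` are hypothetical choices.

```python
import math

def theta(x):
    return math.exp(-1.0 / x) if x > 0 else 0.0

def phi(x, delta, kappa):
    """Cutoff function from (119)."""
    a = theta((kappa - abs(x)) / (kappa - delta))
    b = theta((abs(x) - delta) / (kappa - delta))
    return a / (a + b)

def is_positive(gamma, lam, delta=1.0, L=8.0, N=256, tol=1e-12):
    """Check D_{L,N}(omega_k) >= -tol for k = 0, ..., N/2 (even symmetry covers k < 0).

    Requires L > kappa = 2*gamma - delta so that supp k_t lies in [-L, L].
    """
    kappa = 2.0 * gamma - delta
    h = 2.0 * L / N
    xs = [n * h for n in range(-N // 2, N // 2)]
    kt = [math.exp(-abs(x) / lam) * phi(x, delta, kappa) for x in xs]
    for k in range(N // 2 + 1):
        omega = math.pi * k / L
        D = h * sum(v * math.cos(omega * x) for v, x in zip(kt, xs))
        if D < -tol:
            return False
    return True

def gamma_min(lam, delta=1.0, gamma_hi=4.0, steps=12):
    """Bisection estimate of the smallest gamma in (delta, gamma_hi] passing the check."""
    if not is_positive(gamma_hi, lam, delta):
        raise ValueError("positivity fails at gamma_hi; enlarge the bracket (and L)")
    lo, hi = delta, gamma_hi
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if is_positive(mid, lam, delta):
            hi = mid
        else:
            lo = mid
    return hi

print(gamma_min(0.25))
```

The returned value is only an estimate on the chosen frequency grid; finer grids and larger L may be needed to resolve \(\gamma _{\min }\) reliably for larger \(\nu \) and \(\lambda \).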

Fig. 1 Numerically observed minimum value of \(\gamma \) required for positivity of \(\hat{k}_\mathrm{t}\), with k as in (18), in dependence on the Matérn parameters \(\lambda ,\nu \)

5.2 Matérn Wavelets

The Meyer scaling function and wavelet can be defined by first taking \(\hat{\varphi }\) to be a non-negative function such that

$$\begin{aligned} |\hat{\varphi }(\omega )|^2=\beta (\omega ), \end{aligned}$$
(126)

where \(\beta (\omega )\) is a smooth and even function supported in \([-\frac{4\pi }{3},\frac{4\pi }{3}]\), such that

$$\begin{aligned} \beta (\omega )=1, \quad \omega \in \left[ -\frac{2\pi }{3}, \frac{2\pi }{3}\right] , \end{aligned}$$
(127)

and

$$\begin{aligned} \beta (\pi -\omega )+\beta (\pi +\omega )=1, \quad \omega \in \left[ 0, \frac{\pi }{3} \right] . \end{aligned}$$
(128)

Then, one defines \(\hat{\psi }\) by

$$\begin{aligned} \hat{\psi }(\omega ) :=\bigl (\beta (\omega /2)-\beta (\omega )\bigr )^{1/2}e^{i\omega /2}. \end{aligned}$$
(129)

One simple example with explicit expressions of \(\hat{\varphi }\) and \(\hat{\psi }\), following the construction given in [11], is

$$\begin{aligned} \hat{\varphi }(\omega ) := {\left\{ \begin{array}{ll} 1 ,&{} |\omega | \le \frac{2\pi }{3},\\ \cos \Bigl (\frac{\pi }{2} \nu \Bigl ( \frac{3|\omega |}{2\pi } - 1 \Bigr ) \Bigr ), &{} \frac{2\pi }{3}< |\omega | < \frac{4\pi }{3},\\ 0, &{}\text {otherwise,} \end{array}\right. } \end{aligned}$$
(130)

and

$$\begin{aligned} \hat{\psi }(\omega ) := {\left\{ \begin{array}{ll} \sin \Bigl ( \frac{\pi }{2} \nu \Bigl ( \frac{3|\omega |}{2\pi } - 1\Bigr ) \Bigr ) e^{i\omega /2}, &{} \frac{2\pi }{3}< |\omega | \le \frac{4\pi }{3}, \\ \cos \Bigl ( \frac{\pi }{2} \nu \Bigl ( \frac{3|\omega |}{4\pi } - 1\Bigr ) \Bigr ) e^{i\omega /2}, &{} \frac{4\pi }{3} < |\omega | \le \frac{8\pi }{3}, \\ 0, &{} \text {otherwise,} \end{array}\right. } \end{aligned}$$
(131)

where we take

$$\begin{aligned} \nu (x) := \frac{\theta (x)}{\theta (x) + \theta (1-x)} \end{aligned}$$
(132)

with \(\theta \) given by (118).
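The explicit formulas (130)–(132) can be checked against the defining relations (128) and (129) numerically. The following sketch assumes, purely for illustration, that \(\theta (x) = e^{-1/x}\) for \(x>0\) and \(\theta (x)=0\) otherwise; the actual \(\theta \) is defined in (118), which is not restated in this section, and any \(\theta \) for which (132) is well defined can be substituted. The function names are hypothetical.

```python
import numpy as np

def theta(x):
    # ASSUMPTION: stand-in for the function theta of (118).
    x = np.asarray(x, dtype=float)
    safe = np.where(x > 0, x, 1.0)  # avoids division by zero for x <= 0
    return np.where(x > 0, np.exp(-1.0 / safe), 0.0)

def nu(x):
    # Transition function (132): equals 0 for x <= 0 and 1 for x >= 1.
    return theta(x) / (theta(x) + theta(1.0 - x))

def phi_hat(omega):
    # Scaling function in Fourier domain, as in (130).
    a = np.abs(np.asarray(omega, dtype=float))
    inner = np.cos(0.5 * np.pi * nu(3.0 * a / (2.0 * np.pi) - 1.0))
    return np.where(a <= 2 * np.pi / 3, 1.0,
                    np.where(a < 4 * np.pi / 3, inner, 0.0))

def psi_hat(omega):
    # Wavelet in Fourier domain, as in (131).
    omega = np.asarray(omega, dtype=float)
    a = np.abs(omega)
    low = np.sin(0.5 * np.pi * nu(3.0 * a / (2.0 * np.pi) - 1.0))
    high = np.cos(0.5 * np.pi * nu(3.0 * a / (4.0 * np.pi) - 1.0))
    in_low = (a > 2 * np.pi / 3) & (a <= 4 * np.pi / 3)
    in_high = (a > 4 * np.pi / 3) & (a <= 8 * np.pi / 3)
    modulus = np.where(in_low, low, np.where(in_high, high, 0.0))
    return modulus * np.exp(0.5j * omega)

def beta(omega):
    # beta = |phi_hat|^2 as in (126).
    return phi_hat(omega) ** 2

omega = np.linspace(-10.0, 10.0, 4001)
# Residual of (129): |psi_hat|^2 - (beta(omega/2) - beta(omega)).
res_129 = np.max(np.abs(np.abs(psi_hat(omega)) ** 2
                        - (beta(omega / 2) - beta(omega))))
# Residual of (128): beta(pi - w) + beta(pi + w) - 1 on [0, pi/3].
w = np.linspace(0.0, np.pi / 3, 101)
res_128 = np.max(np.abs(beta(np.pi - w) + beta(np.pi + w) - 1.0))
print(res_129, res_128)
```

Both residuals should be at the level of rounding errors, since (128) reduces to \(\nu (x) + \nu (1-x) = 1\) and (129) to \(\cos ^2 + \sin ^2 = 1\).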

From these we now construct the one-dimensional versions of \(\bar{\Phi }^\mathrm{p}\) and \(\bar{\Psi }^\mathrm{p}\) described in Sect. 4. Recall that for the scaling function, \(\bar{\Phi }^\mathrm{p} = (2\gamma )^{-1/2} \sqrt{ \hat{k}_\mathrm{t}(0) }\), which we can directly approximate using (120). For the wavelets, it suffices to consider \(\bar{\Psi }^\mathrm{p}_{\varepsilon ,\ell ,0}\) for each wavelet type \(\varepsilon \) and scale level \(\ell \), since all further wavelets are obtained as translates of these functions. In the present univariate case, there is only a single wavelet type \(\varepsilon =1\), and we omit the corresponding subscript in what follows.

By (92) and (32), we have

$$\begin{aligned} \bar{\Psi }^\mathrm{p}_{\ell ,0}(x) = \frac{1}{2\gamma } \sum _{n\in \mathbb {Z}} \sqrt{ \hat{k}_\mathrm{t}\left( \frac{\pi }{\gamma }n\right) } \,\Bigl ( \sqrt{2^{1-\ell }\gamma }\, \hat{\psi }(2^{-\ell + 1}\pi n)\Bigr )\,e^{i \frac{\pi }{\gamma }n x}. \end{aligned}$$
(133)

We now choose L in (120) as \(L=2 \gamma \), that is, \(\omega _k = \frac{\pi k}{2 \gamma }\). We assume that \(\gamma \) and \(N = 2^J\), with \(J>1\), are sufficiently large to ensure \(D_{L,N}(\omega _k)\ge 0\) for the range of k in (121). This allows us to approximate the above expression by

$$\begin{aligned} \tilde{\Psi }^\mathrm{p}_{\ell ,0}(x) := \frac{1}{\sqrt{2^{\ell +1}\gamma }} \sum _{n = -N/4}^{ N/4-1} \sqrt{ D_{L,N}(\omega _{2 n})} \,\hat{\psi }(2^{-\ell + 1}\pi n)\,e^{i \frac{\pi }{\gamma }n x}. \end{aligned}$$
(134)
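The prefactor in (134) results from combining the two normalization factors in (133), and the frequencies match because \(L = 2\gamma \):

$$\begin{aligned} \frac{1}{2\gamma } \sqrt{2^{1-\ell }\gamma } = \sqrt{\frac{2^{1-\ell }\gamma }{4\gamma ^2}} = \frac{1}{\sqrt{2^{\ell +1}\gamma }}, \qquad \omega _{2n} = \frac{2\pi n}{2\gamma } = \frac{\pi }{\gamma }\, n. \end{aligned}$$

Thus (134) differs from (133) only in the truncation of the sum to \(-N/4 \le n \le N/4-1\) and in the replacement of \(\hat{k}_\mathrm{t}(\pi \gamma ^{-1} n)\) by \(D_{L,N}(\omega _{2n})\).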

Using the compact support of \(\hat{\psi }\) and the bound \(|\hat{\psi }| \le 1\), we obtain

$$\begin{aligned} |\bar{\Psi }^\mathrm{p}_{\ell ,0}(x) - \tilde{\Psi }^\mathrm{p}_{\ell ,0}(x)| \le {}& C 2^{-\ell /2} \biggl \{\sum _{\begin{array}{c} |n|\ge N/4 \\ \frac{1}{3} 2^\ell< |n|< \frac{4}{3} 2^\ell \end{array}}\sqrt{ \hat{k}_\mathrm{t}(\pi \gamma ^{-1} n)} \\ & + \sum _{\begin{array}{c} n \in \{ - N/4, \ldots , N/4-1 \} \\ \frac{1}{3} 2^\ell< |n| < \frac{4}{3} 2^\ell \end{array}} \biggl |\sqrt{ \hat{k}_\mathrm{t}(\pi \gamma ^{-1} n)} - \sqrt{ D_{L,N}(\omega _{2 n})} \biggr |\biggr \}. \end{aligned}$$
(135)

Recall that, by (103), we have \(c (1+|\omega |)^{-2\nu -1}\le \hat{k}_\mathrm{t}(\omega ) \le C(1+|\omega |)^{-2\nu -1}\). It follows that the first sum can be bounded according to

$$\begin{aligned} \sum _{\begin{array}{c} |n|\ge N/4 \\ \frac{1}{3} 2^\ell< |n| < \frac{4}{3} 2^\ell \end{array}}\sqrt{ \hat{k}_\mathrm{t}(\pi \gamma ^{-1} n)}\le C \min \{ N^{-\nu + \frac{1}{2}}, 2^\ell N^{-\nu -\frac{1}{2}}\}. \end{aligned}$$
(136)

For the second sum, we combine (125) with

$$\begin{aligned} \biggl |\sqrt{ \hat{k}_\mathrm{t}(\pi \gamma ^{-1} n)} - \sqrt{ D_{L,N}(\omega _{2 n})} \biggr | \le \frac{\bigl |{ \hat{k}_\mathrm{t}(\pi \gamma ^{-1} n)} - { D_{L,N}(\omega _{2 n})} \bigr |}{\sqrt{ \hat{k}_\mathrm{t}(\pi \gamma ^{-1} n)}}, \end{aligned}$$
(137)

to obtain

$$\begin{aligned} \sum _{\begin{array}{c} n \in \{ - N/4, \ldots , N/4-1 \} \\ \frac{1}{3} 2^\ell< |n| < \frac{4}{3} 2^\ell \end{array}} \biggl |\sqrt{ \hat{k}_\mathrm{t}(\pi \gamma ^{-1} n)} - \sqrt{ D_{L,N}(\omega _{2 n})} \biggr | \le CN^{-2\nu -1} \min \{ N^{\nu +\frac{3}{2}}, 2^\ell N^{\nu +\frac{1}{2}}\}. \end{aligned}$$
(138)

This yields

$$\begin{aligned} |\bar{\Psi }^\mathrm{p}_{\ell ,0}(x) - \tilde{\Psi }^\mathrm{p}_{\ell ,0}(x)| \le C 2^{-\ell /2} \min \{ N^{-\nu + \frac{1}{2}}, 2^\ell N^{-\nu -\frac{1}{2}}\} =C 2^{-\nu J - \frac{1}{2} |J-\ell |}, \end{aligned}$$
(139)

where \(C>0\) depends on \(\nu ,\lambda ,\gamma \), but not on J or \(\ell \).
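The last equality in (139) can be verified by substituting \(N = 2^J\) into the two arguments of the minimum:

$$\begin{aligned} 2^{-\ell /2} N^{-\nu +\frac{1}{2}} = 2^{-\nu J + \frac{1}{2}(J-\ell )}, \qquad 2^{-\ell /2}\, 2^{\ell } N^{-\nu -\frac{1}{2}} = 2^{-\nu J - \frac{1}{2}(J-\ell )}. \end{aligned}$$

The first expression is the smaller one when \(\ell \ge J\) and the second when \(\ell \le J\), so that their minimum equals \(2^{-\nu J - \frac{1}{2}|J-\ell |}\).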

Fig. 2 Wavelets \(\bar{\Psi }^\mathrm{p}_{\ell ,0}\) obtained with \(\lambda =1\), \(\nu =\frac{1}{2}\), and \(\gamma =\frac{3}{2}\), for \(\ell =0,\ldots ,5\)

Fig. 3 Wavelets \(\bar{\Psi }^\mathrm{p}_{\ell ,0}\) obtained with \(\lambda =1\), \(\nu =4\), and \(\gamma =5\), for \(\ell =0,\ldots ,5\)

The sum in (134) can be evaluated by FFT simultaneously for the \(2^{J-1}\) arguments

$$\begin{aligned} x = \frac{4\gamma k}{N}, \quad k=-\frac{N}{4},\ldots , \frac{N}{4}-1, \end{aligned}$$
(140)

at a cost of order \(J 2^J\). In other words, if we prescribe a grid size \(h\sim 2^{-J}\), then as a consequence of (139) we can determine the values of the wavelets at any level \(\ell \) at the grid points up to an error of order \(h^\nu \), using of the order of \(h^{-1} |\log h|\) operations in total. Since the wavelets are trigonometric polynomials, their values between grid points can be approximated with a similar order of accuracy by local polynomial interpolation of neighboring grid values.
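The FFT evaluation of a sum of the form (134) on the grid (140) can be sketched as follows. The sketch assumes that the coefficients \(c_n = (2^{\ell +1}\gamma )^{-1/2} \sqrt{D_{L,N}(\omega _{2n})}\, \hat{\psi }(2^{-\ell +1}\pi n)\), \(n=-N/4,\ldots ,N/4-1\), have already been computed; the function name `evaluate_on_grid` is hypothetical, and the usage lines employ random stand-in coefficients only to compare the FFT result with direct summation.

```python
import numpy as np

def evaluate_on_grid(coeffs, gamma):
    """Evaluate sum_n coeffs[n] * exp(1j*pi*n*x/gamma) on the grid (140).

    coeffs holds c_n for n = -M/2, ..., M/2 - 1 with M = N/2 (M even).
    The returned grid is x_k = 2*gamma*k/M = 4*gamma*k/N,
    k = -M/2, ..., M/2 - 1.
    """
    coeffs = np.asarray(coeffs, dtype=complex)
    M = coeffs.size
    k = np.arange(-M // 2, M // 2)
    x = 2.0 * gamma * k / M
    # On this grid exp(1j*pi*n*x_k/gamma) = exp(2j*pi*n*k/M), so the sum is
    # an inverse DFT of length M. np.fft.ifft includes a factor 1/M, which
    # is undone by multiplying with M.
    values = M * np.fft.fftshift(np.fft.ifft(np.fft.ifftshift(coeffs)))
    return x, values

# Usage with random stand-in coefficients (illustration only).
rng = np.random.default_rng(0)
M, gamma = 16, 1.5
c = rng.standard_normal(M) + 1j * rng.standard_normal(M)
x, fast = evaluate_on_grid(c, gamma)

# Direct summation for comparison, at cost of order M^2.
n = np.arange(-M // 2, M // 2)
direct = np.array([np.sum(c * np.exp(1j * np.pi * n * xk / gamma)) for xk in x])
print(np.max(np.abs(fast - direct)))
```

The two index shifts account for the fact that both the frequency index n and the grid index k are centered at zero, whereas the FFT routine uses indices starting at zero.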

As an illustration, we display in Figs. 2 and 3 the obtained wavelets at scales \(\ell =0,\dots ,5\) for \(\lambda =1\) and \((\nu ,\gamma )=(1/2,3/2)\) and \((\nu ,\gamma )=(4,5)\), respectively. Note that these wavelets behave asymptotically similarly to standard wavelets in terms of scale invariance. As expected, the rate at which their size decays with the scale level depends on \(\nu \).

Remark 5.1

In typical applications, the exact wavelet expansion of the random field b is truncated to a finite number of terms, up to some prescribed error. The estimate (139) shows that, in addition, we can control the approximation of each wavelet in the uniform sense, with arbitrarily high precision governed by J. Combining such estimates allows us to control the resulting error after truncation of the expansion and approximation of each wavelet.