1 Introduction

Samples of infinite-dimensional data, especially data recorded continuously over a time interval, are now a commonly encountered type of data thanks to modern technology. They arise in many fields of application, e.g. in econometrics, where authors rather speak of panel data, and they fuel the field of functional data analysis (FDA), whose scope no longer needs to be demonstrated (see, for general ideas and many examples, Hsiao (2003), Ramsay and Silverman (2007), Wang et al. (2015)). Parametric models are most often proposed to deal with FDA. However, nonparametric approaches allow for more flexibility and robustness.

In the present contribution, we consider i.i.d. observations \((X_i(t), t\in [0,T], i=1, \ldots ,N) \) of the continuous time moving average (CMA) process

$$\begin{aligned} X(t)= \int _0^t a(t-s)dW(s) \end{aligned}$$
(1)

where \((W(t), t\ge 0)\) is a Wiener process and \(a: {{\mathbb {R}}}^+ \rightarrow {{\mathbb {R}}}\) is a deterministic square integrable function. Our aim is to study the new and challenging question of the nonparametric estimation of the function \(g=a^2\) from these observations, under very general conditions on the function a(t). Our assumptions include in particular the classical CARMA (continuous-time ARMA) processes, but also more complicated processes such as the continuous time fractionally integrated process of order d (see (3)), defined in (Comte and Renault 1996, Definition 2), which is linked with fractional Brownian motion with Hurst index \(H=d+1/2\).

CMA processes have been the subject of a huge number of contributions concerned with modelling properties. Estimation procedures rely on the observation of a single sample path on a time interval [0, T], and usually the stationary version of (X(t)), namely

$$\begin{aligned} Y(t)= \int _{-\infty }^t a(t-s)dW(s), \end{aligned}$$
(2)

is considered. We refer e.g. to Brockwell (2001) for a reference book, and to Brockwell et al. (2012) and the references given therein, where a general Lévy process (L(t)) may replace (W(t)) (see also e.g. Belomestny et al. 2019; Schnurr and Woerner 2011). Concerning nonparametric estimation, a pointwise estimator of a(t), mainly for Gaussian CARMA(p, q) processes in the stationary regime (see formula (2)), is proposed in Brockwell et al. (2012), based on the discrete observation of one sample path. Except for this reference, to our knowledge, the nonparametric estimation of a(t) for general CMA processes has not yet been studied.

In the present paper, stationarity of the process is not required. The asymptotic framework is that either N tends to infinity with T fixed, or both N and T tend to infinity. We assume that g is square integrable. Considering sequences \((S_m, m\in {{\mathbb {N}}})\) of finite dimensional subspaces of \({{\mathbb {L}}}^2({{\mathbb {R}}}^+)\), we propose two kinds of projection estimators of g built from the observations \((X_i(t), t\in [0,T], i=1, \ldots ,N)\): i.e., we build estimators of the orthogonal projection \(g_m\) of g on \(S_m\) by estimating the coefficients of the projection on an orthonormal basis of \(S_m\). The first method relies on the assumption that a(t) belongs to \(C^1([0, +\infty ))\), which excludes the continuous time fractionally integrated process. In this case, (X(t)) is an Itô process with an explicit stochastic differential. The second approach, which is more general, applies without regularity assumptions on a(t). Then, in the general case, we propose a data-driven selection of the dimension leading to an adaptive estimator. For this part, the Gaussian character of the process (X(t)) is especially exploited. Proofs that do not rely on this property are possible, though longer.

In Sect. 2, we present the assumptions and the collections of models. Two collections are especially investigated. First, we consider, for fixed T, the collection of spaces generated by the trigonometric basis of \({{\mathbb {L}}}^2([0,T])\), and thus we estimate \(g_T=g{{\mathbf {1}}}_{[0,T]}\). Second, for large T, we consider spaces generated by the Laguerre basis of \({{\mathbb {L}}}^2({{\mathbb {R}}}^+)\). This basis has been widely investigated and used in recent references for nonparametric estimation by projection methods (see e.g. Comte and Genon-Catalot 2018). The estimators are presented in Sects. 2.2 (first method) and 2.3 (second method, under more general assumptions). Several risk bounds for the projection estimators on a fixed space are obtained and discussed. In Sect. 3, we detail the possible rates of convergence that can be deduced from the risk bounds, depending on regularity spaces for the unknown function g. Section 4 is concerned with the data-driven choice of the dimension of the projection space. We prove that our estimators are adaptive in the sense that their risk bounds automatically achieve the best compromise between squared bias and variance terms (Theorems 1 and 2). Section 5 contains a simulation study. Estimators are implemented on simulated data for various examples of functions g. We give tables of risks obtained by Monte Carlo simulations. In Sect. 6, some concluding remarks are given. Proofs are gathered in Sect. 7, and Sect. 8 contains the necessary definitions and properties of the Laguerre basis.

2 Projection estimators on a fixed space

2.1 Assumptions and collection of models

We estimate the function

$$\begin{aligned} g(t):=a^2(t). \end{aligned}$$

Our study will depend on assumptions on the unknown function a(t):

  • [H0] The function \(g(t)=a^2(t)\) belongs to \({{\mathbb {L}}}^1({{\mathbb {R}}}^+)\cap {{\mathbb {L}}}^2({{\mathbb {R}}}^+)\)

  • [H1] The function a(t) belongs to \(C^1({{\mathbb {R}}}^+)\), is bounded and \(\int _0^{+\infty } (a'(t))^2dt<+\infty \).

Example 1

Consider the following example: \(a(t)=t^d{{\tilde{a}}}(t)/\Gamma (d+1)\) where \(d>-1/2\) and \({{\tilde{a}}}\in C^1({{\mathbb {R}}}^+)\) and \({{\tilde{a}}}(0) \ne 0\),

$$\begin{aligned} X(t)=\int _0^t \frac{(t-s)^d}{\Gamma (d+1)} {{\tilde{a}}}(t-s)dW(s). \end{aligned}$$
(3)

In particular, for \({{\tilde{a}}}(x)=1\), this process is the continuous time fractional Brownian motion of order d defined in (Comte and Renault 1996, Definition 1), and the general formulation above corresponds to the continuous time fractionally integrated process of order d (Definition 2 therein). The integrability of \(a^2, a'^2,a^4\) near infinity can be ensured by the rate of decrease of \({{\tilde{a}}}\) near infinity, for instance if \({{\tilde{a}}}(t)=e^{-t}\). The behaviour near 0 depends on d:

(i) The process X(t) is well defined for any \(d>-1/2\) as a is locally square integrable.

(ii) For \(-1/2<d<0\), a(0) is not defined.

(iii) For \(d\ge 1\), a(t) belongs to \(C^1({{\mathbb {R}}}^+)\) and \(a'\) is locally square integrable.

(iv) As \(a(t) \sim ({{\tilde{a}}}(0)/\Gamma (d+1)) t^d\) at 0, [H0] requires \(d>-1/4\).

In other words, fractional processes can be studied only under [H0].

We denote respectively by \(\Vert .\Vert _T\) (resp. \(\langle .,.\rangle _T\)) the norm (resp. the scalar product) of \({{\mathbb {L}}}^2([0,T])\) and \(\Vert .\Vert \) (resp. \(\langle .,.\rangle \)) the norm (resp. the scalar product) of \({{\mathbb {L}}}^2({{\mathbb {R}}}^+)\). We set

$$\begin{aligned} G(t): = \int _0^t a^2(s)ds \le \Vert a\Vert ^2. \end{aligned}$$
(4)

Note that \({{\mathbb {E}}}(X^2(t))=G(t)\) is what enables us to estimate g, whereas \({{\mathbb {E}}}(Y^2(t))=\Vert a\Vert ^2\), being constant in t, would not. To build estimators of g, we use a projection method and consider two settings.

  • In the first case, T is fixed and we estimate \(g_T=g{{\mathbf {1}}}_{[0,T]}\). For this, we consider the collection \((S_{m}^{Trig}, m\ge 0)\) of subspaces of \({{\mathbb {L}}}^2([0,T])\) where \(S_{m}^{Trig}\) has odd dimension m and is generated by the orthonormal trigonometric basis \((\varphi _{j,T})\) where \(\varphi _{0,T}(t)=\sqrt{1/T}\mathbf {1}_{[0,T]}(t)\), \(\varphi _{2j-1,T}(t)=\sqrt{2/T}\cos (2\pi jt/T) \mathbf {1}_{[0,T]}(t)\) and \(\varphi _{2j,T}(t)=\sqrt{2/T}\sin (2\pi j t/T) \mathbf {1}_{[0,T]}(t)\) for \(j=1, \dots , (m-1)/2\). This basis satisfies

    $$\begin{aligned} \sum _{j=0}^{m-1} \varphi _{j,T}^2(t)= \frac{m}{T} \quad \text{ and }\quad \int _0^T \varphi _{0,T}(t)dt=\sqrt{T} , \int _0^T \varphi _{j,T}(t)dt= 0 \quad \text{ for }\quad j\ne 0. \end{aligned}$$
  • In the second case, we may consider that either T is fixed but large enough, or that T tends to infinity. In this case, we estimate g on \({{\mathbb {R}}}^+\) and we rather consider a collection of subspaces of \({{\mathbb {L}}}^2({{\mathbb {R}}}^+)\), generated by an orthonormal basis. The basis considered here is the Laguerre basis defined by

    $$\begin{aligned} \ell _j(t)= \sqrt{2} L_j(2t) e^{-t}{{\mathbf {1}}}_{t\ge 0}, \quad j\ge 0, \quad L_j(t)=\sum _{k=0}^j (-1)^k \left( {\begin{array}{c}j\\ k\end{array}}\right) \frac{t^k}{k!}. \end{aligned}$$
    (5)

    We set \(S_{m}^{Lag}= \mathrm{span}\{\ell _j, j=0, \ldots , m-1\}\). We have

    $$\begin{aligned} \forall t\ge 0, \quad \sum _{j=0}^{m-1} \ell _{j}^2(t)\le 2m, \quad \text{ and } \quad \int _0^{+\infty } \ell _j(t)dt= \sqrt{2}(-1)^j. \end{aligned}$$

    The second property is obtained by exact computation and the first one comes from the fact that \(\forall j, |\ell _j(t)|\le \sqrt{2}\). Moreover, \({{\mathcal {L}}}_j(T):=\int _0^T\ell _j(u)du\) can be computed recursively, see (48). All formulae concerning this basis are recalled in Sect. 8; a short numerical sketch of the basis is given below.
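
For numerical use, here is a minimal sketch of the evaluation of this basis on a time grid. It relies only on the standard three-term recurrence \((j+1)L_{j+1}(x)=(2j+1-x)L_j(x)-jL_{j-1}(x)\) of the Laguerre polynomials; the function name and array conventions are illustrative.

```python
import numpy as np

def laguerre_basis(t, m):
    """Evaluate (ell_0, ..., ell_{m-1}) of (5) at the points t (1d array),
    using the three-term recurrence for the Laguerre polynomials."""
    t = np.asarray(t, dtype=float)
    x = 2.0 * t
    L = np.empty((m, t.size))
    L[0] = 1.0
    if m > 1:
        L[1] = 1.0 - x
    for j in range(1, m - 1):
        # (j+1) L_{j+1}(x) = (2j+1-x) L_j(x) - j L_{j-1}(x)
        L[j + 1] = ((2 * j + 1 - x) * L[j] - j * L[j - 1]) / (j + 1)
    return np.sqrt(2.0) * L * np.exp(-t) * (t >= 0)
```

The quantities \({{\mathcal {L}}}_j(T)\) can then be approximated by quadrature on a fine grid, or computed exactly via (48).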

Remark 1

In the case of fixed T, we could also consider the subspaces \((S_{m}^{Hist})\) of \({{\mathbb {L}}}^2([0,T])\) generated by the histogram basis

$$\begin{aligned} \varphi _{j,T}(t)= \sqrt{m/T} \mathbf {1}_{[jT/m, (j+1)T/m[}(t), j=0, \dots , m-1 \end{aligned}$$

where \(\sum _{j=0}^{m-1} \varphi _{j,T}^2(t)=m/T\) and \(\int _0^T \varphi _{j,T}(t)dt=\sqrt{T/m}\). But these basis functions are not differentiable and thus would not be suitable for all our proposals.

For simplicity, in order to use a unique notation, we denote by \(\varphi _j\) either \(\varphi _{j,T}\) or \(\ell _j\) and set \(S_m= \mathrm{span}\{\varphi _j, j=0, \ldots , m-1\}\). In all cases, under [H0], the function g admits the expansion

$$\begin{aligned} g= \sum _{j\ge 0} \theta _j \varphi _j, \text{ with } \theta _j= \int _0^{+\infty } g(s) \varphi _j(s)ds= \langle g, \varphi _j\rangle . \end{aligned}$$

We define \(g_m(t)= \sum _{j=0}^{m-1} \theta _j \varphi _j(t)\) the orthogonal projection of g on \(S_m\).

2.2 Estimators under [H0]–[H1]

Under [H1], the stochastic differential of (X(t)) satisfies:

$$\begin{aligned} dX(t)= a(0) dW(t)+ \left[ \int _0^t a'(t-s)dW(s)\right] dt. \end{aligned}$$
(6)

(see Comte and Renault (1996), Eq. (6)).

Remark 2

By Eq. (6), we have, for each trajectory \(X_i\), for \(t_k=kT/n\) with fixed T,

$$\begin{aligned} \frac{1}{T}\sum _{k=0}^{n-1} (X_i(t_{k+1})-X_i(t_k))^2 \rightarrow _{n\rightarrow +\infty } a^2(0)=g(0), \text { in probability}. \end{aligned}$$

Thus, we can assume that g(0) is known, as we have continuous observation of the sample paths.
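
In practice, a minimal sketch of this realized-quadratic-variation approximation of g(0), from an array X of shape (N, n+1) containing the sampled values \(X_i(kT/n)\), \(k=0,\dots ,n\), is as follows; this is the quantity \(g^\dag (0)\) used again in Sect. 5 (the function name is illustrative).

```python
import numpy as np

def g0_hat(X, T):
    """Approximate g(0) = a^2(0) by the averaged realized quadratic variation
    of Remark 2: (1/(N T)) sum_i sum_k (X_i(t_{k+1}) - X_i(t_k))^2."""
    increments = np.diff(X, axis=1)          # X_i(t_{k+1}) - X_i(t_k)
    return np.sum(increments ** 2) / (X.shape[0] * T)
```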

The construction of our first estimator relies on the following lemma.

Lemma 1

Under [H0]-[H1], denoting by \(\theta _j=\langle g, \varphi _j\rangle \), we have

$$\begin{aligned} {{\mathbb {E}}}\left( \int _0^{+\infty } \varphi _j(s) X(s)dX(s)\right) = \frac{1}{2} \left( \theta _j - g(0) \int _0^{+\infty } \varphi _j(s)ds\right) . \end{aligned}$$

Obviously, if the basis has support [0, T], integrals are on this interval. Relying on this lemma, we can set:

$$\begin{aligned} {{\hat{\theta }}}_j= {{\hat{\theta }}}_j(N,T)= 2\left[ \frac{1}{N}\sum _{i=1}^N \left( \int _0^T \varphi _j(s) X_i(s)dX_i(s)\right) \right] + g(0)\int _0^{T} \varphi _j(s)ds. \end{aligned}$$
(7)

The projection estimator of g on a fixed space \(S_m\) is given by:

$$\begin{aligned} {{\hat{g}}}_m= \sum _{j=0}^{m-1}{{\hat{\theta }}}_j\varphi _j. \end{aligned}$$

We refer to Remark 2 concerning the fact that g(0) is known. We mention that here, the histogram basis can be used in the fixed-T setting.

Note that, by the Itô formula and (6), we can write \({{\hat{\theta }}}_j\) without a stochastic integral, provided that \(\varphi _j\) is differentiable:

$$\begin{aligned} {{\hat{\theta }}}_j=\varphi _j(T) \frac{1}{N}\sum _{i=1}^N X_i^2(T) - \frac{1}{N}\sum _{i=1}^N \int _0^T \varphi '_j(s) X_i^2(s)ds. \end{aligned}$$
(8)
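
For the reader's convenience, the computation behind (8) can be sketched as follows: under [H1], the quadratic variation of X is \(a^2(0)t\), so that, by the Itô formula and an integration by parts (recall \(X(0)=0\)),

$$\begin{aligned} 2\int _0^T \varphi _j(s) X(s)dX(s)= & {} \int _0^T \varphi _j(s) d(X^2(s)) - g(0)\int _0^T \varphi _j(s)ds \\= & {} \varphi _j(T)X^2(T)- \int _0^T \varphi '_j(s)X^2(s)ds - g(0)\int _0^T \varphi _j(s)ds. \end{aligned}$$

Averaging over the N paths and plugging this into (7), the \(g(0)\)-terms cancel and (8) follows.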

The following proposition gives bounds for the \({{\mathbb {L}}}^2\)-risk of \({{\hat{g}}}_m\) in the case of fixed T and the trigonometric basis.

Proposition 1

Assume [H0]-[H1] and consider that \((\varphi _j=\varphi _{j,T})\) is the trigonometric basis. Then

$$\begin{aligned} {{\mathbb {E}}}(\Vert {{\hat{g}}}_m - g\Vert _T^2)\le \Vert g_m-g\Vert _T^2 + 8g(0)G(T)\frac{m}{N}+ 4\left( 2 G(T)G_1(T) + g^2(0)\right) \frac{T}{N}+ \frac{4\Vert g\Vert _T^2}{N} \nonumber \\ \end{aligned}$$
(9)

where \(G_1(T)=\int _0^T(a'(u))^2du\). (Recall that G is defined in (4), that \(g_m\) denotes the orthogonal projection of g on \(S_m^{Trig}\) and that \(\Vert u\Vert _T^2=\int _0^T u^2(s)ds\).)

If \(g(0)=0\),

$$\begin{aligned} {{\mathbb {E}}}(\Vert {{\hat{g}}}_m - g\Vert _T^2)\le \Vert g_m-g\Vert _T^2 + 4 G(T)G_1(T) \frac{T}{N}+ \frac{\Vert g\Vert _T^2}{N}. \end{aligned}$$
(10)

Let us discuss these bounds for fixed T and large N. The bounds involve a standard squared bias term \(\Vert g-g_m\Vert _T^2\) due to the projection method. For \(g(0)\ne 0\), the variance has order m/N and the last two terms are residual (see (9)). Therefore, in this case, the choice of m results from a bias-variance compromise between the first two terms.

The case \(g(0)=0\) is different as the process is differentiable, see (6) with \(a(0)=0\), and the bound (10) shows that m must simply be chosen as large as possible.

Proposition 2

Assume [H0]-[H1].

If \((\varphi _j)\) is an orthonormal basis of \({{\mathbb {L}}}^2({{\mathbb {R}}}^+)\), for all \(T\ge 1, N\ge 1, m\ge 0\), we have

$$\begin{aligned} {{\mathbb {E}}}(\Vert {{\hat{g}}}_m - g\Vert ^2)\le & {} \Vert g_m-g\Vert ^2 + 8g(0)G(T)\frac{m}{N}+ c_G\frac{T}{N} + 2\int _T^{+\infty } g^2(s)ds + \frac{4 \Vert g\Vert ^2}{N}. \nonumber \\ \end{aligned}$$
(11)

where \(c_G=4\left( 2 G(T)G_1(T) + g^2(0)\right) \). If in addition \(g(0)=0\),

$$\begin{aligned} {{\mathbb {E}}}(\Vert {{\hat{g}}}_m - g\Vert ^2) \le \Vert g_m-g\Vert ^2 + 4 G(T)G_1(T) \frac{T}{N} + \int _T^{+\infty } g^2(s)ds + \frac{2\Vert g\Vert ^2}{N}. \end{aligned}$$
(12)

If \((\varphi _j)\) is the Laguerre basis of \({{\mathbb {L}}}^2({{\mathbb {R}}}^+)\), g is bounded and \(T\ge 6m-3\), then

$$\begin{aligned} {{\mathbb {E}}}(\Vert {{\hat{g}}}_m - g\Vert ^2)\le \Vert g_m-g\Vert ^2 + C \frac{m^2}{N} + C'\Vert a\Vert ^2 m \exp {(-12\gamma _2m)} \end{aligned}$$
(13)

where \(C= C'(g(0)^2+\Vert g\Vert _\infty ^2+ \Vert a\Vert ^2 \Vert a'\Vert ^2)\), and \(C'\) and \(\gamma _2\) are positive constants depending on the basis only.

We can discuss these bounds for fixed T or large T. Here again, the bounds involve a standard squared bias term \(\Vert g-g_m\Vert ^2\).

Bounds (11) and (12) may be compared to (9) and (10). In (9), T is fixed, so that the variance has order m/N for \(g(0)\ne 0\) and the term T/N is a negligible residual. If T can be large, the term T/N may no longer be negligible, and (11)–(12) involve an additional bias term \(\int _T^{+\infty } g^2(s)ds\), which is small for large T. But the order of these terms, which depends on T (a quantity that cannot be chosen), is difficult to discuss.

Bound (12) implies, as in the trigonometric case, that m must be chosen as large as possible. Bound (13) looks more classical: T does not appear, the variance term has order \(m^2/N\) and \(m \exp {(-12\gamma _2m)}\) is a negligible additional bias term.

Comparing (11) and (13), we see that in the Laguerre case, the variance term is less than

$$\begin{aligned} \min \left\{ 8g(0)G(T)\frac{m}{N}+ 4\left( 2 G(T)G_1(T) + g^2(0)\right) \frac{T}{N}, C \frac{m^2}{N} \right\} . \end{aligned}$$

Note that the constants \(G_1(T)\) and C are difficult to estimate, which is a drawback for model selection. In Sect. 5, we propose a practical data-driven choice of m taking this difficulty into account.

2.3 Estimator under [H0]

In this paragraph, to handle more general processes, including fractional processes, we propose another estimation method. We no longer assume that a belongs to \(C^1({{\mathbb {R}}}^+)\). Therefore, the stochastic differential (6), which requires [H1], no longer holds. In compensation, we consider basis functions that have to be differentiable on their domain, [0, T] or \({{\mathbb {R}}}^+\).

The construction of the second estimator is based on the following lemma.

Lemma 2

Assume that [H0] holds and that \((\varphi _j)_j\) is differentiable on [0, T], then

$$\begin{aligned} {{\mathbb {E}}}\left( \int _0^{T} \varphi '_{j}(s) X^2(s)ds\right) = \varphi _{j}(T)G(T)- \int _0^T g(u) \varphi _{j}(u)du. \end{aligned}$$

Therefore, we can set

$$\begin{aligned} {{\tilde{\theta }}}_j= - \frac{1}{N}\sum _{i=1}^N \left( \int _0^T \varphi _{j}'(s) X_i^2(s)ds\right) + \varphi _{j}(T){\widehat{G}}(T), \quad {\widehat{G}}(s) =\frac{1}{N} \sum _{i=1}^{N}X_i^2(s), \, s\in [0,T].\nonumber \\ \end{aligned}$$
(14)

Remark 3

We can remark that \({\tilde{\theta }}_j\) can also be written \({{\tilde{\theta }}}_j= - \int _0^T \varphi _{j}'(s) {\widehat{G}}(s) ds + \varphi _{j}(T){\widehat{G}}(T)\) and thus can be seen as an estimator of \(g=G'\) where G defined by (4) is seen as \(G(s)={{\mathbb {E}}}(X^2(s))\) for \(s\in [0,T]\). It corresponds to an empirical and truncated version of the integration by parts formula

$$\begin{aligned} \langle g, \varphi _j\rangle = [G(u)\varphi _j(u)]_0^{+\infty } -\int _0^{+\infty } G(u)\varphi _j'(u)du= -\int _0^{+\infty } G(u)\varphi _j'(u)du. \end{aligned}$$

Thus the estimator defined below may be considered in a more general context than processes X(t) given by (1). However, our computations are specific to this setting.
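
A minimal numerical sketch of this estimator from discretely sampled paths is given below (Riemann-type approximations of the integrals; X is an (N, n+1) array of values \(X_i(kT/n)\), phi and dphi are (m, n+1) arrays of the basis functions and of their derivatives on the same grid, e.g. via (34) for the trigonometric basis; all names are illustrative).

```python
import numpy as np

def tilde_g_coeffs(X, phi, dphi, T):
    """Discretized version of (14), in the form of Remark 3:
    tilde_theta_j = -int_0^T phi_j'(s) Ghat(s) ds + phi_j(T) * Ghat(T)."""
    dt = T / (X.shape[1] - 1)
    Ghat = np.mean(X ** 2, axis=0)                      # Ghat(s) on the grid
    integral = np.trapz(dphi * Ghat, dx=dt, axis=1)     # int phi_j'(s) Ghat(s) ds
    return -integral + phi[:, -1] * Ghat[-1]

def tilde_g(X, phi, dphi, T):
    """Projection estimator tilde_g_m evaluated on the same time grid."""
    theta = tilde_g_coeffs(X, phi, dphi, T)
    return theta @ phi
```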

Note that under [H0] only, formula (8) no longer holds; this is why we use another notation, \({\tilde{\theta }}_j\) instead of \({\hat{\theta }}_j\). If \(\varphi _j=\varphi _{j,T}\) is the trigonometric basis, then \(\varphi _{0,T}(T)= 1/\sqrt{T}, \varphi _{2j-1,T}(T)=\sqrt{2/T}, \varphi _{2j,T}(T)=0\), \(j\ge 1\). Then we define the estimator by

$$\begin{aligned} {{{\tilde{g}}}}_m= \sum _{j=0}^{m-1}{{\tilde{\theta }}}_j\varphi _j. \end{aligned}$$

We introduce the assumption:

  • [H2] \(\displaystyle \int _0^1 \frac{G^2(s)}{s} ds = c_0<+\infty \). Actually, [H2] is rather weak and allows fractional processes to be considered.

Example 1

(continued). If we consider, as in Example 1, \(a(t) =t^d {{\tilde{a}}}(t)/\Gamma (d+1)\), where \(d>-1/2\) and \({{\tilde{a}}} \in C^1({{\mathbb {R}}}^+)\), with \({{\tilde{a}}}(0)\ne 0\), then \(G^2(s)/s \sim _{s \rightarrow 0} s^{4d+1}{{\tilde{a}}}^4(0)/((2d+1)^2\Gamma ^4(d+1))\) and [H2] holds (\(d>-1/2\)). The constraint is weaker than [H0].

The following risk bounds hold for \({{{\tilde{g}}}}_m\).

Proposition 3

Assume [H0].

  • If \((\varphi _j=\varphi _{j,T})\) is the trigonometric basis, then

    $$\begin{aligned} {{\mathbb {E}}}(\Vert {{{\tilde{g}}}}_m - g\Vert ^2_T) \le \Vert g_m-g\Vert ^2_T + 6 G^2(T)\frac{4\pi ^2 m^2}{NT}+6G^2(T)\frac{m}{NT}. \end{aligned}$$
    (15)
  • Let \((\varphi _j=\ell _j)\) be the Laguerre basis.

    • Then, for all \(T\ge 1, N\ge 1, m\ge 0\),

      $$\begin{aligned}&{{\mathbb {E}}}(\Vert {{{\tilde{g}}}}_m - g\Vert ^2) \le \Vert g_m-g\Vert ^2 + 12\left( G^2(T)+ 2 \int _0^T \frac{G^2(u)}{u} du \right) \, \frac{m}{N} + \nonumber \\&\quad 12G^2(T) \frac{T }{N} + \int _T^{\infty } g^2(s)ds, \end{aligned}$$
      (16)

      where, if [H2] holds,

      $$\begin{aligned} \int _0^T \frac{G^2(u)}{u} du \le c_0 + G^2(T) \log (T). \end{aligned}$$
    • If \(T\ge 6(m-1)+3=6m-3\) and \((\varphi _j)\) is the Laguerre basis, then

      $$\begin{aligned} {{\mathbb {E}}}(\Vert {{{\tilde{g}}}}_m - g\Vert ^2) \le \Vert g_m-g\Vert ^2+ c_1 G^2(T)\frac{m^3}{N}+ c_2\Vert a\Vert ^2 m\; \exp {(-12\gamma _2m)} \end{aligned}$$
      (17)

      where \( c_1, c_2, \gamma _2\) are constants depending on the basis only.

As previously, all the risk bounds involve a squared bias term, \(\Vert g-g_m\Vert _T^2\) or \(\Vert g-g_m\Vert ^2\). The variance term in (15) can be compared to the one in (9), taking into account that \(G(T)\le \Vert a\Vert ^2<+\infty \), and the order is now \(m^2/(NT)\), which for fixed T is larger than the order m/N obtained for \({\hat{g}}_m\) with the same basis. Similarly, the variance term in (17) has order \(m^3/N\), which is larger than \(m^2/N\) in (13). This increase is the price of more general assumptions and estimators. As for (16), it is to be compared with (11): the variance order is m/N and there are two additional terms, T/N and \(\int _T^{\infty } g^2(s)ds\), which are difficult to discuss. We develop a data-driven selection method in Sect. 4, based on (15)–(16), which is implemented on simulated data.

3 Rates of convergence

Rates of convergence can be deduced from Propositions 1 and 3 in the asymptotic framework where N tends to infinity. As is always the case in nonparametric estimation, we must link the bias term \(\Vert g-g_m\Vert ^2\) with regularity properties of the function g, and the regularity spaces depend on the projection spaces.

3.1 Rates on periodic Fourier–Sobolev spaces for trigonometric basis

Consider first Inequality (9) and estimators built using the trigonometric basis. Let \(\beta \) be a positive integer, \(L>0\) and define

$$\begin{aligned} W^{per}(\beta , L)= & {} \{ f: [0,T]\rightarrow {{\mathbb {R}}}, f^{(\beta -1)} \; \text { is absolutely continuous}, \\&\quad \int _0^T [f^{(\beta )}(x)]^2dx \le L^2, f^{(j)}(0)=f^{(j)}(T), j=0, \dots , \beta -1\}. \end{aligned}$$

By Proposition 1.14 of Tsybakov (2009), a function \(f\in W^{per}(\beta , L)\) admits an expansion \(f=\sum _{j=0}^{\infty } \theta _j\varphi _{j,T}\) such that \(\sum _{j\ge 0} \theta _j^2 \tau _j^2\le C(L,T)\), where \(\tau _j=j^\beta \) for even j, \(\tau _j=(j-1)^\beta \) for odd j and \(C(L,T)=L^2(T/\pi )^{2\beta }\).

Therefore, consider the sets

$$\begin{aligned} \begin{array}{l} {{\mathcal {W}}}_1^{per}=\{ g, \; g\in W^{per}(\beta , L), g \text{ satisfies } \text{[H0] } \text{ and } \text{[H1] }, g(0)>0\}, \\ {{\mathcal {W}}}_2^{per}=\{ g, \; g\in W^{per}(\beta , L), g \text{ satisfies } \text{[H0] } \text{ and } \text{[H1] }, g(0)=0\}, \\ {{\mathcal {W}}}_3^{per}=\{ g, \; g\in W^{per}(\beta , L), g \text{ satisfies } \text{[H0] } \text{ but } \text{ not } \text{[H1] }\}. \end{array} \end{aligned}$$
(18)

Now, assume that \(g\in {{\mathcal {W}}}_1^{per}\). As \(g\in W^{per}(\beta , L)\), we have \(\Vert g-g_m\Vert _T^2\le C(L,T)m^{-2\beta }\), and Inequality (9) becomes

$$\begin{aligned} {{\mathbb {E}}}(\Vert {\hat{g}}_m-g\Vert ^2_T)\le C(L,T)m^{-2\beta } + C_1g(0)\frac{m}{N} + C_1\frac{ T}{N}. \end{aligned}$$
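
The dimension choice below simply balances the first two terms of this bound; a one-line minimization (with the constants absorbed into \(c_T\)) gives

$$\begin{aligned} \frac{d}{dm}\left( C(L,T)\, m^{-2\beta } + C_1 g(0)\frac{m}{N}\right) =0 \iff m^{2\beta +1}= \frac{2\beta \, C(L,T)\, N}{C_1 g(0)}, \quad \text{ i.e. }\quad m \asymp N^{1/(2\beta +1)}. \end{aligned}$$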

As \(g(0)\ne 0\), choosing \(m_{\mathrm{opt}}=c_TN^{1/(2\beta +1)}\) yields, for fixed T,

$$\begin{aligned} {{\mathbb {E}}}(\Vert {\hat{g}}_{m_{\mathrm{opt}}}-g\Vert ^2_T)\lesssim N^{-2\beta /(2\beta +1)} + \frac{ T}{N}. \end{aligned}$$

Thus, for fixed (not large) T, the estimator \({\hat{g}}_{m_{\mathrm{opt}}}\) is convergent in MISE when N grows to infinity, with rate \(N^{-2\beta /(2\beta +1)} \) and

$$\begin{aligned} \sup _{g\in {{\mathcal {W}}}_1^{per}} \inf _{m\ge 1} {{\mathbb {E}}}(\Vert {\hat{g}}_m-g\Vert ^2_T)\lesssim N^{-2\beta /(2\beta +1)}. \end{aligned}$$

If \(g\in {{\mathcal {W}}}_2^{per}\), then \(g(0)=0\), and choosing m as large as possible we can obtain the rate \(N ^{-1}\) for fixed T.

On the other hand, if \(g\in {{\mathcal {W}}}_3^{per}\), then we must consider the estimator \({{\tilde{g}}}_m\). As \(g\in W^{per}(\beta , L)\), Inequality (15) yields, for the choice \({{\tilde{m}}}_{\mathrm{opt}}={{\tilde{c}}}_T \, N^{1/(2\beta +2)}\), a rate for \({{\tilde{g}}}_{{{\tilde{m}}}_{\mathrm{opt}}}\) of order \(N^{-2\beta /(2\beta +2)}\), that is

$$\begin{aligned} \sup _{g\in {{\mathcal {W}}}_3^{per}} \inf _{m\ge 1} {{\mathbb {E}}}(\Vert {{\tilde{g}}}_m-g\Vert ^2_T)\lesssim N^{-2\beta /(2\beta +2)}. \end{aligned}$$

Clearly \(N^{-2\beta /(2\beta +2)} >N^{-2\beta /(2\beta +1)}\). The rate is slower than on \({{\mathcal {W}}}_1^{per}\), but the constraints are different. All the constants in the rates depend on \(\beta \) and L.

3.2 Rates on Sobolev–Laguerre spaces

Now, look at inequality (13) where \({\hat{g}}_m\) is computed using the (non compactly supported) Laguerre basis. Assume for consistency that \(m^2\lesssim N\) and \(m\le T/6\). The last term is negligible with respect to the variance term \(m^2/N\) and the usual square bias term \(\Vert g-g_m\Vert ^2\). An adequate solution to assess the rate of the bias term is provided by the balls of Sobolev-Laguerre spaces. For \(s\ge 0\), let

$$\begin{aligned} W^s((0, +\infty ),K)=\{h:(0, +\infty )\rightarrow {{\mathbb {R}}}, h \in {{\mathbb {L}}}^2((0, +\infty )), \sum _{k\ge 0} k^s \theta _k^2(h)\le K < +\infty \}\nonumber \\ \end{aligned}$$
(19)

where \(\theta _k(h)=\int _0^{+\infty } h(u)\varphi _k(u)du\). We set

$$\begin{aligned} W^s((0, +\infty )) =\{h:(0, +\infty )\rightarrow {{\mathbb {R}}}, h \in {{\mathbb {L}}}^2((0, +\infty )), \sum _{k\ge 0} k^s \theta _k^2(h)\ < +\infty \} \end{aligned}$$

for the Sobolev-Laguerre space. The link with regularity properties of functions can be seen for s integer. In this case, if \(h:(0, +\infty ) \rightarrow {{\mathbb {R}}}\) belongs to \(L^{2}((0,+\infty ))\),

$$\begin{aligned} \sum _{k\ge 0} k^s (\theta _k(h))^2<+\infty \end{aligned}$$
(20)

is equivalent to the property that h admits derivatives up to order \(s-1\), with \(h^{(s-1)}\) absolutely continuous on \((0,+\infty )\) and for \(m=0, \ldots , s-1\), the functions

$$\begin{aligned} x^{(m+1)/2}(he^x)^{(m+1)}e^{-x} = x^{(m+1)/2}\sum _{j=0}^{m+1} \left( {\begin{array}{c}m+1\\ j\end{array}}\right) h^{(j)} \end{aligned}$$

belong to \({{\mathbb {L}}}^2((0, +\infty ))\). Moreover, for \(m=0,1, \dots , s-1\),

$$\begin{aligned} \Vert x^{(m+1)/2}(he^x)^{(m+1)}e^{-x}\Vert ^2 = \sum _{k\ge m+1} k(k-1)\dots (k-m)\theta _k^2(h). \end{aligned}$$

(see Comte and Genon-Catalot 2018).

Now, consider the classes of functions \( {{\mathcal {W}}}_1=\{ g, \; g\in W^s((0, +\infty ),K), g \text{ satisfies } \text{[H0] } \text{ and } \text{[H1] }\}, \) and \( {{\mathcal {W}}}_2=\{ g, \; g\in W^s((0, +\infty ),K), g \text{ satisfies } \text{[H0] } \text{ but } \text{ not } \text{[H1] }\}.\)

Assume that \(g\in {{\mathcal {W}}}_1\). Then, as g belongs to \(W^s((0, +\infty ),K)\), it holds \(\Vert g-g_m\Vert ^2\le K m^{-s}\). Considering Inequality (13), the minimization of \(m^{-s}+m^2/N\) yields \(m_{opt}= N^{1/(2+s)}\) and a rate of order \(N^{-s/(2+s)}\) for the \({{\mathbb {L}}}^2\)-risk of \({\hat{g}}_m\) on the set \({{\mathcal {W}}}_1\).

The constraint \(m_{opt}= N^{1/(2+s)}\le T/6\) holds for all s as soon as \(T\ge \sqrt{N}\).

Assume that \(g\in {{\mathcal {W}}}_2\). The rate of convergence for the \({{\mathbb {L}}}^2\)-risk must be discussed for \({{\tilde{g}}}_m\), and relies on Inequality (17). Assume that \(m^3\lesssim N\). As g belongs to \(W^s((0, +\infty ),K)\), we still have \(\Vert g-g_m\Vert ^2\le Km^{-s}\). By minimizing \((m^3/N)+m^{-s}\), we find \(m_{opt}= N^{1/(s+3)}\) and a rate of order \(N^{-s/(s+3)}\). The constraint \(T>6m_{opt}\) holds for all s as soon as \(T\ge N^{1/3}\). To sum up this case, for \(T\ge N^{1/3}\),

$$\begin{aligned} \sup _{g\in {{\mathcal {W}}}_2} \inf _{m\in {{\mathbb {N}}}, 1\le m^3\le N} {{\mathbb {E}}}(\Vert {{\tilde{g}}}_m-g\Vert ^2)\lesssim N^{-s/(s+3)}. \end{aligned}$$

All the constants in the rates depend on s and K.

Inequalities (11) and (16) are appealing: the variance terms are smaller and they require fewer conditions. However, they contain a term \(\int _T^{+\infty }g^2(s)ds\): this term is hopefully small for large (not too small) T, but rates of convergence are difficult to discuss. Nevertheless, our model selection procedures rely on these inequalities because the constants g(0)G(T) and \(G^2(T)\) are known in theory and possible to estimate in practice.

Example 1

(continued). Consider the function \(a(t)=t^d\exp {(-t)}\) with \(-1/4<d<1/2\), case where a(0) may not be defined and \(a'\) is not locally square integrable. Then, g belongs to \(W^1((0,+\infty ))\) if, moreover, \(\sqrt{t}(a^2(t)+2a(t)a'(t)) \in {{\mathbb {L}}}^2((0,+\infty ))\), which holds for \(0<d<1/2\). But for these values of d, we can check that g does not belong to \(W^2((0,+\infty ))\) as \(t(a^2(t)+ 2(a')^2(t)+ (a^2)''(t))\) does not belong to \({{\mathbb {L}}}^2((0,+\infty ))\). Therefore, the bias term for such a function is of order smaller than \(m^{-1}\) but larger than \(m^{-2}\), for \(0<d<1/2\).

4 Adaptive procedure under [H0]

As the second estimator can be computed under more general assumptions, we concentrate on this one for finding a data-driven choice of the projection dimension.

The estimator \({{\tilde{g}}}_m\) can be obtained as:

$$\begin{aligned} {{\tilde{g}}}_m= \arg \min _{h\in S_m^{(B)}} \gamma _{N,T}(h), \end{aligned}$$
(21)

for \((B)=(Lag)\) or \((B)=(Trig)\), and where

$$\begin{aligned} \gamma _{N,T}(h)= \Vert h\Vert ^2 + \frac{2}{N}\sum _{i=1}^N \left( \int _0^Th'(u)X_i^2(u)du - h(T)X_i^2(T)\right) . \end{aligned}$$

We consider the sets \({{\mathcal {M}}}_N^{(Lag)}=\{ m\in {{\mathbb {N}}}, m \le N/\log (T)\}\) and \({{\mathcal {M}}}_N^{(Trig)}=\{ m\in {{\mathbb {N}}}, m^2 \le N\}\). By inequalities (15)–(16), the variance term in the \({{\mathbb {L}}}^2\)-risk of all \({{{\tilde{g}}}}_m\) with \(m \in {{\mathcal {M}}}_N^{(B)}\) is bounded, where the superscript (B) indicates the basis: \((B)=(Trig)\) for the trigonometric basis and \((B)=(Lag)\) for the Laguerre basis. Now, we define, for \(\kappa \) a numerical constant,

$$\begin{aligned} {\widetilde{m}}^{(B)} :=\arg \min _{m\in {{\mathcal {M}}}_N^{(B)}} \left( \gamma _{N,T}({{\tilde{g}}}_m) + \mathrm{pen}^{(B)}(m)\right) , \end{aligned}$$
(22)

where

$$\begin{aligned}&\mathrm{pen}^{(Lag)} (m)= \kappa \log (N) \left( G^2(T)+ \int _0^T\frac{G^2(u)}{u} du \right) \frac{m}{N}, \\&\quad \mathrm{pen}^{(Trig)} (m)= \kappa G^2(T)\log (N) \frac{m^2}{NT}. \end{aligned}$$
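
A minimal numerical sketch of this selection rule, in the Laguerre case, is given below; it uses the identity \(\gamma _{N,T}({{\tilde{g}}}_m)=-\Vert {{\tilde{g}}}_m\Vert ^2\) discussed just after, replaces \(G^2(T)\) and \(\int _0^T G^2(u)/u\,du\) by crude plug-in estimators (as in Sect. 5), and takes the calibration constant kappa as an input; the array conventions follow the earlier sketches and all names are illustrative.

```python
import numpy as np

def select_dim_laguerre(coeffs, X, T, kappa):
    """Penalized dimension selection (22) with pen^(Lag).
    coeffs[m-1] is the length-m vector (tilde_theta_0, ..., tilde_theta_{m-1})."""
    N, n_plus_1 = X.shape
    dt = T / (n_plus_1 - 1)
    grid = np.linspace(0.0, T, n_plus_1)
    Ghat = np.mean(X ** 2, axis=0)
    G2T_hat = np.mean(X[:, -1] ** 4) / 3.0                 # E X^4(T) = 3 G^2(T), see (23)
    int_hat = np.trapz(Ghat[1:] ** 2 / grid[1:], dx=dt)    # crude plug-in for int_0^T G^2(u)/u du
    crits = []
    for m, theta in enumerate(coeffs, start=1):
        # gamma_{N,T}(tilde_g_m) = -||tilde_g_m||^2 = -sum_j tilde_theta_j^2
        pen = kappa * np.log(N) * (G2T_hat + int_hat) * m / N
        crits.append(-np.sum(theta ** 2) + pen)
    return int(np.argmin(crits)) + 1
```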

Note that \(\gamma _{N,T}({{\tilde{g}}}_m)=-\Vert {{\tilde{g}}}_m\Vert ^2\). Thus, as \(\Vert g-g_m\Vert ^2=\Vert g\Vert ^2 -\Vert g_m\Vert ^2\), \(-\Vert {{\tilde{g}}}_m\Vert ^2\) provides an estimate of the squared bias, up to a constant. On the other hand, \(\mathrm{pen}^{(B)}(m)\) has the order of the variance, up to the \(\log (N)\) factor. We do not know whether this factor is structural or only due to technical problems in the proofs. Anyway, the choice of \({{\widetilde{m}}}^{(B)}\) mimics the squared bias-variance compromise. The following risk bound holds.

Theorem 1

Assume [H0] and [H2]. Then, there exists a numerical value \(\kappa _0^{(B)}>0\) such that \(\forall \kappa \ge \kappa _0^{(B)}\),

$$\begin{aligned} {{\mathbb {E}}}(\Vert {{\tilde{g}}}_{{{\widetilde{m}}}^{(B)}}-g\Vert ^2)\le & {} 4 \inf _{m\in {{\mathcal {M}}}_N^{(B)}} \left( \Vert g-g_m\Vert ^2 + \mathrm{pen}^{(B)}(m)\right) + C^{(B)}(T,N),\\ C^{(Lag)}(T,N)= & {} 32 G^2(T)\frac{T\log (N)}{N} \!+\! \int _T^{+\infty } g^2(s)ds \!+\! \frac{C}{N} \left( \frac{TG^2(T)}{N} \!+\! \int _0^T\frac{G^2(u)}{u} du\right) \end{aligned}$$

and

$$\begin{aligned} C^{(Trig)}(T,N)=\frac{C}{N}\left( \frac{1}{T}+ \frac{G^2(T)}{T} \frac{1}{N^{1/2}} \right) \end{aligned}$$

where C is a numerical constant.

The term \(G^2(T)\) in the definition of \(\mathrm{pen}^{(Trig)}(m)\) is unknown and must be replaced by an estimator. In practical implementation, we set

$$\begin{aligned} \widehat{G^2(T)}= \frac{1}{3N} \sum _{i=1}^N X_i^4(T). \end{aligned}$$
(23)

Indeed, \({{\mathbb {E}}}(X_1^4(T)) = 3G^2(T)\). From a theoretical point of view, it can be proved that the result of Theorem 1 still holds for the trigonometric basis with this substitution, see Section 4.1.4, Proof of Theorem 4.1 in Comte and Genon-Catalot (2015).

For the implementation of the procedure, we have to fix the constants \(\kappa \) in the penalties (see (22)). The numerical values of \(\kappa _0^{(B)}\), given in the proofs, are too large. In this method, finding the minimal value of \(\kappa \) is a difficult problem. This is why, as is standard, the choice of \(\kappa \) in the penalties is calibrated by preliminary simulations.

Theorem 1 shows that the estimator \({{\tilde{g}}}_{{{{\widetilde{m}}}}^{(B)}}\) is adaptive in the sense that its \({{\mathbb {L}}}^2\)-risk automatically achieves the best compromise between squared bias and variance terms, up to remainder terms \(C^{(B)}(T,N)\). The remainder \( C^{(Trig)}(T,N)\) is clearly negligible, as \(T>1\) is fixed. As already noticed earlier, the term \(C^{(Lag)}(T,N)\) contains T/N and \(\int _T^{+\infty } g^2(s)ds\), which are in conflict: T should be large enough for the latter, but not too large for the former. However, our risk bounds are valid for any T, N.

Another strategy is possible for the Laguerre basis, without [H2], which solves the conflict mentioned above. Let \({{\mathcal {M}}}_N^\star =\{ m, m^3\le N\}\) so that, by inequality (17), the variance term of \({{\tilde{g}}}_m\) is bounded, and define

$$\begin{aligned} m^\star :=\arg \min _{m\in {{\mathcal {M}}}_N^\star } \left( \gamma _{N,T}({{\tilde{g}}}_m) + \mathrm{pen}^\star (m)\right) , \text{ where } \mathrm{pen}^\star (m)= \kappa \log (N)G^2(T) \frac{m^3}{N}. \end{aligned}$$
(24)

Theorem 2

Assume [H0]. Consider the Laguerre basis, and \(T\ge 6 N^{1/3}\). Then, there exists a numerical value \(\kappa _0^\star >0\) such that \(\forall \kappa \ge \kappa _0^\star \),

$$\begin{aligned} {{\mathbb {E}}}(\Vert {{\tilde{g}}}_{m^\star }-g\Vert ^2) \le 4 \inf _{m\in {{\mathcal {M}}}_N^\star } \left( \Vert g-g_m\Vert ^2 + \mathrm{pen}^\star (m)\right) + G^2(T)\frac{C}{N}, \end{aligned}$$

where C is a constant depending on the basis.

Theorem 2 also shows that the estimator \({{\tilde{g}}}_{m^\star }\) is adaptive in the sense that its \({{\mathbb {L}}}^2\)-risk automatically achieves the best compromise between the squared bias and the variance term of inequality (17). The comments after Theorem 1 apply also here.

5 Simulation study

In this section, we implement the adaptive estimators of the previous sections on simulated data. To simulate an exact discrete sampling of \((X_i(t), i=1, \ldots ,N)\) with small sampling interval \(\Delta \), we use the property that the vectors \((X_i(k\Delta ), k=1, \ldots ,n)'\) with \(T=n\Delta \) are i.i.d. centered Gaussian vectors with covariance matrix \(A=(A_{j,k})\) where, for \(1\le j\le k\),

$$\begin{aligned} A_{j,k}= \int _0^{j\Delta }a(j\Delta -u)a(k\Delta -u)du=\int _0^{j\Delta }a(v)a((k-j)\Delta +v)dv. \end{aligned}$$

which can be computed exactly or numerically, depending on the example. Integrals in the estimator formulae are discretized. The following examples of functions a(.) and thus g(.) are considered.

  1. (Ornstein-Uhlenbeck process) \(a_1(t)= \sigma \exp {(-\theta t)}\),

     $$\begin{aligned} A(j,k)= \frac{\sigma ^2\exp {(-\theta k\Delta )}}{2\theta }(\exp {(\theta j\Delta )}-\exp {(-\theta j\Delta )}), \; k\ge j. \end{aligned}$$

     We take \(\sigma =0.5, \theta =0.25\).

  2. \(a_2(t)= (\beta (3,3,t/10)/\omega _2^{1/2})^{1/2}\) where \(\beta (p,q,x)\) is the density of a \(\beta (p,q)\) distribution at point x and \(\omega _2= 14.157\) is such that \(\int _{{{\mathbb {R}}}^+} g_2^2(u)du \approx 1\).

  3. \(a_3(t)=(\frac{1}{2} \beta (3,3, t/3)+ \frac{1}{2} \beta (3,3,t/3-2))^{1/2}\).

  4. \(a_4(t)= 10 b(6t)/(\omega _4)^{0.25}\) with \(b(t) =0.3 \Gamma (3,2, t)+0.7 \Gamma (7,4,t)\) where \(\Gamma (p,q,x)\) is the density of a \(\Gamma (p,q)\) distribution at point x and \(\omega _4= 0.03048\) is such that \(\int _{{{\mathbb {R}}}^+} g_4^2(u)du \approx 1\).

  5. \(a_5(t)= t^{1.25} e^{-t/2}\).

  6. \(a_6(t)= t^{0.25} e^{-t/3}\).

  7. \(a_7(t)= t^{-0.125} e^{-t/5}\).

  8. \(a_8(t)=1/\sqrt{1+t^2}\).

In all cases, recall that \(g_i(t)= a_i^2(t)\). The functions \(a_2\) and \(a_4\) are normalized (constants \(\omega _2, \omega _4\)), in order that \(\int g_i^2(u)du\approx 1\), \(i=2,4\), while for the other functions, this integral falls between 0.5 and 2.5. In Table 1, we compute the values of residual terms of formula (16): the values of \(\int _T^{+\infty } g_i^2(u)du\) are always negligible; but the values of \(TG_i^2(T)/N\) are comparable to the risk values obtained in Table 2, and thus not so small.

Table 1 Order of residual terms, for \(T=10\) and \(N=2000\)

All functions \(g_i\), \(i=1, \dots , 8\), satisfy [H0]. The functions \(g_2\) to \(g_6\) vanish at zero, and \(a_6\) and \(a_7\) do not satisfy [H1]. Thus, the first method (valid under [H0]-[H1]) should work for all functions except \(g_6\) and \(g_7\), with parametric rate (and large selected dimension) for \(g_2\) to \(g_5\). Nevertheless, we implemented both methods for all functions. Note that all functions satisfy [H2].

We also experiment with different settings for (N, T): \(T=n\Delta =10\), \(n=400, \Delta =0.1/4\), with \(N=500, 2000, 8000\).
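
A minimal sketch of this exact simulation scheme is given below (Gaussian vectors with covariance A, drawn via a Cholesky factorization; the kernel a is passed as a vectorized callable and the integrals \(A_{j,k}\) are approximated by a midpoint rule, which suffices for the examples above; all names are illustrative).

```python
import numpy as np

def simulate_paths(a, N, n, delta, rng=None, n_quad=50):
    """Draw N i.i.d. paths (X_i(k*delta), k=0,...,n) of (1): centered Gaussian
    vectors with covariance A[j,k] = int_0^{j*delta} a(v) a((k-j)*delta + v) dv, j <= k."""
    rng = np.random.default_rng(rng)
    h = delta / n_quad
    # midpoint rule avoids evaluating a at 0 (relevant for the fractional examples)
    mid = (np.arange(n * n_quad) + 0.5) * h
    a_mid = a(mid)
    A = np.empty((n, n))
    for j in range(1, n + 1):
        for k in range(j, n + 1):
            s = (k - j) * n_quad
            A[j - 1, k - 1] = h * np.sum(a_mid[:j * n_quad] * a_mid[s:s + j * n_quad])
            A[k - 1, j - 1] = A[j - 1, k - 1]
    L = np.linalg.cholesky(A + 1e-12 * np.eye(n))   # small jitter for numerical stability
    X = rng.standard_normal((N, n)) @ L.T           # rows: (X_i(delta), ..., X_i(n*delta))
    return np.hstack([np.zeros((N, 1)), X])         # X_i(0) = 0
```

For instance, the Ornstein-Uhlenbeck case (1) above corresponds to a = lambda t: 0.5 * np.exp(-0.25 * t), with N = 2000, n = 400 and delta = 0.025.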

Table 2 Simulation results: 100 \(\times \) MISE on lines MISE, 100 \(\times \) std on lines (std), MISE of the oracle (best choice, unknown in practice, computed using the true function) on lines Or, dim \(=\) mean of the selected dimensions, dim Or \(=\) mean of the dimensions associated to the oracle, 200 repetitions

The estimators are computed via the formulae given in Sects. 2.2 and 2.3.

More precisely, inspired by Inequalities (9) and (11), we implement a data-driven estimator relying on \({\hat{g}}_m\) given in Sect. 2.2, with dimension selected as follows: for \((B)= (Lag),\, (Trig)\),

$$\begin{aligned} {\hat{m}}^{(B)}=\arg \min _{1 \le m \le D_{\max }}\left\{ -\Vert {\hat{g}}_m^{(B)}\Vert ^2 + \kappa _1^{(B)} g^\dag (0) \widehat{G(T)} \frac{m}{N}\right\} . \end{aligned}$$

Note that no theoretical result is given in this case. We compute \(({\hat{g}}_m^{(Trig)})_{1\le m\le D_{\max }}\) and \(({\hat{g}}_m^{(Lag)})_{1\le m\le D_{\max }}\), the collections of estimators in the trigonometric and Laguerre bases respectively, with coefficients given by (7) and \(D_{\max }=45\). Note that the first term in the curly bracket estimates the squared bias and the second estimates the main variance term. Moreover, \({\widehat{G}}(T)=(1/N)\sum _{i=1}^N X_i^2(T)\) and \(g^\dag (0)\) is computed using the quadratic variation (see Remark 2)

$$\begin{aligned} g^\dag (0)=\frac{1}{NT}\sum _{i=1}^N \sum _{k=0}^{n-1} \left[ X_i\left( \frac{(k+1)T}{n}\right) -X_i\left( \frac{kT}{n}\right) \right] ^2 \end{aligned}$$

Next, we implement the estimators of Theorem 1. We compute \(({{\tilde{g}}}_m^{(Trig)})_{1\le m\le D_{\max }}\) and \(({{\tilde{g}}}_m^{(Lag)})_{1\le m\le D_{\max }}\) the collection of estimators in trigonometric and Laguerre basis, with coefficients given by (14). We select

$$\begin{aligned} {{\tilde{m}}}^{(Trig)}=\arg \min _m\left\{ -\Vert {{\tilde{g}}}_m^{(Trig)}\Vert ^2 + \kappa _{2}^{(Trig)} \widehat{G^2(T)}\log (N)\frac{m^2}{NT}\right\} \end{aligned}$$

and

$$\begin{aligned} {{\tilde{m}}}^{(Lag)}=\arg \min _m\left\{ -\Vert {{\tilde{g}}}_m^{(Lag)}\Vert ^2 + \kappa _{2}^{(Lag)} \widehat{G^2(T)}\log (T)\log (N)\frac{m}{N}\right\} \end{aligned}$$

where \(\widehat{G^2(T)}\) is defined by (23).

Fig. 1

Examples of 50 estimated curves (in gray-green) using the Laguerre basis, with the two methods (method 1 left, method 2 right, for each pair of plots), for function 3 (first line), function 6 (second line) and function 7 (third line), for \(N=500\) (left plots) and \(N=8000\) (right plots). The bold curve is the true function. Under each plot, the MISE over the 50 paths and the mean of the selected dimensions are given

We do not present results using the procedure of Theorem 2, as the method did not seem stable.

Based on preliminary simulations, the constants are calibrated once and for all to the following values \(\kappa _{1}^{(Lag)}=27\), \(\kappa _{1}^{(Trig)}=6\), \(\kappa _{2}^{(Lag)}=0.11\) and \(\kappa _{2}^{(Trig)}= 0.6\).

Table 2 presents the values of the risks of the adaptive estimators computed for the eight functions, following the two methods (method 1: estimators \({\hat{g}}\), method 2: estimators \({{\tilde{g}}}\)) and using two bases, Laguerre (index L) and trigonometric (index T). For each function, the first line gives the MISE multiplied by 100, over 200 repetitions, with the standard deviation multiplied by 100 in parentheses on the line below. The line “Or” gives the mean of the path-by-path minimal integrated error (computed using the true function). The fourth line provides the mean of the selected dimensions, and “dim Or” the mean of the dimensions associated to the oracle estimators. We can compare lines 1 (MISE) and 3 (Or), and lines 4 (dim) and 5 (dim Or), where MISE and dim should be as close as possible to Or and dim Or.

Naturally, the risk decreases as N increases. Globally, the Laguerre basis performs satisfactorily, and better than the trigonometric one, except for function \(a_2\). Note that the methods are easy to implement and that the computation times are short.

To conclude this section, we provide in Fig. 1 plots illustrating the behaviour of our estimators following the two strategies in the Laguerre basis, for three of the functions in the list, namely the mixed-beta function 3) and two functions of type \(t^d e^{-t/b}\) with \(d=0.25, b=3\) (function 6)) and \(d=-0.25, b=5\) (function 7)). Each pair of plots shows 50 estimators computed by the two methods, together with the true function in bold (red). Two values of N are compared, and the MISEs are given, to make the orders of Table 2 concrete; the improvement from \(N=500\) to \(N=8000\) is obvious in most cases. We note that the first method seems to still work for function 6), contrary to what was expected from the theory. But it fails for function 7), as expected: the estimator is biased. Method 2 always gives good results.

6 Concluding remarks

In this paper, we consider i.i.d. continuous observations of processes \((X_i(t), t\in [0,T]), i=1, \ldots , N\), distributed as the CMA process (1). We build collections of nonparametric estimators of the unknown function \(g=a^2\) by a projection method on finite-dimensional subspaces of \({{\mathbb {L}}}^2({{\mathbb {R}}}^+)\). The subspaces are generated by the trigonometric basis of \({{\mathbb {L}}}^2([0,T])\) or the Laguerre basis of \({{\mathbb {L}}}^2({{\mathbb {R}}}^+)\). After proving various risk bounds for each estimator, we propose a data-driven selection of the dimension of the projection space and prove that it leads to an adaptive estimator. Our methods are implemented on simulated data and show convincing results in terms of risks and plots, with better performance for the estimators in the Laguerre basis.

The consistency of the estimators is ensured for fixed T as N tends to infinity (case of the trigonometric basis), or when both T and N tend to infinity (case of the Laguerre basis) but with T/N not too large. It would be interesting to clarify this point, which has an impact on the risk bounds, as we noticed in the Monte Carlo simulations.

Our proofs rely on the Gaussian character of (1), especially for the adaptive procedure. The generalization to processes other than the Wiener process in (1) is of interest and left for further work. Clearly, the results could be obtained with more general deviation inequalities, rather than the chi-square deviation inequalities specifically used here.

The question of taking into account, from the theoretical point of view, the discretization step used in practice may also be worth investigating. Lastly, there may be developments concerning optimality, but the meaning of optimality in our context would first have to be carefully defined.

7 Proofs

7.1 Proofs of Section 2

Proof of Lemma 1

Using the stochastic differential (6), we write:

$$\begin{aligned} \int _0^{+\infty } \varphi _j(s) X(s)dX(s)= & {} a(0) \int _0^{+\infty } \varphi _j(s) X(s)dW(s)\\&\quad + \int _0^{+\infty } \varphi _j(s) X(s) \int _0^{s} a'(s-u)dW(u) ds\\ \end{aligned}$$

Therefore, as \({{\mathbb {E}}}\int _0^{+\infty } \varphi _j^2(s)X^2(s)ds =\int _0^{+\infty } \varphi _j^2(s)G(s)ds\le \Vert a\Vert ^2 <+\infty \),

$$\begin{aligned} {{\mathbb {E}}}\int _0^{+\infty } \varphi _j(s) X(s)dX(s)= & {} \int _0^{+\infty } \varphi _j(s) \int _0^s a(s-u)a'(s-u)du \, ds \\= & {} \frac{1}{2}\int _0^{+\infty } \varphi _j(s)(a^2(s)-a^2(0))ds, \end{aligned}$$

which gives the result. \(\square \)

Proof of Proposition 1

We consider that \((\varphi _j)=(\varphi _{j,T})\) is the trigonometric basis on [0, T]. In this case, \({{\mathbb {E}}}{\hat{\theta }}_j=\theta _j\), and we can write \({{\mathbb {E}}}\Vert {\hat{g}}_m-g\Vert _T^2= {{\mathbb {E}}}\Vert {\hat{g}}_m - {{\mathbb {E}}} {\hat{g}}_m\Vert ^2+ \Vert g_m-g\Vert _T^2.\) We have, setting \(X=X_1\),

$$\begin{aligned} {{\mathbb {E}}}\Vert {\hat{g}}_m - {{\mathbb {E}}} {\hat{g}}_m\Vert ^2= & {} \frac{1}{N} \sum _{j=0}^{m-1} \text{ Var } \left( 2\int _0^T \varphi _j(s)X(s)dX(s)\right) \nonumber \\\le & {} \frac{1}{N} \sum _{j=0}^{m-1} {{\mathbb {E}}}\left( 2\int _0^T \varphi _j(s)X(s)dX(s)\right) ^2. \end{aligned}$$
(25)

(Note that for functions on \(S_{m,T}\), the norms \(\Vert .\Vert _T\) and \(\Vert .\Vert \) are identical). We have:

$$\begin{aligned} {{\mathbb {E}}}\left( \int _0^T \varphi _j(s)X(s)dX(s)\right) ^2\le & {} 2 g(0) \int _0^T \varphi _j^2(s){{\mathbb {E}}}(X^2(s))ds \nonumber \\&\quad +2 {{\mathbb {E}}} \left[ \left( \int _0^T \varphi _j(s) X(s)Y(s) ds\right) ^2\right] \end{aligned}$$
(26)

where \(Y(s)= \int _0^sa'(s-u)dW(u)\). We have \({{\mathbb {E}}}(X^2(s))=G(s)\le G(T)\le \Vert a\Vert ^2\). Now, using that \((\varphi _j)=(\varphi _{j,T})\) is an orthonormal basis of \({{\mathbb {L}}}^2([0,T])\)

$$\begin{aligned} \sum _{j=0}^{m-1}{{\mathbb {E}}} \left( \int _0^T \varphi _j(s) X(s)Y(s) ds\right) ^2= & {} {{\mathbb {E}}}\left[ \sum _{j=0}^{m-1}\left( \int _0^T \varphi _j(s) X(s)Y(s) ds\right) ^2\right] \\&\quad \le {{\mathbb {E}}}\int _0^TX^2(s)Y^2(s) ds. \end{aligned}$$

As (X(s), Y(s)) is a Gaussian centered vector, we know that:

$$\begin{aligned}&{{\mathbb {E}}}(X^2(s)Y^2(s)) = 2 \left[ {{\mathbb {E}}}(X(s)Y(s))\right] ^2 + {{\mathbb {E}}}(X^2(s)){{\mathbb {E}}}(Y^2(s))\nonumber \\&\quad = \frac{1}{4}\left( g(s)-g(0)\right) ^2+ G(s) G_1(s), \end{aligned}$$
(27)

with \(G_1(s)=\int _0^s (a'(u))^2du\le \Vert a'\Vert ^2\). Therefore, if \(g(0)\ne 0\)

$$\begin{aligned} {{\mathbb {E}}}\Vert {\hat{g}}_m - {{\mathbb {E}}} {\hat{g}}_m\Vert ^2\le & {} \frac{4}{N}\left( 2 g(0)G(T) m+ 2 T G(T) G_1(T)+ Tg^2(0)+ \Vert g\Vert ^2_T\right) . \end{aligned}$$

Therefore, we obtain (9).

If \(g(0)=0\), (26) becomes

$$\begin{aligned} {{\mathbb {E}}}\left( \int _0^T \varphi _j(s)X(s)dX(s)\right) ^2= {{\mathbb {E}}} \left[ \left( \int _0^T \varphi _j(s) X(s)Y(s) ds\right) ^2\right] \end{aligned}$$

and thus

$$\begin{aligned} {{\mathbb {E}}}\Vert {\hat{g}}_m - {{\mathbb {E}}} {\hat{g}}_m\Vert ^2 \le 4 \frac{TG(T)G_1(T)}{N} + \frac{\Vert g\Vert _T^2}{N}, \end{aligned}$$

which gives inequality (10). \(\square \)

Proof of Proposition 2

Now, we look at the case of a basis of \({{\mathbb {L}}}^2({{\mathbb {R}}}^+)\). The estimator \({\hat{\theta }}_j\) is no longer unbiased. We write \({\hat{g}}_m-g= {\hat{g}}_m - {{\mathbb {E}}} {\hat{g}}_m + {{\mathbb {E}}} {\hat{g}}_m-g_m + g_m-g\) and

$$\begin{aligned} {{\mathbb {E}}}(\Vert {{\hat{g}}}_m - g\Vert ^2) = \Vert g_m-g\Vert ^2 + {{\mathbb {E}}}\Vert {\hat{g}}_m - {{\mathbb {E}}} {\hat{g}}_m\Vert ^2 + \Vert {{\mathbb {E}}} {\hat{g}}_m-g_m\Vert ^2. \end{aligned}$$
(28)

The first term is the usual squared bias term. The middle term is a variance term, which can be treated as in the previous proposition. The last term is an additional bias term, due to the truncation of the integrals. We have:

$$\begin{aligned} \Vert {{\mathbb {E}}} {\hat{g}}_m-g_m\Vert ^2 = \sum _{j=0}^{m-1} ({{\mathbb {E}}} {\hat{\theta }}_j -\theta _j)^2= \sum _{j=0}^{m-1}\left( \int _T^{+\infty } g(s) \varphi _j(s)ds\right) ^2 \le \int _T^{+\infty } g^2(s)ds, \end{aligned}$$
(29)

and we obtain inequalities (11) and (12).

Now, consider the Laguerre basis. To get (13), we bound differently (26). We write:

$$\begin{aligned} \nonumber&\left( {{\mathbb {E}}} \int _0^T \varphi _j(s) X(s)Y(s) ds\right) ^2\\ \nonumber&\quad = \int _{[0,T]^2}\varphi _j(s)\varphi _j(u){{\mathbb {E}}} [X(s)Y(s) X(u)Y(u)] dsdu\\ \nonumber&\quad \le \int _{[0,T]^2}|\varphi _j(s)\varphi _j(u)|\left\{ {{\mathbb {E}}}[(X(s)Y(s))^2]{{\mathbb {E}}}[(X(u)Y(u))^2]\right\} ^{1/2}dsdu \\&\quad = \left( \int _0^T |\varphi _j(s)|\left\{ {{\mathbb {E}}}[(X(s)Y(s))^2)]\right\} ^{1/2} ds\right) ^2. \end{aligned}$$
(30)

By (27) and the assumptions, \({{\mathbb {E}}}(X(s)Y(s))^2\le \frac{1}{2}(g^2(0)+ \Vert g\Vert _\infty ^2)+ \Vert a\Vert ^2\Vert a'\Vert ^2\) is bounded. Therefore, we need to bound \(\int _0^T |\varphi _j(s)|ds\). For this, we split each integral according to the inequalities of Askey and Wainger (1965) recalled in Sect. 8 (we assume without loss of generality that they hold for all j). We have:

$$\begin{aligned} \int _0^T |\varphi _j(s)|ds= \int _0^{2T} |\varphi _j(x/2)|dx/2= \frac{1}{2}(\sum _{\ell =1}^6 I_\ell ) \end{aligned}$$

and bound each term. Setting \(\nu _j=4j+2\),

$$\begin{aligned} I_1= & {} \int _0^{1/\nu _j}dx= \nu _j^{-1}, \quad I_2= \int _{1/\nu _j}^{\nu _j/2}(x\nu _j)^{-1/4}dx\le \frac{2^{5/4}}{3}\nu _j^{1/2},\\ I_3= & {} \int _{\nu _j/2}^{\nu _j- \nu _j^{1/3}}\nu _j^{-1/4}(\nu _j-x)^{-1/4}dx\le 3/4, \\ I_4= & {} \int _{\nu _j-\nu _j^{1/3}}^{\nu _j+\nu _j^{1/3}}\nu _j^{-1/3}dx= 2,\\ I_5= & {} \int _{\nu _j+\nu _j^{1/3}}^{3\nu _j/2}\frac{exp{(-\gamma _1\nu _j^{-1/2}(x-\nu _j)^{3/2})}}{\nu _j^{1/4}(x-\nu _j)^{1/4}}dx\le \nu _j ^{1/6} \frac{\exp {(-\gamma _1)}}{\gamma _1}, \\ I_6= & {} \int _{3\nu _j/2}^T \exp {(-\gamma _2 x)} dx \le \frac{\exp {(-3\gamma _2\nu _j/2)}}{\gamma _2}. \end{aligned}$$

Consequently, for \(j=0, \ldots ,m-1\) and \(T\ge 6(m-1)+3=6m-3\),

$$\begin{aligned} \int _0^T |\varphi _j(s)|ds \lesssim j^{1/2}. \end{aligned}$$
(31)

Finally,

$$\begin{aligned} \sum _{j=0}^{m-1} \left( \int _0^T |\varphi _j(s)|ds\right) ^2\lesssim m^2 \end{aligned}$$
(32)

Using again the inequalities of Askey and Wainger (1965) (see Sect. 8), for \(T\ge 6m-3\), for all \(j \in \{0, 1, \ldots , m-1\}\), \(|\varphi _j(x/2) |\le \exp {(-\gamma _2 x)}\) and

$$\begin{aligned} \left| \int _T^{+\infty } \varphi _j(s)g(s)ds\right|\le & {} \int _T^{+\infty } |\varphi _j(s)|g(s)ds\le \exp {(-2\gamma _2 T)} \Vert a\Vert ^2\\\le & {} \exp {(-2\gamma _2 (6(m-1)+3))} \Vert a\Vert ^2. \end{aligned}$$

The additional bias term (29) is therefore bounded as follows:

$$\begin{aligned} \sum _{j=0}^{m-1}\left[ \int _T^{+\infty } \varphi _j(s)g(s)ds\right] ^2\lesssim \Vert a\Vert ^2 m \; \exp {(-12\gamma _2m)}. \end{aligned}$$
(33)

We thus obtain (13) by joining (30), (32) and (33).\(\square \)

Proof of Lemma 2

We have

$$\begin{aligned} {{\mathbb {E}}}(\int _0^{T} \varphi '_{j}(s) X^2(s)ds)= & {} \int _0^{T} \varphi '_{j}(s) ( \int _0^s g(s-u)du) ds= \int _0^{T} \varphi '_{j}(s) G(s) ds\\= & {} [\varphi _{j}(s)G(s)]_0^{T}- \langle g, \varphi _j\rangle _T= \varphi _{j}(T)G(T)- \langle g, \varphi _{j}\rangle _T, \end{aligned}$$

which is the result. \(\square \)

Proof of Proposition 3

Assume that \((\varphi _j=\varphi _{j,T})\) is the trigonometric basis. Then, \({\tilde{\theta }}_j\) is an unbiased estimator of \(\theta _j\). We only need to study the variance term of the risk.

$$\begin{aligned} {{\mathbb {E}}}\Vert {{\tilde{g}}}_m - {{\mathbb {E}}} {{\tilde{g}}}_m\Vert ^2\le & {} \frac{2}{N}\left( \sum _{j=0}^{m-1}{{\mathbb {E}}}\left( \int _0^T \varphi '_{j,T}(s) X^2(s)ds\right) ^2+\sum _{j=0}^{m-1}\varphi _{j,T}^2(T){{\mathbb {E}}}X^4(T)\right) \end{aligned}$$

where \({{\mathbb {E}}}X^4(T)= 3\left( \int _0^Ta^2(s)ds\right) ^2=3G^2(T)\) and \(\sum _{j=0}^{m-1}\varphi _j^2(T)=m/T\).

We have

$$\begin{aligned} \varphi '_{0,T}(s)= 0, \;\;\varphi '_{2j,T}(s)\!=\! (2\pi j/T) \varphi _{2j-1,T}(s),\;\; \varphi '_{2j-1,T}(s)\!=\! -\!(2\pi j/T) \varphi _{2j,T}(s), \;\;j\ge 1.\nonumber \\ \end{aligned}$$
(34)

Proceeding as in Proposition 1 (projection argument), we obtain

$$\begin{aligned} \sum _{j=0}^{m-1}{{\mathbb {E}}}\left( \int _0^T \varphi '_{j,T}(s) X^2(s)ds\right) ^2\le \frac{4\pi ^2 m^2}{T^2} {{\mathbb {E}}}\int _0^TX^4(s)ds\le 3 G^2(T)\frac{4\pi ^2 m^2}{T}, \end{aligned}$$

using that \({{\mathbb {E}}}X^4(s)= 3\left( \int _0^s \, a^2(u)du\right) ^2\le 3 G^2(T)\) for \(s\le T\). This gives (15).

Now, we look at the case of the Laguerre basis on \({{\mathbb {L}}}^2({{\mathbb {R}}}^+)\). We start as above by

$$\begin{aligned} {{\mathbb {E}}}(\Vert {{{\tilde{g}}}}_m - g\Vert ^2)= {{\mathbb {E}}}\Vert {{\tilde{g}}}_m - {{\mathbb {E}}} {{\tilde{g}}}_m\Vert ^2 + \Vert {{\mathbb {E}}} {{\tilde{g}}}_m-g_m\Vert ^2 + \Vert g_m-g\Vert ^2. \end{aligned}$$

We get

$$\begin{aligned} {{\mathbb {E}}}\Vert {{\tilde{g}}}_m - {{\mathbb {E}}} {{\tilde{g}}}_m\Vert ^2= & {} \frac{1}{N} \sum _{j=0}^{m-1}\mathrm{Var}\left( \int _0^T \ell _j'(s) X_1^2(s)ds- X_1^2(T)\ell _j(T)\right) \\&\le \frac{2}{N} \sum _{j=0}^{m-1}{{\mathbb {E}}}\left[ \left( \int _0^T \ell _j'(s) X_1^2(s)ds\right) ^2\right] \\&+ \frac{2}{N} \sum _{j=0}^{m-1}\ell _j^2(T) {{\mathbb {E}}}[X_1^4(T)]:= {{\mathbb {T}}}_1+{{\mathbb {T}}}_2. \end{aligned}$$

Using that \({{\mathbb {E}}}X^4(T)= 3\left( \int _0^Ta^2(s)ds\right) ^2= 3 G^2(T)\) and \(|\ell _j|\le \sqrt{2}\), we get

$$\begin{aligned} {{\mathbb {T}}}_2 \le 12 G^2(T) \frac{m}{N}. \end{aligned}$$

Next, we use that the Laguerre basis satisfies \(\ell '_0(x)=-\ell _0(x)\) and \(\ell '_j(x)= -\ell _j(x) - \sqrt{2j/x} \ell _{j-1}^{(1)}(x)\) for \(j\ge 1\) where \((\ell _{k}^{(1)}(x), k\ge 0)\) is the Laguerre basis with index 1 (see Sect. 8) and we find

$$\begin{aligned} {{\mathbb {T}}}_1\le & {} \frac{4}{N} \sum _{j=0}^{m-1} {{\mathbb {E}}}\left[ \left( \int _0^T \ell _j(s)X_1^2(s)ds\right) ^2\right] + \frac{4}{N} \sum _{j=1}^{m-1} {{\mathbb {E}}}\left[ \left( \int _0^T \ell _{j-1}^{(1)}(s)\sqrt{\frac{2j}{s}}X_1^2(s)ds\right) ^2\right] \\\le & {} \frac{4}{N} {{\mathbb {E}}}\!\left( \int _0^T X_1^4(s)ds\right) \!+\! \frac{8m}{N} {{\mathbb {E}}}\left( \int _0^T \frac{X_1^4(s)}{s} ds\right) \!=\!12TG^2(T)\frac{1}{N} \!+\! 24 \frac{m}{N} \int _0^T \frac{G^2(s)}{s} ds. \end{aligned}$$

Under [H2], we obtain

$$\begin{aligned} {{\mathbb {T}}}_1\le 12TG^2(T)\frac{1}{N} + 24(c_0 + \log (T) G^2(T)) \frac{m}{N}. \end{aligned}$$

Finally, the variance term is bounded by

$$\begin{aligned} {{\mathbb {E}}}\Vert {{\tilde{g}}}_m - {{\mathbb {E}}} {{\tilde{g}}}_m\Vert ^2 \le 12 \left( G^2(T) \left( 2 \log (T)+1\right) + 2c_0\right) \frac{m}{N} + 12G^2(T) \frac{T }{N}. \end{aligned}$$

If [H2] does not hold and \(T\ge 6m-3\), we can bound differently the variance and bias terms.

$$\begin{aligned}&\sum _{j=0}^{m-1}{{\mathbb {E}}}\left( \int _0^T\ell _j'(s)X^2(s)ds\right) ^2\\&\quad = \int _{[0,T]^2} \sum _{j=0}^{m-1} \ell '_j(s)\ell '_j(u) {{\mathbb {E}}}[X^2(s)X^2(u)] ds \,du \\&\quad \le \int _{[0,T]^2}\left[ \sum _{j=0}^{m-1}(\ell _j'(s))^2{{\mathbb {E}}}(X^4(s))\sum _{j=0}^{m-1}(\ell _j'(u))^2{{\mathbb {E}}}(X^4(u))\right] ^{1/2}dsdu \\&\quad = \left( \int _0^T \left[ \sum _{j=0}^{m-1}(\ell _j'(s))^2{{\mathbb {E}}}(X^4(s))\right] ^{1/2} ds\right) ^2\\&\quad \le 3 G^2(T) \left( \int _0^T \left[ \sum _{j=0}^{m-1}(\ell _j'(s))^2\right] ^{1/2} ds\right) ^2 \end{aligned}$$

We decompose the integral to obtain

$$\begin{aligned} \left( \int _0^T \left[ \sum _{j=0}^{m-1}(\ell _j'(s))^2\right] ^{1/2} ds\right) ^2\le 2 \left( \int _0^{6m-3} \ldots \right) ^2 + 2\left( \int _{6m-3}^T \ldots \right) ^2, \end{aligned}$$

and bound each term. Using (47) and again the inequalities of Askey and Wanger (1965)(see Sect. 8), we get that, for \(s\ge 6m-3\), \(|\varphi '_j(s)|\le 2\sum _{k=0}^{j} |\ell _j(s)|\le 2(j+1)\exp {(-\gamma _2s)}\). Thus, \(\sum _{j=0}^{m-1}(\ell _j'(s))^2\le 4m^3 \exp {(-2\gamma _2s)}\). So,

$$\begin{aligned} \left( \int _{6m-3}^T\left[ \sum _{j=0}^{m-1}(\ell _j'(s))^2\right] ^{1/2} ds\right) ^2 \le \frac{4m^3}{\gamma _2^2} \exp {(-(12m-6)\gamma _2)}. \end{aligned}$$
(35)

Now,

$$\begin{aligned}&\left( \int _{0}^{6m-3}\left[ \sum _{j=0}^{m-1}(\ell _j'(s))^2\right] ^{1/2} ds\right) ^2 \nonumber \\&\quad \le (6m-3) \int _0^{+\infty }\sum _{j=0}^{m-1}(\ell _j'(s))^2\, ds = (6m-3) \sum _{j=0}^{m-1}(1+4j)\le 12m^3. \end{aligned}$$
(36)

Finally, we get

$$\begin{aligned} {{\mathbb {E}}}\Vert {{\tilde{g}}}_m - {{\mathbb {E}}} {{\tilde{g}}}_m\Vert ^2 \le \frac{3}{N} G^2(T)\left( 12m^3 +\frac{4m^3}{\gamma _2^2} \exp {(-(12m-6)\gamma _2)}\right) \end{aligned}$$
(37)

So, we have the two variance bounds.

For the bias term, we have \({{\mathbb {E}}} {\tilde{\theta }}_j= \theta _j -\ell _j(T)G(T)- \int _T^{+\infty } \ell _j(s)g(s)ds\) and \(G(T)\le G(+\infty )= \Vert a\Vert ^2\). Moreover, inequality (33) still holds. Combining the variance and bias terms, we obtain (16) and (17). \(\square \)

7.2 Proof of Theorem 1

Let us state a preliminary Lemma:

Lemma 3

Let \(V_N=\sum _{i=1}^N (X_i^2-1)\) where \(X_i\) are i.i.d. standard Gaussian variables. Then for all \(\varepsilon \in (0,1]\),

$$\begin{aligned} {{\mathbb {P}}}(|V_N| \ge N\varepsilon ) \le 2 \exp \left( -\frac{N\varepsilon ^2}{8} \right) . \end{aligned}$$

Proof of Lemma 3

By Lemma 1 and Inequalities (4.3)–(4.4) in Laurent and Massart (2000), we have, for any \(u>0\),

$$\begin{aligned} {{\mathbb {P}}}(|V_N|\ge 2\sqrt{Nu} + 2 u)\le 2\exp (-u). \end{aligned}$$

Thus, setting \(u=Nx\), we have, for any \(x>0\), \({{\mathbb {P}}}(|V_N|\ge 2N\sqrt{x} + 2 Nx)\le 2\exp (-Nx)\). Now, setting \(N\varepsilon = 2N(x+\sqrt{x})\) and using Birgé and Massart (1998), Lemma 8, Inequality (7.14) with \(v=\sqrt{2}\) and \(c=2\), we find

$$\begin{aligned} {{\mathbb {P}}}(|V_N|\ge N\varepsilon )\le 2\exp \left( -\frac{N\varepsilon ^2}{4(1+\varepsilon )}\right) , \end{aligned}$$

and the result follows. \(\square \)
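
The concentration bound of Lemma 3 can be illustrated by simulation; the following sketch (an illustration only, with arbitrary values of N and \(\varepsilon \)) compares the empirical tail probability with the bound \(2\exp (-N\varepsilon ^2/8)\).

```python
# Monte Carlo illustration of Lemma 3: V_N = sum_i (X_i^2 - 1) has the law of
# chi2(N) - N; its empirical tail is compared with 2*exp(-N*eps^2/8).
import numpy as np

rng = np.random.default_rng(1)
N, n_mc, eps = 100, 200_000, 0.5                  # arbitrary illustrative values

V_N = rng.chisquare(df=N, size=n_mc) - N          # same law as sum_i (X_i^2 - 1)
empirical = np.mean(np.abs(V_N) >= N * eps)
print("P(|V_N| >= N*eps) ~", empirical, "  bound:", 2 * np.exp(-N * eps ** 2 / 8))
```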

7.2.1 Case of Laguerre basis

Note that, as \(G(0)=0\) and \(h(+\infty )=0\), \(\langle h,g \rangle = - \langle h',G\rangle \). Therefore,

$$\begin{aligned} \gamma _{N,T}(h)= \Vert h\Vert ^2- 2 \langle h,g \rangle -2 \nu _{N,T}(h) + 2 R_T(h) \end{aligned}$$

where \(\nu _{N,T}(h)=\nu _{N,1}(h)+ \nu _{N,2}(h)\),

$$\begin{aligned} \nu _{N,1}(h)= & {} -\frac{1}{N}\sum _{i=1}^N \int _0^Th'(u)[X_i^2(u)-G(u)]du,\nonumber \\ \nu _{N,2}(h)= & {} \frac{1}{N} \sum _{i=1}^N h(T) (X_i^2(T)-G(T)), \end{aligned}$$
(38)

and

$$\begin{aligned} R_T(h)= \int _T^{+\infty }h(u)g(u)du. \end{aligned}$$

Therefore,

$$\begin{aligned} \gamma _{N,T}(h_1)- \gamma _{N,T}(h_2)= \Vert h_1-g\Vert ^2- \Vert h_2-g\Vert ^2- 2 \nu _{N,T}(h_1-h_2) +2 R_T(h_1-h_2).\nonumber \\ \end{aligned}$$
(39)

Using the definition of \({{\tilde{m}}} ={{\tilde{m}}}^{(Lag)}\), we have

$$\begin{aligned} \gamma _{N,T}({{\tilde{g}}}_{{\widetilde{m}}}) + \mathrm{pen}({\widetilde{m}}) \le \gamma _{N,T}({{\tilde{g}}}_{ m}) + \mathrm{pen}( m), \end{aligned}$$

where for simplicity \(\mathrm{pen}=\mathrm{pen}^{(Lag)}\). Since \(\gamma _{N,T}({{\tilde{g}}}_{m})\le \gamma _{N,T}(g_m)\) for all \(g_m\in S_m\), we deduce from (39) that

$$\begin{aligned} \Vert {{\tilde{g}}}_{{\widetilde{m}}}- g\Vert ^2\le & {} \Vert g_m-g\Vert ^2 + 2 \nu _{N,T}({{\tilde{g}}}_{{\widetilde{m}}}-g_m) -2 R_T({{\tilde{g}}}_{{\widetilde{m}}}-g_m) + \mathrm{pen}( m)-\mathrm{pen}({\widetilde{m}}). \end{aligned}$$

Let \(B_m= \{h \in S_m, \Vert h\Vert \le 1\}\). We use that

$$\begin{aligned} 2| \nu _{N,T}({{\tilde{g}}}_{{\widetilde{m}}}-g_m)|\le & {} \frac{1}{8} \Vert {{\tilde{g}}}_{{\widetilde{m}}}-g_m\Vert ^2 +8 \sup _{h\in B_{{\widetilde{m}} \vee m}} \nu ^2_{N,T}(h) \\\le & {} \frac{1}{4} (\Vert {{\tilde{g}}}_{{\widetilde{m}}}-g\Vert ^2 + \Vert g-g_m\Vert ^2) +8 \sup _{h\in B_{{\widetilde{m}} \vee m}} \nu ^2_{N,T}(h) \\ 2 |R_T({{\tilde{g}}}_{{\widetilde{m}}}-g_m) |\le & {} \frac{1}{4} (\Vert {{\tilde{g}}}_{{\widetilde{m}}}-g\Vert ^2 + \Vert g-g_m\Vert ^2) +8 \sup _{h\in B_{{\widetilde{m}} \vee m}} R^2_{T}(h). \end{aligned}$$

We have \(\sup _{h\in B_m} R^2_T(h)\le \int _T^{+\infty } g^2(u)du\) so that

$$\begin{aligned} {{\mathbb {E}}}\left( \sup _{h\in B_{{\widetilde{m}} \vee m}} R^2_{T}(h)\right) \le \int _T^{+\infty } g^2(u)du. \end{aligned}$$

Gathering terms yields

$$\begin{aligned} \frac{1}{2} \Vert {{\tilde{g}}}_{{\widetilde{m}}}- g\Vert ^2\le & {} \frac{3}{2} \Vert g_m-g\Vert ^2 + \mathrm{pen}( m) +8 \int _T^{+\infty } g^2(u)du\\&+ 8 (\sup _{h\in B_{{\widetilde{m}} \vee m}} \nu ^2_{N,T}(h) -p^{(Lag)}(m,{{{\widetilde{m}}}}) ) + 8 p^{(Lag)}(m,{{{\widetilde{m}}}}) -\mathrm{pen}({\widetilde{m}}). \end{aligned}$$

where \(p^{(Lag)}(m,m')=p_1^{(Lag)}(m,m')+p_2^{(Lag)}(m,m')\),

$$\begin{aligned} p_1^{(Lag)}(m,m')= & {} 128 \log (N)\int _0^T \frac{G^2(u)}{u} du \frac{m\vee m'}{N} , \\ p_2^{(Lag)}(m,m')= & {} 32 G^2(T)\log (N) \frac{m\vee m'}{N}. \end{aligned}$$

Now we use that \(8p^{(Lag)}(m,m')\le \mathrm{pen}(m)+ \mathrm{pen}(m')\) for \(\kappa \ge \kappa _0^{(Lag)}=8\times 128\), and the result of the following Lemma:

Lemma 4

Under the Assumptions of Theorem 1, for \(\ell =1,2\),

$$\begin{aligned} {{\mathbb {E}}}\left( \sup _{h\in B_{{\widetilde{m}} \vee m}} \nu _{N,\ell }^2(h) - p^{(Lag)}_\ell (m, {{\widetilde{m}}})\right) _+ \le C_\ell ^{(Lag)}(T,N), \end{aligned}$$

where

$$\begin{aligned} C_1^{(Lag)}(T,N)= & {} \frac{C}{N} \left( \frac{TG^2(T)}{N} + \int _0^T\frac{G^2(u)}{u} du\right) + \frac{C\log {N}}{N}TG^2(T) , \\ C_2^{(Lag)}(T,N)= & {} \frac{C}{N}G^2(T) \end{aligned}$$

and C is a positive numerical constant.

We then obtain

$$\begin{aligned}&{{\mathbb {E}}}( \Vert {{\tilde{g}}}_{{\widetilde{m}}}- g\Vert ^2)\\&\quad \le 3 \Vert g_m-g\Vert ^2 + 4 \mathrm{pen}( m) +16 \int _T^{+\infty } g^2(u)du + 32(C_1^{(Lag)}(T,N)+C_2^{(Lag)}(T,N)), \end{aligned}$$

which ends the proof of Theorem 1 in the Laguerre case. \(\square \)


Proof of Lemma 4

Let us define

$$\begin{aligned} Z_N(u)=\frac{1}{N} \sum _{i=1}^N \left( \frac{X_i^2(u)}{G(u)} -1\right) , \end{aligned}$$

which, for all u, is distributed as \((\chi ^2(N)-N)/N\), and set

$$\begin{aligned} A_N(u)=\left\{ Z_N^2(u)\le 16 \frac{\log (N)}{N}\right\} . \end{aligned}$$

By Lemma 3, \({{\mathbb {P}}}(A_N(u)^c)\le 2N^{-2}\) provided that \(16\log (N)/N\le 1\) i.e. \(N\ge 68\).
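
The threshold \(N\ge 68\) is elementary arithmetic; as a small illustration (not part of the proof), the following lines check it and show that the bound of Lemma 3 applied with \(\varepsilon =4\sqrt{\log (N)/N}\) indeed equals \(2/N^2\).

```python
# eps = 4*sqrt(log(N)/N) <= 1 exactly when 16*log(N) <= N, i.e. N >= 68; then
# Lemma 3 gives 2*exp(-N*eps^2/8) = 2*exp(-2*log N) = 2/N^2 (natural logarithm).
import math

for N in (67, 68, 69):
    eps = 4 * math.sqrt(math.log(N) / N)
    print(N, "eps <= 1:", eps <= 1,
          " Lemma 3 bound:", 2 * math.exp(-N * eps ** 2 / 8),
          " 2/N^2:", 2 / N ** 2)
```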

Now we can write \(\nu _{N,1}(h)= - \int _0^T G(u)h'(u)Z_N(u)du\) and split it

$$\begin{aligned} \nu _{N,1}(h)= & {} - \int _0^T G(u)h'(u)Z_N(u){{\mathbf {1}}}_{A_N(u)}du - \int _0^T G(u)h'(u)Z_N(u){{\mathbf {1}}}_{A_N(u)^c}du \nonumber \\ := & {} \nu _{N,1,1}(h)+\nu _{N,1,2}(h). \end{aligned}$$
(40)

Then

$$\begin{aligned}&\left( \sup _{h\in B_{{\widetilde{m}} \vee m}} \nu ^2_{N,1}(h) -p_1^{(Lag)}(m,{{{\widetilde{m}}}}) \right) _+ \le \left( 2\sup _{h\in B_{{\widetilde{m}} \vee m}} \nu ^2_{N,1,1}(h) -p_1^{(Lag)}(m,{{{\widetilde{m}}}}) \right) _+ \\&\quad + 2\sup _{h\in B_{{\widetilde{m}} \vee m}} \nu ^2_{N,1,2}(h). \end{aligned}$$

With \(B(u):= G(u)Z_N(u)\mathbf{1}_{A_N(u)}\) and using formula (46), we have

$$\begin{aligned} \sup _{h\in B_{{\widetilde{m}} \vee m}}\nu _{N,1,1}^2(h)\le & {} \sum _{j=0}^{{\widetilde{m}} \vee m-1}\left( \int _0^T B(u)\ell '_j(u)du \right) ^2 \\\le & {} 2 \sum _{j=0}^{{\widetilde{m}} \vee m-1}\left( \int _0^T B(u)\ell _j(u)du \right) ^2 \!+\!2\sum _{j=1}^{{\widetilde{m}} \vee m\!-\!1} (2j) \left( \int _0^T \frac{B(u)}{\sqrt{u}}\ell ^{(1)}_{j-1}(u)du \right) ^2\\\le & {} 2\int _0^T B^2(u)du + 4({\widetilde{m}} \vee m) \int _0^T \frac{B^2(u)}{u} du. \end{aligned}$$

Now, using the definition of \(A_N(u)\),

$$\begin{aligned} \sup _{h\in B_{{\widetilde{m}} \vee m}}\nu _{N,1,1}^2(h) \le 32 G^2(T) \frac{T \log (N)}{N} + 64 \int _0^T \frac{G^2(u)}{u} du \frac{ ({\widetilde{m}} \vee m) \log (N)}{N}. \end{aligned}$$

As a consequence

$$\begin{aligned} {{\mathbb {E}}}\left[ \left( 2 \sup _{h\in B_{{\widetilde{m}} \vee m}} \nu ^2_{N,1,1}(h) -p_1^{(Lag)}(m,{{{\widetilde{m}}}}) \right) _+ \right] \le 64 G^2(T) \frac{T \log (N)}{N}. \end{aligned}$$
(41)

Similarly, for \(C(u):= G(u)Z_N(u)\mathbf{1}_{A_N(u)^c}\), we have

$$\begin{aligned} \sup _{h\in B_{{\widetilde{m}} \vee m}}\nu _{N,1,2}^2(h)\le 2\int _0^T C^2(u)du + 4N \int _0^T \frac{C^2(u)}{u} du. \end{aligned}$$

Now, by the Rosenthal Inequality (see Hall and Heyde 1980), \({{\mathbb {E}}}(Z_N^4(u))\lesssim N^{-2}\) and thus

$$\begin{aligned} {{\mathbb {E}}}[Z_N^2(u){{\mathbf {1}}}_{A_N(u)^c}]\le {{\mathbb {E}}}^{1/2}[Z_N^4(u)] {{\mathbb {P}}}^{1/2}(A_N(u)^c)\lesssim N^{-2}. \end{aligned}$$
(42)

As a consequence,

$$\begin{aligned} {{\mathbb {E}}}\left( 2\sup _{h\in B_{{\widetilde{m}} \vee m}}\nu _{N,1,2}^2(h)\right) \le C\left( \frac{TG^2(T)}{N^2} + \int _0^T\frac{G^2(u)}{u}du \frac{1}{N}\right) . \end{aligned}$$
(43)

Gathering (41) and (43) implies the result of Lemma 4 for \(\ell =1\) and \((B)=(Lag)\). Now we look at \(\nu _{N,2}(h)\) and write

$$\begin{aligned} \sup _{h\in B_{{\widetilde{m}} \vee m}}\nu _{N,2}^2(h)= & {} \left( \sup _{h\in B_{{\widetilde{m}} \vee m}}h^2(T)\right) G^2(T)Z_N^2(T)\le \sum _{j=0}^{{\widetilde{m}} \vee m-1} \varphi ^2_j(T) G^2(T)Z_N^2(T)\\\le & {} 2({\widetilde{m}} \vee m) G^2(T)Z_N^2(T). \end{aligned}$$

Therefore

$$\begin{aligned} \sup _{h\in B_{{\widetilde{m}} \vee m}}\nu _{N,2}^2(h){{\mathbf {1}}}_{A_N(T)}\le 32({\widetilde{m}} \vee m) G^2(T)\frac{\log (N)}{N} \end{aligned}$$

and using (42),

$$\begin{aligned} {{\mathbb {E}}}\left( \sup _{h\in B_{{\widetilde{m}} \vee m}}\nu _{N,2}^2(h){{\mathbf {1}}}_{A_N(T)^c}\right) \le 2NG^2(T) {{\mathbb {E}}}(Z_N^2(T){{\mathbf {1}}}_{A_N(T)^c}) \lesssim \frac{G^2(T)}{N}. \end{aligned}$$

Finally

$$\begin{aligned} {{\mathbb {E}}}\left[ \left( \sup _{h\in B_{{\widetilde{m}} \vee m}}\nu _{N,2}^2(h)-p_2^{(Lag)}(m, {\widetilde{m}})\right) _+\right]\le & {} {{\mathbb {E}}}\left[ \left( \sup _{h\in B_{{\widetilde{m}} \vee m}}\nu _{N,2}^2(h){{\mathbf {1}}}_{A_N(T)}-p_2^{(Lag)}(m, {\widetilde{m}})\right) _+\right] \\&+ {{\mathbb {E}}}\left( \sup _{h\in B_{{\widetilde{m}} \vee m}}\nu _{N,2}^2(h){{\mathbf {1}}}_{A_N(T)^c}\right) \\\le & {} C\frac{G^2(T)}{N}, \end{aligned}$$

where C is a numerical constant. Thus, we obtain Lemma 4 for \(\ell =2\) and \((B)=(Lag)\).

\(\square \)

7.2.2 Case of trigonometric basis

We proceed analogously. As now \(R_T(h)=0\), we have, writing for simplicity \({\widetilde{m}}={\widetilde{m}}^{(Trig)}\),

$$\begin{aligned} \Vert {{\tilde{g}}}_{{\widetilde{m}}}- g\Vert ^2\le & {} 3 \Vert g_m-g\Vert ^2 + 2\mathrm{pen}^{(Trig)}( m) \\&+ 16 (\sup _{h\in B_{{\widetilde{m}} \vee m}} \nu ^2_{N,T}(h) -p^{(Trig)}(m,{{{\widetilde{m}}}}) ) + 16 p^{(Trig)}(m,{{{\widetilde{m}}}}) -2\mathrm{pen}^{(Trig)}({\widetilde{m}}). \end{aligned}$$

where \(p^{(Trig)}(m,m')=p_1^{(Trig)}(m,m')+p_2^{(Trig)}(m,m')\), with

$$\begin{aligned} p_1^{(Trig)}(m,m')= & {} 64\pi ^2G^2(T) \log (N) \frac{(m\vee m')^2}{NT},\\ p_2^{(Trig)}(m,m')= & {} 128 G^2(T) \log (N)\frac{m\vee m'}{NT}. \end{aligned}$$

As for Lemma 4, the following lemma justifies this choice of \(p_1^{(Trig)}(m,m')\) and \(p_2^{(Trig)}(m,m')\).

Lemma 5

Under the Assumptions of Theorem 1, for \(\ell =1,2\),

$$\begin{aligned} {{\mathbb {E}}}\left( \sup _{h\in B_{{\widetilde{m}} \vee m}} \nu _{N,\ell }^2(h) - p^{(Trig)}_\ell (m, {{\widetilde{m}}})\right) _+ \le C_\ell ^{(Trig)}(T,N), \end{aligned}$$

where

$$\begin{aligned} C_1^{(Trig)}(T,N)= C\frac{G^2(T)}{NT}, \quad C_2^{(Trig)}(T,N)= C\frac{G^2(T)}{T} \frac{1}{N^{3/2}} \end{aligned}$$

and C is a positive numerical constant.

To conclude the proof in the trigonometric basis case, analogously, we use that \(8p^{(Trig)}(m,m')\le \mathrm{pen}^{(Trig)}(m)+ \mathrm{pen}^{(Trig)}(m')\) for \(\kappa \ge \kappa _0^{(Trig)}=8\times 16(8\pi ^2+1)\) and obtain

$$\begin{aligned} {{\mathbb {E}}}\Vert {{\tilde{g}}}_{{\widetilde{m}}}- g\Vert ^2\le & {} 3 \Vert g_m-g\Vert ^2 + 4\mathrm{pen}^{(Trig)}( m) \\&+ 16(C_1^{(Trig)}(T,N)+C_2^{(Trig)}(T,N)) \end{aligned}$$

\(\square \)

Proof of Lemma 5

Using the properties of the derivatives \(\varphi '_{j,T}\) (see (34)) and the definition of \(A_N(u)\), the bound for \(\nu _{N,1,1}^2(h)\) now reads

$$\begin{aligned} \sup _{h\in B_{{\widetilde{m}} \vee m}}\nu _{N,1,1}^2(h)\le & {} \sum _{j=0}^{{\widetilde{m}} \vee m-1}\left( \int _0^T B(u)\varphi '_{j,T}(u)du \right) ^2 \\\le & {} 4\pi ^2 \frac{({\widetilde{m}} \!\vee m)^2}{T^2} \!\int _0^T B^2(u)du \!=\! 4\pi ^2 \frac{({\widetilde{m}} \vee m)^2}{T^2} \!\int _0^TG^2(u)Z_N^2(u)\mathbf{1}_{A_N(u)}du \\\le & {} 16 \frac{\log {N}}{N}4\pi ^2 \frac{({\widetilde{m}} \vee m)^2}{T^2} TG^2(T) = \frac{1}{2} p_1^{(Trig)}(m,{\widetilde{m}} ). \end{aligned}$$

Moreover, using the definition of \({{\mathcal {M}}}_N^{(Trig)}\) and (42),

$$\begin{aligned} \sup _{h\in B_{{\widetilde{m}} \vee m}}\nu _{N,1,2}^2(h)\le & {} 4\pi ^2 \frac{({\widetilde{m}} \vee m)^2}{T^2} \int _0^T C^2(u)du\\= & {} 4\pi ^2 \frac{({\widetilde{m}} \vee m)^2}{T^2} \int _0^T G^2(u)Z_N^2(u)\mathbf{1}_{A_N(u)^c}du\\\le & {} 4\pi ^2 \frac{N}{T^2} \int _0^TG^2(u)du \frac{C}{N^2} \le 4\pi ^2 C \frac{G^2(T)}{NT}= \frac{1}{2} C_1^{(Trig)}(N,T). \end{aligned}$$

The other term is

$$\begin{aligned} \sup _{h\in B_{{\widetilde{m}} \vee m}}\nu _{N,2}^2(h)= & {} \sup _{h\in B_{{\widetilde{m}} \vee m}}h^2(T)G^2(T) Z_N^2(T) (\mathbf{1}_{A_N(T)}+\mathbf{1}_{A_N(T)^c})\\\le & {} 16 \frac{{\widetilde{m}} \vee m}{T}G^2(T)\frac{\log {N}}{N} + \frac{\sqrt{N}}{T}G^2(T) \times \frac{C}{N^2}\\\le & {} p_2^{(Trig)}(m,{\widetilde{m}} ) + C_2^{(Trig)}(N,T). \end{aligned}$$

This implies Lemma 5. \(\square \)

7.3 Proof of Theorem 2

The proof follows the same steps as that of Theorem 1 and we only indicate the changes. Here, proceeding as in the proof of Proposition 3, we have:

$$\begin{aligned} \sup _{h\in B_{{\widetilde{m}} \vee m}}\nu _{N,1,1}^2(h)\le & {} \sum _{j=0}^{{\widetilde{m}} \vee m-1}\left( \int _0^T B(u)\ell '_j(u)du \right) ^2\\= & {} \int _{[0,T]^2}\left[ \sum _{j=0}^{{\widetilde{m}} \vee m-1}\ell _j'(u)B(u)\ell _j'(v)B(v)\right] dudv \\\le & {} \left[ \int _0^T\left( \sum _{j=0}^{{\widetilde{m}} \vee m-1}(\ell _j'(u)B(u))^2\right) ^{1/2} du\right] ^2\\\le & {} 16 \frac{\log {N}}{N}G^2(T) \left[ \int _0^T\left( \sum _{j=0}^{{\widetilde{m}} \vee m-1}(\ell _j'(u))^2\right) ^{1/2} du\right] ^2\\\le & {} 16 \frac{\log {N}}{N}G^2(T) ({\widetilde{m}} \vee m)^3(12 + 4\gamma _2^{-2}):= \frac{1}{2} p_1^\star (m, {\widetilde{m}}). \end{aligned}$$

Analogously, for the term with C(u), using that the maximal value in \({{\mathcal {M}}}_N^\star \) is bounded by \(N^{1/3}\),

$$\begin{aligned} \sup _{h\in B_{{\widetilde{m}} \vee m}}\nu _{N,1,2}^2(h)\le & {} C \frac{G^2(T)}{N^2}\left[ \int _0^T\left( \sum _{j=0}^{{\widetilde{m}} \vee m-1}(\ell _j'(u))^2\right) ^{1/2} du\right] ^2 \le C \frac{G^2(T)}{N}. \end{aligned}$$

Thus,

$$\begin{aligned} {{\mathbb {E}}}\left( \sup _{h\in B_{{\widetilde{m}} \vee m}} \nu _{N,1}^2(h) - p^{\star }_1(m, {{\widetilde{m}}})\right) _+ \le C \frac{G^2(T)}{N}. \end{aligned}$$

The study of \(\sup _{h\in B_{{\widetilde{m}} \vee m}}\nu _{N,2}^2(h)\) is the same as previously and we can set

$$\begin{aligned} p_2^\star (m, {\widetilde{m}})=p_2^{(Lag)}(m, {\widetilde{m}})\le 32 \frac{\log {N}}{N}G^2(T) ({\widetilde{m}} \vee m)^3. \end{aligned}$$

Then,

$$\begin{aligned} {{\mathbb {E}}}\left( \sup _{h\in B_{{\widetilde{m}} \vee m}} \nu _{N,2}^2(h) - p^{\star }_2(m, {{\widetilde{m}}})\right) _+ \le C \frac{G^2(T)}{N}. \end{aligned}$$

We set \(p^\star (m,m')=p_1^\star (m,m')+p_2^\star (m,m')\) and check that \(8p^\star (m,m')\le \mathrm{pen}^\star (m)+ \mathrm{pen}^\star (m')\) for \(\kappa \ge \kappa _0^\star = 8\times (16(12 + 4\gamma _2^{-2})+32)\).

Lastly, we have from the proof of Inequality (17) in Proposition 3 that, for \(T\ge 6m-3\),

$$\begin{aligned} \sup _{h\in S_m, \Vert h\Vert \le 1}|R_T(h)| \le 2\Vert a\Vert ^2 \frac{m^{3/2}}{\gamma _2} \exp {(-(6m-3)\gamma _2)}. \end{aligned}$$

Therefore, \(\sup _{h\in S_{M_N}, \Vert h\Vert \le 1}R^2_T(h)\lesssim \frac{1}{N}\). \(\square \)

8 Appendix

In this section, we refer to Abramowitz and Stegun (1964) and to Comte and Genon-Catalot (2018).

The Laguerre polynomial with index \(\delta \), \(\delta >-1\), and degree k is given by

$$\begin{aligned} L_k^{(\delta )}(x)= \frac{1}{k!} e^x x^{-\delta } \frac{d^k}{dx^k} \left( x^{\delta +k}e^{-x}\right) = \sum _{j=0}^k \left( {\begin{array}{c}k+\delta \\ k-j\end{array}}\right) \frac{(-x)^j}{j!}. \end{aligned}$$

The following holds:

$$\begin{aligned}&\left( L_k^{(\delta )}(x)\right) '= -L_{k-1}^{(\delta +1)}(x), \quad \text{ for } k\ge 1, \; \text{ and } \nonumber \\&\int _0^{+\infty } \left( L_k^{(\delta )}(x)\right) ^2 x^{\delta }e^{-x}dx= \frac{\Gamma (k+\delta +1)}{k!}. \end{aligned}$$
(44)

We consider the Laguerre functions with index \(\delta \), given by

$$\begin{aligned} \ell _k^{(\delta )}(x)= 2^{(\delta +1)/2} \left( \frac{k!}{\Gamma (k+\delta +1)}\right) ^{1/2} L_k^{(\delta )}(2x) e^{-x} x^{\delta /2}. \end{aligned}$$
(45)

The family \((\ell _k^{(\delta )})_{k\ge 0}\) is an orthonormal basis of \({{\mathbb {L}}}^2({{\mathbb {R}}}^+)\).
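
As an illustration only (this code is not part of the paper), formula (45) can be implemented with scipy and the orthonormality checked numerically for a few pairs of indices.

```python
# Illustrative sketch: the Laguerre functions of (45) built from the
# generalized Laguerre polynomials, with a numerical orthonormality check.
import numpy as np
from math import factorial
from scipy.special import eval_genlaguerre, gamma
from scipy.integrate import quad

def laguerre_fn(k, x, delta=0.0):
    """ell_k^{(delta)}(x) as in (45)."""
    norm = 2 ** ((delta + 1) / 2) * np.sqrt(factorial(k) / gamma(k + delta + 1))
    return norm * eval_genlaguerre(k, delta, 2 * x) * np.exp(-x) * x ** (delta / 2)

for delta in (0.0, 1.0):
    for (j, k) in [(0, 0), (2, 2), (1, 3), (4, 4), (2, 5)]:
        ip = quad(lambda x: laguerre_fn(j, x, delta) * laguerre_fn(k, x, delta),
                  0, np.inf)[0]
        print(f"delta={delta}: <ell_{j}, ell_{k}> = {ip:.6f}")  # ~1 if j==k, ~0 otherwise
```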

For \(\delta =0\), we set \(L_k^{(0)}=L_k\), \(\ell _k^{(0)}=\ell _k\). Using (44), we obtain for \(j\ge 1\):

$$\begin{aligned} \ell '_j(x)= -\ell _j(x)- \sqrt{\frac{2j}{x}} \;\ell _{j-1}^{(1)}(x). \end{aligned}$$
(46)

The following properties hold for the \(\ell _j\)’s. For all \(x\ge 0\),

$$\begin{aligned} |\ell _j(x)|\le \sqrt{2}, \quad \;\int _0^{+\infty }\ell _j(x)dx= \sqrt{2}(-1)^j, \quad j\ge 0, \end{aligned}$$
$$\begin{aligned} \ell _0'(x)=-\ell _0(x), \; \ell _j'(x)=-\ell _j(x)- 2\sum _{k=0} ^{j-1} \ell _k(x) , j \ge 1. \end{aligned}$$
(47)

Then, integrating formula (47) from x to \(+\infty \) for \(j\ge 1\), and setting \(\widetilde{{{\mathcal {L}}}}_j(x)=\int _x^{+\infty } \ell _j(u)du\), we obtain \(\ell _j=\widetilde{{{\mathcal {L}}}}_j+2\sum _{k=0}^{j-1}\widetilde{{{\mathcal {L}}}}_k\). Thus, \(\widetilde{ {{\mathcal {L}}}}_j= \ell _j-\ell _{j-1} -\widetilde{ {{\mathcal {L}}}} _{j-1}\). Using that \(\widetilde{ {{\mathcal {L}}}}_0=\ell _0\), we obtain by elementary induction \(\widetilde{ {{\mathcal {L}}}}_j= \ell _j+2 \sum _{k=1}^j (-1)^k \ell _{j-k}\). Moreover, setting \({{\mathcal {L}}}_j(x)=\int _0^x \ell _j(u)du\), we have

$$\begin{aligned} {{\mathcal {L}}}_0(x)=\ell _0(0)-\ell _0(x), \quad {{\mathcal {L}}}_j(x)=-{{\mathcal {L}}}_{j-1}(x)-\ell _j(x)+\ell _{j-1}(x), j\ge 1. \end{aligned}$$
(48)
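
The recursion (48) can also be verified numerically; the short sketch below (illustration only, with an arbitrary evaluation point x) compares the quadrature of \(\ell _j\) with the right-hand side of (48).

```python
# Numerical check of the recursion (48) for calL_j(x) = int_0^x ell_j(u) du.
import numpy as np
from scipy.integrate import quad
from scipy.special import eval_genlaguerre

def ell(j, x):
    # Laguerre function with index 0: ell_j(x) = sqrt(2) * L_j(2x) * exp(-x)
    return np.sqrt(2.0) * eval_genlaguerre(j, 0, 2 * x) * np.exp(-x)

def calL(j, x):
    return quad(lambda u: ell(j, u), 0, x)[0]

x = 1.7                                            # arbitrary evaluation point
print("calL_0(x):", calL(0, x), "  ell_0(0)-ell_0(x):", ell(0, 0.0) - ell(0, x))
for j in (1, 2, 4):
    lhs = calL(j, x)
    rhs = -calL(j - 1, x) - ell(j, x) + ell(j - 1, x)
    print(f"j={j}: calL_j(x) = {lhs:.8f}   recursion (48) = {rhs:.8f}")
```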

Lastly, the following asymptotic formulae can be found in Askey and Wainger (1965). For \(\nu =4k+2\) and k large enough,

$$\begin{aligned} |\ell _k(x/2)|\le C \left\{ \begin{array}{lll}a) &{} 1 &{} \text{ if } 0\le x\le 1/\nu \\ b) &{}(x\nu )^{-1/4} &{} \text{ if } 1/\nu \le x\le \nu /2 \\ c) &{} \nu ^{-1/4} (\nu -x)^{-1/4} &{} \text{ if } \nu /2 \le x\le \nu -\nu ^{1/3} \\ d) &{} \nu ^{-1/3} &{} \text{ if } \nu -\nu ^{1/3} \le x\le \nu + \nu ^{1/3} \\ e) &{} \nu ^{-1/4} (x-\nu )^{-1/4}e^{-\gamma _1 \nu ^{-1/2}(x-\nu )^{3/2}} &{} \text{ if } \nu + \nu ^{1/3} \le x \le 3\nu /2 \\ f) &{} e^{-\gamma _2 x} &{} \text{ if } x\ge 3\nu /2 \end{array}\right. \end{aligned}$$

where \(\gamma _1\) and \(\gamma _2\) are fixed positive constants.
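
Regime f), which drives the exponential terms in (35) and (37), can be visualised numerically; the sketch below (an illustration only) fits the slope of \(\log |\ell _k(s)|\) on the range \(s\ge 3\nu /4\) (i.e. \(x=2s\ge 3\nu /2\)) to confirm the exponential decay. The fitted rate is only an empirical quantity and should not be identified with the constant \(\gamma _2\).

```python
# Illustration of regime f): exponential decay of |ell_k(s)| for s >= 3*nu/4.
import numpy as np
from scipy.special import eval_genlaguerre

def ell(k, s):
    return np.sqrt(2.0) * eval_genlaguerre(k, 0, 2 * s) * np.exp(-s)

k = 10
nu = 4 * k + 2
s = np.linspace(0.75 * nu, 1.5 * nu, 200)          # x = 2s ranges over [3nu/2, 3nu]
slope = np.polyfit(s, np.log(np.abs(ell(k, s))), 1)[0]
print("fitted exponential decay rate in s:", -slope)   # positive => decay
```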