1 Introduction

In this paper, we consider sums of Banach space valued random variables with the aim of establishing conditions for their asymptotic normality, using a generalization of Lindeberg’s elementary 1922 proof of the central limit theorem [18].

Lindeberg’s method is simple and elegant. It is based on replacing non-Gaussian random variables one by one with Gaussian ones and then using Taylor expansions to get approximation bounds. This principle has been applied to prove central limit theorems for sums of independent random variables with values in a Hilbert space by Giné and León [12], as well as to estimate rates of convergence in central limit theorems in Banach spaces (see, e.g., Bentkus et al. [4], Paulauskas and Račkauskas [19]). Its potential for proving more general invariance results has been discovered by many researchers and has led to results on matrices with exchangeable entries [7], on the universality of local laws [24, 25], on matrices with correlated entries [1,2,3], and many others. Various kernel-type density (or regression function) estimators produce sums of an array of random functions to which the Lindeberg CLT can be applied to help solve various statistical problems (see, e.g., [13] and references therein).

The idea of Lindeberg was carefully examined and generalized by Zolotarev [26] through the introduction of the so-called \(\zeta \) metrics, which metrize weak convergence in the case of distributions in a Hilbert space, as later proved by Giné and León [12]. Although such a metrization of weak convergence is not possible for general Banach spaces, it is possible to connect weak convergence of probability measures with uniform convergence over a suitable class of differentiable functions in the case where the norm of the Banach space is smooth enough (see Sect. 2). This leads to an extension of the Lindeberg method to smooth Banach spaces. In turn, we prove a central limit theorem for a triangular array of random elements with values in such Banach spaces (see Sect. 4) and establish asymptotic normality of sums of Banach space valued linear processes as well as of weighted sums of independent identically distributed \(\mathsf {B}\)-valued random variables (Sect. 5). In Sect. 3, we present some remarks concerning differentiability of the norm and some examples of smooth Banach spaces.

The abstract theory of smoothness in infinite-dimensional real Banach spaces and its connections with geometrical properties have been investigated by many authors. For a very detailed exposition of the theory, we refer to the book by Hájek and Johanis [14]. The existence of a p-smooth bump function for \(1<p\le 2\) was shown to be equivalent to a certain martingale moment inequality and appears as a sufficient condition in some probability limit theorems in Banach spaces, see, e.g., Pisier [20] and Rosiński [22].

2 Weak Convergence via Smooth Functions

In what follows, \(\mathsf {B}\) denotes a real separable Banach space. The norm of an element \(x\in \mathsf {B}\) is denoted by \(\Vert x\Vert _\mathsf {B}\), or, if no confusion can arise, simply by \(\Vert x\Vert \). The Banach space topological dual of \(\mathsf {B}\) is \(\mathsf {B}^*\), and we shall use the notation \(\langle x, y^* \rangle :=y^*(x)\) for the duality pairing of the elements \(x\in \mathsf {B}\) and \(y^*\in \mathsf {B}^*\). Let \(L(\mathsf {B})\) be the Banach space of all continuous linear operators \(u: \mathsf {B}\rightarrow \mathsf {B}\), endowed with the norm \(\Vert u\Vert = \sup \{\Vert ux\Vert : x\in \mathsf {B}, \ \Vert x\Vert \le 1\}\); \(I_{\mathsf {B}}\in L(\mathsf {B})\) denotes the identity operator. By \(\mathscr {L}_k(\mathsf {B})\), we denote the Banach space of bounded k-linear operators \(T:\mathsf {B}^k\rightarrow \mathbb {R}\) with the supremum norm

$$\begin{aligned} \Vert T\Vert =\sup \{|T(h_1, \dots , h_k)|: \Vert h_1\Vert \le 1, \dots , \Vert h_k\Vert \le 1\}. \end{aligned}$$

To simplify the writing, the notation \(\left\| z \right\| \) for the norm is overloaded throughout the text whenever the nature of the argument z leaves no doubt about the Banach space involved: \(\mathsf {B}\) or one of the associated spaces of continuous operators \(L(\mathsf {B})\), \(\mathscr {L}_k(\mathsf {B})\), \(k>1\).

The set of all probability distributions on the measurable space \((\mathsf {B}, \mathscr {B}_{\mathsf {B}})\) is denoted by \(\mathscr {P}(\mathsf {B})\), where \(\mathscr {B}_{\mathsf {B}}\) is the \(\sigma \)-algebra of Borel subsets of \(\mathsf {B}\). Throughout we use

$$\begin{aligned} \mathsf {P}f:=\int _{\mathsf {B}}f(x)\mathsf {P}(\,\mathrm {d}x) \end{aligned}$$

for any probability \(\mathsf {P}\in \mathscr {P}(\mathsf {B})\) and any \(\mathsf {P}\)-integrable function \(f:\mathsf {B}\rightarrow \mathbb {R}\). Let us recall here that a sequence \((\mathsf {P}_n) \subset \mathscr {P}(\mathsf {B})\) converges weakly to \(\mathsf {P}\in \mathscr {P}(\mathsf {B})\) (denoted \(\mathsf {P}_n\xrightarrow [n\rightarrow \infty ]{\mathrm {w}}\mathsf {P})\) if

$$\begin{aligned} \lim _{n\rightarrow \infty }\mathsf {P}_nf =\mathsf {P}f\ \ \text {for each} \ f\in C_b(\mathsf {B}), \end{aligned}$$

where \(\mathrm {C}_b(\mathsf {B})\) is the class of all bounded continuous functions \(f:\mathsf {B}\rightarrow \mathbb {R}\). It is sometimes convenient to prove weak convergence of probability measures by showing that \(\mathsf {P}_nf\rightarrow \mathsf {P}f\) for a class \(\mathscr {F}\) of functions \(f: \mathsf {B}\rightarrow \mathbb {R}\) which is smaller than \(\mathrm {C}_b(\mathsf {B})\). In this case, it is said that \(\mathscr {F}\) determines the weak convergence of probabilities. A well-known example is provided by the class of bounded Lipschitz functions. Recall that a function \(f:\mathsf {B}\rightarrow \mathbb {R}\) is bounded Lipschitz if

$$\begin{aligned} \Vert f\Vert _{\mathrm {Lip}}:=\sup _{x\in \mathsf {B}}|f(x)|+\sup _{x\not =y}\frac{|f(x)-f(y)|}{\Vert x-y\Vert }<\infty . \end{aligned}$$

Moreover, the bounded Lipschitz distance

$$\begin{aligned} d_{\mathrm {BL}}(\mathsf {P}, \mathsf {Q}) = \sup \Big \{\big |\mathsf {P}f-\mathsf {Q}f\big |:\ \Vert f\Vert _{\mathrm {Lip}}\le 1\Big \},\ \ \mathsf {P},\mathsf {Q}\in \mathscr {P}(\mathsf {B}), \end{aligned}$$

metrizes weak convergence, that is, \(\mathsf {P}_n\xrightarrow [n\rightarrow \infty ]{\mathrm {w}}\mathsf {P}\) if and only if \(\lim _{n\rightarrow \infty } d_{\mathrm {BL}}(\mathsf {P}_n, \mathsf {P})=0\).
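
As an aside for readers who wish to experiment, the following Python sketch (our own illustration; the grid and the test function \(\tanh \) are arbitrary choices, not part of the text) estimates \(\Vert f\Vert _{\mathrm {Lip}}\) for a function on \(\mathbb {R}\) by combining its sup norm with the largest sampled difference quotient.

```python
import numpy as np

def bounded_lipschitz_norm(f, xs):
    """Estimate ||f||_Lip = sup|f| + sup_{x != y} |f(x)-f(y)|/|x-y| on a 1-D grid."""
    vals = f(xs)
    sup_norm = np.max(np.abs(vals))
    dx = xs[:, None] - xs[None, :]          # all pairwise differences of grid points
    df = vals[:, None] - vals[None, :]
    mask = dx != 0
    lip = np.max(np.abs(df[mask] / dx[mask]))
    return sup_norm + lip

xs = np.linspace(-5.0, 5.0, 1001)
# For f = tanh: sup|f| ~ 1 and the best Lipschitz constant is tanh'(0) = 1,
# so the estimate is close to 2.
print(bounded_lipschitz_norm(np.tanh, xs))
```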

Another example is known in the case where \(\mathsf {B}=\mathscr {H}\) is a separable Hilbert space. As proved by Giné and León [12], for \((\mathsf {P}, \mathsf {P}_n, n\in \mathbb {N})\subset \mathscr {P}(\mathscr {H})\), in order to check weak convergence \(\mathsf {P}_n\xrightarrow [n\rightarrow \infty ]{\mathrm {w}}\mathsf {P}\) it is enough to show that \(\mathsf {P}_nf\xrightarrow [n\rightarrow \infty ]{}\mathsf {P}f\) for every \(f:\mathscr {H}\rightarrow \mathbb {R}\) continuous, bounded and with bounded derivatives of all orders. So the situation in a separable Hilbert space is just as in the case of a finite dimensional space.

In what follows, for a number \(p\ge 1\), we denote by \(\lfloor p \rfloor \) the unique integer satisfying \(p-1\le \lfloor p \rfloor <p\) and agree that \(\{p\}:=p-\lfloor p \rfloor \). The reader is warned about the difference with the classical “floor” and “fractional part” functions: e.g., \(\lfloor 3.9 \rfloor =3\), but \(\lfloor 4 \rfloor =3\), and \(\{n\}=1\) for any integer n. This convention is motivated by our wish to interpolate between the spaces \(\mathrm {C}_b^{n-1}\) and \(\mathrm {C}_b^{n}\) of functions with bounded derivatives up to order \(n-1\) or n, respectively, by spaces of functions whose \((n-1)\)th derivative satisfies a Hölder condition with exponent \(0<\alpha \le 1\), the special case \(\alpha =1\) giving a Lipschitz \((n-1)\)th derivative.
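
To make the convention concrete, here is a two-line helper (a plain Python illustration, not part of the paper's formalism):

```python
import math

def lfloor(p: float) -> int:
    """The unique integer with p - 1 <= lfloor(p) < p: the usual floor,
    except that integers are mapped one step down, e.g. lfloor(4) = 3."""
    return math.ceil(p) - 1

def frac(p: float) -> float:
    """{p} := p - lfloor(p), which lies in (0, 1] and equals 1 at integers."""
    return p - lfloor(p)

print(lfloor(3.9), lfloor(4.0), frac(4.0), frac(2.5))  # 3 3 1.0 0.5
```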

More precisely, we introduce for any real \(p\ge 1\) the class \(\mathrm {C}^{(p)}_b(\mathsf {B})\) of functions \(f\in \mathrm {C}_b(\mathsf {B})\) that are \(\lfloor p \rfloor \)-times continuously Fréchet differentiable and such that

$$\begin{aligned} \left\| f \right\| _{(p)}:=\sum _{k=0}^{\lfloor p \rfloor }\sup _{x\in \mathsf {B}}\Vert f^{(k)}(x)\Vert + \sup _{x\not =y}\frac{\Vert f^{(\lfloor p \rfloor )}(x)-f^{(\lfloor p \rfloor )}(y)\Vert }{\Vert x-y\Vert ^{\{p\}}}<\infty , \end{aligned}$$
(1)

where \(f^{(k)}\) denotes the \(k^{\text {th}}\) Fréchet derivative of the function f with \(f^{(0)}:=f\). For the definition of Fréchet derivatives and properties of Fréchet differentiable functions in infinite dimensional Banach spaces, we refer to [6]. Clearly, \(\mathrm {C}_b^{(1)}(\mathsf {B})\) coincides with the class of bounded Lipschitz functions, and \(\left\| f \right\| _{(1)}=\Vert f\Vert _{\mathrm {Lip}}\) for \(f\in \mathrm {C}_b^{(1)}(\mathsf {B})\). By \(\mathrm {C}^\infty _b(\mathsf {B})\), we denote the class of infinitely Fréchet differentiable functions with bounded derivatives.

Define for \(\mathsf {P},\mathsf {Q}\in \mathscr {P}(\mathsf {B})\),

$$\begin{aligned} \zeta _{p}(\mathsf {P},\mathsf {Q}) := \sup \Big \{\big |\mathsf {P}f-\mathsf {Q}f\big |:\ f\in \mathrm {C}^{(p)}_b(\mathsf {B}), \ \left\| f \right\| _{(p)} \le 1\Big \}. \end{aligned}$$
(2)

A natural question is then: for which Banach spaces \(\mathsf {B}\) does the class \(\mathrm {C}^{(p)}_{b}(\mathsf {B})\), \(p\ge 1,\) determine the weak convergence of probability measures? Roughly speaking, this is true in the case where the norm of \(\mathsf {B}\) is sufficiently smooth. To be more precise, we first define what we mean by smoothness of a norm.

Definition 1

Let \(p\ge 1\). We say that a Banach space \(\mathsf {B}\) is p-smooth if its norm \(\psi (x):=\Vert x\Vert _\mathsf {B}, x\in \mathsf {B},\) is \(\lfloor p \rfloor \)-times continuously Fréchet differentiable on the set \(\mathsf {B}\setminus \{0\}\), and

$$\begin{aligned} \sum _{i=1}^{\lfloor p \rfloor }\sup _{\Vert x\Vert =1}\Vert \psi ^{(i)}(x)\Vert + \sup _{x\not =y, \Vert x\Vert =\Vert y\Vert =1}\frac{\Vert \psi ^{(\lfloor p \rfloor )}(x)-\psi ^{(\lfloor p \rfloor )}(y)\Vert }{\Vert x-y\Vert ^{\{p\}}}<\infty . \end{aligned}$$
(3)

Evidently every Banach space is 1-smooth. If \(\mathsf {B}\) is q-smooth for some \(q>1\), it is also p-smooth for \(1\le p\le q\). Examples of p-smooth spaces, where \(p>1,\) are given below (see Examples 12 and 14).

Remark 2

Our definition of p-smoothness, tailored for the Lindeberg method, differs from the p-smoothability of, e.g., Rosiński [22,  p. 159], where \(\mathsf {B}\) is said to be p-smoothable (\(1\le p\le 2\)) if there exists an equivalent norm \(\vert \cdot \vert \) on \(\mathsf {B}\) such that the modulus of smoothness

$$\begin{aligned} \rho _{\vert \cdot \vert }(t):= \sup \left\{ \frac{\vert x+ty \vert +\vert x-ty \vert }{2} - 1 : \vert x \vert =\vert y \vert =1\right\} =O(t^p),\quad \text {as }t\rightarrow 0. \end{aligned}$$

By Lemma 19, p. 246 in [14], this condition means that \(\vert \cdot \vert \) is \(\mathrm {C}^{1,p-1}\) smooth in the sense of [14,  Def. 124, p. 55]. So for \(1\le p \le 2\), p-smoothability and p-smoothness in the sense of Definition 1 are similar up to the equivalence of norms. But the use of \(\rho _{\vert \cdot \vert }(t)\) to define p-smoothability forces the restriction \(p\le 2\), while we need \(p>2\) for the Lindeberg method.
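
As a sanity check of the borderline case \(p=2\), the following Python sketch (our own illustration) estimates \(\rho _{\vert \cdot \vert }(t)\) by Monte Carlo for the Euclidean norm on \(\mathbb {R}^2\), where \(\rho (t)=\sqrt{1+t^2}-1\sim t^2/2\), so the ratio \(\rho (t)/t^2\) should approach 1/2:

```python
import numpy as np

rng = np.random.default_rng(0)

def modulus_of_smoothness(t, dim=2, n_samples=20000):
    """Monte Carlo estimate of rho(t) = sup{(|x+ty| + |x-ty|)/2 - 1 : |x|=|y|=1}
    for the Euclidean norm on R^dim."""
    x = rng.normal(size=(n_samples, dim))
    x /= np.linalg.norm(x, axis=1, keepdims=True)
    y = rng.normal(size=(n_samples, dim))
    y /= np.linalg.norm(y, axis=1, keepdims=True)
    vals = (np.linalg.norm(x + t*y, axis=1) + np.linalg.norm(x - t*y, axis=1)) / 2 - 1
    return vals.max()

for t in (0.1, 0.01):
    # exact value is sqrt(1 + t^2) - 1 ~ t^2/2, so the ratio tends to 1/2
    print(t, modulus_of_smoothness(t) / t**2)
```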

Remark 3

It seems worth noticing here the following two facts about \(\psi \).

  1. (a)

For any Banach space \(\mathsf {B}\), there is no Fréchet derivative of \(\psi \) at 0. Indeed, should \(\psi \) be Fréchet differentiable at 0, the same would hold for its restriction to the one-dimensional subspace \(D=\{t u, t\in \mathbb {R}\}\), for some fixed \(u\in \mathsf {B}\setminus \{0\}\). This in turn would imply the differentiability at 0 (in the classical elementary sense) of the function \(t\mapsto |t|\), which clearly fails.

  2. (b)

    If \(\psi \) is Fréchet differentiable on \(\mathsf {B}\setminus \{0\}\), then

    $$\begin{aligned} \psi ^{(1)}(x) = \psi ^{(1)}\left( \frac{x}{\Vert x\Vert }\right) ,\quad x\in \mathsf {B}\setminus \{0\}. \end{aligned}$$
    (4)

    In particular, \(\psi ^{(1)}\) is bounded on \(\mathsf {B}\setminus \{0\}\) if and only if it is bounded on \(\{x\in \mathsf {B},\;\Vert x\Vert =1\}\). Another obvious consequence of (4) is that

    $$\begin{aligned} \psi ^{(1)}(cx) = \psi ^{(1)}(x),\quad c>0,\;x\in \mathsf {B}\setminus \{0\}. \end{aligned}$$
    (5)

    To check (4), we note that the Fréchet differentiability of \(\psi \) means that for each fixed \(x\ne 0\),

    $$\begin{aligned} \bigl \vert \left\| x+h \right\| -\left\| x \right\| -\psi ^{(1)}(x)(h) \bigr \vert = o(\left\| h \right\| ),\quad h\rightarrow 0. \end{aligned}$$

    Putting \(y:=x/\left\| x \right\| \), \(u:=h/\left\| x \right\| \) and recalling that \(\psi ^{(1)}(x)\) is a linear operator \(\mathsf {B}\rightarrow \mathbb {R}\) leads to

    $$\begin{aligned} \vert \psi \left( y+u\right) - \psi (y) - \psi ^{(1)}(x)\left( u\right) \vert = o(\left\| u \right\| ), \quad u\rightarrow 0, \end{aligned}$$

    whence \(\psi ^{(1)}(x)=\psi ^{(1)}(y)\).
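
Both identities are easy to test numerically in the Euclidean case, where \(\psi ^{(1)}(x)\) is represented by the unit vector \(x/\Vert x\Vert \) (a standard fact, used here only for illustration in this Python sketch of our own):

```python
import numpy as np

def norm_grad(x):
    """Frechet derivative of the Euclidean norm at x != 0, represented by x/||x||."""
    return x / np.linalg.norm(x)

x = np.array([1.0, -2.0, 0.5])
h = np.array([0.3, 0.1, -0.2])

# finite-difference check of the derivative: psi'(x)(h) = <x, h>/||x||
t = 1e-7
print((np.linalg.norm(x + t*h) - np.linalg.norm(x)) / t, norm_grad(x) @ h)

# scale invariance (5): the same derivative for every c > 0
for c in (0.01, 1.0, 100.0):
    print(c, norm_grad(c * x))
```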

The main result in this section is the following theorem.

Theorem 4

Let \(p\ge 1\). If the Banach space \(\mathsf {B}\) is p-smooth, then for \((\mathsf {P}, \mathsf {P}_n, n\in \mathbb {N})\subset \mathscr {P}(\mathsf {B})\), the following statements are equivalent:

  1. (i)

    \(\mathsf {P}_n\xrightarrow [n\rightarrow \infty ]{\mathrm {w}}\mathsf {P}\);

  2. (ii)

    \(\mathsf {P}_n f\xrightarrow [n\rightarrow \infty ]{}\mathsf {P}f\) for any \(f\in \mathrm {C}^{(p)}_b(\mathsf {B})\);

  3. (iii)

    \(\lim _{n\rightarrow \infty }\zeta _{p}(\mathsf {P}_n, \mathsf {P})=0.\)

The proof of the theorem will be achieved by establishing the cycle of implications:

$$\begin{aligned} \mathrm{(i)} \Rightarrow \mathrm{(iii)} \Rightarrow \mathrm{(ii)} \Rightarrow \mathrm{(i)}. \end{aligned}$$

For (i) \(\Rightarrow \) (iii), we note that for \(p\ge 1\), the unit ball \(\mathrm {U}_p\) of \(\mathrm {C}^{(p)}_b(\mathsf {B})\) is an equicontinuous family in \(\mathrm {C}_b(\mathsf {B})\), uniformly bounded by the constant 1. Then, the convergence \(\mathsf {P}_n f \rightarrow \mathsf {P}f\) is uniform on \(\mathrm {U}_p\) by Theorem 3.1 in Ranga Rao [21].

(iii) \(\Rightarrow \) (ii) is obvious for f in \(\mathrm {U}_p\) and extends to any f in \(\mathrm {C}^{(p)}_b(\mathsf {B})\) by linearity of the integral.

The hard part is (ii) \(\Rightarrow \) (i) which we detail now.

Proof of (ii) \(\Rightarrow \) (i) To this aim, it is enough to prove that if (ii) holds, then for each finite intersection A of open balls, we have \(\mathsf {P}_n(A)\xrightarrow [n\rightarrow \infty ]{}\mathsf {P}(A)\), provided A is a \(\mathsf {P}\)-continuity set (see, e.g., Billingsley [5], Corollary 2 to Th. 2.2, p. 15). Recall that \(A\in \mathscr {B}_{\mathsf {B}}\) is a \(\mathsf {P}\)-continuity set if \(\mathsf {P}(\partial A)=0\), where \(\partial A\) denotes the boundary of A, that is, \(\partial A = {\overline{A}}\setminus \mathring{A}\), where \({\overline{A}}\) and \(\mathring{A}\) are the closure and the interior of A, respectively.

For \(x\in \mathsf {B}\) and \(r>0\), let \(B(x,r)=\{y\in \mathsf {B}: \Vert y-x\Vert <r\}\). Set

$$\begin{aligned} A=\bigcap _{i=1}^m B(x_i,r_i),\quad x_1, \dots , x_m\in \mathsf {B}, \ \ r_1, \dots , r_m>0. \end{aligned}$$

It suffices to prove that

$$\begin{aligned} \mathsf {P}_n(A)\xrightarrow [n\rightarrow \infty ]{}\mathsf {P}(A)\ \ \text {whenever}\ \ \mathsf {P}(\partial A)=0. \end{aligned}$$
(6)

To this aim, define for \(0<\varepsilon <r_0:=\min _{1\le i\le m}r_i\),

$$\begin{aligned} A^\varepsilon :=\bigcap _{i=1}^m B(x_i,r_i+\varepsilon ),\quad A_\varepsilon :=\bigcap _{i=1}^m B(x_i,r_i-\varepsilon ). \end{aligned}$$

It is easily seen that

$$\begin{aligned}&\bigcup _{0<\varepsilon<r_0}A_\varepsilon = A,\nonumber \\&\quad \bigcap _{0<\varepsilon <r_0}A^\varepsilon = A' := \{x\in \mathsf {B}: \left\| x-x_i \right\| \le r_i,\;1\le i\le m\}. \end{aligned}$$
(7)

As an intersection of closed balls, \(A'\) is closed, and since \(A\subset A'\), the closure \({\overline{A}}\) of A is included in \(A'\). It is not difficult to find examples where this inclusion is strict when A is empty. Of course, this special case may be discarded since with \(A=\emptyset \), the convergence (6) is trivial. When A is non-empty, one can check that \(A'={\overline{A}}\) as follows. Let x be an arbitrary element in \(A'\). There is at least one element \(y_0\) in A. Then, we define \(y_1:=\frac{x+y_0}{2}\). For \(i=1,\dots ,m\), \(\left\| y_1-x_i \right\| \le \frac{1}{2}\left\| x-x_i \right\| + \frac{1}{2}\left\| y_0-x_i \right\| \). As for \(i=1,\dots ,m\), \(\left\| x-x_i \right\| \le r_i\) and \(\left\| y_0-x_i \right\| <r_i\), we see that \(\left\| y_1-x_i \right\| <r_i\); hence, \(y_1\) is in A. Iterating this argument, we construct the sequence \((y_n)\) in A such that \(y_n=\frac{x+y_{n-1}}{2}\). Since \(\left\| x-y_n \right\| \le 2^{-n}\left\| x-y_0 \right\| \), \(n\ge 1\), x belongs to \({\overline{A}}\) as the limit of a sequence of points of A. Therefore, if \(A\ne \emptyset \),

$$\begin{aligned} \bigcap _{0<\varepsilon <r_0}A^\varepsilon = {\overline{A}}. \end{aligned}$$
(8)

Next, we construct for \(A\ne \emptyset \) and \(0<\varepsilon <r_0\), the functions \(f^\varepsilon , f_\varepsilon \in \mathrm {C}^{(p)}_b(\mathsf {B})\) such that

$$\begin{aligned} \varvec{1}_{A_\varepsilon }(x)\le f_\varepsilon (x)\le \varvec{1}_{A}(x)\le \varvec{1}_{{\overline{A}}}(x)\le f^\varepsilon (x)\le \varvec{1}_{A^\varepsilon }(x) \end{aligned}$$
(9)

for all \(x\in \mathsf {B}\), where \(\varvec{1}_A\) is the indicator function of A. Assume for a moment that these functions are already constructed. From (7) and (8), we obtain by monotone sequential continuity of the probability measure \(\mathsf {P}\),

$$\begin{aligned} \lim _{\varepsilon \downarrow 0}\mathsf {P}(A_\varepsilon ) = \mathsf {P}(A),\quad \lim _{\varepsilon \downarrow 0}\mathsf {P}(A^\varepsilon ) = \mathsf {P}({\overline{A}}). \end{aligned}$$

In view of (9) and recalling that A is a \(\mathsf {P}\)-continuity set, this gives

$$\begin{aligned} \lim _{\varepsilon \downarrow 0}\mathsf {P}f_\varepsilon = \mathsf {P}(A) = \mathsf {P}({\overline{A}}) = \lim _{\varepsilon \downarrow 0}\mathsf {P}(f^\varepsilon ). \end{aligned}$$
(10)

From (9) and the hypothesis (ii) applied with \(f_\varepsilon \), \(f^\varepsilon \),

$$\begin{aligned} \limsup _{n\rightarrow \infty } \mathsf {P}_n(A)\le \limsup _{n\rightarrow \infty }\mathsf {P}_n f^\varepsilon = \lim _{n\rightarrow \infty }\mathsf {P}_n f^\varepsilon = \mathsf {P}f^\varepsilon \end{aligned}$$
(11)

and

$$\begin{aligned} \liminf _{n\rightarrow \infty }\mathsf {P}_n(A)\ge \liminf _{n\rightarrow \infty }\mathsf {P}_nf_\varepsilon = \lim _{n\rightarrow \infty }\mathsf {P}_nf_\varepsilon = \mathsf {P}f_\varepsilon \end{aligned}$$
(12)

for each \(0<\varepsilon <r_0\). Taking into account (10), from (11) and (12) we deduce (6). So, it remains to construct the functions \(f^\varepsilon , f_\varepsilon .\)

We begin with a lemma on the Fréchet derivatives of the norm of a p-smooth space which completes Remark 3. It quantifies the explosion of the successive derivatives of the norm near 0.

Lemma 5

When \(\mathsf {B}\) is p-smooth, its norm \(\psi (x)=\left\| x \right\| \) satisfies for \(i=1, \dots , \lfloor p \rfloor \),

$$\begin{aligned} \psi ^{(i)}\left( \frac{x}{\left\| x \right\| }\right) = \left\| x \right\| ^{i-1}\psi ^{(i)}(x), \quad x\in \mathsf {B}\setminus \{0\}. \end{aligned}$$
(13)

Moreover, there exist constants \(c_1,\dots ,c_{\lfloor p \rfloor },c_p\) such that

$$\begin{aligned} \Vert \psi ^{(i)}(x)\Vert \le c_i\Vert x\Vert ^{1-i},\ \ i=1, \dots , \lfloor p \rfloor \end{aligned}$$
(14)

and

$$\begin{aligned} \Vert \psi ^{(\lfloor p \rfloor )}(x)-\psi ^{(\lfloor p \rfloor )}(y)\Vert \le c_p(\Vert x\Vert ^{1-p}+\Vert y\Vert ^{1-p})\Vert x-y\Vert ^{\{p\}}, \end{aligned}$$
(15)

for all \(x, y\in \mathsf {B}, x, y\not =0\).
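
Before the proof, the statement can be checked numerically in the Euclidean case, where \(\psi ^{(2)}(x)=(I-xx^{T}/\Vert x\Vert ^2)/\Vert x\Vert \) (a standard formula; the Python sketch below is our own illustration of (13) and (14) with \(i=2\)):

```python
import numpy as np

def norm_hessian(x):
    """Second Frechet derivative of the Euclidean norm at x != 0
    (standard formula): psi''(x) = (I - x x^T / ||x||^2) / ||x||."""
    n = np.linalg.norm(x)
    return (np.eye(len(x)) - np.outer(x, x) / n**2) / n

x = np.array([3.0, -4.0])                   # ||x|| = 5
lhs = norm_hessian(x / np.linalg.norm(x))   # psi''(x/||x||)
rhs = np.linalg.norm(x) * norm_hessian(x)   # ||x||^{i-1} psi''(x) with i = 2, as in (13)
print(np.allclose(lhs, rhs))                # True

# (14) with i = 2: the operator norm of psi''(x) is exactly 1/||x||
print(np.linalg.norm(norm_hessian(x), 2), 1.0 / np.linalg.norm(x))
```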

Proof of Lemma 5

To prove (13), we proceed by finite induction on \(1\le i<\lfloor p \rfloor \). The initialization step is (4), already checked. Define for \(x\in \mathsf {B}\setminus \{0\}\), \(y\in \mathsf {B}\) and \(1\le i<\lfloor p \rfloor \),

$$\begin{aligned} T_{i+1}(x,y) := \psi ^{(i)}\left( \frac{x}{\left\| x \right\| } + y\right) \cdot y^{\otimes i} - \psi ^{(i)}\left( \frac{x}{\left\| x \right\| }\right) \cdot y^{\otimes i} - \left\| x \right\| ^i\psi ^{(i+1)}(x)\cdot y^{\otimes (i+1)}, \end{aligned}$$

where for \(y\in \mathsf {B}\), \(y^{\otimes i}=(y,\dots ,y)\) denotes the element of \(\mathsf {B}^i\) with all components equal to y, and for an i-linear form L on \(\mathsf {B}^i\), \(L\cdot w\) stands for L(w), \(w=(w_1,\dots ,w_i)\in \mathsf {B}^i\). To complete the proof of (13), it suffices to prove that, under the induction assumption (13) for some i, \(\vert T_{i+1}(x,y) \vert = o(\left\| y \right\| ^{i+1})\) for any fixed \(x\in \mathsf {B}\setminus \{0\}\) when \(y\rightarrow 0\) in \(\mathsf {B}\). Indeed, then both symmetric \((i+1)\)-linear forms \(\psi ^{(i+1)}(\frac{x}{\left\| x \right\| })\) and \(\left\| x \right\| ^i\psi ^{(i+1)}(x)\) are equal on the diagonal of \(\mathsf {B}^{i+1}\), and this equality extends to the whole space \(\mathsf {B}^{i+1}\) by symmetry.

By the induction assumption, restricting to \(\left\| y \right\| <1\) to avoid a possible vanishing of \(\frac{x}{\left\| x \right\| } + y\),

$$\begin{aligned} \psi ^{(i)}\left( \frac{x}{\left\| x \right\| } + y\right)&= \left\| \frac{x}{\left\| x \right\| } + y \right\| ^{1-i}\psi ^{(i)}\left( \frac{ \frac{x}{\left\| x \right\| } + y }{\left\| \frac{x}{\left\| x \right\| } + y \right\| }\right) \\&= \left\| x \right\| ^{i-1}\left\| x+\left\| x \right\| y\, \right\| ^{1-i}\psi ^{(i)}\left( \frac{x+\left\| x \right\| y}{\left\| x+\left\| x \right\| y \right\| }\right) \\&= \left\| x \right\| ^{i-1}\psi ^{(i)}(x+\left\| x \right\| y)\\ \end{aligned}$$

and

$$\begin{aligned} \psi ^{(i)}\left( \frac{x}{\left\| x \right\| }\right) = \left\| x \right\| ^{i-1}\psi ^{(i)}(x). \end{aligned}$$

By the multilinearity of the Fréchet derivatives, it follows that

$$\begin{aligned}&\left\| x \right\| T_{i+1}(x,y) = \\&\quad \psi ^{(i)}(x+\left\| x \right\| y)\cdot (\left\| x \right\| y)^{\otimes i} - \psi ^{(i)}(x)\cdot (\left\| x \right\| y)^{\otimes i} - \psi ^{(i+1)}(x)\cdot (\left\| x \right\| y)^{\otimes (i+1)}. \end{aligned}$$

Putting \(h:=\left\| x \right\| y\), we deduce from the existence of \(\psi ^{(i+1)}\) that \(\left\| x \right\| \vert T_{i+1}(x,y) \vert = o(\left\| h \right\| ^{i+1})\) as \(h\rightarrow 0\), whence recalling that x is fixed, \(\vert T_{i+1}(x,y) \vert = o(\left\| y \right\| ^{i+1})\) when y tends to 0, as expected.

Clearly, (14) follows from (13) with \(c_i:=\sup \{\left\| \psi ^{(i)}(y) \right\| : \left\| y \right\| =1\}\), recalling that the definition of p-smoothness includes the boundedness of each \(\psi ^{(i)}\), \(1\le i\le \lfloor p \rfloor \), on \(\{y\in \mathsf {B}: \left\| y \right\| =1\}\), see (3).

Now, we prove (15). Denoting \(t=\Vert x\Vert ^{-1}, s=\Vert y\Vert ^{-1}\) for \(x\not =0, y\not =0\), and using (13), we start from

$$\begin{aligned} \Vert \psi ^{(\lfloor p \rfloor )}(x)-\psi ^{(\lfloor p \rfloor )}(y)\Vert&=\Vert \psi ^{(\lfloor p \rfloor )}(tx)t^{\lfloor p \rfloor -1}-\psi ^{(\lfloor p \rfloor )}(sy)s^{\lfloor p \rfloor -1}\Vert \nonumber \\&\le \Vert \psi ^{(\lfloor p \rfloor )}(tx)-\psi ^{(\lfloor p \rfloor )}(sy)\Vert t^{\lfloor p \rfloor -1} \nonumber \\&\quad +\, \Vert \psi ^{(\lfloor p \rfloor )}(sy)\Vert \,|t^{\lfloor p \rfloor -1}-s^{\lfloor p \rfloor -1}|. \end{aligned}$$
(16)

The first term in (16) is bounded by

$$\begin{aligned} \left\| \psi ^{(\lfloor p \rfloor )}(tx)-\psi ^{(\lfloor p \rfloor )}(sy) \right\| t^{\lfloor p \rfloor -1} \le c'_p\left\| tx-sy \right\| ^{\{p\}}t^{\lfloor p \rfloor -1}, \end{aligned}$$

where

$$\begin{aligned} c'_p:= \sup _{\begin{array}{c} \left\| w \right\| =\left\| z \right\| =1\\ w\ne z \end{array}} \frac{\Vert \psi ^{(\lfloor p \rfloor )}(w)-\psi ^{(\lfloor p \rfloor )}(z)\Vert }{\Vert w-z\Vert ^{\{p\}}}, \end{aligned}$$

is finite by the p-smoothness assumption (3). Now,

$$\begin{aligned} \Vert tx - sy \Vert ^{\{p\}} = \Vert \Vert y\Vert x - \Vert x\Vert y\, \Vert ^{\{p\}} (\Vert x\Vert \,\Vert y\Vert )^{-\{p\}} = \left\| x - \frac{\left\| x \right\| }{\left\| y \right\| }y \right\| ^{\{p\}} \left\| x \right\| ^{-\{p\}} \end{aligned}$$

and writing \( x - \frac{\left\| x \right\| }{\left\| y \right\| }y = x - y + \frac{\left\| y \right\| - \left\| x \right\| }{\left\| y \right\| }y, \) the triangle inequality gives

$$\begin{aligned} \left\| x - \frac{\left\| x \right\| }{\left\| y \right\| }y \right\| \le \left\| x-y \right\| + \left\| \frac{\left\| y \right\| - \left\| x \right\| }{\left\| y \right\| }y \right\| \le 2\left\| x-y \right\| , \end{aligned}$$

whence, noticing that \(2^{\{p\}}\le 2\),

$$\begin{aligned} \Vert \psi ^{(\lfloor p \rfloor )}(tx)-\psi ^{(\lfloor p \rfloor )}(sy)\Vert t^{\lfloor p \rfloor -1}&\le 2c'_p \left\| x-y \right\| ^{\{p\}}\left\| x \right\| ^{-\{p\}-\lfloor p \rfloor +1}\nonumber \\&= 2c'_p \left\| x-y \right\| ^{\{p\}}\left\| x \right\| ^{1-p}. \end{aligned}$$
(17)

To estimate the second term in (16), choose \(a=\max \{\Vert x\Vert , \Vert y\Vert \}\) and consider first the case where \(\Vert x-y\Vert \ge a\). In this case,

$$\begin{aligned} |t^{\lfloor p \rfloor -1}-s^{\lfloor p \rfloor -1}|&\le [t^{\lfloor p \rfloor -1}+ s^{\lfloor p \rfloor -1}]a^{-\{p\}}\Vert x-y\Vert ^{\{p\}} \\&\le [\Vert x\Vert ^{1-p}+\Vert y\Vert ^{1-p}]\Vert x-y\Vert ^{\{p\}}. \end{aligned}$$

If \(\Vert x-y\Vert \le a\), we claim that

$$\begin{aligned} |t^{\lfloor p \rfloor -1}-s^{\lfloor p \rfloor -1}| \le 2p(\left\| x \right\| ^{1-p} + \left\| y \right\| ^{1-p})\left\| x-y \right\| ^{\{p\}}. \end{aligned}$$
(18)

Let us check (18). Put for simplification \(m:=\lfloor p \rfloor -1\). Applying the elementary bound

$$\begin{aligned} \vert u^m - v^m \vert&= \vert u-v \vert \vert u^{m-1}+ u^{m-2}v + \dots + v^{m-1} \vert \\&\le \vert u-v \vert \, m\max (\vert u \vert ,\vert v \vert )^{m-1},\quad u,v\in \mathbb {R}, \end{aligned}$$

which is optimal when v tends to u, gives

$$\begin{aligned} |t^{\lfloor p \rfloor -1}-s^{\lfloor p \rfloor -1}| = \frac{\vert \left\| x \right\| ^m - \left\| y \right\| ^m \vert }{\left\| x \right\| ^m\left\| y \right\| ^m}&\le \frac{ma^{m-1}}{\left\| x \right\| ^m\left\| y \right\| ^m}\left\| x-y \right\| . \end{aligned}$$

As \(m<p\) and \(\left\| x-y \right\| ^{1-\{p\}}\le (2a)^{1-\{p\}} \le 2a^{1-\{p\}}\), this leads to

$$\begin{aligned} |t^{\lfloor p \rfloor -1}-s^{\lfloor p \rfloor -1}|&\le 2p\frac{a^{\lfloor p \rfloor -1-\{p\}}}{\left\| x \right\| ^{\lfloor p \rfloor -1}\left\| y \right\| ^{\lfloor p \rfloor -1}}\left\| x-y \right\| ^{\{p\}}. \end{aligned}$$

If \(a=\left\| x \right\| \), this provides

$$\begin{aligned} \frac{|t^{\lfloor p \rfloor -1}-s^{\lfloor p \rfloor -1}|}{\left\| x-y \right\| ^{\{p\}}} \le 2p\frac{a^{-\{p\}}}{\left\| y \right\| ^{\lfloor p \rfloor -1}} = 2p \left\| y \right\| ^{1-p} \left( \frac{\left\| y \right\| }{a}\right) ^{\{p\}} \le 2p \left\| y \right\| ^{1-p}. \end{aligned}$$

Obviously the same estimate holds replacing \(\left\| y \right\| \) by \(\left\| x \right\| \) when \(a=\left\| y \right\| \), and adding both estimates to have a common bound gives (18).

Gathering the estimates, we obtain for the second term in (16),

$$\begin{aligned} \Vert \psi ^{(\lfloor p \rfloor )}(sy)\Vert \,|t^{\lfloor p \rfloor -1}-s^{\lfloor p \rfloor -1}| \le 2pc_{\lfloor p \rfloor }(\left\| x \right\| ^{1-p} + \left\| y \right\| ^{1-p})\left\| x-y \right\| ^{\{p\}}. \end{aligned}$$

Accounting for (17), this completes the proof of (15) with \(c_p:= 2(c'_p+pc_{\lfloor p \rfloor })\). \(\square \)

Let us go back to the construction of \(f^\varepsilon \), \(f_\varepsilon \). Lemma 5 quantifies in some way the non-membership of the norm of \(\mathsf {B}\) in the space \(\mathrm {C}^{(p)}_b(\mathsf {B})\). To remedy this drawback, the idea is to modify \(\psi \) inside the ball \(B(0,\varepsilon )\) by flattening to zero the peak of \(\psi \) in the ball \(B(0,\varepsilon /2)\) and using a transition through \(B(0,\varepsilon )\setminus B(0,\varepsilon /2)\) smooth enough to obtain an approximation of \(\psi \) by a function \(g_\varepsilon \) in \(\mathrm {C}^{(p)}_b(\mathsf {B})\). To this aim, let us choose a function \(u\in \mathrm {C}_b^{(\infty )}([0,\infty ))\) such that \(0\le u\le 1\), \(u=0\) on [0, 1/2], \(u=1\) on \([1,\infty )\). Set

$$\begin{aligned} g_\varepsilon (x)=\psi (x)u(\varepsilon ^{-1}\psi (x)),\ \ x\in \mathsf {B}. \end{aligned}$$
(19)

More explicitly,

$$\begin{aligned} g_\varepsilon (x) = {\left\{ \begin{array}{ll} 0 &{}\quad \text {if }0\le \left\| x \right\| \le \frac{\varepsilon }{2},\\ \left\| x \right\| u\left( \frac{\left\| x \right\| }{\varepsilon }\right) &{}\quad \text {if }\frac{\varepsilon }{2}< \left\| x \right\| < \varepsilon ,\\ \left\| x \right\| &{}\quad \text {if }\left\| x \right\| \ge \varepsilon . \end{array}\right. } \end{aligned}$$

The function \(g_\varepsilon \) uniformly approximates the norm. Indeed,

$$\begin{aligned} \sup _{x\in \mathsf {B}}|\psi (x)-g_\varepsilon (x)|=\sup _{x}\psi (x)|1-u(\varepsilon ^{-1}\psi (x))|\le \varepsilon , \end{aligned}$$
(20)

since \(u(\varepsilon ^{-1}\psi (x))=1\), if \(\psi (x)\ge \varepsilon \).
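
The construction is easy to reproduce numerically; in the Python sketch below (our own illustration, with one concrete choice of the cutoff u built from \(t\mapsto e^{-1/t}\)), the bound (20) can be observed directly:

```python
import numpy as np

def theta(t):
    """A C-infinity function vanishing on (-inf, 0]: exp(-1/t) for t > 0."""
    t = np.asarray(t, dtype=float)
    return np.where(t > 0, np.exp(-1.0 / np.where(t > 0, t, 1.0)), 0.0)

def u(t):
    """One concrete C-infinity cutoff with u = 0 on [0, 1/2] and u = 1 on [1, inf)."""
    a, b = theta(2*t - 1), theta(2 - 2*t)
    return a / (a + b)

def g(x, eps):
    """The smoothed norm g_eps(x) = ||x|| u(||x||/eps) of (19), here on R^d."""
    nx = np.linalg.norm(x)
    return nx * float(u(nx / eps))

eps = 0.5
for r in (0.1, 0.3, 0.4, 0.6, 2.0):
    x = np.array([r, 0.0])
    # (20): the smoothed norm never deviates from ||x|| by more than eps
    print(r, g(x, eps), abs(r - g(x, eps)) <= eps)
```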

The next lemma establishes the membership of \(g_\varepsilon \) in \(\mathrm {C}^{(p)}_b(\mathsf {B})\) and provides some control of its norm \(\left\| g_\varepsilon \right\| _{(p)}\), defined by (1), in terms of the parameter \(\varepsilon \).

Lemma 6

There is a constant \(C>0\) such that

$$\begin{aligned} \sup _{x\in \mathsf {B}}\Vert g^{(i)}_\varepsilon (x)\Vert \le C\varepsilon ^{1-i},\ \ i=1, \dots , \lfloor p \rfloor \end{aligned}$$
(21)

and

$$\begin{aligned} \sup _{x, y\in \mathsf {B}}\Vert g^{(\lfloor p \rfloor )}_\varepsilon (x)-g^{(\lfloor p \rfloor )}_\varepsilon (y)\Vert \le C\varepsilon ^{1-p}\Vert x-y\Vert ^{\{p\}}. \end{aligned}$$
(22)

Proof of Lemma 6

Denoting

$$\begin{aligned}t_x:=\varepsilon ^{-1}\psi (x),\quad v(t):=tu(t),\end{aligned}$$

we rewrite

$$\begin{aligned} g_\varepsilon (x)=\varepsilon v(t_x). \end{aligned}$$

An immediate induction provides the following formula for the successive derivatives of v

$$\begin{aligned} v^{(m)}(t) = tu^{(m)}(t) + mu^{(m-1)}(t),\quad t\ge 0, m\ge 1. \end{aligned}$$

We note also that \(u^{(m)}(1/2) = u^{(m)}(1)=0\) for every \(m\ge 1\), since u is \(C^\infty \) and constant to the right of 1 and to the left of 1/2. The values of \(v^{(m)}\) on \([0,\frac{1}{2}]\cup [1,\infty )\) are then

$$\begin{aligned} v'(t) = {\left\{ \begin{array}{ll} 0 &{}\quad \text {if }t\le 1/2,\\ 1 &{}\quad \text {if }t\ge 1, \end{array}\right. } \quad v^{(m)}(t) = {\left\{ \begin{array}{ll} 0 &{}\quad \text {if }t\le 1/2,\\ 0 &{}\quad \text {if }t\ge 1, \end{array}\right. } \quad m>1. \end{aligned}$$
(23)

Together with the infinite differentiability of v, this implies that for any integer \(m\ge 1\),

$$\begin{aligned} C_m:=\sup _{t\ge 0}|v^{(m)}(t)|<\infty ,\ \ D_m:=\sup _{\begin{array}{c} s,t\ge 0\\ t\not =s \end{array}}\frac{|v^{(m)}(t)-v^{(m)}(s)|}{|t-s|^{\{p\}}}<\infty . \end{aligned}$$
(24)

Using differentiation of composite functions on Banach spaces (see, e.g., Fraenkel [11]), we find

$$\begin{aligned} \frac{\mathrm {d}^m}{\mathrm {d}x^m}v(t_x)(h_1, \dots , h_m) = \sum _{j=1}^m\sum _{\begin{array}{c} \beta \in <j>_+\\ |\beta |=m \end{array}}\sum _{\sigma } \frac{v^{(j)}(t_x)}{j!\beta !}I_{\beta , \sigma }(x)(h_1, \dots , h_m) \end{aligned}$$
(25)

with

$$\begin{aligned} I_{\beta , \sigma }(x)(h_1, \dots , h_m) := t_x^{(\beta _1)}(h_{\sigma _1}, \dots , h_{\sigma _{\beta _1}})\cdots t^{(\beta _j)}_x(h_{\sigma _{m-\beta _j+1}}, \dots , h_{\sigma _m}), \end{aligned}$$

where \(\beta \in <j>_+\) means that \(\beta =(\beta _1, \dots , \beta _j)\) with integers \(\beta _i\ge 1, i=1, \dots , j\), \(|\beta |=\beta _1+\cdots +\beta _j\) and \(\beta !=\beta _1!\cdots \beta _j!\). In \(\sum _{\sigma }\), the summation runs over the m! permutations \(\sigma \) of \(\{1, \dots , m\}\).

To prove (21), we need to bound \(\left\| I_{\beta , \sigma }(x) \right\| \) only for \(\left\| x \right\| >\frac{\varepsilon }{2}\) since for \(m\ge 1\), \(v^{(m)}(t_x)=0\) for \(t_x\in [0,1/2]\), that is, for \(\left\| x \right\| \le \frac{\varepsilon }{2}\). So assuming \(\left\| x \right\| >\frac{\varepsilon }{2}\) and accounting for (14), we obtain

$$\begin{aligned} \Vert I_{\beta , \sigma }(x)\Vert \le \Vert t_x^{(\beta _1)}\Vert \cdots \Vert t_x^{(\beta _j)}\Vert&\le \left( \frac{1}{\varepsilon }c_{\beta _1}\left\| x \right\| ^{1-\beta _1}\right) \cdots \left( \frac{1}{\varepsilon }c_{\beta _j}\left\| x \right\| ^{1-\beta _j}\right) \nonumber \\&= c_{\beta _1}\cdots c_{\beta _j}\varepsilon ^{-j}\left\| x \right\| ^{j-m}\nonumber \\&\le 2^mc_{\beta _1}\cdots c_{\beta _j}\varepsilon ^{-m}. \end{aligned}$$
(26)

Consequently, \(\Big \Vert \frac{\mathrm {d}^m}{\mathrm {d}x^m}v(t_x)\Big \Vert \le K_m\varepsilon ^{-m}\) for some constant \(K_m\), and as \(g_\varepsilon (x)=\varepsilon v(t_x)\), (21) is checked.

To prove (22), we have to find a bound of the form \(c\varepsilon ^{-p}\Vert x-y\Vert ^{\{p\}}\) for each

$$\begin{aligned} \varDelta _{\beta ,\sigma }(x, y):=v^{(j)}(t_x)I_{\beta , \sigma }(x)-v^{(j)}(t_y)I_{\beta , \sigma }(y),\quad \beta \in <j>_+,\; 1\le j\le \lfloor p \rfloor ,\nonumber \\ \end{aligned}$$
(27)

because of the decomposition

$$\begin{aligned} \frac{\mathrm {d}^{\lfloor p \rfloor }}{\mathrm {d}x^{\lfloor p \rfloor }}v(t_x)-\frac{\mathrm {d}^{\lfloor p \rfloor }}{\mathrm {d}y^{\lfloor p \rfloor }}v(t_y)= \sum _{j=1}^{\lfloor p \rfloor }\sum _{\begin{array}{c} \beta \in <j>_+ \\ |\beta |=\lfloor p \rfloor \end{array}}\sum _{\sigma }\frac{1}{j!\beta !}\varDelta _{\beta ,\sigma }(x, y). \end{aligned}$$
(28)

In view of (23), the discussion is naturally ordered according to the various configurations of \(\left\| x \right\| \) and \(\left\| y \right\| \) relative to the open interval \((\frac{\varepsilon }{2},\varepsilon )\).

Case 1: \(\left\| x \right\| \) and \(\left\| y \right\| \) are both outside \((\frac{\varepsilon }{2},\varepsilon )\). As \(t_x\), \(t_y\) are both outside \((\frac{1}{2},1)\), it is clear from (23) that for \(j\ge 2\), \(v^{(j)}(t_x)=v^{(j)}(t_y)=0\) whence \(\varDelta _{\beta ,\sigma }(x, y)=0\). For the same reason, \(\varDelta _{\beta ,\sigma }(x,y)=0\) when \(j=1\) and \(\left\| x \right\| ,\left\| y \right\| \le \frac{\varepsilon }{2}\). If \(j=1\) and \(\left\| x \right\| ,\left\| y \right\| \ge \varepsilon \),

$$\begin{aligned} \varDelta _{\beta ,\sigma }(x,y)(h_1,\dots ,h_{\lfloor p \rfloor })&= (I_{\beta ,\sigma }(x) - I_{\beta ,\sigma }(y))(h_1,\dots ,h_{\lfloor p \rfloor })\\&= (t_x^{({\lfloor p \rfloor })} - t_y^{({\lfloor p \rfloor })})(h_{\sigma _1},\dots ,h_{\sigma _{\lfloor p \rfloor }}), \end{aligned}$$

whence recalling (15),

$$\begin{aligned} \left\| \varDelta _{\beta ,\sigma }(x,y) \right\| \le \left\| t_x^{({\lfloor p \rfloor })} - t_y^{({\lfloor p \rfloor })} \right\|&= \varepsilon ^{-1}\left\| \psi ^{({\lfloor p \rfloor })}(x) - \psi ^{({\lfloor p \rfloor })}(y) \right\| \nonumber \\&\le c_p\varepsilon ^{-1}(\left\| x \right\| ^{1-p} + \left\| y \right\| ^{1-p})\left\| x-y \right\| ^{\{p\}} \nonumber \\&\le 2c_p\varepsilon ^{-p}\left\| x-y \right\| ^{\{p\}}. \end{aligned}$$
(29)

Case 2: only one of \(\left\| x \right\| \), \(\left\| y \right\| \) is inside \((\frac{\varepsilon }{2},\varepsilon )\). By symmetry, it suffices to treat the configurations where \(\left\| x \right\| \) is inside \((\frac{\varepsilon }{2},\varepsilon )\) and \(\left\| y \right\| \) outside.

Case 2.a: \(0<\left\| y \right\| \le \frac{\varepsilon }{2}< \left\| x \right\| < \varepsilon \). In this configuration, for \(j\ge 1\) and \(\beta \in <j>_+\),

$$\begin{aligned} \left\| \varDelta _{\beta ,\sigma }(x,y) \right\| = \vert v^{(j)}(t_x) \vert \left\| I_{\beta ,\sigma }(x) \right\| = \vert v^{(j)}(t_x)-v^{(j)}(t_y) \vert \left\| I_{\beta ,\sigma }(x) \right\| . \end{aligned}$$
(30)

From (24),

$$\begin{aligned} \vert v^{(j)}(t_x)-v^{(j)}(t_y) \vert&\le D_j\vert t_x-t_y \vert ^{\{p\}} = D_j \varepsilon ^{-\{p\}}\bigl \vert \left\| x \right\| - \left\| y \right\| \bigr \vert ^{\{p\}} \\&\le D_j \varepsilon ^{-\{p\}} \left\| x-y \right\| ^{\{p\}} \end{aligned}$$

which together with (26), leads to

$$\begin{aligned} \left\| \varDelta _{\beta ,\sigma }(x,y) \right\| \le (2^p c_{\beta _1}\cdots c_{\beta _j}D_j) \varepsilon ^{-p}\left\| x-y \right\| ^{\{p\}},\quad \beta \in <j>_+,\; j\ge 1. \end{aligned}$$

Case 2.b: \(\frac{\varepsilon }{2}< \left\| x \right\| < \varepsilon \le \left\| y \right\| \). For \(j\ge 2\), (30) still holds and exactly the same argument as above gives

$$\begin{aligned} \left\| \varDelta _{\beta ,\sigma }(x,y) \right\| \le (2^p c_{\beta _1}\cdots c_{\beta _j}D_j) \varepsilon ^{-p}\left\| x-y \right\| ^{\{p\}},\quad \beta \in <j>_+,\; j\ge 2. \end{aligned}$$

In the special case where \(j=1\), as \(v'(t_y)=1\),

$$\begin{aligned} \varDelta _{\beta ,\sigma }(x,y) = v'(t_x)I_{\beta ,\sigma }(x) - I_{\beta ,\sigma }(y), \end{aligned}$$

whence

$$\begin{aligned} \left\| \varDelta _{\beta ,\sigma }(x,y) \right\| \le \vert v'(t_x)-v'(t_y) \vert \left\| I_{\beta ,\sigma }(x) \right\| + \left\| I_{\beta ,\sigma }(x) - I_{\beta ,\sigma }(y) \right\| . \end{aligned}$$

Bounding the first term on the right-hand side exactly as in case 2.a, and the second one as in (29) of case 1 (now with \(\left\| x \right\| >\frac{\varepsilon }{2}\) instead of \(\left\| x \right\| \ge \varepsilon \), which yields the constant \(2^pc_p\) in place of \(2c_p\)), we obtain

$$\begin{aligned} \left\| \varDelta _{\beta ,\sigma }(x,y) \right\| \le (2^p c_{\beta _1}\cdots c_{\beta _j}D_j + 2^pc_p) \varepsilon ^{-p}\left\| x-y \right\| ^{\{p\}},\quad \beta \in <1>_+. \end{aligned}$$

Case 3: \(\frac{\varepsilon }{2}< \left\| x \right\| ,\left\| y \right\| < \varepsilon \). We start from

$$\begin{aligned} \left\| \varDelta _{\beta ,\sigma }(x, y) \right\|&\le \vert v^{(j)}(t_x)-v^{(j)}(t_y) \vert \cdot \left\| I_{\beta , \sigma }(x) \right\| \nonumber \\&\quad +\vert v^{(j)}(t_y) \vert \cdot \left\| I_{\beta , \sigma }(x)-I_{\beta , \sigma }(y) \right\| . \end{aligned}$$
(31)

The first term in the right-hand side of (31) is bounded exactly as in case 2.a:

$$\begin{aligned}&\vert v^{(j)}(t_x)-v^{(j)}(t_y) \vert \left\| I_{\beta , \sigma }(x) \right\| \le (2^p c_{\beta _1}\cdots c_{\beta _j}D_j) \varepsilon ^{-p}\left\| x-y \right\| ^{\{p\}},\nonumber \\&\beta \in <j>_+,\; j\ge 1. \end{aligned}$$

For the second term, let us first treat the special case where \(j=1\). Arguing as in case 1, we just have to replace \(\varepsilon \) by \(\frac{\varepsilon }{2}\) in (29), so accounting for (24),

$$\begin{aligned} \vert v^{(1)}(t_y) \vert \cdot \left\| I_{\beta , \sigma }(x)-I_{\beta , \sigma }(y) \right\| \le 2^{p+1}C_1c_p\varepsilon ^{-p}\left\| x-y \right\| ^{\{p\}}, \quad \beta \in <1>_+. \end{aligned}$$

Assume now that \(\beta \in <j>_+\) with \(2\le j\le \lfloor p \rfloor \). As \(\vert \beta \vert =\lfloor p \rfloor \), for each component \(\beta _i\) of \(\beta \), \(\beta _i<\lfloor p \rfloor \). Using telescopic summation where at each step one factor \(t_x^{(\beta _i)}\) is replaced by \(t_y^{(\beta _i)}\) gives

$$\begin{aligned} I_{\beta ,\sigma }(x) - I_{\beta ,\sigma }(y) = \sum _{i=1}^j\left( \prod _{1\le k<i} t_x^{(\beta _k)}\right) \left( t_x^{(\beta _i)} - t_y^{(\beta _i)}\right) \left( \prod _{i< k\le j} t_y^{(\beta _k)}\right) , \end{aligned}$$

with the usual convention that a product indexed by \(\emptyset \) equals 1. Therefore,

$$\begin{aligned} \left\| I_{\beta ,\sigma }(x) - I_{\beta ,\sigma }(y) \right\| \le \sum _{i=1}^j \varPi _{\beta ,i} \left\| t_x^{(\beta _i)} - t_y^{(\beta _i)} \right\| , \end{aligned}$$
(32)

where

$$\begin{aligned} \varPi _{\beta ,i} := \left( \prod _{1\le k<i} \left\| t_x^{(\beta _k)} \right\| \right) \left( \prod _{i< k\le j} \left\| t_y^{(\beta _k)} \right\| \right) . \end{aligned}$$

It is easy to bound \(\varPi _{\beta ,i}\) since by (14), \( \left\| t_x^{(\beta _k)} \right\| =\varepsilon ^{-1}\left\| \psi ^{(\beta _k)}(x) \right\| \le \varepsilon ^{-1}c_{\beta _k}\left\| x \right\| ^{1-\beta _k} \) and as \(\beta _k\ge 1\) and \(\left\| x \right\| >\frac{\varepsilon }{2}\), \(\left\| x \right\| ^{1-\beta _k}\le 2^{\beta _k-1}\varepsilon ^{1-\beta _k}\). Obviously, the same holds for \(\left\| t_y^{(\beta _k)} \right\| \) and all this gives

$$\begin{aligned} \varPi _{\beta ,i} \le 2^{\lfloor p \rfloor - \beta _i}\varepsilon ^{\beta _i-\lfloor p \rfloor } \prod _{\begin{array}{c} 1\le k\le j \\ k\ne i \end{array} }c_{\beta _k}. \end{aligned}$$
(33)

Recalling that \(\beta _i<\lfloor p \rfloor \), \(1\le i\le j\), it remains to estimate for any \(1\le m< \lfloor p \rfloor \),

$$\begin{aligned} \delta _m(x, y) := \Vert t_x^{(m)}-t_y^{(m)}\Vert =\varepsilon ^{-1}\Vert \psi ^{(m)}(x)-\psi ^{(m)}(y)\Vert . \end{aligned}$$

In view of (14), it seems relevant to apply the mean-value theorem for derivatives to the function \(\psi ^{(m)}:\mathsf {B}\setminus \{0\}\rightarrow \mathscr {L}_m(\mathsf {B}) \). But then, care must be taken of the inclusion of the segment [x, y] in the open set \(\mathsf {B}\setminus \{0\}\). If 0 belongs to [x, y], then there exists \(s\in [0,1]\) such that \((1-s)x+sy =0\), that is, \(x=s(x-y)\), whence \(\left\| x \right\| =s\left\| x-y \right\| \). If \(\left\| x-y \right\| \le \frac{\varepsilon }{2}\), this equality is impossible since \(s\in [0,1]\) and \(\left\| x \right\| >\frac{\varepsilon }{2}\). Accordingly, we separate the cases \(\left\| x-y \right\| \le \frac{\varepsilon }{2}\) and \( \left\| x-y \right\| >\frac{\varepsilon }{2}\).

If \(\left\| x-y \right\| \le \frac{\varepsilon }{2}\), by the mean-value theorem, (14) and the convexity of the function \(t\mapsto t^{-m}\),

$$\begin{aligned} \delta _m(x, y)&\le \varepsilon ^{-1}\sup _{s\in [0,1]}\left\| \psi ^{(m+1)}((1-s)x+sy) \right\| \left\| x-y \right\| \\&\le \varepsilon ^{-1}c_{m+1}\sup _{s\in [0,1]}\left\| (1-s)x+sy \right\| ^{-m} \left( \frac{\varepsilon }{2}\right) ^{1-\{p\}} \left\| x-y \right\| ^{\{p\}}\\&\le 2^{\{p\}-1}c_{m+1}\sup _{s\in [0,1]}\big ((1-s)\left\| x \right\| ^{-m} + s\left\| y \right\| ^{-m}\big )\varepsilon ^{-\{p\}} \left\| x-y \right\| ^{\{p\}}\\&\le 2^{m+\{p\}-1}c_{m+1}\varepsilon ^{-m-\{p\}} \left\| x-y \right\| ^{\{p\}}. \end{aligned}$$

If \(\left\| x-y \right\| >\frac{\varepsilon }{2}\), it is enough to use (14) as follows.

$$\begin{aligned} \left\| \psi ^{(m)}(x)-\psi ^{(m)}(y) \right\|&\le \left\| \psi ^{(m)}(x) \right\| + \left\| \psi ^{(m)}(y) \right\| \\&\le c_m\big (\left\| x \right\| ^{1-m} + \left\| y \right\| ^{1-m} \big ) \le 2^m c_m \varepsilon ^{1-m}. \end{aligned}$$

Noticing that here \(\varepsilon < 2\left\| x-y \right\| \), this gives

$$\begin{aligned} \delta _m(x, y) \le 2^{m+\{p\}}c_m\varepsilon ^{-m-\{p\}}\left\| x-y \right\| ^{\{p\}}. \end{aligned}$$

So we can retain from both cases the common bound:

$$\begin{aligned} \delta _m(x, y) \le 2^{m+\{p\}}\max (c_m,c_{m+1})\varepsilon ^{-m-\{p\}}\left\| x-y \right\| ^{\{p\}}, \end{aligned}$$

for \( \frac{\varepsilon }{2}< \left\| x \right\| ,\left\| y \right\| < \varepsilon \), \(1\le m <\lfloor p \rfloor \). This together with (33) enables us to bound the \(i^{\,\text {th}}\) term in (32) as

$$\begin{aligned} \varPi _{\beta ,i} \left\| t_x^{(\beta _i)} - t_y^{(\beta _i)}\! \right\|&\le 2^{\lfloor p \rfloor - \beta _i}\varepsilon ^{\beta _i-\lfloor p \rfloor } \! \prod _{\begin{array}{c} 1\le k\le j \\ k\ne i \end{array} } \!c_{\beta _k}2^{\beta _i+\{p\}}\! \max (c_{\beta _i},c_{\beta _i+1})\varepsilon ^{-\beta _i-\{p\}}\!\left\| x-y \right\| ^{\{p\}}\\&= 2^p\prod _{\begin{array}{c} 1\le k\le j \\ k\ne i \end{array} }c_{\beta _k}\,\max (c_{\beta _i},c_{\beta _i+1}) \varepsilon ^{-p}\left\| x-y \right\| ^{\{p\}}. \end{aligned}$$

Finally, accounting for (24),

$$\begin{aligned} \vert v^{(j)}(t_y) \vert \cdot \left\| I_{\beta , \sigma }(x)-I_{\beta , \sigma }(y) \right\| \le c \varepsilon ^{-p}\left\| x-y \right\| ^{\{p\}},\quad \quad \beta \in <j>_+,\; 2\le j\le \lfloor p \rfloor . \end{aligned}$$

This completes the proof of (22) and of Lemma 6. \(\square \)

Next, for each \(\varepsilon >0\), we construct a \(\lfloor p \rfloor \)-times Fréchet differentiable function \(\phi _\varepsilon :\mathsf {B}\rightarrow \mathbb {R}\) such that

$$\begin{aligned} \phi _{\varepsilon }(x)={\left\{ \begin{array}{ll}1\ \ &{}\quad \text {if}\ \ \Vert x\Vert \le 1\\ 0\ \ &{}\quad \text {if}\ \ \Vert x\Vert >1+\varepsilon \end{array}\right. } \end{aligned}$$
(34)

and

$$\begin{aligned} \sum _{j=0}^{\lfloor p \rfloor }\sup _{\varepsilon>0}\varepsilon ^{j}\sup _{x\in \mathsf {B}}\Vert \phi _\varepsilon ^{(j)}(x)\Vert +\sup _{\varepsilon >0}\varepsilon ^{p}\sup _{x\not =y} \frac{\Vert \phi _\varepsilon ^{(\lfloor p \rfloor )}(x)-\phi ^{(\lfloor p \rfloor )}_\varepsilon (y)\Vert }{\Vert x-y\Vert ^{\{p\}}}<\infty . \end{aligned}$$
(35)

To this aim, let the function \(q\in \mathrm {C}^{(\infty )}_b(\mathbb {R})\) be such that \(0\le q\le 1\), \(q(t)=1\) if \(t<1/8\), and \(q(t)=0\) if \(t>7/8\). Set

$$\begin{aligned} \phi _\varepsilon (x)=q(\varepsilon ^{-1}(g_{\varepsilon /8}(x)-1)),\ \ x\in \mathsf {B}. \end{aligned}$$

If \(\Vert x\Vert \le 1\), then \((\Vert x\Vert -1)\varepsilon ^{-1}\le 0\) and

$$\begin{aligned} \frac{g_{\varepsilon /8}(x)-1}{\varepsilon }=\frac{g_{\varepsilon /8}(x)-1}{\varepsilon }-\frac{\Vert x\Vert -1}{\varepsilon }+\frac{\Vert x\Vert -1}{\varepsilon }\le \frac{g_{\varepsilon /8}(x)-\Vert x\Vert }{\varepsilon }\le 1/8 \end{aligned}$$

therefore \(\phi _\varepsilon (x)=1\). If \(\Vert x\Vert >1+\varepsilon \), then

$$\begin{aligned} \frac{g_{\varepsilon /8}(x)-1}{\varepsilon }&=\frac{g_{\varepsilon /8}(x)-1}{\varepsilon }- \frac{\Vert x\Vert -1}{\varepsilon }+\frac{\Vert x\Vert -1}{\varepsilon }> 1+\frac{g_{\varepsilon /8}(x)-1}{\varepsilon }- \frac{\Vert x\Vert -1}{\varepsilon }\\&\ge 1-\Big |\frac{g_{\varepsilon /8}(x)-1}{\varepsilon }-\frac{\Vert x\Vert -1}{\varepsilon }\Big |\ge 1-1/8=7/8, \end{aligned}$$

therefore \(\phi _\varepsilon (x)=0\), and (34) is confirmed. It remains to evaluate the derivatives of the function \(\phi _\varepsilon \). This can be done in much the same way as we proved (21) and (22). Finally, we use \(\phi _\varepsilon \), \(\varepsilon >0\), to define the required functions

$$\begin{aligned} f^\varepsilon (x)=\prod _{i=1}^m \phi _{\varepsilon r_i^{-1}}(r_i^{-1}(x-x_i)),\quad f_\varepsilon (x)=\prod _{i=1}^m \phi _{\varepsilon (r_i-\varepsilon )^{-1}}((r_i-\varepsilon )^{-1}(x-x_i)). \end{aligned}$$

It is straightforward to check that \(\Vert f^\varepsilon \Vert _{(p)}<\infty \) and \(\Vert f_\varepsilon \Vert _{(p)}<\infty \). This completes the proof of \((ii)\Rightarrow (i)\) and of Theorem 4. \(\square \)
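
For the interested reader, the whole chain \(u\rightarrow g_\varepsilon \rightarrow \phi _\varepsilon \) can be assembled numerically; the Python sketch below (our own illustration, with concrete choices of u and q) reproduces the behavior (34) in \(\mathbb {R}^2\):

```python
import numpy as np

def theta(t):
    t = np.asarray(t, dtype=float)
    return np.where(t > 0, np.exp(-1.0 / np.where(t > 0, t, 1.0)), 0.0)

def u(t):    # C-infinity, u = 0 on [0, 1/2], u = 1 on [1, inf)
    return theta(2*t - 1) / (theta(2*t - 1) + theta(2 - 2*t))

def q(t):    # C-infinity, q = 1 for t < 1/8, q = 0 for t > 7/8
    return theta(7/8 - t) / (theta(7/8 - t) + theta(t - 1/8))

def g(x, eps):                       # smoothed norm, as in (19)
    nx = np.linalg.norm(x)
    return nx * float(u(nx / eps))

def phi(x, eps):                     # smooth indicator of the unit ball, as in (34)
    return float(q((g(x, eps / 8) - 1.0) / eps))

eps = 0.25
for r in (0.5, 1.0, 1.1, 1.3):
    x = np.array([r, 0.0])
    print(r, phi(x, eps))            # 1 for ||x|| <= 1, 0 for ||x|| > 1 + eps
```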

Theorem 7

If the Banach space \(\mathsf {B}\) is \(\infty \)-smooth, then for \((\mathsf {P}, \mathsf {P}_n, n\in \mathbb {N})\subset \mathscr {P}(\mathsf {B})\), the following statements are equivalent:

  1. (i)

    \(\mathsf {P}_n\xrightarrow [n\rightarrow \infty ]{\mathrm {w}}\mathsf {P}\);

  2. (ii)

    \(\mathsf {P}_n f\xrightarrow [n\rightarrow \infty ]{}\mathsf {P}f\) for any \(f\in \mathrm {C}^{(\infty )}_b(\mathsf {B})\).

Proof

The proof of Theorem 4 adapts to the case of an \(\infty \)-smooth Banach space as well. One needs to follow its lines keeping in mind that the functions involved are infinitely differentiable (or p-differentiable for any \(p>1\)). So the finally constructed functions \(f^{\varepsilon }\) and \(f_{\varepsilon }\) belong to \(\mathrm {C}^{(\infty )}_b(\mathsf {B})\). \(\square \)

Remark 8

From the proof of Theorem 4, we see that the differentiability of the norm could be replaced by a smooth approximation of it, in the sense that for each \(\varepsilon >0\), there exists a \(\lfloor p \rfloor \)-times Fréchet differentiable function \(\psi _\varepsilon :\mathsf {B}\rightarrow \mathbb {R}\) such that

  1. (a)

    for any \(\varepsilon >0\),

    $$\begin{aligned} \sup _{x\in \mathsf {B}}|\psi _\varepsilon (x)-\psi (x)|\le \varepsilon ; \end{aligned}$$
  2. (b)

    with some constant \(C>0\),

    $$\begin{aligned} \sup _{x\in \mathsf {B}}\Vert \psi ^{(i)}_\varepsilon (x)\Vert \le C\varepsilon ^{1-i},\ \ i=1, \dots , \lfloor p \rfloor ; \end{aligned}$$
  3. (c)

    with some constant \(C>0\),

    $$\begin{aligned} \sup _{x\not =y, x,y \in \mathsf {B}}\frac{\Vert \psi ^{(\lfloor p \rfloor )}_\varepsilon (x)-\psi ^{(\lfloor p \rfloor )}_\varepsilon (y)\Vert }{\Vert x-y\Vert ^{\{p\}}}\le C\varepsilon ^{1-p}. \end{aligned}$$

Remark 9

Since \(\mathrm {C}_b^{(p)}(\mathsf {B})\subset \mathrm {C}_b^{(p')}(\mathsf {B})\) if \(p>p'\), it holds that

$$\begin{aligned} \zeta _p(\mathsf {P}, \mathsf {Q})\le \zeta _{p'}(\mathsf {P},\mathsf {Q}),\ \ \mathsf {P}, \mathsf {Q}\in \mathscr {P}(\mathsf {B}). \end{aligned}$$

Remark 10

For \(\mathsf {B}\)-valued random variables X, Y, we set

$$\begin{aligned} \zeta _p(X,Y):=\zeta _p(\mathsf {P}_X, \mathsf {P}_Y), \end{aligned}$$

where \(\mathsf {P}_X\) denotes the distribution of X. Hence, if \(\mathsf {B}\) is p-smooth, then in order to check convergence in distribution of a sequence \((X_n, n\in \mathbb {N})\) of \(\mathsf {B}\)-valued random variables to a \(\mathsf {B}\)-valued random variable X, it is enough to prove \(\zeta _p(X_n, X)\xrightarrow [n\rightarrow \infty ]{}0\). The use of \(\zeta _{p}\) in proving convergence in distribution of random variables is attractive due to the following simple but powerful properties of \(\zeta _p\):

  1. (a)

    for each \(c\in \mathbb {R}\),

    $$\begin{aligned} \zeta _{p}(cX, cY)\le \max \{1, |c|^{p}\}\zeta _p(X, Y); \end{aligned}$$
  2. (b)

    if the \(\mathsf {B}\)-valued random element Z is independent of (XY), then

    $$\begin{aligned} \zeta _{p}(X+Z, Y+Z)\le \zeta _{p}(X, Y); \end{aligned}$$
  3. (c)

    for independent \(\mathsf {B}\)-valued random elements \(X_1, \dots , X_n; Y_1, \dots , Y_n\),

    $$\begin{aligned} \zeta _{p}\Big (\sum _{k=1}^n X_k, \sum _{k=1}^n Y_k\Big )\le \sum _{k=1}^n \zeta _{p}(X_k, Y_k). \end{aligned}$$
    (36)

These properties of \(\zeta _p\) were discovered by Zolotarev [26], but actually are easy to prove. The statement (a) follows directly from the definition of \(\zeta _p\). To prove (b), one needs to use the Fubini theorem and the shift invariance of the function space \(\mathrm {C}_b^{(p)}(\mathsf {B})\), that is, invariance under the transformations \(T_x : \mathrm {C}_b^{(p)}(\mathsf {B})\rightarrow \mathrm {C}_b^{(p)}(\mathsf {B})\), \(f\mapsto T_xf:=f(x+\cdot )\), \(x\in \mathsf {B}\). Finally, (c) follows from (b) and the triangle inequality.
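
For instance, the argument sketched for (b) can be written out in one line: for any \(f\in \mathrm {C}_b^{(p)}(\mathsf {B})\) with \(\left\| f \right\| _{(p)}\le 1\), independence and the Fubini theorem give

$$\begin{aligned} \big |\mathrm{E}f(X+Z)-\mathrm{E}f(Y+Z)\big | = \Big |\int _{\mathsf {B}}\big (\mathrm{E}(T_zf)(X)-\mathrm{E}(T_zf)(Y)\big )\,\mathsf {P}_Z(\mathrm {d}z)\Big | \le \zeta _{p}(X, Y), \end{aligned}$$

since \(\left\| T_zf \right\| _{(p)}=\left\| f \right\| _{(p)}\le 1\) for every \(z\in \mathsf {B}\); taking the supremum over such f yields (b).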

3 Some Remarks on Smooth Banach Spaces

Various aspects of the differentiability of a Banach space norm are discussed in Sundaresan [23].

3.1 Smoothness and Type 2

Recall that a Banach space \(\mathsf {B}\) is said to be of type 2 if there is a constant \(K>0\) such that for any finite set of elements \(x_1,\dots ,x_n\) in \(\mathsf {B}\) and any Rademacher sequence \(\epsilon _1,\dots ,\epsilon _n\) (the \(\epsilon _i\) being independent and such that \(P(\epsilon _i = -1) = P(\epsilon _i = 1) = 1/2\)),

$$\begin{aligned} \big (\mathrm{E}\left\| \epsilon _1 x_1 + \cdots + \epsilon _n x_n \right\| ^2\big )^{1/2} \le K \big (\left\| x_1 \right\| ^2 +\cdots +\left\| x_n \right\| ^2\big )^{1/2}. \end{aligned}$$
(37)

By the Khintchine–Kahane inequality, which gives the equivalence of moments of Rademacher sums \(\sum _i \epsilon _i x_i\) (see, e.g., [17,  Th. 4.7]), the second moment on the left-hand side of (37) may be replaced by the first one, leading to the equivalent definition of type 2 by the inequality

$$\begin{aligned} \mathrm{E}\left\| \epsilon _1 x_1 + \cdots + \epsilon _n x_n \right\| \le K' \big (\left\| x_1 \right\| ^2 +\cdots +\left\| x_n \right\| ^2\big )^{1/2}. \end{aligned}$$
(38)

Moreover, by, e.g., [17,  Prop.9.11], if the separable Banach space \(\mathsf {B}\) is of type 2, there is a constant \(K>0\) depending only on \(\mathsf {B}\) such that for any finite set of mean zero independent \(\mathsf {B}\)-valued random elements \(X_1, \dots , X_n\),

$$\begin{aligned} \mathrm{E}\left\| X_1+\cdots +X_n \right\| ^2 \le K^2\big (\mathrm{E}\left\| X_1 \right\| ^2 +\cdots +\mathrm{E}\left\| X_n \right\| ^2\big ). \end{aligned}$$
(39)

This obviously implies that

$$\begin{aligned} \mathrm{E}\left\| X_1+\cdots +X_n \right\| \le K\big (\mathrm{E}\left\| X_1 \right\| ^2 +\cdots +\mathrm{E}\left\| X_n \right\| ^2\big )^{1/2}. \end{aligned}$$
(40)

Conversely, if in a separable Banach space \(\mathsf {B}\), any finite set of mean zero independent \(\mathsf {B}\)-valued random elements \(X_1, \dots , X_n\) satisfies (40), then choosing \(X_i=\epsilon _i x_i\) shows that \(\mathsf {B}\) satisfies (38); hence, \(\mathsf {B}\) is of type 2.
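
In a Hilbert space, (37) holds with \(K=1\) and with equality, which is easy to observe by simulation; the following Python sketch (our own illustration in \(\mathbb {R}^3\)) estimates the left-hand side of (37) by Monte Carlo:

```python
import numpy as np

rng = np.random.default_rng(1)

def type2_ratio(xs, n_mc=20000):
    """Monte Carlo estimate of (E||sum_i eps_i x_i||^2)^{1/2} / (sum_i ||x_i||^2)^{1/2}
    for Rademacher signs eps_i; in a Hilbert space the ratio equals 1 exactly."""
    n_vectors = xs.shape[0]
    eps = rng.choice([-1.0, 1.0], size=(n_mc, n_vectors))
    sums = eps @ xs                              # row k holds sum_i eps_{k,i} x_i
    lhs = np.sqrt(np.mean(np.sum(sums**2, axis=1)))
    rhs = np.sqrt(np.sum(xs**2))
    return lhs / rhs

xs = rng.normal(size=(10, 3))                    # ten vectors in R^3
print(type2_ratio(xs))                           # ~ 1.0, so (37) holds with K = 1
```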

Proposition 11

If the Banach space \(\mathsf {B}\) is 2-smooth, then \(\mathsf {B}\) is of type 2.

Proof

We just have to prove that (40) is satisfied in \(\mathsf {B}\). We use the functions \(g_\varepsilon \), \(\varepsilon >0\), defined in (19), which are in \(\mathrm {C}_b^{(2)}(\mathsf {B})\) by Lemma 6.

For any fixed \(a,b\in \mathsf {B}\), the map \(f:[0,1]\rightarrow \mathbb {R}\), \(t\mapsto f(t) := g_\varepsilon (a+tb)\), clearly has a continuous first derivative \(f'(t)=g_\varepsilon ^{(1)}(a+tb) \cdot b\), so \(f(1)-f(0) = \int _0^1 f'(t)\,\mathrm {d}t\), that is:

$$\begin{aligned} g_\varepsilon (a+b) - g_\varepsilon (a) = \int _0^1 g_\varepsilon ^{(1)}(a+tb) \cdot b \,\mathrm {d}t,\quad a,b\in \mathsf {B}. \end{aligned}$$
(41)

Denoting \(S_0=0\), \(S_j=X_1+\cdots +X_j, j=1, \dots , n\), we have by (20),

$$\begin{aligned} \mathrm{E}\Vert S_n\Vert \le \varepsilon +\mathrm{E}g_\varepsilon (S_n). \end{aligned}$$
(42)

Recalling that \(g_\varepsilon (0)=0\) and applying (41) gives

$$\begin{aligned} \mathrm{E}g_\varepsilon (S_n) = \mathrm{E}\Big (g_\varepsilon (S_n) - g_\varepsilon (0)\Big )&= \sum _{j=1}^n \mathrm{E}\Big (g_\varepsilon (S_j) - g_\varepsilon (S_{j-1})\Big )\\&= \sum _{j=1}^n \int _0^1 \mathrm{E}\Big (g_\varepsilon ^{(1)}(S_{j-1} + tX_j) \cdot X_j\Big ) \,\mathrm {d}t. \end{aligned}$$

It is well known that if \(\varphi \) is a continuous linear form on \(\mathsf {B}\) and X a random element in \(\mathsf {B}\) which is Bochner or Pettis integrable, then \(\mathrm{E}\varphi (X)= \varphi (\mathrm{E}X)\). Combining this property with the independence of \(S_{j-1}\) and \(X_j\) gives, via an obvious Fubini argument, that

$$\begin{aligned} \mathrm{E}(g_\varepsilon ^{(1)}(S_{j-1})\cdot X_j)= (\mathrm{E}g_\varepsilon ^{(1)}(S_{j-1}) ) \cdot (\mathrm{E}X_j) = 0. \end{aligned}$$

This enables us to rewrite the above decomposition of \(\mathrm{E}g_\varepsilon (S_n)\) as

$$\begin{aligned} \mathrm{E}g_\varepsilon (S_n) = \sum _{j=1}^n \int _0^1 \mathrm{E}\Big (\big (g_\varepsilon ^{(1)}(S_{j-1} + tX_j) - g_\varepsilon ^{(1)}(S_{j-1})\big ) \cdot X_j\Big ) \,\mathrm {d}t. \end{aligned}$$

As \(g_\varepsilon \) satisfies Lemma 6 with \(\lfloor p \rfloor =\{p\}=1\), we can use (22) to obtain

$$\begin{aligned} \mathrm{E}g_\varepsilon (S_n)&\le \sum _{j=1}^n \int _0^1 \mathrm{E}\left\| \big (g_\varepsilon ^{(1)}(S_{j-1} + tX_j) - g_\varepsilon ^{(1)}(S_{j-1})\big ) \cdot X_j \right\| \,\mathrm {d}t\\&\le \sum _{j=1}^n \int _0^1 \mathrm{E}\left( \frac{C}{\varepsilon }\left\| tX_j \right\| \left\| X_j \right\| \right) \,\mathrm {d}t. \end{aligned}$$

Going back to (42), this gives

$$\begin{aligned} \mathrm{E}\Vert S_n\Vert \le \varepsilon + \frac{C}{\varepsilon }\sum _{j=1}^n\mathrm{E}\Vert X_j\Vert ^2. \end{aligned}$$
(43)

Minimizing this upper bound in \(\varepsilon \) (the minimum of \(\varepsilon + A\varepsilon ^{-1}\) over \(\varepsilon >0\) equals \(2\sqrt{A}\) and is attained at \(\varepsilon =\sqrt{A}\), here with \(A=C\sum _{j=1}^n\mathrm{E}\Vert X_j\Vert ^2\)) yields (40) with \(K=2C^{1/2}\). \(\square \)

3.2 The Case of Hilbert Spaces

Example 12

Let \(\mathscr {H}\) be a separable Hilbert space with the inner product \(\langle x, y \rangle \) and the norm \(\Vert x\Vert =\sqrt{\langle x, x \rangle }\), \(x, y\in \mathscr {H}\). Then, \(\psi (x)=\Vert x\Vert \) satisfies \(c_j:=\sup _{\Vert x\Vert =1}\Vert \psi ^{(j)}(x)\Vert <\infty \) for any \(j\ge 1.\) This can be seen from \(\psi (x)=(\langle x, x \rangle )^{1/2}\) and the fact that the inner product is a bilinear function; hence, its first derivative is a linear function, whereas its second one is a constant. So in a Hilbert space the convergence \(\mathsf {P}_n\xrightarrow [n\rightarrow \infty ]{\mathrm {w}}\mathsf {P}\) is equivalent to \(\mathsf {P}_n f\rightarrow \mathsf {P}f\) for any \(f\in \mathrm {C}_b^\infty (\mathscr {H})\). Moreover, the weak convergence is metrized by \(\zeta _d(\mathsf {P}_n, \mathsf {P})\) for any \(d\ge 1\). The following result, proved by Giné and León [12], is also a corollary of Theorems 4 and 7.
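
The boundedness of the higher derivatives on the unit sphere can also be observed numerically; the Python sketch below (our own illustration) estimates \(\psi ^{(3)}(x)(h,h,h)\) for the Euclidean norm by a third-order finite difference and shows that its size stays of order one on the unit sphere:

```python
import numpy as np

rng = np.random.default_rng(2)

def third_derivative_along(x, h, dt=1e-3):
    """Third derivative of t -> ||x + t h|| at t = 0 by central finite differences;
    this is psi'''(x)(h, h, h) for the Euclidean norm."""
    f = lambda t: np.linalg.norm(x + t * h)
    return (f(2*dt) - 2*f(dt) + 2*f(-dt) - f(-2*dt)) / (2 * dt**3)

# sampled over unit vectors x, h, the values stay uniformly bounded,
# illustrating c_3 = sup_{||x||=1} ||psi'''(x)|| < infinity
for _ in range(5):
    x = rng.normal(size=3); x /= np.linalg.norm(x)
    h = rng.normal(size=3); h /= np.linalg.norm(h)
    print(third_derivative_along(x, h))
```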

Theorem 13

Let \(\mathscr {H}\) be a separable Hilbert space. Then, for \((\mathsf {P}, \mathsf {P}_n, n\in \mathbb {N})\subset \mathscr {P}(\mathscr {H})\) the following statements are equivalent:

  1. (i)

    \(\mathsf {P}_n\xrightarrow [n\rightarrow \infty ]{\mathrm {w}}\mathsf {P}\);

  2. (ii)

    \(\mathsf {P}_nf\xrightarrow [n\rightarrow \infty ]{}\mathsf {P}f\) for every \(f\in \mathrm {C}_b^{(\infty )}(\mathscr {H})\);

  3. (iii)

    for at least one \(d>1\), \(\lim _{n\rightarrow \infty } \zeta _d(\mathsf {P}_n, \mathsf {P})=0\).

3.3 Smoothness of \(\mathrm {L}_p\) Spaces

Example 14

Let \(({\mathbb S}, \mathscr {S}, \nu )\) be a \(\sigma \)-finite measure space, \(p\ge 1\). By \(\mathscr {L}_p({\mathbb S}, \nu ; \mathbb {R})\), we denote the set of measurable functions \(x:{\mathbb S}\rightarrow \mathbb {R}\) such that \(\int _{{\mathbb S}}|x(s)|^p\nu (\mathrm {d}s)<\infty \). The corresponding Banach space is denoted by \(\mathrm {L}_p({\mathbb S}, \mathscr {S}, \nu ; \mathbb {R})\) or shortly \(\mathrm {L}_p({\mathbb S}, \nu )\) and is endowed with the norm

$$\begin{aligned} \left\| x \right\| _{\mathrm {L}_p}:=\Big (\int _{{\mathbb S}}|x(s)|^p\nu (\,\mathrm {d}s)\Big )^{1/p}. \end{aligned}$$

Throughout we assume that the spaces \(\mathrm {L}_p({\mathbb S}, \nu ), p\ge 1,\) are separable. This is the case if \(\mathscr {S}\) is countably generated or if \(({\mathbb S}, \mathscr {S}, \nu )\) is \(\nu \)-countably generated: there exists a sequence \((S_n, n\ge 1)\subset \mathscr {S}\), consisting of sets of finite \(\nu \)-measure, which \(\nu \)-essentially generates \(\mathscr {S}\) in the sense that for all \(A\in \mathscr {S}\) we can find a set \(A_0\) in the \(\sigma \)-algebra generated by \((S_n, n\ge 1)\) such that \(\nu (A\varDelta A_0) = 0\), see Proposition 1.49 in Hytönen et al. [16].

As proved in [19, Prop. 2.23], the norm \(\psi (x)=\Vert x\Vert _{\mathrm {L}_p}\) is \(\lfloor p \rfloor \) times continuously differentiable on \(\mathrm {L}_p({\mathbb S}, \nu )\setminus \{0\}\) and satisfies \(\sum _{k=0}^{\lfloor p \rfloor }\sup _{\Vert x\Vert =1}\Vert \psi ^{(k)}(x)\Vert <\infty \). We use here the following notations: \(\psi _1(x):=(\psi (x))^p\), \(g(t):=|t|^{p}\), \(f(t):=|t|^{1/p}\). The method used in the proof of [19, Prop. 2.23] is to establish the \(\lfloor p \rfloor \)-fold continuous differentiability of \(\psi _1\) by a Taylor formula technique; as \(\psi = f\circ \psi _1\) and f is infinitely differentiable on \(\mathbb {R}\setminus \{0\}\), the \(\lfloor p \rfloor \) times continuous differentiability of \(\psi \) on \(\mathrm {L}_p({\mathbb S}, \nu )\setminus \{0\}\) then follows. In what follows, we adopt the Toscano notation for the falling factorial, that is, for any real number r and any integer \(k\ge 1\),

$$\begin{aligned} r^{{\underline{k}}}:= \prod _{i=0}^{k-1}(r-i). \end{aligned}$$

With this notation, the derivatives of f and g are conveniently expressed as

$$\begin{aligned} f^{(k)}(t)=\left( \frac{1}{p}\right) ^{{\underline{k}}}\vert t \vert ^{1/p-k}{\text {sgn}}(t)^k,\quad g^{(k)}(t)=p^{\,{\underline{k}}}\,\vert t \vert ^{p-k}{\text {sgn}}(t)^k, \quad t\ne 0,\ k\ge 1. \end{aligned}$$
(44)
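As a quick sanity check of (44), for \(k=1\) and \(k=2\) one recovers the familiar derivatives

$$\begin{aligned} g^{(1)}(t)=p\vert t \vert ^{p-1}{\text {sgn}}(t),\qquad g^{(2)}(t)=p(p-1)\vert t \vert ^{p-2},\quad t\ne 0, \end{aligned}$$

since \(p^{\,\underline{1}}=p\), \(p^{\,\underline{2}}=p(p-1)\) and \({\text {sgn}}(t)^2=1\).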

In the proof of [19, Prop. 2.23], it is shown that \(\psi _1^{(k)}(x)=A_k(x)\), \(k=1, \dots , \lfloor p \rfloor \), where the k-linear form \(A_k(x)\) is defined by

$$\begin{aligned} A_k(x)(h_1, \dots , h_k)=\int _{{\mathbb S}}g^{(k)}(x(s))h_1(s)\cdots h_k(s)\nu (\mathrm {d}s), \quad h_1,\dots ,h_k\in \mathrm {L}_p. \end{aligned}$$
(45)

For our aim, it is useful to make explicit here the iterated use of Hölder inequality mentioned in [19] to check the continuity of the k-linear operator \(A_k(x)\). In this way, we obtain a bound on the norm \(\left\| A_k(x) \right\| _{\mathscr {L}_k(\mathrm {L}_p)}\) in terms of p, k and \(\left\| x \right\| _{\mathrm {L}_p}\). In view of (44), the problem reduces to the successive “extractions” of \(\left\| h_1 \right\| _{\mathrm {L}_p},\dots ,\left\| h_k \right\| _{\mathrm {L}_p}\) via Hölder inequality applied iteratively along an ad hoc sequence \((p_1,q_1),\dots ,(p_k,q_k)\) of conjugate exponents, starting from the integral \( J_1 := \int _{{\mathbb S}}\vert x \vert ^{p-k}\vert h_1 \vert \dots \vert h_k \vert \,\mathrm {d}\nu . \) To this aim, we choose \(p_i=p-i+1\), \(q_i=p_i/(p_i-1)=(p-i+1)/(p-i)\), \(1\le i \le k\). It is easily seen that

$$\begin{aligned} q_1\cdots q_i = \frac{p}{p-i},\quad q_1\cdots q_{i-1}p_i = p, \quad i=1,\dots ,k. \end{aligned}$$
(46)
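For the reader's convenience, both identities in (46) follow from telescoping products:

$$\begin{aligned} q_1\cdots q_i = \prod _{j=1}^{i}\frac{p-j+1}{p-j} = \frac{p}{p-i}, \qquad q_1\cdots q_{i-1}\,p_i = \frac{p}{p-i+1}\,(p-i+1) = p. \end{aligned}$$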

The step \(i\rightarrow i+1\) of this procedure consists in applying Hölder inequality as follows:

$$\begin{aligned} J_i&:= \left( \int _{{\mathbb S}}\vert x \vert ^{q_1\cdots q_{i-1}(p-k)}\vert h_i\cdots h_k \vert ^{q_1\cdots q_{i-1}}\,\mathrm {d}\nu \right) ^{\frac{1}{q_1\cdots q_{i-1}}}\\&\le \left( \int _{{\mathbb S}}\vert x \vert ^{q_1\cdots q_i(p-k)}\vert h_{i+1}\cdots h_k \vert ^{q_1\cdots q_i}\,\mathrm {d}\nu \right) ^{\frac{1}{q_1\cdots q_i}} \times \left( \int _{{\mathbb S}}\vert h_i \vert ^{q_1\cdots q_{i-1}p_i} \,\mathrm {d}\nu \right) ^{\frac{1}{q_1\cdots q_{i-1}p_i}}\\&= J_{i+1}\left\| h_i \right\| _{\mathrm {L}_p}. \end{aligned}$$

At the end of this procedure, we obtain \(J_1 \le J_{k+1} \left\| h_1 \right\| _{\mathrm {L}_p}\cdots \left\| h_k \right\| _{\mathrm {L}_p}\), where

$$\begin{aligned} J_{k+1} = \left( \int _{{\mathbb S}}\vert x \vert ^{q_1\cdots q_k(p-k)}\,\mathrm {d}\nu \right) ^{\frac{1}{q_1\cdots q_k}} = \left( \int _{{\mathbb S}}\vert x \vert ^p\,\mathrm {d}\nu \right) ^{\frac{p-k}{p}} = \left\| x \right\| _{\mathrm {L}_p}^{p-k}, \end{aligned}$$

using (46). From this bound for \(J_1\), we deduce that for every \(x\in \mathrm {L}_p({\mathbb S},\nu )\setminus \{0\}\) the integral in (45) is well defined, that the k-linear operator \(A_k(x): (\mathrm {L}_p)^k\rightarrow \mathbb {R}\) is continuous and satisfies

$$\begin{aligned} \left\| \psi _1^{(k)}(x) \right\| = \left\| A_k(x) \right\| _{\mathscr {L}_k(\mathrm {L}_p)} \le p^{\,{\underline{k}}}\left\| x \right\| _{\mathrm {L}_p}^{p-k}, \quad x\ne 0,\; 1\le k\le \lfloor p \rfloor . \end{aligned}$$
(47)

To prove the p-smoothness of \(\mathrm {L}_p\), we have to check (3). Recalling that \(\psi = f\circ \psi _1\), we can use differentiation of composite functions on Banach spaces as in the proof of Lemma 6:

$$\begin{aligned} \psi ^{(m)}(x)(h_1, \dots , h_m)&= \frac{\mathrm {d}^m f(\psi _1(x))}{\mathrm {d}x^m}(h_1, \dots , h_m) \nonumber \\&= \sum _{j=1}^m\sum _{\begin{array}{c} \beta \in <j>_+\\ |\beta |=m \end{array}}\sum _{\sigma } \frac{f^{(j)}(\psi _1(x))}{j!\beta !}I_{\beta , \sigma }(x)(h_1, \dots , h_m), \; 1\le m\le \lfloor p \rfloor , \end{aligned}$$
(48)

with the same summation conventions as in (25) and

$$\begin{aligned} I_{\beta , \sigma }(x)(h_1, \dots , h_m) := \psi _1^{(\beta _1)}(x)(h_{\sigma _1}, \dots , h_{\sigma _{\beta _1}})\cdots \psi _1^{(\beta _j)}(x)(h_{\sigma _{m-\beta _j+1}}, \dots , h_{\sigma _m}). \end{aligned}$$
(49)

Write \({\mathbb {U}}:=\{x\in \mathrm {L}_p({\mathbb S},\nu ) : \left\| x \right\| _{\mathrm {L}_p} = 1\}\) for the unit sphere of \(\mathrm {L}_p({\mathbb S},\nu )\). As \(\psi _1(x)=1\) for \(x\in {\mathbb {U}}\), it follows from (44) that

$$\begin{aligned} \vert f^{(j)}(\psi _1(x)) \vert = \left| \left( \frac{1}{p}\right) ^{{\underline{j}}} \right| ,\quad x\in {\mathbb {U}}. \end{aligned}$$
(50)

Moreover, every \(\beta \in <j>_+\) has all its components \(\beta _i\ge 1\), so by (47),

$$\begin{aligned} \left\| \psi _1^{(\beta _i)}(x) \right\| \le p^{\,\underline{\beta _i}}, \quad x\in {\mathbb {U}}. \end{aligned}$$
(51)

Gathering (48) to (51), we obtain

$$\begin{aligned} \sum _{k=1}^{\lfloor p \rfloor }\sup _{x\in {\mathbb {U}}}\left\| \psi ^{(k)}(x) \right\| < \infty . \end{aligned}$$

It remains to check that \(\psi ^{(\lfloor p \rfloor )}\) satisfies

$$\begin{aligned} \sup _{\begin{array}{c} x,y \in {\mathbb {U}} \\ x\ne y \end{array}} \frac{\Vert \psi ^{(\lfloor p \rfloor )}(x)-\psi ^{(\lfloor p \rfloor )}(y)\Vert }{\Vert x-y\Vert ^{\{p\}}}<\infty . \end{aligned}$$
(52)

As a preliminary, we check the following inequality

$$\begin{aligned} \big ||a|^\alpha {\text {sgn}}(a)-|b|^\alpha {\text {sgn}}(b)\big |\le 2^{1-\alpha }|a-b|^\alpha ,\quad a, b\in \mathbb {R}, \;\alpha \in (0, 1). \end{aligned}$$
(53)
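Note in passing that the constant \(2^{1-\alpha }\) in (53) is sharp: taking \(a=1\), \(b=-1\) gives

$$\begin{aligned} \big ||a|^\alpha {\text {sgn}}(a)-|b|^\alpha {\text {sgn}}(b)\big | = 2 = 2^{1-\alpha }\cdot 2^\alpha = 2^{1-\alpha }|a-b|^\alpha . \end{aligned}$$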

To this aim, we put \(c:=\max (\vert a \vert ,\vert b \vert )\), \(d:=\min (\vert a \vert ,\vert b \vert )\) and use the elementary inequalities

$$\begin{aligned} 1\le t^\alpha + (1-t)^\alpha \le 2^{1-\alpha },\quad 0 \le t \le 1. \end{aligned}$$
(54)
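Both bounds in (54) are elementary: for \(s\in [0,1]\) and \(\alpha \in (0,1)\) we have \(s^\alpha \ge s\), whence \(t^\alpha +(1-t)^\alpha \ge t+(1-t)=1\); and by concavity of \(s\mapsto s^\alpha \),

$$\begin{aligned} \frac{t^\alpha +(1-t)^\alpha }{2} \le \Big (\frac{t+(1-t)}{2}\Big )^\alpha = 2^{-\alpha },\quad \text {so}\quad t^\alpha +(1-t)^\alpha \le 2^{1-\alpha }. \end{aligned}$$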

If \({\text {sgn}}(a)={\text {sgn}}(b)\), the choice of \(t=d/c\) in (54) leads to \(c^\alpha - d^\alpha \le (c-d)^\alpha \), whence

$$\begin{aligned} \bigl \vert |a|^\alpha {\text {sgn}}(a)-|b|^\alpha {\text {sgn}}(b) \bigr \vert = \bigl \vert |a|^\alpha -|b|^\alpha \bigr \vert = c^\alpha - d^\alpha \le (c-d)^\alpha = \vert a-b \vert ^\alpha . \end{aligned}$$

If \({\text {sgn}}(a)\not ={\text {sgn}}(b)\), the choice of \(t=c/(c+d)\) in (54) gives \(c^\alpha + d^\alpha \le 2^{1-\alpha }(c+d)^\alpha \), whence

$$\begin{aligned} \bigl \vert |a|^\alpha {\text {sgn}}(a)-|b|^\alpha {\text {sgn}}(b) \bigr \vert = c^\alpha + d^\alpha \le 2^{1-\alpha }(c+d)^\alpha = 2^{1-\alpha }\vert a-b \vert ^\alpha . \end{aligned}$$

Proof of (52)

Case \(1<p<2\). Here, \(\lfloor p \rfloor =1\) and for \(x, y\in \mathrm {L}_p\) with \(\Vert x\Vert _{\mathrm {L}_p}=\Vert y\Vert _{\mathrm {L}_p}=1\) we have

$$\begin{aligned} \vert \psi '(x)(h)-\psi '(y)(h) \vert&= \Big |\frac{1}{p}\big ((\psi _1(x))^{(1-p)/p}\psi _1'(x)(h)-(\psi _1(y))^{(1-p)/p}\psi _1'(y)(h)\big )\Big |\\&= \frac{1}{p}|\psi '_1(x)(h)-\psi '_1(y)(h)|. \end{aligned}$$

As \(\psi '_1\) is the linear form \(A_1\), (44), (45) and (53) with \(\alpha =p-1\) give

$$\begin{aligned} |\psi '(x)(h)-\psi '(y)(h)|&= \Big |\int _{{\mathbb S}}\Big [|x(s)|^{p-1}{\text {sgn}}(x(s))-|y(s)|^{p-1}{\text {sgn}}(y(s))\Big ]h(s)\nu (\mathrm {d}s)\Big | \\&\le \int _{{\mathbb S}}\big ||x(s)|^{p-1}{\text {sgn}}(x(s))- |y(s)|^{p-1}{\text {sgn}}(y(s))\big ||h(s)|\nu (\mathrm {d}s)\\&\le \int _{{\mathbb S}}2^{2-p}\vert x(s) - y(s) \vert ^{p-1} \vert h(s) \vert \nu (\mathrm {d}s). \end{aligned}$$

Applying Hölder inequality with exponents p and \(q=p/(p-1)\), we obtain

$$\begin{aligned} |\psi '(x)(h)-\psi '(y)(h)|\le 2^{2-p}\Vert x-y\Vert _{\mathrm {L}_p}^{p-1}\Vert h\Vert _{\mathrm {L}_p}. \end{aligned}$$

Since this inequality is valid for every h in \(\mathrm {L}_p({\mathbb S}, \nu )\) and \(\{p\}=p-1\) here, it follows that \(\left\| \psi '(x)-\psi '(y) \right\| \le 2^{2-p}\Vert x-y\Vert _{\mathrm {L}_p}^{\{p\}}\), so (52) is satisfied when \(1<p<2\).

Case \(p\ge 2\). By (48) and (50), we have for \(x, y \in {\mathbb {U}}\),

$$\begin{aligned} \left\| \psi ^{(\lfloor p \rfloor )}(x)-\psi ^{(\lfloor p \rfloor )}(y) \right\| \le \sum _{j=1}^{\lfloor p \rfloor }\sum _{\begin{array}{c} \beta \in <j>_+\\ |\beta |=\lfloor p \rfloor \end{array}}\sum _{\sigma } \frac{\vert (1/p)^{\,{\underline{j}}} \vert }{j!\beta !}\left\| \varDelta _{\beta , \sigma }(x,y) \right\| , \end{aligned}$$
(55)

where \(\varDelta _{\beta , \sigma }(x,y):= I_{\beta ,\sigma }(x) - I_{\beta ,\sigma }(y)\). Using telescopic summation as in the proof of Lemma 6 and (51), we get

$$\begin{aligned} \left\| \varDelta _{\beta , \sigma }(x,y) \right\|&\le \sum _{i=1}^j\left( \prod _{1\le k<i} \left\| \psi _1^{(\beta _k)}(x) \right\| \right) \left\| \psi _1^{(\beta _i)}(x) - \psi _1^{(\beta _i)}(y) \right\| \left( \prod _{i< k\le j} \left\| \psi _1^{(\beta _k)}(y) \right\| \right) \nonumber \\&\le \sum _{i=1}^j \left\| \psi _1^{(\beta _i)}(x) - \psi _1^{(\beta _i)}(y) \right\| \prod _{\begin{array}{c} 1\le k\le j\\ k\ne i \end{array}}p^{\,\underline{\beta _k}}. \end{aligned}$$
(56)

Now, it remains to find a suitable control of each increment \(\left\| \psi _1^{(\beta _i)}(x) - \psi _1^{(\beta _i)}(y) \right\| \).

If \(2\le j\le \lfloor p \rfloor \), the multi-index \(\beta \) has at least two components, so \(1\le \beta _i<\lfloor p \rfloor \). Then, \(\psi _1^{(\beta _i)}\) has a continuous derivative on \(\mathrm {L}_p({\mathbb S},\nu )\setminus \{0\}\). So if \(0\notin [x,y]\), recalling (47), we get

$$\begin{aligned} \left\| \psi _1^{(\beta _i)}(x) - \psi _1^{(\beta _i)}(y) \right\|&\le \sup _{z\in [x,y]}\left\| \psi _1^{(\beta _i +1)}(z) \right\| \left\| x-y \right\| \nonumber \\&\le p^{\,\underline{\beta _i+1}}\sup _{t\in [0,1]}\left\| (1-t)x+ty \right\| ^{p-\beta _i-1} \left\| x-y \right\| \nonumber \\&\le 2^{p-\beta _i-1}p^{\,\underline{\beta _i+1}}\left\| x-y \right\| . \end{aligned}$$
(57)

To complete the case \(j\ge 2\), notice that for \(x,y\in {\mathbb {U}}\), \(0\in [x,y]\) if and only if \(y=-x\). In this special case, \(\left\| x-y \right\| =2\) and taking (51) into account, we can simply write

$$\begin{aligned} \left\| \psi _1^{(\beta _i)}(x) - \psi _1^{(\beta _i)}(y) \right\| \le \left\| \psi _1^{(\beta _i)}(x) \right\| + \left\| \psi _1^{(\beta _i)}(-x) \right\| \le 2p^{\,\underline{\beta _i}} = p^{\,\underline{\beta _i}} \left\| x-y \right\| . \end{aligned}$$
(58)

Now, \(\left\| x-y \right\| \le 2^{1-\{p\}}\left\| x-y \right\| ^{\{p\}}\) for \( x,y\in {\mathbb {U}}\), so from (56)–(58), there is a constant K depending only on the space \(\mathrm {L}_p({\mathbb S},\nu )\) such that

$$\begin{aligned} \sum _{j=2}^{\lfloor p \rfloor }\sum _{\begin{array}{c} \beta \in <j>_+\\ |\beta |=\lfloor p \rfloor \end{array}}\sum _{\sigma } \frac{\vert (1/p)^{\,{\underline{j}}} \vert }{j!\beta !}\left\| \varDelta _{\beta , \sigma }(x,y) \right\| \le K\left\| x-y \right\| ^{\{p\}},\quad x,y\in {\mathbb {U}}. \end{aligned}$$
(59)

It remains to treat the sum of terms for which \(j=1\) in (55). Here, \(\beta \) is a mono-index necessarily equal to \(\lfloor p \rfloor \) and by (56), one can bound this remaining sum R as

$$\begin{aligned} R:= \sum _{\sigma } \frac{1}{p\,\lfloor p \rfloor !}\left\| \varDelta _{\lfloor p \rfloor , \sigma }(x,y) \right\| \le \frac{1}{p}\left\| \psi _1^{(\lfloor p \rfloor )}(x)-\psi _1^{(\lfloor p \rfloor )}(y) \right\| . \end{aligned}$$

Recalling (45) and (44), we have for \(h_1, \dots , h_{\lfloor p \rfloor }\in \mathrm {L}_p({\mathbb S}, \nu )\),

$$\begin{aligned}&\big (\psi _1^{(\lfloor p \rfloor )}(x)-\psi _1^{(\lfloor p \rfloor )}(y)\big )(h_1,\dots ,h_{\lfloor p \rfloor }) \\&= p^{\,\underline{\lfloor p \rfloor }} \int _{\mathbb S}(\vert x \vert ^{\{p\}}\!{\text {sgn}}x - \vert y \vert ^{\{p\}}\!{\text {sgn}}y ) h_1\cdots h_{\lfloor p \rfloor }\!\,\mathrm {d}\nu . \end{aligned}$$

Using Hölder inequality iteratively, exactly as in the proof of (47), we obtain

$$\begin{aligned} \left\| \psi _1^{(\lfloor p \rfloor )}(x)-\psi _1^{(\lfloor p \rfloor )}(y) \right\|&\le p^{\,\underline{\lfloor p \rfloor }}\left( \int _{\mathbb S}\left| \vert x \vert ^{\{p\}}{\text {sgn}}x - \vert y \vert ^{\{p\}}{\text {sgn}}y \right| ^{\frac{p}{\{p\}}} \,\mathrm {d}\nu \right) ^{\frac{\{p\}}{p}}\\&\le 2^{1-\{p\}}p^{\,\underline{\lfloor p \rfloor }} \left( \int _{\mathbb S}\left| x-y \right| ^p \,\mathrm {d}\nu \right) ^{\frac{\{p\}}{p}}, \end{aligned}$$

thanks to (53). Finally, \(R\le K'\left\| x-y \right\| ^{\{p\}}\) with a constant \(K'\) depending only on p; together with (59), this completes the proof of (52). \(\square \)

Recalling that by [19, Prop. 2.23], if \(p=2\ell \) is an even integer, then the norm \(\psi (x)\) is infinitely Fréchet differentiable, we can summarize the smoothness properties of \(\mathrm {L}_p\) in the following proposition.

Proposition 15

  1. (a)

    For any \(p>1\), the space \(\mathrm {L}_p({\mathbb S}, \nu )\) is p-smooth.

  2. (b)

If \(p=2\ell \) is an even integer, then the norm of \(\mathrm {L}_p({\mathbb S}, \nu )\) is infinitely Fréchet differentiable on \(\mathrm {L}_p({\mathbb S}, \nu )\setminus \{0\}\) and has bounded derivatives on the unit sphere, so that \(\mathrm {L}_p({\mathbb S}, \nu )\) is d-smooth for any integer \(d\ge 1\) (see the illustration below).
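As an illustration of (b), take \(p=4\); only (44) and (45) are used here. Then \(g(t)=t^4\) is a polynomial and (45) reads, for \(1\le k\le 4\),

$$\begin{aligned} \psi _1^{(k)}(x)(h_1,\dots ,h_k)=4^{\,{\underline{k}}}\int _{{\mathbb S}}x(s)^{4-k}h_1(s)\cdots h_k(s)\nu (\mathrm {d}s), \end{aligned}$$

while \(\psi _1^{(k)}=0\) for \(k>4\). Thus, \(\psi _1\) is infinitely Fréchet differentiable on the whole space and \(\psi =\psi _1^{1/4}\) inherits this property on \(\mathrm {L}_4({\mathbb S},\nu )\setminus \{0\}\) by composition with \(f(t)=t^{1/4}\).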

Theorem 4 and Proposition 15 yield the following results.

Theorem 16

Let \(p>1\). For \((\mathsf {P}, \mathsf {P}_n, n\in \mathbb {N})\subset \mathscr {P}(\mathrm {L}_p({\mathbb S}, \nu ))\), the following statements are equivalent:

  1. (i)

    \(\mathsf {P}_n\xrightarrow [n\rightarrow \infty ]{\mathrm {w}}\mathsf {P}\);

  2. (ii)

    \(\mathsf {P}_nf\xrightarrow [n\rightarrow \infty ]{}\mathsf {P}f\) for any \(f\in \mathrm {C}_b^{(p)}(\mathrm {L}_p({\mathbb S}, \nu ))\);

  3. (iii)

    \(\lim _{n\rightarrow \infty } \zeta _p(\mathsf {P}_n, \mathsf {P})=0\).

Theorem 17

If \(p\ge 2\) is an even integer, then for \((\mathsf {P}, \mathsf {P}_n, n\in \mathbb {N})\subset \mathscr {P}(\mathrm {L}_p({\mathbb S}, \nu ))\), the following statements are equivalent:

  1. (i)

    \(\mathsf {P}_n\xrightarrow [n\rightarrow \infty ]{\mathrm {w}}\mathsf {P}\);

  2. (ii)

    \(\mathsf {P}_nf\xrightarrow [n\rightarrow \infty ]{}\mathsf {P}f\) for any \(f\in \mathrm {C}_b^{(\infty )}(\mathrm {L}_p({\mathbb S}, \nu ))\);

  3. (iii)

    for at least one \(d>1\), \(\lim _{n\rightarrow \infty } \zeta _d(\mathsf {P}_n, \mathsf {P})=0\).

4 Lindeberg CLT in p-Smooth Banach Spaces

First, we implement in Theorem 21 below the main principle of Lindeberg method and compare the sums \(\sum _{k=1}^{r_n}X_{nk}\) with sums of independent Gaussian random variables. Beforehand, it seems convenient to recall the notion of a \(\mathsf {B}\)-valued stochastic integral with respect to a white noise, which plays a key role in our proof of Theorem 21.

Definition 18

Let \(({\mathbb S}, \mathscr {S}, \mu )\) be a measure space and \(\mathscr {S}_0:=\{A\in \mathscr {S};\; \mu (A)<\infty \}\). A white noise with variance \(\mu \) is a stochastic process \(W = (W(A);\; A\in \mathscr {S}_0)\) defined on some rich enough probability space \((\varOmega ',\mathscr {F}',\mathsf {P}')\) such that

  1. (a)

    for each \(A\in \mathscr {S}_0\), W(A) is a real valued Gaussian random variable with mean zero and variance \(\mu (A)\);

  2. (b)

    if \(A_1\in \mathscr {S}_0, \dots , A_j\in \mathscr {S}_0\) are disjoint, then \(W(A_1), \dots , W(A_j)\) are independent and

    $$\begin{aligned} W\left( \bigcup _{i=1}^j A_i\right) =\sum _{i=1}^j W(A_i),\quad j\ge 2. \end{aligned}$$
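A standard example may help fix ideas (it is not used in the sequel): if \({\mathbb S}=[0,\infty )\) with its Borel \(\sigma \)-field and \(\mu \) is Lebesgue measure, then

$$\begin{aligned} B_t := W([0,t]),\quad \mathrm{E}B_t=0,\quad \mathrm{E}(B_t-B_s)^2 = \mu ((s,t]) = t-s,\quad 0\le s\le t, \end{aligned}$$

defines a process with mean zero Gaussian, independent increments \(B_t-B_s=W((s,t])\); in other words, B is a standard Brownian motion in distribution (up to the choice of a continuous modification).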

Next, following Proposition 3.3 in Hoffmann-Jørgensen and Pisier [15], one can construct a \(\mathsf {B}\)-valued stochastic integral with respect to W. Classically, we first define this integral for functions g in the space \(\mathscr {L}_0(\mu )\) of \(\mathscr {S}_0\)-simple functions g, that is, of the form \(g=\sum _{i=1}^j x_i\mathbf {1}_{A_i}\), with \(x_i\in \mathsf {B}\), \(A_i\in \mathscr {S}_0\), \(1\le i\le j\), \(j\ge 1\), and then extend it to the whole space \(\mathrm {L}^2({\mathbb S}, \mathscr {S}, \mu ,\mathsf {B})\). The next proposition is essentially stated and proved in [15]. Our rewriting of its statement and proof is motivated by the need to make Corollary 20 explicit in view of its role in Theorem 21.

Proposition 19

If \(\mathsf {B}\) is of type 2 and W is a white noise with variance \(\mu \) on some probability space \((\varOmega ',\mathscr {F}',\mathsf {P}')\), then there exists a unique linear map

$$\begin{aligned} I_W : \mathrm {L}^2({\mathbb S}, \mathscr {S}, \mu ,\mathsf {B})\longrightarrow \mathrm {L}^2(\varOmega ',\mathscr {F}',\mathsf {P}',\mathsf {B}),\quad g \longmapsto I_W(g)=\int _{\mathbb S}g\,\mathrm {d}W, \end{aligned}$$

such that the following statements hold.

  1. (a)

    For every \(g = \sum _{i=1}^j x_i\mathbf {1}_{A_i}\), where \(x_1,\dots ,x_j\in \mathsf {B}\), \(A_1,\dots ,A_j\in \mathscr {S}_0\), \(j\ge 1\),

    $$\begin{aligned} I_W(g)=\int _{\mathbb S}g\,\mathrm {d}W := \sum _{i=1}^j x_iW(A_i). \end{aligned}$$
    (60)
  2. (b)

    There exists a constant C such that for every \(g\in \mathrm {L}^2({\mathbb S}, \mathscr {S}, \mu ,\mathsf {B})\),

    $$\begin{aligned} \mathrm{E}\left\| \int _{\mathbb S}g\,\mathrm {d}W \right\| ^2 \le C \int _{\mathbb S}\left\| g \right\| ^2\,\mathrm {d}\mu . \end{aligned}$$
    (61)
  3. (c)

    For every \(g\in \mathrm {L}^2({\mathbb S}, \mathscr {S}, \mu ,\mathsf {B})\), \(\int _{\mathbb S}g\,\mathrm {d}W\) is a Gaussian mean zero random element in \(\mathsf {B}\).

  4. (d)

If \(D', D''\in \mathscr {S}\) are disjoint, \(\int _{D'}g\,\mathrm {d}W\) and \(\int _{D''}g\,\mathrm {d}W\) are independent for every g in \(\mathrm {L}^2({\mathbb S}, \mathscr {S}, \mu ,\mathsf {B})\).

Proof

The coherence of the definition of \(I_W(g)\) by (60) when g is an \(\mathscr {S}_0\)-simple function is checked in a standard way using the additivity property (b) in Definition 18. Checking the linearity of \(I_W\) on the subspace \(\mathscr {L}_0(\mu )\) of simple functions in \(\mathrm {L}^2({\mathbb S}, \mathscr {S}, \mu ,\mathsf {B})\) is then straightforward. Next, if \(g\in \mathscr {L}_0(\mu )\), it can be represented as \(g=\sum _{i=1}^j x_i\mathbf {1}_{A_i}\) where the \(A_i\in \mathscr {S}_0\) are disjoint, which was not required in (60). As \(\mathsf {B}\) is of type 2 and the \(x_iW(A_i)\) are independent with mean zero and finite second moment, there is a constant C depending only on \(\mathsf {B}\) such that

$$\begin{aligned} \mathrm{E}\left\| I_W(g) \right\| ^2 = \mathrm{E}\left\| \sum _{i=1}^j x_iW(A_i) \right\| ^2 \le C\sum _{i=1}^j\mathrm{E}\left\| x_iW(A_i) \right\| ^2 = C\sum _{i=1}^j\left\| x_i \right\| ^2 \mathrm{E}W(A_i)^2. \end{aligned}$$

As the random variables \(W(A_i)\) are mean zero with respective variances \(\mu (A_i)\), this implies

$$\begin{aligned} \mathrm{E}\left\| I_W(g) \right\| ^2\le C\sum _{i=1}^j\left\| x_i \right\| ^2\mu (A_i) = C\int _{\mathbb S}\left\| g \right\| ^2\,\mathrm {d}\mu . \end{aligned}$$

Therefore, \(I_W\) is a continuous linear map \(\mathscr {L}_0(\mu )\longrightarrow \mathrm {L}^2(\varOmega ',\mathscr {F}',\mathsf {P}',\mathsf {B})\) and by density of \(\mathscr {L}_0(\mu )\) in \(\mathrm {L}^2({\mathbb S}, \mathscr {S}, \mu ,\mathsf {B})\), \(I_W\) has a unique continuous linear extension to this space, still denoted \(I_W\), and satisfying (61) with the same constant C.

To prove (c), we check that for every \(u\in \mathsf {B}^*\), \(u(I_W(g))\) is a Gaussian random variable. This is clear for g simple since then \(u(I_W(g))\) is a linear combination of independent Gaussian random variables. In the general case, g is the limit in \(\mathrm {L}^2({\mathbb S}, \mathscr {S}, \mu ,\mathsf {B})\) of a sequence \((g_n)\) of simple functions. Combining the continuity of the linear functional u with (61) gives

$$\begin{aligned} \mathrm{E}\left| u(I_W(g))-u(I_W(g_n)) \right| ^2 \le C\left\| u \right\| ^2\int _{\mathbb S}\left\| g-g_n \right\| ^2\,\mathrm {d}\mu , \end{aligned}$$

which shows that \(u(I_W(g))\) is a Gaussian random variable as limit in quadratic mean of a sequence of Gaussian random variables. Moreover, for every \(u\in \mathsf {B}^*\), \(\mathrm{E}u(I_W(g)) = \lim _{n\rightarrow \infty }\mathrm{E}u(I_W(g_n))=0\), whence \(\mathrm{E}I_W(g)=0\).

To prove (d), we first note that for g simple, \(g=\sum _{i=1}^jx_i\mathbf {1}_{A_i}\) with the \(A_i\) disjoint, and \(D\in \mathscr {S}\), \(\int _{D}g\,\mathrm {d}W:=\int _{\mathbb S}g\mathbf {1}_{D}\,\mathrm {d}W = \sum _{i=1}^jx_i W(A_i\cap D)\). Since \(D'\) and \(D''\) are disjoint, so are the sets \(A_1\cap D',\dots ,A_j\cap D',A_1\cap D'',\dots ,A_j\cap D''\), which provides the independence of

$$\begin{aligned} Y'=\int _{D'}g\,\mathrm {d}W = \sum _{i=1}^jx_i W(A_i\cap D') \quad \text {and}\quad Y''=\int _{D''}g\,\mathrm {d}W = \sum _{i=1}^jx_i W(A_i\cap D''). \end{aligned}$$

This independence is preserved when g is the limit in \(\mathrm {L}^2({\mathbb S}, \mathscr {S}, \mu ,\mathsf {B})\) of a sequence \((g_n)\) of simple functions since then, \(Y'_n=\int _{D'}g_n\,\mathrm {d}W\) and \(Y''_n=\int _{D''}g_n\,\mathrm {d}W\) converge in probability to \(Y'=\int _{D'}g\,\mathrm {d}W\) and \(Y''=\int _{D''}g\,\mathrm {d}W\), respectively, which implies the convergence in distribution of \((Y'_n,Y''_n)\) to \((Y',Y'')\). Then, the distribution of \((Y',Y'')\) is the product of the distributions of \(Y'\) and \(Y''\), which is equivalent to the independence of \(Y'\) and \(Y''\). \(\square \)

Now, denote by X a \(\mathsf {B}\)-valued random element defined on a probability space \((\varOmega ,\mathscr {F},\mathsf {P})\) with distribution \(\mathsf {P}_X=\mathsf {P}\circ X^{-1}\) (which is a probability measure on \(\mathscr {B}_\mathsf {B}\)) and such that \(\mathrm{E}X=0\), \(\mathrm{E}\left\| X \right\| ^2<\infty \). Let us denote by \(Q = \mathrm {cov}(X)\in L(\mathsf {B}^*, \mathsf {B})\) the covariance operator of X, that is, the bounded linear operator from \(\mathsf {B}^*\) to \(\mathsf {B}\) defined by

$$\begin{aligned} Qu=\mathrm{E}\big (\langle u, X \rangle X\big ),\ \ u\in \mathsf {B}^*. \end{aligned}$$
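For instance, when \(\mathsf {B}=\mathrm {L}_p({\mathbb S},\mu )\) with \(p\ge 2\) and \(\mathsf {B}^*\) is identified with \(\mathrm {L}_q({\mathbb S},\mu )\), \(q=p/(p-1)\), a Fubini argument (legitimate since \(\mathrm{E}\left\| X \right\| ^2<\infty \)) shows that Q acts through the covariance kernel of X:

$$\begin{aligned} (Qu)(t)=\int _{{\mathbb S}}\mathrm{E}\big (X(s)X(t)\big )u(s)\,\mu (\mathrm {d}s),\quad u\in \mathrm {L}_q({\mathbb S},\mu ). \end{aligned}$$

This is the form in which covariance operators are compared in Proposition 25 below.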

Since \(\mathsf {B}\) is of type 2 and \(\mathrm{E}\left\| X \right\| ^2<\infty \), the operator Q is pregaussian (see Theorem 3.5 in Hoffmann-Jørgensen and Pisier [15]), so there exists a Gaussian mean zero random element Y in \(\mathsf {B}\) with covariance operator Q. One way to construct such a Y is to apply Proposition 19 with \({\mathbb S}= \mathsf {B}\), \(\mathscr {S}=\mathscr {B}_\mathsf {B}\), \(\mu =\mathsf {P}_X\), which gives the following corollary.

Corollary 20

Let \(\mathsf {B}\) be a separable type 2 Banach space and X be a random element in \(\mathsf {B}\) defined on some probability space \((\varOmega , \mathscr {F}, \mathsf {P})\). Denote by \(\mathsf {P}_X:=\mathsf {P}\circ X^{-1}\) the distribution of X. Assume that \(\mathrm{E}\left\| X \right\| ^2<\infty \) and \(\mathrm{E}X = 0\). Let \(W= (W(A), A\in \mathscr {B}_\mathsf {B})\) be a white noise with variance \(\mu =\mathsf {P}_X\) defined on some probability space \((\varOmega ',\mathscr {F}',\mathsf {P}')\). As \(\mathrm{E}\left\| X \right\| ^2<\infty \), the identity map, \({{\,\mathrm{Id}\,}}_\mathsf {B}: \mathsf {B}\rightarrow \mathsf {B}\), \(x\mapsto x\), is in \(\mathrm {L}^2(\mathsf {B}, \mathscr {B}_\mathsf {B},\mathsf {P}_X,\mathsf {B})\), so we can define a Gaussian mean zero random element Y in \(\mathsf {B}\) by

$$\begin{aligned} Y:= \int _\mathsf {B}{{\,\mathrm{Id}\,}}_\mathsf {B}\,\mathrm {d}W = \int _B x\,\mathrm {d}W(x). \end{aligned}$$
(62)

Then, the following statements hold.

  1. (a)

    With the constant C in (61),

    $$\begin{aligned} \mathrm{E}\left\| Y \right\| ^2 \le C \mathrm{E}\left\| X \right\| ^2. \end{aligned}$$
    (63)
  2. (b)

    For every \(g\in \mathrm {L}^2(\mathsf {B}, \mathscr {B}_\mathsf {B}, \mathsf {P}_X,\mathsf {B})\), the Gaussian mean zero random element \(Z=\int _\mathsf {B}g\,\mathrm {d}W\) has the same covariance operator as g(X). In particular, Y and X have the same covariance operator.

  3. (c)

    For every symmetric \(T\in \mathscr {L}_2(\mathsf {B})\) and every \(g\in \mathrm {L}^2(\mathsf {B}, \mathscr {B}_\mathsf {B}, \mathsf {P}_X,\mathsf {B})\),

    $$\begin{aligned} \mathrm{E}T(g(X),g(X))=\mathrm{E}T(Z, Z) \end{aligned}$$
    (64)

    and in particular \(\mathrm{E}T(X,X) = \mathrm{E}T(Y,Y)\).

Proof

(a) is a simple translation of (61) in the special case under consideration. For (b), we have to check that \(Q_Z=Q_{g(X)}\), which is equivalent to \(\mathrm{E}\big (u(Z)v(Z)\big ) = \mathrm{E}\big (u(g(X))v(g(X))\big )\) for every u, v in \(\mathsf {B}^*\). For \(g=\sum _{i=1}^j x_i\mathbf {1}_{A_i}\) with the \(A_i\)’s disjoint, \(u(Z)=\sum _{i=1}^j u(x_i)W(A_i)\), whence by independence of the \(W(A_i)\)’s,

$$\begin{aligned} \mathrm{E}\big (u(Z)v(Z)\big ) = \sum _{i=1}^j u(x_i)v(x_i)\mathsf {P}_X(A_i)&= \int _\mathsf {B}u(g(x))v(g(x))\,\mathrm {d}\mathsf {P}_X(x)\\&= \mathrm{E}\big (u(g(X))v(g(X))\big ). \end{aligned}$$

Being valid for every simple g, this equality extends to the whole space \(\mathrm {L}^2(\mathsf {B}, \mathscr {B}_\mathsf {B}, \mathsf {P}_X,\mathsf {B})\) by the continuity of \(I_W\). In particular, for \(g={{\,\mathrm{Id}\,}}_\mathsf {B}\) and \(Y=\int _\mathsf {B}{{\,\mathrm{Id}\,}}_\mathsf {B}\,\mathrm {d}W\), \(Q_Y=Q_X\).

The proof of (c) is similar and will be omitted. \(\square \)

Theorem 21

Assume that \(\mathsf {B}\) is of type 2. Consider an array of \(\mathsf {B}\)-valued random variables

$$\begin{aligned} X_{n1}, \dots ,X_{nk},\dots , X_{nr_n},\quad 1\le k\le r_n, n\ge 1, \end{aligned}$$

where the probability space \((\varOmega _n, \mathscr {F}_n, \mathsf {P}_n)\) underlying the \(n^{\text {th}}\) line may vary with n and for each \(n\ge 1\), the \(X_{nk}\), \(1\le k\le r_n\), are mean zero independent and

$$\begin{aligned} M_n:=\sum _{k=1}^{r_n}\mathrm{E}\left\| X_{nk} \right\| ^2 <\infty . \end{aligned}$$
(65)

Then, for each \(n\ge 1\), one can construct on some probability space \((\varOmega _n', \mathscr {F}_n', \mathsf {P}_n')\) independent mean zero Gaussian \(\mathsf {B}\)-valued random variables \(Y_{n1}, \dots , Y_{nr_n}\), such that for \(1\le k\le r_n\), \(X_{nk}\) and \(Y_{nk}\) have the same covariance operator and, for any \(\delta \in (0, 1]\) and any \(\varepsilon >0\),

$$\begin{aligned} \zeta _{2+\delta }\left( \sum _{k=1}^{r_n} X_{nk}, \sum _{k=1}^{r_n} Y_{nk}\right) \le c(\mathsf {B},\delta )\left( M_n\varepsilon ^{\delta } + (M_n+1)\sum _{k=1}^{r_n}\mathrm{E}\Vert X_{nk}\Vert ^2\varvec{1}\{\left\| X_{nk} \right\| >\varepsilon \}\right) , \end{aligned}$$
(66)

where the constant \(c(\mathsf {B},\delta )>0\) depends only on the type 2 constant of the space \(\mathsf {B}\) and on \(\delta \).

Proof

We fix an arbitrary \(n\ge 1\) and prove (66) for the \(n^{\text {th}}\) line of the array. In view of the property (36) of \(\zeta _{2+\delta }\), the problem reduces to proving that, given any mean zero random element X in \(\mathsf {B}\) such that \(\mathrm{E}\left\| X \right\| ^2<\infty \), one can construct, possibly on another probability space than the one supporting X, a mean zero Gaussian random element Y in \(\mathsf {B}\) with the same covariance operator as X, such that

$$\begin{aligned} \zeta _{2+\delta }(X,Y) \le c(\mathsf {B},\delta )\left( \varepsilon ^\delta \mathrm{E}\left\| X \right\| ^2 + \big (1+\mathrm{E}\left\| X \right\| ^2\big )\mathrm{E}\left\| X \right\| ^2\varvec{1}\{\left\| X \right\| >\varepsilon \}\right) . \end{aligned}$$
(67)

To this aim, choosing Y as in Corollary 20, we have to estimate \(\vert \mathrm{E}f(X) - \mathrm{E}f(Y) \vert \) for \(f\in \mathrm {C}_b^{(2+\delta )}(\mathsf {B})\) such that \(\left\| f \right\| _{2+\delta }\le 1\). By the Taylor formula of order 1 with integral remainder,

$$\begin{aligned} f(X) = f(0) + f'(0).X + \int _0^1 (1-t)f''(tX).(X,X)\,\mathrm {d}t. \end{aligned}$$
(68)

To exploit fully the membership of f in \(\mathrm {C}_b^{(2+\delta )}(\mathsf {B})\), we rephrase this formula as

$$\begin{aligned} f(X)= & {} f(0) + f'(0).X + \frac{1}{2}f''(0).(X,X) \\&+ \int _0^1 (1-t)\big (f''(tX)-f''(0)\big ).(X,X)\,\mathrm {d}t. \end{aligned}$$

Applying the same treatment to f(Y) and using \(\mathrm{E}(f'(0).X)= f'(0).(\mathrm{E}X)=0\) and similarly \(\mathrm{E}(f'(0).Y)=0\), together with (c) in Corollary 20 applied with the bilinear symmetric operator \(T=f''(0)\), we are left with

$$\begin{aligned} \vert \mathrm{E}f(X) - \mathrm{E}f(Y) \vert \le R(X) + R(Y), \end{aligned}$$

where

$$\begin{aligned} R(Z) := \int _0^1 (1-t)\mathrm{E}\left\| \big (f''(tZ)-f''(0)\big ).(Z,Z) \right\| \,\mathrm {d}t,\quad Z=X,Y. \end{aligned}$$

By the \(\delta \)-Hölder continuity of \(f''\) and \(\left\| f \right\| _{2+\delta }\le 1\), \(\left\| f''(tZ) - f''(0) \right\| \le \left\| tZ \right\| ^\delta \le \left\| Z \right\| ^\delta \) whence

$$\begin{aligned} \left\| \big (f''(tZ)-f''(0)\big ).(Z,Z) \right\| \le \left\| Z \right\| ^\delta \left\| (Z,Z) \right\| \le \left\| Z \right\| ^{2+\delta },\quad Z=X,Y. \end{aligned}$$
(69)

As Y is Gaussian, \(\mathrm{E}\left\| Y \right\| ^{2+\delta }<\infty \), which gives a first estimate of R(Y), by integration with respect to t in (69):

$$\begin{aligned} R(Y) \le \frac{1}{2} \mathrm{E}\left\| Y \right\| ^{2+\delta }. \end{aligned}$$
(70)

Concerning R(X), only the finiteness of \(\mathrm{E}\left\| X \right\| ^2\) is available, so we use (69) only on the event \(\{\left\| X \right\| \le \varepsilon \}\), where \(\left\| \big (f''(tX)-f''(0)\big ).(X,X) \right\| \le \varepsilon ^\delta \left\| X \right\| ^2\). This gives

$$\begin{aligned}&\int _0^1 (1-t)\mathrm{E}\big (\left\| \big (f''(tX)-f''(0)\big ).(X,X) \right\| \varvec{1}\{\left\| X \right\| \le \varepsilon \}\big )\,\mathrm {d}t \nonumber \\&\quad \le \int _0^1 (1-t)\varepsilon ^\delta \mathrm{E}\left\| X \right\| ^2\varvec{1}\{\left\| X \right\| \le \varepsilon \}\,\mathrm {d}t \le \frac{1}{2} \varepsilon ^\delta \mathrm{E}\left\| X \right\| ^{2}. \end{aligned}$$
(71)

On \(\{\left\| X \right\| >\varepsilon \}\), we simply use the fact that \(\left\| f'' \right\| \le 1\), so \(\left\| f''(tX)-f''(0) \right\| \le 2\). This gives

$$\begin{aligned}&\int _0^1 (1-t)\mathrm{E}\big (\left\| \big (f''(tX)-f''(0)\big ).(X,X) \right\| \varvec{1}\{\left\| X \right\|>\varepsilon \}\big )\,\mathrm {d}t \nonumber \\&\quad \le \int _0^1 2(1-t) \mathrm{E}\left\| X \right\| ^2\varvec{1}\{\left\| X \right\|>\varepsilon \}\,\mathrm {d}t = \mathrm{E}\left\| X \right\| ^{2}\varvec{1}\{\left\| X \right\| > \varepsilon \}. \end{aligned}$$
(72)

Our next step is to control the bound (70) in terms of the distribution of X only. Since Y is Gaussian, there is for every \(r>0\) a constant \(\kappa _r\) depending on r only, such that \((\mathrm{E}\left\| Y \right\| ^r)^{1/r}\le \kappa _r (\mathrm{E}\left\| Y \right\| ^2)^{1/2}\). One possible value is obtained using the inequality \(P(\left\| Y \right\| >t)\le 4\exp (-t^2/(8c^2))\) where \(c^2=\mathrm{E}\left\| Y \right\| ^2\), see, e.g., (3.5) in [17], which gives \(\kappa _r = 2^{3/2+2/r}\varGamma (r/2+1)^{1/r}\). In particular,

$$\begin{aligned} \mathrm{E}\left\| Y \right\| ^{2+\delta } \le \kappa _{2+\delta }^{2+\delta }\left( \mathrm{E}\left\| Y \right\| ^2\right) ^{1+\delta /2}. \end{aligned}$$
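The quoted value of \(\kappa _r\) can be checked directly: the substitution \(s=t^2/(8c^2)\) in the tail integral gives

$$\begin{aligned} \mathrm{E}\left\| Y \right\| ^r \le \int _0^\infty 4rt^{r-1}\exp \Big (\frac{-t^2}{8c^2}\Big )\,\mathrm {d}t = 4\,\varGamma (r/2+1)\,(8c^2)^{r/2}, \end{aligned}$$

whence \((\mathrm{E}\left\| Y \right\| ^r)^{1/r}\le 4^{1/r}\,8^{1/2}\,\varGamma (r/2+1)^{1/r}\,c = 2^{3/2+2/r}\varGamma (r/2+1)^{1/r}\,c\).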

Next, recalling (62), we note that \(Y = Y'_\varepsilon + Y''_\varepsilon \), where

$$\begin{aligned} Y'_\varepsilon := \int _B x\varvec{1}\{\left\| x \right\| \le \varepsilon \}\,\mathrm {d}W(x) \quad \text {and}\quad Y''_\varepsilon := \int _B x\varvec{1}\{\left\| x \right\| > \varepsilon \}\,\mathrm {d}W(x) \end{aligned}$$

are independent Gaussian random elements in \(\mathsf {B}\) by Proposition 19 and Corollary 20. Since \(\mathsf {B}\) is of type 2, it follows by (39) that

$$\begin{aligned} \mathrm{E}\left\| Y \right\| ^{2+\delta } \le \kappa _{2+\delta }^{2+\delta }K^{2+\delta } \left( \mathrm{E}\left\| Y'_\varepsilon \right\| ^2 + \mathrm{E}\left\| Y''_\varepsilon \right\| ^2\right) ^{1+\delta /2}. \end{aligned}$$

By Proposition 19 (b) and the convexity inequality \((a+b)^r \le 2^{r-1}(a^r + b^r)\), \(a,b\ge 0\), \(r\ge 1\), we obtain with a constant \(\gamma _\delta :=2^{\delta /2}\kappa _{2+\delta }^{2+\delta }K^{2+\delta }C^{1+\delta /2}\), C being as in (61),

$$\begin{aligned} \mathrm{E}\left\| Y \right\| ^{2+\delta }&\le \gamma _\delta \left( \mathrm{E}\left\| X \right\| ^{2+\delta }\varvec{1}\{\left\| X \right\| \le \varepsilon \} + \left( \mathrm{E}\left\| X \right\| ^{2}\varvec{1}\{\left\| X \right\|> \varepsilon \}\right) ^{1+\delta /2}\right) \nonumber \\&\le \gamma _\delta \left( \varepsilon ^\delta \mathrm{E}\left\| X \right\| ^2 + \left( \mathrm{E}\left\| X \right\| ^{2}\right) ^{\delta /2} \mathrm{E}\left\| X \right\| ^{2}\varvec{1}\{\left\| X \right\|> \varepsilon \}\right) \nonumber \\&\le \gamma _\delta \left( \varepsilon ^\delta \mathrm{E}\left\| X \right\| ^2 + \left( 1+\mathrm{E}\left\| X \right\| ^{2}\right) \mathrm{E}\left\| X \right\| ^{2}\varvec{1}\{\left\| X \right\| > \varepsilon \}\right) . \end{aligned}$$
(73)

Now gathering (70), (71), (72) and (73) gives (67) with \(c(\mathsf {B},\delta )=(1+\gamma _\delta /2)\).

To conclude, choose a probability space \((\varOmega _n',\mathscr {F}_n',\mathsf {P}'_n)\) rich enough to support a sequence of independent white noises \((W_{nk})_{1\le k\le r_n}\) where the variance of \(W_{nk}\) is the distribution of \(X_{nk}\). Define on this probability space the corresponding sequence \((Y_{nk})_{1\le k\le r_n}\) of Gaussian random elements in \(\mathsf {B}\) by \(Y_{nk}:=\int _\mathsf {B}{{\,\mathrm{Id}\,}}_B \,\mathrm {d}W_{nk}\), \(1\le k\le r_n\). Each pair \(X_{nk}\), \(Y_{nk}\) satisfies (67). Bounding \((1+\mathrm{E}\left\| X_{nk} \right\| ^2)\) by \(1+M_n\) and summing over \(k=1,\dots ,r_n\), we obtain (66). \(\square \)

Hence, in a p-smooth Banach space \(\mathsf {B}\) with \(p>2\), the proof of convergence in distribution of the sequence \(\sum _{k=1}^{r_n}X_{nk}, n\in \mathbb {N},\) to a \(\mathsf {B}\)-valued Gaussian random variable \(Y_Q\) is reduced by Theorem 21 to the proof of convergence in distribution of the Gaussian sequence \(\sum _{k=1}^{r_n} Y_{nk}\) to \(Y_Q\). The latter is controlled by convergence of covariance operators. In a finite dimensional space, this is not a problem. In any separable Hilbert space, as well as in a Banach space of type 2 with the approximation property, the convergence \(\sum _{k=1}^{r_n}Y_{nk}\xrightarrow [n\rightarrow \infty ]{\mathscr {D}}Y_Q\) is obtained from convergence of covariances in nuclear norm (see Chevet [8]).

Recall that an operator \(u\in L(\mathsf {B})\) is said to be nuclear if it admits the representation

$$\begin{aligned} u(x)=\sum _{k=1}^\infty f_k(x)y_k, \end{aligned}$$

where \(f_k\in \mathsf {B}^*, y_k\in \mathsf {B}\), and

$$\begin{aligned} \sum _{k=1}^\infty \Vert f_k\Vert \cdot \Vert y_k\Vert <\infty . \end{aligned}$$

The greatest lower bound of the sum \(\sum _{k=1}^\infty \Vert f_k\Vert \cdot \Vert y_k\Vert \) taken over all possible representations of u is called the nuclear norm of u and is denoted by \(\nu _1(u)\).
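As a simple illustration (not needed in the sequel): for a rank-one operator \(u(x)=f(x)y\) with \(f\in \mathsf {B}^*\), \(y\in \mathsf {B}\), the one-term representation gives \(\nu _1(u)\le \Vert f\Vert \,\Vert y\Vert \), while any representation \(u=\sum _k f_k(\cdot )y_k\) satisfies \(\Vert u\Vert \le \sum _k\Vert f_k\Vert \Vert y_k\Vert \), so that \(\nu _1(u)\ge \Vert u\Vert =\Vert f\Vert \,\Vert y\Vert \). Hence,

$$\begin{aligned} \nu _1(u) = \Vert f\Vert \,\Vert y\Vert . \end{aligned}$$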

Theorem 22

Let the Banach space \(\mathsf {B}\) be p-smooth for some \(p>2\) and have the approximation property. For each \(n\ge 1\), suppose that \(X_{n1}, \dots , X_{nr_n}\) is a sequence of mean zero independent \(\mathsf {B}\)-valued random elements such that \(\sup _{n\in \mathbb {N}}\sum _{k=1}^{r_n}\mathrm{E}\left\| X_{nk} \right\| ^2<\infty \). Let \(Q_{nj}:=\mathrm {cov}(X_{nj})\), \(j=1, \dots , r_n\), \(n\ge 1\). If there is a linear bounded operator \(Q\in L(\mathsf {B}^*, \mathsf {B})\) such that

$$\begin{aligned} \lim _{n\rightarrow \infty }\nu _1\Big (\sum _{j=1}^{r_n}Q_{nj}-Q\Big )=0, \end{aligned}$$
(74)

and for each \(\varepsilon >0\),

$$\begin{aligned} \lim _{n\rightarrow \infty }\sum _{k=1}^{r_n}\mathrm{E}\Vert X_{nk}\Vert ^2\varvec{1}\{\Vert X_{nk}\Vert >\varepsilon \}=0, \end{aligned}$$
(75)

then Q is pre-Gaussian and

$$\begin{aligned} \sum _{k=1}^{r_n}X_{nk}\xrightarrow [n\rightarrow \infty ]{\mathscr {D}}Y_Q, \end{aligned}$$
(76)

where \(Y_Q\) is a mean zero Gaussian random element in \(\mathsf {B}\) with covariance Q.

Proof

Let the Gaussian triangular array \((Y_{nk}, k=1, \dots , r_n; n\ge 1)\) be as constructed in Theorem 21. Since

$$\begin{aligned} \zeta _p\Big (\sum _{k=1}^{r_n}X_{nk}, Y_Q\Big )\le \zeta _p\Big (\sum _{k=1}^{r_n}X_{nk}, \sum _{k=1}^{r_n}Y_{nk}\Big )+\zeta _p\Big (\sum _{k=1}^{r_n}Y_{nk}, Y_Q\Big ), \end{aligned}$$

it is enough by Theorem 21 to prove

$$\begin{aligned} \lim _{n\rightarrow \infty }\zeta _p\Big (\sum _{k=1}^{r_n}Y_{nk}, Y_Q\Big )=0. \end{aligned}$$
(77)

This is equivalent to the weak convergence of the corresponding Gaussian distributions, and as proved in Chevet [8], the convergence \(\sum _{k=1}^{r_n}Y_{nk}\xrightarrow [n\rightarrow \infty ]{\mathscr {D}}Y_Q\) follows from (74). \(\square \)

Checking convergence in nuclear norm may be quite a complex task. In some concrete Banach spaces, a direct proof of convergence in distribution of Gaussian random variables is easier to achieve. As an illustration, consider now the case of \(\mathrm {L}_p\) spaces. In what follows, \(({\mathbb S},\mathscr {S},\mu )\) is a measure space where the measure \(\mu \) is \(\sigma \)-finite. We denote by p a real number in \((2,\infty )\) and by \(q=p/(p-1)\) its conjugate exponent. We assume moreover that the space \(\mathrm {L}_p({\mathbb S},\mathscr {S},\mu ;\mathbb {R})\) is separable. We denote, respectively, by \(\mathscr {S}\otimes \mathscr {S}\) and \(\mu \otimes \mu \) the product \(\sigma \)-field and product measure on the Cartesian product \({\mathbb S}^2\). We will use the abbreviations:

$$\begin{aligned} \mathrm {L}_p({\mathbb S}) := \mathrm {L}_p({\mathbb S},\mathscr {S},\mu ;\mathbb {R}),\quad \mathrm {L}_p({\mathbb S}^2) := \mathrm {L}_p({\mathbb S}^2,\mathscr {S}\otimes \mathscr {S},\mu \otimes \mu ;\mathbb {R}). \end{aligned}$$

For real valued functions u, v defined \(\mu \) almost everywhere on \({\mathbb S}\), \(u\otimes v\) denotes the function defined \(\mu \otimes \mu \) almost everywhere on \({\mathbb S}^2\) by \((u\otimes v)(s,t):= u(s)v(t)\). This notation is extended in an obvious way to random elements in \(\mathrm {L}_p({\mathbb S})\).

Theorem 23

(CLT in \(\mathrm {L}_p\), \(p>2\)) Let \((X_{nk},k=1, \dots , r_n; n\in \mathbb {N})\) be a triangular array of mean zero independent random elements in the separable space \(\mathrm {L}_p({\mathbb S},\mathscr {S},\mu ;\mathbb {R})\). Assume that the following conditions are satisfied.

  1. (a)

    \(\sup _{n\in \mathbb {N}}\sum _{k=1}^{r_n} \mathrm{E}\left\| X_{nk} \right\| _{\mathrm {L}_p}^2<\infty \).

  2. (b)

    For any \(\varepsilon >0\),

    $$\begin{aligned} \lim _{n\rightarrow \infty }\sum _{k=1}^{r_n}\mathrm{E}\left\| X_{nk} \right\| ^2_{\mathrm {L}_p}\mathbf {1}\{\left\| X_{nk} \right\| _{\mathrm {L}_p}>\varepsilon \} = 0. \end{aligned}$$
  3. (c)

    There is a mean zero Gaussian random element Y in \(\mathrm {L}_p({\mathbb S},\mathscr {S},\mu ;\mathbb {R})\) such that

    $$\begin{aligned} \sum _{k=1}^{r_n} \mathrm{E}(X_{nk}\otimes X_{nk}) \xrightarrow [n\rightarrow \infty ]{} \varGamma :=\mathrm{E}(Y\otimes Y)\quad \text {in } \mathrm {L}_p({\mathbb S}^2,\mathscr {S}\otimes \mathscr {S},\mu \otimes \mu ;\mathbb {R}). \end{aligned}$$
  4. (d)

Denoting by \(\sigma _n\) and \(\sigma \) the non-negative elements of \(\mathrm {L}_p({\mathbb S},\mathscr {S},\mu ;\mathbb {R})\) defined by \(\sigma _n^2(s):= \sum _{k=1}^{r_n}\mathrm{E}X_{nk}(s)^2\) and \(\sigma ^2(s):=\mathrm{E}Y(s)^2\), \(\mu \)-a.e. on \({\mathbb S}\),

    $$\begin{aligned} \int _{\mathbb S}\sigma _n^p\,\mathrm {d}\mu \xrightarrow [n\rightarrow \infty ]{} \int _{\mathbb S}\sigma ^p \,\mathrm {d}\mu . \end{aligned}$$

Then,

$$\begin{aligned} \sum _{k=1}^{r_n}X_{nk}\xrightarrow [n\rightarrow \infty ]{\mathscr {D}}Y \text { in the space } \mathrm {L}_p({\mathbb S},\mathscr {S},\mu ;\mathbb {R}). \end{aligned}$$

The proof requires the preliminaries gathered in Lemmas 24 to 27.

Lemma 24

Assume that \(f, g\in \mathrm {L}_q({\mathbb S}^2)\) satisfy for all \(A, B\in \mathscr {S}\) of finite \(\mu \)-measure,

$$\begin{aligned} \int _{A\times B} f \,\mathrm {d}(\mu \otimes \mu ) = \int _{A\times B} g \,\mathrm {d}(\mu \otimes \mu ). \end{aligned}$$
(78)

Then, \(f=g\), \((\mu \otimes \mu )\)-a.e. on \({\mathbb S}^2\).

Proof

Let us remark first that the integrals in (78) are well defined because \(\mathbf {1}_{A\times B}=\mathbf {1}_A\otimes \mathbf {1}_B\) is in \(\mathrm {L}_p({\mathbb S}^2)\) since \(\mu (A)\) and \(\mu (B)\) are finite. We first prove the lemma in the special case where \(\mu ({\mathbb S})<\infty \) and then extend the result to the general case by using the \(\sigma \)-finiteness of \(\mu \). To simplify the writing, we denote by \(C_n\uparrow C\) the fact that the sequence of sets \((C_n)_{n\ge 1}\) increases to the set C, that is, \(C_n\subset C_{n+1}\) for every \(n\ge 1\) and \(\cup _{n\ge 1}C_n = C\).

Case where \(\mu ({\mathbb S})<\infty \). Let us introduce the class \(\mathscr {L}\) of sets \(C\in \mathscr {S}\otimes \mathscr {S}\) such that f and g are \(\mu \otimes \mu \) integrable on C and \(\int _C f\,\mathrm {d}(\mu \otimes \mu ) = \int _C g\,\mathrm {d}(\mu \otimes \mu )\), together with the class \(\mathscr {R}:=\{A\times B, A\in \mathscr {S}, B\in \mathscr {S}\}\). As \(\mu ({\mathbb S})\) is finite, the same holds for \(\mu (A)\) and \(\mu (B)\) and (78) gives the inclusion \(\mathscr {R}\subset \mathscr {L}\). Clearly, \(\mathscr {R}\) is a \(\pi \)-system, i.e., closed under the formation of finite intersections. The class \(\mathscr {L}\) satisfies the three following properties.

(\(\lambda _1\)):

\({\mathbb S}^2\) belongs to \(\mathscr {L}\). Indeed, \(\mathbf {1}_{{\mathbb S}^2}\in \mathrm {L}_p({\mathbb S}^2)\) because \(\mu ({\mathbb S})<\infty \), so that f and g are \(\mu \otimes \mu \) integrable on \({\mathbb S}^2\).

(\(\lambda _2\)):

\(C, C'\in \mathscr {L}\) and \(C\subset C'\) imply \(C'\setminus C \in \mathscr {L}\). This follows easily by writing for \(h=f, g\), \(\int _{C'}h\,\mathrm {d}(\mu \otimes \mu ) = \int _Ch\,\mathrm {d}(\mu \otimes \mu ) + \int _{C'\setminus C}h\,\mathrm {d}(\mu \otimes \mu )\) and using the membership of \(C, C'\) in \(\mathscr {L}\).

(\(\lambda _3\)):

\(\{C_n, n\ge 1\}\subset \mathscr {L}\) and \(C_n\uparrow C\) imply \(C\in \mathscr {L}\). Indeed the equality \(\int _{C_n}f\,\mathrm {d}(\mu \otimes \mu ) = \int _{C_n}g\,\mathrm {d}(\mu \otimes \mu )\) gives \(\int _{C_n}(f^+ + g^-)\,\mathrm {d}(\mu \otimes \mu ) = \int _{C_n}(g^+ + f^-)\,\mathrm {d}(\mu \otimes \mu )\) and by B. Levi’s monotone convergence theorem, we obtain \(\int _{C}(f^+ + g^-)\,\mathrm {d}(\mu \otimes \mu ) = \int _{C}(g^+ + f^-)\,\mathrm {d}(\mu \otimes \mu )\) that is \(\int _{C}f\,\mathrm {d}(\mu \otimes \mu ) = \int _{C}g\,\mathrm {d}(\mu \otimes \mu )\), so \(C\in \mathscr {L}\).

Hence, \(\mathscr {L}\) is a \(\lambda \)-system. As it contains the \(\pi \)-system \(\mathscr {R}\), by Dynkin’s \(\pi \)-\(\lambda \) theorem, see, e.g., [5], it contains also the \(\sigma \)-field generated by \(\mathscr {R}\), that is the product \(\mathscr {S}\otimes \mathscr {S}\). As \(\mathscr {L}\) was defined as a subset of \(\mathscr {S}\otimes \mathscr {S}\), it follows that \(\mathscr {L}=\mathscr {S}\otimes \mathscr {S}\). In other words,

$$\begin{aligned} \forall C\in \mathscr {S}\otimes \mathscr {S}, \quad \int _{C}f\,\mathrm {d}(\mu \otimes \mu ) = \int _{C}g\,\mathrm {d}(\mu \otimes \mu ). \end{aligned}$$
(79)

Now, with \(C=\{f>g\}\), (79) gives \(\int _C (f-g)\,\mathrm {d}(\mu \otimes \mu )=0\). As \(f-g\) is positive on C, this implies \(\mu \otimes \mu (\{f>g\})=0\). Similarly, one checks that \(\mu \otimes \mu (\{f<g\})=0\), so finally \(\mu \otimes \mu (\{f\ne g\})=0\), that is, \(f=g\) \(\mu \otimes \mu \)-a.e. on \({\mathbb S}^2\).

Case where \(\mu ({\mathbb S})=\infty \). By \(\sigma \)-finiteness of \(\mu \), there is a sequence \(({\mathbb S}_n)_{n\ge 1}\) in \(\mathscr {S}\), such that \({\mathbb S}_n\uparrow {\mathbb S}\) and \(\mu ({\mathbb S}_n)<\infty \) for each \(n\ge 1\). Let us equip \({\mathbb S}_n\) with the \(\sigma \)-field \( \mathscr {S}_n := \{A\in \mathscr {S}; A\subset {\mathbb S}_n\} = \{A'\cap {\mathbb S}_n ; A'\in \mathscr {S}\}. \) Then, we can apply the previous case to each measure space \(({\mathbb S}_n,\mathscr {S}_n,\mu )\), \(n\ge 1\), which gives \(f=g\), \((\mu \otimes \mu )\)-a.e. on \({\mathbb S}_n^2\). As \({\mathbb S}^2=\cup _{n\ge 1}{\mathbb S}_n\times {\mathbb S}_n\), this gives \(f=g\), \((\mu \otimes \mu )\)-a.e. on \({\mathbb S}^2\). \(\square \)

Proposition 25

If X and \(X'\) are mean zero random elements in \(\mathrm {L}_p({\mathbb S})\) with finite strong second moment and the same covariance operator, then

$$\begin{aligned} \mathrm{E}(X(s)X(t)) = \mathrm{E}(X'(s)X'(t)) \quad \text {for }\mu \otimes \mu \text { almost every }(s,t)\in {\mathbb S}^2. \end{aligned}$$

Proof

If X and \(X'\) have the same covariance operator, then for all \(u, v \in \mathrm {L}_q({\mathbb S})\),

$$\begin{aligned} \mathrm{E}(\langle X,u \rangle \langle X,v \rangle ) = \mathrm{E}(\langle X',u \rangle \langle X',v \rangle ). \end{aligned}$$

Hölder inequality and Fubini arguments justify rephrasing this equality as

$$\begin{aligned} \int _{{\mathbb S}^2}\mathrm{E}(X(s)X(t))u(s)v(t)\,\mathrm {d}\mu \otimes \mu (s,t) = \int _{{\mathbb S}^2}\mathrm{E}(X'(s)X'(t))u(s)v(t)\,\mathrm {d}\mu \otimes \mu (s,t). \end{aligned}$$

As for any \(A, B\in \mathscr {S}\) such that \(\mu (A), \mu (B)<\infty \), the functions \(u=\mathbf {1}_A\) and \(v=\mathbf {1}_B\) are in \(\mathrm {L}_q({\mathbb S})\), Lemma 24 gives the expected conclusion. \(\square \)

In what follows, we use for notational convenience the indexation by infinite subsets of \(\mathbb {N}^*=\mathbb {N}\setminus \{0\}\) to denote subsequences. So any (infinite) subsequence of \((u_n)_{n\ge 1}\) can be denoted as \((u_n)_{n\in I}\) with I infinite subset of \(\mathbb {N}^*\) and the convergence of this subsequence will be denoted by \(\xrightarrow [n\rightarrow \infty , n\in I]{}\) or \(\lim _{n\rightarrow \infty , n\in I}\).

Lemma 26

Let \(\xi \) be a Gaussian random element in \(\mathrm {L}_p({\mathbb S})=\mathrm {L}_p({\mathbb S},\mathscr {S},\mu ;\mathbb {R})\) having a representation

$$\begin{aligned} \xi = \int _{\mathrm {L}_p({\mathbb S})}{{\,\mathrm{Id}\,}}\,\mathrm {d}W, \end{aligned}$$

where W is a white noise. Then, for \(\mu \)-almost every \(s\in {\mathbb S}\), \(\xi (s)\) is a mean zero Gaussian random variable.

Proof

By construction of the \(\mathrm {L}_p({\mathbb S})\) valued stochastic integral with respect to W, there is a sequence of \(\mathrm {L}_p({\mathbb S})\) valued simple functions \( f_n = \sum _{i=1}^{j_n}h_{ni}\mathbf {1}_{A_{ni}} \), where the \(h_{ni}\) are in \(\mathrm {L}_p({\mathbb S})\) and, for each n, the \(A_{ni}\), \(1\le i\le j_n\), are disjoint, such that with \(\xi _n:=\int _{\mathrm {L}_p({\mathbb S})}f_n\,\mathrm {d}W\), \(\mathrm{E}\left\| \xi _n-\xi \right\| _p^2\rightarrow 0\). Let us fix a representative, still denoted \(h_{ni}\), in each class of functions \(h_{ni}\). Then,

$$\begin{aligned} \xi _n(s) = \sum _{i=1}^{j_n} h_{ni}(s)W(A_{ni}) \end{aligned}$$

is a Gaussian mean zero random variable, as a linear combination of the independent Gaussian mean zero random variables \(W(A_{ni})\). Now, the conclusion of the lemma follows if we prove that, at least along a subsequence, \(\mathrm{E}\vert \xi _n(s)-\xi (s) \vert ^2\rightarrow 0\) for \(\mu \)-almost every \(s\in {\mathbb S}\).

Our first step in this way is to prove that the convergence \(\mathrm{E}\left\| \xi _n-\xi \right\| _p^2\rightarrow 0\) implies \(\mathrm{E}\left\| \xi _n-\xi \right\| _p^p\rightarrow 0\) in our Gaussian setting. To this aim, we use the following estimates.

$$\begin{aligned} \mathrm{E}\left\| \xi _n-\xi \right\| _p^p&\le \mathrm{E}\Big (\left\| \xi _n-\xi \right\| _p\big (\left\| \xi _n \right\| _p + \left\| \xi \right\| _p\big )^{p-1}\Big )\\&\le \Big (\mathrm{E}\left\| \xi _n-\xi \right\| _p^2\Big )^{1/2}\Big (\mathrm{E}\big (\left\| \xi _n \right\| _p + \left\| \xi \right\| _p\big )^{2p-2}\Big )^{1/2}\\&\le \Big (\mathrm{E}\left\| \xi _n-\xi \right\| _p^2\Big )^{1/2}\Big (2^{2p-3}\mathrm{E}\left\| \xi _n \right\| _p^{2p-2} +2^{2p-3}\mathrm{E}\left\| \xi \right\| _p^{2p-2}\Big )^{1/2}. \end{aligned}$$

As the Gaussian random elements \(\xi _n\) and \(\xi \) have strong moments of any order, we just have to bound \(\mathrm{E}\left\| \xi _n \right\| _p^{2p-2}\) uniformly in n. Using (3.5) in [17], we get for \(r\ge 2\),

$$\begin{aligned} \mathrm{E}\left\| \xi _n \right\| _p^r = \int _0^\infty rt^{r-1}\mathsf {P}'(\left\| \xi _n \right\| _p > t)\,\mathrm {d}t \le \int _0^\infty 4rt^{r-1}\exp \left( \frac{-t^2}{8\mathrm{E}\left\| \xi _n \right\| _p^2}\right) \,\mathrm {d}t. \end{aligned}$$

Now, the convergence to zero of \(\mathrm{E}\left\| \xi _n-\xi \right\| _p^2\) implies the convergence of \(\mathrm{E}\left\| \xi _n \right\| _p^2\) to \(\mathrm{E}\left\| \xi \right\| _p^2\) so there is some \(n_0\) such that \(\mathrm{E}\left\| \xi _n \right\| _p^2 \le 2\mathrm{E}\left\| \xi \right\| _p^2\) for every \(n\ge n_0\). Hence, \(\sup _{n\ge n_0}\mathrm{E}\left\| \xi _n \right\| _p^r\le \int _0^\infty 4rt^{r-1}\exp (-t^2/(16\mathrm{E}\left\| \xi \right\| _p^2))\,\mathrm {d}t<\infty \).

Finally, since

$$\begin{aligned} \mathrm{E}\left\| \xi _n-\xi \right\| _p^p = \int _{\mathbb S}\mathrm{E}\vert \xi _n(s)-\xi (s) \vert ^p\,\mathrm {d}\mu (s) \xrightarrow [n\rightarrow \infty ]{}0, \end{aligned}$$

we can extract a subsequence \((\mathrm{E}\vert \xi _{n}(s)-\xi (s) \vert ^p)_{n\in I}\) which converges to zero \(\mu \)-almost everywhere on \({\mathbb S}\). So there is some measurable subset \({\mathbb S}'\) such that \(\mu ({\mathbb S}\setminus {\mathbb S}')=0\) and for every \(s\in {\mathbb S}'\), \(\lim _{n\rightarrow \infty , n\in I}\mathrm{E}\vert \xi _{n}(s)-\xi (s) \vert ^p=0\). As \(p>2\), this implies by Jensen's inequality that \(\lim _{n\rightarrow \infty , n\in I}\mathrm{E}\vert \xi _{n}(s)-\xi (s) \vert ^2=0\). So for every s in \({\mathbb S}'\), \(\xi (s)\) is the limit in quadratic mean of the sequence of mean zero Gaussian random variables \((\xi _{n}(s))_{n\in I}\); hence, \(\xi (s)\) is a mean zero Gaussian random variable. \(\square \)

Lemma 27

Let X be a random element in \(\mathrm {L}_p({\mathbb S})\) such that \(\mathrm{E}\left\| X \right\| _{\mathrm {L}_p}^2<\infty \). Let \(\xi \) be a Gaussian random element of the form \(\xi =\int _{\mathrm {L}_p({\mathbb S})}{{\,\mathrm{Id}\,}}\,\mathrm {d}W\), where W is a white noise with variance \(\mathsf {P}_X\).

  1. (i)

    For \(\mu \) almost every \(s\in {\mathbb S}\), \(\sigma ^2(s):=\mathrm{E}X(s)^2=\mathrm{E}\xi (s)^2\).

  2. (ii)

Moreover, \(\sigma \in \mathrm {L}_p({\mathbb S})\).

Proof

To prove (i), we recall that the proof of Lemma 26 provides a measurable subset \({\mathbb S}'\) such that \(\mu ({\mathbb S}\setminus {\mathbb S}')=0\) and an infinite subset I of \(\mathbb {N}^*\) such that for every \(s\in {\mathbb S}'\), \((\mathrm{E}\vert \xi _{n}(s)-\xi (s) \vert ^p)_{n\in I}\) converges to zero and \((\mathrm{E}\xi _n(s)^2)_{n\in I}\) converges to \(\mathrm{E}\xi (s)^2\). So it suffices to prove that one can extract a subsequence \((\mathrm{E}\xi _n(s)^2)_{n\in J}\), for some infinite subset J of I, converging to \(\mathrm{E}X(s)^2\) for \(\mu \) almost every \(s\in {\mathbb S}'\). Moreover, it is enough to prove (i) in the case where \(\mu ({\mathbb S})<\infty \). Indeed, when \(\mu ({\mathbb S})=\infty \), by \(\sigma \)-finiteness of \(\mu \), there is a sequence \(({\mathbb S}_n)_{n\ge 1}\) in \(\mathscr {S}\) such that \({\mathbb S}_n\uparrow {\mathbb S}\) and \(\mu ({\mathbb S}_n)<\infty \) for each \(n\ge 1\), and the same holds with \(({\mathbb S}'_n)_{n\ge 1}\) and \({\mathbb S}'\), where \({\mathbb S}'_n:={\mathbb S}_n\cap {\mathbb S}'\). Then, clearly, if \(\mathrm{E}X(s)^2=\mathrm{E}\xi (s)^2\) \(\mu \)-a.e. on \({\mathbb S}'_n\) for each n, the same equality holds \(\mu \)-a.e. on \({\mathbb S}'\). So let us assume from now on that \(\mu ({\mathbb S})\) is finite.

Now, we note that \(\xi _n\) was defined as \(\xi _n:=\int _{\mathrm {L}_p({\mathbb S})}f_n\,\mathrm {d}W\), with

$$\begin{aligned} f_n\xrightarrow [n\rightarrow \infty ]{} {{\,\mathrm{Id}\,}}_{\mathrm {L}_p({\mathbb S})},\quad \text {in the space }\mathrm {L}^2\big (\mathrm {L}_p({\mathbb S}),\mathscr {B}_{\mathrm {L}_p({\mathbb S})},\mathsf {P}_X; \mathrm {L}_p({\mathbb S})\big ). \end{aligned}$$

This convergence means that

$$\begin{aligned} \int _{\mathrm {L}_p({\mathbb S})}\left\| f_n(x) - {{\,\mathrm{Id}\,}}_{\mathrm {L}_p({\mathbb S})}(x) \right\| _{\mathrm {L}_p({\mathbb S})}^2\,\mathrm {d}\mathsf {P}_X(x) \xrightarrow [n\rightarrow \infty ]{} 0, \end{aligned}$$

which can be reformulated as

$$\begin{aligned} \mathrm{E}\left\| f_n(X) - X \right\| _{\mathrm {L}_p({\mathbb S})}^2 = \mathrm{E}\left( \int _{{\mathbb S}}\vert f_n(X)(s) - X(s) \vert ^p \,\mathrm {d}\mu (s)\right) ^{2/p} \xrightarrow [n\rightarrow \infty ]{} 0. \end{aligned}$$
(80)

Since \(\mu ({\mathbb S})\) is finite, \(\mu /\mu ({\mathbb S})\) is a probability, whence as \(p>2\), for any \(g\in \mathrm {L}_p({\mathbb S})\),

$$\begin{aligned} \left( \int _{{\mathbb S}}\vert g \vert ^p\,\mathrm {d}\mu \right) ^{1/p}\ge \mu ({\mathbb S})^{1/p-1/2} \left( \int _{{\mathbb S}}\vert g \vert ^2\,\mathrm {d}\mu \right) ^{1/2}. \end{aligned}$$

This enables us to deduce from (80) that

$$\begin{aligned} \mathrm{E}\int _{{\mathbb S}}\vert f_n(X)(s) - X(s) \vert ^2 \,\mathrm {d}\mu (s) = \int _{{\mathbb S}'}\mathrm{E}\vert f_n(X)(s) - X(s) \vert ^2 \,\mathrm {d}\mu (s) \xrightarrow [n\rightarrow \infty , n\in I]{} 0. \end{aligned}$$

Then, there is a measurable subset \({\mathbb S}''\) of \({\mathbb S}'\) such that \(\mu ({\mathbb S}'\setminus {\mathbb S}'')=0\), together with a subsequence \((\mathrm{E}\vert f_n(X)(s) - X(s) \vert ^2)_{n\in J}\), with \(J\subset I\), converging to zero for every \(s\in {\mathbb S}''\). Now, we have for every \(s\in {\mathbb S}''\), \( \lim _{n\rightarrow \infty , n\in J}\mathrm{E}\big (f_n(X)(s)\big )^2 = \mathrm{E}X(s)^2. \) As \(f_n(X)=\sum _{i=1}^{j_n}h_{ni}\mathbf {1}_{A_{ni}}(X)\), with the \(A_{ni}\) pairwise disjoint, \(\mathrm{E}\big (f_n(X)(s)\big )^2 = \sum _{i=1}^{j_n}h_{ni}(s)^2 \mathsf {P}_X(A_{ni})\). On the other hand, \(\xi _n(s) = I_W(f_n)(s) = \sum _{i=1}^{j_n}h_{ni}(s)W(A_{ni})\), where the \(W(A_{ni})\) are independent centered Gaussian random variables with respective variances \(\mathsf {P}_X(A_{ni})\), so \(\mathrm{E}\xi _n(s)^2 = \sum _{i=1}^{j_n}h_{ni}(s)^2 \mathsf {P}_X(A_{ni})\). Hence, \(\mathrm{E}\big (f_n(X)(s)\big )^2 = \mathrm{E}\xi _n(s)^2\) \(\mu \)-a.e. on \({\mathbb S}''\). Finally,

$$\begin{aligned} \mathrm{E}X(s)^2 = \lim _{n\rightarrow \infty , n\in J}\mathrm{E}f_n(X(s))^2 = \lim _{n\rightarrow \infty , n\in J}\mathrm{E}\xi _n(s)^2 = \mathrm{E}\xi (s)^2,\quad \mu \text {-a.e. on }{\mathbb S}'', \end{aligned}$$

which completes the proof of i).

To check ii), by combining i) and Lemma 26, one sees that for \(\mu \) almost every \(s\in {\mathbb S}\), \(\sigma ^2(s)=\mathrm{E}\xi (s)^2\) and \(\xi (s)\) is a mean zero Gaussian random variable. For every such s, \(\sigma (s)=(\mathrm{E}\xi (s)^2)^{1/2} \le (\mathrm{E}\vert \xi (s) \vert ^p)^{1/p}\) by Lyapunov's inequality, since \(p>2\), whence \(\sigma (s)^p \le \mathrm{E}\vert \xi (s) \vert ^p\). Therefore,

$$\begin{aligned} \int _{{\mathbb S}}\sigma ^p \,\mathrm {d}\mu \le \int _{{\mathbb S}}\mathrm{E}\vert \xi \vert ^p\,\mathrm {d}\mu = \mathrm{E}\left\| \xi \right\| _{\mathrm {L}_p}^p <\infty , \end{aligned}$$

since the Gaussian random element \(\xi \) in \(\mathrm {L}_p({\mathbb S})\) has finite moments of any order. \(\square \)

Proof of Theorem 23

Putting \(2+\delta = \min (p,3)\) and applying Theorem 21, we deduce from (a) and (b) that \(\lim _{n\rightarrow \infty } \zeta _{2+\delta }\left( \sum _{k=1}^{r_n}X_{nk},\sum _{k=1}^{r_n}Y_{nk}\right) = 0 \), where the \(Y_{nk}\) are chosen as in the proof of Theorem 21, that is, \(Y_{nk}=\int _{\mathrm {L}_p({\mathbb S})}{{\,\mathrm{Id}\,}}_{\mathrm {L}_p({\mathbb S})}\,\mathrm {d}W_{nk}\), where the \(W_{nk}\) are independent white noises with respective variances \(\mathsf {P}_{X_{nk}}\). So it remains to prove that \(\zeta _{2+\delta }\big (\sum _{k=1}^{r_n}Y_{nk}, Y\big )\) converges to zero, which is equivalent to \(\sum _{k=1}^{r_n}Y_{nk} \xrightarrow [n\rightarrow \infty ]{\mathscr {D}}Y\) in the space \(\mathrm {L}_p({\mathbb S})\). This last convergence will be established by proving that

  1. (i)

    For every \(u\in \mathrm {L}_q({\mathbb S}, \mathscr {S},\mu ;\mathbb {R})\), \(\langle \sum _{k=1}^{r_n}Y_{nk},u \rangle \) converges in distribution to \(\langle Y,u \rangle \).

  2. (ii)

    The sequence \(\big (\sum _{k=1}^{r_n}Y_{nk}\big )_{n\ge 1}\) is tight in \(\mathrm {L}_p({\mathbb S})\).

To prove (i), we remark that \(\langle \sum _{k=1}^{r_n}Y_{nk},u \rangle \) and \(\langle Y,u \rangle \) are mean zero Gaussian random variables, so the announced convergence in distribution will follow from the convergence of their variances. To prove the latter, we note that

$$\begin{aligned} \mathrm {var}\langle Y_{nk},u \rangle =\mathrm{E}\langle Y_{nk},u \rangle ^2 = \int _{{\mathbb S}^2} u(s)u(t)\mathrm{E}Y_{nk}(s)Y_{nk}(t)\,\mathrm {d}\mu \otimes \mu (s,t), \end{aligned}$$

so using the independence of the Gaussian random variables \(\langle Y_{nk},u \rangle \), \(1\le k\le r_n\), we just have to prove that

$$\begin{aligned} \int _{{\mathbb S}^2} u(s)u(t)\sum _{k=1}^{r_n}\mathrm{E}Y_{nk}(s)Y_{nk}(t)\,\mathrm {d}\mu \otimes \mu (s,t) \xrightarrow [n\rightarrow \infty ]{} \int _{{\mathbb S}^2}u(s)u(t) \mathrm{E}Y(s)Y(t)\,\mathrm {d}\mu \otimes \mu (s,t). \end{aligned}$$

By Proposition 25, the \(Y_{nk}\)’s can be replaced by the \(X_{nk}\)’s in the above convergence, which then appears as an obvious consequence of Assumption (c), since \(u\otimes u\) belongs to \(\mathrm {L}_q({\mathbb S}^2)\).

To prove (ii), according to Cremers and Kadelka [9, Th. 2], it suffices to prove that with \(Y_n:=\sum _{k=1}^{r_n}Y_{nk}\),

$$\begin{aligned} \limsup _{n\rightarrow \infty }\int _{{\mathbb S}}\mathrm{E}\left| Y_n \right| ^p\,\mathrm {d}\mu \le \int _{{\mathbb S}}\mathrm{E}\left| Y \right| ^p\,\mathrm {d}\mu . \end{aligned}$$

In fact, we will prove that

$$\begin{aligned} \lim _{n\rightarrow \infty }\int _{{\mathbb S}}\mathrm{E}\left| Y_n \right| ^p\,\mathrm {d}\mu = \int _{{\mathbb S}}\mathrm{E}\left| Y \right| ^p\,\mathrm {d}\mu . \end{aligned}$$
(81)

By Lemmas 26 and 27 i), for \(\mu \) almost every \(s\in {\mathbb S}\), \(Y_n(s)\) and \(Y(s)\) are centered Gaussian random variables with respective variances \(\sigma _n^2(s)=\sum _{k=1}^{r_n}\mathrm{E}X_{nk}(s)^2\) and \(\sigma ^2(s)\). This implies that

$$\begin{aligned} \mathrm{E}\vert Y_n(s) \vert ^p = \sigma _n(s)^p m_p, \quad \mathrm{E}\vert Y(s) \vert ^p = \sigma (s)^pm_p \end{aligned}$$

where \(m_p:=(2\pi )^{-1/2}\int _{-\infty }^\infty \vert z \vert ^p \exp (-z^2/2)\,\mathrm {d}z\). This way, (81) is reduced to Assumption (d) and the proof is complete. \(\square \)
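For the record, \(m_p\) is the p-th absolute moment of the standard Gaussian distribution and has the classical closed form

$$\begin{aligned} m_p = \frac{2^{p/2}}{\sqrt{\pi }}\,\varGamma \Big (\frac{p+1}{2}\Big ), \end{aligned}$$

so that, for instance, \(m_2=1\) and \(m_3=2\sqrt{2/\pi }\); only its finiteness matters above.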

5 Asymptotic Normality of Weighted Sums

Let \((X_j, j\in \mathbb {Z})\) be a family of \(\mathsf {B}\)-valued random elements. Assume that \(\mathrm{E}(X_j)=0\) and \(\mathrm{E}\Vert X_j\Vert ^2<\infty \) for every \(j\in \mathbb {Z}\). Consider the weighted sums

$$\begin{aligned} Z_n:=\sum _{k=0}^\infty a_{n, k}X_k,\ \ n\in \mathbb {N}, \end{aligned}$$
(82)

whenever they are well defined, where \(\{(a_{n, k}, k\ge 0), n\in \mathbb {N}\}\subset \mathbb {R}\). We assume that for each \(n\in \mathbb {N}\), \(\sum _{k}a^2_{nk}<\infty \).

Theorem 28

Assume that the Banach space \(\mathsf {B}\) is p-smooth for some \(p>2\). Let \((X_k, k\in \mathbb {Z})\) be i.i.d. \(\mathsf {B}\)-valued random elements and \(Q:=\mathrm {cov}(X_1)\). Assume that

  1. (i)

    \(c_n:=\sup _{k\ge 0}|a_{nk}|\rightarrow 0\) as \(n\rightarrow \infty \);

  2. (ii)

    \(b_n^2:=\sum _{k\ge 0}a^2_{nk}\rightarrow 1\) as \(n\rightarrow \infty \).

Then, for each n, the series \(\sum _{k}a_{nk}X_k\) converges a.s., and

$$\begin{aligned} \sum _{k=0}^\infty a_{nk}X_k\xrightarrow [n\rightarrow \infty ]{\mathscr {D}}Y_{Q}. \end{aligned}$$
(83)

Proof

Since the space \(\mathsf {B}\) is of type 2, we get by (40) for any \(1\le l\le m\),

$$\begin{aligned} \mathrm{E}\Big \Vert \sum _{k=l}^m a_{nk}X_k\Big \Vert&\le K\Big (\sum _{k=l}^m \mathrm{E}\Vert a_{nk}X_k\Vert ^2\Big )^{1/2}\\&= K \Big (\sum _{k=l}^m a_{nk}^2\Big )^{1/2}(\mathrm{E}\Vert X_1\Vert ^2)^{1/2}. \end{aligned}$$

Due to Condition (ii), this implies that the series \(\sum _{k}a_{nk}X_k\) satisfies the Cauchy criterion in the space \(\mathrm {L}^1(\varOmega ,\mathscr {F},\mathsf {P};\mathsf {B})\), hence converges in \(\mathrm {L}^1\) and in probability. By independence of its terms, it also converges a.s., according to the Itô–Nisio theorem.

Without loss of generality, we assume that \(p=2+\delta \) with some \(\delta \in (0, 1)\). Let \(m\ge 1\). We apply Theorem 22 to \(X_{nk}=a_{nk}X_k\), \(k=0, \dots , m\). In this case, the sum \(\sum _{k=0}^m Y_{nk}\) has the same distribution as \(A_{nm}Y_Q\), where \(A_{nm}^2=\sum _{k=0}^m a_{nk}^2\), \(A_{nm}\ge 0\). Hence, by (66) in Theorem 21, with \(c:=c(\mathsf {B},\delta )\),

$$\begin{aligned}&\zeta _p\left( \sum _{k=0}^m a_{nk} X_{k}, A_{nm}Y_Q\right) \le c\Big [\varepsilon ^{\delta }A_{nm}^2\mathrm{E}\left\| X_1 \right\| ^2 \\&\quad +\, \big (1+A_{nm}^2\mathrm{E}\left\| X_1 \right\| ^2\big ) \sum _{k=0}^m a_{nk}^2\mathrm{E}\left\| X_1 \right\| ^2\varvec{1}\{\vert a_{nk} \vert \left\| X_1 \right\| > \varepsilon \}\Big ]. \end{aligned}$$

Since \(A_{nm}^2\le b_n^2\) and \(\vert a_{nk} \vert \le c_n\), this gives the following bound, uniform in \(m\ge 0\):

$$\begin{aligned}&\zeta _p\left( \sum _{k=0}^m a_{nk} X_{k}, A_{nm}Y_Q\right) \le c b_n^2\big [\varepsilon ^\delta \mathrm{E}\left\| X_1 \right\| ^2 \nonumber \\&\quad +\, (1 + b_n^2\mathrm{E}\left\| X_1 \right\| ^2)\mathrm{E}\left\| X_1 \right\| ^2\varvec{1}\{c_n\left\| X_1 \right\| > \varepsilon \}\big ]. \end{aligned}$$
(84)

Now, we estimate

$$\begin{aligned} \zeta _p(A_{nm}Y_Q, Y_Q) = \sup \left\{ \left| \mathrm{E}f(A_{nm}Y_Q) - \mathrm{E}f(Y_Q) \right| :\; f\in \mathrm {C}^{(p)}_b(\mathsf {B}), \ \Vert f\Vert _{(p)}\le 1\right\} . \end{aligned}$$

As the random elements \(A_{nm}Y_Q\) and \(Y_Q\) are defined on the same probability space, the expression \(\mathrm{E}\big (f(A_{nm}Y_Q)-f(Y_Q)\big )\) makes sense and we obtain

$$\begin{aligned} \left| \mathrm{E}f(A_{nm}Y_Q) - \mathrm{E}f(Y_Q) \right|&= \left| \mathrm{E}\big (f(A_{nm}Y_Q)-f(Y_Q)\big ) \right| \le \mathrm{E}\left| f(A_{nm}Y_Q)-f(Y_Q) \right| \\&\le \mathrm{E}\sup _{x\in \mathsf {B}}\left\| f'(x) \right\| \left\| A_{nm}Y_Q - Y_Q \right\| \le \vert 1-A_{nm} \vert \mathrm{E}\left\| Y_Q \right\| , \end{aligned}$$

because \(\left\| f \right\| _{(p)}\le 1\) implies \(\sup _{x\in \mathsf {B}}\left\| f'(x) \right\| \le 1\). Therefore,

$$\begin{aligned} \zeta _p(A_{nm}Y_Q, Y_Q) \le \vert 1-A_{nm} \vert \mathrm{E}\left\| Y_Q \right\| ,\quad m\ge 0. \end{aligned}$$
(85)

Next, by the regularity property of \(\zeta _p\), see (b) p. 15, and the independence of the \(X_k\)’s,

$$\begin{aligned} \zeta _p\left( \sum _{k=0}^m a_{nk}X_k, \sum _{k=0}^\infty a_{nk}X_k\right) \le \zeta _p\left( 0,\sum _{k=m+1}^\infty a_{nk}X_k\right) . \end{aligned}$$

By the Taylor formula at order 1 with integral remainder, see (68), it is easily seen that if Z is a random element in \(\mathsf {B}\) with \(\mathrm{E}Z=0\) and \(\mathrm{E}\left\| Z \right\| ^2<\infty \), then \(\zeta _p(0,Z)\le \frac{1}{2}\mathrm{E}\left\| Z \right\| ^2\). Therefore,

$$\begin{aligned} \zeta _p\left( \sum _{k=0}^m a_{nk}X_k, \sum _{k=0}^\infty a_{nk}X_k\right) \le \frac{K^2}{2}\mathrm{E}\left\| X_1 \right\| ^2\sum _{k=m+1}^\infty a_{nk}^2. \end{aligned}$$
(86)
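The bound \(\zeta _p(0,Z)\le \frac{1}{2}\mathrm{E}\left\| Z \right\| ^2\) invoked above is obtained as follows. Assuming, as for \(f'\) in the proof of (85), that \(\left\| f \right\| _{(p)}\le 1\) also gives \(\sup _{x\in \mathsf {B}}\left\| f''(x) \right\| \le 1\), the Taylor formula (68) and \(\mathrm{E}Z=0\) yield

$$\begin{aligned} \left| \mathrm{E}f(Z) - f(0) \right| = \left| \mathrm{E}\int _0^1 (1-t)\,f''(tZ)(Z,Z)\,\mathrm {d}t \right| \le \mathrm{E}\left\| Z \right\| ^2\int _0^1(1-t)\,\mathrm {d}t = \frac{1}{2}\,\mathrm{E}\left\| Z \right\| ^2, \end{aligned}$$

the first-order term vanishing because \(\mathrm{E}f'(0)Z = f'(0)(\mathrm{E}Z)=0\).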

Finally, by the triangle inequality for the distance \(\zeta _p\), gathering the estimates (84), (85) and (86) gives

$$\begin{aligned} \zeta _p\left( \sum _{k=0}^\infty a_{nk}X_k, Y_Q\right) \le \frac{K^2}{2}\mathrm{E}\left\| X_1 \right\| ^2\sum _{k=m+1}^\infty a_{nk}^2 + u_n(\varepsilon ) + \vert 1-A_{nm} \vert \,\mathrm{E}\left\| Y_Q \right\| , \end{aligned}$$

where \(u_n(\varepsilon )\) denotes the right-hand side of (84). Letting m tend to infinity in the above inequality gives \( \zeta _p\left( \sum _{k=0}^\infty a_{nk}X_k, Y_Q\right) \le u_n(\varepsilon ) + \vert 1-b_n \vert \,\mathrm{E}\left\| Y_Q \right\| \), since \(A_{nm}\rightarrow b_n\) as \(m\rightarrow \infty \). By (i), \(\mathrm{E}\left\| X_1 \right\| ^2\varvec{1}\{c_n\left\| X_1 \right\| > \varepsilon \}\xrightarrow [n\rightarrow \infty ]{}0\) by dominated convergence, while (ii) gives \(b_n\rightarrow 1\), whence

$$\begin{aligned} \limsup _{n\rightarrow \infty }\zeta _p\left( \sum _{k=0}^\infty a_{nk}X_k, Y_Q\right) \le c\mathrm{E}\left\| X_1 \right\| ^2 \varepsilon ^\delta . \end{aligned}$$

By arbitrariness of \(\varepsilon \), we conclude that \(\lim _{n\rightarrow \infty }\zeta _p\left( \sum _{k=0}^\infty a_{nk}X_k, Y_Q\right) =0\). \(\square \)
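Although the theorem is infinite dimensional, its conclusion can be sanity-checked by simulation in \(\mathbb {R}^2\), where every norm is p-smooth. A minimal sketch (not part of the proof; all names are ours), with Cesàro weights \(a_{n,k}=(n+1)^{-1/2}\varvec{1}\{0\le k\le n\}\), so that \(c_n\rightarrow 0\) and \(b_n^2=1\); the empirical covariance of \(Z_n\) should approach \(Q=\mathrm {cov}(X_1)\):

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, n_sim = 2, 200, 5000

# i.i.d. centered, non-Gaussian innovations with covariance Q = A A^T
A = np.array([[1.0, 0.5], [0.0, 0.8]])
def sample_X(size):
    U = rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size=(size, d))  # unit covariance
    return U @ A.T

# Cesàro weights a_{n,k} = (n+1)^{-1/2}, 0 <= k <= n
a = np.full(n + 1, (n + 1) ** -0.5)

# simulate Z_n = sum_k a_{n,k} X_k, n_sim times
X = sample_X(n_sim * (n + 1)).reshape(n_sim, n + 1, d)
Z = np.einsum('k,skd->sd', a, X)

print("empirical cov of Z_n:\n", np.round(np.cov(Z.T), 3))
print("target Q = A A^T:\n", A @ A.T)
```

The empirical covariance matches \(Q\) up to Monte Carlo error, as the Gaussian limit \(Y_Q\) in (83) predicts.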

Next, consider a \(\mathsf {B}\)-valued linear process \((X_k, k\in \mathbb {Z})\) defined by

$$\begin{aligned} X_k=\sum _{j=0}^\infty \psi _j \epsilon _{k-j}, \end{aligned}$$
(87)

where the innovations \((\epsilon _k, k\in \mathbb {Z})\) are i.i.d. \(\mathsf {B}\)-valued random variables such that \(\mathrm{E}\epsilon _0=0\), \(Q_\epsilon =\mathrm {cov}(\epsilon _0)\), \(0<\sigma ^2:=\mathrm{E}\Vert \epsilon _0\Vert ^2<\infty \), and the linear filter \((\psi _j, j\ge 0)\subset L(\mathsf {B})\) is a sequence of bounded linear operators such that \(\psi _0={{\,\mathrm{Id}\,}}_{\mathsf {B}}\) and

$$\begin{aligned} \sum _{j=0}^\infty \left\| \psi _j \right\| <\infty . \end{aligned}$$
(88)

This condition ensures the a.s. convergence of the series in (87). In this case, we set \(\varPsi =\sum _{j=0}^\infty \psi _j.\)

Theorem 29

Let \(\mathsf {B}\) be a p-smooth Banach space, \(p>2\). Let \((X_k)\) be a linear process defined by (87), where \((\psi _k)\) satisfies (88). Let \((a_{n, j}, j\in \mathbb {Z}, n\in \mathbb {N})\subset \mathbb {R}\) satisfy conditions (i)–(ii) of Theorem 28 and

  1. (iii)

    \(\lim _{n\rightarrow \infty }\sum _{k\in \mathbb {Z}}(a_{n,k+1}-a_{n,k})^2=0.\)

Then,

$$\begin{aligned} \sum _{k=0}^\infty a_{nk}X_k\ \xrightarrow [n\rightarrow \infty ]{\mathscr {D}}\ Y_{\varPsi ^*Q_\epsilon \varPsi }. \end{aligned}$$

Proof

We have

$$\begin{aligned} Z_n:=\sum _{k=0}^\infty a_{nk}X_k = \sum _{k=0}^\infty a_{nk}\sum _{j=0}^\infty \psi _j \epsilon _{k-j} = \sum _{j=0}^\infty \psi _j\sum _{k=0}^\infty a_{nk}\epsilon _{k-j} = \sum _{j=0}^\infty \psi _jZ_{nj}, \end{aligned}$$

where \(Z_{nj}:=\sum _{k=0}^\infty a_{nk}\epsilon _{k-j}\). Writing \(Z_n=Z_n'+Z_n''\), where

$$\begin{aligned} Z_n':=\sum _{j=0}^\infty \psi _{j}(Z_{nj}-Z_{n0}),\quad Z_n'':=\sum _{j=0}^\infty \psi _{j} Z_{n0}, \end{aligned}$$

we consider \(Z_n'\) and \(Z_n''\) separately. By Theorem 28,

$$\begin{aligned} Z_n'' =\varPsi (Z_{n0}) = \varPsi \left( \sum _{k=0}^\infty a_{nk}\epsilon _k\right) \xrightarrow [n\rightarrow \infty ]{\mathscr {D}}\varPsi (Y_{Q_\epsilon })\sim Y_{\varPsi ^*Q_\epsilon \varPsi }. \end{aligned}$$

To complete the proof, we show that \(Z_n'\xrightarrow [n\rightarrow \infty ]{\mathrm {P}}0\). To this end, let us assume for a moment that the following two properties hold:

$$\begin{aligned} \sup _{n,j}\mathrm{E}\left\| Z_{nj} \right\| ^2<\infty \end{aligned}$$
(89)

and

$$\begin{aligned} \lim _{n\rightarrow \infty }\left\| Z_{nj}-Z_{n0} \right\| = 0 \quad \text {in probability, for each }j\in \mathbb {N}. \end{aligned}$$
(90)

Let \(\varepsilon >0\) and \(J\in \mathbb {N}\). Splitting \(Z'_n\) into two sums indexed by \(j\le J\) and \(j>J\) leads to

$$\begin{aligned} \mathsf {P}(\left\| Z'_n \right\|>\varepsilon ) \le \mathsf {P}\left( \sum _{j=0}^{J}\Vert \psi _{j}\Vert \cdot \Vert Z_{nj}-Z_{n0}\Vert>\frac{\varepsilon }{2}\right) + \mathsf {P}\left( \sum _{j>J}\Vert \psi _{j}\Vert \cdot \Vert Z_{nj}-Z_{n0}\Vert >\frac{\varepsilon }{2}\right) .\nonumber \\ \end{aligned}$$
(91)

Applying the Markov inequality at order one gives

$$\begin{aligned} \mathsf {P}\left( \sum _{j>J}\Vert \psi _{j}\Vert \cdot \Vert Z_{nj}-Z_{n0}\Vert>\frac{\varepsilon }{2}\right) \le \frac{4}{\varepsilon }\sup _{n,j} \mathrm{E}\left\| Z_{nj} \right\| \sum _{j>J}\left\| \psi _{j} \right\| . \end{aligned}$$

By (89) and (88), taking \(J\in \mathbb {N}\) large enough, one can make the right-hand side of the preceding bound as small as one wishes. Then, by (90), the first probability on the right-hand side of (91) can be made as small as one wishes by taking \(n\in \mathbb {N}_{+}\) large enough. Therefore, \(Z_n'\xrightarrow [n\rightarrow \infty ]{\mathrm {P}}0\) holds true, subject to the forthcoming proofs of (89) and (90).

For (89), as the sequence \((\epsilon _i)_{i\in \mathbb {Z}}\) is i.i.d., it is clear that for each \(n\ge 1\), all the \(Z_{nj}\) have the same distribution, so it suffices to check that \(\sup _{n\ge 1}\mathrm{E}\left\| Z_{n0} \right\| ^2<\infty \). As \(\mathsf {B}\) is of type 2, it easily follows from (39) and the equidistribution of the independent \(\epsilon _k\) that

$$\begin{aligned} \mathrm{E}\left\| Z_{n0} \right\| ^2 \le K^2 \mathrm{E}\left\| \epsilon _0 \right\| ^2 \sum _{k=0}^\infty a_{nk}^2. \end{aligned}$$

Hence, (89) results from Assumption (ii).

To check (90), we show that in the decomposition

$$\begin{aligned} Z_{nj} - Z_{n0} = \sum _{i=-j}^{-1} a_{n,i+j}\epsilon _i + \sum _{i=0}^\infty (a_{n,i+j} - a_{n,i})\epsilon _i, \end{aligned}$$

both sums converge to zero in quadratic mean. For the first one,

$$\begin{aligned} \mathrm{E}\left\| \sum _{i=-j}^{-1} a_{n,i+j}\epsilon _i \right\| ^2&\le K^2\sum _{i=-j}^{-1}\mathrm{E}\left\| a_{n,i+j}\epsilon _i \right\| ^2 = K^2\sum _{i=-j}^{-1}a_{n,i+j}^2\mathrm{E}\left\| \epsilon _0 \right\| ^2 \\&\le K^2 j\mathrm{E}\left\| \epsilon _0 \right\| ^2 c_n^2, \end{aligned}$$

which tends to zero as n goes to infinity by Assumption (i). For the second sum,

$$\begin{aligned} \mathrm{E}\left\| \sum _{i=0}^\infty (a_{n,i+j} - a_{n,i})\epsilon _i \right\| ^2&\le K^2\mathrm{E}\left\| \epsilon _0 \right\| ^2 \sum _{i=0}^\infty (a_{n,i+j} - a_{n,i})^2. \end{aligned}$$

Let us denote by d the Euclidean distance in the sequence space \(\ell ^2(\mathbb {N})\). Then,

$$\begin{aligned} \sum _{i=0}^\infty (a_{n,i+j} - a_{n,i})^2&\le \left( \sum _{l=0}^{j-1}d\Big (\big (a_{n,i+l+1}\big )_{i\ge 0},\big (a_{n,i+l}\big )_{i\ge 0}\Big )\right) ^2 \\&\le j^2\max _{0\le l<j}d\Big (\big (a_{n,i+l+1}\big )_{i\ge 0},\big (a_{n,i+l}\big )_{i\ge 0}\Big )^2\\&\le j^2 \sum _{k=0}^\infty (a_{n,k+1} - a_{n,k})^2, \end{aligned}$$

which tends to zero as n tends to infinity by Assumption (iii). Hence, (90) is established and the proof is complete. \(\square \)
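Before turning to examples, the conclusion of Theorem 29 is easy to illustrate numerically in the scalar case \(\mathsf {B}=\mathbb {R}\) with the AR(1)-type filter \(\psi _j=\varphi ^j\,{{\,\mathrm{Id}\,}}_{\mathsf {B}}\), \(0<\varphi <1\), for which \(\varPsi =(1-\varphi )^{-1}\). A rough sketch under these assumptions (the parameter values are ours), using Cesàro weights, for which the limiting variance is \(\varPsi ^2\sigma ^2\):

```python
import numpy as np

rng = np.random.default_rng(1)
phi, n, n_sim = 0.6, 400, 4000
Psi = 1.0 / (1.0 - phi)                    # Psi = sum_j phi^j

# centered i.i.d. innovations with sigma^2 = 1; extra length for burn-in
eps = rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size=(n_sim, 3 * n))

# linear process X_k = sum_j phi^j eps_{k-j}, via the AR(1) recursion
X = np.zeros_like(eps)
for t in range(1, eps.shape[1]):
    X[:, t] = phi * X[:, t - 1] + eps[:, t]
X = X[:, n:2 * n]                          # discard burn-in, keep length n

Z = X.sum(axis=1) / np.sqrt(n)             # Cesàro-weighted sum
print("var(Z_n) =", round(Z.var(), 3), " target Psi^2 sigma^2 =", Psi ** 2)
```

The empirical variance approaches \(\varPsi ^2\sigma ^2=6.25\) up to Monte Carlo and edge effects.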

Examples of summation methods \((a_{n,k}, k\in \mathbb {N}, n\in \mathbb {N})\) that satisfy conditions (i)–(iii) include the following (see [10] for more examples):

Cesàro summation corresponding to

$$\begin{aligned} a_{n,k}=(n+1)^{-1/2}{\left\{ \begin{array}{ll} 1, \ &{}\text {if}\ 0\le k\le n,\\ 0,\ &{}\text {otherwise.} \end{array}\right. } \end{aligned}$$

Here, \(c_n=(n+1)^{-1/2}\), \(b_n^2=1\) and \(\sum _{k=0}^\infty (a_{n,k+1}-a_{n,k})^2=1/(n+1)\).

Abel summation corresponding to

$$\begin{aligned} a_{n,k}= \sqrt{2\lambda _n}(1-\mathrm {e}^{-1/\lambda _n})\mathrm {e}^{-k/\lambda _n}, \quad k\ge 0, \end{aligned}$$

where \(\lambda _n\rightarrow \infty \) as \(n\rightarrow \infty \). In this case,

$$\begin{aligned}&c_n=\sqrt{2\lambda _n}(1-\mathrm {e}^{-1/\lambda _n})\sim \left( \frac{2}{\lambda _n}\right) ^{1/2} \xrightarrow [n\rightarrow \infty ]{}0, \\&b_n^2 = 2\lambda _n(1 - \mathrm {e}^{-1/\lambda _n})^2 \sum _{k=0}^\infty \mathrm {e}^{-2k/\lambda _n} = \frac{2\lambda _n(1 - \mathrm {e}^{-1/\lambda _n})^2}{1 - \mathrm {e}^{-2/\lambda _n}} \\&\qquad = \frac{2\lambda _n(1 - \mathrm {e}^{-1/\lambda _n})}{1 + \mathrm {e}^{-1/\lambda _n}} \xrightarrow [n\rightarrow \infty ]{}1, \end{aligned}$$
$$\begin{aligned} \sum _{k=0}^\infty (a_{n,k+1} - a_{n,k})^2&= 2\lambda _n(1 - \mathrm {e}^{-1/\lambda _n})^2 \sum _{k=0}^\infty \Big (\mathrm {e}^{-(k+1)/\lambda _n} - \mathrm {e}^{-k/\lambda _n}\Big )^2\\&= \frac{2\lambda _n(1 - \mathrm {e}^{-1/\lambda _n})^4}{1 - \mathrm {e}^{-2/\lambda _n}} \sim \frac{1}{\lambda _n^2}\xrightarrow [n\rightarrow \infty ]{}0. \end{aligned}$$

Borel summation corresponding to

$$\begin{aligned} a_{n, k}=\sqrt{2}(\pi \lambda _n)^{1/4}\,\frac{\mathrm {e}^{-\lambda _n}\lambda _n^k}{k!}, \quad k\ge 0,\quad \lambda _n\xrightarrow [n\rightarrow \infty ]{}\infty . \end{aligned}$$

To check (i), recalling that, for a random variable N with the Poisson distribution of parameter \(\lambda _n\), \(\max _{k\ge 0}P(N=k) = P(N=m)\) with \(m\le \lambda _n<m+1\), we get

$$\begin{aligned} c_n = \sqrt{2}(\pi \lambda _n)^{1/4}\,\frac{\mathrm {e}^{-\lambda _n}\lambda _n^m}{m!}. \end{aligned}$$

By Stirling formula, \(m! = \sqrt{2\pi }\,m^{m+1/2}\mathrm {e}^{-m}(1+\delta _m)\) with \(\lim _{m\rightarrow \infty }\delta _m=0\), whence

$$\begin{aligned} c_n \le \pi ^{-1/4}\frac{(m+1)^{1/4}\mathrm {e}^{-m}(m+1)^m}{m^{m+1/2}\mathrm {e}^{-m}(1+\delta _m)} = \frac{\left( 1+\frac{1}{m}\right) ^{m+1/4}}{(\pi m)^{1/4}(1+\delta _m)} \sim \frac{\mathrm {e}}{(\pi m)^{1/4}}\xrightarrow [n\rightarrow \infty ]{}0. \end{aligned}$$

To check (ii), we refer to [10], where it is proved by using the Bessel function of the first kind; see (2.8) and (2.9) therein.

To check (iii), we note first that

$$\begin{aligned} \sum _{k=0}^\infty (a_{n,k+1} - a_{n,k})^2 = \sum _{k=0}^\infty 2\sqrt{\pi \lambda _n} \mathrm {e}^{-2\lambda _n}\frac{\lambda _n^{2k}}{(k!)^2}\left( \frac{\lambda _n}{k+1}-1\right) ^2 = \int _{\mathbb {N}}f_{\lambda _n}\,\mathrm {d}\mu _{\lambda _n}, \end{aligned}$$

where \(f_\lambda (k):=(\lambda (k+1)^{-1}-1)^2\) and \(\mu _{\lambda }\) is the discrete measure

$$\begin{aligned} \mu _{\lambda } := \sum _{k=0}^\infty \left( 2\sqrt{\pi \lambda }\, \mathrm {e}^{-2\lambda }\frac{\lambda ^{2k}}{(k!)^2}\right) \delta _k. \end{aligned}$$

In what follows, we simplify the notation by replacing \(\lambda _n\) (\(n\rightarrow \infty \)) by \(\lambda \) (\(\lambda \rightarrow \infty \)). It is easily seen that the peak of the point masses of \(\mu _{\lambda }\) is at \(k=[\lambda ]\). We will use the following estimates for the left and right tails of \(\mu _{\lambda }\), obtained by comparison with geometric series.

$$\begin{aligned} \sum _{k\le j}\frac{\lambda ^{2k}}{(k!)^2}&< \frac{\lambda ^{2j}}{(j!)^2} \frac{\lambda ^2}{\lambda ^2 - j^2},\quad 0\le j < \lambda , \end{aligned}$$
(92)
$$\begin{aligned} \sum _{k\ge j}\frac{\lambda ^{2k}}{(k!)^2}&< \frac{\lambda ^{2j}}{(j!)^2} \frac{(j+1)^2}{(j+1)^2 - \lambda ^2},\quad j>\lambda . \end{aligned}$$
(93)
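Both comparisons are elementary to check numerically; here is a small sketch (the values \(\lambda =50\), \(j=30\) and \(j=80\) are ours, chosen for illustration; the infinite sum in (93) is truncated where its terms underflow):

```python
import numpy as np
from scipy.special import gammaln

def term(lam, k):
    # lam^{2k} / (k!)^2, evaluated in log space to avoid overflow
    return np.exp(2 * k * np.log(lam) - 2 * gammaln(k + 1))

lam = 50.0

j = 30                                     # j < lam: left tail bound (92)
lhs = term(lam, np.arange(j + 1)).sum()
rhs = term(lam, j) * lam**2 / (lam**2 - j**2)
print("(92) holds:", lhs < rhs)

j = 80                                     # j > lam: right tail bound (93)
lhs = term(lam, np.arange(j, j + 2000)).sum()
rhs = term(lam, j) * (j + 1)**2 / ((j + 1)**2 - lam**2)
print("(93) holds:", lhs < rhs)
```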

Let \(1/2<\tau <1\). We split \(\int _{\mathbb {N}}\) into \( \int _L + \int _C + \int _R \) with left, center and right intervals \(L:=\mathbb {N}\cap [0,\lambda - \lambda ^\tau ]\), \(C:=\mathbb {N}\cap (\lambda - \lambda ^\tau , \lambda + \lambda ^\tau )\), \(R:=\mathbb {N}\cap [\lambda + \lambda ^\tau ,\infty )\).

Estimation of \(\int _L f_\lambda \,\mathrm {d}\mu _\lambda \). Let j be the unique integer such that \(j\le \lambda - \lambda ^\tau <j+1\). As \(f_\lambda (k)\le \lambda ^2\) on L, and accounting for (92),

$$\begin{aligned} \int _L f_\lambda \,\mathrm {d}\mu _\lambda \le 2\sqrt{\pi }\lambda ^{5/2}\frac{\mathrm {e}^{-2\lambda }\lambda ^{2j}}{(j!)^2} \frac{\lambda ^2}{\lambda ^2 - j^2}. \end{aligned}$$

We note that

$$\begin{aligned} \frac{\lambda ^2}{\lambda ^2 - j^2} \le \frac{\lambda ^2}{\lambda ^2 - (\lambda -\lambda ^\tau )^2} = \frac{1}{1-(1-\lambda ^{\tau -1})^2} \sim \frac{1}{2}\lambda ^{1-\tau }. \end{aligned}$$

By Stirling formula, \(((j+1)!)^2 = 2\pi (j+1)^{2(j+1)+1}\mathrm {e}^{-2(j+1)}(1+\delta _{j+1})^2\), so as \(j+1>\lambda - \lambda ^\tau \), \(((j+1)!)^2\ge 2\pi (\lambda - \lambda ^\tau )^{2\lambda - 2\lambda ^\tau +1}\mathrm {e}^{-2\lambda + 2\lambda ^\tau - 2}(1+\delta _{j+1})^2\), whence

$$\begin{aligned} \frac{\mathrm {e}^{-2\lambda }\lambda ^{2j}}{(j!)^2} = \frac{\mathrm {e}^{-2\lambda }\lambda ^{2j}(j+1)^2}{((j+1)!)^2} \le \frac{\mathrm {e}^{2}}{2\pi } \frac{\mathrm {e}^{- 2\lambda ^\tau }\lambda ^{2\lambda - 2\lambda ^\tau }(\lambda - \lambda ^\tau +1)^2}{(\lambda - \lambda ^\tau )^{2\lambda - 2\lambda ^\tau +1}(1+\delta _{j+1})^2} \sim \frac{\mathrm {e}^{2}}{2\pi }T_1(\lambda ), \end{aligned}$$

where \(T_1(\lambda ):= \lambda \mathrm {e}^{-2\lambda ^\tau }(1-\lambda ^{\tau -1})^{2\lambda ^\tau -2\lambda }\). Next, using \(\ln (1-t)= -t-t^2/2 + o(t^2)\) as \(t\rightarrow 0\),

$$\begin{aligned} T_1(\lambda )&= \lambda \exp \Big (-2\lambda ^\tau + (2\lambda ^\tau -2\lambda ) \big (-\lambda ^{\tau -1} - \frac{\lambda ^{2\tau -2}}{2} + o(\lambda ^{2\tau -2})\big )\Big )\\&= \lambda \exp \big (-\lambda ^{2\tau - 1} + o(\lambda ^{2\tau -1})\big ) \end{aligned}$$

since \(\lambda ^{3\tau -2}=o(\lambda ^{2\tau -1})\). As \(2\tau -1>0\), for any \(a>0\) and \(c\in (0,1)\), \(\lambda ^aT_1(\lambda ) = O(\exp (-c\lambda ^{2\tau -1}))\), whence

$$\begin{aligned} \int _L f_\lambda \,\mathrm {d}\mu _\lambda = O\left( \exp (-c\lambda ^{2\tau -1})\right) = o(1). \end{aligned}$$

Estimation of \(\int _C f_\lambda \,\mathrm {d}\mu _\lambda \). One easily checks that for \(k\in C\), \( f_\lambda (k) \le \left( \frac{\lambda ^\tau }{\lambda - \lambda ^\tau }\right) ^2 \sim \lambda ^{2\tau - 2} \). As \(\mu _\lambda (C) < \mu _\lambda (\mathbb {N})\sim 1\), this gives

$$\begin{aligned} \int _C f_\lambda \,\mathrm {d}\mu _\lambda = O\left( \lambda ^{2\tau - 2}\right) = o(1). \end{aligned}$$

Estimation of \(\int _R f_\lambda \,\mathrm {d}\mu _\lambda \). For \(k\ge \lambda + \lambda ^\tau \), \(0<\lambda (k+1)^{-1} <1\), whence \(f_\lambda (k)<1\), so

$$\begin{aligned} \int _R f_\lambda \,\mathrm {d}\mu _\lambda \le 2\sqrt{\pi } \sum _{k > \lambda + \lambda ^\tau } \lambda ^{1/2}\mathrm {e}^{-2\lambda }\frac{\lambda ^{2k}}{(k!)^2}. \end{aligned}$$

Denoting by j the unique integer such that \(j\le \lambda + \lambda ^\tau < j+1\), (93) gives

$$\begin{aligned} \sum _{k > \lambda + \lambda ^\tau } \lambda ^{1/2}\mathrm {e}^{-2\lambda }\frac{\lambda ^{2k}}{(k!)^2} \le \lambda ^{1/2}\frac{\mathrm {e}^{-2\lambda }\lambda ^{2j}}{(j!)^2} \frac{(j+1)^2}{(j+1)^2 - \lambda ^2}. \end{aligned}$$

Since the function \(s\mapsto s/(s-a)\) is decreasing on \((a,\infty )\) and \((j+1)^2 > (\lambda + \lambda ^\tau )^2\), the last factor is estimated as

$$\begin{aligned} \frac{(j+1)^2}{(j+1)^2 - \lambda ^2} \le \frac{(\lambda + \lambda ^\tau )^2}{(\lambda + \lambda ^\tau )^2 - \lambda ^2} = \frac{\lambda ^2(1 + \lambda ^{\tau - 1})^2}{2\lambda ^{1+\tau } + \lambda ^{2\tau }} \sim \frac{\lambda ^{1-\tau }}{2}. \end{aligned}$$

By Stirling formula, \(((j+1)!)^2\ge 2\pi (\lambda + \lambda ^\tau )^{2\lambda + 2\lambda ^\tau +1}\mathrm {e}^{-2\lambda - 2\lambda ^\tau - 2}(1+\delta _{j+1})^2\), so

$$\begin{aligned} \frac{\mathrm {e}^{-2\lambda }\lambda ^{2j}}{(j!)^2} = \frac{\mathrm {e}^{-2\lambda }\lambda ^{2j}(j+1)^2}{((j+1)!)^2} \le \frac{\mathrm {e}^{2}}{2\pi } \frac{\mathrm {e}^{2\lambda ^\tau }\lambda ^{2\lambda +2\lambda ^\tau }(\lambda +\lambda ^\tau +1)^2}{(\lambda +\lambda ^\tau )^{2\lambda +2\lambda ^\tau }(\lambda +\lambda ^\tau )(1+\delta _{j+1})^2} \sim \frac{\mathrm {e}^{2}}{2\pi }T_2(\lambda ), \end{aligned}$$

where \(T_2(\lambda ):=\lambda \mathrm {e}^{2\lambda ^\tau } (1+\lambda ^{\tau -1})^{-2\lambda -2\lambda ^\tau }\). Since \(\ln (1+t)\ge t - t^2/2\) for \(t\ge 0\),

$$\begin{aligned} T_2(\lambda ) \le \lambda \exp \left( 2\lambda ^\tau - (2\lambda + 2\lambda ^\tau ) \left( \lambda ^{\tau -1}-\frac{\lambda ^{2\tau -2}}{2}\right) \right) = \lambda \exp \left( - \lambda ^{2\tau -1} + \lambda ^{3\tau -2}\right) . \end{aligned}$$

As \(2\tau -1>0\), for any \(a>0\) and \(c\in (0,1)\), \(\lambda ^aT_2(\lambda ) = O(\exp (-c\lambda ^{2\tau -1}))\), whence

$$\begin{aligned} \int _R f_\lambda \,\mathrm {d}\mu _\lambda = O\left( \exp (-c\lambda ^{2\tau -1})\right) = o(1). \end{aligned}$$

Gathering all the estimates gives \(\int _\mathbb {N}f_\lambda \,\mathrm {d}\mu _\lambda = O\left( \lambda ^{2\tau - 2}\right) = o(1)\), which completes the check of (iii).
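As for the Abel weights, these properties of the Borel weights are easy to confirm numerically. A sketch under the same assumptions as before (our own helper code, not part of the proof), computing the weights through their logarithms for numerical stability and truncating where the Poisson mass is negligible:

```python
import numpy as np
from scipy.special import gammaln

for lam in (10.0, 100.0, 1000.0):
    k = np.arange(int(lam + 40 * np.sqrt(lam) + 50))
    # log a_{n,k} = log sqrt(2) + (1/4) log(pi*lam) - lam + k log(lam) - log k!
    log_a = (0.5 * np.log(2) + 0.25 * np.log(np.pi * lam)
             - lam + k * np.log(lam) - gammaln(k + 1))
    a = np.exp(log_a)
    print(f"lam={lam:6.0f}  c_n={a.max():.4f}  b_n^2={np.sum(a**2):.5f}  "
          f"sum(diff^2)={np.sum(np.diff(a)**2):.2e}")
```

One observes \(b_n^2\) close to 1, while \(c_n\) and \(\sum _k(a_{n,k+1}-a_{n,k})^2\) decay to zero, in accordance with (i)–(iii).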