Abstract
The relation between weak convergence of probabilities on a smooth Banach space and uniform convergence over a certain class of smooth functions is established. This leads to an extension of Lindeberg’s proof of the central limit theorem in a Banach space framework. As a result, asymptotic normality is proved for sums of Banach space random variables including triangular arrays and weighted linear processes.
1 Introduction
In this paper, we consider sums of Banach space valued random variables with the aim of establishing conditions for their asymptotic normality using a generalization of Lindeberg’s elementary proof of the central limit theorem of 1922 [18].
Lindeberg’s method is simple and elegant. It is based on replacing non-Gaussian random variables one by one by Gaussian ones and then using a Taylor expansion to obtain approximation bounds. This principle has been applied to prove central limit theorems for sums of independent random variables with values in a Hilbert space by Giné and León [12], as well as to estimate rates of convergence in central limit theorems in Banach spaces (see, e.g., Bentkus et al. [4], Paulauskas and Račkauskas [19]). Its potential for proving more general invariance results has been discovered by many researchers and has led to results on matrices with exchangeable entries [7], on the universality of local laws [24, 25], on matrices with correlated entries [1,2,3] and many others. Various kernel-type density (regression function) estimators produce sums of an array of random functions to which the Lindeberg CLT can be applied to help solve various statistical problems (see, e.g., [13] and references therein).
The idea of Lindeberg was carefully examined and generalized by Zolotarev [26] through the introduction of the so-called \(\zeta \) metrics, which metrize weak convergence in the case of distributions on a Hilbert space, as later proved by Giné and León [12]. Although such a metrization of weak convergence is not possible in general for Banach spaces, it is possible to connect weak convergence of probability measures with uniform convergence over a suitable class of differentiable functions in the case where the norm of the Banach space is smooth enough (see Sect. 2). This leads to an extension of the Lindeberg method to smooth Banach spaces. In turn, we prove a central limit theorem for a triangular array of random elements with values in such Banach spaces (see Sect. 4) and establish asymptotic normality of sums of Banach space valued linear processes as well as of weighted sums of independent identically distributed \(\mathsf {B}\)-valued random variables (Sect. 5). In Sect. 3, we present some remarks concerning differentiability of the norm and some examples of smooth Banach spaces.
The abstract theory of smoothness in infinite-dimensional real Banach spaces and its connections with geometrical properties have been investigated by many authors. For a very detailed exposition of the theory, we refer to the book by Hájek and Johanis [14]. The existence of a p-smooth bump function for \(1<p\le 2\) was shown to be equivalent to a certain martingale moment inequality and appears as a sufficient condition in some probability limit theorems in Banach spaces, see, e.g., Pisier [20] and Rosiński [22].
2 Weak Convergence via Smooth Functions
In what follows, \(\mathsf {B}\) denotes a real separable Banach space. The norm of an element \(x\in \mathsf {B}\) is denoted by \(\Vert x\Vert _\mathsf {B}\), or, if no confusion can arise, simply by \(\Vert x\Vert \). The topological dual of \(\mathsf {B}\) is \(\mathsf {B}^*\), and we shall use the notation \(\langle x, y^* \rangle :=y^*(x)\) for the duality pairing of the elements \(x\in \mathsf {B}\) and \(y^*\in \mathsf {B}^*\). Let \(L(\mathsf {B})\) be the Banach space of all continuous linear operators \(u: \mathsf {B}\rightarrow \mathsf {B}\), endowed with the norm \(\Vert u\Vert = \sup \{\Vert ux\Vert : x\in \mathsf {B}, \ \Vert x\Vert \le 1\}\); \(I_{\mathsf {B}}\in L(\mathsf {B})\) denotes the identity operator. By \(\mathscr {L}_k(\mathsf {B})\), we denote the Banach space of bounded k-linear operators \(T:\mathsf {B}^k\rightarrow \mathbb {R}\) with the supremum norm
$$\begin{aligned} \Vert T\Vert = \sup \{\vert T(x_1,\dots ,x_k)\vert : \Vert x_i\Vert \le 1,\ i=1,\dots ,k\}. \end{aligned}$$
To simplify the writing, the notation \(\left\| z \right\| \) for the norm is overloaded throughout the text whenever the nature of the argument z dispels any doubt about the Banach space involved: \(\mathsf {B}\) or one of the associated spaces of continuous operators \(L(\mathsf {B})\), \(\mathscr {L}_k(\mathsf {B})\), \(k>1\).
The set of all probability distributions on the measurable space \((\mathsf {B}, \mathscr {B}_{\mathsf {B}})\) is denoted by \(\mathscr {P}(\mathsf {B})\), where \(\mathscr {B}_{\mathsf {B}}\) is the \(\sigma \)-algebra of Borel subsets of \(\mathsf {B}\). Throughout we use
$$\begin{aligned} \mathsf {P}f := \int _{\mathsf {B}} f\,\mathrm {d}\mathsf {P} \end{aligned}$$
for any probability \(\mathsf {P}\in \mathscr {P}(\mathsf {B})\) and any \(\mathsf {P}\)-integrable function \(f:\mathsf {B}\rightarrow \mathbb {R}\). Let us recall here that a sequence \((\mathsf {P}_n) \subset \mathscr {P}(\mathsf {B})\) converges weakly to \(\mathsf {P}\in \mathscr {P}(\mathsf {B})\) (denoted \(\mathsf {P}_n\xrightarrow [n\rightarrow \infty ]{\mathrm {w}}\mathsf {P})\) if
$$\begin{aligned} \mathsf {P}_n f\xrightarrow [n\rightarrow \infty ]{}\mathsf {P}f \quad \text {for every } f\in \mathrm {C}_b(\mathsf {B}), \end{aligned}$$
where \(\mathrm {C}_b(\mathsf {B})\) is the class of all bounded continuous functions \(f:\mathsf {B}\rightarrow \mathbb {R}\). It is sometimes convenient to prove weak convergence of probability measures by showing that \(\mathsf {P}_nf\rightarrow \mathsf {P}f\) for a class \(\mathscr {F}\) of functions \(f: \mathsf {B}\rightarrow \mathbb {R}\) which is smaller than the class \(\mathrm {C}_b(\mathsf {B})\). In this case, it is said that \(\mathscr {F}\) determines the weak convergence of probabilities. A well known example is provided by the class of bounded Lipschitz functions. Recall that a function \(f:\mathsf {B}\rightarrow \mathbb {R}\) is bounded Lipschitz if
$$\begin{aligned} \Vert f\Vert _{\mathrm {Lip}} := \sup _{x\in \mathsf {B}}\vert f(x)\vert + \sup _{x\ne y}\frac{\vert f(x)-f(y)\vert }{\Vert x-y\Vert }<\infty . \end{aligned}$$
Moreover, the bounded Lipschitz distance
$$\begin{aligned} d_{\mathrm {BL}}(\mathsf {P}, \mathsf {Q}) := \sup \{\vert \mathsf {P}f-\mathsf {Q}f\vert : \Vert f\Vert _{\mathrm {Lip}}\le 1\} \end{aligned}$$
metrizes weak convergence, that is, \(\mathsf {P}_n\xrightarrow [n\rightarrow \infty ]{\mathrm {w}}\mathsf {P}\) if and only if \(\lim _{n\rightarrow \infty } d_{\mathrm {BL}}(\mathsf {P}_n; \mathsf {P})=0\).
Another example is known in the case where \(\mathsf {B}=\mathscr {H}\) is a separable Hilbert space. As proved by Giné and León [12], for \((\mathsf {P}, \mathsf {P}_n, n\in \mathbb {N})\subset \mathscr {P}(\mathscr {H})\), in order to check weak convergence \(\mathsf {P}_n\xrightarrow [n\rightarrow \infty ]{\mathrm {w}}\mathsf {P}\) it is enough to show that \(\mathsf {P}_nf\xrightarrow [n\rightarrow \infty ]{}\mathsf {P}f\) for every \(f:\mathscr {H}\rightarrow \mathbb {R}\) continuous, bounded and with bounded derivatives of all orders. So the situation in a separable Hilbert space is just as in the case of a finite dimensional space.
In what follows, for a number \(p\ge 1\), we denote by \(\lfloor p \rfloor \) the unique integer satisfying \(p-1\le \lfloor p \rfloor <p\) and agree that \(\{p\}:=p-\lfloor p \rfloor \). The reader is warned about the difference with the classical “floor” and “fractional part” functions: e.g., \(\lfloor 3.9 \rfloor =3\), but \(\lfloor 4 \rfloor =3\) and \(\{n\}=1\) for any integer n. This convention is motivated by our wish to interpolate between the spaces \(\mathrm {C}_b^{n-1}\) and \(\mathrm {C}_b^{n}\) of functions with bounded derivatives up to order \(n-1\) or n, respectively, by spaces of functions whose \((n-1)\)th derivatives satisfy a Hölder condition with exponent \(0<\alpha \le 1\), the special case \(\alpha =1\) giving Lipschitz \((n-1)\)th derivatives.
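In code, the convention can be sketched as follows (a small illustration of ours; the helper names `lfloor` and `frac` are not the paper's notation):

```python
import math

def lfloor(p):
    """The paper's modified floor: the unique integer m with p - 1 <= m < p.

    It agrees with math.floor at non-integers but returns p - 1 at integers.
    """
    return math.ceil(p) - 1

def frac(p):
    """The matching fractional part {p} := p - lfloor(p), always in (0, 1]."""
    return p - lfloor(p)
```

So `lfloor(3.9) == 3` while `lfloor(4) == 3`, and `frac(n) == 1` for every integer n, matching the examples in the text.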
More precisely, we introduce for any real \(p\ge 1\) the class \(\mathrm {C}^{(p)}_b(\mathsf {B})\) of functions \(f\in \mathrm {C}_b(\mathsf {B})\) that are \(\lfloor p \rfloor \)-times continuously Fréchet differentiable and such that
where \(f^{(k)}\) denotes the \(k^{\text {th}}\) Fréchet derivative of the function f with \(f^{(0)}:=f\). For the definition of Fréchet derivatives and properties of Fréchet differentiable functions in infinite dimensional Banach spaces, we refer to [6]. Clearly, \(\mathrm {C}_b^{(1)}(\mathsf {B})\) coincides with the class of bounded Lipschitz functions, and \(\left\| f \right\| _{(1)}=\Vert f\Vert _{\mathrm {Lip}}\) for \(f\in \mathrm {C}_b^{(1)}(\mathsf {B})\). By \(\mathrm {C}^\infty _b(\mathsf {B})\), we denote the class of infinitely differentiable functions with bounded derivatives.
Define for \(\mathsf {P},\mathsf {Q}\in \mathscr {P}(\mathsf {B})\),
$$\begin{aligned} \zeta _{p}(\mathsf {P}, \mathsf {Q}) := \sup \bigl \{\vert \mathsf {P}f-\mathsf {Q}f\vert : f\in \mathrm {C}^{(p)}_b(\mathsf {B}),\ \Vert f\Vert _{(p)}\le 1\bigr \}. \end{aligned}$$(2)
A natural question is then: for which Banach spaces \(\mathsf {B}\) does the class \(\mathrm {C}^{(p)}_{b}(\mathsf {B}), p\ge 1,\) determine the weak convergence of probability measures? Roughly speaking, this is true in the case where the norm of \(\mathsf {B}\) is sufficiently smooth. To be more precise, we first define what we mean by smoothness of a norm.
Definition 1
Let \(p\ge 1\). We say that a Banach space \(\mathsf {B}\) is p-smooth if its norm \(\psi (x):=\Vert x\Vert _\mathsf {B}, x\in \mathsf {B},\) is \(\lfloor p \rfloor \)-times continuously Fréchet differentiable on the set \(\mathsf {B}\setminus \{0\}\), and
$$\begin{aligned} \sup _{\Vert x\Vert =1}\Vert \psi ^{(i)}(x)\Vert<\infty ,\ i=1,\dots ,\lfloor p \rfloor , \qquad \sup _{\Vert x\Vert =\Vert y\Vert =1,\, x\ne y} \frac{\Vert \psi ^{(\lfloor p \rfloor )}(x)-\psi ^{(\lfloor p \rfloor )}(y)\Vert }{\Vert x-y\Vert ^{\{p\}}}<\infty . \end{aligned}$$(3)
Evidently every Banach space is 1-smooth. If \(\mathsf {B}\) is q-smooth for some \(q>1\), it is also p-smooth for \(1\le p\le q\). Examples of p-smooth spaces, where \(p>1,\) are given below (see Examples 12 and 14).
Remark 2
Our definition of p-smoothness, tailored for the Lindeberg method, looks different from the p-smoothability as in, e.g., Rosiński [22, p. 159], where \(\mathsf {B}\) is said to be p-smoothable (\(1\le p\le 2\)) if there exists an equivalent norm \(\vert \cdot \vert \) on \(\mathsf {B}\) such that the modulus of smoothness
$$\begin{aligned} \rho _{\vert \cdot \vert }(t) := \sup \Bigl \{\frac{\vert x+ty\vert +\vert x-ty\vert }{2}-1 : \vert x\vert =\vert y\vert =1\Bigr \} \end{aligned}$$
satisfies \(\rho _{\vert \cdot \vert }(t)=O(t^p)\) as \(t\rightarrow 0\).
By Lemma 19, p. 246 in [14], this condition means that \(\vert \cdot \vert \) is \(\mathrm {C}^{1,p-1}\) smooth in the sense of [14, Def. 124, p. 55]. So for \(1\le p \le 2\), p-smoothability and p-smoothness in the sense of Definition 1 are similar up to the equivalence of norms. But the use of \(\rho _{\vert \cdot \vert }(t)\) to define p-smoothability induces the restriction \(p\le 2\), while we need \(p>2\) for the Lindeberg method.
Remark 3
It seems worth noticing here the two following facts about \(\psi \).
-
(a)
For any Banach space \(\mathsf {B}\), there is no Fréchet derivative of \(\psi \) at 0. Indeed, should \(\psi \) be Fréchet differentiable at 0, the same would hold for its restriction to the one dimensional subspace \(D=\{t u, t\in \mathbb {R}\}\), for some fixed \(u\in \mathsf {B}\setminus \{0\}\). This in turn would imply the differentiability at 0 (in the classical elementary sense) of the function \(t\mapsto |t|\), which clearly fails.
-
(b)
If \(\psi \) is Fréchet differentiable on \(\mathsf {B}\setminus \{0\}\), then
$$\begin{aligned} \psi ^{(1)}(x) = \psi ^{(1)}\left( \frac{x}{\Vert x\Vert }\right) ,\quad x\in \mathsf {B}\setminus \{0\}. \end{aligned}$$(4)In particular, \(\psi ^{(1)}\) is bounded on \(\mathsf {B}\setminus \{0\}\) if and only if it is bounded on \(\{x\in \mathsf {B},\;\Vert x\Vert =1\}\). Another obvious consequence of (4) is that
$$\begin{aligned} \psi ^{(1)}(cx) = \psi ^{(1)}(x),\quad c>0,\;x\in \mathsf {B}\setminus \{0\}. \end{aligned}$$(5)To check (4), we note that the Fréchet differentiability of \(\psi \) means that for each fixed \(x\ne 0\),
$$\begin{aligned} \bigl \vert \left\| x+h \right\| -\left\| x \right\| -\psi ^{(1)}(x)(h) \bigr \vert = o(\left\| h \right\| ),\quad h\rightarrow 0. \end{aligned}$$Putting \(y:=x/\left\| x \right\| \), \(u:=h/\left\| x \right\| \) and recalling that \(\psi ^{(1)}(x)\) is a linear operator \(\mathsf {B}\rightarrow \mathbb {R}\), leads to
$$\begin{aligned} \vert \psi \left( y+u\right) - \psi (y) - \psi ^{(1)}(x)\left( u\right) \vert = o(\left\| u \right\| ), \quad u\rightarrow 0, \end{aligned}$$whence \(\psi ^{(1)}(x)=\psi ^{(1)}(y)\).
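For the Euclidean norm on \(\mathbb {R}^3\), whose Fréchet derivative at \(x\ne 0\) is represented by the gradient \(x/\Vert x\Vert \), identities (4) and (5) can be checked numerically; a finite-difference sketch of ours (all names are illustrative):

```python
import math

def norm(x):
    return math.sqrt(sum(t * t for t in x))

def norm_gradient(x, h=1e-6):
    """Central-difference gradient of the Euclidean norm at x != 0.

    The exact derivative here is x / ||x||, so the gradient should be
    unchanged when x is replaced by x / ||x|| or rescaled by c > 0.
    """
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        g.append((norm(xp) - norm(xm)) / (2 * h))
    return g

x = [3.0, -1.0, 2.0]
g_at_x = norm_gradient(x)
g_at_unit = norm_gradient([t / norm(x) for t in x])  # psi'(x / ||x||), as in (4)
g_at_5x = norm_gradient([5.0 * t for t in x])        # psi'(5x), as in (5)
```

All three gradients agree up to the finite-difference error.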
The main result in this section is the following theorem.
Theorem 4
Let \(p\ge 1\). If the Banach space \(\mathsf {B}\) is p-smooth, then for \((\mathsf {P}, \mathsf {P}_n, n\in \mathbb {N})\subset \mathscr {P}(\mathsf {B})\), the following statements are equivalent
-
(i)
\(\mathsf {P}_n\xrightarrow [n\rightarrow \infty ]{\mathrm {w}}\mathsf {P}\);
-
(ii)
\(\mathsf {P}_n f\xrightarrow [n\rightarrow \infty ]{}\mathsf {P}f\) for any \(f\in \mathrm {C}^{(p)}_b(\mathsf {B})\);
-
(iii)
\(\lim _{n\rightarrow \infty }\zeta _{p}(\mathsf {P}_n, \mathsf {P})=0.\)
The proof of the theorem will be achieved by establishing the cycle of implications:
For (i) \(\Rightarrow \) (iii), we note that for \(p\ge 1\), the unit ball \(\mathrm {U}_p\) of \(\mathrm {C}^{(p)}_b(\mathsf {B})\) is an equicontinuous family in \(\mathrm {C}_b(\mathsf {B})\), uniformly bounded by the constant 1. Then, the convergence \(\mathsf {P}_n f \rightarrow \mathsf {P}f\) is uniform on \(\mathrm {U}_p\) by Theorem 3.1 in Ranga Rao [21].
(iii) \(\Rightarrow \) (ii) is obvious for f in \(\mathrm {U}_p\) and extends to any f in \(\mathrm {C}^{(p)}_b(\mathsf {B})\) by linearity of the integral.
The hard part is (ii) \(\Rightarrow \) (i) which we detail now.
Proof of (ii) \(\Rightarrow \) (i) To this aim, it is enough to prove that if (ii) holds, then for each finite intersection A of open balls which is a \(\mathsf {P}\)-continuity set, we have \(\mathsf {P}_n(A)\xrightarrow [n\rightarrow \infty ]{}\mathsf {P}(A)\) (see, e.g., Billingsley [5], Corollary 2 to Th. 2.2, p. 15). Recall that \(A\in \mathscr {B}_{\mathsf {B}}\) is a \(\mathsf {P}\)-continuity set if \(\mathsf {P}(\partial A)=0\), where \(\partial A\) denotes the boundary of A, that is, \(\partial A = {\overline{A}}\setminus \mathring{A}\), where \({\overline{A}}\) and \(\mathring{A}\) are the closure and the interior of A, respectively.
Let for \(x\in \mathsf {B}\) and \(r>0\), \(B(x,r)=\{y\in \mathsf {B}: \Vert y-x\Vert <r\}\). Set
$$\begin{aligned} A = \bigcap _{i=1}^m B(x_i, r_i),\quad x_1,\dots ,x_m\in \mathsf {B},\ r_1,\dots ,r_m>0. \end{aligned}$$
It suffices to prove that
$$\begin{aligned} \mathsf {P}_n(A)\xrightarrow [n\rightarrow \infty ]{}\mathsf {P}(A). \end{aligned}$$(6)
To this aim, define for \(0<\varepsilon <r_0:=\min _{1\le i\le m}r_i\),
It is easily seen that
As an intersection of closed balls, \(A'\) is closed and since \(A\subset A'\), the closure \({\overline{A}}\) of A is included in \(A'\). It is not difficult to find examples where this inclusion is strict when A is empty. Of course, this special case may be discarded since with \(A=\emptyset \), the convergence (6) is trivial. When A is non-empty, one can check that \(A'={\overline{A}}\) as follows. Let x be an arbitrary element in \(A'\). There is at least one element \(y_0\) in A. Then, we define \(y_1:=\frac{x+y_0}{2}\). For \(i=1,\dots ,m\), \(\left\| y_1-x_i \right\| \le \frac{1}{2}\left\| x-x_i \right\| + \frac{1}{2}\left\| y_0-x_i \right\| \). As for \(i=1,\dots ,m\), \(\left\| x-x_i \right\| \le r_i\) and \(\left\| y_0-x_i \right\| <r_i\), we see that \(\left\| y_1-x_i \right\| <r_i\); hence, \(y_1\) is in A. Iterating this argument, we construct the sequence \((y_n)\) in A such that \(y_n=\frac{x+y_{n-1}}{2}\). Since \(\left\| x-y_n \right\| \le 2^{-n}\left\| x-y_0 \right\| \), \(n\ge 1\), x belongs to \({\overline{A}}\) as the limit of a sequence of points of A. Therefore, if \(A\ne \emptyset \),
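The midpoint iteration above is easy to visualize in the plane; a small numerical sketch of ours, with A the intersection of two open balls in \(\mathbb {R}^2\): starting from a point x of the closed intersection and a point \(y_0\) of A, every iterate stays inside the open set A while converging to x.

```python
import math

def dist(a, b):
    return math.sqrt(sum((s - t) ** 2 for s, t in zip(a, b)))

# A = intersection of two open balls in the plane (non-empty).
centers = [(0.0, 0.0), (1.5, 0.0)]
radii = [1.0, 1.0]

def in_A(z):
    """Membership in the open intersection A."""
    return all(dist(z, c) < r for c, r in zip(centers, radii))

# x lies on the boundary of both balls, hence in A' but not in A.
x = (0.75, math.sqrt(1.0 - 0.75 ** 2))
y = (0.75, 0.0)  # y_0, a point of A
assert in_A(y)

# y_n = (x + y_{n-1}) / 2 stays in A, with |x - y_n| = 2^{-n} |x - y_0| -> 0.
for _ in range(40):
    y = ((x[0] + y[0]) / 2.0, (x[1] + y[1]) / 2.0)
    assert in_A(y)
```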
Next, we construct for \(A\ne \emptyset \) and \(0<\varepsilon <r_0\), the functions \(f^\varepsilon , f_\varepsilon \in \mathrm {C}^{(p)}_b(\mathsf {B})\) such that
for all \(x\in \mathsf {B}\), where \(\varvec{1}_A\) is the indicator function of A. Assume for a moment that these functions are already constructed. From (7) and (8), we obtain by monotone sequential continuity of the probability measure \(\mathsf {P}\),
In view of (9) and recalling that A is a \(\mathsf {P}\)-continuity set, this gives
From (9) and the hypothesis (ii) applied with \(f_\varepsilon \), \(f^\varepsilon \),
and
for each \(0<\varepsilon <r_0\). Taking into account (10), from (11) and (12) we deduce (6). So, it remains to construct the functions \(f^\varepsilon , f_\varepsilon .\)
We begin with a lemma on the Fréchet derivatives of the norm of a p-smooth space which completes Remark 3. It quantifies the explosion of the successive derivatives of the norm near 0.
Lemma 5
When \(\mathsf {B}\) is p-smooth, its norm \(\psi (x)=\left\| x \right\| \) satisfies, for \(i=1, \dots , \lfloor p \rfloor \),
$$\begin{aligned} \psi ^{(i)}(x) = \Vert x\Vert ^{1-i}\,\psi ^{(i)}\Bigl (\frac{x}{\Vert x\Vert }\Bigr ),\quad x\in \mathsf {B}\setminus \{0\}. \end{aligned}$$(13)
Moreover, there exist constants \(c_1,\dots ,c_{\lfloor p \rfloor },c_p\) such that
$$\begin{aligned} \Vert \psi ^{(i)}(x)\Vert \le c_i\Vert x\Vert ^{1-i},\quad i=1,\dots ,\lfloor p \rfloor , \end{aligned}$$(14)
and
for all \(x, y\in \mathsf {B}, x, y\not =0\).
Proof of Lemma 5
To prove (13), we proceed by finite induction on \(1\le i<\lfloor p \rfloor \). The initialization step is (4) already checked. Define for \(x\in \mathsf {B}\setminus \{0\}\), \(y\in \mathsf {B}\) and \(1\le i<\lfloor p \rfloor \),
where for \(y\in \mathsf {B}\), \(y^{\otimes i}=(y,\dots ,y)\) denotes the element of \(\mathsf {B}^i\) with all components equal to y and for an i-linear form L on \(\mathsf {B}^i\), \(L\cdot w\) stands for L(w), \(w=(w_1,\dots ,w_i)\in \mathsf {B}^i\). To complete the proof of (13), it suffices to prove that under the induction assumption (13) for some i, \(\vert T_{i+1}(x,y) \vert = o(\left\| y \right\| ^{i+1})\) for any fixed \(x\in \mathsf {B}\setminus \{0\}\) when \(y\rightarrow 0\) in \(\mathsf {B}\). Indeed, then both symmetric \((i+1)\)-linear forms \(\psi ^{(i+1)}(\frac{x}{\left\| x \right\| })\) and \(\left\| x \right\| ^i\psi ^{(i+1)}(x)\) are equal on the diagonal of \(\mathsf {B}^{i+1}\), and this equality extends to the whole space \(\mathsf {B}^{i+1}\) by symmetry.
By the induction assumption, restricting to \(\left\| y \right\| <1\) to avoid a possible vanishing of \(\frac{x}{\left\| x \right\| } + y\),
and
By the multilinearity of the Fréchet derivatives, it follows that
Putting \(h:=\left\| x \right\| y\), we deduce from the existence of \(\psi ^{(i+1)}\) that \(\left\| x \right\| \vert T_{i+1}(x,y) \vert = o(\left\| h \right\| ^{i+1})\) as \(h\rightarrow 0\), whence recalling that x is fixed, \(\vert T_{i+1}(x,y) \vert = o(\left\| y \right\| ^{i+1})\) when y tends to 0, as expected.
Clearly, (14) follows from (13) with \(c_i:=\sup \{\left\| \psi ^{(i)}(y) \right\| : \left\| y \right\| =1\}\), recalling that the definition of p-smoothness includes the boundedness of each \(\psi ^{(i)}\), \(1\le i\le \lfloor p \rfloor \), on \(\{y\in \mathsf {B}: \left\| y \right\| =1\}\), see (3).
Now, we prove (15). Denoting \(t=\Vert x\Vert ^{-1}, s=\Vert y\Vert ^{-1}\) for \(x\not =0, y\not =0\), and using (13), we start from
The first term in (16) is bounded by
where
is finite by the p-smoothness assumption (3). Now,
and writing \( x - \frac{\left\| x \right\| }{\left\| y \right\| }y = x - y + \frac{\left\| y \right\| - \left\| x \right\| }{\left\| y \right\| }y, \) the triangle inequality gives
whence, noticing that \(2^{\{p\}}\le 2\),
To estimate the second term in (16), choose \(a=\max \{\Vert x\Vert , \Vert y\Vert \}\) and consider first the case where \(\Vert x-y\Vert \ge a\). In this case,
If \(\Vert x-y\Vert \le a\), we claim that
Let us check (18). Put for simplification \(m:=\lfloor p \rfloor -1\). Applying the elementary bound
which is optimal when v tends to u, gives
As \(m<p\) and \(\left\| x-y \right\| ^{1-\{p\}}\le (2a)^{1-\{p\}} \le 2a^{1-\{p\}}\), this leads to
If \(a=\left\| x \right\| \), this provides
Obviously the same estimate holds replacing \(\left\| y \right\| \) by \(\left\| x \right\| \) when \(a=\left\| y \right\| \), and adding both estimates to have a common bound gives (18).
Gathering the estimates, we obtain for the second term in (16),
Accounting (17), this completes the proof of (15) with \(c_p:= 2\max (c'_p,pc_{\lfloor p \rfloor })\). \(\square \)
Let us go back to the construction of \(f^\varepsilon \), \(f_\varepsilon \). Lemma 5 quantifies in some way the non-membership of the norm of \(\mathsf {B}\) in the space \(\mathrm {C}^{(p)}_b(\mathsf {B})\). To remedy this drawback, the idea is to modify \(\psi \) inside the ball \(B(0,\varepsilon )\) by flattening to zero the peak of \(\psi \) in the ball \(B(0,\varepsilon /2)\) and using a connection through \(B(0,\varepsilon )\setminus B(0,\varepsilon /2)\) smooth enough to obtain an approximation of \(\psi \) by a function \(g_\varepsilon \) in \(\mathrm {C}^{(p)}_b(\mathsf {B})\). To this aim, let us choose a function \(u\in \mathrm {C}_b^{(\infty )}([0,\infty ))\) such that \(0\le u\le 1\), \(u=0\) on [0, 1/2], \(u=1\) on \([1,\infty )\). Set
$$\begin{aligned} g_\varepsilon (x) := \psi (x)\,u\bigl (\varepsilon ^{-1}\psi (x)\bigr ),\quad x\in \mathsf {B}. \end{aligned}$$(19)
More explicitly,
$$\begin{aligned} g_\varepsilon (x) = {\left\{ \begin{array}{ll} 0, &{} \Vert x\Vert \le \varepsilon /2,\\ \Vert x\Vert \, u(\varepsilon ^{-1}\Vert x\Vert ), &{} \varepsilon /2<\Vert x\Vert<\varepsilon ,\\ \Vert x\Vert , &{} \Vert x\Vert \ge \varepsilon . \end{array}\right. } \end{aligned}$$
The function \(g_\varepsilon \) uniformly approximates the norm. Indeed,
$$\begin{aligned} \sup _{x\in \mathsf {B}}\vert g_\varepsilon (x)-\psi (x)\vert \le \varepsilon , \end{aligned}$$(20)
since \(u(\varepsilon ^{-1}\psi (x))=1\), if \(\psi (x)\ge \varepsilon \).
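On the real line (\(\mathsf {B}=\mathbb {R}\), \(\psi (x)=|x|\)), the construction can be sketched numerically. We build a \(C^\infty \) step u from the standard bump quotient and assume, as the identity \(u(\varepsilon ^{-1}\psi (x))=1\) for \(\psi (x)\ge \varepsilon \) indicates, that \(g_\varepsilon (x)=\psi (x)u(\varepsilon ^{-1}\psi (x))\); all function names below are ours.

```python
import math

def bump(s):
    """exp(-1/s) for s > 0, extended by 0: the standard C-infinity building block."""
    return math.exp(-1.0 / s) if s > 0 else 0.0

def u(t):
    """A C-infinity step with u = 0 on [0, 1/2], u = 1 on [1, inf), 0 <= u <= 1."""
    a, b = bump(2.0 * t - 1.0), bump(2.0 - 2.0 * t)
    return a / (a + b)

def g(x, eps):
    """The smoothed norm g_eps(x) = |x| * u(|x|/eps): zero near the origin,
    equal to |x| outside the ball of radius eps."""
    return abs(x) * u(abs(x) / eps)

eps = 0.1
# g vanishes on [-eps/2, eps/2], agrees with |x| for |x| >= eps,
# and deviates from |x| by at most eps in between.
```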
The next lemma establishes the membership of \(g_\varepsilon \) in \(\mathrm {C}^{(p)}_b(\mathsf {B})\) and provides some control of its norm \(\left\| g_\varepsilon \right\| _p\), defined by (1), in terms of the parameter \(\varepsilon \).
Lemma 6
There is a constant \(C>0\) such that
$$\begin{aligned} \sup _{x\in \mathsf {B}}\Vert g_\varepsilon ^{(i)}(x)\Vert \le C\varepsilon ^{1-i},\quad i=1,\dots ,\lfloor p \rfloor , \end{aligned}$$(21)
and
$$\begin{aligned} \Vert g_\varepsilon ^{(\lfloor p \rfloor )}(x)-g_\varepsilon ^{(\lfloor p \rfloor )}(y)\Vert \le C\varepsilon ^{1-p}\Vert x-y\Vert ^{\{p\}},\quad x,y\in \mathsf {B}. \end{aligned}$$(22)
Proof of Lemma 6
Denoting
we rewrite
An immediate induction provides the following formula for the successive derivatives of v
We note also that \(u^{(m)}(1/2) = u^{(m)}(1)=0\) for every \(m\ge 1\), since u is \(C^\infty \) and constant to the right of 1 and to the left of 1/2, so all its derivatives vanish there. The values of \(v^{(m)}\) on \([0,\frac{1}{2}]\cup [1,\infty )\) are then
Together with the infinite derivability of v, this implies that for any integer \(m\ge 1\),
Using differentiation of composite functions on Banach spaces (see, e.g., Fraenkel [11]), we find
with
where \(\beta \in <j>_+\) means that \(\beta =(\beta _1, \dots , \beta _j)\) with integers \(\beta _i\ge 1, i=1, \dots , j\), \(|\beta |=\beta _1+\cdots +\beta _j\) and \(\beta !=\beta _1!\cdots \beta _j!\). In \(\sum _{\sigma }\), the summation runs over the m! permutations \(\sigma \) of \(\{1, \dots , m\}\).
To prove (21), we need to bound \(\left\| I_{\beta , \sigma }(x) \right\| \) only for \(\left\| x \right\| >\frac{\varepsilon }{2}\) since for \(m\ge 1\), \(v^{(m)}(t_x)=0\) for \(t_x\in [0,1/2]\), that is, for \(\left\| x \right\| \le \frac{\varepsilon }{2}\). So assuming \(\left\| x \right\| >\frac{\varepsilon }{2}\) and taking (14) into account, we obtain
Consequently \(\Big \Vert \frac{\mathrm {d}^m}{\mathrm {d}x^m}v(t_x)\Big \Vert \le K_m\varepsilon ^{-m}\) and as \(g_\varepsilon (x)=\varepsilon v(t_x)\), (21) is checked.
To prove (22), we have to find a bound of the form \(c\varepsilon ^{-p}\Vert x-y\Vert ^{\{p\}}\) for each
because of the decomposition
In view of (23), the discussion is naturally ordered according to the various configurations of \(\left\| x \right\| \), \(\left\| y \right\| \) and the open interval \((\frac{\varepsilon }{2},\varepsilon )\).
Case 1: \(\left\| x \right\| \) and \(\left\| y \right\| \) are both outside \((\frac{\varepsilon }{2},\varepsilon )\). As \(t_x\), \(t_y\) are both outside \((\frac{1}{2},1)\), it is clear from (23) that for \(j\ge 2\), \(v^{(j)}(t_x)=v^{(j)}(t_y)=0\) whence \(\varDelta _{\beta ,\sigma }(x, y)=0\). For the same reason, \(\varDelta _{\beta ,\sigma }(x,y)=0\) when \(j=1\) and \(\left\| x \right\| ,\left\| y \right\| \le \frac{\varepsilon }{2}\). If \(j=1\) and \(\left\| x \right\| ,\left\| y \right\| \ge \varepsilon \),
whence recalling (15),
Case 2: only one of \(\left\| x \right\| \), \(\left\| y \right\| \) is inside \((\frac{\varepsilon }{2},\varepsilon )\). By symmetry, it suffices to treat the configurations where \(\left\| x \right\| \) is inside \((\frac{\varepsilon }{2},\varepsilon )\) and \(\left\| y \right\| \) outside.
Case 2.a: \(0<\left\| y \right\| \le \frac{\varepsilon }{2}< \left\| x \right\| < \varepsilon \). In this configuration, for \(j\ge 1\) and \(\beta \in <j>_+\),
From (24),
which together with (26), leads to
Case 2.b: \(\frac{\varepsilon }{2}< \left\| x \right\| < \varepsilon \le \left\| y \right\| \). For \(j\ge 2\), (30) still holds and exactly the same argument as above gives
In the special case where \(j=1\), as \(v'(t_y)=1\),
whence
Bounding the first term in the right-hand side exactly as in case 2.a and referring to (29) in case 1 for the second one, we obtain
Case 3: \(\frac{\varepsilon }{2}< \left\| x \right\| ,\left\| y \right\| < \varepsilon \). We start from
The first term in the right-hand side of (31) is bounded exactly as in case 2.a:
For the second term, let us treat first the special case where \(j=1\). Arguing as in case 1, we just have to replace \(\varepsilon \) by \(\frac{\varepsilon }{2}\) in (29), so accounting (24),
Assume now that \(\beta \in <j>_+\) with \(2\le j\le \lfloor p \rfloor \). As \(\vert \beta \vert =\lfloor p \rfloor \), for each component \(\beta _i\) of \(\beta \), \(\beta _i<\lfloor p \rfloor \). Using telescopic summation where at each step one factor \(t_x^{(\beta _i)}\) is replaced by \(t_y^{(\beta _i)}\) gives
with the usual convention that a product indexed by \(\emptyset \) equals 1. Therefore,
where
It is easy to bound \(\varPi _{\beta ,i}\) since by (14), \( \left\| t_x^{(\beta _k)} \right\| =\varepsilon ^{-1}\left\| \psi ^{(\beta _k)}(x) \right\| \le \varepsilon ^{-1}c_{\beta _k}\left\| x \right\| ^{1-\beta _k} \) and as \(\beta _k\ge 1\) and \(\left\| x \right\| >\frac{\varepsilon }{2}\), \(\left\| x \right\| ^{1-\beta _k}\le 2^{\beta _k-1}\varepsilon ^{1-\beta _k}\). Obviously, the same holds for \(\left\| t_y^{(\beta _k)} \right\| \) and all this gives
Recalling that \(\beta _i<\lfloor p \rfloor \), \(1\le i\le j\), it remains to estimate for any \(1\le m< \lfloor p \rfloor \),
In view of (14), it seems relevant to apply the mean-value theorem for derivatives to the function \(\psi ^{(m)}:\mathsf {B}\setminus \{0\}\rightarrow \mathscr {L}_m(\mathsf {B}) \). But then, care must be taken of the inclusion of the segment [x, y] in the open set \(\mathsf {B}\setminus \{0\}\). If 0 belongs to [x, y], then there exists \(s\in [0,1]\) such that \((1-s)x+sy =0\), that is \(x=s(x-y)\) whence \(\left\| x \right\| =s\left\| x-y \right\| \). If \(\left\| x-y \right\| \le \frac{\varepsilon }{2}\), this equality is impossible since \(s\in [0,1]\) and \(\left\| x \right\| >\frac{\varepsilon }{2}\). Accordingly, we separate the cases \(\left\| x-y \right\| \le \frac{\varepsilon }{2}\) and \( \left\| x-y \right\| >\frac{\varepsilon }{2}\).
If \(\left\| x-y \right\| \le \frac{\varepsilon }{2}\), by the mean-value theorem, (14) and the convexity of the function \(t\mapsto t^{-m}\),
If \(\left\| x-y \right\| >\frac{\varepsilon }{2}\), it is enough to use (14) as follows.
Noticing that here \(\varepsilon < 2\left\| x-y \right\| \), this gives
So we can retain from both cases the common bound:
for \( \frac{\varepsilon }{2}< \left\| x \right\| ,\left\| y \right\| < \varepsilon \), \(1\le m <\lfloor p \rfloor \). This together with (33) enables us to bound the \(i^{\,\text {th}}\) term in (32) as
Finally, accounting (24),
This completes the proof of (22) and of Lemma 6. \(\square \)
Next, for each \(\varepsilon >0\), we construct a \(\lfloor p \rfloor \)-times Fréchet differentiable function \(\phi _\varepsilon :\mathsf {B}\rightarrow \mathbb {R}\) such that
$$\begin{aligned} \phi _\varepsilon (x)=1 \ \text {if } \Vert x\Vert \le 1, \qquad \phi _\varepsilon (x)=0 \ \text {if } \Vert x\Vert >1+\varepsilon , \end{aligned}$$(34)
and
To this aim, let the function \(q\in \mathrm {C}^{(\infty )}_b(\mathbb {R})\) be such that \(0\le q\le 1\), \(q(t)=1\) if \(t<1/8\), and \(q(t)=0\) if \(t>7/8\). Set
$$\begin{aligned} \phi _\varepsilon (x) := q\bigl ((\Vert x\Vert -1)\varepsilon ^{-1}\bigr ),\quad x\in \mathsf {B}. \end{aligned}$$
If \(\Vert x\Vert \le 1\), then \((\Vert x\Vert -1)\varepsilon ^{-1}\le 0<1/8\) and
therefore \(\phi _\varepsilon (x)=1\). If \(\Vert x\Vert >1+\varepsilon \), then
therefore \(\phi _\varepsilon (x)=0\), and (34) is confirmed. It remains to evaluate the derivatives of the function \(\phi _\varepsilon \). This can be done in much the same way as we proved (21) and (22). Finally, we use \(\phi _\varepsilon \), \(\varepsilon >0\), to define the required functions
It is straightforward to check that \(\Vert f^\varepsilon \Vert _p<\infty \) and \(\Vert f_\varepsilon \Vert _p<\infty \). This completes the proof of \((ii)\Rightarrow (i)\) and Theorem 4. \(\square \)
Theorem 7
If the Banach space \(\mathsf {B}\) is \(\infty \)-smooth, then for \((\mathsf {P}, \mathsf {P}_n, n\in \mathbb {N})\subset \mathscr {P}(\mathsf {B})\), the following statements are equivalent
-
(i)
\(\mathsf {P}_n\xrightarrow [n\rightarrow \infty ]{\mathrm {w}}\mathsf {P}\);
-
(ii)
\(\mathsf {P}_n f\xrightarrow [n\rightarrow \infty ]{}\mathsf {P}f\) for any \(f\in \mathrm {C}^{(\infty )}_b(\mathsf {B})\).
Proof
The proof of Theorem 4 adapts to the case of an \(\infty \)-smooth Banach space as well. One follows the same lines, keeping in mind that the functions involved are infinitely differentiable (or p-differentiable for any \(p>1\)), so that the finally constructed functions \(f^{\varepsilon }\) and \(f_{\varepsilon }\) belong to \(\mathrm {C}^{(\infty )}_b(\mathsf {B})\). \(\square \)
Remark 8
From the proof of Theorem 4, we see that the differentiability of the norm could be substituted by its smooth approximation in the sense that for each \(\varepsilon >0\), there exists a \(\lfloor p \rfloor \)-times Fréchet differentiable function \(\psi _\varepsilon :\mathsf {B}\rightarrow \mathbb {R}\) such that
-
(a)
for any \(\varepsilon >0\),
$$\begin{aligned} \sup _{x\in \mathsf {B}}|\psi _\varepsilon (x)-\psi (x)|\le \varepsilon ; \end{aligned}$$ -
(b)
with some constant \(C>0\),
$$\begin{aligned} \sup _{x\in \mathsf {B}}\Vert \psi ^{(i)}_\varepsilon (x)\Vert \le C\varepsilon ^{1-i},\ \ i=1, \dots , \lfloor p \rfloor ; \end{aligned}$$ -
(c)
with some constant \(C>0\),
$$\begin{aligned} \sup _{x\not =y, x,y \in \mathsf {B}}\frac{\Vert \psi ^{(\lfloor p \rfloor )}_\varepsilon (x)-\psi ^{(\lfloor p \rfloor )}_\varepsilon (y)\Vert }{\Vert x-y\Vert ^{\{p\}}}\le C\varepsilon ^{1-p}. \end{aligned}$$
Remark 9
Since \(\mathrm {C}_b^{(p)}(\mathsf {B})\subset \mathrm {C}_b^{(p')}(\mathsf {B})\), if \(p>p'\), it holds
Remark 10
For \(\mathsf {B}\)-valued random variables X, Y, we set
where \(\mathsf {P}_X\) denotes the distribution of X. Hence, if \(\mathsf {B}\) is p-smooth then in order to check convergence in distribution of a sequence \((X_n, n\in \mathbb {N})\) of \(\mathsf {B}\)-valued random variables to a \(\mathsf {B}\)-valued random variable X, it is enough to prove \(\zeta _p(X_n, X)\xrightarrow [n\rightarrow \infty ]{}0\). The use of \(\zeta _{p}\) in proving convergence in distribution of random variables is attractive due to the following simple but powerful properties of \(\zeta _p\):
-
(a)
for each \(c\in \mathbb {R}\),
$$\begin{aligned} \zeta _{p}(cX, cY)\le \max \{1, |c|^{p}\}\zeta _p(X, Y); \end{aligned}$$ -
(b)
if the \(\mathsf {B}\)-valued random element Z is independent of (X, Y), then
$$\begin{aligned} \zeta _{p}(X+Z, Y+Z)\le \zeta _{p}(X, Y); \end{aligned}$$ -
(c)
for independent B-valued random elements \(X_1, \dots , X_n; Y_1, \dots , Y_n\),
$$\begin{aligned} \zeta _{p}\Big (\sum _{k=1}^n X_k, \sum _{k=1}^n Y_k\Big )\le \sum _{k=1}^n \zeta _{p}(X_k, Y_k). \end{aligned}$$(36)
These properties of \(\zeta _p\) were discovered by Zolotarev [26], but actually are easy to prove. Statement (a) follows directly from the definition of \(\zeta _p\). To prove (b), one needs Fubini’s theorem and the invariance of the function space \(\mathrm {C}_b^{(p)}(\mathsf {B})\) under shifts, that is, under the transformations \(T_x : \mathrm {C}_b^{(p)}(\mathsf {B})\rightarrow \mathrm {C}_b^{(p)}(\mathsf {B})\), \(f\mapsto T_xf:=f(x+\cdot )\), \(x\in \mathsf {B}\). Finally, (c) follows from (b) and the triangle inequality.
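Property (c) is the heart of the Lindeberg replacement scheme; the telescoping argument behind it can be written out as follows (our rendering of the standard one-by-one substitution).

```latex
% Hybrid sums interpolating between \sum_k X_k and \sum_k Y_k:
%   T_j := Y_1 + \dots + Y_{j-1} + X_j + \dots + X_n, \qquad j = 1, \dots, n+1,
% so that T_1 = \sum_k X_k and T_{n+1} = \sum_k Y_k.
% The triangle inequality for \zeta_p gives
\zeta_p\Bigl(\sum_{k=1}^n X_k,\ \sum_{k=1}^n Y_k\Bigr)
  \le \sum_{j=1}^n \zeta_p\bigl(T_j,\ T_{j+1}\bigr).
% T_j and T_{j+1} differ only in the j-th summand: with the shift
%   Z_j := Y_1 + \dots + Y_{j-1} + X_{j+1} + \dots + X_n,
% independent of (X_j, Y_j), we have T_j = X_j + Z_j and T_{j+1} = Y_j + Z_j,
% so property (b) yields
\zeta_p\bigl(T_j,\ T_{j+1}\bigr)
  = \zeta_p\bigl(X_j + Z_j,\ Y_j + Z_j\bigr) \le \zeta_p(X_j, Y_j),
% and summing over j proves (36).
```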
3 Some Remarks on Smooth Banach Spaces
Various aspects of the differentiability of Banach space norms are discussed in Sundaresan [23].
3.1 Smoothness and Type 2
Recall that a Banach space \(\mathsf {B}\) is said to be of type 2 if there is a constant \(K>0\) such that for any finite set of elements \(x_1,\dots ,x_n\) in \(\mathsf {B}\) and any Rademacher sequence \(\epsilon _1,\dots ,\epsilon _n\) (the \(\epsilon _i\) being independent and such that \(P(\epsilon _i = -1) = P(\epsilon _i = 1) = 1/2\)),
$$\begin{aligned} \Bigl (\mathrm {E}\Bigl \Vert \sum _{i=1}^n \epsilon _i x_i\Bigr \Vert ^2\Bigr )^{1/2} \le K\Bigl (\sum _{i=1}^n \Vert x_i\Vert ^2\Bigr )^{1/2}. \end{aligned}$$(37)
By the Khintchine–Kahane inequality giving the equivalence of moments of Rademacher sums \(\sum _i \epsilon _i x_i\) (see, e.g., [17, Th. 4.7]), the second moment on the left-hand side of (37) may be replaced by the first one, leading to the equivalent definition of type 2 by the inequality
Moreover, by, e.g., [17, Prop.9.11], if the separable Banach space \(\mathsf {B}\) is of type 2, there is a constant \(K>0\) depending only on \(\mathsf {B}\) such that for any finite set of mean zero independent \(\mathsf {B}\)-valued random elements \(X_1, \dots , X_n\),
This obviously implies that
$$\begin{aligned} \mathrm{E}\Big \Vert \sum _{i=1}^n X_i\Big \Vert \le K\Big (\sum _{i=1}^n \mathrm{E}\Vert X_i\Vert ^2\Big )^{1/2}. \end{aligned}$$(40)
Conversely, if in a separable Banach space \(\mathsf {B}\), any finite set of mean zero independent \(\mathsf {B}\)-valued random elements \(X_1, \dots , X_n\) satisfies (40), then choosing \(X_i=\epsilon _i x_i\) shows that \(\mathsf {B}\) satisfies (38); hence, \(\mathsf {B}\) is of type 2.
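As a concrete illustration (not part of the original text): in the Euclidean space \(\mathbb {R}^d\), which is of type 2 with constant \(K=1\), the Rademacher average in (37) can be computed exactly by enumerating all sign patterns, and the cross terms cancel in expectation, so equality holds.

```python
import itertools
import math

# Illustration (not from the paper): in R^d, type 2 holds with K = 1.
# The Rademacher average E||sum_i eps_i x_i||^2 is computed exactly by
# enumerating all 2^n equally likely sign vectors.
def rademacher_second_moment(xs):
    """E || sum_i eps_i x_i ||^2 over all sign vectors eps in {-1,1}^n."""
    n, d = len(xs), len(xs[0])
    total = 0.0
    for signs in itertools.product((-1.0, 1.0), repeat=n):
        s = [sum(signs[i] * xs[i][k] for i in range(n)) for k in range(d)]
        total += sum(v * v for v in s)
    return total / 2 ** n

xs = [(1.0, 2.0), (3.0, -1.0), (0.5, 0.5)]
lhs = rademacher_second_moment(xs)
rhs = sum(sum(v * v for v in x) for x in xs)  # sum_i ||x_i||^2
assert math.isclose(lhs, rhs)  # cross terms cancel, so (37) holds with K = 1
```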
Proposition 11
If the Banach space \(\mathsf {B}\) is 2-smooth, then \(\mathsf {B}\) is of type 2.
Proof
We just have to prove that (40) is satisfied in \(\mathsf {B}\). We use the functions \(g_\varepsilon \), \(\varepsilon >0,\) defined in (19) which are in \(\mathrm {C}_b^{(2)}(\mathsf {B})\) by Lemma 6.
For any fixed \(a,b\in \mathsf {B}\), the map \(f:[0,1]\rightarrow \mathbb {R}\), \(t\mapsto f(t) := g_\varepsilon (a+tb)\) clearly has a continuous first derivative \(f'(t)=g_\varepsilon ^{(1)}(a+tb) \cdot b\), so \(f(1)-f(0) = \int _0^1 f'(t)\,\mathrm {d}t\), that is:
Denoting \(S_0=0\), \(S_j=X_1+\cdots +X_j, j=1, \dots , n\), we have by (20),
Recalling that \(g_\varepsilon (0)=0\) and applying (41) gives
It is well known that if \(\varphi \) is a continuous linear form on \(\mathsf {B}\) and X a Bochner or Pettis integrable random element in \(\mathsf {B}\), then \(\mathrm{E}\varphi (X)= \varphi (\mathrm{E}X)\). Combining this property with the independence of \(S_{j-1}\) and \(X_j\) gives, via an obvious Fubini argument, that
This enables us to rewrite the above decomposition of \(\mathrm{E}g_\varepsilon (S_n)\) as
As \(g_\varepsilon \) satisfies Lemma 6 with \(\lfloor p \rfloor =\{p\}=1\), we can use (22) to obtain
Going back to (42), this gives
Minimizing this upper bound in \(\varepsilon \) yields (40) with \(K=2C^{1/2}\). \(\square \)
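The final optimization step can be sanity-checked numerically. Assuming the bound has the shape \(\varepsilon + C\varepsilon ^{-1}\sum _k \mathrm{E}\Vert X_k\Vert ^2\) (an assumption consistent with the resulting constant \(K=2C^{1/2}\)), a minimal sketch:

```python
import math

# Sketch (assumed shape of the final bound): the proof ends with a bound
# of the form  eps + C * A / eps  with A = sum_k E||X_k||^2.  Minimizing
# over eps > 0 gives 2 * sqrt(C * A), i.e. the constant K = 2 * C**0.5.
def bound(eps, C, A):
    return eps + C * A / eps

C, A = 1.7, 3.2
best_eps = math.sqrt(C * A)  # stationary point: 1 - C*A/eps^2 = 0
grid = [best_eps * (1 + t / 1000.0) for t in range(-500, 501)]
assert all(bound(e, C, A) >= bound(best_eps, C, A) - 1e-12 for e in grid)
assert math.isclose(bound(best_eps, C, A), 2 * math.sqrt(C * A))
```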
3.2 The Case of Hilbert Spaces
Example 12
Let \(\mathscr {H}\) be a separable Hilbert space with the inner product \(\langle x, y \rangle \) and the norm \(\Vert x\Vert =\sqrt{\langle x, x \rangle }\), \(x, y\in \mathscr {H}\). Then, \(\psi (x)=\Vert x\Vert \) satisfies \(c_j:=\sup _{\Vert x\Vert =1}\Vert \psi ^{(j)}(x)\Vert <\infty \) for any \(j\ge 1.\) This can be seen from \(\psi (x)=(\langle x, x \rangle )^{1/2}\) and the fact that the inner product is a bilinear function; hence, its first derivative is linear and its second derivative is constant. So in a Hilbert space, the convergence \(\mathsf {P}_n\xrightarrow [n\rightarrow \infty ]{\mathrm {w}}\mathsf {P}\) is equivalent to \(\mathsf {P}_n f\rightarrow \mathsf {P}f\) for every \(f\in \mathrm {C}_b^{(\infty )}(\mathscr {H})\). Moreover, the weak convergence is metrizable by \(\zeta _d(\mathsf {P}_n, \mathsf {P})\) for \(d\ge 1\). The following result, proved by Giné and León [12], is also a corollary of Theorems 4 and 7.
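A small numerical check (an illustration in \(\mathbb {R}^2\), not from the text) that the first derivative of the Hilbert norm acts as \(\psi '(x).h = \langle x, h\rangle /\Vert x\Vert \):

```python
import math

# Illustration (not in the paper): in R^d with the Euclidean inner product,
# psi(x) = ||x|| has first derivative psi'(x).h = <x, h> / ||x||,
# which a central finite difference confirms away from the origin.
def norm(x):
    return math.sqrt(sum(v * v for v in x))

def directional_derivative(x, h, t=1e-6):
    xp = [a + t * b for a, b in zip(x, h)]
    xm = [a - t * b for a, b in zip(x, h)]
    return (norm(xp) - norm(xm)) / (2 * t)

x, h = [3.0, 4.0], [1.0, -2.0]
analytic = sum(a * b for a, b in zip(x, h)) / norm(x)  # <x,h>/||x||
assert abs(directional_derivative(x, h) - analytic) < 1e-6
```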
Theorem 13
Let \(\mathscr {H}\) be a separable Hilbert space. Then, for \((\mathsf {P}, \mathsf {P}_n, n\in \mathbb {N})\subset \mathscr {P}(\mathscr {H})\) the following statements are equivalent:
(i) \(\mathsf {P}_n\xrightarrow [n\rightarrow \infty ]{\mathrm {w}}\mathsf {P}\);
(ii) \(\mathsf {P}_nf\xrightarrow [n\rightarrow \infty ]{}\mathsf {P}f\) for every \(f\in \mathrm {C}_b^{(\infty )}(\mathscr {H})\);
(iii) for at least one \(d>1\), \(\lim _{n\rightarrow \infty } \zeta _d(\mathsf {P}_n, \mathsf {P})=0\).
3.3 Smoothness of \(\mathrm {L}_p\) Spaces
Example 14
Let \(({\mathbb S}, \mathscr {S}, \nu )\) be a \(\sigma \)-finite measure space, \(p\ge 1\). By \(\mathscr {L}_p({\mathbb S}, \nu ; \mathbb {R})\), we denote the set of measurable functions \(x:{\mathbb S}\rightarrow \mathbb {R}\) such that \(\int _{{\mathbb S}}|x(s)|^p\nu (\mathrm {d}s)<\infty \). The corresponding Banach space is denoted by \(\mathrm {L}_p({\mathbb S}, \mathscr {S}, \nu ; \mathbb {R})\) or shortly \(\mathrm {L}_p({\mathbb S}, \nu )\) and is endowed with the norm
Throughout we assume that the spaces \(\mathrm {L}_p({\mathbb S}, \nu ), p\ge 1,\) are separable. This is the case if \(\mathscr {S}\) is countably generated or if \(({\mathbb S}, \mathscr {S}, \nu )\) is \(\nu \)-countably generated: there exists a sequence \((S_n, n\ge 1)\subset \mathscr {S}\), consisting of sets of finite \(\nu \)-measure, which \(\nu \)-essentially generates \(\mathscr {S}\) in the sense that for all \(A\in \mathscr {S}\) we can find a set \(A_0\) in the \(\sigma \)-algebra generated by \((S_n, n\ge 1)\) such that \(\nu (A\varDelta A_0) = 0\), see Proposition 1.49 in Hytönen et al. [16].
As proved in [19, Prop. 2.23], the norm \(\psi (x)=\Vert x\Vert _{\mathrm {L}_p}\) is \(\lfloor p \rfloor \) times continuously differentiable on \(\mathrm {L}_p({\mathbb S}, \nu )\setminus \{0\}\) and satisfies \(\sum _{k=0}^{\lfloor p \rfloor }\sup _{\Vert x\Vert =1}\Vert \psi ^{(k)}(x)\Vert <\infty \). We use here the following notation: \(\psi _1(x):=(\psi (x))^p\), \(g(t):=|t|^{p}\), \(f(t):=|t|^{1/p}\). The method used in the proof of [19, Prop. 2.23] is to establish the \(\lfloor p \rfloor \) times continuous differentiability of \(\psi _1\) by a Taylor formula technique; since \(\psi = f\circ \psi _1\) and f is infinitely differentiable on \(\mathbb {R}\setminus \{0\}\), the \(\lfloor p \rfloor \) times continuous differentiability of \(\psi \) on \(\mathrm {L}_p({\mathbb S}, \nu )\setminus \{0\}\) follows. In what follows, we adopt the Toscano notation for the falling factorial, that is, for any real number r and any integer \(k\ge 1\),
With this notation, the derivatives of f and g are conveniently expressed as
In the proof of [19, Prop. 2.23], it is shown that \(\psi _1^{(k)}(x)=A_k(x)\), \(k=1, \dots , r\), where the k-linear form \(A_k(x)\) is defined by
For our aim, it is useful to make explicit here the iterated use of the Hölder inequality mentioned in [19] to check the continuity of the k-linear operator \(A_k(x)\). In this way, we obtain a bound of the norm \(\left\| A_k(x) \right\| _{\mathscr {L}_k(\mathrm {L}_p)}\) in terms of p, k and \(\left\| x \right\| _{\mathrm {L}_p}\). In view of (44), the problem reduces to the successive “extractions” of \(\left\| h_1 \right\| _{\mathrm {L}_p},\dots ,\left\| h_k \right\| _{\mathrm {L}_p}\) via the Hölder inequality applied iteratively along an ad hoc sequence \((p_1,q_1),\dots ,(p_k,q_k)\) of conjugate exponents, starting from the integral \( J_1 := \int _{{\mathbb S}}\vert x \vert ^{p-k}\vert h_1 \vert \dots \vert h_k \vert \,\mathrm {d}\nu . \) To this aim, we choose \(p_i=p-i+1\), \(q_i=p_i/(p_i-1)=(p-i+1)/(p-i)\), \(1\le i \le k\). It is easily seen that
The step \(i\rightarrow i+1\) of this procedure consists in applying Hölder inequality as follows:
At the end of this procedure, we obtain \(J_1 \le J_{k+1} \left\| h_1 \right\| _{\mathrm {L}_p}\cdots \left\| h_k \right\| _{\mathrm {L}_p}\), where
using (46). From this bound for \(J_1\), we deduce that for every \(x\in \mathsf {B}\setminus \{0\}\) the integral in (45) is well defined, that the k-linear operator \(A_k(x): (\mathrm {L}_p)^k\rightarrow \mathbb {R}\) is continuous and satisfies
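In the discrete setting where \(\nu \) is the counting measure on \(\{1,\dots ,m\}\) (so that \(\mathrm {L}_p\) is just \(\mathbb {R}^m\) with the p-norm), the resulting iterated Hölder bound \(J_1 \le \Vert x\Vert _{\mathrm {L}_p}^{p-k}\Vert h_1\Vert _{\mathrm {L}_p}\cdots \Vert h_k\Vert _{\mathrm {L}_p}\) can be checked directly; a sketch for \(k=2\) (an illustration, not from the text):

```python
# Sketch (assumed discrete setting): counting measure on {1,...,m}, so L_p
# is R^m with the p-norm.  We check the iterated-Hoelder bound
#   J_1 <= ||x||_p^(p-k) * ||h_1||_p * ... * ||h_k||_p   (here k = 2),
# which follows from generalized Hoelder with exponents p/(p-k), p, ..., p.
def p_norm(v, p):
    return sum(abs(t) ** p for t in v) ** (1.0 / p)

p, k = 3.5, 2
x = [1.0, -2.0, 0.5, 3.0]
hs = [[0.3, 1.0, -0.7, 2.0], [1.5, -0.2, 0.4, -1.0]]

J1 = sum(abs(xs) ** (p - k) * abs(h1) * abs(h2)
         for xs, h1, h2 in zip(x, *hs))
bound = p_norm(x, p) ** (p - k) * p_norm(hs[0], p) * p_norm(hs[1], p)
assert J1 <= bound + 1e-12
```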
To prove the p-smoothness of \(\mathrm {L}_p\), we have to check (3). Recalling that \(\psi = f\circ \psi _1\), we can use differentiation of composite functions on Banach spaces as in the proof of Lemma 6:
with the same summation conventions as in (25) and
Write \({\mathbb {U}}:=\{x\in \mathrm {L}_p({\mathbb S},\nu ) : \left\| x \right\| _{\mathrm {L}_p} = 1\}\) for the unit sphere of \(\mathrm {L}_p({\mathbb S},\nu )\). As \(\psi _1(x)=1\) for \(x\in {\mathbb {U}}\), it follows from (44) that
Moreover, every \(\beta \in \langle j \rangle _+\) has all its components \(\beta _i\ge 1\), so by (47),
Gathering (48) to (51), we obtain
It remains to check that \(\psi ^{(\lfloor p \rfloor )}\) satisfies
As a preliminary, we check the following inequality
To this aim, we put \(c:=\max (\vert a \vert ,\vert b \vert )\), \(d:=\min (\vert a \vert ,\vert b \vert )\) and use the elementary inequalities
If \({\text {sgn}}(a)={\text {sgn}}(b)\), the choice of \(t=d/c\) in (54) leads to \(c^\alpha - d^\alpha \le (c-d)^\alpha \), whence
If \({\text {sgn}}(a)\not ={\text {sgn}}(b)\), the choice of \(t=c/(c+d)\) in (54) gives \(c^\alpha + d^\alpha \le 2^{1-\alpha }(c+d)^\alpha \), whence
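Assuming (53) is the inequality \(\big | |a|^\alpha {\text {sgn}}(a) - |b|^\alpha {\text {sgn}}(b)\big | \le 2^{1-\alpha }|a-b|^\alpha \) for \(0<\alpha \le 1\) (which is what the two sign cases above combine to), a brute-force grid check:

```python
import math

# Assumed form of inequality (53): for 0 < alpha <= 1 and real a, b,
#   | |a|^alpha*sgn(a) - |b|^alpha*sgn(b) | <= 2^(1-alpha) * |a-b|^alpha.
# Brute-force check on a grid, covering both sign cases of the proof.
def phi(t, alpha):
    return math.copysign(abs(t) ** alpha, t)

alpha = 0.6
vals = [i / 10.0 for i in range(-30, 31)]
for a in vals:
    for b in vals:
        lhs = abs(phi(a, alpha) - phi(b, alpha))
        rhs = 2 ** (1 - alpha) * abs(a - b) ** alpha
        assert lhs <= rhs + 1e-12
```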
Proof of
(52)
Case \(1<p<2\). Here, \(\lfloor p \rfloor =1\) and for \(x, y\in \mathrm {L}_p\) with \(\Vert x\Vert _{\mathrm {L}_p}=\Vert y\Vert _{\mathrm {L}_p}=1\) we have
As \(\psi '_1\) is the linear form \(A_1\), (44), (45) and (53) with \(\alpha =p-1\) give
Applying Hölder inequality with exponents p and \(q=p/(p-1)\), we obtain
This inequality being valid for every h in \(\mathrm {L}_p({\mathbb S}, \nu )\) and as \(\{p\}=p-1\) here, it follows that \(\left\| \psi '(x)-\psi '(y) \right\| \le 2^{2-p}\Vert x-y\Vert _{\mathrm {L}_p}^{\{p\}}\), so (52) is satisfied when \(1<p<2\).
Case \(p\ge 2\). By (48) and (50), we have for \(x, y \in {\mathbb {U}}\),
where \(\varDelta _{\beta , \sigma }(x,y):= I_{\beta ,\sigma }(x) - I_{\beta ,\sigma }(y)\). Using telescopic summation as in the proof of Lemma 6 and (51), we get
Now, it remains to find a suitable control of each increment \(\left\| \psi _1^{(\beta _i)}(x) - \psi _1^{(\beta _i)}(y) \right\| \).
If \(2\le j\le \lfloor p \rfloor \), the multi-index \(\beta \) has at least two components, so \(1\le \beta _i<\lfloor p \rfloor \). Then, \(\psi _1^{(\beta _i)}\) has a continuous derivative on \(\mathsf {B}\setminus \{0\}\). So if \(0\notin [x,y]\), recalling (47), we get
To complete the case \(j\ge 2\), notice that \(0\in [x,y]\) if and only if \(y=-x\). In this special case, \(\left\| x-y \right\| =2\) and, taking (51) into account, we can simply write
Now, \(\left\| x-y \right\| \le 2^{1-\{p\}}\left\| x-y \right\| ^{\{p\}}\) for \( x,y\in {\mathbb {U}}\), so from (56)–(58), there is a constant K dependent only on the space \(\mathrm {L}_p({\mathbb S},\nu )\) such that
It remains to treat the sum of terms for which \(j=1\) in (55). Here, \(\beta \) is a mono-index necessarily equal to \(\lfloor p \rfloor \) and by (56), one can bound this remaining sum R as
Recalling (45) and (44), we have for \(h_1, \dots , h_{\lfloor p \rfloor }\in \mathrm {L}_p({\mathbb S}, \nu )\),
Using iteratively Hölder inequality exactly as in the proof of (47), we obtain
thanks to (53). Finally, \(R\le K'\left\| x-y \right\| ^{\{p\}}\) with a constant \(K'\) depending only on p and recalling (59), this completes the proof of (52). \(\square \)
Recalling that by [19, Prop. 2.23], if \(p=2\ell \) is an even integer, then the norm \(\psi (x)\) is infinitely Fréchet differentiable, we can summarize the smoothness properties of \(\mathrm {L}_p\) in the following proposition.
Proposition 15
(a) For any \(p>1\), the space \(\mathrm {L}_p({\mathbb S}, \nu )\) is p-smooth.
(b) If \(p=2\ell \) is an even integer, then the norm of \(\mathrm {L}_p({\mathbb S}, \nu )\) is infinitely Fréchet differentiable on \(\mathrm {L}_p({\mathbb S}, \nu )\setminus \{0\}\) and has bounded derivatives on the unit sphere, so that \(\mathrm {L}_p({\mathbb S}, \nu )\) is d-smooth for any integer \(d\ge 1\).
Theorem 4 and Proposition 15 yield the following results.
Theorem 16
Let \(p\ge 1\). For \((\mathsf {P}, \mathsf {P}_n, n\in \mathbb {N})\subset \mathscr {P}(\mathrm {L}_p({\mathbb S}, \nu ))\), the following statements are equivalent:
(i) \(\mathsf {P}_n\xrightarrow [n\rightarrow \infty ]{\mathrm {w}}\mathsf {P}\);
(ii) \(\mathsf {P}_nf\xrightarrow [n\rightarrow \infty ]{}\mathsf {P}f\) for any \(f\in \mathrm {C}_b^{(p)}(\mathrm {L}_p({\mathbb S}, \nu ))\);
(iii) \(\lim _{n\rightarrow \infty } \zeta _p(\mathsf {P}_n, \mathsf {P})=0\).
Theorem 17
If \(p\ge 2\) is an even integer, then for \((\mathsf {P}, \mathsf {P}_n, n\in \mathbb {N})\subset \mathscr {P}(\mathrm {L}_p({\mathbb S}, \nu ))\), the following statements are equivalent:
(i) \(\mathsf {P}_n\xrightarrow [n\rightarrow \infty ]{\mathrm {w}}\mathsf {P}\);
(ii) \(\mathsf {P}_nf\xrightarrow [n\rightarrow \infty ]{}\mathsf {P}f\) for any \(f\in \mathrm {C}_b^{(\infty )}(\mathrm {L}_p({\mathbb S}, \nu ))\);
(iii) for at least one \(d>1\), \(\lim _{n\rightarrow \infty } \zeta _d(\mathsf {P}_n, \mathsf {P})=0\).
4 Lindeberg CLT in p-Smooth Banach Spaces
First, we implement in Theorem 21 below the main principle of Lindeberg method and compare the sums \(\sum _{k=1}^{r_n}X_{nk}\) with sums of independent Gaussian random variables. Beforehand, it seems convenient to recall the notion of \(\mathsf {B}\)-valued stochastic integral with respect to a white noise which plays a key role in our proof of Theorem 21.
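The replacement principle behind Lindeberg's method is a pathwise telescoping identity: writing \(U_k = X_1+\cdots +X_k+Y_{k+1}+\cdots +Y_n\), one has \(f(U_n)-f(U_0)=\sum _k[f(Z_k+X_k)-f(Z_k+Y_k)]\) with \(Z_k=U_k-X_k\). A scalar numerical sketch (an illustration, not the paper's construction):

```python
import math
import random

# Pathwise telescoping identity behind the one-by-one Lindeberg swap:
# with U_k = X_1+...+X_k + Y_{k+1}+...+Y_n and Z_k = U_k - X_k,
#   f(U_n) - f(U_0) = sum_k [ f(Z_k + X_k) - f(Z_k + Y_k) ].
random.seed(0)
n = 6
X = [random.uniform(-1, 1) for _ in range(n)]
Y = [random.gauss(0, 1) for _ in range(n)]
f = math.cos  # any smooth test function

lhs = f(sum(X)) - f(sum(Y))
rhs = 0.0
for k in range(1, n + 1):
    Zk = sum(X[:k - 1]) + sum(Y[k:])  # k-th variable removed
    rhs += f(Zk + X[k - 1]) - f(Zk + Y[k - 1])
assert math.isclose(lhs, rhs, rel_tol=0, abs_tol=1e-9)
```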
Definition 18
Let \(({\mathbb S}, \mathscr {S}, \mu )\) be a measure space and \(\mathscr {S}_0:=\{A\in \mathscr {S};\; \mu (A)<\infty \}\). A white noise with variance \(\mu \) is a stochastic process \(W = (W(A);\; A\in \mathscr {S}_0)\) defined on some rich enough probability space \((\varOmega ',\mathscr {F}',\mathsf {P}')\) such that
(a) for each \(A\in \mathscr {S}_0\), W(A) is a real valued Gaussian random variable with mean zero and variance \(\mu (A)\);
(b) if \(A_1\in \mathscr {S}_0, \dots , A_j\in \mathscr {S}_0\) are disjoint, then \(W(A_1), \dots , W(A_j)\) are independent and
$$\begin{aligned} W\left( \bigcup _{i=1}^j A_i\right) =\sum _{i=1}^j W(A_i),\quad j\ge 2. \end{aligned}$$
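On a finite measure space, a white noise with the two properties above can be sketched by attaching an independent standard Gaussian to each atom (an illustrative construction going beyond the text):

```python
import math
import random

# Sketch of a white noise with variance mu on a finite measure space:
# take atoms s with masses mu(s) and independent standard Gaussians g_s,
# and set W(A) = sum_{s in A} sqrt(mu(s)) * g_s.  Then W(A) ~ N(0, mu(A))
# and W is additive over disjoint sets by construction.
random.seed(1)
mu = {"a": 0.5, "b": 1.5, "c": 2.0}
g = {s: random.gauss(0, 1) for s in mu}

def W(A):
    return sum(math.sqrt(mu[s]) * g[s] for s in A)

A1, A2 = {"a"}, {"b", "c"}  # disjoint sets
# property (b): additivity over disjoint sets, pathwise
assert math.isclose(W(A1 | A2), W(A1) + W(A2), abs_tol=1e-9)
# property (a): Var W(A) = mu(A), here the sum of the atom masses
assert math.isclose(sum(mu[s] for s in A1 | A2), 4.0)
```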
Next, following Proposition 3.3 in Hoffmann-Jørgensen and Pisier [15], one can construct a \(\mathsf {B}\)-valued stochastic integral with respect to W. Classically, we first define this integral for functions in the space \(\mathscr {L}_0(\mu )\) of \(\mathscr {S}_0\)-simple functions g, that is, of the form \(g=\sum _{i=1}^j x_i\mathbf {1}_{A_i}\), with \(x_i\in \mathsf {B}\), \(A_i\in \mathscr {S}_0\), \(1\le i\le j\), \(j\ge 1\), and then extend it to the whole space \(\mathrm {L}^2({\mathbb S}, \mathscr {S}, \mu ,\mathsf {B})\). The next proposition is essentially stated and proved in [15]. Our rewriting of its statement and proof is motivated by the need to make Corollary 20 explicit in view of its role in Theorem 21.
Proposition 19
If \(\mathsf {B}\) is of type 2 and W is a white noise with variance \(\mu \) on some probability space \((\varOmega ',\mathscr {F}',\mathsf {P}')\), then there exists a unique linear map
such that the following statements hold.
(a) For every \(g = \sum _{i=1}^j x_i\mathbf {1}_{A_i}\), where \(x_1,\dots ,x_j\in \mathsf {B}\), \(A_1,\dots ,A_j\in \mathscr {S}_0\), \(j\ge 1\),
$$\begin{aligned} I_W(g)=\int _{\mathbb S}g\,\mathrm {d}W := \sum _{i=1}^j x_iW(A_i). \end{aligned}$$(60)
(b) There exists a constant C such that for every \(g\in \mathrm {L}^2({\mathbb S}, \mathscr {S}, \mu ,\mathsf {B})\),
$$\begin{aligned} \mathrm{E}\left\| \int _{\mathbb S}g\,\mathrm {d}W \right\| ^2 \le C \int _{\mathbb S}\left\| g \right\| ^2\,\mathrm {d}\mu . \end{aligned}$$(61)
(c) For every \(g\in \mathrm {L}^2({\mathbb S}, \mathscr {S}, \mu ,\mathsf {B})\), \(\int _{\mathbb S}g\,\mathrm {d}W\) is a Gaussian mean zero random element in \(\mathsf {B}\).
(d) If \(D', D''\in \mathscr {B}_\mathsf {B}\) are disjoint, \(\int _{D'}g\,\mathrm {d}W\) and \(\int _{D''}g\,\mathrm {d}W\) are independent for every g in \(\mathrm {L}^2({\mathbb S}, \mathscr {S}, \mu ,\mathsf {B})\).
Proof
The coherence of the definition of \(I_W(g)\) by (60) when g is an \(\mathscr {S}_0\)-simple function is checked in a standard way using the additivity property (b) in Definition 18. Checking the linearity of \(I_W\) on the subspace \(\mathscr {L}_0(\mu )\) of simple functions in \(\mathrm {L}^2({\mathbb S}, \mathscr {S}, \mu ,\mathsf {B})\) is then straightforward. Next, if \(g\in \mathscr {L}_0(\mu )\), it can be represented as \(g=\sum _{i=1}^j x_i\mathbf {1}_{A_i}\) where the sets \(A_i\in \mathscr {S}_0\) are disjoint, which was not required in (60). As \(\mathsf {B}\) is of type 2 and the \(x_iW(A_i)\) are independent with mean zero and finite second moment, there is a constant C depending only on \(\mathsf {B}\) such that
As the random variables \(W(A_i)\) are mean zero with respective variances \(\mu (A_i)\), this implies
Therefore, \(I_W\) is a continuous linear map \(\mathscr {L}_0(\mu )\longrightarrow \mathrm {L}^2(\varOmega ',\mathscr {F}',\mathsf {P}',\mathsf {B})\) and by density of \(\mathscr {L}_0(\mu )\) in \(\mathrm {L}^2({\mathbb S}, \mathscr {S}, \mu ,\mathsf {B})\), \(I_W\) has a unique continuous linear extension to this space, still denoted \(I_W\), and satisfying (61) with the same constant C.
To prove (c), we check that for every \(u\in \mathsf {B}^*\), \(u(I_W(g))\) is a Gaussian random variable. This is clear for g simple since then \(u(I_W(g))\) is a linear combination of independent Gaussian random variables. In the general case, g is the limit in \(\mathrm {L}^2({\mathbb S}, \mathscr {S}, \mu ,\mathsf {B})\) of a sequence \((g_n)\) of simple functions. Combining the continuity of the linear functional u with (61) gives
which shows that \(u(I_W(g))\) is a Gaussian random variable as a limit in quadratic mean of a sequence of Gaussian random variables. Moreover, for every \(u\in \mathsf {B}^*\), \(\mathrm{E}u(I_W(g)) = \lim _{n\rightarrow \infty }\mathrm{E}u(I_W(g_n))=0\), whence \(\mathrm{E}I_W(g)=0\).
To prove (d), we first note that for g simple, \(g=\sum _{i=1}^jx_i\mathbf {1}_{A_i}\) with the \(A_i\) disjoint, and \(D\in \mathscr {B}_\mathsf {B}\), \(\int _{D}g\,\mathrm {d}W:=\int _{\mathbb S}g\mathbf {1}_{D}\,\mathrm {d}W = \sum _{i=1}^jx_i W(A_i\cap D)\). Since \(D'\) and \(D''\) are disjoint, so are the sets \(A_1\cap D',\dots ,A_j\cap D',A_1\cap D'',\dots ,A_j\cap D''\), which provides the independence of
This independence is preserved when g is the limit in \(\mathrm {L}^2({\mathbb S}, \mathscr {S}, \mu ,\mathsf {B})\) of a sequence \((g_n)\) of simple functions since then, \(Y'_n=\int _{D'}g_n\,\mathrm {d}W\) and \(Y''_n=\int _{D''}g_n\,\mathrm {d}W\) converge in probability to \(Y'=\int _{D'}g\,\mathrm {d}W\) and \(Y''=\int _{D''}g\,\mathrm {d}W\), respectively, which implies the convergence in distribution of \((Y'_n,Y''_n)\) to \((Y',Y'')\). Then, the distribution of \((Y',Y'')\) is the product of the distributions of \(Y'\) and \(Y''\), which is equivalent to the independence of \(Y'\) and \(Y''\). \(\square \)
Now, denote by X a \(\mathsf {B}\) valued random element defined on a probability space \((\varOmega ,\mathscr {F},\mathsf {P})\) with distribution \(\mathsf {P}_X=\mathsf {P}\circ X^{-1}\) (which is a probability measure on \(\mathscr {B}_\mathsf {B}\)) and such that \(\mathrm{E}X=0\), \(\mathrm{E}\left\| X \right\| ^2<\infty \). Let us denote by \(Q = \mathrm {cov}(X)\in L(\mathsf {B}^*, \mathsf {B})\) the covariance operator of X, that is the linear bounded operator from \(\mathsf {B}^*\) to \(\mathsf {B}\) defined by
Since \(\mathsf {B}\) is of type 2 and \(\mathrm{E}\left\| X \right\| ^2<\infty \), the operator Q is pregaussian (see Theorem 3.5 in Hoffmann-Jørgensen and Pisier [15]), so there exists a Gaussian mean zero random element Y in \(\mathsf {B}\) with covariance operator Q. One way to construct such a Y is to apply Proposition 19 with \({\mathbb S}= \mathsf {B}\), \(\mathscr {S}=\mathscr {B}_\mathsf {B}\), \(\mu =\mathsf {P}_X\), which gives the following corollary.
Corollary 20
Let \(\mathsf {B}\) be a separable type 2 Banach space and X be a random element in \(\mathsf {B}\) defined on some probability space \((\varOmega , \mathscr {F}, \mathsf {P})\). Denote by \(\mathsf {P}_X:=\mathsf {P}\circ X^{-1}\) the distribution of X. Assume that \(\mathrm{E}\left\| X \right\| ^2<\infty \) and \(\mathrm{E}X = 0\). Let \(W= (W(A), A\in \mathscr {B}_\mathsf {B})\) be a white noise with variance \(\mu =\mathsf {P}_X\) defined on some probability space \((\varOmega ',\mathscr {F}',\mathsf {P}')\). As \(\mathrm{E}\left\| X \right\| ^2<\infty \), the identity map, \({{\,\mathrm{Id}\,}}_\mathsf {B}: \mathsf {B}\rightarrow \mathsf {B}\), \(x\mapsto x\), is in \(\mathrm {L}^2(\mathsf {B}, \mathscr {B}_\mathsf {B},\mathsf {P}_X,\mathsf {B})\), so we can define a Gaussian mean zero random element Y in \(\mathsf {B}\) by
Then, the following statements hold.
(a) With the constant C in (61),
$$\begin{aligned} \mathrm{E}\left\| Y \right\| ^2 \le C \mathrm{E}\left\| X \right\| ^2. \end{aligned}$$(63)
(b) For every \(g\in \mathrm {L}^2(\mathsf {B}, \mathscr {B}_\mathsf {B}, \mathsf {P}_X,\mathsf {B})\), the Gaussian mean zero random element \(Z=\int _\mathsf {B}g\,\mathrm {d}W\) has the same covariance operator as g(X). In particular, Y and X have the same covariance operator.
(c) For every symmetric \(T\in \mathscr {L}_2(\mathsf {B})\) and every \(g\in \mathrm {L}^2(\mathsf {B}, \mathscr {B}_\mathsf {B}, \mathsf {P}_X,\mathsf {B})\),
$$\begin{aligned} \mathrm{E}T(g(X),g(X))=\mathrm{E}T(Z, Z) \end{aligned}$$(64) and in particular \(\mathrm{E}T(X,X) = \mathrm{E}T(Y,Y)\).
Proof
(a) is a simple translation of (61) in the special case under consideration. For (b), we have to check that \(Q_Z=Q_{g(X)}\), which is equivalent to \(\mathrm{E}\big (u(Z)v(Z)\big ) = \mathrm{E}\big (u(g(X))v(g(X))\big )\) for every u, v in \(\mathsf {B}^*\). For \(g=\sum _{i=1}^j x_i\mathbf {1}_{A_i}\) with the \(A_i\)’s disjoint, \(u(Z)=\sum _{i=1}^j u(x_i)W(A_i)\), whence by independence of the \(W(A_i)\)’s,
Valid for every g simple, this equality extends to the whole space \(\mathrm {L}^2(\mathsf {B}, \mathscr {B}_\mathsf {B}, \mathsf {P}_X,\mathsf {B})\) by the continuity of \(I_W\). In particular for \(g={{\,\mathrm{Id}\,}}_\mathsf {B}\) and \(Y=\int _\mathsf {B}{{\,\mathrm{Id}\,}}_\mathsf {B}\,\mathrm {d}W\), \(Q_Y=Q_X\).
The proof of (c) is similar and will be omitted. \(\square \)
Theorem 21
Assume that \(\mathsf {B}\) is of type 2. Consider an array of \(\mathsf {B}\)-valued random variables
where the probability space \((\varOmega _n, \mathscr {F}_n, \mathsf {P}_n)\) underlying the \(n^{\text {th}}\) line may vary with n and for each \(n\ge 1\), the \(X_{nk}\), \(1\le k\le r_n\), are mean zero independent and
Then, for each \(n\ge 1\), one can construct on some probability space \((\varOmega _n', \mathscr {F}_n', \mathsf {P}_n')\) independent mean zero Gaussian \(\mathsf {B}\)-valued random variables \(Y_{n1}, \dots , Y_{nr_n}\), such that for \(1\le k\le r_n\), \(X_{nk}\) and \(Y_{nk}\) have the same covariance operator and for any \(\delta \in (0, 1]\), and any \(\varepsilon >0\),
where the constant \(c(\mathsf {B},\delta )>0\) depends only on the type 2 constant of the space \(\mathsf {B}\) and on \(\delta \).
Proof
We fix an arbitrary \(n\ge 1\) and prove (66) for the \(n^{\text {th}}\) line of the array. In view of the property (36) of \(\zeta _{2+\delta }\), the problem reduces to proving that, given any mean zero random element X in \(\mathsf {B}\) such that \(\mathrm{E}\left\| X \right\| ^2<\infty \), one can construct, possibly on another probability space than the one supporting X, a mean zero Gaussian random element Y in \(\mathsf {B}\) with the same covariance operator as X, such that
To this aim, choosing Y as in Corollary 20, we have to estimate \(\vert \mathrm{E}f(X) - \mathrm{E}f(Y) \vert \) for \(f\in \mathrm {C}_b^{(2+\delta )}(\mathsf {B})\) such that \(\left\| f \right\| _{2+\delta }\le 1\). By Taylor formula at the order 1 with integral remainder,
To exploit fully the membership of f in \(\mathrm {C}_b^{(2+\delta )}(\mathsf {B})\), we rephrase this formula as
Applying the same treatment to f(Y) and using \(\mathrm{E}(f'(0).X)= f'(0).(\mathrm{E}X)=0\) and similarly \(\mathrm{E}(f'(0).Y)=0\), together with (c) in Corollary 20 applied with the bilinear symmetric operator \(T=f''(0)\), we are left with
where
By the \(\delta \)-Hölder continuity of \(f''\) and \(\left\| f \right\| _{2+\delta }\le 1\), \(\left\| f''(tZ) - f''(0) \right\| \le \left\| tZ \right\| ^\delta \le \left\| Z \right\| ^\delta \) whence
As Y is Gaussian, \(\mathrm{E}\left\| Y \right\| ^{2+\delta }<\infty \), which gives a first estimate of R(Y), by integration with respect to t in (69):
Concerning R(X), only the finiteness of \(\mathrm{E}\left\| X \right\| ^2\) is available, so we use (69) only on the event \(\{\left\| X \right\| \le \varepsilon \}\), where \(\left| \big (f''(tX)-f''(0)\big ).(X,X) \right| \le \varepsilon ^\delta \left\| X \right\| ^2\). This gives
On \(\{\left\| X \right\| >\varepsilon \}\), we simply use the fact that \(\left\| f'' \right\| \le 1\), so \(\left\| f''(tX)-f''(0) \right\| \le 2\). This gives
Our next step is to control the bound (70) in terms of the distribution of X only. Since Y is Gaussian, there is for every \(r>0\) a constant \(\kappa _r\) depending on r only, such that \((\mathrm{E}\left\| Y \right\| ^r)^{1/r}\le \kappa _r (\mathrm{E}\left\| Y \right\| ^2)^{1/2}\). One possible value is obtained using the inequality \(P(\left\| Y \right\| >t)\le 4\exp (-t^2/(8c^2))\) where \(c^2=\mathrm{E}\left\| Y \right\| ^2\), see, e.g., (3.5) in [17], which gives \(\kappa _r = 2^{3/2+2/r}\varGamma (r/2+1)^{1/r}\). In particular,
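For a scalar standard Gaussian Y (so \(\mathrm{E}\left\| Y \right\| ^2=1\)), the absolute moments are known in closed form, which allows a direct numerical check of the constant \(\kappa _r\) (an illustration, not part of the proof):

```python
import math

# For a scalar standard Gaussian Y, E|Y|^r = 2^(r/2)*Gamma((r+1)/2)/sqrt(pi)
# and E|Y|^2 = 1, so the moment-equivalence inequality
#   (E|Y|^r)^(1/r) <= kappa_r * (E|Y|^2)^(1/2)
# with kappa_r = 2^(3/2 + 2/r) * Gamma(r/2 + 1)^(1/r) can be checked exactly.
def abs_moment(r):
    return 2 ** (r / 2) * math.gamma((r + 1) / 2) / math.sqrt(math.pi)

def kappa(r):
    return 2 ** (1.5 + 2 / r) * math.gamma(r / 2 + 1) ** (1 / r)

for delta in (0.1, 0.5, 1.0):
    r = 2 + delta
    assert abs_moment(r) ** (1 / r) <= kappa(r)
```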
Next, recalling (62), we note that \(Y = Y'_\varepsilon + Y''_\varepsilon \), where
are independent Gaussian random elements in \(\mathsf {B}\) by Proposition 19 and Corollary 20. Since \(\mathsf {B}\) is of type 2, it follows by (39) that
By Proposition 19 (b) and the convexity inequality \((a+b)^r \le 2^{r-1}(a^r + b^r)\), \(a,b\ge 0\), \(r\ge 1\), we obtain with a constant \(\gamma _\delta :=2^{\delta /2}\kappa _{2+\delta }^{2+\delta }K^{2+\delta }C^{1+\delta /2}\), C being as in (61),
Now gathering (70), (71), (72) and (73) gives (67) with \(c(\mathsf {B},\delta )=(1+\gamma _\delta /2)\).
To conclude, choose a probability space \((\varOmega _n',\mathscr {F}_n',\mathsf {P}'_n)\) rich enough to support a sequence of independent white noises \((W_{nk})_{1\le k\le r_n}\), where the variance of \(W_{nk}\) is the distribution of \(X_{nk}\). Define on this probability space the corresponding sequence \((Y_{nk})_{1\le k\le r_n}\) of Gaussian random elements in \(\mathsf {B}\) by \(Y_{nk}:=\int _\mathsf {B}{{\,\mathrm{Id}\,}}_\mathsf {B}\,\mathrm {d}W_{nk}\), \(1\le k\le r_n\). Each pair \(X_{nk}\), \(Y_{nk}\) satisfies (67). Bounding \(1+\mathrm{E}\left\| X_{nk} \right\| ^2\) by \(1+M_n\) and summing over \(k=1,\dots ,r_n\), we obtain (66). \(\square \)
Hence, in a p-smooth Banach space \(\mathsf {B}\) with \(p>2\), Theorem 21 reduces the proof of convergence in distribution of the sequence \(\sum _{k=1}^nX_{nk}, n\in \mathbb {N},\) to a \(\mathsf {B}\)-valued Gaussian random variable \(Y_Q\) to the proof of convergence in distribution of the Gaussian sequence \(\sum _{k=1}^n Y_{nk}\) to \(Y_Q\). The latter is controlled by the convergence of covariance operators. In finite dimensional spaces, this is not a problem. In any separable Hilbert space, as well as in a Banach space of type 2 with the approximation property, the convergence \(\sum _{k=1}^nY_{nk}\xrightarrow [n\rightarrow \infty ]{\mathscr {D}}Y_Q\) is obtained from the convergence of covariances in nuclear norm (see Chevet [8]).
Recall that an operator \(u\in \mathscr {L}(\mathsf {B})\) is said to be nuclear if it admits the representation
$$\begin{aligned} u(x)=\sum _{k=1}^\infty f_k(x)\,y_k, \quad x\in \mathsf {B}, \end{aligned}$$
where \(f_k\in \mathsf {B}^*, y_k\in \mathsf {B}\), and
$$\begin{aligned} \sum _{k=1}^\infty \Vert f_k\Vert \cdot \Vert y_k\Vert <\infty . \end{aligned}$$
The greatest lower bound of the sum \(\sum _{k=1}^\infty \Vert f_k\Vert \cdot \Vert y_k\Vert \) taken over all possible representations of u is called the nuclear norm of u and is denoted by \(\nu _1(u)\).
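In finite dimensions, the nuclear norm coincides with the trace norm, the sum of the singular values of the representing matrix; a small sketch for \(2\times 2\) matrices (a finite-dimensional illustration going beyond the text):

```python
import math

# Finite-dimensional illustration (an assumption beyond the text): on R^n,
# nu_1(u) equals the trace norm, i.e. the sum of the singular values of the
# matrix of u.  For a 2x2 matrix A, the squared singular values are the
# eigenvalues of M = A^T A, the roots of a quadratic.
def nuclear_norm_2x2(a, b, c, d):
    # Gram matrix M = A^T A for A = [[a, b], [c, d]]
    m11, m12, m22 = a * a + c * c, a * b + c * d, b * b + d * d
    tr, det = m11 + m22, m11 * m22 - m12 * m12
    disc = math.sqrt(max(tr * tr - 4 * det, 0.0))
    return math.sqrt((tr + disc) / 2) + math.sqrt(max((tr - disc) / 2, 0.0))

# diag(3, -4) has singular values 3 and 4, so nu_1 = 7
assert math.isclose(nuclear_norm_2x2(3.0, 0.0, 0.0, -4.0), 7.0)
```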
Theorem 22
Let the Banach space \(\mathsf {B}\) be p-smooth for some \(p>2\) and have the approximation property. For each \(n\ge 1\), suppose that \(X_{n1}, \dots , X_{nr_n}\) is a sequence of mean zero independent \(\mathsf {B}\)-valued random elements such that \(\sup _{n\in \mathbb {N}}\sum _{k=1}^{r_n}\mathrm{E}\left\| X_{nk} \right\| ^2<\infty \). Let \(Q_{nj}:=\mathrm {cov}(X_{nj})\), \(j=1, \dots , r_n, n\ge 1.\) If there is a linear bounded operator \(Q\in L(\mathsf {B}^*, \mathsf {B})\) such that
and for each \(\varepsilon >0\),
then Q is pre-Gaussian and
where \(Y_Q\) is a mean zero Gaussian random element in \(\mathsf {B}\) with covariance Q.
Proof
Let the Gaussian triangular array \((Y_{nk}, k=1, \dots , r_n; n\ge 1)\) be as constructed in Theorem 21. Since
it is enough by Theorem 21 to prove
This is equivalent to the weak convergence of the corresponding Gaussian distributions, and as proved in Chevet [8], the convergence \(\sum _{k=1}^{r_n}Y_{nk}\xrightarrow [n\rightarrow \infty ]{\mathscr {D}}Y_Q\) follows from (74). \(\square \)
Checking convergence in nuclear norm can be a rather complex task. In some concrete Banach spaces, a direct proof of convergence in distribution of Gaussian random variables is easier to achieve. As an illustration, consider now the case of \(\mathrm {L}_p\) spaces. In what follows, \(({\mathbb S},\mathscr {S},\mu )\) is a measure space whose measure \(\mu \) is \(\sigma \)-finite. We denote by p a real number in \((2,\infty )\) and by \(q=p/(p-1)\) its conjugate exponent. We assume moreover that the space \(\mathrm {L}_p({\mathbb S},\mathscr {S},\mu ;\mathbb {R})\) is separable. We denote, respectively, by \(\mathscr {S}\otimes \mathscr {S}\) and \(\mu \otimes \mu \) the product \(\sigma \)-field and the product measure on the Cartesian product \({\mathbb S}^2\). We will use the abbreviations:
For real valued functions u, v defined \(\mu \) almost everywhere on \({\mathbb S}\), \(u\otimes v\) denotes the function defined \(\mu \otimes \mu \) almost everywhere on \({\mathbb S}^2\) by \((u\otimes v)(s,t):= u(s)v(t)\). This notation is extended in an obvious way to random elements in \(\mathrm {L}_p({\mathbb S})\).
Theorem 23
(CLT in \(\mathrm {L}_p\), \(p>2\)) Let \((X_{nk},k=1, \dots , r_n; n\in \mathbb {N})\) be a triangular array of mean zero independent random elements in the separable space \(\mathrm {L}_p({\mathbb S},\mathscr {S},\mu ;\mathbb {R})\). Assume that the following conditions are satisfied.
(a) \(\sup _{n\in \mathbb {N}}\sum _{k=1}^{r_n} \mathrm{E}\left\| X_{nk} \right\| _{\mathrm {L}_p}^2<\infty \).
(b) For any \(\varepsilon >0\),
$$\begin{aligned} \lim _{n\rightarrow \infty }\sum _{k=1}^{r_n}\mathrm{E}\left\| X_{nk} \right\| ^2_{\mathrm {L}_p}\mathbf {1}\{\left\| X_{nk} \right\| _{\mathrm {L}_p}>\varepsilon \} = 0. \end{aligned}$$
(c)
There is a mean zero Gaussian random element Y in \(\mathrm {L}_p({\mathbb S},\mathscr {S},\mu ;\mathbb {R})\) such that
$$\begin{aligned} \sum _{k=1}^{r_n} \mathrm{E}(X_{nk}\otimes X_{nk}) \xrightarrow [n\rightarrow \infty ]{} \varGamma :=\mathrm{E}(Y\otimes Y)\quad \text {in } \mathrm {L}_p({\mathbb S}^2,\mathscr {S}\otimes \mathscr {S},\mu \otimes \mu ;\mathbb {R}). \end{aligned}$$ -
(d)
Denoting by \(\sigma _n\) and \(\sigma \) the non-negative elements of \(\mathrm {L}_p({\mathbb S},\mathscr {S},\mu ;\mathbb {R})\) defined by \(\sigma _n^2(s):= \sum _{k=1}^{r_n}\mathrm{E}X_{nk}(s)^2\) and \(\sigma ^2(s):=\mathrm{E}Y(s)^2\), \(\mu \)-a.e on \({\mathbb S}\),
$$\begin{aligned} \int _{\mathbb S}\sigma _n^p\,\mathrm {d}\mu \xrightarrow [n\rightarrow \infty ]{} \int _{\mathbb S}\sigma ^p \,\mathrm {d}\mu . \end{aligned}$$
Then,
The proof requires the preliminaries gathered in Lemmas 24 to 27.
Lemma 24
Assume that \(f, g\in \mathrm {L}_q({\mathbb S}^2)\) satisfy for all \(A, B\in \mathscr {S}\) of finite \(\mu \)-measure,
Then, \(f=g\), \((\mu \otimes \mu )\)-a.e. on \({\mathbb S}^2\).
Proof
Let us remark first that the integrals in (78) are well defined because \(\mathbf {1}_A\) and \(\mathbf {1}_B\) are in \(\mathrm {L}_q({\mathbb S})\) since \(\mu (A)\) and \(\mu (B)\) are finite. We first prove the lemma in the special case where \(\mu ({\mathbb S})<\infty \) and then extend the result to the general case by using the \(\sigma \)-finiteness of \(\mu \). To simplify the writing, we denote by \(C_n\uparrow C\) the fact that the sequence of sets \((C_n)_{n\ge 1}\) increases to the set C, that is \(C_n\subset C_{n+1}\) for every \(n\ge 1\) and \(\cup _{n\ge 1}C_n = C\).
Case where \(\mu ({\mathbb S})<\infty \). Let us introduce the class \(\mathscr {L}\) of sets \(C\in \mathscr {S}\otimes \mathscr {S}\) such that f and g are \(\mu \otimes \mu \) integrable on C and \(\int _C f\,\mathrm {d}(\mu \otimes \mu ) = \int _C g\,\mathrm {d}(\mu \otimes \mu )\), together with the class \(\mathscr {R}:=\{A\times B, A\in \mathscr {S}, B\in \mathscr {S}\}\). As \(\mu ({\mathbb S})\) is finite, the same holds for \(\mu (A)\) and \(\mu (B)\) and (78) gives the inclusion \(\mathscr {R}\subset \mathscr {L}\). Clearly, \(\mathscr {R}\) is a \(\pi \)-system, i.e., closed under the formation of finite intersections. The class \(\mathscr {L}\) satisfies the three following properties.
- (\(\lambda _1\)): \({\mathbb S}^2\) belongs to \(\mathscr {L}\). Indeed \(\mathbf {1}_{{\mathbb S}^2}\in \mathrm {L}_q({\mathbb S}^2)\) because \(\mu ({\mathbb S})<\infty \).
- (\(\lambda _2\)): \(C, C'\in \mathscr {L}\) and \(C\subset C'\) imply \(C'\setminus C \in \mathscr {L}\). This follows easily by writing for \(h=f, g\), \(\int _{C'}h\,\mathrm {d}(\mu \otimes \mu ) = \int _Ch\,\mathrm {d}(\mu \otimes \mu ) + \int _{C'\setminus C}h\,\mathrm {d}(\mu \otimes \mu )\) and using the membership of \(C, C'\) in \(\mathscr {L}\).
- (\(\lambda _3\)): \(\{C_n, n\ge 1\}\subset \mathscr {L}\) and \(C_n\uparrow C\) imply \(C\in \mathscr {L}\). Indeed the equality \(\int _{C_n}f\,\mathrm {d}(\mu \otimes \mu ) = \int _{C_n}g\,\mathrm {d}(\mu \otimes \mu )\) gives \(\int _{C_n}(f^+ + g^-)\,\mathrm {d}(\mu \otimes \mu ) = \int _{C_n}(g^+ + f^-)\,\mathrm {d}(\mu \otimes \mu )\) and by B. Levi’s monotone convergence theorem, we obtain \(\int _{C}(f^+ + g^-)\,\mathrm {d}(\mu \otimes \mu ) = \int _{C}(g^+ + f^-)\,\mathrm {d}(\mu \otimes \mu )\) that is \(\int _{C}f\,\mathrm {d}(\mu \otimes \mu ) = \int _{C}g\,\mathrm {d}(\mu \otimes \mu )\), so \(C\in \mathscr {L}\).
Hence, \(\mathscr {L}\) is a \(\lambda \)-system. As it contains the \(\pi \)-system \(\mathscr {R}\), by Dynkin’s \(\pi \)-\(\lambda \) theorem, see, e.g., [5], it also contains the \(\sigma \)-field generated by \(\mathscr {R}\), that is the product \(\mathscr {S}\otimes \mathscr {S}\). As \(\mathscr {L}\) was defined as a subset of \(\mathscr {S}\otimes \mathscr {S}\), it follows that \(\mathscr {L}=\mathscr {S}\otimes \mathscr {S}\). In other words, for every \(C\in \mathscr {S}\otimes \mathscr {S}\),
$$\begin{aligned} \int _{C}f\,\mathrm {d}(\mu \otimes \mu ) = \int _{C}g\,\mathrm {d}(\mu \otimes \mu ). \end{aligned}$$
(79)
Now, with \(C=\{f>g\}\), (79) gives \(\int _C (f-g)\,\mathrm {d}(\mu \otimes \mu )=0\). As \(f-g\) is positive on C, this implies \(\mu \otimes \mu (\{f>g\})=0\). Similarly, one checks that \(\mu \otimes \mu (\{f<g\})=0\), so finally \(\mu \otimes \mu (\{f\ne g\})=0\), that is \(f=g\), \(\mu \otimes \mu \)-a.e. on \({\mathbb S}^2\).
Case where \(\mu ({\mathbb S})=\infty \). By \(\sigma \)-finiteness of \(\mu \), there is a sequence \(({\mathbb S}_n)_{n\ge 1}\) in \(\mathscr {S}\), such that \({\mathbb S}_n\uparrow {\mathbb S}\) and \(\mu ({\mathbb S}_n)<\infty \) for each \(n\ge 1\). Let us equip \({\mathbb S}_n\) with the \(\sigma \)-field \( \mathscr {S}_n := \{A\in \mathscr {S}; A\subset {\mathbb S}_n\} = \{A'\cap {\mathbb S}_n ; A'\in \mathscr {S}\}. \) Then, we can apply the previous case to each measure space \(({\mathbb S}_n,\mathscr {S}_n,\mu )\), \(n\ge 1\), which gives \(f=g\), \((\mu \otimes \mu )\)-a.e. on \({\mathbb S}_n^2\). As \({\mathbb S}^2=\cup _{n\ge 1}{\mathbb S}_n\times {\mathbb S}_n\), this gives \(f=g\), \((\mu \otimes \mu )\)-a.e. on \({\mathbb S}^2\). \(\square \)
Proposition 25
If X and \(X'\) are mean zero random elements in \(\mathrm {L}_p({\mathbb S})\) with finite strong second moment and the same covariance operator, then their covariance functions coincide: \(\mathrm{E}\big (X(s)X(t)\big ) = \mathrm{E}\big (X'(s)X'(t)\big )\) for \((\mu \otimes \mu )\)-almost every \((s,t)\in {\mathbb S}^2\).
Proof
If X and \(X'\) have the same covariance operator, then for all \(u, v \in \mathrm {L}_q({\mathbb S})\),
$$\begin{aligned} \mathrm{E}\langle X,u \rangle \langle X,v \rangle = \mathrm{E}\langle X',u \rangle \langle X',v \rangle . \end{aligned}$$
Hölder’s inequality and Fubini’s theorem justify rewriting this equality as
$$\begin{aligned} \int _{{\mathbb S}^2} \mathrm{E}\big (X(s)X(t)\big )u(s)v(t) \,\mathrm {d}(\mu \otimes \mu ) = \int _{{\mathbb S}^2} \mathrm{E}\big (X'(s)X'(t)\big )u(s)v(t) \,\mathrm {d}(\mu \otimes \mu ). \end{aligned}$$
As for any \(A, B\in \mathscr {S}\) such that \(\mu (A), \mu (B)<\infty \), the functions \(u=\mathbf {1}_A\) and \(v=\mathbf {1}_B\) are in \(\mathrm {L}_q({\mathbb S})\), Lemma 24 gives the expected conclusion. \(\square \)
In what follows, we use for notational convenience the indexation by infinite subsets of \(\mathbb {N}^*=\mathbb {N}\setminus \{0\}\) to denote subsequences. So any (infinite) subsequence of \((u_n)_{n\ge 1}\) can be denoted as \((u_n)_{n\in I}\) with I an infinite subset of \(\mathbb {N}^*\), and the convergence of this subsequence will be denoted by \(\xrightarrow [n\rightarrow \infty , n\in I]{}\) or \(\lim _{n\rightarrow \infty , n\in I}\).
Lemma 26
Let \(\xi \) be a Gaussian random element in \(\mathrm {L}_p({\mathbb S})=\mathrm {L}_p({\mathbb S},\mathscr {S},\mu ;\mathbb {R})\) having a representation \(\xi = \int _{\mathrm {L}_p({\mathbb S})} f\,\mathrm {d}W\),
where W is a white noise. Then, for \(\mu \)-almost every \(s\in {\mathbb S}\), \(\xi (s)\) is a mean zero Gaussian random variable.
Proof
By construction of the \(\mathrm {L}_p({\mathbb S})\) valued stochastic integral with respect to W, there is a sequence of \(\mathrm {L}_p({\mathbb S})\) valued simple functions \( f_n = \sum _{i=1}^{j_n}h_{ni}\mathbf {1}_{A_{ni}} \) where the \(h_{ni}\) are in \(\mathrm {L}_p({\mathbb S})\), and for each n, the \(A_{ni}\), \(1\le i\le j_n\) are disjoint, such that with \(\xi _n:=\int _{\mathrm {L}_p({\mathbb S})}f_n\,\mathrm {d}W\), \(\mathrm{E}\left\| \xi _n-\xi \right\| _p^2\rightarrow 0\). Let us fix a representative, still denoted \(h_{ni}\), in each class of functions \(h_{ni}\). Then,
$$\begin{aligned} \xi _n(s) = \sum _{i=1}^{j_n}h_{ni}(s)W(A_{ni}) \end{aligned}$$
is a mean zero Gaussian random variable as a linear combination of the independent mean zero Gaussian random variables \(W(A_{ni})\). Now, the conclusion of the lemma follows if we prove that for \(\mu \)-almost every \(s\in {\mathbb S}\), \(\mathrm{E}\vert \xi _n(s)-\xi (s) \vert ^2\rightarrow 0\).
Our first step is to prove that the convergence \(\mathrm{E}\left\| \xi _n-\xi \right\| _p^2\rightarrow 0\) implies \(\mathrm{E}\left\| \xi _n-\xi \right\| _p^p\rightarrow 0\) in our Gaussian setting. To this aim, we use the Cauchy–Schwarz estimate
$$\begin{aligned} \mathrm{E}\left\| \xi _n-\xi \right\| _p^p \le \big (\mathrm{E}\left\| \xi _n-\xi \right\| _p^2\big )^{1/2} \big (\mathrm{E}\left\| \xi _n-\xi \right\| _p^{2p-2}\big )^{1/2}. \end{aligned}$$
As the Gaussian random elements \(\xi _n\) and \(\xi \) have strong moments of any order, we just have to bound \(\mathrm{E}\left\| \xi _n \right\| _p^{2p-2}\) uniformly in n. Using (3.5) in [17], we get for \(r\ge 2\),
$$\begin{aligned} \mathrm{E}\left\| \xi _n \right\| _p^r \le \int _0^\infty 4rt^{r-1}\exp \big (-t^2/(8\mathrm{E}\left\| \xi _n \right\| _p^2)\big )\,\mathrm {d}t. \end{aligned}$$
Now, the convergence to zero of \(\mathrm{E}\left\| \xi _n-\xi \right\| _p^2\) implies the convergence of \(\mathrm{E}\left\| \xi _n \right\| _p^2\) to \(\mathrm{E}\left\| \xi \right\| _p^2\) so there is some \(n_0\) such that \(\mathrm{E}\left\| \xi _n \right\| _p^2 \le 2\mathrm{E}\left\| \xi \right\| _p^2\) for every \(n\ge n_0\). Hence, \(\sup _{n\ge n_0}\mathrm{E}\left\| \xi _n \right\| _p^r\le \int _0^\infty 4rt^{r-1}\exp (-t^2/(16\mathrm{E}\left\| \xi \right\| _p^2))\,\mathrm {d}t<\infty \).
Finally, since by Fubini’s theorem
$$\begin{aligned} \int _{\mathbb S}\mathrm{E}\vert \xi _n(s)-\xi (s) \vert ^p\,\mu (\mathrm {d}s) = \mathrm{E}\left\| \xi _n-\xi \right\| _p^p \xrightarrow [n\rightarrow \infty ]{} 0, \end{aligned}$$
we can extract a subsequence \((\mathrm{E}\vert \xi _{n}(s)-\xi (s) \vert ^p)_{n\in I}\) which converges to zero \(\mu \)-almost everywhere on \({\mathbb S}\). So there is some measurable subset \({\mathbb S}'\) such that \(\mu ({\mathbb S}\setminus {\mathbb S}')=0\) and for every \(s\in {\mathbb S}'\), \(\lim _{n\rightarrow \infty , n\in I}\mathrm{E}\vert \xi _{n}(s)-\xi (s) \vert ^p=0\). As \(p>2\), this implies \(\lim _{n\rightarrow \infty , n\in I}\mathrm{E}\vert \xi _{n}(s)-\xi (s) \vert ^2=0\). So for every s in \({\mathbb S}'\), \(\xi (s)\) is the limit in quadratic mean of the sequence of mean zero Gaussian random variables \((\xi _{n}(s))_{n\in I}\); hence, \(\xi (s)\) is a mean zero Gaussian random variable. \(\square \)
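The closing argument of Lemma 26 rests on the classical fact that a quadratic-mean limit of mean zero Gaussian random variables is again mean zero Gaussian; at the level of distributions this is just the statement that \(N(0,\sigma _n^2)\) converges weakly to \(N(0,\sigma ^2)\) whenever \(\sigma _n\rightarrow \sigma \). A minimal numerical sketch of this distributional convergence, using a hypothetical grid-based Kolmogorov distance in plain Python:

```python
import math

def normal_cdf(x, sigma):
    """CDF of a centered Gaussian with standard deviation sigma > 0."""
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))

def sup_cdf_distance(s1, s2):
    """Kolmogorov distance between N(0, s1^2) and N(0, s2^2), on a finite grid."""
    grid = [i / 100.0 for i in range(-800, 801)]
    return max(abs(normal_cdf(x, s1) - normal_cdf(x, s2)) for x in grid)

# sigma_n -> sigma forces N(0, sigma_n^2) -> N(0, sigma^2) in distribution:
sigma = 2.0
dists = [sup_cdf_distance(sigma + 1.0 / n, sigma) for n in (1, 10, 100)]
print(dists)  # a sequence decreasing towards 0
```

The distances shrink with the variance gap, which is exactly what identifies the limit \(\xi (s)\) as Gaussian with variance \(\lim _n \mathrm{E}\xi _n(s)^2\).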
Lemma 27
Let X be a random element in \(\mathrm {L}_p({\mathbb S})\) such that \(\mathrm{E}\left\| X \right\| _{\mathrm {L}_p}^2<\infty \). Let \(\xi \) be a Gaussian random element of the form \(\xi =\int _{\mathrm {L}_p({\mathbb S})}{{\,\mathrm{Id}\,}}\,\mathrm {d}W\), where W is a white noise with variance \(\mathsf {P}_X\).
- (i) For \(\mu \)-almost every \(s\in {\mathbb S}\), \(\sigma ^2(s):=\mathrm{E}X(s)^2=\mathrm{E}\xi (s)^2\).
- (ii) Moreover, \(\sigma \in \mathrm {L}_p({\mathbb S})\).
Proof
To prove (i), we recall that the proof of Lemma 26 provides a measurable subset \({\mathbb S}'\) such that \(\mu ({\mathbb S}\setminus {\mathbb S}')=0\) and an infinite subset I of \(\mathbb {N}^*\) such that for every \(s\in {\mathbb S}'\), \((\mathrm{E}\vert \xi _{n}(s)-\xi (s) \vert ^p)_{n\in I}\) converges to zero and \((\mathrm{E}\xi _n(s)^2)_{n\in I}\) converges to \(\mathrm{E}\xi (s)^2\). So it suffices to prove that one can extract a subsequence \((\mathrm{E}\xi _n(s)^2)_{n\in J}\), for some infinite subset J of I, converging to \(\mathrm{E}X(s)^2\) for \(\mu \)-almost every \(s\in {\mathbb S}'\). Moreover, it is enough to prove (i) in the case where \(\mu ({\mathbb S})<\infty \). Indeed when \(\mu ({\mathbb S})=\infty \), by \(\sigma \)-finiteness of \(\mu \), there is a sequence \(({\mathbb S}_n)_{n\ge 1}\) in \(\mathscr {S}\), such that \({\mathbb S}_n\uparrow {\mathbb S}\) and \(\mu ({\mathbb S}_n)<\infty \) for each \(n\ge 1\), and the same holds with \(({\mathbb S}'_n)_{n\ge 1}\) and \({\mathbb S}'\), where \({\mathbb S}'_n:={\mathbb S}_n\cap {\mathbb S}'\). Then, clearly if \(\mathrm{E}X(s)^2=\mathrm{E}\xi (s)^2\) \(\mu \)-a.e. on each \({\mathbb S}'_n\), the same equality holds \(\mu \)-a.e. on \({\mathbb S}'\). So let us assume from now on, that \(\mu ({\mathbb S})\) is finite.
Now, we note that \(\xi _n\) was defined as \(\xi _n:=\int _{\mathrm {L}_p({\mathbb S})}f_n\,\mathrm {d}W\), with \(f_n=\sum _{i=1}^{j_n}x_{ni}\mathbf {1}_{A_{ni}}\), where the \(x_{ni}\) are in \(\mathrm {L}_p({\mathbb S})\) and \(f_n\) converges to \({{\,\mathrm{Id}\,}}_{\mathrm {L}_p({\mathbb S})}\) in \(\mathrm {L}_2(\mathsf {P}_X)\).
This convergence means that
$$\begin{aligned} \int _{\mathrm {L}_p({\mathbb S})}\left\| f_n(x)-x \right\| _p^2\,\mathsf {P}_X(\mathrm {d}x) \xrightarrow [n\rightarrow \infty ]{} 0, \end{aligned}$$
which can be reformulated as
$$\begin{aligned} \mathrm{E}\left\| f_n(X)-X \right\| _p^2 \xrightarrow [n\rightarrow \infty ]{} 0. \end{aligned}$$
(80)
Since \(\mu ({\mathbb S})\) is finite, \(\mu /\mu ({\mathbb S})\) is a probability, whence as \(p>2\), for any \(g\in \mathrm {L}_p({\mathbb S})\),
$$\begin{aligned} \Big (\int _{\mathbb S}g^2\,\mathrm {d}\mu \Big )^{1/2} \le \mu ({\mathbb S})^{\frac{1}{2}-\frac{1}{p}} \Big (\int _{\mathbb S}\vert g \vert ^p\,\mathrm {d}\mu \Big )^{1/p}. \end{aligned}$$
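The comparison of \(\mathrm {L}_2\) and \(\mathrm {L}_p\) norms on a finite measure space used here follows from Hölder’s (or Jensen’s) inequality; a quick numerical sketch on a hypothetical ten-point measure space:

```python
# Discrete check of ||g||_2 <= mu(S)^(1/2 - 1/p) * ||g||_p for p > 2
# on a finite measure space: S = {0,...,9} with arbitrary positive masses.
masses = [0.3, 1.2, 0.5, 0.7, 0.1, 0.9, 0.4, 0.6, 0.2, 1.1]  # mu({s})
g = [1.5, -2.0, 0.3, 4.1, -0.7, 2.2, -3.3, 0.9, 1.1, -0.4]

p = 3.0
mu_S = sum(masses)
norm2 = sum(m * x ** 2 for m, x in zip(masses, g)) ** 0.5
normp = sum(m * abs(x) ** p for m, x in zip(masses, g)) ** (1.0 / p)

lhs, rhs = norm2, mu_S ** (0.5 - 1.0 / p) * normp
print(lhs <= rhs + 1e-12)  # True
```

In particular, for a probability measure (\(\mu ({\mathbb S})=1\)) the constant disappears and \(\Vert g\Vert _2\le \Vert g\Vert _p\) directly.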
This enables us to deduce from (80) that
$$\begin{aligned} \int _{\mathbb S}\mathrm{E}\vert f_n(X)(s)-X(s) \vert ^2\,\mu (\mathrm {d}s) \xrightarrow [n\rightarrow \infty ]{} 0. \end{aligned}$$
Then, there is a measurable subset \({\mathbb S}''\) of \({\mathbb S}'\) such that \(\mu ({\mathbb S}'\setminus {\mathbb S}'')=0\), together with a subsequence \((\mathrm{E}\vert f_n(X)(s) - X(s) \vert ^2)_{n\in J}\), with \(J\subset I\), converging to zero for every \(s\in {\mathbb S}''\). Now, we have for every \(s\in {\mathbb S}''\), \( \lim _{n\rightarrow \infty , n\in J}\mathrm{E}\big (f_n(X)(s)\big )^2 = \mathrm{E}X(s)^2. \) As \(f_n(X)=\sum _{i=1}^{j_n}x_{ni}\mathbf {1}_{A_{ni}}(X)\), with the \(A_{ni}\) pairwise disjoint, \(\mathrm{E}\big (f_n(X)(s)\big )^2 = \sum _{i=1}^{j_n}x_{ni}(s)^2 \mathsf {P}_X(A_{ni})\). On the other hand, \(\xi _n(s) = I_W(f_n)(s) = \sum _{i=1}^{j_n}x_{ni}(s)W(A_{ni})\), where the \(W(A_{ni})\) are independent centered Gaussian random variables with respective variances \(\mathsf {P}_X(A_{ni})\), so \(\mathrm{E}\xi _n(s)^2 = \sum _{i=1}^{j_n}x_{ni}(s)^2 \mathsf {P}_X(A_{ni})\). Hence \(\mathrm{E}\big (f_n(X)(s)\big )^2 = \mathrm{E}\xi _n(s)^2\) \(\mu \)-a.e. on \({\mathbb S}''\). Finally, for every \(s\in {\mathbb S}''\),
$$\begin{aligned} \mathrm{E}X(s)^2 = \lim _{n\rightarrow \infty , n\in J}\mathrm{E}\xi _n(s)^2 = \mathrm{E}\xi (s)^2, \end{aligned}$$
which completes the proof of (i).
To check (ii), by combining (i) and Lemma 26, one sees that for \(\mu \)-almost every \(s\in {\mathbb S}\), \(\sigma ^2(s)=\mathrm{E}\xi (s)^2\) and \(\xi (s)\) is a mean zero Gaussian random variable. For every such s, \(\sigma (s)=(\mathrm{E}\xi (s)^2)^{1/2} \le (\mathrm{E}\vert \xi (s) \vert ^p)^{1/p}\) since \(p>2\), whence \(\sigma (s)^p \le \mathrm{E}\vert \xi (s) \vert ^p\). Therefore,
$$\begin{aligned} \int _{\mathbb S}\sigma ^p\,\mathrm {d}\mu \le \int _{\mathbb S}\mathrm{E}\vert \xi (s) \vert ^p\,\mu (\mathrm {d}s) = \mathrm{E}\left\| \xi \right\| _p^p < \infty , \end{aligned}$$
since the Gaussian random element \(\xi \) in \(\mathrm {L}_p({\mathbb S})\) has finite moments of any order. \(\square \)
Proof of Theorem 23
Putting \(2+\delta = \min (p,3)\) and applying Theorem 21, we deduce from (a) and (b) that \(\lim _{n\rightarrow \infty } \zeta _{2+\delta }\left( \sum _{k=1}^{r_n}X_{nk},\sum _{k=1}^{r_n}Y_{nk}\right) = 0 \), where the \(Y_{nk}\) are chosen as in the proof of Theorem 21, that is \(Y_{nk}=\int _{\mathrm {L}_p({\mathbb S})}{{\,\mathrm{Id}\,}}_{\mathrm {L}_p({\mathbb S})}\,\mathrm {d}W_{nk}\), where the \(W_{nk}\) are independent white noises with respective variances \(\mathsf {P}_{X_{nk}}\). So it remains to prove that \(\zeta _{2+\delta }\big (\sum _{k=1}^{r_n}Y_{nk}, Y\big )\) converges to zero, which is equivalent to \(\sum _{k=1}^{r_n}Y_{nk} \xrightarrow [n\rightarrow \infty ]{\mathscr {D}}Y\) in the space \(\mathrm {L}_p({\mathbb S})\). This last convergence will be established by proving that
- (i) For every \(u\in \mathrm {L}_q({\mathbb S}, \mathscr {S},\mu ;\mathbb {R})\), \(\langle \sum _{k=1}^{r_n}Y_{nk},u \rangle \) converges in distribution to \(\langle Y,u \rangle \).
- (ii) The sequence \(\big (\sum _{k=1}^{r_n}Y_{nk}\big )_{n\ge 1}\) is tight in \(\mathrm {L}_p({\mathbb S})\).
To prove (i), we remark that \(\langle \sum _{k=1}^{r_n}Y_{nk},u \rangle \) and \(\langle Y,u \rangle \) are mean zero Gaussian random variables, so the announced convergence in distribution will follow from the convergence of their variances. Using the independence of the Gaussian random variables \(\langle Y_{nk},u \rangle \), \(1\le k\le r_n\), the variance of \(\langle \sum _{k=1}^{r_n}Y_{nk},u \rangle \) is \(\sum _{k=1}^{r_n}\mathrm{E}\langle Y_{nk},u \rangle ^2\), so we just have to prove that
$$\begin{aligned} \sum _{k=1}^{r_n}\mathrm{E}\langle X_{nk},u \rangle ^2 \xrightarrow [n\rightarrow \infty ]{} \mathrm{E}\langle Y,u \rangle ^2. \end{aligned}$$
By Proposition 25, we can replace the \(X_{nk}\)’s by the \(Y_{nk}\)’s in the above convergence which then appears as an obvious consequence of Assumption (c) since \(u\otimes u\) belongs to \(\mathrm {L}_q({\mathbb S}^2)\).
To prove (ii), according to Cremers and Kadelka [9, Th. 2], it suffices to prove that, with \(Y_n:=\sum _{k=1}^{r_n}Y_{nk}\),
$$\begin{aligned} \int _{\mathbb S}\mathrm{E}\vert Y_n(s) \vert ^p\,\mu (\mathrm {d}s) \xrightarrow [n\rightarrow \infty ]{} \int _{\mathbb S}\mathrm{E}\vert Y(s) \vert ^p\,\mu (\mathrm {d}s). \end{aligned}$$
(81)
By Lemmas 26 and 27 (i), for \(\mu \)-almost every \(s\in {\mathbb S}\), \(Y_n(s)\) and Y(s) are centered Gaussian random variables with respective variances \(\sigma _n^2(s)=\sum _{k=1}^{r_n}\mathrm{E}X_{nk}(s)^2\) and \(\sigma ^2(s)\). This implies that
$$\begin{aligned} \mathrm{E}\vert Y_n(s) \vert ^p = m_p\,\sigma _n(s)^p \quad \text {and}\quad \mathrm{E}\vert Y(s) \vert ^p = m_p\,\sigma (s)^p, \end{aligned}$$
where \(m_p:=(2\pi )^{-1/2}\int _{-\infty }^\infty \vert z \vert ^p \exp (-z^2/2)\,\mathrm {d}z\). This way, (81) is reduced to Assumption (d) and the proof is complete. \(\square \)
5 Asymptotic Normality of Weighted Sums
Let \((X_j, j\in \mathbb {Z})\) be a set of \(\mathsf {B}\)-valued random elements. Assume that \(\mathrm{E}(X_j)=0\) and \(\mathrm{E}\Vert X_j\Vert ^2<\infty \) for any \(j\in \mathbb {Z}\). Consider the weighted sums
$$\begin{aligned} \sum _{k=0}^\infty a_{n,k}X_k, \quad n\in \mathbb {N}, \end{aligned}$$
whenever they are well defined, where \(\{(a_{n, k}, k\ge 0), n\in \mathbb {N}\}\subset \mathbb {R}\). We assume that for each \(n\in \mathbb {N}\), \(\sum _{k}a^2_{nk}<\infty \).
Theorem 28
Assume that the Banach space \(\mathsf {B}\) is p-smooth with some \(p>2\). Let \((X_k, k\in \mathbb {Z})\) be i.i.d. \(\mathsf {B}\)-valued random elements and \(Q:=\mathrm {cov}(X_0)\). Assume that
- (i) \(c_n:=\sup _{k\ge 0}|a_{nk}|\rightarrow 0\) as \(n\rightarrow \infty \);
- (ii) \(b_n^2:=\sum _{k\ge 0}a^2_{nk}\rightarrow 1\) as \(n\rightarrow \infty \).
Then, for each n, the series \(\sum _{k}a_{nk}X_k\) converges a.s., and
$$\begin{aligned} \sum _{k=0}^\infty a_{nk}X_k \xrightarrow [n\rightarrow \infty ]{\mathscr {D}} Y_Q, \end{aligned}$$
where \(Y_Q\) is a mean zero Gaussian random element in \(\mathsf {B}\) with covariance operator Q.
Proof
Since the space \(\mathsf {B}\) is of type 2, we get by (40), for any \(1\le l\le m\),
$$\begin{aligned} \mathrm{E}\Big \Vert \sum _{k=l}^{m} a_{nk}X_k\Big \Vert ^2 \le C \sum _{k=l}^{m} a_{nk}^2\, \mathrm{E}\Vert X_0\Vert ^2, \end{aligned}$$
where C depends only on \(\mathsf {B}\).
Due to Condition (ii), this implies that the series \(\sum _{k}a_{nk}X_k\) satisfies Cauchy’s criterion in the space \(\mathrm {L}^1(\varOmega ,\mathscr {F},\mathsf {P};\mathsf {B})\), hence converges in \(\mathrm {L}^1\) and in probability. By independence of its terms, it also converges a.s., according to the Itô–Nisio theorem.
Without loss of generality we assume that \(p=2+\delta \) with some \(\delta \in (0, 1)\). Let \(m\ge 1\). We apply Theorem 22 for \(X_{nk}=a_{nk}X_k\), \(k=0, \dots , m\). In this case, the sum \(\sum _{k=0}^m Y_{nk}\) has the same distribution as \(A_{nm}Y_Q,\) where \(A_{nm}^2=\sum _{k=0}^m a_{nk}^2\), \(A_{nm}\ge 0\). Hence, by (66) in Theorem 21, with \(c:=c(\mathsf {B},\delta )\),
This gives the following bound uniform in \(m\ge 0\):
Now, we estimate
As the random elements \(A_{nm}Y_Q\) and \(Y_Q\) are defined on the same probability space, the expectation \(\mathrm{E}\big (f(A_{nm}Y_Q)-f(Y_Q)\big )\) makes sense and we obtain
because \(\left\| f \right\| _{(p)}\le 1\) implies \(\sup _{x\in \mathsf {B}}\left\| f'(x) \right\| \le 1\). Therefore,
Next, by the regularity property of \(\zeta _p\), see (b) p.15 and the independence of the \(X_k\)’s,
By Taylor’s formula at order 1 with integral remainder, see (68), it is easily seen that if Z is a random element in \(\mathsf {B}\) with \(\mathrm{E}Z=0\) and \(\mathrm{E}\left\| Z \right\| ^2<\infty \), then \(\zeta _p(0,Z)\le \frac{1}{2}\mathrm{E}\left\| Z \right\| ^2\). Therefore,
Finally, by the triangle inequality for the distance \(\zeta _p\), gathering the estimates (84), (85) and (86) gives
where \(u_n(\varepsilon )\) denotes the right-hand side of (84). Using Assumption (ii), letting m tend to infinity in the above inequality gives \( \zeta _p\left( \sum _{k=0}^\infty a_{nk}X_k, Y_Q\right) \le u_n(\varepsilon ) \), whence by (i) and (ii),
By arbitrariness of \(\varepsilon \), we conclude that \(\lim _{n\rightarrow \infty }\zeta _p\left( \sum _{k=0}^\infty a_{nk}X_k, Y_Q\right) =0\). \(\square \)
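Theorem 28 can be illustrated numerically in the scalar case \(\mathsf {B}=\mathbb {R}\). The following Monte Carlo sketch (hypothetical setup: Cesàro-type weights \(a_{n,k}=(n+1)^{-1/2}\) for \(0\le k\le n\), which satisfy (i) and (ii), and centered uniform innovations with \(Q=\mathrm {Var}(X_0)=1/12\)) checks that the weighted sum is approximately \(N(0,Q)\):

```python
import random

random.seed(12345)

def cesaro_weighted_sum(n, rng):
    """Sum_{k=0}^{n} a_{n,k} X_k with Cesaro-type weights a_{n,k} = (n+1)^{-1/2}
    and centered i.i.d. uniform innovations X_k = U_k - 1/2 (variance 1/12)."""
    a = (n + 1) ** -0.5
    return a * sum(rng.random() - 0.5 for _ in range(n + 1))

n, reps = 400, 4000
samples = [cesaro_weighted_sum(n, random) for _ in range(reps)]
mean = sum(samples) / reps
var = sum((s - mean) ** 2 for s in samples) / reps
# Fraction of mass within one standard deviation, ~0.683 for a Gaussian limit:
frac_1sd = sum(1 for s in samples if abs(s - mean) < var ** 0.5) / reps
print(mean, var, frac_1sd)
```

The sample variance should be close to \(1/12\approx 0.0833\) and the one-standard-deviation mass close to the Gaussian value \(0.683\); this is only a plausibility check, of course, not a substitute for the proof above.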
Next, consider a \(\mathsf {B}\)-valued linear process \((X_k, k\in \mathbb {Z})\) defined by
$$\begin{aligned} X_k=\sum _{j=0}^\infty \psi _j(\epsilon _{k-j}), \quad k\in \mathbb {Z}, \end{aligned}$$
(87)
where the innovations \((\epsilon _k, k\in \mathbb {Z})\) are i.i.d. \(\mathsf {B}\)-valued random variables such that \(\mathrm{E}\epsilon _0=0\), \(Q_\epsilon =\mathrm {cov}(\epsilon _0)\), \(0<\sigma ^2:=\mathrm{E}\Vert \epsilon _0\Vert ^2<\infty \) and the linear filter \((\psi _j, j\ge 0)\subset L(\mathsf {B})\) is a sequence of bounded linear operators such that \(\psi _0={{\,\mathrm{Id}\,}}_{\mathsf {B}}\) and
$$\begin{aligned} \sum _{j=0}^\infty \Vert \psi _j\Vert _{L(\mathsf {B})}<\infty . \end{aligned}$$
(88)
This condition ensures the a.s. convergence of the series in (87). In this case, we set \(\varPsi =\sum _{j=0}^\infty \psi _j\).
Theorem 29
Let \(\mathsf {B}\) be a p-smooth Banach space, \(p>2\). Let \((X_k)\) be a linear process defined by (87), where \((\psi _k)\) satisfies (88). Let \((a_{n, j}, j\in \mathbb {Z}, n\in \mathbb {N})\subset \mathbb {R}\) satisfy conditions (i)–(ii) of Theorem 28 and
- (iii) \(\lim _{n\rightarrow \infty }\sum _{k\in \mathbb {Z}}(a_{n,k+1}-a_{n,k})^2=0.\)
Then,
$$\begin{aligned} \sum _{k=0}^\infty a_{nk}X_k \xrightarrow [n\rightarrow \infty ]{\mathscr {D}} \varPsi (Y_{Q_\epsilon }). \end{aligned}$$
Proof
We have
$$\begin{aligned} Z_n:=\sum _{k=0}^\infty a_{nk}X_k = \sum _{j=0}^\infty \psi _j(Z_{nj}), \end{aligned}$$
where \(Z_{nj}:=\sum _{k=0}^\infty a_{nk}\epsilon _{k-j}\). Writing \(Z_n=Z_n'+Z_n''\), where
$$\begin{aligned} Z_n' := \sum _{j=0}^\infty \psi _j(Z_{nj}-Z_{n0}), \qquad Z_n'' := \varPsi (Z_{n0}), \end{aligned}$$
we consider each \(Z_n'\) and \(Z_n''\) separately. By Theorem 28, \(Z_{n0}=\sum _{k=0}^\infty a_{nk}\epsilon _k \xrightarrow [n\rightarrow \infty ]{\mathscr {D}} Y_{Q_\epsilon }\), whence by continuity of \(\varPsi \),
$$\begin{aligned} Z_n'' \xrightarrow [n\rightarrow \infty ]{\mathscr {D}} \varPsi (Y_{Q_\epsilon }). \end{aligned}$$
To complete the proof, we show that \(Z_n'\xrightarrow [n\rightarrow \infty ]{\mathrm {P}}0\). To this aim, let us assume for a moment that the following two properties hold true:
$$\begin{aligned} \sup _{n\ge 1,\, j\ge 0}\mathrm{E}\left\| Z_{nj} \right\| ^2<\infty \end{aligned}$$
(89)
and
$$\begin{aligned} \text {for each } j\ge 0,\quad \mathrm{E}\left\| Z_{nj}-Z_{n0} \right\| ^2 \xrightarrow [n\rightarrow \infty ]{} 0. \end{aligned}$$
(90)
Let \(\varepsilon >0\) and \(J\in \mathbb {N}\). Splitting \(Z'_n\) in two sums indexed by \(j\le J\) and \(j>J\), leads to
Applying Markov inequality at order one gives
By (89) and (88), taking \(J\in \mathbb {N}\) large enough, one can make the right side of the preceding bound as small as one wishes. Then, the first probability on the right side of (91) is as small as one wishes by (90), taking \(n\in \mathbb {N}_{+}\) large enough. Therefore, \(Z_n'\xrightarrow [n\rightarrow \infty ]{\mathrm {P}}0\) holds true subject to the forthcoming proof of (89) and (90).
For (89), as the sequence \((\epsilon _i)_{i\in \mathbb {Z}}\) is i.i.d., it is clear that for each \(n\ge 1\), all the \(Z_{nj}\) have the same distribution, so it suffices to check that \(\sup _{n\ge 1}\mathrm{E}\left\| Z_{n0} \right\| ^2<\infty \). As \(\mathsf {B}\) is of type 2, it easily follows from (39) and the equidistribution of the independent \(\epsilon _k\) that
Hence, (89) results from Assumption (ii).
To check (90), we show that in the decomposition
$$\begin{aligned} Z_{nj}-Z_{n0} = \sum _{i=-j}^{-1} a_{n,i+j}\epsilon _i + \sum _{i=0}^{\infty }(a_{n,i+j}-a_{n,i})\epsilon _i, \end{aligned}$$
both sums converge to zero in quadratic mean. For the first one,
both sums converge to zero in quadratic mean. For the first one,
which tends to zero as n goes to infinity by Assumption (i). For the second sum,
Let us denote by d the Euclidean distance in the sequence space \(\ell ^2(\mathbb {N})\). Then,
which tends to zero as n tends to infinity by Assumption (iii). Hence, (90) is established and the proof is complete. \(\square \)
Examples of summation methods \((a_{n,k}, k\in \mathbb {N}, n\in \mathbb {N})\) that satisfy conditions (i)–(iii) include the following (see [10] for more examples):
Cesàro summation corresponding to
$$\begin{aligned} a_{n,k} = {\left\{ \begin{array}{ll} (n+1)^{-1/2}, &{} 0\le k\le n,\\ 0, &{} k>n. \end{array}\right. } \end{aligned}$$
Here, \(c_n=(n+1)^{-1/2}\), \(b_n^2=1\) and \(\sum _{k=0}^\infty (a_{n,k+1}-a_{n,k})^2=1/(n+1)\).
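With weights constant equal to \((n+1)^{-1/2}\) for \(0\le k\le n\) and zero beyond (the form consistent with the values of \(c_n\), \(b_n^2\) and the squared increments just stated), conditions (i)–(iii) can be verified mechanically:

```python
def cesaro_weights(n, N):
    """First N+2 Cesaro weights: a_{n,k} = (n+1)^{-1/2} for 0 <= k <= n, else 0."""
    return [(n + 1) ** -0.5 if k <= n else 0.0 for k in range(N + 2)]

n = 99
a = cesaro_weights(n, 10 * n)
c_n = max(abs(x) for x in a)                                   # (i): -> 0
b_n2 = sum(x * x for x in a)                                   # (ii): = 1 exactly
d_n = sum((a[k + 1] - a[k]) ** 2 for k in range(len(a) - 1))   # (iii): = 1/(n+1)
print(c_n, b_n2, d_n)  # (n+1)^{-1/2}, 1, 1/(n+1)
```

The only nonzero increment over \(k\ge 0\) is the drop from \(a_{n,n}\) to \(a_{n,n+1}=0\), which accounts for the value \(1/(n+1)\).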
Abel summation corresponding to
where \(\lambda _n\rightarrow \infty \) as \(n\rightarrow \infty \). In this case,
Borel summation corresponding to
To check (i), recalling that \(\max _{k\ge 0}P(N=k)\), where the random variable N has the Poisson distribution with parameter \(\lambda _n\), equals \(P(N=m)\) with m the integer such that \(m\le \lambda _n<m+1\), we get
By Stirling’s formula, \(m! = \sqrt{2\pi }\,m^{m+1/2}\mathrm {e}^{-m}(1+\delta _m)\) with \(\lim _{m\rightarrow \infty }\delta _m=0\), whence
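The two facts used here, that the Poisson probability mass function peaks at \(m=[\lambda ]\) and that \(P(N=m)\approx (2\pi \lambda )^{-1/2}\) by Stirling’s formula, are easy to confirm numerically (a small sketch with the hypothetical value \(\lambda =200.5\)):

```python
import math

def poisson_pmf(k, lam):
    """P(N = k) for N ~ Poisson(lam), computed via logarithms for stability."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

lam = 200.5
pmf = [poisson_pmf(k, lam) for k in range(2000)]
m = max(range(2000), key=lambda k: pmf[k])
ratio = pmf[m] * math.sqrt(2 * math.pi * lam)
print(m)      # the mode m satisfies m <= lam < m + 1, i.e. m = 200
print(ratio)  # close to 1, as predicted by Stirling's formula
```

Since \(P(N=k)/P(N=k-1)=\lambda /k\), the pmf increases while \(k\le \lambda \) and decreases afterwards, which is why the mode is \([\lambda ]\).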
To check (ii), we refer to [10] where it is proved by using the Bessel function of the first kind, see (2.8) and (2.9) therein.
To check (iii), we note first that
where \(f_\lambda (k):=(\lambda (k+1)^{-1}-1)^2\) and \(\mu _{\lambda }\) is the discrete measure
In what follows, we simplify the notations by replacing \(\lambda _n\) (\(n\rightarrow \infty \)) by \(\lambda \) (\(\lambda \rightarrow \infty \)). It is easily seen that the peak of the point masses of \(\mu _{\lambda }\) is at \(k=[\lambda ]\). We will use the following estimates for the left and right tails of \(\mu _{\lambda }\), obtained by comparison with geometric sums.
Let \(1/2<\tau <1\). We split \(\int _{\mathbb {N}}\) in \( \int _L + \int _C + \int _R \) with left, center and right intervals \(L:=\mathbb {N}\cap [0,\lambda - \lambda ^\tau ]\), \(C:=\mathbb {N}\cap (\lambda - \lambda ^\tau , \lambda + \lambda ^\tau )\), \(R:=\mathbb {N}\cap [\lambda + \lambda ^\tau ,\infty )\).
Estimation of \(\int _L f_\lambda \,\mathrm {d}\mu _\lambda \). Let j be the unique integer such that \(j\le \lambda - \lambda ^\tau <j+1\). As \(f_\lambda (k)\le \lambda ^2\) on L and taking (92) into account,
We note that
By Stirling’s formula, \(((j+1)!)^2 = 2\pi (j+1)^{2(j+1)+1}\mathrm {e}^{-2(j+1)}(1+\delta _{j+1})^2\), so as \(j+1>\lambda - \lambda ^\tau \), \(((j+1)!)^2\ge 2\pi (\lambda - \lambda ^\tau )^{2\lambda - 2\lambda ^\tau +1}\mathrm {e}^{-2\lambda + 2\lambda ^\tau - 2}(1+\delta _{j+1})^2\), whence
where \(T_1(\lambda ):= \lambda \mathrm {e}^{-2\lambda ^\tau }(1-\lambda ^{\tau -1})^{2\lambda ^\tau -2\lambda }\). Next, using \(\ln (1-t)= -t-t^2/2 + o(t^2)\) as \(t\rightarrow 0\),
since \(\lambda ^{3\tau -2}=o(\lambda ^{2\tau -1})\). As \(2\tau -1>0\), for any \(a>0\) and \(c\in (0,1)\), \(\lambda ^aT_1(\lambda ) = O(\exp (-c\lambda ^{2\tau -1}))\), whence
Estimation of \(\int _C f_\lambda \,\mathrm {d}\mu _\lambda \). One easily checks that for \(k\in C\), \( f_\lambda (k) \le \left( \frac{\lambda ^\tau }{\lambda - \lambda ^\tau }\right) ^2 \sim \lambda ^{2\tau - 2} \). As \(\mu _\lambda (C) < \mu _\lambda (\mathbb {N})\sim 1\), this gives
Estimation of \(\int _R f_\lambda \,\mathrm {d}\mu _\lambda \). For \(k\ge \lambda + \lambda ^\tau \), \(0<\lambda (k+1)^{-1} <1\) whence \(f_\lambda (k)<1\) so
Denoting by j the unique integer such that \(j\le \lambda + \lambda ^\tau < j+1\), (93) gives
By monotonicity of the function \(s\mapsto s/(s-a)\) on \((a,\infty )\), the last factor is estimated as
By Stirling’s formula, \(((j+1)!)^2\ge 2\pi (\lambda + \lambda ^\tau )^{2\lambda + 2\lambda ^\tau +1}\mathrm {e}^{-2\lambda - 2\lambda ^\tau - 2}(1+\delta _{j+1})^2\), so
where \(T_2(\lambda ):=\lambda \mathrm {e}^{2\lambda ^\tau } (1+\lambda ^{\tau -1})^{-2\lambda -2\lambda ^\tau }\). Since \(\ln (1+t)\ge t - t^2/2\) for \(t\ge 0\),
As \(2\tau -1>0\), for any \(a>0\) and \(c\in (0,1)\), \(\lambda ^aT_2(\lambda ) = O(\exp (-c\lambda ^{2\tau -1}))\), whence
Gathering all estimates gives \(\int _\mathbb {N}f_\lambda \,\mathrm {d}\mu _\lambda = O\left( \lambda ^{2\tau - 2}\right) = o(1)\), concluding the check of (iii).
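The decay rate \(O(\lambda ^{2\tau -2})\), essentially of order \(1/\lambda \), can also be observed numerically. In the sketch below, the point masses of \(\mu _\lambda \) are taken proportional to \(\lambda ^{2k}/(k!)^2\), an assumption suggested by the factorial estimates above (the display defining \(\mu _\lambda \) is not reproduced here); self-normalizing makes the unknown constant irrelevant to the decay rate:

```python
import math

def borel_variation_integral(lam):
    """Normalized evaluation of int f_lam d(mu_lam), with f_lam(k) = (lam/(k+1) - 1)^2
    and point masses of mu_lam assumed proportional to lam^(2k) / (k!)^2."""
    K = int(4 * lam) + 50
    logw = [2 * k * math.log(lam) - 2 * math.lgamma(k + 1) for k in range(K)]
    mx = max(logw)
    w = [math.exp(lw - mx) for lw in logw]  # numerically stable unnormalized masses
    f = [(lam / (k + 1) - 1.0) ** 2 for k in range(K)]
    return sum(fi * wi for fi, wi in zip(f, w)) / sum(w)

vals = [borel_variation_integral(lam) for lam in (25.0, 100.0, 400.0)]
print(vals)  # roughly of order 1/(2*lam), decaying as lambda grows
```

Each fourfold increase of \(\lambda \) divides the integral by roughly four, consistent with an \(O(\lambda ^{-1+\varepsilon })\) bound.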
Data Availability
Data sharing not applicable to this article as no datasets were generated or analyzed during the current study.
References
Banna, M.: Limiting spectral distribution of Gram matrices associated with functionals of mixing processes. J. Math. Anal. Appl. 433(1), 416–433 (2016)
Banna, M., Merlevède, F.: Limiting spectral distribution of large sample covariance matrices associated with a class of stationary processes. J. Theor. Probab. 28, 745–783 (2015)
Banna, M., Merlevède, F., Peligrad, M.: On the limiting spectral distribution for a large class of symmetric random matrices with correlated entries. Stoch. Process. Appl. 125(7), 2700–2726 (2015)
Bentkus, V., Götze, F., Paulauskas, V., Račkauskas, A.: The accuracy of Gaussian approximation in Banach spaces. Itogi Nauki i Techniki 81, 39–139 (1991)
Billingsley, P.: Convergence of Probability Measures. Wiley, New York (1968)
Cartan, H.: Calcul Différentiel. Formes Différentielles. Hermann, Paris (1967)
Chatterjee, S.: A generalization of the Lindeberg principle. Ann. Probab. 34(6), 2061–2076 (2006)
Chevet, S.: Compacité dans l’espace des probabilités de Radon gaussiennes sur un Banach. C. R. Acad. Sci. Paris Sér. I Math. 296, 275–278 (1983)
Cremers, H., Kadelka, D.: On weak convergence of integral functionals of stochastic processes with applications to processes taking paths in \({\rm L}_p^{E}\). Stoch. Process. Appl. 21, 305–317 (1986)
Embrechts, P., Maejima, M.: The central limit theorem for summability methods of I.I.D. random variables. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete 68, 191–204 (1984)
Fraenkel, L.E.: Formulae for high derivatives of composite functions. Math. Proc. Camb. Philos. Soc. 83, 159–165 (1978)
Giné, E., León, J.R.: On the central limit theorem in Hilbert space. Stochastica 4(1), 43–71 (1980)
Giné, E., Nickl, R.: Mathematical Foundations of Infinite-Dimensional Statistical Models. Cambridge University Press (2016)
Hájek, P., Johanis, M.: Smooth Analysis in Banach Spaces, De Gruyter Series in Nonlinear Analysis and Applications, vol. 19. De Gruyter (2014)
Hoffmann-Jørgensen, J., Pisier, G.: The law of large numbers and the central limit theorem in Banach spaces. Ann. Probab. 4(4), 587–599 (1976)
Hytönen, T., van Neerven, J., Veraar, M., Weiss, L.: Analysis in Banach Spaces. Springer (2016)
Ledoux, M., Talagrand, M.: Probability in Banach Spaces. Springer-Verlag, Berlin (1991)
Lindeberg, J.W.: Eine neue Herleitung des Exponentialgesetzes in der Wahrscheinlichkeitsrechnung. Math. Zeit. 15, 211–225 (1922)
Paulauskas, V., Račkauskas, A.: Approximation Theory in the Central Limit Theorem. Exact Results in Banach Spaces. Kluwer Academic Publishers (1989)
Pisier, G.: Martingales with values in uniformly convex spaces. Israel J. Math. 20(3–4), 326–350 (1975)
Ranga Rao, R.: Relations between weak and uniform convergence of measures with applications. Ann. Math. Stat. 33(2), 659–680 (1962)
Rosiński, J.: Central limit theorem for dependent random vectors in Banach spaces. Lect. Notes Math. 939, 157–180 (1982)
Sundaresan, K.: Smooth Banach spaces. Math. Annalen 173, 191–199 (1967)
Tao, T., Vu, V.: Random matrices: universality of local eigenvalue statistics. Acta Math. 206(1), 127–204 (2011)
Tao, T., Vu, V.: Random matrices: the universality phenomenon for Wigner ensembles. In: Proceedings of Symposia in Applied Mathematics, vol. 72, pp. 121–172 (2014)
Zolotarev, V.M.: Ideal metrics in the problem of approximating the distributions of sums of independent random variables. Theor. Probab. Appl. 22(3), 433–449 (1977)
Funding
The research of Alfredas Račkauskas is supported by the Research Council of Lithuania, Grant No. S-MIP-17-76.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Račkauskas, A., Suquet, C. Asymptotic Normality in Banach Spaces via Lindeberg Method. J Theor Probab 36, 409–455 (2023). https://doi.org/10.1007/s10959-022-01177-x