1 Introduction

Let \(\Omega \) be a compact Hausdorff space, and let \(\mathcal R_p(\Omega )\) denote the totality of Radon probability measures on \(\Omega \). For an \(f \in C(\Omega )\) and \(n \in \mathbb N\), we prescribe a set of n functions \(\psi _{n,1},\ldots , \psi _{n,n} \in C(\Omega )\), and define stochastic quasi-interpolants \(Q^X_n(f)\) of f by

$$\begin{aligned} Q^X_n(f)(x): = \sum ^n_{j=1}f(X_{n,j})\psi _{n,j}(x), \quad x \in \Omega . \end{aligned}$$

Here \(X_{n,1},\ldots , X_{n,n}\) are random variables taking values in \(\Omega \) and obeying the respective laws \(\{\nu _{n,j}\}_{j=1}^n \subset \mathcal R_p(\Omega )\), so that \(\{f(X_{n,j})\psi _{n,j}\}_{j=1}^n \) are \( C(\Omega )\)-valued random variables. The design of the laws caters to the problem at hand and may require some ingenuity. Ideally, one selects functions \(\psi _{n,1},\ldots , \psi _{n,n} \in C(\Omega )\) following the “partition of unity” principle. That is,

$$\begin{aligned} \sum ^n_{j=1} \psi _{n,j} (x) =1, \quad x \in \Omega , \quad n \in \mathbb N. \end{aligned}$$

In many theoretical and practical problems, however, an exact partition of unity may be either hard to obtain or cumbersome to implement. One then uses a set of functions that forms an approximate partition of unity, in the sense that the functions \(\sum ^n_{j=1} \psi _{n,j} \) approximate the constant function 1 within a desired error bound under an appropriately selected topology on \(C(\Omega )\). We will call this general type of stochastic approximation scheme “stochastic quasi-interpolation”.
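
To fix ideas, here is a minimal numerical sketch of a stochastic quasi-interpolant on \(\Omega = [0,1]\) (our illustration, not drawn from the references): the \(\psi _{n,j}\) are piecewise-linear “hat” functions on a uniform grid, which form an exact partition of unity, and the laws \(\nu _{n,j}\) are taken to be uniform jitters around the grid points. All function names and the jitter size are our own choices.

```python
import numpy as np

def hat_basis(n, x):
    # psi_{n,j}: piecewise-linear "hat" functions on the uniform grid
    # t_j = (j - 1)/(n - 1); they form an exact partition of unity on [0, 1]
    t = np.linspace(0.0, 1.0, n)
    h = 1.0 / (n - 1)
    return np.maximum(0.0, 1.0 - np.abs(x[None, :] - t[:, None]) / h)

def stochastic_quasi_interpolant(f, n, x, rng, jitter=0.02):
    # nodes X_{n,j}: grid points perturbed by uniform noise, clipped to [0, 1];
    # this is one simple (assumed) choice of the laws nu_{n,j}
    t = np.linspace(0.0, 1.0, n)
    knots = np.clip(t + rng.uniform(-jitter, jitter, n), 0.0, 1.0)
    return f(knots) @ hat_basis(n, x)

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 201)
Q = stochastic_quasi_interpolant(np.sin, 40, x, rng)
print(np.max(np.abs(Q - np.sin(x))))                     # uniform error of Q^X_n(f)
print(np.max(np.abs(hat_basis(40, x).sum(axis=0) - 1)))  # partition-of-unity check
```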

Let \(\mu \) be the uniform probability measure on \(\Omega \) as defined by Niederreiter [27]. For \(1 \le p \le \infty \), let \(L^p(\Omega )\) denote the Banach space consisting of all real-valued Borel measurable functions f on \(\Omega \) for which \(\Vert f\Vert _p < \infty \). Here \(\Vert f\Vert _p:=\Vert f\Vert _{L^p(\Omega )}\) is defined by:

$$\begin{aligned} \Vert f\Vert _{L^p(\Omega )}= \begin{cases} {\displaystyle \left( \int _{\Omega } |f(x)|^p \mathrm{{d}} \mu (x)\right) ^{1/p},} & \text {if } 1 \le p < \infty , \\ \inf \{C: |f(x)| \le C \; \text {almost surely with respect to } \mu \}, & \text {if } p=\infty . \end{cases} \end{aligned}$$

Let \(\mathbb P(A)\) and \(\mathbb E(Z)\) denote, respectively, the probability of the event A and the expectation of the random variable Z. A coveted type of estimate in the study of stochastic quasi-interpolation is the Gaussian-style \(L^p\)-concentration inequality for target functions f of interest: for any given \(\epsilon >0\), there holds

$$\begin{aligned} \mathbb P\{ \Vert Q^X_n(f) -f \Vert _\infty > \epsilon \} \le c_1 \exp \left( -c_2 n^\alpha \ \epsilon ^2\right) . \end{aligned}$$

Here \(\alpha \ge 1,\) and \(c_1, c_2\) are absolute constants, which can be made explicit with some effort. As a corollary of the above inequality, we derive

$$\begin{aligned} \mathbb E\left( \Vert Q^X_n(f) -f \Vert _\infty \right) \le c_3\ n^{-\alpha /2}, \end{aligned}$$
(1.1)

where \(c_3\) is a constant depending only on \(c_1\) and \(c_2\). This amounts to saying that the “average approximation order” of the stochastic quasi-interpolation scheme is \(n^{-\alpha /2}\). Of course, one needs to carry out Bochner integration of the underlying Banach-space-valued random variables en route. Under certain circumstances, it may be impossible to obtain the desired \(L^\infty \)-concentration inequalities. One then settles for the next best thing: for some p in \([1,\infty )\) and any given \(\epsilon >0\), there holds

$$\begin{aligned} \mathbb P\{ \Vert Q^X_n(f) -f \Vert _p > \epsilon \} \le c_1 \exp \left( -c_2 n^\alpha \ \epsilon ^2\right) . \end{aligned}$$

Since \(\mu \) is a probability measure, \(\Vert f\Vert _{p_2} \le \Vert f\Vert _{p_1}\) whenever \(1 \le p_2 \le p_1\); hence, if the above inequality holds true for a \(p_1 \in [1,\infty ),\) then it does for all \(p_2\) in the range \(1 \le p_2 \le p_1.\) Many theoretical and practical problems can be modeled under this stochastic quasi-interpolation framework, and the literature abounds with implicit applications of stochastic quasi-interpolation techniques. Notably, Wagner [37], and Bourgain and Lindenstrauss [7] applied a stochastic quasi-interpolation method to show the existence of certain node sets giving rise to a near-optimal number of equal-length segments in Minkowski sums needed to approximate a Euclidean ball in the Hausdorff metric within a prescribed error bound; see also [5, 6]. Gao et al. [21] studied the case in which \( X_{n,1}, \ldots , X_{n,n}\) are n independent copies of a random variable uniformly distributed in \(\Omega .\)

To mitigate uncertainties brought on by unreliable sampling sites, Wu et al. [38] proposed and studied a class of stochastic Bernstein polynomials

$$\begin{aligned} (B^X_n f)(x):= \sum ^n_{k=0} f(X_{n,k})\ p_{n,k}(x), \quad n \in \mathbb N, \end{aligned}$$
(1.2)

in which \(X_{n,0}, X_{n,1}, \cdots , X_{n,n}\) are the order statistics (see [9, 14]) of \((n+1)\) independent copies of the random variable uniformly distributed in (0, 1), and

$$\begin{aligned} p_{n,k}(x) = \binom{n}{k} x^k (1-x)^{n-k}, \quad 0 \le k \le n, \quad 0 \le x \le 1. \end{aligned}$$

These are stochastic cousins of the classical Bernstein polynomial \(B_n f\) defined by

$$\begin{aligned} (B_n f)(x):= \sum ^n_{k=0} f\left( \frac{k}{n}\right) \ p_{n,k}(x), \quad n \in \mathbb N. \end{aligned}$$
(1.3)
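
The following sketch (ours; it assumes only numpy) implements both the stochastic polynomials (1.2) and the classical polynomials (1.3) side by side.

```python
import numpy as np
from math import comb

def bernstein_basis(n, x):
    # p_{n,k}(x) = binom(n, k) x^k (1 - x)^{n - k}, shape (n + 1, len(x))
    k = np.arange(n + 1)
    coef = np.array([comb(n, i) for i in k], dtype=float)
    return coef[:, None] * x ** k[:, None] * (1.0 - x) ** (n - k)[:, None]

def classical_bernstein(f, n, x):
    # (1.3): deterministic knots k/n
    return f(np.arange(n + 1) / n) @ bernstein_basis(n, x)

def stochastic_bernstein(f, n, x, rng):
    # (1.2): knots are the order statistics of (n + 1) independent Uniform(0, 1) draws
    knots = np.sort(rng.uniform(0.0, 1.0, n + 1))
    return f(knots) @ bernstein_basis(n, x)

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 201)
f = lambda t: np.abs(t - 0.5)
print(np.max(np.abs(classical_bernstein(f, 50, x) - f(x))))
print(np.max(np.abs(stochastic_bernstein(f, 50, x, rng) - f(x))))
```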

Stochastic Bernstein polynomials have a simple structure and are therefore nimble in a wide range of applications. Furthermore, the inherent randomness makes them suitable for Monte Carlo simulations [34]. The authors of [1, 20, 33, 39] established algebraic and exponential rates for the probabilistic convergence of these stochastic Bernstein polynomials.

The authors of [34] introduced the notion of “\(L^p\)-probabilistic convergence” (\(1 \le p \le \infty \)) of stochastic Bernstein polynomials (Definition 2.1) and established various \(L^p\)-probabilistic convergence rates, a highlight of which is as follows. For any given \(\varepsilon >0\), any p in the range \(1 \le p \le 2\), and any \( f \in C[0,1]\), the inequality

$$\begin{aligned} \mathbb {P}\left\{ \left\| B_{n}^{X} f-f\right\| _{p}>\varepsilon \right\} \le 2 \exp \left[ -\frac{\varepsilon ^2}{4\ \omega ^2\left( f, \frac{1}{\sqrt{n}}\right) }\right] , \end{aligned}$$
(1.4)

holds true under the assumption that

$$\begin{aligned} \omega \left( f, \frac{1}{\sqrt{n}} \right) < \frac{\varepsilon }{6.2}, \end{aligned}$$
(1.5)

in which \(\omega \left( f, \cdot \right) \) denotes the modulus of continuity of f defined by

$$\begin{aligned} \omega (f, h):=\sup _{\begin{array}{c} 0\le x, y\le 1\\ |x-y|\le h \end{array}}\left| f(x)-f(y)\right| , \ \ 0\le h\le 1. \end{aligned}$$
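
For experimentation, \(\omega (f, h)\) can be approximated on a grid. A minimal sketch (ours; the grid size is an assumption):

```python
import numpy as np

def modulus_of_continuity(f, h, m=4000):
    # grid approximation of omega(f, h) = sup {|f(x) - f(y)| : |x - y| <= h}
    x = np.linspace(0.0, 1.0, m + 1)
    fx = f(x)
    smax = int(np.floor(h * m))
    if smax < 1:
        return 0.0
    return max(np.max(np.abs(fx[s:] - fx[:-s])) for s in range(1, smax + 1))

# f(x) = |x - 1/2| is Lipschitz with constant 1, so omega(f, h) = h:
print(modulus_of_continuity(lambda t: np.abs(t - 0.5), 0.1))  # ~ 0.1
```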

These are Gaussian-type concentration inequalities expressed in terms of the modulus of continuity, which in the current article we call \(L^p\)-concentration inequalities (or simply concentration inequalities) for stochastic Bernstein polynomials. However, both the restriction \(1 \le p \le 2\) and the assumption (1.5) are impractical. Concerning the latter, we have received constructive feedback from practitioners in the field, the gist of which is as follows. In modeling many real-world problems, a priori information on the smoothness of an unknown target function is often elusive. Thus, assumption (1.5) is sometimes impossible to verify beforehand, which has hindered applications and overshadowed the effectiveness of estimate (1.4).

The proof given in [34] depends on assumption (1.5) because it requires harnessing the approximation power of the classical Bernstein polynomial \(B_n f\). In so doing, the authors of [34] inadvertently followed a familiar methodology in the learning theory paradigm advocated by Cucker and Smale [12], and Cucker and Zhou [13], which decomposes an overall error into two parts: the approximation error and the sampling error. Another caveat of the approach is that it implicitly suggests that no other type of Bernstein polynomials (spawned from their stochastic cousins) will ever outperform the classical ones. This is not true. Guided by theoretical analysis, we carried out many rounds of numerical simulations, which show that Bernstein polynomials built upon certain carefully designed triangular arrays of knots converge to a Heaviside-type function exponentially fast in the space \(L^1 [0,1]\). We will study this interesting problem in the near future.

Approximation theorists are interested in knowing the average approximation order of the stochastic Bernstein polynomials (1.2). For this goal, assumption (1.5) is nearly harmless. The main result of Sect. 4 (Theorem 4.2) asserts that an inequality of the type (1.4) holds true for all \(1 \le p \le \infty \) under an assumption similar to (1.5). This implies the following inequality (see Corollary 4.3):

$$\begin{aligned} \mathbb E\left( \left\| B_{n}^{X} f-f\right\| _{p}\right) \le C\ \omega \left( \frac{1}{\sqrt{n}} \right) , \quad 1 \le p \le \infty , \end{aligned}$$

where C is an absolute constant, which will be specified in Corollary 4.3, and \(\omega (h)\) is an abbreviation for \(\omega (f, h)\). (We will use this abbreviation throughout the article.) The above inequality implies that the expected (average) \(L^p\)-approximation order \((1 \le p \le \infty )\) of stochastic Bernstein polynomials is the same as that achieved by the classical Bernstein polynomial \(B_n f\). This implication does not depend on assumption (1.5).

In summary, there are good arguments to be made on both sides (with or without assumption (1.5)) insofar as \(L^p\)-concentration inequalities are concerned. Henceforth, we will use the phrase “unconditional \(L^p\)-concentration inequalities” to refer to estimates obtained without using assumption (1.5). To facilitate broader applications of stochastic Bernstein polynomials, we will push forward as much as we can to prove unconditional \(L^p\)-concentration inequalities, using assumption (1.5) only as a last resort. Departing from assumption (1.5) increases the level of difficulty. For p in the range \(1 \le p \le 2\), Gaussian bounds for the mean square beta distribution [3, p. 125] play a key role in our proof. The same method does not carry over to the cases \(2< p <\infty \), for which our proof is much more involved, calling for a host of techniques including a crucial application of the Dvoretzky–Kiefer–Wolfowitz inequality [17, 26]. Finally, we prove an \(L^\infty \)-concentration inequality under a slightly weaker version of assumption (1.5).

Dvoretzky et al. [17] proved the following remarkable result. Let \(X_1, X_2,\ldots , X_n\) be real-valued independent and identically distributed random variables with c.d.f. (cumulative distribution function) F. Let \(F_n\) denote the empirical distribution function associated with F, defined by

$$\begin{aligned} F_{n}(x)={\frac{1}{n}}\sum _{i=1}^{n}\mathbbm {1} _{\{X_{i}\le x\}},\qquad x\in \mathbb {R}. \end{aligned}$$

Then for any given \(\varepsilon > 0\), there holds:

$$\begin{aligned} \mathbb P{\Bigl (}\sup _{x\in \mathbb {R} }|F_{n}(x)-F(x)|>\varepsilon {\Bigr )}\le 2\ e^{-2n\varepsilon ^{2}}. \end{aligned}$$
(1.6)

We remark that Dvoretzky, Kiefer, and Wolfowitz only established the above inequality with an unspecified multiplicative constant C. The inequality with the sharp constant \(C=2\) was proved by Massart [26], confirming a conjecture due to Birnbaum and McCarty [2]; it has since been known as the Dvoretzky–Kiefer–Wolfowitz inequality.
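
A quick Monte Carlo sanity check of (1.6) for the uniform distribution (our sketch; it uses the standard fact that the supremum of \(|F_n - F|\) is attained at the sample points):

```python
import numpy as np

def ks_uniform(n, rng):
    # sup_x |F_n(x) - x| for n i.i.d. Uniform(0, 1) samples
    x = np.sort(rng.uniform(0.0, 1.0, n))
    i = np.arange(1, n + 1)
    return max(np.max(i / n - x), np.max(x - (i - 1) / n))

rng = np.random.default_rng(0)
n, eps, trials = 200, 0.1, 20000
freq = np.mean([ks_uniform(n, rng) > eps for _ in range(trials)])
print(freq, 2 * np.exp(-2 * n * eps ** 2))  # empirical frequency vs. the DKW bound
```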

The current investigation has drawn inspiration from previous research in the theory of random approximation of functions; see [10, 18, 19, 23, 24, 29, 30, 36]. We mention in particular the idea of employing Bernstein polynomials for density and distribution estimation on the interval [0, 1]:

$$\begin{aligned} \widetilde{F}_m(x) = \sum ^n_{k=0} \widehat{F}_m \left( \frac{k}{n}\right) \ p_{n,k}(x), \end{aligned}$$

where \(m, n \in \mathbb N\), and \(\widehat{F}_m\) is the empirical distribution of order m. There are close relationships as well as fundamental differences between the two classes of stochastic Bernstein polynomials. Addressing these interesting topics, however, will take us far afield.

The arrangement of the current article is as follows. In Sect. 2, we prove unconditional \(L^p\)-concentration inequalities for stochastic Bernstein polynomials (1.2) for p in the range \(1 \le p \le 2\). Among other results, we obtain the unconditional version of inequality (1.4) using the theory of the mean square beta distribution. In Sect. 3, we prove unconditional \(L^p\)-concentration inequalities for p in the range \(2< p < \infty \). In Sect. 4, we prove an \(L^\infty \)-concentration inequality under an assumption similar to (1.5). As a corollary, we show that for all p in the range \(1 \le p \le \infty \), the average \(L^p\)-approximation order of stochastic Bernstein polynomials is the same as that achieved by the classical Bernstein polynomial \(B_n f\) under a mild restriction.

2 Mean Square Beta Distribution and Unconditional \(L^p\)-Concentration Inequalities

Let X be a random variable uniformly distributed in (0, 1), and let \(X_{n,0}, X_{n,1},\ldots , X_{n,n}\) be the order statistics of \((n+1)\) independent copies of X. Then \(X_{n,k}\) follows the beta distribution \(\mathrm{Beta}(k+1, n-k+1)\) with parameters \((k+1)\) and \((n-k+1)\), which has density function

$$\begin{aligned} (n+1)\ p_{n,k}(x), \quad 0 \le x \le 1, \quad 0 \le k \le n. \end{aligned}$$

Beta distributions belong to the wider class of sub-Gaussian distributions, which enjoy Gaussian-type probability concentration inequalities; see [4, 8, 28]. For a fixed \(0 \le x \le 1\), we have

$$\begin{aligned} p_{n,k}(x) \ge 0, \quad \text {and} \quad \sum ^n_{k=0} p_{n,k}(x)=1. \end{aligned}$$

Thus, for each fixed x, the random variable \((B^X_n f)(x)\) can be studied effectively as a convex combination of the \((n+1)\) random variables \(f(X_{n,k})\).
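
A numerical sanity check of the \(\mathrm{Beta}(k+1, n-k+1)\) law of \(X_{n,k}\) (our sketch; the reference values are the standard beta-distribution moments):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, trials = 20, 5, 200000
# k-th order statistic of (n + 1) Uniform(0, 1) draws, indexed from 0
xk = np.sort(rng.uniform(0.0, 1.0, (trials, n + 1)), axis=1)[:, k]
mean = (k + 1) / (n + 2)                                # Beta(k+1, n-k+1) mean
var = (k + 1) * (n - k + 1) / ((n + 2) ** 2 * (n + 3))  # and variance
print(xk.mean(), mean)
print(xk.var(), var)
```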

For \(1 \le p \le \infty \) and \(f \in C[0,1]\), we define

$$\begin{aligned} \Vert f\Vert _p:= \begin{cases} {\displaystyle \left( \int ^1_0 |f(x)|^p dx \right) ^{1/p},} & \text {if } 1 \le p < \infty , \\ {\displaystyle \max _{x \in [0,1]}|f(x)|}, & \text {if } p=\infty . \end{cases} \end{aligned}$$

Definition 2.1

Let \(1 \le p \le \infty \) and \(f\in C[0, 1]\) be given. If, for every \(\varepsilon >0\),

$$\begin{aligned} \lim _{n \rightarrow \infty } \mathbb {P}\{\Vert B^X_nf-f\Vert _p >\varepsilon \} =0, \end{aligned}$$

then we say that \(B^X_nf\) converges to f in probability under the \(L^p\)-norm. We also refer to such probabilistic convergence as “\(L^p\)-probabilistic convergence” of \(B^X_nf\) to f.

The bulk of our effort in the current article is devoted to finding exponential decay rates (in terms of the modulus of continuity of \(f \in C[0,1]\)) for \(L^p\)-probabilistic convergence. We use throughout the article two basic properties of the modulus of continuity. The first is “positivity”: if \(f \in C[0,1]\) is not a constant, then \(\omega (h)>0\) for \(0<h\le 1\). The second is “subadditivity”, as expressed in the following inequality:

$$\begin{aligned} \omega (\lambda \delta )\le (1+\lambda )\ \omega (\delta ),\quad 0\le \delta \le 1, \quad 0< \lambda < \infty . \end{aligned}$$
(2.1)

Neither property is hard to prove; for instance, subadditivity follows from \(\omega (\lambda \delta ) \le \omega (\lceil \lambda \rceil \delta ) \le \lceil \lambda \rceil \,\omega (\delta ) \le (1+\lambda )\,\omega (\delta )\). The authors of [34] established the following result.

Theorem 2.2

If \(X\sim \mathrm{Beta}(\alpha ,\beta )\), then

$$\begin{aligned} \mathbb {P}\{|X-\mathbb {E}(X)|>r\} \le 2\exp [-2(\alpha +\beta +1) r^2], \quad r >0. \end{aligned}$$
(2.2)

Let \({\mathcal {B}}_n\) denote the function defined by

$$\begin{aligned} {\mathcal {B}}_n(x,y) = (n+1) \sum ^{n}_{k=0} p_{n,k}(x)p_{n,k}(y), \quad (x,y) \in [0,1]^2. \end{aligned}$$

Let (X, Y) be the random vector with the joint density function \({\mathcal {B}}_n\). The resulting distribution on \([0,1]^2\) is often called the “mean square beta distribution”; see [3]. Using the updated Gaussian tail bound of Theorem 2.2, we restate a result of Bobkov and Ledoux [3, Proposition B.12].

Proposition 2.3

If (X, Y) is the random vector with the joint density function \({\mathcal {B}}_n\), then for all \(r \ge 0,\)

$$\begin{aligned} \mathbb {P}\{|X-Y|>r\} \le 2\exp [-(n+3) r^2]. \end{aligned}$$

We will utilize the random vector (X, Y) as a “majorant” for our stochastic Bernstein polynomials. The best effect of this approach can be seen via its implementation on the uniform distribution in (0, 1). We denote by U the c.d.f. of the uniform distribution in (0, 1).
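
Note that \({\mathcal {B}}_n\) is a uniform mixture over \(k \in \{0,\ldots ,n\}\) of products of two independent \(\mathrm{Beta}(k+1, n-k+1)\) densities, which yields a simple exact sampler. The following sketch (ours) uses it to check the tail bound of Proposition 2.3 empirically:

```python
import numpy as np

def sample_msb(n, size, rng):
    # draw k uniformly from {0, ..., n}, then X, Y ~ Beta(k+1, n-k+1) independently;
    # this realizes the joint density B_n(x, y)
    k = rng.integers(0, n + 1, size)
    return rng.beta(k + 1, n - k + 1), rng.beta(k + 1, n - k + 1)

rng = np.random.default_rng(0)
n, r = 50, 0.2
X, Y = sample_msb(n, 10 ** 6, rng)
print(np.mean(np.abs(X - Y) > r), 2 * np.exp(-(n + 3) * r ** 2))
```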

Lemma 2.4

Let (X, Y) be the random vector that has joint density function \({\mathcal {B}}_n(x,y).\) Then for any \(1 \le p < \infty \), we have

$$\begin{aligned} \mathbb E\left( \Vert B^X_n U -U \Vert ^p_p\right) \le \mathbb E\left( |X-Y|^p \right) . \end{aligned}$$

Proof

We first use the triangle inequality to write

$$\begin{aligned} |(B^X_n U)(x) -U(x)|&\le \sum ^{n}_{k=0} |X_{n,k}-x|p_{n,k}(x), \quad x \in [0,1]. \end{aligned}$$

By Jensen’s inequality, we have

$$\begin{aligned} \mathbb E\left( \Vert B^X_n U -U \Vert ^p_p \right) \le&\, \mathbb E\left[ \int ^1_0 \left( \sum ^n_{k=0}|X_{n,k}-x|p_{n,k} (x) \right) ^p dx \right] \\ \le&\, \mathbb E\left[ \int ^1_0 \sum ^n_{k=0}|X_{n,k}-x|^p p_{n,k} (x) dx \right] \\ =&\,\int ^1_0 \int ^1_0 |x-y|^p {\mathcal {B}}_n(x,y) dy dx \\ =&\,\mathbb E\left( |X-Y|^p\right) . \end{aligned}$$

This completes the proof. \(\square \)

The result of the above lemma amounts to saying that for any \(1\le p < \infty \), the pth moment of the random variable \(\Vert B^X_n U -U\Vert _p\) is majorized by that of the random variable \(|X-Y|\). The following result shows that the majorization extends to exponential moments.

Lemma 2.5

For any \(1 \le p < \infty \) and \(r \ge 0\), the following inequality holds true:

$$\begin{aligned} \mathbb E\left[ \exp \left( r\Vert B^X_n U -U \Vert ^p_p\right) \right] \le \mathbb E\left[ \exp \left( r|X-Y|^p\right) \right] . \end{aligned}$$

Proof

Similar to the proof of Lemma 2.4, we first write the following inequality:

$$\begin{aligned} \Vert B^X_n U -U \Vert ^p_p \le&\sum ^n_{k=0} \int ^1_0 |X_{n,k}-x|^p p_{n,k} (x) dx. \end{aligned}$$

Applying Jensen’s inequality twice (first the integral form then the discrete form), we have

$$\begin{aligned} \exp \left( r\Vert B^X_n U -U \Vert ^p_p\right)&\le \int ^1_0 \exp \left( r\sum ^n_{k=0} p_{n,k} (x)|X_{n,k}-x|^p\right) dx \\&\le \int ^1_0 \left( \sum ^n_{k=0} \exp \left( r |X_{n,k}-x|^p\right) p_{n,k} (x)\right) dx. \end{aligned}$$

It follows that

$$\begin{aligned} \mathbb E\left[ \exp \left( r\Vert B^X_n U -U \Vert ^p_p\right) \right] \le \int ^1_0 \int ^1_0 \exp \left( r |x-y|^p\right) {\mathcal {B}}_n(x,y) dy dx, \end{aligned}$$

which is the desired result. \(\square \)

The majorization of \(\Vert B^X_n U -U\Vert _p\) by \(|X-Y|\) as shown in Lemmas 2.4 and 2.5 can be passed on to the random variable \(\Vert B^X_n f -f\Vert _p\) for any \(f \in C[0,1]\) via manipulation of its modulus of continuity. Our main goal in the current section is to find upper bounds for the quantities

$$\begin{aligned} \mathbb {P}\{||B^X_nf-f||_p >\varepsilon \}, \quad 1 \le p \le \infty , \end{aligned}$$

where \(\varepsilon >0\) and \(n \in \mathbb N\) are fixed. If f is a constant, then \(B^X_nf \equiv f\), and the upper bounds hold true automatically. We hereby declare once and for all that in the sequel our target functions are not constant, which justifies writing \(\omega (\frac{1}{\sqrt{n}})\) in denominators. For aesthetic reasons, we will write the equivalent form \(\omega (n^{-1/2})\) for \(\omega (\frac{1}{\sqrt{n}})\) wherever the latter expression would make a displayed equation unwieldy.

Proposition 2.6

For any \(1 \le p < \infty \), and any given \(f \in C[0,1]\), the following inequalities hold true:

$$\begin{aligned}&\mathbb E\left( \Vert B^X_n f -f \Vert ^p_p \right) \le 2^{p-1}\ \omega ^p \left( n^{-1/2}\right) \left[ 1 + n^{p/2} \mathbb E\left( |X-Y|^p\right) \right] ; \end{aligned}$$
(2.3)
$$\begin{aligned}&\mathbb E\left[ \exp \left( \{2\ \omega (n^{-1/2})\}^{-p} \Vert B^X_n f -f \Vert ^p_p \right) \right] \le \sqrt{e}\ \cdot \mathbb E\left[ \exp \left( \frac{1}{2} n^{p/2} |X-Y|^p \right) \right] . \end{aligned}$$
(2.4)

Proof

The proofs of (2.3) and (2.4) use basically the same techniques as those in the proofs of Lemmas 2.4 and 2.5. We will prove (2.4) as it is more involved than the other. For a fixed \(x \in [0,1]\), by the subadditivity of modulus of continuity of f and Jensen’s inequality, we write

$$\begin{aligned}&|B^X_n f (x) -f(x) |^p \\&\quad \le \omega ^p \left( n^{-1/2}\right) \sum ^n_{k=0}\left( 1 + \sqrt{n}|X_{n,k}-x|\right) ^p p_{n,k} (x) \\&\quad \le 2^{p-1}\ \omega ^p \left( n^{-1/2}\right) \left( 1 + n^{p/2} \sum ^n_{k=0}|X_{n,k}-x|^p p_{n,k} (x)\right) . \end{aligned}$$

It follows that

$$\begin{aligned}&\exp \left( \{2\ \omega (n^{-1/2})\}^{-p} \Vert B^X_n f -f \Vert ^p_p \right) \\&\quad \le \int ^1_0 \left[ \exp \left( \frac{1}{2} +\frac{1}{2} n^{p/2} \sum ^n_{k=0}|X_{n,k}-x|^p p_{n,k} (x) \right) \right] dx \\&\quad \le \sqrt{e}\ \int ^1_0 \left[ \sum ^n_{k=0} \exp \left( \frac{1}{2} n^{p/2} |X_{n,k}-x|^p \right) p_{n,k} (x) \right] dx, \end{aligned}$$

which implies that

$$\begin{aligned}&\mathbb E\left[ \exp \left( \{2\ \omega (n^{-1/2})\}^{-p} \Vert B^X_n f -f \Vert ^p_p \right) \right] \\&\quad \le \sqrt{e} \int ^1_0 \int ^1_0 \exp \left( \frac{1}{2} n^{p/2} |x-y|^p\right) {\mathcal {B}}_n(x,y) dy dx\\&\quad = \sqrt{e}\ \cdot \mathbb E\left[ \exp \left( \frac{1}{2} n^{p/2} |X-Y|^p \right) \right] . \end{aligned}$$

This completes the proof. \(\square \)

In the rest of this section, we will first develop upper bounds for \(\mathbb E\left( |X-Y|^p \right) \; (1 \le p < \infty )\) (Lemma 2.7 below), and \(\mathbb E\left[ \exp \left( \frac{1}{2} n^{p/2} |X-Y|^p \right) \right] \ (1 \le p \le 2)\) (Lemma 2.9 below). We will then use Chebyshev inequality to obtain unconditional \(L^p\)-concentration inequalities for stochastic Bernstein polynomials (Theorems 2.8, 2.11 below). En route we will be using variants of the following identity

$$\begin{aligned} \mathbb {E}(V) = \int _0^{\infty }\mathbb {P}(V>r)\mathrm{{d}}r, \end{aligned}$$
(2.5)

where V is a nonnegative random variable.

Lemma 2.7

Let \(1 \le p < \infty \). Then

$$\begin{aligned} \mathbb E\left( |X - Y |^p \right) \le \frac{p\ \Gamma \left( \frac{p}{2}\right) }{(n+3)^{p/2}}. \end{aligned}$$

Proof

We make use of identity (2.5) and Proposition 2.3 to write

$$\begin{aligned}&\mathbb E\left( |X - Y|^p \right) \\&\quad = p\int ^\infty _0 \mathbb P\{|X - Y| >r \} r^{p-1} \mathrm{{d}}r \\&\quad \le 2p \int ^\infty _0 e^{-(n+3) r^2} r^{p-1} \mathrm{{d}}r \\&\quad = \frac{p\ \Gamma \left( \frac{p}{2}\right) }{(n+3)^{p/2}}, \end{aligned}$$

which is the desired result. \(\square \)

Theorem 2.8

For any \(\varepsilon >0,\) \(1 \le p < \infty ,\) and \(f \in C[0,1],\) the following inequalities hold true:

  1.

    \({\displaystyle \mathbb P\{\Vert B^X_n f -f \Vert _p > \varepsilon \} \le \frac{ 2^{p-1}\left[ 1+ p\ \Gamma \left( \frac{p}{2}\right) \right] \omega ^p \left( n^{-1/2}\right) }{\varepsilon ^p}.}\)

  2.

    \({\displaystyle \mathbb E\left( \Vert B^X_n f -f \Vert _p \right) \le 2^{1-\frac{1}{p}} \left[ 1+ p\ \Gamma \left( \frac{p}{2}\right) \right] ^{\frac{1}{p}} \omega \left( n^{-1/2}\right) .}\)

Part 1 of Theorem 2.8 gives an \(L^p\)-probabilistic convergence rate (\(1 \le p < \infty )\) for stochastic Bernstein polynomials; Part 2 shows that the average \(L^p\)-approximation order \((1 \le p < \infty )\) of stochastic Bernstein polynomials is the same as that given by the classical Bernstein polynomials. The result in Part 2 fails to work for the case \(p=\infty \). The reason is that the constant, by Stirling’s formula, is of the order \(\sqrt{p}\) as \(p \rightarrow \infty \) (see the numerical check after the proof). In Sect. 4, we will obtain an average \(L^p\)-approximation order with an absolute constant for stochastic Bernstein polynomials (Corollary 4.3).

Proof

To prove Part 1, we use Chebyshev inequality, inequality (2.3), and Lemma 2.7 to derive

$$\begin{aligned} \mathbb P\{\Vert B^X_n f -f \Vert _p > \varepsilon \} \le \frac{\mathbb E\left( \Vert B^X_n f -f \Vert ^p_p\right) }{\varepsilon ^p} \le \frac{ 2^{p-1}\left[ 1+ p \Gamma \left( \frac{p}{2}\right) \right] \omega ^p \left( n^{-1/2}\right) }{\varepsilon ^p}. \end{aligned}$$

To prove Part 2, we use Jensen’s inequality, inequality (2.3), and Lemma 2.7 to derive

$$\begin{aligned} \mathbb E\left( \Vert B^X_n f -f \Vert _p \right) \le \left[ \mathbb E\left( \Vert B^X_n f -f \Vert ^p_p \right) \right] ^{1/p} \le 2^{1-\frac{1}{p}} \left[ 1+ p \Gamma \left( \frac{p}{2}\right) \right] ^{1/p} \omega \left( n^{-1/2}\right) . \end{aligned}$$

This completes the proof. \(\square \)
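
As noted before the proof, the constant in Part 2 grows like \(\sqrt{p}\). A quick numerical check (ours, using only the formula in Part 2):

```python
from math import gamma, sqrt

for p in (1, 2, 4, 8, 16, 32, 64):
    c = 2 ** (1 - 1 / p) * (1 + p * gamma(p / 2)) ** (1 / p)
    print(p, round(c, 4), round(c / sqrt(p), 4))  # the ratio levels off for large p
```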

Lemma 2.9

Suppose that Z is a random variable satisfying the following inequality:

$$\begin{aligned} \mathbb P\{ |Z|>r\} \le 2 \exp \left( -2n r^2 \right) , \quad r>0. \end{aligned}$$

Then we have

$$\begin{aligned} \mathbb E\left[ \exp \left( n |Z|^2\right) \right] \le 3. \end{aligned}$$

Proof

We use identity (2.5) to derive

$$\begin{aligned}&\mathbb E\left[ \exp \left( n |Z|^2 \right) \right] \\&\quad = \int ^\infty _0 \mathbb P\{\exp \left( n |Z|^2 \right)>r \} \mathrm{{d}}r \\&\quad \le 1 + \int ^\infty _0 \mathbb P\{|Z| > \tau \} \mathrm{{d}} \exp \left( n \tau ^2 \right) \\&\quad \le 1 + 2 \int ^\infty _0 \exp (-2n \tau ^2) \mathrm{{d}} \exp \left( n \tau ^2 \right) \\&\quad = 1 + 2\int ^\infty _1 \frac{du}{u^2}=3. \end{aligned}$$

Here the first inequality uses the fact that \(\exp (n|Z|^2) \ge 1\), so the portion of the integral over \(0 \le r \le 1\) is at most 1, while the substitution \(r = \exp (n\tau ^2)\) handles the portion over \(r \ge 1\); the last equality uses the substitution \(u = \exp (n\tau ^2)\).

This completes the proof. \(\square \)

Proposition 2.3 and Lemma 2.9 (applied with n replaced by \((n+3)/2 \ge n/2\)) yield the following result.

Corollary 2.10

The following inequality holds true:

$$\begin{aligned} \mathbb E\left[ \exp \left( \frac{1}{2} n |X-Y|^2 \right) \right] \le 3. \end{aligned}$$

We now state and prove the main result of the section.

Theorem 2.11

For any given \(\varepsilon >0\), any p in the range \(1 \le p \le 2\), and any \(f \in C[0,1]\), the following inequality holds true:

$$\begin{aligned}&\mathbb {P}\left\{ \left\| B_{n}^{X} f-f\right\| _{p}>\varepsilon \right\} \le 3 \sqrt{e}\ \exp \left[ -\frac{\varepsilon ^2}{4\ \omega ^2\left( n^{-1/2}\right) }\right] . \end{aligned}$$

Proof

For any given \(\varepsilon >0\), p in the range \(1 \le p \le 2\), and \(f \in C[0,1]\), we have

$$\begin{aligned} \mathbb {P}\left\{ \left\| B_{n}^{X} f-f\right\| _{p}>\varepsilon \right\} \le \mathbb {P}\left\{ \left\| B_{n}^{X} f-f\right\| _2 >\varepsilon \right\} . \end{aligned}$$
(2.6)

Thus, it suffices to prove the theorem for the case \(p=2.\) By (2.4) and Corollary 2.10, we have

$$\begin{aligned} \mathbb E\left[ \exp \left\{ \left[ 2\ \omega \left( n^{-1/2}\right) \right] ^{-2} \Vert B^X_n f -f \Vert ^2_2 \right\} \right] \le \sqrt{e}\ \mathbb E\left[ \exp \left( \frac{1}{2} n |X-Y|^2 \right) \right] \le 3 \sqrt{e}. \end{aligned}$$

Applying Chebyshev inequality to the right hand side of (2.6), we obtain

$$\begin{aligned}&\mathbb {P}\left\{ \Vert B_{n}^{X} f-f\Vert _2 >\varepsilon \right\} \\&\quad \le \frac{\mathbb E\left[ \exp \left\{ \left[ 2\ \omega \left( n^{-1/2}\right) \right] ^{-2} \Vert B^X_n f -f \Vert ^2_2 \right\} \right] }{\exp \left\{ \varepsilon ^2\ \left[ 2\ \omega \left( n^{-1/2}\right) \right] ^{-2} \right\} }\\&\quad \le 3 \sqrt{e}\ \exp \left[ -\frac{\varepsilon ^2}{4\ \omega ^2\left( n^{-1/2}\right) }\right] , \end{aligned}$$

which is the desired result. \(\square \)

Unlike (1.4), we do not need assumption (1.5) in the proof of Theorem 2.11. The price we pay for this is a larger multiplicative constant: \(3\sqrt{e}\) in place of the 2 in (1.4).
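
A Monte Carlo illustration of Theorem 2.11 for \(p=2\) (our sketch; the target \(f(x)=|x-1/2|\) satisfies \(\omega (f,h)=h\), and the \(L^2\)-norm is approximated on a grid):

```python
import numpy as np
from math import comb, e, exp, sqrt

def bernstein_basis(n, x):
    k = np.arange(n + 1)
    coef = np.array([comb(n, i) for i in k], dtype=float)
    return coef[:, None] * x ** k[:, None] * (1.0 - x) ** (n - k)[:, None]

rng = np.random.default_rng(0)
f = lambda t: np.abs(t - 0.5)   # omega(f, h) = h, hence omega(n^{-1/2}) = n^{-1/2}
n, eps, trials = 100, 0.4, 2000
x = np.linspace(0.0, 1.0, 401)
P = bernstein_basis(n, x)
errs = np.empty(trials)
for i in range(trials):
    g = f(np.sort(rng.uniform(0.0, 1.0, n + 1))) @ P - f(x)
    errs[i] = sqrt(np.mean(g * g))   # grid approximation of the L^2([0,1]) norm
print((errs > eps).mean(), 3 * sqrt(e) * exp(-eps ** 2 * n / 4))
```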

3 Gaussian-Type \(L^p\)-Concentration Inequalities \( (1 \le p < \infty )\)

In an essential way, the proof of Theorem 2.11 depends on the boundedness of the sequence

$$\begin{aligned} \mathbb E\left( \exp (n^{p/2}|X-Y|^p) \right) , \quad n \in \mathbb N, \quad 1 \le p \le 2. \end{aligned}$$

This is no longer true for \(p > 2.\) To prove unconditional \(L^p\)-concentration inequalities for \(2<p < \infty \), we start with the random process \(V_n(x)\) defined by

$$\begin{aligned} V_n(x):=\sum _{k=0}^n |X_{n,k} - x|p_{n,k}(x), \quad x \in [0,1]. \end{aligned}$$

Lemma 3.1

The following inequality holds true almost surely:

$$\begin{aligned} \Vert B^X_n f - f \Vert ^2_p \le 2\ \omega ^2\left( n^{-1/2}\right) \left( 1 + n \Vert V_n\Vert ^2_p\right) , \quad 1 \le p < \infty , \quad f \in C[0,1]. \end{aligned}$$

Proof

By subadditivity of the modulus of continuity, we have

$$\begin{aligned}&|(B^X_n f)(x) - f(x) | \\&\quad \le \sum ^n_{k=0} \omega (|X_{n,k} - x|)\ p_{n,k}(x) \\&\quad \le \omega \left( n^{-1/2} \right) \left( 1 + \sqrt{n}\ V_n(x) \right) . \end{aligned}$$

We then apply Minkowski’s and Jensen’s inequalities to get the desired result. \(\square \)

We use the triangle inequality to write

$$\begin{aligned} V_n(x)\le \sum _{k=0}^n \left| X_{n,k} - \frac{k}{n}\right| p_{n,k}(x) + \sum _{k=0}^n \left| x - \frac{k}{n}\right| p_{n,k}(x). \end{aligned}$$
(3.1)

Let \( Y_n(x)\) denote the random process expressed by the first sum on the right-hand side of (3.1), and \(W_n\) the random variable \(\max \nolimits _{0 \le k \le n} |X_{n,k} - \frac{k}{n}|.\) For a fixed \(x \in [0,1]\), let T(x) be the random variable taking values 1 and 0, with respective probabilities x and \(1-x\). Let \(\{T_k(x)\}_{k=1}^n\) be n independent copies of T(x). Let \(Z_n=Z_n(x)\) be the random variable defined by

$$\begin{aligned} Z_n(x) = \frac{1}{n} \sum ^n_{k=1}T_k(x). \end{aligned}$$
(3.2)

Then \(Z_n(x) \) takes values \(\frac{k}{n}\) with respective probabilities \(p_{n,k}(x)\), \(k=0,1,\ldots ,n\). These lead to the following result.
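
Before proceeding, a quick numerical sanity check of this construction (our sketch): \(Z_n(x)\) is the mean of n Bernoulli(x) draws, so \(\mathbb E(Z_n)=x\) and \(\mathrm{Var}(Z_n)=x(1-x)/n\), while the per-summand variance \(x(1-x)\) is what enters Theorem 3.5 below.

```python
import numpy as np

rng = np.random.default_rng(0)
n, x, trials = 30, 0.3, 200000
z = rng.binomial(n, x, trials) / n   # Z_n(x): mean of n Bernoulli(x) draws
print(z.mean(), x)                   # E(Z_n) = x
print(z.var(), x * (1 - x) / n)      # Var(Z_n) = x(1 - x)/n
```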

Lemma 3.2

The following inequality holds true almost surely:

$$\begin{aligned} \Vert V_n\Vert ^2_p \le 2\ \left( W_n^2 + \Vert \mathbb E\left( |Z_n - \mathbb E(Z_n)| \right) \Vert ^2_p \right) , \quad 1 \le p < \infty . \end{aligned}$$

Proof

For all \(x \in [0,1]\), we have

$$\begin{aligned} 0 \le Y_n(x) \le W_n \sum ^n_{k=0}p_{n,k}(x) = W_n, \end{aligned}$$

which implies that \(\Vert Y_n\Vert _\infty \le W_n\) almost surely. By Minkowski’s and Jensen’s inequalities, we have

$$\begin{aligned} \Vert V_n\Vert ^2_p \le&2 \left\| \sum ^n_{k=0} \left| X_{n,k} - \frac{k}{n}\right| p_{n,k} \right\| ^2_p + 2 \Vert \mathbb E\left( |Z_n - \mathbb E(Z_n)| \right) \Vert ^2_p \\ \le&2 \left( W^2_n + \Vert \mathbb E\left( |Z_n - \mathbb E(Z_n)| \right) \Vert ^2_p\right) , \end{aligned}$$

which is the desired result. \(\square \)

We will deal with the two terms on the right hand side of the above inequality separately, and start with the first. The key strategy is to tie the random variable \(W_n\) to the empirical process of the uniform distribution in (0, 1).

Lemma 3.3

Let \(U_{n+1}\) be the empirical distribution function of order \((n+1)\) associated with the uniform distribution in (0, 1) defined by

$$\begin{aligned} U_{n+1}(x):=\frac{1}{n+1}\sum _{k=0}^n\mathbbm {1}_{\{X_{n,k}<x\}}. \end{aligned}$$
(3.3)

Then the following inequality holds true almost surely,

$$\begin{aligned} W_n\le \sup _{x\in [0,1]}\left| U_{n+1}(x)-x\right| . \end{aligned}$$
(3.4)

Proof

Let \(X^*_{n,0}< X^*_{n,1}< \cdots < X^*_{n,n}\) be a set of observed values of the random variables \(X_{n,0}, X_{n,1}, \ldots , X_{n,n}\). Let \(U^*_{n+1}\) be the corresponding (observed) empirical distribution function for the uniform distribution U on [0, 1]. Let \(I_0:=[0,X^*_{n,0})\), \(I_k:=[X^*_{n,k-1}, X^*_{n,k}), \; 1 \le k \le n,\) and \(I_{n+1}:=[X^*_{n,n}, 1]\). Then we have

$$\begin{aligned}&\sup _{x \in I_0} \left( x - U^*_{n+1}(x)\right) = X^*_{n,0}, \\&\sup _{x \in I_k} \left( x - U^*_{n+1}(x)\right) = X^*_{n,k} -\frac{k}{n+1}, \quad 1 \le k \le n,\\&\sup _{x \in I_{n+1}} \left( x - U^*_{n+1}(x)\right) =0. \end{aligned}$$

The equations above show that

$$\begin{aligned} \sup _{x \in [0, 1]} \left( x - U^*_{n+1}(x)\right) = \max _{0 \le k \le n} \left( X^*_{n,k} -\frac{k}{n+1}\right) . \end{aligned}$$

Since \(X^*_{n,k} -\frac{k}{n} \le X^*_{n,k} -\frac{k}{n+1}\), we have \(\max _{0 \le k \le n} \left( X^*_{n,k} -\frac{k}{n}\right) \le \sup _{x \in [0, 1]}\left( x - U^*_{n+1}(x)\right) \). This inequality holds true for every set of such observed values of the order statistics. Hence we conclude that

$$\begin{aligned} \max _{0 \le k \le n} \left( X_{n,k} -\frac{k}{n}\right) \le \sup _{x \in [0, 1]} \left( x - U_{n+1}(x)\right) \quad \text {almost surely.} \end{aligned}$$
(3.5)

Similarly we show that

$$\begin{aligned} \max _{0 \le k \le n} \left( \frac{k}{n}- X_{n,k} \right) \le \sup _{x \in [0, 1]} \left( U_{n+1}(x) - x \right) \quad \text {almost surely.} \end{aligned}$$
(3.6)

Combining (3.5) and (3.6), we get the desired result. \(\square \)

Lemma 3.4

Let \(\varepsilon > 0\) and \(n \in \mathbb N\) be given. Then we have the following inequality:

$$\begin{aligned} \mathbb P\{W_n > \varepsilon \} \le 2\ e^{-2(n+1)\varepsilon ^2}. \end{aligned}$$

Proof

By Lemma 3.3 and the Dvoretzky–Kiefer–Wolfowitz inequality (1.6), we have

$$\begin{aligned} \mathbb P\{W_n> \varepsilon \} \le \mathbb P\{\sup _{x\in [0,1]}\left| U_{n+1}(x)-x\right| > \varepsilon \} \le 2\ e^{-2(n+1)\varepsilon ^2}, \end{aligned}$$

which is the desired result. \(\square \)

We now turn our attention to estimating the term \(\Vert \mathbb E\left( |Z_n - \mathbb E(Z_n)| \right) \Vert ^2_p \) in Lemma 3.2; en route, we need a well-known concentration inequality due to Bernstein; see [12].

Theorem 3.5

Let \(c>0\). Let \(X_1, \ldots , X_n\) be independent random variables. If \(\mathbb P(|X_j| \le c)=1\) for each j, then for any \(\varepsilon > 0\),

$$\begin{aligned} \mathbb P(|S_n - \mu | > \varepsilon ) \le 2 \exp \left\{ \frac{-n \varepsilon ^2}{2 \sigma ^2 + 2c \varepsilon / 3}\right\} , \end{aligned}$$

in which \(S_n:= \frac{1}{n} \sum \nolimits ^n_{j=1}X_j\), \(\mathbb E(S_n) = \mu \), and \(\sigma ^2:= \frac{1}{n} \sum ^n_{j=1}\mathrm{Var} (X_j).\)

In the next lemma, we apply Bernstein inequality to control certain partial sums of the \(p_{n,k}.\)

Lemma 3.6

For a fixed \(x \in [0,1]\), the following inequality holds true:

$$\begin{aligned} \sum \limits _{|x - \frac{k}{n}| \ge \varepsilon } p_{n,k}(x) \le 2 \exp \left[ \frac{-n \varepsilon ^2}{2x(1-x) + 3\varepsilon /2} \right] . \end{aligned}$$

Proof

Consider the random variable \(Z_n\) as defined in (3.2). We have \(\mathbb E(Z_n)=x\), and, in the notation of Theorem 3.5, \(c=1\) and \(\sigma ^2=x(1-x)\). We apply Bernstein inequality to the random variable \(Z_n\) to get the desired inequality, noting that \(2\varepsilon /3 \le 3\varepsilon /2\). \(\square \)

Denote

$$\begin{aligned} \Delta _{n,x}: = \max \left( \sqrt{\frac{x(1-x)}{n}},\quad \frac{1}{n}\right) .\end{aligned}$$

Theorem 3.7

Let \(0<p<\infty \) be given. Then the pth central moment of the random variable \(Z_n\) satisfies the following inequality:

$$\begin{aligned} \mathbb E\left( |Z_n - \mathbb E(Z_n)|^p \right) \le C_p\ \Delta ^{p}_{n,x}, \end{aligned}$$

where

$$\begin{aligned} C_p:= 2 \sum ^{\infty }_{j=0} (j+1)^p\exp [-j^2/(2+ 3j/2)]. \end{aligned}$$
(3.7)

Proof

If \(x=0\) or \(x=1\), then the desired inequality holds true trivially. In the rest of the proof, we assume that \(x(1-x) > 0.\) Denote

$$\begin{aligned} E_{j,x}:=\left\{ k: j \Delta _{n,x} \le \left| \frac{k}{n} - x \right| \le (j+1) \Delta _{n,x}\right\} , \quad j=0,1,\ldots , J_{n,x},\end{aligned}$$

where \(J_{n,x}:= \lceil \Delta ^{-1}_{n,x}\rceil \), and use Lemma 3.6 to write

$$\begin{aligned} \mathbb E|Z_n - \mathbb E(Z_n)|^p= & {} \sum ^n_{k=0} \left| \frac{k}{n} - x \right| ^p p_{n,k}(x) \\\le & {} \sum ^{J_{n,x}}_{j=0} \sum \limits _{k \in E_{j,x}} \left| \frac{k}{n} - x \right| ^p p_{n,k}(x) \\\le & {} 2 \Delta ^p_{n,x} \sum ^{J_{n,x}}_{j=0} (j+1)^p \exp \left[ \frac{-n \left( j \Delta _{n,x}\right) ^2 }{2x(1-x) + 3j \Delta _{n,x}/2} \right] . \end{aligned}$$

In what follows, we derive an upper bound (independent of x and n) for the sum above by considering two separate cases. Case (i): \(x(1-x) \le \frac{1}{n}\). We have \( \Delta _{n,x} = 1/n.\) It follows that

$$\begin{aligned} \exp \left[ \frac{-n \left( j \Delta _{n,x}\right) ^2 }{2x(1-x) + 3j \Delta _{n,x}/2} \right] \le \exp [-j^2/(2+ 3j/2)]. \end{aligned}$$

Case (ii): \(x(1-x) > \frac{1}{n}\). We have

$$\begin{aligned} \Delta _{n,x}=\sqrt{\frac{x(1-x)}{n}} \le x(1-x), \end{aligned}$$

which yields

$$\begin{aligned} \exp \left[ \frac{-n \left( j \Delta _{n,x}\right) ^2 }{2x(1-x) + 3j \Delta _{n,x}/2} \right] =&\exp \left[ \frac{-j^2\, x(1-x) }{2x(1-x) + 3j \Delta _{n,x}/2} \right] \\ \le&\exp \left[ \frac{-j^2\, x(1-x) }{2x(1-x) + 3j\, x(1-x)/2} \right] = \exp \left[ -j^2/(2+3j/2) \right] , \end{aligned}$$

where we have used \(n \Delta ^2_{n,x} = x(1-x)\) and \(\Delta _{n,x} \le x(1-x)\).

The proof is complete. \(\square \)

Lemma 3.8

The following inequality holds true:

$$\begin{aligned} \Vert \mathbb E\left( |Z_n - \mathbb E(Z_n)| \right) \Vert ^2_p \le C^{2/p}_p/(4n), \end{aligned}$$

where \(C_p\) is as defined in (3.7).

Proof

By Jensen’s inequality and Theorem 3.7, we have

$$\begin{aligned}&\Vert \mathbb E\left( |Z_n - \mathbb E(Z_n)| \right) \Vert ^2_p \\&\quad \le \left[ \int ^1_0 \left( \sum ^n_{k=0} \left| x - \frac{k}{n}\right| p_{n,k}(x) \right) ^p dx \right] ^{2/p} \\&\quad \le \left[ \int ^1_0 \left( \sum ^n_{k=0} \left| x - \frac{k}{n}\right| ^p p_{n,k}(x) \right) dx \right] ^{2/p} \\&\quad \le \left( C_p \int ^1_0 \Delta ^{p}_{n,x}\ dx \right) ^{2/p} \le C_p^{2/p}/(4n), \end{aligned}$$

which is the desired result. \(\square \)

Here is the main result of the section.

Theorem 3.9

Let \(\varepsilon > 0\) and \(n \in \mathbb N\) be given. Then the following inequality holds true:

$$\begin{aligned} \mathbb P\left\{ \left\| B_{n}^{X} f-f\right\| _{p}>\varepsilon \right\} \le 3 D_p\ \exp \left[ -\frac{\varepsilon ^2}{4\ \omega ^2\left( n^{-1/2}\right) }\right] , \quad f \in C[0,1], \quad 1 \le p < \infty , \end{aligned}$$

in which \(D_p:= \exp \left[ \left( 1 + \frac{C^{2/p}_p}{2}\right) /2\right] .\)

Proof

By Lemmas 3.1, 3.2, and 3.8, the following inequalities hold true almost surely:

$$\begin{aligned} \left\| B_{n}^{X} f-f\right\| _{p}^2 \le&2\ \omega ^2\left( n^{-1/2}\right) \left[ 1 + 2n \left( W_n^2 + \Vert \mathbb E\left( |Z_n - \mathbb E(Z_n)| \right) \Vert ^2_p\right) \right] \\ \le&2\ \omega ^2\left( n^{-1/2}\right) \left( 1 + 2n\ W_n^2 + \frac{C^{2/p}_p}{2} \right) . \end{aligned}$$

It follows from Lemmas 3.4 and 2.9 that

$$\begin{aligned}&\mathbb E\left( \exp \left[ \left( 4\ \omega ^2\left( n^{-1/2}\right) \right) ^{-1} \Vert B_n^X f-f\Vert _{p}^2 \right] \right) \\&\quad \le \mathbb E\left( \exp \left[ \frac{1}{2} \left( 1 + 2n\ W_n^2 + \frac{C^{2/p}_p}{2} \right) \right] \right) \\&\quad = \exp \left[ \left( 1 + (C^{2/p}_p)/2\right) /2\right] \mathbb E\left[ \exp (n\ W_n^2) \right] \le 3 D_p. \end{aligned}$$

Applying Chebyshev inequality, we get

$$\begin{aligned}&\mathbb P\left\{ \left\| B_{n}^{X} f-f\right\| _{p}>\varepsilon \right\} \\&\quad \le \mathbb E\left( \exp \left[ \left( 4\ \omega ^2\left( n^{-1/2}\right) \right) ^{-1} \Vert B_n^X f-f\Vert _{p}^2 \right] \right) \\&\qquad \cdot \left( \exp \left[ \left( 4\ \omega ^2\left( n^{-1/2}\right) \right) ^{-1} \varepsilon ^2 \right] \right) ^{-1} \\&\quad \le 3 D_p\ \exp \left[ -\frac{\varepsilon ^2}{4\ \omega ^2\left( n^{-1/2}\right) }\right] , \end{aligned}$$

which is the desired result. \(\square \)

Two remarks about Theorem 3.9 are in order.

  1.

    For \(p=2,\) the multiplicative constant in the result of Theorem 3.9 is \(3\exp \left[ \left( 1 + \frac{C_2}{2}\right) /2\right] \), which is noticeably larger than the \(3\sqrt{e}\) given in Theorem 2.11.

  2.

    The multiplicative constant \(D_p\) grows super-exponentially as \(p \rightarrow \infty \) (from (3.7), \(C_p^{2/p}\) grows like \(p^2\)). This has ruled out the feasibility of obtaining an \(L^\infty \)-concentration inequality for stochastic Bernstein polynomials by taking the limit \(p \rightarrow \infty .\)
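
The constants \(C_p\) and \(D_p\) are explicitly computable from (3.7). A short numerical sketch (ours) illustrating their rapid growth in p:

```python
import numpy as np

def C_p(p, terms=400):
    # partial sum of (3.7); the summand decays like (j + 1)^p exp(-2j/3)
    j = np.arange(terms, dtype=float)
    return 2.0 * np.sum((j + 1.0) ** p * np.exp(-j ** 2 / (2.0 + 1.5 * j)))

for p in (1.0, 2.0, 4.0, 8.0):
    Cp = C_p(p)
    Dp = np.exp((1.0 + Cp ** (2.0 / p) / 2.0) / 2.0)
    print(p, Cp, Dp)
```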

4 Exponential \(L^\infty \)-Probabilistic Convergence Rates

The approach undertaken in this section needs to harness the deterministic approximation power of classical Bernstein polynomials, a topic featured in many publications; see [11, 15, 16, 22, 25, 35], and the references therein. Sikkema [31] proved the following remarkable result:

$$\begin{aligned} \sup _{n}\ \sup _{f}\ \sup _{0 \le x \le 1}\frac{|B_n f(x)-f(x)|}{\omega (f, n^{-1/2})}= \frac{4306+837 \sqrt{6}}{5832}=1.0898871330\cdots . \end{aligned}$$

We denote this constant by \(c_s\). Sikkema [32] also showed that the value on the right-hand side of the above equation is attained only for \(n=6\). But we will not use the latter result in the current article.
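
For reference, Sikkema's constant evaluates numerically as follows (our snippet):

```python
from math import sqrt

c_s = (4306 + 837 * sqrt(6)) / 5832
print(c_s)   # 1.0898871330...
```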

Lemma 4.1

Let \(\varepsilon >0\) and \(f\in C[0, 1]\) be given. Suppose that \(\omega (\frac{1}{\sqrt{n}})<\frac{\varepsilon }{2(c_s +1)}.\) Then the following inequality holds true:

$$\begin{aligned} \mathbb {P}\{\Vert B^X_nf-f\Vert _p>\varepsilon \} \le \mathbb {P}\left\{ \left\| Y_n \right\| _p >\frac{\varepsilon }{2\sqrt{n}\ \omega \left( n^{-1/2}\right) } \right\} , \quad 1 \le p \le \infty . \end{aligned}$$
(4.1)

The authors of [34, Lemma 2.10] proved a slightly different version of the above lemma. We omit the proof and refer interested readers to the proof of Lemma 2.10 in [34].

Theorem 4.2

Let \(\varepsilon >0\) and \(f\in C[0, 1]\) be given. Suppose that

$$\begin{aligned} \omega \left( \frac{1}{\sqrt{n}} \right) < \frac{\varepsilon }{2(c_s +1)}. \end{aligned}$$

Then the following inequality holds true:

$$\begin{aligned} \mathbb {P}\{\Vert B^X_nf-f\Vert _p >\varepsilon \} \le 2 \exp \left[ - \frac{\varepsilon ^2}{2\ \omega ^2(n^{-1/2} ) } \right] , \quad 1 \le p \le \infty . \end{aligned}$$

Proof

Let \(1 \le p \le \infty \) be given. We use Lemmas 4.1 and 3.4 in a successive fashion to derive the following inequalities:

$$\begin{aligned}&\mathbb {P}\{\Vert B^X_nf-f\Vert _p>\varepsilon \} \\&\quad \le \mathbb {P}\{\Vert B^X_nf-f\Vert _\infty>\varepsilon \} \\&\quad \le \mathbb {P}\left\{ \left\| Y_n \right\| _\infty>\frac{\varepsilon }{2\sqrt{n}\ \omega \left( n^{-1/2} \right) } \right\} \\&\quad \le \mathbb {P}\left\{ W_n >\frac{\varepsilon }{2\sqrt{n}\ \omega \left( n^{-1/2} \right) } \right\} \\&\quad \le 2 \exp \left[ - \frac{\varepsilon ^2}{2\omega ^2(n^{-1/2} ) } \right] . \end{aligned}$$

This completes the proof. \(\square \)

The result of Theorem 4.2 confirms a conjecture of a similar nature raised in [33].

Corollary 4.3

Let \(f\in C[0, 1]\) be given. Then the following inequality holds true:

$$\begin{aligned} \mathbb E\left( \Vert B^X_nf-f\Vert _p \right) \le \left[ 2(c_s+1) + \sqrt{\frac{\pi }{2}}\ \right] \omega (\frac{1}{\sqrt{n}}), \quad 1 \le p \le \infty . \end{aligned}$$

Proof

Denote \(a_n:=2(c_s +1)\omega (n^{-1/2})\). By (2.5) and Theorem 4.2, we have

$$\begin{aligned} \mathbb E\left( \Vert B^X_nf-f\Vert _p \right) =&\left\{ \int ^{a_n}_0 + \int ^\infty _{a_n} \right\} \mathbb P\{ \Vert B^X_nf-f\Vert _p > r \} \mathrm{{d}}r \\ \le&2(c_s+1)\ \omega \left( \frac{1}{\sqrt{n}}\right) + 2\int ^\infty _{a_n}\exp \left[ - \frac{r^2}{2\ \omega ^2\left( n^{-1/2}\right) } \right] \mathrm{{d}}r \\ \le&\left[ 2(c_s+1) + \sqrt{\frac{\pi }{2}}\ \right] \omega \left( \frac{1}{\sqrt{n}}\right) . \end{aligned}$$

For the tail integral, the substitution \(r = \omega (n^{-1/2})\, s\) gives

$$\begin{aligned} 2\int ^\infty _{a_n}\exp \left[ - \frac{r^2}{2\ \omega ^2\left( n^{-1/2}\right) } \right] \mathrm{{d}}r = 2\ \omega \left( n^{-1/2}\right) \int ^\infty _{2(c_s+1)} e^{-s^2/2}\ \mathrm{{d}}s \le \sqrt{\frac{\pi }{2}}\ \omega \left( n^{-1/2}\right) , \end{aligned}$$

because \(2(c_s+1) > 2\) and \(2\int ^\infty _{2} e^{-s^2/2}\ \mathrm{{d}}s \le \sqrt{\pi /2}\).

This completes the proof. \(\square \)
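
To close, a Monte Carlo illustration of Corollary 4.3 for \(p=\infty \) (our sketch; \(f(x)=|x-1/2|\) has \(\omega (f,h)=h\), so the right-hand side equals \([2(c_s+1)+\sqrt{\pi /2}\,]/\sqrt{n}\)):

```python
import numpy as np
from math import comb, sqrt, pi

def bernstein_basis(n, x):
    k = np.arange(n + 1)
    coef = np.array([comb(n, i) for i in k], dtype=float)
    return coef[:, None] * x ** k[:, None] * (1.0 - x) ** (n - k)[:, None]

rng = np.random.default_rng(0)
f = lambda t: np.abs(t - 0.5)
n, trials = 100, 1000
x = np.linspace(0.0, 1.0, 401)
P = bernstein_basis(n, x)
sup_errs = [np.max(np.abs(f(np.sort(rng.uniform(0.0, 1.0, n + 1))) @ P - f(x)))
            for _ in range(trials)]
c_s = (4306 + 837 * sqrt(6)) / 5832
print(np.mean(sup_errs), (2 * (c_s + 1) + sqrt(pi / 2)) / sqrt(n))
```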