
1 Introduction

In many statistical and probabilistic applications one has to solve the problem of Gaussian comparison, that is, to evaluate how the probability of a ball under a Gaussian measure is affected when the mean and the covariance operator of this Gaussian measure are slightly changed. In [1] we presented particular examples in which such a “large ball probability” problem naturally arises, including bootstrap validation, Bayesian inference and the high-dimensional CLT; see also [2]. Tight non-asymptotic bounds for the Kolmogorov distance between the probabilities of two Gaussian elements hitting a ball in a Hilbert space were derived in [1] and [3]. The key property of these bounds is that they are dimension-free and depend only on the nuclear (Schatten-one) norm of the difference between the covariance operators of the elements and on the norm of the mean shift. They significantly improve the bound based on Pinsker’s inequality via the Kullback–Leibler divergence. An anti-concentration bound was also established for the squared norm \(||Z - a||^2, \ \ a \in \mathbf{H},\) of a shifted Gaussian element Z with zero mean in a Hilbert space \(\mathbf{H}\). The decisive role in the proofs was played by upper estimates for the maximum of the probability density function g(x, a) of \(||Z - a||^2\), see Theorem 2.6 in [1]:

$$\begin{aligned} \sup _{x\ge 0} g(x, a) \le c\, (\varLambda _1 \varLambda _2)^{-1/4}, \end{aligned}$$
(1)

where c is an absolute constant and

$$ \varLambda _1 = \sum _{k=1}^{\infty } \lambda _k^2, \qquad \varLambda _2 = \sum _{k=2}^{\infty } \lambda _k^2 $$

where \( \lambda _1 \ge \lambda _2 \ge \dots \) are the eigenvalues of the covariance operator \(\varSigma \) of Z.

It is well known that g(x, a) can be considered as the density function of a weighted sum of non-central \( \chi ^{2}\) distributions. An explicit but cumbersome representation for g(x, a) in a finite-dimensional space \(\mathbf{H}\) is available (see, e.g., Sect. 18 in Johnson, Kotz and Balakrishnan [4]). However, it involves special characteristics of the related Gaussian measure, which makes it hard to use in specific situations. Our result (1) is much more transparent and provides sharp uniform upper bounds. Indeed, in the case \(\mathbf{H}= \mathbf{R}^{d}\), \(a = 0\), with \(\varSigma \) the identity matrix, the distribution of \(||Z||^{2}\) is the standard \(\chi ^{2}\) with d degrees of freedom, and the maximum of its probability density function is proportional to \(d^{-1/2}\). This is the same order as given by (1), since then \(\varLambda _1 = d\) and \(\varLambda _2 = d - 1\).
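To make this example concrete, here is a minimal numerical sketch (our illustration, not part of the original argument; it assumes SciPy is available) comparing the exact maximum of the \(\chi ^2_d\) density, attained at the mode \(x = d - 2\), with the rate \((\varLambda _1 \varLambda _2)^{-1/4} = (d(d-1))^{-1/4}\):

```python
# Sanity check (assumes SciPy): the maximum of the chi^2_d density decays
# like d^{-1/2}, the rate given by (1) with Lambda_1 = d, Lambda_2 = d - 1.
from scipy.stats import chi2

for d in [5, 10, 100, 1000]:
    peak = chi2.pdf(d - 2, d)          # the chi^2_d mode is at x = d - 2
    rate = (d * (d - 1)) ** (-0.25)    # (Lambda_1 * Lambda_2)^{-1/4}
    print(d, peak, peak / rate)        # the ratio stays bounded in d
```

The ratio stabilizes near \(1/\sqrt{4\pi } \approx 0.28\), consistent with the normal approximation of \(\chi ^2_d\).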

At the same time, it was noted in [1] that obtaining lower estimates for \(\sup _x g(x, a)\) remains an open problem. The latter problem was partially solved in [5], Theorem 1. However, this was done under additional conditions and taking into account the multiplicity of the largest eigenvalue.

In the present paper we obtain two-sided bounds for \(\sup _x g(x, 0)\) in the finite-dimensional case \(\mathbf{H}= \mathbf{R}^d\), see Theorem 1 below. The bounds are dimension-free, that is, they do not depend on d. Thus we obtain a new proof of the upper bound (1), which is of independent interest, and the new lower bounds show the optimality of (1), since the upper and lower bounds differ only by absolute constants. Moreover, new two-sided bounds are constructed for \(\sup _x g(x, a)\) with \(a\ne 0\) in the finite-dimensional case \(\mathbf{H}= \mathbf{R}^d\), see Theorem 2 below. Here we consider the typical situation, in which \(\lambda _1\) does not dominate the other coefficients.

2 Main Results

For independent standard normal random variables \(Z_k \sim N(0,1)\), consider the weighted sum

$$ W_0 = \lambda _1 Z_1^2 + \dots + \lambda _n Z_n^2, \qquad \lambda _1 \ge \dots \ge \lambda _n > 0. $$

It has a continuous probability density function p(x) on the positive half-axis. Define the functional

$$ M(W_0) = \sup _x \, p(x). $$

Theorem 1

With absolute constants \(c_0\) and \(c_1\), we have

$$\begin{aligned} c_0 (A_1 A_2)^{-1/4} \le M(W_0) \le c_1 (A_1 A_2)^{-1/4}, \end{aligned}$$
(2)

where

$$ A_1 = \sum _{k=1}^n \lambda _k^2, \qquad A_2 = \sum _{k=2}^n \lambda _k^2 $$

and

$$ c_0 = \frac{1}{4e^2\sqrt{2\pi }} > 0.013, \qquad c_1 = \frac{2}{\sqrt{\pi }} < 1.129. $$
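Since the statement is fully explicit, it can be tested numerically. The following hedged sketch (our illustration, not part of the proof; it assumes NumPy, and the grid sizes and random draws are ad hoc choices) estimates \(M(W_0)\) by inverting the characteristic function \(f(t) = \prod _k (1 - 2i\lambda _k t)^{-1/2}\) on a grid:

```python
# Numerical check of the two-sided bound (2) via Fourier inversion:
#   p(x) = (1/2pi) * Integral exp(-itx) f(t) dt,
#   f(t) = prod_k (1 - 2i*lam_k*t)^(-1/2).
import numpy as np

rng = np.random.default_rng(0)
lam = np.sort(rng.uniform(0.1, 1.0, size=8))[::-1]   # lambda_1 >= ... >= lambda_n
A1, A2 = np.sum(lam**2), np.sum(lam[1:]**2)

t = np.linspace(-200.0, 200.0, 400001)               # truncated t-grid (ad hoc)
f = np.prod((1 - 2j * lam[:, None] * t) ** (-0.5), axis=0)
dt = t[1] - t[0]

xs = np.linspace(0.01, 3 * lam.sum(), 200)           # x-grid for the maximum
p = np.array([(np.exp(-1j * t * x) * f).sum().real for x in xs]) * dt / (2 * np.pi)

c0 = 1 / (4 * np.e**2 * np.sqrt(2 * np.pi))
c1 = 2 / np.sqrt(np.pi)
print(c0 * (A1 * A2) ** (-0.25) <= p.max() <= c1 * (A1 * A2) ** (-0.25))  # expect True
```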

Theorem 1 can be extended to more general weighted sums:

$$\begin{aligned} W_a = \lambda _1 (Z_1 - a_1)^2 + \dots + \lambda _n (Z_n - a_n)^2 \end{aligned}$$
(3)

with parameters \(\lambda _1 \ge \dots \ge \lambda _n > 0\) and \(a = (a_1, \dots , a_n) \in \mathbf{R}^n\).

It has a continuous probability density function p(x, a) on the positive half-axis \(x > 0\). Define the functional

$$ M(W_a) = \sup _x \, p(x, a). $$

Remark. It is known that for any non-centred Gaussian element Y in a Hilbert space, the random variable \(||Y||^2\) is distributed as \(\sum _{i=1}^{\infty }\lambda _i (Z_i - a_i)^2\) with some real \(a_i\) and \(\lambda _i\) such that

$$\lambda _1 \ge \lambda _2 \ge \dots \ge 0 \ \ \ \mathrm{and} \ \ \ \sum _{i=1}^{\infty }\lambda _i < \infty .$$

Therefore, the upper bounds for \(M(W_a)\) immediately imply the upper bounds for the probability density function of \(||Y||^2\).

Theorem 2

If \(\lambda _1^2 \le A_1/3\), then one has a two-sided bound

$$ \frac{1}{4\sqrt{3}}\,\frac{1}{\sqrt{A_1 + B_1}} \le M(W_a) \le \frac{2}{\sqrt{A_1 + B_1}}, $$

where

$$ A_1 = \sum _{k=1}^n \lambda _k^2, \qquad B_1 = \sum _{k=1}^n \lambda _k^2 a_k^2. $$

Moreover, the left inequality holds true without any assumption on \(\lambda _1^2\).

Remark. In Theorem 2 we only consider the typical situation, in which \(\lambda _1\) does not dominate the other coefficients. Moreover, the condition \(\lambda _1^2 \le A_1/3\) necessarily implies that \(n \ge 3\). If this condition is violated, the behaviour of \(M(W_a)\) should be studied separately.
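The bound of Theorem 2 admits the same kind of numerical test as Theorem 1; in the sketch below (again our illustration, assuming NumPy, with ad hoc grid parameters), the density of \(W_a\) is recovered by Fourier inversion of the characteristic function computed at the beginning of Sect. 4:

```python
# Numerical check of Theorem 2 via Fourier inversion, using
#   E exp(it*lam*(Z-a)^2) = (1-2i*lam*t)^(-1/2) * exp(a^2 * i*lam*t/(1-2i*lam*t)).
import numpy as np

rng = np.random.default_rng(1)
lam = np.sort(rng.uniform(0.2, 1.0, size=10))[::-1]
a = rng.normal(size=10)
A1, B1 = np.sum(lam**2), np.sum(lam**2 * a**2)
assert lam[0] ** 2 <= A1 / 3          # condition of Theorem 2; re-draw if it fails

t = np.linspace(-150.0, 150.0, 300001)
z = lam[:, None] * t
f = np.prod((1 - 2j * z) ** (-0.5)
            * np.exp(a[:, None] ** 2 * 1j * z / (1 - 2j * z)), axis=0)
dt = t[1] - t[0]

xs = np.linspace(0.01, 3 * np.sum(lam * (1 + a**2)), 200)
p = np.array([(np.exp(-1j * t * x) * f).sum().real for x in xs]) * dt / (2 * np.pi)
print(1 / (4 * np.sqrt(3) * np.sqrt(A1 + B1)) <= p.max() <= 2 / np.sqrt(A1 + B1))  # expect True
```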

3 Auxiliary Results

For the lower bounds in the theorems, one may apply the following lemma, which goes back to the work by Statulyavichus [6], see also Proposition 2.1 in [7].

Lemma 1

Let \(\eta \) be a random variable with \(M(\eta )\) denoting the maximum of its probability density function. Then one has

$$\begin{aligned} M^2(\eta ) \,\mathrm{Var}(\eta ) \ge \frac{1}{12}. \end{aligned}$$
(4)

Moreover, the equality in (4) is attained for the uniform distribution on any finite interval: for the uniform distribution on an interval of length h one has \(M(\eta ) = 1/h\) and \(\mathrm{Var}(\eta ) = h^2/12\), so that \(M^2(\eta )\,\mathrm{Var}(\eta ) = 1/12\).

Remark. There are multidimensional extensions of (4), see e.g. [8, 9] and Section III in [10].

Proof. Without loss of generality we may assume that \(M(\eta ) =1.\)

Put \(H(x) = \mathbf{P}(|\eta - \mathbf{E}\eta | \ge x), \quad x\ge 0.\)

Then \(H(0) = 1\) and \(H'(x)\ge -2\) (the density of \(|\eta - \mathbf{E}\eta |\) is bounded by \(2M(\eta ) = 2\)), which gives \(H(x)\ge 1 - 2x,\) so

$$\begin{aligned} \mathrm{Var}(\eta ) &= 2\int _0^{\infty } xH(x)\, dx \ \ge \ 2\int _0^{1/2} xH(x)\, dx \\ &\ge \ 2\int _0^{1/2} x(1-2x)\, dx \ = \ \frac{1}{12}. \end{aligned}$$

Lemma is proved.
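For illustration (ours, assuming SciPy; the grid bounds are ad hoc), one can tabulate \(M^2(\eta )\,\mathrm{Var}(\eta )\) for a few standard laws; the uniform distribution attains the constant 1/12 exactly:

```python
# Illustration of (4): M^2 * Var for several distributions; the uniform
# law attains the extremal value 1/12 = 0.0833...
import numpy as np
from scipy.stats import expon, norm, uniform

for name, dist in [("uniform", uniform), ("normal", norm), ("exponential", expon)]:
    x = np.linspace(dist.ppf(1e-6), dist.ppf(1 - 1e-6), 200001)
    M = dist.pdf(x).max()            # maximum of the density on a fine grid
    print(name, M**2 * dist.var())   # always >= 1/12
```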

The following lemma will give the lower bound in Theorem 2.

Lemma 2

For the random variable \(W_a\) defined in (3), the maximum \(M(W_a)\) of its probability density function satisfies

$$\begin{aligned} M(W_a) \ge \frac{1}{4\sqrt{3}}\,\frac{1}{\sqrt{A_1 + B_1}}, \end{aligned}$$
(5)

where

$$ A_1 = \sum _{k=1}^n \lambda _k^2, \qquad B_1 = \sum _{k=1}^n \lambda _k^2 a_k^2. $$

Proof. Given \(Z \sim N(0,1)\) and \(b \in \mathbf{R}\), we have

$$ \mathbf{E}\,(Z - b)^2 = 1 + b^2, \qquad \mathbf{E}\,(Z - b)^4 = 3 + 6b^2 + b^4, $$

so that \(\mathrm{Var}((Z - b)^2) = 2 + 4b^2\). It follows that

$$ \mathrm{Var}(W_a) = \sum _{k=1}^n \lambda _k^2\,(2 + 4a_k^2) = 2 A_1 + 4B_1 \le 4 (A_1 + B_1). $$

Applying (4) with \(\eta = W_a\), we arrive at (5).

Lemma is proved.
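The moment identities used above are elementary but easy to mistype; a quick Monte Carlo check (our sketch, assuming NumPy) is:

```python
# Monte Carlo sanity check of E(Z-b)^2 = 1 + b^2 and Var((Z-b)^2) = 2 + 4b^2.
import numpy as np

rng = np.random.default_rng(3)
Z = rng.normal(size=10**7)
for b in [0.0, 0.7, 2.0]:
    Y = (Z - b) ** 2
    print(b, Y.mean(), 1 + b**2, Y.var(), 2 + 4 * b**2)
```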

The proofs of the upper bounds in the theorems are based on the following lemma.

Lemma 3

Let

$$\alpha _1^2 + \dots + \alpha _n^2 = 1.$$

If \(\alpha _k^2 \le {1}/{m}\) for all k, where m is a positive integer, then the characteristic function f(t) of the random variable

$$ W = \alpha _1 Z_1^2 + \dots + \alpha _n Z_n^2 $$

satisfies

$$\begin{aligned} |f(t)| \le \frac{1}{(1 + 4t^2/m)^{m/4}}. \end{aligned}$$
(6)

In particular, in the cases \(m=4\) and \(m = 3\), W has a bounded density with \(M(W) \le {1}/{2}\) and \(M(W) < 0.723\), respectively.

Proof. Necessarily \(n \ge m\), since \(1 = \alpha _1^2 + \dots + \alpha _n^2 \le n/m\). The characteristic function has the form

$$ f(t) = \prod _{k=1}^n (1 - 2\alpha _k it)^{-1/2}, $$

so

$$ - \log |f(t)| = \frac{1}{4} \sum _{k=1}^n \log (1 + 4\alpha _k^2 t^2). $$

First, let us describe the argument in the simplest case \(m=1\).

For a fixed t, consider the concave function

$$ V(b_1,\dots ,b_n) = \sum _{k=1}^n \log (1 + 4b_k t^2) $$

on the simplex

$$ Q_1 \, = \, \Big \{(b_1,\dots ,b_n): b_k \ge 0, \ b_1 + \dots + b_n = 1\Big \}. $$

It has n extreme points \(b^k = (0,\dots ,0,1,0,\dots ,0)\). Since a concave function on a compact convex polytope attains its minimum at an extreme point, we have

$$ \min _{b \in Q_1} V(b) = V(b^{k}) = \log (1 + 4t^2), $$

that is, \(|f(t)| \le (1 + 4t^2)^{-1/4}\), which corresponds to (6) for \(m=1\).

If \(m=2\), we consider the same function V on the convex set

$$ Q_2 = \Big \{(b_1,\dots ,b_n): 0 \le b_k \le \frac{1}{2}, \ b_1 + \dots + b_n = 1\Big \}, $$

which is just the intersection of the cube \([0,\frac{1}{2}]^n\) with the hyperplane \(\{b_1 + \dots + b_n = 1\}\). It has \({n(n-1)}/{2}\) extreme points

$$b^{kj}, \ \ \ 1 \le k < j \le n,$$

with coordinates 1/2 on the j-th and k-th places and with zero elsewhere. Indeed, suppose that a point

$$b = (b_1,\dots ,b_n) \in Q_2$$

has at least two coordinates strictly between 0 and 1/2, say \(0< b_k, b_j < {1}/{2}\) for some \(k < j\). Let x be the point with coordinates

$$x_l = b_l \ \ \ \mathrm{for} \ \ \ l \ne k,j,\,\,\, x_k = b_k + \varepsilon , \ \ \ \mathrm{and} \ \ \ x_j = b_j - \varepsilon , $$

and similarly, let y be the point such that

$$ y_l = b_l \ \ \ \mathrm{for} \ \ \ l \ne k,j, \,\,\, y_k = b_k - \varepsilon , \ \ \ \mathrm{and} \ \ \ y_j = b_j + \varepsilon .$$

If \(\varepsilon >0\) is small enough, then both x and y lie in \(Q_2\), while

$$b = (x+y)/2, \ \ \ x \ne y.$$

Hence such b cannot be an extreme point; moreover, the constraint \(b_1 + \dots + b_n = 1\) rules out configurations with exactly one coordinate in \((0, 1/2)\). Consequently, every extreme point b of \(Q_2\) is of the form

$$b^{kj}, \ \ \ 1 \le k < j \le n.$$

Therefore, we conclude that

$$ \min _{b \in Q_2} V(b) = V(b^{kj}) = 2\log (1 + 2t^2), $$

which corresponds to (6) for \(m = 2\).

In the general case, consider the function V on the convex set

$$ Q_m = \Big \{(b_1,\dots ,b_n): 0 \le b_k \le \frac{1}{m}, \ b_1 + \dots + b_n = 1\Big \}. $$

By a similar argument, any extreme point b of \(Q_m\) has zero coordinates except at m places, where the coordinates are equal to 1/m. Therefore,

$$ \min _{b \in Q_m} V(b) = V\Big (\frac{1}{m},\dots ,\frac{1}{m},0,\dots ,0\Big ) = m\log (1 + 4t^2/m), $$

and we are done.

In case \(m=4\), using the inversion formula, we get

$$ M(W) \, \le \, \frac{1}{2\pi } \int _{-\infty }^\infty |f(t)|\,dt \, \le \, \frac{1}{2\pi } \int _{-\infty }^\infty \frac{1}{1 + t^2}\,dt \, = \, \frac{1}{2}. $$

Similarly, in the case \(m=3\),

$$ M(W) \, \le \, \frac{1}{2\pi } \int _{-\infty }^\infty \frac{1}{(1 + \frac{4}{3}\,t^2)^{3/4}}\,dt \, < \, 0.723. $$

Lemma is proved.
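Inequality (6) and the resulting density bounds are easy to probe numerically; the following sketch (our illustration, assuming NumPy; the weight draw and t-grid are ad hoc) samples admissible weights and compares \(|f(t)|\) with the right-hand side of (6):

```python
# Numerical probe of (6): draw alpha_k^2 on the simplex with alpha_k^2 <= 1/m
# and compare |f(t)| = prod (1 + 4*alpha_k^2*t^2)^(-1/4) with (1 + 4t^2/m)^(-m/4).
import numpy as np

rng = np.random.default_rng(2)
m, n = 4, 12
alpha2 = rng.dirichlet(np.ones(n))        # alpha_k^2, summing to 1
while alpha2.max() > 1 / m:               # enforce the constraint alpha_k^2 <= 1/m
    alpha2 = rng.dirichlet(np.ones(n))

t = np.linspace(-50.0, 50.0, 100001)
absf = np.prod((1 + 4 * alpha2[:, None] * t**2) ** (-0.25), axis=0)
bound = (1 + 4 * t**2 / m) ** (-m / 4)
print(np.all(absf <= bound + 1e-12))                # expect True
print(bound.sum() * (t[1] - t[0]) / (2 * np.pi))    # about 0.49 <= 1/2 for m = 4
```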

4 Proofs of Main Results

Proof of Theorem 1. In the following we shall write W instead of \(W_0\).

If \(n=1\), then the distribution function and the probability density function of \(W = \lambda _1 Z_1^2\) are given by

$$ F(x) = 2\,\varPhi \bigg (\sqrt{\frac{x}{\lambda _1}}\,\bigg ) - 1, \quad p(x) = \frac{1}{\sqrt{2\pi \lambda _1}}\,e^{-x/(2\lambda _1)} \qquad (x > 0), $$

respectively. Therefore, p is unbounded near zero, so that \(M(W) = \infty \). This is consistent with (2), in which case \(A_1 = \lambda _1^2\) and \(A_2 = 0\).

If \(n=2\), the density p(x) is described as the convolution

$$\begin{aligned} p(x) \, = \, \frac{1}{2\pi \sqrt{\lambda _1 \lambda _2}} \int _0^1 \frac{1}{\sqrt{(1-t) t}}\,\exp \Big \{-\frac{x}{2}\, \Big [\frac{1-t}{\lambda _1} + \frac{t}{\lambda _2}\Big ]\Big \}\,dt \qquad (x>0). \end{aligned}$$
(7)

Hence, p is decreasing and attains its maximum at \(x=0\):

$$ M(W) = \frac{1}{2\pi \sqrt{\lambda _1 \lambda _2}} \int _0^1 \frac{dt}{\sqrt{(1-t) t}} = \frac{1}{2\pi \sqrt{\lambda _1 \lambda _2}}\,B\Big (\frac{1}{2},\frac{1}{2}\Big ) = \frac{1}{2\sqrt{\lambda _1 \lambda _2}}. $$

Since \(A_1 = \lambda _1^2 + \lambda _2^2\) and \(A_2 = \lambda _2^2\), we conclude, using the assumption \(\lambda _1 \ge \lambda _2\), that

$$ \frac{1}{2}\,(A_1 A_2)^{-1/4} \le M(W) \le \frac{1}{2^{3/4}}\,(A_1 A_2)^{-1/4}. $$

As for the case \(n \ge 3\), the density p vanishes at zero and attains its maximum at some point \(x>0\).

The further proof of Theorem 1 is based on the following observations and Lemma 3.

By homogeneity of (2), we may assume that \(A_1 = 1\).

If \(\lambda _1 \le {1}/{2}\), then all \(\lambda _k^2 \le {1}/{4}\), so that \(M(W) \le {1}/{2}\) by Lemma 3 with \(m = 4\). Since \(A_2 \le A_1 = 1\), the inequality

$$ M(W) \le \frac{1}{2}\,(A_1 A_2)^{-1/4} $$

holds true.

Now, let \(\lambda _1 \ge {1}/{2}\), so that \(A_2 \le {3}/{4}\). Write

$$ W = \lambda _1 Z_1^2 + \sqrt{A_2}\,\xi , \qquad \xi = \sum _{k=2}^n \alpha _k Z_k^2, \quad \alpha _k = \frac{\lambda _k}{\sqrt{A_2}}. $$

By construction, \(\alpha _2^2 + \dots + \alpha _n^2 = 1\).

Case 1: \(\lambda _2 \ge \sqrt{A_2}/2\). Since the functional M can only decrease when an independent summand is added to W, we get, using (7), that

$$ M(W) \le M(\lambda _1 Z_1^2 + \lambda _2 Z_2^2) = \frac{1}{2\sqrt{\lambda _1 \lambda _2}} \le c\,(A_1 A_2)^{-1/4}, $$

where the last inequality holds with \(c = 1\). This gives the upper bound in (2) with constant 1.

Case 2: \(\lambda _2 \le \sqrt{A_2}/2\). Then all \(\alpha _k^2 \le {1}/{4}\) for \(k > 1\), which forces \(n - 1 \ge 4\), that is, \(n \ge 5\). By Lemma 3 with \(m=4\), the random variable \(\xi \) has a probability density function q bounded by 1/2. The distribution function of W may be written as

$$ \mathbf{P}\{W \le x\} = \int _0^{x/\sqrt{A_2}} \mathbf{P}\Big \{|Z_1| \le \frac{1}{\sqrt{\lambda _1}}\, (x - y\sqrt{A_2})^{1/2}\Big \}\,q(y)\,dy, \quad x > 0, $$

and its density has the form

$$ p(x) = \frac{1}{\sqrt{2\pi \lambda _1}} \int _0^{x/\sqrt{A_2}} \frac{1}{\sqrt{x - y\sqrt{A_2}}}\,e^{-(x - y\sqrt{A_2})/(2\lambda _1)}\,q(y)\,dy. $$

Equivalently,

$$\begin{aligned} p(x\sqrt{A_2}) = \frac{1}{\sqrt{2\pi \lambda _1}}\, A_2^{-1/4} \int _0^{x} \frac{1}{\sqrt{x - y}}\,e^{-(x - y)\sqrt{A_2}/(2\lambda _1)}\,q(y)\,dy. \end{aligned}$$
(8)

Since \(\lambda _1 \ge {1}/{2}\), we immediately obtain that

$$ M(W) \le A_2^{-1/4}\,\frac{1}{\sqrt{\pi }}\ \sup _{x>0} \ \int _0^x \frac{1}{\sqrt{x - y}}\,q(y)dy. $$

But, using \(q \le {1}/{2}\), we get

$$\begin{aligned} \int _0^x \frac{1}{\sqrt{x - y}}\,q(y)\,dy &= \int _{0< y< x, \ x-y< 1} \frac{1}{\sqrt{x - y}}\,q(y)\,dy + \int _{0< y< x, \ x-y> 1} \frac{1}{\sqrt{x - y}}\,q(y)\,dy \\ &\le \ \frac{1}{2}\int _0^1 \frac{dz}{\sqrt{z}} + 1 \ = \ 2. \end{aligned}$$

Thus,

$$ M(W) \le 2 A_2^{-1/4}\,\frac{1}{\sqrt{\pi }}. $$

Combining the obtained upper bounds for M(W) in all cases we get the upper bound in (2).

For the lower bound, one may apply the inequality (4) in Lemma 1. Thus, we obtain that

$$ M(W) \ge \frac{1}{2\sqrt{6}} $$

due to the assumption \(A_1 = 1\) and the identity \(\mathrm{Var}(W) = 2\sum _{k=1}^n \lambda _k^2 = 2A_1 = 2\).

If \(\lambda _1^2 \le {1}/{2}\), we have \(A_2 \ge {1}/{2}\). Hence,

$$\begin{aligned} M(W) \ge \frac{1}{2\sqrt{6}} \ge c_0\,(A_1 A_2)^{-1/4}, \end{aligned}$$
(9)

where the last inequality holds true with

$$c_0 = \frac{1}{2^{5/4}\sqrt{6}} > 0.171.$$

In case \(\lambda _1^2 \ge \frac{1}{2}\), we have \(A_2 \le {1}/{2}\). Returning to the formula (8), let us choose \(x = \mathbf{E}\xi + 2\) and restrict the integration to the interval

$$ \varDelta : \max (\mathbf{E}\xi - 2,0)< y < \mathbf{E}\xi + 2. $$

On this interval necessarily

$$x - y \le 4.$$

Therefore, since \(x - y \le 4\) implies \((x-y)^{-1/2} \ge 1/2\), and \(e^{-(x - y)\sqrt{A_2}/(2\lambda _1)} \ge e^{-2\sqrt{A_2}/\lambda _1}\) on \(\varDelta \), (8) yields

$$ M(W) \ge \frac{A_2^{-1/4}}{2\sqrt{2\pi \lambda _1}} \,\cdot e^{-2\sqrt{A_2}/\lambda _1}\,\mathbf{P}\{\xi \in \varDelta \}. $$

Here,

$$ \frac{A_2}{\lambda _1^2} = \frac{1}{\lambda _1^2} - 1 \le 1, $$

and, using also \(\lambda _1 \le 1\), we get

$$ M(W) \ge \frac{A_2^{-1/4}}{2\sqrt{2\pi }} \,\cdot e^{-2}\,\mathbf{P}\{\xi \in \varDelta \}. $$

Now, recall that \(\xi \ge 0\) and \(\mathrm{Var}(\xi ) = 2\,(\alpha _2^2 + \dots + \alpha _n^2) = 2\). Hence, by Chebyshev’s inequality,

$$ \mathbf{P}\{|\xi - \mathbf{E}\xi | \ge 2\} \le \frac{1}{4}\,\mathrm{Var}(\xi ) = \frac{1}{2}. $$

That is, \(\mathbf{P}\{\xi \in \varDelta \} \ge {1}/{2}\), and thus

$$ M(W) \ge \frac{ (A_1 A_2)^{-1/4} }{4\sqrt{2\pi }} \,e^{-2} \ge 0.013\cdot (A_1 A_2)^{-1/4} . $$

Theorem 1 is proved.

Proof of Theorem 2. In the following we shall write W instead of \(W_a\).

The lower bound in Theorem 2 immediately follows from (5) in Lemma 2 without any assumption on \(\lambda _1^2.\)

Our next aim is to reverse this bound up to a numerical factor under suitable natural assumptions.

Without loss of generality, let \(A_1 = 1\). Our basic condition will be that \(\lambda _1^2 \le {1}/{3}\), similarly to the first part of the proof of Theorem 1. Note that if \(\lambda _1^2 \le {1}/{3}\) then necessarily \(n \ge 3\).

As is easy to check, for \(Z \sim N(0,1)\) and \(a \in \mathbf{R}\),

$$ \mathbf{E}\,e^{it\,(Z-a)^2} = \frac{1}{\sqrt{1 - 2it}}\, \exp \Big \{a^2\,\frac{it}{1 - 2it}\Big \}, \qquad t \in \mathbf{R}, $$

so that, using \(|e^z| = e^{\mathrm{Re}\,z}\) and \(\mathrm{Re}\,\frac{it}{1 - 2it} = -\frac{2t^2}{1 + 4t^2}\),

$$ \Big |\mathbf{E}\,e^{it\,(Z-a)^2}\Big | = \frac{1}{(1 + 4t^2)^{1/4}}\, \exp \Big \{-2a^2\,\frac{t^2}{1 + 4t^2}\Big \}. $$

Hence, the characteristic function f(t) of W satisfies

$$ - \log |f(t)| = \frac{1}{4} \sum _{k=1}^n \log (1 + 4\lambda _k^2 t^2) + 2 \sum _{k=1}^n a_k^2\,\frac{\lambda _k^2 t^2}{1 + 4\lambda _k^2 t^2}. $$

Since \(\lambda _1^2 \le \frac{1}{3}\), by the monotonicity, all \(\lambda _k^2 \le \frac{1}{3}\) as well. But, as we have already observed, under the conditions

$$0 \le b_k \le \frac{1}{3}, \ \ \ b_1 + \dots + b_n = 1,$$

and for any fixed value \(t \in \mathbf{R}\), the function

$$ \psi (b_1,\dots ,b_n) = \sum _{k=1}^n \log (1 + 4b_k t^2) $$

is minimized for the vector with coordinates

$$b_1 = b_2 = b_3 = \frac{1}{3} \ \ \ \mathrm{and} \ \ \ b_k = 0 \ \ \ \mathrm{for} \ \ \ k>3.$$

Hence,

$$ \psi (b_1,\dots ,b_n) \ge 3\, \log (1 + 4t^2/3) \ge 3\, \log (1 + t^2). $$

Therefore, one may conclude that

$$\begin{aligned} |f(t)| \le \frac{1}{(1 + t^2)^{3/4}}\, \exp \Big \{-2 \sum _{k=1}^n a_k^2\,\frac{\lambda _k^2 t^2}{1 + 4\lambda _k^2 t^2}\Big \}. \end{aligned}$$
(10)

We now invoke the inversion formula which, since |f(t)| is an even function, yields the upper bound

$$\begin{aligned} M(W) \le \frac{1}{\pi } \int _0^\infty |f(t)|\,dt. \end{aligned}$$
(11)

In the interval

$$0< t < T = \frac{1}{2\lambda _1},$$

we have \(\lambda _k^2 t^2 \le {1}/{4}\) for all k, so that \(1 + 4\lambda _k^2 t^2 \le 2\), and the bound (10) simplifies to

$$ |f(t)| \le \frac{1}{(1 + t^2)^{3/4}}\,e^{-B_1 t^2}. $$

This gives

$$ \int _0^T |f(t)|\,dt \le I(B_1) \equiv \int _0^\infty \frac{1}{(1 + t^2)^{3/4}}\,e^{-B_1 t^2}\,dt. $$

If \(B_1 \le 1\),

$$ I(B_1) \le \int _0^\infty \frac{1}{(1 + t^2)^{3/4}}\,dt < 3, $$

while for \(B_1 \ge 1\),

$$ I(B_1) \le \int _0^\infty e^{-B_1 t^2}\,dt = \frac{\sqrt{\pi }}{2\sqrt{B_1}} < \frac{1}{\sqrt{B_1}}. $$

The two estimates can be united by

$$I(B_1) \le \frac{ 3\sqrt{2}}{\sqrt{1+B_1}}.$$

To perform the integration over the half-axis \(t \ge T\), a different argument is needed. Put \(p_k = a_k^2 \lambda _k^2/B_1\), so that \(p_k \ge 0\) and \(p_1 + \dots + p_n = 1\). By Jensen’s inequality applied to the convex function \(V(x) = {1}/{(1 + x)}\), \(x \ge 0\), at the points \(x_k = 4\lambda _k^2 t^2\), we have

$$\begin{aligned} \sum _{k=1}^n a_k^2\,\frac{\lambda _k^2 t^2}{1 + 4\lambda _k^2 t^2} &= B_1 t^2 \sum _{k=1}^n p_k V(x_k) \ \ge \ B_1 t^2\, V(p_1 x_1 + \dots + p_n x_n) \\ &= \frac{B_1 t^2}{1 + \frac{4t^2}{B_1} \sum _{k=1}^n a_k^2 \lambda _k^4} \ \ge \ \frac{B_1 t^2}{1 + \frac{4t^2}{3B_1} \sum _{k=1}^n a_k^2 \lambda _k^2} \ = \ \frac{B_1 t^2}{1 + \frac{4}{3}\,t^2}, \end{aligned}$$

where we used the property \(\lambda _k^2 \le {1}/{3}\). Moreover, since

$$t^2 \ge \frac{1}{(2\lambda _1)^2} \ge \frac{3}{4},$$

necessarily

$$ \frac{t^2}{1 + \frac{4}{3}\,t^2} \ge \frac{3}{8}. $$

Hence, from (10) we get

$$ |f(t)| \le \frac{1}{(1 + t^2)^{3/4}}\,e^{-3B_1/4}, \quad t \ge T, $$

and

$$ \int _T^\infty |f(t)|\,dt \le e^{-3B_1/4} \int _{\sqrt{3}/2}^\infty \frac{1}{(1 + t^2)^{3/4}}\,dt< 1.88\, e^{-3B_1/4} \le \frac{1.88}{\sqrt{1+B_1}}. $$

Combining the estimates over the two regions of integration and using \({(3\sqrt{2} + 1.88)}/{\pi } < 1.95\), the bound (11) leads to

$$M(W) < \frac{2}{\sqrt{A_1+B_1}}.$$

Thus, this inequality, together with Lemma 2, completes the proof of the theorem.
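Finally, the quadrature constants used in the proofs can be re-checked directly; a minimal sketch (ours, assuming SciPy) is:

```python
# Re-check of the numerical constants in the proofs of Lemma 3 and Theorem 2.
import numpy as np
from scipy.integrate import quad

g = lambda t: (1 + t**2) ** (-0.75)

val, _ = quad(lambda t: (1 + 4 * t**2 / 3) ** (-0.75), -np.inf, np.inf)
print(val / (2 * np.pi))                   # about 0.7228 < 0.723  (Lemma 3, m = 3)
print(quad(g, 0, np.inf)[0])               # about 2.622 < 3       (bound on I(B_1))
print(quad(g, np.sqrt(3) / 2, np.inf)[0])  # about 1.875 < 1.88    (tail integral)
print((3 * np.sqrt(2) + 1.88) / np.pi)     # about 1.949 < 1.95
```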