Abstract
We study probability inequalities leading to tail estimates in a general semigroup \({\mathscr {G}}\) with a translation-invariant metric \(d_{\mathscr {G}}\). (An important and central example of this in the functional analysis literature is that of \({\mathscr {G}}\) a Banach space.) Using our prior work Khare and Rajaratnam (Ann Prob 45(6A):4101–4111, 2017) that extends the Hoffmann–Jørgensen inequality to all metric semigroups, we obtain tail estimates and approximate bounds for sums of independent semigroup-valued random variables, their moments, and decreasing rearrangements. In particular, we obtain the “correct” universal constants in several cases, extending results in the Banach space literature by Johnson et al. (Ann Prob 13(1):234–253, 1985), Hitczenko (Ann Prob 22(1):453–468, 1994), and Hitczenko and Montgomery-Smith (Ann Prob 29(1):447–466, 2001). Our results also hold more generally, in a very primitive mathematical framework required to state them: metric semigroups \({\mathscr {G}}\). This includes all compact, discrete, or (connected) abelian Lie groups.
1 Introduction and main results
This paper follows our prior work [15] and continues the study of probability theory beyond—but also subsuming—the Banach space setting. In the present work, we estimate sums of independent random variables in several ways, under very primitive mathematical assumptions that suffice to state our results. The setting is as follows.
Definition 1.1
A metric semigroup is defined to be a semigroup \(({\mathscr {G}}, \cdot )\) equipped with a metric \(d_{\mathscr {G}}: {\mathscr {G}}\times {\mathscr {G}}\rightarrow [0,\infty )\) that is translation-invariant. In other words,
$$\begin{aligned} d_{\mathscr {G}}(ac, bc) = d_{\mathscr {G}}(ca, cb) = d_{\mathscr {G}}(a,b) \qquad \forall a, b, c \in {\mathscr {G}}. \end{aligned}$$
(Equivalently, \(({\mathscr {G}},d_{\mathscr {G}})\) is a metric space equipped with an associative binary operation such that \(d_{\mathscr {G}}\) is translation-invariant.) Similarly, one defines a metric monoid and a metric group.
Metric groups are ubiquitous in probability theory and functional analysis, and subsume all normed linear spaces as well as compact and (connected) abelian Lie groups as special cases. More modern examples of recent interest are mentioned presently.
Now suppose \((\varOmega , {\mathscr {A}}, \mu )\) is a probability space and \(X_1, \ldots , X_n \in L^0(\varOmega ,{\mathscr {G}})\) are \({\mathscr {G}}\)-valued random variables. Fix \(z_0, z_1 \in {\mathscr {G}}\) and define for \(1 \leqslant j \leqslant n\):
In this paper we discuss bounds that govern the behavior of \(U_n\)—and consequently, of sums \(S_n\) of independent \({\mathscr {G}}\)-valued random variables \(X_j\)—in terms of the variables \(X_j\), and even \(Y_j\) or \(M_j\). We are interested in a variety of bounds: (a) one-sided geometric tail estimates; (b) approximate two-sided bounds for tail probabilities; (c) approximate two-sided bounds for moments; and (d) comparison of moments. For instance, is it possible to obtain bounds for \({\mathbb {E}}_\mu [U_n^p]^{1/p}\) in terms of the tail distribution for \(U_n\), or in terms of \({\mathbb {E}}_\mu [U_n^q]^{1/q}\) for \(p,q > 0\)? The latter question has been well-studied in the literature for Banach spaces, and universal bounds that grow at the “correct” rate have been obtained for all \(q \gg 0\). We explore the question of obtaining correctly growing universal constants for metric semigroups, which include not only normed linear spaces and inner product spaces, but also all connected abelian and compact Lie groups. Our results show that the universal constants in such inequalities do not depend on the semigroup in question.
1.1 Motivations
Our motivations in developing probability theory in such general settings are both modern and classical. An increasing number of modern-day theoretical and applied settings require mathematical frameworks that go beyond Banach spaces. For instance, data and random variables may take values in manifolds such as (real or complex) Lie groups. Compact or connected abelian Lie groups also commonly feature in the literature, including permutation groups and other finite groups, lattices, orthogonal groups, and tori. In fact every abelian, Hausdorff, metrizable, topologically complete group G admits a translation-invariant metric [17], though this fails to hold for cancellative semigroups [18]. Certain classes of amenable groups are also metric groups (see [14] for more details). Other modern examples arise in the study of large networks and include the space of graphons with the cut norm, which arises naturally out of combinatorics and is related to many applications [21]. In a parallel vein, the space of labelled graphs \({\mathscr {G}}(V)\) on a fixed vertex set V is a 2-torsion metric group (see [12, 13]), hence does not embed into a normed linear space.
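As a quick illustration of the last example, the labelled-graph space \({\mathscr {G}}(V)\) can be modelled in a few lines of Python; the encoding below (graphs as edge sets under symmetric difference, with the size of the symmetric difference as the metric) is a sketch of one natural choice and is not claimed to match the normalization used in [12, 13].

```python
# A toy model of the labelled-graph group G(V): graphs on a fixed vertex set V,
# encoded as frozensets of edges, with symmetric difference as the group
# operation and d(A, B) = |A symmetric-difference B| as a translation-invariant
# metric. (Illustrative encoding; not the paper's normalization.)
from itertools import combinations
import random

V = range(5)
EDGES = list(combinations(V, 2))          # edge set of the complete graph on V

def op(a, b):
    """Group operation: edgewise symmetric difference."""
    return a ^ b

def d(a, b):
    """Translation-invariant metric: number of edges in the symmetric difference."""
    return len(a ^ b)

random.seed(0)
def rand_graph():
    return frozenset(e for e in EDGES if random.random() < 0.5)

a, b, c = rand_graph(), rand_graph(), rand_graph()

# Translation invariance: d(ac, bc) = d(ca, cb) = d(a, b).
assert d(op(a, c), op(b, c)) == d(a, b) == d(op(c, a), op(c, b))
# 2-torsion: every element is its own inverse.
assert op(a, a) == frozenset()
```

Since every element squares to the identity, no injective homomorphism of this group into the additive group of a normed linear space can exist, which is why the example genuinely leaves the Banach space setting.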
With the above settings in mind, in this paper we develop novel techniques for proving maximal inequalities—as well as comparison results between tail distributions and various moments—for sums of independent random variables taking values in the aforementioned groups, which need not be Banach spaces.
At the same time, we also have theoretical motivations in mind when developing probability theory on non-linear spaces such as \({\mathscr {G}}(V)\) and beyond. Throughout the past century, the emphasis in probability has shifted somewhat from proving results on stochastic convergence, to obtaining sharper and stronger bounds on random sums, in increasingly weaker settings. A celebrated achievement of probability theory has been to develop a rigorous and systematic framework for studying the behavior of sums of (independent) random variables; see e.g. [20]. In this vein, we provide unifications of our results on graph space with those in the Banach space literature, by proving them in a more primitive mathematical framework encompassing both of these (and other) settings. In particular, our results apply to compact/abelian/discrete Lie groups, as well as normed linear spaces.
For example, maximal inequalities by Hoffmann–Jørgensen, Lévy, Ottaviani–Skorohod, and Mogul’skii require merely the notions of a metric and a binary associative operation to state them. Thus one only needs a separable metric semigroup \({\mathscr {G}}\) rather than a Banach space to state these inequalities. However, note that working in a metric semigroup raises technical questions. For instance, the lack of an identity element means one has to specify how to compute magnitudes of \({\mathscr {G}}\)-valued random variables (before trying to bound or estimate them); also, it is not apparent how to define truncations of random variables. The lack of inverses, norms, or commutativity implies in particular that one cannot rescale or subtract random variables.
In the present work, we explain how to overcome these challenges. We also hope to show that the approach of working with arbitrary metric semigroups turns out to be richly rewarding in (i) obtaining the above (and other) results for non-Banach settings; (ii) unifying these results with the existing Banach space results in order to hold in the greatest possible generality; and (iii) further strengthening these unified versions where possible.
1.2 Organization and results
We now describe the organization and contributions of the present paper. In Sect. 2 we prove the Mogul’skii–Ottaviani–Skorohod inequalities for all metric semigroups \({\mathscr {G}}\). As an application, we show Lévy’s equivalence for stochastic convergence in metric semigroups.
In Sect. 3, we come to our main goal in this paper, of estimating and comparing moments and tail probabilities for sums of independent \({\mathscr {G}}\)-valued random variables. Our main tool is a variant of Hoffmann–Jørgensen’s inequality for metric semigroups, which is shown in recent work [15]. The relevant part for our purposes is now stated.
Theorem 1.1
(Khare and Rajaratnam [15]) Notation as in Definition 1.1 and Equation (1.1). Suppose \(X_1, \ldots \), \(X_n \in L^0(\varOmega ,{\mathscr {G}})\) are independent. Fix scalars
and define
where \(\delta _{i1}\) denotes the Kronecker delta. Now if \(\sum _{i=1}^k n_i \leqslant n+1\), then:
We remark that Theorem 1.1 generalizes the original Hoffmann–Jørgensen inequality in three ways: (i) mathematically, it strengthens the state-of-the-art even for real-valued random variables; (ii) it unifies previous results by Johnson and Schechtman [10], Klass and Nowicki [16], and Hitczenko and Montgomery-Smith [6] in the Banach space literature; and (iii) the result holds in the most primitive setting needed to state it, and is thereby applicable also to e.g. Lie groups.
We now discuss several ways in which to estimate the size of sums of independent \({\mathscr {G}}\)-valued random variables, for metric semigroups \({\mathscr {G}}\). We present two results in this section, corresponding to two of the estimation techniques discussed in the introduction. (For a third result, see Theorem 3.1.)
The first approach, informally speaking, uses the Hoffmann–Jørgensen inequality to generalize an upper bound for \({\mathbb {E}}_\mu [\Vert S_n\Vert ^p]\) in terms of the quantiles of \(\Vert S_n\Vert \) as well as \({\mathbb {E}}_\mu [M_n^p]\)—but now in the “minimal” framework of metric semigroups. More precisely, we show that controlling the behavior of \(X_n\) is equivalent to controlling \(S_n\) or \(U_n\), for all metric semigroups.
Theorem A
Suppose \(A \subset {\mathbb {N}}\) is either \({\mathbb {N}}\) or \(\{ 1, \ldots , N \}\) for some \(N \in {\mathbb {N}}\). Suppose \(({\mathscr {G}}, d_{\mathscr {G}})\) is a separable metric semigroup, \(z_0, z_1 \in {\mathscr {G}}\), and \(X_n \in L^0(\varOmega ,{\mathscr {G}})\) are independent for all \(n \in A\). If \(\sup _{n \in A} d_{\mathscr {G}}(z_1, z_0 S_n) < \infty \) almost surely, then for all \(p \in (0,\infty )\),
This result extends [7, Theorem 3.1] by Hoffmann–Jørgensen to the “minimal” framework of metric semigroups. The proofs of Theorem A and the next result use the notion of the quantile functions, or decreasing rearrangements, of \({\mathscr {G}}\)-valued random variables:
Definition 1.2
Suppose \(({\mathscr {G}}, d_{\mathscr {G}})\) is a metric semigroup, and \(X : (\varOmega , {\mathscr {A}}, \mu ) \rightarrow ({\mathscr {G}},{\mathscr {B}}_{\mathscr {G}})\). We define the decreasing (or non-increasing) rearrangement of X to be the right-continuous inverse \(X^*\) of the function \(t \mapsto {\mathbb {P}}_\mu \left( d_{\mathscr {G}}(z_0, z_0 X) > t \right) \), for any \(z_0 \in {\mathscr {G}}\). In other words, \(X^*\) is the real-valued random variable defined on [0, 1] with the Lebesgue measure, as follows:
$$\begin{aligned} X^*(t) := \inf \{ x> 0 \ : \ {\mathbb {P}}_\mu \left( d_{\mathscr {G}}(z_0, z_0 X) > x \right) \leqslant t \}. \end{aligned}$$
Note that \(X^*\) has exactly the same law as \(d_{\mathscr {G}}(z_0, z_0 X)\). Moreover, if \(({\mathscr {G}}, \Vert \cdot \Vert )\) is a normed linear space, then \(d_{\mathscr {G}}(z_0, z_0 X)\) can be replaced by \(\Vert X\Vert \), and often papers in the literature refer to \(X^*\) as the decreasing rearrangement of \(\Vert X\Vert \) instead of X itself. The convention that we adopt above is slightly weaker.
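For intuition, here is how \(X^*\) can be computed from an empirical sample in the real-valued case (so that \(d_{\mathscr {G}}(z_0, z_0 X) = |X|\)); the Exp(1) sample and sample size below are invented for illustration.

```python
# Empirical decreasing rearrangement X*(t) = inf{x > 0 : P(X > x) <= t},
# computed from a sample; real-valued case, so d(z0, z0 X) is just |X|.
import random

def rearrangement(samples):
    xs = sorted(samples, reverse=True)      # largest value first
    n = len(xs)
    def X_star(t):
        k = int(t * n + 1e-9)               # P(X > xs[k]) <= k/n empirically
        return xs[k] if k < n else 0.0
    return X_star

random.seed(1)
sample = [random.expovariate(1.0) for _ in range(10_000)]
X_star = rearrangement(sample)

# X*(t) <= x exactly when P(X > x) <= t (cf. Proposition 3.1 below):
t0 = sum(x > 1.0 for x in sample) / len(sample)
assert X_star(t0) <= 1.0
# X* is non-increasing in t, and E[X] >= t * X*(t) for the empirical law:
assert X_star(0.1) >= X_star(0.5) >= X_star(0.9)
assert sum(sample) / len(sample) >= 0.5 * X_star(0.5)
```

Note also that `X_star` has the same empirical law as the sample itself, matching the remark above that \(X^*\) and \(d_{\mathscr {G}}(z_0, z_0 X)\) are equal in law.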
The second approach provides another estimate on the size of \(S_n\) through its moments, by comparing \(\Vert S_n \Vert _q\) to \(\Vert S_n \Vert _p\)—or more precisely, \({\mathbb {E}}_\mu [U_n^q]^{1/q}\) to \({\mathbb {E}}_\mu [U_n^p]^{1/p}\)—for \(0 < p \leqslant q\). Moreover, the constants of comparison are universal, valid for all abelian semigroups and all finite sequences of independent random variables, and depend only on a threshold:
Theorem B
Given \(p_0 > 0\), there exist universal constants \(c = c(p_0), c' = c'(p_0) > 0\) depending only on \(p_0\), such that for all choices of (a) separable abelian metric semigroups \(({\mathscr {G}}, d_{\mathscr {G}})\), (b) finite sequences of independent \({\mathscr {G}}\)-valued random variables \(X_1, \ldots , X_n\), (c) \(q \geqslant p \geqslant p_0\), and (d) \(\epsilon \in (-q,\log (16)]\), we have
Moreover, we may choose
Theorem B extends a host of results in the Banach space literature, including by Johnson–Schechtman–Zinn [11], Hitczenko [5], and Hitczenko and Montgomery-Smith [6] (see also [20, Theorem 6.20] and [19, Proposition 1.4.2]). Theorem B also yields the correct order of the constants as \(q \rightarrow \infty \), as discussed by Johnson et al. in loc. cit., where they extend previous work on Khinchin’s inequality by Rosenthal [24]. Moreover, while all of these prior results were shown for Banach spaces, Theorem B holds additionally for all compact Lie groups, finite abelian groups and lattices, and spaces of labelled and unlabelled graphs.
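As a sanity check on the flavor of Theorem B in the scalar case \({\mathscr {G}}= ({\mathbb {R}},+)\), the Monte Carlo sketch below compares \(\Vert S_n \Vert _4\) with \(\Vert S_n \Vert _2\) for Rademacher sums; by Khinchin's inequality the ratio stays bounded uniformly in n. The sample sizes are illustrative, and this is of course not the method of proof.

```python
# Monte Carlo illustration of moment comparison for real-valued sums:
# ||S_n||_4 / ||S_n||_2 for Rademacher sums stays bounded uniformly in n
# (Khinchin); parameters below are illustrative only.
import random

random.seed(2)
n, trials = 200, 20_000

def moment(p, values):
    """Empirical L^p norm E[|X|^p]^(1/p)."""
    return (sum(abs(v) ** p for v in values) / len(values)) ** (1.0 / p)

sums = [sum(random.choice((-1, 1)) for _ in range(n)) for _ in range(trials)]

r42 = moment(4, sums) / moment(2, sums)   # tends to 3**0.25 ~ 1.316 as n grows
assert 1.0 <= r42 <= 2.0                  # bounded independently of n
```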
2 Lévy’s equivalence in metric semigroups
In this section we prove:
Theorem 2.1
(Lévy’s Equivalence) Suppose \(({\mathscr {G}}, d_{\mathscr {G}})\) is a complete separable metric semigroup, \(X_n : (\varOmega , {\mathscr {A}}, \mu ) \rightarrow ({\mathscr {G}}, {\mathscr {B}}_{\mathscr {G}})\) are independent, \(X \in L^0(\varOmega , {\mathscr {G}})\), and \(S_n\) is defined as in (1.1). Then
Moreover, if the sequence \(S_n\) does not converge as above, then it diverges almost surely.
Special cases of this result have been shown in the literature. For instance, [2, §9.7] considers \({\mathscr {G}}= \mathbb {R}^n\). The more general case of a separable Banach space \({\mathbb {B}}\) was shown by Itô–Nisio [9, Theorem 3.1], as well as by Hoffmann-Jørgensen and Pisier [8, Lemma 1.2]. The most general version in the literature to date is by Tortrat, who proved the result for a complete separable metric group in [25]. Thus Theorem 2.1 comes closest to assuming only the minimal structure necessary to state the result (as well as to prove it).
In order to prove Theorem 2.1, we first study basic properties of metric semigroups. Note that for a metric group, the following is standard; see [17], for instance.
Lemma 2.1
If \(({\mathscr {G}}, d_{\mathscr {G}})\) is a metric (semi)group, then the translation-invariance of \(d_{\mathscr {G}}\) implies the “triangle inequality”:
$$\begin{aligned} d_{\mathscr {G}}(ac, bd) \leqslant d_{\mathscr {G}}(a,b) + d_{\mathscr {G}}(c,d) \qquad \forall a,b,c,d \in {\mathscr {G}}, \end{aligned}$$
and in turn, this implies that each (semi)group operation is continuous.
If instead \({\mathscr {G}}\) is a group equipped with a metric \(d_{\mathscr {G}}\), then except for the last two statements, any two of the following assertions imply the other two:
1. \(d_{\mathscr {G}}\) is left-translation invariant: \(d_{\mathscr {G}}(ca, cb) = d_{\mathscr {G}}(a,b)\) for all \(a,b,c \in {\mathscr {G}}\). In other words, left-multiplication by any \(c \in {\mathscr {G}}\) is an isometry.
2. \(d_{\mathscr {G}}\) is right-translation invariant.
3. The inverse map \(: {\mathscr {G}}\rightarrow {\mathscr {G}}\) is an isometry. Equivalently, the triangle inequality (2.1) holds.
4. \(d_{\mathscr {G}}\) is invariant under all inner/conjugation automorphisms.
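The two-variable triangle inequality (2.1), \(d_{\mathscr {G}}(ac, bd) \leqslant d_{\mathscr {G}}(a,b) + d_{\mathscr {G}}(c,d)\), follows from translation invariance via \(d_{\mathscr {G}}(ac,bd) \leqslant d_{\mathscr {G}}(ac,bc) + d_{\mathscr {G}}(bc,bd)\). It can be spot-checked numerically; the sketch below uses the circle group \({\mathbb {R}}/{\mathbb {Z}}\) with its arc-length metric, one of the compact abelian examples mentioned in Sect. 1.1.

```python
# Spot-check of (2.1): d(ac, bd) <= d(a, b) + d(c, d), on the circle group R/Z
# with its translation-invariant arc-length metric.
import random

def d(x, y):
    """Arc distance on R/Z."""
    g = abs(x - y) % 1.0
    return min(g, 1.0 - g)

random.seed(4)
for _ in range(1000):
    a, b, c, e = (random.random() for _ in range(4))
    lhs = d((a + c) % 1.0, (b + e) % 1.0)
    # small tolerance only for floating-point rounding
    assert lhs <= d(a, b) + d(c, e) + 1e-12
```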
In order to show Theorem 2.1 for metric semigroups, we recall the following preliminary result from [14], and will use it below without further reference.
Proposition 2.1
[14] Suppose \(({\mathscr {G}}, d_{\mathscr {G}})\) is a metric semigroup, and \(a,b \in {\mathscr {G}}\). Then
is independent of \(a \in {\mathscr {G}}\). Moreover, a metric semigroup \({\mathscr {G}}\) is either itself a metric monoid, or the set of non-identity elements in a metric monoid \({\mathscr {G}}'\); these two cases occur according as the number of idempotents in \({\mathscr {G}}\) is one or zero, respectively. Furthermore, the metric monoid \({\mathscr {G}}'\) is (up to a monoid isomorphism) the unique smallest element in the class of metric monoids containing \({\mathscr {G}}\) as a sub-semigroup.
Remark 1
In the sequel, we denote—when required—the unique metric monoid containing a given metric semigroup \({\mathscr {G}}\) by \({\mathscr {G}}' := {\mathscr {G}}\cup \{ 1' \}\). Note that the idempotent \(1'\) may already be in \({\mathscr {G}}\), in which case \({\mathscr {G}}= {\mathscr {G}}'\). One consequence of Proposition 2.1 is that instead of working with metric semigroups, one can use the associated monoid \({\mathscr {G}}'\) instead. (In other words, the (non)existence of the identity is not an issue in many such cases.) This helps simplify other calculations. For instance, what would be a lengthy, inductive (yet straightforward) computation now becomes much simpler: for nonnegative integers k, l, and \(z_0, z_1, \ldots , z_{k+l} \in {\mathscr {G}}\), the triangle inequality (2.1) implies:
2.1 The Mogul’skii inequalities and proof of Lévy’s equivalence
Like Lévy’s equivalence (Theorem 2.1) and the Hoffmann–Jørgensen inequality (Theorem 1.1), many other maximal and minimal inequalities can be formulated using only the notions of a distance function and of a semigroup operation. We now extend to metric semigroups two inequalities by Mogul’skii, which were used in [22] to prove a law of the iterated logarithm in normed linear spaces. The following result will be useful in proving Theorem 2.1.
Proposition 2.2
(Mogul’skii–Ottaviani–Skorohod inequalities) Suppose \(({\mathscr {G}}, d_{\mathscr {G}})\) is a separable metric semigroup, \(z_0, z_1 \in {\mathscr {G}}\), \(a,b \in [0,\infty )\), and \(X_1, \ldots , X_n\)\(\in L^0(\varOmega ,{\mathscr {G}})\) are independent. Then for all integers \(1 \leqslant m \leqslant n\),
These inequalities strengthen [22, Lemma 1] from normed linear spaces to arbitrary metric semigroups. Also note that the second inequality generalizes the Ottaviani–Skorohod inequality to all metric semigroups. Indeed, sources such as [2, § 9.7.2] prove this result in the special case \({\mathscr {G}}= (\mathbb {R}^n, +), z_0 = z_1 = 0, m=1, a = \alpha + \beta , b = \beta \), with \(\alpha , \beta > 0\).
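In the scalar special case just cited, the Ottaviani–Skorohod inequality can be spot-checked by simulation. The sketch below tests the standard real-valued form \(\min _k {\mathbb {P}}(|S_n - S_k| \leqslant b) \cdot {\mathbb {P}}(\max _{k \leqslant n} |S_k| \geqslant a) \leqslant {\mathbb {P}}(|S_n| \geqslant a - b)\); the walk and all parameters are invented for illustration.

```python
# Monte Carlo spot-check of the Ottaviani-Skorohod inequality for the simple
# random walk on (R, +): min_k P(|S_n - S_k| <= b) * P(max_k |S_k| >= a)
# should not exceed P(|S_n| >= a - b).  Parameters are illustrative.
import random

random.seed(3)
n, trials, a, b = 50, 20_000, 10.0, 5.0

max_hits = tail_hits = gap_hits = 0
for _ in range(trials):
    s, smax, s1 = 0, 0, None
    for i in range(n):
        s += random.choice((-1, 1))
        if i == 0:
            s1 = s
        smax = max(smax, abs(s))
    max_hits += smax >= a
    tail_hits += abs(s) >= a - b
    gap_hits += abs(s - s1) <= b      # k = 1 leaves the most steps: worst case

c = gap_hits / trials                 # empirical proxy for min_k P(|S_n - S_k| <= b)
# small additive slack only for Monte Carlo error
assert c * (max_hits / trials) <= tail_hits / trials + 0.01
```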
We omit the proof of Proposition 2.2 for brevity as it involves standard arguments. Using this result, one can now prove Theorem 2.1. The idea is to use the approach in [2]; however, it needs to be suitably modified in order to work in the current level of generality.
Proof of Theorem 2.1
The forward implication is easily verified in the more general setting of a separable metric space; see e.g. [2, Section 9.2]. Conversely, suppose \(S_n\) converges in probability to X; we claim that the sequence \(S_n\) is then Cauchy almost everywhere. Given \(\epsilon ,\eta > 0\), the assumption and definitions imply that there exists \(n_0 \in {\mathbb {N}}\) such that
This implies that \(\displaystyle {\mathbb {P}}_\mu \left( d_{\mathscr {G}}(S_m,S_n) \geqslant \epsilon / 4 \right) < \frac{\eta }{1 + \eta }\) for all \(n \geqslant m \geqslant n_0\). Now define \(S'_i := \prod _{j=1}^i X_{n_0+j}\). Fix \(n > n_0\) and apply Proposition 2.2 to \(\{ X_{n_0+i} : 1 \leqslant i \leqslant n - n_0 \}\) with \(m=1, a = \epsilon /2, b = \epsilon /4\), and \(z_0 = z_1\):
Now define \(Q_{n_0} := \sup _{n > n_0} d_{\mathscr {G}}(S_{n_0}, S_n)\) and \(\delta _{n_0} := \sup _{n> m > n_0} d_{\mathscr {G}}(S_m,S_n)\). Then \(\delta _{n_0} \leqslant 2 Q_{n_0}\); moreover, taking the limit of the above inequality as \(n \rightarrow \infty \) yields:
But then \({\mathbb {P}}_\mu \left( \sup _{n > m} d_{\mathscr {G}}(S_m,S_n) \geqslant \epsilon \right) \leqslant \eta \) for all \(m > n_0\). Thus, \(S_n\) is Cauchy almost everywhere. Since \({\mathscr {G}}\) is complete, the result now follows from [2, Lemma 9.2.4]; that the almost sure limit is X is because \(S_n {\mathop {\longrightarrow }\limits ^{P}} X\). Finally, since the \(X_n\) are independent, the convergence of the sequence \(S_n\) is a tail event. In particular, it has probability zero or one by the Kolmogorov 0–1 law, concluding the proof.
We remark for completeness that the other Lévy equivalence has been addressed in [1, 3, 25] for various classes of topological groups. See also [23] for a variant in discrete completely simple semigroups, [2, 9] for Banach space versions, and [14] for a version over any normed abelian metric (semi)group.
3 Measuring the magnitude of sums of independent random variables
We now prove Theorems A and B using the Hoffmann–Jørgensen inequality in Theorem 1.1. Recall that the Banach space version of this inequality is extremely important in the literature and is widely used in bounding sums of independent Banach space-valued random variables. Having proved Theorem 1.1, an immediate application of our main result is in obtaining the first such bounds for general metric semigroups \({\mathscr {G}}\). We also provide uniformly good \(L^p\)-bounds and tail probability bounds on sums \(S_n\) of independent \({\mathscr {G}}\)-valued random variables.
3.1 An upper bound by Hoffmann–Jørgensen
In this subsection we prove Theorem A. The proof uses basic properties of decreasing rearrangements (see Definition 1.2), which we record here and use below, possibly without reference.
Proposition 3.1
Suppose \(X, Y : (\varOmega , {\mathscr {A}}, \mu ) \rightarrow [0,\infty )\) are random variables, and
1. \(X^*(t) \leqslant x\) if and only if \({\mathbb {P}}_\mu \left( X > x \right) \leqslant t\).
2. \(X^*(t)\) is decreasing in \(t \in [0,1]\) and increasing in \(X \geqslant 0\).
3. \((X/x)^*(t) = X^*(t)/x\).
4. Suppose \({\mathbb {P}}_\mu \left( X> x \right) \leqslant \beta {\mathbb {P}}_\mu \left( Y > \gamma x \right) \) for all \(x>0\). Then for all \(p \in (0,\infty )\) and \(t \in (0,1)\),
$$\begin{aligned} {\mathbb {E}}_\mu [Y^p] \geqslant \beta ^{-1} \gamma ^p {\mathbb {E}}_\mu [X^p], \qquad {\mathbb {E}}_\mu [X^p] \geqslant t X^*(t)^p. \end{aligned}$$
5. Fix finitely many tuples of positive constants \((\alpha _i, \beta _i, \gamma _i, \delta _i)_{i=1}^N\), and real-valued nondecreasing functions \(f_i\) such that for all \(x>0\) there exists at least one i such that
$$\begin{aligned} f_i({\mathbb {P}}_\mu \left( X> \alpha _i x \right) ) \leqslant \beta _i {\mathbb {P}}_\mu \left( Y > \gamma _i x \right) ^{\delta _i}. \end{aligned}$$
(3.1)
Then
$$\begin{aligned} X^*(t) \leqslant \max _{1 \leqslant i \leqslant N} \frac{\alpha _i}{\gamma _i} Y^*\left( (f_i(t)/\beta _i)^{1/\delta _i}\right) . \end{aligned}$$
(3.2)
If on the other hand (3.1) holds for all i, then
$$\begin{aligned} X^*(t) \leqslant \min _{1 \leqslant i \leqslant N} \frac{\alpha _i}{\gamma _i} Y^*\left( (f_i(t)/\beta _i)^{1/\delta _i}\right) . \end{aligned}$$
Proof
These properties are shown using the definitions via straightforward arguments, and so we omit the proofs, except for the final part. By assumption there exists at least one i such that if \({\mathbb {P}}_\mu \left( X> \alpha _i x \right) > t\) for some t, then \(\beta _i {\mathbb {P}}_\mu \left( Y> \gamma _i x \right) ^{\delta _i} > f_i(t)\) since \(f_i\) is nondecreasing. For this choice of i, we obtain:
(where we only consider \(y \geqslant 0\)). Therefore for all \(t \in [0,1]\),
Taking the supremum of both sides yields Eq. (3.2). If on the other hand Eq. (3.1) holds for all i, then the preceding inclusion holds with the union replaced by intersection. Now taking the supremum of both sides yields Eq. (3.2) with maximum replaced by minimum (since each set in the intersection is an interval containing 0).
Using Proposition 3.1, we now show one of the main results in this paper.
Proof of Theorem A
Note for all n that
from which we obtain
Taking first the supremum over \(n \in A\) and then the expectation proves the backward implication. Conversely, first claim that controlling sums of \({\mathscr {G}}\)-valued \(L^p\) random variables in probability (i.e., in \(L^0\)) allows us to control these sums in \(L^p\) as well, for \(p>0\). Namely, we make the following claim:
Suppose \(({\mathscr {G}}, d_{\mathscr {G}})\) is a separable metric semigroup, \(p \in (0,\infty )\), and \(X_1, \ldots , X_n\)\(\in L^p(\varOmega ,{\mathscr {G}})\) are independent. Now fix \(z_0, z_1 \in {\mathscr {G}}\) and let \(S_k, U_n, M_n\) be as in Definition 1.1 and Eq. (1.1). Then,
Note that the claim is akin to the upper bound by Hoffmann–Jørgensen that bounds \({\mathbb {E}}_\mu [\Vert S_n \Vert ^p]\) in terms of \({\mathbb {E}}_\mu [M_n^p]\) and the quantiles of \(\Vert S_n \Vert \) for Banach space-valued random variables (see [7, proof of Theorem 3.1] and [4, Lemma 3.1]). We omit its proof for brevity, as a similar statement is asserted in [20, Proposition 6.8]. Given the claim, define:
as above, where we also use the assumption that \(U_A < \infty \) almost surely. Now for all \(n \in A\), compute using the above claim and elementary properties of decreasing rearrangements:
This concludes the proof if A is finite; for \(A = {\mathbb {N}}\), use the monotone convergence theorem for the increasing sequence \(0 \leqslant U_n^p \rightarrow U_A^p\).
3.2 Two-sided bounds and \(L^p\) norms
We now formulate and prove additional results that control tail behavior for metric semigroups and monoids—specifically, \(M_A, U_n, U_n^*\). This includes proving our other main result, Theorem B. We begin by setting notation.
Definition 3.1
Suppose \({\mathscr {G}}\) is a metric semigroup.
1. Given \(X_n \in L^0(\varOmega ,{\mathscr {G}})\) as above, for all n in a finite or countable set A, define the random variable \(\ell _X = \ell _{(X_n)} : \mathbb {R} \rightarrow [0,\infty ]\) via:
$$\begin{aligned} \ell _X(t) := {\left\{ \begin{array}{ll} \inf \{ y> 0\ : \ \sum _{n \in A} {\mathbb {P}}_\mu \left( d_{\mathscr {G}}(z_0, z_0 X_n) > y \right) \leqslant t \}, &{}\quad \text {if } t \in [0,1],\\ 0, &{}\quad \text {otherwise.} \end{array}\right. } \end{aligned}$$
As indicated in [6, §2], one then has:
$$\begin{aligned} \mathbb {P}(\ell _X> x) = \sum _{n \in A} {\mathbb {P}}_\mu \left( d_{\mathscr {G}}(z_0, z_0 X_n) > x \right) , \end{aligned}$$
where \(\mathbb {P}\) is the Lebesgue measure on [0, 1].
2. Two families of variables P(t) and Q(t) are said to be comparable, denoted by \(P(t) \approx Q(t)\), if there exist constants \(c_1, c_2 > 0\) such that \(c_1^{-1} P(t) \leqslant Q(t) \leqslant c_2 P(t)\) uniformly over all t. The \(c_i\) are called the “constants of approximation”. For the remaining definitions, assume \(({\mathscr {G}}, 1_{\mathscr {G}}, d_{\mathscr {G}})\) is a separable metric monoid.
3. Given \(t \geqslant 0\) and a random variable \(X \in L^0(\varOmega , {\mathscr {G}})\), define its truncation to be:
$$\begin{aligned} X(t) := {\left\{ \begin{array}{ll} 1_{\mathscr {G}}, &{}\quad \text{ if } d_{\mathscr {G}}(1_{\mathscr {G}}, X) > t,\\ X, &{}\quad \text{ otherwise. } \end{array}\right. } \end{aligned}$$
4. Given variables \(X_1, \ldots , X_n : \varOmega \rightarrow {\mathscr {G}}\), and \(r \in (0,1)\), define:
$$\begin{aligned} U'_n(r) := \max _{1 \leqslant k \leqslant n} d_{\mathscr {G}}(1_{\mathscr {G}}, \prod _{i=1}^k X_i(\ell _X(r))). \end{aligned}$$
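The displayed identity for \(\ell _X\) can be verified in closed form for a toy real-valued family: below, \(X_n \sim \mathrm{Exp}(1)\) for \(n \in A = \{1,\ldots ,5\}\) (an invented example), and the Lebesgue measure of \(\{ t \in [0,1] : \ell _X(t) > x \}\) is compared against the sum of the tails.

```python
# Closed-form check of P(ell_X > x) = sum_n P(d(z0, z0 X_n) > x) for an
# invented family: X_n ~ Exp(1), A = {1,...,5}.  Here ell_X lives on [0, 1]
# with the Lebesgue measure, approximated by a midpoint grid.
import math

N = 5

def tail_sum(x):
    """sum over n in A of P(X_n > x) = N * exp(-x)."""
    return N * math.exp(-x)

def ell(t):
    """ell_X(t) = inf{y > 0 : tail_sum(y) <= t} for t in (0, 1]."""
    return math.log(N / t)

for x in (2.0, 2.5, 3.0):        # x > log(5), so tail_sum(x) < 1
    leb = sum(ell((k + 0.5) / 10_000) > x for k in range(10_000)) / 10_000
    assert abs(leb - tail_sum(x)) < 1e-3
```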
The following estimate on tail behavior compares \(U_n\) with its decreasing rearrangement.
Theorem 3.1
Given \(p_0 > 0\), there exist universal constants of approximation (depending only on \(p_0\)), such that for all \(p \geqslant p_0\), separable abelian metric monoids \(({\mathscr {G}}, 1_{\mathscr {G}}, d_{\mathscr {G}})\), and finite sequences \(X_1, \ldots , X_n\) of independent \({\mathscr {G}}\)-valued random variables (for any \(n \in {\mathbb {N}}\)),
where \(U_n\) and \(U'_n\) were defined in Eq. (1.1) and Definition 3.1 respectively.
For real-valued X, the expression \({\mathbb {E}}[|X|^p]^{1/p}\) is also denoted by \(\Vert X \Vert _p\) in the literature.
To show Theorem 3.1, we require some preliminary results which provide additional estimates to govern tail behavior, and which we now collect before proving the theorem. As these preliminaries are often extensions to metric semigroups of results in the Banach space literature, we will sketch or omit their proofs now.
The first result obtains two-sided bounds to control the behavior of the “maximum magnitude” \(M_A\) (cf. Eq. (3.3)).
Proposition 3.2
Suppose \(\{ X_n : n \in A \}\) is a (finite or countably infinite) sequence of independent random variables with values in a separable metric semigroup \(({\mathscr {G}}, d_{\mathscr {G}})\).
1. For all \(t \in (0,1)\), \(\ell _X(2t) \leqslant \ell _X(t/(1-t)) \leqslant M_A^*(t) \leqslant \ell _X(t)\).
2. Suppose \(X_n \in L^p(\varOmega ,{\mathscr {G}})\) for some \(p>0\) (and for all \(n \in A\)). For all \(t > 0\), define:
$$\begin{aligned} \varPsi _X(t) := p \sum _{n \in A} \int _{\ell _X(t)}^\infty u^{p-1} {\mathbb {P}}_\mu \left( d_{\mathscr {G}}\left( z_0, z_0 X_n\right) > u \right) \ du. \end{aligned}$$
Then, \(\displaystyle \frac{t \ell _X(t)^p + \varPsi _X(t)}{1+t} \leqslant {\mathbb {E}}_\mu [M_A^p] \leqslant \ell _X(t)^p + \varPsi _X(t)\).
Proof
The first part follows as in [6, Proposition 1] (using a special case of Equation (3.2)). For the second part, follow the arguments showing [4, Lemma 3.2]; see also [20, Lemma 6.9].
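Part 1 of Proposition 3.2 can likewise be checked in closed form for the invented real-valued family \(X_n \sim \mathrm{Exp}(1)\), \(A = \{1,\ldots ,5\}\), where both \(\ell _X\) and the decreasing rearrangement of \(M_A = \max _n X_n\) are explicit.

```python
# Closed-form check of the sandwich ell_X(2t) <= M_A*(t) <= ell_X(t) in part 1,
# for the invented family X_n ~ Exp(1), A = {1,...,5}.  Note that ell_X
# vanishes outside [0, 1] by definition, which matters once 2t > 1.
import math

N = 5

def ell(t):
    if t > 1.0:                  # ell_X(t) = 0 for t outside [0, 1]
        return 0.0
    return math.log(N / t)       # inf{y > 0 : N * exp(-y) <= t} for t in (0, 1]

def M_star(t):
    """Decreasing rearrangement of M_A = max_n X_n, in closed form:
    solve 1 - (1 - exp(-y))**N = t for y."""
    return -math.log(1.0 - (1.0 - t) ** (1.0 / N))

for k in range(1, 10):           # t = 0.1, ..., 0.9
    t = k / 10
    assert ell(2 * t) <= M_star(t) <= ell(t)
```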
We next discuss a consequence of Hoffmann-Jørgensen’s inequality for metric semigroups, Theorem 1.1, which can be used to bound the \(L^p\)-norms of the variables \(U_n\)—or more precisely, to relate these \(L^p\)-norms to the tail distributions of \(U_n\) via \(U_n^*\).
Lemma 3.1
(Notation as in Definition 1.1 and Eq. (1.1)) There exists a universal positive constant \(c_1\) such that for any \(0 \leqslant t \leqslant s \leqslant 1/2\), any separable metric semigroup \(({\mathscr {G}}, d_{\mathscr {G}})\) with elements \(z_0, z_1\), and any sequence of independent \({\mathscr {G}}\)-valued random variables \(X_1, \ldots , X_n\),
Proof
We begin by writing down a consequence of Theorem 1.1:
If \({\mathbb {P}}_\mu \left( U_n > t \right) \leqslant 1/2\), then this quantity is further dominated by
Now carry out the steps mentioned in the proof of [6, Corollary 1]. \(\square \)
The final preliminary result is proved by adapting the proofs of [6, Lemma 3 and Corollary 2] to metric monoids.
Proposition 3.3
Suppose \(({\mathscr {G}}, 1_{\mathscr {G}}, d_{\mathscr {G}})\) is a separable metric monoid and \(X_1, \ldots ,\)\(X_n: \varOmega \rightarrow {\mathscr {G}}\) is a finite sequence of independent \({\mathscr {G}}\)-valued random variables. For \(r \in (0,1)\), define:
where \(X'_i(t)\) equals \(1_{\mathscr {G}}\) if \(d_{\mathscr {G}}(1_{\mathscr {G}}, X_i) \leqslant t\), and \(X_i\) otherwise.
1. Then \(U''_n(r)\) may be expressed as the sum of “disjoint” random variables \(V_k\) for \(k \in {\mathbb {N}}\). In other words, \(\varOmega \) can be partitioned into measurable subsets \(E_k\) such that \(V_k = U''_n(r)\) on \(E_k\) and \(1_{\mathscr {G}}\) otherwise. Moreover, the \(V_k\) may be chosen such that \(V_k^*(t) \leqslant k \cdot \ell _X(t (k-1)! / r^{k-1})\).
2. Given the assumptions, for all \(p \in (0,\infty )\),
$$\begin{aligned} {\mathbb {E}}_\mu [U''_n(r)^p]^{1/p} \leqslant 2 e^{2^p r/p} {\mathbb {E}}[\ell _X^p]^{1/p}. \end{aligned}$$
With the above results in hand, we can now show the above theorem.
Proof of Theorem 3.1
Compute using the triangle inequality (2.1) and Remark 1:
Hence \(M_n \leqslant 2 U_n\). Now compute for \(p \geqslant p_0\), using Propositions 3.1 and 3.2:
Hence there exists a constant \(0 < c_1 = c_1(p_0)\) such that:
This yields one inequality; another one is obtained using Proposition 3.2 as follows:
Now if \({\mathbb {P}}_\mu \left( U'_n(e^{-p}/8)> y \right) > \eta \) for some \(\eta \in [\frac{e^{-p}}{8},1]\), then by the reverse triangle inequality,
Hence by definition and the above calculations,
Applying this with \(\eta = e^{-p}/4\),
Hence as above, there exists a constant \(0 < c_2 = c_2(p_0)\) such that:
This proves the second of the four claimed inequalities. The remaining arguments can now be shown by suitably adapting the proof of [6, Theorem 3].
Finally, we use Theorem 3.1 to prove our remaining main result.
Proof of Theorem B
Using Proposition 2.1, let \({\mathscr {G}}'\) denote the smallest metric monoid containing \({\mathscr {G}}\). Thus the \(X_k\) are a sequence of independent \({\mathscr {G}}'\)-valued random variables, and we may assume henceforth that \({\mathscr {G}}= {\mathscr {G}}'\). Compute using Proposition 3.2, and the fact that \(X^*\) and X have the same law for the real-valued random variable \(X = M_n\):
Using this computation, as well as Lemma 3.1 and Theorem 3.1 for \({\mathscr {G}}'\), we compute:
since \(\epsilon \in (-q, \log (16)]\). There are now two cases. First, if \(e^p \geqslant \epsilon + q\), then
On the other hand, if \(e^p < \epsilon + q\), then set \(C := 1 + \frac{\log (4)}{p_0}\) and note that \(Cq = q + \frac{\log (4)}{p_0} q \geqslant q + \log (4)\), since \(q \geqslant p_0\). Therefore,
Combining the two cases above now yields:
Setting \(c := c'_1 \max (2^{1/p_0}, c_1(1 + \log (4)/p_0), c_1 c_2 (1 + \log (4)/p_0))\), we obtain the first inequality claimed in the statement of the theorem.
To show the second inequality, we first verify that if \(\epsilon \geqslant \min (1, e - p_0)\), then the function \(f(x) := x/\log (\epsilon +x)\) is strictly increasing on \((p_0,\infty )\). Now compute:
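One way to carry out this computation is as follows (a sketch reconstructing the omitted display; in the second case below we additionally assume \(\epsilon > 0\)). Differentiating,
$$\begin{aligned} f'(x) = \frac{\log (\epsilon + x) - \dfrac{x}{\epsilon + x}}{\log ^2 (\epsilon + x)}. \end{aligned}$$
If \(\epsilon \geqslant 1\), then \(\log (\epsilon + x) \geqslant \log (1 + x) > \frac{x}{1+x} \geqslant \frac{x}{\epsilon + x}\) for all \(x > 0\), using the standard bound \(\log (1+x) > x/(1+x)\). If instead \(0< \epsilon \) and \(\epsilon \geqslant e - p_0\), then \(\epsilon + x > e\) for \(x > p_0\), whence \(\log (\epsilon + x) > 1 > \frac{x}{\epsilon + x}\). In either case \(f'(x) > 0\) on \((p_0, \infty )\), so \(f\) is strictly increasing there.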
Next, use Proposition 3.1 to show: \(M^*_n(e^{-q}/8) \leqslant {\mathbb {E}}_\mu [M_n^q]^{1/q} (8e^q)^{1/q} \leqslant 8^{1/p_0} e {\mathbb {E}}_\mu [M_n^q]^{1/q}\), where the last step uses that \((8e^q)^{1/q} = 8^{1/q} e \leqslant 8^{1/p_0} e\) for \(q \geqslant p_0\). Using the previous two facts, we now complete the proof of the second inequality, starting from the first:
The second inequality in the theorem now follows.
References
Csiszár, I.: On infinite products of random elements and infinite convolutions of probability distributions on locally compact groups. Probab. Theory Relat. Fields 5(4), 279–295 (1966)
Dudley, R.M.: Real Analysis and Probability, Cambridge Studies in Advanced Mathematics, vol. 74. Cambridge University Press, Cambridge (2002)
Galmarino, A.R.: The equivalence theorem for compositions of independent random elements on locally compact groups and homogeneous spaces. Probab. Theory Relat. Fields 7(1), 29–42 (1967)
Giné, E., Zinn, J.: Central limit theorems and weak laws of large numbers in certain Banach spaces. Probab. Theory Relat. Fields 62(3), 323–354 (1983)
Hitczenko, P.: On a domination of sums of random variables by sums of conditionally independent ones. Ann. Probab. 22(1), 453–468 (1994)
Hitczenko, P., Montgomery-Smith, S.J.: Measuring the magnitude of sums of independent random variables. Ann. Probab. 29(1), 447–466 (2001)
Hoffmann-Jørgensen, J.: Sums of independent Banach space valued random variables. Stud. Math. 52(2), 159–186 (1974)
Hoffmann-Jørgensen, J., Pisier, G.: The law of large numbers and the central limit theorem in Banach spaces. Ann. Probab. 4, 587–599 (1976)
Itô, K., Nisio, M.: On the convergence of sums of independent Banach space valued random variables. Osaka J. Math. 5, 35–48 (1968)
Johnson, W.B., Schechtman, G.: Sums of independent random variables in rearrangement invariant function spaces. Ann. Probab. 17, 789–808 (1989)
Johnson, W.B., Schechtman, G., Zinn, J.: Best constants in moment inequalities for linear combinations of independent and exchangeable random variables. Ann. Probab. 13(1), 234–253 (1985)
Khare, A., Rajaratnam, B.: Differential calculus on the space of countable labelled graphs (2014). arXiv:1410.6214
Khare, A., Rajaratnam, B.: Integration and measures on the space of countable labelled graphs (2015). arXiv:1506.01439
Khare, A., Rajaratnam, B.: The Khinchin–Kahane inequality and Banach space embeddings for metric groups (2016). arXiv:1610.03037
Khare, A., Rajaratnam, B.: The Hoffmann–Jørgensen inequality in metric semigroups. Ann. Probab. 45(6A), 4101–4111 (2017)
Klass, M.J., Nowicki, K.: An improvement of Hoffmann–Jørgensen’s inequality. Ann. Probab. 28(2), 851–862 (2000)
Klee Jr., V.L.: Invariant metrics in groups (solution of a problem of Banach). Proc. Am. Math. Soc. 3(3), 484–487 (1952)
Kranz, P., Guo, M.: A metrizable cancellative semigroup without translation invariant metric. Proc. Am. Math. Soc. 66, 17–19 (1977)
Kwapień, S., Woyczyński, W.A.: Random Series and Stochastic Integrals. Single and Multiple. Birkhäuser, Boston (1992)
Ledoux, M., Talagrand, M.: Probability in Banach Spaces (Isoperimetry and Processes), Ergebnisse der Mathematik und ihrer Grenzgebiete. Springer, Berlin (1991)
Lovász, L.: Large Networks and Graph Limits, Colloquium Publications, vol. 60. American Mathematical Society, Providence (2012)
Mogul’skii, A.A.: On the law of the iterated logarithm in Chung’s form for functional spaces. Theory Probab. Appl. 24(2), 405–413 (1979)
Mukherjea, A., Sun, T.C.: Convergence of products of independent random variables with values in a discrete semigroup. Probab. Theory Relat. Fields 46(2), 227–236 (1978)
Rosenthal, H.P.: On the subspaces of \(L^p\) (\(p>2\)) spanned by sequences of independent random variables. Isr. J. Math. 8(3), 273–303 (1970)
Tortrat, A.: Lois de probabilité sur un espace topologique complètement régulier et produits infinis à termes indépendants dans un groupe topologique. Annales de l’Institut Henri Poincaré (B): Probabilités et Statistique 1, 217–237 (1964/65)
Acknowledgements
We thank David Montague and Doug Sparks for providing detailed feedback on an early draft of the paper, which improved the exposition. We also thank the referees for a careful reading of the paper. This work was partially supported by the following: US Air Force Office of Scientific Research Grant Award FA9550-13-1-0043; US National Science Foundation Grants DMS-0906392, DMS-CMG 1025465, AGS-1003823, DMS-1106642, and DMS-CAREER-1352656; Defense Advanced Research Projects Agency DARPA YFA N66001-111-4131; the UPS Foundation; SMC-DBNKY; Ramanujan Fellowship SB/S2/RJN-121/2017 and MATRICS Grant MTR/2017/000295 from SERB (Govt. of India); grant F.510/25/CAS-II/2018 (SAP-I) from UGC (Govt. of India); and a Young Investigator Award from the Infosys Foundation.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Communicated by Un Cig Ji.
Khare, A., Rajaratnam, B. Probability inequalities and tail estimates for metric semigroups. Adv. Oper. Theory 5, 779–795 (2020). https://doi.org/10.1007/s43036-020-00048-8
Keywords
- Metric semigroup
- maximal inequality
- Lévy inequality
- Ottaviani–Skorohod inequality
- Mogul’skii inequality
- Lévy equivalence
- tail estimate
- moment estimate
- decreasing rearrangement
- universal constant