1 Introduction

Throughout the paper \((\mathcal {X},d)\) denotes a Polish metric space and \(\mathcal {P}(\mathcal {X})\) the set of Borel probability measures on \(\mathcal {X}\). On the product space \(\mathcal {X}^n\), we consider the \(\ell _p\) product distance \(d_p\) defined by

$$\begin{aligned} d_p(x,y) = \left[ \sum _{i=1}^n d^p(x_i,y_i)\right] ^{1/p},\quad x,y\in \mathcal {X}^n,\quad p\ge 1. \end{aligned}$$

(Note that the dependence on the dimension \(n\) is understood.) If \(A\) is a Borel subset of \(\mathcal {X}^n\), we define its enlargement \(A_{r,p}\), \(r\ge 0\) (simply denoted by \(A_r\) when \(n=1\)), as follows

$$\begin{aligned} A_{r,p} =\left\{ x \in \mathcal {X}^n ; d_p(x,A) \le r\right\} . \end{aligned}$$

Also, in all that follows, \(\alpha : \mathbb {R}^+ \rightarrow \mathbb {R}^+\) will always be a non-increasing function. We will say that \(\mu \in \mathcal {P}(\mathcal {X})\) satisfies the dimension free concentration property with the concentration profile \(\alpha \) and with respect to the \(\ell _p\) product structure if

$$\begin{aligned} \mu ^n(A_{r,p})\ge 1-\alpha (r),\quad \forall r\ge 0, \end{aligned}$$
(1.1)

for all \(A\subset \mathcal {X}^n\), with \(\mu ^n(A)\ge 1/2\). For simplicity, we will often say that \(\mu \) satisfies the dimension free concentration inequality \(\mathbf {CI}_p^\infty (\alpha )\), and, if \(\mu \) satisfies (1.1) only for \(n=1\), we will say that \(\mu \) satisfies \(\mathbf {CI}(\alpha ).\) We refer to [28] for an introduction to the notion of concentration of measure.

The general problem considered in this paper is to give a characterization of the class of probability measures satisfying \(\mathbf {CI}_p^\infty (\alpha )\). Our main result shows that the class of probability measures satisfying \(\mathbf {CI}^\infty _2(\alpha )\), for some non-trivial \(\alpha \), is always contained in the class of probability measures satisfying the Poincaré inequality. Moreover, these two classes coincide when \(\alpha \) is exponential: \(\alpha (r)=be^{-ar}\), for some \(a,b>0.\)

Before stating this result, let us recall the definition of the Poincaré inequality: one says that \(\mu \in \mathcal {P}(\mathcal {X})\) satisfies the Poincaré inequality with the constant \(\lambda \in \mathbb {R}^+\cup \{+\infty \}\), if

$$\begin{aligned} \lambda \mathrm {Var}_\mu (f) \le \int |\nabla ^- f|^2\,d\mu , \end{aligned}$$
(1.2)

for all Lipschitz functions \(f:\mathcal {X}\rightarrow \mathbb {R}\), where by definition

$$\begin{aligned} |\nabla ^- f|(x) = \limsup _{y \rightarrow x} \frac{[f(y)-f(x)]_-}{d(y,x)}, \quad (\text{ with } [X]_-:=\max (-X,0)) \end{aligned}$$

when \(x\) is not isolated in \(\mathcal {X}\) (we set \(|\nabla ^- f|(x)=0\), when \(x\) is isolated in \(\mathcal {X}\)). By convention \(\infty \times 0 =0\), so that \(\lambda =+\infty \) if and only if \(\mu \) is a Dirac measure.

Remark 1.1

Let us make a few comments about the definition of the Poincaré inequality.

  1. (1)

Since the right hand side of (1.2) is always finite when \(f\) is Lipschitz, it follows in particular that Lipschitz functions always have finite variance when the Poincaré inequality holds.

  2. (2)

    When \((\mathcal {X},d)\) is a smooth Riemannian manifold equipped with its geodesic distance and \(f:\mathcal {X}\rightarrow \mathbb {R}\), it is not difficult to check that if \(f\) is differentiable at a point \(x\), then \(|\nabla ^-f|(x)\) coincides with the norm of the vector \(\nabla f(x)\) (belonging to the tangent space at \(x\)). If \((\mathcal {X},d)=(B,\Vert \,\cdot \,\Vert )\) is a Banach space, and \(f:B\rightarrow \mathbb {R}\) is differentiable at \(x\in B\), then \(|\nabla ^-f|(x)\) is equal to \(\Vert Df(x)\Vert _*\), the dual norm of the differential \(Df(x)\in B^*\) of \(f\) at \(x\). So (1.2) gives back the usual definitions in a smooth context.

  3. (3)

    To a Lipschitz function \(f\) on \(\mathcal {X}\), one can also associate \(|\nabla ^+ f|\) and \(|\nabla f|\), which are defined by replacing \([\,\cdot \,]_-\) by \([\,\cdot \,]_+\) or by \(|\,\cdot \,|\) respectively. Since \(|\nabla ^+ f|=|\nabla ^-(- f)|\), replacing \(f\) by \(-f\), we observe that the Poincaré inequality can be equivalently restated as

    $$\begin{aligned} \lambda \mathrm {Var}_\mu (f) \le \int |\nabla ^+ f|^2\,d\mu , \end{aligned}$$
    (1.3)

for all Lipschitz functions \(f:\mathcal {X}\rightarrow \mathbb {R}\). Moreover, since \(|\nabla f|=\max (|\nabla ^-f|;|\nabla ^+f|)\), we see that (1.2) and (1.3) both imply yet another version of the Poincaré inequality (considered for instance in [28] or [10]):

    $$\begin{aligned} \lambda \mathrm {Var}_\mu (f) \le \int |\nabla f|^2\,d\mu , \end{aligned}$$
    (1.4)

for all Lipschitz functions \(f:\mathcal {X}\rightarrow \mathbb {R}\). That (1.4) also implies (1.2) is not obvious. A proof of this fact can be found in [17, Proposition 5.1] (the result is stated for the logarithmic Sobolev inequality, but the same conclusion holds for the Poincaré inequality). The proof relies on a technique, developed in [2], consisting in relaxing the right hand side of (1.4) and leading to the notion of Cheeger energy.

1.1 Main results

Denote by \(\overline{\Phi }\) the tail distribution function of the standard Gaussian measure \(\gamma (dx)=(2\pi )^{-1/2} e^{-x^2/2}\,dx\) on \(\mathbb {R}\) defined by

$$\begin{aligned} \overline{\Phi } (x) = \frac{1}{\sqrt{2\pi }}\int \limits _{x}^{+\infty } e^{-u^2/2}\,du,\quad x\in \mathbb {R}. \end{aligned}$$

The main result of this paper is the following theorem.

Theorem 1.2

If \(\mu \) satisfies the dimension free concentration property \(\mathbf {CI}_2^\infty (\alpha )\), then \(\mu \) satisfies the Poincaré inequality (1.2) with the constant \(\lambda \) defined by

$$\begin{aligned} \sqrt{\lambda }= \sup \left\{ \frac{ \overline{\Phi }^{-1} \left( \alpha (r)\right) }{r};\quad r> 0 \text { s.t } \alpha (r)\le 1/2 \right\} . \end{aligned}$$

Moreover, if \(\alpha \) is convex decreasing and such that \(\alpha (0)=1/2\), then \(\lambda =2\pi \alpha '_+(0)^2,\) where \(\alpha '_+(0) \in [-\infty , 0)\) is the right derivative of \(\alpha \) at \(0.\)
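As a quick numerical sanity check of this constant (a minimal sketch, assuming Python with numpy and scipy, whose norm.isf computes \(\overline{\Phi }^{-1}\)), one can evaluate the supremum for the convex profile \(\alpha (r)=\frac{1}{2}e^{-r}\), for which \(\alpha '_+(0)=-1/2\) and hence \(2\pi \alpha '_+(0)^2 = \pi /2\):

```python
# Sanity check of the constant of Theorem 1.2 for alpha(r) = (1/2) e^{-r}.
import numpy as np
from scipy.stats import norm  # norm.isf = inverse Gaussian tail bar(Phi)^{-1}

alpha = lambda r: 0.5 * np.exp(-r)

r = np.linspace(1e-6, 20.0, 200_000)
mask = alpha(r) <= 0.5                      # supremum over {r : alpha(r) <= 1/2}
sqrt_lam = np.max(norm.isf(alpha(r[mask])) / r[mask])

print(sqrt_lam ** 2, np.pi / 2)             # both ~ 1.5708
```

Here the supremum is approached as \(r\rightarrow 0^+\), in accordance with the convex case of the theorem.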

Conversely, it is well known—since the work by Gromov and Milman [25] (see also [1, 10, 19, 39] for related results)—that a probability measure \(\mu \) verifying the Poincaré inequality satisfies a dimension free concentration property with a profile of the form \(\alpha (r)=be^{-ar}\), for some \(a,b>0\). We recall this property in the following theorem and refer to Sect. 5 for a proof.

Theorem 1.3

(Gromov and Milman) Suppose that \(\mu \) satisfies the Poincaré inequality (1.2) with a constant \(\lambda >0\), then it satisfies the dimension free concentration property with the profile

$$\begin{aligned} \alpha (r)=b\exp (-a\sqrt{\lambda }r),\quad r\ge 0, \end{aligned}$$

where \(a,b\) are universal constants.

Thus, Theorems 1.2 and 1.3 give a full description of the set of probability distributions verifying a dimension free concentration property with a concentration profile \(\alpha \) such that \(\{r :\alpha (r)<1/2\}\ne \emptyset \): this set coincides with the set of probability measures verifying the Poincaré inequality. An immediate corollary of Theorems 1.2 and 1.3 (see Corollary 4.1 below) is that any type of dimension free concentration inequality can always be improved into a dimension free concentration inequality with an exponential profile (up to universal constants). This was already noticed by Talagrand [41]. See Sect. 4.3 for a further discussion.

Remark 1.4

Let us make some comments on the constant \(\lambda \) appearing in Theorem 1.2.

  1. (1)

We observe first that \(\lambda >0\) if and only if there is some \(r_o>0\) such that \(\alpha (r_o)<1/2.\) In particular, Theorem 1.2 applies to the “minimal” profile \(\alpha =\beta _{a_o,r_o}\) defined by

    $$\begin{aligned} \beta _{a_o,r_o}(r)=1/2,\ \text { if } r<r_o\quad \text {and}\quad \beta _{a_o,r_o}(r)=a_o,\ \text{ if } r\ge r_o, \end{aligned}$$
    (1.5)

    where \(a_o \in [0,1/2)\), \(r_o>0\). If a probability measure satisfies \(\mathbf {CI}_2^\infty (\beta _{a_o,r_o})\), then it satisfies the Poincaré inequality with the constant

    $$\begin{aligned} \sqrt{\lambda _{a_o,r_o}} := \frac{ \overline{\Phi }^{-1} \left( a_o\right) }{r_o} \end{aligned}$$
  2. (2)

    Then we notice that any non increasing function \(\alpha :\mathbb {R}^+\rightarrow \mathbb {R}^+\), with \(\alpha (0)=1/2\), can be written as an infimum of minimal profiles:

    $$\begin{aligned} \alpha = \inf _{r>0} \beta _{\alpha (r),r}. \end{aligned}$$

    Therefore, the constant \(\lambda \) given in Theorem 1.2 is the supremum of the constants \(\lambda _{\alpha (r),r}\), \(r>0\), defined above. This shows that the information contained in the concentration profile \(\alpha \) is treated pointwise, and that the global behavior of \(\alpha \) is not taken into account.

  3. (3)

    It is well known that the standard Gaussian measure \(\gamma \) satisfies the dimension free concentration property with the profile \(\alpha =\overline{\Phi }\) (this follows from the Isoperimetric theorem in Gauss space due to Sudakov and Cirelson [40] and Borell [11], see e.g. [28]). Hence, applying Theorem 1.2, we conclude that \(\gamma \) satisfies the Poincaré inequality with the constant \(\lambda =1\), which is well known to be optimal (see e.g. [3, Chapter 1]). Both this point and point (1) are illustrated in the short numerical sketch following this remark.

  4. (4)

    Finally we observe that, if the concentration profile \(\alpha (r)\) goes to zero too fast when \(r\rightarrow \infty \), then \(\lambda =+\infty \) and so \(\mu \) is a Dirac measure. This happens for instance when \(\alpha (r)=be^{-ar^k}\), \(r\ge 0\), with \(k>2\) and \(a,b>0.\)
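The following short numerical illustration (a sketch under the same assumptions as the one after Theorem 1.2, using scipy) checks points (1) and (3): for the minimal profile \(\beta _{a_o,r_o}\) the supremum defining \(\sqrt{\lambda }\) is attained at \(r=r_o\), and for the Gaussian profile \(\alpha =\overline{\Phi }\) one recovers \(\lambda =1\).

```python
# Numerical check of points (1) and (3) of Remark 1.4.
import numpy as np
from scipy.stats import norm

# Point (1): for beta_{a_o, r_o}, the ratio isf(beta(r))/r vanishes for r < r_o
# (isf(1/2) = 0) and is maximal at r = r_o, giving sqrt(lambda) = isf(a_o)/r_o.
a_o, r_o = 0.1, 2.0
r = np.linspace(1e-3, 30.0, 300_000)
beta = np.where(r < r_o, 0.5, a_o)
print(np.max(norm.isf(beta) / r), norm.isf(a_o) / r_o)   # both ~ 0.6408

# Point (3): for alpha = bar(Phi), isf(sf(r))/r = 1 for every r > 0, hence
# lambda = 1, the optimal Poincare constant of the standard Gaussian measure.
r2 = np.linspace(0.01, 8.0, 10_000)
print(np.max(norm.isf(norm.sf(r2)) / r2))                # ~ 1.0
```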

Theorem 1.2 completes a previous result obtained by the first author [18] (see also [21]), namely that the Gaussian dimension free concentration is characterized by a transport-entropy inequality. We now state this result and start by recalling some notation. The Kantorovich–Rubinstein distance \(W_p\), \(p\ge 1\), between \(\nu ,\mu \in \mathcal {P}(\mathcal {X})\) is defined by

$$\begin{aligned} W_p^p(\nu ,\mu )=\inf _{\Pi (\nu ,\mu )} \mathbb {E}[d^p(X,Y)], \end{aligned}$$

where the infimum runs over the set \(\Pi (\nu ,\mu )\) of couples of random variables \((X,Y)\) such that \(\mathrm {Law}(X)=\mu \) and \(\mathrm {Law}(Y)=\nu \). Then, a probability measure \(\mu \) is said to satisfy the \(p\)-Talagrand transport-entropy inequality with constant \(C>0\) if

$$\begin{aligned} W_p^p(\nu ,\mu ) \le CH(\nu |\mu ),\quad \forall \nu \in \mathcal {P}(\mathcal {X}), \end{aligned}$$
(1.6)

where the relative entropy functional is defined by \(H(\nu |\mu )=\int \log \frac{d\nu }{d\mu }\,d\nu \) if \(\nu \) is absolutely continuous with respect to \(\mu \), and \(H(\nu |\mu )=+\infty \) otherwise. Inequalities of this type were introduced by Marton and Talagrand in the nineties [31, 43]. We refer to the survey [20] for more information on this topic.

Theorem 1.5

[18] Fix \(p\ge 2\) and \(C>0\). Then, a probability measure \(\mu \) satisfies the \(p\)-Talagrand transport inequality (1.6) with constant \(C\) if and only if it satisfies the dimension free concentration inequality \(\mathbf {CI}_p^\infty (\alpha )\), with a concentration profile of the form

$$\begin{aligned} \alpha (r)=\exp \left( -\frac{1}{C} [r-r_o]_+^p\right) ,\quad r\ge 0, \end{aligned}$$

for some \(r_o\ge 0\) (with \([X]_+:=\max (X,0)\)).

As we will see, the proofs of Theorems 1.2 and 1.5 are very different. Both make use of probabilistic limit theorems, but not at the same scale: Theorem 1.5 uses Sanov's large deviations theorem, whereas Theorem 1.2 is an application of the central limit theorem. Moreover, contrary to what happens in Theorem 1.2 (see item (2) of Remark 1.4), the global behavior of the concentration profile is used in Theorem 1.5.

In view of Theorems 1.2 and 1.5, it is natural to formulate the following general question:

  • (Q) Which functional inequality is equivalent to \(\mathbf {CI}_p^\infty (\alpha )\) for a concentration profile of the form

    $$\begin{aligned} \alpha (r)=\exp \left( -a[r-r_o]^k_+\right) ,\quad r\ge 0, \end{aligned}$$

    where \(a>0\), \(r_o\ge 0\) and \(k>0\)?

Remark 1.6

It is easy to see, using the central limit theorem, that for \(p\in [1,2)\) the only probability measures verifying \(\mathbf {CI}_p^{\infty }(\alpha )\), for some \(\alpha \) such that \(\alpha (r_o)<1/2\) for at least one \(r_o>0\), are Dirac masses. Thus the question \(\mathbf {(Q)}\) is interesting only for \(p\ge 2.\)

To summarize, Theorem 1.5 shows that the answer to \(\mathbf {(Q)}\) is the \(p\)-Talagrand inequality for \(k=p\) and \(p\ge 2\). Theorem 1.2 shows that the answer is the Poincaré inequality for \(p=2\) and for \(k\in (0,1]\). Moreover, point (4) of Remark 1.4 above shows that for \(p=2\), the question is interesting only for \(k\in [1,2].\) The case \(k\in (1,2)\) is still open.

Finally, we mention that some partial results are known for \(p=\infty \). Indeed, Bobkov and Houdré [9] characterized the set of probability measures on \(\mathbb {R}\) satisfying \(\mathbf {CI}_\infty ^\infty (\beta _{a_o,r_o})\), with \(a_o \in [0,1/2)\), where \(\beta _{a_o,r_o}\) is the minimal concentration profile defined by (1.5). They showed that a probability measure \(\mu \) belongs to this class if and only if the map \(U_\mu \) defined by

$$\begin{aligned} U_\mu (x)=F_\mu ^{-1}\left( \frac{1}{1+e^{-x}}\right) ,\quad x\in \mathbb {R}, \end{aligned}$$

where \(F_\mu (x)=\mu ((-\infty ,x])\) and \(F_\mu ^{-1}(p)=\inf \{x\in \mathbb {R}; F_\mu (x)\ge p\}\), \(p\in (0,1)\), satisfies the following inequality on the interval where it is defined:

$$\begin{aligned} |U_\mu (x)-U_\mu (y)|\le a+b|x-y|, \end{aligned}$$

for some \(a,b\ge 0.\)
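As an illustration of this criterion, everything is explicit for the symmetric exponential measure \(\mu (dx)=\frac{1}{2}e^{-|x|}\,dx\); the following minimal sketch (assuming numpy; the helper names are ours) computes \(U_\mu \) and checks numerically that it is \(1\)-Lipschitz, so that the criterion holds with \(a=0\) and \(b=1\).

```python
# The Bobkov-Houdre map U_mu for the symmetric exponential measure.
import numpy as np

def F_inv(p):
    # Quantile function of mu(dx) = (1/2) e^{-|x|} dx.
    return np.where(p <= 0.5, np.log(2 * p), -np.log(2 * (1 - p)))

def U(x):
    return F_inv(1.0 / (1.0 + np.exp(-x)))

x = np.linspace(-20.0, 20.0, 4001)
slopes = np.abs(np.diff(U(x))) / np.diff(x)
print(slopes.max())   # < 1, i.e. |U(x) - U(y)| <= 0 + 1 * |x - y|
```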

We also mention that, under additional assumptions, Theorem 1.2 extends to convex sets. More precisely, we will prove, inter alia, that in geodesic spaces in which the distance function is convex (Busemann spaces), a probability measure satisfies \(\mathbf {CI}^\infty _2(\alpha )\) restricted to convex sets, with a non-trivial profile \(\alpha \), if and only if it satisfies the Poincaré inequality restricted to convex functions (see Sect. 6 for a precise statement). This generalizes a previous result by Bobkov and Götze that was only valid for probability measures on the real line [7, Theorem 4.2].

1.2 Alternative formulation in terms of observable diameters

It is possible to give an alternative formulation of Theorems 1.2 and 1.3 using the notion of observable diameter introduced by Gromov [24, Chapter 3½]. Recall that, if \((\mathcal {X},d,\mu )\) is a metric space equipped with a probability measure and \(t \in [0,1]\), the partial diameter of \((\mathcal {X},d)\) is defined as the infimum of the diameters of all the subsets \(A\subset \mathcal {X}\) satisfying \(\mu (A)\ge 1-t.\) It is denoted by \(\mathrm {Part\,Diam}(\mathcal {X},d,\mu ,t).\) If \(f:\mathcal {X}\rightarrow \mathbb {R}\) is some \(1\)-Lipschitz function, we denote by \(\mu _f \in \mathcal {P}(\mathbb {R})\) the push-forward of \(\mu \) under \(f\). Then, the observable diameter of \((\mathcal {X},d,\mu )\) is defined as follows

$$\begin{aligned} \mathrm {Obs\,Diam}(\mathcal {X},d,\mu ,t)=\sup _{f\ 1-\text {Lip}} \mathrm {Part\,Diam}(\mathbb {R},|\,\cdot \,|,\mu _f,t)\in \mathbb {R}^+\cup \{+\infty \} . \end{aligned}$$

We define accordingly the observable diameters of \((\mathcal {X}^n,d_2,\mu ^n,t)\) for all \(n\in \mathbb {N}^*.\)

The observable diameters are related to concentration profiles by the following lemma (see e.g. [16, Lemma 2.22]).

Lemma 1.7

If \(\mu \) satisfies \(\mathbf {CI}(\alpha )\), then

$$\begin{aligned} \mathrm {Obs\,Diam}(\mathcal {X},d,\mu ,2\alpha (r)) \le 2r, \end{aligned}$$

for all \(r\ge 0\) such that \(\alpha (r)\le 1/2.\)

Conversely, for all \(t\in [0,1/2]\), for all \(A\subset \mathcal {X}\), with \(\mu (A)\ge 1/2\), it holds

$$\begin{aligned} \mu (A_{r(t)}) \ge 1-t \end{aligned}$$

with \(r(t)=\mathrm {Obs\,Diam}(\mathcal {X},d,\mu ,t).\)

The following corollary gives an interpretation of the Poincaré inequality in terms of the boundedness of the observable diameters of the sequence of metric probability spaces \((\mathcal {X}^n,d_2,\mu ^n)_{n\in \mathbb {N}^*}.\)

Corollary 1.8

A probability measure \(\mu \) on \((\mathcal {X},d)\) satisfies the Poincaré inequality (1.2) with the optimal constant \(\lambda \) if and only if for some \(t\in (0,1/2)\)

$$\begin{aligned}r_\infty (t):=\sup _{n\in \mathbb {N}^*} \mathrm {Obs\,Diam}(\mathcal {X}^n,d_2,\mu ^n,t)<\infty .\end{aligned}$$

Moreover,

$$\begin{aligned} \overline{\Phi }^{-1}(t)\le r_\infty (t)\sqrt{\lambda } \le a\log \left( \frac{b}{t}\right) ,\quad \forall t\in (0,1/2) \end{aligned}$$

where \(a>0\) and \(b\ge 1\) are some universal constants.

1.3 Tools

In this section, we briefly introduce the main tools that will be used in the proof of Theorem 1.2: inf-convolution operators, related to both concentration and to Hamilton–Jacobi equations, and the central limit theorem.

The first main tool in the proof of Theorem 1.2 is an alternative formulation, introduced in [21], of the concentration property \(\mathbf {CI}_p^\infty (\alpha )\) in terms of deviation inequalities for inf-convolution operators. Recall that for all \(t>0\), the infimum convolution operator \(f\mapsto Q_t f\) is defined, for all \(f:\mathcal {X}^n \rightarrow \mathbb {R}\cup \{+\infty \}\) bounded from below, as follows

$$\begin{aligned} Q_tf(x)=\inf _{y\in \mathcal {X}^n}\left\{ f(y) + \frac{1}{t^{p-1}} d^p_p(x,y)\right\} ,\quad x\in \mathcal {X}^n \end{aligned}$$
(1.7)

(we should write \(Q_t^{p,(n)}\), but we will omit, for simplicity, the superscripts \(p\) and \((n)\) in the notation).
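To make (1.7) concrete, here is a minimal numerical sketch (assuming numpy; \(p=2\), \(n=1\)) of \(Q_t\) on a finite grid of the real line, where the infimum is a minimum that can be computed by brute force; it also illustrates the fact, recalled in Proposition 3.1 below, that \(Q_th \rightarrow h\) as \(t\rightarrow 0^+\).

```python
# Brute-force inf-convolution (1.7) on a finite subset of R (p = 2, n = 1).
import numpy as np

def Q(t, f, pts):
    # Q_t f(x) = min_y { f(y) + d(x, y)^2 / t } over the finite space `pts`.
    d2 = (pts[:, None] - pts[None, :]) ** 2
    return np.min(f[None, :] + d2 / t, axis=1)

pts = np.linspace(-3.0, 3.0, 601)
f = np.abs(pts)                            # a 1-Lipschitz function
for t in (1.0, 0.4, 0.1):
    print(t, np.max(f - Q(t, f, pts)))     # -> 0 as t -> 0+ (equals t/4 for this f)
```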

In the next proposition, we recall a result from [21] that gives a new way to express concentration of measure (our first main tool).

Proposition 1.9

Let \(\mu \in \mathcal {P}(\mathcal {X})\); then \(\mu \) satisfies \(\mathbf {CI}_p^\infty (\alpha )\) if and only if, for all \(n\in \mathbb {N}^*\) and all measurable functions \(f:\mathcal {X}^n \rightarrow \mathbb {R}\cup \{+\infty \}\) bounded from below and such that \(\mu ^n(f=+\infty )<1/2\), it holds

$$\begin{aligned} \mu ^n(Q_t f > m(f) +r)\le \alpha (r^{1/p}t^{1-1/p}),\quad \forall r,t>0, \end{aligned}$$
(1.8)

where \(m(f)=\inf \{m\in \mathbb {R};\mu ^n(f\le m) \ge 1/2\}.\)

The second main tool is the well-known fact that the function \(u:(t,x) \mapsto Q_tf(x)\) is, in some weak sense, a solution of the Hamilton–Jacobi equation

$$\begin{aligned} \frac{\partial u}{\partial t} = -\frac{1}{4} |\nabla _x u|^2. \end{aligned}$$

This result is very classical on \(\mathbb {R}^k\) (see e.g. [14]); extensions to metric spaces were proposed in [2, 5, 30] or [22]. This will be discussed in Sect. 3.
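As a quick illustration, on \(\mathcal {X}=\mathbb {R}\) (with \(p=2\), \(n=1\)) and \(f(x)=x^2\), the infimum in (1.7) is attained at \(y=x/(1+t)\), and a direct computation (a standard worked example, not needed in the sequel) gives

$$\begin{aligned} u(t,x)=Q_tf(x)=\frac{x^2}{1+t},\qquad \frac{\partial u}{\partial t}(t,x) = -\frac{x^2}{(1+t)^2} = -\frac{1}{4}\left( \frac{2x}{1+t}\right) ^2 = -\frac{1}{4}|\nabla _x u|^2(t,x), \end{aligned}$$

so that the equation is satisfied with equality, in accordance with the geodesic case of Theorem 3.2 below.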

The third tool is the central limit theorem for triangular arrays of independent random variables (see e.g. [15, p. 530]).

Theorem 1.10

For each \(n\), let \(X_{1,n}, X_{2,n},\ldots ,X_{n,n}\) be independent real random variables and define \(T_n= X_{1,n}+\cdots +X_{n,n}.\) Assume that \(\mathbb {E}[T_n^2] =1\), \(\mathbb {E}[X_{k,n}]=0\) for all \(k\in \{1,\ldots ,n\}\), and that the following Lindeberg condition holds: for all \(t>0\)

$$\begin{aligned} \sum _{k=1}^{n} \mathbb {E}\left[ X_{k,n}^{2}\,\mathbf {1}_{\{|X_{k,n}|>t\}}\right] \rightarrow 0\quad \hbox {as } n\rightarrow +\infty . \end{aligned}$$
(1.9)

Then the distribution of \(T_n\) converges weakly to the standard normal law.
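As a quick illustration (a minimal Monte-Carlo sketch, assuming numpy and scipy), the simplest instance of Theorem 1.10 is the i.i.d. case \(X_{k,n}=(\xi _k-1)/\sqrt{n}\) with \(\xi _k\) standard exponential, so that \(\mathbb {E}[X_{k,n}]=0\), \(\mathbb {E}[T_n^2]=1\) and the Lindeberg condition holds:

```python
# Monte-Carlo illustration of Theorem 1.10 in the i.i.d. case.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
n, N = 500, 20_000                    # row length, number of samples of T_n
T = (rng.exponential(1.0, size=(N, n)).sum(axis=1) - n) / np.sqrt(n)

for x in (0.5, 1.0, 2.0):             # empirical tail of T_n vs Gaussian tail
    print(x, (T > x).mean(), norm.sf(x))
```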

We end this introduction with a short roadmap of the paper. In Sect. 2 we make some comments on Theorem 1.2. In particular, we compare Theorem 1.2 to a result by E. Milman on the Poincaré inequality in spaces with non-negative curvature and show, as an immediate consequence of our main result as well as E. Milman's result, that the celebrated KLS conjecture for isotropic log-concave probability measures can be reduced to some universal concentration inequalities (for isotropic log-concave probability measures). In Sect. 3, we recall some properties of the infimum convolution operators that will be used in the proofs. Section 4 is dedicated to the proof of Theorem 1.2 and Sect. 5 to the proof of Theorem 1.3, by means of the Herbst argument (the latter is somewhat classical (see e.g. [3, 28]) but requires some care due to our general framework). Finally Sect. 6 deals with the case of convex sets and the Poincaré inequality restricted to convex functions.

2 Comparison with other results

In this section we collect some remarks and consequences of our main theorem. First, we compare our result to one of E. Milman, in the Riemannian setting. Then, we state the celebrated KLS conjecture and give an equivalent formulation in terms of the dimension free concentration property. Finally, we comment on another type of dimension free concentration property, involving a different definition of enlargement.

2.1 Dimension free concentration vs. non-negative curvature

In the Riemannian setting, Theorem 1.2 is reminiscent of the following recent result of E. Milman, showing that, under non-negative curvature, the Poincaré constant of a probability measure can be expressed through very weak concentration properties of the measure [34, 35].

We recall that the Minkowski content of a set \(A \subset \mathcal {X}\) is defined as follows

$$\begin{aligned} \mu ^+(A) := \liminf _{r\rightarrow 0} \frac{\mu (A_r) - \mu (A)}{r}. \end{aligned}$$

Theorem 2.1

(Milman [35]) Let \(\mu (dx)=e^{-V(x)}\,dx\) be an absolutely continuous probability measure on a smooth complete separable Riemannian manifold \(M\) equipped with its geodesic distance \(d.\) Suppose that \(V:M\rightarrow \mathbb {R}\) is a function of class \(\mathcal {C}^2\) such that

$$\begin{aligned} \mathrm {Ric} + \mathrm {Hess}\,V \ge 0, \end{aligned}$$

and that \(\mu \) satisfies the following concentration of measure inequality

$$\begin{aligned} \mu (A_r) \ge 1 - \alpha (r),\quad \forall r\ge 0, \end{aligned}$$

with \(\alpha :[0,\infty ) \rightarrow [0,1/2]\) such that \(\alpha (r_o)<1/2\), for some \(r_o>0\). Then \(\mu \) satisfies the following Cheeger’s inequality

$$\begin{aligned} \mu ^+(A)\ge D \min (\mu (A) ; 1-\mu (A)),\quad \forall A \subset M, \end{aligned}$$

with

$$\begin{aligned} D=\sup \left\{ \frac{\Psi (\alpha (r))}{r} ; \quad r>0 \text { s.t } \alpha (r)<1/2\right\} , \end{aligned}$$

where \(\Psi :[0,1/2)\rightarrow \mathbb {R}^+\) is some universal function.

We recall that Cheeger’s inequality with the constant \(D\) implies the Poincaré inequality (1.2) with the constant \(\lambda =D^2/4\) [13, 33]. In our result the non-negative curvature assumption of Milman’s result is replaced by the assumption that the concentration is dimension free.

Remark 2.2

Notice that, if \(M\) has non-negative Ricci curvature and \(\mu (dx) = \frac{1}{|K|} \mathbf {1}_K(x)\,dx\) is the normalized restriction of the Riemannian volume to a geodesically convex set \(K\), then Milman [36] also obtains that

$$\begin{aligned} D= \sup \left\{ \frac{1-2\alpha (r)}{r} ; \quad r>0\right\} . \end{aligned}$$

This bound is optimal (see [36]).

2.2 A remark on the KLS conjecture

In this section, \(\mathcal {X}=\mathbb {R}^k\) is always equipped with its standard Euclidean norm \(|\,\cdot \,|\).

Let us recall the celebrated conjecture by Kannan et al. [26]. Recall first that a probability measure \(\mu \) on \(\mathbb {R}^k\) is isotropic if \(\int x\,\mu (dx)=0\) and \(\int x_ix_j \,\mu (dx) =\delta _{ij}\) for all \(1\le i,j\le k.\) It is log-concave if it has a density of the form \(e^{-V}\), where \(V:\mathbb {R}^k\rightarrow \mathbb {R}\cup \{+\infty \}\) is a convex function.

Conjecture 2.3

(Kannan et al. [26]) There is a universal constant \(D>0\) such that for all \(k\in \mathbb {N}^*\), any log-concave and isotropic probability measure \(\mu \) on \(\mathbb {R}^k\) satisfies the following Cheeger inequality

$$\begin{aligned} \mu ^+(A)\ge D\min (\mu (A) ; 1-\mu (A)),\quad \forall A \subset \mathbb {R}^k. \end{aligned}$$

Equivalently, there is a universal constant \(\lambda >0\) such that for all \(k\in \mathbb {N}^*\), any log-concave and isotropic probability measure \(\mu \) on \(\mathbb {R}^k\) satisfies the following Poincaré inequality

$$\begin{aligned} \lambda \mathrm {Var}_\mu (f)\le \int |\nabla f|^2\,d\mu , \end{aligned}$$

for all \(f:\mathbb {R}^k \rightarrow \mathbb {R}\) Lipschitz.

Note that, in the statement above, the converse implication from the Poincaré inequality to the Cheeger inequality is due to Buser [12] and Ledoux [27, 29], and is in fact true more generally on Riemannian manifolds with non-negative Ricci curvature [29].

According to E. Milman’s Theorem 2.1, the above conjecture can be reduced to a statement about universal concentration inequalities for log-concave isotropic probabilities.

Corollary 2.4

The KLS conjecture is equivalent to the following statement: there exist \(r_o>0\) and \(a_o \in [0,1/2)\) such that for any \(m\in \mathbb {N}^*\), any log-concave and isotropic probability measure \(\nu \) on \(\mathbb {R}^m\) satisfies

$$\begin{aligned} \nu (A + r_o B_2) \ge 1-a_o,\quad \forall A \subset \mathbb {R}^m \text { s.t. } \nu (A)\ge 1/2, \end{aligned}$$
(2.1)

where \(B_2\) is the Euclidean unit ball of \(\mathbb {R}^m.\)

This corollary follows immediately from Theorem 2.1. Below, we propose an alternative proof based on our main result (Theorem 1.2).

Proof of Corollary 2.4

According to Theorem 1.3, it is clear that the KLS conjecture implies uniform exponential concentration estimates for isotropic log-concave probability measures.

Conversely, let \(\mu \) be isotropic and log-concave on \(\mathbb {R}^k\). For all \(n\in \mathbb {N}^*\), the probability \(\mu ^n\) is still isotropic and log-concave on \(\left( \mathbb {R}^k\right) ^n\). So, applying (2.1) to \(\nu =\mu ^n\) on \(\left( \mathbb {R}^k\right) ^n\), for all \(n\in \mathbb {N}^*\), we conclude that \(\mu \) satisfies \(\mathbf {CI}_2^\infty (\beta _{a_o,r_o})\), where the concentration profile \(\beta _{a_o,r_o}\) is defined by (1.5). According to Theorem 1.2, we conclude that \(\mu \) satisfies the Poincaré inequality with the constant \(\lambda = \left( \overline{\Phi }^{-1}(a_o)/r_o\right) ^2\). Since this holds for any isotropic log-concave probability measure in any dimension, this ends the proof. \(\square \)

2.3 Euclidean vs. Talagrand type enlargements

Theorem 1.2 improves upon a preceding result by the first author [18], where a stronger form of exponential dimension free concentration, introduced by Talagrand [41, 42], was shown to be equivalent to a transport-entropy inequality which was also known to be equivalent to the Poincaré inequality. These equivalences, together with our main result, will allow us to prove that the two notions of exponential dimension free concentration are actually equivalent (see Theorem 2.9 below).

In order to present these equivalences, we need some notation and definitions.

Given \(n\in \mathbb {N}^*\) and \(A \subset \mathcal {X}^n\), consider the following family of enlargements of \(A\):

$$\begin{aligned} \widetilde{A}_{a,r} =\left\{ x \in \mathcal {X}^n ; \exists y\in A \text { s.t } \sum _{i=1}^n \theta (ad(x_i,y_i))\le r\right\} ,\quad a >0,\quad r\ge 0 \end{aligned}$$

where \(\theta (t) = t^2,\) if \(t\in [0,1]\) and \(\theta (t)=2t-1\), if \(t\ge 1.\)

In the next definition, we recall the dimension free concentration property introduced by Talagrand.

Definition 2.5

A probability measure \(\mu \) on \(\mathcal {X}\) is said to satisfy the Talagrand exponential type dimension free concentration inequality with constants \(a,b\ge 0\) if for all \(n\in \mathbb {N}^*\), for all \(A\subset \mathcal {X}^n\) with \(\mu ^n(A)\ge 1/2\), it holds

$$\begin{aligned} \mu ^n(\widetilde{A}_{a,r})\ge 1-be^{-r},\quad \forall r\ge 0. \end{aligned}$$
(2.2)

Remark 2.6

Using elementary algebra, one can compare the Talagrand concentration inequality (2.2) with the dimension free concentration inequality (1.1) under investigation in this paper; more precisely, the former is stronger than the latter. Indeed, since \(t\mapsto \theta (\sqrt{t})\) is concave and vanishes at \(0\), it is sub-additive. In turn, the following inequality holds

$$\begin{aligned} \sum _{i=1}^n\theta (ad(x_i,y_i)) \ge \theta \left( \sqrt{\sum _{i=1}^n a^2d^2(x_i,y_i)}\right) =\theta (ad_2(x,y)),\quad \forall x,y\in \mathcal {X}^n. \end{aligned}$$

Therefore,

$$\begin{aligned} \widetilde{A}_{a,\theta (ar)} \subset A_{r,2}, \end{aligned}$$

and so, if \(\mu \) satisfies the Talagrand concentration inequality (2.2), then it obviously verifies the dimension free concentration inequality with the profile \(\alpha (u)=be^{-\theta (au)}\le ebe^{-2au},\) \(u\ge 0.\)

The following theorem summarizes the known links between Talagrand exponential type dimension free concentration and the Poincaré inequality.

Theorem 2.7

Let \(\mu \) be a probability measure on \(\mathcal {X}\). The following statements are equivalent

  1. (1)

    \(\mu \) satisfies the Poincaré inequality (1.2) with a constant \(\lambda >0\).

  2. (2)

    \(\mu \) satisfies the Talagrand exponential type dimension free concentration inequality (2.2) with constants \(a,b>0\).

  3. (3)

    \(\mu \) satisfies the following transport-entropy inequality

    $$\begin{aligned} \inf _{(X,Y) \in \Pi (\mu ,\nu )}\mathbb {E} \left( \theta (Cd(X,Y)) \right) \le H(\nu |\mu ),\quad \forall \nu \in \mathcal {P}(\mathcal {X}), \end{aligned}$$

    for some constant \(C>0\), (recall that \(\Pi (\mu ,\nu )\) and the relative entropy \(H(\nu |\mu )\) are defined before Theorem 1.5).

Moreover, the constants above are related as follows:

  • \((1)\Rightarrow (2)\) with \(a=\kappa \sqrt{\lambda }\) and \(b=1\), for some universal constant \(\kappa .\)

  • \((2)\Rightarrow (3)\) with \(C=a.\)

  • \((3)\Rightarrow (1)\) with \(\lambda =2C^2.\)

Let us make some bibliographical comments about the different implications in Theorem 2.7. The implication \((1)\Rightarrow (2)\) is due to Bobkov and Ledoux [10], the implication \((2)\Rightarrow (3)\) is due to the first author [18, Theorem 5.1], and the implication \((3)\Rightarrow (1)\) is due to Maurey [32] or Otto and Villani [37]. The equivalence between (1) and (3) was first proved by Bobkov et al. [6].

Remark 2.8

It is worth noting that the implication \((2)\Rightarrow (3)\) follows from Theorem 1.5 for \(p=2\) by a change of metric argument. Namely, suppose that \(\mu \) satisfies the concentration property (2) of Theorem 2.7, for some \(a>0\), and define \(\tilde{d}(x,y)=\sqrt{\theta (ad(x,y))}\) for all \(x,y \in \mathcal {X}.\) It is not difficult to check that the function \(\theta ^{1/2}\) is subadditive, and therefore \(\tilde{d}\) defines a new distance on \(\mathcal {X}\). The \(\ell _2\) extension of \(\tilde{d}\) to the product \(\mathcal {X}^n\) is

$$\begin{aligned} \tilde{d}_2(x,y)= \left[ \sum _{i=1}^n \theta (ad(x_i,y_i))\right] ^{1/2},\quad x,y\in \mathcal {X}^n, \end{aligned}$$

and it holds

$$\begin{aligned} \widetilde{A}_{a,r} = \left\{ x\in \mathcal {X}^n ; \tilde{d}_2(x,A) \le \sqrt{r}\right\} ,\quad \forall A \subset \mathcal {X}^n. \end{aligned}$$

Therefore, statement (2) can be restated by saying that \(\mu \) satisfies \(\mathbf {CI}_2^\infty (\alpha )\) (with respect to the distance \(\tilde{d}\)) with the Gaussian concentration profile \(\alpha (r)=be^{-r^2}.\) Applying Theorem 1.5, we conclude that \(\mu \) satisfies the \(2\)-Talagrand transport entropy inequality with the constant \(1\) with respect to the distance \(\tilde{d}\), which is exactly (3) with \(C=a\).

An immediate consequence of Theorem 1.2 and of Bobkov–Ledoux theorem \((1)\Rightarrow (2)\) above is the following result showing the equivalence between the two forms of dimension free exponential concentration.

Theorem 2.9

Let \(\mu \) be a probability measure on \(\mathcal {X}\). The following are equivalent.

  1. (1)

    The probability measure \(\mu \) satisfies the Talagrand exponential type dimension free concentration inequality (2.2) with constants \(a\) and \(b\).

  2. (2)

    The probability measure \(\mu \) satisfies \(\mathbf {CI}_2^\infty (\alpha )\) with a profile \(\alpha (u)=b'e^{-a'u},\) \(u\ge 0\).

Moreover, the constants are related as follows: \((1)\Rightarrow (2)\) with \(a'=2a\) and \(b'=eb\), and \((2)\Rightarrow (1)\) with \(a=\kappa a'/\sqrt{\log (2b')}\) (for some universal constant \(\kappa \)) and \(b=1\).

We do not know if there exists a direct proof of the implication \((2) \Rightarrow (1)\).

Proof

We have already proved that (1) implies (2) (see Remark 2.6). Let us prove the converse. According to Theorem 1.2, we conclude from (2) that \(\mu \) satisfies the Poincaré inequality with a constant \(\lambda \ge \left( \frac{\overline{\Phi }^{-1}(\alpha (u))}{u}\right) ^{2},\) for all \(u\) such that \(\alpha (u)<1/2.\) An elementary bound gives \(\overline{\Phi }(t)\ge \frac{1}{4}e^{-t^2},\) \(t\ge 0\); since \(\overline{\Phi }\) is decreasing, this yields \(\overline{\Phi }^{-1}(t)\ge \sqrt{\log \frac{1}{4t}},\) for all \(t\in (0,1/4].\) We may assume that \(b'\ge 1\) (increasing \(b'\) only weakens the hypothesis (2)). Taking \(u=\log (8b'^2)/a'\) [which guarantees that \(\alpha (u)=1/(8b')\le 1/8\)] and using \(\log (8b'^2)\le 3\log (2b')\), valid for \(b'\ge 1\), then yields \(\lambda \ge \frac{a'^2}{9\log (2b')}\). According to the implication (1) \(\Rightarrow \) (2) in Theorem 2.7, we conclude that \(\mu \) satisfies the Talagrand concentration inequality (2.2) with \(a=\kappa a'/\sqrt{\log (2b')}\), for some universal constant \(\kappa \). \(\square \)
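The elementary bound \(\overline{\Phi }(t)\ge \frac{1}{4}e^{-t^2}\) used in the proof above is easily checked numerically (a quick sanity check, assuming scipy):

```python
# The minimum of bar(Phi)(t) * e^{t^2} over t >= 0 is ~ 0.393 > 1/4.
import numpy as np
from scipy.stats import norm

t = np.linspace(0.0, 10.0, 100_001)
print(np.min(norm.sf(t) * np.exp(t ** 2)))   # ~ 0.393, attained near t ~ 0.61
```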

3 Some properties of the inf-convolution operators

In this short (technical) section, we recall some properties of the inf-convolution operators related to Hamilton–Jacobi equations and to the concentration of measure, in the setting of metric spaces (recall that \((\mathcal {X},d)\) is a complete separable metric space).

3.1 Inf-convolution operators and Hamilton–Jacobi equations

In this paragraph, we shall only consider the case \(p=2\). We make this restriction for simplicity, and also because only that particular case will be used in the proof of Theorem 1.2 (in the next section). However, we mention that most of the results of this section can be extended to any \(p>1\).

The following proposition collects two basic observations about the operators \(Q_t\), \(t>0.\)

Proposition 3.1

Let \(h:\mathcal {X}\rightarrow \mathbb {R}\) be a Lipschitz function. Then

  1. (i)

    for all \(x\in \mathcal {X}\), \(Q_th(x) \rightarrow h(x)\), when \(t\rightarrow 0^+\);

  2. (ii)

    for all \(\nu \in \mathcal {P}(\mathcal {X})\),

    $$\begin{aligned} \limsup _{t \rightarrow 0^+} \frac{1}{t}\int \left( h(x)-Q_th(x)\right) \,\nu (dx) \le \frac{1}{4}\int |\nabla ^-h(x)|^2\,\nu (dx).\end{aligned}$$
    (3.1)

Before giving the proof of Proposition 3.1, let us complete the picture by recalling the following theorem of [2, 23] (improving preceding results of [5, 30]). This result will not be used in the sequel.

Theorem 3.2

Let \(h\) be a bounded function on a Polish metric space \(\mathcal {X}\). Then \((t,x)\mapsto Q_th(x)\) satisfies the following Hamilton–Jacobi (in)equation

$$\begin{aligned} \frac{d}{dt_+} Q_th(x) \le -\frac{1}{4}|\nabla Q_th|^2(x), \quad \forall t>0,\ \forall x\in \mathcal {X}\end{aligned}$$
(3.2)

where \(d/dt_+\) stands for the right derivative, and \(|\nabla h|(x)=\limsup _{y\rightarrow x}\frac{|h(y)-h(x)|}{d(y,x)}.\) Moreover, if the space \(\mathcal {X}\) is geodesic (i.e. for all \(x,y\in \mathcal {X}\) there exists at least one curve \((z_t)_{t\in [0,1]}\) such that \(z_0=x\), \(z_1=y\) and \(d(z_s,z_t)=|t-s|d(x,y)\)) then (3.2) holds with equality.

Observe that, strangely, the two inequalities (3.1) and (3.2) go in opposite directions. This suggests that, at \(t=0\), there should be equality, at least for some class of functions.

Proof of Proposition 3.1

Let \(L>0\) be a Lipschitz constant of \(h\); since \(Q_t h \le h\) one has

$$\begin{aligned} Q_th(x)=\inf _{y\in B(x, 2Lt)} \left\{ h(y)+\frac{1}{t}d^2(x,y)\right\} . \end{aligned}$$

(Namely, if \(d(x,y)>2Lt\), it holds \(h(y)-h(x)+\frac{1}{t}d^2(x,y)\ge \left( -L+\frac{1}{t}d(x,y)\right) d(x,y)>0\).)

Hence

$$\begin{aligned} 0\le \frac{h(x)-Q_th(x)}{t}&=\sup _{y\in B(x,2Lt)}\left\{ \frac{h(x)-h(y)}{t} - \frac{d^2(x,y)}{t^2}\right\} \\&\le \sup _{y\in B(x,2Lt)}\left\{ \frac{[h(x)-h(y)]_+}{d(x,y)}\,\frac{d(x,y)}{t} - \frac{d^2(x,y)}{t^2}\right\} \\&\le \sup _{r \in \mathbb {R}} \left\{ \sup _{y\in B(x,2Lt)} \frac{[h(x)-h(y)]_+}{d(x,y)}\,r - r^2\right\} = \frac{1}{4}\sup _{y\in B(x,2Lt)} \frac{[h(x)-h(y)]_+^2}{d^2(x,y)}. \end{aligned}$$

We conclude from this that \(0\le (h-Q_th)/t\le L^2/4\). This implies in particular that \(Q_th\rightarrow h\) when \(t\rightarrow 0^+.\) Taking the \(\limsup \) when \(t\rightarrow 0^+\) gives

$$\begin{aligned} \limsup _{t\rightarrow 0^+}\frac{h(x)-Q_th(x)}{t} \le \frac{1}{4}|\nabla ^-h(x)|^2. \end{aligned}$$
(3.3)

Inequality (3.1) follows from (3.3) using Fatou’s lemma in its \(\limsup \) version. The application of Fatou’s lemma is justified by the fact that the family of functions \(\{(h-Q_th)/t\}_{t>0}\) is uniformly bounded. \(\square \)

Remark 3.3

The proof of (3.3) can also be found in [45, Theorem 22.46] (see also [23, Proposition A.3], [2, 5, 30]).

3.2 Inf-convolution operators and concentration of measure

In this subsection, for the sake of completeness, we briefly recall the short proof of Proposition 1.9, which relates the infimum convolution operator to the concentration property \(\mathbf {CI}_p^\infty (\alpha )\).

Proof

Let \(f:\mathcal {X}^n \rightarrow \mathbb {R}\cup \{+\infty \}\) be a function bounded from below with \(\mu ^n(f=+\infty )<1/2\). By definition of \(m(f)\), it holds \(\mu ^n(f\le m(f)) \ge 1/2.\) Define \(A=\{f \le m(f)\}\). If \(\mu \) satisfies the dimension free concentration property \(\mathbf {CI}_p^\infty (\alpha )\), then, since \(\mu ^n(A)\ge 1/2\), one has \(\mu ^n(\mathcal {X}^n \setminus A_{u,p}) \le \alpha (u)\), for all \(u\ge 0.\) Then, observe that

$$\begin{aligned} Q_tf(x) \le m(f) + \frac{1}{t^{p-1}}d^p_p(x,A),\quad \forall x\in \mathcal {X}^n. \end{aligned}$$

Hence \(\{Q_tf>m(f)+r\}\subset \{d_p(\,\cdot \,,A)>{r^{1/p}t^{1-1/p}}\} = \mathcal {X}^n\setminus A_{r^{1/p}t^{1-1/p},p}\), which proves (1.8).

To prove the converse, take a Borel set \(A \subset \mathcal {X}^n\) such that \(\mu ^n(A)\ge 1/2\) and consider the function \(i_A\) equal to \(0\) on \(A\) and \(+\infty \) on \(A^c.\) For this function, \(Q_ti_A(x)=d_p^p(x,A)/t^{p-1}\) and one can choose \(m(i_A)=0.\) Applying (1.8) then gives the concentration property \(\mathbf {CI}_p^\infty (\alpha )\). \(\square \)

4 Poincaré inequality and concentration of measure

This section is dedicated to the proof of our main result, Theorem 1.2, and of Corollary 1.8. Moreover, we shall explain how Theorem 1.2 can be used to improve any non-trivial dimension free concentration property into an exponential one.

4.1 From dimension free concentration to the Poincaré inequality: proof of Theorem 1.2

Proof of Theorem 1.2

Let \(h:\mathcal {X}\rightarrow \mathbb {R}\) be a bounded Lipschitz function such that \(\int h\,d\mu =0.\) For all \(n\in \mathbb {N}^*\), define \(f_n:\mathcal {X}^n\rightarrow \mathbb {R}\) by

$$\begin{aligned} f_n(x)=h(x_1)+\cdots +h(x_n),\quad \forall x=(x_1,\ldots ,x_n)\in \mathcal {X}^n. \end{aligned}$$

Our aim is to apply the central limit theorem. Applying (1.8) to \(f_n\) with \(t=1/\sqrt{n}\) and \(r=\sqrt{n}u\), for some \(u>0\), we easily arrive at

$$\begin{aligned}&\mu ^{n} \left( \frac{1}{\sqrt{n}\sigma _n}\sum _{i=1}^n \left[ Q_{1/\sqrt{n}} h(x_i) -\mu (Q_{1/\sqrt{n}}h)\right] > \frac{1}{\sigma _n\sqrt{n}}m(f_n)\right. \nonumber \\&\qquad \quad \left. +\frac{\sqrt{n}}{\sigma _n} \mu \left( h-Q_{1/\sqrt{n}}h \right) + \frac{u}{\sigma _n} \right) \le \alpha (\sqrt{u}), \end{aligned}$$
(4.1)

where \(\sigma _n^2 = \mathrm {Var}_{\mu } (Q_{1/\sqrt{n}} h)\) and \(m(f_n)\) is a median of \(f_n\) under \(\mu ^n\), that is to say any number \(m\in \mathbb {R}\) such that \(\mu ^n(f_n\ge m)\ge 1/2\) and \(\mu ^n(f_n\le m)\ge 1/2.\)

We deal with each term of (4.1) separately. According to point (i) of Proposition 3.1, we observe that \(\sigma _n \rightarrow \sigma =\sqrt{\mathrm {Var}_\mu (h)}\) when \(n\) goes to \(\infty \), and according to point (ii) of Proposition 3.1, that

$$\begin{aligned} \limsup _{n\rightarrow +\infty }\sqrt{n}\mu \left( h-Q_{1/\sqrt{n}}h \right) \le \frac{1}{4}\int |\nabla ^- h|^2\,d\mu . \end{aligned}$$

On the other hand, let \(m_n=m(f_n)/(\sqrt{n}\sigma )\). According to the central limit theorem (Theorem 1.10), the law of the random variables \(T_n=f_n/(\sqrt{n}\sigma )\) under \(\mu ^n\) converges weakly to the standard Gaussian. Since weak convergence implies the convergence of quantiles as soon as the limit distribution has a continuous distribution function (see for instance [44, Lemma 21.2]), we have in particular \(m_n \rightarrow 0\) as \(n\rightarrow \infty \).

Now, fix \(\varepsilon >0\). According to the above observations, (4.1) yields, for any \(u>0\) and any \(n\) sufficiently large,

$$\begin{aligned}&\mu ^{n} \left( \frac{1}{\sqrt{n}\sigma _n}\sum _{i=1}^n \left[ Q_{1/\sqrt{n}} h(x_i) -\mu (Q_{1/\sqrt{n}}h)\right] > \frac{1+\varepsilon }{\sigma }\left( \varepsilon + \frac{1}{4}\int |\nabla ^- h|^2\,d\mu +u \right) \right) \nonumber \\&\quad \le \alpha (\sqrt{u}). \end{aligned}$$
(4.2)

In order to apply (again) the central limit theorem (Theorem 1.10), introduce the following random variables

$$\begin{aligned} \widetilde{T}_n(x)=\sum _{i=1}^n \frac{1}{\sqrt{n}\sigma _n} \left[ Q_{1/\sqrt{n}} h(x_i) -\mu (Q_{1/\sqrt{n}}h)\right] \end{aligned}$$

under \(\mu ^n\). Since \(h\) is bounded, \(Q_{1/\sqrt{n}} h\) is uniformly bounded in \(n\), and by the dominated convergence theorem, as \(n\) goes to \(\infty \), we see that the Lindeberg condition (1.9) is verified:

$$\begin{aligned} \int \frac{1}{\sigma _n^2}\left( Q_{1/\sqrt{n}} h-\mu (Q_{1/\sqrt{n}} h)\right) ^2\,\mathbf {1}_{\left\{ \left| Q_{1/\sqrt{n}} h-\mu (Q_{1/\sqrt{n}} h)\right| >t\sqrt{n} \sigma _n\right\} }\, d\mu \rightarrow 0 \quad \hbox { as } n \rightarrow +\infty . \end{aligned}$$

Therefore, letting \(n\) go to \(+\infty \), and then \(\varepsilon \) to \(0\), by continuity of \(\overline{\Phi }\), inequality (4.2) yields

$$\begin{aligned} \overline{\Phi } \left( \frac{\int |\nabla ^- h|^2\,d\mu }{4\sigma } + \frac{u}{\sigma } \right) \le \alpha (\sqrt{u}), \quad \forall u \ge 0. \end{aligned}$$

Let \(u\ge 0\) be such that \(\alpha (\sqrt{u}) < 1/2\) and \(k(u) := \overline{\Phi }^{-1}(\alpha (\sqrt{u}))>0\). We easily get from the latter inequality that

$$\begin{aligned} k(u)\sqrt{\mathrm {Var}_\mu (h)} \le u + \frac{1}{4}\int |\nabla ^- h|^2\,d\mu . \end{aligned}$$

Replacing \(h\) by \(s h\), \(s >0\), and taking the infimum, we arrive at

$$\begin{aligned} k(u) \sqrt{\mathrm {Var}_\mu (h)} \le \inf _{s>0}\left\{ \frac{u}{s} +\frac{s}{4} \int |\nabla ^- h|^2\,d\mu \right\} =\sqrt{u \int |\nabla ^- h|^2\,d\mu }. \end{aligned}$$

Optimizing over \(u\), one concludes that Poincaré inequality (1.2) is satisfied with the constant \(\lambda \) announced in Theorem 1.2.

Now let \(h:\mathcal {X}\rightarrow \mathbb {R}\) be an unbounded Lipschitz function. Consider the sequence of functions \(h_n=(h\vee -n)\wedge n\), \(n\in \mathbb {N}^*\) converging pointwise to \(h\). For all \(n\in \mathbb {N}^*\), \(h_n\) is bounded and Lipschitz, and it is not difficult to check that

$$\begin{aligned}&|\nabla ^-h_n|(x)= 0\ \text { if }x\in \{h\le -n\}\cup \{h>n\}\quad \text {and}\quad |\nabla ^-h_n|(x)\nonumber \\&\quad = |\nabla ^-h|(x)\ \text { if }x\in \{-n<h\le n\}. \end{aligned}$$
(4.3)

In particular, the sequence \(|\nabla ^-h_n|\) converges monotonically to \(|\nabla ^-h|\). Applying Fatou's lemma and the monotone convergence theorem, and recalling that \(\mathrm {Var}_\mu (g)=\frac{1}{2}\int \int (g(x)-g(y))^2\,\mu (dx)\mu (dy)\), we obtain

$$\begin{aligned}&\frac{\lambda }{2} \int \int (h(x)-h(y))^2\,\mu (dx)\mu (dy)\le \lambda \liminf _{n\rightarrow \infty }\mathrm {Var}_\mu (h_n)\nonumber \\&\quad \le \liminf _{n\rightarrow \infty }\int |\nabla ^-h_n|^2\,d\mu = \int |\nabla ^-h|^2\,d\mu , \end{aligned}$$

which completes the proof of Theorem 1.2. \(\square \)

4.2 Poincaré inequality and boundedness of observable diameters of product probability spaces

In this section we prove Corollary 1.8.

Proof of Corollary 1.8

Assume first that \(\mu \) satisfies the Poincaré inequality (1.2) with the optimal constant \(\lambda \). Then according to Theorem 1.3, \(\mu \) satisfies \(\mathbf {CI}_2^\infty (\alpha )\) with the concentration profile \(\alpha (r)=be^{-a\sqrt{\lambda }r}\), where \(a,b\) are universal constants (\(b\ge 1/2\)). According to the first part of Lemma 1.7 [applied to the metric probability space \((\mathcal {X}^n,d_2,\mu ^n)\)], it follows that for all \(n\in \mathbb {N}^*\), \(\mathrm {Obs\, Diam}(\mathcal {X}^n,d_2,\mu ^n, t) \le 2\frac{\log (4b/t)}{a\sqrt{\lambda }},\) for all \(t\le 1\), and thus

$$\begin{aligned} r_\infty (t)\sqrt{\lambda }\le a'\log (b'/t),\quad \forall t\le 1 \end{aligned}$$

for some universal constants \(a',b'.\)

Conversely, assume that \(0<r_\infty (t_o)<\infty \) for some \(t_o\in (0,1/2).\) According to the second part of Lemma 1.7, \(\mu \) satisfies \(\mathbf {CI}_2^\infty (\beta _{t_o,r_\infty (t_o)})\), where the minimal profiles \(\beta \) are defined in (1.5). According to Theorem 1.2, it follows that \(\mu \) satisfies the Poincaré inequality with an optimal constant \(\lambda >0\) such that

$$\begin{aligned} \sqrt{\lambda }r_\infty (t_o) \ge \overline{\Phi }^{-1}(t_o). \end{aligned}$$

According to the first step, we conclude that \(r_\infty (t)<\infty \) for all \(t\le 1\), and so the inequality above is true for all \(t\in (0,1/2).\) \(\square \)

4.3 Self improvement of dimension free concentration inequalities

The next result shows that a non-trivial dimension free concentration inequality can always be upgraded into an inequality with an exponential decay. This observation goes back to Talagrand [41, Proposition 5.1].

Corollary 4.1

If \(\mu \) satisfies \(\mathbf {CI}_2^\infty (\alpha )\) with a profile \(\alpha \) such that \(\alpha (r_o)<1/2\) for some \(r_o\), then it satisfies dimension free concentration with an exponential profile. More explicitly, it satisfies the dimension free concentration property with the profile \(\tilde{\alpha }(r)=be^{-a\sqrt{\lambda }r}\), where \(a,b\) are universal constants and

$$\begin{aligned} \sqrt{\lambda }= \sup \left\{ \frac{\overline{\Phi }^{-1} \left( \alpha (r)\right) }{r};\quad r>0 \text { s.t } \alpha (r) < 1/2 \right\} . \end{aligned}$$

This result is an immediate corollary of Theorems 1.2 and 1.3.

In [41] this result was stated and proved only for probability measures on \(\mathbb {R}\). We thank E. Milman for pointing out to us that the argument is in fact more general. For the sake of completeness, we extend below Talagrand's argument to a very general abstract framework. For a future use, we only assume that the dimension free concentration property holds on a good subclass of sets. We refer to the proof of the corresponding proposition in Sect. 6, where this refinement will be used (the subclass of sets being the class of convex sets).

Proposition 4.2

Let \((\mathcal {X},d)\) be a complete separable metric space, \(p\ge 1\) and for all \(n\in \mathbb {N}^*\) let \(\mathcal {A}_n\) be a class of Borel sets in \(\mathcal {X}^n\) satisfying the following conditions:

  1. (i)

    For all \(n\in \mathbb {N}^*\) and \(r\ge 0\), if \(A\in \mathcal {A}_n\) then \(A_{r,p}\in \mathcal {A}_n.\)

  2. (ii)

    If \(A\in \mathcal {A}_m\), then \(A^n\in \mathcal {A}_{nm}.\)

Suppose that a Borel probability measure \(\mu \) on \(\mathcal {X}\) satisfies the following dimension free concentration property: there exist \(r_o>0\) and \(a_o\in [0,1/2)\) such that for all \(n\in \mathbb {N}^*\),

$$\begin{aligned} \mu ^n(A_{r_o,p})\ge 1-a_o,\quad \forall A \in \mathcal {A}_n \text { s.t. } \mu ^n(A)\ge 1/2. \end{aligned}$$

Then, for any \(\gamma \in ( -\log (1-a_o)/\log (2), 1)\), there exists \(c\in [1/2,1)\) depending only on \(\gamma \) and \(a_o\) such that for all \(n\in \mathbb {N}^*\),

$$\begin{aligned} \mu ^n(A_{r,p}) \ge 1- \frac{1-c}{\gamma } \gamma ^{r/r_o},\quad \forall r\ge 0,\quad \forall A \in \mathcal {A}_n \text { s.t. } \mu ^n(A)\ge c. \end{aligned}$$

Note in particular that in the case \(p=2\) (and \(\mathcal {A}_n\) the class of all Borel sets), we recover the conclusion of Corollary 4.1 with slightly less accurate constants.

Proof

Given \(A\in \mathcal {A}_1\), it holds \(\left( A^n\right) _{r_o,p}\subset \left( A_{r_o}\right) ^n\) and, according to (i) and (ii), both sets belong to \(\mathcal {A}_n.\) Therefore, if \(\mu (A)\ge (1/2)^{1/n}\), it holds \(\mu (A_{r_o}) \ge (1-a_o )^{1/n}.\) Now, let \(A\in \mathcal {A}_1\) be such that \(\mu (A)\ge 1/2\) and let \(n_A\) be the greatest integer \(n\in \mathbb {N}^*\) such that \(\mu (A)\ge (1/2)^{1/n}.\) By definition of \(n_A\),

$$\begin{aligned} \frac{\log (2)}{\log (1/\mu (A))}-1< n_A\le \frac{\log (2)}{\log (1/\mu (A))}. \end{aligned}$$

According to what precedes,

$$\begin{aligned} \mu (A_{r_o}^c)\le 1-(1-a_o )^{1/n_A}\le 1-\exp \left( \frac{\log (1-a_o )\log (1/\mu (A))}{\log (2)-\log (1/\mu (A))}\right) . \end{aligned}$$

The function \(\varphi (u)=\exp \left( \frac{\log (1-a_o )\log (1/u)}{\log (2)-\log (1/u)}\right) \) satisfies

$$\begin{aligned} \varphi (u)=1-\frac{\log (1-a_o )}{\log (2)}(u-1)+o(u-1), \end{aligned}$$

when \(u\rightarrow 1.\) So \(\frac{1-\varphi (\mu (A))}{1-\mu (A)} \rightarrow -\frac{\log (1-a_o )}{\log (2)}\in (0,1),\) when \(\mu (A)\rightarrow 1.\) Therefore, if \(\gamma \) is any number in the interval \((-\frac{\log (1-a_o )}{\log (2)},1)\), there exists \(c>1/2\) (depending only on \(\gamma \) and \(a_o\)) such that for all \(A \in \mathcal {A}_1\) with \(\mu (A)\ge c\) it holds

$$\begin{aligned} \mu \left( A_{r_o}^c\right) \le \gamma \mu (A^c). \end{aligned}$$

Iterating (which is possible thanks to \((i)\) and the easily checked property \(\left( A_{r_1,p}\right) _{r_2,p}\subset A_{r_1+r_2,p}\)) yields

$$\begin{aligned} \mu \left( A_{kr_o}^c\right) \le \gamma ^k \mu (A^c),\quad \forall k\in \mathbb {N}^*. \end{aligned}$$

It follows easily that for all \(u\ge 0\),

$$\begin{aligned} \mu \left( A_u^c\right) \le ((1-c)/\gamma ) \gamma ^{u/r_o}. \end{aligned}$$

Applying the argument above to the product measures \(\mu ^m\), \(m\in \mathbb {N}^*\), gives the conclusion. \(\square \)
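The constants in Proposition 4.2 are effective; the following sketch (assuming numpy, with a grid search standing in for the monotonicity considerations above) computes, for a given \(a_o\), an admissible pair \((\gamma ,c)\):

```python
# Numerical extraction of (gamma, c) in the proof of Proposition 4.2.
import numpy as np

a_o = 0.1
low = -np.log(1 - a_o) / np.log(2)      # limiting ratio -log(1-a_o)/log(2) ~ 0.152
gamma = (low + 1) / 2                   # any value in (low, 1) is admissible

u = np.linspace(0.55, 1 - 1e-6, 1_000_000)
phi = np.exp(np.log(1 - a_o) * np.log(1 / u) / (np.log(2) - np.log(1 / u)))
c = u[(1 - phi) / (1 - u) <= gamma][0]  # first grid point where 1 - phi(u) <= gamma (1 - u)
print(gamma, c)                         # profile: mu^n(A_{r,p}^c) <= ((1-c)/gamma) gamma^(r/r_o)
```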

5 From Poincaré inequality to exponential concentration: proof of Theorem 1.3

In this section, we give a proof of Theorem 1.3. Its conclusion is very classical in, say, the Euclidean setting, but dealing with the general metric space framework requires some additional technical ingredients that we present now.

5.1 Technical preparation

In order to regularize Lipschitz functions, we shall introduce an approximate sup-convolution operator. More precisely, for all \(\varepsilon >0\), and for all function \(f:\mathcal {X}^n\rightarrow \mathbb {R}\cup \{-\infty \}\), we define the (approximate) sup-convolution operator by

$$\begin{aligned} R_{\varepsilon } f(x):=\sup _{y\in \mathcal {X}^n}\left\{ f(y) -\sqrt{\varepsilon +d_2^2(x,y)} \right\} ,\quad x\in \mathcal {X}^n . \end{aligned}$$
(5.1)

The next lemma collects some useful properties about \(R_\varepsilon \).

Lemma 5.1

Let \(f:\mathcal {X}^n\rightarrow \mathbb {R}\cup \{-\infty \}\) be a function taking at least one finite value and \(\varepsilon >0\). Then

  1. (i)

    If \(R_\varepsilon f(x_o)<\infty \) for some \(x_o\in \mathcal {X}^n\), then \(R_\varepsilon f\) is finite everywhere and is \(1\)-Lipschitz with respect to \(d_2\). Moreover, it holds

    $$\begin{aligned} \displaystyle \sum _{i=1}^n|\nabla _i^- R_\varepsilon f(x)|^2\le 1,\quad \forall x\in \mathcal {X}^n. \end{aligned}$$
  2. (ii)

    If \(f\) is \(1\)-Lipschitz, then for all \(x\in \mathcal {X}^n\),

    $$\begin{aligned} f(x)-\sqrt{\varepsilon }\le R_\varepsilon f(x)\le f(x). \end{aligned}$$
  3. (iii)

    If \((\mathcal {X},d)\) is a Banach space and \(f\) is convex, then \(R_\varepsilon f\) is also convex.

Proof

We fix \(\varepsilon >0.\)

Point (iii) follows easily from the fact that \(R_\varepsilon f\) is a supremum of convex functions, since by a simple change of variable

$$\begin{aligned} R_\varepsilon f(x)=\sup _{z\in \mathcal {X}^n}\left\{ f(x-z) -\sqrt{\varepsilon +\Vert z\Vert _2^2} \right\} ,\quad x\in \mathcal {X}^n. \end{aligned}$$

Point (ii) is also easy. Indeed, the first inequality follows by choosing \(y=x\). For the second inequality, observe that, since \(f\) is 1-Lipschitz,

$$\begin{aligned} R_\varepsilon f(x)-f(x)=\sup _{y\in \mathcal {X}^n}\left\{ f(y)-f(x) -\sqrt{\varepsilon +d_2^2(x,y)} \right\} \le \sup _{r\ge 0}\left\{ r -\sqrt{\varepsilon +r^2}\right\} =0. \end{aligned}$$

Now we turn to the proof of Point (i).

The first part of the statement follows from the fact that \(R_\varepsilon f\) is a supremum of \(1\)-Lipschitz functions. To prove the other part, we need to fix some notation. For \(x=(x_1,\dots ,x_n) , z=(z_1,\dots ,z_n) \in \mathcal {X}^n\) and \(i \in \{1,2,\ldots ,n\}\), we set

$$\begin{aligned} \bar{x}^iz=(x_1,\ldots ,x_{i-1},z_i,x_{i+1},\ldots ,x_n). \end{aligned}$$

Also, we set \(\theta (u):=\sqrt{\varepsilon +u}\), \(u \in \mathbb {R}\), so that \(R_\varepsilon f(x)=\sup _y \{f(y)-\theta (d_2^2(x,y))\}\) (observe that \(\theta \) is concave).

Fix \(x=(x_1,\dots ,x_n) \in \mathcal {X}^n\) and a parameter \(\eta \in (0,1)\) that will be chosen later on and consider \(z=(z_1,\ldots , z_n)\in \mathcal {X}^n\) such that \(z_i\ne x_i\) for all \(i\in \{1,\ldots ,n\}\). We assume that \(R_\varepsilon f\) is everywhere finite. Hence, since \(\eta \min _{1\le i\le n} d(x_i,z_i)>0\), there exists \(\hat{y}=\hat{y}(x,z,\eta )\) such that

$$\begin{aligned} R_\varepsilon f(x) \le f(\hat{y}) -\sqrt{\varepsilon +d_2^2(x,\hat{y})} + \eta \min _{1\le i\le n} d(x_i,z_i). \end{aligned}$$

As a consequence, using that \(\theta \) is concave, for all \(1\le i\le n\), we have

$$\begin{aligned} {\left[ R_\varepsilon f(\bar{x}^i z_i)-R_\varepsilon f(x)\right] _-}&= {\left[ R_\varepsilon f(x)-R_\varepsilon f(\bar{x}^i z_i)\right] }_+\\&\le \left[ \theta (d_2^2(\bar{x}^i z_i,\hat{y}))-\theta (d_2^2(x,\hat{y}))\right] _+ +\eta \min _{1\le i\le n} d(x_i,z_i)\\&\le \left[ d_2^2(\bar{x}^i z_i,\hat{y}) - d_2^2(x,\hat{y}) \right] _+ \theta '(d_2^2(x,\hat{y})) +\eta \min _{1\le i\le n} d(x_i,z_i)\\&= \left[ d(\hat{y}_i,z_i)-d(\hat{y}_i,x_i)\right] _+\left( d(\hat{y}_i,z_i)+d(\hat{y}_i,x_i)\right) \theta '(d_2^2(x,\hat{y})) +\eta \min _{1\le i\le n} d(x_i,z_i) \\&\le 2 d(z_i,x_i)\, d(\hat{y}_i,z_i)\, \theta '(d_2^2(x,\hat{y})) +\eta \min _{1\le i\le n} d(x_i,z_i), \end{aligned}$$

where, in the last line, we used the triangle inequality together with the fact that the positive part \(\left[ d(\hat{y}_i,z_i)-d(\hat{y}_i,x_i)\right] _+\) forces \(d(\hat{y}_i,x_i) \le d(\hat{y}_i,z_i)\).

Using the Cauchy-Schwarz inequality, it follows for any \(\delta >0\) that

$$\begin{aligned} \frac{\left[ R_\varepsilon f\left( \bar{x}^i z_i\right) -R_\varepsilon f(x)\right] _-^2}{d^2(x_i,z_i)} \le \left( 2 d(\hat{y}_i,z_i)\, \theta '(d_2^2(x,\hat{y})) +\eta \right) ^2 \le (1+\delta )\, 4 d^2(\hat{y}_i,z_i)\, \theta '(d_2^2(x,\hat{y}))^2 + \left( 1+\frac{1}{\delta }\right) \eta ^2 . \end{aligned}$$

Therefore, summing over \(i\) and using the triangle and Cauchy–Schwarz inequalities, we get for any \(\delta >0\),

$$\begin{aligned} \sum _{i=1}^n \frac{\left[ R_\varepsilon f\left( \bar{x}^i z_i\right) -R_\varepsilon f(x)\right] _-^2}{d^2(x_i,z_i)}&\le (1+\delta )\, 4 d_2^2(\hat{y},z)\, \theta '\left( d_2^2(x,\hat{y})\right) ^2 + n\left( 1+\frac{1}{\delta }\right) \eta ^2 \\&\le (1+\delta )\, 4 \left( d_2(\hat{y},x) + d_2(x,z)\right) ^2 \theta '\left( d_2^2(x,\hat{y})\right) ^2 + n\left( 1+\frac{1}{\delta }\right) \eta ^2 \\&\le (1+\delta )^2\, 4 d_2^2(\hat{y},x)\, \theta '\left( d_2^2(x,\hat{y})\right) ^2 + (1+\delta )\left( 1+\frac{1}{\delta }\right) 4 d_2^2(x,z)\, \theta '\left( d_2^2(x,\hat{y})\right) ^2 + n\left( 1+\frac{1}{\delta }\right) \eta ^2 \\&\le (1+\delta )^2 + (1+\delta )\left( 1+\frac{1}{\delta }\right) \frac{d_2^2(x,z)}{\varepsilon } + n\left( 1+\frac{1}{\delta }\right) \eta ^2, \end{aligned}$$

using that \(4u^2\theta '(u^2)^2 \le 1\) and \(\theta '(u^2)^2\le 1/(4\varepsilon ).\) The expected result follows at once taking the limits \(z_i \rightarrow x_i\), and then \(\eta \rightarrow 0\) and \(\delta \rightarrow 0\). This ends the proof of the lemma. \(\square \)
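Point (ii) of Lemma 5.1 is easy to visualize numerically; the following minimal sketch (assuming numpy, with \(n=1\)) computes \(R_\varepsilon f\) by brute force on a grid and checks the two-sided bound for a \(1\)-Lipschitz function:

```python
# Brute-force sup-convolution (5.1) on a grid: f - sqrt(eps) <= R_eps f <= f.
import numpy as np

pts = np.linspace(-3.0, 3.0, 601)
f = np.abs(pts)                                   # 1-Lipschitz
eps = 0.1
pen = np.sqrt(eps + (pts[:, None] - pts[None, :]) ** 2)
R = np.max(f[None, :] - pen, axis=1)              # R_eps f on the grid

assert np.all(R <= f + 1e-12) and np.all(f - np.sqrt(eps) <= R + 1e-12)
print(np.max(f - R), np.sqrt(eps))                # the gap is at most sqrt(eps)
```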

The next ingredient in the proof of Theorem 1.3 is the so-called Herbst argument that we explain now. We recall the following notation: for any \(f :\mathcal {X}^n \rightarrow \mathbb {R}\), set

$$\begin{aligned} m(f)=\inf \left\{ m\in \mathbb {R}; \mu ^n(f\le m)\ge 1/2 \right\} . \end{aligned}$$

Proposition 5.2

Assume that \(\mu \) satisfies the Poincaré inequality (1.2) with a constant \(\lambda >0\). Then, there exist universal constants \(a,b>0\) such that, for all \(n\in \mathbb {N}^*\), it holds

$$\begin{aligned} \mu ^n\left( f>m\left( f\right) + \sqrt{\frac{ 2}{\lambda }} +r\right) \le b\exp \left( -a\sqrt{\lambda }r\right) ,\quad \forall r\ge 0 \end{aligned}$$

for all function \(f:\mathcal {X}^n\rightarrow \mathbb {R}\) such that

$$\begin{aligned} \sum _{i=1}^n |\nabla ^-_if|^2(x)\le 1,\quad \forall x\in \mathcal {X}^n. \end{aligned}$$
(5.2)

The same conclusion holds if the function \(f\) satisfies

$$\begin{aligned} \sum _{i=1}^n |\nabla ^+_if|^2(x)\le 1,\quad \forall x\in \mathcal {X}^n. \end{aligned}$$
(5.3)

Proof

It is well known that the Poincaré inequality tensorizes (see e.g. [28, Proposition 5.6]). Indeed, recall that for all \(n\in \mathbb {N}^*\) and every product probability measure \(\mu ^n\),

$$\begin{aligned} \mathrm {Var}_{\mu ^n}(g)\le \int \sum _{i=1}^n \mathrm {Var}_{\mu }(g_i)\,\mu ^n(dx), \end{aligned}$$

where \(g_i(x_i):=g(x_1,\ldots ,x_{i-1},x_i,x_{i+1},\ldots ,x_n)\), with \(x_1,\ldots ,x_{i-1},x_{i+1},\ldots ,x_n\) fixed. Therefore, if \(\mu \) satisfies (1.2), then the product probability measure \(\mu ^n\) satisfies

$$\begin{aligned} \lambda \mathrm {Var}_{\mu ^n}(g)\le \int \sum _{i=1}^n |\nabla _i^-g|^2(x)\,\mu ^n(dx), \end{aligned}$$
(5.4)

for all function \(g:\mathcal {X}^n \rightarrow \mathbb {R}\) that is Lipschitz in each coordinate.
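As a sanity check of this tensorization step (the check is ours and is not needed for the proof), the sub-additivity of the variance can be verified exactly on a small finite product space; the choice of \(\{0,1\}^3\) with a Bernoulli measure below is arbitrary:

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
n, p = 3, 0.3
points = list(itertools.product([0, 1], repeat=n))
g = {x: rng.normal() for x in points}  # an arbitrary function g on {0,1}^n
w = {x: np.prod([p if xi else 1 - p for xi in x]) for x in points}  # mu^n weights

def var(vals, weights):
    m = sum(wt * v for v, wt in zip(vals, weights))
    return sum(wt * (v - m) ** 2 for v, wt in zip(vals, weights))

lhs = var([g[x] for x in points], [w[x] for x in points])  # Var_{mu^n}(g)

# Right-hand side: sum over i of the mu^n-average of Var_mu(g_i).
rhs = 0.0
for i in range(n):
    for x in points:
        if x[i] == 0:  # enumerate each section {x_j, j != i} once
            x1 = x[:i] + (1,) + x[i + 1:]
            rhs += (w[x] + w[x1]) * var([g[x], g[x1]], [1 - p, p])

assert lhs <= rhs + 1e-12
print(f"Var_mu^n(g) = {lhs:.4f} <= tensorized bound = {rhs:.4f}")
```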

Let \(f:\mathcal {X}^n\rightarrow \mathbb {R}\) be bounded and such that (5.2) holds, and define \(Z(s)=\log \int e^{s f}\,d\mu ^n\), for all \(s\ge 0\). Applying (5.4) to \(g=e^{s f}\) (which is still bounded and Lipschitz) and using (5.2) easily yields

$$\begin{aligned} \lambda \left[ \int e^{2s f}d\mu ^n - \left( \int e^{s f}\,d\mu ^n\right) ^2\right] \le s^2\int e^{2sf}\,d\mu ^n . \end{aligned}$$

Thus

$$\begin{aligned} \log (1-s^2/\lambda ) + Z(2s) \le 2Z(s),\quad \forall 0\le s \le \sqrt{\lambda }. \end{aligned}$$
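As an illustration (ours), this inequality can be tested in closed form on the one-sided exponential measure \(\mu (dx)=e^{-x}\,dx\) on \(\mathbb {R}^+\), which is classically known to satisfy the Poincaré inequality with \(\lambda =1/4\): taking the \(1\)-Lipschitz function \(f(x)=x\) gives \(Z(s)=-\log (1-s)\), \(s<1\), and the inequality reduces to \(\log (1+2s)\le -2\log (1-s)\) on \([0,1/2]\):

```python
import numpy as np

# mu = Exp(1) satisfies Poincare with lambda = 1/4 (classical fact), and for
# f(x) = x one has Z(s) = log E[e^{s f}] = -log(1 - s) for s < 1.
lam = 0.25
Z = lambda s: -np.log(1.0 - s)

s = np.linspace(1e-4, np.sqrt(lam) - 1e-4, 1_000)  # 0 < s < sqrt(lambda) = 1/2
assert np.all(np.log(1.0 - s**2 / lam) + Z(2 * s) <= 2 * Z(s) + 1e-12)
print("log(1 - s^2/lambda) + Z(2s) <= 2 Z(s) holds for mu = Exp(1), f(x) = x")
```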

According to Hölder’s inequality, the function \(Z\) is convex. Therefore,

$$\begin{aligned} Z(2s)\ge Z(s) + Z'(s)s. \end{aligned}$$

As a result,

$$\begin{aligned} \log (1-s^2/\lambda ) + Z'(s)s \le Z(s),\quad \forall 0\le s \le \sqrt{\lambda }, \end{aligned}$$

and so

$$\begin{aligned} \frac{d}{ds}\left( \frac{Z(s)}{s}\right) \le -\frac{\log (1-s^2/\lambda )}{s^2},\quad \forall 0< s \le \sqrt{\lambda }. \end{aligned}$$

Since \(Z(s)/s\rightarrow \int f\,d\mu ^n\) when \(s\rightarrow 0\), we conclude that

$$\begin{aligned} \int e^{s(f-\int f\,d\mu ^n)}d\mu ^n \le \exp \left( \frac{s}{\sqrt{\lambda }} \int \limits _0^{s/\sqrt{\lambda }} \frac{-\log (1-v^2)}{v^2}\,dv \right) ,\quad \forall s\le \sqrt{\lambda }. \end{aligned}$$

Taking \(s=\sqrt{\lambda }/2\), we easily get, by the (exponential) Chebychev inequality,

$$\begin{aligned} \mu ^n\left( f-\int f\,d\mu ^n >r\right) \le be^{-\frac{\sqrt{\lambda }}{2}r},\quad \forall r\ge 0, \end{aligned}$$
(5.5)

with \(b=\exp \left( \frac{1}{2}\int _0^{1/2}\frac{-\log (1-v^2)}{v^2}\,dv\right) .\)
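For concreteness (this numerical remark is ours), the constant \(b\) can be evaluated: expanding \(-\log (1-v^2)/v^2=\sum _{k\ge 1}v^{2k-2}/k\) and integrating term by term gives \(\int _0^{1/2}-\log (1-v^2)/v^2\,dv\approx 0.5232\), hence \(b\approx e^{0.2616}\approx 1.30\). A quick quadrature confirms this:

```python
import numpy as np
from scipy.integrate import quad

# Integrand -log(1 - v^2) / v^2; it extends continuously by 1 at v = 0,
# and quad only evaluates it at interior quadrature nodes.
integrand = lambda v: -np.log1p(-v**2) / v**2

I, _ = quad(integrand, 0.0, 0.5)
b = np.exp(0.5 * I)
print(f"integral = {I:.4f}, b = exp(integral / 2) = {b:.4f}")  # ~0.5232, ~1.2991
```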

Now, since the product measure \(\mu ^n\) satisfies the tensorized Poincaré inequality (5.4) with the constant \(\lambda >0\), Condition (5.2) implies that

$$\begin{aligned}\lambda \mathrm {Var}_{\mu ^n}( f)\le 1.\end{aligned}$$

Therefore, by Markov’s inequality, for all \(\varepsilon >0\),

$$\begin{aligned} \mu ^n\left( f\le \int f\,d\mu ^n -\sqrt{\frac{ 2+\varepsilon }{\lambda }} \right) \le \frac{\lambda }{2+\varepsilon } \,\mathrm {Var}_{\mu ^n}( f) < \frac{1}{2}. \end{aligned}$$

Hence, letting \(\varepsilon \rightarrow 0\), the definition of \(m(f)\) easily yields

$$\begin{aligned} m\left( f\right) \ge \int f\,d\mu ^n -\sqrt{\frac{ 2}{\lambda } } . \end{aligned}$$

This inequality together with (5.5) provides the expected deviation inequality

$$\begin{aligned} \mu ^n\left( f>m\left( f\right) + \sqrt{\frac{ 2}{\lambda } } +r\right) \le be^{-\frac{\sqrt{\lambda }}{2}r},\quad \forall r\ge 0. \end{aligned}$$
(5.6)

Now suppose that \(f:\mathcal {X}^n\rightarrow \mathbb {R}\) is Lipschitz but not bounded. Consider the truncated functions \(f_k=(f\vee -k) \wedge k\), \(k\in \mathbb {N}^*.\) Applying (4.3) componentwise, we see that \(|\nabla _i^-f_k|\le |\nabla _i^-f|\) for all \(i\in \{1,2,\ldots ,n\}\). Therefore, \(f_k\) satisfies (5.2). Applying (5.6) to \(f_k\) and letting \(k\) go to \(\infty \), one easily sees that (5.6) still holds for the unbounded Lipschitz function \(f\).

Exactly the same proof works under Condition (5.3), using the equivalent inequality (1.3). \(\square \)

5.2 Proof of Theorem 1.3

Thanks to the previous section (Sect. 5.1), we are now in a position to prove Theorem 1.3.

Proof of Theorem 1.3

Let \(A\) be a measurable subset of \(\mathcal {X}^{n}\) of measure \(\mu ^n(A)\ge 1/2\) and define for all \(\varepsilon >0\), \(f_{A,\varepsilon }(x):=\sqrt{\varepsilon +d_2^2(x,A)}\), \(x \in \mathcal {X}^n\). By definition of \(R_\varepsilon \) given in (5.1), it holds \(f_{A,\varepsilon }=-R_\varepsilon i_A\), where \(i_A:\mathcal {X}^n\rightarrow \{-\infty ; 0\}\) is the function defined by \(i_A(x)=0\), if \(x\in A\) and \(i_A(x)=-\infty \) otherwise. According to Point \((ii)\) of Lemma 5.1, we see that \(f_{A,\varepsilon }\) satisfies Condition (5.3) of Proposition 5.2. Hence, observing that \(m(f_{A,\varepsilon })=\sqrt{\varepsilon }\), it follows from Proposition 5.2 that

$$\begin{aligned} \mu ^n\left( \sqrt{\varepsilon +d_2^2(x,A)}>\sqrt{\varepsilon } + \sqrt{\frac{ 2}{\lambda } } +r\right) \le b\exp \left( -a\sqrt{\lambda }r\right) ,\quad \forall r\ge 0 . \end{aligned}$$

Letting \(\varepsilon \) go to \(0\) and making the change of variable \(r\mapsto r-\sqrt{2/\lambda }\) yields

$$\begin{aligned} 1-\mu ^n(A_{r,2}) \le b' \exp (-a\sqrt{\lambda } r),\quad \forall r\ge 0, \end{aligned}$$

with \(b'=\max (b,1/2)e^{\sqrt{2} a}\). The proof of Theorem 1.3 is complete. \(\square \)

6 Poincaré inequality for convex functions and dimension free convex concentration property

In this final section, we investigate the links between the dimension free concentration property \(\mathbf {CI}^\infty _2(\alpha )\) restricted to convex sets, and the Poincaré inequality restricted to the class of convex functions (see below for precise definitions).

6.1 Convexity and convex concentration on geodesic spaces

To deal with convexity properties, we shall assume throughout the section that the metric space \((\mathcal {X},d)\) is geodesic. This means that any two points \(x\) and \(y \) of \(\mathcal {X}\) can be connected by at least one constant-speed continuous curve in \(\mathcal {X}\): i.e. for all \(x,y \in \mathcal {X}\), there exists \((x_t)_{t\in [0,1]}\) in \(\mathcal {X}\) satisfying \(x_0=x\), \(x_1=y\), and for all \(s,t\in [0,1]\), \(d(x_s,x_t)=|t-s| d(x_0,x_1)\). Such a curve is called a (constant-speed) geodesic joining \(x\) to \(y\).
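For instance, in a normed space the straight segment \(x_t=(1-t)x+ty\) is such a geodesic. The following sketch (ours, purely illustrative) checks the constant-speed identity numerically in \((\mathbb {R}^3,\Vert \cdot \Vert _2)\):

```python
import numpy as np

rng = np.random.default_rng(1)
x, y = rng.normal(size=3), rng.normal(size=3)
geod = lambda t: (1 - t) * x + t * y  # straight-line geodesic in (R^3, |.|_2)
d = np.linalg.norm

for s, t in rng.uniform(0.0, 1.0, size=(5, 2)):
    # constant-speed property: d(x_s, x_t) = |t - s| d(x_0, x_1)
    assert np.isclose(d(geod(s) - geod(t)), abs(t - s) * d(x - y))
print("constant-speed identity verified on random pairs (s, t)")
```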

Definition 6.1

Let \((\mathcal {X},d)\) be a geodesic space.

  1. (i)

    A set \(A\subset \mathcal {X}\) is said to be convex if for all \(x_0, x_1 \in A\) and all geodesics \((x_t)_{t\in [0,1]}\) joining \(x_0\) to \(x_1\), it holds \(x_t\in A\), \(\forall t \in [0,1]\).

  2. (ii)

    A function \(f:\mathcal {X}\rightarrow \mathbb {R}\cup \{+\infty \}\) is said to be convex if for all geodesics \((x_t)_{t\in [0,1]}\) in \(\mathcal {X}\),

    $$\begin{aligned} f(x_t)\le (1-t)f(x_0) +t f(x_1),\quad \forall t\in [0,1]. \end{aligned}$$

Accordingly, one will say that \(\mu \in \mathcal {P}(\mathcal {X})\) satisfies the dimension free convex concentration property with the concentration profile \(\alpha \) and with respect to the \(\ell _2\) product structure (in short \(\mathbf {CCI}_2^\infty (\alpha )\)), if for every convex subset \(A\subset \mathcal {X}^n\) with \(\mu ^n(A)\ge 1/2,\)

$$\begin{aligned} \mu ^n(A_{r,2})\ge 1-\alpha (r),\quad \forall r\ge 0. \end{aligned}$$
(6.1)

As for \(\mathbf {CI}_2^\infty (\alpha )\) in Proposition 1.9, the convex concentration property can be characterized using the inf-convolution operator \(Q_t\).

Proposition 6.2

Let \(\mu \in \mathcal {P}(\mathcal {X})\); \(\mu \) satisfies \(\mathbf {CCI}_2^\infty (\alpha )\) if and only if for all \(n\in \mathbb {N}^*\) and every convex function \(f:\mathcal {X}^n \rightarrow \mathbb {R}\cup \{+\infty \}\) bounded from below and such that \(\mu ^n(f=+\infty )<1/2\), it holds

$$\begin{aligned} \mu ^n(Q_t f > m(f) +r)\le \alpha (\sqrt{tr}),\quad \forall r,t>0, \end{aligned}$$
(6.2)

where \(m(f)=\inf \{m\in \mathbb {R};\mu ^n(f\le m) \ge 1/2\}.\)

Proof

Thanks to the following two observations, the proof of the above proposition (which we omit) is essentially identical to that of Proposition 1.9 in Sect. 3.2: (a) by definition, if \(f:\mathcal {X}\rightarrow \mathbb {R}\cup \{+\infty \}\) is convex, then every level set \(\{f\le r\}\), \(r\in \mathbb {R}\), is convex; (b) for any convex set \(A\), the function \(i_A\), which equals 0 on \(A\) and \(+\infty \) on \(\mathcal {X}\setminus A\), is convex. \(\square \)

For technical reasons, we will have to distinguish between the following two versions of the Poincaré inequality in restriction to convex functions (in short convex Poincaré):

$$\begin{aligned} \lambda \mathrm {Var}_\mu (f)\le \int |\nabla ^-f|^2\,d\mu ,\quad \forall f \text { Lipschitz and convex} \end{aligned}$$
(6.3)

and

$$\begin{aligned} \lambda \mathrm {Var}_\mu (f)\le \int |\nabla ^+f|^2\,d\mu ,\quad \forall f \text { Lipschitz and convex}. \end{aligned}$$
(6.4)

The argument used in Point (3) of Remark 1.1 to prove the equivalence between (1.2) and (1.3) in the usual setting no longer works here, since if \(f\) is convex, \(-f\) is in general not convex. However, if \((\mathcal {X},d)\) is a finite dimensional Banach space or a smooth Riemannian manifold and if one assumes that \(\mu \) is absolutely continuous with respect to the Lebesgue (or the volume) measure, then, by Rademacher’s theorem, the two gradients appearing in (6.3) and (6.4) coincide outside a set of \(\mu \)-measure 0, which, in turn, guarantees that (6.3) and (6.4) are equivalent.

6.2 Dimension free convex concentration implies the convex Poincaré inequality (6.3)

Starting from the functional characterization of \(\mathbf {CCI}_2^\infty (\alpha )\) stated in Proposition 6.2 and following the lines of the proof of Theorem 1.2, we shall obtain the first main theorem of this section (a counterpart of Theorem 1.2 in the convex situation).

Theorem 6.3

If \(\mu \) satisfies the dimension free convex concentration property \(\mathbf {CCI}_2^\infty (\alpha )\), then \(\mu \) satisfies the convex Poincaré inequality

$$\begin{aligned} \lambda \mathrm {Var}_\mu (f)\le \int |\nabla ^-f|^2\,d\mu , \end{aligned}$$

for all locally Lipschitz and convex functions with finite variance, with the constant \(\lambda \) defined by

$$\begin{aligned} \sqrt{\lambda }= \sup \left\{ \frac{ \overline{\Phi }^{-1} \left( \alpha (r)\right) }{r}; \quad r> 0 \text { s.t } \alpha (r)\le 1/2 \right\} . \end{aligned}$$

If moreover \(\int _0^{\infty } r\alpha (r)\,dr<\infty \), then every convex and Lipschitz function has finite variance, and thus \(\mu \) verifies (6.3).

Proof

The proof is very similar to that of Theorem 1.2, but one needs to take care of the technical difficulties coming from the restriction to convex sets/functions. We give here only the main lines and omit most of the details; we also use the notation of the proof of Theorem 1.2.

Let \(h:\mathcal {X}\rightarrow \mathbb {R}\) be a locally Lipschitz convex function such that \(\int h\,d\mu =0\) and \(\int h^2\,d\mu <+\infty \). For all \(n\in \mathbb {N}^*\), define \(f_n:\mathcal {X}^n\rightarrow \mathbb {R}\) by

$$\begin{aligned} f_n(x)=h(x_1)+\cdots +h(x_n),\quad \forall x=(x_1,\ldots ,x_n)\in \mathcal {X}^n. \end{aligned}$$

We first observe that \(f_n\) is convex. Indeed, this is an easy consequence of the fact that \((x_t)_{t\in [0,1]}\) is a geodesic in \((\mathcal {X}^n,d_2)\) if and only if each coordinate path \((x_{i,t})_{t\in [0,1]}\), \(i\in \{1,\ldots ,n\}\), is a geodesic in \((\mathcal {X},d)\) (where \(x_t=(x_{1,t},\ldots ,x_{n,t})\)). Therefore we may apply Proposition 6.2 to the function \(f_n\).

For the next step of the proof, we need an analogue of Proposition 3.1 for convex locally Lipschitz functions \(h :\mathcal {X}\rightarrow \mathbb {R}\) (not necessarily globally Lipschitz). From the convexity of \(h\), one easily checks that for all \(x,y\in \mathcal {X}\)

$$\begin{aligned} h(x)-h(y)\le d(x,y)|\nabla ^- h(x)|. \end{aligned}$$

It follows that for all \(t>0\),

$$\begin{aligned} 0\le \frac{h(x)-Q_th(x)}{t}&= \sup _{y\in \mathcal {X}}\left\{ \frac{h(x)-h(y)}{t} - \frac{d^2(x,y)}{t^2}\right\} \\&\le \sup _{y\in \mathcal {X}}\left\{ |\nabla ^- h(x)| \frac{d(x,y)}{t} - \frac{d^2(x,y)}{t^2}\right\} \\&\le \sup _{r \ge 0}\left\{ |\nabla ^- h(x)| r - r^2\right\} = \frac{1}{4}|\nabla ^- h(x)| ^2<\infty . \end{aligned}$$

This implies in particular that \(Q_th(x)\rightarrow h(x)\) for all \(x\in \mathcal {X}\). Moreover, it holds

$$\begin{aligned} \sqrt{n}\mu \left( h-Q_{1/\sqrt{n}}h \right) \le \frac{1}{4}\int |\nabla ^- h|^2\,d\mu ,\quad \forall n\in \mathbb {N}^*. \end{aligned}$$
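The pointwise bound \((h-Q_th)/t\le \frac{1}{4}|\nabla ^- h|^2\) is easy to test numerically. Here is a minimal sketch (ours), for the convex function \(h(x)=x^2\) on \(\mathbb {R}\), for which \(|\nabla ^- h|(x)=2|x|\) and, in closed form, \(Q_th(x)=x^2/(1+t)\):

```python
import numpy as np

h = lambda x: x**2  # convex, locally Lipschitz; |grad^- h|(x) = 2 |x|
t = 0.05
ys = np.linspace(-20.0, 20.0, 400_001)  # grid over which the infimum is taken

def Qt(x):
    # inf-convolution Q_t h(x) = inf_y { h(y) + d^2(x, y) / t }
    return np.min(h(ys) + (x - ys) ** 2 / t)

for x in [-3.0, -1.0, 0.5, 2.0]:
    lhs = (h(x) - Qt(x)) / t          # equals x^2 / (1 + t) here
    rhs = (2.0 * abs(x)) ** 2 / 4.0   # |grad^- h(x)|^2 / 4 = x^2
    assert lhs <= rhs + 1e-9
    print(f"x = {x:+.1f}: (h - Q_t h)/t = {lhs:.4f} <= {rhs:.4f}")
```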

Let us first assume that \(h\) is bounded from below. Then \(|Q_{1/\sqrt{n}}h|\le |h|+|\inf h|\) for all \(n\in \mathbb {N}^*\). Since \(\int h^2\, d\mu <\infty \), the dominated convergence theorem yields

$$\begin{aligned} \lim _{n\rightarrow +\infty } \sigma _n^2=\lim _{n\rightarrow +\infty } \mathrm {Var}_{\mu } (Q_{1/\sqrt{n}} h)=\mathrm {Var}_{\mu } ( h)=\sigma ^2. \end{aligned}$$

Since \(\int h^2\,d\mu <+\infty \), one may show that any median \(m_n\) of \(f_n/(\sqrt{n} \sigma )\) tends to 0 as \(n\) goes to \(+\infty \). As a consequence, (4.2) holds for \(n\) sufficiently large. The rest of the proof is identical to that of Theorem 1.2: we apply the Central Limit Theorem (Theorem 1.10), observing that the Lindeberg condition holds since \(|Q_{1/\sqrt{n}}h|\le |\inf h|+|h|\) for all \(n\in \mathbb {N}^*\) and \(\int h^2\, d\mu <\infty \). This proves the claim when \(h\) is bounded from below. To show that the Poincaré inequality still holds when \(h\) is not bounded from below, one considers the approximating sequence \(h_k=h\vee -k\), \(k\in \mathbb {N}^*\) (note that \(h_k\) is convex) and follows the same line of reasoning as at the end of the proof of Theorem 1.2.

Finally, to complete the proof, observe that if \(f:\mathcal {X}\rightarrow \mathbb {R}\) is convex and \(1\)-Lipschitz, then \(A=\{f\le m(f)\}\) is a convex set with \(\mu (A)\ge 1/2\), and since \(A_{r}\subset \{f\le m(f)+r\}\), we conclude from \(\mathbf {CCI}_2^\infty (\alpha )\) (applied with \(n=1\)) that

$$\begin{aligned} \mu (f>m(f)+r)\le \alpha (r),\quad \forall r\ge 0. \end{aligned}$$

Since \(\int _0^{\infty } r\alpha (r)\,dr<\infty \), an integration by parts shows that \(\int [f-m(f)]_+^2\,d\mu <\infty .\) Therefore, if \(f\) is convex, \(1\)-Lipschitz and bounded from below, then \(f\) has finite variance, and applying the first part of the proof we conclude that \(\mathrm {Var}_\mu (f)\le 1/\lambda .\) Using the same truncation as above, we see that this inequality extends to all convex and \(1\)-Lipschitz functions. This completes the proof. \(\square \)
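The integration by parts invoked above is the identity \(\int [f-m(f)]_+^2\,d\mu =2\int _0^{\infty } r\,\mu (f>m(f)+r)\,dr\le 2\int _0^{\infty } r\alpha (r)\,dr\). Below is a short numerical illustration (ours), with an exponential tail in the role of the deviation probability:

```python
import numpy as np
from scipy.integrate import quad

rng = np.random.default_rng(2)
X = rng.exponential(size=200_000)  # P(X > r) = e^{-r}; X plays the role of f - m(f)

lhs = np.mean(np.maximum(X, 0.0) ** 2)                    # E[X_+^2], Monte Carlo
rhs, _ = quad(lambda r: 2.0 * r * np.exp(-r), 0, np.inf)  # 2 int r P(X > r) dr = 2
print(f"E[X_+^2] ~= {lhs:.3f}  vs  2 * int_0^inf r P(X>r) dr = {rhs:.3f}")
```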

6.3 Convex Poincaré inequality implies exponential dimension free convex concentration

To get the converse, namely a counterpart of the Gromov–Milman theorem for convex sets [that the convex Poincaré inequality (6.3) or (6.4) implies the dimension free convex concentration property \(\mathbf {CCI}_2^\infty (\alpha )\) with an exponential profile], we need the additional structural assumption that the underlying metric space is of Busemann type.

Recall that \((\mathcal {X},d)\) is said to be a Busemann’s space if the distance \(d:\mathcal {X}^2\rightarrow \mathbb {R}^+\) is a convex function (of both variables). Banach spaces are obvious examples of such spaces. Another important class of Busemann’s spaces consists of the complete simply connected Riemannian manifolds with non positive sectional curvature. We refer to [4, 8, 38] for more information on this topic and a proof of this statement.

Let us mention some elementary properties of Busemann’s spaces:

  1. (B1)

    if \(\mathcal {X}\) is a Busemann’s space, any two points of \(\mathcal {X}\) are joined by a unique constant speed geodesic;

  2. (B2)

    for any convex subset \(A\subset \mathcal {X}\), the function \(\mathcal {X}\ni x \mapsto d(x,A)\) is convex (see the numerical sketch after this list);

  3. (B3)

    if \((\mathcal {X},d)\) is a Busemann’s space, then \((\mathcal {X}^n,d_2)\) is a Busemann’s space.
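Property (B2) can be visualised in the Euclidean plane (a particular Busemann’s space): for the closed unit disc \(A\) one has \(d(x,A)=\max (\Vert x\Vert _2-1,0)\), and the sketch below (ours) checks convexity along segments:

```python
import numpy as np

rng = np.random.default_rng(3)
dist_to_A = lambda x: max(np.linalg.norm(x) - 1.0, 0.0)  # d(x, A), A = unit disc

for _ in range(1_000):
    x, y = 3.0 * rng.normal(size=2), 3.0 * rng.normal(size=2)
    t = rng.uniform()
    xt = (1 - t) * x + t * y  # the geodesic between x and y in (R^2, |.|_2)
    # convexity of x -> d(x, A): d(x_t, A) <= (1-t) d(x, A) + t d(y, A)
    assert dist_to_A(xt) <= (1 - t) * dist_to_A(x) + t * dist_to_A(y) + 1e-12
print("x -> d(x, A) is convex along segments for the unit disc")
```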

We are now in a position to state our second main theorem (a counterpart of Gromov–Milman’s Theorem 1.3).

Theorem 6.4

Let \((\mathcal {X},d)\) be a geodesic space and \(\mu \) be a probability measure. Assume one of the following hypotheses: either

  1. (a)

    \(\mathcal {X}\) is a Busemann’s space and \(\mu \) satisfies the convex Poincaré inequality (6.4) with a constant \(\lambda >0\).

or

  1. (b)

    \((\mathcal {X},\Vert \cdot \Vert )\) is a Banach space (with \(\Vert x-y\Vert =d(x,y)\), \(x,y \in \mathcal {X}\)) and \(\mu \) satisfies the convex Poincaré inequality (6.3) with a constant \(\lambda >0\);

Then \(\mu \) satisfies the dimension free convex concentration property \(\mathbf {CCI}_2^\infty (\alpha )\) with the profile

$$\begin{aligned} \alpha (r)=b\exp (-a\sqrt{\lambda }r),\quad r\ge 0, \end{aligned}$$

where \(a,b\) are universal constants.

As a direct corollary of Theorems 6.3 and 6.4, when \(\mathcal {X}\) is a Banach space, we conclude that, as in the general case, the set of probability measures \(\mu \) satisfying the dimension free convex concentration property \(\mathbf {CCI}_2^\infty (\alpha )\) with an exponential profile coincides with the set of probability measures verifying the convex Poincaré inequality (6.3).

The remainder of the section is dedicated to the proof of Theorem 6.4. There is a technical obstacle to applying the Herbst argument as in Proposition 5.2: since the class of convex functions is not stable under truncation from above (if \(f\) is convex and \(a \in \mathbb {R}\), \(\min (f; a)\) is in general not convex), it is delicate to deal safely with \(\int e^{sf}\,d\mu \), \(s\ge 0\). Although a method based on \(p\)-th moments could perhaps be used instead, we rely on an argument based on Corollary 4.2: we first show that under (6.3) or (6.4), \(\mu \) verifies the dimension free convex concentration property \(\mathbf {CCI}_2^\infty (\alpha )\) with some polynomial concentration profile, and then we upgrade it into an exponential one using Corollary 4.2.

Proof

We start as in the proof of Proposition 5.2, by noticing that the convex Poincaré inequalities (6.3) and (6.4) tensorize. Therefore, if \(\mu \) verifies one of the convex Poincaré inequalities, then for all \(n\in \mathbb {N}^*\),

$$\begin{aligned} \lambda \mathrm {Var}_{\mu ^n}(f)\le \int \sum _{i=1}^n\left| \nabla _i^{+/-}f\right| ^2\,d\mu ^n, \end{aligned}$$

for all Lipschitz and convex functions \(f:\mathcal {X}^n\rightarrow \mathbb {R}\). In particular, if \(f\) is Lipschitz, convex and such that \(\sum _{i=1}^n|\nabla _i^{+/-}f|^2\le 1\), then \(\mathrm {Var}_{\mu ^n}(f)\le 1/\lambda \). Applying Jensen’s inequality, we see that

$$\begin{aligned} 1/\lambda \ge \mathrm {Var}_{\mu ^n}(f)=\inf _{a\in \mathbb {R}}\int (f-a)^2\,d\mu ^n\ge \left( \inf _{a\in \mathbb {R}}\int |f-a|\,d\mu ^n\right) ^2=\left( \int |f-m(f)|\,d\mu ^n\right) ^2. \end{aligned}$$

Thus, it follows immediately from Markov’s inequality that

$$\begin{aligned} \mu ^n\left( f>m(f)+r\right) \le \frac{1}{\sqrt{\lambda }r},\quad \forall r>0. \end{aligned}$$
(6.5)
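The two steps leading to (6.5), namely Jensen’s inequality \((\int |f-m(f)|\,d\mu ^n)^2\le \mathrm {Var}_{\mu ^n}(f)\) followed by Markov’s inequality, are easy to test on simulated data. A minimal sketch (ours, with a Gaussian sample standing in for \(f\) under \(\mu ^n\)):

```python
import numpy as np

rng = np.random.default_rng(4)
f = rng.standard_normal(100_000)  # sample values of f under mu^n (illustrative)
m = np.median(f)                  # m(f): the median minimizes a -> E|f - a|

var, mad = f.var(), np.mean(np.abs(f - m))
assert mad**2 <= var              # Jensen: (E|f - m(f)|)^2 <= Var(f)

r = 2.0
tail = np.mean(f > m + r)
assert tail <= mad / r            # Markov: mu^n(f > m(f) + r) <= E|f - m(f)| / r
print(f"(E|f-m|)^2 = {mad**2:.3f} <= Var = {var:.3f}; tail {tail:.4f} <= {mad/r:.4f}")
```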

Let us first assume that Assumption (a) holds. Observe that the function \(\mathcal {X}^n\rightarrow \mathbb {R}:x \mapsto f_A(x):=d_2(x,A)\) is convex and \(1\)-Lipschitz when \(A\subset \mathcal {X}^n\) is a convex set [this follows from properties (B2) and (B3) above]. Moreover, using Point (i) of Lemma 5.1 and arguing as in the proof of Theorem 1.3, we see that \(f_{A,\varepsilon }:=\sqrt{\varepsilon +f_A^2}\) satisfies the condition \(\sum _{i=1}^n|\nabla _i^{+}f_{A,\varepsilon }|^2\le 1\). Since the function \(u\mapsto \sqrt{\varepsilon +u^2}\) is convex and increasing on \(\mathbb {R}^+\), the function \(f_{A,\varepsilon }\) is itself convex. Applying (6.5) to \(f_{A,\varepsilon }\) and letting \(\varepsilon \rightarrow 0\), we get

$$\begin{aligned} \mu ^n(A_{r,2})\ge 1-\frac{1}{\sqrt{\lambda }r}, \end{aligned}$$
(6.6)

for every convex set \(A\).

In particular, taking \(r_o=4/\sqrt{\lambda }\), we are in the framework of Corollary 4.2, with \(a_o=1/4\) and \(\mathcal {A}_n\) being the class of convex subsets of \(\mathcal {X}^n\). (Note that Assumptions (i) and (ii) are indeed satisfied by \(\mathcal {A}_n\): a product of convex sets is always convex, and properties (B2) and (B3) above show that the enlargement of a convex set remains convex.) We thus conclude that there are universal constants \(0<\gamma <1\) and \(1/2\le c<1\) such that, for all \(n\in \mathbb {N}^*\) and all convex sets \(A\) with \(\mu ^n(A)\ge c\),

$$\begin{aligned} \mu ^n(A_{r,2})\ge 1-be^{-a\sqrt{\lambda }r},\quad \forall r\ge 0, \end{aligned}$$

with \(b=(1-c)/\gamma \) and \(a=-\log (\gamma )/4\). Now, if \(\mu ^n(A)\ge 1/2\), then applying (6.6) we see that \(\mu ^n(A_{r_1,2})\ge c\) for \(r_1=\frac{1}{\sqrt{\lambda }(1-c)}\). We easily conclude from this that \(\mu \) verifies \(\mathbf {CCI}_2^\infty (\alpha )\) with \(\alpha (r)=b'e^{-a\sqrt{\lambda }r}\), for some other universal constant \(b'\).

Assume now that Assumption (b) holds, and let \(A\subset \mathcal {X}^n\) be a convex subset. The function \(f_A\) defined above is \(1\)-Lipschitz and convex. Therefore, according to Points (i) and (iii) of Lemma 5.1, the function \(R_\varepsilon f_A\), where \(R_\varepsilon \) is the operator defined in (5.1), is \(1\)-Lipschitz, convex and satisfies \(\sum _{i=1}^n|\nabla _i^{-}R_\varepsilon f_A|^2\le 1\). Applying (6.5), we conclude that

$$\begin{aligned} \mu ^n\left( R_\varepsilon f_A>m(R_\varepsilon f_A)+r\right) \le \frac{1}{\sqrt{\lambda }r},\quad \forall r>0. \end{aligned}$$

Moreover, according to Point (ii) of Lemma 5.1, it holds \(f_A-\sqrt{\varepsilon }\le R_\varepsilon f_A\le f_A\). Inserting this into the deviation inequality above and letting \(\varepsilon \rightarrow 0\) yields the conclusion that (6.6) holds for all convex \(A.\) The rest of the proof is identical. \(\square \)