1 Introduction

Let \(\{T_t:\, t\ge 0\}\) be a continuous semigroup of stochastic maps (a Markov semigroup) with a unique stationary distribution \(\pi \). Defining the p-norm, for \(p\ge 1\), of a function f by \(\Vert f\Vert _p:=(\mathbb {E}|f|^p)^{1/p}\), where the expectation is with respect to \(\pi \), a simple convexity-type argument verifies that \(\Vert T_tf\Vert _p\le \Vert f\Vert _p\). That is, \(T_t\), for all \(t\ge 0\), is a contraction under p-norms. Since \(p\mapsto \Vert f\Vert _p\) is non-decreasing, a stronger contractivity inequality is the following:

$$\begin{aligned} \Vert T_tf\Vert _p\le \Vert f\Vert _q, \end{aligned}$$
(1)

for \(1\le q\le p\) and \(t=t(p)\) an increasing function of p satisfying \(t(q)=0\). An inequality of this form is called a hypercontractivity inequality. Since \(T_0\) equals the identity map, the inequality (1) for \(p=q\) reduces to an equality. Thus its infinitesimal version around \(t=0\) must also hold. This infinitesimal version is obtained by differentiating the left hand side of (1) and is called a q-log-Sobolev inequality. Such an inequality involves two quantities: the entropy function and the Dirichlet form. A log-Sobolev inequality guarantees the existence of a positive constant, called a log-Sobolev constant, up to which the entropy function is dominated by the Dirichlet form. Not only can one derive log-Sobolev inequalities from hypercontractivity inequalities, but a collection of the former can also be used to prove hypercontractivity inequalities through integration. Thus log-Sobolev inequalities and hypercontractivity inequalities are essentially equivalent.

A fundamental tool in the theory of log-Sobolev inequalities is the Stroock–Varopoulos inequality. This inequality enables us to compare the Dirichlet forms associated to different values of q, using which a log-Sobolev inequality for \(q=2\) can be used to derive a log-Sobolev inequality for any q. Indeed, the Stroock–Varopoulos inequality allows us to derive a collection of log-Sobolev inequalities from a single one, from which hypercontractivity inequalities can be proven by integration.

Hypercontractivity inequalities were first studied in the context of quantum field theory [22, 40, 48], but later found several important applications in different areas of mathematics, e.g., concentration of measure inequalities [8, 45], transportation cost inequalities [21], estimating the mixing times [18], analysis of Boolean functions [15] and information theory [1, 25]. One of the main ingredients of most of these applications is the so called tensorization property. It states that the hypercontractivity inequality

$$\begin{aligned} \Vert T_t^{\otimes n} f\Vert _p\le \Vert f\Vert _q, \end{aligned}$$

is satisfied for every \(n\ge 1\) if and only if it holds for \(n=1\). That is, the hypercontractivity of \(T_t\) is equivalent to the hypercontractivity of its tensor powers. The proof of the tensorization property is not hard and can be obtained using the multiplicativity of the operator \((q\rightarrow p)\)-norm. Another proof, based on the equivalence of log-Sobolev and hypercontractivity inequalities, uses the chain rule and the subadditivity of the entropy function.

Hypercontractivity inequalities can also be studied for \( p, q<1\). Although \(\Vert \cdot \Vert _p\) for \(p<1\) is not a norm, it satisfies the reverse Minkowski inequality from which one can show that \(\Vert T_tf\Vert _p\ge \Vert f\Vert _p\) when \(p<1\). Thus it is natural to consider inequalities of the form (1) for \( p, q<1\) in the reverse direction. Such inequalities are called reverse hypercontractivity inequalities. The theory of log-Sobolev inequalities for the range of \(q<1\) is developed similarly and can be used for proving reverse hypercontractivity inequalities as well [36].

Quantum hypercontractivity inequalities The theory of hypercontractivity and log-Sobolev inequalities in the quantum (non-commutative) case has been developed by Olkiewicz and Zegarlinski [43]. Here the semigroup of stochastic maps is replaced by a quantum Markov semigroup (QMS) of superoperators representing the time evolution of an open quantum system under the Markovian approximation in the Heisenberg picture. Kastoryano and Temme in [26] used log-Sobolev inequalities to estimate the mixing time of quantum Markov semigroups. The study of quantum reverse hypercontractivity was initiated in [14], where following [36] some applications were discussed. For other applications of hypercontractivity inequalities in quantum information theory see [16, 32, 39].

Due to the non-commutative features of quantum physics, hypercontractivity and log-Sobolev inequalities in the quantum case are much more complicated. Therefore, despite the apparent analogy with the classical (i.e. commutative) case, several complications arise. In particular, one of the main drawbacks of the theory in the non-commutative case is the lack of a general quantum Stroock–Varopoulos inequality. As mentioned above, such an inequality would allow one to derive hypercontractivity inequalities solely from a 2-log-Sobolev inequality. Special cases of the quantum Stroock–Varopoulos inequality, called regularity and strong regularity properties, were considered in the literature and proved for certain examples [26, 43]. The most general result in this direction is a proof of the strong regularity property for a wide class of quantum Markov semigroups obtained in [3].

Even more problematic is the issue of tensorization. As mentioned before, the proof of the tensorization property in the commutative case is quite easy and can be done with at least two methods, yet neither of them generalizes to the non-commutative case: (i) the superoperator norm is not multiplicative in general, and (ii) one cannot interpret the quantum conditional entropy as an average of an entropic quantity over a smaller system, which is a crucial aspect of the proof in the classical setting. Thus far, the tensorization property has been proven only for a few special examples of quantum Markov semigroups. In particular, it was proven for the qubit depolarizing semigroup in [26, 33] and was generalized to all unital qubit semigroups in [28]. Moreover, in [49] some techniques were developed for bounding the log-Sobolev constants associated to the tensor powers of quantum Markov semigroups, which can be considered as an intermediate resolution of the tensorization problem. We also refer to [4, 6] for the theory of hypercontractivity and log-Sobolev inequalities for completely bounded norms.

1.1 Our Results

In this paper we first develop the theory of quantum reverse hypercontractivity inequalities beyond the unital case. This is done almost in a manner analogous to the (forward) hypercontractivity inequalities. Here, in contrast to [26, 43], we need to use different normalizations for the entropy function as well as the Dirichlet form to make them non-negative even for parameters \(p<1\). Our results in this part are summarized in Theorem 11.

Our next result is a quantum Stroock–Varopoulos inequality for both the forward and reverse cases. We prove this inequality under the assumption of strong reversibility of the QMS. We provide two proofs for the quantum Stroock–Varopoulos inequality. The first proof is based on ideas in [11, 43]. The second proof is based on ideas in [3] in which the strong regularity is proven under the same assumption. Indeed, our quantum Stroock–Varopoulos inequality is a generalization of the strong regularity property established in [3]. Theorem 14 states our result in this part.

We then prove some tensorization-type results. The first one, Theorem 19, provides a uniform bound on the 1-log-Sobolev constant of generalized depolarizing semigroups and their tensor powers. The proof of this result is a generalization of the proof of a similar result in the classical case [36]. This tensorization result together with our Stroock–Varopoulos inequality gives a reverse hypercontractivity inequality which is used in the subsequent section. The second tensorization result, Theorem 21, shows that the 2-log-Sobolev constant of the n-fold tensor power of a qubit generalized depolarizing semigroup is independent of n. Next, in Theorem 25 we explicitly compute this 2-log-Sobolev constant. Finally, in Corollary 26 we use these results to establish a uniform bound on the 2-log-Sobolev constant of any qubit quantum Markov semigroup and its tensor powers. We note that the latter bound improves over the bounds provided in [49].

Let us briefly explain the ideas behind the latter tensorization results. Previously, Theorem 21 was known in the unital case (the usual depolarizing semigroup), the proof of which was based on an inequality on the norms of a \(2\times 2\) block matrix and its submatrices from [27]. Our proof of Theorem 21 is based on the same inequality. First in Lemma 22 we derive an infinitesimal version of that inequality in terms of the entropies of a \(2\times 2\) block matrix and its submatrices, and then use it to prove Theorem 21. To prove Theorem 25 we need to show that a certain function of qubit density matrices is optimized over diagonal ones. Once we show this, the explicit expression for the 2-log-Sobolev constant is obtained from the associated classical log-Sobolev constant derived in [18]. Finally, Corollary 26 is a quantum generalization of a classical result from [18] with an essentially similar proof except that we should take care of tensorization separately.

Finally, we apply the quantum reverse hypercontractivity in proving strong converse bounds for the tasks of quantum hypothesis testing and classical-quantum channel coding. In the next section, we briefly explain the key idea behind the application of reverse hypercontractivity to the problem of classical hypothesis testing.

1.2 Application to Hypothesis Testing Problem

Recently, the authors of [31] introduced a new technique to prove strong converse results in information theory using reverse hypercontractivity inequalities. In the following we briefly explain the ideas via the problem of hypothesis testing.

Suppose that n samples independently drawn from a probability distribution on some sample space \(\Omega \) are provided, and the task is to distinguish between two possible hypotheses, given by the distributions P and Q on \(\Omega \). In this setting, we apply a test function \(f:\Omega ^n\rightarrow \{0, 1\}\) to make the decision: letting \((x_1, \dots , x_n)\in \Omega ^n\) be the observed samples, if \(f(x_1, \dots , x_n)\) equals 1, we infer the hypothesis to be P, and otherwise we infer it to be Q. The following two types of error may occur: the Type I error of wrongly inferring the distribution to be Q, given by \(\alpha _n(f):=P^{\otimes n}(f=0)\), and the Type II error of wrongly inferring the distribution to be P, given by \(\beta _n(f):=Q^{\otimes n}(f=1)\). In the asymmetric regime, we further assume that \(\alpha _n(f)\) is uniformly bounded by some fixed error \(\varepsilon \in (0,1)\), and we are interested in the smallest achievable error \(\beta _n(f)\).
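For concreteness, here is a minimal numerical sketch of this setting (the Bernoulli distributions and the likelihood-ratio test below are our own illustrative choices, not part of the development above): for small n the two error probabilities can be computed by enumerating \(\Omega ^n\).

```python
import itertools
import numpy as np

# Toy instance (hypothetical): Omega = {0,1}, P = Bernoulli(0.6), Q = Bernoulli(0.3).
P = np.array([0.4, 0.6])
Q = np.array([0.7, 0.3])
n = 5

def prob(dist, x):
    """Probability of the sample x = (x_1,...,x_n) under the product distribution dist^{otimes n}."""
    return np.prod([dist[xi] for xi in x])

def f(x):
    """A likelihood-ratio test: infer P iff P^{otimes n}(x) >= Q^{otimes n}(x)."""
    return 1 if prob(P, x) >= prob(Q, x) else 0

# Type I error alpha_n(f) = P^{otimes n}(f = 0) and Type II error beta_n(f) = Q^{otimes n}(f = 1).
alpha_n = sum(prob(P, x) for x in itertools.product(range(2), repeat=n) if f(x) == 0)
beta_n  = sum(prob(Q, x) for x in itertools.product(range(2), repeat=n) if f(x) == 1)
print(f"alpha_n = {alpha_n:.4f}, beta_n = {beta_n:.4f}")
```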

The idea in [31] is to use the following variational formula for the relative entropy between P and Q (see, e.g., [45]):

$$\begin{aligned} nD(P\Vert Q)=D(P^{\otimes n}\Vert Q^{\otimes n})=\sup _{g>0} \mathbb {E}_{P^{\otimes n}}[\log g]-\log \mathbb {E}_{Q^{\otimes n}}[g], \end{aligned}$$
(2)

where \(\mathbb {E}_{P^{\otimes n}}\) stands for the expectation with respect to the distribution \(P^{\otimes n}\), and the supremum is over positive functions g on \(\Omega ^n\). This formula is applied to g being a noisy version of f; to obtain this noisy version, a Markov semigroup is employed.

For any function \(h:\Omega \rightarrow {\mathbb {R}}\) define

$$\begin{aligned} T_t(h):= \mathrm {e}^{-t}h +(1-\mathrm {e}^{-t}) \mathbb {E}_P[h]. \end{aligned}$$
(3)

These maps define a classical version of the generalized quantum depolarizing semigroup (see Equation (17)). That is, for every \(x\in \Omega \), we have \(T_t(h)(x) = \mathrm {e}^{-t}h(x) + (1-\mathrm {e}^{-t}) \mathbb {E}_P[h]\). Then \(\{T_t:\, t\ge 0\}\) forms a semigroup that satisfies the following reverse hypercontractivity inequality [36]:

$$\begin{aligned} \Vert T_t(h)\Vert _{q}\ge \Vert h\Vert _{p}, \quad \forall p, q, t , \quad 0\le q<p<1,~~~ t\ge \log \left( \frac{1-q}{1-p}\right) , \end{aligned}$$
(4)

where the norms are defined with respect to the distribution P, i.e., \(\Vert h\Vert _p = \big ( \mathbb {E}_P[|h|^p] \big )^{1/p}\). Now the idea is to use (2) for \(g=T_t^{\otimes n} f\) as follows:

$$\begin{aligned} nD(P\Vert Q) \ge \mathbb {E}_{P^{\otimes n}}[\log T_t^{\otimes n}f]-\log \mathbb {E}_{Q^{\otimes n}}[T_t^{\otimes n} f]. \end{aligned}$$
(5)

Bounding the second term on the right hand side is easy. Letting \(\gamma =\left\| \frac{dP}{dQ} \right\| _\infty \) we have

$$\begin{aligned} \mathbb {E}_{Q^{\otimes n}}[T_t^{\otimes n}(f)]&=\mathbb {E}_{Q^{\otimes n}}\big [\big (\mathrm {e}^{-t}+(1-\mathrm {e}^{-t})\mathbb {E}_P \big )^{\otimes n}f \big ]\nonumber \\&\le \mathbb {E}_{Q^{\otimes n}}\big [\big (\mathrm {e}^{-t}+\gamma (1-\mathrm {e}^{-t})\mathbb {E}_Q \big )^{\otimes n}f \big ]\nonumber \\&= \left( \mathrm {e}^{-t}+\gamma (1-\mathrm {e}^{-t})\right) ^n \mathbb {E}_{Q^{\otimes n}} [f]\nonumber \\&= \left( \mathrm {e}^{-t}+\gamma (1-\mathrm {e}^{-t})\right) ^n \beta _n(f)\nonumber \\&\le \mathrm {e}^{\left( \gamma -1\right) nt} \beta _n(f), \end{aligned}$$
(6)

where the last inequality follows from \(\mathrm {e}^{\gamma t}-1\ge \gamma (\mathrm {e}^t-1)\) for \(\gamma \ge 1\).

Now we need to bound the first term in terms of \(\alpha _n(f)\). The crucial observation here is that

$$\begin{aligned} \Vert h\Vert _0 = \lim _{r\rightarrow 0} \Vert h\Vert _r = \mathrm {e}^{\mathbb {E}_{{P}}[\log |h|]}. \end{aligned}$$
(7)

It is then natural to use the reverse hypercontractivity inequality (4) for \(q=0\). In fact, by the tensorization property, (4) also holds for \(T_t^{\otimes n}\), and we have

$$\begin{aligned} \mathbb {E}_{P^{\otimes n}}[\log T_t^{\otimes n} f]&=\log \Vert T_t^{\otimes n}(f)\Vert _{0}\nonumber \\&\ge \log \Vert f\Vert _{1-\mathrm {e}^{-t}}\nonumber \\&\ge \frac{1}{1-\mathrm {e}^{-t}}\log \mathbb {E}_{P^{\otimes n}}[f]\nonumber \\&\ge \left( \frac{1}{t}+1\right) \log (1-\alpha _n(f)), \end{aligned}$$
(8)

where the second line follows from the reverse hypercontractivity inequality, the third line follows from the fact that f takes values in \(\{0, 1\}\) (so that \(f^{1-\mathrm {e}^{-t}}=f\)), and the last line follows from \(\mathrm {e}^{t}\ge 1+t\). Now using (6) and (8) in (5), together with \(\alpha _n(f)\le \varepsilon \), and optimizing over the choice of \(t> 0\) we arrive at

$$\begin{aligned} \beta _n(f)\ge (1-\varepsilon )\mathrm {e}^{-nD(P\Vert Q) -2\sqrt{n \left\| \frac{dP}{dQ} \right\| _\infty \log \frac{1}{1-\varepsilon } } }. \end{aligned}$$
(9)
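Inequality (4), which drives the above derivation, can be checked numerically for the semigroup (3). The following minimal sketch (a randomly chosen finite sample space and random positive test functions; all names are our own) verifies it on random instances.

```python
import numpy as np

rng = np.random.default_rng(0)
P = rng.random(4); P /= P.sum()           # a random distribution on a 4-point sample space

def T(t, h):
    """The semigroup of Eq. (3): T_t(h) = e^{-t} h + (1 - e^{-t}) E_P[h]."""
    return np.exp(-t) * h + (1 - np.exp(-t)) * np.dot(P, h)

def norm(h, p):
    """||h||_p w.r.t. P; p = 0 is the limiting value exp(E_P[log|h|]) of Eq. (7)."""
    if p == 0:
        return np.exp(np.dot(P, np.log(np.abs(h))))
    return np.dot(P, np.abs(h) ** p) ** (1.0 / p)

# Check Eq. (4): ||T_t(h)||_q >= ||h||_p for 0 <= q < p < 1 and t >= log((1-q)/(1-p)).
for _ in range(1000):
    h = rng.random(4) + 0.01                       # a random positive function on Omega
    q, p = np.sort(rng.random(2) * 0.99)           # 0 < q < p < 1
    t = np.log((1 - q) / (1 - p)) + rng.random()   # any admissible time
    assert norm(T(t, h), q) >= norm(h, p) * (1 - 1e-12)
print("reverse hypercontractivity (4) holds on all sampled instances")
```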

In the present work, we show that the above analysis can be carried over to the quantum setting. Let us explain the similarities with the classical case as well as difficulties we face in doing this. Firstly, a variational expression for the quantum relative entropy similar to (2) is already known [44]. Secondly, the semigroup (3) is easily generalized to the generalized depolarizing semigroup in the quantum case. Thirdly, the reverse hypercontractivity inequality (4) is derived in the quantum case from our theory of quantum reverse hypercontractivity as well as our quantum Stroock–Varopoulos inequality. However we need this inequality in its n-fold tensor product form, for which we use our tensorization-type result. Also, generalizing the computations in (6) to the quantum case is straightforward. Nevertheless, we face a problem in the next step; The crucial identity (7) no longer holds in the non-commutative case. Indeed, as far as we know, non-commutative \(L_p\)-norms do not possess a closed expression in the limit \(p\rightarrow 0\). To get around this problem, instead of a variational formula similar to (2), we use our quantum reverse hypercontractivity inequality together with a variational formula for p-norms (obtained from the reverse Hölder inequality). Then we derive an inequality of the form (9) by taking an appropriate limit.

Section 5 contains our results on applications of reverse hypercontractivity inequalities to strong converse of the quantum hypothesis testing as well as the classical-quantum channel coding problems.

2 Notations

For a Hilbert space \(\mathcal {H}\), the algebra of (bounded) linear operators acting on \(\mathcal {H}\) is denoted by \(\mathcal {B}(\mathcal {H})\). The adjoint of \(X\in \mathcal {B}(\mathcal {H})\) is denoted by \(X^\dagger \) and

$$\begin{aligned} |X|:=\sqrt{X^\dagger X}. \end{aligned}$$

The subspace of self-adjoint operators is denoted by \(\mathcal {B}_{sa}(\mathcal {H}) \subset \mathcal {B}(\mathcal {H})\). When \(X\in \mathcal {B}_{sa}(\mathcal {H})\) is positive semi-definite (positive definite) we represent it by \(X\ge 0\) (\(X> 0\)). We let \(\mathcal {P}(\mathcal {H})\) be the cone of positive semi-definite operators on \(\mathcal {H}\) and \(\mathcal {P}_{+}(\mathcal {H}) \subset \mathcal {P}(\mathcal {H})\) the set of (strictly) positive operators. Further, let \(\mathcal {D}(\mathcal {H}):=\lbrace \rho \in \mathcal {P}(\mathcal {H})\mid \text {tr}\rho =1\rbrace \) denote the set of density operators (or states) on \(\mathcal {H}\), and \(\mathcal {D}_+(\mathcal {H}):=\mathcal {D}(\mathcal {H})\cap \mathcal {P}_+(\mathcal {H})\) denote the subset of faithful states. We denote the support of an operator A by \({\mathrm {supp}}(A)\). We let \(\mathbb {I}\in \mathcal {B}(\mathcal {H})\) be the identity operator on \(\mathcal {H}\), and \(\mathcal {I}:\mathcal {B}(\mathcal {H})\mapsto \mathcal {B}(\mathcal {H})\) be the identity superoperator acting on \(\mathcal {B}(\mathcal {H})\).

We sometimes deal with tensor products of Hilbert spaces. In this case, in order to keep track of subsystems, it is appropriate to label the Hilbert spaces as \(\mathcal {H}_A, \mathcal {H}_B\) etc. We also denote \(\mathcal {H}_A\otimes \mathcal {H}_B\) by \(\mathcal {H}_{AB}\). Then the subscript in \(X_{AB}\) indicates that it belongs to \(\mathcal {B}(\mathcal {H}_{AB})\). We also use \(\mathcal {H}^{\otimes n} = \mathcal {H}_{A_1}\otimes \cdots \otimes \mathcal {H}_{A_n}\) where \(\mathcal {H}_{A_i}\)’s are isomorphic Hilbert spaces. Moreover, for any \(S\subseteq \{1, \dots , n\}\) we use the shorthand notations \(A_S=A^S=\{A_j: \, j\in S \}\), and \(\mathcal {H}_{A_S}\) for \(\bigotimes _{j\in S}\mathcal {H}_{A_j}\). We also identify \(A_{\{1, \dots , n\}}\) with \(A^n\).

A superoperator \(\Phi :\mathcal {B}(\mathcal {H})\rightarrow \mathcal {B}(\mathcal {H})\) is called positive if \(\Phi (X)\ge 0\) whenever \(X\ge 0\). It is called completely positive if \(\mathcal {I}\otimes \Phi \) is positive where \(\mathcal {I}:\mathcal {B}(\mathcal {H}')\rightarrow \mathcal {B}(\mathcal {H}')\) is the identity superoperator associated to an arbitrary Hilbert space \(\mathcal {H}'\). Observe that a positive superoperator \(\Phi \) is hermitian-preserving, meaning that \(\Phi (X^\dagger ) =\Phi (X)^\dagger \). A superoperator is called unital if \(\Phi ({\mathbb {I}})={\mathbb {I}}\), and is called trace-preserving if \(\text {tr}\,\Phi (X)=\text {tr}X\) for all X. The adjoint of \(\Phi \), denoted by \(\Phi ^*\), is defined with respect to the Hilbert–Schmidt inner product:

$$\begin{aligned} \text {tr}\left( X^\dagger \Phi (Y)\right) = \text {tr}\left( \Phi ^*(X)^\dagger Y\right) . \end{aligned}$$
(10)

Note that the adjoint of a unital map is trace-preserving and vice versa.
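As a small numerical illustration (a sketch with a randomly generated unital completely positive map; all names are our own), one can verify (10) and the duality between unitality and trace preservation directly:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 3

# A random unital CP map Phi(X) = sum_k K_k X K_k^dagger, normalized so that sum_k K_k K_k^dagger = I.
Ks = [rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)) for _ in range(4)]
L = np.linalg.cholesky(sum(K @ K.conj().T for K in Ks))
Ks = [np.linalg.inv(L) @ K for K in Ks]

Phi     = lambda X: sum(K @ X @ K.conj().T for K in Ks)          # the map in the Heisenberg picture
Phi_adj = lambda X: sum(K.conj().T @ X @ K for K in Ks)          # its Hilbert-Schmidt adjoint Phi^*

X = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
Y = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))

assert np.allclose(Phi(np.eye(d)), np.eye(d))                                          # Phi is unital
assert np.isclose(np.trace(X.conj().T @ Phi(Y)), np.trace(Phi_adj(X).conj().T @ Y))    # Eq. (10)
assert np.isclose(np.trace(Phi_adj(X)), np.trace(X))                                   # Phi^* is trace-preserving
print("adjoint checks passed")
```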

2.1 Non-commutative Weighted \(L_p\)-Spaces

Throughout the paper we fix \(\sigma \in {\mathcal {D}}_+(\mathcal {H})\) to be a positive definite density matrix. We define

$$\begin{aligned} \Gamma _\sigma (X):= \sigma ^{\frac{1}{2}}X\sigma ^{\frac{1}{2}}. \end{aligned}$$

Then \(\mathcal {B}(\mathcal {H})\) is equipped with the inner product

$$\begin{aligned} \langle X, Y\rangle _\sigma := \text {tr}\left( X^\dagger \Gamma _\sigma (Y)\right) = \text {tr}\left( \Gamma _\sigma (X^\dagger )Y\right) . \end{aligned}$$

Note that if \(X, Y\ge 0\) then \(\langle X, Y\rangle _\sigma \ge 0\). This inner product induces a norm on \(\mathcal {B}(\mathcal {H})\):

$$\begin{aligned} \Vert X\Vert _{2, \sigma } := \sqrt{\langle X, X\rangle _\sigma }. \end{aligned}$$
(11)

This 2-norm can be generalized for other values of p. For every \(p\in {\mathbb {R}}\setminus \{0\}\) we define

$$\begin{aligned} \Vert X\Vert _{p, \sigma }:= \text {tr}\left[ \big |\Gamma _\sigma ^{\frac{1}{p}}(X)\big |^p\right] ^{\frac{1}{p}} = \text {tr}\left[ \big |\sigma ^{\frac{1}{2p}}X\sigma ^{\frac{1}{2p}}\big |^p\right] ^{\frac{1}{p}}\equiv \big \Vert \Gamma ^{\frac{1}{p}}_\sigma (X)\big \Vert _p , \end{aligned}$$
(12)

where

$$\begin{aligned} \Vert X\Vert _p:=\left( \text {tr}\,|X|^p\right) ^{1/p}, \end{aligned}$$

denotes the (generalized) Schatten norm of order p. In particular, if \(X> 0\) then \(\Vert X\Vert _{p, \sigma }^p=\text {tr}\big [\Gamma _\sigma ^{1/p}(X)^p\big ]\). Note that this definition reduces to (11) when \(p=2\). The values of \(\Vert X\Vert _{p, \sigma }\) for \(p\in \{0, \pm \infty \}\) are defined in the limits. Since the function \(p\mapsto \Vert X\Vert _{p,\sigma }\) is increasing and bounded below by 0, by the monotone convergence theorem, the limit \(p\rightarrow 0\) exists but does not have a closed expression, as opposed to the classical setting (cf Equation (7)). Observe also that \(\Vert X\Vert _{p, \sigma } = \Vert X^{\dagger }\Vert _{p, \sigma }\) for all X. Moreover, \(\Vert \cdot \Vert _{p, \sigma }\) for \(1\le p\le \infty \) satisfies the triangle inequality (the Minkowski inequality) and is a norm. The dual of this norm is \(\Vert \cdot \Vert _{{\hat{p}}, \sigma }\) where \({\hat{p}}\) is the Hölder conjugate of p given by

$$\begin{aligned} \frac{1}{p} +\frac{1}{{\hat{p}}}=1, \end{aligned}$$
(13)

where \(p>1\), and \(\hat{p}=+\infty \) for \(p=1\). Indeed, for \(1\le p\le \infty \) and arbitrary X we have [43]

$$\begin{aligned} \Vert X\Vert _{p, \sigma } = \sup _Y \frac{|\langle X, Y\rangle _\sigma |}{\Vert Y\Vert _{{\hat{p}}, \sigma }}. \end{aligned}$$
(14)

Moreover, for \(-\infty< p<1\), \(p\ne 0\), and positive definite X we have

$$\begin{aligned} \Vert X\Vert _{p, \sigma }= \inf _{Y>0} \frac{\langle X, Y\rangle _\sigma }{\Vert Y\Vert _{{\hat{p}}, \sigma }}, \end{aligned}$$
(15)

where again \({\hat{p}}\) is defined via (13). This identity is a consequence of the reverse Hölder inequality:

Lemma 1

(Reverse Hölder inequality). Let \(X\ge 0\) and \(Y>0\). Then, for any \(p< 1\) with Hölder conjugate \(\hat{p}\) we have

$$\begin{aligned} \langle X,Y\rangle _\sigma \ge \Vert X\Vert _{p,\sigma }\Vert Y\Vert _{\hat{p},\sigma }. \end{aligned}$$

Proof

The proof is a direct generalization of equation (32) of [50] (see also Lemma 5 of [14]): for any \(A\ge 0\) and \(B>0\),

$$\begin{aligned} \text {tr}(AB)\ge \Vert A\Vert _p\Vert B\Vert _{\hat{p}}. \end{aligned}$$

From there, choosing \(A:=\Gamma _{\sigma }^{\frac{1}{p}}(X)\) and \(B:=\Gamma _\sigma ^{\frac{1}{\hat{p}}}(Y)\),

$$\begin{aligned} \langle X,Y\rangle _{\sigma }=\text {tr}\big (\sigma ^{1/p}X\sigma ^{1/p}\sigma ^{1/\hat{p}}Y\sigma ^{1/\hat{p}}\big )=\text {tr}(AB)\ge \Vert A\Vert _p\Vert B\Vert _{\hat{p}}=\Vert X\Vert _{p,\sigma }\Vert Y\Vert _{\hat{p},\sigma }. \end{aligned}$$

\(\square \)
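The weighted norms (12) and Lemma 1 are easy to probe numerically. The following is a minimal sketch (random full-rank \(\sigma \) and random positive definite X, Y; helper names are ours, and matrix powers are computed via eigendecomposition):

```python
import numpy as np

rng = np.random.default_rng(2)
d = 4

def mpow(A, s):
    """A^s for a positive definite Hermitian matrix A, via its eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * np.power(w, s)) @ V.conj().T

def rand_pd(d):
    """A random well-conditioned positive definite matrix."""
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    U = np.linalg.qr(G)[0]
    return (U * rng.uniform(0.5, 2.0, d)) @ U.conj().T

sigma = rand_pd(d); sigma /= np.trace(sigma).real                # a full-rank density matrix

Gam = lambda X, s=1.0: mpow(sigma, s / 2) @ X @ mpow(sigma, s / 2)   # Gamma_sigma^s(X)

def pnorm(X, p):
    """||X||_{p,sigma} of Eq. (12) for positive definite X."""
    return np.trace(mpow(Gam(X, 1.0 / p), p)).real ** (1.0 / p)

# Reverse Hoelder inequality (Lemma 1): <X,Y>_sigma >= ||X||_{p,sigma} ||Y||_{p^,sigma} for p < 1.
for _ in range(500):
    X, Y = rand_pd(d), rand_pd(d)
    p = rng.uniform(0.2, 0.9) if rng.random() < 0.5 else rng.uniform(-2.0, -0.2)
    ph = p / (p - 1.0)                                           # Hoelder conjugate: 1/p + 1/ph = 1
    lhs = np.trace(X.conj().T @ Gam(Y)).real                     # <X,Y>_sigma for Hermitian X
    assert lhs >= pnorm(X, p) * pnorm(Y, ph) * (1 - 1e-10)
print("reverse Hoelder inequality verified on all sampled instances")
```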

Another property of \(\Vert \cdot \Vert _{p, \sigma }\) for \(-\infty \le p<1\) is the reverse Minkowski inequality. As mentioned above, when \(p\ge 1\), the triangle inequality is satisfied due to the Minkowski inequality. When \(p<1\), for \(X, Y\ge 0\) we have the inequality in the reverse direction:

$$\begin{aligned} \Vert X\Vert _{p, \sigma } + \Vert Y\Vert _{p, \sigma }\le \Vert X+Y\Vert _{p, \sigma }. \end{aligned}$$

Again this inequality in the special case of \(\sigma \) being the completely mixed state is proven in [14] but the generalization to arbitrary \(\sigma \) is immediate.

For arbitrary pq define the power operator by

$$\begin{aligned} I_{q, p}(X) := \Gamma _\sigma ^{-\frac{1}{q} }\left( \big |\Gamma _\sigma ^{\frac{1}{p}}(X)\big |^{\frac{p}{q}}\right) . \end{aligned}$$

Here are some immediate properties of the power operator.

Proposition 2

[26, 43] For all \(q,r,p\in (-\infty ,\infty )\backslash \{0\}\) and \(X\in \mathcal {B}(\mathcal {H})\):

  1. (i)

    \(\Vert I_{q, p}(X)\Vert _{q, \sigma }^q =\Vert X\Vert _{p, \sigma }^p\). In particular we have \(\Vert I_{p, p}(X)\Vert _{p, \sigma } = \Vert X\Vert _{p, \sigma }\).

  2. (ii)

    \(I_{q, r}\circ I_{r, p} = I_{q, p}\).

  3. (iii)

    For \(X\ge 0\) we have \(I_{p, p}(X)=X\).
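The identities of Proposition 2 can be checked numerically for positive definite X; the following is a minimal sketch (random \(\sigma \) and X, with our own helper names):

```python
import numpy as np

rng = np.random.default_rng(6)
d = 3

def mpow(A, s):
    """A^s for a positive definite Hermitian A."""
    w, V = np.linalg.eigh(A)
    return (V * np.power(w, s)) @ V.conj().T

def rand_pd(d):
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    U = np.linalg.qr(G)[0]
    return (U * rng.uniform(0.5, 2.0, d)) @ U.conj().T

sigma = rand_pd(d); sigma /= np.trace(sigma).real
Gam = lambda X, s: mpow(sigma, s / 2) @ X @ mpow(sigma, s / 2)       # Gamma_sigma^s(X)

def I_pow(q, p, X):
    """Power operator I_{q,p}(X) = Gamma_sigma^{-1/q}(|Gamma_sigma^{1/p}(X)|^{p/q}), for X > 0."""
    return Gam(mpow(Gam(X, 1.0 / p), p / q), -1.0 / q)

def pnorm_pow(X, p):
    """||X||_{p,sigma}^p = tr[(Gamma_sigma^{1/p}(X))^p], for X > 0."""
    return np.trace(mpow(Gam(X, 1.0 / p), p)).real

X = rand_pd(d)
p, q, r = 0.7, 1.6, -2.3

assert np.isclose(pnorm_pow(I_pow(q, p, X), q), pnorm_pow(X, p))   # Proposition 2 (i)
assert np.allclose(I_pow(q, r, I_pow(r, p, X)), I_pow(q, p, X))    # Proposition 2 (ii)
assert np.allclose(I_pow(p, p, X), X)                              # Proposition 2 (iii)
print("power operator identities verified")
```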

2.2 Entropy

For a given \(\sigma \in \mathcal {D}_+(\mathcal {H})\) and arbitrary \(p\ne 0\) we define the entropy function for \(X> 0\) by

$$\begin{aligned} \text {Ent}_{p, \sigma }(X):={}&\text {tr}\Big [ \big (\Gamma _\sigma ^{\frac{1}{p}}(X)\big )^p\cdot \log \big (\Gamma _\sigma ^{\frac{1}{p}}(X)\big )^p \Big ] \\&-\text {tr}\Big [\big (\Gamma _\sigma ^{\frac{1}{p}}(X)\big )^p\cdot \log \sigma \Big ]- \Vert X\Vert _{p, \sigma }^p\cdot \log \Vert X\Vert _{p, \sigma }^p. \end{aligned}$$

As usual, the entropy function for \(p\in \{0, \pm \infty \}\) is defined in the limit.
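A minimal numerical sketch of this definition (random \(\rho \) and \(\sigma \); helper names are ours), which also anticipates Proposition 4(iii)–(iv) below relating \(\text {Ent}_{p,\sigma }\) to Umegaki's relative entropy, and the non-negativity of Corollary 5(a):

```python
import numpy as np

rng = np.random.default_rng(4)
d = 3

def mpow(A, s):
    w, V = np.linalg.eigh(A)
    return (V * np.power(w, s)) @ V.conj().T

def mlog(A):
    w, V = np.linalg.eigh(A)
    return (V * np.log(w)) @ V.conj().T

def rand_state(d):
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = G @ G.conj().T + 0.1 * np.eye(d)
    return rho / np.trace(rho).real

sigma, rho = rand_state(d), rand_state(d)
Gam = lambda X, s: mpow(sigma, s / 2) @ X @ mpow(sigma, s / 2)    # Gamma_sigma^s(X)

def Ent(X, p):
    """Ent_{p,sigma}(X) for positive definite X, following the definition above."""
    A = mpow(Gam(X, 1.0 / p), p)                  # (Gamma_sigma^{1/p}(X))^p
    n = np.trace(A).real                          # = ||X||_{p,sigma}^p
    return np.trace(A @ (mlog(A) - mlog(sigma))).real - n * np.log(n)

D = np.trace(rho @ (mlog(rho) - mlog(sigma))).real                # Umegaki relative entropy D(rho||sigma)

assert np.isclose(Ent(Gam(mpow(rho, 0.5), -0.5), 2), D)           # Ent_{2,sigma}(Gamma_sigma^{-1/2}(sqrt(rho))) = D
assert np.isclose(Ent(Gam(rho, -1.0), 1), D)                      # Ent_{1,sigma}(Gamma_sigma^{-1}(rho)) = D
assert Ent(rand_state(d) * d, 0.5) >= -1e-9                       # non-negativity for X > 0
print(f"entropy function checks passed, D(rho||sigma) = {D:.4f}")
```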

Remark 1

When \(p> 0\), in the definition of the entropy we can take X to be positive semi-definite. However, when \(p<0\), we need to consider X to be positive definite in order to avoid difficulties. For this reason, in the rest of the paper we state our definitions and results for positive definite X, keeping in mind that when \(p, q>0 \) they can easily be generalized to positive semi-definite X (say, by taking an appropriate limit).

The significance of the entropy function comes from its relation to the derivative of the p-norm.

Proposition 3

[26, 43] For a differentiable operator-valued function \(p\mapsto X_p\) we have, for any \(p\in \mathbb {R}\backslash \{0\}\):

$$\begin{aligned} \frac{\text {d}}{\text {d}p}\Vert X_p\Vert _{p,\sigma }={}&\frac{1}{p^2}\Vert X_p\Vert _{p, \sigma }^{1-p}\cdot \left( \frac{1}{2}\text {Ent}_{p, \sigma }\big (I_{p, p}(X_p)\big )\right. \\&\left. +\frac{1}{2}\text {Ent}_{p, \sigma }\big (I_{p, p}(X_p^{\dagger })\big ) + \gamma \right) . \end{aligned}$$

Here \(\gamma \) is given by

$$\begin{aligned} \gamma =\frac{p^2}{2}\left( \text {tr}\Big [\Gamma _\sigma ^{\frac{1}{p}}(Z_p^{\dagger })\cdot \Gamma _\sigma ^{\frac{1}{p}}(X_p)\cdot \big | \Gamma _\sigma ^{\frac{1}{p}}(X_p) \big |^{p-2}\Big ] +\text {tr}\Big [\Gamma _\sigma ^{\frac{1}{p}}(X_p^{\dagger })\cdot \Gamma _\sigma ^{\frac{1}{p}}(Z_p)\cdot \big | \Gamma _\sigma ^{\frac{1}{p}}(X_p) \big |^{p-2}\Big ]\right) , \end{aligned}$$

where \(Z_p := \frac{\text {d}}{\text {d}p}X_p\).

We will be using two special cases of this proposition. First, if \(X_p> 0\) for all p, we have

$$\begin{aligned} \frac{\text {d}}{\text {d}p}\Vert X_p\Vert _{p,\sigma } = \frac{1}{p^2}\Vert X_p\Vert _{p, \sigma }^{1-p}\cdot \left( \text {Ent}_{p, \sigma }(X_p) + p^2\text {tr}\Big [ \Gamma _\sigma ^{\frac{1}{p}}(Z_p) \cdot \Gamma _\sigma ^{\frac{1}{p}} (X_p)^{p-1} \Big ] \right) . \end{aligned}$$

Second, if \(X_p=X\) is independent of p we have

$$\begin{aligned} \frac{\text {d}}{\text {d}p}\Vert X\Vert _{p,\sigma } = \frac{1}{p^2}\Vert X\Vert _{p, \sigma }^{1-p}\cdot \left( \frac{1}{2}\text {Ent}_{p, \sigma }\big (I_{p, p}(X)\big ) +\frac{1}{2}\text {Ent}_{p, \sigma }\big (I_{p, p}(X^{\dagger })\big ) \right) . \end{aligned}$$
(16)

We will also use the following properties of the entropy function that are easy to verify.

Proposition 4

[26]

  1. (i)

    \(\text {Ent}_{p, \sigma }(I_{p, 2}(X)) = \text {Ent}_{q, \sigma }(I_{q, 2}(X))\) for all \(p, q\in \mathbb {R}\backslash \{0\}\) and \(X\in \mathcal {B}(\mathcal {H})\).

  2. (ii)

    \(\text {Ent}_{p, \sigma }(cX) = c^p \text {Ent}_{p, \sigma }(X)\) for all \(X> 0\) and constants \(c> 0\).

  3. (iii)

    For any density matrix \(\rho \) we have

    $$\begin{aligned} \text {Ent}_{2, \sigma }\big (\Gamma _\sigma ^{-\frac{1}{2}}(\sqrt{\rho })\big ) = D(\rho \Vert \sigma ), \end{aligned}$$

    where \(D(\rho \Vert \sigma ) = \text {tr}(\rho \log \rho ) - \text {tr}(\rho \log \sigma )\) is Umegaki’s relative entropy.

  4. (iv)

    For any density matrix \(\rho \) we have

    $$\begin{aligned} \text {Ent}_{1, \sigma }\big (\Gamma _\sigma ^{-1}(\rho )\big ) = D(\rho \Vert \sigma ). \end{aligned}$$

Corollary 5

  1. (a)

    For all \(X>0\) and arbitrary \(p\in \mathbb {R}\backslash \{0\}\) we have \(\text {Ent}_{p, \sigma }(X)\ge 0\).

  2. (b)

    For all \(X>0\), the map \(p\mapsto \Vert X\Vert _{p, \sigma }\) is non-decreasing on \(\mathbb {R}\).

  3. (c)

    \(X\mapsto \text {Ent}_{1, \sigma }(X)\) is a convex function on positive semi-definite matrices.

Proof

(a) By part (i) of the previous proposition it suffices to prove the corollary for \(p=1\). Moreover, by part (ii) we may assume that X is of the form \(X=\Gamma _\sigma ^{-1}(\rho )\) for some density matrix \(\rho \). Then by part (iv) we have \(\text {Ent}_{1, \sigma }(X) = D(\rho \Vert \sigma )\ge 0\).

(b) By (a) both \(\text {Ent}_{p, \sigma }(I_{p,p}(X))\) and \(\text {Ent}_{p, \sigma }(I_{p,p}(X^\dagger ))\) are non-negative. Thus using (16) the derivative of \(p\mapsto \Vert X\Vert _{p, \sigma }\) is non-negative, and this function is non-decreasing.

(c) This is a direct consequence of the joint convexity of \((\rho ,\sigma )\mapsto D(\rho \Vert \sigma )\) (see e.g., [54]). \(\quad \square \)

2.3 Quantum Markov Semigroups

A quantum Markov semigroup (QMS) is the basic model for the evolution of an open quantum system in the Markovian regime. Such quantum Markov semigroup (in the Heisenberg picture) is a set \(\{\Phi _t:\, t\ge 0\}\) of completely positive unital superoperators \(\Phi _t: \mathcal {B}(\mathcal {H})\rightarrow \mathcal {B}(\mathcal {H})\) of the form

$$\begin{aligned} \Phi _t = \mathrm {e}^{-t\mathcal {L}}, \end{aligned}$$

where \(\mathcal {L}: \mathcal {B}(\mathcal {H})\rightarrow \mathcal {B}(\mathcal {H})\) is a superoperator called the Lindblad generator of the QMS. The general form of such a Lindblad generator is characterized in [20, 30]. We note that \(\Phi _0=\mathcal {I}\) and \(\Phi _{t+s}=\Phi _s\circ \Phi _t\). Moreover, for any \(X\in \mathcal {B}(\mathcal {H})\) we have

$$\begin{aligned} \frac{\text {d}}{\text {d}t}\Phi _t(X) = -\mathcal {L}\circ \Phi _t(X) = -\Phi _t\circ \mathcal {L}(X). \end{aligned}$$

In particular, since \(\Phi _t\) is assumed to be unital, we have

$$\begin{aligned} \mathcal {L}({\mathbb {I}})=0. \end{aligned}$$

The dual of \(\mathcal {L}\) generates the associated QMS in the Schrödinger picture: \(\Phi _t^* = \mathrm {e}^{-t\mathcal {L}^*}\) where \(\mathcal {L}^*\) is the adjoint of \(\mathcal {L}\) with respect to the Hilbert–Schmidt inner product defined in (10). Since \(\mathcal {L}\) has a non-trivial kernel (it contains \({\mathbb {I}}\)), so does \(\mathcal {L}^*\); hence there exists some non-zero \(\sigma \) in the kernel of \(\mathcal {L}^*\). Then \(\sigma \) is invariant under the semigroup \(\{\Phi _t^*: t\ge 0\}\), i.e., \(\Phi _t^*(\sigma ) = \sigma \) for all \(t\ge 0\). Throughout the paper we assume that such a \(\sigma \) is unique (up to scaling) and full-rank. Then it can be proven that \(\sigma \) is a density matrix. Thus by the above uniqueness and full-rankness assumptions, \(\{\Phi _t^*:\, t\ge 0\}\) admits a unique invariant state \(\sigma \) in \(\mathcal {D}_+(\mathcal {H})\). We call such a QMS primitive. Observe that for a primitive QMS the identity operator \({\mathbb {I}}\) is the unique (up to scaling) element in the kernel of \(\mathcal {L}\).

We say that the QMS is \(\sigma \)-reversible or satisfies the detailed balance condition with respect to some \(\sigma \in \mathcal {D}_+(\mathcal {H})\) if

$$\begin{aligned} \Gamma _\sigma \circ \mathcal {L}\circ \Gamma _\sigma ^{-1}= \mathcal {L}^*. \end{aligned}$$

From this equation and \(\mathcal {L}({\mathbb {I}})=0\) it is clear that

$$\begin{aligned} \mathcal {L}^*(\sigma )=0, \end{aligned}$$

and that \(\sigma \) is a fixed point of \(\Phi _t^*\). Therefore, if the QMS is primitive and \(\sigma \)-reversible, then \(\sigma \) would be the unique invariant state of \(\{\Phi _t^*:\, t\ge 0\}\).

We will frequently use the following immediate consequence of reversibility.

Lemma 6

\(\mathcal {L}\) is \(\sigma \)-reversible if and only if both \(\mathcal {L}\) and \(\Phi _t\) are self-adjoint with respect to the inner product \(\langle \cdot , \cdot \rangle _\sigma \), which means that for all \(X, Y\in \mathcal {B}(\mathcal {H})\) we have

$$\begin{aligned} \langle X, \mathcal {L}(Y)\rangle _\sigma = \langle \mathcal {L}(X), Y\rangle _\sigma , \qquad \langle X, \Phi _t(Y)\rangle _\sigma = \langle \Phi _t(X), Y\rangle _\sigma . \end{aligned}$$

A primitive QMS with the unique invariant state \(\sigma \in \mathcal {D}_+(\mathcal {H})\) is called p-contractive if it is a contraction under the p-norm, that is, for all \(t\ge 0\) and \(X> 0\) we have

$$\begin{aligned} \Vert \Phi _t(X)\Vert _{p,\sigma }\le \Vert X\Vert _{p, \sigma }, \quad \text {if }\quad p\ge 1. \end{aligned}$$

It is called reverse p-contractive if for all \(t\ge 0\) and \(X>0\)

$$\begin{aligned} \Vert \Phi _t(X)\Vert _{p,\sigma }\ge \Vert X\Vert _{p, \sigma }, \quad \text {if}\quad p< 1. \end{aligned}$$

We say that the QMS is contractive if it is p-contractive for all \(p\ge 1\) and reverse p-contractive for \(p<1\).

Two remarks are in order. Firstly, as mentioned before, when \(p>0\) in the above definition we may safely take \(X\ge 0\) (instead of \(X>0\)). For uniformity of presentation we prefer to take \(X>0\) in order to treat the cases \(p>0\) and \(p\le 0\) jointly in the definitions. Of course, in the former case a contractivity inequality for \(X\ge 0\) can be derived from one for \(X>0\) by taking an appropriate limit. Secondly, in the above definition we restrict to positive definite (or positive semidefinite) X since here \(\Phi _t\) is a completely positive map, and the superoperator norm of completely positive maps (at least for \(p\ge 1\)) is optimized over positive semidefinite operators (see e.g. [17] and references therein). The proof of the following proposition is postponed to Appendix A.

Proposition 7

  1. (i)

    Any primitive QMS is (reverse) p-contractive for \(p\in (-\infty , -1]\cup [1/2, +\infty )\).

  2. (ii)

    Any primitive QMS whose unique invariant state is \(\sigma ={\mathbb {I}}/d\), the completely mixed state, is (reverse) p-contractive for all p.

The reader familiar with the notion of sandwiched p-Rényi divergence [37, 52] would notice that p-contractivity is related [5] to the data processing inequality of sandwiched p-Rényi divergences, which is known to hold [5, 19, 37] for \(p\ge 1/2\). In Appendix A we give a proof of part (i) for the range \(p\in (-\infty , -1]\cup [1/2, 1)\) based on new ideas which may be of independent interest. Moreover, later in Corollary 15, under a stronger assumption than primitivity we will prove (reverse) p-contractivity for all p.

An important example of classical semigroups is generated by the map \(f\mapsto f-\mathbb {E}f\), where the expectation is with respect to some fixed distribution. This generator is sometimes called the simple generator [36]. The quantum analog of simple generators is

$$\begin{aligned} \mathcal {L}(X):=X - \text {tr}(\sigma X) I, \end{aligned}$$

for some positive definite density matrix \(\sigma \). Observe that \(\mathcal {L}\) is primitive, its adjoint is \(\mathcal {L}^*(X) =X-\text {tr}(X)\sigma \), and \(\mathcal {L}\) satisfies the detailed balance condition with respect to \(\sigma \). The quantum Markov semigroup associated to this Lindblad generator is

$$\begin{aligned} \Phi _t(X)=\mathrm {e}^{-t} X + (1-\mathrm {e}^{-t}) \text {tr}(\sigma X) {\mathbb {I}}. \end{aligned}$$
(17)

In the special case where \(\sigma \) is the completely mixed state, \(\Phi _t\) and \(\Phi _t^*\) coincide and become depolarizing channels. Indeed, (17) is a generalized depolarizing channel in the Heisenberg picture.
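The defining properties of this semigroup are easy to verify numerically. The following minimal sketch (random \(\sigma \); helper names are ours) checks the semigroup property, unitality, invariance of \(\sigma \), and \(\sigma \)-reversibility in the sense of Lemma 6:

```python
import numpy as np

rng = np.random.default_rng(3)
d = 3

G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
sigma = G @ G.conj().T + 0.1 * np.eye(d); sigma /= np.trace(sigma).real      # a random full-rank state
w, V = np.linalg.eigh(sigma)
sq = (V * np.sqrt(w)) @ V.conj().T                                           # sigma^{1/2}
Id = np.eye(d)

Phi      = lambda t, X: np.exp(-t) * X + (1 - np.exp(-t)) * np.trace(sigma @ X) * Id   # Eq. (17)
Phi_star = lambda t, X: np.exp(-t) * X + (1 - np.exp(-t)) * np.trace(X) * sigma        # Schroedinger picture
kms      = lambda X, Y: np.trace(X.conj().T @ sq @ Y @ sq)                             # <X,Y>_sigma

X = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
Y = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
s, t = 0.3, 0.7

assert np.allclose(Phi(s, Phi(t, X)), Phi(s + t, X))       # semigroup property
assert np.allclose(Phi(t, Id), Id)                         # unitality
assert np.allclose(Phi_star(t, sigma), sigma)              # sigma is invariant
assert np.isclose(kms(X, Phi(t, Y)), kms(Phi(t, X), Y))    # sigma-reversibility (cf. Lemma 6)
print("generalized depolarizing semigroup checks passed")
```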

Having two Lindblad generators \(\mathcal {L}\) and \(\mathcal {K}\) associated to two semigroups \(\{\Phi _t:\, t\ge 0\}\) and \(\{\Psi _t:\, t\ge 0\}\), respectively, we may consider a new Lindblad generator \(\mathcal {L}\otimes \mathcal {I}+\mathcal {I}\otimes \mathcal {K}\). This Lindblad generator generates the semigroup \(\{\Phi _t\otimes \Psi _t:\, t\ge 0\}\). Moreover, letting

$$\begin{aligned} {\widehat{\mathcal {L}}}_i:= \mathcal {I}^{\otimes (i-1)}\otimes \mathcal {L}\otimes \mathcal {I}^{\otimes (n-i)}, \end{aligned}$$
(18)

we have

$$\begin{aligned} \Phi _t^{\otimes n} = \mathrm {e}^{-t\sum _{i=1}^n {\widehat{\mathcal {L}}}_i}. \end{aligned}$$

Note that, if \(\mathcal {L}\) is primitive and reversible with respect to \(\sigma \), then \(\sum _{i=1}^n {\widehat{\mathcal {L}}}_i\) is also primitive and reversible with respect to \(\sigma ^{\otimes n}\).

2.4 Dirichlet Form

We now define the Dirichlet form associated to a QMS with generator \(\mathcal {L}\) by

$$\begin{aligned} \mathcal {E}_{p, \mathcal {L}}(X) = \frac{p{\hat{p}}}{4}\langle I_{{\hat{p}}, p}(X), \mathcal {L}(X)\rangle _\sigma , \end{aligned}$$

where \({\hat{p}}\) is the Hölder conjugate of p. Verification of the following properties of the Dirichlet form is easy.

Proposition 8

  1. (i)

    \(\mathcal {E}_{{\hat{p}}, \mathcal {L}}(I_{{\hat{p}}, 2}(X)) =\mathcal {E}_{p, \mathcal {L}}(I_{p, 2}(X))\) for all \(p\in \mathbb {R}\backslash \{0\} \) and \(X\in \mathcal {B}(\mathcal {H})\).

  2. (ii)

    \(\mathcal {E}_{ p, \mathcal {L}}(cX)=c^p\mathcal {E}_{p, \mathcal {L}}(X)\) for \(X\ge 0\) and constant \(c\ge 0\).

  3. (iii)

    \(\mathcal {E}_{2, \mathcal {L}} (X) = \langle X, \mathcal {L}(X)\rangle _\sigma \) for all \(X> 0\).

  4. (iv)

    \(\mathcal {E}_{1, \mathcal {L}}(X) = \frac{1}{4} \text {tr}\left[ \Gamma _\sigma \big (\mathcal {L}(X)\big )\cdot \big (\log \Gamma _\sigma (X) - \log \sigma \big ) \right] .\)

The non-negativity of the Dirichlet form is not clear from its definition. Here we prove the non-negativity assuming that the QMS is p-contractive. By Proposition 7 we then conclude the non-negativity of \(\mathcal {E}_{p, \mathcal {L}}(X)\) for \(p\notin (-1, 1/2)\). Later on, based on a stronger assumption than \(\sigma \)-reversibility, we will prove \(\mathcal {E}_{p, \mathcal {L}}(X)\ge 0\) for all values of p and \(X>0\).

Proposition 9

Suppose that \(\mathcal {L}\) generates a QMS that is primitive and \(\sigma \in \mathcal {D}_+(\mathcal {H})\) is its unique invariant state. Let \(p\in \mathbb {R}\backslash \{0\}\). If the QMS is (reverse) p-contractive, then \(\mathcal {E}_{p, \mathcal {L}}(X)\ge 0\) for all \(X> 0\).

Proof

Define

$$\begin{aligned} g(t) :={\hat{p}}\big \Vert \Phi _t(X) \big \Vert ^{p}_{p, \sigma }-{\hat{p}}\Vert X\Vert ^{p}_{p, \sigma } . \end{aligned}$$

By assumption of (reverse) p-contractivity, for all \(t\ge 0\) we have \(g(t)\le 0\). We note that \(g(0)=0\). Therefore, \(g'(0)\le 0\). We compute

$$\begin{aligned} g'(0)&= \frac{\text {d}}{\text {d}t}\, {\hat{p}}\,\big \Vert \Phi _t(X) \big \Vert ^{p}_{p, \sigma }\Big |_{t=0}\\&= \frac{\text {d}}{\text {d}t} \,{\hat{p}}\,\text {tr}\Big ( \Gamma _{\sigma }^{\frac{1}{p}}\circ \Phi _t(X)^p\Big )\Big |_{t=0}\\&= -p{\hat{p}} \,\text {tr}\Big ( \Gamma _\sigma ^{\frac{1}{p}}\circ \mathcal {L}(X) \cdot \Gamma _\sigma ^{\frac{1}{p}} (X)^{p-1} \Big )\\&= -p{\hat{p}} \,\text {tr}\Big ( \mathcal {L}(X) \cdot \Gamma _\sigma ^{\frac{1}{p}}\big (\Gamma _\sigma ^{\frac{1}{p}} (X)^{p-1}\big ) \Big )\\&= -p{\hat{p}} \langle I_{{\hat{p}}, p}(X), \mathcal {L}(X)\rangle _\sigma . \end{aligned}$$

This gives \(\mathcal {E}_{p, \mathcal {L}}(X)\ge 0\). \(\quad \square \)
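For a concrete instance, the following sketch evaluates \(\mathcal {E}_{p, \mathcal {L}}\) for the generalized depolarizing generator \(\mathcal {L}(X)=X-\text {tr}(\sigma X){\mathbb {I}}\) of Section 2.3 and probes its non-negativity on random inputs and exponents, including some \(p\in (-1, 1/2)\); for this generator non-negativity for all p is guaranteed only later by Corollary 15, since it is strongly \(\sigma \)-reversible. All helper names are ours.

```python
import numpy as np

rng = np.random.default_rng(7)
d = 3

def mpow(A, s):
    w, V = np.linalg.eigh(A)
    return (V * np.power(w, s)) @ V.conj().T

def rand_pd(d):
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    U = np.linalg.qr(G)[0]
    return (U * rng.uniform(0.5, 2.0, d)) @ U.conj().T

sigma = rand_pd(d); sigma /= np.trace(sigma).real
Id = np.eye(d)

Gam  = lambda X, s: mpow(sigma, s / 2) @ X @ mpow(sigma, s / 2)         # Gamma_sigma^s(X)
kms  = lambda X, Y: np.trace(X.conj().T @ Gam(Y, 1.0)).real             # <X,Y>_sigma (Hermitian arguments)
Lind = lambda X: X - np.trace(sigma @ X) * Id                           # generalized depolarizing generator

def I_pow(q, p, X):
    return Gam(mpow(Gam(X, 1.0 / p), p / q), -1.0 / q)                  # power operator I_{q,p}, X > 0

def dirichlet(p, X):
    """E_{p,L}(X) = (p p^/4) <I_{p^,p}(X), L(X)>_sigma, with p^ the Hoelder conjugate of p."""
    ph = p / (p - 1.0)
    return 0.25 * p * ph * kms(I_pow(ph, p, X), Lind(X))

for _ in range(300):
    X = rand_pd(d)
    p = rng.uniform(-2.0, 2.5)
    if abs(p) < 0.4 or abs(p - 1.0) < 0.2:
        continue                       # avoid p near 0 (ill-conditioned powers) and p = 1 (p^ = infinity)
    assert dirichlet(p, X) >= -1e-7
print("E_{p,L}(X) >= 0 on all sampled instances")
```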

2.5 Hypercontractivity and Logarithmic-Sobolev Inequalities

We showed in Proposition 7 that \(\Phi _t\) belonging to a primitive QMS is contractive, at least for certain values of p. That is, \(\Vert \Phi _t(X)\Vert _{p, \sigma }\) is bounded (from above or below depending on whether \(p\ge 1\) or \(p<1\)) by \(\Vert X\Vert _{p, \sigma }\). On the other hand, by part (b) of Corollary 5, bounding \(\Vert \Phi _t(X)\Vert _{p, \sigma }\) by \(\Vert X\Vert _{q, \sigma }\) when \(1\le q<p\) or \(p<q<1\) is a stronger inequality than contractivity. Such inequalities are called hypercontractivity inequalities or reverse hypercontractivity inequalities depending on whether \(1\le q<p\) or \(p<q<1\), respectively. These inequalities have found a wide range of applications in the literature.

It is well-known that quantum hypercontractivity inequalities stem from quantum logarithmic-Sobolev (log-Sobolev) inequalities. They are essentially equivalent objects, so proving log-Sobolev inequalities gives hypercontractivity ones. The theory of reverse hypercontractivity inequalities has been generalized to the non-commutative case for unital semigroups in [14]. Here we extend the theory to general QMSs.

Given a primitive Lindblad generator \(\mathcal {L}\) that is reversible with respect to a positive definite density matrix \(\sigma \) and \(p\in \mathbb {R}\backslash \{0\}\), a p-log-Sobolev inequality is an inequality of the form

$$\begin{aligned} \beta \,\text {Ent}_{p, \sigma }(X)\le \mathcal {E}_{p, \mathcal {L}}(X), \qquad \forall X> 0. \end{aligned}$$

The best constant \(\beta \) satisfying the above inequality is called the p-log-Sobolev constant and is denoted by \(\alpha _p(\mathcal {L})\). That is,

$$\begin{aligned} \alpha _p(\mathcal {L}) : = \inf \frac{\mathcal {E}_{p, \mathcal {L}}(X)}{\text {Ent}_{p, \sigma }(X)}, \end{aligned}$$

where the infimum is taken over \(X> 0\) with \(\text {Ent}_{p, \sigma }(X)\ne 0\).

By the following proposition we can restrict ourselves to log-Sobolev constants for values of \(p\in [0, 2]\).

Proposition 10

\(\alpha _p(\mathcal {L})=\alpha _{{\hat{p}}}(\mathcal {L})\) for all Lindblad generators \(\mathcal {L}\).

Proof

Identifying X with \(I_{p, 2}(Y)\), for some arbitrary \(Y> 0\), this is an immediate consequence of part (i) of Proposition 4 and part (i) of Proposition 8\(\quad \square \)

We can now state how log-Sobolev inequalities are related to hypercontractivity and reverse hypercontractivity inequalities. The first part of the following theorem is already known [26, 43].

Theorem 11

Let \(\mathcal {L}\) be a primitive Lindblad generator that is reversible with respect to a positive definite density matrix \(\sigma \). Then the following holds:

  • (Hypercontractivity) Suppose that \(\beta _2 = \inf _{p\in [1, 2]} \alpha _p(\mathcal {L}) >0\). Then for \(1\le q\le p\) and

    $$\begin{aligned} t\ge \frac{1}{4\beta _2}\log \frac{p-1}{q-1}, \end{aligned}$$
    (19)

    we have \(\Vert \Phi _t(X)\Vert _{p, \sigma }\le \Vert X\Vert _{q, \sigma }\) for all \(X> 0\).

  • (Reverse hypercontractivity) Suppose that \(\beta _1 = \inf _{p\in (0, 1]} \alpha _p(\mathcal {L}) >0\). Then for \(p\le q<1\) and

    $$\begin{aligned} t\ge \frac{1}{4\beta _1}\log \frac{p-1}{q-1}, \end{aligned}$$
    (20)

    we have \(\Vert \Phi _t(X)\Vert _{p, \sigma }\ge \Vert X\Vert _{q, \sigma }\) for all \(X> 0\), where Eq. 20 is understood in the limit whenever \(p=0\) or \(q=0\).

The proof strategy of this theorem is quite standard. Here we present a proof for the sake of completeness.

Proof

It suffices to prove the theorem when \(t= \frac{1}{4\beta } \log \frac{p-1}{q-1}\) for \(\beta \) being either \(\beta _2\) or \(\beta _1\) depending on whether we prove the hypercontractivity part or the reverse hypercontractivity part. Thus, fix q and define

$$\begin{aligned} t(p):= \frac{1}{4\beta } \log \frac{p-1}{q-1}. \end{aligned}$$

Define

$$\begin{aligned} f(p):=\Vert \Phi _{t(p)}(X)\Vert _{p, \sigma } -\Vert X\Vert _{q, \sigma } = \Vert X_p\Vert _{p, \sigma } - \Vert X\Vert _{q, \sigma }, \end{aligned}$$

where \(X_p:= \Phi _{t(p)}(X)> 0\). To continue the proof we compute the derivative of f(p) using Proposition 3.

$$\begin{aligned} f'(p)&= \frac{\text {d}}{\text {d}p} \Vert X_p\Vert _{p, \sigma } = \frac{1}{p^2}\Vert X_p\Vert _{p, \sigma }^{1-p}\cdot \left( \text {Ent}_{p, \sigma }(X_p) + p^2 \text {tr}\Big [ \Gamma _\sigma ^{\frac{1}{p}}(Z_p)\cdot \Gamma _\sigma ^{\frac{1}{p}}(X_p)^{p-1} \Big ]\right) , \end{aligned}$$

where

$$\begin{aligned} Z_p= \frac{\text {d}}{\text {d}p} X_p = -t'(p)\mathcal {L}(X_p)= -\frac{1}{4\beta (p-1)} \mathcal {L}(X_p). \end{aligned}$$

Therefore,

$$\begin{aligned} f'(p) = \frac{1}{p^2}\Vert X_p\Vert _{p, \sigma }^{1-p}\cdot \Big (\text {Ent}_{p, \sigma }(X_p) - \frac{1}{\beta } \mathcal {E}_{p, \mathcal {L}}(X_p) \Big ). \end{aligned}$$

Now suppose that \(q\ge 1\) and \(\beta \le \alpha _p(\mathcal {L})\) for all \(p\in [1, 2]\). Then for \(p\ge q\) we have

$$\begin{aligned} \text {Ent}_{p, \sigma }(X_p)\le \frac{1}{\alpha _p(\mathcal {L})}\mathcal {E}_{p, \mathcal {L}}(X_p)\le \frac{1}{\beta }\mathcal {E}_{p, \mathcal {L}}(X_p). \end{aligned}$$

As a result, \(f'(p)\le 0\) for all \(p\ge q\). Since \(f(q)=0\) we conclude that \(f(p)\le 0\) for all \(p\ge q\). This gives the hypercontractivity part of the theorem.

For the reverse hypercontractivity part, assume that \(q< 1\) and \(\beta \le \alpha _p(\mathcal {L})\) for all \(p\in [0, 1]\). Then for \(p\le q\) we have

$$\begin{aligned} \text {Ent}_{p, \sigma }(X_p)\le \frac{1}{\alpha _p(\mathcal {L})}\mathcal {E}_{p, \mathcal {L}}(X_p)\le \frac{1}{\beta }\mathcal {E}_{p, \mathcal {L}}(X_p), \end{aligned}$$

where the second inequality holds since \(p<1\), so either p or its Hölder conjugate belongs to [0, 1] and hence \(\beta \le \alpha _p(\mathcal {L})\) by Proposition 10. Therefore, \(f'(p)\le 0\) for all \(p\le q< 1\), and since \(f(q)=0\), we conclude that \(f(p)\ge 0\) for all \(p<q\). \(\quad \square \)

3 Quantum Stroock–Varopoulos Inequality

In the previous section we developed the basic tools required to understand quantum hypercontractivity and reverse hypercontractivity inequalities and log-Sobolev inequalities. By Theorem 11 to obtain hypercontractivity and reverse hypercontractivity inequalities we need to find bounds on log-Sobolev constants in ranges \(p\in [1, 2]\) or \(p\in [0, 1]\). Now the question is how such bounds can be found.

In the classical (commutative) case, the most relevant p-log-Sobolev constants are \(\alpha _2(\mathcal {L})\) and \(\alpha _1(\mathcal {L})\). Indeed, \(p\mapsto \alpha _p(\mathcal {L})\) is a non-increasing function on \(p\in [0, 2]\), so in Theorem 11 the parameters \(\beta _1\) and \(\beta _2\) can be replaced with \(\alpha _1(\mathcal {L})\) and \(\alpha _2(\mathcal {L})\) respectively. This result is proven via comparison of the Dirichlet forms, an inequality that is sometimes called the Stroock–Varopoulos inequality.

In this section we prove a quantum generalization of the Stroock–Varopoulos inequality, and conclude in Theorem 11 that, for strongly reversible semigroups, we can take \(\beta _p=\alpha _p(\mathcal {L})\) for \(p=1, 2\). We should point out that a quantum Stroock–Varopoulos inequality in the special case of \(\sigma \) being the completely mixed state is proven in [14]. Also, a special case of the Stroock–Varopoulos inequality (called strong \(L_p\)-regularity) for certain Lindblad generators is proven in [26, 43]. A strong \(L_p\)-regularity property is also proven in [3], which we generalize to a quantum Stroock–Varopoulos inequality.

The assumption of \(\sigma \)-reversibility is not sufficient for our proof of the quantum Stroock–Varopoulos inequality. We in fact need \(\mathcal {L}\) to be self-adjoint with respect to an inner product different from \(\langle \cdot , \cdot \rangle _\sigma \) defined above (see Lemma 6). In the following we first define this new inner product and state some of its properties, and then turn to our quantum Stroock–Varopoulos inequality.

3.1 The GNS Inner Product

In what follows we use the GNS inner product \(\langle \cdot , \cdot \rangle _{1, \sigma }\) on \(\mathcal {B}(\mathcal {H})\) that is defined by [11]:

$$\begin{aligned} \langle X, Y\rangle _{1, \sigma } := \text {tr}(\sigma X^\dagger Y). \end{aligned}$$
(21)

We note that this inner product coincides with \(\langle X, Y\rangle _\sigma = \text {tr}(\sigma ^{1/2}X^\dagger \sigma ^{1/2}Y)\) when, e.g., X and \(\sigma \) commute. But in general \(\langle \cdot , \cdot \rangle _{1, \sigma }\) is different from \(\langle \cdot , \cdot \rangle _\sigma \).
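The difference between the two inner products, and their coincidence on operators commuting with \(\sigma \), can be seen in a small numerical sketch (random \(\sigma \); helper names are ours):

```python
import numpy as np

rng = np.random.default_rng(5)
d = 3

G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
sigma = G @ G.conj().T + 0.1 * np.eye(d); sigma /= np.trace(sigma).real
w, V = np.linalg.eigh(sigma)
sq = (V * np.sqrt(w)) @ V.conj().T                                     # sigma^{1/2}

gns = lambda X, Y: np.trace(sigma @ X.conj().T @ Y)                    # <X,Y>_{1,sigma}, Eq. (21)
kms = lambda X, Y: np.trace(X.conj().T @ sq @ Y @ sq)                  # <X,Y>_sigma

X = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
Y = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
print(gns(X, Y), kms(X, Y))                                            # different in general

Z = (V * rng.uniform(0.5, 2.0, d)) @ V.conj().T                        # an operator commuting with sigma
assert np.isclose(gns(Z, Y), kms(Z, Y))                                # the two coincide when X commutes with sigma
print("GNS and KMS inner products agree on sigma-commuting operators")
```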

The following lemma was first proven in [11]. We will give a proof here for the sake of completeness.

Lemma 12

Let \(\mathcal {L}\) be a Lindblad generator that is self-adjoint with respect to the inner product \(\langle \cdot , \cdot \rangle _{1, \sigma }\) defined above. Then the following hold.

  1. (i)

    \(\mathcal {L}\) commutes with the superoperator \(\Delta _{\sigma }:X \mapsto \sigma X\sigma ^{-1}.\)

  2. (ii)

    \(\mathcal {L}\) is self-adjoint with respect to the inner product \(\langle \cdot , \cdot \rangle _{\sigma }\).

Based on part (ii) of this lemma (see also Lemma 6) we say that a Lindblad generator \(\mathcal {L}\) is strongly \(\sigma \)-reversible if it is self-adjoint with respect to the inner product \(\langle \cdot , \cdot \rangle _{1, \sigma }\).

Proof

(i) Using the fact that \(\mathcal {L}(Y)^{\dagger } = \mathcal {L}(Y^{\dagger })\), for all X, Y we have

$$\begin{aligned} \langle X, \Delta _\sigma \circ \mathcal {L}(Y)\rangle _{1, \sigma }&= \text {tr}(\sigma X^{\dagger }\sigma \mathcal {L}(Y)\sigma ^{-1}) \\&= \text {tr}( X^{\dagger }\sigma \mathcal {L}(Y)) \\&= \langle \mathcal {L}(Y)^{\dagger }, X^{\dagger }\rangle _{1, \sigma }\\&= \langle \mathcal {L}(Y^\dagger ), X^{\dagger }\rangle _{1, \sigma }\\&= \langle Y^\dagger , \mathcal {L}(X^{\dagger })\rangle _{1, \sigma }\\&= \text {tr}(\sigma Y \mathcal {L}(X)^{\dagger })\\&= \text {tr}(\Delta _{\sigma }(Y) \sigma \mathcal {L}(X)^\dagger )\\&= \langle \mathcal {L}(X), \Delta _\sigma (Y)\rangle _{1, \sigma }\\&= \langle X, \mathcal {L}\circ \Delta _\sigma (Y)\rangle _{1, \sigma }. \end{aligned}$$

This gives \(\Delta _\sigma \circ \mathcal {L}= \mathcal {L}\circ \Delta _\sigma \).

(ii) Follows easily from (i) and the fact that

$$\begin{aligned} \langle X, Y\rangle _{\sigma } = \langle Y^{\dagger }, \Delta _{\sigma }^{1/2}(X^{\dagger })\rangle _{1, \sigma }. \end{aligned}$$

\(\square \)

The following lemma is indeed a consequence of Theorem 3.1 of [11]. Here we prefer to present a direct proof.

Lemma 13

Let \(\mathcal {L}\) be a strongly \(\sigma \)-reversible Lindblad generator. Then for every \(t\ge 0\) there are operators \(R_k\in \mathcal {B}(\mathcal {H})\) and \(\omega _k> 0\) such that \(\Delta _\sigma (R_k) = {\omega _k} R_k\),

$$\begin{aligned} \Phi _t(X) = \sum _k R_k XR_k^\dagger , \end{aligned}$$
(22)

and \(\sum _k R_k R_k^{\dagger }=I\).

Proof

By Lemma 12 the Lindblad generator \(\mathcal {L}\), and hence \(\Phi _t=\mathrm {e}^{-t\mathcal {L}}\), commutes with \(\Delta _\sigma \), i.e.,

$$\begin{aligned} \Phi _t\circ \Delta _\sigma = \Delta _\sigma \circ \Phi _t. \end{aligned}$$
(23)

Fix an orthonormal basis \(\{|i\rangle \}_{i=1}^d\) for the underlying Hilbert space \(\mathcal {H}=\mathcal {H}_A\) and define

$$\begin{aligned} |\Upsilon \rangle := \sum _{i=1}^d |i\rangle _A|i\rangle _B \in \mathcal {H}_{AB}, \end{aligned}$$

where \(\mathcal {H}_B\) is isomorphic to \(\mathcal {H}_A\). It is not hard to verify that for any matrix M we have

$$\begin{aligned} (M_A\otimes I_B)|\Upsilon \rangle = {\mathbb {I}}_A\otimes M_B^T|\Upsilon \rangle , \end{aligned}$$
(24)

where the transpose is with respect to the basis \(\{|i\rangle \}_{i=1}^d\).

The Choi–Jamiolkowski representation of \(\Phi _t\) is

$$\begin{aligned} J_{AB} : = (\Phi _t \otimes \mathcal {I}_B)(|\Upsilon \rangle \langle \Upsilon |). \end{aligned}$$

Then using (24) it is not hard to verify that (23) translates to

$$\begin{aligned} (\sigma _A^{-1}\otimes \sigma _B^T) J_{AB} = J_{AB}( \sigma _A^{-1}\otimes \sigma _B^T). \end{aligned}$$

That is, \(J_{AB}\) and \(\sigma _A^{-1}\otimes \sigma _B^{T}\) commute. On the other hand, \(J_{AB}\) is positive semidefinite since it is the Choi–Jamiolkowski representation of a completely positive map. Therefore, \(J_{AB}\) and \(\sigma _A^{-1}\otimes \sigma _B^{T}\) can be simultaneously diagonalized in an orthonormal basis, i.e., there exists an orthonormal basis \(\{|v_k\rangle \}_{k=1}^{d^2}\) of \(\mathcal {H}_{AB}\) such that

$$\begin{aligned} J_{AB}|v_k\rangle&= \lambda _k |v_k\rangle \end{aligned}$$
(25)
$$\begin{aligned} \sigma _A^{-1}\otimes \sigma _B^{T}|v_k\rangle&= \omega _k^{-1}|v_k\rangle , \end{aligned}$$
(26)

where \(\lambda _k\ge 0,\, \omega _k> 0\). Define the operator \(V_k\) by

$$\begin{aligned} (V_k\otimes I_B)|\Upsilon \rangle = |v_k\rangle . \end{aligned}$$

Then again using (24), equation (26) translates to

$$\begin{aligned} \sigma ^{-1}V_k \sigma = \omega _k^{-1} V_k. \end{aligned}$$

Moreover, equation (25) means that

$$\begin{aligned} (\Phi _t\otimes \mathcal {I}_B)(|\Upsilon \rangle \langle \Upsilon |)=J_{AB}=\sum _k \lambda _k |v_k\rangle \langle v_k| = \sum _k \lambda _k (V_k\otimes I_B) |\Upsilon \rangle \langle \Upsilon | (V_k^{\dagger }\otimes I_B), \end{aligned}$$

which gives

$$\begin{aligned} \Phi _t(X) =\sum _k \lambda _k V_kXV_k^{\dagger }. \end{aligned}$$

Then letting \(R_k:= \sqrt{\lambda _k} V_k\) we have \(\sigma R_k= \omega _kR_k \sigma \), i.e., \(\Delta _\sigma (R_k)=\omega _k R_k\), and (22) holds. The last equation, \(\sum _k R_kR_k^{\dagger }={\mathbb {I}}\), comes from \(\Phi _t({\mathbb {I}})={\mathbb {I}}\). \(\quad \square \)
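As an illustration of Lemma 13 (for a concrete example rather than the general Choi-matrix construction of the proof), for the generalized depolarizing semigroup (17) the operators \(R_k\) can be written down explicitly in the eigenbasis \(\sigma =\sum _j s_j|e_j\rangle \langle e_j|\): one may take \(R_0=\mathrm {e}^{-t/2}{\mathbb {I}}\) and \(R_{ij}=\sqrt{(1-\mathrm {e}^{-t})s_j}\,|e_i\rangle \langle e_j|\), which satisfy \(\Delta _\sigma (R_{ij})=(s_i/s_j)R_{ij}\). A minimal numerical check (helper names are ours):

```python
import numpy as np

rng = np.random.default_rng(8)
d, t = 3, 0.8

G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
sigma = G @ G.conj().T + 0.1 * np.eye(d); sigma /= np.trace(sigma).real
s, E = np.linalg.eigh(sigma)                       # sigma = sum_j s_j |e_j><e_j|
sigma_inv = (E * (1.0 / s)) @ E.conj().T
Id = np.eye(d)

# Explicit Kraus operators: R_0 = e^{-t/2} I and R_{ij} = sqrt((1-e^{-t}) s_j) |e_i><e_j|.
Rs, omegas = [np.exp(-t / 2) * Id], [1.0]
for i in range(d):
    for j in range(d):
        Rs.append(np.sqrt((1 - np.exp(-t)) * s[j]) * np.outer(E[:, i], E[:, j].conj()))
        omegas.append(s[i] / s[j])

Phi = lambda X: np.exp(-t) * X + (1 - np.exp(-t)) * np.trace(sigma @ X) * Id     # Eq. (17)
X = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))

assert np.allclose(sum(R @ X @ R.conj().T for R in Rs), Phi(X))                  # Eq. (22)
assert np.allclose(sum(R @ R.conj().T for R in Rs), Id)                          # sum_k R_k R_k^dagger = I
for R, wk in zip(Rs, omegas):
    assert np.allclose(sigma @ R @ sigma_inv, wk * R)                            # Delta_sigma(R_k) = omega_k R_k
print("Lemma 13 verified for the generalized depolarizing semigroup")
```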

3.2 Comparison of the Dirichlet Forms

We can now state the main result of this section.

Theorem 14

(Quantum Stroock–Varopoulos inequality). Let \(\mathcal {L}\) be a strongly \(\sigma \)-reversible Lindblad generator, which means that it is self-adjoint with respect to the inner product \(\langle \cdot , \cdot \rangle _{1, \sigma }\) defined in (21). Then for all \(X> 0\) we have

$$\begin{aligned} \mathcal {E}_{p, \mathcal {L}}\big (I_{p, 2}(X)\big )\ge \mathcal {E}_{q, \mathcal {L}}\big (I_{q, 2}(X)\big ), \qquad 0< p\le q\le 2. \end{aligned}$$

Remark 2

As mentioned above, special cases of Theorem 14 were already investigated in the literature. In the case that \(\sigma \) is the maximally mixed state, this was done in [14]. The inequality was also recently extended to the GNS-symmetric setting for the range of parameters \(p\ge 1\) and \(q=2\) in [3].

We have two proofs of this theorem. The first one, which we present here, is based on ideas in [26, 43]. The second one, which is deferred to Appendix B, is based on ideas in [3]. We present both proofs in this paper since they are different in nature and their ideas may be useful elsewhere.

First proof of Theorem 14

For any \(t\ge 0\) define the function \(h_t:{[}0,\infty )\rightarrow \mathbb {R}\) by

$$\begin{aligned} h_t(s):= \big \langle I_{2/(2-s), 2}(X) , \Phi _t \circ I_{2/s, 2}(X)\big \rangle _\sigma \end{aligned}$$

for \(s\in (0,\infty )\backslash \{2\}\), and \(h_t(0)=h_t(2)=\text {tr}(\Gamma _\sigma ^{1/2}(X)^2)\). Since by part (ii) of Lemma 12, \(\Phi _t=\mathrm {e}^{-t\mathcal {L}}\) is self-adjoint with respect to the inner product \(\langle \cdot , \cdot \rangle _\sigma \), we have \(h_t(2-s)=h_t(s)\) and \(h_t\) is symmetric about \(s=1\). Moreover, examining the definition of \(h_t(s)\) we find that \(s\mapsto h_t(s)\) is analytic with a convergent Taylor series at \(s=1\). Then, by the symmetry around \(s=1\), all the odd-order derivatives of \(h_t\) at \(s=1\) vanish, and we have

$$\begin{aligned} h_t(s) = h_t(1) + \sum _{j=1}^\infty \frac{c_j}{(2j)!} (s-1)^{2j}, \end{aligned}$$
(27)

where

$$\begin{aligned} c_j = \frac{\text {d}^{2j}}{\text {d}s^{2j}} h_t(s)\Big |_{s=1}\,. \end{aligned}$$
(28)

Note that the above series expansion is convergent by analyticity of \(s\mapsto h_t(s)\). We claim that all the even-order derivatives of \(h_t\) at \(s=1\) are non-negative, i.e., \(c_j\ge 0\). We use Lemma 13 to verify this. Let \(R_k\) be operators such that

$$\begin{aligned} \sigma R_k\sigma ^{-1} = \omega _kR_k, \end{aligned}$$
(29)

with \(\omega _k> 0\) and (22) holds. Then letting \(Y:= \Gamma _\sigma ^{1/2}(X)\) and using (29) we compute

$$\begin{aligned} h_t(s)&= \text {tr}\Big [ \Gamma _\sigma ^{\frac{s}{2}}(Y^{2-s})\cdot \Phi _t\big (\Gamma _\sigma ^{-\frac{s}{2}}(Y^s)\big ) \Big ] \\&= \sum _k \text {tr}\Big [ Y^{2-s} \sigma ^{\frac{s}{4}} R_k \sigma ^{-\frac{s}{4}} Y^s \sigma ^{-\frac{s}{4}} R_k^{\dagger } \sigma ^{\frac{s}{4}} \Big ]\\&=\sum _k \omega _k^{\frac{s}{2}}\text {tr}\Big [ Y^{2-s} R_k Y^s R_k^{\dagger } \Big ]. \end{aligned}$$

Now diagonalizing Y in its eigenbasis: \(Y=\sum _\ell \mu _\ell | \ell \rangle \langle \ell |\), we find that

$$\begin{aligned} h_t(s) = \sum _{k, \ell , \ell '} \mu _\ell ^2 \, \big | \langle \ell | R_k|\ell '\rangle \big |^2\Big (\frac{\sqrt{\omega _k}\,\mu _{\ell '}}{\mu _\ell }\Big )^s . \end{aligned}$$

Therefore, \(h_t(s)\) is a sum of exponential functions of \(s\) with non-negative coefficients. From this expression it is clear that the \(c_j\)’s defined in (28) are all non-negative.

For \(s\in (0,\infty )\backslash \{2\}\), let us define

$$\begin{aligned} g_t(s):= \frac{h_t(s)-h_t(0)}{(s-1)^2-1} = \sum _{j=1}^\infty \frac{c_j}{(2j)!}\left( \sum _{i=0}^{j-1} (s-1)^{2i} \right) , \end{aligned}$$

and extend the function \(g_t\) by continuity to \([0,\infty )\), since \(h_t\) is differentiable at 0 and at 2. From this expression it is clear that, for every \(t\ge 0\), \(g_t(s)\) is non-decreasing on \([1, +\infty )\). Therefore, \(\lim _{t\rightarrow 0^+} g_t(s)/t\), being a pointwise limit of non-decreasing functions, is also non-decreasing on \([1, +\infty )\). On the other hand, we have \(h_t(0) =\text {tr}(Y^2) = h_0(s)\). We can thus compute

$$\begin{aligned} \lim _{t\rightarrow 0^+} \frac{g_t(s)}{t}&=\frac{1}{(s-1)^2-1} \lim _{t\rightarrow 0^+} \frac{h_t(s)-h_t(0)}{t} \\&=\frac{1}{(s-1)^2-1} \lim _{t\rightarrow 0^+} \frac{h_t(s)-h_0(s)}{t} \\&=\frac{1}{(s-1)^2-1} \frac{\partial }{\partial t} h_t(s)\Big |_{t=0} \\&=-\frac{1}{(s-1)^2-1} \big \langle I_{2/(2-s), 2}(X) , \mathcal {L}\circ I_{2/s, 2}(X)\big \rangle _\sigma . \end{aligned}$$

Therefore

$$\begin{aligned} s\mapsto -\frac{1}{(s-1)^2-1} \big \langle I_{2/(2-s), 2}(X) , \mathcal {L}\circ I_{2/s, 2}(X)\big \rangle _\sigma , \end{aligned}$$

is non-decreasing on \([1, +\infty )\). Now the desired result follows once we identify 2/s with p (and \(2/(2-s)\) with \({\hat{p}}\), its Hölder conjugate). \(\quad \square \)

Here are some important consequences of the above theorem.

Corollary 15

Let \(\mathcal {L}\) be a strongly \(\sigma \)-reversible Lindblad generator. Then the following hold:

  1. (i)

    For all \(p\in {\mathbb {R}}\backslash \{0\}\) and \(X> 0\) we have

    $$\begin{aligned} \mathcal {E}_{p, \mathcal {L}}(X)\ge 0. \end{aligned}$$
  2. (ii)

    The associated QMS is p-contractive for all p.

Remark 3

As mentioned before, the fact that p-Dirichlet forms are positive for \(p\in (-\infty ,- 1]\cup [+1/2,\infty )\) is a simple consequence of contraction of non-commutative weighted \(L_p\)-norms (or equivalently of the data processing inequality for sandwiched p-Rényi divergences), which follows by invariance of the state \(\sigma \) and interpolation of these spaces (see [43]). The case \(p\in (-1,+1/2)\) is much more subtle, since it is known that the data processing inequality does not hold in general in this parameter range, as opposed to the classical case. More precisely, p-contractivity of \(\Phi _t\) implies that the sandwiched p-Rényi divergence is monotone under \(\Phi _t\) [5, 19, 37]. Therefore, when \(\Phi _t\) comes from a QMS satisfying the above strong reversibility condition, sandwiched p-Rényi divergences are monotone under \(\Phi _t\) not only for \(p\ge 1/2\) but for all values of p.

Proof

(i) By Theorem 14 (and part (i) of Proposition 8) for every \(p\ne 0\) we have

$$\begin{aligned} \mathcal {E}_{p, \mathcal {L}}(I_{p, 2}(X))\ge \mathcal {E}_{2, \mathcal {L}}(X). \end{aligned}$$

Indeed, for \(p\in (0,2]\) the inequality holds by Theorem 14, and for \(p\notin [0,2]\) one further uses Proposition 8(i) to conclude. On the other hand, since the semigroup is self-adjoint with respect to \(\langle \cdot ,\cdot \rangle _\sigma \), its generator has non-negative spectrum, so that \(\mathcal {E}_{2, \mathcal {L}}(X)\ge 0\). Therefore, \(\mathcal {E}_{p, \mathcal {L}}(I_{p, 2}(X))\ge 0\).

(ii) Define g(t) as in the proof of Proposition 9. By part (i) we have \(g'(t)\le 0\) for all \(t\ge 0\) and \(g(0)=0\). Therefore, \(g(t)\le 0\) for all \(t\ge 0\). This gives p-contractivity. \(\quad \square \)

The following corollary is an immediate consequence of the quantum Stroock–Varopoulos inequality as well as part (i) of Proposition 4.

Corollary 16

Let \(\mathcal {L}\) be a strongly \(\sigma \)-reversible Lindblad generator. Then \(p\mapsto \alpha _p(\mathcal {L})\) is non-increasing on [0, 2], where \(\alpha _0(\mathcal {L})\) is defined as the limit of \(\alpha _p(\mathcal {L})\) as \(p\rightarrow 0\).

Now we can state an improvement over Theorem 11.

Corollary 17

Let \(\mathcal {L}\) be a strongly \(\sigma \)-reversible Lindblad generator. Then the following hold:

  • (Hypercontractivity) For \(1\le q\le p\) and

    $$\begin{aligned} t\ge \frac{1}{4\alpha _2(\mathcal {L})}\log \frac{p-1}{q-1}, \end{aligned}$$
    (30)

    we have \(\Vert \Phi _t(X)\Vert _{p, \sigma }\le \Vert X\Vert _{q, \sigma }\) for all \(X\ge 0\).

  • (Reverse hypercontractivity) For \(p\le q<1\) and

    $$\begin{aligned} t\ge \frac{1}{4\alpha _1(\mathcal {L})}\log \frac{p-1}{q-1}, \end{aligned}$$
    (31)

    we have \(\Vert \Phi _t(X)\Vert _{p, \sigma }\ge \Vert X\Vert _{q, \sigma }\) for all \(X> 0\).

Remark 4

Inequality (30) was already known to be implied by the strong \(L_p\)-regularity condition defined in [43]. This condition, which is a special case of the Stroock–Varopoulos inequality, was recently established in [3].

Before ending this section, we state a result that will play an important role in Sect. 5.

Lemma 18

Let \(\{\Phi _t:\, t\ge 0\}\) be a primitive QMS that is strongly \(\sigma \)-reversible. Let \(X,Y>0\) and \(-\infty \le q, p< 1\). Then, for any \(t\ge 0\) such that \((1-p)(1-q)\ge \mathrm {e}^{-4\alpha _1(\mathcal {L}) t}\) we have

$$\begin{aligned} \langle X,\Phi _t(Y)\rangle _\sigma \ge \Vert X\Vert _{p,\sigma }\Vert Y\Vert _{q,\sigma }. \end{aligned}$$

Proof

The result follows by a direct application of Lemma 1 together with the reverse hypercontractivity inequality in Corollary 17. \(\quad \square \)
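As a numerical sanity check, the inequality of Lemma 18 can be tested for the qubit generalized depolarizing semigroup, whose 1-log-Sobolev constant is at least 1/4 by Theorem 19 in the next section, so that the hypothesis reduces to \((1-p)(1-q)\ge \mathrm {e}^{-t}\). The sketch below (ours, for illustration only) assumes the KMS conventions \(\Gamma _\sigma (X)=\sigma ^{1/2}X\sigma ^{1/2}\), \(\Vert X\Vert _{p,\sigma }=\big (\text {tr}\,|\sigma ^{1/(2p)}X\sigma ^{1/(2p)}|^p\big )^{1/p}\) and \(\langle A,B\rangle _\sigma =\text {tr}\big (\sigma ^{1/2}A^{\dagger }\sigma ^{1/2}B\big )\), which are not restated in this section; the helper names and parameter values are ours.

```python
import numpy as np
from scipy.linalg import fractional_matrix_power as mpow

np.random.seed(1)
d = 2
sigma = np.diag([0.25, 0.75])          # invariant state of the depolarizing semigroup

def weighted_norm(X, rho, p):
    # ||X||_{p,rho} = ( tr |rho^{1/2p} X rho^{1/2p}|^p )^{1/p}   (assumed KMS convention)
    Y = mpow(rho, 1.0 / (2 * p)) @ X @ mpow(rho, 1.0 / (2 * p))
    ev = np.linalg.eigvalsh((Y + Y.conj().T) / 2)
    return np.sum(np.abs(ev) ** p) ** (1.0 / p)

def kms_inner(A, B, rho):
    # <A,B>_rho = tr( rho^{1/2} A^dagger rho^{1/2} B )   (assumed KMS inner product)
    r = mpow(rho, 0.5)
    return np.trace(r @ A.conj().T @ r @ B).real

def phi_t(X, t):
    # generalized depolarizing semigroup with invariant state sigma
    return np.exp(-t) * X + (1 - np.exp(-t)) * np.trace(sigma @ X) * np.eye(d)

def random_pd():
    G = np.random.randn(d, d) + 1j * np.random.randn(d, d)
    return G @ G.conj().T + 0.1 * np.eye(d)

p, q = 0.3, 0.4                         # p, q < 1
t = -np.log((1 - p) * (1 - q)) + 0.05   # so that (1-p)(1-q) >= e^{-t}
for _ in range(100):
    X, Y = random_pd(), random_pd()
    lhs = kms_inner(X, phi_t(Y, t), sigma)
    rhs = weighted_norm(X, sigma, p) * weighted_norm(Y, sigma, q)
    assert lhs >= rhs - 1e-8
print("Lemma 18 check passed on 100 random instances")
```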

4 Tensorization

Our goal in this section is to prove hypercontractivity (or reverse hypercontractivity) inequalities of the form \(\Vert \Phi _t^{\otimes n}(X)\Vert _{p, \sigma ^{\otimes n}} \le \Vert X\Vert _{q, \sigma ^{\otimes n}}\) (or \(\Vert \Phi _t^{\otimes n}(X)\Vert _{p, \sigma ^{\otimes n}} \ge \Vert X\Vert _{q, \sigma ^{\otimes n}}\)) for certain ranges of \(t, p, q\) that are independent of n. Indeed, so far we have a theory of using log-Sobolev inequalities to prove such inequalities when \(n=1\), but in some applications, e.g., those we present later in this paper, we need such inequalities for arbitrary n. We need some notation to state the problem more precisely.

For a Lindblad generator \(\mathcal {L}\) we define

$$\begin{aligned} {\widehat{\mathcal {L}}}_i:= \mathcal {I}^{\otimes (i-1)}\otimes \mathcal {L}\otimes \mathcal {I}^{\otimes (n-i)}, \end{aligned}$$
(32)

as an operator acting on \(\mathcal {B}(\mathcal {H}^{\otimes n})\). We also let

$$\begin{aligned} \mathcal {K}_n:= \sum _{i=1}^n {\widehat{\mathcal {L}}}_i. \end{aligned}$$
(33)

Observe that if \(\mathcal {L}\) is (strongly) \(\sigma \)-reversible, then \(\mathcal {K}_n\) is (strongly) reversible with respect to \(\sigma ^{\otimes n}\). Moreover, the \({\widehat{\mathcal {L}}}_i\)’s commute with each other and

$$\begin{aligned} \mathrm {e}^{-t\mathcal {K}_n} = \Phi _t^{\otimes n}. \end{aligned}$$

That is, \(\mathcal {K}_n\) is a (strongly) \(\sigma ^{\otimes n}\)-reversible Lindblad generator which generates the quantum Markov semigroup \(\big \{\Phi _t^{\otimes n}:\, t\ge 0\big \}\). Now we can ask how the (reverse) hypercontractivity inequalities associated to \(\Phi _t\) are related to those for \(\Phi _t^{\otimes n}\). Equivalently, what is the relation between the log-Sobolev constants \(\alpha _p(\mathcal {L})\) and \(\alpha _p(\mathcal {K}_n)\)? In the commutative (classical) case the answer is easy: \(\alpha _p(\mathcal {K}_n)\) equals \(\alpha _p(\mathcal {L})\) for all n, and having a (reverse) hypercontractivity inequality for \(\Phi _t\) immediately gives one for \(\Phi _t^{\otimes n}\). This is because in the classical case operator norms are multiplicative, or because the entropy function satisfies a certain subadditivity property (see e.g., [36]). The aforementioned property that, in the classical case, \(\alpha _p(\mathcal {K}_n)\) is independent of n, is usually called the tensorization property.

The tensorization property of log-Sobolev constants of quantum Lindblad generators, unlike its classical counterpart, is highly non-trivial. Thus proving (reverse) hypercontractivity inequalities that are independent of n is a difficult problem in the non-commutative case. There have been several attempts in this direction. Montanaro and Osborne in [33] proved such hypercontractivity inequalities for the qubit depolarizing channel (see also [26]). King [28] generalized this result to all unital qubit QMS. Cubitt et al. developed the theory of quantum reverse hypercontractivity inequalities in the unital case in [14] and proved some tensorization-type results. Also, Cubitt et al. [49] developed some techniques for proving bounds on log-Sobolev constants \(\alpha _p(\mathcal {K}_n)\) that are independent of n. Beigi and King [6] took the path of developing the theory of log-Sobolev inequalities not for the usual \(q\rightarrow p\) norm, but for the completely bounded norm. The point is that completely bounded norms are automatically multiplicative [17], so there is no problem of tensorization for the associated log-Sobolev constants. However, the existence of a complete version of the LSI constant was disproved in [4].

In this section we prove two tensorization-type results, one for 1-log-Sobolev constants which will be used for reverse hypercontractivity inequalities, and the other for 2-log-Sobolev constants which will be useful for hypercontractivity inequalities.

Theorem 19

Let \(\sigma _1, \dots , \sigma _n\) be arbitrary positive definite density matrices. Let \(\mathcal {L}_i(X) = X-\text {tr}(\sigma _i X) {\mathbb {I}}\) be the simple generator associated to the state \(\sigma _i\). Let

$$\begin{aligned} {\widehat{\mathcal {L}}}_i:= \mathcal {I}^{\otimes (i-1)}\otimes \mathcal {L}_i\otimes \mathcal {I}^{\otimes (n-i)}, \end{aligned}$$

and define \(\mathcal {K}_n\) by (33). Then we have \(\alpha _1(\mathcal {K}_n) \ge \frac{1}{4}\), independently of n.

Remark 5

Observe that Theorem 19 does not show the tensorization of \(\alpha _1\) for the depolarizing semigroup, but only proves a positive lower bound independent of n. Hence, the tensorization of \(\alpha _1\) is still an open problem.

Letting the \(\sigma _i\)’s be equal in the above theorem, we obtain the promised tensorization-type result for the 1-log-Sobolev constant.Footnote 7

Proof

We need to show that for all \(X_{A^n}\in \mathcal {P}_+(\mathcal {H}_{A^n})\) we have

$$\begin{aligned} \frac{1}{4} \text {Ent}_{1, \sigma _{A^n}}(X_{A^n})\le \mathcal {E}_{1, \mathcal {K}_n}(X_{A^n}), \end{aligned}$$

where \(\sigma _{A_i} =\sigma _i\) and

$$\begin{aligned} \sigma _{A^n} = \sigma _1\otimes \cdots \otimes \sigma _n. \end{aligned}$$

Using parts (ii) of Proposition 4 and Proposition 8, without loss of generality we can assume that \(X_{A^n}= \Gamma _{\sigma _{A^n}}^{-1}(\rho _{A^n})\) where \(\rho _{A^n}\in \mathcal {D}_+(\mathcal {H}_{A^n})\) is a density matrix. Then, using parts (iv) of Proposition 4 and Proposition 8, we need to show that

$$\begin{aligned} D(\rho _{A^n}\Vert \sigma _{A^n})\le \sum _{i=1}^n \text {tr}\Big [ \Gamma _{\sigma _{A^n}}\circ {\widehat{\mathcal {L}}}_i\circ \Gamma _{\sigma _{A^n}}^{-1} (\rho _{A^n})\cdot \big ( \log \rho _{A^{n}} - \log (\sigma _{A^n}) \big )\Big ]. \end{aligned}$$
(34)

Observe that

$$\begin{aligned} \Gamma _{\sigma _{A^n}}\circ {\widehat{\mathcal {L}}}_i\circ \Gamma _{\sigma _{A^n}}^{-1}&= \mathcal {I}^{\otimes (i-1)}\otimes \big (\Gamma _{\sigma _i}\circ \mathcal {L}\circ \Gamma _{\sigma _i}^{-1}\big )\otimes \mathcal {I}^{\otimes (n-i)} = \mathcal {I}^{\otimes (i-1)}\otimes \mathcal {L}^*_i\otimes \mathcal {I}^{\otimes (n-i)}, \end{aligned}$$

with \(\mathcal {L}^*_i (Y) = Y- \text {tr}(Y)\sigma _i\). Therefore,

$$\begin{aligned} \Gamma _{\sigma _{A^n}}\circ {\widehat{\mathcal {L}}}_i\circ \Gamma _{\sigma _{A^n}}^{-1} (\rho _{A^n}) = \rho _{A^n} - \rho _{A^{\sim i}}\otimes \sigma _{A_i}, \end{aligned}$$

where \(A^{\sim i} = (A_1, \dots , A_{i-1}, A_{i+1}, \dots , A_n)\) and \(\rho _{A^{\sim i}} = \text {tr}_{A_i}(\rho _{A^n})\) is the partial trace of \(\rho _{A^n}\) with respect to the i-th subsystem. Therefore, (34) is equivalent to

$$\begin{aligned} D(\rho _{A^n}\Vert \sigma _{A^n})&\le \sum _{i=1}^n\text {tr}\Big [ \big (\rho _{A^n} - \rho _{A^{\sim i}}\otimes \sigma _{A_i} \big )\cdot \big ( \log \rho _{A^{ n}} - \log (\sigma _{A^n}) \big )\Big ]\\&= \sum _{i=1}^n\Big [ D(\rho _{A^n}\Vert \sigma _{A^n}) + D(\rho _{A^{\sim i}}\otimes \sigma _{A_i}\Vert \rho _{A^n}) - D(\rho _{A^{\sim i}}\otimes \sigma _{A_i}\Vert \sigma _{A^n})\Big ]. \end{aligned}$$

Now since \(D(\rho _{A^{\sim i}}\otimes \sigma _{A_i}\Vert \rho _{A^n})\ge 0\), it suffices to show that

$$\begin{aligned} D(\rho _{A^n}\Vert \sigma _{A^n})&\le \sum _{i=1}^n\Big [ D(\rho _{A^n}\Vert \sigma _{A^n}) - D(\rho _{A^{\sim i}}\otimes \sigma _{A_i}\Vert \sigma _{A^n})\Big ]. \end{aligned}$$
(35)

We note that \(D(\xi _B\Vert \tau _B) = -H(B)_\xi -\text {tr}(\xi \log \tau )\) where \(H(B)_\xi = -\text {tr}(\xi \log \xi )\) is the von Neumann entropy. Moreover, \(\log (\xi \otimes \tau ) = \log \xi \otimes I + I\otimes \log \tau \). Therefore, (35) is equivalent to

$$\begin{aligned} -H(A^n)_{\rho } - \sum _{i=1}^n \text {tr}(\rho _{A_i}\log \sigma _i)&\le \sum _{i=1}^n\Big [ -H(A^n)_\rho -\sum _{j=1}^n \text {tr}(\rho _{A_j}\log \sigma _j)\\&\quad + H(A^{\sim i})_\rho + \sum _{j\ne i} \text {tr}(\rho _{A_j}\log \sigma _j) \Big ]\\&=\sum _{i=1}^n\big [ - H(A^n)_\rho - \text {tr}(\rho _{A_i}\log \sigma _i)+ H(A^{\sim i})_\rho \big ]\\&=\sum _{i=1}^n\big [ - H(A_i| A^{\sim i})_\rho - \text {tr}(\rho _{A_i}\log \sigma _i) \big ]. \end{aligned}$$

This is equivalent to

$$\begin{aligned} H(A^n)_{\rho }&\ge \sum _{i=1}^n H(A_i| A^{\sim i})_\rho , \end{aligned}$$

which is an immediate consequence of the data processing inequality (i.e., \(H(B|C)_\xi \ge H(B|CD)_\xi \)) once we use the chain rule

$$\begin{aligned} H(A^n)_{\rho } = H(A_1)_{\rho }+\sum _{i=2}^n H(A_i| A_1, \dots , A_{i-1})_{\rho }. \end{aligned}$$

This concludes the proof. \(\quad \square \)
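The last step of the proof, \(H(A^n)_{\rho }\ge \sum _{i=1}^n H(A_i| A^{\sim i})_\rho \), is a purely entropic inequality that is also easy to test numerically; for \(n=3\) it is equivalent to \(H(A_1A_2)_\rho +H(A_1A_3)_\rho +H(A_2A_3)_\rho \ge 2H(A_1A_2A_3)_\rho \). The following sketch (ours, for illustration only; all helper names are hypothetical) samples random three-qubit states and verifies this form.

```python
import numpy as np

def random_state(dim, seed=None):
    rng = np.random.default_rng(seed)
    G = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = G @ G.conj().T
    return rho / np.trace(rho)

def entropy(rho):
    # von Neumann entropy in nats
    ev = np.linalg.eigvalsh(rho)
    ev = ev[ev > 1e-12]
    return float(-np.sum(ev * np.log(ev)))

def partial_trace(rho, dims, keep):
    # trace out all subsystems not listed in `keep`; dims = list of local dimensions
    n = len(dims)
    rho = rho.reshape(dims + dims)
    for i in sorted((i for i in range(n) if i not in keep), reverse=True):
        rho = np.trace(rho, axis1=i, axis2=i + rho.ndim // 2)
    d_keep = int(np.prod([dims[i] for i in keep]))
    return rho.reshape(d_keep, d_keep)

dims = [2, 2, 2]
for seed in range(50):
    rho = random_state(8, seed)
    H123 = entropy(rho)
    pairs = sum(entropy(partial_trace(rho, dims, keep))
                for keep in ([0, 1], [0, 2], [1, 2]))
    assert pairs >= 2 * H123 - 1e-9
print("H(A1A2)+H(A1A3)+H(A2A3) >= 2 H(A1A2A3) holds on all samples")
```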

Remark 6

A similar proof was recently and independently obtained in [9]. Moreover, the proof uses similar ideas to the proof of the tensorization property of the variant of \(\alpha _2\) for the completely bounded norm in [6].

We can now use Corollary 17 and the fact that the simple generator is strongly reversible to conclude the following.

Corollary 20

Let \(\sigma _1, \dots , \sigma _n\) be arbitrary positive definite density matrices. Let \(\mathcal {L}_i(X) = X-\text {tr}(\sigma _i X) {\mathbb {I}}\) be the simple generator associated to the generalized depolarizing channel \(\Phi _{t, i}(X)=\mathrm {e}^{-t} X + (1-\mathrm {e}^{-t}) \text {tr}(\sigma _i X) {\mathbb {I}}\). Define \(\sigma ^{(n)} = \sigma _1\otimes \cdots \otimes \sigma _n\) and \(\Phi _t^{(n)} = \Phi _{t, 1}\otimes \cdots \otimes \Phi _{t, n}\). Then for \(p\le q<1\) and \(t\ge \log \frac{p-1}{q-1}\) we have

$$\begin{aligned} \big \Vert \Phi _t^{(n)}(X)\big \Vert _{p, \sigma ^{(n)}}\ge \Vert X\Vert _{q, \sigma ^{(n)}}, \qquad \forall n\ge 1, \end{aligned}$$

where \(X\in \mathcal {P}_+(\mathcal {H}^{\otimes n})\) is arbitrary.
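Corollary 20 can also be illustrated numerically for \(n=2\) with two different qubit states \(\sigma _1, \sigma _2\). The sketch below (ours, for illustration only) again assumes the KMS-weighted norm \(\Vert X\Vert _{p,\sigma }=\big (\text {tr}\,|\sigma ^{1/(2p)}X\sigma ^{1/(2p)}|^p\big )^{1/p}\); the chosen values of p, q, t are arbitrary examples satisfying the hypotheses.

```python
import numpy as np
from scipy.linalg import fractional_matrix_power as mpow

np.random.seed(2)
sig1, sig2 = np.diag([0.2, 0.8]), np.diag([0.45, 0.55])
sig12 = np.kron(sig1, sig2)

def weighted_norm(X, rho, p):
    Y = mpow(rho, 1 / (2 * p)) @ X @ mpow(rho, 1 / (2 * p))
    ev = np.linalg.eigvalsh((Y + Y.conj().T) / 2)
    return np.sum(np.abs(ev) ** p) ** (1 / p)

def depol(X, sigma, t):
    # Phi_{t,i}(X) = e^{-t} X + (1 - e^{-t}) tr(sigma_i X) I
    return np.exp(-t) * X + (1 - np.exp(-t)) * np.trace(sigma @ X) * np.eye(2)

def depol2(X, t):
    # (Phi_{t,1} ⊗ Phi_{t,2})(X) applied blockwise to a 4x4 operator
    X4 = X.reshape(2, 2, 2, 2)                      # indices (a, b, a', b')
    Y = np.zeros_like(X4, dtype=complex)
    for a in range(2):
        for ap in range(2):
            Y[a, :, ap, :] = depol(X4[a, :, ap, :], sig2, t)   # second factor
    Z = np.zeros_like(X4, dtype=complex)
    for b in range(2):
        for bp in range(2):
            Z[:, b, :, bp] = depol(Y[:, b, :, bp], sig1, t)    # first factor
    return Z.reshape(4, 4)

p, q = 0.2, 0.5                                     # p <= q < 1
t = np.log((p - 1) / (q - 1)) + 0.05                # t >= log((p-1)/(q-1))
for _ in range(100):
    G = np.random.randn(4, 4) + 1j * np.random.randn(4, 4)
    X = G @ G.conj().T + 0.1 * np.eye(4)            # X > 0
    assert weighted_norm(depol2(X, t), sig12, p) >= weighted_norm(X, sig12, q) - 1e-7
print("Corollary 20 reverse hypercontractivity check passed for n = 2")
```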

We now state the second tensorization result which is about the 2-log-Sobolev constant.

Theorem 21

Let \(\dim \mathcal {H}=2\) and \(\mathcal {L}(X) = X-\text {tr}(\sigma X) {\mathbb {I}}\) for some positive definite density matrix \(\sigma \in \mathcal {D}_+(\mathcal {H})\). Then we have

$$\begin{aligned} \alpha _2(\mathcal {K}_n)= \alpha _2(\mathcal {L}), \qquad \forall n, \end{aligned}$$

where \(\mathcal {K}_n\) is defined in (33).

Our main tool to prove this theorem is the following entropic inequality that is of independent interest and can be useful elsewhere.

Lemma 22

Let \(\mathcal {H}\) and \(\mathcal {H}'\) be Hilbert spaces with \(\dim \mathcal {H}=2\). Let \(X\in \mathcal {P}(\mathcal {H}\otimes \mathcal {H}')\) be a positive semidefinite matrix with the block form

$$\begin{aligned} X=\begin{pmatrix} A &{}\quad C\\ C^{\dagger } &{}\quad B \end{pmatrix}, \end{aligned}$$
(36)

where \(A, B, C\in \mathcal {B}(\mathcal {H}')\). For a density matrix \(\rho \in \mathcal {D}_+(\mathcal {H}')\), the matrix M defined as

$$\begin{aligned} M= \begin{pmatrix} \Vert A\Vert _{2, \rho } &{}\quad \Vert C\Vert _{2, \rho }\\ \Vert C^{\dagger }\Vert _{2, \rho } &{}\quad \Vert B\Vert _{2, \rho } \end{pmatrix} \end{aligned}$$
(37)

is positive semidefinite. Moreover, let \(\sigma \in \mathcal {D}_+(\mathcal {H})\) be a density matrix of the form

$$\begin{aligned} \sigma = \begin{pmatrix} \theta &{}\quad 0\\ 0 &{}\quad 1-\theta \end{pmatrix}, \end{aligned}$$
(38)

where \(\theta \in (0,1)\). Then we have

$$\begin{aligned} \text {Ent}_{2, \sigma \otimes \rho }(X)\le&~ \text {Ent}_{2, \sigma }(M) + \theta \text {Ent}_{2, \rho }(A) +(1-\theta )\text {Ent}_{2, \rho }(B)\nonumber \\&~+ \sqrt{\theta (1-\theta )}\,\text {Ent}_{2, \rho }(I_{2, 2}(C)) + \sqrt{\theta (1-\theta )}\,\text {Ent}_{2, \rho }(I_{2, 2}(C^\dagger )), \end{aligned}$$
(39)

where the map \(I_{2,2}\) is defined with respect to the state \(\rho \).

Proof

For any \(p\ge 2\) define

$$\begin{aligned} M_p := \begin{pmatrix} \Vert A\Vert _{p, \rho } &{}\quad \Vert C\Vert _{p, \rho }\\ \Vert C^{\dagger }\Vert _{p, \rho } &{}\quad \Vert B\Vert _{p, \rho } \end{pmatrix}, \end{aligned}$$

so that \(M_2=M\). Since \(X\ge 0\), both A and B are positive semidefinite. Moreover, we have

$$\begin{aligned} \Gamma _{{\mathbb {I}}\otimes \rho }^{\frac{1}{p}}(X)= \begin{pmatrix} \Gamma _\rho ^{\frac{1}{p}}(A) &{}\quad \Gamma _\rho ^{\frac{1}{p}}(C)\\ \Gamma _\rho ^{\frac{1}{p}}(C^{\dagger }) &{}\quad \Gamma _\rho ^{\frac{1}{p}}(B) \end{pmatrix}\ge 0. \end{aligned}$$

As a result, according to Theorem IX.5.9 of [7] there exists a contraction \(R\in \mathcal {B}(\mathcal {H}')\) such that \(\Gamma _\rho ^{\frac{1}{p}}(C) = \big (\Gamma _\rho ^{\frac{1}{p}}(A)\big )^{\frac{1}{2}} R \big (\Gamma _\rho ^{\frac{1}{p}}(B)\big )^{\frac{1}{2}}\). Therefore, by Hölder’s inequality we have

$$\begin{aligned} \big \Vert \Gamma _\rho ^{\frac{1}{p}}(C)\big \Vert _{p}&=\big \Vert \big (\Gamma _\rho ^{\frac{1}{p}}(A)\big )^{\frac{1}{2}} R \big (\Gamma _\rho ^{\frac{1}{p}}(B)\big )^{\frac{1}{2}}\big \Vert _{p} \\&\le \big \Vert \big (\Gamma _\rho ^{\frac{1}{p}}(A)\big )^{\frac{1}{2}}\big \Vert _{ 2p} \cdot \Vert R\Vert _{\infty }\cdot \big \Vert \big (\Gamma _\rho ^{\frac{1}{p}}(B)\big )^{\frac{1}{2}}\big \Vert _{ 2p}\\&\le \big \Vert \big (\Gamma _\rho ^{\frac{1}{p}}(A)\big )^{\frac{1}{2}}\big \Vert _{ 2p} \cdot \big \Vert \big (\Gamma _\rho ^{\frac{1}{p}}(B)\big )^{\frac{1}{2}}\big \Vert _{ 2p}\\&= \big \Vert \Gamma _\rho ^{\frac{1}{p}}(A)\big \Vert ^{\frac{1}{2}}_{ p} \cdot \big \Vert \Gamma _\rho ^{\frac{1}{p}}(B)\big \Vert ^{\frac{1}{2}}_{ p}. \end{aligned}$$

Then using \(\Vert Y\Vert _{p, \rho } = \Vert \Gamma _\rho ^{1/p}(Y)\Vert _{p}\), we find that

$$\begin{aligned} \Vert C\Vert _{p, \rho } \le \Vert A\Vert _{p, \rho }^{\frac{1}{2}}\cdot \Vert B\Vert _{p, \rho }^{\frac{1}{2}}, \end{aligned}$$

and hence \(M_p\ge 0\). In particular, \(M_2=M\ge 0\) and \(\text {Ent}_{2, \rho }(M)\) is well-defined.
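The inequality \(\Vert C\Vert _{p, \rho } \le \Vert A\Vert _{p, \rho }^{\frac{1}{2}}\cdot \Vert B\Vert _{p, \rho }^{\frac{1}{2}}\) derived above, and hence the positivity of \(M_p\), can also be tested numerically. The following sketch (ours, for illustration only; it assumes the weighted norm \(\Vert Y\Vert _{p,\rho }=\big (\text {tr}\,|\rho ^{1/(2p)}Y\rho ^{1/(2p)}|^p\big )^{1/p}\)) samples random positive semidefinite block matrices of the form (36) and checks the determinant condition for \(M_p\).

```python
import numpy as np
from scipy.linalg import fractional_matrix_power as mpow

np.random.seed(3)
dp = 3          # dimension of H'
p = 3.0         # the lemma needs p >= 2

def weighted_norm(Y, rho, p):
    # ||Y||_{p,rho} = ( tr |rho^{1/2p} Y rho^{1/2p}|^p )^{1/p}, via singular values
    Z = mpow(rho, 1 / (2 * p)) @ Y @ mpow(rho, 1 / (2 * p))
    sv = np.linalg.svd(Z, compute_uv=False)
    return np.sum(sv ** p) ** (1 / p)

for _ in range(100):
    # random X >= 0 on C^2 ⊗ C^{dp}, written in the block form (36)
    G = np.random.randn(2 * dp, 2 * dp) + 1j * np.random.randn(2 * dp, 2 * dp)
    X = G @ G.conj().T
    A, C, B = X[:dp, :dp], X[:dp, dp:], X[dp:, dp:]
    # random full-rank state rho on H'
    H = np.random.randn(dp, dp) + 1j * np.random.randn(dp, dp)
    rho = H @ H.conj().T + 0.1 * np.eye(dp)
    rho = rho / np.trace(rho)
    wa, wb, wc = (weighted_norm(A, rho, p), weighted_norm(B, rho, p),
                  weighted_norm(C, rho, p))
    assert wc <= np.sqrt(wa * wb) + 1e-9        # det(M_p) >= 0
print("M_p is positive semidefinite on all samples")
```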

Define \(\psi (p):= \Vert M_p\Vert _{p, \sigma } - \Vert X\Vert _{p, \sigma \otimes \rho }\). It is shown by King [27] that \(\psi (p)\ge 0\) for all \(p\ge 2\). Indeed, this inequality is proven in [27] in the special case where \(\sigma \) and \(\rho \) are the identity operators on the relevant spaces. Nevertheless, we have

$$\begin{aligned} \Vert X\Vert _{p, \sigma \otimes \rho } = \left\| \begin{pmatrix} \theta ^{\frac{1}{p}}\Gamma _{\rho }^{\frac{1}{p}}(A) &{}\quad \big (\theta (1-\theta )\big )^{\frac{1}{2p}}\Gamma _{\rho }^{\frac{1}{p}}(C)\\ \big (\theta (1-\theta )\big )^{\frac{1}{2p}}\Gamma _{\rho }^{\frac{1}{p}}(C^{\dagger }) &{}\quad (1-\theta )^{\frac{1}{p}}\Gamma _{\rho }^{\frac{1}{p}}(B) \end{pmatrix}\right\| _{p}, \end{aligned}$$

and

$$\begin{aligned} \Vert M_p\Vert _{p, \sigma } = \left\| \begin{pmatrix} \theta ^{\frac{1}{p}}\big \Vert \Gamma _{\rho }^{\frac{1}{p}}(A)\big \Vert _{p} &{}\quad \big (\theta (1-\theta )\big )^{\frac{1}{2p}}\big \Vert \Gamma _{\rho }^{\frac{1}{p}}(C)\big \Vert _{p}\\ \big (\theta (1-\theta )\big )^{\frac{1}{2p}}\big \Vert \Gamma _{\rho }^{\frac{1}{p}}(C^{\dagger })\big \Vert _{p} &{}\quad (1-\theta )^{\frac{1}{p}}\big \Vert \Gamma _{\rho }^{\frac{1}{p}}(B)\big \Vert _{p} \end{pmatrix}\right\| _{p}. \end{aligned}$$

Thus, King’s result holds for arbitrary \(\rho \) and diagonal \(\sigma \) as well, and we have \(\psi (p)\ge 0\) for all \(p\ge 2\). On the other hand, a straightforward computation verifies that \(\psi (2)=0\). This means that \(\psi '(2)\ge 0\), i.e.,

$$\begin{aligned} \frac{\text {d}}{\text {d}p} \big (\Vert M_p\Vert _{p, \sigma } - \Vert X\Vert _{p, \sigma \otimes \rho }\big ) \bigg |_{p=2}\ge 0. \end{aligned}$$

The derivatives can be computed using Proposition 3. We have

$$\begin{aligned} \frac{\text {d}}{\text {d}p}\Vert X\Vert _{p,\sigma \otimes \rho } \bigg |_{p=2}= \frac{1}{4}\Vert X\Vert _{2, \sigma \otimes \rho }^{-1}\cdot \text {Ent}_{2, \sigma \otimes \rho }(X), \end{aligned}$$
(40)

and

$$\begin{aligned} \frac{\text {d}}{\text {d}p}\Vert M_p\Vert _{p,\sigma } \bigg |_{p=2}= \frac{1}{4}\Vert M\Vert _{2, \sigma }^{-1}\cdot \Big ( \text {Ent}_{2, \sigma }(M) + 4\text {tr}\big [ \Gamma _\sigma ^{\frac{1}{2}}(M'_2) \cdot \Gamma _\sigma ^{\frac{1}{2}} (M) \big ] \Big ), \end{aligned}$$

where

$$\begin{aligned} M'_2=\frac{\text {d}}{\text {d}p}M_p\bigg |_{p=2}=\frac{1}{4}\begin{pmatrix} \Vert A\Vert _{2, \rho }^{-1}\cdot \text {Ent}_{2, \rho }(A) &{}\quad w\\ w &{} \quad \Vert B\Vert _{2, \rho }^{-1}\cdot \text {Ent}_{2, \rho }(B) \end{pmatrix}, \end{aligned}$$

and \(w =\Vert C\Vert _{2, \rho }^{-1}\cdot \left( \frac{1}{2}\text {Ent}_{2, \rho }\big (I_{2, 2}(C)\big ) +\frac{1}{2}\text {Ent}_{2, \rho }\big (I_{2, 2}(C^{\dagger })\big )\right) \). We conclude that

$$\begin{aligned} \frac{\text {d}}{\text {d}p}\Vert M_p\Vert _{p,\sigma } \bigg |_{p=2}&= \frac{1}{4}\Vert M\Vert _{2, \sigma }^{-1}\cdot \Big ( \text {Ent}_{2, \sigma }(M) + \theta \text {Ent}_{2, \rho }(A) +(1-\theta ) \text {Ent}_{2, \rho }(B) \\&\quad +\,\sqrt{\theta (1-\theta )}\text {Ent}_{2, \rho }\big (I_{2, 2}(C)\big )+\sqrt{\theta (1-\theta )}\text {Ent}_{2, \rho }\big (I_{2, 2}(C^\dagger )\big ) \Big ). \end{aligned}$$

Comparing with (40) and using \(\Vert M\Vert _{2, \sigma }=\Vert X\Vert _{2, \sigma \otimes \rho }\), the desired inequality follows. \(\quad \square \)

We need yet another lemma to prove Theorem 21.

Lemma 23

For any Lindblad generator \(\mathcal {K}\) that is \(\rho \)-reversible for some positive definite density matrix \(\rho \) we have

$$\begin{aligned} \mathcal {E}_{2, \mathcal {K}}\big (I_{2, 2}(C)\big ) +\mathcal {E}_{2, \mathcal {K}}\big (I_{2, 2}(C^\dagger )\big ) \le \langle C, \mathcal {K}(C)\rangle _\rho +\langle C^\dagger , \mathcal {K}(C^\dagger )\rangle _\rho \end{aligned}$$

for any C.

Proof

Define \(D:=\Gamma _{\rho }^{\frac{1}{2}}(C)\). Then for \(j\in \{0,1\}\)

$$\begin{aligned} Y_{j}:= \begin{pmatrix} |D| &{}\quad (-1)^j D^\dagger \\ (-1)^j D &{}\quad |D^\dagger | \end{pmatrix}\ge 0, \end{aligned}$$

is positive semidefinite [7]. Since \(\Gamma _{\rho }^{-1/2}\) is completely positive we have

$$\begin{aligned} Z_{j}:=\mathcal {I}\otimes \Gamma _{\rho }^{-1/2}(Y_j) = \begin{pmatrix} I_{2, 2}(C) &{}\quad (-1)^j C^\dagger \\ (-1)^j C &{}\quad I_{2, 2}(C^\dagger ) \end{pmatrix}\ge 0. \end{aligned}$$

On the other hand, \(\Psi _t= \mathrm {e}^{-t\mathcal {K}}\) is completely positive. Therefore,

$$\begin{aligned} \mathcal {I}\otimes \Psi _t (Z_0) = \begin{pmatrix} \Psi _t(I_{2, 2}(C)) &{}\quad \Psi _t(C^\dagger )\\ \Psi _t(C) &{}\quad \Psi _t(I_{2, 2}(C^\dagger )) \end{pmatrix}\ge 0. \end{aligned}$$

Putting these together we find that

$$\begin{aligned} g(t):=\langle Z_1, \mathcal {I}\otimes \Psi _t (Z_0)\rangle _{{\mathbb {I}}\otimes \rho }\ge 0, \qquad \forall t\ge 0. \end{aligned}$$

We note that

$$\begin{aligned} g(t)&= \big \langle I_{2, 2}(C), \Psi _t(I_{2, 2}(C))\big \rangle _{\rho } + \big \langle I_{2, 2}(C^\dagger ), \Psi _t(I_{2, 2}(C^\dagger ))\big \rangle _{\rho }\\&\quad - \big \langle C, \Psi _t(C)\big \rangle _{\rho }- \big \langle C^\dagger , \Psi _t(C^\dagger )\big \rangle _{\rho }. \end{aligned}$$

From this expression it is clear that

$$\begin{aligned} g(0)= \Vert I_{2, 2}(C)\Vert _{2, \rho }^2 + \Vert I_{2, 2}(C^\dagger )\Vert _{2, \rho }^2 - \Vert C\Vert _{2, \rho }^2 -\Vert C^\dagger \Vert _{2, \rho }^2 =0. \end{aligned}$$

Therefore, since \(g(t)\ge 0=g(0)\) for all \(t\ge 0\), we must have \(g'(0)\ge 0\), which is equivalent to the desired inequality. \(\quad \square \)

Now we have all the required tools to prove Theorem 21. Indeed, we prove a stronger statement from which Theorem 21 follows by a simple induction.

Theorem 24

Let \(\dim \mathcal {H}=2\) and \(\mathcal {L}(X) = X-\text {tr}(\sigma X) {\mathbb {I}}\) for some positive definite density matrix \(\sigma \in \mathcal {D}_+(\mathcal {H})\). Also let \(\mathcal {K}\) be a Lindblad generator associated to a primitive QMS that is reversible with respect to some positive definite state \(\rho \in \mathcal {D}_+(\mathcal {H}')\). Then we have

$$\begin{aligned} \alpha _2(\mathcal {L}\otimes \mathcal {I}' + \mathcal {I}\otimes \mathcal {K})= \min \{\alpha _2(\mathcal {L}), \, \alpha _2(\mathcal {K})\}, \end{aligned}$$

where \(\mathcal {I}\) and \(\mathcal {I}'\) denote the identity superoperators acting on \(\mathcal {B}(\mathcal {H})\) and \(\mathcal {B}(\mathcal {H}')\) respectively.

Proof

Let \(\alpha =\min \{\alpha _2(\mathcal {L}), \, \alpha _2(\mathcal {K})\}\). By restricting X in the 2-log-Sobolev inequality to be of the tensor product form and using

$$\begin{aligned} \text {Ent}_{2, \sigma \otimes \rho }(Y\otimes Y') = \text {Ent}_{2, \sigma }(Y)+\text {Ent}_{2, \rho }(Y'), \end{aligned}$$

we conclude that \(\alpha _2(\mathcal {L}\otimes \mathcal {I}'+ \mathcal {I}\otimes \mathcal {K})\le \alpha \). To prove the inequality in the other direction we need to show that for any \(X\in \mathcal {P}(\mathcal {H}\otimes \mathcal {H}')\) we have

$$\begin{aligned} \alpha \, \text {Ent}_{2, \sigma \otimes \rho } (X) \le \mathcal {E}_{2, \mathcal {L}\otimes \mathcal {I}' + \mathcal {I}\otimes \mathcal {K}} (X). \end{aligned}$$
(41)

Assume, without loss of generality, that \(\sigma \) is diagonal of the form (38), and that \(X\in \mathcal {P}(\mathcal {H}\otimes \mathcal {H}')\) has the block form (36). Define M by (37). Then by Lemma 22 we have

$$\begin{aligned} \text {Ent}_{2, \sigma \otimes \rho }(X)&\le \text {Ent}_{2, \sigma }(M) + \theta \text {Ent}_{2, \rho }(A) +(1-\theta )\text {Ent}_{2, \rho }(B)\\&\quad + \sqrt{\theta (1-\theta )}\,\text {Ent}_{2, \rho }(I_{2, 2}(C)) + \sqrt{\theta (1-\theta )}\,\text {Ent}_{2, \rho }(I_{2, 2}(C^\dagger )). \end{aligned}$$

On the other hand by the definition of \(\alpha \) we have

$$\begin{aligned} \alpha \,\text {Ent}_{2, \sigma }(M) \le \mathcal {E}_{2, \mathcal {L}}(M), \end{aligned}$$

and

$$\begin{aligned} \alpha \, \text {Ent}_{2, \rho }(Y) \le \mathcal {E}_{2, \mathcal {K}}(Y), \end{aligned}$$

for all \(Y\in \big \{ A, B, I_{2, 2}(C), I_{2, 2}(C^\dagger ) \big \}\). Therefore, we have

$$\begin{aligned} \alpha \,\text {Ent}_{2, \sigma \otimes \rho }(X)&\le \mathcal {E}_{2, \mathcal {L}}(M) + \theta \mathcal {E}_{2, \mathcal {K}}(A) +(1-\theta )\mathcal {E}_{2, \mathcal {K}}(B)\nonumber \\&\quad + \sqrt{\theta (1-\theta )}\,\mathcal {E}_{2, \mathcal {K}}(I_{2, 2}(C)) + \sqrt{\theta (1-\theta )}\,\mathcal {E}_{2, \mathcal {K}}(I_{2, 2}(C^\dagger ))\nonumber \\&\le \mathcal {E}_{2, \mathcal {L}}(M) + \theta \mathcal {E}_{2, \mathcal {K}}(A) +(1-\theta )\mathcal {E}_{2, \mathcal {K}}(B)\nonumber \\&\quad + \sqrt{\theta (1-\theta )}\,\langle C, \mathcal {K}(C)\rangle + \sqrt{\theta (1-\theta )}\,\langle C^\dagger , \mathcal {K}(C^\dagger )\rangle , \end{aligned}$$
(42)

where in the second inequality we use Lemma 23. We now have

$$\begin{aligned} \mathcal {E}_{2, \,\mathcal {L}\otimes \mathcal {I}' + \mathcal {I}\otimes \mathcal {K}} (X)&= \langle X, (\mathcal {L}\otimes \mathcal {I}' + \mathcal {I}\otimes \mathcal {K})(X)\rangle _{\sigma \otimes \rho }\\&= \langle X, \mathcal {L}\otimes \mathcal {I}'(X)\rangle _{\sigma \otimes \rho } + \Bigg \langle \begin{pmatrix} A &{} \quad C\\ C^\dagger &{}\quad B \end{pmatrix}, \begin{pmatrix} \mathcal {K}(A) &{} \quad \mathcal {K}(C)\\ \mathcal {K}(C^\dagger ) &{}\quad \mathcal {K}(B) \end{pmatrix}\Bigg \rangle _{\sigma \otimes \rho }. \end{aligned}$$

We compute each term in the above sum separately.

$$\begin{aligned} \big \langle X, \,\mathcal {L}\,\otimes \,&\,\mathcal {I}'(X)\big \rangle _{\sigma \otimes \rho } \\&= \Bigg \langle \begin{pmatrix} A &{}\quad C\\ C^\dagger &{}\quad B \end{pmatrix}, \begin{pmatrix} (1-\theta )(A-B) &{}\quad C\\ C^\dagger &{}\quad \theta (B-A) \end{pmatrix}\Bigg \rangle _{\sigma \otimes \rho } \\&= \theta (1-\theta )\langle A, A-B\rangle _\rho +\theta (1-\theta )\langle B, B-A\rangle _\rho \\&\quad + 2\sqrt{\theta (1-\theta )}\langle C, C\rangle _\rho \\&= \theta (1-\theta )\Vert A\Vert _{2,\rho }^2 +\theta (1-\theta )\Vert B\Vert _{2,\rho }^2 -2\theta (1-\theta )\langle A, B\rangle _\rho \\&\quad + 2\sqrt{\theta (1-\theta )}\Vert C\Vert _{2, \rho }^2\\&\ge \theta (1-\theta )\Vert A\Vert _{2,\rho }^2 +\theta (1-\theta )\Vert B\Vert _{2,\rho }^2 -2\theta (1-\theta )\Vert A\Vert _{2, \rho }\cdot \Vert B\Vert _{2, \rho }\\&\quad + 2\sqrt{\theta (1-\theta )}\Vert C\Vert _{2, \rho }^2\\&= \langle M, \mathcal {L}(M)\rangle _\sigma \\&= \mathcal {E}_{2, \mathcal {L}}(M). \end{aligned}$$

For the second term we compute

$$\begin{aligned}&\Bigg \langle \begin{pmatrix} A &{} \quad C\\ C^\dagger &{}\quad B \end{pmatrix}, \quad \begin{pmatrix} \mathcal {K}(A) &{} \quad \mathcal {K}(C)\\ \mathcal {K}(C^\dagger ) &{}\quad \mathcal {K}(B) \end{pmatrix}\Bigg \rangle _{\sigma \otimes \rho } \\&\quad = \theta \langle A, \,\mathcal {K}(A)\rangle _\rho + (1-\theta ) \langle B, \mathcal {K}(B)\rangle _\rho \\&\quad \quad + \sqrt{\theta (1-\theta )} \langle C, \mathcal {K}(C)\rangle _\rho + \sqrt{\theta (1-\theta )} \langle C^\dagger , \mathcal {K}(C^\dagger )\rangle _\rho \\&\quad = \theta \mathcal {E}_{2, \mathcal {K}}(A) + (1-\theta ) \mathcal {E}_{2, \mathcal {K}}(B) \\&\quad \quad + \sqrt{\theta (1-\theta )} \langle C, \mathcal {K}(C)\rangle + \sqrt{\theta (1-\theta )} \langle C^\dagger , \mathcal {K}(C^\dagger )\rangle . \end{aligned}$$

Therefore, we have

$$\begin{aligned} \mathcal {E}_{2, \mathcal {L}\otimes \mathcal {I}' + \mathcal {I}\otimes \mathcal {K}}(X)&\ge \mathcal {E}_{2, \mathcal {L}}(M) +\theta \mathcal {E}_{2, \mathcal {K}}(A) + (1-\theta ) \mathcal {E}_{2, \mathcal {K}}(B)\\&\quad + \sqrt{\theta (1-\theta )} \langle C, \mathcal {K}(C)\rangle + \sqrt{\theta (1-\theta )} \langle C^\dagger , \mathcal {K}(C^\dagger )\rangle . \end{aligned}$$

Comparing this to (42) we arrive at the desired inequality (41). \(\quad \square \)

We now give the exact expression for the 2-log-Sobolev constant of the simple Lindblad generator (in any dimension). We recall that the corresponding expression for the 1-log-Sobolev constant was found in [38] (see also [26] for the case \(\sigma ={\mathbb {I}}/d\)). The proof in our general setting is similar to that of [38]; we nevertheless provide it in Appendix C for the sake of completeness.

Theorem 25

Let \(\sigma \in \mathcal {D}_+(\mathcal {H})\) be arbitrary and let \(\mathcal {L}(X) = X-\text {tr}(\sigma X) {\mathbb {I}}\) be the simple Lindblad generator. Then we have

$$\begin{aligned} \alpha _2(\mathcal {L})= \frac{1-2s_{\min }(\sigma )}{\log \big (1/s_{\min }(\sigma )-1\big )}, \end{aligned}$$
(43)

where \(s_{\min }(\sigma )\) is the minimum eigenvalue of \(\sigma \).
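For reference, (43) is a simple explicit function of \(s_{\min }(\sigma )\), and combined with Corollary 17 it yields a concrete hypercontractivity time for the simple (generalized depolarizing) semigroup. A short sketch (ours, for illustration only; the numerical values are arbitrary examples):

```python
import numpy as np

def alpha_2(s_min):
    # Theorem 25: alpha_2(L) = (1 - 2 s_min) / log(1/s_min - 1), s_min = min eigenvalue of sigma;
    # the expression tends to 1/2 as s_min -> 1/2 (maximally mixed qubit sigma)
    return (1 - 2 * s_min) / np.log(1 / s_min - 1)

def hc_time(p, q, alpha):
    # Corollary 17: ||Phi_t(X)||_{p,sigma} <= ||X||_{q,sigma} once
    # t >= log((p-1)/(q-1)) / (4 alpha), for 1 <= q <= p
    return np.log((p - 1) / (q - 1)) / (4 * alpha)

s_min = 0.1                       # e.g. sigma = diag(0.1, 0.9)
a2 = alpha_2(s_min)
print(f"alpha_2(L) = {a2:.4f}")                                   # ~ 0.3641
print(f"q=2 -> p=4 hypercontractivity time: t >= {hc_time(4, 2, a2):.4f}")
```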

We can now derive a tensorization-type result for a wide class of Lindblad generators. Let \(\mathcal {L}\) be a \(\sigma \)-reversible and primitive Lindblad generator. Recall that the spectral gap of \(\mathcal {L}\) is defined by

$$\begin{aligned} \lambda (\mathcal {L}) = \inf _{X} \frac{\mathcal {E}_{2, \mathcal {L}}(X)}{\text {Var}_\sigma (X)}, \end{aligned}$$

where \(\text {Var}_\sigma (X) =\langle X, X\rangle _\sigma - \langle X, {\mathbb {I}}\rangle _\sigma ^2 =\Vert X\Vert _{2, \sigma }^2 -\langle X, {\mathbb {I}}\rangle _\sigma ^2\), see e.g. [26]. Observe that \(\text {Var}_\sigma (X)\) is the squared length of the projection of X onto the subspace orthogonal to \({\mathbb {I}}\in \mathcal {B}(\mathcal {H})\) with respect to the inner product \(\langle \cdot , \cdot \rangle _\sigma \). On the other hand, since the QMS is primitive, \({\mathbb {I}}\) is the sole 0-eigenvector of \(\mathcal {L}\) up to scaling,Footnote 8 and \(\mathcal {L}\) is self-adjoint with respect to this inner product. Therefore, \(\lambda (\mathcal {L})\) is the minimum non-zero eigenvalue of \(\mathcal {L}\). Note that, since the Dirichlet form \(\mathcal {E}_{2, \mathcal {L}}\) is non-negative and \(\mathcal {L}\) is primitive, we have \(\lambda (\mathcal {L})>0\); that is, \(\lambda (\mathcal {L})\) is indeed the spectral gap of \(\mathcal {L}\) above the zero eigenvalue.

The spectral gap satisfies the tensorization property, as shown below. Observe that

$$\begin{aligned} \mathcal {K}_n= \sum _{i=1}^n {\widehat{\mathcal {L}}}_i, \end{aligned}$$

is a sum of mutually commuting operators. Thus the eigenvalues of \(\mathcal {K}_n\) are sums of eigenvalues of the individual \({\widehat{\mathcal {L}}}_i\)’s. Since each \({\widehat{\mathcal {L}}}_i\) is a tensor product of \(\mathcal {L}\) with identity superoperators, the set of its eigenvalues is the same as that of \(\mathcal {L}\). Using these facts we conclude that

$$\begin{aligned} \lambda (\mathcal {K}_n) = \lambda (\mathcal {L}), \qquad \forall n. \end{aligned}$$
(44)
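The identity (44) can be checked directly for the simple generator by building the superoperator matrices; note that the eigenvalues of a linear map do not depend on the chosen basis or inner product. A minimal sketch (ours, for illustration only) for qubits and \(n=2\), which recovers \(\lambda (\mathcal {K}_2)=\lambda (\mathcal {L})=1\) for the simple generator:

```python
import numpy as np

d = 2
sigma = np.diag([0.3, 0.7])

def L(X):
    # simple generator L(X) = X - tr(sigma X) I
    return X - np.trace(sigma @ X) * np.eye(d)

def superop_matrix(f, dim):
    # matrix of a linear map on B(C^dim) in the standard basis E_ij, built column by column
    cols = []
    for i in range(dim):
        for j in range(dim):
            E = np.zeros((dim, dim))
            E[i, j] = 1.0
            cols.append(f(E).flatten())
    return np.array(cols).T

def K2(X):
    # (L ⊗ id + id ⊗ L)(X) on B(C^2 ⊗ C^2), applied blockwise
    X4 = X.reshape(d, d, d, d)                       # indices (a, b, a', b')
    out = np.zeros_like(X4)
    for b in range(d):
        for bp in range(d):
            out[:, b, :, bp] += L(X4[:, b, :, bp])   # L acting on the first factor
    for a in range(d):
        for ap in range(d):
            out[a, :, ap, :] += L(X4[a, :, ap, :])   # L acting on the second factor
    return out.reshape(d * d, d * d)

def min_nonzero_eig(M):
    ev = np.linalg.eigvals(M).real
    return min(e for e in ev if e > 1e-8)

print(min_nonzero_eig(superop_matrix(L, d)))          # ≈ 1.0 = lambda(L)
print(min_nonzero_eig(superop_matrix(K2, d * d)))     # ≈ 1.0 = lambda(K_2)
```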

It is well-known that \(\lambda (\mathcal {L})\ge \alpha _2(\mathcal {L})\) [10, 26]. The following corollary gives a lower bound on \(\alpha _2(\mathcal {L})\) in terms of \(\lambda (\mathcal {L})\).

Corollary 26

Let \(\dim \mathcal {H}=2\) and \(\sigma \in \mathcal {D}_+(\mathcal {H})\). For any \(\sigma \)-reversible primitive Lindblad generator \(\mathcal {L}\) we have

$$\begin{aligned} \alpha _2(\mathcal {K}_n) \ge \frac{1-2s_{\min }(\sigma )}{\log \big ( 1/s_{\min }(\sigma )-1\big ) } \lambda (\mathcal {L}), \end{aligned}$$

where \(s_{\min }(\sigma )\) denotes the minimal eigenvalue of \(\sigma \).

This corollary is a non-commutative version of Corollary A.4 of [18] and gives a stronger bound compared to Corollary 6 of [49]. It would be interesting to compare this corollary with the result of King [28] who generalized the hypercontractivity inequalities of [33] for the unital qubit depolarizing channel to all unital qubit quantum Markov semigroups. Here, having a bound on the 2-log-Sobolev constant of the \(\sigma \)-reversible generalized qubit depolarizing channel (and its tensorization property), we derive a bound on the 2-log-Sobolev constant of all qubit \(\sigma \)-reversible QMS.

Proof of Corollary 26

Let \(\mathcal {L}'\) be the simple Lindblad generator that is \(\sigma \)-reversible, and let \(X\in \mathcal {P}(\mathcal {H}^{\otimes n})\) be arbitrary. Then by Theorems 21 and 25 we have

$$\begin{aligned} \frac{1-2s_{\min }(\sigma )}{\log \big ( 1/s_{\min }(\sigma )-1\big ) } \, \text {Ent}_{2, \sigma ^{\otimes n}}(X) \le \sum _{i=1}^n \big \langle X, {\widehat{\mathcal {L}}}'_i(X)\big \rangle _{\sigma ^{\otimes n}}. \end{aligned}$$
(45)

Let \({\mathcal {W}}_i\subset \mathcal {B}(\mathcal {H}^{\otimes n})\) be the subspace spanned by operators of the form \(A_1\otimes \cdots \otimes A_n \in \mathcal {B}(\mathcal {H}^{\otimes n})\) with \(A_i={\mathbb {I}}\in \mathcal {B}(\mathcal {H})\). In other words, \({\mathcal {W}}_i = \ker ({\widehat{\mathcal {L}}}'_i)\). Then \(\big \langle X, {\widehat{\mathcal {L}}}'_i(X)\big \rangle _{\sigma ^{\otimes n}}\) equals the squared length of the projection of X onto \({\mathcal {W}}_i^{\perp }\). On the other hand, since \(\mathcal {L}\) is primitive and \(\sigma \)-reversible, we also have \({\mathcal {W}}_i=\ker {\widehat{\mathcal {L}}}_i \) and \({\mathcal {W}}_i^{\perp }\) is invariant under \({\widehat{\mathcal {L}}}_i\). Moreover, by definition \(\lambda ({\widehat{\mathcal {L}}}_i)\) is the minimum eigenvalue of \({\widehat{\mathcal {L}}}_i\) restricted to \({\mathcal {W}}_i^{\perp }\) (i.e., the minimum non-zero eigenvalue). We conclude that

$$\begin{aligned} \lambda ({\widehat{\mathcal {L}}}_i)\big \langle X, {\widehat{\mathcal {L}}}'_i(X)\big \rangle _{\sigma ^{\otimes n}}\le \big \langle X, {\widehat{\mathcal {L}}}_i(X)\big \rangle _{\sigma ^{\otimes n}}. \end{aligned}$$

On the other hand since \({\widehat{\mathcal {L}}}_i\) equals the tensor product of \(\mathcal {L}\) with some identity superoperators, \(\lambda ({\widehat{\mathcal {L}}}_i) = \lambda (\mathcal {L})\). Therefore,

$$\begin{aligned} \lambda (\mathcal {L})\big \langle X, {\widehat{\mathcal {L}}}'_i(X)\big \rangle _{\sigma ^{\otimes n}}\le \big \langle X, {\widehat{\mathcal {L}}}_i(X)\big \rangle _{\sigma ^{\otimes n}}. \end{aligned}$$

Using this in (45) we arrive at

$$\begin{aligned} \lambda (\mathcal {L})\frac{1-2s_{\min }(\sigma )}{\log \big ( 1/s_{\min }(\sigma )-1\big ) } \, \text {Ent}_{2, \sigma ^{\otimes n}}(X) \le \sum _{i=1}^n \big \langle X, {\widehat{\mathcal {L}}}_i(X)\big \rangle _{\sigma ^{\otimes n}} = \langle X, \mathcal {K}_n(X)\rangle _{\sigma ^{\otimes n}}. \end{aligned}$$

This gives the desired bound on \(\alpha _2(\mathcal {K}_n)\). \(\quad \square \)

Corollary 27

Let \(\dim \mathcal {H}=2\) and \(\sigma \in \mathcal {D}_+(\mathcal {H})\). Let \(\mathcal {L}\) be a \(\sigma \)-reversible primitive Lindblad generator. Then for any \(1\le q\le p\) and \(t\ge 0\) satisfying

$$\begin{aligned} t\ge \frac{\log \big ( 1/s_{\min }(\sigma )-1\big )}{4\lambda (\mathcal {L}) \, \big (1-2s_{\min }(\sigma )\big )}\log \frac{p-1}{q-1}, \end{aligned}$$

we have \(\Vert \Phi _t^{\otimes n}(X)\Vert _{p, \sigma }\le \Vert X\Vert _{q, \sigma }\) for all \(X> 0\).

5 Application: Second-Order Converses

One of the primary goals of information theory is to find optimal rates of information-theoretic tasks. For instance, for the task of information transmission over a noisy channel, this optimal rate is the capacity. The latter is said to satisfy the strong converse property if any attempt to transmit information at a rate higher than the capacity fails with certainty in the limit of infinitely many uses of the channel. In this section, we show how reverse hypercontractivity inequalities can be used to derive finite sample size strong converse bounds in the tasks of asymmetric quantum hypothesis testing and classical communication through a classical-quantum channel.

5.1 Quantum Hypothesis Testing

Binary quantum hypothesis testing concerns the problem of discriminating between two different quantum states, and is essential for various quantum information-processing protocols. Suppose that a party, Bob, receives a quantum system, with the knowledge that it is prepared either in the state \(\rho \) (the null hypothesis) or in the state \(\sigma \) (the alternative hypothesis) over a finite-dimensional Hilbert space \({{\mathcal {H}}}\). His aim is to infer which hypothesis is true, i.e., which state the system is in. To do so he performs a measurement on the system that he receives. This is most generally described by a POVM \(\{T,{\mathbb {I}}- T\}\) where \(0 \le T \le {\mathbb {I}}\); when the measurement outcome corresponds to T he infers that the state is \(\rho \), and otherwise that it is \(\sigma \). Adopting the nomenclature from classical hypothesis testing, we refer to T as a test. The probability that Bob correctly guesses the state to be \(\rho \) is then equal to \(\text {tr}(T \rho )\), whereas his probability of correctly guessing the state to be \(\sigma \) is \(\text {tr}(({\mathbb {I}}-T)\sigma )\). Bob can erroneously infer the state to be \(\sigma \) when it is actually \(\rho \) or vice versa. The corresponding error probabilities are referred to as the Type I error and Type II error, respectively, and are given as follows:

$$\begin{aligned} \alpha (T) := \text {tr}(({\mathbb {I}}- T)\rho ),\quad \beta (T) := \text {tr}(T \sigma ). \end{aligned}$$

Correspondingly, if multiple (say, n) identical copies of the system are available, and a test \(T_n \in {{\mathcal {B}}}({{\mathcal {H}}}^{\otimes n})\) is performed on the n copies, then the Type I and Type II errors are given by

$$\begin{aligned} \alpha _n(T_n) := \text {tr}(({\mathbb {I}}_n - T_n)\rho ^{\otimes n}),\quad \beta _n(T_n) := \text {tr}(T_n \sigma ^{\otimes n}), \end{aligned}$$

where \({\mathbb {I}}_n\) denotes the identity operator in \({{\mathcal {B}}}({{\mathcal {H}}}^{\otimes n})\). There is a trade-off between the two error probabilities and there are various ways to optimize them. In the setting of asymmetric quantum hypothesis testing, one minimizes the Type II error under the constraint that the Type I error stays below a threshold value \(\varepsilon \in (0,1)\). In this case one is interested in the following quantity

$$\begin{aligned} \beta _{n, \varepsilon } := \min \{\beta _n(T_n) \, : \, \alpha _n(T_n) \le \varepsilon , ~ 0\le T_n\le {\mathbb {I}}_n\}, \end{aligned}$$
(46)

where the minimum is taken over all possible tests \(T_n \in {{\mathcal {B}}}({{\mathcal {H}}}^{\otimes n})\). The quantum Stein lemma [23, 42] states that

$$\begin{aligned} \lim _{n \rightarrow \infty } \left( -\frac{1}{n}\log \beta _{n, \varepsilon }\right) = D(\rho ||\sigma )\quad \forall \varepsilon \in (0,1). \end{aligned}$$

The asymptotic strong converse rate \(R_{sc}\) of the above quantum hypothesis testing problem is defined to be the smallest number R such that if

$$\begin{aligned} \limsup _{n \rightarrow \infty } \frac{1}{n} \log \beta _n(T_n) \le - R, \end{aligned}$$

for some sequence of tests \(\{T_n\}_{n \in {\mathbb {N}}}\), then

$$\begin{aligned} \lim _{n \rightarrow \infty } \alpha _n(T_n) = 1. \end{aligned}$$

This quantity has been shown to be equal to Stein's exponent \(D(\rho ||\sigma )\). In this section we are interested in obtaining a bound on the rate of convergence of \(\alpha _n(T_n)\) as a function of n, that is, when Bob receives a finite number of identical copies of the quantum system. We use reverse hypercontractivity in order to obtain our bound. Before stating and proving the main theorem of this section, we recall the following important inequality that will be used in the proof.

Lemma 28

(Araki–Lieb–Thirring inequality [2, 29]) For any \(A,B\in \mathcal {P}(\mathcal {H})\), and \(r\in [0,1]\),

$$\begin{aligned} \text {tr}(B^{r/2}A^r B^{r/2} )\le \text {tr}(B^{1/2} A B^{1/2})^r. \end{aligned}$$
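A quick numerical check of this inequality on random positive semidefinite matrices (ours, for illustration only):

```python
import numpy as np

np.random.seed(4)
d = 4

def pw(M, a):
    # fractional power of a positive semidefinite matrix via its eigen-decomposition
    ev, U = np.linalg.eigh(M)
    ev = np.clip(ev, 0.0, None)
    return (U * ev ** a) @ U.conj().T

for _ in range(200):
    GA = np.random.randn(d, d) + 1j * np.random.randn(d, d)
    GB = np.random.randn(d, d) + 1j * np.random.randn(d, d)
    A, B = GA @ GA.conj().T, GB @ GB.conj().T      # A, B >= 0
    r = np.random.uniform(0.05, 1.0)
    lhs = np.trace(pw(B, r / 2) @ pw(A, r) @ pw(B, r / 2)).real
    rhs = np.trace(pw(pw(B, 0.5) @ A @ pw(B, 0.5), r)).real
    assert lhs <= rhs + 1e-7
print("Araki-Lieb-Thirring verified on 200 random instances")
```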

Our main result, from which a bound for the finite blocklength strong converse rate follows directly as a corollary, is given by Theorem 29.

Theorem 29

Let \(\rho ,\sigma \in \mathcal {D}_+(\mathcal {H})\) be faithful density matrices.Footnote 9 Then for any test \(0\le T_n\le {\mathbb {I}}_n\), where \(T_n \in {{\mathcal {B}}}({{\mathcal {H}}}^{\otimes n})\), we have

$$\begin{aligned} \log \text {tr}(\sigma ^{\otimes n} T_n )&\ge -nD(\rho \Vert \sigma ) - 2 \sqrt{{n \Vert \sigma ^{-1/2}\rho \sigma ^{-1/2}\Vert _\infty \log \frac{1}{\text {tr}(\rho ^{\otimes n} T_n)}}} +\log \text {tr}(\rho ^{\otimes n} T_n). \end{aligned}$$
(47)

Proof

The result follows by combining Theorem 19 and Lemma 1. For simplicity of notation we will use \(\sigma _n:=\sigma ^{\otimes n}\) and \(\rho _n:=\rho ^{\otimes n}\). Let \(0\le p,q\le 1\) and let \(t \ge 0\) be such that

$$\begin{aligned} (1-p)(1-q)&=\mathrm {e}^{-t}. \end{aligned}$$
(48)

Let \({{\mathcal {L}}}\) denote the generator of a generalized depolarizing semigroup \(\{\Phi _t:\, t\ge 0\}\) with invariant state \(\rho \), i.e., \(\Phi _t(X)=\mathrm {e}^{-t} X + (1-\mathrm {e}^{-t}) \text {tr}(\rho X) {\mathbb {I}}\). By Theorem 19 the 1-log-Sobolev constants of this QMS and its tensor powers are lower bounded by 1/4. Then using Lemma 18 for \(Y=T_n\) and \(X=\Gamma _{\rho _n}^{-1}(\sigma _n)\) we obtain

$$\begin{aligned} \text {tr}\big (\sigma _n \Phi _t^{\otimes n}(T_n)\big )\ge \big \Vert \Gamma _{\rho _n}^{-1}(\sigma _n)\big \Vert _{p,\rho _n}\Vert T_n\Vert _{q,\rho _n}. \end{aligned}$$
(49)

An application of the Araki–Lieb–Thirring inequality, Lemma 28, with \(A=\sigma _n\), \(B=\rho _n^{(1-p)/p}\) and \(r=p\in [0,1]\) leads to

$$\begin{aligned} \big \Vert \Gamma _{\rho _n}^{-1}(\sigma _n)\big \Vert _{p,\rho _n}&=\left[ \text {tr}\Big ( \rho _n^{(1-p)/2p}\sigma _n\rho _n^{(1-p)/2p}\Big )^p\right] ^{1/p}\ge \left[ \text {tr}\big ( \rho _n^{1-p}\,\sigma _n^{p}\,\big )\right] ^{1/p}\\&=\exp \left( - D_{1-p}(\rho _n\Vert \sigma _n) \right) , \end{aligned}$$

where

$$\begin{aligned} D_{1-p}(\rho \Vert \sigma ):=\frac{-1}{p}\log \text {tr}\left( \sigma ^p\,\rho ^{1-p}\right) , \end{aligned}$$

denotes the (Petz) Rényi relative entropy of order \(1-p\) between \(\rho \) and \(\sigma \). A very similar application of Lemma 28, for \(A=T_n\), \(B=\rho _n^{1/q}\) and \(r=q\in [0,1]\), yields

$$\begin{aligned} \Vert T_n\Vert _{q,\rho _n}=\left[ \text {tr}\big (\rho _n^{1/2q} T_n\rho _n^{1/2q} \big )^q\right] ^{1/q}\ge \left[ \text {tr}\big (\rho _n T_n^q\big )\right] ^{1/q}\ge \left[ \text {tr}\big (\rho _n T_n\big )\right] ^{1/q}, \end{aligned}$$

where in the last inequality, we used that \(0\le T_n\le {\mathbb {I}}\), so that \(T_n^q\ge T_n\). Using the last two bounds in (49), we get

$$\begin{aligned} \text {tr}(\sigma _n \Phi _t^{\otimes n}(T_n))\ge \left[ \text {tr}(\rho _n T_n)\right] ^{1/q}\exp \left( -D_{1-p}(\rho _n\Vert \sigma _n) \right) . \end{aligned}$$

Taking the limit \(p \rightarrow 0\) (and \(q\rightarrow 1-\mathrm {e}^{-t}\)) on both sides of the above inequality yields

$$\begin{aligned} \text {tr}(\sigma _n \Phi _t^{\otimes n}(T_n))&\ge \left[ \text {tr}(\rho _n T_n)\right] ^{1/(1-\mathrm {e}^{-t})}\exp \left( -D(\rho _n\Vert \sigma _n) \right) . \end{aligned}$$
(50)

Let \(\gamma :=\Vert \sigma ^{-1/2}{\rho }\sigma ^{-1/2}\Vert _{\infty }\) and define the superoperator \(\Psi _t\) by

$$\begin{aligned} \Psi _t(X) = \mathrm {e}^{-t} X +\gamma (1-\mathrm {e}^{-t})\text {tr}(\sigma X)\,{\mathbb {I}}. \end{aligned}$$

Then, by induction on n, it can be shown that \(\Psi _t^{\otimes n} -\Phi _t^{\otimes n}\) is a completely positive superoperator. For \(n=1\) this is clear from the definitions since \(\rho \le \gamma \sigma \); for the induction step, note that for every \(Y\in \mathcal {P}(\mathcal {H}^{\otimes n}\otimes \mathcal {H}')\), where \(\mathcal {H}'\) is an arbitrary Hilbert space, we have

$$\begin{aligned} \Psi _t^{\otimes n}\otimes \mathcal {I}(Y)&= \big (\Psi _t^{\otimes (n-1)}\otimes \mathcal {I}\otimes \mathcal {I}\big ) \big (\mathcal {I}^{\otimes (n-1)}\otimes \Psi _t\otimes \mathcal {I}(Y)\big )\\&\ge \big (\Phi _t^{\otimes (n-1)}\otimes \mathcal {I}\otimes \mathcal {I}\big ) \big (\mathcal {I}^{\otimes (n-1)}\otimes \Psi _t\otimes \mathcal {I}(Y)\big )\\&=\big (\mathcal {I}^{\otimes (n-1)}\otimes \Psi _t\otimes \mathcal {I}\big ) \big (\Phi _t^{\otimes (n-1)}\otimes \mathcal {I}\otimes \mathcal {I}(Y)\big )\\&\ge \big (\mathcal {I}^{\otimes (n-1)}\otimes \Phi _t\otimes \mathcal {I}\big ) \big (\Phi _t^{\otimes (n-1)}\otimes \mathcal {I}\otimes \mathcal {I}(Y)\big )\\&= \Phi _t^{\otimes n}\otimes \mathcal {I}(Y), \end{aligned}$$

where the two inequalities follow from the induction hypothesis and the base case, respectively. Therefore, \(\Psi _t^{\otimes n} -\Phi _t^{\otimes n}\) is completely positive. On the other hand, for every \(Y\in \mathcal {B}(\mathcal {H}^{\otimes n})\) we have

$$\begin{aligned} \text {tr}\big (\sigma _n\Psi _t^{\otimes n}(Y)\big ) = \big (\mathrm {e}^{-t}+ \gamma (1-\mathrm {e}^{-t})\big )^n\,\text {tr}(\sigma _n Y). \end{aligned}$$

This equation is immediate for \(n=1\), and for arbitrary n can be proven by first observing that it holds for \(Y=Y_1\otimes \cdots \otimes Y_n\) being of a tensor product form, and then using linearity. Putting these together we arrive at

$$\begin{aligned} \text {tr}\big (\sigma _n \Phi _t^{\otimes n}(T_n)\big )&\le \text {tr}\big (\sigma _n \Psi _t^{\otimes n}\,(T_n)\big ) \\&= \big (\mathrm {e}^{-t}+ \gamma (1-\mathrm {e}^{-t})\big )^n\,\text {tr}(\sigma _n T_n). \end{aligned}$$

Next using the fact that \(\gamma \ge 1\) (which follows simply by taking the trace of the operator inequality \(\rho \le \gamma \sigma \)), the convexity of \(h(x)=x^\gamma \) implies \((h(x)-h(1))/(x-1)\ge h'(1)\) for every \(x\ge 1\). Therefore, \(\mathrm {e}^{\gamma t}-1\ge \gamma (\mathrm {e}^t-1)\) for every \(t\ge 0\), and \(\mathrm {e}^{-t}+ \gamma (1-\mathrm {e}^{-t})\le \mathrm {e}^{(\gamma -1)t}\). As a result

$$\begin{aligned} \text {tr}\big (\sigma _n \Phi _t^{\otimes n}(T_n)\big )&\le \mathrm {e}^{(\gamma -1)nt}\,\text {tr}(\sigma _n T_n). \end{aligned}$$
(51)

Then from (50) and (51) we get

$$\begin{aligned} \left[ \text {tr}(\rho _n T_n)\right] ^{1/(1-\mathrm {e}^{-t})}\exp \left( -D(\rho _n\Vert \sigma _n) \right) \le \mathrm {e}^{(\gamma -1)nt}\text {tr}(\sigma _nT_n ). \end{aligned}$$

Taking the logarithm of both sides yields

$$\begin{aligned} \log \text {tr}(\sigma _nT_n )&\ge -D(\rho _n\Vert \sigma _n) - (\gamma -1)nt + \frac{1}{1-\mathrm {e}^{-t}} \log \text {tr}(\rho _n T_n)\nonumber \\&\ge -D(\rho _n\Vert \sigma _n) - \gamma nt + \left( 1+ \frac{1}{t}\right) \log \text {tr}(\rho _n T_n), \end{aligned}$$
(52)

where the second inequality follows from \(\mathrm {e}^t \ge 1+ t\) and

$$\begin{aligned} \frac{1}{1-\mathrm {e}^{-t}} = 1+ \frac{1}{\mathrm {e}^t-1}\le 1+\frac{1}{t}. \end{aligned}$$

Optimizing (52) over the choice of t yields

$$\begin{aligned} t=\left( \frac{-\log \text {tr}(\rho _nT_n)}{\gamma n } \right) ^{1/2}, \end{aligned}$$

and we obtain the desired inequality

$$\begin{aligned} \log \text {tr}(\sigma _nT_n )&\ge -nD(\rho \Vert \sigma ) - 2 \sqrt{-\gamma n\log \text {tr}(\rho _nT_n)}+ \log \text {tr}(\rho _n T_n). \end{aligned}$$

\(\square \)

Remark 7

The bound found by the present reverse hypercontractivity technique is weaker than the one found in Equation (75) of [34], which is in particular tight as \(n\rightarrow \infty \). However, as opposed to [34], the techniques developed in this paper have the particular advantage that they can be generalized to obtain strong converses in various problems of quantum network information theory (see [12, 13]).

Corollary 30

(Finite-blocklength strong converse bound for quantum hypothesis testing). Let \(\rho ,\sigma \in \mathcal {D}_+(\mathcal {H})\) and \(\gamma =\Vert \rho \sigma ^{-1}\Vert _\infty \). Then for any test \(0\le T_n\le {\mathbb {I}}_n\), where \(T_n \in {{\mathcal {B}}}({{\mathcal {H}}}^{\otimes n})\), if the Type II error satisfies the inequality \(\beta _n(T_n) \le \mathrm {e}^{-nr}\) for \(r > D(\rho ||\sigma )\), then the Type I error satisfies

$$\begin{aligned} \alpha _n(T_n)&\ge 1 - \mathrm {e}^{-nf}, \end{aligned}$$
(53)

where

$$\begin{aligned} f = \left( \sqrt{\gamma + (r-D(\rho ||\sigma ))} - \sqrt{\gamma }\right) ^2, \end{aligned}$$

and hence \(f\) tends to zero in the limit \(r \rightarrow D(\rho ||\sigma )\).

Proof

Fix \(r> D(\rho \Vert \sigma )\) and consider a sequence of tests \(T_n\) such that \(\beta _n(T_n)\le \mathrm {e}^{-nr}\). Then, from Theorem 29 we have

$$\begin{aligned} -nr&\ge -n D(\rho ||\sigma ) - 2\, \sqrt{n\gamma \log \frac{1}{1 - \alpha _n(T_n)}} - \log \frac{1}{1-\alpha _n(T_n)}. \end{aligned}$$

Defining \(x_n^2 := \log \frac{1}{1-\alpha _n(T_n)}\) with \(x_n\ge 0\), this is equivalent to

$$\begin{aligned} x_n^2 + 2 \,\sqrt{n\gamma }\, x_n \,-\, n\, (r- D(\rho ||\sigma )) \ge 0, \end{aligned}$$

solving which directly leads to the statement of the corollary. \(\quad \square \)

Theorem 29 also leads to the following finite blocklength second order lower bound on the Type II error when the Type I error is less than a threshold value.

Corollary 31

Let \(\rho ,\sigma \in \mathcal {D}_+(\mathcal {H})\) . Then for any \(n \in {\mathbb {N}}\) and \(\varepsilon >0\) the minimal Type II error satisfies

$$\begin{aligned} \beta _{n, \varepsilon } \ge (1-\varepsilon ) \exp \left( - nD(\rho ||\sigma ) - 2\, \sqrt{n\gamma \log \left( \frac{1}{1-\varepsilon }\right) }\,\right) , \end{aligned}$$

where \(\gamma = \Vert \rho \sigma ^{-1}\Vert _\infty \).
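For a concrete sense of these bounds, the quantities appearing in Theorem 29 and Corollaries 30 and 31 are elementary functions of \(D(\rho \Vert \sigma )\) and \(\gamma \). The sketch below (ours, for illustration only) evaluates them for an arbitrarily chosen pair of qubit states, using \(\gamma =\Vert \sigma ^{-1/2}\rho \sigma ^{-1/2}\Vert _\infty \) as in the proof of Theorem 29.

```python
import numpy as np

def mlog(M):
    # matrix logarithm of a positive definite matrix via its eigen-decomposition
    ev, U = np.linalg.eigh(M)
    return (U * np.log(ev)) @ U.conj().T

def rel_entropy(rho, sigma):
    # D(rho||sigma) = tr[ rho (log rho - log sigma) ]  (natural logarithm)
    return np.trace(rho @ (mlog(rho) - mlog(sigma))).real

rho = np.array([[0.6, 0.2], [0.2, 0.4]])
sigma = np.array([[0.5, 0.0], [0.0, 0.5]])

# gamma = || sigma^{-1/2} rho sigma^{-1/2} ||_infty
es, Us = np.linalg.eigh(sigma)
s_m12 = (Us * es ** -0.5) @ Us.conj().T
gamma = np.linalg.eigvalsh(s_m12 @ rho @ s_m12).max()
D = rel_entropy(rho, sigma)

n, eps = 1000, 0.05
# Corollary 31: lower bound on the minimal type-II error beta_{n,eps}
log_beta_lb = -n * D - 2 * np.sqrt(n * gamma * np.log(1 / (1 - eps))) + np.log(1 - eps)
# Corollary 30: strong-converse exponent for a type-II rate r > D(rho||sigma)
r = D + 0.1
f = (np.sqrt(gamma + (r - D)) - np.sqrt(gamma)) ** 2
print(f"D(rho||sigma) = {D:.4f},  gamma = {gamma:.4f}")
print(f"log beta_(n,eps) >= {log_beta_lb:.2f}")
print(f"type-I error >= 1 - exp(-n f) with f = {f:.4f}")
```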

5.2 Classical-Quantum Channels

The strong converse property of the capacity of a classical-quantum (c-q) channel was proved independently in [41, 53]. In this section, we use the quantum reverse hypercontractivity inequality to obtain a finite blocklength strong converse bound for transmission of information through classical-quantum (c-q) channels. Suppose Alice wants to send classical messages belonging to a finite set \({{\mathcal {M}}}\) to Bob, using a memoryless c-q channel:

$$\begin{aligned} {{\mathcal {W}}}: {{\mathcal {X}}} \rightarrow {{\mathcal {D}}}({{\mathcal {H}}_B}), \end{aligned}$$

where \({{\mathcal {X}}}\) denotes a finite alphabet, and \({{\mathcal {H}}_B}\) is a finite-dimensional Hilbert space with dimension d. Thus the output of the channel under input \(x\in {\mathcal {X}}\) is some quantum state \(\rho _x={\mathcal {W}}(x)\in \mathcal {D}(\mathcal {H}_B)\). To send a message \(m \in {{\mathcal {M}}}\), Alice encodes it in a codeword

$$\begin{aligned} {{\mathcal {E}}}^{(n)}(m) = x^n(m)\equiv x^n := (x_1, x_2, \ldots x_n) \in {{\mathcal {X}}}^n, \end{aligned}$$

where \({{\mathcal {E}}}^{(n)}\) denotes the encoding map. She then sends it to Bob through n successive uses of the channel \({{\mathcal {W}}}^{\otimes n}\), whose action on the codeword \(x^n\) is given by

$$\begin{aligned} {{\mathcal {W}}}^{\otimes n}(x^n) = \rho _{x_1} \otimes \cdots \otimes \rho _{x_n} \equiv \rho _{x^n} . \end{aligned}$$

In order to infer Alice’s message, Bob applies a measurement, described by a POVM \(\Pi ^n:= \{\Pi ^n_{m'}\}_{m' \in {{\mathcal {M}}}}\) on the state \({{\mathcal {W}}}^{\otimes n}(x^n)=\rho _{x^n}\) that he receives. The outcome of the measurement would be Bob’s guess of Alice’s message. See Fig. 1.

Fig. 1: Encoding and decoding of a classical message sent over a c-q channel. \({\mathcal {E}}^{(n)}\) is the encoding map, and \(\Pi ^n\) is the POVM constituting the decoding map.

The triple \((|\mathcal {M}|,\mathcal {E}^{(n)},\Pi ^n)\) defines a code which we denote as \(\mathcal {C}_n\) (see [51]). The rate of the code is given by \(\log |\mathcal {M}|/n\), and its maximum probability of error is given by

$$\begin{aligned} p_{\max }({{\mathcal {C}}}_n; {{\mathcal {W}}}):=\max _{m\in \mathcal {M}} \Big [1-\text {tr}\big (\,\Pi ^n_m\,\mathcal {W}^{\otimes n}\circ \mathcal {E}^{(n)}(m)\big )\Big ]. \end{aligned}$$

We let \(C_{n, \varepsilon }({\mathcal {W}})\) be the maximum rate \(\log |{\mathcal {M}}|/n\) over all codes \({\mathcal {C}}_n=(|{\mathcal {M}}|, {\mathcal {E}}^{(n)}, \Pi ^n)\) with \(p_{\max }({\mathcal {C}}_n; {\mathcal {W}})\le \varepsilon \). Then the (asymptotic) capacity of the channel is defined by

$$\begin{aligned} C(\mathcal {W}):=\lim _{\varepsilon \rightarrow 0}\liminf _{n\rightarrow \infty } C_{n, \varepsilon }(\mathcal {W}). \end{aligned}$$

For c-q channels, this is known to be given by [24, 46]

$$\begin{aligned} C({\mathcal {W}}) = \max _{P_X} I(X; B)_\rho . \end{aligned}$$

Here the maximum is taken over all probability distributions \(P_X\) on \({\mathcal {X}}\), the bipartite state \(\rho _{XB}\) is given by

$$\begin{aligned} \rho _{XB} = \sum _{x\in {\mathcal {X}}} P_X(x) |x\rangle \langle x|\otimes \rho _x, \end{aligned}$$

and \(I(X; B)_\rho = D(\rho _{XB}\Vert \rho _X\otimes \rho _B)\) is the mutual information. The fact that the capacity is given by this single-letter maximum mutual information is implied by its additivity [47]. That is, the maximum mutual information associated to the channel \({\mathcal {W}}^{\otimes n}\) equals n times that of \({\mathcal {W}}\):

$$\begin{aligned} \max _{P_{X^n}} I(X^n; B^n) = n \max _{P_X} I(X; B) = nC({\mathcal {W}}). \end{aligned}$$
(54)
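For a c-q state as above, the mutual information \(I(X; B)_\rho =D(\rho _{XB}\Vert \rho _X\otimes \rho _B)\) reduces to the Holevo quantity \(H(\rho _B)-\sum _x P_X(x)H(\rho _x)\). The following sketch (ours, for illustration only) evaluates it for a small binary c-q channel with two non-orthogonal pure outputs and crudely approximates \(C({\mathcal {W}})\) by a sweep over input distributions; the channel is an arbitrary example.

```python
import numpy as np

def vn_entropy(rho):
    ev = np.linalg.eigvalsh(rho)
    ev = ev[ev > 1e-12]
    return float(-np.sum(ev * np.log(ev)))

# a binary c-q channel x -> rho_x with two non-orthogonal qubit outputs
rho0 = np.array([[1.0, 0.0], [0.0, 0.0]])
rho1 = np.array([[0.5, 0.5], [0.5, 0.5]])
outputs = [rho0, rho1]

def holevo(P):
    # I(X;B) = H(rho_B) - sum_x P(x) H(rho_x) for the c-q state rho_XB
    rho_B = sum(p * r for p, r in zip(P, outputs))
    return vn_entropy(rho_B) - sum(p * vn_entropy(r) for p, r in zip(P, outputs))

print(f"I(X;B) at uniform input: {holevo([0.5, 0.5]):.4f} nats")
# a crude sweep over input distributions approximates the capacity C(W) = max_P I(X;B)
ps = np.linspace(0.01, 0.99, 99)
print(f"max over the sweep: {max(holevo([p, 1 - p]) for p in ps):.4f} nats")
```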

Theorem 32

Let \(\mathcal {W}:\mathcal {X}\rightarrow \mathcal {D}(\mathcal {H}_B)\) be a c-q channel with \({\mathcal {W}}(x)=\rho _x\) being faithful for all \(x\in {\mathcal {X}}\). Then, for any code \({{\mathcal {C}}}_n:=(|\mathcal {M}|,\mathcal {E}^{(n)},\Pi ^n)\) with \(p_{\max }({\mathcal {C}}_n; {\mathcal {W}})\le \varepsilon \) we have

$$\begin{aligned} I(X^n; B^n)\ge \log |{\mathcal {M}}| -2\sqrt{dn\log \frac{1}{1-\varepsilon }} - \log \frac{1}{1-\varepsilon }, \end{aligned}$$

where \(d=\dim \mathcal {H}_B\) and the mutual information is computed with respect to the state

$$\begin{aligned} \rho _{X^nB^n} = \frac{1}{|{\mathcal {M}}|}\sum _{m} |x^n(m)\rangle \langle x^n(m)|\otimes \rho _{x^n(m)}. \end{aligned}$$

This theorem, together with the additivity result (54), directly implies that for any code of rate larger than \(C(\mathcal {W})\) the maximum probability of error goes to one as \(n\rightarrow \infty \).

Proof

For every \(x^n=(x_1, \dots , x_n)\in {\mathcal {X}}^n\) let \(\Phi _{t, x^n}=\Phi _{t, x_1}\otimes \cdots \otimes \Phi _{t, x_n}\) with

$$\begin{aligned} \Phi _{t, x}(X) = \mathrm {e}^{-t}X +(1-\mathrm {e}^{-t}) \text {tr}(\rho _x X){\mathbb {I}}. \end{aligned}$$

Then, following steps similar to those in the proof of Theorem 29, and using Theorem 19, Lemma 18 and the Araki–Lieb–Thirring inequality, we obtain, for every \(\Pi _m^n\),

$$\begin{aligned} \text {tr}\big (\rho _{B^n} \Phi _{t, x^n}(\Pi ^n_m)\big ) \ge \big [\text {tr}\big ( \rho _{x^n} \Pi _m^n \big )\big ]^{1/(1-\mathrm {e}^{-t})} \mathrm {e}^{-D(\rho _{x^n}\Vert \rho _{B^n})}. \end{aligned}$$
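The following sketch is purely a numerical sanity check of the \(n=1\) instance of this inequality on randomly generated full-rank qubit states and random operators \(0\le \Pi \le {\mathbb {I}}\); it plays no role in the argument, and the random instances, tolerances and helper functions are assumptions of the example.

import numpy as np
from scipy.linalg import logm

rng = np.random.default_rng(0)

def random_state(d=2):
    """Random full-rank (faithful) density matrix."""
    A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = A @ A.conj().T + 0.05 * np.eye(d)
    return rho / np.trace(rho).real

def random_povm_element(d=2):
    """Random operator Pi with 0 <= Pi <= I."""
    A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    H = A @ A.conj().T
    return H / (np.linalg.eigvalsh(H)[-1] + 0.1)

def relative_entropy(rho, sigma):
    """D(rho || sigma) in nats, for full-rank states."""
    return np.trace(rho @ (logm(rho) - logm(sigma))).real

for _ in range(1000):
    rho_x, rho_B = random_state(), random_state()
    Pi = random_povm_element()
    t = rng.uniform(0.05, 3.0)
    # Phi_{t,x}(Pi) = e^{-t} Pi + (1 - e^{-t}) tr(rho_x Pi) I
    Phi_Pi = np.exp(-t) * Pi + (1 - np.exp(-t)) * np.trace(rho_x @ Pi).real * np.eye(2)
    lhs = np.trace(rho_B @ Phi_Pi).real
    rhs = (np.trace(rho_x @ Pi).real ** (1.0 / (1 - np.exp(-t)))
           * np.exp(-relative_entropy(rho_x, rho_B)))
    assert lhs >= rhs - 1e-9, "unexpected violation"
print("no violations found")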

Letting \(x^n=x^n(m)\), using \(\text {tr}\big ( \rho _{x^n(m)} \Pi _m^n \big )\ge 1-\varepsilon \), taking the logarithm of both sides, and averaging over the choice of \(m\in {\mathcal {M}}\), we obtain

$$\begin{aligned} \frac{1}{|{\mathcal {M}}|} \sum _{m\in {\mathcal {M}}} \log \text {tr}\big (\rho _{B^n} \Phi _{t, x^n(m)}(\Pi ^n_m)\big )&\ge -\frac{1}{|{\mathcal {M}}|} \sum _{m\in {\mathcal {M}}} D(\rho _{x^n(m)} \Vert \rho _{B^n}) + \frac{1}{1-\mathrm {e}^{-t}} \log (1-\varepsilon ) \\&= -I(X^n; B^n) + \frac{1}{1-\mathrm {e}^{-t}} \log (1-\varepsilon )\\&\ge -I(X^n; B^n) + \big (1+\frac{1}{t}\big ) \log (1-\varepsilon ). \end{aligned}$$

Now define \(\Psi _t(X) = \mathrm {e}^{-t}X + (1-\mathrm {e}^{-t})\text {tr}(X){\mathbb {I}}\). Following steps similar to those in the proof of Theorem 29, and using \(\rho _{x} \le {\mathbb {I}}\), one can show that \(\Psi _t^{\otimes n} - \Phi _{t, x^n(m)}\) is completely positive. Therefore, \(\Phi _{t, x^n(m)}(\Pi ^n_m)\le \Psi _t^{\otimes n}(\Pi _m^n)\), and we have

$$\begin{aligned} -I(X^n; B^n) + \big (1+\frac{1}{t}\big ) \log (1-\varepsilon )&\le \frac{1}{|{\mathcal {M}}|} \sum _m \log \text {tr}\big (\rho _{B^n} \Psi _t^{\otimes n} (\Pi _m^n)\big )\\&\le \log \Big (\frac{1}{|{\mathcal {M}}|} \sum _m \text {tr}\big (\rho _{B^n} \Psi _t^{\otimes n} (\Pi _m^n)\big )\Big )\\&= \log \Big ( \frac{1}{|{\mathcal {M}}|} \text {tr}\big ( \rho _{B^n} \Psi _t^{\otimes n}({\mathbb {I}}^{\otimes n}_B) \big ) \Big ), \end{aligned}$$

where the second inequality follows from the concavity of the logarithm, and the last equality uses the fact that \(\{\Pi ^n_m:\, m\in {\mathcal {M}}\}\) is a POVM, so that \(\sum _m \Pi ^n_m = {\mathbb {I}}^{\otimes n}_B\). On the other hand,

$$\begin{aligned} \Psi _t^{\otimes n}({\mathbb {I}}^{\otimes n}_B) = \big ( \mathrm {e}^{-t} + (1-\mathrm {e}^{-t}) d \big )^n {\mathbb {I}}^{\otimes n}_B = \big ( 1 + (d-1)(1-\mathrm {e}^{-t}) \big )^n {\mathbb {I}}^{\otimes n}_B \le \mathrm {e}^{(d-1)nt}{\mathbb {I}}^{\otimes n}_B, \end{aligned}$$

where the inequality uses \(1-\mathrm {e}^{-t}\le t\) and \(1+s\le \mathrm {e}^{s}\). Therefore,

$$\begin{aligned} -I(X^n; B^n) + \big (1+\frac{1}{t}\big ) \log (1-\varepsilon ) \le -\log |{\mathcal {M}}| + dnt. \end{aligned}$$

Optimizing over the choice of \(t> 0\), whose optimal value is \(t=\sqrt{\frac{1}{dn}\log \frac{1}{1-\varepsilon }}\) and for which \(dnt+\frac{1}{t}\log \frac{1}{1-\varepsilon }=2\sqrt{dn\log \frac{1}{1-\varepsilon }}\), the desired result follows. \(\quad \square \)
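As an aside, the complete positivity of \(\Psi _t^{\otimes n} - \Phi _{t, x^n(m)}\) used in the proof can also be seen directly for \(n=1\): \((\Psi _t-\Phi _{t,x})(X) = (1-\mathrm {e}^{-t})\,\text {tr}\big (({\mathbb {I}}-\rho _x)X\big ){\mathbb {I}}\), which is completely positive since \({\mathbb {I}}-\rho _x\ge 0\), and the tensor-power case follows by a telescoping argument. The sketch below is purely illustrative (the random states, the helper names and the choice \(n\in \{1,2\}\) are assumptions of the example); it checks the claim numerically via positivity of the Choi matrix.

import numpy as np
from itertools import product

rng = np.random.default_rng(1)
d = 2  # qubit outputs

def random_state():
    A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = A @ A.conj().T
    return rho / np.trace(rho).real

def phi(t, rho_x, X):
    """The map Phi_{t,x} acting on X."""
    return np.exp(-t) * X + (1 - np.exp(-t)) * np.trace(rho_x @ X) * np.eye(d)

def psi(t, X):
    """The map Psi_t acting on X."""
    return np.exp(-t) * X + (1 - np.exp(-t)) * np.trace(X) * np.eye(d)

def choi_of_difference(t, rhos):
    """Choi matrix of Psi_t^{tensor n} - Phi_{t,x_1} tensor ... tensor Phi_{t,x_n};
    it is positive semidefinite iff the map is completely positive."""
    n = len(rhos)
    D = d ** n
    C = np.zeros((D * D, D * D), dtype=complex)
    for a, ia in enumerate(product(range(d), repeat=n)):
        for b, jb in enumerate(product(range(d), repeat=n)):
            Eab = np.zeros((D, D)); Eab[a, b] = 1.0        # matrix unit |a><b|
            out_psi = np.array([[1.0 + 0j]])
            out_phi = np.array([[1.0 + 0j]])
            for k in range(n):                             # |a><b| factorizes over sites
                Ek = np.zeros((d, d)); Ek[ia[k], jb[k]] = 1.0
                out_psi = np.kron(out_psi, psi(t, Ek))
                out_phi = np.kron(out_phi, phi(t, rhos[k], Ek))
            C += np.kron(Eab, out_psi - out_phi)
    return C

t = 0.7
for n in (1, 2):
    rhos = [random_state() for _ in range(n)]
    C = choi_of_difference(t, rhos)
    min_eig = np.linalg.eigvalsh((C + C.conj().T) / 2)[0]
    print(n, min_eig >= -1e-9)   # expected: True for both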

The above theorem leads to the following finite-blocklength, second-order strong converse bound for the classical capacity of a c-q channel.

Corollary 33

For any sequence of codes \(\mathcal {C}_n:=(|\mathcal {M}|,\mathcal {E}^{(n)},\Pi ^n)\) with rate \(r:=\frac{\log |\mathcal {M}|}{n}>{C}(\mathcal {W})\),

$$\begin{aligned} p_{\max }(\mathcal {C}_n;\mathcal {W})\ge 1-\mathrm {e}^{-nf}\,, \end{aligned}$$

where \(f:=\big (\sqrt{d+(r-C(\mathcal {W}))}-\sqrt{d} \big )^2\).
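Before turning to the proof, a quick numerical illustration of the size of this exponent; the parameter values below (\(d=2\), \(r-C({\mathcal {W}})=0.1\), \(n=1000\)) are arbitrary assumptions, not taken from the paper.

import numpy as np

d, gap, n = 2, 0.1, 1000                      # gap := r - C(W) > 0 (assumed)
f = (np.sqrt(d + gap) - np.sqrt(d)) ** 2      # exponent of Corollary 33
print(f, 1 - np.exp(-n * f))                  # f is about 1.2e-3; error bound about 0.70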

Proof

Writing \(\varepsilon := p_{\max }(\mathcal {C}_n;\mathcal {W})\) and combining the bound of Theorem 32 with the additivity result (54), which gives \(I(X^n; B^n)\le nC(\mathcal {W})\), we obtain

$$\begin{aligned} nC(\mathcal {W})\ge \log |\mathcal {M}|-2\sqrt{dn\log \frac{1}{1-\varepsilon }} - \log \frac{1}{1-\varepsilon }. \end{aligned}$$

The result follows by an analysis similar to that of Corollary 30; the underlying algebra is sketched below. \(\quad \square \)
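The analysis of Corollary 30 is not reproduced in this section; the following is a sketch, using only the bound displayed above, of how the exponent \(f\) arises. Setting \(u:=\sqrt{\frac{1}{n}\log \frac{1}{1-\varepsilon }}\), the bound can be rewritten as

$$\begin{aligned} n\big (r-C(\mathcal {W})\big )\le 2\sqrt{d}\, nu + nu^2, \end{aligned}$$

that is, \((u+\sqrt{d})^2\ge d+\big (r-C(\mathcal {W})\big )\). Hence \(u\ge \sqrt{d+(r-C(\mathcal {W}))}-\sqrt{d}\), which is equivalent to \(\log \frac{1}{1-\varepsilon }\ge nf\) and therefore to \(p_{\max }(\mathcal {C}_n;\mathcal {W})=\varepsilon \ge 1-\mathrm {e}^{-nf}\).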

Remark 8

As pointed out in Remark 7, the strong converse bound that we find here is weaker than that of [35]. However, as opposed to [35], our technique has recently been successfully applied to network information-theoretic scenarios (see [12, 13]).