Abstract
In this paper we develop the theory of quantum reverse hypercontractivity inequalities and show how they can be derived from log-Sobolev inequalities. Next we prove a generalization of the Stroock–Varopoulos inequality in the non-commutative setting which allows us to derive quantum hypercontractivity and reverse hypercontractivity inequalities solely from 2-log-Sobolev and 1-log-Sobolev inequalities respectively. We then prove some tensorization-type results providing us with tools to prove hypercontractivity and reverse hypercontractivity not only for certain quantum superoperators but also for their tensor powers. Finally as an application of these results, we generalize a recent technique for proving strong converse bounds in information theory via reverse hypercontractivity inequalities to the quantum setting. We prove strong converse bounds for the problems of quantum hypothesis testing and classical-quantum channel coding based on the quantum reverse hypercontractivity inequalities that we derive.
1 Introduction
Let \(\{T_t:\, t\ge 0\}\) be a continuous semigroup of stochastic maps (a Markov semigroup) with a unique stationary distribution \(\pi \). Defining the p-norm, for \(p\ge 1\), of a function f by \(\Vert f\Vert _p:=(\mathbb {E}|f|^p)^{1/p}\), where the expectation is with respect to \(\pi \), a simple convexity-type argument verifies that \(\Vert T_tf\Vert _p\le \Vert f\Vert _p\). That is, \(T_t\), for all \(t\ge 0\), is a contraction under p-norms. Since \(p\mapsto \Vert f\Vert _p\) is non-decreasing, a stronger contractivity inequality is the following:
$$\begin{aligned} \Vert T_tf\Vert _p\le \Vert f\Vert _q, \end{aligned}\tag{1}$$
for \(1\le q\le p\) and \(t=t(p)\) an increasing function of p satisfying \(t(q)=0\). An inequality of this form is called a hypercontractivity inequality. Since \(T_0\) equals the identity map, the inequality (1) for \(p=q\) reduces to an equality. Thus its infinitesimal version around \(t=0\) must also hold. This infinitesimal version is derived from the derivative of the left hand side of (1) and is called a q-log-Sobolev inequality.\(^{1}\) Such an inequality involves two quantities: the entropy function and the Dirichlet form. A log-Sobolev inequality guarantees the existence of a positive constant, called a log-Sobolev constant, up to which the entropy function is dominated by the Dirichlet form. Not only can one derive log-Sobolev inequalities from hypercontractivity ones, but a collection of the former inequalities can also be used to prove hypercontractivity inequalities through integration. Thus log-Sobolev inequalities and hypercontractivity inequalities are essentially equivalent.
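As a sanity check on two facts used above (monotonicity of \(p\mapsto \Vert f\Vert _p\) and the appearance of an entropy term when differentiating in p), the following sketch computes the p-derivative of the norm of a fixed, strictly positive function f. The shorthand \(\mathrm {Ent}(g):=\mathbb {E}[g\log g]-\mathbb {E}[g]\log \mathbb {E}[g]\) is introduced here only for illustration and is not claimed to match the normalization adopted later in the paper.
$$\begin{aligned} \frac{\text {d}}{\text {d}p}\log \Vert f\Vert _p = \frac{\text {d}}{\text {d}p}\Big (\frac{1}{p}\log \mathbb {E}[f^p]\Big ) = -\frac{1}{p^2}\log \mathbb {E}[f^p]+\frac{1}{p}\,\frac{\mathbb {E}[f^p\log f]}{\mathbb {E}[f^p]} = \frac{\mathrm {Ent}(f^p)}{p^2\,\mathbb {E}[f^p]}. \end{aligned}$$
Since \(\mathrm {Ent}(g)\ge 0\) by Jensen's inequality applied to the convex function \(x\mapsto x\log x\), the right hand side is non-negative, which gives the monotonicity in p. When \(\Vert T_{t(p)}f\Vert _p\) is differentiated at \(p=q\), the same entropy term appears together with a second term coming from the t-derivative of \(T_t\), and this is where the Dirichlet form enters.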
A fundamental tool in the theory of log-Sobolev inequalities is the Stroock–Varopoulos inequality. This inequality enables us to compare the Dirichlet forms associated to different values of q, using which a log-Sobolev inequality for \(q=2\) can be used to derive a log-Sobolev inequality for any q. Indeed, the Stroock–Varopoulos inequality allows us to derive a collection of log-Sobolev inequalities from a single one, from which hypercontractivity inequalities can be proven by integration.
Hypercontractivity inequalities were first studied in the context of quantum field theory [22, 40, 48], but later found several important applications in different areas of mathematics, e.g., concentration of measure inequalities [8, 45], transportation cost inequalities [21], estimating the mixing times [18], analysis of Boolean functions [15] and information theory [1, 25]. One of the main ingredients of most of these applications is the so called tensorization property. It states that the hypercontractivity inequality
$$\begin{aligned} \Vert T_t^{\otimes n}f\Vert _p\le \Vert f\Vert _q \end{aligned}$$
is satisfied for every \(n\ge 1\) if and only if it holds for \(n=1\). That is, the hypercontractivity of \(T_t\) is equivalent to the hypercontractivity of its tensor powers. The proof of the tensorization property is not hard, and can be obtained using the multiplicativity of the operator \((q\rightarrow p)\)-norm. Another proof, based on the equivalence of log-Sobolev and hypercontractivity inequalities, uses the chain rule and the subadditivity of the entropy function.
Hypercontractivity inequalities can also be studied for \( p, q<1\). Although \(\Vert \cdot \Vert _p\) for \(p<1\) is not a norm, it satisfies the reverse Minkowski inequality from which one can show that \(\Vert T_tf\Vert _p\ge \Vert f\Vert _p\) when \(p<1\). Thus it is natural to consider inequalities of the form (1) for \( p, q<1\) in the reverse direction. Such inequalities are called reverse hypercontractivity inequalities. The theory of log-Sobolev inequalities for the range of \(q<1\) is developed similarly and can be used for proving reverse hypercontractivity inequalities as well [36].
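For \(p<1\), \(p\ne 0\), the reverse contraction can also be seen directly from Jensen's inequality, as an alternative to the reverse Minkowski route mentioned above. The sketch below assumes a strictly positive f and uses only that \(T_t\) is a Markov operator (positivity-preserving with \(T_t1=1\)) that leaves \(\pi \) invariant:
$$\begin{aligned} 0<p<1:&\quad (T_tf)^p\ge T_t(f^p)\;\Longrightarrow \;\mathbb {E}\big [(T_tf)^p\big ]\ge \mathbb {E}\big [T_t(f^p)\big ]=\mathbb {E}[f^p],\\ p<0:&\quad (T_tf)^p\le T_t(f^p)\;\Longrightarrow \;\mathbb {E}\big [(T_tf)^p\big ]\le \mathbb {E}\big [T_t(f^p)\big ]=\mathbb {E}[f^p]. \end{aligned}$$
In the first case \(x\mapsto x^p\) is concave and raising to the power \(1/p>0\) preserves the inequality; in the second case it is convex and raising to the power \(1/p<0\) reverses it. Either way one obtains \(\Vert T_tf\Vert _p\ge \Vert f\Vert _p\).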
Quantum hypercontractivity inequalities The theory of hypercontractivity and log-Sobolev inequalities in the quantum (non-commutative) case has been developed by Olkiewicz and Zegarlinski [43]. Here the semigroup of stochastic maps is replaced by a semigroup of quantum superoperators (QMS) representing the time evolution of an open quantum system under the Markovian approximation in the Heisenberg picture. Kastoryano and Temme in [26] used log-Sobolev inequalities to estimate the mixing time of quantum Markov semigroups. The study of quantum reverse hypercontractivity was initiated in [14], where following [36] some applications were discussed. For other applications of hypercontractivity inequalities in quantum information theory see [16, 32, 39].
Due to the non-commutative features of quantum physics, hypercontractivity and log-Sobolev inequalities in the quantum case are much more complicated. Therefore, despite the apparent analogy with the classical (i.e. commutative) case, several complications arise. In particular, one of the main drawbacks of the theory in the non-commutative case is the lack of a general quantum Stroock–Varopoulos inequality. As mentioned above, such an inequality would allow one to derive hypercontractivity inequalities solely from a 2-log-Sobolev inequality. Special cases of the quantum Stroock–Varopoulos inequality, called regularity and strong regularity properties, were considered in the literature and proved for certain examples [26, 43]. The most general result in this direction is a proof of the strong regularity property for a wide class of quantum Markov semigroups obtained in [3].
Even more problematic is the issue of tensorization. As mentioned before, the proof of the tensorization property in the commutative case is quite easy and can be done with at least two methods, yet neither of them generalizes to the non-commutative case: (i) the superoperator norm is not multiplicative in general, and (ii) one cannot interpret the quantum conditional entropy as an average of an entropic quantity over a smaller system, which is a crucial aspect of the proof in the classical setting. Thus far, the tensorization property has been proven only for a few special examples of quantum Markov semigroups. In particular, it was proven for the qubit depolarizing semigroup in [26, 33] and was generalized to all unital qubit semigroups in [28]. Moreover, in [49] some techniques were developed for bounding the log-Sobolev constants associated to the tensor powers of quantum Markov semigroups, which can be considered as an intermediate resolution of the tensorization problem. We also refer to [4, 6] for the theory of hypercontractivity and log-Sobolev inequalities for completely bounded norms.
1.1 Our Results
In this paper we first develop the theory of quantum reverse hypercontractivity inequalities beyond the unital case. This is done in a manner largely analogous to the (forward) hypercontractivity inequalities. Here, in contrast to [26, 43], we need to use different normalizations for the entropy function as well as the Dirichlet form to make them non-negative even for parameters \(p<1\). Our results in this part are summarized in Theorem 11.
Our next result is a quantum Stroock–Varopoulos inequality for both the forward and reverse cases. We prove this inequality under the assumption of strong reversibility of the QMS. We provide two proofs for the quantum Stroock–Varopoulos inequality. The first proof is based on ideas in [11, 43]. The second proof is based on ideas in [3] in which the strong regularity is proven under the same assumption. Indeed, our quantum Stroock–Varopoulos inequality is a generalization of the strong regularity property established in [3]. Theorem 14 states our result in this part.
We then prove some tensorization-type results. The first one, Theorem 19, provides a uniform bound on the 1-log-Sobolev constant of generalized depolarizing semigroups and their tensor powers. The proof of this result is a generalization of the proof of a similar result in the classical case [36]. This tensorization result together with our Stroock–Varopoulos inequality gives a reverse hypercontractivity inequality which is used in the subsequent section. The second tensorization result, Theorem 21, shows that the 2-log-Sobolev constant of the n-fold tensor power of a qubit generalized depolarizing semigroup is independent of n. Next, in Theorem 25 we explicitly compute this 2-log-Sobolev constant. Finally, in Corollary 26 we use these results to establish a uniform bound on the 2-log-Sobolev constant of any qubit quantum Markov semigroup and its tensor powers. We note that the latter bound improves over the bounds provided in [49].
Let us briefly explain the ideas behind the latter tensorization results. Previously, Theorem 21 was known in the unital case (the usual depolarizing semigroup), the proof of which was based on an inequality on the norms of a \(2\times 2\) block matrix and its submatrices from [27]. Our proof of Theorem 21 is based on the same inequality. First in Lemma 22 we derive an infinitesimal version of that inequality in terms of the entropies of a \(2\times 2\) block matrix and its submatrices, and then use it to prove Theorem 21. To prove Theorem 25 we need to show that a certain function of qubit density matrices is optimized over diagonal ones. Once we show this, the explicit expression for the 2-log-Sobolev constant is obtained from the associated classical log-Sobolev constant derived in [18]. Finally, Corollary 26 is a quantum generalization of a classical result from [18] with an essentially similar proof except that we should take care of tensorization separately.
Finally, we apply the quantum reverse hypercontractivity in proving strong converse bounds for the tasks of quantum hypothesis testing and classical-quantum channel coding. In the next section, we briefly explain the key idea behind the application of reverse hypercontractivity to the problem of classical hypothesis testing.
1.2 Application to Hypothesis Testing Problem
Recently, the authors of [31] introduced a new technique to prove strong converse results in information theory using reverse hypercontractivity inequalities. In the following we briefly explain the ideas via the problem of hypothesis testing.
Suppose that n samples independently drawn from a probability distribution on some sample space \(\Omega \) are provided, and the task is to distinguish between two possible hypotheses which are given by the distributions P and Q on \(\Omega \). In this setting, we apply a test function\(^{2}\) \(f:\Omega ^n\rightarrow \{0, 1\}\) to make the decision: letting \((x_1, \dots , x_n)\in \Omega ^n\) be the observed samples, if \(f(x_1, \dots , x_n)\) equals 1, we infer the hypothesis to be P, and otherwise infer it to be Q. The following two types of error may occur: the Type I error of wrongly inferring the distribution to be Q, given by \(\alpha _n(f):=P^{\otimes n}(f=0)\), and the Type II error of wrongly inferring the distribution to be P, given by \(\beta _n(f):=Q^{\otimes n}(f=1)\). In the asymmetric regime, we further assume that \(\alpha _n(f)\) is uniformly bounded by some fixed error \(\varepsilon \in (0,1)\), and we are interested in the smallest possible achievable error \(\beta _n(f)\).
The idea in [31] is to use the following variational formula for the relative entropy between P and Q (see, e.g., [45]):
$$\begin{aligned} nD(P\Vert Q)=D\big (P^{\otimes n}\Vert Q^{\otimes n}\big )=\max _{g>0}\Big \{\mathbb {E}_{P^{\otimes n}}[\log g]-\log \mathbb {E}_{Q^{\otimes n}}[g]\Big \}, \end{aligned}\tag{2}$$
where \(\mathbb {E}_{P^{\otimes n}}\) stands for the expectation with respect to the distribution \(P^{\otimes n}\), and the maximum is over functions g on \(\Omega ^n\). This formula is indeed used for g being a noisy version of f. To get this noisy version a Markov semigroup is employed.
For any function \(h:\Omega \rightarrow {\mathbb {R}}\) define
$$\begin{aligned} T_t(h):=\mathrm {e}^{-t}h+(1-\mathrm {e}^{-t})\,\mathbb {E}_P[h]. \end{aligned}\tag{3}$$
These maps define a classical version of the generalized quantum depolarizing semigroup (see Equation (17)). That is, for every \(x\in \Omega \), we have \(T_t(h)(x) = \mathrm {e}^{-t}h(x) + (1-\mathrm {e}^{-t}) \mathbb {E}_P[h]\). Then \(\{T_t:\, t\ge 0\}\) forms a semigroup that satisfies the following reverse hypercontractivity inequality [36]:
$$\begin{aligned} \Vert T_th\Vert _q\ge \Vert h\Vert _p,\qquad \forall \, 0\le q<p<1,\quad t\ge \log \frac{1-q}{1-p}, \end{aligned}\tag{4}$$
where the norms are defined with respect to the distribution P, i.e., \(\Vert h\Vert _p = \big ( \mathbb {E}_P[|h|^p] \big )^{1/p}\). Now the idea is to use (2) for \(g=T_t^{\otimes n} f\) as follows:
$$\begin{aligned} nD(P\Vert Q)\ge \mathbb {E}_{P^{\otimes n}}\big [\log T_t^{\otimes n}f\big ]-\log \mathbb {E}_{Q^{\otimes n}}\big [T_t^{\otimes n}f\big ]. \end{aligned}\tag{5}$$
Bounding the second term on the right hand side is easy. Letting \(\gamma =\left\| \frac{dP}{dQ} \right\| _\infty \) we have
$$\begin{aligned} \mathbb {E}_{Q^{\otimes n}}\big [T_t^{\otimes n}f\big ]\le \big (\mathrm {e}^{-t}+\gamma (1-\mathrm {e}^{-t})\big )^n\,\mathbb {E}_{Q^{\otimes n}}[f]\le \mathrm {e}^{(\gamma -1)nt}\,\beta _n(f), \end{aligned}\tag{6}$$
where the last inequality follows from \(\mathrm {e}^{\gamma t}-1\ge \gamma (\mathrm {e}^t-1)\) for \(\gamma \ge 1\).
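The elementary inequality quoted here can be verified in one line, and the following sketch also records the form in which it is presumably applied. The function \(\varphi \) is a name introduced only for this check, and the convex combination \(\mathrm {e}^{-t}+\gamma (1-\mathrm {e}^{-t})\) is an assumption about how the inequality enters: it is the factor that arises when \(\mathbb {E}_P[h]\le \gamma \,\mathbb {E}_Q[h]\) is inserted into \(T_t(h)=\mathrm {e}^{-t}h+(1-\mathrm {e}^{-t})\mathbb {E}_P[h]\) for \(h\ge 0\).
$$\begin{aligned} \varphi (t)&:=\mathrm {e}^{\gamma t}-1-\gamma (\mathrm {e}^{t}-1),\qquad \varphi (0)=0,\qquad \varphi '(t)=\gamma \big (\mathrm {e}^{\gamma t}-\mathrm {e}^{t}\big )\ge 0\quad \text {for } t\ge 0,\ \gamma \ge 1,\\ \mathrm {e}^{-t}+\gamma (1-\mathrm {e}^{-t})&=\mathrm {e}^{-t}\big (1+\gamma (\mathrm {e}^{t}-1)\big )\le \mathrm {e}^{-t}\,\mathrm {e}^{\gamma t}=\mathrm {e}^{(\gamma -1)t}. \end{aligned}$$
Note that \(\gamma =\Vert \text {d}P/\text {d}Q\Vert _\infty \ge 1\) always holds because P and Q are both probability distributions, so the hypothesis \(\gamma \ge 1\) is automatic.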
Now we need to bound the first term in terms of \(\alpha _n(f)\). The crucial observation here is that
$$\begin{aligned} \mathbb {E}_{P^{\otimes n}}\big [\log T_t^{\otimes n}f\big ]=\log \big \Vert T_t^{\otimes n}f\big \Vert _0. \end{aligned}\tag{7}$$
It is then natural to use the reverse hypercontractivity inequality (4) for \(q=0\). In fact, using the tensorization property, that (4) also holds for \(T_t^{\otimes n}\), we have
where the second line follows from the reverse hypercontractivity inequality, the third line follows from the fact that \(T_t^{\otimes n}(f)\) takes values in [0, 1], and the last line follows from \(\mathrm {e}^{-t}\ge 1-t\). Now using (6) and (8) in (5), using \(\alpha _n(f)\le \varepsilon \) and optimizing over the choice of \(t> 0\) we arrive at
In the present work, we show that the above analysis can be carried over to the quantum setting. Let us explain the similarities with the classical case as well as the difficulties we face in doing this. Firstly, a variational expression for the quantum relative entropy similar to (2) is already known [44]. Secondly, the semigroup (3) is easily generalized to the generalized depolarizing semigroup in the quantum case. Thirdly, the reverse hypercontractivity inequality (4) is derived in the quantum case from our theory of quantum reverse hypercontractivity as well as our quantum Stroock–Varopoulos inequality. However, we need this inequality in its n-fold tensor product form, for which we use our tensorization-type result. Also, generalizing the computations in (6) to the quantum case is straightforward. Nevertheless, we face a problem in the next step: the crucial identity (7) no longer holds in the non-commutative case. Indeed, as far as we know, non-commutative \(L_p\)-norms do not possess a closed expression in the limit \(p\rightarrow 0\). To get around this problem, instead of a variational formula similar to (2), we use our quantum reverse hypercontractivity inequality together with a variational formula for p-norms (obtained from the reverse Hölder inequality). Then we derive an inequality of the form (9) by taking an appropriate limit.
Section 5 contains our results on applications of reverse hypercontractivity inequalities to strong converse of the quantum hypothesis testing as well as the classical-quantum channel coding problems.
2 Notations
For a Hilbert space \(\mathcal {H}\), the algebra of (bounded) linear operators acting on \(\mathcal {H}\) is denoted by \(\mathcal {B}(\mathcal {H})\). The adjoint of \(X\in \mathcal {B}(\mathcal {H})\) is denoted by \(X^\dagger \) and
The subspace of self-adjoint operators is denoted by \(\mathcal {B}_{sa}(\mathcal {H}) \subset \mathcal {B}(\mathcal {H})\). When \(X\in \mathcal {B}_{sa}(\mathcal {H})\) is positive semi-definite (positive definite) we represent it by \(X\ge 0\) (\(X> 0\)). We let \(\mathcal {P}(\mathcal {H})\) be the cone of positive semi-definite operators on \(\mathcal {H}\) and \(\mathcal {P}_{+}(\mathcal {H}) \subset \mathcal {P}(\mathcal {H})\) the set of (strictly) positive operators. Further, let \(\mathcal {D}(\mathcal {H}):=\lbrace \rho \in \mathcal {P}(\mathcal {H})\mid \text {tr}\rho =1\rbrace \) denote the set of density operators (or states) on \(\mathcal {H}\), and \(\mathcal {D}_+(\mathcal {H}):=\mathcal {D}(\mathcal {H})\cap \mathcal {P}_+(\mathcal {H})\) denote the subset of faithful states. We denote the support of an operator A by \({\mathrm {supp}}(A)\). We let \(\mathbb {I}\in \mathcal {B}(\mathcal {H})\) be the identity operator on \(\mathcal {H}\), and \(\mathcal {I}:\mathcal {B}(\mathcal {H})\mapsto \mathcal {B}(\mathcal {H})\) be the identity superoperator acting on \(\mathcal {B}(\mathcal {H})\).
We sometimes deal with tensor products of Hilbert spaces. In this case, in order to keep track of subsystems, it is appropriate to label the Hilbert spaces as \(\mathcal {H}_A, \mathcal {H}_B\) etc. We also denote \(\mathcal {H}_A\otimes \mathcal {H}_B\) by \(\mathcal {H}_{AB}\). Then the subscript in \(X_{AB}\) indicates that it belongs to \(\mathcal {B}(\mathcal {H}_{AB})\). We also use \(\mathcal {H}^{\otimes n} = \mathcal {H}_{A_1}\otimes \cdots \otimes \mathcal {H}_{A_n}\) where \(\mathcal {H}_{A_i}\)’s are isomorphic Hilbert spaces. Moreover, for any \(S\subseteq \{1, \dots , n\}\) we use the shorthand notations \(A_S{=A^S=\{A_j: \, j\in S \}}\), and \(\mathcal {H}_{A_S}\) for \(\bigotimes _{j\in S}\mathcal {H}_{A_j}\). We also identify \(A_{\{1, \dots , n\}}\) with \(A^n\).
A superoperator \(\Phi :\mathcal {B}(\mathcal {H})\rightarrow \mathcal {B}(\mathcal {H})\) is called positive if \(\Phi (X)\ge 0\) whenever \(X\ge 0\). It is called completely positive if \(\mathcal {I}\otimes \Phi \) is positive where \(\mathcal {I}:\mathcal {B}(\mathcal {H}')\rightarrow \mathcal {B}(\mathcal {H}')\) is the identity superoperator associated to an arbitrary Hilbert space \(\mathcal {H}'\). Observe that a positive superoperator \(\Phi \) is hermitian-preserving meaning that \(\Phi (X^\dagger ) =\Phi (X)^\dagger \). A superoperator is called unital if \(\Phi ({\mathbb {I}})={\mathbb {I}}\), and is called trace-preserving if \(\text {tr}\,\Phi (X)=\text {tr}X\) for all X. The adjoint of \(\Phi \), denoted by \(\Phi ^*\) is defined with respect to the Hilbert–Schmidt inner product:
$$\begin{aligned} \text {tr}\big (X^\dagger \,\Phi (Y)\big )=\text {tr}\big (\Phi ^*(X)^\dagger \,Y\big ). \end{aligned}\tag{10}$$
Note that the adjoint of a unital map is trace-preserving and vice versa.
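This duality between unitality and trace preservation follows in two lines from the defining relation of the adjoint. The sketch below writes \(\langle A, B\rangle :=\text {tr}(A^\dagger B)\) for the Hilbert–Schmidt inner product (a shorthand used here only for this check) and uses \(\langle A, \Phi (B)\rangle =\langle \Phi ^*(A), B\rangle \), which by conjugate symmetry also gives \(\langle \Phi (A), B\rangle =\langle A, \Phi ^*(B)\rangle \):
$$\begin{aligned} \Phi ({\mathbb {I}})={\mathbb {I}}\;&\Longrightarrow \;\text {tr}\,\Phi ^*(X)=\langle {\mathbb {I}}, \Phi ^*(X)\rangle =\langle \Phi ({\mathbb {I}}), X\rangle =\langle {\mathbb {I}}, X\rangle =\text {tr}\,X,\\ \text {tr}\,\Phi ^*(X)=\text {tr}\,X\ \ \forall X\;&\Longrightarrow \;\langle \Phi ({\mathbb {I}})-{\mathbb {I}}, X\rangle =\text {tr}\,\Phi ^*(X)-\text {tr}\,X=0\ \ \forall X\;\Longrightarrow \;\Phi ({\mathbb {I}})={\mathbb {I}}. \end{aligned}$$
The same argument with the roles of \(\Phi \) and \(\Phi ^*\) exchanged covers the "vice versa" direction.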
2.1 Non-commutative Weighted \(L_p\)-Spaces
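Throughout the paper we fix \(\sigma \in {\mathcal {D}}_+(\mathcal {H})\) to be a positive definite density matrix. We define
$$\begin{aligned} \Gamma _\sigma (X):=\sigma ^{\frac{1}{2}}X\sigma ^{\frac{1}{2}}, \end{aligned}$$
so that \(\Gamma _\sigma ^{s}(X)=\sigma ^{\frac{s}{2}}X\sigma ^{\frac{s}{2}}\) for real s. Then \(\mathcal {B}(\mathcal {H})\) is equipped with the inner product
$$\begin{aligned} \langle X, Y\rangle _\sigma :=\text {tr}\big (\Gamma _\sigma (X^\dagger )\,Y\big )=\text {tr}\big (\sigma ^{\frac{1}{2}}X^\dagger \sigma ^{\frac{1}{2}}Y\big ). \end{aligned}$$
Note that if \(X, Y\ge 0\) then \(\langle X, Y\rangle _\sigma \ge 0\). This inner product induces a norm on \(\mathcal {B}(\mathcal {H})\):
$$\begin{aligned} \Vert X\Vert _{2, \sigma }:=\sqrt{\langle X, X\rangle _\sigma }. \end{aligned}\tag{11}$$
This 2-norm can be generalized for other values of p. For every \(p\in {\mathbb {R}}\setminus \{0\}\) we define
$$\begin{aligned} \Vert X\Vert _{p, \sigma }:=\Big (\text {tr}\Big [\big |\Gamma _\sigma ^{\frac{1}{p}}(X)\big |^p\Big ]\Big )^{\frac{1}{p}}=\big \Vert \Gamma _\sigma ^{\frac{1}{p}}(X)\big \Vert _p, \end{aligned}$$
where
$$\begin{aligned} \Vert X\Vert _p:=\big (\text {tr}\,|X|^p\big )^{\frac{1}{p}} \end{aligned}$$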
denotes the (generalized) Schatten norm of order p. In particular, if \(X> 0\) then \(\Vert X\Vert _{p, \sigma }^p=\text {tr}\big [\Gamma _\sigma ^{1/p}(X)^p\big ]\). Note that this definition reduces to (11) when \(p=2\). The values of \(\Vert X\Vert _{p, \sigma }\) for \(p\in \{0, \pm \infty \}\) are defined in the limits. Since the function \(p\mapsto \Vert X\Vert _{p,\sigma }\) is increasing and bounded below by 0, by the monotone convergence theorem, the limit \(p\rightarrow 0\) exists but does not have a closed expression, as opposed to the classical setting (cf Equation (7)). Observe also that \(\Vert X\Vert _{p, \sigma } = \Vert X^{\dagger }\Vert _{p, \sigma }\) for all X. Moreover, \(\Vert \cdot \Vert _{p, \sigma }\) for \(1\le p\le \infty \) satisfies the triangle inequality (the Minkowski inequality) and is a norm. The dual of this norm is \(\Vert \cdot \Vert _{{\hat{p}}, \sigma }\) where \({\hat{p}}\) is the Hölder conjugate of p given by
$$\begin{aligned} {\hat{p}}:=\frac{p}{p-1}, \end{aligned}\tag{13}$$
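where \(p>1\), and \(\hat{p}=+\infty \) for \(p=1\). Indeed, for \(1\le p\le \infty \) and arbitrary X we have [43]
$$\begin{aligned} \Vert X\Vert _{p, \sigma }=\sup _{Y}\Big \{\big |\langle X, Y\rangle _\sigma \big |:\ \Vert Y\Vert _{{\hat{p}}, \sigma }\le 1\Big \}. \end{aligned}$$
Moreover, for \(-\infty< p<1\), \(p\ne 0\) and positive definite X we have
$$\begin{aligned} \Vert X\Vert _{p, \sigma }=\inf _{Y>0}\Big \{\langle X, Y\rangle _\sigma :\ \Vert Y\Vert _{{\hat{p}}, \sigma }\ge 1\Big \}, \end{aligned}$$
where again \({\hat{p}}\) is defined via (13).\(^{3}\) This identity is a consequence of the reverse Hölder inequality: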
Lemma 1
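(Reverse Hölder inequality). Let \(X\ge 0\) and \(Y>0\). Then, for any \(p< 1\), \(p\ne 0\), with Hölder conjugate \(\hat{p}\) we have
$$\begin{aligned} \langle X, Y\rangle _\sigma \ge \Vert X\Vert _{p, \sigma }\,\Vert Y\Vert _{{\hat{p}}, \sigma }. \end{aligned}$$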
Proof
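The proof is a direct generalization of equation (32) of [50] (see also Lemma 5 of [14]): for any \(A\ge 0\) and \(B>0\),
$$\begin{aligned} \text {tr}(AB)\ge \Vert A\Vert _p\,\Vert B\Vert _{{\hat{p}}}. \end{aligned}$$
From there, choosing \(A:=\Gamma _{\sigma }^{\frac{1}{p}}(X)\) and \(B:=\Gamma _\sigma ^{\frac{1}{\hat{p}}}(Y)\), and using \(\frac{1}{p}+\frac{1}{{\hat{p}}}=1\),
$$\begin{aligned} \langle X, Y\rangle _\sigma =\text {tr}(AB)\ge \Vert A\Vert _p\,\Vert B\Vert _{{\hat{p}}}=\Vert X\Vert _{p, \sigma }\,\Vert Y\Vert _{{\hat{p}}, \sigma }. \end{aligned}$$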
\(\square \)
Another property of \(\Vert \cdot \Vert _{p, \sigma }\) for \(-\infty \le p<1\) is the reverse Minkowski inequality. As mentioned above, when \(p\ge 1\), the triangle inequality is satisfied due to the Minkowski inequality. When \(p<1\) we have the inequality in the reverse direction:
$$\begin{aligned} \Vert X+Y\Vert _{p, \sigma }\ge \Vert X\Vert _{p, \sigma }+\Vert Y\Vert _{p, \sigma },\qquad X, Y>0. \end{aligned}$$
Again this inequality in the special case of \(\sigma \) being the completely mixed state is proven in [14] but the generalization to arbitrary \(\sigma \) is immediate.
For arbitrary p, q define the power operator by
$$\begin{aligned} I_{q, p}(X):=\Gamma _\sigma ^{-\frac{1}{q}}\Big (\big |\Gamma _\sigma ^{\frac{1}{p}}(X)\big |^{\frac{p}{q}}\Big ). \end{aligned}$$
Here are some immediate properties of the power operator.
Proposition 2
[26, 43] For all \(q,r,p\in (-\infty ,\infty )\backslash \{0\}\) and \(X\in \mathcal {B}(\mathcal {H})\):
(i) \(\Vert I_{q, p}(X)\Vert _{q, \sigma }^q =\Vert X\Vert _{p, \sigma }^p\). In particular we have \(\Vert I_{p, p}(X)\Vert _{p, \sigma } = \Vert X\Vert _{p, \sigma }\).
(ii) \(I_{q, r}\circ I_{r, p} = I_{q, p}\).
(iii) For \(X\ge 0\) we have \(I_{p, p}(X)=X\).
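These three properties can be checked directly. The sketch below assumes the standard definitions \(\Gamma _\sigma ^{s}(X)=\sigma ^{s/2}X\sigma ^{s/2}\) and \(I_{q, p}(X)=\Gamma _\sigma ^{-1/q}\big (|\Gamma _\sigma ^{1/p}(X)|^{p/q}\big )\), and, where negative exponents occur, that \(\Gamma _\sigma ^{1/p}(X)\) is invertible:
$$\begin{aligned} \text {(i)}&\quad \Gamma _\sigma ^{\frac{1}{q}}\big (I_{q, p}(X)\big )=\big |\Gamma _\sigma ^{\frac{1}{p}}(X)\big |^{\frac{p}{q}}\;\Longrightarrow \;\Vert I_{q, p}(X)\Vert _{q, \sigma }^q=\text {tr}\Big [\big |\Gamma _\sigma ^{\frac{1}{p}}(X)\big |^{p}\Big ]=\Vert X\Vert _{p, \sigma }^p,\\ \text {(ii)}&\quad I_{q, r}\big (I_{r, p}(X)\big )=\Gamma _\sigma ^{-\frac{1}{q}}\Big (\Big |\big |\Gamma _\sigma ^{\frac{1}{p}}(X)\big |^{\frac{p}{r}}\Big |^{\frac{r}{q}}\Big )=\Gamma _\sigma ^{-\frac{1}{q}}\Big (\big |\Gamma _\sigma ^{\frac{1}{p}}(X)\big |^{\frac{p}{q}}\Big )=I_{q, p}(X),\\ \text {(iii)}&\quad X\ge 0\;\Longrightarrow \;\Gamma _\sigma ^{\frac{1}{p}}(X)\ge 0\;\Longrightarrow \;I_{p, p}(X)=\Gamma _\sigma ^{-\frac{1}{p}}\big (\Gamma _\sigma ^{\frac{1}{p}}(X)\big )=X. \end{aligned}$$
In (ii) the inner absolute value is redundant because \(|\Gamma _\sigma ^{1/p}(X)|^{p/r}\) is already positive, and \(\Gamma _\sigma ^{1/r}\circ \Gamma _\sigma ^{-1/r}\) is the identity.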
2.2 Entropy
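For a given \(\sigma \in \mathcal {D}_+(\mathcal {H})\) and arbitrary \(p\ne 0\) we define the entropy function\(^{4}\) for \(X> 0\) by
$$\begin{aligned} \text {Ent}_{p, \sigma }(X):=\text {tr}\Big [\big (\Gamma _\sigma ^{\frac{1}{p}}(X)\big )^p\cdot \log \big (\Gamma _\sigma ^{\frac{1}{p}}(X)\big )^p\Big ]-\text {tr}\Big [\big (\Gamma _\sigma ^{\frac{1}{p}}(X)\big )^p\cdot \log \sigma \Big ]-\Vert X\Vert _{p, \sigma }^p\cdot \log \Vert X\Vert _{p, \sigma }^p. \end{aligned}$$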
As usual, the entropy function for \(p\in \{0, \pm \infty \}\) is defined in the limit.
Remark 1
When \(p> 0\), in the definition of the entropy we can take X to be positive semi-definite. However, when \(p<0\), we need to consider X to be positive definite in order to avoid difficulties. For this reason, in the rest of the paper we state our definitions and results for positive definite X, keeping in mind that when \(p, q>0 \) they can easily be generalized to positive semi-definite X (say, by taking an appropriate limit).
The significance of the entropy function comes from its relation to the derivative of the p-norm.
Proposition 3
[26, 43] For a differentiable operator valued function \(p\mapsto X_p\) we have, for any \(p\in \mathbb {R}\backslash \{0\}\):
Here \(\gamma \) is given by
where \(Z_p := \frac{\text {d}}{\text {d}p}X_p\).
We will be using two special cases of this proposition. First, if \(X_p> 0\) for all p, we have
Second, if \(X_p=X\) is independent of p we have
We will also use the following properties of the entropy function that are easy to verify.
Proposition 4
[26]
(i) \(\text {Ent}_{p, \sigma }(I_{p, 2}(X)) = \text {Ent}_{q, \sigma }(I_{q, 2}(X))\) for all \(p, q\in \mathbb {R}\backslash \{0\}\) and \(X\in \mathcal {B}(\mathcal {H})\).
(ii) \(\text {Ent}_{p, \sigma }(cX) = c^p \text {Ent}_{p, \sigma }(X)\) for all \(X> 0\) and constants \(c> 0\).
(iii) For any density matrix \(\rho \) we have
$$\begin{aligned} \text {Ent}_{2, \sigma }\big (\Gamma _\sigma ^{-\frac{1}{2}}(\sqrt{\rho })\big ) = D(\rho \Vert \sigma ), \end{aligned}$$
where \(D(\rho \Vert \sigma ) = \text {tr}(\rho \log \rho ) - \text {tr}(\rho \log \sigma )\) is Umegaki’s relative entropy.
(iv) For any density matrix \(\rho \) we have
$$\begin{aligned} \text {Ent}_{1, \sigma }\big (\Gamma _\sigma ^{-1}(\rho )\big ) = D(\rho \Vert \sigma ). \end{aligned}$$
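Parts (ii) and (iv) can be verified by hand. The sketch below assumes that the entropy function has the form \(\text {Ent}_{p, \sigma }(X)=\text {tr}[A\log A]-\text {tr}[A\log \sigma ]-\text {tr}[A]\log \text {tr}[A]\) with the shorthand \(A:=\big (\Gamma _\sigma ^{1/p}(X)\big )^p\) (so that \(\text {tr}\,A=\Vert X\Vert _{p, \sigma }^p\)); the symbol A is introduced only for this check. Replacing X by cX replaces A by \(c^pA\), and therefore
$$\begin{aligned} \text {Ent}_{p, \sigma }(cX)&=c^p\,\text {tr}\big [A(\log A+\log c^p)\big ]-c^p\,\text {tr}[A\log \sigma ]-c^p\,\text {tr}[A]\big (\log \text {tr}[A]+\log c^p\big )=c^p\,\text {Ent}_{p, \sigma }(X),\\ \text {Ent}_{1, \sigma }\big (\Gamma _\sigma ^{-1}(\rho )\big )&=\text {tr}[\rho \log \rho ]-\text {tr}[\rho \log \sigma ]-1\cdot \log 1=D(\rho \Vert \sigma ). \end{aligned}$$
In the first line the two \(\log c^p\) contributions cancel; in the second, \(p=1\) and \(X=\Gamma _\sigma ^{-1}(\rho )\) give \(A=\rho \) with unit trace. Part (iii) follows in the same way with \(p=2\), since \(\big (\Gamma _\sigma ^{1/2}\Gamma _\sigma ^{-1/2}(\sqrt{\rho })\big )^2=\rho \).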
Corollary 5
(a) For all \(X>0\) and arbitrary \(p\in \mathbb {R}\backslash \{0\}\) we have \(\text {Ent}_{p, \sigma }(X)\ge 0\).
(b) For all \(X>0\), the map \(p\mapsto \Vert X\Vert _{p, \sigma }\) is non-decreasing on \(\mathbb {R}\).
(c) \(X\mapsto \text {Ent}_{1, \sigma }(X)\) is a convex function on positive semi-definite matrices.
Proof
(a) By part (i) of the previous proposition it suffices to prove the corollary for \(p=1\). Moreover, by part (ii) we may assume that X is of the form \(X=\Gamma _\sigma ^{-1}(\rho )\) for some density matrix \(\rho \). Then by part (iv) we have \(\text {Ent}_{1, \sigma }(X) = D(\rho \Vert \sigma )\ge 0\).
(b) By (a) both \(\text {Ent}_{p, \sigma }(I_{p,p}(X))\) and \(\text {Ent}_{p, \sigma }(I_{p,p}(X^\dagger ))\) are non-negative. Thus using (16) the derivative of \(p\mapsto \Vert X\Vert _{p, \sigma }\) is non-negative, and this function is non-decreasing.
(c) This is a direct consequence of the joint convexity of \((\rho ,\sigma )\mapsto D(\rho \Vert \sigma )\) (see e.g., [54]). \(\quad \square \)
2.3 Quantum Markov Semigroups
A quantum Markov semigroup (QMS) is the basic model for the evolution of an open quantum system in the Markovian regime. Such a quantum Markov semigroup (in the Heisenberg picture) is a set \(\{\Phi _t:\, t\ge 0\}\) of completely positive unital superoperators \(\Phi _t: \mathcal {B}(\mathcal {H})\rightarrow \mathcal {B}(\mathcal {H})\) of the form
$$\begin{aligned} \Phi _t=\mathrm {e}^{-t\mathcal {L}}, \end{aligned}$$
where \(\mathcal {L}: \mathcal {B}(\mathcal {H})\rightarrow \mathcal {B}(\mathcal {H})\) is a superoperator called the Lindblad generator of the QMS. The general form of such a Lindblad generator is characterized in [20, 30]. We note that \(\Phi _0=\mathcal {I}\) and \(\Phi _{t+s}=\Phi _s\circ \Phi _t\). Moreover, for any \(X\in \mathcal {B}(\mathcal {H})\) we have
$$\begin{aligned} \frac{\text {d}}{\text {d}t}\Phi _t(X)=-\mathcal {L}\circ \Phi _t(X)=-\Phi _t\circ \mathcal {L}(X). \end{aligned}$$
In particular, since \(\Phi _t\) is assumed to be unital, we have
$$\begin{aligned} \mathcal {L}({\mathbb {I}})=0. \end{aligned}$$
The dual of \(\mathcal {L}\) generates the associated QMS in the Schrödinger picture: \(\Phi _t^* = \mathrm {e}^{-t\mathcal {L}^*}\) where \(\mathcal {L}^*\) is the adjoint of \(\mathcal {L}\) with respect to the Hilbert–Schmidt inner product defined in (10). Since \(\mathcal {L}\) is not full-rank, there exists some non-zero \(\sigma \) in the kernel of \(\mathcal {L}^*\) as well. Then \(\sigma \) is an invariant of the semigroup \(\{\Phi _t^*: t\ge 0\}\), i.e., \(\Phi _t^*(\sigma ) = \sigma \) for all \(t\ge 0\). Throughout the paper we assume that such a \(\sigma \) is unique (up to scaling) and full-rank. Then it can be proven that \(\sigma \) is a density matrix.\(^{5}\) Thus by the above uniqueness and full-rankness assumptions, \(\{\Phi _t^*:\, t\ge 0\}\) admits a unique invariant state \(\sigma \) in \(\mathcal {D}_+(\mathcal {H})\). We call such a QMS primitive. Observe that for a primitive QMS the identity operator \({\mathbb {I}}\) is the unique (up to scaling) element in the kernel of \(\mathcal {L}\).
We say that the QMS is \(\sigma \)-reversible or satisfies the detailed balanced condition with respect to some \(\sigma \in \mathcal {D}_+(\mathcal {H})\) if
$$\begin{aligned} \Gamma _\sigma \circ \mathcal {L}=\mathcal {L}^*\circ \Gamma _\sigma . \end{aligned}$$
From this equation and \(\mathcal {L}({\mathbb {I}})=0\) it is clear that
$$\begin{aligned} \mathcal {L}^*(\sigma )=0, \end{aligned}$$
and that \(\sigma \) is a fixed point of \(\Phi _t^*\). Therefore, if the QMS is primitive and \(\sigma \)-reversible, then \(\sigma \) would be the unique invariant state of \(\{\Phi _t^*:\, t\ge 0\}\).
We will frequently use the following immediate consequence of reversibility.
Lemma 6
\(\mathcal {L}\) is \(\sigma \)-reversible if and only if both \(\mathcal {L}\) and \(\Phi _t\) are self-adjoint with respect to the inner product \(\langle \cdot , \cdot \rangle _\sigma \), which means that for all \(X, Y\in \mathcal {B}(\mathcal {H})\) we have
$$\begin{aligned} \langle X, \mathcal {L}(Y)\rangle _\sigma =\langle \mathcal {L}(X), Y\rangle _\sigma ,\qquad \langle X, \Phi _t(Y)\rangle _\sigma =\langle \Phi _t(X), Y\rangle _\sigma . \end{aligned}$$
A primitive QMS with the unique invariant state \(\sigma \in \mathcal {D}_+(\mathcal {H})\) is called p-contractive if it is a contraction under the p-norm, that is, for all \(t\ge 0\) and \(X> 0\) we have
$$\begin{aligned} \Vert \Phi _t(X)\Vert _{p, \sigma }\le \Vert X\Vert _{p, \sigma }. \end{aligned}$$
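It is called reverse p-contractive if for all \(t\ge 0\) and \(X>0\)
$$\begin{aligned} \Vert \Phi _t(X)\Vert _{p, \sigma }\ge \Vert X\Vert _{p, \sigma }. \end{aligned}$$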
We say that the QMS is contractive if it is p-contractive for all \(p\ge 1\) and reverse p-contractive for \(p<1\).
Two remarks are in order. Firstly, as mentioned before, when \(p>0\) in the above definition we may safely take \(X\ge 0\) (instead of \(X>0\)). For uniformity of presentation we prefer to take \(X>0\) in order to jointly consider the cases \(p>0\) and \(p\le 0\) in the definitions. Of course in the former case by taking an appropriate limit, a contractivity inequality for \(X\ge 0\) can be derived once we have one for \(X>0\). Secondly, in the above definition we restrict to positive definite (or positive semidefinite) X since here \(\Phi _t\) is a completely positive map, and the superoperator norm of completely positive maps (at least for \(p\ge 1\)) is optimized over positive semidefinite operators (see e.g. [17] and references therein). The proof of the following proposition is postponed to Appendix A.
Proposition 7
(i) Any primitive QMS is (reverse) p-contractive for \(p\in (-\infty , -1]\cup [1/2, +\infty )\).
(ii) Any primitive QMS whose unique invariant state is \(\sigma ={\mathbb {I}}/d\), the completely mixed state, is (reverse) p-contractive for all p.
The reader familiar with the notion of sandwiched p-Rényi divergence [37, 52] would notice that p-contractivity is related to the data processing inequality of sandwiched p-Rényi divergences [5], which is known to hold [5, 19, 37] for \(p\ge 1/2\). In Appendix A we give a proof of part (i) for the range \(p\in (-\infty , -1]\cup [1/2, 1)\) based on new ideas which may be of independent interest. Moreover, later in Corollary 15, under a stronger assumption than primitivity we will prove (reverse) p-contractivity for all p.
An important example of classical semigroups is generated by the map \(f\mapsto f-\mathbb {E}f\), where the expectation is with respect to some fixed distribution. This generator is sometimes called the simple generator [36]. The quantum analog of simple generators is
$$\begin{aligned} \mathcal {L}(X):=X-\text {tr}(\sigma X)\,{\mathbb {I}}, \end{aligned}$$
for some positive definite density matrix \(\sigma \). Observe that \(\mathcal {L}\) is primitive, and \(\mathcal {L}^*(X) =X-\text {tr}(X)\sigma \) satisfies the detailed balanced condition with respect to \(\sigma \). The quantum Markov semigroup associated to this Lindblad generator is
$$\begin{aligned} \Phi _t(X)=\mathrm {e}^{-t}X+(1-\mathrm {e}^{-t})\,\text {tr}(\sigma X)\,{\mathbb {I}}. \end{aligned}\tag{17}$$
In the special case where \(\sigma \) is the completely mixed state, \(\Phi _t\) and \(\Phi _t^*\) coincide and become depolarizing channels. Indeed, (17) is a generalized depolarizing channel in the Heisenberg picture.
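The semigroup law and the generator of this example can be checked by hand. The sketch below assumes the form \(\Phi _t(X)=\mathrm {e}^{-t}X+(1-\mathrm {e}^{-t})\text {tr}(\sigma X){\mathbb {I}}\) for the semigroup (17), which is the non-commutative counterpart of the classical formula \(T_t(h)=\mathrm {e}^{-t}h+(1-\mathrm {e}^{-t})\mathbb {E}_P[h]\) quoted in Sect. 1.2. Since \(\text {tr}(\sigma )=1\), we have \(\text {tr}(\sigma \Phi _t(X))=\text {tr}(\sigma X)\), and hence
$$\begin{aligned} \Phi _s\big (\Phi _t(X)\big )&=\mathrm {e}^{-s}\big [\mathrm {e}^{-t}X+(1-\mathrm {e}^{-t})\text {tr}(\sigma X){\mathbb {I}}\big ]+(1-\mathrm {e}^{-s})\,\text {tr}(\sigma X)\,{\mathbb {I}}\\&=\mathrm {e}^{-(s+t)}X+\big (1-\mathrm {e}^{-(s+t)}\big )\text {tr}(\sigma X)\,{\mathbb {I}}=\Phi _{s+t}(X),\\ \frac{\text {d}}{\text {d}t}\Phi _t(X)\Big |_{t=0}&=-X+\text {tr}(\sigma X)\,{\mathbb {I}}=-\mathcal {L}(X). \end{aligned}$$
The same computation in the Schrödinger picture, with \(\Phi _t^*(\rho )=\mathrm {e}^{-t}\rho +(1-\mathrm {e}^{-t})\text {tr}(\rho )\sigma \), shows \(\Phi _t^*(\sigma )=\sigma \), so \(\sigma \) is indeed invariant.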
Having two Lindblad generators \(\mathcal {L}\) and \(\mathcal {K}\) associated to two semigroups \(\{\Phi _t:\, t\ge 0\}\) and \(\{\Psi _t:\, t\ge 0\}\), respectively, we may consider a new Lindblad generator \(\mathcal {L}\otimes \mathcal {I}+\mathcal {I}\otimes \mathcal {K}\). This Lindblad generator generates the semigroup \(\{\Phi _t\otimes \Psi _t:\, t\ge 0\}\). Moreover, letting
$$\begin{aligned} {\widehat{\mathcal {L}}}_i:=\mathcal {I}^{\otimes (i-1)}\otimes \mathcal {L}\otimes \mathcal {I}^{\otimes (n-i)}, \end{aligned}$$
we have
$$\begin{aligned} \Phi _t^{\otimes n}=\mathrm {e}^{-t\sum _{i=1}^n{\widehat{\mathcal {L}}}_i}. \end{aligned}$$
Note that, if \(\mathcal {L}\) is primitive and reversible with respect to \(\sigma \), then \(\sum _{i=1}^n {\widehat{\mathcal {L}}}_i\) is also primitive and reversible with respect to \(\sigma ^{\otimes n}\).
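The statement that \(\mathcal {L}\otimes \mathcal {I}+\mathcal {I}\otimes \mathcal {K}\) generates \(\{\Phi _t\otimes \Psi _t\}\) rests on the fact that the two summands commute. A short sketch, assuming \(\Phi _t=\mathrm {e}^{-t\mathcal {L}}\) and \(\Psi _t=\mathrm {e}^{-t\mathcal {K}}\):
$$\begin{aligned} (\mathcal {L}\otimes \mathcal {I})(\mathcal {I}\otimes \mathcal {K})&=\mathcal {L}\otimes \mathcal {K}=(\mathcal {I}\otimes \mathcal {K})(\mathcal {L}\otimes \mathcal {I}),\\ \mathrm {e}^{-t(\mathcal {L}\otimes \mathcal {I}+\mathcal {I}\otimes \mathcal {K})}&=\mathrm {e}^{-t\,\mathcal {L}\otimes \mathcal {I}}\;\mathrm {e}^{-t\,\mathcal {I}\otimes \mathcal {K}}=\big (\mathrm {e}^{-t\mathcal {L}}\otimes \mathcal {I}\big )\big (\mathcal {I}\otimes \mathrm {e}^{-t\mathcal {K}}\big )=\Phi _t\otimes \Psi _t. \end{aligned}$$
Iterating the argument over n tensor factors, each acted on by a copy of \(\mathcal {L}\), gives the n-fold version for \(\Phi _t^{\otimes n}\).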
2.4 Dirichlet Form
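We now define the Dirichlet form\(^{6}\) associated to a QMS with generator \(\mathcal {L}\) by
$$\begin{aligned} \mathcal {E}_{p, \mathcal {L}}(X):=\frac{p{\hat{p}}}{4}\,\big \langle I_{{\hat{p}}, p}(X), \mathcal {L}(X)\big \rangle _\sigma , \end{aligned}$$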
where \({\hat{p}}\) is the Hölder conjugate of p. Verification of the following properties of the Dirichlet form is easy.
Proposition 8
(i) \(\mathcal {E}_{{\hat{p}}, \mathcal {L}}(I_{{\hat{p}}, 2}(X)) =\mathcal {E}_{p, \mathcal {L}}(I_{p, 2}(X))\) for all \(p\in \mathbb {R}\backslash \{0\} \) and \(X\in \mathcal {B}(\mathcal {H})\).
(ii) \(\mathcal {E}_{ p, \mathcal {L}}(cX)=c^p\mathcal {E}_{p, \mathcal {L}}(X)\) for \(X\ge 0\) and constant \(c\ge 0\).
(iii) \(\mathcal {E}_{2, \mathcal {L}} (X) = \langle X, \mathcal {L}(X)\rangle _\sigma \) for all \(X> 0\).
(iv) \(\mathcal {E}_{1, \mathcal {L}}(X) = \frac{1}{4} \text {tr}\left[ \Gamma _\sigma \big (\mathcal {L}(X)\big )\cdot \big (\log \Gamma _\sigma (X) - \log \sigma \big ) \right] .\)
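Parts (ii)–(iv) can be sketched as follows, under the assumption that the Dirichlet form is \(\mathcal {E}_{p, \mathcal {L}}(X)=\frac{p{\hat{p}}}{4}\langle I_{{\hat{p}}, p}(X), \mathcal {L}(X)\rangle _\sigma \) with \(I_{q, p}(X)=\Gamma _\sigma ^{-1/q}\big (|\Gamma _\sigma ^{1/p}(X)|^{p/q}\big )\) and \(X>0\). The parameter \(s:=1/{\hat{p}}\) is a shorthand introduced only for the formal limit \(p\rightarrow 1\) (where \(s\rightarrow 0\)):
$$\begin{aligned} \text {(ii)}&\quad I_{{\hat{p}}, p}(cX)=c^{\frac{p}{{\hat{p}}}}I_{{\hat{p}}, p}(X)=c^{p-1}I_{{\hat{p}}, p}(X)\;\Longrightarrow \;\mathcal {E}_{p, \mathcal {L}}(cX)=c^{p-1}\cdot c\cdot \mathcal {E}_{p, \mathcal {L}}(X)=c^p\,\mathcal {E}_{p, \mathcal {L}}(X),\\ \text {(iii)}&\quad p={\hat{p}}=2,\ \ I_{2, 2}(X)=X\;\Longrightarrow \;\mathcal {E}_{2, \mathcal {L}}(X)=\langle X, \mathcal {L}(X)\rangle _\sigma ,\\ \text {(iv)}&\quad \sigma ^{-\frac{s}{2}}\,\Gamma _\sigma (X)^{s}\,\sigma ^{-\frac{s}{2}}={\mathbb {I}}+s\big (\log \Gamma _\sigma (X)-\log \sigma \big )+O(s^2),\qquad \text {tr}\big [\Gamma _\sigma (\mathcal {L}(X))\big ]=\text {tr}\big [\mathcal {L}^*(\sigma )X\big ]=0. \end{aligned}$$
In (iv), the zeroth-order term of \(\langle I_{{\hat{p}}, p}(X), \mathcal {L}(X)\rangle _\sigma \) vanishes because \(\sigma \) is invariant, and the prefactor \(\frac{p{\hat{p}}}{4}=\frac{p}{4s}\) cancels the factor s of the first-order term, which leaves \(\frac{1}{4}\text {tr}\big [\Gamma _\sigma (\mathcal {L}(X))(\log \Gamma _\sigma (X)-\log \sigma )\big ]\) as \(p\rightarrow 1\). This limit is only formal here; a rigorous version needs uniform control of the \(O(s^2)\) remainder.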
The non-negativity of the Dirichlet form is not clear from its definition. Here we prove the non-negativity assuming that the QMS is p-contractive. By Proposition 7 we then conclude the non-negativity of \(\mathcal {E}_{p, \mathcal {L}}(X)\) for \(p\notin (-1, 1/2)\). Later on, based on a stronger assumption than \(\sigma \)-reversibility, we will prove \(\mathcal {E}_{p, \mathcal {L}}(X)\ge 0\) for all values of p and \(X>0\).
Proposition 9
Suppose that \(\mathcal {L}\) generates a QMS that is primitive and \(\sigma \in \mathcal {D}_+(\mathcal {H})\) is its unique invariant state. Let \(p\in \mathbb {R}\backslash \{0\}\). If the QMS is (reverse) p-contractive, then \(\mathcal {E}_{p, \mathcal {L}}(X)\ge 0\) for all \(X> 0\).
Proof
Define
By assumption of (reverse) p-contractivity, for all \(t\ge 0\) we have \(g(t)\le 0\). We note that \(g(0)=0\). Therefore, \(g'(0)\le 0\). We compute
This gives \(\mathcal {E}_{p, \mathcal {L}}(X)\ge 0\). \(\quad \square \)
2.5 Hypercontractivity and Logarithmic-Sobolev Inequalities
We showed in Proposition 7 that \(\Phi _t\) belonging to a \(\sigma \)-reversible QMS is contractive, at least for certain values of p. That is, \(\Vert \Phi _t(X)\Vert _{p, \sigma }\) is bounded (from above or below depending on whether \(p\ge 1\) or \(p<1\)) by \(\Vert X\Vert _{p, \sigma }\). On the other hand, by part (b) of Corollary 5, bounding \(\Vert \Phi _t(X)\Vert _{p, \sigma }\) by \(\Vert X\Vert _{q, \sigma }\) when \(1\le q<p\) or \(p<q<1\) is a stronger inequality than contractivity. Such inequalities are called hypercontractivity inequalities or reverse hypercontractivity inequalities depending on whether \(1\le q<p\) or \(p<q<1\) respectively. These inequalities have found a wide range of applications in the literature.
It is well-known that quantum hypercontractivity inequalities stem from quantum logarithmic-Sobolev (log-Sobolev) inequalities. They are essentially equivalent objects, so proving log-Sobolev inequalities gives hypercontractivity ones. The theory of reverse hypercontractivity inequalities has been generalized to the non-commutative case for unital semigroups in [14]. Here we generalize the theory to general QMS.
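Given a primitive Lindblad generator \(\mathcal {L}\) that is reversible with respect to a positive definite density matrix \(\sigma \) and \(p\in \mathbb {R}\backslash \{0\}\), a p-log-Sobolev inequality is an inequality of the form
$$\begin{aligned} \beta \, \text {Ent}_{p, \sigma }(X)\le \mathcal {E}_{p, \mathcal {L}}(X), \qquad \forall X> 0. \end{aligned}$$
The best constant \(\beta \) satisfying the above inequality is called the p-log-Sobolev constant and is denoted by \(\alpha _p(\mathcal {L})\). That is,
$$\begin{aligned} \alpha _p(\mathcal {L}) := \inf \frac{\mathcal {E}_{p, \mathcal {L}}(X)}{\text {Ent}_{p, \sigma }(X)}, \end{aligned}$$
where the infimum is taken over \(X> 0\) with \(\text {Ent}_{p, \sigma }(X)\ne 0\).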
By the following proposition we can restrict ourselves to log-Sobolev constants for values of \(p\in [0, 2]\).
Proposition 10
\(\alpha _p(\mathcal {L})=\alpha _{{\hat{p}}}(\mathcal {L})\) for all Lindblad generators \(\mathcal {L}\).
Proof
Identifying X with \(I_{p, 2}(Y)\), for some arbitrary \(Y> 0\), this is an immediate consequence of part (i) of Proposition 4 and part (i) of Proposition 8. \(\quad \square \)
We can now state how log-Sobolev inequalities are related to hypercontractivity and reverse hypercontractivity inequalities. The first part of the following theorem is already known [26, 43].
Theorem 11
Let \(\mathcal {L}\) be a primitive Lindblad generator that is reversible with respect to a positive definite density matrix \(\sigma \). Then the following holds:
-
(Hypercontractivity) Suppose that \(\beta _2 = \inf _{p\in [1, 2]} \alpha _p(\mathcal {L}) >0\). Then for \(1\le q\le p\) and
$$\begin{aligned} t\ge \frac{1}{4\beta _2}\log \frac{p-1}{q-1}, \end{aligned}$$(19)
we have \(\Vert \Phi _t(X)\Vert _{p, \sigma }\le \Vert X\Vert _{q, \sigma }\) for all \(X> 0\).
-
(Reverse hypercontractivity) Suppose that \(\beta _1 = \inf _{p\in {(}0, 1]} \alpha _p(\mathcal {L}) >0\). Then for \(p\le q<1\) and
$$\begin{aligned} t\ge \frac{1}{4\beta _1}\log \frac{p-1}{q-1}, \end{aligned}$$(20)
we have \(\Vert \Phi _t(X)\Vert _{p, \sigma }\ge \Vert X\Vert _{q, \sigma }\) for all \(X> 0\), where (20) is understood in the limit whenever \(p=0\) or \(q=0\).
The proof strategy of this theorem is quite standard. Here we present a proof for the sake of completeness.
Proof
It suffices to prove the theorem when \(t= \frac{1}{4\beta } \log \frac{p-1}{q-1}\) for \(\beta \) being either \(\beta _2\) or \(\beta _1\) depending on whether we prove the hypercontractivity part or the reverse hypercontractivity part. Thus, fix q and define
Define
where \(X_p:= \Phi _{t(p)}(X)> 0\). To continue the proof we compute the derivative of f(p) using Proposition 3.
where
Therefore,
Now suppose that \(q\ge 1\) and \(\beta \le \alpha _p(\mathcal {L})\) for all \(p\in [1, 2]\). Then for \(p\ge q\) we have
As a result, \(f'(p)\le 0\) for all \(p\ge q\). Since \(f(q)=0\) we conclude that \(f(p)\le 0\) for all \(p\ge q\). This gives the hypercontractivity part of the theorem.
For the reverse hypercontractivity part, assume that \(q< 1\) and \(\beta \le \alpha _p(\mathcal {L})\) for all \(p\in [0, 1]\). Then for \(p\le q\) we have
where the second inequality holds since \(p<1\), so either p or its Hölder conjugate belongs to [0, 1]. Therefore, \(f'(p)\le 0\) for all \(p\le q< 1\), and since \(f(q)=0\), \(f(p)\ge 0\) for all \(p<q\). \(\quad \square \)
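To make the two time thresholds of Theorem 11 concrete, here is a short worked instance. The parameter values are hypothetical choices made only for illustration.

$$\begin{aligned} (q, p)=(2, 4):&\quad t\ge \frac{1}{4\beta _2}\log \frac{4-1}{2-1}=\frac{\log 3}{4\beta _2}, \\ (q, p)=(\tfrac{1}{2}, -1):&\quad t\ge \frac{1}{4\beta _1}\log \frac{-1-1}{\tfrac{1}{2}-1}=\frac{\log 4}{4\beta _1}. \end{aligned}$$

In both regimes the ratio \((p-1)/(q-1)\) is at least 1, so the threshold is non-negative, and it vanishes at \(p=q\), where the inequality reduces to an equality at \(t=0\).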
3 Quantum Stroock–Varopoulos Inequality
In the previous section we developed the basic tools required to understand quantum hypercontractivity and reverse hypercontractivity inequalities and log-Sobolev inequalities. By Theorem 11, to obtain hypercontractivity and reverse hypercontractivity inequalities we need to find bounds on log-Sobolev constants in the ranges \(p\in [1, 2]\) or \(p\in [0, 1]\). Now the question is how such bounds can be found.
In the classical (commutative) case, the most relevant p-log-Sobolev constants are \(\alpha _2(\mathcal {L})\) and \(\alpha _1(\mathcal {L})\). Indeed, \(p\mapsto \alpha _p(\mathcal {L})\) is a non-increasing function on \(p\in [0, 2]\), so in Theorem 11 the parameters \(\beta _1\) and \(\beta _2\) can be replaced with \(\alpha _1(\mathcal {L})\) and \(\alpha _2(\mathcal {L})\) respectively. This result is proven via comparison of the Dirichlet forms, an inequality that is sometimes called the Stroock–Varopoulos inequality.
In this section we prove a quantum generalization of the Stroock–Varopoulos inequality, and conclude in Theorem 11 that, for strongly reversible semigroups, we can take \(\beta _p=\alpha _p(\mathcal {L})\) for \(p=1, 2\). We should point out that a quantum Stroock–Varopoulos inequality in the special case of \(\sigma \) being the completely mixed state is proven in [14]. Also, a special case of the Stroock–Varopoulos inequality (called strong \(L_p\)-regularity) for certain Lindblad generators is proven in [26, 43]. A strong \(L_p\)-regularity is also proven in [3] which we generalize to a quantum Stroock–Varopoulos inequality.
The assumption of \(\sigma \)-reversibility is not sufficient to prove the quantum Stroock–Varopoulos inequality. We need \(\mathcal {L}\) to be self-adjoint with respect to an inner product different from \(\langle \cdot , \cdot \rangle _\sigma \) defined above (see Lemma 6). In the following we first define this new inner product, state some of its properties, and then turn to our quantum Stroock–Varopoulos inequality.
3.1 The GNS Inner Product
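In what follows we use the GNS inner product \(\langle \cdot , \cdot \rangle _{1, \sigma }\) on \(\mathcal {B}(\mathcal {H})\) that is defined by [11]:
$$\begin{aligned} \langle X, Y\rangle _{1, \sigma } := \text {tr}\big (\sigma X^\dagger Y\big ). \end{aligned}$$(21)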
We note that this inner product coincides with \(\langle X, Y\rangle _\sigma = \text {tr}(\sigma ^{1/2}X^\dagger \sigma ^{1/2}Y)\) when, e.g., X and \(\sigma \) commute. But in general \(\langle \cdot , \cdot \rangle _{1, \sigma }\) is different from \(\langle \cdot , \cdot \rangle _\sigma \).
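The relation between the two inner products can be illustrated numerically. The following is a minimal NumPy sketch, assuming the GNS inner product has the standard form \(\text {tr}(\sigma X^\dagger Y)\); the state `sigma` and the test matrices are hypothetical choices, and the helper names are ours.

```python
import numpy as np

def mpow(a, s):
    """Real power of a positive definite Hermitian matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(a)
    return (v * w**s) @ v.conj().T

def gns_inner(x, y, sigma):
    # <X, Y>_{1,sigma} = tr(sigma X^dagger Y)  (assumed standard GNS form)
    return np.trace(sigma @ x.conj().T @ y)

def kms_inner(x, y, sigma):
    # <X, Y>_sigma = tr(sigma^{1/2} X^dagger sigma^{1/2} Y)
    r = mpow(sigma, 0.5)
    return np.trace(r @ x.conj().T @ r @ y)

sigma = np.diag([0.7, 0.3])                     # hypothetical full-rank state
x_comm = np.diag([2.0, -1.0])                   # commutes with sigma
x_off = np.array([[0.0, 1.0], [0.0, 0.0]])      # does not commute with sigma
y = np.array([[1.0, 2.0], [3.0, 4.0]])

# The two inner products agree when X commutes with sigma ...
same_when_commuting = np.isclose(gns_inner(x_comm, y, sigma),
                                 kms_inner(x_comm, y, sigma))
# ... and differ in general.
gns_off = gns_inner(x_off, x_off, sigma)        # equals 0.3
kms_off = kms_inner(x_off, x_off, sigma)        # equals sqrt(0.7 * 0.3)
```

For the non-commuting matrix unit the GNS value is the eigenvalue 0.3 of \(\sigma \), while the other inner product gives the geometric mean of the two eigenvalues, which makes the difference between the two explicit.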
The following lemma was first proven in [11]. We will give a proof here for the sake of completeness.
Lemma 12
Let \(\mathcal {L}\) be a Lindblad generator that is self-adjoint with respect to the inner product \(\langle \cdot , \cdot \rangle _{1, \sigma }\) defined above. Then the following hold.
-
(i)
\(\mathcal {L}\) commutes with the superoperator \(\Delta _{\sigma }:X \mapsto \sigma X\sigma ^{-1}.\)
-
(ii)
\(\mathcal {L}\) is self-adjoint with respect to the inner product \(\langle \cdot , \cdot \rangle _{\sigma }\).
Based on part (ii) of this lemma (see also Lemma 6) we say that a Lindblad generator \(\mathcal {L}\) is strongly \(\sigma \)-reversible if it is self-adjoint with respect to the inner product \(\langle \cdot , \cdot \rangle _{1, \sigma }\).
Proof
(i) Using the fact that \(\mathcal {L}(Y)^{\dagger } = \mathcal {L}(Y^{\dagger })\), for all X, Y we have
This gives \(\Delta _\sigma \circ \mathcal {L}= \mathcal {L}\circ \Delta _\sigma \).
(ii) Follows easily from (i) and the fact that
\(\square \)
The following lemma is indeed a consequence of Theorem 3.1 of [11]. Here we prefer to present a direct proof.
Lemma 13
Let \(\mathcal {L}\) be a strongly \(\sigma \)-reversible Lindblad generator. Then for every \(t\ge 0\) there are operators \(R_k\in \mathcal {B}(\mathcal {H})\) and \(\omega _k> 0\) such that \(\Delta _\sigma (R_k) = {\omega _k} R_k\),
$$\begin{aligned} \Phi _t(X) = \sum _k R_k X R_k^{\dagger }, \end{aligned}$$(22)
and \(\sum _k R_k R_k^{\dagger }=I\).
Proof
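By Lemma 12 the Lindblad generator \(\mathcal {L}\), and hence \(\Phi _t=\mathrm {e}^{-t\mathcal {L}}\), commutes with \(\Delta _\sigma \), i.e.,
$$\begin{aligned} \Phi _t\circ \Delta _\sigma = \Delta _\sigma \circ \Phi _t. \end{aligned}$$(23)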
Fix an orthonormal basis \(\{|i\rangle \}_{i=1}^d\) for the underlying Hilbert space \(\mathcal {H}=\mathcal {H}_A\) and define
where \(\mathcal {H}_B\) is isomorphic to \(\mathcal {H}_A\). It is not hard to verify that for any matrix M we have
where the transpose is with respect to the basis \(\{|i\rangle \}_{i=1}^d\).
The Choi–Jamiolkowski representation of \(\Phi _t\) is
Then using (24) it is not hard to verify that (23) translates to
That is, \(J_{AB}\) and \(\sigma _A^{-1}\otimes \sigma _B^{T}\) commute. On the other hand, \(J_{AB}\) is positive semidefinite since it is the Choi–Jamiolkowski representation of a completely positive map. Therefore, \(J_{AB}\) and \(\sigma _A^{-1}\otimes \sigma _B^{T}\) can be simultaneously diagonalized in an orthonormal basis, i.e., there exists an orthonormal basis \(\{|v_k\rangle \}_{k=1}^{d^2}\) of \(\mathcal {H}_{AB}\) such that
where \(\lambda _k\ge 0,\, \omega _k> 0\). Define the operator \(V_k\) by
Then again using (24), equation (26) translates to
Moreover, equation (25) means that
which gives
Then letting \(R_k:= \sqrt{\lambda _k} V_k\) we have \(\sigma R_k= \omega _kR_k \sigma \) and (22) holds. The other equation comes from \(\Phi _t({\mathbb {I}})={\mathbb {I}}\). \(\quad \square \)
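Lemma 13 can be made explicit for the generalized depolarizing channel \(\Phi _t(X)=\mathrm {e}^{-t}X+(1-\mathrm {e}^{-t})\text {tr}(\sigma X){\mathbb {I}}\). The sketch below is our own illustration, assuming \(\sigma \) is diagonal with hypothetical eigenvalues `s`; it uses the scaled identity together with scaled matrix units as the operators \(R_k\), and checks the three properties of the lemma numerically. The proof above produces \(d^2\) operators from a diagonalization, whereas this particular family has \(d^2+1\) elements; the lemma itself does not fix the number.

```python
import numpy as np

d, t = 3, 0.4
s = np.array([0.5, 0.3, 0.2])          # hypothetical eigenvalues of a diagonal sigma
sigma = np.diag(s)
sigma_inv = np.diag(1.0 / s)
eye = np.eye(d)

# Candidate operators R_k with their Delta_sigma-eigenvalues omega_k.
kraus = [np.exp(-t / 2) * eye]
omegas = [1.0]
for i in range(d):
    for j in range(d):
        e_ij = np.zeros((d, d))
        e_ij[i, j] = 1.0                # matrix unit |i><j|
        kraus.append(np.sqrt((1 - np.exp(-t)) * s[j]) * e_ij)
        omegas.append(s[i] / s[j])      # sigma |i><j| sigma^{-1} = (s_i / s_j) |i><j|

def phi(x):
    # Generalized depolarizing channel in the Heisenberg picture.
    return np.exp(-t) * x + (1 - np.exp(-t)) * np.trace(sigma @ x) * eye

rng = np.random.default_rng(0)
x = rng.standard_normal((d, d))

# (a) Delta_sigma(R_k) = omega_k R_k with omega_k > 0
eigen_ok = all(np.allclose(sigma @ r @ sigma_inv, w * r)
               for r, w in zip(kraus, omegas))
# (b) Phi_t(X) = sum_k R_k X R_k^dagger
match_ok = np.allclose(sum(r @ x @ r.conj().T for r in kraus), phi(x))
# (c) sum_k R_k R_k^dagger = I
unital_ok = np.allclose(sum(r @ r.conj().T for r in kraus), eye)
```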
3.2 Comparison of the Dirichlet Forms
We can now state the main result of this section.
Theorem 14
(Quantum Stroock–Varopoulos inequality). Let \(\mathcal {L}\) be a strongly \(\sigma \)-reversible Lindblad generator, which means that it is self-adjoint with respect to the inner product \(\langle \cdot , \cdot \rangle _{1, \sigma }\) defined in (21). Then for all \(X> 0\) we have
Remark 2
As mentioned above, special cases of Theorem 14 were already investigated in the literature. In the case that \(\sigma \) is the maximally mixed state, this was done in [14]. The inequality was also recently extended to the GNS-symmetric setting for the range of parameters \(p\ge 1\) and \(q=2\) in [3].
We have two proofs for this theorem. The first one, which we present here, is based on ideas in [26, 43]. The second one, which is deferred to Appendix B, is based on ideas in [3]. We present both proofs in this paper since they are different in nature and their ideas can be useful elsewhere.
First proof of Theorem 14
For any \(t\ge 0\) define the function \(h_t:{[}0,\infty )\rightarrow \mathbb {R}\) by
for \(s\in (0,\infty )\backslash \{2\}\), and \(h_t(0)=h_t(2)=\text {tr}(\Gamma _\sigma ^{1/2}(X)^2)\). Since by part (ii) of Lemma 12, \(\Phi _t=\mathrm {e}^{-t\mathcal {L}}\) is self-adjoint with respect to the inner product \(\langle \cdot , \cdot \rangle _\sigma \), we have \(h_t(2-s)=h_t(s)\), i.e., \(h_t\) is symmetric about \(s=1\). Moreover, examining the definition of \(h_t(s)\) we find that \(s\mapsto h_t(s)\) is analytic with a convergent Taylor series at \(s=1\). Then, by the symmetry around \(s=1\), all the odd-order derivatives of \(h_t\) at \(s=1\) vanish, and we have
where
Note that the above series expansion is convergent by analyticity of \(s\mapsto h_t(s)\). We claim that all the even-order derivatives of \(h_t\) at \(s=1\) are non-negative, i.e., \(c_j\ge 0\). We use Lemma 13 to verify this. Let \(R_k\)’s be operators such that
with \(\omega _k> 0\) and (22) holds. Then letting \(Y:= \Gamma _\sigma ^{1/2}(X)\) and using (29) we compute
Now diagonalizing Y in its eigenbasis: \(Y=\sum _\ell \mu _\ell | \ell \rangle \langle \ell |\), we find that
Therefore, \(h_t(s)\) is a sum of exponential functions with positive coefficients. From this expression it is clear that \(c_j\)’s as defined in Eq. 28 are all non-negative.
For \(s\in (0,\infty )\backslash \{2\}\), let us define
and extend the function \(g_t\) by continuity on \([0,\infty )\), since \(h_t\) is differentiable at 0 and at 2. From this expression it is clear that \(g_t(s)\) is non-decreasing on \([1, +\infty )\). Therefore, \(\lim _{t\rightarrow 0^+} g_t(s)/t\) is non-decreasing on \([1, +\infty )\). On the other hand, we have \(h_t(0) =\text {tr}(Y^2) = h_0(s)\). We thus can compute
Therefore
is non-decreasing on \([1, +\infty )\). Now the desired result follows once we identify 2/s with p (and \(2/(2-s)\) with \({\hat{p}}\), its Hölder conjugate). \(\quad \square \)
Here are some important consequences of the above theorem.
Corollary 15
Let \(\mathcal {L}\) be a strongly \(\sigma \)-reversible Lindblad generator. Then the following hold:
-
(i)
For all \(p\in {\mathbb {R}}\backslash \{0\}\) and \(X> 0\) we have
$$\begin{aligned} \mathcal {E}_{p, \mathcal {L}}(X)\ge 0. \end{aligned}$$
-
(ii)
The associated QMS is p-contractive for all p.
Remark 3
As mentioned before, the fact that p-Dirichlet forms are positive for \(p\in (-\infty ,- 1]\cup [+1/2,\infty )\) is a simple consequence of contraction of non-commutative weighted \(L_p\)-norms (or equivalently of the data processing inequality for sandwiched p-Rényi divergences), which follows by invariance of the state \(\sigma \) and interpolation of these spaces (see [43]). The case \(p\in (-1,+1/2)\) is much more subtle, since it is known that the data processing inequality does not hold in general in this parameter range, as opposed to the classical case. More precisely, p-contractivity of \(\Phi _t\) implies that the sandwiched p-Rényi divergence is monotone under \(\Phi _t\) [5, 19, 37]. Therefore, when \(\Phi _t\) comes from a QMS satisfying the above strong reversibility condition, sandwiched p-Rényi divergences are monotone under \(\Phi _t\) not only for \(p\ge 1/2\) but for all values of p.
Proof
(i) By Theorem 14 (and part (i) of Proposition 8) for every \(p\ne 0\) we have
Indeed, for \(p\in (0,2]\), the inequality holds by Theorem 14, and for \(p\notin [0,2]\) we additionally use Proposition 8(i). On the other hand, since the semigroup is self-adjoint with respect to \(\langle .,.\rangle _\sigma \), its generator has non-negative spectrum, so that we have \(\mathcal {E}_{2, \mathcal {L}}(X)\ge 0\). Therefore, \(\mathcal {E}_{p, \mathcal {L}}(I_{p, 2}(X))\ge 0\).
(ii) Define g(t) as in the proof of Proposition 9. By part (i) we have \(g'(t)\le 0\) for all \(t\ge 0\) and \(g(0)=0\). Therefore, \(g(t)\le 0\) for all \(t\ge 0\). This gives p-contractivity. \(\quad \square \)
The following corollary is an immediate consequence of the quantum Stroock–Varopoulos inequality as well as part (i) of Proposition 4.
Corollary 16
Let \(\mathcal {L}\) be a strongly \(\sigma \)-reversible Lindblad generator. Then \(p\mapsto \alpha _p(\mathcal {L})\) is non-increasing on [0, 2], where \(\alpha _0(\mathcal {L})\) is defined as the limit \(p\rightarrow 0\).
Now we can state an improvement over Theorem 11.
Corollary 17
Let \(\mathcal {L}\) be a strongly \(\sigma \)-reversible Lindblad generator. Then the following holds:
-
(Hypercontractivity) For \(1\le q\le p\) and
$$\begin{aligned} t\ge \frac{1}{4\alpha _2(\mathcal {L})}\log \frac{p-1}{q-1}, \end{aligned}$$(30)
we have \(\Vert \Phi _t(X)\Vert _{p, \sigma }\le \Vert X\Vert _{q, \sigma }\) for all \(X\ge 0\).
-
(Reverse hypercontractivity) For \(p\le q<1\) and
$$\begin{aligned} t\ge \frac{1}{4\alpha _1(\mathcal {L})}\log \frac{p-1}{q-1}, \end{aligned}$$(31)
we have \(\Vert \Phi _t(X)\Vert _{p, \sigma }\ge \Vert X\Vert _{q, \sigma }\) for all \(X> 0\).
Remark 4
Equation 30 was already known to be implied by the strong \(L_p\)-regularity defined by [43]. This condition, which is a special case of the Stroock–Varopoulos inequality, was recently shown in [3].
Before ending this section, we state a result that will play an important role in Sect. 5.
Lemma 18
Let \(\{\Phi _t:\, t\ge 0\}\) be a primitive QMS that is strongly \(\sigma \)-reversible. Let \(X,Y>0\) and \(-\infty \le q, p< 1\). Then, for any \(t\ge 0\) such that \((1-p)(1-q)\ge \mathrm {e}^{-4\alpha _1(\mathcal {L}) t}\) we have
Proof
The result follows by a direct application of Lemma 1 together with the reverse hypercontractivity inequality in Corollary 17. \(\quad \square \)
4 Tensorization
Our goal in this section is to prove hypercontractivity (or reverse hypercontractivity) inequalities of the form \(\Vert \Phi _t^{\otimes n}(X)\Vert _{p, \sigma ^{\otimes n}} \le \Vert X\Vert _{q, \sigma ^{\otimes n}}\) (or \(\Vert \Phi _t^{\otimes n}(X)\Vert _{p, \sigma ^{\otimes n}} \ge \Vert X\Vert _{q, \sigma ^{\otimes n}}\)) for certain ranges of t, p, q that are independent of n. Indeed, so far we have a theory of using log-Sobolev inequalities to prove such inequalities when \(n=1\), but in some applications, e.g., those we present later in this paper, we need such inequalities for arbitrary n. We need some notation to state the problem more precisely.
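For a Lindblad generator \(\mathcal {L}\) we define
$$\begin{aligned} {\widehat{\mathcal {L}}}_i := \mathcal {I}^{\otimes (i-1)}\otimes \mathcal {L}\otimes \mathcal {I}^{\otimes (n-i)}, \end{aligned}$$
as an operator acting on \(\mathcal {B}(\mathcal {H}^{\otimes n})\). We also let
$$\begin{aligned} \mathcal {K}_n := \sum _{i=1}^n {\widehat{\mathcal {L}}}_i. \end{aligned}$$(33)
Observe that if \(\mathcal {L}\) is (strongly) \(\sigma \)-reversible, then \(\mathcal {K}_n\) is (strongly) reversible with respect to \(\sigma ^{\otimes n}\). Moreover, \({\widehat{\mathcal {L}}}_i\)’s commute with each other and
$$\begin{aligned} \mathrm {e}^{-t\mathcal {K}_n} = \Phi _t^{\otimes n}. \end{aligned}$$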
That is, \(\mathcal {K}_n\) is a (strongly) \(\sigma ^{\otimes n}\)-reversible Lindblad generator which generates the quantum Markov semigroup \(\big \{\Phi _t^{\otimes n}:\, t\ge 0\big \}\). Now we can ask how the (reverse) hypercontractivity inequalities associated to \(\Phi _t\) are related to those for \(\Phi _t^{\otimes n}\). Equivalently, what is the relation between the log-Sobolev constants \(\alpha _p(\mathcal {L})\) and \(\alpha _p(\mathcal {K}_n)\)? In the commutative (classical) case the answer is easy; \(\alpha _p(\mathcal {K}_n)\) equals \(\alpha _p(\mathcal {L})\) for all n, and having a (reverse) hypercontractivity inequality for \(\Phi _t\) immediately gives one for \(\Phi _t^{\otimes n}\). This is because in the classical case operator norms are multiplicative, or because the entropy function satisfies a certain subadditivity property (see e.g., [36]). The aforementioned property that, in the classical case, \(\alpha _p(\mathcal {K}_n)\) is independent of n, is usually called the tensorization property.
The tensorization property of log-Sobolev constants of quantum Lindblad generators, unlike its classical counterpart, is highly non-trivial. Thus proving (reverse) hypercontractivity inequalities that are independent of n is a difficult problem in the non-commutative case. There are some attempts in this direction. Montanaro and Osborne in [33] proved such hypercontractivity inequalities for the qubit depolarizing channel (see also [26]). King [28] generalized this result to all unital qubit QMS. Cubitt et al. developed the theory of quantum reverse hypercontractivity inequalities in the unital case in [14] and proved some tensorization-type results. Also, Cubitt et al. [49] developed some techniques for proving bounds on log-Sobolev constants \(\alpha _p(\mathcal {K}_n)\) that are independent of n. Beigi and King [6] took the path of developing the theory of log-Sobolev inequalities not for the usual \(q\rightarrow p\) norm, but for the completely bounded norm. The point is that completely bounded norms are automatically multiplicative [17], so there is no problem of tensorization for the associated log-Sobolev constants. However, the existence of a complete version of the LSI constant was disproved in [4].
In this section we prove two tensorization-type results, one for 1-log-Sobolev constants, which will be used for reverse hypercontractivity inequalities, and the other for 2-log-Sobolev constants, which is useful for hypercontractivity inequalities.
Theorem 19
Let \(\sigma _1, \dots , \sigma _n\) be arbitrary positive definite density matrices. Let \(\mathcal {L}_i(X) = X-\text {tr}(\sigma _i X) {\mathbb {I}}\) be the simple generator associated to the state \(\sigma _i\). Let
and define \(\mathcal {K}_n\) by (33). Then we have \(\alpha _1(\mathcal {K}_n) \ge \frac{1}{4}\), independently of n.
Remark 5
Observe that Theorem 19 does not show the tensorization of \(\alpha _1\) for the depolarizing semigroup, but only proves a positive lower bound independent of n. Hence, the tensorization of \(\alpha _1\) is still an open problem.
Taking the \(\sigma _i\)’s to be equal in the above theorem, we obtain the promised tensorization-type result for the 1-log-Sobolev constant.Footnote 7
Proof
We need to show that for all \(X_{A^n}\in \mathcal {P}_+(\mathcal {H}_{A^n})\) we have
where \(\sigma _{A_i} =\sigma _i\) and
Using parts (ii) of Proposition 4 and Proposition 8, without loss of generality we can assume that \(X_{A^n}= \Gamma _{\sigma _{A^n}}^{-1}(\rho _{A^n})\) where \(\rho _{A^n}\in \mathcal {D}_+(\mathcal {H}_{A^n})\) is a density matrix. Then, using parts (iv) of Proposition 4 and Proposition 8, we need to show that
Observe that
with \(\mathcal {L}^*_i (Y) = Y- \text {tr}(Y)\sigma _i\). Therefore,
where \(A^{\sim i} = (A_1, \dots , A_{i-1}, A_{i+1}, \dots , A_n)\) and \(\rho _{A^{\sim i}} = \text {tr}_{A_i}(\rho _{A^n})\) is the partial trace of \(\rho _{A^n}\) with respect to the i-th subsystem. Therefore, (34) is equivalent to
Now since \(D(\rho _{A^{\sim i}}\otimes \sigma _{A_i}\Vert \rho _{A^n})\ge 0\), it suffices to show that
We note that \(D(\xi _B\Vert \tau _B) = -H(B)_\xi -\text {tr}(\xi \log \tau )\) where \(H(B)_\xi = -\text {tr}(\xi \log \xi )\) is the von Neumann entropy. Moreover, \(\log (\xi \otimes \tau ) = \log \xi \otimes I + I\otimes \log \tau \). Therefore, (35) is equivalent to
This is equivalent to
which is an immediate consequence of the data processing inequality (i.e., \(H(B|C)_\xi \ge H(B|CD)_\xi \)) once we use the chain rule
This concludes the proof. \(\quad \square \)
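The entropic inequality at the heart of this proof can be checked numerically in the smallest case \(n=2\). The sketch below is our own illustration and rests on an assumption about the form of (35), namely \(\sum _i D(\rho _{A^n}\Vert \rho _{A^{\sim i}}\otimes \sigma _{A_i})\ge D(\rho _{A^n}\Vert \sigma _{A^n})\), which is what the entropy manipulations in the text suggest. The states are random and hypothetical, and the helper names are ours.

```python
import numpy as np

def mlog(a):
    """Matrix logarithm of a positive definite Hermitian matrix."""
    w, v = np.linalg.eigh(a)
    return (v * np.log(w)) @ v.conj().T

def rel_ent(rho, tau):
    # Umegaki relative entropy D(rho || tau) = tr[rho (log rho - log tau)]
    return np.trace(rho @ (mlog(rho) - mlog(tau))).real

def random_state(dim, rng):
    # Random full-rank density matrix (hypothetical test data).
    g = rng.standard_normal((dim, dim)) + 1j * rng.standard_normal((dim, dim))
    rho = g @ g.conj().T + 1e-3 * np.eye(dim)
    return rho / np.trace(rho).real

rng = np.random.default_rng(1)
rho = random_state(4, rng)                       # state on A_1 A_2 (two qubits)
sigma1, sigma2 = random_state(2, rng), random_state(2, rng)

r4 = rho.reshape(2, 2, 2, 2)                     # indices (i1, i2, j1, j2)
rho1 = np.einsum('abcb->ac', r4)                 # marginal on A_1
rho2 = np.einsum('abad->bd', r4)                 # marginal on A_2

# i = 1: sigma on A_1, marginal of rho on A_2; i = 2: the other way around.
lhs = rel_ent(rho, np.kron(sigma1, rho2)) + rel_ent(rho, np.kron(rho1, sigma2))
rhs = rel_ent(rho, np.kron(sigma1, sigma2))
gap = lhs - rhs
```

For \(n=2\) the terms involving \(\sigma _1\) and \(\sigma _2\) cancel, and `gap` reduces to \(H(A_1)+H(A_2)-H(A_1A_2)\), the mutual information of \(\rho \), which is non-negative by subadditivity of the von Neumann entropy.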
Remark 6
A similar proof was recently and independently obtained in [9]. Moreover, the proof uses similar ideas to the proof of the tensorization property of the variant of \(\alpha _2\) for the completely bounded norm in [6].
We can now use Corollary 17 and the fact that the simple generator is strongly reversible to conclude the following.
Corollary 20
Let \(\sigma _1, \dots , \sigma _n\) be arbitrary positive definite density matrices. Let \(\mathcal {L}_i(X) = X-\text {tr}(\sigma _i X) {\mathbb {I}}\) be the simple generator associated to the generalized depolarizing channel \(\Phi _{t, i}(X)=\mathrm {e}^{-t} X + (1-\mathrm {e}^{-t}) \text {tr}(\sigma _i X) {\mathbb {I}}\). Define \(\sigma ^{(n)} = \sigma _1\otimes \cdots \otimes \sigma _n\) and \(\Phi _t^{(n)} = \Phi _{t, 1}\otimes \cdots \otimes \Phi _{t, n}\). Then for \(p\le q<1\) and \(t\ge \log \frac{p-1}{q-1}\) we have
where \(X\in \mathcal {P}_+(\mathcal {H}^{\otimes n})\) is arbitrary.
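Corollary 20 can be probed numerically for \(n=2\). The following minimal sketch is our own illustration: it assumes the weighted norm \(\Vert X\Vert _{p, \sigma }=\big (\text {tr}\big [(\sigma ^{1/2p}X\sigma ^{1/2p})^p\big ]\big )^{1/p}\) for \(X>0\) and \(p\ne 0\), uses hypothetical random states and operators, and picks \(p=-1\), \(q=1/2\) at the threshold time \(t=\log 4\).

```python
import numpy as np

def mpow(a, s):
    """Real power of a positive definite Hermitian matrix."""
    w, v = np.linalg.eigh(a)
    return (v * w**s) @ v.conj().T

def norm_p(x, sigma, p):
    # Assumed definition of ||X||_{p,sigma} for X > 0 and p != 0.
    r = mpow(sigma, 1.0 / (2 * p))
    return np.trace(mpow(r @ x @ r, p)).real ** (1.0 / p)

def phi_pair(x, s1, s2, t):
    """Apply Phi_{t,1} (x) Phi_{t,2} to an operator x on C^2 (x) C^2."""
    a, eye = np.exp(-t), np.eye(2)
    # First factor: e^{-t} x + (1 - e^{-t}) I (x) tr_1[(s1 (x) I) x]
    red = np.einsum('ak,kcad->cd', s1, x.reshape(2, 2, 2, 2))
    y = a * x + (1 - a) * np.kron(eye, red)
    # Second factor: e^{-t} y + (1 - e^{-t}) tr_2[(I (x) s2) y] (x) I
    red = np.einsum('bk,akcb->ac', s2, y.reshape(2, 2, 2, 2))
    return a * y + (1 - a) * np.kron(red, eye)

def random_positive(dim, rng):
    g = rng.standard_normal((dim, dim)) + 1j * rng.standard_normal((dim, dim))
    return g @ g.conj().T + 0.1 * np.eye(dim)

rng = np.random.default_rng(3)
s1 = random_positive(2, rng)
s1 = s1 / np.trace(s1).real
s2 = random_positive(2, rng)
s2 = s2 / np.trace(s2).real
sigma = np.kron(s1, s2)

p, q = -1.0, 0.5
t = np.log((p - 1) / (q - 1))        # threshold from Corollary 20, here log 4

margins = []
for _ in range(20):
    x = random_positive(4, rng)
    margins.append(norm_p(phi_pair(x, s1, s2, t), sigma, p) - norm_p(x, sigma, q))
min_margin = min(margins)            # expected to be non-negative
```

A finite random sample of this kind can only fail to contradict the corollary; it is not a substitute for the proof.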
We now state the second tensorization result which is about the 2-log-Sobolev constant.
Theorem 21
Let \(\dim \mathcal {H}=2\) and \(\mathcal {L}(X) = X-\text {tr}(\sigma X) {\mathbb {I}}\) for some positive definite density matrix \(\sigma \in \mathcal {D}_+(\mathcal {H})\). Then we have
where \(\mathcal {K}_n\) is defined in (33).
Our main tool to prove this theorem is the following entropic inequality that is of independent interest and can be useful elsewhere.
Lemma 22
Let \(\mathcal {H}\) and \(\mathcal {H}'\) be Hilbert spaces with \(\dim \mathcal {H}=2\). Let \(X\in \mathcal {P}(\mathcal {H}\otimes \mathcal {H}')\) be a positive semidefinite matrix with the block form
where \(A, B, C\in \mathcal {B}(\mathcal {H}')\). For a density matrix \(\rho \in \mathcal {D}_+(\mathcal {H}')\), the matrix M defined as
is positive semidefinite. Moreover, let \(\sigma \in \mathcal {D}_+(\mathcal {H})\) be a density matrix of the form
where \(\theta \in (0,1)\). Then we have
where the map \(I_{2,2}\) is defined with respect to the state \(\rho \).
Proof
For any \(p\ge 2\) define
so that \(M_2=M\). Since \(X\ge 0\), both A and B are positive semidefinite. Moreover, we have
As a result, according to Theorem IX.5.9 of [7] there exists a contraction \(R\in \mathcal {B}(\mathcal {H}')\) such that \(\Gamma _\rho ^{\frac{1}{p}}(C) = \big (\Gamma _\rho ^{\frac{1}{p}}(A)\big )^{\frac{1}{2}} R \big (\Gamma _\rho ^{\frac{1}{p}}(B)\big )^{\frac{1}{2}}\). Therefore, by Hölder’s inequality we have
Then using \(\Vert Y\Vert _{p, \rho } = \Vert \Gamma _\rho ^{1/p}(Y)\Vert _{p}\), we find that
and hence \(M_p\ge 0\). In particular, \(M_2=M\ge 0\) and \(\text {Ent}_{2, \rho }(M)\) is well-defined.
Define \(\psi (p):= \Vert M_p\Vert _{p, \sigma } - \Vert X\Vert _{p, \sigma \otimes \rho }\). It is shown by King [27] that \(\psi (p)\ge 0\) for all \(p\ge 2\). Indeed, this inequality is proven in [27] in the special case where \(\sigma \) and \(\rho \) are the identity operators on the relevant spaces. Nevertheless, we have
and
Thus, King’s result holds for arbitrary \(\rho \) and diagonal \(\sigma \) as well, and we have \(\psi (p)\ge 0\) for all \(p\ge 2\). On the other hand, a straightforward computation verifies that \(\psi (2)=0\). This means that \(\psi '(2)\ge 0\), i.e.,
The derivatives can be computed using Proposition 3. We have
and
where
and \(w =\Vert C\Vert _{2, \rho }^{-1}\cdot \left( \frac{1}{2}\text {Ent}_{2, \rho }\big (I_{2, 2}(C)\big ) +\frac{1}{2}\text {Ent}_{2, \rho }\big (I_{2, 2}(C^{\dagger })\big )\right) \). We conclude that
Comparing to (40) and using \(\Vert M\Vert _{2, \sigma }=\Vert X\Vert _{2, \sigma \otimes \rho }\) the desired inequality follows. \(\quad \square \)
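The inequality of King [27] used in this proof can be explored numerically in the unweighted special case mentioned above, where \(\sigma \) and \(\rho \) are replaced by identity operators. The sketch below is our own illustration: it assumes the block form \(X=\begin{pmatrix} A & C\\ C^\dagger & B\end{pmatrix}\) and compares the Schatten p-norm of X with that of the \(2\times 2\) matrix of block norms, for a hypothetical random positive semidefinite X.

```python
import numpy as np

def schatten(a, p):
    # Schatten p-norm computed from the singular values.
    s = np.linalg.svd(a, compute_uv=False)
    return (s**p).sum() ** (1.0 / p)

rng = np.random.default_rng(2)
k = 3                                   # dimension of each block (hypothetical)
g = rng.standard_normal((2 * k, 2 * k))
x = g @ g.T                             # random positive semidefinite block matrix
a, c, b = x[:k, :k], x[:k, k:], x[k:, k:]

def gap(p):
    # 2x2 matrix obtained by replacing each block with its Schatten p-norm.
    m = np.array([[schatten(a, p), schatten(c, p)],
                  [schatten(c, p), schatten(b, p)]])
    return schatten(m, p) - schatten(x, p)

gaps = {p: gap(p) for p in (2.0, 3.0, 4.0, 6.0)}
```

At \(p=2\) the gap vanishes, since both sides equal the Hilbert–Schmidt norm of X, and for \(p>2\) it should be non-negative. This mirrors the fact that \(\psi (2)=0\) and \(\psi (p)\ge 0\) for \(p\ge 2\), which is what allows the proof to conclude \(\psi '(2)\ge 0\).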
We need yet another lemma to prove Theorem 21.
Lemma 23
For any Lindblad generator \(\mathcal {K}\) that is \(\rho \)-reversible for some positive definite density matrix \(\rho \) we have
for any C.
Proof
Define \(D:=\Gamma _{\rho }^{\frac{1}{2}}(C)\). Then for \(j\in \{0,1\}\)
is positive semidefinite [7]. Since \(\Gamma _{\rho }^{-1/2}\) is completely positive we have
On the other hand, \(\Psi _t= \mathrm {e}^{-t\mathcal {K}}\) is completely positive. Therefore,
is positive semidefinite. Putting these together we find that
We note that
From this expression it is clear that
Therefore, we must have \(g'(0)\ge 0\) which is equivalent to the desired inequality. \(\quad \square \)
Now we have all the required tools for proving Theorem 21. Indeed, we can prove a stronger statement from which Theorem 21 follows by a simple induction.
Theorem 24
Let \(\dim \mathcal {H}=2\) and \(\mathcal {L}(X) = X-\text {tr}(\sigma X) {\mathbb {I}}\) for some positive definite density matrix \(\sigma \in \mathcal {D}_+(\mathcal {H})\). Also let \(\mathcal {K}\) be a Lindblad generator associated to a primitive QMS that is reversible with respect to some positive definite state \(\rho \in \mathcal {D}_+(\mathcal {H}')\). Then we have
where \(\mathcal {I}\) and \(\mathcal {I}'\) denote the identity superoperators acting on \(\mathcal {B}(\mathcal {H})\) and \(\mathcal {B}(\mathcal {H}')\) respectively.
Proof
Let \(\alpha =\min \{\alpha _2(\mathcal {L}), \, \alpha _2(\mathcal {K})\}\). By restricting X in the 2-log-Sobolev inequality to be of the tensor product form and using
we conclude that \(\alpha _2(\mathcal {L}\otimes \mathcal {I}+ \mathcal {I}\otimes \mathcal {K})\le \alpha \). To prove the inequality in the other direction we need to show that for any \(X\in \mathcal {P}(\mathcal {H}\otimes \mathcal {H}')\) we have
Assume, without loss of generality, that \(\sigma \) is diagonal of the form (38), and that \(X\in \mathcal {P}(\mathcal {H}\otimes \mathcal {H}')\) has the block form (36). Define M by (37). Then by Lemma 22 we have
On the other hand by the definition of \(\alpha \) we have
and
for all \(Y\in \big \{ A, B, I_{2, 2}(C), I_{2, 2}(C^\dagger ) \big \}\). Therefore, we have
where in the second inequality we use Lemma 23. We now have
We compute each term in the above sum separately.
For the second term we compute
Therefore, we have
Comparing this to (42) we arrive at the desired inequality (41). \(\quad \square \)
We now give the exact expression of the 2-log-Sobolev constant of the simple Lindblad generator (in any dimension). We recall that the case of the 1-log-Sobolev constant was found in [38] (see also [26] when \(\sigma ={\mathbb {I}}/d\)). The proof in our general setting is similar to the one of [38]. We however provide it in Appendix C for the sake of completeness.
Theorem 25
Let \(\sigma \in \mathcal {D}_+(\mathcal {H})\) be arbitrary and let \(\mathcal {L}(X) = X-\text {tr}(\sigma X) {\mathbb {I}}\) be the simple Lindblad generator. Then we have
where \(s_{\min }(\sigma )\) is the minimum eigenvalue of \(\sigma \).
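We can now derive a tensorization-type result for a wide class of Lindblad generators. Let \(\mathcal {L}\) be a \(\sigma \)-reversible and primitive Lindblad generator. Recall that the spectral gap of \(\mathcal {L}\) is defined by
$$\begin{aligned} \lambda (\mathcal {L}) := \inf _{X:\, \text {Var}_\sigma (X)\ne 0} \frac{\langle X, \mathcal {L}(X)\rangle _\sigma }{\text {Var}_\sigma (X)}, \end{aligned}$$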
where \(\text {Var}_\sigma (X) =\langle X, X\rangle _\sigma - \langle X, {\mathbb {I}}\rangle _\sigma ^2 =\Vert X\Vert _{2, \sigma }^2 -\langle X, {\mathbb {I}}\rangle _\sigma ^2\), see e.g. [26]. Observe that \(\text {Var}_\sigma (X)\) is the squared length of the projection of X onto the subspace orthogonal to \({\mathbb {I}}\in \mathcal {B}(\mathcal {H})\) with respect to the inner product \(\langle \cdot , \cdot \rangle _\sigma \). On the other hand, \(\mathcal {L}\) is self-adjoint with respect to this inner product, and \({\mathbb {I}}\) is its soleFootnote 8 0-eigenvector up to a scalar multiple. Therefore, \(\lambda (\mathcal {L})\) is the minimum non-zero eigenvalue of \(\mathcal {L}\). Note that, since \(\mathcal {L}\) is primitive and the Dirichlet form \(\mathcal {E}_{2, \mathcal {L}}\) is non-negative, we have \(\lambda (\mathcal {L})>0\). Indeed, \(\lambda (\mathcal {L})\) is the spectral gap of \(\mathcal {L}\) above the zero eigenvalue.
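As a sanity check of this definition, consider the simple generator \(\mathcal {L}(X)=X-\text {tr}(\sigma X){\mathbb {I}}\) used in this section. The computation below is our own illustration; it assumes that X is Hermitian and that the spectral gap is the infimum of \(\langle X, \mathcal {L}(X)\rangle _\sigma /\text {Var}_\sigma (X)\). Using \(\langle X, {\mathbb {I}}\rangle _\sigma =\text {tr}(\sigma X)\), we find

$$\begin{aligned} \langle X, \mathcal {L}(X)\rangle _\sigma = \langle X, X\rangle _\sigma - \text {tr}(\sigma X)\, \langle X, {\mathbb {I}}\rangle _\sigma = \Vert X\Vert _{2, \sigma }^2 - \langle X, {\mathbb {I}}\rangle _\sigma ^2 = \text {Var}_\sigma (X). \end{aligned}$$

Hence the ratio equals 1 for every X with \(\text {Var}_\sigma (X)\ne 0\), and under these assumptions \(\lambda (\mathcal {L})=1\) for the simple generator, in agreement with the fact that its only non-zero eigenvalue is 1.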
The spectral gap satisfies the tensorization property, as shown below. Observe that
$$\begin{aligned} \mathcal {K}_n = \sum _{i=1}^n {\widehat{\mathcal {L}}}_i \end{aligned}$$
is a sum of mutually commuting operators. Then the eigenvalues of \(\mathcal {K}_n\) are sums of eigenvalues of the individual \({\widehat{\mathcal {L}}}_i\)’s. Since each \({\widehat{\mathcal {L}}}_i\) is a tensor product of \(\mathcal {L}\) with some identity superoperator, the set of its eigenvalues is the same as that of \(\mathcal {L}\). Using these we conclude that
$$\begin{aligned} \lambda (\mathcal {K}_n) = \lambda (\mathcal {L}). \end{aligned}$$
It is well-known that \(\lambda (\mathcal {L})\ge \alpha _2(\mathcal {L})\) [10, 26]. The following corollary gives a lower bound on \(\alpha _2(\mathcal {L})\) in terms of \(\lambda (\mathcal {L})\).
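The tensorization of the spectral gap described above can be checked numerically for the simple generator with \(n=2\). The sketch below is our own illustration with a hypothetical qubit state; it represents \(\mathcal {L}\) as a matrix acting on the row-major vectorization of X, and builds \({\widehat{\mathcal {L}}}_1\) and \({\widehat{\mathcal {L}}}_2\) in the product basis \(\text {vec}(X_1)\otimes \text {vec}(X_2)\) of \(\mathcal {B}(\mathcal {H})\otimes \mathcal {B}(\mathcal {H})\). This differs from the vectorization of operators on \(\mathcal {H}^{\otimes 2}\) only by a permutation of basis elements, which does not affect commutation relations or spectra.

```python
import numpy as np

def simple_generator(sigma):
    """Matrix of L(X) = X - tr(sigma X) I acting on row-major vec(X)."""
    d = sigma.shape[0]
    vec_id = np.eye(d).reshape(-1)
    vec_sigma_t = sigma.T.reshape(-1)       # tr(sigma X) = vec(sigma^T) . vec(X)
    return np.eye(d * d) - np.outer(vec_id, vec_sigma_t)

sigma = np.diag([0.8, 0.2])                 # hypothetical qubit state
L = simple_generator(sigma)
I4 = np.eye(4)

L1 = np.kron(L, I4)                         # L acting on the first factor
L2 = np.kron(I4, L)                         # L acting on the second factor
K2 = L1 + L2

commute = np.allclose(L1 @ L2, L2 @ L1)
eig_L = np.sort(np.linalg.eigvals(L).real)
eig_K2 = np.sort(np.linalg.eigvals(K2).real)

# Spectral gap = smallest non-zero eigenvalue.
gap_L = min(e for e in eig_L if e > 1e-9)
gap_K2 = min(e for e in eig_K2 if e > 1e-9)
```

Here the spectrum of \(\mathcal {L}\) is \(\{0, 1, 1, 1\}\), the spectrum of \(\mathcal {K}_2\) consists of all pairwise sums, and both gaps equal 1.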
Corollary 26
Let \(\dim \mathcal {H}=2\) and \(\sigma \in \mathcal {D}_+(\mathcal {H})\). For any \(\sigma \)-reversible primitive Lindblad generator \(\mathcal {L}\) we have
where \(s_{\min }(\sigma )\) denotes the minimal eigenvalue of \(\sigma \).
This corollary is a non-commutative version of Corollary A.4 of [18] and gives a stronger bound compared to Corollary 6 of [49]. It would be interesting to compare this corollary with the result of King [28] who generalized the hypercontractivity inequalities of [33] for the unital qubit depolarizing channel to all unital qubit quantum Markov semigroups. Here, having a bound on the 2-log-Sobolev constant of the \(\sigma \)-reversible generalized qubit depolarizing channel (and its tensorization property), we derive a bound on the 2-log-Sobolev constant of all qubit \(\sigma \)-reversible QMS.
Proof of Corollary 26
Let \(\mathcal {L}'\) be the simple Lindblad generator that is \(\sigma \)-reversible, and let \(X\in \mathcal {P}(\mathcal {H}^{\otimes n})\) be arbitrary. Then by Theorems 21 and 25 we have
Let \({\mathcal {W}}_i\subset \mathcal {B}(\mathcal {H}^{\otimes n})\) be the subspace spanned by operators of the form \(A_1\otimes \cdots \otimes A_n \in \mathcal {B}(\mathcal {H}^{\otimes n})\) with \(A_i={\mathbb {I}}\in \mathcal {B}(\mathcal {H})\). In other words, \({\mathcal {W}}_i = \ker ({\widehat{\mathcal {L}}}'_i)\). Then \(\big \langle X, {\widehat{\mathcal {L}}}'_i(X)\big \rangle _{\sigma ^{\otimes n}}\) equals the squared length of the projection of X onto \({\mathcal {W}}_i^{\perp }\). On the other hand, since \(\mathcal {L}\) is primitive and \(\sigma \)-reversible, we also have \({\mathcal {W}}_i=\ker {\widehat{\mathcal {L}}}_i \) and \({\mathcal {W}}_i^{\perp }\) is invariant under \({\widehat{\mathcal {L}}}_i\). Moreover, by definition \(\lambda ({\widehat{\mathcal {L}}}_i)\) is the minimum eigenvalue of \({\widehat{\mathcal {L}}}_i\) restricted to \({\mathcal {W}}_i^{\perp }\) (i.e., the minimum non-zero eigenvalue). We conclude that
On the other hand since \({\widehat{\mathcal {L}}}_i\) equals the tensor product of \(\mathcal {L}\) with some identity superoperators, \(\lambda ({\widehat{\mathcal {L}}}_i) = \lambda (\mathcal {L})\). Therefore,
Using this in (45) we arrive at
This gives the desired bound on \(\alpha _2(\mathcal {K}_n)\). \(\quad \square \)
Corollary 27
Let \(\dim \mathcal {H}=2\) and \(\sigma \in \mathcal {D}_+(\mathcal {H})\). Let \(\mathcal {L}\) be a \(\sigma \)-reversible primitive Lindblad generator. Then for any \(1\le q\le p\) and \(t\ge 0\) satisfying
we have \(\Vert \Phi _t^{\otimes n}(X)\Vert _{p, \sigma }\le \Vert X\Vert _{q, \sigma }\) for all \(X> 0\).
5 Application: Second-Order Converses
One of the primary goals of information theory is to find optimal rates of information-theoretic tasks. For instance, for the task of information transmission over a noisy channel, this optimal rate is the capacity. The latter is said to satisfy the strong converse property if any attempt to transmit information at a rate higher than it fails with certainty in the limit of infinitely many uses of the channel. In this section, we show how reverse hypercontractivity inequalities can be used to derive finite sample size strong converse bounds in the tasks of asymmetric quantum hypothesis testing and classical communication through a classical-quantum channel.
5.1 Quantum Hypothesis Testing
Binary quantum hypothesis testing concerns the problem of discriminating between two different quantum states, and is essential for various quantum information-processing protocols. Suppose that a party, Bob, receives a quantum system, with the knowledge that it is prepared either in the state \(\rho \) (the null hypothesis) or in the state \(\sigma \) (the alternative hypothesis) over a finite-dimensional Hilbert space \({{\mathcal {H}}}\). His aim is to infer which hypothesis is true, i.e., which state the system is in. To do so he performs a measurement on the system that he receives. This is most generally described by a POVM \(\{T,{\mathbb {I}}- T\}\) where \(0 \le T \le {\mathbb {I}}\); when the measurement outcome is T he infers that the state is \(\rho \), and otherwise that it is \(\sigma \). Adopting the nomenclature from classical hypothesis testing, we refer to T as a test. The probability that Bob correctly guesses the state to be \(\rho \) is then equal to \(\text {tr}(T \rho )\), whereas his probability of correctly guessing the state to be \(\sigma \) is \(\text {tr}(({\mathbb {I}}-T)\sigma )\). Bob can erroneously infer the state to be \(\sigma \) when it is actually \(\rho \) or vice versa. The corresponding error probabilities are referred to as the Type I error and Type II error, respectively, and are given as follows:
\[\alpha (T):=\text {tr}\big (({\mathbb {I}}-T)\rho \big ),\qquad \beta (T):=\text {tr}(T\sigma ).\]
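Correspondingly, if multiple (say, n) identical copies of the system are available, and a test \(T_n \in {{\mathcal {B}}}({{\mathcal {H}}}^{\otimes n})\) is performed on the n copies, then the Type I and Type II errors are given by
\[\alpha _n(T_n):=\text {tr}\big (({\mathbb {I}}_n-T_n)\rho ^{\otimes n}\big ),\qquad \beta _n(T_n):=\text {tr}\big (T_n\sigma ^{\otimes n}\big ),\]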
where \({\mathbb {I}}_n\) denotes the identity operator in \({{\mathcal {B}}}({{\mathcal {H}}}^{\otimes n})\). There is a trade-off between the two error probabilities and there are various ways to optimize them. In the setting of asymmetric quantum hypothesis testing, one minimizes the Type II error under the constraint that the Type I error stays below a threshold value \(\varepsilon \in (0,1)\). In this case one is interested in the following quantity
where the infimum is taken over all possible tests \(T_n \in {{\mathcal {B}}}({{\mathcal {H}}}^{\otimes n})\). The quantum Stein lemma [23, 42] states that
The asymptotic strong converse rate \(R_{sc}\) of the above quantum hypothesis testing problem is defined to be the smallest number R such that if
for some sequence of tests \(\{T_n\}_{n \in {\mathbb {N}}}\), then
This quantity has been shown to be equal to Stein’s exponent \(D(\rho ||\sigma )\). In this section we are interested in obtaining a bound on the rate of convergence of \(\alpha _n(T_n)\) as a function of n, that is, when Bob receives a finite number of identical copies of the quantum system. We use reverse hypercontractivity in order to obtain our bound. Before stating and proving the main theorem of this section, we recall the following important inequality that will be used in the proof.
Lemma 28
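(Araki–Lieb–Thirring inequality [2, 29]) For any \(A,B\in \mathcal {P}(\mathcal {H})\), and \(r\in [0,1]\),
\[\text {tr}\big (B^{r/2}A^{r}B^{r/2}\big )\le \text {tr}\big (\big (B^{1/2}AB^{1/2}\big )^{r}\big ).\]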
Our main result, from which a bound for the finite blocklength strong converse rate follows directly as a corollary, is given by Theorem 29.
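Lemma 28 can be spot-checked numerically in the smallest non-trivial case. The following is a minimal Python sketch, assuming the inequality in the form \(\text {tr}(B^{r/2}A^{r}B^{r/2})\le \text {tr}((B^{1/2}AB^{1/2})^{r})\) for \(r\in [0,1]\) and restricting to \(2\times 2\) real symmetric positive definite matrices, for which matrix powers have a closed form. The matrices `A`, `B` and all helper names are hypothetical.

```python
import math


def sym_power(m, exponent):
    """Return m**exponent for a 2x2 real symmetric positive definite matrix m."""
    (a, b), (_, c) = m
    mean = (a + c) / 2.0
    delta = math.hypot((a - c) / 2.0, b)  # half the eigenvalue splitting
    if delta == 0.0:  # m is a multiple of the identity
        s = mean ** exponent
        return ((s, 0.0), (0.0, s))
    top, bottom = (mean + delta) ** exponent, (mean - delta) ** exponent
    avg, slope = (top + bottom) / 2.0, (top - bottom) / (2.0 * delta)
    # f(m) = avg * I + slope * (m - mean * I) for a 2x2 symmetric m
    return ((avg + slope * (a - mean), slope * b),
            (slope * b, avg + slope * (c - mean)))


def trace_of_product(x, y):
    """tr(xy) for 2x2 real symmetric x and y."""
    return x[0][0] * y[0][0] + 2.0 * x[0][1] * y[0][1] + x[1][1] * y[1][1]


def det(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def alt_sides(a, b, r):
    """Return (tr(B^{r/2} A^r B^{r/2}), tr((B^{1/2} A B^{1/2})^r))."""
    # Cyclicity of the trace: tr(B^{r/2} A^r B^{r/2}) = tr(A^r B^r).
    left = trace_of_product(sym_power(a, r), sym_power(b, r))
    # B^{1/2} A B^{1/2} has the eigenvalues of AB, i.e. the roots of
    # x^2 - tr(AB) x + det(A) det(B).
    t, d = trace_of_product(a, b), det(a) * det(b)
    root = math.sqrt(max(t * t - 4.0 * d, 0.0))
    right = ((t + root) / 2.0) ** r + ((t - root) / 2.0) ** r
    return left, right


A = ((2.0, 1.0), (1.0, 1.0))
B = ((1.0, 0.5), (0.5, 3.0))
checks = [alt_sides(A, B, r) for r in (0.1, 0.25, 0.5, 0.75, 1.0)]
```

For \(r=1\) both sides reduce to \(\text {tr}(AB)\), so the inequality is an equality there; for commuting A and B it is an equality for every r.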
Theorem 29
Let \(\rho ,\sigma \in \mathcal {D}_+(\mathcal {H})\) be faithful density matrices.Footnote 9 Then for any test \(0\le T_n\le {\mathbb {I}}_n\), where \(T_n \in {{\mathcal {B}}}({{\mathcal {H}}}^{\otimes n})\)
Proof
The result follows by combining Theorem 19 and Lemma 18. For simplicity of notation we will use \(\sigma _n:=\sigma ^{\otimes n}\) and \(\rho _n:=\rho ^{\otimes n}\). Let \(0\le p,q\le 1\) and let \(t \ge 0\) be such that
Let \({{\mathcal {L}}}\) denote the generator of a generalized depolarizing semigroup \(\{\Phi _t:\, t\ge 0\}\) with invariant state \(\rho \), i.e., \(\Phi _t(X)=\mathrm {e}^{-t} X + (1-\mathrm {e}^{-t}) \text {tr}(\rho X) {\mathbb {I}}\). By Theorem 19 the 1-log-Sobolev constants of this QMS and its tensor powers are lower bounded by 1/4. Then using Lemma 18 for \(Y=T_n\) and \(X=\Gamma _{\rho _n}^{-1}(\sigma _n)\) we obtain
An application of the Araki–Lieb–Thirring inequality, Lemma 28, with \(A=\sigma _n\), \(B=\rho _n^{(1-p)/p}\) and \(r=p\in [0,1]\) leads to
where
denotes the sandwiched p-Rényi divergence between \(\rho \) and \(\sigma \). A very similar application of Lemma 28 for \(A=T_n\) and \(B=\rho _n^{1/q}\) and \(r=q\in [0,1]\) yields
where in the last inequality, we used that \(0\le T_n\le {\mathbb {I}}\), so that \(T_n^q\ge T_n\). Using the last two bounds in (49), we get
Taking the limit \(p \rightarrow 0\) (and \(q\rightarrow 1-\mathrm {e}^{-t}\)) on both sides of the above inequality yields
Let \(\gamma :=\Vert \sigma ^{-1/2}{\rho }\sigma ^{-1/2}\Vert _{\infty }\) and define the superoperator \(\Psi _t\) by
Then by induction on n it can be shown that \(\Psi _t^{\otimes n} -\Phi _t^{\otimes n}\) is a completely positive superoperator. This is clear from definitions for \(n=1\), and for every \(Y\in \mathcal {P}(\mathcal {H}^{\otimes n}\otimes \mathcal {H}')\), where \(\mathcal {H}'\) is an arbitrary Hilbert space, we have
where the inequalities come from the induction hypothesis and the base of induction. Therefore, \(\Psi _t^{\otimes n} -\Phi _t^{\otimes n}\) is completely positive. On the other hand, for every \(Y\in \mathcal {B}(\mathcal {H}^{\otimes n})\) we have
This equation is immediate for \(n=1\), and for arbitrary n can be proven by first observing that it holds for \(Y=Y_1\otimes \cdots \otimes Y_n\) being of a tensor product form, and then using linearity. Putting these together we arrive at
Next using the fact that \(\gamma \ge 1\) (which follows simply by taking the trace of the operator inequality \(\rho \le \gamma \sigma \)), the convexity of \(h(x)=x^\gamma \) implies \((h(x)-h(1))/(x-1)\ge h'(1)\) for every \(x\ge 1\). Therefore, \(\mathrm {e}^{\gamma t}-1\ge \gamma (\mathrm {e}^t-1)\) for every \(t\ge 0\), and \(\mathrm {e}^{-t}+ \gamma (1-\mathrm {e}^{-t})\le \mathrm {e}^{(\gamma -1)t}\). As a result
Then from (50) and (51) we get
Taking the logarithm of both sides yields
where the second inequality follows from \(\mathrm {e}^t \ge 1+ t\) and
Optimizing (52) over the choice of t yields
and we obtain the desired inequality
\(\square \)
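The elementary inequality \(\mathrm {e}^{-t}+ \gamma (1-\mathrm {e}^{-t})\le \mathrm {e}^{(\gamma -1)t}\) for \(\gamma \ge 1\) and \(t\ge 0\), used in the proof above, can be spot-checked numerically. The following is a minimal Python sketch on a hypothetical grid of values; it is a sanity check and not a substitute for the convexity argument.

```python
import math


def lhs(gamma, t):
    """e^{-t} + gamma * (1 - e^{-t})."""
    return math.exp(-t) + gamma * (1.0 - math.exp(-t))


def rhs(gamma, t):
    """e^{(gamma - 1) t}."""
    return math.exp((gamma - 1.0) * t)


def inequality_holds(gamma, t, slack=1e-12):
    """Check lhs <= rhs up to a small relative slack for rounding errors."""
    return lhs(gamma, t) <= rhs(gamma, t) * (1.0 + slack)


# Hypothetical grid: gamma >= 1 and t >= 0.
gammas = [1.0, 1.5, 2.0, 5.0, 10.0]
times = [0.01 * k for k in range(501)]
all_hold = all(inequality_holds(g, t) for g in gammas for t in times)
```

Both sides equal 1 at \(t=0\) and also when \(\gamma =1\), which is why a small slack is used in the comparison.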
Remark 7
The bound found by the present reverse hypercontractivity technique is weaker than the one found in Equation (75) of [34], which is in particular tight as \(n\rightarrow \infty \). However, as opposed to [34], the techniques developed in this paper have the particular advantage that they can be generalized to obtain strong converses in various problems of quantum network information theory (see [12, 13]).
Corollary 30
(Finite-blocklength strong converse bound for quantum hypothesis testing). Let \(\rho ,\sigma \in \mathcal {D}_+(\mathcal {H})\) and \(\gamma =\Vert \rho \sigma ^{-1}\Vert _\infty \). Then for any test \(0\le T_n\le {\mathbb {I}}_n\), where \(T_n \in {{\mathcal {B}}}({{\mathcal {H}}}^{\otimes n})\), if the Type II error satisfies the inequality \(\beta _n(T_n) \le \mathrm {e}^{-nr}\) for \(r > D(\rho ||\sigma )\), then the Type I error satisfies
where
and hence tends to zero in the limit of \(r \rightarrow D(\rho ||\sigma )\).
Proof
Fix \(r> D(\rho \Vert \sigma )\) and consider a sequence of tests \(T_n\) such that \(\beta _n(T_n)\le \mathrm {e}^{-nr}\). Then, from Theorem 29 we have
Defining \(x_n^2 := \log \frac{1}{1-\alpha _n(T_n)}\) this is equivalent to
solving which directly leads to the statement of the corollary. \(\quad \square \)
Theorem 29 also leads to the following finite blocklength second order lower bound on the Type II error when the Type I error is less than a threshold value.
Corollary 31
Let \(\rho ,\sigma \in \mathcal {D}_+(\mathcal {H})\). Then for any \(n \in {\mathbb {N}}\) and \(\varepsilon >0\) the minimal Type II error satisfies
where \(\gamma = \Vert \rho \sigma ^{-1}\Vert _\infty \).
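The strong converse behaviour discussed in this section can be illustrated in the commuting case, where the problem reduces to classical binary hypothesis testing. The following is a minimal Python sketch, assuming qubit states \(\rho =\mathrm {diag}(p,1-p)\) and \(\sigma =\mathrm {diag}(q,1-q)\) and the projective likelihood-ratio test that accepts \(\rho \) when the log-likelihood ratio is at least nr. The parameter values and function names are hypothetical; this particular test is chosen for illustration and is not the argument used in the proofs above.

```python
import math


def relative_entropy(p, q):
    """D(rho||sigma) in nats for rho = diag(p, 1-p) and sigma = diag(q, 1-q)."""
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


def likelihood_ratio_test_errors(p, q, n, r):
    """Return (alpha_n, beta_n) for the test accepting rho when LLR >= n * r."""
    accept_under_rho = 0.0
    accept_under_sigma = 0.0
    for k in range(n + 1):  # k = number of outcomes of the first kind
        llr = k * math.log(p / q) + (n - k) * math.log((1 - p) / (1 - q))
        if llr >= n * r:
            weight = math.comb(n, k)
            accept_under_rho += weight * p ** k * (1 - p) ** (n - k)
            accept_under_sigma += weight * q ** k * (1 - q) ** (n - k)
    alpha = 1.0 - accept_under_rho  # Type I: rho is true but rejected
    beta = accept_under_sigma       # Type II: sigma is true but rho accepted
    return alpha, beta


p, q, n, r = 0.7, 0.4, 200, 0.3   # r is chosen larger than D(rho||sigma)
divergence = relative_entropy(p, q)
alpha_n, beta_n = likelihood_ratio_test_errors(p, q, n, r)
```

On the acceptance region each sequence has \(\sigma \)-probability at most \(\mathrm {e}^{-nr}\) times its \(\rho \)-probability, so \(\beta _n\le \mathrm {e}^{-nr}\) holds by construction, while the Type I error is close to one because r exceeds the relative entropy.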
5.2 Classical-Quantum Channels
The strong converse property of the capacity of a classical-quantum (c-q) channel was proved independently in [41, 53]. In this section, we use the quantum reverse hypercontractivity inequality to obtain a finite blocklength strong converse bound for transmission of information through classical-quantum (c-q) channels. Suppose Alice wants to send classical messages belonging to a finite set \({{\mathcal {M}}}\) to Bob, using a memoryless c-q channel:
where \({{\mathcal {X}}}\) denotes a finite alphabet, and \({{\mathcal {H}}_B}\) is a finite-dimensional Hilbert space with dimension d. Thus the output of the channel under input \(x\in {\mathcal {X}}\) is some quantum state \(\rho _x={\mathcal {W}}(x)\in \mathcal {D}(\mathcal {H}_B)\). To send a message \(m \in {{\mathcal {M}}}\), Alice encodes it in a codeword
where \({{\mathcal {E}}}^{(n)}\) denotes the encoding map. She then sends it to Bob through n successive uses of the channel \({{\mathcal {W}}}^{\otimes n}\), whose action on the codeword \(x^n\) is given by
In order to infer Alice’s message, Bob applies a measurement, described by a POVM \(\Pi ^n:= \{\Pi ^n_{m'}\}_{m' \in {{\mathcal {M}}}}\) on the state \({{\mathcal {W}}}^{\otimes n}(x^n)=\rho _{x^n}\) that he receives. The outcome of the measurement would be Bob’s guess of Alice’s message. See Fig. 1.
The triple \((|\mathcal {M}|,\mathcal {E}^{(n)},\Pi ^n)\) defines a code which we denote as \(\mathcal {C}_n\) (see [51]). The rate of the code is given by \(\log |\mathcal {M}|/n\), and its maximum probability of error is given by
We let \(C_{n, \varepsilon }({\mathcal {W}})\) be the maximum rate \(\log |{\mathcal {M}}|/n\) over all codes \({\mathcal {C}}_n=(|{\mathcal {M}}|, {\mathcal {E}}^{(n)}, \Pi ^n)\) with \(p_{\max }({\mathcal {C}}_n; {\mathcal {W}})\le \varepsilon \). Then the (asymptotic) capacity of the channel is defined by
For c-q channels, this is known to be given by [24, 46]
Here the maximum is taken over all probability distributions \(P_X\) on \({\mathcal {X}}\), the bipartite state \(\rho _{XB}\) is given by
and \(I(X; B)_\rho = D(\rho _{XB}\Vert \rho _X\otimes \rho _B)\) is the mutual information function. The fact that the capacity is given by maximum mutual information is indeed implied by its additivity [47]. That is, the maximum mutual information associated to the channel \({\mathcal {W}}^{\otimes n}\) equals n times the maximum mutual information of \({\mathcal {W}}\):
Theorem 32
Let \(\mathcal {W}:\mathcal {X}\rightarrow \mathcal {D}(\mathcal {H}_B)\) be a c-q channel with \({\mathcal {W}}(x)=\rho _x\) being faithful for all \(x\in {\mathcal {X}}\). Then, for any code \({{\mathcal {C}}}_n:=(|\mathcal {M}|,\mathcal {E}^{(n)},\Pi ^n)\) with \(p_{\max }({\mathcal {C}}_n; {\mathcal {W}})\le \varepsilon \) we have
where \(d=\dim \mathcal {H}_B\) and the mutual information is computed with respect to the state
This theorem together with the additivity result (54) directly implies that for any code of rate larger than \(C(\mathcal {W})\), the maximum probability of error goes to one as \(n\rightarrow \infty \).
Proof
For every \(x^n=(x_1, \dots , x_n)\in {\mathcal {X}}^n\) let \(\Phi _{t, x^n}=\Phi _{t, x_1}\otimes \cdots \otimes \Phi _{t, x_n}\) with
Then following similar steps as in the proof of Theorem 29, using Theorem 19, Lemma 18 and the Araki–Lieb–Thirring inequality, for every \(\Pi _m^n\) we have
Letting \(x^n=x^n(m)\), using \(\text {tr}\big ( \rho _{x^n(m)} \Pi _m^n \big )\ge 1-\varepsilon \), taking logarithm of both sides and averaging over the choice of \(m\in {\mathcal {M}}\) we obtain
Now define \(\Psi _t(X) = \mathrm {e}^{-t}X + (1-\mathrm {e}^{-t})\text {tr}(X){\mathbb {I}}\). Following similar steps as in the proof of Theorem 29, using \(\rho _{x} \le {\mathbb {I}}\) it can be shown that \(\Psi _t^{\otimes n} - \Phi _{t, x^n(m)}\) is completely positive. Therefore, \(\Phi _{t, x^n(m)}(\Pi ^n_m)\le \Psi _t^{\otimes n}(\Pi _m^n)\) and we have
where the second line follows from the concavity of the logarithm function and in the third line we use the fact that \(\{\Pi ^n_m:\, m\in {\mathcal {M}}\}\) is a POVM. On the other hand,
Therefore,
Optimizing over the choice of \(t> 0\), the desired result follows. \(\quad \square \)
The above theorem leads to the following finite blocklength second order strong converse bound for the classical capacity of a c-q channel.
Corollary 33
For any sequence of codes \(\mathcal {C}_n:=(|\mathcal {M}|,\mathcal {E}^{(n)},\Pi ^n)\) of rates \(r:=\frac{\log |\mathcal {M}|}{n}>{C}(\mathcal {W})\),
where \(f:=\big (\sqrt{d+(r-C(\mathcal {W}))}-\sqrt{d} \big )^2\).
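To get a feeling for the size of the exponent f, it can be evaluated directly. The following is a minimal Python sketch; the function name and the numerical values are hypothetical. Since \(\sqrt{d+\delta }-\sqrt{d}=\delta /(\sqrt{d+\delta }+\sqrt{d})\), for a small excess rate \(\delta =r-C(\mathcal {W})\) the exponent behaves like \(\delta ^2/(4d)\) and is always bounded above by it.

```python
import math


def converse_exponent(d, rate, capacity):
    """f = (sqrt(d + (r - C)) - sqrt(d))^2 for a rate r above the capacity C."""
    excess = rate - capacity
    if excess <= 0:
        raise ValueError("the rate must exceed the capacity")
    return (math.sqrt(d + excess) - math.sqrt(d)) ** 2


# Hypothetical values: qubit output (d = 2) and a rate slightly above capacity.
d, capacity, rate = 2, 0.5, 0.51
f = converse_exponent(d, rate, capacity)
small_excess_estimate = (rate - capacity) ** 2 / (4 * d)
```

The quadratic dependence on the excess rate means that, close to capacity, a large blocklength is needed before the exponent nf becomes appreciable.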
Proof
We apply the bound found in Theorem 32, so that
The result follows by an analysis similar to the one of Corollary 30. \(\quad \square \)
Remark 8
As pointed out in Remark 7, the strong converse bound that we find here is weaker than the one of [35]. However, and as opposed to [35], our technique has recently been successfully applied to network information theoretical scenarios (see [12, 13]).
Notes
For the sake of brevity, we refrain from defining the phrases shown in italics throughout this introduction. Please refer to the main text and references therein for details.
The test could be probabilistic, but for simplicity of presentation we restrict to deterministic tests.
In the case \(p=0\), we define \(\hat{p}=0\) (see e.g., Definition 1.2 of [36]).
Our entropy function here is different from the one in [26] by a factor of p. This modification ensures us that if X and \(\sigma \) commute, we get the usual entropy function in the classical case. Moreover, this extra factor makes the entropy function non-negative even for \(p<0\).
By Brouwer’s fixed-point theorem, \(\Phi _1^*\) has a fixed point in \(\mathcal {D}(\mathcal {H})\) because it maps this compact convex set to itself. On the other hand, since \(\Phi _t^*= (\Phi _1^*)^t\), any fixed point of \(\Phi _1^*\) is an invariant of the whole semigroup. Thus \(\{\Phi _t^*:\, t\ge 0\}\) always has an invariant state in \(\mathcal {D}(\mathcal {H})\).
Again, our definition of the Dirichlet form is different from that of [26] by a factor of p/2 and a negative sign.
Note that this result was independently obtained recently in [9] by introducing the notion of a conditional log-Sobolev constant and finding a uniform lower bound on the latter. Moreover, a special case of the above theorem corresponding to \(\sigma \) being the completely mixed state was proved in [38].
This 0-eigenvector is unique since \(\mathcal {L}\) is assumed to be primitive.
What we really need is that the supports of \(\rho \) and \(\sigma \) be the same (they need not be the whole of \(\mathcal {H}\)), since in this case we may restrict everything to this common support.
References
Ahlswede, R., Gacs, P.: Spreading of sets in product spaces and hypercontraction of the Markov operator. Ann. Probab. 4(6), 925–939 (1976)
Araki, H.: On an inequality of Lieb and Thirring. Lett. Math. Phys. 19(2), 167–170 (1990)
Bardet, I.: Estimating the decoherence time using non-commutative Functional Inequalities. arXiv preprint arXiv:1710.01039 (2017)
Bardet, I., Rouzé, C.: Hypercontractivity and logarithmic Sobolev inequality for non-primitive quantum Markov semigroups and estimation of decoherence rates. arXiv preprint: arXiv:1803.05379 (2018)
Beigi, S.: Sandwiched Rényi divergence satisfies data processing inequality. J. Math. Phys. 54, 122202 (2013)
Beigi, S., King, C.: Hypercontractivity and the logarithmic Sobolev inequality for the completely bounded norm. J. Math. Phys. 57(1), 015206 (2016)
Bhatia, R.: Positive Definite Matrices. Princeton Series in Applied Mathematics. Princeton University Press, Princeton (2015)
Boucheron, S., Lugosi, G., Massart, P.: Concentration Inequalities: A Nonasymptotic Theory of Independence. Oxford University Press, Oxford (2013)
Capel, A., Lucia, A., Pérez-García, D.: Quantum conditional relative entropy and quasi-factorization of the relative entropy. J. Phys. A Math. Theor. 51(48), 484001 (2018)
Carbone, R., Martinelli, A.: Logarithmic Sobolev inequalities in non-commutative algebras. Infinite Dimens. Anal. Quantum Probab. Relat. Top. 18(02), 1550011 (2015)
Carlen, E.A., Maas, J.: Gradient flow and entropy inequalities for quantum Markov semigroups with detailed balance. J. Funct. Anal. 273(5), 1810–1869 (2017)
Cheng, H.-C., Datta, N., Rouzé, C.: Strong converse bounds in quantum network information theory: distributed hypothesis testing and source coding. arXiv preprint: arXiv:1905.00873 (2019)
Cheng, H.-C., Datta, N., Rouzé, C.: Strong converse for classical-quantum degraded broadcast channels. arXiv preprint arXiv:1905.00874 (2019)
Cubitt, T., Kastoryano, M., Montanaro, A., Temme, K.: Quantum reverse hypercontractivity. J. Math. Phys. 56(10), 102204 (2015)
de Wolf, R.: A brief introduction to Fourier analysis on the Boolean cube. Theory Comput. 1, 1–20 (2008)
Delgosha, P., Beigi, S.: Impossibility of local state transformation via hypercontractivity. Commun. Math. Phys. 332(1), 449–476 (2014)
Devetak, I., Junge, M., King, C., Ruskai, M.B.: Multiplicativity of completely bounded \(p\)-norms implies a new additivity result. Commun. Math. Phys. 266(1), 37–63 (2006)
Diaconis, P., Saloff-Coste, L.: Logarithmic Sobolev inequalities for finite Markov chains. Ann. Appl. Probab. 6(3), 695–750 (1996)
Frank, R.L., Lieb, E.H.: Monotonicity of a relative Rényi entropy. J. Math. Phys. 54, 122201 (2013)
Gorini, V., Kossakowski, A., Sudarshan, E.C.G.: Completely positive dynamical semigroups of N-level systems. J. Math. Phys. 17(5), 821–825 (1976)
Gozlan, N., Leonard, C.: Transport inequalities. A survey. Markov Process. Relat. Fields 16, 635–736 (2010)
Gross, L.: Logarithmic Sobolev inequalities. Am. J. Math. 97(4), 1061–1083 (1975)
Hiai, F., Petz, D.: The proper formula for relative entropy and its asymptotics in quantum probability. Commun. Math. Phys. 143(1), 99–114 (1991)
Holevo, A.S.: The capacity of the quantum channel with general signal states. IEEE Trans. Inf. Theory 44(1), 269–273 (1998)
Kamath, S., Anantharam, V.: Non-interactive simulation of joint distributions: the Hirschfeld–Gebelein–Rényi maximal correlation and the hypercontractivity ribbon. In: Proceedings of 50th Annual Allerton Conference on Communication, Control, and Computing, pp. 1057–1064 (2012)
Kastoryano, M.J., Temme, K.: Quantum logarithmic Sobolev inequalities and rapid mixing. J. Math. Phys. 54(5), 052202 (2013)
King, C.: Inequalities for trace norms of \(2\times 2\) block matrices. Commun. Math. Phys. 242(3), 531–545 (2003)
King, C.: Hypercontractivity for semigroups of unital qubit channels. Commun. Math. Phys. 328(1), 285–301 (2014)
Lieb, E.H., Thirring, W.E.: Inequalities for the moments of the eigenvalues of the Schrödinger Hamiltonian and their relation to Sobolev inequalities. In: Thirring, W. (ed.) The Stability of Matter: From Atoms to Stars, pp. 135–169. Springer, Berlin (1991)
Lindblad, G.: On the generators of quantum dynamical semigroups. Commun. Math. Phys. 48(2), 119–130 (1976)
Liu, J., van Handel, R., Verdú, S.: Beyond the blowing-up lemma: sharp converses via reverse hypercontractivity. In: 2017 IEEE International Symposium on Information Theory (ISIT), pp. 943–947 (June 2017)
Montanaro, A.: Some applications of hypercontractive inequalities in quantum information theory. J. Math. Phys. 53(12), 1–18 (2012)
Montanaro, A., Osborne, T.J.: Quantum boolean functions. Chic. J. Theor. Comput. Sci. (2010). https://doi.org/10.4086/cjtcs.2010.001
Mosonyi, M., Ogawa, T.: Quantum hypothesis testing and the operational interpretation of the quantum Rényi relative entropies. Commun. Math. Phys. 334(3), 1617–1648 (2015)
Mosonyi, M., Ogawa, T.: Strong converse exponent for classical-quantum channel coding. Commun. Math. Phys. 355(1), 373–426 (2017)
Mossel, E., Oleszkiewicz, K., Sen, A.: On reverse hypercontractivity. Geom. Funct. Anal. 23(3), 1062–1097 (2013)
Müller-Lennert, M., Dupuis, F., Szehr, O., Fehr, S., Tomamichel, M.: On quantum Rényi entropies: a new generalization and some properties. J. Math. Phys. 54(12), 122203 (2013)
Müller-Hermes, A., França, D.S., Wolf, M.M.: Relative entropy convergence for depolarizing channels. J. Math. Phys. 57(2), 022202 (2016)
Müller-Hermes, A., Stilck Franca, D., Wolf, M.M.: Entropy production of doubly stochastic quantum channels. J. Math. Phys. 57(2), 022203 (2016)
Nelson, E.: A quartic interaction in two dimensions. In: Goodman, R. (ed.) Mathematical Theory of Elementary Particles, pp. 69–73. Springer, Berlin (1966)
Ogawa, T., Nagaoka, H.: Strong converse to the quantum channel coding theorem. IEEE Trans. Inf. Theory 45(7), 2486–2489 (1999)
Ogawa, T., Nagaoka, H.: Strong converse and Stein’s lemma in quantum hypothesis testing. IEEE Trans. Inf. Theory 46(7), 2428–2433 (2000)
Olkiewicz, R., Zegarlinski, B.: Hypercontractivity in noncommutative \(L_p\) spaces. J. Funct. Anal. 161(1), 246–285 (1999)
Petz, D.: A variational expression for the relative entropy. Commun. Math. Phys. 114(2), 345–349 (1988)
Raginsky, M., Sason, I.: Concentration of measure inequalities in information theory, communications, and coding. Found. Trends® Commun. Inf. Theory 10(1–2), 1–246 (2013)
Schumacher, B., Westmoreland, M.D.: Sending classical information via noisy quantum channels. Phys. Rev. A 56(1), 131 (1997)
Shor, P.W.: Additivity of the classical capacity of entanglement-breaking quantum channels. J. Math. Phys. 43(9), 4334–4340 (2002)
Simon, B., Hoegh-Krohn, R.: Hypercontractive semigroups and two dimensional self-coupled bose fields. J. Funct. Anal. 9(2), 121–180 (1972)
Temme, K., Pastawski, F., Kastoryano, M.J.: Hypercontractivity of quasi-free quantum semigroups. J. Phys. A Math. Gen. 47, 5303 (2014)
Tomamichel, M., Berta, M., Hayashi, M.: Relating different quantum generalizations of the conditional Rényi entropy. J. Math. Phys. 55(8), 082206 (2014)
Watrous, J.: The Theory of Quantum Information. Cambridge University Press, Cambridge (2018)
Wilde, M.M., Winter, A., Yang, D.: Strong converse for the classical capacity of entanglement-breaking channels. Commun. Math. Phys. 331(2), 593–622 (2014)
Winter, A.: Coding theorem and strong converse for quantum channels. IEEE Trans. Inf. Theory 45(7), 2481–2485 (1999)
Wolf, M.M.: Quantum Channels & Operations: Guided Tour. http://www-m5.ma.tum.de/foswiki/pub/M5/Allgemeines/MichaelWolf/QChannelLecture.pdf (2012). Lecture Notes Based on a Course Given at the Niels-Bohr Institute
Additional information
Communicated by M. M. Wolf
Appendix
Proof of Proposition 7
(i) As mentioned in [16] (and explicitly worked out in [5]) for \(p\ge 1\), contractivity can be proven using the Riesz–Thorin interpolation theorem. So we focus on \(p\in (-\infty , -1]\cup [1/2, 1)\). First let \(p=-q\in (-\infty , -1]\), and \(X> 0\). We note that
On the other hand, \(\Phi _t\) is completely positive and unital, and \(z\mapsto z^{-1}\) is operator convex. Therefore, by operator Jensen’s inequality \(\Phi _t(X^{-1})\ge \Phi _t(X)^{-1}\) and by the monotonicity of the norm we have \(\Vert \Phi _t(X)^{-1}\Vert _{q,\sigma } \le \Vert \Phi _t(X^{-1})\Vert _{q,\sigma }\). We conclude that
where for the second inequality we use q-contractivity of \(\Phi _t\) for \(q\ge 1\).
Now suppose that \(p\in [1/2, 1)\). We note that its Hölder conjugate \({\hat{p}}\in (-\infty , -1]\), and that \(\Phi _t\) is reverse \({\hat{p}}\)-contractive. Then using Hölder’s duality, for \(X>0\) we have
where \(\widehat{\Phi }_t\) is the adjoint of \(\Phi _t\) with respect to \(\langle .,.\rangle _\sigma \), for each \(t\ge 0\). Here the first equality follows from Lemma 6, and the inequality follows from the reverse \({\hat{p}}\)-contractivity of \(\Phi _t\), i.e., \(\Vert \Phi _t(Y)\Vert _{{\hat{p}}, \sigma } \ge \Vert Y\Vert _{{\hat{p}}, \sigma } \ge 1\).
(ii) As worked out in [14] this is an immediate consequence of the operator Jensen inequality.
Second Proof of Theorem 14
The proof is very similar to the one used in [3] to prove the strong \(L_p\)-regularity of the Dirichlet forms. Before stating the proof we need some definitions.
For a compact set I we let C(I) be the Banach space of continuous, complex-valued functions on I (equipped with the supremum norm). Then the Banach space \(C(I\times I)\) becomes a \(*\)-algebra when endowed with the natural involution \(f\mapsto f^*\) with \(f^*(x,y)=\overline{f(x,y)}\). Thus \(C(I\times I)\) is a \(C^*\)-algebra.
We endow \(\mathcal {B}(\mathcal {H})\) with a Hilbert space structure by equipping it with the Hilbert–Schmidt inner product:
Fix \(X,Y\in \mathcal {B}_{sa}(\mathcal {H})\), and let I be a compact interval containing the spectrum of both X and Y. We define a \(*\)-representation \(\pi _{X,Y}: C(I\times I)\rightarrow \mathcal {B}\big (\mathcal {B}(\mathcal {H})\big )\) that is uniquely determined by its action on tensor products of functions as follows. For \(f, g\in C(I)\) we define \(\pi _{X, Y}(f\otimes g)\in \mathcal {B}\big ( \mathcal {B}(\mathcal {H}) \big )\) by
The following lemma can be found in [3] (see Lemma 4.2):
Lemma 34
\(\pi _{X,Y}\) is a \(*\)-representation between \(C^*\)-algebras. That is,
(i) \(\pi _{X,Y}(1)=\mathcal {I}\), where 1 is the constant function on \(I\times I\) equal to 1.
(ii) \(\pi _{X,Y}(f^*g)=\pi _{X,Y}(f)^*\pi _{X,Y}(g)\) for all \(f,g\in C(I\times I)\).
(iii) If \(f\in C(I\times I)\) is a non-negative function, then \(\pi _{X,Y}(f)\) is a positive semi-definite operator on \(\mathcal {B}(\mathcal {H})\) for the Hilbert–Schmidt inner product, i.e., \(\pi _{X, Y}(f)\in \mathcal {P}\big ( \mathcal {B}(\mathcal {H}) \big )\).
Now, for any function \(f\in C(I)\), define \({\tilde{f}}\) to be the function in \(C(I\times I)\) defined by
The following lemma, proved in [3] (see Lemma 4.2), gives a generalization of the chain rule formula to a derivation.
Lemma 35
Let \(X, Y\in \mathcal {B}_{sa}(\mathcal {H})\) and let I be a compact interval containing the spectra of X and Y. Let \(f\in C(I)\) be a continuously differentiable function such that \(f(0)=0\). Then for all \(V\in \mathcal {B}(\mathcal {H})\) we have
where \({\tilde{f}}\) is defined by (55).
We can now prove the theorem. By the result of [11] (an extension of Lemma 13), there are superoperators \(\partial _j:\mathcal {B}(\mathcal {H})\rightarrow \mathcal {B}(\mathcal {H})\) of the form
where \(V_j\in \mathcal {B}(\mathcal {H})\), such that
Moreover, \(V_j\)’s are such that there are \(\omega _j\ge 0\) with
Using the above equation one can show [3] that
For arbitrary \(X> 0\) define \(Y_j:= \omega _j^{-1/4}\, \Gamma _{\sigma }^{\frac{1}{2}}(X)\) and \({Z}_j:= \omega _j^{1/4}\,\Gamma _{\sigma }^{\frac{1}{2}}(X)\). Using (57) we compute
where in (58) we used (56), in (59) we used (57), and in (60) we used the chain rule formula of Lemma 35 for the functions \(f_\alpha \) with \(f_\alpha (x)=x^{\alpha }\). Finally, in (61) we used part (ii) of Lemma 34.
Now, using the proofs of Theorem 2.1 and Lemma 2.4 of [36], for any \(x,y\ge 0\) and \(0\le p\le q\le 2\) we have
This means that for all x, y we have
Hence, by part (iii) of Lemma 34 we have
Remark 9
The difference with the proof of \(L_p\)-regularity of [3] lies in the choice of the inequality (62) used at the end of the proof.
Proof of Theorem 25
Since both \(\text {Ent}_{2, \sigma }(X)\) and \(\mathcal {E}_{2, \mathcal {L}}(X)\) are homogeneous of degree two in X, to prove a log-Sobolev inequality, without loss of generality we can assume that X is of the form \(X=\Gamma _\sigma ^{-1/2}(\sqrt{\rho })\) where \(\rho \) is a density matrix. In this case
Let \(\sigma = \sum _{i=1}^d s_i|i\rangle \langle i|\) and \(\rho =\sum _{k=1}^d r_k |{{\tilde{k}}}\rangle \langle {{\tilde{k}}}|\) be the eigen-decompositions of \(\sigma \) and \(\rho \). Then
and
Let \(A=(a_{ik})_{d\times d}\) be a \(d\times d\) matrix whose entries are given by
Observe that, fixing the eigenvalues \(s_i\)’s and \(r_k\)’s, the entropy \(\text {Ent}_{2, \sigma }(X)\) is a linear function of A and \(\mathcal {E}_{2, \mathcal {L}}(X)\) is a concave function of A. On the other hand, since both \(\{|1\rangle , \dots , |d\rangle \}\) and \(\{|{{\tilde{1}}}\rangle , \dots , |{{\tilde{d}}}\rangle \}\) form orthonormal bases, A is a doubly stochastic matrix. Then by Birkhoff’s theorem, A can be written as a convex combination of permutation matrices. We conclude that if an inequality of the form
holds for all permutation matrices A, then it holds for all doubly stochastic A, and then for all \(\sigma , \rho \) with the given eigenvalues. We note that A is a permutation matrix when \(\{|1\rangle , \dots , |d\rangle \}\) and \(\{|{{\tilde{1}}}\rangle , \dots , |{{\tilde{d}}}\rangle \}\) are the same bases (up to some permutation) which means that \(\sigma \) and \(\rho \) commute. Therefore, a log-Sobolev inequality of the form
holds for all \(\rho \) if and only if it holds for all \(\rho \) that commute with \(\sigma \). That is, to find the log-Sobolev constant
we may restrict to those \(\rho \) that commute with \(\sigma \). This optimization problem over such \(\rho \) is equivalent to computing the 2-log-Sobolev constant of the classical simple Lindblad generator, and has been solved in Theorem A.1 of [18]. \(\quad \square \)
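The role of Birkhoff’s theorem in the proof above can be illustrated for \(d=2\). The following is a minimal Python sketch, under the assumption (not restated in the text above) that the entries of A are the squared overlaps \(a_{ik}=|\langle i|{{\tilde{k}}}\rangle |^2\) of the two eigenbases; the real rotation angle `theta` and the helper names are hypothetical.

```python
import math


def overlap_matrix(basis, other_basis):
    """Squared overlaps a_ik = <e_i, f_k>^2 of two real orthonormal bases.

    This form of A is an assumption made for the illustration.
    """
    return [[sum(x * y for x, y in zip(e, f)) ** 2 for f in other_basis]
            for e in basis]


def birkhoff_weights_2x2(a):
    """Weights (w_id, w_swap) with A = w_id * I + w_swap * SWAP.

    Valid for a 2x2 doubly stochastic matrix, whose only permutation
    matrices are the identity and the swap.
    """
    return a[0][0], a[0][1]


theta = 0.3  # hypothetical angle between the eigenbases of sigma and rho
standard = [(1.0, 0.0), (0.0, 1.0)]
rotated = [(math.cos(theta), math.sin(theta)),
           (-math.sin(theta), math.cos(theta))]

A = overlap_matrix(standard, rotated)
w_id, w_swap = birkhoff_weights_2x2(A)
```

In this example A is a permutation matrix exactly when `theta` is a multiple of \(\pi /2\), that is, when the two bases coincide up to permutation and sign, which is the commuting case singled out in the proof.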
Cite this article
Beigi, S., Datta, N. & Rouzé, C. Quantum Reverse Hypercontractivity: Its Tensorization and Application to Strong Converses. Commun. Math. Phys. 376, 753–794 (2020). https://doi.org/10.1007/s00220-020-03750-z