1 Introduction and First Results

Consider a quantum system in which an observable \(A\) can be written as a sum \(A=A_1+\cdots+A_k\) of a number of components \(A_1,\dots,A_k\). If the components correspond to isolated subsystems, then the total quantum entropy of the system \(S(A)=-\operatorname{Tr} A\log A \) is equal to the sum of the entropies of the subsystems. In the general case we may define the residual entropy

$$\varphi(A_1,\dots,A_k)=S(A)-\sum _{i=1}^k S(A_i)\quad A=A_1+ \cdots+A_k $$

as the difference between the total entropy of the system and the sum of the entropies of the subsystems, although it is a negative quantity.

Another type of residual entropy is the entropy gain over a quantum channel studied by Holevo and others [7, 8],

$$A\to S \bigl(\varPhi(A) \bigr)-S(A), $$

where Φ is a quantum channel represented by a completely positive trace preserving linear map.

Theorem 1.1

Consider n×n matrices A and n×m matrices K. The trace function

$$\varphi(A)=-\operatorname{Tr} K^*AK\log \bigl(K^*AK \bigr)+\operatorname{Tr} K^*(A\log A)K $$

is convex in positive definite A for arbitrary K.

Proof

The function \(f(t)=t\log t\) defined for \(t>0\) is operator convex. This is well known, but it may also be derived from [6, Theorem 2.4] since \(f(0)=0\) and \(\log t\) is operator monotone. The perspective function,

$$g(t,s)=s f \bigl(ts^{-1} \bigr)=t\log t-t\log s\quad t,s>0, $$

is therefore operator convex as a function of two variables [3, Theorem 2.2]. Consider the Hilbert space \(\mathcal{H}=M_{n\times m} \) equipped with inner product given by \((X,Y)=\operatorname{Tr} Y^{*}X \) for matrices \(X,Y\in M_{n\times m}\), and let \(L_A\) and \(R_B\) denote left and right multiplication with \(A\in M_n\) and \(B\in M_m\) respectively. If \(A\) and \(B\) are positive definite matrices, then \(L_A\) and \(R_B\) are positive definite commuting operators on \(\mathcal{H}\). Operator convexity of the perspective function \(g(t,s)\) is equivalent to convexity of the map

$$\begin{aligned} (A,B) \to&\operatorname{Tr} K^* (L_{A\log A}-L_A R_{\log B} ) (K) \\ =&\operatorname{Tr} \bigl(K^* (A\log A) K - K^* A K\log B \bigr) \quad A,B>0 \end{aligned}$$

for every \(K\in M_{n\times m}\), cf. [4, Theorem 1.1]. The statement of the theorem now follows by replacing \(B\) with \(K^*AK\), which depends linearly on \(A\), in the above expression. □
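As a numerical sanity check of Theorem 1.1, the following Python sketch evaluates \(\varphi\) at two random positive definite matrices and at their midpoint. It assumes NumPy is available; the helper names (`sym_fun`, `xlogx`, `phi`, `midpoint_gap`), the matrix sizes and the seed are illustrative choices and do not come from the text. A non-negative gap is consistent with convexity but is of course no substitute for the proof.

```python
import numpy as np

def sym_fun(a, fn):
    """Apply a scalar function to a Hermitian matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(a)
    return (v * fn(w)) @ v.conj().T

def xlogx(a):
    """Matrix function A log A for positive definite A."""
    return sym_fun(a, lambda w: w * np.log(w))

def phi(a, k):
    """phi(A) = -Tr K*AK log(K*AK) + Tr K*(A log A)K."""
    b = k.conj().T @ a @ k
    return float(np.trace(-xlogx(b) + k.conj().T @ xlogx(a) @ k).real)

def random_pd(n, rng):
    """Random real positive definite n x n matrix (illustrative choice)."""
    g = rng.standard_normal((n, n))
    return g @ g.T + 0.1 * np.eye(n)

def midpoint_gap(n=4, m=3, seed=0):
    """(phi(A1) + phi(A2))/2 - phi((A1 + A2)/2); assumes m <= n so K*AK is invertible."""
    rng = np.random.default_rng(seed)
    a1, a2 = random_pd(n, rng), random_pd(n, rng)
    k = rng.standard_normal((n, m))
    return 0.5 * (phi(a1, k) + phi(a2, k)) - phi(0.5 * (a1 + a2), k)
```

Calling `midpoint_gap(seed=s)` for a few seeds should return values that are non-negative up to rounding error.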

Corollary 1.2

The residual entropy

$$\varphi(A_1,\dots,A_k)=-\operatorname{Tr} A\log A+\sum _{i=1}^k \operatorname{Tr} A_i \log A_i \quad A=A_1+\cdots+A_k $$

is a convex function in positive definite \(n\times n\) matrices \(A_1,\dots,A_k\).

Proof

We apply Theorem 1.1 to block matrices of the form

$$A= \left (\begin{array}{c@{\quad}c@{\quad}c@{\quad}c} A_1 & 0 & \ldots& 0\\ 0 & A_2 & & 0\\ \vdots& & \ddots\\ 0 & 0 & & A_k \end{array} \right ) \quad\text{and}\quad K= \left (\begin{array}{c@{\quad}c@{\quad}c@{\quad}c} I & 0 & \ldots& 0\\ I & 0 & \ldots& 0\\ \vdots& \vdots& & \vdots\\ I & 0 & \ldots& 0 \end{array} \right ) , $$

and since the entry in the first row and the first column of the block matrix

$$-K^*AK\log \bigl(K^*AK \bigr)+K^*(A\log A)K $$

is equal to

$$-(A_1+\cdots+A_k)\log (A_1+ \cdots+A_k )+\sum_{i=1}^k A_i\log A_i $$

the statement of the corollary follows. Notice that we used the same block matrix technique as in [2]. □

It is actually much easier to obtain the above result by expressing the residual entropy as a sum of relative entropies. We may however obtain other results by carefully choosing the arbitrary matrix K in Theorem 1.1.

Corollary 1.3

Consider the entropy gain

$$\varphi(A)= S \bigl(\varPhi(A) \bigr)-S(A) $$

over a quantum channel Φ, where the channel is represented by a completely positive trace preserving linear map Φ. The entropy gain φ(A) is a convex function in A.

Proof

A completely positive trace preserving linear map \(\varPhi\colon M_n\to M_m\) is of the form

$$\varPhi(A)=\sum_{i=1}^k a_i^* A a_i $$

where the so-called Kraus matrices \(a_1,\dots,a_k\in M_{n\times m}\) satisfy

$$a_1 a_1^*+\cdots+a_k a_k^*=1. $$

We now apply Theorem 1.1 by substituting A by the matrix

$$\left (\begin{array}{c@{\quad}c@{\quad}c@{\quad}c} A & 0 & \ldots& 0\\ 0 & A & & 0\\ \vdots& & \ddots\\ 0 & 0 & & A \end{array} \right ) \quad\text{and setting}\quad K= \left (\begin{array}{c@{\quad}c@{\quad}c@{\quad}c} a_1 & 0 & \ldots& 0\\ a_2 & 0 & \ldots& 0\\ \vdots& \vdots& & \vdots\\ a_k & 0 & \ldots& 0 \end{array} \right ) . $$

The entry in the first row and the first column of the block matrix

$$-K^*AK\log \bigl(K^*AK \bigr)+K^*(A\log A)K $$

is then equal to \(-\varPhi(A)\log\varPhi(A)+\varPhi(A\log A)\). Since \(\varPhi\) is trace preserving it follows that the entropic map

$$A\to S \bigl(\varPhi(A) \bigr)+\operatorname{Tr}\varPhi(A\log A)=S \bigl( \varPhi(A) \bigr)-S(A) $$

is convex. □
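The construction in Corollary 1.3 can be explored numerically. The sketch below, assuming NumPy, builds random real Kraus matrices with \(a_1a_1^*+\cdots+a_ka_k^*=1\) from a QR factorisation and evaluates the entropy gain. The function names, the sizes and the QR-based sampling are assumptions made for illustration only.

```python
import numpy as np

def entropy(a):
    """S(A) = -Tr A log A for a positive definite symmetric matrix A."""
    w = np.linalg.eigvalsh(a)
    return float(-np.sum(w * np.log(w)))

def random_pd(n, rng):
    """Random real positive definite n x n matrix (illustrative choice)."""
    g = rng.standard_normal((n, n))
    return g @ g.T + 0.1 * np.eye(n)

def random_kraus(n, m, k, rng):
    """Kraus matrices a_1..a_k of shape (n, m) with sum_i a_i a_i^T = I_n; needs k*m >= n."""
    q, _ = np.linalg.qr(rng.standard_normal((k * m, n)))  # q has orthonormal columns
    v = q.T                                               # hence v @ v.T = I_n
    return [v[:, i * m:(i + 1) * m] for i in range(k)]

def channel(a, kraus):
    """Phi(A) = sum_i a_i^T A a_i."""
    return sum(ai.T @ a @ ai for ai in kraus)

def entropy_gain(a, kraus):
    """S(Phi(A)) - S(A)."""
    return entropy(channel(a, kraus)) - entropy(a)

def convexity_gap(a1, a2, kraus):
    """Average of the gains minus the gain at the midpoint."""
    mid = entropy_gain(0.5 * (a1 + a2), kraus)
    return 0.5 * (entropy_gain(a1, kraus) + entropy_gain(a2, kraus)) - mid
```

For randomly drawn inputs, `convexity_gap` is expected to be non-negative up to rounding, and `channel` should preserve the trace.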

Corollary 1.4

The entropy gain

$$\varphi(A_1,\dots,A_k)=S \bigl(\varPhi_1(A_1)+ \cdots+\varPhi_k(A_k) \bigr)-\sum _{i=1}^k S(A_i) $$

of \(k\) positive definite quantities observed through \(k\) quantum channels \(\varPhi_1,\dots,\varPhi_k\) is a convex function in \(A_1,\dots,A_k\).

Proof

The statement is obtained as in the above corollary by considering suitable block matrices, where each block corresponds to a single quantum channel. We leave the details to the reader. □

2 Carlen-Lieb Trace Functions

We give new proofs of some of the statements in [2] without using variational methods.

Theorem 2.1

(Carlen-Lieb)

The trace function

$$(A,B)\to\operatorname{Tr} \bigl(A^p+B^p \bigr)^{1/r}\quad 0<p\le r\le1 $$

is concave in positive definite matrices A and B.

Proof

The function

$$f(t)= \bigl(t^p+1 \bigr)^{1/p}\quad t>0 $$

is operator monotone, cf. [1, Corollary 4.3]. Indeed, if \(z=re^{i\theta}\) with \(0<\theta<\pi\), then \(z^p=r^p e^{ip\theta}\). Since we add a positive constant it is plain that the argument of \(z^p+1\) is less than \(p\theta\) but still positive. The argument of \(f(z)\) is therefore between zero and \(\theta<\pi\). We have shown that the analytic continuation of \(f\) to the complex upper half plane has positive imaginary part, thus \(f\) is operator monotone.

The perspective function

$$(t,s)\to s f \bigl(ts^{-1} \bigr)=s \bigl(t^ps^{-p}+1 \bigr)^{1/p}= \bigl(t^p+s^p \bigr)^{1/p} $$

is therefore operator concave, cf. [3, Theorem 2.2] and so is the function,

$$g(t,s)= \bigl(t^p+s^p \bigr)^{1/r}\quad t,s>0, $$

that appears by composing with the operator monotone and operator concave function \(t\to t^{p/r}\).

The left and right multiplication operators \(L_A\) and \(R_B\) are positive definite commuting operators on the Hilbert space \(\mathcal{H}=M_{n} \) equipped with the inner product \((A,B)=\operatorname{Tr} B^{*} A\). It follows that the (super) operator mapping

$$(A,B)\to \bigl(L_A^p+R_B^p \bigr)^{1/r} $$

is concave according to the preceding remark. The trace function

$$ (A,B)\to\operatorname{Tr} K^* \bigl(L_A^p+R_B^p \bigr)^{1/r}(K) $$
(1)

is therefore concave by [4, Theorem 1.1]. The statement now follows by choosing K as the identity matrix. Indeed, under the trace we have

$$\operatorname{Tr}(L_A+L_B) (A+B)^n= \operatorname{Tr}(A+B)^{n+1} $$

for each n, and we thus obtain

$$\operatorname{Tr} \bigl(L_A^p+R_B^p \bigr)^{1/r}(I)=\operatorname{Tr} \bigl(A^p+B^p \bigr)^{1/r} $$

by simple algebraic calculations. □

Notice that the statement in (1) is stronger than what is obtained in the reference [2].

Theorem 2.2

The function

$$f(t)= \bigl(t^p+1 \bigr)^{1/p}\quad t>0 $$

is operator convex for 1≤p≤2.

Proof

We have previously shown that f is operator monotone for 0<p≤1. Let us calculate the representing measure.

We set \(z=re^{i\theta}\) for \(r>0\) and \(0<\theta<\pi\) and calculate the analytic continuation of \(f\),

$$f \bigl(r e^{i\theta} \bigr)= \bigl(r^p e^{ip\theta}+1 \bigr)^{1/p}, $$

into the complex upper half plane. Let \(\arg z\) with \(0\le\arg z<2\pi\) denote the angle between the positive x-axis and the complex number \(z=x+iy\). With this non-standard convention \(\arg z\) is an analytic function in \(\mathbf{C}\setminus[0,\infty)\), and the angle \(A_p(r,\theta)\) between the positive x-axis and \((r^p e^{ip\theta}+1)^{1/p}\) is given by

$$A_p(r,\theta)=\frac{1}{p}\arg \bigl(r^p\cos p \theta+1 +i r^p \sin p\theta \bigr), $$

and it satisfies

$$0< A_p(r,\theta)<\theta<\pi\quad\text{for } 0<p\le1, r>0, 0<\theta< \pi . $$

The imaginary part of the analytic continuation of f is therefore given by

$$\Im f \bigl(r e^{i\theta} \bigr)= \bigl(1+r^{2p}+2r^p \cos p\theta \bigr)^{1/(2p)} \sin A_p(r,\theta), $$

and the representing measure of f is obtained as the limit

$$\frac{1}{\pi} \lim_{\theta\to\pi} \Im f \bigl(r e^{i\theta} \bigr)= \frac{1}{\pi} \bigl(1+r^{2p}+2r^p\cos p\pi \bigr)^{1/(2p)} \sin A_p(r,\pi). $$

It follows that

$$ \bigl(t^p+1 \bigr)^{1/p}=\beta+t+\int _0^\infty \biggl(\frac{\lambda}{1+\lambda ^2}- \frac{1}{t+\lambda} \biggr) h_p(\lambda) d\lambda, $$
(2)

where \(\beta\) is a constant determined by setting \(t=0\) in Eq. (2), and the non-negative function \(h_p\) is given by

$$h_p(\lambda)=\frac{1}{\pi} \bigl(1+\lambda^{2p}+2 \lambda^p\cos p\pi \bigr)^{1/(2p)} \sin A_p(\lambda, \pi)\quad\lambda> 0, $$

cf. [5] for the details. The key in the proof is the realisation that

$$\pi<A_p(\lambda,\pi)<2\pi\quad\text{for $ 1<p< 2 $ and $ \lambda >0$,} $$

and this is so because \(\arg z<\arg(z+1)<2\pi\) when \(z\) is in the lower complex half-plane. It follows that both sides in Eq. (2) are real analytic functions in \(p\) in the whole interval \((0,2)\).

The formula in (2) is consequently valid also for \(1\le p\le 2\). However, for \(1<p<2\) the weight function \(h_p\) is negative, implying that \(f\) is operator convex. Notice that \(h_p=0\) for \(p=1\). □

The same line of arguments as for 0<p≤1 applies, so we obtain:

Corollary 2.3

The trace function

$$(A,B)\to\operatorname{Tr} \bigl(A^p+B^p \bigr)^{1/p}\quad1\le p\le2 $$

is convex in positive definite matrices A and B.

2.1 Variational inequalities

Remark 2.4

Let x and y be positive numbers and take 0<p<1. It is easy to prove that

$$\bigl(x^p+y^p \bigr)^{1/p}\le \lambda^{(p-1)/p} x + (1-\lambda)^{(p-1)/p}y\quad \text{for } 0< \lambda<1 $$

with equality for \(\lambda=x^p(x^p+y^p)^{-1}\).
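The scalar inequality of Remark 2.4 is easy to test numerically. The following plain-Python sketch evaluates both sides and the value of \(\lambda\) at which equality is claimed; the function names and the sample values of \(x\), \(y\) and \(p\) used with it are illustrative assumptions.

```python
def lhs(x, y, p):
    """Left-hand side (x^p + y^p)^(1/p)."""
    return (x**p + y**p) ** (1.0 / p)

def rhs(x, y, p, lam):
    """Right-hand side lam^((p-1)/p) x + (1-lam)^((p-1)/p) y for 0 < lam < 1."""
    e = (p - 1.0) / p
    return lam**e * x + (1.0 - lam) ** e * y

def optimal_lambda(x, y, p):
    """The value lam = x^p / (x^p + y^p) at which equality is attained."""
    return x**p / (x**p + y**p)
```

For instance, with `x, y, p = 2.0, 3.0, 0.5` one can scan `lam` over a grid in the open unit interval and compare `rhs(x, y, p, lam)` with `lhs(x, y, p)`; the two agree at `optimal_lambda(x, y, p)`.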

Theorem 2.5

Let 0<p<1 and take positive definite n×n matrices A,B. Then

$$\operatorname{Tr} \bigl(A^p+B^p \bigr)^{1/p}\le \operatorname{Tr} \bigl( X^{(p-1)/p} A + (1-X)^{(p-1)/p} B \bigr) $$

for each \(n\times n\) matrix \(X\) with \(0<X<1\). If \(A\) and \(B\) commute then there is equality for \(X=A^p(A^p+B^p)^{-1}\).

Proof

We know that the trace function \(\varphi(X,Y)=\operatorname{Tr}(X^{p}+Y^{p})^{1/p} \) is concave in positive definite X and Y. It is also positively homogeneous since

$$\varphi(tX,tY)=t\varphi(X,Y)\quad t>0. $$

It follows that the Fréchet differential

$$d\varphi(X,Y) (A,B)\ge\varphi(A,B) $$

for positive definite X,Y,A,B, cf. for example [9, Lemma 5]. We notice that

$$d\varphi(X,Y) (A,B)=d_1\varphi(X,Y)A+d_2\varphi(X,Y)B $$

by the chain rule for Fréchet differentials. By setting \(f(t)=t^{1/p}\) and \(g(t)=t^p\) we obtain

$$\begin{aligned} d_1\varphi(X,Y)A =&\operatorname{Tr} \mathit{df}\bigl(X^p+Y^p \bigr) \mathit{dg}(X)A=\operatorname{Tr} f'\bigl(X^p+Y^p \bigr) \mathit{dg}(X)A \\ =&\frac{1}{p}\operatorname{Tr}\bigl(X^p+Y^p \bigr)^{(1-p)/p} \mathit{dg}(X)A \end{aligned}$$

and similarly

$$d_2\varphi(X,Y)B=\frac{1}{p}\operatorname{Tr} \bigl(X^p+Y^p \bigr)^{(1-p)/p} \mathit{dg}(Y)B. $$

We thus derive that

$$\operatorname{Tr} \bigl(A^p+B^p \bigr)^{1/p}\le \frac{1}{p} \operatorname{Tr} \bigl(X^p+Y^p \bigr)^{(1-p)/p} \bigl(\mathit{dg}(X)A+ \mathit{dg}(Y)B \bigr). $$

Let now \(0<X<1\) and set \(Y=(1-X^p)^{1/p}\). Then \(X^p+Y^p=1\) and thus

$$\begin{aligned} \operatorname{Tr}\bigl(A^p+B^p\bigr)^{1/p} \le& \frac{1}{p} \operatorname{Tr} \bigl(\mathit{dg}(X)A+ \mathit{dg}(Y)B \bigr) \\ =&\frac{1}{p} \operatorname{Tr} \bigl(g'(X)A+ g'(Y)B \bigr) \\ =&\operatorname{Tr} \bigl( X^{p-1} A+\bigl(1-X^p \bigr)^{(p-1)/p}B \bigr). \end{aligned}$$

We may replace \(X\) with \(X^{1/p}\) since any \(0<X<1\) can be obtained in this way, and we obtain

$$\operatorname{Tr} \bigl(A^p+B^p \bigr)^{1/p}\le \operatorname{Tr} \bigl( X^{(p-1)/p} A+ (1-X)^{(p-1)/p} B \bigr) $$

which is the statement of the theorem. □
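A small numerical experiment can illustrate Theorem 2.5. The sketch below, assuming NumPy and real symmetric matrices, computes both sides of the inequality; the helper names and the way the test matrix \(X\) with \(0<X<1\) is sampled (spectrum in \([0.1,0.9]\)) are hypothetical choices made only for this illustration.

```python
import numpy as np

def mpow(a, r):
    """Real power of a symmetric positive definite matrix."""
    w, v = np.linalg.eigh(a)
    return (v * w**r) @ v.T

def lhs(a, b, p):
    """Tr (A^p + B^p)^(1/p)."""
    return float(np.trace(mpow(mpow(a, p) + mpow(b, p), 1.0 / p)))

def rhs(a, b, x, p):
    """Tr ( X^((p-1)/p) A + (1 - X)^((p-1)/p) B ) for 0 < X < 1."""
    e = (p - 1.0) / p
    one = np.eye(len(x))
    return float(np.trace(mpow(x, e) @ a + mpow(one - x, e) @ b))

def random_pd(n, rng):
    """Random real positive definite n x n matrix (illustrative choice)."""
    g = rng.standard_normal((n, n))
    return g @ g.T + 0.1 * np.eye(n)

def random_unit_interval_matrix(n, rng):
    """Random symmetric X with spectrum in [0.1, 0.9], so that 0 < X < 1."""
    q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return (q * rng.uniform(0.1, 0.9, n)) @ q.T
```

For commuting \(A\) and \(B\), for example diagonal matrices, choosing `x = mpow(a, p) @ np.linalg.inv(mpow(a, p) + mpow(b, p))` should make the two sides coincide up to rounding.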

3 New Types of Trace Functions

Theorem 3.1

Let 0<p≤1. The function of two variables,

$$g(t,s)=\left \{ \begin{array}{l@{\quad}l} \frac{t-s}{t^p-s^p} & t\ne s\\ \frac{1}{p} t^{1-p} & t=s, \end{array} \right . $$

defined for t,s>0, is operator concave.

Proof

We notice that g(t,s) is not a perspective function, so our approach will have to be more indirect. We first prove that for 0≤λ≤1 the function

$$f_\lambda(t)= \bigl(\lambda t^p + 1-\lambda \bigr)^{1/p}\quad t>0 $$

is operator monotone. Indeed, if \(z=re^{i\theta}\) with \(0<\theta<\pi\), then \(z^p=r^p e^{ip\theta}\). Since we add a positive constant it is plain that the argument of \(\lambda z^p+1-\lambda\) is less than \(p\theta\) but still positive. The argument of \(f_\lambda(z)\) is therefore between zero and \(\theta<\pi\). We have shown that the analytic continuation of \(f_\lambda\) to the complex upper half plane has positive imaginary part, thus \(f_\lambda\) is operator monotone.

The perspective function

$$(t,s)\to s f_\lambda \bigl(ts^{-1} \bigr)= \bigl(\lambda t^p+(1-\lambda) s^p \bigr)^{1/p}\quad t,s>0 $$

is operator concave and so is any function that appears as the composition of an operator monotone function of one variable with the perspective. It follows that

$$(t,s)\to \bigl(\lambda t^p +(1-\lambda)s^p \bigr)^{(1-p)/p} $$

is operator concave. However, by an elementary calculation we may write

$$\frac{t-s}{t^p-s^p}=\frac{1}{p}\int_0^1 \bigl(\lambda t^p+(1-\lambda )s^p \bigr)^{(1-p)/p} d \lambda, $$

and the statement of the theorem follows. □
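The integral representation used in the proof follows from the substitution \(u=\lambda t^p+(1-\lambda)s^p\), and it can be checked numerically in the scalar case. The plain-Python sketch below compares \(g(t,s)\) with a composite Simpson approximation of the integral; the function names and the number of subintervals are illustrative assumptions.

```python
def g(t, s, p):
    """g(t, s) = (t - s)/(t^p - s^p), with the limit t^(1-p)/p on the diagonal."""
    if t == s:
        return t ** (1.0 - p) / p
    return (t - s) / (t**p - s**p)

def g_integral(t, s, p, n=2000):
    """Simpson approximation of (1/p) * int_0^1 (lam t^p + (1-lam) s^p)^((1-p)/p) dlam.

    The number of subintervals n must be even.
    """
    def f(lam):
        return (lam * t**p + (1.0 - lam) * s**p) ** ((1.0 - p) / p)
    h = 1.0 / n
    total = f(0.0) + f(1.0)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3.0 / p
```

The two functions should agree to high accuracy for any \(t,s>0\) and \(0<p\le 1\), including the diagonal case \(t=s\).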

Take \(0<p\le 1\). Since the function \((t,s)\to\frac {t-s}{t^{p}-s^{p}} \) is operator concave, it follows that the trace function

$$(A,B)\to\operatorname{Tr} K^*\frac{L_A-R_B}{L_A^p-R_B^p}(K) $$

is concave in positive definite \(n\times n\) matrices for any \(n\times n\) matrix \(K\), where \(L_A\) and \(R_B\) denote left and right multiplication with \(A\) and \(B\).

By choosing K as the unit matrix we obtain:

Theorem 3.2

Let 0<p≤1. The trace function

$$(A,B)\to\operatorname{Tr}\frac{A-B}{A^p-B^p} $$

is concave in positive definite matrices.

4 The Fréchet Differential

Some of the techniques in this section are adapted from [9].

Theorem 4.1

Consider the function \(f(t)=t^p\) for \(0<p\le 1\). The map

$$x\to\operatorname{Tr} h \mathit{df}(x)^{-1}h, $$

defined in positive definite n×n matrices, is concave for each self-adjoint n×n matrix h.

Proof

Consider \(x>0\) and a basis \((e_{i})_{i=1}^{n} \) in which \(x\) is diagonal with eigenvalues given by \(xe_i=\lambda_i e_i\) for \(i=1,\dots,n\). We may then calculate

$$e_i \mathit{df}(x) h e_j=e_i h e_j \frac{\lambda_i^p-\lambda_j^p}{\lambda _i-\lambda_j}\quad i,j=1,\dots,n. $$

Expressed in this basis \(\mathit{df}(x)h=h\circ L_f(\lambda_1,\dots,\lambda_n)\) is the Hadamard (entry-wise) product of \(h\) and the Löwner matrix

$$L_f (\lambda_1,\dots,\lambda_n )= \biggl( \frac{\lambda ^p_i-\lambda^p_j}{\lambda_i-\lambda_j} \biggr)_{i,j=1}^n . $$

The inverse Fréchet differential \(\mathit{df}(x)^{-1}h\) is therefore well-defined and given by the Hadamard product

$$\mathit{df}(x)^{-1} h =h\circ \biggl(\frac{\lambda_i-\lambda_j}{\lambda^p_i-\lambda ^p_j} \biggr)_{i,j=1}^n $$

expressed in the same basis and thus

$$\operatorname{Tr} h \mathit{df}(x)^{-1}h=\sum_{i,j=1}^n |(he_i\mid e_j)|^2 \frac{\lambda _i-\lambda_j}{\lambda^p_i-\lambda^p_j} = \operatorname{Tr} h g(L_x, R_x)h, $$

where \(L_x\) and \(R_x\) are left and right multiplication with \(x\) and

$$g(t,s)=\frac{t-s}{t^p-s^p}\quad t,s>0. $$

The operators \(L_x\) and \(R_x\) are positive definite commuting operators on the Hilbert space \(\mathcal{H}=M_{n} \) equipped with the inner product \((A,B)=\operatorname{Tr} B^{*} A\). The last expression \(\operatorname{Tr} h \mathit{df}(x)^{-1}h=\operatorname{Tr} h g(L_{x}, R_{x})h \) is independent of any particular basis, and since \(g\) is operator concave by Theorem 3.1, we obtain from [4, Theorem 1.1] that the map \(x\to\operatorname{Tr} h \mathit{df}(x)^{-1}h \) is concave. □
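The Hadamard-product formula in the proof translates directly into a computation. The following sketch, assuming NumPy and real symmetric \(x\) and \(h\), evaluates \(\operatorname{Tr} h\,\mathit{df}(x)^{-1}h\) for \(f(t)=t^p\) in an eigenbasis of \(x\). The function names and the treatment of nearly equal eigenvalues through `np.isclose` are assumptions of this illustration.

```python
import numpy as np

def inv_frechet_form(x, h, p):
    """Tr h df(x)^{-1} h for f(t) = t^p, computed in an eigenbasis of x."""
    lam, v = np.linalg.eigh(x)
    hh = v.T @ h @ v                              # h expressed in the eigenbasis of x
    li, lj = np.meshgrid(lam, lam, indexing="ij")
    close = np.isclose(li, lj)
    num = li - lj
    den = np.where(close, 1.0, li**p - lj**p)     # dummy value avoids 0/0 on the diagonal
    # entries (l_i - l_j)/(l_i^p - l_j^p), with the limit l^(1-p)/p for equal eigenvalues
    g = np.where(close, li ** (1.0 - p) / p, num / den)
    return float(np.sum(hh * hh * g))

def random_pd(n, rng):
    """Random real positive definite n x n matrix (illustrative choice)."""
    g = rng.standard_normal((n, n))
    return g @ g.T + 0.5 * np.eye(n)
```

For \(p=1\) the differential is the identity, so the form reduces to \(\operatorname{Tr} h^2\); for \(0<p<1\) the value at a midpoint of two positive definite matrices should be at least the average of the values at the endpoints.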

Theorem 4.2

Consider the function \(f(t)=t^p\) for \(0<p\le 1\). The map of two variables,

$$(x,h)\to\operatorname{Tr} h \mathit{d f}(x)h\quad x>0, h^*=h, $$

is convex.

Proof

Keeping the notation as in the proof of Theorem 1.1 we define two quadratic forms α and β on \(\mathcal{H}\oplus\mathcal{H} \) by setting

$$\begin{aligned} \alpha(X\oplus Y) =&\lambda\operatorname{Tr} X \mathit{df}(A_1)X +(1-\lambda) \operatorname{Tr} Y \mathit{df}(A_2)Y \\ \beta(X\oplus Y) =&\operatorname{Tr}\bigl(\lambda X+(1-\lambda)Y\bigr) \mathit{d f}(A) \bigl(\lambda X+(1-\lambda)Y\bigr), \end{aligned}$$

where \(A_1,A_2\) are positive definite matrices, and \(A=\lambda A_1+(1-\lambda)A_2\) for some \(\lambda\in[0,1]\). The statement of the theorem is equivalent to the majorisation

$$ \beta(X\oplus Y)\le\alpha(X\oplus Y) $$
(3)

for arbitrary self-adjoint \(X,Y\in M_n\). The quadratic form \(h\to \operatorname{Tr} h \mathit{d f}(x)h \) is positive definite since

$$\operatorname{Tr} h \mathit{d f}(x)h=\sum_{i,j=1}^n |(h e_i\mid e_j)|^2\frac{\lambda ^p_i-\lambda^p_j}{\lambda_i-\lambda_j} , $$

where \((e_{i})_{i=1}^{n} \) is a basis in which \(x\) is diagonal and \(\lambda_1,\dots,\lambda_n\) are the corresponding eigenvalues counted with multiplicity. We also notice that the corresponding sesqui-linear form is given by

$$\bigl(h,h' \bigr)\to\operatorname{Tr} h' \mathit{d f}(x)h. $$

The two quadratic forms α and β are in particular positive definite. Therefore, there exists an operator Γ on \(\mathcal{H}\oplus\mathcal{H} \) which is positive definite in the Hilbert space structure given by β such that

$$\alpha \bigl(X\oplus Y, X'\oplus Y' \bigr)=\beta \bigl(\varGamma(X\oplus Y), X'\oplus Y' \bigr)\quad X,X',Y,Y'\in M_n , $$

where we retain the notation \(\alpha\) and \(\beta\) also for the corresponding sesqui-linear forms. Suppose \(\gamma\) is an eigenvalue of \(\varGamma\) corresponding to an eigenvector \(X\oplus Y\). Then

$$\alpha \bigl(X\oplus Y,X'\oplus Y' \bigr)=\beta \bigl( \gamma(X\oplus Y), X'\oplus Y' \bigr)\quad\text{for } X',Y'\in M_n $$

or equivalently

$$\begin{aligned} &\lambda\operatorname{Tr} X' \mathit{d f}(A_1)X+(1-\lambda) \operatorname{Tr} Y' \mathit{d f}(A_2)Y \\ &\quad =\gamma\operatorname{Tr}\bigl(\lambda X'+(1- \lambda)Y'\bigr) \mathit{d f}(A) \bigl(\lambda X+(1-\lambda)Y\bigr) \end{aligned}$$

for arbitrary \(X',Y'\in M_n\). From this we may derive the identities

$$\mathit{d f}(A_1)X=\gamma \mathit{d f}(A) \bigl(\lambda X+(1-\lambda)Y \bigr)=\mathit{d f}(A_2)Y $$

and thus by setting \(M=\mathit{d f}(A)(\lambda X+(1-\lambda)Y)\), we obtain

$$\begin{aligned} \mathit{d f}(A)^{-1}(M) =&\lambda X+(1-\lambda) Y \\ =&\lambda \mathit{d f}(A_1)^{-1}(\gamma M)+(1-\lambda) \mathit{d f}(A_2)^{-1}(\gamma M). \end{aligned}$$

By multiplying from the left with \(M^*\) and taking the trace we obtain

$$\begin{aligned} &\gamma \bigl(\lambda\operatorname{Tr} M^* \mathit{d f}(A_1)^{-1} M + (1-\lambda) \operatorname{Tr} M^* \mathit{d f}(A_2)^{-1}M \bigr) \\ &\quad= \operatorname{Tr} M^* \mathit{df}(A)^{-1} M \ge \lambda\operatorname{Tr} M^* \mathit{df}(A_1)^{-1} M +(1- \lambda)\operatorname{Tr} M^* \mathit{df}(A_2)^{-1} M, \end{aligned}$$

where the last inequality is implied by the concavity result in Theorem 4.1. This shows that \(\gamma\ge 1\) for every eigenvalue of the positive definite operator \(\varGamma\), hence \(\varGamma\ge 1\), from which (3) and the statement of the theorem follow. □

Since the dependence on the function \(f\) in \(\operatorname{Tr} h \mathit{df}(x)h \) is linear we immediately obtain:

Corollary 4.3

Let \(f\) be a function written in the form

$$f(t)=\int_0^1 t^p d\mu(p)\quad t>0, $$

where μ is a positive measure on the unit interval. Then the map of two variables,

$$(x,h)\to\operatorname{Tr} h \mathit{d f}(x)h\quad x>0, h^*=h, $$

is convex.

If we in the corollary above choose μ as the Lebesgue measure, we realise that

$$f(t)=\frac{t-1}{\log t}\quad t>0 $$

is an example of a function such that \((x,h)\to\operatorname{Tr} h \mathit{d f}(x)h \) is convex. Moreover, the perspective g of f given by

$$g(t,s)=s f \bigl(ts^{-1} \bigr)=s\frac{ts^{-1}-1}{\log(ts^{-1})}=\frac{t-s}{\log t-\log s} \quad t,s>0 $$

is operator concave. Since

$$\operatorname{Tr} h d{\log}(x)^{-1} h=\operatorname{Tr} h g(L_x,R_x)h $$

this observation directly shows that the function \(x\to\operatorname{Tr} h d{\log} (x)^{-1}h \) is concave, cf. [9, Eq. (3.4)].

5 More Trace Functions

Lemma 5.1

Let K be a contraction. Then

$$\psi(A)=q \bigl(A^{q-1}-K \bigl(K^*AK \bigr)^{q-1}K^* \bigr) \ge0 $$

for −1≤q≤1.

Proof

By continuity we may assume K invertible. For 0≤q≤1 we use the inequality

$$K^* A^{1-q} K\le \bigl(K^*AK \bigr)^{(1-q)}, $$

or by inversion

$$K^{-1}A^{-(1-q)} \bigl(K^* \bigr)^{-1}= \bigl(K^* A^{1-q} K \bigr)^{-1}\ge \bigl(K^*AK \bigr)^{-(1-q)} $$

which implies the inequality

$$A^{q-1}-K \bigl(K^*AK \bigr)^{q-1}K^*\ge0. $$

For −1≤q≤0 we apply Jensen’s sub-homogeneous operator inequality

$$K^* A^{1-q} K\ge \bigl(K^*AK \bigr)^{1-q}, $$

or by inversion

$$K^{-1}A^{-(1-q)} \bigl(K^* \bigr)^{-1}= \bigl(K^* A^{1-q} K \bigr)^{-1}\le \bigl(K^*AK \bigr)^{q-1}. $$

This inequality finally implies

$$A^{q-1}\le K \bigl(K^*AK \bigr)^{q-1}K^* $$

and the proof is complete. □
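Lemma 5.1 can be probed numerically by computing the smallest eigenvalue of \(\psi(A)\) for a random contraction \(K\). The sketch below assumes NumPy and real matrices. The helper names, the sampling of \(K\) with singular values in \([0.3,0.95]\) (which keeps \(K\) invertible and well conditioned) and the choice of \(A\) are assumptions made for illustration.

```python
import numpy as np

def mpow(a, r):
    """Real power of a symmetric positive definite matrix."""
    w, v = np.linalg.eigh(a)
    return (v * w**r) @ v.T

def random_contraction(n, rng, smin=0.3, smax=0.95):
    """Random invertible K with singular values in [smin, smax], so that ||K|| <= 1."""
    u, _, vt = np.linalg.svd(rng.standard_normal((n, n)))
    return (u * rng.uniform(smin, smax, n)) @ vt

def psi(a, k, q):
    """psi(A) = q (A^(q-1) - K (K*AK)^(q-1) K*)."""
    b = k.T @ a @ k
    return q * (mpow(a, q - 1.0) - k @ mpow(b, q - 1.0) @ k.T)

def min_eig_psi(q, n=4, seed=0):
    """Smallest eigenvalue of psi(A) for one random positive definite A and contraction K."""
    rng = np.random.default_rng(seed)
    g = rng.standard_normal((n, n))
    a = g @ g.T / n + 0.5 * np.eye(n)
    k = random_contraction(n, rng)
    m = psi(a, k, q)
    return float(np.linalg.eigvalsh(0.5 * (m + m.T)).min())
```

For exponents in \([-1,1]\), `min_eig_psi(q)` is expected to be non-negative up to rounding error, in agreement with the lemma.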

Corollary 5.2

Let K be a contraction. The mapping

$$\varphi(A)=\operatorname{Tr} \bigl(K^*AK \bigr)^q-\operatorname{Tr} A^q\quad A>0 $$

is decreasing for −1≤q≤1.

Proof

The Fréchet differential of φ(A) is given by

$$d \varphi(A)D=-q\operatorname{Tr} \bigl(A^{q-1}- K \bigl(K^*AK \bigr)^{q-1}K^* \bigr)D=-\operatorname{Tr}\psi(A) D, $$

thus d φ(A)D≤0 for arbitrary D≥0 by the preceding lemma. □