1 Introduction

Information geometry on the quantum state space is of great importance, both for a geometrical understanding of quantum theory and as a subject of pure mathematical interest [1,2,3,4,5,6,7,8]. Indeed, the success of geometrical descriptions of quantum theory in various problems should be stressed [9,10,11,12,13,14,15,16]. The geometrical structure in the quantum case is much more complex than that of information geometry for the space of probability distributions. The origin of the rich geometrical structure in the quantum case is normally attributed to the non-commutativity of quantum operators. For example, there is no quantum version of the unique ‘natural’ metric for the probability space (cf. Cencov’s theorem [17]). Instead, we have a family of natural metrics, which are in one-to-one correspondence with operator monotone functions. This seminal work by Petz [18] is regarded, to some extent, as a quantum version of Cencov’s theorem. Yet, many questions remain to be answered.

On the one hand, Petz’s result does give the foundation for a possible family of natural metrics on the quantum state space [18,19,20,21,22,23,24]. On the other hand, it relies on assumptions that are not strictly necessary for characterizing possible metrics. A natural question is then to ask what happens if we relax some of the conditions in Petz’s theorem. There are several results along this line of extending Petz’s result [25,26,27,28]. Among them, a recent work by Takahashi and Fujiwara [27] exhibited a family of non-monotone metrics based on the sandwiched Rényi divergence. They derived a necessary and sufficient condition for the metric to be monotone under completely positive and trace-preserving (CP-TP) maps. While this is an interesting result on its own, the usefulness of non-monotone metrics from the statistical point of view has not yet been established.

It is known that there are two different approaches to defining a metric on the quantum state space. The first is to start with a properly defined divergence function (a contrast function) [1, 29, 30]. The other is based on a quantum covariance matrix. (See Refs. [1, 3].) In this paper, we focus on the latter approach [31], since it is in harmony with the problem of parameter estimation for quantum states [1,2,3,4, 10]. We shall show how to define a non-monotone metric and discuss its relation to the parameter estimation problem.

The primary purpose of this paper is to introduce a new class of Riemannian metrics on the quantum state space. Our method is based on taking a convex mixture of two natural inner products on the state space. This leads us to define a two-parameter family of metrics, which are not necessarily monotone under a CP-TP map. We explicitly derive a necessary and sufficient condition for this two-parameter family of metrics to be monotone. To the best of our knowledge, these newly proposed metrics have not been well studied in the literature.

As an application of these new metrics, we analyze the problem of characterizing quantum statistical models based on properties of the tangent space. In Refs. [32, 33], we could only consider special cases, whereas the present paper generalizes the previous results. A key observation is that there exists a family of inner products based on the quantum covariance matrix. Each inner product then defines a different representation of the tangent vectors via the e- and m-duality.

The outline of this paper is as follows: In Sect. 2, we provide the necessary background to discuss a monotone metric on the quantum state space. Several properties of the commutation operator are also proven. We then define a family of inner products and the quantum Fisher metric based on the quantum covariance. In Sect. 3, we introduce a two-parameter family of quantum Fisher metrics, called the \((\lambda ,\lambda ')\)-Fisher metric. We prove a necessary and sufficient condition for it to be monotone under a CP-TP map (Theorem 3). We also prove several properties of our new quantum Fisher metric. In Sect. 4, we discuss two applications of the proposed quantum Fisher metric. The first is the problem of model characterization. The second is a Holevo-type bound. From these results, we see that the proposed quantum Fisher metric is intimately related to the quantum parameter estimation problem. The last section provides concluding remarks.

2 Preliminaries and basic lemmas

2.1 Quantum state space

Let \(\mathcal{H}\) be a finite-dimensional Hilbert space, and denote the set of all (bounded) linear operators on \(\mathcal{H}\) by \(\mathcal{L}(\mathcal{H})\). Denote by \(\mathcal{L}_h(\mathcal{H})\) and \(\mathcal{L}_+(\mathcal{H})\) the sets of all Hermitian operators and positive-definite operators, respectively. As for notation, \(\mathrm {Re}\,X=(X+X^*)/2\) and \(\mathrm {Im}\,X=(X-X^*)/2\mathrm {i}\) denote the real and imaginary parts of the operator \(X\in \mathcal{L}(\mathcal{H})\). In this paper, we follow the physicist’s convention: \(X^*\) is the complex conjugate of X, whereas \(X^\dagger \) is the Hermitian conjugate of X. By definition, for Hermitian X, \(\mathrm {Re}\,X\) is real symmetric and \(\mathrm {Im}\,X\) is real skew-symmetric. \(|X|:=\sqrt{X^\dagger X}\) defines the absolute value operator of \(X\in \mathcal{L}(\mathcal{H})\). \(\langle X,Y\rangle _{\mathrm {HS}}:=\mathrm {tr}\left( X^\dagger Y\right) \) denotes the Hilbert–Schmidt inner product between \(X,Y\in \mathcal{L}(\mathcal{H})\).

A quantum state \(\rho \) is a positive semi-definite operator with unit trace, and the totality of quantum states is denoted by

$$\begin{aligned} \overline{\mathcal{S}}(\mathcal{H}):= \{\rho \in \mathcal{L}_h(\mathcal{H})\,|\, \rho \ge 0,\,\mathrm {tr}\left( \rho \right) =1\}. \end{aligned}$$

Throughout the paper, we only consider full-rank states. The set of all strictly positive states \(\rho >0\) is denoted by \(\mathcal{S}(\mathcal{H})\):

$$\begin{aligned} \mathcal{S}(\mathcal{H}):= \{\rho \in \mathcal{L}_h(\mathcal{H})\,|\, \rho >0,\,\mathrm {tr}\left( \rho \right) =1\}. \end{aligned}$$
(1)

Clearly, \(\mathcal{S}(\mathcal{H})\) is an open subset of \(\overline{\mathcal{S}}(\mathcal{H})\). Our primary interest is to study the geometrical properties of \(\mathcal{S}(\mathcal{H})\). When the Hilbert space is fixed, we simply denote \(\mathcal{S}=\mathcal{S}(\mathcal{H})\).

A map \(\mathcal{E}\) from the state space to itself is said to be trace preserving if \(\mathrm {tr}\left( \mathcal{E}(\rho )\right) =1\) holds for all \(\rho \in \mathcal{S}(\mathcal{H})\). \(\mathcal{E}\) is said to be positive if \(\mathcal{E}(\rho )>0\) holds for all \(\rho \in \mathcal{S}(\mathcal{H})\). Lastly, \(\mathcal{E}\) is said to be completely positive when the extended map \(\mathcal{E}\otimes \mathrm {id}\) from \(\mathcal{S}(\mathcal{H}\otimes \mathcal{K})\) to itself is positive for an arbitrary ancillary Hilbert space \(\mathcal{K}\). (\(\mathrm {id}\) denotes the identity map on \(\mathcal{K}\).) In this paper, we only consider completely positive and trace-preserving (CP-TP) maps.

Let \(d=\dim \mathcal{H}\) be the dimension of the Hilbert space; then the dimension of the quantum state space is \(D=d^2-1\) due to the normalization condition \(\mathrm {tr}\left( \rho \right) =1\). By introducing an appropriate set of coordinate systems on \(\mathcal{S}(\mathcal{H})\), we can regard the quantum state space as a differential manifold. One such coordinate system can be constructed from the generalized Bloch vector representation as follows. Let \(\{\sigma _i\}_{i\in \mathcal{I}}\) be the set of generalized Pauli matrices (Gell-Mann matrices) such that \(\sigma _i^\dagger =\sigma _i\) and \(\mathrm {tr}\left( \sigma _i\sigma _j\right) =2\delta _{ij}\). Here, \(\mathcal{I}:=\{1,2,\ldots ,D\}\) denotes the index set for the generalized Pauli matrices. Consider the map \(\psi :\,\rho \mapsto \theta =(\theta ^i)\) with \(\theta ^i:=\mathrm {tr}\left( \rho \sigma _i\right) \). Then, we can show that the map \(\theta \mapsto \rho _\theta =\frac{I}{d}+\frac{1}{2} \sum _{i=1}^{D}\theta ^i\sigma _i\in \mathcal{S}(\mathcal{H})\) is bijective [34,35,36], where I denotes the identity matrix on \(\mathcal{H}\). Note here that \(\theta =(\theta ^i)\) is a D-dimensional real vector on \(\varTheta =\psi (\mathcal{S})\).
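The Bloch-vector parametrization above is easy to verify numerically. The following Python sketch (assuming NumPy; all function names are illustrative, not part of the formal development) checks, for the qubit case \(d=2\) where the generalized Pauli matrices reduce to the ordinary Pauli matrices, that the map \(\theta \mapsto \rho _\theta \) produces a unit-trace, strictly positive state and that \(\psi \) inverts it.

```python
import numpy as np

# Minimal sketch for d = 2, where the generalized Pauli matrices are the
# ordinary Pauli matrices (D = d^2 - 1 = 3).
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),     # sigma_1
    np.array([[0, -1j], [1j, 0]], dtype=complex),  # sigma_2
    np.array([[1, 0], [0, -1]], dtype=complex),    # sigma_3
]

def state_from_bloch(theta):
    """rho_theta = I/d + (1/2) sum_i theta^i sigma_i."""
    d = 2
    rho = np.eye(d, dtype=complex) / d
    for t, s in zip(theta, sigma):
        rho += 0.5 * t * s
    return rho

def bloch_from_state(rho):
    """theta^i = tr(rho sigma_i): the inverse map psi."""
    return np.array([np.trace(rho @ s).real for s in sigma])

theta = np.array([0.3, -0.2, 0.4])      # inside the Bloch ball, so rho > 0
rho = state_from_bloch(theta)
assert np.isclose(np.trace(rho).real, 1.0)        # unit trace
assert np.min(np.linalg.eigvalsh(rho)) > 0        # strictly positive state
assert np.allclose(bloch_from_state(rho), theta)  # psi inverts the map
```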

Having introduced the differential geometrical setting for the quantum state space \(\mathcal{S}=\mathcal{S}(\mathcal{H})\), we can define the tangent space \(T_\rho (\mathcal{S})\) and vector fields on it. We denote the set of all vector fields on \(\mathcal{S}\) by \(\mathcal{X}(\mathcal{S})\). A Riemannian metric is a smooth family of inner products on \(T_\rho (\mathcal{S})\times T_\rho (\mathcal{S})\), which is denoted by \(g_\rho (\cdot ,\cdot )\) or \(\langle \cdot ,\cdot \rangle _\rho \).

2.2 Monotone metric by Petz

We first define a monotone metric on the quantum state space [3, 18,19,20].

Definition 1

A metric on \(\mathcal{S}\) is said to be a monotone metric when the following inequality holds for all CP-TP maps \(\mathcal{E}\):

$$\begin{aligned} g_\rho (X,Y) \ge g_{\mathcal{E}(\rho )}(\mathcal{E}_*(X),\mathcal{E}_*(Y))\quad \forall \rho \in \mathcal{S},\forall X,Y\in \mathcal{X}(\mathcal{S}), \end{aligned}$$
(2)

where \(\mathcal{E}_*:=d\mathcal{E}\) is the differential of the CP-TP map \(\mathcal{E}\).

The differential \(d\mathcal{E}\) of the CP-TP map is a map from \(T_\rho (\mathcal{S})\) to \(T_{\mathcal{E}(\rho )}(\mathcal{S})\) defined by \((d\mathcal{E})_\rho (X)=X(\mathcal{E}(\rho ))\) for \(X\in T_\rho (\mathcal{S})\). The operational meaning of this definition is simple: one should not be able to increase the amount of statistical ‘information’ by applying any physical operation \(\mathcal{E}\) to \(\rho \).

We note that the corresponding classical requirement characterizing a natural metric on the probability space is much weaker than this. The celebrated Cencov theorem only demands the invariance of a metric under arbitrary Markov maps rather than monotonicity under all possible classical channels. See the book [1] for a detailed discussion. One open question in the quantum case is whether we can relax the above monotonicity condition when characterizing a natural metric.

When restricted to monotone metrics on the quantum state space, Petz gave an equivalent formulation of the monotone metric based on the concepts of operator monotone functions and the modular operator. A real-valued function with domain \(\mathcal{T}\), \(f:\mathcal{T}\rightarrow {\mathbb R}\), is said to be operator monotone when \(A\ge B\) implies \(f(A)\ge f(B)\) for all \(A,B\in \mathcal{L}(\mathcal{H})\) whose spectra lie in \(\mathcal{T}\). The modular operator at \(\rho \in \mathcal{S}(\mathcal{H})\) is defined by the map \(\varDelta _\rho : A\mapsto \rho A \rho ^{-1}\). Petz’s characterization of a monotone metric is given by the next theorem [18].

Theorem 1

(Petz) For every monotone metric on \(\mathcal{S}\), there uniquely exists an operator monotone function f such that the following relation holds:

$$\begin{aligned} g_\rho (X,Y)= \langle X^{{(\mathrm {m})}},Y_f^{{(\mathrm {e})}}\rangle _{\mathrm {HS}}, \end{aligned}$$
(3)

where \(X^{(\mathrm {m})}:= X\rho \) is the mixture representation (m-rep.) of \(X\in T_\rho (\mathcal{S})\), and \(Y_f^{(\mathrm {e})}:= f(\varDelta _\rho )^{-1} [(Y\rho )\rho ^{-1}]\) is the exponential representation (e-rep.) of \(Y\in T_\rho (\mathcal{S})\).

The operator monotone function \(f:{\mathbb R}_+\rightarrow {\mathbb R}\) in this theorem is referred to as the Petz function (\({\mathbb R}_+:=\{x\,|\,x>0\}\)). The monotone metric defined by f is denoted by \(g^f\). Usually, we adopt the convention \(f(1)=1\) so that the metric reduces to the classical Fisher metric for an arbitrary classical model. Four familiar examples of operator monotone functions are

$$\begin{aligned} f_{\mathrm {SLD}}(t)=\frac{1+t}{2},\ f_{\mathrm {RLD}}(t)=t,\ f_{\mathrm {LLD}}(t)=1,\ f_\mathrm{{BKM}}(t)=\frac{t-1}{\log t} . \end{aligned}$$
(4)

\(f_{\mathrm {SLD}}\), \(f_{\mathrm {RLD}}\), \(f_{\mathrm {LLD}}\), and \(f_\mathrm{{BKM}}\) correspond to the symmetric logarithmic derivative (SLD) Fisher metric, the right logarithmic derivative (RLD) Fisher metric, the left logarithmic derivative (LLD) Fisher metric, and the Bogoliubov–Kubo–Mori metric, respectively.
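As a quick numerical sanity check of Eq. (4) (a sketch assuming NumPy, not part of the formal development), one can verify the normalization \(f(1)=1\) for all four functions, together with the conjugation relations discussed below: \(f_{\mathrm {SLD}}\) and \(f_\mathrm{{BKM}}\) are symmetric, while the conjugate of \(f_{\mathrm {RLD}}\) is \(f_{\mathrm {LLD}}\).

```python
import numpy as np

# The four Petz functions of Eq. (4); f_BKM has a removable singularity
# at t = 1, handled explicitly.
def f_SLD(t): return (1 + t) / 2
def f_RLD(t): return float(t)
def f_LLD(t): return 1.0
def f_BKM(t): return 1.0 if np.isclose(t, 1.0) else (t - 1) / np.log(t)

def conjugate(f):
    """f*(t) = t f(1/t), the transpose of a Petz function."""
    return lambda t: t * f(1.0 / t)

ts = np.linspace(0.1, 5.0, 200)
for f in (f_SLD, f_RLD, f_LLD, f_BKM):
    assert np.isclose(f(1.0), 1.0)                 # normalization f(1) = 1
# SLD and BKM are symmetric; the conjugate of the RLD is the LLD.
assert all(np.isclose(conjugate(f_SLD)(t), f_SLD(t)) for t in ts)
assert all(np.isclose(conjugate(f_BKM)(t), f_BKM(t)) for t in ts)
assert all(np.isclose(conjugate(f_RLD)(t), f_LLD(t)) for t in ts)
```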

Petz further showed a necessary and sufficient condition for a metric g to be real. The condition is that the conjugate \(f^*\) (or transpose) of f is equal to f itself:

$$\begin{aligned} f^*(t):=tf\Big (\frac{1}{t}\Big )=f(t) \quad \forall t\in {\mathbb R}_+. \end{aligned}$$
(5)

When this is satisfied, f is said to be symmetric. For a general operator monotone function f, the monotone metric is complex-valued. For example, \(f_{\mathrm {SLD}}\) is symmetric, but \(f_{\mathrm {RLD}}\) is not. In fact, the conjugate of the RLD is the LLD. However, it is always possible to symmetrize f by the arithmetic mean:

$$\begin{aligned} f\mapsto \frac{1}{2}(f+f^*), \end{aligned}$$
(6)

where the addition of two functions is defined by \((f_1+f_2)(t)=f_1(t)+f_2(t)\). By this symmetrization, the symmetrized RLD Fisher metric is given by

$$\begin{aligned} \frac{1}{2} (f_{\mathrm {RLD}}+f_{\mathrm {RLD}}^*) =\frac{1}{2} (f_{\mathrm {RLD}}+f_{\mathrm {LLD}})=f_{\mathrm {SLD}}. \end{aligned}$$
(7)

Alternatively, we can first consider a complex metric g and then take its real part. The corresponding e-representation is found through the e- and m-duality [Eq. (3)]. It is straightforward to show that the e-representation is given by

$$\begin{aligned} \partial _i^{(\mathrm {e})}= \frac{1}{2} \left( L_i+ L_i^\dagger \right) . \end{aligned}$$
(8)

This corresponds to the Petz function obtained by the harmonic mean:

$$\begin{aligned} (f^\mathrm {Re})^{-1} = \frac{1}{2}\left( f^{-1}+{f^*}^{-1} \right) , \end{aligned}$$
(9)

where the inverse of a function f is defined by \(f^{-1}(t)=[f(t)]^{-1}\). As an example, consider the RLD Fisher metric. Its real part corresponds to the Petz function:

$$\begin{aligned} f^\mathrm {Re}_\mathrm {RLD}(t)= \frac{2t}{1+t}. \end{aligned}$$
(10)

Because we are concerned with positive-definite operators, we only need to consider operator monotone functions whose domain is the positive reals \({\mathbb R}_+\) in order to characterize possible metrics on \(\mathcal{S}(\mathcal{H})\). Let \(\mathcal{F}_{\mathrm {MON}}\) be the set of all symmetric and normalized operator monotone functions from \({\mathbb R}_+\) to \({\mathbb R}\). We define an ordering between two functions by the relation \(f_1\ge f_2\) \({\mathop {\Leftrightarrow }\limits ^\mathrm{def}}\) \(f_1(t)\ge f_2(t)\) for all \(t\in {\mathbb R}_+\). Then, Petz proved that the following minimum and maximum elements exist in the set \(\mathcal{F}_{\mathrm {MON}}\):

$$\begin{aligned}&\displaystyle f_{\min }\le f\le f_{\max }\quad (\forall f\in \mathcal{F}_{\mathrm {MON}}), \end{aligned}$$
(11)
$$\begin{aligned}&\displaystyle f_{\min }(t)=\frac{2t}{1+t}=f^\mathrm {Re}_\mathrm {RLD}(t), \quad f_{\max }(t)=\frac{1+t}{2}=f_\mathrm {SLD}(t). \end{aligned}$$
(12)

That is, the Petz functions of the SLD and the real RLD Fisher metrics are the maximum and minimum elements of the set \(\mathcal{F}_{\mathrm {MON}}\), respectively. Thus, Petz completely characterized all possible real monotone metrics on the quantum state space \(\mathcal{S}=\mathcal{S}(\mathcal{H})\).
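The ordering (11) can be illustrated numerically. The sketch below (assuming NumPy; a numerical illustration, not a proof) checks that \(f_\mathrm{{BKM}}\), taken as a representative element of \(\mathcal{F}_{\mathrm {MON}}\), indeed lies between \(f_{\min }\) and \(f_{\max }\) on a grid of points:

```python
import numpy as np

# Eq. (11): every symmetric normalized operator monotone function is
# squeezed between f_min = 2t/(1+t) and f_max = (1+t)/2.  We test the
# BKM function (the logarithmic mean of 1 and t) on a grid.
f_min = lambda t: 2 * t / (1 + t)
f_max = lambda t: (1 + t) / 2
f_bkm = lambda t: (t - 1) / np.log(t)   # valid for t != 1

ts = np.linspace(0.05, 20.0, 1000)
ts = ts[~np.isclose(ts, 1.0)]           # avoid the removable singularity
assert np.all(f_min(ts) <= f_bkm(ts) + 1e-12)
assert np.all(f_bkm(ts) <= f_max(ts) + 1e-12)
```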

There are several extensions of Petz’s results. For example, the notion of a monotone metric on the quantum state manifold was extended to the set of all positive-definite matrices without the unit-trace constraint [26]. Its classical counterpart is known as the denormalization of a statistical model (see Ref. [1]). As another extension, Yamagata recently relaxed the conditions of Petz’s theorem [25]. In particular, a metric was characterized by requiring monotonicity only under completely positive, trace non-increasing maps and additive noise. Lastly, Takahashi and Fujiwara [27] studied a family of non-monotone metrics based on the sandwiched Rényi divergence.

2.3 Commutation operator and modular operator

For a given quantum state \(\rho >0\), Holevo [10, 37] defined a super-operator \(\mathcal{D}_\rho \) from \(\mathcal{L}(\mathcal{H})\) to itself, whose action is determined by the operator equation:

$$\begin{aligned}{}[\rho ,\,X]:=\rho X-X\rho =\mathrm {i}\rho \mathcal{D}_\rho (X)+\mathrm {i}\mathcal{D}_\rho (X)\rho . \end{aligned}$$
(13)

The super-operator \(\mathcal{D}_{\rho }\) is called the commutation operator at \(\rho \). We can show that \(\mathcal{D}_{\rho }\) is a linear map and that the action \(\mathcal{D}_\rho (X)\) is uniquely defined. A useful fact here is that \(X A+A X=0\) for \(A\in \mathcal{L}_+(\mathcal{H})\) implies \(X=0\). From the definition, we can show that for \(X,Y\in \mathcal{L}(\mathcal{H})\), the following relation holds:

$$\begin{aligned} \mathcal{D}_\rho (XY)=\mathcal{D}_\rho (X) Y+ X\mathcal{D}_\rho (Y)+\mathcal{D}_\rho \big ( \mathcal{D}_\rho (X)\mathcal{D}_\rho (Y) \big ). \end{aligned}$$
(14)

Recursive applications of this relation imply

$$\begin{aligned} \mathcal{D}_\rho (XY)=\sum _{k=1}^n \left( \mathcal{D}_\rho ^{k-1}(X)\mathcal{D}_\rho ^{k}(Y) + \mathcal{D}_\rho ^{k}(X)\mathcal{D}_\rho ^{k-1}(Y) \right) +\mathcal{D}_\rho \big ( \mathcal{D}_\rho ^n(X)\mathcal{D}_\rho ^n(Y) \big ), \end{aligned}$$
(15)

\(\forall n\in {\mathbb N}\). When X commutes with \(\rho \), i.e., \([\rho ,\,X]=0\), we have \(\mathcal{D}_\rho (X)=0\). This, together with the relation (14), proves the next lemma.

Lemma 1

For \(X\in \mathcal{L}(\mathcal{H})\) commuting with \(\rho \), i.e., \([\rho ,X]=0\), we have

$$\begin{aligned} \mathcal{D}_\rho (XY)=X\mathcal{D}_\rho (Y) \text{ and } \mathcal{D}_\rho (YX)=\mathcal{D}_\rho (Y) X\quad \forall Y\in \mathcal{L}(\mathcal{H}). \end{aligned}$$
(16)

We next state its relation to the modular operator \(\varDelta _\rho \). (See, for example, Ref. [38].) Define the function on \({\mathbb R}_+\) by

$$\begin{aligned} f_{\mathcal{D}}(t)=\frac{1-t}{1+t}. \end{aligned}$$
(17)

Then, the commutation operator is also expressed as follows:

Lemma 2

$$\begin{aligned} \mathcal{D}_\rho = \mathrm {i}f_{\mathcal{D}}(\varDelta _\rho ) \end{aligned}$$
(18)

holds. The inverse relation is

$$\begin{aligned} \varDelta _\rho = f_{\mathcal{D}}^{-1}(\mathrm {i}\mathcal{D}_\rho ), \end{aligned}$$
(19)

where \(f_{\mathcal{D}}^{-1}(t)=\frac{1+t}{1-t}=f_{\mathcal{D}}(-t)\).

Proof

We rewrite the definition of \(\mathcal{D}_\rho \) [Eq. (13)] as follows:

$$\begin{aligned} \rho X-X\rho =\mathrm {i}\rho \mathcal{D}_\rho (X)+\mathrm {i}\mathcal{D}_\rho (X)\rho \ \Leftrightarrow \ \rho X\rho ^{-1}-X=\mathrm {i}\rho \mathcal{D}_\rho (X)\rho ^{-1} +\mathrm {i}\mathcal{D}_\rho (X) \end{aligned}$$
(20)
$$\begin{aligned} \Leftrightarrow \ \varDelta _\rho (X)-X = \mathrm {i}[\varDelta _\rho (\mathcal{D}_\rho (X))+ \mathcal{D}_\rho (X) ] \end{aligned}$$
(21)
$$\begin{aligned} \Leftrightarrow \ \mathrm {i}(1-\varDelta _\rho )(X) = [(1+\varDelta _\rho )\circ \mathcal{D}_\rho ] (X) \end{aligned}$$
(22)
$$\begin{aligned} \Leftrightarrow \ \mathrm {i}(1+\varDelta _\rho )^{-1} (1-\varDelta _\rho )(X) = \mathcal{D}_\rho (X). \end{aligned}$$
(23)

This proves the first statement. The inverse relationship follows similarly. \(\square \)

As a corollary, the action of \(\varDelta _\rho \) commutes with the commutation operator.

Corollary 1

$$\begin{aligned} \varDelta _\rho \circ \mathcal{D}_\rho =\mathcal{D}_\rho \circ \varDelta _\rho , \end{aligned}$$
(24)

holds. In other words, \(\varDelta _\rho ( \mathcal{D}_\rho (X))= \mathcal{D}_\rho ( \varDelta _\rho (X))\) for all \(X\in \mathcal{L}(\mathcal{H})\).
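In the eigenbasis of \(\rho \) (eigenvalues \(p_j\)), the modular operator acts componentwise as \((\varDelta _\rho X)_{jk}=(p_j/p_k)X_{jk}\), so Lemma 2 gives \((\mathcal{D}_\rho X)_{jk}=\mathrm {i}\,\frac{p_k-p_j}{p_k+p_j}X_{jk}\). The following Python sketch (assuming NumPy; a numerical illustration, not a proof) verifies the defining equation (13) and Corollary 1 for a random full-rank qubit state:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_state(d):
    """A random full-rank density matrix."""
    a = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = a @ a.conj().T
    return rho / np.trace(rho).real

def commutation_op(rho, x):
    """D_rho(X) via the spectral formula (D X)_{jk} = i (p_k-p_j)/(p_k+p_j) X_{jk}."""
    p, u = np.linalg.eigh(rho)
    xt = u.conj().T @ x @ u                  # X in the eigenbasis of rho
    dt = 1j * (p[None, :] - p[:, None]) / (p[None, :] + p[:, None]) * xt
    return u @ dt @ u.conj().T

d = 2
rho = random_state(d)
x = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
dx = commutation_op(rho, x)

# Defining equation (13): [rho, X] = i rho D(X) + i D(X) rho
lhs = rho @ x - x @ rho
rhs = 1j * (rho @ dx + dx @ rho)
assert np.allclose(lhs, rhs)

# Corollary 1: Delta_rho and D_rho commute
delta = lambda z: rho @ z @ np.linalg.inv(rho)
assert np.allclose(delta(commutation_op(rho, x)), commutation_op(rho, delta(x)))
```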

2.4 Inner product

Given a state \(\rho \) on \(\mathcal{H}\), Petz and Toth [31] defined a family of inner products on \(\mathcal{L}(\mathcal{H})\) at \(\rho \) by

$$\begin{aligned} \langle X,Y\rangle _\rho ^\nu = \int _{-1}^{1} \nu (d\lambda ) \mathrm {tr}\left( \rho ^{\frac{1-\lambda }{2}} X^\dagger \rho ^{\frac{1+\lambda }{2}} Y \right) \ \forall X,Y\in \mathcal{L}(\mathcal{H}). \end{aligned}$$
(25)

Here, \(\nu \) is an arbitrary probability measure on \([-1,1]\). (The original definition uses a measure on [0, 1].) We call it the \(\nu \)LD inner product at \(\rho \). As a special subfamily of the \(\nu \)LD inner products, we introduce the \(\lambda \)-family of inner products (the \(\lambda \)LD inner product) via the measure \(\frac{1+\lambda }{2} \delta _{1} +\frac{1-\lambda }{2}\delta _{-1}\), with \(\delta _{x}\) the Dirac measure concentrated at x:

$$\begin{aligned} \langle X,Y\rangle _\rho ^\lambda =\frac{1+\lambda }{2} \mathrm {tr}\left( \rho YX^\dagger \right) +\frac{1-\lambda }{2} \mathrm {tr}\left( \rho X^\dagger Y\right) , \end{aligned}$$
(26)

for \(\lambda \in [-1,1]\). This family includes the RLD (\(\lambda =1\)), the SLD (\(\lambda =0\)), and the LLD (\(\lambda =-1\)) inner products. They are denoted by \( \langle \cdot ,\cdot \rangle _\rho ^R\), \( \langle \cdot ,\cdot \rangle _\rho ^S\), and \( \langle \cdot ,\cdot \rangle _\rho ^L\), respectively. In passing, we note that the \(\lambda \)LD inner product was used to regularize the singular behavior of the RLD inner product when studying a pure-state model in Ref. [39].
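A minimal numerical sketch of the \(\lambda \)LD inner product (26) (assuming NumPy; names are illustrative) confirms the RLD and LLD special cases and that \(\langle X,X\rangle _\rho ^\lambda \) is real and positive for every \(\lambda \in [-1,1]\):

```python
import numpy as np

rng = np.random.default_rng(1)

def inner_lambda(rho, x, y, lam):
    """Eq. (26): a convex mixture of the RLD and LLD inner products."""
    xd = x.conj().T
    return ((1 + lam) / 2 * np.trace(rho @ y @ xd)
            + (1 - lam) / 2 * np.trace(rho @ xd @ y))

d = 3
a = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
rho = a @ a.conj().T
rho = rho / np.trace(rho).real          # random full-rank state
x = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
y = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))

# lambda = +1 / -1 reduce to the RLD / LLD inner products
assert np.isclose(inner_lambda(rho, x, y, 1), np.trace(rho @ y @ x.conj().T))
assert np.isclose(inner_lambda(rho, x, y, -1), np.trace(rho @ x.conj().T @ y))
# positivity: <X, X> is real and positive for every lambda in [-1, 1]
for lam in np.linspace(-1, 1, 5):
    assert inner_lambda(rho, x, x, lam).real > 0
    assert np.isclose(inner_lambda(rho, x, x, lam).imag, 0)
```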

2.5 Tangent space

Given an n-parameter family of quantum states, \(\mathcal{M}=\{ \rho _\theta \,|\,\theta \in \varTheta \}\), we regard it as a differential manifold under suitable regularity conditions. Naturally, any model \(\mathcal{M}\) is a submanifold of the set of all quantum states \(\mathcal{S}=\mathcal{S}(\mathcal{H})\). We define the \(\nu \)LD operator for the ith direction, \(L_{i}^\nu (\rho )\), at \(\rho \) (\(i=1,2,\ldots ,n\)) by

$$\begin{aligned} \mathrm {tr}\left( \partial _i\rho X\right) = \langle L_{i}^\nu (\rho ),X\rangle _{\rho }^{\nu },\quad \forall X\in \mathcal{L}(\mathcal{H}), \end{aligned}$$
(27)

with \(\partial _i:= \partial /\partial \theta ^i\) the partial differentiation with respect to \(\theta ^i\). Formally, \(L_{i}^\nu (\rho )\) is defined by the solution to the operator equation:

$$\begin{aligned} \partial _i\rho = \int _{-1}^{1} \nu (d\lambda ) \rho ^{\frac{1+\lambda }{2}} L_{i}^\nu (\rho ) \rho ^{\frac{1-\lambda }{2}}. \end{aligned}$$
(28)

Below, we simply write \(L_{i}^\nu \) instead of \(L_{i}^\nu (\rho )\) when there is no confusion. Following the language of quantum information geometry, the m-representation of the tangent vector \(\partial _i\) is \(\partial _i\rho \), whereas \(L_{i}^\nu \) is regarded as an e-representation of the tangent vector:

$$\begin{aligned} \partial _i^{(\mathrm {m})}&= \partial _i\rho , \end{aligned}$$
(29)
$$\begin{aligned} \partial _i^{(\mathrm {e})}&= L_{i}^\nu . \end{aligned}$$
(30)

A useful identity follows from definition (27):

$$\begin{aligned} \langle L_{i}^S,X\rangle _{\rho }^{S}=\langle L_{i}^\nu ,X\rangle _{\rho }^{\nu }=\langle \partial _i^{(\mathrm {m})},X\rangle _{\mathrm {HS}},\quad \forall X\in \mathcal{L}(\mathcal{H}), \end{aligned}$$
(31)

which holds for an arbitrary measure \(\nu \). When we consider the \(\lambda \)-family of inner products, we call the corresponding operator the \(\lambda \)LD operator for the ith direction, \(L_{i}^\lambda (\rho )\), at \(\rho \).

The \(\nu \)LD operators enjoy the following relations to the RLD and SLD operators:

$$\begin{aligned} L_{i}^R&= \int _{-1}^{1} \nu (d\lambda ) \rho ^{-\frac{1-\lambda }{2}} L_{i}^\nu \rho ^{\frac{1-\lambda }{2}} , \end{aligned}$$
(32)
$$\begin{aligned} L_{i}^S&= L_{i}^\nu +\mathrm {i}\int _{-1}^{1} \nu (d\lambda ) \rho ^{-\frac{1-\lambda }{2}} \mathcal{D}_{\rho }( L_{i}^\nu )\rho ^{\frac{1-\lambda }{2}}. \end{aligned}$$
(33)

In particular, the \(\lambda \)LD operators satisfy

$$\begin{aligned} L_{i}^S=L_{i}^\lambda +\mathrm {i}\lambda \mathcal{D}_{\rho }(L_{i}^\lambda ). \end{aligned}$$
(34)
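Relation (34) is easy to check in the eigenbasis of \(\rho \): Eq. (46) gives \((L^\lambda _i)_{jk}=(\partial _i\rho )_{jk}\big /\big (\frac{1+\lambda }{2}p_j+\frac{1-\lambda }{2}p_k\big )\), while the SLD obeys \((L^S_i)_{jk}=2(\partial _i\rho )_{jk}/(p_j+p_k)\). The following Python sketch (assuming NumPy; a numerical illustration only) verifies Eq. (34) for a qubit:

```python
import numpy as np

rng = np.random.default_rng(2)

p = np.array([0.7, 0.3])                     # rho diagonal in its own basis
drho = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
drho = drho + drho.conj().T                  # Hermitian direction
drho -= np.trace(drho).real / 2 * np.eye(2)  # traceless: a tangent m-rep.

def L_lambda(lam):
    """lambda-LD operator from Eq. (46), solved componentwise."""
    denom = (1 + lam) / 2 * p[:, None] + (1 - lam) / 2 * p[None, :]
    return drho / denom

def D_rho(x):
    """Commutation operator, spectral form (D X)_{jk} = i (p_k-p_j)/(p_k+p_j) X_{jk}."""
    return 1j * (p[None, :] - p[:, None]) / (p[None, :] + p[:, None]) * x

L_S = L_lambda(0.0)                          # lambda = 0 is the SLD
for lam in (-1.0, -0.4, 0.3, 1.0):
    L = L_lambda(lam)
    # Eq. (34): L^S = L^lambda + i lambda D_rho(L^lambda)
    assert np.allclose(L_S, L + 1j * lam * D_rho(L))
```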

The \(\nu \)LD tangent space at \(\rho \) is the real span of the \(\nu \)LD operators:

$$\begin{aligned} T_\rho ^\nu (\mathcal{M}):=\mathrm {span}_{\mathbb R}\{ L_{i}^\nu (\rho )\}, \end{aligned}$$
(35)

and its complex extension is denoted by

$$\begin{aligned} \tilde{T}_\rho ^\nu (\mathcal{M}):=\mathrm {span}_{\mathbb C}\{ L_{i}^\nu (\rho )\}. \end{aligned}$$
(36)

The \(\nu \)LD tangent space should be regarded as a representation of the tangent space \(T_\rho (\mathcal{M})=\mathrm {span}_{\mathbb R} \{ (\partial _i)_\rho \}\). A crucial point here is that each different choice of the \(\nu \)LD inner product gives a different representation \(T_\rho ^\nu (\mathcal{M})\). Since the \(\nu \)LD operators represent tangent vectors, they satisfy

$$\begin{aligned} \mathrm {tr}\left( \rho L_{i}^\nu \right) =0\quad (\rho \in \mathcal{M}). \end{aligned}$$

Hence, \(T_\rho ^\nu (\mathcal{M})\) is an n-dimensional linear subspace of the space:

$$\begin{aligned} \mathcal{L}_{h,0}(\mathcal{H}):=\{X\in \mathcal{L}_h(\mathcal{H})\,|\, \langle X,\rho \rangle _{\mathrm {HS}}=0\}, \end{aligned}$$

which is D-dimensional (\(D=d^2-1\)).

When we take the set of all quantum states as the manifold \(\mathcal{S}\), the tangent space can be identified with the set of Hermitian operators orthogonal to \(\rho \) with respect to the Hilbert–Schmidt inner product. Therefore,

$$\begin{aligned} T_\rho (\mathcal{S}) \cong \mathcal{L}_{h,0}(\mathcal{H}), \end{aligned}$$
(37)

holds as a linear isomorphism. This is easily proven from the fact that \(T_\rho (\mathcal{S})\) is represented as a linear subspace of \(\mathcal{L}_{h,0}(\mathcal{H})\) and that their dimensions are the same. In this case, the complex span of the \(\nu \)LD operators is identified as

$$\begin{aligned} \tilde{T}_\rho ^\nu (\mathcal{S}) = \mathcal{L}_{0}(\mathcal{H}):=\{X\in \mathcal{L}(\mathcal{H})\,|\, \langle X,\rho \rangle _{\mathrm {HS}}=0\}. \end{aligned}$$
(38)

This result shows that the different representations of the tangent space collapse into one for \(\mathcal{M}=\mathcal{S}\).

2.6 Quantum Fisher metric and \(\nu \)LD covariance matrix

The \(\nu \)LD Fisher metric, or the \(\nu \)LD Fisher information matrix, is defined by

$$\begin{aligned} G^\nu&:= \big [ g_{ij}^\nu \big ],\nonumber \\ g_{ij}^\nu (\rho )&:= \langle L_{i}^\nu (\rho ),L_{j}^\nu (\rho )\rangle _{\rho }^{\nu }. \end{aligned}$$
(39)

By definition, \(G^\nu \) is a positive-definite Hermitian matrix. Using the e- and m-representations of the tangent vectors, we can also write the metric tensor as

$$\begin{aligned} g_{ij}^\nu = \langle \partial _i\rho ,L_{j}^\nu \rangle _{\mathrm {HS}}=\langle \partial _i^{(\mathrm {m})},\partial _j^{(\mathrm {e})}\rangle _{\mathrm {HS}}, \end{aligned}$$
(40)

where \(\partial _j^{(\mathrm {e})}= L_{j}^\nu \) is the e-representation of the \(\nu \)LD operator. When considering the \(\lambda \)-family, we call it the \(\lambda \)LD Fisher metric.

Let \(\big (G^\nu \big )^{-1}\) be the inverse of \(G^\nu \), and denote its ij component by \(g^{\nu ,ij}\). The \(\nu \)LD dual operators \(L^{\nu ,i}\) are defined by

$$\begin{aligned} L^{\nu ,i}:= \sum _{k=1}^n g^{\nu ,ki} L_{k}^\nu . \end{aligned}$$
(41)

They form a dual basis with respect to the \(\nu \)LD inner product,

$$\begin{aligned} \langle L^{\nu ,i},L_{j}^\nu \rangle _{\rho }^{\nu }=\delta _{\,j}^{i}. \end{aligned}$$
(42)

It follows that \(g^{\nu ,ij} = \langle L^{\nu ,i},L^{\nu ,j}\rangle _{\rho }^{\nu }\) holds.

We can associate a statistical meaning to the \(\nu \)LD Fisher metric as follows. Given an n-dimensional quantum parametric model \(\mathcal{M}=\{\rho _\theta |\theta \in \varTheta \}\), we consider a vector of Hermitian operators \(\mathbf {X}=(X^1,\dots ,X^n)\in \mathcal{L}_h(\mathcal{H})^n\). We define the \(\nu \)LD covariance matrix (at \(\rho _\theta \in \mathcal{M}\)) by

$$\begin{aligned} V_{\rho _\theta }^\nu [\mathbf {X}]&:=\big [ v_{\rho _\theta }^{\nu ,ij}[\mathbf {X}]\big ],\nonumber \\ v_{\rho _\theta }^{\nu ,ij}[\mathbf {X}]&:= \langle X^i,X^j\rangle ^\nu _{\rho _\theta }. \end{aligned}$$
(43)

When \(\mathbf {X}=(X^1,\ldots ,X^n)\) obeys the relations \( \mathrm {tr}\left( \rho _\theta X^i\right) =\theta ^i\) and \(\partial _i \mathrm {tr}\left( \rho _\theta X^j\right) =\delta _i^j\) for all i, j, \(\mathbf {X}\) is said to be locally unbiased at \(\rho _\theta \). The following theorem is a direct consequence of the inner product structure [1].

Theorem 2

For any locally unbiased \(\mathbf {X}\) at \(\rho _\theta \), the matrix inequality,

$$\begin{aligned} V_{\rho _\theta }^\nu [\mathbf {X}] \ge \big (G^\nu (\rho _\theta ) \big )^{-1}, \end{aligned}$$
(44)

holds for every choice of the quantum Fisher metric defined by the measure \(\nu \).
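For orientation, the sketch below (assuming NumPy; the centered variance is used, and all names are illustrative) verifies Theorem 2 in the simplest setting: the one-parameter qubit model \(\rho _\theta =(I+\theta \sigma _3)/2\) with the SLD (\(\lambda =0\)) inner product, where the SLD Fisher information is \(1/(1-\theta ^2)\) and a locally unbiased observable saturating the bound is \(X=\theta +L/g\):

```python
import numpy as np

theta = 0.4
p = np.array([(1 + theta) / 2, (1 - theta) / 2])  # eigenvalues of rho_theta
rho = np.diag(p)
drho = np.diag([0.5, -0.5])                       # d rho / d theta

# SLD from d rho = (rho L + L rho)/2, solved in the eigenbasis
L = 2 * drho / (p[:, None] + p[None, :])
g = np.trace(drho @ L).real                       # SLD Fisher information
assert np.isclose(g, 1 / (1 - theta ** 2))

X = theta * np.eye(2) + L / g
assert np.isclose(np.trace(rho @ X).real, theta)  # unbiased at theta
assert np.isclose(np.trace(drho @ X).real, 1.0)   # locally unbiased
V = np.trace(rho @ X @ X).real - theta ** 2       # centered SLD variance
assert np.isclose(V, 1 / g)                       # bound (44) saturated
```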

In the above construction, a Riemannian metric is obtained from an inner product, i.e., a generalized quantum covariance. Such a construction is normally referred to as the metric induced from a generalized covariance. This is different from the formulation in Petz’s theorem. To find the relation to the Petz function formalism, we need to construct a function f satisfying the relation

$$\begin{aligned} f(\varDelta _\rho )^{-1} [(\partial _i\rho )\rho ^{-1}] = L_{i}^\nu , \end{aligned}$$
(45)

where \(L_{i}^\nu \) is given by the integral equation (28). While this is a non-trivial problem in general, the \(\lambda \)LD Fisher metric case can be solved as follows.

From the definition (28), we have

$$\begin{aligned} \partial _i\rho = \frac{1+\lambda }{2}\rho L^\lambda _i +\frac{1-\lambda }{2}L^\lambda _i\rho . \end{aligned}$$
(46)

With the help of the RLD and LLD, we immediately identify

$$\begin{aligned} f_\lambda (t) = \frac{1+\lambda }{2}f_\mathrm {RLD}(t)+\frac{1-\lambda }{2}f_\mathrm {LLD}(t) =\frac{1+\lambda }{2}t+\frac{1-\lambda }{2}. \end{aligned}$$
(47)

In other words, the \(\lambda \)-family of monotone metrics is given by a convex mixture of the two Petz functions \(f_\mathrm {RLD}\) and \(f_\mathrm {LLD}\). Since a convex mixture of two operator monotone functions is also operator monotone, \(f_\lambda \) defines a monotone metric.

The conjugate of this Petz function exhibits the following symmetry:

$$\begin{aligned} f^*_\lambda (t) = \frac{1+\lambda }{2}+\frac{1-\lambda }{2} t = f_{-\lambda }(t). \end{aligned}$$
(48)

Thus, it gives a real metric if and only if \(\lambda =0\) (the SLD Fisher metric). Also, the symmetrized metric is the SLD Fisher metric.

We next consider the real part of the \(\lambda \)LD Fisher metric. Formula (9) immediately gives

$$\begin{aligned} f^{\mathrm {Re}\,}_{\lambda }(t)=\frac{1+t}{2}-\frac{1}{2}\frac{(1-t)^2}{1+t}\lambda ^2. \end{aligned}$$
(49)

To show monotonicity of the real part of the \(\lambda \)LD Fisher metric, it suffices to prove the following ordering relation:

$$\begin{aligned} f_{\max }\ge f^{\mathrm {Re}\,}_{\lambda } \ge f_{\min } . \end{aligned}$$
(50)

The first inequality follows because the second term in Eq. (49) is subtracted and is nonnegative. To prove the second inequality, we rewrite it as

$$\begin{aligned} \frac{1}{2} (1+t)^2-\frac{1}{2} (1-t)^2\lambda ^2\ge 2t\ \Leftrightarrow \ (1-\lambda ^2)(1-t)^2\ge 0. \end{aligned}$$
(51)

This is always true for all \(\lambda \in [-1,1]\).
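This ordering is also easy to confirm numerically; the following sketch (assuming NumPy) checks Eq. (50) on a grid of \((t,\lambda )\) values and that \(f^{\mathrm {Re}\,}_{\lambda }\) coincides with \(f_{\min }\) at the endpoints \(\lambda =\pm 1\):

```python
import numpy as np

# Eq. (49) and the bounds of Eq. (50)
f_min = lambda t: 2 * t / (1 + t)
f_max = lambda t: (1 + t) / 2
def f_re(t, lam):
    return (1 + t) / 2 - 0.5 * (1 - t) ** 2 / (1 + t) * lam ** 2

ts = np.linspace(0.01, 50.0, 2000)
for lam in np.linspace(-1, 1, 9):
    vals = f_re(ts, lam)
    assert np.all(vals <= f_max(ts) + 1e-12)   # f^Re <= f_max
    assert np.all(vals >= f_min(ts) - 1e-12)   # f^Re >= f_min
# At the endpoints lambda = +-1, f^Re coincides with f_min:
assert np.allclose(f_re(ts, 1.0), f_min(ts))
assert np.allclose(f_re(ts, -1.0), f_min(ts))
```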

2.7 Basic lemmas

First, the following lemma is fundamental.

Lemma 3

For any smooth function (differentiable sufficiently many times) \(f:{\mathbb R}_+\rightarrow {\mathbb R}\), the following identity holds:

$$\begin{aligned} \langle f(\varDelta _\rho ) (X),Y\rangle _{\rho }^{\nu }=\langle X,f(\varDelta _\rho )(Y)\rangle _{\rho }^{\nu }, \end{aligned}$$
(52)

for all \(X,Y\in \mathcal{L}(\mathcal{H})\) and for every choice of a measure \(\nu \).

Proof

We first note that \(\mathrm {tr}\left( \varDelta _{\rho }(X)Y\right) =\mathrm {tr}\left( X \varDelta _\rho ^{-1}(Y)\right) \) holds for \(X,Y\in \mathcal{L}(\mathcal{H})\). Repeating this yields \(\mathrm {tr}\left( \varDelta _\rho ^k(X)Y\right) =\mathrm {tr}\left( X \varDelta _\rho ^{-k}(Y)\right) \) for \(k\in {\mathbb N}\). By the Taylor expansion of f, we thus get

$$\begin{aligned} \mathrm {tr}\left( f(\varDelta _\rho )(X)Y\right) =\mathrm {tr}\left( X f(\varDelta _\rho ^{-1})(Y)\right) , \end{aligned}$$
(53)

for \(f:{\mathbb R}_+\rightarrow {\mathbb R}\). Second, we have \(\varDelta _{\rho }(X)^\dagger =(\rho X\rho ^{-1})^\dagger =\rho ^{-1}X^\dagger \rho = \varDelta _\rho ^{-1}(X^\dagger )\). Repeating this gives the relation,

$$\begin{aligned} \left[ f(\varDelta _\rho )(X)\right] ^\dagger =f(\varDelta _\rho ^{-1})(X^\dagger ), \end{aligned}$$
(54)

for any \(C^\infty \) function \(f:{\mathbb R}_+\rightarrow {\mathbb R}\).

Next, definition (25), Eq. (53), and Eq. (54) prove the following relation:

$$\begin{aligned} \langle f(\varDelta _\rho ) (X),Y\rangle _{\rho }^{\nu }&= \int _{-1}^{1} \nu (d\lambda ) \mathrm {tr}\left( \rho ^{\frac{1-\lambda }{2}} \left[ f(\varDelta _\rho ) (X)\right] ^\dagger \rho ^{\frac{1+\lambda }{2}} Y \right) \end{aligned}$$
(55)
$$\begin{aligned}&=\int _{-1}^{1} \nu (d\lambda ) \mathrm {tr}\left( f(\varDelta _\rho ^{-1})( \rho ^{\frac{1-\lambda }{2}} X^\dagger \rho ^{\frac{1+\lambda }{2}}) Y \right) \end{aligned}$$
(56)
$$\begin{aligned}&=\int _{-1}^{1} \nu (d\lambda ) \mathrm {tr}\left( \rho ^{\frac{1-\lambda }{2}} X^\dagger \rho ^{\frac{1+\lambda }{2}}f(\varDelta _\rho )(Y) \right) \end{aligned}$$
(57)
$$\begin{aligned}&=\langle X,f(\varDelta _\rho )(Y)\rangle _{\rho }^{\nu }. \end{aligned}$$
(58)

\(\square \)
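As a sanity check of Lemma 3 (a numerical illustration, not part of the proof), one can verify the identity (52) for a point measure \(\nu =\delta _\lambda \), for which the inner product reduces to \(\mathrm {tr}(\rho ^{(1-\lambda )/2}X^\dagger \rho ^{(1+\lambda )/2}Y)\); in the eigenbasis of \(\rho \), the action of \(f(\varDelta _\rho )\) multiplies matrix elements by \(f(p_j/p_k)\):

```python
import numpy as np

rng = np.random.default_rng(0)

def random_state(d):
    # random full-rank density matrix
    A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = A @ A.conj().T + 0.1 * np.eye(d)
    return rho / np.trace(rho).real

def mod_op(rho, f, X):
    # f(Delta_rho)(X): entries f(p_j/p_k) X~_{jk} in the eigenbasis of rho
    p, U = np.linalg.eigh(rho)
    Xe = U.conj().T @ X @ U
    return U @ (f(p[:, None] / p[None, :]) * Xe) @ U.conj().T

def ip(rho, X, Y, lam):
    # point-measure inner product tr(rho^{(1-l)/2} X^dag rho^{(1+l)/2} Y)
    p, U = np.linalg.eigh(rho)
    ra = U @ np.diag(p ** ((1 - lam) / 2)) @ U.conj().T
    rb = U @ np.diag(p ** ((1 + lam) / 2)) @ U.conj().T
    return np.trace(ra @ X.conj().T @ rb @ Y)

d = 3
rho = random_state(d)
X = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
Y = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
f = lambda t: t ** 2 + 3 / t + 1   # arbitrary smooth f on (0, inf)
for lam in (-0.7, 0.0, 0.4, 1.0):
    lhs = ip(rho, mod_op(rho, f, X), Y, lam)
    rhs = ip(rho, X, mod_op(rho, f, Y), lam)
    assert abs(lhs - rhs) < 1e-8
print("Lemma 3 verified for point measures")
```

Since a general \(\nu \) is a mixture of point measures, linearity extends the check to arbitrary \(\nu \).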

The next corollary, a direct consequence of this lemma, is useful.

Corollary 2

$$\begin{aligned} \langle \mathcal{D}_\rho (X),Y\rangle _{\rho }^{\nu }=-\langle X,\mathcal{D}_\rho (Y)\rangle _{\rho }^{\nu }, \end{aligned}$$
(59)

that holds for all \(X,Y\in \mathcal{L}(\mathcal{H})\) and for any choice of a measure \(\nu \).

Proof

From Lemma 2, we have \(\mathcal{D}_\rho (X)=\mathrm {i}f_{\mathcal{D}}(\varDelta _\rho )(X)\). Using Lemma 3, we obtain

$$\begin{aligned} \langle \mathcal{D}_\rho (X),Y\rangle _{\rho }^{\nu }&=-\mathrm {i}\langle f_{\mathcal{D}}(\varDelta _\rho )(X),Y\rangle _{\rho }^{\nu } \end{aligned}$$
(60)
$$\begin{aligned}&=-\mathrm {i}\langle X,f_{\mathcal{D}}(\varDelta _\rho )(Y)\rangle _{\rho }^{\nu } \end{aligned}$$
(61)
$$\begin{aligned}&=-\langle X,\mathrm {i}f_{\mathcal{D}}(\varDelta _\rho )(Y)\rangle _{\rho }^{\nu } \end{aligned}$$
(62)
$$\begin{aligned}&=-\langle X,\mathcal{D}_\rho (Y)\rangle _{\rho }^{\nu }. \end{aligned}$$
(63)

\(\square \)

The \(\lambda \)-family version of this corollary will be used later.

$$\begin{aligned} \langle \mathcal{D}_\rho (X),Y\rangle _{\rho }^{\lambda }=-\langle X,\mathcal{D}_\rho (Y)\rangle _{\rho }^{\lambda },\quad \forall X,Y\in \mathcal{L}(\mathcal{H}). \end{aligned}$$
(64)

The next lemma concerns the difference between two different representations of the tangent vector \(L^{\nu }_{i}\). In general, there is no direct relationship between \(L^{\nu }_i\) and \(L^{\nu '}_i\), as they belong to different sets. However, their duals have the following property:

Lemma 4

\(L^{\nu ,i}-L^{\nu ',i}\) is orthogonal to the \(\nu \)LD tangent space \(T^\nu _\rho (\mathcal{M})\) with respect to \(\langle \cdot ,\cdot \rangle _{\rho }^{\nu }\), and is orthogonal to the \(\nu '\)LD tangent space \(T^{\nu '}_\rho (\mathcal{M})\) with respect to \(\langle \cdot ,\cdot \rangle ^{\nu '}_{\rho }\) at every point \(\rho \in \mathcal{M}\).

Proof

We show that \(L^{\nu ,i}-L^{\nu ',i}\) is orthogonal to \(\partial _j\rho \) (\(j=1,2,\ldots ,n\)) with respect to the Hilbert–Schmidt inner product. Identity (31) then implies the statement in this lemma. Orthogonality is shown by direct calculation as follows:

$$\begin{aligned} \langle L^{\nu ,i}-L^{\nu ',i},\partial _j\rho \rangle _{\mathrm {HS}}&=\langle L^{\nu ,i},\partial _j\rho \rangle _{\mathrm {HS}}-\langle L^{\nu ',i},\partial _j\rho \rangle _{\mathrm {HS}}\\&= \langle L^{\nu ,i},L^{\nu }_{j}\rangle _{\rho }^{\nu }-\langle L^{\nu ',i},L^{\nu '}_{j}\rangle _{\rho }^{\nu '}\\&=\delta ^i_j-\delta ^i_j=0. \end{aligned}$$

\(\square \)

For the \(\lambda \)-family of inner products, we have a simple formula to relate the difference between two inner products to the SLD inner product. Its proof directly follows from definitions and Corollary 2.

Lemma 5

For any \(\lambda ,\lambda '\in [-1,1]\), the following identity holds.

$$\begin{aligned} \langle X,Y\rangle ^\lambda _{\rho }-\langle X,Y\rangle ^{\lambda '}_{\rho } =(\lambda -\lambda ') \langle X,\mathrm {i}\mathcal{D}_{\rho }(Y)\rangle _{\rho }^{S} ,\quad \forall X,Y\in \mathcal{L}(\mathcal{H}). \end{aligned}$$
(65)

Proof

It follows from the definition that

$$\begin{aligned} \langle X,Y\rangle ^\lambda _{\rho }=\mathrm {tr}\left( \rho X^\dagger f_\lambda (\varDelta _\rho )(Y)\right) . \end{aligned}$$
(66)

Since \(f_\lambda (t)-f_{\lambda '}(t)=\frac{1}{2}(\lambda -\lambda ')(t-1) \), we obtain

$$\begin{aligned} \langle X,Y\rangle ^\lambda _{\rho }-\langle X,Y\rangle ^{\lambda '}_{\rho }&=\frac{1}{2}(\lambda -\lambda ') \mathrm {tr}\left( \rho X^\dagger (\varDelta _\rho -1)(Y)\right) \end{aligned}$$
(67)
$$\begin{aligned}&=\frac{1}{2}(\lambda -\lambda ') \mathrm {tr}\left( X^\dagger [\rho ,\,Y]\right) \end{aligned}$$
(68)
$$\begin{aligned}&=\frac{1}{2}(\lambda -\lambda ') \mathrm {tr}\left( X^\dagger \{\rho ,\,\mathrm {i}\mathcal{D}_\rho (Y)\}\right) \end{aligned}$$
(69)
$$\begin{aligned}&=(\lambda -\lambda ') \langle X,\mathrm {i}\mathcal{D}_{\rho }(Y)\rangle _{\rho }^{S}. \end{aligned}$$
(70)

\(\square \)
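Lemma 5 can also be checked numerically. The sketch below assumes the concrete forms implied by the preceding proof: the \(\lambda \)LD inner product \(\langle X,Y\rangle ^\lambda _\rho =\frac{1-\lambda }{2}\mathrm {tr}(\rho X^\dagger Y)+\frac{1+\lambda }{2}\mathrm {tr}(X^\dagger \rho Y)\), obtained from Eq. (66) with \(f_\lambda (t)=(1+t)/2+\lambda (t-1)/2\), and the commutation operator \(\mathcal{D}_\rho \) defined through \([\rho ,Y]=\mathrm {i}\{\rho ,\mathcal{D}_\rho (Y)\}\):

```python
import numpy as np

rng = np.random.default_rng(1)
d = 3
A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
rho = A @ A.conj().T + 0.1 * np.eye(d)
rho /= np.trace(rho).real

def ip_lam(X, Y, lam):
    # lambda-LD inner product, from <X,Y> = tr(rho X^dag f_lam(Delta)(Y))
    # with f_lam(t) = (1+t)/2 + (lam/2)(t-1)  (assumed form)
    return ((1 - lam) / 2) * np.trace(rho @ X.conj().T @ Y) \
         + ((1 + lam) / 2) * np.trace(X.conj().T @ rho @ Y)

def ip_sld(X, Y):
    # SLD inner product <X,Y>^S = tr(rho (X^dag Y + Y X^dag))/2
    return 0.5 * np.trace(X.conj().T @ (rho @ Y + Y @ rho))

def D(Y):
    # commutation operator, assumed defined by [rho, Y] = i {rho, D(Y)}
    p, U = np.linalg.eigh(rho)
    Ye = U.conj().T @ Y @ U
    Z = -1j * (p[:, None] - p[None, :]) / (p[:, None] + p[None, :]) * Ye
    return U @ Z @ U.conj().T

X = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
Y = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
for lam, lamp in [(0.3, -0.8), (1.0, 0.0), (-1.0, 0.5)]:
    lhs = ip_lam(X, Y, lam) - ip_lam(X, Y, lamp)
    rhs = (lam - lamp) * ip_sld(X, 1j * D(Y))   # Eq. (65)
    assert abs(lhs - rhs) < 1e-10
print("Lemma 5 verified numerically")
```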

3 Non-monotone metric

3.1 \((\nu ,\nu ')\)-Fisher metric

In our previous study [33], we introduced two specific non-monotone metrics without detailed analysis. Generalizing this idea, we define the following matrix:

Definition 2

Let \(\{L^{\nu }_i\}\) be the set of the \(\nu \)LD operators and consider the \(\nu '\)LD inner product. We define the \(n\times n\) matrix:

$$\begin{aligned} G^{\nu ,\nu '}&:= \big [ g^{\nu ,\nu '}_{ij}\big ],\nonumber \\ g^{\nu ,\nu '}_{ij}&:= \langle L^{\nu }_i,L^{\nu }_j\rangle _{\rho }^{\nu '}. \end{aligned}$$
(71)

From this definition, we can show that \(G^{\nu ,\nu '}\) is positive-definite. It then follows that \(G^{\nu ,\nu '}\) defines an inner product for the \(\nu \)LD tangent space \(T_\rho ^{\nu }(\mathcal{M})\) at each point \(\rho \in \mathcal{M}\), and hence, \(G^{\nu ,\nu '}\) is a metric tensor. In this way, we can introduce another family of Riemannian metrics. In the following, we call this metric a \((\nu ,\nu ')\)-Fisher metric. The case \(\nu =\nu '\) reduces to the \(\nu \)LD Fisher metric (39). When considering the \(\lambda \)-family, we call it a \((\lambda ,\lambda ')\)-Fisher metric.
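For a concrete illustration of positive definiteness, consider the full qubit model \(\rho (\theta )=(I+\theta \cdot \sigma )/2\). The sketch below assumes the \(\lambda \)LD convention \(\partial _i\rho =\frac{1}{2}[(1+\lambda )\rho L^\lambda _i+(1-\lambda )L^\lambda _i\rho ]\) (so that \(\lambda =0\) gives the SLD and \(\lambda =1\) the RLD) together with the inner-product form \(\langle X,Y\rangle ^\lambda _\rho =\frac{1-\lambda }{2}\mathrm {tr}(\rho X^\dagger Y)+\frac{1+\lambda }{2}\mathrm {tr}(X^\dagger \rho Y)\):

```python
import numpy as np

# Pauli matrices and a qubit model rho(theta) = (I + theta . sigma)/2
sx = np.array([[0, 1], [1, 0]], complex)
sy = np.array([[0, -1j], [1j, 0]], complex)
sz = np.array([[1, 0], [0, -1]], complex)
theta = np.array([0.3, -0.2, 0.4])
rho = 0.5 * (np.eye(2) + theta[0] * sx + theta[1] * sy + theta[2] * sz)
drho = [0.5 * sx, 0.5 * sy, 0.5 * sz]   # partial_i rho
p, U = np.linalg.eigh(rho)

def lld(dr, lam):
    # lambda-LD operator solving d rho = ((1+lam) rho L + (1-lam) L rho)/2
    # (assumed convention: lam=0 -> SLD, lam=1 -> RLD)
    dre = U.conj().T @ dr @ U
    Le = 2 * dre / ((1 + lam) * p[:, None] + (1 - lam) * p[None, :])
    return U @ Le @ U.conj().T

def ip_lam(X, Y, lam):
    return ((1 - lam) / 2) * np.trace(rho @ X.conj().T @ Y) \
         + ((1 + lam) / 2) * np.trace(X.conj().T @ rho @ Y)

def G(lam, lamp):
    # (lam, lam')-Fisher matrix, Eq. (71)
    Ls = [lld(dr, lam) for dr in drho]
    return np.array([[ip_lam(Li, Lj, lamp) for Lj in Ls] for Li in Ls])

for lam, lamp in [(0.0, 1.0), (0.6, -0.9), (-1.0, 0.2)]:
    g = G(lam, lamp)
    assert np.allclose(g, g.conj().T)        # Hermitian
    assert np.linalg.eigvalsh(g).min() > 0   # positive definite
print("(lambda, lambda')-Fisher metric is positive definite on this model")
```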

3.2 Petz function representation

We construct the Petz function for the \((\lambda ,\lambda ')\)-Fisher metric. To that end, we need to find the Petz function \(f_{\lambda ,\lambda '}\) such that the following relations hold.

$$\begin{aligned} g^{\lambda ,\lambda '}_{ij}&= \langle \partial _i\rho ,L^{{(\mathrm {e})}}_j\rangle _{\mathrm {HS}}, \end{aligned}$$
(72)
$$\begin{aligned} (\partial _j\rho ) \rho ^{-1}&=f_{\lambda ,\lambda '}(\varDelta _\rho ) (L^{{(\mathrm {e})}}_j). \end{aligned}$$
(73)

A key observation is that the \(\lambda \)LD inner product (26) is expressed as

$$\begin{aligned} \langle X,Y\rangle _{\rho }^{\lambda }=\langle X\rho ,f_\lambda (\varDelta _\rho )(Y)\rangle _{\mathrm {HS}}. \end{aligned}$$
(74)

With this, the e-representation of the \(\lambda \)LD \(L^{\lambda }_i=f_\lambda (\varDelta _\rho )^{-1}(\partial _i\rho ) \rho ^{-1}\) yields,

$$\begin{aligned} g^{\lambda ,\lambda '}_{ij}&=\langle L^{\lambda }_i,L^{\lambda }_j\rangle _{\rho }^{\lambda '} \end{aligned}$$
(75)
$$\begin{aligned}&=\langle L^{\lambda }_i\rho ,f_{\lambda '}(\varDelta _\rho )(L^{\lambda }_j)\rangle _{\mathrm {HS}} \end{aligned}$$
(76)
$$\begin{aligned}&=\langle f_\lambda (\varDelta _\rho )^{-1}(\partial _i\rho ),f_{\lambda '}(\varDelta _\rho )(L^{\lambda }_j)\rangle _{\mathrm {HS}} \end{aligned}$$
(77)
$$\begin{aligned}&=\langle \partial _i\rho ,f_\lambda (\varDelta _\rho )^{-1}(f_{\lambda '}(\varDelta _\rho )(L^{\lambda }_j))\rangle _{\mathrm {HS}}, \end{aligned}$$
(78)

where the last equality follows from Lemma 3. We then identify

$$\begin{aligned} L^{{(\mathrm {e})}}_j&=f_\lambda (\varDelta _\rho )^{-1}(f_{\lambda '}(\varDelta _\rho )(L^{\lambda }_j)) \end{aligned}$$
(79)
$$\begin{aligned}&=(f_\lambda ^{-1} f_{\lambda '})(\varDelta _\rho )(L^{\lambda }_j) \end{aligned}$$
(80)
$$\begin{aligned}&=(f_\lambda ^{-1} f_{\lambda '} f_\lambda ^{-1})(\varDelta _\rho )(\partial _j\rho ) \rho ^{-1}, \end{aligned}$$
(81)

where the multiplication of two functions \(f_1f_2\) is defined by the mapping \(f_1f_2(t):=f_1(t)f_2(t)\). Therefore, we obtain

$$\begin{aligned} f_{\lambda ,\lambda '}(t)=\frac{[f_{\lambda }(t)]^2}{f_{\lambda '}(t)}. \end{aligned}$$
(82)

We remark that \(f_{\lambda ,\lambda '}\) is generally not symmetric. Explicitly, its conjugate is

$$\begin{aligned} f^*_{\lambda ,\lambda '}=f_{-\lambda ,-\lambda '}. \end{aligned}$$
(83)

Thus, it is symmetric if and only if \(\lambda =\lambda '=0\).
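Relation (83) is easy to confirm numerically, assuming the standard definition of the conjugate Petz function \(f^*(t)=tf(1/t)\):

```python
import numpy as np

f = lambda t, lam: (1 + t) / 2 + (lam / 2) * (t - 1)       # f_lambda (assumed form)
fll = lambda t, lam, lamp: f(t, lam) ** 2 / f(t, lamp)     # Eq. (82)

ts = np.linspace(0.05, 40, 1000)
for lam, lamp in [(0.7, -0.2), (1.0, 0.3), (-0.4, 0.9)]:
    # conjugate Petz function f*(t) = t f(1/t)  (assumed definition)
    conj = ts * fll(1 / ts, lam, lamp)
    assert np.allclose(conj, fll(ts, -lam, -lamp))   # Eq. (83)
print("f*_{lam,lam'} = f_{-lam,-lam'} verified")
```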

3.3 Non-monotonicity

In this subsection, we analyze the non-monotonicity of the \((\lambda ,\lambda ')\)-Fisher metric. Since the case \(\lambda =\lambda '\) reduces to the \(\lambda \)LD Fisher metric, which is a monotone metric, we consider \(\lambda \ne \lambda '\) only. According to Petz’s Theorem 1, the metric is monotone if and only if \(f_{\lambda ,\lambda '}\) is operator monotone.

For the (\(\lambda ,\lambda '\))-Fisher metric, we can show that its Petz function \(f_{\lambda ,\lambda '}\) is strictly convex for \(\lambda \ne \lambda '\). This is because its second derivative is

$$\begin{aligned} \frac{d^2 f_{\lambda ,\lambda '}}{dt^2}(t) =\frac{1}{2} \frac{(\lambda - \lambda ')^2}{[f_{\lambda '}(t)]^3} > 0 \quad (\lambda \ne \lambda '). \end{aligned}$$
(84)

It is known that \(g^f\) is a monotone metric if and only if f is operator concave; hence, concavity of f is necessary. Therefore, strict convexity of \(f_{\lambda ,\lambda '}\) implies non-monotonicity of the (\(\lambda ,\lambda '\))-Fisher metric.
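Strict convexity of \(f_{\lambda ,\lambda '}\) for \(\lambda \ne \lambda '\) can also be confirmed by a finite-difference check of the second derivative:

```python
import numpy as np

f = lambda t, lam: (1 + t) / 2 + (lam / 2) * (t - 1)   # f_lambda (assumed form)
fll = lambda t, lam, lamp: f(t, lam) ** 2 / f(t, lamp)

h = 1e-4
ts = np.linspace(0.1, 20, 500)
for lam, lamp in [(0.8, 0.1), (-0.5, 0.5), (1.0, -1.0)]:
    # central finite difference for the second derivative
    d2 = (fll(ts + h, lam, lamp) - 2 * fll(ts, lam, lamp)
          + fll(ts - h, lam, lamp)) / h ** 2
    assert np.all(d2 > 0)   # strictly convex whenever lam != lam'
print("strict convexity of f_{lam,lam'} confirmed")
```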

Next, we analyze its symmetrized metric. The Petz function is

$$\begin{aligned} \bar{f}_{\lambda ,\lambda '}= \frac{1}{2}(f_{\lambda ,\lambda '}+f^*_{\lambda ,\lambda '}) =\frac{1}{2}(f_{\lambda ,\lambda '}+f_{-\lambda ,-\lambda '}) , \end{aligned}$$
(85)

where \(f^*_{\lambda ,\lambda '}=f_{-\lambda ,-\lambda '}\) [Eq. (83)] is used. Clearly, this is a sum of two strictly convex functions. Thus, \(\bar{f}_{\lambda ,\lambda '}\) is also strictly convex, which shows that the symmetrized metric is not monotone.

Lastly, let us consider the real part of the (\(\lambda ,\lambda '\))-Fisher metric. Using Eq. (9) gives

$$\begin{aligned} f_{\lambda ,\lambda '}^{\mathrm {Re}\,}=2\left( \frac{f_{\lambda '}}{f_{\lambda }^2}+\frac{f_{-\lambda '}}{f_{-\lambda }^2}\right) ^{-1}. \end{aligned}$$
(86)

Note that this Petz function reduces to a trivial one when \(\lambda =0\):

$$\begin{aligned} f_{0,\lambda '}^{\mathrm {Re}\,}=f_{\mathrm {SLD}}, \end{aligned}$$
(87)

which is operator monotone. The next result is our main contribution.

Theorem 3

The real part of the (\(\lambda ,\lambda '\))-Fisher metric is a monotone metric if and only if the following two conditions are satisfied.

$$\begin{aligned} \lambda ^4-3\lambda ^2+2\lambda \lambda '&\le 0, \end{aligned}$$
(88)
$$\begin{aligned} -3\lambda ^2+2\lambda \lambda '+1&\ge 0. \end{aligned}$$
(89)

In Fig. 1, we draw the regions in the (\(\lambda ,\lambda '\)) plane where \(f_{\lambda ,\lambda '}^{\mathrm {Re}\,}\) is an operator monotone function; they are shown in gray.

Fig. 1

Operator monotone regions for the Petz function \(f_{\lambda ,\lambda '}^{\mathrm {Re}\,}\) (gray regions)

Proof

Since the Petz function for the choice \(\lambda =0\) is operator monotone, we assume \( \lambda \ne 0\). A key idea of the proof is that the Petz function of any real monotone metric satisfies the relations (11) and (12) (Petz’s theorem). In other words, we need to prove that the two conditions in Theorem 3 are equivalent to

$$\begin{aligned} f_{\max }\ge f_{\lambda ,\lambda '}^{\mathrm {Re}\,} \ge f_{\min }. \end{aligned}$$
(90)

First, we note that

$$\begin{aligned} (f_{\lambda ,\lambda '}^{\mathrm {Re}\,})^{-1}- (f_{\max })^{-1}&=\frac{1}{4}\frac{(1-f_{\mathrm {SLD}})^2F^{\max }_{\lambda ,\lambda '}}{f_{\mathrm {SLD}}f_{\lambda }^2f_{-\lambda }^2}, \end{aligned}$$
(91)
$$\begin{aligned} F^{\max }_{\lambda ,\lambda '}(t)&:= (-\lambda ^4+3\lambda ^2-2\lambda \lambda ')(1+t)^2+4\lambda ^4 t. \end{aligned}$$
(92)

Therefore, we have \(f_{\max }\ge f_{\lambda ,\lambda '}^{\mathrm {Re}\,}\Leftrightarrow F^{\max }_{\lambda ,\lambda '}\ge 0\). It is rather lengthy yet elementary to analyze this quadratic function. The result is

$$\begin{aligned} f_{\max }\ge f_{\lambda ,\lambda '}^{\mathrm {Re}\,}\ \Leftrightarrow \ \lambda ^4-3\lambda ^2+2\lambda \lambda '\le 0. \end{aligned}$$
(93)

Next, we have

$$\begin{aligned} (f_{\min })^{-1}-(f_{\lambda ,\lambda '}^{\mathrm {Re}\,})^{-1}&=\frac{1}{4}\frac{(1-f_{\mathrm {SLD}})^2F^{\min }_{\lambda ,\lambda '}}{f^{\mathrm {Re}\,}_{\mathrm {RLD}}f_{\lambda }^2f_{-\lambda }^2}, \end{aligned}$$
(94)
$$\begin{aligned} F^{\min }_{\lambda ,\lambda '}(t)&:=(1- \lambda ^2)^2(1+t^2)-(\lambda ^4+4\lambda ^2-4\lambda \lambda '-1)2t. \end{aligned}$$
(95)

We work out the condition \(F^{\min }_{\lambda ,\lambda '}\ge 0\) to get

$$\begin{aligned} f_{\lambda ,\lambda '}^{\mathrm {Re}\,}\ge f_{\min }\ \Leftrightarrow \ -3\lambda ^2+2\lambda \lambda '+1\ge 0. \end{aligned}$$
(96)

It is straightforward to check that \(\lambda =0\) also satisfies these two conditions. \(\square \)
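A numerical check of Theorem 3: for a pair \((\lambda ,\lambda ')\) satisfying conditions (88) and (89), the sandwich (90) holds on a grid of \(t\), while for a pair violating them, it fails somewhere:

```python
import numpy as np

f = lambda t, lam: (1 + t) / 2 + (lam / 2) * (t - 1)   # f_lambda (assumed form)
def f_re(t, lam, lamp):
    # real part of the (lam, lam')-Petz function, Eq. (86)
    return 2 / (f(t, lamp) / f(t, lam) ** 2 + f(t, -lamp) / f(t, -lam) ** 2)

f_max = lambda t: (1 + t) / 2
f_min = lambda t: 2 * t / (1 + t)
# conditions (88) and (89) of Theorem 3
cond = lambda lam, lamp: (lam ** 4 - 3 * lam ** 2 + 2 * lam * lamp <= 0
                          and -3 * lam ** 2 + 2 * lam * lamp + 1 >= 0)

ts = np.linspace(0.01, 100, 5000)
inside, outside = (0.5, 0.2), (0.9, -1.0)   # one pair per region
assert cond(*inside) and not cond(*outside)
assert np.all(f_max(ts) + 1e-12 >= f_re(ts, *inside))
assert np.all(f_re(ts, *inside) >= f_min(ts) - 1e-12)
# outside the region the sandwich breaks somewhere:
assert np.any(f_re(ts, *outside) < f_min(ts) - 1e-6)
print("Theorem 3 region check passed")
```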

3.4 Properties

We introduce another \(n\times n\) matrix by

$$\begin{aligned} {Z}^{\nu ,\nu '}&:= \big [ {z}^{ij}_{\nu ,\nu '}\big ],\nonumber \\ {z}^{ij}_{\nu ,\nu '}&:= \langle L^{\nu ,i},L^{\nu ,j}\rangle _{\rho }^{\nu '}. \end{aligned}$$
(97)

Its relation to the metric tensor defined by the matrix (71) is

$$\begin{aligned} {Z}^{\nu ,\nu '} = \big (G^{\nu }\big )^{-1} G^{\nu ,\nu '}\big (G^{\nu }\big )^{-1}. \end{aligned}$$
(98)

In our discussion, we call \({Z}^{\nu ,\nu '}\) the \((\nu ,\nu ')\)-Fisher dual metric.

The next lemma shows a matrix ordering relation between the \(\nu \)LD Fisher metric and the \((\nu ,\nu ')\)-Fisher dual metric.

Lemma 6

The matrix inequality,

$$\begin{aligned} {Z}^{\nu ,\nu '}\ge \big (G^{\nu '}\big )^{-1}, \end{aligned}$$
(99)

holds for an arbitrary pair of measures \(\nu ,\nu '\). The necessary and sufficient condition for equality is \(\forall i\), \(L^{\nu ,i}-L^{\nu ',i}=0\).

Due to the relation (98), this is equivalent to

$$\begin{aligned} G^{\nu ,\nu '}\ge G^{\nu }\big (G^{\nu '}\big )^{-1}G^{\nu }. \end{aligned}$$
(100)

Proof

Let \(m_{\nu ;\nu '}^i:=L^{\nu ,i}-L^{\nu ',i}\) and define an \(n\times n\) Hermitian matrix,

$$\begin{aligned} M_{\nu ;\nu '}:= \big [\langle m_{\nu ;\nu '}^i,m_{\nu ;\nu '}^j\rangle _{\rho }^{\nu '}\big ]. \end{aligned}$$
(101)

The matrix \(M_{\nu ,\nu '}\) is then positive semi-definite. Using Lemma 4, we can also express matrix elements of \(M_{\nu ,\nu '}\) as

$$\begin{aligned} \langle m_{\nu ;\nu '}^i,m_{\nu ;\nu '}^j\rangle _{\rho }^{\nu '}&= \langle m_{\nu ;\nu '}^i,L^{\nu ,j}\rangle _{\rho }^{\nu '}-\langle m_{\nu ;\nu '}^i,L^{\nu ',j}\rangle _{\rho }^{\nu '}\\&=\langle m_{\nu ;\nu '}^i,L^{\nu ,j}\rangle _{\rho }^{\nu '}\\&=\langle L^{\nu ,i},L^{\nu ,j}\rangle _{\rho }^{\nu '}-\langle L^{\nu ',i},L^{\nu ,j}\rangle _{\rho }^{\nu '}\\&={z}_{\nu ,\nu '}^{ij}-\sum _{k=1}^n (g^{\nu ',ki})^* \langle L^{\nu }_k,L^{\nu ',j}\rangle _{\rho }^{\nu }\\&={z}_{\nu ,\nu '}^{ij}-\sum _{k=1}^n g^{\nu ',ik} \langle L^{\nu '}_k,L^{\nu ',j}\rangle ^{\nu '}_{\rho }\\&={z}_{\nu ,\nu '}^{ij}-g^{\nu ',ij} . \end{aligned}$$

Therefore, the matrix inequality \([M_{\nu ,\nu '}]={Z}^{\nu ,\nu '}-(G^{\nu '})^{-1}\ge 0\) follows. Equality holds if and only if the matrix \([M_{\nu ,\nu '}]\) is zero, which is equivalent to \(m_{\nu ;\nu '}^i=L^{\nu ,i}-L^{\nu ',i}=0\) for all \(i=1,2,\dots ,n\). \(\square \)
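Lemma 6 can be illustrated numerically on a qubit model. The sketch below assumes the \(\lambda \)LD convention \(\partial _i\rho =\frac{1}{2}[(1+\lambda )\rho L^\lambda _i+(1-\lambda )L^\lambda _i\rho ]\) and constructs the dual operators as \(L^{\lambda ,i}=\sum _k (G^\lambda )^{-1}_{ki}L^\lambda _k\):

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], complex)
sy = np.array([[0, -1j], [1j, 0]], complex)
sz = np.array([[1, 0], [0, -1]], complex)
rho = 0.5 * (np.eye(2) + 0.3 * sx - 0.2 * sy + 0.4 * sz)
drho = [0.5 * sx, 0.5 * sy, 0.5 * sz]
p, U = np.linalg.eigh(rho)

def lld(dr, lam):
    # assumed lambda-LD convention: d rho = ((1+lam) rho L + (1-lam) L rho)/2
    dre = U.conj().T @ dr @ U
    return U @ (2 * dre / ((1 + lam) * p[:, None] + (1 - lam) * p[None, :])) @ U.conj().T

def ip(X, Y, lam):
    return ((1 - lam) / 2) * np.trace(rho @ X.conj().T @ Y) \
         + ((1 + lam) / 2) * np.trace(X.conj().T @ rho @ Y)

def gram(Ls, lam):
    return np.array([[ip(Li, Lj, lam) for Lj in Ls] for Li in Ls])

for lam, lamp in [(0.0, 1.0), (0.7, -0.3), (-1.0, 0.4)]:
    L = [lld(dr, lam) for dr in drho]
    Ginv = np.linalg.inv(gram(L, lam))
    # dual operators L^{lam,i} = sum_k (G^{-1})_{ki} L^{lam}_k
    Ld = [sum(Ginv[k, i] * L[k] for k in range(3)) for i in range(3)]
    Z = gram(Ld, lamp)
    Gp = gram([lld(dr, lamp) for dr in drho], lamp)
    # Lemma 6: Z^{lam,lam'} - (G^{lam'})^{-1} is positive semi-definite
    assert np.linalg.eigvalsh(Z - np.linalg.inv(Gp)).min() > -1e-10
print("Lemma 6 matrix inequality verified on a qubit model")
```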

We can also derive another matrix ordering relation by the same logic as Lemma 6. Its proof starts from \(m_{\nu ;\nu '}^i:=L^\nu _i-L^{\nu '}_i\) and is omitted.

Lemma 7

The matrix inequality,

$$\begin{aligned} {G}^{\nu ,\nu '}\ge 2G^{\nu }-G^{\nu '}, \end{aligned}$$
(102)

holds for an arbitrary pair of measures \(\nu ,\nu '\). The necessary and sufficient condition for equality is \(\forall i\), \(L^{\nu }_{i}-L^{\nu '}_{i}=0\).

The next lemma relates the \(\lambda \)LD Fisher metric to the \((\lambda ,\lambda ')\)-Fisher metric through the commutation operator.

Lemma 8

For any \(\lambda ,\lambda '\in [-1,1]\), the following relations hold.

$$\begin{aligned} g^{\lambda }_{ij}&=g^{\lambda ,\lambda '}_{ij}+\mathrm {i}(\lambda -\lambda ') \langle L^{\lambda }_{i},\mathcal{D}_{\rho }(L^{\lambda }_{j})\rangle _{\rho }^{S}, \end{aligned}$$
(103)
$$\begin{aligned} g^{\lambda ,ij}&={z}_{\lambda ,\lambda '}^{ij}+\mathrm {i}(\lambda -\lambda ') \langle L^{\lambda ,i},\mathcal{D}_{\rho }(L^{\lambda ,j})\rangle _{\rho }^{S}. \end{aligned}$$
(104)

Proof

Set \(X=L^{\lambda }_{i}\) and \(Y=L^{\lambda }_{j}\) in Lemma 5 to get the first relation. The second relation follows from Lemma 5 with \(X=L^{\lambda ,i}\) and \(Y=L^{\lambda ,j}\). \(\square \)

As a corollary of Lemma 5 and this lemma, we have several relations. We list three of them as examples.

Corollary 3

$$\begin{aligned} g^{S,ij}&={z}_{0,\lambda '}^{ij}-\mathrm {i}\lambda '\langle L^{S,i},\mathcal{D}_{\rho }(L^{S,j})\rangle _{\rho }^{S}, \end{aligned}$$
(105)
$$\begin{aligned} g^{\lambda ,ij}&={z}_{\lambda ,0}^{ij}+\mathrm {i}\lambda \langle L^{\lambda ,i},\mathcal{D}_{\rho }(L^{\lambda ,j})\rangle _{\rho }^{S}, \end{aligned}$$
(106)
$$\begin{aligned} g^{\lambda ,ij}&=g^{S,ij}+\mathrm {i}\lambda \langle L^{\lambda ,i},\mathcal{D}_{\rho }(L^{S,j})\rangle _{\rho }^{S}. \end{aligned}$$
(107)

Proof

The first relation is obtained by setting \(\lambda =0\) in Eq. (104). The second one corresponds to \(\lambda '=0\) in Eq. (104). The last equation follows from Lemma 5 with the choice \(\lambda '=0\) and \(X=L^{\lambda ,i},Y=L^{S,j}\), and the use of Eq. (31). \(\square \)

4 Application to quantum parameter estimation

4.1 Quantum Cramér–Rao inequality

We now consider the quantum state estimation problem: Given a parametric model \(\mathcal{M}=\{\rho _\theta |\theta \in \varTheta \}\), we perform a measurement described by a positive operator-valued measure \(\varPi =\{\varPi _x\}_{x\in \mathcal{X}}\). Based on the measurement outcome \(x\in \mathcal{X}\), we form an estimate via an estimator \(\hat{\theta }:\mathcal{X}\rightarrow \varTheta \). The objective here is to minimize the mean square error matrix:

$$\begin{aligned} V_\theta [\varPi ,\hat{\theta }]&:= \big [v_{\theta }^{ij}[\varPi ,\hat{\theta }]\big ],\nonumber \\ v_{\theta }^{ij}[\varPi ,\hat{\theta }]&:= \sum _{x\in \mathcal{X}} (\hat{\theta }^i(x)-\theta ^i)(\hat{\theta }^j(x)-\theta ^j) \mathrm {tr}\left( \rho _\theta \varPi _x\right) . \end{aligned}$$
(108)

As in classical statistics, the \(\lambda \)LD Fisher metric sets an estimation error bound, known as the quantum Cramér–Rao inequality, as follows:

Theorem 4

For any (locally) unbiased estimator \((\varPi ,\hat{\theta })\), the matrix inequality holds for each \(\theta \in \varTheta \) and every choice of the \(\lambda \)LD Fisher information matrix:

$$\begin{aligned} V_\theta [\varPi ,\hat{\theta }] \ge \big (G^\lambda _\theta \big )^{-1}, \end{aligned}$$
(109)

where \(G^\lambda _\theta =G^\lambda (\rho _\theta )\).

We remark that, unlike the classical case, this matrix inequality cannot be saturated in general, even asymptotically.

4.2 Holevo, SLD, and RLD bounds

We first define the Holevo bound, which is a bound for the weighted trace of the mean square error matrix. In the traditional setting, we aim to minimize the weighted trace of the mean square error matrix, \(\mathrm {Tr}\left\{ W V_\theta [\varPi ,\hat{\theta }] \right\} \). Here, W (an \(n\times n\) positive-definite matrix) is called a weight matrix and can be chosen arbitrarily. The trace on the \(n\times n\) real matrix space is denoted by \(\mathrm {Tr}\left\{ \cdot \right\} \) to distinguish it from the trace on the Hilbert space, \(\mathrm {tr}\left( \cdot \right) \).

Definition 3

Given a weight matrix \(W>0\), the Holevo bound at \(\rho _\theta \) is defined by the following optimization:

$$\begin{aligned} \quad C_\theta ^H[W]:=\!\min _{\mathbf {X}\in \mathcal{X}_\theta } \mathrm {Tr}\left\{ W \mathrm {Re}\,Z_\theta [\mathbf {X}]\right\} \!+\!\mathrm {Tr}\left\{ |W^{\frac{1}{2}} \mathrm {Im}\,Z_\theta [\mathbf {X}] W^{\frac{1}{2}}|\right\} , \end{aligned}$$
(110)

where the set \(\mathcal{X}_\theta \) is defined by

$$\begin{aligned} \mathcal{X}_\theta :=\{\mathbf {X}\in \mathcal{L}_h(\mathcal{H})^n\,|\,\forall i,j,\,\partial _i\mathrm {tr}\left( \rho _\theta X^j\right) =\delta ^j_{\,i} \}, \end{aligned}$$

and the \(n\times n\) matrix is

$$\begin{aligned} Z_\theta [\mathbf {X}]:=\big [ \langle X^i,X^j\rangle ^R_{\rho _\theta } \big ] . \end{aligned}$$
(111)

Note that since the above minimization is attained by \(\mathbf {X}\) satisfying \(\mathrm {tr}\left( \rho _\theta X^i\right) =0 \), the optimal choice of \(\mathbf {X}\) is locally unbiased at \(\theta \).

By the quantum Cramér–Rao inequality (109), we can follow the same argument to derive the quantum Cramér–Rao bound [10]. We only report the result.

Theorem 5

The weighted trace of the MSE matrix for any locally unbiased estimator satisfies

$$\begin{aligned}&\displaystyle \mathrm {Tr}\left\{ W V_\theta [\varPi ,\hat{\theta }] \right\} \ge C_\theta ^\lambda [W], \end{aligned}$$
(112)
$$\begin{aligned}&\displaystyle C_\theta ^\lambda [W] :=\mathrm {Tr}\left\{ W\mathrm {Re}\,\big (G_\theta ^\lambda \big )^{-1}\right\} +\mathrm {Tr}\left\{ |W^{\frac{1}{2}} {\mathrm {Im}\,} \big (G_\theta ^\lambda \big )^{-1} W^{\frac{1}{2}} |\right\} , \end{aligned}$$
(113)

for every choice of \(\lambda \in [-1,1]\).

Note that the familiar Cramér–Rao bounds, the SLD (\(\lambda =0\)) and RLD (\(\lambda =1\)) bounds, are included as special cases:

$$\begin{aligned} C_\theta ^S[W]&:=\mathrm {Tr}\left\{ W\big (G_\theta ^S\big )^{-1}\right\} ,\\ C_\theta ^R[W]&:=\mathrm {Tr}\left\{ W\mathrm {Re}\,\big (G_\theta ^R\big )^{-1}\right\} +\mathrm {Tr}\left\{ |W^{\frac{1}{2}} {\mathrm {Im}\,} \big (G_\theta ^R\big )^{-1} W^{\frac{1}{2}} |\right\} . \end{aligned}$$

Since the relation \(C_\theta ^H[W]\ge C_\theta ^\lambda [W]\) holds for all \(\lambda \in [-1,1]\), we have \(C_\theta ^H[W]\ge \max _{\lambda \in [-1,1]}C_\theta ^\lambda [W]\). However, there is no ordering among the \(C_\theta ^\lambda [W]\) in general.

4.3 Special models

In general, the Holevo bound cannot be expressed as a simple closed formula. In Refs. [32, 33], we gave several equivalent characterizations of quantum parametric models for which the Holevo bound coincides with the special bounds explained below. In the rest of the discussion, we consider the global aspect of parametric models. In other words, all conditions and statements below are for all points \(\rho \in \mathcal{M}\) (or \(\theta \in \varTheta \)).

4.3.1 D-invariant model [10]

Holevo introduced the notion of the D-invariant model.

Definition 4

When the SLD tangent space is invariant under the action of the commutation operator \(\mathcal{D}_\rho \), a model is said to be locally D-invariant at \(\rho \).

We say \(\mathcal{M}\) is D-invariant, when a model is D-invariant at all points \(\rho \in \mathcal{M}\).

It is known that if the model is D-invariant, then the RLD Cramér–Rao bound can be attained. In Ref. [32], the converse statement was proven.

Theorem 6

A model \(\mathcal{M}\) is D-invariant if and only if the Holevo bound is equivalent to the RLD Cramér–Rao bound.

Therefore, the definition of D-invariance is equivalent to achievability of the RLD Cramér–Rao bound. In Refs. [32, 33], several equivalent characterizations of the D-invariant model were derived.

4.3.2 Asymptotically classical model [32, 33, 40]

Motivated by the equivalence proven in Theorem 6, we define the following class of models.

Definition 5

When the Holevo bound is equivalent to the SLD Cramér–Rao bound for all choices of weight matrices, the model is said to be asymptotically classical.

In Refs. [33, 40], the following theorem was shown.

Theorem 7

A model \(\mathcal{M}\) is asymptotically classical if and only if the state-weighted traces of the commutators of all pairs of SLD operators vanish.

Mathematically, this is expressed as \(\mathrm {tr}\left( \rho _\theta [L^S_i(\theta ),\,L^S_j(\theta )]\right) =0\) for all i, j. We call this the weak commutativity condition. (The authors of Ref. [40] called it the compatibility condition.) In Ref. [33], several equivalent characterizations of the asymptotically classical model were derived. In particular, the condition

$$\begin{aligned} \mathcal{M} \hbox { is asymptotically classical } \Leftrightarrow \ \big (G^S\big )^{-1}=Z^{0,1}, \end{aligned}$$

is important in our discussion.

4.3.3 Classical model

At each point \(\theta \in \varTheta \), a d-dimensional quantum state \(\rho _\theta \) can be diagonalized with a unitary \(U_\theta \) as \(\rho _\theta =U_\theta \Lambda _\theta U_\theta ^{-1}\), where the diagonal matrix,

$$\begin{aligned} \Lambda _\theta =\mathrm {diag}(p_\theta (1),p_\theta (2),\dots ,p_\theta (d)), \end{aligned}$$
(114)

lists the eigenvalues of the state \(\rho _\theta \). By definition, \(\forall i,\,p_\theta (i)>0\) and \(\sum _{i=1}^dp_\theta (i)=1\). In other words, \(\Lambda _\theta \) can be regarded as an element of the probability simplex \(\mathcal{P}(d)\). (The set of all positive probability distributions on the set \(\{1,2,\dots ,d\}\).) When the unitary \(U_\theta \) is independent of \(\theta \) for all points in \(\varTheta \), it is clear that any statistical problem reduces to a classical one. With this identification, we have the following definition.

Definition 6

A model \(\mathcal{M}\) is said to be classical if the family of quantum states \(\rho _\theta \) can be diagonalized with a \(\theta \)-independent unitary U as

$$\begin{aligned} \rho _\theta =U \Lambda _\theta U^{-1}, \end{aligned}$$
(115)

for all parameters \(\theta \in \varTheta \). In other words, \(\mathcal{M}\) is said to be classical if there exists a one-to-one mapping from each element \(\rho _\theta \in \mathcal{M}\) to \(p_\theta \in \mathcal{P}(d)\).

The classical model reduces to the familiar dually flat manifold, which is well studied in the classical theory of information geometry. The unique natural Riemannian metric on it is the Fisher metric.
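For a classical model, all the \(\lambda \)LD structures collapse to their classical counterparts. The following sketch (with the same assumed \(\lambda \)LD convention \(\partial _i\rho =\frac{1}{2}[(1+\lambda )\rho L^\lambda _i+(1-\lambda )L^\lambda _i\rho ]\) as before) checks that, for a diagonal model, the \(\lambda \)LD operators and Fisher metrics are \(\lambda \)-independent and reproduce the classical Fisher information:

```python
import numpy as np

# classical model: rho_theta = diag(t1, t2, 1-t1-t2)  (U independent of theta)
t1, t2 = 0.2, 0.5
rho = np.diag([t1, t2, 1 - t1 - t2]).astype(complex)
drho = [np.diag([1, 0, -1]).astype(complex), np.diag([0, 1, -1]).astype(complex)]
p = np.diag(rho).real

def lld(dr, lam):
    # lambda-LD operator in the (theta-independent) eigenbasis
    return 2 * dr / ((1 + lam) * p[:, None] + (1 - lam) * p[None, :])

def ip(X, Y, lam):
    return ((1 - lam) / 2) * np.trace(rho @ X.conj().T @ Y) \
         + ((1 + lam) / 2) * np.trace(X.conj().T @ rho @ Y)

# classical Fisher information matrix of the simplex model
probs = np.array([t1, t2, 1 - t1 - t2])
dprobs = np.array([[1, 0, -1], [0, 1, -1]], float)
F = dprobs @ np.diag(1 / probs) @ dprobs.T

for lam in (-1.0, -0.3, 0.0, 0.8, 1.0):
    L = [lld(dr, lam) for dr in drho]
    G = np.array([[ip(Li, Lj, lam) for Lj in L] for Li in L]).real
    assert np.allclose(L, [lld(dr, 0.0) for dr in drho])   # L^lam = L^S
    assert np.allclose(G, F)                               # G^lam = classical Fisher
print("classical model: all lambda-LD structures collapse to the Fisher metric")
```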

4.4 Characterization of models

In this section, we state our results for characterizing quantum parametric models based on the \(\lambda \)LD operators, tangent spaces, and Fisher metrics. As before, we consider the global aspect of statistical models, and thus all conditions refer to all points \(\rho \in \mathcal{M}\) (or \(\forall \theta \in \varTheta \)). In a previous publication [33], the case \(\lambda =1\) was proven. In this paper, we show that it can be extended to any \(\lambda \in [-1,1]\).

4.4.1 D-invariant model

For the D-invariant model, we have the following equivalent characterization:

  1. 1.

    \(\mathcal{M}\) is D-invariant.

  2. 2.

    The \(\lambda \)LD tangent space is the invariant subspace of the \(\mathcal{D}_\rho \) operator for every \(\lambda \).

  3. 3.

    The \(\lambda \)LD dual operators are identical to the SLD dual operators for every \(\lambda \ne 0\).

  4. 4.

    The inverse of the \(\lambda \)LD Fisher metric is identical to the \((0,\lambda )\)-Fisher dual metric for every \(\lambda \ne 0\).

  5. 5.

    The inverse of the SLD Fisher metric is identical to the \((\lambda ,0)\)-Fisher dual metric for every \(\lambda \ne 0\).

Mathematically, condition 2 is \(\forall \lambda ,\ \mathcal{D}_{\rho }(T^\lambda _\rho )\subset T^\lambda _\rho \). Condition 3 is equivalent to \(\forall \lambda ,\forall i,\ L^{\lambda ,i}=L^{S,i}\). Condition 4 is expressed as \(\forall \lambda ,\ (G^\lambda )^{-1}=Z^{0,\lambda }\). Last, condition 5 is \(\forall \lambda ,\ (G^S)^{-1}=Z^{\lambda ,0}\).

In fact, we can relax these conditions as follows: the existence of some \(\lambda \ne 0\) satisfying one of the conditions below already implies D-invariance of the model.

1.:

\(\mathcal{M}\) is D-invariant.

2\(^{\prime }\).:

The \(\lambda \)LD tangent space is the invariant subspace of the \(\mathcal{D}_\rho \) operator for some \(\lambda \).

3\(^{\prime }\).:

The \(\lambda \)LD dual operators are identical to the SLD dual operators for some \(\lambda \ne 0\).

4\(^{\prime }\).:

The inverse of the \(\lambda \)LD Fisher metric is identical to the \((0,\lambda )\)-Fisher dual metric for some \(\lambda \ne 0\).

5\(^{\prime }\).:

The inverse of the SLD Fisher metric is identical to the \((\lambda ,0)\)-Fisher dual metric for some \(\lambda \ne 0\).

4.4.2 Asymptotically classical model

For the asymptotically classical model, we have the following equivalence:

  1. 1.

    \(\mathcal{M}\) is asymptotically classical.

  2. 2.

    The image of the SLD tangent space by the \(\mathcal{D}_\rho \) operator is orthogonal to the \(\lambda \)LD tangent space with respect to the \(\lambda \)LD inner product for every \(\lambda \).

  3. 3.

    The image of the \(\lambda \)LD tangent space by the \(\mathcal{D}_\rho \) operator is orthogonal to the SLD tangent space with respect to the \(\lambda \)LD inner product for every \(\lambda \).

  4. 4.

    The SLD Fisher metric is identical to the \((0,\lambda )\)-Fisher metric for every \(\lambda \ne 0\).

Condition 2 demands that \(\langle \mathcal{D}_{\rho }(X),Y\rangle _{\rho }^{\lambda }=0\) for all \(X\in T^S_\rho ,Y\in T^\lambda _\rho \) and for every \(\lambda \); equivalently, \(\forall i,j,\ \langle \mathcal{D}_{\rho }(L^S_i),L^\lambda _j\rangle _{\rho }^{\lambda }=0\) for all \(\lambda \). Condition 3 is expressed as \(\forall i,j,\ \langle \mathcal{D}_{\rho }(L^\lambda _i),L^S_j \rangle _{\rho }^{S}=0\). Condition 4 states that \(G^S=G^{0,\lambda } \) holds for every \(\lambda \ne 0\).

Similarly, we have a weaker version of the above characterization.

1.:

\(\mathcal{M}\) is asymptotically classical.

2\(^{\prime }\).:

The image of the SLD tangent space by the \(\mathcal{D}_\rho \) operator is orthogonal to the \(\lambda \)LD tangent space with respect to the \(\lambda \)LD inner product for some \(\lambda \).

3\(^{\prime }\).:

The image of the \(\lambda \)LD tangent space by the \(\mathcal{D}_\rho \) operator is orthogonal to the SLD tangent space with respect to the \(\lambda \)LD inner product for some \(\lambda \).

4\(^{\prime }\).:

The SLD Fisher metric is identical to the \((0,\lambda )\)-Fisher metric for some \(\lambda \ne 0\).

4.4.3 Classical model

For the classical model, we have the following equivalent characterization:

  1. 1.

    \(\mathcal{M}\) is classical.

  2. 2.

    The \(\lambda \)LD tangent space is contained in the kernel of the \(\mathcal{D}_\rho \) operator for every \(\lambda \).

  3. 3.

    The \(\lambda \)LD operators are identical to the SLD operators for every \(\lambda \ne 0\).

  4. 4.

    The \(\lambda \)LD Fisher metric is identical to the SLD Fisher metric for every \(\lambda \ne 0\).

  5. 5.

    The \(\lambda \)LD Fisher metric is identical to the \((\lambda ,0)\)-Fisher metric for every \(\lambda \ne 0\).

  6. 6.

    The dual \((0,\lambda )\)-Fisher metric is identical to the dual \((\lambda ,0)\)-Fisher metric for every \(\lambda \ne 0\).

  7. 7.

    The image of the \(\lambda \)LD tangent space by the \(\mathcal{D}_\rho \) operator is orthogonal to the \(\lambda \)LD tangent space with respect to the \(\lambda \)LD inner product for every \(\lambda \ne 0\).

  8. 8.

    The image of the \(\lambda \)LD tangent space by the \(\mathcal{D}_\rho \) operator is orthogonal to the \(\lambda \)LD tangent space with respect to the SLD inner product for every \(\lambda \ne 0\).

  9. 9.

    \(\mathcal{M}\) is D-invariant and asymptotically classical.

We can express condition 2 by \(\forall i,\ \mathcal{D}_{\rho }(L^\lambda _i)=0\). Conditions 3 and 4 are equivalent to \(\forall i,\ L^\lambda _i=L^S_i\) and \(G^\lambda =G^S\), respectively. Conditions 5 and 6 are expressed as \(G^\lambda =G^{\lambda ,0}\) and \(Z^{0,\lambda }=Z^{\lambda ,0}\). Condition 7 is \(\langle L^\lambda _i,\mathcal{D}_{\rho }(L^\lambda _j)\rangle _{\rho }^{\lambda }=0\) for all ij and for every \(\lambda \ne 0\), whereas condition 8 is \(\langle L^\lambda _i,\mathcal{D}_{\rho }(L^\lambda _j)\rangle _{\rho }^{S}=0\) for all ij and for every \(\lambda \ne 0\).

Just as in the D-invariant case and the asymptotically classical case, conditions 2–8 can be relaxed to require only the existence of some \(\lambda \). This extension is omitted for want of space.
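To make conditions 3 and 4 concrete, the following sketch checks numerically that, on a commuting (diagonal) qubit family, the \(\lambda \)LD operator coincides with the SLD for every \(\lambda \). The solver `lambda_ld` and the defining equation \(\partial \rho =\tfrac{1+\lambda }{2}\rho L+\tfrac{1-\lambda }{2}L\rho \) are our illustrative assumptions (\(\lambda =1\) giving the RLD and \(\lambda =-1\) the LLD), not necessarily the normalization fixed earlier in the paper.

```python
import numpy as np

def lambda_ld(rho, drho, lam):
    """Solve drho = (1+lam)/2 * rho @ L + (1-lam)/2 * L @ rho for L
    (an assumed lambda-LD convention: lam=1 -> RLD, lam=-1 -> LLD, lam=0 -> SLD)."""
    d = rho.shape[0]
    I = np.eye(d)
    a, b = (1 + lam) / 2, (1 - lam) / 2
    # column-stacking identity: vec(rho L) = (I (x) rho) vec(L),
    #                           vec(L rho) = (rho^T (x) I) vec(L)
    M = a * np.kron(I, rho) + b * np.kron(rho.T, I)
    L = np.linalg.solve(M, drho.reshape(-1, order='F')).reshape(d, d, order='F')
    return L

theta = 0.3
rho = np.diag([theta, 1 - theta])       # a classical (commuting) qubit model
drho = np.diag([1.0, -1.0])             # d rho / d theta

L_sld = lambda_ld(rho, drho, 0.0)       # lam = 0 reproduces the SLD here
for lam in (-0.8, 0.4, 1.0):
    # on a classical model, the lambda-LD is independent of lambda
    assert np.allclose(lambda_ld(rho, drho, lam), L_sld)
```

For this diagonal family the solution is \(L=\mathrm{diag}(\dot{p}_i/p_i)\) for every \(\lambda \), which is exactly the classical logarithmic derivative.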

4.5 Proofs

In this subsection, we give proofs for the results presented in the previous subsection.

4.5.1 D-invariant model

We will prove the statement in the following three steps.

$$\begin{aligned}&1\Leftrightarrow 2\Rightarrow 3\Leftrightarrow 4\Leftrightarrow 5, \end{aligned}$$
(116)
$$\begin{aligned}&2\Rightarrow 2^{\prime }\Rightarrow 3^{\prime }\Leftrightarrow 4^{\prime }\Leftrightarrow 5^{\prime }, \end{aligned}$$
(117)
$$\begin{aligned}&3^{\prime }\Rightarrow 1. \end{aligned}$$
(118)

First, let us rewrite the D-invariance condition 1 as follows [32]. Consider a D-invariant model \(\mathcal{M}\) and let \(T_\rho ^S(\mathcal{M})=\mathrm {span}_{{\mathbb R}}\{L_i^S\}\) be the SLD tangent space at \(\rho \). By definition, \(\mathcal{D}_\rho (X)\in T_\rho ^S(\mathcal{M})\) holds for all \(X\in T_\rho ^S(\mathcal{M})\) if and only if the canonical projection of \(\mathcal{D}_\rho (X)\) onto the SLD tangent space is equal to \(\mathcal{D}_\rho (X)\) itself for all \(X\in T_\rho ^S(\mathcal{M})\). For \(\lambda \in [-1,1]\), let \(P_{T^\lambda }\) be the canonical projection onto the \(\lambda \)LD tangent space \(T_\rho ^\lambda (\mathcal{M})\) with respect to the \(\lambda \)LD inner product \(\langle \cdot ,\cdot \rangle _\rho ^\lambda \). By definition, for every operator \(X\in \mathcal{L}(\mathcal{H})\), we have

$$\begin{aligned} P_{T^\lambda }(X)=\sum _{i=1}^n \langle X,L^{\lambda ,i}\rangle _\rho ^\lambda L_i^\lambda =\sum _{i=1}^n \langle X,L^\lambda _i\rangle _\rho ^\lambda L^{\lambda ,i}. \end{aligned}$$

Therefore, D-invariance of the model is equivalent to

$$\begin{aligned} \mathcal{D}_\rho (L^S_i) =\sum _{j=1}^n \langle \mathcal{D}_\rho (L^S_i),L^S_j\rangle _{\rho }^{S} L^{S,j} \ \text{ for } \forall i. \end{aligned}$$

From Corollary 3, we can rewrite it as

$$\begin{aligned} \text{ Condition } \text{1 } \Leftrightarrow \ \mathcal{D}_\rho (L^S_i)&=-\mathrm {i}\sum _{j=1}^n( g^{0,1}_{ji}-g^S_{ji})L^{S,j} =\sum _{j=1}^n\mathrm {Im}\,\left( g^{0,1}_{ji}\right) L^{S,j} \end{aligned}$$
(119)
$$\begin{aligned}&=\sum _{j,k=1}^n\mathrm {Im}\,\left( g^{0,1}_{ji}\right) g^{S,kj} L^S_k. \end{aligned}$$
(120)
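The canonical projection \(P_{T^\lambda }\) used above can be realized numerically through the inverse Gram matrix of the basis operators. The sketch below is a minimal stand-in that uses the symmetrized inner product \(\mathrm{Re}\,\mathrm{tr}(\rho YX)\) in place of the general \(\lambda \)LD inner product (whose exact form is fixed earlier in the paper); the names `project` and `sym_inner` and the qubit example are hypothetical.

```python
import numpy as np

def sym_inner(rho, X, Y):
    """Symmetrized inner product <X,Y> = Re tr(rho Y X); a stand-in
    for the lambda-LD inner product used in the text."""
    return np.real(np.trace(rho @ Y @ X))

def project(rho, basis, X):
    """Canonical projection of X onto span{basis} w.r.t. the inner product,
    via the inverse Gram matrix (the dual-basis formula in the text)."""
    G = np.array([[sym_inner(rho, Li, Lj) for Lj in basis] for Li in basis])
    v = np.array([sym_inner(rho, Li, X) for Li in basis])
    c = np.linalg.solve(G, v)          # coefficients in the (primal) basis
    return sum(ci * Li for ci, Li in zip(c, basis))

# a full-rank qubit state and two basis operators (Pauli X, Z)
rho = np.array([[0.7, 0.1], [0.1, 0.3]])
sx = np.array([[0, 1], [1, 0]], dtype=float)
sz = np.array([[1, 0], [0, -1]], dtype=float)
basis = [sx, sz]

P = lambda X: project(rho, basis, X)
Y = np.array([[0.2, 0.5], [0.5, -0.4]])
assert np.allclose(P(P(Y)), P(Y))                    # idempotent
assert np.allclose(P(sx), sx) and np.allclose(P(sz), sz)  # identity on the span
```

Replacing `sym_inner` by any positive definite inner product gives the corresponding canonical projection in the same way.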

Proof of (116): From condition 1 \(\Leftrightarrow \) (120), we can easily show that the \(\lambda \)LD tangent space is also an invariant subspace under \(\mathcal{D}_\rho \). In particular, we obtain

$$\begin{aligned} \mathcal{D}_\rho (L^\lambda _i) =\sum _{j,k=1}^n\mathrm {Im}\,\left( g^{0,1}_{ji}\right) g^{S,kj}L^{\lambda }_k. \end{aligned}$$
(121)

This is because of the relation (34), \(L_{i}^S=(I+\mathcal{D}_\rho )(L_{i}^\lambda )\), together with the fact that \(\mathcal{D}_\rho \) commutes with \((I+\mathcal{D}_\rho )^{-1}\). Thus, \(1\Rightarrow 2\) holds. Conversely, when condition 2 holds, we can work out the coefficients explicitly by means of the canonical projection and Corollary 3. Thus, we obtain \(1\Leftrightarrow 2\).

Next, condition 1 \(\Leftrightarrow \) (119) and the use of Corollary 3 [eq. (107)] give

$$\begin{aligned} \big (G^\lambda \big )^{-1}=\big (G^S\big )^{-1}+\mathrm {i}\lambda \mathrm {Im}\,\left( Z^{0,1}\right) \Leftrightarrow \big (G^\lambda \big )^{-1}=Z^{0,\lambda }. \end{aligned}$$

Therefore, we obtain 1 \(\Rightarrow \) 4 \(\Leftrightarrow \) 3 \(\Leftrightarrow \) 5, where we used Lemma 6.

Proof of (117): 2 \(\Rightarrow \) \(2^{\prime }\) is trivial. \(2^{\prime }\Rightarrow 3^{\prime }\Leftrightarrow 4^{\prime }\Leftrightarrow 5^{\prime }\) can be shown by the same logic as in the proof of (116) above.

Proof of (118): Suppose there exists \(\lambda \ne 0\) such that \(\forall i,\ L^{\lambda ,i}=L^{S,i}\) holds. Then, we can rewrite this by eq. (34) as

$$\begin{aligned} \mathcal {D}_\rho (L^ \lambda _i)=- \mathrm {i}\lambda ^{-1}\sum _{j=1}^n g^S_{ji}\left( L^{\lambda ,j}-L^ \lambda _i \right) . \end{aligned}$$
(122)

This states that \(\mathcal{D}_{\rho }(T^\lambda _\rho )\) is a subspace of \(\tilde{T}^\lambda _\rho \): the complex span of the \(\lambda \)LD tangent space. Furthermore, we can work out these coefficients to show that they are real, and hence, \(\mathcal{D}_{\rho }(T^\lambda _\rho )\subset T^\lambda _\rho \) holds.

4.5.2 Asymptotically classical model

First, we show \(1\Leftrightarrow 2\Leftrightarrow 3\Leftrightarrow 4\). Consider an asymptotically classical model \(\mathcal{M}\). Then, the equivalent expression \(\mathrm {tr}\left( \rho _\theta [L^S_i,\,L^S_j]\right) =0\) in Theorem 7 can be expressed as:

$$\begin{aligned} \langle L^S_i,\mathcal{D}_{\rho }(L^S_j)\rangle _{\rho }^{S}=0. \end{aligned}$$

This shows that the image of the SLD tangent space is orthogonal to the SLD tangent space with respect to the SLD inner product. To prove orthogonality to the \(\lambda \)LD tangent space, we use identity (31) to get

$$\begin{aligned} \langle L^\lambda _i,\mathcal{D}_{\rho }(L^S_j)\rangle _{\rho }^{\lambda }=0\ \forall i,j. \end{aligned}$$

Thus, equivalence to condition 2 is proven. Equivalence to condition 3 is due to Corollary 2. Next, use of Lemma 8 immediately proves equivalence to condition 4. We can check that the choice of \(\lambda \) is arbitrary.
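The first displayed identity rests on the relation \(\langle A,\mathcal{D}_\rho (B)\rangle ^S_\rho =\mathrm{i}\,\mathrm{tr}(\rho [B,A])\), so that the inner product vanishes exactly when the commutator average does. Below is a numerical sanity check of this relation, under the assumed conventions \(\mathrm{i}[\rho ,B]=\frac{1}{2}\{\mathcal{D}_\rho (B),\rho \}\) and \(\langle A,B\rangle ^S_\rho =\frac{1}{2}\mathrm{tr}(\rho \{A,B\})\), which may differ from the paper's normalization; the names `comm_op` and `sld_inner` are illustrative.

```python
import numpy as np

def comm_op(rho, X):
    """Commutation operator under the assumed convention
    i[rho, X] = (D(X) rho + rho D(X)) / 2, solved in the eigenbasis of rho,
    where D_jk = 2i (p_j - p_k)/(p_j + p_k) X_jk."""
    p, U = np.linalg.eigh(rho)
    Xe = U.conj().T @ X @ U
    De = 2j * (p[:, None] - p[None, :]) / (p[:, None] + p[None, :]) * Xe
    return U @ De @ U.conj().T

def sld_inner(rho, A, B):
    """Assumed SLD (symmetrized) inner product <A,B> = tr(rho {A,B})/2."""
    return np.trace(rho @ (B @ A + A @ B)) / 2

# check <A, D(B)>^S = i tr(rho [B, A]) on a qubit with Hermitian A, B
rho = np.diag([0.6, 0.4]).astype(complex)
A = np.array([[0.3, 1 - 2j], [1 + 2j, -0.7]])
B = np.array([[1.1, 0.4 + 1j], [0.4 - 1j, 0.2]])

lhs = sld_inner(rho, A, comm_op(rho, B))
rhs = 1j * np.trace(rho @ (B @ A - A @ B))
assert np.isclose(lhs, rhs)
```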

Next, we show \(1\Leftrightarrow 2^{\prime }\Leftrightarrow 3^{\prime }\Leftrightarrow 4^{\prime }\). The equivalence \(2^{\prime }\Leftrightarrow 3^{\prime }\Leftrightarrow 4^{\prime }\) is trivial, and hence, we only need to prove that condition \(4^{\prime }\) implies asymptotic classicality. Suppose condition \(4^{\prime }\) holds, which is equivalent to

$$\begin{aligned} G^S=G^{0,\lambda }\ \Leftrightarrow \ \mathrm {Im}\,(G^{0,\lambda })=0\ \Leftrightarrow \ \mathrm {Im}\,(G^{0,1})=0, \end{aligned}$$

where \(\mathrm {Re}\,G^{0,\lambda }=G^S\) and \(\mathrm {Im}\,(G^{0,\lambda }) =\lambda \mathrm {Im}\,(G^{0,1})\) are used. This proves the model is asymptotically classical.

4.5.3 Classical model

We will prove the statement in the following four steps.

$$\begin{aligned}&1\Leftrightarrow 2\Leftrightarrow 3, \end{aligned}$$
(123)
$$\begin{aligned}&1\Leftrightarrow 4\Leftrightarrow 7, \end{aligned}$$
(124)
$$\begin{aligned}&1\Leftrightarrow 5\Leftrightarrow 8, \end{aligned}$$
(125)
$$\begin{aligned}&1\Leftrightarrow 6. \end{aligned}$$
(126)

First, we note that condition 1 implies all the other conditions. This is because a classical model does not exhibit any non-commutativity. Thus, only the converse statements need to be proven. The equivalence \(1\Leftrightarrow 9\) was proven in Ref. [33], and it will be used in the proof below. Here, we alternatively show it by noting the following fact. If a model is D-invariant and asymptotically classical, then the SLD tangent space is an invariant subspace of the commutation operator, and its image is orthogonal to the SLD tangent space itself with respect to the SLD inner product. But this is possible only if the image of the SLD tangent space vanishes, that is, the SLD tangent space lies in the kernel of the commutation operator.

Proof of (123): Note that the definition of the classical model is equivalent to the mutual commutativity of the states \(\rho _\theta \) at all parameter values \(\theta \). This is then equivalent to:

$$\begin{aligned} \forall i,\ [\partial _i\rho _{\theta },\,\rho _\theta ]=0. \end{aligned}$$
(127)

Direct calculation and the definition of the \(\lambda \)LD operators give \([L_i^\lambda ,\,\rho _\theta ]=0\) for all \(i\). From the definition of the commutation operator and the fact that \(X\rho +\rho X=0\) implies \(X=0\) whenever \(\rho >0\), we have

$$\begin{aligned} \ker \mathcal{D}_{\rho }=\{X\in \mathcal{L}(\mathcal{H})\,|\, [X,\rho ]=0 \}. \end{aligned}$$

This then proves equivalence to condition 2.

Equivalence to condition 3 is shown as follows. If the model is classical, it is straightforward to show \(L^\lambda _i=L^S_i\) for all i. Conversely, if the \(\lambda \)LD operators are identical to the SLD operators, we obtain \(\forall i,\ \mathcal{D}_{\rho }(L^\lambda _i)=0\) by the identity (34). This then implies condition 2.
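The kernel characterization \(\ker \mathcal{D}_{\rho }=\{X\,|\,[X,\rho ]=0\}\) can be verified numerically. The sketch below solves the (assumed) defining equation \(\mathrm{i}[\rho ,X]=\frac{1}{2}\{\mathcal{D}_\rho (X),\rho \}\) in the eigenbasis of \(\rho \); the function name `commutation_op` and this normalization are illustrative assumptions rather than the paper's definition.

```python
import numpy as np

def commutation_op(rho, X):
    """A commutation operator D_rho(X), using the assumed convention
    i[rho, X] = (D_rho(X) rho + rho D_rho(X)) / 2; in the eigenbasis of rho
    the solution is entrywise D_jk = 2i (p_j - p_k)/(p_j + p_k) X_jk."""
    p, U = np.linalg.eigh(rho)
    Xe = U.conj().T @ X @ U                      # X in the eigenbasis of rho
    num = p[:, None] - p[None, :]
    den = p[:, None] + p[None, :]
    De = 2j * num / den * Xe
    return U @ De @ U.conj().T

rho = np.diag([0.5, 0.3, 0.2])                   # full-rank state
Xc = np.diag([1.0, -2.0, 0.5])                   # commutes with rho
Xn = np.zeros((3, 3)); Xn[0, 1] = Xn[1, 0] = 1.0 # does not commute

assert np.allclose(commutation_op(rho, Xc), 0)   # commuting X lies in the kernel
assert not np.allclose(commutation_op(rho, Xn), 0)
```

Since the denominator \(p_j+p_k\) never vanishes for \(\rho >0\), \(\mathcal{D}_\rho (X)=0\) holds exactly when every off-diagonal block with \(p_j\ne p_k\) of \(X\) vanishes, i.e., when \([X,\rho ]=0\).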

Proof of (124): Condition 4 is equivalent to 7 because of the identity (34):

$$\begin{aligned} \langle \mathcal{D}_{\rho }(L^\lambda _i),L^\lambda _j\rangle _{\rho }^{\lambda }=\mathrm {i}\langle L^S_i-L^\lambda _i,L^\lambda _j\rangle _{\rho }^{\lambda }=0 \Leftrightarrow G^S=G^\lambda . \end{aligned}$$

To prove that condition 4 implies classicality, we rewrite it as \((G^\lambda )^{-1}=(G^S)^{-1}\). This then implies \(\mathrm {Im}\,(Z^{0,\lambda })=0\) (asymptotic classicality) and \((G^\lambda )^{-1}=Z^{0,\lambda }\) (D-invariance). This is equivalent to condition 9.

Proof of (125): The equivalence between conditions 5 and 8 follows from eq. (104) in Lemma 8 with the choice \(\lambda '=0\). Next, suppose condition 5 holds, and apply Lemma 7 with \(\lambda '=0\):

$$\begin{aligned} G^{\lambda ,0}=G^\lambda \ge 2G^\lambda -G^S \Leftrightarrow G^S\ge G^\lambda . \end{aligned}$$
(128)

In particular, we have

$$\begin{aligned} G^S\ge \mathrm {Re}\,(G^\lambda ). \end{aligned}$$
(129)

By Lemma 6 with \(\lambda \rightarrow 0\) and \(\lambda '\rightarrow \lambda \), we have

$$\begin{aligned} Z^{0,\lambda }\ge (G^\lambda )^{-1} \ \Rightarrow \ (G^S)^{-1}\ge \mathrm {Re}\,\left\{ (G^\lambda )^{-1}\right\} . \end{aligned}$$
(130)

Eqs. (129) and (130) imply

$$\begin{aligned} G^S=G^\lambda . \end{aligned}$$

This follows from standard positive-matrix algebra. Therefore, the model is classical.

Proof of (126): Condition 6 is written as:

$$\begin{aligned} Z^{0,\lambda }=Z^{\lambda ,0} . \end{aligned}$$
(131)

Using Lemma 6 with \(\lambda '=0\) thus gives

$$\begin{aligned} Z^{0,\lambda }=Z^{\lambda ,0}\ge (G^S)^{-1}. \end{aligned}$$
(132)

This then implies \(\mathrm {Im}\,(Z^{0,\lambda })=0\); otherwise, the above matrix inequality is violated. Thus, the model is asymptotically classical. With this, we also obtain \(Z^{\lambda ,0}= (G^S)^{-1}\) (D-invariance). Therefore, the model is classical.

4.6 Holevo–Nagaoka type bound

Before closing our discussion, we point out that the \(\lambda \)LD inner product enables us to define a Holevo–Nagaoka-type bound [10, 41] as follows. Due to the page limitation, we only sketch our ideas without proofs.

Let \(\mathcal{X}_\theta \) be the set of collections \(\mathbf {X}=(X^1,\ldots ,X^n)\) of Hermitian operators and, for \(\mathbf {X}\in \mathcal{X}_\theta \), define the \(n\times n\) matrix:

$$\begin{aligned} Z_\theta ^\lambda [\mathbf {X}]:=\big [ \langle X^i,X^j\rangle ^\lambda _{\rho _\theta } \big ] . \end{aligned}$$
(133)

It is easy to show the matrix inequality,

$$\begin{aligned} V_\theta [\varPi ,\hat{\theta }]\ge Z_\theta ^\lambda [\mathbf {X}], \end{aligned}$$
(134)

for all \(\mathbf {X}\in \mathcal{X}_\theta \) and for all locally unbiased estimators. We thus obtain the following bound for the weighted trace of the mean square error matrix:

$$\begin{aligned} C_\theta ^{HN,\lambda }[W]:=\min _{\mathbf {X}\in \mathcal{X}_\theta } \mathrm {Tr}\left\{ W \mathrm {Re}\,Z_\theta ^\lambda [\mathbf {X}]\right\} + \mathrm {Tr}\left\{ |W^{\frac{1}{2}} \mathrm {Im}\,Z_\theta ^\lambda [\mathbf {X}] W^{\frac{1}{2}}|\right\} . \end{aligned}$$
(135)

The derivation of this bound follows exactly the same lines as that of the Holevo bound [41]. (The case \(\lambda =1\) reduces to the Holevo bound.) Because this family of bounds holds for all \(\lambda \), we immediately observe that

$$\begin{aligned} C_\theta ^{HN}[W]:=\max _{\lambda \in [-1,1]} C_\theta ^{HN,\lambda }[W], \end{aligned}$$
(136)

defines the best bound.

We note that the \(\lambda \)LD inner product takes the form of Eq. (26). Then, we have

$$\begin{aligned} Z_\theta ^\lambda [\mathbf {X}]= \mathrm {Re}\,Z_\theta [\mathbf {X}] + \mathrm {i}\lambda \mathrm {Im}\,Z_\theta [\mathbf {X}] , \end{aligned}$$
(137)

and the effect of the \(\lambda \) parameter enters only through \(|\lambda |\) in the second term of the bound (135). Therefore, we conclude

$$\begin{aligned} C_\theta ^{HN}[W]=\max _{\lambda \in [-1,1]} C_\theta ^{HN,\lambda }[W]=C_\theta ^H[W], \end{aligned}$$

where the maximum is attained by \(\lambda =\pm 1\). In other words, the Holevo bound is the best bound for the weighted trace of the mean square error matrix.
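The observation that \(\lambda \) enters the bound (135) only through \(|\lambda |\), so that the maximum over \(\lambda \in [-1,1]\) is attained at \(\lambda =\pm 1\), can be checked numerically for a fixed operator collection (the minimization over \(\mathbf {X}\) in (135) is not performed here). The ordering \(Z_{ij}=\mathrm{tr}(\rho X^jX^i)\) and the name `hn_objective` are our assumptions:

```python
import numpy as np

def hn_objective(rho, Xs, W, lam):
    """Value of the objective in bound (135) for one fixed collection Xs of
    Hermitian operators: Tr{W Re Z^lam} + Tr|W^{1/2} Im Z^lam W^{1/2}|,
    with Z^lam = Re Z + i*lam*Im Z as in (137)."""
    n = len(Xs)
    Z = np.array([[np.trace(rho @ Xs[j] @ Xs[i]) for j in range(n)]
                  for i in range(n)])
    w, U = np.linalg.eigh(W)
    Wh = U @ np.diag(np.sqrt(w)) @ U.T                  # W^{1/2}
    re_term = np.trace(W @ np.real(Z))
    M = Wh @ (lam * np.imag(Z)) @ Wh                    # Im Z^lam = lam * Im Z
    im_term = np.linalg.svd(M, compute_uv=False).sum()  # trace norm Tr|M|
    return re_term + im_term

# qubit example: Pauli X and Y as the operator collection, W = identity
rho = np.array([[0.8, 0.1], [0.1, 0.2]])
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
W = np.eye(2)

vals = [hn_objective(rho, [sx, sy], W, lam) for lam in (-1, -0.5, 0, 0.5, 1)]
assert abs(vals[0] - vals[-1]) < 1e-12 and vals[-1] >= max(vals)
```

Because \(\mathrm {Im}\,Z_\theta ^\lambda =\lambda \,\mathrm {Im}\,Z_\theta \), the trace-norm term scales linearly in \(|\lambda |\), so the \(\lambda =\pm 1\) values coincide and dominate the family.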

Next, we turn our attention to minimizing different scalar quantities of the mean square error matrix. This problem is known as optimal design of experiments in the statistical literature [42,43,44,45,46,47]. When another optimality criterion is adopted, it is not clear whether the above optimality of \(\lambda =\pm 1\) still holds. To illustrate this point, suppose we are interested in minimizing the determinant of the mean square error matrix (D-optimality):

$$\begin{aligned} \min _{\varPi ,\hat{\theta }} \mathrm {Det}\left\{ V_\theta [\varPi ,\hat{\theta }] \right\} . \end{aligned}$$
(138)

In this case, the optimal value of \(\lambda \) for the best bound is again \(\pm 1\). To demonstrate this explicitly, let us consider an arbitrary two-parameter model. For the \(\lambda \)LD inner product, we have

$$\begin{aligned} \mathrm {Det}\left\{ V_\theta [\varPi ,\hat{\theta }] \right\} \ge \mathrm {Det}\left\{ Z_\theta ^\lambda [\mathbf {X}]\right\} . \end{aligned}$$

The \(\lambda \) version of the Holevo–Nagaoka-type bound is

$$\begin{aligned} C^{\mathrm {D},\lambda }_\theta :=\min _{\mathbf {X}\in \mathcal{X}_\theta } \mathrm {Det}\left\{ \mathrm {Re}\,Z_\theta [\mathbf {X}]\right\} +\lambda ^2 \mathrm {Det}\left\{ \mathrm {Im}\,Z_\theta [\mathbf {X}]\right\} . \end{aligned}$$
(139)

Then, we can show that the choice \(\lambda =\pm 1\) gives the best bound, and the corresponding bound is given as

$$\begin{aligned} C^{\mathrm {D}}_\theta :=\min _{\mathbf {X}\in \mathcal{X}_\theta }\mathrm {Det}\left\{ \mathrm {Re}\,Z_\theta [\mathbf {X}]\right\} +\mathrm {Det}\left\{ \mathrm {Im}\,Z_\theta [\mathbf {X}]\right\} . \end{aligned}$$
(140)

If this optimality of \(\lambda =\pm 1\) can be proven in general, it means that the RLD (or LLD) inner product is essential for the quantum parameter estimation problem. This point awaits proof in future work.
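The two-parameter determinant bound behind (139) rests on the matrix fact that any real symmetric \(V\ge Z^\lambda =R+\mathrm{i}\lambda S\) (with \(R\) real symmetric positive definite and \(S\) real antisymmetric, \(2\times 2\)) satisfies \(\mathrm {Det}\,V\ge \mathrm {Det}\,R+\lambda ^2\,\mathrm {Det}\,S\). A randomized numerical check of this fact follows; the construction of \(V\) below is merely one convenient way to guarantee \(V\ge Z^\lambda \), not the general case.

```python
import numpy as np

rng = np.random.default_rng(1)

def random_pair(lam):
    """Random 2x2 Hermitian Z^lam = R + i*lam*S and a random real symmetric V
    built so that V >= Z^lam holds (shift by |lam*s| plus a psd bump)."""
    A = rng.standard_normal((2, 2))
    R = A @ A.T + 0.1 * np.eye(2)                      # symmetric positive definite
    s = rng.standard_normal()
    S = np.array([[0.0, s], [-s, 0.0]])                # antisymmetric
    P = rng.standard_normal((2, 2)); P = P @ P.T       # random psd bump
    V = R + abs(lam * s) * np.eye(2) + P
    return R, S, V

for lam in (-1.0, -0.4, 0.3, 1.0):
    for _ in range(200):
        R, S, V = random_pair(lam)
        # sanity: V - Z^lam is positive semidefinite
        assert np.linalg.eigvalsh(V - (R + 1j * lam * S)).min() > -1e-10
        # the determinant bound behind (139)
        assert np.linalg.det(V) >= np.linalg.det(R) + lam**2 * np.linalg.det(S) - 1e-10
```

Since \(\mathrm {Det}\,S=s^2\ge 0\), the right-hand side is maximized over \(\lambda \in [-1,1]\) at \(\lambda =\pm 1\), consistent with the claim in the text.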

5 Concluding remarks

In this paper, we have studied non-monotone metrics on the quantum state manifold. Our philosophy is to start with the \(\lambda \)LD inner product. To the best of our knowledge, this two-parameter family of metrics has not been studied so far. As the main result, we derived a necessary and sufficient condition for the metric to be monotone. As an application, we have given a characterization of D-invariant, asymptotically classical, and classical statistical models based on the newly proposed \(\lambda \)LD inner product and the quantum Fisher metric derived from it. This latter part can be considered a generalization of the previous work [33], where only the case \(\lambda =1\) was proven. Our results concern only properties of the tangent spaces; properties of affine connections based on our approach are worth further investigation.

The quantum information geometry community has so far restricted itself to the study of monotone metrics, with a few exceptions. We do not know the general geometrical structure of the quantum state space without monotonicity, and hence, this subject is of great interest from a purely mathematical point of view. To add a link to physics, we note that quantum information theory also finds the non-CP-TP map important in the context of a non-Markovian interaction with a correlated environment [48,49,50,51]. In this regard, we expect that quantum information geometry without monotonicity should also play an important role in physics.

As a final remark, we point out that the ideas presented in this paper are quite general. First, we note that our construction of a one-parameter family of inner products is applicable to other inner products: we can start with a convex mixture of any two well-defined inner products. This then defines a new family of quantum Fisher metrics. The second idea is to use a particular set of logarithmic derivative operators together with a different inner product. This ‘mismatched’ combination defines a non-monotone metric in general. Regardless of its practical usefulness, it certainly deserves further study from the viewpoint of geometrical understanding of a quantum state manifold.