1 Introduction

One of the basic principles of quantum theory is the Robertson-Heisenberg uncertainty inequality [4, 7]

$$\begin{aligned} \Delta _\psi (A)\Delta _\psi (B)\ge \frac{1}{4}|{\langle {\psi ,[{A,B}]\psi }\rangle }|^2 \end{aligned}$$
(1.1)

where A, B are self-adjoint operators and \(\psi \) is a vector state on a Hilbert space. The inequality (1.1) is usually applied to position and momentum operators A, B, in which case \(|{\langle {\psi ,[{A,B}]\psi }\rangle }|^2=\hbar ^2\) where \(\hbar \) is the reduced Planck constant. In this situation, A and B are unbounded operators, but for mathematical rigor we shall only deal with bounded operators. However, our results can be extended to the unbounded case by considering a dense subspace common to the domains of A and B. In this paper, we derive a generalization of (1.1). This generalization applies to mixed states and contains an additional covariance term that results in a stronger inequality.

The main result in Section 2 is an uncertainty principle for observable operators. This principle contains four parts: a commutator term, a covariance term, a correlation term and a product of variances term. This last term is sometimes called a product of uncertainties. In Section 2 we also characterize, for faithful states, when the uncertainty inequality is an equality. Section 3 introduces the concept of a real-valued observable. If \(\rho \) is a state and A is a real-valued observable, we define the \(\rho \)-average, \(\rho \)-deviation and \(\rho \)-variance of A. If B is another real-valued observable, we define the \(\rho \)-correlation and \(\rho \)-covariance of A, B. An uncertainty principle for real-valued observables is given in terms of these concepts. An important role is played by the stochastic operator \(\widetilde{A}\) for A. In Section 3 we also define the sharp version of a real-valued observable and characterize when two real-valued observables have the same sharp version.

Section 4 illustrates the theory presented in Section 3 with two examples. The first example considers two arbitrary dichotomic real-valued observables. The second example considers the special case of two noisy spin observables. In this case, the uncertainty inequality becomes very simple. Section 5 discusses real-valued coarse graining of observables.

2 Quantum Uncertainty Principle

For a complex Hilbert space H, we denote the set of bounded linear operators by \(\mathcal {L}(H)\) and the set of bounded self-adjoint operators by \(\mathcal {L}_S(H)\). A positive trace-class operator with trace one is a state and the set of states on H is denoted by \(\mathcal {S}(H)\). A state \(\rho \) is faithful if \(\mathrm {tr\,} (\rho C^*C)=0\) for \(C\in \mathcal {L} (H)\) implies that \(C=0\). For \(\rho \in \mathcal {S} (H)\) and \(C,D\in \mathcal {L} (H)\) we define the sesquilinear form \(\langle {C,D}\rangle _\rho =\mathrm {tr\,} (\rho C^*D)\).

Lemma 2.1

(i) If \(C\in \mathcal {L} (H)\), \(\rho \in \mathcal {S} (H)\), then \(\mathrm {tr\,} (\rho C^*)=\overline{\mathrm {tr\,} (\rho C)}\). (ii) The form \(\langle {\bullet ,\bullet }\rangle _\rho \) is a positive semi-definite inner product. (iii) A state \(\rho \) is faithful if and only if \(\langle {\bullet ,\bullet }\rangle _\rho \) is an inner product

Proof

(i) If D is a trace-class operator and \(\left\{ \phi _i\right\} \) is an orthonormal basis for H, we have

$$\begin{aligned} \mathrm {tr\,} (D^*)=\sum \limits _i\langle {\phi _i,D^*\phi _i}\rangle =\sum \limits _i\overline{\langle {D^*\phi _i,\phi _i}\rangle }=\sum \limits _i\overline{\langle {\phi _i,D\phi _i}\rangle } =\overline{\mathrm {tr\,} (D)} \end{aligned}$$

Hence,

$$\begin{aligned} \mathrm {tr\,} (\rho C^*)=\mathrm {tr\,}\left[ {(C\rho )^*}\right] =\overline{\mathrm {tr\,} (C\rho )}=\overline{\mathrm {tr\,} (\rho C)} \end{aligned}$$

(ii) Applying (i), we have

$$\begin{aligned} \overline{\langle {C,D}\rangle _\rho }=\overline{\mathrm {tr\,} (\rho C^*D)}=\mathrm {tr\,}\left[ {\rho (C^*D)^*}\right] =\mathrm {tr\,} (\rho D^*C)=\langle {D,C}\rangle _\rho \end{aligned}$$

Moreover, since \(C^*C\ge 0\) we have \(\langle {C,C}\rangle _\rho =\mathrm {tr\,} (\rho C^*C)\ge 0\). Hence, \(\langle {\bullet ,\bullet }\rangle _\rho \) is a positive semi-definite inner product. (iii) If \(\langle {\bullet ,\bullet }\rangle _\rho \) is an inner product, then

$$\begin{aligned} \langle {C,C}\rangle _\rho =\mathrm {tr\,} (\rho C^*C)=0 \end{aligned}$$

implies \(C=0\) so \(\rho \) is faithful. Conversely, if \(\rho \) is faithful, then

$$\begin{aligned} \mathrm {tr\,} (\rho C^*C)=\langle {C,C}\rangle _\rho =0 \end{aligned}$$

implies \(C=0\) so \(\langle {\bullet ,\bullet }\rangle _\rho \) is an inner product \(\square \)

For \(A\in \mathcal {L}_S(H)\) and \(\rho \in \mathcal {S} (H)\), the \(\rho \)-average (or \(\rho \)-expectation) of A is \(\langle {A}\rangle _\rho =\mathrm {tr\,} (\rho A)\) and the \(\rho \)-deviation of A is \(D_\rho (A)=A-\langle {A}\rangle _\rho I\) where I is the identity map on H. If \(A,B\in \mathcal {L}_S(H)\), the \(\rho \)-correlation of A, B is

$$\begin{aligned} \textrm{Cor}_\rho (A,B)=\mathrm {tr\,}\left[ {\rho D_\rho (A)D_\rho (B)}\right] \end{aligned}$$

Although \(\textrm{Cor}_\rho (A,B)\) need not be a real number, it is easy to check that \(\overline{\textrm{Cor}_\rho (A,B)}=\textrm{Cor}_\rho (B,A)\). We say that A and B are \(\rho \)-uncorrelated if \(\textrm{Cor}_\rho (A,B)=0\). The \(\rho \)-covariance of A, B is \(\Delta _\rho (A,B)=\mathrm {Re\,}\textrm{Cor}_\rho (A,B)\) and the \(\rho \)-variance of A is

$$\begin{aligned} \Delta _\rho (A)=\Delta _\rho (A,A)=\textrm{Cor}_\rho (A,A)=\mathrm {tr\,}\left[ {\rho D_\rho (A)^2}\right] \end{aligned}$$

It is straightforward to show that

$$\begin{aligned} \textrm{Cor}_\rho (A,B)= & {} \mathrm {tr\,} (\rho AB)-\langle {A}\rangle _\rho \langle {B}\rangle _\rho \end{aligned}$$
(2.1)
$$\begin{aligned} \Delta _\rho (A,B)= & {} \mathrm {Re\,}\mathrm {tr\,} (\rho AB)-\langle {A}\rangle _\rho \langle {B}\rangle _\rho \end{aligned}$$
(2.2)
$$\begin{aligned} \Delta _\rho (A)= & {} \langle {A^2}\rangle _\rho -\langle {A}\rangle _\rho ^2 \end{aligned}$$
(2.3)

We see from (2.1) that A and B are \(\rho \)-uncorrelated if and only if \(\mathrm {tr\,} (\rho AB)=\langle {A}\rangle _\rho \langle {B}\rangle _\rho \). We say that A and B commute if their commutator \([{A,B}]=AB-BA=0\).

Example 1

In the tensor product \(H_1\otimes H_2\) let \(\rho =\rho _1\otimes \rho _2\in \mathcal {S} (H_1\otimes H_2)\) be a product state and let \(A_1\in \mathcal {L}_S(H_1)\), \(A_2\in \mathcal {L}_S(H_2)\). Then \(A=A_1\otimes I_2\), \(B=I_1\otimes A_2\in \mathcal {L}_S(H_1\otimes H_2)\) are \(\rho \)-uncorrelated because

$$\begin{aligned} \mathrm {tr\,} (\rho AB)= & {} \mathrm {tr\,}\left[ {\rho _1\otimes \rho _2(A_1\otimes I_2)(I_1\otimes A_2)}\right] =\mathrm {tr\,}\left[ {\rho _1\otimes \rho _2(A_1\otimes A_2)}\right] \\= & {} \mathrm {tr\,} (\rho _1A_1\otimes \rho _2A_2)=\mathrm {tr\,} (\rho _1A_1)\mathrm {tr\,} (\rho _2A_2)\\= & {} \mathrm {tr\,} (\rho _1\otimes \rho _2A_1\otimes I_2)\mathrm {tr\,} (\rho _1\otimes \rho _2I_1\otimes A_2)=\langle {A}\rangle _\rho \langle {B}\rangle _\rho \end{aligned}$$

This shows that A, B are \(\rho \)-uncorrelated for any product state \(\rho \). Of course, \([{A,B}]=0\) in this case. However, there are examples of noncommuting operators that are uncorrelated. For instance, on \(H=\mathbb {C}^2\) let \(\alpha =\left[ \begin{array}{c} 1\\ 0\end{array}\right] \), \(\phi =\left[ \begin{array}{c} 0\\ 1\end{array}\right] \), \(\psi =\frac{1}{\sqrt{2}}\left[ \begin{array}{c} 1\\ 1\end{array}\right] \). With \(\rho =|{\alpha }\rangle \langle {\alpha }|\), \(A=|{\phi }\rangle \langle {\phi }|\), \(B=|{\psi }\rangle \langle {\psi }|\) we have

$$\begin{aligned} \mathrm {tr\,} (\rho AB)=\langle {A}\rangle _\rho \langle {B}\rangle _\rho =0 \end{aligned}$$

Hence, A, B are \(\rho \)-uncorrelated. However,

$$\begin{aligned} AB&=\langle {\phi ,\psi }\rangle |{\phi }\rangle \langle {\psi }|=\frac{1}{\sqrt{2}}|{\phi }\rangle \langle {\psi }|\\ BA&=\langle {\psi ,\phi }\rangle |{\psi }\rangle \langle {\phi }|=\frac{1}{\sqrt{2}}|{\psi }\rangle \langle {\phi }| \end{aligned}$$

so \([{A,B}]\ne 0\). \(\square \)
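The second half of Example 1 can be confirmed numerically. The following is a minimal sketch in plain Python; the helper names `matmul`, `trace` and `outer` are illustrative choices and do not come from the text. It evaluates (2.1) for the given \(\rho \), A, B and then forms the commutator.

```python
import math

def matmul(X, Y):
    """Product of two square matrices stored as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(X):
    """Sum of the diagonal entries."""
    return sum(X[i][i] for i in range(len(X)))

def outer(u, v):
    """Rank-one operator |u><v| for vectors given as lists."""
    return [[ui * complex(vj).conjugate() for vj in v] for ui in u]

s = 1 / math.sqrt(2)
alpha, phi, psi = [1, 0], [0, 1], [s, s]

rho = outer(alpha, alpha)
A = outer(phi, phi)
B = outer(psi, psi)

# Correlation from equation (2.1): tr(rho A B) - <A><B>
mean_A = trace(matmul(rho, A))
mean_B = trace(matmul(rho, B))
cor = trace(matmul(rho, matmul(A, B))) - mean_A * mean_B

# Commutator AB - BA, entry by entry
AB, BA = matmul(A, B), matmul(B, A)
commutator = [[AB[i][j] - BA[i][j] for j in range(2)] for i in range(2)]

print("Cor =", cor)
print("[A, B] =", commutator)
```

The correlation vanishes while the off-diagonal entries of the commutator do not, which is the point of the example.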

We now present our main result.

Theorem 2.2

If \(A,B\in \mathcal {L}_S(H)\) and \(\rho \in \mathcal {S} (H)\), then (i) \(\frac{1}{4}|{\mathrm {tr\,}\left( {\rho [{A,B}]}\right) }|^2+\left[ {\Delta _\rho (A,B)}\right] ^2=|{\textrm{Cor}_\rho (A,B)}|^2\)

(ii) \(\frac{1}{4}|{\mathrm {tr\,}\left( {\rho [{A,B}]}\right) }|^2+\left[ {\Delta _\rho (A,B)}\right] ^2\le \Delta _\rho (A)\Delta _\rho (B)\)

Proof

(i) Applying Lemma 2.1 we have

$$\begin{aligned} \mathrm {tr\,}\left( {\rho [{A,B}]}\right)= & {} \mathrm {tr\,} (\rho AB)-\mathrm {tr\,} (\rho BA)=\mathrm {tr\,} (\rho AB)-\overline{\mathrm {tr\,}\left[ {\rho (BA)^*}\right] }\nonumber \\= & {} \mathrm {tr\,} (\rho AB)-\overline{\mathrm {tr\,}(\rho A^*B^*)}=\mathrm {tr\,} (\rho AB)-\overline{\mathrm {tr\,} (\rho AB)}\nonumber \\= & {} 2i\,\mathrm {Im\,}\left[ {\mathrm {tr\,} (\rho AB)}\right] \end{aligned}$$
(2.4)

From (2.2) and (2.4) we obtain

$$\begin{aligned} \frac{1}{4}|{\mathrm {tr\,}\left( {\rho [{A,B}]}\right) }|^2+\left[ {\Delta _\rho (A,B)}\right] ^2= & {} \left[ {\mathrm {Im\,}\mathrm {tr\,} (\rho AB)}\right] ^2+\left[ {\mathrm {Re\,}\mathrm {tr\,} (\rho A B)-\langle {A}\rangle _\rho \langle {B}\rangle _\rho }\right] ^2\\= & {} |{\mathrm {Re\,}\mathrm {tr\,} (\rho AB)-\langle {A}\rangle _\rho \langle {B}\rangle _\rho +i\,\mathrm {Im\,}\mathrm {tr\,} (\rho AB)}|^2\\= & {} |{\mathrm {tr\,} (\rho AB)-\langle {A}\rangle _\rho \langle {B}\rangle _\rho }|^2=|{\textrm{Cor}_\rho (A,B)}|^2 \end{aligned}$$

(ii) Applying Lemma 2.1(ii), the form \(\langle {C,D}\rangle _\rho =\mathrm {tr\,}(\rho C^*D)\) is a positive semi-definite inner product. Hence, Schwarz’s inequality holds and we have

$$\begin{aligned} |{\textrm{Cor}_\rho (A,B)}|^2= & {} |{\mathrm {tr\,}\left[ {\rho D_\rho (A)D_\rho (B)}\right] }|^2=|{\langle {D_\rho (A),D_\rho (B)}\rangle _\rho }|^2\\\le & {} \langle {D_\rho (A),D_\rho (A)}\rangle _\rho \langle {D_\rho (B),D_\rho (B)}\rangle _\rho =\mathrm {tr\,}\left[ {\rho D_\rho (A)^2}\right] \mathrm {tr\,}\left[ {\rho D_\rho (B)^2}\right] \\= & {} \Delta _\rho (A)\Delta _\rho (B) \end{aligned}$$

\(\square \)

We call Theorem 2.2(i) the uncertainty equation and Theorem 2.2(ii) the uncertainty inequality. Together, they are called the uncertainty principle. Notice that Theorem 2.2(ii) is a considerable strengthening of the usual Robertson-Heisenberg inequality (1.1) since it contains the term \(\left[ {\Delta _\rho (A,B)}\right] ^2\) and it applies to arbitrary states. Thus, even when \([{A,B}]=0\) we still have an uncertainty relation

$$\begin{aligned} \left[ {\Delta _\rho (A,B)}\right] ^2=|{\mathrm {tr\,}\left[ {\rho D_\rho (A)D_\rho (B)}\right] }|^2\le \Delta _\rho (A)\Delta _\rho (B) \end{aligned}$$
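Theorem 2.2 lends itself to a quick numerical sanity check. The sketch below, in plain Python, computes the four terms of the uncertainty principle from (2.1) and (2.3) for one qubit state and two self-adjoint matrices. The function name `uncertainty_terms` and the particular matrices are assumptions chosen for illustration only.

```python
def matmul(X, Y):
    """Product of two square matrices stored as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(X):
    """Sum of the diagonal entries."""
    return sum(X[i][i] for i in range(len(X)))

def uncertainty_terms(rho, A, B):
    """Return (commutator term, covariance^2, |Cor|^2, variance product)."""
    mean_A = trace(matmul(rho, A)).real
    mean_B = trace(matmul(rho, B)).real
    tr_AB = trace(matmul(rho, matmul(A, B)))
    tr_BA = trace(matmul(rho, matmul(B, A)))
    cor = tr_AB - mean_A * mean_B                                # (2.1)
    var_A = trace(matmul(rho, matmul(A, A))).real - mean_A ** 2  # (2.3)
    var_B = trace(matmul(rho, matmul(B, B))).real - mean_B ** 2
    commutator_term = 0.25 * abs(tr_AB - tr_BA) ** 2
    return commutator_term, cor.real ** 2, abs(cor) ** 2, var_A * var_B

# A mixed qubit state (positive, trace one) and two self-adjoint matrices.
r1, r2, r3 = 0.3, 0.4, 0.5
rho = [[(1 + r3) / 2, (r1 - 1j * r2) / 2],
       [(r1 + 1j * r2) / 2, (1 - r3) / 2]]
A = [[1, 2], [2, -1]]
B = [[0, 1 - 1j], [1 + 1j, 3]]

commutator_term, covariance_sq, correlation_sq, variance_product = \
    uncertainty_terms(rho, A, B)

print("uncertainty equation gap:", commutator_term + covariance_sq - correlation_sq)
print("uncertainty inequality:", correlation_sq, "<=", variance_product)
```

Up to rounding, the first printed number is zero (the uncertainty equation) and the second line exhibits the uncertainty inequality for this particular choice.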

Lemma 2.3

A state \(\rho \) is faithful if and only if the eigenvalues of \(\rho \) are positive.

Proof

Suppose the eigenvalues \(\lambda _i\) of \(\rho \) are positive with corresponding normalized eigenvectors \(\phi _i\). Then we can write \(\rho =\sum \lambda _i|{\phi _i}\rangle \langle {\phi _i}|\) for the orthonormal basis \(\left\{ \phi _i\right\} \). For any \(A\in \mathcal {L} (H)\) we obtain

$$\begin{aligned} \mathrm {tr\,} (\rho A^*A)=\sum \lambda _i\mathrm {tr\,}\left( {|{\phi _i}\rangle \langle {\phi _i}|A^*A}\right) =\sum \lambda _i\langle {A\phi _i,A\phi _i}\rangle =\sum \lambda _i\Vert {A\phi _i}\Vert ^2 \end{aligned}$$

Hence, \(\mathrm {tr\,} (\rho A^*A)=0\) implies \(A\phi _i=0\) for all i. It follows that \(A=0\). Conversely, if 0 is an eigenvalue of \(\rho \) and \(\phi \) is a corresponding unit eigenvector, then setting \(P_\phi =|{\phi }\rangle \langle {\phi }|\) we have

$$\begin{aligned} \mathrm {tr\,} (\rho P_\phi ^*P_\phi )=\mathrm {tr\,} (\rho P_\phi )=\langle {\phi ,\rho \phi }\rangle =0 \end{aligned}$$

But \(P_\phi \ne 0\) so \(\rho \) is not faithful. \(\square \)
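As a small illustration of Lemma 2.3, the sketch below evaluates \(\mathrm {tr\,} (\rho C^*C)\) for a pure state, which has eigenvalue 0, and for a diagonal mixed state with positive eigenvalues. The helper `seminorm_sq` and the two states are hypothetical choices made for this illustration.

```python
def matmul(X, Y):
    """Product of two square matrices stored as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(X):
    """Sum of the diagonal entries."""
    return sum(X[i][i] for i in range(len(X)))

def dagger(X):
    """Conjugate transpose."""
    n = len(X)
    return [[complex(X[j][i]).conjugate() for j in range(n)] for i in range(n)]

def seminorm_sq(rho, C):
    """tr(rho C* C), that is <C, C>_rho."""
    return trace(matmul(rho, matmul(dagger(C), C))).real

pure = [[1, 0], [0, 0]]        # eigenvalues 1 and 0: not faithful
mixed = [[0.7, 0], [0, 0.3]]   # eigenvalues 0.7 and 0.3: faithful
P_phi = [[0, 0], [0, 1]]       # projection onto the eigenvector (0, 1) of `pure`

pure_value = seminorm_sq(pure, P_phi)    # vanishes although P_phi != 0
mixed_value = seminorm_sq(mixed, P_phi)  # lambda_2 * ||P_phi phi_2||^2

print(pure_value, mixed_value)
```

The nonzero projection has vanishing seminorm in the pure state but not in the mixed one, matching the two directions of the proof.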

Theorem 2.4

If \(\rho \) is faithful, then the following statements are equivalent. (i) The uncertainty inequality of Theorem 2.2(ii) is an equality. (ii) \(D_\rho (B)=\alpha D_\rho (A)\) for \(\alpha \in \mathbb {R}\). (iii) \(B=\alpha A+\beta I\) for \(\alpha ,\beta \in \mathbb {R}\). If one of the conditions holds, then

$$\begin{aligned} \left[ {\Delta _\rho (A,B)}\right] ^2=|{\textrm{Cor}_\rho (A,B)}|^2=\Delta _\rho (A)\Delta _\rho (B) \end{aligned}$$
(2.5)

Proof

(i)\(\Rightarrow \)(ii) If the uncertainty inequality is an equality, then

$$\begin{aligned} |{\mathrm {tr\,}\left[ {\rho D_\rho (A)D_\rho (B)}\right] }|^2=\Delta _\rho (A)\Delta _\rho (B) \end{aligned}$$
(2.6)

We can rewrite (2.6) as

$$\begin{aligned} |{\langle {D_\rho (A),D_\rho (B)}\rangle _\rho }|^2=\langle {D_\rho (A),D_\rho (A)}\rangle _\rho \langle {D_\rho (B),D_\rho (B)}\rangle _\rho \end{aligned}$$

Since we have equality in Schwarz’s inequality and \(\langle {\bullet ,\bullet }\rangle _\rho \) is an inner product, it follows that \(D_\rho (B)=\alpha D_\rho (A)\) for some \(\alpha \in \mathbb {C}\). Since \(D_\rho (B)^*=D_\rho (B)\) and \(D_\rho (A)^*=D_\rho (A)\) we conclude that \(\alpha \in \mathbb {R}\). (ii)\(\Rightarrow \)(iii) If \(D_\rho (B)=\alpha D_\rho (A)\) for \(\alpha \in \mathbb {R}\), we have

$$\begin{aligned} B-\langle {B}\rangle _\rho I=\alpha \left( {A-\langle {A}\rangle _\rho I}\right) \end{aligned}$$

Hence, letting \(\beta =\langle {B}\rangle _\rho -\alpha \langle {A}\rangle _\rho \) we have \(B=\alpha A+\beta I\). Since \(A,B\in \mathcal {L}_S(H)\) and \(\alpha \in \mathbb {R}\), we have that \(\beta \in \mathbb {R}\). (iii)\(\Rightarrow \)(i) If (iii) holds, then

$$\begin{aligned} \langle {B}\rangle _\rho =\mathrm {tr\,} (\rho B)=\alpha \mathrm {tr\,} (\rho A)+\beta =\alpha \langle {A}\rangle _\rho +\beta \end{aligned}$$

Hence, \(\beta =\langle {B}\rangle _\rho -\alpha \langle {A}\rangle _\rho \) so that

$$\begin{aligned} D_\rho (B)= & {} B-\langle {B}\rangle _\rho I=\alpha A+\beta I-\langle {B}\rangle _\rho I\\= & {} \alpha A+\langle {B}\rangle _\rho I-\alpha \langle {A}\rangle _\rho I-\langle {B}\rangle _\rho I=\alpha D_\rho (A) \end{aligned}$$

Thus, (ii) holds and it follows that (2.6) holds and this implies (i). Equation (2.5) holds because (2.6) holds. \(\square \)
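The implication (iii)\(\Rightarrow \)(i) together with (2.5) can be checked on a concrete case. The sketch below assumes a faithful qubit state and sets \(B=\alpha A+\beta I\); the numbers, the matrix A and the helper `uncertainty_terms` are illustrative assumptions, not taken from the text.

```python
def matmul(X, Y):
    """Product of two square matrices stored as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(X):
    """Sum of the diagonal entries."""
    return sum(X[i][i] for i in range(len(X)))

def uncertainty_terms(rho, A, B):
    """Return (commutator term, covariance^2, |Cor|^2, variance product)."""
    mean_A = trace(matmul(rho, A)).real
    mean_B = trace(matmul(rho, B)).real
    tr_AB = trace(matmul(rho, matmul(A, B)))
    tr_BA = trace(matmul(rho, matmul(B, A)))
    cor = tr_AB - mean_A * mean_B
    var_A = trace(matmul(rho, matmul(A, A))).real - mean_A ** 2
    var_B = trace(matmul(rho, matmul(B, B))).real - mean_B ** 2
    return (0.25 * abs(tr_AB - tr_BA) ** 2, cor.real ** 2,
            abs(cor) ** 2, var_A * var_B)

# Faithful state: both eigenvalues (1 +/- ||r||)/2 are positive since ||r|| < 1.
r1, r2, r3 = 0.3, 0.4, 0.5
rho = [[(1 + r3) / 2, (r1 - 1j * r2) / 2],
       [(r1 + 1j * r2) / 2, (1 - r3) / 2]]

alpha, beta = -2.0, 0.5
A = [[1, 2 - 1j], [2 + 1j, -1]]
# B = alpha * A + beta * I
B = [[alpha * A[i][j] + (beta if i == j else 0) for j in range(2)]
     for i in range(2)]

commutator_term, covariance_sq, correlation_sq, variance_product = \
    uncertainty_terms(rho, A, B)

print(covariance_sq, correlation_sq, variance_product)
```

All three printed values agree up to rounding, as (2.5) predicts, and the commutator term is zero because B is a real affine function of A.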

Example 2

The simplest faithful state when \(\dim H=n<\infty \) is \(\rho =I/n\). Then \(\langle {A,B}\rangle _\rho =\frac{1}{n}\,\mathrm {tr\,} (A^*B)\) which is essentially the Hilbert-Schmidt inner product \(\langle {A,B}\rangle _{HS}=\mathrm {tr\,} (A^*B)\). In this case for \(A,B\in \mathcal {L}_S(H)\) we have \(\langle {A}\rangle _\rho =\frac{1}{n}\,\mathrm {tr\,} (A)\), \(D_\rho (A)=A-\frac{1}{n}\,\mathrm {tr\,} (A)I\). The other statistical concepts become:

$$\begin{aligned} \textrm{Cor}_\rho (A,B)= & {} \mathrm {tr\,}\left[ {\rho D_\rho (A)D_\rho (B)}\right] =\frac{1}{n}\,\mathrm {tr\,} (AB)-\frac{1}{ n^2}\,\mathrm {tr\,} (A)\mathrm {tr\,} (B)\\ \Delta _\rho (A,B)= & {} \frac{1}{n}\,\mathrm {Re\,}\mathrm {tr\,} (AB)-\frac{1}{n^2}\,\mathrm {tr\,} (A)\mathrm {tr\,} (B)\\ \Delta _\rho (A)= & {} \frac{1}{n}\,\mathrm {tr\,} (A^2)-\left[ {\frac{1}{n}\,\mathrm {tr\,} (A)}\right] ^2\\ \mathrm {tr\,}\left( {\rho \left[ {A,B}\right] }\right)= & {} \frac{2i}{n}\,\mathrm {Im\,}\mathrm {tr\,} (AB) \end{aligned}$$

The uncertainty principle is given by:

$$\begin{aligned} \left[ {\mathrm {Im\,}\mathrm {tr\,} (AB)}\right] ^2+ & {} \left[ {\mathrm {Re\,}\mathrm {tr\,}(AB)-\frac{1}{n}\,\mathrm {tr\,} (A)\mathrm {tr\,} (B)}\right] ^2 =|{\mathrm {tr\,} (AB)-\frac{1}{n}\,\mathrm {tr\,} (A)\mathrm {tr\,} (B)}|^2\\\le & {} \left[ {\mathrm {tr\,} (A^2)-\frac{1}{n}\,\mathrm {tr\,} (A)^2}\right] \left[ {\mathrm {tr\,} (B^2)-\frac{1}{n}\,\mathrm {tr\,} (B)^2}\right] \square \end{aligned}$$

3 Real-Valued Observables

An effect is an operator \(C\in \mathcal {L}_S(H)\) that satisfies \(0\le C\le I\) [1, 4, 5, 6]. Effects are thought of as two-outcome yes-no measurements. When the result of measuring C is yes, we say that C occurs and when the result is no, we say that C does not occur. A real-valued observable is a finite set of effects \(A=\left\{ A_x: x\in \Omega _A\right\} \) where \(\sum \limits _{x\in \Omega _A}A_x=I\) and \(\Omega _A\subseteq \mathbb {R}\) is the outcome space for A. The effect \(A_x\) occurs when the result of measuring A is the outcome x. The condition \(\sum \limits _{x\in \Omega _A}A_x=I\) specifies that one of the possible outcomes of A must occur. An observable is also called a positive operator-valued measure (POVM). We say A is sharp if \(A_x\) is a projection for all \(x\in \Omega _A\) and in this case, A is a projection-valued measure [4, 7]. Corresponding to A we have the stochastic operator \(\widetilde{A}\in \mathcal {L} (H)\) given by \(\widetilde{A} =\sum \limits _{x\in \Omega _A}xA_x\). Notice that we need A to be real-valued in order for \(\widetilde{A}\) to exist.

We now apply the theory presented in Section 2 to real-valued observables. For \(\rho \in \mathcal {S} (H)\), the \(\rho \)-average (or \(\rho \)-expectation) of A is defined by

$$\begin{aligned} \langle {A}\rangle _\rho =\langle {\widetilde{A}\,}\rangle _\rho =\mathrm {tr\,} (\rho \widetilde{A}\,)=\sum \limits _{x\in \Omega _A}x\mathrm {tr\,} (\rho A_x) \end{aligned}$$
(3.1)

We interpret \(\mathrm {tr\,} (\rho A_x)\) as the probability that a measurement of A results in the outcome x when the system is in state \(\rho \). Thus, (3.1) says that the \(\rho \)-average of A is the sum of its outcomes times the probabilities these outcomes occur. We define the \(\rho \)-deviation of A by

$$\begin{aligned} D_\rho (A)= & {} D_\rho (\widetilde{A}\,)=\widetilde{A} -\langle {A}\rangle _\rho I=\sum \limits _{x\in \Omega _A}xA_x-\sum \limits _{x\in \Omega _A}x\mathrm {tr\,} (\rho A_x)I\\= & {} \sum \limits _{x\in \Omega _A}x\left[ {A_x-\mathrm {tr\,} (\rho A_x)I}\right] \end{aligned}$$

If A, B are real-valued observables, the \(\rho \)-correlation of A, B is \(\textrm{Cor}_\rho (A,B)=\textrm{Cor}_\rho (\widetilde{A} ,\widetilde{B}\,)\), the \(\rho \)-covariance of A, B is \(\Delta _\rho (A,B)=\Delta _\rho (\widetilde{A} ,\widetilde{B}\,)\) and the \(\rho \)-variance of A is \(\Delta _\rho (A)=\Delta _\rho (\widetilde{A}\,)\). Applying (2.1) we obtain

$$\begin{aligned} \textrm{Cor}_\rho (A,B)= & {} \mathrm {tr\,} (\rho \widetilde{A}\widetilde{B}\,)-\langle {\widetilde{A}\,}\rangle _\rho \langle {\widetilde{B}\,}\rangle _\rho =\mathrm {tr\,}\left( {\rho \sum \limits _{x,y}xyA_xB_y}\right) -\langle {\widetilde{A}\,}\rangle _\rho \langle {\widetilde{B}\,}\rangle _\rho \nonumber \\= & {} \sum \limits _{x,y}xy\left[ {\mathrm {tr\,} (\rho A_xB_y)-\mathrm {tr\,} (\rho A_x)\mathrm {tr\,} (\rho B_y)}\right] \end{aligned}$$
(3.2)

It follows that

$$\begin{aligned} \Delta _\rho (A,B)=\sum \limits _{x,y}xy\left[ {\mathrm {Re\,}\mathrm {tr\,} (\rho A_xB_y)-\mathrm {tr\,} (\rho A_x)\mathrm {tr\,} (\rho B_y)}\right] \end{aligned}$$
(3.3)

and

$$\begin{aligned} \Delta _\rho (A)=\sum \limits _{x,y}xy\left[ {\mathrm {tr\,} (\rho A_xA_y)-\mathrm {tr\,} (\rho A_x)\mathrm {tr\,} (\rho A_y)}\right] \end{aligned}$$
(3.4)

We also have by (2.4) that

$$\begin{aligned} \mathrm {tr\,}\left( {\rho \left[ {\widetilde{A} ,\widetilde{B}\,}\right] }\right)= & {} 2i\,\mathrm {Im\,}\mathrm {tr\,} (\rho \widetilde{A}\widetilde{B}\,)=2i\,\mathrm {Im\,}\mathrm {tr\,}\left( {\rho \sum \limits _{x,y}xyA_xB_y}\right) \nonumber \\= & {} 2i\sum \limits _{x,y}xy\,\mathrm {Im\,}\mathrm {tr\,} (\rho A_xB_y) \end{aligned}$$
(3.5)

Substituting \(\widetilde{A},\widetilde{B}\) for A, B in Theorem 2.2 gives an uncertainty principle for real-valued observables.

Two observables A, B are compatible (or jointly measurable) if there exists a joint observable \(C_{(x,y)}\), \((x,y)\in \Omega _A\times \Omega _B\), such that \(A_x=\sum \limits _yC_{(x,y)}\), \(B_y=\sum \limits _xC_{(x,y)}\) for all \(x\in \Omega _A\), \(y\in \Omega _B\). If \(\left[ {A_x,B_y}\right] =0\) for all x, y, then A, B are compatible with \(C_{(x,y)}=A_xB_y\) for all \((x,y)\in \Omega _A\times \Omega _B\). However, if A, B are compatible, they need not commute [4]. If A, B are compatible real-valued observables, then

$$\begin{aligned} \widetilde{A}= & {} \sum \limits _xxA_x=\sum \limits _{x,y}xC_{(x,y)}\\ \widetilde{B}= & {} \sum \limits _yyB_y=\sum \limits _{x,y}yC_{(x,y)} \end{aligned}$$

Using (3.2), (3.3), (3.4), (3.5) we can write \(\textrm{Cor}_\rho (A,B),\Delta _\rho (A,B),\Delta _\rho (A),\Delta _\rho (B)\) and \(\mathrm {tr\,}\left( {\rho \left[ {\widetilde{A} ,\widetilde{B}\,}\right] }\right) \) in terms of \(C_{(x,y)}\). Hence, we can express the uncertainty principle in terms of \(C_{(x,y)}\).

If \(A=\left\{ A_x: x\in \Omega _A\right\} \) is a real-valued observable, then \(\widetilde{A}\) has spectral decomposition \(\widetilde{A} =\sum \limits _{i=1}^n\lambda _iP_i\) where \(\lambda _i\in \mathbb {R}\) are the distinct eigenvalues of \(\widetilde{A}\) and \(P_i\) are projections with \(\sum P_i=I\). We call \(\widehat{A} =\left\{ P_i: i=1,2,\ldots ,n\right\} \) the sharp version of A. Then \(\widehat{A}\) is a real-valued observable with outcome space \(\Omega _{\widehat{A}}=\left\{ \lambda _i: i=1,2,\ldots ,n\right\} \) and \(P _{\lambda _i}=P_i\). Since \((\widehat{A}\,)^\sim =\widetilde{A}\), A and \(\widehat{A}\) have the same stochastic operator. It follows that \(\langle {A}\rangle _\rho =\langle {\widehat{A}\,}\rangle _\rho \), \(\Delta _\rho (A)=\Delta _\rho (\widehat{A}\,)\) and if B is another real-valued observable, then \(\textrm{Cor}_\rho (A,B)=\textrm{Cor}_\rho (\widehat{A} ,\widehat{B}\,)\) and \(\Delta _\rho (A,B)=\Delta _\rho (\widehat{A} ,\widehat{B}\,)\).

Lemma 3.1

The following statements are equivalent. (i) \(\widehat{A} =\widehat{B}\). (ii) \(\widetilde{A} =\widetilde{B}\). (iii) \(\langle {A}\rangle _\rho =\langle {B}\rangle _\rho \) for all \(\rho \in \mathcal {S} (H)\).

Proof

(i)\(\Rightarrow \)(ii) If \(\widehat{A} =\widehat{B}\) then

$$\begin{aligned} \widetilde{A} =(\widehat{A}\,)^\sim =(\widehat{B}\,)^\sim =\widetilde{B} \end{aligned}$$

(ii)\(\Rightarrow \)(iii) If \(\widetilde{A} =\widetilde{B}\) then

$$\begin{aligned} \langle {A}\rangle _\rho =\langle {\widetilde{A}\,}\rangle _\rho =\langle {\widetilde{B}\,}\rangle _\rho =\langle {B}\rangle _\rho \end{aligned}$$

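(iii)\(\Rightarrow \)(i) If \(\langle {A}\rangle _\rho =\langle {B}\rangle _\rho \) for all \(\rho \in \mathcal {S} (H)\), then \(\langle {\widetilde{A}\,}\rangle _\rho =\langle {\widetilde{B}\,}\rangle _\rho \) for all \(\rho \in \mathcal {S} (H)\). Taking \(\rho =|{\phi }\rangle \langle {\phi }|\) for unit vectors \(\phi \) gives \(\langle {\phi ,(\widetilde{A} -\widetilde{B}\,)\phi }\rangle =0\) for all \(\phi \), so \(\widetilde{A} =\widetilde{B}\). Since the spectral decomposition is unique, it follows that \(\widehat{A} =\widehat{B}\). \(\square \)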

Let \(\widetilde{A} =\sum xA_x=\sum \lambda _iP_i\) so \(\widehat{A} =\left\{ P_i: i=1,2,\ldots ,n\right\} \) is a sharp version of A. Let \(B=\left\{ B_x: x\in \Omega _A\right\} \) be the real-valued observable given by \(B_x=\sum \limits _{i=1}^nP_iA_xP_i\). We conclude that A and B have the same sharp version because

$$\begin{aligned} \widetilde{B}= & {} \sum \limits _xxB_x=\sum \limits _iP_i\sum \limits _xxA_xP_i=\sum \limits _iP_i\widetilde{A} P_i=\sum \limits _iP_i\sum \limits _j\lambda _jP_jP_i\\= & {} \sum \limits _{i,j}\lambda _jP_iP_jP_i=\sum \limits _i\lambda _iP_i=\widetilde{A} \end{aligned}$$

so by Lemma 3.1, \(\widehat{A} =\widehat{B}\). We say that B is a conjugate of A. Letting \(C_{ix}=P_iA_xP_i\), we have that

$$\begin{aligned} \left\{ C_{ix}: i=1,2,\ldots ,n, x\in \Omega _A\right\} \end{aligned}$$

is an observable and \(\sum \limits _iC_{ix}=B_x\), \(\sum \limits _xC_{ix}=P_i\). It follows that B and \(\widehat{A}\) are compatible with joint observable \(\left\{ C_{ix}\right\} \). We say that an observable \(A=\left\{ A_x: x\in \Omega _A\right\} \) is commutative if \(\left[ {A_x,A_y}\right] =0\) for all \(x,y\in \Omega _A\). Notice that if A is sharp, then A is commutative. However, there are many unsharp observables that are commutative.

Theorem 3.2

If A is commutative, then B is conjugate to A if and only if \(B=A\).

Proof

If A is commutative, we show that A is conjugate to A. Since

$$\begin{aligned} \widetilde{A} =\sum xA_x=\sum \lambda _iP_i \end{aligned}$$

we have that \(\left[ {\widetilde{A} ,A_x}\right] =0\) for all \(x\in \Omega _A\). By the spectral theorem, \(\left[ {A_x,P_i}\right] =0\) for all x, i so \(A_x=\sum \limits _iP_iA_xP_i\). Therefore, A is conjugate to A. Conversely, suppose A is commutative and B is conjugate to A. Then \(B_x=\sum \limits _iP_iA_xP_i\) for all \(x\in \Omega _A\). As before, we have that \(\left[ {\widetilde{A} ,A_x}\right] =0\) for all \(x\in \Omega _A\) so \(\left[ {A_x,P_i}\right] =0\) for all x, i. Hence,

$$\begin{aligned} B_x=\sum \limits _iP_iA_xP_i=A_x\sum \limits _iP_i=A_x \end{aligned}$$

for all \(x\in \Omega _B=\Omega _A\) so \(B=A\). \(\square \)

Thus, nontrivial conjugates only occur in the nonclassical case where A is noncommutative.
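Theorem 3.2 can be illustrated with a commutative, unsharp dichotomic observable. The sketch below assumes \(A_1=\frac{1}{2}(I+\mu \sigma _x)\) with outcomes \(\pm 1\), so that \(\widetilde{A} =\mu \sigma _x\) and, for \(\mu \ne 0\), the spectral projections are \(\frac{1}{2}(I\pm \sigma _x)\). These projections are written down by hand rather than computed, and the helper names are hypothetical.

```python
def matmul(X, Y):
    """Product of two square matrices stored as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lincomb(a, X, b, Y):
    """Entrywise a*X + b*Y."""
    n = len(X)
    return [[a * X[i][j] + b * Y[i][j] for j in range(n)] for i in range(n)]

I2 = [[1, 0], [0, 1]]
sigma_x = [[0, 1], [1, 0]]

mu = 0.6
A1 = lincomb(0.5, I2, 0.5 * mu, sigma_x)
A = {1: A1, -1: lincomb(1, I2, -1, A1)}   # effects indexed by outcome

# Spectral projections of the stochastic operator mu * sigma_x (mu != 0).
projections = [lincomb(0.5, I2, 0.5, sigma_x),
               lincomb(0.5, I2, -0.5, sigma_x)]

def conjugate(effects, projections):
    """Return the observable with effects B_x = sum_i P_i A_x P_i."""
    result = {}
    for x, A_x in effects.items():
        total = [[0, 0], [0, 0]]
        for P in projections:
            total = lincomb(1, total, 1, matmul(P, matmul(A_x, P)))
        result[x] = total
    return result

B = conjugate(A, projections)
max_difference = max(abs(B[x][i][j] - A[x][i][j])
                     for x in A for i in range(2) for j in range(2))
print("largest entry of B_x - A_x:", max_difference)
```

Because the two effects commute with the spectral projections, the conjugate coincides with A up to rounding, in line with the theorem.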

4 More Examples

This section illustrates the theory in Sections 2 and 3 with two examples.

Example 3

A two-outcome observable is called a dichotomic observable. Of course, a dichotomic observable is commutative but it need not be sharp. Let \(A=\left\{ A_1,I-A_1\right\} \) be a dichotomic observable with \(\Omega _A=\left\{ 1,-1\right\} \). Then

$$\begin{aligned} \widetilde{A}= & {} A_1-(I-A_1)=2A_1-I\\ \langle {A}\rangle _\rho= & {} \mathrm {tr\,} (\rho \widetilde{A}\,)=\mathrm {tr\,}\left[ {\rho (2A_1-I)}\right] =2\,\mathrm {tr\,} (\rho A_1)-1\\ D_\rho (A)= & {} \widetilde{A} -\langle {A}\rangle _\rho I=2A_1-I-2\,\mathrm {tr\,} (\rho A_1)I+I=2\left[ {A_1-\mathrm {tr\,} (\rho A_1)I}\right] \end{aligned}$$

If \(B=\left\{ B_1,I-B_1\right\} \) is another dichotomic observable with \(\Omega _B=\left\{ 1,-1\right\} \), then

$$\begin{aligned} \textrm{Cor}_\rho (A,B)= & {} \mathrm {tr\,} (\rho \widetilde{A}\widetilde{B}\,)-\langle {A}\rangle _\rho \langle {B}\rangle _\rho \nonumber \\= & {} \mathrm {tr\,}\left[ {\rho (2A_1-I)(2B_1-I)}\right] -\left[ {2\,\mathrm {tr\,} (\rho A_1)-1}\right] \left[ {2\,\mathrm {tr\,} (\rho B_1)-1}\right] \nonumber \\= & {} \mathrm {tr\,}\left[ {\rho (4A_1B_1-2A_1-2B_1+I)}\right] -4\,\mathrm {tr\,} (\rho A_1)\mathrm {tr\,} (\rho B_1)\nonumber \\{} & {} \quad +2\,\mathrm {tr\,} (\rho A_1)+2\,\mathrm {tr\,} (\rho B_1)-1\nonumber \\= & {} 4\left[ {\mathrm {tr\,} (\rho A_1B_1)-\mathrm {tr\,} (\rho A_1)\mathrm {tr\,} (\rho B_1)}\right] \end{aligned}$$
(4.1)

Hence,

$$\begin{aligned} \Delta _\rho (A,B)= & {} 4\left[ {\mathrm {Re\,}\mathrm {tr\,} (\rho A_1B_1)-\mathrm {tr\,} (\rho A_1)\mathrm {tr\,} (\rho B_1)}\right] \end{aligned}$$

and

$$\begin{aligned} \Delta _\rho (A)= & {} \Delta _\rho (A,A)=4\left[ {\mathrm {tr\,} (\rho A_1^2)-\left( {\mathrm {tr\,} (\rho A_1)}\right) ^2}\right] \end{aligned}$$

We also have

$$\begin{aligned} \left[ {\widetilde{A} ,\widetilde{B}\,}\right]= & {} \left[ {2A_1-I,2B_1-I}\right] =(2A_1-I)(2B_1-I)-(2B_1-I)(2A_1-I)\\= & {} 4\left[ {A_1,B_1}\right] \end{aligned}$$

We conclude that \(\left[ {\widetilde{A} ,\widetilde{B}\,}\right] =0\) if and only if \(\left[ {A_1,B_1}\right] =0\) and this does not hold in general so \(\widetilde{A},\widetilde{B}\) need not commute. The uncertainty principle becomes

$$\begin{aligned}{} & {} \left[ {\mathrm {Im\,}\mathrm {tr\,} (\rho A_1B_1)}\right] ^2+\left[ {\mathrm {Re\,}\mathrm {tr\,} (\rho A_1B_1)-\mathrm {tr\,} (\rho A_1)\mathrm {tr\,} (\rho B_1)}\right] ^2\nonumber \\{} & {} =|{\mathrm {tr\,} (\rho A_1B_1)-\mathrm {tr\,} (\rho A_1)\mathrm {tr\,} (\rho B_1)}|^2\nonumber \\{} & {} \le \left[ {\mathrm {tr\,} (\rho A_1^2)-\left( {\mathrm {tr\,} (\rho A_1)}\right) ^2}\right] \left[ {\mathrm {tr\,} (\rho B_1^2)-\left( {\mathrm {tr\,} (\rho B_1)}\right) ^2}\right] \end{aligned}$$
(4.2)

\(\square \)
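Formula (4.1) can be checked against the definition of the correlation for one pair of dichotomic observables. In the sketch below the effects `A1`, `B1` and the state `rho` are arbitrary illustrative choices (each effect has eigenvalues between 0 and 1), and the helper names are hypothetical.

```python
def matmul(X, Y):
    """Product of two square matrices stored as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(X):
    """Sum of the diagonal entries."""
    return sum(X[i][i] for i in range(len(X)))

def lincomb(a, X, b, Y):
    """Entrywise a*X + b*Y."""
    n = len(X)
    return [[a * X[i][j] + b * Y[i][j] for j in range(n)] for i in range(n)]

I2 = [[1, 0], [0, 1]]
rho = [[0.75, 0.15 - 0.2j], [0.15 + 0.2j, 0.25]]   # a mixed qubit state
A1 = [[0.5, 0.25], [0.25, 0.5]]                    # effect, not a projection
B1 = [[0.8, 0.1j], [-0.1j, 0.3]]                   # effect, not a projection

# Stochastic operators for outcomes +1 and -1: 2*A1 - I and 2*B1 - I.
A_tilde = lincomb(2, A1, -1, I2)
B_tilde = lincomb(2, B1, -1, I2)

# Correlation from the definition, equation (2.1) applied to the stochastic operators.
cor_direct = (trace(matmul(rho, matmul(A_tilde, B_tilde)))
              - trace(matmul(rho, A_tilde)) * trace(matmul(rho, B_tilde)))

# Correlation from the closed form (4.1).
cor_formula = 4 * (trace(matmul(rho, matmul(A1, B1)))
                   - trace(matmul(rho, A1)) * trace(matmul(rho, B1)))

print(cor_direct, cor_formula)
```

The two values agree up to rounding, and their imaginary part is nonzero here because `A1` and `B1` do not commute.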

Example 4

We now consider a special case of Example 3. For \(H=\mathbb {C}^2\) we define the Pauli matrices

$$\begin{aligned} \sigma _x=\left[ \begin{array}{cc} 0&{}1\\ 1&{}0 \end{array}\right] , \quad \sigma _y=\left[ \begin{array}{cc} 0&{}-i\\ i&{}0 \end{array}\right] ,\quad \sigma _z=\left[ \begin{array}{cc} 1&{}0\\ 0&{}-1 \end{array}\right] \end{aligned}$$

Let \(\mu \in \left[ {0,1}\right] \) and define the dichotomic observable \(A=\left\{ A_1,I-A_1\right\} \), where

$$\begin{aligned} A_1=\frac{1}{2}(I+\mu \sigma _x)=\frac{1}{2}\left[ \begin{array}{cc} 1&{}\mu \\ \mu &{}1 \end{array}\right] \end{aligned}$$

and \(\Omega _A=\left\{ 1,-1\right\} \). Similarly, let \(B=\left\{ B_1,I-B_1\right\} \), where

$$\begin{aligned} B_1=\frac{1}{2}(I+\mu \sigma _y)=\frac{1}{2}\left[ \begin{array}{cc} 1&{}-i\mu \\ i\mu &{}1 \end{array}\right] \end{aligned}$$

and \(\Omega _B=\left\{ 1,-1\right\} \). We call A and B noisy spin observables along the x and y directions, respectively, with noise parameter \(1-\mu \) [7].

Any state \(\rho \in \mathcal {S} (H)\) has the form \(\rho =\frac{1}{2}(I+\overrightarrow{r}\bullet \overrightarrow{\sigma })\) where \(\overrightarrow{r}\in \mathbb {R} ^3\) with \(\Vert {\overrightarrow{r}}\Vert \le 1\) [1, 2]. This is called the Bloch sphere representation of \(\rho \) [4, 7]. The eigenvalues of \(\rho \) are \(\lambda _{\pm } =\frac{1}{2}\left( {1\pm \Vert {\overrightarrow{r}}\Vert }\right) \). Then \(\lambda _+=1\), \(\lambda _-=0\) if and only if \(\Vert {\overrightarrow{r}}\Vert =1\) and these are precisely the pure states. Letting \(\sigma _1=\sigma _x\), \(\sigma _2=\sigma _y\), \(\sigma _3=\sigma _z\) we obtain

$$\begin{aligned} \rho= & {} \frac{1}{2}\left[ \begin{array}{cc} 1+r_3&{}r_1-ir_2\\ r_1+ir_2&{}1-r_3\end{array}\right] \end{aligned}$$

and

$$\begin{aligned} \rho A_1= & {} \frac{1}{4}\left[ \begin{array}{cc} 1+r_3&{}r_1-ir_2\\ r_1+ir_2&{}1-r_3\end{array}\right] \left[ \begin{array}{cc} 1&{}\mu \\ \mu &{}1\end{array}\right] \\= & {} \frac{1}{4}\left[ \begin{array}{cc} 1+r_3+(r_1-ir_2)\mu &{}(1+r_3)\mu +r_1-ir_2\\ (1-r_3)\mu +r _1+ir_2&{}1-r_3+(r_1+ir_2)\mu \end{array}\right] \end{aligned}$$

Hence, \(\mathrm {tr\,} (\rho A_1)=\frac{1}{2}(1+r_1\mu )\) and as in Example 3, \(\langle {A}\rangle _\rho =r_1\mu \). Similarly, \(\mathrm {tr\,} (\rho B_1)=\frac{1}{2}(1+r_2\mu )\) and \(\langle {B}\rangle _\rho =r_2\mu \). We also obtain

$$\begin{aligned} \mathrm {tr\,} (\rho A_1B_1)=\frac{1}{4}\left[ {1+(r_1+r_2)\mu +ir_3\mu ^2}\right] \end{aligned}$$

and it follows from (4.1) that

$$\begin{aligned} \textrm{Cor}_\rho (A,B)= & {} 4\left[ {\mathrm {tr\,} (\rho A_1B_1)-\mathrm {tr\,} (\rho A_1)\mathrm {tr\,} (\rho B_1)}\right] \\= & {} 1+(r_1+r_2)\mu +ir_3\mu ^2-(1+r_1\mu )(1+r_2\mu )=-r_1r_2\mu ^2+ir_3\mu ^2 \end{aligned}$$

Therefore, \(\Delta _\rho (A,B)=-r_1r_2\mu ^2\). A straightforward calculation shows that

$$\begin{aligned} \mathrm {tr\,} (\rho A_1^2)= & {} \frac{1}{4}(1+\mu ^2)+\frac{1}{2}\mu r_1\\ \mathrm {tr\,} (\rho B_1^2)= & {} \frac{1}{4}(1+\mu ^2)+\frac{1}{2}\mu r_2 \end{aligned}$$

It follows that

$$\begin{aligned} \Delta _\rho (A)=4\left[ {\mathrm {tr\,} (\rho A_1^2)-\left( {\mathrm {tr\,} (\rho A_1)}\right) ^2}\right] =\mu ^2(1-r_1^2) \end{aligned}$$

and similarly, \(\Delta _\rho (B)=\mu ^2(1-r_2^2)\).

The commutator term in (4.2) becomes

$$\begin{aligned} \left[ {\mathrm {Im\,}\mathrm {tr\,} (\rho A_1B_1)}\right] ^2=\frac{1}{16}\,r_3^2\mu ^4 \end{aligned}$$

The covariance term in (4.2) is

$$\begin{aligned} \left[ {\mathrm {Re\,}\mathrm {tr\,} (\rho A_1B_1)-\mathrm {tr\,} (\rho A_1)\mathrm {tr\,} (\rho B_1)}\right] ^2=\frac{1}{16}\,r_1^2r_2^2\mu ^4 \end{aligned}$$

and the correlation term in (4.2) is

$$\begin{aligned} |{\mathrm {tr\,} (\rho A_1B_1)-\mathrm {tr\,} (\rho A_1)\mathrm {tr\,} (\rho B_1)}|^2=\frac{1}{16}\,(r_3^2+r_1^2r_2^2)\mu ^4 \end{aligned}$$

Finally, the variance term in (4.2) is given by

$$\begin{aligned} \Delta _\rho (A_1)\Delta _\rho (B_1)=\frac{1}{16}\,(1-r_1^2)(1-r_2^2)\mu ^4 \end{aligned}$$

The inequality in (4.2) reduces to

$$\begin{aligned} \frac{1}{16}(r_3^2+r_1^2r_2^2)\mu ^4\le \frac{1}{16}(1-r_1^2)(1-r_2^2)\mu ^4 \end{aligned}$$
(4.3)

If \(\mu \ne 0\), (4.3) is equivalent to the inequality

$$\begin{aligned} \Vert {\overrightarrow{r}}\Vert ^2=r_1^2+r_2^2+r_3^2\le 1 \end{aligned}$$
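To see this equivalence, divide (4.3) by \(\mu ^4/16>0\), put the correlation term \(r_3^2+r_1^2r_2^2\) computed above on the left, and expand the product on the right:

$$\begin{aligned} r_3^2+r_1^2r_2^2\le (1-r_1^2)(1-r_2^2)=1-r_1^2-r_2^2+r_1^2r_2^2 \quad \Longleftrightarrow \quad r_1^2+r_2^2+r_3^2\le 1 \end{aligned}$$

The term \(r_1^2r_2^2\) cancels on both sides, so the uncertainty inequality here says only that the Bloch vector of \(\rho \) lies in the unit ball.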

If the commutator term vanishes and \(\mu \ne 0\), the uncertainty inequality becomes

$$\begin{aligned} r_1^2r_2^2\le (1-r_1^2)(1-r_2^2) \end{aligned}$$
(4.4)

which is equivalent to \(r_1^2+r_2^2\le 1\). If A and B are \(\rho \)-uncorrelated and \(\mu \ne 0\), the uncertainty inequality becomes \(r_3^2\le (1-r_1^2)(1-r_2^2)\) which is equivalent to \(\Vert {\overrightarrow{r}}\Vert ^2\le 1+r_1^2r_2^2\). This inequality and (4.4) are weaker than (4.3). \(\square \)
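The closed-form terms of this example can be checked numerically. The following is a minimal sketch, not the author's computation. It assumes the qubit parametrization \(\rho =\frac{1}{2}(I+\overrightarrow{r}\cdot \overrightarrow{\sigma })\), \(A_1=\frac{1}{2}(I+\mu \sigma _x)\), \(B_1=\frac{1}{2}(I+\mu \sigma _y)\), which reproduces the traces quoted above. The helper names `lin`, `mul`, `tr` and `noisy_spin_terms` are hypothetical, and the sample values of \(\overrightarrow{r}\) and \(\mu \) are arbitrary.

```python
# Pure-Python 2x2 complex matrices (nested lists); no external library needed.
I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]          # Pauli sigma_x
SY = [[0, -1j], [1j, 0]]       # Pauli sigma_y
SZ = [[1, 0], [0, -1]]         # Pauli sigma_z

def lin(*terms):
    """Linear combination sum_k c_k * M_k of 2x2 matrices."""
    return [[sum(c * M[i][j] for c, M in terms) for j in range(2)]
            for i in range(2)]

def mul(X, Y):
    """Matrix product of two 2x2 matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def tr(X):
    """Trace of a 2x2 matrix."""
    return X[0][0] + X[1][1]

def noisy_spin_terms(r, mu):
    """Return (commutator, covariance, variance) terms of the inequality.

    ASSUMPTION: rho = (I + r.sigma)/2, A1 = (I + mu*sigma_x)/2,
    B1 = (I + mu*sigma_y)/2 (parametrization assumed for this sketch).
    """
    r1, r2, r3 = r
    rho = lin((0.5, I2), (0.5 * r1, SX), (0.5 * r2, SY), (0.5 * r3, SZ))
    A1 = lin((0.5, I2), (0.5 * mu, SX))
    B1 = lin((0.5, I2), (0.5 * mu, SY))
    tA = tr(mul(rho, A1)).real
    tB = tr(mul(rho, B1)).real
    tAB = tr(mul(rho, mul(A1, B1)))            # complex in general
    var_A = tr(mul(rho, mul(A1, A1))).real - tA ** 2
    var_B = tr(mul(rho, mul(B1, B1))).real - tB ** 2
    commutator = tAB.imag ** 2                 # [Im tr(rho A1 B1)]^2
    covariance = (tAB.real - tA * tB) ** 2     # [Re tr(rho A1 B1) - tr tr]^2
    return commutator, covariance, var_A * var_B

# Arbitrary sample values with ||r|| <= 1.
r, mu = (0.3, 0.4, 0.5), 0.8
com, cov, var = noisy_spin_terms(r, mu)
print(com + cov <= var)
```

For the sample values the three returned numbers agree with \(\frac{1}{16}r_3^2\mu ^4\), \(\frac{1}{16}r_1^2r_2^2\mu ^4\) and \(\frac{1}{16}(1-r_1^2)(1-r_2^2)\mu ^4\) up to rounding.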

5 Real-Valued Coarse Graining

Let \(A=\left\{ A_x: x\in \Omega _A\right\} \) be an arbitrary observable. We do not assume that A is real-valued, so the outcome space \(\Omega _A\) is an arbitrary finite set. For \(f:\Omega _A\rightarrow \mathbb {R}\) with range \(\mathcal {R} (f)\) we define the real-valued observable f(A) by \(\Omega _{f(A)}=\mathcal {R} (f)\) and for all \(z\in \Omega _{f(A)}\)

$$\begin{aligned} f(A)_z=A_{f^{-1}(z)}=\sum \left\{ A_x: f(x)=z\right\} \end{aligned}$$

We call f(A) a real-valued coarse graining of A [2,3,4]. Then f(A) has stochastic operator

$$\begin{aligned} f(A)^\sim =\sum \limits _zzf(A)_z=\sum \limits _zzA_{f^{-1}(z)}=\sum \limits _z\sum \limits _{x\in f^{-1}(z)}zA_x=\sum \limits _xf(x)A_x \end{aligned}$$

It follows that \(\langle {f(A)}\rangle _\rho =\sum \limits _xf(x)\mathrm {tr\,} (\rho A_x)\) for all \(\rho \in \mathcal {S} (H)\). If B is another observable and \(g:\Omega _B\rightarrow \mathbb {R}\) we have

$$\begin{aligned} \textrm{Cor}_\rho \left[ {f(A),g(B)}\right]= & {} \sum \limits _{x,y}f(x)g(y)\mathrm {tr\,} (\rho A_xB_y)-\langle {f(A)}\rangle _\rho \langle {g(B)}\rangle _\rho \\ \Delta _\rho \left[ {f(A),g(B)}\right]= & {} \sum \limits _{x,y}f(x)g(y)\mathrm {Re\,}\mathrm {tr\,} (\rho A_xB_y)-\langle {f(A)}\rangle _\rho \langle {g(B)}\rangle _\rho \\ \Delta _\rho \left[ {f(A)}\right]= & {} \sum \limits _{x,y}f(x)f(y)\mathrm {tr\,} (\rho A_xA_y)-\langle {f(A)}\rangle _\rho ^2 \end{aligned}$$

Moreover, we have the uncertainty inequality

$$\begin{aligned} |{\textrm{Cor}_\rho \left[ {f(A),g(B)}\right] }|^2\le \Delta _\rho \left[ {f(A)}\right] \Delta _\rho \left[ {g(B)}\right] \end{aligned}$$
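The coarse-graining construction and the inequality above can be illustrated on a small case. This is a sketch under stated assumptions: the three-outcome qubit observable `A`, the two-outcome observable `B`, the state `rho` and the label maps `f`, `g` are all hypothetical choices made for illustration, and the helper names are not from the text.

```python
def add(X, Y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

def scale(c, X):
    return [[c * a for a in row] for row in X]

def mul(X, Y):
    """Matrix product of matrices given as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)]
            for row in X]

def tr(X):
    return sum(X[i][i] for i in range(len(X)))

def coarse_grain(A, f):
    """f(A)_z = sum of the effects A_x over all outcomes x with f(x) = z."""
    fA = {}
    for x, Ax in A.items():
        z = f(x)
        fA[z] = add(fA[z], Ax) if z in fA else Ax
    return fA

def stochastic(A, f=lambda x: x):
    """sum_x f(x) A_x; with the default f, the stochastic operator of A."""
    terms = [scale(f(x), Ax) for x, Ax in A.items()]
    S = terms[0]
    for T in terms[1:]:
        S = add(S, T)
    return S

# Hypothetical observables with non-numeric outcome labels; effects sum to I.
A = {'a': [[0.5, 0.0], [0.0, 0.0]],
     'b': [[0.0, 0.0], [0.0, 0.5]],
     'c': [[0.5, 0.0], [0.0, 0.5]]}
B = {'u': [[0.5, 0.5], [0.5, 0.5]], 'd': [[0.5, -0.5], [-0.5, 0.5]]}
f = {'a': 1, 'b': -1, 'c': 1}
g = {'u': 1, 'd': -1}
rho = [[0.7, 0.2], [0.2, 0.3]]     # hypothetical state: positive, trace one

fA = coarse_grain(A, f.get)        # real-valued observable f(A)
S_A = stochastic(fA)               # sum_z z f(A)_z
S_B = stochastic(coarse_grain(B, g.get))

mean_A, mean_B = tr(mul(rho, S_A)), tr(mul(rho, S_B))
cor = tr(mul(rho, mul(S_A, S_B))) - mean_A * mean_B
var_A = tr(mul(rho, mul(S_A, S_A))) - mean_A ** 2
var_B = tr(mul(rho, mul(S_B, S_B))) - mean_B ** 2
print(abs(cor) ** 2 <= var_A * var_B)
```

Here `stochastic(fA)` and `stochastic(A, f.get)` give the same matrix, which is the identity \(\sum _zzf(A)_z=\sum _xf(x)A_x\) for this instance.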

We denote the set of trace-class operators on H by \(\mathcal {T} (H)\). An operation on H is a completely positive, trace reducing, linear map \(\mathcal {O}:\mathcal {T} (H)\rightarrow \mathcal {T} (H)\) [1,2,3,4]. If \(\mathcal {O}\) preserves the trace, then \(\mathcal {O}\) is called a channel. A (finite) instrument is a finite set of operations \(\mathcal {I} =\left\{ \mathcal {I}_x: x\in \Omega _\mathcal {I}\right\} \) such that \(\overline{\mathcal {I}} =\sum \left\{ \mathcal {I}_x: x\in \Omega _\mathcal {I}\right\} \) is a channel [1,2,3,4]. We say that \(\mathcal {I}\) measures an observable A if \(\Omega _\mathcal {I} =\Omega _A\) and \(\mathrm {tr\,}\left[ {\mathcal {I}_x(\rho )}\right] =\mathrm {tr\,} (\rho A_x)\) for all \(x\in \Omega _\mathcal {I}\) and all \(\rho \in \mathcal {S} (H)\). It can be shown that \(\mathcal {I}\) measures a unique observable which we denote by \(J(\mathcal {I})\) [2, 3]. Conversely, any observable is measured by many instruments [1,2,3,4]. Corresponding to an operation \(\mathcal {O}\) we have its dual-operation \(\mathcal {O} ^*:\mathcal {L} (H)\rightarrow \mathcal {L} (H)\) defined by \(\mathrm {tr\,}\left[ {\rho \mathcal {O} ^*(C)}\right] =\mathrm {tr\,}\left[ {\mathcal {O} (\rho )C}\right] \) for all \(\rho \in \mathcal {S} (H)\) [2, 3]. It can be shown that \(J(\mathcal {I})_x=\mathcal {I}_x^*(I)\) for all \(x\in \Omega _\mathcal {I}\) where I is the identity operator [2, 3].

As with observables, if \(\mathcal {I}\) is an instrument, and \(f:\Omega _\mathcal {I}\rightarrow \mathbb {R}\) we define the real-valued instrument \(f(\mathcal {I})\) such that \(\Omega _{f(\mathcal {I})}=\mathcal {R} (f)\) and

$$\begin{aligned} f(\mathcal {I})_z=\sum \left\{ \mathcal {I}_x: f(x)=z\right\} \end{aligned}$$

If \(J(\mathcal {I})=A\), then \(J\left[ {f(\mathcal {I})}\right] =f(A)\) because

$$\begin{aligned} \mathrm {tr\,}\left[ {f(\mathcal {I})_z(\rho )}\right]= & {} \mathrm {tr\,}\left[ {\sum \left\{ \mathcal {I}_x(\rho ): f(x)=z\right\} }\right] =\sum \left\{ \mathrm {tr\,}\left[ {\mathcal {I}_x(\rho )}\right] : f(x)=z\right\} \\= & {} \sum \left\{ \mathrm {tr\,} (\rho A_x): f(x)=z\right\} =\mathrm {tr\,}\left[ {\rho \sum \left\{ A_x: f(x)=z\right\} }\right] \\= & {} \mathrm {tr\,}\left[ {\rho f(A)_z}\right] \end{aligned}$$

for all \(z\in \Omega _{f(A)}=\Omega _{f(\mathcal {I})}\). If \(\mathcal {I}\) is real-valued, we define \(\widetilde{\mathcal {I}}\) on \(\mathcal {L} (H)\) by \(\widetilde{\mathcal {I}}(C)=\sum x\mathcal {I}_x(C)\) and \(\langle {\mathcal {I}}\rangle _\rho =\mathrm {tr\,}\left[ {\widetilde{\mathcal {I}} (\rho )}\right] \). If \(J(\mathcal {I})=A\), then

$$\begin{aligned} \langle {\mathcal {I}}\rangle _\rho =\mathrm {tr\,}\left[ {\sum x\mathcal {I}_x(\rho )}\right] =\sum x\mathrm {tr\,}\!\left[ {\mathcal {I}_x(\rho )}\right] =\sum x\mathrm {tr\,} (\rho A_x) =\langle {A}\rangle _\rho \end{aligned}$$

for all \(\rho \in \mathcal {S} (H)\). We also define \(\Delta _\rho (\mathcal {I})=\Delta _\rho (A)\). It follows that \(\langle {f(\mathcal {I})}\rangle _\rho =\langle {f(A)}\rangle _\rho \), \(\Delta _\rho \left[ {f(\mathcal {I})}\right] =\Delta _\rho \left[ {f(A)}\right] \) and \(f(\mathcal {I})^\sim =\sum f(x)\mathcal {I}_x\).

Let \(A=\left\{ A_x: x\in \Omega _A\right\} \), \(B=\left\{ B_y: y\in \Omega _B\right\} \) be arbitrary observables and suppose \(\mathcal {I}\) is an instrument with \(J(\mathcal {I})=A\). Define the \(\mathcal {I}\)-product observable \(A\circ B\) with \(\Omega _{A\circ B}=\Omega _A\times \Omega _B\) given by \((A\circ B)_{(x,y)}=\mathcal {I}_x^*(B_y)\) [2, 3]. Then \(A\circ B\) is indeed an observable because

$$\begin{aligned} \sum \limits _{x,y}(A\circ B)_{(x,y)}=\sum \limits _{x,y}\mathcal {I}_x^*(B_y)=\sum \limits _x\mathcal {I}_x^*\left( {\sum \limits _yB_y}\right) =\sum \limits _x\mathcal {I}_x^*(I)=\sum \limits _xA_x=I \end{aligned}$$

Although \(A\circ B\) depends on \(\mathcal {I}\), we shall not indicate this for simplicity. We interpret \(A\circ B\) as the observable obtained by first measuring A using \(\mathcal {I}\) and then measuring B. If \(f:\Omega _A\times \Omega _B\rightarrow \mathbb {R}\) we obtain the real-valued observable \(f(A,B)=f(A\circ B)\). We then have

$$\begin{aligned} f(A,B)_z= & {} (A\circ B)_{f^{-1}(z)}=\sum \left\{ (A\circ B)_{(x,y)}: f(x,y)=z\right\} \\= & {} \sum \left\{ \mathcal {I}_x^*(B_y): f(x,y)=z\right\} \\ f(A,B)^\sim= & {} \sum \limits _{x,y}f(x,y)(A\circ B)_{(x,y)}=\sum \limits _{x,y}f(x,y)\mathcal {I}_x^*(B_y)\\ \langle {f(A,B)}\rangle _\rho= & {} \sum \limits _{x,y}f(x,y)\mathrm {tr\,}\left[ {\rho (A\circ B)_{(x,y)}}\right] =\sum \limits _{x,y}f(x,y)\mathrm {tr\,}\left[ {\rho \mathcal {I}_x^*(B_y)}\right] \\ \Delta _\rho \left[ {f(A,B)}\right]= & {} \sum \limits _{x,y,x',y'}f(x,y)f(x',y')\mathrm {tr\,}\left[ {\rho (A\circ B)_{(x,y)}(A\circ B)_{(x',y')}}\right] -\langle {f(A,B)}\rangle _\rho ^2\\= & {} \mathrm {tr\,}\left\{ \rho \left[ {\sum \limits _{x,y}f(x,y)\mathcal {I}_x^*(B_y)}\right] ^2\right\} -\langle {f(A,B)}\rangle _\rho ^2 \end{aligned}$$

If f is a product function \(f(x,y)=g(x)h(y)\) we obtain

$$\begin{aligned} f(A,B)_z=\sum \left\{ \mathcal {I}_x^*(B_y): g(x)h(y)=z\right\} \end{aligned}$$

We then have the simplification

$$\begin{aligned} f(A,B)^\sim= & {} \sum \limits _{x,y}g(x)h(y)\mathcal {I}_x^*(B_y)=\sum \limits _xg(x)\mathcal {I}_x^*\left( {\sum \limits _yh(y)B_y}\right) \\= & {} \sum \limits _xg(x)\mathcal {I}_x^*\left[ {h(B)^\sim }\right] \end{aligned}$$

Hence,

$$\begin{aligned} \langle {f(A,B)}\rangle _\rho= & {} \mathrm {tr\,}\left[ {\rho f(A,B)^\sim }\right] =\mathrm {tr\,}\left\{ \rho \sum \limits _xg(x)\mathcal {I}_x^*\left[ {h(B)^\sim }\right] \right\} \\= & {} \sum \limits _xg(x)\mathrm {tr\,}\left\{ \rho \mathcal {I}_x^*\left[ {h(B)^\sim }\right] \right\} =\sum \limits _xg(x)\mathrm {tr\,}\left\{ \mathcal {I}_x(\rho )\left[ {h(B)^\sim }\right] \right\} \\= & {} \mathrm {tr\,}\left\{ \sum \limits _xg(x)\mathcal {I}_x(\rho )\left[ {h(B)^\sim }\right] \right\} =\mathrm {tr\,}\left\{ g(\mathcal {I})^\sim (\rho )\left[ {h(B)^\sim }\right] \right\} \end{aligned}$$

In a similar way we obtain

$$\begin{aligned} \Delta _\rho \left[ {f(A,B)}\right] =\mathrm {tr\,}\left\{ \rho \left( {\sum \limits _xg(x)\mathcal {I}_x^*\left[ {h(B)^\sim }\right] }\right) ^2\right\} -\langle {f(A,B)}\rangle _\rho ^2 \end{aligned}$$

If A and B are arbitrary observables, we define the observable B conditioned by A to be

$$\begin{aligned} (B\mid A)_y=\mathcal {I}_{\Omega _A}^*(B_y)=\sum \limits _{x\in \Omega _A}\mathcal {I}_x^*(B_y) \end{aligned}$$

where \(\Omega _{B\mid A}=\Omega _B\) [2, 3]. We interpret \((B\mid A)\) as the observable obtained by first measuring A without taking the outcome into account and then measuring B. If B is real-valued we have

$$\begin{aligned} (B\mid A)^\sim= & {} \sum \limits _yy(B\mid A)_y=\sum \limits _{x,y}y\mathcal {I}_x^*(B_y)=\mathcal {I}_{\Omega _A}^*(\widetilde{B} )\\ \langle {(B\mid A)}\rangle _\rho= & {} \sum \limits _yy\mathrm {tr\,}\left[ {\rho \mathcal {I}_{\Omega _A}^*(B_y)}\right] =\sum \limits _yy\mathrm {tr\,}\left[ {\overline{\mathcal {I}} (\rho )B_y}\right] =\mathrm {tr\,}\left[ {\overline{\mathcal {I}} (\rho )\widetilde{B}\,}\right] =\langle {B}\rangle _{\overline{\mathcal {I}} (\rho )}\\ \Delta _\rho \left[ {(B\mid A)}\right]= & {} \Delta _\rho \left[ {(B\mid A)^\sim }\right] =\Delta _\rho \left[ {\mathcal {I}_{\Omega _A}^*(\widetilde{B} )}\right] =\mathrm {tr\,}\left\{ \rho \left[ {\mathcal {I}_{\Omega _A}^*(\widetilde{B} )}\right] ^2\right\} -\left[ {\langle {B}\rangle _{\overline{\mathcal {I}} (\rho )}}\right] ^2 \end{aligned}$$

We now illustrate the theory of this section with some examples.

Example 5

The simplest example of an instrument is a trivial instrument \(\mathcal {I}_x(\rho )=\omega (x)\rho \) where \(\omega \) is a probability measure on the finite set \(\Omega _\mathcal {I}\). It is clear that \(\mathcal {I}\) measures the trivial observable \(A_x=\omega (x)I\). Let B be an arbitrary observable and let \(f:\Omega _A\times \Omega _B\rightarrow \mathbb {R}\). We then have

$$\begin{aligned} (A\circ B)_{(x,y)}= & {} \mathcal {I}_x^*(B_y)=\omega (x)B_y\\ f(A,B)_z= & {} f(A\circ B)_z=\sum \left\{ \omega (x)B_y: f(x,y)=z\right\} \end{aligned}$$

We conclude that

$$\begin{aligned} f(A,B)^\sim= & {} \sum \limits _{x,y}f(x,y)\omega (x)B_y\\ \langle {f(A,B)}\rangle _\rho= & {} \sum \limits _{x,y}f(x,y)\omega (x)\mathrm {tr\,} (\rho B_y)\\ \Delta _\rho \left[ {f(A,B)}\right]= & {} \mathrm {tr\,}\left\{ \rho \left[ {\sum \limits _{x,y}f(x,y)\omega (x)B_y}\right] ^2\right\} -\langle {f(A,B)}\rangle _\rho ^2 \end{aligned}$$

Moreover, since

$$\begin{aligned} (B\mid A)_y=\sum \limits _x\mathcal {I}_x^*(B_y)=\sum \limits _x\omega (x)B_y=B_y \end{aligned}$$

we have that \((B\mid A)=B\). \(\square \)
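As a brief illustration of Example 5, suppose f is a product function \(f(x,y)=g(x)h(y)\); this choice is made here only for illustration. Writing \(\mathbb {E}_\omega (g)=\sum \limits _xg(x)\omega (x)\) for the \(\omega \)-mean of g (notation introduced for this remark), the formulas above factor as

$$\begin{aligned} f(A,B)^\sim= & {} \sum \limits _{x,y}g(x)h(y)\omega (x)B_y=\mathbb {E}_\omega (g)\,h(B)^\sim \\ \langle {f(A,B)}\rangle _\rho= & {} \mathbb {E}_\omega (g)\,\langle {h(B)}\rangle _\rho \\ \Delta _\rho \left[ {f(A,B)}\right]= & {} \mathbb {E}_\omega (g)^2\,\Delta _\rho \left[ {h(B)}\right] \end{aligned}$$

so for a trivial instrument the first measurement contributes only the scalar factor \(\mathbb {E}_\omega (g)\).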

Example 6

Let \(A=\left\{ A_x: x\in \Omega _A\right\} \) and \(B=\left\{ B_y: y\in \Omega _B\right\} \) be arbitrary observables and let \(\mathcal {H}_x(\rho )=\mathrm {tr\,} (\rho A_x)\alpha _x\), \(\alpha _x\in \mathcal {S} (H)\), be a Holevo instrument [2, 3]. Then \(\mathcal {H}\) measures A because

$$\begin{aligned} \mathrm {tr\,}\left[ {\mathcal {H}_x(\rho )}\right] =\mathrm {tr\,}\left[ {\mathrm {tr\,} (\rho A_x)\alpha _x}\right] =\mathrm {tr\,} (\rho A_x) \end{aligned}$$

Since \(\mathcal {H}_x^*(a)=\mathrm {tr\,} (\alpha _xa)A_x\) for all \(x\in \Omega _A\) [2, 3], we have

$$\begin{aligned} (A\circ B)_{(x,y)}=\mathcal {H}_x^*(B_y)=\mathrm {tr\,} (\alpha _xB_y)A_x \end{aligned}$$

If \(f:\Omega _A\times \Omega _B\rightarrow \mathbb {R}\), we obtain the real-valued observable

$$\begin{aligned} f(A,B)_z=\sum \left\{ \mathrm {tr\,} (\alpha _xB_y)A_x: f(x,y)=z\right\} \end{aligned}$$

We conclude that

$$\begin{aligned} f(A,B)^\sim= & {} \sum \limits _{x,y}f(x,y)\mathcal {H}_x^*(B_y)=\sum \limits _{x,y}f(x,y)\mathrm {tr\,} (\alpha _xB_y)A_x\\ \langle {f(A,B)}\rangle _\rho= & {} \sum \limits _{x,y}f(x,y)\mathrm {tr\,} (\alpha _xB_y)\mathrm {tr\,} (\rho A_x)\\ \Delta _\rho \left[ {f(A,B)}\right]= & {} \sum \limits _{x,y,x',y'}f(x,y)f(x',y')\mathrm {tr\,}\left[ {\rho \mathrm {tr\,} (\alpha _xB_y)A_x\mathrm {tr\,} (\alpha _{x'}B_{y'})A_{x'}}\right] \\{} & {} \quad -\langle {f(A,B)}\rangle _\rho ^2\\= & {} \mathrm {tr\,}\left\{ \rho \left[ {\sum \limits _{x,y}f(x,y)\mathrm {tr\,} (\alpha _xB_y)A_x}\right] ^2\right\} -\langle {f(A,B)}\rangle _\rho ^2 \end{aligned}$$

Moreover, we have

$$\begin{aligned} (B\mid A)_y=\sum \limits _x\mathcal {H}_x^*(B_y)=\sum \limits _x\mathrm {tr\,} (\alpha _xB_y)A_x \square \end{aligned}$$
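The dual formula \(\mathcal {H}_x^*(a)=\mathrm {tr\,} (\alpha _xa)A_x\) used in Example 6 is quoted from [2, 3]. As a short consistency check, it satisfies the defining relation \(\mathrm {tr\,}\left[ {\rho \mathcal {O} ^*(C)}\right] =\mathrm {tr\,}\left[ {\mathcal {O} (\rho )C}\right] \) of the dual operation, since for every \(\rho \in \mathcal {S} (H)\)

$$\begin{aligned} \mathrm {tr\,}\left[ {\mathcal {H}_x(\rho )a}\right] =\mathrm {tr\,}\left[ {\mathrm {tr\,} (\rho A_x)\alpha _xa}\right] =\mathrm {tr\,} (\rho A_x)\mathrm {tr\,} (\alpha _xa)=\mathrm {tr\,}\left[ {\rho \,\mathrm {tr\,} (\alpha _xa)A_x}\right] =\mathrm {tr\,}\left[ {\rho \mathcal {H}_x^*(a)}\right] \end{aligned}$$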

Example 7

Let A, B be arbitrary observables and let \(\mathcal {L}\) be the Lüders instrument given by \(\mathcal {L}_x(\rho )=A_x^{1/2}\rho A_x^{1/2}\) [2, 3, 6]. Then

$$\begin{aligned} \mathrm {tr\,}\left[ {\mathcal {L}_x(\rho )}\right] =\mathrm {tr\,} (A_x^{1/2}\rho A_x^{1/2})=\mathrm {tr\,} (\rho A_x) \end{aligned}$$

so \(\mathcal {L}\) measures A. Since \(\mathcal {L}_x^*(a)=A_x^{1/2}aA_x^{1/2}\) [2, 3] we have

$$\begin{aligned} (A\circ B)_{(x,y)}=A_x^{1/2}B_yA_x^{1/2} \end{aligned}$$

If \(f:\Omega _A\times \Omega _B\rightarrow \mathbb {R}\), we obtain the real-valued observable

$$\begin{aligned} f(A,B)_z=\sum \left\{ A_x^{1/2}B_yA_x^{1/2}: f(x,y)=z\right\} \end{aligned}$$

We conclude that

$$\begin{aligned} f(A,B)^\sim= & {} \sum \limits _{x,y}f(x,y)A_x^{1/2}B_yA_x^{1/2}\\ \langle {f(A,B)}\rangle _\rho= & {} \sum \limits _{x,y}f(x,y)\mathrm {tr\,} (\rho A_x^{1/2}B_yA_x^{1/2})=\sum \limits _{x,y}f(x,y)\mathrm {tr\,} (A_x^{1/2}\rho A_x^{1/2}B_y)\\ \Delta _\rho \left[ {f(A,B)}\right]= & {} \mathrm {tr\,}\left\{ \rho \left[ {\sum \limits _{x,y}f(x,y)A_x^{1/2}B_yA_x^{1/2}}\right] ^2\right\} -\langle {f(A,B)}\rangle _\rho ^2 \end{aligned}$$

Moreover, we have

$$\begin{aligned} (B\mid A)_y=\sum \limits _x\mathcal {L}_x^*(B_y)=\sum \limits _xA_x^{1/2}B_yA_x^{1/2}\square \end{aligned}$$
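The Lüders product of Example 7 can be sketched numerically. The block below assumes, purely to keep the square roots elementary, that every effect \(A_x\) is a diagonal matrix. The particular unsharp observable `A`, the sharp observable `B` and the helper names are hypothetical choices for illustration.

```python
import math

def mul(X, Y):
    """Matrix product of square matrices given as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)]
            for row in X]

def add(X, Y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

def sqrt_diag(D):
    """Square root of a positive diagonal matrix (ASSUMPTION: D is diagonal)."""
    n = len(D)
    return [[math.sqrt(D[i][i]) if i == j else 0.0 for j in range(n)]
            for i in range(n)]

def luders_product(A, B):
    """(A o B)_(x,y) = A_x^{1/2} B_y A_x^{1/2} for diagonal effects A_x."""
    out = {}
    for x, Ax in A.items():
        root = sqrt_diag(Ax)
        for y, By in B.items():
            out[(x, y)] = mul(root, mul(By, root))
    return out

def conditioned(AB):
    """(B | A)_y = sum over x of (A o B)_(x,y)."""
    cond = {}
    for (x, y), E in AB.items():
        cond[y] = add(cond[y], E) if y in cond else E
    return cond

# Hypothetical unsharp diagonal observable A and sharp observable B on a qubit.
A = {0: [[0.7, 0.0], [0.0, 0.2]], 1: [[0.3, 0.0], [0.0, 0.8]]}
B = {+1: [[0.5, 0.5], [0.5, 0.5]], -1: [[0.5, -0.5], [-0.5, 0.5]]}

AB = luders_product(A, B)
B_given_A = conditioned(AB)
total = add(B_given_A[+1], B_given_A[-1])   # should be the identity
print(total)
print(B_given_A[+1])
```

Up to rounding, `total` is the identity matrix, as the general argument for \(A\circ B\) requires. The off-diagonal entries of `B_given_A[+1]` are smaller than those of `B[+1]`, so in this instance measuring A first yields a noisier version of B.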