
1 Basic Rules of Quantum Mechanics

We begin by listing a basic set of rules from which all statements in this section should be derived. The choice of this set is by no means unique, and the selection of the properties of quantum mechanics that are used as basic rules, leaving the rest as derived rules, is actually a matter of preference. Our choice here comprises five rules describing states, transformations, measurements, compositions, and causality.

The first of these rules covers the description of the states of a physical system. We call a state pure when it is impossible to regard that state as a probabilistic mixture of two or more different states.

Rule 1:

A physical system is associated with a Hilbert space \(\mathcal{H}\). Every pure state of this system is represented by a normalized vector \(\vert \phi \rangle \in \mathcal{H}\). For any normalized vector \(\vert \psi \rangle \in \mathcal{H}\), it is possible to prepare the system in the state represented by \(\vert \psi \rangle\).

To avoid complications, we assume in this section that the dimension \(d =\dim \mathcal{H}\) of the Hilbert space is finite. A physical system with a Hilbert space of dimension \(d\) is often called a \(d\)-level system. Rule 1 dictates that any appropriate instruction for preparation of the physical system leads to either a pure state represented by a single vector \(\vert \phi \rangle\) or a mixed state represented by an ensemble \(\{(p_{j},\vert \phi _{j}\rangle )\}\), which designates the situation where the system is prepared in state \(\vert \phi _{j}\rangle\) with probability \(p_{j}\). In either case, the representation is not unique: \(\vert \phi \rangle\) and \(e^{i\varphi }\vert \phi \rangle\) represent the same pure physical state, and the different descriptions \(\{(p_{j},\vert \phi _{j}\rangle )\}\) and \(\{(q_{i},\vert \psi _{i}\rangle )\}\) may both refer to the same mixed state. We will introduce an alternative representation of the states, which is unique, in Sect. 1.2.

The next two rules cover the input-output relations of feasible operations on a physical system prepared in state \(\vert \phi _{\mathrm{in}}\rangle\). A state transformation refers to the case where the output is the quantum state \(\vert \phi _{\mathrm{out}}\rangle\) of the system after the operation. Rule 2 dictates the feasibility of unitary transformations, which are, in a sense, a basic set of transformations.

Rule 2:

For any unitary operator \(\hat{U}\) on \(\mathcal{H}\), it is possible to implement a state transformation where every input state \(\vert \phi _{\mathrm{in}}\rangle \in \mathcal{H}\) evolves into state \(\vert \phi _{\mathrm{out}}\rangle =\hat{ U}\vert \phi _{\mathrm{in}}\rangle\).

When the output is a classical variable, we are then referring to a measurement. Rule 3 covers a basic set of measurements called (complete) orthogonal measurements.

Rule 3:

For any orthonormal basis \(\{\vert u_{j}\rangle \}_{j=1,\ldots,d}\) of \(\mathcal{H}\), it is possible to implement a measurement that produces the outcome \(j = 1,\ldots,d\) with probability \(p_{j} = \vert \langle u_{j}\vert \phi _{\mathrm{in}}\rangle \vert ^{2}\) when the system is in state \(\vert \phi _{\mathrm{in}}\rangle \in \mathcal{H}\) before the measurement is performed.
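As a minimal numerical sketch of Rules 2 and 3 for a qubit (\(d = 2\)), the following Python snippet applies a unitary to an input vector and then evaluates the outcome probabilities in an orthonormal basis. The helper names (`inner`, `apply`, `born_probabilities`) and the choice of the Hadamard matrix as the example unitary are illustrative assumptions rather than part of the rules themselves.

```python
import math

def inner(u, v):
    """Inner product <u|v>, conjugate-linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def apply(U, phi):
    """Return U|phi> for a matrix U given as a list of rows."""
    return [sum(U[r][c] * phi[c] for c in range(len(phi))) for r in range(len(U))]

def born_probabilities(basis, phi):
    """Rule 3: p_j = |<u_j|phi>|^2 for each basis vector u_j."""
    return [abs(inner(u, phi)) ** 2 for u in basis]

s = 1 / math.sqrt(2)
hadamard = [[s, s], [s, -s]]                    # example unitary (assumed choice)
standard_basis = [[1 + 0j, 0j], [0j, 1 + 0j]]   # orthonormal basis {|u_1>, |u_2>}

phi_in = [1 + 0j, 0j]
phi_out = apply(hadamard, phi_in)               # Rule 2: |phi_out> = U|phi_in>
probs = born_probabilities(standard_basis, phi_out)
print(probs)
```

The probabilities sum to one for any normalized input because the basis is orthonormal and the unitary preserves the norm.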

In this rule, we are not interested in the state of the measured system after the measurement is performed. The two rules above only refer to the feasibility of the limited sets of transformations and measurements. In general, a much wider variety of operations should be available on a physical system, and we will see the whole landscape of these operations in Sect. 1.4.

The next rule is a very special rule that allows us to weave the threads of Rules 1, 2 and 3 into a texture of quantum information with dazzling patterns and colors. This rule tells us how to apply the three rules above when dealing with multiple physical systems. Consider two physical systems, A and B, which are independently accessible. For example, the two systems are well separated in space, meaning that one can freely operate on system A without affecting system B at all. We may call this type of operation local. In this case, we can treat the whole of systems A and B together as a single physical system (a composite system AB), or we can focus on one of the two systems (a subsystem) with no interest in the other. Rule 4 provides the connection between these different viewpoints.

Rule 4:

Suppose that the subsystems A and B are associated with the Hilbert spaces \(\mathcal{H}_{A}\) and \(\mathcal{H}_{B}\), respectively. The composite system AB is then associated with a tensor-product space \(\mathcal{H}_{AB} = \mathcal{H}_{A} \otimes \mathcal{H}_{B}\). Local operations (e.g., state preparations, state transformations, measurements) are represented by the appropriate tensor products.

Specifically, preparation of system A in state \(\vert \phi \rangle _{A} \in \mathcal{H}_{A}\) and system B in state \(\vert \psi \rangle _{B} \in \mathcal{H}_{B}\) is equivalent to the preparation of a composite system AB in state \(\vert \phi \rangle _{A} \otimes \vert \psi \rangle _{B} \in \mathcal{H}_{AB}\). The state that can be written in this form is called a product state, and is often abbreviated as \(\vert \phi \rangle _{A}\vert \psi \rangle _{B}\) or even \(\vert \phi \psi \rangle _{AB}\). The unitary transformations \(\hat{U}_{A}\) on system A and \(\hat{V }_{B}\) on B result in the unitary transformation \(\hat{U}_{A} \otimes \hat{ V }_{B}\) on the composite system AB. Performing an orthogonal measurement with basis \(\{\vert u_{i}\rangle _{A}\}_{i=1,\ldots,d}\) on system A and another with basis \(\{\vert v_{j}\rangle _{B}\}_{j=1,\ldots,d'}\) on system B can be regarded as the performance of a single orthogonal measurement, where the outcome is represented by two numbers (i, j), carried out on the composite system AB with the orthonormal basis \(\{\vert u_{i}\rangle _{A} \otimes \vert v_{j}\rangle _{B}\}_{j=1,\ldots,d'}^{i=1,\ldots,d}\) of \(\mathcal{H}_{AB}\).

According to Rule 1, we should be able to prepare a state represented by any vector \(\vert \varPsi \rangle _{AB} \in \mathcal{H}_{AB}\), possibly by the tailoring of suitable interaction between systems A and B. These vectors include, for example, \((\vert u_{1}\rangle _{A}\vert v_{1}\rangle _{B} + \vert u_{2}\rangle _{A}\vert v_{2}\rangle _{B})/\sqrt{2}\), which can never be written in the form \(\vert \phi \rangle _{A} \otimes \vert \psi \rangle _{B}\). This type of state is called entangled. Similarly, a unitary operator \(\hat{U}_{AB}\) acting on \(\mathcal{H}_{AB}\) is not necessarily a product \(\hat{U}_{A} \otimes \hat{ V }_{B}\), and the corresponding global unitary transformation should be feasible. There are also global orthogonal measurements, for which the orthonormal basis is composed of entangled state vectors.
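To see why this vector admits no product form, one can compare coefficients. The following short check is a sketch in which the amplitudes \(a_{i}\) and \(b_{j}\) are notation introduced here. Write \(\vert \phi \rangle _{A} =\sum _{i}a_{i}\vert u_{i}\rangle _{A}\) and \(\vert \psi \rangle _{B} =\sum _{j}b_{j}\vert v_{j}\rangle _{B}\). The coefficient of \(\vert u_{i}\rangle _{A}\vert v_{j}\rangle _{B}\) in the product \(\vert \phi \rangle _{A} \otimes \vert \psi \rangle _{B}\) is then \(a_{i}b_{j}\), so the coefficients \(c_{i,j}\) of any product state satisfy

$$\displaystyle{ c_{1,1}c_{2,2} = (a_{1}b_{1})(a_{2}b_{2}) = (a_{1}b_{2})(a_{2}b_{1}) = c_{1,2}c_{2,1}. }$$

For the vector above, \(c_{1,1}c_{2,2} = 1/2\) whereas \(c_{1,2}c_{2,1} = 0\), so no choice of \(a_{i}\) and \(b_{j}\) reproduces it.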

Since the state of a composite system is not necessarily written as a product form, the definition of ‘the state of a subsystem’ is something of a moot point. Here, we adopt a definition with a clear operational meaning, called the marginal state of a subsystem, which is simply the state that the subsystem would be in if we discard all the other constituent subsystems. With regard to the marginal states, we assume the following.

Rule 5:

The marginal state of a subsystem is not changed by operating on other subsystems, as long as no information on the outcome of the operation is referred to.

This rule is expected to hold because there would otherwise be a test on system A alone that would give clues on what operations were performed on a remote system B without any communication between them. The rule sets a limitation on the physically allowed state transformations and measurements, which complements the fact that Rules 2 and 3 merely dictate what we can at least do.

2 Density Operators

In classical mechanics, a mixed state is simply regarded as a way to formulate an observer’s lack of knowledge of the true state of a system. In principle, it is always possible to assume that there is an omnipotent observer who knows the exact state (the pure state) of every system. In quantum mechanics, however, this simple picture does not hold. When a composite system is in a pure state \(\vert \varPsi \rangle _{AB}\), we cannot associate the state of the subsystem A with a single vector \(\vert \phi \rangle _{A} \in \mathcal{H}_{A}\) unless \(\vert \varPsi \rangle _{AB}\) is a product state. Therefore, it is not always possible to assume that every system is in a pure state at the same time. In this subsection, we determine how we can represent the state of a subsystem when it is a part of a composite system in a pure state \(\vert \varPsi \rangle _{AB}\). We will see that the intuitive representation using an ensemble \(\{(p_{j},\vert \phi _{j}\rangle )\}\) is redundant in the sense that different descriptions may refer to the same physical state. This motivates us to introduce a density operator to offer a better representation in this respect. By using a helpful property of bipartite pure states called Schmidt decomposition, we will show that there is a one-to-one correspondence between the density operators and the physical states.

2.1 Measurement on a Subsystem

Suppose that the composite system AB is initially prepared in a pure state \(\vert \varPsi \rangle _{AB}\), and an orthogonal measurement with a basis \(\{\vert v_{j}\rangle _{B}\}_{j=1,\ldots,d'}\) is then conducted on subsystem B, producing an outcome \(j\) with a probability \(p_{j}\). Let us derive a rule to calculate \(p_{j}\) and identify the state of the subsystem A that is conditioned on the value of \(j\).

Our strategy is to observe what happens if we perform a measurement with an arbitrary basis \(\{\vert u_{i}\rangle _{A}\}_{i=1,\ldots,d}\) on system A. Regardless of the temporal order of the measurements on A and B, Rules 3 and 4 dictate that the joint probability of the two outcomes (i, j) is given by \(p_{i,j} = \vert (_{A}\langle u_{i}\vert \otimes _{B}\langle v_{j}\vert )\vert \varPsi \rangle _{AB}\vert ^{2}\). Let us introduce the unnormalized vector \(\vert \tilde{\phi }_{j}\rangle _{A}:= _{B}\langle v_{j}\vert \vert \varPsi \rangle _{AB} \in \mathcal{H}_{A}\). We then have \(p_{i,j} = \vert _{A}\langle u_{i}\vert \tilde{\phi }_{j}\rangle _{A}\vert ^{2}\) and \(p_{j} =\sum _{ i=1}^{d}p_{i,j} =\sum _{ i=1}^{d}\vert _{A}\langle u_{i}\vert \tilde{\phi }_{j}\rangle _{A}\vert ^{2} = {}_{A}\langle \tilde{\phi }_{j}\vert \tilde{\phi }_{j}\rangle _{A}\), where the last equality follows from the completeness of the basis \(\{\vert u_{i}\rangle _{A}\}\). Using a normalized vector \(\vert \phi _{j}\rangle _{A}:= \vert \tilde{\phi }_{j}\rangle _{A}/\sqrt{p_{j}}\) (for \(p_{j} > 0\)), we obtain an expression for the conditional probability, \(p_{i\vert j}:= p_{i,j}/p_{j} = \vert _{A}\langle u_{i}\vert \phi _{j}\rangle _{A}\vert ^{2}\). Because the choice of the basis \(\{\vert u_{i}\rangle _{A}\}_{i=1,\ldots,d}\) was arbitrary, comparison of this relationship to Rule 3 shows that the state of the subsystem A conditioned on the outcome \(j\) must be a pure state, which is represented by the vector \(\vert \phi _{j}\rangle _{A}\). Noting that the measurement on B can be performed immediately after the preparation of \(\vert \varPsi \rangle _{AB}\), we arrive at the following theorem.

Theorem 1.

Suppose that a composite system AB is initially prepared in a pure state \(\vert \varPsi \rangle _{AB}\), and that an orthogonal measurement with a basis \(\{\vert v_{j}\rangle _{B}\}_{j=1,\ldots,d'}\) is performed on subsystem B. The outcome \(j\) then occurs with probability \(p_{j}\) and, conditioned on \(j\), the subsystem A behaves as if it was initially prepared in the pure state \(\vert \phi _{j}\rangle _{A}\), where

$$\displaystyle{ \sqrt{p_{j}}\vert \phi _{j}\rangle _{A} = _{B}\langle v_{j}\vert \vert \varPsi \rangle _{AB} }$$
(1.1)

holds.
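Equation (1.1) can be evaluated directly once the bipartite vector is stored as a coefficient table. The sketch below assumes a two-qubit example state and a representation `c[i][k]` for the amplitude of \(\vert i\rangle _{A}\vert k\rangle _{B}\); the function name `conditional_states` and the example state are hypothetical choices made for illustration.

```python
import math

def conditional_states(c, basis_b):
    """Apply Eq. (1.1): return a list of (p_j, phi_j) for system A.

    c[i][k] is the amplitude of |i>_A |k>_B in the bipartite state.
    """
    results = []
    for v in basis_b:
        # Unnormalized vector <v_j|Psi> in H_A.
        tilde = [sum(v[k].conjugate() * row[k] for k in range(len(v))) for row in c]
        p = sum(abs(a) ** 2 for a in tilde)          # p_j = <tilde|tilde>
        phi = [a / math.sqrt(p) for a in tilde] if p > 0 else None
        results.append((p, phi))
    return results

# Example state (assumed): sqrt(0.8)|0>|0> + sqrt(0.2)|1>|1>
psi = [[math.sqrt(0.8) + 0j, 0j], [0j, math.sqrt(0.2) + 0j]]
s = 1 / math.sqrt(2)
basis_pm = [[s + 0j, s + 0j], [s + 0j, -s + 0j]]     # {|+>, |->} on system B

outcomes = conditional_states(psi, basis_pm)
for j, (p, phi) in enumerate(outcomes):
    print(j, round(p, 6), phi)
```

In this example both outcomes are equally likely, and the two conditional states of A differ only in the sign of the \(\vert 1\rangle\) amplitude.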

2.2 Marginal State of a Subsystem

The argument in the previous subsection immediately provides a description of the marginal state of the subsystem A when the composite system AB is prepared in the pure state \(\vert \varPsi \rangle _{AB}\). If the value of the outcome \(j\) of the measurement on subsystem B is unavailable, then the state of system A after the measurement can be described by the ensemble \(\{(p_{j},\vert \phi _{j}\rangle _{A})\}_{j=1,\ldots,d'}\), where the probabilities \(\{p_{j}\}\) and the vectors \(\{\vert \phi _{j}\rangle _{A}\}\) are calculated from Eq. (1.1). From Rule 5, we see that the marginal state of subsystem A before the measurement was performed is also \(\{(p_{j},\vert \phi _{j}\rangle _{A})\}_{j=1,\ldots,d'}\).

On the one hand, this description is helpful because it is sufficient to allow calculation of the statistics of the outcomes of further operations on system A alone. On the other hand, the argument above also shows that the description of a mixed state by the ensemble is by no means unique. If we change the basis \(\{\vert v_{j}\rangle _{B}\}_{j=1,\ldots,d'}\) of the measurement to another basis, then the description of the state \(\{(p_{j},\vert \phi _{j}\rangle _{A})\}_{j=1,\ldots,d'}\) also changes through Eq. (1.1). This new ensemble should also be a valid representation of the same state.

Lemma 1.

Two ensembles, \(\{(p_{j},\vert \phi _{j}\rangle _{A})\}_{j=1,\ldots,d'}\) and \(\{(p'_{j},\vert \phi '_{j}\rangle _{A})\}_{j=1,\ldots,d'}\) , represent the same mixed state if a bipartite pure state \(\vert \varPsi \rangle _{AB}\) and orthonormal bases \(\{\vert v_{j}\rangle _{B}\}_{j=1,\ldots,d'}\) and \(\{\vert v'_{j}\rangle _{B}\}_{j=1,\ldots,d'}\) exist that satisfy

$$\displaystyle{ \sqrt{p_{j}}\vert \phi _{j}\rangle _{A} = _{B}\langle v_{j}\vert \vert \varPsi \rangle _{AB}\;\;\mathrm{and}\;\;\sqrt{p'_{j}}\vert \phi '_{j}\rangle _{A} = _{B}\langle v'_{j}\vert \vert \varPsi \rangle _{AB}. }$$
(1.2)

2.3 Density Operators

Consider a physical system that is associated with a Hilbert space \(\mathcal{H}\), and let us call an operator \(\hat{\rho }: \mathcal{H}\rightarrow \mathcal{H}\) a density operator when it is positive \((\hat{\rho }\geq 0)\) and of unit trace (\(\mathrm{Tr}\;\hat{\rho } = 1\)). We associate a mixed state of a system represented by the ensemble \(\{(q_{i},\vert \psi _{i}\rangle )\}_{i=1,\ldots,n}\) with a density operator given by

$$\displaystyle{ \hat{\rho }:=\sum _{ i=1}^{n}q_{ i}\vert \psi _{i}\rangle \langle \psi _{i}\vert. }$$
(1.3)

One immediate benefit of this representation by the density operator is that the marginal state that was discussed in Sect. 1.2.2 is represented by a unique density operator, i.e.,

$$\displaystyle{ \hat{\rho }_{A} =\sum _{ j=1}^{d'}p_{ j}\vert \phi _{j}\rangle _{A A}\langle \phi _{j}\vert =\sum _{ j=1}^{d'}p'_{ j}\vert \phi '_{j}\rangle _{AA}\langle \phi '_{j}\vert =\mathrm{ Tr}_{B}\vert \varPsi \rangle _{AB AB}\langle \varPsi \vert }$$
(1.4)

that holds under Eq. (1.2). This operator is called the marginal density operator of system A for the whole state \(\vert \varPsi \rangle _{AB}\).
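The equality of the three expressions in Eq. (1.4) can be checked numerically on a small case. The following sketch assumes a two-qubit example state stored as a coefficient table `c[i][k]` and compares the operators obtained from two different measurement bases on B with the partial trace; all function names are hypothetical helpers introduced here.

```python
import math

def partial_inner(c, v):
    """Unnormalized vector <v|Psi> in H_A; c[i][k] is the amplitude of |i>_A |k>_B."""
    return [sum(v[k].conjugate() * row[k] for k in range(len(v))) for row in c]

def ensemble_operator(c, basis_b):
    """Sum over j of p_j |phi_j><phi_j|, with sqrt(p_j)|phi_j> = <v_j|Psi>."""
    d = len(c)
    rho = [[0j] * d for _ in range(d)]
    for v in basis_b:
        t = partial_inner(c, v)
        for a in range(d):
            for b in range(d):
                rho[a][b] += t[a] * t[b].conjugate()
    return rho

def partial_trace_b(c):
    """Tr_B |Psi><Psi|: entry (a, b) equals sum_k c[a][k] * conj(c[b][k])."""
    d = len(c)
    return [[sum(x * y.conjugate() for x, y in zip(c[a], c[b])) for b in range(d)]
            for a in range(d)]

def close(m1, m2, tol=1e-12):
    """Entrywise comparison of two matrices within a tolerance."""
    return all(abs(x - y) < tol for r1, r2 in zip(m1, m2) for x, y in zip(r1, r2))

# Example state (assumed): sqrt(0.8)|0>|0> + sqrt(0.2)|1>|1>
psi = [[math.sqrt(0.8) + 0j, 0j], [0j, math.sqrt(0.2) + 0j]]
s = 1 / math.sqrt(2)
basis_01 = [[1 + 0j, 0j], [0j, 1 + 0j]]
basis_pm = [[s + 0j, s + 0j], [s + 0j, -s + 0j]]

rho_from_01 = ensemble_operator(psi, basis_01)
rho_from_pm = ensemble_operator(psi, basis_pm)
rho_marginal = partial_trace_b(psi)
print(close(rho_from_01, rho_marginal), close(rho_from_pm, rho_marginal))
```

The two ensembles consist of different vectors, yet both sums reproduce the same operator, which is what makes the density operator a convenient label for the marginal state.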

Because any positive operator \(\hat{\rho }\) with a unit trace can be written in a diagonal form \(\hat{\rho }=\sum _{i}\lambda _{i}\vert u_{i}\rangle \langle u_{i}\vert \) using nonnegative eigenvalues \(\{\lambda _{i}\}\) with \(\sum _{i}\lambda _{i} = 1\) and orthonormal eigenvectors \(\{\vert u_{i}\rangle \}\), \(\hat{\rho }\) is the density operator for an ensemble \(\{(\lambda _{i},\vert u_{i}\rangle )\}_{i}\). Therefore, any density operator is associated with at least one physical state.

When an orthogonal measurement with a basis \(\{\vert u_{j}\rangle \}_{j}\) is performed on a mixed state \(\{(q_{i},\vert \psi _{i}\rangle )\}_{i}\), the probability of the outcome j is calculated using Rule 3 to be \(p_{j} =\sum _{i}q_{i}\vert \langle u_{j}\vert \psi _{i}\rangle \vert ^{2} =\langle u_{j}\vert \hat{\rho }\vert u_{j}\rangle\). This shows that the statistics of the measurement outcome depend only on the density operator. This also shows that each physical state is associated with a single density operator. Consider two mixed states with different density operators \(\hat{\rho }\) and \(\hat{\rho }'(\neq \hat{\rho })\). Because \(\vert u\rangle \in \mathcal{H}\) exists with \(\langle u\vert (\hat{\rho }-\hat{\rho }')\vert u\rangle \neq 0\), a measurement leading to different statistics between the two states also exists. The two states are therefore distinct. This fact implies that the density operator can be determined using a map from the set of physical states. As shown earlier, this map is surjective.

The remaining question is whether this map is bijective. At this point, it might not be injective, i.e., different mixed states could be associated with the same density operator. We will provide the answer to this question in Sect. 1.2.5, after we discuss the important properties of bipartite pure states in Sect. 1.2.4.

2.4 Properties of Bipartite Pure States

First, we consider how a general bipartite pure state \(\vert \varPsi \rangle _{AB}\) can be written in terms of the orthonormal bases \(\{\vert u_{i}\rangle _{A}\}_{i}\) and \(\{\vert v_{j}\rangle _{B}\}_{j}\) for the subsystems A and B. Because \(\{\vert u_{i}\rangle _{A}\vert v_{j}\rangle _{B}\}_{i,j}\) is a basis of \(\mathcal{H}_{AB}\), it is always possible to decompose \(\vert \varPsi \rangle _{AB}\) as \(\vert \varPsi \rangle _{AB} =\sum _{i,j}c_{i,j}\vert u_{i}\rangle _{A}\vert v_{j}\rangle _{B}\). The special aspect of bipartite states is that a much simpler form of decomposition, \(\vert \varPsi \rangle _{AB} =\sum _{i}c_{i}\vert u_{i}\rangle _{A}\vert v_{i}\rangle _{B}\), is available if we select \(\{\vert u_{i}\rangle _{A}\}_{i}\) and \(\{\vert v_{j}\rangle _{B}\}_{j}\) appropriately for the given vector \(\vert \varPsi \rangle _{AB}\). This decomposition is called Schmidt decomposition, and it will be convenient to describe Schmidt decomposition in the form of the following theorem.

Theorem 2.

Let \(\vert \varPsi \rangle _{AB} \in \mathcal{H}_{AB} = \mathcal{H}_{A} \otimes \mathcal{H}_{B}\) be a normalized vector that represents a pure state of a bipartite system AB. Let \(\hat{\rho }_{A} =\mathrm{ Tr}_{B}\vert \varPsi \rangle _{ABAB}\langle \varPsi \vert \) be the marginal density operator of system A, and let \(s\) be the rank of \(\hat{\rho }_{A}\). For any orthonormal set of vectors \(\{\vert u_{i}\rangle _{A}\}_{i=1,\ldots,s} \subset \mathcal{H}_{A}\) that diagonalizes \(\hat{\rho }_{A}\) as \(\hat{\rho }_{A} =\sum _{ i=1}^{s}p_{i}\vert u_{i}\rangle _{AA}\langle u_{i}\vert \) with \(p_{i} > 0\) \((i = 1,\ldots,s)\), there is an orthonormal set of vectors \(\{\vert v_{i}\rangle _{B}\}_{i=1,\ldots,s} \subset \mathcal{H}_{B}\), such that

$$\displaystyle{ \vert \varPsi \rangle _{AB} =\sum _{ i=1}^{s}\sqrt{p_{ i}}\vert u_{i}\rangle _{A}\vert v_{i}\rangle _{B}. }$$
(1.5)

Proof.

Define the unnormalized vectors \(\vert \tilde{v}_{i}\rangle _{B}:= _{A}\langle u_{i}\vert \vert \varPsi \rangle _{AB}\). We then have \(\vert \varPsi \rangle _{AB} =\sum _{ i=1}^{s}\vert u_{i}\rangle _{A}\vert \tilde{v}_{i}\rangle _{B}\), because the component of \(\vert \varPsi \rangle _{AB}\) along any vector \(\vert u\rangle _{A}\) orthogonal to the support of \(\hat{\rho }_{A}\) has squared norm \(_{A}\langle u\vert \hat{\rho }_{A}\vert u\rangle _{A} = 0\). We see that \(_{B}\langle \tilde{v}_{i}\vert \tilde{v}_{j}\rangle _{B} =\mathrm{ Tr}\vert \tilde{v}_{j}\rangle _{BB}\langle \tilde{v}_{i}\vert = _{A}\langle u_{j}\vert \mathrm{Tr}_{B}(\vert \varPsi \rangle _{ABAB}\langle \varPsi \vert )\vert u_{i}\rangle _{A} = _{A}\langle u_{j}\vert \hat{\rho }_{A}\vert u_{i}\rangle _{A} = p_{i}\delta _{i,j}\), where \(\delta _{i,j} = 1\) if \(i = j\), and otherwise \(\delta _{i,j} = 0\). Thus, if we define \(\vert v_{i}\rangle _{B}:= \vert \tilde{v}_{i}\rangle _{B}/\sqrt{p_{i}}\), \(\{\vert v_{i}\rangle _{B}\}_{i}\) is an orthonormal set that satisfies Eq. (1.5). □ 

The number s is often called the Schmidt number of the state \(\vert \varPsi \rangle _{AB}\). If s is smaller than \(\dim \mathcal{H}_{A}\) or \(\dim \mathcal{H}_{B}\), we can always augment the orthonormal sets to form orthonormal bases.
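As a small worked instance of the construction in the proof (the particular two-qubit vector is an example chosen here, with \(\{\vert 0\rangle,\vert 1\rangle \}\) an orthonormal basis of each subsystem), take

$$\displaystyle{ \vert \varPsi \rangle _{AB} = \frac{1} {2}\left (\vert 0\rangle _{A}\vert 0\rangle _{B} + \vert 0\rangle _{A}\vert 1\rangle _{B} + \vert 1\rangle _{A}\vert 0\rangle _{B} -\vert 1\rangle _{A}\vert 1\rangle _{B}\right ). }$$

Its marginal density operator is \(\hat{\rho }_{A} =\hat{ 1}_{A}/2\), which is diagonal in \(\{\vert 0\rangle _{A},\vert 1\rangle _{A}\}\) with \(p_{1} = p_{2} = 1/2\), so that \(s = 2\). The partial inner products are \(\vert \tilde{v}_{1}\rangle _{B} = _{A}\langle 0\vert \vert \varPsi \rangle _{AB} = (\vert 0\rangle _{B} + \vert 1\rangle _{B})/2\) and \(\vert \tilde{v}_{2}\rangle _{B} = _{A}\langle 1\vert \vert \varPsi \rangle _{AB} = (\vert 0\rangle _{B} -\vert 1\rangle _{B})/2\), which are mutually orthogonal and each have squared norm \(1/2\). Dividing by \(\sqrt{1/2}\) gives the Schmidt form

$$\displaystyle{ \vert \varPsi \rangle _{AB} = \frac{1} {\sqrt{2}}\vert 0\rangle _{A}\,\frac{\vert 0\rangle _{B} + \vert 1\rangle _{B}} {\sqrt{2}} + \frac{1} {\sqrt{2}}\vert 1\rangle _{A}\,\frac{\vert 0\rangle _{B} -\vert 1\rangle _{B}} {\sqrt{2}}. }$$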

Next, we introduce a concept that is opposite to the concept of the marginal density operator for a bipartite pure state. For a given density operator \(\hat{\rho }_{A}\) of subsystem A, a purification of the density operator is defined to be a pure state \(\vert \varPhi \rangle _{AB}\) of the composite system AB that satisfies \(\mathrm{Tr}_{B}\vert \varPhi \rangle _{ABAB}\langle \varPhi \vert =\hat{\rho } _{A}\). In contrast to the marginal density operator, which is unique to a given state \(\vert \varPsi \rangle _{AB}\), the purification of a given density operator \(\hat{\rho }_{A}\) is not unique and there are many bipartite pure states that can be regarded as purifications of \(\hat{\rho }_{A}\). However, they are connected by a simple relation [1, 2] that is given as follows.

Theorem 3.

For any two purifications \(\vert \varPhi \rangle _{AB}\) , \(\vert \varPhi '\rangle _{AB} \in \mathcal{H}_{AB} = \mathcal{H}_{A} \otimes \mathcal{H}_{B}\) of the same density operator \(\hat{\rho }_{A}\) , there is a unitary operator \(\hat{V }_{B}: \mathcal{H}_{B} \rightarrow \mathcal{H}_{B}\) such that

$$\displaystyle{ \vert \varPhi '\rangle _{AB} = (\hat{1}_{A} \otimes \hat{ V }_{B})\vert \varPhi \rangle _{AB}. }$$
(1.6)

Proof.

When we write down a diagonal form \(\hat{\rho }_{A} =\sum _{ i=1}^{s}p_{i}\vert u_{i}\rangle _{AA}\langle u_{i}\vert \), Theorem 2 ensures that the purifications are decomposed as \(\vert \varPhi \rangle _{AB} =\sum _{ i=1}^{s}\sqrt{p_{i}}\vert u_{i}\rangle _{A}\vert v_{i}\rangle _{B}\) and \(\vert \varPhi '\rangle _{AB} =\sum _{ i=1}^{s}\sqrt{p_{i}}\vert u_{i}\rangle _{A}\vert v'_{i}\rangle _{B}\). Because \(\{\vert v_{i}\rangle _{B}\}_{i}\) and \(\{\vert v'_{i}\rangle _{B}\}_{i}\) are orthonormal sets, a unitary operator \(\hat{V }_{B}\) exists such that \(\vert v'_{i}\rangle _{B} =\hat{ V }_{B}\vert v_{i}\rangle _{B}\) for all i. □ 

This theorem is quite simple but has deeper consequences. Suppose that Alice holds system A and Bob holds system B, and assume that only Bob knows whether the system AB is in state \(\vert \varPhi \rangle _{AB}\) or in state \(\vert \varPhi '\rangle _{AB}\). There are then only two possible situations: (i) The marginal density operators of subsystem A are different for \(\vert \varPhi \rangle _{AB}\) and \(\vert \varPhi '\rangle _{AB}\), and thus Alice can locally distinguish state \(\vert \varPhi \rangle _{AB}\) from state \(\vert \varPhi '\rangle _{AB}\) to some extent. (ii) The marginal density operators of subsystem A are the same and according to Theorem 3, Bob can switch locally between state \(\vert \varPhi \rangle _{AB}\) and state \(\vert \varPhi '\rangle _{AB}\). As a result, we see that there is no situation whatsoever in which Alice is unable to distinguish between the two states locally and Bob is unable to switch between the states locally. This property has led to the no-go theorem for unconditionally secure bit commitment [3, 4].
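A concrete instance of case (ii), with the two states chosen here purely for illustration, is \(\vert \varPhi \rangle _{AB} = (\vert 0\rangle _{A}\vert 0\rangle _{B} + \vert 1\rangle _{A}\vert 1\rangle _{B})/\sqrt{2}\) and \(\vert \varPhi '\rangle _{AB} = (\vert 0\rangle _{A}\vert 0\rangle _{B} -\vert 1\rangle _{A}\vert 1\rangle _{B})/\sqrt{2}\). Both have the marginal density operator \(\hat{1}_{A}/2\) on system A, so Alice cannot tell them apart locally, and they are related by

$$\displaystyle{ \vert \varPhi '\rangle _{AB} = (\hat{1}_{A} \otimes \hat{ V }_{B})\vert \varPhi \rangle _{AB}\;\;\mathrm{with}\;\;\hat{V }_{B} = \vert 0\rangle _{BB}\langle 0\vert -\vert 1\rangle _{BB}\langle 1\vert, }$$

so Bob can switch between them by a unitary operation on his system alone.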

2.5 Physical States and Density Operators

We are now in a position to prove that there is a one-to-one correspondence between the physical states and the density operators. Consider two states represented by the ensembles \(\{(p_{j},\vert \phi _{j}\rangle _{A})\}_{j=1,\ldots,d}\) and \(\{(p'_{j},\vert \phi '_{j}\rangle _{A})\}_{j=1,\ldots,d'}\), which are associated with the same density operator \(\hat{\rho }_{A}\). We will show that these two states are in fact the same state [1, 2].

Without loss of generality, we may assume that \(d \leq d'\). If \(d < d'\), we can augment the ensemble \(\{(p_{j},\vert \phi _{j}\rangle _{A})\}_{j=1,\ldots,d}\) in an equivalent manner to \(\{(p_{j},\vert \phi _{j}\rangle _{A})\}_{j=1,\ldots,d'}\) by adding dummy states \(\vert \phi _{j}\rangle _{A}\) with \(p_{j} = 0\). Consider another system B with a Hilbert space \(\mathcal{H}_{B}\) with dimension \(d'\), and take an orthonormal basis \(\{\vert v_{j}\rangle _{B}\}_{j=1,\ldots,d'}\). We then define the bipartite states \(\vert \varPsi \rangle _{AB}:=\sum _{ j=1}^{d'}\sqrt{p_{j}}\vert \phi _{j}\rangle _{A}\vert v_{j}\rangle _{B}\) and \(\vert \varPsi '\rangle _{AB}:=\sum _{ j=1}^{d'}\sqrt{p'_{j}}\vert \phi '_{j}\rangle _{A}\vert v_{j}\rangle _{B}\), which both have \(\hat{\rho }_{A}\) as their marginal density operator. From Theorem 3, there is a unitary operator \(\hat{V }_{B}\) with \(\vert \varPsi '\rangle _{AB} = (\hat{1}_{A} \otimes \hat{ V }_{B})\vert \varPsi \rangle _{AB}\). We define another orthonormal basis \(\{\vert v'_{j}\rangle _{B}\}_{j=1,\ldots,d'}\) using \(\vert v'_{j}\rangle:=\hat{ V }_{B}^{\dag }\vert v_{j}\rangle\). It is then simple to confirm that the requisite of Lemma 1, Eq. (1.2), holds, and thus the two states are the same state. When combined with the previous observation in Sect. 1.2.3, we can conclude that:

There is a one-to-one correspondence between the set of physical states and the set of density operators.

Having established that the density operators are conceptually an ideal description of the physical states, it is natural to expect that the basic and derived rules will be equally well stated when using the density operators in place of vectors to represent the physical states. In fact, by carefully following the definition, we obtain the following list of formulas.

[Figure a: list of formulas restating the rules in terms of density operators]
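For reference, the standard density-operator forms that follow from Eqs. (1.3) and (1.4) together with Rules 2 to 4 can be sketched as below. This is a reconstruction under the definitions of this section; the symbols \(\hat{\rho }_{\mathrm{in}}\), \(\hat{\rho }_{\mathrm{out}}\), \(\hat{\rho }_{B}\), and \(\hat{\rho }_{AB}\) are notation assumed here, and the presentation may differ from the list referred to above.

$$\displaystyle{ \begin{array}{ll} \mbox{pure state }\vert \phi \rangle: &\hat{\rho }= \vert \phi \rangle \langle \phi \vert \\ \mbox{unitary transformation }\hat{U}: &\hat{\rho }_{\mathrm{out}} =\hat{ U}\hat{\rho }_{\mathrm{in}}\hat{U}^{\dag } \\ \mbox{orthogonal measurement }\{\vert u_{j}\rangle \}: &p_{j} =\langle u_{j}\vert \hat{\rho }\vert u_{j}\rangle \\ \mbox{independently prepared systems}: &\hat{\rho }_{AB} =\hat{\rho } _{A} \otimes \hat{\rho }_{B} \\ \mbox{marginal state of A}: &\hat{\rho }_{A} =\mathrm{ Tr}_{B}\,\hat{\rho }_{AB} \end{array} }$$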

Distinction is made between the pure and mixed states based simply on the rank of the density operator. The state is pure if and only if the rank of its density operator \(\hat{\rho }\) is 1, in which case it can be written as \(\hat{\rho }= \vert \phi \rangle \langle \phi \vert \) using the normalized vector \(\vert \phi \rangle\). The opposite extreme may be the case of the operators with maximal rank, which is equal to the dimension d of the Hilbert space. Among these operators, the state where \(\hat{\rho }=\hat{ 1}/d\) has the unique property of invariance under all unitary transformations, and is called the maximally mixed state.

Classification of the density operator can be related to the classification of the bipartite pure states through purification. The Schmidt number of any purification is equal to the rank of the density operator. The purification of a rank-one density operator, \(\hat{\rho }_{A} = \vert u\rangle _{AA}\langle u\vert \), is a product state in the form of \(\vert u\rangle _{A}\vert v\rangle _{B}\), while the purification of a nonpure density operator is an entangled state. The purification of a maximally mixed state is called a maximally entangled state. Under Schmidt decomposition of Eq. (1.5), a maximally entangled state \(\vert \varPhi \rangle _{AB}\) is written as

$$\displaystyle{ \vert \varPhi \rangle _{AB} = \frac{1} {\sqrt{d}}\sum _{i=1}^{d}\vert u_{ i}\rangle _{A}\vert v_{i}\rangle _{B}, }$$
(1.7)

where d is the dimension of \(\mathcal{H}_{A}\).

3 Qubits

The simplest of the physical systems is a two-level system that is associated with a Hilbert space of dimension 2, and is called a qubit. For a qubit, the general states, the orthogonal measurements, and the unitary transformations can be conveniently visualized using a three-dimensional image called the Bloch representation.

3.1 Pauli Operators

Consider a qubit and choose an orthonormal basis \(\{\vert 0\rangle,\vert 1\rangle \}\) of its Hilbert space \(\mathcal{H}\) as the standard basis. We define a set of three operators, called Pauli operators, as \(\hat{\sigma }_{x} =\hat{\sigma } _{1}:= \vert 0\rangle \langle 1\vert + \vert 1\rangle \langle 0\vert \), \(\hat{\sigma }_{y} =\hat{\sigma } _{2}:= -i\vert 0\rangle \langle 1\vert + i\vert 1\rangle \langle 0\vert \), and \(\hat{\sigma }_{z} =\hat{\sigma } _{3}:= \vert 0\rangle \langle 0\vert -\vert 1\rangle \langle 1\vert \). In the matrix representation under the standard basis, they are written as

$$\displaystyle{ \hat{\sigma }_{x} =\hat{\sigma } _{1} = \left (\begin{array}{cc} 0&1\\ 1 &0 \end{array} \right ),\;\hat{\sigma }_{y} =\hat{\sigma } _{2} = \left (\begin{array}{cc} 0& - i\\ i & 0 \end{array} \right ),\;\hat{\sigma }_{z} =\hat{\sigma } _{3} = \left (\begin{array}{cc} 1& 0\\ 0 & -1 \end{array} \right ). }$$
(1.8)

They satisfy the following commutation and anti-commutation relations:

$$\displaystyle{ [\hat{\sigma }_{i},\hat{\sigma }_{j}] = 2i\epsilon _{ijk}\hat{\sigma }_{k}\;\;\mathrm{and}\;\;\{\hat{\sigma }_{i},\hat{\sigma }_{j}\} = 2\delta _{i,j}\hat{1}, }$$
(1.9)

where \([\hat{A},\hat{B}] =\hat{ A}\hat{B} -\hat{ B}\hat{A}\) and \(\{\hat{A},\hat{B}\} =\hat{ A}\hat{B} +\hat{ B}\hat{A}\). The Levi-Civita symbol \(\epsilon _{ijk}\) is zero, except for \(\epsilon _{123} =\epsilon _{231} =\epsilon _{312} = 1\) and \(\epsilon _{321} =\epsilon _{132} =\epsilon _{213} = -1\), and the Einstein summation convention is used, so that the sum over the repeated index \(k\) is implicit.
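The relations in Eq. (1.9) can be verified by direct 2×2 matrix multiplication. The snippet below is a minimal sketch using plain Python lists; the helper names (`matmul`, `combine`, `scaled`, `levi_civita`, `relations_hold`) are hypothetical and introduced only for this check.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)] for r in range(2)]

def combine(a, b, sign):
    """Entrywise a + sign * b."""
    return [[a[r][c] + sign * b[r][c] for c in range(2)] for r in range(2)]

def scaled(z, a):
    """Entrywise z * a."""
    return [[z * a[r][c] for c in range(2)] for r in range(2)]

def levi_civita(i, j, k):
    """Levi-Civita symbol for indices in {1, 2, 3}."""
    return (i - j) * (j - k) * (k - i) // 2

identity = [[1, 0], [0, 1]]
zero = [[0, 0], [0, 0]]
sigma = {
    1: [[0, 1], [1, 0]],        # sigma_x
    2: [[0, -1j], [1j, 0]],     # sigma_y
    3: [[1, 0], [0, -1]],       # sigma_z
}

def relations_hold():
    """Check both relations of Eq. (1.9) for every index pair (i, j)."""
    for i in sigma:
        for j in sigma:
            ab, ba = matmul(sigma[i], sigma[j]), matmul(sigma[j], sigma[i])
            # Right-hand side of the commutator: 2i * sum_k eps_ijk sigma_k.
            rhs = zero
            for k in sigma:
                rhs = combine(rhs, scaled(2j * levi_civita(i, j, k), sigma[k]), 1)
            if combine(ab, ba, -1) != rhs:
                return False
            if combine(ab, ba, 1) != scaled(2 if i == j else 0, identity):
                return False
    return True

print(relations_hold())
```

Because all entries are small integers or multiples of the imaginary unit, the comparisons are exact and no numerical tolerance is needed.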

Together with \(\hat{\sigma }_{0}:=\hat{ 1}\), we have four self-adjoint and unitary operators. These satisfy the orthogonality relations,

$$\displaystyle{ \mathrm{Tr}(\hat{\sigma }_{\mu }\hat{\sigma }_{\nu }) = 2\delta _{\mu,\nu } }$$
(1.10)

for \(\mu,\nu = 0,1,2,3\). Every linear operator \(\hat{A}\) acting on \(\mathcal{H}\) is uniquely decomposed as \(\hat{A} = (P_{0}\hat{1} + P_{x}\hat{\sigma }_{x} + P_{y}\hat{\sigma }_{y} + P_{z}\hat{\sigma }_{z})/2\), where the four complex parameters \((P_{0},P_{x},P_{y},P_{z})\) can be determined using \(P_{0} =\mathrm{ Tr}(\hat{A})\), \(P_{x} =\mathrm{ Tr}(\hat{\sigma }_{x}\hat{A})\), \(P_{y} =\mathrm{ Tr}(\hat{\sigma }_{y}\hat{A})\), and \(P_{z} =\mathrm{ Tr}(\hat{\sigma }_{z}\hat{A})\). It is convenient to regard \(\boldsymbol{P}:= (P_{x},P_{y},P_{z})\) as a three-dimensional vector, and to define \(\hat{\boldsymbol{\sigma }}:= (\hat{\sigma }_{x},\hat{\sigma }_{y},\hat{\sigma }_{z})\) as well. We denote the inner product between these vectors as \(\boldsymbol{P}\cdot \hat{\boldsymbol{\sigma }}:= P_{x}\hat{\sigma }_{x} + P_{y}\hat{\sigma }_{y} + P_{z}\hat{\sigma }_{z}\), and the squared norm as \(\vert \boldsymbol{P}\vert ^{2}:= P_{x}^{2} + P_{y}^{2} + P_{z}^{2}\). Using the vector notation, we have

$$\displaystyle{ \hat{A} = (P_{0}\hat{1} +\boldsymbol{ P}\cdot \hat{\boldsymbol{\sigma }})/2 }$$
(1.11)

with \(P_{0} =\mathrm{ Tr}(\hat{A})\) and \(\boldsymbol{P} =\mathrm{ Tr}(\hat{\boldsymbol{\sigma }}\hat{A})\).

Because \(\hat{A}^{\dag } = (\bar{P}_{0}\hat{1} +\bar{\boldsymbol{ P}}\cdot \hat{\boldsymbol{\sigma }})/2\), \(\hat{A}\) is self-adjoint if and only if both \(P_{0}\) and \(\boldsymbol{P}\) are real. For a self-adjoint operator \(\hat{A}\), it is simple to show that \(\det (\hat{A}) = (P_{0}^{2} -\vert \boldsymbol{P}\vert ^{2})/4\), and that the two eigenvalues of \(\hat{A}\) are \((P_{0} \pm \vert \boldsymbol{P}\vert )/2\). Therefore, \(\hat{A}\) is positive if and only if \(\boldsymbol{P}\) is real and \(P_{0} \geq \vert \boldsymbol{P}\vert \).
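The decomposition of Eq. (1.11) and the eigenvalue formula can be illustrated on a concrete self-adjoint matrix. The sketch below assumes a 2×2 matrix given as nested tuples in the standard basis; the function name `pauli_coefficients` and the example matrix are hypothetical choices.

```python
import math

def pauli_coefficients(A):
    """Return (P0, Px, Py, Pz) such that A = (P0*1 + P.sigma)/2 for a 2x2 matrix A."""
    (a, b), (c, d) = A
    p0 = a + d            # Tr(A)
    px = b + c            # Tr(sigma_x A)
    py = 1j * (b - c)     # Tr(sigma_y A)
    pz = a - d            # Tr(sigma_z A)
    return p0, px, py, pz

# Example self-adjoint matrix (assumed for illustration).
A = ((0.75, 0.25 - 0.1j), (0.25 + 0.1j, 0.25))

# For a self-adjoint A the coefficients are real, so the real parts carry everything.
p0, px, py, pz = (complex(p).real for p in pauli_coefficients(A))
norm = math.sqrt(px ** 2 + py ** 2 + pz ** 2)

eigenvalues = ((p0 + norm) / 2, (p0 - norm) / 2)
det_direct = (A[0][0] * A[1][1] - A[0][1] * A[1][0]).real
det_formula = (p0 ** 2 - norm ** 2) / 4
print(eigenvalues, det_direct, det_formula)
```

Both eigenvalues are nonnegative here because \(P_{0} \geq \vert \boldsymbol{P}\vert \), and their product agrees with the determinant formula.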

3.2 General States of a Qubit

Because a density operator \(\hat{\rho }\) is positive and has a unit trace, application of the decomposition of Eq. (1.11) leads to

$$\displaystyle{ \hat{\rho }= (\hat{1} +\boldsymbol{ P}\cdot \hat{\boldsymbol{\sigma }})/2 }$$
(1.12)

where the real vector \(\boldsymbol{P} =\mathrm{ Tr}(\hat{\boldsymbol{\sigma }}\hat{\rho })\) satisfies \(\vert \boldsymbol{P}\vert \leq 1\). We see that the density operators, and thus the general states of a qubit, are uniquely represented by three-dimensional real vectors \(\boldsymbol{P} = (P_{x},P_{y},P_{z})\) with lengths no greater than unity. These vectors are called the Bloch vectors, and representation of the qubit states using these Bloch vectors is called Bloch representation. As shown in Fig. 1.1a, a Bloch vector is visualized in an xyz-Cartesian coordinate system as an arrow stemming from the origin and reaching a point \((P_{x},P_{y},P_{z})\) on or inside of a sphere of unit radius, which is called a Bloch sphere.

Fig. 1.1 (a) Bloch sphere and a Bloch vector. The six pure states lying on the three axes, where \(\vert \pm \rangle:= (\vert 0\rangle \pm \vert 1\rangle )/\sqrt{2}\) and \(\vert i\pm \rangle:= (\vert 0\rangle \pm i\vert 1\rangle )/\sqrt{2}\), are also shown. (b) A pair of pure states \(\vert \phi \rangle\) and \(\vert \psi \rangle\), with \(\vert \langle \phi \vert \psi \rangle \vert =\cos (\theta /2)\)

As shown in Sect. 1.2.5, the rank of \(\hat{\rho }\) is 1 when it is a pure state, and for a qubit this implies that the smaller of the eigenvalues of \(\hat{\rho }\), \((1 -\vert \boldsymbol{P}\vert )/2\), is zero. A pure state is thus represented by a Bloch vector of length \(\vert \boldsymbol{P}\vert = 1\), with the vector tip reaching the Bloch sphere. For a mixed (and nonpure) state, the length of the Bloch vector is shorter (\(\vert \boldsymbol{P}\vert < 1\)). The maximally mixed state with \(\hat{\rho }=\hat{ 1}/2\) is represented by the zero vector \(\boldsymbol{P} =\boldsymbol{ 0}\).

Bloch vectors should not be confused with the vectors of the Hilbert space. Bloch vectors belong to a three-dimensional real vector space, while the Hilbert space of a qubit is a complex two-dimensional vector space. Consider two pure states, \(\hat{\rho }_{\phi } = \vert \phi \rangle \langle \phi \vert \) and \(\hat{\rho }_{\psi } = \vert \psi \rangle \langle \psi \vert \), with Bloch vectors \(\boldsymbol{P}_{\phi }\) and \(\boldsymbol{P}_{\psi }\), respectively. When \(\boldsymbol{P}_{\phi } \cdot \boldsymbol{ P}_{\psi } =\cos \theta\), then the angle between the two Bloch vectors is \(\theta\) (see Fig. 1.1b). In contrast, based on Eq. (1.10), we have \(\vert \langle \phi \vert \psi \rangle \vert ^{2} =\mathrm{ Tr}(\hat{\rho }_{\phi }\hat{\rho }_{\psi }) = (1 +\boldsymbol{ P}_{\phi } \cdot \boldsymbol{ P}_{\psi })/2 =\cos ^{2}(\theta /2)\), which implies that the angle between the two vectors of the Hilbert space is \(\theta /2\). For two orthogonal pure states, \(\theta /2 =\pi /2\) implies that the corresponding pair of Bloch vectors point in opposite directions.
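The half-angle relation between Bloch vectors and Hilbert-space vectors can be illustrated with a short numerical sketch. The helper names `ket`, `density`, and `bloch_vector` and the choice of the states \(\vert 0\rangle\) and \(\vert +\rangle\) are assumptions made for illustration.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)

def ket(amplitudes):
    """Normalized state vector from a list of amplitudes."""
    v = np.asarray(amplitudes, dtype=complex)
    return v / np.linalg.norm(v)

def density(psi):
    """Projector |psi><psi|."""
    return np.outer(psi, psi.conj())

def bloch_vector(rho):
    """P = Tr(sigma rho) for a qubit density operator rho."""
    return np.real([np.trace(s @ rho) for s in (SX, SY, SZ)])

phi, psi = ket([1, 0]), ket([1, 1])  # |0> and |+>
P_phi = bloch_vector(density(phi))
P_psi = bloch_vector(density(psi))

theta = np.arccos(P_phi @ P_psi)     # angle between the two Bloch vectors
overlap = abs(np.vdot(phi, psi))     # |<phi|psi>|

print("theta =", theta, " |<phi|psi>| =", overlap, " cos(theta/2) =", np.cos(theta / 2))
```

The same helpers give opposite Bloch vectors for the orthogonal pair \(\vert 0\rangle\), \(\vert 1\rangle\) and the zero vector for \(\hat{1}/2\).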

3.3 Orthogonal Measurement on a Qubit

Let us interpret an orthogonal measurement using the basis \(\{\vert u_{0}\rangle,\vert u_{1}\rangle \}\) in terms of Bloch representation. We define the Bloch vectors \(\boldsymbol{P}_{0}\) and \(\boldsymbol{P}_{1}\) for the basis states using \(\hat{\rho }_{j}:= \vert u_{j}\rangle \langle u_{j}\vert = (\hat{1} +\boldsymbol{ P}_{j}\cdot \hat{\boldsymbol{\sigma }})/2\). Because the orthogonality \(\langle u_{0}\vert u_{1}\rangle = 0\) implies that \(\boldsymbol{P}_{1} = -\boldsymbol{P}_{0}\), the orthogonal measurement is completely characterized by the unit vector \(\boldsymbol{P}_{0}\), which is a direction in the three-dimensional space.

Suppose that the measured qubit is initially in the state given by \(\hat{\rho }= (\hat{1} +\boldsymbol{ P}\cdot \hat{\boldsymbol{\sigma }})/2\). The probabilities of outcome j = 0, 1 are then calculated to be \(p_{j} =\langle u_{j}\vert \hat{\rho }\vert u_{j}\rangle =\mathrm{ Tr}(\hat{\rho }_{j}\hat{\rho }) = (1 +\boldsymbol{ P}_{j} \cdot \boldsymbol{ P})/2\), leading to

$$\displaystyle{ p_{0} = (1 +\boldsymbol{ P}_{0} \cdot \boldsymbol{ P})/2\;\;\mathrm{and}\;\;p_{1} = (1 -\boldsymbol{ P}_{0} \cdot \boldsymbol{ P})/2. }$$
(1.13)

This shows that the probabilities are essentially determined by projection of the measured Bloch vector \(\boldsymbol{P}\) along the direction \(\boldsymbol{P}_{0}\) that was specified by the measurement, with appropriate scaling (see Fig. 1.2a).
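As a quick check of Eq. (1.13), the sketch below compares the probabilities computed directly from \(\langle u_{j}\vert \hat{\rho }\vert u_{j}\rangle\) with those obtained from the Bloch vectors. The input Bloch vector and the measurement basis \(\{\vert +\rangle,\vert -\rangle \}\) are hypothetical choices for illustration.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
PAULI = (SX, SY, SZ)

def rho_from_bloch(p):
    """Density operator (1 + P.sigma)/2 of Eq. (1.12)."""
    return (I2 + sum(pk * s for pk, s in zip(p, PAULI))) / 2

def bloch_vector(rho):
    """P = Tr(sigma rho)."""
    return np.real([np.trace(s @ rho) for s in PAULI])

P = np.array([0.3, 0.0, 0.4])                    # a mixed state, |P| = 0.5
rho = rho_from_bloch(P)

# Measurement basis {|u_0>, |u_1>} = {|+>, |->}.
u = [np.array([1, 1]) / np.sqrt(2), np.array([1, -1]) / np.sqrt(2)]

# Probabilities from the definition p_j = <u_j| rho |u_j>.
p_direct = [np.real(np.vdot(uj, rho @ uj)) for uj in u]

# Probabilities from Eq. (1.13), using the Bloch vector P_0 of |u_0>.
P0 = bloch_vector(np.outer(u[0], u[0].conj()))
p_bloch = [(1 + P0 @ P) / 2, (1 - P0 @ P) / 2]

print(p_direct, p_bloch)
```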

Fig. 1.2

(a) Orthogonal measurement with a basis \(\{\vert u_{j}\rangle \}_{j=0,1}\). The Bloch vector \(\boldsymbol{P}\) of the input state determines the probability \(p_{j}\) of the outcome j. (b) The Bloch vector rotates in a unitary transformation with \(\hat{U}(\boldsymbol{n},\varphi )\)

3.4 Unitary Transformation on a Qubit

We now discuss how the Bloch vector of a physical state changes under a unitary transformation. We limit ourselves to the unitary operators \(\hat{U}\) that belong to a set called SU(2), and are characterized by the condition \(\det \hat{U} = 1\). This does not lose generality because \(\hat{U}(\theta ):= e^{i\theta /2}\hat{U}\) for real \(\theta\) transforms the state \(\hat{\rho }\) to \(\hat{U}(\theta )\hat{\rho }\hat{U}(\theta )^{\dag } =\hat{ U}\hat{\rho }\hat{U}^{\dag }\), which is independent of \(\theta\). All \(\hat{U}(\theta )\) physically represent the same transformation, and we are thus allowed to choose one that satisfies \(\det \hat{U}(\theta ) = e^{i\theta }\det \hat{U} = 1\) and thus \(\hat{U}(\theta ) \in SU(2)\). Note that the correspondence is not one-to-one but in fact two-to-one, because \(-\hat{U}(\theta ) =\hat{ U}(\theta +2\pi )\) also belongs to SU(2).

The elements of SU(2) are conveniently parametrized as follows. Any \(\hat{U} \in SU(2)\) can be written in the diagonal form \(\hat{U} = e^{-i\varphi /2}\vert u_{0}\rangle \langle u_{0}\vert + e^{i\varphi /2}\vert u_{1}\rangle \langle u_{1}\vert \) with \(\langle u_{0}\vert u_{1}\rangle = 0\). We may then write \(\hat{U} =\exp (-i\varphi \hat{S}/2)\) with \(\hat{S}:= \vert u_{0}\rangle \langle u_{0}\vert -\vert u_{1}\rangle \langle u_{1}\vert \), which is self-adjoint, traceless, and has eigenvalues of ± 1. Using the decomposition of Eq. (1.11), we find that \(\hat{S}\) is written as \(\hat{S} =\boldsymbol{ P} \cdot \hat{\boldsymbol{\sigma }} /2\) with \(\vert \boldsymbol{P}\vert = 2\). By introducing a unit vector \(\boldsymbol{n}:=\boldsymbol{ P}/2\), we conclude that the elements of SU(2) can be parametrized as

$$\displaystyle{ \hat{U}(\boldsymbol{n},\varphi ):=\exp [-i(\varphi /2)\boldsymbol{n}\cdot \hat{\boldsymbol{\sigma }}]. }$$
(1.14)

We are interested in how the Bloch vector evolves when the density operator evolves under a unitary transformation. Noting that \(\hat{U}(\boldsymbol{n},\varphi +\varphi ') =\hat{ U}(\boldsymbol{n},\varphi ')\hat{U}(\boldsymbol{n},\varphi )\) holds in general, we see that it is sufficient to focus on the transformations given by \(\hat{U}(\boldsymbol{n},\delta \varphi )\), where \(\delta \varphi\) is infinitesimally small. A general transformation \(\hat{U}(\boldsymbol{n},\varphi )\) is then understood as a result of sequential application of these infinitesimal transformations.

Under the transformation \(\hat{U}(\boldsymbol{n},\delta \varphi )\), a Bloch vector \(\boldsymbol{P}:=\mathrm{ Tr}(\hat{\boldsymbol{\sigma }}\hat{\rho })\) evolves into \(\boldsymbol{P} +\delta \boldsymbol{ P} =\mathrm{ Tr}(\hat{\boldsymbol{\sigma }}\hat{\rho }')\) with \(\hat{\rho }':=\hat{ U}(\boldsymbol{n},\delta \varphi )\hat{\rho }\hat{U}(\boldsymbol{n},\delta \varphi )^{\dag }\). Using \(\hat{U}(\boldsymbol{n},\delta \varphi )\mathop{\cong}\hat{1} - i(\delta \varphi /2)\boldsymbol{n}\cdot \hat{\boldsymbol{\sigma }}\) and collecting the terms up to the first order in \(\delta \varphi\), we find that \(\delta \boldsymbol{P} =\mathrm{ Tr}(\hat{\boldsymbol{\sigma }}\hat{\rho }') -\mathrm{ Tr}(\hat{\boldsymbol{\sigma }}\hat{\rho }) = -i(\delta \varphi /2)\mathrm{Tr}([\hat{\boldsymbol{\sigma }},\boldsymbol{n}\cdot \hat{\boldsymbol{\sigma }}]\hat{\rho })\). From Eq. (1.9), we obtain \([\hat{\sigma }_{i},n_{j}\hat{\sigma }_{j}] = 2i\epsilon _{ijk}n_{j}\hat{\sigma }_{k}\) under the Einstein notation, which implies that \([\hat{\boldsymbol{\sigma }},\boldsymbol{n}\cdot \hat{\boldsymbol{\sigma }}] = 2i\boldsymbol{n}\times \hat{\boldsymbol{\sigma }}\). Therefore, \(\hat{U}(\boldsymbol{n},\delta \varphi )\) induces an infinitesimal change in the Bloch vector, which is given by

$$\displaystyle{ \delta \boldsymbol{P} =\delta \varphi \boldsymbol{ n} \times \boldsymbol{ P}. }$$
(1.15)

This is equal to the infinitesimal change in rotation around axis \(\boldsymbol{n}\) by the angle \(\delta \varphi\). We thus conclude that the Bloch vectors rotate around axis \(\boldsymbol{n}\) by angle \(\varphi\) under the general unitary transformation \(\hat{U}(\boldsymbol{n},\varphi )\) (see Fig. 1.2b). Notable examples include the Z gate with \(\hat{U}((0,0,1),\pm \pi ) = \mp i\hat{\sigma }_{z}\), the X gate with \(\hat{U}((1,0,0),\pm \pi ) = \mp i\hat{\sigma }_{x}\), and the Hadamard gate with \(\hat{U}((2^{-1/2},0,2^{-1/2}),\pm \pi ) = \mp 2^{-1/2}i(\hat{\sigma }_{z} +\hat{\sigma } _{x})\).
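The rotation picture can be verified numerically. The sketch below uses the closed form \(\hat{U}(\boldsymbol{n},\varphi ) =\cos (\varphi /2)\hat{1} - i\sin (\varphi /2)\boldsymbol{n}\cdot \hat{\boldsymbol{\sigma }}\), which follows from \((\boldsymbol{n}\cdot \hat{\boldsymbol{\sigma }})^{2} =\hat{ 1}\), and compares the transformed Bloch vector with the standard Rodrigues rotation formula. The function names and the sample values are assumptions made for illustration.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
PAULI = (SX, SY, SZ)

def su2(n, phi):
    """U(n, phi) = exp[-i (phi/2) n.sigma] for a unit vector n, as in Eq. (1.14)."""
    n_sigma = sum(nk * s for nk, s in zip(n, PAULI))
    return np.cos(phi / 2) * I2 - 1j * np.sin(phi / 2) * n_sigma

def rho_from_bloch(p):
    return (I2 + sum(pk * s for pk, s in zip(p, PAULI))) / 2

def bloch_vector(rho):
    return np.real([np.trace(s @ rho) for s in PAULI])

def rotate(p, n, phi):
    """Rodrigues formula: rotate the real vector p about the unit axis n by angle phi."""
    p, n = np.asarray(p, dtype=float), np.asarray(n, dtype=float)
    return (p * np.cos(phi) + np.cross(n, p) * np.sin(phi)
            + n * np.dot(n, p) * (1 - np.cos(phi)))

# Rotating the Bloch vector of |+> about the z axis by pi/2.
n, phi, P = np.array([0.0, 0.0, 1.0]), np.pi / 2, np.array([1.0, 0.0, 0.0])
U = su2(n, phi)
P_quantum = bloch_vector(U @ rho_from_bloch(P) @ U.conj().T)
P_rotated = rotate(P, n, phi)
print(P_quantum, P_rotated)
```

The same `su2` helper reproduces the gate identities quoted above, for example \(\hat{U}((2^{-1/2},0,2^{-1/2}),\pi ) = -2^{-1/2}i(\hat{\sigma }_{z} +\hat{\sigma } _{x})\).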

4 Generalized Measurements and Quantum Operations

The basic set of rules that we adopted in Sect. 1.1 dictated that we can carry out unitary transformations and orthogonal measurements on a physical system (Rules 2 and 3). Here, we extend the repertoire of what we can do to a physical system by using an auxiliary system as a workspace. We also clarify how far this extension goes, and draw a clear line between what we can and cannot do.

4.1 Use of Auxiliary Systems

Suppose that we want to operate on a physical system A. Let \(\hat{\rho }_{\mathrm{in}}\) be the density operator for the initial state of the system A. We first prepare an auxiliary system E, which has a Hilbert space \(\mathcal{H}_{E}\) of dimension s, in a fixed pure state \(\vert \phi _{\mathrm{ini}}\rangle _{E}\). We then let the systems A and E interact with each other such that the unitary transformation described by the unitary operator \(\hat{U}_{AE}: \mathcal{H}_{A} \otimes \mathcal{H}_{E} \rightarrow \mathcal{H}_{A} \otimes \mathcal{H}_{E}\) occurs. Finally, we perform an orthogonal measurement on system E with an orthonormal basis \(\{\vert j\rangle _{E}\}_{j=1,\ldots,s}\) of \(\mathcal{H}_{E}\). The output of the operation is the classical variable j and the final quantum state \(\hat{\rho }_{\mathrm{out}}^{(j)}\) of system A, which may depend on the value of j. This can thus be regarded as conducting a state transformation and performing a measurement at the same time. Using the rules that were summarized in Sect. 1.2.5, we can easily show how the final state \(\hat{\rho }_{\mathrm{out}}^{(j)}\) and the probability \(p_{j}\) of obtaining j are related to the initial state:

$$\displaystyle{ p_{j}\hat{\rho }_{\mathrm{out}}^{(j)} = _{ E}\langle j\vert \hat{U}_{AE}(\hat{\rho }_{\mathrm{in}} \otimes \vert \phi _{\mathrm{ini}}\rangle _{EE}\langle \phi _{\mathrm{ini}}\vert )\hat{U}_{AE}^{\dag }\vert j\rangle _{ E}. }$$
(1.16)

We sometimes encounter a situation where the input and the output are different physical systems. For example, in the photoelectric effect, light is incident on a metal but an electron comes out of the metal. In such a case, we would regard the light field as the input system A, and the metal, including the electron that is eventually emitted, as the auxiliary system E. The whole system is the composite of A and E. The output system, i.e., the electron, is a subsystem of the composite system AE, and we call it system A′. The rest of system AE is then called system E′. In short, we have introduced two different ways to decompose the entire system into two subsystems, AE and A′E′. Mathematically, this corresponds to an equivalence relation \(\mathcal{H}_{A} \otimes \mathcal{H}_{E} = \mathcal{H}_{A'} \otimes \mathcal{H}_{E'}\).

We can now generalize the strategy for use of an auxiliary system to include cases where the output system is not necessarily the same as the input system, as shown in Fig. 1.3. It is convenient to regard the unitary operator \(\hat{U}_{AE}\) as a linear map \(\hat{U}: \mathcal{H}_{A} \otimes \mathcal{H}_{E} \rightarrow \mathcal{H}_{A'} \otimes \mathcal{H}_{E'}\), where we dropped the subscript AE. Let s′ be the dimension of \(\mathcal{H}_{E'}\). The orthogonal measurement is performed on system E′ with an orthonormal basis \(\{\vert j\rangle _{E'}\}_{j=1,\ldots,s'}\). Equation (1.16) is then generalized as

$$\displaystyle{ p_{j}\hat{\rho }_{\mathrm{out}}^{(j)} = _{ E'}\langle j\vert \hat{U}(\hat{\rho }_{\mathrm{in}} \otimes \vert \phi _{\mathrm{ini}}\rangle _{EE}\langle \phi _{\mathrm{ini}}\vert )\hat{U}^{\dag }\vert j\rangle _{ E'}. }$$
(1.17)

Fig. 1.3

Use of an auxiliary system in operation on the physical system A. An auxiliary system E is prepared in a fixed pure state \(\vert \phi _{\mathrm{ini}}\rangle _{E}\), and the unitary transformation \(\hat{U}\) is applied to systems A and E. System A′, which is part of the whole system AE, is released as an output. The remaining system, E′, is measured to produce the outcome j

It is convenient to introduce the operators \(\hat{M}^{(j)}: \mathcal{H}_{A} \rightarrow \mathcal{H}_{A'}\), which are defined by

$$\displaystyle{ \hat{M}^{(j)} = _{ E'}\langle j\vert \hat{U}\vert \phi _{\mathrm{ini}}\rangle _{E}. }$$
(1.18)

Using the relation \(\sum _{j=1}^{s'}\vert j\rangle _{E'E'}\langle j\vert =\hat{ 1}_{E'}\), we see that the operators satisfy the normalization condition

$$\displaystyle{ \sum _{j=1}^{s'}\hat{M}^{(j)\dag }\hat{M}^{(j)} =\hat{ 1}_{ A}. }$$
(1.19)

The operators \(\{\hat{M}^{(j)}: \mathcal{H}_{A} \rightarrow \mathcal{H}_{A'}\}\) that satisfy the above relationship are often called Kraus operators. Using these operators, Eq. (1.17) can be simplified as

$$\displaystyle{ p_{j}\hat{\rho }_{\mathrm{out}}^{(j)} =\hat{ M}^{(j)}\hat{\rho }_{ \mathrm{in}}\hat{M}^{(j)\dag }, }$$
(1.20)

where the input-output relationship is stated without any reference to the auxiliary systems E and E′.

In the above argument, we started with a given operator \(\hat{U}\) that represented a unitary transformation of the composite system to determine the Kraus operators for the simplified relationship of Eq. (1.20). As we will see, this process can be reversed, i.e., for any given set of Kraus operators \(\{\hat{M}^{(j)}\}\) that satisfies Eq. (1.19), there is (see Footnote 3) a unitary operator \(\hat{U}\) that satisfies Eq. (1.18). Let \(\{\vert u_{i}\rangle _{A}\}_{i=1,\ldots,d}\) be an orthonormal basis of \(\mathcal{H}_{A}\). Then, \(\{\vert u_{i}\rangle _{A} \otimes \vert \phi _{\mathrm{ini}}\rangle _{E}\}_{i=1,\ldots,d}\) is an orthonormal set. We define \(\vert v_{i}\rangle \in \mathcal{H}_{A'} \otimes \mathcal{H}_{E'}\) by \(\vert v_{i}\rangle:=\sum _{ j=1}^{s'}\hat{M}^{(j)}\vert u_{i}\rangle \otimes \vert j\rangle _{E'}\). From Eq. (1.19), it can be shown that \(\{\vert v_{i}\rangle \}_{i=1,\ldots,d}\) is an orthonormal set. There is thus a unitary operator \(\hat{U}: \mathcal{H}_{A} \otimes \mathcal{H}_{E} \rightarrow \mathcal{H}_{A'} \otimes \mathcal{H}_{E'}\) that connects the two orthonormal sets as \(\vert v_{i}\rangle =\hat{ U}\vert u_{i}\rangle _{A} \otimes \vert \phi _{\mathrm{ini}}\rangle _{E}\), which leads to Eq. (1.18). We thus conclude that any input-output relationship dictated by the Kraus operators as shown in Eq. (1.20) can be physically implemented by attaching an auxiliary system E, applying a suitable unitary transformation over the composite system, and then measuring the subsystem E′.
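The reverse construction described above can be sketched in code. The example below assumes, for simplicity, that A′ and E′ have the same dimensions as A and E and that \(\vert \phi _{\mathrm{ini}}\rangle _{E}\) is the first basis vector; the Kraus operators (an amplitude-damping pair with a hypothetical parameter `g`) and the helper names `dilate` and `kraus_from_unitary` are illustrative assumptions. The remaining columns of \(\hat{U}\) are filled with an orthonormal basis of the orthogonal complement, obtained here from a QR decomposition.

```python
import numpy as np

def dilate(kraus):
    """Build a unitary U on A(x)E with <j|_E' U |0>_E = M^(j).

    Assumes dim A' = dim A = d and dim E' = dim E = s = len(kraus).
    """
    s = len(kraus)
    d = kraus[0].shape[1]
    basis_E = np.eye(s)
    # Isometry V = sum_j M^(j) (x) |j>_E'; column i of V is the vector |v_i>.
    V = sum(np.kron(M, basis_E[:, [j]]) for j, M in enumerate(kraus))
    # Orthonormal basis of the orthogonal complement of span{|v_i>}.
    Q, _ = np.linalg.qr(V, mode="complete")
    complement = Q[:, d:]
    U = np.zeros((d * s, d * s), dtype=complex)
    in_cols = [i * s for i in range(d)]           # index of |u_i>_A (x) |0>_E
    rest = [c for c in range(d * s) if c not in in_cols]
    U[:, in_cols] = V
    U[:, rest] = complement
    return U

def kraus_from_unitary(U, d, s):
    """Eq. (1.18): M^(j) = <j|_E' U |0>_E, read off from the reshaped matrix."""
    U4 = U.reshape(d, s, d, s)                    # indices (A', E', A, E)
    return [U4[:, j, :, 0] for j in range(s)]

# Illustrative Kraus operators (amplitude damping with a hypothetical strength g).
g = 0.3
KRAUS = [np.array([[1, 0], [0, np.sqrt(1 - g)]]),
         np.array([[0, np.sqrt(g)], [0, 0]])]

U = dilate(KRAUS)
recovered = kraus_from_unitary(U, 2, 2)
print(np.allclose(U.conj().T @ U, np.eye(4)))
```

The choice of the complement basis is not unique, which reflects the freedom in how \(\hat{U}\) acts on inputs orthogonal to \(\vert \phi _{\mathrm{ini}}\rangle _{E}\).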

4.2 Physically Allowed Operations

In Sect. 1.4.1, we extended our ability to operate on physical systems through the rather heuristic use of an auxiliary system. It is natural to expect that the introduction of more complex schemes using two or more auxiliary systems may allow us to further extend the variety of possible operations. Additionally, if we look back on the basic rules in Sect. 1.1, we see that none of the rules require a physical operation to be built up from unitary transformations and orthogonal measurements alone. Nonetheless, we will show here that the input-output relations written in the form of Eq. (1.20) are essentially the only relations that are allowed physically.

Consider a black box that accepts a physical system A as an input, and produces a classical outcome \(j = 1,2,\ldots,s\), while leaving the system A′ as an output. Let d be the dimension of \(\mathcal{H}_{A}\). We want to know the way in which the output state \(\hat{\rho }_{\mathrm{out}}^{(j)}\) and the probability \(p_{j}\) of the outcome are related to a general pure input state \(\vert \phi \rangle _{A}\). For that purpose, it is convenient to introduce a reference system B with a Hilbert space \(\mathcal{H}_{B}\) of the same dimension d. We take the orthonormal bases \(\{\vert i\rangle _{A}\}_{i=1,\ldots,d}\) and \(\{\vert i\rangle _{B}\}_{i=1,\ldots,d}\) for \(\mathcal{H}_{A}\) and \(\mathcal{H}_{B}\), respectively, and suppose that the system AB is initially prepared in a maximally entangled state, \(\vert \varPhi \rangle _{AB} = d^{-1/2}\sum _{i=1}^{d}\vert i\rangle _{A}\vert i\rangle _{B}\).

We now explain a frequently used technique called the relative states. For any given state \(\vert \phi \rangle _{A}\), we define the relative state of system B, with reference to the maximally entangled state \(\vert \varPhi \rangle _{AB}\), as

$$\displaystyle{ \vert \phi ^{{\ast}}\rangle _{ B}:=\sum _{ i=1}^{d}\vert i\rangle _{ BA}\langle \phi \vert i\rangle _{A}. }$$
(1.21)

It is then easy to see that

$$\displaystyle{ d^{-1/2}\vert \phi \rangle _{ A} = _{B}\langle \phi ^{{\ast}}\vert \vert \varPhi \rangle _{ AB} }$$
(1.22)

holds. The definition of the relative state is mutual, i.e., \(\vert \phi ^{{\ast}{\ast}}\rangle _{A} = \vert \phi \rangle _{A}\), because \(d^{-1/2}\vert \phi ^{{\ast}}\rangle _{B} = _{A}\langle \phi \vert \vert \varPhi \rangle _{AB}\) also holds.

In light of Theorem 1, this relation has the following meaning. If we conduct an orthogonal measurement on system B with a basis that includes state \(\vert \phi ^{{\ast}}\rangle _{B}\), then the corresponding outcome appears with probability 1∕d, and system A then behaves as if it were initially prepared in state \(\vert \phi \rangle _{A}\). While this is probabilistic, it offers a type of ex post facto method to prepare system A in the arbitrary state \(\vert \phi \rangle _{A}\).
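Equation (1.22) and the probability 1∕d can be checked with a few lines of NumPy. In this sketch the dimension d = 3 and the amplitudes of \(\vert \phi \rangle _{A}\) are arbitrary illustrative choices, and the state \(\vert \varPhi \rangle _{AB}\) is stored as a d × d array of components with indices (A, B).

```python
import numpy as np

d = 3
phi = np.array([1, 1j, -1 + 0.5j])
phi = phi / np.linalg.norm(phi)          # a normalized state |phi>_A

# |Phi>_AB = d^{-1/2} sum_i |i>_A |i>_B as a d x d array of components (A, B).
Phi = np.eye(d) / np.sqrt(d)

# Eq. (1.21): the component of |phi*>_B along |i>_B is <phi|i>, i.e. conj(phi_i).
phi_star = phi.conj()

# Applying <phi*|_B contracts the B index and leaves an unnormalized state of A.
left_on_A = Phi @ phi_star.conj()

# Its squared norm is the probability of the corresponding outcome on system B.
prob = np.vdot(left_on_A, left_on_A).real

print(np.allclose(left_on_A, phi / np.sqrt(d)), prob)
```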

We now proceed to the analysis of the black box (see Fig. 1.4a). After preparation of \(\vert \varPhi \rangle _{AB}\), suppose that system A is fed to the black box, while system B is left alone. After the black box has produced the outcome j, the state of the composite system A′B should be represented by a density operator, which we denote by \(\hat{\rho }_{A'B}^{(j)}\). Let \(q_{j}\) be the probability of producing the outcome j. Now suppose that we perform an orthogonal measurement with basis \(\{\vert v_{i}\rangle _{B}\}\) on system B, where \(\vert v_{1}\rangle _{B} = \vert \phi ^{{\ast}}\rangle _{B}\). The outcome i = 1 should then appear with probability \(r^{(j)}\) and this leaves system A′ in state \(\hat{\rho }_{A'}^{(j)}\), where

$$\displaystyle{ r^{(j)}\hat{\rho }_{ A'}^{(j)} = _{ B}\langle \phi ^{{\ast}}\vert \hat{\rho }_{ A'B}^{(j)}\vert \phi ^{{\ast}}\rangle _{ B}. }$$
(1.23)

However, according to Theorem 1, an event with outcomes j and i = 1 must be interpreted as follows. With probability \(d^{-1}\), system A is initially prepared in \(\vert \phi \rangle _{A}\), and is then fed to the black box. This produces outcome j with probability \(p_{j}\), leaving system A′ in state \(\hat{\rho }_{\mathrm{out}}^{(j)}\). Comparison of the two interpretations leads to \(q_{j}r^{(j)} = d^{-1}p_{j}\) and \(\hat{\rho }_{A'}^{(j)} =\hat{\rho }_{ \mathrm{out}}^{(j)}\). Using Eq. (1.23), we then have

$$\displaystyle{ p_{j}\hat{\rho }_{\mathrm{out}}^{(j)} = dq_{j B}\langle \phi ^{{\ast}}\vert \hat{\rho }_{ A'B}^{(j)}\vert \phi ^{{\ast}}\rangle _{ B}. }$$
(1.24)

Consider a decomposition of the density operator,

$$\displaystyle{ \hat{\rho }_{A'B}^{(j)} =\sum _{ k=1}^{t^{(j)} }\vert \tilde{\varPsi }_{k}^{(j)}\rangle _{ A'BA'B}\langle \tilde{\varPsi }_{k}^{(j)}\vert, }$$
(1.25)

where \(\vert \tilde{\varPsi }_{k}^{(j)}\rangle _{A'B}\) is unnormalized. Noting that \(_{B}\langle \phi ^{{\ast}}\vert = \sqrt{d}_{AB}\langle \varPhi \vert \vert \phi \rangle _{A}\), we see that, for fixed values of j and k, the correspondence \(\vert \phi \rangle _{A}\mapsto \sqrt{dq_{j}}_{B}\langle \phi ^{{\ast}}\vert \vert \tilde{\varPsi }_{k}^{(j)}\rangle _{A'B}\) is a linear map. Thus, an operator \(\hat{M}^{(j,k)}: \mathcal{H}_{A} \rightarrow \mathcal{H}_{A'}\) exists such that

$$\displaystyle{ \sqrt{dq_{j}}_{B}\langle \phi ^{{\ast}}\vert \vert \tilde{\varPsi }_{ k}^{(j)}\rangle _{ A'B} =\hat{ M}^{(j,k)}\vert \phi \rangle _{ A}. }$$
(1.26)

Equation (1.24) is now written as

$$\displaystyle{ p_{j}\hat{\rho }_{\mathrm{out}}^{(j)} =\sum _{ k=1}^{t^{(j)} }\hat{M}^{(j,k)}\vert \phi \rangle _{ AA}\langle \phi \vert \hat{M}^{(j,k)\dag } }$$
(1.27)

for the input state \(\vert \phi \rangle _{A}\). Then, for the general input state \(\hat{\rho }_{\mathrm{in}}\) of system A, the input-output relationship of the black box is written as

$$\displaystyle{ p_{j}\hat{\rho }_{\mathrm{out}}^{(j)} =\sum _{ k=1}^{t^{(j)} }\hat{M}^{(j,k)}\hat{\rho }_{ \mathrm{in}}\hat{M}^{(j,k)\dag }. }$$
(1.28)

Taking the trace of Eq. (1.27) and performing a sum over index j, we have \(\sum _{j,k A}\langle \phi \vert \hat{M}^{(j,k)\dag }\hat{M}^{(j,k)}\vert \phi \rangle _{A} = 1\) for arbitrary \(\vert \phi \rangle _{A}\). Therefore, \(\sum _{j,k}\hat{M}^{(j,k)\dag }\hat{M}^{(j,k)} =\hat{ 1}_{A}\) and \(\{\hat{M}^{(j,k)}\}\) is a set of Kraus operators.

Fig. 1.4

(a) Characterization of a physical operation (the black box) by feeding in half of a maximally entangled state. Learning the statistics of the outcome j and the states \(\hat{\rho }_{A'B}^{(j)}\) then allows us to fully specify the input-output relationship. (b) Looking inside the box. Any physical process is implemented in an equivalent manner with an auxiliary system, a unitary transformation, and an orthogonal measurement

Equation (1.28) is the most general form of what we can do to a physical system. This equation is merely a trivial extension of Eq. (1.20) in Sect. 1.4.1. Consider a scheme that produces the outcome (j, k) and leaves system A′ in state \(\hat{\rho }_{\mathrm{out}}^{(j,k)}\), with an input-output relation given by \(p_{j,k}\hat{\rho }_{\mathrm{out}}^{(j,k)} =\hat{ M}^{(j,k)}\hat{\rho }_{\mathrm{in}}\hat{M}^{(j,k)\dag }\). As shown (see Footnote 4) in Sect. 1.4.1, this scheme can be implemented by simply attaching an auxiliary system E, applying a unitary transformation, and then performing an orthogonal measurement on system E′. The original black box is then faithfully simulated using this scheme as shown in Fig. 1.4b, by simply discarding the index k and yielding only the index j as the final outcome.
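The characterization argument of this section can be mimicked numerically. In the sketch below, a "hidden" black box is simulated by an illustrative pair of operators (amplitude damping with a hypothetical strength `g`, one operator per outcome j); only the pairs \((q_{j},\hat{\rho }_{A'B}^{(j)})\) are then used to reconstruct operators \(\hat{M}^{(j,k)}\) via Eqs. (1.25) and (1.26), with the eigendecomposition serving as one possible choice of the decomposition in Eq. (1.25). The function names are assumptions.

```python
import numpy as np

def branch_state(kraus_j, d):
    """Feed system A of |Phi>_AB into one outcome branch of the box.

    Returns (q_j, rho_A'B^(j)); kraus_j plays the role of the hidden box.
    """
    Phi = np.eye(d).reshape(-1) / np.sqrt(d)      # maximally entangled state
    dim = kraus_j[0].shape[0] * d
    out = np.zeros((dim, dim), dtype=complex)
    for K in kraus_j:
        v = np.kron(K, np.eye(d)) @ Phi           # box acts on A only
        out += np.outer(v, v.conj())
    q = np.real(np.trace(out))
    return q, out / q

def recover_kraus(q, rho, d):
    """Eq. (1.26): M^(j,k) = sqrt(d q_j) times |Psi~_k> viewed as an A' x B array."""
    w, v = np.linalg.eigh(rho)                    # rho = sum_k w_k |e_k><e_k|
    d_out = rho.shape[0] // d
    return [np.sqrt(d * q * wk) * v[:, k].reshape(d_out, d)
            for k, wk in enumerate(w) if wk > 1e-12]

g = 0.3
hidden = {0: [np.array([[1, 0], [0, np.sqrt(1 - g)]])],
          1: [np.array([[0, np.sqrt(g)], [0, 0]])]}

q, recovered = {}, {}
for j, ks in hidden.items():
    q[j], rho_j = branch_state(ks, 2)
    recovered[j] = recover_kraus(q[j], rho_j, 2)

print(q)
```

The recovered operators may differ from the hidden ones by phases (or, for branches with several operators, by a unitary mixing over k), but they reproduce the same input-output relation of Eq. (1.28).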

4.3 Generalized Measurements

By discarding the output quantum state in system A′ in the black box that was considered in Sect. 1.4.2, we can obtain the most general form of a physically allowed measurement process, which is called a generalized measurement. By taking the trace of Eq. (1.28), we have

$$\displaystyle{ p_{j} =\mathrm{ Tr}(\hat{F}^{(j)}\hat{\rho }_{ \mathrm{in}}), }$$
(1.29)

where \(\hat{F}^{(j)}:=\sum _{k}\hat{M}^{(j,k)\dag }\hat{M}^{(j,k)}\) is positive and satisfies \(\sum _{j}\hat{F}^{(j)} =\hat{ 1}_{A}\). Any measurement must be written in this form.

A set of positive operators \(\{\hat{F}^{(j)}\}\) acting on \(\mathcal{H}_{A}\) and satisfying \(\sum _{j}\hat{F}^{(j)} =\hat{ 1}_{A}\) is called a POVM (positive-operator-valued measure). For any given POVM \(\{\hat{F}^{(j)}\}\), we may define \(\hat{M}^{(j)}:= (\hat{F}^{(j)})^{1/2}\) and use the argument of Sect. 1.4.1 to construct a generalized measurement that satisfies Eq. (1.29) through the use of an auxiliary system as shown in Fig. 1.3, except that system A′ is discarded in this case.

An orthogonal measurement with basis \(\{\vert u_{j}\rangle _{A}\}\) is now regarded as a special case of the generalized measurements, when the POVM is chosen to be \(\hat{F}^{(j)} = \vert u_{j}\rangle _{A A}\langle u_{j}\vert \). Note that orthogonal measurements are not necessarily the ideal measurement, and some tasks favor other kinds of generalized measurement. We will provide an example below.

Unambiguous state discrimination. Consider a nonorthogonal pair of qubit states, \(\{\vert \phi _{0}\rangle _{A},\vert \phi _{1}\rangle _{A}\}\), with \(c:= \vert \langle \phi _{0}\vert \phi _{1}\rangle \vert > 0\). Suppose that qubit A has been secretly prepared in \(\vert \phi _{0}\rangle _{A}\) or in \(\vert \phi _{1}\rangle _{A}\) with an equal probability of \(q:= 1/2\). Consider the strategy used to distinguish between the two states as follows.

Choose \(\vert \phi _{j}^{\perp }\rangle _{A}\) (j = 0, 1) such that \(_{A}\langle \phi _{j}\vert \phi _{j}^{\perp }\rangle _{A} = 0\) and \(_{A}\langle \phi _{0}^{\perp }\vert \phi _{1}^{\perp }\rangle _{A} = c\). Consider a set \(\{\hat{F}^{(j)}\}_{j=0,1,2}\) defined by \(\hat{F}^{(0)}:= (1 + c)^{-1}\vert \phi _{1}^{\perp }\rangle _{AA}\langle \phi _{1}^{\perp }\vert \), \(\hat{F}^{(1)}:= (1 + c)^{-1}\vert \phi _{0}^{\perp }\rangle _{AA}\langle \phi _{0}^{\perp }\vert \), and \(\hat{F}^{(2)}:=\hat{ 1}_{A} -\hat{ F}^{(0)} -\hat{ F}^{(1)}\). Because \(\vert \phi _{0}^{\perp }\rangle _{A} \pm \vert \phi _{1}^{\perp }\rangle _{A}\) is an eigenvector of \(\hat{F}^{(0)} +\hat{ F}^{(1)}\) with eigenvalue \((1 + c)^{-1}(1 \pm c) \leq 1\), we have \(\hat{F}^{(0)} +\hat{ F}^{(1)} \leq \hat{ 1}_{A}\). Therefore, \(\{\hat{F}^{(j)}\}_{j=0,1,2}\) is a POVM, and the corresponding generalized measurement is feasible.

When the outcome of this measurement is j = 0, we are certain that the prepared state must be state \(\vert \phi _{0}\rangle _{A}\), because \(\mathrm{Tr}(\hat{F}^{(0)}\vert \phi _{1}\rangle _{AA}\langle \phi _{1}\vert ) = 0\). Similarly, if the outcome is j = 1, the prepared state must be state \(\vert \phi _{1}\rangle _{A}\). The overall success probability, i.e., the probability of obtaining j = 0, 1, is calculated to be \(p_{\mathrm{suc}}:=\sum _{j=0,1}q\mathrm{Tr}(\hat{F}^{(j)}\vert \phi _{j}\rangle _{AA}\langle \phi _{j}\vert ) = 1 - c\) [5].

If we are to construct a strategy with a similar lack of ambiguity using orthogonal measurements, we must choose either \(\{\vert \phi _{0}\rangle _{A},\vert \phi _{0}^{\perp }\rangle _{A}\}\) or \(\{\vert \phi _{1}\rangle _{A},\vert \phi _{1}^{\perp }\rangle _{A}\}\) as the basis. Regardless of how the two orthogonal measurements are mixed, the success probability is \(p_{\mathrm{suc}}^{\perp }:= q\vert _{A}\langle \phi _{0}^{\perp }\vert \phi _{1}\rangle _{A}\vert ^{2} = q\vert _{A}\langle \phi _{1}^{\perp }\vert \phi _{0}\rangle _{A}\vert ^{2} = (1 - c^{2})/2\). Thus we see that \(p_{\mathrm{suc}} > p_{\mathrm{suc}}^{\perp }\) for 0 < c < 1.
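The unambiguous discrimination example can be reproduced numerically. The sketch below assumes a specific real parametrization of the two states by a hypothetical angle `alpha`, with the signs of the orthogonal partners chosen so that \(\langle \phi _{0}^{\perp }\vert \phi _{1}^{\perp }\rangle = c\); these choices are for illustration only.

```python
import numpy as np

alpha = np.pi / 8
phi0 = np.array([np.cos(alpha), np.sin(alpha)])
phi1 = np.array([np.cos(alpha), -np.sin(alpha)])
c = abs(np.vdot(phi0, phi1))                      # equals cos(2*alpha) here

# Orthogonal partners, with signs chosen so that <phi0_perp|phi1_perp> = c.
phi0_perp = np.array([np.sin(alpha), -np.cos(alpha)])
phi1_perp = np.array([-np.sin(alpha), -np.cos(alpha)])

def proj(v):
    """Projector |v><v|."""
    return np.outer(v, v.conj())

# POVM elements as defined in the text.
F0 = proj(phi1_perp) / (1 + c)
F1 = proj(phi0_perp) / (1 + c)
F2 = np.eye(2) - F0 - F1

q = 0.5
p_suc = q * (phi0 @ F0 @ phi0 + phi1 @ F1 @ phi1)          # generalized measurement
p_suc_orth = q * abs(np.vdot(phi0_perp, phi1)) ** 2        # orthogonal measurement

print("c =", c, " p_suc =", p_suc, " p_suc_orth =", p_suc_orth)
```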

4.4 Quantum Operations

If we discard the outcome j from the black box that was considered in Sect. 1.4.2, then the output density operator of system A′ becomes \(\hat{\rho }_{\mathrm{out}}:=\sum _{j}p_{j}\hat{\rho }_{\mathrm{out}}^{(j)} =\sum _{j,k}\hat{M}^{(j,k)}\hat{\rho }_{\mathrm{in}}\hat{M}^{(j,k)\dag }\). Without loss of generality, we may replace the indices (j, k) with a single index j, which results in the general form of the state transformation,

$$\displaystyle{ \hat{\rho }_{\mathrm{out}} =\sum _{j}\hat{M}^{(j)}\hat{\rho }_{ \mathrm{in}}\hat{M}^{(j)\dag } }$$
(1.30)

with \(\sum _{j}\hat{M}^{(j)\dag }\hat{M}^{(j)} =\hat{ 1}_{A}\). Any physical process that takes system A as an input and leaves the same system or another system A′ as an output must be written in this form. This type of process is often called a quantum operation or a quantum channel. Mathematically, the map \(\chi:\hat{\rho } _{\mathrm{in}}\mapsto \hat{\rho }_{\mathrm{out}}\) that is written as per Eq. (1.30) is called a CPTP (completely-positive trace-preserving) map.

The argument in Sect. 1.4.1 ensures that the right-hand side of Eq. (1.30) can be rewritten as that of Eq. (1.17) summed over j, i.e.,

$$\displaystyle{ \hat{\rho }_{\mathrm{out}} =\mathrm{ Tr}_{E'}[\hat{U}(\hat{\rho }_{\mathrm{in}} \otimes \vert \phi _{\mathrm{ini}}\rangle _{EE}\langle \phi _{\mathrm{ini}}\vert )\hat{U}^{\dag }]. }$$
(1.31)

Operationally, this simply means that the measurement on system E′ shown in Fig. 1.3 is unnecessary. Thus, any quantum channel can be equivalently simulated using a simple three-step process, which consists of preparing the auxiliary system (E) in a fixed pure state, applying the unitary transformation, and discarding the subsystem (E′). This property is very helpful when it is necessary to prove that some tasks are physically impossible. This type of argument is vital for establishment of an operationally-defined measure of quantum properties, as indicated in the following example.
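The equivalence between the Kraus form of Eq. (1.30) and the partial-trace form of Eq. (1.31) can be checked on a small example. The two-qubit unitary \(\exp (-i\theta \hat{\sigma }_{x} \otimes \hat{\sigma }_{x})\), the angle `theta`, the input state, and the helper names below are illustrative assumptions; the auxiliary system starts in its first basis state.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
d, s = 2, 2
theta = 0.4

# exp(-i theta X(x)X) in closed form, valid because (X(x)X)^2 = 1.
XX = np.kron(SX, SX)
U = np.cos(theta) * np.eye(4) - 1j * np.sin(theta) * XX

def kraus_from_unitary(U, d, s):
    """Eq. (1.18): M^(j) = <j|_E' U |0>_E."""
    U4 = U.reshape(d, s, d, s)                    # indices (A', E', A, E)
    return [U4[:, j, :, 0] for j in range(s)]

def trace_out_E(rho_AE, d, s):
    """Partial trace over the second (auxiliary) factor."""
    return np.trace(rho_AE.reshape(d, s, d, s), axis1=1, axis2=3)

rho_in = np.array([[0.6, 0.2 - 0.1j], [0.2 + 0.1j, 0.4]])
anc = np.zeros((s, s))
anc[0, 0] = 1                                     # |phi_ini><phi_ini| = |0><0|

# Eq. (1.31): attach the auxiliary system, apply U, discard E'.
rho_dilated = trace_out_E(U @ np.kron(rho_in, anc) @ U.conj().T, d, s)

# Eq. (1.30): the same map written with Kraus operators.
kraus = kraus_from_unitary(U, d, s)
rho_kraus = sum(M @ rho_in @ M.conj().T for M in kraus)

print(np.allclose(rho_dilated, rho_kraus))
```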

Fidelity. In an experimental demonstration, the quality of the final result is often evaluated in terms of the fidelity \(F =\langle \phi _{\mathrm{ideal}}\vert \hat{\rho }_{\mathrm{exp}}\vert \phi _{\mathrm{ideal}}\rangle\), where \(\vert \phi _{\mathrm{ideal}}\rangle\) is the desired state and \(\hat{\rho }_{\mathrm{exp}}\) is the state that was actually obtained in the experiment. The fidelity F between two general states \(\hat{\rho }_{1}\) and \(\hat{\rho }_{2}\) of system A is defined (see Footnote 5) as the maximum overlap between the purifications of these states in a composite system composed of A and an arbitrary system R, i.e.,

$$\displaystyle{ F(\hat{\rho }_{1},\hat{\rho }_{2}):=\max \{ \vert _{AR}\langle \varPsi _{1}\vert \varPsi _{2}\rangle _{AR}\vert ^{2}:\mathrm{ Tr}_{ R}(\vert \varPsi _{j}\rangle _{ARAR}\langle \varPsi _{j}\vert ) =\hat{\rho } _{j},j = 1,2\}. }$$
(1.32)

To justify the use of such a quantity in the evaluation of an experiment, we must show that the fidelity \(F(\hat{\rho }_{1},\hat{\rho }_{2})\) is a good measure of the closeness between the two states \(\hat{\rho }_{1}\) and \(\hat{\rho }_{2}\). To enable F to quantify the difficulty in distinguishing between the two states in principle, F should not be reduced (and thus the distinguishability should not improve) through the application of any quantum channel χ, i.e.,

$$\displaystyle{ F(\chi (\hat{\rho }_{1}),\chi (\hat{\rho }_{2})) \geq F(\hat{\rho }_{1},\hat{\rho }_{2}) }$$
(1.33)

should hold for any CPTP map χ. This can be proved as follows.

Let \(\vert \varPhi _{j}\rangle _{AR}\) be the purifications that achieve the maximum of Eq. (1.32), i.e., \(F(\hat{\rho }_{1},\hat{\rho }_{2}) = \vert _{AR}\langle \varPhi _{1}\vert \varPhi _{2}\rangle _{AR}\vert ^{2}\). We consider three different cases separately, corresponding to the three steps that are implied in Eq. (1.31).

(i) \(\chi (\hat{\rho }_{j}) =\hat{\rho } _{j} \otimes \vert \phi \rangle _{BB}\langle \phi \vert \). In this case, \(\vert \varPsi _{j}\rangle _{ABR}:= \vert \varPhi _{j}\rangle _{AR}\vert \phi \rangle _{B}\) is a purification of \(\chi (\hat{\rho }_{j})\). Therefore, \(F(\chi (\hat{\rho }_{1}),\chi (\hat{\rho }_{2})) \geq \vert _{ABR}\langle \varPsi _{1}\vert \varPsi _{2}\rangle _{ABR}\vert ^{2} = \vert _{AR}\langle \varPhi _{1}\vert \varPhi _{2}\rangle _{AR}\vert ^{2} = F(\hat{\rho }_{1},\hat{\rho }_{2})\).

(ii) \(\chi (\hat{\rho }_{j}) =\hat{ U}_{A}\hat{\rho }_{j}\hat{U}_{A}^{\dag }\). In this case, \(\vert \varPsi _{j}\rangle _{AR}:= (\hat{U}_{A} \otimes \hat{ 1}_{R})\vert \varPhi _{j}\rangle _{AR}\) is a purification of \(\chi (\hat{\rho }_{j})\). Therefore, \(F(\chi (\hat{\rho }_{1}),\chi (\hat{\rho }_{2})) \geq \vert _{AR}\langle \varPsi _{1}\vert \varPsi _{2}\rangle _{AR}\vert ^{2} = \vert _{AR}\langle \varPhi _{1}\vert \varPhi _{2}\rangle _{AR}\vert ^{2} = F(\hat{\rho }_{1},\hat{\rho }_{2})\).

(iii) \(\chi (\hat{\rho }_{j}) =\mathrm{ Tr}_{\tilde{A}}(\hat{\rho }_{j})\), where \(\tilde{A}\) is a constituent subsystem of system A. In this case, \(\vert \varPhi _{j}\rangle _{AR}\) is also regarded as a purification of \(\chi (\hat{\rho }_{j})\). Therefore, \(F(\chi (\hat{\rho }_{1}),\chi (\hat{\rho }_{2})) \geq \vert _{AR}\langle \varPhi _{1}\vert \varPhi _{2}\rangle _{AR}\vert ^{2} = F(\hat{\rho }_{1},\hat{\rho }_{2})\).

For a general quantum channel χ, we may decompose the process into the three steps, and the above results demonstrate that F is nondecreasing in each of the three steps. Therefore, Eq. (1.33) holds.
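The monotonicity of Eq. (1.33) can be illustrated numerically. The sketch below does not maximize over purifications; instead it assumes the standard closed form \(F(\hat{\rho }_{1},\hat{\rho }_{2}) = (\mathrm{Tr}\sqrt{\sqrt{\hat{\rho }_{1}}\hat{\rho }_{2}\sqrt{\hat{\rho }_{1}}})^{2}\), which by Uhlmann's theorem coincides with the definition of Eq. (1.32). The channel (amplitude damping with a hypothetical strength `g`) and the two input states are illustrative choices.

```python
import numpy as np

def psd_sqrt(a):
    """Square root of a positive semidefinite Hermitian matrix."""
    w, v = np.linalg.eigh(a)
    w = np.clip(w, 0, None)                       # remove tiny negative round-off
    return (v * np.sqrt(w)) @ v.conj().T

def fidelity(r1, r2):
    """Closed-form fidelity (assumed equivalent to Eq. (1.32) by Uhlmann's theorem)."""
    s = psd_sqrt(r1)
    return np.real(np.trace(psd_sqrt(s @ r2 @ s))) ** 2

def apply_channel(kraus, rho):
    """Eq. (1.30): rho_out = sum_j M^(j) rho M^(j)^dagger."""
    return sum(M @ rho @ M.conj().T for M in kraus)

g = 0.5
KRAUS = [np.array([[1, 0], [0, np.sqrt(1 - g)]]),
         np.array([[0, np.sqrt(g)], [0, 0]])]

rho1 = np.array([[1, 0], [0, 0]], dtype=complex)          # |0><0|
rho2 = np.array([[0.5, 0.5], [0.5, 0.5]], dtype=complex)  # |+><+|

F_before = fidelity(rho1, rho2)
F_after = fidelity(apply_channel(KRAUS, rho1), apply_channel(KRAUS, rho2))

print(F_before, F_after)
```

For this particular channel and pair of states, the fidelity increases, so the two outputs are harder to distinguish than the inputs.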

No-cloning theorem. An immediate consequence of the nondecreasing property of the fidelity is the no-cloning theorem. Consider a cloning machine that would transform an arbitrary input pure state \(\hat{\rho }_{\mathrm{in},\phi }:= \vert \phi \rangle _{AA}\langle \phi \vert \) into a duplicated pure state \(\hat{\rho }_{\mathrm{out},\phi }:= \vert \phi \rangle _{AA}\langle \phi \vert \otimes \vert \phi \rangle _{A'A'}\langle \phi \vert \). For \(0 < \vert _{A}\langle \phi \vert \psi \rangle _{A}\vert ^{2} < 1\), we would have

$$\displaystyle{ F(\hat{\rho }_{\mathrm{out},\phi },\hat{\rho }_{\mathrm{out},\psi }) = F(\hat{\rho }_{\mathrm{in},\phi },\hat{\rho }_{\mathrm{in},\psi })^{2} < F(\hat{\rho }_{\mathrm{ in},\phi },\hat{\rho }_{\mathrm{in},\psi }), }$$
(1.34)

which violates Eq. (1.33). Therefore, this cloning machine could never exist.
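As a numerical illustration of Eq. (1.34), the following minimal sketch compares the fidelity of two nonorthogonal qubit states with that of their hypothetical clones. It assumes the pure-state expression \(F = \vert \langle \phi \vert \psi \rangle \vert ^{2}\); the helper names `inner`, `kron`, and `pure_fidelity` and the angle `theta` are illustrative choices, not part of the text.

```python
import math

def inner(u, v):
    """Inner product <u|v> of two state vectors given as lists of amplitudes."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def kron(u, v):
    """Tensor product |u>|v> as a flat list of amplitudes."""
    return [a * b for a in u for b in v]

def pure_fidelity(u, v):
    """Fidelity |<u|v>|^2 between two pure states (assumed pure-state form)."""
    return abs(inner(u, v)) ** 2

# Two nonorthogonal, nonidentical qubit states (theta is an arbitrary example).
theta = math.pi / 6
phi = [1 + 0j, 0j]
psi = [complex(math.cos(theta)), complex(math.sin(theta))]

f_in = pure_fidelity(phi, psi)                           # fidelity of the inputs
f_out = pure_fidelity(kron(phi, phi), kron(psi, psi))    # fidelity of the would-be clones
```

Here `f_out` equals `f_in` squared and is strictly smaller than `f_in`, which is the decrease that Eq. (1.33) forbids.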

5 Communication Resources

The task of sending quantum information is essentially different from that of sending classical information, and is achieved using a dedicated quantum channel. Interestingly, transmission of quantum information can also be achieved by supplementing a classical channel with another resource: entanglement. In this subsection, we will see how the three communication resources are related to each other, while focusing our discussion on the ideal cases.

5.1 Quantum Channels and Classical Channels

An ideal classical channel transmits a symbol chosen from a fixed set {1, 2, …, d} from a sender to a receiver without any error. The number of symbols d stands for the usefulness of the channel as a resource. A channel with d = 2 is normally regarded to have a unit of usefulness, called a bit. General ideal channels with d symbols have \(\log _{2}d\) bits. This makes sense because the combined use of a \((\log _{2}d)\)-bit channel and a \((\log _{2}d')\)-bit channel transmits a pair of symbols that takes dd′ distinct values, and thus amounts to the single use of a \((\log _{2}d +\log _{2}d')\)-bit channel.

In a similar vein, we consider an ideal quantum channel, which faithfully transmits the arbitrary quantum states of a d-level physical system that is associated with a Hilbert space of dimension d. Because we have already called the two-level system a qubit, let us define the usefulness of such a channel as \((\log _{2}d)\) qubits. Because \(\mathrm{dim}(\mathcal{H}\otimes \mathcal{H}') = (\mathrm{dim}\mathcal{H})(\mathrm{dim}\mathcal{H}')\), this measure is additive for the combined use of ideal channels.

We now consider how the two types of channels differ. First, a quantum channel can never be simulated using any amount of classical channels. This is because of the no-cloning theorem, as described in Sect. 1.4.4. Because the output of a classical channel can be freely copied, if the receiver were able to reconstruct any input state \(\vert \phi \rangle\), then they could repeat the same procedure to create another copy of state \(\vert \phi \rangle\), which is forbidden by the no-cloning theorem.

In contrast, a \((\log _{2}d)\)-qubit quantum channel can be used to simulate a classical channel. To simulate a \((\log _{2}d')\)-bit channel, the sender can encode a symbol i ∈ {1, 2, …, d′} on a quantum state, i.e., the sender transmits the quantum state \(\hat{\rho }_{i}\) via the quantum channel, according to the symbol i that is to be transmitted. The receiver can then perform a measurement of the transmitted state to decode the index i. Encoding on mutually orthogonal states certainly works if d′ = d, but the user may want to exploit the fact that there are an infinite number of different quantum states to transmit larger numbers of symbols. To deny any such possibility, we recall that any measurement strategy must be described as in Eq. (1.29), using a POVM \(\{\hat{F}_{j}\}\). To simulate an ideal channel, \(\mathrm{Tr}(\hat{F}_{i}\hat{\rho }_{i}) = 1\) should hold for i = 1, …, d′. Because \(\{\hat{F}_{j}\}\) are positive and \(\sum _{i=1}^{d'}\hat{F}_{i} \leq \hat{ 1}\), we have \(d' =\sum _{ i=1}^{d'}\mathrm{Tr}(\hat{F}_{i}\hat{\rho }_{i}) \leq \sum _{i=1}^{d'}\mathrm{Tr}(\hat{F}_{i}) \leq \mathrm{ Tr}\hat{1} = d\), thus proving the following.

Theorem 4.

Without use of another communication resource, a \((\log _{2}d)\) -qubit ideal quantum channel can never simulate a \((\log _{2}d')\) -bit ideal classical channel if d′ > d.
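The same counting also bounds imperfect decoding. As a sketch under the additional assumption that the d′ symbols are equally likely (the symbol \(\bar{p}\) for the average probability of correct decoding is introduced here for illustration only), the two inequalities used above give

$$\displaystyle{ \bar{p}:= \frac{1} {d'}\sum _{i=1}^{d'}\mathrm{Tr}(\hat{F}_{ i}\hat{\rho }_{i}) \leq \frac{1} {d'}\sum _{i=1}^{d'}\mathrm{Tr}(\hat{F}_{ i}) \leq \frac{d} {d'}. }$$

For instance, three equally likely symbols sent over a one-qubit channel (d = 2, d′ = 3) are decoded correctly with a probability of at most 2∕3, whatever the encoding and the measurement.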

5.2 Entanglement as a Communication Resource

We have seen that a quantum channel is qualitatively different from a classical channel. We may then ask what exactly the difference between the channels is, or what kind of communication resource may be used to complement a classical channel to enable it to simulate a quantum channel. It turns out that entanglement is the answer to these questions.

As an ideal resource of entanglement, let us consider a maximally entangled state with a Schmidt number of d,

$$\displaystyle{ \vert \varPhi _{0,0}\rangle _{AB}:= \frac{1} {\sqrt{d}}\sum _{j=0}^{d-1}\vert j\rangle _{ A}\vert j\rangle _{B} }$$
(1.35)

where \(\{\vert j\rangle _{A}\}\) and \(\{\vert j\rangle _{B}\}\) are orthonormal bases of \(\mathcal{H}_{A}\) and \(\mathcal{H}_{B}\), respectively. When one subsystem is held by the sender and the other by the receiver, we can quantify the usefulness of this state as \((\log _{2}d)\) ebits, which is additive when two or more maximally entangled states are available. Any state that is written as \((\hat{U}_{A} \otimes \hat{ V }_{B})\vert \varPhi _{0,0}\rangle _{AB}\) with unitary operators \(\hat{U}_{A}\) and \(\hat{V }_{B}\) is also a maximally entangled state and is regarded as a resource of the same number of ebits.

If a \((\log _{2}d)\)-qubit quantum channel is available, then the sender can create state \(\vert \varPhi _{0,0}\rangle _{AB}\) locally and transmit system B to the receiver, which produces \((\log _{2}d)\) ebits of entanglement resource.

Theorem 5 (Entanglement sharing).

A \((\log _{2}d)\) -qubit ideal quantum channel can be converted into \((\log _{2}d)\) ebits of ideal entanglement.

Next, let us compare entanglement with classical channels. First, entanglement does not help in augmentation of a classical channel.

Theorem 6.

Without use of another communication resource, no amount of entanglement can convert a \((\log _{2}d)\) -bit ideal classical channel into a \((\log _{2}d')\) -bit ideal classical channel with d′ > d.

Proof.

Suppose that the sender chooses a symbol i ∈ {1, 2, …, d′} uniformly at random. Assume that it is possible to transmit i faithfully by using a \((\log _{2}d)\)-bit ideal classical channel and shared entanglement. Because the output of the channel can be guessed correctly with a probability of 1∕d by random guessing, the receiver can form a strategy, which, without communication, allows the symbol i to be guessed with a success probability of 1∕d. Without communication, however, the receiver's system carries no information about i, so no strategy can guess it with a probability exceeding 1∕d′. Therefore, \(1/d \leq 1/d'\) must hold. □ 

Entanglement is a static resource in the sense that it is simply a correlation and it does not refer to any transfer of information. A classical channel is dynamic with regard to its ability to move information around. In this respect, the theorem above may be regarded as a natural example where a static resource cannot be converted into a dynamic resource. However, there is a subtlety here that will be manifest when we see the protocol for quantum dense coding in Sect. 1.5.4.

Finally, we consider the reverse question of how entanglement can be manipulated with unlimited use of classical channels. Suppose that Alice and Bob can freely use classical channels between them in both directions, and they can locally perform any physically allowed measurement or state transformation. This type of framework is called LOCC (local operations and classical communication).

Suppose that Alice and Bob initially share a pure bipartite state \(\vert \varPsi \rangle _{AB}\), and try to transform this state into other states under the LOCC framework. Without loss of generality, we may assume that only one party is conducting a local operation at any one time. This means that Alice first conducts a local operation, reveals an outcome to Bob through a classical communication, and Bob then conducts a local operation in turn, and so on. For Alice’s turn, her operation is generally written as in Eq. (1.28), with \(\hat{M}^{(j,k)}\) acting on Alice’s system alone. Although the general description includes the index k, which is discarded, for the purposes of state transformation, Alice may as well record this index. Therefore, we omit k and conclude that, after Alice’s first turn, Alice and Bob share state \(\vert \varPsi ^{(j)}\rangle _{A'B}\) with probability \(p_{j}\), where

$$\displaystyle{ p_{j}\vert \varPsi ^{(j)}\rangle _{ A'BA'B}\langle \varPsi ^{(j)}\vert = (\hat{M}_{ A}^{(j)} \otimes \hat{ 1}_{ B})\vert \varPsi \rangle _{ABAB}\langle \varPsi \vert (\hat{M}_{A}^{(j)} \otimes \hat{ 1}_{ B})^{\dag } }$$
(1.36)

and \(\hat{M}_{A}^{(j)}: \mathcal{H}_{A} \rightarrow \mathcal{H}_{A'}\) satisfies \(\sum _{j}\hat{M}_{A}^{(j)\dag }\hat{M}_{A}^{(j)} =\hat{ 1}_{A}\). Let \(\hat{\rho }_{B}\) and \(\hat{\rho }_{B}^{(j)}\) be the marginal density operators of system B for \(\vert \varPsi \rangle _{AB}\) and \(\vert \varPsi ^{(j)}\rangle _{A'B}\), respectively. Taking a partial trace and summation over j in Eq. (1.36), we have

$$\displaystyle{ \sum _{j}p_{j}\hat{\rho }_{B}^{(j)} =\hat{\rho } _{ B}. }$$
(1.37)

This equation shows that the rank of \(\hat{\rho }_{B}^{(j)}\) never exceeds that of \(\hat{\rho }_{B}\). The Schmidt number of state \(\vert \varPsi ^{(j)}\rangle _{A'B}\) therefore never exceeds that of the initial state \(\vert \varPsi \rangle _{AB}\). A similar argument is applicable to Bob’s turns, and we thus see that the Schmidt number never increases under the LOCC framework, even probabilistically. Specifically, no entanglement is generated under the LOCC framework when starting from a product state with a Schmidt number of unity. This is often adopted as a defining property of entanglement when discussing more general cases of mixed-state entanglement.
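The step from Eq. (1.36) to Eq. (1.37), and the rank statement drawn from it, can be spelled out as follows. This is a sketch that uses only the completeness relation \(\sum _{j}\hat{M}_{A}^{(j)\dag }\hat{M}_{A}^{(j)} =\hat{ 1}_{A}\) and the cyclic property of the partial trace for operators that act on the traced-out system alone:

$$\displaystyle\begin{array}{rcl} \sum _{j}p_{j}\hat{\rho }_{B}^{(j)}& =& \sum _{j}\mathrm{Tr}_{A'}\left [(\hat{M}_{A}^{(j)} \otimes \hat{ 1}_{B})\vert \varPsi \rangle _{ABAB}\langle \varPsi \vert (\hat{M}_{A}^{(j)} \otimes \hat{ 1}_{B})^{\dag }\right ] \\ & =& \mathrm{Tr}_{A}\left [\left (\sum _{j}\hat{M}_{A}^{(j)\dag }\hat{M}_{A}^{(j)} \otimes \hat{ 1}_{B}\right )\vert \varPsi \rangle _{ABAB}\langle \varPsi \vert \right ] \\ & =& \mathrm{Tr}_{A}\left (\vert \varPsi \rangle _{ABAB}\langle \varPsi \vert \right ) =\hat{\rho } _{B}.\end{array}$$

If a vector \(\vert v\rangle _{B}\) satisfies \(\hat{\rho }_{B}\vert v\rangle _{B} = 0\), then \(\sum _{j}p_{j}\,{}_{B}\langle v\vert \hat{\rho }_{B}^{(j)}\vert v\rangle _{B} = 0\). Every term in this sum is nonnegative, so each \(\hat{\rho }_{B}^{(j)}\) with \(p_{j} > 0\) also annihilates \(\vert v\rangle _{B}\). The support of \(\hat{\rho }_{B}^{(j)}\) is therefore contained in that of \(\hat{\rho }_{B}\), which gives the bound on the rank.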

In view of the relationships between the communication resources, the above argument means that the classical channels do not help to increase entanglement, and this is summarized as follows.

Theorem 7.

Without use of another communication resource, no amount of communication over classical channels can convert a \((\log _{2}d)\) -ebit ideal entanglement into a \((\log _{2}d')\) -ebit ideal entanglement with d′ > d.

This theorem implies that entanglement has a nonclassical aspect that cannot be replaced by classical channels. If we combine the two resources, we will obtain a resource that is both dynamic and nonclassical, and we may perhaps simulate a quantum channel. This is indeed true, and will be explained in Sect. 1.5.4 after we summarize the properties of the maximally entangled states in Sect. 1.5.3.

5.3 Properties of Maximally Entangled States

Let \(\mathcal{H}_{A}\) and \(\mathcal{H}_{B}\) be Hilbert spaces of dimension d for the systems A and B. Here, we summarize the relevant properties of the maximally entangled states of system AB.

(E1):

All maximally entangled states have a common marginal state \(d^{-1}\hat{1}_{A}\) for subsystem A, and a common marginal state \(d^{-1}\hat{1}_{B}\) for subsystem B.

(E2):

For any pair of maximally entangled states \(\vert \varPhi \rangle _{AB}\) and \(\vert \varPhi '\rangle _{AB}\), unitary operators \(\hat{U}_{A}\) and \(\hat{V }_{B}\) exist such that \(\vert \varPhi '\rangle _{AB} = (\hat{U}_{A} \otimes \hat{ 1}_{B})\vert \varPhi \rangle _{AB} = (\hat{1}_{A} \otimes \hat{ V }_{B})\vert \varPhi \rangle _{AB}\).

(E3):

A maximally entangled state \(\vert \varPhi \rangle _{AB}\) specifies a one-to-one correspondence \(\vert \phi \rangle _{A} \leftrightarrow \vert \phi ^{{\ast}}\rangle _{B}\) between the pure states of subsystem A and those of subsystem B, as characterized by \(d^{-1/2}\vert \phi \rangle _{A} = _{B}\langle \phi ^{{\ast}}\vert \vert \varPhi \rangle _{AB}\) and \(d^{-1/2}\vert \phi ^{{\ast}}\rangle _{B} = _{A}\langle \phi \vert \vert \varPhi \rangle _{AB}\).

(E4):

A maximally entangled state \(\vert \varPhi \rangle _{AB}\) specifies a one-to-one correspondence \(\hat{M}_{A} \leftrightarrow \hat{ M}_{B}^{\mathrm{T}}\) between the operators that act on \(\mathcal{H}_{A}\) and those acting on \(\mathcal{H}_{B}\), as characterized by

$$\displaystyle{ (\hat{M}_{A} \otimes \hat{ 1}_{B})\vert \varPhi \rangle _{AB} = (\hat{1}_{A} \otimes \hat{ M}_{B}^{\mathrm{T}})\vert \varPhi \rangle _{ AB}. }$$
(1.38)

Specifically, if \(\hat{M}_{A}\) is unitary then \(\hat{M}_{B}^{\mathrm{T}}\) is also unitary, and vice versa.

(E5):

There is an orthonormal basis \(\{\vert \varPhi _{l,m}\rangle _{AB}\}_{l=0,\ldots,d-1}^{m=0,\ldots,d-1}\) of \(\mathcal{H}_{A} \otimes \mathcal{H}_{B}\) where every basis state is a maximally entangled state. This type of basis is called a Bell basis.

(E1) is the definition given in Sect. 1.2.5. (E2) is a combination of (E1) and Theorem 3. (E3) refers to the relative states explained in Sect. 1.4.2.

For (E5), a Bell basis that includes state \(\vert \varPhi _{0,0}\rangle _{AB}\) of Eq. (1.35) is constructed as follows. For each subsystem, we define unitary operators

$$\displaystyle{ \hat{X}:=\sum _{ j=0}^{d-1}\vert j + 1\;(\mathrm{mod}\;d)\rangle \langle j\vert \;\;\;\mathrm{and}\;\;\;\hat{Z}:=\sum _{ j=0}^{d-1}\beta ^{j}\vert j\rangle \langle j\vert }$$
(1.39)

with \(\beta:=\exp (2\pi i/d)\). Using these operators, we define \(\vert \varPhi _{l,m}\rangle _{AB}:= (\hat{X}_{A}^{l} \otimes \hat{ Z}_{B}^{m})\vert \varPhi _{0,0}\rangle _{AB}\). Using the relation \(\hat{Z}\hat{X} =\beta \hat{ X}\hat{Z}\), it is simple to show that \(\vert \varPhi _{l,m}\rangle _{AB}\) is a simultaneous eigenvector of the commuting unitary operators \(\hat{X}_{A} \otimes \hat{ X}_{B}\) and \(\hat{Z}_{A} \otimes \hat{ Z}_{B}^{-1}\) with eigenvalues of \(\beta ^{-m}\) and \(\beta ^{l}\), respectively. Because distinct pairs (l, m) give distinct pairs of eigenvalues, the \(d^{2}\) states \(\{\vert \varPhi _{l,m}\rangle _{AB}\}_{l=0,\ldots,d-1}^{m=0,\ldots,d-1}\) are all orthogonal. For d = 2, the Bell basis consists of the following states.

$$\displaystyle\begin{array}{rcl} \vert \varPhi _{+}\rangle & =& \vert \varPhi _{0,0}\rangle = 2^{-1/2}(\vert 0\rangle _{ A}\vert 0\rangle _{B} + \vert 1\rangle _{A}\vert 1\rangle _{B}){}\end{array}$$
(1.40)
$$\displaystyle\begin{array}{rcl} \vert \varPhi _{-}\rangle & =& \vert \varPhi _{0,1}\rangle = 2^{-1/2}(\vert 0\rangle _{ A}\vert 0\rangle _{B} -\vert 1\rangle _{A}\vert 1\rangle _{B}){}\end{array}$$
(1.41)
$$\displaystyle\begin{array}{rcl} \vert \varPsi _{+}\rangle & =& \vert \varPhi _{1,0}\rangle = 2^{-1/2}(\vert 1\rangle _{ A}\vert 0\rangle _{B} + \vert 0\rangle _{A}\vert 1\rangle _{B}){}\end{array}$$
(1.42)
$$\displaystyle\begin{array}{rcl} \vert \varPsi _{-}\rangle & =& \vert \varPhi _{1,1}\rangle = 2^{-1/2}(\vert 1\rangle _{ A}\vert 0\rangle _{B} -\vert 0\rangle _{A}\vert 1\rangle _{B}){}\end{array}$$
(1.43)
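The construction above can be checked numerically. The following minimal sketch builds the amplitudes of \(\vert \varPhi _{l,m}\rangle _{AB}\) directly from the definitions of \(\hat{X}\) and \(\hat{Z}\) in Eq. (1.39) and tests orthonormality for a given d; the function names and the tolerance `tol` are illustrative assumptions.

```python
import cmath
import math

def bell_state(d, l, m):
    """Amplitudes of |Phi_{l,m}> = (X^l (x) Z^m)|Phi_{0,0}> as a table state[a][b]."""
    beta = cmath.exp(2j * math.pi / d)
    state = [[0j] * d for _ in range(d)]
    for j in range(d):
        # X^l shifts the index of A by l; Z^m multiplies component j of B by beta^(m*j).
        state[(j + l) % d][j] = beta ** (m * j) / math.sqrt(d)
    return state

def overlap(s, t):
    """Inner product <s|t> of two bipartite states stored as d x d tables."""
    d = len(s)
    return sum(s[a][b].conjugate() * t[a][b] for a in range(d) for b in range(d))

def is_orthonormal_basis(d, tol=1e-12):
    """Check that the d^2 states |Phi_{l,m}> have the identity as Gram matrix."""
    labels = [(l, m) for l in range(d) for m in range(d)]
    for p in labels:
        for q in labels:
            expected = 1.0 if p == q else 0.0
            if abs(overlap(bell_state(d, *p), bell_state(d, *q)) - expected) > tol:
                return False
    return True
```

For d = 2, `bell_state(2, 1, 1)` reproduces the amplitudes of Eq. (1.43), with the sign of the \(\vert 0\rangle _{A}\vert 1\rangle _{B}\) term negative.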

(E4) is confirmed as follows. Suppose that \(\vert \varPhi \rangle _{AB}\) is decomposed as shown in Eq. (1.7). By applying \(_{A}\langle u_{i}\vert _{B}\langle v_{j}\vert \) to Eq. (1.38), we see that Eq. (1.38) is equivalent to

$$\displaystyle{ _{A}\langle u_{i}\vert \hat{M}_{A}\vert u_{j}\rangle _{A} = _{B}\langle v_{j}\vert \hat{M}_{B}^{\mathrm{T}}\vert v_{ i}\rangle _{B} }$$
(1.44)

for i, j = 1, …, d. This means that the matrix representation of \(\hat{M}_{B}^{\mathrm{T}}\) in the basis \(\{\vert v_{i}\rangle _{B}\}\) is the transpose of the matrix representation of \(\hat{M}_{A}\) in the basis \(\{\vert u_{i}\rangle _{A}\}\).

As an example of Property (E4), the following relations are worth mentioning:

$$\displaystyle\begin{array}{rcl} (\hat{X}_{A} \otimes \hat{ 1}_{B})\vert \varPhi _{0,0}\rangle _{AB}& =& (\hat{1}_{A} \otimes \hat{ X}_{B}^{-1})\vert \varPhi _{ 0,0}\rangle _{AB}{}\end{array}$$
(1.45)
$$\displaystyle\begin{array}{rcl} (\hat{Z}_{A} \otimes \hat{ 1}_{B})\vert \varPhi _{0,0}\rangle _{AB}& =& (\hat{1}_{A} \otimes \hat{ Z}_{B})\vert \varPhi _{0,0}\rangle _{AB},{}\end{array}$$
(1.46)

and can easily be confirmed.
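For completeness, the confirmation can be written out directly from Eqs. (1.35) and (1.39), with all indices understood modulo d:

$$\displaystyle\begin{array}{rcl} (\hat{X}_{A} \otimes \hat{ 1}_{B})\vert \varPhi _{0,0}\rangle _{AB}& =& \frac{1} {\sqrt{d}}\sum _{j=0}^{d-1}\vert j + 1\rangle _{ A}\vert j\rangle _{B} = \frac{1} {\sqrt{d}}\sum _{k=0}^{d-1}\vert k\rangle _{ A}\vert k - 1\rangle _{B} = (\hat{1}_{A} \otimes \hat{ X}_{B}^{-1})\vert \varPhi _{ 0,0}\rangle _{AB}, \\ (\hat{Z}_{A} \otimes \hat{ 1}_{B})\vert \varPhi _{0,0}\rangle _{AB}& =& \frac{1} {\sqrt{d}}\sum _{j=0}^{d-1}\beta ^{j}\vert j\rangle _{ A}\vert j\rangle _{B} = (\hat{1}_{A} \otimes \hat{ Z}_{B})\vert \varPhi _{0,0}\rangle _{AB}.\end{array}$$

This agrees with Property (E4), because in the standard basis the transpose of \(\hat{X}\) is \(\hat{X}^{-1}\), whereas \(\hat{Z}\) is diagonal and equal to its own transpose.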

5.4 Quantum Dense Coding and Quantum Teleportation

In this subsection, we explain two types of scheme in which shared entanglement helps with the conversion between the quantum and classical channels. Every subsystem X that appears in this subsection is a d-level system with Hilbert space of dimension d, and with a standard orthonormal basis denoted by \(\{\vert j\rangle _{X}\}\). The Bell basis is defined for each pair of subsystems according to the standard bases.

In Theorem 4, we have seen that a one-qubit quantum channel alone can only send one bit of classical information. If the sender and the receiver share entanglement beforehand, then the quantum channel can send more via a protocol called quantum dense coding [8].

Theorem 8 (Quantum dense coding).

A \((\log _{2}d)\) -qubit ideal quantum channel and a \((\log _{2}d)\) -ebit ideal entanglement can be converted into a \((2\log _{2}d)\) -bit ideal classical channel.

A protocol for quantum dense coding can be constructed simply by using the Bell basis \(\{\vert \varPhi _{l,m}\rangle _{AB}\}\) of the two d-level subsystems, i.e., Property (E5) in Sect. 1.5.3. We show that Alice can send Bob a symbol (l, m) that was chosen from \(d^{2}\) candidates \(\{(l,m)\}_{l=0,\ldots,d-1}^{m=0,\ldots,d-1}\). Suppose that Alice and Bob shared the entangled state \(\vert \varPhi _{0,0}\rangle _{AB}\) initially. Property (E2) ensures that Alice can locally transform (Footnote 6) the state \(\vert \varPhi _{0,0}\rangle\) into the state \(\vert \varPhi _{l,m}\rangle\) that is specified by the chosen symbol (l, m). She then sends subsystem A, which has a Hilbert space with dimension d, to Bob using the \((\log _{2}d)\)-qubit quantum channel. Bob, who now holds both subsystems A and B, conducts an orthogonal measurement with the Bell basis \(\{\vert \varPhi _{l,m}\rangle _{AB}\}\) to determine Alice’s choice (l, m).
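The protocol can be simulated with plain state vectors. The sketch below assumes one concrete choice for Alice's local operation, namely \(\hat{X}_{A}^{l}\hat{Z}_{A}^{m}\), which yields \(\vert \varPhi _{l,m}\rangle _{AB}\) by Eq. (1.46); the function names are illustrative, and Bob's Bell measurement is modeled by computing the outcome probabilities \(\vert \langle \varPhi _{l',m'}\vert \varPsi \rangle \vert ^{2}\).

```python
import cmath
import math

def bell_state(d, l, m):
    """Amplitudes of |Phi_{l,m}> = (X^l (x) Z^m)|Phi_{0,0}> as a table state[a][b]."""
    beta = cmath.exp(2j * math.pi / d)
    state = [[0j] * d for _ in range(d)]
    for j in range(d):
        state[(j + l) % d][j] = beta ** (m * j) / math.sqrt(d)
    return state

def alice_encode(d, l, m):
    """Apply X^l Z^m to subsystem A of |Phi_{0,0}> (assumed choice of local unitary)."""
    beta = cmath.exp(2j * math.pi / d)
    phi00 = bell_state(d, 0, 0)
    # Z^m on A: component a acquires the phase beta^(m*a).
    after_z = [[beta ** (m * a) * phi00[a][b] for b in range(d)] for a in range(d)]
    # X^l on A: the component at index a comes from index a - l (mod d).
    return [[after_z[(a - l) % d][b] for b in range(d)] for a in range(d)]

def bob_decode(d, state):
    """Bell measurement on AB: return the most likely outcome and all probabilities."""
    probs = {}
    for l in range(d):
        for m in range(d):
            bell = bell_state(d, l, m)
            amp = sum(bell[a][b].conjugate() * state[a][b]
                      for a in range(d) for b in range(d))
            probs[(l, m)] = abs(amp) ** 2
    return max(probs, key=probs.get), probs
```

Calling `bob_decode(d, alice_encode(d, l, m))` returns the symbol (l, m) that Alice encoded, with the corresponding probability equal to one up to rounding.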

This protocol is remarkable in the sense that the static resource of entanglement enhances an ideal channel’s ability to achieve the dynamic task of information transmission. This is in stark contrast with what we saw in Theorem 6, i.e., that the static resource of entanglement cannot augment the dynamic resources of classical channels.

Next, we explain the protocol of quantum teleportation [9], which combines the nonclassical resource of entanglement and the dynamic resource of a classical channel to achieve faithful transmission of quantum states.

Theorem 9 (Quantum teleportation).

A \((2\log _{2}d)\) -bit ideal classical channel and a \((\log _{2}d)\) -ebit ideal entanglement can be converted into a \((\log _{2}d)\) -qubit ideal quantum channel.

The protocol proceeds as follows. Suppose that Alice and Bob initially share the entangled state \(\vert \varPhi _{0,0}\rangle _{AB}\) of two d-level systems. Alice also holds another d-level subsystem A′, and she is supposed to transmit the state of this subsystem to Bob. Alice first performs an orthogonal measurement with the Bell basis \(\{\vert \varPhi _{l,m}\rangle _{AA'}\}\) on subsystems A and A′, and transmits the outcome (l, m) to Bob through the \((2\log _{2}d)\)-bit classical channel. Based on the received indices (l, m), Bob then applies a unitary transformation \(\hat{U}_{B}^{(l,m)}\) to subsystem B.

We now consider how we can choose \(\hat{U}_{B}^{(l,m)}\) such that the final state of system B is always identical to the initial state of system A′ (see also Fig. 1.5). Consider another d-level system R, and suppose that the system A′R is initially prepared in state \(\vert \varPhi _{0,0}\rangle _{A'R}\). Later, at the end of this argument, we will use Property (E3) to discuss the case where A′ is initially prepared in the general state \(\vert \phi \rangle _{A'}\).

Fig. 1.5 Entanglement swapping and quantum teleportation

We begin with the following relation, which can be easily confirmed from the definition of Eq. (1.35):

$$\displaystyle\begin{array}{rcl} d^{-1}\vert \varPhi _{ 0,0}\rangle _{BR} = _{AA'}\langle \varPhi _{0,0}\vert \vert \varPhi _{0,0}\rangle _{AB}\vert \varPhi _{0,0}\rangle _{A'R}.& &{}\end{array}$$
(1.47)

According to Theorem 1, this shows that if the outcome is (l, m) = (0, 0), then the state of the system BR is \(\vert \varPhi _{0,0}\rangle _{BR}\). We want to generalize this relationship to the case where \(_{AA'}\langle \varPhi _{0,0}\vert \) is replaced by \(_{AA'}\langle \varPhi _{l,m}\vert \). From Eq. (1.46), we have \(\vert \varPhi _{l,m}\rangle _{AA'} = (\hat{X}_{A}^{l}\hat{Z}_{A}^{m} \otimes \hat{ 1}_{A'})\vert \varPhi _{0,0}\rangle _{AA'}\) and thus \(_{AA'}\langle \varPhi _{l,m}\vert = _{AA'}\langle \varPhi _{0,0}\vert (\hat{Z}_{A}^{-m}\hat{X}_{A}^{-l} \otimes \hat{ 1}_{A'})\). From Eqs. (1.45) and (1.46), we have \((\hat{Z}_{A}^{-m}\hat{X}_{A}^{-l} \otimes \hat{ 1}_{B})\vert \varPhi _{0,0}\rangle _{AB} = (\hat{1}_{A} \otimes \hat{ X}_{B}^{l}\hat{Z}_{B}^{-m})\vert \varPhi _{0,0}\rangle _{AB}\). We therefore obtain

$$\displaystyle\begin{array}{rcl} d^{-1}(\hat{X}_{ B}^{l}\hat{Z}_{ B}^{-m} \otimes \hat{ 1}_{ R})\vert \varPhi _{0,0}\rangle _{BR} = _{AA'}\langle \varPhi _{l,m}\vert \vert \varPhi _{0,0}\rangle _{AB}\vert \varPhi _{0,0}\rangle _{A'R},& &{}\end{array}$$
(1.48)

which identifies the state of the system BR after the Bell measurement in the protocol. By setting \(\hat{U}_{B}^{(l,m)} =\hat{ Z}_{B}^{m}\hat{X}_{B}^{-l}\), the protocol should leave the system BR in the same state, \(\vert \varPhi _{0,0}\rangle _{BR}\), regardless of the value of the outcome (l, m). In summary, if we begin with state \(\vert \varPhi _{0,0}\rangle _{A'R}\), the protocol then transforms it into state \(\vert \varPhi _{0,0}\rangle _{BR}\), in which the system with which R is entangled changes from A′, possessed by Alice, to B, which is held by Bob. This procedure is often called entanglement swapping [10].

The case where system A′ is initially prepared in the arbitrary state \(\vert \phi \rangle _{A'} =\sum _{j}c_{j}\vert j\rangle _{A'}\) can be analyzed using Property (E3) on the relative states, as in Sect. 1.4.2. After entanglement swapping, the state of the system BR is \(\vert \varPhi _{0,0}\rangle _{BR}\). Suppose that we perform an orthogonal measurement on system R with a basis that includes a state \(\vert \phi ^{{\ast}}\rangle _{R} =\sum _{j}\bar{c}_{j}\vert j\rangle _{R}\). If the corresponding outcome is obtained, then the state of system B becomes its relative state, \(\vert \phi \rangle _{B} =\sum _{j}c_{j}\vert j\rangle _{B}\). Because the entanglement swapping protocol starts with state \(\vert \varPhi _{0,0}\rangle _{A'R}\) and does not operate on system R, Theorem 1 then dictates that such an event must be consistent with the case where system A′ was initially prepared in \(\vert \phi \rangle _{A'} =\sum _{j}c_{j}\vert j\rangle _{A'}\). We thus conclude that if we carry out the protocol with initial state \(\vert \phi \rangle _{A'} =\sum _{j}c_{j}\vert j\rangle _{A'}\), the final state of system B is \(\vert \phi \rangle _{B} =\sum _{j}c_{j}\vert j\rangle _{B}\), which is regarded as a faithful transmission of the quantum state.
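The conclusion can also be checked by direct calculation for a given input state. The following minimal sketch projects subsystems A and A′ onto each Bell state \(\vert \varPhi _{l,m}\rangle _{AA'}\), applies the correction \(\hat{Z}_{B}^{m}\hat{X}_{B}^{-l}\) derived above, and returns the outcome probability together with Bob's normalized state. The function name `teleport` and the storage layout `psi[a][b][ap]` are assumptions made for illustration.

```python
import cmath
import math

def teleport(c):
    """Simulate teleportation of the d-level state sum_j c[j]|j> held in A'.

    Returns {(l, m): (probability, corrected state of B)} for every Bell outcome.
    """
    d = len(c)
    beta = cmath.exp(2j * math.pi / d)
    root = math.sqrt(d)
    # Joint amplitudes psi[a][b][ap] of |Phi_{0,0}>_{AB} |phi>_{A'}.
    psi = [[[c[ap] / root if a == b else 0j for ap in range(d)]
            for b in range(d)] for a in range(d)]
    results = {}
    for l in range(d):
        for m in range(d):
            # |Phi_{l,m}>_{AA'}: X^l acts on A, Z^m acts on A'.
            bell = [[beta ** (m * ap) / root if a == (ap + l) % d else 0j
                     for ap in range(d)] for a in range(d)]
            # Unnormalized state of B after outcome (l, m).
            out = [sum(bell[a][ap].conjugate() * psi[a][b][ap]
                       for a in range(d) for ap in range(d)) for b in range(d)]
            prob = sum(abs(x) ** 2 for x in out)
            # Bob's correction Z^m X^{-l}: first shift by -l, then apply phases.
            shifted = [out[(k + l) % d] for k in range(d)]
            corrected = [beta ** (m * k) * shifted[k] for k in range(d)]
            norm = math.sqrt(prob)
            results[(l, m)] = (prob, [x / norm for x in corrected])
    return results

# Example input: an arbitrary normalized three-level state.
results = teleport([0.6, 0.8j, 0j])
```

For a normalized input, every outcome occurs with probability \(d^{-2}\), and the corrected state of B coincides with the input amplitudes for all (l, m), in agreement with Eq. (1.48).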

The existence of this quantum teleportation protocol has profound consequences. Because classical channels are much easier to implement in practice, let us assume that these channels can be used freely in both directions between Alice and Bob. The quantum teleportation protocol and the entanglement sharing of Theorem 5 then imply that one qubit of dynamic resource and one ebit of static resource are freely interconvertible. Because the static resource of entanglement can be stored in quantum memories, this effectively allows dynamic resource storage. If the quantum channels are not ideal but noisy, we may convert these channels into noisy entanglement, which is then distilled into close-to-ideal entanglement and can be used for faithful quantum transmission. When we wish to concatenate the quantum channels, which will only work probabilistically, as in the case of transmission of photons over an optical fiber, the combination of entanglement sharing and entanglement swapping dramatically improves the process efficiency, as described in Chap. 4. It should also be noted that entanglement has no preferred direction. We can convert a quantum channel from Alice to Bob into a channel from Bob to Alice, through the protocol of entanglement sharing followed by quantum teleportation with backward classical communication.

5.5 Conversion Among the Resources

In the preceding subsections, we have described three protocols, entanglement sharing, quantum dense coding, and quantum teleportation, that provide conversion among the three types of communication resources: ebits, bits, and qubits. Because these protocols were introduced in a rather heuristic way, we might expect that there are many other protocols that can be used for resource conversion. Here, we argue that this is not the case. The three protocols in a sense exhaust all possibilities as far as conversion among the three ideal resource types is concerned.

Imagine that Alice and Bob have a right to use E ebits of a shared ideal entanglement, C bits of an ideal classical channel, and Q qubits of an ideal quantum channel, which we denote by the portfolio (E, C, Q). According to Theorems 5, 8, and 9, the three protocols change the portfolio in the following way.

Entanglement sharing (ES):

\((E,C,Q) \rightarrow (E + 1,C,Q - 1)\)

Quantum dense coding (DC):

\((E,C,Q) \rightarrow (E - 1,C + 2,Q - 1)\)

Quantum teleportation (QT):

\((E,C,Q) \rightarrow (E - 1,C - 2,Q + 1)\)

Let us assume that we start from \((E_{0},C_{0},Q_{0})\). By repeating these protocols \(N_{\mathrm{ES}}\), \(N_{\mathrm{DC}}\), and \(N_{\mathrm{QT}}\) times,

$$\displaystyle\begin{array}{rcl} (E_{0},C_{0},Q_{0}) + N_{\mathrm{ES}}(1,0,-1) + N_{\mathrm{DC}}(-1,2,-1) + N_{\mathrm{QT}}(-1,-2,1)& &{}\end{array}$$
(1.49)

is attainable. Therefore, if we ignore the fact that only a discrete set of points is attainable, we may say that it is possible to reach anywhere within a triangular pyramid with apex \((E_{0},C_{0},Q_{0})\) and with edges defined by the vectors \((1,0,-1)\), \((-1,2,-1)\), and \((-1,-2,1)\) (see Fig. 1.6).

Fig. 1.6 Permitted resource conversion region. ES: entanglement sharing; DC: quantum dense coding; QT: quantum teleportation
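The bookkeeping of Eq. (1.49) can be sketched in a few lines. The code below applies a sequence of protocols to a portfolio (E, C, Q), one unit at a time, and refuses any step that would require a resource that is not available. The names `STEP` and `run` are illustrative assumptions, as is the simplification that each step consumes and produces whole units.

```python
# Change of the portfolio (E, C, Q) caused by a single use of each protocol.
STEP = {
    "ES": (1, 0, -1),    # entanglement sharing
    "DC": (-1, 2, -1),   # quantum dense coding
    "QT": (-1, -2, 1),   # quantum teleportation
}

def run(portfolio, sequence):
    """Apply the named protocols in order and return the final portfolio."""
    e, c, q = portfolio
    for name in sequence:
        de, dc, dq = STEP[name]
        e, c, q = e + de, c + dc, q + dq
        if min(e, c, q) < 0:
            raise ValueError(f"{name} needs a resource that is not available")
    return (e, c, q)

# Two qubits of quantum channel: share one ebit, then dense-code with the other qubit.
two_bits = run((0, 0, 2), ["ES", "DC"])
```

In this example, the portfolio passes through (1, 0, 1) and ends at two bits of classical channel, (0, 2, 0).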

We are now interested in whether we can reach a point outside this pyramid. We have already derived various restrictions on resource conversion in Theorems 4, 6, and 7, which are summarized as follows.

Theorem 4:

(0, 0, Q) → (0, C′, 0) only if C′ ≤ Q.

Theorem 6:

(E, C, 0) → (E′, C′, 0) only if C′ ≤ C.

Theorem 7:

(E, C, 0) → (E′, C′, 0) only if E′ ≤ E.

Using these theorems, we derive a restriction on a general protocol \(\mathcal{P}\) that performs the conversion \((E_{0},C_{0},Q_{0}) \rightarrow (E,C,Q)\). This is done by combining \(\mathcal{P}\) with the three protocols, such that the theorems above are applicable to the entire conversion process. For example, we have

$$\displaystyle\begin{array}{rcl} & & (0,0,Q_{0} + Q + 2C_{0} + E_{0} + E)\mathop{\longrightarrow }\limits^{\mathrm{ES}}(Q + C_{0} + E_{0},0,Q_{0} + C_{0} + E)\mathop{\longrightarrow }\limits^{\mathrm{DC}} {}\\ & & (Q + E_{0},2C_{0},Q_{0} + E)\mathop{\longrightarrow }\limits^{\mathcal{P}}(Q + E,C_{0} + C,Q + E)\mathop{\longrightarrow }\limits^{\mathrm{DC}} {}\\ & & \qquad (0,2Q + C_{0} + C + 2E,0), {}\\ \end{array}$$

which, from Theorem 4, requires that

$$\displaystyle{ (E - E_{0}) + (C - C_{0}) + (Q - Q_{0}) \leq 0. }$$
(1.50)

Similarly, from

$$\displaystyle\begin{array}{rcl} & & (Q_{0} + Q + E_{0},C_{0} + 2Q_{0},0)\mathop{\longrightarrow }\limits^{\mathrm{QT}}(Q + E_{0},C_{0},Q_{0})\mathop{\longrightarrow }\limits^{\mathcal{P}}(Q + E,C,Q) {}\\ & & \mathop{\longrightarrow }\limits^{\mathrm{DC}}(E,C + 2Q,0), {}\\ \end{array}$$

we use Theorem 6 to obtain

$$\displaystyle{ (C - C_{0}) + 2(Q - Q_{0}) \leq 0. }$$
(1.51)

Finally, by applying Theorem 7 to

$$\displaystyle\begin{array}{rcl} & & (Q_{0} + E_{0},C_{0} + 2Q_{0},0)\mathop{\longrightarrow }\limits^{\mathrm{QT}}(E_{0},C_{0},Q_{0})\mathop{\longrightarrow }\limits^{\mathcal{P}}(E,C,Q)\mathop{\longrightarrow }\limits^{\mathrm{ES}}(Q + E,C,0), {}\\ \end{array}$$

we have

$$\displaystyle{ (E - E_{0}) + (Q - Q_{0}) \leq 0. }$$
(1.52)

It is simple to confirm that Eqs. (1.50), (1.51), and (1.52) correspond to the three faces of the pyramid in Fig. 1.6. It is thus impossible to reach any point outside the pyramid. We see that the three protocols of entanglement sharing, quantum dense coding, and quantum teleportation correspond to the three edges of the achievable region, and form a unique triad that governs the conversions that are allowed among the resources of quantum channels, classical channels, and entanglement.
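One way to carry out this confirmation is to solve Eq. (1.49) for the three repetition numbers. Writing \(\varDelta E = E - E_{0}\) and so on, one finds \(N_{\mathrm{QT}} = -(\varDelta E +\varDelta C +\varDelta Q)/2\), \(N_{\mathrm{ES}} = -(\varDelta C + 2\varDelta Q)/2\), and \(N_{\mathrm{DC}} = -(\varDelta E +\varDelta Q)/2\), so that Eqs. (1.50), (1.51), and (1.52) state that these three numbers are nonnegative. The sketch below implements this check with exact rational arithmetic; the function names are illustrative, and fractional counts are allowed in the same spirit in which the text ignores the discreteness of the attainable points.

```python
from fractions import Fraction

def protocol_counts(start, target):
    """Solve Eq. (1.49) for (N_ES, N_DC, N_QT) given two portfolios (E, C, Q)."""
    d_e, d_c, d_q = (Fraction(t) - Fraction(s) for s, t in zip(start, target))
    n_qt = -(d_e + d_c + d_q) / 2   # nonnegative iff Eq. (1.50) holds
    n_es = -(d_c + 2 * d_q) / 2     # nonnegative iff Eq. (1.51) holds
    n_dc = -(d_e + d_q) / 2         # nonnegative iff Eq. (1.52) holds
    return n_es, n_dc, n_qt

def inside_pyramid(start, target):
    """True if the target satisfies Eqs. (1.50), (1.51), and (1.52)."""
    return all(n >= 0 for n in protocol_counts(start, target))
```

For example, `protocol_counts((0, 0, 2), (0, 2, 0))` gives one use each of entanglement sharing and dense coding and no teleportation, whereas `inside_pyramid((0, 0, 1), (0, 3, 0))` is false, in line with Theorem 4.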