2.1 Matrix Representations

People often use the expression “apples and oranges” to flag false analogies and similes. To a mathematician, apples are very much like oranges in the sense that each apple (orange) can be represented by an integer or a deficit of apples (oranges) by a negative integer. Both sets exhibit the behavior of a mathematical structure called a group. So if two apples and three apples add to five apples, the same is true for the oranges. A correspondence in which the members of set A, and the operations within that set, are put into one-to-one relation with the members and operations of set B is called an isomorphism. In this chapter, we illustrate how the formal structure introduced in the previous chapter is isomorphic to a vector space whose elements are matrices. We will find that ket vectors can be represented by column matrices and their duals, the bra vectors, by row matrices.

Let’s start with the simplest non-trivial Hilbert space, that of a single qubit. Since the kets |0〉, |1〉 span a Hilbert space of dimension d = 2, any ket |Ψ〉 in this space can be expressed as the linear combination |Ψ〉 = c1|0〉 + c2|1〉. The constraint 〈Ψ|Ψ〉 = 1 on a physical state imposes the restriction that \( |c_{1}|^{2} + |c_{2}|^{2} = 1 \).

Consider the following array, also called a column matrix,

$$\displaystyle \begin{aligned} \begin{array}{rcl} \left( \begin{array}{c} c_{1} \\ c_{2} \\ \end{array} \right) {} \end{array} \end{aligned} $$
(2.1)

where c1, c2 are complex numbers. Matrix addition of two column matrices follows the rule

$$\displaystyle \begin{aligned} \left( \begin{array}{c} c_{1} \\ c_{2} \\ \end{array} \right) + \left( \begin{array}{c} d_{1} \\ d_{2} \\ \end{array} \right) = \left ( \begin{array}{c} c_{1} +d_{1} \\ c_{2} + d_{2} \\ \end{array} \right ) \end{aligned}$$

and it is apparent that the set of all possible column matrices forms a vector space. This proposition is verified by checking conditions (i–vii) enumerated in the previous chapter. According to the definition of matrix addition, condition (i) is obviously satisfied. So is condition (ii), as multiplication of a column matrix by a number (scalar) simply multiplies each entry in the array by that number. Conditions (iii–v) follow from the definitions of matrix addition and scalar multiplication. The null vector is given by the array

$$\displaystyle \begin{aligned} \left ( \begin{array}{c} 0 \\ 0 \\ \end{array} \right) \end{aligned}$$

and each vector (2.1) has a unique inverse

$$\displaystyle \begin{aligned} \left( \begin{array}{c} -c_{1} \\ -c_{2} \\ \end{array} \right). \end{aligned}$$

Because the equality

$$\displaystyle \begin{aligned} \begin{array}{rcl} c_{1} \left ( \begin{array}{c} 1 \\ 0 \\ \end{array} \right) + c_{2} \left ( \begin{array}{c} 0 \\ 1 \\ \end{array} \right) = \left ( \begin{array}{c} 0 \\ 0 \\ \end{array} \right) {} \end{array} \end{aligned} $$
(2.2)

is satisfied if and only if c1 = c2 = 0, the vectors

$$\displaystyle \begin{aligned} \left ( \begin{array}{c} 1 \\ 0 \\ \end{array} \right) \quad {\mathrm{and}} \quad \left( \begin{array}{c} 0 \\ 1 \\ \end{array} \right) \end{aligned}$$

are linearly independent. Also, since this is the largest possible set of linearly independent vectors, the dimension of this vector space is d = 2.

We conclude that there is an isomorphism between the abstract kets |0〉, |1〉 and column matrices. We assert the association

$$\displaystyle \begin{aligned} \begin{array}{rcl} && | 0 \rangle \Leftrightarrow \left ( \begin{array}{c} 1 \\ 0 \\ \end{array} \right) \\ && | 1 \rangle \Leftrightarrow \left ( \begin{array}{c} 0 \\ 1 \\ \end{array} \right). {} \end{array} \end{aligned} $$
(2.3)

The matrices on the right-hand side of (2.3) are said to be a matrix representation of the ket vectors, and the double arrows indicate that we can always replace a ket with its matrix representation, and vice versa. This identification is practiced so often that physicists make it a habit to denote column matrices as kets without bothering to acknowledge the implicit isomorphism. As we become more accustomed to working with matrices, we will also fall into this habit and replace ⇔ with an equality sign.

Having established the isomorphism between the vector space of column matrices and the ket space of a qubit, we now focus on row matrices. Obviously, they also form a vector space, distinct from that of the column matrices. Nevertheless, we can associate with each column matrix a corresponding row matrix; the row matrices constitute a dual space. Given the column matrix

$$\displaystyle \begin{aligned} \begin{array}{rcl} \left ( \begin{array}{c} c_{1} \\ c_{2} \\ \end{array} \right) = c_{1} \left ( \begin{array}{c} 1 \\ 0 \\ \end{array} \right) + c_{2} \left ( \begin{array}{c} 0 \\ 1 \\ \end{array} \right), \end{array} \end{aligned} $$

we define its dual, the row matrix

$$\displaystyle \begin{aligned} \begin{array}{rcl} \left ( \begin{array}{cc} c^{*}_{1} & c^{*}_{2} \end{array} \right) = c_{1}^{*} \, \left ( \begin{array}{cc} 1 & 0 \end{array} \right) + c^{*}_{2} \left ( \begin{array}{cc} 0 & 1 \end{array} \right). \end{array} \end{aligned} $$
(2.4)

It’s evident that the row matrices \( \left ( \begin {array}{cc} 1 & 0 \end {array} \right ) \) and \( \left ( \begin {array}{cc} 0 & 1 \end {array} \right ) \) are basis vectors for the vector space of all two-dimensional row matrices, and it follows that the association

$$\displaystyle \begin{aligned} \begin{array}{rcl} && \langle 0 | \Leftrightarrow \left ( \begin{array}{cc} 1 & 0 \end{array} \right) \\ && \langle 1 | \Leftrightarrow \left ( \begin{array}{cc} 0 & 1 \end{array} \right) {} \end{array} \end{aligned} $$
(2.5)

is appropriate.

Mathematica Notebook 2.1: Matrix manipulations and operations with Mathematica http://www.physics.unlv.edu/%7Ebernard/MATH_book/Chap2/chap2_link.html

2.1.1 Matrix Operations

We established that the matrix representation of a qubit ket is a two-dimensional column matrix, whereas its bra dual is the corresponding row matrix obtained using rule (2.5). But we also know that column and row matrices can be multiplied in two different ways. Let’s review those rules. Convention tells us that if an n-dimensional row matrix with entries a1, a2, …, an is placed to the left of an n-dimensional column matrix with entries b1, b2, …, bn, their product is

$$\displaystyle \begin{aligned} \begin{array}{rcl} \sum_{i=1}^{n} \, a_{i} \, b_{i}. {} \end{array} \end{aligned} $$
(2.6)

For the case n = 2,

$$\displaystyle \begin{aligned} \begin{array}{rcl} \left ( \begin{array}{cc} a_{1} & a_{2} \end{array} \right) \, \left ( \begin{array}{c} b_{1} \\ b_{2} \end{array} \right) = a_{1} b_{1} + a_{2} b_{2}. {} \end{array} \end{aligned} $$
(2.7)

The latter is called the scalar product and, as we show below, is identical to the definition discussed in Chap. 1. Let |Ψ〉 = c1|0〉 + c2|1〉 and |Φ〉 = d1|0〉 + d2|1〉. According to the discussions of the previous chapter, their inner product is \( \langle \varPhi | \varPsi \rangle = d^{*}_{1} c_{1} + d^{*}_{2} c_{2} \). If we make the associations

$$\displaystyle \begin{aligned} \begin{array}{rcl} && |\varPsi \rangle \Rightarrow \left ( \begin{array}{c} c_{1} \\ c_{2} \end{array} \right) \\ && |\varPhi \rangle \Rightarrow \left ( \begin{array}{c} d_{1} \\ d_{2} \end{array} \right) {} \end{array} \end{aligned} $$
(2.8)

then, according to the above definitions,

$$\displaystyle \begin{aligned} \begin{array}{rcl} \langle \varPhi | \varPsi \rangle \rightarrow \left ( \begin{array}{cc} d^{*}_{1} & d^{*}_{2} \end{array} \right) \left ( \begin{array}{c} c_{1} \\ c_{2} \end{array} \right) = d^{*}_{1} c_{1} + d^{*}_{2} c_{2}. {} \end{array} \end{aligned} $$
(2.9)
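In the spirit of the chapter’s Mathematica notebooks, a minimal sketch of the inner-product rule (2.9) might read as follows; c1, c2, d1, d2 are placeholder symbols for the complex amplitudes.

```mathematica
(* Kets as lists (column vectors); the bra is the entrywise complex conjugate *)
psi = {c1, c2};   (* |Psi> = c1 |0> + c2 |1> *)
phi = {d1, d2};   (* |Phi> = d1 |0> + d2 |1> *)

Conjugate[phi] . psi   (* = d1* c1 + d2* c2, reproducing (2.9) *)
```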

In the previous chapter, we arranged bra-ket combinations to construct operators in the following manner

$$\displaystyle \begin{aligned} {\mathbf{X}} \equiv | \varPsi \rangle \langle \varPhi |. \end{aligned}$$

Prescription (2.8) leads us to consider the matrix operation

$$\displaystyle \begin{aligned} \begin{array}{rcl} {\mathbf{X}} \Rightarrow \left ( \begin{array}{c} c_{1} \\ c_{2} \end{array} \right ) \left ( \begin{array}{cc} d^{*}_{1} & d^{*}_{2} \end{array} \right ) \equiv \left ( \begin{array}{cc} c_{1} d^{*}_{1} & c_{1} d^{*}_{2} \\ c_{2} d^{*}_{1} & c_{2} d^{*}_{2} \end{array} \right ). {} \end{array} \end{aligned} $$
(2.10)

Thus the outer product X is isomorphic to a 2 × 2 square matrix. When we express this as an equality, i.e.

$$\displaystyle \begin{aligned} \begin{array}{rcl} {\mathbf{X}} = \left ( \begin{array}{cc} c_{1} d^{*}_{1} & c_{1} d^{*}_{2} \\ c_{2} d^{*}_{1} & c_{2} d^{*}_{2} \end{array} \right ), {} \end{array} \end{aligned} $$
(2.11)

it is implicit that X is a matrix representation of operator |Ψ〉〈Φ|. There is another equivalent way to express this isomorphism. Using the definition X = |Ψ〉〈Φ|, we form all possible inner products of the two vectors X|0〉 and X|1〉 with bras 〈0|, 〈1|. We then organize the resulting scalar quantities in the following tabular, or matrix, form

$$\displaystyle \begin{aligned} \begin{array}{rcl} && \left ( \begin{array}{cc} \langle 0 | {\mathbf{X}} |0\rangle & \langle 0 | {\mathbf{X}} |1 \rangle \\ \langle 1 | {\mathbf{X}} | 0\rangle & \langle 1 | {\mathbf{X}} |1 \rangle \end{array} \right ) = \left ( \begin{array}{cc} \langle 0 | \varPsi \rangle \langle \varPhi | 0 \rangle & \langle 0 | \varPsi \rangle \langle \varPhi | 1 \rangle \\ \langle 1 | \varPsi \rangle \langle \varPhi | 0 \rangle & \langle 1 | \varPsi \rangle \langle \varPhi | 1 \rangle \end{array} \right ) {} \end{array} \end{aligned} $$
(2.12)

and which, when evaluated, agrees with expression (2.11). In general, an n-dimensional column matrix with entries a1, a2, …, an placed to the left of an n-dimensional row matrix with entries b1, b2, …, bn implies an n × n table, or square matrix, whose entry in the ith row and jth column is the product ai bj.
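A similarly minimal sketch of the outer-product rule: multiplying a 2 × 1 column by a 1 × 2 row reproduces the square matrix (2.11). Again, the amplitudes are placeholder symbols.

```mathematica
(* |Psi> as a 2x1 column and <Phi| as a 1x2 row; their product is a 2x2 matrix *)
psiKet = {{c1}, {c2}};
phiBra = {{Conjugate[d1], Conjugate[d2]}};

X = psiKet . phiBra;
MatrixForm[X]   (* reproduces the matrix representation (2.11) of |Psi><Phi| *)
```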

Now |Ψ〉〈Φ| operates on kets to its right, or on bras to its left, and engenders a transformation of vectors in the Hilbert space and its dual space respectively. Let’s illustrate this transformation in the matrix representation. Consider the vector X|0〉; using the matrix representations of X and of the ket |0〉 we find

$$\displaystyle \begin{aligned} \begin{array}{rcl} {\mathbf{X}} |0 \rangle \Longrightarrow \left ( \begin{array}{cc} c_{1} d^{*}_{1} & c_{1} d^{*}_{2} \\ c_{2} d^{*}_{1} & c_{2} d^{*}_{2} \end{array} \right ) \left ( \begin{array}{c} 1 \\ 0 \end{array} \right ) = \left ( \begin{array}{c} c_{1} d^{*}_{1} \\ c_{2} d^{*}_{1} \end{array} \right ). {} \end{array} \end{aligned} $$
(2.13)

Because X|0〉 = |Ψ〉〈Φ|0〉 and \( \langle \varPhi | 0 \rangle = d^{*}_{1} \)

$$\displaystyle \begin{aligned} {\mathbf{X}} |0 \rangle = d^{*}_{1} |\varPsi \rangle. \end{aligned}$$

The latter agrees with (2.13) when |Ψ〉 is replaced by its matrix representation. Similarly, taking the conjugate transpose of (2.11),

$$\displaystyle \begin{aligned} \begin{array}{rcl} \left ( \begin{array}{cc} 1 & 0 \end{array} \right ) \left ( \begin{array}{cc} c^{*}_{1} d_{1} & c_{2}^{*} d_{1} \\ c_{1}^{*} d_{2} & c_{2}^{*} d_{2} \end{array} \right ) =\left ( \begin{array}{cc} c_{1}^{*} d_{1} & c_{2}^{*} d_{1} \end{array} \right ), {} \end{array} \end{aligned} $$
(2.14)

the matrix representation of the relation \( d_{1} \langle \varPsi | = \langle 0 | {\mathbf{X}}^{\dag} \).

2.1.2 The Bloch Sphere

We now have the tools that allow us to investigate, in more detail, the properties of the qubit Hilbert space. According to the above discussion the matrix representation for a qubit |Ψ〉 is

$$\displaystyle \begin{aligned} \begin{array}{rcl} | \varPsi \rangle = \left ( \begin{array}{c} c_{1} \\ c_{2} \end{array} \right ) \quad \quad |c_{1}|{}^{2} + |c_{2}|{}^2 =1, {} \end{array} \end{aligned} $$
(2.15)

where the equivalence symbol ⇔ is replaced by an equality. c1, c2 are complex numbers which we express in the form c1 = x0 + i x1, c2 = x2 + i x3, where x0, x1, x2, x3 are four independent real parameters. The requirement that |Ψ〉 is a physical (normalized) state imposes a constraint on the constants c1, c2 so that \( x_{0}^2+x_{1}^2+x_{2}^2+x_{3}^2 = 1 \). This equation describes a 3-sphere, centered at the origin, embedded in a four-dimensional space. Let’s define a Hopf map of a point (x0, x1, x2, x3) in this space to a point (x, y, z) in a three-dimensional space where

$$\displaystyle \begin{aligned} \begin{array}{rcl} &\displaystyle &\displaystyle x = 2 ( x_{0} x_{2} + x_{1} x_{3} ) \\ &\displaystyle &\displaystyle y = 2 (x_{3} x_{0} - x_{1} x_{2} ) \\ &\displaystyle &\displaystyle z = x_{0}^{2}+x_{1}^2-x_{2}^2-x_{3}^2. {} \end{array} \end{aligned} $$
(2.16)

With relation (2.16) we find that

$$\displaystyle \begin{aligned} \sqrt{x^{2}+y^{2}+ z^{2}} = x_{0}^2+x_{1}^2+x_{2}^{2} +x_{3}^2 = 1, \end{aligned}$$

where we used the fact that (x0, x1, x2, x3) lies on a unit 3-sphere. Therefore, the points (x, y, z) lie on the surface of a three-dimensional sphere of unit radius, the Bloch sphere. If we parameterize the coordinates

$$\displaystyle \begin{aligned} \begin{array}{rcl} &\displaystyle &\displaystyle x_{0} = \cos{}(\theta/2) \cos{}(\beta) \\ &\displaystyle &\displaystyle x_{1} = \cos{}(\theta/2) \sin{}(\beta) \\ &\displaystyle &\displaystyle x_{2} = \sin{}(\theta/2) \cos{}(\beta+ \phi) \\ &\displaystyle &\displaystyle x_{3} = \sin{}(\theta/2)\sin{}(\beta + \phi) \end{array} \end{aligned} $$

for 0 ≤ θ ≤ π, 0 ≤ ϕ ≤ 2π, 0 ≤ β ≤ 2π we find that

$$\displaystyle \begin{aligned} (x,y,z) = (\sin\theta \cos\phi, \sin\theta \sin\phi, \cos\theta), \end{aligned}$$

the standard parameterization of a unit 2-sphere in a spherical coordinate system. Here θ, ϕ are the polar and azimuthal angles respectively. With this parameterization we find that

$$\displaystyle \begin{aligned} \begin{array}{rcl} | \varPsi \rangle = \exp(i \beta) \left ( \begin{array}{c} \cos\theta/2 \\ \exp(\mathrm{i} \phi) \sin\theta/2 \end{array} \right ). {} \end{array} \end{aligned} $$
(2.17)

Therefore, the state of a qubit (disregarding an overall phase factor, \( \exp (i \beta )\)) is represented by a point on the surface of the Bloch sphere. The point θ = 0, located on the “north pole” of the Bloch sphere, identifies the computational basis vector |0〉, whereas the |1〉 vector is described by the point located on the “south pole” in which θ = π (Fig. 2.1).
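As a quick check, the Cartesian Bloch coordinates can be computed directly from the amplitudes c1, c2; the sketch below is equivalent to the Hopf map (2.16) with c1 = x0 + i x1, c2 = x2 + i x3, and blochXYZ is simply a made-up helper name.

```mathematica
(* Bloch-sphere coordinates of a normalized qubit amplitude {c1, c2} *)
blochXYZ[{c1_, c2_}] := {2 Re[Conjugate[c1] c2],
                         2 Im[Conjugate[c1] c2],
                         Abs[c1]^2 - Abs[c2]^2};

blochXYZ[{1, 0}]            (* |0>  -> north pole {0, 0, 1} *)
blochXYZ[{1, 1}/Sqrt[2]]    (* |u>  -> {1, 0, 0} *)
Simplify[blochXYZ[{Cos[t/2], Exp[I p] Sin[t/2]}],
         Assumptions -> Element[{t, p}, Reals]]
(* reduces to {Sin[t] Cos[p], Sin[t] Sin[p], Cos[t]}, the spherical form *)
```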

Fig. 2.1

The Bloch sphere. The arrow points to the location on the sphere of the state \(\ensuremath { \left |\psi \right \rangle }\) given by (2.17). The points on the north and south poles of the sphere denote the locations of the computational basis states

Mathematica Notebook 2.2: Visualizing qubits on the Bloch sphere surface. http://www.physics.unlv.edu/%7Ebernard/MATH_book/Chap2/chap2_link.html

2.2 The Pauli Matrices

In the previous chapter, I introduced a physical model for a qubit that I called a qbulb. In analogy with a light bulb, the qbulb is an atom in either an off or on state. Those states are represented by the kets |0〉, |1〉 or their corresponding matrix representations (2.3). They are eigenstates of the operator \( {\mathbf {n}} \equiv \sum _{j=0}^{1} j\, | j \rangle \langle j | \), which, when expressed in matrix form, is

$$\displaystyle \begin{aligned} \begin{array}{rcl} {\mathbf{n}} = \left ( \begin{array}{cc} 0 & 0 \\ 0 & 1 \end{array} \right ). {} \end{array} \end{aligned} $$
(2.18)

n represents a measuring device whose outputs are its eigenvalues 0, 1, indicating whether the atom, or qbulb, is in the on or off position. As required by the foundational postulates, n must be a self-adjoint, or Hermitian, operator. In the previous chapter, we introduced the † operation, so that if X = |Φ〉〈Ψ|, then X† = |Ψ〉〈Φ|. If X is given by its matrix representation, the † operation on it is equivalent to taking the complex conjugate of each element of its transpose. That is, given matrix X with entries Xmn for the m’th row and n’th column, the matrix X† has as its corresponding entries \( X_{nm}^{*}\). Using this prescription it is obvious that

$$\displaystyle \begin{aligned} \begin{array}{rcl} {\mathbf{n}}^{\dag} = \left ( \begin{array}{cc} 0 & 0 \\ 0 & 1 \end{array} \right )^{\dag} = {\mathbf{n}}. {} \end{array} \end{aligned} $$
(2.19)
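A short sketch of the † rule in matrix form, assembling the operator from outer products exactly as in its definition:

```mathematica
(* The qbulb operator n = 0 |0><0| + 1 |1><1| built from outer products *)
ket0 = {{1}, {0}};  ket1 = {{0}, {1}};
n = 0 ket0 . ConjugateTranspose[ket0] + 1 ket1 . ConjugateTranspose[ket1];

MatrixForm[n]                  (* {{0, 0}, {0, 1}}, cf. (2.18) *)
ConjugateTranspose[n] === n    (* True: n is self-adjoint, cf. (2.19) *)
```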

The basis states (2.3), as they are eigenstates of the “on-off” measurement operator n, play a special role as the computational basis. In keeping with convention, we agree that all matrix representations, for both vectors and operators, are taken with respect to the computational basis.

Let’s define the linear combinations

$$\displaystyle \begin{aligned} \begin{array}{rcl} &\displaystyle &\displaystyle | u \rangle \equiv \frac{1}{\sqrt{2}} | 0 \rangle + \, \frac{1}{\sqrt{2}} | 1 \rangle \\ &\displaystyle &\displaystyle | v \rangle \equiv \frac{1}{\sqrt{2}} | 0 \rangle - \, \frac{1}{\sqrt{2}} | 1 \rangle \vspace{-4pt} \end{array} \end{aligned} $$

whose matrix representations are

$$\displaystyle \begin{aligned} \begin{array}{rcl} && | u \rangle = \frac{1}{\sqrt{2}} \left ( \begin{array}{r} 1 \\ \, 1 \end{array} \right ) \\ && | v \rangle = \frac{1}{\sqrt{2}} \left ( \begin{array}{r} 1 \\ -1 \end{array} \right ). {}\vspace{-4pt} \end{array} \end{aligned} $$
(2.20)

Because

$$\displaystyle \begin{aligned} \begin{array}{rcl} c_{1} | u \rangle + c_{2} | v \rangle = \frac{1}{\sqrt{2}} \, \left ( \begin{array}{c} c_{1}+ c_{2} \\ c_{1} - c_{2} \end{array} \right ) = \left ( \begin{array}{c} 0 \\ 0 \end{array} \right ) \vspace{-4pt} \end{array} \end{aligned} $$

only if c1 = c2 = 0, |u〉, |v〉 are linearly independent. Furthermore, since 〈u|v〉 = 0, 〈u|u〉 = 〈v|v〉 = 1, they constitute an alternative basis for a qubit. Let’s define the operator

$$\displaystyle \begin{aligned} \begin{array}{rcl} \boldsymbol{\sigma}_{X} \equiv && \quad |u \rangle \langle u | \quad - \quad |v \rangle \langle v | \quad = \\ \\ && \frac{1}{2} \left( \begin{array}{ll} 1 & 1 \\ 1 & 1 \\ \end{array} \right) - \frac{1}{2} \left( \begin{array}{rr} 1 & -1 \\ -1 & 1 \\ \end{array} \right) = \left (\begin{array}{ll} 0 & 1 \\ 1 & 0 \\ \end{array} \right) {} \end{array} \end{aligned} $$
(2.21)

where the second line expresses the outer products by their matrix representations. Matrix σX is called the Pauli σX-matrix. There are two other Pauli matrices,

$$\displaystyle \begin{aligned} \begin{array}{rcl} \sigma_{ Y} \equiv \left( \begin{array}{rr} 0 & -i \\ i & 0 \\ \end{array} \right) \quad \sigma_{ Z} \equiv \left( \begin{array}{rr} 1 & 0 \\ 0 & -1 \\ \end{array} \right). {} \end{array} \end{aligned} $$
(2.22)
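Mathematica has the Pauli matrices built in as PauliMatrix[1], PauliMatrix[2], PauliMatrix[3]; a minimal check of the construction (2.21) might read

```mathematica
{sx, sy, sz} = PauliMatrix /@ {1, 2, 3};

(* Check the construction (2.21): sigmaX = |u><u| - |v><v| *)
u = {{1}, {1}}/Sqrt[2];  v = {{1}, {-1}}/Sqrt[2];
sx === u . ConjugateTranspose[u] - v . ConjugateTranspose[v]   (* True *)
```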

All three Pauli matrices play an important role in generating operations in the Hilbert space of a qubit. Since physical measurement devices are represented by self-adjoint, or Hermitian, operators or matrices, it is natural to ask: what is the most general self-adjoint 2 × 2 matrix? A little thought suggests

$$\displaystyle \begin{aligned} \begin{array}{rcl} \left( \begin{array}{cc} a & b- i\, c \\ b + i\, c & d \\ \end{array} \right) {} \end{array} \end{aligned} $$
(2.23)

where a, b, c, d are arbitrary real numbers. We re-write this self-adjoint matrix in the form

$$\displaystyle \begin{aligned} \begin{array}{rcl} \left( \begin{array}{cc} a & b- i\, c \\ b + i\, c & d \\ \end{array} \right) = b \, \boldsymbol{\sigma}_{X} + c \, \boldsymbol{\sigma}_{Y} + \alpha \, \boldsymbol{\sigma}_{Z} + \beta \, \mathbb{1} {} \end{array} \end{aligned} $$
(2.24)

where α = (a − d)∕2, β = (a + d)∕2 and \( \mathbb{1} \) is the 2 × 2 identity matrix. Thus, an arbitrary 2 × 2 Hermitian matrix can be represented by a linear combination of the three Pauli matrices and the identity matrix. The four matrices form a basis for the linear vector space of all 2 × 2 Hermitian matrices. Pauli matrices also possess interesting algebraic properties. Evaluating the following matrix products

$$\displaystyle \begin{aligned} \begin{array}{rcl} \boldsymbol{\sigma}_{X} \, \boldsymbol{\sigma}_{Y} - \boldsymbol{\sigma}_{Y} \, \boldsymbol{\sigma}_{X} = 2 i \boldsymbol{\sigma}_{Z} \\ \boldsymbol{\sigma}_{Y} \, \boldsymbol{\sigma}_{Z} - \boldsymbol{\sigma}_{Z} \, \boldsymbol{\sigma}_{Y} = 2 i \boldsymbol{\sigma}_{X} \\ \boldsymbol{\sigma}_{Z} \, \boldsymbol{\sigma}_{X} - \boldsymbol{\sigma}_{X} \, \boldsymbol{\sigma}_{Z} = 2 i \boldsymbol{\sigma}_{Y} {} \end{array} \end{aligned} $$
(2.25)

we notice that the three Pauli matrices are closed under the bracket operation [A, B] ≡ A B − B A. In other words, given matrices A and B that are linear combinations of Pauli matrices, the binary operation [A, B] always produces another linear combination of Pauli matrices. Because matrix multiplication is non-commutative, i.e., AB is not necessarily equal to BA, the bracket operation is non-trivial. It is useful to introduce the shorthand notation

$$\displaystyle \begin{aligned} \begin{array}{rcl} [ \boldsymbol{\sigma}_{i},\boldsymbol{\sigma}_{j} ] = 2 i \sum_{k} \, \varepsilon_{ijk} \boldsymbol{\sigma}_{k} {} \end{array} \end{aligned} $$
(2.26)

where the subscripts i = 1, 2, 3 denote X, Y, Z respectively, and εijk is called the Levi-Civita symbol. It has the property that ε123 = ε312 = ε231 = 1, ε213 = ε321 = ε132 = −1 and εijk = 0 if any two of the subscripts have identical values. In addition to providing a basis for all 2 × 2 Hermitian matrices, the Pauli matrices serve as generators of all 2 × 2 unitary matrices. According to the discussion in the previous chapter, a unitary operator U has the property \( {\mathbf{U}} {\mathbf{U}}^{\dag} = {\mathbf{U}}^{\dag} {\mathbf{U}} = \mathbb{1} \). Because operators in the qubit Hilbert space are represented by 2 × 2 matrices, and a general 2 × 2 unitary matrix is parameterized by four real parameters, it is convenient to express a general 2 × 2 unitary matrix as

$$\displaystyle \begin{aligned} \begin{array}{rcl} {\mathbf{U}} = \exp(i \gamma) \left( \begin{array}{rr} \exp(i \phi) \cos\theta & \exp(i \beta) \sin\theta \\ -\exp(-i\beta) \sin\theta & \quad \exp(-i \phi) \cos\theta \end{array} \right ), {} \end{array} \end{aligned} $$
(2.27)

where γ, ϕ, β, θ are real numbers. To see how U is related to the Pauli matrices we first need to understand how exponentiation of a matrix is defined. We learned in calculus that exponentiation of a number α can be defined by its infinite power series representation, i.e.

$$\displaystyle \begin{aligned} \exp(\alpha) =1 + \alpha + \frac{\alpha^{2}}{2!} + \frac{\alpha^{3}}{3!} + \dots \end{aligned}$$

So we define the exponentiation of a 2 × 2 matrix A by the expression

$$\displaystyle \begin{aligned} \exp({\mathbf{A}}) = \mathbb{1} + {\mathbf{A}} + \frac{{\mathbf{A}}^{2}}{2!} + \frac{{\mathbf{A}}^{3}}{3!} + \dots {} \end{aligned} $$
(2.28)

Given A = A†, we construct the following operator

$$\displaystyle \begin{aligned} {\mathbf{U}}_{A} \equiv \exp(i {\mathbf{A}}) = \mathbb{1} + i\, {\mathbf{A}} + \frac{(i {\mathbf{A}})^{2}}{2!} + \frac{(i {\mathbf{A}})^{3}}{3!} + \dots \end{aligned}$$

Taking the conjugate transpose of the r.h.s. of this expression, we find that \( {\mathbf {U}}_{A}^{\dag } \equiv \exp (-i {\mathbf {A}}) =(\exp (i {\mathbf {A}}))^{\dag }\). Evaluating \({\mathbf {U}}_{A} {\mathbf {U}}^{\dag }_{A}= \exp (i {\mathbf {A}}) \exp (-i {\mathbf {A}}) \) by multiplying and collecting like terms of the series representation, we find

$$\displaystyle \begin{aligned} \begin{array}{rcl} {\mathbf{U}}_{A} \, {\mathbf{U}}_{A}^{\dag} = \exp(i {\mathbf{A}}) \exp(-i {\mathbf{A}}) = \mathbb{1}. {} \end{array} \end{aligned} $$
(2.29)

Therefore, for any Hermitian operator A, UA is unitary. Consider

$$\displaystyle \begin{aligned} {\mathbf{U}}_{X}(\alpha) \equiv \exp(i \alpha {\boldsymbol{\sigma}}_{X}) = \mathbb{1} + i \alpha\, {\boldsymbol{\sigma}}_{X} + \frac{(i \alpha\, {\boldsymbol{\sigma}}_{X})^{2}}{2!} + \frac{(i \alpha\, {\boldsymbol{\sigma}}_{X})^{3}}{3!} + \dots \end{aligned}$$

which is guaranteed to be unitary. Using the fact that \( {\boldsymbol{\sigma}}_{X}^{2} = \mathbb{1} \), we collect the even and odd powers of the series and simplify this expression so that

$$\displaystyle \begin{aligned} \begin{array}{rcl} {\mathbf{U}}_{X}(\alpha) = \cos\alpha \, \mathbb{1} + i \sin\alpha \, {\boldsymbol{\sigma}}_{X} {} \end{array} \end{aligned} $$
(2.30)

where we have replaced the power series in α by their trigonometric representations. In explicit matrix form

$$\displaystyle \begin{aligned} \begin{array}{rcl} {\mathbf{U}}_{X}(\alpha) = \exp(i \alpha {\boldsymbol{\sigma}}_{X}) = \left ( \begin{array}{cc} \cos\alpha & i \sin\alpha \\ i \sin\alpha & \cos\alpha \end{array} \right ). {} \end{array} \end{aligned} $$
(2.31)

In the same manner we find that

$$\displaystyle \begin{aligned} \begin{array}{rcl} && {\mathbf{U}}_{Y}(\alpha) \equiv \exp(i \alpha {\boldsymbol{\sigma}}_{Y}) = \left ( \begin{array}{rr} \cos\alpha & \, \sin\alpha \\ -\sin\alpha & \cos\alpha \end{array} \right ) \\ && {\mathbf{U}}_{Z}(\alpha) \equiv \exp(i \alpha {\boldsymbol{\sigma}}_{Z}) = \left ( \begin{array}{cc} \exp(i \alpha) & 0 \\ 0 & \exp(-i \alpha) \end{array} \right ). {} \end{array} \end{aligned} $$
(2.32)

Evaluating the matrix product

$$\displaystyle \begin{aligned} \begin{array}{rcl} && {\mathbf{U}}_{Z}(\frac{\phi+\beta}{2}) {\mathbf{U}}_{Y}(\theta) {\mathbf{U}}_{Z}(\frac{\phi-\beta}{2}) = \\ && \left( \begin{array}{rr} \exp(i \phi) \cos\theta & \exp(i \beta) \sin\theta \\ -\exp(-i\beta) \sin\theta & \quad \exp(-i \phi) \cos\theta \end{array} \right ), {} \end{array} \end{aligned} $$
(2.33)

we find that unitary operator (2.27) can be expressed as a product of unitary operators whose generators are Pauli matrices and a scalar phase operation \(\exp (i\gamma )\). This fact comes in handy in later chapters where we show how quantum gate operations are carried out by unitary operators.
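A minimal symbolic check of (2.25), (2.31) and (2.33), assuming Mathematica’s built-in MatrixExp for the matrix exponential; Ux, Uy, Uz, U are made-up names for this sketch.

```mathematica
{sx, sy, sz} = PauliMatrix /@ {1, 2, 3};

(* Commutation relation (2.25): [sigmaX, sigmaY] = 2 I sigmaZ *)
sx . sy - sy . sx === 2 I sz                                    (* True *)

(* Matrix exponentials (2.31), (2.32) via the built-in MatrixExp *)
Ux[a_] := MatrixExp[I a sx];  Uy[a_] := MatrixExp[I a sy];  Uz[a_] := MatrixExp[I a sz];
FullSimplify[Ux[a] - {{Cos[a], I Sin[a]}, {I Sin[a], Cos[a]}}]  (* the zero matrix *)

(* ZYZ decomposition (2.33) of the general unitary (2.27), up to exp(I gamma) *)
U = {{Exp[I f] Cos[t], Exp[I b] Sin[t]}, {-Exp[-I b] Sin[t], Exp[-I f] Cos[t]}};
FullSimplify[Uz[(f + b)/2] . Uy[t] . Uz[(f - b)/2] - U]         (* the zero matrix *)
```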

According to Postulate III, measurement devices are associated with Hermitian operators. Operator n, which measures a qubit’s occupancy number (1 or 0), is such a matrix, but is the converse true? Apparently, there exists an infinite set of Hermitian matrices in this Hilbert space. The Pauli matrices, or any real linear combination of them, are Hermitian, so do they represent possible measurement devices? If so, what do they measure? Consider the operator σX; as it is Hermitian, let’s assume that it is associated with some measurement device.

Postulate III demands that a measurement with the device results in one of the eigenvalues of σX. In order to find the eigenvectors and eigenvalues of σX we need to find solutions of

$$\displaystyle \begin{aligned} \begin{array}{rcl} {\boldsymbol{\sigma}}_{X} |\lambda \rangle = \lambda | \lambda \rangle \end{array} \end{aligned} $$

where |λ〉 is the eigenvector associated with eigenvalue λ. The matrix form of this equation is

$$\displaystyle \begin{aligned} \begin{array}{rcl} \left ( \begin{array}{cc} 0 &\displaystyle 1 \\ 1 &\displaystyle 0 \end{array} \right ) \left ( \begin{array}{c} c_{1} \\ c_{2} \end{array} \right ) = \lambda \, \left ( \begin{array}{c} c_{1} \\ c_{2} \end{array} \right ) \end{array} \end{aligned} $$

where we expressed the vector |λ〉 as a linear combination of the computational basis vectors |0〉, |1〉. Collecting terms we get

$$\displaystyle \begin{aligned} \begin{array}{rcl} &\displaystyle &\displaystyle \lambda \, c_{1} - c_{2} = 0 \\ &\displaystyle &\displaystyle c_{1} - \lambda \, c_{2} = 0. {} \end{array} \end{aligned} $$
(2.34)

Using one of these equations to express c1 in terms of c2, and inserting that relation into the other equation, one finds that, for non-trivial solutions of (2.34), the condition

$$\displaystyle \begin{aligned} \begin{array}{rcl} \lambda^{2} - 1 = 0 {} \end{array} \end{aligned} $$
(2.35)

must be satisfied. Thus λ = ±1, and inserting these eigenvalues into (2.34) we express c1 in terms of c2. For example, with λ = 1 we get c1 − c2 = 0 or c1 = c2. We still have two free parameters because c2 is a complex number, but the normalization condition 〈λ|λ〉 = 1 fixes \(c_{1}=c_{2} = \exp (i\gamma )/\sqrt {2} \), where γ is an arbitrary real number. Thus

$$\displaystyle \begin{aligned} \begin{array}{rcl} && | \lambda =1 \rangle = \frac{\exp(i \gamma)}{\sqrt{2}} \left ( \begin{array}{r} 1 \\ 1 \end{array} \right ) = \exp(i\gamma) | u \rangle \\ && | \lambda =-1 \rangle = \frac{\exp(i \beta)}{\sqrt{2}} \left ( \begin{array}{r} \quad 1 \\ -1 \end{array} \right ) = \exp(i\beta) | v \rangle {} \end{array} \end{aligned} $$
(2.36)

are eigenvectors of σX. γ, β are arbitrary phase factors which by convention we set to zero. For λ = 1 the state is parameterized by angles ϕ = 0, θ = π∕2 on the Bloch sphere, and ϕ = π, θ = π∕2 for λ = −1.
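A quick check with the built-in Eigensystem reproduces these eigenvalues and eigenvectors, up to normalization and overall phases:

```mathematica
{vals, vecs} = Eigensystem[PauliMatrix[1]];
vals                 (* the eigenvalues +1 and -1, cf. (2.35) *)
Normalize /@ vecs    (* up to overall phases, the kets |u> and |v> of (2.20), cf. (2.36) *)
```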

Suppose our qubit system is in state |Ψ〉. A measurement on this system with device n tells us whether the qubit is in the on or off state, but what does a measurement with σX tell us? To gain insight and answer that question, we first consider a somewhat different physical system, that of a classical electromagnetic wave.

Mathematica Notebook 2.3: Pauli matrices as unitary gate generators. http://www.physics.unlv.edu/%7Ebernard/MATH_book/Chap2/chap2_link.html

2.3 Polarization of Light: A Classical Qubit

The publication in 1865 of A Dynamical Theory of the Electromagnetic Field by the theoretical physicist James Clerk Maxwell proved to be a watershed. It sparked both the electric and communication revolutions, and it is inconceivable to imagine the modern world without its transformational insights. Maxwell’s synthesis of electricity and magnetism catalyzed the discovery of electromagnetic waves, of which optical phenomena, X-rays, microwaves, etc. are instances. Maxwell’s theory predicts that light is a manifestation of electric and magnetic fields that vary in space and time in a specific way.

For a light beam that is coming out of this page, Maxwell’s equations predict an electric field that behaves in the following manner

$$\displaystyle \begin{aligned} \begin{array}{rcl} {\vec E}(t) = E_{0} \exp(i \delta_{0} ) \left ( \cos\theta \, { {{\hat{\mathbf{i}}}} } + \sin\theta \exp(i \delta ) \, { {\hat{\mathbf{j}}}} \, \right ) \exp(i \omega t) {} \end{array} \end{aligned} $$
(2.37)

where the real part of complex function \({\vec E}(t) \) represents the electric field at any point x, y in the plane of the page at time t. \({\hat {\mathbf {i}}}, {\hat {\mathbf {j}}} \) are the orthogonal unit vectors along the x, and y directions respectively, δ0, δ, θ, E0 are real numbers, and ω is an angular frequency. The magnetic field is perpendicular, at each point in the x, y plane, to the \({\vec E}\) field. Knowledge of E0, δ0, δ, θ, ω provides a complete description of the wave. Instead of using the explicit vector description of (2.37), it is convenient to introduce the Jones vector

$$\displaystyle \begin{aligned} \begin{array}{rcl} \left ( \begin{array}{c} \cos\theta \\ \exp(i\delta) \sin\theta \end{array} \right ). {} \end{array} \end{aligned} $$
(2.38)

The first entry in the column matrix is, up to an overall constant and the factor \(\exp (i \omega t)\), the x-component of (2.37) and the second entry the y-component. For the value θ = 0 the Jones vector is

$$\displaystyle \begin{aligned} \begin{array}{rcl} \left ( \begin{array}{c} 1 \\ 0 \end{array} \right ) = | 0 \rangle {} \end{array} \end{aligned} $$
(2.39)

and taking the real part of (2.37) with this Jones vector, and setting δ = 0, the electric field in the x, y plane is given by

$$\displaystyle \begin{aligned} \begin{array}{rcl} E_{0} \, { {{\hat{\mathbf{i}}}}} \, \cos{}(\omega \, t + \delta_{0}). {} \end{array} \end{aligned} $$
(2.40)

It describes a vector oscillating along the x-coordinate axis and is called plane polarized light. In shorthand, we call it H type light. For θ = π∕2, δ = 0 the Jones vector is

$$\displaystyle \begin{aligned} \begin{array}{rcl} \left ( \begin{array}{c} 0 \\ 1 \end{array} \right ) = | 1 \rangle {} \end{array} \end{aligned} $$
(2.41)

and the electric field is plane polarized along the y-axis, or V  type light. Inserting the values \( \theta = \frac {\pi }{ 4} \) and \( \delta =\frac {\pi }{2} \) into (2.38), the Jones vector becomes

$$\displaystyle \begin{aligned} \begin{array}{rcl} \frac{1}{\sqrt{2} } \left ( \begin{array}{r} 1 \\ i \end{array} \right ) = \frac{1}{\sqrt{2}} \left ( | 0 \rangle + i \, | 1 \, \rangle \right ). {} \end{array} \end{aligned} $$
(2.42)

It is a linear combination, containing complex coefficients, of the Jones vectors that describe linear polarization along the horizontal and vertical directions (x, y axes). With this Jones vector the real part of (2.37), the electric field, is

$$\displaystyle \begin{aligned} \begin{array}{rcl} \frac{E_{0}}{\sqrt{2}} \, \left ( \cos{}(\omega \, t + \delta_{0} ) { {{\hat{\mathbf{i}}}} } - \sin{}(\omega \, t + \delta_{0} ) { {{\hat{\mathbf{j}}}} } \right ). {} \end{array} \end{aligned} $$
(2.43)

Plotting (2.43) on the x, y plane as a function of time, one finds that it describes a vector rotating with angular frequency ω in the clockwise direction about a circle of radius \(E_{0}/\sqrt {2}\). The latter is called left circularly, or L, polarized light. The Jones vector

$$\displaystyle \begin{aligned} \begin{array}{rcl} \frac{1}{\sqrt{2} } \left ( \begin{array}{r} 1 \\ -i \end{array} \right ) = \frac{1}{\sqrt{2}} \left ( | 0 \rangle - i \, | 1 \rangle \right ) {} \end{array} \end{aligned} $$
(2.44)

describes similar time behavior except that \({\vec E}(t)\) rotates in a counter-clockwise manner and is called right, or R, circular polarized light. The Jones vectors (2.39) and (2.41) are eigenstates of n, whereas the left and right circular polarized Jones vectors (2.42), (2.44), are eigenstates of σY. Monochromatic light can be manipulated by optical instruments as shown in Fig. 2.2. In that figure, plane polarized light along the \( {\hat {\mathbf {j}}} \) axis is incident on a circular polarization filter. If that beam were interrupted by a polarization filter that allows only \( { \hat {\mathbf {i}}} \) plane polarized light through, we would find no output, as the input was 100% polarized along the \( { \hat {\mathbf {j}}} \) axis. Instead it enters a circular polarization filter (disk object) which outputs (L-type) circular polarized light. Because the output of the left-circular polarizer is a linear combination of the |0〉, |1〉 states, a plane horizontal ( \( { \hat {\mathbf {i}} } \) ) polarization filter (rectangular object) does allow passage of that component.
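The filter sequence of Fig. 2.2 can be mimicked with projectors acting on Jones vectors; in this sketch, pL and pH are made-up names for the projectors onto L-type and H-type light.

```mathematica
jonesV = {0, 1};                  (* V-type plane polarized light, (2.41) *)
jonesL = {1, I}/Sqrt[2];          (* L-type circular polarization, (2.42) *)

pL = Outer[Times, jonesL, Conjugate[jonesL]];   (* projector |L><L| onto L light *)
pH = {{1, 0}, {0, 0}};                          (* projector |0><0| onto H light *)

pH . jonesV          (* {0, 0}: V light alone cannot pass the H filter *)
pH . (pL . jonesV)   (* nonzero: after the circular polarizer, some light passes *)
```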

Fig. 2.2

Polarization states of classical light. V-type plane polarized light enters a polarization filter (disk) which outputs either R- or L-type circular polarized light. The rectangular filter allows only plane H-type light to pass. The arrows denote the magnitude and direction of the electric field

With polarization filters and selectors we possess devices that filter and select a particular polarization state in an incident beam. Suppose we construct a black box, called N, that contains two indicator lights labeled H and V. This box filters and detects one of the two plane polarized components, H or V. Another box, called P, detects R- or L-type light.

Mathematica Notebook 2.4: Visualizing polarization of light. http://www.physics.unlv.edu/%7Ebernard/MATH_book/Chap2/chap2_link.html

2.3.1 A Qubit Parable

Let me entertain the following imaginary scenario. We provide the aforementioned boxes to researchers in some distant world who have no knowledge of Maxwell’s equations and have no other means, except the use of these boxes, to study light phenomena. Passing a (polarized) light beam through two N boxes connected in series, the researchers observe either HH or VV indicator light configurations. They never see the indicator light combinations HV or VH. On a single box, they always find one indicator light on, but never both. A reasonable conclusion from those results is that H, V are intrinsic, independent properties of light. Light appears to consist of V or H type, but never a combination of the two. This binary choice leads the researchers to employ ket notation, i.e., |H〉, |V〉, to formalize a theory of light. So, in this theory, when the H indicator is on, the device detects |H〉 type light, and |V〉 light if the other indicator is on. Analogous experiments with the P box reveal similar behavior and the researchers define two new states of light, |R〉, |L〉, that correspond to those indicator settings. These two pairs of states appear to be mutually exclusive until one researcher performs an experiment in which a beam passes through the box combination N P N. In some runs, the researchers observe indicator configurations H R V in the corresponding boxes as the beam proceeds from left to right. That is, the incident beam leaves the first N box in the H state, but after passing through the P box, the second N box detects the presence of V light. These data force the researchers to conclude that the P instrument, which measures the R, L properties of light, somehow affects the H, V properties of light. From these results, the alien scientists posit a qubit interpretation and invoke hypotheses similar to those introduced in the previous chapter. The rationale for the theory is that it is consistent with all experiments performed with the pair of measurement devices. Importantly, the scientists conclude that one type of light, e.g. |R〉, is a combination of |H〉, |V〉 light, a phenomenon they call superposition. They use matrix language to invoke the isomorphism

$$\displaystyle \begin{aligned} | { H} \rangle \Rightarrow \left ( \begin{array}{c} 1 \\ 0 \end{array} \right ) \quad | { V} \rangle \Rightarrow \left ( \begin{array}{c} 0 \\ 1 \end{array} \right ), \end{aligned}$$

and realize that these vectors are eigenstates, with eigenvalues 0, 1, of operator

$$\displaystyle \begin{aligned} \left ( \begin{array}{cc} 0 & 0 \\ 0 & 1 \end{array} \right ) \end{aligned}$$

which is the matrix representation for their N instrument. Its eigenstates are given special status as the computational basis of the Hilbert space. The box P should also be represented by a Hermitian matrix. If the eigenvalues of P are 1, 0, corresponding to the eigenstates |R〉, |L〉 respectively, then the general form for the operator representing box P is

$$\displaystyle \begin{aligned} \begin{array}{rcl} {\mathbf{P}} = 1 | R \rangle \langle R | + 0 | L \rangle \langle L | = | R \rangle \langle R |. \end{array} \end{aligned} $$

Because |R〉, |L〉 are, presumably, linear combinations of the computational basis, |R〉 = c1|0〉 + c2|1〉 and so

$$\displaystyle \begin{aligned} \begin{array}{rcl} {\mathbf{P}} = c_{1}^{*} c_{1} | 0 \rangle \langle 0 | + c_{1}^{*} c_{2} | 1 \rangle \langle 0 | + c_{2}^{*} c_{1} | 0 \rangle \langle 1 | + c_{2}^{*} c_{2} | 1 \rangle \langle 1 | = \left ( \begin{array}{cc} c_{1}^{*} c_{1} &\displaystyle c_{2}^{*} c_{1} \\ c_{1}^{*} c_{2} &\displaystyle c_{2}^{*} c_{2} \end{array} \right ). \end{array} \end{aligned} $$

The complex constants c1, c2 are determined by carrying out the series of experiments described above. Suppose experiments reveal that \( c_{1}=1/\sqrt {2}, c_{2}=i /\sqrt {2} \), then

$$\displaystyle \begin{aligned} \begin{array}{rcl} {\mathbf{P}} = \frac{1}{2} \left ( \begin{array}{cc} 1 & -i \\ i & 1 \end{array} \right ) = \frac{1}{2} \Big ( \mathbb{1} + \boldsymbol{\sigma}_{Y} \Big ). \end{array} \end{aligned} $$

The presence of the unit operator \( \mathbb{1} \) does not affect the eigenstates (it only shifts the eigenvalues), since \( \mathbb{1} | \varPsi \rangle = | \varPsi \rangle \) for any ket, and so we could associate operator P with one of the Pauli matrices, σY.

This exercise forces us to reconsider our simplistic qbulb analogy for a qubit. Though it explains the occupation number measurements of a qubit, as represented by the N operator, the Maxwell qubit also possesses the P property. It appears that a qubit is more complex than that described by the qbulb model. To provide a more realistic model of the qubit we proceed to discuss a purely quantum mechanical phenomenon called spin.

Before discussing spin, I make a final comment on the qubit interpretation of light. Equation (2.37) offers a complete description of monochromatic electromagnetic waves without the necessity of a qubit interpretation. The alien scientists were forced to employ a Hilbert space interpretation of measurements because we gave them a limited set of tools, the N and P measurement devices. Those tools provided access only to a coarse-grained version of Maxwell’s theory. If our friends had access to Maxwell’s treatise, the qubit interpretation of light would be unnecessary. With the Maxwell theory, the aliens would be able to figure out the underlying mechanism responsible for the N and P outputs.

That is not the complete story. Maxwell’s classical theory has been supplanted by a quantum version, called Quantum Electrodynamics or QED for short. Its development in the mid-twentieth century is one of the great achievements of quantum field theory. In QED, monochromatic light is described as an excitation of a quantum field, which in many ways has particle properties. This excitation is called a photon, and it is a physical realization of a qubit as it has properties similar to those elaborated in our parable. Despite the lesson of our parable, that a classical system can exhibit qubit-like behavior, there exists no classical analog for a property shared by a pair of photons, a phenomenon called entanglement. The latter plays an important role in quantum information theory and takes center stage in our subsequent discussions.

2.4 Spin

It’s time to heed our recommendation and jettison the simple qbulb model of the qubit. Instead, we need to identify a physical system that, along with the photon, exhibits all features of the qubit paradigm. The electron serves such a purpose. It was discovered in the late nineteenth century, and to the best of our knowledge, every electron in the universe shares identical values of electric charge and mass. Unlike protons and neutrons, electrons appear to be fundamental and point-like. Despite that fact, electrons have a rich internal structure called spin.

Evidence of spin was discovered in experiments in which a beam of neutral atoms, each containing a single valence electron, is guided through a Stern-Gerlach device (SG). In a typical experimental set-up, schematically illustrated in Fig. 2.3, atoms traverse a region where an inhomogeneous magnetic field deflects them. As they exit the device, they impinge on a detection screen that is used to analyze the deviation from their initial trajectory. In classical physics, the deflection force is proportional to the spatial gradient of \( {\vec m} \cdot {\vec B}\), where \( {\vec B} \) is the magnetic field vector and \( {\vec m}\) is a magnetic moment. Because of the observed deflection, physicists hypothesized that the electron possesses an intrinsic magnetic moment. Magnetic moments commonly arise when electric charges form current loops. But the electron is a point particle, and so attribution of an electron magnetic moment to this mechanism is problematic. In addition to the empirical evidence, Paul Dirac provided a convincing theoretical argument for the existence of spin (more precisely, spin 1/2) in his effort to reconcile QM with the theory of special relativity. Dirac showed how the electron’s intrinsic magnetic moment is proportional to its spin property.

Fig. 2.3

Stern-Gerlach device bifurcates a single stream of atoms, due to their intrinsic spin, into two components. The atoms that segregate into the upper stream are said to be in state |0〉, while those deflected into the lower path are in the |1〉 state

In a classical description \({\vec m}\) is distributed over all directions in space, and so in a set-up similar to that shown in Fig. 2.3, the deflection force engenders a continuous spectrum of paths. The SG experiments showed that the atoms are not deflected in this way. Instead, one observes the behavior illustrated in Fig. 2.3. After passing through the SG device, the atoms tend to segregate into two discrete regions on a detection screen. This binary behavior is reminiscent of our qbulb analogy in which the bulb is either on or off. Similarly, for the photon, it is either in the |H〉 or |V 〉 state. We, therefore, postulate that ket |0〉 is the electron’s internal spin state when the SG device detects the atom in the upper region of the screen, and |1〉 when detected in the lower region. These states are eigenstates of operator n which we now associate with the Stern-Gerlach device pointed along the z-direction. We perform additional measurements by orienting the SG device along different directions. Obvious choices are the x and y directions, and those measurements also reveal binary segregation of atom trajectories. So for the x-instrument the electron’s internal state should also be described by a pair of kets that are eigenstates of a corresponding measurement operator, but which one? Experiments in which a neutral beam is passed through a set of SG devices, with different orientations, led to the quantum mechanical theory of spin discussed below.

2.4.1 Non-commuting Observables and the Uncertainty Principle

Interpreting the results of Stern-Gerlach measurements led physicists to conjecture a new, purely quantum mechanical, property of electrons called spin, or spin-1/2. Spin is represented by three Hermitian operators, corresponding to different orientations of an SG device with respect to axes defined by the beam direction. The operators that correspond to the measurement of electron spin are

$$\displaystyle \begin{aligned} \begin{array}{rcl} &\displaystyle &\displaystyle S_{X} \equiv \frac{\hbar}{2} \, {\boldsymbol{\sigma}}_{X} \\ &\displaystyle &\displaystyle S_{Y} \equiv \frac{\hbar}{2} \, {\boldsymbol{\sigma}}_{Y} \\ &\displaystyle &\displaystyle S_{Z} \equiv \frac{\hbar}{2} \, {\boldsymbol{\sigma}}_{Z}. {}\vspace{-4pt} \end{array} \end{aligned} $$
(2.45)

They are also expressed in vector form \( {\vec S} = S_{X} { {{\hat {\mathbf {i}}}}} + S_{Y} {\hat {\mathbf {j}}} + S_{Z} { \hat {\mathbf {k}}} \) where \( {\hat {\mathbf {i}}}, { \hat {\mathbf {j}}}, {\hat {\mathbf {k}}} \) are the three orthogonal unit vectors of a Cartesian coordinate system. This expression introduces a new dimensionful quantity ħ into our narrative. It is called the reduced Planck constant and has the physical units of angular momentum, \( \hbar = h/2\pi \approx 1.055 \times 10^{-34} \) J s, where \( h = 6.626 \times 10^{-34} \) J s is Planck’s constant. Using the results of the previous discussion, we find that the eigenvalues of each of the components of \( {\vec S}\) have the values ±ħ∕2. A measurement with the SZ SG device produces the binary pattern on the detection screen illustrated in Fig. 2.3. Particles in the top pattern are in the |0〉 state, whereas atoms in the lower branch are in the |1〉 state. Because detection is a form of measurement, Postulate IV requires the system to collapse into those corresponding states. Now

$$\displaystyle \begin{aligned} \begin{array}{rcl} &\displaystyle &\displaystyle S_{Z} | 0 \rangle = \hbar/2 \, | 0 \rangle \\ &\displaystyle &\displaystyle S_{Z} | 1 \rangle = -\hbar/2 \, | 1 \rangle {} \end{array} \end{aligned} $$
(2.46)

and so we conclude that state |0〉 corresponds to a measurement in which SZ has the value ħ∕2, whereas the value −ħ∕2 corresponds to the |1〉 state. Suppose an SZ measurement is performed and we find the system to be in the |0〉 state. Those filtered atoms are then taken as an incident beam for a new SG device that is oriented along the x-axis, i.e. we measure atoms in the |0〉 state with SG device SX. Kets |u〉, |v〉 are eigenstates of SX and, according to (2.20), are linear combinations of |1〉 and |0〉. Using the Born rule (Postulate III), and the fact that the system state

$$\displaystyle \begin{aligned} | 0 \rangle= \frac{1}{\sqrt{2}} \Big ( | u \rangle + | v \rangle \Big ), \end{aligned}$$

we will obtain the eigenvalue ħ∕2 50% of the time, and the eigenvalue −ħ∕2 50% of the time, following an SX measurement. Suppose we find that SX = −ħ∕2; the collapse postulate (Postulate IV) requires that the system “collapses” into state |v〉 following that measurement. Finally, we perform yet another measurement with device SZ. As the system collapsed into state

$$\displaystyle \begin{aligned} | v \rangle = \frac{1}{\sqrt{2}} \Big ( | 0 \rangle - | 1 \rangle \Big ), \end{aligned}$$

there is a 50% probability that a measurement with SG device SZ finds the system in state |1〉, despite the fact that we seemed to have filtered out this state during the first measurement with SZ! In summary, we made three consecutive measurements of an initial beam with devices SZ, SX and then SZ again. We found that \(\ensuremath {\left |\varPsi \right \rangle }\), the qubit Hilbert space amplitude, collapses into state |0〉 following the initial measurement of SZ. Any subsequent measurement with SZ will always find the eigenvalue SZ = ħ∕2, but if we use SX instead, followed by an SZ measurement, there is a 50% chance of obtaining the result SZ = −ħ∕2. This propensity of certain measurements to influence the results of subsequent, independent measurements on a qubit is a common feature of the theory. We will discuss, in later chapters, how it can be exploited to facilitate secure communication channels. To gain a deeper understanding of the mechanism behind this counterintuitive behavior, let’s explore the following scenario.
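The three-measurement sequence can be simulated with a few lines of Mathematica; measure below is a hypothetical helper that applies the Born rule (Postulate III) and the collapse rule (Postulate IV) to a 2 × 2 Hermitian operator, with ħ set to 1.

```mathematica
(* Projective measurement of a Hermitian 2x2 operator on a normalized state:
   returns {eigenvalue, collapsed state}, sampled with Born-rule weights. *)
measure[op_, psi_] := Module[{vals, vecs, p1},
  {vals, vecs} = Eigensystem[op];
  vecs = Normalize /@ vecs;
  p1 = Abs[Conjugate[vecs[[1]]] . psi]^2;          (* Born rule, Postulate III *)
  If[RandomReal[] < p1, {vals[[1]], vecs[[1]]},    (* collapse, Postulate IV *)
    {vals[[2]], vecs[[2]]}]]

sZ = PauliMatrix[3]/2;  sX = PauliMatrix[1]/2;     (* spin operators in units of hbar *)

{m1, s1} = measure[sZ, {1, 0}];   (* the state |0>: always gives +1/2 *)
{m2, s2} = measure[sX, s1];       (* +1/2 or -1/2, each with probability 1/2 *)
{m3, s3} = measure[sZ, s2];       (* again +1/2 or -1/2, each with probability 1/2 *)
{m1, m2, m3}
```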

Imagine a qubit in state |Ψ〉. We perform measurements of it with any of the SG devices SX, SZ, SY. Let’s use SZ where, according to the Born rule, \( | \langle 0 | \varPsi \rangle |^{2} \) is the probability that the measurement results in the value ħ∕2. If we take many such measurements, always with the same |Ψ〉, we can calculate the mean value of all the results obtained. According to probability theory, the mean, or expectation, value \({\overline x} \) for a set of quantities xi, also denoted by < x >, is

$$\displaystyle \begin{aligned} \begin{array}{rcl} {\overline x} \equiv \sum_{i} \, p_{i} \, x_{i} {} \end{array} \end{aligned} $$
(2.47)

where pi is the probability for event xi to occur. For the state |Ψ〉 = |u〉, \( {\overline S}_{Z} = 0\) since \( p_{\hbar/2} = 1/2, \; x_{1} = \hbar/2 \) and \( p_{-\hbar/2} = 1/2, \; x_{2} = -\hbar/2 \). In addition to the mean value, it is also useful to gauge the proclivity for a given measurement result to differ from its mean value. The latter is called the standard deviation, σ, of a measurement. It is defined as

$$\displaystyle \begin{aligned} \begin{array}{rcl} \sigma = \sqrt{\overline { ( x - {\overline x} )^{2} } } = \sqrt{\overline { ( x ^{2} -2 x {\overline x} + {\overline x}^{2} ) } } = \sqrt{\overline {x ^{2}} - {\overline x}^{2}} {} \end{array} \end{aligned} $$
(2.48)

where we used the fact that \( \overline {x \, {\overline x} } = {\overline x}^{2} \). σ measures the average deviation of measurement results from the expectation value.

Consider now the state |Ψ〉 given by (2.17), from which we evaluate \( {\overline S}_{Z} \). Using the Born rule and (2.46) we obtain

$$\displaystyle \begin{aligned} \begin{array}{rcl} &\displaystyle &\displaystyle {\overline S}_{Z} \equiv \frac{\hbar}{2} \, p_{\hbar/2} - \frac{\hbar}{2} \, p_{-\hbar/2} \\ &\displaystyle &\displaystyle p_{\hbar/2} = | \langle 0 | \varPsi \rangle |{}^{2} = \cos^{2}\theta/2 \quad p_{-\hbar/2} = | \langle 1 | \varPsi \rangle |{}^{2} = \sin^{2}\theta/2 . {} \end{array} \end{aligned} $$
(2.49)

Or

$$\displaystyle \begin{aligned} \begin{array}{rcl} {\overline S}_{Z} =\frac{\hbar}{2} \, \cos\theta. {} \end{array} \end{aligned} $$
(2.50)

Since \( p_{i} = | \langle i | \varPsi \rangle |^{2} = \langle \varPsi | i \rangle \langle i | \varPsi \rangle \) for i = 0, 1, we can re-express

$$\displaystyle \begin{aligned} \begin{array}{rcl} &\displaystyle &\displaystyle {\overline S}_{Z} = \frac{\hbar}{2} \, \langle \varPsi | 0 \rangle \langle 0 | \varPsi \rangle - \frac{ \hbar}{2} \, \langle \varPsi | 1 \rangle \langle 1 | \varPsi \rangle = \\ &\displaystyle &\displaystyle \langle \varPsi | S_{z} | 0 \rangle \langle 0 | \varPsi \rangle + \langle \varPsi | S_{z} | 1 \rangle \langle 1 | \varPsi \rangle \end{array} \end{aligned} $$
(2.51)

where, in deriving the second line, we used (2.46). Now

$$\displaystyle \begin{aligned} \begin{array}{rcl} \langle \varPsi | S_{z} | 0 \rangle \langle 0 | \varPsi \rangle + \langle \varPsi | S_{z} | 1 \rangle \langle 1 | \varPsi \rangle = \langle \varPsi | S_{z} \Big ( | 0 \rangle \langle 0 | + | 1 \rangle \langle 1 | \Big ) | \varPsi \rangle \end{array} \end{aligned} $$

and so, using the closure relation \( | 0 \rangle \langle 0 | + | 1 \rangle \langle 1 | = \mathbb{1} \),

$$\displaystyle \begin{aligned} \begin{array}{rcl} {\overline S}_{Z} = \langle \varPsi | S_{Z} | \varPsi \rangle {} \end{array} \end{aligned} $$
(2.52)

which when evaluated is in harmony with the result given by (2.50). Relation (2.52) informs us that the mean value \( {\overline S}_{Z}\) is equal to the inner product of |Ψ〉 with state SZ|Ψ〉. It is a general result valid for any Hermitian operator and will come in handy later. Now

$$\displaystyle \begin{aligned} S_{Z} S_{Z} = S_{X} S_{X} = S_{Y} S_{Y} = \frac{\hbar^2}{4} \end{aligned}$$

and so the standard deviation for measurement SZ is

$$\displaystyle \begin{aligned} \begin{array}{rcl} \sigma = \sqrt{\langle \varPsi | \frac{\hbar^2}{4} | \varPsi \rangle - \frac{\hbar^{2}}{4} \, \cos^{2}\theta} = \frac{\hbar}{2} \sin\theta. {} \end{array} \end{aligned} $$
(2.53)

(2.53) is the average spread in values obtained by measurements with SZ, provided that the system is in state |Ψ〉. Because we can also pose this question for measurements with other devices, it is common practice to denote the standard deviation of an operator A with the symbol ΔA. For the SZ instrument

$$\displaystyle \begin{aligned} \varDelta S_{Z} = \frac{\hbar}{2} \sin\theta \quad 0 \leq \theta \leq \pi \end{aligned}$$

and is also called the uncertainty of a measurement. The larger its value, the greater the spread in measured values. The maximum uncertainty for SZ is \( \frac {\hbar }{2} \). This conclusion makes sense since we know that SZ has just two eigenvalues \( \pm \frac {\hbar }{2}\), and so the uncertainty cannot be greater than the range of values obtained by the device. However, ΔSZ can vanish. In that case, there is no uncertainty in the measurement. In other words, if an ensemble of experimenters each had identical copies of |Ψ〉 and if ΔSZ = 0, each measurement from each experiment leads to an identical result. According to (2.53), ΔSZ = 0 when θ = 0 or θ = π. Referring to the Bloch sphere of the previous section, we recognize that the kets |Ψ〉 for θ = 0, π correspond to eigenstates of the operator SZ. Consider the following question: if state |Ψ〉 leads to zero uncertainty in an SG measurement along the Z direction, i.e., ΔSZ = 0, does there exist an SG measurement along a different orientation axis that also leads to null uncertainty? Suppose the SG device is oriented along the direction \( {\hat {\mathbf {n}}} = n_{x} {{\hat {\mathbf {i}}}} + n_{y} {{\hat {\mathbf {j}}}} + n_{z} {{\hat {\mathbf {k}}}} \) where \( n_{x} = \sin \varOmega \cos \chi , n_{y} = \sin \varOmega \sin \chi , n_{z} = \cos \varOmega \) and Ω, χ are the polar and azimuthal angles of the point (nx, ny, nz). Along this direction

$$\displaystyle \begin{aligned} \begin{array}{rcl} S_{n} = {\hat{\mathbf{n}}} \cdot {\vec S} = \frac{\hbar}{2} \Big ( \sin\varOmega \cos\chi {\boldsymbol{\sigma}}_{X} + \sin\varOmega \sin\chi {\boldsymbol{\sigma}}_{Y} + \cos\varOmega {\boldsymbol{\sigma}}_{Z} \Big ) \end{array} \end{aligned} $$

and using (2.17) we find that

$$\displaystyle \begin{aligned} {\overline S}_{n} = \ensuremath{\left\langle\varPsi\right|} S_{n} \ensuremath{\left|\varPsi\right\rangle} = \frac{\hbar}{2} \Big ( \cos\theta \cos \varOmega + \cos{}(\phi-\chi) \sin\theta \sin\varOmega \Big ). \end{aligned}$$

Now

$$\displaystyle \begin{aligned} \langle \varPsi | S_{n} \cdot S_{n} | \varPsi \rangle = \frac{\hbar^{2}}{4} \end{aligned}$$

and so

$$\displaystyle \begin{aligned} \begin{array}{rcl} \varDelta S_{n}^2 = \frac{\hbar^{2}}{4} -{\overline S}_{n}^2 = \frac{\hbar^{2}}{4} \Big ( 1 - ( \cos\theta \cos\varOmega + \cos{}(\phi - \chi) \sin\theta \sin\varOmega )^{2} \Big ). \vspace{-1pc} \end{array} \end{aligned} $$

Mathematica Notebook 2.5: Experimenting with uncertainty. http://www.physics.unlv.edu/%7Ebernard/MATH_book/Chap2/chap2_link.html

For the state in which θ = 0, π, so that ΔSZ = 0,

$$\displaystyle \begin{aligned} \varDelta S_{n} = \frac{\hbar}{2} \sin\varOmega. \end{aligned}$$

Thus ΔSn = 0 only if Ω = 0, π, i.e. Sn must be oriented along the same axis as SZ. Therefore, it is impossible to find an SG device, with orientation different from that of the SZ device, for which the uncertainty ΔSn also vanishes. This result is a consequence of the fact that SZ and Sn, for \( {\hat {\mathbf {n}}}\) not along the Z axis, do not commute, i.e.

$$\displaystyle \begin{aligned} \begin{array}{rcl} S_{Z} S_{n} - S_{n} S_{Z} \neq 0. \vspace{-3pt} \end{array} \end{aligned} $$

Non-commutativity of operators leads to the uncertainty principle, which relates the product of the uncertainties of two measurement operators to an expectation value of their commutator. Given two operators A, B that represent measurement devices,

$$\displaystyle \begin{aligned} \begin{array}{rcl} \varDelta A^2 \, \varDelta B^{2} \ \geq \frac{1}{4} | \langle \varPsi | [ {\mathbf{A}},{\mathbf{B}}] | \varPsi \rangle |{}^{2}. \vspace{-3pt} \end{array} \end{aligned} $$

Proof of this theorem can be found in [1]. If the right-hand side of this inequality does not vanish, the theorem provides a lower bound on the degree of uncertainty for each measurement.
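As an illustration, the sketch below evaluates the uncertainties and the commutator bound for A = SZ, B = SX in the Bloch state (2.17), with β = 0 and ħ set to 1; expval and uncert are made-up helper names.

```mathematica
(* Expectation value and uncertainty of a Hermitian operator in state psi *)
expval[op_, psi_] := Conjugate[psi] . (op . psi);
uncert[op_, psi_] := Sqrt[expval[op . op, psi] - expval[op, psi]^2];

sX = PauliMatrix[1]/2;  sZ = PauliMatrix[3]/2;   (* spin operators with hbar -> 1 *)
psi = {Cos[t/2], Exp[I p] Sin[t/2]};             (* the Bloch state (2.17), beta = 0 *)

FullSimplify[uncert[sZ, psi], Assumptions -> {0 < t < Pi, Element[p, Reals]}]
(* equals (1/2) Sin[t], in agreement with (2.53) *)

(* Uncertainty relation for A = sZ, B = sX at a sample point on the Bloch sphere *)
With[{s = psi /. {t -> Pi/3, p -> Pi/5}},
  N[{uncert[sZ, s]^2 * uncert[sX, s]^2,
     Abs[expval[sZ . sX - sX . sZ, s]]^2/4}]]
(* the first entry is greater than or equal to the second, as the theorem requires *)
```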

2.5 Direct Products

In the previous sections we explored and summarized various properties of the qubit. In Chap. 1, we defined and employed the direct product to construct multi-qubit register states. In this section, we introduce the direct product operation for matrices and use the latter to represent multi-qubit states with matrices.

Given column matrices v1 and v2 of dimensions n and m respectively i.e.

$$\displaystyle \begin{aligned} v_{1} = \left ( \begin{array}{c} a_{1} \\ a_{2} \\ \vdots \\ a_{n} \end{array} \right ) \quad v_{2} = \left ( \begin{array}{c} b_{1} \\ b_{2} \\ \vdots \\ b_{m} \end{array} \right )\end{aligned} $$

their tensor, or direct, product v1 ⊗ v2 is a column matrix of dimension n m whose elements are arranged in the following manner

$$\displaystyle \begin{aligned} \begin{array}{rcl} v_{1} \otimes v_{2} = \left ( \begin{array}{c} a_{1} b_{1} \\ a_{1} b_{2} \\ \vdots \\ a_{1} b_{m} \\ \hline a_{2} b_{1} \\ \vdots \\ a_{2} b_{m} \\ \hline \vdots \\ \hline a_{n} b_{1} \\ \vdots \\ a_{n} b_{m} \end{array} \right ) {} \end{array} \end{aligned} $$
(2.54)

(Note that the horizontal lines are delimiters, not division symbols.) This definition allows us to construct matrix representations of multi-qubit registers. For example, consider a register comprised of a qubit pair. In the previous chapter we noted that the basis vectors for this four-dimensional Hilbert space are |00〉, |01〉, |10〉, and |11〉. Now since

$$\displaystyle \begin{aligned} | 0 0 \rangle = | 0 \rangle \otimes | 0 \rangle \end{aligned}$$

we make the association

$$\displaystyle \begin{aligned} \begin{array}{rcl} |0 \rangle \otimes | 0 \rangle \Leftrightarrow \left ( \begin{array}{c} 1 \\ 0 \end{array} \right ) \otimes \left ( \begin{array}{c} 1 \\ 0 \end{array} \right ) \end{array} \end{aligned} $$

and using definition (2.54) we obtain

$$\displaystyle \begin{aligned} \begin{array}{rcl} | 0 0 \rangle \Rightarrow \left ( \begin{array}{c} 1 \\ 0 \\0 \\0 \end{array} \right ). {} \end{array} \end{aligned} $$
(2.55)

In the same manner, we find

$$\displaystyle \begin{aligned} \begin{array}{rcl} | 0 1 \rangle = \left ( \begin{array}{c} 0 \\ 1 \\0 \\0 \end{array} \right ), \, | 1 0 \rangle = \left ( \begin{array}{c} 0 \\ 0 \\1 \\0 \end{array} \right ), \, | 1 1 \rangle = \left ( \begin{array}{c} 0 \\ 0 \\0 \\1 \end{array} \right ), {} \end{array} \end{aligned} $$
(2.56)

where we replaced the correspondence symbol with an equality. The four column matrices itemized above represent the computational basis for the Hilbert space of a two-qubit register. The generalization to any n-qubit register is straightforward.

Mathematica Notebook 2.6: Constructing matrix Kronecker products. http://www.physics.unlv.edu/%7Ebernard/MATH_book/Chap2/chap2_link.html
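
An equivalent numerical construction is available in Python, where numpy.kron implements exactly the stacking prescribed by (2.54). The sketch below (variable names ours) reproduces (2.55) and (2.56) and extends the construction to a three-qubit basis state:

import numpy as np

# Single-qubit computational basis as column matrices
ket0 = np.array([1, 0])
ket1 = np.array([0, 1])

# Two-qubit basis built with the Kronecker product (2.54)
ket00 = np.kron(ket0, ket0)   # [1, 0, 0, 0], cf. (2.55)
ket01 = np.kron(ket0, ket1)   # [0, 1, 0, 0]
ket10 = np.kron(ket1, ket0)   # [0, 0, 1, 0]
ket11 = np.kron(ket1, ket1)   # [0, 0, 0, 1], cf. (2.56)

# The generalization to n qubits is iterated, e.g. |110>
ket110 = np.kron(np.kron(ket1, ket1), ket0)
print(ket110)                 # [0 0 0 0 0 0 1 0]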

We defined direct products of operators in the previous chapter. Here we extend that definition into matrix language as follows. Given two operators A, B in an n-dimensional Hilbert space, with matrix representations \( { \underline A}, { \underline B} \) whose elements are \( A_{mn}, B_{pq} \), the direct, or Kronecker, product \( { \underline A} \otimes { \underline B} \) results in a matrix \( { \underline C} \)

$$\displaystyle \begin{aligned} \begin{array}{rcl} {\underline C} = \left ( \begin{array}{ccc} A_{11} {\underline B} & \dots & A_{1n} \, {\underline B} \\ A_{21} {\underline B} & \dots & A_{2n}\, {\underline B} \\ \vdots & \ddots & \vdots \\ A_{n1} \, {\underline B} & \dots & A_{nn} \, {\underline B} \end{array} \right ) {} \end{array} \end{aligned} $$
(2.57)

where

$$\displaystyle \begin{aligned} \begin{array}{rcl} A_{rs} \, {\underline B} \equiv A_{rs} \left ( \begin{array}{ccc} B_{11} &\displaystyle \dots &\displaystyle B_{1n} \\ B_{21} &\displaystyle \dots &\displaystyle B_{2n} \\ \vdots &\displaystyle \ddots &\displaystyle \vdots \\ B_{n1} &\displaystyle \dots &\displaystyle B_{nn} \end{array} \right ). \vspace{-4pt} \end{array} \end{aligned} $$
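
The same NumPy routine handles the operator case. A brief sketch (names ours) forms σX ⊗ σZ and checks that its blocks follow the arrangement of (2.57):

import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

# Kronecker product of two one-qubit operators: each element A_rs of the
# first matrix multiplies the whole second matrix, as in (2.57).
C = np.kron(sx, sz)

# Upper-right 2x2 block is A_12 * B = sigma_Z; upper-left block is A_11 * B = 0.
print(np.allclose(C[0:2, 2:4], sz), np.allclose(C[0:2, 0:2], 0))  # True True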

In (1.31) we were somewhat nitpicky in making a distinction between the direct product of two kets and that of two operators. Definition (2.57) applies to both cases. For example, consider two qubit kets

$$\displaystyle \begin{aligned} \ensuremath{\left|a\right\rangle} = \left ( \begin{array}{c} a_{1} \\ a_{2} \end{array} \right ) \quad \ensuremath{\left|b\right\rangle} = \left ( \begin{array}{c} b_{1} \\ b_{2} \end{array} \right ).\end{aligned} $$

Since the kets are represented by column matrices we invoke (2.57) and find that the product ket \( \ensuremath {\left |a\right \rangle } \otimes \ensuremath {\left |b\right \rangle }\) is identical to the result calculated with (2.54), a special case of expression (2.57). From now on, the symbol ⊗ is used to denote the direct product of both operators and kets (bras). For instance, consider the application of (2.57) to operators A, B, and kets \(\ensuremath {\left |\psi \right \rangle } , \ensuremath {\left |\phi \right \rangle }\) to construct the matrix representations of

$$\displaystyle \begin{aligned} \begin{array}{rcl} ({\mathbf{A}} \otimes {\mathbf{B}} ) \quad {\mathrm{and}} \quad \ensuremath{\left|\psi\right\rangle} \otimes \ensuremath{\left|\phi\right\rangle}. \end{array} \end{aligned} $$

We find that

$$\displaystyle \begin{aligned} \begin{array}{rcl} ({\mathbf{A}} \otimes {\mathbf{B}} ) ( \ensuremath{\left|\psi\right\rangle} \otimes \ensuremath{\left|\phi\right\rangle}) = ({\mathbf{A}} \ensuremath{\left|\psi\right\rangle} )\otimes ({\mathbf{B}} \ensuremath{\left|\phi\right\rangle}), {} \end{array} \end{aligned} $$
(2.58)

in harmony with definition (1.31).
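
Relation (2.58) can also be spot-checked numerically. The sketch below (with randomly generated one-qubit operators and normalized kets; all names are ours) verifies the mixed-product property:

import numpy as np

rng = np.random.default_rng(0)

def random_op():
    # arbitrary complex 2x2 matrix standing in for a one-qubit operator
    return rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))

def random_ket():
    v = rng.normal(size=2) + 1j * rng.normal(size=2)
    return v / np.linalg.norm(v)

A, B = random_op(), random_op()
psi, phi = random_ket(), random_ket()

lhs = np.kron(A, B) @ np.kron(psi, phi)   # (A x B)(|psi> x |phi>)
rhs = np.kron(A @ psi, B @ phi)           # (A|psi>) x (B|phi>)
print(np.allclose(lhs, rhs))              # True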

2.6 Problems

2.1

Do the exercises in Mathematica Notebook 2.1.

2.2

Give the matrix representations of the states \( \ensuremath {\left |\psi \right \rangle } = \frac {1}{\sqrt {2}} \left ( \ensuremath {\left |0\right \rangle } + \exp (\mathrm{i} \delta ) \ensuremath {\left |1\right \rangle } \right ) \), and \( \ensuremath {\left |\phi \right \rangle } = \frac {1}{\sqrt {2}} \left ( \ensuremath {\left |0\right \rangle } - \exp (\mathrm{i} \beta ) \ensuremath {\left |1\right \rangle } \right ) \), and their duals.

2.3

Using the matrices obtained in problem 2.2, evaluate \( \langle \phi \ensuremath {\left |\psi \right \rangle } \), \( \langle \psi \ensuremath {\left |\phi \right \rangle } \). Compare your results with those obtained using the methods discussed in Chap. 1.

2.4

Find the matrix representation for \( \ensuremath {\left |\phi \right \rangle } \ensuremath {\left \langle \psi \right |} \) and \( \ensuremath {\left |\psi \right \rangle } \ensuremath {\left \langle \phi \right |} \), where \(\ensuremath {\left |\psi \right \rangle }, \ensuremath {\left |\phi \right \rangle }\) are defined in problem 2.2.

2.5

Consider the operator

$$\displaystyle \begin{aligned} \begin{array}{rcl} {\mathbf{O}} \equiv \ensuremath{\left|0\right\rangle}\ensuremath{\left\langle0\right|} + \mathrm{i} \ensuremath{\left|1\right\rangle}\ensuremath{\left\langle0\right|} - \mathrm{i} \ensuremath{\left|0\right\rangle} \ensuremath{\left\langle1\right|} - \ensuremath{\left|1\right\rangle} \ensuremath{\left\langle1\right|} . \vspace{-4pt} \end{array} \end{aligned} $$

(a) Evaluate, using Dirac’s method discussed in Chap. 1, \( {\mathbf {O}} \ensuremath {\left |\psi \right \rangle } \), where \(\ensuremath {\left |\psi \right \rangle }\) is defined in problem 2.2. (b) Evaluate by re-expressing O and \( \ensuremath {\left |\psi \right \rangle }\) as matrices. Show that the results obtained in both pictures are isomorphic to each other.

2.6

Identify the following states on the Bloch sphere surface

$$\displaystyle \begin{aligned} \begin{array}{rcl} \mathrm{(a)} \quad &\displaystyle &\displaystyle \ensuremath{\left|\psi_{1}\right\rangle} = \frac{\mathrm{i}}{\sqrt{ 10} } \ensuremath{\left|0\right\rangle} - \frac{3}{\sqrt{10} } \ensuremath{\left|1\right\rangle}, \\ \mathrm{(b)} \quad &\displaystyle &\displaystyle \ensuremath{\left|\psi_{2}\right\rangle} = \exp(\mathrm{i} \pi/4) \ensuremath{\left|0\right\rangle}, \\ \mathrm{(c)} \quad &\displaystyle &\displaystyle \ensuremath{\left|\psi_{3}\right\rangle} = \frac{\mathrm{i}}{\sqrt{ 2}} \left( \ensuremath{\left|0\right\rangle} - \ensuremath{\left|1\right\rangle} \right ). \vspace{-4pt} \end{array} \end{aligned} $$

2.7

Using the matrix representations for the Pauli matrices, verify identities (2.25).

2.8

Given the matrix

$$\displaystyle \begin{aligned} \begin{array}{rcl} \left ( \begin{array}{cc} 4 &\displaystyle -\mathrm{i} \pi \\ 2\exp(\mathrm{i} \pi/4) &\displaystyle 3 \end{array} \right ) \vspace{-4pt} \end{array} \end{aligned} $$

show that it can be expressed in the form (2.24), by identifying the values of the parameters α, β, b, c.

2.9

Find the conjugate transpose U† of expression (2.27). Evaluate the matrix product U†U to confirm that U is unitary.

2.10

Use Mathematica Notebook 2.3 to exponentiate the operators σX, σY, σZ, as defined in Eqs. (2.31) and (2.32). Use these results to confirm relation (2.33).

2.11

Use Mathematica Notebook 2.3 to construct the operator

$$\displaystyle \begin{aligned} {\mathbf{W}} ={\mathbf{U}}_{Z}(\phi/2) {\mathbf{U}}_{Y}(\theta/2) {\mathbf{U}}_{Z}(-\phi/2).\end{aligned} $$

Demonstrate that W is unitary.

2.12

Use the operator that you obtained in problem 2.11 to evaluate the following: (a) W σX W†, (b) W σY W†, (c) W σZ W†. Comment on your results.

2.13

Consider the operator A = W σX W†, where W is the operator defined in problem 2.11, and find the eigenvalues and eigenstates of A.

2.14

Find the eigenvalues and eigenstates of operator

$$\displaystyle \begin{aligned} {\mathbf{A}} = \left ( \begin{array}{cc} a & \sqrt{2} + \mathrm{i} \sqrt{2} \\ \sqrt{2} - \mathrm{i} \sqrt{2} & a \end{array} \right ) \end{aligned}$$

where a is a real number.

2.15

Use Mathematica Notebook 2.4 to plot, as a function of time, the electric field given by expression (2.37), for values of the parameters (a) E0 = 1, δ = 0, δ0 = π, θ = π∕2; (b) E0 = 1, δ = 0, δ0 = 0, θ = 0; (c) E0 = 1, δ = 0, δ0 = 0, θ = π∕4.

2.16

Given the state \(\ensuremath {\left |\psi \right \rangle } = \sqrt { \frac {3}{8} } \ensuremath {\left |0\right \rangle } + \sqrt { \frac {5}{8} } \exp (\mathrm{i} \pi /4) \ensuremath {\left |1\right \rangle } \), find the standard deviation of measurements with the operators (a) σX, (b) σX σY, (c) σZ.

2.17

Find the matrix representation for the following multi-qubit kets.

$$\displaystyle \begin{aligned} \begin{array}{rcl} &\displaystyle &\displaystyle (a) \quad | 1 \rangle \otimes | 1 \rangle \otimes | 0 \rangle \\ &\displaystyle &\displaystyle (b) \quad | 1 \rangle \otimes | 0 \rangle \otimes | 0 \rangle \\ &\displaystyle &\displaystyle (c) \quad | 1 \rangle \otimes ( | 1 \rangle - | 0 \rangle ) \otimes | 0 \rangle \\ &\displaystyle &\displaystyle (d) \quad | 1 \rangle \otimes ( | 1 \rangle - | 0 \rangle ) \otimes ( | 1 \rangle + | 0 \rangle ) \end{array} \end{aligned} $$

2.18

Find the matrix representation of the following operators. (a) , (b) , (c) σX ⊗σX, (d) σX σX.

2.19

Find the matrix representation of the operator,

2.20

Using the definition for the Kronecker product of matrices, verify (2.58) for arbitrary one-qubit operators A, B and states \(\ensuremath {\left |\psi \right \rangle }, \ensuremath {\left |\phi \right \rangle }\).