Abstract
Linear algebra is the language describing systems in finite (or countably infinite) dimensions, where dimension represents the number of variables at hand.
1 Introduction
Linear algebra is the language describing systems in finite (or countably infinite) dimensions, where dimension represents the number of variables at hand. This appears naturally in systems with more than one degrees of freedom or in approximate descriptions of complex systems in a discrete set of variables. Linear algebra also gives the basic framework of quantum mechanics, describing observables in terms of eigenvalues. In two and more dimensions, much of linear algebra is illustrated by vectors and their linear transformations.
Angular momentum \(\mathbf{J}\) is a vector that appears in numerous problems. Like energy and linear momentum, total angular momentum is a conserved quantity. For freely rotating rigid bodies, angular momentum is proportional to the angular velocity
The length \(\Omega \) denotes the rate of rotation and the direction denotes its orientation. According to Mach’s principle, angular velocity is commonly defined relative to the distant stars, where most of the mass is. For periodic motion with period P, the angular velocity satisfies \(\Omega = 2\pi /P\).
For motion at a separation \(\mathbf{r}\) about a given axis, the instantaneous velocity \(\mathbf{v}={\varvec{\Omega }}\times \mathbf{r}\) is tangent to the orbit.
The associated linear momentum is the vector \(\mathbf{p}=m\,d\mathbf{r}/dt=m\mathbf{v}\)
when the mass m is time-independent. The associated angular momentum of a particle with linear momentum \(\mathbf{p}\) is \(\mathbf{J}=\mathbf{r}\times \mathbf{p}\).
Thus, \(\mathbf{J}\) is a vector formed out of \(\mathbf{r}\) and \(\mathbf{p}\). Transformations of \(\mathbf J\) follow the rules for vectors, e.g., when considering translation or rotation of a coordinate system.
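The relation \(\mathbf{J}=\mathbf{r}\times \mathbf{p}\) can be checked with a short numerical sketch (Python, with arbitrary illustrative values assumed here), confirming that for circular motion \(\mathbf{J}\) points along the rotation axis with magnitude \(m r^2\Omega \):

```python
import math

def cross(a, b):
    # Cross product a x b of two 3-vectors
    return [a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0]]

# Hypothetical particle: mass m on a circular orbit of radius r in the xy-plane
m, r_len, omega = 2.0, 3.0, 0.5          # arbitrary illustrative values
r = [r_len, 0.0, 0.0]                    # position
v = [0.0, omega * r_len, 0.0]            # tangential velocity v = Omega x r
p = [m * vi for vi in v]                 # linear momentum p = m v
J = cross(r, p)                          # angular momentum J = r x p

# For circular motion, J points along the rotation axis with |J| = m r^2 omega
assert J == [0.0, 0.0, m * r_len**2 * omega]
```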
Maps in linear algebra are represented by \(n\times m\) matrices, mapping a linear vector space of dimension m to a linear vector space of dimension n. These vector spaces are often over the real or complex numbers, e.g., \(\mathbb {R}^n\) or \(\mathbb {C}^n\). As such, a matrix is composed of row and column vectors of length m and, respectively, n. An \(n\times m\) matrix is said to be of dimension \(n\times m\).
Consider, for instance, a \(2\times 2\) matrix
Its rows \(\mathbf{r}_{(i)}\) and columns \(\mathbf{c}_{(i)}\) \((i=1,2)\) can be schematically indicated as
with
where we explicitly include the transpose T to denote row vectors according to
These \(2\times 2\) matrices describe various transformations in the two-dimensional plane, such as reflections, rotations and coordinate permutations. Key properties are eigenvalues and the associated eigenvectors, much of which depends on their determinants and symmetry properties.
2 Inner and Outer Products
Two linearly independent vectors \(\mathbf{a}\) and \(\mathbf{b}\) span a parallelogram. The projection of \(\mathbf{a}\) onto \(\mathbf{b}\) defines the inner product (further Sect. 3.5) \(\mathbf{a} \cdot \mathbf{b} = |\mathbf{a}| |\mathbf{b}| \cos \theta \) with
denoting the cosine of the angle between the two, where \(|\mathbf{a}|\) refers to the length of \(\mathbf{a}\), satisfying \(\left| \mathbf{a}\right| = \sqrt{\mathbf{a}\cdot \mathbf{a} }\). Referenced to a Cartesian coordinate system with basis vectors \(\{\mathbf{i}, \mathbf{j},\mathbf{k}\}\), these expressions obtain in component form. In three dimensions, we have
and so
The outer product represents the area element, in area and orientation, of the parallelogram, represented by the normal vector
where we used the right-hand rule, in the direction of advance of a corkscrew turned from \(\mathbf{a}\) to \(\mathbf{b}\). Its length equals the area of the parallelogram
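A brief numerical sketch (Python, with two illustrative vectors assumed here) confirms the two products: the inner product gives the angle, while the outer product gives a normal vector whose length is the parallelogram's area:

```python
import math

def dot(a, b):
    # Inner product of two 3-vectors
    return sum(x*y for x, y in zip(a, b))

def cross(a, b):
    # Outer (cross) product a x b, following the right-hand rule
    return [a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0]]

# Two illustrative vectors spanning a parallelogram in the xy-plane
a = [3.0, 0.0, 0.0]
b = [1.0, 2.0, 0.0]

cos_theta = dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))
n = cross(a, b)                 # normal vector to the parallelogram
area = math.sqrt(dot(n, n))     # |a x b| = area of the parallelogram

assert area == 6.0              # base 3, height 2
assert dot(n, a) == 0.0 and dot(n, b) == 0.0   # n is normal to both
```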
3 Angular Momentum Vector
In circular motion, angular momentum \(\mathbf{J}\) is a vector with the same orientation as the angular velocity \({\varvec{\Omega }}\). By the vector identity
between vectors \(\mathbf{a,b,c}\), circular motion gives the specific angular momentum (angular momentum per unit mass)
since \(\mathbf{r}\cdot \mathbf{r}=r^2\) and \(\mathbf{r}\cdot {\varvec{\Omega }}=0\). With (3.4, 3.5), our model problem of circular motion, therefore, implies
that is, \(\mathbf{j}\) represents twice the rate-of-change of the surface area traced out by the radius \(\mathbf{r}\) in the orbital motion. Based on (3.4–3.17), this is a geometrical identity, not restricted to circular motion, familiar as Kepler’s second law in planetary motion.
3.1 Rotations
If \(z=re^{i\theta }\), then
as illustrated in Fig. 3.1. That is
Applied to the basis vectors \(\{\mathbf{i}_x,\mathbf{i}_y\}\), we have
Example 3.1. Consider a basis \(\{ \mathbf{i},\mathbf{j}\}\) rotating along with a point (x, y) over the unit circle \(S^1\). That is, \(\mathbf{i}\) points to (x, y) with local tangent \(\mathbf{j}\) to \(S^1\) with counter-clockwise orientation. Moving along \(S^1\) at constant angular velocity \(\omega \), \(\theta =\omega t\) as a function of time t and (3.20) implies
and so
3.2 Angular Momentum and Mach’s Principle
Following (3.5) and (3.23), circular particle motion satisfies
where I denotes the moment of inertia about \(\mathbf{n}\).Footnote 1 Evidently, (3.23) implies that \(J=0\) whenever \(\Omega =0\) and vice versa. In an astronomical context, we may follow MachFootnote 2 and define the angular velocity as the rate of change of angles measured relative to the distant stars. Does (3.23) hold in general?
It turns out that angular momentum is sensitive to matter anywhere in the universe. While (3.23) holds true to great precision under ordinary circumstances when \(\Omega \) is defined relative to the distant stars, deviations appear in the proximity of massive rotating bodies. This can be detected by tracking the orientation \(\mathbf{n}\) of a freely suspended gyroscope relative to a distant star. Recently, the NASA satellite Gravity Probe BFootnote 3 did just that, and measured an angular velocity in \(\mathbf{n}\) at a minute rate of
It agrees within a 20% window of uncertainty with the frame-dragging angular velocity of space-time around the Earth, induced by the Earth’s angular momentum according to the theory of general relativity. According to the exact solution of rotating black holes in general relativity [3], (3.24) is the frame-dragging angular velocity at about 5 million Schwarzschild radii around a maximally spinning black hole with the same angular momentum as the Earth (and 27 times its mass).
Though small, (3.24) defines a key result in our views on the relation between rotation and angular momentum, which emerges non-trivially in the curved space-time predicted by the theory of general relativity. In particular, it changes our perception of the ballerina effect (Fig. 3.2). In reality, a ballerina standing still with respect to the distant stars experiences a slight lifting of her arms, due to the non-zero angular momentum imparted on her by frame-dragging around the Earth.Footnote 4 In a twist to the original formulation of Mach’s principle, she would have to co-rotate at the angular velocity (3.24) for her arms to be down in a fully relaxed state.
Frame dragging (3.24) induced by the angular momentum of the Earth is manifest also in energetic spin-spin interactions.Footnote 5 In response, particles with angular momentum \(J_p\) about the spin axis of the Earth experience a potential energy [4]
that represents a line-integral of Papapetrou forces [5] mediated by \(\omega \). The energy (3.25) is notoriously small for the \(J_p\) of classical objects. However, for charged particles such as electrons or protons in magnetic fields around black holes, E can be huge, reaching energies on the scale of Ultra High Energy Cosmic Rays (UHECRs). Measurement of (3.25) around the Earth awaits future satellite experiments.
3.3 Energy and Torque
Angular momentum \(\mathbf{J}=J \mathbf{n}\) can be changed by application of a torque, defined as
The dimension of torque is energy, as follows from [J] = g cm\(^2\) s\(^{-1}\) (mass times rate of change of area): its rate of change carries dimension g cm\(^2\) s\(^{-2}\), i.e., erg. Because angular momentum is a vector, (3.26) shows that a torque appears already when changing its orientation, even when keeping its magnitude constant. In this case, (3.26) may be due to a rotation, i.e.,
where \(\mathcal{R}\) is a rotation matrix (more on matrices in Sect. 3.5). For a rotation over an angle \(\varphi \) about the x-axis, for example, we have (Sect. 3.4.2)
Feynman [6] gives an illustrative set-up that can be performed using a bicycle wheel attached freely to a rod. In this event, \(\mathbf{n}\) is along the y-axis when the rod is initially held horizontally. Attempting to rotate the rod about the x-axis in an effort to move the wheel overhead is described by (3.27), see Fig. 3.3. By (3.28), it introduces a component of \(\Delta \mathbf{T}\) along the z-axis. The person performing the rotation will experience a tendency to start rotating in the opposite direction to the angular momentum of the wheel, by conservation of total angular momentum in all three dimensions (in each of the three components x, y and z), i.e.,
Since power is a scalar of dimension energy s\(^{-1}\), the power delivered to or extracted from a rotating object is given by the inner product of torque and angular velocity, i.e.,
For our circular motion, we have \(\mathbf{T}=\frac{d}{dt}{} \mathbf{J}=I\frac{d}{dt}{\varvec{\Omega }}\), and hence
It follows that the rotational energy in case of \(\mathbf{J}=I{\varvec{\Omega }}\) satisfies
Although (3.32) applies to non-relativistic mechanics such as spinning tops, somewhat remarkably it gives a fairly good approximation also to the rotational energy \(E_{rot} = k \, {\varvec{\Omega }}\cdot \mathbf{J}\), \(k^{-1}= {2\cos ^2(\lambda /4)}\), of a rotating black hole with non-dimensional angular momentum \(\sin \lambda \), since
To exemplify angular momentum conservation, consider the problem of the Moon’s migration, as it absorbs angular momentum from the Earth’s spin due to a gravitational tidal torque.
Example 3.2. Some 4.52 Gyr ago, before the Moon was born, the Earth’s spin period was \(P=5.4\) h. The Earth’s normalized angular velocity
then (4.52 Gyr ago) was very similar to that of Jupiter today, where
denote the actual and, respectively, break-up angular velocity for a planet of mass M and radius R, and G is Newton’s constant. Some data:
The above follows from the following.

- For the Earth’s \(\Omega _{\oplus ,b}\) and today’s value \(\Omega _\oplus =2\pi /P_\oplus \), we have
  $$\begin{aligned} A_1=\left( \frac{\Omega }{\Omega _b}\right) _{\oplus }. \end{aligned}$$
  (3.37)
- The change in \(P_\oplus \) to 5.4 h from 24 h today satisfies the scaling
  $$\begin{aligned} \left( \frac{\Omega }{\Omega _b}\right) _\oplus \propto P^{-1}_\oplus . \end{aligned}$$
  (3.38)
- Consequently, the spin angular velocity relative to break-up at birth satisfies
  $$\begin{aligned} A_0=\left( \frac{5.4\,\text{ h }}{24\,\text{ h }}\right) ^{-1} A_1, \end{aligned}$$
  (3.39)
  which may be compared to the same ratio for Jupiter today.
With Newton’s constant \(G=6.67\times 10^{-8}\) g\(^{-1}\) cm\(^3\) s\(^{-2}\) (recall that \(G\rho \) has the dimension of angular velocity squared, i.e., s\(^{-2}\)), we have by explicit calculation
and hence the ratio
By aforementioned scaling with \(P_\oplus \), we have
Repeating the above for Jupiter,
that is, our \(A_0\) of 4.52 Gyr ago and Jupiter’s \(B_1\) today are very similar. As a consequence, we expect the weather on the Earth at birth to have been very similar to that of Jupiter today: essentially a permanent storm driven by exceedingly large Coriolis forces. Recall that Coriolis forces scale with \(\Omega _{\oplus }^2\propto P_\oplus ^{-2}\). They were initially some 20 times stronger than they are now. Thanks, in part, to spin-down by the Moon, we can enjoy today’s clement climate [7].
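The comparison in Example 3.2 can be reproduced numerically (a Python sketch; the masses, radii and periods below are approximate textbook values assumed here, since the chapter's data list is not reproduced):

```python
import math

G = 6.67e-8  # Newton's constant [g^-1 cm^3 s^-2]

def spin_ratio(M, R, P_hours):
    """Omega / Omega_b for a planet of mass M [g], radius R [cm], period P [h]."""
    Omega = 2.0 * math.pi / (P_hours * 3600.0)
    Omega_b = math.sqrt(G * M / R**3)       # break-up angular velocity
    return Omega / Omega_b

# Approximate planetary data (assumed here for illustration)
A1 = spin_ratio(5.97e27, 6.37e8, 24.0)      # Earth today, A1 ~ 0.06
A0 = (24.0 / 5.4) * A1                      # Earth at birth (P = 5.4 h), by (3.38, 3.39)
B1 = spin_ratio(1.90e30, 7.0e9, 9.9)        # Jupiter today

# A0 and B1 come out comparable, as claimed in the text
assert abs(A0 - B1) < 0.1
```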
3.4 Coriolis Forces
Conservation of angular momentum gives rise to apparent forces when moving things around by external forces that leave the angular momentum invariant, as in the absence of any frictional forces. The specific angular momentum in the presence of an angular velocity \(\omega \) is
where \(\sigma \) denotes the distance to the axis of rotation. Moving a fluid element along the radial direction changes \(\omega \), as when the ballerina moves stretched arms inwards, according to \(\delta \omega =-2 {j}{\sigma ^{-3}}\delta \sigma \). It comes with a change in azimuthal velocity \(\delta v_\varphi =\sigma \delta \omega \) seen in a corotating frame, satisfying
In vector form, (3.45) is
This result is commonly expressed in terms of the Coriolis force
Coriolis forces are particularly relevant when working in a rotating frame of reference, notably for all of us terrestrial inhabitants living in the rotating frame fixed to the Earth’s surface. Air moving to a different latitude is subject to (3.47), since it changes the distance \(\sigma \) to the Earth’s axis of rotation, which is approximately polar. Let \(\Omega \) denote the absolute angular velocity of the Earth (relative to the distant stars), and express the angular velocity of the air relative to it as \(\omega ^\prime = \omega -\Omega \), as measured in this rotating frame. Since \(\delta \omega ^\prime =\delta \omega \), moving air in, say, the direction of the equator produces a retrograde azimuthal velocity (rotation at an angular velocity \(\omega <\Omega \)). Moving it at a constant velocity towards the equator produces a curved trajectory in response to the (retrograde) Coriolis force (3.47). This may give rise to large-scale circulation patterns in combination with pressure gradients.
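A minimal numerical sketch (Python) of the Coriolis force on equatorward-moving air, assuming the standard convention \(\mathbf{F}=-2{\varvec{\Omega }}\times \mathbf{v}\) per unit mass (the explicit form of (3.47) is not reproduced above) and the Earth's sidereal rotation rate:

```python
def cross(a, b):
    # Cross product a x b of two 3-vectors
    return [a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0]]

# Earth's angular velocity along the polar (z) axis [rad/s]
Omega = [0.0, 0.0, 7.27e-5]

# Air parcel moving toward the equator (negative y here) at 10 m/s = 1e3 cm/s
v = [0.0, -1.0e3, 0.0]

# Specific Coriolis force f = -2 Omega x v (per unit mass, assumed convention)
f = [-2.0 * c for c in cross(Omega, v)]

# The force is purely azimuthal (here along -x): the parcel is deflected sideways
assert f[0] < 0.0 and f[1] == 0.0 and f[2] == 0.0
```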
3.5 Spinning Top
The motion of a spinning top tilted at an angle \(\theta \) exemplifies the interaction of angular momentum as a vector with a torque, \(\mathbf{T}\), applied continuously by the Earth’s gravitational force \(\mathbf{F}_g\) as illustrated in Fig. 3.4. In general, we have the relations
For a top that spins with no friction, the magnitude of its angular momentum vector is conserved. By (3.26, 3.27), the top precesses at an angular velocity \({\varvec{\Omega }}_p\) about the z-axis, \(\Omega _p = d\phi /dt\), satisfying
By (3.48), \(T = \Omega _p J \sin \theta = r W \sin \theta \), and hence the angular velocity of precession about the vertical axis satisfies
where W denotes the weight of the top and r the distance of its center of mass away from its pivot on the table.
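The precession rate \(\Omega _p = rW/J\) of (3.50) is easily evaluated numerically; a Python sketch with toy values assumed here for a small top (note that \(\sin \theta \) cancels, so \(\Omega _p\) is independent of the tilt):

```python
import math

# Illustrative (assumed) numbers for a toy top, cgs units
m = 100.0            # mass [g]
g = 981.0            # gravitational acceleration [cm/s^2]
r = 3.0              # pivot-to-center-of-mass distance [cm]
I = 500.0            # moment of inertia about the spin axis [g cm^2]
omega_spin = 100.0   # spin angular velocity [rad/s]

J = I * omega_spin   # spin angular momentum
W = m * g            # weight
Omega_p = r * W / J  # precession rate (3.50), independent of the tilt angle

# Gyroscopic regime: precession is much slower than the spin
assert Omega_p < omega_spin
assert math.isclose(Omega_p, 5.886)
```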
Example 3.3. Illustrative for some vector calculations is a more explicit calculation of the precession frequency (3.50). To this end, Fig. 3.5 shows a massive ring of radius R spinning at an angular velocity \(\omega \), whereby it attains an angular momentum per unit mass \(J=\omega R^2\). Suppose it is mounted to one end of a rod that is suspended at a pivot at the other end. An approximately horizontal rod hereby precesses with an angular velocity \(\omega _p\) about the vertical axis without dropping to a vertical position, satisfying (3.49). This result is invariant under linear translation of the CM. Precession is entirely due to the motion of mass-elements about the ring’s CM, allowing us to place the CM at the origin of a spherical coordinate system \((r,\theta ,\varphi )\), as if the CM were placed at the pivot.
With \(\varvec{\omega }=\omega \mathbf{i}_z\), the outer product \(\varvec{\omega }\times \mathbf{r}\) is the rotational velocity \(\mathbf{v}_\phi \) of the end point of a vector \(\mathbf{r}\), with \(\mathbf{v}_\phi = \omega \sigma \), where \(\sigma =b\sin \theta \) is the distance to the axis of rotation. A mass element \(\delta m=(M/2\pi )\delta \theta \) in the ring herein assumes an angular momentum \(\delta \mathbf{J} = \mathbf{r} \times \delta \mathbf{p} = \delta m\, \mathbf{r}\times \mathbf{v}\) with position vector
and associated velocity \(\mathbf{v}=d\mathbf{r}/dt\),
and acceleration \(\mathbf{a}=d\mathbf{v}/dt\),
Its inertia introduces a torque
that evaluates to
To finalize, we integrate (3.56) over all mass elements \(\delta m\). Making use of the following averages over the fast angle \(\theta \),
and
we arrive at a total inertial torque \(\mathbf{T}=\int _0^{2\pi } \delta \mathbf{T}\),
With \(J=I\omega \) expressed in the moment of inertia \(I=Mb^2\), the latter reduces to \(T=\omega \omega _pMb^2=\omega _pJ\), i.e., our vector identity (3.49).
In Fig. 3.5, if the bar holding the rotating wheel is initially suspended horizontally at the pivot with zero angular momentum about the z-axis, then the onset of precession \(\omega _p\)—balancing inertial to gravitational torque \(gM\sigma \)—produces a finite angular momentum \(J_z = M\sigma ^2\omega _p\) about the z-axis (upwards, say), where \(\sigma = l\cos \alpha \) is the arm length to the z-axis, now at a dip angle \(\alpha \). Since the total angular momentum about the z-axis remains zero, \(J_z=J\sin \theta \) (pointing downwards). Given \(\omega _pJ=Mg\sigma \), it follows that (cf. Exercise 3.3)
where \(\Omega = \sqrt{g/\sigma }\).
4 Elementary Transformations in the Plane
In the two-dimensional plane with Cartesian coordinates (x, y), transformations describe a map
When linear, such a map is a matrix multiplication \(\mathbf{w} = C\mathbf{z}\),
with
and
Equivalently, we have
These two views (3.64–3.66) explicitly bring about linearity in the row and column vectors of C.
When working in the two-dimensional plane, we note that (3.62) is equivalent to a map of complex numbers \(z= x+iy \rightarrow w=x^\prime + i y^\prime \), which is occasionally useful when working with conformal transformations \(w=w(z)\) (\(w^\prime (z)\ne 0\)).
4.1 Reflection Matrix
Figure 3.6 illustrates reflections in the two-dimensional plane about the x-axis, the y-axis and through the origin, \(\mathcal{O}=(0,0)\). Reflection about the x-axis is described by
The same transformation can be written as a matrix equation for \(x^\prime = x\) and \(y^\prime = -y\) as follows
Reflection about the y-axis is described by
The same transformation can be written as a matrix equation for \(x^\prime = -x\) and \(y^\prime = y\) as follows
As mentioned above, (3.67–3.69) are equivalent to taking \(z\,\epsilon \,\mathbb {C}\) into, respectively,
Reflection about the origin is described by
The same transformation can be written as a matrix equation for \(x^\prime =- x\) and \(y^\prime = -y\) as follows
The identity matrix is defined by the transformation which leaves \(\mathbf{z}\) unchanged, i.e.,
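The reflections (3.67–3.72) act by matrix multiplication; a short Python sketch applies each to an illustrative point and checks that reflecting twice restores the original:

```python
def matvec(M, v):
    # Apply a 2x2 matrix to a 2-vector
    return [M[0][0]*v[0] + M[0][1]*v[1],
            M[1][0]*v[0] + M[1][1]*v[1]]

Rx = [[1, 0], [0, -1]]    # reflection about the x-axis: (x, y) -> (x, -y)
Ry = [[-1, 0], [0, 1]]    # reflection about the y-axis: (x, y) -> (-x, y)
RO = [[-1, 0], [0, -1]]   # reflection through the origin
I2 = [[1, 0], [0, 1]]     # identity

z = [2, 3]                # illustrative point
assert matvec(Rx, z) == [2, -3]
assert matvec(Ry, z) == [-2, 3]
assert matvec(RO, z) == [-2, -3]
assert matvec(I2, z) == z

# Each reflection is its own inverse
assert matvec(Rx, matvec(Rx, z)) == z
assert matvec(RO, matvec(RO, z)) == z
```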
4.2 Rotation Matrix
The above can be extended to continuous transformations such as rotations. The rotation matrix can be derived from the multiplication of complex numbers following (3.18) and (3.62). With \(\mathbf{z}=r\cos \theta \mathbf{i}_x + r \sin \theta \mathbf{i}_y\), we have
in terms of the rotation matrix
Evidently, it satisfies
where \(R^{-1}\) refers to the inverse of R, \(R^T\) refers to the transpose and
defines the determinant of a \(2\times 2\) matrix.
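The properties \(R^{-1}=R^T\) and \(\det R = 1\) can be verified numerically; a Python sketch, assuming the standard form \(R(\varphi )=\begin{pmatrix} \cos \varphi &{} -\sin \varphi \\ \sin \varphi &{} \cos \varphi \end{pmatrix}\) for (3.76):

```python
import math

def R(phi):
    # Standard 2x2 rotation matrix (assumed form of (3.76))
    return [[math.cos(phi), -math.sin(phi)],
            [math.sin(phi),  math.cos(phi)]]

def matmul(A, B):
    return [[sum(A[i][k]*B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def det(A):
    return A[0][0]*A[1][1] - A[0][1]*A[1][0]

phi = 0.7
Rt = [[R(phi)[j][i] for j in range(2)] for i in range(2)]   # transpose R^T
P = matmul(R(phi), Rt)                                      # R R^T

# R R^T = I (so R^{-1} = R^T) and det R = 1
assert all(math.isclose(P[i][j], float(i == j), abs_tol=1e-12)
           for i in range(2) for j in range(2))
assert math.isclose(det(R(phi)), 1.0)
```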
For what follows, we shall generalize (3.9) to matrices. For a square \(n\times n\) matrix A, the transpose obtains by interchanging the off-diagonal components \(a_{ij}\) \((i\ne j)\) about the principal diagonal containing the \(a_{ii}\). Schematically, if U refers to the upper off-diagonal elements and L refers to the lower off-diagonal elements, then
The rotation matrix \(R(\varphi )\) in (3.76) is anti-symmetric in its off-diagonal elements, i.e., \(U=-L\). A square matrix is said to be anti-symmetric if \(U=-L\) and the elements on the principal diagonal are zero. Since the diagonal elements in (3.76) are non-zero, \(R(\varphi )\) is not an anti-symmetric matrix.
Example 3.4. A symmetric matrix, satisfying \(U=L\) as defined in (3.79), is the Lorentz boost
that appears in the transformation of four-momenta in Minkowski space. Both \(R(\varphi )\) and \(\Lambda (\mu )\) have determinant one,
5 Matrix Algebra
Multiplication of two matrices A of dimension \(p\times m\) and B of dimension \(m\times q\) produces a new matrix \(C=AB\) of dimension \(p\times q\). Each entry of C is the inner product of a row from A and a column from B. Schematically, the product C of two \(2\times 2\) matrices is
upon considering A in terms of its rows and B in terms of its columns. The entries of C satisfy
The product \(D=BA\) of the same \(2\times 2\) matrices satisfies
upon considering B in terms of its rows and A in terms of its columns, so that
It is easy to see that in general \(D\ne C\), i.e., matrix multiplication does not commute,
where the notation \([\cdot ,\cdot ]\) refers to the commutator.
Example 3.3. To illustrate, consider the two matrices
The commutator [A, B] then evaluates to
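Non-commutativity is easy to exhibit numerically; a Python sketch with two illustrative matrices assumed here (the example matrices of the text are not reproduced above):

```python
def matmul(A, B):
    # Product of two square matrices: rows of A against columns of B
    n = len(A)
    return [[sum(A[i][k]*B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Two illustrative (assumed) 2x2 matrices
A = [[1, 2], [0, 1]]
B = [[1, 0], [3, 1]]

C = matmul(A, B)       # C = AB
D = matmul(B, A)       # D = BA

# Matrix multiplication does not commute: [A, B] = AB - BA is non-zero
commutator = [[C[i][j] - D[i][j] for j in range(2)] for i in range(2)]
assert C != D
assert commutator == [[6, 0], [0, -6]]
```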
6 Eigenvalue Problems
Eigenvalue problems are defined by the equation
where \(\mathbf{a}\) refers to an eigenvector associated with the eigenvalue \(\lambda \). Equivalently, \(\mathbf{a}\) is in the null-space (is a right null-vector) of \(A-\lambda I\):
For (3.90) to have a non-trivial solution \(\mathbf{a}\), we must have
6.1 Eigenvalues of \(R(\varphi )\)
Let us explore (3.90, 3.91) for the rotation matrix \(R(\varphi )\),
The eigenvalues lie on the unit circle \(S^1\). This is a consequence of the fact that rotation is unitary (see Sect. 3.7). Also, the eigenvalues satisfyFootnote 6
The associated eigenvectors
satisfy (3.90). To be definite, (3.90) defines two homogeneous equations in the two unknown coefficients \((\alpha _1,\alpha _2)\),
For the eigenvalues satisfying (3.93), these two equations are linearly dependent. It suffices to take one of them, to solve for \(\alpha _1\) and \(\alpha _2\),
for \(\lambda =e^{i\varphi }\) and, respectively, \(\lambda =e^{-i\varphi }\). We thus arrive at the eigenvector-eigenvalue pairs
These two pairs are complex conjugates. This is no surprise since \(R(\varphi )\) is a real matrix, whose determinant \(\left| R-\lambda I \right| \) defines a quadratic polynomial in \(\lambda \). With real coefficients, its roots are either both real or a pair of complex conjugates.
6.2 Eigenvalues of a Real-Symmetric Matrix
The matrix
is real-symmetric with eigenvalue-eigenvector pairs \((\lambda _\pm ,\mathbf{x}_\pm )\)
It is readily seen that \(\mathbf{x}_\pm \) are orthogonal:
We can normalize the eigenvectors to
so that \((\mathbf{e}_+,\mathbf{e}_-)\) forms a new orthonormal basis set complementary to \((\mathbf{i},\mathbf{j})\) along the x- and y-axis. Hence, we have the general decompositions
where \(x=\mathbf{i}\cdot \mathbf{x}\) and \(y=\mathbf{j}\cdot \mathbf{x}\). The coefficients a and b can be read off using multiplication by \(\mathbf{e}_\pm \):
Note that (3.89) defines the eigenvectors as invariant subspaces. We now arrive at a new look at A as an operator on \(\mathbf{x}\), in terms of multiplications by the eigenvalues along the directions given by the associated eigenvectors,
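A Python sketch of this decomposition, assuming for illustration a real-symmetric matrix of the representative form \(\begin{pmatrix} a &{} b \\ b &{} a\end{pmatrix}\) (the matrix (3.98) is not reproduced above), whose orthonormal eigenvectors are \((1,\pm 1)/\sqrt{2}\) with eigenvalues \(a\pm b\):

```python
import math

# Assumed illustrative real-symmetric matrix [[a, b], [b, a]]
a, b = 3.0, 1.0
A = [[a, b], [b, a]]

# Eigenpairs: lambda_± = a ± b with normalized eigenvectors e_± = (1, ±1)/sqrt(2)
lam_p, lam_m = a + b, a - b
e_p = [1/math.sqrt(2),  1/math.sqrt(2)]
e_m = [1/math.sqrt(2), -1/math.sqrt(2)]

def matvec(M, v):
    return [M[0][0]*v[0] + M[0][1]*v[1], M[1][0]*v[0] + M[1][1]*v[1]]

def dot(u, v):
    return u[0]*v[0] + u[1]*v[1]

# A e_± = lambda_± e_±, and the eigenvectors are orthonormal
assert all(math.isclose(matvec(A, e_p)[i], lam_p * e_p[i]) for i in range(2))
assert all(math.isclose(matvec(A, e_m)[i], lam_m * e_m[i]) for i in range(2))
assert math.isclose(dot(e_p, e_m), 0.0, abs_tol=1e-12)

# Any x decomposes as x = (e_+ . x) e_+ + (e_- . x) e_-, so A acts by
# scaling each component with the corresponding eigenvalue
x = [2.0, 0.5]
Ax = [lam_p * dot(e_p, x) * e_p[i] + lam_m * dot(e_m, x) * e_m[i]
      for i in range(2)]
assert all(math.isclose(Ax[i], matvec(A, x)[i]) for i in range(2))
```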
6.3 Hermitian Matrices
Let \(^\dagger \) denote the Hermitian conjugate,Footnote 7 defined as the complex conjugate of the transpose of a matrix element, a column or row vector or a matrix. We define the scalar product of two vectors \(\mathbf{a}\) and \(\mathbf{b}\) in an n-dimensional vector space by
Real-symmetric matrices generalize to complex valued matrices with the same properties of having real eigenvalues and mutually orthogonal eigenvectors associated with different eigenvalues according to (3.116) and, respectively, (3.119). Following the steps of the previous section, these are the self-adjoint or Hermitian matrices satisfying
defined by transformation of the entries \(H^\dagger _{ij}=\bar{H}_{ji}\). Note that applying \(\dagger \) twice is an identity operation, i.e., \((A^\dagger )^\dagger =A\) for any \(n\times m\) matrix A. Hence, if H is an \(n\times n\) matrix, we have
with real diagonal elements \(a_{ii}\) \((i=1,2,\ldots ,n)\).
Example 3.5. For instance, the rotation matrix \(R(i\mu )\) with imaginary angle \(\varphi = i\mu \),
is Hermitian. Since \(|R(\varphi )|=1\) for all \(\varphi \), we have \(|H|=1\) by analytic continuation, which also follows by inspection,
The eigenvalue-eigenvectors obtain by analytic continuation of (3.97), i.e.,
According to (3.105), the scalar product between the two eigenvectors satisfies
This result of Example 3.5 is expected, since (3.117–3.119) continues to hold upon replacing T by \(\dagger \), i.e.,
For a Hermitian matrix, the eigenvectors of distinct eigenvalues are mutually orthogonal, where orthogonality is defined according to the inner product (3.105).
Very similar properties of the eigenvalue problem (3.89) appear in the real-symmetric matrix \(\Lambda (\mu )\) of (3.80). Again, we will find that the eigenvalues are real and distinct, whose accompanying eigenvectors are mutually orthogonal. These properties hold true for all real-symmetric matrices, as shown by the following.
Consider an eigenvalue-eigenvector pair \((\lambda ,\mathbf{a})\) to a Hermitian matrix A. Then
Here, \(\mathbf{a}^\dagger \mathbf{a}\) is real, obtained from the summation of the squared norms of the entries of \(\mathbf{a}\). For (3.94), for example, we have
The transpose of the left hand side of (3.113) satisfies
and hence \(\lambda \mathbf{a}^\dagger \mathbf{a} = \overline{\lambda }~\overline{\mathbf{a}^\dagger \mathbf a} = \overline{\lambda }{\mathbf{a}^\dagger \mathbf a}\). It follows that the eigenvalues of a Hermitian matrix are real:
since \(\mathbf{a}^T\overline{\mathbf{a}} \equiv \mathbf{a}^\dagger \mathbf{a}\).
Following similar arguments, consider
For a Hermitian A, we have
By (3.117, 3.118), we have \(\lambda _2 \mathbf{a}^\dagger _1\mathbf{a}_2 = \lambda _1 \mathbf{a}^\dagger _2 \mathbf{a}_1\). Since \(\mathbf{a}^\dagger _1\mathbf{a}_2=\mathbf{a}^\dagger _2\mathbf{a}_1\), it follows that
For a Hermitian matrix, the eigenvectors of distinct eigenvalues are mutually orthogonal .
Let us now turn to the example matrix \(\Lambda (\mu )\) in (3.80). Its eigenvalues are defined by (3.91) with \(A=\Lambda \), that is,
whereby
Similar to (3.92), we note
The equation for the eigenvectors (3.95) in terms of \((\alpha _1,\alpha _2)\) are again a linearly dependent system of equations when \(\lambda \) assumes one of the eigenvalues (3.121). Considering the first of (3.95) with \(\lambda =e^\mu \),
we obtain the eigenvalue-eigenvector pair
According to (3.119), the eigenvector associated with \(\lambda =e^{-\mu }\) is orthogonal to that of (3.124). Since we are working in two dimensions, the second eigenvalue-eigenvector pair is therefore
as illustrated in Fig. 3.7. The same obtains by solving (3.123) with \(e^{\mu }\) replaced by \(e^{-\mu }\).
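The eigenpairs \(e^{\pm \mu }\) with orthogonal eigenvectors can be checked numerically; a Python sketch, assuming the standard symmetric boost form \(\Lambda (\mu )=\begin{pmatrix} \cosh \mu &{} \sinh \mu \\ \sinh \mu &{} \cosh \mu \end{pmatrix}\) for (3.80):

```python
import math

mu = 0.8
L = [[math.cosh(mu), math.sinh(mu)],
     [math.sinh(mu), math.cosh(mu)]]    # assumed form of the boost (3.80)

# Eigenvalues e^{±mu} with orthogonal eigenvectors (1, 1) and (1, -1)
for lam, v in [(math.exp(mu), [1.0, 1.0]), (math.exp(-mu), [1.0, -1.0])]:
    Lv = [L[0][0]*v[0] + L[0][1]*v[1], L[1][0]*v[0] + L[1][1]*v[1]]
    assert all(math.isclose(Lv[i], lam * v[i]) for i in range(2))

# The two eigenvectors are orthogonal, and det Lambda = cosh^2 - sinh^2 = 1
assert 1.0*1.0 + 1.0*(-1.0) == 0.0
assert math.isclose(L[0][0]*L[1][1] - L[0][1]*L[1][0], 1.0)
```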
7 Unitary Matrices and Invariants
The reflections and rotations shown in Fig. 3.6 preserve norm and angles. If \(\mathbf{a}\) and \(\mathbf{b}\) are two real vectors and \(\mathbf{a}^\prime \) and \(\mathbf{b}^\prime \) are their images, e.g.,
then the inner product
is preserved, since
by the property of unitarity
In particular, \(|\mathbf{a}^\prime |^2=(\mathbf{a}^\prime )^T\mathbf{a}^\prime =|\mathbf{a}|^2\) and, likewise, \(|\mathbf{b}^\prime |^2=|\mathbf{b}|^2\), showing that their norms are preserved. If \(\theta \) and \(\theta ^\prime \) refer to the angle between \((\mathbf{a},\mathbf{b})\) and, respectively, \((\mathbf{a}^\prime , \mathbf{b}^\prime )\), then
which shows that \(\cos \theta ^\prime = \cos \theta \). Since the norms and angles (between two vectors) are invariant under rotations, we say that \(R(\varphi )\) is unitary, defined by the property (3.129).
Generalized to complex valued matrices, we say that A is unitary if
by which A is norm and angle preserving following (3.126–3.130) with \(\dagger \) replacing T. In a unitary matrix, therefore, the columns and rows form orthonormal sets. This is evident by inspection of the rotation matrix \(R(\varphi )\): its row
and column vectors
satisfy
where \(\delta _{ij}\) denotes the Kronecker delta symbol (\(\delta _{ij}=1\) \((i=j)\), \(\delta _{ij}=0\) (\(i\ne j\))).
The eigenvalues of a unitary matrix (3.131) are on the unit circle, as follows from
where \(\mathbf{a}\) denotes an eigenvector of the eigenvalue \(\lambda \).
The \(n\times n\) unitary matrices are U(n). U(n) is a group in that (i) \(C=AB\) is in U(n) for any \(A,B\,\epsilon \, U(n)\), (ii) every \(A\,\epsilon \, U(n)\) has an inverse \(A^{-1}\,\epsilon \, U(n)\) and (hence) (iii) U(n) contains the identity matrix I. (Specifically, \(AI=IA=A\).) For any two matrices A and B, we have
that we state here without proof. (It may be seen from the fact that the determinant equals the product of eigenvalues.) According to (3.131), unitary matrices hereby satisfy
Elements of U(n) have a complete set of orthonormal eigenvectors with eigenvalues on the unit circle (Fig. 3.8). The special unitary group \(SU(n)\,\subset \, U(n)\) consists of elements with unit determinant,
exemplified by the rotation matrices \(R(\varphi )\) in (3.129).
In contrast, Hermitian matrices have a complete set of orthonormal eigenvectors with eigenvalues on the real axis. A matrix can be both unitary and Hermitian only if its eigenvalues are \(\pm 1\). Examples are \(n\times n\) Householder matrices representing reflections across the plane normal to \(\mathbf{u}\) in \(\mathbb R^n\),
8 Hermitian Structure of Minkowski Spacetime
A Hermitian \(n\times n\) matrix A on \(\mathbb {C}^n\) introduces a metric structure through an inner product defined by its real eigenvalues \(\lambda _i\),
If all eigenvalues are positive, this metric structure introduces a norm equivalent to the Euclidean norm on \(\mathbb {R}^{2n}\),
The Lorentz metric of Sect. 1.5 is an example of a real-symmetric matrix on \(\mathbb {R}^4\) with signature \((1,-1,-1,-1)\), referring to one positive and three negative eigenvalues. The metric structure it introduces follows (3.140) with A given by \(\eta _{ab}\), that is referred to as hyperbolic rather than Euclidean. We next strengthen this association to Hermitian matrices with some interesting consequences.
By dimension, we are at liberty to introduce complex combinations of the real 3+1 space-time components of a vector in terms of two complex-valued component vectors. Embedding the latter into a \(2\times 2\) Hermitian matrix, (a) the Lorentz metric obtains by the determinant of the matrix and (b) Lorentz transformations correspond to unitary transformations by unimodular matrices from SL(2,\(\mathbb {C}\)). Remarkably, the unimodular matrices giving a unitary transformation are effectively square roots of Lorentz transformations of four-vectors. For rotations on the unit sphere,Footnote 8 it gives a double cover of the rotations on the unit sphere \(S^2\).Footnote 9
The causal structure of Minkowski spacetime, defined by the Lorentz metric, is given geometrically by Lorentz invariant light cones. The generators of light cones are light rays. Light rays are integral curves of null-vectors with length zero. This refers to the fact that the change in total phase along a light ray is zero by definition—light rays define the propagation of wave fronts carrying constant total phase of electromagnetic radiation. They carry information on direction, but not distance. Projection of light rays onto the celestial sphere defines a one-to-one map of directions onto \(S^2\). Light rays hereby have two degrees of freedom,Footnote 10 and are either future- or past-oriented with opposite signs of their angular velocity in the propagation of an electromagnetic wave. Since the dimension of Minkowski space is four, this suggests a formulation in two null-vectors.
Expressed in terms of complex variables, a \(2\times 2\) formulation is realized by spinors of \(\left( \epsilon _{AB},\mathbb {C}^2\right) \), where \(\epsilon _{AB}\) refers to the metric spinor as follows.
Given a four-vector \(k^b=(k^t,k^x,k^y,k^z)\), consider the Hermitian matrix
expanded in terms of the Hermitian Pauli spin matricesFootnote 11
with eigenvalues \(\lambda =1\) for the identity and \(\lambda = \pm 1\) for each of \(\sigma _x\), \(\sigma _y\) and \(\sigma _z\). The Pauli spin matrices embed the basis vectors of Minkowski space,
Notice that the \(\sigma _i\) \((i=x,y,z)\) are trace-free. From the determinant of Z,
the length of \(k^b\) in Minkowski space satisfies
incorporating the line-element \(s^2 = \eta _{ab}k^ak^b\) with Minkowski metric
Here, we use the Einstein summation convention of summing over all index values \(a=t,x,y,z\) in combinations of covariant and contravariant indices. In Exercise 1.11, we noticed that \(\eta _{ab}\) reduced to 1+1 dimensions is invariant under Lorentz boosts. It is not difficult to ascertain that \(\eta _{ab}\) is invariant under general Lorentz transformations including rotations. As such, \(\eta _{ab}\) is a Lorentz invariant tensor.
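The embedding (3.142) and the determinant identity (3.146) are easily verified numerically. The following sketch in Python assumes the standard convention \(Z = k^t\sigma _t + k^x\sigma _x + k^y\sigma _y + k^z\sigma _z\) and checks that \(\text{ det }\,Z\) reproduces the Minkowski length \(\eta _{ab}k^ak^b\):

```python
import numpy as np

# Pauli basis of the 2x2 Hermitian matrices (sigma_t is the identity)
sigma_t = np.eye(2, dtype=complex)
sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
sigma_y = np.array([[0, -1j], [1j, 0]], dtype=complex)
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)

def embed(k):
    """Embed a four-vector k = (kt, kx, ky, kz) into a Hermitian 2x2 matrix Z."""
    kt, kx, ky, kz = k
    return kt * sigma_t + kx * sigma_x + ky * sigma_y + kz * sigma_z

k = np.array([2.0, 0.3, -0.4, 1.0])   # an arbitrary four-vector
Z = embed(k)

# det Z reproduces the Minkowski length k_b k^b = kt^2 - kx^2 - ky^2 - kz^2
minkowski = k[0]**2 - k[1]**2 - k[2]**2 - k[3]**2
assert np.isclose(np.linalg.det(Z).real, minkowski)
```

The identity holds for any choice of \(k^b\), since \(Z\) has the explicit form with diagonal \(k^t\pm k^z\) and off-diagonal \(k^x\mp ik^y\).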
Consider an element \(L\in \) SL(2,\(\mathbb {C}\)), mentioned above. These unimodular elements have \(8-2=6\) degrees of freedom: four complex entries subject to the complex condition \(\text{ det }\,L=1\). Then
preserves (3.146), since
Notice that (3.149) holds true also for \(L=-I\), showing that the sign of L is not determined by a given Lorentz transformation of \(k^b\). Even so, a given L from SL(2,\(\mathbb {C}\)) in (3.148) defines a Lorentz transformation of \(k^b\).
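Both observations admit a quick numerical check: any \(L\) with \(\text{ det }\,L=1\) preserves \(\text{ det }\,Z\), and \(-L\) implements the same transformation. The matrix below is an arbitrary illustration, not a specific boost or rotation from the text:

```python
import numpy as np
rng = np.random.default_rng(0)

# a random unimodular matrix: rescale a generic complex matrix by a root of its determinant
M = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
L = M / np.sqrt(np.linalg.det(M))        # now det L = 1, so L is in SL(2,C)
assert np.isclose(np.linalg.det(L), 1.0)

Z = np.array([[3.0, 1 - 2j], [1 + 2j, 0.5]])   # a Hermitian embedding of some k^b
Zp = L @ Z @ L.conj().T

# the Lorentz metric (det Z) is preserved, and -L gives the same transformation
assert np.isclose(np.linalg.det(Zp), np.linalg.det(Z))
assert np.allclose((-L) @ Z @ (-L).conj().T, Zp)
```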
Example 3.6. Consider a Lorentz boost with rapidity \(\mu \) of \(k^b=(1,0,0,0)^T\) to \((\cosh \mu ,0,0,\sinh \mu )^T\) along the z-axis,
It obtains by a boost
whereas a rotation of \(k^b=(0,1,0,0)\) to \((0,\cos \theta ,\sin \theta ,0)\) about the z-axis,
obtains by a rotation
Viewed by continuation starting from the identity matrix, (3.153) goes to the heart of the spinors to be introduced below: a rotation over \(2\pi \) in physical space gives rise to a change in sign in \(L(\theta )\). A continuing rotation over \(4\pi \) restores the original sign. The \(L(\theta )\) in (3.153) are elements of SU(2), since \(L^\dagger L=I\). Accordingly, SU(2) is a two-fold cover of SO(3). Light cones are described by null-rays \(k^b\), satisfying
Their embedding (3.142) is therefore in rank-one matricesFootnote 12 of the form
whose determinant is identically zero. Here, the right-hand side expresses the spinor and its Hermitian transpose, the latter indicated by a primed index
following the convention of using unprimed and primed indices for row and, respectively, column vector notation. Accordingly, we write
where \(\bar{\kappa }^{A'}\) is the Hermitian transpose of \(\kappa ^A\).
Let \(\kappa \) denote the row vector \(\kappa ^A\) in (3.156). Then (3.148) implies a corresponding Lorentz transformation of a spinor \(\kappa \),
Rotation over \(2\pi \) in real space now has a corresponding sign change in the spinor.
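The sign change under a \(2\pi \) rotation is easily exhibited numerically. As a sketch, take the rotation about the z-axis in the diagonal form \(L(\theta )=\text{ diag }(e^{-i\theta /2},e^{i\theta /2})\); the sign convention may differ from (3.153) by the orientation of the rotation, but the double-cover property is the same:

```python
import numpy as np

def L_rot(theta):
    # rotation about the z-axis in SU(2); the half-angle is the hallmark of a double cover
    return np.diag([np.exp(-1j * theta / 2), np.exp(1j * theta / 2)])

I = np.eye(2)
assert np.allclose(L_rot(2 * np.pi), -I)   # a 2*pi rotation flips the sign
assert np.allclose(L_rot(4 * np.pi), I)    # a 4*pi rotation restores it

# both signs implement the same rotation of an embedded four-vector Z
Z = np.array([[1.0, 0.5 - 0.2j], [0.5 + 0.2j, 2.0]])
theta = 0.7
Lp = L_rot(theta)
assert np.allclose(Lp @ Z @ Lp.conj().T, (-Lp) @ Z @ (-Lp).conj().T)
```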
Now write the determinant of \(K=K^{AA'}\) in (3.142) as
in terms of the anti-symmetric metric spinor \(\epsilon _{AB}=-\epsilon _{BA}\), \(\epsilon _{A'B'}=-\epsilon _{B'A'}\) with \(\epsilon _{01}=\epsilon _{0'1'}=1\), i.e.,
Then
with implicit reference to the basis elements (3.170) to be discussed below. In (3.159), note that the incomplete contraction \(\epsilon _{A'B'}K^{AA'}K^{BB'}\) is an antisymmetric tensor in our two-dimensional spinor space \(\mathbb {C}^2\). Since the antisymmetric \(2\times 2\) complex matrices are spanned by \(\epsilon _{AB}\),
taking into account (3.159) and \(\epsilon _{AB}\epsilon ^{AB}=2\).
The metric spinor allows lowering and raising indices
Lowering and raising is by multiplication from the left and, respectively, right. The same rules apply to \(A'\). Since the metric spinor is skew symmetric, we automatically have that spinors are null,
In practical terms, the spinor \(\kappa ^A\) is a square root of a null-vector \(k^b\) in \(Z^{AA'}\).
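This square-root property can be checked directly: the rank-one matrix built from an arbitrary spinor has vanishing determinant, and the four-vector recovered from it by tracing against the Pauli basis is null. (Normalization conventions vary; a factor \(1/\sqrt{2}\) is sometimes absorbed in the embedding.)

```python
import numpy as np

sigma = [np.eye(2, dtype=complex),
         np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]

kappa = np.array([1.0 + 0.5j, 0.3 - 2.0j])   # an arbitrary spinor
K = np.outer(kappa, kappa.conj())            # rank-one Hermitian matrix

# det K = 0 identically, so the four-vector hidden in K is null
assert np.isclose(np.linalg.det(K), 0.0)

# recover k^a from K by tracing against the Pauli basis: k^a = Tr(sigma_a K) / 2
k = np.array([0.5 * np.trace(s @ K).real for s in sigma])
assert np.isclose(k[0]**2 - k[1]**2 - k[2]**2 - k[3]**2, 0.0)
```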
Consider two null-vectors \(k^b\) and \(l^b\) represented by spinors \(o^A\) and \(\iota ^A\). Then
whereby \(k^cl_c\ge 0\), i.e., \(k^b\) and \(l^b\) share the same direction in time, e.g., are future-oriented. Choosing two distinct null-vectors, we may insist
As members of \(\mathbb {C}^2\), choosing such a pair as a basis gives
To be explicit, consider
Then \(\iota _A=\epsilon _{AB}\iota ^B=\left( \begin{array}{cc} 1&0\end{array}\right) \), whereby (3.166) is satisfied, and
The result identifies the Pauli spin-matrices with metric spin-tensors
In the notation of linear algebra, note that \(\bar{o}^{A'} = (o^A)^\dagger \), etc. This recovers the Pauli matrices in (3.142) as a basis of four-vectors in 3+1 Minkowski space. Note that (3.170) also introduces an algebraic map of a complex second-rank spinor, e.g., \(\phi _{AA'}\), to four-vectors with possibly complex-valued components.
9 Eigenvectors of Hermitian Matrices
For \(n\times n\) Hermitian matrix A, \(A^\dagger = A\), let \((\lambda ,\mathbf{a})\) denote one of its eigenvalue-eigenvector pairs. The latter always exists by virtue of eigenvalue solutions to (3.91). Let
denote the normalized eigenvector, satisfying \(\hat{\mathbf{a}}^\dagger \hat{\mathbf{a}}=1\). For instance, we have
Let \(\mathbf{u}\) be any vector. We may decompose it orthogonally as
Here, \(\mathbf{u}_{||}\) and \(\mathbf{u}_\perp \) are parallel and orthogonal to \(\mathbf{a}\), obtained from the projection operator
Geometrically, the image space of P consists of all vectors orthogonal to \(\mathbf{u}_{||}\),
We also note that \(\mathbf{u}_\perp \) is in the plane with normal \(\mathbf{a}\). The expansion (3.173) hereby satisfies
and hence
Since \(\mathbf{u}_{||}=(I-P)\mathbf{u}\) is parallel to \(\mathbf{a}\), it is an eigenvector of A with eigenvalue \(\lambda \). Since \(\mathbf{u}\) in (3.177) is arbitrary, it follows that
Since A is Hermitian with \(\lambda \) real, and \(P^\dagger =P\), we have
We thus find that A and P commute, i.e.,
It follows in particular that
Since I and A commute trivially, \((I-P)\) and A also commute, \([(I-P),A]=0\), and the Hermitian matrix A operates completely independently on the one-dimensional subspace of vectors along an eigenvector \(\mathbf{a}\) and on the subspace of vectors in the \((n-1)\)-dimensional hyperplane normal to \(\mathbf{a}\). Equation (3.180) also shows that PA is Hermitian:
Therefore, we can repeat all the steps (3.171–3.180) for an eigenvector \(\mathbf{a}^\prime \) of \(A_1=AP\). Since \(AP\mathbf{a}^\prime = PA \mathbf{a}^\prime \), this eigenvector is in the image space of P, and hence it is orthogonal to \(\mathbf{a}\). By this orthogonality, \(P^\prime \) associated with \(\mathbf{a}^\prime \) commutes with P. It follows that \(A_1P^\prime =APP^\prime \) is Hermitian. Continuing in this fashion, we ultimately arrive at n mutually orthogonal eigenvectors \(\mathbf{a}\), \(\mathbf{a}^\prime , \cdots , \mathbf{a}^{\prime \prime \cdots \prime }.\)
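The deflation argument above can be retraced numerically: P commutes with A, PA is Hermitian, and an eigenvector of \(A_1=AP\) with nonzero eigenvalue is orthogonal to \(\mathbf{a}\). A sketch for a \(2\times 2\) Hermitian matrix (the example matrix is arbitrary):

```python
import numpy as np

A = np.array([[2.0, 1.0 - 1.0j], [1.0 + 1.0j, 3.0]])   # Hermitian, eigenvalues 1 and 4
w, V = np.linalg.eigh(A)
a = V[:, 0]                                  # one normalized eigenvector

P = np.eye(2) - np.outer(a, a.conj())        # projector onto the plane normal to a

# P and A commute, so A acts independently along a and in the normal plane
assert np.allclose(P @ A, A @ P)
assert np.allclose(P @ A, (P @ A).conj().T)  # PA is again Hermitian

# an eigenvector of A1 = AP with nonzero eigenvalue lies in Im P, orthogonal to a
w1, V1 = np.linalg.eigh(A @ P)
b = V1[:, np.argmax(np.abs(w1))]
assert np.isclose(np.vdot(a, b), 0.0)
```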
Example 3.7. The \(2\times 2\) Hermitian matrix
has eigenvalue-eigenvector pairs \(\left\{ \left( \lambda _1, \mathbf{a}_1\right) ,~\left( \lambda _2,\mathbf{a}_2\right) \right\} \) with
We wish to view the operation of A on a vector \(\mathbf{u}\) as the sum of linearly independent operations associated with the directions \(\mathbf{a}_1\) and \(\mathbf{a}_2\). We first rewrite (3.185) in terms of the equivalent orthonormal pair
following (3.171). The \(\left\{ \hat{\mathbf{a}}_1, \hat{\mathbf{a}}_2\right\} \) form an orthonormal basis (a complete set of orthonormal vectors) for vectors \(\mathbf{u}\) in our two-dimensional space. They satisfy the property
Here, \(\delta _{ij}\) is the commonly used Kronecker delta symbol. For (3.183), we have
For an arbitrary vector, we can write
Multiplication by \(\hat{\mathbf{a}}^\dagger _{1,2}\) from the left obtains
Substitution of (3.190) into (3.188) gives the explicit expression
This represents the Gram-Schmidt orthogonal decomposition of \(\mathbf{u}\) with respect to the eigenvectors of A. Accordingly, \(A\mathbf{u}\) satisfies
Since \(\mathbf{u}\) is arbitrary, we conclude
The same follows from \(A=AI\), \(I= \hat{\mathbf{a}}_1\hat{\mathbf{a}}^\dagger _1 + \hat{\mathbf{a}}_2\hat{\mathbf{a}}^\dagger _2.\)
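The completeness relation \(I= \hat{\mathbf{a}}_1\hat{\mathbf{a}}^\dagger _1 + \hat{\mathbf{a}}_2\hat{\mathbf{a}}^\dagger _2\) and the resulting spectral decomposition (3.194) can be verified for any Hermitian matrix; the matrix below is an arbitrary stand-in for (3.183):

```python
import numpy as np

A = np.array([[1.0, 2.0], [2.0, 1.0]])       # a real-symmetric example
w, V = np.linalg.eigh(A)                     # columns of V are orthonormal eigenvectors

# completeness: I = sum_i a_i a_i^dagger, and A = sum_i lambda_i a_i a_i^dagger
I_rec = sum(np.outer(V[:, i], V[:, i].conj()) for i in range(2))
A_rec = sum(w[i] * np.outer(V[:, i], V[:, i].conj()) for i in range(2))
assert np.allclose(I_rec, np.eye(2))
assert np.allclose(A_rec, A)

# the expansion of an arbitrary u in the eigenbasis
u = np.array([0.7, -1.3])
coeffs = V.conj().T @ u
assert np.allclose(sum(coeffs[i] * V[:, i] for i in range(2)), u)
```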
Example 3.8. For \(\Lambda \) in (3.80) we have, according to (3.124, 3.125), normalized eigenvalue-eigenvector pairs given by
Following (3.194), we consider
Adding these expressions gives
i.e., we recover our definition of a Lorentz boost,
10 QR Factorization
In viewing an \(n\times m\) matrix A as a linear map from the vector space \(\mathbb C^m\) to \(\mathbb C^n\), we frequently encounter the question whether A maps \(\mathbb C^m\) onto all of \(\mathbb C^n\) or only onto a linear subspace of it. Similarly, A may map all nonzero vectors of \(\mathbb C^m\) to nonzero vectors, or map some linear subspace of \(\mathbb C^m\) to the origin \(\mathbf{0}\) in \(\mathbb C^n\).
To streamline this discussion, we introduce the image of A, defined by the linear vector space
and the kernel of A, also known as the null space of A, defined by the linear vector space
The image space is spanned by the column vectors \(\mathbf{a}_1\), \(\mathbf{a}_2\), \(\ldots ,\) \(\mathbf{a}_m\) of A, e.g., for a \(2\times 2\) matrix
The image space forms out of linear combinations
by choice of \(\mathbf{u}\) in \(\mathbb C^m\),
The row space is spanned by the row vectors \(\mathbf{b}_1\), \(\mathbf{b}_2\), \(\cdots ,\) \(\mathbf{b}_n\) of A, e.g., for a \(2\times 2\) matrix
that forms out of the linear combinations
by choice of \(\mathbf{v}\) in \(\mathbb C^n\),
Following (3.199), the kernel of A consists of the vectors that are orthogonal to all of its row vectors \(\mathbf{b}_i\) \((i=1,2,\ldots ,n)\), commonly written as
where \(\perp \) refers to orthogonality with respect to the inner product \(\mathbf{a}\cdot \mathbf{b}=\mathbf{a}^\dagger \mathbf{b}\) for vectors in \(\mathbb C^m\).
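Numerically, orthonormal bases for \(\text{ Im }\,A\) and \(\text{ Ker }\,A\) are conveniently read off from the singular value decomposition — a tool independent of the Gram-Schmidt construction used below, shown here only as a sketch and a check. The singular example matrix is chosen to match the pattern of the examples that follow:

```python
import numpy as np

def image_and_kernel(A, tol=1e-12):
    """Orthonormal bases for Im A (from U) and Ker A (from V), via the SVD."""
    U, s, Vh = np.linalg.svd(A)
    r = int(np.sum(s > tol))                  # numerical rank
    return U[:, :r], Vh[r:].conj().T          # columns span Im A and Ker A

A_full = np.array([[1.0, 2.0], [2.0, 1.0]])   # nonsingular: det = -3
A_sing = np.array([[1.0, 2.0], [2.0, 4.0]])   # singular: second column = 2 x first

im1, ker1 = image_and_kernel(A_full)
im2, ker2 = image_and_kernel(A_sing)
assert im1.shape[1] == 2 and ker1.shape[1] == 0   # full rank: Im = C^2, Ker = {0}
assert im2.shape[1] == 1 and ker2.shape[1] == 1   # rank 1: dim Im + dim Ker = 2
assert np.allclose(A_sing @ ker2, 0.0)            # kernel vectors map to the origin
```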
10.1 Examples of Image and Null Space
Let A be a nonsingular \(2\times 2\) matrix. We read off the column vectors following (3.200), e.g.,
Hence, \(\text{ Im }\,A\) is defined by all vectors obtained from linear combinations of the \(\mathbf{a}_1\) and \(\mathbf{a}_2\) in (3.207). We say
Evidently, we have
(or \(\mathbb {C}^2\)) since the \(\mathbf{a}_1\) and \(\mathbf{a}_2\) in (3.207) point in different directions, whereby they are linearly independent. This may also be inferred from the fact that \(\text{ det }\,A\ne 0.\)
Alternatively, consider the singular matrix
In this event, the second column satisfies
and the two columns are linearly dependent, as follows also from the fact that
Consequently, the image space is the one-dimensional subspace given by
Proceeding with (3.207) above, we have, following (3.203), the row vectors
They happen to be the same as the column vectors since A is real-symmetric. \(\text{ Ker }\,A\) is defined by vectors that are orthogonal to both \(\mathbf{b}_1\) and \(\mathbf{b}_2\). Since \(\mathbf{b}_1\) and \(\mathbf{b}_2\) are linearly independent, we have
For the alternative (3.210), we have the row vectors
In this event, the second row satisfies \(\mathbf{b}_2 = 2\mathbf{b}_1\), whereby the rows are linearly dependent. Consequently, the null space of A is the one-dimensional subspace given by the vectors orthogonal to \(\mathbf{b}_1\), i.e.,
The matrices (3.207) and (3.210) satisfy, respectively,
10.2 Dimensions of Image and Null Space
In what follows, we restrict our discussion to square matrices of size \(n\times n\). In this event, the dimension \(\text{ dim }\,\text{ Im }\, A\) of (3.198) is n whenever A is of full rank, i.e., when \(\text{ det }\,A\ne 0\). Complementary to this, \(\text{ dim }\,\text{ Ker }\, A\) of (3.199) is 0 whenever A is of full rank. However, \(\text{ dim }\,\text{ Im }\, A<n\) and \(\text{ dim }\,\text{ Ker }\, A>0\) when \(\text{ det }\,A=0\).
The matrices (3.207) and (3.210) exemplify a general relationship of \(n\times n\) matrices, satisfying
To derive this relationship, we begin by observing the invariance
for \(A^\prime =\left( \mathbf{a}_1^\prime ~\mathbf{a}_2^\prime ~\ldots ~\mathbf{a}_n^\prime \right) \) obtained from \(A=\left( \mathbf{a}_1~\mathbf{a}_2~\ldots ~\mathbf{a}_n\right) \) by changing a column vector \(\mathbf{a}_j\) by linear superposition with any of the other column vectors. Specifically, this may be by choice of \(1\le j\le n\) and a linear superposition
This transformation has a corresponding upper triangular transformation matrix U such that \(A^\prime = AU\). For instance, when \(n=3\) and \(j=2,3\)
Since the columns of \(AU_2\) and \(AU_2U_3\) are superpositions of the column vectors of A, their image space remains \(\text{ Im }\,A\).
The Gram-Schmidt orthogonalization of \(\mathbf{a}_j\) from A to the mutually orthogonalized \(\mathbf{a}_i\) \((1\le i \le j-1)\) satisfies
Here, we omit projections \(\mu _i \mathbf{a}_i\) whenever \(\mathbf{a}_i=\mathbf{0}\). Performing (3.223) for each \(j=2,3,\ldots \) consecutively up to \(j=n\) produces \(A^{\prime \prime \cdots \prime }\) with column vectors that are all orthogonal. For a \(2\times 2\) matrix A, the Gram-Schmidt orthogonalization of its column vectors obtains in one step
If A has full rank, then so does \(A^\prime \). This may also be seen from the product rule
since \(\text{ det }\,U=1\). The determinant of \(A^\prime \) is nonzero iff the determinant of A is nonzero. For a \(3\times 3\) matrix, we apply (3.222). The product \(U=U_2U_3\) is upper triangular,
The above is readily extended to \(n\times n\) matrices
whose columns are mutually orthogonal, where U is upper triangular with unit determinant.
If \(\text{ det }~A=0\), some of the columns of \(A^\prime \) are zero, i.e.,
The null vectors of \(A^\prime \) are of the form \(\mathbf{u}^\prime =(0~\cdots ~1~\cdots 0)^T\), where 1 appears at a position j where \(\mathbf{a}_j=\mathbf{0}\). Since U is invertible, \(A=A^\prime U^{-1}\), and hence \(\mathbf{u} =U\mathbf{u}^\prime \) is a null vector of A. Our theorem (3.219) now readily follows: the number of nonzero columns of \(A^\prime \) defines the dimension of the image space of A, and the number of zero columns of \(A^\prime \) defines the dimension of the null space of A.
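The argument can be retraced in code: Gram-Schmidt column operations give \(A^\prime = AU\) with U unit upper triangular, and counting the zero columns of \(A^\prime \) yields \(\text{ dim }\,\text{ Ker }\, A\), confirming (3.219). The \(3\times 3\) example matrix below is illustrative only:

```python
import numpy as np

def gs_columns(A, tol=1e-12):
    """Gram-Schmidt on the columns of A: returns A' with mutually orthogonal
    (possibly zero) columns and unit upper triangular U with A' = A U."""
    n = A.shape[1]
    Ap = A.astype(float)
    U = np.eye(n)
    for j in range(1, n):
        for i in range(j):
            norm2 = np.dot(Ap[:, i], Ap[:, i])
            if norm2 > tol:                       # skip projections on zero columns
                mu = np.dot(Ap[:, i], Ap[:, j]) / norm2
                Ap[:, j] -= mu * Ap[:, i]         # same column operation on Ap and U
                U[:, j] -= mu * U[:, i]           # preserves the invariant Ap = A U
    return Ap, U

A = np.array([[1.0, 2.0, 3.0], [2.0, 4.0, 1.0], [1.0, 2.0, 0.0]])  # rank 2
Ap, U = gs_columns(A)
assert np.allclose(A @ U, Ap)
assert np.allclose(np.triu(U), U)                  # U is unit upper triangular
zero_cols = int(np.sum(np.linalg.norm(Ap, axis=0) < 1e-10))
assert zero_cols == 1                              # dim Ker A
assert 3 - zero_cols == np.linalg.matrix_rank(A)   # dim Im A + dim Ker A = n
```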
Example 3.9. Consider the non-singular matrix
The first and second steps in (3.223) produce, respectively,
It follows that
where the second matrix on the right hand side is \(U=U_2U_3\). Similarly, consider the singular matrix
The first and second step in (3.223) produce, respectively,
It follows that
where the second matrix on the right hand side is \(U=U_2U_3\).
Comparing (3.232–3.236), we see that A in (3.229) has full rank with the trivial null space \(\text{ Ker }\,A=\mathbf{0}\), whereas B in (3.233) is of rank 2 with the nontrivial null space given by the second column of \(U=U_2U_3\), i.e.,
10.3 QR Factorization by Gram-Schmidt
The above is more commonly used to derive the QR factorization of a matrix upon including normalization in each step of the Gram-Schmidt procedure,
if \(\mathbf{a}_j^\prime \ne \mathbf{0}\) (otherwise, we skip this step). The result \(A=QR\) has column vectors of Q forming an orthonormal basis for \(\text{ Im }\,A\) and R upper triangular. If A is square and invertible, then Q is unitary, \(Q^\dagger Q=I\), whereby \(R=Q^\dagger A\).
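A minimal sketch of QR factorization by classical Gram-Schmidt for a full-rank real square matrix follows; in practice one would call a library routine such as numpy.linalg.qr, which uses the more stable Householder reflections:

```python
import numpy as np

def qr_gram_schmidt(A):
    """Classical Gram-Schmidt QR of a full-rank square matrix (textbook sketch)."""
    n = A.shape[1]
    Q = np.zeros_like(A, dtype=float)
    R = np.zeros((n, n))
    for j in range(n):
        v = A[:, j].astype(float)
        for i in range(j):
            R[i, j] = np.dot(Q[:, i], A[:, j])   # projection coefficient
            v -= R[i, j] * Q[:, i]               # subtract projections on earlier columns
        R[j, j] = np.linalg.norm(v)              # normalization enters the diagonal of R
        Q[:, j] = v / R[j, j]
    return Q, R

A = np.array([[1.0, 2.0], [2.0, 1.0]])
Q, R = qr_gram_schmidt(A)
assert np.allclose(Q @ R, A)               # A = QR
assert np.allclose(Q.T @ Q, np.eye(2))     # Q unitary (here: real orthogonal)
assert np.allclose(R, np.triu(R))          # R upper triangular
```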
Example 3.10. Consider the QR factorization of a non-singular \(2\times 2\) matrix. Let \(A^\prime = AU_2\) be the outcome of the Gram-Schmidt procedure and let \(D_2\) denote the diagonal matrix containing the norms of its column vectors. Then \(A^\prime = QD_2\) defines
For a singular matrix, we similarly obtain
For the A and B in (3.229) and (3.233), the QR factorizations are
Table 3.1 summarizes this discussion.
11 Exercises
3.1. Let
Calculate
Compare your answers to (i) and explain.
3.2. Consider a Cartesian coordinate system (x, y, z) and rotation of a vector \(\mathbf{r}\) about the z-axis with angular velocity \(\boldsymbol{\omega } =\Omega \mathbf{i}_z\), where \(\mathbf{i}_z\) denotes the unit vector along the z-axis. The velocity of \(\mathbf{r}\) satisfies \(\mathbf{v}=\boldsymbol{\omega } \times \mathbf{r}\), where \(\times \) denotes the outer product.
(i) If \(\mathbf{r} = 2 \mathbf{i}_x + 3 \mathbf{i}_z\), calculate the velocity \(\mathbf{v}\).
(ii) Show that \(\left| \mathbf{v}\right| = \Omega \sigma \), where \(\sigma \) denotes the distance to the axis of rotation.
3.3. Derive the equivalent expression for the dip angle (3.245), given by
3.4. For each of the following transformations in the two-dimensional plane, state which are projections, reflections and rotations:
3.5. Show that complex numbers \(z=x+iy\) can be written in terms of the matrices
satisfying \(A(z)A(w)=A(zw)\) by the rules of matrix multiplication. In particular, show that for \(z=i\), (3.247) satisfies \(I + A^2=0\), where I denotes the identity matrix (\(z=1\)).
3.6. Consider the matrix
Compute the determinant, the eigenvalues and the eigenvectors.
3.7. Permutation of the x- and y-coordinates is described by
Show that w(z) is not analytic in z, i.e., the Cauchy-Riemann relations are not satisfied. Derive the equivalent \(2\times 2\) matrix equation for \(x^\prime = y\) and \(y^\prime = x\).
3.8. Consider the matrix
Obtain the eigenvalues \(\lambda _{i}\) and eigenvectors \(\mathbf{a}_{i}\) (\(i=1,2)\) and decompose A in the form
where hat refers to normalization to unit norm.
3.9. Show the orthogonality (3.119).
3.10. If A is both unitary and Hermitian, show that \(A=A^{-1}\).
3.11. Consider the Householder matrix (3.139). Show that H is Hermitian (\(H^\dagger = H)\) and unitary \((H^\dagger H=I)\), whence it is involutory (H is its own inverse): \(H^2=I\). In two dimensions, determine its eigenvalue-eigenvector pairs for a general direction \(\mathbf{u}\) and interpret the result geometrically. What happens in three dimensions to the multiplicity of the eigenvalues?
3.12. Let A be Hermitian, i.e., \(A^\dagger = A\). Show that A is diagonalizable according to
where U is the unitary matrix satisfying \(U^\dagger U = I\). Compute U for A in (3.250). [Hint: Compose U from the eigenvectors of A.]
3.13. Consider the matrix
Compute the determinant and determine the condition number of A, defined by the ratio of the maximal to the minimal square root of the eigenvalues of \(A^\dagger A\). [Hint: use (3.252).] What happens when a approaches zero? Compute the solution to the system of equations
Show that the solution is regular, respectively, ill-behaved as a approaches zero when
What is the condition number of \(A^2\)? How does it generalize to \(A^n\) \((n\ge 3)\)?
3.14. Show that U(1) can be identified with the tangents of complex numbers on the unit circle \(S^1\). [Hint: Express elements of \(S^1\) by \(e^{i\theta }\) and generalize the Taylor expansion in \(\theta \),
about the identity \(\theta =0\) to arbitrary \(\theta \).]Footnote 13
3.15. Show that U(1) is abelian:
3.16. Show that the elements of U(2) are of the form
and that they are in general not Hermitian.
3.17. Following (3.258), specialize to \(\text{ det }\,A=1\).Footnote 14 Give a general representation of SU(2) in terms of traceless \(2\times 2\) matrices (the sum of diagonal elements being zero). Determine the number of degrees of freedom in view of the conditions \(A^\dagger A=I\) and det A = 1. Show that these traceless matrices are Hermitian and derive their eigenvalues.
3.18. Illustrate a double cover of \(S^1\) by way of a curve on a two-torus.
3.19. From the definition of the inner product of two arbitrary spinors \(o^A\) and \(\iota ^A\) and the definition of lowering indices, show that
3.20. In (3.165), obtain \(\iota ^A\) from a rotation of \(o^A\) and evaluate their inner product as a function of the rotation angle. Next, consider a spinor basis
Show that \(o^A\iota _A=1\) in (3.166). Rank-one matrices of the form \(o^A\bar{\iota }^{A'}\) may be expanded asFootnote 15
where T denotes the ordinary matrix transpose. Express the Pauli spin matrices in this new basis similar to (3.170).
3.21. Occasionally, we allow coordinates to become complex. For reference, recall the line-element
of the Euclidean plane, expressed in Cartesian and, respectively, polar coordinates. The Euclidean plane is flat, like an ordinary sheet of paper. ConsiderFootnote 16
Show that (3.263) again is the line-element of a flat two-surface using analytic continuation in t.Footnote 17
3.22. Consider the matrix
(i) Obtain the image space and the null space of the matrix for all a. [Hint: Distinguish between \(a=0\) and \(a\ne 0\).];
(ii) Apply Gram-Schmidt orthogonalization to obtain \(A^\prime = AU\), where the column vectors of \(A^\prime \) are orthogonal and U is upper triangular;
(iii) Obtain the QR factorization of A.
3.23. Following (3.239), obtain the QR factorizations of the \(2\times 2\) rotation matrix \(R(\varphi )\) and the Lorentz boost \(\Lambda (\mu )\) of Example 3.8.
3.24. Obtain the QR factorizations of general \(2\times 2\) matrices that are (a) Hermitian or (b) unitary.
3.25. Let \(i=0,1,2,3\) correspond to (t, x, y, z). For the Pauli spin matrices (3.143), show or calculate
(i) The \(\sigma _i\) are involutory: \(\sigma _1^2=\sigma _2^2=\sigma _3^2=-i\sigma _1\sigma _2\sigma _3 = I\); det \(\sigma _i=-1\) for \(i=1,2,3\) and they are trace-free, Tr\((\sigma _i)=0\).
(ii) The \(\sigma _i\) satisfy \(\sigma _i\sigma _j+\sigma _j\sigma _i = 2 \delta _{ij}I\) \((i,j=1,2,3)\).
(iii) The \(\sigma _a\) \((a=0,1,2,3)\) form a basis of the \(2\times 2\) Hermitian matrices.
(iv) The eigenvalues and eigenvectors of the \(\sigma _i\) \((i=1,2,3)\).
(v) The commutator \([\sigma _i,\sigma _j]\) for all \(i,j=1,2,3\).
Notes
- 1.
Formally \(I_{nn}\), since I is generally a two-index tensor.
- 2.
Ernst Mach (1838–1916).
- 3.
- 4.
- 5.
The complete set of frame dragging induced interactions is described by the Riemann tensor.
- 6.
The product of the eigenvalues equals the determinant of the matrix, as follows from, e.g., the Jordan decomposition theorem. The same theorem shows that the trace of a matrix, given by the sum of the elements on the principal diagonal, equals the sum of the eigenvalues.
- 7.
Also referred to as the Hermitian transpose or the conjugate transpose.
- 8.
The celestial sphere in the language of cosmology.
- 9.
Commonly referred to as SO(3), described by rotation matrices with determinant +1.
- 10.
Photons carry an additional degree of freedom in polarization.
- 11.
Wolfgang Pauli 1900–1958.
- 12.
The image space is comprised of multiples of one vector \(\left( \begin{array}{cc} \xi&\eta \end{array}\right) ^\dagger \).
- 13.
\(S^1\) is illustrative of a one-dimensional manifold which is compact but not simply connected. It has nontrivial topology, since the winding number of a loop in \(S^1\) can take any value in \(\mathbb Z\). By homotopy, the topology of \(S^1\) is the same as that of the punctured disk \(0<|z|\le 1\).
- 14.
As a result, we say \(SU(2)\,\subset \, U(2)\cong SU(2)\,\times \,U(1)\).
- 15.
The symbols \(o^A\bar{\iota }^{A'}\) and \(\bar{\iota }^{A'}o^A\) are the same, i.e., there is no ordering between unprimed and primed indices. Only upon expansion into a matrix, a choice of ordering is made.
- 16.
A so-called Rindler space.
- 17.
It can be shown that flatness is preserved under analytic continuation, whereby the Lorentz metric \(ds^2=-dt^2+dx^2=(idt)^2+dx^2\) is trivially flat.
References
Everitt, C.W.F., et al, 2011, Phys. Rev. Lett., 106, 221101, http://www.einstein.stanford.edu.
Ciufolini, I., & Pavlis, E.C., 2004, Nature, 431, 958.
Kerr, R.P., 1963, Phys. Rev. Lett., 11, 237.
van Putten, M.H.P.M., 2005, Nuov. Cim. B, 28, 597; van Putten, M.H.P.M., & Gupta, A.C., 2009, Mon. Not. R. Astron. Soc., 394, 2238.
Papapetrou A., 1951, Proc. R. Soc., 209, 248.
Feynman, R.P., 1963, Lectures on Physics, Vol. I (Addison-Wesley Publishing Co.), Ch. 20.
van Putten, M.H.P.M., 2017, NewA, 54, 115, arXiv:1609.07474.
© 2017 Springer Nature Singapore Pte Ltd.
van Putten, M.H. (2017). Vectors and Linear Algebra. In: Introduction to Methods of Approximation in Physics and Astronomy. Undergraduate Lecture Notes in Physics. Springer, Singapore. https://doi.org/10.1007/978-981-10-2932-5_3
Print ISBN: 978-981-10-2931-8
Online ISBN: 978-981-10-2932-5