Abstract
In the previous chapters, we studied non-gravitational phenomena in inertial reference frames, and often we limited our discussion to Cartesian coordinate systems. Now we want to include gravity, non-inertial reference frames, and general coordinate systems. The aim of this chapter is to introduce some mathematical tools necessary to achieve this goal. We follow quite a heuristic approach. The term Riemannian geometry is used when we deal with a differentiable manifold equipped with a metric tensor.
In the previous chapters, we studied non-gravitational phenomena in inertial reference frames, and often we limited our discussion to Cartesian coordinate systems. Now we want to include gravity, non-inertial reference frames, and general coordinate systems. The aim of this chapter is to introduce some mathematical tools necessary to achieve this goal. We will follow quite a heuristic approach. The term Riemannian geometry is used when we deal with a differentiable manifold equipped with a metric tensor (see Appendix C for the definition of the concept of differentiable manifold).
5.1 Motivations
As we will see in more detail in the next chapter, gravity has quite a special property: for the same initial conditions, any test-particle (see Note 1) in an external gravitational field follows the same trajectory, regardless of its internal structure and composition. To be more explicit, we can consider the Newtonian case. Newton’s Second Law reads \(m_i \ddot{\mathbf{x}} = \mathbf{F}\), where \(m_i\) is the inertial mass of the particle. If \(\mathbf{F}\) is the gravitational force on our particle generated by a point-like body with mass M located at the origin, then, writing \(r = |\mathbf{x}|\) and \(\hat{\mathbf{r}} = \mathbf{x}/r\), we have
\[ \mathbf{F} = - G_N \frac{M m_g}{r^2} \hat{\mathbf{r}} , \tag{5.1} \]
where \(G_N\) is Newton’s gravitational constant and \(m_g\) is the gravitational mass of the particle (and we are assuming that \(m_g \ll M\)). In principle, \(m_i\) and \(m_g\) may be different, because the former has nothing to do with the gravitational force (it is well defined even in the absence of gravity!) and the latter is the “gravitational charge” of the particle. For instance, in the case of an electrostatic field, the force is given by the Coulomb force, which is proportional to the product of the electric charges of the two objects. The electric charge is completely independent of the inertial mass of a body. On the contrary, the ratio between the inertial and the gravitational masses, \(m_i/m_g\), is a constant independent of the particle. This is an experimental result! We can thus choose units in which \(m_i = m_g = m\), where m is just the mass of the particle. At this point, Newton’s Second Law reads
\[ \ddot{\mathbf{x}} = - G_N \frac{M}{r^2} \hat{\mathbf{r}} , \tag{5.2} \]
and the solution is independent of m and the internal structure and composition of the particle: any test-particle follows the same trajectory for the same initial conditions.
The trajectory of a particle can be obtained by minimizing the path length between two events of the spacetime. We can thus think of writing an effective metric such that the equations of motion of the particle take into account the effect of the gravitational field. The example below can better illustrate this point.
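In Newtonian mechanics, the Lagrangian of a particle in a gravitational field is \(L = T - V\), where T is the kinetic energy of the particle, \(V = m \varPhi \) is the gravitational potential energy, and \(\varPhi \) is the gravitational potential; see Sect. 1.8. As seen in Chap. 3, in special relativity, for small velocities T is replaced by Eq. (3.11). The Lagrangian of a non-relativistic particle in a Newtonian gravitational field is thus
\[ L = - m c^2 + \frac{1}{2} m \mathbf{v}^2 - m \varPhi . \tag{5.3} \]
Since
\[ \sqrt{1 + \frac{2 \varPhi}{c^2} - \frac{\mathbf{v}^2}{c^2}} = 1 + \frac{\varPhi}{c^2} - \frac{1}{2} \frac{\mathbf{v}^2}{c^2} + O \left( c^{-4} \right) , \tag{5.4} \]
we can rewrite Eq. (5.3), up to terms of order \(c^{-2}\), as
\[ L = - m c^2 \sqrt{1 + \frac{2 \varPhi}{c^2} - \frac{\mathbf{v}^2}{c^2}} , \tag{5.5} \]
and the corresponding action as
\[ S = - m c \int \sqrt{\left( 1 + \frac{2 \varPhi}{c^2} \right) c^2 - \mathbf{v}^2} \, dt = - m c \int \sqrt{- g_{\mu \nu } \dot{x}^\mu \dot{x}^\nu } \, dt , \tag{5.6} \]
where \(x^0 = ct\), the dot denotes the derivative with respect to t, and we have introduced the metric tensor \(g_{\mu \nu }\) defined as
\[ || g_{\mu \nu } || = \mathrm{diag} \left( - \left( 1 + \frac{2 \varPhi}{c^2} \right) , 1 , 1 , 1 \right) . \tag{5.7} \]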
If we apply the Least Action Principle to the action in Eq. (5.6), we obtain the geodesic equations for the metric \(g_{\mu \nu }\). They are equivalent to the Euler–Lagrange equations for the Lagrangian in (5.3) by construction. So we can describe the gravitational field as a geometrical property of the spacetime.
With this simple example, we see how we can “absorb” the gravitational field into the metric tensor \(g_{\mu \nu }\). The particle trajectories provided by the geodesic equations for the metric \(g_{\mu \nu }\) are not straight lines, because \(g_{\mu \nu }\) takes gravity into account. Note that \(g_{\mu \nu }\) cannot be reduced to the Minkowski metric in the whole spacetime with a coordinate transformation and we say that the spacetime is curved. On the contrary, if we can recover the Minkowski metric \(\eta _{\mu \nu }\) in the whole spacetime with a coordinate transformation, the spacetime is flat. In this second case, the reference frame in which the metric is not \(\eta _{\mu \nu }\) either employs non-Cartesian coordinates or is a non-inertial reference frame (or both).
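As a quick numerical illustration of this point, the sketch below compares the two forms of the Lagrangian for a slow particle in a weak field. It assumes that the non-relativistic Lagrangian is \(-mc^2 + \frac{1}{2} m v^2 - m \varPhi \) and that the effective metric is \(\mathrm{diag}(-(1 + 2\varPhi /c^2), 1, 1, 1)\) with \(x^0 = ct\); the function names and the sample values of m, v, and \(\varPhi \) are illustrative choices, not taken from the text.

```python
import math

def newtonian_lagrangian(m, v, phi, c=1.0):
    """Non-relativistic Lagrangian including the rest-energy term."""
    return -m * c**2 + 0.5 * m * v**2 - m * phi

def metric_lagrangian(m, v, phi, c=1.0):
    """-m c sqrt(-g_{mu nu} xdot^mu xdot^nu) for the assumed effective metric."""
    g_diag = [-(1.0 + 2.0 * phi / c**2), 1.0, 1.0, 1.0]
    xdot = [c, v, 0.0, 0.0]  # dx^mu/dt with x^0 = c t; motion along the x axis
    interval = -sum(g * xd * xd for g, xd in zip(g_diag, xdot))
    return -m * c * math.sqrt(interval)

# Units with c = 1: slow particle (v << c) in a weak field (|phi| << c^2).
m, v, phi = 1.0, 1.0e-3, -1.0e-6
difference = metric_lagrangian(m, v, phi) - newtonian_lagrangian(m, v, phi)
print(difference)  # tiny compared with the kinetic and potential terms (~1e-6)
```

The residual difference is of higher order in \(v^2/c^2\) and \(\varPhi /c^2\), which is why the two Lagrangians lead to the same equations of motion in the non-relativistic limit.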
5.2 Covariant Derivative
The partial derivative of a scalar is a dual vector and it is easy to see that it transforms as a dual vector under a coordinate transformation
\[ \frac{\partial \phi }{\partial x^\mu } \rightarrow \frac{\partial \phi }{\partial x'^\mu } = \frac{\partial x^\nu }{\partial x'^\mu } \frac{\partial \phi }{\partial x^\nu } . \tag{5.8} \]
The partial derivative of the components of a vector field is not a tensor field. Let \(V^\mu \) be a vector and \(x^\mu \rightarrow x'^\mu \) a coordinate transformation. We have
\[ \frac{\partial V^\mu }{\partial x^\nu } \rightarrow \frac{\partial V'^\mu }{\partial x'^\nu } = \frac{\partial x^\rho }{\partial x'^\nu } \frac{\partial }{\partial x^\rho } \left( \frac{\partial x'^\mu }{\partial x^\sigma } V^\sigma \right) = \frac{\partial x^\rho }{\partial x'^\nu } \frac{\partial x'^\mu }{\partial x^\sigma } \frac{\partial V^\sigma }{\partial x^\rho } + \frac{\partial x^\rho }{\partial x'^\nu } \frac{\partial ^2 x'^\mu }{\partial x^\rho \partial x^\sigma } V^\sigma . \tag{5.9} \]
If the relation between the two coordinate systems is not linear, we have also the second term on the right hand side and we see that \(\partial V^\mu /\partial x^\nu \) cannot be a vector. The reason is that \(dV^\mu \) is the difference between two vectors at different points. With the terminology of Appendix C, vectors at different points belong to different tangent spaces. \(dV^\mu \) transforms as
\[ dV^\mu \rightarrow dV'^\mu = V'^\mu (x+dx) - V'^\mu (x) = \left( \frac{\partial x'^\mu }{\partial x^\alpha } \right) _{x+dx} V^\alpha (x+dx) - \left( \frac{\partial x'^\mu }{\partial x^\alpha } \right) _{x} V^\alpha (x) . \tag{5.10} \]
If \(\partial x'^\mu /\partial x^\alpha \) in front of \(V^\alpha (x+dx)\) were the same as that in front of \(V^\alpha (x)\), then we would have
\[ dV^\mu \rightarrow dV'^\mu = \frac{\partial x'^\mu }{\partial x^\alpha } \left[ V^\alpha (x+dx) - V^\alpha (x) \right] = \frac{\partial x'^\mu }{\partial x^\alpha } dV^\alpha . \tag{5.11} \]
However, in general this is not the case: \(\partial x'^\mu /\partial x^\alpha \) in front of \(V^\alpha (x+dx)\) is evaluated at \(x+dx\), that in front of \(V^\alpha (x)\) is evaluated at x. In this section we want to introduce the concept of covariant derivative, which is the natural generalization of partial derivative in the case of arbitrary coordinates.
5.2.1 Definition
We know that \(dx^\mu \) is a 4-vector and that the 4-velocity of a particle, \(u^\mu = dx^\mu /d\tau \) is a 4-vector too, since \(d\tau \) is a scalar. However, we know from Eq. (5.10) that \(du^\mu \) is not a 4-vector.
In Sect. 1.7, we introduced the geodesic equations. Since \(u^\mu = dx^\mu /d\tau \), we can rewrite the geodesic equations as
\[ \frac{d u^\mu }{d\tau } + \varGamma ^\mu _{\nu \rho } u^\nu u^\rho = 0 , \tag{5.12} \]
and also as
\[ \frac{D u^\mu }{d\tau } = 0 , \tag{5.13} \]
where we have defined \(D u^\mu \) as
\[ D u^\mu = d u^\mu + \varGamma ^\mu _{\nu \rho } u^\rho dx^\nu . \tag{5.14} \]
We will now show that \(D u^\mu \) is the natural generalization of \(d u^\mu \) for general coordinate systems and that the partial derivative \(\partial _\mu \) generalizes to the covariant derivative \(\nabla _\mu \). In the case of a 4-vector like \(u^\mu \), we have \(D u^\mu = (\nabla _\nu u^\mu ) dx^\nu \), where \(\nabla _\nu \) is defined as
\[ \nabla _\nu u^\mu = \frac{\partial u^\mu }{\partial x^\nu } + \varGamma ^\mu _{\nu \rho } u^\rho . \tag{5.15} \]
First, we check that the components of \(\nabla _\nu u^\mu \) transform as a tensor. The first term on the right hand side in Eq. (5.15) transforms as
\[ \frac{\partial u^\mu }{\partial x^\nu } \rightarrow \frac{\partial u'^\mu }{\partial x'^\nu } = \frac{\partial x^\gamma }{\partial x'^\nu } \frac{\partial x'^\mu }{\partial x^\alpha } \frac{\partial u^\alpha }{\partial x^\gamma } + \frac{\partial x^\gamma }{\partial x'^\nu } \frac{\partial ^2 x'^\mu }{\partial x^\gamma \partial x^\alpha } u^\alpha . \tag{5.16} \]
The Christoffel symbols transform as
\[ \varGamma ^\mu _{\nu \rho } \rightarrow \varGamma '^\mu _{\nu \rho } = \frac{1}{2} g'^{\mu \sigma } \frac{\partial g'_{\sigma \rho }}{\partial x'^\nu } + \frac{1}{2} g'^{\mu \sigma } \frac{\partial g'_{\nu \sigma }}{\partial x'^\rho } - \frac{1}{2} g'^{\mu \sigma } \frac{\partial g'_{\nu \rho }}{\partial x'^\sigma } . \tag{5.17} \]
Since the calculations become long, we consider the three terms on the right hand side in Eq. (5.17) separately. The first term is
\[ \frac{1}{2} g'^{\mu \sigma } \frac{\partial g'_{\sigma \rho }}{\partial x'^\nu } = \frac{1}{2} \frac{\partial x'^\mu }{\partial x^\alpha } \frac{\partial x'^\sigma }{\partial x^\beta } g^{\alpha \beta } g_{\gamma \delta } \frac{\partial ^2 x^\gamma }{\partial x'^\nu \partial x'^\sigma } \frac{\partial x^\delta }{\partial x'^\rho } + \frac{1}{2} \frac{\partial x'^\mu }{\partial x^\alpha } \frac{\partial ^2 x^\alpha }{\partial x'^\nu \partial x'^\rho } + \frac{1}{2} \frac{\partial x'^\mu }{\partial x^\alpha } g^{\alpha \beta } \frac{\partial x^\gamma }{\partial x'^\nu } \frac{\partial x^\delta }{\partial x'^\rho } \frac{\partial g_{\beta \delta }}{\partial x^\gamma } . \tag{5.18} \]
For the second term we have
\[ \frac{1}{2} g'^{\mu \sigma } \frac{\partial g'_{\nu \sigma }}{\partial x'^\rho } = \frac{1}{2} \frac{\partial x'^\mu }{\partial x^\alpha } \frac{\partial x'^\sigma }{\partial x^\beta } g^{\alpha \beta } g_{\gamma \delta } \frac{\partial x^\gamma }{\partial x'^\nu } \frac{\partial ^2 x^\delta }{\partial x'^\rho \partial x'^\sigma } + \frac{1}{2} \frac{\partial x'^\mu }{\partial x^\alpha } \frac{\partial ^2 x^\alpha }{\partial x'^\nu \partial x'^\rho } + \frac{1}{2} \frac{\partial x'^\mu }{\partial x^\alpha } g^{\alpha \beta } \frac{\partial x^\gamma }{\partial x'^\nu } \frac{\partial x^\delta }{\partial x'^\rho } \frac{\partial g_{\gamma \beta }}{\partial x^\delta } . \tag{5.19} \]
Lastly, the third term becomes
\[ - \frac{1}{2} g'^{\mu \sigma } \frac{\partial g'_{\nu \rho }}{\partial x'^\sigma } = - \frac{1}{2} \frac{\partial x'^\mu }{\partial x^\alpha } \frac{\partial x'^\sigma }{\partial x^\beta } g^{\alpha \beta } g_{\gamma \delta } \frac{\partial ^2 x^\gamma }{\partial x'^\sigma \partial x'^\nu } \frac{\partial x^\delta }{\partial x'^\rho } - \frac{1}{2} \frac{\partial x'^\mu }{\partial x^\alpha } \frac{\partial x'^\sigma }{\partial x^\beta } g^{\alpha \beta } g_{\gamma \delta } \frac{\partial x^\gamma }{\partial x'^\nu } \frac{\partial ^2 x^\delta }{\partial x'^\sigma \partial x'^\rho } - \frac{1}{2} \frac{\partial x'^\mu }{\partial x^\alpha } g^{\alpha \beta } \frac{\partial x^\gamma }{\partial x'^\nu } \frac{\partial x^\delta }{\partial x'^\rho } \frac{\partial g_{\gamma \delta }}{\partial x^\beta } . \tag{5.20} \]
If we combine the results in Eqs. (5.18)–(5.20), we find the transformation rule for the Christoffel symbols
\[ \varGamma ^\mu _{\nu \rho } \rightarrow \varGamma '^\mu _{\nu \rho } = \frac{\partial x'^\mu }{\partial x^\alpha } \frac{\partial x^\gamma }{\partial x'^\nu } \frac{\partial x^\delta }{\partial x'^\rho } \varGamma ^\alpha _{\gamma \delta } + \frac{\partial x'^\mu }{\partial x^\alpha } \frac{\partial ^2 x^\alpha }{\partial x'^\nu \partial x'^\rho } . \tag{5.21} \]
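Because of the last term in Eq. (5.21), which in general does not vanish when the coordinate transformation is not linear, the Christoffel symbols do not transform as the components of a tensor, so they are not the components of a tensor.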
Let us now combine the results in Eqs. (5.16) and (5.21). We find
\[ \nabla _\nu u^\mu \rightarrow \nabla '_\nu u'^\mu = \frac{\partial x^\gamma }{\partial x'^\nu } \frac{\partial x'^\mu }{\partial x^\alpha } \frac{\partial u^\alpha }{\partial x^\gamma } + \frac{\partial x^\gamma }{\partial x'^\nu } \frac{\partial ^2 x'^\mu }{\partial x^\gamma \partial x^\alpha } u^\alpha + \frac{\partial x'^\mu }{\partial x^\alpha } \frac{\partial x^\gamma }{\partial x'^\nu } \varGamma ^\alpha _{\gamma \beta } u^\beta + \frac{\partial x'^\mu }{\partial x^\alpha } \frac{\partial ^2 x^\alpha }{\partial x'^\nu \partial x'^\rho } \frac{\partial x'^\rho }{\partial x^\beta } u^\beta . \tag{5.22} \]
Note that
\[ \frac{\partial }{\partial x'^\nu } \left( \frac{\partial x'^\mu }{\partial x^\alpha } \frac{\partial x^\alpha }{\partial x'^\rho } \right) = \frac{\partial \delta ^\mu _\rho }{\partial x'^\nu } = 0 , \tag{5.23} \]
and therefore
\[ \frac{\partial x'^\mu }{\partial x^\alpha } \frac{\partial ^2 x^\alpha }{\partial x'^\nu \partial x'^\rho } = - \frac{\partial x^\alpha }{\partial x'^\rho } \frac{\partial x^\gamma }{\partial x'^\nu } \frac{\partial ^2 x'^\mu }{\partial x^\gamma \partial x^\alpha } . \tag{5.24} \]
We use Eq. (5.24) in the last term on the right hand side in Eq. (5.22) and we see that the second and the last terms cancel each other. Equation (5.22) can thus be written as
\[ \nabla _\nu u^\mu \rightarrow \nabla '_\nu u'^\mu = \frac{\partial x'^\mu }{\partial x^\alpha } \frac{\partial x^\gamma }{\partial x'^\nu } \left( \frac{\partial u^\alpha }{\partial x^\gamma } + \varGamma ^\alpha _{\gamma \beta } u^\beta \right) = \frac{\partial x'^\mu }{\partial x^\alpha } \frac{\partial x^\gamma }{\partial x'^\nu } \nabla _\gamma u^\alpha , \tag{5.25} \]
and we see that the quantities \(\nabla _\nu u^\mu \) transform as the components of a tensor of type (1, 1).
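The difference between the partial and the covariant derivative can be checked numerically in a simple case. The sketch below is only an illustration under stated assumptions: it takes the constant vector field \((V^x, V^y) = (1, 0)\) on the Euclidean plane, whose derivatives all vanish in Cartesian coordinates, and evaluates \(\partial _\nu V^\mu \) and \(\nabla _\nu V^\mu \) in polar coordinates. The Christoffel symbols of polar coordinates, \(\varGamma ^r_{\theta \theta } = -r\) and \(\varGamma ^\theta _{r\theta } = \varGamma ^\theta _{\theta r} = 1/r\), are assumed as known input, and the function names are hypothetical.

```python
import math

def polar_components(r, theta):
    """Polar components (V^r, V^theta) of the constant field (V^x, V^y) = (1, 0)."""
    return (math.cos(theta), -math.sin(theta) / r)

def partial(mu, nu, r, theta, h=1e-5):
    """Central-difference estimate of dV^mu/dx^nu, with x^0 = r and x^1 = theta."""
    dr, dth = (h, 0.0) if nu == 0 else (0.0, h)
    plus = polar_components(r + dr, theta + dth)[mu]
    minus = polar_components(r - dr, theta - dth)[mu]
    return (plus - minus) / (2.0 * h)

def gamma(mu, nu, rho, r):
    """Christoffel symbols of the plane in polar coordinates (assumed input)."""
    if (mu, nu, rho) == (0, 1, 1):
        return -r
    if mu == 1 and {nu, rho} == {0, 1}:
        return 1.0 / r
    return 0.0

def covariant(mu, nu, r, theta):
    """nabla_nu V^mu = dV^mu/dx^nu + Gamma^mu_{nu rho} V^rho."""
    V = polar_components(r, theta)
    return partial(mu, nu, r, theta) + sum(gamma(mu, nu, rho, r) * V[rho] for rho in range(2))

r0, theta0 = 2.0, 0.7
partials = [[partial(mu, nu, r0, theta0) for nu in range(2)] for mu in range(2)]
covariants = [[covariant(mu, nu, r0, theta0) for nu in range(2)] for mu in range(2)]
print(partials)    # some entries are non-zero, e.g. dV^r/dtheta = -sin(theta)
print(covariants)  # all entries vanish up to finite-difference error
```

Since \(\nabla _\nu V^\mu \) is a tensor and vanishes in Cartesian coordinates, it must vanish in polar coordinates too, while the partial derivatives do not.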
5.2.2 Parallel Transport
As it was pointed out at the beginning of this section, the partial derivative of a vector computes the difference of two vectors belonging to different points and for this reason the new object is not a tensor. The sum or the difference of two vectors is another vector if the two vectors belong to the same vector space, but this is not the case here. Intuitively, we should “transport” one of the two vectors to the point of the other vector and compute the difference there. This is what the covariant derivative indeed does and involves the concept of parallel transport.
Let us consider the example illustrated in Fig. 5.1. We have a 2-dimensional Euclidean space and we consider both Cartesian coordinates (x, y) and polar coordinates \((r,\theta )\). The vector \(\mathbf{V}\) is at point \(A = x_A = \{ x^\mu \}\). Its components are \((V^x,V^y)\) in Cartesian coordinates and \((V^r,V^\theta )\) in polar coordinates. If we think of “rigidly” transporting the vector \(\mathbf{V}\) from point A to point \(B = x_B = \{x^\mu + dx^\mu \}\), as shown in Fig. 5.1, its Cartesian components do not change
\[ V^x_{A \rightarrow B} = V^x , \quad V^y_{A \rightarrow B} = V^y . \tag{5.26} \]
However, its polar components change. This operation is called parallel transport and in what follows we want to show that the components of the parallel transported vector are given by
\[ V^\mu _{A \rightarrow B} = V^\mu - \varGamma ^\mu _{\nu \rho } (x_A) V^\nu dx^\rho . \tag{5.27} \]
First, the relations between Cartesian and polar coordinates are
\[ x = r \cos \theta , \quad y = r \sin \theta , \tag{5.28} \]
\[ r = \sqrt{x^2 + y^2} , \quad \theta = \arctan \frac{y}{x} . \tag{5.29} \]
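If the vector \(\mathbf{V}\) has Cartesian components \((V^x,V^y)\), its polar components \((V^r,V^\theta )\) are
\[ V^r = \frac{\partial r}{\partial x} V^x + \frac{\partial r}{\partial y} V^y = \cos \theta \, V^x + \sin \theta \, V^y , \quad V^\theta = \frac{\partial \theta }{\partial x} V^x + \frac{\partial \theta }{\partial y} V^y = - \frac{\sin \theta }{r} V^x + \frac{\cos \theta }{r} V^y . \tag{5.30} \]
If we parallel transport the vector \(\mathbf{V}\) from the point \((r,\theta )\) to the point \((r+dr,\theta +d\theta )\), we have the vector \(\mathbf{V}_{||}\) in Fig. 5.1. The radial component \(V_{||}^r\) is
\[ V_{||}^r = \cos (\theta + d\theta ) V^x + \sin (\theta + d\theta ) V^y = \left( \cos \theta - \sin \theta \, d\theta \right) V^x + \left( \sin \theta + \cos \theta \, d\theta \right) V^y = V^r + r V^\theta d\theta , \tag{5.31} \]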
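where we have neglected \(O (d\theta ^2)\) terms and used
\[ \cos (\theta + d\theta ) = \cos \theta - \sin \theta \, d\theta + O (d\theta ^2) , \quad \sin (\theta + d\theta ) = \sin \theta + \cos \theta \, d\theta + O (d\theta ^2) . \tag{5.32} \]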
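For the angular component \(V_{||}^\theta \), we have
\[ V_{||}^\theta = - \frac{\sin (\theta + d\theta )}{r + dr} V^x + \frac{\cos (\theta + d\theta )}{r + dr} V^y . \tag{5.33} \]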
Since
\[ \frac{1}{r + dr} = \frac{1}{r} \left( 1 - \frac{dr}{r} \right) + O (dr^2) , \tag{5.34} \]
Equation (5.33) becomes
\[ V_{||}^\theta = V^\theta - \frac{1}{r} V^\theta dr - \frac{1}{r} V^r d\theta . \tag{5.35} \]
In polar coordinates, the line element reads
\[ dl^2 = dr^2 + r^2 d\theta ^2 . \tag{5.36} \]
As we have seen in Sect. 1.7, the Christoffel symbols can be more quickly calculated from the comparison of the Euler–Lagrange equations for a free particle with the geodesic equations. The Lagrangian to employ is
\[ L = \frac{1}{2} \left( \dot{r}^2 + r^2 \dot{\theta }^2 \right) . \tag{5.37} \]
The Euler–Lagrange equation for the Lagrangian coordinate r is
\[ \ddot{r} - r \dot{\theta }^2 = 0 . \tag{5.38} \]
For the Lagrangian coordinate \(\theta \), we have
\[ \ddot{\theta } + \frac{2}{r} \dot{r} \dot{\theta } = 0 . \tag{5.39} \]
If we compare Eqs. (5.38) and (5.39) with the geodesic equations, we see that the non-vanishing Christoffel symbols are
\[ \varGamma ^r_{\theta \theta } = - r , \quad \varGamma ^\theta _{r \theta } = \varGamma ^\theta _{\theta r} = \frac{1}{r} . \tag{5.40} \]
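Now we want to see that, if we parallel transport the vector \(\mathbf{V}\) from point A to point B, the components of the vector at B are given by
\[ V^\mu _{A \rightarrow B} = V^\mu _A - \varGamma ^\mu _{\nu \rho } (x_A) V^\nu _A dx^\rho . \tag{5.41} \]
For the radial component we have
\[ V^r_{A \rightarrow B} = V^r - \varGamma ^r_{\theta \theta } V^\theta d\theta = V^r + r V^\theta d\theta . \tag{5.42} \]
For the angular component we have
\[ V^\theta _{A \rightarrow B} = V^\theta - \varGamma ^\theta _{r \theta } V^r d\theta - \varGamma ^\theta _{\theta r} V^\theta dr = V^\theta - \frac{1}{r} V^r d\theta - \frac{1}{r} V^\theta dr . \tag{5.43} \]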
We thus see that we recover the results in Eqs. (5.31) and (5.35).
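The comparison above can also be carried out numerically. The following sketch, offered as an illustration rather than as the procedure of the text, computes the Christoffel symbols of the plane in polar coordinates from the line element \(dl^2 = dr^2 + r^2 d\theta ^2\) by finite differences, and then checks that \(V^\mu - \varGamma ^\mu _{\nu \rho } V^\nu dx^\rho \) reproduces, to first order in the displacement, the polar components of a vector whose Cartesian components are kept fixed. The function names, the sample vector, and the displacement are arbitrary choices.

```python
import math

def polar_metric(x):
    """Metric of the Euclidean plane in polar coordinates x = (r, theta)."""
    r, _ = x
    return [[1.0, 0.0], [0.0, r * r]]

def christoffel(metric, x, h=1e-5):
    """Gamma[k][n][s] = (1/2) g^{kl} (d_n g_{ls} + d_s g_{nl} - d_l g_{ns}) in 2D."""
    def d(l):  # central difference of the metric along coordinate l
        xp, xm = list(x), list(x)
        xp[l] += h
        xm[l] -= h
        gp, gm = metric(xp), metric(xm)
        return [[(gp[a][b] - gm[a][b]) / (2 * h) for b in range(2)] for a in range(2)]
    dg = [d(0), d(1)]  # dg[l][a][b] = d g_{ab} / d x^l
    g = metric(x)
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    ginv = [[g[1][1] / det, -g[0][1] / det], [-g[1][0] / det, g[0][0] / det]]
    return [[[0.5 * sum(ginv[k][l] * (dg[n][l][s] + dg[s][n][l] - dg[l][n][s])
                        for l in range(2))
              for s in range(2)] for n in range(2)] for k in range(2)]

def to_polar(vx, vy, r, theta):
    """Polar components of a vector with Cartesian components (vx, vy) at (r, theta)."""
    return [math.cos(theta) * vx + math.sin(theta) * vy,
            (-math.sin(theta) * vx + math.cos(theta) * vy) / r]

A = [2.0, 0.5]        # point A = (r, theta)
dx = [1e-4, 2e-4]     # small displacement (dr, dtheta) from A to B
V_A = to_polar(1.0, 3.0, A[0], A[1])
exact_B = to_polar(1.0, 3.0, A[0] + dx[0], A[1] + dx[1])  # same Cartesian components at B
G = christoffel(polar_metric, A)
transported = [V_A[m] - sum(G[m][n][s] * V_A[n] * dx[s]
                            for n in range(2) for s in range(2)) for m in range(2)]
error = max(abs(a - b) for a, b in zip(transported, exact_B))
print(G[0][1][1], G[1][0][1], error)  # about -r, 1/r, and a second-order residual
```

The residual is of second order in the displacement, consistent with the terms neglected in the derivation.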
When we compute the covariant derivative of a vector \(V^\mu \) we are calculating
\[ \left( \nabla _\nu V^\mu \right) dx^\nu = V^\mu _{(x+dx) \rightarrow x} - V^\mu (x) = \left[ V^\mu (x+dx) + \varGamma ^\mu _{\nu \rho } V^\rho dx^\nu \right] - V^\mu (x) , \tag{5.44} \]
that is, we calculate the difference between the vector \(V^\mu (x+dx)\) parallel transported to x and the vector \(V^\mu (x)\). Note that now the sign in front of the Christoffel symbols is plus because we are transporting a vector from \(x+dx\) to x, while in Eq. (5.41) we have the opposite case.
5.2.3 Properties of the Covariant Derivative
From the discussion above, it is clear that the covariant derivative of a scalar reduces to the ordinary partial derivative
\[ \nabla _\mu \phi = \frac{\partial \phi }{\partial x^\mu } . \tag{5.45} \]
Indeed a scalar is just a number and the parallel transport is trivial: \(\phi _{A\rightarrow B} = \phi _A\).
The covariant derivative of a dual vector is given by
\[ \nabla _\mu V_\nu = \frac{\partial V_\nu }{\partial x^\mu } - \varGamma ^\rho _{\mu \nu } V_\rho . \tag{5.46} \]
Indeed, if we consider any vector \(V^\mu \) and dual vector \(W_\mu \), \(V^\mu W_\mu \) is a scalar. Enforcing Leibniz’s rule for the covariant derivative, we have
\[ \nabla _\mu \left( V^\nu W_\nu \right) = \left( \nabla _\mu V^\nu \right) W_\nu + V^\nu \left( \nabla _\mu W_\nu \right) = \left( \frac{\partial V^\nu }{\partial x^\mu } + \varGamma ^\nu _{\mu \rho } V^\rho \right) W_\nu + V^\nu \left( \nabla _\mu W_\nu \right) . \tag{5.47} \]
Since \(\nabla _\mu \) becomes \(\partial _\mu \) for a scalar function
\[ \nabla _\mu \left( V^\nu W_\nu \right) = \frac{\partial }{\partial x^\mu } \left( V^\nu W_\nu \right) = \frac{\partial V^\nu }{\partial x^\mu } W_\nu + V^\nu \frac{\partial W_\nu }{\partial x^\mu } , \tag{5.48} \]
and therefore, equating Eqs. (5.47) and (5.48), we must have
\[ \nabla _\mu W_\nu = \frac{\partial W_\nu }{\partial x^\mu } - \varGamma ^\rho _{\mu \nu } W_\rho . \tag{5.49} \]
The generalization to tensors of any type is straightforward and we have
\[ \nabla _\lambda T^{\mu _1 \mu _2 \ldots \mu _r}_{\nu _1 \nu _2 \ldots \nu _s} = \frac{\partial }{\partial x^\lambda } T^{\mu _1 \mu _2 \ldots \mu _r}_{\nu _1 \nu _2 \ldots \nu _s} + \varGamma ^{\mu _1}_{\lambda \sigma } T^{\sigma \mu _2 \ldots \mu _r}_{\nu _1 \nu _2 \ldots \nu _s} + \cdots + \varGamma ^{\mu _r}_{\lambda \sigma } T^{\mu _1 \mu _2 \ldots \sigma }_{\nu _1 \nu _2 \ldots \nu _s} - \varGamma ^\sigma _{\lambda \nu _1} T^{\mu _1 \mu _2 \ldots \mu _r}_{\sigma \nu _2 \ldots \nu _s} - \cdots - \varGamma ^\sigma _{\lambda \nu _s} T^{\mu _1 \mu _2 \ldots \mu _r}_{\nu _1 \nu _2 \ldots \sigma } . \tag{5.50} \]
It is worth noting that the covariant derivative of the metric tensor vanishes. If we calculate \(\nabla _\mu g_{\nu \rho }\), we find
\[ \nabla _\mu g_{\nu \rho } = \frac{\partial g_{\nu \rho }}{\partial x^\mu } - \varGamma ^\kappa _{\mu \nu } g_{\kappa \rho } - \varGamma ^\kappa _{\mu \rho } g_{\nu \kappa } = \frac{\partial g_{\nu \rho }}{\partial x^\mu } - \frac{1}{2} g^{\kappa \lambda } \left( \frac{\partial g_{\lambda \nu }}{\partial x^\mu } + \frac{\partial g_{\mu \lambda }}{\partial x^\nu } - \frac{\partial g_{\mu \nu }}{\partial x^\lambda } \right) g_{\kappa \rho } - \frac{1}{2} g^{\kappa \lambda } \left( \frac{\partial g_{\lambda \rho }}{\partial x^\mu } + \frac{\partial g_{\mu \lambda }}{\partial x^\rho } - \frac{\partial g_{\mu \rho }}{\partial x^\lambda } \right) g_{\nu \kappa } . \tag{5.51} \]
Since \(g^{\kappa \lambda } g_{\kappa \rho } = \delta ^\lambda _\rho \) and \(g^{\kappa \lambda } g_{\nu \kappa } = \delta ^\lambda _\nu \), we obtain
\[ \nabla _\mu g_{\nu \rho } = \frac{\partial g_{\nu \rho }}{\partial x^\mu } - \frac{1}{2} \left( \frac{\partial g_{\rho \nu }}{\partial x^\mu } + \frac{\partial g_{\mu \rho }}{\partial x^\nu } - \frac{\partial g_{\mu \nu }}{\partial x^\rho } \right) - \frac{1}{2} \left( \frac{\partial g_{\nu \rho }}{\partial x^\mu } + \frac{\partial g_{\mu \nu }}{\partial x^\rho } - \frac{\partial g_{\mu \rho }}{\partial x^\nu } \right) = 0 . \tag{5.52} \]
5.3 Useful Expressions
In this section we will derive a number of useful expressions and identities involving the metric tensor, the Christoffel symbols, and the covariant derivative.
We indicate with g the determinant of the metric tensor \(g_{\mu \nu }\) and with \(\tilde{g}_{\mu \nu }\) the cofactor \((\mu ,\nu )\). The determinant g is defined as
\[ g = \sum _\nu g_{\mu \nu } \tilde{g}_{\mu \nu } \quad \text{(no summation over } \mu \text{)} . \tag{5.53} \]
The cofactor \(\tilde{g}_{\mu \nu }\) can be written in terms of the determinant g and of the inverse of the metric tensor as
\[ \tilde{g}_{\mu \nu } = g \, g^{\mu \nu } . \tag{5.54} \]
Indeed, if we have an invertible square matrix A, its inverse is given by
\[ A^{-1} = \frac{1}{\det A} \, C^T , \tag{5.55} \]
where C is the cofactor matrix and \(C^T\) is the transpose of C (the proof of this formula can be found in a textbook on linear algebra). If we apply Eq. (5.55) to the metric tensor, we recover Eq. (5.54). If we plug Eq. (5.54) into (5.53), we find \(g=g\), which confirms that Eq. (5.54) is correct.
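These relations are easy to verify numerically. The short sketch below builds the cofactor matrix of a sample symmetric \(3 \times 3\) matrix, checks that the row expansion of the determinant gives the same value for every row, and checks that \(C^T / \det A\) is indeed the inverse. The sample matrix and the function names are hypothetical, and a \(3 \times 3\) matrix is used instead of a \(4 \times 4\) metric only to keep the example short.

```python
def minor_det(a, i, j):
    """Determinant of the 2x2 matrix left after deleting row i and column j."""
    rows = [r for r in range(3) if r != i]
    cols = [c for c in range(3) if c != j]
    return (a[rows[0]][cols[0]] * a[rows[1]][cols[1]]
            - a[rows[0]][cols[1]] * a[rows[1]][cols[0]])

def cofactor_matrix(a):
    """Cofactors C[i][j] = (-1)^(i+j) times the (i, j) minor of a 3x3 matrix."""
    return [[(-1) ** (i + j) * minor_det(a, i, j) for j in range(3)] for i in range(3)]

def row_expansion(a, c, i):
    """Sum over j of a[i][j] * C[i][j]; equals det(a) for any fixed row i."""
    return sum(a[i][j] * c[i][j] for j in range(3))

g = [[2.0, 0.5, 0.0], [0.5, 3.0, 1.0], [0.0, 1.0, 4.0]]  # hypothetical symmetric matrix
C = cofactor_matrix(g)
det_g = row_expansion(g, C, 0)
g_inv = [[C[j][i] / det_g for j in range(3)] for i in range(3)]  # inverse = C^T / det
product = [[sum(g[i][k] * g_inv[k][j] for k in range(3)) for j in range(3)]
           for i in range(3)]
print(det_g, [row_expansion(g, C, i) for i in range(3)])
print(product)  # identity matrix up to rounding
```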
With the help of Eq. (5.54) we can write
\[ \frac{\partial g}{\partial g_{\mu \nu }} = \tilde{g}_{\mu \nu } = g \, g^{\mu \nu } , \tag{5.56} \]
as well as
\[ \frac{\partial g}{\partial x^\sigma } = \frac{\partial g}{\partial g_{\mu \nu }} \frac{\partial g_{\mu \nu }}{\partial x^\sigma } = g \, g^{\mu \nu } \frac{\partial g_{\mu \nu }}{\partial x^\sigma } . \tag{5.57} \]
This last expression will be used later.
Let us write the formula for Christoffel symbols
\[ \varGamma ^\kappa _{\nu \sigma } = \frac{1}{2} g^{\kappa \lambda } \left( \frac{\partial g_{\lambda \sigma }}{\partial x^\nu } + \frac{\partial g_{\nu \lambda }}{\partial x^\sigma } - \frac{\partial g_{\nu \sigma }}{\partial x^\lambda } \right) . \tag{5.58} \]
We multiply both sides in Eq. (5.58) by \(g_{\kappa \mu }\) and we get
\[ g_{\kappa \mu } \varGamma ^\kappa _{\nu \sigma } = \frac{1}{2} \left( \frac{\partial g_{\mu \sigma }}{\partial x^\nu } + \frac{\partial g_{\nu \mu }}{\partial x^\sigma } - \frac{\partial g_{\nu \sigma }}{\partial x^\mu } \right) . \tag{5.59} \]
We rewrite Eq. (5.59) exchanging the indices \(\mu \) and \(\nu \)
\[ g_{\kappa \nu } \varGamma ^\kappa _{\mu \sigma } = \frac{1}{2} \left( \frac{\partial g_{\nu \sigma }}{\partial x^\mu } + \frac{\partial g_{\mu \nu }}{\partial x^\sigma } - \frac{\partial g_{\mu \sigma }}{\partial x^\nu } \right) . \tag{5.60} \]
We sum Eq. (5.59) with (5.60) and we obtain
\[ g_{\kappa \mu } \varGamma ^\kappa _{\nu \sigma } + g_{\kappa \nu } \varGamma ^\kappa _{\mu \sigma } = \frac{\partial g_{\mu \nu }}{\partial x^\sigma } . \tag{5.61} \]
Now we can multiply both sides in Eq. (5.61) by \(g g^{\mu \nu }\) and employ Eq. (5.57)
\[ g \, g^{\mu \nu } \left( g_{\kappa \mu } \varGamma ^\kappa _{\nu \sigma } + g_{\kappa \nu } \varGamma ^\kappa _{\mu \sigma } \right) = g \, g^{\mu \nu } \frac{\partial g_{\mu \nu }}{\partial x^\sigma } \quad \Rightarrow \quad 2 g \, \varGamma ^\kappa _{\kappa \sigma } = \frac{\partial g}{\partial x^\sigma } . \tag{5.62} \]
We rewrite Eq. (5.62) as
\[ \varGamma ^\kappa _{\kappa \sigma } = \frac{1}{2g} \frac{\partial g}{\partial x^\sigma } = \frac{\partial }{\partial x^\sigma } \ln \sqrt{-g} = \frac{1}{\sqrt{-g}} \frac{\partial \sqrt{-g}}{\partial x^\sigma } . \tag{5.63} \]
With the help of Eq. (5.63), the covariant divergence of a generic vector \(A^{\mu }\) can be written as
\[ \nabla _\mu A^\mu = \frac{\partial A^\mu }{\partial x^\mu } + \varGamma ^\mu _{\sigma \mu } A^\sigma = \frac{\partial A^\mu }{\partial x^\mu } + \frac{1}{\sqrt{-g}} \frac{\partial \sqrt{-g}}{\partial x^\sigma } A^\sigma = \frac{1}{\sqrt{-g}} \frac{\partial }{\partial x^\mu } \left( \sqrt{-g} \, A^\mu \right) . \tag{5.64} \]
In the case of a generic tensor \(A^{\mu \nu }\) of type (2, 0), we can write
\[ \nabla _\mu A^{\mu \nu } = \frac{\partial A^{\mu \nu }}{\partial x^\mu } + \varGamma ^\mu _{\sigma \mu } A^{\sigma \nu } + \varGamma ^\nu _{\sigma \mu } A^{\mu \sigma } = \frac{1}{\sqrt{-g}} \frac{\partial }{\partial x^\mu } \left( \sqrt{-g} \, A^{\mu \nu } \right) + \varGamma ^\nu _{\sigma \mu } A^{\mu \sigma } . \tag{5.65} \]
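Note that, if \(A^{\mu \nu }\) is antisymmetric, namely \(A^{\mu \nu } = - A^{\nu \mu }\), then \(\varGamma ^\nu _{\sigma \mu } A^{\mu \sigma } = 0\), because the Christoffel symbols are symmetric in their lower indices, and Eq. (5.65) simplifies to
\[ \nabla _\mu A^{\mu \nu } = \frac{1}{\sqrt{-g}} \frac{\partial }{\partial x^\mu } \left( \sqrt{-g} \, A^{\mu \nu } \right) . \tag{5.66} \]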
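The expression of the covariant divergence in terms of the determinant of the metric can be tested in a simple setting. The sketch below assumes the Euclidean plane in polar coordinates, where the metric is positive definite, so \(\sqrt{g} = r\) replaces \(\sqrt{-g}\), and the only non-vanishing contracted Christoffel symbol is \(\varGamma ^\mu _{r \mu } = 1/r\). The vector field and the function names are hypothetical choices made for the example.

```python
import math

def A(r, theta):
    """Hypothetical test vector field in polar components (A^r, A^theta)."""
    return (r * r * math.cos(theta), math.sin(theta))

def d(f, nu, r, theta, h=1e-5):
    """Central difference of the scalar function f along x^nu (x^0 = r, x^1 = theta)."""
    if nu == 0:
        return (f(r + h, theta) - f(r - h, theta)) / (2 * h)
    return (f(r, theta + h) - f(r, theta - h)) / (2 * h)

def divergence_christoffel(r, theta):
    """d_mu A^mu + Gamma^mu_{sigma mu} A^sigma, using Gamma^mu_{r mu} = 1/r."""
    d_r = d(lambda r_, t_: A(r_, t_)[0], 0, r, theta)
    d_theta = d(lambda r_, t_: A(r_, t_)[1], 1, r, theta)
    return d_r + d_theta + A(r, theta)[0] / r

def divergence_density(r, theta):
    """(1/sqrt(g)) d_mu (sqrt(g) A^mu) with sqrt(g) = r."""
    flux_r = lambda r_, t_: r_ * A(r_, t_)[0]
    flux_theta = lambda r_, t_: r_ * A(r_, t_)[1]
    return (d(flux_r, 0, r, theta) + d(flux_theta, 1, r, theta)) / r

r0, theta0 = 1.5, 0.4
div_a = divergence_christoffel(r0, theta0)
div_b = divergence_density(r0, theta0)
exact = (3 * r0 + 1) * math.cos(theta0)  # analytic divergence of this test field
print(div_a, div_b, exact)
```

The two numerical values agree with each other and with the analytic result up to finite-difference error.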
5.4 Riemann Tensor
5.4.1 Definition
We know that if the first partial derivatives are differentiable then the partial derivatives commute, i.e. \(\partial _\mu \partial _\nu = \partial _\nu \partial _\mu \); see e.g. [1]. In general, covariant derivatives do not. We can introduce the Riemann tensor \(R^\lambda _{\,\,\rho \nu \mu }\) as the tensor of type (1, 3) defined as
\[ \nabla _\mu \nabla _\nu A_\rho - \nabla _\nu \nabla _\mu A_\rho = R^\lambda _{\,\,\rho \nu \mu } A_\lambda , \tag{5.67} \]
where \(A_\mu \) is a generic dual vector. \(R^\lambda _{\,\,\rho \nu \mu }\) is a tensor because \(\nabla _\mu \nabla _\nu A_\rho \) and \(\nabla _\nu \nabla _\mu A_\rho \) are tensors.
In order to find the explicit expression of the Riemann tensor, first we calculate \(\nabla _\mu \nabla _\nu A_\rho \)
\[ \nabla _\mu \nabla _\nu A_\rho = \partial _\mu \left( \nabla _\nu A_\rho \right) - \varGamma ^\kappa _{\mu \nu } \nabla _\kappa A_\rho - \varGamma ^\kappa _{\mu \rho } \nabla _\nu A_\kappa = \partial _\mu \partial _\nu A_\rho - \left( \partial _\mu \varGamma ^\lambda _{\nu \rho } \right) A_\lambda - \varGamma ^\lambda _{\nu \rho } \partial _\mu A_\lambda - \varGamma ^\kappa _{\mu \nu } \partial _\kappa A_\rho + \varGamma ^\kappa _{\mu \nu } \varGamma ^\lambda _{\kappa \rho } A_\lambda - \varGamma ^\kappa _{\mu \rho } \partial _\nu A_\kappa + \varGamma ^\kappa _{\mu \rho } \varGamma ^\lambda _{\nu \kappa } A_\lambda . \tag{5.68} \]
The expression for \(\nabla _\nu \nabla _\mu A_\rho \) is
\[ \nabla _\nu \nabla _\mu A_\rho = \partial _\nu \partial _\mu A_\rho - \left( \partial _\nu \varGamma ^\lambda _{\mu \rho } \right) A_\lambda - \varGamma ^\lambda _{\mu \rho } \partial _\nu A_\lambda - \varGamma ^\kappa _{\nu \mu } \partial _\kappa A_\rho + \varGamma ^\kappa _{\nu \mu } \varGamma ^\lambda _{\kappa \rho } A_\lambda - \varGamma ^\kappa _{\nu \rho } \partial _\mu A_\kappa + \varGamma ^\kappa _{\nu \rho } \varGamma ^\lambda _{\mu \kappa } A_\lambda . \tag{5.69} \]
We combine Eqs. (5.68) and (5.69) and we find
\[ \nabla _\mu \nabla _\nu A_\rho - \nabla _\nu \nabla _\mu A_\rho = \left( \partial _\nu \varGamma ^\lambda _{\mu \rho } - \partial _\mu \varGamma ^\lambda _{\nu \rho } + \varGamma ^\kappa _{\mu \rho } \varGamma ^\lambda _{\nu \kappa } - \varGamma ^\kappa _{\nu \rho } \varGamma ^\lambda _{\mu \kappa } \right) A_\lambda . \tag{5.70} \]
We can now write the Riemann tensor \(R^\lambda _{\,\,\rho \nu \mu }\) in terms of the Christoffel symbols as follows
\[ R^\lambda _{\,\,\rho \nu \mu } = \frac{\partial \varGamma ^\lambda _{\mu \rho }}{\partial x^\nu } - \frac{\partial \varGamma ^\lambda _{\nu \rho }}{\partial x^\mu } + \varGamma ^\kappa _{\mu \rho } \varGamma ^\lambda _{\nu \kappa } - \varGamma ^\kappa _{\nu \rho } \varGamma ^\lambda _{\mu \kappa } . \tag{5.71} \]
It is also useful to have the explicit expression of \(R_{\mu \nu \rho \sigma }\). From Eq. (5.71), we lower the upper index with the metric tensor
\[ R_{\mu \nu \rho \sigma } = g_{\mu \lambda } R^\lambda _{\,\,\nu \rho \sigma } = g_{\mu \lambda } \left( \partial _\rho \varGamma ^\lambda _{\nu \sigma } - \partial _\sigma \varGamma ^\lambda _{\nu \rho } + \varGamma ^\kappa _{\nu \sigma } \varGamma ^\lambda _{\rho \kappa } - \varGamma ^\kappa _{\nu \rho } \varGamma ^\lambda _{\sigma \kappa } \right) . \tag{5.72} \]
The first term on the right hand side in Eq. (5.72) can be written as
\[ g_{\mu \lambda } \partial _\rho \varGamma ^\lambda _{\nu \sigma } = \partial _\rho \left( g_{\mu \lambda } \varGamma ^\lambda _{\nu \sigma } \right) - \varGamma ^\lambda _{\nu \sigma } \partial _\rho g_{\mu \lambda } = \frac{1}{2} \partial _\rho \left( \partial _\sigma g_{\mu \nu } + \partial _\nu g_{\mu \sigma } - \partial _\mu g_{\nu \sigma } \right) - \varGamma ^\lambda _{\nu \sigma } \partial _\rho g_{\mu \lambda } . \tag{5.73} \]
We use Eq. (5.61) to rewrite \(\partial g_{\mu \lambda }/\partial x^\rho \) and Eq. (5.73) becomes
\[ g_{\mu \lambda } \partial _\rho \varGamma ^\lambda _{\nu \sigma } = \frac{1}{2} \partial _\rho \left( \partial _\sigma g_{\mu \nu } + \partial _\nu g_{\mu \sigma } - \partial _\mu g_{\nu \sigma } \right) - \varGamma ^\lambda _{\nu \sigma } \left( g_{\kappa \mu } \varGamma ^\kappa _{\lambda \rho } + g_{\kappa \lambda } \varGamma ^\kappa _{\mu \rho } \right) . \tag{5.74} \]
In the same way, we can rewrite the second term on the right hand side of Eq. (5.72)
\[ g_{\mu \lambda } \partial _\sigma \varGamma ^\lambda _{\nu \rho } = \frac{1}{2} \partial _\sigma \left( \partial _\rho g_{\mu \nu } + \partial _\nu g_{\mu \rho } - \partial _\mu g_{\nu \rho } \right) - \varGamma ^\lambda _{\nu \rho } \left( g_{\kappa \mu } \varGamma ^\kappa _{\lambda \sigma } + g_{\kappa \lambda } \varGamma ^\kappa _{\mu \sigma } \right) . \tag{5.75} \]
With Eqs. (5.74) and (5.75) we can rewrite \(R_{\mu \nu \rho \sigma }\) as
\[ R_{\mu \nu \rho \sigma } = \frac{1}{2} \left( \partial _\nu \partial _\rho g_{\mu \sigma } + \partial _\mu \partial _\sigma g_{\nu \rho } - \partial _\mu \partial _\rho g_{\nu \sigma } - \partial _\nu \partial _\sigma g_{\mu \rho } \right) + g_{\kappa \lambda } \left( \varGamma ^\kappa _{\mu \sigma } \varGamma ^\lambda _{\nu \rho } - \varGamma ^\kappa _{\mu \rho } \varGamma ^\lambda _{\nu \sigma } \right) . \tag{5.76} \]
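The Riemann tensor \(R_{\mu \nu \rho \sigma }\) is antisymmetric in the first and second indices as well as in the third and fourth indices, while it is symmetric if we exchange the first and second indices respectively with the third and fourth indices:
\[ R_{\mu \nu \rho \sigma } = - R_{\nu \mu \rho \sigma } = - R_{\mu \nu \sigma \rho } = R_{\rho \sigma \mu \nu } . \tag{5.77} \]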
Note that, if \(R^\mu _{\,\,\nu \rho \sigma } = 0\) in a certain coordinate system, it vanishes in any coordinate system. This follows from the transformation rule of tensors
\[ R^\mu _{\,\,\nu \rho \sigma } \rightarrow R'^\mu _{\,\,\nu \rho \sigma } = \frac{\partial x'^\mu }{\partial x^\alpha } \frac{\partial x^\beta }{\partial x'^\nu } \frac{\partial x^\gamma }{\partial x'^\rho } \frac{\partial x^\delta }{\partial x'^\sigma } R^\alpha _{\,\,\beta \gamma \delta } . \tag{5.78} \]
In particular, since in flat spacetime in Cartesian coordinates the Riemann tensor vanishes, it vanishes in any coordinate system even if the Christoffel symbols may not vanish. So in flat spacetime all the components of the Riemann tensor are identically zero. Note that such a statement is not true for the Christoffel symbols, because they transform with the rule in Eq. (5.21), where the last term may be non-zero under a certain coordinate transformation.
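This statement can be illustrated numerically. The sketch below assumes the index convention \(R^a_{\,\,bcd} = \partial _c \varGamma ^a_{db} - \partial _d \varGamma ^a_{cb} + \varGamma ^a_{ce} \varGamma ^e_{db} - \varGamma ^a_{de} \varGamma ^e_{cb}\) for the Riemann tensor and evaluates it by finite differences for two 2-dimensional examples: the Euclidean plane in polar coordinates, where the Christoffel symbols are non-zero but the space is flat, and the unit 2-sphere in coordinates \((\theta , \phi )\), which is added here only as an illustrative curved space. The Christoffel symbols of both spaces are taken as known input and the function names are hypothetical.

```python
import math

def gamma_plane(x):
    """Gamma[a][b][c] = Gamma^a_{bc} for the plane in polar coordinates x = (r, theta)."""
    r, _ = x
    G = [[[0.0] * 2 for _ in range(2)] for _ in range(2)]
    G[0][1][1] = -r
    G[1][0][1] = G[1][1][0] = 1.0 / r
    return G

def gamma_sphere(x):
    """Gamma^a_{bc} for the unit 2-sphere in coordinates x = (theta, phi)."""
    theta, _ = x
    G = [[[0.0] * 2 for _ in range(2)] for _ in range(2)]
    G[0][1][1] = -math.sin(theta) * math.cos(theta)
    G[1][0][1] = G[1][1][0] = math.cos(theta) / math.sin(theta)
    return G

def riemann(gamma, x, h=1e-5):
    """R[a][b][c][d] in the convention stated in the lead-in (2 dimensions)."""
    n = 2
    G = gamma(x)
    dG = []  # dG[c][a][b][e] = d Gamma^a_{be} / d x^c, by central differences
    for c in range(n):
        xp, xm = list(x), list(x)
        xp[c] += h
        xm[c] -= h
        Gp, Gm = gamma(xp), gamma(xm)
        dG.append([[[(Gp[a][b][e] - Gm[a][b][e]) / (2 * h) for e in range(n)]
                    for b in range(n)] for a in range(n)])
    return [[[[dG[c][a][d][b] - dG[d][a][c][b]
               + sum(G[a][c][e] * G[e][d][b] - G[a][d][e] * G[e][c][b] for e in range(n))
               for d in range(n)] for c in range(n)] for b in range(n)] for a in range(n)]

flat = riemann(gamma_plane, [2.0, 0.3])
curved = riemann(gamma_sphere, [1.0, 0.3])
max_flat = max(abs(flat[a][b][c][d]) for a in range(2) for b in range(2)
               for c in range(2) for d in range(2))
print(max_flat)                                  # zero up to numerical error
print(curved[0][1][0][1], math.sin(1.0) ** 2)    # R^theta_{phi theta phi} = sin^2(theta)
```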
5.4.2 Geometrical Interpretation
Here we want to show that the result of parallel transport of a vector depends on the path. With reference to Fig. 5.2, we have the vector \(\mathbf{V}\) at point \(A = x_A = \{ x^\mu \}\). The vector has components
\[ V^\mu _A = V^\mu . \tag{5.79} \]
Let us now parallel transport the vector to point \(B = x_B = \{ x^\mu + p^\mu \}\), where \(p^\mu \) is an infinitesimal displacement. After parallel transport, the components of the vector are
\[ V^\mu _{A \rightarrow B} = V^\mu - \varGamma ^\mu _{\nu \rho } (x_A) V^\nu p^\rho . \tag{5.80} \]
Lastly, we parallel transport the vector to point \(D = x_D = \{ x^\mu + p^\mu + q^\mu \}\), where \(q^\mu \) is an infinitesimal displacement too. At point D the components of the vector are
\[ V^\mu _{A \rightarrow B \rightarrow D} = V^\mu _{A \rightarrow B} - \varGamma ^\mu _{\nu \rho } (x_B) V^\nu _{A \rightarrow B} q^\rho = V^\mu - \varGamma ^\mu _{\nu \rho } V^\nu p^\rho - \varGamma ^\mu _{\nu \rho } V^\nu q^\rho - \frac{\partial \varGamma ^\mu _{\nu \rho }}{\partial x^\sigma } V^\nu p^\sigma q^\rho + \varGamma ^\mu _{\nu \rho } \varGamma ^\nu _{\kappa \lambda } V^\kappa p^\lambda q^\rho , \tag{5.81} \]
where we have neglected terms of order higher than second in the infinitesimal displacements \(p^\mu \) and \(q^\mu \).
Let us now do the same changing path. We start from the vector \(\mathbf{V}\) at point A and we parallel transport it to point \(C = x_C = \{ x^\mu + q^\mu \}\), as shown in Fig. 5.2. The result is the vector \(\mathbf{V}_{A \rightarrow C}\). We continue and we parallel transport the vector to point D, where the vector components are
\[ V^\mu _{A \rightarrow C \rightarrow D} = V^\mu - \varGamma ^\mu _{\nu \rho } V^\nu q^\rho - \varGamma ^\mu _{\nu \rho } V^\nu p^\rho - \frac{\partial \varGamma ^\mu _{\nu \rho }}{\partial x^\sigma } V^\nu q^\sigma p^\rho + \varGamma ^\mu _{\nu \rho } \varGamma ^\nu _{\kappa \lambda } V^\kappa q^\lambda p^\rho . \tag{5.82} \]
If we compare Eqs. (5.81) and (5.82), we find that
\[ V^\mu _{A \rightarrow C \rightarrow D} - V^\mu _{A \rightarrow B \rightarrow D} = \left( \frac{\partial \varGamma ^\mu _{\nu \sigma }}{\partial x^\rho } - \frac{\partial \varGamma ^\mu _{\nu \rho }}{\partial x^\sigma } + \varGamma ^\kappa _{\nu \sigma } \varGamma ^\mu _{\rho \kappa } - \varGamma ^\kappa _{\nu \rho } \varGamma ^\mu _{\sigma \kappa } \right) V^\nu p^\rho q^\sigma = R^\mu _{\,\,\nu \rho \sigma } V^\nu p^\rho q^\sigma . \tag{5.83} \]
The difference in the parallel transport between the two paths is regulated by the Riemann tensor. In flat spacetime, the Riemann tensor vanishes, and, indeed, if we parallel transport a vector from one point to another the result is independent of the choice of the path.
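The same effect shows up for finite closed paths, which can be seen as the sum of many infinitesimal ones. As a rough numerical illustration, the sketch below integrates the parallel transport equation \(dV^\mu = - \varGamma ^\mu _{\nu \rho } V^\nu dx^\rho \) around a full coordinate circle, first on the Euclidean plane in polar coordinates (circle of constant r) and then on the unit 2-sphere (circle of constant \(\theta = \pi /3\)). The sphere, its Christoffel symbols, the Runge–Kutta integrator, and all names are assumptions made for this example.

```python
import math

def transport_around_circle(rhs, v0, steps=2000):
    """Integrate dV/ds = rhs(V) for s from 0 to 2*pi with the classical RK4 scheme."""
    ds = 2.0 * math.pi / steps
    v = list(v0)
    for _ in range(steps):
        k1 = rhs(v)
        k2 = rhs([v[i] + 0.5 * ds * k1[i] for i in range(2)])
        k3 = rhs([v[i] + 0.5 * ds * k2[i] for i in range(2)])
        k4 = rhs([v[i] + ds * k3[i] for i in range(2)])
        v = [v[i] + ds * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6.0 for i in range(2)]
    return v

def plane_rhs(r):
    """Transport along theta at fixed r: dV^r = r V^theta dtheta, dV^theta = -(V^r / r) dtheta."""
    return lambda v: [r * v[1], -v[0] / r]

def sphere_rhs(theta):
    """Transport along phi at fixed theta on the unit sphere, with components (V^theta, V^phi)."""
    s, c = math.sin(theta), math.cos(theta)
    return lambda v: [s * c * v[1], -(c / s) * v[0]]

start = [1.0, 0.0]
plane_end = transport_around_circle(plane_rhs(2.0), start)
sphere_end = transport_around_circle(sphere_rhs(math.pi / 3), start)
print(plane_end)   # back to the initial components (1, 0): flat space
print(sphere_end)  # close to (-1, 0): the vector does not return to itself
```

In the flat case the vector comes back unchanged, while on the sphere the transported vector differs from the initial one, as expected when the Riemann tensor does not vanish.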
5.4.3 Ricci Tensor and Scalar Curvature
From the Riemann tensor, we can define the Ricci tensor and the scalar curvature after contracting its indices. The Ricci tensor is a tensor of second order defined as
\[ R_{\mu \nu } = R^\lambda _{\,\,\mu \lambda \nu } . \tag{5.84} \]
The Ricci tensor is symmetric
\[ R_{\mu \nu } = R_{\nu \mu } . \tag{5.85} \]
Contracting the indices of the Ricci tensor we obtain the scalar curvature
\[ R = R^\mu _{\,\,\mu } = g^{\mu \nu } R_{\mu \nu } . \tag{5.86} \]
With the Ricci tensor and the scalar curvature we can define the Einstein tensor as
\[ G_{\mu \nu } = R_{\mu \nu } - \frac{1}{2} g_{\mu \nu } R . \tag{5.87} \]
Since both \(R_{\mu \nu }\) and \(g_{\mu \nu }\) are symmetric tensors of second order, the Einstein tensor is a symmetric tensor of second order as well.
5.4.4 Bianchi Identities
The Bianchi Identities are two important identities involving the Riemann tensor. The First Bianchi Identity reads
\[ R^\mu _{\,\,\nu \rho \sigma } + R^\mu _{\,\,\rho \sigma \nu } + R^\mu _{\,\,\sigma \nu \rho } = 0 , \tag{5.88} \]
and it can be easily verified by using the explicit expression of the Riemann tensor in Eq. (5.71). Indeed we have
\[ R^\mu _{\,\,\nu \rho \sigma } + R^\mu _{\,\,\rho \sigma \nu } + R^\mu _{\,\,\sigma \nu \rho } = \left( \partial _\rho \varGamma ^\mu _{\nu \sigma } - \partial _\sigma \varGamma ^\mu _{\nu \rho } + \varGamma ^\kappa _{\nu \sigma } \varGamma ^\mu _{\rho \kappa } - \varGamma ^\kappa _{\nu \rho } \varGamma ^\mu _{\sigma \kappa } \right) + \left( \partial _\sigma \varGamma ^\mu _{\rho \nu } - \partial _\nu \varGamma ^\mu _{\rho \sigma } + \varGamma ^\kappa _{\rho \nu } \varGamma ^\mu _{\sigma \kappa } - \varGamma ^\kappa _{\rho \sigma } \varGamma ^\mu _{\nu \kappa } \right) + \left( \partial _\nu \varGamma ^\mu _{\sigma \rho } - \partial _\rho \varGamma ^\mu _{\sigma \nu } + \varGamma ^\kappa _{\sigma \rho } \varGamma ^\mu _{\nu \kappa } - \varGamma ^\kappa _{\sigma \nu } \varGamma ^\mu _{\rho \kappa } \right) = 0 . \tag{5.89} \]
The Second Bianchi Identity reads
\[ \nabla _\mu R^\kappa _{\,\,\lambda \nu \rho } + \nabla _\nu R^\kappa _{\,\,\lambda \rho \mu } + \nabla _\rho R^\kappa _{\,\,\lambda \mu \nu } = 0 . \tag{5.90} \]
The first term on the left hand side in Eq. (5.90) can be written as
\[ \nabla _\mu R^\kappa _{\,\,\lambda \nu \rho } = \nabla _\mu \left( \partial _\nu \varGamma ^\kappa _{\lambda \rho } - \partial _\rho \varGamma ^\kappa _{\lambda \nu } + \varGamma ^\sigma _{\lambda \rho } \varGamma ^\kappa _{\nu \sigma } - \varGamma ^\sigma _{\lambda \nu } \varGamma ^\kappa _{\rho \sigma } \right) . \tag{5.91} \]
If we choose a coordinate system in which the Christoffel symbols vanish at a certain point (this is always possible, see Sect. 6.4.2), at that point Eq. (5.91) becomes
\[ \nabla _\mu R^\kappa _{\,\,\lambda \nu \rho } = \frac{\partial ^2 \varGamma ^\kappa _{\lambda \rho }}{\partial x^\mu \partial x^\nu } - \frac{\partial ^2 \varGamma ^\kappa _{\lambda \nu }}{\partial x^\mu \partial x^\rho } , \tag{5.92} \]
because in the case of vanishing Christoffel symbols the covariant derivative reduces to the partial one. The second and third terms on the left hand side in Eq. (5.90) read, respectively,
\[ \nabla _\nu R^\kappa _{\,\,\lambda \rho \mu } = \frac{\partial ^2 \varGamma ^\kappa _{\lambda \mu }}{\partial x^\nu \partial x^\rho } - \frac{\partial ^2 \varGamma ^\kappa _{\lambda \rho }}{\partial x^\nu \partial x^\mu } , \tag{5.93} \]
\[ \nabla _\rho R^\kappa _{\,\,\lambda \mu \nu } = \frac{\partial ^2 \varGamma ^\kappa _{\lambda \nu }}{\partial x^\rho \partial x^\mu } - \frac{\partial ^2 \varGamma ^\kappa _{\lambda \mu }}{\partial x^\rho \partial x^\nu } . \tag{5.94} \]
The Second Bianchi Identity can thus be written as
\[ \frac{\partial ^2 \varGamma ^\kappa _{\lambda \rho }}{\partial x^\mu \partial x^\nu } - \frac{\partial ^2 \varGamma ^\kappa _{\lambda \nu }}{\partial x^\mu \partial x^\rho } + \frac{\partial ^2 \varGamma ^\kappa _{\lambda \mu }}{\partial x^\nu \partial x^\rho } - \frac{\partial ^2 \varGamma ^\kappa _{\lambda \rho }}{\partial x^\nu \partial x^\mu } + \frac{\partial ^2 \varGamma ^\kappa _{\lambda \nu }}{\partial x^\rho \partial x^\mu } - \frac{\partial ^2 \varGamma ^\kappa _{\lambda \mu }}{\partial x^\rho \partial x^\nu } = 0 . \tag{5.95} \]
Since the left hand side is a tensor, if all its components vanish in a certain coordinate system they vanish in any coordinate system, and this concludes the proof of the identity.
From the Second Bianchi Identity we find that the covariant divergence of the Einstein tensor vanishes. As we will see in Sect. 7.1, this is of fundamental importance in Einstein’s gravity. If we multiply the Second Bianchi Identity in Eq. (5.90) by \(g^\nu _\kappa \) and we sum over the indices \(\kappa \) and \(\nu \), we find
\[ \nabla _\mu R_{\lambda \rho } - \nabla _\rho R_{\lambda \mu } + \nabla _\kappa R^\kappa _{\,\,\lambda \rho \mu } = 0 . \tag{5.96} \]
We multiply by \(g^{\lambda \rho }\) and we sum over the indices \(\lambda \) and \(\rho \)
\[ \nabla _\mu R - \nabla _\rho R^\rho _{\,\,\mu } - \nabla _\kappa R^\kappa _{\,\,\mu } = 0 . \tag{5.97} \]
This is equivalent to
\[ \nabla _\mu G^{\mu \nu } = \nabla _\mu \left( R^{\mu \nu } - \frac{1}{2} g^{\mu \nu } R \right) = 0 . \tag{5.98} \]
Notes
1. A test-particle must have a sufficiently small mass, size, etc. such that its mass does not significantly alter the background gravitational field, tidal forces can be ignored, etc.
Reference
[1] M.H. Protter, C.B. Morrey, A First Course in Real Analysis (Springer, New York, 1991)
Problems
5.1
Write the components of the following tensors:
5.2
Write the non-vanishing components of the Riemann tensor, the Ricci tensor, and the scalar curvature for the Minkowski spacetime in spherical coordinates.
5.3
Check that the Ricci tensor is symmetric.
© 2018 Springer Nature Singapore Pte Ltd.
Bambi, C. (2018). Riemannian Geometry. In: Introduction to General Relativity. Undergraduate Lecture Notes in Physics. Springer, Singapore. https://doi.org/10.1007/978-981-13-1090-4_5
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-1089-8
Online ISBN: 978-981-13-1090-4