1 Introduction

Using spacetime algebra [7, 10] in an essential way, Cambridge physicists Lasenby, Doran and Gull have created an impressive new Gauge Theory of Gravity (GTG) based on flat spacetime [1, 15]. In my opinion, GTG is a huge improvement over the standard tensor treatment of Einstein’s theory of General Relativity (GR), both in conceptual clarity and in computational power [11]. However, as the prevailing preference among physicists is for a curved-space version of GR, a debate about the relative merits of flat-space and curved-space versions will no doubt be needed to change the minds of many. This paper aims to contribute to that debate by providing a conceptual and historical bridge between curved and flat space theories couched in the unifying language of geometric algebra.

This article sketches the extension of geometric algebra to a geometric calculus (GC) that includes the tools of differential geometry needed for a curved-space version of GR. My purpose is to demonstrate the unique geometrical insight and computational power that GC brings to GR, and to introduce mathematical tools that are ready for use in research and teaching [21]. I presume that the reader has some familiarity with standard treatments of GR as well as with geometric algebra as presented in any of the above references, so certain concepts, notations and results developed there are taken for granted here. Additional mathematical tools introduced herein are sufficient to treat any topic in GR with GC.

This article introduces three different formulations of GR in terms of a unified GC that integrates them into a system of alternative approaches. The first is a coordinate-based formulation that facilitates translation to and from the standard tensor formulation of GR [25]. The second is a deeper gauge theory formulation that is the main concern of this paper. The third is an embedding formulation that deserves mention but will not be elaborated here. Although our focus is on GR, it should be recognized that the mathematical tools of GC are applicable to any problem in differential geometry.

Recognition that GR should be formulated as a gauge theory has been a long time coming, and it is still relegated to a subtopic in most GR textbooks, in part because the standard covariant tensor formalism is not well suited to gauge theory. Still less is it recognized that there is a connection between gravitational gauge transformations and Einstein’s Principle of Equivalence. Gauge theory is the one strong conceptual link between GR and quantum mechanics, if only because it is essential for incorporating the Dirac equation into GR [4, 13]. This is sufficient reason to bring gauge theory to the fore in the formulation of GR.

This article demonstrates that GC is conceptually and computationally ideal for a gauge theory approach to GR: conceptually ideal, because concepts of vector and spinor are integrated by the geometric product into its mathematical foundations; computationally ideal, because computations can be done without coordinates. Much of this article is devoted to demonstrating the efficiency of GC in computations.

On the foundational level, GC and gauge theory provide us with new conceptual resources for reexamining the physical interpretation of GR, in particular, the much-debated Principles of Relativity and Equivalence. The analysis leads to new views on the notions of Special and General Relativity as well as the relation of theory to measurement. The result is a new Gauge Principle of Equivalence to serve as the cornerstone for the GC formulation of GR. It is instructive to compare the GR formulation of gauge equivalence given herein with the apparently quite different formulation in GTG [11] to see how subtle is the difference between passive and active interpretations of equivalent transformations.

Finally, to facilitate detailed comparison of flat space and curved space formulations of differential geometry and GR with GC, the correspondence between basic quantities is summarized in an Appendix. The details are sufficient to prove equivalence of these alternative formulations, though no formal proof is given.

2 Spacetime Models

Every real entity has a definite location in space and time — this is the fundamental criterion for existence assumed by every scientific theory. In Einstein’s Theory of Relativity, the spacetime of real physical entities is a 4-dimensional continuum modeled mathematically by a 4D differentiable manifold \({\mathcal M}^4\). As described in [10], in the Theory of Special Relativity \({\mathcal M}^4\) is identified with a 4D Minkowski vector space \({\mathcal V}^4\). This makes it a flat space model of spacetime. In this model, spacetime points and vector fields are elements of the same vector space. The Theory of General Relativity (GR) employs a curved space model of spacetime, which places points and vector fields in different spaces. Our primary task is to describe how to do that with GC.

In the standard definition of a differentiable manifold, coordinates play an essential role. Although GC enables a coordinate-free formulation, we begin with a coordinate-based definition of the spacetime manifold, because that provides the most direct connection to standard practice. Moreover, coordinates are often useful for representing symmetries in vector fields.

To be specific, let x be a generic point in the spacetime manifold \({\mathcal M}^4 = \{x\}\), and suppose that a patch of the manifold is parametrized by a set of coordinates \(\{x^\mu ; \mu =0,1,2,3\}\), as expressed by

$$\begin{aligned} x=x(x^0,x^1,x^2,x^3)\,. \end{aligned}$$
(1)

The coordinate frame of tangent vectors \(g_{\mu }=g_{\mu }(x)\) to the coordinate curves parametrized by the \(\{x^\mu \}\) is then given by

$$\begin{aligned} g_\mu = \partial _\mu x =\frac{\partial {}x}{\partial {}x^\mu }\,. \end{aligned}$$
(2)

At each point x the vectors \(g_{\mu }(x)\) provide a basis for a vector space \({\mathcal V}^4(x)\) called the tangent space to \({\mathcal M}^4\) at x. The vectors in \({\mathcal V}^4(x)\) do not lie in \({\mathcal M}^4\). To visualize that, think of a 2D surface such as a sphere \({\mathcal M}^2\) embedded in the 3D vector space \({\mathcal V}^3\). The tangent space \({\mathcal V}^2(x)\) at each point x on the surface is the 2D plane of vectors tangent to the surface at x [17].

At this point we part company with standard treatments of GR by presuming that the tangent vectors at each point x generate a Minkowski geometric algebra \({\mathcal G}_4(x)={\mathcal G}({\mathcal V}^4(x))\) called the tangent algebra at x. Consequently, the inner product of coordinate tangent vectors \(g_{\mu }=g_{\mu }(x)\) generates the components \(g_{\mu \nu }=g_{\mu \nu }(x)\) of the usual metric tensor in GR, that is,

$$\begin{aligned} g_\mu \varvec{\,\cdot \,}g_\nu ={\textstyle \frac{1}{2}} (g_\mu g_\nu +g_\nu g_\mu )=g_{\mu \nu }\,. \end{aligned}$$
(3)

Thus, all the rich structure of the spacetime algebra developed in [10] is inherited by the tangent algebras on the spacetime manifold \({\mathcal M}^4\). This defines a generalized spacetime algebra (STA) of multivector and spinor fields on the whole manifold.

Such fields are inherently geometrical, so they provide raw material for representing real physical entities as geometric objects. It remains to be seen if this material is sufficient for the purposes of physics. As demonstrated in the following sections, the STA of the spacetime manifold carries us a long way towards the ideal of inherently geometrical physics.

One great advantage of STA is that it enables coordinate-free formulation of multivector fields and field equations. To relate that to the coordinate-based formulation of standard tensor calculus, we return to our discussion of coordinates. The inverse mapping of (1) is a set of scalar-valued functions

$$\begin{aligned} x^\mu =x^\mu (x)\, \end{aligned}$$
(4)

defined on the manifold \({\mathcal M}^4\). The gradients of these functions are vector fields

$$\begin{aligned} g^\mu =g^\mu (x)=\bigtriangledown x^\mu \,, \end{aligned}$$
(5)

where \(\bigtriangledown =\partial _x\) is the derivative with respect to the spacetime point x. It follows that

$$\begin{aligned} g_\mu \varvec{\,\cdot \,}g^\nu =\delta ^\nu _\mu \qquad \hbox {or}\qquad g_\mu =g_{\mu \nu }g^\nu \,, \end{aligned}$$
(6)

where the standard summation convention on repeated indices is used. Accordingly, we say that the coordinate coframe \(\{g^\nu \}\) is “algebraically reciprocal” to the coordinate frame \(\{g_\mu \}\).
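The reciprocity relation (6) can be checked numerically in a simple case. The sketch below assumes a flat Euclidean plane with polar coordinates rather than spacetime, and the function names (`coordinate_frame`, `reciprocal_frame`) are hypothetical helpers chosen for illustration. It builds the coordinate frame from (2), inverts the metric (3), and forms \(g^\mu =g^{\mu \nu }g_\nu \).

```python
import math

def coordinate_frame(r, phi):
    """Tangent vectors g_r, g_phi of x(r, phi) = (r cos phi, r sin phi), as in Eq. (2)."""
    g_r = (math.cos(phi), math.sin(phi))
    g_phi = (-r * math.sin(phi), r * math.cos(phi))
    return [g_r, g_phi]

def dot(a, b):
    """Euclidean inner product (an assumption; spacetime would use a Minkowski product)."""
    return sum(x * y for x, y in zip(a, b))

def reciprocal_frame(frame):
    """Build the coframe g^mu = g^{mu nu} g_nu for a 2D frame by inverting the metric."""
    g = [[dot(a, b) for b in frame] for a in frame]          # metric components g_{mu nu}
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    g_inv = [[g[1][1] / det, -g[0][1] / det],
             [-g[1][0] / det, g[0][0] / det]]                # inverse metric g^{mu nu}
    return [tuple(sum(g_inv[m][n] * frame[n][k] for n in range(2)) for k in range(2))
            for m in range(2)]

frame = coordinate_frame(2.0, 0.7)
coframe = reciprocal_frame(frame)
# delta[m][n] holds g_m . g^n, which should approximate the identity matrix.
delta = [[dot(frame[m], coframe[n]) for n in range(2)] for m in range(2)]
print(delta)
```

In this example the coframe vectors agree with the gradients of the coordinate functions, \(\bigtriangledown r\) and \(\bigtriangledown \phi \), as (5) requires.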

This algebraic reciprocity facilitates decomposition of a vector field \(a =a(x)\) into its covariant components \(a_\mu = a\varvec{\,\cdot \,}g_\mu \) or its contravariant components \(a^\mu = a\varvec{\,\cdot \,}g^\mu \); thus,

$$\begin{aligned} a=a^\mu g_\mu =a_\mu g^\mu \,. \end{aligned}$$
(7)

Likewise, a bivector \(F=F(x)\) has the expansion

$$\begin{aligned} F = {\textstyle \frac{1}{2}} F^{\mu \nu }g_\mu \wedge g_\nu \,, \end{aligned}$$
(8)

with its “scalar components” \(F^{\mu \nu }\) given by

$$\begin{aligned} F^{\mu \nu }=g^\mu \varvec{\,\cdot \,}F\varvec{\,\cdot \,}g^\nu =g^\nu \varvec{\,\cdot \,}( g^\mu \varvec{\,\cdot \,}F)= (g^\nu \wedge g^\mu )\varvec{\,\cdot \,}F\,. \end{aligned}$$
(9)

Similarly, the gradient operator can be defined in terms of partial derivatives by

$$\begin{aligned} \bigtriangledown = g^\mu \partial _\mu , \end{aligned}$$
(10)

or vice-versa by

$$\begin{aligned} \partial _\mu =\frac{\partial }{\partial {}x^\mu }=g_\mu \varvec{\,\cdot \,}\bigtriangledown \,. \end{aligned}$$
(11)

The action of these operators on scalars is well defined, but differentiation of vectors on a curved manifold requires additional considerations, to which we now turn.

3 Coderivative and Curvature

On flat spacetime the vector derivative \(\bigtriangledown =\partial _x\) is the only differential operator we need. For curved spacetime, we introduce the vector coderivative D as an intrinsic version of \(\bigtriangledown \). Operating on a scalar field \(\phi =\phi (x)\), the two operators are equivalent:

$$\begin{aligned} D\phi =\bigtriangledown \phi \,. \end{aligned}$$
(12)

Like the directional derivative \(\partial _\mu =g_\mu \varvec{\,\cdot \,}\bigtriangledown \), the directional coderivative \(D_\mu =g_\mu \varvec{\,\cdot \,}D\) is a “scalar differential operator” that maps vectors into vectors. Accordingly, we can write

$$\begin{aligned} D_\mu g_\nu = L^\alpha _{\mu \nu }g_\alpha \,, \end{aligned}$$
(13)

which merely expresses the derivative as a linear combination of basis vectors. This defines the so-called coefficients of connexion \(L^\alpha _{\mu \nu }\) for the frame \(\{g_\nu \}\). By differentiating (6), we find the complementary equation

$$\begin{aligned} D_\mu g^\alpha =-L^\alpha _{\mu \nu }g^\nu \,. \end{aligned}$$
(14)

When the coefficients of connexion are known functions, the coderivative of any multivector field is determined.

Thus, for any vector field \(a=a^\nu g_\nu \) we have

$$\begin{aligned} D_\mu a=(D_\mu a^\nu )g_\nu +a^\nu (D_\mu g_\nu ). \end{aligned}$$

Then, since the \(a^\nu \) are scalars, we get

$$\begin{aligned} D_\mu a=(\partial _\mu a^\alpha +a^\nu L^\alpha _{\mu \nu })g_\alpha \,. \end{aligned}$$
(15)

Note that the coefficient in parentheses on the right is the standard expression for a “covariant derivative” in tensor calculus.
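As a hedged numerical illustration of (15), consider a constant vector field on the flat Euclidean plane expressed in polar coordinates; its coderivative should vanish even though its components vary. The sketch assumes the standard polar-coordinate coefficients \(L^r_{\phi \phi }=-r\) and \(L^\phi _{r\phi }=L^\phi _{\phi r}=1/r\), and the function names are hypothetical. Partial derivatives are approximated by central differences.

```python
import math

def connexion(r, phi):
    """Coefficients L[alpha][mu][nu] for polar coordinates (index 0 = r, 1 = phi)."""
    L = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    L[0][1][1] = -r          # L^r_{phi phi}
    L[1][0][1] = 1.0 / r     # L^phi_{r phi}
    L[1][1][0] = 1.0 / r     # L^phi_{phi r}
    return L

def components(r, phi):
    """Contravariant components (a^r, a^phi) of the constant Cartesian field a = e_x."""
    return [math.cos(phi), -math.sin(phi) / r]

def coderivative_components(r, phi, eps=1e-6):
    """Return D[mu][alpha] = d_mu a^alpha + a^nu L^alpha_{mu nu}, following Eq. (15)."""
    point = [r, phi]
    a = components(r, phi)
    L = connexion(r, phi)
    D = [[0.0, 0.0], [0.0, 0.0]]
    for mu in range(2):
        plus, minus = list(point), list(point)
        plus[mu] += eps
        minus[mu] -= eps
        a_plus, a_minus = components(*plus), components(*minus)
        for alpha in range(2):
            # Central-difference estimate of the partial derivative d_mu a^alpha.
            partial = (a_plus[alpha] - a_minus[alpha]) / (2 * eps)
            D[mu][alpha] = partial + sum(a[nu] * L[alpha][mu][nu] for nu in range(2))
    return D

D = coderivative_components(1.5, 0.4)
print(D)  # every entry should be zero up to finite-difference error
```

The partial derivatives alone do not vanish here; only their combination with the connexion terms does, which is the content of (15).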

The derivative of any sum or product of multivector fields is easily computed by noting that \(D_\mu \) is a scalar derivation, so it satisfies the usual Leibniz and distributive rules of a derivative. In fact, those rules were used in computing the derivative in (15).

At last we are prepared to define the vector coderivative by

$$\begin{aligned} D=g^\mu D_\mu \,. \end{aligned}$$
(16)

The “directional coderivative” with respect to any vector field \(a=a(x)\) can now be defined by

$$\begin{aligned} a\varvec{\,\cdot \,}D=a^\mu D_\mu \,. \end{aligned}$$
(17)

Both differential operators D and \(a\varvec{\,\cdot \,}D\) are coordinate free. Though they have been defined with respect to coordinates, they can often be evaluated without reference to coordinates.

Since D is a vectorial differential operator, we can use the coordinate free algebraic operations of STA to manipulate it in precisely the same way we did with \(\bigtriangledown \) in [10]. Thus, the coderivative of any k-vector field \(F=F(x)\) can be decomposed into a codivergence \(D\varvec{\,\cdot \,}F\) and a cocurl \(D\wedge F\), as expressed by

$$\begin{aligned} DF=D\varvec{\,\cdot \,}F+D\wedge F\,. \end{aligned}$$
(18)

If F is an electromagnetic bivector field, we have the obvious generalization of Maxwell’s equation to curved spacetime:

$$\begin{aligned} DF=J\,. \end{aligned}$$
(19)

As done for the vector derivative in [10], this can be decomposed into the vector and trivector equations

$$\begin{aligned} D\varvec{\,\cdot \,}F= & {} J\,, \end{aligned}$$
(20)
$$\begin{aligned} D\wedge F= & {} 0\,. \end{aligned}$$
(21)

From the last equation it is tempting to conclude that \(F=D\wedge A\), where A is a vector potential, but that depends on a property of D that remains to be proved.

To ascertain the geometric properties of the cocurl, we use (14) to obtain

$$\begin{aligned} D\wedge g^\mu =g^\alpha \wedge D_\alpha g^\mu =-g^\alpha \wedge g^\beta L^\mu _{\alpha \beta } \,. \end{aligned}$$
(22)

The quantity on the right side of this equation is called torsion. In the Riemannian geometry of GR torsion vanishes, so we leave the interesting consideration of nonzero torsion to another day. Considering the antisymmetry of the outer product on the right side of (22), we see that the torsion vanishes if and only if

$$\begin{aligned} L^\mu _{\alpha \beta } = L^\mu _{\beta \alpha } \,. \end{aligned}$$
(23)

This can be related to the metric tensor by considering

$$\begin{aligned} D_\mu g_{\alpha \beta }=\partial _\mu g_{\alpha \beta } =(D_\mu g_\alpha )\varvec{\,\cdot \,}g_\beta +g_\alpha \varvec{\,\cdot \,}(D_\mu g_\beta ) \,, \end{aligned}$$

whence

$$\begin{aligned} \partial _\mu g_{\alpha \beta }=g_{\alpha \nu } L^\nu _{\mu \beta } +g_{\beta \nu } L^\nu _{\mu \alpha } \,. \end{aligned}$$
(24)

Combining three copies of this equation with permuted free indices, we solve for

$$\begin{aligned} L^\mu _{\alpha \beta } = {\textstyle \frac{1}{2}} g^{\mu \nu }(\partial _\alpha g_{\beta \nu } + \partial _\beta g_{\alpha \nu }-\partial _\nu g_{\alpha \beta }) \,. \end{aligned}$$
(25)

This is the classical Christoffel formula for a Riemannian connexion.
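The elimination that leads to (25) can be spelled out as a short worked step. The lowered-index shorthand \(L_{\nu |\alpha \beta }\equiv g_{\nu \sigma }L^\sigma _{\alpha \beta }\) is introduced here only for bookkeeping and is not notation from the text; the symmetry (23) is assumed throughout. Writing (24) three times with its free indices permuted gives

$$\begin{aligned} \partial _\alpha g_{\beta \nu }&=L_{\beta |\alpha \nu }+L_{\nu |\alpha \beta }\,,\\ \partial _\beta g_{\alpha \nu }&=L_{\alpha |\beta \nu }+L_{\nu |\beta \alpha }\,,\\ \partial _\nu g_{\alpha \beta }&=L_{\alpha |\nu \beta }+L_{\beta |\nu \alpha }\,. \end{aligned}$$

Adding the first two lines and subtracting the third, the symmetry (23) cancels four of the terms in pairs, leaving

$$\begin{aligned} \partial _\alpha g_{\beta \nu }+\partial _\beta g_{\alpha \nu }-\partial _\nu g_{\alpha \beta }=2L_{\nu |\alpha \beta }=2g_{\nu \sigma }L^\sigma _{\alpha \beta }\,, \end{aligned}$$

and contraction with \({\textstyle \frac{1}{2}}g^{\mu \nu }\) reproduces (25).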

To understand the geometric meaning of vanishing torsion, it is helpful to define a torsion tensor

$$\begin{aligned} T(a, b)\equiv a\varvec{\,\cdot \,}Db -b\varvec{\,\cdot \,}Da -[a,b] \,, \end{aligned}$$
(26)

where \([\,a,b\,]\) is the Lie bracket of vector fields a and b defined by

$$\begin{aligned}{}[\,a,b\,]\equiv a\varvec{\,\cdot \,}\bigtriangledown b -b\varvec{\,\cdot \,}\bigtriangledown a \,. \end{aligned}$$
(27)

For a coordinate frame the torsion tensor reduces to

$$\begin{aligned} T(g_\mu ,g_\nu )= g_\mu \varvec{\,\cdot \,}Dg_\nu -g_\nu \varvec{\,\cdot \,}Dg_\mu \,, \end{aligned}$$
(28)

because \([\,g_\mu ,g_\nu \,]=(\partial _\mu \partial _\nu -\partial _\nu \partial _\mu )x=0\). From (28) we see that vanishing of the torsion tensor is equivalent to the symmetry condition (23) on the coefficients of connexion. Thus, from (26) we can conclude that vanishing torsion implies that

$$\begin{aligned}{}[\,a,b\,]= a\varvec{\,\cdot \,}Db -b\varvec{\,\cdot \,}Da \,. \end{aligned}$$
(29)

This relation between Lie bracket and coderivative plays an important role in the study of integrability on manifolds.
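A simple worked example may help to show what the Lie bracket (27) detects. Assume a flat Euclidean plane with polar coordinates \((r,\phi )\), so that \(D=\bigtriangledown \); the orthonormal vectors \(e_r\) and \(e_\phi \) are symbols introduced only for this illustration. With \(g_r=e_r\) and \(g_\phi =re_\phi \), where \(\partial _\phi e_r=e_\phi \) and \(\partial _re_\phi =0\), the coordinate frame gives

$$\begin{aligned}{}[\,g_r,g_\phi \,]=\partial _r(re_\phi )-\partial _\phi e_r=e_\phi -e_\phi =0\,, \end{aligned}$$

whereas the orthonormal frame, for which \(e_r\varvec{\,\cdot \,}\bigtriangledown =\partial _r\) and \(e_\phi \varvec{\,\cdot \,}\bigtriangledown =r^{-1}\partial _\phi \), gives

$$\begin{aligned}{}[\,e_r,e_\phi \,]=\partial _re_\phi -r^{-1}\partial _\phi e_r=-r^{-1}e_\phi \ne 0\,. \end{aligned}$$

Thus a normalized frame need not be a coordinate frame even when the space is flat; a nonvanishing bracket signals that the frame vectors are not tangent to any system of coordinate curves.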

To look at the significance of vanishing torsion from another angle, note that since \(g^\mu \) is the gradient of a scalar coordinate function, the equation

$$\begin{aligned} D\wedge g^\mu =0 \, \end{aligned}$$
(30)

is equivalent to the following general property of the coderivative:

$$\begin{aligned} D\wedge D\phi =D\wedge \bigtriangledown \phi =0 \,, \end{aligned}$$
(31)

where \(\phi =\phi (x)\) is any scalar field. This is actually an integrability condition for scalar fields, as seen by considering

$$\begin{aligned} D\wedge D\phi =D\wedge g^\mu \partial _\mu \phi =g^\mu \wedge \bigtriangledown \partial _\mu \phi =g^\nu \wedge g^\mu \partial _\nu \partial _\mu \phi =0 \,, \end{aligned}$$
(32)

whence

$$\begin{aligned} \partial _\nu \partial _\mu \phi =\partial _\mu \partial _\nu \phi \,. \end{aligned}$$
(33)

This commutativity of partial derivatives is the classical condition for integrability.

To investigate the integrability of vector fields, we differentiate (14) to get

$$\begin{aligned}{}[D_\mu ,D_\nu ]g^\alpha =R^\alpha _{\mu \nu \beta }g^\beta \,, \end{aligned}$$
(34)

where the operator commutator has the usual definition

$$\begin{aligned}{}[D_\mu ,D_\nu ]\equiv D_\mu D_\nu -D_\nu D_\mu \,, \end{aligned}$$
(35)

and

$$\begin{aligned} R^\alpha _{\mu \nu \beta }=\partial _\nu L^\alpha _{\mu \beta }-\partial _\mu L^\alpha _{\nu \beta } +L^\alpha _{\nu \sigma }L^\sigma _{\mu \beta }-L^\alpha _{\mu \sigma }L^\sigma _{\nu \beta } \,, \end{aligned}$$
(36)

is the usual tensor expression for the Riemannian curvature of the manifold. Vanishing of the curvature tensor is a necessary and sufficient condition for the manifold to be flat, in which case the coderivative reduces to the vector derivative of [10].

Using (30) we can recast the curvature Eq. (34) in terms of the coderivative:

$$\begin{aligned} D\wedge D g^\alpha ={\textstyle \frac{1}{2}} R^\alpha _{\mu \nu \beta }(g^\mu \wedge g^\nu )g^\beta \,. \end{aligned}$$
(37)

This can be analyzed further in the following way:

$$\begin{aligned} D^2g^\alpha =(D\varvec{\,\cdot \,}D+D\wedge D)g^\alpha =D(D\varvec{\,\cdot \,}g^\alpha +D\wedge g^\alpha ) \,. \end{aligned}$$
(38)

Hence, using (30), we obtain

$$\begin{aligned} (D\wedge D)g^\alpha =D(D\varvec{\,\cdot \,}g^\alpha )-(D\varvec{\,\cdot \,}D)g^\alpha \,. \end{aligned}$$
(39)

The right hand side of this equation has only a vector part; hence the trivector part of (37) vanishes to give us

$$\begin{aligned} D\wedge D\wedge g^\alpha = {\textstyle \frac{1}{2}} R^\alpha _{\mu \nu \beta }(g^\mu \wedge g^\nu \wedge g^\beta )=0\,. \end{aligned}$$
(40)

This is equivalent to the well known symmetry property of the curvature tensor:

$$\begin{aligned} R^\alpha _{\mu \nu \beta }+R^\alpha _{\beta \mu \nu }+R^\alpha _{\nu \beta \mu }=0\,. \end{aligned}$$
(41)

However, its deep significance is that it implies

$$\begin{aligned} D\wedge D\wedge A=0 \end{aligned}$$
(42)

for any k-vector field \(A=A(x)\). This answers the question raised above about the existence of a vector potential for the electromagnetic field. It is a consequence of the condition (30) for vanishing torsion.
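The passage from (40) to (41) can be made explicit. Since \(g^\mu \wedge g^\nu \wedge g^\beta \) is totally antisymmetric in its indices, only the totally antisymmetric part of the coefficient \(R^\alpha _{\mu \nu \beta }\) survives the sum in (40). The bracket notation below for antisymmetrization is standard tensor shorthand, not notation used in the text. Because \(R^\alpha _{\mu \nu \beta }\) is already antisymmetric in \(\mu \nu \) by (34), the six-term antisymmetrization collapses to a cyclic sum:

$$\begin{aligned} R^\alpha _{[\mu \nu \beta ]}={\textstyle \frac{1}{3}}\left( R^\alpha _{\mu \nu \beta }+R^\alpha _{\nu \beta \mu }+R^\alpha _{\beta \mu \nu }\right) \,. \end{aligned}$$

The trivectors \(g^\mu \wedge g^\nu \wedge g^\beta \) with \(\mu<\nu <\beta \) are linearly independent, so (40) holds only if this antisymmetric part vanishes, which is the cyclic identity (41).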

Equation (37) reduces to

$$\begin{aligned} D\wedge D g^\alpha =(D\wedge D)\varvec{\,\cdot \,}g^\alpha =R^\alpha _\beta g^\beta \,, \end{aligned}$$
(43)

where

$$\begin{aligned} R^\alpha _\beta =R^\alpha _{\beta \mu \nu }g^{\mu \nu } \, \end{aligned}$$
(44)

is the standard Ricci tensor. Comparing (43) with (39), we get the following provocative form for the Ricci tensor:

$$\begin{aligned} R(g^\alpha )\equiv R^\alpha _\beta g^\beta = D(D\varvec{\,\cdot \,}g^\alpha )-(D\varvec{\,\cdot \,}D)g^\alpha \,. \end{aligned}$$
(45)

We return to this later.

4 Gauge Principle of Equivalence

General Relativity is a theory of spacetime measurement. Any measurement of distance or direction in spacetime is a comparison of events with a standard, and for that purpose over an extended region a reference system is set up. In Special Relativity theory that purpose is met by inertial reference frames and encoded in the Principle of Relativity, which holds that the laws of physics (or measurements, if you will) are equivalent with respect to all inertial frames. A more precise formulation of this principle is that the equations of physics are Lorentz invariant, that is, invariant (or covariant) under Lorentz rotations.

In creating GR, Einstein struggled to find a suitable generalization of the Relativity Principle, and he formulated his conclusions in his Principle of Equivalence. However, the theoretical significance and physical meaning of the Equivalence Principle have remained intensely controversial to this day [14]. We are speaking here about the so-called “Strong Principle of Equivalence.” The “Weak Principle of Equivalence,” expressed by the equivalence of gravitational and inertial mass, is not problematic. The Strong Principle is vaguely described as equivalence of gravitational forces to accelerating systems. However, the tools of GC enable us to make a more general and precise formulation of the Principle that preserves the spirit if not the content of Einstein’s thinking.

Confusion about the Equivalence Principle can be traced to failure to make crucial distinctions between reference frames and coordinate systems. At a single spacetime point a reference frame can be unambiguously defined as an orthonormal frame of vectors \(\{\gamma _\mu \}\), which serve as a local standard for measurements of length and direction. This can be extended to a differentiable field of orthonormal vectors \(\{\gamma _\mu =\gamma _\mu (x)\}\), which I call a “fiducial frame” or fiducial frame field to emphasize its role as a standard for measurement [9, 12]. It can be regarded as a generalization of “inertial frame” to curved spacetime, and visualized as a field of idealized rigid bodies at each point.

In contrast to the concept of a reference system as a fiducial frame field, a coordinate system is merely a means for labelling events, so it does not involve any spacetime geometry without additional assumptions. In Special Relativity, the terms “inertial coordinates” and “inertial frames” are often used interchangeably. Indeed, the standard choice of rectangular coordinates satisfies both coordinate and frame criteria for a reference system. However, this possibility is unique to flat spacetime. As can be proved with the mathematical apparatus developed below, on curved spacetime a fiducial frame cannot be identified with a coordinate frame, because it is a nonintegrable (or nonholonomic) system of vector fields. Vanishing of the curvature tensor is a necessary and sufficient condition for integrability of fiducial frames. Indeed, we shall see how to calculate the curvature tensor from inertial frames.

With identification of fiducial frames as the appropriate generalization of inertial frames, the generalization of the Special Relativity Principle is now fairly obvious. We simply require equivalence of physics with respect to all fiducial frames. To mathematize this idea, we note that any given fiducial frame \(\{\gamma _\mu \}\) is related to any other fiducial frame \(\{\gamma '_\mu \}\) by a differentiable Lorentz rotation \(\underline{R}\), which we know from [10] has the canonical form

$$\begin{aligned} \gamma '_\mu =\underline{R}(\gamma _\mu ) = R \gamma _\mu {\tilde{R}}\,, \end{aligned}$$
(46)

where the underbar indicates that \(\underline{R}\) is a linear operator, and \(R=R(x)\) is a differentiable rotor field with the normalization

$$\begin{aligned} R{\tilde{R}}= 1\,. \end{aligned}$$
(47)

We can now formulate the

Gauge Principle of Equivalence (GPE): The equations of physics are invariant under Lorentz rotations relating fiducial frames.

In other words, with respect to fiducial frames all physical measurements are equivalent.

To justify its name we need to establish that the GPE is indeed a “gauge principle” and that it is a suitable generalization of Einstein’s Principle of Equivalence. First, in contrast to the Special Relativity Principle that it generalizes, the GPE is indeed a gauge principle because it requires invariance under a position dependent symmetry group, the group of local Lorentz rotations (46). We show below that this is just what is needed to determine the form of gravitational interactions. Second, we note that the Lorentz rotation in (46) can be chosen to be a position dependent boost to a frame that is “accelerating” with respect to the inertial frame, just as Einstein had contemplated in his version of the Equivalence Principle. Later we show how to generalize the local cancellation of apparent gravitational effects noted in his analysis.

We are now in position to conclude that Einstein’s analysis was deficient in two respects: first, in overlooking the crucial distinction between reference frames and coordinate systems; second, in analysis that was too limited to ascertain the full gauge group. Still, we see here one more example of Einstein’s astounding physical intuition in recognizing seeds of an important physical principle before it is given an adequate mathematical formulation.

The above analysis of reference frames and the Equivalence Principle suffices to motivate a reformulation of General Relativity with fiducial frames and the GPE at the foundation. First, some definitions and conventions are needed to streamline the formulation of basic formulas and theorems. The orthonormality of a fiducial frame \(\{\gamma _\mu =\gamma _\mu (x)\}\) is conveniently expressed by

$$\begin{aligned} \gamma _\mu \varvec{\,\cdot \,}\gamma _\nu =\eta _\mu \delta _{\mu \nu } \,, \end{aligned}$$
(48)

where \(\eta _\mu =\gamma _\mu ^2\) is the signature indicator. The reciprocal frame \(\gamma ^\mu \) is then simply given by

$$\begin{aligned} \gamma ^\mu =\eta _\mu \gamma _\mu \,. \end{aligned}$$
(49)

Of course, we assume that the fiducial frame is right-handed, so

$$\begin{aligned} i=\gamma _0\gamma _1\gamma _2\gamma _3 \, \end{aligned}$$
(50)

where \(i=i(x)\) is the right-handed unit pseudoscalar for the tangent space at x.

Any specified fiducial frame \(\{\gamma _\mu \}\) is related to a specified coordinate frame \(\{g_\mu \}\) by a differentiable linear transformation \(\underline{h}\) called the fiducial tensor:

$$\begin{aligned} g_\mu =\underline{h}(\gamma _\mu )=h^\nu _\mu \gamma _\nu \,. \end{aligned}$$
(51)

The matrix elements of the linear operator \(\underline{h}\) are

$$\begin{aligned} h^\nu _\mu = \gamma ^\nu \varvec{\,\cdot \,}\underline{h}(\gamma _\mu )=\gamma ^\nu \varvec{\,\cdot \,}g_\mu = \bar{h}(\gamma ^\nu )\varvec{\,\cdot \,}\gamma _\mu =g^\nu \varvec{\,\cdot \,}\gamma _\mu \,, \end{aligned}$$
(52)

which shows that the adjoint of \(\underline{h}\), denoted by \(\bar{h}\), is

$$\begin{aligned} g^\nu =\bar{h}(\gamma ^\nu )=h^\nu _\mu \gamma ^\mu \,. \end{aligned}$$
(53)

The fiducial tensor is related to the metric tensor by

$$\begin{aligned} g_{\mu \nu }=g_\mu \varvec{\,\cdot \,}g_\nu =\underline{h}(\gamma _\mu )\varvec{\,\cdot \,}\underline{h}(\gamma _\nu ) =h^\alpha _\mu \eta _\alpha h^\alpha _\nu \,. \end{aligned}$$
(54)

Alternatively, we can write

$$\begin{aligned} g_{\mu \nu }=\gamma _\mu \varvec{\,\cdot \,}\bar{h}\underline{h}(\gamma _\nu )=\gamma _\mu \varvec{\,\cdot \,}g(\gamma _\nu ) \,, \end{aligned}$$
(55)

expressing the metric tensor as a symmetric linear transformation \(g=\bar{h}\underline{h}=\underline{h}\bar{h}\) on the fiducial frame. This shows that the metric tensor can be replaced by the fiducial tensor as a fundamental geometric object on spacetime. In the present formulation of GR, the role of the fiducial tensor is to tie fiducial frames to the spacetime manifold by relating them to coordinate frames.
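The relation (54) between fiducial tensor and metric is easy to evaluate in components. The sketch below assumes the signature \((+,-,-,-)\) for \(\eta _\alpha \) and, purely as an illustration, uses spherical coordinates \((t,r,\theta ,\phi )\) on flat spacetime, for which \(g_t=\gamma _0\), \(g_r=\gamma _1\), \(g_\theta =r\gamma _2\), \(g_\phi =r\sin \theta \,\gamma _3\). The function names are hypothetical.

```python
import math

ETA = [1.0, -1.0, -1.0, -1.0]   # signature indicator eta_alpha, assuming (+,-,-,-)

def metric_from_fiducial(h):
    """g_{mu nu} = sum over alpha of h[alpha][mu] * eta_alpha * h[alpha][nu], as in Eq. (54)."""
    n = len(ETA)
    return [[sum(h[a][mu] * ETA[a] * h[a][nu] for a in range(n))
             for nu in range(n)] for mu in range(n)]

def spherical_fiducial(r, theta):
    """Matrix h[alpha][mu] of Eq. (51) for spherical coordinates on flat spacetime (illustrative)."""
    diag = [1.0, 1.0, r, r * math.sin(theta)]
    return [[diag[a] if a == mu else 0.0 for mu in range(4)] for a in range(4)]

g = metric_from_fiducial(spherical_fiducial(2.0, math.pi / 2))
for row in g:
    print(row)
```

The result is the familiar diagonal metric with entries \(1,-1,-r^2,-r^2\sin ^2\theta \). In this sketch the first index of `h` labels the fiducial vector and the second the coordinate vector, matching \(h^\nu _\mu =\gamma ^\nu \varvec{\,\cdot \,}g_\mu \) in (52).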

We are now ready to investigate implications of the GPE. To achieve the gauge invariant equations required by the GPE, we need to define a gauge invariant derivative or, as we shall say, a coderivative. It turns out to be the same as the “coderivative” D defined in the last Section, but its physical significance is clarified, and its mathematical form is significantly improved. As before, it will be convenient to define the directional coderivative \(D_\mu =g_\mu \varvec{\,\cdot \,}D\) first.

Since the fiducial frame \(\{\gamma _\nu \}\) can only rotate under displacement, we know from [10] that its directional derivatives necessarily have the form

$$\begin{aligned} D_\mu \gamma _\nu =\omega _\mu \varvec{\,\cdot \,}\gamma _\nu \,, \end{aligned}$$
(56)

where \(\omega _\mu =\omega (g_\mu )\) is a bivector-valued “rotational velocity” for displacements in the \(g_\mu \) direction. Let us call it the fiducial connexion for the frame \(\{\gamma _\nu \}\). Sect. 7 will make it clear that \(\omega _\mu \) is equivalent to the “spin connexion” in conventional GR. Thus, the same connexion is used here for both vector and spinor fields – a noteworthy simplification over conventional theory.

Generalizing (56), we define action of the operator \(D_\mu \) on an arbitrary multivector field \(M=M(x)\) by

$$\begin{aligned} D_\mu M=\partial _\mu M+\omega _\mu \times M \,, \end{aligned}$$
(57)

where the commutator product of A and B is defined by

$$\begin{aligned} A\times B={\textstyle \frac{1}{2}} (AB-BA) \,, \end{aligned}$$
(58)

and it is assumed that

$$\begin{aligned} \partial _\mu \gamma _\nu =0 \,, \end{aligned}$$
(59)

so the partial derivative \(\partial _\mu =g_\mu \varvec{\,\cdot \,}\bigtriangledown \) operates only on scalar components of M relative to the fiducial basis.
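A small worked case shows how the commutator product (58) acts when a bivector meets a vector. Assume the usual STA signature, so that \(\gamma _1^2=\gamma _2^2=-1\) and distinct frame vectors anticommute; the specific bivector \(\gamma _1\gamma _2\) is chosen only for illustration. Then

$$\begin{aligned} (\gamma _1\gamma _2)\times \gamma _2&={\textstyle \frac{1}{2}}(\gamma _1\gamma _2\gamma _2-\gamma _2\gamma _1\gamma _2)=\gamma _1\gamma _2^2=-\gamma _1\,,\\ (\gamma _1\gamma _2)\times \gamma _1&={\textstyle \frac{1}{2}}(\gamma _1\gamma _2\gamma _1-\gamma _1\gamma _1\gamma _2)=-\gamma _1^2\gamma _2=\gamma _2\,,\\ (\gamma _1\gamma _2)\times \gamma _0&={\textstyle \frac{1}{2}}(\gamma _1\gamma _2\gamma _0-\gamma _0\gamma _1\gamma _2)=0\,. \end{aligned}$$

In each case the result is again a vector and coincides with the inner product of the bivector with the vector, so applying (57) to \(M=\gamma _\nu \) with the assumption (59) recovers (56). A connexion proportional to \(\gamma _1\gamma _2\) therefore turns \(\gamma _1\) and \(\gamma _2\) into one another and leaves \(\gamma _0\) and \(\gamma _3\) unchanged.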

To manifest the relation of definition (57) to our previous definition, we apply it to coordinate frame vectors \( g_\nu =h^\alpha _\nu \gamma _\alpha \) and compare with (13) to get

$$\begin{aligned} D_\mu g_\nu =L^\alpha _{\mu \nu }g_\alpha =(\partial _\mu h^\alpha _\nu )\gamma _\alpha +h^\alpha _\nu \omega _\mu \varvec{\,\cdot \,}\gamma _\alpha \,. \end{aligned}$$
(60)

This equation establishes equivalence of the connexion for a coordinate frame to the connexion for a fiducial frame, but we have no more need for the coordinate connexion except to relate to literature that uses it.

Now the GPE requires invariance of \(D_\mu \) under “change of gauge” to a different fiducial frame, as specified by Eq. (46). To ascertain necessary and sufficient conditions for gauge invariance, we differentiate (46) to get

$$\begin{aligned} D_\mu \gamma '_\nu&=(\partial _\mu R)\gamma _\nu {\tilde{R}}+R\gamma _\nu \partial _\mu {\tilde{R}}+R(\omega _\mu \times \gamma _\nu ){\tilde{R}}\nonumber \\&=2\,[(\partial _\mu R){\tilde{R}}+{\textstyle \frac{1}{2}} R\omega _\mu {\tilde{R}}]\times (R\gamma _\nu {\tilde{R}})\,, \end{aligned}$$
(61)

where we have used \((\partial _\mu R){\tilde{R}}=-R\partial _\mu {\tilde{R}}\), which follows from differentiating \(R{\tilde{R}}=1\). It follows that

$$\begin{aligned} D_\mu \gamma '_\nu =\omega '_\mu \times \gamma '_\nu \, \end{aligned}$$
(62)

provided that

$$\begin{aligned} \omega '_\mu =R\omega _\mu {\tilde{R}}+2(\partial _\mu R){\tilde{R}}\,. \end{aligned}$$
(63)

In other words, the directional coderivative \(D_\mu \) is invariant under a change of fiducial frame, as specified by the local Lorentz rotation (46), provided the change of fiducial connexion is given by Eq. (63).
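The inhomogeneous term in (63) can be evaluated in closed form for a simple family of rotors. As an illustrative assumption, take \(R=e^{\frac{1}{2}\alpha B}\), where \(\alpha =\alpha (x)\) is a scalar field and \(B\) is a unit bivector with constant components on the fiducial frame, so that \(\partial _\mu B=0\) by (59); the symbols \(\alpha \) and \(B\) are introduced only for this sketch. Since \(B\) commutes with \(R\),

$$\begin{aligned} \partial _\mu R={\textstyle \frac{1}{2}}(\partial _\mu \alpha )BR\qquad \hbox {so that}\qquad 2(\partial _\mu R){\tilde{R}}=(\partial _\mu \alpha )B\,, \end{aligned}$$

and the transformation law (63) becomes

$$\begin{aligned} \omega '_\mu =R\omega _\mu {\tilde{R}}+(\partial _\mu \alpha )B\,. \end{aligned}$$

For constant \(\alpha \) the connexion merely rotates along with the frame, whereas a position dependent angle or rapidity contributes an additional bivector term proportional to its gradient. The computation holds whether \(B^2=-1\) (a spatial rotation) or \(B^2=+1\) (a boost).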

This completes our definition of the coderivative to satisfy the GPE. The definition refers to a coordinate frame only to exploit the well understood properties of partial derivatives. That inessential reference is eliminated in the following definition of the directional coderivative \(a\varvec{\,\cdot \,}D\) with respect to an arbitrary vector field \(a=a(x)=a^\mu g_\mu \):

$$\begin{aligned} a\varvec{\,\cdot \,}DM=a\varvec{\,\cdot \,}\bigtriangledown M+\omega (a)\times M \,, \end{aligned}$$
(64)

where \(\omega (a)=a^\mu \omega _\mu \) is the connexion for any chosen fiducial frame \(\{\gamma _\mu \}\), and \(a\varvec{\,\cdot \,}\bigtriangledown \) is the directional derivative of any scalar coefficients with respect to that frame.

We are now mathematically equipped for a deeper analysis of Einstein’s Strong Principle of Equivalence (SPE). Without attempting to parse its many alternative formulations, we adopt the following formulation of the SPE: At any spacetime point x there exists an inertial (i.e. fiducial) reference frame in which the gravitational force vanishes. The nub of Einstein’s idea is that the gravitational force can be cancelled by a suitable acceleration of the reference frame. Mathematically, this means that there exists a fiducial frame for which the connexion vanishes. In other words, the rotor field in the Eq. (46) for change of frame can be chosen to make \(\omega '_\mu =0\) in (63), so that

$$\begin{aligned} \omega _\mu =-2{\tilde{R}}\partial _\mu R \,. \end{aligned}$$
(65)

Read this as asserting that the gravitational force on the left is cancelled by acceleration of the reference frame on the right. A simple counting of degrees of freedom is sufficient to show that this condition can be satisfied at a single point. However, if it is satisfied in a finite neighborhood of that point, then, as established in the next Section, the curvature tensor vanishes and the manifold must be flat. Even so, the condition (65) can be imposed along any curve in spacetime. Indeed, in Sect. 7 we impose it along timelike curves to get an equation of motion for a test body. Therefore, a more precise formulation of the SPE is the following: Along any spacetime curve there exists an inertial (i.e. fiducial) frame field in which the gravitational force vanishes at each point of the curve.
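
For readers who want the intermediate algebra, the step from (63) to (65) can be sketched as follows, using only \(R{\tilde{R}}={\tilde{R}}R=1\). Setting \(\omega '_\mu =0\) in (63) and multiplying by \({\tilde{R}}\) on the left and by R on the right gives

$$\begin{aligned} R\omega _\mu {\tilde{R}}=-2(\partial _\mu R){\tilde{R}} \quad \Rightarrow \quad \omega _\mu =-2{\tilde{R}}(\partial _\mu R){\tilde{R}}R=-2{\tilde{R}}\partial _\mu R\,. \end{aligned}$$

The right side is a pure bivector, as a connexion must be: differentiating \({\tilde{R}}R=1\) gives \({\tilde{R}}\partial _\mu R=-(\partial _\mu {\tilde{R}})R\), which is minus the reverse of \({\tilde{R}}\partial _\mu R\), and the only part of an even multivector in the STA that changes sign under reversion is its bivector part.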

In the present formulation of GR based on the GPE, the SPE is a theorem rather than a defining principle of the theory [14]. Evidently the SPE played a heuristic role in Einstein’s thinking that helped him identify the gravitational force with a Riemannian connexion, but it is time to replace it with the deeper GPE. The necessity for this conclusion comes from recognizing that, to have physical content, any proposed relativity group must be a symmetry group of the theory. Thus the GPE expresses equivalence of observers (represented by fiducial frames) under local Lorentz rotations, and the gauge invariant coderivative is the theoretical consequence of this symmetry. Some such symmetry of observers seems to have been at the back of Einstein’s mind, but the SPE is insufficient to designate a full symmetry group.

Now let us turn to more practical matters about how to perform calculations in GR. We have introduced the full gauge invariant coderivative by defining it in terms of directional derivatives with \(D=g^\mu D_\mu \). However, that was merely for convenience, and it is worth noting that the operator D can be regarded as more fundamental than \(D_\mu \), as illustrated by the following important theorem:

$$\begin{aligned} \omega (\gamma _\mu )={\textstyle \frac{1}{2}} (\gamma _\alpha \wedge D\wedge \gamma ^\alpha )\varvec{\,\cdot \,}\gamma _\mu -D\wedge \gamma _\mu \,. \end{aligned}$$
(66)

This formula shows explicitly how to calculate a fiducial connexion from the cocurl of the frame vectors. We shall see later that this is a practical method for calculating the curvature tensor.

We can prove theorem (66) by solving the frame coderivative Eqs. (56) for the connexion. First, we contract those equations to get

$$\begin{aligned} D\wedge \gamma _\nu =g^\mu \wedge [\omega (g_\mu )\varvec{\,\cdot \,}\gamma _\nu ] =\gamma ^\mu \wedge [\omega (\gamma _\mu )\varvec{\,\cdot \,}\gamma _\nu ] \,, \end{aligned}$$

and we note that

$$\begin{aligned}{}[\gamma ^\mu \wedge \omega (\gamma _\mu )]\varvec{\,\cdot \,}\gamma _\nu =\gamma ^\mu \wedge [\omega (\gamma _\mu )\varvec{\,\cdot \,}\gamma _\nu ]+\omega (\gamma _\nu ) \,. \end{aligned}$$

Hence

$$\begin{aligned} \omega (\gamma _\nu )=-D\wedge \gamma _\nu +[\gamma ^\mu \wedge \omega (\gamma _\mu )]\varvec{\,\cdot \,}\gamma _\nu \,. \end{aligned}$$
(67)

To express the last term on the right hand side of this equation in terms of the cocurl, we return to (56) and observe that

$$\begin{aligned} (\omega _\mu \varvec{\,\cdot \,}\gamma _\nu )\gamma ^\nu =(\omega _\mu \varvec{\,\cdot \,}\gamma _\nu )\wedge \gamma ^\nu =2\omega _\mu =(D_\mu \gamma _\nu )\gamma ^\nu \,, \end{aligned}$$

whence

$$\begin{aligned} 2g^\mu \wedge \omega _\mu = 2\gamma ^\mu \wedge \omega (\gamma _\mu ) = (D\wedge \gamma _\mu )\wedge \gamma ^\mu \,. \end{aligned}$$

Inserting this into (67), we get the formula (66) as desired.
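
The two algebraic steps used in this proof can be made explicit. What follows is only a sketch, relying on the standard geometric algebra reduction rule \((a\wedge B)\varvec{\,\cdot \,}c=a\wedge (B\varvec{\,\cdot \,}c)+(a\varvec{\,\cdot \,}c)B\) for vectors a, c and a bivector B. With \(a=\gamma ^\mu \), \(B=\omega (\gamma _\mu )\) and \(c=\gamma _\nu \), the last term becomes \((\gamma ^\mu \varvec{\,\cdot \,}\gamma _\nu )\,\omega (\gamma _\mu )=\omega (\gamma _\nu )\), which is the identity stated just before (67). Substituting \(\gamma ^\mu \wedge \omega (\gamma _\mu )={\textstyle \frac{1}{2}} (D\wedge \gamma _\mu )\wedge \gamma ^\mu \) into (67) then gives

$$\begin{aligned} \omega (\gamma _\nu )=-D\wedge \gamma _\nu +{\textstyle \frac{1}{2}} [(D\wedge \gamma _\mu )\wedge \gamma ^\mu ]\varvec{\,\cdot \,}\gamma _\nu ={\textstyle \frac{1}{2}} (\gamma _\alpha \wedge D\wedge \gamma ^\alpha )\varvec{\,\cdot \,}\gamma _\nu -D\wedge \gamma _\nu \,. \end{aligned}$$

The last step uses the fact that a bivector commutes with a vector under the outer product, and that indices on an orthonormal frame are raised and lowered by the constants \(\eta _\mu =\pm 1\), which pass through the coderivative.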

Finally, it may be noted that the integrability condition (30) for \(g^\mu \) enables us to calculate the fiducial cocurl from the fiducial tensor. Writing \(\gamma ^\mu =\bar{h}^{-1}(g^\mu )=k^\mu _\nu g^\nu \), we find

$$\begin{aligned} D\wedge \gamma _\mu =\eta _\mu (\bigtriangledown k^\mu _\nu )\wedge g^\nu \,. \end{aligned}$$
(68)

5 Gravitational Curvature and Field Equations

We have seen in (34) that the curvature tensor derives from the commutator of coderivatives. From the fiducial definition of the coderivative (57), we easily derive a more transparent and useful result: For any multivector field \(M=M(x)\) we have

$$\begin{aligned}{}[D_\mu ,D_\nu ]M=\omega _{\mu \nu }\times M\,, \end{aligned}$$
(69)

where

$$\begin{aligned} \omega _{\mu \nu }\equiv \partial _\mu \omega _\nu -\partial _\nu \omega _\mu +\omega _\mu \times \omega _\nu =R(g_\mu \wedge g_\nu ) \end{aligned}$$
(70)

is the curvature tensor evaluated on the bivector \(g_\mu \wedge g_\nu \). It must be remembered that the partial derivatives here are given by

$$\begin{aligned} \partial _\mu \omega _\nu ={\textstyle \frac{1}{2}} (\partial _\mu \omega _\nu ^{\alpha \beta })\gamma _\alpha \wedge \gamma _\beta \,, \end{aligned}$$
(71)

where the scalar coefficients are \(\omega _\nu ^{\alpha \beta }=\gamma ^\alpha \varvec{\,\cdot \,}\omega _\nu \varvec{\,\cdot \,}\gamma ^\beta =\omega _\nu \varvec{\,\cdot \,}(\gamma ^\beta \wedge \gamma ^\alpha )\).

At this point it is worth noting that if the fiducial connexion is derivable from a rotor field, as specified by the equation \(\omega _\mu =-2{\tilde{R}}\partial _\mu R\) from (65), then the curvature tensor (70) vanishes, as is easily proved by direct substitution. Thus, this is a sufficient condition for vanishing curvature. It is probably also a necessary condition for vanishing curvature, but I have not proved that.
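
As a quick illustration in a special case (an assumed example, not the general proof), take \(R=e^{\frac{1}{2} \theta B}\), where \(\theta =\theta (x)\) is a scalar field and B is a bivector with constant coefficients in the fiducial frame, so that B commutes with R. Then

$$\begin{aligned} \omega _\mu =-2{\tilde{R}}\partial _\mu R=-(\partial _\mu \theta )B\,,\qquad \partial _\mu \omega _\nu -\partial _\nu \omega _\mu =-(\partial _\mu \partial _\nu \theta -\partial _\nu \partial _\mu \theta )B=0\,,\qquad \omega _\mu \times \omega _\nu =(\partial _\mu \theta )(\partial _\nu \theta )\,B\times B=0\,, \end{aligned}$$

so every term in (70) vanishes separately. The general case requires the non-commuting terms to be tracked as well.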

The rest of this Section is devoted to summarizing and analyzing properties of the curvature tensor using the coordinate-free techniques of GC to demonstrate its advantages. For vector fields \(a=a^\mu g_\mu \) and \(b=b^\nu g_\nu \) the fundamental Eq. (69) can be put in the form

$$\begin{aligned}{}[a\varvec{\,\cdot \,}D,b\varvec{\,\cdot \,}D]M=R(a\wedge b)\times M\,, \end{aligned}$$
(72)

provided \([\,a, b\,] = 0\). Vanishing of the Lie bracket is assumed here merely to avoid inessential complications.
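
To see what the assumption \([\,a, b\,] = 0\) removes, one can expand the commutator in a coordinate frame. A short sketch, writing \(a\varvec{\,\cdot \,}D=a^\mu D_\mu \) and using the fact that \(D_\mu \) reduces to \(\partial _\mu \) on the scalar components \(a^\mu \) and \(b^\nu \):

$$\begin{aligned} {}[a\varvec{\,\cdot \,}D,b\varvec{\,\cdot \,}D]M=a^\mu b^\nu [D_\mu ,D_\nu ]M+(a^\mu \partial _\mu b^\nu -b^\mu \partial _\mu a^\nu )D_\nu M=R(a\wedge b)\times M+[\,a, b\,]\varvec{\,\cdot \,}DM\,. \end{aligned}$$

Here \([\,a, b\,]=(a^\mu \partial _\mu b^\nu -b^\mu \partial _\mu a^\nu )g_\nu \) is the usual coordinate expression for the Lie bracket, and the first term follows from (69), (70) and the linearity of the curvature. When the bracket does not vanish, the extra term \([\,a, b\,]\varvec{\,\cdot \,}DM\) simply has to be subtracted from the commutator to isolate the curvature.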

Equation (72) shows that curvature is a linear bivector-valued function of a bivector variable that is defined in the tangent algebra at each spacetime point. Thus, for an arbitrary bivector field \(B=B(x)\) we can write

$$\begin{aligned} R(B)\equiv {\textstyle \frac{1}{2}} B\varvec{\,\cdot \,}(\partial _b\wedge \partial _a)\,R(a\wedge b) ={\textstyle \frac{1}{2}} B^{\nu \mu }R(g_\mu \wedge g_\nu ) \,, \end{aligned}$$
(73)

where \(\partial _a\) is the usual vector derivative operating on the tangent space instead of the manifold, and \(B^{\mu \nu }=B\varvec{\,\cdot \,}(g^\mu \wedge g^\nu )\). Note that this use of the vector derivative supplants decomposition into basis vectors and summation over indices, a technique that has been developed into a general method for basis-free formulation and manipulation of tensor algebra [12]. To that end, it is helpful to introduce the terminology traction, contraction and protraction, respectively, for the tensorial operations

$$\begin{aligned} \partial _a\,R(a\wedge b)= & {} g^\mu R(g_\mu \wedge b)=\gamma ^\mu R(\gamma _\mu \wedge b)\,,\nonumber \\ \partial _a\varvec{\,\cdot \,}R(a\wedge b)= & {} g^\mu \varvec{\,\cdot \,}R(g_\mu \wedge b)=\gamma ^\mu \varvec{\,\cdot \,}R(\gamma _\mu \wedge b) \,,\nonumber \\ \partial _a\wedge R(a\wedge b)= & {} g^\mu \wedge R(g_\mu \wedge b)=\gamma ^\mu \wedge R(\gamma _\mu \wedge b) \,. \end{aligned}$$
(74)

All three operations are employed below. The relations (74) are easily proved by decomposing the vector derivative with respect to any basis and using the linearity of \(R(a\wedge b)\) as in (73). Of course, the replacement of vector derivatives by basis vectors and sums over indices in (74) is necessary to relate the following coordinate-free relations to the component forms of standard tensor analysis.

For a vector field \(c=c(x)\) the commutator product is equivalent to the inner product, so (72) becomes

$$\begin{aligned}{}[a\varvec{\,\cdot \,}D, b\varvec{\,\cdot \,}D]c= R(a\wedge b)\varvec{\,\cdot \,}c\,. \end{aligned}$$
(75)

To reformulate this as a condition on the vector coderivative, we simply eliminate the variables a and b by traction. Protraction of (75) gives

$$\begin{aligned} \partial _b\wedge [\,a\varvec{\,\cdot \,}D, b\varvec{\,\cdot \,}D\,]c=\partial _b\wedge [\,R(a\wedge b)\varvec{\,\cdot \,}c\,] = R(c\wedge a) + c\varvec{\,\cdot \,}[\,\partial _b\wedge R(a\wedge b)\,]\,. \end{aligned}$$

Another protraction together with

$$\begin{aligned} D\wedge D = {\textstyle \frac{1}{2}} (\partial _b\wedge \partial _a)[\,a\varvec{\,\cdot \,}D, b\varvec{\,\cdot \,}D\,] \end{aligned}$$
(76)

gives

$$\begin{aligned} D\wedge {}D\wedge c = [\,\partial _b\wedge \partial _a\wedge {}R(a\wedge b)\,]\varvec{\,\cdot \,}c +\partial _a\wedge R(a\wedge c)\,. \end{aligned}$$
(77)

According to (42) the left side of this equation vanishes as a consequence of vanishing torsion, and, because the terms on the right have different functional dependence on the free variable c, they must vanish separately. Therefore

$$\begin{aligned} \partial _a\wedge R(a\wedge b) = 0\,. \end{aligned}$$
(78)

This constraint on the Riemann curvature tensor is called the Ricci identity.

The requirement (78) that the curvature tensor is protractionless has an especially important consequence. The identity

$$\begin{aligned} \partial _b\wedge [B\varvec{\,\cdot \,}(\partial _a\wedge R(a\wedge b))] = \partial _b\wedge \partial _a B\varvec{\,\cdot \,}R(a\wedge b) -B\varvec{\,\cdot \,}(\partial _b\wedge \partial _a) R(a\wedge b) \end{aligned}$$
(79)

vanishes on the left side because of (78), and the right side then implies that

$$\begin{aligned} A\varvec{\,\cdot \,}R(B) = R(A)\varvec{\,\cdot \,}B\,. \end{aligned}$$
(80)

Thus, the curvature is a symmetric bivector function. This symmetry can be used to recast (78) in the equivalent form

$$\begin{aligned} R\big ((a\wedge b\wedge c)\varvec{\,\cdot \,}\partial _e\big )\varvec{\,\cdot \,}e = 0\,. \end{aligned}$$
(81)

On expanding the inner product in its argument, it becomes

$$\begin{aligned} R(a\wedge b)\varvec{\,\cdot \,}c +R(c\wedge a)\varvec{\,\cdot \,}b+R(b\wedge c)\varvec{\,\cdot \,}a = 0\,, \end{aligned}$$
(82)

which is closer to the usual tensorial form for the Ricci identity.
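
The expansion leading from (81) to (82) is short enough to display. Using the standard reduction of a trivector by a vector,

$$\begin{aligned} (a\wedge b\wedge c)\varvec{\,\cdot \,}\partial _e=(c\varvec{\,\cdot \,}\partial _e)\,a\wedge b+(b\varvec{\,\cdot \,}\partial _e)\,c\wedge a+(a\varvec{\,\cdot \,}\partial _e)\,b\wedge c\,, \end{aligned}$$

together with the linearity of R and the fact that \((c\varvec{\,\cdot \,}\partial _e)\,e=c\), each term of (81) becomes one term of (82); for example, \(R\big ((c\varvec{\,\cdot \,}\partial _e)\,a\wedge b\big )\varvec{\,\cdot \,}e=R(a\wedge b)\varvec{\,\cdot \,}c\).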

As noted in (44), contraction of the curvature tensor defines the Ricci tensor

$$\begin{aligned} R(a)\equiv \partial _b\varvec{\,\cdot \,}R(b\wedge a)\,. \end{aligned}$$
(83)

The Ricci identity (78) implies that we can write

$$\begin{aligned} \partial _b\varvec{\,\cdot \,}R(b\wedge a) =\partial _bR(b\wedge a)\,, \end{aligned}$$
(84)

and also that the Ricci tensor is protractionless:

$$\begin{aligned} \partial _a\wedge R(a) = 0\,. \end{aligned}$$
(85)

This implies the symmetry

$$\begin{aligned} a\varvec{\,\cdot \,}R(b) = R(a)\varvec{\,\cdot \,}b\,. \end{aligned}$$
(86)
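
The step from (85) to (86) can be checked by dotting (85) with an arbitrary bivector \(b\wedge c\). A sketch, using the standard identity \((b\wedge c)\varvec{\,\cdot \,}(x\wedge y)=(b\varvec{\,\cdot \,}y)(c\varvec{\,\cdot \,}x)-(b\varvec{\,\cdot \,}x)(c\varvec{\,\cdot \,}y)\) with \(x=\partial _a\) and \(y=R(a)\):

$$\begin{aligned} 0=(b\wedge c)\varvec{\,\cdot \,}[\,\partial _a\wedge R(a)\,]=(c\varvec{\,\cdot \,}\partial _a)\,b\varvec{\,\cdot \,}R(a)-(b\varvec{\,\cdot \,}\partial _a)\,c\varvec{\,\cdot \,}R(a)=b\varvec{\,\cdot \,}R(c)-c\varvec{\,\cdot \,}R(b)\,. \end{aligned}$$

Since b and c are arbitrary, this is the symmetry (86).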

An alternative expression for the Ricci tensor is obtained by operating on (75) with (76) and establishing the identity

$$\begin{aligned} {\textstyle \frac{1}{2}} (\partial _a\wedge \partial _b)\varvec{\,\cdot \,}[\,R(a\wedge b)\varvec{\,\cdot \,}c\,] = R(c)\,. \end{aligned}$$
(87)

The result is, in agreement with (43),

$$\begin{aligned} D\wedge D\, a = (D\wedge D)\varvec{\,\cdot \,}a = R(a)\,. \end{aligned}$$
(88)

This could be adopted as a definition of the Ricci tensor directly in terms of the coderivative without reference to the curvature tensor. That might lead to a more efficient formulation of the gravitational field equations introduced below.

Equation (88) shows the fundamental role of the operator \(D\wedge D\), but operating with it on a vector gives only the Ricci tensor. To get the full curvature tensor from \(D\wedge D\), one must operate on a bivector. To that end, we take \(M = a\wedge b\) in (72) and put it in the form

$$\begin{aligned} D\wedge D(a\wedge b) = D\wedge D\times (a\wedge b) = {\textstyle \frac{1}{2}} (\partial _d\wedge \partial _c)\times [\,R(c\wedge d)\times (a\wedge b)\,]\,. \end{aligned}$$

Although the commutator product has the useful “distributive property” \(A\times [B\times C]=[A\times B]\times C+ B\times [A\times C]\), a fair amount of algebra is needed to reduce the right side of this equation. The result is

$$\begin{aligned} D\wedge D(a\wedge b) = R(a)\wedge b + a\wedge R(b)- 2R(a\wedge b)\,, \end{aligned}$$
(89)

or equivalently

$$\begin{aligned} 2R(a\wedge b) = (D\wedge Da)\wedge b + a\wedge (D\wedge Db) -D\wedge D(a\wedge b)\,. \end{aligned}$$
(90)

This differential identity is the desired expression for the curvature tensor in terms of \(D\wedge D\).

Contraction of the Ricci tensor defines the scalar curvature

$$\begin{aligned} R\equiv \partial _aR(a) =\partial _a\varvec{\,\cdot \,}R(a)\,. \end{aligned}$$
(91)

Since \(R(a\wedge b)\), R(a), and R can be distinguished by their arguments, there is no danger of confusion from using the same symbol R for each.

Besides the Ricci identity, there is one further general constraint on the curvature tensor that can be derived as follows. The commutators of directional coderivatives satisfy the Jacobi identity

$$\begin{aligned}{}[a\varvec{\,\cdot \,}D, [\,b\varvec{\,\cdot \,}D, c\varvec{\,\cdot \,}D]] + [b\varvec{\,\cdot \,}D, [ c\varvec{\,\cdot \,}D, a\varvec{\,\cdot \,}D]] +[c\varvec{\,\cdot \,}D, [a\varvec{\,\cdot \,}D, b\varvec{\,\cdot \,}D]] = 0\,. \end{aligned}$$
(92)

By operating with this on an arbitrary nonscalar multivector M and using (72), we can translate it into a condition on the curvature tensor that is known as the Bianchi identity:

$$\begin{aligned} a \varvec{\,\cdot \,}DR(b\wedge c) +b \varvec{\,\cdot \,}DR(c\wedge a) +c \varvec{\,\cdot \,}DR(a\wedge b) = 0\,. \end{aligned}$$
(93)

Like the Ricci identity (81), this can be expressed more compactly as

$$\begin{aligned} \grave{R}[(a\wedge b\wedge c)\varvec{\,\cdot \,}\grave{D}] = 0\,, \end{aligned}$$
(94)

where the accent serves to indicate that D differentiates the tensor R but not its tensor arguments. “Dotting” by free bivector B, we obtain

$$\begin{aligned} \grave{R}[\,(a\wedge b\wedge c)\varvec{\,\cdot \,}\grave{D}]\varvec{\,\cdot \,}B = (a\wedge b\wedge c)\varvec{\,\cdot \,}(D\wedge R(B))\,. \end{aligned}$$

Therefore the Bianchi identity can be expressed in the compact form

$$\begin{aligned} \grave{D}\wedge \grave{R}(a\wedge b) = 0\,. \end{aligned}$$
(95)

This condition on the curvature tensor is the source of general conservation laws in General Relativity.

Contraction of (95) with \(\partial _a\) gives

$$\begin{aligned} \grave{R}(\grave{D}\wedge b)- D\wedge R(b) = 0\,. \end{aligned}$$
(96)

A second contraction yields the differential identity

$$\begin{aligned} \grave{G}(\grave{D}) = \grave{R}(\grave{D})-{\textstyle \frac{1}{2}} DR = 0\,, \end{aligned}$$
(97)

where

$$\begin{aligned} G(a)\equiv R(a) -{\textstyle \frac{1}{2}} aR \end{aligned}$$
(98)

is the Einstein tensor.

In General Relativity, for a given energy-momentum tensor T(a), the spacetime geometry is determined by Einstein’s equation

$$\begin{aligned} G(a) = \kappa T(a)\,, \end{aligned}$$
(99)

where \(\kappa \) is a constant. The contracted Bianchi identity (97) implies the generalized energy-momentum conservation law

$$\begin{aligned} \grave{T}(\grave{D}) = 0\,. \end{aligned}$$
(100)

As is well known, this is not a conservation law in the usual sense, because it is not a perfect divergence and so is not convertible to a surface integral by Gauss’s theorem.

To solve Einstein’s equation (99) for a given energy-momentum tensor, Einstein’s tensor G(a) must be expressed in a form that makes (99) a differential equation that describes the dynamics of spacetime geometry. A direct expression for G(a) in terms of a fiducial connexion and its derivatives is very complicated and its structure is not very transparent. Let us consider an alternative approach. Contracting (99) with \(\partial _a\) and using \(\partial _a\varvec{\,\cdot \,}a=4\) gives \(R=-\kappa \,\partial _a T(a)\). Using (88), we can therefore put Einstein’s equation (99) in the form

$$\begin{aligned} D\wedge Da = \kappa (T(a) - {\textstyle \frac{1}{2}} a\, \mathrm{Tr}\, T)\,, \end{aligned}$$
(101)

where \(\mathrm{Tr}T = \partial _a T(a)\).

As already noted in connection with Eq. (38), we can express this in alternative forms with the identity

$$\begin{aligned} D^2a= D\wedge Da +D\varvec{\,\cdot \,}Da = D(D\varvec{\,\cdot \,}a) +D(D\wedge a)\,. \end{aligned}$$
(102)

The last term vanishes if the vector field a is a gradient,

$$\begin{aligned} a = D\varphi = \bigtriangledown \varphi \,, \end{aligned}$$
(103)

in which case (101) can be put in the form

$$\begin{aligned} D\varvec{\,\cdot \,}Da-D(D\varvec{\,\cdot \,}a) =- \kappa (T(a) - {\textstyle \frac{1}{2}} a\, \mathrm{Tr}\, T)\,. \end{aligned}$$
(104)

This appears to be a simplification in the form of Einstein’s equation, and it can be further simplified by adopting the “gauge condition” \(D\varvec{\,\cdot \,}a = 0\). Indeed, in the linear approximation its left hand side reduces immediately to the usual d’Alembertian wave operator. This formulation of Einstein’s equation was first derived in ref. [7], but it has never been studied further to see if its apparent simplicity leads to any practical advantages.

6 Curvature Calculations

Equations (68), (66), (71), and (70) provide us with an efficient method for calculating curvature from the fiducial tensor in the following sequence of steps:

$$\begin{aligned} h^\mu _\nu \quad \rightarrow \quad D\wedge \gamma _\mu \quad \rightarrow \quad \omega _\mu \quad \rightarrow \quad \omega _{\mu \nu }\,. \end{aligned}$$
(105)

Conventional curvature calculations begin by specifying the metric tensor as a function of coordinates by writing the “line element”

$$\begin{aligned} dx^2=dx\varvec{\,\cdot \,}dx=g_{\mu \nu }dx^\mu dx^\nu =h^\alpha _\mu \eta _\alpha h^\alpha _\nu dx^\mu dx^\nu \,. \end{aligned}$$
(106)

Of course, we can take the same starting point for calculations with the fiducial tensor. Details of the present method are illustrated by the calculation of the Schwarzschild solution in [9]; the method is demonstrably superior to that of Misner, Thorne and Wheeler [16] and to other computational methods [20].
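
As an illustration of the starting point (106), here is how a fiducial tensor can be read off from a diagonal line element. The example assumes the standard Schwarzschild line element in Schwarzschild coordinates \((t,r,\theta ,\phi )\), units with \(G=c=1\), a mass parameter M, and the signature \(\eta _0=1\), \(\eta _k=-1\); it is not reproduced from [9].

$$\begin{aligned} dx^2=\Big (1-\frac{2M}{r}\Big )dt^2-\Big (1-\frac{2M}{r}\Big )^{-1}dr^2-r^2d\theta ^2-r^2\sin ^2\theta \, d\phi ^2\,. \end{aligned}$$

Comparison with (106) shows that a diagonal choice suffices:

$$\begin{aligned} h^0_0=\Big (1-\frac{2M}{r}\Big )^{1/2},\qquad h^1_1=\Big (1-\frac{2M}{r}\Big )^{-1/2},\qquad h^2_2=r\,,\qquad h^3_3=r\sin \theta \,, \end{aligned}$$

with all other components zero, so that \(g_{\mu \mu }=\eta _\mu (h^\mu _\mu )^2\) (no sum) on the diagonal. This choice is not unique: any local Lorentz rotation (46) of the fiducial frame yields another fiducial tensor with the same metric.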

7 Gravitational Motion and Precession

The spinor equations of motion for classical particles and rigid bodies set forth in [10] are now easily generalized to include gravitational interactions. This gives us a general method for evaluating gravitational effects on the motion and precession of a spacecraft or satellite, and thus a means for testing gravitational theory.

We begin with the timelike worldline \(x=x(\tau )\) of a material particle with velocity \(v=v(\tau )=dx/d\tau \equiv \dot{x}\), where, as usual, \(d\tau =|\,dx\,|=|\,(dx)^2\,|^{\scriptstyle {\frac{1}{2}}}\), so \(v^2=1\). We attach to this curve a comoving orthonormal frame, or mobile, \(\{e_\mu =e_\mu (x(\tau ))=e_\mu (\tau ); \mu =0,1,2,3\}\). The mobile is tied to the velocity by requiring \(v=e_0\). Rotation of the mobile with respect to a given fiducial frame \(\{\gamma _\mu \}\) is described by

$$\begin{aligned} e_\mu =R\,\gamma _\mu {\tilde{R}}\,, \end{aligned}$$
(107)

where \(R = R(x(\tau ))\) is a unimodular rotor. As noted in [10], the spinor R can be used to model the motion of a small rigid body or a particle with intrinsic spin. In GR it is especially useful for modeling gravitational effects on gyroscopic precession.

In accordance with (64), the coderivative of the mobile is

$$\begin{aligned} v\varvec{\,\cdot \,}D e_\mu =\dot{e}_\mu +\omega (v)\varvec{\,\cdot \,}e_\mu \,, \end{aligned}$$
(108)

where \(\mu = 0,1,2,3\), \(\omega (v)\) is the fiducial connexion for the frame \(\{\gamma _\mu \}\), and \(\dot{e}_\mu =v\varvec{\,\cdot \,}\bigtriangledown e_\mu \). Note that (108) is equivalent to the spinor equation

$$\begin{aligned} v\varvec{\,\cdot \,}DR=\big (\frac{d}{d\tau }+{\textstyle \frac{1}{2}} \omega (v)\big )R\,. \end{aligned}$$
(109)

The coderivative (108) includes a gauge invariant description of gravitational forces on the mobile. As explained in [10], effects of any nongravitational forces can be incorporated by writing

$$\begin{aligned} v\varvec{\,\cdot \,}De_\mu =\dot{e}_\mu +\omega (v)\varvec{\,\cdot \,}e_\mu =\Omega \varvec{\,\cdot \,}e_\mu \,, \end{aligned}$$
(110)

where \(\Omega =\Omega (x)\) is a bivector field acting on the mobile; for example, \(\Omega =(e/m)F\) for an electron with mass m and charge e in a field \(F=F(x)\). The four equations (110) include the equation of motion

$$\begin{aligned} \frac{dv}{d\tau }=(\Omega -\omega (v))\varvec{\,\cdot \,}v\, \end{aligned}$$
(111)

for the particle, and are equivalent to the single rotor equation

$$\begin{aligned} \frac{dR}{d\tau }={\textstyle \frac{1}{2}} (\Omega -\omega (v))R\,. \end{aligned}$$
(112)

For \(\Omega =0\) the particle equation becomes the equation for a geodesic, and the rotor equation describes parallel transfer of the mobile along the geodesic. This equation has been applied to a detailed treatment of gravitational precession in [8]. It is noteworthy that this method works in Gauge Theory Gravity [11] with no essential differences.
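
The equivalence of the rotor equation (112) with the frame equations (110) can be verified in a few steps. A sketch, assuming (as in (108)) that the overdot differentiates coefficients relative to the fiducial frame, so that \(\dot{\gamma }_\mu =0\), and using \(\tilde{B}=-B\) for any bivector B:

$$\begin{aligned} \dot{e}_\mu =\dot{R}\,\gamma _\mu {\tilde{R}}+R\,\gamma _\mu \dot{{\tilde{R}}}={\textstyle \frac{1}{2}} (\Omega -\omega (v))\,e_\mu -{\textstyle \frac{1}{2}} e_\mu \,(\Omega -\omega (v))=(\Omega -\omega (v))\varvec{\,\cdot \,}e_\mu \,. \end{aligned}$$

Setting \(\mu =0\) gives (111). Dotting (111) with v then gives \(v\varvec{\,\cdot \,}\dot{v}=0\), because \(v\varvec{\,\cdot \,}(B\varvec{\,\cdot \,}v)=0\) for any bivector B, so the normalization \(v^2=1\) is preserved along the worldline.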

8 Dirac Equation with Gravitational Interaction

Recall from [10] that a real Dirac spinor field \(\psi =\psi (x)\) determines an orthonormal frame of vector fields \(e_\mu =e_\mu (x)\) defined by

$$\begin{aligned} \psi \gamma _\mu \widetilde{\psi }=\rho e_\mu \,, \end{aligned}$$
(113)

where scalar \(\rho =\rho (x)\) is interpreted as electron probability density, and \(\psi \gamma _0\widetilde{\psi }=\rho e_0\) is the Dirac current. We can adopt this relation without change by interpreting \(\{\gamma _\mu \}\) as a fiducial frame and writing

$$\begin{aligned} e_\mu =R \gamma _\mu {\tilde{R}}\,, \end{aligned}$$
(114)

where \(R=R(x)\) is a rotor field. This equation has exactly the same form as the Eq. (46) for a change of fiducial frame. Therefore, the Dirac wave function determines a unique, physically significant fiducial frame \(\{e_\mu \}\) on spacetime. Accordingly, its gauge invariant directional coderivative is given by

$$\begin{aligned} D_\nu e_\mu =\partial _\nu e_\mu +\omega _\nu \varvec{\,\cdot \,}e_\mu , \end{aligned}$$
(115)

where \(\omega _\nu \) is the fiducial connexion for the frame \(\{\gamma _\mu \}\). This is consistent with defining the coderivative of the Dirac spinor by

$$\begin{aligned} D_\nu \psi =(\partial _\nu +{\textstyle \frac{1}{2}} \omega _\nu )\psi \,, \end{aligned}$$
(116)

which exhibits \(\omega _\nu \) as equivalent to the “spin connexion” in conventional formulations of GR [25].
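
The consistency claimed here can be checked directly. A sketch, applying the product rule to (113), treating the fiducial vectors \(\gamma _\mu \) as constant under \(\partial _\nu \), and using \(\tilde{\omega }_\nu =-\omega _\nu \):

$$\begin{aligned} D_\nu (\psi \gamma _\mu \widetilde{\psi })=[(\partial _\nu +{\textstyle \frac{1}{2}} \omega _\nu )\psi ]\gamma _\mu \widetilde{\psi }+\psi \gamma _\mu [(\partial _\nu +{\textstyle \frac{1}{2}} \omega _\nu )\psi ]^{\sim }=\partial _\nu (\rho e_\mu )+{\textstyle \frac{1}{2}} (\omega _\nu \,\rho e_\mu -\rho e_\mu \,\omega _\nu )=\partial _\nu (\rho e_\mu )+\omega _\nu \varvec{\,\cdot \,}(\rho e_\mu )\,. \end{aligned}$$

Since \(\rho \) is a scalar, \(D_\nu \rho =\partial _\nu \rho \), and dividing out \(\rho \) (where it is nonzero) recovers (115).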

The spinor coderivative (116) is form invariant under the spinor gauge transformation

$$\begin{aligned} \psi \qquad \rightarrow \qquad \psi '=\Lambda \psi \,, \end{aligned}$$
(117)

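where \(\Lambda =\Lambda (x)\) is a rotor field. This induces a transformation of (116) to

$$\begin{aligned} D_\nu \psi '=(\partial _\nu +{\textstyle \frac{1}{2}} \omega '_\nu )\psi ' \,, \end{aligned}$$
(118)

where

$$\begin{aligned} \omega '_\nu =\Lambda \omega _\nu {\tilde{\Lambda }}-2(\partial _\nu \Lambda ){\tilde{\Lambda }}\,. \end{aligned}$$
(119)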

We could have used this spinor gauge transformation to define the spinor coderivative. But note that it is not (explicitly, at least) related to the gauge equivalence of fiducial frames, so it raises new issues of physical interpretation. It is an active transformation that changes the fields on spacetime, rather than a passive transformation that changes the reference system but leaves fields unchanged. We shall return to this issue in the sequel to this paper.
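
The transformation law of the connexion under (117) follows from the requirement that the coderivative transform in the same way as the spinor itself, \(D_\nu \psi '=\Lambda D_\nu \psi \). A sketch of the algebra, assuming \(\Lambda \) is a rotor with \(\Lambda {\tilde{\Lambda }}=1\):

$$\begin{aligned} (\partial _\nu +{\textstyle \frac{1}{2}} \omega '_\nu )\Lambda \psi =\Lambda (\partial _\nu +{\textstyle \frac{1}{2}} \omega _\nu )\psi \quad \Rightarrow \quad {\textstyle \frac{1}{2}} \omega '_\nu \Lambda ={\textstyle \frac{1}{2}} \Lambda \omega _\nu -\partial _\nu \Lambda \quad \Rightarrow \quad \omega '_\nu =\Lambda \omega _\nu {\tilde{\Lambda }}-2(\partial _\nu \Lambda ){\tilde{\Lambda }}\,. \end{aligned}$$

Note that the inhomogeneous term enters with the opposite sign to that in (63).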

The generalization of the real Dirac equation in [10] to include gravitational interaction is obtained simply by replacing the partial derivative \(\partial _\mu \) by the coderivative \(D_\mu \). Thus, we obtain

$$\begin{aligned} g^\mu D_\mu \psi \gamma _2\gamma _1\hbar =g^\mu (\partial _\mu +{\textstyle \frac{1}{2}} \omega _\mu )\psi \gamma _2\gamma _1\hbar =eA\psi +m\psi \gamma _0 \,. \end{aligned}$$
(120)

This is equivalent to the standard matrix form of the Dirac equation with gravitational interaction, but it is obviously much simpler in formulation and application. This is not the time and place for solving the real gravitational Dirac equation (120). However, comparison of the spinor coderivative (116) with the rotor coderivative (109) tells us immediately that gravitational effects on electron motion, including spin precession, are exactly the same as on classical rigid body motion.

With the spinor coderivative in hand, the rest of Dirac theory in [10] is easily adapted to gravitational interactions [1, 15].

9 Vector Manifolds

The spacetime manifold \({\mathcal M}^4 = \{x\}\) was introduced as a vector manifold in Sect. 2, and a coordinate frame \(\{g_{\mu }=g_{\mu }(x)\}\) was generated from partial derivatives of a parametrized point in the manifold, as expressed by

$$\begin{aligned} g_\mu = \partial _\mu x \,. \end{aligned}$$
(121)

At each spacetime point x the coordinate frame provides a basis for the tangent space \({\mathcal V}^4(x)\) and generates the tangent algebra \({\mathcal G}_4(x)={\mathcal G}({\mathcal V}^4(x))\).

The reader may have noticed that the role of \({\mathcal M}^4\) itself in subsequent developments is hardly more than a shadow. All the geometry and physics—the vector, multivector and spinor fields, the connexion and the curvature—occur in the tangent algebra. It could be argued that even the spacetime points \(\{x\}\) are superfluous, as coordinates are sufficient to index points of the manifold. This argument is taken to the extreme in most recent mathematical works on differential geometry, where the x is eliminated and (121) is replaced by

$$\begin{aligned} g_\mu = \partial _\mu \,. \end{aligned}$$
(122)

In other words, vectors in a coordinate frame are identified with coordinate partial derivatives; consequently, all vectors \(\{a=a^\mu g_\mu \}\) in the tangent space are identified with the directional derivatives \(\{a^\mu \partial _\mu \}\).

The purported problem with (121) is that it is deficient in mathematical rigor, because the partial derivative is defined as the limit of a difference quotient

$$\begin{aligned} \partial _\mu x =\frac{\bigtriangleup x}{\bigtriangleup x^\mu }\,, \end{aligned}$$
(123)

and the difference vector \({\bigtriangleup x}\) requires subtraction of one point from another, which is not well defined unless they are vectors in a vector space of higher dimension. In other words, it is argued that the Eq. (123) presumes embedding of \({\mathcal M}^4\) in a vector space of higher dimension, whereas GR is concerned with intrinsic properties of manifolds irrespective of any embedding in a higher dimensional space. The definition (122) of tangent vectors as differential operators finesses this issue with a “don’t ask, don’t tell” approach that doesn’t specify what is to be differentiated. Nevertheless, it has been argued that (121) has great heuristic value [18].

It has been almost universally overlooked in the mathematics and physics literature that the identification (122) of tangent vectors with differential operators precludes assigning them the algebraic properties of vectors in geometric algebra as done in this paper. Such conflation of vectors with differential operators has enormous drawbacks. It is sufficient to note that if tangent vectors are not allowed to generate a geometric algebra in the first place, then the algebra must be artificially imposed on the manifold later on, because it is absolutely essential for spinors and quantum mechanics. Indeed, standard practice [19, 25] is to attach the Dirac algebra to the spacetime manifold as an afterthought, and the elaborate formalism of fibre bundles has been employed for that [4]. To avoid all those unnecessary gymnastics, it is necessary to return mathematical respectability to Eq. (121), and that requires reconsidering the concept of a differentiable manifold.

The standard definition of a differentiable manifold employs coordinates to impose differentiable structures on a set [4]. Alternatively, the definition of a vector manifold has been expressly designed to incorporate differentiability directly into the structure of the set [6, 12]. This entails regarding the vectorial difference quotient (123) as a well-defined quantity with the well-defined limit (121). Contrary to common belief, it does not require any assumptions about embedding the spacetime manifold in a (flat) vector space of higher dimension. Indeed, no mention of an embedding space appears in this paper. However, if one insists that an embedding vector space must be assumed to make vectorial operations like (123) meaningful, there is still no loss of generality in representing the spacetime manifold as a vector manifold, because it has been proved that every Riemannian manifold can be embedded in a flat manifold of sufficiently high dimension [5]. Indeed, the theory of vector manifolds may be the ideal venue for investigating embedding theorems, because it offers a powerful new method for differential geometry that efficiently coordinates characterization and analysis of the intrinsic and extrinsic properties of a manifold without presumptions about embedding [12]. As that method is based on the same concept of vector manifold employed here, it is an attractive alternative to the method in this paper, and the two methods can be regarded as complementary. Sobczyk has taken the first steps in the use of GC for an embedding approach to spacetime manifolds [23]. The main interest of physicists in studying extrinsic geometry of spacetime manifolds is the possibility of relating it to fundamental interactions that have not yet been given a satisfactory geometric interpretation. There has been little research in that direction [5], but for those who are interested, the theory of vector manifolds with GC can be recommended as providing ideal mathematical tools [12].

With the above brief background on vector manifolds, we are better able to assess the significance of Eq. (121). We can read that equation as extracting the algebraic structure of a Minkowski tangent space from the manifold \({\mathcal M}^4\). However, in the intrinsic approach to manifold geometry taken in this paper, the differentiable structure connecting neighboring tangent spaces is not extracted from the manifold; it is imposed on the manifold by defining a connexion and curvature. Consequently, differential computations throughout the present paper involve only the STA at a single point, and all the tangent algebras are isomorphic. This leads one to wonder if we can simplify the theory and get by with a single copy of the STA. The answer is yes, and the result is the flat space theory of spacetime geometry in [1, 10, 15]. Finally, one should note that all the geometry can be extracted from \({\mathcal M}^4\) itself only if it is an embedded manifold.

10 Historical Notes

The present approach to GR was initiated in 1966 by my book Space-Time Algebra [7]. The crucial innovation there was to reduce the standard representation of spacetime geometry by the metric tensor \(g_{\mu \nu }\) to representation by a coordinate frame of vectors \(g_{\mu }\) that generate a real geometric algebra at each spacetime point, as described in Sect. 2. I also introduced the local Lorentz transformations of Eq. (46) and translated Utiyama’s gauge formulation of GR [24] into the real STA. However, the gauge concept was not given the central role it has here. My purpose then was just to incorporate the real Dirac equation into GR. I could not have anticipated the rise of gauge theory to the supreme status that it enjoys in theoretical physics today [13].

In 1966 I was blissfully unaware of similar work decades before. Perhaps that was all to the good, as it might have been intimidating or discouraging. At any rate it would have been an unnecessary distraction, because, as I believed then and know now, all my predecessors had missed a key point, namely, the geometric significance of the Dirac algebra. Thus, Schroedinger [22] and others made the Dirac matrices spacetime dependent and related their products to the metric tensor, as in Eq. (3), in order to incorporate the Dirac equation into GR. To the same end, Fock and Iwanenko [2, 3] and, independently, Weyl [26] in his seminal paper on gauge theory, were evidently the first to introduce the “spin connexion” (119) expressed in terms of Dirac matrices. To do that they were forced to introduce orthonormal frames called vierbeins or tetrads, which are equivalent to fiducial frames represented by matrix elements. However, they all failed to recognize the Dirac matrices as representations of vectors, so they interpreted their constructions as essentially quantum mechanical rather than fundamentally geometrical. At the same time, their treatment of tetrads as mere auxiliary quantities shows that they failed to recognize the primary physical significance that we have attributed to fiducial frames.

The main limitation of my 1966 book was the lack of mathematical methods to solve field equations that take advantage of simplifications introduced by GA. To remedy that deficiency I embarked on the development of a Geometric Calculus that culminated in publication of a monograph [12] that, among other things, first formulated the theory of vector manifolds.

The STA formulation of GR in the present paper was developed by 1976, but not published until 1986 [8, 9], because I had originally intended to include it in the GC monograph. The claim in those papers that my method is more efficient than Cartan’s exterior calculus in geometric computations was soon supported by direct comparison of computer calculations [20]. However, the most important consequence of that work was stimulating creation of the flat-space gauge theory of gravity by Lasenby, Doran and Gull [1, 15]. That, in turn, stimulated emphasis on gauge theory and the Equivalence Principle in the present paper. Finally, the present approach has been coordinated with the flat-space theory in a predecessor to this paper [11], with details discussed in the Appendix.