1 Prologue

In the introduction to his “Axiomatik der relativistischen Raum-Zeit Lehre”, Hans Reichenbach [65] wrote the following words, emphasizing the principal difference between physics and mathematics as regards the axiomatic method. In the English translation by his wife Maria [66], they read (Footnote 1):

The value of an axiomatic exposition consists in summarising the content of a scientific theory in a small number of statements. An evaluation of the theory may then be limited to an evaluation of the axioms, because every statement of the theory is implicitly contained in the axioms. [...] The problem of the axioms of mathematics was solved by the discovery that they are definitions, that is, arbitrary stipulations which are neither true nor false, and that only the logical properties of a system—its consistency, independence, uniqueness, and completeness—can be the subjects of critical investigation.

There is, however, a fundamental difference between physics and mathematics. Physical statements are more than mere consequences of arbitrary definitions; they are supposed to describe the real world.

“Truth” and “falsehood” have entirely different meanings in physics and in mathematics; to judge that a statement in physics is true is not a logical judgement but a judgement concerning the occurrence or nonoccurrence of sense perceptions. To the physicist the question of truth is the most interesting one, for if his theory is true, he can call it in a certain sense a description of reality.

The axiomatic exposition of a physical theory is at the outset subject to the same laws as that of a mathematical theory [...]. Yet since the physical axioms also contain the whole theory implicitly, they must themselves be justified: they must not be arbitrary but true. ‘True’ refers again to a factual judgment ultimately tested by perception.

2 Introduction

The foundation on which contemporary physics rests consists of various theories describing “physical systems”, their “interactions” and “evolution” in time. These systems are thought of as being embedded in exterior structures called “space” and “time”, or “space-time” for short. These exterior structures are themselves either fixed, in the sense of not being acted upon by the systems they contain, or changing according to some dynamical laws that also govern the interaction of the systems (i.e. “matter”) and space-time. We will later have ample opportunity to explain this difference further, but for the moment it suffices to say that in our current collection of theories in physics, only General Relativity makes use of a dynamical space-time structure, and that this theory need not be considered if gravitational fields play no significant role in the theoretical description of the system and its associated phenomena. So for a very large part of physics, space-time is a fixed entity whose structure is unchanging. This does not, of course, mean that the laws of physics are insensitive to that fixed structure; quite the contrary! But it does mean that all systems are embedded into the very same exterior structure that universally acts on all (non-gravitating) systems.

Having said this, I wish to start by briefly recalling some disciplines in physics where axiomatic thinking has made, or continues to make, fruitful contributions to progress and understanding. I will also list some names associated with these developments, without in any way claiming even approximate completeness, either for the areas or, certainly, for the names. They just reflect what I am more or less familiar with. I do not want to rule out the possibility that there exist other examples which are equally well suited.

  • Mechanics: Isaac Newton (1642–1726), Joseph-Louis de Lagrange (1736–1813), Carl Gustav Jacobi (1804–51), William Rowan Hamilton (1805–65), William Thomson [Lord Kelvin] (1824–1907), Peter Guthrie Tait (1831–1901), Ludwig Lange (1863–1936), Gottlob Frege (1848–1925), Heinrich Hertz (1857–94), Georg Hamel (1877–1954), Jean-Marie Souriau (1922–2012), Ralph Abraham (1936), Wladimir Igorewitsch Arnold (1937–2010), Jerrold Eldon Marsden (1942–2010), ...

  • Thermodynamics: Constantin Carathéodory (1873–1950), Robin Giles (1935), Elliot Lieb (1932) & Jakob Yngvason (1945), ...

  • Electrodynamics: James Clerk Maxwell (1831–79), Gustav Mie (1868–1957), Evert Jan Post (1915–2015), Friedrich Hehl (1937) & Yuri Obukhov (1956), ...

  • Special Relativity: Wladimir Sergejewitsch Ignatowski (1875–1942), Hermann Rothe (1882–1923), Alfred Robb (1873–1936), Hans Reichenbach (1891–1953), Alexander Danilowitsch Alexandrow (1912–99), Erik Christopher Zeeman (1925–2016), Walter Benz (1931–2017), ...

  • General Relativity: David Hilbert (1862–1943), Hermann Weyl (1885–1955), Alfred Schild (1921–77), Felix Pirani (1928–2015), Jürgen Ehlers (1929–2008), Andrzej Trautman (1933), Herbert Pfister (1936–2015), Jürgen Audretsch (1942–2018), ...

  • Quantum Theory: George David Birkhoff (1884–1944), Paul Adrien Maurice Dirac (1902–84), Johann von Neumann (1903–57), Joseph Maria Jauch (1914–74), George Whitelaw Mackey (1916–2006), Günther Ludwig (1918–2007), Constantin Piron (1932–2012), ...

  • Quantum Field Theory: Res Jost (1918–90), Lars Gårding (1919–2014), Arthur Strong Wightman (1922–2013), Rudolf Haag (1922–2016), Daniel Kastler (1926–2015), Huzihiro Araki (1932), Robert Schrader (1939–2015), Konrad Osterwalder (1942), Detlev Buchholz (1944), Alain Connes (1947), ...

–and many others–

I cannot do even approximate justice to all the past and present developments in these areas. Instead I will pick some examples, most of them well known, to illustrate some attitudes towards axiomatic thinking in physics. The reader should not be surprised that within the physics community opinions differ regarding the use and value of axiomatic thinking. The above quotation of Reichenbach’s words should make clear why this is the case. Another famous and memorable quotation in that regard is the one Einstein gave as the answer to what he called a “disturbing riddle”, namely, how can it be that mathematics, which is a product of human thinking alone and independent of experience, fits so well to the objects in the world around us? Is it possible that the human mind is capable of discovering and understanding the world around us, at least to some extent, by pure reason and without any resort to experience? Einstein’s proverbial answer is now a classic [22]:

Insofar as the statements of mathematics refer to reality [German: Wirklichkeit] they are not certain and insofar they are certain, they do not refer to reality.

Einstein praises the axiomatic method for bringing clarity into this dichotomy. It allows one to cleanly separate the formal aspects from those regarding the (physical) content. I think it is fair to say that this is an attitude most working theoretical physicists would agree with.

In a second and more technical part, I will focus on the geometric theories of Special and General Relativity, which I am most familiar with. There I will also report on some mathematical results that relate to various axiomatization programmes of space-time theories.

3 Some Examples

In this section, I wish to touch upon a few examples in modern (after and including Newton) physics, with ambivalent attitudes towards axiomatization.

Fig. 10.1 Isaac Newton and the cover page of his ‘Principia’, in which he presented his theory in a form that would do honour to modern treatments in the consistency and clarity with which ideas from mathematics and natural philosophy (physics) are developed separately and, carefully, connected. [Picture credits: Wikimedia]

3.1 Isaac Newton and Mechanics

Ever since Newton’s “Principia” (Philosophiae Naturalis Principia Mathematica, 1686; see Fig. 10.1), theories for selected parts of the phenomenological world have been presented in a more or less axiomatic form. In the physics community, it is widely accepted, if sometimes only implicitly, that

falsification is the essence of progress in physics

$$\boxed{\boxed{ (A\rightarrow B)\Rightarrow (\bar{B}\rightarrow \bar{A}) }} \tag{10.1}$$

Only if the physicist’s “deduction” \(A\rightarrow B\) from the theoretical hypotheses (here collectively denoted by the letter A) to the phenomenological consequences (collectively denoted by B) is indeed an unbroken logical conclusion (within the logical system of the theory at hand) can we actually learn something definite from the occurrence of \(\bar{B}\), namely, that at least some of our hypotheses within A must be false. Nothing definite can be learned from B happening, except a gain in (subjective) confidence in our theory, which is often referred to as a theory’s “confirmation”. It is mainly for this reason that the axiomatic method is accepted in physics as a proper mode of generating progress.
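The asymmetry between falsification and confirmation described here can be checked mechanically. The following is only a minimal sketch: it treats A and B as plain truth values (a simplification introduced for illustration), and the helper `implies` and the variable names are assumptions, not notation from the text. It enumerates all truth assignments and tests which inference patterns hold in every case.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """Material implication p -> q."""
    return (not p) or q


# All four truth assignments for the pair (A, B).
assignments = list(product([False, True], repeat=2))

# (A -> B) => (not B -> not A): the contraposition behind falsification.
contraposition_valid = all(
    implies(implies(a, b), implies(not b, not a)) for a, b in assignments
)

# From (A -> B) and not B, conclude not A: a failed prediction refutes A.
falsification_valid = all(
    implies(implies(a, b) and (not b), not a) for a, b in assignments
)

# From (A -> B) and B, conclude A: "confirmation" read as a deduction.
confirmation_valid = all(
    implies(implies(a, b) and b, a) for a, b in assignments
)

print(contraposition_valid, falsification_valid, confirmation_valid)
# prints: True True False
```

The last value is `False` because of the assignment A false, B true: a correct prediction is compatible with false hypotheses, which is why observing B yields no definite conclusion about A.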

For a modern physicist, Newton’s Principia is the classic example of that kind of approach. Rigorous mathematical deductions are based on careful and profound conceptual discussions. On the other hand, in order to keep this rigorous line of reasoning, Newton had to abstain from certain speculations that, too, are a necessary part of theorizing in physics. A good example of this is given by his letter to Bentley of February 25, 1692, in which Newton clearly states, as nowhere in the Principia, his belief that his theory of gravity is essentially incomplete, independent of the fact that it allows one to compute celestial motions. What it lacks is a proper “philosophical” understanding of how the gravitational action, the quantity of which he had fully outlined, is actually mediated from one body to another. He writes [60, Letter 406, pp. 253–4]:

That gravity should be innate inherent and essential to matter so that one body may act upon another at a distance through a vacuum without the mediation of anything else by and through which their action of force may be conveyed from one to another, is to me so great an absurdity that I believe no man who has in philosophical matters any competent faculty of thinking can ever fall into it. Gravity must be caused by an agent acting constantly according to certain laws, but whether this agent be material or immaterial is a question I have left to the consideration of my readers [of the Principia].

By “leaving the decision to the readers” Newton seems to say that the intended applications of his theory are independent of such “philosophical” questions. This is, of course, not true. In 1805, Laplace in his Mécanique Céleste considered a fluid model of the aether as the carrier of gravitational fields, which put the finite propagation speed into evidence and had the immediate consequence that the force that one body exerts onto another does not point parallel to the line connecting simultaneous positions. Nevertheless, the tremendous success of the Principia can be seen as due to Newton’s clever and well-chosen separation between aspects that fit into an axiomatic scheme of sufficient predictive power and those aspects that await further “philosophical” clarification without much impact on the current set of intended applications. In my opinion, a very strong case for the axiomatic method indeed.

3.2 Heinrich Hertz and Modern Analytical Mechanics

Hertz’s “Prinzipien der Mechanik” was finished in October 1893, three months before the author’s tragic death at the age of 36 due to “blood poisoning” (Fig. 10.2).Footnote 2 It was published posthumously in 1894 by his assistant Philipp Lenard. This is a very unusual book indeed, praised by many, but also considered “not really useful” and “totally unsuitable for the beginners” (Sommerfeld).Footnote 3 It presents the foundations of mechanics in a new and axiomatic way, keeping a strict separation between “kinematics and geometry” on one side and “mechanics proper” on the other. A central aim of Hertz’s programme was to eliminate the semi-intuitive usage of the concept of “force” and to replace it with the analytically much clearer variational principles (d’Alembert, Maupertuis, Lagrange, Jacobi, Hamilton), to the collection of which Hertz contributed yet another one: the principle of the straightest path.

Fig. 10.2 Heinrich Hertz and the cover page of his remarkable book on the “Prinzipien der Mechanik”, which unfortunately left little lasting impression in the physics community. [Picture credits: Wikimedia (left); the right picture is taken from the public domain reprint of Hertz [32] by Sändig Reprint Verlag, Liechtenstein (1999)]

Today Hertz’s Mechanics is little known, if at all, among physicists. What remained of his treatment are the systematic distinction and characterization of “holonomic” (integrable) and “anholonomic” (non-integrable) constraints, nowadays presented in modern differential-geometric language, and, more famously, the exceptional introductory chapter that outlines in a programmatic fashion and in great detail (50 pages) Hertz’s epistemological concept of the role of theories in physics and science in general. It gives a clear view of the philosophical attitude behind this remarkable treatise on mechanics. It is symptomatic that this “Einleitung” is still available as a separate book in a modern edition, whereas the actual text on mechanics only exists as a photocopy-based reproduction by Sändig Reprint Verlag, Liechtenstein (1999).

The extensive “Einleitung” is, to be sure, meant to justify the approach that is to follow: a meticulously organized string of definitions, remarks (“Anmerkungen”), theorems (“Lehrsätze”), conclusions (“Folgerungen”), additions (“Zusätze”), and exercises (“Aufgaben”). The book is divided into two parts, called “books” (“Bücher”), the first being entitled: “On geometry and kinematics of material systems”, the second: “On mechanics of material systems”. The logic behind this strict division reflects the epistemological Ansatz outlined in the introduction to Hertz [32]:

We form for ourselves images or symbols of external objects; and the form which we give them is such that the necessary consequents of the images in our mind are always the necessary consequents in nature of the things pictured.

The images which we here speak of are our conceptions of things. With the things themselves they are in conformity in one important respect, namely, in satisfying the above-mentioned requirement. For our purpose it is not necessary that they should be in conformity with the things in any other respect whatever.

The images which we may form of things are not determined without ambiguity by the requirement, that the consequences of the images must be the images of the consequences.

Of two images of the same object, that one is the more appropriate which pictures more of the essential relations of the object,—the one which we may call the more distinct. Of two images of equal distinctness the more appropriate is the one which contains, in addition to the essential characteristics, the smaller number of superfluous or empty relations—the simpler of the two.

Fig. 10.3 Table of contents of both parts (“books”) of Hertz’ “Prinzipien der Mechanik”. The first book (left) on the “geometry and kinematics of material systems” is deliberately and carefully kept distinct from the second (right) on the “mechanics of material systems”. [Picture credits: Pictures reproduced from the public domain reprint of Hertz [32] by Sändig Reprint Verlag, Liechtenstein (1999)]

The tables of contents of both books are given in Fig. 10.3. This division clearly illustrates Hertz’ epistemology: Abstract pictures (left-hand side) obtained from “inner inspection” (innere Anschauung) versus knowledge of the real world (right-hand side) obtained from “experience” (Erfahrung). For example, the first chapter in each book is on the concepts of “Time, Space and Mass”, approached via “inner inspection” and “experience”, respectively. This may be criticized on many accounts, not least because “experience as such” is too naive a concept. But what is important for us here is that this dual approach that aims for axiomatization is spelled out explicitly and embedded in a rich epistemological discussion. Hertz convincingly demonstrates how this “mapping-epistemology” almost inevitably leads to the requirement of a dual development, keeping the formulation of structures based on “inner inspection” and “outer experience” sufficiently independent, with connections being carefully (and reversibly!) drawn only after they are reasonably matured.

3.3 Constantin Carathéodory and Classical Thermodynamics

A classic branch of physics that has invited axiomatic formulations again and again up to this very day is Thermodynamics. The preface of the fifth volume of Sommerfeld’s “Lectures on Theoretical Physics”, which is on Thermodynamics and Statistics, opens with the following sentence:

Thermodynamics is the paradigm [German: Musterbeispiel] of an axiomatically constructed science.

The person most often named as the initiator of serious mathematical attempts in this direction is Constantin Carathéodory.Footnote 4 In 1909, he gave an innovative axiomatic formulation of phenomenological (i.e. not statistical) thermodynamics which included his “principle of adiabatic inaccessibility” [14] (Fig. 10.4). That principle, which he understood as a direct expression of numerous phenomenological facts, had a very simple formulation in terms of Pfaffians (differential one-forms). It says, in modernized vocabulary, that the kernel distribution for the one-form of heat must be integrable and hence admits an integrating factor (denominator), the latter being essentially the temperature (up to reparametrization). In this way, the powerful machinery of differential forms was identified as the right tool to express phenomenological facts of great generality and almost universal applicability.Footnote 5
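As a hedged illustration of what “integrating denominator” means (a standard textbook case, not taken from Carathéodory’s paper), consider an ideal gas with state variables \((T,V)\). The constant heat capacity \(C_V\), the amount of substance \(n\) and the gas constant \(R\) are assumptions introduced only for this example. The heat one-form \(\omega \) and its exterior derivative are

$$\omega = C_V\,dT + \frac{nRT}{V}\,dV, \qquad d\omega = \frac{nR}{V}\,dT\wedge dV \ne 0,$$

so \(\omega \) is not exact. Dividing by \(T\) gives

$$\frac{\omega }{T} = C_V\,\frac{dT}{T} + nR\,\frac{dV}{V} = d\bigl (C_V\ln T + nR\ln V\bigr ) =: dS,$$

which is exact, with \(T\) in the role of the integrating denominator and \(S\) that of the entropy. With only two state variables every nowhere-vanishing one-form locally admits an integrating factor, so the principle has real content only for systems with three or more state variables. There, integrability of the kernel distribution of \(\omega \) is equivalent to the Frobenius condition \(\omega \wedge d\omega =0\).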

Fig. 10.4 Constantin Carathéodory and the top of the first page of his axiomatic formulation of thermodynamics [14] that later received great recognition and largely influenced modern developments. [Picture credits: Wikimedia (left) and Springer Verlag (right)]

With a delay of about 10 years, Carathéodory’s axiomatization was greeted with much respect and considered highly useful for future developments, including pedagogical aspects. The first to recognize this was Born [10], followed by Ehrenfest-Afanasjewa [21], and even the influential and widely read Geiger-Scheel Handbook of Physics included a separate entry on Carathéodory’s axiomatization, written by Landé [43]. Born, who can be said to have been the strongest early supporter of this line of research, remained a lifelong advocate, as can be seen from his wonderful presentation of Carathéodory’s ideas in his famous semi-popular book “Natural Philosophy of Cause and Chance” [12]. Many modern textbooks and lecture notes from the second half of the twentieth century pay due tribute to the work of Carathéodory, see, e.g. [73] and [74].

But there is more to it. Carathéodory’s work inspired others to pursue this programme of axiomatization of thermodynamics further. In particular, it was felt that Carathéodory’s principle is genuinely local in nature, whereas one would also like to make global statements, see [27, Chap. 1.3]. Moreover, it was felt that the restriction to equilibrium states should be relaxed, so as to also include the more realistic processes in actual applications. This development already set in during the 1960s with the remarkable treatise by Giles [27], the programme of which was extended and given new life with a “fresh approach” by Elliot Lieb and Jakob Yngvason in an impressive series of papers addressing a wide range of interested physicists and mathematicians alike [45,46,47,48]. Their work is fundamentally based on the structure imposed by an order relation \(\prec \), to be read as: \(A\prec B\) if and only if state B is adiabatically accessible from state A. This relation is now defined globally on the space of admissible (not necessarily equilibrium) states.

Many physicists think that abstract experimenting with axiomatic formulations is a game to be played at best after the essential physics has already been understood. But thermodynamics is a case where this critical attitude is certainly unjustified. In that respect it is remarkable that the “tour de force of physical and mathematical reasoning”, as the Lieb-Yngvason approach was sometimes called [50, p. 631], was highly welcomed by some members of engineering departments, as is evidenced by the textbook by Thess [76] with the title “The Entropy Principle” and the subtitle “Thermodynamics for the Unsatisfied”. In the preface of that book, the author, who is a professor of mechanical engineering, states his dissatisfaction with the often-made claim by physicists that the concept of entropy cannot be understood without recourse to statistical mechanics. He tells the story that when he came across the “fresh look at entropy and the second law of thermodynamics” [47] and also the corresponding more technical elaboration [46], he felt that

For the first time in my academic life I began to feel that I really understood the entropy of classical thermodynamics.[...] Although the theory is mathematically complex, it is based on an idea so simple that each student of science or engineering should be able to understand it. I then decided to involve my students in order to test whether the Lieb-Yngvason theory is as convincing as I believed.

Well, the test apparently turned out positive, and the book [76] by Thess is an outcome of that endeavour.

Fig. 10.5 Max Born and the cover page of his book on the “mechanics of the atom” (Atommechanik) that resulted from his lectures of the winter semester 1923/24 in Göttingen and that, on the eve of Quantum Mechanics, he described as a “logical experiment”. [Picture credits: Springer Verlag (left) and Wikimedia (right)]

3.4 Max Born and the “Old” Quantum Mechanics

Quantum Mechanics, as we know and use it today, was formulated by Heisenberg (matrix mechanics 1925) and Schrödinger (wave mechanics 1926). In the time immediately before that period, the physicists’ approach to the “mechanics of the atom” consisted in applications of advanced methods from analytical mechanics and perturbation theory, mostly borrowed from the methods that theoretical astronomers used. In particular, this included Hamilton-Jacobi theory. These methods were supplemented by “quantum rules” that were imposed on top of, and largely in contradiction with, the dynamical laws of mechanics and electrodynamics, so as to be able to explain the typical quantum phenomena, like the discrete spectral lines, which must correspond to transitions between discrete stable stationary states of the atom, whereas the classical theory inevitably leads to a continuum of unstable states (due to electromagnetic decay). It was clear to everybody that a proper theoretical understanding had to come from a fundamental change in the theoretical foundations, though opinions and expectations differed as to which of the fundamental principles could be maintained and trusted and which had to be given up.

One might think that there can be little value to any attempts to axiomatize a theory that in the minds of the leading scientists is already “written-off”. The Bohr-Sommerfeld theory, as it was called, was a pragmatic list of recipes to calculate (surprisingly successfully) spectroscopic data, but what could possibly be gained from a deep-lying mathematical and conceptual analysis? After all, the theory simply cannot be true.

Precisely! one may reply with Max Born. And because it cannot be true we wish to know how and where it fails, not just that it fails “somehow” and “somewhere”. In order to be able to draw such conclusions, or at least in order to gain insights in that direction, we must give the doomed theory a logical shape that, in principle, allows us to draw such conclusions. The book “Atommechanik” by Born [11] is just such an attempt (Fig. 10.5). This is best explained in the introduction by Born himself:

The title ‘Atommechanik’ of this lecture, which I delivered in the winter-semester 1923/24 in Göttingen, is formed after the label ‘Celestial Mechanics’. In the same way as the latter labels that part of theoretical astronomy which is concerned with the calculation of trajectories of heavenly bodies according to the laws of mechanics, the word ‘Atommechanik’ is meant to express that here we deal with the facts of atomic physics from the particular point of view of applying mechanical principles. This means that we are attempting a deductive presentation of atomic theory. The reservations, that the theory is not sufficiently developed (matured), I wish to disperse with the remark that we are dealing with a test case, a logical experiment, the meaning of which just lies in the determination of the limits to which the principles of atomic- and quantum physics succeed, and to pave the ways which shall lead us beyond those limits. I called this book ‘Volume I’ in order to express this programme already in the title; the second volume shall then contain a higher approximation to the ‘final’ mechanics of atoms.

We refer to [29] for a detailed discussion of Born’s book. As a result, one may say that it did serve its purpose to some extent, for it was Born himself who made some of the decisive contributions to quantum mechanics in the years to follow. In that sense, the book is now obsolete, though it is still known for its concise discussion and application of Hamilton-Jacobi theory (in the older, non-geometric presentation) and its use in mechanical perturbation theory.

3.5 Werner Heisenberg and Quantum Field Theory

It is interesting to compare the approaches discussed so far with that of Heisenberg, who in his later years hoped to give a unified mathematical formulation of elementary particles and their interactions. His basic idea arose from his critical reflection on the very notion of “particle”, which in quantum field theory is a far less obvious concept than the often made recourse to ancient philosophical concepts might suggest. In quantum field theory, the basic entity is the field, which obeys dynamical laws and respects certain symmetries, including the automorphisms of space-time. Particles are associated with certain states, which may or may not be dynamically stable according to the laws of interaction. Heisenberg compared these states with the states of an atom in ordinary quantum mechanics, among which there may be transitions according to dynamical laws and selection rules imposed by symmetries. This attempt of Heisenberg’s was not successful for many reasons, not least because of its uncertain mathematical setting and the ensuing lack of mathematical control, which prevented proper deductions. What is of interest to us here is that Heisenberg takes a different view as regards what should come first: a controlled mathematical setting or a proper physical understanding. This Heisenberg [31] outlined in the preface of his book “Einführung in die Einheitliche Feldtheorie der Elementarteilchen” (Fig. 10.6):

Most physicists would presumably agree that one should not start too early to strive for mathematical rigour. On the other hand, it also seems unclear what it could mean that a physical problem has been “solved”, if not on the basis of “exact mathematical expression”. Here, in my opinion, Born has the better approach in regarding exact mathematical expressions as part of a “logical experiment”. Just as a laboratory experiment requires utmost care in the articulation of its set-up as well as its actual performance in order to guarantee its reproducibility, a logical experiment likewise cannot tolerate faulty or uncontrolled mathematical expression. Proper understanding in physics equally relies on both of these aspects, which should go hand in hand rather than being played off against each other.

This ends our presentation of selected examples of axiomatic thinking in physics. Perhaps the biggest gap in our selection is that left by omitting axiomatic quantum field theory, which would have gone beyond the scope of this presentation. In the second part of this contribution, I will focus on theories and mathematical models of space-time, which due to their close relation with geometry can be viewed as an ideal playing ground for axiomatic thinking.

Fig. 10.6 Quotation from, and cover page of, Werner Heisenberg’s book on the attempted unified field theory of elementary particles, Heisenberg [31]. [Picture credit: Cover page reprinted with kind permission of Hirzel Verlag]

4 Space-Time

Ever since the advent of Special Relativity (SR) in 1905 and General Relativity in 1915, it became manifest that the geometric structure of space-time is to be addressed explicitly as a contingent entity in the formulation of physical laws. This is not to say that space-time structure has a lesser role in, say, classical mechanics. But for a long time that structure was taken as more or less self-evident and without need to be separately listed among the hypotheses of physical theories. It is characteristic of that situation that the need to spell out explicitly the geometric hypotheses underlying Galilei-Newton space-times was only felt after Special Relativity was formulated. To my knowledge, the first to do this for Galilei-Newton space-time was Hermann Weyl in his famous book Raum-Zeit-Materie, the first edition of which appeared in 1918.Footnote 6 He characterized the geometry of Galilei-Newton space-time in terms of affine and metric structures, the latter separately for a “time-metric”, that measures oriented time distances between any two points (also called “events”) in space-time, and a space metric that measures distances between any pair of simultaneous points, i.e. points of vanishing time difference. Following the spirit of his Erlanger Programm, Klein [38] [78, § 18] argued that this type of “geometry” may be characterized by its automorphism group, which in the case of Galilei-Newton space-time is the inhomogeneous Galilei group and in the case of Special Relativity is the inhomogeneous Lorentz group, also known as the Poincaré group. See Künzle [40] for a comprehensive comparison of these structures.

4.1 Minkowski Space

Since Weyl’s treatment of various space-time structures from a unifying perspective, the question of how to characterize and motivate them has appeared time and again on the agenda of mathematically inspired physicists. It led to various attempts to axiomatize the geometry of Minkowski space, i.e. the space-time of SR. An early and very elaborate attempt, even before Weyl, is that of Robb [67,68,69,70], who based his axioms on light propagation in the attempt to design primitives based on intuitive physical operations. Characteristic of his complex systems of axioms (Footnote 7) is the fundamental role played by relations of causality, which in view of later developments (see below) lets Robb appear much ahead of his time (Briginshaw [13]). Another system of axioms for Minkowski space, somewhat modelled after Hilbert’s system for Euclidean Geometry, has been established by Schutz [71]. Remarkably, this system of axioms is shown to be independent. Let it also be mentioned that once more Carathéodory [15] was among the first to give a simplified axiomatic formulation of SR, the intention of which was also to eliminate “rods” but retain a simplified concept of “clock” based on light signals, a so-called “Lichtuhr”, the concept of which already appeared in [34, p. 54], though without further explanation.Footnote 8 Weyl once more returned to an axiomatization of Minkowski space in his (so far unpublished) lecture on “Axiomatik” held at the University of Göttingen in the winter semester 1930–31, of which a manuscript survived at the Institute for Advanced Study in Princeton [77].Footnote 9

It would clearly be hopeless to attempt a fair overview of these developments over the last 100 years, not all of which appeal to the physicist. For a physicist, the axioms should have some more or less intuitive relation to operations that can, in principle, be carried out by means of existing objects. Rather, I wish to state some more or less recent results that are of interest from the physical as well as the mathematical point of view.

4.1.1 Beckman-Quarles Analogues

We begin by recalling the famous theorem of Beckman and Quarles [5] that characterizes the Euclidean group as distance-preserving maps:

Theorem 10.1

(Beckman & Quarles 1953) Let \((\mathbb {R}^n,Q)\) be Euclidean space for \(n\ge 2\) with standard Euclidean quadratic form \(Q(x)=\sum _{i=1}^n x_i^2\) and associated distance function \(d(x,y):=\sqrt{Q(x-y)}\). Let \(T:\mathbb {R}^n\rightarrow \mathbb {R}^n\) be a map such that there exists a positive real number \(\rho \) so that \(d(x,y)=\rho \Rightarrow d\bigl (T(x),T(y)\bigr )=\rho \). Then T is an element of the group \(E_n\) of Euclidean transformations (including orientation reversing reflections).

This theorem is remarkable insofar as the map T is not required to fulfill any other property than preserving a single length \(\rho \). No assumptions whatsoever are made concerning injectivity, surjectivity, bijectivity, continuity, or affine-linearity. All these properties are implied by the single requirement that some distance \(\rho >0\) be preserved, as is the isometry property that if a single distance is preserved, all distances are. Note also that for \(n=1\) the statement of the theorem is obviously false: For example, the map \(T:\mathbb {R}\rightarrow \mathbb {R}\) given by \(T(x)=x+1\) for \(x\in \mathbb {Z}\subset \mathbb {R}\) and \(T(x)=x\) for \(x\not \in \mathbb {Z}\subset \mathbb {R}\) satisfies the hypotheses for \(\rho =1\) without being an isometry.
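The one-dimensional counterexample can be checked mechanically. The following is a minimal sketch using exact rational arithmetic on a finite sample grid; the names `T`, `dist`, and `samples` are illustrative choices, and a finite sample can of course only illustrate, not prove, the claim.

```python
from fractions import Fraction

def T(x: Fraction) -> Fraction:
    """Counterexample map on the real line: shift integers by one, fix all other points."""
    return x + 1 if x.denominator == 1 else x

def dist(x: Fraction, y: Fraction) -> Fraction:
    """Euclidean distance on the real line."""
    return abs(x - y)

# Sample grid -5, -4.75, ..., 5 (quarter steps), an arbitrary choice for illustration.
samples = [Fraction(k, 4) for k in range(-20, 21)]

# Every sampled pair at distance exactly 1 is mapped to a pair at distance 1.
unit_preserved = all(
    dist(T(x), T(y)) == 1
    for x in samples
    for y in samples
    if dist(x, y) == 1
)

# T is nevertheless not an isometry: the distance between 0 and 1/4 changes.
quarter_before = dist(Fraction(0), Fraction(1, 4))
quarter_after = dist(T(Fraction(0)), T(Fraction(1, 4)))

print(unit_preserved, quarter_before, quarter_after)
```

The check works because two points at distance 1 are either both integers or both non-integers, so they are shifted together or not at all.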

It is clear that the Beckman-Quarles theorem can just as well be stated for a general real (or complex, but we are interested in the real case only) affine space \(A^n\) of dimension \(n\ge 2\), whose associated vector space V has a symmetric, positive definite inner product \(g:V\times V\rightarrow \mathbb {R}\), associated quadratic form \(Q_g(x)=g(x,x)\) and distance function \(d_g(x,y):=\sqrt{Q_g(x-y)}\).

The situation in Special Relativity differs from that only insofar as g is not positive definite, but rather has signature \((-1,1,1,1)\). This means that point pairs \((x,y)\) fall into one of three classes:

$$\begin{aligned} Q_g(x-y) {\left\{ \begin{array}{ll} <0 &{} \Leftrightarrow (x,y)\, \text {are timelike separated},\\ =0 &{} \Leftrightarrow (x,y)\, \text {are lightlike separated},\\ >0 &{} \Leftrightarrow (x,y)\, \text {are spacelike separated}. \end{array}\right. } \end{aligned}$$
(10.2)

In this case, a distance function \(d_g\) does not exist. For example, the naive generalization, \(d_g(x,y)=\sqrt{\vert Q_g(x-y)\vert }\), vanishes whenever \((x,y)\) are lightlike separated, so that \(d_g(x,y)=0\) does not imply \(x=y\). However, it is clear that we could have formulated the Beckman-Quarles theorem using the “squared distance” \(d^2_g(x,y)=Q_g(x-y)\). Then it is natural to ask whether the corresponding statement remains true in the indefinite case. This is not at all obvious since the starting idea of the proof given in [5] is tailored to the positive definite case. However, after some effort, it turned out that a Beckman-Quarles result indeed holds for non-zero “squared-distances”:

Theorem 10.2

(Benz 1980-1 and Lester 1981) Let \((\mathbb {R}^n,g)\) be Minkowski space for \(n\ge 2\) with standard Minkowskian quadratic form \(Q_g(x)=g(x,x)=-x_0^2+\sum _{i=1}^{n-1}x_i^2\). Let \(T:\mathbb {R}^n\rightarrow \mathbb {R}^n\) be a map such that there exists a non-zero real number \(\sigma \) so that \(Q_g(x-y)=\sigma \Rightarrow Q_g\bigl (T(x)-T(y)\bigr )=\sigma \). Then T is an element of the Poincaré group \(P_n\) of \((\mathbb {R}^n,g)\), i.e. the composition of a translation and a Lorentz transformation (including orientation-reversing reflections).

The cases of timelike separation (\(\sigma <0\)) were proven for all dimensions (\(n\ge 2\)) by Benz [6], the planar case (\(n=2\)) for timelike or spacelike separation (\(\sigma \ne 0\)) by Benz [7], and the remaining cases of spacelike separation (\(\sigma >0\)) and dimensions \(n\ge 3\) by Lester [44].
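The classification (10.2) and the easy converse of Theorem 10.2, namely that Poincaré transformations preserve every value of \(Q_g\), can be illustrated numerically. The sketch below assumes \(n=4\), a boost with velocity 3/5 along the first spatial axis, and an arbitrary translation; all function names and sample points are hypothetical choices made for illustration, and the hard direction of the theorem is of course not touched by such a check.

```python
from fractions import Fraction as F
from itertools import product

def Q(v):
    """Minkowskian quadratic form Q_g(v) = -v_0^2 + v_1^2 + v_2^2 + v_3^2."""
    return -v[0] ** 2 + sum(c ** 2 for c in v[1:])

def diff(x, y):
    """Componentwise difference x - y."""
    return [a - b for a, b in zip(x, y)]

def separation(x, y):
    """Classify the point pair (x, y) by the sign of Q_g(x - y), as in (10.2)."""
    q = Q(diff(x, y))
    if q < 0:
        return "timelike"
    if q == 0:
        return "lightlike"
    return "spacelike"

def poincare(x, shift=(F(1), F(-2), F(0), F(3))):
    """Boost along the first spatial axis with velocity 3/5 (gamma = 5/4), then translate."""
    gamma, gamma_v = F(5, 4), F(3, 4)
    boosted = [gamma * x[0] + gamma_v * x[1], gamma_v * x[0] + gamma * x[1], x[2], x[3]]
    return [b + s for b, s in zip(boosted, shift)]

# A small integer grid of events (fourth coordinate fixed to zero to keep the sample small).
points = [[F(a), F(b), F(c), F(0)] for a, b, c in product(range(-2, 3), repeat=3)]

# Q_g of every difference is unchanged, so in particular every fixed sigma is preserved.
q_preserved = all(
    Q(diff(poincare(x), poincare(y))) == Q(diff(x, y))
    for x in points
    for y in points
)

origin = [0, 0, 0, 0]
print(separation(origin, [2, 1, 0, 0]), separation(origin, [1, 1, 0, 0]),
      separation(origin, [1, 2, 0, 0]), q_preserved)
```

Exact fractions are used so that the equality tests are not spoiled by rounding; a rational boost velocity is chosen for the same reason.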

4.1.2 Causal Relations

What about lightlike separations? Here no Beckman-Quarles theorem is known. What is known are several results for maps that are required to be bijections and that in both directions preserve the causal- and light-cone structure implicit in (10.2), possibly refined by adding a time orientation.

To state them more precisely, we recall that a time orientation on Minkowski space consists in the selection of one of the two components of the set of non-zero causal (i.e. timelike or lightlike) vectors. The selected component is then called the set of “future pointing” causal vectors. Any member o of that component may then represent the choice of orientation as follows: Let \((V,g)\) be an n-dimensional real vector space with inner product of signature \((-1,1,\cdots ,1)\). Let o be timelike and, without loss of generality, normalized, i.e. \(g(o,o)=-1\). The cone of causal vectors in which o lies is called “the future”. Then it is easy to see that any other non-zero causal (i.e. timelike or lightlike) vector v (i.e. \(g(v,v)\le 0\)) is also an element of the future, if and only if \(g(v,o)<0\).

This allows us to introduce into Minkowski space M (the affine space corresponding to V) the notions of causal and chronological future and past, as well as the light cones, as follows:

$$\begin{aligned} I^+(x)\,:=\,&\{y\in M:Q_g(y-x)<0\wedge g(y-x,o)<0\}\\ =&\quad \text {``chronological future of x''}, \end{aligned}$$
(10.3a)
$$\begin{aligned} I^-(x)\,:=\,&\{y\in M:Q_g(y-x)<0\wedge g(y-x,o)>0\}\\ =&\quad \text {``chronological past of x''}, \end{aligned}$$
(10.3b)
$$\begin{aligned} J^+(x)\,:=\,&\{y\in M:Q_g(y-x)\le 0\wedge g(y-x,o)<0\}\\ =&\quad \text {``causal future of x''}, \end{aligned}$$
(10.3c)
$$\begin{aligned} J^-(x)\,:=\,&\{y\in M:Q_g(y-x)\le 0\wedge g(y-x,o)>0\}\\ =&\quad \text {``causal past of x''}, \end{aligned}$$
(10.3d)
$$\begin{aligned} L^+(x)\,:=\,&\{y\in M:Q_g(y-x)=0\wedge g(y-x,o)<0\}\\ =&\quad \text {``future light-cone of x''}, \end{aligned}$$
(10.3e)
$$\begin{aligned} L^-(x)\,:=\,&\{y\in M:Q_g(y-x)=0\wedge g(y-x,o)>0\}\\ =&\quad \text {``past light-cone of x''}. \end{aligned}$$
(10.3f)

Clearly \(L^\pm (x)=J^\pm (x)-I^\pm (x)\). We also use the notation \(I^+(x)\cup I^-(x)=I(x)\) and correspondingly \(J^+(x)\cup J^-(x)=J(x)\) and \(L^+(x)\cup L^-(x)=L(x)\) for the double cones.
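The definitions (10.3) translate directly into membership tests. The following sketch assumes \(n=4\), integer coordinates, and the particular choice \(o=(1,0,0,0)\), for which \(g(v,o)=-v_0\); the function names are hypothetical and only the future cones are implemented.

```python
from itertools import product

def g(u, v):
    """Minkowski inner product of signature (-1, 1, 1, 1)."""
    return -u[0] * v[0] + sum(a * b for a, b in zip(u[1:], v[1:]))

# Assumed time orientation: a unit timelike vector with g(O, O) = -1.
O = (1, 0, 0, 0)

def diff(y, x):
    """Componentwise difference y - x."""
    return tuple(a - b for a, b in zip(y, x))

def in_I_plus(y, x):
    """y lies in the chronological future of x, cf. (10.3a)."""
    d = diff(y, x)
    return g(d, d) < 0 and g(d, O) < 0

def in_J_plus(y, x):
    """y lies in the causal future of x, cf. (10.3c)."""
    d = diff(y, x)
    return g(d, d) <= 0 and g(d, O) < 0

def in_L_plus(y, x):
    """y lies on the future light-cone of x, cf. (10.3e)."""
    d = diff(y, x)
    return g(d, d) == 0 and g(d, O) < 0

origin = (0, 0, 0, 0)
grid = list(product(range(-2, 3), repeat=4))

# On the sample grid, the future light-cone is the causal future minus the chronological future.
cones_consistent = all(
    in_L_plus(y, origin) == (in_J_plus(y, origin) and not in_I_plus(y, origin))
    for y in grid
)
print(cones_consistent)
```

Because of the strict inequality \(g(y-x,o)<0\), the point x itself belongs to none of these sets under the definitions as stated.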

Using this language, we first state some of the main theorems concerning the question as to how far the causal relations determine the Poincaré group and therefore encode the geometry of Minkowski space-time:

Theorem 10.3

(Alexandrov, Zeeman, Borchers & Hegerfeldt) Let \((M,g)\) be Minkowski space of dimension \(n>2\) and \(T:M\rightarrow M\) a bijection so that whenever \(y\in I^+(x)\) then \(T(y)\in I^+\bigl (T(x)\bigr )\) and \(T^{-1}(y)\in I^+\bigl (T^{-1}(x)\bigr )\); then T is the composition of a time-orientation-preserving Poincaré transformation and a positive constant rescaling \(x\mapsto ax\) with \(a>0\), called a positive “homothety”. The same is true if we replace \(I^+\) by \(I^-\), \(J^+\), \(J^-\), \(L^+\), or \(L^-\). Moreover, without invoking a time orientation, we still have the corresponding statements. That is, if \(y\in I(x)\) implies \(T(y)\in I\bigl (T(x)\bigr )\) and \(T^{-1}(y)\in I\bigl (T^{-1}(x)\bigr )\), then it follows that T is the composition of a Poincaré transformation (possibly time-orientation-reversing) and a general homothety \(x\mapsto ax\) with \(a\ne 0\).

Most of this was proven by Alexandrov [1] and independently by Zeeman [79], the non-time-oriented case separately by Borchers and Hegerfeldt [9]. Note that the existence of the additional homotheties \(x\mapsto ax\) is obvious, since they clearly preserve the causal and chronological relations and also lightlike separations. In that sense, the stated results are the strongest one could have hoped for, except perhaps for the requirement that the maps be bijections, a requirement that was not necessary in the Beckman-Quarles case. But again we emphasize that continuity and even the affine character were not required, but rather come out as results.
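That homotheties preserve these relations can be spelled out in one line each. The following short computation assumes that \(x\mapsto ax\) is taken with respect to a fixed origin of M and uses only the bilinearity of g:

$$\begin{aligned} Q_g(ay-ax)&=a^2\,Q_g(y-x),\\ g(ay-ax,o)&=a\,g(y-x,o). \end{aligned}$$

For \(a\ne 0\) the sign of \(Q_g(y-x)\) is therefore unchanged, so timelike, lightlike, and spacelike separations are each preserved. The sign of \(g(y-x,o)\) is preserved if and only if \(a>0\); for \(a<0\) future and past are interchanged, which is why negative homotheties appear only in the non-time-oriented statement involving the double cones \(I(x)\).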

The results of this last theorem have been interpreted in various ways. For example, the fact that Poincaré transformations and homotheties are the only bijections that preserve the relation of point pairs being lightlike separated can be read as saying that the invariance of the speed of light alone already determines the Poincaré group (up to homotheties), without any essential further input from the “principle of relativity” and, in particular, the “law of inertia”. We recall that it is the latter that is usually invoked to get the affine structure of space-time through the particular path structure determined by the law of inertia.Footnote 10 Hence, we may say that the affine structure of Minkowski space is already encoded in its light-cone structure, and also in its causal structure. Sometimes this is expressed, not quite accurately, by saying that “causality implies the Lorentz group”, which, in fact, is just the title of [79].

4.1.3 Non-standard Topologies

A related line of attack for characterizing the full structure of Minkowski space by axioms with a more or less intuitive physical meaning in terms of elementary operations is to endow it with another, finer (in the sense of “more” open sets) topology than the standard one.Footnote 11 A first suggestion in that direction was made by Zeeman [80] with his “fine topology”, which is the finest that induces the standard topology on any timelike straight line and any spacelike hyperplane. This topology is strictly finer than the standard one, and Zeeman showed that the group of homeomorphisms of Minkowski space endowed with the fine topology is precisely the Poincaré group extended by homotheties. Negative aspects of this fine topology are that, albeit being Hausdorff, connected, and locally connected, it is neither normal, nor locally compact, nor first countable. Zeeman concluded that “these disadvantages are outweighed by the physical advantages” [80, p. 162]. The physical advantages may be expressed by saying that “openness” in this topology is defined in a physically more operational form, since a set is defined to be open if every inertial observer (moving on a timelike straight line) “times” it to be open, and if every equivalence class of mutually simultaneous events intersects it in an open set.

That these “advantages” are not so obvious has been argued by Hawking et al. [30]. First, the restriction to inertial observers (i.e. to straight timelike lines) is clearly too restrictive and, second, an experiment that takes place in real time with finite duration cannot directly access sets of mutually simultaneous events. Consequently, a different topology was proposed in [30], called the “path topology”, which is defined to be the finest topology that induces the standard topology on any timelike curve.Footnote 12 That is, a set is open if and only if every observer “times” it to be open. This definition applies to arbitrary space-times, not just Minkowski space, and leads to topologies which are path connected, locally path connected (hence connected and locally connected), and Hausdorff, and which improve on Zeeman’s fine topology by also being first countable and separable. However, the path topology is not regular, normal, locally compact, or paracompact [30, Theorem 3]. The homeomorphisms of a space-time in the path topology are precisely the smooth conformal isometries. In that sense, the causal, differential, and conformal structure of a space-time can be encoded into a topology that has a fairly straightforward physical interpretation. That this is true in any space-time (and not just the so-called strongly causal ones, as assumed by Hawking et al. [30]) has been proven by Malament [49]. Moreover, for special space-times which obey the condition of being “future and past distinguishing”, there is indeed a certain analogue of the Alexandrov-Zeeman results quoted above. To state it, we remark that the definitions (10.3) can be generalized to arbitrary Lorentzian, time-oriented manifolds \((M,g)\), where the chronological future \(I^+ (x)\) of a point x is the set of points that can be reached from x by a future-pointing timelike smooth curve. If we replace the word timelike by causal (i.e. nowhere spacelike) we get the definition of \(J^+(x)\). Now, a space-time is future or past distinguishing if and only if \(I^+(x)=I ^+(y)\) implies \(x=y\) or \(I^-(x)=I^-(y)\) implies \(x=y\), respectively. Then, we have

Theorem 10.4

(Malament 1977) Let \((M,g)\) and \((M',g')\) be two future and past distinguishing space-times (connected, time-oriented, four-dimensional smooth Lorentz manifolds without boundary). Let \(T:M\rightarrow M'\) be a bijection so that both T and \(T^{-1}\) preserve the chronological relation, i.e. \(y\in I^+(x)\) if and only if \(T(y)\in I^+\bigl (T(x)\bigr )\). Then T is a smooth conformal isometry, that is, T is a diffeomorphism and there exists a smooth, nowhere vanishing function \(\Omega \) on M so that \(T^*g'=\Omega ^2\,g\).

This is proven (among other results) by Malament [49]. The partial-order relation of causal connectibility thus encodes the entire topological, differential, and conformal structure, at least if the space-time is assumed to be future as well as past distinguishing. The necessity of both conditions has also been demonstrated by Malament [49].

4.2 General Relativity

With the work of Hawking et al. [30] and Malament [49] we have entered the realm of General Relativity, the axiomatization of which goes back to Hilbert’s “Grundlagen der Physik” [33,34,35]Footnote 13 and has ever since remained a research topic on the agenda of interested and mathematically inspired relativists. In the 100 years since Hilbert’s first attempts, activities have increased rather than decreased, if one compares the second 50 years to the first.

General Relativity describes the gravitational interaction of matter in terms of the geometry of space-time \((M,g)\), which is considered to be a four-dimensional differentiable manifold M (the points of which are called “events”) with Lorentzian structure g (i.e. a symmetric, non-degenerate bilinear form in each tangent space, which is of signature \((-1,1,1,1)\)). The fundamental physical principle behind this geometrization of gravity is Einstein’s Equivalence Principle, according to which all matter components, from elementary fields and particles to astronomical objects, couple to gravity in a universal fashion. This “universality” is such that it can be encoded in a single geometry of space-time that is the common habitat for any form of matter.

Hilbert’s axiomatization of General Relativity is closely linked up with the far more ambitious project to find a common basis for all of (fundamental) physics in terms of the then known fundamental fields, namely, the gravitational and the electromagnetic field. For the latter, he followed the lines of “Grundlagen einer Theorie der Materie” by Mie [55,56,57], the plan of which was to understand the elementary constituents of matter, e.g. the electron, in terms of exact, finite-energy solutions of a mathematically modified, non-linear theory for the electromagnetic field. At that time, Mie’s theory was perceived by many to be a promising candidate, among them Weyl, who from the third edition [78] onwards included summaries of Mie’s theory in his “Raum-Zeit-Materie”, though in the fifth edition he turned more sceptical. The “feldtheoretische Einheitsideal”, as Hilbert still liked to call it in [35, p. 1], was looked upon with increasing scepticism by the then younger generation of physicists, perhaps most penetratingly by the 20-year-old Wolfgang Pauli, who ended his celebrated 237-page article “Relativitätstheorie”, written for the “Enzyklopädie der Mathematischen Wissenschaften”, after having devoted separate chapters to various “matter-theories” of Mie, Weyl, and Einstein, with the following words [62, p. 775]:

Whatever one may think of these arguments, one thing seems certain: that the foundations of the current theory need to be supplemented by new elements, which are foreign to the continuum theory of fields, in order to achieve a satisfactory solution of the problem of matter.

More modern attempts to give an axiomatic basis for General Relativity are much more reluctant to link it up with contemporary theories of fundamental matter. Still today it is entirely unclear how the quantum(field)-theoretic nature of fundamental matter relates to the classical field theory of space-time and its geometry, which has so far resisted all attempts at “quantization”, despite enormous efforts up to this day. On the other hand, General Relativity needs matter for its interpretation as a theory of “physical geometry”. Its geometric statements refer to the actual behaviour of “clocks” and “rods”. But “clocks” and “rods” are usually very complex many-body systems, eventually based on the laws of quantum physics.

This operational meaning of the notion of “geometry” in physics has been frequently stressed by Einstein, in a very illuminating form in his “Geometrie und Erfahrung” [22]. “Clocks” and “rods” entered the picture as idealized objects the relations of which define what one intends to define by “physical geometry” in the first place. Hence, there is a reciprocal dependence between matter and geometry, since the laws of matter also depend on the geometry of space and time.

One might argue that even with the addition of the qualification “idealized”, clocks and rods are not yet sufficiently well characterized, for it has not yet been said precisely what geometric relations they are supposed to determine. This has once been pointedly expressed by Robb, whose scepticism against the use of such vaguely defined concepts is omnipresent throughout his entire work [67,68,69,70]. In [70, p. 13], he wrote:

It is not sufficient to say that Einstein’s clocks and measuring rods are ideal ones: for, before we are in the position to speak of them as being ideal, it is necessary to have some clear conception as to how one could, at least theoretically, recognise ideal clocks and measuring rods in case one were ever sufficiently fortunate as to come across such things; and in case we have this clear conception, it is quite unnecessary, in our theoretical investigations, to introduce clocks or measuring rods at all.

In a memorable discussion that took place in September 1920 at the “Tagung der Gesellschaft Deutscher Naturforscher und Ärzte in Bad Nauheim”, where many “anti-relativists” voiced their “concerns”, Einstein addressed this fundamental difficulty in a reply to the mathematician Georg Hamel, who inquired about the physical interpretation of gravitational redshift. Einstein’s reply reads as follows [37, Doc. 46, p. 353]:

It is a logical weakness of the theory of relativity in its present state to be forced to introduce rods and clocks as separate entities instead of being able to deduce them as solutions of differential equations. However, in view of the empirical foundations of the theory, those consequences regarding the behaviour of rigid bodies and clocks are among the most reliable ones.

Weyl, who also attended and spoke at this conference, was the first to systematically replace these unclear (in the sense of their dynamical modelling) notions of “clocks” and “rods” by something more transparent. He clearly felt that these notions of uncertain physical foundation should not serve as primitives in any axiomatic scheme of GR. This is why he came up with the idea to at least reduce complexity by replacing “clocks” and “rods” with “particles” and “light-rays”, both of which are, in fact, solutions to differential equations, as Einstein required in his reply to Hamel. Particles move on timelike “autoparallel curves”, by which one understands geodesics modulo their parametrization. If the autoparallel curve is parametrized by its proper length, or any parameter affinely equivalent to that, it is called a “geodesic”.Footnote 14 But here it is deliberately not assumed that the particle is a clock, that is, it is not capable of measuring and recording the length of the curve.Footnote 15 Hence, “particles” determine the set of all timelike autoparallels (or unparametrized timelike geodesics), which in Weyl’s terminology define a “projective structure”. In contrast, an “affine structure” is defined by the set of all geodesics (parametrized). Light rays, on the other hand, determine the light cone in each tangent space of the manifold. Now, knowing the set of vectors v for which \(Q_g(v)=0\) determines the quadratic form \(Q_g\), and hence g, up to a multiplicative constant. For the manifold, this means that light rays determine the metric up to conformal rescalings \(g\mapsto \Omega ^2 g\), or, in other words, the conformal structure. Weyl proved that a Lorentzian metric is—up to globally constant rescalings—uniquely characterized by the conformal and projective structures it defines.

But the actual task is the converse: to derive the existence of a Lorentzian metric from the primitives consisting of a set M of events and sets of subsets called “particles” and “light rays”. In 1972 Ehlers et al. [19]Footnote 16 posed and partially solved the ambitious problem of setting up a system of axioms involving only “particles” and “light rays”, from which the hierarchy of

$$ \text {topological--differential--conformal--projective--affine--metric} $$

structures should be derived, eventually resulting in a four-dimensional differentiable manifold with pseudo-Riemannian structure of Lorentzian signature \((-,+,+,+)\).

4.2.1 The Ehlers-Pirani-Schild System

The system of axioms set up by Jürgen Ehlers, Felix Pirani, and Alfred Schild—or “EPS-System” as it is often called—has, roughly, the following structure:

\(\bullet \):

Primitive elements are a set M of “events” (points in space-time) and two sets of subsets \(\mathcal {L}\) and \(\mathcal {P}\), called “light-rays” and “particles”.

D:

A set \(D_1,\cdots , D_4\) of four axioms characterizes the differential-topological structure of M. Typical requirements are that “particles” are smooth (here \(C^3\)) one-dimensional manifolds and that the self-maps from one particle to itself based on light-“echoes” (forward-backward connections between neighbouring particles using light rays) are also smooth.

L:

On top of [D], a set \(L_1,L_2\) of two axioms fixes the causal structure with an underlying \(C^3\) manifold M and a \(C^2\) conformal structure. The latter consists in a conformal equivalence class \(\mathscr {C}\) of Lorentzian metrics: if \(g\in \mathscr {C}\) then \(\mathscr {C}=\{\exp (\Omega )g: \Omega \in C^2(M,\mathbb {R})\}\). Here \(C^2(M,\mathbb {R})\) denotes the set of twice continuously differentiable (\(C^2\)), real-valued functions on M.

P:

On top of [D], a set \(P_1,P_2\) of two axioms characterizes a projective structure \(\mathscr {P}\), by which one understands the class of free-fall worldlines without parametrization, i.e. embedded one-dimensional submanifolds. In terms of curves as maps \(\mathbb {R}\rightarrow M\), i.e. parametrized curves, the former can be characterized by equivalence classes with respect to the equivalence relation induced by reparametrization. Eventually, \(\mathscr {P}\) can be thought of as an equivalence class of torsion-free connections, where if \(\Gamma \in \mathscr {P}\) and E denotes the \(T^1_1(M)\)-valued tensor-field of identity-endomorphisms (in each tangent space), then \(\mathscr {P}= \{\Gamma +E\otimes \varphi +\varphi \otimes E: \varphi \in ST^*(M)\}\). (\(ST^*(M)\) denotes the set of smooth sections in the cotangent bundle.)

C:

A last axiom, C, requires some “compatibility” (see below) between the conformal structure \(\mathscr {C}\) and the projective structure \(\mathscr {P}\). Given that compatibility, the authors claim to be able to derive a Weyl structure,Footnote 17 by which one understands a triple \((M,\mathscr {C},\nabla )\), where \(\nabla \) is a \(\mathscr {C}\)-compatible connection, which means that it is torsion free and there exists for any \(g\in \mathscr {C}\) a covector field \(\varphi _g\) such that

$$\begin{aligned} \nabla g=\varphi _g\otimes g. \end{aligned}$$
(10.4)

It is easy to check that if (10.4) holds for \((g,\varphi _g)\) then it also holds for \((g',\varphi _{g'})\) with \(g'=\exp (\Omega )g\) and \(\varphi _{g'}=\varphi _g+d\Omega \).

R:

In order to reduce this to a semi-Riemannian geometry, additional physical input is needed. The choice of Ehlers et al. [19] was to postulate the absence of the so-called “second clock-effect”.Footnote 18 In this case, the one-form \(\varphi _g\) is closed, \(d\varphi _g=0\), and hence locally exact, \(\varphi _g=-d\Omega \), so that \(\nabla \) is the Levi-Civita connection for \(g'=\exp (\Omega )g\) and we are back to the semi-Riemannian case. In this case, one calls the Weyl geometry “integrable”.
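Two of the “easy to check” claims in the list above can be made explicit. The following computation is a sketch under stated conventions: (10.4) is read as \(\nabla _Xg=\varphi _g(X)\,g\) for every vector field X, and the projective change of connection is written in components as \(\Gamma '^a_{bc}=\Gamma ^a_{bc}+\delta ^a_b\varphi _c+\varphi _b\delta ^a_c\), with u the tangent vector of a curve; these index conventions are assumptions made here for illustration.

$$\begin{aligned} \nabla g'&=\nabla \bigl (e^{\Omega }g\bigr )=d\Omega \otimes e^{\Omega }g+e^{\Omega }\,\varphi _g\otimes g=(\varphi _g+d\Omega )\otimes g',\\ \varphi _g=-d\Omega \;&\Longrightarrow \;\varphi _{g'}=0\;\Longrightarrow \;\nabla g'=0,\\ u^b\nabla '_bu^a&=u^b\nabla _bu^a+2\,\varphi (u)\,u^a. \end{aligned}$$

The first line reproduces the rule \(\varphi _{g'}=\varphi _g+d\Omega \). The second line shows that in the integrable case the torsion-free connection \(\nabla \) is metric for \(g'\), hence its Levi-Civita connection. The third line shows that if \(u^b\nabla _bu^a\) is proportional to \(u^a\), so is \(u^b\nabla '_bu^a\), which is why all members of \(\mathscr {P}\) share the same autoparallels.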

There is a technical and a conceptual problem with this approach. The technical issue concerns the precise notion of “compatibility” that should be invoked in axiom C to achieve the reduction to a Weyl geometry. The conceptual issue concerns the choice of physical input that allows one to further reduce the Weyl geometry to the semi-Riemannian one.

Let us start with the conceptual problem. It was felt by many that the choice of Ehlers et al. [19], to just declare the non-existence of the second clock-effect, is physically not convincing and goes against the spirit of the whole approach. First, it is not convincing because just postulating the absence of this unwanted effect is conceptually not sufficient. Rather, one should show its incompatibility with fundamental properties of matter. Second, it goes against the spirit because it re-introduces the notion of “clocks”, which we wanted to eliminate.Footnote 19

It is clear that in order to meet this conceptual criticism one has to inject more physics into the scheme that, first, allows one to eliminate non-integrable Weyl geometries and, second, keeps the newly injected physics simple enough not to lead us back to “clocks” and “rods”. The central and physically very plausible idea was to inject the information that the point particles that define the projective structure are, according to modern physics, eventually described by quantum mechanics and/or quantum field theory, the wave equations of which lead to particle trajectories in the short-wavelength limit. This is just like light rays emerging in the short-wavelength limit from Maxwell’s equations, except that here we get timelike worldlines for massive fields. It could indeed be shown that using matter fields as primitives suffices to finally arrive at semi-Riemannian geometries, see [2,3,4].

As regards the technical problem, it turned out that there can be subtle differences in the precise formulation of how the projective structure \(\mathscr {P}\) and the conformal structure \(\mathscr {C}\) are required to be “compatible”. A recent paper by Matveev and Scholz [53] distinguishes three notions of compatibility:

  1. (A)

    Light cone compatibility: Every lightlike autoparallel of \(\mathscr {C}\) is an autoparallel of one (and hence each) connection representing \(\mathscr {P}\). This is how the compatibility is formulated by Ehlers et al. [19].

  2. (B)

    Riemannian compatibility: There exists a \(g\in \mathscr {C}\) the Levi-Civita connection of which represents \(\mathscr {P}\).

  3. (C)

    Weyl compatibility: There is a \(\mathscr {C}\)-compatible Weyl connection representing \(\mathscr {P}\).

The issue arises because in [19] compatibility is defined as in (A) above, but the conclusion drawn by these authors is as if (C) were required. Now, it is not difficult to see that (C) and (B) each imply (A). Comparatively recently, it was shown by Matveev and Trautman [54] that (B) is strictly stronger than (A), i.e. that (A) does not imply (B). This left open the question whether (A) implies (C), as conjectured by Ehlers et al. [19]. This question was answered in the affirmative only very recently by Matveev and Scholz [53], so that (A) and (C) are, in fact, equivalent.

Finally, I wish to add one more technical observation concerning the EPS scheme. One might wonder how, despite the final gap from Weyl to semi-Riemannian geometries, this scheme manages to end up directly with the already very special class of Weyl geometries. How, for example, does it come about that Finsler geometries are ruled out? We recall that Finsler-like generalizations of Lorentzian geometries can be definedFootnote 20 and have been applied to problems in gravitational physics in order to test possible deviations from the predictions of General Relativity, see [41, 42]. Now, given a Lorentzian metric g, it is not difficult to see that Finsler geometries (of Berwald type) exist that have the same conformal and projective structure as g but are not semi-Riemannian, see Tavakol and van den Bergh [75].Footnote 21 This suggests that EPS’ final arrival at a Weyl structure may crucially depend on fine-tuned technical assumptions the physical justification of which one should clarify. To see this in slightly more detail, we look at their axiom \(L_1\) (Fig. 10.7):

Fig. 10.7 Drawing and quotation from Ehlers et al. [20, pp. 72–73]. [Picture credit: Springer Verlag; the figure on the left is reproduced from the cited reprint]

The subtle point here is the required differentiability class \(C^2\) of the function \(g:p\mapsto -t(e_1)t(e_2)\) on all of U, that is, including the points p on P. The function g itself is something like the squared spacelike distance of p to the midpoint between \(e_1\) and \(e_2\). If the distance function is of Euclidean type, i.e. its square is a homogeneous polynomial of degree two (in the limit \(p\rightarrow e\)), its second derivative would just be twice the metric tensor at e. But for a genuine Finsler metric the limit \(p\rightarrow e\) of the second derivative will be direction dependent. The \(C^2\)-requirement therefore eliminates all Finsler metrics which are not of (semi-)Riemannian type.Footnote 22 Note that Finsler metrics still give rise to well-defined geodesic problems, that is, curves extremizing the length functional (in the positive definite case) or energy functional (in the indefinite case), giving rise to ordinary differential equations satisfying the criteria for existence and uniqueness, like Lipschitz continuity, so that the Picard-Lindelöf theorem can be applied to ensure existence and uniqueness. Hence, from that perspective, it would have been sufficient to adopt a criterion weaker than \(C^2\) and still obtain well-defined geodesic principles.Footnote 23 This point has already been made by Tavakol and van den Bergh [75] and more recently by Lämmerzahl and Perlick [41]. For a very recent comprehensive account, see Bernal et al. [8].
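The direction dependence can be made explicit in a toy example. The following two-dimensional, positive-definite quartic function is a hypothetical illustration chosen for computational simplicity; it is not taken from Ehlers et al. or the other cited works, and it mimics the Lorentzian situation only in spirit. Its square is homogeneous of degree two but not a polynomial:

$$\begin{aligned} F(v)^2&=\sqrt{v_1^4+v_2^4},\qquad \frac{\partial ^2F^2}{\partial v_1^2}=\frac{6v_1^2}{F^2}-\frac{4v_1^6}{F^6},\\ \left. \frac{\partial ^2F^2}{\partial v_1^2}\right| _{(t,0)}&=2,\qquad \left. \frac{\partial ^2F^2}{\partial v_1^2}\right| _{(t,t)}=2\sqrt{2},\qquad \left. \frac{\partial ^2F^2}{\partial v_1^2}\right| _{(0,t)}=0\qquad (t\ne 0). \end{aligned}$$

These values do not depend on t, as must be the case for the second derivative of a function that is homogeneous of degree two. Hence the limit \(v\rightarrow 0\) exists along each ray but differs from ray to ray, and \(F^2\) is not \(C^2\) at the origin. For the Euclidean square \(v_1^2+v_2^2\), by contrast, the same derivative equals 2 in every direction.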

So how could the \(C^2\)-requirement really be justified? Ehlers et al. [19] discuss the physical motivation for their axiom \(L_1\) and refer to Isaak [36] for observational/experimental support. But on looking up this reference the reader finds no real “paper” but rather a short letter to the editor (about two-thirds of a single column on a two-column page), followed by the referee’s report, which is so critical that the editor decided to publish the letter only together with the report “by arrangement with the author and the referee”. In any case, Isaak [36] reports on what was then the current observational status on the universal properties of light propagation in matter-free space; universality meaning independence of (a) orientation in space, (b) the source’s state of motion, (c) the frequency, and (d) the polarization. Finslerian geometries are generically non-isotropic and would hence violate (a) and possibly also (c) and (d), though there are interpretational issues regarding cancellations of these non-isotropy effects, which act on the optical as well as solid-state components of the actual experimental device; compare [42].

The lesson to be learned from this is that much more physics than naively anticipated may be hidden in apparently small and innocent-looking regularity assumptions which, therefore, are not so innocent after all.Footnote 24

5 Conclusions and Summary

This ends our little tour of selected aspects of axiomatic thinking in physics. Some of these aspects were rather superficial, others somewhat deeper. I find it hard to come up with a résumé that states more than what has already been said (and partially quoted) by Reichenbach and Einstein. It is clear that the famous dictum ascribed to Hilbert, who in view of the axioms of Euclidean geometry allegedly once said that instead of “points”, “lines”, and “planes” we could just as well say “tables”, “chairs”, and “beer-mugs”, is a critical one when applied to axiomatic physics. In an obvious sense, it remains true: axiomatizations of, say, geometric structures do not as such care about what the objects of this geometry are. They may, for example, be quantum states (density matrices), classical states, or space-time points. But then, as Reichenbach stressed at the beginning, the task for the physicist is not finished. The primitives need to be related to reality; they need to be endowed with a “Wirklichkeitsbezug”. And then, a “beer-mug” makes quite a difference to an abstract “plane”. For the working physicist, the most important value in axiomatic thinking lies in what Einstein [22] called the clean separation of formal aspects from those regarding the (physical) content. Clearly, one might argue that this dichotomy is by itself not so well founded as Einstein made it sound, and I would be inclined to agree with that. On the other hand, I would not know how to substantially improve on Einstein’s [22] statement from “Geometrie und Erfahrung”, which reads as follows [37, Doc. 52, p. 386]:

The progress brought about by axiomatics consists in separating the logically-formal aspects from the actual content and intuitive [anschauliche] aspects; only the logically-formal is subject to axiomatics, not the intuitive [anschauliche] or any other [sonstige] content.

I do not quite like the “any other” [sonstige], as if the scheme alone is devoid of any content. To me, the physical content cannot be thought of as entirely independent of the logical and mathematical structures within which the articulation of that “content” takes place. Pursuing this will presumably lead me onto rather thin ice. So let me summarize as follows:

  • Hilbert’s axiomatization programme is pursued—in one form or another—in many branches of classical and modern physics, though the status of axioms is different to that in pure mathematics.

  • Opinions diverge as regards its heuristic value, that is, concerning its use and power in the creative process of developing “insight” into the Laws of Nature.

  • One of the most interesting but also most difficult questions intimately associated with this programme is how to interpret Hilbert’s term “deepening” (German: “Tieferlegung”). There is no natural objective measure for “depth” and often, in physics, the number of axioms is reduced at the price of a priori inbuilt physical limitations (e.g. Hilbert’s connection of General Relativity with Mie’s theory).

  • In physics, this is related to the problem of “fundamentality”, which is often passionately discussed with too many ideologically motivated preconceptions. I suggest following Max Born and regarding axiomatic approaches pragmatically as “logical experiments”, which contribute to our understanding just as much as experiments in the lab do. Both should go hand in hand and not be played off against each other.

  • Axiomatic approaches to space-time theories in physics are alive and active.