1 Introductory Discussion

1.1 Critique of Quantization and a New Methodology

The idea of quantization was first put forward by Dirac [1] in 1925 in an attempt to extend Heisenberg’s theory of matrix mechanics [2]. He based the concept on a formal analogy between the Hamilton and the Heisenberg equations and on the principle of correspondence, namely that a quantum-theoretical model should yield a “classical” one in some limit. This analogy motivated Dirac to develop a scheme that constructs one or more quantum analogues of a given “classical system” formulated in the language of Hamiltonian mechanics (Footnote 1). When it was discovered that Dirac’s scheme, nowadays known as canonical quantization, was ill-defined (see [3, 4] for the original works by Groenewold and van Hove, also [5, §5.4], in particular [5, Theorem 5.4.9]), physicists and mathematicians attempted to develop a more sophisticated machinery rather than questioning the ansatz. The result has been a variety of quantization algorithms, one of which is particularly noteworthy: geometric quantization (cf. [6, 7] for an introduction).

In his seminal paper, Segal [8] expressed the need to employ the language of differential geometry in quantum theory. He understood that determining the relevant differential-geometric structures, spaces and their relation to the fundamental equations of quantum theory creates the mathematical coherence necessary to adequately address foundational issues in the subject. By merging this ansatz with Kirillov’s work in representation theory [9], Segal, Kostant [10] and Souriau [11] were able to construct the algorithm of geometric quantization. However, rather than elaborating on the relation between quantum and classical mechanics, geometric quantization unearthed a large number of geometric structures [12, §23.2], introduced in an ad hoc manner.

It is tempting to blame this state of affairs on the inadequacy of the geometric ansatz or the theory, but instead we invite the reader to take a step back. What is the reason for the construction of a quantization algorithm? Why do we quantize? Certainly, quantum mechanics should agree with Newtonian mechanics in some approximation, where the latter is known to accord with experiment, but is it reasonable to assume the existence of an algorithm that constructs the new theory out of the old one?

These questions are of a philosophical nature and it is useful to address them within their historical context. Clearly, the step from Newtonian mechanics to quantum mechanics was a scientific revolution, which is why we find the work of the philosopher and physicist Thomas Kuhn [13] of relevance to our discussion. Kuhn is known for his book “The Structure of Scientific Revolutions” [13], in which he analyzed the steps of scientific progress in the natural sciences. For a summary see [14].

Kuhn argues that, as a field of science develops, a paradigm is eventually formed through which all empirical data is interpreted. As, however, the empirical evidence becomes increasingly incompatible with the paradigm, it is modified in an ad hoc manner in order to allow for progress in the field. Ultimately, this creates a crisis, as attempts to account for the evidence become increasingly ad hoc, unmanageably elaborate and ultimately contradictory. Unless a new paradigm is presented and withstands experimental and theoretical scrutiny, the crisis persists and deepens, because of the internal and external inconsistencies of the current paradigm.

This process can be directly observed in the history of quantum theory. When Newtonian mechanics was faced with the problem of describing the atomic spectra and the stability of the atom at the beginning of the twentieth century [15], it was modified ad hoc by adding the Bohr–Sommerfeld quantization condition [15, 16], despite its known inconsistency with then-accepted principles of physics [17, 18]. This ad hoc modification of Newtonian mechanics continued with Werner Heisenberg’s [2] and Erwin Schrödinger’s [19] postulation of their fundamental equations of quantum mechanics, two descriptions later shown to be formally equivalent by von Neumann in his constitutive work [20]. Schrödinger’s and Heisenberg’s descriptions can be viewed as ad hoc modifications, because their equations are formulated on a Newtonian spacetime and intended to replace Newton’s second law without being based on postulated principles of nature. With his quantization algorithm [1], Dirac supplied a convenient way to pass from the mathematical description of a physical system in Newtonian mechanics to the then incomplete, new theory. In accordance with Kuhn’s description, it was a pragmatic, ad hoc step, not one rooted in deep philosophical reflection. Nonetheless, the concept of quantization remains ingrained in quantum theory to this day [21], while the so far futile search for unity in physics has become increasingly ad hoc and elaborate [22, §19].

We are thus reminded of our historical position and the original intention behind quantization: We would like to be able to mathematically describe microscopic phenomena, having at hand neither the fundamental equations describing those phenomena nor a proper understanding of the physical principles involved allowing us to derive such equations. That is, what we lack with respect to our knowledge of microscopic phenomena is, in Kuhn’s words, a paradigm. Rather than having a set of principles of nature, which we use to intuitively understand and derive the fundamental laws of quantum theory, we physicists assume the validity of the old theory, namely Newtonian mechanics or special relativity in its Hamiltonian formulation, only to apply an ad hoc algorithm to obtain laws we have inadequately understood. This is why the concept of quantization itself is objectionable.

Indeed, even if a mathematically well-defined quantization scheme existed, it would remain an ad hoc procedure and one would still need additional knowledge of which quantized systems are physical (cf. [23, §5.1.2] for a discussion of this in German). From a theory builder’s perspective, it would then be more favorable to simply use the quantized, physically correct models as a theoretical basis and deduce the classical models from these, rather than formulating the theory in the reverse way. Hence quantization can be viewed as a procedure invented to systematically guess quantum-theoretical models. This is done with the implicit expectation of shedding some light on the conceptual and mathematical problems of quantum theory, so that one day a theory can be deduced from first principles. Thus a quantum theory which is constructed from a quantization scheme must necessarily be incomplete. More precisely, it has not been formulated as a closed entity, since for its formulation it requires the theory it attempts to replace and which it potentially contradicts.

As a result of this development, quantum mechanics and thus quantum theory as a whole has not been able to pass beyond its status as an ad hoc modification of Newtonian mechanics and relativity to date. For a recapitulation of the history of quantum theory illustrating this point, see e.g. the article by Heisenberg [18].

Fortunately, our criticism does not apply to the theory of relativity, which, to our knowledge, provides an accurate description of phenomena [24], at least in the macroscopic realm. As the principles of relativity theory are known (cf. [25, p. XVII]), the ridiculousness of “relativizing” Newtonian mechanics is obvious. Indeed, in the theory of relativity physics still finds a working paradigm.

Rejecting quantization neither leads to a rejection of quantum theory itself, nor does it imply that previous attempts to put quantum theory into a geometric language were futile. If we reject quantization, we are forced to view quantum theory as incomplete and phenomenological, which raises the question of what the underlying physical principles and observables are. Considering that the theory of relativity is mainly a theory of spacetime geometry, asking, as Segal did, for the primary geometric and physical quantities in quantum theory offers a promising and natural approach to this question.

Therefore, we reason that we theorists should look at the equations of quantum theory with strong empirical support and use these to construct a mathematically consistent, probabilistic, geometric theory, tied to fundamental physical principles as closely as possible. But how is this to be approached?

1.2 The Madelung Equations as a Geometric Ansatz

In the year 1926, the same year Schrödinger published his famous articles [19, 26, 27], the German physicist Erwin Madelung reformulated the Schrödinger equation into a set of real, non-linear partial differential equations [28] with strong resemblance to the Euler equations [29, §1.1] found in hydrodynamics. The so-called Madelung equations are (Footnote 2)

$$\begin{aligned} m \dot{\vec {X}} = {\vec {F}} + \frac{\hbar ^2}{2m} \nabla \frac{\Delta \sqrt{\rho }}{\sqrt{\rho }} ,\end{aligned}$$
(1.1)
$$\begin{aligned} \nabla \times {\vec {X}} = 0 \, ,\end{aligned}$$
(1.2)
$$\begin{aligned} \frac{\partial \rho }{\partial t} + \nabla \cdot \left( \rho \, {\vec {X}} \right) = 0 , \end{aligned}$$
(1.3)

where m is the mass of the particle, \(X = \partial / \partial t + {\vec {X}}\) is a real vector field, called the drift (velocity) field, \(\rho \) is the probability density (by an abuse of terminology), \(\vec {F}\) the external force and \(\dot{\vec {X}}\) denotes the so-called material derivative (cf. [29, p. 4]) of X along itself. Madelung already believed (Footnote 3) that these equations could serve as a foundation of quantum theory. He reached this conclusion because the equations exhibit a strong link between quantum mechanics and Newtonian continuum mechanics [28]. Thus Madelung used these equations to interpret quantum behavior by exploiting the analogy to the Euler equations. At that point in history, it was not clear how to interpret the wave function, as the Born rule and the ensemble interpretation had just recently emerged [30]. Madelung’s misinterpretation of quantum mechanics may be the reason why it took almost 25 years for his approach to become popular again, when Bohm employed the Madelung equations to develop what is now known as Bohmian mechanics [31, 32]. Nonetheless a clear distinction should be drawn [33] between the Madelung equations and the Bohmian theory [31, 32]. Despite the popularity of Bohm’s approach, a discussion of the Madelung equations on their own [34,35,36,37,38,39] seems less common.
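As a sketch of how (1.1)–(1.3) arise, and not a substitute for the equivalence theorem of Sect. 3, assume that the force is conservative, \(\vec {F} = - \nabla V\), that \(\rho > 0\), and that the wave function can be written in the polar form \(\Psi = \sqrt{\rho } \, e^{\mathrm {i} S / \hbar }\) with a smooth real function S. The symbols \(\Psi \), S and V are introduced here for illustration only. Setting \({\vec {X}} = \nabla S / m\), which makes (1.2) automatic, the Schrödinger equation splits, after dividing out common factors, into its imaginary and real parts:

$$\begin{aligned} \mathrm {i} \hbar \frac{\partial \Psi }{\partial t}&= - \frac{\hbar ^2}{2m} \Delta \Psi + V \Psi , \\ \text {Im:} \quad 0&= \frac{\partial \rho }{\partial t} + \nabla \cdot \left( \rho \, {\vec {X}} \right) , \\ \text {Re:} \quad 0&= \frac{\partial S}{\partial t} + \frac{m}{2} {\vec {X}}^2 + V - \frac{\hbar ^2}{2m} \frac{\Delta \sqrt{\rho }}{\sqrt{\rho }} . \end{aligned}$$

The imaginary part is (1.3). Taking the gradient of the real part and using \(\nabla \times {\vec {X}} = 0\), so that \(\nabla ( {\vec {X}}^2 / 2 ) = ( {\vec {X}} \cdot \nabla ) {\vec {X}}\), gives

$$\begin{aligned} m \left( \frac{\partial {\vec {X}}}{\partial t} + \left( {\vec {X}} \cdot \nabla \right) {\vec {X}} \right) = - \nabla V + \frac{\hbar ^2}{2m} \nabla \frac{\Delta \sqrt{\rho }}{\sqrt{\rho }} , \end{aligned}$$

which is (1.1) with the material derivative written out. A quick consistency check, again only as an illustration: for the ground state of a one-dimensional harmonic oscillator with \(V = m \omega ^2 x^2 / 2\) one has \(\sqrt{\rho } \propto e^{- m \omega x^2 / (2 \hbar )}\) and \({\vec {X}} = 0\), so that \(\Delta \sqrt{\rho } / \sqrt{\rho } = ( m \omega / \hbar )^2 x^2 - m \omega / \hbar \). The last term in (1.1) then equals \(m \omega ^2 x\), which cancels the force \(- m \omega ^2 x\), as required for a drift field at rest.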

Today, the importance of the Madelung equations lies in the fact that they naturally generalize the Schrödinger equation and in doing so expose the sought-after geometric structures of quantum theory and its classical limit. As a byproduct, one obtains a natural answer to the question of why complex numbers arise in quantum mechanics. The Madelung equations, by virtue of being formulated in the language of Newtonian mechanics, make it possible to construct a wide class of quantum theories by making the same coordinate-independent modifications found in Newtonian mechanics, without any need to construct a quantization algorithm as, for example, in geometric [6, 7] and deformation quantization [23]. This greatly simplifies the construction of new quantum theories and therefore makes the Madelung equations the natural foundation of quantum mechanics and the natural ansatz for any attempt to interpret quantum mechanics.

For some of these modifications it is not possible to construct a Schrödinger equation and for others the Schrödinger equation becomes non-linear, which suggests that there exist quantum-mechanical models that cannot be formulated in the language of linear operators acting on a vector space of functions. From a conceptual point of view, this might prove to be a necessity to remove the mathematical and conceptual problems that plague relativistic quantum theory today, or at least expose the origins of these problems. In fact, the Madelung equations admit a straightforward (general-)relativistic generalization leading to the Klein–Gordon equation, which is, however, not discussed here and arguably unphysical (Footnote 4). The Madelung equations and their modifications are hence particularly suited for studying quantum theory from the differential-geometric perspective. We thus believe that they will take a central role both in the future construction of an internally consistent, geometric quantum theory and in the realist understanding of microscopic phenomena.

1.3 Outline and Conventions

In this article we formalize the Madelung picture of quantum mechanics and thus provide a rigorous framework for further development. A first step is made by postulating a modification intended to model particle creation and annihilation. In addition, we give a possible interpretation of quantum mechanics that is an extension of the stochastic interpretation developed by Tsekov [42], which in turn originated in ideas from Bohm and Vigier [43] in the 1950s.

Our article is organized as follows: We first construct a spacetime model on which to formulate the Madelung equations using relativistic considerations. In Sect. 3, we further motivate the need for the Madelung equations in the formulation of quantum mechanics and then give a theorem stating the equivalence of the Madelung equations and the Schrödinger equation, if the force is irrotational and a certain topological condition is satisfied. We also address concerns raised in the literature regarding this point (cf. [38, §3.2.2] and [39, 44, 45]). We introduce some terminology and proceed with a basic, mathematical discussion. In Sect. 4, we discuss the operator formalism in the Schrödinger picture and its relation to the Madelung equations. We proceed by giving a formal interpretation of the Madelung equations in Sect. 5.1 and then speculate in Sect. 5.2 that quantum mechanical behavior originates in noise created by random irregularities in spacetime curvature, that is random, small-amplitude gravitational waves. How the violation of Bell’s inequality can be achieved in this stochastic interpretation is also discussed. In Sect. 6, we propose a modification of the Madelung equations, intended to model particle creation and annihilation, and show how this in general leads to a non-linearity in the Schrödinger equation. We conclude this article with a brief review of our results including a table and an overview of some open problems.

Some prior remarks: To fully understand this article, an elementary knowledge of Riemannian geometry, relativity and quantum mechanics is required. We refer to [46, Chap. 1–4] and [47,48,49], respectively. The mathematical formalism of the article is, however, not intended to deter anyone from reading it and should not be a hindrance to understanding the physics we discuss, which is not merely of relevance to mathematical physicists. For the sake of clarification, we have attempted to provide some intuitive insight along the lines of the argument. Less mathematically versed readers should skip the proofs and the more technical arguments while being aware that precise mathematical arguments are required, as intuition fails easily in a subject this far away from everyday experience. Moreover, we stress that Sect. 5.2 should be considered fully separate from the rest of the article. At this point the stochastic interpretation, however well motivated, is speculation, but this does not invalidate the rest of the argument.

On a technical note, we usually assume that all mappings and manifolds are smooth. This assumption can be considerably relaxed in most cases, but this would lead to additional, currently unnecessary technicalities. Our notation mostly originates from [46], but is quite standard in physics or differential geometry. For example, \(\varphi _*\) is the pushforward and \(\varphi ^*\) the pullback of the smooth map \(\varphi \), \(\cdot \) is tensor contraction of adjacent entries or the Euclidean inner product, \(\text {{d}}\) the Cartan derivative, \(\varepsilon \) the Levi–Civita symbol, \(\left[ . , . \right] \) the Lie bracket (of vector fields), \(\mathfrak X \left( \mathcal Q \right) \) denotes the space of smooth vector fields and \(\Omega ^k \left( \mathcal Q \right) \) the space of smooth k-forms on the smooth manifold \(\mathcal Q\), respectively. We use the Einstein summation convention and, where relativistic arguments are used, the metric signature is \((+---)\), which gives tangent vectors of observers positive “norm”. Definitions are indicated by italics.

2 Construction of Newtonian Spacetime

In order to be able to construct a rigorous proof of the equivalence of the Schrödinger and Madelung equations, we first construct a spacetime model suitable for our purposes. For a discussion on prerelativistic spacetimes see e.g. [25, §1.1 to §1.3] and [50, Chap. 1].

To describe the motion of a point mass of mass \(m \in \mathbb {R}_+ = \left( 0 , \infty \right) \) in Newtonian physics, we consider an open subset \(\mathcal Q\) of \(\mathbb {R}^4\), which has a canonical topology and smooth structure. The need to restrict oneself to open subsets of \(\mathbb {R}^4\) arises, for instance, from the fact that it is common for forces in Newtonian physics to diverge at the point where the source is located. We exclude such points from the manifold. For similar reasons we also allow non-connected subsets.

To be able to measure spatial distances within the Newtonian ontology, one intuitively needs a degenerate, Euclidean metric. However, this construction should obey the principle of Galilean relativity (cf. [25, Postulate 1.3.1]).

Principle 1 (Galilean Relativity) For any two non-accelerating observers that move relative to each other with constant velocity all mechanical processes are the same.

Therefore, if we formulate physical laws coordinate-independently with some (degenerate) metric \(\delta \) and attribute to it a physical reality, then all observers should measure the same distances. However, in physical terms, whether one travels some distance at constant velocity or is standing still depends entirely on the observer, and hence on the coordinate system chosen to describe the system. This is a deep problem within the conceptual framework of Newtonian mechanics. One way to circumvent this is to prevent the measurement of distances for different times. For a mathematical treatment of such Neo-Newtonian or, better, Galilean spacetimes see [50, Chap. 1]. A less complicated and physically more satisfying approach is to consider a Newtonian spacetime as a limiting case of a special-relativistic one. More precisely, a Newtonian spacetime is an approximate spacetime model appropriate for mechanical systems involving only small velocities relative to an inertial frame of reference and relative to the speed of light, not involving the modeling of light itself and with negligible spacetime curvature. In this relativistic ontology, the above conceptual problem does not occur, as the notion of spatial and temporal distance is made observer-dependent, which is necessary due to the phenomena of time dilation and length contraction.

As quantum mechanics is formulated in a Newtonian/Galilean spacetime, it is consequently necessary to view it as a theory in the so-called Newtonian limit. This limit is naively defined by neglecting terms of the order \({{\mathrm{\mathcal O}}}\left( \left( |\vec {v}| / c \right) ^2 \right) \) in equations involving only physically measurable quantities, where \(|\vec {v}|\) is the speed corresponding to the velocity \(\vec {v}\) of any mass point relative to the inertial frame and c is the speed of light (in vacuum). Obviously, this is not a rigorous definition, but this naive approach suffices for our purposes here. We will give a more thorough discussion of the Newtonian limit in a future work [51]. Also note that \(|\vec {v}| / c\) is dimensionless and hence the Newtonian limit is independent of the chosen system of units.

Our reasoning directly leads us to the definition of Newtonian spacetime.

Definition 2.1

(Newtonian Spacetime) A Newtonian spacetime is a tuple \(\left( \mathcal Q , \text {{d}}\tau , \delta , \mathcal O \right) \), where

  1. (i)

    \(\mathcal Q\) is an open subset of \(\mathbb {R}^4\) equipped with the standard topology and smooth structure,

  2. (ii)

    the time form \(\text {{d}}\tau \) is an exact, non-vanishing 1-form and the spatial metric \(\delta \) is a symmetric, non-vanishing, covariant 2-tensor field, such that there exist coordinates \(x= \left( t, x^1, x^2, x^3 \right) \equiv \left( t, {{\vec {x}}} \right) \) on \(\mathcal Q\) with

    $$\begin{aligned} \text {{d}}\tau = \text {{d}}t , \quad \delta = \delta _{ij} \, \text {{d}}x^i \otimes \text {{d}}x^j = \begin{pmatrix} 0 &{} &{} &{} \\ &{} 1 &{} &{} \\ &{} &{} 1 &{} \\ &{} &{} &{} 1 \end{pmatrix} \end{aligned}$$
    (2.1a)

    for \(i,j \in \lbrace 1,2, 3 \rbrace \).

  3. (iii)

    the Newtonian orientation \(\mathcal O\) is a (smooth) \({{\mathrm{GL}}}^+\left( \mathbb {R}^3 \right) \)-reduction of the frame bundle \({{\mathrm{Fr}}}\left( T \mathcal Q \right) \) defined as follows (see e.g. [46, Definition 9.6] and [52, §6.1] for definitions). Consider the Lie group

    $$\begin{aligned} {{\mathrm{GL}}}^+ \left( \mathbb {R}^3 \right) := \bigg \{A \in {{\mathrm{End}}}\left( \mathbb {R}^3 \right) \bigg | { \det A > 0} \bigg \} , \end{aligned}$$
    (2.1b)

    the vector field B, defined by

    $$\begin{aligned} \delta \left( B, B\right)&= 0 , \end{aligned}$$
    (2.1c)
    $$\begin{aligned} \text {{d}}\tau \cdot B&= 1 , \end{aligned}$$
    (2.1d)

    and the \({{\mathrm{GL}}}^+\left( \mathbb {R}^3 \right) \)-right action

    $$\begin{aligned} \left( \zeta , A \right) \rightarrow \zeta \cdot \begin{pmatrix} 1 &{} 0 \\ 0 &{} A \end{pmatrix} = \left( \zeta _0, A^i{}_1 \, \zeta _i , A^i{}_2 \, \zeta _i, A^i{}_3 \, \zeta _i \right) \end{aligned}$$
    (2.1e)

    for \(\zeta = \left( \zeta _0,\zeta _1,\zeta _2,\zeta _3 \right) \in {{\mathrm{Fr}}}\left( T \mathcal Q \right) \) and \(A \in {{\mathrm{GL}}}^+\left( \mathbb {R}^3 \right) \). Then \(\mathcal O\) is a \({{\mathrm{GL}}}^+\left( \mathbb {R}^3 \right) \)-reduction of the frame bundle \({{\mathrm{Fr}}}\left( T \mathcal Q \right) \) with the property that there exists a global frame field \(\xi :\mathcal Q \rightarrow {{\mathrm{Fr}}}\left( T \mathcal Q\right) \) satisfying \(\xi _0 = B\) and \(\text {{d}}\tau \cdot \xi _i =0\) for all \(i \in \lbrace 1,2,3 \rbrace \), such that

    $$\begin{aligned} \mathcal O =\bigg \{\zeta \in {{\mathrm{Fr}}}\left( T \mathcal Q \right) \bigg | \exists q \in \mathcal Q \, \exists A \in {{\mathrm{GL}}}^+\left( \mathbb {R}^3 \right) :\zeta = \xi _q \cdot \begin{pmatrix} 1 &{} 0 \\ 0 &{} A \end{pmatrix} \bigg \}. \end{aligned}$$
    (2.1f)
  4. (iv)

    The tangent bundle \(T \mathcal Q\) is equipped with a covariant derivative, called the Newtonian derivative \(\nabla \), which is

    1. (a)

      compatible with the temporal metric \(\text {{d}}\tau ^2 := \text {{d}}\tau \otimes \text {{d}}\tau \) :

      $$\begin{aligned} \nabla \text {{d}}\tau ^2 = 0 , \end{aligned}$$
      (2.1g)
    2. (b)

      compatible with the spatial metric:

      $$\begin{aligned} \nabla \delta = 0 , \end{aligned}$$
      (2.1h)
    3. (c)

      torsion-free, i.e. \(\forall X,Y \in \mathfrak X \left( \mathcal Q \right) \):

      $$\begin{aligned} \nabla _X Y = \nabla _Y X + \left[ X , Y \right] . \end{aligned}$$
      (2.1i)

The vector field B is called the intrinsic observer (vector) field. An (ordered) triple of tangent vectors \(\left( Y_1, Y_2, Y_3 \right) \) at some \(q \in \mathcal Q\) is called right-handed, if \(( B_q, Y_1, Y_2, Y_3 ) \in \mathcal O\). Analogously we define right-handedness of a triple of vector fields. Coordinates satisfying (2.1a) are called Eulerian coordinates, if in addition \(( \partial /\partial x^1,\partial /\partial x^2, \partial /\partial x^3 )\) is right-handed.

For convenience, we identify the points \(q \in \mathcal Q \subseteq \mathbb {R}^4\) with their Eulerian coordinate values, s.t. \(q= \left( t, {{\vec {x}}} \right) \). Condition (ii) can be read as an integrability condition, i.e. the coordinates are chosen in accordance with the geometric structures and not vice versa. Thus the definition is coordinate-independent. The Newtonian orientation (iii) is necessary in the definition to be able to mathematically distinguish a physical system modeled on a Newtonian spacetime from its mirror image. It is easy to check that (2.1c) and (2.1d) uniquely determine B to be

$$\begin{aligned} B = \frac{\partial }{\partial t} =: \frac{\partial }{\partial \tau } , \end{aligned}$$
(2.2)

so our definition of \(\mathcal O\) is sensible. As is the case for ordinary orientations on manifolds, there are precisely two possible Newtonian orientations \(\mathcal O\) on \(\mathcal Q\).
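The uniqueness of B stated in (2.2) can be checked in a few lines. The following is a short sketch in coordinates satisfying (2.1a); the component names \(B^t\) and \(B^i\) are notation introduced here. Writing \(B = B^t \, \partial _t + B^i \, \partial _i\), conditions (2.1d) and (2.1c) read

$$\begin{aligned} \text {d}\tau \cdot B&= B^t = 1 , \\ \delta \left( B, B \right)&= \delta _{ij} \, B^i B^j = \left( B^1 \right) ^2 + \left( B^2 \right) ^2 + \left( B^3 \right) ^2 = 0 . \end{aligned}$$

Since the spatial block of \(\delta \) is positive definite, the second condition forces \(B^i = 0\) for all \(i \in \lbrace 1,2,3 \rbrace \), and together with \(B^t = 1\) this gives \(B = \partial _t\).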

Clearly, the intrinsic observer field B plays a special role. Condition (2.1d) means that the time form \(\text {{d}}\tau \) determines the parametrization of the integral curves of the intrinsic observer field, including its “time orientation”, and condition (2.1c) means that the integral curves of the observer field have no spatial length, or, equivalently, they describe mass points at rest. Therefore, due to the existence of a “preferred rest frame”, Principle 1 is actually violated in Definition 2.1, if one does not consider a Newtonian spacetime as the limiting case of a special-relativistic model for a particular observer. Mathematically this is captured by the fact that Galilei boosts are not spatial isometries of a Newtonian spacetime, i.e. isometries with respect to the degenerate spatial metric. Within the special-relativistic ontology, however, the Lorentz boosts are isometries of the physical spacetime and we can find a Newtonian spacetime corresponding to the boost by taking the Newtonian limit. This procedure yields two different spatial metrics, one for each observer. Therefore, Principle 1 is indeed satisfied on an ontological level.
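The statement about Galilei boosts can be verified directly. As a sketch, assume \(\mathcal Q = \mathbb {R}^4\) and consider the boost \(\varphi _{\vec {v}} :\left( t, {\vec {x}} \right) \mapsto \left( t, {\vec {x}} + {\vec {v}} \, t \right) \) with constant velocity \({\vec {v}} \in \mathbb {R}^3\); the symbol \(\varphi _{\vec {v}}\) is introduced here for illustration. In Eulerian coordinates its pullbacks are

$$\begin{aligned} \varphi _{\vec {v}}^* \, \text {d}\tau&= \text {d}t , \qquad \varphi _{\vec {v}}^* \, \text {d}x^i = \text {d}x^i + v^i \, \text {d}t , \\ \varphi _{\vec {v}}^* \, \delta&= \delta _{ij} \left( \text {d}x^i + v^i \, \text {d}t \right) \otimes \left( \text {d}x^j + v^j \, \text {d}t \right) \\&= \delta + \delta _{ij} \, v^i \left( \text {d}t \otimes \text {d}x^j + \text {d}x^j \otimes \text {d}t \right) + |{\vec {v}}|^2 \, \text {d}t \otimes \text {d}t . \end{aligned}$$

Hence the time form is preserved, whereas \(\varphi _{\vec {v}}^* \, \delta = \delta \) holds only for \({\vec {v}} = 0\). The extra terms all contain \(\text {d}t\), so they vanish on Newtonian-spacelike vectors: distances between simultaneous events are unaffected, and only distances between events at different times become boost-dependent.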

Excluding point (iv), Newtonian spacetimes trivially exist. The following lemma shows that the Newtonian connection is also well-defined.

Lemma 2.2

(Existence & Uniqueness of the Newtonian Connection) Let \(\left( \mathcal Q , \text {{d}}\tau , \delta , \mathcal O \right) \) be a Newtonian spacetime. Then the Newtonian connection \(\nabla \) is unique and trivial in Eulerian coordinates, i.e. all connection coefficients vanish.

Proof

Consider \(g:= \text {{d}}\tau ^2 + \delta \). (2.1g) and (2.1h) in Definition 2.1 imply

$$\begin{aligned} \nabla g = \nabla \left( \text {{d}}\tau ^2 + \delta \right) = \nabla \text {{d}}\tau ^2 + \nabla \delta = 0 . \end{aligned}$$
(2.3)

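Together with the torsion-freeness (2.1i), this shows that \(\nabla \) is just the Levi–Civita connection with respect to the Riemannian metric g, which is the standard Euclidean metric in the global chart \(\left( \mathcal Q, x \right) \). Uniqueness and the vanishing of the connection coefficients follow. \(\square \)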
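The last step of the proof can be made explicit with the standard coordinate formula for the Levi–Civita connection. This is a sketch in which Greek indices run over \(0, \dots , 3\), a convention introduced here. In coordinates satisfying (2.1a) the components \(g_{\mu \nu }\) form the constant identity matrix, so every partial derivative of the metric vanishes:

$$\begin{aligned} \Gamma ^{\lambda }{}_{\mu \nu } = \frac{1}{2} \, g^{\lambda \sigma } \left( \partial _\mu g_{\nu \sigma } + \partial _\nu g_{\mu \sigma } - \partial _\sigma g_{\mu \nu } \right) = 0 . \end{aligned}$$

Conversely, the connection with vanishing coefficients in these coordinates annihilates every tensor field with constant components, in particular \(\text {d}\tau ^2\) and \(\delta \), and is torsion-free. This settles existence, while uniqueness is the uniqueness of the Levi–Civita connection of g.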

We conclude that our construction is both physically and mathematically consistent. Yet before we can set up physical models on a Newtonian spacetime \(\left( \mathcal Q , \text {{d}}\tau , \delta , \mathcal O \right) \), we need to consider the relevant dynamical quantities as obtained from the theory of relativity. These considerations will yield two subclasses of tangent vectors.

Recall that in relativity theory the spacetime model is a time-oriented (Footnote 5) Lorentzian 4-manifold \(\left( \mathcal Q, g \right) \), equipped with the Levi–Civita connection \(\nabla \). If a curve

$$\begin{aligned} \gamma :I \rightarrow \mathcal Q :\tau \rightarrow \gamma \left( \tau \right) , \end{aligned}$$
(2.4)

defined on an open interval \(I \subseteq \mathbb {R}\) is assumed to describe physical motion, we require its tangent vector field \(\dot{\gamma }:= \gamma _* (\partial / \partial \tau )\) to be timelike, future directed and to be parametrized with respect to proper time \(\tau \). For the latter

$$\begin{aligned} g \left( \dot{\gamma }, \dot{\gamma }\right) = c^2 \end{aligned}$$
(2.5)

is a necessary and sufficient condition. Such curves \(\gamma \) are known as observers and if for a tangent vector \(X \in T \mathcal Q\) an observer \(\gamma \) exists with \(X= \dot{\gamma }_\tau \) for some \(\tau \in I\), then X is called an observer vector. Vector fields \(X \in \mathfrak X \left( \mathcal Q \right) \) whose values \(X_q \in T_q \mathcal Q\) are observer vectors at every \(q \in \mathcal Q\) are accordingly called observer (vector) fields. Since a region of physical spacetime with negligible curvature can be approximately described by special relativity, we may restrict ourselves to the case where \(\mathcal Q \subseteq \mathbb {R}^4\) is open and \(g= \eta \) is the Minkowski metric. In standard coordinates \(\left( t, {{\vec {x}}}\right) \) on \(\mathcal Q\), we write the tangent vector of an observer \(\gamma \) as

$$\begin{aligned} \dot{\gamma }= \dot{t} \, \partial _t + \dot{x}^i \, \partial _i , \end{aligned}$$
(2.6)

where the dot denotes differentiation with respect to \(\tau \). On the other hand, condition (2.5) requires

$$\begin{aligned} \dot{t} = \frac{1}{\sqrt{1- \left( \frac{1}{c} \frac{\text {{d}}{{\vec {x}}}}{\text {{d}}t} \right) ^2 }} \quad , \end{aligned}$$
(2.7)

where we used the notation

$$\begin{aligned} \left( \frac{\text {{d}}{{\vec {x}}}}{\text {{d}}t} \right) ^2 := \delta \left( \frac{\text {{d}}}{\text {{d}}t}, \frac{\text {{d}}}{\text {{d}}t} \right) = \delta _{ij} \frac{\text {{d}}x^i}{\text {{d}}t} \frac{\text {{d}}x^j}{\text {{d}}t} . \end{aligned}$$
(2.8)
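For completeness, (2.7) follows from (2.5) in a few lines. The sketch below assumes that the Minkowski metric takes the form \(\eta = c^2 \, \text {d}t \otimes \text {d}t - \delta _{ij} \, \text {d}x^i \otimes \text {d}x^j\) in the standard coordinates \(\left( t, {\vec {x}} \right) \), which is consistent with the signature \((+---)\) but not stated explicitly in the text. Using \(\dot{x}^i = \dot{t} \, \text {d}x^i / \text {d}t\), one finds

$$\begin{aligned} c^2 = \eta \left( \dot{\gamma }, \dot{\gamma }\right) = c^2 \, \dot{t}^2 - \delta _{ij} \, \dot{x}^i \dot{x}^j = c^2 \, \dot{t}^2 \left( 1 - \frac{1}{c^2} \left( \frac{\text {d}{\vec {x}}}{\text {d}t} \right) ^2 \right) . \end{aligned}$$

Dividing by \(c^2\) and solving for \(\dot{t}\) gives (2.7), where the positive root is selected because \(\dot{\gamma }\) is future directed, i.e. \(\dot{t} > 0\).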

A first order Taylor expansion of (2.7) in

$$\begin{aligned} \frac{1}{c}|\frac{\text {{d}}{{\vec {x}}}}{\text {{d}}t}| := \frac{1}{c} \sqrt{ \left( \frac{\text {{d}}{{\vec {x}}}}{\text {{d}}t} \right) ^2 } \end{aligned}$$
(2.9)

around 0 yields

$$\begin{aligned} \dot{t} \approx 1 , \end{aligned}$$
(2.10)

which is the expression for \(\dot{t}\) in the Newtonian limit. This implies

$$\begin{aligned} \dot{{\vec {x}}} \equiv \frac{\text {{d}}{{\vec {x}}}}{\text {{d}}\tau } = \dot{t} \frac{\text {{d}}{{\vec {x}}}}{\text {{d}}t} \approx \frac{\text {{d}}{{\vec {x}}}}{\text {{d}}t} . \end{aligned}$$
(2.11)
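To see the size of the neglected terms, one may expand (2.7) one order further. Writing \(\beta := \frac{1}{c} |\text {d}{\vec {x}} / \text {d}t|\), a shorthand introduced here, the binomial series gives

$$\begin{aligned} \dot{t} = \left( 1 - \beta ^2 \right) ^{-1/2} = 1 + \frac{1}{2} \beta ^2 + \frac{3}{8} \beta ^4 + {{\mathrm{\mathcal O}}}\left( \beta ^6 \right) . \end{aligned}$$

There is no term linear in \(\beta \), so (2.10) and (2.11) hold up to a relative error of order \(\beta ^2\). For illustration, a speed of \(3 \times 10^5 \, \mathrm {m/s}\) corresponds to \(\beta \approx 10^{-3}\) and therefore to \(\dot{t} - 1 \approx 5 \times 10^{-7}\).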

Plugging (2.10) and (2.11) back into (2.6) we get

$$\begin{aligned} \dot{\gamma }\approx \frac{\partial }{\partial t} + \frac{\text {{d}}x^i}{\text {{d}}t} \frac{\partial }{\partial x^i} . \end{aligned}$$
(2.12)

If we carry this reasoning over to observer vectors \(X \in T_q \mathcal Q\) at any \(q \in \mathcal Q\), then we get in the Newtonian limit

$$\begin{aligned} X \approx {\partial _t}|_{q} + {\vec {X}} \, \end{aligned}$$
(2.13)

with \({\vec {X}} = X^i {\partial _i}|_{q}\). This is the reason for naming \(B= \partial _t\) in Definition 2.1 the ‘intrinsic observer vector field’.

To obtain the other important class of tangent vectors, we have a look at the dynamics. Hence we consider a test particle (Footnote 6), which is described by an observer \(\gamma \) and has mass \(m \in \mathbb {R}_+\). The force on the particle is defined by

$$\begin{aligned} F:= m \, \frac{\nabla \dot{\gamma }}{\text {{d}}\tau } , \end{aligned}$$
(2.14)

which is just the generalization of Newton’s second law to general relativity. Note that gravity is not a force, but a pseudo-force. Due to metricity of the connection and condition (2.5) we obtain

$$\begin{aligned} g \left( \dot{\gamma }, \frac{\nabla \dot{\gamma }}{\text {{d}}\tau } \right) = 0 , \end{aligned}$$
(2.15)

which roughly means that the (relativistic) velocity is orthogonal to the (relativistic) acceleration. Applying this to (2.14), we get

$$\begin{aligned} g \left( \dot{\gamma }, F \right) = 0 , \end{aligned}$$
(2.16)

hence F is spacelike [53, Chap. 5, 26. Lemma]. In the Newtonian limit, the force field F must stay “spacelike”. This is indeed the case, which we see by using the definition (2.14) of F together with the approximation (2.12) for \(\dot{\gamma }\):

$$\begin{aligned} \frac{F}{m} = \frac{\nabla \dot{\gamma }}{\text {{d}}\tau } \approx \frac{\nabla }{\text {{d}}t} \left( \frac{\partial }{\partial t} + \frac{\text {{d}}x^i}{\text {{d}}t} \frac{\partial }{\partial x^i} \right) = \frac{\text {{d}}^2 x^i}{\text {{d}}t ^2 } \, \frac{\partial }{\partial x^i} \equiv \frac{\text {{d}}^2 {{\vec {x}}}}{ \text {{d}}t ^2} . \end{aligned}$$
(2.17)

This directly shows that \(F^0\) has to vanish in the Newtonian limit.

We have thus obtained the two types of tangent vectors (and hence curves and vector fields) of relevance in any physical model set in a Newtonian spacetime, i.e. tangent vectors \(Y \in T \mathcal Q\) with either \(Y^t=0\) or \(Y^t=1\). Our discussion motivates the following definition.

Definition 2.3

(Newtonian Vectors) Let \(\left( \mathcal Q , \text {{d}}\tau , \delta , \mathcal O \right) \) be a Newtonian spacetime. A tangent vector \(Y \in T \mathcal Q \) at \(q \in \mathcal Q\) is called Newtonian spacelike, if \(\text {{d}}\tau \cdot Y= 0\) or, equivalently, in Eulerian coordinates

$$\begin{aligned} Y = {\vec {Y}} := Y^i \, {\frac{\partial }{\partial x^i}}|_{q} . \end{aligned}$$
(2.18a)

\(Y \in T_q \mathcal Q \) is called a Newtonian observer vector, if \(\text {{d}}\tau \cdot Y= 1\) or, equivalently,

$$\begin{aligned} Y = {\frac{\partial }{\partial t}}|_{q} + Y^i \, {\frac{\partial }{\partial x^i}}|_{q} = {\frac{\partial }{\partial t}}|_{q} + {\vec {Y}} . \end{aligned}$$
(2.18b)

A tangent vector Y is called Newtonian, if Y is either a Newtonian observer vector or Newtonian spacelike. For a Newtonian vector Y, we call \({\vec {Y}}\) the spacelike component of Y.

It follows that a tangent vector X, describing the velocity vector of a point mass in the Newtonian limit at some instant, is a Newtonian observer vector, and a vector F, giving the force acting on such a particle according to (2.14) at that instant, is Newtonian spacelike (i.e. \(F = {\vec {F}}\)).

Remark 2.4

The above terminology carries over to vector fields, e.g. a Newtonian observer (vector) field Y is one whose values \(Y_q \in T_q \mathcal Q\) are Newtonian observer vectors for every \(q \in \mathcal Q\). We denote the space of (smooth) Newtonian vector fields by \(\mathfrak X_N \left( \mathcal Q \right) \), the space of (smooth) Newtonian spacelike vector fields by \(\mathfrak X_{Ns} \left( \mathcal Q \right) \) and the space of (smooth) Newtonian observer vector fields by \(\mathfrak X_{Nt} \left( \mathcal Q \right) \).

Note that there are no “Newtonian lightlike” vectors. Indeed, for physical consistency we require \(|\vec {X} |< c\).

The space of Newtonian spacelike vector fields forms a real vector space, the space of Newtonian observer vector fields does not. However, if we add a Newtonian spacelike vector field to a Newtonian observer vector field, we still have a Newtonian observer vector field. The intrinsic observer field is then the trivial Newtonian observer field; its integral curves physically correspond to observers at rest with respect to some inertial observer \(\gamma \) in Minkowski spacetime \(\left( \mathbb {R}^4, \eta \right) \).

Instead of considering a single observer \(\gamma \) in Minkowski spacetime, let us now assume that it is the integral curve of an observer field X. If each integral curve of X describes the trajectory of a test particle of equal mass m, then (2.14) adapted to this case yields

$$\begin{aligned} F = m \nabla _X X . \end{aligned}$$
(2.20)

In the Newtonian limit, we obtain the Newtonian spacelike vector field \(F \approx {\vec {F}}\) and the Newtonian observer vector field \(X \approx \partial _t + \vec {X}\), hence (2.20) is approximated by

$$\begin{aligned} {\vec {F}} \approx m \nabla _{\partial _t + \vec {X}} \left( \partial _t + \vec {X} \right) = m \left( \frac{\partial X^i}{\partial t} + X^j \, \frac{\partial X^i}{\partial x^j} \right) \, \frac{\partial }{\partial x^i} \equiv m \biggl ( \frac{\partial \vec {X}}{\partial t} + \nabla _{\vec {X}} {\vec {X}} \biggr ) . \end{aligned}$$
(2.21)

We thus see that the Newtonian limit naturally gives rise to what is known as the material derivative in the fluid mechanics literature [29, p. 4]. Intuitively, the material derivative of a Newtonian observer vector field X along itself gives the acceleration of a point \({{\vec {x}}}\) in space moving along the flow lines of X at some time t [54, §1.2]. However, as we have obtained this from the Levi–Civita connection in the Newtonian limit and not in the context of fluids, we do not use this terminology here. Nonetheless we shall adapt our notation. So if \(X \in \mathfrak X_{Nt} \left( \mathcal Q \right) \) is a Newtonian observer field and \(Y \in \mathfrak X_N \left( \mathcal Q \right) \) a Newtonian vector field then, according to Lemma 2.2, the Newtonian derivative \(\nabla \) of Y along X can be written as

$$\begin{aligned} \nabla _ X Y = \frac{\partial {\vec {Y}}}{\partial t} + \nabla _{\vec {X}} \vec Y =: \frac{\partial {\vec {Y}}}{\partial t} + \left( \vec {X} \cdot \nabla \right) {\vec {Y}} , \end{aligned}$$
(2.22)

in full compliance with (2.21). If \(X \in \mathfrak X_{Ns} \left( \mathcal Q \right) \) is Newtonian spacelike instead, then

$$\begin{aligned} \nabla _ X Y = \nabla _{\vec {X}} \vec Y = \left( \vec {X} \cdot \nabla \right) {\vec {Y}} . \end{aligned}$$
(2.23)

This also shows that for Newtonian vector fields X, Y the expression \(\nabla _ X Y\) is always Newtonian spacelike.

For the special case of a Newtonian observer field X, we use the notation

$$\begin{aligned} \dot{X} := \nabla _X X = \nabla _X \vec {X} = \dot{ \vec {X} } = \frac{\partial \vec {X}}{\partial t} + \left( \vec {X} \cdot \nabla \right) \vec {X} , \end{aligned}$$
(2.24)

which has the natural interpretation of acceleration.
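
As a minimal illustration of (2.24), consider a rigidly rotating Newtonian observer field on \(\mathcal Q = \mathbb {R}^4\) with constant angular velocity \(\omega \in \mathbb {R}\) about the \(x^3\)-axis. Both this field and the symbol \(\omega \) are hypothetical choices made for this example only. One finds

$$\begin{aligned} X&= \frac{\partial }{\partial t} + \omega \left( - x^2 \, \frac{\partial }{\partial x^1} + x^1 \, \frac{\partial }{\partial x^2} \right) , \qquad \frac{\partial \vec {X}}{\partial t} = 0 , \nonumber \\ \dot{X}&= \left( \vec {X} \cdot \nabla \right) \vec {X} = - \omega ^2 \left( x^1 \, \frac{\partial }{\partial x^1} + x^2 \, \frac{\partial }{\partial x^2} \right) , \end{aligned}$$

which is the familiar centripetal acceleration. In particular, \(\dot{X}\) is Newtonian spacelike, as expected from (2.22).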

We still have to mathematically construct the relevant vector calculus operators on Newtonian spacetimes without the need to refer to the Newtonian limit.

So let \(\left( \mathcal Q , \text {{d}}\tau , \delta , \mathcal O \right) \) be a Newtonian spacetime, define

$$\begin{aligned} \Omega _t:= \bigg \{ {{\vec {x}}} \in \mathbb {R}^3 \bigg | {\left( t, {{\vec {x}}} \right) \in \mathcal Q}\bigg \} , \quad I := \bigg \{ t \in \mathbb {R}\bigg | {\Omega _t \ne \emptyset }\bigg \} \end{aligned}$$
(2.25)

and let \(\iota _t :\Omega _t \rightarrow \mathcal Q\) be the natural inclusion. By the regular value theorem, there is a unique topology and smooth structure on \(\Omega _t\) such that it becomes an embedded, smooth submanifold of \(\mathcal Q\) and it can then be naturally equipped with the flat Riemannian metric \(\iota ^*_t \delta \). It also inherits a natural orientation from the Newtonian orientation \(\mathcal O\) on \(\mathcal Q\). Thus \(\Omega _t\) is an oriented Riemannian 3-manifold and hence the vector calculus operators \({{\mathrm{grad}}}\), div and \({{\mathrm{curl}}}\) are well defined (cf. [46, Ex. 4.5.8] for definitions). These can be naturally extended to operators on \(\mathcal Q\) by considering \(T_{{{\vec {x}}}} \Omega _t\) as a linear subspace of \(T_{\left( t, {{\vec {x}}} \right) } \mathcal Q\) for each \(\left( t ,{{\vec {x}}}\right) \in \mathcal Q\).

Definition 2.5

(Vector Calculus on Newtonian Spacetimes) Let \(\left( \mathcal Q , \text {{d}}\tau , \delta , \mathcal O \right) \) be a Newtonian spacetime, \(X \in \mathfrak X _N \left( \mathcal Q \right) \) be a smooth Newtonian vector field, \(f \in C^\infty \left( \mathcal Q, \mathbb {R}\right) \) and let \(\iota _t :\Omega _t \rightarrow \mathcal Q\) be defined as above for each \(t\in \mathbb {R}\) such that \(\Omega _t \ne \emptyset \). We then define for every \(\left( t, {{\vec {x}}} \right) \in \mathcal Q\)

  1. (i)

    the gradient of f, denoted by \(\nabla f \in \mathfrak X _{Ns} \left( \mathcal Q \right) \), via

    $$\begin{aligned} \left( \nabla f \right) _{\left( t,{{\vec {x}}}\right) } := \left( {{\mathrm{grad}}}\iota _t^*f\right) _{{{\vec {x}}}} , \end{aligned}$$
    (2.26a)
  2. (ii)

    the divergence of X, denoted by \(\nabla \cdot X \in C^\infty \left( \mathcal Q, \mathbb {R}\right) \), via

    $$\begin{aligned} \left( \nabla \cdot X \right) \left( t, {{\vec {x}}} \right) := \left( \text { div } \left( {\vec {X}} _ {\iota _t \left( . \, \right) } \right) \right) \left( {{\vec {x}}} \right) , \end{aligned}$$
    (2.26b)
  3. (iii)

    the curl of X, denoted by \(\nabla \times X \in \mathfrak X _{Ns} \left( \mathcal Q \right) \), via

    $$\begin{aligned} \left( \nabla \times X \right) _ {\left( t, {{\vec {x}}} \right) } := \left( {{\mathrm{curl}}}\left( {\vec {X}} _ {\iota _t \left( . \, \right) } \right) \right) _ { {{\vec {x}}} } , \end{aligned}$$
    (2.26c)
  4. (iv)

    and the Laplacian of f as

    $$\begin{aligned} \Delta f := \nabla \cdot \left( \nabla f \right) . \end{aligned}$$
    (2.26d)

Note that this definition just yields the ordinary vector calculus operators on \(\mathbb {R}^3\), naturally adapted to the setting of Newtonian spacetimes. Similarly, the cross product \(\times \) can be extended from \(T \Omega _t\) to \(T\mathcal Q\). Moreover, the definitions naturally extend to complex valued functions and vector fields.
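
For orientation, the operators of Definition 2.5 can be sketched in coordinates. We assume here that the Eulerian coordinates \(\left( t, x^1, x^2, x^3 \right) \) are Cartesian for \(\delta \) and positively oriented on each \(\Omega _t\). The symbol \(\varepsilon ^{ijk}\) denotes the Levi-Civita symbol and is introduced only for this illustration. Repeated indices are summed over \(\lbrace 1,2,3 \rbrace \):

$$\begin{aligned} \nabla f&= \delta ^{ij} \, \frac{\partial f}{\partial x^j} \, \frac{\partial }{\partial x^i} ,&\nabla \cdot X&= \frac{\partial X^i}{\partial x^i} , \nonumber \\ \nabla \times X&= \varepsilon ^{ijk} \, \delta _{kl} \, \frac{\partial X^l}{\partial x^j} \, \frac{\partial }{\partial x^i} ,&\Delta f&= \delta ^{ij} \, \frac{\partial ^2 f}{\partial x^i \partial x^j} . \end{aligned}$$

Only the spacelike component \({\vec {X}}\) enters these expressions and no derivative with respect to t occurs. For a Newtonian observer field \(X = \partial _t + {\vec {X}}\), the divergence and curl of X therefore coincide with those of \({\vec {X}}\).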

With this, we have finished our construction of a spacetime model, the associated (differential) operators and the elementary concepts needed for any physical model constructed upon it.

3 Local Equivalence of the Schrödinger and Madelung Equations

We now employ the construction of the previous section to set up a model of a non-relativistic quantum system with one Schrödinger particle.

In the Schrödinger picture of quantum mechanics [47, §4.1 to §4.3] such a system under the influence of an external force

$$\begin{aligned} {\vec {F}} = - \nabla V \end{aligned}$$
(3.1)

with potential \(V \in C^\infty \left( \mathcal Q, \mathbb {R}\right) \) is described by a so-called wave function \(\Psi \in C^\infty \left( \mathcal Q, \mathbb {C} \right) \), satisfying the Schrödinger equation [19, 26, 27]

$$\begin{aligned} \i \hbar \frac{\partial }{\partial t}\Psi = - \frac{\hbar ^2}{2m} \Delta \Psi + V \Psi \end{aligned}$$
(3.2)

together with the rule that \(\rho := \Psi ^* \Psi \equiv {|\Psi |}^2 \) gives the probability density for the particle’s position at fixed time.Footnote 7 This description has a number of disadvantages:

  1. (i)

    The function \(\Psi \) is complex and it is not apparent how and why this is the case. This in turn prevents a direct physical interpretation.

  2. (ii)

    The equation is already integrated, in the sense that it is formulated in terms of the potential V and that the phase of \(\Psi \) is only specified up to an arbitrary real summand. This in turn suggests that the equation is not fundamental, i.e. it is not formulated in terms of directly measurable physical quantities.

  3. (iii)

    It is not apparent how to generalize the Schrödinger equation to the case where no potential exists for a given force \({\vec {F}}\).

  4. (iv)

    It is not entirely apparent how to generalize the Schrödinger equation to more general geometries, i.e. what happens in the presence of constraints, and what the underlying topological assumptions are.

  5. (v)

    Related to this is the fact that, due to the \(\partial \Psi / \partial t\) term, there is no obvious relativistic generalization. This in turn reintroduces the conceptual problems with Principle 1.

  6. (vi)

    Let \(t \in I\) and let \(\mu _t\) be the canonical volume form on \(\Omega _t\) (cf. (2.25)) with respect to the metric \(\iota ^*_t \delta \), i.e. \(\mu _t = \text {{d}}x^1 \wedge \text {{d}}x^2 \wedge \text {{d}}x^3 \equiv \text {{d}}^3x\). The statement that for any Borel measurable \(N \subseteq \Omega _t \subseteq \mathbb {R}^3\) the expression

    $$\begin{aligned} \int _{N} \iota ^*_t\rho \, \mu _t \quad \in \left[ 0,1\right] \end{aligned}$$
    (3.3)

    gives the probability for the particle to be found within the region N at time t is inherently non-relativistic. Again this leads to problems with Principle 1.

In this section we will observe that these problems are strongly related to each other and find their natural resolution in the Madelung picture.

Before we state and prove the main theorem of this section, that is Theorem 3.2, we would like to remind the reader of the Weber identity [55] known from fluid dynamics, since it is essential for passing between the Newtonian and the Hamiltonian description.

Lemma 3.1

(Weber Identity) Let \(\left( \mathcal Q , \text {{d}}\tau , \delta , \mathcal O \right) \) be a Newtonian spacetime and let \(\vec {X} \in \mathfrak X _{Ns} \left( \mathcal Q \right) \) be a smooth Newtonian spacelike vector field.

Then

$$\begin{aligned} \left( \vec {X} \cdot \nabla \right) \vec {X} = \nabla \left( \frac{{\vec {X}}^2}{2} \right) - \vec {X} \times \left( \nabla \times \vec {X} \right) . \end{aligned}$$
(3.4)

Proof

Let \(t \in I\), as defined in (2.25), and let \(\vec {Y} \in \mathfrak X _{Ns} \left( \mathcal Q \right) \) be a second smooth Newtonian spacelike vector field. For the vector fields \(\vec {X}^t := \vec {X}_{\iota _t \left( . \, \right) }, {\vec {Y}}^t := \vec Y_{\iota _t \left( . \, \right) } \in \mathfrak X \left( \Omega _t \right) \) and the induced (standard) connection \(\nabla \) on \(\Omega _t \subseteq \mathbb {R}^3\), a standard result of vector calculus in \(\mathbb {R}^3\) (cf. [29, p. 165, Eq. 7]) states that

$$\begin{aligned} \nabla \left( \left( \iota ^*_t \delta \right) \left( \vec {X}^t,\vec Y^t \right) \right) = \vec {X}^t \times \left( \nabla \times {\vec {Y}}^t \right) + \left( {\vec {X}}^t \cdot \nabla \right) \vec Y^t +\vec Y^t \times \left( \nabla \times {\vec {X}}^t \right) + \left( \vec Y^t \cdot \nabla \right) {\vec {X}}^t . \end{aligned}$$

To obtain (3.4), we set \({\vec {X}}^t = \vec Y^t\) and let t vary. \(\square \)
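
Spelled out, the last step of the proof reads as follows. Writing \(\left( {\vec {X}}^t \right) ^2 := \left( \iota ^*_t \delta \right) \left( {\vec {X}}^t, {\vec {X}}^t \right) \), the substitution \(\vec Y^t = {\vec {X}}^t\) in the identity used in the proof gives

$$\begin{aligned} \nabla \left( \left( {\vec {X}}^t \right) ^2 \right) = 2 \, {\vec {X}}^t \times \left( \nabla \times {\vec {X}}^t \right) + 2 \left( {\vec {X}}^t \cdot \nabla \right) {\vec {X}}^t . \end{aligned}$$

Dividing by 2 and solving for the convective term \(\left( {\vec {X}}^t \cdot \nabla \right) {\vec {X}}^t\) yields (3.4) on each \(\Omega _t\).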

We named Theorem 3.2 in honor of Erwin Madelung, as it is mainly based on his article [28] and we merely formalized it to meet the standards of mathematical physics. Note that the choice of sign of \(\varphi \) is pure convention. We choose it such that for \(\partial /\partial t\) future directed in Minkowski spacetime \(\left( \mathbb {R}^4, \eta \right) \) (cf. [25, Definition 3.1.3]) and \(\partial \varphi / \partial t > 0\), the vector field

$$\begin{aligned} X = \frac{\hbar }{m} {{\mathrm{grad}}}\varphi \equiv \frac{\hbar }{m} \, \eta ^{-1} \cdot \text {{d}}\varphi \end{aligned}$$
(3.5)

is future directed.

Theorem 3.2

(Madelung’s Theorem) Let \(\left( \mathcal Q , \text {{d}}\tau , \delta , \mathcal O \right) \) be a Newtonian spacetime, \(m, \hbar \in \mathbb {R}_+\) and let \(I \subseteq \mathbb {R}\), \(\Omega _t \subseteq {\mathbb {R}^3}\) be defined as in (2.25).

If \(X \in \mathfrak X _{Nt} \left( \mathcal Q \right) \) is a Newtonian observer vector field, \({\vec {F}} \in \mathfrak X _{Ns} \left( \mathcal Q \right) \) a Newtonian spacelike vector field, \(\rho \in C^\infty \left( \mathcal Q , \mathbb {R}_+ \right) \) a strictly positive, real function and the first Betti number \(b_1 \left( \Omega _t \right) \) of \(\Omega _t\) vanishes for all \(t \in I\), then

$$\begin{aligned} m \dot{{X}} = {\vec {F}} + \frac{\hbar ^2}{2m} \nabla \frac{\Delta \sqrt{\rho }}{\sqrt{\rho }} , \end{aligned}$$
(3.6a)
$$\begin{aligned} \frac{\partial \rho }{\partial t} + \nabla \cdot \left( \rho \, {\vec {X}} \right) = 0 , \end{aligned}$$
(3.6b)
$$\begin{aligned} \nabla \times {\vec {X}} = 0 , \end{aligned}$$
(3.6c)
$$\begin{aligned} \nabla \times {\vec {F}} = 0 , \end{aligned}$$
(3.6d)

imply that there exist \(\varphi , V \in C^\infty \left( \mathcal Q, \mathbb {R}\right) \) such that

$$\begin{aligned} X = \frac{\partial }{\partial t} - \frac{\hbar }{m} \nabla \varphi , \end{aligned}$$
(3.6e)
$$\begin{aligned} {\vec {F}} = - \nabla V , \end{aligned}$$
(3.6f)
$$\begin{aligned} H := \frac{m}{2} {\vec {X}} ^2 + V - \hbar \frac{\partial \varphi }{\partial t} - \frac{\hbar ^2}{2m} \frac{\Delta \sqrt{\rho }}{\sqrt{\rho }} = 0 . \end{aligned}$$
(3.6g)

Moreover, if one defines

$$\begin{aligned} \Psi := \sqrt{\rho } \, e^{-\i \varphi } , \end{aligned}$$
(3.6h)

then it satisfies

$$\begin{aligned} \i \hbar \frac{\partial }{\partial t}\Psi = - \frac{\hbar ^2}{2m} \Delta \Psi + V \Psi . \end{aligned}$$
(3.6i)

Conversely, if \(\Psi \in C^\infty \left( \mathcal Q, \mathbb {C} \setminus \lbrace 0 \rbrace \right) \) and \(V \in C^\infty \left( \mathcal Q, \mathbb {R}\right) \) satisfy (3.6i), define \(\rho := |\Psi |^2 \in C^\infty \left( \mathcal Q , \mathbb {R}_+ \right) \), \({\vec {F}}\) via (3.6f) and

$$\begin{aligned} {\vec {X}} := \frac{\hbar }{m} \mathfrak {I}\left( \frac{\nabla \Psi }{\Psi }\right) \equiv \frac{\hbar }{2 \i m} \left( \frac{\nabla \Psi }{\Psi }- \frac{\nabla \Psi ^*}{\Psi ^*} \right) \end{aligned}$$
(3.6j)

such that \(X:= \partial / \partial t + {\vec {X}}\) is a Newtonian observer vector field. Then (3.6a), (3.6b), (3.6c) and (3.6d) hold.

Proof

“\(\implies \)” By the definition of curl (2.26c), we have for any fixed \(t \in I\)

$$\begin{aligned} \text {{d}}\left( \left( \iota ^*_t\delta \right) \cdot {\vec {X}}_{\iota _t \left( . \, \right) } \right) = 0 , \, \text {{d}}\left( \left( \iota ^*_t\delta \right) \cdot {\vec {F}}_{\iota _t \left( . \, \right) } \right) = 0 . \end{aligned}$$
(3.7a)

Since \(b_1 \left( \Omega _t \right) = 0\), all closed 1-forms are exact and hence \(\exists \tilde{\varphi }^t, \tilde{V}^t \in C^\infty \left( \Omega _t, \mathbb {R}\right) \):

$$\begin{aligned} \left( \iota ^*_t\delta \right) \cdot {\vec {X}}_{\iota _t \left( . \, \right) } = \text {{d}}\tilde{\varphi }^t , \, \left( \iota ^*_t\delta \right) \cdot {\vec {F}}_{\iota _t \left( . \, \right) } = \text {{d}}\tilde{V}^t . \end{aligned}$$
(3.7b)

If we now let t vary and observe that \(\mathcal Q = \bigsqcup _{t \in I} \Omega _t\), the left hand sides yield smooth 1-forms on \(\mathcal Q\) and so do the right hand sides. In other words, the function

$$\begin{aligned} \tilde{\varphi }:\mathcal Q \rightarrow \mathbb {R}:\left( t, {{\vec {x}}} \right) \mapsto \tilde{\varphi }^t \left( {{\vec {x}}} \right) =: \tilde{\varphi }\left( t, {{\vec {x}}} \right) \end{aligned}$$
(3.7c)

has smooth partial derivatives \(\partial \tilde{\varphi }/ \partial x^i\) on \(\mathcal Q\) for \(i \in \lbrace 1,2,3 \rbrace \), but \(\partial \tilde{\varphi }/ \partial t\) need not exist. However, if we integrate \(\partial \tilde{\varphi }/ \partial x^1\) with respect to \(x^1\), we obtain a smooth function on \(\mathcal Q\), i.e. by choosing the integration constants appropriately we may assume \(\tilde{\varphi }\in C^\infty \left( \mathcal Q, \mathbb {R}\right) \). We then repeat this argument to obtain \(\tilde{V} \in C^\infty \left( \mathcal Q, \mathbb {R}\right) \).

Choosing \(\varphi := - m \tilde{\varphi }/ \hbar \) and \(V := - \tilde{V}\), we get via (3.7b) and (2.26a), that (3.6e) and (3.6f) hold.

Define now

$$\begin{aligned} U := - \frac{\hbar ^2}{2m} \, \frac{\Delta \sqrt{\rho }}{\sqrt{\rho }} , \end{aligned}$$
(3.7d)

and \(\tilde{U} := V + U\). Using the Weber identity (Lemma 3.1) together with (3.6c), equation (3.6a) reads

$$\begin{aligned} - \hbar \frac{\partial }{\partial t}\left( \nabla \varphi \right) + \nabla \left( \frac{m}{2} \, {\vec {X}}^2 \right) = - \nabla \tilde{U} . \end{aligned}$$
(3.7e)

Due to the smoothness of \(\varphi \) and Schwarz’s theorem, we have

$$\begin{aligned} \frac{\partial }{\partial t} \nabla \varphi = \nabla \frac{\partial \varphi }{\partial t} , \end{aligned}$$
(3.7f)

and hence

$$\begin{aligned} \nabla H \equiv \nabla \left( \frac{m}{2} \, {\vec {X}}^2 + \tilde{U} - \hbar \frac{\partial \varphi }{\partial t} \right) = 0 . \end{aligned}$$
(3.7g)

Thus H, as defined by the left side of (3.6g), depends only on t. If \(H \ne 0\), we can redefine V via \(V - H \rightarrow V\) as then \({\vec {F}} = - \nabla \left( V - H \right) = - \nabla V\) remains true. Hence (3.6g) follows.

We now define \(\Psi \) via (3.6h), \(R:= \sqrt{\rho }\) and calculate in accordance with (2.26d):

$$\begin{aligned} \Delta \Psi&= \nabla \cdot \left( \nabla \left( R \, e^{-\i \varphi } \right) \right) \nonumber \\&= \nabla \cdot \left( \nabla R \, e^{-\i \varphi } - \i R \nabla \varphi \, e^{-\i \varphi } \right) \nonumber \\&= e^{-\i \varphi } \left( \Delta R - 2 \i \nabla R \cdot \nabla \varphi - \i R \Delta \varphi - R \left( \nabla \varphi \right) ^2 \right) \nonumber \\&= e^{-\i \varphi } \left( \Delta R - R \left( \nabla \varphi \right) ^2 - \i \left( 2 \nabla R \cdot \nabla \varphi + R \Delta \varphi \right) \right) . \end{aligned}$$
(3.7h)

Plugging \(\rho = R^2\) and (3.6e) into (3.6b) yields

$$\begin{aligned} 2 R \frac{\partial R}{\partial t} - \frac{\hbar }{m} \left( 2 R \, \nabla R \cdot \nabla \varphi + R^2 \, \Delta \varphi \right) = 0 . \end{aligned}$$
(3.7i)

Since R vanishes nowhere, we can multiply with \(m / (\hbar R)\), compare with (3.7h) and arrive at

$$\begin{aligned} - \mathfrak {I}\left( e^{ \i \varphi } \Delta \Psi \right) = \frac{2m}{\hbar } \frac{\partial R}{\partial t} . \end{aligned}$$
(3.7j)

On the other hand, (3.6g) can also be reformulated in terms of \(\varphi \) and R to yield

$$\begin{aligned} - \frac{\hbar ^2}{2m} \left( \Delta R - R \left( \nabla \varphi \right) ^2 \right) - \hbar R \frac{\partial \varphi }{\partial t} + V R = 0 . \end{aligned}$$
(3.7k)
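
The intermediate algebra leading to (3.7k) can be sketched as follows. By (3.6e) we have \({\vec {X}} = - \left( \hbar / m \right) \nabla \varphi \), so that \(\frac{m}{2} {\vec {X}}^2 = \frac{\hbar ^2}{2m} \left( \nabla \varphi \right) ^2\). Multiplying (3.6g) by \(R = \sqrt{\rho }\) then gives

$$\begin{aligned} 0 = R \, H = \frac{\hbar ^2}{2m} \, R \left( \nabla \varphi \right) ^2 + V R - \hbar R \frac{\partial \varphi }{\partial t} - \frac{\hbar ^2}{2m} \, \Delta R , \end{aligned}$$

which is (3.7k) once the two terms proportional to \(\hbar ^2 / (2m)\) are collected.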

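By comparing (3.7k) with (3.7h), we see that the left-hand side of (3.7k) contains \(- \hbar ^2 / (2m)\) times the real part of \(e^{\i \varphi } \Delta \Psi \). We can therefore construct the full \(\Delta \Psi \) by adding \(\i \) times the imaginary part of \(e^{\i \varphi } \Delta \Psi \), for which we have the expression (3.7j). This gives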

$$\begin{aligned} - \frac{\hbar ^2}{2m} \Delta \Psi \, e^{ \i \varphi } + V R = - \frac{\hbar ^2}{2m} \, \i \left( - \frac{2m}{\hbar } \frac{\partial R}{\partial t} \right) + \hbar R \frac{\partial \varphi }{\partial t} = \i \hbar \frac{\partial R}{\partial t} + \hbar R \frac{\partial \varphi }{\partial t} . \end{aligned}$$
(3.7l)

To take care of the right hand side, we notice

$$\begin{aligned} \i \hbar e^{\i \varphi } \frac{\partial \Psi }{\partial t} = \i \hbar \frac{\partial R}{\partial t} + \hbar R \frac{\partial \varphi }{\partial t} . \end{aligned}$$
(3.7m)

Thus, by multiplying (3.7l) by \(e^{-\i \varphi }\), we finally arrive at the Schrödinger equation (3.6i).

“\(\impliedby \)” The reverse construction amounts to Madelung’s discovery [28]. We may define the real function \(R := |\Psi |=: \sqrt{\rho }\), yet, unfortunately, we cannot write \(\Psi \) as in (3.6h), since the complex exponential is not (globally) invertible. Instead we define \(Q := \Psi / |\Psi |\) and observe that by (3.6j)

$$\begin{aligned} {\vec {X}} = \frac{\hbar }{m} \mathfrak {I}\left( \frac{\nabla \left( R Q\right) }{ R Q} \right) = \frac{\hbar }{m} \mathfrak {I}\left( \frac{\nabla Q}{ Q} \right) . \end{aligned}$$
(3.7n)

We now do the calculation backwards with Q instead of \(e^{- \i \varphi }\). So in analogy to (3.7h) we consider

$$\begin{aligned} \Delta \Psi&= \nabla \cdot \left( \nabla \left( R Q \right) \right) = \nabla \cdot \left( \nabla R \, Q + R \, \frac{\nabla Q}{Q} \, Q\right) \nonumber \\&= \Delta R \, Q + 2 \, \nabla R \cdot \left( \frac{ \nabla Q}{Q} \right) \, Q + R \, \nabla \cdot \left( \frac{ \nabla Q}{Q} \right) \, Q + R \, \left( \frac{ \nabla Q}{Q} \right) ^2 \, Q \nonumber \\&= Q \left( \Delta R + R \, \left( \frac{ \nabla Q}{Q} \right) ^2 + 2 \, \nabla R \cdot \left( \frac{ \nabla Q}{Q} \right) + R \, \nabla \cdot \left( \frac{ \nabla Q}{Q} \right) \right) \end{aligned}$$
(3.7o)

and in analogy to (3.7m) we obtain

$$\begin{aligned} \i \hbar \frac{\partial \left( R Q\right) }{\partial t} = \i \hbar \frac{\partial R}{\partial t} Q + \i \hbar R \, \left( \frac{\frac{\partial Q}{\partial t}}{Q}\right) \, Q . \end{aligned}$$
(3.7p)

Dividing the Schrödinger equation (3.6i) by Q and inserting (3.7o) as well as (3.7p), we can take the imaginary part \(\mathfrak {I}\) as well as the real part \(\mathfrak {R}\). This is done by employing the facts that both commute with derivatives, derivatives of Q divided by Q are purely imaginary and that for any complex number \(A \in \mathbb {C}\), we have \(\mathfrak {R}\left( \i A\right) = - \mathfrak {I}A\) and \(\mathfrak {I}\left( \i A\right) = \mathfrak {R}A\). Then after some further algebraic manipulation and using (3.7n), the imaginary part yields the continuity equation (3.6b) and the real part gives (3.6g) with \(\hbar \, \mathfrak {I}\left( (\partial Q / \partial t) /Q \right) \) instead of \(- \hbar \, \partial \varphi / \partial t\). For the latter, we again use the Weber identity from Lemma 3.1 and note

$$\begin{aligned} \nabla \left( \frac{\frac{\partial Q}{\partial t}}{Q} \right) = \frac{\frac{\partial }{\partial t} \nabla Q}{Q} - \frac{\frac{\partial Q}{\partial t} \nabla Q}{Q^2} = \frac{\partial }{\partial t} \left( \frac{\nabla Q}{Q}\right) . \end{aligned}$$
(3.7q)

Recalling the definition (3.6f) of \({\vec {F}}\) we indeed obtain (3.6a). Finally, (3.6d) and (3.6c) are obtained by seeing that \({\vec {F}}\) is a gradient vector field and by calculating

$$\begin{aligned} \nabla \times \left( \frac{\nabla Q}{Q}\right) = \frac{\nabla \times \nabla Q}{Q} - \frac{\nabla Q \times \nabla Q}{Q^2} = 0 . \end{aligned}$$
(3.7r)

This completes the proof. \(\square \)

Since for every \((t,{{\vec {x}}} ) \in \mathcal Q\) every sufficiently small open ball centered at that point is canonically a Newtonian spacetime as well (with contractible time slices), the theorem shows that the Madelung equations for irrotational force fields and the Schrödinger equation are locally equivalent.
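
As a worked illustration of Theorem 3.2, consider the following sketch, whose assumptions are not part of the original text. Take \(\mathcal Q = \mathbb {R}^4\), the isotropic harmonic force \({\vec {F}} = - m \omega ^2 \, {\vec {x}}\) with a hypothetical angular frequency \(\omega \in \mathbb {R}_+\), the intrinsic observer field \(X = \partial / \partial t\) and the unnormalized density \(\rho = \exp \left( - m \omega \, {\vec {x}}^2 / \hbar \right) \). With \(R = \sqrt{\rho }\) one finds

$$\begin{aligned} \nabla R = - \frac{m \omega }{\hbar } \, {\vec {x}} \, R , \qquad \frac{\Delta R}{R} = \frac{m^2 \omega ^2}{\hbar ^2} \, {\vec {x}}^2 - \frac{3 m \omega }{\hbar } , \qquad \frac{\hbar ^2}{2m} \nabla \frac{\Delta R}{R} = m \omega ^2 \, {\vec {x}} = - {\vec {F}} , \end{aligned}$$

so (3.6a) holds with \(\dot{X} = 0\): the external force is exactly balanced by the second term on the right-hand side of (3.6a). Equations (3.6b), (3.6c) and (3.6d) are immediate, since \(\rho \) is time-independent, \({\vec {X}} = 0\) and \({\vec {F}}\) is a gradient. Choosing \(V = m \omega ^2 \, {\vec {x}}^2 / 2\), equation (3.6e) forces \(\nabla \varphi = 0\) and (3.6g) reduces to

$$\begin{aligned} H = \frac{3}{2} \hbar \omega - \hbar \frac{\partial \varphi }{\partial t} = 0 , \qquad \text {hence} \quad \varphi = \frac{3}{2} \, \omega t , \quad \Psi = \exp \left( - \frac{m \omega }{2 \hbar } \, {\vec {x}}^2 \right) e^{- 3 \i \omega t / 2} , \end{aligned}$$

up to an additive constant in \(\varphi \). This is the familiar ground state of the three-dimensional harmonic oscillator with energy \(3 \hbar \omega / 2\), and a direct computation confirms that it satisfies (3.6i).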

Remark 3.3

(On the ‘Quantization Condition’) In the literature one finds the claim that a quantization condition needs to be added for the Schrödinger and the Madelung equations to be equivalent [45, §3.2.2]; [39]; [38, §6]; [44], namely

$$\begin{aligned} \frac{m}{2 \pi \hbar } \oint _{\gamma } \iota _t^* \left( \delta \cdot X \right) \in \mathbb {Z} \end{aligned}$$
(3.8a)

for all \(t \in I\) and all smooth loops \(\gamma :[0,2 \pi ] \rightarrow \Omega _t\). Note that, as observed by Holland [45, §3.2.2], equation (3.8a) is astonishingly similar, yet inequivalent to the Bohr–Sommerfeld quantization condition in the old quantum theory [16]. Recalling Stokes’ theorem [46, Theorem 4.2.14] and that the irrotationality (3.6c) of X is equivalent to closedness of \(\iota _t^* \left( \delta \cdot X \right) \) for all \(t \in I\), we see that expression (3.8a) vanishes for every irrotational X, all \(t \in I\) and all \(\gamma \) if and only if \(b_1 \left( \Omega _t\right) \equiv 0\). Condition (3.8a) can therefore only be relevant for the case \(b_1 \left( \Omega _t\right) \ne 0\) for some \(t \in I\). Condition (3.8a) originates from the simplest quantum mechanical model of the hydrogen atom and indeed excludes apparently unphysical bound states, but, as we will show in detail, is itself of topological origin.
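
The relation between (3.8a) and the single-valuedness of the wave function can be made explicit in a short sketch. Assume, for this illustration only, that along a loop \(\gamma \) the drift field is locally of the form (3.6e), with a phase \(\varphi \) that is smooth along \(\gamma \) except for a possible jump at a single point. We write \(\left[ \varphi \right] _\gamma \) for the total change of \(\varphi \) accumulated along \(\gamma \). This symbol is introduced here and does not appear elsewhere. Then

$$\begin{aligned} \frac{m}{2 \pi \hbar } \oint _{\gamma } \iota _t^* \left( \delta \cdot X \right) = - \frac{1}{2 \pi } \oint _{\gamma } \text {{d}}\varphi = - \frac{\left[ \varphi \right] _\gamma }{2 \pi } . \end{aligned}$$

Condition (3.8a) thus states \(\left[ \varphi \right] _\gamma \in 2 \pi \mathbb {Z}\), which is what allows \(e^{- \i \varphi }\), and hence \(\Psi \) as in (3.6h), to be continued around \(\gamma \) as a single-valued function.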

We consider the Madelung equations for a particle with charge \(-q \in \left( - \infty , 0 \right) \) being attracted via the Coulomb force by a particle with charge q fixed at position \(0 \in \mathbb {R}^3\). The maximal domain where \({\vec {F}}\) is smooth is \(\mathbb {R}\times \left( \mathbb {R}^3 \setminus \left\{ 0 \right\} \right) \), and we have \(b_1 \left( \mathbb {R}^3 \setminus \left\{ 0 \right\} \right) = 0\). Together with irrotationality (3.6d) of \({\vec {F}}\), we can thus find a potential \(V :\mathbb {R}\times \left( \mathbb {R}^3 \setminus \left\{ 0 \right\} \right) \rightarrow \mathbb {R}\). Moreover, in spherical coordinates

$$\begin{aligned}&\left( t,r, \theta , \phi \right) :\mathbb {R}^4 \setminus \bigg \{ \left( t,{{\vec {x}}}\right) \in \mathbb {R}^4 \bigg | {x^1 \ge 0, x^2=0}\bigg \} \rightarrow \nonumber \\&\quad \mathbb {R}\times \mathbb {R}_+ \times \left( 0, \pi \right) \times \left( 0, 2 \pi \right) \end{aligned}$$
(3.8b)

we can write the values of V as \(V \left( r\right) \), since \({\vec {F}}\) is time-independent.

If we now look for stationary (i.e. t-independent) solutions of the Madelung equations, we find the natural domains \({{\mathrm{dom}}}\rho = {{\mathrm{dom}}}X\) to be of the form \(\mathbb {R}\times \Omega =: \mathcal Q\) with open \(\Omega \subseteq \mathbb {R}^3 \setminus \lbrace 0 \rbrace \), but in general we cannot assume \(b_1 \left( \Omega \right) = 0\). That is, to be able to write down the Schrödinger equation by application of Theorem 3.2, we have to formally restrict ourselves to a (maximal, non-unique) subset \(\Omega ' \subseteq \Omega \) with \(b_1 \left( \Omega ' \right) = 0\). The set \(I \times \Omega '\) is the natural domain of \(\varphi \) and \(\Psi \), but first we have to find the solution and then we may fix \(\Omega '\). Due to the rotational symmetry of the problem, we may already assume

$$\begin{aligned} \Omega ' \subseteq W := \mathbb {R}^3 \setminus \bigg \{ {{\vec {x}}} \in \mathbb {R}^3 \bigg | {x^1 \ge 0, x^2=0}\bigg \} \subseteq \Omega \end{aligned}$$
(3.8c)

such that \(\mathbb {R}\times W = {{\mathrm{dom}}}\left( t,r, \theta , \phi \right) \). Note that the assumption of stationarity implies \(\nabla (\partial \varphi /\partial t) = 0\), but \(\varphi \) may be time dependent. If we now proceed, as usual, by separation of variables in spherical coordinates, we obtain a splitting \(\varphi \left( t, r ,\theta , \phi \right) = \varphi _0 \left( t\right) +\varphi _1 \left( r \right) + \varphi _2 \left( \theta \right) + \varphi _3 \left( \phi \right) \) with \(E \in \mathbb {R}\) and \(\varphi _0 \left( t\right) = t E / \hbar \), a radial equation and a spherical one. The latter leads to \(\varphi _3 \left( \phi \right) = - \tilde{m} \phi \) with \(\tilde{m} \in \mathbb {R}\) and the associated Legendre equation for \(\xi :(-1,1) \rightarrow \mathbb {C} :y = \cos \theta \mapsto \xi \left( y\right) \) given by

$$\begin{aligned} \left( 1- y^2 \right) \frac{\text {{d}}^2 \xi }{ \text {{d}}y^2} \left( y \right) -2 y\frac{\text {{d}}\xi }{ \text {{d}}y} \left( y \right) + \left( l (l+1) - \frac{\tilde{m} ^2}{1 - y^2}\right) \xi \left( y\right) = 0 . \end{aligned}$$
(3.8d)

Now one usually asks for the condition

$$\begin{aligned} \Psi \left( t, r, \theta , \phi \right) = \Psi \left( t, r, \theta , \phi + 2 \pi k \right) \end{aligned}$$
(3.8e)

to be satisfied for some \(x \in \mathcal Q\) and for all \(k \in \mathbb {Z}\), which constrains \(\tilde{m}\) (and ultimately the other quantum numbers l and n) to be integers. If \(\Psi \) were a global function, (3.8e) would follow from the continuity of \(\Psi \) on \(\mathcal Q = \mathbb {R}\times \Omega \) and the property of \(\Omega \), that there exists an \({{\vec {x}}} \in \Omega \) such that the curve \(\gamma _{{{\vec {x}}}} :\mathbb {R}\rightarrow \Omega \), given by

$$\begin{aligned} \gamma _{{{\vec {x}}}} \left( s \right) := \left( \sqrt{\left( x^1\right) ^2 + \left( x^2 \right) ^2} \cos s,\sqrt{\left( x^1\right) ^2 + \left( x^2 \right) ^2} \sin s , x^3 \right) , \end{aligned}$$
(3.8f)

lies entirely in \(\Omega \). However, assumption (3.8e) cannot be made if we only ask for \(\Psi \) to be continuous on \(\mathbb {R}\times \Omega ' \subseteq \mathbb {R}\times W\). As Eq. (3.8d) also admits solutions for \(l, \tilde{m}\) not integer [56, p. 288ff]; [57, p. 180f], we may continue to solve the other equationFootnote 8 and ultimately find that there are solutions \(\Psi \) with \(\tilde{m} \notin \mathbb {Z}\) and \(X \in \mathfrak X \left( \mathbb {R}\times \Omega ' \right) \), given by

$$\begin{aligned} X_{\left( t, r, \theta , \phi \right) } = {\frac{\partial }{\partial t}}|_{\left( t, r, \theta , \phi \right) } + \frac{\hbar \, {\tilde{m}}}{m r^2 \sin ^2 \theta } \, {\frac{\partial }{\partial \phi }}|_{\left( t, r, \theta , \phi \right) } \end{aligned}$$
(3.8g)

in spherical coordinates. Keep in mind that \(\Omega '\) also depends on \(|\Psi |\), in particular we have to exclude all zeros of the wave function. Yet the field X, as given by (3.8g), can be smoothly extended to \(\mathbb {R}\times W'\) with

$$\begin{aligned} \Omega ' \subset W' := \mathbb {R}^3 \setminus \bigg \{ {{\vec {x}}} \in \mathbb {R}^3 \bigg | {x^1 = x^2=0}\bigg \} \subseteq \Omega \end{aligned}$$
(3.8h)

and \(b_1 \left( W' \right) = 1\). For this X, equation (3.8a) does not hold. Equation (3.8a) would indeed hold for all stationary solutions, had we ad hoc assumed that \(\Psi \) is a global function, i.e. \(\Omega ' = \Omega \). Conversely, had we ad hoc assumed condition (3.8a), then \(\Psi \) would be a global function.
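
The obstruction can be made explicit by computing the circulation of the spatial part of (3.8g) along the curve (3.8f), restricted to \(s \in [0, 2 \pi ]\). The following is a sketch under the assumption that \(\delta \cdot {\vec {X}}\) denotes the 1-form obtained from \({\vec {X}}\) by lowering the index with the Euclidean metric, whose relevant component in spherical coordinates is \(\delta _{\phi \phi } = r^2 \sin ^2 \theta \):

$$\begin{aligned} \delta \cdot {\vec {X}} = r^2 \sin ^2 \theta \, \frac{\hbar \, {\tilde{m}}}{m r^2 \sin ^2 \theta } \, \text {{d}}\phi = \frac{\hbar \, {\tilde{m}}}{m} \, \text {{d}}\phi , \qquad \oint _{\gamma _{{{\vec {x}}}}} \delta \cdot {\vec {X}} = \frac{\hbar \, {\tilde{m}}}{m} \int _0^{2 \pi } \text {{d}}s = \frac{2 \pi \hbar \, {\tilde{m}}}{m} . \end{aligned}$$

Since \(\text {{d}}\phi \) is closed but not exact on \(W'\), the field is irrotational there, yet its circulation around the \(x^3\)-axis does not vanish for \(\tilde{m} \ne 0\). The circulation is an integer multiple of \(2 \pi \hbar / m\) only if \(\tilde{m} \in \mathbb {Z}\).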

We conclude that the Madelung equations in general admit more solutions than the Schrödinger equation, if the latter is assumed to be globally valid. However, since any point in \(\mathcal Q \subseteq \mathbb {R}^4\) admits a contractible neighborhood, Theorem 3.2 shows that the claim that “theories based on the Madelung equations simply do not reproduce the Schrödinger equation” [39, §IV] is incorrect. While the existence of apparently unphysical additional solutions in this model of the hydrogen atom does indicate a potential defect of the model, it does not imply that the Madelung equations yield an incorrect description of quantum phenomena: This model of the hydrogen atom neglects the motion of the nucleus, the dynamics of the electromagnetic fields, as well as relativistic effects. It is thus plausible that the problem expressed in [45, §3.2.2]; [39]; [38, §6]; [44] stems from an oversimplification of the physical situation. Moreover, Wallstrom raised the interesting question of stability of stationary solutions in this model [39, §IV]. Since unstable solutions are in a sense ‘unphysical’, it might be possible to exclude the additional ones on that ground.

We now fix some terminology, which is partially derived from [39] and partially our own. The Madelung picture consists of the Madelung equations, that is

  (i) the Newton–Madelung equation (3.6a),

  (ii) the continuity equation (3.6b),

  (iii) the vanishing vorticity/irrotationality of the drift (velocity) field X (3.6c),

the topological condition \(b_1 \left( \Omega _t \right) = 0\) for all \(t \in I\) and the irrotationality of the (external) force (3.6d). Obviously, the Madelung equations are a system of partial differential equations of third order in the probability density \(\rho \) and of first order in the drift field X. That means in particular, that \(\rho \) and X are the primary quantities of interest in the Madelung picture, as opposed to e.g. (time-dependent) wave functions in the Schrödinger picture or (time-dependent) operators in the Heisenberg picture. It is therefore justified to call a solution of the Madelung equations \(\left( \rho , X \right) \) a state (of the system) and \(\left( \rho _t, \vec {X}_{t} \right) \) with \(\rho _t \in C^\infty \left( \Omega _t, \mathbb {R}_+\right) \), \(\vec {X}_{t} \in \mathfrak X \left( \Omega _t \right) \) a state (of the system) at time \(t \in I\). The flow of the drift field is called the drift flow or probability flow and the mass of the particle times the drift field is called the drift momentum field, for reasons explained in Sect. 5. The drift field X is a Newtonian observer vector field and, in accordance with Definition 2.3, \({\vec {X}}\) is the spacelike component of the drift field. In reminiscence of the hydrodynamic analogue (unsteady potential flow) [29, §2.1], we call (3.6g) the Bernoulli-Madelung equation. The operator \(U :C^\infty \left( \mathcal Q, \mathbb {R}_+\right) \rightarrow C^\infty \left( \mathcal Q, \mathbb {R}\right) \), as defined by

$$\begin{aligned} U \left( \rho \right) := - \frac{\hbar ^2}{2m} \frac{\Delta \sqrt{\rho }}{\sqrt{\rho }} , \end{aligned}$$
(3.9)

is known as the quantum potential or Bohm potential. Analogously, we call the operator \({\vec {F}} _B := -\nabla U :C^\infty \left( \mathcal Q, \mathbb {R}_+ \right) \rightarrow \mathfrak X _{Ns} \left( \mathcal Q\right) \) (with \(\left( \nabla U\right) \left( \rho \right) := \nabla \left( U \left( \rho \right) \right) \)) the quantum force or Bohm force. This terminology is primarily historically motivated; we emphasize that the interpretation of \(-\nabla U\) as an actual force is deeply problematic. Again, we refer to Sect. 5.
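
As an illustration of (3.9), consider at a fixed time a Gaussian density of positive width \(\sigma \) centered at the origin. The symbol \(\sigma \) and the choice of density are assumptions made for this example only. With \(r^2 := \left( x^1\right) ^2 + \left( x^2\right) ^2 + \left( x^3\right) ^2\) one finds

$$\begin{aligned} \rho&= \left( 2 \pi \sigma ^2 \right) ^{-3/2} e^{- r^2 / \left( 2 \sigma ^2 \right) } , \qquad \frac{\Delta \sqrt{\rho }}{\sqrt{\rho }} = \frac{r^2}{4 \sigma ^4} - \frac{3}{2 \sigma ^2} , \\ U \left( \rho \right)&= \frac{\hbar ^2}{2m} \left( \frac{3}{2 \sigma ^2} - \frac{r^2}{4 \sigma ^4} \right) , \qquad {\vec {F}}_B \left( \rho \right) = - \nabla \left( U \left( \rho \right) \right) = \frac{\hbar ^2}{4 m \sigma ^4} \, {{\vec {x}}} . \end{aligned}$$

In this example the vector field \(-\nabla U\) points radially outward and grows linearly with the distance from the center, which is consistent with the familiar spreading of a free Gaussian wave packet.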

A priori, there are four real-valued functions constituting a solution of the Newton–Madelung equation: \(\rho \) and three components of \({\vec {X}}\). If X is irrotational and \(b_1 \left( \Omega _t\right) \equiv 0\), it is enough to know the two functions \(\rho \) and \(\varphi \) (or the wave function \(\Psi \)) to fully determine the physical model. If a solution \(\Psi \) of the Schrödinger equation is known, the simplest way to recover \(\rho \) and X is by calculating

$$\begin{aligned} \rho = \Psi ^* \Psi \end{aligned}$$
(3.10)

and the spacelike component \({\vec {X}}\) of X via (3.6j). So by using Madelung’s theorem (Theorem 3.2), we can move freely between the Schrödinger and Madelung picture, at least locally.
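
As a simple local illustration, take a plane wave with positive constant amplitude A. The sketch below assumes the phase convention \(\Psi = R e^{-\i \varphi }\) used in Sect. 4 and the relation \(m {\vec {X}} = - \hbar \nabla \varphi \) that enters (4.18) and (4.19); the symbols A, \({\vec {k}}\) and \(\omega \) are introduced only for this example:

$$\begin{aligned} \Psi \left( t, {{\vec {x}}} \right) = A \, e^{\i \left( {\vec {k}} \cdot {{\vec {x}}} - \omega t \right) } \quad \Longrightarrow \quad \varphi = \omega t - {\vec {k}} \cdot {{\vec {x}}} , \quad \rho = \Psi ^* \Psi = A^2 , \quad {\vec {X}} = - \frac{\hbar }{m} \nabla \varphi = \frac{\hbar \, {\vec {k}}}{m} . \end{aligned}$$

The density is constant and hence not normalizable on all of \(\mathbb {R}^3\), so the example should be read as a statement on a bounded open subset only.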

Remark 3.4

(On Time Dependence) In correspondence with the arguments outlined in Sect. 2, the irrotationality of X is a consequence of the (special-)relativistic condition

$$\begin{aligned} \text {{d}}\left( \eta \cdot X \right) = 0 , \end{aligned}$$
(3.11a)

where X is an observer vector field on an open subset of Minkowski spacetime \(\left( \mathcal Q \subseteq \mathbb {R}^4, \eta \right) \). As noted before, in the Newtonian limit \(X^0 \approx c\) and we thus obtain the conditions

$$\begin{aligned} \frac{1}{c} \frac{\partial {\vec {X}}}{\partial t} \approx 0 , \quad \nabla \times {\vec {X}} = 0 \end{aligned}$$
(3.11b)

instead of mere irrotationality on \(\mathcal Q\) to stay consistent within the relativistic ontology. That is, if X is irrotational, it must also be approximately time-independent in the above sense or the (naive) Newtonian limit breaks down.
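
The step from (3.11a) to (3.11b) can be sketched in components. We assume inertial coordinates \(x^0 = c t, x^1, x^2, x^3\), write \(X_\mu := \eta _{\mu \nu } X^\nu \) and assume that the spatial derivatives of \(X^0 \approx c\) are negligible; these are assumptions of this sketch. For \(i, j \in \lbrace 1, 2, 3 \rbrace \) the components of the 2-form in (3.11a) then read

$$\begin{aligned} \left( \text {{d}}\left( \eta \cdot X \right) \right) _{0 i}&= \partial _0 X_i - \partial _i X_0 \approx \mp \frac{1}{c} \frac{\partial X^i}{\partial t} , \\ \left( \text {{d}}\left( \eta \cdot X \right) \right) _{i j}&= \partial _i X_j - \partial _j X_i = \mp \left( \frac{\partial X^j}{\partial x^i} - \frac{\partial X^i}{\partial x^j} \right) , \end{aligned}$$

where the upper sign corresponds to the signature \((+,-,-,-)\). Setting both expressions to zero yields the two conditions in (3.11b), independently of the sign convention.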

There is a mathematical problem that deserves to be mentioned.

Question 1

(Existence and Uniqueness of Solutions) Assuming that the probability density \(\rho \) and the drift field X are given and smooth on \(\Omega \equiv \Omega _0\), under which conditions does there exist a smooth solution to the Madelung equations? Is it unique? Is the vector field X complete?

Apparently the question has been partially resolved by Jüngel et al. [59], who showed local existence and uniqueness of weak solutions in the special case of X being a gradient vector field.

Returning to our original discussion in the beginning of this section, how do the Madelung equations offer a resolution of the problems associated with the Schrödinger equation?

We see that the use of the complex function \(\Psi \) makes it possible to rewrite the Newton–Madelung equation and the continuity equation into one complex, second order, linear partial differential equation, which is arguably simpler to solve. Thus one can view the Schrödinger equation as an intermediate step in solving the Madelung equations, as has already been noted by Zak [60]. The Madelung equations are formulated in terms of quantities that do not have a “gauge freedom”, that is, all the quantities in the Madelung equations are in principle uniquely defined and physically measurable. This argument alone is sufficient to consider the Madelung equations more fundamental than the Schrödinger equation. For example, the actual physical quantity corresponding to the phase \(\varphi \) must be a coordinate-independent derivative thereof, as the physically measurable predictions in the Schrödinger picture are invariant under the transformation \(\varphi \rightarrow \varphi + \varphi _0\) with \(\varphi _0 \in \mathbb {R}\) and, of course, coordinate transformations. A similar argument can be made for the potential V of the force \({\vec {F}}\). Thus, if one wishes to generalize the description of quantum systems with one Schrödinger particle to the relativistic and/or constrained case, starting with the Madelung equations rather than the Schrödinger equation is the natural choice. Indeed, the Madelung equations offer a straightforward (though unphysical, cf. footnote 4) generalization of the Schrödinger equation to the (general)-relativistic case, but we will not discuss this here. For this reason, the resolution of (v) and (vi) will be postponed. The treatment of constrained non-relativistic systems can be approached by either solving the Madelung equations together with these constraints directly (cf. Sect. 5.1 for the interpretation of \(\rho \) and X) or by passing over to a Hamiltonian formalism with the use of the Bernoulli-Madelung equation (3.6g).
A generalization of the Madelung equations for non-conservative forces is immediate and the generalization to dissipative systems has been pursued in [42]. Note that for some generalizations it might not be possible to construct a Schrödinger equation, notably for rotational drift fields and forces.

Remark 3.5

(Geometric Constraints) If the constraint is geometric, i.e. if the particles are constrained to an embedded submanifold M of \(\mathbb {R}^3\), like the surface of a sphere or a finite Möbius band, the adaptation of the Madelung equations follows the same procedure as for any other Newtonian continuum theory:

  (i) We first assume that \(\mathcal Q\) is an open subset of \(\mathbb {R}\times M \subseteq \mathbb {R}^4\) instead of \(\mathbb {R}^4=\mathbb {R}\times \mathbb {R}^3\) and pull back the structures \(\text {{d}}\tau , \delta , \nabla \) on \(\mathbb {R}^4\) via the inclusion map \(\xi :\mathbb {R}\times M \rightarrow \mathbb {R}\times \mathbb {R}^3\). Defining \(I, \Omega _t, \iota _t\) as in (2.25) and taking again the pullback of the spatial metric to get \(h_t\) for each \(t \in I\), this yields the vector calculus operators divergence, gradient and Laplacian on \(\mathcal Q\), in full analogy to Definition 2.5.

  (ii) Then we write down the continuity equation for these new vector calculus operators, \(\rho \in C^\infty \left( \mathcal Q ,\mathbb {R}\right) \) and \(X= \partial / \partial t + {\vec {X}} \in \mathfrak X \left( \mathcal Q \right) \) such that \({\vec {X}}_{\iota _t}\) is tangent to \(\Omega _t\) for every \(t \in I\) (defined in full analogy to (2.25)).

  (iii) We restrict the force to \(\mathcal Q \subseteq \mathbb {R}\times M\), take only its tangential part, and then write down the Newton–Madelung equations for the new force \({\vec {F}}\) and vector calculus operators.

  (iv) We replace the irrotationality of \({\vec {X}}\) by the condition

    $$\begin{aligned} \text {{d}}\left( h_t \cdot {\vec {X}}_{\iota _t}\right) = 0 \quad \quad \forall t \in I . \end{aligned}$$
    (3.12)

  (v) To construct a Schrödinger equation, we require that the new \({\vec {F}}\) also satisfies the above condition and, of course, the topological condition \(b_1 \left( \Omega _t \right) = 0\) for all \(t \in I\) needs to hold. We then proceed as in the proof of Theorem 3.2.

We conjecture that this procedure just yields the ordinary Schrödinger equation on M with Laplacian induced by h (considered as Riemannian for fixed t).

Madelung’s theorem also gives an explicit condition for the global equivalence of the equations, which is, of course, topological. However, in practice we do not know the natural domain \(\mathcal Q\) of \(\rho \) and X in advance, but we are given (sufficiently smooth) initial values of \(\rho \) and X on \(\lbrace 0 \rbrace \times \Omega \) with \(\Omega = \Omega _0 \subseteq \mathbb {R}^3\) and would then like to know whether we can apply Theorem 3.2 globally. There is a convenient answer to this question by noting that, on the grounds of Theorem 5.2, we may identify \(\mathcal Q \subseteq \mathbb {R}^4\) to be the image of \(\lbrace 0 \rbrace \times \Omega \) under the flow of X. Since we would like to have a global dynamical evolution of the system, we may assume that there exists an open interval \(I \subseteq \mathbb {R}\) such that the flow \(\Phi _t\) of X is defined for all \(t \in I\). The next proposition states the topological consequences of this situation.

Proposition 3.6

(Topology of \(\Omega _t\) and \(\mathcal Q\)) Let \(\left( \mathcal Q , \text {{d}}\tau , \delta , \mathcal O \right) \) be a Newtonian spacetime, \(I \subseteq \mathbb {R}\) be an open interval with \(0 \in I\) and let \(\Omega := \Omega _0 = \bigg \{ {{\vec {x}}} \in \mathbb {R}^3 \bigg | {\left( 0, {{\vec {x}}}\right) \in \mathcal Q}\bigg \}\). Further, let X be a Newtonian observer vector field with flow \(\Phi \), such that \(\mathcal Q\) is the image \(\Phi _{I}\left( \lbrace 0 \rbrace \times \Omega \right) \).

Then \(\Omega _t := \bigg \{ {{\vec {x}}} \in \mathbb {R}^3 \bigg | { (t, {{\vec {x}}}) \in \mathcal Q}\bigg \}\) is diffeomorphic to \(\Omega \) and \(\mathcal Q\) is diffeomorphic to \(I \times \Omega \). In particular, the Betti numbers \(b_i \left( \Omega \right) \), \(b_i \left( \Omega _t\right) \) and \(b_i \left( \mathcal Q \right) \) coincide for every \(i \in \mathbb {N}_0 := \mathbb {N}\cup \lbrace 0 \rbrace \) and \(t \in I\).

Proof

Define \(\phi := \Phi _{\upharpoonright _{I \times (\lbrace 0 \rbrace \times \Omega )}}\). Hence \({{\mathrm{dom}}}\phi = I \times \Omega \) and \(\phi \) is surjective onto \(\mathcal Q\). Since X is a Newtonian observer vector field, we find that there exists a smooth function \(\vec {\Phi } :I \times \Omega \rightarrow \mathbb {R}^3\) such that for all \(t \in I, {{\vec {x}}} \in \Omega \):

$$\begin{aligned} \phi \left( t, {{\vec {x}}}\right) = \Phi _t \left( 0, {{\vec {x}}} \right) = \bigl (t, \vec {\Phi }_t \left( {{\vec {x}}} \right) \bigr ) . \end{aligned}$$
(3.14a)

Since \(\Phi _t\) is injective for any \(t \in I\), so is \(\vec \Phi _t :\Omega \rightarrow \Omega _t\) and also \(\phi \). As \(\phi \left( t, . \, \right) = \Phi _t \left( 0 , . \, \right) = \left( t, \vec \Phi _t \left( .\, \right) \right) \) for all \(t \in I\), the differential \((\partial \vec \Phi _t / \partial {{\vec {x}}})\) of \(\vec \Phi _t\) has full rank (cf. [46, Proposition 3.2.10/1]) and thus

$$\begin{aligned} \phi _* = \begin{pmatrix} 1 &{} 0 \\ \frac{\partial \vec \Phi }{\partial t} &{} \frac{\partial \vec \Phi }{\partial {{\vec {x}}}} \end{pmatrix} \end{aligned}$$
(3.14b)

has full rank. Since a smooth bijection whose differential has full rank everywhere is a diffeomorphism, \(\phi \) is a diffeomorphism. Therefore \(\mathcal Q\) is diffeomorphic to \(I \times \Omega \) and since \(\Omega _t\) is an embedded submanifold of \(\mathcal Q\), it is diffeomorphic to \(\lbrace t \rbrace \times \Omega \) under \(\phi \) and hence diffeomorphic to \(\Omega \) itself.

Since diffeomorphic manifolds are (smoothly) homotopy equivalent, \(\mathcal Q\) is homotopy equivalent to \(I \times \Omega \) and all \(\Omega _t\)s are homotopy equivalent to \(\Omega \). One can directly prove from the definition of (smooth) homotopy equivalence (see e.g. [46, Definition 4.3.5]) that for any smooth manifold \(\Omega \) and an open interval I, the product \(I \times \Omega \) is homotopy equivalent to \(\Omega \). Thus \(\mathcal Q\) is also homotopy equivalent to \(\Omega \). Since homotopy equivalent manifolds have isomorphic de Rham cohomology groups (cf. [46, Corollary 4.3.10]), their Betti numbers coincide. \(\square \)
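
The homotopy equivalence between \(I \times \Omega \) and \(\Omega \) invoked in the proof can be made explicit. A minimal sketch, with the names \(\pi \), j and H chosen here for illustration: let \(\pi :I \times \Omega \rightarrow \Omega \) be the projection onto the second factor and let \(j :\Omega \rightarrow I \times \Omega \) be given by \(j \left( {{\vec {x}}} \right) := \left( 0, {{\vec {x}}} \right) \), which is well-defined since \(0 \in I\). Then

$$\begin{aligned} \pi \circ j = \text {id}_{\Omega } , \qquad H :[0, 1] \times \left( I \times \Omega \right) \rightarrow I \times \Omega , \quad H \left( s, \left( t, {{\vec {x}}} \right) \right) := \left( s \, t, {{\vec {x}}} \right) , \end{aligned}$$

where \(s \, t \in I\) because I is an interval containing 0. The map H is smooth and satisfies \(H \left( 0, . \, \right) = j \circ \pi \) and \(H \left( 1, . \, \right) = \text {id}_{I \times \Omega }\), so \(j \circ \pi \) is smoothly homotopic to the identity. For a general open interval one may replace 0 by any fixed \(t_0 \in I\) and \(s \, t\) by \(\left( 1 - s \right) t_0 + s \, t\).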

This means that under physically reasonable assumptions, the global applicability of Theorem 3.2 is determined by the topology of the initial value hypersurface \(\Omega \). If one works in the relativistic ontology, this condition \(b_1 \left( \Omega _t \right) = 0\) for all \(t \in I\) should be replaced by \(b_1 \left( \mathcal Q \right) = 0\). In the (naive) Newtonian limit, Proposition 3.6 then states that the latter condition implies the former one. We also wish to note that, if \(\Omega \) is not connected, (3.6) prevents the components from ‘merging’—in the sense that a solution cannot be extended to later times. Physically, this means that a two-particle model is more appropriate in this situation.

If the condition \(b_1 \left( \Omega _t\right) \equiv 0\) is not satisfied (as in Remark 3.3), a global \(\varphi \) need not exist and as a consequence a global wave function cannot be constructed. Apart from stationary solutions on \(\mathcal Q = \mathbb {R}\times \Omega \) with \(b_1 \left( \Omega \right) \ne 0\), this can happen, for example, when attempting to describe the Aharonov-Bohm effect [61], or when a connected component of the domain of the initial probability density \(\rho _0 := \rho \left( 0, . \, \right) \) is not simply connected. It is unknown to us whether topological problems, as expressed in Remark 3.3, also occur in other quantum mechanical models. If not, it is possible to argue that the Madelung equations are the global version of the Schrödinger equation. This might yield additional physical solutions.

4 Relation to the Linear Operator Formalism

The Schrödinger picture of quantum mechanics is not limited to the Schrödinger equation, but also provides a set of rules for determining expectation values, standard deviations and other probabilistic quantities for physical observables like position, momentum, energy, et cetera. In this section we examine how the Schrödinger picture and the Madelung picture relate to each other. We will observe that the Madelung picture suggests modifications of the current axiomatic framework of quantum mechanics, namely the replacement of the von Neumann axioms with the axioms of standard probability theory by Kolmogorov. Again, we will restrict ourselves to the 1-particle Schrödinger theory, but our treatment has consequences for the general axiomatic framework of quantum mechanics.

The general, mathematically naive formalism of quantum mechanics states [12, §3.6]; [47, §2.1] that for every “classical observable” A there is a linear mapping (see footnote 9) \(\hat{A} :\mathcal H \rightarrow \mathcal H\) of some Hilbert space \(\mathcal H\) with inner product \(\left\langle . \text {,} . \right\rangle \), that is assumed to be hermitian/self-adjoint, in the sense that \(\forall \Psi , \Phi \in \mathcal H\) the operator \(\hat{A}\) satisfies:

$$\begin{aligned} \left\langle \Psi \text {,} \hat{A} \Phi \right\rangle = \left\langle \hat{A}\Psi \text {,} \Phi \right\rangle . \end{aligned}$$
(4.1)

The self-adjointness (4.1) assures that the eigenvalues of \(\hat{A}\), if they exist, are real. This is necessary, because the eigenvalues are taken to be the values of the observable A and these have to be real, physical quantities. Note that the time t is treated as a parameter in this formalism and both \(\hat{A}\) and \(\Psi \) may depend on it.
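
The reality of the eigenvalues claimed above follows in one line. Assuming, as in (4.2), that the inner product is antilinear in its first and linear in its second argument, let \(\hat{A} \Psi = \lambda \Psi \) with \(\Psi \ne 0\) and \(\lambda \in \mathbb {C}\) (the symbol \(\lambda \) is introduced for this sketch). Then (4.1) with \(\Phi = \Psi \) gives

$$\begin{aligned} \lambda \left\langle \Psi \text {,} \Psi \right\rangle = \left\langle \Psi \text {,} \hat{A} \Psi \right\rangle = \left\langle \hat{A} \Psi \text {,} \Psi \right\rangle = \lambda ^* \left\langle \Psi \text {,} \Psi \right\rangle , \end{aligned}$$

and hence \(\lambda = \lambda ^*\), since \(\left\langle \Psi \text {,} \Psi \right\rangle \) is positive.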

In the Schrödinger theory, \(\mathcal H\) is assumed to be a vector space consisting of functions \(\Psi \) from some open subset \(\Omega \) of \(\mathbb {R}^3\) to \(\mathbb {C}\). Moreover, it should be equipped with the \(L^2\)-inner product [62, §B]

$$\begin{aligned} \left\langle . \text {,} . \right\rangle :\quad \mathcal H \times \mathcal H \rightarrow \mathbb {C} \quad :\quad \left( \Psi , \Phi \right) \rightarrow \left\langle \Psi \text {,} \Phi \right\rangle := \int _{\Omega } \text {{d}}^3 x \, \Psi ^*\left( {{\vec {x}}} \right) \Phi \left( {{\vec {x}}} \right) \, . \end{aligned}$$
(4.2)

Hence \(\mathcal H\) ought to be a linear subspace of \(L^2 \left( \Omega , \mathbb {C} \right) \) and, to assure completeness, it ought to be closed. The fact that this formalism does not allow for the domain of \(\Psi \) to change over time can be remedied by allowing \(\Omega \) and hence \(\mathcal H\) to be time-dependent, but then one does not just have one Hilbert space, but a collection

$$\begin{aligned} \bigg \{ \mathcal H_t\, \bigg | {\, t \in I \subseteq \mathbb {R}}\bigg \} \quad \text {with} \quad \mathcal H_t \subseteq L^2 \left( \Omega _t, \mathbb {C} \right) \quad \text {for all} \quad t \in I . \end{aligned}$$
(4.3)

So the formalism of Newtonian spacetimes is also implicitly used in this approach.

The most common operators in the Schrödinger theory are the position operators \(\hat{x}^i = x^i\), the momentum operators \(\hat{p}_i = - \i \hbar \, \partial /\partial x^i\), the energy operator (see footnote 10) \(\hat{E} = \i \hbar \partial _t\) and the angular momentum operators \(\hat{L}_i = \varepsilon _{ij}{}^k \hat{x}^j \hat{p}_k\). With the exception of the position operator, these are all related to spacetime symmetries (cf. [47, §3.3]), and they are the operators that are used to heuristically construct more general ones by consideration of the “classical analogue”. The question of which other observables are admissible is in general not answered by the formalism itself, a problem which was used to justify more sophisticated quantization algorithms [6, §8.1]; [23, §5.1.2]. We leave the Galilei operators and spin operators aside in this article, as the former are better treated in the context of approximate Lorentz boosts and spin is not covered here.

The knowledge of the operators \({\hat{x}}^i\) and \({\hat{p}}_j\) is enough to show the naiveté of the Hilbert space formalism: Assuming a sufficient degree of differentiability of functions in \(\mathcal H\), we obtain the canonical commutation relation

$$\begin{aligned} \left[ \hat{x}^i , {\hat{p}}_j \right] = \i \hbar \, \delta ^i_j . \end{aligned}$$
(4.4)

It is common knowledge among mathematical physicists that this is in direct conflict with the self-adjointness of \(\hat{x}^i\) and \(\hat{p}_j\). For the sake of coherence, we state and prove the relevant assertion.

Proposition 4.1

There does not exist any Hilbert space \(\left( \mathcal H, \left\langle . \text {,} . \right\rangle \right) \) with linear maps \(\hat{x}, \hat{p} :\mathcal H \rightarrow \mathcal H\) such that the following hold:

  (i) \(\hat{x}, \hat{p}\) are self-adjoint.

  (ii) \(\hat{x}, \hat{p} \) satisfy the commutation relation

    $$\begin{aligned} \left[ \hat{x} , \hat{p} \right] = \i \hbar . \end{aligned}$$
    (4.5)

Proof

Since both \(\hat{x}\) and \(\hat{p} \) are self-adjoint, they are bounded (cf. [12, Corollary 9.9]).

From (4.5), we find that \(\hat{x}, \hat{p} \) are non-zero and thus have non-zero (operator) norms \(||\hat{x}|| , ||\hat{p}||\). Since \(\hat{x}\) is self-adjoint, it is normal and thus \(||{\hat{x}}^2|| = {||\hat{x}|| }^2\). Now one proves by induction that for all \(n \in \mathbb {N}\) we have

$$\begin{aligned} \left[ \hat{x}^n , \hat{p} \right] = \i n \hbar \, \hat{x}^{n-1} . \end{aligned}$$
(4.6a)

Taking norms and applying the triangle inequality, we get

$$\begin{aligned} n \hbar \le 2 ||\hat{x}|| ||\hat{p} || , \end{aligned}$$
(4.6b)

thus \(\hat{x}\), \(\hat{p}\) or both, are unbounded. This is a contradiction. \(\square \)
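
The two steps that the proof leaves to the reader can be sketched as follows. For the induction, note that the case \(n = 1\) of (4.6a) is (4.5). If (4.6a) holds for some \(n \in \mathbb {N}\), then

$$\begin{aligned} \left[ \hat{x}^{n+1} , \hat{p} \right] = \hat{x} \left[ \hat{x}^n , \hat{p} \right] + \left[ \hat{x} , \hat{p} \right] \hat{x}^n = \i n \hbar \, \hat{x}^n + \i \hbar \, \hat{x}^n = \i \left( n+1 \right) \hbar \, \hat{x}^n . \end{aligned}$$

For the estimate, we use that \(||\hat{x}^k|| = ||\hat{x}||^k\) for all \(k \in \mathbb {N}\), which holds for bounded self-adjoint operators and extends the identity for \(k = 2\) quoted in the proof. Taking norms in (4.6a) then yields

$$\begin{aligned} n \hbar \, ||\hat{x}||^{n-1} = ||\left[ \hat{x}^n , \hat{p} \right] || \le ||\hat{x}^n \hat{p}|| + ||\hat{p} \, \hat{x}^n|| \le 2 \, ||\hat{x}||^{n} \, ||\hat{p}|| , \end{aligned}$$

and dividing by \(||\hat{x}||^{n-1} \ne 0\) gives (4.6b). Since n is arbitrary, the right-hand side of (4.6b) cannot be finite.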

Therefore, even within the application of the 1-particle Schrödinger theory, the Hilbert space formalism is inadequate. We refer to the book by Hall [12] for alternative descriptions.

Still, we would like to have a closer look at the expectation values of \(\hat{x}^i\), \(\hat{p}_j\), \(\hat{L}_k\) and \(\hat{E}\) in the context of the Madelung picture. Indeed, we will find that the Madelung picture gives a natural explanation for why the operators yield the physically correct expectation values—within Kolmogorovian probability theory (cf. [63, 64]). In addition, the Madelung picture offers a natural, more intuitive formalism and, by making the analogy to Newtonian mechanics explicit, shows directly which observables are ‘physical’.

In order to show this, we need to make some assumptions on the ‘regularity’ of the involved functions and spaces: So given a Newtonian spacetime \(\left( \mathcal Q, \text {{d}}\tau , \delta , \mathcal O \right) \), we would like the operators \(\hat{x}^i\), \(\hat{p}_j\), \(\hat{L}_k\) and \(\hat{E}\) to be well-defined and satisfy

$$\begin{aligned} \left\langle \Psi _t \text {,} \hat{A} \Psi _t \right\rangle = \left\langle \hat{A} \Psi _t \text {,} \Psi _t \right\rangle \end{aligned}$$
(4.7)

for each \(t \in I\) and all ‘wave functions’

$$\begin{aligned} \Psi :\mathcal Q \rightarrow \, \mathbb {C} :\left( t, {{\vec {x}}} \right) \rightarrow \Psi _t \left( {{\vec {x}}} \right) . \end{aligned}$$
(4.8)

Observe that (4.7) only makes sense for \(\hat{A} = \hat{E}\), if we choose a potential \(V:\mathcal Q \rightarrow \mathbb {R}\) and, for given t, interpret \(\hat{E}\) as the Schrödinger operator

$$\begin{aligned} - \frac{\hbar ^2}{2m} \Delta + V \left( t, . \right) . \end{aligned}$$
(4.9)

Hence, from the Cauchy–Schwarz inequality and integration by parts, we conclude that \(\Psi \) needs to be an element of the space (see footnote 11)

$$\begin{aligned} \begin{array}{ll} \mathcal W \left( \mathcal Q, \mathbb {C} \right) := \biggl \lbrace \Psi \in C^2 \left( \mathcal Q, \mathbb {C} \right) \biggm \vert \forall \, i \in \lbrace 1, 2, 3\rbrace \, \forall t \in I :\, \, \Psi _t, x^i \Psi _t , \, \frac{\partial \Psi _t}{\partial x^i} , \, \frac{\partial \Psi _t}{\partial t} \\ \text {lie in} \, \, L^2 \left( \Omega _t, \mathbb {C} \right) \, \, \text {and} \, \Psi _t \, \, \text {vanishes on the boundary} \, \, \partial \Omega _t \, \, \text {in} \, \, \mathbb {R}^3 \biggr \rbrace , \end{array} \end{aligned}$$
(4.10)

needs to satisfy the Schrödinger equation and that \(\Omega _t\) needs to split into a product of three open intervals (up to a set of measure zero). The necessity of this unnatural assumption on \(\Omega _t\) may be considered another indicator that the standard formulation of quantum mechanics is problematic. Commonly, one makes the implicit, stronger assumptions that each \(\Omega _t\) is \(\mathbb {R}^3\) (up to a set of measure zero), that \(\Psi \) is a smooth solution of the Schrödinger equation with \(\frac{\partial \Psi _t}{\partial t} \in L^2 \left( \Omega _t, \mathbb {C} \right) \), and that for each \(t \in I\) the function \(\Psi _t\) is an element of the space of (\(\mathbb {C}\)-valued) Schwartz functions

$$\begin{aligned} \mathcal S\left( \Omega _t, \mathbb {C} \right) := \bigl \lbrace \Psi _t :\Omega _t \rightarrow \mathbb {C} \bigm \vert \forall \,\text {multi-indices}\, \alpha , \beta :\left( x^\alpha \, \partial ^\beta \, \Psi _t \right) \in L^2 \left( \Omega _t, \mathbb {C} \right) \bigr \rbrace \end{aligned}$$
(4.11)

on \(\Omega _t\) (cf. [62]; [65, §5.1.3 & §6.2]). For convenience, we choose the stronger assumptions on \(\Psi \) and its domain in the following.

In order to relate everything to the Madelung picture, assume we are also given functions \(\varphi \in C^\infty \left( \mathcal Q, \mathbb {R}\right) \) and \(R \in C^\infty \left( \mathcal Q, [0, \infty ) \right) \) such that \(\Psi = R e^{- \i \varphi }\). Since \(\Psi _t\) is \(L^2\)-integrable for every t, we may normalize it to have unit \(L^2\)-norm. Then \(\rho := R^2\), pulled back to \(\Omega _t\), satisfies the mathematical axioms of a probability density (with respect to \(\text {{d}}^3 x\)).

In the first instance, we may consider the position operators:

$$\begin{aligned} \left\langle \Psi _t \text {,} \hat{x}^i \Psi _t \right\rangle = \int _{\Omega _t} \Psi ^*_t \left( {{\vec {x}}} \right) \, \left( \hat{x}^i \Psi _t \right) \left( {{\vec {x}}} \right) \, \text {{d}}^3 x = \int _{\Omega _t} x^i \, \rho \left( t, {{\vec {x}}} \right) \text {{d}}^3 x = \mathbb E \left( t, x^i \right) . \end{aligned}$$
(4.12)

So we get the expectation value \(\mathbb E \left( t, . \right) \) of the ith coordinate function with respect to the probability density \(\rho (t, . )\) on \(\Omega _t\). Then

$$\begin{aligned} \mathbb E \left( t, {{\vec {x}}} \right) := \left( \mathbb E \left( t, x^1 \right) ,\mathbb E \left( t, x^2 \right) ,\mathbb E \left( t, x^3 \right) \right) \in \mathbb {R}^3 \end{aligned}$$
(4.13)

gives the mean position of the particle at time t in \(\Omega _t\). Moreover, in Kolmogorovian probability theory, we can ask for the expectation value of the position \(\mathbb E \left( t, {{\vec {x}}}, U_t \right) \) on any other Borel set \(U_t \in \mathcal B \left( \Omega _t \right) \), its standard deviation et cetera. This can also be done in the ‘von Neumann philosophy’ by multiplication with the indicator function

$$\begin{aligned} \chi _{U_t} :\Omega _t \rightarrow \mathbb {C} :{{\vec {x}}} \rightarrow \chi _{U_t} \left( {{\vec {x}}}\right) := {\left\{ \begin{array}{ll} 1 &{}, {{\vec {x}}} \in {U_t} \\ 0 &{}, \text {else} \end{array}\right. } , \end{aligned}$$
(4.14)

but the product \(\chi _{U_t} \, \Psi _t\) is usually not differentiable.

Second, consider the momentum operators: By (4.7) for \(\hat{A} = \hat{p}_i \), we have

$$\begin{aligned} \left\langle \Psi _t \text {,} \hat{p} _i \Psi _t \right\rangle = \mathfrak {R}\left\langle \Psi _t \text {,} \hat{p} _i \Psi _t \right\rangle \, \end{aligned}$$
(4.15)

and thus

$$\begin{aligned} \left\langle \Psi _t \text {,} \hat{p} _i \Psi _t \right\rangle&= - \i \hbar \int _{\Omega _t} \Psi ^*_t \left( {{\vec {x}}} \right) \, \frac{\partial \Psi _t}{\partial x^i} \left( {{\vec {x}}} \right) \, \text {{d}}^3 x \end{aligned}$$
(4.16)
$$\begin{aligned}&= - \i \hbar \int _{\Omega _t} \Psi ^*_t \left( {{\vec {x}}} \right) \, \left( \frac{\partial R}{\partial x^i} \left( t, {{\vec {x}}} \right) e^{-\i \varphi \left( t, {{\vec {x}}} \right) } - \i \frac{\partial \varphi }{\partial x^i} \left( t, {{\vec {x}}} \right) \, \Psi _t \left( {{\vec {x}}} \right) \right) \, \text {{d}}^3 x \end{aligned}$$
(4.17)
$$\begin{aligned}&=\int _{\Omega _t} -\hbar \frac{\partial \varphi }{\partial x^i} \left( t, {{\vec {x}}} \right) \, \rho \left( t, {{\vec {x}}} \right) \, \text {{d}}^3 x \end{aligned}$$
(4.18)
$$\begin{aligned}&= \mathbb E \left( t, m X^i \right) , \end{aligned}$$
(4.19)

using (3.6e). Therefore, if we are willing to interpret \(m X^i\) as the random variable for the ith component of the momentum, \(\left\langle \Psi _t \text {,} \hat{p} _i \Psi _t \right\rangle \) yields its expectation value. This interpretation is indeed a result of the correspondence principle (see Sect. 5). In the linear operator formalism, however, we cannot simply replace the domain of the integral by some \(U_t \in \mathcal B \left( \Omega _t \right) \), since \(\hat{p}_i\) will no longer be interpretable as a momentum operator: The quantity

$$\begin{aligned} \int _{U_t} \text {{d}}^3 x \, \, \Psi _t^*\left( {{\vec {x}}} \right) \, \left( \hat{p}_i \, \Psi _t \right) \left( {{\vec {x}}} \right) \end{aligned}$$
(4.20)

is usually not real. Hence the expectation value of the ith momentum in the region \(U_t\) is not clearly defined in the ‘von Neumann philosophy’. By contrast, in Kolmogorovian probability theory we only need to compute

$$\begin{aligned} \mathbb E \left( t, m X^i, U_t \right) := \int _{U_t} m X^i \left( t, {{\vec {x}}} \right) \, \rho \left( t, {{\vec {x}}} \right) \text {{d}}^3 x \end{aligned}$$
(4.21)

to get the expectation value.
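
The passage from (4.17) to (4.18), as well as the failure of (4.20) to be real, can be read off from the integrand. As a sketch, a short computation with \(\Psi = R e^{-\i \varphi }\) and \(\rho = R^2\) gives

$$\begin{aligned} \Psi _t^* \left( \hat{p}_i \Psi _t \right) = - \i \hbar \, R \frac{\partial R}{\partial x^i} - \hbar \frac{\partial \varphi }{\partial x^i} \, R^2 = - \hbar \frac{\partial \varphi }{\partial x^i} \, \rho - \frac{\i \hbar }{2} \frac{\partial \rho }{\partial x^i} . \end{aligned}$$

The first term is real and, by the relation \(m X^i = - \hbar \, \partial \varphi / \partial x^i\) underlying (4.18) and (4.19), equals \(m X^i \rho \). The second term is purely imaginary. Integrated over \(\Omega _t\) it vanishes under the boundary and decay conditions assumed for \(\Psi _t\), whereas its integral over a general Borel set \(U_t\) need not vanish. This is why (4.20) is usually not real, while (4.21) always is.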

Concerning the energy operator, we need to exclude the zeros of \(\Psi \) from \(\mathcal Q\) to define the energy E via

$$\begin{aligned} E := \frac{m}{2} {\vec {X}} ^2 + V - \frac{\hbar ^2}{2m} \frac{\Delta \sqrt{\rho }}{\sqrt{\rho }} . \end{aligned}$$
(4.22)

Note that \(L^2 \left( \Omega _t, \mathbb {C} \setminus \lbrace 0 \rbrace \right) \) is not a vector space, so ‘superposition’ of wave functions requires a formal change of domain. By (4.7) for \(\hat{A} = \hat{E}\) and the Bernoulli-Madelung equation (3.6g), an argument analogous to the one for the momentum operators indeed yields

$$\begin{aligned} \left\langle \Psi _t \text {,} \hat{E} \Psi _t \right\rangle&= \i \hbar \int _{\Omega _t} \Psi ^*_t \left( {{\vec {x}}} \right) \, \frac{\partial \Psi _t}{\partial t} \left( {{\vec {x}}} \right) \, \text {{d}}^3 x \end{aligned}$$
(4.23)
$$\begin{aligned}&=\int _{\Omega _t} \hbar \frac{\partial \varphi }{\partial t} \left( t, {{\vec {x}}} \right) \, \rho \left( t, {{\vec {x}}} \right) \, \text {{d}}^3 x \end{aligned}$$
(4.24)
$$\begin{aligned}&=\int _{\Omega _t} E \left( t, {{\vec {x}}} \right) \, \rho \left( t, {{\vec {x}}} \right) \, \text {{d}}^3 x \end{aligned}$$
(4.25)
$$\begin{aligned}&= \mathbb E \left( t, E \right) . \end{aligned}$$
(4.26)

For the angular momentum operators the previous arguments can be repeated and one also finds the correct expectation value (cf. [44, §3.8.2]).

Therefore, our treatment not only suggests the inadequacy of the von Neumann approach, but also leads us to the following postulate.

Postulate 1

Quantum theory is correctly axiomatized by Kolmogorovian probability theory.

Clearly, the statement is in potential conflict with von Neumann’s projection postulate. Historically, von Neumann laid the mathematical foundations of modern quantum mechanics in 1932 [20], while Kolmogorov’s axiomatization of modern probability theory [63] was published in 1933. Therefore, at the time von Neumann knew neither of Kolmogorov’s work nor that certain formulations of the Born rule [30] were in potential conflict with it. The view commonly taken today with respect to this issue [66, 67] is that quantum theory employs a more general notion of probability: There is a non-commutative probability theory and the Kolmogorovian approach is the particular, commutative case. However, apart from quantum theory, we are not aware of any applications of said generalization. Taking the historical context into account, it appears plausible that the projection postulate is wrong. In fact, the current view implicitly suggests that Kolmogorov has failed in axiomatizing probability theory in its most general framework, a statement we find troublesome.

Let us consider a specific example to make the potential conflict between the two approaches more explicit: If we ask for the probability of the particle in the state \(\left( \rho , X \right) \) to have an energy \(E \left( t, . \right) \) in the range \(J \in \mathcal B \left( \mathbb {R}\right) \) at time \(t \in I\), then, following Kolmogorov, this is given by

$$\begin{aligned} \int _{U_t} \iota ^*_t\rho \, \text {{d}}^3 x , \, U_t := \left( E \left( t, . \right) \right) ^{-1} \left( J \right) \subseteq {\Omega _t} . \end{aligned}$$
(4.27)

In contrast, if \(\hat{E}\), considered as a linear map from a linear subspace of \(\mathcal S\left( \Omega _t, \mathbb {C} \right) \) to \(L^2 \left( \Omega _t, \mathbb {C} \right) \), has a point spectrum \(\bigg \{ E_n \in \mathbb {R}\big | {n \in \mathbb {N}} \bigg \}\) (cf. [12, §9.4]) with mutually orthonormal eigenvectors \(\left( \Phi _{n,t}\right) _{n \in \mathbb {N}}\), then, following von Neumann, the same probability is given by

$$\begin{aligned} \sum _{n \in \mathbb {N}, E_n \in J} |\left\langle \Phi _{n,t} \text {,} \Psi _t \right\rangle |^2 . \end{aligned}$$
(4.28)

It is clear that for \(\Psi _t= \Phi _{n,t}\) both expressions yield either 1 or 0 for \(E_n \in J\) or \(E_n \notin J\), respectively. For more general states the two do not appear to coincide, but this statement requires a proof (in terms of a counterexample) and maybe there exists an approximation. If the expressions (4.27) and (4.28) differ, it is an empirical question which one of the two, if any, is correct. As a difference can only arise for time-dependent states, one should postpone this question until the corresponding relativistic theory has been laid out (see Remark 3.4).

5 Interpreting the Madelung Equations

As claimed previously, the Madelung equations are easier to interpret than the Schrödinger equation and it is the aim of this section to convince the reader of the truth of this statement. We first give a probabilistic, mathematical interpretation in Sect. 5.1 and then proceed with a more speculative discussion in Sect. 5.2.

5.1 Mathematical Interpretation

Contrary to Madelung’s interpretation of \(\rho \) as a mass density [28], quantum mechanics is now widely acknowledged to be a probabilistic theory with \(\rho \) being the probability density for finding the particle within a certain region of space. This is referred to as the Born interpretation or ensemble interpretation, named after Max Born [30]. For a discussion on why other interpretations are not admissible, we refer to [47, §4.2] and, of course, Born’s original article [30]. Taking this point of view, it is potentially fallacious to assume that X describes the actual velocity of the particle, as this appears to oppose the probabilistic nature of the theory. However, we can interpret \(\vec j := \rho {\vec {X}} \) as the probability current density, since then the continuity equation (3.6b) reads

$$\begin{aligned} 0 = \frac{\partial \rho }{\partial t} + \nabla \cdot \vec j . \end{aligned}$$
(5.1)

The physical meaning of this equation becomes more apparent when it is formulated in the language of integrals: Let \(N= N_0 \subseteq \Omega _0\) be an open set, \(\Phi \) be the flow of X and assume \(N_t := \vec \Phi _t \left( N\right) \subseteq \Omega _t\) exists for each \(t \in I\). This is, for instance, the case in the situation of Proposition 3.6. Then the Reynolds transport theorem [54, §6.3] implies that for such an N moving along the flow

$$\begin{aligned} \frac{\partial }{\partial t} \int _{N_t} \iota ^*_t\rho \, \text {{d}}^3 x = 0 \, \end{aligned}$$
(5.2)

for all times \(t \in I\)—provided \(\iota ^*_t \rho \) is integrable on \(\Omega _t\) for all \(t \in I\).Footnote 12 Integrability is assured by the fact that \(\iota ^*_t\rho \) is a probability density for all \(t \in I\) and, following the discussion in Sect. 4, one may even assume that it is Schwartz, i.e.

$$\begin{aligned} \iota ^*_t \rho \in \mathcal S \left( \Omega _t, \mathbb {R}_+ \right) \quad \quad \forall t \in I . \end{aligned}$$
(5.3)

By Gauß’ divergence theorem we have

$$\begin{aligned} \int _{\partial N_t} \vec j_{\iota _t} \cdot \text {{d}}{\vec {A}}_t = - \int _{N_t} \iota ^*_t\frac{\partial \rho }{\partial t} \, \, \text {{d}}^3 x . \end{aligned}$$
(5.4)

Equation (5.2) states that the probability that the particle is found within N stays conserved if N moves along the flow of X. Equation (5.4) states that the rate at which the probability of finding the particle in the momentarily fixed region \(N_t\) decreases equals the flux of the probability current \(\vec j\) through its boundary \(\partial N_t\).

We conclude that the primary importance of the drift field lies in the fact that its flow describes the probabilistic propagation of the system. If, for example, we take N to be a “small” region with \(95 \%\) chance of finding the particle and we let this region “propagate” along the drift flow, then this probability will not change over time. However, it might happen that the volume of N increases or decreases. Under appropriate assumptions on convergence, the change of volume of N is given by

$$\begin{aligned} \frac{\partial }{\partial t} \int _{N_t} \text {{d}}^3 x = \int _{N_t} \bigl ( \nabla \cdot {\vec {X}}\bigr )_{\iota _t} \, \text {{d}}^3x , \end{aligned}$$
(5.5)

again by the Reynolds transport theorem. Therefore, the divergence of the spacelike component of the drift field is a measure of how N spreads or shrinks with time. Moreover, “holes” in \(\Omega \), appearing for instance due to the vanishing of \(\rho \), can also be viewed as propagating with time (see Proposition 3.6 and [34, p. 11]), due to the fact that the ‘spacelike part’ of the drift flow \(\vec \Phi _t\) is a diffeomorphism for each time t. The situation is schematically depicted in Fig. 1.
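The statements (5.2) and (5.5) can be checked on an explicit example. The sketch below assumes the freely spreading Gaussian packet in one spatial dimension (\(V = 0\), units with \(\hbar = m = 1\)), for which \(\rho \left( t, . \right) \) is a Gaussian of width \(\sigma (t) = \sigma _0 \sqrt{1 + \left( \hbar t / (2 m \sigma _0^2) \right) ^2}\) centered at \(v_0 t\) and the drift velocity is \(v_0 + (x - v_0 t) \, \dot{\sigma }(t) / \sigma (t)\). This standard solution is brought in purely for illustration, and all names in the code are hypothetical.

```python
import math

# Assumptions (not from the text): 1D, V = 0, natural units.
HBAR = 1.0
MASS = 1.0
SIGMA0 = 1.0  # initial standard deviation of rho
V0 = 0.5      # initial constant drift velocity


def sigma(t):
    """Standard deviation of rho(t, .) for the free Gaussian packet."""
    return SIGMA0 * math.sqrt(1.0 + (HBAR * t / (2.0 * MASS * SIGMA0**2)) ** 2)


def sigma_dot(t):
    """Time derivative of sigma."""
    return HBAR**2 * t / (4.0 * MASS**2 * SIGMA0**2 * sigma(t))


def flow(t, x0):
    """Spatial drift flow: position at time t of the point that starts at x0."""
    return V0 * t + x0 * sigma(t) / SIGMA0


def div_x(t):
    """Divergence of the drift velocity V0 + (x - V0 t) sigma'/sigma (constant in x)."""
    return sigma_dot(t) / sigma(t)


def probability(t, a, b):
    """Integral of rho(t, .) over [a, b], via the Gaussian error function."""
    s = math.sqrt(2.0) * sigma(t)
    c = V0 * t
    return 0.5 * (math.erf((b - c) / s) - math.erf((a - c) / s))


def region(t, a0, b0):
    """Image N_t of the initial interval N = [a0, b0] under the flow."""
    return flow(t, a0), flow(t, b0)


def volume(t, a0, b0):
    """Length of N_t."""
    a, b = region(t, a0, b0)
    return b - a


def volume_rate(t, a0, b0, h=1e-5):
    """Left-hand side of (5.5): d/dt of the volume, by central differences."""
    return (volume(t + h, a0, b0) - volume(t - h, a0, b0)) / (2.0 * h)


def divergence_integral(t, a0, b0):
    """Right-hand side of (5.5): integral of div X over N_t."""
    return div_x(t) * volume(t, a0, b0)


if __name__ == "__main__":
    for t in (0.0, 1.0, 2.0):
        a, b = region(t, -1.0, 0.5)
        print(t, probability(t, a, b), volume(t, -1.0, 0.5))
```

In this example the probability of the transported interval stays constant in time, whereas its length grows in proportion to \(\sigma (t)\), in accordance with (5.2) and (5.5).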

Remark 5.1

(Particle Structure) In the Madelung picture particles are treated as (approximately) point-like, since the support of \(\iota ^*_t\rho \) can be made arbitrarily small. In this context, we would also like to remark that, if the initial probability density is given by a Gaußian with standard deviation \(\sigma \in \mathbb {R}_+\) and the initial drift field is constant, then solving the Madelung equations and taking the limit \(\sigma \rightarrow 0\) might make it possible to assign trajectories, energies, etc. to individual particles.

Fig. 1

The region N, where the particle is located with initial probability \( \mathbb P \left( 0, N\right) = \int _N \iota _0^* \rho \, \text {{d}}^3x\), propagates along the flow \(\vec \Phi \) in space. After time \(t>0\), the region has been transformed to \(N_t = \vec \Phi _t \left( N\right) \). The probability to find the particle, as well as the type and number of holes within the region, stays conserved, but the region may be distorted, shrunk or expanded.

Yet this discussion does not fully answer the question of how the drift field itself is to be interpreted and practically determined. The following result, central to the resolution of this question, was conjectured by Christof Tinnes (TU Berlin) and a weaker version had already been discovered by Ehrenfest [68].

Theorem 5.2

(Expectation Value of the Drift Field) Let \(\left( \mathcal Q , \text {{d}}\tau , \delta , \mathcal O \right) \) be a Newtonian spacetime, let \(\Omega _t\), I be defined as in (2.25), \(X \in \mathfrak X _{Nt} \left( \mathcal Q \right) \) be a Newtonian observer vector field with flow \(\Phi \), \(\rho \in C^\infty \left( \mathcal Q , \mathbb {R}_+ \cup \lbrace 0 \rbrace \right) \) a non-negative, real function such that \(\iota _t^* \rho \) is Schwartz and a probability density for all \(t \in I\), and assume the continuity equation (3.6b) holds. Define for all \(t \in I\), \(U_t \in \mathcal B \left( \Omega _t \right) \) and \(f \in C^\infty \left( \mathcal Q, \mathbb {R}\right) \) the expectation value of f at time t over \(U_t\):

$$\begin{aligned} \mathbb E \left( t, f, U_t \right) := \int _{U_t} \iota ^*_t f \, \, \iota ^*_t\rho \, \text {{d}}^3 x . \end{aligned}$$
(5.6a)

Then for every \(N \in \mathcal B \left( \Omega _0 \right) \), such that the functions

$$\begin{aligned} N \rightarrow \mathbb {R}:{{\vec {x}}} \rightarrow \sup _{t \in I} \left| X^i \left( \Phi _t \left( 0, {{\vec {x}}} \right) \right) \, \rho \left( \Phi _t \left( 0, {{\vec {x}}} \right) \right) \, \det \biggl ( \biggl (\frac{\partial \vec \Phi _t}{\partial {{\vec {x}}}} \biggr ) \left( {{\vec {x}}} \right) \biggr ) \right| \end{aligned}$$
(5.6b)

are bounded and integrable for each \(i \in \lbrace 1, 2, 3 \rbrace \), and every \(t \in I\) s.t. \(N_t := \vec \Phi _t \left( N \right) \) exists, we have

$$\begin{aligned} \mathbb E \left( t, {\vec {X}}, N_t \right) = \frac{\text {{d}}}{\text {{d}}t} \mathbb E \left( t, {{\vec {x}}}, N_t \right) . \end{aligned}$$
(5.6c)

Here we defined

$$\begin{aligned} \mathbb E \left( t, {\vec {X}}, U \right) := \left( \mathbb E \left( t, X^1, U \right) ,\mathbb E \left( t, X^2, U \right) , \mathbb E \left( t, X^3, U \right) \right) \in \mathbb {R}^3 . \end{aligned}$$
(5.6d)

Proof

The theorem is a corollary of the Reynolds transport theorem, formulated as in [29, p. 10]. Since \(\vec \Phi _t\), if defined, is a homeomorphism onto its image, the respective topologies coincide, and hence N is a Borel set if and only if \(N_t\) is a Borel set. We now note that for \(f \in C^\infty \left( \mathcal Q, \mathbb {R}\right) \) with

$$\begin{aligned} N \rightarrow \mathbb {R}:{{\vec {x}}} \rightarrow \sup _{t \in I} \left| X_{\Phi _t \left( 0, {{\vec {x}}} \right) } \left( f \right) \, \rho \left( \Phi _t \left( 0, {{\vec {x}}} \right) \right) \, \det \biggl ( \biggl (\frac{\partial \vec \Phi _t}{\partial {{\vec {x}}}} \biggr ) \left( {{\vec {x}}} \right) \biggr )\right| \end{aligned}$$
(5.7a)

bounded and integrable, and assuming convergence of the respective integrals, the continuity equation (3.6b) implies

$$\begin{aligned} \frac{\text {{d}}}{\text {{d}}t} \int _{N_t} \iota _t^* \left( f \rho \right) \, \text {{d}}^3 x&= \int _{N_t} \left( \frac{\partial \left( f \rho \right) }{\partial t} + \nabla \cdot \left( f \rho {\vec {X}} \right) \right) _{\iota _t} \text {{d}}^3 x \end{aligned}$$
(5.7b)
$$\begin{aligned}&= \int _{N_t} \left( \frac{\partial f}{\partial t} + \nabla f \cdot {\vec {X}} \right) _{\iota _t} \, \iota _t^*\rho \, \text {{d}}^3 x \end{aligned}$$
(5.7c)
$$\begin{aligned}&= \int _{N_t} X_{\iota _t} \left( f \right) \, \iota _t^*\rho \, \text {{d}}^3 x \, . \end{aligned}$$
(5.7d)

Now set \(f= x^i\) with \(i \in \lbrace 1,2, 3\rbrace \), observe that \(\iota ^*_t \left( x^i \rho \right) \) is integrable and due to \(X\left( x^i \right) = X^i\) the result follows. \(\square \)
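As a numerical sanity check of (5.6c), the sketch below compares both sides for a transported interval. It assumes the freely spreading Gaussian packet in one spatial dimension (\(V = 0\), units with \(\hbar = m = 1\)), with \(\rho \left( t, . \right) \) a Gaussian of width \(\sigma (t)\) centered at \(v_0 t\) and drift velocity \(v_0 + (x - v_0 t) \, \dot{\sigma }(t) / \sigma (t)\). The example, the quadrature and all names are illustrative assumptions and not part of the proof.

```python
import math

# Assumptions (not from the text): 1D, V = 0, natural units.
HBAR = 1.0
MASS = 1.0
SIGMA0 = 1.0
V0 = 0.5


def sigma(t):
    return SIGMA0 * math.sqrt(1.0 + (HBAR * t / (2.0 * MASS * SIGMA0**2)) ** 2)


def sigma_dot(t):
    return HBAR**2 * t / (4.0 * MASS**2 * SIGMA0**2 * sigma(t))


def flow(t, x0):
    """Position at time t of the point that starts at x0."""
    return V0 * t + x0 * sigma(t) / SIGMA0


def rho(t, x):
    """Probability density of the free Gaussian packet."""
    s = sigma(t)
    return math.exp(-((x - V0 * t) ** 2) / (2.0 * s * s)) / (s * math.sqrt(2.0 * math.pi))


def drift(t, x):
    """Spatial component of the drift field."""
    return V0 + (x - V0 * t) * sigma_dot(t) / sigma(t)


def position(t, x):
    return x


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0


def expectation(t, f, a0, b0):
    """E(t, f, N_t) as in (5.6a), for N = [a0, b0] transported along the flow."""
    a, b = flow(t, a0), flow(t, b0)
    return simpson(lambda x: f(t, x) * rho(t, x), a, b)


def mean_drift(t, a0, b0):
    """Left-hand side of (5.6c)."""
    return expectation(t, drift, a0, b0)


def mean_position_rate(t, a0, b0, h=1e-4):
    """Right-hand side of (5.6c), by central differences in t."""
    upper = expectation(t + h, position, a0, b0)
    lower = expectation(t - h, position, a0, b0)
    return (upper - lower) / (2.0 * h)


if __name__ == "__main__":
    for t in (0.5, 1.0, 2.0):
        print(t, mean_drift(t, -1.0, 0.5), mean_position_rate(t, -1.0, 0.5))
```

Note that the region of integration moves with the flow in both expressions; for a region held fixed in space the two sides would in general differ by a boundary term.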

Equation (5.6c) roughly means that the expectation value of the drift field in some region N moving along its flow is given by the velocity of the expectation value of the position in \(N_t\). Moreover, Theorem 5.2 can be used to find an even more direct interpretation of the drift field.

Corollary 5.3

(Interpretation of the Drift Field) Let \(\left( \mathcal Q , \text {{d}}\tau , \delta , \mathcal O \right) \) be a Newtonian spacetime, let \(\Omega _t\), I be defined as in (2.25), \(X \in \mathfrak X _{Nt} \left( \mathcal Q \right) \) be a Newtonian observer vector field with flow \(\Phi \), \(\rho \in C^\infty \left( \mathcal Q , \mathbb {R}_+ \right) \) a strictly positive, real function such that \(\iota _t^* \rho \) is Schwartz and a probability density for all \(t \in I\), and assume the continuity equation (3.6b) holds.

Define \(\mathbb E\) as in Theorem 5.2 and for \(t \in I\), \(U_t \in \mathcal B \left( \Omega _t \right) \) let \(\mathbb P \left( t, U_t \right) := \mathbb E \left( t, 1, U_t \right) \) be the probability of \(U_t\). Further, for \(\epsilon \in \mathbb {R}_+\), \({{\vec {y}}} \in \Omega := \Omega _0\) define

$$\begin{aligned} N^\epsilon \left( {{\vec {y}}} \right) := \bigg \{ {{\vec {x}}} \in \Omega \bigg | { {{\mathrm{dist}}}\left( {{\vec {x}}} , {{\vec {y}}} \right) < \epsilon } \bigg \} \end{aligned}$$
(5.8a)

with \({{\mathrm{dist}}}\) denoting the Riemannian distance on \(\left( \Omega , \iota ^*_0 \delta \right) \).

Then for every \(\epsilon \in \mathbb {R}_+\), \({{\vec {y}}} \in \Omega \) and every \(t \in \mathbb {R}\) such that \(N_t^\epsilon \left( {{\vec {y}}} \right) := \vec \Phi _t \left( N^\epsilon \left( {{\vec {y}}} \right) \right) \subseteq \Omega _t\) exists and the functions (5.6b) for \(N = N^\epsilon \left( {{\vec {y}}} \right) \), \(i \in \lbrace 1, 2, 3\rbrace \) are bounded and integrable, we have

$$\begin{aligned} {\vec {X}}_{\Phi _t\left( 0,{{\vec {y}}} \right) } = \lim _{\epsilon \rightarrow 0} \frac{\frac{\text {{d}}}{\text {{d}}t} \mathbb E \left( t, {{\vec {x}}}, N_t^\epsilon \left( {{\vec {y}}} \right) \right) }{\mathbb P \left( t, N_t^\epsilon \left( {{\vec {y}}} \right) \right) } . \end{aligned}$$
(5.8b)

Proof

By assumption \(N_t^\epsilon \left( {{\vec {y}}} \right) \) is defined, thus the restriction \(\vec \xi _t\) of \(\vec \Phi _t\) to the open submanifold \(N^\epsilon \left( {{\vec {y}}} \right) \) and its image is also defined. Since \(N^\epsilon \left( {{\vec {y}}} \right) \) is open in \(\Omega \), it is Borel-measurable. As \(N_t^\epsilon \left( {{\vec {y}}} \right) \) is also open, non-empty and \(\rho >0\), it follows \(\mathbb P \left( t, N_t^\epsilon \left( {{\vec {y}}} \right) \right) > 0\). By Theorem 5.2 we have

$$\begin{aligned} \frac{\frac{\text {{d}}}{\text {{d}}t} \mathbb E \left( t, {{\vec {x}}}, N_t^\epsilon \left( {{\vec {y}}} \right) \right) }{\mathbb P \left( t, N_t^\epsilon \left( {{\vec {y}}} \right) \right) } = \frac{ \mathbb E \left( t, {\vec {X}}, N_t^\epsilon \left( {{\vec {y}}} \right) \right) }{\mathbb P \left( t, N_t^\epsilon \left( {{\vec {y}}} \right) \right) } = \frac{ \int _{ N_t^\epsilon \left( {{\vec {y}}} \right) } \iota ^*_t \left( X^i \, \rho \right) \, \text {{d}}^3 x }{ \int _{ N_t^\epsilon \left( {{\vec {y}}} \right) } \iota ^*_t\rho \, \text {{d}}^3 x } \, e_i , \end{aligned}$$
(5.9a)

where \(e_i\) is the coordinate basis vector for \(i \in \lbrace 1,2,3\rbrace \). For every \(\epsilon ' \in \mathbb {R}\) with \(0 < \epsilon ' \le \epsilon \) the point \(\vec \Phi _t\left( {{\vec {y}}} \right) \) is in \(N_t^{\epsilon '} \left( {{\vec {y}}} \right) \) by definition. Moreover, the diameter of \(N_t^{\epsilon '} \left( {{\vec {y}}} \right) \) tends to zero as \(\epsilon ' \rightarrow 0\) due to continuity of \(\vec \xi _t\). Considering \(\iota ^*_t\rho \, \text {{d}}^3 x\) as a volume form and applying [69, §8.4, Lemma 1] yields

$$\begin{aligned} X^i \circ \Phi _t\left( 0,{{\vec {y}}} \right) \, e_i = \lim _{\epsilon \rightarrow 0} \frac{ \int _{ N_t^\epsilon \left( {{\vec {y}}} \right) } \iota ^*_t\left( X^i \, \rho \right) \, \text {{d}}^3 x }{ \int _{ N_t^\epsilon \left( {{\vec {y}}} \right) } \iota ^*_t\rho \, \text {{d}}^3 x } \, e_i . \end{aligned}$$
(5.9b)

Identifying \(e_i\) with \(\partial _i\), such that we may write \({\vec {X}} = X^i e_i\), completes the proof. \(\square \)
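The limit (5.8b) can be observed numerically. The sketch below again assumes the one-dimensional, freely spreading Gaussian packet (\(V = 0\), units with \(\hbar = m = 1\)) purely for illustration, takes \(N^\epsilon (y) = (y - \epsilon , y + \epsilon )\), and evaluates the quotient in (5.8b) for decreasing \(\epsilon \). All names and parameter values are hypothetical.

```python
import math

# Assumptions (not from the text): 1D, V = 0, natural units.
HBAR = 1.0
MASS = 1.0
SIGMA0 = 1.0
V0 = 0.5


def sigma(t):
    return SIGMA0 * math.sqrt(1.0 + (HBAR * t / (2.0 * MASS * SIGMA0**2)) ** 2)


def sigma_dot(t):
    return HBAR**2 * t / (4.0 * MASS**2 * SIGMA0**2 * sigma(t))


def flow(t, x0):
    """Position at time t of the point that starts at x0."""
    return V0 * t + x0 * sigma(t) / SIGMA0


def rho(t, x):
    s = sigma(t)
    return math.exp(-((x - V0 * t) ** 2) / (2.0 * s * s)) / (s * math.sqrt(2.0 * math.pi))


def drift(t, x):
    """Spatial component of the drift field."""
    return V0 + (x - V0 * t) * sigma_dot(t) / sigma(t)


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0


def expectation(t, f, a0, b0):
    """E(t, f, N_t) for N = [a0, b0] transported along the flow."""
    a, b = flow(t, a0), flow(t, b0)
    return simpson(lambda x: f(x) * rho(t, x), a, b)


def drift_estimate(t, y, eps, h=1e-4):
    """Quotient in (5.8b) before taking the limit eps -> 0."""
    a0, b0 = y - eps, y + eps
    upper = expectation(t + h, lambda x: x, a0, b0)
    lower = expectation(t - h, lambda x: x, a0, b0)
    rate = (upper - lower) / (2.0 * h)              # d/dt E(t, x, N_t^eps)
    prob = expectation(t, lambda x: 1.0, a0, b0)    # P(t, N_t^eps)
    return rate / prob


if __name__ == "__main__":
    t, y = 1.0, 0.7
    print("drift field at the transported point:", drift(t, flow(t, y)))
    for eps in (1.0, 0.1, 0.01):
        print(eps, drift_estimate(t, y, eps))
```

In this example the deviation from the drift field at \(\Phi _t \left( 0, y \right) \) shrinks roughly quadratically with \(\epsilon \).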

Corollary 5.3 yields a direct interpretation of the drift field in terms of probabilistic quantities. Since \(\mathbb P \left( t, N_t^\epsilon \left( {{\vec {y}}} \right) \right) \) is the probability of the particle to be found in the set \(N_t^\epsilon \left( {{\vec {y}}} \right) \), equation (5.9b) states that the drift field gives the infinitesimal velocity of the expectation value of the particle’s position per unit probability of finding the particle in this region. That is, if the particle is certain to be found in a small enough region of space, the (approximately constant) drift field gives the velocity of the expectation value of the particle’s position in this region. To be able to make practical use of this statement, we postulate the following.

Postulate 2

(Interpretation of the Drift Field) The velocity of the expectation value of the position for an ensemble of particles in a small region of space is equal to the average velocity of the ensemble of particles in that region.

Postulate 2 states that one can determine the drift field at each point by determining the average velocity of the particles hitting the point. Within a stochastic analogue of the theory, it should be possible to assign a precise mathematical meaning to Postulate 2 and determine its truth value, but for our purposes here we shall assume the truth of the statement without proof.Footnote 13 In this context, it is useful to observe that, by construction, the domains \({{\mathrm{dom}}}\rho \) and \({{\mathrm{dom}}}X\) are equal and hence the drift field only needs to be given where particles can actually be found. This compatibility of the interpretation of the drift field with the ensemble/Born interpretation of quantum mechanics is also the point where the Madelung picture differs [42] from the Bohmian interpretation [31, 32]. Indeed, our discussion shows how the ensemble interpretation naturally coheres with the mathematical formalism of the Madelung picture, once the Born rule is assumed.

We are now in a position to practically apply the formalism. This is implicitly related to the question whether the wave function is “objective” or “an element of physical reality” [71]. We translate this as being measurable in the physical sense. In the Madelung picture, this amounts to the question whether the probability density and the drift field are measurable, both of which are probabilistic quantities.

Consider now, for example, a particle gun that is used in the set-up of an arbitrary quantum mechanical experiment, in principle describable via the Madelung equations. Before we run the experiment, we need to collect initial data to solve the Madelung equations. According to the Born interpretation, we do this by placing a suitable detector in front of the particle gun and measuring the 3-dimensional (!) distribution of positions (where the particle hits) and, following Corollary 5.3 and Postulate 2, the average momenta (how hard the particle hits) at each point. If we run the experiment infinitely often, which is of course an idealization, we expect to obtain a smooth probability density \(\rho \) and a smooth drift momentum field \(P = m X\) in space at time \(t=0\). We can then run the actual experiment (ideally) infinitely often and measure the distribution of positions and the average momenta at each position. If the Madelung equations provide a correct description of the physical process and the detectors are ideal, these data will coincide with those predicted by the Madelung equations for the given initial data. Therefore, both \(\rho \) and \({\vec {X}}\) are measurable and thus objective as probabilistic quantities. No measurement problem appears in this case: The time evolution of the probability density is deterministic and the theory makes only probabilistic statements on individual measurements. Furthermore, the mathematical formalism makes no statement on the process of measurement itself.

Remark 5.4

(Ideal Detectors & the Heisenberg Relation) Within the Copenhagen interpretation of the Schrödinger theory, it is possible to deny the existence of ideal detectors on the basis of the (here one-dimensional) Heisenberg inequality

$$\begin{aligned} \Delta x \, \Delta p \ge \frac{\hbar }{2} . \end{aligned}$$
(5.10)

However, if one employs the ensemble interpretation and observes that (5.10) is derivable within the Schrödinger picture, one is forced to conclude that (5.10) is not a statement on individual particles, but one of statistical nature. That is, within the ensemble interpretation, (5.10) does not support the interpretation it is given within the Copenhagen point of view, which itself has been subject to criticism for a long time [71]. In fact, the Heisenberg inequality is a general statement on Fourier transforms [65, Theorem 4.1] and \(\Delta p\) is not the standard deviation for the momentum given by the Madelung picture in conjunction with the Kolmogorovian probability theory (see Sect. 4 and [44, §6.7.3 & §8.5]). Hence, if Postulate 1 is adopted and \(\Delta p\) stands for the standard deviation in momentum, the Heisenberg inequality is incorrect. We conclude that the Heisenberg inequality does not put any restrictions on the precision of individual measurements and it does not appear to bear any physical significance within the Madelung picture.

For further discussion on the interpretation of quantum mechanical states, we again refer to [47, §9.3].

Having concluded our discussion on the continuity equation (3.6b), we now interpret the irrotationality of the drift field (3.6c). Equation (3.6c) has a direct interpretation using the fluid dynamics analogue, namely that it has vanishing vorticity

$$\begin{aligned} \vec \sigma := \nabla \times {\vec {X}} . \end{aligned}$$
(5.11)

Following [54, §1.4], half of the vorticity “represents the average angular velocity of two short fluid line elements that happen, at that instant, to be mutually perpendicular”. This statement derives itself from [29, Eq. 2.1]. Returning to the situation as depicted in Fig. 1, we therefore find that the irrotationality of X means that N does not shear or rotate when propagating along the flow of X. Thus any distortion of the region N over time is due to shrinkage or expansion, not shear or rotation. Moreover, the vorticity of the velocity field of a fluid gives the infinitesimal circulation density, which is derived from the integral definition of the curl operator [72]. In particular, if the vorticity of X vanishes, then for all curves \(\gamma \) in some \(\Omega _t\) joining any two points in a simply connected, open subset \(U \subseteq \Omega _t\), the value of

$$\begin{aligned} \int _\gamma \iota ^*_t \left( \delta \cdot X \right) \end{aligned}$$
(5.12)

depends only on the endpoints. Several researchers [73,74,75] have already suggested that quantum mechanical spin is related, or even equivalent, to the vorticity of the drift field. Indeed, the factor of 1/2 is very suggestive and there already exist works in the literature concerning this question [76, §9 & §10]; [37, 41, 44, 73, 75, 77], but as this article is only concerned with single-Schrödinger particle systems, we do not discuss this relation here. Moreover, we are not in a position to pass judgment or elaborate on this relation yet. For a mathematical introduction to vorticity, see [29, §1.2] and for a very illustrative, freely accessible, graphical exposition of the curl operator, see [78]. We also highly recommend watching the movie on vorticity [79] from the point of view advocated here.
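The path independence of (5.12) for an irrotational field can be illustrated with a small numerical experiment. The sketch below works in the flat Euclidean plane and uses two hypothetical vector fields chosen for illustration only: the gradient of \(\phi (x, y) = x^2 y + \sin x\), which has vanishing vorticity, and the rigidly rotating field \((-y, x)\), which does not. Neither field is claimed to solve the Madelung equations.

```python
import math


def gradient_field(x, y):
    """Irrotational field: gradient of phi(x, y) = x^2 y + sin(x)."""
    return (2.0 * x * y + math.cos(x), x * x)


def rotating_field(x, y):
    """Field with constant, non-zero vorticity."""
    return (-y, x)


def straight_path(s):
    """Straight line from (0, 0) to (1, 1), parametrized by s in [0, 1]."""
    return (s, s)


def curved_path(s):
    """Parabolic arc from (0, 0) to (1, 1), parametrized by s in [0, 1]."""
    return (s, s * s)


def line_integral(field, path, n=4000):
    """Midpoint-rule approximation of the line integral of field along path."""
    total = 0.0
    for k in range(n):
        x0, y0 = path(k / n)
        x1, y1 = path((k + 1) / n)
        fx, fy = field(0.5 * (x0 + x1), 0.5 * (y0 + y1))
        total += fx * (x1 - x0) + fy * (y1 - y0)
    return total


if __name__ == "__main__":
    for name, field in (("gradient", gradient_field), ("rotating", rotating_field)):
        print(name, line_integral(field, straight_path), line_integral(field, curved_path))
```

For the gradient field both paths give \(\phi (1,1) - \phi (0,0) = 1 + \sin 1\) up to discretization error, whereas for the rotating field the two results differ (0 along the straight line and 1/3 along the parabola).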

It remains to interpret the Newton–Madelung equation (3.6a). Due to the fact that the Newton–Madelung equation (3.6a) reduces to Newton’s second law (2.14) for masses that are “large” (as compared, e.g. to the Planck mass), the classical limit of the entire model is quite easily obtained by looking at the large mass approximation of the Madelung equations:

$$\begin{aligned} m \dot{X} = {\vec {F}} ,\end{aligned}$$
(5.13a)
$$\begin{aligned} \nabla \times {\vec {X}} = 0 ,\end{aligned}$$
(5.13b)
$$\begin{aligned} \frac{\partial \rho }{\partial t} + \nabla \cdot \left( \rho {\vec {X}} \right) = 0 . \end{aligned}$$
(5.13c)

As the prior discussion also applies here, these equations yield a probabilistic version of Newtonian mechanics.Footnote 14

This makes them compatible with the ensemble interpretation and the requirement that Newtonian mechanics must hold in some limit, as stated in the introductory discussion. Note again that \(\rho \) should not vanish on its specified domain and that \({{\mathrm{dom}}}\rho = {{\mathrm{dom}}}X\). Hence, following Theorem 3.2, we observe that the first ad hoc modification of Newtonian mechanics in quantum mechanics, i.e. the replacement of Newton’s 2nd law with the Schrödinger equation, amounted to implicitly going over to a probabilistic formalism and adding the Bohm force. We are thus motivated to postulate a new “principle of classical correspondence”, which was originally postulated by Niels Bohr in terms of quantum numbers [17].

Postulate 3

(Non-quantum Limit) For large masses, non-relativistic quantum theory, that is quantum mechanics, reduces to a probabilistic version of Newtonian Mechanics.

Experimentally, this limit can be made quantitative by sending particles of different mass through a double slit and finding the value \(m_q\) at which equations (5.13) cease to be a good description. The so-called classical limit is then \(m/m_q \gg 1\), which is independent of units. On a theoretical level, one could non-dimensionalize the Madelung equations and look at the magnitude of the perturbation introduced by the Bohm force, but we abstain from doing this here.

A generalized version of the Newtonian limit is also immediate.

Postulate 4

(Generalized Newtonian Limit) For large masses, small velocities and negligible spacetime curvature, relativistic quantum theory reduces to a probabilistic version of Newtonian Mechanics.

Clearly, it is only the Newton–Madelung equation that changes under the non-quantum limit and our previous discussion on the other two Madelung equations remains valid in this case. An interpretation of the Newton–Madelung equation thus has to focus on the Bohm force

$$\begin{aligned} {\vec {F}} _ {B} \left( \rho \right) = \frac{\hbar ^2}{2m} \nabla \frac{\Delta \sqrt{\rho }}{\sqrt{\rho }} . \end{aligned}$$
(5.14)

A peculiar feature of this term, as well as the Madelung equations as a whole, is the invariance under the scaling transformation \(\rho \rightarrow \lambda \rho \) with \(\lambda \in \mathbb {R}\setminus \lbrace 0 \rbrace \). Hence the Madelung equations do not change, if \(\rho \) is not normalized, a fact that could be useful for the generalization to multi-particle systems (see Sect. 6). For the interpretation of (5.14), this means that the value of the term is not influenced by the value of the probability density, but only by its shape.
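Both the scaling invariance and the mass dependence of (5.14) are easy to check numerically. The following sketch evaluates the Bohm force by finite differences for a one-dimensional Gaussian density with standard deviation \(\sigma \), for which a direct computation gives \(F_B = \hbar ^2 x / (4 m \sigma ^4)\). The restriction to one dimension, the units \(\hbar = 1\), the step size and all names are assumptions made for illustration; the scaling factor is taken positive so that \(\sqrt{\lambda \rho }\) remains real.

```python
import math

# Assumptions (not from the text): 1D, hbar = 1, Gaussian density.
HBAR = 1.0
SIGMA = 1.0


def gaussian(x):
    """Normalized Gaussian density with standard deviation SIGMA."""
    return math.exp(-x * x / (2.0 * SIGMA**2)) / (SIGMA * math.sqrt(2.0 * math.pi))


def laplacian_ratio(rho, x, h=1e-3):
    """Delta sqrt(rho) / sqrt(rho) at x, via a central second difference."""
    def root(y):
        return math.sqrt(rho(y))
    return (root(x + h) - 2.0 * root(x) + root(x - h)) / (h * h * root(x))


def bohm_force(rho, x, mass, h=1e-3):
    """F_B = hbar^2 / (2 m) * d/dx (Delta sqrt(rho) / sqrt(rho)), cf. (5.14)."""
    slope = (laplacian_ratio(rho, x + h, h) - laplacian_ratio(rho, x - h, h)) / (2.0 * h)
    return HBAR**2 / (2.0 * mass) * slope


def bohm_force_gaussian(x, mass):
    """Closed form for the Gaussian density: hbar^2 x / (4 m sigma^4)."""
    return HBAR**2 * x / (4.0 * mass * SIGMA**4)


def scaled_gaussian(x, factor=7.3):
    """Unnormalized density factor * rho with an arbitrary positive factor."""
    return factor * gaussian(x)


if __name__ == "__main__":
    x = 0.8
    print("normalized:  ", bohm_force(gaussian, x, 1.0))
    print("rescaled:    ", bohm_force(scaled_gaussian, x, 1.0))
    print("closed form: ", bohm_force_gaussian(x, 1.0))
    print("mass = 100:  ", bohm_force(gaussian, x, 100.0))
```

The rescaled density yields the same force as the normalized one, and increasing the mass by a factor of 100 reduces the force by the same factor, in line with the large mass approximation (5.13).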

This property is to be expected a priori by the principle of locality: If we have two isolated ensembles specified by the states \(\left( \rho _1, X_1 \right) , \left( \rho _2, X_2 \right) \), respectively, satisfying \({{\mathrm{dom}}}\rho _1 \cap {{\mathrm{dom}}}\rho _2 = \emptyset \), then describing them separately from one another via the Madelung equations or together should not make any difference in terms of dynamics. More precisely, for \(A \subseteq \mathcal Q := {{\mathrm{dom}}}\rho _1 \cup {{\mathrm{dom}}}\rho _2\) we again define the indicator function

$$\begin{aligned} \chi _A :\mathcal Q \rightarrow \mathbb {R}:x \rightarrow \chi _A \left( x\right) := {\left\{ \begin{array}{ll} 1 &{}, x \in A \\ 0 &{}, \text {else} \end{array}\right. } \end{aligned}$$
(5.15)

of A, \(\chi _1 := \chi _{{{\mathrm{dom}}}\rho _1}\), \(\chi _2 := \chi _{{{\mathrm{dom}}}\rho _2}\), and we set \(\rho := \chi _1 \, \rho _1 /2 + \chi _2 \, \rho _2 /2\), as well as \(X :=\chi _1 \, X_1 +\chi _2 \, X_2\). As for \({{\mathrm{dom}}}\rho _1 \cap {{\mathrm{dom}}}\rho _2 = \emptyset \) both \(\rho \) and X are smooth, we can now check whether they are a solution to the Madelung equations. We indeed have

$$\begin{aligned} \frac{\Delta \sqrt{\rho }}{\sqrt{\rho }} = \frac{\Delta \sqrt{\rho _1}}{\sqrt{\rho _1}} \chi _1 + \frac{\Delta \sqrt{\rho _2}}{\sqrt{\rho _2}} \chi _2 , \end{aligned}$$
(5.16)

and the other two equations also separate, as required by this consistency condition. Interestingly, if the domains overlap and \(\rho \) and X are sufficiently smooth, then (5.16) does not hold and thus \(\left( \rho , X\right) \), as defined, is in general not a solution of the Madelung equations. This can be explained by the fact that one gets an entirely new ensemble in that case and hence the non-linearity of (5.14) in \(\rho \) is not necessarily a defect of the theory. Non-linearity here means that \({\vec {F}} _B\) is not linear (and not even defined), if extended to the vector space \(C^\infty \left( \mathcal Q, \mathbb {R}\right) \) via (5.14). This point of view potentially explains the results of the double slit experiment, but the statement remains of speculative nature unless a careful mathematical treatment is given.

As compared to the respective Newtonian theory, the term (5.14) also causes an additional coupling between the drift field and the probability density, that goes beyond the requirement that the flow of the drift field is probability preserving in the sense of the continuity equation (5.2). Thus how the probability density changes in space determines how the drift field behaves and vice versa in a nonlinear manner. Consequently, perhaps quite surprisingly to some, it is a nonlinearity that causes much of quantum-mechanical behavior.

Intuitively, (5.14) represents a kind of noise that disappears for large masses, which leads us to propose an alternative terminology for the term (5.14): Quantum noise or Bohm noise.

5.2 Speculative Interpretation

At this point, we can only speculate on the origin of the (quantum) noise term, but there is a particular interpretation that suggests itself given our current knowledge of physics and considering that the term is only relevant for small masses. Before we proceed, we would like to stress that what follows is speculative and should be considered as standing fully separate from the rest of the article. We understand the controversial nature of various attempts of interpreting quantum mechanics [80], but we consider the need to find a coherent interpretation of the equations as vital for the progress in the field. Needless to say, any interpretation of a theory of nature has to exhibit a strong link between the applied theory and the mathematical formalism and may not contradict either. In the following, we will speak about quantum mechanics in general and not limit ourselves to the 1-particle Schrödinger theory.

In 2005 Couder et al. [81] discovered that a silicone oil droplet on the surface of a vertically oscillating silicone oil bath remains stationary in a certain frequency regime, in which coalescence is prevented. When the sinusoidal, vertical force on the bath reaches a critical amplitude, the droplet begins to accelerate and can be made to “walk” on the surface of the bath [82]. Surprisingly, this basic setup is a macroscopic quantum analogue and can be used to build more complicated ones. For a mathematical model see [83], and for a brief summary we refer to [84]. If two droplets approach each other, they either scatter, coalesce or lock into orbit. In the latter case, Couder et al. observed that the distances between the averaged orbits are approximately one Faraday wavelength [82], which means that they are “quantized”, in the sense of being discrete. Moreover, when Couder and Fort studied the statistical behavior of such a droplet passing a double-slit wall, it resembled the one found in the quantum-mechanical analogue [85]. The fact that Eddi et al. were in addition able to establish the occurrence of tunneling for the droplet [86], suggests that a qualitatively similar behavior occurs in the microscopic realm. How is this to be explained?

A physicist at the beginning of the twentieth century might have justified this analogy via a vibration of the ether: If the particle is massive enough, the influence of the ether’s motion on the particle is negligible and it behaves according to Newton’s laws. Yet when the mass of the particle is small, the more or less random vibrations of the ether cannot be neglected any more and a statistical description that models the noise caused by the ether’s vibration via (5.14) becomes necessary.

Of course, this explanation is flawed.

The Michelson–Morley experiment famously ruled out any influence of the ether’s motion on light [87] and an influence on matter had not been observed, which ultimately led to the creation of the theory of special relativity [88]. In addition, the existence of the ether would have established the existence of a preferred ‘rest frame’, being the one in which the ether is stationary, which in turn, if the above interpretation were correct, would suggest a natural tendency of particles to move along with the ether. This would result in an additional drift due to the overall “ether wind”, which is not present in the Newton–Madelung equation (3.6a).

However, according to the current state of knowledge, by which we mean the point of view imposed by the Einstein equivalence principle and the related non-Euclidean geometry of spacetime (see [48, 49] for an introduction to general relativity, [25, 89] for a more mathematical treatment), a similar argument can be made to explain the noise term (5.14), provided we assume the existence of gravitational waves that are too weak to have a directly observable influence on macroscopic objects, yet strong enough to influence microscopic particles such as electrons.

Consider the following, purely relativistic Gedankenexperiment: Say we have a physical, inertial observer Alice who perceives her surroundings as having, for instance, a flat geometry (see Footnote 15) and who, by some miraculous power, is able to sense the position of an otherwise freely moving particle without disturbing it. Note that this is not a contradiction to the Heisenberg inequality, as explained in Remark 5.4. If the sufficiently weak gravitational waves are more or less random and there is no gravitational recoil, the particle will move geodesically in the actual geometry, but this will not be a straight line according to Alice’s perceived, macroscopic geometry. If there is gravitational recoil, the particle might not move geodesically and could in principle lose or gain mass, depending highly on the relation between the spacetime geometry and the mass of the particle. Either way, Alice would describe the motion of the particle as random and she would have to resort to a statistical description, possibly taking the shape of the Madelung equations. Just as in the case of the droplet, the apparently random behavior would be caused by a highly complicated, non-linear underlying dynamics, very susceptible to initial conditions, yet would also be deterministic. Alice, being aware of the underlying physics, would have to construct a model for geometric noise, that is, noise caused by seemingly random small-scale curvature irregularities in spacetime.

While we are aware of the radicality of this ansatz, it appears plausible to us that the Madelung equations and thus also the Schrödinger equation could be a model of geometric noise. The fact that a droplet on a vibrating fluid bath is a quantum-mechanical analogue appears to be more than mere coincidence, considering that space and time cannot be assumed to be adequately described by special relativity on the scale of the Bohr radius without severe extrapolation. Even though we do not expect general relativity to be valid at the quantum scale, the thought experiment shows how someone only trained in relativity theory might interpret quantum behavior. Moreover, this conceptual approach can potentially resolve the old question of why the electron surrounding a hydrogen nucleus does not radiate, which would cause the atom to be unstable [15], and why a description employing the Coulomb force works well, despite it only being valid in electrostatics:

The electron is standing almost still with respect to the nucleus, but the local spacetime around the nucleus is non-static. In the hydrodynamic analogy, it is like a ball caught in a vortex of a vibrating fluid, which in this case is spacetime itself. The ball does not move much with respect to the fluid, but the fluid does move with respect to an outside observer at rest.

A geometric origin of the noise term (5.14) has already been proposed by Delphenich [90], but, to our knowledge, no satisfactory derivation has been given yet. The proposal that quantum behavior is caused by random fluctuations of some microscopic ‘fluid’ goes back to Bohm and Vigier [43]. In his model of stochastic mechanics, Nelson gave a similar interpretation [91]. Tsekov has formulated his stochastic interpretation of the Madelung equations as follows: “\([\dots ]\) the vacuum fluctuates permanently and for this reason the trajectory of a particle in vacuum is random. If the particle is, however, too heavy the vacuum fluctuations generate negligible forces and this particle obeys the laws of classical mechanics.” [42] Note that the word ‘forces’ is better replaced by ‘deviations from the macroscopic metric’ in the interpretation we propose. Ultimately, this interpretation should be supported by a mathematical derivation of the Madelung equations from a relativistic model of random irregularities in spacetime curvature.

Question 2

If quantum behavior is caused by random small-scale curvature irregularities in spacetime, how is the noise term to be derived?

We do not believe that such a derivation, if it exists, is currently within reach and thus caution against any attempts to find it. Even if the hypothesis of quantum behavior being caused by gravitational waves is correct, it appears doubtful that the Einstein equation holds on the quantum scale and thus one lacks the basic equations to model the gravitational waves. Even if they were known, one would most likely be faced with a system of non-linear partial differential equations for which no general solution can be found and then one would still have to find a way to model the randomness. Clearly, superposition of waves is only applicable if the differential equation is linear, which makes modeling the randomness a non-trivial task already for Ricci-flat plane waves. Moreover, if one works in the linear approximation, in general one encounters arguably unphysical singularities in the metric [92].

Ultimately, a deep question that needs to be addressed in this interpretation of quantum mechanics is how the violation of Bell’s inequality is achieved. Ballentine traces the violation of Bell’s inequality in quantum mechanics back to the locality postulate used in the derivation of the inequality [47, §20.7].

Postulate 5

(Bell-Locality) If two spatially separated measurement devices A and B respectively measure the observables a and b of an ensemble of two distinct, possibly indistinguishable particles, then the result of b obtained by B does not change as a different observable \(a'\) is measured by A and vice versa.

If we assume that the stochastic interpretation is correct, then it appears to us that there are two possible resolutions to prevent actual so-called “actions at a distance”.

The first one is that, as in the case of the droplets, the particle itself creates gravitational waves and this in turn influences the motion of other particles, which might appear like a non-local interaction. This approach appears slightly implausible to us, since this could lead to a fluctuation in the mass of the particle, which is not observed. In addition, we would naively expect such waves to travel approximately at the speed of light with respect to the macroscopic metric, but Postulate 5 and thus Bell’s argument also include spacelike separated measurements [47, §20.4].

The second, to our mind more plausible resolution is to drop an assumption that is implicit in most modern physical theories, namely that a region of space (relative to a physical observer) containing particles is topologically simple on mesoscopic and microscopic scales. The suggestion that there is a connection between topology and entanglement has recently been made by van Raamsdonk [93], but, to our knowledge, goes back to Wheeler [94, 95]. In that case, we would not only have to renounce the statement that spacetime is flat at the quantum scale, but also that it can be adequately modeled by an open subset of \(\mathbb {R}^4\). So the idea is that handles in spacetime are observed as entangled particles and the system satisfies both the principle of causality and locality as implemented in the theory of relativity. This necessitates the view of fundamental particles as geometric and topological spacetime solitons, as in Wheeler’s “geometrodynamics” [94, 95]. Then Postulate 5 is not applicable as the particles are not distinct and thus Bell’s inequality can be violated even if Postulate 5 is true. The non-locality observed for entangled particles is then not real, but only apparent, caused by interactions of the particles with the measurement apparatus and a naive conception of space and time.

However, in order to overcome the speculative nature of this discussion, we suggest that the proper implementation of spin and the treatment of multi-particle systems in the Madelung picture be carried out first. Following the discussion in Sect. 2, this might require a detour through the relativistic theory and the Newtonian limit.

6 Modification: Particle Creation and Annihilation

As stated in the introduction, the Madelung equations can be naturally modified to study a wider class of possible quantum systems. For instance, one can consider rotational forces and ‘higher order quantum effects’ by viewing the noise term (5.14) as the first order in a Taylor approximation in \(1/m\) around 0 of a non-linear operator in \(\rho \) and its derivatives. The modification we propose here is of conceptual nature and intended to be applied in the generalization of the formalism to many-particle systems. Though we do not wish to fully address this generalization here, we remark that, due to the symmetrization postulate [47, §17.3], the concept of spin needs to be properly implemented in the Madelung picture first, to be able to study systems with multiple mutually indistinguishable particles. The results obtained in the linear operator formalism can serve as a guide (see [47, §18.4]), but should also be questioned.

The phenomenon of particle creation and annihilation is not one that requires a relativistic treatment per se [47, §17.4], despite the fact that it is most commonly considered within relativistic quantum theory. Besides, the treatment in quantum field theory is also not free of problems (see e.g. [96, §9.5]). This raises the question of how this phenomenon should be modeled in the Madelung picture. Following our discussion in the previous section, it becomes obvious that the continuity equation

$$\begin{aligned} \frac{\partial \rho }{\partial t} + \nabla \cdot \left( \rho {\vec {X}} \right) = 0 , \end{aligned}$$
(6.1)

needs to be modified, as, once normalized, it leads to the conservation of probability

$$\begin{aligned} \int _{\Omega _t} \iota ^*_t \rho \, \text {{d}}^3 x = 1 \quad \forall t \in \mathbb {R}. \end{aligned}$$
(6.2)

In fluid mechanics, (6.1) expresses the conservation of mass [29, §1.1]. To model a change in mass of the fluid, e.g. due to chemical reactions, one includes a source term

$$\begin{aligned} \tilde{u} = \frac{\partial \rho }{\partial t} + \nabla \cdot \left( \rho {\vec {X}} \right) , \end{aligned}$$
(6.3)

which implies that

$$\begin{aligned} \frac{\partial }{\partial t} \int _{\Omega _t} \iota _t^*\rho \, \text {{d}}^3 x = \int _{\Omega _t} \iota _t^* \tilde{u} \, \text {{d}}^3 x , \end{aligned}$$
(6.4)

by Reynolds’ transport theorem (modulo questions of convergence). In quantum mechanics, Eq. (6.4) can be interpreted as stating that the probability of finding the particle anywhere changes with time, which is the desired modification to the continuity equation. More precisely, \(\tilde{u}\) should be replaced by a smooth, possibly trivial operator u applied to \(\rho \) and X, in the sense that

$$\begin{aligned} u \left( \rho , X \right) :\mathcal Q \rightarrow \mathbb {R}\end{aligned}$$
(6.5)

is smooth for all smooth \(\rho \) and X. That the domain of \(u \left( \rho , X \right) \) is \(\mathcal Q \), rather than, e.g. \( \mathcal Q \times \mathcal Q \), is required by the principle of locality. Moreover, since probabilities are nonnegative and not greater than 1, we also have to demand

$$\begin{aligned} \int _{\Omega _t} \iota _t^*\rho \, \text {{d}}^3 x \in [0,1] \subset \mathbb {R}\quad \forall t \in I . \end{aligned}$$
(6.6)
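
The step from (6.3) to (6.4) can be spelled out as follows. As a sketch, we assume here that the domain \(\Omega _t\) is transported by the flow of \({\vec {X}}\) and that \(\rho \) decays fast enough for all integrals to converge; both are assumptions made for illustration only. Reynolds’ transport theorem then gives

$$\begin{aligned} \frac{\partial }{\partial t} \int _{\Omega _t} \iota _t^*\rho \, \text {{d}}^3 x = \int _{\Omega _t} \iota _t^* \left( \frac{\partial \rho }{\partial t} + \nabla \cdot \left( \rho {\vec {X}} \right) \right) \text {{d}}^3 x = \int _{\Omega _t} \iota _t^* \tilde{u} \, \text {{d}}^3 x , \end{aligned}$$

where the last equality is the definition (6.3). For \(\tilde{u} = 0\) the total probability is constant in time, so that a normalized \(\rho \) satisfies (6.2).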

Thus the Madelung equations for one Schrödinger particle that can be created and annihilated (e.g. by formation from or disintegration into gravitational waves, see Sect. 5) consist of the Newton–Madelung equation (3.6a), the irrotationality of the drift field (3.6c) and the modified continuity equation

$$\begin{aligned} \frac{\partial \rho }{\partial t} + \nabla \cdot \left( \rho {\vec {X}} \right) = u \left( \rho , X\right) , \end{aligned}$$
(6.7)

where the precise form of u is still unknown. Due to the scaling invariance of the Newton–Madelung equation, this modification does not change the underlying dynamics. We also remark that the requirement for the equations to separate for isolated ensembles puts restrictions on u (see page 40sqq.).
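
The integrated balance (6.4) and the bound (6.6) can be illustrated numerically. The following is a minimal sketch under assumptions that are ours and not part of the model above: one spatial dimension, a periodic grid, a constant drift velocity, and the hypothetical source term \(u \left( \rho , X \right) = - \Gamma \rho \) with a constant rate \(\Gamma > 0\). The first-order upwind scheme and all names (`evolve_density`, `gamma`, and so on) are chosen for illustration only.

```python
import math


def evolve_density(rho, velocity, gamma, dx, dt, steps):
    """Advance rho on a periodic 1D grid with a first-order upwind scheme.

    Discretizes d(rho)/dt + d(rho * velocity)/dx = -gamma * rho for a
    constant velocity > 0 (hypothetical source term u = -gamma * rho).
    Returns the total probability sum(rho) * dx after every step.
    """
    totals = [sum(rho) * dx]
    for _ in range(steps):
        # rho[i - 1] with i = 0 wraps around to the last cell (periodic domain).
        rho = [
            rho[i]
            - velocity * dt / dx * (rho[i] - rho[i - 1])
            - gamma * dt * rho[i]
            for i in range(len(rho))
        ]
        totals.append(sum(rho) * dx)
    return totals


# Hypothetical parameters, chosen only for illustration.
n_cells, length = 200, 10.0
dx = length / n_cells
velocity, gamma = 1.0, 0.3
dt, steps = 0.01, 500

# Gaussian initial density, normalized so that the total probability is 1.
raw = [math.exp(-((i + 0.5) * dx - length / 2) ** 2) for i in range(n_cells)]
norm = sum(raw) * dx
rho0 = [value / norm for value in raw]

totals = evolve_density(rho0, velocity, gamma, dx, dt, steps)
expected_final = math.exp(-gamma * dt * steps)  # continuum prediction from (6.4)
print(totals[0], totals[-1], expected_final)
```

On a periodic grid the upwind flux terms cancel in the sum over all cells, so the discrete total probability changes by the factor \(1 - \Gamma \, \Delta t\) per step. This approximates the exponential decay obtained from (6.4) for this particular source term, and the total stays in \([0,1]\) as required by (6.6).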

Proposition 6.1

Let \(\left( \mathcal Q , \text {{d}}\tau , \delta , \mathcal O \right) \) be a Newtonian spacetime with \(b_1 \left( \Omega _t \right) = 0\) for all \(t \in I\), as defined in (2.25), let \(X \in \mathfrak X _{Nt} \left( \mathcal Q \right) \) be a Newtonian observer vector field, \({\vec {F}} \in \mathfrak X _{Ns} \left( \mathcal Q \right) \) be a Newtonian spacelike vector field, \(\rho \in C^\infty \left( \mathcal Q , \mathbb {R}_+ \right) \) a strictly positive, real function,

$$\begin{aligned} u :C^\infty \left( \mathcal Q , \mathbb {R}_+ \right) \times \mathfrak X_{Nt}\left( \mathcal Q\right) \rightarrow C^\infty \left( \mathcal Q , \mathbb {R}\right) :\left( \rho , X \right) \rightarrow u \left( \rho , X \right) \end{aligned}$$
(6.8a)

an operator and \(m, \hbar \in \mathbb {R}_+\).

Then the irrotationality of X (3.6c) and F (3.6d) imply that \(\exists \varphi , V \in C^\infty \left( \mathcal Q, \mathbb {R}\right) \) such that equations (3.6e), (3.6f) hold and by setting \(\Psi := \sqrt{\rho }\, e^{-\i \varphi }\),

$$\begin{aligned} \xi \left( \Psi \right) := \frac{\hbar }{2 |\Psi |^2} \, u \left( |\Psi |^2, \frac{\partial }{\partial t} + \frac{\hbar }{2 \i m} \left( \frac{\nabla \Psi }{\Psi } - \frac{\nabla \Psi ^*}{\Psi ^*} \right) \right) , \end{aligned}$$
(6.8b)

the Newton–Madelung equation (3.6a) together with equation (6.7) imply

$$\begin{aligned} \i \hbar \frac{\partial \Psi }{\partial t} = - \frac{\hbar ^2}{2m} \Delta \Psi + V \Psi + \i \xi \left( \Psi \right) \Psi . \end{aligned}$$
(6.8c)

Conversely, for \(\Psi \in C^\infty \left( \mathcal Q, \mathbb {C} \setminus \lbrace 0 \rbrace \right) \), \(V \in C^\infty \left( \mathcal Q, \mathbb {R}\right) \) and

$$\begin{aligned} \xi :C^\infty \left( \mathcal Q, \mathbb {C} \setminus \lbrace 0 \rbrace \right) \rightarrow C^\infty \left( \mathcal Q, \mathbb {R}\right) :\Psi \rightarrow \xi \left( \Psi \right) \end{aligned}$$
(6.8d)

satisfying (6.8c), define \(\rho := |\Psi |^2\), \({\vec {F}}\) via (3.6f), u via (6.8a) as well as (6.8b) and \({\vec {X}}\) via (3.6j) such that \(X:= \partial / \partial t + {\vec {X}} \in \mathfrak X _{Nt} \left( \mathcal Q \right) \). Then (3.6a), (6.7), (3.6c) and (3.6d) hold.

Proof

The proof is entirely analogous to that of Theorem 3.2. Instead of (3.7i), we get

$$\begin{aligned} 2 R \frac{\partial R}{\partial t} - \frac{\hbar }{m} \left( 2 R \nabla R \cdot \nabla \varphi + R^2 \Delta \varphi \right) = u \left( R^2, \frac{\partial }{\partial t} - \frac{\hbar }{m} \nabla \varphi \right) . \end{aligned}$$
(6.9a)

Using (3.6j), Definition (6.8b) above and Formula (3.7h) for \(\Delta \Psi \), we obtain

$$\begin{aligned} \hbar \frac{\partial R}{\partial t} = - \frac{\hbar ^2}{2m} \mathfrak {I}\left( e^{\i \varphi } \Delta \Psi \right) + R \xi \left( \Psi \right) . \end{aligned}$$
(6.9b)

Together with the real part of \(e^{\i \varphi } \Delta \Psi \) (3.7k) we indeed get (6.8c). The reverse implication is also proven in full analogy to Theorem 3.2. \(\square \)
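
The intermediate step between (6.9a) and (6.9b) can be made explicit. The following sketch is computed directly from \(\Psi = R \, e^{-\i \varphi }\) with \(R = \sqrt{\rho }\); it is intended to agree with Formula (3.7h), which is not reproduced here. Applying the product rule twice yields

$$\begin{aligned} e^{\i \varphi } \Delta \Psi = \Delta R - R \left| \nabla \varphi \right| ^2 - \i \left( 2 \nabla R \cdot \nabla \varphi + R \Delta \varphi \right) , \end{aligned}$$

so that \(\mathfrak {I}\left( e^{\i \varphi } \Delta \Psi \right) = - \left( 2 \nabla R \cdot \nabla \varphi + R \Delta \varphi \right) \). Dividing (6.9a) by \(2R\) and multiplying by \(\hbar \) then gives

$$\begin{aligned} \hbar \frac{\partial R}{\partial t} = \frac{\hbar ^2}{2m} \left( 2 \nabla R \cdot \nabla \varphi + R \Delta \varphi \right) + \frac{\hbar }{2R} \, u \left( R^2, \frac{\partial }{\partial t} - \frac{\hbar }{m} \nabla \varphi \right) = - \frac{\hbar ^2}{2m} \mathfrak {I}\left( e^{\i \varphi } \Delta \Psi \right) + R \, \xi \left( \Psi \right) , \end{aligned}$$

where the last equality uses \(\xi \left( \Psi \right) = \hbar \, u / \left( 2 R^2 \right) \) from (6.8b).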

If we now define an operator \(\hat{\Xi }\) acting on \(\mathcal S \left( \mathbb {R}^3, \mathbb {C} \setminus \lbrace 0 \rbrace \right) \) via

$$\begin{aligned} \hat{\Xi }\Psi _t := \left( - \frac{\hbar ^2}{2m} \Delta + V + \i \xi \left( \Psi _t \right) \right) \Psi _t \end{aligned}$$
(6.10)

for \(\Psi _t \in \mathcal S \left( \mathbb {R}^3, \mathbb {C} \setminus \lbrace 0 \rbrace \right) \), then \(\hat{\Xi }\) is usually non-linear and need not even be defined on \(\mathcal S \left( \mathbb {R}^3, \mathbb {C}\right) \). The Schrödinger equation modeling particle creation and annihilation

$$\begin{aligned} \hat{E} \Psi _t = \hat{\Xi }\Psi _t \end{aligned}$$
(6.11)

cannot then be recast into an eigenvalue equation for \(\hat{\Xi }\), as the separation ansatz will not work. We have thus proposed a physically reasonable model in which the current axiomatic framework of quantum mechanics breaks down (see Sect. 4 and [47, §2.1]).
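
To illustrate how (6.8b) and (6.8c) work in a concrete case, consider a simple choice of source term. This choice is purely illustrative and is not claimed to be physically realized: let \(u \left( \rho , X \right) = - \Gamma \rho \) with a constant rate \(\Gamma \ge 0\), which is an assumed parameter. Then (6.8b) gives

$$\begin{aligned} \xi \left( \Psi \right) = \frac{\hbar }{2 |\Psi |^2} \left( - \Gamma |\Psi |^2 \right) = - \frac{\hbar \Gamma }{2} , \end{aligned}$$

and (6.8c) becomes

$$\begin{aligned} \i \hbar \frac{\partial \Psi }{\partial t} = - \frac{\hbar ^2}{2m} \Delta \Psi + V \Psi - \frac{\i \hbar \Gamma }{2} \Psi . \end{aligned}$$

By (6.4), the total probability then obeys

$$\begin{aligned} \frac{\partial }{\partial t} \int _{\Omega _t} \iota _t^*\rho \, \text {{d}}^3 x = - \Gamma \int _{\Omega _t} \iota _t^*\rho \, \text {{d}}^3 x , \end{aligned}$$

so it decays like \(e^{- \Gamma \left( t - t_0 \right) }\) and remains in \([0,1]\) for \(t \ge t_0\), provided it lies in \([0,1]\) at the initial time \(t_0\). In this special case \(\xi \) is constant and the resulting equation happens to be linear; the non-linearity arises once \(\xi \left( \Psi \right) \) genuinely depends on \(\Psi \).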

7 Conclusion

In the introductory discussion we have argued that the use of a quantization algorithm in the formulation of quantum mechanics is a strong indication that quantum mechanics and thus quantum theory as a whole is, as of today, an incomplete theory. We also suggested that the identification of fundamental geometric quantities is a promising path to overcome this somewhat unsettling feature, as these quantities will inevitably be part of a new axiomatic framework for the theory. We then proceeded in Sect. 2 by constructing a Newtonian spacetime on which we then formulated the Madelung equations in Sect. 3. This construction enabled us to prove a local equivalence of the Madelung equations and the Schrödinger equation for irrotational forces. By relating the Madelung equations to the linear operator formalism thereafter, we showed that the Madelung equations naturally explain why the position, momentum, energy and angular momentum operators take the shape commonly found in quantum mechanics textbooks. These results strongly indicate that the Madelung equations formulated on a Newtonian spacetime provide the natural mathematical basis for quantum mechanics and that this basis should include the relevant aspects of Kolmogorovian probability theory. In Sect. 5.1 we gave a formal discussion of the Madelung equations that can be used for practically interpreting and applying the formalism, as well as extending the mathematical model. We then proceeded in Sect. 5.2 by speculating that quantum mechanics provides a statistical model for spacetime geometric noise, which is a variant of the stochastic interpretation developed by Bohm, Vigier and Tsekov. To give an example of how to naturally extend the Madelung equations, we proposed an unfinished model for particle creation and annihilation for single-Schrödinger particle systems in Sect. 6. We observed that this can lead to a non-linearity in the resulting Schrödinger equation and thus makes the linear operator formalism inapplicable.

Some of our results have been summarized in the table. The abbreviations QM and GQT stand for quantum mechanics and geometric quantum theory, respectively.

| Subject | Textbook/Copenhagen QM | GQT in Newtonian limit for Schrödinger particles | cf. |
| --- | --- | --- | --- |
| Spacetime model | Newtonian spacetime \(( \mathcal Q, \text {{d}}\tau , \delta , \mathcal O )\) with \(\Omega _t = \mathbb {R}^3\) (modulo sets of measure zero) \(\forall t \in I\), but implicit | Newtonian spacetime \(\left( \mathcal Q, \text {{d}}\tau , \delta , \mathcal O \right) \) | Section 4 & Section 2 |
| Single particle state | (Spinor) wave function \(\Psi \) | Probability density \(\rho \) and drift field X | Section 3 & Section 5.1 |
| Probability theory used | von Neumann with projection postulate | Kolmogorov with measure \(\int \text {{d}}^3 x \, \iota _t^* \rho \) applied on Borel sets or Lebesgue sets of \(\Omega _t \subseteq \mathbb {R}^3\) | Section 4 & Post. 1 |
| Observables | Inner products of wave functions, elements in the spectrum of (linear) endomorphisms of a Hilbert space \(\mathcal H\) | Probabilities and expectation values of real-valued functions on \(\mathcal Q\) (possibly depending on states) | As above |
| Measurement problem | Unresolved; in Copenhagen interpretation measurement causes ‘wave function collapse’ | Not an issue; wave function is a mathematical tool encoding information on ensembles of particles; measurement itself is not modeled | Section 5.1 & Remark 5.4 |
| Wave-particle duality | Particle identified with wave function \(\Psi \); interpreted as actual wave in Copenhagen interpretation | Makes no statement on internal structure of particles; treats them as effectively point-like | Remark 5.1 |
| Superposition | Fundamental principle of QM; implemented via linear operator formalism on some Hilbert space | Not a principle; only sensible if \(\Psi \) exists and dynamical evolution equation in \(\Psi \) is linear | Sections 4, 5.1 & 6 |
| Classical correspondence | QM supposedly yields Newtonian mechanics in the limit \(\hbar \rightarrow 0\) | A Newtonian probability theory is obtained in large mass approximation | Eq. (5.14) & Post. 3 |
| Canonical quantization | Ill-defined scheme to obtain dynamical equations from Newtonian mechanics | Rejected; instead dynamical equations are postulated, justified by arguments and empirical evidence | Sections 1 & 3 |
| Uncertainty relation | Interpreted as fundamental uncertainty in measurable position and momentum | Relation formally derivable, but interpretation not supported; no restriction on maximal precision of measurement on theoretical level | Remark 5.4 & Section 5.1 |
| Particle creation & annihilation | Not possible; in QFT via second quantization formalism | Possible in principle via modification of continuity equation | Section 6 |
| Fundamental theory? | Yes, in Newtonian limit | No, phenomenological | Sections 1 & 5.2 |

Despite all of these remarkable successes of the Madelung picture, there are still many open problems that need to be addressed to complete it and put quantum theory on a new foundation. From a mathematical point of view, the most important one is formulated by Question 1. We are currently working on the proper generalization of the Madelung equations to the relativistic setting, which is of conceptual importance due to the principle of relativity as discussed in the beginning of Sect. 2. However, there are many potentially fruitful paths for extending the Madelung equations in the non-relativistic setting already. How is spin to be geometrically implemented? How does the generalization to many-particle systems work? How exactly do we model particle creation and annihilation? Finally, there remains the question of interpreting the Madelung equations: How does the hydrodynamical quantum analogue discovered by Couder et al. [81, 82, 83, 85, 86, 97] relate to the actual behavior of quanta? How is matter related to spacetime geometry on the quantum scale?

To answer these questions, the non-quantum limit, the existing literature on quantum theory formulated in the linear operator formalism (e.g. [21, 41, 47, 98]), as well as results already obtained in Bohmian mechanics (e.g. [31, 32, 43, 44, 76, 77, 99, 100]) will be of use.