1 Introduction

In 1935, Einstein, Podolsky and Rosen (EPR) argued that quantum mechanics is incomplete by considering two particles in one dimension moving in opposite directions and whose joint wave function (see (3.2.1) below) was such that the measurement of the position of one of the particles immediately determined the position of the other particle and, similarly, the measurement of the momentum of one of the particles immediately determined the momentum of the other one.

Since, said EPR, a measurement made on one particle obviously could not possibly influence the physical state of the other particle, situated far away from the first particle, and since the wave function of both particles specifies neither the position nor the momentum of those particles, this quantum mechanical description of the state of both particles provided by this wave function must be incomplete in the sense that other variables, such as the values of the positions and momenta of both particles, must be included in a complete description of that physical system.

EPR’s argument was widely misunderstood, misrepresented, or ignored at the time. Schrödinger was an exception: in his “cat paper,” originally published in German [34], as well as in the papers [35, 36], he understood the “paradox” raised by EPR and deepened the perplexity that it causes.

Schrödinger showed that for certain states, now called maximally entangled (see Sect. 2.1), it is not just the positions and the momenta of the particles that are perfectly correlated. He showed that, for every observable associated with the first particle, there is another observable associated with the second particle such that the results of the measurements of both observables are perfectly correlated.

In [11], following [24, 25], we explained that, if one assumes locality, meaning that there is no effect whatsoever on the state of the second particle due to a measurement carried out on the first particle (when both particles are sufficiently spatially separated), there must exist what we call a “non-contextual value map” v which assigns to each observable A a value v(A) that pre-exists its measurement and is simply revealed by it. The word “non-contextual” refers to the fact that, since it pre-exists the measurement, the value v(A) does not depend on the procedure used to measure A.

However several theorems, originally due to Bell [3] and to Kochen and Specker [27], preclude the possibility of a non-contextual value map.Footnote 1 Since the existence of this map is a logical consequence of the assumption of locality and of the perfect correlations, the assumption of locality is false.

In this paper, we first summarize the arguments of our previous paper [11] (Sect. 2). We then turn to the EPR paper as well as related work by Einstein alone and Schrödinger (Sect. 3). Next, we provide a proof of nonlocality similar to the one of Sect. 2, but using only functions of the EPR variables, namely positions and momenta (Sect. 4). This argument relies on a theorem of Clifton [14].

We then consider what happens in Bohmian mechanics (Sect. 5): in that theory, particles have, at all times, both a position and a momentum and one might therefore think that this would imply the existence of a non-contextual value-map for functions of those variables. We explain however, through an analysis of what a measurement of momentum means in that theory, that this is not the case. Finally we briefly discuss how nonlocality manifests itself in Bohmian mechanics.

For a discussion of the relationship between this work and previous ones, including [1, 12, 13, 20, 26, 37], see Sect. 7 of [11].

2 Proof of Nonlocality Based on Perfect Correlations

We will first discuss special quantum states, called maximally entangled, for pairs of physical systems that can possibly be located far apart, and having the property that, for each quantum observable of one of the systems, there is an associated observable of the other one such that the result of the measurement of that observable is perfectly correlated with the result of the measurement on the first one.

2.1 Maximally Entangled States

Consider a finite dimensional (complex) Hilbert space \(\mathcal H\), of dimension N, and orthonormal bases \(\psi _n\) and \(\phi _n\) in \(\mathcal H\) (we will assume below that all bases are orthonormal). A unit vector \(\Psi \) in \(\mathcal H \otimes \mathcal H\) is maximally entangled if it is of the form

$$\begin{aligned} \Psi = \frac{1}{\sqrt{N}}\sum _{n=1}^N \psi _n \otimes \phi _n. \end{aligned}$$
(2.1.1)

Since we are interested in quantum mechanics, we will refer to those vectors as maximally entangled states and we will associate, by convention, each space in the tensor product with a “physical system,” namely we will consider the set \(\{\phi _n\}_{n=1}^N\) as a basis of states for physical system 1 and the set \(\{\psi _n\}_{n=1}^N\) as a basis of states for physical system 2.

Now, given a maximally entangled state, one can associate to each operator of the form \({\mathbbm {1}} \otimes O\) (meaning that it acts non-trivially only on particle 1) an operator of the form \( \tilde{O} \otimes {\mathbbm {1}} \) (meaning that it acts non-trivially only on particle 2). Here \({\mathbbm {1}}\) denotes the identity operator on \(\mathcal H\).

Define the operator U mapping \(\mathcal H\) to \(\mathcal H\) by setting

$$\begin{aligned} U \phi _n =\psi _n, \end{aligned}$$
(2.1.2)

\(\forall n= 1, \dots , N\), and extending U to an anti-linear operator on all of \(\mathcal H\):

$$\begin{aligned} U \left( \sum _{n=1}^N c_n \phi _n\right) =\sum _{n=1}^N c^*_n U \phi _n= \sum _{n=1}^N c^*_n \psi _n \end{aligned}$$
(2.1.3)

where \(^*\) denotes the complex conjugate.

Using the operator U, the state \(\Psi \) in (2.1.1) can be written as:

$$\begin{aligned} \Psi = \frac{1}{\sqrt{N}}\sum _{n=1}^N U\phi _n \otimes \phi _n . \end{aligned}$$
(2.1.4)

It is easy to check that (2.1.4) holds, with the same operator U, for any orthonormal basis \(\{\phi _n\}_{n=1}^N\) of \(\mathcal H\), see [11, Eq. 3.1.8].

U thus determines, and is uniquely determined by, a maximally entangled state \(\Psi \).

Given such a state \(\Psi \), and hence U, we may associate to every operator of the form \({\mathbbm {1}} \otimes O\) an operator of the form \( \tilde{O} \otimes {\mathbbm {1}}\) by setting

$$\begin{aligned} \tilde{O} = U O U^{-1}. \end{aligned}$$
(2.1.5)

Suppose \(\phi _n\) are eigenstates of O, with eigenvalues \(\lambda _n\),

$$\begin{aligned} O \phi _n = \lambda _n \phi _n. \end{aligned}$$
(2.1.6)

Then, the states \(\psi _n= U \phi _n\) are eigenstates of \(\tilde{O}\), also with eigenvalues \(\lambda _n\):

$$\begin{aligned} \tilde{O} \psi _n = \lambda _n \psi _n. \end{aligned}$$
(2.1.7)

This implies and is in fact equivalent to the following relationship between the operators O and \(\tilde{O}\):

$$\begin{aligned} ( \tilde{O} \otimes {\mathbbm {1}} - {\mathbbm {1}} \otimes O ) \Psi = 0, \end{aligned}$$
(2.1.8)

directly expressing the fact that, in the state \( \Psi \), \(\tilde{O} \otimes {\mathbbm {1}}\) and \({\mathbbm {1}} \otimes O\) are perfectly correlated.

We may summarize this as follows:

Theorem 2.1

Consider a finite dimensional Hilbert space \(\mathcal H\), of dimension N, and a maximally entangled state \(\Psi \in \mathcal H \otimes \mathcal H\). Then, for any self-adjoint operator O acting on \(\mathcal H\), there exists a self-adjoint operator \(\tilde{O}\) acting on \(\mathcal H\) such that (2.1.8) holds.
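Theorem 2.1 can be checked numerically in a small case. The following is only a sketch under stated assumptions: N = 3, the basis \(\phi _n\) is taken to be the standard basis, the basis \(\psi _n\) is taken to be the columns of a discrete Fourier matrix V, and the operator O is an arbitrary Hermitian matrix; none of these choices come from the text. With these choices the anti-linear U of (2.1.3) is V followed by complex conjugation, so that \(U O U^{-1} = V O^* V^\dagger \). The left factor of the tensor product carries \(\psi _n\) (system 2) and the right factor carries \(\phi _n\) (system 1), as in (2.1.1).

```python
import math

def matmul(A, B):
    """Product of two complex matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def dagger(A):
    """Conjugate transpose."""
    return [[A[j][i].conjugate() for j in range(len(A))] for i in range(len(A[0]))]

def conj(A):
    """Entrywise complex conjugate."""
    return [[z.conjugate() for z in row] for row in A]

def transpose(A):
    return [[A[j][i] for j in range(len(A))] for i in range(len(A[0]))]

def max_abs_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

N = 3
w = complex(math.cos(2 * math.pi / N), math.sin(2 * math.pi / N))

# Assumption: phi_n is the standard basis and psi_n is the n-th column of the
# unitary discrete Fourier matrix V (an arbitrary illustrative choice).
V = [[w ** (j * k) / math.sqrt(N) for k in range(N)] for j in range(N)]

# An arbitrary self-adjoint O acting on system 1 (the right factor).
O = [[1 + 0j, 2 - 1j, 0j],
     [2 + 1j, 0j, 3j],
     [0j, -3j, -1 + 0j]]

# U = V K, with K complex conjugation in the phi basis, so U O U^{-1} = V O* V^dagger.
O_tilde = matmul(matmul(V, conj(O)), dagger(V))

# Psi = N^{-1/2} sum_n psi_n (x) phi_n has components Psi[a][b] = V[a][b] / sqrt(N),
# where a labels the left factor (system 2) and b the right factor (system 1).
Psi = [[V[a][b] / math.sqrt(N) for b in range(N)] for a in range(N)]

left = matmul(O_tilde, Psi)        # components of (O_tilde (x) 1) Psi
right = matmul(Psi, transpose(O))  # components of (1 (x) O) Psi
residual = max_abs_diff(left, right)
print(residual)  # expected to be at rounding-error level
```

The check does not require the \(\phi _n\) to be eigenvectors of O, which reflects the fact that the representation (2.1.4) does not depend on the choice of basis.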

Remarks

  1.

    A simple example of a maximally entangled state is:

    $$\begin{aligned} |\Psi \rangle= & {} \frac{1}{\sqrt{2}} \big (| \uparrow \rangle | \downarrow \rangle -| \downarrow \rangle | \uparrow \rangle \big ), \end{aligned}$$
    (2.1.9)

    where the right factors refer to system 1 and left ones to system 2. That state, according to ordinary quantum mechanics, means that the spin measured on system 1 will have equal probability to be up or down, but is perfectly anti-correlated with the spin measured on system 2.

    In the notation of (2.1.1), one has:

    $$\begin{aligned} \phi _1= | \uparrow \rangle ,\;\; \phi _2 = | \downarrow \rangle , \;\;\psi _1= -|\downarrow \rangle , \;\; \psi _2=|\uparrow \rangle , \end{aligned}$$

    and therefore,

    $$\begin{aligned} U | \uparrow \rangle= & {} -| \downarrow \rangle ,\\ U | \downarrow \rangle= & {} | \uparrow \rangle . \end{aligned}$$

    If one takes

    $$\begin{aligned} O = \left( \begin{array}{ccc} 1 &{}\quad 0 \\ 0 &{}\quad -1 \end{array}\right) \end{aligned}$$
    (2.1.10)

    which corresponds to the spin operator for system 1 and has eigenvectors \(\phi _1\) with eigenvalue 1 and \(\phi _2\) with eigenvalue \(-1\), one computes that

    $$\begin{aligned} {\tilde{O}} = U O U^{-1}= \left( \begin{array}{ccc} -1 &{}\quad 0 \\ 0 &{}\quad 1 \end{array}\right) =-O, \end{aligned}$$
    (2.1.11)

    which means that the spin operators for systems 1 and 2 are perfectly anti-correlated, since \(\tilde{O}\) is minus the spin operator for system 2.

    We will use later the following:

  2.

    Products of maximally entangled states are maximally entangled states: If one has two Hilbert spaces \( \mathcal{H}_1\), \( \mathcal{H}_2\), and two maximally entangled states \(\Psi _i \in \mathcal{H}_i \otimes \mathcal{H}_i\), \(i= 1, 2\), then it is easy to check that the state \(\Psi = \Psi _1 \otimes \Psi _2\) is maximally entangled in \(\mathcal{H} \otimes \mathcal{H}\), where \(\mathcal{H}= \mathcal{H}_1 \otimes \mathcal{H}_2\) (under the canonical identification of \((\mathcal{H}_1 \otimes \mathcal{H}_1) \otimes (\mathcal{H}_2 \otimes \mathcal{H}_2)\) with \( \mathcal{H}\otimes \mathcal{H}\)).
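The singlet example of Remark 1 can be reproduced in a few lines. This is a minimal sketch, assuming the basis ordering \((| \uparrow \rangle , | \downarrow \rangle )\) and the matrix representation \(U O U^{-1} = V O^* V^\dagger \), where the columns of V are \(\psi _1\) and \(\psi _2\); the helper names are ours. It also applies the same map to the other two Pauli matrices, which is not done in the text.

```python
def matmul(A, B):
    """Product of two square complex matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(A):
    n = len(A)
    return [[complex(A[j][i]).conjugate() for j in range(n)] for i in range(n)]

def conj(A):
    return [[complex(z).conjugate() for z in row] for row in A]

# Columns of V are psi_1 = -|down> and psi_2 = |up> in the basis (|up>, |down>).
V = [[0, 1],
     [-1, 0]]

sigma_x = [[0, 1], [1, 0]]
sigma_y = [[0, -1j], [1j, 0]]
sigma_z = [[1, 0], [0, -1]]

def tilde(O):
    """O_tilde = U O U^{-1}, with U equal to V followed by complex conjugation."""
    return matmul(matmul(V, conj(O)), dagger(V))

def is_minus(A, B, tol=1e-12):
    """True if A = -B entrywise."""
    return all(abs(A[i][j] + B[i][j]) < tol for i in range(2) for j in range(2))

results = {name: is_minus(tilde(S), S)
           for name, S in (("x", sigma_x), ("y", sigma_y), ("z", sigma_z))}
print(results)
```

For \(\sigma _z\) this reproduces (2.1.11); the same sign flip for \(\sigma _x\) and \(\sigma _y\) corresponds to the familiar anti-correlation of the singlet state along any axis.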

Let us now see what this notion of maximally entangled state implies for quantum measurements.

Suppose that we have a pair of physical systems, whose states belong to the same finite dimensional Hilbert space \(\mathcal H\). And suppose that the quantum state \(\Psi \) of the pair is maximally entangled, i.e. of the form (2.1.1).

Any observable acting on system 1 is represented by a self-adjoint operator O, which has therefore a basis of eigenvectors. Since the representation (2.1.4) of the state \(\Psi \) is valid in any basis, we may choose, without loss of generality, as the set \(\{\phi _n\}_{n=1}^N\) in (2.1.1) the eigenstates of O. Let \(\lambda _n\) be the corresponding eigenvalues, see (2.1.6).

If one measures that observable O, the result will be one of the eigenvalues \(\lambda _n\), each having equal probability \(\frac{1}{N}\). If the result is \(\lambda _k\), the (collapsed) state of the system after the measurement will be \(\psi _k \otimes \phi _k \). Then, the measurement of observable \(\tilde{O}\), defined by (2.1.5), (2.1.2), on system 2, will necessarily yield the value \(\lambda _k\).

Reciprocally, if one measures an observable \(\tilde{O}\) on system 2 and the result is \(\lambda _l\), the (collapsed) state of the system after the measurement will be \(\psi _l \otimes \phi _l \), and the measurement of observable O on system 1 will necessarily yield the value \(\lambda _l\).

To summarize, we have derived the following consequence of the quantum formalism:

Principle of Perfect Correlations. In any maximally entangled quantum state, of the form (2.1.1), there is, for each operator O acting on system 1, an operator \( \tilde{O}\) acting on system 2 (defined by (2.1.5), (2.1.2)), such that, if one measures the physical quantity represented by operator \( \tilde{O}\) on system 2 and the result is the eigenvalue \(\lambda _l\) of \( \tilde{O}\), then, measuring the physical quantity represented by operator O on system 1 will yield with certainty the same eigenvalue \(\lambda _l\), and vice-versa.Footnote 2
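The principle can be illustrated by computing the joint outcome probabilities from the Born rule. The sketch below assumes N = 3, non-degenerate eigenvalues, \(\phi _n\) equal to the standard basis and \(\psi _n\) equal to the columns of a discrete Fourier matrix V; these choices are illustrative only. The probability of obtaining \(\lambda _k\) on system 1 and \(\lambda _l\) on system 2 is \(|\langle \psi _l \otimes \phi _k , \Psi \rangle |^2\).

```python
import math

N = 3
w = complex(math.cos(2 * math.pi / N), math.sin(2 * math.pi / N))

# Assumption: phi_n is the standard basis, psi_n is the n-th column of V.
V = [[w ** (j * k) / math.sqrt(N) for k in range(N)] for j in range(N)]

# Components of Psi = N^{-1/2} sum_n psi_n (x) phi_n: first index is the
# left factor (system 2), second index is the right factor (system 1).
Psi = [[V[a][b] / math.sqrt(N) for b in range(N)] for a in range(N)]

def joint_probability(k, l):
    """Born probability of lambda_k on system 1 and lambda_l on system 2."""
    amplitude = sum(V[a][l].conjugate() * Psi[a][k] for a in range(N))
    return abs(amplitude) ** 2

table = [[joint_probability(k, l) for l in range(N)] for k in range(N)]
for row in table:
    print([round(p, 6) for p in row])
```

Under these assumptions the table is diagonal with entries 1/N: each outcome occurs with probability 1/N, and the two results always coincide.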

2.2 Schrödinger’s “Theorem”

The following property will be crucial in the rest of the paper.

Locality. If systems 1 and 2 are spatially separated from each other, then measuring an observable on system 1 has no instantaneous effect whatsoever on system 2 and measuring an observable on system 2 has no instantaneous effect whatsoever on system 1.

Finally, we must also define:

Non-contextual value-maps. Let \(\mathcal H\) be a finite dimensional Hilbert space and let \({\mathcal A}\) be the set of self-adjoint operators on \({\mathcal H}\). Suppose \(\mathcal H\) is the quantum state space for a physical system and \({\mathcal A}\) is the set of quantum observables. Suppose there are situations in which there are observables A for which the result of measuring A is determined already, before the measurement. Suppose, that is, that A has, in these situations, a pre-existing value v(A) revealed by measurement and not merely created by measurement. Of course, this implies that for every experiment \(\mathcal{E}_A\) measuring A, the result \(v(\mathcal{E}_A)\) of that experiment, in the situation under consideration, must be v(A). And suppose finally that the situation is such that we have a pre-existing value v(A) for every \(A\in {\mathcal A}\).

We would then have a non-contextual value-map, namely a map \(v: {\mathcal A}\rightarrow \mathbb R\) that assigns the value v(A) to any experiment associated with what is called in quantum mechanics a measurement of an observable A. There can be different ways to measure the same observable. The value-map is called non-contextual because all such experiments, associated with the same quantum observable A, are assigned the same value.

This notion of value-map is not a purely mathematical one, since it involves the notion of an experiment that measures a quantum observable A, which we have not mathematically formalized. However, we shall need only the following obvious purely mathematical consequence of non-contextuality.

A non-contextual value-map has the fundamental property that, if \(A_i\), \(i=1, \dots , n\), are mutually commuting self-adjoint operators on \({\mathcal H}\), \( [A_i, A_j]= 0, \forall i, j =1, \dots , n\), then, if f is a function of n variables and \(B= f(A_1, \dots , A_n)\), then

$$\begin{aligned} v(B)= f(v(A_1), \dots , v(A_n)). \end{aligned}$$
(2.2.1)

It is a well-known property of quantum mechanics that, since all the operators \(A_1, \dots , A_n, B\) commute, they are simultaneously measurable and the result of those measurements must satisfy (2.2.1).

But, and this is what we emphasized in [11], (2.2.1) follows trivially from the non-contextuality of the value-map. Indeed, a valid quantum mechanical way to measure the operator \(B= f(A_1, \dots , A_n)\) is to measure \(A_1, \dots , A_n\) and, denoting the results \(\lambda _1, \dots , \lambda _n\), to regard \(\lambda _B=f(\lambda _1, \dots , \lambda _n)\) as the result of a measurement of B. Since, by the non-contextuality of the map v, all the possible measurements of B must yield the same results, (2.2.1) holds.

Thus, once one has a non-contextual value-map, one does not even need to check (2.2.1).

Now we will use the perfect correlations and locality to establish the existence of a non-contextual value-map v, for a maximally entangled quantum state of the form (2.1.1) or, equivalently, (2.1.4). By the principle of perfect correlations, for any operator O on system 1, there is an operator \(\tilde{O}\) on system 2, defined by (2.1.5), (2.1.2), which is perfectly correlated with O through (2.1.8).

Thus, if we were to measure \(\tilde{O}\), obtaining \(\lambda _l\), we would know that

$$\begin{aligned} v(O)= \lambda _l \end{aligned}$$
(2.2.2)

concerning the result of then measuring O. Therefore, v(O) would pre-exist the measurement of O. But, by the assumption of locality, the measurement of \(\tilde{O}\), associated with the second system, could not have had any effect on the first system, and thus, this value v(O) would pre-exist also the measurement of \(\tilde{O}\) and this would not depend upon whether \(\tilde{O}\) had been measured. Letting O range over all operators on system 1, we see that there must be a non-contextual value-map \(O\rightarrow v(O)\).

To summarize, we have shown:

Schrödinger’s “Theorem”. Let \({\mathcal A}\) be the set of self-adjoint operators on the component Hilbert space \({\mathcal H}\) of a physical system in a maximally entangled state (2.1.1). Then, assuming locality and the principle of perfect correlations, there exists a non-contextual value-map \(v: {\mathcal A}\rightarrow \mathbb R\).

Remark

  • We put “Theorem” in quotation marks because the statement concerns physics and not just mathematics. Its conclusions are nevertheless inescapable assuming the hypothesis of locality and the empirical validity of the principle of perfect correlations, a principle which is, as we showed, a consequence of the quantum formalism.

2.3 The Non-existence of Non-contextual Value-Maps

The problem posed by the non-contextual value-map v whose existence is implied by Schrödinger’s “theorem” is that such maps simply do not exist (and that is a purely mathematical result). Indeed, one has the:

“Theorem”: Non-existence of non-contextual value-maps. Let \({\mathcal A}\) be the set of self-adjoint operators on the Hilbert space \(\mathcal H\) of a physical system. Then there exists no non-contextual value-map \(v: {\mathcal A}\rightarrow \mathbb R\).

This “theorem” is an immediate consequence of the following theorem, since (2.3.1), (2.3.2) are consequences of (2.2.1).Footnote 3

Theorem 2.2

Let \(\mathcal H\) be a finite dimensional Hilbert space of dimension at least three, and let \({\mathcal A}\) be the set of self-adjoint operators on \({\mathcal H}\). There does not exist a map \(v: {\mathcal A}\rightarrow \mathbb R\) such that:

  1. (1)

    \( \; \forall O \in { {\mathcal A}}\),

    $$\begin{aligned} v(O) \;\;\text{ is } \text{ an } \text{ eigenvalue } \text{ of } \;\;O \end{aligned}$$
    (2.3.1)
  2. (2)

    \(\forall O, O' \in { {\mathcal A}}\) with \( [O, O']= OO'-O'O=0\), and for any real valued function f of two real variables,

    $$\begin{aligned} v(f(O, O'))=f(v(O), v(O')). \end{aligned}$$
    (2.3.2)

See [11] for a discussion of the proof of the theorem, which is a consequence of stronger theorems, originally due to Bell [3] and to Kochen and Specker [27], with simplified proofs of Theorem 2.2 due to Mermin [28], and to Peres [31, 32].
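The text defers the proof to [11]. As an illustration only, the sketch below works through the well-known Peres–Mermin square of two-qubit observables (dimension four), in one common arrangement that we assume here; it is not taken from the text and is not claimed to be the exact argument of [11]. Each of the nine operators has eigenvalues \(\pm 1\), the three operators in each row and each column commute, and their product is \(\pm {\mathbbm {1}}\). Conditions (2.3.1) and (2.3.2) would then require values \(\pm 1\) whose row and column products match those signs, and a brute-force search finds no such assignment.

```python
from itertools import product

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]

def kron(A, B):
    """Kronecker product of two 2x2 matrices (a 4x4 matrix)."""
    return [[A[i][j] * B[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def equal(A, B, tol=1e-12):
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

I4 = kron(I2, I2)
MINUS_I4 = [[-z for z in row] for row in I4]

# Assumed layout of the Peres-Mermin square (one common convention).
square = [
    [kron(X, I2), kron(I2, X), kron(X, X)],
    [kron(I2, Y), kron(Y, I2), kron(Y, Y)],
    [kron(X, Y), kron(Y, X), kron(Z, Z)],
]
rows = square
cols = [[square[r][c] for r in range(3)] for c in range(3)]

def line_sign(ops):
    """Sign s such that the product of the three operators equals s * identity."""
    P = matmul(matmul(ops[0], ops[1]), ops[2])
    if equal(P, I4):
        return 1
    if equal(P, MINUS_I4):
        return -1
    raise ValueError("product is not +/- identity")

def pairwise_commute(ops):
    return all(equal(matmul(A, B), matmul(B, A)) for A in ops for B in ops)

all_commute = all(pairwise_commute(line) for line in rows + cols)
row_signs = [line_sign(line) for line in rows]
col_signs = [line_sign(line) for line in cols]

# Search all 2^9 assignments of +/-1 values to the nine observables.
solutions = 0
for values in product([1, -1], repeat=9):
    v = [values[0:3], values[3:6], values[6:9]]
    rows_ok = all(v[r][0] * v[r][1] * v[r][2] == row_signs[r] for r in range(3))
    cols_ok = all(v[0][c] * v[1][c] * v[2][c] == col_signs[c] for c in range(3))
    if rows_ok and cols_ok:
        solutions += 1

print(all_commute, row_signs, col_signs, solutions)
```

The reason no assignment exists is a parity count: multiplying the three row constraints gives \(+1\) for the product of all nine values, while multiplying the three column constraints gives \(-1\).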

2.4 Nonlocality

The conclusions of Schrödinger’s “theorem” and of the “Theorem” on the non-existence of non-contextual value-maps plainly contradict each other. So, the assumptions of at least one of them must be false. Moreover, the stronger Theorem 2.2 is a purely mathematical result. To derive Schrödinger’s “theorem,” we assume only the perfect correlations and locality. The perfect correlations are an immediate consequence of quantum mechanics. The only remaining assumption is locality. Hence we can deduce:

Nonlocality “Theorem”. The locality assumption is false.

See [11, Sects. 5, 7] for a discussion of the relation between this proof and other proofs of nonlocality.

3 The Original EPR Argument

Let us now turn to the original EPR argument [18] and explain its connection to the notion of locality. EPR gave both a general argument and a specific example.

3.1 EPR’s General Setup

For their general argument, EPR considered a system of two particles, 1 and 2, in one dimension, that may be far apart and a physical quantity represented by a self-adjoint operator O that acts on system 1. We shall assume that O has an orthonormal basis of eigenvectors \( \phi _n (x_1)\) with eigenvalues \(\lambda _n\).

One can then write the joint state of both particles as:

$$\begin{aligned} \Psi (x_1, x_2)= & {} \sum _{n=1}^\infty \psi _n(x_2) \phi _n (x_1), \end{aligned}$$
(3.1.1)

where \(\psi _n(x_2)\) are the (\(x_2\) dependent) coefficients of that expansion.Footnote 4

After a measurement of O on system 1, if the result is \(\lambda _l\), then the state collapses to \(\psi _l(x_2) \phi _l (x_1) \), i.e. \(\phi _l (x_1)\) for the first particle and \(\psi _l(x_2)\) for the second.

If, on the other hand, one considers a physical quantity represented by an operator \(O'\) that acts on system 1, and one assumes that \(O'\) has eigenvectors \( \phi '_s (x_1)\) and eigenvalues \(\mu _s\), one can write the joint state as:

$$\begin{aligned} \Psi (x_1, x_2)= & {} \sum _{s=1}^\infty \psi '_s(x_2) \phi '_s (x_1) \end{aligned}$$
(3.1.2)

After a measurement of \(O'\) on system 1, if the result is \(\mu _k\), then the state collapses to \(\psi '_k(x_2) \phi '_k (x_1)\), i.e. \(\phi '_k (x_1)\) for the first particle and \(\psi '_k(x_2)\) for the second.

We will discuss the implications of that observation after giving the concrete examples of the operators considered by EPR.

3.2 The Example of Position and Momentum

For their specific example, EPR introduced a two particle wave functionFootnote 5:

$$\begin{aligned} \Psi _{EPR} (x_1, x_2)= & {} \int _{-\infty }^{\infty } \exp (i(x_1-x_2+x_0)p) dp \end{aligned}$$
(3.2.1)

(putting \(\hbar =1\)). This can be written, by analogy with (3.1.1), i.e. with sums replaced by integrals, as:

$$\begin{aligned} \Psi _{EPR} (x_1, x_2)= & {} \int _{-\infty }^{\infty } \psi _p(x_2) \phi _p (x_1) dp \end{aligned}$$
(3.2.2)

with: \( \phi _p (x_1)= \exp (ix_1p) \), and \(\psi _p(x_2)= \exp (-i(x_2-x_0)p)\).

It will be useful to introduce the Fourier transform of a wave function \(\Psi \):

$$\begin{aligned} \widehat{\Psi }( p_1, p_2)= \frac{1}{2\pi } \int \exp (-i (p_1 x_1+p_2 x_2))\Psi ( x_1, x_2) dx_1 dx_2, \end{aligned}$$
(3.2.3)

whose inverse is:

$$\begin{aligned} \Psi ( x_1, x_2) = \frac{1}{2\pi } \int \exp (i (p_1 x_1+p_2 x_2)) \widehat{\Psi }( p_1, p_2) dp_1 dp_2. \end{aligned}$$
(3.2.4)

EPR took the operator O to be the momentum operator

$$P_1=-i\frac{d}{dx_1} $$

acting on the first particle and on a suitable set of functions (see [33, Chap. VIII] for precise definitions).

We know that \( \phi _p (x_1)= \exp (ix_1p) \) is a (generalized) eigenstate of \(P_1\) of eigenvalue p, and \(\psi _p(x_2)= \exp (-i(x_2-x_0)p)\) is a (generalized) eigenstate of eigenvalue \(-p\) of the momentum operator

$$P_2=-i\frac{d}{dx_2} $$

acting on the second particle.

Alternatively, \(P_j\), \( j=1,2\), can be defined by its action on \(\widehat{\Psi }( p_1, p_2)\):

$$\begin{aligned} P_j \Psi ( x_1, x_2) = \frac{1}{2\pi } \int \exp (i (p_1 x_1+p_2 x_2)) p_j \widehat{\Psi }( p_1, p_2) dp_1 dp_2 \;,\quad j=1,2\;. \end{aligned}$$
(3.2.5)

EPR took the operator \(O'\) to be the position operator \(Q_1=x_1\) acting on the first particle.

Using a standard identity for distributions (\( \int _{-\infty }^{\infty } \exp (ixp) dp = 2\pi \delta (x)\)) one can write the state (3.2.1), as:

$$\begin{aligned} \Psi _{EPR} (x_1, x_2)= & {} 2\pi \delta (x_1-x_2+x_0) \nonumber \\= & {} 2\pi \int _{-\infty }^{\infty } \delta (x-x_2+x_0) \delta (x_1-x) dx \nonumber \\= & {} \int _{-\infty }^{\infty } \psi '_x(x_2) \phi '_x (x_1) dx, \end{aligned}$$
(3.2.6)

with \(\psi '_x(x_2)= {\sqrt{2\pi }} \delta (x-x_2+x_0) \) and \(\phi '_x(x_1)= {\sqrt{2\pi }} \delta (x_1-x)\). The last formula is analogous to (3.1.2).

The (generalized) eigenfunctions of the operator \(Q_1=x_1\) are \(\phi '_x(x_1)= {\sqrt{2\pi }}\delta (x_1-x)\), with eigenvalue x, and \(\psi '_x(x_2)= {\sqrt{2\pi }} \delta (x-x_2+x_0) \) is a (generalized) eigenvector of the operator \(Q_2=x_2\), with eigenvalue \(x+x_0\).

Therefore, depending on whether we choose to measure the operator O or \(O'\) on the first particle, one can produce two different states, \(\psi _p(x_2)= \exp (-i(x_2-x_0)p)\) and \(\psi '_x(x_2)= {\sqrt{2\pi }} \delta (x-x_2+x_0)\), for the second particle, which can be, in principle, as far as one wants from the first one.

Moreover, the states \(\psi _p(x_2)= \exp (-i(x_2-x_0)p)\) and \(\psi '_x(x_2)= {\sqrt{2\pi }}\delta (x-x_2+x_0)\) are (generalized) eigenfunctions of two non-commuting operators, \(P_2\) and \(Q_2\).

3.3 The Conclusions of the EPR Paper by EPR

Since EPR assumed no actions at a distance, they concluded that the values of two non commuting observables, like \(P_2\) and \(Q_2\), for the second particle, far away from where the measurements on the first particle take place, must have “simultaneous reality” when the system is in the quantum state (3.2.1). Thus, say EPR, quantum mechanics, i.e., the description provided by the state (3.2.1), is an incomplete description of physical reality.

But they could have made a simpler argument: considering only one variable is enough to show that quantum mechanics is incomplete. Indeed, one can know the position of the second particle by measuring the position of the first one. If that measurement, being made far away from the second particle, does not affect the state of the second particle, then the position of that second particle (which is left undetermined by the state (3.2.1)) must exist independently of any measurement on the first particle.

And, since one can reason by exchanging the two particles, one can also know the position of the first particle by measuring the one of the second particle, so that the position of the first particle must also exist independently of any measurements.

Of course, they could have made the same argument about the momentum of either particle, but there was no need to bring in both quantities.

3.4 The Conclusions of the EPR Paper by Einstein

In a June 19, 1935 letter to Schrödinger, Einstein complained that the EPR paper had been written by Podolsky “for reasons of language” and that the main point “was buried, so to speak, by erudition” [19].

Then Einstein explains what is, for him, the main point: in the notation used here, see (3.1.1), if one measures quantity O on system 1, the state collapses to some state \(\psi _l(x_2)\) for the second particle. Similarly, if one measures a quantity \(O'\) on system 1, see (3.1.2), the state collapses to some different state \(\psi '_k(x_2)\) for the second particle.

For the state \(\Psi _{EPR}\), (3.2.2), (3.2.6) one obtains either a state of the form \(\psi _p(x_2)= \exp (-i(x_2-x_0)p)\), if one measures the momentum of the first particle or a state of the form \(\psi '_x(x_2)= {\sqrt{2\pi }}\delta (x-x_2+x_0)\), if one measures the position of the first particle.

The fact that one can obtain two different states for the second particle by acting on the first particle, far away from the second one, proves that the wave function description in quantum mechanics is incomplete (assuming of course locality) since a more complete description would be provided by both states together.

Einstein said that “he could not care less” [21, p. 38] about the fact that those states, \(\psi _p(x_2)= \exp (-i(x_2-x_0)p)\) and \(\psi '_x(x_2)= {\sqrt{2\pi }}\delta (x-x_2+x_0)\), are or are not eigenstates of some observable (related to the second particle).

This is indeed different from, and simpler than, the conclusion of the EPR paper, but it is still more complicated than the argument that we gave in Sect. 3.3.

3.5 Schrödinger’s Extension of EPR

What Schrödinger did in his 1935 paperFootnote 6 [34] and in [35, 36], was to reflect on the EPR paper [18]. He introduced what we call here maximally entangled states and concluded that the value of every observable O for the first system can be determined by the measurement of the corresponding observable \(\tilde{O}\) on the second system, distant from the first one. That puzzled him a lot. Of course, like EPR, Schrödinger always assumed locality.

To illustrate his puzzlement, Schrödinger used the following example. Let O be the energy of the harmonic oscillator, \(O= \frac{1}{2} ( p^2+\omega ^2 x^2)\) with \(p=-i \frac{d}{dx}\). It is well known that the eigenvalues of the operator O are of the form \(\omega (n+\frac{1}{2})\), \(n= 0, 1, 2, \dots \). But, argued Schrödinger, if those values can be determined by measuring a similar operator \(\tilde{O}\) acting on a distant system, they must pre-exist the measurement of O, and that should hold true for every value of \(\omega \). But, by the EPR reasoning, the values of the position x and the momentum p of the first system can also be determined by measuring either the operator \( \tilde{x}\) or the operator \(\tilde{p}\) on the second system, so the values of x and p must also pre-exist their measurements. But it is impossible for the quantity \(\frac{1}{2} (p^2+\omega ^2 x^2)\) to belong to the set \(\{ \omega (n+\frac{1}{2}) | n= 0, 1, 2, \dots \}\), for arbitrary values of \(\omega \) and any given values of x and p.
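Schrödinger’s observation can be made concrete with numbers. The sketch below assumes hypothetical pre-existing values x = p = 1 (chosen only for illustration, with \(\hbar = 1\)) and tests, for a few values of \(\omega \), whether \(\frac{1}{2} (p^2+\omega ^2 x^2)\) lies in the set \(\{ \omega (n+\frac{1}{2}) \}\). The function names are ours.

```python
import math

def oscillator_value(x, p, omega):
    """The number 1/2 (p^2 + omega^2 x^2) obtained from given values x and p."""
    return 0.5 * (p * p + omega * omega * x * x)

def is_allowed(value, omega, tol=1e-9):
    """True if value = omega * (n + 1/2) for some n = 0, 1, 2, ..."""
    n = value / omega - 0.5
    return n > -tol and abs(n - round(n)) < tol

x, p = 1.0, 1.0  # hypothetical pre-existing values, for illustration only

# For x = p = 1 the condition reads omega^2 - (2n + 1) omega + 1 = 0; for n = 1
# one root is (3 + sqrt(5)) / 2, so this isolated frequency does satisfy it.
special_omega = (3 + math.sqrt(5)) / 2

checks = {omega: is_allowed(oscillator_value(x, p, omega), omega)
          for omega in (0.5, 1.0, 2.0, special_omega)}
for omega, ok in checks.items():
    print(round(omega, 6), ok)
```

For fixed x and p the condition holds only at isolated frequencies, which is the sense in which it cannot hold for arbitrary \(\omega \).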

It is interesting to compare Schrödinger’s attitude to that of von Neumann a little before 1935 [39] (von Neumann’s book was published in German in 1932 but translated into English only in 1955); von Neumann proved a “no hidden variable theorem” similar in its conclusion to our Theorem 2.2, but by making the much stronger assumption that (2.3.2) holds even for non-commuting operators O and \(O'\), at least for the function \(f(x,y)= x+y\), and he concluded that the “value-map” cannot exist. If one assumes that (2.3.2) holds for non-commuting operators, then it is very simple to prove the non-existence of a value-map. Take \(O=\frac{1}{\sqrt{2}} \sigma _x\), \(O'= \frac{1}{\sqrt{2}} \sigma _y\), where \(\sigma _x\) and \(\sigma _y\) are the Pauli matrices corresponding to the spin along the x and y axes. Then \(O+O'= \frac{\sigma _x + \sigma _y}{\sqrt{2}} \) corresponds to the spin at an angle of \(45^\circ \) between the x and y axes. All the Pauli matrices have eigenvalues equal to \(\pm 1\) and so does \(O+O'\). Thus \(v(O)= v(O') = \pm \frac{1}{\sqrt{2}}\), and we have \(v(O)+ v(O') = \pm {\sqrt{2}}\) or 0. But we also have \(v(O+O')= v\big ((\sigma _x + \sigma _y)/{\sqrt{2}}\big )= \pm 1\). Thus (2.3.2) cannot hold for this choice of O and \(O'\) and \(f(x,y)= x+y\).
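The arithmetic of this spin example is easy to verify. The following sketch uses the closed-form eigenvalues of a 2×2 Hermitian matrix; the helper function and variable names are ours.

```python
import math

def eigenvalues_2x2_hermitian(a, b, d):
    """Eigenvalues of [[a, b], [conj(b), d]] with a and d real."""
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + abs(b) ** 2)
    return (mean - radius, mean + radius)

s = 1 / math.sqrt(2)

# O = sigma_x / sqrt(2) = [[0, s], [s, 0]]
eig_O = eigenvalues_2x2_hermitian(0.0, s, 0.0)
# O' = sigma_y / sqrt(2) = [[0, -i s], [i s, 0]]
eig_Op = eigenvalues_2x2_hermitian(0.0, -1j * s, 0.0)
# O + O' = [[0, s - i s], [s + i s, 0]]
eig_sum = eigenvalues_2x2_hermitian(0.0, s - 1j * s, 0.0)

# All values v(O) + v(O') that additivity would allow.
sums = sorted({round(u + v, 12) for u in eig_O for v in eig_Op})

# Additivity would need one of these sums to be an eigenvalue of O + O'.
consistent = any(abs(t - e) < 1e-9 for t in sums for e in eig_sum)

print(eig_O, eig_Op)
print(eig_sum)
print(sums, consistent)
```

The possible sums are \(\pm \sqrt{2}\) and 0, while the eigenvalues of \(O+O'\) are \(\pm 1\), so `consistent` comes out false, in line with the argument above.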

If Schrödinger had reasoned like von Neumann he would also have derived a “no hidden variable theorem,” using his example of the harmonic oscillator: Indeed, if \(O = \frac{1}{2} (p^2+\omega ^2 x^2)\), and one applies (2.3.2) even to non-commuting operators, one gets \(v(O)= \frac{1}{2} (v(p)^2+\omega ^2 v(x)^2)= \omega (n+\frac{1}{2})\) for some \(n= 0, 1, 2, \dots \), which, as Schrödinger observed, would be impossible for arbitrary v(p), v(x) and \(\omega \). But Schrödinger’s goal was not to prove that a value-map was impossible, since the point of his “theorem” was to show that it existed (assuming locality of course). He was just baffled by the situation: recognizing that this relationship between values suggested by the form of \(O = \frac{1}{2} (p^2+\omega ^2 x^2)\) could not always hold, he wondered what relationship, if any, might exist among the relevant values. Of course, had Schrödinger made the (unwarranted) assumption of von Neumann and applied (2.3.2) to non-commuting operators, he would have been even more baffled, since he would probably have been led to question the locality assumption.

Finally, note that in 1966, much later than 1935, John Bell constructed in [3] an explicit counter-example to von Neumann’s conclusions, by giving a simple example of a “hidden variables theory” that reproduces the quantum mechanical results for a single spin operator (but, of course, without satisfying (2.3.2) for non-commuting operators). Bohmian mechanics (see Sect. 5) also provides a counter-example to von Neumann’s conclusions, but a more comprehensive one.

3.6 A Regularized EPR State

A way to avoid dealing with generalized functions or distributions such as (3.2.1), (3.2.6) is to put a cutoff both in the spatial and the momentum variables, x and p. A convenient way to do that is to require that x take values in a finite (but arbitrarily large) box on a lattice of (arbitrarily small) spacing a, which amounts to putting a cutoff in the momentum variable p.

So, let \(x\in \Lambda _a=[-L, L] \cap a\mathbb Z\), or \(x= na, n \in \mathbb Z, |n| \le M\), with \(M= [\frac{L}{a}]\), and \([\cdot ]\) denoting the integer part.

Let \(\widehat{\Lambda }_a\) be the dual of \(\Lambda _a\):

$$\widehat{\Lambda }_a= \left\{ p= \frac{2\pi k}{a(2M+1)}, k \in \mathbb Z, |k| \le M \right\} .$$

Then, one has the orthogonality relation: \(\forall x \in \Lambda _a\)

$$\begin{aligned} \sum _{p \in \widehat{\Lambda }_a} \exp (\pm ixp) = {\sqrt{2M+1}} \delta _{a, L} (x)\equiv (2M+1) \delta _{x, 0}, \end{aligned}$$
(3.6.1)

where \( \delta _{x, 0}\) is the Kronecker delta.

And, \(\forall p \in \widehat{\Lambda }_a\),

$$\begin{aligned} \sum _{x \in \Lambda _a} \exp (\pm ixp) = {\sqrt{2M+1}} \delta _{a, L} (p)\equiv (2M+1) \delta _{p, 0}. \end{aligned}$$
(3.6.2)
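The orthogonality relations (3.6.1) and (3.6.2) are easy to check numerically on a small lattice. The following is a minimal sketch, not part of the original argument: the function names and the way the lattice is parametrized (directly by the spacing `a` and the integer `M`, rather than by `L`) are assumptions made for illustration.

```python
import cmath
import math


def lattice(a, M):
    """Points x = n a of Lambda_a, with |n| <= M."""
    return [n * a for n in range(-M, M + 1)]


def dual_lattice(a, M):
    """Points p = 2 pi k / (a (2M + 1)) of the dual lattice, with |k| <= M."""
    return [2.0 * math.pi * k / (a * (2 * M + 1)) for k in range(-M, M + 1)]


def sum_over_p(x, a, M, sign=+1):
    """Left-hand side of (3.6.1) evaluated at the lattice point x."""
    return sum(cmath.exp(sign * 1j * x * p) for p in dual_lattice(a, M))


def sum_over_x(p, a, M, sign=+1):
    """Left-hand side of (3.6.2) evaluated at the dual lattice point p."""
    return sum(cmath.exp(sign * 1j * x * p) for x in lattice(a, M))


def check_orthogonality(a, M, tol=1e-9):
    """True if both sums equal (2M + 1) times a Kronecker delta at 0."""
    size = 2 * M + 1
    for sign in (+1, -1):
        for x in lattice(a, M):
            expected = size if x == 0 else 0
            if abs(sum_over_p(x, a, M, sign) - expected) > tol:
                return False
        for p in dual_lattice(a, M):
            expected = size if p == 0 else 0
            if abs(sum_over_x(p, a, M, sign) - expected) > tol:
                return False
    return True
```

For instance, `check_orthogonality(0.5, 6)` runs through all 13 lattice points and all 13 dual lattice points for both signs of the exponent.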

Let, \(\forall x_1, x_2, x_0 \in \Lambda _a\),

$$\begin{aligned} \Psi _{EPR}^{a, L} (x_1, x_2) =\sum _{p \in \widehat{\Lambda }_a} \exp (i(x_1-x_2+x_0)p), \end{aligned}$$
(3.6.3)

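where the sum \(x_1-x_2+x_0\) is taken modulo \(a(2M+1)\), the period of the exponentials \(\exp (ixp)\), \(p \in \widehat{\Lambda }_a\), on the lattice \(\Lambda _a\).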

Using (3.6.1),

$$\begin{aligned} \Psi _{EPR}^{a, L} (x_1, x_2) = \sqrt{2M+1} \delta _{a, L} (x_1-x_2+x_0) \end{aligned}$$
(3.6.4)

can be written as:

$$\begin{aligned} \Psi _{EPR}^{a, L} (x_1, x_2) = \sum _{x \in \Lambda _a} \delta _{a, L} (x-x_2+x_0) \delta _{a, L} (x_1-x). \end{aligned}$$
(3.6.5)

One can also introduce the finite Fourier transform:

$$\begin{aligned} \widehat{\Psi }(p_1, p_2) = \frac{1}{{2M+1}}\sum _{x_1, x_2 \in \Lambda _a} \exp (-i(x_1p_1+x_2p_2)) \Psi (x_1, x_2) \end{aligned}$$
(3.6.6)

whose inverse is:

$$\begin{aligned} \Psi (x_1, x_2) = \frac{1}{ {2M+1}} \sum _{p_1, p_2 \in \widehat{\Lambda }_a} \exp (i(x_1p_1+x_2p_2)) \widehat{\Psi }(p_1, p_2). \end{aligned}$$
(3.6.7)

The analogues of the operators \(P_1\), \(P_2\), \(Q_1\), \(Q_2\) of Sect. 3.2 are:

$$\begin{aligned} P_j \Psi (x_1, x_2)= \frac{1}{ {2M+1}} \sum _{p_1, p_2 \in \widehat{\Lambda }_a} \exp (i(x_1p_1+x_2p_2)) p_j \widehat{\Psi }(p_1, p_2), \quad j=1,2, \end{aligned}$$
(3.6.8)

and

$$\begin{aligned} Q_j \Psi (x_1, x_2)= x_j \Psi (x_1, x_2), \quad j=1,2. \end{aligned}$$
(3.6.9)

These operators have proper (not generalized) eigenvectors:

$$\begin{aligned} P_j \exp (i(x_1p_1+x_2p_2))= p_j \exp (i(x_1p_1+x_2p_2)) \end{aligned}$$
(3.6.10)

and

$$\begin{aligned} Q_j \delta _{a, L} (x_1 -x_{0, 1}) \delta _{a, L} (x_2 -x_{0, 2})= x_{0, j} \delta _{a, L} (x_1 -x_{0, 1}) \delta _{a, L} (x_2 -x_{0, 2}). \end{aligned}$$
(3.6.11)

Thus, if one applies the collapse rule for the measurement of the observable \(P_1\) to \(\Psi _{EPR}^{a, L} (x_1, x_2)\), when the observed value is p, the resulting state will be proportional to \( \exp (i(x_1-x_2+x_0)p)\), meaning that the state of the second particle will be proportional to \(\exp (-i(x_2-x_0)p)\). And, if one applies the collapse rule for the measurement of the observable \(Q_1\) to \(\Psi _{EPR}^{a, L} (x_1, x_2)\), when the observed value is x, the resulting state will be proportional to \( \delta _{a, L} (x-x_2+x_0) \delta _{a, L} (x_1-x)\), meaning that the state of the second particle will be proportional to \(\delta _{a, L} (x-x_2+x_0)\).
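The two collapse statements above can be checked directly on a small lattice. The sketch below is only an illustration under stated assumptions: the constants `A`, `M`, `N0` (so that \(x_0\) equals `N0 * A`) and all function names are hypothetical choices, lattice points are labelled by integers, and collapse is implemented as an unnormalized projection onto the eigenvector of \(Q_1\) or \(P_1\).

```python
import cmath
import math

A, M = 0.5, 4      # illustrative lattice spacing and half-size (assumed values)
N = 2 * M + 1      # number of lattice points
N0 = 2             # x_0 = N0 * A, an arbitrary illustrative shift


def momentum(k):
    """Dual lattice point p = 2 pi k / (a (2M + 1))."""
    return 2.0 * math.pi * k / (A * N)


def psi_epr(n1, n2):
    """Psi_EPR at x_1 = n1 A, x_2 = n2 A, summed over the dual lattice as in (3.6.3)."""
    x = (n1 - n2 + N0) * A
    return sum(cmath.exp(1j * x * momentum(k)) for k in range(-M, M + 1))


def collapse_on_q1(n1):
    """Unnormalized state of particle 2 after finding Q_1 = n1 A."""
    return [psi_epr(n1, n2) for n2 in range(-M, M + 1)]


def collapse_on_p1(k):
    """Unnormalized state of particle 2 after finding P_1 = momentum(k):
    the component of Psi_EPR along exp(i x_1 p) in the variable x_1."""
    p = momentum(k)
    return [
        sum(cmath.exp(-1j * n1 * A * p) * psi_epr(n1, n2) for n1 in range(-M, M + 1))
        for n2 in range(-M, M + 1)
    ]
```

With these values, `collapse_on_q1(1)` vanishes except at the single site where \(x_1-x_2+x_0\) is zero modulo the lattice period, and `collapse_on_p1(k)` is proportional, site by site, to \(\exp (-i(x_2-x_0)p)\).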

4 Proof of Nonlocality Using the EPR Variables

Given a state like (3.2.1), (3.2.6), we can almost repeat the arguments of Sect. 2 in order to prove nonlocality. First observe that one has an analogue of Schrödinger’s theorem. Consider a generalized state for four particles in one dimension:

$$\begin{aligned} \delta (x_1-x_3+x_0) \delta (x_2-x_4+x_0), \end{aligned}$$
(4.1)

which is just the product of two copies of the EPR state (up to a \(4\pi ^2\) factor, see (3.2.6)), one for the pair of particles (1, 3), the other for the pair of particles (2, 4). Alternatively, one may regard this as a state of two particles in two dimensions, with coordinates \((x_1, x_2)\) and \((x_3, x_4)\). In our previous notation, system 1 will consist of particles 1 and 2 and system 2 will consist of particles 3 and 4.Footnote 7

One may also replace that state by its regularized version, see (3.6.4):

$$\begin{aligned} \delta _{a, L} (x_1-x_3+x_0) \delta _{a, L} (x_2-x_4+x_0). \end{aligned}$$
(4.2)

By Remark 2 in Sect. 2.1, the state (4.2) is maximally entangled and so the state (4.1) is also (formally) maximally entangled.Footnote 8

We need to introduce standard operators \(Q_1\), \(Q_2\), \(Q_3\), \(Q_4\), that act as multiplication on a suitable set of functions in \(L^2(\mathbb R^4)\):

$$\begin{aligned} Q_j \Psi ( x_1, x_2, x_3, x_4)= x_j \Psi ( x_1, x_2, x_3, x_4)\;,\quad j=1,2,3,4\;, \end{aligned}$$
(4.3)

and operators \(P_1\), \(P_2\), \(P_3\), \(P_4\) that act by differentiation on a suitable set of functions in \(L^2(\mathbb R^4)\):

$$\begin{aligned} P_j \Psi ( x_1, x_2, x_3, x_4)= -i\frac{\partial }{\partial x_j} \Psi ( x_1, x_2, x_3, x_4)\;,\quad j=1,2,3,4\;. \end{aligned}$$
(4.4)

Or, using the Fourier transform (3.2.3) of \(\Psi \) (for four variables):

$$\begin{aligned}&P_j \Psi ({x_{1}, x_{2}, x_{3}, x_{4}}) = \nonumber \\&\frac{1}{(2\pi )^2} \int \exp (i (p_1 x_1+p_2 x_2+p_3 x_3+p_4 x_4)) p_j \widehat{\Psi }( p_1, p_2, p_3, p_4) dp_1 dp_2 dp_3 dp_4,\qquad \quad \end{aligned}$$
(4.5)

\(\text{ for } \quad j=1,2,3,4 \).

Consider the eight operators \(Q_1\), \(Q_2\), \(Q_3\), \(Q_4\), \(P_1\), \(P_2\), \(P_3\), \(P_4\), defined by (4.3) and (4.4), (4.5).

Let \({\mathcal B}\) be the set of products of analytic functions of one of the operators \(Q_1\), \(Q_2\), \(P_1\), \(P_2\) defining a self-adjoint operator, and let \(\tilde{\mathcal B}\) be the set of sums of products of analytic functions of one of the operators \(Q_3\), \(Q_4\), \(P_3\), \(P_4\) defining a self-adjoint operator.

Given the maximally entangled state (4.1), for every operator \(\tilde{O}\in \tilde{\mathcal B}\), there is a corresponding (in the sense of the Principle of Perfect Correlations) operator \(O \in {\mathcal B}\), and vice-versa. (For \(x_0=0\), O is obtained by changing in \(\tilde{O}\) the index 3 to 1 and the index 4 to 2). And, by Schrödinger’s theorem, assuming locality, there is a non-contextual value-map \(v: { {\mathcal B}} \rightarrow {\mathbb R}\) that satisfies (2.2.1) and therefore also the property (2.3.2).

However, this is contradicted by a theorem of Clifton [14], proven in the appendix.

Theorem 4.1

Non-existence of pre-existing values for positions and momenta.

Consider the set of analytic functions of one of the operators \(Q_1\), \(Q_2\), \(P_1\), \(P_2\). And let \({\mathcal B}\) be the set of products of such functions defining a self-adjoint operator. Then, there does not exist a map

$$\begin{aligned} v: { {\mathcal B}} \rightarrow {\mathbb R} \end{aligned}$$
(4.6)

such that:

  1. For any real-valued function f of a real variable,

    $$\begin{aligned} v(f( O))=f (v( O)). \end{aligned}$$
    (4.7)

  2. \(\forall O, O' \in { {\mathcal B}}\) with \( [O, O']= OO'-O'O=0\), (2.3.2) holds for \(f(x,y)=xy\):

    $$\begin{aligned} v(O O')=v(O) v(O'). \end{aligned}$$
    (4.8)

In particular, there cannot exist a non-contextual value-map.

So, combining the EPR argument with the previous theorem, we again establish nonlocality, without using Bell’s inequalities.

The logic is the same as in Sect. 2:

  1. EPR show that the perfect correlations plus locality imply that the values of some physical quantities (the values v(O) of the operators \(O\in {\mathcal A}\) in Sect. 2.3 or the operators \(O\in {\mathcal B}\) here), must exist independently of whether one measures them or not, and that defines a non-contextual value-map.

  2. Theorems 2.2 or 4.1 show that merely assuming the existence of such a map leads to a contradiction.

Therefore the locality assumption is false!

5 What Happens in Bohmian Mechanics?

In Bohmian mechanics, or pilot-wave theory, the complete state of a closed physical system composed of N particles is a pair (|quantum state>, \(\mathbf{X})\), where |quantum state> is the usual quantum state (given by the tensor product of wave functions with some possible internal states), and \(\mathbf{X}= (X_1,\ldots , X_N)\) is the configuration representing the positions of the particles (that exist, independently of whether one “looks” at them or one measures them; each \(X_i\in \mathbb R^3\)).Footnote 9

These positions are the “hidden variables” of the theory, in the sense that they are not included in the purely quantum description |quantum state>, but they are not at all hidden: it is only the particles’ positions that one detects directly, in any experiment (think, for example, of the impacts on the screen in the two-slit experiment). So the expression “hidden variables” is really a misnomer, at least in the context of Bohmian mechanics.

Both objects, the quantum state and the particles’ positions, evolve according to deterministic laws, the quantum state guiding the motion of the particles. Indeed, the time evolution of the complete physical state is composed of two laws (we consider, for simplicity, spinless particles):

  1. The wave function evolves according to the usual Schrödinger equation.

  2. The particle positions \(\mathbf{X}=\mathbf{X} (t)\) evolve in time according to a guiding equation determined by the quantum state: their velocity is a function of the wave function. If one writesFootnote 10:

    $$\Psi (x_1, \dots , x_N)=R (x_1, \dots , x_N)e^{iS (x_1, \dots , x_N)}, $$

    then:

    $$\begin{aligned} \frac{ d X_k (t)}{dt}= \displaystyle \nabla _k S (X_1(t),\ldots ,X_N(t)), \end{aligned}$$
    (5.1)

    where \( \nabla _k\) is the gradient with respect to the coordinates of the kth particle.

In order to understand why Bohmian mechanics reproduces the usual quantum predictions, one must use a fundamental consequence of that dynamics, equivariance: If the probability density \(\rho _{t_0}(\mathbf{x})\) for the initial configuration \( \mathbf{X}_{t_0}\) is given by \(\rho _{t_0}(\mathbf{x}) = |\Psi (\mathbf{x}, t_0)|^2\), then the probability density for the configuration \(\mathbf{X}_t\) at any time t is given by

$$\begin{aligned} \rho _t (\mathbf{x})= |\Psi (\mathbf{x}, t)|^2, \end{aligned}$$
(5.2)

where \(\Psi (\mathbf{x}, t)\) is a solution to Schrödinger’s equation. This follows easily from equation (5.1).
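The step from (5.1) to (5.2) can be spelled out as follows. This is a standard sketch, written under the assumption of units with \(\hbar =1\), unit masses and a real potential V, so that Schrödinger’s equation reads \(i\partial _t \Psi = -\frac{1}{2}\sum _k \Delta _k \Psi + V\Psi \). Inserting \(\Psi =Re^{iS}\) and taking the imaginary part gives a continuity equation for \(|\Psi |^2=R^2\):

$$\begin{aligned} \frac{\partial }{\partial t} |\Psi |^2 + \sum _{k=1}^N \nabla _k \cdot \left( |\Psi |^2\, \nabla _k S \right) = 0. \end{aligned}$$

On the other hand, any probability density \(\rho _t\) of configurations transported by the velocity field \(\nabla _k S\) of (5.1) satisfies

$$\begin{aligned} \frac{\partial }{\partial t} \rho _t + \sum _{k=1}^N \nabla _k \cdot \left( \rho _t\, \nabla _k S \right) = 0. \end{aligned}$$

Both densities thus obey the same equation, so that if they coincide at time \(t_0\), they coincide at all times, which is (5.2).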

Because of equivariance, the quantum predictions for the results of measurements of any quantum observable are obtained if one assumes that the initial density satisfies \(\rho _{t_0} (\mathbf{x}) = |\Psi (\mathbf{x}, t_0)|^2\). The assertion that configurational probabilities at any time \(t_0\) are given by this “Born rule” is called the quantum equilibrium hypothesis. The justification of the quantum equilibrium hypothesis—and, indeed, a clear statement of what it actually means—is a long story, too long to be discussed here (see [15]).

In Bohmian mechanics, particles have a velocity at all times and therefore they have what we would be inclined to call a momentum (mass \(\times \) velocity). So one might ask, what sort of probability does Bohmian mechanics supply for the latter: will it agree with the quantum mechanical probability for momentum? The answer, as we will see in the next subsection, is no!

One may also ask: isn’t having both a position and a velocity at the same time contradicted by Heisenberg’s inequalities? Moreover, since Bohmian mechanics is deterministic, the result of any quantum experiment must be pre-determined by the initial conditions of the system being measured and of the measuring device. But why doesn’t that provide a non-contextual value-map whose existence is precluded by Theorem 4.1? We will discuss these issues in the following subsections and this will also provide an example of how nonlocality manifests itself in Bohmian mechanics.

5.1 The Measurement of Momentum in Bohmian Mechanics

To understand what is going on, we should analyze “momentum measurements,” i.e., what are called momentum measurements in standard quantum mechanics. Consider a simple example, namely a particle in one space dimension with initial wave function \(\Psi (x, 0)=\pi ^{-1/4}\exp (- x^2/2)\). Since this function is real, its phase \(S=0\) and the particle is at rest (by equation (5.1): \( \frac{d X(t)}{dt} = \frac{\partial S(X(t),t)}{\partial x}\)). Nevertheless, the measurement of momentum p must have, according to the usual quantum predictions, a probability distribution whose density is given by the square of the Fourier transform of \(\Psi (x, 0)\), i.e. by \(|\hat{\Psi }(p)|^2=\pi ^{-1/2}\exp (- p^2)\). Isn’t there a contradiction here? Isn’t there a clear disagreement with the quantum predictions?

In order to answer this question, one must focus on the quantum mechanical measurement of momentum. One way to do this is to let the particle move freely and to detect its asymptotic position X(t) as \(t\rightarrow \infty \). Then, one sets \(p= \displaystyle \lim _{t\rightarrow \infty } \frac{X(t)}{t}\) (putting the mass \(m=1\)).

Consider the free evolution of the initial wave function at \(t_0=0\), \( \Psi (x, 0)=\pi ^{-1/4}\exp (- x^2/2)\). The solution of Schrödinger’s equation with that initial condition is:

$$\begin{aligned} \Psi (x,t) = \frac{1}{(1+it)^{1/2}} \frac{1}{\pi ^{1/4}}\exp \left[ - \frac{x^2}{2 (1+it)}\right] , \end{aligned}$$
(5.1.1)

and thus

$$\begin{aligned} | \Psi (x,t)|^2=\frac{1}{\sqrt{\pi \big [1+t^2\big ]} }\exp \left[ - \frac{x^2}{ 1+t^2}\right] . \end{aligned}$$
(5.1.2)

If one writes \(\Psi (x,t)= R (x,t) \exp \big [iS (x,t)\big ]\), one gets (up to a t-dependent constant):

$$\begin{aligned} S(x,t)= \frac{t x^2}{2 (1+t^2)}, \end{aligned}$$
(5.1.3)

and the guiding equation (5.1) becomes:

$$\begin{aligned} \frac{d}{dt} X(t)= \frac{t X(t)}{1+t^2}, \end{aligned}$$
(5.1.4)

whose solution is:

$$\begin{aligned} X(t)=X(0) \sqrt{1+t^2}. \end{aligned}$$
(5.1.5)
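As a sanity check on (5.1.5), one can integrate the guiding equation (5.1.4) numerically and compare with the closed form. The sketch below uses a classical fourth-order Runge-Kutta step; the function names, the step count and the sample values are illustrative choices, not part of the original argument.

```python
import math


def velocity(x, t):
    """Right-hand side of (5.1.4): dX/dt = t X / (1 + t^2)."""
    return t * x / (1.0 + t * t)


def integrate(x0, t_end, steps=20000):
    """Integrate (5.1.4) from t = 0 to t_end with fourth-order Runge-Kutta."""
    h = t_end / steps
    x, t = x0, 0.0
    for _ in range(steps):
        k1 = velocity(x, t)
        k2 = velocity(x + 0.5 * h * k1, t + 0.5 * h)
        k3 = velocity(x + 0.5 * h * k2, t + 0.5 * h)
        k4 = velocity(x + h * k3, t + h)
        x += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += h
    return x


def closed_form(x0, t):
    """The solution (5.1.5): X(t) = X(0) sqrt(1 + t^2)."""
    return x0 * math.sqrt(1.0 + t * t)
```

Comparing `integrate(0.7, 50.0)` with `closed_form(0.7, 50.0)` reproduces (5.1.5) to numerical accuracy, and a particle started at the origin stays there.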

This gives the explicit dependence of the position of the particle as a function of time. If the particle is initially at \(X(0)=0\), it does not move; otherwise, it moves asymptotically, when \(t\rightarrow \infty \), as \(X(t)\sim X(0)t\). Thus, \(p = \lim _{t\rightarrow \infty } X(t)/t = X(0)\).

Now, assume that we start with the quantum equilibrium distribution:

$$ \rho _0 (x)= |\Psi (x, 0)|^2=\pi ^{-1/2}\exp (- x^2). $$

This is the distribution of X(0). Thus, the distribution of \(p = \lim _{t\rightarrow \infty } X(t)/t = X(0)\) will be \(\pi ^{-1/2} \exp (-p^2) = |\hat{\Psi }(p,0)|^2\). This is the quantum prediction! But the detection procedure (measurement of X(t) for large t) does not measure the initial velocity (which is zero with probability 1).
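The statistical statement can also be illustrated by sampling. The following sketch draws initial positions from the quantum equilibrium distribution \(\rho _0\), a Gaussian of variance 1/2, propagates them with (5.1.5), and records X(t)/t at a large time. The function names, the seed, the sample size and the time `t` are assumptions made for illustration only.

```python
import math
import random


def late_time_ratio(x0, t):
    """X(t)/t for the trajectory (5.1.5) starting at x0."""
    return x0 * math.sqrt(1.0 + t * t) / t


def sample_momenta(n, t, seed=0):
    """Sample X(0) from rho_0 = pi^{-1/2} exp(-x^2) and return the values X(t)/t."""
    rng = random.Random(seed)
    sigma = math.sqrt(0.5)  # rho_0 is a centered Gaussian with variance 1/2
    return [late_time_ratio(rng.gauss(0.0, sigma), t) for _ in range(n)]


def variance(values):
    """Sample variance about the sample mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)
```

The empirical variance of `sample_momenta(20000, 1000.0)` should be close to 1/2, the variance of the quantum momentum density \(\pi ^{-1/2}\exp (-p^2)\), even though every sampled particle has zero initial velocity.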

Remarks

  1. Although the particles do have, at all times, a position and a velocity, there is no contradiction between Bohmian mechanics and the quantum predictions and, in particular, with Heisenberg’s uncertainty principle. The latter is simply a relation between variances of results of measurements. It implies nothing whatsoever about what exists or does not exist outside of measurements, since those relations are simply mathematical consequences of the quantum formalism which, strictly speaking, dictates only what takes place during a measurement.

  2. Bohmian mechanics shows that what are called measurements of quantum observables other than positions are typically merely interactions between a microscopic physical system and a macroscopic measuring device whose statistical results coincide with the quantum predictions.

    To use a fashionable expression, one might say that, for both Bohmian mechanics and standard quantum mechanics, values of most observables are emergent. But it is only in Bohmian mechanics that one can understand how that emergence comes about.

5.2 The Contextuality of the Momentum Measurements in Bohmian Mechanics

The reader might nevertheless worry that there is in fact an intrinsic property of the particle that is revealed in a momentum measurement, for example its original position, since, as we showed in the previous subsection, \(p = \lim _{t\rightarrow \infty } X(t)/t = X(0)\) in the simple case considered there. Of course, if one were to measure the position one would also find an intrinsic property of the particle (namely its position!).Footnote 11 But doesn’t that contradict our Theorem 4.1 (our example could of course be formulated in two dimensions by taking a product of wave functions of the form (5.1.1))? After all, the latter theorem asserts that there does not exist a value-map that assigns to a quantum system pre-existing values that are revealed by quantum measurements and here we seem to have just defined such a map.

However, as we shall explicitly show, the map provided by Bohmian mechanics would be contextual (see the Appendix for the concrete operators that we use in the proof of Theorem 4.1). In particular the value v(O) will depend on which other operators \(O', O'', \dots \), one measures together with O. Hence relations like (4.8) that are needed to prove Theorem 4.1 will not be valid: for example, if one writes \(v(O O') = v(O) v(O')\) and \(v(O O'') = v(O) v(O'')\), the value v(O) will in general be different in the two relations.

We will now show in particular that the measurement of momentum is contextual, using a modified version of the example given by (5.1.1).

Take the quantum state (5.1.1) and write \(\Psi _0(x)\) for \(\Psi (x,0)\). Consider the corresponding Gaussian wave functions:

$$\begin{aligned} \Psi _{+k}(x) = \Psi _0(x) e^{ikx} \end{aligned}$$
(5.2.1)

and

$$\begin{aligned} \Psi _{-k}(x) = \Psi _0(x) e^{-ikx} \end{aligned}$$
(5.2.2)

where \(k>0\). We will assume below that k is large.

Consider first the initial wave function \(\Psi _{+k}(x) = \Psi _0(x) e^{ikx}\). This is a right-moving Gaussian wave packet moving with speed k. Thus at time t it will be centered at kt. Explicitly, the solution of Schrödinger’s equation is:

$$\begin{aligned} \Psi _{+k}(x, t) = \frac{1}{(1+it)^{1/2}} \frac{1}{\pi ^{1/4}} \exp \left( ikx-\frac{ik^2t}{2}- \frac{(x-kt)^2 }{2 (1+it)}\right) , \end{aligned}$$
(5.2.3)

which can also be seen immediately from (5.1.1) using Galilean invariance. For this wave packet we have that \(p = \lim _{t \rightarrow \infty } \frac{X(t)}{t} \approx k\) for \(k \gg 1\).
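One way to gain confidence in (5.2.3) is to check numerically that it satisfies the free Schrödinger equation \(i\partial _t \Psi = -\frac{1}{2}\partial _x^2 \Psi \) (units with \(\hbar = m = 1\), as assumed throughout this example). The sketch below uses central finite differences; the function names, the step `h` and the sample points are illustrative assumptions.

```python
import cmath
import math


def psi_plus_k(x, t, k):
    """The wave packet (5.2.3)."""
    c = 1.0 + 1j * t
    exponent = 1j * k * x - 0.5j * k * k * t - (x - k * t) ** 2 / (2.0 * c)
    return cmath.exp(exponent) / (cmath.sqrt(c) * math.pi ** 0.25)


def schrodinger_residual(x, t, k, h=1e-3):
    """|i dPsi/dt + (1/2) d^2Psi/dx^2| estimated with central differences."""
    d_t = (psi_plus_k(x, t + h, k) - psi_plus_k(x, t - h, k)) / (2.0 * h)
    d_xx = (
        psi_plus_k(x + h, t, k) - 2.0 * psi_plus_k(x, t, k) + psi_plus_k(x - h, t, k)
    ) / (h * h)
    return abs(1j * d_t + 0.5 * d_xx)


def density(x, t, k):
    """|Psi_{+k}(x, t)|^2, which is peaked at x = k t."""
    return abs(psi_plus_k(x, t, k)) ** 2
```

The residual is small (limited by the finite-difference error) at generic points, and the density at time t is largest at \(x = kt\), in line with the statement that the packet moves to the right with speed k.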

Now form an \(N=2\) entangled state \(\Psi \) from the wave functions (5.2.1), (5.2.2)Footnote 12:

$$\begin{aligned} \Psi (x,y) = A[ \Psi _{+k}(x)\Psi _{+k}(y) + \Psi _{-k}(x)\Psi _{-k}(y)], \end{aligned}$$
(5.2.4)

with A the normalization constant. Let \(O = P_x\). Consider two different experiments that measure O:

\(\text{ Experiment}_1(O)\): measure O alone by the procedure described in Sect. 5.1, with result corresponding to the solution to the guiding equation (5.1) associated with the solution of Schrödinger’s equation.

\(\text{ Experiment}_2(O)\): first measure at time 0 the position \(Q_y\) of the second particle, then measure O by the above procedure.

For \(\text {Experiment}_1(O)\), we claim that the result is

$$\begin{aligned} v(\text{ Experiment}_1(O)) \approx \mathrm{sgn}\,(X(0) + Y(0)) k \end{aligned}$$
(5.2.5)

for k large.

To prove (5.2.5), introduce the variables:

$$\begin{aligned} w= & {} \frac{x+y}{\sqrt{2}}, \\ z= & {} \frac{x-y}{\sqrt{2}}. \nonumber \end{aligned}$$
(5.2.6)

In terms of these variables, we can rewrite (5.2.4) as

$$\begin{aligned} \Psi (w, z) =A (\Psi _{+k'}(w) + \Psi _{-k'}(w)) \Psi _0(z). \end{aligned}$$
(5.2.7)

with \(k' = \sqrt{2} k\).
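The change of variables leading to (5.2.7) can be verified pointwise. In the sketch below both sides are evaluated without the common normalization constant A; the function names and the sample points are illustrative assumptions.

```python
import cmath
import math


def psi0(x):
    """Psi_0(x) = pi^{-1/4} exp(-x^2 / 2)."""
    return math.pi ** -0.25 * math.exp(-x * x / 2.0)


def psi_k(x, k):
    """Psi_{k}(x) = Psi_0(x) exp(i k x), as in (5.2.1), (5.2.2)."""
    return psi0(x) * cmath.exp(1j * k * x)


def entangled(x, y, k):
    """The state (5.2.4) without the constant A."""
    return psi_k(x, k) * psi_k(y, k) + psi_k(x, -k) * psi_k(y, -k)


def factorized(x, y, k):
    """The right-hand side of (5.2.7) without A, in the variables (5.2.6)."""
    w = (x + y) / math.sqrt(2.0)
    z = (x - y) / math.sqrt(2.0)
    k_prime = math.sqrt(2.0) * k
    return (psi_k(w, k_prime) + psi_k(w, -k_prime)) * psi0(z)
```

Evaluating `entangled` and `factorized` at the same point (x, y) gives the same complex number up to rounding, which is the content of (5.2.7).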

So the solution of Schrödinger’s equation factorizes into a function \(\Psi (w, t)\) of (w, t) and a function \(\tilde{\Psi }(z, t)\) of (z, t). We have that \(\tilde{\Psi }(z, t)\) is given by (5.1.1) with x replaced by z, while for \(\Psi (w, t)\) we get a sum of two wave functions like (5.2.3), one with k replaced by \(k'\), the other with k replaced by \(-k'\):

$$\begin{aligned} \Psi (w, t) =A(\Psi _{+k'}(w, t) + \Psi _{-k'}(w,t)) \end{aligned}$$
(5.2.8)

with \(\Psi _{\pm k'}(w, t)\) of the form (5.2.3).

For large t, \(|\Psi (w, t)|^2\) is a sum of two more or less non-overlapping terms, one corresponding to the part of the wave function with \(k'\) (whose support is around \(k't\)), the other one corresponding to the part of the wave function with \(-k'\) (whose support is around \(-k't\)):

$$\begin{aligned} |\Psi (w, t)|^2 \approx A^2 (|\Psi _{+k'}(w, t)|^2 + |\Psi _{-k'}(w,t)|^2). \end{aligned}$$
(5.2.9)

Since the solution of Schrödinger’s equation factorizes into a function of (w, t) and one of (z, t), the guiding equations (5.1) for W(t) and Z(t) are decoupled. For Z(t) we obtain a solution like (5.1.5) (\(Z(t) \approx Z(0)t\) as \(t \rightarrow \infty \)).

To analyze W(t), note that one property of the dynamics (5.1) is that, in one dimension, trajectories cannot cross.Footnote 13 Since there is a symmetry between the two parts of the wave function (5.2.8) (upon reflection, \(\Psi _{+k'}\) becomes \(\Psi _{-k'}\) ), if the initial condition \(W (0) > 0\), the particle must stay on the right, while if \(W (0) < 0\), the particle must stay on the left. Moreover, by equivariance, the particle evolves so as to be in the support of \(|\Psi (w, t)|^2\), which, by (5.2.9), consists of two non-overlapping terms supported around \(\pm k't\) for large t. So, for large k and large times, we get that \(W (t) \approx \mathrm{sgn}\,W (0)k' t = \mathrm{sgn}\,W (0) \sqrt{2} kt\).

Rewriting what we’ve found in terms of the X(t) and Y(t) variables, we get that \(X (t)= \frac{W(t)+Z(t)}{\sqrt{2}} \approx \frac{1}{\sqrt{2}}( \mathrm{sgn}\,W (0) \sqrt{2} k t + Z(0)t)\) and thus, \(v(\text{ Experiment}_1(O))= \lim _{t \rightarrow \infty } \frac{X(t)}{t} \approx \mathrm{sgn}\,(X(0) + Y(0)) k\), for k large, which is (5.2.5).

For \(\text{ Experiment}_2(O) \), if Y is the result of the measurement of \(Q_y\), the wave function (5.2.4) collapses, yielding for the wave function of the x systemFootnote 14:

$$\begin{aligned} \Psi (x)=A(Y) (c_+(Y) \Psi _{+k}(x) + c_-(Y)\Psi _{-k}(x)). \end{aligned}$$
(5.2.10)

with \(c_\pm (Y) =\Psi _{\pm k}(Y)\) and A(Y) the normalization coefficient.

The solution of Schrödinger’s equation with this initial condition is again a sum of two wave functions like (5.2.3), one with \(+k\), the other with \(-k\), multiplied by coefficients \(c_\pm (Y)\):

$$\begin{aligned} \Psi (x, t)= A(Y)(c_+(Y) \Psi _{+k}(x, t) + c_-(Y)\Psi _{-k}(x, t)), \end{aligned}$$
(5.2.11)

where \(\Psi _{\pm k}(x, t)\) is of the form (5.2.3).

We can now more or less reason as we just did for the \(\Psi (w, t)\) given by (5.2.8), except that because of the coefficients \(c_\pm (Y)\) there is no symmetry here between the two parts of the wave function—unless the complex exponentials in \(c_\pm (Y)\) are real (i.e. \(e^{ikY}=\pm 1\)). Nonetheless, the effect of the coefficients in (5.2.11) is merely to replace the \(\cos kx\), which would arise there if \(c_\pm (Y)>0\) (i.e. \(e^{ikY}= 1\)), by its translate \(\cos (kx+kY)\). Thus the \(|\Psi |^2\) probability of the interval \([X_m, \infty )\) will be 1/2 for some \(X_m\) with \(|X_m| < \frac{\pi }{2k}\).Footnote 15 Thus, by no-crossing and equivariance, we get that for large times \(X(t) \approx \mathrm{sgn}\,(X(0)) kt \) for \(k \gg 1\), and thus

$$\begin{aligned} v(\text{ Experiment}_2(O))= \lim _{t \rightarrow \infty } \frac{X(t)}{t} \approx \mathrm{sgn}\,(X (0)) k. \end{aligned}$$
(5.2.12)

Comparing (5.2.5) and (5.2.12), we see that the measurement of momentum is contextual, since it may depend on whether or not one measures another operator \(Q_y\) together with \(O = P_x\).
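The comparison can be made concrete with a small numerical sketch. Instead of integrating (5.1) directly, it uses a consequence of no-crossing and equivariance in one dimension: along a trajectory, the \(|\Psi |^2\) probability of the half-line to its left is conserved, so the position at time t is obtained by matching cumulative distributions. This quantile-matching shortcut, the grid sizes, the finite time `t`, the sample values and all function names are assumptions made for illustration; the result Y of the position measurement is taken to be the actual position Y(0).

```python
import bisect
import cmath
import math


def packet(x, t, k):
    """Free Gaussian packet of the form (5.2.3), in units with hbar = m = 1."""
    c = 1.0 + 1j * t
    exponent = 1j * k * x - 0.5j * k * k * t - (x - k * t) ** 2 / (2.0 * c)
    return cmath.exp(exponent) / (cmath.sqrt(c) * math.pi ** 0.25)


def cdf_table(t, k, c_plus, c_minus, n=20001):
    """Grid and normalized cumulative distribution of
    |c_plus Psi_{+k} + c_minus Psi_{-k}|^2 at time t (trapezoidal rule)."""
    half = k * t + 8.0 * math.sqrt(1.0 + t * t)  # wide enough to cover both packets
    dx = 2.0 * half / (n - 1)
    xs = [-half + i * dx for i in range(n)]
    rho = [abs(c_plus * packet(x, t, k) + c_minus * packet(x, t, -k)) ** 2 for x in xs]
    cum = [0.0]
    for i in range(1, n):
        cum.append(cum[-1] + 0.5 * (rho[i - 1] + rho[i]) * dx)
    return xs, [value / cum[-1] for value in cum]


def transport(x0, t, k, c_plus=1.0, c_minus=1.0):
    """Position at time t of the trajectory starting at x0 (to grid accuracy)."""
    xs0, cdf0 = cdf_table(0.0, k, c_plus, c_minus)
    q = cdf0[min(bisect.bisect_left(xs0, x0), len(xs0) - 1)]
    xs_t, cdf_t = cdf_table(t, k, c_plus, c_minus)
    return xs_t[min(bisect.bisect_left(cdf_t, q), len(xs_t) - 1)]


def experiment_1(x0, y0, k, t=20.0):
    """X(t)/t when P_x is measured alone: W from quantile matching, Z from (5.1.5)."""
    w0 = (x0 + y0) / math.sqrt(2.0)
    z0 = (x0 - y0) / math.sqrt(2.0)
    w_t = transport(w0, t, math.sqrt(2.0) * k)
    z_t = z0 * math.sqrt(1.0 + t * t)
    return (w_t + z_t) / (math.sqrt(2.0) * t)


def experiment_2(x0, y0, k, t=20.0):
    """X(t)/t when Q_y is measured first, with result Y = y0, as in (5.2.10)."""
    c_plus = cmath.exp(1j * k * y0)    # c_+(Y) up to the common factor Psi_0(Y)
    c_minus = cmath.exp(-1j * k * y0)  # c_-(Y) up to the same factor
    return transport(x0, t, k, c_plus, c_minus) / t
```

For example, with X(0) = 0.5, Y(0) = -1.0 and k = 10, one has X(0) + Y(0) < 0 but X(0) > 0, so `experiment_1(0.5, -1.0, 10.0)` comes out close to -k while `experiment_2(0.5, -1.0, 10.0)` comes out close to +k: the same initial configuration yields opposite "momentum" results in the two experiments.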

5.3 An Example of Nonlocality in Bohmian Mechanics

It would go far beyond the scope of this paper to really explain how nonlocality appears in Bohmian mechanics in general, but we saw an example of nonlocality in Bohmian mechanics in the previous subsection: the particles with coordinates x and y having the entangled quantum state (5.2.4), can be (in principle) as far apart as one wants and the result of the measurement of \( O=P_x\) will depend on whether or not one measures \(Q_y\) before measuring \(P_x\), and, since the time interval between these two measurements can be arbitrarily small, we have indeed here an example of an instantaneous action at a distance. Here we should regard the measurement of \(P_x\) as taking a (large but) finite time, and x and y as referring to different (distant) origins.

The fact that Bohmian mechanics is nonlocal is obviously a merit rather than a defect, since we know that any theory accounting for the quantum phenomena must be nonlocal, as shown in Sects. 2 and 4 (and many other places).

6 Summary and Conclusions

Both EPR and Schrödinger argued that the quantum mechanical description of a system by its wave function is incomplete in the sense that other variables must be introduced in order to obtain a complete description. Their argument was very simple: if I can determine the result of a measurement carried out at one place by doing another measurement far away from that place, then that result must pre-exist its measurement. The wave function alone does not tell us what that result is. Therefore, the quantum mechanical description of a system by its wave function is incomplete.

However, there was a crucial assumption in the reasoning of EPR and Schrödinger, which was too obvious for them to question: that doing a measurement at one place cannot possibly affect instantaneously the physical situation far away, or what is now called the assumption of locality.

The history of the EPR-Schrödinger argument is complicated, because although their conclusion about incompleteness of quantum mechanics was right, their assumption of locality was not. The completion of quantum mechanics was found by de Broglie in 1927 and developed by Bohm in 1952. Bohm showed that one may consistently assume that particles have trajectories and explained on that basis how to understand measurements as consequences of the theory and not, as they are in ordinary quantum mechanics, as a deus ex machina [7].

The falsity of the locality assumption was shown by John Bell in 1964 [2] and by subsequent experiments. Bell first recalled that, if one assumes locality, then, as the EPR argument correctly showed, there must exist other variables than the quantum state to characterize a physical system. But then Bell showed that the distribution of those variables must satisfy some constraints that are violated by quantum predictions, predictions that were later verified experimentally (see [22] for a survey).

Here and in [11] we give a simpler argument, but using the maximally entangled states introduced by Schrödinger: for those states, one can, for each observable associated to one system, construct another observable associated to the second system, possibly far away from the first one, such that the results of the measurement of both observables are perfectly correlated. Then, assuming locality, those results must pre-exist their measurement. But assuming that, in general, observables have values before their measurement leads to a contradiction. Hence, the assumption of locality is false.

The difference between this paper and [11] is that here we use the position and momentum variables used by EPR, while in [11] we used spin variables, such as those in terms of which the EPR argument was reformulated by Bohm [6].

Next one might ask how Bohmian mechanics deals with this impossibility of the pre-existence of measurement results prior to measurements, since it is a deterministic theory, and in such a theory everything is pre-determined by the initial conditions. In [11] we explained that the measurements of spin variables are contextual and, in fact, should not properly be called measurements at all. Here we illustrate the contextuality of momentum. In both cases, the contextuality is linked to nonlocality, as it must be, since, as explained here and in [11], if locality were true, then measurements would (sometimes) have to be non-contextual. Bohmian mechanics is an extremely natural version of quantum mechanics, involving the obvious ontology evolving in the obvious way. A proper appreciation of the role of contextuality in Bohmian mechanics can help dispel the widespread uneasy feeling that somehow there must be something amiss in that theory.