Einstein, Podolsky, and Rosen concluded that quantum mechanics must be incomplete. This immediately raises the question of how to complete it, in particular the question of ‘hidden variables’ that would, for example, allow one to determine the position and momentum of a particle simultaneously.

John von Neumann had already raised the question of hidden variables in his famous textbook [159, p. 109], three years before the EPR paper, though EPR did not mention him. In chapter V of his book, von Neumann presented a formal mathematical proof of the impossibility of such hidden variables. It is unknown whether EPR knew of this proof, and whether they would have refrained from writing their paper had they known of it.Footnote 1

Later, several physicists would find an essential gap in von Neumann’s proof, the contribution by John Bell being the most consequential (see Sect. 5.2). However, already in 1935 Grete Hermann, whom we mentioned in connection with Heisenberg’s response to the EPR paper, had pointed out the existence of this gap: von Neumann assumed the linearity of expectation values. This assumption is valid in quantum mechanics, but in a theory containing hidden variables it is not necessarily given and will in general fail. For this reason, Hermann was not at all surprised that von Neumann succeeded in proving the impossibility of such variables; she even spoke of a circular argument (see Hermann [90, pp. 99–102]). Towards the end of the corresponding section, Hermann writes:

But with this consideration, the decisive physical question, whether a progressing physical research may achieve calculating more precise predictions than is possible today, cannot be transformed into the mathematical question – being not at all equivalent to the physical question – whether such a development could be realised by exclusively using the quantum mechanical operator calculus.Footnote 2

Grete Hermann was a philosopher and as such saw herself in the tradition of Immanuel Kant. She could not accept that it should be impossible in quantum mechanics to give a cause for a single radioactive decay beyond the purely statistical interpretation. This is why she was interested in von Neumann’s proof and was quite gratified to find the gap in it. She discussed these issues at length with Heisenberg and Carl Friedrich von Weizsäcker in Leipzig. In his autobiography Der Teil und das Ganze, Heisenberg gave an eloquent account of these discussions [88, pp. 163–173].

A theory containing hidden variables that circumvents von Neumann’s proof is Bohm’s theory, the subject of the next section.

5.1 Bohm’s Theory

In his textbook on quantum mechanics, Bohm had presented a simplified version of the EPR experiment that operates with two particles of spin 1/2 [21]. He nevertheless rejected EPR’s conclusion that quantum mechanics is incomplete, because in his opinion the assumption of a local reality contradicts quantum theory.Footnote 3

Nevertheless, Bohm was dissatisfied with the then-common (Copenhagen) view of quantum theory. In an interview he once said (cf. Pauli [130, p. 341]):

I wrote my book Quantum Theory in an attempt to understand quantum theory from Bohr’s point of view. After I’d written it I wasn’t satisfied that I really understood it, and I began to look again.

‘Looking again’ led Bohm to develop a new interpretation [22, 23].Footnote 4 Strictly speaking, he developed not only a new interpretation but a new theory. While it leaves the wave function untouched, this theory introduces new, ‘hidden’ variables. These variables act non-locally and are therefore consistent with Bohm’s assumption of a non-local reality.

In the introduction to the first paper, Bohm emphasises [22, p. 166]:

Most physicists have felt that objections such as those raised by Einstein are not relevant, first, because the present form of the quantum theory with its usual probability interpretation is in excellent agreement with an extremely wide range of experiments, at least in the domain of distances larger than 10⁻¹³ cm, and secondly, because no consistent alternative interpretations have as yet been suggested. The purpose of this paper […] is, however, to suggest just such an alternative interpretation.

The mentioned scale of 10⁻¹³ cm is the scale of nuclear physics. At the time, it was commonly assumed that quantum mechanics might be violated at smaller scales. Bohm himself expected his theory to be equivalent to quantum mechanics only at scales larger than 10⁻¹³ cm, and to deviate from it at smaller scales.

If one discards fields and decides to describe only particles, Bohm’s additional variables are the positions of the particles, e.g., of electrons.Footnote 5 The dynamics of these positions is determined by the autonomous wave function Ψ, which is governed by Schrödinger’s equation. In contrast to Newtonian mechanics, the velocities of these positions cannot be chosen freely but are determined by Ψ. The quantum mechanical probabilities, calculated from Ψ as usual, then become probabilities in the sense of classical statistics, i.e., they merely express our ignorance of the particle positions.
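For a single particle of mass m, this determination takes the form of the standard ‘guidance equation’ (not written out at this point in the text; S is the phase of the wave function in the ansatz (5.1) below):

$$\displaystyle \begin{aligned} \dot{\mathbf{x}}=\frac{\hbar}{m}\,\mathrm{Im}\,\frac{\nabla\varPsi}{\varPsi}=\frac{\nabla S}{m}, \end{aligned} $$

evaluated at the actual particle position.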

Bohm presented his new interpretation in two papers [22, 23], the second one containing a detailed analysis of the measurement process.Footnote 6 Bohm declares von Neumann’s proof of the impossibility of hidden variables irrelevant for his theory. According to Bohm, the reason is that the hidden variables depend on the system as well as on the measurement apparatus; this is called a contextual situation, and such variables are called contextual, as opposed to a non-contextual situation in which the variables depend only on the system, independently of the degrees of freedom interacting with it. Bell would later judge Bohm’s critique of von Neumann’s proof vague and imprecise, offering a trenchant critique of his own [12].

Bohm’s ideas were not really new. In the 1920s, Louis de Broglie had proposed a theory of pilot waves that shows many similarities with Bohm’s later theory and initially appealed to Einstein (cf. Einstein [56]), see Sect. 1.3. De Broglie had dropped his idea after the harsh criticism by Pauli at the 1927 Solvay Conference, but returned to it after Bohm’s papers were published. In his contribution to the Born Festschrift, he explains his return to the theory and also establishes an interesting connection to Einstein’s idea of particles as singularities of fields [44]. The same Festschrift contains contributions by Bohm and Einstein on this topic that reference each other [24, 61]. In a letter to Born, Einstein commented on his contribution:

For the presentation volume to be dedicated to you, I have written a little nursery song about physics, which has startled Bohm and de Broglie a little.Footnote 7

At that time, Einstein was no longer partial to de Broglie’s pilot wave theory, and certainly not to Bohm’s variation of it. This should not come as a surprise, as both theories explicitly contradict the locality postulated by Einstein.

Bohm himself regarded the following as the essential advance of his approach over de Broglie’s [22, p. 167]:

The essential new step in doing this is to apply our interpretation in the theory of the measurement process itself as well as in the description of the observed system.

His elaborations on this can be found in appendix B to his second paper [23]. It also includes comments on Rosen’s interpretation [136], which exhibits similar traits.

The formal starting point of Bohm’s (and de Broglie’s) theory is this wave function ansatz:

$$\displaystyle \begin{aligned} \varPsi=R\exp\left({\frac{\mathrm{i}S}{\hbar}}\right). \end{aligned} $$
(5.1)

Using this ansatz, Schrödinger’s equation can be decomposed into an equation for the amplitude R and an equation for the phase S. The phase equation is similar to the Hamilton-Jacobi equation of classical mechanics but contains an additional term that Bohm called the quantum potential and that is determined by the wave function (5.1).Footnote 8 Its presence leads to the particles being ‘guided’ by the wave function in a non-local way along trajectories that defy intuition. Two examples: Since a particle’s velocity is determined by the phase of the wave function, a ground-state electron in a hydrogen atom is at rest, because its wave function is real. In a double-slit experiment, the particle does not go through the slit it is ‘heading for’, but through the other one.

The exact expression of the quantum potential associated with (5.1) reads:

$$\displaystyle \begin{aligned} Q:= -\frac{\hbar^2}{2m}\frac{\nabla^2 R}{R}, \end{aligned} $$
(5.2)

where m is the mass of the particle. Evidently, Q is invariant under rescaling the amplitude, R → λR, which once again stresses that Ψ cannot be a classical field.
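For reference, the decomposition mentioned above yields the following pair of equations (a standard result, quoted here in a common modern form rather than Bohm’s original notation): a modified Hamilton-Jacobi equation for the phase, containing Q, and a continuity equation for R²,

$$\displaystyle \begin{aligned} \frac{\partial S}{\partial t}+\frac{(\nabla S)^2}{2m}+V+Q=0, \qquad \frac{\partial R^2}{\partial t}+\nabla\cdot\left(R^2\,\frac{\nabla S}{m}\right)=0. \end{aligned} $$

In the formal limit ħ → 0, Q vanishes and the first equation reduces to the classical Hamilton-Jacobi equation.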

Bohm discusses the EPR situation in chapter 8 of his second paper, working with the original EPR wave function rather than with the simplified version he used in his textbook (which, of course, has to do with the fact that initially it was not clear how to apply Bohm’s theory to spin states). Because the EPR wave function (2.1) is real, the particles are at rest. Their possible positions are described by an ensemble satisfying x₁ − x₂ = a. Bohm then proceeds to describe the situation in a way reminiscent of Bohr:

Now, if we measure the position of the first particle, we introduce uncontrollable fluctuations in the wave function for the entire system, which, through the ‘quantum-mechanical’ forces, bring about corresponding uncontrollable fluctuations in the momentum of each particle. Similarly, if we measure the momentum of the first particle, uncontrollable fluctuations in the wave function for the system bring about, through the ‘quantum-mechanical’ forces, corresponding uncontrollable changes in the position of each particle. Thus, the ‘quantum-mechanical’ forces may be said to transmit uncontrollable disturbances instantaneously from one particle to another through the medium of the Ψ-field.

Because Bohm accepts non-locality as a fundamental aspect of the theory, he avoids the EPR criterion from the very beginning.

After the measurement is completed, the measured particle remains trapped in one wave packet; the other wave packets are empty, and it is implicitly assumed that these empty packets do not interfere. Within usual quantum mechanics, this assumption can be justified by the process of decoherence, see Sect. 5.4. From this point of view, trajectories are superfluous and “entirely based on a classical prejudice” [174]. There is, in fact, no experiment that requires Bohmian trajectories for its explanation.

Bohm’s proposal met with heavy resistance. The sharp-tongued Pauli criticised it in letters and in print, the latter ironically in a contribution to a de Broglie Festschrift [125]. Schrödinger was critical of it, too. In a letter to Einstein he wrote:

The fact that Bohm proposes to use the same function for probability distribution and force potential is unacceptable to me. Every trajectory occurring in reality may well be thought of as a member of different ensembles of trajectories. But the trajectories one adds in thought cannot act on the dynamics.Footnote 9

Bohm himself had pointed out the asymmetries in his theory. Electrons are described with the help of particle trajectories, but photons are not, even though they too seem to exhibit particle properties, for instance when blackening a photographic plate. In electrodynamics, the ‘guided’ quantity is not the photon but a classical field, the electromagnetic four-potential. Favouring particle positions over their momenta also breaks the symmetry between the corresponding quantum mechanical representations. Additional problems concern the formulation for spin and the treatment of interactions in relativistic quantum field theories.Footnote 10

Bohm and most of his successors assumed that the initial probability distribution is given by Born’s rule, i.e., by |Ψ|². More recently, attempts have been made to do without this assumption [132, 157]. Born’s probability distribution then emerges, so to speak, from a process of relaxation that drives an (almost arbitrary) initial distribution towards |Ψ|². Needless to say, certain assumptions have to be made, similar to Boltzmann’s ansatz for the collision frequency in the derivation of the second law of thermodynamics. A possible cosmological origin of these assumptions remains a matter of speculation.

5.2 The Bell Inequalities

We saw how Einstein based his arguments on the assumption of a local reality, from which he deduced the incompleteness of quantum theory. It was the Northern Irish physicist John Stewart Bell (1928–1990) who derived from this assumption very general inequalities that are violated by quantum theory. Thus it can be tested experimentally whether quantum theory or the assumption of local reality is correct. The experiments came out unambiguously in favour of quantum theory.Footnote 11

Bell’s paper refers directly to EPR; its title is “On the Einstein-Podolsky-Rosen paradox” [11]. The first paragraph explains how the two papers are connected:

The paradox of Einstein, Podolsky, and Rosen was advanced as an argument that quantum mechanics could not be a complete theory but should be supplemented by additional variables. These additional variables were to restore to the theory causality and locality. In this note that idea will be formulated mathematically and shown to be incompatible with the statistical predictions of quantum mechanics.

Note that Bell calls the EPR situation a paradox, probably referring to the conflict between the notion of a local reality and (non-local) quantum theory. The significance of Bell’s work lies precisely in the fact that he reduced this conflict to concrete, experimentally testable (in)equalities, opening it to a definitive decision.

If you are convinced of the universal validity of quantum theory, you will not be surprised to hear of the conflict between Bell’s inequalities and quantum theory. But if you stand by the classical assumption of locality, the conflict is astonishing and unsettling [181]. This explains the great interest in Bell’s work, which according to Alain Aspect is “one of the most remarkable papers in the history of physics” [2]. Whatever your point of view, there is no question that this paper has stimulated the debate on the foundations of quantum mechanics like no other in the past fifty years.

As stated above, Bell had been working on the foundations of quantum theory since 1952, mostly because he was impressed by Bohm’s publications.Footnote 12 Bohm’s theory explicitly contained ‘hidden variables’ that interact non-locally. Their existence seemed to contradict von Neumann’s above-mentioned proof, which had stated the impossibility of such variables as long as one holds on to the predictions of quantum theory. Bell detected the gap in von Neumann’s proof and wrote a paper on it in 1964 (even before his work on the inequalities), but published it only two years later [12]. He apparently did so without knowledge of Grete Hermann’s above-mentioned work of 1935.

Bell began his paperFootnote 13 with the question of whether quantum mechanical states could be represented as averages over a new kind of individual states, for which, e.g., the spin values with respect to every direction, or the position and the momentum of a particle, would be determined simultaneously. Such states are called ‘dispersion free’ because, in contrast to quantum mechanical states, they show no dispersion. For these individual states one needs new ‘hidden’ variables in addition to the wave function; these variables shall be denoted by λ. They are introduced to allow the prediction of individual measurement results where quantum mechanics only allows statistical assertions.
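In formulas (a standard characterisation, added here for concreteness): a state is dispersion free for an observable A if its variance vanishes,

$$\displaystyle \begin{aligned} \langle A^2\rangle-\langle A\rangle^2=0, \end{aligned} $$

so that every measurement of A on such a state yields one and the same definite value v(A).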

Bell then proceeded to a detailed discussion of von Neumann’s proof and identified – as Grete Hermann had done before him – its sore spot: an overly strong assumption. Von Neumann had postulated the linearity of expectation values: the expectation value of a sum of operators is the sum of their expectation values.Footnote 14 Whereas this rule holds in quantum mechanics, it need not hold in a theory with hidden variables. Thus, von Neumann’s assumption is too strong.Footnote 15
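A well-known counterexample, used by Bell, makes this concrete: for a spin-1/2 particle, \(\sigma_x\) and \(\sigma_y\) each have the eigenvalues ±1, while their sum \(\sigma_x+\sigma_y\) has the eigenvalues ±√2. In a dispersion-free state, every observable would take one of its eigenvalues as a definite value v, so linearity would demand

$$\displaystyle \begin{aligned} v(\sigma_x+\sigma_y)=v(\sigma_x)+v(\sigma_y), \end{aligned} $$

i.e., ±√2 = ±1 ± 1, which is impossible.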

Bell then turned to other proofs proclaiming the impossibility of theories with hidden variables, especially the work of Gleason [80] and of Kochen and Specker [108].Footnote 16 These works show that there are no non-contextual hidden-variable models compatible with the predictions of quantum mechanics.Footnote 17 Though Bell did not use the term ‘non-contextual’,Footnote 18 his reasoning proceeds exactly along these lines. Non-contextual means that the full state, given by ψ and λ, can, for example, assign a well-defined spin component to every direction, no matter which other components or quantities are measured alongside. The exclusion of non-contextual models by the above-mentioned proofs means: one cannot assume that the results of a quantum mechanical measurement exist before the measurement. An experimental confirmation of this impossibility is discussed in D’Ambrosio et al. [43].

These results are interesting enough in themselves, as they challenge a view schooled by classical physics. But Bell’s most significant insight was that all these proofs rest on overly strong assumptions. The hidden variables may be associated not only with the measured system but also with the measurement apparatus, corresponding to a contextual situation. Generally, this means a situation in which a measurement result may depend on which other measurements are performed. Bell [12, p. 451]:

The result of an observation may reasonably depend not only on the state of the system (including hidden variables) but also on the complete disposition of the apparatus.

Bell went on to consider contextual situations. Assuming a local theory with hidden variables, he derived very general inequalities that are violated by quantum mechanics [11]. If these inequalities are violated experimentally, Einstein’s assumption of locality is wrong.

For the purpose of experimental testing, one usually works with a generalised version of Bell’s original inequalities, first given by Clauser et al. [41] and hence known as the CHSH inequalities (or CHSH test), after the initials of the authors. Deriving these inequalities would go beyond the scope of this book,Footnote 19 but the main ideas shall be described.

Let us consider the experimental setup depicted in Fig. 5.1. The source in the centre sends out two spin-1/2 particles in opposite directions. Let us also assume that the two particles are in the non-local state (2.9), the singlet state occurring in Bohm’s version of EPR’s thought experiment. To the right and to the left of the source, at equal distances from it, there are two ‘polarisers’, P1 and P2, that let a particle through only if its spin points upwards with respect to a given direction; in that way, the spin component with respect to that direction is measured. Let a and a′ denote the two possible directions at P1, and let b and b′ denote the two possible directions at P2. Behind each polariser there is a detector responding to incoming particles.

Fig. 5.1 Experimental setup to test the Bell inequalities

When the directions a = b are chosen, the state (2.9) exhibits a perfect anticorrelation – if the spin points upwards at P1, it points downwards at P2, and vice versa. However, the Bell inequalities require at least two directions at each polariser; this corresponds to the contextuality of the situation. Now the assumption of locality states that the measurement result at P1 is independent of the direction chosen at P2. In the experiments, this is ensured by choosing a random direction at P2 so quickly that no signal from P1, travelling at or below the speed of light, can reach P2 before the direction at P2 is randomly chosen. (That is, the spacetime interval between the events ‘measurement of the spin at P1’ and ‘choice of direction at P2’ is spacelike.)

Let the correlation of the measurement results at P1 and P2 be described by a function C(a, b) that depends on the two chosen directions. (In case of a perfect anticorrelation, the value of this function shall be − 1. In case of a perfect correlation, its value shall be + 1.) From the assumption of locality alone, one can then deduce the following Bell inequality (or CHSH inequality):

$$\displaystyle \begin{aligned} \vert C({\mathbf a}, {\mathbf b})+C({\mathbf a}, {\mathbf b'}) +C({\mathbf a'}, {\mathbf b})-C({\mathbf a'}, {\mathbf b'})\vert \leq 2. \end{aligned} $$
(5.3)
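The essential step of the derivation can at least be sketched (our summary of the standard CHSH argument): in a local theory with hidden variables λ, the correlation has the form \(C({\mathbf a},{\mathbf b})=\int \mathrm{d}\lambda \,\rho (\lambda )\,A({\mathbf a},\lambda )B({\mathbf b},\lambda )\), where A, B = ±1 are the individual results at P1 and P2. For any such values,

$$\displaystyle \begin{aligned} A({\mathbf a},\lambda)\left[B({\mathbf b},\lambda)+B({\mathbf b'},\lambda)\right]+A({\mathbf a'},\lambda)\left[B({\mathbf b},\lambda)-B({\mathbf b'},\lambda)\right]=\pm 2, \end{aligned} $$

since one of the two brackets always vanishes while the other equals ±2; averaging over λ with the weight ρ(λ) then yields (5.3).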

Quantum mechanics yields the bound \(2\sqrt{2}>2\) on the right-hand side, and there are indeed quantum mechanical states that explicitly violate (5.3). Of course, these states are entangled like (2.9).Footnote 20
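For readers who want to see the violation concretely, the following minimal numerical sketch (our illustration, not part of the original text) computes the quantum mechanical correlations in the singlet state (2.9) and evaluates the left-hand side of (5.3) for a standard choice of coplanar directions:

```python
import numpy as np

# Pauli matrices
sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)

def spin(theta):
    """Spin operator along a direction at angle theta in the x-z plane."""
    return np.cos(theta) * sigma_z + np.sin(theta) * sigma_x

# Singlet state (|up,down> - |down,up>)/sqrt(2)
up, down = np.array([1, 0], dtype=complex), np.array([0, 1], dtype=complex)
singlet = (np.kron(up, down) - np.kron(down, up)) / np.sqrt(2)

def C(theta_a, theta_b):
    """Correlation <(a.sigma)(b.sigma)> in the singlet state; equals -cos(theta_a - theta_b)."""
    op = np.kron(spin(theta_a), spin(theta_b))
    return (singlet.conj() @ op @ singlet).real

# Directions (in radians) that maximise the CHSH expression
a, a_p = 0.0, np.pi / 2
b, b_p = np.pi / 4, -np.pi / 4

chsh = abs(C(a, b) + C(a, b_p) + C(a_p, b) - C(a_p, b_p))
print(f"CHSH value: {chsh:.4f}, classical bound: 2, 2*sqrt(2) = {2 * np.sqrt(2):.4f}")
```

Running it prints a CHSH value of 2√2 ≈ 2.8284, above the classical bound of 2.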

In the experiments, one usually uses photons, whose directions of polarisation take over the role of the spin in the example described above. The first significant tests were performed by Alain Aspect and his group in Paris in the early 1980s. They found a violation of the CHSH inequalities with a confidence of 5σ. This group achieved the spacelike separation of P1 and P2.

So far, all relevant experiments have validated quantum mechanics and violated the Bell inequalities – and with them the assumption of locality. Nonetheless, possible loopholes that would exploit experimental imperfections to save the validity of the Bell inequalities continue to be discussed [162]. One possible loophole would be the violation of the spacelike separation of the above-mentioned events; but this has essentially been ruled out in all experiments. Another would be biased statistics, rooted in the fact that not all of the photons are captured by the detectors; this is called the detection loophole. One can of course also question the concept of free will and thus whether it is even possible to choose the direction of polarisation at P2 randomly. However, this idea seems far-fetched to most physicists and shall not be discussed here.

Current experimental efforts are mainly aimed at definitively closing these loopholes and have already been largely successful.Footnote 21 Even though some details are still being discussed, one can conclude with near certainty that the Bell inequalities are violated empirically, that the predictions of quantum mechanics hold true, and that the assumption of a local reality is wrong.

The tests of local reality proposed by Bell are essentially tests of inequalities. Complementing these, Greenberger et al. [81] were able to present a state for which the test of local reality is a test of an equality. The Greenberger-Horne-Zeilinger state (GHZ state) is a state not of two (as in Bell’s case) but of three (or more) entangled photons. While quantum mechanics predicts the value −1 for a certain observable (a specific product of spin components) of a system in this state, local reality predicts the value +1. In this case, too, experimental tests confirmed quantum mechanics.Footnote 22
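The GHZ contradiction can be checked numerically as well. The sketch below (our illustration; sign conventions chosen to match the values quoted above) uses the three-particle state (|↑↑↑〉 − |↓↓↓〉)/√2, for which the three ‘mixed’ products \(\sigma_x\sigma_y\sigma_y\), \(\sigma_y\sigma_x\sigma_y\), \(\sigma_y\sigma_y\sigma_x\) each have the definite value +1. Local reality would then force the product \(\sigma_x\sigma_x\sigma_x\) to equal (+1)·(+1)·(+1) = +1, whereas quantum mechanics gives −1:

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)

def triple(A, B, C):
    """Tensor product A (x) B (x) C for the three particles."""
    return np.kron(np.kron(A, B), C)

up, down = np.array([1, 0], dtype=complex), np.array([0, 1], dtype=complex)
ghz = (np.kron(np.kron(up, up), up) - np.kron(np.kron(down, down), down)) / np.sqrt(2)

for name, op in [("XYY", triple(X, Y, Y)), ("YXY", triple(Y, X, Y)),
                 ("YYX", triple(Y, Y, X)), ("XXX", triple(X, X, X))]:
    value = (ghz.conj() @ op @ ghz).real
    print(f"<{name}> = {value:+.0f}")
# XYY, YXY, YYX each give +1; XXX gives -1, contradicting the
# local-realistic prediction (+1)*(+1)*(+1) = +1.
```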

Bell’s work and the developments following it were set in motion by the EPR paper. For Einstein, the assumption of a local reality was pivotal. The development described above showed that this assumption contradicts empirically validated predictions of quantum mechanics. Would Einstein have adjusted his point of view had he known of these results? There is no point in speculating, but it is hard to imagine that Einstein would have ignored empirical evidence.

Bell pointed out that the question of determinism was secondary to Einstein – his main concern being local reality (also cf. Maudlin [117]). In Bell [13], he wrote:

It is important to note that to the limited degree to which determinism plays a role in the EPR argument, it is not assumed but inferred. What is held sacred is the principle of ‘local causality’ – or ‘no action at a distance’. […] It is remarkably difficult to get this point across, that determinism is not a presupposition of the analysis.Footnote 23

Surely Einstein did not believe that God ‘plays dice’, but he was more willing to let go of determinism than of locality.

In the above-mentioned article, Bell also addresses Bohr’s reaction to the EPR paper (discussed in Sect. 4.2); also cf. Whitaker [165]. He essentially views Bohr’s paper as unintelligible: “While imagining that I understand the position of Einstein, as regards the EPR correlations, I have very little understanding of the position of his principal opponent, Bohr.” And after discussing some central aspects of Bohr’s paper: “Indeed I have very little idea what this means.” He concludes by asking: “Is Bohr just rejecting the premise – ‘no action at a distance’ – rather than refuting the argument?”Footnote 24 We have nothing to add to this.

5.3 The Many-Worlds Interpretation

Hugh Everett (1930–1982) published an article in 1957, based on the doctoral thesis he had written under the supervision of John Wheeler [72]. In it, he introduced a new interpretation of quantum theory, which he called the relative state formulation. It later became known as the ‘many-worlds interpretation’ or the ‘Everett interpretation’.

Everett cited attempts to quantise general relativity, which in those days were of interest to Wheeler, as a motivation. One aspect of that problem is how to interpret a wave function that applies to the whole universe and therefore admits no exterior observer. The title of his thesis was indeed Theory of the universal wave function. Within the thesis, however, the quantisation of general relativity plays no role; ten years later, Bryce DeWitt would pick up this thread in the context of Everett’s interpretation [48].

The key to Everett’s interpretation is to take the formalism of quantum theory seriously and, in a sense, accept it as definitive. In particular, the Schrödinger equation (1.4) shall always be exactly valid for an isolated system. So in this interpretation, there is no collapse of the wave function, which has fundamental consequences for the role of the observer in quantum theory.

Let us consider a simple example of the quantum mechanical measurement process as described by von Neumann in his book [159]. Consider a quantum mechanical system with spin 1/2. To measure that spin with respect to a freely chosen direction (defined, for example, by a magnetic field in the z-direction), a measurement apparatus is coupled to the system. According to the rules of quantum theory, the measured spin value can be either +ħ∕2 or −ħ∕2; in the first case we denote the state by |↑〉 (‘spin up’), in the second case by |↓〉 (‘spin down’). We already encountered these states in Bohm’s version of EPR’s thought experiment, see Sect. 2.3.

In a consistent treatment of the measurement, the measurement apparatus will also be described by a quantum state. In order to measure the spin, the system and the apparatus must interact in such a manner that the state of the apparatus becomes correlated with the state of the system. Ideally, this happens without the apparatus perturbing the system. For example, if the spin is measured in the z-direction, the interaction shall transform the uncorrelated initial states |↑〉|ϕ₀〉 (‘spin up’) and |↓〉|ϕ₀〉 (‘spin down’), where |ϕ₀〉 is the initial state of the apparatus, as follows:

$$\displaystyle \begin{aligned} \vert\negthinspace\uparrow\rangle\vert\phi_0\rangle &\longrightarrow \vert\negthinspace\uparrow\rangle\vert\phi_{\uparrow}\rangle, \\ \vert\negthinspace\downarrow\rangle\vert\phi_0\rangle &\longrightarrow \vert\negthinspace\downarrow\rangle\vert\phi_{\downarrow}\rangle. \end{aligned} $$
(5.4)

The state |ϕ↑〉 (|ϕ↓〉) is then interpreted as ‘apparatus has measured spin up’ (‘apparatus has measured spin down’). Now, if quantum mechanics holds universally, the superposition principle holds universally. Then, according to Eq. (5.4), a superposition of spin up and spin down (resulting in a state with spin right or spin left) will evolve as follows:

$$\displaystyle \begin{aligned} \frac{1}{\sqrt{2}}\left(\vert\negthinspace\uparrow\rangle+\vert\negthinspace\downarrow\rangle\right)\vert\phi_0\rangle \longrightarrow \frac{1}{\sqrt{2}}\left(\vert\negthinspace\uparrow\rangle\vert\phi_{\uparrow}\rangle+\vert\negthinspace\downarrow\rangle\vert\phi_{\downarrow}\rangle\right). \end{aligned} $$
(5.5)

This, however, is nothing but a superposition of macroscopic states (‘pointer states’) of the measurement apparatus! Because such superpositions are never observed (one always observes the apparatus in a definite classical state), von Neumann postulated the collapse of the wave function, suspending the superposition principle during the measurement process and modifying the formalism of quantum mechanics, see Sect. 1.4.
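As a toy illustration of the ideal measurement interaction (our sketch, not part of the original text), one can model the apparatus as a single qubit and the interaction (5.4) as a CNOT gate; applied to the superposition in (5.5), it produces exactly the entangled pointer superposition discussed above:

```python
import numpy as np

# CNOT: flips the apparatus qubit iff the spin is 'down'.
# With this convention |phi_up> coincides with |phi_0> and |phi_down>
# is the flipped state, so the apparatus ends up correlated with the
# spin, cf. Eq. (5.4).
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)

up, down = np.array([1, 0], dtype=complex), np.array([0, 1], dtype=complex)
phi_0 = up  # apparatus in its ready state

# 'Spin right', i.e. (|up> + |down>)/sqrt(2), cf. Eq. (5.5)
spin_right = (up + down) / np.sqrt(2)

final = CNOT @ np.kron(spin_right, phi_0)
print(final)  # amplitudes 1/sqrt(2) on |up,phi_up> and |down,phi_down>
```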

Everett followed another path. He considered the superposition (5.5) real. But how does one explain that such states are never observed? The key to the answer is to explicitly include the observer. Let |O₀〉 be the initial state of the observer before the measurement, |O↑〉 the state ‘observer sees spin up’ and |O↓〉 the state ‘observer sees spin down’; then instead of (5.5) we have the following larger superposition, which also includes the observer:

$$\displaystyle \begin{aligned} \frac{1}{\sqrt{2}}\left(\vert\negthinspace\uparrow\rangle\vert\phi_{\uparrow}\rangle+\vert\negthinspace\downarrow\rangle\vert\phi_{\downarrow}\rangle\right)\vert O_0\rangle \longrightarrow \frac{1}{\sqrt{2}}\left(\vert\negthinspace\uparrow\rangle\vert\phi_{\uparrow}\rangle\vert O_{\uparrow}\rangle+\vert\negthinspace\downarrow\rangle\vert\phi_{\downarrow}\rangle\vert O_{\downarrow}\rangle\right). \end{aligned} $$
(5.6)

Does this not make the situation worse? No, says Everett. The expansion (5.6) describes a branching of the wave function into independent components (‘branches’), each corresponding to its own classical world. The whole quantum reality can thus be pictured as a world in which the same observer exists in two components of the wave function – one version of the observer sees spin up, the other sees spin down. Thus, all possible outcomes of quantum measurements are physically realised in the full quantum world. Such a branching is robust due to decoherence, which will be discussed in the next section.

The state |↑〉|ϕ↑〉 that multiplies the version |O↑〉 of the observer is called the ‘relative state’ with respect to |O↑〉 (and accordingly for the second component in (5.6)). This is why Everett called his interpretation the relative state formulation.

Of course, this picture holds not only for spin measurements but for measurements of all observables, be it the measurement of an electron’s position or the hypothetical observation of Schrödinger’s cat. According to the Everett interpretation, there is no superposition of a dead and a living cat in the classical world but rather a superposition of a world with a dead cat and a world with a living cat.

Everett’s formulation does not separate system and observer. Von Neumann’s psycho-physical parallelism (see Sect. 1.4) thus must be generalised. In his original formulation, von Neumann specifically states a connection between observer and observed system. In the Everett interpretation, there is merely a correspondence between the versions of the observer and the respective relative states of the system. In John Bell’s words [15, p. 133]: “The psycho-physical parallelism is supposed such that our representatives in a given ‘branch’ universe are aware only of what is going on in that branch.”

In the next sentence, Bell calls the Everett interpretation extravagant: “Now it seems to me that this multiplication of universes is extravagant, and serves no real purpose in the theory, and can simply be dropped without repercussions.” He therefore (at least in this respect)Footnote 25 prefers Bohm’s interpretation, which differs from the Everett interpretation only in that it adds classical particles (and field configurations) to the wave function. After a measurement, these are trapped in a wave packet and describe the observed classical world.Footnote 26

But is the Everett interpretation really extravagant? It follows quite naturally once you take the formalism of quantum theory seriously and do not introduce additional elements by hand, such as the collapse of the wave function. Seen this way, the interpretation is, in fact, minimalistic and corresponds directly to the formalism found in textbooks. Therefore, it does not seem quite right to consider it a standalone interpretation. From a fundamental point of view, there is only one quantum world – but with many classical, or better quasi-classical, components.

Bell’s discomfort is shared by many physicists. Bohm’s theory is an attempt to save the idea of one macroscopic world. Other attempts go further and modify the Schrödinger equation by introducing additional non-linear or stochastic terms. These terms are designed to cause the collapse of the wave function: superpositions such as (5.5) then evolve under the modified dynamics into one of their two components, with the probability given by Born’s rule; that is, the wave function ‘collapses’ into one of the two components.Footnote 27 Two of the most intensely discussed collapse models are the GRW model – named after its inventors Ghirardi, Rimini, and Weber – and the CSL model, which emerged from the GRW model.Footnote 28 To date, there is no empirical evidence for a violation of Schrödinger’s equation and thus for the validity of any of the collapse models. A detailed overview of collapse models and their experimental tests can be found in Bassi et al. [9].

Within the Everett interpretation, there is no EPR problem [173]. If we interpret (5.6) as a spin measurement in Bohm’s version of the EPR experiment, then the two possible outcomes with their corresponding versions of the observer physically exist in the combined state. Because of the non-locality of the quantum mechanical formalism, Einstein’s criterion of locality cannot be applied, and EPR’s conclusion that quantum mechanics is incomplete cannot be drawn.Footnote 29

Einstein did not get a chance to react to Everett’s proposal, as he died in 1955, but Everett did meet Podolsky and Rosen at the Xavier conference in October 1962 (see Xavier University [172] for a transcript of the conference contributions). Peter Byrne describes their encounter in his biography of Everett [38, pp. 252–261]. The discussions were intense. Most conference participants considered Everett’s interpretation valid and consistent, even if they were unwilling to accept its philosophical consequences. The same was true of Podolsky and Rosen. For Rosen, following the line of EPR’s argument, the ongoing discussions of conceptual problems of quantum theory were further proof of the theory’s incompleteness.

Reading the contributions to the discussion, one can feel the tension that builds up when one tries to maintain the superposition principle and the linearity of the Schrödinger equation but is unwilling to accept the consequences of the ‘many worlds’. Everett commented [38, p. 255]:

Yes, it’s a consequence of the superposition principle that each separate element of the superposition will obey the same laws independent of the presence or absence of one another. Hence, why insist on having a certain selection of one of the elements as being real and all of the others somehow mysteriously vanishing?

Everett’s original formulation raises further important questions. For example, it is not clear which set of wave functions is supposed to form the basis of the branching. It is also unclear how Born’s probability interpretation can result from a formalism that does not contain probabilities at a fundamental level. Everett was sure that his interpretation is consistent, but he could only give rudimentary answers to these questions. More precise answers became possible only with a deeper understanding of how classical properties arise in a world fundamentally described by quantum theory. This is what the next section is about.

5.4 The Classical Limit

The concept of measurement, or rather the measurement process, plays a central role in discussions of the foundations of quantum theory. During a measurement, the Schrödinger equation is apparently suspended, and the only state in a superposition that survives is the one corresponding to the measurement result. John von Neumann formalised this measurement process and introduced the collapse of the wave function as a new dynamical process, see Sect. 1.4. But why should the measurement of a system play such a crucial role?

Indeed, a measurement is nothing more than an interaction between two systems: the system to be measured and the ‘apparatus’ performing the measurement. So should we not simply call it an interaction with special properties? Perhaps more than anyone else, John Bell spoke out against attributing a special role to measurements in the debate on the foundations of quantum mechanics. In his widely noticed paper “Against ‘measurement’ ”, Bell wrote of the concept of measurement [14, p. 34]: “[…] the word has had such a damaging effect on the discussion, that I think it should now be banned altogether in quantum mechanics.”

The problem of measurement in quantum theory is really part of a more general problem: How and when do classical properties form? So the actual issue is the problem of the classical limit – an aspect that goes unnoticed when you attribute a special role to measurement situations.

The problem of the classical limit had been discussed early on. At the 1927 Solvay Conference, Max Born asked how it could be understood that the trace of every alpha particle in a Wilson chamber appears as an (almost) straight line, although a spherically symmetric wave function is needed to describe its propagation. Two years later, Nevill Mott presented his idea that the interaction of the alpha particles with the atoms inside the Wilson chamber is responsible for the observed shape of the traces [118]. This idea was not pursued in the time that followed, probably owing to the influence of Niels Bohr and the Copenhagen interpretation.

Decades later, Heinz-Dieter Zeh (Heidelberg, Germany) recognised how strongly quantum systems interact with the degrees of freedom of their environment and how important these interactions are for the classical limit. Macroscopic systems are always coupled to environmental degrees of freedom (e.g., photons, scattering molecules, …), so that they cannot be described as isolated systems. The Schrödinger equation can only be applied to the entire system, which is assumed to be closed; only from its solution for the entire system can one derive the behaviour of the subsystems. One finds that macroscopic subsystems generally show classical behaviour. The interactions with the environmental degrees of freedom lead to a global entanglement of the system and its environment, making the system appear classical. This mechanism is referred to as decoherence.Footnote 30

In the following, we shall briefly outline how decoherence results from the formalism of quantum theory.Footnote 31 A fundamental assumption is that this formalism indeed holds for all systems, without restrictions, and does not require modification by a dynamical collapse.

In accordance with von Neumann, let us simply consider an interaction between a ‘system’ \({\mathscr S}\) and an ‘apparatus’ \({\mathscr A}\) that correlates \({\mathscr S}\) and \({\mathscr A}\) without changing the state of the system; this is the ‘ideal measurement’ described in the preceding section, cf. Eq. (5.4). Once again, we use the simple example of a spin measurement to demonstrate the process. If the system is initially in the state ‘spin up’ or ‘spin down’, the apparatus becomes correlated with this state during the measurement, according to Eq. (5.4).

Of course, due to the superposition principle, the state of the system may well be a superposition of different states with arbitrary complex coefficients α and β. The interaction then leads to

$$\displaystyle \begin{aligned} \left(\alpha\vert\negthinspace\uparrow\rangle+\beta\vert\negthinspace\downarrow\rangle\right)\vert\phi_0\rangle \longrightarrow \alpha\vert\negthinspace\uparrow\rangle\vert\phi_{\uparrow}\rangle+\beta\vert\negthinspace\downarrow\rangle\vert\phi_{\downarrow}\rangle. \end{aligned} $$
(5.7)

But this corresponds, as in (5.5), to a superposition of different states of the apparatus (‘pointer states’)! So far, we have only repeated von Neumann’s argument, from which he concluded the necessity of an additional dynamics (‘collapse’ or ‘reduction’ of the wave packet).

Following Zeh’s idea, we now take into account that the apparatus \({\mathscr A}\) is not an isolated system but interacts with the degrees of freedom of the environment, which we shall denote by \({\mathscr E}\). If |E₀〉 denotes the initial state of the environment, then this state will become correlated with the states of the apparatus (and thereby, indirectly, with the states of the system) when apparatus and environment interact. Applying the superposition principle seems to worsen the situation compared to (5.7), because now the degrees of freedom of the environment also need to be taken into account:

$$\displaystyle \begin{aligned} \left(\alpha\vert\negthinspace\uparrow\rangle\vert\phi_{\uparrow}\rangle+\beta\vert\negthinspace\downarrow\rangle\vert\phi_{\downarrow}\rangle\right)\vert E_0\rangle \longrightarrow \alpha\vert\negthinspace\uparrow\rangle\vert\phi_{\uparrow}\rangle\vert E_{\uparrow}\rangle+\beta\vert\negthinspace\downarrow\rangle\vert\phi_{\downarrow}\rangle\vert E_{\downarrow}\rangle. \end{aligned} $$
(5.8)

This is an entangled state between, in general, many degrees of freedom that can be spatially far apart (as in the EPR situation). The crucial point, however, is that – in contrast to the apparatus – the degrees of freedom of \({\mathscr E}\) are not observable. For example, photons could scatter off the surface of the apparatus and then disappear irreversibly. What can actually be observed locally (at the system or at the apparatus) follows from the reduced density matrix (cf. Appendix). If one assumes (quite realistically) that the environment states |E↑〉 and |E↓〉 belonging to the different components are nearly orthogonal, then the density matrix for (5.8) becomes:

$$\displaystyle \begin{aligned} \rho \approx \vert\alpha\vert^2\vert\negthinspace\uparrow\rangle \langle\uparrow\negthinspace\vert \otimes \vert\phi_{\uparrow}\rangle \langle\phi_{\uparrow}\vert + \vert\beta\vert^2\vert\negthinspace\downarrow\rangle \langle\downarrow\negthinspace\vert \otimes \vert\phi_{\downarrow}\rangle \langle\phi_{\downarrow}\vert. \end{aligned} $$
(5.9)

But this is precisely the density matrix of a classical statistical ensemble of spin up and spin down. The information on possible interferences, expressed by the off-diagonal elements of the density matrix, has passed into correlations between the apparatus and unavailable degrees of freedom of the environment: “The interference terms still exist, but they are not there!”Footnote 32 This discussion of the spin measurement is, of course, valid for system–apparatus–environment interactions in general.
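A minimal numerical sketch (our illustration, with a single two-dimensional ‘environment’ standing in for the many environmental degrees of freedom) shows how the off-diagonal elements of the reduced density matrix are suppressed by the overlap of the environment states:

```python
import numpy as np

alpha = beta = 1 / np.sqrt(2)
up, down = np.array([1.0, 0.0]), np.array([0.0, 1.0])

def reduced_rho(overlap):
    """Reduced density matrix of the spin after tracing out the environment."""
    # Two normalised environment states with <E_up|E_down> = overlap
    E_up = np.array([1.0, 0.0])
    E_down = np.array([overlap, np.sqrt(1.0 - overlap**2)])
    psi = alpha * np.kron(up, E_up) + beta * np.kron(down, E_down)
    rho_full = np.outer(psi, psi.conj()).reshape(2, 2, 2, 2)
    return np.trace(rho_full, axis1=1, axis2=3)  # trace over the environment

for overlap in (1.0, 0.5, 0.0):
    print(f"<E_up|E_down> = {overlap}: off-diagonal = {reduced_rho(overlap)[0, 1]:.3f}")
```

For orthogonal environment states (overlap 0) the interference terms vanish, leaving the classical-looking mixture of Eq. (5.9).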

The first quantitative calculations of decoherence in realistic situations were performed by Erich Joos and Zeh in 1985 [100]. Part of these applications concerned the important question of the localisation of objects. Because the superposition principle holds, one should not expect objects to be in specific localised states; the generic case in quantum theory is a superposition of localised states, i.e., an extended state. Joos and Zeh showed that a very weak coupling to environmental degrees of freedom suffices for macroscopic objects to decohere, i.e., to localise. For example, a dust particle in interstellar space whose state is a superposition of different locations will interact strongly enough with the 3 K cosmological background radiation, present everywhere in the universe, to appear as a classical (localised) particle if its radius exceeds a mere 10⁻³ cm. It is not the particle’s path that is disturbed by the scattering – it is the environment that is changed. The interaction causes an entanglement with the environment, and this entanglement causes the decoherence. Hence, entanglement is responsible not only for the pure quantum properties of a system but also for the emergence of classical behaviour.

Thus, objects do not possess classical properties per se. The degree to which they appear classical depends on the details of their interaction with the environment; these details can be obtained from quantitative calculations. Because decoherence generally happens very fast, it looks like a spontaneous localisation, a ‘quantum jump’. In contrast to a (never observed) dynamical collapse that would violate the Schrödinger equation, it is therefore referred to as an apparent collapse of the wave function. All observed phenomena, including all measurement processes, can (at least in principle) be consistently described by applying the Schrödinger equation to the entire system and restricting attention to the subsystems in question. Consequently, no dynamical collapse has so far been needed to explain the outcome of any known experiment.

All experiments on decoherence, beginning with the first ones in 1996, have confirmed the theoretical predictions. Worth highlighting are the Vienna experiments on the gradual disappearance of interference patterns through controlled interaction with the environment. These interference patterns are created with, for example, fullerene molecules that are sent through a Talbot-Lau interferometer and interfere with themselves. Introducing a gas as a scattering environment [82] or heating the molecules so that they emit photons [83] makes the interference pattern disappear, in accordance with the predictions by Joos and Zeh [100] and others.Footnote 33

The Nobel lectures by Serge Haroche and David Wineland give impressive accounts of entanglement and decoherence experiments [84, 170]. The theoretical considerations on the classical limit have become part of the routine of quantum mechanics.

Decoherence also plays a role in discussions of how relevant quantum mechanical superpositions might be for understanding human consciousness. Such a relevance had been suggested by Roger Penrose, among others, in the late 1980s. In detailed calculations, Max Tegmark was able to show that, due to decoherence, such superpositions in the brain – even if they were present – would disappear far too rapidly to be of relevance for consciousness [155].Footnote 34 This example shows the wide scope of applications of decoherence, i.e., of applications based on the quantum mechanical formalism.

The importance of decoherence lies in the fact that it explains the validity of classical concepts and, at the same time, defines their range of validity. Objects appear classical, even though, on a fundamental level, they are described by quantum theory. The wave-particle ‘complementarity’, a historically important principle for quantum theory, follows naturally from applying quantum mechanics to realistic situations, that is, from the process of decoherence. The fundamental concept of a state is a wave function in a generally high-dimensional configuration space, from which follow, depending on the specific context, particle-like or wave-like properties in the three-dimensional space familiar to us.

Decoherence also solves a possible inconsistency in the Everett interpretation (see Sect. 5.3): With respect to which variables do the various branches of the wave function become mutually independent? The natural interaction with the environment selects a specific set of variables (for example, the position basis in the case of the localisation of objects discussed above). They define the robust, quasi-classical branches of the wave function. In Bohm’s theory, it is the decoherent branches that carry the ‘particles’ as classical properties emerge, as opposed to the independent, empty wave packets.

With certain additional assumptions, the probability interpretation of usual quantum mechanics (the Born rule) can now be understood within the framework of the Everett interpretation. Many derivations, especially those making use of the density matrix concept, are circular, because the desired outcome is already implicit in the ansatz, cf., e.g., Wallace [161, part II]. Other derivations, especially Zurek [183], work exclusively with the entangled state of the entire system and try to deduce the probabilities of the branches from their number of occurrences within the total wave function. Whether this constitutes an actual derivation of the probability interpretation or rather a consistency consideration is a matter of debate. In any case, these analyses show that the Everett branches of the wave function can be interpreted consistently and realistically, at least as ‘heuristic fiction’ in the sense of Zeh [179, chapters 3 and 5].Footnote 35

The probability interpretation can be applied as soon as there is decoherence. Then the interferences between states corresponding to different ‘measurement results’ are no longer observable. This is also the moment when one is allowed to apply the ‘Heisenberg cut’ (see Sect. 4.4). The dynamical process of decoherence thus justifies the phenomenological interpretation of the theory. Without decoherence, the probability interpretation makes no sense.

The following (slightly modified) table from Joos [99, p. 194] sets the main properties of the Everett interpretation against the corresponding properties of collapse models.

| Collapse models | Everett |
| --- | --- |
| How and when does a collapse occur? | What is the exact structure of the Everett branches? |
| Traditional psycho-physical parallelism: perception is parallel to the state of the observer | New form of psycho-physical parallelism: perception is parallel to a component of the universal wave function |
| Probabilities are postulated | Probabilities may potentially be derived from the formalism (controversial) |
| Potential conflicts with relativity | No conflict with local interactions |
| Experimental test: search for collapse-like deviations from the Schrödinger equation | Experimental test: search for macroscopic superpositions |
| Seems impossible because of decoherence | Seems impossible because of decoherence |

The Everett interpretation (which makes use of the unaltered linear formalism of quantum theory) and collapse models (which explicitly modify the Schrödinger equation) are in principle distinguishable by experiment. For macroscopic superpositions this seems impossible because of decoherence, but it is conceivable to test the predictions of specific collapse models in mesoscopic scenarios [9].

One objection against the Everett interpretation is that we do not perceive the other macroscopic components of the wave function; hence, they do not exist. But what would the world look like if the Everett interpretation were correct? Because of decoherence, it would look exactly the way we perceive it. This debate brings to mind the historic dispute between the Ptolemaic and the Copernican systems, which went on for centuries. Everett himself made this comparison in a note added in proof [72, p. 460]:

Arguments that the world picture presented by this theory is contradicted by experience, because we are unaware of any branching process, are like the criticism of the Copernican theory that the mobility of the earth as a real physical fact is incompatible with the common sense interpretation of nature because we feel no such motion. In both cases the argument fails when it is shown that the theory itself predicts that our experience will be what it in fact is. (In the Copernican case the addition of Newtonian physics was required to be able to show that the earth’s inhabitants would be unaware of any motion of the earth.)

Only future developments in physics will allow for a final decision in this debate.