1 Introduction

In 1957 Everett [1] analyzed the quantum measurement process in terms of the unitary evolution of a total quantum state that describes not only the system to be measured, but also the apparatus, the observers of the apparatus and the observers of the observers.Footnote 1 Several shortcomings initially plagued the interpretation that Everett introduced. The ‘preferred basis’ problem seems to be solved by decoherence theory. Another question that was not sufficiently well addressed was the Born rule. That this problem remained open until rather recently is clear from the discussions by Weinberg [4] and the criticism by Hemmo and Pitowsky [5]. New attempted proofs are still put forward. Wallace [6, 7] has made a very elaborate proof that improves on Deutsch’s [8] use of decision theory. The proofs by Sebens and Carroll [9,10,11], which utilize the kind of ‘self-locating’ uncertainty introduced by Vaidman [12], have been criticized by Kent [13]. The critique by Kent is concerned with the lack of justification for the use of classical probabilityFootnote 2 in a situation which is inherently quantal. This critique may be equally appropriate against the recent attempt by McQueen and Vaidman [11].Footnote 3

Another criticism from Kent [14] and Maudlin [15] is directed to the lack of clear statements that define the theory. As Everett’s quantum mechanics has abandoned the traditional postulates, there is a need for a new set of postulates. Everett assumed that the quantum state belongs to a Hilbert space, but this is a somewhat abstract mathematical notion. Everett took the position that “The wave function is taken as the basic physical entity with no a priori interpretation. Interpretation comes after an investigation of the logical structure of the theory.” However, if we are to understand how the notion of probability appears, we need a firm grip on the interpretation of the quantum state. This article offers the interpretation that Everett postponed.

The problematic relation between the theory of decoherence and any derivation of the Born rule emphasizes the need for a better starting point of the theory. The decoherence theory is based on the Born rule, but the decoherence theory is needed in order to give the branches for which the Born rule should give the probabilities. This relation constitutes an unacceptable circularity [16].

In discussions of decoherence, Wallace has relied on the Hilbert norm being a proper measure of what is important or negligible, thus avoiding reliance on the Born rule. Wallace argues, “the Hilbert space norm is a perfectly objective feature of the physics, before any considerations of probability.” However, that raises the question: What are the reasons for the use of the Hilbert space and its norm?

Some attempts to prove the Born rule assume, without proof, that probabilities are conserved under unitary transformations [6, 9, 10, 17, 18], that probability is a local quantity in space, or that probability is independent of what happens later [1, 6]. Several approaches to the Born rule only address a situation after the measurement [9,10,11] or after a pre-measurement [17, 18]. These shortcomings call for a better derivation of the Born rule in the context of Everett’s quantum mechanics.

For the analysis to be convincing, it is pivotal that there is a definite and physically motivated starting point, that everything is derived from clearly stated assumptions [19], and that the discussion of the Born rule probabilities addresses the situation prior to measurement. Not even the applicability of the probability concept can be taken for granted, as is illustrated by the criticism from Albert [20] and Kent [21].

2 Postulates

If measurements are to be described by Everett’s Quantum Mechanics (EQM), a wave packet entering the volume of a detector has to correspond to the particle entering the detector. To conclude this, we have to rely on something other than the intuition that we have gained from using the standard (Copenhagen) postulates, as they are abandoned in EQM. To this end, two postulates are formulated. The first postulate addresses how the wavefunction describes position; the second defines the dynamics.

Postulate EQM 1

The quantum state: The state is a set of complex functions of positions

$$\begin{aligned} \varPsi =\{\psi _{jk}(t,{\mathbf {x}}_1,{\mathbf {x}}_2, \ldots )\} \end{aligned}$$
(1)

where index k is for gauge components and j is a composite index for the spin components of all particles. Its basic interpretation is that the density

$$\begin{aligned} \rho _j(t, {\mathbf {x}}_1,{\mathbf {x}}_2,\ldots ) = \sum _k |\psi _{jk}(t, {\mathbf {x}}_1,{\mathbf {x}}_2, \ldots )|^2 \end{aligned}$$
(2)

answers where the system Footnote 4 is in position and spin. It is absolute square integrable and normalized to one

$$\begin{aligned} \int \! \int \!\cdots dx_1dx_2 \cdots \sum _{jk} |\psi _{jk}(t, {\mathbf {x}}_1,{\mathbf {x}}_2, \ldots )|^2 = 1. \end{aligned}$$
(3)

This requirement signifies that the system has to be somewhere, not everywhere. If the value of the integral is zero, the system does not exist anywhere.

With the usual way of writing the norm \(\Vert \cdot \Vert\), Eq. (3) can be written \(\Vert \varPsi \Vert ^2 = 1\). If something is measurable, then it is possible to separate such a small part from the rest.Footnote 5 The separated part will act as a system of its own, thus cannot have zero norm. The difference between two states \(\varPsi\) and \(\varPsi '\) for which \(\Vert \varPsi - \varPsi ' \Vert = 0\) can have no measurable consequences, as \(\varPsi - \varPsi '\) is, according to EQM 1, physically equivalent to a function which is zero everywhere. This equivalence implies that the state of the system can be viewed as a vector in the Hilbert space of functions of the type (1), the \(L^2\) Hilbert space.

The state vector \(\psi\) is not directly observable as it is gauge dependent, while the density (2) is independent of gauge and is, in principle, an observable quantity. The density, the distribution for where the system is located, gives how much the system is present at a location in configuration space \({\mathbf {x}}_1, {\mathbf {x}}_2, \ldots\), with the discrete index j. The density can be denoted the position distribution or the presence distribution, and both will be used here. The quantity presence has previously been denoted measure of existence by Vaidman [22] and caring measure by Greaves [23, 24], but they have not fully clarified its meaning. The context in which they discussed the meaning of the quantity \(\rho\) was that of probabilities and the Born rule. They did not derive the Born rule from their concepts, but EQM 1 turns out to be a powerful starting point for proving the Born rule.

In EQM 1, there is no mention of any relation between the density (presence) (2) and probability. When the propagation of different parts is dependent on each other due to coherence, the concept of probability is not relevant. However, the density \(\rho _j(t, {\mathbf {x}}_1, {\mathbf {x}}_2,\ldots )\) as a distribution of the particles’ positions is always relevant. It is similar to Schrödinger’s original interpretation of quantum mechanics [25], in which for a single electron \(-e\rho ({\mathbf {x}})\) was assumed to be a (classical) charge density. Schrödinger wrongly assumed it could be used in Maxwell’s equations. There were two reasons for this failure. First, in the many-electron situation, the density \(\rho _j(t, {\mathbf {x}}_1, {\mathbf {x}}_2,\ldots )\) cannot be used in connection with classical electrodynamics. Second, the not yet invented QED should have been used instead of classical electrodynamics. The takeaway from Schrödinger’s attempt is that in 1926 it was appropriate to consider a fundamental distributed quantity. The fundamental significance of \(\rho _j(t, {\mathbf {x}}_1, {\mathbf {x}}_2,\ldots )\) is that it gives where the system is in configuration space, as laid out in EQM 1. Any other interpretation or significance of \(\rho _j(t, {\mathbf {x}}_1, {\mathbf {x}}_2,\ldots )\) should be derived from EQM 1 together with the following postulate and other physical circumstances that can be assumed.

Postulate EQM 2

The equation of motion: There is a linear and unitary time development of the state, e.g.,

$$\begin{aligned} i\hbar \partial _t \varPsi = H \varPsi , \end{aligned}$$
(4)

where H is the Hermitian Hamiltonian. The term unitary signifies that the value of the left-hand side of (3) is a constant of motion for any state (1) of the system.
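The norm conservation required by EQM 2 can be illustrated with a minimal numerical sketch. It assumes a two-component state and an arbitrary 2×2 Hermitian matrix standing in for H, with \(\hbar = 1\); the function names and the numerical values are hypothetical and chosen only for illustration.

```python
import cmath
import math


def evolve_two_level(psi, h, t, hbar=1.0):
    """Return exp(-i H t / hbar) applied to a two-component state psi.

    h is a 2x2 Hermitian matrix given as nested tuples (an illustrative
    stand-in for the Hamiltonian of Eq. (4)).
    """
    (h00, h01), (_, h11) = h
    # Pauli decomposition H = e0*I + nx*sx + ny*sy + nz*sz (valid for Hermitian H).
    e0 = (h00.real + h11.real) / 2
    nz = (h00.real - h11.real) / 2
    nx, ny = h01.real, -h01.imag
    n = math.sqrt(nx * nx + ny * ny + nz * nz)
    phase = cmath.exp(-1j * e0 * t / hbar)
    if n == 0.0:
        return [phase * psi[0], phase * psi[1]]
    theta = n * t / hbar
    c, s = math.cos(theta), math.sin(theta) / n
    # U = phase * (cos(theta) I - i sin(theta) (n . sigma) / |n|)
    u00 = phase * (c - 1j * s * nz)
    u01 = phase * (-1j * s * (nx - 1j * ny))
    u10 = phase * (-1j * s * (nx + 1j * ny))
    u11 = phase * (c + 1j * s * nz)
    return [u00 * psi[0] + u01 * psi[1], u10 * psi[0] + u11 * psi[1]]


def norm_squared(psi):
    """Discrete analogue of the left-hand side of Eq. (3)."""
    return sum(abs(a) ** 2 for a in psi)


hamiltonian = ((1.0, 0.5 - 0.2j), (0.5 + 0.2j, -0.3))  # hypothetical values
psi0 = [0.6, 0.8j]                                      # normalized to one
norms = [norm_squared(evolve_two_level(psi0, hamiltonian, t))
         for t in (0.0, 0.7, 3.0, 12.5)]
```

Every entry of `norms` stays at one up to rounding, which is the discrete counterpart of the statement that the left-hand side of (3) is a constant of motion.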

When investigating how the theory describes the world we observe, the Hamiltonian has to be assumed to have realistic features. We should realize that we have no proper understanding of a world where the interactions are different from those that govern this world. In particular, measurements are physical processes governed by the existing forces. As the standard model of particle physics is formulated in terms of locally interacting fields, it will be assumed that interactions are local and that there are locally conserved particle currents.

The following relations lend support to the interpretation of the density (2) as the distributed position. For the sake of simplicity, the spin index j and the time dependence are omitted here.

From the system density (2) a single-particle density for the N particles of the same kind can be calculated

$$\begin{aligned} \rho ({\mathbf {x}}) = N\int d^3x_2 d^3 x_3 \cdots \rho ({\mathbf {x}}, {\mathbf {x}}_2, {\mathbf {x}}_3, \ldots ). \end{aligned}$$
(5)

Similarly, a two-particle density can be defined as

$$\begin{aligned} \rho ({\mathbf {x}}_a, {\mathbf {x}}_b) = N_aN_b\int d^3x_3 d^3 x_4 \cdots \rho ({\mathbf {x}}_a, {\mathbf {x}}_b, {\mathbf {x}}_3, {\mathbf {x}}_4 \ldots ) \end{aligned}$$
(6)

where \(N_b = N_a -1\) if the two particles are ‘identical’. If relevant collective coordinates are introduced \(X_1, X_2, \ldots\), a corresponding density \(\rho (X_1, X_2, \ldots )\) can be defined. These densities are physically significant. If all interactions are local, the single-particle density is locally conserved,

$$\begin{aligned} \partial _t \rho + \nabla \cdot {\mathbf {j}} = 0, \end{aligned}$$
(7)

and so are the two-particle and system densities.
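The reduction from the system density (2) to the single-particle density (5) can be sketched numerically. The example below is only an illustration under stated assumptions: two identical spinless particles in one dimension, a symmetrized product of two Gaussian packets as a hypothetical wavefunction, and a plain Riemann sum in place of the integral.

```python
import math


def gaussian(x, center, width):
    """Unnormalized Gaussian amplitude (illustrative wave packet)."""
    return math.exp(-((x - center) ** 2) / (4 * width ** 2))


def two_particle_density(xs, dx):
    """Normalized |psi(x1, x2)|^2 on a grid for a symmetrized two-packet state."""
    psi = [[gaussian(x1, -1.0, 0.5) * gaussian(x2, 1.0, 0.5)
            + gaussian(x1, 1.0, 0.5) * gaussian(x2, -1.0, 0.5)
            for x2 in xs] for x1 in xs]
    rho = [[abs(a) ** 2 for a in row] for row in psi]
    norm = sum(sum(row) for row in rho) * dx * dx  # discrete form of Eq. (3)
    return [[r / norm for r in row] for row in rho]


def single_particle_density(rho2, dx, n_particles=2):
    """Discrete form of Eq. (5): integrate out x2 and multiply by N."""
    return [n_particles * sum(row) * dx for row in rho2]


dx = 0.05
xs = [-5.0 + i * dx for i in range(201)]
rho2 = two_particle_density(xs, dx)
rho1 = single_particle_density(rho2, dx)
total = sum(rho1) * dx  # integrates to the particle number N = 2
```

With this normalization the single-particle density integrates to the number of particles, here two, and it is peaked around both packet centers.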

The following illustrates the physical significance of the single-particle density. For a bound system, the single-particle density can be probed with an external potential by measuring the related energy change,

$$\begin{aligned} \varDelta E = \int d^3x \, V({\mathbf {x}}) \rho ({\mathbf {x}}) . \end{aligned}$$
(8)
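A minimal sketch of how relation (8) is evaluated, assuming a one-dimensional Gaussian single-particle density and a weak harmonic probe potential; both are hypothetical choices, and the integral is replaced by a Riemann sum.

```python
import math

SIGMA = 1.0   # width of the assumed Gaussian density
K = 0.2       # strength of the assumed probe potential


def density(x):
    """Normalized Gaussian single-particle density (one particle, 1D)."""
    return math.exp(-x * x / (2 * SIGMA ** 2)) / math.sqrt(2 * math.pi * SIGMA ** 2)


def potential(x):
    """Weak external probe potential V(x) = K x^2 / 2."""
    return 0.5 * K * x * x


def energy_shift(v, rho, xs, dx):
    """Discrete form of Eq. (8): Delta E = integral of V(x) rho(x)."""
    return sum(v(x) * rho(x) for x in xs) * dx


dx = 0.01
xs = [-10.0 + i * dx for i in range(2001)]
delta_e = energy_shift(potential, density, xs, dx)
```

For these choices the integral can be done by hand and equals \(K\sigma^2/2\), so the numerical value of `delta_e` can be compared with 0.1.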

In nuclear and particle physics, electron scattering can be used to extract the ground-state charge density. For nuclei, this corresponds to the single-particle density of protons; for hadrons, it corresponds to a combination of the up and down quark single-particle densities.

Two-particle and many-particle correlations in the wavefunction of a complex system can be extracted from static properties and excitation probabilities to excited states with various properties. Hund’s rule for the structure of the ground-state properties of atoms states that if the valence (nl) shell is half-filled, then the system has maximal total spin S and orbital angular momentum L. In this state, the electrons are as far away as possible from each other, minimizing the electron-electron repulsive Coulomb energy. The corresponding two-particle density is zero for \({\mathbf {x}}_a = {\mathbf {x}}_b\).

The structure of molecules is interesting because it illuminates the relevance of both single-particle densities and correlations. The electrons move in the electric field from the nuclei, and, in the Born–Oppenheimer approximation, the nuclei move in the electric field of the electrons given by their single-particle density. The electronic energies give rise to correlations of the positions of the nuclei that are well represented by the structure of the system density (2).

Collective coordinates are suited for the location and orientation of macroscopic bodies. The density in those coordinates may describe where the different items in the laboratory are located, which enables the quantum description of experiments. Note that it is the interactions that cause atoms to exist and make them bind together into molecules, crystals, and different kinds of macroscopic bodies. That the macroscopic objects are found in well-defined positions is due to decoherence [26]. Of course, EQM 1 does not affect any dynamical process or which wavefunctions are allowed. Specifically, EQM 1 does not dictate the answer to the preferred-basis problem. Instead, the answer is given by the decoherence in systems of macroscopic objects surrounded by gases of molecules, photons, neutrinos, and gravitons. EQM 1 is chosen to enable the investigation of whether and how EQM can describe nature. It fulfills an epistemic need.

The discussion above shows that if we wish to interpret the meaning of \(\rho _j({\mathbf {x}}_1, {\mathbf {x}}_2, \ldots )\) without any attention to the measurement process, the interpretation that is given by EQM 1 or something to the same effect seems unavoidable.

2.1 Alternative Postulates

Below are listed the standard measurement postulates, which are replaced by EQM 1 and 2.

  1. S1

    The state of a physical system is a normalized vector \(| \varPsi \rangle\) in a Hilbert space H, which evolves unitarily with time.

  2. S2

    Every measurable quantity is described by a Hermitian operator (observable) B, acting in H.

  3. S3

    The only possible result of measuring a physical quantity is one of the eigenvalues of the corresponding observable B.

  4. S4

    The probability for obtaining eigenvalue b in a measurement of B is \(P(b) = \langle \varPsi | \pi _b |\varPsi \rangle\), where \(\pi _b\) is the projector onto the eigen-subspace of B having eigenvalue b.

  5. S5

    The post-measurement state is (the result of the unitary development during the measurement of) \(\pi _b|\varPsi \rangle / P(b)^{1/2}\).

Some modern formulations of the postulates allow for positive operator-valued measures (POVMs), but that generalization offers nothing extra here. It is the same as the projection-valued measurement postulates S2–S5 up to a unitary transformation [27].

The standard postulates amount to a partial interpretation of quantum mechanics. It is complete enough for the investigation of well-defined localized systems, but not for the environment as a whole. EQM 1 and 2 imply the content of these postulates. In the comments to EQM 1, it is shown that the state belongs to a Hilbert space so that EQM 1 and 2 imply S1. In EQM, the measurement is as any other process described by the dynamics given by EQM 2. Section 3 shows how S2 and S3 are implied.

Everett [1], Wallace [7], and others posit that the wavefunction belongs to a Hilbert space. As they do not explicitly state which Hilbert space they refer to, one might wrongly conclude that it is an abstract Hilbert space with no relation to the physical world. Nevertheless, the dynamical Eq. (4) relates each degree of freedom with a particle type. Still, these authors do not give any rule for what the wavefunction amplitude signifies. One cannot directly rely on standard practice, as that is motivated by the standard postulates, in particular, the Born rule (S4). Wallace argued in connection with the decoherence theory that the Hilbert norm measures the relative importance of different parts of the state. To start by postulating the mathematics and then derive the physics from it is, at best, a backward way to define the theory.Footnote 6

Geroch [29] suggested a related interpretation of the quantum state, which states that a region of configuration space is ‘precluded’ if the wavefunction is very small there. This suggestion corresponds to ignoring contributions from configuration space where the system is hardly present. From EQM 1, this recipe may be motivated in some situations, but, as with the Hilbert space postulates, the Geroch approach does not give a physical meaning to the wavefunction.

Can a postulate formulated in terms of momenta, rather than positions in configuration space, replace EQM 1? That seems not to be a practical starting point for our description of the world. We observe a ‘classical world’ of macroscopic objects at reasonably well-defined positions. The position basis is useful when formulating the decoherence theory, which explains the appearance of the classical world. In Sect. 3, the meaning of the amplitudes (absolute square) in another basis is derived from EQM 1 by considering experiments in which a physical process unitarily transforms that basis to localized wave packets. To mimic this procedure in the momentum basis would entail processes of which we have no elementary understanding, and such understanding is necessary for an argument that starts directly from the postulates. To put it simply, if we cannot give a meaning to localized states, how can we then make any connection with the world we observe?

3 Basics of Measurements

It is difficult to analyze which quantities can be measured based on the general and abstract view of quantum mechanics, which is meant to be valid for any type of interaction. As mentioned above, it is assumed that the fundamental interactions are not only local but that they can give rise to the kind of objects that we have around us, for example, detectors and laboratories.

A detector can typically record that a particle has entered it, which can be used to create position information. The momentum of a charged particle can be transformed into a measurement of position. The measurement of photon energy can be transformed into the measurement of a position using a grating. The measurement of the angular momentum of an atom can be transformed into a photon energy measurement by the Zeeman effect or position by a Stern–Gerlach apparatus. These are examples of measurements of physical quantities which correspond to Hermitian operators and can be transformed to a position measurement. Even the recording of the time for an event is, in principle, transformable to a position.

There are measurement methods that do not rely on position measurements. For example, gamma photon energies and other high energy particle energies can be measured by recording the number of produced secondary particles. The primary way in which such detectors are usually calibrated is by comparing with position transforming methods. The following discussion of measurements will be confined to the recording of a particle entering a detector, which we can call a particle recorder. These detectors may be a part of an array of detectors in order to get position information from which detector was hit.

Particle detectors react when a particle is entering a particular volume or area. There is an infinite set of states with support inside the volume (area) and another infinite set of orthonormal states with support only outside. Together they make up a complete basis. The Hermitian operator that corresponds to measurements with this detector can be defined such that all the inside states are eigenstates with a common eigenvalue and the outside with another value. This detector can only tell whether a particle came into it or not. For a particle recorder array, the Hermitian operator may be constructed by associating the same value for all states inside one particle recorder, but different values for the different recorders. Additionally, another value should be attributed to the outside of all particle recorders. In summary, this detector records if any and which of the individual particle recorders fired. This record corresponds to a particular eigenvalue of a Hermitian operator.

The detector described so far is highly idealized. For example, it is unrealistic that a particle recording detector can register particles at any energy. However, in a specific experiment, the energy range of the particles is limited. The described model is relevant as long as the efficiency is close to 100% in the real experiment.

Fig. 1 The position detector consists of the particle recorders D1–D3, which receive the different components of the wavefunction \(|\psi \rangle\) due to the unitary transformation U. The state \(| a_n \rangle\) transforms to \(| y_n \rangle\) by U

Figure 1 shows schematically what is involved when a property corresponding to an operator B is measured with an array of particle recorders. There is a physical process that transforms each eigenstate \(|b \rangle\) of B into one of the eigenstates \(|y \rangle\) of the detector system. If the unitary operator representing the transformation is denoted U, we have

$$\begin{aligned} |y\rangle = U |b\rangle . \end{aligned}$$
(9)

If the state to be measured is expanded in the eigenstates of B,

$$\begin{aligned} |\psi \rangle = \sum _b c_b | b \rangle, \end{aligned}$$
(10)

then the state that enters the position detector system is

$$\begin{aligned} \sum _b c_b U|b\rangle = \sum _b c_b |y_b\rangle . \end{aligned}$$
(11)

This expresses that the different eigenstates \(|b\rangle\) enter separate particle recorders and are there represented by \(|y_b\rangle\).

As the functions \(y_b(j,{\mathbf {x}})\) with differing value of b have disjoint spatial support, the density of the state (11) is

$$\begin{aligned} \rho _j({\mathbf {x}}) = \sum _b |c_b|^2 |y_b(j,{\mathbf {x}})|^2 . \end{aligned}$$
(12)

It describes where the system is according to EQM 1. Summation over the spin and integration over the volume of one of the particle recorders will give the value \(|c_b|^2\), where b is the eigenvalue of B associated with that recorder. The interpretation of this result is that

$$\begin{aligned} \rho _b = |c_b|^2 \end{aligned}$$
(13)

as a function of the discrete variable b, tells where the system is with respect to the eigenvalues of B.
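The step from (11) to (13) can be checked with a small numerical sketch. It assumes a one-dimensional, spinless particle, three recorders occupying disjoint intervals, and hypothetical compactly supported packets \(y_b\); the coefficients are arbitrary illustrative values with \(\sum _b |c_b|^2 = 1\).

```python
import math


def packet(x, left, width):
    """Normalized packet y_b(x) supported only on [left, left + width]."""
    if left <= x <= left + width:
        return math.sqrt(2.0 / width) * math.sin(math.pi * (x - left) / width)
    return 0.0


def presence_per_recorder(coeffs, lefts, width, points=400):
    """Integrate the density of the state (11) over each recorder volume."""
    dx = width / points
    result = []
    for left in lefts:
        total = 0.0
        for k in range(points):
            x = left + (k + 0.5) * dx  # midpoint rule inside this recorder
            amplitude = sum(c * packet(x, l, width) for c, l in zip(coeffs, lefts))
            total += abs(amplitude) ** 2 * dx
        result.append(total)
    return result


coeffs = [0.5, 0.5j, math.sqrt(0.5)]  # hypothetical c_b
lefts = [0.0, 2.0, 4.0]               # recorders D1-D3 on disjoint intervals
rho_b = presence_per_recorder(coeffs, lefts, width=1.0)
```

Because the packets do not overlap, the cross terms in \(|\sum _b c_b y_b|^2\) vanish inside each recorder, and `rho_b` reproduces \(|c_b|^2\) for each b, independent of the phases of the coefficients.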

The interaction between the interior of a particle recorder and the corresponding state of the particle \(|y \rangle\) and the signaling to the environment give rise to decoherence such that branches appear, in which precisely one detector has fired. There is further discussion of decoherence in Sect. 3.1.

In order to simplify the notation, it will be assumed that the state \(|\psi \rangle\) (10), instead of the transported state (11), directly interacts with the (composite) detector. Then, the interaction with the detector M is described by

$$\begin{aligned} \bigg (\sum _b c_b | b \rangle \bigg ) |M_\emptyset \rangle \rightarrow \sum _b c_b | b \rangle ' |M_b \rangle . \end{aligned}$$
(14)

The detector changes its state from its nothing-registered state \(| M_\emptyset \rangle\) to a state \(| M_b \rangle\), which corresponds to the state \(| b \rangle\) having been registered. The state of the system before and after the measurement, \(| b \rangle\) and \(| b \rangle '\), respectively, can be the same state. In reality, there is a set of states of the detector that all correspond to the value of b. This and similar complications are henceforth ignored.

According to Everett, the observation process is described by

$$\begin{aligned} \bigg (\sum _b c_b | b \rangle ' |M_b \rangle \bigg ) |O_\emptyset \rangle \rightarrow \sum _b c_b | b \rangle ' |M_b \rangle |O_b \rangle , \end{aligned}$$
(15)

where the state of the observer O is altered to having observed the value that the detector has measured. The distribution \(\rho _b\) gives the position of the total system over the branches. Put another way, the value of \(\rho _b\) gives the presence at the branch with the outcome b of the observer and everything else entangled with the measurement result.

The significance of \(\rho _b\) is the most important result from this section, as it is crucial for the derivation of the Born rule. The postulates S2 and S3 have been derived, which is apparent from Eqs. (14) and (15), which express that the detector and its observer measure an eigenstate of the intended operator. This finding is familiar from Everett’s original discussions [1], but there the standard Hilbert space rules were used, while the current discussion shows how S2 and S3 emanate from EQM 1, EQM 2, and the assumed properties of the interactions. The locality of interactions and the appearance of locally conserved currents are necessary assumptions for the setup in Fig. 1 to be viable.

3.1 Decoherence: Selector and Protector

There may be an ambiguity in the transformation (14). If \(|\psi \rangle\) is written in another basis \(|x \rangle\), consisting of eigenstates of an operator X with \([X,B] \ne 0\), then we get

$$\begin{aligned} \bigg (\sum _x d_x | x \rangle \bigg ) |M_\emptyset \rangle \rightarrow \sum _x d_x | x \rangle ' |M'_x \rangle , \end{aligned}$$
(16)

where the detector states \(|M'_x \rangle\) are linear combinations of the states \(| M_b \rangle\). From (16), it might look as if the quantity X has been measured. However, the assumed experimental setup, Fig. 1, with realistic properties of the particle recorders guarantees that B is measured, as expression (14) suggests.

When a particle recorder is excited because the system enters it, very many degrees of freedom get excited. The possibility of interference between the terms in the right-hand side of (14) is then negligible if the measurement setup is appropriately performed. For example, for an interference to be possible between the terms that correspond to particle recorder number 1 in Fig. 1 being excited and particle recorder 2 being excited, respectively, all the particles of recorder 1 have to be in the same state in the two terms, and the same has to apply to recorder 2. Considering the vast number of particles that change their state during an interaction with the incoming particle, the presence of this situation is minimal.
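The scaling behind this argument can be illustrated with a toy estimate. It rests on an assumption that is not part of the text: the recorder states in the two terms factorize over particles, and each particle contributes the same overlap slightly below one. The function name and the numbers are hypothetical.

```python
def residual_overlap(per_particle_overlap, n_particles):
    """Overlap of two product states whose single-particle factors each
    overlap by `per_particle_overlap` (toy model, assumed factorization)."""
    return per_particle_overlap ** n_particles


# Even a 1 % change per particle suppresses the interference term rapidly
# as the number of affected particles grows.
overlaps = {n: residual_overlap(0.99, n) for n in (10, 1_000, 100_000)}
```

In this toy model the interference term is proportional to the overlap, which falls off exponentially with the number of particles whose state has changed.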

The storing of the measurement data into some memory, constructed to be resilient and with considerable redundancy, is itself enough to hinder any coherence between the possible measurement values. If the data are written on paper, the ink molecules that attach to the paper are not likely to lose their position by quantum spreading. They attach to the paper and each other, forming macroscopic structures. It is well known that macroscopic structures are measured continuously by their surroundings [26]. The quantum Zeno effect then implies that the quantum uncertainty of the position of the writing will be minimal. If nothing else protects from coherence between the terms of (14), the way we store the data guarantees that we will not notice any effects of coherence.Footnote 7

The decoherence defines a unique basis for the detector states. In the case of a measurement setup like in Fig. 1, the states \(|M_b\rangle\) are local in space, while the alternative basis states \(|M'_x \rangle\) used in (16) are not. The non-diagonal matrix elements of a local operator L within the \(|M_b \rangle\)-basis are FAPP (for all practical purposes) zero, while in the alternative basis, that is not at all the case. The arguments in the last paragraph imply that for any experimental setup, the branches are kept apart in configuration space, so that a local operator still has FAPP zero matrix elements between different branches.Footnote 8 There is no ambiguity in the measurement basis.

The derivation of the decoherence mechanism [31, 32] is based on the traditional interpretation of quantum mechanics. Joos [16] and Baker [33] questioned the use of decoherence theory to infer the Born rule as it is already assumed. However, decoherence theory primarily relies on the Born rule to conclude that the environmental particles are ‘measured’ somewhere after they scattered off ‘macroscopic’ systems. The interpretation given by EQM 1 serves well to replace the Born rule in decoherence theory.

Kent [21] and Dawid and Thébault [34] argued that the ‘fuzziness’ that decoherence gives to the definition of the branches is unacceptable. The point is that the observer’s beliefs then cannot give a well-defined theory of a well-defined Born rule. However, the fuzziness of branches is the fuzziness of actual measurements. It is generally assumed that the experiments can be made arbitrarily exact, in principle. If that is not the case, several interpretations will be in trouble. If the fuzziness of decoherence is a problem for EQM, the experimental verification of the Born rule is equally in trouble. The decoherence is an integral part of our understanding of the measurement apparatus and is independent of any interpretation of quantum mechanics.

If decoherence has not occurred, no measurement has been performed. Consider a photon experiment in which there is no decoherence-causing detector. Then the photon will fly around in a maze of mirrors, beamsplitters, etc. Finally, the photon will be absorbed in one of those elements of the setup or a laboratory wall or similar body that causes decoherence without any recording. As long as we have a quantum phenomenon with only very few degrees of freedom of elementary or collective nature, it is not yet a measured system.

In the continuation, the term ‘world’ includes all the relevant environment. If we are in a well-defined branch and we need not consider recoherence, then that is our world. Branch denotes one of several structures for which the world wavefunction has a negligible amplitude at regions separating them during an extended period.

3.2 The Measurement Result

So far, it has been established that the measurement setup, as in Fig. 1, can create one well-defined branch for every possible measurement value, each of which is an eigenvalue of a Hermitian operator. The quantum state of the branch is that of the eigenstate (with amplitude \(c_b\)) entering the detector. The standard postulates S1, S2, and S3, Sect. 2, are fulfilled in each branch, as well as S5 if the current branch is renormalized to norm one by the observer within a branch.Footnote 9

Looking at the many branches from the outside, the question “What reading did the observer get?” is equivalent to “What is the distribution of observer readings?” The answer is given by the distribution \(\rho _b\) (13). This value is the total density (the norm) of the b-term in the final state of (14) or (15). Note that once decoherence has taken place, the created branches evolve independently, keeping their norms conserved.

If \(\rho ({\mathbf {x}}_1,{\mathbf {x}}_2, \ldots )\) describes what exists, that is, is ontic, then we must consider that \(\rho _b\) is also ontic. This relation implies that in any basis, the distribution \(\rho _b\) gives information about what exists, but in general, it is not the full information.

4 Repeated Measurements

Suppose the detector is able to record several subsequent measurements of identically prepared systems (10). Further, assume that the way the detector interacts with the next system is not essentially affected by previous measurements. The second measurement is described by the transition

$$\begin{aligned}&\bigg (\sum _{b_2} c_{b_2} | b_2 \rangle \bigg ) \sum _{b_1} c_{b_1} | b_1 \rangle ' |M_{b_1} \rangle \rightarrow \sum _{b_1b_2} c_{b_2} c_{b_1} | b_2 \rangle ' | b_1 \rangle ' |M_{b_1b_2} \rangle . \end{aligned}$$
(17)

When the interaction with the observer is included the final state becomes

$$\begin{aligned} \sum _{b_1b_2} c_{b_2} c_{b_1} | b_2 \rangle ' | b_1 \rangle ' |M_{b_1b_2} \rangle |O_{b_1b_2} \rangle . \end{aligned}$$
(18)

Each sequence of readings belongs to a different branch. The distribution of observer reading sequences is now

$$\begin{aligned} \rho _{b_1b_2} = |c_{b_1}|^2|c_{b_2}|^2. \end{aligned}$$
(19)

After N measurements, the sequences of observer readings are distributed according to

$$\begin{aligned} \rho _{b_1b_2...b_N} = |c_{b_1}|^2|c_{b_2}|^2\cdots |c_{b_N}|^2. \end{aligned}$$
(20)

To focus on the value \(b = u\), denote the summed density of all the other values of b by

$$\begin{aligned} \rho _{\lnot u} = \sum _{b\ne u} |c_b|^2 \end{aligned}$$
(21)

and \(\rho _u = |c_u|^2\). The sum of the densities (20) over all sequences where \(b=u\) appears precisely m times out of N measurements is

$$\begin{aligned} \rho (m \!:\! N | u) = \frac{N!}{(N-m)!m!}(\rho _u)^m (\rho _{\lnot u})^{N-m}. \end{aligned}$$
(22)

This gives the total summed density of the branches in which the value u was found by the observer m times. Hence, the question “How many times has the observer measured the value u?” is answered by \(\rho (m\!:\! N | u)\) as a distribution over m-values.
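
As a numerical illustration of Eq. (22), the following minimal Python sketch sums the sequence densities (20) by brute force and compares the result with the closed form. The three outcome labels, the values of \(|c_b|^2\), and \(N = 6\) are hypothetical choices made only for this example.

```python
from itertools import product
from math import comb, isclose

# Hypothetical densities |c_b|^2 for three outcomes; "u" is the value of interest.
rho = {"u": 0.3, "v": 0.5, "w": 0.2}
N = 6  # number of repeated measurements (kept small so all 3**N sequences can be listed)


def sequence_density(seq):
    """Density of one sequence of readings, Eq. (20)."""
    density = 1.0
    for b in seq:
        density *= rho[b]
    return density


def summed_density(m):
    """Sum of Eq. (20) over all sequences in which 'u' appears exactly m times."""
    return sum(
        sequence_density(seq)
        for seq in product(rho, repeat=N)
        if seq.count("u") == m
    )


def binomial_density(m):
    """Closed form of Eq. (22), with rho_not_u defined as in Eq. (21)."""
    rho_u = rho["u"]
    rho_not_u = sum(value for b, value in rho.items() if b != "u")
    return comb(N, m) * rho_u**m * rho_not_u ** (N - m)


for m in range(N + 1):
    print(m, summed_density(m), binomial_density(m))
```

The two columns should agree up to rounding, and the densities summed over all m should add up to one.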

For a large number of measured systems N, the distribution (22) may be approximated by a Gaussian, see Feller [36],

$$\begin{aligned} \rho (m\!:\! N | u) \approx \frac{1}{(2\pi N\rho _u\rho _{\lnot u})^{1/2}} \exp \bigg (-\frac{(m-N\rho _u)^2}{2N\rho _u\rho _{\lnot u}}\bigg ). \end{aligned}$$
(23)

The distribution (23) may be represented as a function of the relative frequency \(z=m/N\) taken as a continuous variable. The properly normalized position or presence distribution with respect to z is

$$\begin{aligned} \rho (z | u) = \bigg ( \frac{ N }{ 2\pi \rho _u\rho _{\lnot u} } \bigg )^{1/2} \exp \bigg (-\frac{ N(z-\rho _u)^2 }{2 \rho _u\rho _{\lnot u} }\bigg ). \end{aligned}$$
(24)

As \(N \rightarrow \infty\) this density approaches the delta function \(\delta (z-\rho _u)\). This relation says that at infinitely large \(N\,\), there is only one value of the frequency \(z=\rho _u\). It might look like a big stride towards proving Born’s probability rule, but \(\rho (z | u)\) is an approximate result.
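
The quality of the Gaussian approximation (23) can be checked numerically. The sketch below is only an illustration: it evaluates the exact expression (22) in log space (an implementation choice made here to avoid overflow) and compares it with (23) for the example values \(\rho _u = 0.3\) and \(N = 1000\), which are the values used in the figures.

```python
from math import exp, lgamma, log, pi, sqrt


def rho_exact(m, N, rho_u):
    """Exact density of Eq. (22), computed via log-gamma to avoid overflow."""
    log_binom = lgamma(N + 1) - lgamma(m + 1) - lgamma(N - m + 1)
    return exp(log_binom + m * log(rho_u) + (N - m) * log(1.0 - rho_u))


def rho_gauss(m, N, rho_u):
    """Gaussian approximation of Eq. (23)."""
    var = N * rho_u * (1.0 - rho_u)
    return exp(-((m - N * rho_u) ** 2) / (2.0 * var)) / sqrt(2.0 * pi * var)


N, rho_u = 1000, 0.3
for m in (270, 285, 300, 315, 330):
    print(m, rho_exact(m, N, rho_u), rho_gauss(m, N, rho_u))

# Width of the distribution in the relative frequency z = m/N, cf. Eq. (24).
print("standard deviation in z:", sqrt(rho_u * (1.0 - rho_u) / N))
```

The width in z scales as \(N^{-1/2}\), which is the narrowing that leads to the delta function limit.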

To get from the exact expression for \(\rho (m: N\, |\,u)\) (22) to the continuous frequency distribution, the interval [0, 1] is divided into a set of intervals \(\{I_k\}\),

$$\begin{aligned} I_k = [0,1] \,\cap [z_k-\varDelta z/2, \, z_k+\varDelta z/2[ , \, z_k = \rho _u + k\varDelta z. \end{aligned}$$
(25)

The index k belongs to the minimal set of integers such that \(\{I_k\}\) covers [0, 1]. Define \({\tilde{\rho }}(k)\) as the sum of densities \(\rho (m: N\, |\,u)\) with m/N in the interval \(I_k\). Set

$$\begin{aligned} \rho _{\varDelta z}(z|u) = {\tilde{\rho }}(k)/\varDelta z\; \text{ if }\; z \in I_k. \end{aligned}$$
(26)

This is a histogram-type piecewise constant function. If \(\varDelta z = \varDelta z_1 N^{-1/2}\), where \(\varDelta z_1\) is small and N is large, then \(\rho _{\varDelta z}(z|u)\) can be arbitrarily close to \(\rho (z | u)\).
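
The construction (25) and (26) can be sketched in a few lines of Python. The example assumes that the bin width shrinks with N as \(\varDelta z = \varDelta z_1 N^{-1/2}\), and the values \(\varDelta z_1 = 0.21\), \(\rho _u = 0.3\) and \(N = 10^4\) are illustrative choices, picked so that no value of m/N falls exactly on a bin edge.

```python
from math import exp, floor, lgamma, log, pi, sqrt


def rho_exact(m, N, rho_u):
    """Exact density of Eq. (22), computed in log space."""
    log_binom = lgamma(N + 1) - lgamma(m + 1) - lgamma(N - m + 1)
    return exp(log_binom + m * log(rho_u) + (N - m) * log(1.0 - rho_u))


def rho_gauss_z(z, N, rho_u):
    """Continuous presence distribution of Eq. (24)."""
    s2 = rho_u * (1.0 - rho_u)
    return sqrt(N / (2.0 * pi * s2)) * exp(-N * (z - rho_u) ** 2 / (2.0 * s2))


def histogram(N, rho_u, dz):
    """Return {k: rho_tilde(k)}, the densities summed over m/N in I_k, Eq. (25)."""
    rho_tilde = {}
    for m in range(N + 1):
        # I_k is centred on z_k = rho_u + k*dz, closed below and open above.
        k = floor((m / N - rho_u) / dz + 0.5)
        rho_tilde[k] = rho_tilde.get(k, 0.0) + rho_exact(m, N, rho_u)
    return rho_tilde


N, rho_u = 10_000, 0.3
dz1 = 0.21             # illustrative value of Delta z_1
dz = dz1 / sqrt(N)     # assumed scaling Delta z = Delta z_1 * N**(-1/2)
rho_tilde = histogram(N, rho_u, dz)

for k in (-2, -1, 0, 1, 2):
    z_k = rho_u + k * dz
    # Histogram height of Eq. (26) next to the Gaussian of Eq. (24) at the bin midpoint.
    print(k, rho_tilde[k] / dz, rho_gauss_z(z_k, N, rho_u))
```

Each histogram height is an average of the density over one bin, so it lies slightly below the Gaussian at the peak; the difference shrinks as \(\varDelta z_1\) is reduced.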

In order to adequately justify the use of the frequency distribution (24), an operator should be found that is closely related to this distribution. The first guess may be the frequency operator

$$\begin{aligned} F_N = \frac{1}{N} \sum _{i=1}^N f_i \end{aligned}$$
(27)

where \(f_i\) operates on the i-th system being measured with \(f|u\rangle =|u\rangle\) and \(f|b\rangle = 0\) if \(b\ne u\). The eigenvalues of \(F_N\) are \(z = m/N, \, m= 0,\ldots ,N\). The density related to \(F_N\) acting on this state is given by (22) with m replaced by zN. As pointed out by Squires [37], the density values of this discrete distribution approach zero as \(N \rightarrow \infty\).

Take instead the operator \(F_{N\varDelta z}\) defined by its action on products of eigenstates of the operator B. If the frequency of the eigenvalue u is in the interval \(I_k\) with midpoint \(z_k\), then

$$\begin{aligned} F_{N\varDelta z} |b_N\rangle |b_{N-1}\rangle \ldots |b_1\rangle = z_{k} |b_N\rangle |b_{N-1}\rangle \ldots |b_1\rangle . \end{aligned}$$
(28)

The density of this operator is \({\tilde{\rho }}(k)\). As the eigenvalues \(z_k\) of \(F_{N\varDelta z}\) form a discrete set, its density distribution \(\rho _{z_k} = {\tilde{\rho }}(k)\) is represented by a bar graph rather than the histogram that represents \(\rho _{\varDelta z}(z|u)\).

To see the behavior of these densities as N approaches infinity, the Chebyshev inequality [36] can be applied to the distribution \(\rho (m\!:\! N\, |\,u)\) (22). The result can be written as

$$\begin{aligned} \sum _{|m/N-\rho _u| > \varDelta z/2} \rho (m\!:\! N\, |\,u) \le \frac{4\rho _u\rho _{\lnot u}}{(\varDelta z)^2 N}. \end{aligned}$$
(29)

From this it follows that \(\sum _{k \ne 0}{\tilde{\rho }}(k) \rightarrow 0\) as \(N \rightarrow \infty\). Hence, \({\tilde{\rho }}(0)\) approaches one for any fixed value of \(\varDelta z\). The delta function limit of \(\rho (z | u)\) is confirmed by the exact calculation.
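
The bound (29) can be compared with the exact tail sum. The following Python sketch does this for a fixed bin width; the values \(\rho _u = 0.3\), \(\varDelta z = 0.05\) and the three values of N are assumptions made only for illustration.

```python
from math import exp, lgamma, log


def rho_exact(m, N, rho_u):
    """Exact density of Eq. (22), computed in log space."""
    log_binom = lgamma(N + 1) - lgamma(m + 1) - lgamma(N - m + 1)
    return exp(log_binom + m * log(rho_u) + (N - m) * log(1.0 - rho_u))


def tail(N, rho_u, dz):
    """Left-hand side of Eq. (29): total density outside the central interval."""
    return sum(
        rho_exact(m, N, rho_u)
        for m in range(N + 1)
        if abs(m / N - rho_u) > dz / 2
    )


def chebyshev_bound(N, rho_u, dz):
    """Right-hand side of Eq. (29)."""
    return 4.0 * rho_u * (1.0 - rho_u) / (dz**2 * N)


rho_u, dz = 0.3, 0.05
for N in (100, 1000, 10_000):
    print(N, tail(N, rho_u, dz), chebyshev_bound(N, rho_u, dz))
```

The bound is loose, and for small N it exceeds one and says nothing, but it decreases as 1/N, which is all that the limit argument requires.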

The quantity \(\rho (z|u)\) is a continuous approximate representation of \(\rho (m\!:\! N | u)\), which is a sum of the densities of several branches. The interpretation is that \(\rho (z|u)\) gives the position distribution for the relative frequency z of everything entangled with the measurement result. The presence of the observer within an interval in the relative frequency of length dz is \(\rho (z|u)\,dz\).

5 The Born Rule

The Born rule is an indispensable tool when investigating the quantitative features of microscopic systems. It relies on the concept of probability, but this concept is not straightforwardly available in EQM. All branches with non-zero amplitudes are created, so there is no single outcome about which we can be uncertain. Classical probability theory is silent about the relative frequency of a particular outcome that the observer should expect to see. Nevertheless, an observer will in a particular branch have seen a more or less random sequence of outcomes, and the observer’s presence distribution (24) seems to say that the Born rule is valid. Though the notion of classical probability is not valid here, there may be another concept available. Denote that hypothetical quantity Everettian quantum probability, or only quantum probability when EQM is assumed.

Papineau [38] analyzed what requirements quantum probabilities need to fulfill. He argued that it is sufficient for quantum probabilities to relate to ‘non-probabilistic facts’ in the same way as probability does in practice. He identified the two ways in which this relation exists. First, he identified the ‘Inferential Link,’ which is the use of observed frequency from a finite number of repetitions for the inference of a value of a probability. Second, Papineau identified that we use probabilities to guide decisions, which he called the ‘Decision-Theoretic Link.’ If the two links can be established in EQM, Papineau’s analysis does away with the idea that the notion of probability necessitates the existence of uncertainty. No doubt, uncertainty is indispensable in the context of single outcome probabilities, for which the term ‘classical probability’ is used here. For Everettian quantum probabilities, uncertainty is neither available nor necessary.

In Sect. 5.2, starting from the result of Sect. 4 it is shown that the inferential link is present. Inferring the value of \(\rho (u)\) from observations works equally well in EQM as in a single outcome interpretation. Section 5.3 is devoted to showing that the work by Greaves and Myrvold [39] combined with the results of Sects. 4 and 5.2 give the decision-theoretic link.

5.1 Frequentist and Bayesian Probabilities

The proof of the inferential link is closely related to the frequentist view of probabilities, where the probability is the relative frequency from infinitely many repetitions [36, 40]. The following criticism against the frequentist view of classical probability, adapted from Appleby [41, 42] and Wallace [7], should be answered before accepting a proof relying on frequencies observed after many or infinitely many repeated events. Each criticism (C) is followed by a response (R).

  1.

    C: After infinitely many repetitions, the value of the relative frequency can deviate from the probability. The probability of such sequences tends to zero, but this latter use of the notion of probability makes this definition of probability circular.

    R: In EQM, all possible sequences of measurement results together constitute the reality, not only a single sequence as in the case of a single outcome at each measurement. After infinitely many repetitions, the universal wavefunction is only located at the relative frequency \(z=\rho _u\). The observer sees a random sequence that suggests an analysis in terms of probability, which, given the behavior of the presence distribution, will be taken to be \(\rho _u\). This analysis creates no circularity as presence is a quantity on its own, not derived from the probability concept. However, the probability concept does not directly appear from such an analysis. As pointed out by Caves and Schack [43], this deficiency was the problem with the attempts by Saunders and others that have tried to use Finkelstein’s finding that

    $$\begin{aligned} \bigg \Vert (F_N - |c_u|^2) \prod _{k=1}^N |\psi \rangle _k \bigg \Vert \rightarrow 0 \; \text{ as } \; N \rightarrow \infty , \end{aligned}$$
    (30)

    to prove the Born rule. If the Hilbert norm does not have any physical meaning, there is no meaning to the members of the sequence of which the limit is taken. The criticism against the otherwise mathematically correct analyses of Hartle [44] and Gutmann [45] was essentially the same. These authors showed that the state of infinitely many identical systems \(|\varPsi _{\infty } \rangle = \prod _{k = 1}^{\infty } | \psi \rangle _{k}\) is an eigenstate of the frequency operator \(F_{\infty } = \lim _{N \rightarrow \infty }F_N\). These proofs had to battle with the difficulties of non-separable Hilbert spaces. The lack of a proper interpretation of the wavefunction, such as EQM1, causes those derivations to be mere mathematical exercises. Nevertheless, they do add to the consistency of the \(N \rightarrow \infty\) limit of equation (29).

  2.

    C: It is impossible to make infinitely many repetitions.

    R: In a theoretical analysis, it is possible to consider thought experiments in which there are infinitely many repetitions.

  3.

    C: There is no well-defined frequency of a particular outcome in an infinite sequence, as a reordering can change the value.

    R: The universe of all of the branches is left unchanged under reordering if all branches are reordered in the same way with respect to their ordinal number in the sequence. Any other reordering would violate the branches being the result of repeated experiments.

  4.

    C: Any particular infinite sequence has zero probability, so how can those with the ‘right frequency’ be favored against the one with another frequency?

    R: This is reminiscent of the observation by Squires [37], that for any value of m the density \(\rho (m\!:\! N | u)\) (22,23) approaches zero as N approaches infinity. As was seen above, by bunching together all sequences into intervals in relative frequencies, the probability of the ‘right frequency’ interval approaches one.

  5.

    C: The first (finite) part of an infinite sequence is vanishingly small compared with the rest and gives no reliable information about the infinite sequence.

    R: This is the problem of making statements with certainty from an observed sequence. This problem implies that we, at best, can learn about the sought frequency with some probability. Thus, there has to be something more to the notion of probabilities than frequencies. Probabilities must also relate to the beliefs of agents that are uncertain about actual realities. The frequentist must then answer why it is not only a matter of beliefs, as is the case in the subjective Bayesian view of the probability concept. In the case of a deterministic process, an agent’s probability assertion seems to merely reflect the agent’s subjective knowledge state and her corresponding assessment of the process. However, Lewis [46] opened up the possibility that even if the subjective view is the primary view of probabilities, it may still be possible that, in some instances, objective probabilities exist. If they do not exist in the case of deterministic processes in which an agent is in principle able to know all the facts, then objective probabilities seem to require that there are hidden variables that are impossible to access fully, a fundamental randomness, or, as in EQM, that it is an illusion that only one alternative happens.

The frequentist view is that, under the given circumstances, the probabilities are objective properties. In EQM, the wavefunction closely describes a real objectively existing physical system. Correspondingly, finding the Born rule in EQM entails the finding of an objective property.

As mentioned above, an alternative to the frequentist view of probabilities is the subjective Bayesian view.Footnote 10 In this view, probabilities reflect an agent’s estimate of the likelihood of a particular outcome. De Finetti [47] and Savage [48] have advocated this understanding of probability. The theory concerns the beliefs of rational agents for which Savage identified the requirements formulated as postulates. These postulates, which give a foundation for subjective probability theory, are formulated in terms of decisions by rational agents. Greaves and Myrvold [39] have reformulated these postulates to suit the situation of quantum measurements. This formulation turns out to be useful to establish Papineau’s decision-theoretic link.

Most notable in this theory is the update of probabilities a rational agent will make on the discovery of new information,

$$\begin{aligned} P(A|B) = \frac{P(A\cap B)}{P(B)} = \frac{P(B|A)P(A)}{P(B)}. \end{aligned}$$
(31)

Here, P(A|B) is the probability of A when the agent knows B to be true, \(P(A\cap B)\) is the agent’s probability for both A and B, and P(B) is the total probability for B. This expression originates from the identity \(P(A\cap B) = P(A|B)P(B)=P(B|A)P(A)\). In order to evaluate the update expression, the agent has to analyze the process that leads to an outcome. The strength of the Bayesian view is that it clarifies the notion of probability.
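
A small worked instance of the update (31) may help. In the Python sketch below, the agent entertains two hypothetical hypotheses A about the single-measurement probability of u, and B is the event that u was observed once. The hypotheses and the equal prior are assumptions chosen only for this example.

```python
from fractions import Fraction

# Hypothetical prior P(A) over two candidate values of the single-shot probability of u.
prior = {"P_u = 3/10": Fraction(1, 2), "P_u = 1/2": Fraction(1, 2)}

# P(B|A): probability of observing u once (event B) under each hypothesis A.
likelihood = {"P_u = 3/10": Fraction(3, 10), "P_u = 1/2": Fraction(1, 2)}

# Total probability P(B), the denominator of Eq. (31).
p_B = sum(likelihood[a] * prior[a] for a in prior)

# Updated probabilities P(A|B) = P(B|A) P(A) / P(B).
posterior = {a: likelihood[a] * prior[a] / p_B for a in prior}

for a, p in posterior.items():
    print(a, p)
```

Observing u shifts the agent’s belief towards the hypothesis that assigns u the larger probability, while the updated probabilities still sum to one.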

The theory is only a skeleton, which has abstracted away the world about which the agent has beliefs. In order to get any values of the probabilities, the nature of the world has to be taken into account by the agent. The beliefs are about some features of the world around us.

5.2 Statistics and Single-Outcome Believer

Consider an observer who believes there is no branching, only one outcome at every time. She is recording the results from a well-designed measurement process of a quantum phenomenon where the state contains more than one possible value. After a long sequence of measurements, according to EQM, the observer is distributed over very many branches. In each branch, a random sequence is observed, which calls for statistical analysis by the observer. The observer will assume that there is a probability \(P_u\) of measuring the value u in a single measurement. The probability of the measured relative frequency z after N repeated measurements, for this value of \(P_u\), is then

$$\begin{aligned} P(z | u) = \bigg ( \frac{ N }{ 2\pi P_u(1-P_u)} \bigg )^{1/2} \exp \bigg (-\frac{ N(z-P_u)^2 }{ 2P_u(1-P_u) }\bigg ). \end{aligned}$$
(32)

As this is a very narrow distribution for large N, a frequentist analysis would give that \(P_u\) is in some narrow interval around the measured value of the relative frequency, with some low p-value. The Bayesian analysis, assuming de Finetti’s infinite exchangeability, gives rise to the probability distribution for the value of \(P_u\) conditioned on the measured relative frequency z,

$$\begin{aligned} P(P_u | z) = \frac{P(z | u) P(P_u)}{ \int _0^1 \! dP_u \, P(z | u)P(P_u)}. \end{aligned}$$
(33)

Here, \(P(P_u)\) gives how likely the observer believed different values of \(P_u\) were before the observation was made. If it is constant, as may be the case if there was no previous information, the dependence of \(P(P_u | z)\) on \(P_u\) will be given by P(z|u).
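
The update (33) with a constant prior can be evaluated on a grid. The sketch below is a rough numerical illustration, not a prescription: the grid size, the midpoint rule, and the example values \(z = 0.3\) and \(N = 1000\) are assumptions introduced here.

```python
from math import exp, pi, sqrt


def likelihood(z, p, N):
    """P(z|u) of Eq. (32) for a single-measurement probability P_u = p."""
    var = p * (1.0 - p) / N
    return exp(-((z - p) ** 2) / (2.0 * var)) / sqrt(2.0 * pi * var)


def posterior(z, N, grid_size=2000):
    """P(P_u|z) of Eq. (33) on a midpoint grid over (0, 1), for a flat prior P(P_u)."""
    dp = 1.0 / grid_size
    grid = [(i + 0.5) * dp for i in range(grid_size)]   # midpoints avoid p = 0 and p = 1
    weights = [likelihood(z, p, N) for p in grid]       # flat prior: posterior ~ likelihood
    norm = sum(weights) * dp                            # denominator of Eq. (33)
    return grid, [w / norm for w in weights]


z_obs, N = 0.3, 1000
grid, post = posterior(z_obs, N)
dp = grid[1] - grid[0]

mean = sum(p * w for p, w in zip(grid, post)) * dp
std = sqrt(sum((p - mean) ** 2 * w for p, w in zip(grid, post)) * dp)
print("posterior mean of P_u:", mean)
print("posterior width of P_u:", std)
```

For these values the posterior is concentrated around the observed frequency, with a width of order \(N^{-1/2}\).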

The relative frequency z is distributed over all branches according to (24), see Fig. 2. Hence, the distribution of \(P_u\) over the branches may be seen as the folding of the two distributions (24) and (32).

Fig. 2

The solid line shows the density, the presence distribution, \(\rho (z | u)\) (24) for \(\rho _u = 0.3\) and \(N=1000\). The dotted line shows where an observer in a typical branch may estimate the probability P(z|u) to be from the observed sequence alone

As the number of repeated measurements N grows, the width of the probability distribution P(z|u) tends to zero, as does the width of the position distribution \(\rho (z|u)\). After a large number of repeated measurements, the observer sees a relative frequency close to \(\rho _u\), and the value of N implies that the value of \(P_u\) is probably close to the observed frequency. Hence, the observer believes that the probability \(P_u\) is very close to the value of \(\rho _u\).

To summarize, the observer distribution in relative frequency (24) narrows in precisely the same fashion as for a classical probability (32). The integral of \(\rho (z|u)\) is dominated by the peak, which implies that the observer’s position is mostly where the relative frequency is close to \(\rho _u\). If the observer believes in a single outcome interpretation, the observer has arguments for the statistical analysis. This observer is predominantly located where she has reason to conclude that the Born rule is correct, and she subsequently uses it to make inferences about the wave function. According to EQM 1, where do we expect to find ourselves? If we are to expect anything, our expectation of being near the peak of \(\rho (z|u)\) will be high, and our expectation of being in the far tails will be low. Our expectation agrees with observation: physicists believe in the Born rule.

When the standard interpretation of quantum mechanics is verified on the basis of the Born rule, the data is compared with the expectation we have from the Born rule. As has been demonstrated, EQM gives rise to the same expectation as the Born rule. This finding implies that EQM is equally well verified as quantum mechanics with the standard postulates.

The discussion in the present subsection reveals that EQM supports an agent’s inference of wavefunction properties, at least qualitatively. Papineau’s inferential link is essentially established. An observer’s rational expectations have been used, but without a well-defined probability theory, the expectations cannot be discussed in quantitative terms. The concept of probabilities is not really at hand yet, but the decision-theoretic link will supply that.

5.3 Decision Theoretic Probabilities

When all alternatives with non-zero amplitudes are going to be present, it becomes a challenge to understand the appearance of a probability concept, quantum probability. The observer knows that every possibility with non-zero amplitude is represented in some branch. When observing the outcome, she will also branch, each of her ‘descendants’ seeing the value of that branch. As mentioned above, Papineau has argued that it is enough to show the inferential link, which was shown in the previous Sect. 5.2, and the decision-theoretic link, which will be discussed in this section. Although there is no uncertainty, thus no classical probability, the situation before the measurement warrants much the same decision theory as when classical uncertainty is at hand.

Deutsch [8] pioneered the use of decision theory to understand Everettian quantum probabilities. Wallace [6, 7, 49, 50] and Greaves [23, 51] have continued this work in their own directions. Both base their analysis on the postulates that Savage formulated [48] to define the theory of classical probability. These axioms imply that the decisions a rational agent makes correspond to maximizing the expected utility,

$$\begin{aligned} \langle U \rangle = \sum _A P(A)U_A. \end{aligned}$$
(34)

Here P(A) is the probability, with \(\sum _A P(A) = 1\), and \(U_A\) is the numerical value of the utility that the agent will get on outcome A. From considering the decisions the agent will make under a variety of situations, and a variety of utilities the agent might get, her subjective probabilities P(A) are uniquely determined, and the utilities \(U_A\) are determined up to an affine transformation. It is not assumed that the agent consciously optimizes the expected utility, but rational behavior implies it nevertheless.

Lewis [46] acknowledges that even if probabilities are primarily subjective, there are instances like radioactive decay where probabilities are objective features. He formulated the link between subjective probabilities and objective probabilities in the Principal Principle. It implies that an agent who knows that there is an objective probability P sets her subjective probability to P. The following analysis, as well as that of the previous section, follows Lewis’s view: there is an objective feature, the position distribution, that causes a rational agent to have certain expectations.

Wallace has developed Deutsch’s attempted proof of the Born rule into an almost acceptable proof. He has constructed a set of axioms, which he claims any rational agent who believes in EQM necessarily obeys. The axioms are not self-evident and not sufficiently motivated in his very general setting, but given the axioms, the Born rule follows. Greaves has taken a skeptical attitude towards Wallace’s attempts and confined her work to understanding the concept of probability.

Greaves and Myrvold [39] have reformulated Savage’s postulates for rationality to suit the case of experiments performed in branching as well as non-branching situations. The following quote formulates the purpose of their investigation: “The problem is not one of deriving the correct probabilities within the theory; it is one of either making sense of ascribing probabilities to outcomes of experiments in the Everett interpretation, or of finding a substitute on which the usual statistical analysis of experimental results continues to count as evidence for quantum mechanics.”

Following Savage’s analysis, Greaves and Myrvold arrived at an expression for an agent’s expected utility. A rational agent will seek to maximize the expected utility,Footnote 11

$$\begin{aligned} \langle U \rangle = \sum _b w(b) U_b, \end{aligned}$$
(35)

where \(U_b\) is the numerical value of the utility the agent gets at the outcome b. The w(b) is a weight that the agent assigns to the outcome b. In the case of a single outcome, it is the agent’s subjective probability, or credence, of the outcome. In the case of a branching universe, Greaves and Myrvold call it quasi-credence. Fundamentally, it is a subjective property, precisely as the probability is considered to be. In both cases, the values of the weights are well-defined if the agent is rational. The weights are subject to the condition \(\sum _b w(b) = 1\).

If new information is presented, the weights are updated according to the Bayesian update expression,

$$\begin{aligned} w(c|b) = \frac{w(c\wedge b)}{w(b)}. \end{aligned}$$
(36)

The value of w(c|b) gives the agent’s belief in, or weight of, outcome c in a measurement under the condition that b has been measured, while \(w(c\wedge b)\) is the weight for both outcomes b and c. One particular updating situation is for the particular value that has been measured. If the concept of classical probability applies, the probability of seeing that value is updated to one. According to EQM, after the branching, each of the observer’s descendants experiences its branch as if it were the world. After the measurement value is known, the rational agent/observer will set the corresponding quasi-credence to one, \(w(b|b) = 1\). Likewise, after branching, the agent will normalize the quantum state of her specific branch to have amplitude one, as that is her ‘system’ now.

For repeated identical independent events, de Finetti’s infinite exchangeability property can be used, which will give the same Bayesian statistical analysis of the branching world as for the non-branching. For example, the agent will attain a weight distribution \(w(w_u|z)\) for the single event weight \(w_u\) corresponding to the expression (33). Greaves and Myrvold argue that de Finetti’s theorem, contrary to de Finetti’s position, provides us with the notion of objective weights. In the case of a single outcome, they are called chance, while in the branching world, Greaves and Myrvold use the term branching-weights. The \(w(w_u|z)\) is the subjective weight distribution for what the objective value of \(w_u\) might be, given a measured value of the relative frequency z.

Greaves and Myrvold assumed that the EQM Born rule was already proven, which gives the branching-weights equal to \(\rho _u\). However, that is not assumed in the present discussion. What is then the significance of the weights, w(b), when the world is branching? As for classical probabilities, the weights are given by the agent’s understanding of the world. An agent that interprets the physical world, according to EQM1, will take into consideration its statement about how the current world (branch) will become distributed over any new branches. For simplicity, assume that the agent is sure about which wavefunction is to be measured. The additional complication related to a mixed initial state is trivial to handle once the pure state situation is understood.

In the single outcome case, the classical probabilities correspond to the agent’s beliefs about where she will be present after the measurement. In the branching case, the agent knows from EQM 1 how her presence will be distributed. There is here a similarity between the branching and the single outcome cases. This similarity suggests that in the branching case, the weights w(b) should be equal to the presence values, \(\rho (b)\). This identification becomes evident in the case of many repeated measurements.

Consider a measurement that gives two branches, u and \(\lnot u\), with weights \(w_u\) and \(w_{\lnot u}\), respectively. Assume that the utility of a branch only depends on the number of times the values u and \(\lnot u\) have been measured. The total expected utility after N repeated measurements is then

$$\begin{aligned} \langle U \rangle _N = \sum _{m=0}^N w(m \!:\! N |u)U(m,u,N-m,\lnot u), \end{aligned}$$
(37)

where \(U(m,u,N-m,\lnot u)\) is the utility of a branch in which u appears m times and \(\lnot u\) appears \(N-m\) times, and

$$\begin{aligned} w(m \!:\! N |u) = \frac{N!}{(N-m)!m!}(w_u)^m (w_{\lnot u})^{N-m}. \end{aligned}$$
(38)

The multiplicative form of weights of branches after multiple independent branchings can be seen from the expression (36), with \(w(c|b) = w(c)\) independent of the value of b.

From \(w(m \!:\! N |u)\) a frequency distribution of the weights w(z|u) is arrived at in the same way as \(\rho (z|u)\) (24). The functional forms of \(w(m \!:\! N |u)\) and w(z|u) are identical to those of \(\rho (m \!:\! N |u)\) and \(\rho (z|u)\), respectively. A rational agent that believes in EQM will have to put \(w_u\), the weight of branch u, equal to \(\rho _u\), the presence of branch u, at least for the wavefunction she thinks is the likely one. If she puts her weight \(w_u\) different from \(\rho _u\), then for large N values she will hardly have any presence where she expects to have most of her presence, see Fig. 3. An agent should make decisions to optimize the utility where she will typically be in the future. That is, the agent should optimize the expected utility (35) with \(w(b) = \rho _b\).
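
The mismatch described here can be quantified with a short calculation. The Python sketch below uses the example values of Fig. 3 (\(\rho _u = 0.3\), \(w_u = 0.5\), \(N = 1000\)); the half-width \(\delta = 0.05\) of the frequency window around \(w_u\) is an additional assumption made only for illustration.

```python
from math import exp, lgamma, log


def binomial_weight(m, N, p):
    """Common functional form of Eqs. (22) and (38), computed in log space."""
    log_binom = lgamma(N + 1) - lgamma(m + 1) - lgamma(N - m + 1)
    return exp(log_binom + m * log(p) + (N - m) * log(1.0 - p))


def mass_in_window(N, p, m_lo, m_hi):
    """Total weight or presence of branches with m_lo <= m <= m_hi."""
    return sum(binomial_weight(m, N, p) for m in range(m_lo, m_hi + 1))


N, rho_u, w_u = 1000, 0.3, 0.5
delta = 0.05  # hypothetical half-width of the frequency window around w_u
m_lo, m_hi = round((w_u - delta) * N), round((w_u + delta) * N)

# Weight the agent assigns to the window, using her own w_u in Eq. (38).
expected_by_agent = mass_in_window(N, w_u, m_lo, m_hi)
# Presence that EQM actually places in the same window, using rho_u in Eq. (22).
actual_presence = mass_in_window(N, rho_u, m_lo, m_hi)

print("weight assigned by the agent:", expected_by_agent)
print("presence according to rho_u: ", actual_presence)
```

The agent assigns almost all of her weight to a frequency window in which her presence is negligible, which is the situation depicted in Fig. 3.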

Fig. 3

The solid line shows the density, the presence distribution, \(\rho (z | u)\) (24) for \(\rho _u = 0.3\) and \(N=1000\). An agent that assumes the weight \(w_u = 0.5\) will make decisions such that she gets her favorable utilities where the dashed line is large. With \(w_u = 0.5\) she risks getting unfavorable utilities at the position she should expect her self to be according to EQM

What about a single measurement where no branch dominates the presence distribution? The previous argument still applies, as the weight the agent should apply to a branch should not depend on whether the measurement is repeated or not. This conclusion follows from the physical picture of local interactions, local detector systems, and preparations such that later measurements can be performed sufficiently independently of the previous ones. These are the physical assumptions that were already stated in Sect. 2 and on which Eq. (37) rests. We have to conclude that a rational agent that knows the wavefunction to be measured and thus knows the \(\rho _b\)-values is compelled to put her weights equal to those values, \(w_b = \rho _b\). Thus the decision version of the Born rule has been proven.

The reader might feel hesitant at this point, but you might remember your hesitation about classical probabilities before you became educated on that subject. Suppose an uneducated person is offered a choice between a bet that pays twice the stake if event A happens and a bet that pays only 1.5 times the stake if B happens. The person is told by an educated friend that there is only a 3 out of 10 chance for A to happen, but 7 out of 10 for B to happen. The uneducated person might now reason: B may happen, but it is also possible that A happens, and I get more money in that case. We are not going to do this ten times, and anyhow I am told that even if this thing is repeated ten times, A might happen in all of the ten repetitions. Both A and B are possible outcomes, and no one can deny that, and I will win substantially more in case A. I would rather bet on A. It is only the educated and long-term perspective that makes one go for bet B instead. We know that we will find ourselves facing many decisions where probabilities apply. If we always stick with carefully estimated probabilities and the corresponding rational decision, we will with high probability be winners. Likewise, an agent confronted with possible choices that will affect the utilities at the different branches can, of course, neglect the weights, if she does not see any relevance in them. However, when the estimate of where she will be in the future after thousands of events is compared with what an equal weight strategy corresponds to, see Fig. 3, then it becomes clear that the rational behavior is to take into account the weights \(w_b=\rho _b\) in the decision as if they are probabilities.
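
The betting example can be made concrete with the expected utility (34). The Python sketch below rests on two assumptions that the text does not state explicitly: a losing bet returns nothing, and B is the complement of A (the chances 3/10 and 7/10 add up to one). It also evaluates, for N repetitions, the total weight of the outcome sequences in which always betting on A would have returned more than always betting on B.

```python
from fractions import Fraction
from math import exp, lgamma, log

# Chances and payouts per unit stake, taken from the example in the text.
p_A, p_B = Fraction(3, 10), Fraction(7, 10)
payout_A, payout_B = Fraction(2), Fraction(3, 2)

# Expected return per unit stake, Eq. (34), assuming a losing bet returns nothing.
expected_A = p_A * payout_A
expected_B = p_B * payout_B
print("expected return, bet A:", expected_A)
print("expected return, bet B:", expected_B)


def binomial_weight(m, N, p):
    """Weight of the sequences with exactly m occurrences of A, cf. Eq. (22)."""
    log_binom = lgamma(N + 1) - lgamma(m + 1) - lgamma(N - m + 1)
    return exp(log_binom + m * log(p) + (N - m) * log(1.0 - p))


def weight_A_ahead(N, p=0.3):
    """Total weight of sequences where always betting A beats always betting B.

    With m occurrences of A (and N - m of B, assumed complementary), the returns
    are 2*m and 1.5*(N - m); the comparison is done in integers as 4*m > 3*(N - m).
    """
    return sum(
        binomial_weight(m, N, p)
        for m in range(N + 1)
        if 4 * m > 3 * (N - m)
    )


for N in (10, 100, 1000):
    print(N, weight_A_ahead(N))
```

Under these assumptions bet B has the larger expected return, and the weight of the sequences in which bet A comes out ahead shrinks rapidly with the number of repetitions.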

The decision-theoretic link is established. The weights \(w_b\) enter into branching world decision making in the same way as classical probabilities do in single outcome interpretations. The Born rule gives the objective values, and they can be estimated using statistics in the same way as classical probabilities can be estimated. This result implies that the inferential link is present, not only at the qualitative level as in Sect. 5.2, but also quantitatively. As the weights behave as probabilities, they deserve to be called probabilities. To distinguish them from their classical counterparts, the term (Everettian) quantum probabilities is more precise.

5.4 Explanation and Defense of the Greaves–Myrvold Theory

The astonishing and novel feature of the works by Papineau and Greaves is that the operational aspects of probability, statistics, and decision making are available for EQM agents, though they have no uncertainty about the future. Saunders and Wallace [52, 53] have tried to argue for a possible way of ‘talking,’ a semantics, by which the agent is uncertain about the outcome before a quantum measurement is performed. Such semantics seems contrived and in conflict with the mathematical expressions. The state before the measurement \(| \psi \rangle | M_{\emptyset } O_{\emptyset } \rangle\) contradicts their description in terms of already existing branches. In EQM, all that exists is represented by the wavefunction. Any physical partition has to correspond to an expansion of \(| \psi \rangle\) in some basis \(| a \rangle\). Such an expansion cannot be understood as a partition into any pre-existing worlds, as the individual basis states \(| a \rangle\) are in disagreement with the prepared state, \(|\psi \rangle\). Any description in non-mathematical language that is not supported by the mathematical expressions will produce an ill-defined description of nature. The appearance of probability without any uncertainty being present is exciting progress that should be acknowledged.

Kent [21] criticized Savage’s postulates on which Greaves and Myrvold based their considerations. He argued that many possible strategies are in conflict with the postulates, but are rational. Kent lists seven alternative strategies that do not conform to Savage’s postulates. He claimed that his alternative strategies are rational, but he only argued that those strategies could be applied consistently, which is not the same as being rational. It is possible to be irrational and consistent. Two of the strategies are undefined, but the others can easily be shown to be irrational by varying the set of rewards the agent will get.

With the intent to criticize the concept of branch-weight, Kent suggests five different computer-generated branching worlds \(\hbox {CBU}_{1-4}\) and CBU-qualia. Kent claims that Greaves’ and Myrvold’s weights do not apply to them. Kent’s analysis of these worlds seems insufficient to warrant his conclusion. In any case, the present analysis has shown the applicability of the Greaves–Myrvold theory to EQM.

Kent’s conclusion from Greaves’ and Myrvold’s work is that “Everettians cannot give an explanation that says that all observers in the multiverse will observe confirmation of the Born rule, or that very probably all observers will observe confirmation of the Born rule.” Indeed, in EQM, there will be some branches with a low presence where the statistics disconfirm the Born rule. However, in a single-outcome interpretation, the Born rule implies that there is a finite probability that we will fail to confirm the Born rule. In both kinds of interpretation, we are in the same predicament. Our understanding of the world might be wrong because we have only experienced low probability or low weight events. Independent of the interpretation, we have to assume that this is not the case. EQM implies that the total presence of the branches in which we should have seen the Born rule is overwhelmingly large. Thus we expect to be in a branch where the statistics are in agreement with the Born rule. For further arguments against Kent’s criticism, see [54].
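The claim that the branches with Born-rule statistics carry an overwhelming share of the total presence can be illustrated numerically. The sketch below assumes the simplest setting, which is an illustrative choice and not the general case treated in the earlier sections: n independent repetitions of a two-outcome measurement in which one outcome has weight w. A branch with k occurrences of that outcome then has weight \(w^k (1-w)^{n-k}\), and there are \(\binom{n}{k}\) such branches. The function name and the tolerance parameter eps are hypothetical.

```python
from math import exp, lgamma, log


def weight_near_born(w, n, eps):
    """Total weight of branches whose relative frequency k/n is within eps of w.

    Assumes n independent two-outcome measurements with branch weight w
    (0 < w < 1) for the counted outcome. Logarithms are used so that the
    binomial coefficients do not overflow for large n.
    """
    total = 0.0
    for k in range(n + 1):
        if abs(k / n - w) <= eps:
            log_term = (
                lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
                + k * log(w) + (n - k) * log(1.0 - w)
            )
            total += exp(log_term)
    return total


# Share of the total weight with frequencies within 0.05 of w = 0.3.
share_short_run = weight_near_born(0.3, 100, 0.05)
share_long_run = weight_near_born(0.3, 2000, 0.05)
```

Under these assumptions the share grows towards one as the number of repetitions increases, while the deviating branches retain a small but non-zero weight. The same arithmetic gives the probability of a deviating record in a single-outcome interpretation, which is the parallel drawn in the paragraph above.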

Price [55] is skeptical towards the existence of ‘probability’ in situations where there is no uncertainty present. As has been shown above, there is no real uncertainty, but there is a distribution situation to which Savage decision theory is applicable. Uncertainty turns out not to be a requirement, as the concept of presence successfully replaces the classical concept of probability.

Further, Price erroneously regards a person’s descendants in the different branches as if they were different persons. In the example ‘Legless at Bondi beach,’ he discusses the misfortune that the swimmer’s choice causes to one of his descendants as if the swimmer caused harm to another person in an unethical way. However, the choice corresponds to a gamble that, in a single outcome scenario, could, if unlucky, give a disastrous result. There is no reason to view a decision that might cause oneself harm as more or less ethical depending on whether the harm happens with a low probability or with a low presence.

Price also questions the use of Savage type decision theory. He argues that the decision strategy he calls “social justice” is rational but in conflict with the Greaves-Myrvold decision theory. That this strategy should be rational is argued from the rationality of the principle for organizing societies called ‘social justice.’ Again, Price views the descendants as if they are different individuals that exist together in a shared social context, but that is not a correct view of the situation. The isolation of the branches due to decoherence guarantees that the utilities that a rational agent assigns to a branch are independent of the utilities in the other branches unless the ‘offers’ given to the agent cause an artificial correlation.

Albert [20] has criticized the work by Greaves and Myrvold by suggesting that he might care more about branches where he is fat because there is more of him there. In one way, this is a complaint against their lack of physical reasoning for what value the weights should have. The Savage type rationality axioms do not include any such facts about the world, and Greaves and Myrvold were clear that a rule for the values of their branch-weights has to come from some additional arguments.

The fatness argument also suggests that an agent may have different priorities and wishes when she believes in a branching world than when she believes there is a single outcome. That is indeed possible, but it constitutes no ambiguity for the weights. The weights an agent assigns to the different outcomes are independent of the preferences the agent has. The utilities that enter into an agent’s decisions are only auxiliary quantities in the analysis of weights or probabilities.

6 What is Real?

6.1 The EQM Ontology

The configuration space density (2) defined in EQM 1 primarily serves an epistemic purpose. It lays out a starting point for the investigations of the world around us. The wavefunction, including all its spin components and possible gauge indices, describes what exists, but its gauge dependence shows that it also contains something spurious. EQM says that particles with distributed positions for their spin components exist.Footnote 12 Relativity implies that currents are equally real. A full discussion of the ontology is postponed to future studies. Quantum gravity and related investigations may, in any case, modify the ontology.

6.2 Misconceptions

A misconception about EQM is that the branching creates several copies of the world, which may lead to concerns about energy conservation. However, branching is not copying. Branching is a multi-entanglement creation. To understand the mistake in thinking of branching as copying, consider two electrons that collide and thereby become entangled. The entanglement does not cause the two electrons to become four electrons. Correspondingly, an observer gets entangled with a detector when recording the measurement value, but this process does not imply copying of the observer into several observers. Instead, the observer becomes distributed into several separate regions in configuration space. The observer’s record of the measurement result varies with the position in configuration space. As detector systems are never 100% efficient, there will be regions in configuration space where the observer, through bad luck, failed to get information about the measured system.

Some criticisms of the derivations of the Born rule are related to the misconception that branching is copying. For example, Hemmo and Pitowsky [5] contrast the standard quantum mechanics where one alternative becomes “realized” with EQM where “all of them are real”. Price [55] wrote about EQM that “all possible outcomes of quantum measurement are treated as equally real”. With such views, it is no wonder that they conclude that the Born rule is a logical impossibility. If we know that something will ‘really’ happen, then it has probability one.

The EQM description of a single particle is a spatially varying amplitude for each spin component. Its absolute square \(\rho _j({\mathbf {x}})\) gives the location of the particle, which is a distribution. If the particle is in a bound state, it can be probed with forces. The strength of the interactions will reveal the values of \(\rho\) in different regions (and spins).

Is the quantum particle equally real at all positions where the amplitude is non-zero? A quantum particle is not the kind of thing that is localized to a single point. The question presumes something that is not the case. The same is valid for complex systems as well. They are not localized to a single point in configuration space and spin, but are distributed quantities. The view that all the branches are equally real is a category mistake. It is the whole set of branches with their respective amplitudes that constitutes reality. Within a branch, that particular branch constitutes the whole reality. What is real depends on the (possible) perspective.

From the pre-measurement perspective, the future observer’s experience of the world being one specific branch is an illusion. The view expressed by Hemmo, Pitowsky, and Price that “all of them are real” corresponds to viewing mirages as if they were real.

Rovelli [3] questioned the idea behind EQM that the observer does not learn a unique value, “If so, how could have we learned quantum theory?”. However, it has been shown in previous sections that in the overwhelming part of the presence distribution, physicists can deduce the Born rule and confirm quantum mechanics. Note that a pessimist might even claim that if the universe gives single random outcomes, we cannot learn about its nature. If a single outcome were the case, with a small probability, he or she would be right.

7 Conclusions and Final Remarks

The need for assumptions in order to address probabilities has been discussed by Barrett [19]. In the present theory, the assumptions are given by the two postulates and the assumption on the interactions. The postulate EQM 1 gives a physical foundation to Everett’s quantum mechanics. It implies that the quantum state belongs to a Hilbert space, and it makes the postulates extraordinarily simple.

The observation of wavefunction structures is a complex process that involves decoherence. The previous formulations of decoherence theory were based on the Born rule. EQM 1 interprets the quantity appearing in the Born rule, \(\rho _b\), as giving the extent to which the system is present at b. The physical description of measurements assumes that interactions are essentially those of the standard model of particle physics. This assumption implies that particle-recording detectors only react to the part of the measured state that falls upon the detector. There is no need to explain the measurement process under various hypothetical types of interactions that might not even allow for the construction of detectors. Much of the discussion of measurements and the Born rule has gone astray in unnecessarily general and abstract reasoning. The cause for this may be the heritage of classical mechanics and quantum mechanics textbooks, where the mechanics and the interactions are two completely independent entities.

The path to proving the Born rule has been the one proposed by Papineau, which consists of two legs. Firstly, it was shown that statistical inference of \(\rho _b\) is possible within the major part of where the system is present. This result implies that EQM is as well verified as the standard interpretation of quantum mechanics. Secondly, a rational agent will make decisions as if the Born rule quantum probabilities were classical probabilities.

It is possible to take the quantum state as the full description of the physical world without any additional degrees of freedom or mechanisms that select a single value in a measurement. All aspects of the measurement process are fully understood using Everett’s interpretation with EQM 1 and EQM 2. This fact explains the elusive character of the measurement problem that made Feynman doubt its existence. It also shows that the suggestion [57] that a selection happens without a cause is correct: as there is no actual selection, there is no cause for it.