Homer Nodded: Von Neumann’s Surprising Oversight

Mermin, N. David; Schack, Rüdiger

doi:10.1007/s10701-018-0197-5

Open access
Published: 31 July 2018

Volume 48, pages 1007–1020 (2018)
Foundations of Physics

Abstract

We review the famous no-hidden-variables theorem in von Neumann’s 1932 book on the mathematical foundations of quantum mechanics (Mathematische Grundlagen der Quantenmechanik, Springer, Berlin, 1932). We describe the notorious gap in von Neumann’s argument, pointed out by Hermann (Abhandlungen der Fries’schen Schule 6:75–152, 1935) and, more famously, by Bell (Rev Modern Phys 38:447–452, 1966). We disagree with recent papers claiming that Hermann and Bell failed to understand what von Neumann was actually doing.

1 Introduction

Over half a century ago Bell [2] criticized the famous argument of von Neumann [10] that hidden-variable theories cannot underlie quantum mechanics. Unknown to Bell, Hermann [7] had published the same criticism three decades earlier. Bell then went on to prove an important no-hidden-variables theorem of his own,^{Footnote 1} without making the mistake of von Neumann that he (and Hermann) had noted.^{Footnote 2}

Recently Bub [4] claimed that Bell had misunderstood von Neumann’s argument, and quite recently Dieks [6] expanded on Bub, adding similar criticism of the earlier work of Hermann. We, however, agree with Hermann’s and Bell’s reading of von Neumann, and believe that Bub and Dieks fail to make sense of the surprising gap in von Neumann’s argument that Hermann and Bell correctly identified.^{Footnote 3}

In Sect. 2 we summarize von Neumann’s argument against hidden variables, and identify his oversight. In Sect. 3 we describe Bell’s criticism of von Neumann’s argument. While Bell does not convey some of von Neumann’s subtle distinctions, he does get von Neumann’s error exactly right. Section 4 describes the much earlier, but less well-known criticism of von Neumann by Hermann. She captures better than Bell the full character of von Neumann’s argument, and, like Bell, correctly explains what’s wrong with it.

We comment in Sects. 2–4 on Bub’s and Dieks’ reading of von Neumann and why we believe that reading is wrong.

2 Von Neumann’s Argument

2.1 Von Neumann’s Assumptions

Von Neumann derives much of the structure of quantum mechanics, together with his argument against hidden variables, from four assumptions. Because the four assumptions lead not only to the structure of quantum mechanics, but also to von Neumann’s no-hidden-variables argument, if hidden variables are nevertheless compatible with quantum mechanics, then at least one of his assumptions must be wrong. Von Neumann concludes that one cannot construct a hidden-variables model without doing irreparable damage to the structure of quantum mechanics. But Hermann and Bell both point out that one of von Neumann’s four assumptions, essential for the no-hidden-variables part of his argument, can be dropped without altering the structure of ordinary quantum mechanics (implied by the remaining three) in any significant way.

Two of von Neumann’s assumptions, $\mathrm{{A}}^\prime $ and $\mathrm{{B}}^\prime $, deal with “physical quantities” and their measurement. They are about statistical properties of data, and they make no explicit reference to the formalism of quantum mechanics. The other two, Assumptions I and II, make no explicit mention of measurement, data, or statistics. They simply associate physical quantities with Hermitian operators on a Hilbert space, in a way that preserves certain structural relationships obeyed by both the physical quantities and the Hermitian operators, thereby bringing into the story much of the formal mathematical apparatus of quantum mechanics. Here are von Neumann’s four assumptions:^{Footnote 4}

Assumption A$'$: (p. 311^{Footnote 5}) There exists an expectation function Exp from physical quantities to the real numbers.

A physical quantity ${{\mathcal {R}}}$ can be subject to a measurement, which yields a real number r. If you have an ensemble of physical systems, all associated with the same set of physical quantities, and you measure the same physical quantity ${{\mathcal {R}}}$ on a large enough random sample of the systems, then the mean of all those measurement outcomes is called $\mathrm{Exp}({{\mathcal {R}}})$.^{Footnote 6} Implicit in Assumption A$'$, and in the notation Exp(${{\mathcal {R}}}$), is the physical assumption, not always emphasized, that this mean value does not depend on which of several possible distinct ways of measuring ${{\mathcal {R}}}$ might be chosen.

One way to define a physical quantity is to specify a way to measure it. As an important example, if ${{\mathcal {R}}}$ is a physical quantity that one does know how to measure, and f is a function that takes real numbers to real numbers, then one can define another physical quantity $f({{\mathcal {R}}})$ by specifying that to measure $f({{\mathcal {R}}})$ you measure ${{\mathcal {R}}}$ and then apply f to the outcome r of the ${{\mathcal {R}}}$-measurement.

We shall point out below that the criticisms of Hermann’s and Bell’s readings of von Neumann by Bub and Dieks are invalidated by the fact that von Neumann’s four assumptions also provide another way to define physical quantities that makes no explicit mention of measurements.

Assumption $\mathrm{{A}}^\prime $ also states explicitly that Exp(${{\mathcal {R}}}$) is non-negative if the physical quantity ${{\mathcal {R}}}$ is “by nature” non-negative. Nobody has any issues with this.

Assumption B$'$: (p. 311) If ${{\mathcal {R}}}, {{\mathcal {S}}}, \ldots $ are arbitrary physical quantities, not necessarily simultaneously measurable, and $a, b, \ldots $ are real numbers then the expectation function Exp is linear:

$$\begin{aligned} \mathrm{Exp}(a{{\mathcal {R}}}+ b{{\mathcal {S}}}+ \cdots ) = a\,\mathrm{Exp}({{\mathcal {R}}}) + b\,\mathrm{Exp}({{\mathcal {S}}})+ \cdots . \end{aligned}$$

(1)

If several different physical quantities ${{\mathcal {R}}}, {{\mathcal {S}}},\ldots $ can be simultaneously measured, then you can define a physical quantity that is a function f of them all by specifying that $f({{\mathcal {R}}}, {{\mathcal {S}}}\ldots )$ is measured by measuring them jointly, and applying f to the results $r, s,\ldots $ of all those measurements. The linearity condition $\mathrm{{B}}^\prime $ for jointly measurable quantities follows straightforwardly from this definition, applied to the function $f(r, s, \ldots ) = ar + bs +\cdots .$
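Spelled out for two jointly measurable quantities (a worked step; the sample size $N$ and the outcome labels $r_i, s_i$ are introduced here only for illustration): if the joint measurement on the $i$-th system of the sample yields the pair $(r_i, s_i)$, the outcome assigned to $a{{\mathcal {R}}}+ b{{\mathcal {S}}}$ is $ar_i + bs_i$, and linearity of the mean is just linearity of a finite sum:

$$\begin{aligned} \mathrm{Exp}(a{{\mathcal {R}}}+ b{{\mathcal {S}}}) = \frac{1}{N}\sum _{i=1}^{N}(ar_i + bs_i) = a\,\frac{1}{N}\sum _{i=1}^{N} r_i + b\,\frac{1}{N}\sum _{i=1}^{N} s_i = a\,\mathrm{Exp}({{\mathcal {R}}}) + b\,\mathrm{Exp}({{\mathcal {S}}}). \end{aligned}$$

The last equality relies on the assumption implicit in $\mathrm{{A}}^\prime $ that the mean of ${{\mathcal {R}}}$ is the same whether ${{\mathcal {R}}}$ is measured alone or jointly with ${{\mathcal {S}}}$. No such step is available when the quantities cannot be measured together, since there is then no pair $(r_i, s_i)$ to combine.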

Now it is one of the most important features of quantum mechanics that not all physical quantities can be simultaneously measured.^{Footnote 7} Extending the scope of Assumption $\mathrm{{B}}^\prime $ to quantities ${{\mathcal {R}}}, {{\mathcal {S}}},\ldots $ that are not jointly measurable is problematic, however, since at this stage it is not even clear what $a{{\mathcal {R}}}+ b{{\mathcal {S}}}+\cdots $ in $\mathrm{{B}}^\prime $ might mean for such quantities. Indeed, von Neumann immediately remarks that $\mathrm{{B}}^\prime $ characterizes such a linear combination “only in an implicit way”, since there is “no way to construct from the measurement [instructions] for ${{\mathcal {R}}}, {{\mathcal {S}}},\ldots $ such [instructions] for ${{\mathcal {R}}}+ {{\mathcal {S}}}+\cdots .$”^{Footnote 8}

Bub and Dieks both take this to mean that von Neumann uses assumption $\mathrm{{B}}^\prime $ to define linear combinations of physical quantities that are not simultaneously measurable. This is the entire basis for their criticisms of Bell and Hermann. If $\mathrm{{B}}^\prime $ is just a definition, it cannot also be an invalid assumption, as Hermann and Bell maintain. But as we shall see below, the full set of von Neumann’s four assumptions contains another way to define linear combinations of physical quantities that are not simultaneously measurable. With that alternative definition, Assumption $\mathrm{{B}}^\prime $ can indeed impose a nontrivial constraint on the values an Exp function can have for such linear combinations. There is no reason to insist that Assumption $\mathrm{{B}}^\prime $ must be taken as a definition.

Assumption I: (p. 313) There is a 1-to-1 correspondence between physical quantities ${{\mathcal {R}}}$ and Hermitian operators R that act on a Hilbert space. For any real-valued function f, if the quantity ${{\mathcal {R}}}$ has the operator R, then the quantity $f({{\mathcal {R}}})$ has the operator f(R).

$$\begin{aligned} {{\mathcal {R}}}\longleftrightarrow R\ \Longrightarrow \ f({{\mathcal {R}}}) \longleftrightarrow f(R). \end{aligned}$$

(2)

The requirement that this 1-to-1 correspondence must be preserved by functions is quite powerful. We have noted in our discussion of Assumption $\mathrm{{A}}^\prime $ von Neumann’s specification of how to define functions of a physical quantity. Standard Hilbert space mathematics tells us how to define functions of a Hermitian operator. Requiring, as Assumption I does, that these two quite different ways of evaluating functions should preserve the one-to-one correspondence between physical quantities and Hermitian operators has surprisingly strong consequences. Appendix 1 illustrates the power of this function-preserving 1–1 correspondence.
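As a reminder of the operator side of this correspondence (a sketch assuming a discrete spectrum; the eigenvalues $r_k$ and the projectors $P_k$ are notation introduced here, not von Neumann’s), a Hermitian operator with spectral decomposition $R = \sum _k r_k P_k$ has

$$\begin{aligned} f(R) = \sum _k f(r_k)\, P_k . \end{aligned}$$

This mirrors the measurement-side rule of applying f to the outcome r. For $f(r) = r^2$, for example, it gives $f(R) = \sum _k r_k^2 P_k = R^2$, the ordinary operator square, because $P_j P_k = \delta _{jk} P_k$.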

Because this association of physical quantities with Hermitian operators is one-to-one, it is possible to use Hermitian operators to define physical quantities, and vice-versa. Assumption II provides a pertinent example of this.

Assumption II: (p. 314) If the physical quantities ${{\mathcal {R}}}, {{\mathcal {S}}},\ldots $ have the Hermitian operators $R, S,\ldots $, then the physical quantity $a{{\mathcal {R}}}+ b{{\mathcal {S}}}+\cdots $ has the Hermitian operator $aR + bS + \cdots $, whether or not ${{\mathcal {R}}}, {{\mathcal {S}}}, \ldots $ are simultaneously measurable:^{Footnote 9}

$$\begin{aligned} a{{\mathcal {R}}}+ b{{\mathcal {S}}}+\cdots \ \longleftrightarrow \ aR + bS +\cdots . \end{aligned}$$

(3)

Assumption II provides the obvious way to define $a{{\mathcal {R}}}+ b{{\mathcal {S}}}+\cdots $ for sums of physical quantities that are not simultaneously measurable. There is no problem in defining linear combinations of arbitrary Hermitian operators. The physical quantity $a{{\mathcal {R}}}+ b{{\mathcal {S}}}\ +\cdots $ can then be defined, under Assumption II, to be the one that corresponds to the Hermitian operator $aR + bS +\cdots $, where $R, S,\ldots $ are the Hermitian operators that correspond to the individual physical quantities ${{\mathcal {R}}}, {{\mathcal {S}}},\ldots $. This definition reduces to the simple definition in terms of measurement outcomes when the quantities are jointly measurable. Assumption II extends that definition when they are not.

This observation invalidates what Bub and Dieks have to say about Hermann’s and Bell’s alleged misunderstanding of von Neumann. Whether von Neumann intended to define such sums through Assumption II is beside the point, though we believe he did,^{Footnote 10} and Hermann clearly thought that he did. To invalidate Bub’s and Dieks’ criticism of Hermann and Bell it is enough that an alternative definition exists in addition to the definition Bub and Dieks attribute to von Neumann.^{Footnote 11}

2.2 What von Neumann Proves with His Assumptions

Von Neumann first proves^{Footnote 12} that if an ensemble of physical systems and the associated Exp function satisfy all four of his assumptions, then the Exp function for that ensemble must have the form

$$\begin{aligned} \mathrm{Exp}({{\mathcal {R}}}) = \mathrm{Tr}(UR), \end{aligned}$$

(4)

where U is a non-negative^{Footnote 13} Hermitian operator characteristic of the ensemble but independent of the physical quantity ${{\mathcal {R}}}$. In modern language there must be a density matrix U, such that the Exp function for the ensemble is the trace of the product of that density matrix with the Hermitian operator that corresponds to that physical quantity.^{Footnote 14}

The Exp function characterizing a pure quantum state $\phi $ is indeed of the form (4) with the density matrix U given by $|\phi \rangle \langle \phi |$. And, of course, the ensembles associated with ordinary quantum states do indeed satisfy all four of von Neumann’s assumptions.
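A small numerical check of the pure-state case, offered only as a sketch: the two-dimensional Hilbert space, the state `phi`, the Pauli matrices, and all helper names below are choices made here for illustration and are not taken from von Neumann. It checks that $\mathrm{Tr}(UR)$ with $U = |\phi \rangle \langle \phi |$ agrees with $(R\phi ,\phi )$, and that this Exp function is linear even for the noncommuting pair $\sigma _x$, $\sigma _z$.

```python
# Illustrative 2x2 check; phi, SX, SZ and the helper names are assumptions of this sketch.
SX = [[0, 1], [1, 0]]    # Pauli sigma_x
SZ = [[1, 0], [0, -1]]   # Pauli sigma_z (does not commute with sigma_x)

def apply(op, vec):
    """Return the vector op * vec."""
    return [sum(op[i][j] * vec[j] for j in range(2)) for i in range(2)]

def inner(f, g):
    """Inner product (f, g), linear in f and conjugate-linear in g."""
    return sum(fi * complex(gi).conjugate() for fi, gi in zip(f, g))

def density(phi):
    """Density matrix U = |phi><phi| of a normalized state phi."""
    return [[phi[i] * complex(phi[j]).conjugate() for j in range(2)] for i in range(2)]

def trace_product(u, r):
    """Tr(U R) for 2x2 matrices."""
    return sum(u[i][j] * r[j][i] for i in range(2) for j in range(2))

def combine(a, r, b, s):
    """The operator a*R + b*S."""
    return [[a * r[i][j] + b * s[i][j] for j in range(2)] for i in range(2)]

phi = [0.6, 0.8j]        # normalized: 0.36 + 0.64 = 1
U = density(phi)

exp_x = trace_product(U, SX)
exp_z = trace_product(U, SZ)
exp_sum = trace_product(U, combine(2, SX, 3, SZ))

# Tr(U R) agrees with (R phi, phi)
print(abs(exp_x - inner(apply(SX, phi), phi)) < 1e-12)
# Exp is linear although sigma_x and sigma_z do not commute
print(abs(exp_sum - (2 * exp_x + 3 * exp_z)) < 1e-12)
```

The linearity here comes for free from the trace; nothing in the check says that a finer-grained subensemble would have to share it.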

Von Neumann addresses the question of hidden variables on p. 323.^{Footnote 15} He asks whether the dispersion of any ensemble characterized by a wave function $\phi $ could result from the fact that such pure states are not the fundamental states, but only statistical mixtures of several more basic states. To specify such “actual states” one would need additional data — “hidden parameters”, which we denote here collectively by $\lambda $. When adjoined to the quantum state $\phi $ these hidden parameters would determine everything — i.e. the resulting subensembles would be free of dispersion:

$$\begin{aligned} \mathrm{Exp}_{\phi ,\lambda }({{\mathcal {R}}}^2) = (\mathrm{Exp}_{\phi ,\lambda }({{\mathcal {R}}}))^2 \end{aligned}$$

(5)

for all physical quantities ${{\mathcal {R}}}$. The statistics of the nondeterministic ensemble, characterized by (4) with $U = U_\phi = |\phi \rangle \langle \phi |$, would result from appropriately weighted averages over all the actual states, ($\phi ,\lambda $), into which the $\phi $-ensemble was decomposed by the hidden parameters.^{Footnote 16}

Von Neumann shows (again straightforwardly) that a $\phi $ ensemble cannot be so decomposed into dispersion-free ($\phi ,\lambda $) subensembles provided the Exp functions for the subensembles, $\mathrm{Exp}_{\phi ,\lambda }$, are also of the form (4) with density matrix U given by some $U_{\phi ,\lambda }$. Therefore if the Exp functions for quantum states can be represented by weighted averages of Exp functions for dispersion-free subensembles, then some of those subensembles cannot have Exp functions of the form (4), and therefore some of von Neumann’s four assumptions must fail for some of those subensembles.
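One way to see why no density matrix can be dispersion-free is the following sketch. It is a standard argument rather than a transcription of von Neumann’s own steps; the unit vector $\psi $, the projector $P_\psi = |\psi \rangle \langle \psi |$, and the normalization $\mathrm{Tr}\,U = 1$ are assumptions introduced here, and the Hilbert space is taken to have at least two dimensions. By Assumption I the projector $P_\psi $ corresponds to a physical quantity, and $P_\psi ^2 = P_\psi $, so applying (5) with Exp of the form (4) gives

$$\begin{aligned} \mathrm{Tr}(UP_\psi ) = \mathrm{Tr}(UP_\psi ^2) = (\mathrm{Tr}(UP_\psi ))^2 \ \Longrightarrow \ \langle \psi |U|\psi \rangle \in \{0, 1\} \quad \text{for every unit vector } \psi . \end{aligned}$$

Since $\langle \psi |U|\psi \rangle $ varies continuously with $\psi $, and any two unit vectors can be joined by a continuous path of unit vectors, the value cannot jump between 0 and 1. It is therefore the same for all $\psi $, so either $U = 0$ or $U = 1$. The first has trace 0 and the second has trace equal to the dimension, so neither is a normalized density matrix.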

Which assumptions might it be that fail for the dispersion-free subensembles?

2.3 Von Neumann nods

Von Neumann clearly believes^{Footnote 17} I and II to be the assumptions that must be abandoned if there are dispersion-free subensembles. When he states that “the established results of quantum mechanics can never be derived” (p. 324) if there are dispersion-free subensembles, the reason he offers is that if they did exist, then “it [would be] impossible that the same physical quantities exist with the same function connections (i.e., that I and II hold).” That is indeed what I and II are about: functional relations among physical quantities, mediated by their corresponding Hermitian operators. Assumptions I and II, as noted above, make no mention of ensembles or statistical distributions. They specify broad structural relations that it might be reasonable to expect to hold for physical quantities, regardless of what subensembles they might be measured in.^{Footnote 18}

If indeed it was Assumptions I and II that von Neumann expected to fail for the dispersion-free subensembles, then one can understand his now notorious “It is therefore not, as is often assumed, a question of a reinterpretation of quantum mechanics, — the present system of quantum mechanics would have to be objectively false, in order that another description of the elementary processes than the statistical one be possible.” (p. 325)

So strong a conclusion might indeed be appropriate if Assumptions I and II were the only suspects. But there are other suspects, $\mathrm{{A}}^\prime $ and $\mathrm{{B}}^\prime $, that von Neumann, unaccountably, fails to question. These have to do with the nature of physical quantities and the statistics of ensembles. They have nothing to do with “function connections” among physical quantities, or “relations assumed by quantum mechanics.” Could assumptions $\mathrm{{A}}^\prime $ or $\mathrm{{B}}^\prime $ be sacrificed for the dispersion-free subensembles without making “the present system of quantum mechanics $\ldots $ objectively false”?

It might indeed be radical to abandon for subensembles the idea, $\mathrm{{A}}^\prime $, that single physical quantities and simultaneously measurable sets give rise to statistics that do not depend on the particular way in which they are measured. One could argue whether that would be more or less radical than abandoning I and II for the subensembles. But why bother to argue? Why not simply give up assumption $\mathrm{{B}}^\prime $ for linear combinations of physical quantities that are not simultaneously measurable?

It is a peculiar feature of ordinary quantum mechanics that Assumption $\mathrm{{B}}^\prime $ holds for the mean values over the $\phi $-ensembles specified by quantum states, even when the physical quantities cannot be jointly measured. But there is no compelling reason to expect that $\mathrm{{B}}^\prime $ should continue to hold for averages over the ($\phi ,\lambda $)-subensembles into which the $\phi $-ensembles might be subdivided by specifying additional hidden variables.

Bub and Dieks pass over $\mathrm{{B}}^\prime $, as a candidate for the assumption that fails for the dispersion-free subensembles, because they insist on interpreting it as nothing more than a definition. Dieks says that it would make no sense to reject $\mathrm{{B}}^\prime $ for those subensembles because it is “analytic”. But as emphasized above, Assumption II provides a powerful alternative way to define linear combinations of physical quantities that are not jointly measurable. In terms of that definition it is not only meaningful to reject $\mathrm{{B}}^\prime $ for the hypothetical dispersion-free subensembles, but quite compatible with the general structure of ordinary quantum mechanics. Thanks to Hermann and Bell, Bub and Dieks are aware that they need a reason for not blaming $\mathrm{{B}}^\prime $. Von Neumann, who was unable to benefit from Bell’s later criticisms,^{Footnote 19} seems just to have overlooked the possibility. Homer not only nodded. He seems to have been fast asleep. Bell’s describing his oversight as “silly” in a magazine interview does not strike us as excessive.^{Footnote 20}

There is no reason at all to require the Exp functions on possible dispersion-free subensembles to be linear on physical quantities that are not simultaneously measurable. Maintaining “the established results of quantum mechanics” only requires $\mathrm{{B}}^\prime $ to hold when those subensembles are recombined to make up the $\phi $-ensemble characterizing the full quantum state $\phi $. This is precisely the point made by John Bell fifty years ago, and, thirty years before Bell, by Grete Hermann.
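The point can be made concrete with a toy hidden-variable model for a single spin-1/2, written along the lines of the construction Bell gives in [2]. The sketch below is a paraphrase under stated assumptions: the quantum state is taken to be spin-up along z, the hidden variable `lam` is uniform on [-1/2, 1/2], and the function names and the grid size are illustrative choices, not Bell’s.

```python
import math

def sign(x):
    """+1 for x >= 0, -1 otherwise."""
    return 1.0 if x >= 0 else -1.0

def value(beta, lam):
    """Dispersion-free value assigned to beta . sigma for hidden variable lam."""
    bx, by, bz = beta
    norm = math.sqrt(bx * bx + by * by + bz * bz)
    # x is the first nonzero component in the order z, x, y.
    x = bz if bz != 0 else (bx if bx != 0 else by)
    return norm * sign(lam * norm + 0.5 * abs(bz)) * sign(x)

def ensemble_mean(beta, n=20000):
    """Average of value(beta, lam) over a midpoint grid of lam in [-1/2, 1/2]."""
    return sum(value(beta, -0.5 + (k + 0.5) / n) for k in range(n)) / n

def quantum_mean(beta):
    """<beta . sigma> in the spin-up-along-z state, which is just beta_z."""
    return beta[2]

X, Z, XZ = (1, 0, 0), (0, 0, 1), (1, 0, 1)   # sigma_x, sigma_z, sigma_x + sigma_z
lam = 0.25                                    # one particular subensemble

# Within a single subensemble the values are eigenvalues and do not add ...
print(value(X, lam) + value(Z, lam), value(XZ, lam))
# ... but averaged over lam the quantum means, and their linearity, come back.
for beta in (X, Z, XZ):
    print(beta, ensemble_mean(beta), quantum_mean(beta))
```

At `lam = 0.25` this model assigns +1 to both $\sigma _x$ and $\sigma _z$ but $\sqrt{2}$, not 2, to their sum, while the averages over `lam` still add. Linearity is thus a property of the recombined $\phi $-ensemble only.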

3 Bell’s Criticism of von Neumann

The most important part of Bell [2] is his better version of von Neumann’s attempt at a no-hidden-variables theorem. Bell restricts von Neumann’s assumption $\mathrm{{B}}^\prime $ to physical quantities ${{\mathcal {R}}}, {{\mathcal {S}}},\ldots $ that can be simultaneously measured. The linear combination ${{\mathcal {W}}}= a{{\mathcal {R}}}+ b{{\mathcal {S}}}+ \cdots $ can then be measured by jointly measuring ${{\mathcal {R}}}, {{\mathcal {S}}},\ldots $ and forming the corresponding linear combination of those measurement outcomes. With a more elaborate argument, quite different from von Neumann’s, Bell can still rule out dispersion-free subensembles, provided the Hilbert space has three or more dimensions.^{Footnote 21}

To explain the point of his own refinement of von Neumann, Bell must explain the problem with von Neumann’s then widely accepted result. He does this rather informally, condensing von Neumann’s four assumptions into “Any real linear combination of any two Hermitian operators represents an observable, and the same linear combination of expectation values is the expectation value of the combination.”

This overly brisk summary^{Footnote 22} insufficiently emphasizes von Neumann’s distinction between physical quantities and Hermitian operators.^{Footnote 23} It underemphasizes the importance of the mapping being 1-to-1. It does not distinguish between assumptions that refer to the statistical Exp functions and assumptions that do not. Nevertheless, this rough summary is enough to make clear what Bell objects to in von Neumann’s assumptions, and this is all he needs to set the stage for his own improvement on von Neumann.

What Bell objects to is that although the linearity of expectation values of noncommuting operators^{Footnote 24} “is true for quantum mechanical states, it is required by von Neumann of the hypothetical dispersion free states also.” But the “additivity of expectation values $\ldots $ is a quite peculiar property of quantum mechanical states, not to be expected a priori. There is no reason to demand it individually of the hypothetical dispersion free states, whose function it is to reproduce the measurable peculiarities of quantum mechanics when averaged over.” [Bell’s italics.]

This is the same as the reason we give in Sect. 2 for the failure of von Neumann’s no-hidden-variables proof: the culprit is indeed assumption $\mathrm{{B}}^\prime $. We have no doubt that Bell knew exactly what the problem was.^{Footnote 25}
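The nonadditivity that makes linearity nontrivial can be checked directly in the simplest case. The sketch below assumes a spin-1/2 system with the Pauli matrices $\sigma _x$ and $\sigma _y$ standing in for R and S (an illustrative choice; the helper names are hypothetical), and works on the premise, used by Bell, that a dispersion-free state assigns each quantity one of the eigenvalues of its operator.

```python
import math
from itertools import product

SX = [[0, 1], [1, 0]]          # Pauli sigma_x
SY = [[0, -1j], [1j, 0]]       # Pauli sigma_y
SUM = [[SX[i][j] + SY[i][j] for j in range(2)] for i in range(2)]

def eigenvalues(h):
    """Both eigenvalues of a 2x2 Hermitian matrix, from its trace and determinant."""
    tr = (h[0][0] + h[1][1]).real
    det = (h[0][0] * h[1][1] - h[0][1] * h[1][0]).real
    root = math.sqrt(max(tr * tr - 4 * det, 0.0))
    return ((tr - root) / 2, (tr + root) / 2)

# Every way of assigning eigenvalues to sigma_x and sigma_y whose sum is
# also an eigenvalue of sigma_x + sigma_y.
additive_assignments = [
    (vx, vy)
    for vx, vy in product(eigenvalues(SX), eigenvalues(SY))
    if any(math.isclose(vx + vy, vs, abs_tol=1e-9) for vs in eigenvalues(SUM))
]
print(additive_assignments)    # → []
```

The eigenvalues of $\sigma _x$ and $\sigma _y$ are $\pm 1$, those of $\sigma _x + \sigma _y$ are $\pm \sqrt{2}$, and no sum of two values $\pm 1$ equals $\pm \sqrt{2}$. An additive dispersion-free assignment is therefore impossible for this pair, even though the quantum expectation values themselves are additive.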

4 Hermann’s Criticism of von Neumann

In 1935, three years after the publication of von Neumann’s book and three decades before John Bell’s criticism of that book, Grete Hermann wrote about it.^{Footnote 26} She raised the same objection as Bell would thirty years later. Her criticism of von Neumann is more thorough than Bell’s, because she follows von Neumann’s argument more closely.^{Footnote 27} By not conflating von Neumann’s four assumptions, she is able to address questions Bell couldn’t formulate (and didn’t need to, for his purposes). But after precisely identifying von Neumann’s oversight, she offers him some escape hatches that we cannot make much sense of.^{Footnote 28}

Hermann considers an ensemble of physical systems. There are physical quantities ${{\mathcal {R}}}$ and ${{\mathcal {S}}}$ that can be measured on the systems of the ensemble. There is a function $\mathrm{Exp}({{\mathcal {R}}})$ that gives the mean value of the measurement outcomes arising from an ${{\mathcal {R}}}$-measurement on all the systems of the ensemble. “Von Neumann assumes that

$$\begin{aligned} \mathrm{Exp}({{\mathcal {R}}}+{{\mathcal {S}}}) = \mathrm{Exp}({{\mathcal {R}}}) + \mathrm{Exp}({{\mathcal {S}}}). \end{aligned}$$

(6)

In words: the expectation value of a sum of physical quantities is equal to the sum of the expectation values of the two quantities [her italics]: von Neumann’s proof stands or falls with this assumption. [our italics]”

This crucial assumption is equivalent to von Neumann’s $\mathrm{{B}}^\prime $. It is trivial, Hermann notes, for classical physics, and for quantum mechanical quantities that can be simultaneously measured, because then “the value of their sum is nothing other than the sum of the values that each of them separately takes, from which follows immediately the same relation for the mean values of these magnitudes. The relation is, however, not self-evident for quantum mechanical quantities between which uncertainty relations hold, and in fact for the reason that the sum of two such quantities is not immediately defined at all: since a sharp measurement of one of them excludes that of the other, so that the two quantities cannot simultaneously assume sharp values, the usual definition of the sum of two quantities is not applicable. Only by the detour over certain mathematical operators assigned to these quantities does the formalism introduce the concept of a sum also for such quantities.”

Hermann is saying here that because it is not clear how to define the sum in (6) or in Assumption $\mathrm{{B}}^\prime $ of two quantities that are not jointly measurable, “to introduce the concept of a sum $\ldots $ for such quantities” requires a detour involving mathematical operators assigned to them, i.e. von Neumann’s Assumptions I and II. By emphasizing the need for a detour into I and II she underlines that it is not necessary to take $\mathrm{{B}}^\prime $ to define the sum of quantities that are not simultaneously measurable. Hermann is reading von Neumann just as we do.^{Footnote 29}

For an ensemble characterized by a wave-function $\phi $, Hermann notes,

$$\begin{aligned} \mathrm{Exp}({{\mathcal {R}}}) = (R\phi ,\phi ), \end{aligned}$$

(7)

and therefore (6) is valid by virtue of the quantum mechanical identity

$$\begin{aligned} ((R + S)\phi , \phi ) = (R\phi ,\phi ) + (S\phi ,\phi ). \end{aligned}$$

(8)

Here R and S are, she notes, “mathematical operators assigned to the quantities ${{\mathcal {R}}}$ and ${{\mathcal {S}}}$.” Since (8) holds whether or not R and S commute, (6) holds whether or not ${{\mathcal {R}}}$ and ${{\mathcal {S}}}$ are simultaneously measurable.^{Footnote 30} So $\mathrm{{B}}^\prime $ does hold for ensembles characterized by wave functions.

But what about subsets of those ensembles “selected from them on the basis of any new features”? For those subensembles “one can no longer infer from the asserted addition rule for $(R\phi ,\phi )$, that also in these subsets the expectation value of the sum of physical quantities is the same as the sum of their expectation values. In this way, however, an essential step in Neumann’s proof is missing.” There it is: precisely the same problem that we describe in Sect. 2 and that Bell identified thirty years after Hermann.

We wish Hermann had stopped here. But she goes on. It is our guess that she goes on because she knows that this obvious problem did not stop von Neumann. What can he have been thinking? At this point we cannot paraphrase her account, because we can no longer follow it. We attach it as Appendix 2, in the hope that the reader may understand her better than we have done.

Setting aside what we take to be Hermann’s efforts to find the motivation behind von Neumann’s oversight, she has, in fact, read von Neumann more closely than Bell. She has the whole story. Once again, the culprit is Assumption $\mathrm{{B}}^\prime $. The only real difference between her reading and the one we and Bell give is that she considers the possibility that von Neumann himself was aware of the obvious problem, and implicitly limited himself to subensembles for which the difficulty did not arise. But if he did that, then he had committed himself to the view that the hidden variables single out only those subensembles that lack features which make them any different from the larger $\phi $-ensembles that they combine to give. So even if he did know what he was doing, he was begging the question.

Notes

Not to be confused with the more famous “Bell’s theorem” of Bell [1]. The relation between these two different theorems of Bell is discussed in Mermin [8].
Bell raises a similar objection to his own improved argument, but this is not relevant to our concerns here.
Bub [5] adds Mermin [8] to his list of those who read von Neumann wrong.
We give them the names used by von Neumann.
Page references are all to the English translation of von Neumann.
In 1932 most physicists talked about probabilities in terms of ensembles. In this paper we follow von Neumann’s language. For a given physical system an ensemble can be understood as an assignment of probabilities to the outcomes of all possible ways to measure physical quantities (or sets of jointly measurable physical quantities) defined on that system. For a given ensemble, the function Exp(${{\mathcal {R}}}$) is then the standard expectation value for the probability distribution associated with the particular measurement of ${{\mathcal {R}}}$.
The acknowledgment that some physical quantities cannot be jointly measured introduces a crucial feature of quantum mechanics even into assumptions $\mathrm{{A}}^\prime $ and $\mathrm{{B}}^\prime $.
Bottom of p. 309.
Von Neumann’s statement of Assumption II is only for the special case $a=b=1$. But Assumptions I and $\mathrm{{A}}^\prime $ tell us that aR is the Hermitian operator associated with the physical quantity $a{{\mathcal {R}}}$, which leads directly to the more general form we give here.
Von Neumann’s derivation of the density matrix form for the Exp-function, mentioned below, makes explicit use of Assumption II to construct such linear combinations, to which he then applies Assumption $\mathrm{{B}}^\prime $.
In what follows we expand on how von Neumann uses his four assumptions to arrive at his no-hidden-variables theorem, and why his conclusions that hidden variables would undermine the fundamental principles of quantum mechanics are indeed not justified by his argument.
Pages 314–316, with some additional mopping up on pp. 316–320.
Von Neumann uses the term “definite”.
Von Neumann’s proof is quite straightforward. Three decades later Andrew Gleason proved what is now known as Gleason’s Theorem: that the density matrix form (4) follows from premises essentially equivalent to Assumptions $\mathrm{{A}}^\prime $, I, and II. Gleason does not use assumption $\mathrm{{B}}^\prime $ for physical quantities that are not jointly measurable. His argument is notoriously intricate (and requires that the Hilbert space have three or more dimensions).
He also brings up the question on pp. 209–210, but defers answering it, promising to show later that “an introduction of hidden parameters is certainly not possible without a basic change in the present theory.”
Putting it in terms of probability distributions p rather than ensembles, the question is whether $p_{[\phi ]}$ can be expressed as a weighted average of dispersion-free distributions, conditioned not just on the state $\phi $, but also on the additional parameters $\lambda $.
“Homer nods.”: Even the best of us sometimes slip up. From John Dryden’s translation of line 359 of Horace’s Ars Poetica: indignor quandoque bonus dormitat Homerus.
When von Neumann adds “Nor would it help if there existed other, as yet undiscovered, physical quantities, in addition to those represented by the operators in quantum mechanics, because the relations assumed by quantum mechanics (i.e., I, II) would have to fail already for the by-now known quantities,” he underlines that he is blaming Assumptions I and II.
It would be interesting to know if he ever became aware of Hermann’s.
See Mermin [8].
Bell then criticizes his own no-hidden-variables argument by challenging the implicit assumption that the result of measuring ${{\mathcal {R}}}$ should not depend on what other jointly measurable physical quantities ${{\mathcal {R}}}$ is measured with. But that’s another story.
Bell does mention by name Assumptions $\mathrm{{B}}^\prime $, I, and II in a footnote, but says nothing about their separate content.
Bell does make the distinction in an earlier introductory section, but unlike von Neumann he does not repeatedly insist on it. His use of “observable” to mean “physical quantity” is unfortunate, since by 1966 most physicists used the term for both physical quantities and Hermitian operators.
Bell often fails to distinguish between “noncommuting operators” and “not jointly measurable physical quantities”. We show in Appendix 1 that the association of noncommuting operators with physical quantities that are not jointly measurable does indeed follow from Assumptions $\mathrm{{A}}^\prime $, I, and II.
Bell mentions the nonadditivity of eigenvalues of non-commuting operators not because he thought von Neumann had overlooked this, but because it helps him explain the “nontriviality of the additivity of expectation values”, which von Neumann, unaccountably, takes for granted in the dispersion-free subensembles. Similarly, when Bell mentions that “a measurement of a sum of noncommuting observables cannot be made by combining trivially the results of separate observations on the two terms” because “it requires a quite distinct experiment,” he is not suggesting that von Neumann, who returns to this point repeatedly, was unaware of it.
The part of Hermann [7] that we address here is the short (small print!) Sect. 7, “The circle in von Neumann’s proof”, pp. 251–253. Hermann calls von Neumann “Neumann”. We have restored his “von” for the sake of uniformity.
Perhaps in 1935 the distinctions von Neumann relied on had not yet been absorbed into a terminology that obscured important distinctions.
We conjecture that she may have found von Neumann’s blatant oversight so surprising that she tried, unsuccessfully, to guess what else he may have had in mind. It is these final remarks of hers that lead Dieks to state that her views are closer than Bell’s to von Neumann.
Less anachronistically, we are reading von Neumann just as she does.
In Appendix 1 we prove from von Neumann’s Assumptions $\mathrm{{A}}^\prime $, I, and II that physical quantities that are jointly measurable do indeed correspond 1-to-1 to Hermitian operators that commute.
Unlike the proofs in von Neumann [10], or Park and Margenau [9], the elementary, but somewhat complicated, proof that follows makes no use of the spectral decomposition theorem for Hermitian operators.
The step is being able to apply $\mathrm{{B}}^\prime $ to the ($\phi ,\lambda $)-subensembles extracted from a $\phi $-ensemble by the value $\lambda $ of the hidden variables.
Here she notes that von Neumann does take the step.
She is saying that since he does take the step he must be implicitly assuming that the ($\phi ,\lambda $)-subensembles have no features to distinguish them from the original $\phi $-ensemble, for which $\mathrm{{B}}^\prime $ is valid.
She allows von Neumann the option of begging the question, rather than overlooking an obvious objection.
Here she echoes von Neumann in suggesting that the failure of I and II might be behind the existence of a dispersion-free subensemble.

References

1. Bell, J.S.: On the Einstein–Podolsky–Rosen paradox. Physics 1, 195–200 (1964). (Reprinted in Bell [3])
2. Bell, J.S.: On the problem of hidden variables in quantum mechanics. Rev. Modern Phys. 38, 447–452 (1966). (Reprinted in Bell [3])
3. Bell, J.S.: Speakable and Unspeakable in Quantum Mechanics. Cambridge University Press, Cambridge (1987)
4. Bub, J.: Von Neumann’s ‘no hidden variables’ proof: a re-appraisal. Found. Phys. 40, 1333–1340 (2010). arXiv:1006.0499
5. Bub, J.: Is Von Neumann’s ‘no hidden variables’ proof silly? In: Halvorson, H. (ed.) Deep Beauty: Understanding the Quantum World Through Mathematical Innovation, pp. 393–407. Princeton University Press, Princeton (2011)
6. Dieks, D.: Von Neumann’s impossibility proof: mathematics in the service of rhetorics. Stud. Hist. Philos. Modern Phys. 60, 136–148 (2017). arXiv:1801.09305
7. Hermann, G.: Die naturphilosophischen Grundlagen der Quantenmechanik (Auszug). Abhandlungen der Fries’schen Schule 6, 75–152 (1935). English translation: Chapter 15 of “Grete Hermann - Between physics and philosophy”, Elise Crull and Guido Bacciagaluppi, eds., Springer, 2016, 239–278. [Volume 42 of Studies in History and Philosophy of Science]
8. Mermin, N.D.: Hidden variables and the two theorems of John Bell. Rev. Modern Phys. 65, 803–815 (1993). Recently posted at arXiv:1802.10119; three minor errata have been repaired, some footnotes of commentary have been added, and the present manuscript is announced as forthcoming
9. Park, J.L., Margenau, H.: Simultaneous measurability in quantum theory. Int. J. Theor. Phys. 1(3), 211–283 (1968)
10. von Neumann, J.: Mathematische Grundlagen der Quantenmechanik. Springer, Berlin (1932). English translation: Mathematical Foundations of Quantum Mechanics. Princeton University Press, Princeton (1955)

Acknowledgements

We would like to thank Ulrich Mohrhoff for bringing Dieks [6] to our attention.

Author information

Authors and Affiliations

Laboratory of Atomic and Solid State Physics, Cornell University, Ithaca, NY, 14853, USA
N. David Mermin
Department of Mathematics, Royal Holloway University of London, Egham, Surrey, TW20 0EX, UK
Rüdiger Schack

Corresponding author

Correspondence to Rüdiger Schack.

Appendices

Appendix 1

Here are two examples of the power of von Neumann’s Assumptions $\mathrm{{A}}^\prime $, I, and II. The problematic assumption $\mathrm{{B}}^\prime $ is not used; sums of physical quantities that are not jointly measurable are defined by sums of the corresponding Hermitian operators, using Assumption II. See also von Neumann [10] and Park and Margenau [9].

Theorem

The result of measuring a physical quantity must lie in the spectrum of the corresponding Hermitian operator.

Let R be the Hermitian operator corresponding to the physical quantity ${{\mathcal {R}}}$, and consider a function f(x) that is 0 if x belongs to the spectrum of R, and 1 elsewhere. This means that $f(R) = 0$, the zero operator. So Assumption I requires that $f({{\mathcal {R}}}) = 0$, the physical quantity that is always 0. Thus, for every result r of measuring ${{\mathcal {R}}}$, we have $f(r)=0$, which means that r belongs to the spectrum of R.
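
As a minimal illustration (an example chosen here, not one taken from von Neumann [10]), assume a two-level system in which $R$ is the Pauli matrix $\sigma _z$, with spectrum $\{+1,-1\}$. Any function that vanishes on the spectrum serves the purpose of the argument; taking the polynomial $f(x) = x^2 - 1$ instead of the indicator-type function above gives

$$\begin{aligned} f(R) = \sigma _z^2 - 1 = 0 \ \ \Longrightarrow \ \ f({{\mathcal {R}}}) = {{\mathcal {R}}}^2 - 1 = 0 \ \ \Longrightarrow \ \ r^2 = 1 \ \ \Longrightarrow \ \ r = \pm 1. \end{aligned}$$

Here the first implication is Assumption I, and the second uses Assumption $\mathrm{{A}}^\prime $ to evaluate $f$ at the outcome $r$.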

Theorem

The correspondence between physical quantities and Hermitian operators must associate jointly measurable quantities with commuting operators and vice versa.

If two Hermitian operators R and S commute, then it is a mathematical fact that there is a third Hermitian operator T of which they are both functions:

$$\begin{aligned} R = f(T),\ \ S = g(T). \end{aligned}$$

(9)

By Assumption I the correspondence preserves functional relations, so

$$\begin{aligned} {{\mathcal {R}}}= f({{\mathcal {T}}}),\ \ {{\mathcal {S}}}= g({{\mathcal {T}}}). \end{aligned}$$

(10)

Using Assumption $\mathrm{{A}}^\prime $, one can then simultaneously measure ${{\mathcal {R}}}$ and ${{\mathcal {S}}}$ by measuring ${{\mathcal {T}}}$ and applying to the outcome of that measurement the functions f and g.
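
A small worked example may make this concrete. It assumes a three-dimensional Hilbert space and diagonal matrices chosen purely for illustration:

$$\begin{aligned} R = \mathrm{diag}(1,1,2),\ \ S = \mathrm{diag}(0,3,3),\ \ T = \mathrm{diag}(1,2,3). \end{aligned}$$

With $f(1)=f(2)=1$, $f(3)=2$ and $g(1)=0$, $g(2)=g(3)=3$ we have $R = f(T)$ and $S = g(T)$. A measurement of ${{\mathcal {T}}}$ that yields, say, the outcome 2 then provides the outcomes $r = f(2) = 1$ and $s = g(2) = 3$ for ${{\mathcal {R}}}$ and ${{\mathcal {S}}}$ in a single experiment. Note that $T$ is nondegenerate even though $R$ and $S$ are each degenerate.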

The converse is trickier.^{Footnote 31} Let the physical quantities ${{\mathcal {R}}}$ and ${{\mathcal {S}}}$ be jointly measurable, and let R and S be the corresponding Hermitian operators. By Assumption $\mathrm{{A}}^\prime $, functions of such jointly measurable quantities can be measured by measuring the individual quantities ${{\mathcal {R}}}$ and ${{\mathcal {S}}}$ and evaluating the function at the individual outcomes r and s. So products of two jointly measurable physical quantities, that differ only in the order in which the quantities appear, can all be measured by the same experiment and have the same measurement outcomes. Such physical quantities are therefore identical. Since the correspondence is 1-to-1, they must therefore all correspond to the same Hermitian operator. We show below that this leads to identities among the operators R and S which we can exploit to show that R and S must commute.

To begin with, it follows from Assumptions I and II that the physical quantity $({{\mathcal {R}}}+ {{\mathcal {S}}})^2 - {{\mathcal {R}}}^2 - {{\mathcal {S}}}^2 = 2{{\mathcal {R}}}{{\mathcal {S}}}= 2{{\mathcal {S}}}{{\mathcal {R}}}$ corresponds to the Hermitian operator $(R+S)^2 - R^2 - S^2 = (RS+SR)$. Therefore the Hermitian operator corresponding to both ${{\mathcal {R}}}{{\mathcal {S}}}$ and ${{\mathcal {S}}}{{\mathcal {R}}}$ when ${{\mathcal {R}}}$ and ${{\mathcal {S}}}$ are jointly measurable is given by

$$\begin{aligned} {{\mathcal {R}}}{{\mathcal {S}}}\ = \ {{\mathcal {S}}}{{\mathcal {R}}}\ \ \ \longleftrightarrow \ \ (RS+SR)/2. \end{aligned}$$

(11)

The next step is to apply the general rule (11) to another jointly measurable pair, ${{\mathcal {R}}}$ and ${{\mathcal {R}}}{{\mathcal {S}}}$:

$$\begin{aligned} {{\mathcal {R}}}({{\mathcal {R}}}{{\mathcal {S}}}) \ \longleftrightarrow \ [R(RS+SR)/2 + (RS+SR)R/2]/2 = (R^2S + 2RSR + SR^2)/4. \end{aligned}$$

(12)

One more application of (11), to the pair ${{\mathcal {S}}}$ and ${{\mathcal {R}}}({{\mathcal {R}}}{{\mathcal {S}}})$, gives

$$\begin{aligned} {{\mathcal {S}}}({{\mathcal {R}}}({{\mathcal {R}}}{{\mathcal {S}}})) \ \ \longleftrightarrow \ \ [2(SR)(RS) + 2(SR)^2 + 2(RS)^2 + S^2R^2 + R^2S^2 ]/8. \end{aligned}$$

(13)

Interchanging the names of ${{\mathcal {S}}}$ and ${{\mathcal {R}}}$ we also have

$$\begin{aligned} {{\mathcal {R}}}({{\mathcal {S}}}({{\mathcal {S}}}{{\mathcal {R}}})) \ \ \longleftrightarrow \ \ [2(RS)(SR) + 2(RS)^2 + 2(SR)^2 + R^2S^2 + S^2R^2]/8. \end{aligned}$$

(14)

On the other hand, directly squaring both sides of (11) gives

$$\begin{aligned} ({{\mathcal {R}}}{{\mathcal {S}}})^2 \ \ \longleftrightarrow \ \ [(RS)^2 + (SR)^2 + (RS)(SR) + (SR)(RS) ]/4. \end{aligned}$$

(15)

Since ${{\mathcal {S}}}({{\mathcal {R}}}({{\mathcal {R}}}{{\mathcal {S}}}))$, ${{\mathcal {R}}}({{\mathcal {S}}}({{\mathcal {S}}}{{\mathcal {R}}}))$, and $({{\mathcal {R}}}{{\mathcal {S}}})^2$ are all the same physical quantity, the sum of the right sides of (13) and (14) must be the same operator as twice the right side of (15). This gives us

$$\begin{aligned} R^2S^2 + S^2R^2 = (RS)(SR) + (SR)(RS) . \end{aligned}$$

(16)
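
The operator bookkeeping in (12)–(16) is easy to get wrong by hand, so a numerical spot check may be useful. The sketch below is not part of the original argument. It uses plain Python, two arbitrarily chosen $2\times 2$ Hermitian matrices (hypothetical test data), and a helper `sym` that implements rule (11). Because the test matrices do not commute, (16) itself does not hold for them; what is checked is only that the expansions (12)–(15) are correct, and that the right sides of (13) plus (14) minus twice (15) equal one quarter of the difference between the two sides of (16).

```python
# Spot check of the expansions (12)-(15) with plain-Python complex matrices.

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(*Ms):
    n = len(Ms[0])
    return [[sum(M[i][j] for M in Ms) for j in range(n)] for i in range(n)]

def scale(c, X):
    return [[c * x for x in row] for row in X]

def sym(X, Y):
    """Rule (11): the symmetrized product (XY + YX)/2."""
    return scale(0.5, add(matmul(X, Y), matmul(Y, X)))

def close(X, Y, tol=1e-9):
    return all(abs(x - y) <= tol for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

# Hypothetical test data: two noncommuting Hermitian matrices.
R = [[1, 2 - 1j], [2 + 1j, 3]]
S = [[0, 1j], [-1j, 2]]

RS, SR = matmul(R, S), matmul(S, R)
R2, S2 = matmul(R, R), matmul(S, S)

mixed = add(matmul(RS, SR), matmul(SR, RS))    # (RS)(SR) + (SR)(RS)
squares = add(matmul(RS, RS), matmul(SR, SR))  # (RS)^2 + (SR)^2
r2s2 = add(matmul(R2, S2), matmul(S2, R2))     # R^2 S^2 + S^2 R^2

# Left: repeated use of rule (11). Right: the expanded forms quoted in the text.
rhs12 = sym(R, sym(R, S))
rhs13 = sym(S, rhs12)
rhs14 = sym(R, sym(S, sym(S, R)))
rhs15 = matmul(sym(R, S), sym(R, S))

checks = {
    "(12)": close(rhs12, scale(0.25, add(matmul(R2, S),
                                         scale(2, matmul(RS, R)),
                                         matmul(S, R2)))),
    "(13)": close(rhs13, scale(0.125, add(scale(2, matmul(SR, RS)),
                                          scale(2, squares), r2s2))),
    "(14)": close(rhs14, scale(0.125, add(scale(2, matmul(RS, SR)),
                                          scale(2, squares), r2s2))),
    "(15)": close(rhs15, scale(0.25, add(squares, mixed))),
    # (13) + (14) - 2*(15) = [R^2S^2 + S^2R^2 - (RS)(SR) - (SR)(RS)] / 4
    "(16)": close(add(rhs13, rhs14, scale(-2, rhs15)),
                  scale(0.25, add(r2s2, scale(-1, mixed)))),
}

for label, ok in checks.items():
    print(label, "consistent" if ok else "MISMATCH")
```

The last entry makes explicit why equating the sum of (13) and (14) with twice (15) is equivalent to (16): the terms $(RS)^2 + (SR)^2$ cancel, leaving only the two sides of (16).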

We also have, as a direct application of (11) to the pair ${{\mathcal {R}}}^2$ and ${{\mathcal {S}}}^2$,

$$\begin{aligned} {{\mathcal {R}}}^2{{\mathcal {S}}}^2 \ \ \longleftrightarrow \ \ [R^2S^2 + S^2R^2]/2, \end{aligned}$$

(17)

and therefore, in view of (16),

$$\begin{aligned} {{\mathcal {R}}}^2{{\mathcal {S}}}^2 \ \ \longleftrightarrow \ \ [(RS)(SR) + (SR)(RS)] /2. \end{aligned}$$

(18)

Since the operators on the right sides of (18) and (15) must be the same, we have

$$\begin{aligned} (RS)(SR) + (SR)(RS) = (RS)^2+(SR)^2, \end{aligned}$$

(19)

and therefore

$$\begin{aligned} (RS-SR)^2 = 0. \end{aligned}$$

(20)

This requires the square of the Hermitian operator $C = i(RS-SR)$ to vanish, which in turn requires C itself to vanish. So R and S must indeed commute.
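
For completeness, the last two steps can be written out as follows. This is a sketch in which $\psi $ denotes an arbitrary vector and $\langle \cdot ,\cdot \rangle $ the inner product, notation that is not used elsewhere in this appendix. Expanding the square shows that (20) is just (19) rearranged, and the Hermiticity of $C$, together with $C^2 = -(RS-SR)^2$, turns (20) into a statement about norms:

$$\begin{aligned} (RS-SR)^2&= (RS)^2 + (SR)^2 - (RS)(SR) - (SR)(RS), \\ \Vert C\psi \Vert ^2&= \langle C\psi , C\psi \rangle = \langle \psi , C^2\psi \rangle = -\langle \psi , (RS-SR)^2\psi \rangle = 0. \end{aligned}$$

Since $\psi $ is arbitrary, $C\psi = 0$ for every $\psi $, and hence $C = 0$.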

Appendix 2

We reproduce below the final paragraph and a half of the section on von Neumann in Hermann [7]. The footnotes are our own comments. We have a sense of what Hermann is trying to say in the first half-paragraph below and the first half of the last paragraph. We have no useful comments on the final part of the final paragraph. We would guess that she is struggling, unsuccessfully, to guess the kind of thinking that led to von Neumann’s surprising oversight.

In this way, however, an essential step in von Neumann’s proof is missing.^{Footnote 32} If instead — like von Neumann — one does not give up on this step,^{Footnote 33} then one has implicitly absorbed into the interpretation the unproven assumption that there can be no distinguishing features, of the elements of an ensemble of physical systems characterized by $\phi $, on which the result of the ${{\mathcal {R}}}$-measurement depends.^{Footnote 34} However, the impossibility of such features is precisely the claim to be proven. Thus the proof runs in a circle.^{Footnote 35}

On the other hand, from the standpoint of von Neumann’s calculus one can argue against this, that it is an axiomatic requirement that all physical quantities are uniquely associated with certain Hermitian operators in a Hilbert space, and that through the discovery of new features invalidating the present limits of predictability, this association would inevitably be broken.^{Footnote 36} Indeed, any discovery that is representable in the operator calculus would have its contents specified only through the form of a wave function, which for quantities not simultaneously measurable exhibits the smearing out required by the uncertainty relations, and which finds application only by way of the probability interpretation. By this consideration, however, the crucial physical question of whether the progress of physical research can attain more precise predictions than are possible today, cannot be twisted into the impossibly equivalent mathematical question of whether such a development would be representable solely in terms of the quantum mechanical operator calculus. There would need to be a compelling physical reason, if not only the physical data known to date, but also all the results of research still to be expected in the future are related to each other according to the axioms of this formalism. But how should one find such a reason? The fact that the formalism has so far proven itself, so that one is justified in seeing in it the appropriate mathematical description of known natural connections, does not mean that the as yet undiscovered natural law connections should also have the same mathematical structure.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Cite this article

Mermin, N.D., Schack, R. Homer Nodded: Von Neumann’s Surprising Oversight. Found Phys 48, 1007–1020 (2018). https://doi.org/10.1007/s10701-018-0197-5

Received: 29 May 2018
Accepted: 28 June 2018
Published: 31 July 2018
Issue Date: September 2018
DOI: https://doi.org/10.1007/s10701-018-0197-5
